human genome center laboratory of genome …eshita 1: department of infectious disease con- trol,...

The large-scale molecular datasets generated by genome sequencing and otherhigh-throughput experimental technologies are the basis for understanding life asa molecular system and for bringing the genomic revolution to society. We are de-veloping bioinformatics technologies to integrate and interpret such large-scale da-tasets, especially for medical and pharmaceutical applications.

1. KEGG DISEASE and KEGG DRUG

Minoru Kanehisa

KEGG is a database of biological systems thatintegrates genomic, chemical, and systemic func-tional information. It is widely used as a refer-ence knowledge base for understanding higher-order functions and utilities of the cell, the or-ganism, and the ecosystem from genomic infor-mation. This Laboratory is responsible for theapplied areas of KEGG, especially in medicaland pharmaceutical sciences. We consider dis-eases as perturbed states of the molecular sys-tem that operates the cell and the organism, anddrugs as perturbants to the molecular system.The KEGG DISEASE database (http://www.genome.jp/kegg/disease/) is a collection of dis-ease entries capturing knowledge on genetic andenvironmental perturbations. The database cur-rently contains about 1,000 entries for diseaseswith known genetic fators and infectious dis-eases with known pathogen genomes. TheKEGG DRUG database (http://www.genome.jp/kegg/drug/) contains about 10,000 entriesfor the approved drugs in Japan, USA, andEurope unified based on the chemical structures

and/or the chemical components. Each entry isassociated with target, metabolizing enzyme,and other molecular interaction network infor-mation, enabling understanding of the pertur-bants in the context of KEGG pathway maps.Furthermore, it contains another type of molecu-lar network information, namely, knowledgeabout chemical structure transformation patternsduring the history of drug development as rep-resented in the KEGG DRUG structure maps.KEGG DISEASE and KEGG DRUG now consti-tute the core part of KEGG MEDICUS (http://www.genome.jp/kegg/medicus.html), an inte-grated information resource of diseases, drugs,and health-related substances, aiming to bringthe genomic revolution to society.

2. KEGG OC: Automatic assignments oforthologs and paralogs in complete geno-mes

Toshiaki Katayama, Shuichi Kawashima, Aki-hiro Nakaya1 and Minoru Kanehisa: 1Biore-source Science Branch, Brain Research Insti-tute, Niigata University

The increase in the number of complete

Human Genome Center

Laboratory of Genome Databaseゲノムデータベース分野

Professor Minoru Kanehisa, Ph.D.Research Associate Toshiaki Katayama, M.Sc.Research Associate Shuichi Kawashima, M.Sc.

教授理学博士金久實助教理学修士片山俊明助教理学修士川島秀一

108

genomes has provided clues to gain useful in-sights to understand the evolution of the geneuniverse. Among the KEGG suites of databases,the GENES database contains more than 6.6 mil-lion genes from over 1,560 organisms as of Au-gust 2011. Sequence similarities among thesegenes are calculated by all-against-all SSEARCHcomparison and stored in the SSDB database.Based on those databases, the ORTHOLOGY da-tabase has been manually constructed to storethe relationships among the genes sharing thesame biological function. However, in this strat-egy, only the well known functions can be usedfor annotation of newly added genes, thus thenumber of annotated genes is limited. To over-come this situation, we have developed a fullyautomated procedure to find candidate ortholo-gous clusters including those without any func-tional annotation. The method is based on agraph analysis of the SSDB database, treatinggenes as nodes and the Smith-Waterman se-quence similarity scores as edge weights. Thecluster is found by our heuristic method forfinding quasi-cliques, but the SSDB graph is toolarge to perform quasi-clique finding at a time.Therefore, we introduce a hierarchy (evolution-ary relationship) of organisms and treat theSSDB graph as a nested graph. The automaticdecomposition of the SSDB graph into a set ofquasi-cliques results in the KEGG OC (OrthologCluster) database. We have built a system thatperforms automatic update of KEGG OC, whichcan be run on a weekly basis. As a result, weobtained 1,139,058 clusters including 715,367singleton clusters from 6,686,868 protein codinggenes. Among them, 5,704 clusters were sharedacross kingdoms and other clusters were king-dom specific. The automatic classification of ourortholog clusters is largely consistent with themanually curated ORTHOLOGY database. Theresulted table of orthologous genes is madeavailable through the KEGG FTP site.

3. KEGG API: SOAP/WSDL interface for theKEGG system

Shuichi Kawashima, Toshiaki Katayama andMinoru Kanehisa

KEGG is a suite of databases and associatedsoftware, integrating our current knowledge ofmolecular interaction/reaction pathways andother systemic functions (PATHWAY and BRITEdatabases), information about the genomic space(GENES database), and information about thechemical space (LIGAND databases). To facili-tate large-scale applications of the KEGG systemprogrammatically, we have been developingand maintaining the KEGG API as a stable

SOAP/WSDL based web service. The KEGGAPI is available at http://www.genome.jp/kegg/soap/.

4. Whole transcriptome analysis of Anophe-les stephensi 14 days post Plasmodiumyoelii infection using RNA-seq

Shuichi Kawashima, Lucky R. Runtuwene1,Yutaka Suzuki2, Sumio Sugano2, Kenta Nakai3,Yuki Eshita1: 1Department of Infectious DiseaseControl, Faculty of Medicine, Oita University,2Department of Medical Genome Sciences,Graduate School of Frontier Sciences, The Uni-versity of Tokyo, 3Laboratory of FunctionalAnalysis in Silico, Institute of Medical Science,The University of Tokyo

Anopheles stephensi is one of the major vectorsof human malaria in Asia. However, transcrip-tome of malaria-infected An. stephensi have notbeen studied enough yet compared to that ofAnopheles gambiae which is known as the mostimportant vector of Plasmodium falciparum. Thuswe conducted an RNA-seq experiment againstAn. stephensi of 14 days post-plasmodium-infec-tion and non-infection to find the mosquitogenes that are affected by plasmodium infection.Illumina Genome Analyzer produced approxi-mately 73 and 63 millions short reads for eachcontrol and target samples respectively. BecauseAe. stephensi genome sequence is not availableyet, we assembled the short reads de novo togenerate transcript sequences as the referencefor the short reads mapping. We obtained 28,074transcripts by using the Trinity assembler andthen mapped the short reads to the transcriptsby using the Bowtie aligner. A statistical testwas carried out by the DEGseq which is an Rpackage in the Bioconductor. MARS method im-plemented in the DEGseq showed 5,563 tran-scripts were expressed differentially betweenboth samples with p-value of 0.001 or less.Among them, a total of 79 and 787 genes wereup- and down-regulated more than 2 folds in anatural logarithm scale, respectively. While mostof them were homologous to genes annotatedwith “putative uncharacterized gene”, we foundtwo Trypsin homologs in the up-regulated genesand some histones in the down-regulated genes.

5. Whole transcriptome analysis of Aedes ae-gypti 14 days post-dengue infection usingRNA-seq

Lucky R. Runtuwene1, Shuichi Kawashima,Yutaka Suzuki2, Sumio Sugano2, Kenta Nakai3,Ryuichiro Maeda4, Chihiro Sugimoto5, Tomo-hiko Takasaki6, Ichiro Kurane6 and Yuki

109

Eshita1: 1Department of Infectious Disease Con-trol, Faculty of Medicine, Oita University, 2De-partment of Medical Genome Sciences, Gradu-ate School of Frontier Sciences, The Universityof Tokyo, 3Laboratory of Functional Analysisin Silico, Institute of Medical Science, The Uni-versity of Tokyo, 4Obihiro University of Agri-culture and Veterinary Medicine, 5Departmentof Collaboration and Education, Research Cen-ter for Zoonosis Control, Hokkaido University,6National Institute of Infectious Diseases

Dengue virus (DENV) is the causative agentof fatal disease which has no vaccine available.Breakage of transmission is still the core of pre-vention. The prospects of innovating new pre-vention techniques include genetically modifieddengue-resistant transgenic mosquitoes. That in-novation begins with a search of mosquito genesthat are affected by dengue infection. To addressthe issue, we analyzed the whole transcriptomeof 14 days post-dengue-infection Aedes aegyptiby using RNA-seq technique. Illumina GenomeAnalyzer produced approximately 60 millionsshort reads for each control and target samples.These were mapped onto 14,839 and 14,394genes respectively. A statistical test showed9,059 genes were expressed differentially be-tween both samples with false discovery rate(FDR) of 0.01 or less. Among them, a total of 91and 394 genes were up- and down-regulatedmore than 7.39 folds, respectively. Histoneswere the most prominent in the up-regulatedgenes. In addition, five of the 22 HSP20 homolo-gous genes were also significantly up-regulatedin the DENV infected mosquito. Although HSP20s are known to serve diverse protective func-tions, its detailed functions in insects are stillunclear. Induced specific HSP20s we found sug-gest that those are responsible to the viral infec-tion.

6. Biogem: an effective tool based approachfor scaling up open source software devel-opment in bioinformatics

Bonnal, R.J.P.1, Aerts, J.2, Githinji, G.3, Goto,N.4, MacLean, D.5, Miller C.A.6, Mishima, H.7,Pagani, M.1, Ramirez-Gonzalez, R.8, Smant G.9,Strozzi, F.10, Syme, R.11, Vos, R.12, Wennblom,T.J.13, Woodcroft, B.J.14, Katayama, T., Prins,P.9: 1Integrative Biology Program, IstitutoNazionale Genetica Molecolare, Milan, Italy,2ESAT/SCD, Faculty of Engineering and IBBTFuture Health Department, University of Leu-ven, Belgium, 3KEMRI-Wellcome Trust Re-search Program, Kilifi, Kenya, 4Research Insti-tute for Microbial Diseases, Osaka University,Japan, 5The Sainsbury Laboratory, Norwich,

UK, 6Biology Department, Boston College,USA, 7Department of Human Genetics, Na-gasaki University Graduate School of Biomedi-cal Sciences, Japan, 8The Genome AnalysisCentre, Norwich, UK, 9Laboratory of Nematol-ogy, Wageningen University, the Netherlands,10Parco Tecnologico Padano, Lodi, Italy, 11DeptEnvironment & Agriculture, Curtin University,Australia, 12NCB Naturalis, Leiden, the Nether-lands, 13Silicon Life Sciences, Minneapolis,USA, 14Department of Biochemistry and Mo-lecular Biology, University of Melbourne, Aus-tralia

Biogem provides a software development en-vironment for the Ruby programming language,which encourages communitybased software de-velopment for bioinformatics while lowering thebarrier to entry and encouraging best practices.Biogem, with its targeted modular and decen-tralized approach, software generator, tools, andtight web integration, is an improved generalmodel for scaling up collaborative open sourcesoftware development in bioinformatics. Avail-ability: Biogem and modules are free and opensource software. Biogem runs on all systemsthat support recent versions of Ruby, includingLinux, Mac OS X and Windows. Further infor-mation at http://www.biogems.info. A tutorialis available at http://www.biogems.info/howto.html.

7. TogoWS: integrated SOAP and REST APIsfor interoperable bioinformatics Web serv-ices

Toshiaki Katayama

Web services have become widely used inbioinformatics analysis, but there exist incom-patibilities in interfaces and data types, whichprevent users from making full use of a combi-nation of these services. Therefore, we have de-veloped the TogoWS service to provide an inte-grated interface with advanced features. In theTogoWS REST (REpresentative State Transfer)API (application programming interface), we in-troduce a unified access method for major data-base resources through intuitive URIs that canbe used to search, retrieve, parse and convertthe database entries. Recently, converters forCSV, GenBank/DDBJ/EMBL and HMMER3 intoSemantic Web data formats including RDF/XML and Turtle are added. The TogoWS serviceis freely available at: http://togows.dbcls.jp/.

8. TogoDB: Instantly publish your researchmaterial as a public database

110

Toshiaki Katayama, Mitsuteru Nakao1 andToshihisa Takagi2,3: 1Database Center for LifeScience, ROIS, 2National Bioscience DatabaseCenter, JST, 3Graduate School of Frontier Sci-ences, The University of Tokyo

Supplemental materials are often provided asseparate files downloadable from the publisher’ssite along with the publication of a journal arti-cle. However, these data are not fully utilizedsince they are not available in the form of regu-lar biological databases and hard to find by thepopular Web search engines. TogoDB is a sim-ple and intuitive database system to publishtabular formatted data instantly on the Web. Us-ers can upload their research materials to theTogoDB through the simple web interface and

the data will be made available as a fully func-tional database in a minute or two. TogoWS isan integrated and uniformed interface for themajor bioinformatics web services and also pro-vides REST API for the contents in the TogoDB.Recently, we extended TogoDB and TogoWS tobe used as a consolidated platform for the Se-mantic Web by adding a metadata editor and anautomatic Resource Description Framework(RDF) dumper. Users can specify their choice ofa Property for each column and a Class for therange of each cell from a pull-down menu forRDF, RDFS, DC, SKOS and FOAF vocabularies,or directly assign a URI from any other ontolo-gies. This system fills the gap between user’sdata and major public databases to deliver effec-tive variations in the Linked Data.

Publications

Kanehisa, M., Goto, S., Sato, Y., Furumichi, M.,Tanabe, M. KEGG for integration and inter-pretation of large-scale molecular datasets.Nucleic Acids Res. 40, D109-114, 2012

Takarabe, M., Shigemizu, D., Kotera, M., Goto,S., Kanehisa, M. Network-based analysis andcharacterization of adverse drug-drug interac-tions. J. Chem. Inf. Model. 51, 2977-2985, 2011

Kotera, M., Tokimatsu, T., Kanehisa, M., Goto,S. MUCHA: multiple chemical alignment al-gorithm to obtain building block substructuresof orphan metabolites. BMC Bioinformatics 12,S1, 2011

Bonnal, R.J.P., Aerts, J., Githinji, G., Goto, N.,MacLean, D., Miller C.A., Mishima, H., Pa-gani, M., Ramirez-Gonzalez, R., Smant G.,Strozzi, F., Syme, R., Vos, R., Wennblom, T.J.,Woodcroft, B.J., Katayama, T., Prins, P. Bio-gem: an effective tool based approach for scal-ing up open source software development inbioinformatics. Bioinformatics, 2012, in press

Guberman, J.M., Ai, J., Arnaiz, O., Baran, J.,Blake, A., Baldock, R., Chelala, C., Croft, D.,Cros, A., Cutts, R.J., Di Génova, A., Forbes, S.,Fujisawa, T., Gadaleta, E., Goodstein, D.M.,Gundem, G., Haggarty, B., Haider, S., Hall,M., Harris, T., Haw, R., Hu, S., Hubbard, S.,Hsu, J., Iyer, V., Jones, P., Katayama, T., Kin-sella, R., Kong, L., Lawson, D., Liang, Y.,Lopez-Bigas, N., Luo, J., Lush, M., Mason, J.,

Moreews, F., Ndegwa, N., Oakley, D., Perez-Llamas, C., Primig, M., Rivkin, E., Rosanoff,S., Shepherd, R., Simon R, Skarnes, B., Smed-ley, D., Sperling, L., Spooner, W., Stevenson,P., Stone, K., Teague, J., Wang, J., Wang, J.,Whitty, B., Wong, D.T., Wong-Erasmus, M.,Yao, L., Youens-Clark, K., Yung, C., Zhang, J.,Kasprzyk, A. BioMart Central Portal: an opendatabase network for the biological commu-nity. Database, bar041, 2011.

Katayama, T., Wilkinson, M.D., Vos, R.,Kawashima, T., Kawashima, S., Nakao, M.,Yamamoto, Y., Chun, H.W., Yamaguchi, A.,Kawano, S., Aerts, J., Aoki-Kinoshita, K.F.,Arakawa, K., Aranda, B., Bonnal, R.J., Fernán-dez, J.M., Fujisawa, T., Gordon, P.M., Goto,N., Haider, S., Harris, T., Hatakeyama, T., Ho,I., Itoh, M., Kasprzyk, A., Kido, N., Kim, Y.J.,Kinjo, A.R., Konishi, F., Kovarskaya, Y., vonKuster, G., Labarga, A., Limviphuvadh, V.,McCarthy, L., Nakamura, Y., Nam, Y.,Nishida, K., Nishimura, K., Nishizawa, T.,Ogishima, S., Oinn, T., Okamoto, S., Okuda,S., Ono, K., Oshita, K., Park, K.J., Putnam, N.,Senger, M., Severin, J., Shigemoto, Y., Suga-wara, H., Taylor, J., Trelles, O., Yamasaki, C.,Yamashita, R., Satoh, N., Takagi, T. The 2ndDBCLS BioHackathon: interoperable bioinfor-matics Web services for integrated applica-tions. J Biomed Semantics. 2:4 2011

111

The recent advances in biomedical research have been producing large-scale,ultra-high dimensional, ultra-heterogeneous data. Due to these post-genomic re-search progresses, our current mission is to create computational strategy for sys-tems biology and medicine towards translational bioinformatics. With this mission,we have been developing computational methods for understanding life as systemand applying them to practical issues in medicine and biology.

1. Gene Network Analysis and Cancer Sys-tems Biology

a. A novel network profiling analysis revealssystem changes in epithelial-mesenchymaltransition

Shimamura T, Imoto S, Shimada Y1, HosonoY1, Niida A, Nagasaki M, Yamaguchi R, Taka-hashi T1, Miyano S: 1Nagoya University Gradu-ate School of Medicine

Patient-specific analysis of molecular networks

is a promising strategy for making individualrisk predictions and treatment decisions in can-cer therapy. Although systems biology allowsthe gene network of a cell to be reconstructedfrom clinical gene expression data, traditionalmethods, such as Bayesian networks, only pro-vide an averaged network for all samples.Therefore, these methods cannot reveal patient-specific differences in molecular networks dur-ing cancer progression. In this study, we devel-oped a novel statistical method called Network-Profiler, which infers patient-specific gene regu-latory networks for a specific clinical characteris-

Human Genome Center

Laboratory of DNA Information AnalysisLaboratory of Sequence Data AnalysisLaboratory of Functional GenomicsDNA情報解析分野シークエンスデータ情報処理分野ゲノム機能解析分野

Professor Satoru Miyano, Ph.D.Associate Professor Seiya Imoto, Ph.D.Assistant Professor Teppei Shimamura, Ph.D.Project Assistant Professor Atsushi Niida, Ph.D.Project Assistant Professor Yoshinori Tamada, Ph.D.Associate Professor Tetsuo Shibuya, Ph.D.Lecturer Rui Yamaguchi, Ph.D.Associate Professor Masao Nagasaki, Ph.D.

教授理学博士宮野悟准教授博士（数理学）井元清哉助教博士（工学）島村徹平特任助教博士（理学）新井田厚司特任助教博士（情報学）玉田嘉紀准教授博士（理学）渋谷哲朗講師博士（理学）山口類准教授博士（理学）長崎正朗

112

tic, such as cancer progression, from gene ex-pression data of cancer patients. We appliedNetworkProfiler to microarray gene expressiondata from 762 cancer cell lines and extracted thesystem changes that were related to the epithe-lial-mesenchymal transition (EMT). Out of 1732possible regulators of E-cadherin, a cell adhesionmolecule that modulates the EMT, NetworkPro-filer, identified 25 candidate regulators, of whichabout half have been experimentally verified inthe literature. In addition, we used NetworkPro-filer to predict EMT-dependent master regula-tors that enhanced cell adhesion, migration, in-vasion, and metastasis. In order to further eva-luate the performance of NetworkProfiler, weselected Krueppel-like factor 5 (KLF5) from a listof the remaining candidate regulators of E-cadherin and conducted in vitro validation ex-periments. As a result, we found that knock-down of KLF5 by siRNA significantly decreasedE-cadherin expression and induced morphologi-cal changes characteristic of EMT. In addition, invitro experiments of a novel candidate EMT-related microRNA, miR-100, confirmed the in-volvement of miR-100 in several EMT-relatedaspects, which was consistent with the predic-tions obtained by NetworkProfiler.

b. Estimating genome-wide gene networksusing nonparametric Bayesian networkmodels on massively parallel computers

Tamada Y, Imoto S, Araki H2, Nagasaki M,Print C3, Charnock-Jones DS4, Miyano S: 2CellInovator, Inc. 3University of Auckland 4Univer-sity of Cambridge

We developed a novel algorithm to estimategenome-wide gene networks consisting of morethan 20 000 genes from gene expression data us-ing nonparametric Bayesian networks. Due tothe difficulty of learning Bayesian networkstructures, existing algorithms cannot be appliedto more than a few thousand genes. Our algo-rithm overcomes this limitation by repeatedlyestimating subnetworks in parallel for genes se-lected by neighbor node sampling. Through nu-merical simulation, we confirmed that our algo-rithm outperformed a heuristic algorithm in ashorter time. We applied our algorithm to mi-croarray data from human umbilical vein endo-thelial cells (HUVECs) treated with siRNAs, toconstruct a human genome-wide gene network,which we compared to a small gene network es-timated for the genes extracted using a tradi-tional bioinformatics method. The resultsshowed that our genome-wide gene networkcontains many features of the small network, aswell as others that could not be captured during

the small network estimation. The results alsorevealed master-regulator genes that are not inthe small network but that control many of thegenes in the small network. These analyses wereimpossible to realize without our proposed algo-rithm.

c. SiGN-SSM: open source parallel softwarefor estimating gene networks with statespace models

Tamada Y, Yamaguchi R, Imoto S, Hirose O,Yoshida R5, Nagasaki M, Miyano S: 5Instituteof Statistical Mathematics

SiGN-SSM is an open-source gene network es-timation software able to run in parallel on PCsand massively parallel supercomputers. Thesoftware estimates a state space model (SSM),that is a statistical dynamic model suitable foranalyzing short time and/or replicated time se-ries gene expression profiles. SiGN-SSM imple-ments a novel parameter constraint effective tostabilize the estimated models. Also, by using asupercomputer, it is able to determine the genenetwork structure by a statistical permutationtest in a practical time. SiGN-SSM is applicablenot only to analyzing temporal regulatory de-pendencies between genes, but also to extractingthe differentially regulated genes from time se-ries expression profiles. SiGN-SSM is distributedunder GNU Affero General Public Licence(GNU AGPL) version 3 and can be downloadedat http://sign.hgc.jp/signssm/. The pre-com-piled binaries for some architectures are avail-able in addition to the source code. The pre-installed binaries are also available on the Hu-man Genome Center supercomputer system.The online manual and the supplementary infor-mation of SiGN-SSM is available on our website.

d. Inferring contagion in regulatory networks

Fujita A6, Sato JR7, Demas MAA8, YamaguchiR, Shimamura T, Ferreira CE8, Sogayar MC8,Miyano S: 6RIKEN, ISLiM 7Universidade Fed-eral do ABC: 8University of São Paulo

Several gene regulatory network models con-taining concepts of directionality at the edgeshave been proposed. However, only a few re-ports have an interpretable definition of direc-tionality. Here, differently from the standardcausality concept defined by Pearl, we introducethe concept of contagion in order to infer direc-tionality at the edges, i.e., asymmetries in geneexpression dependences of regulatory networks.Moreover, we present a bootstrap algorithm in

113

order to test the contagion concept. This tech-nique was applied in simulated data and, also,in an actual large sample of biological data. Lit-erature review has confirmed some genes identi-fied by contagion as actually belonging to theTP53 pathway.

e. Searching optimal Bayesian network struc-ture on constraint search space: super-structure approach

Imoto S, Kojima K, Perrier E, Tamada Y,Miyano S

Optimal search on Bayesian network structureis known as an NP-hard problem and the appli-cability of existing optimal algorithms is limitedin small Bayesian networks with 30 nodes or so.To learn larger Bayesian networks from observa-tional data, some heuristic algorithms wereused, but only a local optimal structure is foundand its accuracy is not high in many cases. Inthis paper, we review optimal search algorithmsin a constraint search space; The skeleton of thelearned Bayesian network is a sub-graph of thegiven undirected graph called super-structure.The introduced optimal search algorithm canlearn Bayesian networks with several hundredsof nodes when the degree of super-structure isaround four. Numerical experiments indicatethat constraint optimal search outperforms state-of-the-art heuristic algorithms in terms of accu-racy, even if the super-structure is also learnedby data.

f. Parallel algorithm for learning optimal Bay-esian network structure

Tamada Y, Imoto S, Miyano S

We developed a parallel algorithm for thescore-based optimal structure search of Bayesiannetworks. This algorithm is based on a dynamicprogramming (DP) algorithm having O (n・2n)time and space complexity, which is known tobe the fastest algorithm for the optimal structuresearch of networks with n nodes. The bottleneckof the problem is the memory requirement, andtherefore, the algorithm is currently applicablefor up to a few tens of nodes. While the recentlyproposed algorithm overcomes this limitation bya space-time trade-off, our proposed algorithmrealizes direct parallelization of the original DPalgorithm with O (nσ) time and space overheadcalculations, where σ＞0 controls the communi-cation-space trade-off. The overall time andspace complexity is O (nσ＋1・2n). This algorithmsplits the search space so that the required com-munication between independent calculations is

minimal. Because of this advantage, our algo-rithm can run on distributed memory supercom-puters. Through computational experiments, weconfirmed that our algorithm can run in parallelusing up to 256 processors with a parallelizationefficiency of 0.74, compared to the original DPalgorithm with a single processor. We also dem-onstrate optimal structure search for a 32-nodenetwork without any constraints, which is thelargest network search presented in literature.

g. Gene network inference and visualizationtools for biologists: application to new hu-man transcriptome datasets

Hurley D3, Araki H3, Tamada Y, Dunmore B4,Sanders D4, Humphreys S4, Affara M4, Imoto S,Yasuda K9 Tomiyasu Y9, Tashiro K9, SavoieC10, Cho V3, Smith S4, Kuhara S9, Miyano S,Charnock-Jones DS4, Crampin EJ3, Print CG3:9Kyushu University 10GNI, Inc.

Gene regulatory networks inferred from RNAabundance data have generated significant inter-est, but despite this, gene network approachesare used infrequently and often require inputfrom bioinformaticians. We have assembled asuite of tools for analysing regulatory networks,and we illustrate their use with microarray da-tasets generated in human endothelial cells. Weinfer a range of regulatory networks, and basedon this analysis discuss the strengths and limita-tions of network inference from RNA abun-dance data. We welcome contact from research-ers interested in using our inference and visuali-zation tools to answer biological questions.

h. Frequent pathway mutations of splicingmachinery in myelodysplasia

Yoshida K11, Sanada M11, Shiraishi Y, NowakD12, Nagata Y11, Yamamoto R13, Sato Y11, Sato-Otsubo A11, Kon A11, Nagasaki M, Chalkidis G,Suzuki Y14, Shiosaka M11, Kawahata R11, Yama-guchi T13, Otsu M13, Obara N15, Sakata-Yanagimoto M15, Ishiyama K16, Mori H17, NolteF12, Hofmann WK12, Miyawaki S16, Sugano S14,Haferlach C18, Koeffler HP19, Shih LY20, Hafer-lach T18, Chiba S15, Nakauchi H13, Miyano S,Ogawa S11: 11Graduate School of Medicine, Uni-versity of Tokyo 12University of Heidelberg13Center for Stem Cell Biology and Regenera-tive Medicine, Institute of Medical Science,University of Tokyo 14Department of MedicalGenome Sciences, University of Tokyo 15Insti-tute of Clinical Medicine, University of Tsu-kuba 16Tokyo Metropolitan Ohtsuka Hospital17Showa University Fujigaoka Hospital 18Mu-nich Leukemia Laboratory 19Cedars-Sinai Me-

114

dical Center 20Chang Gung University

By using the supercomputer system of Hu-man Genome Center, we contributed to dataanalysis of whole-exome sequencing data of 29myelodysplasia specimens, which unexpectedlyrevealed novel pathway mutations involvingmultiple components of the RNA splicing ma-chinery, including U2AF35, ZRSR2, SRSF2 andSF3B1.

i. Long non-coding RNA HOTAIR regulatesPolycomb-dependent chromatin modifica-tion and is associated with poor prognosisin colorectal cancers

Kogo R21, Shimamura T, Mimori K22, Kawa-hara K21, Imoto S, Sudo T21, Tanaka F21, Shi-bata K21, Suzuki A21, Komune S23, Miyano S,Mori M24: 21Medical Institute of Bioregulation,Kyushu University 22Kyushu University BeppuHospital 23Graduate School of Medicine, Kyu-shu University 24Gastroenterological Surgery,Osaka University

We contributed to data analysis in the studyof examining the status and function of HO-TAIR in stage IV colorectal cancer (CRC) pa-tients who have liver metastases and a poorprognosis. The analysis showed that HOTAIRexpression levels were higher in cancerous tis-sues than corresponding noncancerous tissuesand high HOTAIR expression correlated tightlywith the presence of liver metastasis. Moreover,patients with high HOTAIR expression had arelatively poorer prognosis. In a subset of 32CRC speciments, gene set enrichment analysisusing cDNA array data revealed a close correla-tion between expression of HOTAIR and mem-bers of the PRC2 complex (SUZ12, EZH2 and H3K27me3). These findings suggest that HOTAIRexpression is associated with a genome-wide re-programming of PRC2 function not only inbreast cancer but also in CRC, where upregula-tion of this long ncRNA may be a critical ele-ment in metastatic progression.

j. N-cadherin expression is a potential sur-vival mechanism of gefitinib-resistant lungcancer cells

Yamauchi M32, Yoshino I32, Yamaguchi R,Shimamura T, Nagasaki M, Imoto S, Niida A,Koizumi F33, Kohno T33, Yokota J33, Miyano S,Gotoh N32: 32Division of Systems BiomedicalTechnology, Institute of Medical Science, Uni-versity of Tokyo: 33National Cancer Center Re-searh Institute

We contributed to design of measurement ofgene expression data for systems understandingof gefitinib-sensitive NSCLC and gefitinib-resis-tant NSCLC. We used DNA microarrays to ex-amine gene expression profiles of gefitinib-resistant PC9/ZD cells that are derived fromgefitinib-sensitive PC9 cells and harbor a threo-nine to methionine mutation at codon 790(T790M) in EGFR, a known mechanism of ac-quired resistance to gefitinib. Data analysis sug-gested showed that N-cadherin expression wassignificantly upregulated in PC9/ZD cells com-pared with PC9 cells. Inhibition of N-cadherinexpression by siRNA or treatment with antibod-ies against N-cadherin induced apoptosis ofPC9/ZD cells in association with reduced phos-phorylation of Akt and Bad, a proapoptotic pro-tein. Moreover, inhibition of Akt expression bysiRNA or treatment with an inhibitor for phos-phatidylinositol (PI)-3 kinase reduced survivalof PC9/ZD cells. In addition, we found severalN-cadherin-expressing lung cancer cells thatshowed inherent resistance to gefitinib treatmentand reduced survival owing to siRNA-inducedinhibition of N-cadherin expression. Thus, it ap-pears that N-cadherin maintains the survival ofthe gefitinib-resistant lung cancer cells via thePI-3 kinase/Akt survival pathway. From theseresults, we propose that N-cadherin signalingcontributes, at least in part, to the survivalmechanisms of gefitinib-resistant NSCLC cellsand that N-cadherin is a potential molecular tar-get in the treatment of NSCLC. (AJCR0000074).

k. MRG-binding protein contributes to devel-opment of colorectal cancer

Yamaguchi K31, Sakai M32, Kim J31, TsunesumiS31, Fujii T31, Ikenoue T31, Yamada Y33, Aki-yama Y33, Muto Y33, Yamaguchi R, Miyano S,Nakamura Y32, Furukawa Y31: 31Division ofClinical Genome Research, Advanced ClinicalResearch Center, Institute of Medical Science:32Laboratory of Molecular Medicine, HumanGenome Center, Institute of Medical Science33Tokyo Hitachi Hospital

MRGBP (MORF4-related gene-binding pro-tein; also known as chromosome 20 open read-ing frame 20) encodes a subunit of the transfor-mation/transcription domain-associated protein(TRRAP)/tat-interacting protein 60 (TIP60)-con-taining histone acetyltransferase complex. To in-vestigated the role MRGBP in colorectal carcino-genesis, we constributed to statistical data analy-sis of the differences between gene expressiondata of MRGBP-knockdown colorectral cell lineand normal colorectral cell line. We appliedFisher exact test. Based gene sets based on GO

115

and annother functional annotation databases(KEGG, etc), we screened out significantlyexressed gene sets together with functinal anno-tations.

2. Pathway Modeling, Simulation and Analy-sis

a. Comprehensive pharmacogenomic path-way screening by data assimilation

Hasegawa T, Yamaguchi R, Nagasaki M,Imoto S, Miyano S

We developed a computational method tocomprehensively screen for pharmacogenomicpathway simulation models. A systematic modelgeneration strategy is developed; candidatepharmacogenomic models are automaticallygenerated from some prototype models con-structed from existing literature. The parametersin the model are automatically estimated basedon time-course observed gene expression databy data assimilation technique. The candidatesimulation models are also ranked based ontheir prediction power measured by Bayesianinformation criterion. We generated 53 pharma-cogenomic simulation models from five proto-types and applied the proposed method to mi-croarray gene expression data of rat liver cellstreated with corticosteroid. We found that someextended simulation models have higher predic-tion power for some genes than the originalmodels.

b. Systems biology model repository formacrophage pathway simulation

Nagasaki M, Saito A, Fujita A6, Tremmel G,Ueno K, Ikeda E, Jeong E, Miyano S

The Macrophage Pathway Knowledgebase(MACPAK) is a computational system which al-lows biomedical researchers to query and studythe dynamic behaviors of macrophage molecularpathways. It integrates the knowledge of 230 re-views that were carefully checked by specialistsfor their accuracy and then converted to 230 dy-namic mathematical pathway models. MACPAKcomprises a total of 24,009 entities and 12,774processes and is described in the Cell SystemMarkup Language (CSML), an XML format thatruns on the Cell Illustrator platform and can bevisualized with a customized Cytoscape for fur-ther analysis. MACPAK can be accessed via aninteractive website at http://macpak.csml.org.The CSML pathway models are available underthe Creative Commons license.

c. MIRACH: Efficient model checker for quan-titative biological pathway models

Koh CH25, Nagasaki M, Saito A, Li C, WongL25, Miyano S: 25National University of Singa-pore

Model checking is playing an increasingly im-portant role in systems biology as larger andmore complex biological pathways are beingmodeled. In this article we report the release ofan efficient model checker MIRACH 1.0, whichsupports any model written in popular formatssuch as CSML and SBML. MIRACH is inte-grated with a Petri-net-based simulation engine,enabling efficient online (on-the-fly) checking. Inour experiment, by using Levchenko et al.model, we reveal that timesaving gains by usingMIRACH easily surpass 400％ compared withits offline-based counterpart. MIRACH 1.0 wasdeveloped using Java and thus executable onany platform installed with JDK 6.0 (not JRE 6.0)or later. MIRACH 1.0, along with its sourcecodes, documentation and examples are avail-able at http://sourceforge.net/projects/mirach/under the LGPLv3 license.

d. Parameter estimation of biological path-ways using data assimilation and modelchecking

Li C, Kuroyanagi K, Nagasaki M, Miyano S

In this study we developed a novel methodfor estimating kinetic parameter of biologicalpathways by using observed time-series dataand other knowledge that cannot be formulatedin the form of time-series data. Our method util-izes data assimilation (DA) framework andmodel checking (MC) technique, with a quanti-tative modeling and simulation architecturenamed hybrid functional Petri net with exten-sion (HFPNe). Proposed method is applied to anHFPNe model underlying circadian rhythm inmouse. We first translate 23 rules of biologicalknowledge with temporal logic for the modelchecking, which are not described in the time-series data. Next, we employ particle filter oftenapplied to DA for our estimation procedure.Each particle checks whether its simulation re-sult satisfies the rules or not, and the result ofthe checking is used for its resampling step. Oursimulation results show that proposed methodis faster and more accurate than previousmethod.

e. CSO validator: improving manual curationworkflow for biological pathways

116

Jeong E, Nagasaki M, Ikeda E, Saito A,Miyano S

Manual curation and validation of large-scalebiological pathways are required to obtain high-quality pathway databases. In a typical curationprocess, model validation and model updatebased on appropriate feedback are repeated andrequires considerable cooperation of scientists.We have developed a CSO (Cell System Ontol-ogy) validator to reduce the repetition and timeduring the curation process. This tool assists inquickly obtaining agreement among curatorsand domain exper ts and in providing a consis-tent and accurate pathway database. The tool isavailable at http://csovalidator.csml.org.

f. Ontology-based instance data validationfor high-quality curated biological path-ways

Jeong E, Nagasaki M, Ueno K, Miyano S

Modeling in systems biology is vital for un-derstanding the complexity of biological systemsacross scales and predicting system-level behav-iors. To obtain high-quality pathway databases,it is essential to improve the efficiency of modelvalidation and model update based on appro-priate feedback. We have developed a newmethod to guide creating novel high-quality bio-logical pathways, using a rule-based validation.Rules are defined to correct models against bio-logical semantics and improve models for dy-namic simulation. In this work, we have defined40 rules which constrain event-specific partici-pants and the related features and adding miss-ing processes based on biological events. Thisapproach is applied to data in Cell System On-tology which is a comprehensive ontology thatrepresents complex biological pathways withdynamics and visualization. The experimentalresults show that the relatively simple rules canefficiently detect errors made during curation,such as misassignment and misuse of ontologyconcepts and terms in curated models. A newrule-based approach has been developed to fa-cilitate model validation and model complemen-tation. Our rule-based validation embeddingbiological semantics enables us to provide high-quality curated biological pathways. This ap-proach can serve as a preprocessing step formodel integration, exchange and extraction data,and simulation.

g. High performance hybrid functional Petrinet simulations of biological pathwaymodels on CUDA

Chalkidis G, Nagasaki M, Miyano S

Hybrid functional Petri nets are a wide-spreadtool for representing and simulating biologicalmodels. Due to their potential of providing vir-tual drug testing environments, biological simu-lations have a growing impact on pharmaceuti-cal research. Continuous research advancementsin biology and medicine lead to exponentiallyincreasing simulation times, thus raising the de-mand for performance accelerations by efficientand inexpensive parallel computation solutions.Recent developments in the field of general-purpose computation on graphics processingunits (GPGPU) enabled the scientific communityto port a variety of compute intensive algo-rithms onto the graphics processing unit (GPU).This work presents the first scheme for mappingbiological hybrid functional Petri net models,which can handle both discrete and continuousentities, onto compute unified device architec-ture (CUDA) enabled GPUs. GPU acceleratedsimulations are observed to run up to 18 timesfaster than sequential implementations. Simulat-ing the cell boundary formation by Delta-Notchsignaling on a CUDA enabled GPU results in aspeedup of approximately 7 times for a modelcontaining 1,600 cells.

h. Online model checking approach basedparameter estimation to a neuronal fatedecision simulation model in Caenorhabdi-tis elegans with hybrid functional Petri netwith extension

Li C, Nagasaki M, Miyano S

Mathematical modeling and simulation stud-ies are playing an increasingly important role inhelping researchers elucidate how living organ-isms function in cells. In systems biology, re-searchers typically tune many parameters manu-ally to achieve simulation results that are consis-tent with biological knowledge. This severelylimits the size and complexity of simulationmodels built. In order to break this limitation,we propose a computational framework to auto-matically estimate kinetic parameters for a givennetwork structure. We utilized an online (on-the-fly) model checking technique (which savesresources compared to the offline approach),with a quantitative modeling and simulation ar-chitecture named hybrid functional Petri netwith extension (HFPNe). We demonstrate theapplicability of this framework by the analysisof the underlying model for the neuronal cellfate decision model (ASE fate model) in Caenor-habditis elegans . First, we built a quantitativeASE fate model containing 3327 components

117

emulating nine genetic conditions. Then, usingour developed efficient online model checker,MIRACH 1.0, together with parameter estima-tion, we ran 20-million simulation runs, andwere able to locate 57 parameter sets for 23 pa-rameters in the model that are consistent with45 biological rules extracted from published bio-logical articles without much manual interven-tion. To evaluate the robustness of these 57 pa-rameter sets, we run another 20 million simula-tion runs using different magnitudes of noise.Our simulation results concluded that amongthese models, one model is the most reasonableand robust simulation model owing to the highstability against these stochastic noises. Oursimulation results provide interesting biologicalfindings which could be used for future wet-labexperiments.

3. Statistical Data Analysis Methods for GeneExpression Data, and Next-Generation Se-quence Data, and Clinical Data

a. Estimating exogenous variables in datawith more variables than observations

Sogawa Y26, Shimizu S26, Shimamura T, Hy-varinen A27, Washio T26, Imoto S: 26The Insti-tute of Scientific and Industrial Research,Osaka University 27Department of Mathematicsand Statistics, University of Helsinki

Many statistical methods have been proposedto estimate causal models in classical situationswith fewer variables than observations. How-ever, modern datasets including gene expressiondata increase the needs of high-dimensionalcausal modeling in challenging situations withorders of magnitude more variables than obser-vations. In this paper, we propose a method tofind exogenous variables in a linear non-Gaussian causal model, which requires muchsmaller sample sizes than conventional methodsand works even under orders of magnitudemore variables than observations. Exogenousvariables work as triggers that activate causalchains in the model, and their identificationleads to more efficient experimental designs andbetter understanding of the causal mechanism.We present experiments with artificial data andreal-world gene expression data to evaluate themethod.

b. A rank-based statistical test for measuringsynergistic effects between two gene sets

Shiraishi U, Okada-Hatakeyama M28, MiyanoS: 28RIKEN Research Center for Allergy andImmunology

Due to recent advances in high-throughputtechnologies, data on various types of genomicannotation have accumulated. These data willbe crucially helpful for elucidating the combina-torial logic of transcription. Although severalapproaches have been proposed for inferring co-operativity among multiple factors, most ap-proaches are haunted by the issues of normali-zation and threshold values. In this study weproposed a rank-based nonparametric statisticaltest for measuring the effects between two genesets. This method is free from the issues of nor-malization and threshold value determinationfor gene expression values. Furthermore, wehave proposed an efficient Markov chain MonteCarlo method for calculating an approximatesignificance value of synergy. We have appliedthis approach for detecting synergistic combina-tions of transcription factor binding motifs andhistone modifications. C implementation of themethod is available from http://www.hgc.jp/~yshira/software/rankSynergy.zip.

c. Strategy of finding optimal number of fea-tures on gene expression data

Sharma A, Koh CH, Imoto S, Miyano S

Feature selection is considered to be an im-portant step in the analysis of transcriptomes orgene expression data. Carrying out feature selec-tion reduces the curse of the dimensionalityproblem and improves the interpretability of theproblem. Numerous feature selection methodshave been proposed in the literature and thesemethods rank the genes in order of their relativeimportance. However, most of these methodsdetermine the number of genes to be used in anarbitraryly or heuristic fashion. Proposed is atheoretical way to determine the optimal num-ber of genes to be selected for a given task. Thisproposed strategy has been applied on a num-ber of gene expression datasets and promisingresults have been obtained.

d. A top-r feature selection algorithm for mi-croarray gene expression data

Sharma A, Imoto S, Miyano S

Most of the conventional feature selection al-gorithms have a drawback whereby a weaklyranked gene that could perform well in terms ofclassification accuracy with an appropriate sub-set of genes will be left out of the selection.Considering this shortcoming, we propose a fea-ture selection algorithm in gene expression dataanalysis of sample classifications. The proposedalgorithm first divides genes into subsets, the

118

sizes of which are relatively small (roughly ofsize h), then selects informative smaller subsetsof genes (of size r＜h) from a subset and mergesthe chosen genes with another gene subset (ofsize r) to update the gene subset. We repeat thisprocess until all subsets are merged into one in-formative subset. We illustrate the effectivenessof the proposed algorithm by analyzing threedistinct gene expression datasets. Our methodshows promising classification accuracy for allthe test datasets. We also show the relevance ofthe selected genes in terms of their biologicalfunctions.

e. ClipCrop: a tool for detecting structuralvariations with single-base resolution us-ing soft-clipping information

Suzuki S, Yasuda T, Shiraishi Y, Miyano S,Nagasaki M

Structural variations (SVs) change the struc-ture of the genome and are therefore the causesof various diseases. Next-generation sequencingallows us to obtain a multitude of sequencedata, some of which can be used to infer the po-sition of SVs. We developed a new method andimplementation named ClipCrop for detectingSVs with single-base resolution using soft-clipping information. A soft-clipped sequence isan unmatched fragment in a partially mappedread. To assess the performance of ClipCropwith other SV-detecting tools, we generatedvarious patterns of simulation data-SV lengths,read lengths, and the depth of coverage of shortreads-with insertions, deletions, tandem duplica-tions, inversions and single nucleotide altera-tions in a human chromosome. For comparison,we selected BreakDancer, CNVnator and Pindel,each of which adopts a different approach to de-tect SVs, e.g. discordant pair approach, depth ofcoverage approach and split read approach, re-spectively. Our method outperformed Break-Dancer and CNVnator in both discovering rateand call accuracy in any type of SV. Pindel of-fered a similar performance as our method, butour method crucially outperformed for detectingsmall duplications. From our experiments, Clip-Crop infer reliable SVs for the data set withmore than 50 bases read lengths and 20x depthof coverage, both of which are reasonable valuesin current NGS data set. ClipCrop can detectSVs with higher discovering rate and call accu-racy than any other tool in our simulation dataset.

f. Symbolic hierarchical clustering for visualanalogue scale data

Katayama K, Yamaguchi R, Imoto S, Toku-naga H29, Imazu Y29, Matuura K29, WatanabeK29, Miyano S: 29Center for Kampo Medicine,Keio University School of Medicine

We proposed a hierarchical clustering in theframework of Symbolic Data Analysis (SDA).SDA was proposed by Diday at the end of the1980s and is a new approach for analysing hugeand complex data. In SDA, an observation is de-scribed by not only numerical values but also“higher-level units”; sets, intervals, distributions,etc. Most SDA works have dealt with only inter-vals as the descriptions. In this study, we de-fined “pain distribution” as new type data inSDA and proposed a hierarchical clustering forthis new type data.

g. Clustering for visual analogue scale datain symbolic data analysis

Katayama K, Yamaguchi R, Imoto S, Matsu-ura K29, Watanabe K29, Miyano S

We proposed a hierarchical clustering for thevisual analogue scale (VAS) in the framework ofSymbolic Data Analysis (SDA). The VAS is amethod that can be readily understood by mostpeople to measure a characteristic or attitudethat cannot be directly measured. VAS is ofmost value when looking at change within peo-ple, and is of less value for comparing across agroup of people because they have differentsense. It could be argued that a VAS is trying toproduce interval/ratio data out of subjectivevalues that are at best ordinal. Thus, some cau-tion is required in handling VAS. We describedVAS as distribution and handle it as new typedata in SDA. In this study, we defined “VASdistribution” as new type data in SDA and pro-posed a hierarchical clustering for this new typedata.

4. Algorithms and Data Structures for Bioin-formatics and Cheminformatics

a. A subpath kernel for rooted unorderedtrees

Kimura D30, Kuboyama T, Shibuya T, Ka-shima H30: 30Graduate School of InformationScience and Technology, University of Tokyo31Computer Centre, Gakushuin University

Kernel method is one of the promising ap-proaches to learning with tree-structured data,and various efficient tree kernels have been pro-posed to capture informative structures in trees.In this paper, we propose a new tree kernel

119

function based on “subpath sets” to capture ver-tical structures in rooted unordered trees, sincesuch tree-structures are often used to code hier-archical information in data. We also propose asimple and efficient algorithm for computing thekernel by extending the multikey quicksort algo-rithm used for sorting strings. The time com-plexity of the algorithm is O ((|T1|＋|T2|) log(|T1|＋|T2|)) time on average, and the spacecomplexity is O (|T1|＋|T2|), where |T1| and|T2| are the numbers of nodes in two trees T1

and T2. We apply the proposed kernel to twosupervised classification tasks, XML classifica-tion in web mining and glycan classification inbioinformatics. The experimental results showthat the predictive performance of the proposedkernel is competitive with that of the existing ef-ficient tree kernel for unordered trees proposedby “Vishwanathan et al. Fast kernels for stringand tree matching. In: Advances in Neural In-formation Processing Systems, vol. 15, pp. 569-576 (2003)”, and is also empirically faster thanthe existing kernel.

b. An index structure for spaced seed search

Onodera T, Shibuya T

In this study, we introduced an index struc-ture of texts which supports fast search of pat-terns with “don’t care”s in predetermined posi-tions. This data structure is a generalization ofthe suffix array and has many applications espe-cially for computational biology. We proposedthree algorithms to construct the index. Two ofthem are based on a variant of radix sort buteach utilizes different types of referential infor-mation to sort suffixes by multiple characters ata time. The other is for the case when “don’tcare”s appear periodically in patterns and canbe combined with the others.

c. 2D-Qsar for 450 types of amino acid induc-tion peptides with a novel substructurepair descriptor having wider scope

Osoda T, Miyano S

Quantitative structure-activity relationships(QSAR) analysis of peptides is helpful for de-signing various types of drugs such as kinaseinhibitor or antigen. Capturing various proper-ties of peptides is essential for analyzing two-dimensional QSAR. A descriptor of peptides isan important element for capturing properties.The atom pair holographic (APH) code is de-signed for the description of peptides and it rep-resents peptides as the combination of thirty-sixtypes of key atoms and their intermediate bind-

ing between two key atoms. The substructurepair descriptor (SPAD) represents peptides asthe combination of forty-nine types of key sub-structures and the sequence of amino acid resi-dues between two substructures. The size of thekey substructures is larger and the length of thesequence is longer than traditional descriptors.Similarity searches on C5a inhibitor data set andkinase inhibitor data set showed that order ofinhibitors become three times higher by repre-senting peptides with SPAD, respectively. Com-paring scope of each descriptor shows thatSPAD captures different properties from APH.QSAR/QSPR for peptides is helpful for design-ing various types of drugs such as kinase inhibi-tor and antigen. SPAD is a novel and powerfuldescriptor for various types of peptides. Accu-racy of QSAR/QSPR becomes higher by describ-ing peptides with SPAD.

d. Noise-tolerant active learning algorithm

Osoda T, Miyano S

Selection of unlabeled instances seriously af-fects the efficiency of active learning. Noise indata degrades efficiency because noise causes apredictive model to be inaccurate. To analyzethis effect, we introduce a noise parameter tothe expected log likelihood and formulate theobjective function. Then, we define the densityaround a labeled instance to approximate theexpected log likelihood with noise more pre-cisely. Empirical experiments performed on theUCT data set show that the accuracy of noise-tolerant active learning is always the highestamong several approaches under various noiseconditions. Therefore, the noise-tolerant activelearning algorithm is stably effective regardlessof whether the data set includes considerablenoise or not.

5. Pandemic Control Simulation

a. Estimation of macroscopic parameter inagent-based pandemic simulation

Saito MM5, Imoto S, Yamaguchi R, Miyano S,Higuchi T5

Simulations of epidemic spread is an appeal-ing approach to design intervention programsagainst influenza epidemic. Agent-based ap-proach is particularly useful, since it enable usto administrate and evaluate effectiveness ofvarious intervention measures in the simulatedcity. On the other hand, observation data areobtained as macroscopic information. Hence, weneed to extract some macroscopic information,

120

to examine the validity of the simulation result.The reproduction number, which indicates howmany new infected persons are produced due toone infected person, is often used in epidemiol-ogy. This quantity is directly related to coeffi-cients in differential equations in macro simula-tions, whereas it is non-trivial in agent-basedsimulations. In this paper, we demonstrate theestimation of the reproduction number from anagent-based simulation result, by assimilatingthe number of infected persons yielded by anagent-based simulator to the SEIR model, whichis a representative macro simulation model.

b. Parallel agent-based simulator for influ-enza pandemic

Saito MM5, Imoto S, Yamaguchi R, Miyano S,Higuchi T5

We developed a parallel agent-based influ-enza pandemic simulator, in order to study theinfluenza spread in a city. In the simulator, thecity consists of several towns connected tightlyby trains. Residents of the towns walk aroundplaces such as corporations and schools usingtrains by need, following to own schedulers.The influenza spread in these congested placesis simulated as stochastic processes. We havedemonstrated simulations with a realistic scaleof population (an order of million) and showedthat one simulation run is completed aroundone hour.

Publications

1. Chalkidis G, Nagasaki M, Miyano S. Highperformance hybrid functional Petri netsimulations of biological pathway models onCUDA. IEEE/ACM Transactions on Compu-tational Biology and Bioinformatics. 8(6):1545-1556, 2011.

2. Chen X-W, Miyano S (Eds.) Special Issue onThe First IEEE Conference on Healthcare In-formatics, Imaging, and Systems biologyHISB’11. J. Bioinformatics and Computa-tional Biology 9(5), 2011.

3. Fujita A, Sato JR, Demas MAA, YamaguchiR, Shimamura T, Ferreira CE, Sogayar MC,Miyano S. Inferring contagion in regulatorynetworks. IEEE/ACM Transactions on Com-putational Biology and Bioinformatics. 8(2):570-576, 2011.

4. Furu M, Kajita Y, Nagayama S, Ishibe T,Shima Y, Nishijo K, Uejima D, Takahashi R,Aoyama T, Nakayama T, Nakamura T,Nakashima Y, Ikegawa M, Imoto S, KatagiriT, Nakamura Y, Toguchida J. Toguchida.Identification of AFAP1L1 as a prognosticmarker for spindle cell sarcomas. Oncogene.30(38): 4015-4025, 2011.

5. Hasegawa T, Yamaguchi R, Nagasaki M,Imoto S, Miyano S. Comprehensive pharma-cogenomic pathway screening by data as-similation. Lecture Notes in Bioinformatics.6674: 160-171, 2011.

6. Hurley D, Araki H, Tamada Y, Dunmore B,Sanders D, Humphreys S, Affara M, ImotoS, Yasuda K, Tomiyasu Y, Tashiro K, SavoieC, Cho V, Smith S, Kuhara S, Miyano S,Charnock-Jones DS, Crampin EJ, Print CG.Gene network inference and visualizationtools for biologists: application to new hu-man transcriptome datasets. Nucleic AcidsRes. 2011 Dec 6. [Epub ahead of print]

7. Imoto S, Kojima K, Perrier E, Tamada Y, Mi-yano S. Searching optimal Bayesian networkstructure on constraint search space: super-structure approach. Lecture Notes in Com-puter Science. 6797: 210-218, 2011.

8. Imoto S, Tamada Y, Araki H, Miyano S.Computational Drug Target Pathway Dis-covery: A Bayesian Network Approach.Handbook of Computational Statistics: Sta-tistical Bioinformatics, Springer. p. 501-532,2011.

9. Jeong E, Nagasaki M, Ueno K, Miyano S.Ontology-based instance data validation forhigh-quality curated biological pathways.BMC Bioinformatics. 12(Suppl 1): S8, 2011.

10. Jeong E, Nagasaki M, Ikeda E, Saito A, Miy-ano S. CSO validator: improving manualcuration workflow for biological pathways.Bioinformatics. 27(17): 2471-2472, 2011.

11. Katayama K, Yamaguchi R, Imoto S, Toku-naga H, Imazu Y, Matuura K, Watanabe K,Miyano S. Symbolic hierarchical clusteringfor visual analogue scale data. Smart Inno-vation, Systems and Technologies. 10: 799-805, 2011.

12. Katayama K, Yamaguchi R, Imoto S, Matsu-ura K, Watanabe K, Miyano S. Clusteringfor Visual analogue scale data in symbolicdata analysis. Procedia Computer Science. 6:370-374, 2011.

13. Kimura D, Kuboyama T, Shibuya T, Ka-shima H. A subpath kernel for rooted unor-dered trees. Proc. 15th Pacific-Asia Confer-ence on Knowledge Discovery and DataMining (PAKDD 2011). 62-74, 2011.

14. Kogo R, Shimamura T, Mimori K, KawaharaK, Imoto S, Sudo T, Tanaka F, Shibata K,Suzuki A, Komune S, Miyano S, Mori M.Long non-coding RNA HOTAIR regulates

121

Polycomb-dependent chromatin modifica-tion and is associated with poor prognosisin colorectal cancers. Cancer Res. 2011 Aug23. [Epub ahead of print]

15. Koh CH, Nagasaki M, Saito A, Li C, WongL, Miyano S. MIRACH: Efficient modelchecker for quantitative biological pathwaymodels. Bioinformatics. 27(5): 734-735, 2011.

16. Li C, Kuroyanagi K, Nagasaki M, Miyano S.Parameter estimation of biological pathwaysusing data assimilation and model checking.Proceedings of 2nd International Workshopon Biological Processes & Petri Nets(BioPPN2011) (http://ceur-ws.org/Vol-724).53-70, 2011.

17. Li C, Nagasaki M, Miyano S. Online modelchecking approach based parameter estima-tion to a neuronal fate decision simulationmodel in Caenorhabditis elegans with hybridfunctional Petri net with extension. Molecu-lar BioSystems. 7(5): 1576-1592, 2011.

18. Mandoiu II, Miyano S, Przytycka TM, Ra-jasekaran S (Eds.) Proceedings of The IEEEFirst International Conference on Computa-tional Advances in Bio and Medical Sci-ences. IEEE Computer Society Press, 2011.

19. Matsuno H, Nagasaki M, Miyano S. HybridPetri net based modeling for biological path-way simulation. Natural Computing 10(3):1099-1120, 2011.

20. Nagasaki M, Saito A, Fujita A, Tremmel G,Ueno K, Ikeda E, Jeong E, Miyano S. Sys-tems biology model repository for macro-phage pathway simulation. Bioinformatics.27 (11): 1591-1593, 2011.

21. Onodera T, Shibuya T. An index structurefor spaced seed search. Lecture Notes inComputer Science. 7074: 764-772, 2011.(Proc. 22nd annual International Symposiumon Algorithms and Computation (ISAAC2011))

22. Osoda T, Miyano S. 2D-Qsar for 450 typesof amino acid induction peptides with anovel substructure pair descriptor havingwider scope. J Cheminform. 3(1): 50, 2011.

23. Osoda T, Miyano S. Noise-tolerant activelearning algorithm. Proc. the 2011 Interna-tional Conference on Data Mining. 10-14,2011.

24. Saito MM, Imoto S, Yamaguchi R, Miyano S,Higuchi T. Parallel agent-based simulatorfor influenza pandemic. Lecture Notes inComputer Science. 7068: 361-370, 2011.

25. Saito MM, Imoto S, Yamaguchi R, Miyano S,Higuchi T. Estimation of macroscopic pa-rameter in agent-based pandemic simula-tion. Proc. 13th International Conference onInformation Fusion. 1-6, 2011.

26. Sharma A, Imoto S, Miyano S.A top-r fea-

ture selection algorithm for microarray geneexpression data. IEEE/ACM Transactions onComputational Biology and Bioinformatics.2011 Nov 11. [Epub ahead of print]

27. Sharma A, Koh CH, Imoto S, Miyano S.Strategy of finding optimal number of fea-tures on gene expression data. ElectronicLetters. 47(8): 480-482, 2011.

28. Shimamura T, Imoto S, Shimada Y, HosonoY, Niida A, Nagasaki M, Yamaguchi R,Takahashi T, Miyano S.A novel networkprofiling analysis reveals system changes inepithelial-mesenchymal transition . PLoSONE. 6(6): e20804, 2011.

29. Shiraishi U, Okada-Hatakeyama M, MiyanoS.A rank-based statistical test for measuringsynergistic effects between two gene sets.Bioinformatics. 27(17): 2399-2405, 2011.

30. Sogawa Y, Shimizu S, Shimamura T, Hy-varinen A, Washio T, Imoto S. Estimatingexogenous variables in data with more vari-ables than observations. Neural Networks.24(8): 875-880, 2011.

31. Suzuki S, Yasuda T, Shiraishi Y, Miyano S,Nagasaki M. ClipCrop: a tool for detectingstructural variations with single-base resolu-tion using soft-clipping information. BMCBioinformatics. 12: S7, 2011.

32. Tamada Y, Imoto S, Miyano S. Parallel algo-rithm for learning optimal Bayesian networkstructure. J Machine Learning Research. 12:2437-2459, 2011.

33. Tamada Y, Imoto S, Araki H, Nagasaki M,Print C, Charnock-Jones DS, Miyano S. Esti-mating genome-wide gene networks usingnonparametric Bayesian network models onmassively parallel computers. IEEE/ACMTransactions on Computational Biology andBioinformatics. 8(3): 683-697, 2011.

34. Tamada Y, Shimamura T, Yamaguchi R,Imoto S, Nagasaki M, Miyano S. SiGN:large-scale gene network estimation environ-ment for high performance computing.Genome Informatics. 25: 40-52, 2011.

35. Tamada Y, Yamaguchi R, Imoto S, Hirose O,Yoshida R, Nagasaki M, Miyano S. SiGN-SSM: open source parallel software for esti-mating gene networks with state space mod-els. Bioinformatics. 27: 1172-1173, 2011.

36. Yamaguchi K, Sakai M, Kim J, Tsunesumi S,Fujii T, Ikenoue T, Yamada Y, Akiyama Y,Muto Y, Yamaguchi R, Miyano S, NakamuraY, Furukawa Y. MRG-binding protein con-tributes to development of colorectal cancer.Cancer Sci. 102(8): 1486-1492, 2011.

37. Yamauchi M, Yoshino I, Yamaguchi R, Shi-mamura T, Nagasaki M, Imoto S, Niida A,Koizumi F, Kohno T, Yokota J, Miyano S,Gotoh N. N-cadherin expression is a poten-

122

tial survival mechanism of gefitinib-resistantlung cancer cells. Am J Cancer Res. 1(7):823-833, 2011.

38. Yoshida K, Sanada M, Shiraishi Y, NowakD, Nagata Y, Yamamoto R, Sato Y, Sato-Otsubo A, Kon A, Nagasaki M, Chalkidis G,Suzuki Y, Shiosaka M, Kawahata R, Yama-guchi T, Otsu M, Obara N, Sakata-Yanagi-

moto M, Ishiyama K, Mori H, Nolte F, Hof-mann WK, Miyawaki S, Sugano S, HaferlachC, Koeffler HP, Shih LY, Haferlach T, ChibaS, Nakauchi H, Miyano S, Ogawa S. Fre-quent pathway mutations of splicing ma-chinery in myelodysplasia. Nature. 478(7367): 64-69, 2011.

123

The major goal of our group is to identify genes of medical importance, and to de-velop new diagnostic and therapeutic tools. We have been attempting to isolategenes involving in carcinogenesis and also those causing or predisposing to vari-ous diseases as well as those related to drug efficacies and adverse reactions. Bymeans of technologies developed through the genome project including a high-resolution SNP map, a large-scale DNA sequencing, and the cDNA microarraymethod, we have isolated a number of biologically and/or medically importantgenes, and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in humancancer

Yusuke Nakamura, Koichi Matsuda, HitoshiZembutsu, Ryuji Hamamoto, Yataro Daigo,Hidewaki Nakagawa, Chizu Tanikawa, Cui Ri,Hamdi Mbarek, Vinod Kumar, Yuji Urabe,Jiaying Lin, Zhenzhong Deng, Paulisally HauYi Lo, Martha Espinosa, Zhenzhong Deng,Satoko Uno, Yoichiro Kato, Mitsuko Naka-shima, Motoko Unoki, Masanori Yoshimatsu,Shinya Hayami, Hyun-Soo Cho, Goji Toyo-kawa, Tadashi Takawa, Reem AbdelrahimIbrahim, Kang Daechun, Lianhua Piao, Su-Youn Chung, Osman W Mohammed, TakashiFujitomo, Seham Elgazzar, Low Siew Kee, ChaPei Chieng, Koji Ueda, Nguyen Minh-Hue,Junkichi Koinuma, Daiki Miki, Ken Masuda,Masato Aragaki, Hideto Oshita

(1) Epigenetics

Regulation of histone modification and chro-matin structure by the p53-PADI4 pathway

Histone proteins are modified in response tovarious external signals, however their mecha-nisms are still not fully understood. Citrullina-tion is a post-transcriptional modification whichconverts arginine in protein into citrulline. Herewe show in vivo and in vitro citrullination of ar-ginine 3 residue of histone H4 (cit-H4R3) in re-sponse to DNA damage through the p53-PADI4pathway. We also observed DNA damage-induced citrullination of Lamin C. Cit-H4R3 andcitrullinated Lamin C are located around frag-mented nuclei in apoptotic cells. Ectopic expres-sion of PADI4 led to chromatin decondensationand promoted DNA cleavage, while Padi4－／－

mice exhibited resistance to radiation-inducedapoptosis in the thymus. Furthermore, the levelof cit-H4R3 was negatively correlated with p53protein expression and with tumor size in non-

Human Genome Center

Laboratory of Molecular MedicineLaboratory of Genome Technologyゲノムシークエンス解析分野シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.Associate Professor Koichi Matsuda, M.D., Ph.D.Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.Assistant Professor Ryuji Hamamoto, Ph.D.

教授中村祐輔准教授松田浩一助教前佛均助教浜本隆二

124

small cell lung cancer tissues. Our findings re-veal that cit-H4R3 would be an “apoptotic his-tone code” to detect damaged cells and inducenuclear fragmentation, which plays a crucialrole in carcinogenesis.

Histone lysine methyltransferase Wolf-Hirschhorn syndrome candidate 1 is involvedin human carcinogenesis through regulationof the Wnt pathway

A number of histone methyltransferases havebeen identified and biochemically characterized,but the pathologic roles of their dysfunction inhuman diseases like cancer are not well under-stood. Here, we demonstrate that Wolf-Hirschhorn syndrome candidate 1 (WHSC1)plays important roles in human carcinogenesis.Transcriptional levels of this gene are signifi-cantly elevated in various types of cancer in-cluding bladder and lung cancers. Immunohisto-chemical analysis using a number of clinical tis-sues confirmed significant up-regulation ofWHSC1 expression in bladder and lung cancercells at the protein level. Treatment of cancercell lines with small interfering RNA targetingWHSC1 significantly knocked down its expres-sion and resulted in the suppression of prolif-eration. Cell cycle analysis by flow cytometry in-dicated that knockdown of WHSC1 decreasedthe cell population of cancer cells at the S phasewhile increasing that at the G2/M phase. WHSC1 interacts with some proteins related to theWNT pathway including β-catenin and tran-scriptionally regulates CCND1, the target geneof the β-catenin/Tcf-4 complex, through histoneH3 at lysine 36 trimethylation. This is a novelmechanism for WNT pathway dysregulation inhuman carcinogenesis, mediated by the epige-netic regulation of histone H3. Because expres-sion levels of WHSC1 are significantly low inmost normal tissue types, it should be feasibleto develop specific and selective inhibitors tar-geting the enzyme as antitumor agents thathave a minimal risk of adverse reaction.

The JmjC domain-containing histone demeth-ylase KDM3A is a positive regulator of theG1/S transition in cancer cells via transcrip-tional regulation of the HOXA1 gene

A number of histone demethylases have beenidentified and biochemically characterized, yettheir biological functions largely remain unchar-acterized, particularly in the context of humandiseases such as cancer. In this study, we de-scribe important roles for the histone demethy-lase KDM3A, also known as JMJD1A, in humancarcinogensis. Expression levels of KDM3A were

significantly elevated in human bladder carcino-mas compared with nonneoplastic bladder tis-sues (p＜0.0001), when assessed by real-timePCR. We confirmed that some other cancers in-cluding lung cancer also overexpressed KDM3A,using cDNA microarray analysis. Treatment ofcancer cell lines with small interfering RNA tar-geting KDM3A significantly knocked down itsexpression and resulted in the suppression ofproliferation. Importantly, we found that KDM3A activates transcription of the HOXA1 genethrough demethylating histone H3 at lysine 9 di-methylation by binding to its promoter region.Indeed, expression levels of KDM3A and HOXA1 in several types of cancer cell lines and blad-der cancer samples were statistically correlated.We observed the down-regulation of HOXA1 aswell as CCND1 after treatment with KDM3AsiRNA, indicating G1 arrest of cancer cells. To-gether, our results suggest that elevated expres-sion of KDM3A plays a critical role in thegrowth of cancer cells, and further studies mayreveal a cancer therapeutic potential in KDM3Ainhibition.

The histone demethylase JMJD2B plays anessential role in human carcinogenesisthrough positive regulation of cyclin-dependent kinase 6

Histone methyltransferases and demethylasesare known to regulate transcription by alteringthe epigenetic marks on histones, but the pa-thologic roles of their dysfunction in human dis-eases, such as cancer, still remain to be eluci-dated. Herein, we show that the histone de-methylase JMJD2B is involved in human car-cinogenesis. Quantitative real-time PCR showednotably elevated levels of JMJD2B expression inbladder cancers, compared with correspondingnonneoplastic tissues (P＜0.0001), and elevatedprotein expression was confirmed by immuno-histochemistry. In addition, cDNA microarrayanalysis revealed transactivation of JMJD2B inlung cancer, and immunohistochemical analysisshowed protein overexpression in lung cancer.siRNA-mediated reduction of expression ofJMJD2B in bladder and lung cancer cell linessignificantly suppressed the proliferation of can-cer cells, and suppressing JMJD2B expressionlead to a decreased population of cancer cells inS phase, with a concomitant increase of cells inG(1) phase. Furthermore, a clonogenicity assayshowed that the demethylase activity of JMJD2Bpossesses an oncogenic activity. Microarrayanalysis after knockdown of JMJD2B revealedthat JMJD2B could regulate multiple pathwayswhich contribute to carcinogenesis, includingthe cell-cycle pathway. Of the downstream

125

genes, chromatin immunoprecipitation showedthat CDK6 (cyclin-dependent kinase 6), essentialin G1-S transition, was directly regulated byJMJD2B, via demethylation of histone H3-K9 inits promoter region. Expression levels of JMJD2Band CDK6 were significantly correlated in vari-ous types of cell lines. Deregulation of histonedemethylation resulting in perturbation of thecell cycle, represents a novel mechanism for hu-man carcinogenesis and JMJD2B is a feasiblemolecular target for anticancer therapy.

Enhanced expression of EHMT2 is involvedin the proliferation of cancer cells throughnegative regulation of SIAH1

EHMT2 is a histone lysine methyltransferaselocalized in euchromatin regions and acting as acorepressor for specific transcription factors. Al-though the role of EHMT2 in transcriptionalregulation has been well documented, the pa-thologic consequences of its dysfunction in hu-man disease have not been well understood.Here, we describe important roles of EHMT2 inhuman carcinogenesis. Expression levels ofEHMT2 are significantly elevated in humanbladder carcinomas compared with nonneoplas-tic bladder tissues (P＜．0001) in real-time po-lymerase chain reaction analysis. Complemen-tary DNA microarray analysis also revealed itsoverexpression in various types of cancer. Thereduction of EHMT2 expression by small inter-fering RNAs resulted in the suppression of thegrowth of cancer cells and possibly causedapoptotic cell death in cancer cells. Importantly,we show that EHMT2 can suppress transcrip-tion of the SIAH1 gene by binding to its pro-moter region (-293 to ＋51) and by methylatinglysine 9 of histone H3. Furthermore, an EHMT2-specific inhibitor, BIX-01294, significantly sup-pressed the growth of cancer cells. Our resultssuggest that dysregulation of EHMT2 plays animportant role in the growth regulation of can-cer cells, and further functional studies may af-firm the importance of EHMT2 as a promisingtherapeutic target for various types of cancer.

Minichromosome Maintenance Protein 7 is apotential therapeutic target in human cancerand a novel prognostic marker of non-smallcell lung cancer

The research emphasis in anti-cancer drugdiscovery has always been to search for a drugwith the greatest antitumor potential but fewestside effects. This can only be achieved if thedrug used is against a specific target located inthe tumor cells. In this study, we evaluatedMinichromosome Maintenance Protein 7 (MCM

7) as a novel therapeutic target in cancer. Immu-nohistochemical analysis showed that MCM7was positively stained in 196 of 331 non-smallcell lung cancer (NSCLC), 21 of 29 bladder tu-mor and 25 of 70 liver tumor cases whereas nosignificant staining was observed in various nor-mal tissues. We also found an elevated expres-sion of MCM7 to be associated with poor prog-nosis for patients with NSCLC (P＝0.0055). qRT-PCR revealed a higher expression of MCM7 inclinical bladder cancer tissues than in corre-sponding non-neoplastic tissues (P＜0.0001), andwe confirmed that a wide range of cancers alsooverexpressed MCM7 by cDNA microarrayanalysis. Suppression of MCM7 using specificsiRNAs inhibited incorporation of BrdU in lungand bladder cancer cells overexpressing MCM7,and suppressed the growth of those cells moreefficiently than that of normal cell strains ex-pressing lower levels of MCM7. Since MCM7expression was generally low in a number ofnormal tissues we examined, MCM7 has thecharacteristics of an ideal candidate for molecu-lar targeted cancer therapy in various tumorsand also as a good prognostic biomarker forNSCLC patients.

Validation of the histone methyltransferaseEZH2 as a therapeutic target for varioustypes of human cancer and as a prognosticmarker

The emphasis in anticancer drug discoveryhas always been on finding a drug with greatantitumor potential but few side-effects. Thiscan be achieved if the drug is specific for a mo-lecular site found only in tumor cells. Here, wefind the enhancer of zeste homolog 2 (EZH2) tobe highly overexpressed in lung and other can-cers, and show that EZH2 is integral to prolif-eration in cancer cells. Quantitative real-timePCR analysis revealed higher expression of EZH2 in clinical bladder cancer tissues than in corre-sponding non-neoplastic tissues (P＜0.0001), andwe confirmed that a wide range of cancers alsooverexpress EZH2, using cDNA microarrayanalysis. Immunohistochemical analysis showedpositive staining for EZH2 in 14 of 29 cases ofbladder cancer, 135 of 292 cases of non-small-cell lung cancer (NSCLC), and 214 of 245 casesof colorectal cancer, whereas no significantstaining was observed in various normal tissues.We found elevated expression of EZH2 to be as-sociated with poor prognosis for patients withNSCLC (P＝0.0239). In lung and bladder cancercells overexpressing EZH2, suppression of EZH2using specific siRNAs inhibited incorporation ofBrdU and resulted in significant suppression ofcell growth, even though no significant effect

126

was observed in the normal cell strain CCD-18Co, which has undetectable EZH2. Because EZH2 expression was scarcely detectable in all nor-mal tissues we examined, EZH2 shows promiseas a tumor-specific therapeutic target. Further-more, as elevated levels of EZH2 are associatedwith poor prognosis of patients with NSCLC, itsoverexpression in resected specimens couldprove a useful molecular marker, indicating thenecessity for a more extensive follow-up insome lung cancer patients after surgical treat-ment.

Demethylation of RB regulator MYPT1 by his-tone demethylase LSD1 promotes cell cycleprogression in cancer cells

Histone demethylase LSD1 (also known asKDM1 and AOF2) is active in various cancercells, but its biological significance in humancarcinogenesis is unexplored. In this study, weexplored hypothesized interactions between LSD1 and MYPT1, a known regulator of RB1 phos-phorylation. We found that MYPT1 was methyl-ated in vitro and in vivo by histone lysine meth-yltransferase SETD7 and demethylated by LSD1,identifying Lys 442 of MYPT1 as a target formethylation/demethylation by these enzymes.LSD1 silencing increased MYPT1 protein levels,decreasing the steady state level of phospho-rylated RB1 (Ser 807/811) and reducing E2F ac-tivity. MYPT1 methylation status influenced theaffinity of MYPT1 for the ubiquitin-proteasomepathway of protein turnover. MYPT1 was unsta-ble in murine cells deficient in SETD7, support-ing the concept that MYPT1 protein stability isphysiologically regulated by methylation status.LSD1 overexpression could activate RB1 phos-phorylation by inducing a destabilization ofMYPT1 protein. Taken together, our resultscomprise a novel cell cycle regulatory mecha-nism mediated by methylation/demethylationdynamics, and they reveal the significance ofLSD1 overexpression in human carcinogenesis.

Dysregulation of PRMT1 and PRMT6, Type Iarginine methyltransferases, is involved invarious types of human cancers

Protein arginine methylation is a novel post-translational modification regulating a diversityof cellular processes, including histone func-tions, but the roles of protein arginine methyl-transferases (PRMTs) in human cancer are notwell investigated. To address this issue, we firstexamined expression levels of genes belongingto the PRMT family and found significantlyhigher expression of PRMT1 and PRMT6, bothof which are Type I PRMTs, in cancer cells of

various tissues than in non-neoplastic cells. Ab-rogation of the expression of these genes withspecific siRNAs significantly suppressed growthof bladder and lung cancer cells. Expressionprofile analysis using the cells transfected withthe siRNAs indicated that PRMT1 and PRMT6interplay in multiple pathways, supportingregulatory roles in the cell cycle, RNA process-ing and also DNA replication that are funda-mentally important for cancer cell proliferation.Furthermore, we demonstrated that serumasymmetric dimethylarginine (ADMA) levels ofa number of cancer cases are significantly higherthan those of nontumor control cases. In sum-mary, our results suggest that dysregulation ofPRMT1 and PRMT6 can be involved in humancarcinogenesis and that these Type I argininemethyltransferases are good therapeutic targetsfor various types of cancer.

Overexpression of LSD1 contributes to hu-man carcinogenesis through chromatin regu-lation in various cancers

A number of histone demethylases have beenidentified and biochemically characterized, butthe pathological roles of their dysfunction in hu-man disease like cancer have not been well un-derstood. Here, we demonstrate important rolesof lysine-specific demethylase 1 (LSD1) in hu-man carcinogenesis. Expression levels of LSD1are significantly elevated in human bladder car-cinomas compared with nonneoplastic bladdertissues (p＜0.0001). cDNA microarray analysisalso revealed its transactivation in lung and col-orectal carcinomas. LSD1-specific small interfer-ing RNAs significantly knocked down its ex-pression and resulted in suppression of prolif-eration of various bladder and lung cancer celllines. Concordantly, introduction of exogenousLSD1 expression promoted cell cycle progres-sion of human embryonic kidney fibroblast cells.Expression profile analysis showed that LSD1could affect the expression of genes involved invarious chromatin-modifying pathways such aschromatin remodeling at centromere, cen-tromeric heterochromatin formation and chro-matin assembly, indicating its essential roles incarcinogenesis through chromatin modification.

(2) Lung cancer

Chondrolectin is a novel diagnosticbiomarker and a therapeutic target for lungcancer.

PURPOSE: This study aims to identify mole-cules that might be useful as diagnostic/prog-nostic biomarkers and as targets for the devel-

127

opment of new molecular therapies for lungcancer.

EXPERIMENTAL DESIGN: We screened forgenes that were highly transactivated in a largeproportion of 120 lung cancers by means of acDNA microarray representing 27,648 genes andfound chondrolectin (CHODL) as a candidate.Tumor tissue microarray was applied to exam-ine the expression of CHODL protein and itsclinicopathologic significance in archival non-small cell lung cancer (NSCLC) tissues from 295patients. A role of CHODL in cancer cell growthand/or survival was examined by siRNA ex-periments. Cellular invasive effect of CHODL onmammalian cells was examined by Matrigel as-says.

RESULTS: Immunohistochemical staining re-vealed that strong positivity of CHODL proteinwas associated with shorter survival of patientswith NSCLC (P＝0.0006), and multivariateanalysis confirmed it to be an independentprognostic factor. Treatment of lung cancer cellswith siRNAs against CHODL suppressedgrowth of the cancer cells. Furthermore, induc-tion of exogenous expression of CHODL con-ferred growth and invasive activity of mammal-ian cells.

CONCLUSIONS: CHODL is likely to be aprognostic biomarker in the clinic and targetingCHODL might be a strategy for the develop-ment of anticancer drugs.

Identification of Epstein-Barr virus-inducedgene 3 as a novel serum and tissuebiomarker and a therapeutic target for lungcancer.

PURPOSE: This study aims to identify novelbiomarkers and therapeutic targets for lung can-cer.

EXPERIMENTAL DESIGN: We carried outgene expression profile analysis of 120 lung can-cers to screen for genes encoding transmem-brane/secretory molecules that are commonlytransactivated in lung cancers. Epstein-Barrvirus-induced gene 3 (EBI3), which encodes asecretory glycoprotein, was selected as a goodcandidate. Immunohistochemical staining usingtissue microarray consisting of 414 non-smallcell lung cancers was applied to examine the ex-pression level and prognostic value of EBI3. Se-rum EBI3 levels in 400 individuals for trainingassays (274 lung cancers and 126 healthy volun-teers) and those in 173 individuals for validationanalysis (132 lung cancers and 41 healthy volun-teers) were measured by ELISA. The role of EBI3 in cancer cell growth was examined by siRNAand cell growth assays, using cells stably ex-pressing exogenous EBI3.

RESULTS: Immunohistochemical staining ofEBI3 using tissue microarrays revealed that ahigh level of EBI3 expression was associatedwith a poor prognosis of lung cancer (P＝0.0014) and multivariate analysis confirmed it tobe an independent prognostic factor (P＝0.0439).Serum levels of EBI3 in the training set werefound to be significantly higher in lung cancerpatients than in healthy volunteers; this resultwas also observed in the validation set. Further-more, reduction in EBI3 expression by siRNAsuppressed cancer cell proliferation whereas in-duction of exogenous EBI3 conferred growth-promoting activity.

CONCLUSIONS: EBI3 is a potential serumand tissue biomarker as well as therapeutic tar-get for lung cancer.

2. Pharmacogenetics

Genome-wide association study ofepirubicin-induced leukopenia in Japanesepatients.

OBJECTIVES: Despite long-term clinical expe-rience with epirubicin, unpredictable severe ad-verse reactions remain an important determi-nant to limit the drug use. To identify a geneticfactor(s) affecting the risk of epirubicin-inducedleukopenia / neutropenia , we performed agenome-wide association study.

METHODS: We studied 270 patients consist-ing of 67 patients with grade 3 or 4 leukopenia/neutropenia, and 203 patients showing no toxic-ity (patients with grade 1 or 2 were excludedfrom the study) for genome-wide associationstudy. We further examined the single nucleo-tide polymorphisms (SNPs) showing P values ofless than 0.0001 using an additional set of 48 pa-tients with grade 3/4 leukopenia/neutropenia.

RESULTS: The combined analysis indicatedthat rs2916733 in microcephalin 1 [combinedPFisher min＝2.27×10, odds ratio (OR)＝2.74with 95％ confidence interval (CI)＝1.96-3.83;the nonrisk genotype as reference] was signifi-cantly associated with epirubicin-induced leuko-penia/neutropenia. A subgroup analysis of pa-tients with only breast cancer showed a similartrend of association for the marker SNP rs2916733 (combined PFisher min＝6.76×10, OR＝2.80 with 95％ CI＝1.86-4.21). We subse-quently performed haplotype analysis andfound that a haplotype constructed from rs2916733 and rs1031309, which was in linkagedisequilibrium with rs2916733 (r＝0.64), showedstronger association (P＝2.20×10, OR＝2.88with 95％ CI＝2.05-4.03) than a single landmarkSNP (rs2916733; P＝2.27×10, OR＝2.74 with 95％ CI＝1.96-3.83), suggesting that causative vari-

128

ant(s) that could influence the susceptibility ofepirubicin-induced adverse drug reactions(ADRs) might exist in this haplotype.

CONCLUSION: Our findings show that ge-netic variants in the microcephalin 1 locus aresuggestively associated with the risk ofepirubicin-induced ADRs and might be applica-ble in development of diagnostic system for pre-dicting the risk of the ADRs, leading to betterprognosis and quality of life for patients withcancer. However, these results should be consid-ered preliminary until replicated in adequatelylarger powered and controlled samples

Dose-adjustment study of tamoxifen basedon CYP2D6 genotypes in Japanese breastcancer patients.

CYP2D6 is a key enzyme responsible for themetabolism of tamoxifen to active metabolites,endoxifen, and 4-hydroxytamoxifen. The breastcancer patients who are heterozygous and ho-mozygous for decreased-function and null al-leles of CYP2D6 showed lower plasma concen-trations of endoxifen and 4-hydroxytamoxifencompared to patients with homozygous-wild-type allele, resulting in worse clinical outcomein tamoxifen therapy. We recruited 98 Japanesebreast cancer patients, who had been taking 20mg of tamoxifen daily as adjuvant setting. Forthe patients who have one or no normal allele ofCYP2D6, dosages of tamoxifen were increasedto 30 and 40 mg/day, respectively. The plasmaconcentrations of tamoxifen and its metaboliteswere measured at 8 weeks after dose-adjustmentusing liquid chromatography-tandem mass spec-trometry. Association between tamoxifen doseand the incidence of adverse events during thetamoxifen treatment was investigated. In the pa-tients with CYP2D6＊1/＊10 and CYP2D6＊10/＊10,the mean plasma endoxifen levels after dose in-crease were 1.4- and 1.7-fold higher, respec-tively, than those before the increase (P＜0.001).These plasma concentrations of endoxifenachieved similar level of those in the CYP2D6＊1/＊1 patients receiving 20 mg/day of tamoxifen.Plasma 4-hydroxytamoxifen concentrations inthe patients with CYP2D6＊1/＊10 and CYP2D6＊

10/＊10 were also significantly increased to thesimilar levels of the CYP2D6＊1/＊1 patients ac-cording to the increasing tamoxifen dosages (P＜0.001). The incidence of adverse events wasnot significantly different between before andafter dose adjustment. This study provides theevidence that dose adjustment is useful for thepatients carrying CYP2D6＊10 allele to maintainthe effective endoxifen level.

Pharmacogenomics of tamoxifen: roles ofdrug metabolizing enzymes and transporters.

Tamoxifen has been widely used for the pre-vention of recurrence in patients with hormonereceptor-positive breast cancer. Tamoxifen re-quires metabolic activation by cytochrome P450(CYP) enzymes for formation of active metabo-lites, 4-hydroxytamoxifen and endoxifen, whichhave 30- to 100-fold greater affinity to the estro-gen receptor and the potency to suppressestrogen-dependent breast cancer cell prolifera-tion. CYP2D6 is a key enzyme in this metabolicactivation and it has been suggested that the ge-netic polymorphisms of CYP2D6 influence theplasma concentrations of active tamoxifen me-tabolites and clinical outcomes for breast cancerpatients treated with tamoxifen. The geneticpolymorphisms in the other drug-metabolizingenzymes, including other CYP isoforms, sul-fotransferases and UDP-glucuronosyltransferasesmight contribute to individual differences in thetamoxifen metabolism and clinical outcome oftamoxifen therapy although their contributionswould be small. Recently, involvement of a drugtransporter in disposition of active tamoxifenmetabolites was identified. The genetic polymor-phisms of transporter genes have the potentialto improve the prediction of clinical outcome forthe treatment of hormone receptor-positivebreast cancer. This review summarizes currentknowledge on the roles of polymorphisms in thedrug-metabolizing enzymes and transporters intamoxifen pharmacogenomics.

A Genome-Wide Association Study of OverallSurvival in Pancreatic Cancer PatientsTreated with Gemcitabine in CALGB 80303.

Background and Aims: Cancer and LeukemiaGroup B 80303 was a randomized, phase IIIstudy in patients with advanced pancreatic can-cer treated with gemcitabine plus either bevaci-zumab or placebo. We prospectively collectedgermline DNA and conducted a genome-wideassociation study (GWAS) using overall survival(OS) as the endpoint.

EXPERIMENTAL DESIGN: DNA from 351patients was genotyped for more than 550,000single-nucleotide polymorphisms (SNP). Asso-ciations between OS and SNPs were investi-gated using the log-linear 2-way multiplicativeCox proportional hazards model. The subset of294 genetically European patients was used forthe primary analysis.

RESULTS: A nonsynonymous SNP in inter-leukin (IL)17F (rs763780, H161R) and an intronicSNP in strong linkage disequilibrium (rs7771466) were associated with OS using

129

genome-wide criteria (P＜－10(-7)). Median OSwas significantly shorter (P＝2.61×10(-8)) forthe rs763780 heterozygotes [3.1 months; 95％confidence interval (CI), 2.3-4.3] than for the pa-tients without this variant (6.8 months; 95％ CI,5.8-7.3). After adjustment by stratification fac-tors, the P value for the association was 9.51×10(-7).

CONCLUSIONS: The variant 161R form of IL-17F is a natural antagonist of the antiangiogeniceffects of wild-type 161H IL-17F, and angiogene-sis may play an important role in the metastaticspread of pancreatic cancer. In this preliminarystudy, we hypothesize that the angiogenetic po-tential of pancreatic cancers in patients withvariant IL-17F is higher than that of tumors inpatients with wild-type IL-17F, conferring worseprognosis. This exploratory GWAS may providethe foundation for testing the biology and clini-cal effects of novel genes and their heritablevariants through mechanistic and confirmatorystudies in pancreatic cancer.

A genome-wide association study identifieslocus at 10q22 associated with clinical out-comes of adjuvant tamoxifen therapy forbreast cancer patients in Japanese.

Although many association studies of poly-morphisms in candidate genes with the clinicaloutcomes of breast cancer patients receiving ad-juvant tamoxifen therapy have been reported,genetic factors determining individual responseto tamoxifen are not fully understood. To iden-tify genetic polymorphisms associated withclinical outcomes of patients with tamoxifentreatment, we conducted a genome-wide asso-ciation study (GWAS). We studied 462 Japanesepatients with hormone receptor-positive, inva-sive breast cancer receiving adjuvant tamoxifentherapy. Of them, 240 patients were analyzed bygenome-wide genotyping using the IlluminaHuman610-Quad BeadChips, and two inde-pendent sets of 105 and 117 cases were used forreplication studies. In the GWAS, we detectedsignificant associations with recurrence-free sur-vival at 15 single-nucleotide polymorphisms(SNPs) on nine chromosomal loci (1p31, 1q41, 5q33, 7p11, 10q22, 12q13, 13q22, 18q12 and 19p13) that satisfied a genome-wide significantthreshold (log-rank P＝2.87×10(-9)-9.41×10(-8)).Among them, rs10509373 in C10orf11 gene on10 q 22 was significantly associated withrecurrence-free survival in the replication study(log-rank P＝2.02×10(-4)) and a combinedanalysis indicated a strong association of thisSNP with recurrence-free survival in breast can-cer patients treated with tamoxifen (log-rank P＝1.26×10(-10)). Hazard ratio per C allele of rs

10509373 was 4.51 [95％ confidence interval (CI),2.72-7.51; P＝6.29×10(-9)]. In a combined analy-sis of rs10509373 genotype with previously iden-tified genetic makers, CYP2D6 and ABCC2, thenumber of risk alleles of these three genes hadcumulative effects on recurrence-free survivalamong 345 patients receiving tamoxifen mono-therapy (log-rank P＝2.28×10(-12)). In conclu-sion, we identified a novel locus associated withrecurrence-free survival in Japanese breast can-cer patients receiving adjuvant tamoxifen ther-apy.

3. Genome-wide association study

(1) cancer susceptibility gene

Genome-wide association study identifies asusceptibility locus for HCV-induced hepato-cellular carcinoma

To identify the genetic susceptibility factor(s)for hepatitis C virus-induced hepatocellular car-cinoma (HCV-induced HCC), we conducted agenome-wide association study using 432,703autosomal SNPs in 721 individuals with HCV-induced HCC (cases) and 2,890 HCV-negativecontrols of Japanese origin. Eight SNPs thatshowed possible association (P＜1×10(-5)) inthe genome-wide association study were furthergenotyped in 673 cases and 2,596 controls. Wefound a previously unidentified locus in the 5’flanking region of MICA on 6p21.33 (rs2596542,P(combined)＝4.21×10(-13), odds ratio＝1.39) tobe strongly associated with HCV-induced HCC.Subsequent analyses using individuals withchronic hepatitis C (CHC) indicated that thisSNP is not associated with CHC susceptibility(P＝0.61) but is significantly associated withprogression from CHC to HCC (P＝3.13×10(-8)). We also found that the risk allele of rs2596542 was associated with lower solubleMICA protein levels in individuals with HCV-induced HCC (P＝1.38×10(-13)).

Common variant in 6q26-q27 is associatedwith distal colon cancer in Asian population

Colorectal cancer (CRC) is a multifactorial dis-ease with both environmental and genetic fac-tors contributing to its development. The inci-dence of CRC is increasing year by year in Ja-pan. Patients with CRC in advanced stages havea poor prognosis, but detection of CRC at earlierstages can improve clinical outcome. Therefore,identification of epidemiologic factors that influ-ence development of CRC would facilitate theprevention or early detection of disease.

To identify loci associated with CRC risk, we

130

performed a genome-wide association study(GWAS) for CRC and sub-analyses by tumor lo-cation using 1,583 Japanese CRC cases and 1,898controls. Subsequently, we conducted replicationanalyses using a total of 4,809 CRC cases and2,973 controls including 225 Korean subjectswith distal colon cancer and 377 controls.

We identified a novel locus on 6q26-q27 re-gion (rs7758229 in SLC22A3, P＝7.92×10-9,Odds ratio of 1.28) that was significantly associ-ated with distal colon cancer. We also replicatedthe association between CRC and SNPs on 8q24(rs6983267 and rs7837328, P＝1.51×10-8 and7.44×10-8, Odds ratios of 1.18 and 1.17, respec-tively). Moreover, we found cumulative effectsof three genetic (rs7758229, rs6983267, and rs4939827 in SMAD7) and one environmental fac-tors (alcohol drinking) which appear to increaseCRC risk approximately twofold.

We found a novel susceptible locus in SLC22A3 that contributes to the risk of distal coloncancer in Asian population. These findingswould further extend our understanding of therole of common genetic variants in CRC etiol-ogy.

(2) other diseases

A genome-wide association study identifiestwo susceptibility loci for duodenal ulcer inthe Japanese population

Through a genome-wide association analysisusing a total of 7,035 duodenal ulcer cases and25,323 controls of Japanese populations, weidentified two susceptibility loci at the Prostatestem cell antigen (PSCA) on 8q24 and at the ABOblood group (ABO ) on 9q34. A C-allele of rs2294008 at PSCA increased the risk of duodenalulcer (odds ratio (OR) of 1.84 with P value of3.92×10－33) in a recessive model, while it de-creased the risk of gastric cancer (OR of 0.79with P value of 6.79×10－12) as reported previ-ously1. A T-allele of SNP rs2294008 created theupstream translational initiation codon and af-fects the protein localization from cytoplasm tocell surface. SNP rs505922 on ABO also associ-ated with duodenal ulcer in a recessive model(OR of 1.32 with P value of 1.15×10－10). Ourfinding implies the crucial roles of genetic vari-ations on the pathogenesis of duodenal ulcer.

Common variant near the endothelin receptortype A (EDNRA) gene is associated with in-tracranial aneurysm risk.

The pathogenesis of intracranial aneurysm(IA) formation and rupture is complex, with sig-nificant contribution from genetic factors. We

previously reported genome-wide associationstudies based on European discovery and Japa-nese replication cohorts of 5,891 cases and14,181 controls that identified five disease-related loci. These studies were based on testingreplication of genomic regions that containedSNPs with posterior probability of association(PPA) greater than 0.5 in the discovery cohort.To identify additional IA risk loci, we pursued14 loci with PPAs in the discovery cohort be-tween 0.1 and 0.5. Twenty-five SNPs from theseloci were genotyped using two independentJapanese cohorts, and the results from discoveryand replication cohorts were combined by meta-analysis. The results demonstrated significantassociation of IA with rs6841581 on chromosome4q31.23, immediately 5’ of the endothelin recep-tor type A with P＝2.2×10(-8) [odds ratio (OR)＝1.22, PPA＝0.986]. We also observed substan-tially increased evidence of association for twoother regions on chromosomes 12q22 (OR＝1.16,P＝1.1×10(-7), PPA＝0.934) and 20p12.1 (OR＝1.20, P＝6.9×10(-7), PPA＝0.728). Although en-dothelin signaling has been hypothesized toplay a role in various cardiovascular disordersfor over two decades, our results are unique inproviding genetic evidence for a significant as-sociation with IA and suggest that manipulationof the endothelin pathway may have importantimplications for the prevention and treatment ofIA

A genome-wide association study of chronichepatitis B identified novel risk locus in aJapanese population.

Hepatitis B virus (HBV) infection is a majorhealth issue worldwide which may lead to he-patic dysfunction, liver cirrhosis and hepatocel-lular carcinoma. To identify host genetic factorsthat are associated with chronic hepatitis B(CHB) susceptibility, we previously conducted atwo-stage genome-wide association study(GWAS) and identified the association of HLA-DP variants with CHB in Asians; however, only179 cases and 934 controls were genotyped us-ing genome-wide single nucleotide polymor-phism (SNP) arrays. Here, we performed a sec-ond GWAS of 519 747 SNPs in 458 JapaneseCHB cases and 2056 controls. After adjustmentwith the previously identified variants in theHLA-DP locus (rs9277535), we detected strongassociations at 16 loci with P-value of ＜5×10(-5). We analyzed these loci in three independentJapanese cohorts (2209 CHB cases and 4440 con-trols) and found significant association of twoSNPs (rs2856718 and rs7453920) within theHLA-DQ locus (overall P-value of 5.98×10(-28)and 3.99×10(-37)). Association of CHB with

131

SNPs rs2856718 and rs7453920 remains signifi-cant even after stratification with rs3077 and rs9277535, indicating independent effect of HLA-DQ variants on CHB susceptibility (P-value of1.52×10(-21)- 2.38×10(-30)). Subsequent analy-ses revealed DQA1＊0102-DQB1＊0604 and DQA1＊0101-DQB1＊0501 [odds ratios (OR) ＝0.16,and 0.39, respectively] as protective haplotypesand DQA1＊0102-DQB1＊0303 and DQA1＊0301-DQB1＊0601 (OR＝19.03 and 5.02, respectively)as risk haplotypes. These findings indicated thatvariants in antigen-binding regions of HLA-DPand HLA-DQ contribute to the risk of persistentHBV infection.

A genome-wide association study of neph-rolithiasis in the Japanese population identi-fies novel susceptible loci at 5q35.3, 7p14.3and 13q14.1

Nephrolithiasis is a common nephrologic dis-order with complex etiology. To identify the ge-netic factor(s) for nephrolithiasis, we conducteda three-stage genome-wide association study(GWAS) using a total of 5,892 nephrolithiasiscases and 17,809 controls of Japanese origin.Here we found three novel loci for nephrolithia-sis: RGS14-SLC34A1-PFN3-F12 on 5q35.3 (rs11746443; P＝8.51×10－12, odds ratio (OR)＝1.19), INMT-FAM188B-AQP1 on 7p14.3 (rs1000597; P＝2.16×10－14, OR＝1.22), and DGKHon 13q14.1 (rs4142110; P＝4.62×10－9, OR＝1.14).Subsequent analyses in 21,842 Japanese subjectsrevealed the association of SNP rs11746443 withthe reduction of estimated glomerular filtrationrate (eGFR) (P＝6.54×10－8), suggesting the cru-cial role of this variation on renal function. Ourfindings elucidated the significance of geneticvariations for the pathogenesis of nephrolithia-sis.

References

1. H.-S. Cho, T. Suzuki, N. Dohmae, S.Hayami, M. Unoki, M. Yoshimatsu, G.Toyokowa, M. Takawa, T. Chen, J.K.Kurash, H.I. Field, B.A.J. Ponder, Y. Naka-mura, and R. Hamamoto: Demethylation ofRB regulator MYPT1 by histone demethylaseLSD1 promotes cell cycle progression in can-cer cells. Cancer Research, 71655-660, 2011

2. R. Cui, Y. Okada, S.G. Jang, J.L. Ku, J.G.Park, Y. Kamatani, N. Hosono, T. Tsunoda,V. Kumar, C. Tanikawa, N. Kamatani, R.Yamada, M. Kubo, Y. Nakamura, and K.Matsuda: Common variant in 6q26-q27 is as-sociated with distal colon cancer in Asianpopulation. Gut, 60799-805, 2011

3. K. Imai, S. Hirata, A. Irie, S. Senju, Y. Ikuta,K. Yokomine, M. Harao, M. Inoue, Y. Tomi-ta, T. Tsunoda, H. Nakagawa, Y. Nakamura,H. Baba, and Y. Nishimura: Identification ofHLA-A2-restricted CTL epitopes of a noveltumor-associated antigen, KIF20A, overex-pressed in pancreatic cancer. British Journalof Cancer, 104300-307, 2011

4. A. Iida, N. Hosono, M. Sano, T. Kamei, S.Oshima, T. Tokuda, M. Kubo, Y. Nakamura,and S. Ikegawa: Optineurin mutations inJapanese amyotrophic lateral sclerosis. Jour-nal of Neurology, Neurosurgery & Psychia-try, Jan 8. [Epub ahead of print], 2011

5. N. Akuta, F. Suzuki, M. Hirakawa, Y. Kawa-mura, H. Sezaki, Y. Suzuki, T. Hosaka, M.Kobayashi, M. Kobayashi, S. Saitoh, Y.Arase, K. Ikeda, K. Chayama, Y. Nakamura,and H. Kumada: Amino acid substitution inHCV core/NS5A region and genetic vari-

ation near IL28B gene affect treatment effi-cacy to interferon plus ribavirin combinationtherapy. Intervirology, Feb 16, [Epub aheadof print], 2011

6. T. Ozeki, T. Mushiroda, A. Yowang, A.Takahashi, M. Kubo, Y. Shirakata, Z.Ikezawa, M. Iijima, T. Shiohara, K. Hashi-moto, N. Kamatani, and Y. Nakamura:Genome-wide association study identifiesHLA-A＊3101 allele as a genetic risk factorfor carbamazepine-induced cutaneous ad-verse drug reactions in Japanese population.Human Molecular Genetics, 20: 1034-1041,2011

7. Y. Fujimoto, H. Ochi, T. Maekawa, H. Abe,C.N. Hayes, H. Kumada, Y. Nakamura, andK. Chayama: A single nucleotide polymor-phism in activated cdc42 associated tyrosinekinase 1 influences the interferon therapy inhepatitis C patients. J. Hepatology, 54: 629-639, 2011

8. Y. Hashimoto, H. Ochi, H. Abe, Y.Hayashida, M. Tsuge, F. Mitsui, N. Hiraga,M. Imamura, S. Takahashi, C.N. Hayes, W.Ohishi, M. Kubo, T. Tsunoda, N. Kamatani,Y. Nakamura, and K. Chayama: Predictionof response to peginterferon-alfa-2b plusribavirin therapy in Japanese patients in-fected with hepatitis C virus genotype 1b. J.Med. Virol., 83981-988, 2011

9. F. Suzuki, Y. Suzuki, N. Akuta, H. Sezaki,M. Hirakawa, Y. Kawamura, T. Hosaka, M.Kobayashi, S. Saito, Y. Arase, K. Ikeda, M.Kobayashi, K. Chayama, N. Kamatani, Y.Nakamura, Y. Miyakawa, and H. Kumada:

132

Influence of ITPA polymorphism on de-creases of hemoglobin during treatmentwith pegylated IFN, ribavirin and telaprevir.Hepatology, 53415-421, 2011

10. T. Kawaoka , C.N. Hayes, W. Ohishi, H.Ochi, T. Maekawa, H. Abe, M. Tsuge, F.Mitsui, N. Hiraga, M. Imamura, S. Tkaka-hashi, M. Kubo, T. Tsunoda, Y. Nakamura,H. Kumada, and K. Chayama: Predictivevalue of IL28B polymorphism on the effectof interferon therapy in chronic hepatitis Cpatients with genotypes 2a and 2b. J. Hepa-tology, 54408-414, 2011

11. A. Aoki, K. Ozaki, H. Sato, A. Takahashi, M.Kubo, Y. Sakata, Y. Onouchi, T. Kawaguchi,T.-H. Lin, H. Takano, M. Yasutake, P.-C.Hsu, S. Ikegawa, N. Kamatani, T. Tsunoda,S.-H. H. Juo, M. Hori, I. Komuro, K. Mi-zuno, Y. Nakamura, and T. Tanaka: SNPson chromosome 5p15.3 associated with myo-cardial infarction in Japanese population.Journal of Human Genetics, 5647-51, 2011

12. M. Yoshimatsu, G. Toyokawa, S. Hayami,M. Unoki, T. Tsunoda, H.I. Field, J.D. Kelly,D.E. Neal, Y. Maehara, B.A.J. Ponder, Y.Nakamura, and R. Hamamoto: Dysregula-tion of PRMT1 and PRMT6, type I argininemethyltransferases, is involved in varioustypes of human cancers. Int. J. Cancer,128562-573, 2011

13. C.N. Hayes, M. Kobayashi, N. Akuta, F.Suzuki, H. Kumada, H. Abe, D. Miki, M.Imamura, H. Ochi, N. Kamatani, Y. Naka-mura, and K. Chayama: HCV substitutionsand IL28B polymorphisms on outcome ofpeg-interferon plus ribavirin combinationtherapy. Gut, 60261-267, 2011

14. P. Saetre, M. Vares, T. Werge, O.A. Andreas-sen, T. Arinami, H. Ishiguro, S. Nanko, E.C.Tan, D.H. Han, J. Roffman, J.-W. Muntjew-erff, P.P. Jagodzinski, B. Kempisty, J.Hauser, E. Vilella, E. Betcheva, Y. Naka-mura, B. Regland, I. Agartz, H. Hall, L.Terenius, and E.G. Jönsson: Methylenetetra-hydrofolate reductase (MTHFR) C677T andA1298C polymorphisms and age of onset inschizophrenia: a combined analysis of inde-pendent samples. Neuropsychiatric Genetics,Jan 13. [Epub ahead of print]:, 2011

15. S. Chung, H. Nakagawa, M. Uemura, L.Piao, K. Ashikawa, N. Hosono, R. Takata, S.Akamatsu, T. Kawaguchi, T. Morizono, T.Tsunoda, Y. Daigo, K. Matsuda, N. Kama-tani, Y. Nakamura, and M. Kubo: Associa-tion of a novel long non-coding RNA in 8q24 with prostate cancer susceptibility. Can-cer Science, 102245-252, 2011

16. L. Piao, H. Nakagawa, K. Ueda, S. Chung,K. Kashiwaya, H. Eguchi, H. Ohigashi, O.

Ishikawa, Y. Daigo, K. Matsuda, and Y.Nakamura: C12orf48, termed PARP-1 bind-ing protein (PARPBP), enhances poly (ADP-ribose) polymerase-1 (PARP-1) activity andprotects pancreatic cancer cells from DNAdamage. Genes Chromosomes and Cancer,5013-24, 2011

17. A. Iida, Takahashi, M. Deng, Y. Zhang, J.Wang, N. Atsuta, F. Tanaka, T. Kamei, M.Sano, S. Oshima, T. Tokuda, M. Morita, C.Akimoto, M. Nakajima, M. Kubo, N. Kama-tani, I. Nakano, G. Sobue, Y. Nakamura, D.Fan, and S. Ikegawa: Replication analysis ofSNPs on 9p21.2 and 19 p 13.3 withamyotrophic lateral sclerosis in East Asians.Neurobiology of Aging, in press, 2011

18. Y. Tomita, K. Imai, S. Senju, A. Irie, M.Inoue, Y. Hayashida, K. Shiraishi, T. Mori,Y. Daigo, T. Tsunoda, T. Ito, H. Nomori, Y.Nakamura, H. Kohrogi, and Y. Nishimura:A novel tumor-associated antigen, cell divi-sion cycle 45-like can induce cytotoxic Tlymphocytes reactive to tumor cells. CancerScience, in press, 2011

19. J.-H. Park, T. Katagiri, S. Chung, K. Kijima,and Y . Nakamura : Polypeptide N-acetylgalactosaminyltransferase 6 (GALNT6)disrupts mammary acinar morphogenesisthrough O-glycosylation of fibronectin. Neo-plasia, in press, 2011

20. Y. Kato, H. Zembutsu, R. Takata, F. Miya, T.Tsunoda, W. Obara, T. Fujioka, and Y.Nakamura: Predicting response of bladdercancers to gemcitabine and carboplatinneoadjuvant chemotherapy throughgenome-wide gene expression profiling. Ex-perimental and Therapeutic Medicine, 247-56, 2011

21. H. Zembutsu, M. Sasa, K. Kiyotani, T.Mushiroda, and Y. Nakamura: Should CYP2D6 inhibitors be administered in conjunctionwith tamoxifen? Expert Review of Antican-cer Therapy, in press, 2011

22. S. Maeda, D. Koya, S. Araki, T. Babazono, T.Umezono, M. Toyoda, K. Kawai, M. Ima-nishi, T. Uzu, D. Suzuki, H. Maegawa, A.Kashiwagi, Y. Iwamoto, and Y. Nakamura:Association of single nucleotide polymor-phisms within genes encoding sirtuin fami-lies with diabetic nephropathy in Japanesesubjects with type 2 diabetes. Clinical Ex-perimental Nephrology, in press, 2011

23. K. Okamoto, K. Tokunaga, K. Doi, T. Fujita,H. Suzuki, T. Katoh, T. Watanabe, N.Nishida, A. Mabuchi, A. Takahashi, M.Kubo, S. Maeda, Y. Nakamura, and E. Noiri:Common variation in GPC5 is associatedwith acquired nephrotic syndrome. NatureGenetics, 43459-463, 2011

133

24. T.A. Johnson, Y. Niimura, H. Tanaka, Y.Nakamura, and T. Tsunoda: hzAnalyzer:Detection, quantification, and visualizationof contiguous homozygosity in high-densitygenotyping datasets. Genome Biology, inpress, 2011

25. K.G. Tantisira, J. Lasky-Su, M. Harada, A.Murphy, A.A. Litonjua, B.E. Himes, C.Lange, R. Lazarus, J. Sylvia, B. Klanderman,Q.L. Duan, W. Qiu, T. Hirota, F.D. Martinez,D. Mauger, C. Sorkness, S. Szefler, S.C.Lazarus, R.F. Lemanske Jr, S.P. Peters, J.J.Lima, Y. Nakamura, M. Tamari, and S.T.Weiss: Genome-wide association of GLCCI1with asthma steroid treatment response.New Eng. J. Med., in press, 2011

26. V. Kumar, N. Kato, K. Urabe, A. Takahashi,R. Muroyama, N. Hosono, M. Otsuka, R.Tateishi, M. Omata, H. Nakagawa, K. Koike,N. Kamatani, M. Kubo, Y. Nakamura, andK. Matsuda: Genome-wide association studyidentifies a susceptibility locus for HCV-induced hepatocellular carcinoma. NatureGenetics, 43455-458, 2011

27. P.-C. Cha, A. Takahashi, N. Hosono, S.-K.Low, N. Kamatani, M. Kubo, and Y. Naka-mura: A genome-wide association studyidentifies three loci associated with suscepti-bility to uterine fibroids. Nature Genetics,43447-450, 2011

28. F. Nyberg, B.J. Barratt, T. Mushiroda, A.Takahashi, A. Jawaid, S. Hada, T. Umemura,M. Fukuoka, K. Nakata, Y. Ohe, H. Kato, S.Kudoh, R. March, Y. Nakamura, and N.Kamatani: Interstitial lung disease ingefitinib-treated Japanese patients with non-small-cell lung cancer: genome-wide analysisof genetic data. Pharmacogenomics, 12965-975, 2011

29. V. Kumar, K. Matsuo, A. Takahashi, N.Hosono, T. Tsunoda, N. Kamatani, S.-Y.Kong, H. Nakagawa, R. Cui, C. Tanikawa,M. Seto, Y. Morishima, M. Kubo, Y. Naka-mura, and K. Matsuda: Common variantson 14q32 and 13q12 are associated withDLBCL susceptibility. Journal of Human Ge-netics, 56436-439, 2011

30. Y. Okada, T. Hirota, Y. Kamatani, A. Taka-hashi, H. Ohmiya, N. Kumasaka, K. Higasa,Y. Yamaguchi-Kabata, N. Hosono, M.A.Nalls, M.H. Chen, F.J.A. van Rooij, A.V.Smith, T. Tanaka, D.J. Couper, N.A. Zakai,L. Ferrucci, D.L. Longo, D.G. Hernandez, J.C.M. Witteman, T.B. Harris, C.J. O’Donnell,S.K. Ganesh, K. Matsuda, T. Tsunoda, T.Tanaka, M. Kubo, Y. Nakamura, M. Tamari,K. Yamamoto, and N. Kamatani: Identifica-tion of nine novel loci associated with whiteblood cell subtypes in a Japanese popula-

tion. PLoS Genetics, 7e1002067, 2011”31. K. Ueda, N. Saichi, S. Takami, D. Kang, A.

Toyama, Y. Daigo, N. Ishikawa, N. Kohno,K. Tamura, T. Shuin, A. Yamane, M. Ota, T.Sato, Y. Nakamura, and H. Nakagawa:Rapid and efficient peptidome profilingtechnology for the comprehensive identifica-tion of serum peptide biomarkers. PLoSONE, in press, 2011

32. A. Toyama, H. Nakagawa, K. Matsuda, N.Ishikawa, N. Kohno, Y. Daigo, T. Sato, Y.Nakamura, and K. Ueda: Deglycosylationand label-free quantitative LC-MALDI MSapplied to efficient serum biomarker discov-ery of lung cancer. Proteome Science, inpress, 2011

33. M. Takawa, K. Masuda, M. Kunizaki, Y.Daigo, K. Takagi, Y. Iwai, H.-S. Cho, G.Toyokawa, Y. Yamane, K. Maejima, H.I.Field, T. Kobayashi, T. Akasu, M. Sugiyama,E. Tsuchiya, Y. Atomi, B.A.J. Ponder, Y.Nakamura, and R. Hamamoto: Validation ofthe histone methyltransferase EZH2 as atherapeutic target for various types of hu-man cancer and as a prognostic marker.Cancer Science, in press, 2011

34. I. Kou, A. Takahashi, T. Urano, N. Fukui, H.Ito, K. Ozaki, T. Tanaka, T, Hosoi, M.Shiraki, S. Inoue, Y. Nakamura, N. Kama-tani, M. Kubo, S. Mori, and S. Ikegawa:Common variants in a novel gene, FONG onchromosome 2q33.1 confer risk of osteoporo-sis in Japanese. PLoS ONE, in press, 2011

35. M.A. Nalls, D.J. Couper, T. Tanaka, F.J.A.van Rooij, M-H. Chen, A.V. Smith, D. To-niolo, N.A. Zakai, Q. Yang, A. Greinacher,A.R. Wood, M. Garcia, P. Gasparini, Y. Liu,T. Lumley, A.R. Folsom, A.P. Reiner, C.Gieger, V. Lagou, J.F. Felix, H. Völzke, N.A.Gouskova, A. Biffi, A. Döring, U. Völker, S.Chong, K.L. Wiggins, A. Rendon, A.Dehghan, M. Moore, K. Taylor, J.G. Wilson,G. Lettre, A. Hofman, J.C. Bis, N. Pirastu, C.S. Fox, C. Meisinger, J. Sambrook, S. Are-palli, M. Nauck, H. Prokisch, J. Stephens, N.L. Glazer, L.A. Cupples, Y. Okada, A. Taka-hashi, Y. Kamatani, K. Matsuda, T. Tsunoda,T. Tanaka, M. Kubo, Y. Nakamura, K.Yamamoto, N. Kamatani, M. Stumvoll, A.Tonjes, I. Prokopenko, T. Illig, K.V. Patel, S.F. Garner, B. Kuhnel, M. Mangino, B.A. Oos-tra, S.L. Thein, J. Coresh, H.-E. Wichmann,S. Menzel, J. Lin, G. Pistis, A.G. Uitterlinden,T.D. Spector, A. Teumer, G. Eriksdottir, V.Gudnason, S. Bandinelli, T. Frayling, A.Chakravarti, C.M. van Duijn, D. Melzer, W.H. Ouwehand, D. Levy, E. Boerwinkle, A.B.Singleton, D.G. Hernandez, D.L. Longo, N.Soranzo, J.C.M. Witteman, B.M. Psaty, L.

134

Ferrucci, T.B. Harris, C.J. O’Donnell, and S.K. Ganesh: Multiple loci are associated withwhite blood cell phenotypes. PLoS Genetics,7e1002113, 2011

36. C. Terao, R. Yamada, K. Ohmura, M. Taka-hashi, T. Kawaguchi, Y. Kochi, Human Dis-ease Genomics Working Group, RA Clinicaland Genetic Study Consortium, Y. Okada, Y.Nakamura, K. Yamamoto, I. Melchers, M.Lathrop, T. Mimori, and F. Matsuda: Thehuman AIRE gene at chromosome 21q22 is agenetic determinant for the predisposition torheumatoid arthritis in Japanese population.Human Molecular Genetics, in press:, 2011

37. K. Chayama, et al. IL28B but not ITPA poly-morphism is predictive of response to peg-interferon, ribavirin and telaprevir tripletherapy in patients with genotype 1 hepatitisC. Journal of Infectious Diseases, in press:,2011

38. T. Azakami, C.N. Hayes, H. Sezaki, M. Ko-bayashi, N. Akuta, F. Suzuki, H. Kumada,H. Abe, D. Miki, M. Tsuge, M. Imamura, Y.Kawakami, S. Takahashi, H. Ochi, Y. Naka-mura, N. Kamatani, and K. Chayama: Com-mon genetic polymorphism of ITPA gene af-fects ribavirin-induced anemia and effect ofpeg-interferon plus ribavirin therapy. J.Med. Virol., 831048-1057, 2011

39. H. Ochi, T. Maekawa, H. Abe, Y. Hayashida,R. Nakano, M. Imamura, N. Hiraga, Y.Kawakami, S. Aimitsu, J.-H. Kao, M. Kubo,T. Tsunoda, H. Kumada, Y. Nakamura, C.N.Hayes, and K. Chayama: IL-28B predicts re-sponse to chronic hepatitis C therapy - fine-mapping and replication study in Asianpopulations. Journal or General Virology,921071-1081, 2011

40. T. Mushiroda, Y. Nakamura: Personalizingcarbamazepine therapy (Review). GenomicMedicine, in press, 2011

41. S. Chantarangsu, T. Mushiroda, S. Mahasiri-mongkol, S. Kiertiburanakul, S. Sungkanu-parph, W. Manosuthi, W. Tantisiriwat, A.Charoenyingwattana, T. Sura, A. Takahashi,M. Kubo, N. Kamatani, W. Chantratita, andY. Nakamura: Genome-wide associationstudy identifies variations in 6p21.3 associ-ated with nevirapine-induced rash. ClinicalInfectious Diseases, in press, 2011

42. G. Toyokawa, M. Yoshimatsu, K. Masuda,Y. Daigo, E. Tsuchiya, H.-S. Cho, M.Takawa, S. Hayami, K. Maejima, M. Chino,H.I. Field, D.E. Neal, B.A.J. Ponder, Y. Mae-hara, Y. Nakamura, and R. Hamamoto:Minichromosome Maintenance Protein 7 is apotential therapeutic target in human cancerand a novel prognostic marker of non-smallcell lung cancer. Molecular Cancer, in press,

201143. Y. Srinivasan, M. Sasa, J. Honda, A. Taka-

hashi, S. Uno, N. Kamatani, M. Kubo, Y.Nakamura. and H. Zembutsu: Genome-wideassociation study of epirubicin-inducedleukopenia in Japanese patients. Pharmaco-genetics and Genomics, in press, 2011

44. H.-S. Cho, J.D. Kelly, S. Hayami, G. Toyo-kawa, M. Takawa, M. Yoshimatsu, T.Tsunoda, H.I. Field, D.E. Neal, B.A.J. Pon-der, Y. Nakamura, and R. Hamamoto: En-hanced expression of EHMT2 is involved inthe proliferation of cancer cells throughnegative regulation of SIAH. Neoplasia, 13:676-684, 2011

45. T. Tanabe, N. Yamaguchi, K. Matsuda, K.Yamazaki, S. Takahashi, A. Tojo, M. Oni-zuka, Y. Eishi, H. Akiyama, J. Ishikawa, T.Mori, M. Hara, K. Koike, K. Kawa, T.Kawase, Y. Morishima, H. Amano, M.Kobayashi-Miura, T. Kakamu, Y. Nakamura,S. Asano, and Y. Fujita: Association analysisof the NOD2 gene with susceptibility tograft-versus-host disease in a Japanesepopulation. International Journal of Hema-tology, DOI 10.1007/s12185-011-0860-5, 2011

46. D. Miki, H. Ochi, C.N. Hayes, H. Abe, T.Yoshima, H. Aikata, K. Ikeda, H. Kumada, J.Toyota, T. Morizono, T. Tsunoda, M. Kubo,Y. Nakamura, N. Kamatani, and K.Chayama: Variation in the DEPDC5 locus isassociated with progression to hepatocellu-lar carcinoma in chronic hepatitis C viruscarriers. Nature Genetics, in press, 2011

47. T. Hirota, A. Takahashi, M. Kubo, T.Tsunoda, K. Tomita, S. Doi, K. Fujita, A. Mi-yatake, T. Enomoto, T. Miyagawa, M. Ada-chi, H. Tanaka, A. Niimi, H. Matsumoto, I.Ito, H. Masuko, T. Sakamoto, N. Hizawa, M.Taniguchi, J.J. Lima, C.G. Irvin, S.P. Peters,B.E. Himes, A.A. Litonjua, K.G. Tantisira, S.T. Weiss, N. Kamatani, Y. Nakamura, andM. Tamari: Genome-wide association studyidentifies five susceptibility loci for adultasthma in the Japanese population. NatureGenetics, in press:, 2011

48. S. Arakawa, A. Takahashi, K. Ashikawa, N.Hosono, T. Aoi, M. Yasuda, Y. Oshima, S.Yoshida, H. Enaida, T. Tsuchihashi, K. Mori,S. Honda, A. Negi, A. Arakawa, K.Kadonosono, Y. Kiyohara, N. Kamatani, Y.Nakamura, T. Ishibashi, and M. Kubo:Genome-wide association study identifiestwo susceptibility loci for exudative age-related macular degeneration in the Japa-nese population. Nature Genetics, in press:,2011

49. A.P. Reiner, G. Lettre, M.A. Nalls, S.K. Ga-nesh, R. Mathias, M.A. Austin, E. Dean, S.

135

Arepalli, A. Britton, Z. Chen, D. Couper, J.D.Curb, C.B. Eaton, M. Fornage, S.F.A. Grant,T.B. Harris, D. Hernandez, N. Kamatani, B.J.Keating, M. Kubo, A. LaCroix, L.A. Lange,S. Liu, K. Lohman, Y. Meng, E.R. Mohler III,S. Musani, Y. Nakamura, C.J. O’Donnell, Y.Okada, C.D. Palmer, G.J. Papanicolaou, K.V.Patel, A.B. Singleton, A. Takahashi, H. Tang,H.A. Taylor Jr., K. Taylor, C. Thomson, L.R.Yanek, L. Yang, E. Ziv, A.B. Zonderman, A.R. Folsom, M.K. Evans, Y. Liu, D.M. Becker,B.M. Snively, and J.G. Wilson: Genome-wideassociation study of white blood cell countin 16,388 African Americans: the ContinentalOrigins and Genetic Epidemiology Network(COGENT). PLoS Genetics, 7e1002108, 2011

50. Y. Okada, A. Takahashi, H. Ohmiya, N.Kumasaka, Y. kamatani, N. Hosono, T.Tsunoda, K. Matsuda, T. Tanaka, M. Kubo,Y. Nakamura, K. Yamamoto, and N. Kama-tani: Genome-wide association study for C-reactive protein levels identified pleiotropicassociations in the IL6 locus. Human Mo-lecular Genetics, 201224-1231, 2011

51. A. Iida, A. Takahashi, M. Kubo, S. Saito, N.Hosono, Y. Ohnishi, K. Kiyotani, T. Mushi-roda, M. Nakajima, K. Ozaki, T. Tanaka, T.Tsunoda, S. Oshima, M. Sano, T. Kamei, T.Tokuda, M. Aoki, K. Hasegawa, K. Mizo-guchi, M. Morita , Y. Takahashi, M. Katsu-no, N. Atsuta, H. Watanabe, F. Tanaka, R.Kaji, I. Nakano, N. Kamatani, S. Tsuji, G.Sobue, Y. Nakamura, and S. Ikegawa: Afunctional variant in ZNF512B is associatedwith susceptibility to amyotrophic lateralsclerosis in Japanese. Human Molecular Ge-netics, in press:, 2011

52. A. Yosifova, T. Mushiroda, M. Kubo, A.Takahashi, Y. Kamatani, N. Kamatani, D.Stoianov, R. Vazharova, S. Karachanak, I.Zaharieva, I. Dimova, S. Hadjidekova, V.Milanova, N. Madjirova, I. Gerdjikov, T.Tolev, N. Poryazova, G. Kirov, M. Owen, M.O’Donovan, D. Toncheva, and Y. Nakamura:Genome wide association study on bipolardisorder in the Bulgarian population. Genes,Brain and Behavior, in press, 2011

53. H. Mbarek, H. Ochi, Y. Urabe, V. Kumar, M.Kubo, N. Hosono, A. Takahashi, Y. Kama-tani, D. Miki, H. Abe, T. Tsunoda, N. Kama-tani, K. Chayama, Y. Nakamura, and K.Matsuda: A genome-wide association studyof chronic hepatitis B identified novel risklocus in a Japanese population. Human Mo-lecular Genetics, doi:10.1093/hmg/ddr301,2011

54. S. Chung, M. Nakashima, H. Zembutsu, andY. Nakamura: Possible involvement ofNEDD4 in keloid formation; its critical role

in fibroblast proliferation and collagen pro-duction. Proceedings of the Japan Academy,Ser. B, in press, 2011

55. M. Aragaki, K. Takahashi, H. Akiyama, E.Tsuchiya, S. Kondo, Y. Nakamura, and Y.Daigo: Characterization of a CleavageStimulation Factor, 3’ pre-RNA, Subunit 2,64kDa (CSTF2) as a therapeutic target forlung cancer. Clinical Cancer Research, inpress, 2011

56. R. Nishino, A. Takano, N. Ishikawa, H. Aki-yama, H. Ito, H. Nakayama, Y. Miyagi, E.Tsuchiya, N. Kohno, Y. Nakamura, and Y.Daigo: Identification of Epstein-Barr virus-induced gene 3 as a novel serum and tissuebiomarker and a therapeutic target for lungcancer. Clinical Cancer Research, in press,2011

57. M. Maimbo, K. Kiyotani, T. Mushiroda, C.Masimirembwa, and Y. Nakamura: CYP2B6genotype is a strong predictor for systemicexposure of efavirenz in HIV-infected Zim-babweans. European Journal of ClinicalPharmacology, in press, 2011

58. I.P.M. Tomlinson, M. Dunlop, H. Campbell,B. Zanke, S. Gallinger, T. Hudson, T.Koessler, P.D. Pharoah, I. Niittymäkix, S.Tuupanenx, L.A. Aaltonen, K. Hemminki, A.Lindblom, A. Försti, O. Sieber, L. Lipton, T.van Wezel, H. Morreau, J.T. Wijnen, P. Dev-ilee, K. Matsuda, Y. Nakamura, S. Castellvl-Bel, C. Ruiz-Ponte, A. Castells, A. Car-racedo, J.W.C. Ho, P. Sham, R.M.W. Hofstra,P. Vodicka, H. Brenner, J. Hampe, C. Schaf-mayer, J. Tepel, S. Schreiber, H. Völzke, M.M. Lerch, C.A. Schmidt, S. Buch, V. Moreno,C.M. Villanueva, P. Peterlongo, P. Radice,M.M. Echeverry, A. Velez, L. Carvajal-Carmona, R. Scott, S. Penegar, P. Broderick,A. Tenesa, and R.S. Houlston: COGENT(COlorectal cancer GENeTics): an interna-tional consortium to study the role of poly-morphic variation on the risk of colorectalcancer. Mutagenesis, in press, 2011

59. J.C. Chambers, W. Zhang, J. Sehm, Y. Naka-mura et al. Identification of genetic loci in-fluencing markers of liver function in man.Nature Genetics, in press, 2011

60. N. Kumasaka , H. Fujisawa, N. Hosono, Y.Okada, A. Takahashi, Y. Nakamura, M.Kubo, and N. Kamatani: Platinum CNV: abayesian gaussian mixture model for geno-typing copy number polymorphisms usingSNP array signal intensity data. Genetic Epi-demiology, in press, 2011

61. K. Kiyotani, T. Mushiroda, and Y. Naka-mura: Pharmacogenomics of anticonvulsantagents (review). BioTech International, inpress, 2011

136

62. G. Toyokawa, H.-S. Cho, K. Masuda, M.Yoshimatsu, S. Hayami, M. Takawa, Y. Iwai,Y. Daigo, E. Tsuchiya, T. Tsunoda, H.I.Field, J.D. Kelly, D.E. Neal, Y. Maehara, B.A.J. Ponder, Y. Nakamura, and R. Hamamoto:The histone lysine methyltransferase Wolf-Hirschhorn Syndrome Candidate 1 is in-volved in human carcinogenesis throughregulation of the Wnt pathway. Neoplasia,in press, 2011

63. J. Li, D. Yang, Y. He, M. Wang, Z. Wen, L.Liu, J. Yao, K. Matsuda, Y. Nakamura, J. Yu,X. Jiang, S. Sun, Q. Liu, X. Jiang, Q. Song,M. Chen, H. Yang, F. Tang, X. Hu, J. Wang,Y. Chang, X. He, Y. Chen, and J. Lin: Asso-ciations of HLA-DP variants with hepatitis Bvirus infection in Southern and NorthernHan Chinese populations: A multicentercase-control study. PLoS ONE, 6e24221, 2011

64. G. Toyokawa, H.-S. Cho, Y. Iwai, M. Taka-wa, M. Yoshimatsu, S. Hayami, K. Maejima,N. Shimizu, H. Tanaka, T. Tsunoda, H.I.Field, J.D. Kelly, D.E. Neal, B.A.J. Ponder, Y.Maehara, Y. Nakamura, and R. Hamamoto:The histone demethylase JMJD2B plays anessential role in human carcinogenesisthrough positive regulation of cyclin-dependent kinase. Cancer Prevention Re-search, 4:2051-2061, 2011

65. K. Kiyotani, T. Mushiroda, Y. Nakamura,and H. Zembutsu: Pharmacogenomics of ta-moxifen: roles of drug metabolizing en-zymes and transporters. Drug Metabolismand Pharmacokinetics, in press, 2011

66. H.-S. Cho, G. Toyokawa, Y. Daigo, S.Hayami, K. Masuda, N. Ikawa, Y. Yamane,K. Maejima, M. Yoshimatsu, T. Tsunoda, H.I. Field, J.D. Kelly, D.E. Neal, B.A.J. Ponder,Y. Maehara, Y. Nakamura, and R. Hama-moto: The JmjC domain-containing histonedemethylase KDM3A Is a positive regulatorof the G1/S transition in cancer cells viatranscriptional regulation of the HOXA1gene. Int. J. Cancer, in press, 2011

67. M. Imamura, M. Iwata, H. Maegawa, H.Watada, H. Hirose, Y. Tanaka, K. Tobe, K.Kaku, A. Kashiwagi, R. Kawamori, Y. Naka-mura, and S. Maeda: Genetic variants atCDC123/CAMK1D and SPRY2 are associ-ated with susceptibility to type 2 diabetes inthe Japanese population. Diabetologia, inpress, 2011

68. T. Ohshige, M. Iwata, S. Omori, Y. Tanaka,H. Hirose, K. Kaku, H. Maegawa, H.Watada, A. Kashiwagi, R. Kawamori, K.Tobe, T. Kadowaki, Y. Nakamura, and S.Maeda: Association of new loci identified inEuropean genome-wide association studieswith susceptibility to type 2 diabetes in the

Japanese. PLoS ONE, 6e26911, 201169. C. Gieger, A. Radhakrishnan, A. Cvejic, W.

Tang, Y. Nakamura, et al.: New gene func-tions in megakaryopoiesis and platelet for-mation. Nature, in press, 2011

70. J. Koinuma, H. Akiyama, M. Fujita, M.Hosokawa, E. Tsuchiya, S. Kondo, Y. Naka-mura, and Y. Daigo: Characterization of anOpa interacting protein 5 (OIP5) involved inlung and esophageal carcinogenesis. CancerScience, in press, 2011

71. F. Innocenti, K. Owzar, N.L. Cox, P. Evans,M. Kubo, H. Zembutsu, C. Jiang, D. Hollis,T. Mushiroda, L. Li, P. Friedman, L. Wang,H. Hurwitz, K.M. Giacomini, H.L. McLeod,R.M. Goldberg, R.L. Schilsky, H.L. Kindler,Y. Nakamura, and M.J. Ratain: A genome-wide association study of overall survival inpancreatic cancer patients treated with gem-citabine in CALGB 80303. Clinical CancerResearch, in press, 2011

72. M. Furu, Y. Kajita, S. Nagayama, T. Ishibe,Y. Shima, K. Nishijo, D. Uejima, R Taka-hashi, T. Aoyama, T. Nakayama, T Naka-mura, Y. Nakashima, M. Ikegawa, S. Imoto,T. Katagiri, Y. Nakamura, and J. Toguchida:Identification of AFAP1L1 as a prognosticmarker for spindle cell sarcomas. Oncogene,304015-4025, 2011

73. K. Masuda, A. Takano, H. Oshita, H. Aki-yama, E. Tsuchiya, N. Kohno, Y. Nakamura,and Y. Daigo: Chondrolectin is a novel diag-nostic biomarker and a therapeutic target forlung cancer. Clinical Cancer Research, inpress, 2012

74. K. Kiyotani, S. Uno, T. Mushiroda, A. Taka-hashi, M. Kubo, N. Mitsuhata, S. Ina, C. Ki-hara, Y. Kimura, H. Yamaue, K. Hirata, Y.Nakamura, and H. Zembutsu: Genome-wideassociation study identifies four geneticmarkers for hematological toxicities in can-cer patients receiving gemcitabine therapy.Pharmacogenetics and Genomics, in press,2012

75. Y. Okada, M. Kubo, H. Ohmiya, A. Taka-hashi, N. Kumasaka, N. Hosono, S. Maeda,W. Wen, R. Dorajoo, M.-J. Go, W. Zheng, N.Kato, J.-Y. Wu, Q. Lu, the GIANT consor-tium, T. Tsunoda, K. Yamamoto, Y. Naka-mura, N. Kamatani, and T. Tanaka: Com-mon variants at CDKAL1 and KLF9 are as-sociated with body mass index in East Asianpopulations. Nature Genetics, in press, 2012

76. K. Dabanaka, S.-Y. Chung, H. Nakagawa, Y.Nakamura, T. Okabayashi, T. Sugimoto, K.Hanazaki, and M. Furihata: PKIB expressionstrongly correlated with phosphorylated Aktexpression in the breast cancers and alsowith triple negative breast cancer subtype.

137

Medical Molecular Morphology, in press,2012

77. W. Wen, Y.S. Cho, W. Zheng, R. Dorajoo, Y.Nakamura, et al.: Meta-analysis of genome-wide association studies in East Asians iden-tifies novel genetic variants associated withbody mass index. Nature Genetics, in press,2012

78. R. Abo, S. Hebbring, Y. Ji, H. Zhu, Z.-B.Zeng, A. Batzler, G.D. Jenkins, J. Biernacka,K. Snyder, M. Drews, O. Fiehn, B. Fridley,D. Schaid, N. Kamatani, Y. Nakamura, M.Kubo, T. Mushiroda, R. Kaddurah-Daouk,D.A. Mrazek, and R.M. Weinshilboum:Merging pharmacometabolomics with phar-macogenomics using “1000 Genomes” SNPimputation: Selective serotonin reuptake in-hibitor response pharmacogenomics. Phar-macogenetics and Genomics, in press, 2012

79. H. Nakagawa, S. Akamatsu, R. Takata, A.Takahashi, M. Kubo, and Y. Nakamura:Prostate cancer genomics, biology, and riskassessment through genome-wide associa-tion studies. Cancer Science, in press, 2012

80. K. Kiyotani, T. Mushiroda, T. Tsunoda, T.Morizono, N. Hosono, M. Kubo, Y. Taniga-wara, C.K. Imamura, D.A. Flockhart, F. Aki,K. Hirata, Y. Takatsuka, M. Okazaki, S. Oh-sumi, T. Yamakawa, M. Sasa, Y. Nakamura,and H. Zembutsu: A Genome-wide associa-tion study identifies locus at 10q22 associ-ated with clinical outcomes of adjuvant ta-moxifen therapy for breast cancer patients.Human Molecular Genetics, in press, 2012

81. H. Yoshioka, S. Yamamoto, H. Hanaoka, Y.Iida, P. Paudya, T. Higuchi, H. Tominaga,N. Oriuchi, H. Nakagawa, Y. Shiba, K.Yoshida, R. Osawa, T. Katagiri, T. Tsunoda,Y. Nakamura, and K. Endo: In vivo thera-peutic effect of CDH3/P-cadherin-targetingradioimmunotherapy. Cancer Immunology,Immunotherapy, in press, 2012

82. C. Tanikawa, Y. Urabe, K. Matsuo, M. Kubo,A. Takahashi, H. Ito, K. Tajima, N. Kama-tani, Y. Nakamura, and K. Matsuda:Genome wide association study identifiedtwo susceptible loci for duodenal ulcer inJapanese population. Nature Genetics, inpress, 2012

83. Y. Urabe, C. Tanikawa, A. Takahashi, Y.Okada, T. Morizono, T. Tsunoda, N. Kama-tani, K. Kohri, K. Chayama, M. Kubo, Y.Nakamura, and K. Matsuda: Genome-wideassociation study of nephrolithiasis in Japa-nese population identifies novel susceptibleloci at 5q35.3, 7p14.3 and 13q14.1. PLoS Ge-netics, in press, 2012

84. S.-K. Low, A. Takahashi, P.-C. Cha, H. Zem-butsu, N. Kamatani, M. Kubo, and Y. Naka-

mura: Genome-wide association study forintracranial aneurysm in Japanese popula-tion identifies three candidate susceptibleloci and a functional genetic variant at ED-NRA. Human Molecular Genetics, in press,2012

85. C. Tanikawa, M. Espinosa, A. Suzuki, K.Masuda, K. Yamamoto, E. Tsuchiya, K.Ueda, Y. Daigo, Y. Nakamura, and K.Matsuda: Regulation of histone modificationand chromatin structure by the p53-PADI4pathway. Nature Communications, in press,2012

86. P.-C. Cha, H. Zembutsu, A. Takahashi, M.Kubo, N. Kamatani, and Y. Nakamura Agenome-wide association study (GWAS)identifies SNP in DCC is associated withgallbladder cancer (GC) in the Japanesepopulation Journal of Human Genetics, inpress, 2012

87. S. Akamatsu, R. Takata, C.A. Haiman, A.Takahashi, T. Inoue, M. Kubo, M. Furihata,N. Kamatani, J. Inazawa, G.K. Chen, L.L.Marchand, L.N. Kolonel, T. Katoh, Y.Yamano, M. Yamakado, H. Takahashi, H.Yamada, S. Egawa, T. Fujioka, B.E. Hender-son, T. Habuchi, O. Ogawa, Y. Nakamura,and H. Nakagawa: Common variants at 11q12, 10q26 and 3p11.2 are associated withprostate cancer susceptibility in Japanese.Nature Genetics, in press, 2012

88. W.-C. Chang, C.-H. Lee, T. Hirota, L.-F.Wang, S. Doi, A. Miyatake, T. Enomoto, K.Tomita, M. Sakashita, T. Yamada, S. Fujieda,K. Ebe, H. Saeki, S. Takeuchi, M. Furue, W.-C. Chen, Y.-C. Chiu, W.P. Chang, C.-H.Hong, E. Hsi, S.-H. H. Juo, H.-S. Yu, Y.Nakamura, and M. Tamari: ORAI1 geneticpolymorphisms associated with the suscepti-bility of atopic dermatitis in Japanese andTaiwanese populations. PLoS ONE, 1e29387,2012

89. W. Osman, Y. Okada, Y. Kamatani, M.Kubo, K. Matsuda, and Y. Nakamura: Asso-ciation of common variants in TNFRSF13B,TNFSF13, and ANXA3 with serum levels ofnon-albumin protein and immunoglobulinisotypes in the Japanese population. PLoSONE, in press, 2012

90. H.H. Nguyen, R. Takata, S. Akamatsu, D.Shigemizu, T. Tsunoda, M. Furihata, A.Takahashi, M. Kubo, N. Kamatani, O.Ogawa, T. Fujioka, Y Nakamura, and H.Nakagawa: IRX4 at 5p15 suppresses prostatecancer growth through the interaction withvitamin D receptor, conferring prostate can-cer susceptibility. Human Molecular Genet-ics, in press, 2012

91. Y. Okada, C. Terao, K. Ikari, Y. Kochi, K.

138

Ohmura, A. Suzuki, T. Kawaguchi, E. Stahl,F. Kurreman, N. Nishida, H. Ohmiya, K.Myouzen, M. Takahashi, T. Sawada, Y.Nishioka, M. Yukioka, T. Matsubara, S.Wakitani, R. Teshima, S. Tohma, K.Takasugi, K. Shimada, A. Murasawa, S.Honjo, K. Matsuo, H. Tanaka, K. Tajima, T.Suzuki, T. Iwamoto, Y. Kawamura, H. Tanii,Y. Okazaki, T. Sasaki, P.K. Gregersen, L.Padyukov, J. Worthington, K.A. Siminovitch,M. Lathrop, A. Taniguchi, A. Takahashi, K.Tokunaga, M. Kubo, Y. Nakamura, N.Kamatani, T. Mimori, R.M. Plenge, H.Yamanaka, S. Momohara, R. Yamada, F.Matsuda, and K. Yamamoto: Meta-analysisidentifies nine new loci associated withrheumatoid arthritis in the Japanese popula-tion. Nature Genetics, in press, 2012

92. W. Jongjaroenpraser, T. Phusantisampan, S.Mahasirimongkol, T. Mushiroda, N. Hirank-arn, T. Snabboon, S. Chanprasertyotin, P.Tantiwong, S. Soonthornpun, P. Rat-tanapichart, S. Mamanasiri, T. Himathong-

kam, B. Ongphiphadhanakul, A. Takahashi,N. Kamatani, M. Kubo, and Y. Nakamura: Agenome-wide association study identifiesnovel susceptibility loci for thyrotoxic hy-pokalemic periodic paralysis. Human Genet-ics, in press, 2012

93. M. Aoki, N. Hosono, S. Kakata, Y. Naka-mura, N. Kamatani, and M. Kubo: Newpharmacogenetic test for detecting HLA-A＊

31:01 allele using invaderPlus assay. Phar-macogenetics and Genomics, in press, 2012

94. K. Kiyotani, T. Mushiroda, C.K. Imamura, Y.Tanigawara, N. Hosono, M. Kubo, M. Sasa,Y. Nakamura, and H. Zembutsu: Dose-adjustment study of tamoxifen based onCYP2D6 genotypes in Japanese breast cancerpatients. Breast Cancer Res Treat, in press,2012

95. C. Tanikawa, H. Nakagawa, Y. Furukawa,Y. Nakamura, and K. Matsuda: Title CLCA2as a p53-inducible senescence mediator.Neoplasia, in press, 2012

139

The mission of our laboratory is to conduct computational (“in silico”) studies onthe functional aspects of genome information. Roughly speaking, genome informa-tion represents what kind of proteins/RNAs are synthesized under which condi-tions. Thus, our study includes the structural analysis of molecular function ofeach gene product as well as the analysis of its regulatory information, which willlead us to the understanding of its cellular role represented by the networks ofinter-gene interactions.

1. Computational model for deciphering tran-scription of co-expressed genes

Yosvany Lopez and Kenta Nakai

The understanding of the mechanisms of tran-scriptional regulation remains a great challengefor molecular biologists in the post-genome era.At the transcriptional level, DNA-binding pro-teins (transcription factors) modulate the expres-sion of genes by binding to their specific DNAregulatory elements in nearby genomic regions.Nowadays, the identification and characteriza-tion of these components is valuable because thepresence or absence of transcription factor bind-ing sites (TFBSs) is thought to be responsible forthe complexity of gene regulation in every liv-ing organism. Based on the assumption that theregulatory regions of (at least a part of) thosegenes showing similar expression profilesshould share some common structural character-istics, we are attempting to design a computa-tional model capable of explaining how thebinding of transcription factors is carried out inpromoter regions of co-regulated genes in spe-cific biological tissues. We are working on a da-tabase of co-expressed genes in Arabidopsis thali-ana (ATTED-II) because, among other reasons,

the intergenic regions of plant genes are smallerthan that of higher organisms. Working on A.thaliana genes will facilitate our initial analysis,but once a good enough model has beenachieved, it will be applied to different humantissues to shed light on the transcriptional proc-esses behind co-regulated genes.

2. TSS-Seq time-series analysis of mice den-dritic cells after LPS stimulation

Kuo-Ching Liang, Yutaro Kumagai1, ShizuoAkira1, and K. Nakai: 1WPI-iFReC, OsakaUniversity.

Dendritic cells act as intermediary betweenexternal environment and mammalian adaptiveimmunity mechanism by presenting foreign an-tigens to various types of lymphocytes. The dy-namic changes in the complex web of controland interaction relationships in the dendriticcells due to immune responses are therefore ofgreat interest in the understanding of mammal-ian immune system. In this work, we attempt toelucidate and characterize the temporal behaviorof transcription start sites (TSS) and gene ex-pression in mouse dendritic cell after elicitingimmune response by lipopolysaccharide (LPS)

Human Genome Center

Laboratory of Functional Analysis In Silico機能解析イン・シリコ分野

Professor Kenta Nakai, Ph.D.Assistant Professor Ashwini Patil, Ph.D.

教授理学博士中井謙太助教理学博士パティル，アシュウィニ

140

stimulation, we analyze the TSS-Seq data formouse dendritic cell samples induced frombone-marrow cells with the existence of GM-CSF, and collected at 0hr, 0.5hr, 1hr, 2hrs, 3hrs,4hrs, 6hrs, 8hrs, 16hrs and 24hrs from LPSstimulation. In addition to looking for changesin the TSS distribution patterns of the dendriticcell genome across the time steps, the TSS-tagsfor each gene are also collected as a digitalcount of gene expression for time-series analysis.After the appropriate normalization and re-sampling steps, genes are clustered based ontheir gene expression time-series, and comparedto gene-ontology annotations and known pro-tein and genomic networks to discover anyfunctional over-representation. Cross correlationand convolutional analysis are also used to dis-cover possible time-delayed gene regulatory re-lationship present in the data. Furthermore, weexplore the expression time-series to discoverany cyclic behavior.

3. Construction of DBTSS version 8 with epi-genomic information

Riu Yamashita2, Yutaka Suzuki3, SumioSugano3, Kenta Nakai: 2Frontier Research In-itiative, IMSUT, 3Grad. Sch. Frontier Sciences,U. Tokyo.

We have constructed the DBTSS (DataBase ofTranscriptional Start Sites), which represents ex-act positions of transcriptional start sites (TSSs)in the genome based on our unique experimen-tally validated TSS sequencing method, TSS-Seq.In this update, we included new TSS data, sothat a major part of human adult and embryonictissues are covered. DBTSS now contains 491million TSS tag sequences for collected from atotal of 20 tissues and 7 cell cultures. We alsointegrated our newly generated RNA-seq dataof subcellular- fractionated RNAs and ChIP-Seqdata of histone modifications, RNA polymeraseII and several transcriptional regulatory factorsin cultured cell lines. We also included recentlyaccumulating external epigenomic data, such aschromatin map of the ENCODE project. We fur-ther associated this TSS information with publicand original SNV data, in order to indentify sin-gle nucleotide variations (SNVs) in the regula-tory regions. This data can be browsed in ournew viewer which also supports versatile searchconditions for users (http://dbtss.hgc.jp).

4. Massive-scale RNA-Seq analysis of mouseearly embryo development

Sung-Joon Park, Makiko Komata4, FukashiInoue5, Kaori Yamada5, Kenta Nakai, Miho

Ohsugi5, Katsuhiko Shirahige4: 4IMCB, U.Tokyo, 5Division of Oncology.

To understand the mechanism of biologicalevents observed during the early mammalianembryo development, we performed transcrip-tome analysis on mouse oocyte, one-cell, two-cell, and four-cell embryonic stages using ap-proximately 10,000 high-quality cells per stage,which is an unprecedented experimental scale.Over 100 million short reads sequenced bySOLiD were analyzed by Tophat and Cufflinkssoftware coupled with a recursive mappingstrategy, where unmapped reads are truncatedand mapped again. Consequently, we profiledover 16,000 RefSeq coding transcripts and 37candidates of novel genes. The profile coveredalmost all genes initially assayed by microarrayand detected more 7,718 genes, supporting theadvance of RNA-Seq assay. By clustering themRNA abundance at each stage, we found 21gene expression patterns. The patterns arelinked to the well-known expression transitions,such as the degradation of maternal mRNAs,minor or major zygotic expression, and stage-specific transient expression. Remarkably, wediscovered four novel patterns that exhibit anacute oscillatory transition of activation and re-pression. The analysis of gene ontology enrich-ment for the novel patterns suggested their im-portance in the development. In addition, weobserved that particular ncRNAs, such assnoRNA and RNase MRP RNA, have been in-creasingly pooled after fertilizing. It implies thatthe gene regulation program switches on thegene expression required for the appropriate de-velopment and the regulation is accompaniedby ncRNA activity. The transcriptome profile weestablished here is an important resource to helpenhance our understanding of mammalian em-bryo development.

5. Comprehensive analysis of transcriptioninitiation patterns, promoter architectureand function in Ciona intestinalis

Rui Yokomori, Kotaro Shimai6, Koki Nishi-tsuji7, Yutaka Suzuki3, Takehiro Kusakabe6

and Kenta Nakai: 6Fac. Sci. and Eng., KonanU., 7Grad. Sch. Life Sci., U. Hyogo.

Transcription is known to start from multiplepositions and the distribution of transcriptionstart sites (TSSs) can be classified into threetypes: NP (Narrow with Peak), BP (Broad withPeak) and WP (Weak Peak). However, the bio-logical meaning of each type is poorly under-stood. In this study, we identified promotersacross the Ciona intestinalis genome and ana-

141

lyzed their TSS distributions (transcription in-itiation patterns), promoter architectures andfunctions using 8 different tissues. As a result,we found that TATA-box was significantly en-riched in NP type compared to WP type, al-though initiator motif did not show the differ-ence across each type. Gene Ontology enrich-ment analysis showed that the transcripts withNP, BP and WP are associated with develop-ment, translation and transport, respectively.Also, our analysis suggested current annotationof transcription start sites is incomplete andidentified 40 novel candidate promoters.

6. Analyses of relationship between Y14RNAi knockdown and occurrences of ir-regular splicing

Shunichi Wakabayashi, Kazuhiro Fukumura8,Masami Shiimori8, Kenta Nakai, Kunio Inoue8,and Hiroshi Sakamoto8: 8Gradu. Sch. Sci.,Kobe U.

RNA-binding protein Y14 (also known asRBM8A) is a member of the protein complexEJC (exon junction complex) which influencesmRNA splicing and their transport to the cyto-plasm. Under the condition of Y14 knockdown,it was reported that some mRNAs retained in-trons and occurrences of irregular splicing in cy-toplasm were observed. Here we analyzed func-tions of Y14 for mRNA processing in worm andhuman cells. We identified some significantly re-tained introns (e.g. 6th intron of TALDO1, 4th in-tron of TMEM147) by Y14 RNAi using computa-tional RNA-seq data analyses and experimentalconfirmations using RT-PCR. We analyzed rela-tionships between occurrences of irregular splic-ing and attributes of introns like length or posi-tion in respective genes.

7. Non-repetitive protein regions encoded bynucleotide repeats have a significant im-pact on protein structure

Toshiyuki Tsuji, Ashwini Patil and KentaNakai

Nucleotide repeats in coding regions signifi-cantly affect the structure and function of pro-teins. While the structural properties of the re-sulting amino acid repeats have been widelystudied, those of the encoded non-repetitive re-gions have so far not been investigated. In thisstudy, we compared the structural features ofnon-repetitive protein regions encoded by nu-cleotide repeats with those of amino acid re-peats and non-repeat fragments. Similar toamino acid repeat regions, non-repetitive pro-

tein regions resulting from nucleotide repeatsshow higher levels of intrinsic disorder andstructural flexibility compared to non-repeat re-gions. Their amino acid propensity is also simi-lar to that of amino acid repeat regions. How-ever, their patterns of amino acid conservationand their greater prevalence in the interface re-gions, as well as their ability to form stable sec-ondary structures bear a greater similarity tonon-repeat regions. We conclude that non-repetitive protein regions encoded by nucleotiderepeats are distinct from amino acid repeat andnon-repeat regions, but retain certain structuralproperties of both. We propose that their uniquecombination of structural characteristics enablesthem to play an important role in protein func-tion.

8. Functional annotation of proteins with in-trinsically disordered regions using aminoacid content similarity

Ashwini Patil, Shunsuke Teraguchi9, HuyDinh9, Daron Standley9 and Kenta Nakai: 9Inst.Protein Res., Osaka U.

Intrinsically disordered regions in proteins areregions without a stable tertiary structure. De-spite the lack of a stable structure, these regionsplay an important role in protein function as aresult of their flexibility and adaptability. Intrin-sically disordered regions are found in proteinsfunctioning in cell signaling and transcriptionregulation. However, several such regions arenot associated with any function and are oftenignored during the functional annotation of pro-teins. Function prediction of proteins with largedisordered regions is difficult using conven-tional techniques of sequence similarity becausethese regions evolve rapidly and are often oflow complexity. In this project, we developed aweb server called the IDD Navigator that, givena disordered region, identifies similar disor-dered regions using amino acid content and pre-dicts the function of the query sequence usingGene Ontology terms enriched in the hit se-quences. We found that this method performsbetter than random at predicting associatedGene Ontology terms with given disordered re-gions.

9. Assessing the utility of a gene co-expression stability measure in the studyof protein-protein interaction networks.

Ashwini Patil, Kengo Kinoshita10 and KentaNakai: 10Tohoku U.

We assessed the utility of gene co-expression

142

stability, a means of measuring the bias in geneco-expression, as an additional measure to sup-port the co-expression correlation in the analysisof protein-protein interaction networks. Westudied the patterns of co-expression correlationand stability in interacting proteins with respectto their interaction promiscuity, levels of intrin-sic disorder, and essentiality or disease-relatedness. Co-expression stability, along withco-expression correlation, acts as a better classi-fier of hub proteins in interaction networks,than co-expression correlation alone, enablingthe identification of a class of hubs that arefunctionally distinct from the widely acceptedtransient (date) and obligate (party) hubs. Pro-teins with high levels of intrinsic disorder havelow co-expression correlation and high stabilitywith their interaction partners suggesting theirinvolvement in transient interactions, except fora small group that have high co-expression cor-relation and are typically subunits of stablecomplexes. Similar behavior was seen fordisease-related and essential genes. Interactingproteins that are both disordered have higher co-expression stability than ordered protein pairs.Using co-expression correlation and stability, wefound that transient interactions are more likelyto occur between an ordered and a disorderedprotein while obligate interactions primarily oc-cur between proteins that are either both or-dered, or disordered. We observed that co-expression stability shows distinct patterns instructurally and functionally different groups ofproteins and interactions.

10. A large-scale study of the conservationand classification of intrinsically disor-dered regions

Ashwini Patil, Harry Amri Moesa, ShunichiWakabayashi and Kenta Nakai

Despite the importance of intrinsically disor-dered regions in proteins, there is currently noclassification system available for these regions.We studied the levels of conservation of knownand predicted disordered regions in eukaryotesand proposed a method to classify them basedon their conservation and their residue content i.e. amount of charge and hydropathy. We foundthat highly conserved disordered regions can beclassified into distinct groups based on chargeand hydropathy which are associated with dis-tinct functions suggesting a possible functionalclassification of disordered regions. We alsofound that disordered regions showing low con-servation often show high residue type contentconservation (i.e. similar fraction of charged, po-lar and hydrophobic residues), which may be

used to maintain their disorderliness in order tostay functional despite high rates of evolution.

11. Project for accelerating the clinical appli-cation of regenerative medicine technolo-gies

Kenta Nakai, Tatsutoshi Nakahata11, NorioNakatsuji12, Kohji Nishida13, Hideyuki Okano14,Masayo Takahashi15 , Akihiro Umezawa16 ,Masayuki Yamato17, and Shinya Yamanaka11:11CiRA, Kyoto U., 12IFMS, Kyoto U., 13Grad.Sch. Med., Osaka U., 14Sch. Med. Keio U.,15CDB, RIKEN, 16Nat. Res. Inst. Child Healthand Development, 17Tokyo Women’s Med. U.

This project aims to achieve the earliest possi-ble clinical application of regenerative medicinetechnologies using safe, effective and high-quality human stem cells by building a collabo-rative platform among researchers through anetwork centric research infrastructure. Also inthe project, the basis of an “open innovation”environment allowing research institutes to con-tinuously create innovative technologies will bedeveloped. To support this activity, an advisorycommittee has been formed with members fromvarious sectors.

12. Construction of a system for analyzingthe clonality of retroviruses using nextgeneration sequencer

Sakura Aoki3, Yutaka Suzuki3, Kenta Nakai,and Toshiki Watanabe3

Human T-cell Leukemia Virus type-1 (HTLV-1) is the causative agent of Adult T-cell Leuke-mia (ATL). While most of the HTLV-1 infectedindividuals remain as asymptomatic carriers(ACs) throughout their lifetime, about 5％ ofthem develop ATL. However, the mechanismresponsible for such a significant risk variationamong ACs remains to be elucidated. ACs har-bor polyclonal population of HTLV-1 infectedcells, whereas ATL patients show mono-clonalexpansion. Thus we postulate that the onset of aclonal expansion of HTLV-1-infected cells can bea reliable risk indicator of progression into ATLamong ACs. Although it has not yet been ac-complished by the conventional methods, track-ing the behavior of each clone in the complexpopulation of HTLV-1 infected cells is essentialto understand the clonality of HTLV-1 infectedcells. In the present study, taking advantage ofnext-generation sequencing technology andnested-splinkerette PCR, a new high-throughputmethod has been developed. We expect that thistechnique will enable us to track the changes in

143

the clonality during the course of ATL develop-ment. Data obtained from ATL patients havebeen compared with those from ACs, includingthe “high-risk carriers” who subsequently devel-oped ATL. The resulting information will not

only provide us with a useful method to predictthe probability of ATL onset among ACs, butalso a new insight into the natural course of thedisease.

Publications

Patil, A., Nakai, K., and Nakamura, H. HitPre-dict: a database of quality assessed protein-protein interactions in nine species, Nucl. Ac-ids Res ., 39, D744-D799 (2011)

Park, S.-J., and Nakai, K. A regression analysisof gene expression in ES cells reveals twogene classes that are significantly different inepigenetic patterns. BMC Bioinformatics. 12(Suppl 1): S50, 2011.

Khare, P., Mortimer, SI., Cleto, CL,. Okamura,K., Suzuki, Y., Kusakabe, T., Nakai, K.,Meedel, TH., and Hastings, KEM. Cross-validated methods for promoter/transcriptionstart site mapping in SL trans-spliced genes,established using the Ciona intestinalis tro-ponin I gene. Nucleic Acids Res. 39(7): 2638-2648 (2011).

Yamashita, R., Sathira, N.P., Kanai, A., Tani-moto, K., Arauchi, T., Tanaka, Y., Hashimoto,S., Sugano, S., Nakai, K., and Suzuki, Y.Genome-wide characterization of transcriptionstart sites in humans by integrative transcrip-tome analysis. Genome Res. 21(5): 775-789(2011).

Irie, T., Park, S.J., Yamashita, R., Seki, M., Yada,T., Sugano, S., Nakai, K., and Suzuki, Y. Pre-dicting promoter activities of primary humanDNA sequences. Nucl. Acids Res. 39(11): e75(2011).

Ohshima, D., Qin, J., Konnno, H., Hirosawa, A.,Shiraishi, T., Yanai, H., Shimo, Y., Akiyama,N., Yamashita, R., Nakai, K., and Inoue, J.RANK signaling induces interferon-stimulatedgenes in the fetal thymic stroma. Biochem.Biophys. Res. Comm. 408(4): 530-536, 2011.

Kimura, K., Koike, A., and Nakai, K. Seed-set

construction by equi-entropy partitioning forefficient and sensitive short-read mapping. Al-gorithms in Bioinformatics in (T.M. Przytyckaand M.-F. Sagot ed.) Lecture Notes in Com-puter Science. 6833: 151-162, 2011. (ISBN: 978-3-642-23037-0).

Okamura, K., Yamashita, R., Takimoto, N.Nishitsuji, K., Suzuki, Y., Kusakabe, T.G., andNakai, K. Profiling ascidian promoters as theprimodial type of vertebrate promoter. BMCGenomics 12(Suppl. 3): S7, 2011. BEST PAPERAWARD in InCoB211

Patil, A., Nakai, K., and Kinoshita, K. Assessingthe utility of gene co-expression stability incombination with correlation in the analysis ofprotein-protein interaction networks. BMCGenomics 12(Suppl. 3): S19, 2011.

Patil, A., Teraguchi, S., Dinh, H., Nakai, K., andStandley, DM. Functional annotation of intrin-sically disordered domains by their aminoacid content using IDD Navigator. PacificSymposium on Biocomputing 17: 164-175,2012.

Yamashita, R., Sugano, S., Suzuki, Y., andNakai, K. DBTSS: database of transcriptionalstart sites progress report in 2012. Nucl. AcidsRes. 40(Database Issue): D150-154, 2012.

Kimura, K., Koike, A., and Nakai, K. A Bit-parallel dynamic programming algorithmsuitable for DNA sequence alignment. J. Bio-informatics and Computational Biology. Inpress

Makita, Y. and Nakai, K. Bacillus subtilis tran-scriptional network. In (Babu, M. ed.) Bacte-rial Gene Regulation and Transcriptional Net-works, Horizon Sci. Press., in press.

144

The Department of Public Policy works to achieve three major missions: publicpolicy studies of translational research, its application, and its impact on society;research ethics consultation for scientists to comply with ethical guidelines and tobuild public trust; and development of “minority-centered” scientific communication.By conducting qualitative and quantitative social science study and policy analysis,we facilitate discussion of challenges arising from advances in medical sciences.Furthermore, we study specific ethics issues related to construction of a humanbiological substances collection, and related to vaccination policy.

1. Research ethics consultation for multi-center cancer genome research

Public interest in research ethics has grown.Society increasingly makes demands in thisarea. Provision of a system that can support“on-site” researchers to avail themselves of im-mediate consultation when concerns and issuesarise related to research ethics and other mattersis also among those demands. Based on thesedemands, in recent years, the universities andresearch institutes providing “research ethicsconsultation” have become increasingly numer-ous in the United States. We have been commis-sioned by the government to provide researchethics consultation to several big projects pro-moting medical sciences.

We have started to provide research ethicsconsultation for a new and comprehensive can-cer genome research project for 34 clinical seedsof 64 designated institutions, called as Projectfor Development of Innovative Research onCancer Therapeutics (MEXT). We have ad-dressed a basic research ethics policy to usestored samples for research. We have also devel-

oped a research ethics management system onthe website.

2. Biobank Japan Project (BBJP) and its ethi-cal, legal and social implications

The Biobank Japan Project (BBJP) is a disease-focused biobanking project headed by ProfessorYusuke Nakamura since 2003. Biobank Japanconsists of donated DNA, sera and clinical infor-mation from 200,000 patients of 66 hospitals inJapan. Informed consent, which ensures theautonomous decisions of participants, is be-lieved to be practically impossible for the bio-banking project in general.

We have conducted interview studies of re-search coordinators (n＝50) since 2010 and wehave just compiled our self-evaluation report oninformed consent process of this project. As ameans to maintain the participants’ trust of theproject, research coordinators who had beenspecially trained for the BBJP have played im-portant roles. We have worked steadily to com-plete analyses of the research coordinators of theBBJP. At the beginning of the BBJP, their pri-

Human Genome Center

Department of Public Policy公共政策研究分野

Associate Professor Kaori Muto, Ph.D.Assistant Professor Yusuke Inoue, Ph.D.Project Assistant Professor Hyunsoo Hong, Ph.D.Project Assistant Professor Ayako Kamisato, M.A.

准教授保健学博士武藤香織助教社会医学博士井上悠輔特任助教学術博士洪賢秀特任助教法学修士神里彩子

145

mary roles were recruitment. After the end ofrecruitment, their roles shifted to the tracing ofparticipants to extract clinical information andto input it into the database. However, theirsupport and encouragement of participants com-plemented the contents of the initial consentprocess and reinforced participants’ incentive tocontinue in their role. The results of this studywill contribute to improved quality control andbetter communication between administrators oflong-term research projects and the project par-ticipants.

We also have provided research ethics consul-tation for conducting a follow-up study withoutconsent, to obtain permission from 1,100 munici-palities to access cause of death of 44,698 partici-pants.

3. Rethinking voluntary egg donation

In the scandal around Korean stem cell scien-tist Woo-Suk Hwang, the inappropriate collec-tion of human eggs as research material, fabri-cated data on ES cells obtained through somaticcell nuclear transfer, and fraudulent fundraisingwere condemned as legal and ethical transgres-sions. Among the criticisms, the donation ofeggs by many women became a big issue. Someof the women were motivated by financial com-pensation or in-kind support, while others de-cided to donate their eggs without payment, be-ing convinced that the research would bringtherapy for thus far incurable patients, a prom-ise unfulfilled. Regardless of the multiple re-ports published to articulate why the Hwangscandal happened in South Korea, we realizedduring our ethnographical fieldwork in thatcountry that it would be meaningful to considerthe ethical issues in a global context. In this pa-per (#3), we focus on the motivations of theSouth Korean women who donated their eggsvoluntarily as research materials, and aim to un-derstand it in a more general context. We pointout that not only their love of family but alsoother altruistic motivations for donating eggsare affected by the attitudes revealed in theirnarratives. Finally, we argue that there is a seri-ous bioethical issue when a social environmentof sick or disabled people makes women decideto help these individuals by donating eggs.

4. Direct-to-consumer genetic testing for tal-ent identification

The regulation of direct-to-consumer genetictesting has till now focused on identifying pre-dispositions to specific health problems andquality issues in testing, such as analytical andclinical validity criteria. In the United States, for

example, the Food and Drug Administrationwarned that federal regulations for medical de-vices will apply to commercial genetic tests forhealth purposes. By contrast, the regulation ofcommercial genetic testing for other purposes israrely discussed. These tests are advertised asidentifying intelligence, athletic ability, and ar-tistic sensibility in children. One company offerstesting for a “gold medalist gene” purportedlyassociated with remarkable athletic talent; in Ja-pan, the test has been favorably featured onsome television programs and is recruiting cus-tomers through the Internet. We have alertedethical issues among these sales in our paper (#4).

5. A cross-sectional survey of physicians’recommendations of the 2009 influenza A/H1N1 pandemic in Japan

Striking a balance between the rapid availabil-ity of a novel vaccine while ensuring its safety,quality, and efficacy is a major challenge duringa pandemic. In our paper (#5), we aimed to elu-cidate physicians’ attitudes regarding the novelvaccine during the influenza A/H1N1 pandemicof 2009, and to determine factors that affectedtheir vaccination recommendations to patients.Of a random sample of 1,000 general practitio-ners (GPs) in Japan, 515 participated in thecross-sectional anonymous survey conductedjust before the novel vaccine was available (be-tween 28 September and 18 October 2009). A to-tal of 453 GPs (88.3％) replied that they in-tended to receive the new vaccine themselves;however, only 177 GPs (34.6％) intended to pro-actively recommend it to their patients. The an-ticipated cost of the vaccine negatively influ-enced the intention to vaccinate themselves andtheir recommendations to patients (P＜0.001, χ(2) test). Results of multivariate logistic regres-sion analysis showed that physicians with expe-rience in influenza A/H1N1 patient contacts [1-20 contacts, odds ratio (OR)＝7.49 (95％ confi-dence interval [CI]: 1.73-32.36), P＝0.007; ＞20contacts, OR＝8.03 (95％ CI: 1.77-36.50), P＝0.007, compared with no contacts] were morelikely to recommend the vaccine to patients,whereas those with knowledge of the fear onthe causal association between Guillain-Barrésyndrome (GBS) cases and the 1976 swine fluvaccination in the USA were less likely to rec-ommend the vaccine [OR＝0.66 (95％ CI: 0.45-0.97), P＝0.036].

6. The agendas for discussion on human andanimal chimera for scientific research

In July 2010, the Minister of Education, Cul-

146

ture, Sports, Science, and Technology receivedthe first notification of the procedure for creat-ing human-to-animal chimeric embryos by in-serting human induced pluripotent stem cells(iPS cells) to animal blastocysts. This case drewattention to the necessity of reviewing the regu-lations of human-to-animal chimeric embryosunder the Act on Regulation of Human CloningTechniques and the Guidelines for Specified Em-bryo. In Kamisato’s paper (#6, in Japanese), shehas succeeded to organize the concept of “hu-man and animal chimera”-including human-to-animal chimera and animal-to-human chimera-which eventually became quite complicated.Subsequently, she has discussed the problematicissues of the existing regulations and the agen-das for reviewing the regulations of creatingand using human-to-animal chimeric embryos.As a result, she has clearly listed four problemsof the existing regulations: for example, the du-ration that the human-to-animal chimeric em-bryos can be maintained is lacking validity. Inaddition, she has proposed the discussion forhuman and animal chimera with wider visionwhich includes the scientific, ethical, legal, andsociological aspect. In the end, she has men-tioned three important points that need to be

discussed including what kind of human andanimal chimera can be allowed to create.

7. Research in progress

We have been conducting other studies as de-scribed below.- Ethical, legal and social implications of com-

mercial genetic/genomic testing services ineastern Asia

- Development and evaluation of communica-tion methods with participants of Biobank Ja-pan and other long-term studies

- Analysis of roles of research coordinators forbetter recruitment and for building trust fromparticipants

- Ethical, legal and social implications of stemcell studies including animal-human chimericembryos and iPS cell banking

- Bench-side research ethics consultation andquality assurance of research ethics committees

- Science communication through art and ethicalchallenges of biomaterial art

- Ethics issues in collecting human biologicalsubstances for constructing a research infra-structure

- Vaccination policy

Publications

1. Ishiyama I, Tanzawa T, Watanabe M, MaedaT, Muto K, Tamakoshi A, Nagai A, andYamagata Z. Public attitudes to the promo-tion of genomic crop studies in Japan: corre-lations between genomic literacy, trust, andfavourable attitude. Public Understanding ofScience, first published on January 17, 2012

2. Suzuki K, Sawa R, Muto K, Kosuda S,Banno K, Yamagata Z. Risk perception ofpregnancy promotes disapproval of gesta-tional surrogacy: Analysis of a nationallyrepresentative opinion survey in Japan, In-ternational Journal of Fertility and Sterility,5(2): 78-85, 2011.

3. Tsuge A, Hong H. Reconsidering ethical is-sues about “voluntary egg donors” inHwang’s case in global context, New Genet-ics and Society. 30 (3): 241-252, 2011.

4. Inoue Y, Muto K. Children and the genetic

identification of talent. The Hastings CenterReport, 41, inside back cover, 2011.

5. Inoue Y, Matsui K. Physicians’ recommenda-tions to their patients concerning a novelvaccine: a cross-sectional survey on 2009 A/H1N1 vaccination in Japan. EnvironmentalHealth and Preventive Medicine, 16 (5): 320-326, 2011.

6. 神里彩子．ヒトと動物のキメラを作成する研究はどこまで認められるか？―再議論に向けた検討課題の提示―，生命倫理２１�：２２―３２，２０１１．

7. 武藤香織．難病をもつ地域住民への支援～市町村の役割再考．月刊「自治研」２０１１年７月号，１９―２６，２０１１．

8. 井上悠輔．『臨床研究のための倫理審査ハンドブック』（笹栗俊之ほか編），（１―４―２，２―４―１分担），丸善，２０１１年．

147

human genome center laboratory of genome …eshita 1: department of infectious disease con- trol,...

Documents