comparative genomics978-1-59745-515... · 2017-08-26 · methods in molecular biologytm john m....

Comparative Genomics



Comparative Genomics


METHODS IN MOLECULAR BIOLOGY™

John M. Walker, SERIES EDITOR

404. Topics in Biostatistics, edited by Walter T. Ambrosius, 2007
403. Patch-Clamp Methods and Protocols, edited by Peter Molnar and James J. Hickman, 2007
402. PCR Primer Design, edited by Anton Yuryev, 2007
401. Neuroinformatics, edited by Chiquito J. Crasto, 2007
400. Methods in Lipid Membranes, edited by Alex Dopico, 2007
399. Neuroprotection Methods and Protocols, edited by Tiziana Borsello, 2007
398. Lipid Rafts, edited by Thomas J. McIntosh, 2007
397. Hedgehog Signaling Protocols, edited by Jamila I. Horabin, 2007
396. Comparative Genomics, Volume 2, edited by Nicholas H. Bergman, 2007
395. Comparative Genomics, Volume 1, edited by Nicholas H. Bergman, 2007
394. Salmonella: Methods and Protocols, edited by Heide Schatten and Abe Eisenstark, 2007
393. Plant Secondary Metabolites, edited by Harinder P. S. Makkar, P. Siddhuraju, and Klaus Becker, 2007
392. Molecular Motors: Methods and Protocols, edited by Ann O. Sperry, 2007
391. MRSA Protocols, edited by Yinduo Ji, 2007
390. Protein Targeting Protocols, Second Edition, edited by Mark van der Giezen, 2007
389. Pichia Protocols, Second Edition, edited by James M. Cregg, 2007
388. Baculovirus and Insect Cell Expression Protocols, Second Edition, edited by David W. Murhammer, 2007
387. Serial Analysis of Gene Expression (SAGE): Digital Gene Expression Profiling, edited by Kare Lehmann Nielsen, 2007
386. Peptide Characterization and Application Protocols, edited by Gregg B. Fields, 2007
385. Microchip-Based Assay Systems: Methods and Applications, edited by Pierre N. Floriano, 2007
384. Capillary Electrophoresis: Methods and Protocols, edited by Philippe Schmitt-Kopplin, 2007
383. Cancer Genomics and Proteomics: Methods and Protocols, edited by Paul B. Fisher, 2007
382. Microarrays, Second Edition: Volume 2, Applications and Data Analysis, edited by Jang B. Rampal, 2007
381. Microarrays, Second Edition: Volume 1, Synthesis Methods, edited by Jang B. Rampal, 2007
380. Immunological Tolerance: Methods and Protocols, edited by Paul J. Fairchild, 2007
379. Glycovirology Protocols, edited by Richard J. Sugrue, 2007
378. Monoclonal Antibodies: Methods and Protocols, edited by Maher Albitar, 2007
377. Microarray Data Analysis: Methods and Applications, edited by Michael J. Korenberg, 2007
376. Linkage Disequilibrium and Association Mapping: Analysis and Application, edited by Andrew R. Collins, 2007
375. In Vitro Transcription and Translation Protocols: Second Edition, edited by Guido Grandi, 2007
374. Quantum Dots, edited by Marcel Bruchez and Charles Z. Hotz, 2007
373. Pyrosequencing® Protocols, edited by Sharon Marsh, 2007
372. Mitochondria: Practical Protocols, edited by Dario Leister and Johannes Herrmann, 2007
371. Biological Aging: Methods and Protocols, edited by Trygve O. Tollefsbol, 2007
370. Adhesion Protein Protocols, Second Edition, edited by Amanda S. Coutts, 2007
369. Electron Microscopy: Methods and Protocols, Second Edition, edited by John Kuo, 2007
368. Cryopreservation and Freeze-Drying Protocols, Second Edition, edited by John G. Day and Glyn Stacey, 2007
367. Mass Spectrometry Data Analysis in Proteomics, edited by Rune Matthiesen, 2007
366. Cardiac Gene Expression: Methods and Protocols, edited by Jun Zhang and Gregg Rokosh, 2007
365. Protein Phosphatase Protocols, edited by Greg Moorhead, 2007
364. Macromolecular Crystallography Protocols: Volume 2, Structure Determination, edited by Sylvie Doublié, 2007
363. Macromolecular Crystallography Protocols: Volume 1, Preparation and Crystallization of Macromolecules, edited by Sylvie Doublié, 2007
362. Circadian Rhythms: Methods and Protocols, edited by Ezio Rosato, 2007
361. Target Discovery and Validation Reviews and Protocols: Emerging Molecular Targets and Treatment Options, Volume 2, edited by Mouldy Sioud, 2007
360. Target Discovery and Validation Reviews and Protocols: Emerging Strategies for Targets and Biomarker Discovery, Volume 1, edited by Mouldy Sioud, 2007
359. Quantitative Proteomics by Mass Spectrometry, edited by Salvatore Sechi, 2007
358. Metabolomics: Methods and Protocols, edited by Wolfram Weckwerth, 2007
357. Cardiovascular Proteomics: Methods and Protocols, edited by Fernando Vivanco, 2007
356. High-Content Screening: A Powerful Approach to Systems Cell Biology and Drug Discovery, edited by D. Lansing Taylor, Jeffrey Haskins, and Ken Guiliano, 2007
355. Plant Proteomics: Methods and Protocols, edited by Hervé Thiellement, Michel Zivy, Catherine Damerval, and Valerie Mechin, 2007
354. Plant–Pathogen Interactions: Methods and Protocols, edited by Pamela C. Ronald, 2006


METHODS IN MOLECULAR BIOLOGY™

Comparative Genomics
Volume 2

Edited by

Nicholas H. Bergman
Bioinformatics Program and Department of Microbiology and Immunology,
University of Michigan Medical School,
Ann Arbor, MI


©2007 Humana Press Inc.
999 Riverview Drive, Suite 208
Totowa, New Jersey 07512

www.humanapress.com

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise without written permission from the Publisher. Methods in Molecular Biology™ is a trademark of The Humana Press Inc.

All papers, comments, opinions, conclusions, or recommendations are those of the author(s), and do not necessarily reflect the views of the publisher.

This publication is printed on acid-free paper. ANSI Z39.48-1984 (American Standards Institute) Permanence of Paper for Printed Library Materials

Cover illustration: From Figure 1, Volume 1, Chapter 10, “PSI-BLAST Tutorial,” by Medha Bhagwat and L. Aravind. Ribbon diagrams comparing the three-dimensional structures of the human PCNA protein and the E. coli DNA polymerase III beta subunit. The coordinates for these structures are taken from a public database.

Cover Design: Karen Schulz

Production Editor: Christina M. Thomas

For additional copies, pricing for bulk purchases, and/or information about other Humana titles, contact Humana at the above address or at any of the following numbers: Tel.: 973-256-1699; Fax: 973-256-8341; E-mail: [email protected]; or visit our Website: www.humanapress.com

Photocopy Authorization Policy: Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by Humana Press Inc., provided that the base fee of US $30 copy is paid directly to the Copyright Clearance Center at 222 Rosewood Drive, Danvers, MA 01923. For those organizations that have been granted a photocopy license from the CCC, a separate system of payment has been arranged and is acceptable to Humana Press Inc. The fee code for users of the Transactional Reporting Service is: [978-1-934115-37-4/07 $30].

Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1

Library of Congress Control Number: 2007930590


Preface

Over the last ten years the amount of biological sequence data available to researchers has increased by several orders of magnitude, and complete genome sequences (nearly nonexistent ten years ago) have become commonplace. The techniques involved in analyzing these sequences have evolved almost as rapidly, and several (e.g., BLAST) have become so commonly used in molecular biology that their names have become verbs. Even so, a number of extremely powerful tools and techniques developed for comparative genomic analysis remain unfamiliar to molecular biologists, and thus are underutilized.

The primary aim of these volumes is to provide a set of tutorials that will be useful to molecular biologists beginning to use comparative genomic analysis tools in a number of different areas. Volume I contains the first four of seven sections. In the first section, the reader is introduced to genomes via a number of visualization tools that allow one to browse through a particular genome of interest. The second and third sections deal with comparative analysis at the level of individual sequences, and present methods useful in sequence alignment, the discovery of conserved sequence motifs, and the analysis of codon usage. The fourth section deals with the identification and structural characterization of non-coding RNA genes; this class of genes is particularly difficult to predict, and discovery of these elements is almost completely reliant on comparative genomics. (Note that the much larger question of identifying protein-coding genes is not addressed here, because there is a separate volume in the MiMB series devoted to this issue.)

In the second volume, the fifth section describes a number of tools for comparative analysis of domain and gene families. These tools are particularly useful for predicting protein function as well as potential protein-protein interactions. In the sixth section, methods for comparing groups of genes and gene order are discussed, as are several tools for analyzing genome evolution. Finally, the seventh section deals with experimental comparative genomics. This section includes methods for comparing gene copy number across an entire genome, comparative genomic hybridization, SNP analysis, as well as genome-wide mapping and typing systems for bacterial genomes.


Each chapter includes not only detailed instructions for using a particular tool or method, but also an introduction to the theory behind the technique. Importantly, there are also a number of Notes at the end of each chapter that guide the beginning user through commonly encountered difficulties, and provide key tips for using the method most efficiently. Readers are encouraged to note that although some of the tools presented in a given section are quite similar in aim, they are often designed quite differently, and will have different strengths and weaknesses. This is particularly true in considering the computational tools, where the same overall goal (e.g., discovery of conserved motifs) can be pursued using a number of very different statistical approaches. Users should therefore explore several different options in attempting comparative analyses; a combined approach is often best.

These volumes are the collective effort of many people. I would like to extend a special thanks to all of the contributors, and to the staff at Humana Press, who helped at every stage of the publication process. I would also like to especially thank Erica Anderson and Ellen Swenson at the University of Michigan Medical School and Tim Read at the US Naval Medical Research Center for valuable advice and help in putting these books together.

Nicholas H. Bergman, PhD


Table of Contents

Preface
Contributors
Table of Contents: Volume 1

Part I: Comparative Analysis of Domain and Protein Families

1 Computational Prediction of Domain Interactions
Philipp Pagel, Normann Strack, Matthias Oesterheld, Volker Stümpflen, and Dmitrij Frishman

2 Domain Team: Synteny of Domains is a New Approach in Comparative Genomics
Sophie Pasek

3 Inference of Gene Function Based on Gene Fusion Events: The Rosetta-Stone Method
Karsten Suhre

4 Pfam: A Domain-Centric Method for Analyzing Proteins and Proteomes
Jaina Mistry and Robert Finn

5 InterPro and InterProScan: Tools for Protein Sequence Classification and Comparison
Nicola Mulder and Rolf Apweiler

6 Gene Annotation and Pathway Mapping in KEGG
Kiyoko F. Aoki-Kinoshita and Minoru Kanehisa

Part II: Orthologs, Synteny, and Genome Evolution

7 Ortholog Detection Using the Reciprocal Smallest Distance Algorithm
Dennis P. Wall and Todd DeLuca

8 Finding Conserved Gene Order Across Multiple Genomes
Giulio Pavesi and Graziano Pesole


9 Analysis of Genome Rearrangement by Block-Interchanges
Chin Lung Lu, Ying Chih Lin, Yen Lin Huang, and Chuan Yi Tang

10 Analyzing Patterns of Microbial Evolution Using the Mauve Genome Alignment System
Aaron E. Darling, Todd J. Treangen, Xavier Messeguer, and Nicole T. Perna

11 Visualization of Syntenic Relationships With SynBrowse
Volker Brendel, Stefan Kurtz, and Xioakang Pan

12 Gecko and GhostFam: Rigorous and Efficient Gene Cluster Detection in Prokaryotic Genomes
Thomas Schmidt and Jens Stoye

Part III: Experimental Analysis of Whole Genomes: Analysis of Copy Number and Sequence Polymorphisms

13 Genome-wide Copy Number Analysis on GeneChip® Platform Using Copy Number Analyzer for Affymetrix GeneChip 2.0 Software
Seishi Ogawa, Yasuhito Nanya, and Go Yamamoto

14 Oligonucleotide Array Comparative Genomic Hybridization
Paul van den IJssel and Bauke Ylstra

15 Studying Bacterial Genome Dynamics Using Microarray-Based Comparative Genomic Hybridization
Eduardo N. Taboada, Christian C. Luebbert, and John H. E. Nash

16 DNA Copy Number Data Analysis Using the CGHAnalyzer Software Suite
Joel Greshock

17 Microarray-Based Approach for Genome-Wide Survey of Nucleotide Polymorphisms
Brian W. Brunelle and Tracy L. Nicholson

18 High-Throughput Genotyping of Single Nucleotide Polymorphisms with High Sensitivity
Honghua Li, Hui-Yun Wang, Xiangfeng Cui, Minjie Luo, Guohong Hu, Danielle M. Greenawalt, Irina V. Tereshchenko, James Y. Li, Yi Chu, and Richeng Gao


19 Single Nucleotide Polymorphism Mapping Array Assay
Xiaofeng Zhou and David T. W. Wong

20 Molecular Inversion Probe Assay
Farnaz Absalan and Mostafa Ronaghi

21 novoSNP3: Variant Detection and Sequence Annotation in Resequencing Projects
Peter De Rijk and Jurgen Del-Favero

22 Rapid Identification of Single Nucleotide Substitutions Using SeqDoC
Mark L. Crowe

23 SNPHunter: A Versatile Web-Based Tool for Acquiring and Managing Single Nucleotide Polymorphisms
Tianhua Niu

24 Identification of Disease Genes: Example-Driven Web-Based Tutorial
Medha Bhagwat

25 Variable Number Tandem Repeat Typing of Bacteria
Siamak P. Yazdankhah and Bjørn-Arne Lindstedt

26 Fluorescent Amplified Fragment Length Polymorphism Genotyping of Bacterial Species
Meeta Desai

27 FLP-Mapping: A Universal, Cost-Effective, and Automatable Method for Gene Mapping
Knud Nairz, Peder Zipperlen, and Manuel Schneider

Index


Contributors

Farnaz Absalan • Stanford Genome Technology Center
Kiyoko F. Aoki-Kinoshita • Department of Bioinformatics, Soka University, Faculty of Engineering
Rolf Apweiler • European Bioinformatics Institute
Medha Bhagwat • National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
Volker Brendel • Department of Genetics, Development, and Cell Biology, Iowa State University
Brian Brunelle • Virus and Prion Diseases of Livestock Research Unit, National Animal Disease Center, USDA Agricultural Research Service
Yi Chu • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Mark L. Crowe • Genetic Solutions Pty Ltd, Australia
Xiangfeng Cui • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Aaron Darling • Department of Computer Science, University of Wisconsin-Madison
Peter De Rijk • Department of Molecular Genetics, University of Antwerp
Jurgen Del-Favero • Department of Molecular Genetics, University of Antwerp
Todd DeLuca • Department of Systems Biology, Harvard Medical School
Meeta Desai • Applied and Functional Genomics, Health Protection Agency, United Kingdom
Robert Finn • Wellcome Trust Sanger Institute
Dmitrij Frishman • Technical University of Munich, Department of Genome Oriented Bioinformatics, Institute for Bioinformatics/MIPS, GSF Research Center for Environment and Health


Richeng Gao • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Danielle M. Greenawalt • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Joel Greshock • GlaxoSmithKline, Abramson Family Cancer Research Institute, University of Pennsylvania School of Medicine
Guohong Hu • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Yen Lin Huang • Department of Computer Science, National Tsing Hua University
Minoru Kanehisa • Kyoto University Bioinformatics Center, Human Genome Center, Institute of Medical Science, University of Tokyo
Stefan Kurtz • Department of Genetics, Development, and Cell Biology, Iowa State University
Honghua Li • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
James Y. Li • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Ying Chih Lin • Department of Computer Science, National Tsing Hua University
Bjørn-Arne Lindstedt • Norwegian Institute of Public Health
Chin Lung Lu • Department of Biological Science and Technology, National Chiao Tung University
Christian C. Luebbert • Genomics and Proteomics Group, Institute for Biological Sciences, Canadian National Research Council
Minjie Luo • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Xavier Messeguer • Department of Software, Technical University of Catalonia-Barcelona, Barcelona Supercomputing Center (BSC)
Jaina Mistry • Wellcome Trust Sanger Institute
Nicola Mulder • European Bioinformatics Institute
Knud Nairz • Institute of Neuropathology, University Hospital of Zurich


Yasuhito Nanya • University of Tokyo, Department of Regeneration Medicine
John H.E. Nash • Genomics and Proteomics Group, Institute for Biological Sciences, Canadian National Research Council
Tracy Nicholson • Respiratory Diseases of Livestock Research Unit, National Animal Disease Center, USDA Agricultural Research Service
Tianhua Niu • Division of Preventative Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School
Matthias Oesterheld • Institute for Bioinformatics/MIPS, GSF Research Center for Environment and Health
Seishi Ogawa • University of Tokyo, Department of Regeneration Medicine
Philipp Pagel • Technical University of Munich, Department of Genome Oriented Bioinformatics, Institute for Bioinformatics/MIPS, GSF Research Center for Environment and Health
Xioakang Pan • Department of Genetics, Development, and Cell Biology, Iowa State University
Sophie Pasek • Laboratoire Statistique et Génome, CNRS
Giulio Pavesi • Dipartimento di Scienze Biomolecolari e Biotecnologie, University of Milan
Nicole T. Perna • Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin-Madison
Graziano Pesole • Dipartimento di Biochimica e Biologia Molecolare, University of Bari and Istituto Tecnologie Biomediche del C.N.R. (sede di Bari)
Mostafa Ronaghi • Stanford Genome Technology Center
Thomas Schmidt • Technische Fakultät, Universität Bielefeld, International NRW Graduate School in Bioinformatics and Genome Research, Germany
Manuel Schneider • Kantonsschule Zug
Jens Stoye • Technische Fakultät, Universität Bielefeld, Germany
Normann Strack • Technical University of Munich, Department of Genome Oriented Bioinformatics
Volker Stümpflen • Institute for Bioinformatics/MIPS, GSF Research Center for Environment and Health
Karsten Suhre • Information Génomique et Structurale, CNRS
Eduardo N. Taboada • Genomics and Proteomics Group, Institute for Biological Sciences, Canadian National Research Council
Chuan Yi Tang • Department of Computer Science, National Tsing Hua University


Irina V. Tereshchenko • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
Todd Treangen • Department of Software, Technical University of Catalonia-Barcelona
Paul van den IJssel • VU University Medical Center
Dennis P. Wall • Department of Systems Biology, Harvard Medical School
Hui-Yun Wang • University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, Department of Molecular Genetics, Microbiology & Immunology
David T.W. Wong • UCLA School of Dentistry
Go Yamamoto • University of Tokyo, Department of Regeneration Medicine
Siamak P. Yazdankhah • Norwegian Institute of Public Health
Bauke Ylstra • VU University Medical Center
Xiaofeng Zhou • UCLA School of Dentistry
Peder Zipperlen • Tecan Schweiz AG


Table of Contents

Preface
Contributors

Part 1: Genome Visualization and Annotation

1 Comparative Analysis and Visualization of Genomic Sequences Using VISTA Browser and Associated Computational Tools
Inna Dubchak

2 Comparative Genomic Analysis Using the UCSC Genome Browser
Donna Karolchik, Gill Bejerano, Angie S. Hinrichs, Robert M. Kuhn, Webb Miller, Kate R. Rosenbloom, Ann S. Zweig, David Haussler, and W. James Kent

3 Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System
Victor M. Markowitz and Nikos C. Kyrpides

4 WebACT: An Online Genome Comparison Suite
James C. Abbott, David M. Aanensen, and Stephen D. Bentley

5 GenColors: Annotation and Comparative Genomics of Prokaryotes Made Easy
Alessandro Romualdi, Marius Felder, Dominic Rose, Ulrike Gausmann, Markus Schilhabel, Gernot Glöckner, Matthias Platzer, and Jürgen Sühnel

6 Comparative Microbial Genome Visualization Using GenomeViz
Rohit Ghai and Trinad Chakraborty

7 BugView: A Tool for Genome Visualization and Comparison
David P. Leader


8 CGAS: A Comparative Genome Annotation System
Kwangmin Choi, Youngik Yang, and Sun Kim

Part 2: Sequence Alignments

9 BLAST QuickStart: Example-Driven Web-Based BLAST Tutorial
David Wheeler and Medha Bhagwat

10 PSI-BLAST Tutorial
Medha Bhagwat and L. Aravind

11 Organizing and Updating Whole Genome BLAST Searches With ReHAB
David J. Esteban, Aijazuddin Syed, and Chris Upton

12 Alignment of Genomic Sequences Using DIALIGN
Burkhard Morgenstern

13 An Introduction to the Lagan Alignment Toolkit
Michael Brudno

14 Aligning Multiple Whole Genomes with Mercator and MAVID
Colin N. Dewey

15 Mulan: Multiple-Sequence Alignment to Predict Functional Elements in Genomic Sequences
Gabriela G. Loots and Ivan Ovcharenko

16 Improving Pairwise Sequence Alignment between Distantly Related Proteins
Jin-an Feng

Part 3: Identification of Conserved Sequences and Biases in Codon Usage

17 Discovering Sequence Motifs
Timothy L. Bailey

18 Discovery of Conserved Motifs in Promoters of Orthologous Genes in Prokaryotes
Rekin’s Janky and Jacques van Helden

19 PhyME: A Software tool for Finding Motifs in Sets of Orthologous Sequences
Saurabh Sinha


20 Comparative Genomics-Based Orthologous Promoter Analysis Using the DoOP Database and the DoOPSearch Web Tool
Endre Barta

21 Discovery of Motifs in Promoters of Coregulated Genes
Olivier Sand and Jacques van Helden

22 Fastcompare: A Nonalignment Approach for Genome-Scale Discovery of DNA and mRNA Regulatory Elements Using Network-Level Conservation
Olivier Elemento and Saeed Tavazoie

23 Phylogenetic Footprinting to Find Functional DNA Elements
Austen R. D. Ganley and Takehiko Kobayashi

24 Detecting Regulatory Sites Using PhyloGibbs
Rahul Siddharthan and Erik van Nimwegen

25 Using the Gibbs Motif Sampler for Phylogenetic Footprinting
William Thompson, Sean Conlan, Lee Ann McCue, and Charles E. Lawrence

26 Web-Based Identification of Evolutionary Conserved DNA cis-Regulatory Elements
Panayiotis V. Benos, David L. Corcoran, and Eleanor Feingold

27 Exploring Conservation of Transcription Factor Binding Sites with CONREAL
Eugene Berezikov, Victor Guryev, and Edwin Cuppen

28 Computational and Statistical Methodologies for ORFeome Primary Structure Analysis
Gabriela Moura, Miguel Pinheiro, Adelaide Valente Freitas, José Luís Oliveira, and Manuel A. S. Santos

Part 4: Identification and Structural Characterization of Noncoding RNAs

29 Comparative Analysis of RNA Genes: The caRNAc Software
Hélène Touzet

30 Efficient Annotation of Bacterial Genomes for Small, Noncoding RNAs Using the Integrative Computational Tool sRNAPredict2
Jonathan Livny


31 Methods for Multiple Alignment and Consensus Structure Prediction of RNAs Implemented in MARNA
Sven Siebert and Rolf Backofen

32 Prediction of Structural Noncoding RNAs With RNAz
Stefan Washietl

33 RNA Consensus Structure Prediction With RNAalifold
Ivo L. Hofacker

Index