Apollo: Collaborative and Scalable Manual Genome Annotation

Nathan A. Dunn (1), Monica C. Munoz-Torres (1), Deepak Unni (2), Eric Yao (3), Colin Diesh (2), Ian Holmes (3), Christine G. Elsik (2) and Suzanna E. Lewis (1)

(1) Lawrence Berkeley National Laboratory, Berkeley, CA
(2) Division of Animal Sciences, University of Missouri, Columbia, MO
(3) Department of Bioengineering, Berkeley, CA

https://github.com/GMOD/Apollo/
http://genomearchitect.org/
@apollo_bbop

Upload: nathan-dunn

Post on 14-Apr-2017


Genome Annotation

Structural Annotation
• exons, introns, UTRs
• repeat regions
• transposable elements
• tRNA, snRNA, snoRNA, miRNA, ncRNA, rRNA

Functional Annotation
• metabolic pathways / functions
• Gene Ontology
  • molecular function
  • biological process
  • cellular component
• expression
• gene families

http://geneontology.org
Photo Credit: Alex Wild at http://www.alexanderwild.com/


Example Genome Analysis Workflow

[Workflow diagram. Stages shown: Experimental design, sampling; Sequencing; Create Assembly; Automated Annotation (FGENESH); Manual Annotation; Curated Gene Set; Comparative analyses; Synthesis & dissemination]

Example Genome Analysis Workflow

Analyses need Quality Data

[Workflow diagram. Stages shown: Experimental design, sampling; Sequencing; Create Assembly; Annotation: Automated Annotation (FGENESH), Manual Annotation; Consensus Gene Set; Comparative analyses; Synthesis & dissemination]

Integration into Workflow and Tools

• Over 100 organizations refine annotation
• Multiple genomes per organization

NCBI, Ensembl

Refined Annotations Distributed to Public

Automated Identification is not Perfect

Automated Annotation

Generation of Gene Models: find ORFs, multiple rounds of gene prediction

Annotation of Gene Models: predicting function, expression patterns, metabolic network memberships

• Assembly errors can cause fragmented annotations
• Limited coverage makes precise identification difficult

Manual Annotation

Manual Annotation Refines Genome

Automated Annotation

Human Analysis

Experimental Evidence: cDNAs, HMM domain searches, RNAseq, genes from other species.

• Additional data
• Biological knowledge
• Curator best represents underlying evidence

Manual Annotation

Apollo is a Tool for Collaborative Annotation

[Diagram: several Annotators, each using Apollo (Google Web Toolkit (GWT) / Bootstrap)]

• Web-based Editor
• Real-time collaborative
• Easy to use
• Genomic browser

Photo Credits: i5K; Alex Wild at http://www.alexanderwild.com/: leaf cutter ant, ensign wasp; Leo Bukeboom: Nasonia vitripennis jewel wasp; Wikimedia Commons: Apis mellifera honey bee; Mike MacNeil USDA/ARS Fort Keogh LARRL: Bos taurus cow.

1 - Evidence Viewer / Genome Browser

Evidence
• Transcripts (GFF3, GBK)
• BAM Reads
• BigWig XY
• BigWig HeatMap

Themes (dark/light)

Color CDS Frame

Automated Annotation

Manual Annotation

1 - Evidence Viewer (Genome Browser)

Dynamically Open and Configure Multiple Tracks

Append via URL:

addStores={"url":{"type":"JBrowse/Store/SeqFeature/GFF3","urlTemplate":"http://host/genes.gff"}}&addTracks=[{"label":"genes","type":"JBrowse/View/Track/CanvasFeatures","store":"url"}]

Statically Configure:
• BAM
• BigWig
• GFF
• GTF
• GBK
• VCF
• FASTA
• FASTAi
• SPARQL
• custom types (e.g., REST end-point)

https://gmod.github.io/jbrowse-registry/
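The "Append via URL" parameters on this slide are JSON values placed in a query string, so they need URL-encoding before they are pasted into a browser. The following is a minimal sketch of how such a link could be assembled. The base address `http://host/jbrowse/index.html` and the helper name `jbrowse_url` are assumptions for illustration; the store and track definitions are copied from the slide.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs


def jbrowse_url(base_url, gff_url, label="genes"):
    """Build a link that asks JBrowse to load one remote GFF3 file as a track.

    base_url is a hypothetical JBrowse address; the store/track settings
    mirror the addStores/addTracks example shown on the slide.
    """
    # Store definition: where the data lives and which adapter reads it.
    stores = {
        "url": {
            "type": "JBrowse/Store/SeqFeature/GFF3",
            "urlTemplate": gff_url,
        }
    }
    # Track definition: how the store named "url" is displayed.
    tracks = [
        {
            "label": label,
            "type": "JBrowse/View/Track/CanvasFeatures",
            "store": "url",
        }
    ]
    # urlencode percent-encodes the JSON so braces and quotes survive the URL.
    query = urlencode(
        {
            "addStores": json.dumps(stores, separators=(",", ":")),
            "addTracks": json.dumps(tracks, separators=(",", ":")),
        }
    )
    return f"{base_url}?{query}"


url = jbrowse_url("http://host/jbrowse/index.html", "http://host/genes.gff")
print(url)
```

Keeping the definitions as Python dictionaries and serializing them at the end avoids hand-editing nested quotes when more tracks are added.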

2 - Genome Annotation Editor

Evidence
• Transcripts (GFF3, GBK)
• BAM Reads
• BigWig XY
• BigWig HeatMap

Automated Annotation

Manual Annotation

Exported Refined Genomic Elements

Create Annotation

Add Annotation by Dragging a Genomic Element

Alignments shown in red

Annotate other genomic types with drop-down

Edit Annotation Structure

Adjust exon by dragging

Editing Annotations

Edit Additional Structural Data (right-click popup)

Edit Associations

• PubMed / dbxref
• Gene Ontology
• Metadata
• key/value
• status
• comments

Change Annotation Type

History of Structural Edits

Edit Annotation Structure

Revertible History of Structural Operations

Current position

Highlighted row shown
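One way to picture a revertible history with a "current position" is a list of recorded states plus an index into it. The sketch below is illustrative only and is not taken from Apollo's code: the class `EditHistory`, its method names, and the choice to discard undone states when a new edit is recorded are all assumptions.

```python
class EditHistory:
    """Linear history of annotation states with a movable current position."""

    def __init__(self, initial_state):
        self._states = [initial_state]
        self._position = 0  # index of the highlighted (current) row

    @property
    def current(self):
        return self._states[self._position]

    def record(self, new_state):
        # Assumption: a new edit drops any states that had been undone.
        del self._states[self._position + 1:]
        self._states.append(new_state)
        self._position += 1

    def undo(self):
        # Step back one entry, stopping at the first recorded state.
        if self._position > 0:
            self._position -= 1
        return self.current

    def redo(self):
        # Step forward one entry, stopping at the newest recorded state.
        if self._position < len(self._states) - 1:
            self._position += 1
        return self.current

    def revert_to(self, index):
        # Jump directly to any row in the history.
        if not 0 <= index < len(self._states):
            raise IndexError("no such history entry")
        self._position = index
        return self.current


# Hypothetical exon boundaries stored as (start, end) tuples.
history = EditHistory((100, 200))
history.record((100, 250))   # exon end dragged to 250
history.record((90, 250))    # exon start dragged to 90
history.undo()               # current state is (100, 250) again
print(history.current)
```

Storing whole states keeps reverting trivial; a real editor might instead store operations and their inverses to save space.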

Annotate Reference Sequence Alterations

Alteration Reflected

3 - Annotator Panel

Search

View / Edit Details

List / Navigate Vertically

Collapsible

Link to Location

Alternate Annotations View

Reference Sequence - Search and Export

Search

Navigation

Export Annotations

Organism: Configuration

Import JBrowse directory

Share "Public" organisms

Create JBrowse tracks from FASTA / GFF3 / BAM / BigWig

Genome Res. 2009 Sep;19(9):1630-8. doi:10.1101/gr.094607.109

Admin: Users and Groups

Add / Search Users

Edit User Permission

User Can "Admin" an Organism

Use Groups to Manage Bulk Permissions

Architecture

Client(s)
• Annotators: Apollo (Google Web Toolkit (GWT) / Bootstrap), JBrowse (DOJO / jQuery)
• Web Services Client: Perl, Shell, Groovy, PHP, etc.

Client-server communication
• WebSocket
• REST

Server
• Apollo Server - Grails
• Security
• JDBC
• File System
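The architecture slide lists scripted web-services clients that talk to the server over REST. As a hedged sketch of what such a client might look like in Python, the code below only builds a request and does not send it. The endpoint path `organism/findAllOrganisms`, the practice of passing credentials in the JSON body, the server address, and the function name are all assumptions that should be checked against the web-services documentation of the Apollo release in use.

```python
import json
from urllib.request import Request


def build_organism_list_request(server_url, username, password):
    """Prepare a POST request asking an Apollo server for its organisms.

    Assumptions (not confirmed by the slides): the endpoint path and the
    JSON body carrying the credentials.
    """
    payload = json.dumps({"username": username, "password": password})
    return Request(
        url=f"{server_url.rstrip('/')}/organism/findAllOrganisms",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and credentials, for illustration only.
request = build_organism_list_request(
    "http://localhost:8080/apollo", "user@example.org", "secret"
)
print(request.get_method(), request.full_url)
```

Against a running server, the prepared request could be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.loads`.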


Summary

[Diagram: several Annotators, each using Apollo (Google Web Toolkit (GWT) / Bootstrap)]

• Real-time collaborative
• Curators refine genome annotations
• Integrates within workflow
• Visual evidence and feedback

Future Work: Coordinate Transform

[Release timeline. Versions shown: Desktop Apollo, Web Apollo, 1.0, 2.0, 2.1, 2.2, 2.3. Milestones shown: Mavenize; DB backend, Sidebar, Grails, Multi-organism, WS; *Coordinate Transformation; *Variant annotation and visualization; Phenotype annotation]

Genome Folding

Assembly Composition

Group20 Group31

Combine Scaffolds

Select to Combine

Drag to Rearrange

Set Orientation

Lock and Orient Combined Scaffolds

Used Scaffolds

Combine Scaffolds

View Individual Features

Variant Annotation and Visualization

[Release timeline repeated. Versions shown: Desktop Apollo, Web Apollo, 1.0, 2.0, 2.1, 2.2, 2.3. Milestones shown: Mavenize; DB backend, Sidebar, Grails, Multi-organism, WS; *Coordinate Transformation; *Variant annotation and visualization; Phenotype annotation]

Annotate Variants

Create from Evidence (e.g., VCF)

Visual Predictions

• Berkeley Bioinformatics Open-source Projects (BBOP), Berkeley Lab: Apollo and Gene Ontology teams. Suzanna E. Lewis (PI).

• § Christine G. Elsik (PI). University of Missouri.

• * Ian Holmes (PI). University of California Berkeley.

• Stephen Ficklin, GenSAS, Washington State University

• Apollo is supported by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from NHGRI. Also supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231

• Alex Wild at http://www.alexanderwild.com/: leaf cutter ant, ensign wasp; Leo Bukeboom: Nasonia vitripennis jewel wasp; Wikimedia Commons: Apis mellifera honey bee; Mike MacNeil USDA/ARS

• Thanks to you and the Apollo / GMOD Communities

Apollo: * Nathan Dunn, Monica Munoz-Torres, Deepak Unni §, Colin Diesh §

JBrowse: Eric Yao

Texas A&M University: Eric Rasche

Gene Ontology: Chris Mungall, Seth Carbon, Jeremy Nguyen

BBOP

NAL at USDA: Christopher Childers, Monica Poelchau, Yu-Yu "Fish" Lin

Apollo: http://genomearchitect.org

https://github.com/GMOD/Apollo/

@apollo_bbop

Questions?

Extra Slides