2009 GMOD Meeting · gmod.org/mediawiki/images/7/70/2009_gmod_meeting_dhileep... · 2009. 1. 16.
TRANSCRIPT
2009 GMOD Meeting
Dhileep Sivam & Isabelle Phan
Seattle Biomedical Research Institute
Seattle Biomedical Research Institute (SBRI)
• Founded in 1976
• About 250 full-time staff
• Focus on infectious disease
• 13 Labs
• Strong ties to the University of Washington
• Bioinformatics Core
How we first came to use Chado
[Diagram labels: LmjF Probe Set, LinJ Probe Set; versions LmjF V4.0, LmjF V5.2, LinJ V2.0, LinJ V3.0, LinJ V4.0; Mapping (×3); Result Set (×3)]
Microarray Project
Chado
Nimblegen Data
Parsers
Analysis Tools
• Normalization
• Scaling
• Feature-level aggregation
• Remapping
• Visualization
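The analysis steps named on this slide can be illustrated with a small sketch. The slides do not say which methods were used, so this is an assumption throughout: a log2 transform plus median centering stands in for normalization and scaling, and a per-feature median stands in for feature-level aggregation. All function names, probe IDs, and values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def log_transform(intensities):
    """Log2-transform raw probe intensities (all values must be > 0)."""
    return {probe: math.log2(value) for probe, value in intensities.items()}


def median_center(log_values, target_median=0.0):
    """Shift log values so the array-wide median equals target_median."""
    shift = target_median - median(log_values.values())
    return {probe: value + shift for probe, value in log_values.items()}


def aggregate(log_values, probe_to_feature):
    """Summarize probes per feature (e.g. a gene) by taking the median."""
    grouped = defaultdict(list)
    for probe, value in log_values.items():
        feature = probe_to_feature.get(probe)
        if feature is not None:  # probes without a mapping are skipped
            grouped[feature].append(value)
    return {feature: median(values) for feature, values in grouped.items()}


# Hypothetical raw intensities and probe-to-gene mapping.
raw = {"p1": 256.0, "p2": 1024.0, "p3": 64.0, "p4": 512.0}
mapping = {"p1": "geneA", "p2": "geneA", "p3": "geneB"}

feature_levels = aggregate(median_center(log_transform(raw)), mapping)
print(feature_levels)
```

In this sketch, remapping to a new genome version amounts to passing a different `probe_to_feature` dictionary to `aggregate`, leaving the probe-level values untouched.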
Use Case: SSGCID
Seattle Structural Genomics Center for Infectious Disease
Vaccine Targets!
Gene Cloning & Expression
Protein Crystallization
Structure Determination
Bioinformatic Screening
Project Aim
3D Protein Structure
NIAID Emerging and re-emerging priority pathogens
Structures will serve as a starting point for drug development
Multi-center
SSGCID
Vaccine Targets!
Gene Cloning & Expression
Protein Crystallization
Structure Determination
Bioinformatic Screening
SSGCID
Chado
External Sequence Resources
BLAST Screening
Export
Parsers
Bulk Loader
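As an illustration of the parser and screening boxes in this diagram, here is a minimal sketch that reads BLAST tabular output (the 12 default columns of `-outfmt 6` in BLAST+, or `-m 8` in legacy `blastall`) and keeps hits that pass two thresholds. The slides do not describe the actual screening criteria, so the thresholds, the `Hit` type, and the sample rows are assumptions.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float
    evalue: float
    bitscore: float


def parse_tabular(handle) -> Iterator[Hit]:
    """Yield hits from BLAST tabular output with the 12 default columns."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        yield Hit(row[0], row[1], float(row[2]), float(row[10]), float(row[11]))


def screen(hits, max_evalue=1e-5, min_identity=30.0):
    """Keep hits passing both thresholds (assumed values, not from the slides)."""
    return [h for h in hits if h.evalue <= max_evalue and h.identity >= min_identity]


# Hypothetical sample rows.
SAMPLE = (
    "targetA\tsubj1\t85.00\t200\t30\t0\t1\t200\t5\t204\t1e-50\t350\n"
    "targetA\tsubj2\t25.00\t80\t60\t0\t10\t89\t3\t82\t0.002\t40.1\n"
    "targetB\tsubj3\t45.50\t150\t80\t2\t1\t150\t1\t148\t3e-20\t95.0\n"
)

kept = screen(parse_tabular(io.StringIO(SAMPLE)))
for hit in kept:
    print(hit.query, hit.subject, hit.evalue)
```

A bulk loader would then take the parsed `Hit` records as input rather than the raw text, which keeps format handling separate from database loading.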
Things that have come up…
Complexity of querying BLAST results
Gene Models
Complexity of querying microarray data
Materialized Views
Simplest Possible Model
“Grouping of Genes” DBXrefs
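To show why a materialized view can ease querying of BLAST results, the sketch below flattens a multi-join query into a single precomputed table. It is an assumption-laden illustration: the three tables are heavily reduced versions of Chado's `feature`, `featureloc`, and `analysisfeature`; the name `mv_blast_hit` is hypothetical; and because SQLite has no native materialized views, a table rebuilt on demand stands in for one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT);
CREATE TABLE featureloc (
    feature_id INTEGER, srcfeature_id INTEGER, rank INTEGER);
CREATE TABLE analysisfeature (feature_id INTEGER, significance REAL);

INSERT INTO feature VALUES (1, 'geneA'), (2, 'subj1'), (3, 'hit1');
-- hit1 is located on the query (rank 0) and on the subject (rank 1)
INSERT INTO featureloc VALUES (3, 1, 0), (3, 2, 1);
INSERT INTO analysisfeature VALUES (3, 1e-50);
""")


def refresh_blast_hit_view(conn):
    """Rebuild the flattened table; rerun after loading new hits."""
    conn.executescript("""
    DROP TABLE IF EXISTS mv_blast_hit;
    CREATE TABLE mv_blast_hit AS
    SELECT q.uniquename    AS query_name,
           s.uniquename    AS subject_name,
           af.significance AS evalue
    FROM feature hit
    JOIN featureloc ql ON ql.feature_id = hit.feature_id AND ql.rank = 0
    JOIN featureloc sl ON sl.feature_id = hit.feature_id AND sl.rank = 1
    JOIN feature q ON q.feature_id = ql.srcfeature_id
    JOIN feature s ON s.feature_id = sl.srcfeature_id
    JOIN analysisfeature af ON af.feature_id = hit.feature_id;
    CREATE INDEX mv_blast_hit_query ON mv_blast_hit (query_name);
    """)


refresh_blast_hit_view(conn)

# Consumers query one flat table instead of repeating the five-way join.
rows = conn.execute(
    "SELECT query_name, subject_name, evalue FROM mv_blast_hit WHERE query_name = ?",
    ("geneA",),
).fetchall()
print(rows)
```

The trade-off is staleness: the flattened table only reflects the data as of the last refresh, so it has to be rebuilt after each load.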
Sequence data management at SBRI
warehouse
Proteomics
Microarray
Structural genomics
Data access
curation
Automated analysis pipeline
Chado + GUS: why do we need both?
• Chado
  – Collaboration with IGS
  – Annotation tools: Manatee (Apollo), Ergatis
    • Internal data production
• GUS
  – Collaboration with UPenn
  – Web front end
    • External data access
Sequence data management at SBRI
Chado
Proteomics
Microarray
Structural genomics
Manatee: Manual annotation
Ergatis: Analysis pipeline
GUS
GUS WDK
Chado2GUS: Lost in translation
• Chado
  – Denormalized schema
    • Polymorphism
  – MySQL (IGS Chado)
• GUS
  – Normalized schema
    • Subclassing
  – Postgres port from Oracle
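The polymorphism versus subclassing contrast on this slide can be sketched as follows. This is a minimal illustration, not the actual Chado2GUS code: it assumes one generic feature record whose kind is stored as data, which must be translated into a separate class per kind. All class names, the mapping, and the sample rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChadoFeature:
    """Generic record: the feature's kind is data (a type term), not a class."""
    uniquename: str
    type_name: str


@dataclass
class GeneFeature:
    """Stand-in for a target-side subclass representing genes."""
    name: str


@dataclass
class ExonFeature:
    """Stand-in for a target-side subclass representing exons."""
    name: str


# Hypothetical mapping from type term to target subclass.
SUBCLASS_FOR_TYPE = {"gene": GeneFeature, "exon": ExonFeature}


def translate(features):
    """Split features into translated objects and those with no subclass."""
    translated, lost = [], []
    for feature in features:
        cls = SUBCLASS_FOR_TYPE.get(feature.type_name)
        if cls is None:
            lost.append(feature)  # no target subclass: "lost in translation"
        else:
            translated.append(cls(name=feature.uniquename))
    return translated, lost


# Hypothetical source rows.
source = [
    ChadoFeature("geneA", "gene"),
    ChadoFeature("exonA1", "exon"),
    ChadoFeature("pseudoA", "pseudogene"),
]

translated, lost = translate(source)
print(translated)
print(lost)
```

Under these assumptions, any type term without a matching subclass has nowhere to go, so the translation layer has to either extend the mapping or report the loss explicitly.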
Picking the best of two worlds
• Chado
  – Biological data model
  – Flexibility
• GUS
  – Software engineering
  – Flexibility
The future?
• SQL-free data production
  – Instead of custom wrappers over raw SQL:
    • ORMs: Chado Hibernate, ActiveRecord
    • Unified object model
• RDBMS-free data mining
  – Instead of GUS predefined query + set combination
    • BioMart + Galaxy
    • RDF + triple store + SPARQL (object store + Lucene)
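The idea behind the triple-store option can be sketched with a toy in-memory store. This is an illustration only, not a SPARQL engine: it assumes data held as (subject, predicate, object) triples, queried by patterns in which `None` is a wildcard, with results from two patterns combined by set intersection. The class, predicates, and data are hypothetical.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Hypothetical data.
store = TripleStore()
store.add("geneA", "organism", "Leishmania major")
store.add("geneB", "organism", "Leishmania major")
store.add("geneC", "organism", "Leishmania infantum")
store.add("geneA", "has_structure", "yes")
store.add("geneC", "has_structure", "yes")

# Two patterns, combined by intersecting their subjects.
in_organism = {s for s, _, _ in store.match(predicate="organism", obj="Leishmania major")}
with_structure = {s for s, _, _ in store.match(predicate="has_structure", obj="yes")}
result = sorted(in_organism & with_structure)
print(result)
```

New kinds of facts are added as new predicates without any schema change, which is the flexibility that makes this approach attractive for data mining.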