April 12, 2004 Michael Conrad Memorial Lecture
The Future of Bioinformatics
Philip E. Bourne
The University of California San Diego
[email protected]
http://www.sdsc.edu/pb
Many of Michael’s contributions are now being more fully realized in the fields of bioinformatics and systems biology. We will explore current and future trends in these fields to further appreciate Michael’s vision.
We Have Come a Long Way…
It will be through the increasing merger of computer science, computational science, information science and the life sciences that Michael’s foresights will be fully appreciated.
Large amounts of complex data put these disciplines on the same page, and the book of bioinformatics can be written. It is therefore appropriate that today we spend time looking at the immediate future of bioinformatics.
Today’s Outline
We will address the following questions from two perspectives – data complexity and biological complexity:
• How did bioinformatics get here?
• What are the challenges today? (Apology – many illustrations are drawn from our own work in structural bioinformatics)
• What will the short and long term future hold?
Disclaimer - Plotting Change
[Figure: ANYTHING plotted against TIME, with a "You Are Here" marker]
“The thing about change is that things will be different afterwards.” - Alan McMahon
Rules of Prediction
• Looking back, everything appears to have developed faster than reality
• Looking forward, everything will develop faster than you predict
• Hence, we are all very poor at predicting beyond the next 5 years – examples:
  • The Next Fifty Years: Science in the First Half of the Twenty-first Century, John Brockman (Editor)
  • CACM Volume 40, Issue 2 (February 1997)
"This is like deja vu all over again."
Can I even do 5 years?
Bourne, Bioinformatics Editorial 1999 15(9):715: “Over the next 5 years there will be an estimated 10 major structural genomics efforts each yielding 200 structures per year. While these efforts will deplete regular structure determination efforts, improvements in technology and a general expansion of the field will continue to yield 50 structures per week worldwide outside of the structural genomics initiatives.”
Net result: 35,000 structures by 2005
"You can observe a lot just by watching."
There were 11,000 structures at the time of this prediction
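The 35,000 figure can be roughly reproduced from the numbers quoted in the editorial. The sketch below is only a back-of-the-envelope check: it assumes 52 weeks per year and that both rates hold constant for the full 5 years, neither of which the editorial states.

```python
# Back-of-the-envelope check of the 1999 prediction.
# Assumptions (not stated in the editorial): 52 weeks per year and
# constant deposition rates over the whole 5-year window.
STRUCTURES_AT_PREDICTION = 11_000
YEARS = 5
SG_EFFORTS = 10              # structural genomics efforts
SG_PER_EFFORT_PER_YEAR = 200
OTHER_PER_WEEK = 50          # structures outside structural genomics
WEEKS_PER_YEAR = 52          # assumption


def predicted_total() -> int:
    """Total structures after YEARS, starting from the 1999 count."""
    sg_per_year = SG_EFFORTS * SG_PER_EFFORT_PER_YEAR
    other_per_year = OTHER_PER_WEEK * WEEKS_PER_YEAR
    return STRUCTURES_AT_PREDICTION + YEARS * (sg_per_year + other_per_year)


if __name__ == "__main__":
    print(f"Estimated total: {predicted_total():,}")
```

Under these assumptions the estimate comes out slightly below the quoted 35,000, which is consistent with the slide's rounded "net result".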
PDB Growth Curve
Approx. 25,000 structures today
In 2003 approx. 5,000 structures were deposited
History
Predictions Can Be Good
So Let Us Review the History of Bioinformatics Thus Far – General Observations
• A scientific endeavor driven out of a paradigm shift in which biology became a data driven science – today macromolecular structure data will be used to illustrate this paradigm shift
• A relatively new term for a scientific endeavor that has been around much longer
• Medical informatics preceded it, and defined some of the foundations
• A scientific endeavor that has gained from fundamental developments in computer and information science (e.g., algorithms, ontologies, Bayesian networks, simulation, neural networks, text mining) and which in turn defines new problem domains for computer science
• Systems biology may overtake it
"Do you mean now?" -- When asked for the time.
A More Specific Chronology – Pre 1970
Bioinformatics (2003) 19 2176-2190
• 1945 Biochemical Pathways – Horowitz
• 1953 Structure of DNA – W&C
• 1969 Genetic Variation
• 1953 Game Theory – von Neumann and Morgenstern
• 1959 Grammars – Chomsky
• 1962 Information Theory – Shannon & Weaver
• 1966 Cellular automata – von Neumann
• 1962 Molecular Homology – Florkin
• 1965 Evolutionary Patterns – Pauling
• 1966 Molecular Modeling – Levinthal
• 1967 Phylogenetic Trees – Fitch
• 1969 Properties – Ptitsyn
• 1970 Dynamic Programming – N&W
• 1970 Adaptability – Conrad
A More Specific Chronology – 1970’s: Problem Definition
• Improved Sequence Alignments – Sankoff
• Structural Patterns and Properties – Richards
• Smith-Waterman Algorithm
• Exons/Introns – Gilbert
• Structure Prediction – Levitt; Chou and Fasman; Scheraga
• Public Resources – Dayhoff, PDB
• Information processing in molecular systems – Conrad
A More Specific Chronology – 1980’s: Computational Biology Emerges
• Domains recognized – Rashin
• Tree of Life Emerges
• FASTA – Lipman & Pearson
• Profiles – Gribskov
• Reductionism begins – Thornton, Sander
• Neural nets – Hopfield
• Molecular computing – Conrad
• Nanotechnology – Drexler
• Clustering – Shepard
• Relational Databases
• Networks – EMBLnet, BIONET
A More Specific Chronology – 1990-: Bioinformatics and Biotechnology Emerge
• Human Genome Project
• Internet/Web
Conrad, M., Adaptability theory as a guide for interfacing computers and human society, Systems Research 10, 3-23 (1993).
2004 – Overview of the Current Challenges
Genomes -> Gene Products -> Structure & Function -> Pathways & Physiology
~ Scientific Challenges - Deciphering the genome, mapping the genotype-phenotype relationships, dissecting organismic function, engineering organisms with altered functionality, figuring out complex traits and polymorphism, understanding physiology.
~ Algorithmic Challenges - comparisons of whole and partial genomes, metrics for similarity and homology, metabolic reconstruction, dissecting pathways, and whole cell modeling.
~ Computational Challenges - creating the informatics infrastructure, information integration, annotation, curation and dissemination of databases, development of parallel computational methods.
Sociological Challenge
[Figure: Bioinformatics journal, submissions per year, 1997-2003 (axis 0 to 1400)]
[Figure: Bioinformatics journal, impact factor per year, 1997-2003 (axis 0 to 5)]
Data from Bioinformatics
Growth outweighs readership, particularly among biologists
Bioinformatics - A Vice Chancellor’s View
[Figure: biological complexity and technology plotted against year (90, 95, 00, 05). (C) Copyright Phil Bourne 1998]
• Pipeline: Biological Experiment -> Data -> Information -> Knowledge -> Discovery
• Steps: Collect, Characterize, Compare, Model, Infer
• Complexity axis: Sequence, Structure, Assembly, Sub-cellular, Cellular, Organ, Higher-life
• Technology axis: Computing Power, Sequencing Technology, Data (1, 10, 100, 1000, 100000)
• Milestones shown: Human Genome Project, E. coli Genome, C. elegans Genome, 1 Small Genome/Mo., ESTs, Yeast Genome, Gene Chips, Virus Structure, Ribosome, Model Metabolic Pathway of E. coli, Brain Mapping, Genetic Circuits, Neuronal Modeling, Cardiac Modeling, Human Genome
• # People/Web Site: 10^6, 10^2, 1
A Data Centric View of the Future
• Data complexity
• High throughput data collection
• Database vs literature
• Bioinformatics as data driver
• Data representation
• Data integration
"If you come to a fork in the road, take it."
Numbers and Complexity
[Figure: (a) myoglobin (b) hemoglobin (c) lysozyme (d) transfer RNA (e) antibodies (f) viruses (g) actin (h) the nucleosome (i) myosin (j) ribosome. Courtesy of David Goodsell, TSRI]
Complexity is increasing
High Throughput - The Structural Genomics Pipeline (X-ray Crystallography)
Basic Steps
• Target Selection
• Crystallomics: Isolation, Expression, Purification, Crystallization
• Data Collection
• Structure Solution
• Structure Refinement
• Functional Annotation
• Publish
Bioinformatics Throughout the Process
• Bioinformatics: distant homologs, domain recognition
• Automation; Bioinformatics: empirical rules
• Automation; better sources
• Software integration; decision support
• MAD phasing; automated fitting
• Bioinformatics: alignments, protein-protein interactions, protein-ligand interactions, motif recognition
An Aside on the Future of Publishing
Full Description Captured as the Paper/Database is Written/Deposited
Does away with ...
“… the p53 core domain structure consists of a β sandwich that serves as a scaffold for two large loops and a loop-sheet-helix motif ...” - Science Vol. 265, p346
1TSR: corresponding structure from the PDB
Oops! β sandwich? Where? Large loop? Which one? Loop-sheet-helix?
BioEditor - A DTD Driven Domain Specific Editor
http://bioeditor.sdsc.edu
Bioinformatics 2003 19(7) 897-898
The Data - Bioinformatics Cycle
Result – Computation and Experiment become More Synergistic
[Figure: cycle between Data and Bioinformatics]
• Turn Data into Knowledge
• Turn Knowledge into New Data Requirements
Deuterium Exchange Mass Spec to Predict Structure
Woods, Baker et al.
[Figure: workflow diagram. Labels: Target Protein; Sequence; Structure Templates (X-ray or NMR; CASP: homology, threading, ab initio, others); DXMS; COREX; Stability per Amino Acid; Profile Match Method; Best Structure(s)]
Biological Representation
• The Gene Ontology changes everything
  • Molecular function
  • Biochemical process
  • Cellular location
  • DAG – machine usable
• The number of papers referencing the gene ontology has increased dramatically in the last year
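To make the "DAG – machine usable" point concrete, here is a minimal sketch of how a program can exploit an ontology stored as a directed acyclic graph. The term graph and the gene annotations are a hand-made toy fragment for illustration only; they are not taken from a Gene Ontology release, and the names `IS_A`, `ANNOTATIONS`, `ancestors` and `annotated_to` are hypothetical.

```python
# Toy is_a graph: child term -> list of parent terms.
# Hypothetical fragment for illustration, not a real ontology release.
IS_A = {
    "protein kinase activity": ["kinase activity",
                                "catalytic activity, acting on a protein"],
    "kinase activity": ["transferase activity"],
    "transferase activity": ["catalytic activity"],
    "catalytic activity, acting on a protein": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "molecular function": [],
}

# Hypothetical annotations: gene product -> most specific term.
ANNOTATIONS = {
    "geneA": "protein kinase activity",
    "geneB": "transferase activity",
}


def ancestors(term, graph):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(graph.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(graph.get(parent, []))
    return seen


def annotated_to(query, annotations, graph):
    """Gene products annotated to `query` directly or via a more specific term."""
    return sorted(
        gene for gene, term in annotations.items()
        if term == query or query in ancestors(term, graph)
    )


if __name__ == "__main__":
    print(annotated_to("catalytic activity", ANNOTATIONS, IS_A))
```

Because a term may have more than one parent, the structure is a graph rather than a tree; the traversal tracks visited terms so shared ancestors are counted once.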
Biological Data Representation Future
• Tools to construct ontologies from free text?
• Ontologies for details of function, protein-protein interaction, protocols, complete pathway information
Data Integration
Web Services – the holy grail of interoperability?
Web Services
• It's not CORBA – biologists can do it
• You no longer have to remember where you left it – i.e. registries
• Platform independent
• Driver to force data providers to define and publish a detailed API
• Compelling - introduces the prospect of global workflow
Perl Web Services Client Example
A small Perl program to access all PubMed abstracts containing the word 'ferritin'
use strict;
use warnings;
use SOAP::Lite;

# Call the remote pubmedAbstractQuery method with the term given on the command line
my $ids_ref = SOAP::Lite
  -> uri('http://server.location.edu/pdbWebServices')
  -> proxy('http://server.location.edu/pdbWebServices')
  -> pubmedAbstractQuery($ARGV[0])
  -> result;

# The result is a reference to an array of IDs; dereference it and print
my @ids = @{$ids_ref};
print "@ids\n";

Mycomputer(1)% web_service.pl ferritin
1AEW 1AQO 1BCF 1BFR 1BG7 1DPS 1EUM 1FHA 1JGC 1JI5 1JIG 1MFR 1QGH 1RCC 1RCD 1RCE 1RCG 1RCI 1RYT 2FHA
The Future - A Biological Complexity Perspective
[Figure: scales of scientific research & discovery]
• Levels: Organisms; Organs; Cells; Macromolecules; Biopolymers; Atoms & Molecules
• Representative discipline: Anatomy; Physiology; Cell Biology; Proteomics; Genomics; Medicinal Chemistry
• Example units: Heart; Neuron; Structure; Sequence; Protease Inhibitor
• Representative technology: MRI; Electron Microscopy; Migratory Sensors; Ventricular Modeling; X-ray Crystallography; Protein Docking
• Supporting layers: Technologies; Training; Infrastructure; Simulation; Data
Exploring Biological Complexity Requires:
• We do NOT neglect the details
• Synergy between theory and experiment, which highlights the need for better algorithms and quality control
But…
• We have existing and emerging technologies to measure complex systems
• Provides the opportunity to address some of biology’s fundamental questions
Structure is a Useful Tool to Study Biological Complexity as Nature has Provided a Helping Hand…
• An average protein is 350 amino acids in length; with 20 amino acids there are 20^350 possible proteins – way more than all the atoms in the universe
• In actuality there may be only 2-5x10^6 proteins
• There are likely between 1-5000 unique folds
• Fold is far more conserved than sequence and permits us to look back farther in evolutionary time than sequence
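The size of sequence space quoted above is easy to check numerically. The sketch below compares 20^350 with the number of atoms in the observable universe; the value of roughly 10^80 atoms is a commonly quoted order-of-magnitude estimate and is an assumption here, not a figure from the lecture.

```python
import math

SEQUENCE_LENGTH = 350     # average protein length from the slide
ALPHABET_SIZE = 20        # standard amino acids
ATOMS_LOG10 = 80          # assumption: ~10^80 atoms in the observable universe


def log10_sequence_space(length: int, alphabet: int) -> float:
    """log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


if __name__ == "__main__":
    exponent = log10_sequence_space(SEQUENCE_LENGTH, ALPHABET_SIZE)
    print(f"20^{SEQUENCE_LENGTH} is about 10^{exponent:.0f}")
    print(f"That exceeds 10^{ATOMS_LOG10} by about {exponent - ATOMS_LOG10:.0f} orders of magnitude")
```

Working in logarithms keeps the comparison readable; Python integers could also hold 20**350 exactly, which gives the same order of magnitude.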
But... much detail remains and our current methodologies fall short.
Consider structure comparison and alignment of the diverse protein kinases
An Example of a Structural Superfamily: The Protein Kinase-Like Superfamily
Superfamily: not all eukaryotic or protein kinases: some homologues discovered in bacteria that phosphorylate antibiotics, others phosphorylate lipids
Typical Kinase Core (c-Src, PDB ID: 2SRC)
SCOP grouping for kinases
1) Class: Alpha+Beta
2) Fold: Protein Kinase Catalytic Core
3) Superfamily: Protein Kinase Catalytic Core
4) Families:
a) Ser/Thr Kinases
b) Tyr Kinases
c) Atypical Kinases
d) Antibiotic Kinases
e) Lipid Kinases
Evolution of the Kinase Superfamily: Comparison of Three Superfamily Members
•A: Casein kinase 1 (PDB ID: 1CSN)
•B: Aminoglycoside kinase (PDB ID: 1J7L)
•C: Phosphatidylinositol 3-kinase (PDB ID: 1E8X).
•D: The previous three structures with only their shared region superposed (1CSN: light blue, 1J7L: red, 1E8X: yellow).
•The three kinases share a minimal core required for ATP binding and phosphotransfer.
An accurate alignment would allow us to look back farther in evolutionary time than sequence alone. Alignment algorithms need to simulate what humans can do and beyond
An Example of Manual vs. Automated with Combinatorial Extension (CE)
•The manual alignment can be used to better understand the limitations of our automated method
•Alignment of helix C of two tyrosine kinases
•Insulin Receptor Kinase (PDB ID: 1IR3)
•c-Src (PDB ID: 2SRC)
•Can be aligned with 40% identity, 3.0Å RMSD
•In Src, C-helix is displaced and rotated outward
•Rotation pushes N-terminal end of helix out very far from N-terminal end of IRK
•CE gaps a part of this (yellow), splitting helix, aligning part of IRK helix C with loop leading to helix C in Src
Orange: IRK, Blue: c-Src, Yellow: CE gap region
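The two figures of merit quoted for this alignment, percent identity and RMSD, can be sketched as follows. This is not the CE algorithm: it assumes the residue pairing is already given and the coordinates are already superposed, and it uses one common convention for identity (matches divided by aligned, non-gap positions). The function names and the toy inputs are illustrative assumptions.

```python
import math


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned (non-gap) positions where the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between paired, pre-superposed 3-D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy inputs, not taken from 1IR3 or 2SRC.
    print(percent_identity("ACDE-", "ACGEK"))
    print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]))
```

A real comparison would first find the optimal superposition and residue pairing, which is the hard part that CE and manual curation address.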
Improving CEfam: Multiple Alignments with CE
•Example with strands 1 and 2 of kinase superfamily
•A: original
•B: optimal parameters
•C: manual
•Parameters also improved results with other protein superfamilies in visual analysis
•Just as sequence alignments are benchmarked against structure alignments, structure alignments should be benchmarked to manual results
•Improvement in optimization is now being folded into the next generation of CE
Quality Control
Consider an example: the definition of domains from 3-D structure
The 3D Domain Assignment Problem
Domain is a fundamental structural, functional and evolutionary unit of protein:
Compact
Stable
Have hydrophobic core
Fold independently
Perform specific function
Can be re-shuffled and put together in different combinations
Evolution works on the level of domain
Exact assignment of domains remains a difficult and unresolved problem.
There is no complete agreement among experts on domain assignment given a protein structure.
Expert methods agree on 80% of all existing manual assignments, the remaining 20% represent “difficult” cases
Expert assignment #1
Expert assignment #2
Expert assignment #3
Manual vs. automatic consensuses: do they overlap?
• Manual and automatic consensus agree: 328 chains (77.3% of chains with consensus)
• Automatic consensus only: 46 chains (10.9% of chains with consensus)
• Manual consensus only: 47 chains (11.1% of chains with consensus)
• Automatic consensus and manual consensus disagree: 3 chains (0.7% of chains with consensus)
• Chains with manual consensus: 375 (80% of entire dataset)
• Chains with automatic consensus: 374 (80% of entire dataset)
• Chains with consensus (automatic or manual): 424 (90.6% of entire dataset)
Veretnik et al. 2004 JMB in press
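The percentages on this slide follow from the chain counts, and a short script makes the bookkeeping explicit. The dataset size is not given on the slide; the sketch below infers it from "424 chains = 90.6% of entire dataset", which is an assumption. Recomputed shares may differ from the slide by about 0.1 percentage point, presumably because of rounding.

```python
# Chain counts as quoted on the slide.
COUNTS = {
    "agree": 328,
    "automatic only": 46,
    "manual only": 47,
    "disagree": 3,
}
CONSENSUS_FRACTION_OF_DATASET = 0.906  # from the slide


def shares(counts):
    """Percentage of chains with consensus that fall into each category."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def implied_dataset_size(counts, fraction):
    """Dataset size inferred from the consensus fraction (an assumption)."""
    return round(sum(counts.values()) / fraction)


if __name__ == "__main__":
    total = sum(COUNTS.values())
    print(f"Chains with consensus: {total}")
    for name, pct in shares(COUNTS).items():
        print(f"{name}: {pct:.1f}%")
    size = implied_dataset_size(COUNTS, CONSENSUS_FRACTION_OF_DATASET)
    print(f"Implied dataset size: {size}")
    print(f"Manual consensus share: {100.0 * 375 / size:.0f}%")
    print(f"Automatic consensus share: {100.0 * 374 / size:.0f}%")
```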
1cjaa (actin-fragmin kinase, slime mold): an unusual kinase [complex interface]
[Figure: domain assignments for 1cjaa alongside a typical kinase]
• Assignments shown: 1 domain; 1 domain + unassigned; 4 domains
• Methods shown: DALI; CATH; SCOP, PDP, DomainParser
Exemplar Bioinformatics Problems: The Next 5 Years…
1. Full genome comparisons
2. Rapid assessment of polymorphic variations
3. Complete construction of orthologous and paralogous groups
4. Structure resolution of large assemblies/complexes
5. Dynamical simulation of realistic systems
6. Rapid structural/topological clustering of proteins
7. Protein folding
Exemplar Bioinformatics Problems: The Next 5 Years
8. Computer simulation of membrane insertion
9. Simulation of cellular pathways/sensitivity analysis of pathways stoichiometry and kinetics
10. Comparison of complex networks and pathways
11. Deciphering the metabolome
12. Integration and interpretation of data at different biological scales – genomic to population
13. Identification of biomarkers for use in diagnostic medicine
These problems will be dealt with by a new generation of scientists comfortable at both the bench and computer. Until then bioinformaticians need to work hard to overcome the “high noon” problem
High Noon – A Working Definition
12:00
The cost:benefit ratio of entry to bioinformatics tools and resources is too high for the majority of biologists
Thus, those who could gain and contribute most from the services provided are not users
One Approach - MBT
• Java toolkit for developing custom molecular visualization applications
• High-quality interactive rendering of: sequence, structure, function
http://mbt.sdsc.edu
MBT Architecture
Future - The Structure Should be the User Interface
Ligand - What other entries contain this?
Chain - What other entries have chains with >90% sequence identity?
Residue - What is the environment of this residue?
Beyond 5 Years…
• Translational medicine
• Personalized medicine
• Merger of medical-, chem- and bio-informatics
• Societies that reflect this
• Training in cooperative in silico and experimental research
• Centers that reflect that training, i.e. different from NCBI or EBI
"Think! How the hell are you gonna think and hit at the same time?"
Beyond 5 Years
• Simulations used in the clinic setting
• Smart {genome} cards
• A ubiquitous life sciences Web that permits views from populations to atoms
"I knew I was going to take the wrong train, so I left early."
Acknowledgements
• To all those who have chosen bioinformatics as a career and make the field so rich
• Particularly those who do so for lesser rewards – the data providers and annotators
• My group for the fun we had discussing this topic
http://rinkworks.com/said/yogiberra.shtml
"I didn't really say everything I said."