Scott E. Baker
Pacific Northwest National Laboratory
BMS Annual Scientific Meeting: Exploitation of Fungi
Manchester, UK
September 6, 2005
Genome and proteomic analysis of industrial fungi
Current and future routes to fuels and chemicals
Petroleum products: Petroleum -> Petroleum refinery -> Products: Fuels and chemicals
Biobased products: Complex biomass (agricultural products and “waste”) -> Bio-refinery -> Products: Fuels and chemicals
Filamentous fungi inside the Bio-refinery
Complex biomass (agricultural products and “waste”) -> Fungal fermentation and catalysis -> Simple sugars -> Fungal fermentation and catalysis -> Products: Fuels and chemicals
The world of the mycologist…
Publicly available and pending fungal genome sequence databases
Organism                      Sequencing Institution(s)
Magnaporthe grisea            Broad Institute
Aspergillus nidulans          Broad Institute
Fusarium graminearum          Broad Institute
Ustilago maydis               Broad Institute
Coprinus cinereus             Broad Institute
Coccidioides immitis          Broad Institute
Rhizopus oryzae               Broad Institute
Candida guilliermondii        Broad Institute
Candida tropicalis            Broad Institute
Candida lusitaniae            Broad Institute
Pneumocystis carinii          Broad Institute
Uncinocarpus reesii           Broad Institute
Lodderomyces elongisporus     Broad Institute
Neurospora crassa             Broad Institute/MIPS
Candida albicans              Broad Institute/Stanford
Cryptococcus neoformans       Broad Institute/Stanford/TIGR
Aspergillus niger CBS 513.88  DSM
Leptosphaeria maculans        Genoscope
Podospora anserina            Genoscope
Aspergillus oryzae            Japan
Trichoderma reesei            JGI
Phanerochaete chrysosporium   JGI
Pichia stipitis               JGI
Laccaria bicolor              JGI
Glomus intraradices           JGI
Aspergillus niger             JGI
Nectria haematococca          JGI
Mycosphaerella graminicola    JGI
Mycosphaerella fijiensis      JGI
Sporobolomyces roseus         JGI
Aspergillus niger ATCC 1015   JGI
Spraguea lophii               MBL
Saccharomyces cerevisiae      Multiple
Ashbya gossypii               Multiple
Schizosaccharomyces pombe     Sanger
Fusarium verticillioides      Syngenta/Broad
Botrytis cinerea              Syngenta/Broad/Genoscope
Aspergillus fumigatus         TIGR
Aspergillus flavus            TIGR
Aspergillus terreus           TIGR/Broad Institute
Aspergillus fischerianus      TIGR/Broad Institute
Aspergillus clavatus          TIGR/Broad Institute
Alternaria brassicicola       Wash. U
Why fungal genomics?
Genome sequence paints a high-level picture of organism biology
Genome sequence is a platform for discovery
Genome sequence enables other high-throughput discovery tools (proteomics, transcriptomics, etc.)
A genome project can unite/revive a research community
http://www.aspergillus.man.ac.uk/indexhome.htm?secure/sequence_info/index.php~main
Why a public Aspergillus niger genome project?
A bioprocess organism
  First citric acid process reported in 1917 with wildtype ATCC 1015 Aspergillus niger strain
  Highly efficient fermentation of glucose to citric acid
A protein production organism
  Source of important enzymes
  Industrial protein producer
Sequenced twice by industry
  Public access to sequence with restriction
Large economic footprint
[Figure: strains ATCC 9029 and NRRL 3122; panels labeled ~80X and ~10X]
PNNL A. niger strain (sequenced by Integrated Genomics)
Phylogeny from Robert A. Samson, Jos A.M.P. Houbraken, Angelina F.A. Kuijpers, J. Mick Frank and Jens C. Frisvad. 2004. New ochratoxin A or sclerotium producing species in Aspergillus section Nigri. Studies in Mycology. 50:45-61.
The DOE Aspergillus niger genome project
Proposed to the US Department of Energy Microbial Genome Program by the PNNL Fungal Biotechnology team
Collaboration with DOE’s Joint Genome Institute
Current status
  Final coverage: ~8X shotgun
  Production sequenced to 4X and QC assembly performed; 8X sequencing to be completed by September 14th, assembly and automated annotation to follow
  EST libraries constructed from RNA isolated from citric acid production and complex biomass digestion conditions; ~25,000 to be sequenced
  Annotation: in collaboration with JGI/LANL
  Public release: target of December 1st
Other Aspergillus niger genomes
  ATCC 9029: low coverage, sequenced by Integrated Genomics. Sequence available on request. Contact Scott Baker or Jon Magnuson ([email protected] or [email protected])
  CBS 513.88: DSM announced public release at Asilomar FGC
                             Integrated Genomics    JGI
Date                         2000                   2005
Method                       Shotgun, no finishing  Shotgun plus finishing
Coverage                     ~4-6X                  ~8X
Genomic library insert size  1-2 kb                 3 kb, 8 kb, 40 kb
Contigs or scaffolds         >9000 contigs          <100 scaffolds
The “QC” A. niger ATCC 1015 assembly: 4X coverage
total number of scaffolds: 118
total length of scaffolds: 35634017
N50 scaffold number: 6
N50 scaffold size: 1931570
total number of contigs: 1646
total length of contigs: 34162656
N50 contig number: 215
N50 contig size: 47636
Total: 243,688 reads
  3 kb: 105,065 = 43.1%
  8 kb: 118,655 = 48.7%
  40 kb: 19,968 = 8.2%
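For readers unfamiliar with the assembly statistics on this slide, here is a minimal sketch of how an N50 size and N50 number are commonly computed, plus a check of the read-library percentages quoted above. The toy scaffold lengths are hypothetical (the slide does not list individual scaffold lengths), and the 600 bp average read length is an assumption used only to show that the read count is roughly consistent with about 4X coverage.

```python
def n50(lengths):
    """Return (N50 size, N50 number) for a list of sequence lengths.

    N50 size: length of the sequence at which the running total of the
    lengths, sorted longest first, reaches half of the total.
    N50 number: how many sequences were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("empty list of lengths")


# Hypothetical scaffold lengths (bp), for illustration only.
toy_scaffolds = [500, 400, 300, 200, 100]
print(n50(toy_scaffolds))

# Read counts per library insert size, as given on the slide.
reads_by_insert = {"3 kb": 105_065, "8 kb": 118_655, "40 kb": 19_968}
total_reads = sum(reads_by_insert.values())
fractions = {
    insert: round(100 * count / total_reads, 1)
    for insert, count in reads_by_insert.items()
}
print(total_reads, fractions)

# Rough coverage estimate: reads * read length / assembly length.
# ASSUMED_READ_LENGTH is an assumption, not a figure from the talk.
ASSUMED_READ_LENGTH = 600
scaffold_total = 35_634_017  # total length of scaffolds, from the slide
coverage = total_reads * ASSUMED_READ_LENGTH / scaffold_total
print(f"approximate coverage: {coverage:.1f}X")
```

The library counts sum to the 243,688 reads reported, and the percentages reproduce the slide's 43.1%, 48.7% and 8.2%.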
The “QC” A. niger ATCC 1015 assembly: 4X coverage
Over 40 genes that encode ketosynthase and acyl-transferase domains (i.e. PKSs and FASs)
Mat-1-1 (alpha box)
~95% of the genome is found in 24 scaffolds: 1.5 scaffolds/chromosome arm
EST coverage/annotation
  Genencor to release ~7500 EST sequences from several different growth condition libraries
  JGI will sequence 25,000 ESTs
  The Fungal Genomics program at Concordia University will contribute ~12,000 ESTs
Annotation jamboree tentatively scheduled for April 2006, following the European Conference on Fungal Genetics
Limited gap closure or “finishing” is planned by JGI-LANL
What’s next? A multi-gene phylogeny and more genomes from Aspergillus section Nigri
[Figure: Aspergillus section Nigri phylogeny from Samson et al. 2004, as above. Labels: PNNL A. niger strain (sequenced by Integrated Genomics); DOE MGP, proposed July 14th]
Genome sequence…so what?
General Procedure for Proteomics
Lyse cells
Isolate proteins
Digest with trypsin
Separate in one or more dimensions (reverse phase, ion exchange)
Perform LCQ analysis
Raw data: MS and MS/MS spectra
[Figure: MS spectrum, Deinococcus_1415N_1200_1400 #1606, RT 58.85, SIM ms 1200.00-1400.00; relative abundance vs. m/z]
[Figure: MS/MS spectrum, Deinococcus_1415N_1200_1400 #1604, RT 58.79, full ms2 340.00-2000.00; relative abundance vs. m/z]
Run data through peptide-identifying program (SEQUEST)
Identify unique peptide = identify parent ORF
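The last step, "identify unique peptide = identify parent ORF", can be illustrated with a small sketch. It assumes the usual trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and uses two hypothetical ORF sequences. It only shows the peptide-to-ORF bookkeeping; it is not how SEQUEST matches spectra to peptides.

```python
import re
from collections import defaultdict


def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P."""
    # Zero-width split points; requires Python 3.7 or later.
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", protein) if pep]


def build_peptide_index(orfs):
    """Map each tryptic peptide to the set of ORFs that can produce it."""
    index = defaultdict(set)
    for name, sequence in orfs.items():
        for peptide in tryptic_digest(sequence):
            index[peptide].add(name)
    return index


def parent_orf(peptide, index):
    """Return the parent ORF if the peptide is unique to one ORF, else None."""
    parents = index.get(peptide, set())
    return next(iter(parents)) if len(parents) == 1 else None


# Hypothetical ORF translations, for illustration only.
orfs = {
    "orf_A": "MAKLSTRPGKVE",
    "orf_B": "MAKDDERTTK",
}
index = build_peptide_index(orfs)

print(tryptic_digest(orfs["orf_A"]))
print(parent_orf("DDER", index))  # unique to orf_B
print(parent_orf("MAK", index))   # shared by both ORFs, so not diagnostic
```

A peptide shared by several ORFs (here "MAK") cannot identify a single parent, which is why the slide stresses unique peptides.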
Quantitative proteomics
Used for comparison of biological samples generated by two or more different experimental conditions
Current technologies utilize isotopic labeling strategies
  ICAT
  Metabolic labeling
  Pairwise comparison
Our goal: Generate a quantitative proteomic methodology that uses statistical analysis of raw MS abundance data and does not use isotopic labeling
MASIC: A program for measuring ion peak intensity and area
Scan Number  Scan Time  Scan Type  Total Ion Intensity  Base Peak Intensity  Base Peak MZ
5841         29.997     1          2.2E+07              2.2E+06              610.2610
5842         30.001     2          3.3E+04              8.7E+03              843.7631
5843         30.006     2          2.5E+04              2.4E+03              879.6909
5844         30.012     2          3.8E+04              4.8E+03              489.4742
5845         30.017     2          6.6E+04              2.0E+04              448.2457
5846         30.022     2          3.9E+04              3.8E+03              613.4199
5847         30.027     1          1.5E+07              1.9E+05              609.8882
5848         30.031     2          7.3E+04              5.8E+03              260.1464
5849         30.035     2          9.1E+02              6.2E+01              1364.7010
[Figure: three ion peak plots, x ranges 10300-10500, 10700-10900 and 7800-8000; intensity axes up to 1e+007, 100000 and 100000]
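To make "peak intensity and area" concrete, here is a minimal sketch of the two measurements on a single ion peak: the apex intensity and the area under the peak by the trapezoid rule. The function name and the toy scan/intensity values are hypothetical, and this is not a description of MASIC's actual algorithm, which the slides do not detail.

```python
def peak_intensity_and_area(scans, intensities):
    """Return (apex intensity, trapezoidal area) for one ion peak.

    scans: increasing scan numbers (or times) across the peak.
    intensities: ion intensity observed at each scan.
    """
    if len(scans) != len(intensities) or len(scans) < 2:
        raise ValueError("need two or more matching scan/intensity points")
    apex = max(intensities)
    area = 0.0
    for i in range(1, len(scans)):
        width = scans[i] - scans[i - 1]
        # Average of neighbouring intensities times the spacing between them.
        area += width * (intensities[i] + intensities[i - 1]) / 2
    return apex, area


# Hypothetical peak, for illustration only.
toy_scans = [10400, 10401, 10402, 10403, 10404]
toy_intensities = [0.0, 2e6, 8e6, 3e6, 0.0]
apex, area = peak_intensity_and_area(toy_scans, toy_intensities)
print(apex, area)
```

Area is generally less sensitive than apex height to a single noisy scan, which is one reason both values are worth reporting.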
There are lies, there are damn dirty lies and there are statistics…
Peptides from sample protein
Statistical analysis I
Statistical analysis II
Relative quantitation with confidence intervals (95%)
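The statistical slides survive only as titles, so the model used in the talk is not recoverable here. As one plausible illustration of relative quantitation with a 95% confidence interval, the sketch below assumes that each protein is represented by several peptides measured in two samples, that the per-peptide log2 ratios are roughly normal, and that a t-based interval on their mean is acceptable. The peptide abundances are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1 to 10 degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def ratio_with_ci(sample_a, sample_b):
    """Return (ratio, lower, upper) for A/B from paired peptide abundances.

    The ratio is the geometric mean of the per-peptide ratios; the interval
    is a 95% t interval computed on the log2 scale and back-transformed.
    """
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two or more paired peptide abundances")
    log_ratios = [math.log2(a / b) for a, b in zip(sample_a, sample_b)]
    n = len(log_ratios)
    mean = statistics.mean(log_ratios)
    sem = statistics.stdev(log_ratios) / math.sqrt(n)
    # Beyond 10 degrees of freedom, fall back to the normal value 1.96,
    # which is a slightly narrow approximation.
    t_crit = T_CRIT_95.get(n - 1, 1.96)
    return 2 ** mean, 2 ** (mean - t_crit * sem), 2 ** (mean + t_crit * sem)


# Hypothetical peptide abundances for one protein in two samples.
sample_a = [2.0e5, 4.4e5, 1.9e5, 8.2e5]
sample_b = [1.0e5, 2.0e5, 1.0e5, 4.0e5]
ratio, low, high = ratio_with_ci(sample_a, sample_b)
print(f"ratio {ratio:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Working on the log scale keeps a 2-fold increase and a 2-fold decrease symmetric around zero, which is why the interval is computed there and then back-transformed.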
“Global” proteomic summary chart
[Figure: ratio (log scale, 0.01 to 1) vs. protein number (0 to 50); series: 2 to 1, 4 to 1, 8 to 1 and 16 to 1 rel conc]
Proteomics summary and future directions
Using statistical analysis of raw mass spec data, we have developed a methodology for relative quantitation of proteins across multiple samples
Future experiments:
  Time course experiment: citric acid production in Aspergillus niger
  Strain comparison: Trichoderma reesei QM6a vs Rut-C-30
  Other comparisons: “Production” strains vs. wildtype
  Internal standards for greater quality control
  Isotopic peptides for “absolute” quantitation
Acknowledgements
PNNL Fungal Biotechnology: Ziyu Dai, Jon Magnuson, Chris Wend, Ellen Panisko, Ken Bruno, Kyle Fowler, Kelly Vincent, Bob Romine, Beth Hofstad, Mark Butcher, Katie Panther, Dennis Stiles, Linda Lasure
DOE JGI A. niger ATCC 1015 genome: Dan Drell, Erika Lindquist, Diego Martinez, Paul Richardson, Dan Rokhsar, Chris Detter, David Bruce …the list continues to grow!
Phylogenetic analysis: Rob Samson (CBS), Jens Frisvad (DTU), Dave Geiser (Penn State)
Secretome analysis: Adrian Tsang (Concordia U)
PNNL Proteomics QC Analysis Team: Don Daly, Kevin Anderson, Matt Monroe
Future genomes… Two “lower” fungi through the JGI CSP
Phycomyces (led by Luis Corrochano)
Phycomyces fungi, showing sporangiophores (fruiting bodies) in the wild type and color mutants. Photo by Tamotsu Ootaki.
http://www.jgi.doe.gov/sequencing/why/CSP2006/Pblakesleeanus.html
Piromyces
Piromyces sp E2. Photo by Johannes Hackstein.
http://www.jgi.doe.gov/sequencing/why/CSP2006/piromyces.html