Contribution of Epigenetic Variation to Expression Changes Among Tissues and Genotypes
Steve Eichten – Springer Lab
PAG iPlant Workshop
1/17/12
Outline
• Questions we are interested in
• Using iPlant to assist in answering these questions
• Current and future analysis
Epigenetics & Gene Expression
• Heritable variation not due to sequence variation
– DNA methylation, histone modification, etc.
– Classic examples: imprinting, paramutation
Areas of research:
• Epigenomic variation across genotypes & development
• Relationship between genetic & epigenetic variation
• Role of epigenetic variation in phenotype
(Chandler & Stam, 2004)
Epigenetic variation
• Across genotypes:
– Maize Nested Association Mapping population (26 inbreds)
• Across development:
– 5 tissues
• Embryo
• Endosperm
• Leaf
• Immature ear
• Tassel
[Figure labels: 25 DL, SSD. Inbreds shown: B97, CML103, CML228, CML247, CML277, CML322, CML333, CML52, CML69, Hp301, Il14H, Ki11, Ki3, Ky21, M162W, M37W, Mo18W, MS71, NC350, NC358, Oh43, Oh7B, P39, Tx303, Tzi8]
To answer these questions…
• Genome-wide methylation assessment
– meDIP-chip (2.1M probe platform)
– BS-seq (for B73 and Mo17 inbreds)
• Genome-wide expression assessment
– RNA-seq (~20M reads x 25 inbreds)
Technology  Leaf         Embryo      Endosperm   Ear         Tassel      TOTAL
meDIP-chip  75           6           6           6           6           99 arrays
BS-seq      179,109,340                                                  179,109,340 reads
RNA-seq     558,210,732  42,581,924  43,419,680  85,671,330  73,105,335  802,989,001 reads

~98,200,000,000 bases
209,949,399 array data points
• Bioinformatic knowhow
• Computational power and storage
– iPlant!
Old way
1. Buy larger hard drives
2. Upgrade computers
3. Buy more hard drives
4. Gain basic understanding of UNIX / Perl / R / etc…
5. Lots of command line work / installation / troubleshooting
- Array normalization / fastqc / tophat / cufflinks / DEGseq / ……
6. Realize you want to run analyses a different way. {GOTO 4}
7. Realize you really need more computing power to do what you want. {GOTO 2 or make some friends at a supercomputing institute}
8. Train others in your lab group how to do it! {Document everything and GOTO 1}
New way
• Move sequence data to iPlant DE
• Select apps for desired analysis
• Run software faster than you can locally
• Quickly adapt methods to find optimum
• Develop analysis pipeline for others to use
• Store large files for others in your lab group to access
Larger storage + Faster Computing + Faster Training + Faster Adaptation + Faster Implementation = more user-friendly, more powerful computing
Applying iPlant to maize epigenomics
• fastq files
• iRODS file transfer to iPlant
• Assess read quality (FastQC)
• Tophat aligner to pre-indexed maize reference genome
• Cufflinks transcript assembly against maize gene models
• Download files from iPlant, display in IGV
(FastQC, Tophat, and Cufflinks steps: iPlant integrated Apps)
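The quality check, alignment, and quantification steps on this slide can be sketched as command lines. The following is a minimal sketch only, not the configuration used inside the iPlant apps: the file names (`B73_leaf.fastq`, `maize_index`, `maize_genes.gtf`), the output directory, and the thread count are hypothetical, and the choice of Cufflinks `-G` (quantify against supplied gene models only) is an assumption about how "against maize gene models" was run. The function only builds the argument lists; it does not execute the tools.

```python
from pathlib import Path


def build_commands(fastq, bowtie_index, gene_models, out_dir="results", threads=4):
    """Return argument lists for FastQC, TopHat, and Cufflinks (nothing is executed).

    All paths are placeholders supplied by the caller; the FastQC output
    directory must already exist before that command is actually run.
    """
    out = Path(out_dir)
    sample = Path(fastq).name.split(".")[0]
    tophat_dir = out / f"{sample}_tophat"
    cufflinks_dir = out / f"{sample}_cufflinks"
    return [
        # 1. Read quality report.
        ["fastqc", "-o", str(out), str(fastq)],
        # 2. Spliced alignment against a pre-built Bowtie index.
        ["tophat", "-p", str(threads), "-o", str(tophat_dir),
         str(bowtie_index), str(fastq)],
        # 3. Quantification against known gene models (assumed -G usage).
        #    TopHat writes its alignments to accepted_hits.bam in its output dir.
        ["cufflinks", "-p", str(threads), "-G", str(gene_models),
         "-o", str(cufflinks_dir), str(tophat_dir / "accepted_hits.bam")],
    ]


if __name__ == "__main__":
    # Hypothetical inputs, printed as shell-style command lines.
    for cmd in build_commands("B73_leaf.fastq", "maize_index", "maize_genes.gtf"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where the tools are installed; keeping command construction separate from execution makes it easy to rerun the same steps with different settings.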
Methylation variation in B73 and Mo17
[Figure: axis labeled Chromosome, 1 to 10]
• Most locations show similar methylation profiles
• Identified ~700 differentially methylated regions (DMRs) in B73 and Mo17 inbreds
Eichten et al., 2011. PLoS Genetics
(B73 methylated, Mo17 not)
(Mo17 methylated, B73 not)
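To make the idea of a differentially methylated region concrete, here is a minimal sketch that scans probe-level methylation values for two genotypes and reports runs of adjacent probes that differ in the same direction. This is an illustration only and is not the method of Eichten et al., 2011: the thresholds `min_diff` and `min_probes`, the probe positions, and the methylation values are all made-up toy assumptions.

```python
def call_dmrs(positions, meth_a, meth_b, min_diff=1.0, min_probes=3):
    """Return (start, end, direction) for runs of adjacent differing probes.

    direction is "A>B" when genotype A is more methylated, otherwise "B>A".
    Thresholds are illustrative assumptions, not published values.
    """
    dmrs = []
    run = []        # indices of probes in the current run
    run_sign = 0    # +1 if A > B in the current run, -1 if B > A, 0 if no run

    def flush():
        # Keep the current run only if it spans enough probes.
        if len(run) >= min_probes:
            direction = "A>B" if run_sign > 0 else "B>A"
            dmrs.append((positions[run[0]], positions[run[-1]], direction))

    for i, (a, b) in enumerate(zip(meth_a, meth_b)):
        diff = a - b
        sign = 1 if diff >= min_diff else -1 if diff <= -min_diff else 0
        if sign != 0 and sign == run_sign:
            run.append(i)
        else:
            flush()
            run = [i] if sign != 0 else []
            run_sign = sign
    flush()
    return dmrs


# Toy probe positions and methylation values (not real data).
positions = [100, 200, 300, 400, 500, 600, 700, 800]
b73 = [0.1, 2.0, 2.2, 2.1, 0.2, 0.1, 0.0, 0.1]
mo17 = [0.0, 0.1, 0.2, 0.0, 0.1, 1.9, 2.0, 2.3]
print(call_dmrs(positions, b73, mo17))
```

With these toy values the sketch reports one region where the first genotype is more methylated and one where the second is, matching the two classes in the slide legend.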
Examples of expression changes correlated with epigenetic state
Chromosome 6
Methylation inversely correlated with expression state
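An inverse relationship of this kind can be checked with a simple correlation across genotypes for one gene. The sketch below computes a Pearson correlation between a methylation measure and an expression measure; the six paired values are invented toy numbers, and the choice of Pearson correlation is an assumption, not the analysis used for the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy values for one gene across six genotypes (not real data):
# methylation signal near the gene and its expression level.
methylation = [0.1, 0.2, 1.8, 2.1, 0.0, 1.9]
expression = [8.5, 7.9, 1.2, 0.4, 9.1, 0.9]

r = pearson(methylation, expression)
print(f"r = {r:.3f}")  # strongly negative for these toy values
```

A value of r close to -1 means that genotypes with higher methylation at the locus tend to show lower expression of the gene.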
Inbred-specific examples of methylation & expression variation
Hundreds of genes correlated with epigenetic state
Methylation & Expression variation across NAM
Chromosome 4
Putative targets of heritable epigenetic variation
Analysis of DNA methylation patterns across additional tissues and genotypes
B73 embryo
B73 endosperm
B73 leaf
Mo17 embryo
Mo17 endosperm
Mo17 leaf
Ki11 leaf
Mo18W leaf
NC358 leaf
Oh7B leaf
• DNA methylation patterns are generally quite similar among genotypes and tissues.
• However, there are ~1000 DMRs between any two genotypes.
• Variation frequently acts equally upon all tissues.
• Few tissue-specific DMRs, and these are rarely conserved between genotypes
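One way to read the last two bullets is as a classification of each genotype difference by the tissues in which it appears. The sketch below is a hypothetical illustration of that bookkeeping: the function name, the labels, the `min_diff` threshold, and the example values are all assumptions and do not come from the presented analysis.

```python
def classify_dmr(diff_by_tissue, min_diff=1.0):
    """Classify a region from its per-tissue methylation difference (A minus B).

    diff_by_tissue maps tissue name -> difference between two genotypes.
    Returns "all tissues", "tissue-specific", "mixed direction", or "no difference".
    """
    differing = {t for t, d in diff_by_tissue.items() if abs(d) >= min_diff}
    if not differing:
        return "no difference"
    if differing == set(diff_by_tissue):
        # Difference seen everywhere; check it points the same way in each tissue.
        same_direction = len({diff_by_tissue[t] > 0 for t in differing}) == 1
        return "all tissues" if same_direction else "mixed direction"
    return "tissue-specific"


# Toy differences for three tissues (not real data).
print(classify_dmr({"embryo": 1.8, "endosperm": 2.0, "leaf": 1.9}))
print(classify_dmr({"embryo": 0.1, "endosperm": 1.7, "leaf": 0.2}))
```

Under this scheme, the first toy region would count as a genotype difference present in all tissues, while the second would count as tissue-specific.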
Wrap Up
• Epigenetics
– Epigenetic variation exists in maize
– Examples of gene expression states correlated with epigenetic state can be identified
– Few tissue-specific methylation variants
• Utility of iPlant
– Fast & remote location for storing large amounts of data
– Fluid analysis of sequence data to develop transcript alignment and quantification
• Springer Lab
– Amanda Waters
– Ruth Swanson-Wagner
– Peter Hermanson
– Nathan Springer
• iPlant & TACC
– Matthew Vaughn
• NSF
Thanks!