Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Post on 21-Dec-2015


Page 1: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Page 2: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

DNA methylation

Linaria vulgaris flowers (Cubas et al., 1999)

Tomato ripening mutant (Manning et al., 2006)

Genome defense against mobile elements

Regulation of gene activity

Page 3: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Plant DNA methylation

Symmetric cytosine methylation: mCG, mCNG

Asymmetric cytosine methylation: mCNN
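The three contexts named on this slide can be told apart by looking at the two bases that follow a cytosine. The sketch below is illustrative only: the function name `cytosine_context` and the "unknown" return value for cytosines too close to the 3' end are assumptions, not part of the slides.

```python
def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index i of a 5'->3' sequence as CG, CNG or CNN."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    following = seq[i + 1 : i + 3]  # up to two bases downstream
    if len(following) >= 1 and following[0] == "G":
        return "CG"       # symmetric
    if len(following) == 2 and following[1] == "G":
        return "CNG"      # symmetric
    if len(following) == 2:
        return "CNN"      # asymmetric
    return "unknown"      # not enough downstream sequence to decide


# In a CCGG site the outer cytosine is in CNG context and the inner one in CG context.
outer = cytosine_context("CCGG", 0)
inner = cytosine_context("CCGG", 1)
```

Note that the inner cytosine of CCGG is the CG-context one, which is the position the HpaII/MspI comparison on the later slides is sensitive to.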

Page 4: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

What we want to know

Extent of CG methylation and methylation polymorphism among natural accessions

Inheritance of methylation polymorphisms

Any effect of methylation on gene expression

Page 5: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Enzyme methylome approach

Cut site:
5’-C CGG-
3’-GGC C-

Site (top strand / bottom strand)   HpaII cutting   MspI cutting
5’-CCGG- / 3’-GGCC-                 Y               Y
5’-CmCGG- / 3’-GGCmC-               N               Y
5’-mCmCGG- / 3’-GGCmCm-             N               N
5’-mCCGG- / 3’-GGCCm-               Rare in plant
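The cutting table can be read in reverse: given whether each enzyme cut a CCGG site, which methylation state is implied? A minimal sketch, assuming the three Y/N columns belong to the first three sites in the order shown; the names `CUT_TABLE` and `infer_site_state` are hypothetical.

```python
# Cutting behaviour as listed on the slide (True = cuts, False = blocked).
# The fourth state (outer cytosine only) is given as "rare in plant" and is left out.
CUT_TABLE = {
    "CCGG": {"HpaII": True, "MspI": True},     # unmethylated
    "CmCGG": {"HpaII": False, "MspI": True},   # inner cytosine methylated
    "mCmCGG": {"HpaII": False, "MspI": False}, # both cytosines methylated
}


def infer_site_state(hpaii_cuts: bool, mspi_cuts: bool):
    """Return the site state consistent with the observed cutting, or None."""
    observed = {"HpaII": hpaii_cuts, "MspI": mspi_cuts}
    for site, pattern in CUT_TABLE.items():
        if pattern == observed:
            return site
    return None  # pattern not covered by the table


state = infer_site_state(hpaii_cuts=False, mspi_cuts=True)
```

A site that HpaII leaves intact but MspI cuts is the informative case for CG methylation, which is why the later slides contrast HpaII and MspI intensities.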

Page 6: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

CG-methylation and expression profiling

Genotypes: Col ♀ x Col ♂, Van ♀ x Van ♂, Col ♀ x Van ♂, Van ♀ x Col ♂

Methylation profiling: 300 ng genomic DNA; digest with either MspI or HpaII; label with biotin random primers; hybridize to AtTILE1F

Expression profiling: mRNA from 20 µg total RNA; double-stranded cDNA synthesis; label with biotin random primers; hybridize to AtTILE1F

Page 7: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Constitutive CG methylation

[Figure. A) HpaII digestion and MspI digestion, each followed by random labeling. B) Intensity of HpaII and MspI samples in Col and Van.]

Page 8: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Methylation polymorphisms

[Figure. A) HpaII digestion and MspI digestion of the Col genotype and the Van genotype. B) Intensity of HpaII and MspI samples in Col and Van.]

Page 9: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Sequence polymorphisms

[Figure. A) Col genotype and Van genotype. B) Intensity of HpaII and MspI samples in Col and Van.]

Page 10: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Simultaneous genetic and epigenetic profiling

# of unique probes: 1,683,620

# of CCGG-containing probes: 54,519

model:

Intensity ~ genotype + enzyme + genotype x enzyme
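For one probe measured in two genotypes and two enzyme treatments, the three terms of this model can be estimated directly from the four cell means. The sketch below assumes a balanced design, log-scale intensities, and a sign convention (Col minus Van, HpaII minus MspI) chosen for illustration; the slides do not specify the fitting software or coding, and no significance test is included.

```python
from statistics import mean


def factorial_effects(log_intensity):
    """Estimate genotype, enzyme and interaction effects for one probe.

    log_intensity maps (genotype, enzyme) -> list of replicate log intensities,
    with genotype in {"Col", "Van"} and enzyme in {"HpaII", "MspI"}.
    """
    m = {cell: mean(values) for cell, values in log_intensity.items()}
    col_h, col_m = m[("Col", "HpaII")], m[("Col", "MspI")]
    van_h, van_m = m[("Van", "HpaII")], m[("Van", "MspI")]
    return {
        # Col minus Van, averaged over enzymes
        "genotype": (col_h + col_m) / 2 - (van_h + van_m) / 2,
        # HpaII minus MspI, averaged over genotypes
        "enzyme": (col_h + van_h) / 2 - (col_m + van_m) / 2,
        # difference of the enzyme differences between genotypes
        "genotype_x_enzyme": (col_h - col_m) - (van_h - van_m),
    }


# Made-up replicate values for a single probe, for illustration only.
example = {
    ("Col", "HpaII"): [10.0, 10.0],
    ("Col", "MspI"): [8.0, 8.0],
    ("Van", "HpaII"): [8.0, 8.0],
    ("Van", "MspI"): [8.0, 8.0],
}
effects = factorial_effects(example)
```

In this toy probe only Col keeps a high HpaII signal, so the interaction term is the largest of the three, the pattern expected for a genotype-specific methylation difference.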

Page 11: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Summary of sequence polymorphisms

FDR      Called (a)   False (b)   Sig- (c)   Sig+ (c)
13.05%   211220       29007       58628      152592
6.22%    173611       11363       33227      140384
2.74%    153401       4431        23326      130075
1.16%    138552       1698        17742      120810
0.51%    126499       678         14131      112368
0.22%    116122       272         11448      104674
0.09%    106817       104         9347       97470

a Called: significant features
b False: false positives based on permutation
c Sig-: Van greater signal; Sig+: Col greater signal

Page 12: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Genome distribution of SFPs

Page 13: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Genic distribution of SFPs

              CD (c)   Intron   UTRs     Promoter (d)   Downstream (e)   Intergenic   Total
SFP (a)       23180    19806    5130     30190          32158            50539        161003
Feature (b)   526407   301947   105260   429585         452681           593757       2409637
Percentage    4.40%    6.56%    4.87%    7.03%          7.10%            8.51%        6.68%

a The number of SFPs within each annotation category.
b The number of features within each annotation category.
c Coding sequences.
d The sequences from transcriptional start to upstream 1 kb.
e The sequences from transcriptional stop to downstream 1 kb.
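The percentage row is the SFP count divided by the feature count in each category. A short check using the counts from the table (the dictionary keys are shortened labels chosen for this sketch):

```python
# Counts copied from the table; keys are shortened category labels.
sfp = {"CD": 23180, "Intron": 19806, "UTRs": 5130,
       "Promoter": 30190, "Downstream": 32158, "Intergenic": 50539}
features = {"CD": 526407, "Intron": 301947, "UTRs": 105260,
            "Promoter": 429585, "Downstream": 452681, "Intergenic": 593757}


def sfp_percentage(category: str) -> float:
    """Percentage of features in a category that were called as SFPs."""
    return 100.0 * sfp[category] / features[category]


total_percentage = 100.0 * sum(sfp.values()) / sum(features.values())

for category in sfp:
    print(f"{category}: {sfp_percentage(category):.2f}%")
print(f"Total: {total_percentage:.2f}%")
```

Coding sequences and UTRs show the lowest SFP density and intergenic features the highest, as the table's percentage row indicates.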

Page 14: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Methylation polymorphisms are extensive

Enzyme effect
p-value   HpaII > MspI (a)
<0.01     2373
<0.05     4522
<0.1      6324

Genotype x enzyme interaction
p-value   Col-specific (b)   Van-specific (c)
<0.01     1062               407
<0.03     2389               944
<0.05     3700               1515

                       Enzyme       Genotype x enzyme
Gene (d)               3628 (20%)   3498 (20%)
Total gene (e)         17760        17760
Promoter (f)           305 (6%)     455 (9%)
Total promoter (g)     5041         5041
Intergenic (h)         1298 (16%)   782 (9%)
Total intergenic (i)   8264         8264

a Features of constitutive CG methylation
b, c Features of Col- or Van-specific methylation
d, f cDNAs or promoters with feature(s) of enzyme effect (p < 0.1) or genotype × enzyme interaction (p < 0.05)
e, g cDNAs or promoters containing CCGG feature(s)
h Intergenic features (excluding cDNAs or promoters) of enzyme effect (p < 0.1) or genotype × enzyme interaction (p < 0.05)
i Intergenic (excluding cDNAs or promoters) CCGG-containing features

Page 15: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Verification of methylation polymorphisms

Page 16: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Verification of methylation polymorphisms

Page 17: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Genome distribution of constitutive and polymorphic methylation sites

[Figure: position axis in bp.]

Page 18: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Co-methylation of pericentromere regions

[Figure: position axis in bp.]

Page 19: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Genic distribution of constitutive and polymorphic methylation sites

Page 20: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Correlation between gene size and constitutive CG methylation

Page 21: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

chromomethylase 2 (CMT2) exon 19, CC*GG site

[Figure: log intensity (axis 0 to 6) of HpaII and MspI samples for Col, Van, Col♂ x Van♀ and Van♂ x Col♀; epiTyper panel with Col, Van, Col♂ x Van♀ and Van♂ x Col♀ samples.]

Page 22: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Inheritance of CG methylation polymorphism

Full model:

Intensity ~ genotype + enzyme + genotype x enzyme

Genotype:

Additive (between parents)

Dominant (between F1 and mid-parent)

Maternal (between reciprocal F1s)
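The three genotype terms can be written as contrasts among the four genotype means (two parents and two reciprocal F1 hybrids). The sketch below is one way to express them; the function name, argument names and sign conventions (Col minus Van, F1 minus mid-parent, Col-mother F1 minus Van-mother F1) are assumptions made for illustration.

```python
def genotype_contrasts(col, van, f1_col_mother, f1_van_mother):
    """Additive, dominant and maternal contrasts from four genotype means."""
    mid_parent = (col + van) / 2
    f1_mean = (f1_col_mother + f1_van_mother) / 2
    return {
        "additive": col - van,                       # between parents
        "dominant": f1_mean - mid_parent,            # between F1 and mid-parent
        "maternal": f1_col_mother - f1_van_mother,   # between reciprocal F1s
    }


# Made-up log intensities for one probe, for illustration only.
contrasts = genotype_contrasts(col=10.0, van=8.0,
                               f1_col_mother=9.5, f1_van_mother=9.0)
```

With these toy values the parents differ by 2, the F1 average sits 0.25 above the mid-parent, and the reciprocal hybrids differ by 0.5.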

Page 23: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Additive effect

The additive effect describes the intensity difference between parent strains across enzyme treatments.

Additive effect +: SFP; Col has greater signal than Van.

Additive effect -: Van duplication or deletion in Col; Van has greater signal than Col.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 24: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Dominant effect

The dominant effect describes the intensity difference between the mid-parent (average of parents; dashed line) and the average of F1 hybrids across enzyme treatments.

Dominant effect +: increased F1 hybridization compared with that expected from the mid-parent.

Dominant effect -: reduced F1 hybridization compared with that expected from the mid-parent.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 25: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Maternal effect

The maternal effect describes the intensity difference between reciprocal F1 hybrids across enzyme treatments.

Maternal effect +: random variation; Col-mother F1 with greater signal than Van-mother F1.

Maternal effect -: random variation; Van-mother F1 with greater signal than Col-mother F1.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 26: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Enzyme effect

The enzyme effect describes the intensity difference between HpaII and MspI enzyme treatments across genotypes.

Enzyme effect +: constitutive CG methylation; HpaII samples have greater signal.

Enzyme effect -: normalization and/or preferential labeling of short fragments; MspI samples have greater signal.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 27: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Additive x enzyme interaction

The additive x enzyme effect describes differential enzyme sensitivity between parent strains.

Additive x enzyme effect +: Col-specific methylation.

Additive x enzyme effect -: Van-specific methylation.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 28: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Dominant x enzyme interaction

The dominant x enzyme effect describes differential enzyme sensitivity between the mid-parent (average of parents; dashed line) and the average of F1 hybrids.

Panels: Dominant x enzyme effect + and Dominant x enzyme effect -, labeled Col-dominant methylation and Van-dominant methylation.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]

Page 29: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Maternal x enzyme interaction

The maternal x enzyme effect describes differential enzyme sensitivity between reciprocal F1 hybrids.

Maternal x enzyme effect +: Col-mother hybrid specific methylation.

Maternal x enzyme effect -: Van-mother hybrid specific methylation.

[Figure: log intensity of Col, Van, F1c and F1v under HpaII and MspI, one panel per sign.]
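The three genotype x enzyme interactions apply the additive, dominant and maternal comparisons to each genotype's enzyme difference (HpaII minus MspI). A minimal sketch, assuming F1c is the Col-mother hybrid and F1v the Van-mother hybrid and using sign conventions chosen for illustration:

```python
def interaction_contrasts(means):
    """Genotype x enzyme contrasts for one probe.

    means maps genotype -> (HpaII mean, MspI mean) on the log scale, for the
    genotypes "Col", "Van", "F1c" (assumed Col-mother) and "F1v" (assumed Van-mother).
    """
    # Enzyme difference within each genotype; larger values mean more signal
    # retained after HpaII relative to MspI.
    d = {genotype: hpaii - mspi for genotype, (hpaii, mspi) in means.items()}
    return {
        "additive_x_enzyme": d["Col"] - d["Van"],
        "dominant_x_enzyme": (d["F1c"] + d["F1v"]) / 2 - (d["Col"] + d["Van"]) / 2,
        "maternal_x_enzyme": d["F1c"] - d["F1v"],
    }


# Made-up log intensities for one probe, for illustration only.
example_means = {
    "Col": (10.0, 8.0),
    "Van": (8.0, 8.0),
    "F1c": (9.5, 8.0),
    "F1v": (8.5, 8.0),
}
interactions = interaction_contrasts(example_means)
```

In the toy probe the HpaII excess is present in Col but not Van, the F1 average matches the mid-parent, and the Col-mother hybrid retains more HpaII signal than the Van-mother hybrid.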

Page 30: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Significance of main effects

[Figure panels: additive, dominant, maternal, enzyme.]

Page 31: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Significance of genotype x enzyme effects

[Figure panels: additive x enzyme, dominant x enzyme, maternal x enzyme.]

Page 32: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Correlation of constitutive CG methylation and absolute gene expression

Page 33: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Correlation of polymorphic CG methylation and gene expression variation

Page 34: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Gene set enrichment in genic CG methylation polymorphisms

Columns: GO term, description, p-value

addenz (additive x enzyme), Col > Van
GO:0006457  protein folding  7.84E-05
GO:0009909  regulation of flower development  5.05E-03
GO:0007018  microtubule-based movement  8.56E-03
GO:0006511  ubiquitin-dependent protein catabolic process  1.27E-02
GO:0007275  multicellular organismal development  1.50E-02
GO:0042254  ribosome biogenesis and assembly  2.03E-02
GO:0019538  protein metabolic process  2.16E-02
GO:0006470  protein amino acid dephosphorylation  2.73E-02
GO:0009567  double fertilization forming a zygote and endosperm  2.98E-02
GO:0045454  cell redox homeostasis  3.39E-02
GO:0007568  aging  4.67E-02

addenz (additive x enzyme), Van > Col
GO:0007242  intracellular signaling cascade  1.72E-03
GO:0015979  photosynthesis  2.76E-03
GO:0006952  defense response  5.82E-03
GO:0030001  metal ion transport  1.24E-02
GO:0009809  lignin biosynthetic process  2.49E-02
GO:0006813  potassium ion transport  2.91E-02
GO:0009739  response to gibberellin stimulus  4.85E-02

domenz (dominant x enzyme), F1 hybrids > parents
GO:0009965  leaf morphogenesis  2.13E-04
GO:0009225  nucleotide-sugar metabolic process  4.21E-04
GO:0006869  lipid transport  2.96E-03
GO:0010119  regulation of stomatal movement  8.79E-03
GO:0000271  polysaccharide biosynthetic process  9.77E-03
GO:0015995  chlorophyll biosynthetic process  2.03E-02
GO:0048364  root development  2.11E-02
GO:0009408  response to heat  2.33E-02
GO:0009908  flower development  4.08E-02
GO:0015979  photosynthesis  4.11E-02
GO:0045454  cell redox homeostasis  4.13E-02
GO:0019575  sucrose catabolic process using beta-fructofuranosidase  4.41E-02
GO:0009887  organ morphogenesis  4.49E-02

domenz (dominant x enzyme), parents > F1 hybrids
GO:0042254  ribosome biogenesis and assembly  6.60E-03
GO:0009617  response to bacterium  1.36E-02
GO:0009744  response to sucrose stimulus  2.31E-02
GO:0016192  vesicle-mediated transport  2.59E-02
GO:0000074  regulation of progression through cell cycle  2.60E-02
GO:0045449  regulation of transcription  3.60E-02
GO:0006810  transport  4.08E-02

matenz (maternal x enzyme), Col-mother F1 > Van-mother F1
GO:0015979  photosynthesis  1.17E-03
GO:0015995  chlorophyll biosynthetic process  1.22E-03
GO:0009408  response to heat  1.76E-02
GO:0009416  response to light stimulus  2.87E-02
GO:0006520  amino acid metabolic process  3.38E-02
GO:0042742  defense response to bacterium  3.50E-02
GO:0006397  mRNA processing  4.44E-02

matenz (maternal x enzyme), Van-mother F1 > Col-mother F1
GO:0015986  ATP synthesis coupled proton transport  1.09E-02
GO:0006470  protein amino acid dephosphorylation  1.11E-02
GO:0009407  toxin catabolic process  1.14E-02
GO:0006944  membrane fusion  2.60E-02
GO:0009909  regulation of flower development  2.92E-02
GO:0009873  ethylene mediated signaling pathway  4.07E-02
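The slides do not say which test produced these p-values. One common choice for gene set enrichment is a one-sided hypergeometric test, sketched below purely as an illustration; the function name and all numbers in the example are hypothetical and do not come from this study.

```python
from math import comb


def hypergeom_enrichment_p(k: int, population: int, annotated: int, selected: int) -> float:
    """P(overlap >= k) when `selected` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, selected)


# Toy numbers only: 10 genes in total, 5 annotated, 5 selected, all 5 selected are annotated.
p_value = hypergeom_enrichment_p(k=5, population=10, annotated=5, selected=5)
```

Whatever test was actually used, the inputs are the same four counts: genes tested, genes annotated with the term, genes in the list of interest, and their overlap.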

Page 35: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis thaliana using whole genome tiling array

Maternal methylome could be important for reciprocal F1 gene expression

Columns: GO term, description, p-value (* marks terms that appear in both lists)

Col methylation > Van methylation, BP (biological process)
GO:0006457  protein folding*  7.84E-05
GO:0009909  regulation of flower development  5.05E-03
GO:0007018  microtubule-based movement*  8.56E-03
GO:0006511  ubiquitin-dependent protein catabolic process  1.27E-02
GO:0007275  multicellular organismal development  1.50E-02
GO:0042254  ribosome biogenesis and assembly*  2.03E-02

Col-mother F1 expression > Van-mother F1 expression, BP (biological process)
GO:0006412  translation  2.13E-32
GO:0006457  protein folding*  2.09E-30
GO:0042254  ribosome biogenesis and assembly*  2.82E-15
GO:0007018  microtubule-based movement*  1.14E-11
GO:0006334  nucleosome assembly  1.88E-09
GO:0009408  response to heat  4.49E-09

Col methylation > Van methylation, MF (molecular function)
GO:0031072  heat shock protein binding*  1.67E-03
GO:0003777  microtubule motor activity*  7.56E-03
GO:0051082  unfolded protein binding*  1.27E-02
GO:0015035  protein disulfide oxidoreductase activity  1.90E-02
GO:0005528  FK506 binding*  2.59E-02
GO:0003755  peptidyl-prolyl cis-trans isomerase activity*  3.19E-02

Col-mother F1 expression > Van-mother F1 expression, MF (molecular function)
GO:0003735  structural constituent of ribosome  6.21E-32
GO:0003777  microtubule motor activity*  2.75E-13
GO:0003723  RNA binding  1.34E-12
GO:0051082  unfolded protein binding*  1.44E-12
GO:0003755  peptidyl-prolyl cis-trans isomerase activity*  6.31E-10
GO:0005525  GTP binding  1.63E-08
GO:0031072  heat shock protein binding*  1.99E-08
GO:0005528  FK506 binding*  3.02E-06
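The starred terms are those present in both the methylation list and the expression list. The overlap can be recomputed from the GO identifiers on this slide with a set intersection; the variable names below are chosen for this sketch.

```python
# GO identifiers copied from the slide, split by namespace and by list.
methylation_bp = {"GO:0006457", "GO:0009909", "GO:0007018",
                  "GO:0006511", "GO:0007275", "GO:0042254"}
expression_bp = {"GO:0006412", "GO:0006457", "GO:0042254",
                 "GO:0007018", "GO:0006334", "GO:0009408"}

methylation_mf = {"GO:0031072", "GO:0003777", "GO:0051082",
                  "GO:0015035", "GO:0005528", "GO:0003755"}
expression_mf = {"GO:0003735", "GO:0003777", "GO:0003723", "GO:0051082",
                 "GO:0003755", "GO:0005525", "GO:0031072", "GO:0005528"}

# Terms enriched in both the methylation and the expression comparison.
shared_bp = methylation_bp & expression_bp
shared_mf = methylation_mf & expression_mf

print(sorted(shared_bp))
print(sorted(shared_mf))
```

Three biological-process terms and five molecular-function terms are shared, matching the asterisks on the slide.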
