Tiling arrays for genetic, epigenetic, and environmental variation in Arabidopsis thaliana Justin Borevitz Ecology & Evolution University of Chicago http://naturalvariation.org/

Post on 15-Jan-2016


Page 1:

Tiling arrays for genetic, epigenetic, and environmental variation in Arabidopsis thaliana

Justin Borevitz
Ecology & Evolution
University of Chicago
http://naturalvariation.org/

Page 2:

Widely Distributed

http://www.inra.fr/qtlat/NaturalVar/NewCollection.htm

Olivier Loudet

Page 3:

Local Population Variation

Scott Hodges, Ivan Baxter

Page 4:

Seasonal Variation

Matt Horton

Megan Dunning

Page 5:

Seasons in the Growth Chamber

• Changing Day length
• Cycle Light Intensity
• Cycle Light Colors
• Cycle Temperature

[Figure: three panels plotted by month (Sep to Aug) for Sweden, Spain, and standard chamber settings. Day Length (hours, 0:00 to 22:00); Light Intensity (W/m2, 0 to 1400); Temperature (degrees C, -10 to 35; series: Spain High, Spain Low, Sweden High, Sweden Low, standard).]

Developmental Plasticity == Behavior

Page 6:

Which arrays should be used?

cDNA array

Long oligo array

BAC array

Page 7:

Which 25mer arrays should be used?

Gene array

Exon array

Tiling array: 35bp tile, 25mers, 10bp gaps
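The tiling geometry on this slide (25mer probes separated by 10bp gaps, so one probe every 35bp) can be sketched as follows. This is a minimal illustration only: the function name, the 0-based coordinates, and the 100bp example region are assumptions, not taken from the talk.

```python
def tile_probe_starts(region_length, probe_len=25, gap=10):
    """Return 0-based start positions of probes tiled across a region.

    Each probe is probe_len bases long and is followed by an untiled gap,
    so consecutive probe starts are probe_len + gap bases apart.
    """
    step = probe_len + gap  # 35 bp tile for 25mers with 10 bp gaps
    return list(range(0, region_length - probe_len + 1, step))


# Hypothetical 100 bp region
starts = tile_probe_starts(100)
for s in starts:
    print(f"probe covers bases {s}..{s + 25 - 1}")

# Fraction of the region directly covered by a probe
covered_fraction = len(starts) * 25 / 100
```

With these settings roughly 25 of every 35 bases fall inside a probe, which is why short features that sit entirely in a gap can be missed.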

Page 8:

Which 25mer arrays should be used?

Tiling/SNP array

SNP array

Resequencing array

Page 9:

Universal Whole Genome Array

RNA

• Transcriptome Atlas: expression levels, tissue specificity
• Gene/Exon Discovery: gene model correction, non-coding/micro-RNA
• Alternative Splicing
• Antisense transcription
• Allele Specific Expression
• RNA Immunoprecipitation (RIP chip)

DNA

• Comparative Genome Hybridization (CGH): insertions/deletions, copy number polymorphisms
• Methylation
• Chromatin Immunoprecipitation (ChIP chip)
• Polymorphism SFPs: discovery/genotyping

Control for hybridization/genetic polymorphisms to understand true EXPRESSION polymorphisms

Page 10:

Improved Genome Annotation

[Figure: schematic of a chromosome region (axis: Chromosome (bp)) with tracks labeled conservation, SNP, SFP, M, and Transcriptome Atlas, and gene models ORFa (start, AAAAA) and ORFb, plus a deletion.]

Page 11:

Talk Outline

• Whole Genome Tiling Arrays
– Spatial Correction, grid alignment
– Alternative splicing
– Methylation
– Single Feature Polymorphisms (SFPs)
– Genetic Mapping
– Potential deletions/Copy Number Variants
– Allele Specific Expression

• Resequencing/Haplotypes
– Variation Scanning

Page 12:

Tiling Array Re-annotation

• 6.25 million probes

• 3.125 million PM probes

• 1.67 million unique PM probes, 17bp (BLAST)

• 736k PM features in TUs (exon array)

• 130k TUs

• 28k genes

Page 13:

Spatial Correction, grid Alignment

Background correction for RNA, ! For DNA

Page 14:

Transcription subUnits (TUs)

Exon1, Intron1, Exon2

Tu1, Tu2, Tu3

Page 15:

Alternative Splicing

V V V C C C

Van, Col

Xu Zhang

Page 16:

Gene/Tu model for alternative splicing

Page 17:

ChIP chip treatment effect!

Experimental Design

same protocol/antibody

dynamic binding

model treatment effect

Actual biological signal

Page 18:

Potential Deletions

Page 19:

Methods for labeling

• Extract 100ng genomic DNA (single leaf)

• Digest with either MspI or HpaII (both cut at CCGG)

• Label with biotin random primers

• Hybridize to array

• Fit model

Y = μ + E * G + ε
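One way to read the model on this slide is as a per-feature linear model with an intercept, an enzyme effect E, a genotype effect G, their interaction, and a residual. The sketch below fits that reading by ordinary least squares. The 0/1 coding, the function name, and the intensities are made up for illustration; the talk does not say how the model was actually fitted.

```python
import numpy as np


def fit_feature_model(y, enzyme, genotype):
    """Least-squares fit of y = mu + E + G + E:G + error for one array feature.

    enzyme and genotype are 0/1 codes (hypothetical coding, e.g.
    0 = MspI, 1 = HpaII and 0 = Col, 1 = Van).
    """
    y = np.asarray(y, dtype=float)
    e = np.asarray(enzyme, dtype=float)
    g = np.asarray(genotype, dtype=float)
    # Design matrix columns: intercept, enzyme, genotype, interaction
    X = np.column_stack([np.ones_like(y), e, g, e * g])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(["mu", "E", "G", "ExG"], coef))


# Made-up log intensities: 2 enzymes x 2 genotypes x 2 replicates
enzyme = [0, 0, 0, 0, 1, 1, 1, 1]
genotype = [0, 0, 1, 1, 0, 0, 1, 1]
y = [10.0, 10.2, 9.9, 10.1, 10.1, 9.9, 8.0, 8.2]

fit = fit_feature_model(y, enzyme, genotype)
print(fit)
```

In this toy data the signal drops only for the second genotype under the second enzyme, so the interaction term carries the effect while the main effects stay near zero.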

Page 20:

SFP detection on tiling arrays

Delta   p0     FALSE    Called    FDR
1.00    0.95   18865    160145    11.2%
1.25    0.95   10477    132390    7.5%
1.50    0.95   6545     115042    5.4%
1.75    0.95   4484     102385    4.2%
2.00    0.95   3298     92027     3.4%

         Intergenic   Exon      Intron
SFPs     60770        23519     17216
total    685575       665524    301648
%        8.86%        3.53%     5.71%

SFPs/gene   0       >=1     >=2     >=3     >=4     >=5
genes       16322   9146    4304    2495    1687    1121
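The FDR column in the first table is numerically consistent with FDR = p0 × FALSE / Called, and the % row in the second table with SFPs / total. That relationship is inferred from the numbers themselves, not stated on the slide, so treat the short check below as a sketch under that assumption.

```python
def fdr(p0, false_calls, called):
    """Estimated false discovery rate: expected false calls, scaled by the
    assumed proportion of true nulls p0, over the number of features called."""
    return p0 * false_calls / called


# (delta, p0, FALSE, Called) rows copied from the slide
rows = [
    (1.00, 0.95, 18865, 160145),
    (1.25, 0.95, 10477, 132390),
    (1.50, 0.95, 6545, 115042),
    (1.75, 0.95, 4484, 102385),
    (2.00, 0.95, 3298, 92027),
]
fdr_column = [f"{fdr(p0, f, c):.1%}" for _, p0, f, c in rows]

# (SFPs, total features) per annotation class, copied from the slide
classes = {
    "Intergenic": (60770, 685575),
    "Exon": (23519, 665524),
    "Intron": (17216, 301648),
}
sfp_percent = {name: round(100 * s / t, 2) for name, (s, t) in classes.items()}

print(fdr_column)
print(sfp_percent)
```

Raising the Delta threshold trades calls for accuracy: going from 1.00 to 2.00 drops about 68,000 called features while cutting the estimated FDR by roughly a factor of three.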

Page 21:

methylated features and mSFPs

>10,000 of 100,000 at 5% FDR

Enzyme effect, on CCGG features GxE

276 at 15% FDR

mQTL?
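MspI and HpaII both recognize CCGG, but HpaII is blocked when the internal cytosine is methylated while MspI is not, so an enzyme effect is only expected at features that overlap a CCGG site. Reading "CCGG features" as probes whose sequence contains CCGG (an assumption), such features could be flagged as in this sketch. The probe names and sequences are invented.

```python
SITE = "CCGG"  # recognition site shared by MspI and HpaII


def ccgg_positions(probe_seq):
    """Return 0-based offsets of every CCGG occurrence in a probe sequence.

    CCGG is its own reverse complement, so scanning one strand is enough.
    """
    seq = probe_seq.upper()
    return [i for i in range(len(seq) - len(SITE) + 1)
            if seq[i:i + len(SITE)] == SITE]


# Hypothetical 25mer probes
probes = {
    "probe_a": "ATGCCGGTTAGCATCGATCGTAGCA",
    "probe_b": "ATGCATGTTAGCATCGATCGTAGCA",
}

# Keep only probes that contain at least one CCGG site
ccgg_features = {}
for name, seq in probes.items():
    hits = ccgg_positions(seq)
    if hits:
        ccgg_features[name] = hits

print(ccgg_features)
```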

Page 22:

Chip genotyping of a Recombinant Inbred Line

29kb interval

Page 23:

Map bibb

100 bibb mutant plants

100 wt mutant plants

Page 24:

Array Mapping

Hazen et al., Plant Physiology 2005

Page 25:

Potential Deletions (wild lines)

>500 potential deletions

45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance (R) gene clusters

Single R gene deletions

Genes involved in secondary metabolism

Unknown genes

Page 26:

Fast Neutron deletions

FKF1: 80kb deletion, CHR1

cry2: 10kb deletion, CHR1

Het

Page 27:

Natural Variation on Tiling Arrays

Page 28:

Potential Deletions Suggest Candidate Genes

FLOWERING1 QTL

Flowering Time QTL caused by a natural deletion in FLM (Werner et al., PNAS 2005)

[Figure: Chr1 (bp) axis showing the FLM natural deletion.]

Page 29:

Allele specific expression

Page 30:

cis regulatory variation

Col/Col, Col/Van, Van/Col, Van/Van

Van allele expressed

Col allele expressed

Col Female imprint

Page 31:

Allele specific expression between Col and Van

Page 32:

Array Haplotyping

• What about diversity/selection across the genome?

• A genome-wide estimate of population genetics parameters: θw, π, Tajima's D, ρ

• LD decay, haplotype block size

• Deep population structure?

• Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
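For reference, the textbook definitions of three of the parameters listed above (Watterson's θw, nucleotide diversity π, and Tajima's D) can be computed from a 0/1 polymorphism matrix as in the sketch below. It assumes one haplotype per inbred line, with "1" marking a polymorphism relative to the reference, and reports per-region rather than per-bp values. It is the standard statistic, not necessarily the "Tajima's D like" statistic used later in the talk, and the example haplotypes are invented.

```python
from math import sqrt


def diversity_stats(haplotypes):
    """Watterson's theta, pi, and Tajima's D for equal-length '0'/'1' strings.

    Tajima's D needs at least 4 haplotypes and 1 segregating site;
    otherwise it is returned as NaN.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Count of the '1' allele at each site
    counts = [sum(h[j] == "1" for h in haplotypes) for j in range(n_sites)]
    seg = [k for k in counts if 0 < k < n]  # segregating sites only
    S = len(seg)

    # pi: mean number of pairwise differences, summed over sites
    pi = sum(2 * k * (n - k) for k in seg) / (n * (n - 1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = S / a1

    # Variance constants from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * S + e2 * S * (S - 1)

    d = (pi - theta_w) / sqrt(var) if var > 1e-12 else float("nan")
    return {"S": S, "pi": pi, "theta_w": theta_w, "tajima_d": d}


# Invented data: 4 lines, one singleton site and one intermediate-frequency site
stats = diversity_stats(["10", "01", "01", "00"])
print(stats)
```

A window-based scan would apply the same function to the polymorphisms falling in each window along the chromosome.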

Page 33:

Array Haplotyping

Inbred lines

Low effective recombination due to partial selfing

Extensive LD blocks

[Figure: haplotypes of Col, Ler, Cvi, Kas, Bay, Shah, Lz, and Nd along Chromosome 1; scale ~500kb.]

Page 34:

SFPs for reverse genetics

http://naturalvariation.org/sfp

14 Accessions, 30,950 SFPs

Page 35:

Chromosome Wide Diversity

Page 36:

Diversity 50kb windows

Page 37:

Tajima's D like 50kb windows

RPS4 unknown

Page 38:

R genes vs bHLH

[Figure: histogram of a Tajima's D like statistic for R genes and bHLH genes. X-axis: bins from (-1,-0.8] to (0.6,0.8], labeled Selection; y-axis: frequency, 0 to 70.]

Page 39:

NaturalVariation.org

USC

Magnus Nordborg, Paul Marjoram

Max Planck

Detlef Weigel

Scripps

Sam Hazen

University of Michigan

Sebastian Zollner

University of Chicago

Xu Zhang, Evadne Smith, Ken Okamoto

Yan Li

Michigan State

Shinhan Shui

Purdue

Ivan Baxter

Sainsbury Laboratory

Jonathan Jones