Arrays as tools for Natural Variation studies: Mapping, Haplotyping, and gene expression. Justin Borevitz, University of Chicago, naturalvariation.org

Post on 20-Dec-2015


Arrays as tools for Natural Variation studies: Mapping, Haplotyping, and gene expression

Justin Borevitz
University of Chicago
naturalvariation.org

Talk Outline

• Single Feature Polymorphisms (SFPs)
  – Potential deletions
• Bulk Segregant Mapping
  – Extreme Array Mapping
• Haplotyping
  – Selection
• Transcriptional profiling
  – for QTL candidate genes

What is Array Genotyping?

• Affymetrix expression GeneChips contain 202,806 unique 25 bp oligonucleotides.
• 11 features per probe set for 21,546 genes
• New arrays have even more
• Genomic DNA is randomly labeled with biotin; product is ~50 bp.
• 3 independent biological replicates compared to the reference strain Col

[Figure: GeneChip]
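The comparison described above (replicate hybridizations of an accession against replicate hybridizations of Col) can be sketched as a per-feature two-sample t-statistic. This is only an illustration: the talk reports SAM thresholds, and the function name, array shapes, and toy intensities here are assumptions.

```python
import numpy as np

def feature_t_stats(ref, acc):
    """Pooled-variance two-sample t-statistic for every feature.

    ref, acc: arrays of shape (n_features, n_replicates) holding
    log-scale intensities for the reference (Col) and the accession.
    """
    n_ref, n_acc = ref.shape[1], acc.shape[1]
    diff = acc.mean(axis=1) - ref.mean(axis=1)
    pooled_var = ((n_ref - 1) * ref.var(axis=1, ddof=1)
                  + (n_acc - 1) * acc.var(axis=1, ddof=1)) / (n_ref + n_acc - 2)
    return diff / np.sqrt(pooled_var * (1.0 / n_ref + 1.0 / n_acc))

# Toy data (hypothetical): 3 features x 3 replicates per genotype.
col = np.array([[10.0, 10.2, 9.8], [8.0, 8.1, 7.9], [12.0, 11.9, 12.1]])
acc = np.array([[10.1, 9.9, 10.0], [6.0, 6.2, 5.8], [12.0, 12.2, 11.8]])
t = feature_t_stats(col, acc)   # feature 1 hybridizes much more weakly in acc
```

A strongly negative statistic marks a feature that hybridizes worse in the accession than in Col, which is the signature of a candidate SFP or deletion.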


Potential Deletions


Spatial Correction

Spatial Artifacts

Improved reproducibility

Next: Quantile Normalization
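Quantile normalization, named above as the next step, forces every array to share one intensity distribution. A minimal sketch, assuming a features-by-arrays matrix with no tied values within a column (the function name and toy matrix are illustrative only):

```python
import numpy as np

def quantile_normalize(x):
    """Give every array (column) the same intensity distribution.

    x: array of shape (n_features, n_arrays). Ties are broken by order
    of appearance, which is a simplification.
    """
    order = np.argsort(x, axis=0)        # per-column sort order
    ranks = np.argsort(order, axis=0)    # rank of each value within its column
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)
    return mean_quantiles[ranks]

raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.5, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(raw)
```

After this step each column contains exactly the same set of values (the row means of the sorted columns), arranged according to the original ranks.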


False Discovery and Sensitivity

PM only

SAM threshold: 5% FDR

                     Total    SFPs   nonSFPs   Sensitivity
GeneChip                      3806     89118
Sequence               817     121       696
  Polymorphic          340     117       223   34%
  Non-polymorphic      477       4       473

Cereon marker accuracy: 100%
False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold: 18% FDR

                     Total    SFPs   nonSFPs   Sensitivity
GeneChip                     10627     82297
Sequence               817     223       594
  Polymorphic          340     195       145   57%
  Non-polymorphic      477      28       449

Cereon marker accuracy: 100%
False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
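The sensitivity, false discovery rate, and chi-square values on this slide follow from the 2x2 counts of sequence-verified features. The sketch below recomputes them; the function name is hypothetical, and it uses the Pearson statistic without continuity correction, which is the form that matches the reported Chisq values.

```python
def sfp_call_stats(poly_sfp, poly_non, nonpoly_sfp, nonpoly_non):
    """Sensitivity, false discovery rate, and Pearson chi-square
    (no continuity correction) for a 2x2 table of sequenced features:
    rows polymorphic / non-polymorphic, columns SFP / nonSFP."""
    a, b, c, d = poly_sfp, poly_non, nonpoly_sfp, nonpoly_non
    n = a + b + c + d
    sensitivity = a / (a + b)   # polymorphic features that were called SFPs
    fdr = c / (a + c)           # SFP calls with no sequence polymorphism
    chisq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return sensitivity, fdr, chisq

strict = sfp_call_stats(117, 223, 4, 473)     # SAM threshold 5% FDR
relaxed = sfp_call_stats(195, 145, 28, 449)   # SAM threshold 18% FDR
```

Relaxing the threshold raises sensitivity from 34% to 57% while the sequence-verified false discovery rate rises from 3% to 13%.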

3/4 Cvi markers were also confirmed in PHYB

90% 80% 70%

41% 53% 85%

90% 80% 70%

67% 85% 100%

Cereon may be a sequencing error

TIGR match is a match


Chip genotyping of a Recombinant Inbred Line

29kb interval

Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP
Typing: 1 replicate × $500 / 12,000 SFPs = $0.041 per SFP


LIGHT1 NIL


Potential Deletions

>500 potential deletions

45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance(R) gene clusters

Single R gene deletions

Genes involved in Secondary metabolism

Unknown genes


Potential Deletions Suggest Candidate Genes

FLOWERING1 QTL

Chr1 (bp)

Flowering Time QTL caused by a natural deletion in MAF1

MAF1

MAF1 natural deletion


Fast Neutron deletions

FKF1 80kb deletion, CHR1

cry2 10kb deletion, CHR1

Het

Map bibb

100 bibb mutant plants
100 wt mutant plants

bibb mapping

[Figure: ChipMap, AS1]

Bulk segregant mapping using chip hybridization

bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1

BIBB = ASYMMETRIC LEAVES1

Sequenced AS1 coding region from bib-1 … found a g -> a change that would introduce a stop codon in the MYB domain

[Figure: bibb and as1-101 plants; AS1 protein with the MYB domain and the mutations bib-1 W49* and as1-101 Q107*]

AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64cM

stamenstay
Ler
Sarah Liljegren
Mapping confirmed

[Figure: panels titled "stamenstaymut"; allele frequency (-0.5 to 0.5) vs. position (cM) on Chromosomes 1 to 5]

[Figure: panels titled "ein6F2mut"; allele frequency (-0.6 to 0.6) vs. position (cM) on Chromosomes 1 to 5]

ein6 een double mutant
Ramlah Nehring
Mapping confirmed

eXtreme Array Mapping

[Figure: histogram of Kas/Col RILs, red light; x-axis hypocotyl length (mm), 6 to 14; y-axis counts, 0 to 12]

15 tallest RILs pooled vs 15 shortest RILs pooled

eXtreme Array Mapping

Red light QTL RED2 from 100 Kas/Col RILs

Allele frequencies determined by SFP genotyping. Thresholds set by simulations.

15 tallest RILs pooled vs 15 shortest RILs pooled

[Figure: LOD (0 to 16) vs. position (0 to 100 cM) on Chromosome 2; Composite Interval Mapping; RED2 QTL]

RED2 QTL 12cM
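The slide says thresholds were set by simulations but gives no details. One simple way to build such a null is sketched below, under loud assumptions: every RIL is fully homozygous, markers are unlinked to any QTL, pools are drawn independently, and array measurement noise is ignored. All names and parameter values are hypothetical.

```python
import numpy as np

def null_threshold(pool_size=15, n_draws=100_000, quantile=0.999, seed=0):
    """Simulated cutoff for |allele-frequency difference| between two pools.

    Each pooled line carries parent-A alleles with probability 0.5 at a
    marker that is unlinked to the trait (assumption).
    """
    rng = np.random.default_rng(seed)
    tall = rng.binomial(pool_size, 0.5, size=n_draws) / pool_size
    short = rng.binomial(pool_size, 0.5, size=n_draws) / pool_size
    return float(np.quantile(np.abs(tall - short), quantile))

threshold = null_threshold()
```

Markers whose observed difference between the tall and short pools exceeds the cutoff would be candidates for linkage to a QTL. A fuller simulation would sample without replacement from the 100 RILs and account for linkage between neighbouring markers.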

Fine Mapping with Arrays

[Figure: genotype (-1.0 to 1.0) vs. position (0 to 600 kb); panels labeled Chromosome 1 to 5 (cM)]

Single Additive Gene
1000 F2s
Select recombinants by PCR
1Mb region


Barley SFPs gDNA

• 9 arrays, random labeled genomic DNA

• 3 wild type, 3 parent 1, 3 parent 2

• Hope to verify some RNA SFPs

• Pairs plots, correlation matrix

• SFP table

Just better than permutations

delta   ori.data   perm.data   difference   FDR
0.10    2866       2114.2       751.8       0.74
0.15    1870        578.4      1291.6       0.31
0.20    1274        269.3      1004.7       0.21
0.25     991        174.7       816.3       0.18
0.30     816        126.8       689.2       0.16
0.35     660         95.8       564.2       0.15
0.40     554         75.8       478.2       0.14

Increase specific activity with other labeling methods

Perform more replicates
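In the table above, the FDR column equals the number of calls in the permuted data divided by the number of calls in the original data, and the difference column is the excess of real over permuted calls. A short check of that arithmetic (variable and function names are illustrative):

```python
# (delta, calls in original data, mean calls in permuted data), from the table above
sam_rows = [
    (0.10, 2866, 2114.2),
    (0.15, 1870, 578.4),
    (0.20, 1274, 269.3),
    (0.25, 991, 174.7),
    (0.30, 816, 126.8),
    (0.35, 660, 95.8),
    (0.40, 554, 75.8),
]

def permutation_fdr(called, expected_false):
    """Estimated false discovery rate: permuted calls over real calls."""
    return expected_false / called

fdrs = [round(permutation_fdr(called, perm), 2) for _, called, perm in sam_rows]
excess = [round(called - perm, 1) for _, called, perm in sam_rows]
```

Even at the strictest delta shown, roughly one call in seven is expected to be false, which is why the slide suggests better labeling and more replicates.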

• Single Feature Polymorphisms
  – Improve with replicates (easy)
  – Improved statistical models
• Genotyping
  – Precisely define recombination breakpoints
  – Fine mapping
• Potential Deletions
  – Candidate genes / induced mutations
• Bulk segregant Mapping
  – eXtreme Array Mapping, F2s etc.


Array Haplotyping

• What about Diversity/selection across the genome?

• A genome-wide estimate of population genetics parameters: θw, π, Tajima's D, ρ

• LD decay, Haplotype block size

• Deep population structure?

• Col, Lz, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2
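For reference, the standard definitions of θw, π, and Tajima's D can be computed from per-site allele counts as sketched below. This is a textbook per-window calculation, not the talk's pipeline: SFP calls are made relative to Col and are not equivalent to SNPs, so values from array data are only "Tajima's D like". The function name and the toy counts are assumptions.

```python
import math

def diversity_stats(variant_counts, n):
    """Watterson's theta, pi, and Tajima's D for one window (per window, not per bp).

    variant_counts: for each segregating site, how many of the n
    accessions carry the non-reference call (1 .. n-1).
    """
    s = len(variant_counts)
    if s == 0:
        return 0.0, 0.0, float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    # Mean number of pairwise differences between accessions.
    pi = sum(k * (n - k) for k in variant_counts) / (n * (n - 1) / 2)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, tajima_d

# Toy window (hypothetical): 4 accessions, two segregating sites.
theta_w, pi, tajima_d = diversity_stats([1, 2], n=4)
```

An excess of intermediate-frequency variants pushes D positive, while an excess of rare variants pushes it negative.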

[Figure: correlation matrix of arrays; rows and columns labeled by accession replicate]

Pairwise Correlation between and within replicates

Array Haplotyping

Inbred lines

Low effective recombination due to partial selfing

Extensive LD blocks

[Figure: haplotypes of Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd along Chromosome 1, ~500kb]

Distribution of T-stats

[Figure: histogram of the T statistic, bins from (-4,-3.5] to (3,3.5]; y-axis frequency, 0 to 8e+04; null (permutation) vs. actual]

Not Col    Col
NA    NA    duplications

32,427 Calls

208,729

12,250 SFPs

Accession   FDR    Sensitivity   SNP   Total
bay         0.0%   43%           51    563
c24         0.2%   39%           64    580
cvi         0.0%   38%           91    543
est         0.0%   59%           39    548
kas         1.9%   44%           66    577
kendl       3.1%   33%           57    545
ler         0.0%   49%           43    562
lz          0.0%   53%           51    573
mt          0.2%   61%           49    570
nd          0.0%   47%           49    568
shah        0.0%   24%           80    548
sorbo       0.0%   45%           55    526
van         0.2%   29%           92    571
ws2         0.0%   49%           57    514

Sequence confirmation of SFPs


SFPs for reverse genetics

http://naturalvariation.org/sfp

14 Accessions, 30,950 SFPs

Chromosome Wide Diversity

Self Incompatibility-locus

Self Incompatibility-locus

Diversity 50kb windows

Tajima's D like 50kb windows

RPS4 unknown

R genes vs bHLH Theta W

RPS4

R genes vs bHLH Tajima's D

RPS4

R genes vs bHLH

Summary: Haplotyping

• Patterns of variation across accessions
• Natural reverse genetics
  – Polymorphism database
• Increased polymorphism in centromere
• Selection on R genes

Transcription based cloning

• Look for gene expression differences between genotypes
• Identify candidate genes that map to mutation
• Downstream targets that map elsewhere


differences may be due to expression or hybridization

PAG1 down regulated in Cvi

PALE GREEN1 knockout has long hypocotyl in red light

SFPs from RNA

• Barley Affy array, 22801 probe sets
  – Most probe sets have 11 probes
  – Background correction “rma2”
  – Quantile normalization
• 36 arrays total
  – 3 replicates
  – 6 tissues: leaf, crown, root, radicle, gem, col?
  – 2 genotypes: Golden Promise (7,459 ESTs) and Morex (52,695 ESTs)

Look at some plots: raw data

Remove probe effect

Remove Tissue + Genotype effect

Look at some plots: raw data

Remove probe effect

Remove Tissue + Genotype effect
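The slides name the steps (remove the probe effect, then remove the tissue + genotype effect) without showing the model. One plausible reading, offered purely as an assumption, is an additive fit on the log scale for a single probe set, after which a probe with a genotype-dependent residual is an SFP candidate. All function names and the toy data are hypothetical.

```python
import numpy as np

def remove_probe_effect(log_intensity):
    """Subtract each probe's mean across arrays (rows = probes, columns = arrays)."""
    return log_intensity - log_intensity.mean(axis=1, keepdims=True)

def remove_tissue_genotype_effect(values, tissue, genotype):
    """Subtract an additive tissue + genotype fit of the per-array means."""
    def dummies(labels):
        levels = sorted(set(labels))[1:]   # first level serves as the baseline
        return [np.array([lab == lvl for lab in labels], dtype=float)
                for lvl in levels]
    design = np.column_stack([np.ones(len(tissue))]
                             + dummies(tissue) + dummies(genotype))
    array_means = values.mean(axis=0)
    coef, *_ = np.linalg.lstsq(design, array_means, rcond=None)
    return values - design @ coef

# Hypothetical probe set: 4 probes x 4 arrays (2 tissues x 2 genotypes).
tissue = ["leaf", "leaf", "root", "root"]
genotype = ["GP", "Morex", "GP", "Morex"]
probe = np.array([[8.0], [9.0], [7.5], [10.0]])
data = probe + np.array([0.0, 0.0, 1.0, 1.0]) + np.array([0.0, 0.5, 0.0, 0.5])
data[2, [1, 3]] -= 2.0   # probe 2 hybridizes poorly in Morex: a simulated SFP

resid = remove_tissue_genotype_effect(remove_probe_effect(data), tissue, genotype)
sfp_contrast = resid[:, [1, 3]].mean(axis=1) - resid[:, [0, 2]].mean(axis=1)
```

In this toy case the simulated SFP probe has a contrast of -1.5 while the other probes sit at 0.5, so it stands out once expression-level differences shared by the whole probe set are removed.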

SAM False Discovery Rate

delta   ori.data   perm.data   difference   FDR
0.1     13210      1210.34     11999.66     0.091623013
0.2      7903       183.95      7719.05     0.023275971
0.3      5462        49.18      5412.82     0.009004028
0.4      4036        18.31      4017.69     0.004536670
0.5      3024         8.49      3015.51     0.002807540
0.6      2285         3.85      2281.15     0.001684902

Both + and – SFPs since no reference comparison

Need to compare with ESTs

Review

• Single Feature Polymorphisms (SFPs) can be used to identify recombination breakpoints, potential deletions, for eXtreme Array mapping, and haplotyping
• Expression analysis to identify QTL candidate genes and downstream responses that consider polymorphisms

Universal Whole Genome Array

RNA / DNA

• Transcriptome Atlas: expression levels, tissue specificity
• Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription
• Alternative Splicing
• Comparative Genome Hybridization (CGH): insertions/deletions
• Methylation
• Chromatin Immunoprecipitation: ChIP chip
• Polymorphism SFPs: discovery/genotyping

~19 bp tile, eliminate repeat regions, both strands, “good” binding oligos

Improved Genome Annotation

[Figure: tracks along Chromosome (bp): conservation, SNP, SFP, Transcriptome Atlas, ORFa (start, AAAAA), ORFb, deletion]


ChipViewer: Mapping of transcriptional units of ORFeome

From 2000v At1g09750 (MIPS) to the latest AGI At1g09750

2000 v Annotation (MIPS)

The latest AGI Annotation

NaturalVariation.org

Syngenta: Hur-Song Chang, Tong Zhu

University of Guelph, Canada: Dave Wolyn

Salk: Jon Werner, Todd Mockler, Sarah Liljegren, Ramlah Nehring, Joanne Chory, Detlef Weigel, Joseph Ecker

UC Davis: Julin Maloof

UC San Diego: Charles Berry

Scripps: Sam Hazen, Elizabeth Winzeler