Association mapping with high density marker panels

Post on 02-Feb-2016

Jeffrey Barrett

Outline

Linkage disequilibrium and recombination

HapMap

‘Tag’ SNPs

Basic association

Practical

Linkage disequilibrium

Indirect association

Measuring LD

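Expected haplotype frequencies under independence (locus 1 in columns, locus 2 in rows):

          p          1-p
q         pq         (1-p)q
1-q       p(1-q)     (1-p)(1-q)

Observed haplotype frequencies:

          p          1-p
q         π11        π12
1-q       π21        π22

$D = \pi_{11} - pq$

$r^2 = D^2 / [p(1-p)q(1-q)]$

$D' = D / D_{\max}$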
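The three measures can be computed directly from the allele frequencies and one observed haplotype frequency. The following is a minimal sketch, not taken from the slides: the function name `ld_measures` and the example frequencies are hypothetical, and the expression used for D_max (the largest |D| compatible with the allele frequencies, which depends on the sign of D) is the standard definition rather than something the slide spells out. Both loci are assumed to be polymorphic, so no denominator is zero.

```python
def ld_measures(p, q, pi11):
    """Return (D, r2, D_prime) for two biallelic loci.

    p    -- frequency of the first allele at locus 1
    q    -- frequency of the first allele at locus 2
    pi11 -- observed frequency of the haplotype carrying both first alleles
    """
    d = pi11 - p * q
    r2 = d * d / (p * (1 - p) * q * (1 - q))
    # D_max is the largest |D| the allele frequencies allow; its form
    # depends on the sign of D (assumed standard definition).
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, r2, d_prime


# Hypothetical frequencies: the rarer allele at locus 2 occurs only on
# haplotypes that carry the first allele at locus 1.
d, r2, d_prime = ld_measures(p=0.6, q=0.3, pi11=0.3)
print(f"D = {d:.3f}, r^2 = {r2:.3f}, D' = {d_prime:.3f}")
```

In this example D' reaches 1 because one of the four haplotypes is absent, while r^2 stays near 0.29 because the allele frequencies at the two loci differ.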

Theoretical and empirical LD

Reich et al. Nature (2001)

LD analysis with Haploview

Genotypes vs haplotypes

Genotypes: AA CT CC GA

Haplotypes:

ACCG / ATCA

ACCA / ATCG

ATCG / ACCA

ATCA / ACCG

2^n possible ordered reconstructions, where n = number of heterozygous sites (2^(n-1) distinct haplotype pairs if the order within a pair is ignored)
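The enumeration above can be reproduced with a short script. This is an illustrative sketch only; the function name `phase_reconstructions` is hypothetical, and genotypes are assumed to be given as two-letter strings as on the slide.

```python
from itertools import product


def phase_reconstructions(genotypes):
    """List every ordered haplotype pair consistent with unphased genotypes."""
    options = []
    for a, b in genotypes:
        if a == b:
            # Homozygous site: only one way to split the two alleles.
            options.append([(a, b)])
        else:
            # Heterozygous site: either allele may sit on the first haplotype.
            options.append([(a, b), (b, a)])
    results = []
    for combo in product(*options):
        hap1 = "".join(x for x, _ in combo)
        hap2 = "".join(y for _, y in combo)
        results.append((hap1, hap2))
    return results


reconstructions = phase_reconstructions(["AA", "CT", "CC", "GA"])
for hap1, hap2 in reconstructions:
    print(hap1, "/", hap2)

# Ignoring the order within each pair leaves half as many reconstructions.
distinct_pairs = {frozenset(pair) for pair in reconstructions}
print(len(reconstructions), "ordered;", len(distinct_pairs), "distinct pairs")
```

With two heterozygous sites the script lists four ordered reconstructions, which collapse to two distinct haplotype pairs.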

Limited haplotype diversity

Daly et al, Nat Genet (2001)

Visualizing empirical LD

Haplotype blocks

D' and r^2

D' in 100kb

D' in common SNPs, 100kb

r^2 in 100kb

HapMap

HapMap samples

90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI)

90 individuals (30 trios) of European descent from Utah (CEU)

45 Han Chinese individuals from Beijing (CHB)

45 Japanese individuals from Tokyo (JPT)

Why multiple populations?

HapMap SNPs

PHASE I: 1,000,000 successful SNPs across the genome

PHASE II: 5,000,000 additional SNPs attempted

~4,000,000 total polymorphic SNPs genomewide

Panel      % r^2 > 0.8   max r^2
YRI        81            0.90
CEU        94            0.97
CHB+JPT    94            0.97

Enabling association studies: dbSNP

International HapMap Project. Nature (2005).

Tagging

Reference panel: HapMap data

Tags: SNPs chosen for genotyping with the aim of capturing as much information as possible

Tests: statistical tests for association to disease

Pairwise tagging

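Tags: SNP 1, SNP 3, SNP 6 (3 in total)

Test for association: SNP 1, SNP 3, SNP 6

[Figure: four haplotypes at six SNPs; SNPs 1 and 2, SNPs 3 and 5, and SNPs 4 and 6 are in high r^2]

SNP   Alleles   Haplotypes
1     A/T       A A T T
2     G/A       G G A A
3     G/C       G C C G
4     T/C       T C C C
5     G/C       G C C G
6     A/C       A C C C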

Carlson et al. (2004) AJHG 74:106
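Pairwise tagging of this kind can be sketched as a greedy selection over pairwise r^2 values. The code below is an illustration under stated assumptions, not the published algorithm or Haploview's implementation: the four haplotype strings are read off the slide's figure and are an assumption, the 0.8 threshold is a common choice rather than one stated on this slide, the function names are hypothetical, and every SNP is assumed to be polymorphic.

```python
# Assumed haplotypes (one string per haplotype, one character per SNP 1-6).
HAPLOTYPES = ["AGGTGA", "AGCCCC", "TACCCC", "TAGCGC"]


def r_squared(x, y):
    """r^2 between two 0/1 indicator vectors over the same haplotypes."""
    n = len(x)
    p = sum(x) / n
    q = sum(y) / n
    pi11 = sum(a * b for a, b in zip(x, y)) / n
    d = pi11 - p * q
    return d * d / (p * (1 - p) * q * (1 - q))


def snp_indicator(haplotypes, index):
    """Code one SNP as 1 where a haplotype carries the first haplotype's allele."""
    ref = haplotypes[0][index]
    return [1 if h[index] == ref else 0 for h in haplotypes]


def greedy_pairwise_tags(haplotypes, threshold=0.8):
    """Pick tags one at a time, each capturing the most still-untagged SNPs."""
    n_snps = len(haplotypes[0])
    cols = [snp_indicator(haplotypes, i) for i in range(n_snps)]
    captures = {
        i: {j for j in range(n_snps) if r_squared(cols[i], cols[j]) >= threshold}
        for i in range(n_snps)
    }
    untagged = set(range(n_snps))
    tags = []
    while untagged:
        # Ties are broken in favour of the lowest-numbered SNP.
        best = max(sorted(untagged), key=lambda i: len(captures[i] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags, captures


tags, captures = greedy_pairwise_tags(HAPLOTYPES)
for t in tags:
    print("SNP", t + 1, "captures", sorted(j + 1 for j in captures[t]))
```

On these haplotypes the sketch selects three tags. It picks SNP 4 where the slide lists SNP 6; the two are perfectly correlated, so either choice captures the same information.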

Testing tags for association

Genotype tags in cases and controls

Each tag is tested for association

How can we better use this information?

Tags: SNP 1, SNP 3, SNP 6 (3 in total)

Test for association: SNP 1, SNP 3, SNP 6

Use of haplotypes can improve genotyping efficiency

Tags: SNP 1, SNP 3 (2 in total)

Test for association:

SNP 1 captures SNP 1+2

SNP 3 captures SNP 3+5

“AG” haplotype captures SNP 4+6

[Figure: haplotypes at six SNPs (1: A/T, 2: G/A, 3: G/C, 4: T/C, 5: G/C, 6: A/C)]

de Bakker et al. (2005) Nat Genet 37:1217
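The multimarker idea can be checked numerically: a haplotype defined by two genotyped tags is treated as a single predictor and its r^2 with an untyped SNP is computed. This is a sketch under assumptions, not the method's reference implementation: the four haplotype strings are read off the slide's figure and are an assumption, and the helper `r_squared` is hypothetical.

```python
# Assumed haplotypes (one string per haplotype, one character per SNP 1-6).
HAPLOTYPES = ["AGGTGA", "AGCCCC", "TACCCC", "TAGCGC"]


def r_squared(x, y):
    """r^2 between two 0/1 indicator vectors over the same haplotypes."""
    n = len(x)
    p = sum(x) / n
    q = sum(y) / n
    pi11 = sum(a * b for a, b in zip(x, y)) / n
    d = pi11 - p * q
    return d * d / (p * (1 - p) * q * (1 - q))


# Single-marker predictors: allele A at SNP 1, allele G at SNP 3.
snp1_a = [1 if h[0] == "A" else 0 for h in HAPLOTYPES]
snp3_g = [1 if h[2] == "G" else 0 for h in HAPLOTYPES]

# Multimarker predictor: the "AG" haplotype across SNP 1 and SNP 3.
ag_haplotype = [1 if h[0] == "A" and h[2] == "G" else 0 for h in HAPLOTYPES]

# Untyped SNPs to be captured: allele T at SNP 4, allele A at SNP 6.
snp4_t = [1 if h[3] == "T" else 0 for h in HAPLOTYPES]
snp6_a = [1 if h[5] == "A" else 0 for h in HAPLOTYPES]

print("SNP 1 vs SNP 4:", round(r_squared(snp1_a, snp4_t), 3))
print("SNP 3 vs SNP 4:", round(r_squared(snp3_g, snp4_t), 3))
print("AG haplotype vs SNP 4:", round(r_squared(ag_haplotype, snp4_t), 3))
print("AG haplotype vs SNP 6:", round(r_squared(ag_haplotype, snp6_a), 3))
```

Neither tag alone predicts SNP 4 or SNP 6 well on these haplotypes, but the two-tag haplotype matches both exactly, which is why SNP 6 no longer needs to be genotyped.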

Efficiency

de Bakker et al. (2005) Nat Genet 37:1217

Transferability among populations

[Figure panels: CEU (Utah residents with European ancestry, CEPH); Whites from Los Angeles, CA; Botnia, Finland]

PIW de Bakker et al.

Genome-wide tagging coverage

Barrett and Cardon, Nat Genet (2006).

Population structure

Marchini, Nat Genet (2004)

Population structure: λ by disease collection

Collection   λ
BD           1.15
CAD          1.08
HT           1.09
CD           1.26
RA           1.06
T1D          1.07
T2D          1.10

Genomic control: λ, the genome-wide inflation of the median test statistic

Crohn’s collection center

Center 3: λ = 1.77

All others: λ = 1.09

Center   No. of samples
1        524
2        271
3        439
4        465
5        301
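The genomic control inflation factor referred to above is usually estimated as the median of the observed 1-df chi-square statistics divided by the median expected under the null. The sketch below assumes that definition; the function names and the example statistics are hypothetical, and flooring the factor at 1 before rescaling is a common convention rather than something stated on the slides.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Inflation factor: observed median statistic over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_genomic_control(chi2_stats):
    """Rescale statistics by lambda (never inflating them if lambda < 1)."""
    lam = max(genomic_control_lambda(chi2_stats), 1.0)
    return [s / lam for s in chi2_stats]


# Hypothetical genome-wide statistics whose median is twice the null median.
stats = [0.1, 0.9098728, 5.0]
lam = genomic_control_lambda(stats)
corrected = apply_genomic_control(stats)
print(f"lambda = {lam:.2f}")
```

A value near 1 suggests little inflation. A value such as the 1.77 reported for center 3 indicates that the test statistics in that subset are systematically too large.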

IBS clustering

Compute IBS between all pairs of individuals, as well as 270 HapMap samples

Create a distance matrix of (1-IBS)

Classical multidimensional scaling generates principal components which capture the largest fraction of variation
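The three steps above can be sketched in plain Python. This is a simplified illustration under assumptions, not the pipeline used for the study: genotypes are assumed to be coded as 0/1/2 allele counts, IBS is taken as the average proportion of alleles shared per SNP, only the first scaling coordinate is extracted, and power iteration stands in for a full eigen-decomposition. All function names and the toy data are hypothetical.

```python
import math


def ibs_distance_matrix(genotypes):
    """Matrix of (1 - IBS) between individuals; genotypes are 0/1/2 counts."""
    n = len(genotypes)
    n_snps = len(genotypes[0])
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Alleles shared at one SNP: 2 minus the genotype difference.
            shared = sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
            dist[i][j] = dist[j][i] = 1.0 - shared / (2.0 * n_snps)
    return dist


def first_mds_coordinate(dist, iterations=200):
    """First classical-MDS coordinate via power iteration."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Double-centre the squared distances to obtain the inner-product matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand) for j in range(n)]
         for i in range(n)]
    v = [float(i) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return [x * math.sqrt(eigenvalue) for x in v]


# Toy data: two individuals from each of two well-separated groups.
genotypes = [[0, 0, 0, 0], [0, 0, 0, 1], [2, 2, 2, 2], [2, 2, 2, 1]]
coords = first_mds_coordinate(ibs_distance_matrix(genotypes))
print([round(c, 3) for c in coords])
```

The sign of the coordinate is arbitrary. What matters is that individuals from the same group fall on the same side, which is how clusters such as the HapMap panels separate in practice.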

Crohn’s PCA

Genotype calling

Calling wrinkles: > 3 clusters

Plate effects

Transition to SSF site

Association: allelic χ^2

      Case   Control
A     70     90
T     30     10

Assumes:

multiplicative

HW equilibrium

$\chi^2 = \sum (O - E)^2 / E$
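The statistic for the allele-count table above can be worked through in a few lines. This is a minimal sketch: the function name `allelic_chi_square` is hypothetical, and the p-value uses the identity that the upper tail of a 1-df chi-square at x equals erfc(sqrt(x / 2)).

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    case_counts, control_counts -- (count of allele A, count of allele T)
    """
    observed = [case_counts, control_counts]
    group_totals = [sum(group) for group in observed]
    allele_totals = [case_counts[k] + control_counts[k] for k in range(2)]
    total = sum(group_totals)
    chi2 = 0.0
    for g in range(2):
        for k in range(2):
            # Expected count if allele frequency were equal in both groups.
            expected = group_totals[g] * allele_totals[k] / total
            chi2 += (observed[g][k] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = allelic_chi_square(case_counts=(70, 30), control_counts=(90, 10))
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

For the counts on the slide every A cell has an expected count of 80 and every T cell 20, giving a chi-square of 12.5 on 1 degree of freedom.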

Haploview practical

www.hapmap.org

1. Find bounding hotspots for CARD15 (>10 cM/Mb)

2. Download file for this window

Haploview practical

1. What fraction of the dataset can be captured with 8 pairwise tags?

2. How much more information can be gained by using multimarker tagging?

Haploview practical

Data in F:\barrett

Is our result experiment-wide significant?
