Page 1

Introduction to linkage analysis

Course “Study Design and Data Analysis for Genetic Studies”, Universidad del Zulia, Maracaibo, Venezuela, 9-10 April 2005

Harald H.H. Göring

Page 2

“Marker” loci

There are many different types of polymorphisms, e.g.:

• single nucleotide polymorphism (SNP):

AAACATAGACCGGTT

AAACATAGCCCGGTT

• microsatellite/variable number of tandem repeat (VNTR):

AAACATAGCACACA----CCGGTT

AAACATAGCACACACACCGGTT

• insertion/deletion (indel):

AAACATAGACCACCGGTT

AAACATAG--------CCGGTT

• restriction fragment length polymorphism (RFLP)

Page 3

Tracing chromosomal inheritance using “marker” locus genotypes

1/2 3/4

1/5 4/5

5/5 5/5

1/4 5/5

Page 4

Tracing chromosomal inheritance (fully informative situation)

Page 5

Linkage analysis: locus with known genotypes

1/2 3/3 2/4 1/1

1/3

2/3 1/3

1/2

Where do the observed genotypes “fit”?

Page 6

Linkage analysis

In linkage analysis, one evaluates statistically whether or not the alleles at 2 loci co-segregate during meiosis more often than expected by chance. If the evidence of increased co-segregation is convincing, one generally concludes that the 2 loci are “linked”, i.e. are located on the same chromosome (“syntenic loci”). The degree of co-segregation provides an estimate of the proximity of the 2 loci, with near complete co-segregation for very tightly linked loci.

Page 7

Let’s step back…

to Mendel

Page 8

One of Mendel’s pea crosses

P1
F1 (Mendel’s law of uniformity)
F2 (Mendel’s law of independent assortment)

observed: 315 108 101 32
~ ratio: 9 : 3 : 3 : 1
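As a rough check on how closely the observed F2 counts follow 9 : 3 : 3 : 1, the expected counts can be computed directly. The sketch below is an illustration only: the Pearson goodness-of-fit statistic is an addition that does not appear on the slide, and the variable names are made up for this example.

```python
# Observed F2 counts from the slide and the 9:3:3:1 ratio expected under
# independent assortment of two loci.
observed = [315, 108, 101, 32]
ratio = [9, 3, 3, 1]

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]

# Pearson goodness-of-fit statistic (added for illustration; the slide only
# shows the counts and the approximate ratio).
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for o, e in zip(observed, expected):
    print(f"observed {o:4d}   expected {e:7.2f}")
print(f"chi-square (3 degrees of freedom) = {chi_square:.3f}")
```

A small statistic relative to 3 degrees of freedom indicates that the counts are close to the 9 : 3 : 3 : 1 expectation.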

Page 9

P1: 1 1 × 2 2
F1: 1 2 × 1 2 (Mendel’s law of uniformity)
F2: 1 1 (25%), 1 2 (50%), 2 2 (25%) (in expectation; Mendel’s law of segregation)

Page 10

P1: a a × b b
F1: a b × a b (Mendel’s law of uniformity)
F2: a a (25%), a b (50%), b b (25%) (in expectation; Mendel’s law of segregation)

Page 11

Mendel’s law of independent assortment

P1: 1 1, a a × 2 2, b b
F1: 1 2, a b × 1 2, a b (Mendel’s law of uniformity)

F2 (in expectation):

locus 1   locus 2   frequency
1 1       a a       6.25%
1 1       a b       12.5%
1 1       b b       6.25%
1 2       a a       12.5%
1 2       a b       25%
1 2       b b       12.5%
2 2       a a       6.25%
2 2       a b       12.5%
2 2       b b       6.25%

Assume we did this experiment and observed the following:

25% 50% 25% (non-independent assortment)
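Under independent assortment, each two-locus F2 genotype frequency is the product of the two single-locus frequencies (25%, 50%, 25%). A minimal sketch of that calculation follows; the dictionary names are illustrative and not taken from the slides.

```python
from itertools import product

# Single-locus F2 genotype frequencies from an F1 x F1 cross (law of segregation).
locus1 = {"1 1": 0.25, "1 2": 0.50, "2 2": 0.25}
locus2 = {"a a": 0.25, "a b": 0.50, "b b": 0.25}

# Under independent assortment, the joint frequency is the product of the marginals.
joint = {(g1, g2): p1 * p2
         for (g1, p1), (g2, p2) in product(locus1.items(), locus2.items())}

for (g1, g2), p in joint.items():
    print(f"{g1}, {g2}: {100 * p:.2f}%")
```

Observing only three classes at 25%, 50% and 25% instead of these nine classes is what signals non-independent assortment.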

Page 12

Co-segregation (due to linkage)

P1 generation (diploid): 1 1, a a × 2 2, b b
gametes (haploid): 1a and 2b
F1 generation (diploid): 1a/2b × 1a/2b (Mendel’s law of uniformity)
gametes (haploid): 1a or 2b from each parent
F2 generation (diploid): 1a/1a (25%), 1a/2b (50%), 2b/2b (25%) (Mendel’s law of segregation)

Page 13

Recombination

Recombination between 2 loci is said to have occurred if an individual received, from one parent, alleles (at these 2 loci) that originated in 2 different grandparents.

Page 14

Who is a recombinant?

Grandparents: 1/1, a/a and 2/2, b/b
Parents: 1 2, a b (phase known: haplotypes 1-a and 2-b) and 3/3, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c
status      N     N     N     R     N     N     N     N     R     N

(N = non-recombinant, R = recombinant)

Page 15

Possible explanations for recombination

Grandparents: 1/1, a/a and 2/2, b/b
Parent: 1 2, a b
Gametes: 1a (N), 1b (R), 2a (R), 2b (N)

I different chromosomes
II homologous recombination during meiosis
III genotyping error

Page 16

Recombination fraction

The recombination fraction between 2 loci is defined as the proportion of meioses resulting in a recombinant gamete. For loci on different chromosomes (or for loci far apart on the same, large chromosome), the recombination fraction is 0.5. Such loci are said to be unlinked. For loci close together on the same chromosome, the recombination fraction is < 0.5. Such loci are said to be linked. The closer the loci, the smaller the recombination fraction (but never below 0).

Page 17

Estimation of recombination fraction

Grandparents: 1/1, a/a and 2/2, b/b
Parents: 1 2, a b (phase known: haplotypes 1-a and 2-b) and 3/3, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c
status      N     N     N     R     N     N     N     N     R     N

$$\hat{\vartheta} = \frac{\#R}{\#R + \#N} = \frac{2}{2 + 8} = \frac{2}{10} = 0.2$$
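The counting behind this estimate can be sketched in a few lines. The example assumes, as in the pedigree on this slide, that the doubly heterozygous parent carries the haplotypes 1-a and 2-b; the list of transmitted alleles is read off the ten offspring genotypes, and all variable names are illustrative.

```python
# Haplotypes of the phase-known, doubly heterozygous parent.
parental_haplotypes = {("1", "a"), ("2", "b")}

# Alleles each of the 10 offspring received from that parent
# (the other parent always transmits 3 and c).
transmitted = [("1", "a"), ("2", "b"), ("1", "a"), ("1", "b"), ("1", "a"),
               ("1", "a"), ("2", "b"), ("2", "b"), ("2", "a"), ("2", "b")]

# A transmitted haplotype that matches neither parental haplotype is recombinant.
status = ["N" if hap in parental_haplotypes else "R" for hap in transmitted]
n_rec = status.count("R")
theta_hat = n_rec / len(status)

print(" ".join(status))
print(f"theta_hat = {n_rec}/{len(status)} = {theta_hat}")
```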

Page 18

Missing phase information: Who is a recombinant??

Parents: 1/2, a/b (phase unknown) and 3/3, c/c

offspring            1     2     3     4     5     6     7     8     9     10
locus 1              1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2              a c   b c   a c   b c   a c   a c   b c   b c   a c   b c
phase 1 (1 2, a b)   N     N     N     R     N     N     N     N     R     N
phase 2 (1 2, b a)   R     R     R     N     R     R     R     R     N     R

Page 19

Missing phase and genotype information: Who is a recombinant??

Parents: ?/?, a/b and 3/3, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c

Page 20

Missing phase and genotype information: Who is a recombinant???

Parents: ?/?, a/b and ?/?, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c

Page 21

Likelihood

• The likelihood of a hypothesis (e.g. specific parameter value(s)) on a given dataset, L(hypothesis|data), is defined to be proportional to the probability of the data given the hypothesis, P(data|hypothesis):

L(hypothesis|data) = constant * P(data|hypothesis)

• Because of the proportionality constant, a likelihood by itself has no interpretation.

• The likelihood ratio (LR) of 2 hypotheses is meaningful if the 2 hypotheses are nested (i.e., one hypothesis is contained within the other):

$$\mathrm{LR} = \frac{L(H_1 \mid \text{data})}{L(H_0 \mid \text{data})} = \frac{c\,P(\text{data} \mid H_1)}{c\,P(\text{data} \mid H_0)} = \frac{P(\text{data} \mid H_1)}{P(\text{data} \mid H_0)}$$

• Under certain conditions, maximum likelihood estimates are asymptotically unbiased and asymptotically efficient. Likelihood theory describes how to interpret a likelihood ratio.

Page 22

Evaluating the evidence of linkage: lod score

The lod (logarithm of odds) score is defined as the logarithm (to the base 10) of the likelihood ratio of 2 hypotheses on a given dataset:

$$\mathrm{lod} = \log_{10} \frac{L(H_1 \mid \text{data})}{L(H_0 \mid \text{data})}$$

In linkage analysis, typically the different hypotheses refer to different values of the recombination fraction:

$$Z(\vartheta) = \log_{10} \frac{L(\text{linkage at specific recombination fraction} \mid \text{data})}{L(\text{no linkage} \mid \text{data})} = \log_{10} \frac{L(\vartheta \mid \text{data})}{L(\vartheta = 0.5 \mid \text{data})}$$

$$Z_{\max} = \log_{10} \frac{\max_{\vartheta} L(\vartheta \mid \text{data})}{L(0.5 \mid \text{data})}$$

Asymptotically, $2 \ln(10)\, Z_{\max} \sim 0.5\, \chi^2_{(1)}$.

Page 23

Who is a recombinant?

Grandparents: 1/1, a/a and 2/2, b/b
Parents: 1 2, a b (phase known: haplotypes 1-a and 2-b) and 3/3, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c
status      N     N     N     R     N     N     N     N     R     N

Page 24

Example lod score calculation

$$Z(\vartheta) = \log_{10} \frac{L(\vartheta \mid \text{data})}{L(\vartheta = 0.5 \mid \text{data})} = \log_{10} \frac{c\,P(\text{data} \mid \vartheta)}{c\,P(\text{data} \mid \vartheta = 0.5)} = \log_{10} \frac{\vartheta^{2} (1-\vartheta)^{8}}{0.5^{2} (1-0.5)^{8}}$$

ϑ      Z(ϑ)
0      −∞
0.1    0.644
0.2    0.837
0.3    0.725
0.4    0.439
0.5    0
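The lod score values for the phase-known pedigree (2 recombinants, 8 non-recombinants) can be recomputed with a short function. This is a minimal sketch; the function name and its default arguments are illustrative and not part of the course material.

```python
import math

def lod_known_phase(theta, n_rec=2, n_nonrec=8):
    """Z(theta) for phase-known meioses; returns -inf where the likelihood is 0."""
    n = n_rec + n_nonrec
    likelihood = theta ** n_rec * (1 - theta) ** n_nonrec
    if likelihood == 0:
        return float("-inf")
    # The proportionality constant cancels in the ratio against theta = 0.5.
    return math.log10(likelihood / 0.5 ** n)

for theta in (0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"{theta:.1f}  {lod_known_phase(theta):.3f}")
```

The curve peaks at ϑ = 0.2, the proportion of recombinants among the 10 meioses.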

Page 25

Missing phase information: Who is a recombinant??

Parents: 1/2, a/b (phase unknown) and 3/3, c/c

offspring            1     2     3     4     5     6     7     8     9     10
locus 1              1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2              a c   b c   a c   b c   a c   a c   b c   b c   a c   b c
phase 1 (1 2, a b)   N     N     N     R     N     N     N     N     R     N
phase 2 (1 2, b a)   R     R     R     N     R     R     R     R     N     R

Page 26

Example lod score calculation (missing phase information)

P(data|ϑ) = P(phase 1) P(data|phase 1, ϑ) + P(phase 2) P(data|phase 2, ϑ)

$$Z(\vartheta) = \log_{10} \frac{c\,P(\text{data} \mid \vartheta)}{c\,P(\text{data} \mid \vartheta = 0.5)} = \log_{10} \frac{\tfrac{1}{2}\vartheta^{2}(1-\vartheta)^{8} + \tfrac{1}{2}\vartheta^{8}(1-\vartheta)^{2}}{\tfrac{1}{2}\,0.5^{2}(1-0.5)^{8} + \tfrac{1}{2}\,0.5^{8}(1-0.5)^{2}}$$

ϑ      Z(ϑ)
0      −∞
0.1    0.343
0.2    0.536
0.3    0.427
0.4    0.175
0.5    0
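When phase is unknown, the likelihood averages over the two equally likely phases. The sketch below recomputes the lod scores for this case; the function name and argument names are made up for illustration.

```python
import math

def lod_unknown_phase(theta, n_a=2, n_b=8):
    """Z(theta) when parental phase is unknown and both phases have prior 1/2.

    n_a meioses are recombinant under phase 1 (non-recombinant under phase 2);
    n_b meioses are non-recombinant under phase 1 (recombinant under phase 2).
    """
    def likelihood(t):
        phase1 = t ** n_a * (1 - t) ** n_b
        phase2 = t ** n_b * (1 - t) ** n_a
        return 0.5 * phase1 + 0.5 * phase2

    numerator = likelihood(theta)
    if numerator == 0:
        return float("-inf")
    return math.log10(numerator / likelihood(0.5))

for theta in (0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"{theta:.1f}  {lod_unknown_phase(theta):.3f}")
```

Compared with the phase-known case, every positive lod score is lower by roughly log10(2) ≈ 0.3, which is the price of not knowing the phase.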

Page 27

Missing phase and genotype information: Who is a recombinant???

Parents: ?/?, a/b and ?/?, c/c

offspring   1     2     3     4     5     6     7     8     9     10
locus 1     1 3   2 3   1 3   1 3   1 3   1 3   2 3   2 3   2 3   2 3
locus 2     a c   b c   a c   b c   a c   a c   b c   b c   a c   b c

Page 28

Example lod score calculation (missing phase and genotype information)

$$P(\text{data} \mid \vartheta, \text{allele frequencies}) = \sum_{\substack{\text{parental genotypes} \\ \text{and phases}}} P(\text{parental genotypes and phases} \mid \text{allele frequencies}) \times P(\text{data} \mid \text{parental genotypes}, \text{phase}, \vartheta)$$

Assuming 3 equally frequent alleles, i.e. P(1) = P(2) = P(3) = 0.333:

ϑ      Z(ϑ)
0      -0.304
0.1    0.204
0.2    0.346
0.3    0.264
0.4    0.096
0.5    0

Assuming P(1) = 0.495, P(2) = 0.495, P(3) = 0.010:

ϑ      Z(ϑ)
0      -0.378
0.1    0.183
0.2    0.332
0.3    0.253
0.4    0.091
0.5    0

Page 29

Lod score curves (for previous example pedigree)

[Figure: lod score (vertical axis, -1 to 1) plotted over the range 0 to 0.5 (horizontal axis) for three cases: known phase, known genotypes; unknown phase, known genotypes; unknown phase, unknown genotypes.]

Page 30

Interpretation of lod score

• The traditional threshold for declaring evidence of linkage statistically significant is a lod score of 3, or a likelihood ratio of 1000:1, meaning the likelihood of linkage on the data is 1000 times higher than the likelihood of no linkage on the data.

• Asymptotically, a lod score of 3 has a point-wise significance level (p-value) of 0.0001. In other words, the probability of obtaining a lod score of at least this magnitude by chance is 0.0001.

• Due to the many linkage tests being conducted as part of a genome-wide linkage scan, a lod score of 3 has a genome-wide significance level of ~0.05.

Page 31

P-value

The p-value is defined as the probability of obtaining an outcome at least as extreme as observed by chance (i.e. when the null hypothesis is true).

Example: Testing whether a coin is fair

H0: P(head) = 0.5

H1: P(head) ≠ 0.5 (2-sided alternative hypothesis).

You observe 1 head out of 10 coin tosses. The p-value then is the probability of observing exactly 1 head in 10 trials (observed outcome), or 0 heads in 10 trials (more extreme outcome), or 9 (equally extreme outcome) or 10 (more extreme outcome) heads in 10 trials.

$$p = \sum_{i \in \{0, 1, 9, 10\}} \binom{10}{i}\, 0.5^{i} (1-0.5)^{10-i} = \frac{1 + 10 + 10 + 1}{1024} = \frac{22}{1024} \approx 0.021$$
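The two-sided p-value for the coin example is a sum of four binomial probabilities, which can be reproduced as follows. This is a sketch; `binomial_pmf` is a helper name introduced here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Outcomes at least as extreme as 1 head in 10 tosses (two-sided): 0, 1, 9, 10.
p_coin = sum(binomial_pmf(k, 10) for k in (0, 1, 9, 10))
print(f"p = {p_coin:.4f}")
```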

Page 32

P-value

The p-value is defined as the probability of obtaining an outcome at least as extreme as observed by chance (i.e. when the null hypothesis is true).

Example: Testing whether 2 loci are linked

H0: P(recombination) = 0.5

H1: P(recombination) < 0.5 (1-sided alternative hypothesis).

You observe 0 recombinants and 10 non-recombinants in 10 informative meioses. The p-value then is the probability of observing exactly 0 recombinants in 10 trials (observed outcome; there is no more extreme outcome).

$$p = \binom{10}{0}\, 0.5^{0} (1-0.5)^{10-0} = \frac{1}{1024} \approx 0.001$$

Page 33

Lod score

Example: Testing whether 2 loci are linked

H0: P(recombination) = 0.5

H1: P(recombination) < 0.5 (1-sided alternative hypothesis).

You observe 0 recombinants and 10 non-recombinants in 10 informative meioses. The p-value then is the probability of observing exactly 0 recombinants in 10 trials (observed outcome; there is no more extreme outcome).

$$Z_{\max} = \log_{10} \frac{\max_{\vartheta} L(\vartheta \mid \text{data})}{L(0.5 \mid \text{data})} = \log_{10} \frac{L(\vartheta = 0 \mid \text{data})}{L(0.5 \mid \text{data})} = \log_{10} \frac{1}{1/1024} = \log_{10} 1024 \approx 3$$

In the ideal case, 10 fully informative meioses may suffice to obtain significant evidence of linkage.
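For fully informative, phase-known meioses, the maximum lod score follows directly from the recombinant count. The sketch below assumes the maximum likelihood estimate of the recombination fraction is the observed proportion of recombinants, capped at 0.5; the function name is illustrative.

```python
import math

def z_max(n_rec, n_total):
    """Return (theta_hat, Zmax) for n_rec recombinants in n_total informative meioses."""
    theta_hat = min(n_rec / n_total, 0.5)  # estimate restricted to [0, 0.5]
    n_nonrec = n_total - n_rec
    log_lik = 0.0
    if n_rec > 0:
        log_lik += n_rec * math.log10(theta_hat)
    if n_nonrec > 0:
        log_lik += n_nonrec * math.log10(1 - theta_hat)
    # Subtract the log10 likelihood under no linkage (theta = 0.5).
    return theta_hat, log_lik - n_total * math.log10(0.5)

print(z_max(0, 10))   # 0 recombinants in 10 meioses
print(z_max(2, 10))   # 2 recombinants in 10 meioses
```

With 0 recombinants in 10 meioses the result is log10(1024), just above the traditional threshold of 3.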

Page 34

Lod score and significance level

lod score    (point-wise) p-value
0.588        0.05
1.175        0.01
2.000        ~0.001
3.000        0.0001
4.000        ~0.00001
5.000        ~0.000001
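The table can be approximated from the asymptotic result that 2 ln(10) Zmax follows 0.5 χ² with 1 degree of freedom. The sketch below uses the identity P(χ²₁ > x) = erfc(√(x/2)) from the Python standard library; the function name is an assumption made for this example, and the conversion is only meaningful for positive lod scores.

```python
import math

def pointwise_p_value(lod):
    """Asymptotic point-wise p-value for a positive maximum lod score."""
    x = 2 * math.log(10) * lod                 # likelihood-ratio statistic
    return 0.5 * math.erfc(math.sqrt(x / 2))   # 0.5 * P(chi-square with 1 df > x)

for lod in (0.588, 1.175, 2.0, 3.0, 4.0, 5.0):
    print(f"{lod:5.3f}  {pointwise_p_value(lod):.7f}")
```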

Page 35

Linkage analysis reduces multiple testing problem

• Linkage analysis is so useful because it greatly reduces the multiple testing problem: ~3,000,000,000 bp of DNA are interrogated in ~500 independent linkage tests for human data. This is possible because a meiotic recombination event occurs on average only once every 100,000,000 bp.

• No specification of prior hypotheses is therefore necessary, as all possible hypotheses can be screened.

Page 36

Linkage analysis: trait locus with unknown genotypes

?/? ?/? ?/? ?/?

?/?

?/? ?/?

?/?

Where do the observed genotypes “fit”?

Page 37

Statistical gene mapping with trait phenotypes

[Diagram: observed marker genotypes are connected to unobserved trait locus genotypes by genetic distance (linkage, allelic association); unobserved trait locus genotypes are connected to observed trait phenotypes by etiology (?). The correlation to be detected is between observed marker genotypes and observed trait phenotypes.]

Page 38

Many different types of linkage methods

• penetrance model-based linkage analysis (“classical” linkage analysis)

• penetrance model-free linkage analysis (“model-free” or “non-parametric” linkage analysis)
– affected sib-pair linkage analysis
– affected relative-pair linkage analysis
– regression-based linkage analysis
– variance components-based linkage analysis
– …

Page 39

Variation with each linkage method

• 2-point analysis vs. multiple 2-point analysis vs. multi-point analysis

• exact calculation vs. approximation (e.g., MCMC)

• qualitative trait vs. quantitative traits

• rare “simple mendelian” diseases vs. common “complex multifactorial diseases”

• …

Page 40

Penetrance-model-based linkage analysis

Page 41

Segregation analysis

In segregation analysis, one attempts to characterize the mode of inheritance of a trait, by statistically examining the segregation pattern of the trait through a sample of related individuals.

In a sense, heritability analysis is a form of segregation analysis. In heritability analysis, the focus is not on characterizing the segregation pattern per se, but on quantifying inheritance under an assumed mode of inheritance (generally additivity/co-dominance).

Page 42

Relationship between genotypes and phenotypes (penetrances) at the ABO blood group locus

penetrance: P(phenotype given genotype)

            Phenotype (blood group)
Genotype    A    B    AB   O
A/A         1    0    0    0
A/B         0    0    1    0
A/O         1    0    0    0
B/B         0    1    0    0
B/O         0    1    0    0
O/O         0    0    0    1

Page 43

Probability model correlating trait phenotypes and trait locus genotypes: penetrances

penetrance: P(phenotype given genotype)

Ex.: fully-penetrant dominant disease without “phenocopies”

                 Phenotype
Genotype         unaffected   affected
+/+              1            0
D/+ or +/D       0            1
D/D              0            1

Page 44

Statistical gene mapping with trait phenotypes: “simple” dominant inheritance model

[Diagram: observed marker genotypes are connected to unobserved trait locus genotypes by genetic distance (linkage, allelic association); observed trait phenotypes determine the trait locus genotypes: affected = D/+, not affected = +/+. The correlation to be detected is between observed marker genotypes and observed trait phenotypes.]

Page 45

Linkage analysis: trait locus (genotypes based on assumed dominant inheritance model)

+/+

D/+

+/+

+/+

D/+

D/+

D/+

+/+

Where do the observed genotypes “fit”?

Page 46

Example of multipoint lod score curve: Pseudoxanthoma elasticum

[Figure: multipoint lod scores (vertical axis, 0 to 9) against map position in cM (horizontal axis, 0 to 30); curves labelled AR, AD and NPL.]

From: Le Saux et al (1999) Pseudoxanthoma elasticum maps to an 820 kb region of the p13.1 region of chromosome 16. Genomics 62:1-10

Page 47

Genetic heterogeneity

[Figure: panels drawn along a time axis, illustrating:]
• locus homogeneity, allelic homogeneity
• locus homogeneity, allelic heterogeneity
• locus heterogeneity, allelic homogeneity (at each locus)
• locus heterogeneity, allelic heterogeneity (at each locus)

Page 48

Pros and cons of penetrance-model-based linkage analysis

+ potentially very powerful (under suitable penetrance model)
+ statistically well-behaved

- requires specification of penetrance model; not powerful at all under unsuitable penetrance model

Page 49

Effects of model misspecification

Marker genotypes in both analyses: parents 1/2 and 3/4; offspring 1/3, 1/4, 2/3

dominant inheritance: P(aff.|DD or D+) = 1, P(aff.|++) = 0
Trait locus genotypes: parents +/+ (uninformative) and D/+ (informative); offspring D/+, +/+, D/+

recessive inheritance: P(aff.|DD) = 1, P(aff.|++ or D+) = 0
Trait locus genotypes: parents D/+ (informative) and D/D (uninformative); offspring D/D, D/+, D/D

Page 50

Pros and cons of penetrance-model-based linkage analysis

+ potentially very powerful (under suitable penetrance model)
+ statistically well-behaved

- requires specification of penetrance model; not powerful at all under unsuitable penetrance model
- modeling flexibility limited
- computationally intensive

Page 51

“Mendelian” vs. “complex” traits

“simple mendelian” disease

•genotypes of a single locus cause disease

•often little genetic (locus) heterogeneity (sometimes even little allelic heterogeneity); little interaction between genotypes at different genes

•often hardly any environmental effects

•often low prevalence

•often early onset

•often clear mode of inheritance

•“good” pedigrees for gene mapping can often be found

•often straightforward to map

“complex multifactorial” disease

•genotypes of a single locus merely increase risk of disease

•genotypes of many different genes (and various environmental factors) jointly and often interactively determine the disease status

•important environmental factors

•often high prevalence

•often late onset

•no clear mode of inheritance

•not easy to find “good” pedigrees for gene mapping

•difficult to map

Page 52

A quantitative trait is not necessarily complex

[Diagram: observed marker genotypes are connected to unobserved trait locus genotypes by genetic distance (linkage, allelic association); unobserved trait locus genotypes are connected to observed trait phenotypes by etiology given ascertainment. The correlation to be detected is between observed marker genotypes and observed trait phenotypes.]

$$P(G_{\text{trait locus}} \mid Ph_{\text{trait}}) \to 1$$

Page 53

Fundamental problem in complex trait gene mapping

[Diagram: observed marker genotypes are connected to unobserved trait locus genotypes by genetic distance (linkage, allelic association); unobserved trait locus genotypes are connected to observed trait phenotypes by etiology given ascertainment. The correlation to be detected is between observed marker genotypes and observed trait phenotypes.]

$$P(G_{\text{trait locus}} \mid Ph_{\text{trait}}) \to P(G_{\text{trait locus}}) \approx \text{small}$$

Page 54

Etiological complexity

[Diagram: the trait phenotype is influenced by gene 1, gene 2, gene 3 and other gene(s), each with genotype 1, genotype 2 and other genotypes, and by environm. factor 1, environm. factor 2, environm. factor 3 and other env. factor(s).]

Page 55

How to improve power to detect correlations between trait phenotypes and trait locus genotypes?

[Diagram: unobserved trait locus genotypes are connected to observed trait phenotypes by etiology.]

$$P(G_{\text{trait locus}} \mid Ph_{\text{trait}}) \to 0 \text{ or } 1$$

Page 56

How to simplify the etiological architecture?

• choose tractable trait
– Are there sub-phenotypes within trait?
  • age of onset
  • severity
  • combination of symptoms (syndrome)
– “endophenotype” or “biomarker” vs. disease
  • quantitative vs. qualitative (discrete)
  • Dichotomizing quantitative phenotypes leads to loss of information.
  • simple/cheap measurement vs. uncertain/expensive diagnosis
  • not as clinically relevant, but with simpler etiology

• given trait, choose appropriate study design/ascertainment protocol
– study population
  • genetic heterogeneity
  • environmental heterogeneity
– “random” ascertainment vs. ascertainment based on phenotype of interest
  • single or multiple probands
  • concordant or discordant probands
  • pedigrees with apparent “mendelian” inheritance?
  • inbred pedigrees?
– data structures
  • singletons, small pedigrees, large pedigrees
– account for/stratify by known genetic and environmental risk factors

Page 57

Affected sib-pair linkage analysis

Page 58

Identity-by-state (IBS) vs. identity-by-descent (IBD)

Four example sib pairs (marker genotypes):

family   parents      sibs
1        1 2, 3 4     1 3, 1 4
2        1 2, 1 3     1 2, 1 3
3        1 1, 2 3     1 3, 1 2
4        1 2, 1 2     1 2, 1 2

Labels in the figure: IBD (also IBS); IBS (not IBD); ? ? (both or neither IBD)

If IBD then necessarily IBS (assuming absence of mutation event).

If IBS then not necessarily IBD (unless a locus is 100% informative, i.e. has an infinite number of alleles, each with infinitesimally small allele frequency).

Page 59

Probabilistic inference of IBD

family   parents      sibs         IBD    NIBD   IBD / 2
1        1 2, 3 4     1 3, 1 4     1      1      0.5
2        1 2, 1 3     1 2, 1 3     0      2      0
3        1 1, 2 3     1 3, 1 2     0.5    1.5    0.25
4        1 2, 1 2     1 2, 1 2     1      1      0.5
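The expected IBD counts for these four sib pairs can be derived by enumerating every way the parents could have transmitted their alleles and keeping only the patterns consistent with the observed sib genotypes. The sketch below is one way to do this for a single marker with genotyped parents; the function names are illustrative, and which parent is listed first is an arbitrary assumption that does not affect the result.

```python
from itertools import product

def sib_configs(parent1, parent2, sib):
    """Transmission patterns (index into parent1, index into parent2)
    that are consistent with a sib's unordered genotype."""
    return [(i, j) for i, j in product((0, 1), repeat=2)
            if sorted((parent1[i], parent2[j])) == sorted(sib)]

def expected_ibd(parent1, parent2, sib1, sib2):
    """Expected number of alleles (0 to 2) shared IBD by the two sibs.

    All transmission patterns are equally likely a priori, so the
    consistent ones are weighted equally.
    """
    pairs = list(product(sib_configs(parent1, parent2, sib1),
                         sib_configs(parent1, parent2, sib2)))
    shared = [(c1[0] == c2[0]) + (c1[1] == c2[1]) for c1, c2 in pairs]
    return sum(shared) / len(pairs)

families = [
    ((1, 2), (3, 4), (1, 3), (1, 4)),
    ((1, 2), (1, 3), (1, 2), (1, 3)),
    ((1, 1), (2, 3), (1, 3), (1, 2)),
    ((1, 2), (1, 2), (1, 2), (1, 2)),
]
ibd_counts = [expected_ibd(*family) for family in families]
print(ibd_counts)
```

The number of alleles not shared IBD is 2 minus each value. Genotypes that are incompatible with the parents would leave no consistent pattern and are not handled in this sketch.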

Page 60

Rationale of affected sib-pair linkage analysis

A pair of sibs affected with the same disorder is expected to share the alleles at the trait locus/loci---and also alleles at linked loci---more often (> 50 %) than a random pair of sibs (50 %).

Page 61

Basic concept of affected sib pair linkage analysis

Example pedigree (marker genotypes): parents 1/2 and 3/4; affected sibs 1/3 and 1/4.

For each parent, ask: IBD?
Allele from the 1/2 parent: IBD
Allele from the 3/4 parent: NIBD

Page 62

Affected sib pair linkage analysis (mean test)

Example pedigree (marker genotypes): parents 1/2 and 3/4; affected sibs 1/3 and 1/4. The allele from the 1/2 parent is IBD; the allele from the 3/4 parent is NIBD.

                           NIBD      IBD
counts in example ped.     1         1
total counts in dataset    n_NIBD    n_IBD

Conditional on the fact that both sibs are affected, test if:

$$n_{IBD} > n_{NIBD} \;?$$

$$\Pr(IBD) > \Pr(NIBD) \;?$$

$$\Pr(IBD) > 0.5 \;?$$

Page 63

Affected sib pair linkage analysis (mean test)

Example pedigree (marker genotypes): parents 1/2 and 3/4; affected sibs 1/3 and 1/4. The allele from the 1/2 parent is IBD; the allele from the 3/4 parent is NIBD.

                   NIBD      IBD
probability        1 − φ     φ
counts in ex.      1         1
total counts       n_NIBD    n_IBD

$$2 \ln \frac{L(\hat{\varphi})}{L(\varphi = 0.5)} = 2 \ln \frac{(1-\hat{\varphi})^{n_{NIBD}}\, \hat{\varphi}^{n_{IBD}}}{(1-0.5)^{n_{NIBD}}\, 0.5^{n_{IBD}}}$$
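The mean test statistic can be computed from the total IBD and NIBD counts. In the sketch below, the estimate of φ is taken to be the observed proportion of IBD alleles, restricted to values of at least 0.5 because the test is one-sided; that restriction, the function name and the example counts are assumptions made for illustration, not data from the course.

```python
import math

def mean_test_statistic(n_ibd, n_nibd):
    """Return (phi_hat, 2 ln[L(phi_hat) / L(0.5)]) for the given allele-sharing counts."""
    n = n_ibd + n_nibd
    phi_hat = max(n_ibd / n, 0.5)  # one-sided: excess sharing only
    log_ratio = 0.0
    if n_ibd > 0:
        log_ratio += n_ibd * math.log(phi_hat / 0.5)
    if n_nibd > 0:
        log_ratio += n_nibd * math.log((1 - phi_hat) / 0.5)
    return phi_hat, 2 * log_ratio

# Hypothetical counts: 60 alleles shared IBD, 40 not shared IBD.
phi_hat, statistic = mean_test_statistic(60, 40)
print(f"phi_hat = {phi_hat:.2f}, statistic = {statistic:.3f}")
```

Dividing the statistic by 2 ln(10) converts it to the lod score scale.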

Page 64

Penetrance-model based linkage analysis on affected sib pair

1/2 3/4

1/3 1/4

Trait locus genotypes are inferred probabilistically conditional on observed phenotypes according to an assumed inheritance model (number of alleles, allele frequencies and genotypic penetrances).

?/? ?/?

?/? ?/?

Page 65

Penetrance-model-based linkage analysis on affected sib pair assuming a rare recessive trait w/o “phenocopies”

Parents: marker genotypes 1/2 and 3/4; trait locus genotypes D/+ and D/+
Affected sibs: marker genotypes 1/3 and 1/4; trait locus genotypes D/D and D/D

Conditional on the fact that both affected sibs inherited the D allele from each parent, test if:

$$\Pr(IBD) > \Pr(NIBD) \;?$$

$$\vartheta^{2} + (1-\vartheta)^{2} > 2\vartheta(1-\vartheta) \;?$$

$$\Pr(\text{non-recombinant}) > \Pr(\text{recombinant}) \;?$$

$$\vartheta < 0.5 \;?$$

Page 66

Penetrance-based linkage analysis on affected sib pair (assuming a rare, recessive trait w/o “phenocopies”)

Parents: marker genotypes 1/2 and 3/4; trait locus genotypes D/+ and D/+
Affected sibs: marker genotypes 1/3 and 1/4; trait locus genotypes D/D and D/D

$$2 \ln \frac{L(\hat{\varphi})}{L(\varphi = 0.5)} = 2 \ln \frac{(1-\hat{\varphi})^{n_{NIBD}}\, \hat{\varphi}^{n_{IBD}}}{(1-0.5)^{n_{NIBD}}\, 0.5^{n_{IBD}}}$$

and because

$$\varphi = P(\text{rec.})^{2} + P(\text{non-rec.})^{2} = \vartheta^{2} + (1-\vartheta)^{2}$$

$$= 2 \ln \frac{[2\hat{\vartheta}(1-\hat{\vartheta})]^{n_{NIBD}}\, [\hat{\vartheta}^{2} + (1-\hat{\vartheta})^{2}]^{n_{IBD}}}{(1-0.5)^{n_{NIBD}}\, 0.5^{n_{IBD}}} = 2 \ln \frac{L(\hat{\vartheta})}{L(\vartheta = 0.5)}$$

Page 67

Relationship of affected sib-pair linkage analysis and penetrance-model-based linkage analysis

[Figure: ϑ (vertical axis, 0 to 0.5) plotted against φ (horizontal axis, 0.5 to 1.0).]

φ = P(2 affected sibs share the allele from a parent IBD)

ϑ = recombination fraction in pseudo-marker analysis

For an affected sib-pair of unaffected parents, affected sib-pair linkage analysis and penetrance-model-based linkage analysis assuming a rare recessive trait w/o “phenocopies” are identical.
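The correspondence between the two parameters can be written out explicitly. The sketch below assumes the relation φ = ϑ² + (1 − ϑ)² for an affected sib pair with unaffected parents under a rare recessive model, and inverts it on the range 0 ≤ ϑ ≤ 0.5; the function names are illustrative.

```python
import math

def phi_from_theta(theta):
    """IBD sharing probability implied by recombination fraction theta."""
    return theta ** 2 + (1 - theta) ** 2

def theta_from_phi(phi):
    """Inverse mapping for 0.5 <= phi <= 1, taking the root with theta <= 0.5."""
    return (1 - math.sqrt(2 * phi - 1)) / 2

for theta in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    phi = phi_from_theta(theta)
    print(f"theta = {theta:.1f}  phi = {phi:.2f}  back to theta = {theta_from_phi(phi):.1f}")
```

Because the mapping is one-to-one on this range, maximizing the likelihood over φ in [0.5, 1] or over ϑ in [0, 0.5] gives the same test statistic.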

Page 68

Penetrance-based linkage analysis on affected sib pair

Assuming a rare, recessive trait w/o “phenocopies”, the father is no longer informative.

Penetrance-based linkage analysis is then no longer equivalent to affected sib pair linkage analysis.

1/2 3/4

1/3 1/4

D/D D/+

D/D D/D


“Pseudo-marker” analog of affected sib pair linkage analysis (mean test)

“pseudo-marker”

genotypes

1/2 3/4

1/3 1/4

D/+

D/+

D/D D/D

1/2 3/4

1/3 1/4

D/+

D/+

D/D D/D


Take home message regarding relationship of penetrance-model-based and “model-free” approaches to gene mapping:

• The perceived differences between penetrance-model-based and many popular “model-free” methods are more related to the underlying study design than the statistical methodology.

• A deterministic “pseudo-marker” genotype assignment algorithm can be used to mimic popular “model-free approaches”, allowing joint analysis of different data structures for linkage and/or LD in a framework identical to penetrance-based analysis.

• These “pseudo-marker” statistics are generally better behaved and more powerful than their conventional “model-free” analogs.


Regression-based methods for linkage analysis of quantitative traits

The basic rationale behind this approach (in its various forms) is that pairs of individuals (of a given relationship) with similar phenotypes are expected to be more similar to each other genetically at/near loci influencing the trait of interest than pairs of relatives (of the same relationship) who have dissimilar phenotypes. The degree of phenotypic similarity therefore should be reflected in the proportion of alleles that individuals share IBD at/near trait loci.


Haseman-Elston sib pair linkage test for quantitative traits

[Figure: scatter plot of the squared phenotypic difference between 2 sibs (vertical axis) against IBD sharing (horizontal axis: 0, 0.5, 1).]

Statistical inference: Is the regression slope < 0?
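A minimal sketch of the regression behind this test, assuming the proportion of alleles shared IBD has already been estimated for each sib pair. The function name `haseman_elston` and the toy data are hypothetical; the slope is fitted by ordinary least squares and a one-sided test asks whether it is negative.

```python
import math

def haseman_elston(pi_hat, sq_diff):
    """Regress squared sib-pair trait differences on IBD sharing.

    Returns (intercept, slope, t statistic of the slope).
    """
    n = len(pi_hat)
    mean_x = sum(pi_hat) / n
    mean_y = sum(sq_diff) / n
    sxx = sum((x - mean_x) ** 2 for x in pi_hat)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pi_hat, sq_diff))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(pi_hat, sq_diff))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope

# Toy data (invented for illustration): six sib pairs.
pi_hat = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
sq_diff = [4.0, 3.6, 2.1, 1.9, 0.3, 0.1]
intercept, slope, t_stat = haseman_elston(pi_hat, sq_diff)
print(intercept, slope, t_stat)
```

A clearly negative t statistic is evidence for linkage: pairs sharing more alleles IBD near the trait locus differ less in phenotype.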


Variance components-based linkage analysis


Rationale of variance components-based linkage analysis

The pattern of phenotypic similarity among pedigree members should be reflected by the pattern of IBD sharing among them at chromosomal loci influencing the trait of interest.


Variance components approach: multivariate normal distribution (MVN)

In variance components analysis, the phenotype is generally assumed to follow a multivariate normal distribution:

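\[
f(x) = \frac{1}{\sqrt{(2\pi)^n\,|\Omega|}}\,\exp\!\left(-\frac{1}{2}\,(x-\mu)'\,\Omega^{-1}\,(x-\mu)\right)
\]

\[
\ln f(x) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln|\Omega| - \frac{1}{2}\,(x-\mu)'\,\Omega^{-1}\,(x-\mu)
\]

where n is the no. of individuals (in a pedigree), Ω is the n×n covariance matrix, x is the phenotype vector, and μ is the mean phenotype vector.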
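The multivariate normal log-likelihood can be evaluated for one pedigree as in the sketch below. It assumes a symmetric positive-definite covariance matrix and uses a hand-written Cholesky factorisation so that only the standard library is needed; the function names `cholesky` and `mvn_loglik` are illustrative and not taken from any linkage package.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L' = a (a symmetric, positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def mvn_loglik(x, mu, omega):
    """ln f(x) for phenotype vector x, mean vector mu, covariance matrix omega."""
    n = len(x)
    low = cholesky(omega)
    # Forward substitution solves L z = x - mu, so that
    # z'z equals the quadratic form (x - mu)' Omega^-1 (x - mu).
    z = []
    for i in range(n):
        z.append((x[i] - mu[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    # ln|Omega| is twice the sum of the logs of the Cholesky diagonal.
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * quad

# Two individuals with an invented covariance matrix.
print(mvn_loglik([1.0, 0.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]]))
```

Pedigrees are independent of one another, so the log-likelihood of a sample is the sum of this quantity over pedigrees.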


Modeling the resemblance among relatives

heritability analysis:

\[
\Omega = I\sigma_e^2 + 2\Phi\sigma_g^2
\]

linkage analysis:

\[
\Omega = I\sigma_e^2 + 2\Phi\sigma_g^2 + \hat\Pi\sigma_q^2
\]


Matrix of estimated allele sharing among relatives

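Example pedigree (marker genotypes): parents P = 1/2 and M = 3/3; sibs S1, S2, S3 = 1/3 each.

Expected allele sharing, Π expected = 2Φ:

      P     M     S1    S2    S3
P     1     0     0.5   0.5   0.5
M           1     0.5   0.5   0.5
S1                1     0.5   0.5
S2                      1     0.5
S3                            1

Estimated allele sharing, Π̂ estimated:

      P     M     S1    S2    S3
P     1     0     0.5   0.5   0.5
M           1     0.5   0.5   0.5
S1                1     0.75  0.75
S2                      1     0.75
S3                            1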
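To make the role of the two sharing matrices concrete, the sketch below builds the phenotypic covariance matrix Ω = Iσ²e + 2Φσ²g + Π̂σ²q for the five-person pedigree above. The matrix entries come from the slide; the variance components (0.4, 0.4, 0.2) and the function names are hypothetical values chosen only for illustration.

```python
def sharing_matrix(sib_sib):
    """Full symmetric sharing matrix for P, M, S1, S2, S3.

    sib_sib is the sib-sib entry: 0.5 for the expected matrix 2*Phi,
    0.75 for the estimated matrix in the example above.
    """
    return [
        [1.0, 0.0, 0.5, 0.5, 0.5],
        [0.0, 1.0, 0.5, 0.5, 0.5],
        [0.5, 0.5, 1.0, sib_sib, sib_sib],
        [0.5, 0.5, sib_sib, 1.0, sib_sib],
        [0.5, 0.5, sib_sib, sib_sib, 1.0],
    ]

def covariance(two_phi, pi_hat, var_e, var_g, var_q):
    """Omega = I*var_e + 2*Phi*var_g + Pi_hat*var_q, element by element."""
    n = len(two_phi)
    return [
        [(var_e if i == j else 0.0) + two_phi[i][j] * var_g + pi_hat[i][j] * var_q
         for j in range(n)]
        for i in range(n)
    ]

two_phi = sharing_matrix(0.5)    # expected sharing
pi_hat = sharing_matrix(0.75)    # estimated sharing at the marker

# Hypothetical variance components (not from the slides).
omega = covariance(two_phi, pi_hat, var_e=0.4, var_g=0.4, var_q=0.2)
for row in omega:
    print([round(v, 3) for v in row])
```

With these values the sib-sib covariance is 0.5·0.4 + 0.75·0.2 = 0.35, compared with 0.30 if the sibs shared only the expected 0.5 at the locus. That difference is what the linkage model uses.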


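Variance components-based lod score

\[
\mathrm{lod} = \log_{10}\frac{L(H_1 \mid \text{data})}{L(H_0 \mid \text{data})} = \log_{10}\frac{\max_{\sigma_e^2,\,\sigma_g^2,\,\sigma_q^2} L(\sigma_e^2, \sigma_g^2, \sigma_q^2 \mid \text{data})}{\max_{\sigma_e^2,\,\sigma_g^2} L(\sigma_e^2, \sigma_g^2, \sigma_q^2 = 0 \mid \text{data})}
\]

Asymptotically, 2 ln(10) lod ~ 0.5 χ²(1), i.e. a 50:50 mixture of a point mass at zero and a χ² distribution with 1 degree of freedom, because σ²q = 0 lies on the boundary of the parameter space.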
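The asymptotic distribution can be turned into a pointwise p-value as sketched below. The sketch assumes the 50:50 mixture of a point mass at zero and χ² with 1 degree of freedom, and uses the identity P(χ²₁ > x) = erfc(√(x/2)); the function name `lod_to_pvalue` is hypothetical.

```python
import math

def lod_to_pvalue(lod):
    """Asymptotic pointwise p-value of a variance-components lod score."""
    if lod <= 0.0:
        # The whole null distribution lies at or above zero.
        return 1.0
    x = 2.0 * math.log(10.0) * lod          # likelihood-ratio statistic
    chi2_tail = math.erfc(math.sqrt(x / 2.0))  # P(chi-square with 1 df > x)
    return 0.5 * chi2_tail                   # half the mass sits at zero

for lod in (1.0, 2.0, 3.0):
    print(lod, lod_to_pvalue(lod))
```

Under these assumptions a lod score of 3 corresponds to a pointwise p-value of roughly 1 in 10,000. This is not corrected for the multiple tests in a genome-wide scan.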


[Figure: sample size requirements to detect linkage to a QTL with a lod score of ≥ 3 and 80% power. Vertical axis: number of individuals (100 to 100,000, logarithmic scale); horizontal axis: heritability due to QTL (0 to 0.5); curves for pedigree, sibship (2), and sibship (4).]


Pros and cons of variance-components-based linkage analysis

+ no need to specify inheritance model
+ robust to allelic heterogeneity at a locus
+ modeling flexibility
+ computationally feasible even on large pedigrees

- generally assumes additive inheritance model
- modeling restrictions
- not always well-behaved statistically (depending on phenotypic distribution and ascertainment)
- generally less powerful than penetrance-model-based linkage analysis under suitable model


Choice of covariates

Covariates ought to be included in the likelihood model if they are known to influence the phenotype of interest and if their own genetic regulation does not overlap the genetic regulation of the target phenotype.

Typical examples include sex and age.

In the analysis of height, information on nutrition during childhood should probably be included during analysis. However, known growth hormone levels probably should not be.


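Choice of covariates

[Figure: diagrams of the phenotypic variance σ²p, the QTL variance σ²q, and the covariate variance σ²cov.]

\[
h^2_{q,\,\text{without cov}} = \frac{\sigma_q^2}{\sigma_p^2} \approx 0.15 > 0.05 \approx \frac{\sigma_q^2 - (\sigma_q^2 \cap \sigma_{cov}^2)}{\sigma_p^2 - \sigma_{cov}^2} = h^2_{q,\,\text{with cov}}
\]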


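Choice of covariates

[Figure: diagrams of the phenotypic variance σ²p, the QTL variance σ²q, and the covariate variance σ²cov.]

\[
h^2_{q,\,\text{without cov}} = \frac{\sigma_q^2}{\sigma_p^2} \approx 0.15 < 0.2 \approx \frac{\sigma_q^2 - (\sigma_q^2 \cap \sigma_{cov}^2)}{\sigma_p^2 - \sigma_{cov}^2} = h^2_{q,\,\text{with cov}}
\]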
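The two “Choice of covariates” comparisons can be reproduced with simple arithmetic. In the sketch below the symbol between σ²q and σ²cov is read as the portion of QTL variance that overlaps the covariate variance, which is an interpretation of the slide. The numerical variances are hypothetical, chosen only so that the results match the approximate values 0.15, 0.05 and 0.2 shown above.

```python
def qtl_heritability(var_p, var_q, var_cov=0.0, overlap=0.0):
    """QTL heritability after removing covariate variance.

    overlap is the part of the QTL variance that is shared with the
    covariate and is therefore removed together with it.
    """
    return (var_q - overlap) / (var_p - var_cov)

# Hypothetical variances: total 1.0, QTL 0.15, covariate 0.25.
var_p, var_q, var_cov = 1.0, 0.15, 0.25

without_cov = qtl_heritability(var_p, var_q)
# Covariate shares much of its variance with the QTL: the signal shrinks.
with_overlap = qtl_heritability(var_p, var_q, var_cov, overlap=0.1125)
# Covariate is unrelated to the QTL: the signal grows.
no_overlap = qtl_heritability(var_p, var_q, var_cov, overlap=0.0)

print(without_cov, with_overlap, no_overlap)
```

The same covariate therefore either lowers or raises the relative QTL effect, depending on whether its variance overlaps the QTL variance.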


Choice of covariates: special case of treatment/medication


Before treatment/medication of affected individuals

[Figure: probability density of the phenotype, with the unaffected and affected ranges marked.]


After (partially effective) treatment / medication of affected individuals

[Figure: probability density of the phenotype, with the unaffected and affected ranges and the apparent effect of the covariate marked.]


Choice of covariates: special case of treatment/medication

• If medication is ineffective/partially effective, including treatment as a covariate is worse than ignoring it in the analysis.

• If medication is very effective, such that the phenotypic mean of individuals after treatment is equal to the phenotypic mean of the population as a whole, then including medication as a covariate has no effect.

• If medication is extremely effective, such that the phenotypic mean of individuals after treatment is “better” than the phenotypic mean of the population as a whole, then including medication as a covariate is better than ignoring it, but still far from satisfying.

• Either censor individuals or, better, infer or integrate over their phenotypes before treatment, based on information on efficacy etc.


Two-point vs. multi-point linkage analysis

• In linkage analysis, one always examines whether or not the alleles at 2 loci tend to co-segregate during meiosis.

• In “two-point” linkage analysis, chromosomal inheritance is inferred from the observed trait phenotypes on the one hand (locus 1) and from a single (genotyped) marker locus on the other hand (locus 2).

• In “multi-point” linkage analysis, chromosomal inheritance is inferred from the observed trait phenotypes on the one hand (locus 1) and from multiple (genotyped) marker loci on the other hand (locus 2).


Pros and cons of multi-point linkage analysis

+ Genotypes at multiple markers contain at least as much and generally more information to infer chromosomal inheritance than genotypes at a single marker, resulting in greater power to detect linkage.

+ The number of independent tests in genome-wide linkage analysis is somewhat reduced in multi-point linkage analysis vs. two-point linkage analysis.

- Multi-point linkage analysis requires knowledge of the genetic marker map (marker order and inter-marker recombination fractions). If this information is incorrect, power can be reduced and/or the false positive rate can be increased.

- Multi-point linkage analysis is more susceptible to genotyping errors.

- Multi-point linkage analysis typically assumes linkage equilibrium between markers. If this does not hold, power can be reduced and/or the false positive rate can be increased.

- Multi-point linkage analysis is computationally more demanding than two-point linkage analysis.


Genetic map vs. physical map

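[Figure: four markers m1, m2, m3, m4 with inter-marker recombination fractions θ12, θ23, θ34. Genetic map: positions x1, x2, x3, x4 in cM. Physical map: positions y1, y2, y3, y4 in Mb.]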


Genetic map distance vs. recombination fraction

Def. of recombination fraction: probability that recombination takes place between 2 chromosomal positions during meiosis

Recombination fractions are not additive, i.e., for 3 loci and recombination fractions θ12 and θ23, θ13 ≠ θ12 + θ23.

Def. of genetic map distance (Morgan, M): distance in which 1 recombination event is expected to take place or, equivalently, average distance between recombination events. One centi-Morgan (cM) is equal to 1/100 Morgan.

Genetic map distances are additive, i.e. for 3 loci and map distances x12 cM and x23 cM, x13 = x12 + x23 cM.

Neither recombination fractions nor genetic map distances are easily converted into physical map distances.
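The contrast between additive map distances and non-additive recombination fractions can be illustrated with a map function. The sketch below uses Haldane's map function, which assumes no crossover interference. The slides do not name a map function, so this choice, the function names, and the example distances are assumptions made for illustration.

```python
import math

def haldane_theta(x_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    x_morgan = x_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * x_morgan))

def haldane_cm(theta):
    """Map distance in cM for a recombination fraction theta < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * theta)

# Hypothetical map: locus 1 to locus 2 is 10 cM, locus 2 to locus 3 is 20 cM.
x12, x23 = 10.0, 20.0
x13 = x12 + x23                      # map distances add

theta12 = haldane_theta(x12)
theta23 = haldane_theta(x23)
theta13 = haldane_theta(x13)

print(theta12 + theta23, theta13)    # the sum exceeds theta13
print(haldane_cm(theta13))           # converts back to the additive distance
```

Under Haldane's function θ13 = θ12 + θ23 − 2·θ12·θ23, so the simple sum overstates the recombination fraction, while the corresponding map distances add exactly.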


Why a genome-wide linkage scan may fail

• The sample size is too small.

• The marker genotypes are not sufficiently informative (low heterozygosity and/or large gaps in marker map).

• There is no major gene.

• The chosen analytical approach is unsuitable.

• Bad luck!


A fairytale of 2 traits


Heritability estimates

trait A trait B

45-82% 63-92%


Quantitative trait A (sample 1)

large, randomly ascertained pedigrees

no. of phenotyped individuals: 268

trait heritability estimate: 0.55


Quantitative trait B (sample 1)

large, randomly ascertained pedigrees

no. of phenotyped individuals: 324

trait heritability estimate: 0.88


Quantitative trait A (sample 1)


Quantitative trait A (samples 1--2)


Quantitative trait A (samples 1--3)


Quantitative trait A (samples 1--3 + combined)


Quantitative trait B (sample 1)


Quantitative trait B (samples 1--2)


Quantitative trait B (samples 1--3)


Quantitative trait B (samples 1--4)


Quantitative trait B (samples 1--5)


Quantitative trait B (samples 1--6)


Quantitative trait B (samples 1--7)


Quantitative trait B (samples 1--8)


Quantitative trait B (samples 1--9)


quantitative trait A: lipoprotein A (concentration in serum)

quantitative trait B: height (in adults)


Heritability of adult height (additive heritability, adjusted for sex and age)

study      sample size   heritability estimate
TOPS       2199          0.78
FLS        705           0.83
GAIT       324           0.88
SAFHS      903           0.76
SAFDS      737           0.92
SHFS-AZ    643           0.80
SHFS-DK    675           0.81
SHFS-OK    647           0.79
Jiri       616           0.63
total      7449


Polygenic or oligogenic?


Height (9 samples)

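\[
\begin{aligned}
\hat h^2_{q,\mathrm{GAIT}} &= 0.29 \\
\hat h^2_{q,\mathrm{TOPS}} &= 0.03 \\
\hat h^2_{q,\mathrm{FLS}} &= 0 \\
\hat h^2_{q,\mathrm{SAFDS}} &= 0.08 \\
\hat h^2_{q,\mathrm{SAFHS}} &= 0 \\
\hat h^2_{q,\mathrm{SHFS\text{-}AZ}} &= 0.05 \\
\hat h^2_{q,\mathrm{SHFS\text{-}DK}} &= 0.01 \\
\hat h^2_{q,\mathrm{SHFS\text{-}OK}} &= 0.01 \\
\hat h^2_{q,\mathrm{Jiri}} &= 0
\end{aligned}
\]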