Penalized maximum likelihood estimates of genetic covariance matrices with shrinkage towards phenotypic dispersion Karin Meyer 1 , Mark Kirkpatrick 2 , Daniel Gianola 3 1 Animal Genetics and Breeding Unit, University of New England, Armidale 2 Section of Integrative Biology, University of Texas, Austin 3 University of Wisconsin-Madison, Madison AAABG 2011

Post on 15-Aug-2015


Page 1: Penalized maximum likelihood estimates of genetic covariance matrices with shrinkage towards phenotypic dispersion

Karin Meyer¹, Mark Kirkpatrick², Daniel Gianola³

¹ Animal Genetics and Breeding Unit, University of New England, Armidale

² Section of Integrative Biology, University of Texas, Austin

³ University of Wisconsin-Madison, Madison

AAABG 2011

Penalized REML | Introduction

Motivation

Multivariate genetic analyses: more than 2-4 traits

→ desirable!
→ technically increasingly feasible
→ inherently problematic

SAMPLING VARIANCE ↑↑ with no. of traits

Measures to alleviate S.V.

large → gigantic data sets
parsimonious models → fewer parameters than covariances
estimation → use additional information

Bayesian: Prior

REML: Impose penalty P on likelihood
→ P = f(parameters)
→ designed to reduce S.V.

K. M. / M. K. / D. G. | | AAABG 2011 2 / 13

Penalized REML | Introduction

Penalized REML

Maximize:

$\log \mathcal{L}_P = \log \mathcal{L} - \tfrac{1}{2}\,\psi\,P$

ψ: tuning factor
P: penalty on parameters
log L: standard, unpenalized REML log likelihood

Objectives:

Introduce new type of P
→ prior: Σ ∼ IW (inverse Wishart)

Compare efficacy of different P
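The penalized criterion is simple arithmetic once log L and P are available. A minimal sketch follows; the function name and the numbers are hypothetical and only illustrate how the tuning factor ψ scales the penalty, with ψ = 0 giving back the unpenalized log likelihood.

```python
def penalized_log_likelihood(log_l: float, penalty: float, psi: float) -> float:
    """Return log L_P = log L - 0.5 * psi * P."""
    if psi < 0:
        # Assumption of this sketch: the tuning factor is non-negative.
        raise ValueError("psi is assumed to be non-negative")
    return log_l - 0.5 * psi * penalty


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic.
    log_l, penalty = -1234.5, 3.2
    for psi in (0.0, 1.0, 10.0):
        print(psi, penalized_log_likelihood(log_l, penalty, psi))
```

In an actual analysis log L and P both depend on the parameters, so the whole expression would be maximized with respect to them for a given ψ.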

Penalized REML | REML and penalties

Penalties to improve estimates of ΣG

Σ̂P estimated much more accurately than Σ̂G

Idea: ‘Borrow strength’ from Σ̂P

1 Shrink canonical eigenvalues towards their mean
→ ‘bending’ (Hayes & Hill 1981)

$P_\lambda^\ell \propto \sum_i (\log\lambda_i - \bar{\lambda})^2$, with $\lambda_i$: eigenvalues of $\hat{\Sigma}_P^{-1}\hat{\Sigma}_G$

2 a) Shrink Σ̂G towards Σ̂P
→ assume $\Sigma_G \sim IW(\Sigma_P^{-1}, \psi)$
→ obtain penalty as minus log density of IW

$P_\Sigma \propto C \log|\hat{\Sigma}_G| + \mathrm{tr}(\hat{\Sigma}_G^{-1}\hat{\Sigma}_P^0)$

b) Shrink R̂G towards R̂P

$P_R \propto C \log|\hat{R}_G| + \mathrm{tr}(\hat{R}_G^{-1}\hat{R}_P^0)$
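The three penalties can be evaluated numerically from a pair of covariance matrices. The sketch below uses NumPy and makes several assumptions that are not stated on the slide: R is read as a correlation matrix, the constant C is passed in by the caller, the mean in the eigenvalue penalty is taken over the log eigenvalues (following "regress log(λ̂i) towards their mean"), and proportionality constants are dropped. All function names and example matrices are hypothetical.

```python
import numpy as np


def canonical_eigenvalues(sigma_g, sigma_p):
    """Eigenvalues of inv(sigma_p) @ sigma_g, in descending order."""
    vals = np.linalg.eigvals(np.linalg.solve(sigma_p, sigma_g))
    return np.sort(vals.real)[::-1]


def penalty_log_eigen(sigma_g, sigma_p):
    """Sum of squared deviations of log canonical eigenvalues from their mean."""
    log_lam = np.log(canonical_eigenvalues(sigma_g, sigma_p))
    return float(np.sum((log_lam - log_lam.mean()) ** 2))


def penalty_matrix(estimate, target, c):
    """c * log|estimate| + tr(inv(estimate) @ target)."""
    sign, logdet = np.linalg.slogdet(estimate)
    if sign <= 0:
        raise ValueError("estimate must be positive definite")
    return float(c * logdet + np.trace(np.linalg.solve(estimate, target)))


def to_correlation(sigma):
    """Scale a covariance matrix to a correlation matrix."""
    d = np.sqrt(np.diag(sigma))
    return sigma / np.outer(d, d)


if __name__ == "__main__":
    # Hypothetical 2-trait matrices, for illustration only.
    sigma_p = np.array([[1.0, 0.3], [0.3, 1.0]])
    sigma_g = np.array([[0.4, 0.05], [0.05, 0.1]])
    c = 1.5  # placeholder value; the slide does not define C
    print(penalty_log_eigen(sigma_g, sigma_p))
    print(penalty_matrix(sigma_g, sigma_p, c))
    print(penalty_matrix(to_correlation(sigma_g), to_correlation(sigma_p), c))
```

The same `penalty_matrix` function serves for both 2 a) and 2 b); only the inputs change from covariance to correlation matrices.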

Penalized REML | Simulation study

Simulation

Paternal half-sib design: s = 100, n = 10

5 traits, $h_1^2 \ge \dots \ge h_5^2$, MVN

60 sets of population values
→ vary mean & spread of λi

1000 replicates per case

3 penalties

$P_\lambda^\ell$: regress log(λ̂i) towards their mean
$P_\Sigma$: shrink Σ̂G towards Σ̂P
$P_R$: shrink R̂G towards R̂P

Obtain $\hat{\Sigma}_G^\psi$ and $\hat{\Sigma}_E^\psi$ for values of ψ, range 0 − 1000
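A balanced paternal half-sib design of this size is easy to simulate. The sketch below is an assumption-laden illustration, not the authors' simulation code: it draws sire effects with covariance ΣG/4 and within-family deviations with covariance ¾ΣG + ΣE, forms the between- and within-sire mean square matrices, and converts them to simple ANOVA-type estimates. Those estimates stand in for the unpenalized starting point; they are not REML. Function names and population values are hypothetical.

```python
import numpy as np


def simulate_half_sib(sigma_g, sigma_e, s=100, n=10, rng=None):
    """Return records of shape (s, n, q): s sires with n progeny each."""
    rng = np.random.default_rng(rng)
    q = sigma_g.shape[0]
    sire_cov = sigma_g / 4.0               # sires transmit half their genes
    within_cov = 0.75 * sigma_g + sigma_e  # remaining genetic + residual
    sires = rng.multivariate_normal(np.zeros(q), sire_cov, size=s)
    dev = rng.multivariate_normal(np.zeros(q), within_cov, size=(s, n))
    return sires[:, None, :] + dev


def mean_squares(y):
    """Between-sire (MSB) and within-sire (MSW) mean square matrices."""
    s, n, q = y.shape
    fam_means = y.mean(axis=1)
    b = fam_means - fam_means.mean(axis=0)
    msb = n * b.T @ b / (s - 1)
    w = (y - fam_means[:, None, :]).reshape(-1, q)
    msw = w.T @ w / (s * (n - 1))
    return msb, msw


def anova_estimates(msb, msw, n):
    """Method-of-moments estimates of genetic, residual, phenotypic matrices."""
    sigma_s = (msb - msw) / n
    sigma_g = 4.0 * sigma_s
    sigma_p = sigma_s + msw
    return sigma_g, sigma_p - sigma_g, sigma_p


if __name__ == "__main__":
    # Hypothetical 2-trait population values.
    sigma_g = np.array([[0.4, 0.1], [0.1, 0.2]])
    sigma_e = np.array([[0.6, 0.1], [0.1, 0.8]])
    msb, msw = mean_squares(simulate_half_sib(sigma_g, sigma_e, rng=2011))
    print(anova_estimates(msb, msw, n=10)[0])
```

With only 100 sires the estimate of ΣG printed here will vary considerably from seed to seed, which is the sampling-variance problem the penalties are meant to address.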

Penalized REML | Simulation study

Simulation - cont.

Estimate ψ using population values (V∞)

→ construct MSB & MSW
→ validation
→ ψ̂ maximizes log L in validation data
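Choosing ψ̂ this way amounts to a search over candidate values. A minimal sketch, assuming a grid search and two caller-supplied functions: `fit(psi)` returns penalized estimates for a given ψ and `validation_log_lik(estimates)` evaluates the unpenalized log likelihood in the validation data. Both callables and the toy example are hypothetical stand-ins.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Estimate = TypeVar("Estimate")


def choose_psi(
    psi_grid: Iterable[float],
    fit: Callable[[float], Estimate],
    validation_log_lik: Callable[[Estimate], float],
) -> Tuple[float, float]:
    """Return the grid value of psi with the highest validation log L."""
    best_psi, best_value = None, float("-inf")
    for psi in psi_grid:
        value = validation_log_lik(fit(psi))
        if value > best_value:
            best_psi, best_value = psi, value
    if best_psi is None:
        raise ValueError("psi_grid must not be empty")
    return best_psi, best_value


if __name__ == "__main__":
    # Toy stand-ins: the "estimate" is psi itself and the validation
    # log likelihood peaks at psi = 10.
    grid = [0, 1, 10, 100, 1000]
    print(choose_psi(grid, lambda psi: psi, lambda est: -(est - 10) ** 2))
```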

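Evaluate effect of penalty

→ Loss: deviation of Σ̂X from ΣX

$L_1(\Sigma_X, \hat{\Sigma}_X) = \mathrm{tr}(\Sigma_X^{-1}\hat{\Sigma}_X) - \log|\Sigma_X^{-1}\hat{\Sigma}_X| - q$

→ Percentage Reduction In Average Loss

$\mathrm{PRIAL} = 100\left[1 - \dfrac{\bar{L}_1(\Sigma_X, \hat{\Sigma}_X^{\hat{\psi}})}{\bar{L}_1(\Sigma_X, \hat{\Sigma}_X^{0})}\right]$

with $\hat{\Sigma}_X^{\hat{\psi}}$ the penalized and $\hat{\Sigma}_X^{0}$ the unpenalized estimate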
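Both quantities translate directly into a few lines of NumPy. The sketch below assumes the average loss is a plain mean over replicates; function names and example matrices are hypothetical.

```python
import numpy as np


def loss_l1(sigma_true, sigma_hat):
    """L1 = tr(inv(S) @ S_hat) - log|inv(S) @ S_hat| - q."""
    q = sigma_true.shape[0]
    m = np.linalg.solve(sigma_true, sigma_hat)
    sign, logdet = np.linalg.slogdet(m)
    if sign <= 0:
        raise ValueError("both matrices must be positive definite")
    return float(np.trace(m) - logdet - q)


def prial(sigma_true, penalized, unpenalized):
    """Percentage reduction in average loss of penalized vs unpenalized."""
    mean_pen = np.mean([loss_l1(sigma_true, s) for s in penalized])
    mean_unp = np.mean([loss_l1(sigma_true, s) for s in unpenalized])
    return 100.0 * (1.0 - mean_pen / mean_unp)


if __name__ == "__main__":
    # Hypothetical replicates: the penalized estimates sit closer to the truth.
    truth = np.eye(2)
    unpenalized = [1.6 * truth, 0.5 * truth]
    penalized = [1.2 * truth, 0.8 * truth]
    print(prial(truth, penalized, unpenalized))
```

The loss is zero when the estimate equals the population matrix, so a PRIAL of 100 would mean the penalty removed all loss and a PRIAL of 0 that it changed nothing.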

Penalized REML | Results | PRIAL

PRIAL in estimated covariance matrices

[Figure: PRIAL by penalty, one panel each for the genetic, residual and phenotypic covariance matrix. Values printed in the panels:]

Matrix        $P_\lambda^\ell$   $P_\Sigma$   $P_R$
Genetic       71.4               70.6         72.3
Residual      43.4               13.3         37.1
Phenotypic     1.2                1.2          2.2

Penalized REML | Results | PRIAL

PRIAL for Σ̂G – 60 individual cases

in order of PRIAL for $P_\lambda^\ell$

[Figure: PRIAL for Σ̂G in each of the 60 cases, by penalty ($P_\lambda^\ell$, $P_\Sigma$, $P_R$); y-axis ticks from 40 to 100]

Cases along the x-axis: 5K 5L 5J 5D 2K 5H 5F 2F 5E 2L 4F 4K 2H 4L 5I 2D 4H 3K 3F 2E 5C 1D 4D 3L 1C 3H 3D 2J 5G 4J 4E 5B 3E 5A 4I 1G 3C 3A 3B 1E 3G 3J 1F 4A 2I 4C 4B 1K 3I 4G 2C 1B 2G 1H 2B 2A 1L 1J 1A 1I

Penalized REML | Results | PRIAL

Extension: PRIAL “double” penalties

P so far: improve Σ̂G

“Double”

→ $P_\Sigma$ on Σ̂G & Σ̂E
→ $P_R$ on R̂G & R̂E
→ $P_\lambda^\ell$ on log λi & log(1−λi)

→ 1: joint ψ
→ 2: separate ψ

Matrix   Penalty on   ψ   $P_\lambda^\ell$   $P_\Sigma$   $P_R$
Σ̂G      G                71                 71           72
Σ̂G      G+E          1   73                 70           73
Σ̂G      G+E          2   74                 73           74
Σ̂E      G                43                 13           37
Σ̂E      G+E          1   62                 54           60
Σ̂E      G+E          2   65                 65           63

Little impact on PRIAL for Σ̂G

Can ↑↑ PRIAL for Σ̂E w/out ↓ PRIAL for Σ̂G

Separate ψ̂: extra effort, limited add. improvement

Penalized REML | Results | PRIAL

Summary: PRIAL

Substantial improvements in Σ̂G & Σ̂E feasible
→ Results shown ‘optimistic’
→ ψ̂ using pop. values

PRIAL heavily influenced by spread of can. eigenvalues

$P_\lambda^\ell$ best if λi ≈ λ̄; overshrink if not
$P_\Sigma$ & $P_R$ worst if λi ≈ λ̄; more robust than $P_\lambda^\ell$ if λi ≉ λ̄
similar canonical eigenvalues unlikely in practice?

New type of P useful

Penalized REML | Results | Bias

Mean estimates of can. eigenvalues

Penalties act differently – illustrated for single case

λi = 0.5, 0.2, 0.15, 0.1, 0.05; λ̄ = 0.2

[Figure: mean estimates of the canonical eigenvalues λ1 to λ5 (y-axis ticks 0, 0.2, 0.4) for no penalty, $P_\lambda^\ell$ and $P_R$, with population values marked ●]

Penalized REML | Results | Bias

Mean relative bias (in %)

                       1     2     3     4     5
Can. eigenvalues
  None                 9    27    17   -20   -79
  $P_\lambda^\ell$    -4    16    29    58   101
  $P_\Sigma$           8    25    25    39    75
  $P_R$                1    16    21    37    57
Heritabilities
  None                -1     4     5     7    12
  $P_\lambda^\ell$    -7     5    12    23    45
  $P_\Sigma$           1    10    15    26    44
  $P_R$               -2     2     5     9    17
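The slide does not spell out how relative bias is computed. A minimal sketch, assuming the usual definition (mean estimate over replicates minus the population value, as a percentage of the population value); the function name and numbers are hypothetical.

```python
def relative_bias_percent(estimates, true_value):
    """100 * (mean of estimates - true value) / true value."""
    if not estimates:
        raise ValueError("need at least one estimate")
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


if __name__ == "__main__":
    # Hypothetical replicate estimates of a parameter whose true value is 0.2.
    print(relative_bias_percent([0.22, 0.18, 0.26], 0.2))
```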

Penalized REML | Finale

Conclusions

Can obtain ‘better’ estimates of genetic cov. matrices
→ borrow strength from phenotypic covariance matrix
→ easy to implement by penalty on log L

Comparable improvement (PRIAL) from different P
None best overall

$P_R$: Shrinking R̂G towards R̂P
less variable
less bias in estimates of genetic parameters
easier to implement than $P_\lambda^\ell$

Penalized REML recommended!

Small to moderate data sets, ≥ 4 traits

P-REML: part of everyday toolkit