

Genetic Epidemiology 31 : 813–833 (2007)

Quantitative Trait Association in Parent Offspring Trios: Extension of Case/Pseudocontrol Method and Comparison of Prospective and Retrospective Approaches

Eleanor Wheeler,1 and Heather J. Cordell2*

1The Wellcome Trust Sanger Institute, Cambridge, UK
2Institute of Human Genetics, Newcastle University, UK

The case/pseudocontrol method provides a convenient framework for family-based association analysis of case-parent trios, incorporating several previously proposed methods such as the transmission/disequilibrium test and log-linear modelling of parent-of-origin effects. The method allows genotype and haplotype analysis at an arbitrary number of linked and unlinked multiallelic loci, as well as modelling of more complex effects such as epistasis, parent-of-origin effects, maternal genotype and mother-child interaction effects, and gene-environment interactions. Here we extend the method for analysis of quantitative as opposed to dichotomous (e.g. disease) traits. The resulting method can be thought of as a retrospective approach, modelling genotype given trait value, in contrast to prospective approaches that model trait given genotype. Through simulations and analytical derivations, we examine the power and properties of our proposed approach, and compare it to several previously proposed single-locus methods for quantitative trait association analysis. We investigate the performance of the different methods when extended to allow analysis of haplotype, maternal genotype and parent-of-origin effects. With randomly ascertained families, with or without population stratification, the prospective approach (modelling trait value given genotype) is found to be generally most effective, although the retrospective approach has some advantages with regard to estimation and interpretability of parameter estimates when applied to selected samples. Genet. Epidemiol. 31:813–833, 2007. © 2007 Wiley-Liss, Inc.

Key words: family-based; regression; imprinting; TDT

The supplemental materials described in this article can be found at http://www.interscience.wiley.com/jpages/0741-0395/suppmat

Contract grant sponsor: Wellcome Trust; Contract grant numbers: 074524 and 068612.
*Correspondence to: Heather Cordell, Institute of Human Genetics, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK. E-mail: [email protected]
Received 24 November 2006; Revised 15 February 2007; Accepted 27 April 2007
Published online 4 June 2007 in Wiley InterScience (www.interscience.wiley.com).
DOI: 10.1002/gepi.20243

INTRODUCTION

Numerous methods have been proposed to test for association between a quantitative trait and a diallelic locus of interest. In a group of unrelated subjects, simple linear regression can be used to relate the quantitative trait phenotype to the genotype. However, this approach can be adversely affected by population stratification [Gauderman, 2003] and hence family-based designs are often preferred. Perhaps the simplest family-based design is to genotype a sample of unrelated phenotyped individuals and their parents, generating a set of parent-offspring trios. Analogous to the transmission/disequilibrium test (TDT) for disease traits [Spielman et al., 1993], tests that are robust to stratification can be derived by focussing on the transmission of the parental alleles to the offspring. If a given marker is not linked to a quantitative trait locus (QTL) (so that the marker alleles are transmitted randomly from parents to offspring), offspring quantitative phenotype is independent of offspring marker genotype given the parental marker genotypes [Whittaker et al., 2003]. This observation led Whittaker et al. [2003] and Gauderman [2003] to propose a test that is robust to population stratification, by adding terms that code for parental mating type into the linear regression equation. This approach (denoted QTDT_M by Gauderman [2003]) is closely related to tests previously proposed by Allison [1997] and Lunetta et al. [2000] (for details, see Gauderman [2003]).

An alternative approach was proposed by Fulker et al. [1999] and Abecasis et al. [2000]. These authors added terms to the linear regression model to separate out the within and between mating type information. The within mating type test was referred to as the hierarchical QTDT (HQTDT) by Gauderman [2003] and is the method implemented in the QTDT program [Abecasis et al., 2002]. Gauderman [2003] noted that for parent-offspring trios, HQTDT and QTDT_M are virtually identical with regard to inference about the effects of interest (genotype effects on the trait); however HQTDT has the advantage that it has also been extended to apply to general pedigrees [Abecasis et al., 2002]. Yang et al. [2000] described a similar model to the HQTDT. The differences between the models have little or no effect on the estimates and test of interest [Gauderman, 2003], and the Yang et al. [2000] method was therefore treated as equivalent to HQTDT by Gauderman [2003].

The above models are prospective models, that is, they model the quantitative phenotype in terms of the offspring genotype. They all assume that the quantitative traits are normally distributed, or else rely on the central limit theorem. To protect against the effects of possible deviations from either normality or selection on the trait, the HQTDT implemented in the QTDT program can carry out a permutation procedure based on permutation of genotypes to produce an empirical P-value. The connection between maximum likelihood inference under an assumed normal distribution and least-squares regression, however, means that in general we expect these regression-based methods to be reasonably robust to small deviations of the trait from normality, even without use of permutation arguments.

An alternative approach is to use a retrospective approach, in which offspring genotype is modelled as a function of quantitative phenotype (possibly given parental genotypes). This approach is more akin to the original TDT of Spielman et al. [1993]. Such an approach provides the rationale for the family-based association test (FBAT) of Laird et al. [2000], although Lange et al. [2002] showed that this retrospective FBAT approach is in fact equivalent to implementing the HQTDT of Abecasis et al. [2000] via a score test rather than via a likelihood ratio test. Kistner and Weinberg [2004, 2005] describe a retrospective approach in which the offspring genotype is modelled as a function of their phenotype and parental genotypes, making no explicit assumptions about the distribution of the quantitative trait. This model, called the quantitative polytomous logistic (QPL) model, can be thought of as an extension of the log-linear model proposed by Weinberg et al. [1998] for qualitative traits.

The log-linear model for qualitative traits proposed by Weinberg et al. [1998] is very similar to the case/pseudocontrol approach for case-parent trios proposed by Cordell and Clayton [2002] (described in more detail by Cordell et al. [2004]). The main difference between the two approaches is that Cordell and Clayton [2002] model offspring genotypes conditional both on parental genotype and ascertainment through the affected offspring, whereas Weinberg et al. [1998] model the frequencies of the 15 possible trio types consisting of offspring genotype and parental mating type. The case/pseudocontrol approach can be thought of as a generalization of the TDT and the approaches of Schaid and Sommer [1993], Schaid [1996] and Weinberg et al. [1998], generalized to allow the fitting of more complex models where several linked and/or unlinked loci may contribute to disease via a combination of offspring and/or maternal genotype or haplotype effects, parent-of-origin effects and gene-gene or gene-environment interactions. Given the flexibility of the case/pseudocontrol approach, here we extend this approach to deal with quantitative traits, and compare the resulting method to original and extended versions of previously proposed quantitative trait association approaches.

METHODS

THE QTDT_M

Gauderman [2003] and Whittaker et al. [2003] incorporate parental mating type as a fixed effect in a linear regression model. The model is of the form

$$y_i = \alpha_M + \beta_{g_i} + \epsilon_i \tag{1}$$

where $y_i$ denotes a quantitative phenotype, $g_i$ the genotype at a particular locus for the $i$th individual ($g_i = 0, 1, 2$ according to whether the genotype is 1/1, 1/2 (or 2/1) or 2/2), and $\alpha_M$ ($M = 1, \ldots, 6$) are mating-type specific intercepts. The residual $\epsilon_i$ is assumed to be normally distributed with mean 0 and variance $\sigma^2$. The formulation above suggests that the model is parameterized in terms of six mating-type parameters (the $\alpha_M$) and three child-genotype parameters ($\beta_0$, $\beta_1$ and $\beta_2$), but, in fact, only two child-genotype parameters are estimable, with one of the genotype categories being chosen as the reference genotype category (the baseline genotype category to which the other genotype effects are compared). For example, if 1/1 is chosen as the reference genotype, $\beta_0$ is set equal to 0 and the test of no association between $y$ and $g$ is $H_0: \beta_1 = \beta_2 = 0$. Alternatively, if the heterozygous genotype 1/2 is chosen as the reference genotype, $\beta_1$ is set equal to 0 and the test of no association between $y$ and $g$ is $H_0: \beta_0 = \beta_2 = 0$.
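To make the parameterization of equation (1) concrete, the following is a minimal sketch of a least-squares fit with six mating-type intercepts and two genotype dummies (1/1 as reference). It is not the implementation used by Gauderman [2003] or Whittaker et al. [2003]; the function names, the ordering of the mating types (00, 02, 22, 01, 11, 12) and the toy data are assumptions made purely for illustration, and the test of $H_0$ itself is not included.

```python
# Sketch of the regression y_i = alpha_M + beta_{g_i} + e_i with
# genotype 1/1 (g = 0) as the reference category (beta_0 = 0).
# All names and the toy data below are hypothetical.

def design_row(mating_type, g, n_mating_types=6):
    """Indicator columns for the mating type, then dummies for g = 1 and g = 2."""
    row = [0.0] * (n_mating_types + 2)
    row[mating_type] = 1.0                      # mating-type specific intercept
    if g > 0:
        row[n_mating_types + g - 1] = 1.0       # beta_1 or beta_2
    return row

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_qtdtm(trios):
    """trios: list of (mating_type, g, y). Returns the least-squares estimates
    (six intercepts followed by beta_1 and beta_2) via the normal equations."""
    X = [design_row(mt, g) for mt, g, _ in trios]
    y = [t[2] for t in trios]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Noise-free toy data: each compatible (mating type, genotype) cell once.
alpha = [10.0, 11.0, 12.0, 10.5, 11.5, 12.5]
beta = [0.0, 0.4, 1.0]
cells = [(0, 0), (1, 1), (2, 2), (3, 0), (3, 1),
         (4, 0), (4, 1), (4, 2), (5, 1), (5, 2)]
trios = [(mt, g, alpha[mt] + beta[g]) for mt, g in cells]
estimates = fit_qtdtm(trios)
beta1_hat, beta2_hat = estimates[6], estimates[7]
```

Because the toy traits contain no residual noise, the two genotype effects are recovered up to rounding error; with real data one would normally use a standard regression routine and compare the fit with and without the genotype terms.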

Similar to the HQTDT of Abecasis et al. [2000], the QTDT_M draws information from both within and between mating types. The HQTDT models the differences across mating types using a between mating type parameter whereas the QTDT_M uses the multiple fixed intercepts $\alpha_M$. Gauderman [2003] notes that inferences for the genotype effects are the same using the HQTDT and QTDT_M methods. The differences between the methods are in the estimates and interpretation of the mating-type specific intercepts, which are treated as random effects in the HQTDT and fixed effects in QTDT_M.

THE QPL METHOD

Kistner and Weinberg [2004, 2005] describe a retrospective model in which offspring genotype is modelled as a function of offspring phenotype and parental genotypes. The model is an extension of the log-linear model proposed by Weinberg et al. [1998] for qualitative traits and is fit using a polytomous logistic model with a generalized logit link function. Assuming parental mating symmetry in the population, there are six distinct parental mating types and the offspring genotype is modelled conditional on the offspring’s quantitative trait ($y_i$) and the parental mating type. Let $S_M$ denote the set of possible offspring genotypes for mating type $M$. The summation $\sum_{g^*_i \in S_M}$ denotes the restricted summation over the offspring genotypes consistent with $S_M$. The contribution of a trio to the likelihood is modelled as

$$P(g_i \mid g_{im}, g_{if}, y_i) = \frac{\exp(\beta''_{g_i} y_i + \alpha''_{M g_i})}{\sum_{g^*_i \in S_M} \exp(\beta''_{g^*_i} y_i + \alpha''_{M g^*_i})} \tag{2}$$

where $\beta''_g$ are parameters representing association between quantitative trait and genotype, and $\alpha''_{Mg}$ are nuisance parameters to account for non-Mendelianism and/or population stratification and depend on both parental mating type and the offspring genotype. In the formulation described by Kistner and Weinberg [2004], these parameters are denoted as $\beta_g$ and $\alpha_{Mg}$, but here we instead use the double primed notation ($\beta''_g$ and $\alpha''_{Mg}$) to distinguish these parameters from the parameters $\beta_g$ and $\alpha_M$ used in the prospective formulation of equation (1). Kistner and Weinberg [2004] code the offspring, maternal and paternal genotypes as 0, 1 or 2 depending on the number of ‘variant’ alleles (here considered to be allele 2) they carry (the results of the test will be the same regardless of which allele is considered to be the ‘variant’). For the $i$th trio, let the offspring’s quantitative trait value be denoted by $y_i$. Column 3 of Table I shows the conditional likelihoods for all combinations of parent and offspring genotypes that would result from equation (2) if all the parameters were estimable. Note that there are three child genotype parameters ($\beta''_0$, $\beta''_1$ and $\beta''_2$) and seven nuisance parameters ($\alpha''_{010}$, $\alpha''_{011}$, $\alpha''_{110}$, $\alpha''_{111}$, $\alpha''_{112}$, $\alpha''_{121}$, $\alpha''_{122}$) with $\alpha''_{jkl}$ corresponding to the nuisance parameter for the category with (unordered) parental genotypes $j$ and $k$ and child genotype $l$. In practice, this model is overparameterized, and Kistner and Weinberg [2004] treat the heterozygous offspring genotype, 1 (1/2 or 2/1), as the reference genotype category. This results in a final model with six estimated parameters ($\beta''_0$, $\beta''_2$, $\alpha''_{010}$, $\alpha''_{110}$, $\alpha''_{112}$, and $\alpha''_{122}$), with $\beta''_1$, $\alpha''_{011}$, $\alpha''_{111}$ and $\alpha''_{121}$ being set equal to zero, as shown in the fourth column of Table I. An alternative parameterization for the child’s genotype parameters (which we shall use in our simulation study) would be to set $\beta''_0$ to zero and to estimate $\beta''_1$ and $\beta''_2$. Regardless of the parameterization chosen, the categorization into three possible offspring genotype categories means that there are a maximum of three possible terms in the denominator of equation (2), as can be seen in columns 3 and 4 of Table I.
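The conditional probability in equation (2) can be evaluated directly once the set $S_M$ is tabulated. The sketch below does this for the parameterization with the heterozygous genotype as reference; the function name, the dictionary layout and the numerical parameter values are hypothetical and serve only to show the at-most-three-term denominator.

```python
import math

# Offspring genotypes (0, 1 or 2 copies of allele 2) compatible with each
# unordered parental mating type; the sum in equation (2) runs over these.
S_M = {
    (0, 0): [0], (0, 2): [1], (2, 2): [2],
    (0, 1): [0, 1], (1, 1): [0, 1, 2], (1, 2): [1, 2],
}

def qpl_contribution(g, gm, gf, y, beta, alpha):
    """Equation (2): P(g | gm, gf, y) under the QPL model.

    beta[l] stands for beta''_l and alpha[(j, k, l)] for alpha''_{jkl};
    nuisance parameters missing from `alpha` are treated as fixed at zero.
    """
    j, k = sorted((gm, gf))                     # parental genotypes are unordered

    def term(l):
        return math.exp(beta[l] * y + alpha.get((j, k, l), 0.0))

    return term(g) / sum(term(l) for l in S_M[(j, k)])

# Heterozygous offspring genotype as reference: beta''_1 = 0 and
# alpha''_{011} = alpha''_{111} = alpha''_{121} = 0 (values are made up).
beta = {0: -0.3, 1: 0.0, 2: 0.5}
alpha = {(0, 1, 0): 0.1, (1, 1, 0): -0.7, (1, 1, 2): -0.7, (1, 2, 2): 0.2}
probs = [qpl_contribution(g, 1, 1, 1.5, beta, alpha) for g in (0, 1, 2)]
```

For the uninformative mating types (00, 02 and 22) the function returns 1 whatever the parameter values, matching the first three rows of Table I.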

THE QCPG METHOD

The method we propose is closely related to the approach by Kistner and Weinberg [2004, 2005], but the parameterization and implementation of the methods are somewhat different. Our approach derives from the case/pseudocontrol method for dichotomous traits [Cordell and Clayton, 2002, Cordell et al., 2004]. This method involves constructing (from a sample of case-parent trios) a sample of cases and matched pseudocontrols. We focus here on the ‘conditioning on parental genotypes’ (CPG) approach of Cordell et al. [2004], which generates pseudocontrols conditional on the mother’s and father’s genotypes (and possibly also conditional on some other event $x$, such as phase or parent-of-origin being determinable).

The extension of the CPG method to quantitative traits, here named the quantitative CPG (QCPG), is based on a calculation of the conditional likelihood of the offspring genotypes, conditional on the parental genotypes and the offspring phenotypes. For family $i$, let $g_i$, $g_{im}$, $g_{if}$ be the offspring, maternal and paternal genotypes respectively, and let $y_i$ be the offspring’s quantitative trait. Then,

$$\begin{aligned}
P(g_i \mid g_{im}, g_{if}, y_i) &= \frac{P(g_i, g_{im}, g_{if}, y_i)}{P(g_{im}, g_{if}, y_i)} = \frac{P(g_i, g_{im}, g_{if}, y_i)}{\sum_{g^*_i} P(g^*_i, g_{im}, g_{if}, y_i)} \\
&= \frac{P(y_i \mid g_i, g_{im}, g_{if})\, P(g_i \mid g_{im}, g_{if})\, P(g_{im}, g_{if})}{\sum_{g^*_i} P(y_i \mid g^*_i, g_{im}, g_{if})\, P(g^*_i \mid g_{im}, g_{if})\, P(g_{im}, g_{if})} \\
&= \frac{P(y_i \mid g_i)\, P(g_i \mid g_{im}, g_{if})}{\sum_{g^*_i \in S'_M} P(y_i \mid g^*_i)\, P(g^*_i \mid g_{im}, g_{if})}
\end{aligned} \tag{3}$$

where $\sum_{g^*_i}$ denotes summation over the four possible offspring genotypes and $\sum_{g^*_i \in S'_M}$ denotes summation over all possible offspring genotypes that could have been transmitted to the offspring given the parental genotypes (the probabilities $P(g^*_i \mid g_{im}, g_{if}) = 0$ for offspring genotypes that are inconsistent with the parental genotypes). This is of the form of the case/pseudocontrol likelihood for qualitative traits [Cordell and Clayton, 2002] with the offspring’s affection status replaced by a quantitative phenotype. The likelihood can be calculated via conditional logistic regression as implemented in standard statistical software. In Appendix A we show that the contribution of a trio to the likelihood may be assumed to be of the form

$$\frac{\exp(\beta'_{g_i} y_i + \alpha'_{M g_i})}{\sum_{g^*_i \in S'_M} \exp(\beta'_{g^*_i} y_i + \alpha'_{M g^*_i})}. \tag{4}$$

Here $\beta'_{g_i}$ represent genotype effects, and $\alpha'_{M g_i}$ are nuisance parameters modelling non-Mendelianism and population stratification. The likelihood is similar to Kistner and Weinberg’s QPL likelihood equation (2), except for the summation in the denominator. Columns 3 and 4 of Table II show the conditional probabilities corresponding to all combinations of parent and offspring genotypes for the QCPG method. By comparing column 4 of Tables I and II, and ignoring constants of proportionality, it can be seen that the QCPG likelihood is identical to the QPL likelihood, except for the offspring of two heterozygous parents. With two heterozygous parents, the sum in the denominator of the QCPG likelihood is over a maximum of four possible offspring genotypes. However, Kistner and Weinberg sum over a maximum of three possible offspring genotypes. For example, for a heterozygous offspring, the contribution to the QPL likelihood is

$$\frac{1}{\exp(\beta''_0 y + \alpha''_{110}) + 1 + \exp(\beta''_2 y + \alpha''_{112})}$$

whereas for the QCPG method the contribution to the likelihood is

$$\frac{1}{\exp(\beta'_0 y + \alpha'_{110}) + 1 + 1 + \exp(\beta'_2 y + \alpha'_{112})}.$$

Essentially, in the QCPG formulation, we distinguish between the two possible heterozygote offspring genotypes 1/2 and 2/1 in the summation in the denominator (although in practice, assuming no parent-of-origin effects, the likelihood will be identical regardless of whether the observed offspring has genotype 1/2 or 2/1). As a result of the different likelihood formulations, interpretation of the nuisance parameters is different for the QCPG method and Kistner and Weinberg’s QPL. The difference is most noticeable under the null with no population stratification or non-Mendelianism. In Appendix A we show that, for the QCPG method, the true values of the nuisance parameters $\alpha'$ under the null with no population stratification or non-Mendelianism equal zero, whereas in Appendix B we show that for Kistner and Weinberg’s QPL method, the true values of the nuisance parameters $\alpha''$ are non-zero. Provided all six estimable parameters (two offspring genotype effects and four nuisance parameters) are freely estimated during likelihood maximization, inference for the parameters of interest ($\beta''_0$ and $\beta''_2$, or $\beta''_1$ and $\beta''_2$, depending on which genotype category is chosen as reference) should not be affected by this result. However, the QPL result is slightly counter-intuitive, as one would generally expect that parameters that are specifically included in the likelihood to model certain effects (such as population stratification or non-Mendelianism) would take the value zero (i.e. be removable from the likelihood) when these effects do not, in fact, exist.

TABLE I. The conditional likelihoods associated with the QPL model

| Parents ($g_m$, $g_f$) | Offspring ($g$) | QPL likelihood: overparameterized model | QPL likelihood: six estimable parameters estimated |
|---|---|---|---|
| 00 | 0 | 1 | 1 |
| 02 | 1 | 1 | 1 |
| 22 | 2 | 1 | 1 |
| 01 | 0 | $\frac{\exp(\beta''_0 y + \alpha''_{010})}{\exp(\beta''_0 y + \alpha''_{010}) + \exp(\beta''_1 y + \alpha''_{011})}$ | $\frac{\exp(\beta''_0 y + \alpha''_{010})}{\exp(\beta''_0 y + \alpha''_{010}) + 1}$ |
| 01 | 1 | $\frac{\exp(\beta''_1 y + \alpha''_{011})}{\exp(\beta''_0 y + \alpha''_{010}) + \exp(\beta''_1 y + \alpha''_{011})}$ | $\frac{1}{\exp(\beta''_0 y + \alpha''_{010}) + 1}$ |
| 11 | 0 | $\frac{\exp(\beta''_0 y + \alpha''_{110})}{\exp(\beta''_0 y + \alpha''_{110}) + \exp(\beta''_1 y + \alpha''_{111}) + \exp(\beta''_2 y + \alpha''_{112})}$ | $\frac{\exp(\beta''_0 y + \alpha''_{110})}{\exp(\beta''_0 y + \alpha''_{110}) + 1 + \exp(\beta''_2 y + \alpha''_{112})}$ |
| 11 | 1 | $\frac{\exp(\beta''_1 y + \alpha''_{111})}{\exp(\beta''_0 y + \alpha''_{110}) + \exp(\beta''_1 y + \alpha''_{111}) + \exp(\beta''_2 y + \alpha''_{112})}$ | $\frac{1}{\exp(\beta''_0 y + \alpha''_{110}) + 1 + \exp(\beta''_2 y + \alpha''_{112})}$ |
| 11 | 2 | $\frac{\exp(\beta''_2 y + \alpha''_{112})}{\exp(\beta''_0 y + \alpha''_{110}) + \exp(\beta''_1 y + \alpha''_{111}) + \exp(\beta''_2 y + \alpha''_{112})}$ | $\frac{\exp(\beta''_2 y + \alpha''_{112})}{\exp(\beta''_0 y + \alpha''_{110}) + 1 + \exp(\beta''_2 y + \alpha''_{112})}$ |
| 12 | 1 | $\frac{\exp(\beta''_1 y + \alpha''_{121})}{\exp(\beta''_1 y + \alpha''_{121}) + \exp(\beta''_2 y + \alpha''_{122})}$ | $\frac{1}{1 + \exp(\beta''_2 y + \alpha''_{122})}$ |
| 12 | 2 | $\frac{\exp(\beta''_2 y + \alpha''_{122})}{\exp(\beta''_1 y + \alpha''_{121}) + \exp(\beta''_2 y + \alpha''_{122})}$ | $\frac{\exp(\beta''_2 y + \alpha''_{122})}{1 + \exp(\beta''_2 y + \alpha''_{122})}$ |

The likelihoods are proportional to $P(g \mid g_m, g_f, y)$, corresponding to all combinations of (unordered) parent and offspring genotypes.
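The contrast between the three-term and four-term denominators for two heterozygous parents can be checked numerically. The sketch below evaluates both forms under the null (all genotype effects zero) and compares them with the Mendelian proportions 1/4, 1/2, 1/4. The function names are hypothetical, and the value $-\log 2$ used for the QPL nuisance parameters follows from the arithmetic of this example rather than from Appendix B, which is not reproduced in this section.

```python
import math

def qpl_het_parents(alpha110, alpha112, beta0=0.0, beta2=0.0, y=0.0):
    """QPL probabilities of offspring genotypes 0, 1, 2 for mating type 11
    (three terms in the denominator, heterozygote as reference)."""
    t0 = math.exp(beta0 * y + alpha110)
    t2 = math.exp(beta2 * y + alpha112)
    denom = t0 + 1.0 + t2
    return [t0 / denom, 1.0 / denom, t2 / denom]

def qcpg_het_parents(alpha110, alpha112, beta0=0.0, beta2=0.0, y=0.0):
    """QCPG probabilities for mating type 11 (four terms: 1/1, 1/2, 2/1, 2/2).

    The 1/2 and 2/1 terms are added together so that the result can be
    compared with the unordered QPL genotype categories.
    """
    t0 = math.exp(beta0 * y + alpha110)
    t2 = math.exp(beta2 * y + alpha112)
    denom = t0 + 1.0 + 1.0 + t2
    return [t0 / denom, 2.0 / denom, t2 / denom]

mendel = [0.25, 0.5, 0.25]
qcpg_null = qcpg_het_parents(0.0, 0.0)                         # nuisance terms zero
qpl_zero = qpl_het_parents(0.0, 0.0)                           # gives 1/3 each
qpl_shifted = qpl_het_parents(-math.log(2.0), -math.log(2.0))  # matches Mendel
```

With zero nuisance parameters the QCPG form reproduces the Mendelian proportions, whereas the QPL form does so only when the two nuisance parameters for this mating type are shifted away from zero, which is consistent with the statement above about the null values of $\alpha'$ and $\alpha''$.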

An additional difference between the QCPG and QPL arises with regard to the number of nuisance parameters estimated (see Appendix C).

Liu et al. [2002] described a method that is closely related to the QCPG. Although the nuisance parameters are different, the likelihood is essentially of the same form. Rather than having six possible parental mating type parameters ($\alpha_M$), Liu et al. [2002] have a single baseline parameter, $\alpha$, and a number of additional parameters for the clusters of individuals whose trait differs from the population mean due to population stratification, $d_i$. However, without knowing the underlying population stratification, the $d_i$ parameters are unknown and cannot be estimated from the data. Liu et al. [2002] avoid having to estimate these nuisance parameters by showing that even in the presence of unobservable population stratification, it is still valid to test the null of no genetic effect via a score test, since population stratification has no effect on the null distribution of the test. Gauderman [2003] refers to the method of Liu et al. [2002] as the retrospective QTDT (RQTDT). To implement this method via a likelihood ratio test, Gauderman assumed that the quantitative trait follows a normal distribution with mean $\alpha + \beta g_i$ and variance $\sigma^2$, without consideration of the $d_i$ parameters. In this implementation, $\alpha$ does not use information on the parental genotypes to model population stratification, although some information on the parental genotypes is still incorporated via the genotypes of the offspring and the pseudocontrols.
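As an illustration of how a normal trait model leads to a retrospective likelihood of exponential form, the sketch below computes $P(g \mid \text{parents}, y)$ for two heterozygous parents in two ways: directly from normal densities with mean $\alpha + \beta g$, and from the equivalent log-linear expression obtained by cancelling the terms that do not depend on $g$. The function names and numerical values are hypothetical, and the algebra in the comments is a sketch under the stated normal assumption, not a reproduction of the paper’s appendices.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def retrospective_probs(y, transmissions, alpha, beta, sd):
    """P(g | parents, y) over equally likely transmissions,
    assuming y | g ~ Normal(alpha + beta * g, sd^2)."""
    dens = [normal_pdf(y, alpha + beta * g, sd) for g in transmissions]
    total = sum(dens)
    return [d / total for d in dens]

def loglinear_probs(y, transmissions, alpha, beta, sd):
    """The same probabilities written as exp(b_g * y + a_g) / sum(...), where
    b_g = beta * g / sd^2 and a_g = -(2*alpha*beta*g + (beta*g)^2) / (2*sd^2).
    Factors free of g cancel between numerator and denominator."""
    terms = [
        math.exp(beta * g * y / sd ** 2
                 - (2.0 * alpha * beta * g + (beta * g) ** 2) / (2.0 * sd ** 2))
        for g in transmissions
    ]
    total = sum(terms)
    return [t / total for t in terms]

# Two heterozygous parents: transmissions 1/1, 1/2, 2/1, 2/2 coded as 0, 1, 1, 2.
transmissions = [0, 1, 1, 2]
p_normal = retrospective_probs(1.2, transmissions, 0.5, 0.3, 1.1)
p_loglin = loglinear_probs(1.2, transmissions, 0.5, 0.3, 1.1)
```

The two lists agree to rounding error, and setting `beta` to zero makes every offset term vanish, so each of the four transmissions then has probability 1/4.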

TABLE II. The conditional likelihoods associated with the QCPG model

| Parents ($g_m$, $g_f$) | Offspring ($g$) | QCPG likelihood: overparameterized model | QCPG likelihood: six estimable parameters estimated |
|---|---|---|---|
| 00 | 0 | 1 | 1 |
| 02 | 1 | 1 | 1 |
| 22 | 2 | 1 | 1 |
| 01 | 0 | $\frac{\exp(\beta'_0 y + \alpha'_{010})}{2\exp(\beta'_0 y + \alpha'_{010}) + 2\exp(\beta'_1 y + \alpha'_{011})}$ | $\frac{\exp(\beta'_0 y + \alpha'_{010})}{2\exp(\beta'_0 y + \alpha'_{010}) + 2}$ |
| 01 | 1 | $\frac{\exp(\beta'_1 y + \alpha'_{011})}{2\exp(\beta'_0 y + \alpha'_{010}) + 2\exp(\beta'_1 y + \alpha'_{011})}$ | $\frac{1}{2\exp(\beta'_0 y + \alpha'_{010}) + 2}$ |
| 11 | 0 | $\frac{\exp(\beta'_0 y + \alpha'_{110})}{\exp(\beta'_0 y + \alpha'_{110}) + 2\exp(\beta'_1 y + \alpha'_{111}) + \exp(\beta'_2 y + \alpha'_{112})}$ | $\frac{\exp(\beta'_0 y + \alpha'_{110})}{\exp(\beta'_0 y + \alpha'_{110}) + 2 + \exp(\beta'_2 y + \alpha'_{112})}$ |
| 11 | 1 | $\frac{\exp(\beta'_1 y + \alpha'_{111})}{\exp(\beta'_0 y + \alpha'_{110}) + 2\exp(\beta'_1 y + \alpha'_{111}) + \exp(\beta'_2 y + \alpha'_{112})}$ | $\frac{1}{\exp(\beta'_0 y + \alpha'_{110}) + 2 + \exp(\beta'_2 y + \alpha'_{112})}$ |
| 11 | 2 | $\frac{\exp(\beta'_2 y + \alpha'_{112})}{\exp(\beta'_0 y + \alpha'_{110}) + 2\exp(\beta'_1 y + \alpha'_{111}) + \exp(\beta'_2 y + \alpha'_{112})}$ | $\frac{\exp(\beta'_2 y + \alpha'_{112})}{\exp(\beta'_0 y + \alpha'_{110}) + 2 + \exp(\beta'_2 y + \alpha'_{112})}$ |
| 12 | 1 | $\frac{\exp(\beta'_1 y + \alpha'_{121})}{2\exp(\beta'_1 y + \alpha'_{121}) + 2\exp(\beta'_2 y + \alpha'_{122})}$ | $\frac{1}{2 + 2\exp(\beta'_2 y + \alpha'_{122})}$ |
| 12 | 2 | $\frac{\exp(\beta'_2 y + \alpha'_{122})}{2\exp(\beta'_1 y + \alpha'_{121}) + 2\exp(\beta'_2 y + \alpha'_{122})}$ | $\frac{\exp(\beta'_2 y + \alpha'_{122})}{2 + 2\exp(\beta'_2 y + \alpha'_{122})}$ |

The likelihoods are proportional to $P(g \mid g_m, g_f, y)$, corresponding to all combinations of (unordered) parent and offspring genotypes.

EXTENSION OF QCPG TO MULTI-LOCUS HAPLOTYPES

Cordell et al. [2004] showed that the case/pseudocontrol approach can easily be extended to fit models for parent-of-origin effects, multiallelic markers, multiple linked loci in multiple unlinked regions, and gene-gene and gene-environment interactions, via an adjustment to the conditioning argument that results in differing numbers of pseudocontrols depending on the model being fitted. Here we extend this approach to quantitative traits. Consider models in which the genotype effects depend only on the child’s phased genotype. Define $g_i$, $g_{im}$, $g_{if}$ as the offspring, maternal and paternal phase-known genotypes respectively, and $y_i$ as the offspring’s quantitative trait. The likelihood is very similar to that in equation (3) but we define an event $x$ as the event that the set of transmitted and untransmitted haplotypes from the parents can be deduced. The contribution to the conditional likelihood is

$$\begin{aligned}
P(g_i \mid g_{im}, g_{if}, y_i, x) &= \frac{P(g_i, g_{im}, g_{if}, y_i, x)}{P(g_{im}, g_{if}, y_i, x)} = \frac{P(g_i, g_{im}, g_{if}, y_i, x)}{\sum_{g^*_i} P(g^*_i, g_{im}, g_{if}, y_i, x)} \\
&= \frac{P(y_i \mid g_i, g_{im}, g_{if}, x)\, P(g_i \mid g_{im}, g_{if}, x)\, P(g_{im}, g_{if}, x)}{\sum_{g^*_i} P(y_i \mid g^*_i, g_{im}, g_{if}, x)\, P(g^*_i \mid g_{im}, g_{if}, x)\, P(g_{im}, g_{if}, x)} \\
&= \frac{P(y_i \mid g_i)\, P(g_i \mid g_{im}, g_{if}, x)}{\sum_{g^*_i \in S'_M} P(y_i \mid g^*_i)\, P(g^*_i \mid g_{im}, g_{if}, x)}
\end{aligned}$$

where $\sum_{g^*_i}$ denotes summation over all possible offspring genotypes and $\sum_{g^*_i \in S'_M}$ denotes summation over all possible offspring genotypes that could have been transmitted to the offspring given the parental genotypes. Under Mendelian inheritance the probabilities $P(g^*_i \mid g_{im}, g_{if}, x)$ are equal for all $g^*_i \in \{S'_M \cap G_x\}$ and equal zero otherwise, where $G_x$ denotes the set of offspring genotypes determined by $x$. Then the contribution to the likelihood under Mendelian inheritance is given by

$$P(g_i \mid g_{im}, g_{if}, y_i, x) = \frac{P(y_i \mid g_i)}{\sum_{g^*_i \in \{S'_M \cap G_x\}} P(y_i \mid g^*_i)}.$$

Note that, provided the models that are to be fitted do not depend on phase, one could also use the QCPG for analysis of unphased multilocus genotype data, in the same way that the CPG method can be used for unphased genotype data [Cordell et al., 2004].
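The construction of a case and its matched pseudocontrols from phase-known parental haplotypes, and the resulting likelihood contribution under Mendelian inheritance, can be sketched as follows. The function names, the string coding of haplotypes and the toy trait density are hypothetical; the sketch assumes phase is fully known, so the restriction to $G_x$ is not modelled.

```python
import math
from itertools import product

def possible_offspring(mother, father):
    """All four phased offspring genotypes, each written as
    (haplotype from mother, haplotype from father)."""
    return [(hm, hf) for hm, hf in product(mother, father)]

def case_and_pseudocontrols(child, mother, father):
    """Split the four possible transmissions into the observed case
    and the three matched pseudocontrols."""
    options = possible_offspring(mother, father)
    idx = options.index(child)
    return options[idx], options[:idx] + options[idx + 1:]

def contribution(child, mother, father, y, density):
    """P(y | g) divided by the sum of P(y | g*) over the four transmissions,
    for a user-supplied function density(y, g)."""
    options = possible_offspring(mother, father)
    return density(y, child) / sum(density(y, g) for g in options)

def toy_density(y, g):
    """Hypothetical trait model: unit-variance normal whose mean rises by
    0.5 for each copy of haplotype '1-2' (normalizing constant omitted)."""
    mean = 0.5 * g.count("1-2")
    return math.exp(-0.5 * (y - mean) ** 2)

mother = ("1-1", "1-2")      # made-up two-locus haplotypes
father = ("2-1", "2-2")
child = ("1-2", "2-1")
case, pseudo = case_and_pseudocontrols(child, mother, father)
lik = contribution(child, mother, father, 1.0, toy_density)
```

When the density does not depend on the genotype, every trio contributes exactly 1/4, which is the expected behaviour under the null for fully informative parents.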

EXTENSIONS FOR MATERNAL GENOTYPE AND PARENT-OF-ORIGIN EFFECTS

Kistner et al. [2006] proposed an extension to the QPL approach to allow testing for maternally mediated effects and parent-of-origin effects. The likelihood factors into two parts. The first factor tests for genotype effects in the offspring and can be modelled using the original QPL method. The second factor tests for maternal genotype or parent-of-origin effects via a logistic regression model. Maternal genotype effects are incorporated by modelling the probability that the mother has more copies of the variant allele than the father for each mating type. Parent-of-origin effects are incorporated by additionally including a binary indicator variable, indicating whether the offspring inherited only one copy of the variant allele. This implies the child is heterozygous, and since the mother has more copies of the variant allele than the father, the variant allele must have been inherited from the mother.

We may also extend the QCPG method to allow formaternal genotype and parent-of-origin effects. The‘conditioning on exchangeable parental genotypes’(CEPG) method [Cordell et al., 2004] is an extensionof the CPG approach to detecting parent-of-originor maternal genotype effects by assuming exchange-ability of parental genotypes. The method conditionson the set of parental genotypes but not on theirorder, generating additional pseudocontrols con-structed by exchanging the genotypes of the motherand father. Here we extend this approach toquantitative traits. Maternal genotype effects aredefined to be the direct effect of the maternalgenotype on the offspring’s quantitative trait, andparent-of-origin effects are defined (as in Weinberget al. [1998]) to allow the offspring’s quantitativetrait to vary according to the parental origin of thevariant allele, if present. Like Cordell et al. [2004], weintroduce an additional conditioning event x corre-sponding to the event that parent-of-origin andmaternal genotype can be deduced in the trio. Forquantitative trait CEPG method (denoted hereQCEPG), the contribution of a trio to the QCEPGlikelihood is

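\begin{align}
P(g_i, g_{im}, g_{if} \mid \{g_{im}, g_{if}\}, y_i, x)
&= \frac{P(g_i, g_{im}, g_{if}, \{g_{im}, g_{if}\}, y_i, x)}{P(\{g_{im}, g_{if}\}, y_i, x)} \notag \\
&= \frac{P(g_i, g_{im}, g_{if}, \{g_{im}, g_{if}\}, y_i, x)}{\sum_{g^*_i, g^*_{im}, g^*_{if}} P(g^*_i, g^*_{im}, g^*_{if}, \{g_{im}, g_{if}\}, y_i, x)} \notag \\
&= \frac{P(y_i \mid g_i, g_{im}, g_{if}, x)\, P(g_i, g_{im}, g_{if} \mid \{g_{im}, g_{if}\}, x)\, P(\{g_{im}, g_{if}\}, x)}{\sum_{g^*_i, g^*_{im}, g^*_{if}} P(y_i \mid g^*_i, g^*_{im}, g^*_{if}, x)\, P(g^*_i, g^*_{im}, g^*_{if} \mid \{g_{im}, g_{if}\}, x)\, P(\{g_{im}, g_{if}\}, x)} \notag \\
&= \frac{P(y_i \mid g_i, g_{im}, g_{if}, x)}{\sum_{g^*_i, g^*_{im}, g^*_{if} \in G_x} P(y_i \mid g^*_i, g^*_{im}, g^*_{if}, x)} \tag{5}
\end{align}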

where $\{g_{im}, g_{if}\}$ denotes the unordered set of parental genotypes and the final restricted sum is over the possible trios in which parent-of-origin (as well as maternal genotype) are deducible. Unlike Kistner and Weinberg’s extension of the QPL method, the QCEPG method does not involve factoring the likelihood. In addition, the null hypothesis of no parent-of-origin effects considers the transmission of variant alleles from the mother to all offspring, not only those who are heterozygous. The contribution to the likelihood (see Appendix D) is assumed to be of the form

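\[
\frac{\exp\left((\beta'_{g_i} + \beta'_{g_{im}} + \beta_{I_m})\, y_i + \alpha'_{g_{im} g_{if} g_i}\right)}
     {\sum_{g^*_i, g^*_{im}, g^*_{if} \in G_x} \exp\left((\beta'_{g^*_i} + \beta'_{g^*_{im}} + \beta_{I^*_m})\, y_i + \alpha'_{g^*_{im} g^*_{if} g^*_i}\right)}. \tag{6}
\]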

The offspring genotype effects are denoted by $\beta'_{g^*_i}$, the maternal genotype effects by $\beta'_{g^*_{im}}$ and parent-of-origin effects by $\beta_{I^*_m}$, where $I_m$ is an indicator of whether an offspring inherits a variant allele from the mother. The nuisance parameters $\alpha'_{g^*_{im} g^*_{if} g^*_i}$ depend on the offspring genotype and the genotypes of the parents. For the QCPG, in the absence of maternal genotype or parent-of-origin effects, the nuisance parameters only depended on the parental mating type and the offspring genotype. However, in the QCEPG, the maternal genotype is of interest and needs to be included. No additional information is gained by incorporating the parent-of-origin indicator into the nuisance parameter since once the phase-known genotypes of the parents and offspring are specified, and conditional on the fact that parent-of-origin can be deduced, parent-of-origin is also established. If only maternal genotype effects are of interest then the parent-of-origin indicator can simply be removed from the model and the nuisance parameters remain the same. However, if only parent-of-origin effects are of interest then the nuisance parameters are of the form $\alpha'_{M g^*_i I_m}$, which depend on the parental mating type, the offspring genotype and the parent-of-origin indicator.
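As an illustration of how a single trio contributes to a conditional likelihood of this form, the following minimal Python sketch enumerates a restricted set of trios and evaluates the ratio for one observed trio. Several points are assumptions made for illustration and are not taken from the paper: a diallelic locus with alleles coded 0 and 1, a restricted set built by enumerating the four equally likely transmissions for each ordering of the parents and discarding the all-heterozygous trio (the only configuration in which parent of origin cannot be deduced from unphased genotypes), and a parent-of-origin term coded as a single coefficient multiplied by the indicator. The function names are hypothetical.

```python
import math
from itertools import product


def copies(genotype):
    """Number of variant alleles (coded 1) in a phase-known genotype."""
    return sum(genotype)


def deducible_trios(parent_a, parent_b):
    """Enumerate an illustrative restricted set for one unordered parental pair.

    Each parent is a tuple of two alleles (0 = reference, 1 = variant).
    Returns tuples (g, gm, gf, im): offspring, maternal and paternal variant
    counts, and im = 1 if the maternally transmitted allele is the variant.
    """
    trios = []
    # Exchange the roles of mother and father (exchangeability assumption).
    for mother, father in ((parent_a, parent_b), (parent_b, parent_a)):
        # Four equally likely transmissions for each ordering of the parents.
        for m_allele, f_allele in product(mother, father):
            g, gm, gf = m_allele + f_allele, copies(mother), copies(father)
            # Assumption: parent of origin is not deducible only when all
            # three family members are heterozygous, so drop those trios.
            if g == 1 and gm == 1 and gf == 1:
                continue
            trios.append((g, gm, gf, m_allele))
    return trios


def qcepg_contribution(observed, trios, y, beta_g, beta_gm, beta_im, alpha=None):
    """Conditional-likelihood ratio for one trio with trait value y.

    beta_g and beta_gm are indexed by variant count (0, 1, 2); alpha maps
    (gm, gf, g) to a nuisance parameter and defaults to zero.
    """
    alpha = alpha or {}

    def score(trio):
        g, gm, gf, im = trio
        slope = beta_g[g] + beta_gm[gm] + beta_im * im
        return math.exp(slope * y + alpha.get((gm, gf, g), 0.0))

    return score(observed) / sum(score(t) for t in trios)
```

With all parameters set to zero every trio in the set is equally likely, so the contribution reduces to one over the size of the set; this gives a quick sanity check on the enumeration.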

The QTDTM method can also be extended to include maternal genotype effects, $\beta_{g_{im}}$, and parent-of-origin effects, $\beta_{I_m}$, through fitting the linear regression model

\[
y_i = \alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m}.
\]

Trios in which parent-of-origin can be resolved can be found by first generating a case/pseudocontrol dataset as described in Cordell et al. [2004], specifying that the parental genotypes are exchangeable and parent-of-origin can be resolved. For the QTDTM method, only the original offspring (the ‘case’) from the case/pseudocontrol dataset (together with information about maternal genotype and parent-of-origin status) is used in the prospective likelihood, whereas in the QCEPG method, the full set of cases and pseudocontrols is required for the retrospective likelihood.
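To make the structure of the prospective model concrete, the sketch below encodes one observed trio as a row of covariates and evaluates the fitted mean for given parameter values. The coding is an assumption for illustration only: genotypes are variant counts 0, 1, 2, the mating type is the unordered pair of parental counts, and the parent-of-origin term is a single coefficient times the indicator. The function names `mating_type`, `qtdtm_row` and `fitted_mean` are hypothetical and do not come from the authors' Stata implementation.

```python
def mating_type(gm, gf):
    """Unordered parental mating type, e.g. (0, 2) for 0 x 2 or 2 x 0."""
    return tuple(sorted((gm, gf)))


def qtdtm_row(g, gm, gf, im, y):
    """Covariates used for the observed offspring (the 'case') only."""
    return {"y": y, "mating_type": mating_type(gm, gf), "g": g, "gm": gm, "im": im}


def fitted_mean(row, alpha, beta_g, beta_gm, beta_im):
    """Mean trait value implied by y = alpha_M + beta_g + beta_gm + beta_Im."""
    return (alpha[row["mating_type"]]
            + beta_g[row["g"]]
            + beta_gm[row["gm"]]
            + beta_im * row["im"])
```

Because the mating type is unordered, the maternal genotype still varies within a mating type, which is what allows a maternal effect to be estimated alongside the mating-type intercepts.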

SIMULATION STUDY

SINGLE-LOCUS SIMULATIONS

Simulations were performed to investigate the power and properties of the various methods described. Initially, a single diallelic QTL locus was considered. One thousand replicates of data were generated, each consisting of a number of genotyped trios (i.e. a single offspring with a quantitative trait and both parents). Bias in the resulting parameter estimates, 95% confidence intervals, power and type I error were examined. A method that performs well would be expected to give unbiased parameter estimation and to show approximately 95% confidence interval coverage. The importance of the nuisance parameters in the retrospective models was also investigated by examining the estimates obtained when they are removed from the model and also when the offspring genotype is used as a substitute.
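The summary quantities examined across replicates (mean estimate, standard deviation, bias and confidence interval coverage) can be computed along the following lines. This is a minimal sketch that assumes Wald-type intervals of the form estimate ± 1.96 × standard error; the paper does not state how its intervals were constructed, and the function name `summarise` is hypothetical.

```python
import math


def summarise(estimates, std_errors, true_value, z=1.96):
    """Summarise one parameter across simulation replicates."""
    n = len(estimates)
    mean = sum(estimates) / n
    # Empirical standard deviation of the estimates across replicates.
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n - 1))
    # Proportion of replicates whose interval contains the true value.
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= true_value <= e + z * s)
    return {"mean": mean, "sd": sd, "bias": mean - true_value,
            "coverage": covered / n}
```

A well-behaved method would show a bias near zero and a coverage near 0.95 when this summary is applied to its replicate estimates.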

For the single-locus model, six generating scenarios were considered as shown in Table 1 (online). Three different sampling schemes were employed: random sampling, one-tail sampling from the upper tail of the offspring trait distribution and two-tailed sampling from the upper and lower tails of the offspring trait distribution. Under random sampling, 500 parent-offspring trios were simulated per replicate where the offspring’s quantitative trait was drawn from a normal distribution with genotype mean and standard deviation as shown in Table 1 (online). Population stratification was simulated by combining data in different proportions from two subpopulations, each of which was in Hardy-Weinberg equilibrium and showed random mating. The subpopulations had different allele frequencies and mean quantitative trait values, producing a spurious correlation between the quantitative trait and genotypes when the populations are combined. Under selected sampling from the extremes, 5,000 trios were generated per replicate, from which a subset were selected for analysis. For the two-tailed sampling scheme, 500 trios were selected from the 5,000 (i.e. the top and bottom 5% of the trait distribution). For the one-tail sampling scheme, we selected 1,000 trios from the 5,000 (i.e. the top 20%), as convergence problems were encountered when using only 500 trios under this sampling scheme.
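The two selected sampling schemes amount to ranking the simulated trios by offspring trait value and keeping the extremes. A minimal sketch follows, assuming each trio is held as a dictionary with the offspring trait under the key `"y"`; this data layout and the function name `select_tails` are illustrative choices, not the authors' code.

```python
def select_tails(trios, scheme, n_select):
    """Select trios from the extremes of the offspring trait distribution.

    scheme is "one-tail" (upper tail only) or "two-tail" (half of the
    selected trios from each tail).
    """
    if n_select <= 0 or n_select > len(trios):
        raise ValueError("n_select must be between 1 and the number of trios")
    ranked = sorted(trios, key=lambda t: t["y"])
    if scheme == "one-tail":
        return ranked[-n_select:]
    if scheme == "two-tail":
        half = n_select // 2
        return ranked[:half] + ranked[-half:]
    raise ValueError("unknown sampling scheme: " + scheme)
```

For example, `select_tails(trios, "one-tail", 1000)` applied to 5,000 simulated trios keeps the top 20%, and `select_tails(trios, "two-tail", 500)` keeps the top and bottom 5%.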

Table III shows results for the first three scenarios with no population stratification and where the trios were randomly sampled. Under the null, all the methods gave unbiased estimates and reasonable confidence intervals, except for the QPL method where the nuisance parameters have been removed. This is expected since under the null, the $\alpha''$ parameters in the QPL are nonzero and so their removal affects the resulting $\beta''$ estimates. Similarly, under the first alternative model (Alt 1) all the methods performed well except the QPL and QCPG methods where the nuisance parameters have been removed. The retrospective models in which the nuisance parameters have been replaced by the offspring genotype parameters give $\beta'$ and $\beta''$ estimates very close to the true means and reasonable coverage.

TABLE III. True and estimated means, standard deviations (SD) and coverage (CI) of the 95% confidence intervals for the single locus simulations with random selection and no population stratification

Method / parameter      Null                        Alt 1                       Alt 2
                        True   Mean (SD)     CI     True   Mean (SD)     CI     True   Mean (SD)     CI
Linear regression
  Constant              –      1.99 (0.46)   –      –      -0.01 (0.46)  –      –      -0.01 (0.46)  –
  $\beta_1$             0.00   0.05 (0.50)   0.96   1.00   1.05 (0.50)   0.96   0.00   0.05 (0.50)   0.96
  $\beta_2$             0.00   0.04 (0.48)   0.96   2.00   2.04 (0.48)   0.96   1.00   1.04 (0.48)   0.96
QTDTM
  $\beta_1$             0.00   0.08 (0.60)   0.95   1.00   1.08 (0.60)   0.95   0.00   0.08 (0.60)   0.95
  $\beta_2$             0.00   0.07 (0.63)   0.95   2.00   2.07 (0.63)   0.95   1.00   1.07 (0.63)   0.95
QCPG
  $\beta'_1$            0.00   0.02 (0.16)   0.95   0.25   0.27 (0.17)   0.96   0.00   0.02 (0.16)   0.95
  $\beta'_2$            0.00   0.02 (0.16)   0.95   0.50   0.51 (0.17)   0.97   0.25   0.25 (0.16)   0.96
QPL
  $\beta''_1$           0.00   0.02 (0.16)   0.95   0.25   0.27 (0.17)   0.96   0.00   0.02 (0.16)   0.95
  $\beta''_2$           0.00   0.02 (0.16)   0.95   0.50   0.51 (0.17)   0.97   0.25   0.25 (0.16)   0.96
QCPG, $\alpha'$s removed
  $\beta'_1$            0.00   0.02 (0.11)   0.96   0.25   0.26 (0.15)   0.96   0.00   0.02 (0.14)   0.96
  $\beta'_2$            0.00   0.01 (0.11)   0.95   0.50   0.40 (0.16)   0.88   0.25   0.23 (0.15)   0.96
QPL, $\alpha''$s removed
  $\beta''_1$           0.00   0.10 (0.10)   0.87   0.25   0.27 (0.14)   0.97   0.00   0.00 (0.13)   0.97
  $\beta''_2$           0.00   0.07 (0.10)   0.92   0.50   0.40 (0.14)   0.88   0.25   0.21 (0.13)   0.97
QCPG, $\alpha'$s replaced by offspring genotype (g)
  $\beta'_1$            0.00   0.02 (0.15)   0.96   0.25   0.27 (0.16)   0.97   0.00   0.02 (0.15)   0.96
  $\beta'_2$            0.00   0.01 (0.16)   0.95   0.50   0.50 (0.16)   0.96   0.25   0.25 (0.16)   0.96
QPL, $\alpha''$s replaced by offspring genotype (g)
  $\beta''_1$           0.00   0.02 (0.15)   0.96   0.25   0.24 (0.16)   0.97   0.00   0.00 (0.15)   0.96
  $\beta''_2$           0.00   0.01 (0.15)   0.96   0.50   0.47 (0.16)   0.96   0.25   0.23 (0.15)   0.96

The simulation parameters are as shown in Table 1 (online supplementary materials).

The results for the three scenarios in the presence of population stratification and under random sampling are shown in Table IV. Simple linear regression showed the expected bias in the estimates of $\beta$ and poor coverage of the estimated 95% confidence intervals since the population stratification is not accounted for in the method. Under the first null model, as in the case without population stratification (Table III), the QPL method with the $\alpha''$ parameters removed did not perform well (coverages 0.87 and 0.92 instead of 0.95). The remainder of the methods perform well under both null models, even under population stratification. Substitution of the four nuisance parameters by the offspring genotype in the retrospective methods appears to account sufficiently for the population stratification. Under the alternative with population stratification, only the prospective QTDTM method produced unbiased estimates and correct coverage. The retrospective methods (QCPG and QPL) produced biased estimates, as expected (see Appendices A and C).
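The bias of simple linear regression under stratification can be reproduced in a few lines. The sketch below mixes two subpopulations in Hardy-Weinberg equilibrium that differ in allele frequency and in mean trait value, with no genotype effect on the trait within either subpopulation. The allele frequencies (0.2 and 0.8), trait means (0 and 2), unit trait variance and equal mixing proportion are illustrative values chosen here, not those of Table 1 (online), and only offspring are simulated since parents are not needed to show the effect.

```python
import random


def simulate_stratified_offspring(n, rng, freqs=(0.2, 0.8), means=(0.0, 2.0), mix=0.5):
    """Simulate (genotype, trait) pairs from a mixture of two subpopulations."""
    data = []
    for _ in range(n):
        pop = 0 if rng.random() < mix else 1
        # Hardy-Weinberg equilibrium: two independent allele draws.
        g = sum(rng.random() < freqs[pop] for _ in range(2))
        # Null model: the trait depends on subpopulation only, not genotype.
        y = rng.gauss(means[pop], 1.0)
        data.append((g, y))
    return data


def mean_trait_by_genotype(data):
    """Average trait value within each genotype class."""
    totals, counts = {}, {}
    for g, y in data:
        totals[g] = totals.get(g, 0.0) + y
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in sorted(totals)}


rng = random.Random(1)
means_by_g = mean_trait_by_genotype(simulate_stratified_offspring(20000, rng))
```

Although the genotype has no effect on the trait in either subpopulation, the genotype-specific means in the combined sample differ markedly, which is the spurious association that an analysis ignoring population structure picks up.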

Parameter estimates under one-tail selected sampling are shown in Tables 2 and 3 (online). The results under the null (both with and without population stratification) are the same as those found in the unselected case. Under the alternative with no population stratification, both prospective models (simple linear regression and QTDTM) show biased estimates and incorrect coverage of the 95% confidence intervals. This is because the methods cannot account for the selection on quantitative trait value. By conditioning on the trait values, the retrospective models should be robust to selection. However, Table 2 (online) suggests that these methods are producing biased estimates. By looking at the median genotype effect estimates (data not shown), we found that the bias is due to a small number of outlying observations. The medians for the QCPG and QPL methods (with the true nuisance parameters and with the nuisance parameters replaced by the offspring genotype) are very close to the true means. Under the alternative with population stratification, all methods performed poorly, producing biased estimates and incorrect coverage. Here, the prospective model QTDTM fails since it cannot account for selection on quantitative trait value and the retrospective models, QCPG and QPL, fail to estimate the nuisance parameters under the alternative with population stratification. Similar results were observed using the two-tailed sampling scheme (Tables 4 and 5 (online)). Without population stratification (Table 5 (online)), it can be seen that the bias in the estimates using simple linear regression is not as great under two-tailed sampling as found when sampling only from the upper tail of the trait distribution (note that under the alternative, the assumption of homoscedasticity of the residuals is violated under the one-tailed sampling scheme).

Powers/type I errors are shown in Table 6 (online). Since the powers to achieve a P value of 0.001 for the different methods are all 1.0 under two-tailed sampling, we also investigated the power to achieve a more stringent significance level in this case. Under the null with no population stratification, removal of the nonzero nuisance parameters in the QPL method generates a bias in the estimates and hence increased type I error rates, most clearly seen in Table 6 (online) for the random and one-tail sampling schemes. The remaining methods all have type I error rates close to or less than the critical values. Highest power to detect a genotypic effect is seen with the linear regression method for the random and two-tailed sampling schemes and with the QCPG method with the $\alpha'$ parameters removed for the one-tail sampling scheme (powers are meaningless for the QPL method with no $\alpha''$ parameters since the type I errors are incorrect). In all cases, the highest powers to detect a genotypic effect are seen for the two-tailed selected sampling scheme, selecting from the upper and lower tails of the offspring trait distribution. In contrast, selection from only the upper tail of the offspring trait distribution actually decreases the power to detect a genetic effect compared to the random sampling scheme, despite having the largest sample size, except for the QCPG method with the $\alpha'$ parameters removed. These results also show that, although under the alternative with selected sampling the QTDTM method showed biased estimates and poor coverage of the 95% confidence intervals, the method can still be used to test for a genetic effect, as the type I error is correct. In fact, the large bias in the estimates seen when using a two-tailed sampling scheme actually increases the power to detect an effect compared to random sampling, although this power increase may also be due to the fact that the selected subjects carry more information, since they are concentrated at the extremes of the trait distribution.

Under population stratification, Table 6 (online) shows that for all sampling schemes the simple linear regression method has increased type I error rates. Since linear regression cannot account for the population substructure, the resulting bias in the estimates generates a large number of false-positive associations. The QPL method with the $\alpha''$ parameters removed also has type I errors larger than the nominal values, particularly when selecting from the upper tail of the offspring trait distribution, as found in the case of no population stratification. The type I errors for the QTDTM method under population stratification for the first null model are slightly larger than expected, across all three sampling schemes. Similarly, the type I error is inflated for the second null model under the two-tailed sampling scheme. The remainder of the methods have type I errors close to the nominal values. Therefore, the retrospective methods (QCPG and QPL) with the correct nuisance parameters or with the nuisance parameters replaced by the offspring genotype can be used (and indeed have high power) to test for a genetic effect, even under circumstances where they produce biased estimates.

TABLE IV. True and estimated means, standard deviations (SD) and coverage (CI) of the 95% confidence intervals for the single locus simulations with random selection and with population stratification

Method / parameter      Popn strat (Null 2)         Popn strat (Null 2)         Popn strat (Alt)
                        True   Mean (SD)     CI     True   Mean (SD)     CI     True   Mean (SD)     CI
Linear regression
  Constant              –      2.29 (0.21)   –      –      0.02 (0.10)   –      –      1.84 (0.19)   –
  $\beta_1$             0.00   -1.06 (0.26)  0.02   0.00   0.25 (0.15)   0.76   1.00   0.15 (0.24)   0.07
  $\beta_2$             0.00   -2.06 (0.22)  0.00   0.00   1.31 (0.14)   0.00   2.00   0.34 (0.20)   0.00
QTDTM
  $\beta_1$             0.00   0.01 (0.33)   0.95   0.00   0.00 (0.17)   0.98   1.00   1.01 (0.31)   0.95
  $\beta_2$             0.00   0.01 (0.39)   0.94   0.00   0.00 (0.24)   0.97   2.00   2.01 (0.35)   0.94
QCPG
  $\beta'_1$            0.00   0.01 (0.20)   0.95   0.00   0.01 (0.15)   0.95   0.25   0.68 (0.24)   0.61
  $\beta'_2$            0.00   0.00 (0.21)   0.95   0.00   0.00 (0.16)   0.96   0.50   1.23 (0.27)   0.17
QPL
  $\beta''_1$           0.00   0.01 (0.20)   0.95   0.00   0.01 (0.15)   0.95   0.25   0.68 (0.24)   0.61
  $\beta''_2$           0.00   0.00 (0.21)   0.95   0.00   0.00 (0.16)   0.96   0.50   1.23 (0.27)   0.17
QCPG, $\alpha'$s removed
  $\beta'_1$            0.00   0.00 (0.10)   0.96   0.00   0.01 (0.14)   0.95   0.25   0.14 (0.09)   0.81
  $\beta'_2$            0.00   0.00 (0.12)   0.95   0.00   0.00 (0.16)   0.96   0.50   0.29 (0.10)   0.50
QPL, $\alpha''$s removed
  $\beta''_1$           0.00   0.08 (0.09)   0.91   0.00   0.02 (0.14)   0.96   0.25   0.20 (0.09)   0.92
  $\beta''_2$           0.00   0.01 (0.11)   0.97   0.00   0.01 (0.14)   0.96   0.50   0.30 (0.09)   0.50
QCPG, $\alpha'$s replaced by offspring genotype (g)
  $\beta'_1$            0.00   0.00 (0.19)   0.96   0.00   0.01 (0.14)   0.95   0.25   0.62 (0.22)   0.66
  $\beta'_2$            0.00   -0.01 (0.20)  0.96   0.00   0.00 (0.16)   0.95   0.50   1.11 (0.23)   0.25
QPL, $\alpha''$s replaced by offspring genotype (g)
  $\beta''_1$           0.00   -0.03 (0.19)  0.95   0.00   0.01 (0.14)   0.95   0.25   0.56 (0.22)   0.79
  $\beta''_2$           0.00   -0.07 (0.19)  0.96   0.00   0.04 (0.15)   0.95   0.50   1.00 (0.22)   0.46

The simulation parameters are as shown in Table 1 (online).

MULTI-LOCUS HAPLOTYPES

Simulations were carried out to investigate the effect of the nuisance parameters (intended to account for population stratification) when the methods are extended to multi-locus haplotypes. Note that, as originally proposed, the QTDTM method (and simple linear regression) only apply to single loci: to extend these methods to multi-locus haplotypes it is necessary to first infer the child’s (and if necessary, the parents’) haplotypes given the observed genotype data, as is done in the first stage of the CPG and QCPG methods [Cordell et al., 2004]. The resulting haplotype variables may then be entered as predictor variables into equation (1).

Tables V and VI show the results of simulations in which the offspring quantitative trait was influenced by genotype at two linked diallelic markers assumed to be in moderate LD. The four possible haplotypes, 1-1, 1-2, 2-1 and 2-2, had haplotype frequencies and haplotype means as shown in Table 7 (online). Additive effects of haplotypes were assumed so that for each trio, the offspring’s quantitative trait was drawn from a normal distribution whose mean was the sum of the two haplotype means. For each simulation, 1,000 replicates of data were generated, each replicate consisting of 1,000 parent-offspring trios with random selection or 1,000 trios selected from 10,000 in either one-tail (top 10%) or two-tailed (top and bottom 5%) sampling from the extremes of the offspring trait distribution.
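The additive haplotype model for the trait can be sketched as follows. The haplotype frequencies and means used in the usage example are placeholders, since the actual values are in Table 7 (online) and are not reproduced here; the sketch also draws the offspring's two haplotypes directly from the population rather than transmitting them from simulated parents. The function name `simulate_offspring_trait` is hypothetical.

```python
import random

HAPLOTYPES = ("1-1", "1-2", "2-1", "2-2")


def simulate_offspring_trait(rng, freqs, hap_means, sd=1.0):
    """Draw two haplotypes and a trait whose mean is the sum of their means."""
    pair = rng.choices(HAPLOTYPES, weights=freqs, k=2)
    mean = hap_means[pair[0]] + hap_means[pair[1]]
    return pair, rng.gauss(mean, sd)


# Placeholder frequencies and means, for illustration only.
example_freqs = [0.4, 0.1, 0.1, 0.4]
example_means = {"1-1": 0.0, "1-2": 1.0, "2-1": 2.0, "2-2": 3.0}
pair, y = simulate_offspring_trait(random.Random(0), example_freqs, example_means)
```

Under this additive model an offspring carrying haplotypes 1-2 and 2-2 would, with the placeholder means above, have an expected trait value of 1.0 + 3.0 = 4.0.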

Three methods were considered: simple linear regression, QTDTM and the retrospective method QCPG. The QPL method was not considered as it is so closely related to the QCPG. Simple linear regression does not have any additional parameters to account for population stratification. The QTDTM and QCPG methods, however, have a significant number of nuisance parameters when the methods are extended to multi-locus haplotypes. For example, for QTDTM, the number of possible mating types (assuming parental mating symmetry) is 55, a large increase from the 6 in the single-locus case. Therefore, in addition to considering the models in which the ‘correct’ nuisance parameters are used, we considered models in which the number of nuisance parameters was reduced. For the QTDTM we considered either including in the model the single-locus mating-type parameters for each locus, or including maternal and paternal genotype (rather than mating-type) parameters. For the QCPG method, replacing the nuisance parameters by the offspring genotype (g) was considered.
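The counts of 6 and 55 mating types follow from simple combinatorics: with h haplotypes (or alleles) there are h(h+1)/2 unordered genotypes, and with k genotypes there are k(k+1)/2 unordered parental pairs. The short sketch below enumerates them directly; the function name `count_mating_types` is hypothetical.

```python
from itertools import combinations_with_replacement


def count_mating_types(n_haplotypes):
    """Return (number of genotypes, number of unordered mating types)."""
    # A genotype is an unordered pair of haplotypes.
    genotypes = list(combinations_with_replacement(range(n_haplotypes), 2))
    # A mating type is an unordered pair of parental genotypes.
    matings = list(combinations_with_replacement(genotypes, 2))
    return len(genotypes), len(matings)


print(count_mating_types(2))  # single diallelic locus: (3, 6)
print(count_mating_types(4))  # two diallelic loci, four haplotypes: (10, 55)
```

The same calculation for three diallelic loci (eight haplotypes) gives 36 genotypes and 666 mating types, which illustrates how quickly the full set of nuisance parameters grows.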

Table V shows the results for the case with no population stratification. Also shown is the number of replicates that converged from the original 1,000. Convergence problems were probably due to the small sample size relative to the large number of parameters to be estimated in the models. Under the null, all the methods produced unbiased parameter estimates, regardless of selection scheme. Under the alternative with no selection, the prospective methods (simple linear regression and QTDTM with the different sets of $\alpha$ parameters) gave unbiased parameter estimates. The retrospective QCPG method shows some small-sample bias in the estimates when the full set of ‘correct’ $\alpha'$ parameters was used and similar bias when the $\alpha'$ parameters were replaced by the offspring genotype g. This bias disappeared when 10,000 trios (as opposed to 1,000) were used (data not shown). Under the alternative with selection, only the retrospective QCPG method when all the ‘correct’ $\alpha'$ parameters are used, or when the $\alpha'$ are replaced by the offspring genotype, gave estimates close to the true mean.

The sensitivity of the estimates to the way the nuisance parameters are modelled is most pronounced under population stratification as shown in Table VI. Under the null with random selection, the simple linear regression method produces biased estimates, as does the QTDTM method in which the correct $\alpha$s are replaced by those that would be generated by considering the loci individually. The remaining methods, QTDTM with the full set of $\alpha$ parameters or with parental genotype parameters, and the QCPG methods with the different sets of nuisance parameters, all have unbiased estimates. Under the null with selection from the upper tail of the offspring trait distribution, all of the methods produced unbiased parameter estimates. For the two-tailed sampling scheme, linear regression showed the expected bias in parameter estimates but QTDTM with the correct $\alpha$s, QTDTM with parental genotypes and the retrospective methods (with the different sets of $\alpha'$ parameters) produced unbiased parameter estimates. Under the alternative with random selection only the prospective QTDTM method (with the different sets of $\alpha$s) produced unbiased estimates: as explained in Appendices A and C, the nuisance parameters for the QPL and QCPG will not be correctly estimated under population stratification, except under the null. Under the alternative with selection, all the methods gave biased estimates as found in the single-locus case.

TABLE V. Mean estimates and standard deviations of simulated two-locus haplotype effects using the simple linear regression, QTDTM (with different sets of nuisance parameters) and the QCPG (with different sets of nuisance parameters)

Method / parameter      Null model                                       Alternative model
                        True   Random        Top           Top + bottom  True   Random        Top           Top + bottom
No. of replicates              973           980           982                  978           132           292
                               Mean (SD)     Mean (SD)     Mean (SD)            Mean (SD)     Mean (SD)     Mean (SD)
Linear regression
  $\beta_2$             0.00   0.00 (0.08)   0.00 (0.03)   -0.01 (0.17)  1.00   1.00 (0.08)   0.21 (0.06)   2.18 (0.19)
  $\beta_3$             0.00   0.00 (0.07)   0.00 (0.03)   0.00 (0.15)   2.00   2.00 (0.07)   0.63 (0.05)   3.52 (0.05)
  $\beta_4$             0.00   0.00 (0.07)   0.00 (0.03)   0.00 (0.14)   3.00   3.00 (0.07)   1.17 (0.06)   4.08 (0.03)
QTDTM, all $\alpha$s
  $\beta_2$             0.00   0.00 (0.11)   0.00 (0.05)   0.00 (0.27)   1.00   1.00 (0.11)   0.16 (0.15)   1.70 (0.28)
  $\beta_3$             0.00   0.00 (0.10)   0.00 (0.04)   0.00 (0.22)   2.00   2.00 (0.10)   0.59 (0.12)   3.52 (0.14)
  $\beta_4$             0.00   0.00 (0.10)   0.00 (0.04)   0.00 (0.20)   3.00   3.00 (0.10)   1.08 (0.16)   4.50 (0.14)
QTDTM, $\alpha$s replaced by parental genotypes
  $\beta_2$             0.00   0.00 (0.11)   0.00 (0.05)   0.00 (0.26)   1.00   1.00 (0.11)   0.15 (0.14)   2.21 (0.22)
  $\beta_3$             0.00   0.00 (0.10)   0.00 (0.04)   0.00 (0.22)   2.00   2.00 (0.10)   0.57 (0.12)   3.56 (0.08)
  $\beta_4$             0.00   0.00 (0.10)   0.00 (0.04)   0.00 (0.19)   3.00   3.00 (0.10)   1.10 (0.14)   4.12 (0.06)
QTDTM, $\alpha$s at both loci
  $\beta_2$             0.00   0.00 (0.10)   0.00 (0.04)   0.00 (0.22)   1.00   1.00 (0.10)   0.14 (0.10)   1.79 (0.21)
  $\beta_3$             0.00   -0.01 (0.09)  0.00 (0.04)   0.00 (0.19)   2.00   1.99 (0.09)   0.60 (0.10)   3.78 (0.10)
  $\beta_4$             0.00   0.00 (0.09)   0.00 (0.04)   0.00 (0.18)   3.00   3.00 (0.09)   1.12 (0.13)   4.34 (0.10)
QCPG, all $\alpha'$s
  $\beta'_2$            0.00   0.00 (0.13)   0.01 (0.31)   0.00 (0.06)   1.00   1.11 (0.18)   0.68 (0.90)   1.36 (0.96)
  $\beta'_3$            0.00   -0.01 (0.11)  -0.01 (0.25)  0.00 (0.05)   2.00   2.23 (0.22)   1.76 (0.83)   2.69 (1.43)
  $\beta'_4$            0.00   0.00 (0.11)   0.00 (0.25)   0.00 (0.05)   3.00   3.44 (0.36)   2.80 (0.97)   3.83 (1.56)
QCPG, $\alpha'$s replaced by offspring genotype (g)
  $\beta'_2$            0.00   0.00 (0.12)   0.00 (0.28)   0.00 (0.06)   1.00   1.06 (0.14)   0.86 (0.76)   1.32 (0.88)
  $\beta'_3$            0.00   0.00 (0.10)   0.00 (0.24)   0.00 (0.05)   2.00   2.12 (0.19)   1.86 (0.67)   2.60 (1.10)
  $\beta'_4$            0.00   0.00 (0.10)   0.00 (0.23)   0.00 (0.04)   3.00   3.19 (0.27)   2.87 (0.76)   3.67 (1.17)

Simulated with no population stratification.

TABLE VI. Mean estimates and standard deviations of simulated two-locus haplotype effects using the simple linear regression, QTDTM (with different sets of nuisance parameters) and the QCPG (with different sets of nuisance parameters)

Method / parameter      Null model                                       Alternative model
                        True   Random        Top           Top + bottom  True   Random        Top           Top + bottom
No. of replicates              995           941           995                  996           82            502
                               Mean (SD)     Mean (SD)     Mean (SD)            Mean (SD)     Mean (SD)     Mean (SD)
Linear regression
  $\beta_2$             0.00   3.30 (0.28)   0.00 (0.04)   4.14 (0.39)   1.00   4.30 (0.28)   0.26 (0.14)   0.96 (0.32)
  $\beta_3$             0.00   2.07 (0.37)   0.00 (0.05)   2.06 (0.41)   2.00   4.07 (0.37)   0.31 (0.10)   6.74 (0.64)
  $\beta_4$             0.00   4.07 (0.16)   0.00 (0.03)   6.22 (0.17)   3.00   7.07 (0.15)   0.50 (0.09)   9.21 (0.02)
QTDTM, all $\alpha$s
  $\beta_2$             0.00   -0.01 (0.42)  0.00 (0.06)   0.00 (0.51)   1.00   0.99 (0.42)   0.23 (0.19)   0.90 (0.45)
  $\beta_3$             0.00   -0.01 (0.42)  0.00 (0.07)   0.00 (0.44)   2.00   1.99 (0.42)   0.30 (0.14)   5.73 (0.93)
  $\beta_4$             0.00   -0.01 (0.33)  0.00 (0.04)   0.00 (0.41)   3.00   2.99 (0.33)   0.50 (0.13)   8.06 (0.50)
QTDTM, $\alpha$s replaced by parental genotypes
  $\beta_2$             0.00   -0.01 (0.43)  0.00 (0.06)   -0.01 (0.51)  1.00   0.99 (0.43)   0.13 (0.23)   0.84 (0.39)
  $\beta_3$             0.00   -0.01 (0.44)  0.00 (0.06)   0.00 (0.45)   2.00   1.99 (0.44)   0.17 (0.23)   6.48 (0.74)
  $\beta_4$             0.00   -0.01 (0.33)  0.00 (0.04)   0.00 (0.42)   3.00   2.99 (0.33)   0.37 (0.23)   8.79 (0.16)
QTDTM, $\alpha$s at both loci
  $\beta_2$             0.00   0.20 (0.37)   0.00 (0.05)   -0.01 (0.47)  1.00   1.20 (0.37)   0.24 (0.16)   0.66 (0.36)
  $\beta_3$             0.00   0.23 (0.35)   0.00 (0.05)   0.00 (0.41)   2.00   2.23 (0.35)   0.31 (0.11)   6.45 (0.73)
  $\beta_4$             0.00   0.04 (0.32)   0.00 (0.04)   0.00 (0.41)   3.00   3.03 (0.32)   0.51 (0.10)   8.53 (0.26)
QCPG, all $\alpha'$s
  $\beta'_2$            0.00   0.00 (0.05)   0.01 (0.32)   0.00 (0.04)   1.00   -0.03 (0.08)  8.83 (12.9)   1.07 (0.55)
  $\beta'_3$            0.00   0.00 (0.05)   0.00 (0.36)   0.00 (0.03)   2.00   0.20 (0.09)   9.59 (12.7)   3.30 (1.73)
  $\beta'_4$            0.00   0.00 (0.03)   0.00 (0.23)   0.00 (0.02)   3.00   0.44 (0.10)   10.51 (12.7)  4.13 (1.77)
QCPG, $\alpha'$s replaced by offspring genotype (g)
  $\beta'_2$            0.00   0.00 (0.03)   0.01 (0.29)   0.00 (0.02)   1.00   0.08 (0.04)   7.64 (9.15)   1.05 (0.50)
  $\beta'_3$            0.00   0.00 (0.03)   0.01 (0.32)   0.00 (0.02)   2.00   0.14 (0.04)   8.41 (9.01)   3.46 (1.85)
  $\beta'_4$            0.00   0.00 (0.03)   0.00 (0.22)   0.00 (0.02)   3.00   0.31 (0.05)   9.31 (9.03)   4.30 (1.92)

Simulated with population stratification.

We also investigated the QTDTM and QCPG methods with a single replicate of data generated under a three-locus haplotype model (data not shown). Results were broadly similar to the two-locus haplotype results, except that the QCPG method required a very large number (50,000) of trios to produce unbiased estimates, while QTDTM generally achieved convergence and unbiased parameter estimation with only 1,000 trios.

STEPWISE PROCEDURE

A stepwise procedure (results not shown), as used by Cordell et al. [2004] for disease traits, was used to compare the prospective QTDTM (using the full set of nuisance parameters) with the QCPG method with the ‘correct’ $\alpha'$ parameters replaced by the offspring genotype, under models with and without population stratification and selection. In general, the pattern of results in terms of power and type I error was as expected, with the QTDTM method being the more powerful in general. Under population stratification we found that the type I errors were slightly too large for the QTDTM method under random sampling, consistent with the results observed (Table 6 (online)) in the single-locus simulations. Additional simulations (data not shown) indicated that this problem could be solved by use of Wald tests incorporating robust ‘information sandwich’ variance estimates [Huber, 1967], rather than likelihood ratio tests or Wald tests with the usual variance estimate (which equals minus the inverse of the Hessian matrix). We also investigated the power and type I error of the stepwise approach when applied to non-normally distributed traits and found that both QTDTM and QCPG appear to be suitable for the analysis of traits that deviate slightly from normality. Neither method was found to be suitable for the analysis of very nonnormally distributed traits, although it is worth noting that the prospective QTDTM method could easily be extended to enable the analysis of nonnormal traits by use of robust regression, a generalised linear model (GLM), or by assuming a variance-mean relationship, according to the departure from normality.
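For reference, the general form of the robust ‘information sandwich’ variance estimate can be written as below. This is the standard textbook form rather than a formula quoted from the paper; $\ell_i$ denotes the log-likelihood contribution of trio $i$, $\theta$ the full parameter vector and $u_i$ the score contribution, all of which are notation introduced here for illustration.

```latex
% Standard form of the sandwich estimator (notation is illustrative).
\[
\hat{A} = -\sum_{i} \frac{\partial^2 \ell_i(\hat\theta)}{\partial\theta\,\partial\theta^{\top}},
\qquad
\hat{B} = \sum_{i} u_i(\hat\theta)\, u_i(\hat\theta)^{\top},
\qquad
u_i(\theta) = \frac{\partial \ell_i(\theta)}{\partial\theta},
\]
\[
\widehat{\mathrm{Var}}_{\text{usual}}(\hat\theta) = \hat{A}^{-1},
\qquad
\widehat{\mathrm{Var}}_{\text{robust}}(\hat\theta) = \hat{A}^{-1}\,\hat{B}\,\hat{A}^{-1}.
\]
% Wald statistic for a single parameter \theta_k using the robust variance:
\[
W_k = \frac{\hat\theta_k^{\,2}}{\bigl[\widehat{\mathrm{Var}}_{\text{robust}}(\hat\theta)\bigr]_{kk}} .
\]
```

When the model is correctly specified the two variance estimates agree asymptotically, so the sandwich form mainly matters when the assumed variance structure does not hold.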

MATERNAL GENOTYPE AND PARENT-OF-ORIGIN EFFECTS

The previous single-locus simulations were modified to include maternal genotype and parent-of-origin effects. For each replicate, 1,000 trios were generated in which the offspring’s quantitative trait was influenced by its own genotype, and by either the mother’s genotype, or whether the offspring received a variant allele from the mother, or both. Under the alternative, 100 replicates of data were generated. Under the null (no maternal genotype effects or no parent-of-origin effects) 1,000 replicates of data were generated. The QCEPG and QTDTM methods were implemented in Stata. For Kistner and Weinberg’s approach [Kistner et al., 2006], the SAS macro provided at http://dir.niehs.nih.gov/dirbb/weinbergfiles/qpl.htm was used. The expected effect estimates for QCEPG and QTDTM should be the same, since the traits were simulated to have unit variance. The offspring reference category was chosen to be the 1/1 genotype, and $\beta_1$ and $\beta_2$ are the estimated effects for the 1/2 (2/1) and 2/2 genotypes respectively. Similarly, maternal genotype effects are denoted by $\beta_{m1}$ and $\beta_{m2}$, and parent-of-origin effects by $\beta_I$. However, the expected estimates for Kistner and Weinberg’s method should be slightly different. In their method the reference category for the offspring genotype effects is the heterozygous genotype, rather than the homozygous (1/1) genotype. The maternal effects, denoted by $\delta_{01}$ and $\delta_{12}$, compare the quantitative trait for a mother with 1 variant allele to that for a mother with 0 variant alleles, and for a mother with 2 variant alleles to that for a mother with 1 variant allele, respectively (while in the QCEPG method, both comparisons are made with the homozygous 1/1 genotype category). The estimates for the parent-of-origin effects ($\lambda_1$) in the QPL represent the log odds that a heterozygous child inherits a maternal copy of the variant allele instead of a paternal copy, per unit increase in trait value. Although based only on heterozygous offspring, these parameters are expected to be the same as for the QCEPG and QTDTM methods.
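The change of reference category described above amounts to taking differences of the homozygote-referenced effects. The sketch below shows that arithmetic under the assumption that all effects are additive contrasts on a common scale (as when the trait has unit variance); the function name and dictionary keys are hypothetical and are not part of either method's software.

```python
def to_qpl_style_contrasts(b1, b2, bm1, bm2):
    """Re-express effects referenced to the 1/1 genotype as QPL-style contrasts.

    b1, b2   : offspring effects of 1/2 and 2/2 relative to 1/1
    bm1, bm2 : maternal effects of 1 and 2 variant alleles relative to 0
    """
    return {
        # Offspring effects relative to the heterozygous genotype.
        "offspring_11_vs_12": -b1,
        "offspring_22_vs_12": b2 - b1,
        # Maternal effects as adjacent-category comparisons.
        "delta01": bm1,           # 1 variant allele versus 0
        "delta12": bm2 - bm1,     # 2 variant alleles versus 1
    }


# Hypothetical effect sizes, for illustration only.
print(to_qpl_style_contrasts(b1=0.5, b2=1.0, bm1=0.3, bm2=0.9))
```

Under this reading, only $\delta_{12}$ and the offspring contrasts differ numerically from the homozygote-referenced parameters; $\delta_{01}$ coincides with $\beta_{m1}$.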

Tables 8 and 9 (online) show the true effects and the means and standard deviations of the estimates. All three methods produce reasonable estimates under the null. The results for the prospective QTDTM method show the least bias. Under the alternative, the retrospective methods show a bias when parent-of-origin effects are present. The QPL appears to produce parent-of-origin effects of approximately 0.5, when they would have been expected to be 1. This may be due to some unrecognised difference in the parameterizations: the QCEPG uses the original parent-of-origin parameterization of Weinberg et al. [1998], whereas the QPL uses a parameterization closer to the alternative parameterization suggested by Weinberg [1999]. Table 10 (online) shows the powers and type I errors. The type I errors for the QPL and QTDTM methods seem reasonable. However, the type I error for the QCEPG method when testing maternal genotype effects is very large, although this appears to be a small-sample issue as it improved in simulations with a larger number of trios (data not shown). Overall, the extension of the QPL method had the highest power to detect either a maternal genotype or parent-of-origin effect.

DISCUSSION

In this paper, we have extended the case/pseudocontrol association approach for dichotomous phenotypes [Cordell and Clayton, 2002] to perform association analysis with quantitative traits. This approach is very similar to the QPL approach proposed by Kistner and Weinberg [2004], but uses a slightly more intuitive parameterization and extends more naturally to allow analysis of multiallelic markers, multiple linked loci, multiple unlinked regions, parent-of-origin or maternal genotype effects, gene-gene and gene-environment interactions, using the same formulation as Cordell et al. [2004]. We compared this approach to a prospective approach, the QTDTM, and also extended the QTDTM to allow analysis of multiple linked loci (including multi-locus haplotypes), parent-of-origin or maternal genotype effects. Other extensions to the QTDTM follow naturally.

All the methods incorporate nuisance parameters intended to account for population stratification. When considering multi-locus haplotypes, the number of nuisance parameters can dramatically increase, and so it is important to find ways to reduce the number of nuisance parameters. It was found that replacing the nuisance parameters by the offspring genotype in the retrospective methods worked almost as well as the full model, and replacing the nuisance parameters by parental genotypes worked well for the QTDTM. In our simulations, it was assumed that both parents came from the same sub-population. If, in fact, matings occurred between individuals from different sub-populations, one might not expect these approximations to work as well as fitting the full set of nuisance parameters.

Although the retrospective approaches had some advantages with regard to estimation of parameters under selected sampling, in general we found the prospective QTDTM to be the most efficient approach, requiring smaller sample sizes to achieve convergence and asymptotic behaviour. In addition, the parameter estimates provided by the QTDTM have a more intuitive interpretation, corresponding to the direct genotype effects on the trait, whereas the retrospective approaches estimate parameters that are scaled by division by the unknown (although potentially estimable) trait variance. Covariates are also more easily incorporated into the QTDTM framework, simply by adding them in as terms in the regression equation, although it would be possible to incorporate covariates in the retrospective approaches, either by first regressing the traits on covariates of interest and performing subsequent analysis on the residuals, or by using methods such as those described by Lim et al. [2005].
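The residual-based route to covariate adjustment mentioned above can be sketched in a few lines. This is an illustrative outline only, assuming a linear covariate effect fitted by least squares; the function name, the covariate (`age`) and the simulated values are hypothetical.

```python
import numpy as np


def residualise_trait(trait, covariates):
    """Regress the trait on covariates (plus intercept) and return the residuals."""
    y = np.asarray(trait, dtype=float)
    Z = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)   # least-squares covariate fit
    return y - Z @ coef


# Hypothetical offspring data: the trait depends linearly on age.
rng = np.random.default_rng(0)
age = rng.normal(40.0, 10.0, size=200)
trait = 1.0 + 0.05 * age + rng.normal(size=200)
adjusted = residualise_trait(trait, age)
# 'adjusted' would then replace the raw trait in the retrospective likelihood.
print(adjusted[:5])
```

The residuals have mean zero and are uncorrelated with the covariate by construction, so any remaining association with genotype is not attributable to the linear covariate effect.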

The QTDTM method was found to be the only method suitable for estimation of effects under the alternative hypothesis with population stratification (assuming random sampling). Under population stratification, it was necessary to use robust ‘information sandwich’ variance estimates to achieve correct type 1 errors and confidence interval coverage with the QTDTM. This is possibly because the parental mating-type stratification parameters act as a surrogate for population membership in the sense that they soak up the mean level of bias induced by population stratification, but they do not fully account for population membership, so that the distribution of trait within parental mating-type classes violates the assumption of normality, even if this assumption holds within each sub-population.

The analyses described here assumed availability of a dataset consisting of parent-offspring trios, with no missing genotype data. A natural extension of the methods proposed here would be to consider analysis of large extended pedigrees and/or missing genotype data. The QPL has previously been extended to allow analysis of multiple siblings and missing parents [Kistner and Weinberg, 2005] while an approach asymptotically equivalent to QTDTM, namely the HQTDT of Abecasis et al. [2000], has been extended to apply to pedigrees of arbitrary structure [Abecasis et al., 2002]. However, these approaches focus on testing rather than estimation of effects and apply only to a single locus at a time. A natural way to extend the QCEPG and QTDTM approaches developed here for analysis of general pedigrees would be to perform Wald tests that incorporate robust ‘information sandwich’ variance estimates clustering observations according to pedigree [Huber, 1967]. An alternative approach would be to use a random-effects modelling framework [Xu and Shete, 2006]. With regard to missing genotype data, methods that sample or average over the possible genotype configurations consistent with the observed genotype data, in the correct proportions [Cordell, 2006], could be considered. Investigation of these approaches and their behaviour under complex disease models, in the presence of population stratification, will form the basis of future work.

ACKNOWLEDGMENTS

Support for this work was provided by the Wellcome Trust (Grant references 074524 and 068612). We thank Jenefer Blackwell, Joanna Biernacka and Jeff O’Connell for useful discussions, and Joanna Howson and Hin-Tak Leung for technical assistance.

REFERENCES

Abecasis GR, Cardon LR, Cookson WOC. 2000. A general test of association for quantitative traits in nuclear families. Am J Hum Genet 66:279–292.

Abecasis GR, Cherny SS, Cookson WO, Cardon LR. 2002. Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30:97–101.

Allison DB. 1997. Transmission-disequilibrium tests for quantitative traits. Am J Hum Genet 60:676–690.

Cordell HJ, Barratt BJ, Clayton DG. 2004. Case/pseudocontrol analysis in genetic association studies: a unified framework for detection of genotype and haplotype associations, gene-gene and gene-environment interactions, and parent-of-origin effects. Genet Epidemiol 26:167–185.

Cordell HJ, Clayton DG. 2002. A unified stepwise regression procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data: application to HLA in type 1 diabetes. Am J Hum Genet 70:124–141.

Cordell HJ. 2006. Estimation and testing of genotype and haplotype effects in case/control studies: comparison of weighted regression and multiple imputation procedures. Genet Epidemiol 30:259–275.

Fulker DW, Cherny SS, Sham PC, Hewitt JK. 1999. Combined linkage and association sib-pair analysis for quantitative traits. Am J Hum Genet 64:259–267.

Gauderman WJ. 2003. Candidate gene association analysis for a quantitative trait, using parent-offspring trios. Genet Epidemiol 25:327–338.

Huber PJ. 1967. The behaviour of maximum likelihood estimates under nonstandard conditions. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 221–233.

Kistner EO, Infante-Rivard C, Weinberg CR. 2006. A method for using incomplete triads to test maternally mediated genetic effects and parent-of-origin effects in relation to a quantitative trait. Am J Epidemiol 163:255–261.

Kistner EO, Weinberg CR. 2004. Method for using complete and incomplete trios to identify genes related to a quantitative trait. Genet Epidemiol 27:33–42.

Kistner EO, Weinberg CR. 2005. A method for identifying genes related to a quantitative trait, incorporating multiple siblings and missing parents. Genet Epidemiol 29:155–165.

Laird NM, Horvath S, Xu X. 2000. Implementing a unified approach to family-based tests of association. Genet Epidemiol 19 (suppl 1):S36–S42.

Lange C, DeMeo DL, Laird NM. 2002. Power and design considerations for a general class of family-based association tests: quantitative traits. Am J Hum Genet 71:1330–1341.

Lim S, Beyene J, Greenwood CMT. 2005. Continuous covariates in genetic association studies of case-parent triads: gene-environment interaction effects, population stratification, and power analysis. Stat Appl Genet Mol Biol 4:1–25.

Liu Y, Tritchler D, Bull SB. 2002. A unified framework for transmission-disequilibrium test analysis of discrete and continuous traits. Genet Epidemiol 22:26–40.

Lunetta K, Faraone S, Biederman J, Laird N. 2000. Family-based tests of association and linkage that use unaffected sibs, covariates, and interactions. Am J Hum Genet 66:605–614.

Schaid DJ, Sommer SS. 1993. Genotype relative risks: methods for design and analysis of candidate-gene association studies. Am J Hum Genet 53:1114–1126.

Schaid DJ. 1996. General score tests for associations of genetic markers with disease using cases and their parents. Genet Epidemiol 13:423–449.

Spielman RS, McGinnis RE, Ewens WJ. 1993. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 52:506–516.

Weinberg CR, Wilcox AJ, Lie RT. 1998. A log-linear approach to case-parent-triad data: assessing effects of disease genes that act either directly or through maternal effects and that may be subject to parental imprinting. Am J Hum Genet 62:969–978.

Weinberg CR. 1999. Methods for detection of parent-of-origin effects in genetic studies of case-parents triads. Am J Hum Genet 65:229–235.

Whittaker JC, Gharani N, Hindmarsh P, McCarthy MI. 2003. Estimation and testing of parent-of-origin effects for quantitative traits. Am J Hum Genet 72:1035–1039.

Xu H, Shete S. 2006. Mixed-effects logistic approach for association following linkage scan for complex disorders. Hum Hered, in press.

Yang Q, Rabinowitz D, Isasi C, Shea S. 2000. Adjusting for confounding due to population admixture when estimating the effect of candidate genes on quantitative traits. Hum Hered 50:227–233.

APPENDIX A

EXPRESSION OF QCPG LIKELIHOOD

The retrospective QCPG likelihood is parameterized in terms of parameters of interest (offspring genotype effects, denoted $\beta'$) and several nuisance parameters (denoted $\alpha'$) as shown in Table II. Here we express the $\beta'$ and $\alpha'$ parameters in the QCPG likelihood in terms of various parameters (denoted $\alpha$ and $\beta$) in a prospective model for trait given genotype. We do not propose to reparameterize the QCPG likelihood of Table II in terms of these prospective parameters before maximization. Rather, we continue to freely estimate the $\beta'$ and $\alpha'$ parameters when we maximise the QCPG likelihood. However, we use the relationship between the retrospective and prospective parameters to inform our understanding and interpretation of the retrospective parameters, since the $\alpha$ and $\beta$ parameters in the prospective model are generally more intuitively interpretable than those on the retrospective scale.

For a normally distributed quantitative trait $Y$ with mean $\mu$ and variance $\sigma^2$, the probability density function of the observed trait $y_i$ is given by

$$f(y_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - \mu)^2}{2\sigma^2}\right) \tag{7}$$

For a trio with parental mating type $M$ and offspring genotype $g_i$, $\mu$ equals the mean of the prospective QTDTM approach from equation (1), that is $\mu = \alpha_M + \beta_{g_i}$. Equation (7) becomes

$$f(y_i; \alpha_M, \beta_{g_i}, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - (\alpha_M + \beta_{g_i}))^2}{2\sigma^2}\right) \tag{8}$$

From equation (3), the contribution of a trio to the likelihood can be expressed as

$$P(g_i \mid g_{im}, g_{if}, y_i) = \frac{P(y_i \mid g_i)\, P(g_i \mid g_{im}, g_{if})}{\sum_{g_i^* \in S'_M} P(y_i \mid g_i^*)\, P(g_i^* \mid g_{im}, g_{if})} \tag{9}$$

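So, from equations (8) and (9), the contribution of a trio to the likelihood for a normally distributed trait with mean $\alpha_M + \beta_{g_i}$ and variance $\sigma^2$ is given by

$$
\begin{aligned}
P(g_i \mid g_{im}, g_{if}, y_i)
&= \frac{\dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i}))^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i^*}))^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})} \\[1ex]
&= \frac{\exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i}))^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i^*}))^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})} \\[1ex]
&= \frac{\exp\left(-\dfrac{y_i^2}{2\sigma^2} + \dfrac{2 y_i (\alpha_M + \beta_{g_i})}{2\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i})^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \exp\left(-\dfrac{y_i^2}{2\sigma^2} + \dfrac{2 y_i (\alpha_M + \beta_{g_i^*})}{2\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i^*})^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})}.
\end{aligned}
\tag{10}
$$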

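In each offspring’s contribution to the likelihood, the quantitative trait, $y_i$, its variance, $\sigma^2$, and the parental mating type parameter $\alpha_M$ are the same for the offspring (the ‘case’) and the pseudocontrols. Therefore, cancelling terms from the numerator and denominator,

$$
\begin{aligned}
P(g_i \mid g_{im}, g_{if}, y_i)
&= \frac{\exp\left(\dfrac{y_i \beta_{g_i}}{\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i})^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \exp\left(\dfrac{y_i \beta_{g_i^*}}{\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i^*})^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})} \\[1ex]
&= \frac{\exp\left(\dfrac{\beta_{g_i}}{\sigma^2} y_i - \dfrac{\alpha_M^2 + 2\alpha_M \beta_{g_i} + \beta_{g_i}^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \exp\left(\dfrac{\beta_{g_i^*}}{\sigma^2} y_i - \dfrac{\alpha_M^2 + 2\alpha_M \beta_{g_i^*} + \beta_{g_i^*}^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})} \\[1ex]
&= \frac{\exp\left(\dfrac{\beta_{g_i}}{\sigma^2} y_i - \dfrac{\beta_{g_i}(2\alpha_M + \beta_{g_i})}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^* \in S'_M} \exp\left(\dfrac{\beta_{g_i^*}}{\sigma^2} y_i - \dfrac{\beta_{g_i^*}(2\alpha_M + \beta_{g_i^*})}{2\sigma^2}\right) P(g_i^* \mid g_{im}, g_{if})}
\end{aligned}
$$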

which can be written in the following form:

$$
P(g_i \mid g_{im}, g_{if}, y_i) =
\frac{\exp\left(\left[\dfrac{\beta_{g_i}}{\sigma^2} y_i\right] + \left[-\dfrac{\beta_{g_i}(2\alpha_M + \beta_{g_i})}{2\sigma^2} + \ln\left(P(g_i \mid g_{im}, g_{if})\right)\right]\right)}
     {\sum_{g_i^* \in S'_M} \exp\left(\left[\dfrac{\beta_{g_i^*}}{\sigma^2} y_i\right] + \left[-\dfrac{\beta_{g_i^*}(2\alpha_M + \beta_{g_i^*})}{2\sigma^2} + \ln\left(P(g_i^* \mid g_{im}, g_{if})\right)\right]\right)}
\tag{11}
$$

This is in the general form of equation (4), where

$$
\beta'_{g_i^*} = \frac{\beta_{g_i^*}}{\sigma^2}
\quad \text{and} \quad
\alpha'_{M g_i^*} = -\frac{\beta_{g_i^*}(2\alpha_M + \beta_{g_i^*})}{2\sigma^2} + \ln\left(P(g_i^* \mid g_{im}, g_{if})\right)
$$
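The algebra leading from equation (9) to equation (11) can be checked numerically. The sketch below is not part of the original analysis: it evaluates both forms for a single trio with two heterozygous parents, using hypothetical parameter values, and the function and variable names are assumptions introduced for illustration.

```python
import math

GENOTYPES = ("1/1", "1/2", "2/1", "2/2")   # ordered genotypes of case and pseudocontrols


def prospective_form(y, g, alpha_M, beta, sigma2, mendel):
    """Equation (9) with the normal density of equation (8) substituted for P(y | g)."""
    def density(gs):
        mu = alpha_M + beta[gs]
        return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

    denom = sum(density(gs) * mendel[gs] for gs in GENOTYPES)
    return density(g) * mendel[g] / denom


def retrospective_form(y, g, alpha_M, beta, sigma2, mendel):
    """Equation (11): logistic form with beta' = beta / sigma^2 and nuisance alpha'."""
    def linear_predictor(gs):
        beta_prime = beta[gs] / sigma2
        alpha_prime = (-beta[gs] * (2 * alpha_M + beta[gs]) / (2 * sigma2)
                       + math.log(mendel[gs]))
        return beta_prime * y + alpha_prime

    denom = sum(math.exp(linear_predictor(gs)) for gs in GENOTYPES)
    return math.exp(linear_predictor(g)) / denom


# Hypothetical values for a trio with two heterozygous parents.
beta = {"1/1": 0.0, "1/2": 0.4, "2/1": 0.4, "2/2": 0.9}
mendel = {gs: 0.25 for gs in GENOTYPES}
args = dict(y=1.3, alpha_M=0.2, beta=beta, sigma2=1.5, mendel=mendel)
for g in GENOTYPES:
    print(g, prospective_form(g=g, **args), retrospective_form(g=g, **args))
```

The two columns agree up to floating-point rounding, because the terms dropped between equations (10) and (11) are common to the case and all pseudocontrols.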

Thus the parameters from the retrospective QCPG formulation (equation (4)) may be interpreted as follows. The term $\beta'_{g_i^*} = \beta_{g_i^*}/\sigma^2$ is the original genotype effect of genotype $g_i^*$ on the prospective scale, divided by the trait variance. (Hence, given parameter estimates $\beta'_{g_i^*}$ from fitting the QCPG model, the genotype effects on the prospective scale could be obtained by multiplying the estimate of $\beta'_{g_i^*}$ by the corresponding trait variance, if it were known or estimable.) The terms $\alpha'_{M g_i^*} = -\beta_{g_i^*}(2\alpha_M + \beta_{g_i^*})/2\sigma^2 + \ln(P(g_i^* \mid g_{im}, g_{if}))$ correspond to nuisance parameters that allow the model to account for non-Mendelianism and population stratification, as proposed by Kistner and Weinberg [2004]. As discussed in the text, not all of the $\beta'$ and $\alpha'$ are estimable, so restrictions must be made such as setting some of these equal to zero (equivalent to choosing a reference genotype category to which the other genotype effects are compared). This complicates the interpretation of the nuisance parameters. Suppose that all parameters were in fact estimable. Suppose further that there is no non-Mendelianism, so that $P(g_i^* \mid g_{im}, g_{if}) = 0.25$ for all offspring genotypes consistent with the parental genotypes. In that case, we could write out in full the relationships between the retrospective and prospective parameters implied by equation (11):

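$$\beta'_0 = \frac{\beta_0}{\sigma^2}, \quad \beta'_1 = \frac{\beta_1}{\sigma^2}, \quad \beta'_2 = \frac{\beta_2}{\sigma^2}$$

$$
\begin{aligned}
\alpha'_{010} &= -\frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{011} &= -\frac{\beta_1(2\alpha_{01} + \beta_1)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{110} &= -\frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{111} &= -\frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{112} &= -\frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{121} &= -\frac{\beta_1(2\alpha_{12} + \beta_1)}{2\sigma^2} + \ln(0.25) \\
\alpha'_{122} &= -\frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2} + \ln(0.25)
\end{aligned}
$$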

In practice, not all the parameters are estimable and we set $\beta'_1$, $\alpha'_{011}$, $\alpha'_{111}$ and $\alpha'_{121}$ (the reference parameters) equal to zero. Assuming that on the prospective scale we also set $\beta_1 = 0$, and rearranging the expressions above, the relationships become

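$$\beta'_0 = \frac{\beta_0}{\sigma^2}, \quad \beta'_1 = 0, \quad \beta'_2 = \frac{\beta_2}{\sigma^2}$$

$$
\begin{aligned}
\alpha'_{010} &= \alpha'_{011} + \frac{\beta_1(2\alpha_{01} + \beta_1)}{2\sigma^2} - \frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} = -\frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} \\
\alpha'_{110} &= \alpha'_{111} + \frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} - \frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} = -\frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} \\
\alpha'_{112} &= \alpha'_{111} + \frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} - \frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} = -\frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} \\
\alpha'_{122} &= \alpha'_{121} + \frac{\beta_1(2\alpha_{12} + \beta_1)}{2\sigma^2} - \frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2} = -\frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2}
\end{aligned}
$$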

Under the null hypothesis, the prospective genotype effects $\beta_0$ and $\beta_2$ equal zero, and so the true values of the retrospective parameters (both the genotype parameters of interest and the nuisance parameters) will also equal zero.


A similar argument applies if, instead of setting $\beta_1$ and $\beta'_1$ to zero, we set $\beta_0$ and $\beta'_0$ to zero, so that the genotype parameters are all calculated relative to the homozygous 1/1 genotype. In this case, we still find that under the null hypothesis that $\beta_1$ and $\beta_2$ equal zero, the true values of the retrospective parameters (both the genotype parameters of interest and the nuisance parameters) equal zero (data not shown).

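The above derivation for the nuisance parameters only works because $P(g_i^* \mid g_{im}, g_{if}) = 0.25$ in all cases and so is subtracted out when we express $\alpha'_{010}$, $\alpha'_{110}$, $\alpha'_{112}$ and $\alpha'_{122}$ in terms of $\alpha'_{011}$, $\alpha'_{111}$ and $\alpha'_{121}$. If, in fact, there was non-Mendelianism, we would have instead that

$$
\begin{aligned}
\alpha'_{jkl} &= -\frac{\beta_l(2\alpha_{jk} + \beta_l)}{2\sigma^2} + \ln\left(P(g_i^* = l \mid \{g_{im}, g_{if}\} = \{j, k\})\right) - \ln\left(P(g_i^* = 1 \mid \{g_{im}, g_{if}\} = \{j, k\})\right) \\
&= -\frac{\beta_l(2\alpha_{jk} + \beta_l)}{2\sigma^2} + c_{jkl}, \text{ say.}
\end{aligned}
$$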

In this case the term $c_{jkl}$ will be nonzero, but fitting $\alpha'_{jkl}$ allows the model to account for this, so that even under the null hypothesis where $\beta_0$ and $\beta_2$ equal zero, the nuisance parameters $\alpha'_{jkl}$ will not equal zero.

The effect of population stratification is modelled on the prospective scale via $\mu = \alpha_M + \beta_{g_i}$, which allows the mean trait value for a child to vary according to the genotype of its parents. If there really were population stratification, the correct model would in fact be $\mu = \mu_p + \beta_{g_i}$, where $\mu_p$ denotes the mean trait value in the (unknown) sub-population $p$ from which the child originates. If we replace $\alpha_M$ by $\mu_p$ everywhere above, we find that

$$\alpha'_{jkl} = -\frac{\beta_l(2\mu_p + \beta_l)}{2\sigma^2} + c_{jkl}$$

Thus the true value of the nuisance parameters should vary according to sub-population, which we do not allow for. However, under the null hypothesis where $\beta_0$ and $\beta_2$ equal zero, $\alpha'_{jkl}$ does not vary by sub-population and so we should still obtain valid inference for the $\beta'$ under the null, even though the model misspecification means we may not obtain valid inference under the alternative.
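To make the dependence on sub-population explicit, the following worked step compares the nuisance parameter in two sub-populations $p$ and $q$. It assumes, for illustration only, a common trait variance $\sigma^2$ and the same $c_{jkl}$ in both sub-populations; the superscripts $(p)$ and $(q)$ are notation introduced here and do not appear elsewhere in the paper.

$$
\alpha_{jkl}^{\prime\,(p)} - \alpha_{jkl}^{\prime\,(q)}
= -\frac{\beta_l(2\mu_p + \beta_l)}{2\sigma^2} + \frac{\beta_l(2\mu_q + \beta_l)}{2\sigma^2}
= -\frac{\beta_l(\mu_p - \mu_q)}{\sigma^2}
$$

The difference is proportional to $\beta_l$, so it vanishes when $\beta_l = 0$ whatever the value of $\mu_p - \mu_q$, and grows with both the genotype effect and the separation of the sub-population means.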

APPENDIX B

EXPRESSION OF QPL LIKELIHOOD

We may use a similar argument as in Appendix A to derive the relationship between the $\alpha''$ and $\beta''$ parameters in the QPL model and the prospective $\alpha$ and $\beta$ parameters. The only difference is in the summation in the denominator of equation (11), which is over either two or three possible offspring genotype categories corresponding to the categories shown in Table I (e.g. 1/1, 1/2 (unordered) and 2/2 for offspring of two heterozygous parents, rather than over four genotype categories 1/1, 1/2, 2/1 and 2/2). The result of this is that the relationships implied by equation (11) become

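$$\beta''_0 = \frac{\beta_0}{\sigma^2}, \quad \beta''_1 = \frac{\beta_1}{\sigma^2}, \quad \beta''_2 = \frac{\beta_2}{\sigma^2}$$

$$
\begin{aligned}
\alpha''_{010} &= -\frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{011} &= -\frac{\beta_1(2\alpha_{01} + \beta_1)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{110} &= -\frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} + \ln(0.25) \\
\alpha''_{111} &= -\frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{112} &= -\frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} + \ln(0.25) \\
\alpha''_{121} &= -\frac{\beta_1(2\alpha_{12} + \beta_1)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{122} &= -\frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2} + \ln(0.5)
\end{aligned}
$$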


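where the terms $P(g_i^* \mid g_{im}, g_{if})$ differ according to offspring genotype, for offspring of two heterozygous parents. If, as in Appendix A, we set the reference parameters $\beta_1$, $\beta''_1$, $\alpha''_{011}$, $\alpha''_{111}$ and $\alpha''_{121}$ to zero, we obtain the following equations:

$$\beta''_0 = \frac{\beta_0}{\sigma^2}, \quad \beta''_1 = 0, \quad \beta''_2 = \frac{\beta_2}{\sigma^2}$$

$$
\begin{aligned}
\alpha''_{010} &= \alpha''_{011} + \frac{\beta_1(2\alpha_{01} + \beta_1)}{2\sigma^2} - \frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} = -\frac{\beta_0(2\alpha_{01} + \beta_0)}{2\sigma^2} \\
\alpha''_{110} &= \alpha''_{111} - \ln(0.5) + \ln(0.25) + \frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} - \frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} = -\frac{\beta_0(2\alpha_{11} + \beta_0)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{112} &= \alpha''_{111} - \ln(0.5) + \ln(0.25) + \frac{\beta_1(2\alpha_{11} + \beta_1)}{2\sigma^2} - \frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} = -\frac{\beta_2(2\alpha_{11} + \beta_2)}{2\sigma^2} + \ln(0.5) \\
\alpha''_{122} &= \alpha''_{121} + \frac{\beta_1(2\alpha_{12} + \beta_1)}{2\sigma^2} - \frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2} = -\frac{\beta_2(2\alpha_{12} + \beta_2)}{2\sigma^2}
\end{aligned}
$$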

Under the null hypothesis that $\beta_0$ and $\beta_2$ equal zero, the $\alpha''_{jkl}$ do not all equal zero, and so it is necessary in the QPL model that the $\alpha''_{jkl}$ be included in order to obtain correct inference, even when there is no population stratification or non-Mendelianism.
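The contrast between the two parameterizations under the null can be made concrete for offspring of two heterozygous parents. The sketch below is illustrative only: the function names and the values of `alpha_jk` and `sigma2` are hypothetical, and Mendelian transmission is assumed. It evaluates the nuisance parameters relative to the heterozygous reference category, using a probability of 0.25 per ordered genotype for the QCPG and 0.25, 0.5, 0.25 for the unordered QPL categories.

```python
import math


def nuisance(beta_l, alpha_jk, sigma2, prob):
    """-beta_l (2 alpha_jk + beta_l) / (2 sigma^2) + ln(prob)."""
    return -beta_l * (2 * alpha_jk + beta_l) / (2 * sigma2) + math.log(prob)


def referenced(probs, beta, alpha_jk, sigma2, ref=1):
    """Nuisance parameters within one mating type, relative to the heterozygote (ref)."""
    raw = {l: nuisance(beta[l], alpha_jk, sigma2, p) for l, p in probs.items()}
    return {l: raw[l] - raw[ref] for l in raw if l != ref}


null_beta = {0: 0.0, 1: 0.0, 2: 0.0}           # null hypothesis: no genotype effects
# Two heterozygous parents; offspring genotype coded by number of variant alleles.
qcpg_probs = {0: 0.25, 1: 0.25, 2: 0.25}       # per ordered genotype (1/2 and 2/1 separate)
qpl_probs = {0: 0.25, 1: 0.50, 2: 0.25}        # unordered heterozygote category

qcpg = referenced(qcpg_probs, null_beta, alpha_jk=0.7, sigma2=1.0)
qpl = referenced(qpl_probs, null_beta, alpha_jk=0.7, sigma2=1.0)
print(qcpg)   # both entries are 0.0
print(qpl)    # both entries equal ln(0.5)
```

With no genotype effects the QCPG values vanish, whereas the QPL values retain the $\ln(0.5)$ offset, which is why the QPL nuisance parameters cannot be dropped even in the absence of stratification.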

APPENDIX C

DISCUSSION OF NUISANCE PARAMETERS

A difference between the QCPG and QPL methods arises with regard to the number of nuisance parameters estimated. It can be seen in Table I that there are a total of four nuisance parameters, $\alpha_{010}$, $\alpha_{110}$, $\alpha_{112}$, $\alpha_{122}$. However, by fitting the model using a polytomous logistic model as proposed by Kistner and Weinberg, two additional, essentially inestimable, nuisance parameters are estimated. These are referred to in Kistner and Weinberg ([2004] p. 36) as $\alpha_{012}$ and $\alpha_{120}$, corresponding to the situations where either one parent has no copies of the variant allele but the offspring has two copies, or one parent has two copies but the offspring has none. There is no data to estimate these parameters (they correspond to impossible events) but the computer program tries to estimate them since they are included in the model. Implementation of the QPL approach using SAS code available from http://dir.niehs.nih.gov/dirbb/weinbergfiles/qpl.htm (data not shown) indicates that estimation of these two inestimable $\alpha_{012}$ and $\alpha_{120}$ parameters is poor, with very large estimates and confidence intervals, although this does not appear to adversely affect estimation of the six genuinely estimable parameters.

For the QCPG and QPL methods, if there are genotype effects present, the nuisance parameters are non-zero. When population stratification exists, the true values of these nuisance parameters may differ in the different sub-populations (due to differences in $\alpha_M$ between sub-populations). If estimated in the combined population, the nuisance parameters will be an average of those from the different sub-populations and so will not necessarily provide accurate population-specific estimates. Hence, although the type 1 errors will be correct, we do not necessarily expect the QCPG (or QPL) method to have unbiased parameter estimates when used for estimation of effects under the alternative hypothesis with population stratification, even though they should be unbiased under the null.

When extending the QCPG method to multi-locus haplotypes, the number of nuisance parameters dramatically increases since the number of parental mating types and offspring genotypes increases. The nuisance parameters in the single-locus case are of the form $\alpha'_{M g_i^*} = -\beta_{g_i^*}(2\alpha_M + \beta_{g_i^*})/2\sigma^2 + \ln(P(g_i^* \mid g_{im}, g_{if}))$ (equation (11)).

Through simulations (described in the main text), we investigate whether the parental genotype information still required in the nuisance parameters can be captured if the nuisance parameters are replaced by parameters representing the offspring genotype alone ($\gamma_{g_i^*}$), thus reducing the overall number of nuisance parameters. Equation (11) would then become

$$
P(g_i \mid g_{im}, g_{if}, y_i) =
\frac{\exp\left(\left[\dfrac{\beta_{g_i}}{\sigma^2} y_i\right] + \gamma_{g_i}\right)}
     {\sum_{g_i^* \in S'_M} \exp\left(\left[\dfrac{\beta_{g_i^*}}{\sigma^2} y_i\right] + \gamma_{g_i^*}\right)}
$$


The hope is that the population stratification can be accounted for by offspring genotype alone, rather than by a combination of offspring and parental genotypes.

APPENDIX D

EXPRESSION OF THE QCEPG LIKELIHOOD

The rationale for expressing the likelihood equation (5) in the form of equation (6) is very similar to that used for the QCPG approach. Previously, for the QCPG method, the population mean of the quantitative trait $Y$ ($\mu$) was assumed to equal the mean of the prospective QTDTM method such that, for an offspring with genotype $g_i$, $\mu = \alpha_M + \beta_{g_i}$, for mating type $M$ and offspring genotype effect $\beta_{g_i}$. To include maternal genotype or parent-of-origin effects, the mean is now assumed to depend on parental mating type and the offspring genotype effect as before, but it now also depends on the maternal genotype effect, $\beta_{g_{im}}$, and the parent-of-origin effect $\beta_{I_m}$. Therefore,

$$\mu = \alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m}$$

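Hence, the trait distribution can be expressed as a normal distribution with mean $\mu$ as above, and substituting this into equation (5) gives

$$
\begin{aligned}
P(g_i \mid g_{im}, g_{if}, y_i, x)
&= \frac{\dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m}))^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^*, g_{im}^*, g_{if}^* \in G_x} \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\dfrac{(y_i - (\alpha_M + \beta_{g_i^*} + \beta_{g_{im}^*} + \beta^*_{I_m}))^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}^*, g_{if}^*)} \\[1ex]
&= \frac{\exp\left(-\dfrac{y_i^2 - 2 y_i(\alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m}) + (\alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m})^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^*, g_{im}^*, g_{if}^* \in G_x} \exp\left(-\dfrac{y_i^2 - 2 y_i(\alpha_M + \beta_{g_i^*} + \beta_{g_{im}^*} + \beta^*_{I_m}) + (\alpha_M + \beta_{g_i^*} + \beta_{g_{im}^*} + \beta^*_{I_m})^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}^*, g_{if}^*)} \\[1ex]
&= \frac{\exp\left(\dfrac{y_i(\alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m})}{\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i} + \beta_{g_{im}} + \beta_{I_m})^2}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^*, g_{im}^*, g_{if}^* \in G_x} \exp\left(\dfrac{y_i(\alpha_M + \beta_{g_i^*} + \beta_{g_{im}^*} + \beta^*_{I_m})}{\sigma^2} - \dfrac{(\alpha_M + \beta_{g_i^*} + \beta_{g_{im}^*} + \beta^*_{I_m})^2}{2\sigma^2}\right) P(g_i^* \mid g_{im}^*, g_{if}^*)} \\[1ex]
&= \frac{\exp\left(\dfrac{\beta_{g_i}}{\sigma^2} y_i + \dfrac{\beta_{g_{im}}}{\sigma^2} y_i + \dfrac{\beta_{I_m}}{\sigma^2} y_i - \dfrac{\omega}{2\sigma^2}\right) P(g_i \mid g_{im}, g_{if})}
        {\sum_{g_i^*, g_{im}^*, g_{if}^* \in G_x} \exp\left(\dfrac{\beta_{g_i^*}}{\sigma^2} y_i + \dfrac{\beta_{g_{im}^*}}{\sigma^2} y_i + \dfrac{\beta^*_{I_m}}{\sigma^2} y_i - \dfrac{\omega^*}{2\sigma^2}\right) P(g_i^* \mid g_{im}^*, g_{if}^*)}
\end{aligned}
$$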

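where

$$
\omega^* = \beta_{g_i^*}(2\alpha_M + \beta_{g_i^*}) + \beta_{g_{im}^*}(2\alpha_M + \beta_{g_{im}^*}) + \beta^*_{I_m}(2\alpha_M + \beta^*_{I_m}) + 2\beta_{g_i^*}(\beta_{g_{im}^*} + \beta^*_{I_m}) + 2\beta_{g_{im}^*}\beta^*_{I_m}
$$

and $\omega$ is the same expression evaluated at the observed genotypes.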
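The expression for $\omega^*$ is the squared mean with the common term $\alpha_M^2$ removed, and this can be confirmed numerically. The check below is a sketch with hypothetical function names and arbitrary test values, included only to verify the expansion.

```python
def omega(alpha_M, b_g, b_gm, b_I):
    """Expanded form: offspring, maternal and parent-of-origin terms plus cross terms."""
    return (b_g * (2 * alpha_M + b_g)
            + b_gm * (2 * alpha_M + b_gm)
            + b_I * (2 * alpha_M + b_I)
            + 2 * b_g * (b_gm + b_I)
            + 2 * b_gm * b_I)


def squared_mean_minus_alpha_sq(alpha_M, b_g, b_gm, b_I):
    """(alpha_M + b_g + b_gm + b_I)^2 with the common alpha_M^2 term subtracted."""
    return (alpha_M + b_g + b_gm + b_I) ** 2 - alpha_M ** 2


# Arbitrary integer test values, chosen so the arithmetic is exact.
print(omega(2, 1, 3, -1), squared_mean_minus_alpha_sq(2, 1, 3, -1))   # 21 21
```

Because $\alpha_M^2$ is shared by the case and every pseudocontrol, removing it from both numerator and denominator leaves the conditional probability unchanged.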
