1

Environmental and heritable factors in the causation of cancer.

The genetic epidemiology of cancer: Interpreting family and twin studies

Week 4, Stat 246, 2002

Background to and discussion of:

Lichtenstein et al, NEJM 343 2000: 78-84, and Risch, Cancer Epi., Biom. & Prev. 10 2001:733-741

2

Science, July 27, 2001:

Genes Come to the Fore in New Cancer Analysis

Last summer, scientists in Sweden and Finland got a lot of publicity when they published a paper, based on data from mammoth Scandinavian twin studies, concluding that inherited factors make a "minor contribution" to most cancers. But they were using the wrong methodology, says genetic epidemiologist Neil Risch of Stanford University. Risch has done an analysis that comes to the opposite conclusion: Genes play a strong role in who gets cancer.

Risch looked at the same data as in the earlier study, headed by Paul Lichtenstein of Sweden's Karolinska Institute. In the model Lichtenstein used to extract estimates of the relative contributions of genes and environment to cancer liability, environment nearly always won out. But Risch says that was the wrong model--one problem with it is that there aren't enough people with rare cancers to produce meaningful calculations. Risch instead looked at people in twin and family studies who had developed cancer and then estimated the likelihood that a first-degree family member would develop the same cancer. He found that in "the great majority of cancers," a family member was about twice as likely as the average person to develop the cancer. If anything--contrary to Lichtenstein's conclusions--the genetic risk was higher for rarer cancers, Risch reports in the July issue of Cancer Epidemiology Biomarkers & Prevention. Prostate, colorectal, and breast cancers are usually seen as having the strongest genetic components. But the top three on Risch's list are thyroid and testicular cancers and multiple myeloma. The exercise means that "we should be looking for susceptibility genes for all cancers," says Risch. Lichtenstein was on vacation and unavailable for comment. But cancer epidemiologist Sholom Wacholder of the National Cancer Institute in Bethesda, Maryland, says Risch's work is "a reminder of the need to be cautious about interpreting studies that attempt to distinguish genetic and environmental factors."

3

The papers in brief

Lichtenstein et al (2000). Combined data on 44,788 pairs of twins listed in the Swedish, Danish and Finnish twin registries in order to assess the risks of cancer at 28 anatomical sites for the twins of persons with cancer. Statistical modeling was used to estimate the relative importance of heritable and environmental factors in causing cancer at 11 of those sites.

Risch (2001). Offers a reassessment of the role of genetic factors in cancer susceptibility generally and for site-specific cancers in particular. Presents a detailed critique of Lichtenstein et al (2000).

4

Summary of conclusions

Lichtenstein et al. “Inherited genetic factors make a minor contribution to susceptibility to most types of cancers. This finding indicates that the environment has the principal role in causing sporadic cancers.”

Risch. “a) All cancers are familial to approximately the same degree, with only a few exceptions; b) early age of diagnosis is generally associated with increased familiality; c) familiality does not decrease with decreasing prevalence of the tumor - in fact the trend is toward increasing familiality with decreasing prevalence; d) a multifactorial (polygenic) threshold model fits the twin data for most cancers less well than single gene or genetic heterogeneity-type models; e) recessive inheritance is less likely generally than dominant or additive models; f) heritability decreases for rarer tumors only in the context of the polygenic model but not in the context of single-locus or heterogeneity models; g) although the family and twin data do not account for gene-environment interaction or confounding, they are still consistent with genes contributing high attributable risks for most cancer sites.”

5

Setting the scene, I

Lichtenstein et al use the multifactorial (polygenic) threshold (MFT) model, and infer the relative contributions of heredity and environment within that model. Their analysis rests on “the usual assumptions of a classic twin study (that there was random mating, no interactions between genes and environment, and equivalent environments for monozygotic and dizygotic twins). Phenotypic variance was divided into a component due to inherited genetic factors (heritability), a component due to environmental factors common to both members of the pair of twins (the shared environmental component), and a component due to environmental factors unique to each twin (the nonshared environmental component)”.

6

Setting the scene, II

By contrast, Risch makes extensive use of familial risk ratios (FRRs). These are quantities denoted by $\lambda_R$, where R denotes a relationship (S = sib, O = offspring, DZ = dizygotic twin, etc.), and whose values are the risks of relatives of type R of affected individuals being themselves affected (here by cancer), divided by the population prevalence.

A way to think of $\lambda_R$ is as pr(affected | R affected)/pr(affected): the probability (risk) of someone being affected, given that their relative of type R is affected, divided by the unconditional probability of that person being affected.

In this view, it is entirely analogous to the coincidence coefficient we met in the study of interference.

If we denote the population prevalence of our trait by $K$, and the frequency of affected pairs with relationship R by $K_2$, then $\lambda_R = K_2 / K^2$.
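The last identity follows in one line from the conditional-probability description. As a sketch, writing $\lambda_R$ for the familial risk ratio and assuming that both members of a pair have the same marginal risk $K$:

```latex
\lambda_R
  = \frac{\Pr(\text{affected} \mid \text{relative of type } R \text{ affected})}
         {\Pr(\text{affected})}
  = \frac{K_2 / K}{K}
  = \frac{K_2}{K^2}.
```

In particular, $\lambda_R = 1$ when the two relatives are affected independently, since then $K_2 = K^2$.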

7

More on familial risk ratios

They can be estimated directly from family data, and can also be studied theoretically, by calculating them under different assumptions concerning penetrances and susceptibility allele frequencies.

In particular, we can estimate $\lambda_R$ for R = MZ and R = DZ from twin data, and also study the behaviour of these quantities under different genetic models, e.g. a single rare dominant gene causing susceptibility, or a recessive gene, whose susceptibility allele frequency ranges from very common to very rare.

8

Points to consider when comparing two types of models for the involvement of genes in disease susceptibility

How do the models relate to our current understanding of genetics in general, and that of disease susceptibility in particular?

How interpretable are the models’ parameters? How do the models relate to available data?

Do they fit? Does their qualitative behaviour reflect broadly observed trends?

9

Single factor models

Suppose that the levels of a quantitative trait in a population are influenced by a gene at a locus at which two (unobserved) alleles A and a are segregating.

Suppose further (with Weinberg and Hardy) that the population frequencies of types AA, Aa and aa are $p^2$, $2pq$ and $q^2$, respectively, where $0 < p < 1$ and $q = 1 - p$.

Finally, suppose that matings are at random in relation to genotypes at this locus.

Then the joint distribution of a parental (P) genotype and that of an offspring (O) at this locus is readily determined.

E.g. the mating AA × Aa has population frequency $p^2 \times 2pq$ and, by Mendel’s law of segregation, produces offspring with genotypes 50% AA and 50% Aa. Calculations like this for all 9 mating-pair types lead to the table:

10

One-factor models: the joint distribution of parent-offspring genotypes

Parental genotype (rows) by offspring genotype (columns):

            AA          Aa          aa
AA          $p^3$       $p^2 q$     0
Aa          $p^2 q$     $pq$        $pq^2$
aa          0           $pq^2$      $q^3$

Exercise: Obtain the above table and the corresponding table for full sibs.
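One way to check a hand derivation of the parent-offspring table is to enumerate the 9 mating-pair types numerically. The sketch below assumes Hardy-Weinberg frequencies and random mating, as in the text; the function names are illustrative and not taken from the slides.

```python
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")

def hardy_weinberg(p):
    """Genotype frequencies for allele frequency p of A (and q = 1 - p of a)."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

def transmit_A(genotype):
    """Probability that a parent of this genotype transmits allele A."""
    return genotype.count("A") / 2.0

def offspring_dist(mother, father):
    """Mendelian offspring genotype distribution for one mating type."""
    m, f = transmit_A(mother), transmit_A(father)
    return {"AA": m * f, "Aa": m * (1 - f) + (1 - m) * f, "aa": (1 - m) * (1 - f)}

def parent_offspring_table(p):
    """Joint distribution of (parent, offspring) genotypes under random mating."""
    freq = hardy_weinberg(p)
    table = {(g, o): 0.0 for g, o in product(GENOTYPES, GENOTYPES)}
    for parent, mate in product(GENOTYPES, GENOTYPES):  # all 9 mating-pair types
        mating_freq = freq[parent] * freq[mate]
        for child, prob in offspring_dist(parent, mate).items():
            table[(parent, child)] += mating_freq * prob
    return table

# Print the table for an arbitrary allele frequency, rows = parent, columns = offspring.
table = parent_offspring_table(0.3)
for parent in GENOTYPES:
    print(parent, [round(table[(parent, child)], 4) for child in GENOTYPES])
```

Replacing the independent second child draw by a pair of children from the same mating would give the full-sib table asked for in the exercise.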

11

One-factor models: parent-offspring correlation

Now suppose that deviations from the population mean of the trait for individuals with genotypes AA, Aa and aa are u, v and w, respectively.

With Yule (1906) we put p=1/2. Then the correlation between the trait values of P and O is

$\mathrm{corr}(P,O) = \dfrac{(u-w)^2}{2(u-w)^2 + (u-2v+w)^2}$.

Additivity: $u + w = 2v$, corr(P,O) = 1/2.

Dominance: $v = w$, corr(P,O) = 1/3.

This calculation reconciled the Mendelian and Biometric schools at the turn of the last century.

Exercise: Derive the above and obtain corr(S,S’) for full sibs S, S’ in the same way. (Remember that u+2v+w=0. Why?) Redo with general allele frequencies.
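A numerical check can be useful before attempting the algebra. The sketch below computes corr(P,O) directly from the joint parent-offspring genotype table and compares it with the closed form for p = 1/2; the function names are illustrative, and the trait values are centred inside the function, so they need not satisfy u + 2v + w = 0 on input.

```python
def joint_parent_offspring(p):
    """Joint (parent, offspring) genotype probabilities, rows/columns ordered AA, Aa, aa."""
    q = 1.0 - p
    return [
        [p**3,     p**2 * q, 0.0],
        [p**2 * q, p * q,    p * q**2],
        [0.0,      p * q**2, q**3],
    ]

def corr_parent_offspring(values, p=0.5):
    """Parent-offspring correlation for trait values (u, v, w) of genotypes AA, Aa, aa."""
    q = 1.0 - p
    marginal = [p * p, 2 * p * q, q * q]
    mean = sum(f * x for f, x in zip(marginal, values))
    dev = [x - mean for x in values]                     # deviations from the population mean
    var = sum(f * d * d for f, d in zip(marginal, dev))
    joint = joint_parent_offspring(p)
    cov = sum(joint[i][j] * dev[i] * dev[j] for i in range(3) for j in range(3))
    return cov / var

def closed_form_half(u, v, w):
    """The p = 1/2 formula quoted above."""
    return (u - w) ** 2 / (2 * (u - w) ** 2 + (u - 2 * v + w) ** 2)

print(corr_parent_offspring((2.0, 1.0, 0.0)))   # additive case
print(corr_parent_offspring((1.0, 0.0, 0.0)))   # dominance case, v = w
print(corr_parent_offspring((1.0, 0.0, 0.0), p=0.1))  # general allele frequency
```

The last call shows how the same function handles general allele frequencies, the final part of the exercise.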

12

Multifactorial (polygenic) models

In general, we can postulate arbitrarily many segregating loci, with arbitrary numbers of alleles, and arbitrary allele frequencies, though still satisfying Hardy-Weinberg equilibrium, each locus contributing independently to the quantitative trait, and repeat the analysis just presented. This was done by RA Fisher in a 1918 paper which laid out the framework that underlies the MFT model. One result is a number of formulae like the following:

$\mathrm{cov}(\text{MZ twins}) = V_A + V_D$, $\mathrm{cov}(\text{DZ twins}) = 0.5\,V_A + 0.25\,V_D$,

where $V_A$ and $V_D$ are called the additive and dominance variances, respectively. The total variance of the quantitative trait in this context is

$V = V_A + V_D + V_E$, where $V_E$ is called the environmental variance. An extra twist in twin studies is a further term $V_S$ for shared family environment effects, which are postulated to be uncorrelated with the general environmental effects contributing to $V_E$.

In what follows, we suppose that $V_D = 0$, and adopt the terminology that calls $H = V_A / V$ the heritability of the quantitative trait.

13

Threshold models for disease susceptibility

The analysis of RA Fisher was all for quantitative traits. But most diseases are qualitative traits: you get them, or you don’t. People wanted to use the preceding framework for disease traits, and they did it by postulating an unobserved liability to which the foregoing genetic analysis applies. They then suppose an individual becomes affected when their liability exceeds a threshold T. In practice, the population distribution of liabilities is always assumed to be normal N(0,1), and T is determined by the population prevalence K, e.g. if K = 1%, then T=2.33.

When pairs of individuals are being considered, the joint distribution of their liabilities is always assumed to be bivariate normal BN(0,0; 1,1; $\rho$), where $\rho$ is the appropriate multiple (1.0 for MZ, 0.5 for DZ twins) of the additive variance, which here is just the heritability, as the total variance is 1.

Thus from a knowledge of the prevalence and the heritability of the liability of a disease trait, and the relationship between two individuals, we could calculate the chance under the MFT model that neither, one and not the other, or both will be affected (see later).

In practice, the reverse is done: from the frequencies of such concordant and discordant relative pairs, people can estimate heritability under the MFT model, and this is exactly what Lichtenstein et al (2000) did.
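The forward calculation described above (from prevalence, heritability and relationship to the chances of both, one, or neither twin being affected) can be sketched with the Python standard library alone. This is a minimal illustration, not the fitting procedure of Lichtenstein et al: the bivariate normal probability is obtained by Simpson's rule applied to the conditional normal, and the function names, step count and integration width are assumptions made here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability distribution N(0, 1)

def threshold(K):
    """Liability threshold T such that P(liability > T) = K."""
    return STD_NORMAL.inv_cdf(1.0 - K)

def prob_both_affected(K, rho, steps=4000, width=12.0):
    """P(X > T, Y > T) for standard bivariate normal liabilities with correlation rho.

    Integrates phi(x) * P(Y > T | X = x) over x > T by Simpson's rule,
    using Y | X = x ~ N(rho * x, 1 - rho^2).
    """
    if rho >= 1.0:
        return K  # identical liabilities: both affected whenever one is
    T = threshold(K)
    s = sqrt(1.0 - rho * rho)

    def integrand(x):
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((T - rho * x) / s))

    h = width / steps  # steps must be even for Simpson's rule
    total = integrand(T) + integrand(T + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(T + i * h)
    return total * h / 3.0

def pair_probabilities(K, heritability, relationship="DZ"):
    """Chances that both, exactly one, or neither member of a twin pair is affected."""
    multiple = {"MZ": 1.0, "DZ": 0.5}[relationship]
    both = prob_both_affected(K, multiple * heritability)
    one = 2.0 * (K - both)
    neither = 1.0 - both - one
    return both, one, neither

print(round(threshold(0.01), 2))  # the K = 1% example from the text
print(pair_probabilities(0.01, 0.5, "MZ"))
print(pair_probabilities(0.01, 0.5, "DZ"))
```

Estimating heritability, as in the paper, amounts to running this calculation in reverse: choosing the variance components whose predicted pair probabilities best match the observed counts.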

14

MFT models for disease susceptibility

Now let’s get to the MFT model used by Lichtenstein et al in their analysis. As you see from p. 81 of their paper, their data for a particular type of cancer, say stomach cancer, come in the form of counts for male and female MZ and DZ twin pairs, namely a concordant affected pairs, b and c each half the number of discordant pairs (one affected, one unaffected), and d = n-a-b-c concordant unaffected pairs. They find it convenient to form a 2×2 table with entries a, b, c and d. These counts were also stratified by country in their analysis, but presented in aggregated form in the paper.

15

Lichtenstein et al’s analysis of 12 × 11 sets of 2×2 tables

For each country (3), sex (2) and zygosity (2), and each of 11 cancer sites or types, L et al have a 2×2 table of counts.

         +    -
    +    a    b
    -    c    d

Here a is the number of concordant affected twin pairs, etc. Suppose that this table corresponds to DZ twins, say, from a country where the liability threshold for affectedness is $T$, and the components of variance in the bivariate normal model for the liability are $V_A$ (additive), $V_S$ (shared environment), and $V_E$ (nonshared environment), all adding to 1. The correlation for DZ twins’ liabilities is $\rho = 0.5\,V_A + V_S$, with 1.0 instead of 0.5 for MZ twins. Their joint analysis of the 12 2×2 tables for a given site or type assumes country-dependent thresholds, but common, sex-specific variance components.
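The two correlation equations can be inverted by hand, which makes clear what the model extracts from the MZ/DZ contrast. The sketch below is only this algebra, under the stated assumption that the three components sum to 1; it is not the maximum-likelihood fit used in the paper, and the function names are illustrative.

```python
def twin_correlations(v_a, v_s):
    """Liability correlations implied by additive (v_a) and shared-environment (v_s) variance."""
    return {"MZ": 1.0 * v_a + v_s, "DZ": 0.5 * v_a + v_s}

def variance_components(rho_mz, rho_dz):
    """Solve rho_MZ = V_A + V_S and rho_DZ = 0.5 V_A + V_S; V_E is the remainder of 1."""
    v_a = 2.0 * (rho_mz - rho_dz)
    v_s = 2.0 * rho_dz - rho_mz
    v_e = 1.0 - v_a - v_s
    return v_a, v_s, v_e

# Point estimates reported for stomach cancer in these slides: V_A = 0.28, V_S = 0.10.
rho = twin_correlations(0.28, 0.10)
print(rho)
print(variance_components(rho["MZ"], rho["DZ"]))
```

The additive component is driven entirely by the difference between the MZ and DZ correlations, so anything else that makes MZ pairs more alike than DZ pairs is attributed to $V_A$ under this model.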

16

Summary data and results for stomach cancer

Twin pairs    a    b+c    RR      CI             Concordance
Men MZ        6    131    9.9     (4.1-23.6)     0.08
Men DZ        8    256    6.6     (3.2-13.8)     0.06
Women MZ      5    92     19.7    (7.5-51.6)     0.10
Women DZ      4    198    6.2     (2.2-17.1)     0.04

$V_A$ = 0.28 (0.00-0.51) (not sig. diff. from 0)

$V_S$ = 0.10 (0.00-0.34)

$V_E$ = 0.62 (0.49-0.76)

Goodness-of-fit $\chi^2$ = 8.9 on 38 d.f.

17

Comment on relative risks

L et al calculate a conventional odds ratio, which they term relative risk, $R = ad/bc$ from the 2×2 table entries, plus a confidence interval for $R$, as well as a twin concordance $2a/(2a+b+c)$ for the absolute risk. The second makes sense, but the first doesn’t, as the division of discordants into two equal-sized groups to form the 2×2 table is artificial. They should be using Risch’s $\lambda_{DZ}$ and $\lambda_{MZ}$; see later.
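Both summaries are easy to reproduce from the counts. The sketch below uses the male MZ stomach cancer figures quoted in these slides (a = 6, b+c = 131, n = 7,231) and splits the discordant pairs equally, which is exactly the artificial step criticised above; the function name is illustrative.

```python
def twin_summaries(a, discordant, n):
    """Odds ratio ad/bc and concordance 2a/(2a+b+c) for one group of twin pairs.

    a          : concordant affected pairs
    discordant : discordant pairs (b + c), split equally into b and c
    n          : total number of pairs
    """
    b = c = discordant / 2.0
    d = n - a - discordant
    odds_ratio = (a * d) / (b * c)
    concordance = 2.0 * a / (2.0 * a + discordant)
    return odds_ratio, concordance

or_mz, conc_mz = twin_summaries(6, 131, 7231)
print(round(or_mz, 1), round(conc_mz, 2))  # 9.9 0.08
```

These agree with the RR of 9.9 and concordance of 0.08 tabulated for male MZ pairs.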

18

Comment on the model fitting and residual degrees of freedom

What should the residual d.f. be? I count 12 = 3 (countries) × 2 (sexes) × 2 (twin types) sets of either 3 (my view) or 4 numbers: a = # concordant affected, b+c (or b, c) discordants, and d = n-a-b-c concordant unaffecteds. I say 3 here, as there is no real way to split the discordants, since twins are unlabelled. So I get 36 numbers, actually only 24 freely varying ones, since one of each triple is determined by the other 2 and n. The other calculation gives 48, but really only 36.

How many parameters are estimated? I count 3 thresholds, and either 2 or 4 freely varying variance components (say additive and shared environment); the remainder comes by subtracting the first two from 1. The number 2 is right if we have pooled across sexes, as Table 3 suggests; otherwise it is 4, for sex-specific analyses.

However, not one of 24, 36 or 48 minus 3+2 or 3+4 gives 38! Any ideas?
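The bookkeeping can be laid out explicitly. The sketch below simply enumerates the candidate data counts and parameter counts named above and confirms that no combination yields 38; the dictionary labels are paraphrases, not terms from the paper.

```python
from itertools import product

data_counts = {
    "24 (free counts, discordants pooled)": 24,
    "36 (all triples, or free counts with discordants split)": 36,
    "48 (all quadruples a, b, c, d)": 48,
}
param_counts = {
    "3 thresholds + 2 variance components": 3 + 2,
    "3 thresholds + 4 variance components": 3 + 4,
}

# Residual d.f. = number of data values minus number of fitted parameters.
residual_df = {
    (d, p): data_counts[d] - param_counts[p]
    for d, p in product(data_counts, param_counts)
}
for (d, p), df in residual_df.items():
    print(f"{d} minus {p}: {df}")
print("38 among them?", 38 in residual_df.values())
```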

19

Some comments from the paper

“The statistical model we used provided an excellent fit to the observed data.”

“Although the model fitting can be used to estimate the magnitude of the heritable component of susceptibility to cancer, it cannot reveal how this component acts or how it interacts with other factors.”

“…we cannot exclude a modifying effect of environment on the genetic component found in our analyses of twins.”

20

A cautionary remark

“…the study of twins, from being regarded as one of the easiest and most reliable kinds of researches in human genetics, must now be regarded as one of the most treacherous.”

From L S Penrose, Outline of human genetics, Heinemann, 1959, quoted in Pak Sham, Statistics in human genetics, Arnold, 1998.

21

Additional references

Michael C Neale and Lon R Cardon, Methodology for genetic studies of twins and families, Kluwer, 1992.

A very comprehensive treatise on the methods of Lichtenstein et al (2000).

Arthur S. Goldberger and Leon J. Kamin, Behavior-Genetic Modeling of Twins: A Deconstruction, SSRI Working Paper #9824, University of Wisconsin, 1998.

http://www.ssc.wisc.edu/econ/archive/wp9824.htm

As it says, a deconstruction.

22

Risch’s critique: a beginning

First, some facts about the familial risk ratios.

If genetic susceptibility is attributable to a single (rare) dominant gene, then $\lambda_P = \lambda_O = \lambda_S = \lambda_{DZ} = (\lambda_{MZ} + 1)/2$, and so $R_{MD} = (\lambda_{MZ} - 1)/(\lambda_{DZ} - 1) = 2$.

If susceptibility is attributable to a recessive gene, then $\lambda_P = \lambda_O < \lambda_S = \lambda_{DZ}$. For a recessive model, $R_{MD}$ is usually > 2, depending on the allele frequency. For a rare allele, $R_{MD} = 4$, but it diminishes toward 2 if the allele is very common. (Exercise: Derive these facts.)

23

Proofs of results on previous page

As in the calculation on p.10, we need to enumerate all 9 mating-pair types and their population frequencies, and then calculate the chance that they have 2 affected sibs, under a single rare dominant gene model. Then we form $R_{MD}$. The algebra is ugly, apart from the quantity $K = p^2 + 2p(1-p)$. Clearly (why?) $\lambda_{MZ} = 1/K$, while $\lambda_{DZ} = K_2/K^2$, where $K_2$ is a quartic polynomial in $p$. If we go directly to $R_{MD} = (\lambda_{MZ} - 1)/(\lambda_{DZ} - 1)$, we see that it is $K(1-K)/(K_2 - K^2)$. The numerator and denominator are both quartics in $p$, but we only need the leading terms, which are $2p$ and $p$, respectively, so that for small $p$ their ratio is 2 (Why?).

Exercise: Obtain the second result cited on p.22.
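The limiting values can be checked numerically before doing the algebra. The sketch below enumerates the 9 mating-pair types for a two-allele locus and computes $\lambda_{MZ}$, $\lambda_S$ (taken equal to $\lambda_{DZ}$) and $R_{MD} = (\lambda_{MZ} - 1)/(\lambda_{DZ} - 1)$. Full penetrance and no phenocopies are assumptions made here for illustration, and the function and constant names are not from the slides.

```python
from itertools import product

def single_locus_risk_ratios(p, penetrance):
    """Return (lambda_MZ, lambda_sib, R_MD) for a two-allele locus.

    p          : frequency of the susceptibility allele A (Hardy-Weinberg, random mating)
    penetrance : risks indexed by the number of A alleles carried (0, 1, 2)
    """
    q = 1.0 - p
    freq = (q * q, 2 * p * q, p * p)  # genotype frequencies by A-allele count
    K = sum(g * x for g, x in zip(freq, penetrance))        # population prevalence
    K2_mz = sum(g * x * x for g, x in zip(freq, penetrance))  # MZ twins share a genotype

    K2_sib = 0.0
    for mother, father in product(range(3), repeat=2):  # the 9 mating-pair types
        tm, tf = mother / 2.0, father / 2.0             # chance each parent transmits A
        child = ((1 - tm) * (1 - tf), tm * (1 - tf) + (1 - tm) * tf, tm * tf)
        risk = sum(c * x for c, x in zip(child, penetrance))
        # Two sibs are independent draws given the parental genotypes.
        K2_sib += freq[mother] * freq[father] * risk * risk

    lam_mz, lam_sib = K2_mz / K**2, K2_sib / K**2
    return lam_mz, lam_sib, (lam_mz - 1.0) / (lam_sib - 1.0)

DOMINANT = (0.0, 1.0, 1.0)   # carriers of at least one A are affected
RECESSIVE = (0.0, 0.0, 1.0)  # only AA individuals are affected

for p in (0.001, 0.01, 0.1, 0.5, 0.9):
    r_dom = single_locus_risk_ratios(p, DOMINANT)[2]
    r_rec = single_locus_risk_ratios(p, RECESSIVE)[2]
    print(f"p = {p:<5}  R_MD dominant = {r_dom:.3f}  R_MD recessive = {r_rec:.3f}")
```

Changing the penetrance tuples gives a quick way to explore reduced penetrance or intermediate heterozygote risk.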

24

Penetrance functions

Recall that penetrance is the chance of being affected, given genotype, and can be written $x_i$ for genotype $i$. Various new penetrance functions can be built up from single locus penetrances. Examples include:

With phenocopies: $x'_i = \min(1, x_i + \delta)$, where $\delta$ denotes the phenocopy rate.

Multi-locus models (here $i, j, k, \ldots$ refer to different loci):

Multiplicative: $x_{ijk\ldots} = x_i x_j x_k \cdots$

Additive: $x_{ijk\ldots} = \min(1, x_i + x_j + x_k + \cdots)$

Genetic heterogeneity: $x_{ijk\ldots} = 1 - (1-x_i)(1-x_j)(1-x_k)\cdots$
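These constructions translate directly into small functions. The sketch below is illustrative only: the function names and the example penetrances are assumptions, and each function takes the list of single-locus penetrances for the genotypes an individual carries at the loci involved.

```python
from math import prod

def with_phenocopies(x, rate):
    """Single-locus penetrance x raised by a phenocopy rate, capped at 1."""
    return min(1.0, x + rate)

def multiplicative(penetrances):
    """Joint penetrance as the product of the single-locus penetrances."""
    return prod(penetrances)

def additive(penetrances):
    """Joint penetrance as the sum of the single-locus penetrances, capped at 1."""
    return min(1.0, sum(penetrances))

def heterogeneity(penetrances):
    """Affected if at least one locus acts: 1 - prod(1 - x)."""
    return 1.0 - prod(1.0 - x for x in penetrances)

example = [0.1, 0.2]  # hypothetical penetrances at two loci
print(multiplicative(example), additive(example), heterogeneity(example))
```

For small penetrances the additive and heterogeneity forms nearly coincide, since $1 - (1-x_i)(1-x_j) = x_i + x_j - x_i x_j$.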

25

Non-genetic cases, more than one gene.

How do the foregoing calculations change if there is a proportion of non-genetic cases (so-called phenocopies) of our disease, or if more than one gene influences susceptibility? Risch asserts that phenocopies “do not influence the predictions given above”. Can you see why?

Further, he asserts that if we have locus heterogeneity, the same predictions hold, and that this is also true if we have additivity of risks (penetrances).

On the other hand, if epistasis (interaction) is present among different loci, e.g. if penetrances are multiplicative, things can be very different.

See Risch’s 1990 paper for fuller details on these issues.

26

Familial relative risks and the MFT model

We saw a few pages back that $\lambda_{DZ} = K_2 / K^2$, where $K_2$ is the probability of both twins having liabilities exceeding the threshold $T$, a bivariate normal integral whose correlation is $0.5\,V_A + V_S$. Risch’s Table 1 relates $K$, $V_A = H$, $\lambda_{DZ}$ and $\lambda_{MZ}$ when $V_S = 0$, and draws two important conclusions:

• For a fixed value of $\lambda$, the heritability $H$ decreases with decreasing $K$;

• For the MFT model, $R_{MD}$ is always > 2, and increases directly with heritability $H$ and inversely with prevalence $K$.

27

$\lambda_{MZ}$, $\lambda_{DZ}$ and the MFT model with shared environment

We are interested in the impact of shared environment on $R_{MD}$ under the MFT model. This isn’t dealt with very neatly by Risch, who says: “For the case where the shared familial environment is equivalent between MZ and DZ twin pairs, $R_{MD}$ will be attenuated: if $R_{MD} = 2$ without an equivalent shared environment, then $R'_{MD} < 2$ with it.”

To investigate this we recalculate Risch’s Table 1 with a shared environment term in the variance.
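A recalculation of this kind can be sketched with the standard library. The code below evaluates the bivariate normal probability by Simpson's rule and forms the MZ and DZ risk ratios and $R_{MD}$, with and without a shared environment term. The slides do not state the value of $V_S$ used in the tables that follow; $V_S = 0.10$ is an assumption here, chosen because the tabulated values look consistent with it (for example, the MZ entry at H = 0.20 with shared environment equals the MZ entry at H = 0.30 without). Function names and numerical settings are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()  # standard normal liability

def risk_ratio(K, rho, steps=4000, width=12.0):
    """lambda = K2 / K^2 under the MFT model for liability correlation rho."""
    if rho >= 1.0:
        return 1.0 / K
    T = N01.inv_cdf(1.0 - K)  # threshold fixed by the prevalence
    s = sqrt(1.0 - rho * rho)

    def integrand(x):
        # density of one liability times the chance the other also exceeds T
        return N01.pdf(x) * (1.0 - N01.cdf((T - rho * x) / s))

    h = width / steps  # steps must be even for Simpson's rule
    total = integrand(T) + integrand(T + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(T + i * h)
    return (total * h / 3.0) / K**2

def mft_ratios(K, H, v_s=0.0):
    """(lambda_MZ, lambda_DZ, R_MD) for prevalence K, heritability H, shared environment v_s."""
    lam_mz = risk_ratio(K, H + v_s)
    lam_dz = risk_ratio(K, 0.5 * H + v_s)
    return lam_mz, lam_dz, (lam_mz - 1.0) / (lam_dz - 1.0)

for v_s in (0.0, 0.10):
    for K in (0.05, 0.01, 0.001):
        row = [mft_ratios(K, H, v_s)[2] for H in (0.20, 0.35, 0.50)]
        print(f"V_S = {v_s:.2f}  K = {K:.3f}  R_MD:", "  ".join(f"{r:.2f}" for r in row))
```

With $V_S = 0$ every value of $R_{MD}$ printed exceeds 2, while adding the shared environment term pulls the values down, below 2 for the more common cancers and lower heritabilities.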

28

Risch’s Table 1 expanded: $\lambda_{MZ}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    2.10    2.46    2.85    3.29    3.77    4.30    4.88
0.040    2.24    2.66    3.13    3.65    4.22    4.86    5.57
0.030    2.44    2.94    3.52    4.17    4.89    5.71    6.62
0.025    2.57    3.14    3.79    4.53    5.38    6.33    7.40
0.020    2.75    3.40    4.16    5.03    6.03    7.17    8.47
0.015    3.00    3.78    4.69    5.76    7.01    8.44    10.09
0.010    3.39    4.38    5.56    6.98    8.66    10.63   12.94
0.005    4.19    5.64    7.46    9.72    12.48   15.82   19.85
0.003    5.18    7.29    10.04   13.56   18.03   23.62   30.56
0.001    6.89    10.26   14.90   21.16   29.45   40.28   54.26

29

Risch’s Table 1 expanded: $\lambda_{DZ}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    1.49    1.63    1.77    1.93    2.10    2.27    2.46
0.040    1.54    1.70    1.87    2.05    2.24    2.44    2.66
0.030    1.61    1.80    2.00    2.21    2.44    2.68    2.94
0.025    1.66    1.86    2.08    2.32    2.57    2.85    3.14
0.020    1.72    1.95    2.19    2.46    2.75    3.06    3.40
0.015    1.80    2.06    2.35    2.66    3.00    3.37    3.78
0.010    1.93    2.24    2.58    2.96    3.39    3.86    4.38
0.005    2.16    2.57    3.04    3.58    4.19    4.87    5.64
0.003    2.43    2.97    3.60    4.33    5.18    6.16    7.29
0.001    2.83    3.58    4.49    5.59    6.89    8.44    10.26

30

Risch’s Table 1 expanded: $R_{MD}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    2.26    2.33    2.39    2.46    2.52    2.59    2.66
0.040    2.30    2.38    2.45    2.53    2.60    2.68    2.76
0.030    2.35    2.44    2.53    2.62    2.71    2.80    2.89
0.025    2.39    2.49    2.58    2.68    2.78    2.88    2.99
0.020    2.43    2.54    2.65    2.76    2.88    2.99    3.11
0.015    2.49    2.62    2.74    2.87    3.00    3.14    3.28
0.010    2.58    2.73    2.89    3.05    3.21    3.37    3.54
0.005    2.74    2.95    3.16    3.38    3.60    3.83    4.06
0.003    2.93    3.20    3.48    3.77    4.07    4.38    4.70
0.001    3.21    3.58    3.98    4.40    4.83    5.28    5.75

31

Risch’s Table 1 with shared env: $\lambda_{MZ}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    2.85    3.29    3.77    4.30    4.88    5.51    6.21
0.040    3.13    3.65    4.22    4.86    5.57    6.36    7.22
0.030    3.52    4.17    4.89    5.71    6.62    7.65    8.79
0.025    3.79    4.53    5.38    6.33    7.40    8.60    9.96
0.020    4.16    5.03    6.03    7.17    8.47    9.94    11.61
0.015    4.69    5.76    7.01    8.44    10.09   11.99   14.16
0.010    5.56    6.98    8.66    10.63   12.94   15.63   18.76
0.005    7.46    9.72    12.48   15.82   19.85   24.68   30.45
0.003    10.04   13.56   18.03   23.62   30.56   39.10   49.57
0.001    14.90   21.16   29.45   40.28   54.26   72.13   94.81

32

Risch’s Table 1 with shared env: $\lambda_{DZ}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    2.10    2.27    2.46    2.65    2.85    3.07    3.29
0.040    2.24    2.44    2.66    2.89    3.13    3.38    3.65
0.030    2.44    2.68    2.94    3.22    3.52    3.83    4.17
0.025    2.57    2.85    3.14    3.46    3.79    4.15    4.53
0.020    2.75    3.06    3.40    3.77    4.16    4.58    5.03
0.015    3.00    3.37    3.78    4.22    4.69    5.21    5.76
0.010    3.39    3.86    4.38    4.94    5.56    6.24    6.98
0.005    4.19    4.87    5.64    6.50    7.46    8.53    9.72
0.003    5.18    6.16    7.29    8.58    10.04   11.69   13.56
0.001    6.89    8.44    10.26   12.40   14.90   17.80   21.16

33

Risch’s Table 1 with shared env: $R_{MD}$

Rows: prevalence K. Columns: heritability H.

K/H      0.20    0.25    0.30    0.35    0.40    0.45    0.50
0.050    1.69    1.80    1.90    2.00    2.09    2.18    2.27
0.040    1.71    1.83    1.94    2.05    2.15    2.25    2.35
0.030    1.75    1.88    2.00    2.12    2.23    2.35    2.46
0.025    1.78    1.91    2.04    2.17    2.29    2.41    2.54
0.020    1.81    1.95    2.09    2.23    2.36    2.50    2.63
0.015    1.85    2.01    2.16    2.31    2.46    2.61    2.76
0.010    1.91    2.09    2.27    2.44    2.62    2.79    2.97
0.005    2.03    2.25    2.47    2.69    2.92    3.14    3.38
0.003    2.16    2.43    2.71    2.99    3.27    3.56    3.87
0.001    2.36    2.71    3.07    3.44    3.83    4.23    4.65

34

Evidence of familiality in cancer

Risch’s Table 2 contains values of $\lambda_1$, the familial relative risk for 1st degree relatives, at about 25 sites, using data from Utah, and values of $\lambda_O$ and $\lambda_S$ for data from Sweden. In about 10 cases, there is a separate $\lambda_1$ for early onset cancers.

He derives 3 observations from these numbers:

• Apart from a few exceptions, the FRRs are all rather similar, being around 2 in both studies. All the FRRs are >1, and in the majority of cases, between 1.5 and 3.0.

• There is no decline in FRR with decreasing frequency of the cancer site; if anything, the reverse.

• Increased family recurrence is associated with early diagnosis.

35

Evidence from the twin data of Lichtenstein et al (2000)

Risch is able to use the Lichtenstein et al data to estimate his $\lambda_{MZ}$ and $\lambda_{DZ}$; see later for how it is done. For the stomach cancer data, he finds:

$K$ = 1%, $\lambda_{MZ}$ = 8.49, $\lambda_{DZ}$ = 5.96, and $R_{MD}$ = 1.51 (see next slide). Doing this for a number of sites (results in Table 3), he finds that the values of $\lambda_{MZ}$ and $\lambda_{DZ}$ are reasonably consistent across sites and do not decrease with $K$.

Risch goes on to fit a constant risk ratio model to the data, excluding female breast cancer. The results are given in Table 4, and simply formalize the obvious.

36

Twin relative risk calculations

Suppose that in a population we have a concordant twin pairs (i.e. both affected with the given cancer), b+c discordant pairs (one affected, one unaffected), and d concordant unaffected twin pairs, with n = a+b+c+d.

Since $K_2 = 2a/2n$ and $K = (2a+b+c)/2n$, we have

$\lambda_R = \text{Twin RR} = \dfrac{K_2}{K^2} = \dfrac{2a/2n}{[(2a+b+c)/2n]^2} = \dfrac{4an}{(2a+b+c)^2}$.

(Risch, p. 737, writes c for our a and d for our b+c.) For stomach cancer, male MZ twins have a = 6, b+c = 131 and n = 7,231. We find that $\lambda_{MZ}$ = 8.49.
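The arithmetic for the male MZ stomach cancer figures can be reproduced in a few lines. The sketch below uses only the counts and risk ratios quoted in these slides; the function names are illustrative.

```python
def twin_risk_ratio(a, discordant, n):
    """Twin risk ratio K2 / K^2 = 4an / (2a + b + c)^2 from twin-pair counts.

    a          : concordant affected pairs
    discordant : discordant pairs, b + c
    n          : total number of pairs
    """
    K = (2 * a + discordant) / (2 * n)  # prevalence among the 2n individual twins
    K2 = a / n                           # frequency of concordant affected pairs (2a / 2n)
    return K2 / K**2, K

def r_md(lam_mz, lam_dz):
    """R_MD = (lambda_MZ - 1) / (lambda_DZ - 1)."""
    return (lam_mz - 1.0) / (lam_dz - 1.0)

lam_mz, K = twin_risk_ratio(a=6, discordant=131, n=7231)
print(round(lam_mz, 2), round(K, 3))   # 8.49 0.01
print(round(r_md(8.49, 5.96), 2))      # 1.51
```

The second line recovers the $R_{MD}$ of 1.51 from the two risk ratios reported for stomach cancer; the DZ pair total is not given in these slides, so the DZ ratio itself is not recomputed here.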

37

Risch’s conclusions from his reanalysis of data from Lichtenstein et al.

“… the conclusion that rarer cancers are less heritable is strictly a consequence of the MFT model and is not robust to violations from that model.”

“Thus the observed value of $R_{MD}$ conforms poorly to the predictions of the MFT model but extremely well to the single locus or additive genetic model”

“Hence the conclusion of a significant shared twin environmental component may simply be a consequence of using the wrong genetic model.”

(There are some important comments about age-structure which we’ll pass over for the moment.)

38

Heritability versus Attributable Risk.

Heritability is used as a measure of the importance of genetic effects in the context of the MFT model. Small values lead to the conclusion that genetic effects are minor relative to environmental impacts. But what if the MFT does not apply?

Epidemiologists use relative risk RR (risk to exposed versus unexposed individuals) and PAF (proportion of disease prevented by elimination from the population of the risk factor). Table 5 presents the values of $RR_{Het}$ and PAF for 2 values of $\lambda_{MZ}$ and $\lambda_{DZ}$ and allele frequencies ranging from 0.001 to 0.10. It shows that the PAF can range from small values to 100%, depending on disease allele frequency. Similarly, quite high values of $RR_{Het}$ can arise.

39

Conclusions

In my view, Risch has demonstrated that his conclusions are justified, in particular, that the MFT model is not a helpful framework within which to assess the contribution of heredity to cancer.

For more on FRRs, see: N. Risch, Linkage strategies for genetically complex traits. I. Multi-locus models. Am. J. Hum. Genet. 46: 222-228, 1990.

40

Acknowledgement

Many thanks to Ingileif Hallgrímsdóttir for helping out with class this week.