Post on 30-Mar-2015

Measuring Dietary Intake

Raymond J. Carroll

Department of Statistics

Faculty of Nutrition and Faculty of Toxicology

Texas A&M University

http://stat.tamu.edu/~carroll

_________________________________________________________

I Still Cook

Me in the kitchen, Yokohama (my birthplace), 1953

_________________________________________________________

College Station, home of Texas A&M University

I-35

I-45

Big Bend National Park

Wichita Falls, my hometown

West Texas

Palo Duro Canyon, the Grand Canyon of Texas

Guadalupe Mountains National Park

East Texas

What I am Not

I know that potato chips are not a basic healthy food group. However, if you ask me a detailed question about nutrition, then I will ask:

Joanne Lupton, Nancy Turner, Meeyoung Hong

_________________________________________________________

You are what you eat, but do you know who you are?

• This talk is concerned with a simple question.

• Will lowering her intake of fat decrease a woman’s chance of developing breast cancer?

_________________________________________________________

Basic Outline

• Diet affects health. Many (not all!) studies, though, do not find statistically significant effects.

• Focus: quality of the instruments used to measure diet

• Conclusion #1: The instruments are largely to blame.

• Conclusion #2: Expect studies to disagree

_________________________________________________________

Evidence in Favor of the Fat-Breast Cancer Hypothesis

• Animal studies

• Ecological comparisons

• Case-control studies

_________________________________________________________

International Comparisons

_________________________________________________________

Evidence against the Fat-Breast Cancer Hypothesis

• Prospective studies

• These studies try to assess a woman’s diet, then follow her health progress to see if she develops breast cancer

• The diets of those who developed breast cancer are compared with the diets of those who did not

• Only (?) 1 prospective study has found firm evidence suggesting a fat and breast cancer link, and 1 has a negative link

_________________________________________________________

Prospective Studies

• NHANES (National Health and Nutrition Examination Survey): n = 3,145 women aged 25-50

• Nurses Health Study: n = 100,000+

• Pooled Project: n = 300,000+

• Norfolk (UK) study: n = 15,000+

_________________________________________________________

The Nurses Health Study, Fat and Breast Cancer

_________________________________________________________

60,000 women, followed for 10 years

Prospective study

Note that the breast cancer cases reported eating less fat

Donna Spiegelman, the NHS statistician

Clinical Trials

• The lack of consistent (even positive) findings led to the Women’s Health Initiative

• Approximately 40,000 women randomized to two groups: healthy eating and typical eating

_________________________________________________________

WHI Diet Study Objectives

_________________________________________________________

Prior Objections to WHI

• Cost ($415,000,000)

• Whether North Americans can really lower % Calories from Fat to 20%, from the current 38%

• Even if the study was successful, difficulties in measuring diet mean that we will not know what components led to the decrease in risk.

_________________________________________________________

Change in Fat Calories Over Time

_________________________________________________________

[Figure: reported % calories from fat (axis 0 to 40) at Y-0, Y-1, Y-3 and Y-6 for the Control and Intervention groups, with the Goal shown for comparison]

Women reported a decrease in fat-calories, but not to 20%

How do we measure diet in humans?

• 24 hour recalls

• Diaries

• Food Frequency Questionnaires (FFQ)

_________________________________________________________

Walt Willett has a popular book and a popular FFQ

Food diaries

• Hot topic at NCI

• Only measures a few days’ diet, not typical diet

• A single 3-day diary finding a diet-cancer link is not universally scientifically acceptable

• Need for repeated applications

• Induces behavioral change??

_________________________________________________________

[Figure: Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days: WISH Study. Categories: FFQ, Diary 1 through Diary 6; vertical axis from 1350 to 1800]

The Food Frequency Questionnaire

• Do you remember the SAT?

_________________________________________________________

The Pizza Question

_________________________________________________________

The Norfolk Study with ~Diaries and FFQ

_________________________________________________________

15,000 women, aged 45-74, followed for 8 years

163 breast cancer cases

Diary: p = 0.005

FFQ: p = 0.229

Summary

• FFQ does not find a fat and breast cancer link

• 24 hour recalls and diaries are expensive

• They have found links, but in opposite directions

• Diaries also appear to modify behavior

• Question: do any of these things actually measure dietary intake?

• How well or how badly?

• These are statistical questions!

_________________________________________________________

Do We Know Who We Are?

• Karl Pearson was arguably the 1st great modern statistician

• Pearson chi-squared test

• Pearson correlation coefficient

_________________________________________________________

Karl Pearson at age 30

Do We Know Who We Are?

• Pearson was deeply interested in self-reporting errors

• In 1896, Pearson ran the following experiment.

• For each of 3 people, he set out 500 lines on sheets of paper, and had them bisected by hand

_________________________________________________________

A gaggle of lines

Pearson’s Experiment

• He then had a postdoc measure the error made by each person on each line, and averaged

• “Dr. Lee spent several months in the summer of 1896 in the reduction of the observations”

_________________________________________________________

A gaggle of lines, with my bisections

Pearson’s Personal Equations

• Pearson computed the mean error committed by each individual: the “personal equations”

• He found: the errors were individual. His errors were to the right, Dr. Lee’s to the left

_________________________________________________________

Karl Pearson in later life

What Do Personal Equations Mean?

• Given the same set of data, when we are asked to report something, we all make errors, and our errors are personal

• In the context of reporting diet, we call this “person-specific bias”

_________________________________________________________

Laurence Freedman of NCI, with whom I did the work

Model Details for Statisticians

• The model in symbols

• The existence of person-specific bias means that variance of true intake is less than one would have thought

_________________________________________________________

$$Q_{ij} = \beta_0 + \beta_1 X_i + r_i + \varepsilon_{ij};$$

$X_i$ = true intake;

$r_i$ = personal equation = Normal$(0, \sigma_r^2)$;

$\varepsilon_{ij}$ = random error = Normal$(0, \sigma_\varepsilon^2)$
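The model above can be simulated to see why person-specific bias makes true intake look more variable than it is. This is only a sketch: the parameter values (beta1 = 0.5, all variances equal to 1) and the function name `simulate_ffq` are illustrative assumptions, not estimates from any study. Two repeat FFQs from the same person share both the true intake X_i and the personal equation r_i, so their covariance is beta1^2 var(X) + var(r), not just the part due to true intake.

```python
import numpy as np

def simulate_ffq(n_people, n_repeats, beta0, beta1,
                 sigma_x, sigma_r, sigma_eps, seed=0):
    """Simulate Q_ij = beta0 + beta1*X_i + r_i + eps_ij (illustrative values only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, size=n_people)            # true intake X_i
    r = rng.normal(0.0, sigma_r, size=n_people)            # personal equation r_i
    eps = rng.normal(0.0, sigma_eps, size=(n_people, n_repeats))  # random error
    q = beta0 + beta1 * x[:, None] + r[:, None] + eps
    return x, q

# Hypothetical parameter values chosen for illustration.
beta1, sigma_x, sigma_r, sigma_eps = 0.5, 1.0, 1.0, 1.0
x, q = simulate_ffq(20_000, 2, 0.0, beta1, sigma_x, sigma_r, sigma_eps)

# Covariance between two repeat FFQs on the same people.
cov_repeats = np.cov(q[:, 0], q[:, 1])[0, 1]

# What the model says that covariance is made of.
signal_part = beta1**2 * sigma_x**2      # due to true intake
bias_part = sigma_r**2                   # due to the personal equation
print(f"cov between repeats: {cov_repeats:.3f}")
print(f"true-intake part: {signal_part:.3f}, personal-equation part: {bias_part:.3f}")
```

Under these assumed values most of the agreement between repeat FFQs comes from the personal equation, so repeating the FFQ does not average the bias away.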

Model Details for Statisticians

• The OPEN Study had the following measurements

• Two FFQ

• Two Protein biomarkers

• Two Energy biomarkers

_________________________________________________________

Model Details for Statisticians

• The model in symbols

• Linear mixed model, fit by PROC MIXED

_________________________________________________________

$$Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \varepsilon_{ijQ};$$

$$M_{ij} = X_i + U_{ijF};$$

Attenuation

• The attenuation is the slope in the linear regression of X on Q

_________________________________________________________

$$Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \varepsilon_{ijQ};$$

$$M_{ij} = X_i + \varepsilon_{ijF};$$

$$\lambda_Q = \mathrm{cov}(X, Q) / \mathrm{var}(Q)$$
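A small simulation can illustrate the attenuation as the slope of X on Q. It is a sketch under assumed parameter values (beta1 = 0.5, unit variances), none of which come from OPEN; the helper `attenuation` is a hypothetical name. With independent Normal terms the model gives cov(X, Q) = beta1 var(X) and var(Q) = beta1^2 var(X) + var(r) + var(eps), so the personal equation adds to the denominator and pushes the attenuation toward zero.

```python
import numpy as np

def attenuation(beta1, var_x, var_r, var_eps):
    """Slope of X on Q implied by Q = beta0 + beta1*X + r + eps."""
    return beta1 * var_x / (beta1**2 * var_x + var_r + var_eps)

# Hypothetical values chosen for illustration only.
beta0, beta1 = 0.0, 0.5
var_x, var_r, var_eps = 1.0, 1.0, 1.0

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(0.0, np.sqrt(var_x), n)       # true intake
r = rng.normal(0.0, np.sqrt(var_r), n)       # person-specific bias
eps = rng.normal(0.0, np.sqrt(var_eps), n)   # random error
q = beta0 + beta1 * x + r + eps              # one FFQ per person

# Empirical attenuation: cov(X, Q) / var(Q).
lam_hat = np.cov(x, q)[0, 1] / np.var(q, ddof=1)

lam_model = attenuation(beta1, var_x, var_r, var_eps)   # with personal equation
lam_no_bias = attenuation(beta1, var_x, 0.0, var_eps)   # ignoring it

print(f"simulated: {lam_hat:.3f}, model: {lam_model:.3f}, "
      f"ignoring person-specific bias: {lam_no_bias:.3f}")
```

Ignoring the personal equation overstates the attenuation factor, which is the sense in which an analysis that leaves it out is too optimistic about the FFQ.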

Relative Risk and Attenuation

• Start with a logistic model: $\mathrm{pr}(D=1) = H(\alpha_0 + \alpha_1 X)$

• True relative risk: $R = \exp(\alpha_1)$

• Observed relative risk (regression calibration): $R_Q = R^{\lambda_Q} < R$ (for $R > 1$), since $\lambda_Q < 1$

_________________________________________________________

Relative Risk and Attenuation

_________________________________________________________

Attenuation              Relative Risk
1.0 (no meas. error)     2.0
0.8                      1.74
0.5                      1.41
0.25                     1.19
0.10                     1.07
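The table can be reproduced with a few lines of arithmetic, assuming the observed relative risk is the true relative risk raised to the power of the attenuation (the relation the tabulated values are consistent with). The function name `observed_relative_risk` is just an illustrative label.

```python
def observed_relative_risk(true_rr: float, attenuation: float) -> float:
    """Observed relative risk under the assumed relation R_obs = R_true ** attenuation."""
    return true_rr ** attenuation

TRUE_RR = 2.0  # a true doubling of risk, as in the table

# Attenuation values taken from the table above.
table = {lam: round(observed_relative_risk(TRUE_RR, lam), 2)
         for lam in (1.0, 0.8, 0.5, 0.25, 0.10)}

for lam, rr in table.items():
    print(f"attenuation {lam:4.2f} -> observed relative risk {rr:.2f}")
```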

Our Hypothesis

• We hypothesized that when measuring Fat intake

• The personal equation, or person-specific bias, unique to each individual, is large and debilitating.

• The problem: the actual variability in American diets is much smaller than suspected.

_________________________________________________________

Can We Test Our Hypothesis?

• We need biomarker data that are not much subject to the personal equation

• There is no biomarker for Fat

• There are biomarkers for energy (calories) and Protein

• We expect that studies are too small by orders of magnitude

_________________________________________________________

Biomarker Data

Calories and Protein: Available from NCI’s OPEN study

Results are surprising

Victor Kipnis was the driving force behind OPEN

_________________________________________________________

Sample Size Inflation

There are formulae for how large a study needs to be to detect a doubling of risk between low and high Fat/Energy Diets

These formulae ignore the personal equation

We recalculated the formulae

_________________________________________________________

Biomarker Data: Sample Size Inflation

[Figure: bar chart of the sample size inflation factor (axis 0 to 12) for Protein, Calories and %-Protein]

_________________________________________________________

If you are interested in the effect of calories on health, multiply the sample size you thought you needed by 11. For protein, by 4.5

Relative Risk

_________________________________________________________

If high caloric intake in fact increases the risk of breast cancer by 100%, and you change your intake dramatically, the FFQ thinks doing so changes the risk by only 4%

[Figure: Relative Risk for Changing Your Food Intake (axis 1 to 2). True: 2.00; Observed, Protein: 1.09; Observed, Calories: 1.04]

Result: It is not possible to tell if changing your absolute caloric intake, or your fat intake, or your protein intake will have any health effects

Relative Risk, Food Composition

_________________________________________________________

If high protein (fat) increases the risk of breast cancer by 100%, and you dramatically change your protein (fat) intake while your calories remain the same, then the FFQ sees a risk difference of only 20%-30%

[Figure: Relative Risk for Food Composition (axis 1 to 2). True: 2.00; Observed, Protein Density: 1.31]

Result: It is pretty difficult to tell if changing your food composition while maintaining your caloric intake will have any health effects
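As a rough check, the observed relative risks quoted on these slides (1.04 for calories, 1.09 for protein, 1.31 for protein density, against a true value of 2.00) can be turned back into the attenuation they imply. This assumes the relation observed = true ** attenuation from the Relative Risk and Attenuation slides; the resulting numbers are back-calculated here for illustration and are not values reported in the talk.

```python
import math

def implied_attenuation(observed_rr: float, true_rr: float = 2.0) -> float:
    """Solve observed_rr = true_rr ** lam for lam (assumed relation)."""
    return math.log(observed_rr) / math.log(true_rr)

# Observed relative risks as quoted on the slides.
observed = {"calories": 1.04, "protein": 1.09, "protein density": 1.31}

implied = {name: implied_attenuation(rr) for name, rr in observed.items()}

for name, lam in implied.items():
    print(f"{name:>16}: implied attenuation about {lam:.2f}")
```

On this reading the attenuation for absolute calories and protein sits near the bottom of the earlier table, while protein density does noticeably better, which matches the contrast between the two "Result" statements.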

New Results

The AARP Study: 250,000+ women, by far the greatest number in any single study

Results according to rumor: huge size, statistical significance

FFQ: small measured increase in risk for dramatic behavioral change

Statistician’s dream: use Pearson’s idea to get at the true increase in risk

_________________________________________________________

A happy statistician dreaming about AARP

New Results

The WHI Controls Study: 30,000+ women

All with > 32% Calories from Fat via FFQ

Diaries in a nested case-control study

Highly significant fat effect in the diaries (RR in quantiles of 1.6)

_________________________________________________________

A happy statistician doing field biology in Northwest Australia (the Kimberley)

Summary

WHI, 2006, clinical trial

My best case conjecture in 2005:

Probably no statistically significant effects

The p-value was 0.07, relative risk about 1.2

My best case conjecture in 2008 after further follow-up:

Statistically significant, modest effects

_________________________________________________________

You are what you eat, but do you know who you are?

Diet is incredibly hard to measure

Even 100% increases in risk cannot be seen in large cohort studies with an FFQ

If you read about a diet intervention, measured by an FFQ, that achieves statistical significance multiple times: wow!

_________________________________________________________

You are what you eat, but do you know who you are?

Much work at NCI and WHI and EPIC on new ways of measuring diet

EPIC (a multi-country study) may be a model, because of the wide distribution of intakes

_________________________________________________________

What Was Done

• The OPEN analysis actually fit Protein and Energy together.

• We call this the Seemingly Unrelated Measurement Error Model

• Can get major gains in efficiency

_________________________________________________________

SUMEM

• Gains in efficiency come from the correlations of the random effects

_________________________________________________________

$$Q_{ijP} = \beta_{0QP} + \beta_{1QP} X_{iP} + r_{iQP} + \varepsilon_{ijQP};$$

$$M_{ijP} = X_{iP} + U_{ijP};$$

$$Q_{ijE} = \beta_{0QE} + \beta_{1QE} X_{iE} + r_{iQE} + \varepsilon_{ijQE};$$

$$M_{ijE} = X_{iE} + U_{ijE};$$
