
Comprehensive Summaries of Uppsala Dissertations from the Faculty of Social Sciences 100

Statistical Considerations in the Analysis of Matched Case-Control Studies

With Applications in Nutritional Epidemiology

BY

LISBETH HANSSON

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2001

Dissertation for the Degree of Doctor of Philosophy in Statistics presented at Uppsala University in 2001

Abstract

Hansson L., 2001. Statistical Considerations in the Analysis of Matched Case-Control Studies. With Applications in Nutritional Epidemiology. Acta Universitatis Upsaliensis. Comprehensive Summaries of Uppsala Dissertations from the Faculty of Social Sciences 100. 33 pp. Uppsala. ISBN 91-554-4950-6.

The case-control study is one of the most frequently used study designs in analytical epidemiology. This thesis focuses on some methodological aspects in the analysis of the results from this kind of study.

A population-based case-control study was conducted in northern Norway and central Sweden in order to study the associations of several potential risk factors with thyroid cancer. Cases and controls were individually matched, and the information on the factors under study was provided by means of a self-completed questionnaire. The analysis was conducted with logistic regression. No association was found with pregnancies, oral contraceptives or hormone replacement after menopause. Early pregnancy and artificial menopause were associated with an increased risk, and cigarette smoking with a decreased risk, of thyroid cancer (paper I). The relation with diet was also examined. High consumption of a fat- and starch-rich diet was associated with an increased risk (paper II).

Conditional and unconditional maximum likelihood estimations of the parameters in a logistic regression were compared through a simulation study. Conditional estimation had higher root mean square error but better model fit than unconditional estimation, especially for 1:1 matching, with relatively little effect of the proportion of missing values (paper III). Two common approaches to handling partial non-response in a questionnaire when calculating nutrient intake from diet variables were compared. In many situations it is reasonable to interpret the omitted self-reports of food consumption as an indication of “zero-consumption” (paper IV).

The reproducibility of dietary reports was presented, and problems in its measurement and analysis were discussed. The most advisable approach to measuring repeatability is to look at different correlation methods. Among the factors affecting reproducibility, frequency and homogeneity of consumption are presumably the most important ones (paper V). Nutrient variables often have a mixed distribution form, and transformation to normality will therefore be troublesome. When analysing nutrients we therefore recommend comparing the result from a parametric test with that of an analogous distribution-free test. Different methods of transforming nutrient variables to achieve normality were discussed (paper VI).

Lisbeth Hansson, Department of Information Science, Division of Statistics, Uppsala University, P.O. Box 513, SE-751 20 Uppsala, Sweden

© Lisbeth Hansson 2001

ISSN 0282-7492
ISBN 91-554-4950-6

Printed in Sweden by Uppsala University, Tryck & Medier, Uppsala 2001

To the Memory of My Father

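Contents

1. List of Papers
2. Introduction
3. The Case Control Study
4. The Logistic Regression Model
5. Thyroid cancer
6. Summary of the papers
7. References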

Acknowledgements

First and foremost I would like to express my gratitude to:

Professor Reinhold Bergström, my teacher and my supervisor until his death (1999). It was a privilege that Reinhold wanted to share his deep knowledge and experience with me.

Professor Harry Khamis, my second supervisor. I was lucky that Swedish folkdance enticed Harry to spend some years in Sweden, and gave me the opportunity to get such an experienced, friendly, brilliant and patient tutor.

MD Rosaria Galanti, co-author and friend. I want to thank Rosaria, with her deep skill, for giving me an insight into the essence of analytic epidemiological research.

I wish to thank all the former and present members of the Department of Statistics and Department of Information Science for encouragement and for providing a pleasant working environment. In particular I give my gratitude to:

Professor Anders Christoffersson, Head of the Department of Information Science, and docent Anders Ågren, for their support and encouragement during my work. Special thanks to Anders Ågren for all his administrative and practical efforts during the final arrangement for my dissertation.

PhD Anna Gunsjö, for being a friend and an outstanding guide in statistical issues.

Professor Adam Taube, for still letting the members of our department share his never-ending devotion in the misuse of statistics.

My colleagues Bertil Anderson, Jonas Andersson, Marie-Louise Nordström, Mats Nyfjäll, Roland Pettersson, Dag Sörbom, Bo and Fan Yang Wallentin for all statistical discussions, non-statistical discussions, nonsense discussions, encouragement, support and friendship.

My PhD-student friends during the years: Barbro Dunér, Anna Hermansson, Johan Lyhagen, Stefan Mattson, Tomas Pettersson, Inger Persson, Lars Söderström and Lisa Wernroth.

The administrative staff of the Division of Statistics: Davoud Emamjomeh, Gunilla Klaar, Ingrid Lukell and Ann-Christine Persson.

For support, love and joy I will give special gratitude to:

My husband Ove Hansson and our children Linnea and Joakim.

My mother Inger Törnkvist.
My sisters Lena Törnkvist, Monika Nilsson, Gunilla Tholfsson, Viktoria Törnkvist-Mildh and my brother Torbjörn Törnkvist together with their families.
My mother and father-in-law Irma and Elis Hansson.

I would also like to thank The Bank of Sweden Tercentenary Foundation for financial support.

This work is dedicated to the memory of my beloved father Lars-Gunnar “Lasse” Törnkvist.

1. List of Papers


2. Introduction

3. The Case Control Study

Introduction


Matching

4. The Logistic Regression Model

Introduction

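$$Y \in \{0, 1\}, \qquad E[Y \mid \mathbf{x}] = \pi(\mathbf{x})$$

$$\pi(\mathbf{x}) = \frac{e^{\alpha + \boldsymbol{\beta}'\mathbf{x}}}{1 + e^{\alpha + \boldsymbol{\beta}'\mathbf{x}}} = \left[1 + e^{-\alpha - \boldsymbol{\beta}'\mathbf{x}}\right]^{-1}$$

$$\boldsymbol{\beta}' = (\beta_1, \beta_2, \ldots, \beta_p)$$

$$\log\frac{\pi(\mathbf{x})}{1 - \pi(\mathbf{x})} = \alpha + \boldsymbol{\beta}'\mathbf{x}$$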

Interpretation

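$$\text{odds} = \frac{\pi}{1 - \pi}$$

$$OR = \frac{\pi_1 / (1 - \pi_1)}{\pi_0 / (1 - \pi_0)} = e^{\beta}$$

$$OR = \frac{\pi_1 (1 - \pi_0)}{\pi_0 (1 - \pi_1)} \approx \frac{\pi_1}{\pi_0}$$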

Estimation


Unconditional likelihood

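$$L(\alpha, \boldsymbol{\beta}) = \prod_{i=1}^{n} \left(\frac{e^{\alpha + \boldsymbol{\beta}'\mathbf{x}_i}}{1 + e^{\alpha + \boldsymbol{\beta}'\mathbf{x}_i}}\right)^{y_i} \left(\frac{1}{1 + e^{\alpha + \boldsymbol{\beta}'\mathbf{x}_i}}\right)^{1 - y_i}$$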

Conditional likelihood

$$L(\boldsymbol{\beta}) = \prod_{k=1}^{K} \frac{e^{\boldsymbol{\beta}'\mathbf{x}_{k0}}}{\sum_{j=0}^{M} e^{\boldsymbol{\beta}'\mathbf{x}_{kj}}}$$

Inference

Confidence Interval

$$\hat{\beta} \pm z_{1-\alpha/2}\, SE(\hat{\beta})$$

The Likelihood Ratio Test

$$G = -2 \log\frac{L_0}{L_1}$$

Wald Test

$$W = \frac{\hat{\beta}}{SE(\hat{\beta})}$$

Score Test (Lagrange Test)


Discussion


5. Thyroid cancer

6. Summary of the papers

Paper I Reproductive history and cigarette smoking as risk factors for thyroid cancer in women: a population based case-control study

Paper II Diet and the risk of papillary and follicular thyroid carcinoma: a population-based case-control study in Sweden and Norway

Aims of the Studies

Paper I

Paper II

Study Design and Study Base

Assessment of Exposure

Paper I

Paper II

Statistical Analysis

Paper I

Paper II

Results

Paper I

Paper II

Discussion

Paper III Unconditional compared to conditional logistic regression in matched case-control studies with missing observations.
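As a minimal sketch of the kind of comparison Paper III addresses, the Python below simulates 1:1 matched pairs with a single binary exposure and contrasts two estimates of the log odds ratio: the conditional maximum likelihood estimate, which in this special case reduces to the log ratio of the two kinds of discordant pairs, and an unconditional estimate from the pooled 2x2 table that ignores the matching. The data-generating model, the function names and the parameter values are illustrative assumptions, not the simulation design used in the paper, and missing observations are not modelled.

```python
import math
import random


def simulate_pairs(n_pairs, beta, seed=0):
    """Simulate (case_exposed, control_exposed) indicators for 1:1 matched pairs.

    Assumption: each pair has its own control exposure prevalence p0, which
    plays the role of the matching factor; the case exposure odds are
    exp(beta) times the control exposure odds within the pair.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        p0 = rng.uniform(0.1, 0.6)
        case_odds = math.exp(beta) * p0 / (1.0 - p0)
        p1 = case_odds / (1.0 + case_odds)
        pairs.append((int(rng.random() < p1), int(rng.random() < p0)))
    return pairs


def conditional_log_or(pairs):
    """Conditional ML estimate for 1:1 matching and one binary exposure."""
    n10 = sum(1 for case, ctrl in pairs if case == 1 and ctrl == 0)
    n01 = sum(1 for case, ctrl in pairs if case == 0 and ctrl == 1)
    if n10 == 0 or n01 == 0:
        raise ValueError("estimate undefined without both kinds of discordant pairs")
    return math.log(n10 / n01)


def pooled_log_or(pairs):
    """Unconditional estimate from the pooled 2x2 table, ignoring the matching."""
    a = sum(case for case, _ in pairs)   # exposed cases
    c = sum(ctrl for _, ctrl in pairs)   # exposed controls
    b = len(pairs) - a                   # unexposed cases
    d = len(pairs) - c                   # unexposed controls
    if min(a, b, c, d) == 0:
        raise ValueError("estimate undefined with an empty cell")
    return math.log((a * d) / (b * c))


pairs = simulate_pairs(5000, beta=1.0)
print("conditional:", round(conditional_log_or(pairs), 3))
print("pooled:     ", round(pooled_log_or(pairs), 3))
```

Under these assumptions the pooled estimate is expected to lie closer to zero than the conditional one, because the pair-specific exposure prevalence acts as a matching factor that the pooled analysis ignores.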

Paper IV Diet-associated risks of disease and self-reported food consumption: How shall we treat partial nonresponse in a food frequency questionnaire?
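The question in Paper IV can be illustrated with a small Python sketch that computes a nutrient intake from food frequency answers under two ways of treating unanswered items. The "zero" rule follows the interpretation of omitted reports as zero consumption; the "exclude" rule, which drops the respondent, is only an assumed stand-in for the alternative, since the second approach is not named here. The food items, nutrient contents and function name are hypothetical.

```python
def nutrient_intake(servings, content_per_serving, missing="zero"):
    """Sum nutrient intake over food items.

    servings: dict mapping food -> servings per week, or None if unanswered.
    content_per_serving: dict mapping food -> nutrient amount per serving.
    missing: "zero" treats an unanswered item as zero consumption;
             "exclude" returns None so the respondent can be dropped.
    """
    total = 0.0
    for food, amount in content_per_serving.items():
        freq = servings.get(food)  # None marks an unanswered item
        if freq is None:
            if missing == "zero":
                continue
            if missing == "exclude":
                return None
            raise ValueError("missing must be 'zero' or 'exclude'")
        total += freq * amount
    return total


# Hypothetical fat content in grams per serving.
FAT_G_PER_SERVING = {"milk": 3.0, "cheese": 9.0, "bread": 1.0}
answers = {"milk": 7, "cheese": None, "bread": 14}

print(nutrient_intake(answers, FAT_G_PER_SERVING, missing="zero"))
print(nutrient_intake(answers, FAT_G_PER_SERVING, missing="exclude"))
```

The zero rule keeps every respondent in the analysis at the cost of possibly understating intake, whereas exclusion shrinks the sample; which trade-off is acceptable depends on how plausible zero consumption is for the omitted items.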

Paper V Measurement of reliability: A study of reproducibility of past diet through a short food frequency questionnaire (FFQ).

Paper VI Transforming nutrient variables
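The recommendation to compare a parametric test with an analogous distribution-free test can be sketched in Python as follows. The example sets a Welch-type two-sample statistic next to a Mann-Whitney statistic with midranks and a tie correction, both referred to the standard normal distribution as a large-sample approximation. The choice of these two tests, the simulated mixed distribution (a spike at zero plus a lognormal part) and all names and parameter values are assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def welch_z(x, y):
    """Welch two-sample statistic (unequal variances allowed)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def midranks(values):
    """Ranks 1..n in which tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Normal approximation to the Mann-Whitney U statistic, tie-corrected."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    tie_sizes = {}
    for r in ranks:
        tie_sizes[r] = tie_sizes.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_sizes.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u - n1 * n2 / 2.0) / math.sqrt(var_u)


def simulate_intake(n, p_zero, mu, rng):
    """Mixed distribution: zero with probability p_zero, otherwise lognormal."""
    return [0.0 if rng.random() < p_zero else rng.lognormvariate(mu, 1.0)
            for _ in range(n)]


rng = random.Random(1)
cases = simulate_intake(200, 0.3, 0.3, rng)
controls = simulate_intake(200, 0.3, 0.0, rng)
print("Welch p:       ", normal_two_sided_p(welch_z(cases, controls)))
print("Mann-Whitney p:", normal_two_sided_p(mann_whitney_z(cases, controls)))
```

If the two p-values lead to different conclusions, that is a signal that the distributional assumptions behind the parametric test deserve a closer look before the result is reported.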

7. References
