Advanced statistical methods
Master in nursing & midwifery, KU Leuven
Geert Verbeke
Interuniversity Institute for Biostatistics and statistical Bioinformatics
http://gbiomed.kuleuven.be/biostat/geertverbeke
Table of Contents

I Introductory material
1 Introduction
2 Central data set
3 What is statistics ?
4 Hypothesis testing
5 Confidence intervals
6 Some frequently used tests

II Critical appraisal of literature
7 Errors in statistics: Basic concepts
8 Errors in statistics: Practical implications

III Simple linear regression
9 The Pearson correlation coefficient
10 Simple linear regression
11 Model diagnostics
12 Influential observations

IV One-way analysis of variance
13 The unpaired t-test
14 1-way ANOVA

V Multiple linear regression
15 Multiple linear regression
16 Polynomial regression
17 Interaction

VI Analysis of variance with multiple factors
18 Multiple analysis of variance

VII Analysis of covariance and the general linear model
19 Analysis of covariance
20 The general linear model
21 Regression notation of a general linear model

VIII Models for binary outcomes
22 Simple logistic regression
23 Multiple logistic regression

IX Models for time-to-event data
24 Survival analysis without censoring
25 Survival analysis with censoring
26 Regression for survival data

X Further Topics
27 Clustered data
28 Longitudinal data / Repeated measures
29 Missing observations

Bibliography
Part I
Introductory material
Chapter 1
Introduction
. Motivation
. Course material
. Examination and evaluation
1.1 Motivation
• Master thesis → research track
• Statistics in (bio-)medical literature → critical reading / appraisal
• Correct analysis of data collected
• Correct interpretation of results obtained
1.2 Course material
• Copies of slides: Toledo
• Publications discussed during course: Library (available online)
• Datasets used during course: Toledo
• Statistica software:
. Available in all K.U.Leuven PC classes
. Available via LUDIT: https://icts.kuleuven.be/sc/english/index
. . . .
• Other packages (SAS, SPSS, . . . ) allowed
• Vestac JAVA applets
. JAVA applets for the visualization of statistical concepts
. Download from: http://lstat.kuleuven.be/newjava/vestac/
1.3 Examination & evaluation
• Critical appraisal of literature (Part A):
. Individual task
. Critical reading of literature
• Data analysis (Part B):
. Take-home team project (3-4 students per team)
. Data analysis and reporting of results
• Reporting and presentation:
. Written reports about Part A & Part B submitted prior to oral exam
. Mid-term individual presentation of intermediate results of Part B
. Individual presentation of results of Part A & Part B at oral exam
Chapter 2
Central data set
. Introduction
. Problem setting
. Sample
. Data collected
2.1 Introduction
• These data are central to this part.
• Origin: Prof. Dr. Koen Milisen, AccentVV, KU Leuven.
• Data available to students in the context of this course, but cannot be distributed further
2.2 Problem setting
• Research into post-operative variability in the neuro-cognitive and functional status in elderly hip fracture patients.
• A surgical intervention in elderly patients often results in acute cognitive dysfunction (= delirium).
• Delirium versus dementia:
. Delirium: acute onset; usually temporary
. Dementia: no acute onset; slowly progressing; irreversible
• Delirium . . .
. leads to medical problems and problems of care
. often is the first symptom of a physical disorder or intoxication stemming from medicines
. can lead to increased mortality
. is hard to detect
• Economic implications of delirium:
. Extra care
. Longer hospital stay
. High degree of institutionalization
• Research suggests that, among elderly hip fracture patients, the increased degree of dependence is a consequence of delirium, rather than of the hip fracture itself.
2.3 Sample
• Longitudinal design: Certain variables are measured repeatedly over time.
• Prospective (e.g., complications) and retrospective (e.g., living conditions) measurements.
• Data from 2 traumatology units of University Hospital Gasthuisberg, KU Leuven.
• Inclusion criteria:
. ≥ 65 years of age
. hospitalized with hip fracture in the emergency room
. consent for participation in the study
. . . .
• Exclusion criteria:
. time between admission and operation ≥ 72 hours
. various traumas
. . . .
• Data collected 16/09/1996–28/02/1997.
2.4 Data collected
• Statistica file: delirium.sta
• Data on 60 patients
• 78 variables
• Data for every patient prior to, during, and after the operation
• Longitudinal and derived measurements
• Study questionnaire, ADL score, MMSE, and CAM scores
2.4.1 Pre-operative evaluation
Variable Description Values
nummer patient number 1–60
leeftd age (years)
gesl sex 1=male; 2=female
opnduur length of stay (days)
burgst civil status 1=single; 2=married; 3=widow(er); 4=divorced; 5=religious
opleid education 1=university/college; 2=high school; 3=lower secondary; 4=primary
zijfrc side of fracture 1=left; 2=right
typfrc type of fracture 1=intra-capsular; 2=extra-capsular
cardio cardiologic pathology 0=no; 1=yes
vascul vascular pathology 0=no; 1=yes
Variable Description Values
pulmon pulmonary pathology 0=no; 1=yes
urinai urinary pathology 0=no; 1=yes
abdom abdominal pathology 0=no; 1=yes
hyper hypertension 0=no; 1=yes
zicht vision pathology 0=no; 1=yes
gehoor auditory pathology 0=no; 1=yes
malign malignant disease 0=no; 1=yes
diabet diabetes 0=no; 1=yes
reumat rheumatological pathology 0=no; 1=yes
vrop past surgery 0=no; 1=yes
neuro neuro-psychiatric pathology 0=no; 1=yes
andere other pathology 0=no; 1=yes
2.4.2 Operative evaluation
Variable Description Values
opnop time hospitalization–operation 1=emergency; 2=<24 hours; 3=<48 hours; 4=<72 hours
soorin type of surgery 1=internal fixation; 2=THP; 3=BHP; 4=DHS
percom per-operative complications 1=yes; 2=no
opduur duration of surgery 1=<45 min; 2=45–90 min; 3=90–120 min; 4=>120 min
bloed blood loss 1=<300 ml; 2=300–1000 ml; 3=>1000 ml
anes anesthesia 1=local; 2=spinal; 3=complete
2.4.3 Post-operative evaluation
Variable Description Values
no mechanical complications 0=no; 1=yes
luxa luxation of prosthesis 0=no; 1=yes
impla implantation problems 0=no; 1=yes
anmec other mechanical problems 0=no; 1=yes
nolok local complications 0=no; 1=yes
opper superficial wound problems 0=no; 1=yes
diep deep infection 0=no; 1=yes
anlok other local complications 0=no; 1=yes
gen general complications 0=no; 1=yes
doorli decubitus 0=no; 1=yes
diephl deep phlebothrombosis 0=no; 1=yes
pulemb pulmonary embolism 0=no; 1=yes
Variable Description Values
urin urinary complications 0=no; 1=yes
ander other respiratory problems 0=no; 1=yes
cardi cardiologic complications 0=no; 1=yes
cere cerebral complications 0=no; 1=yes
autre other general complications 0=no; 1=yes
gn intake medication 0=no; 1=yes
dig intake digitalis 0=no; 1=yes
diur intake diuretics 0=no; 1=yes
bblo intake β-blocker 0=no; 1=yes
benz intake benzodiazepines 0=no; 1=yes
anti intake anticholinergics 0=no; 1=yes
neur intake neuroleptics 0=no; 1=yes
Variable Description Values
depres02 intake anti-depressants 0=no; 1=yes
other intake other medication 0=no; 1=yes
ontsl discharge to 1=home; 2=daughter/son; 3=geriatric ward; 4=revalidation unit; 5=psychiatric unit; 6=RH/RVT; 7=convent; 8=other
dood death during hospitalisation 1=yes; 2=no
2.4.4 Longitudinal and derived measures
Variable Description Values
sencam CAM result on day 1 1=delirium; 2=no delirium
sencam03 CAM result on day 3
sencam05 CAM result on day 5
senverw Was CAM result ever equal to 1 ? 0=no; 1=yes
adltot1 ADL score on day 1 6-24; 6=not dependent; 24=very dependent
adltot5 ADL score on day 5
adltot12 ADL score on day 12
MMSE1 MMSE score on day 1 0-30; 0=extreme confusion; 30=no confusion
MMSE3 MMSE score on day 3
MMSE5 MMSE score on day 5
MMSE8 MMSE score on day 8
MMSE12 MMSE score on day 12
CAM : Confusion Assessment Method, measured on days 1,3,5,8,12
ADL : Activities of Daily Living, measured on days 1,5,12
MMSE : Mini Mental State Examination, measured on days 1,3,5,8,12
Chapter 3
What is statistics ?
. Example
. Population – sample
. Random variability
3.1 Example: Captopril data
• 15 patients with hypertension
• The response of interest is the supine blood pressure, before and after treatment with CAPTOPRIL
• Research question:
How does treatment affect BP ?
• Dataset ‘Captopril’
Before After
Patient SBP DBP SBP DBP
1 210 130 201 125
2 169 122 165 121
3 187 124 166 121
4 160 104 157 106
5 167 112 147 101
6 176 101 145 85
7 185 121 168 98
8 206 124 180 105
9 173 115 147 103
10 146 102 136 98
11 174 98 151 90
12 201 119 168 98
13 198 106 179 110
14 148 107 129 103
15 154 100 131 82
Average (mm Hg)
Diastolic before: 112.3
Diastolic after: 103.1
Systolic before: 176.9
Systolic after: 158.0
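These averages can be verified directly from the table; a minimal Python sketch, with the data typed in from the table above:

```python
from statistics import mean

# Captopril data: supine blood pressure (mm Hg) in 15 hypertensive patients,
# typed in from the table above (SBP = systolic, DBP = diastolic).
sbp_before = [210, 169, 187, 160, 167, 176, 185, 206, 173, 146, 174, 201, 198, 148, 154]
sbp_after  = [201, 165, 166, 157, 147, 145, 168, 180, 147, 136, 151, 168, 179, 129, 131]
dbp_before = [130, 122, 124, 104, 112, 101, 121, 124, 115, 102,  98, 119, 106, 107, 100]
dbp_after  = [125, 121, 121, 106, 101,  85,  98, 105, 103,  98,  90,  98, 110, 103,  82]

print(round(mean(dbp_before), 1))  # 112.3
print(round(mean(dbp_after), 1))   # 103.1
print(round(mean(sbp_before), 1))  # 176.9
print(round(mean(sbp_after), 1))   # 158.0
```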
• It would be of interest to know how likely the observed changes in BP are to occur by pure chance.
• If this is very unlikely, the above data provide evidence that BP indeed decreases after treatment with Captopril. Otherwise, the above data do not provide evidence for efficacy of Captopril.
• Obviously, we are not interested in drawing conclusions about the 15 observed patients only.
• Instead, we would like to draw conclusions about the effect of Captopril on the total population of all hypertensive patients.
• Conclusion:
Statistics aims at drawing conclusions about some population, based on what has been observed in a random sample
[Diagram: a random SAMPLE is drawn from the POPULATION; STATISTICS uses the sample to draw conclusions about the population. Population level: effect of Captopril in the population. Sample level: effect of Captopril in 15 patients.]
3.2 Population versus random sample
• Population: Hypothetical group of current and future subjects, with a specific condition, about which conclusions are to be drawn
• Sample: Subgroup from the population on which observations will be taken
• In order for effects observed in the sample to be generalizable to the total population, the sample should be taken at random
3.3 Random variability
• Descriptive statistics of the observed differences in diastolic BP, after treatment with Captopril, in 15 subjects:
Before After Change
Patient DBP DBP
1 130 125 5
2 122 121 1
3 124 121 3
4 104 106 −2
5 112 101 11
6 101 85 16
7 121 98 23
8 124 105 19
9 115 103 12
10 102 98 4
11 98 90 8
12 119 98 21
13 106 110 −4
14 107 103 4
15 100 82 18
• Note that not all subjects experience the same benefit from the treatment
• An average decrease of 9.27 mm/Hg is observed in our sample
• A new, similar, experiment would lead to another sample, hence to another observed change in BP:
. More reduction (11.57 mm/Hg) ?
. Less reduction (4.78 mm/Hg) ?
. No change (0.00 mm/Hg) ?
. Increase (-5.23 mm/Hg) ?
• This shows that the observed decrease of 9.27 mm/Hg should not be overinterpreted
• This also shows that one should not hope that 9.27 mm/Hg is the gain in BP one would observe if the total population were treated with Captopril.
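This sampling variability can be made tangible with a small simulation. The sketch below assumes, purely for illustration, that individual BP changes are normally distributed with hypothetical population values µ = 9.27 and σ = 8.6 (neither is known in practice):

```python
import random
from statistics import mean

random.seed(1)
mu, sigma, n = 9.27, 8.6, 15   # hypothetical population values; n = 15 as in our sample

# Repeat the experiment 1000 times: each replicate draws 15 patients
# and records the observed average change in BP.
sample_means = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(1000)]

# Every replicate yields a different observed average, even though the true
# mean mu is fixed: 9.27 is just one realization of this random variability.
print(round(min(sample_means), 2), round(max(sample_means), 2))
```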
• Let µ be the average change in BP one would observe if the total population would be treated
• 9.27 mm/Hg can then be interpreted as an estimate for µ, based on our sample
• Question:
Is our observed change of 9.27 mm/Hg sufficient evidence to conclude that the treatment really affects the BP ?
• Answer:
Hypothesis testing
[Diagram: the same population–sample picture. Population level: is µ different from 0 ? Sample level: observed effect of 9.27 mm/Hg in 15 randomly selected patients.]
Chapter 4
Hypothesis testing
. Example
. Null and alternative hypothesis
. The p-value and level of significance
. Possible errors in decision making
4.1 Example
• As before, µ is the average change in diastolic BP one would observe if the total population of hypertensive patients would be treated with Captopril.
• Note that µ will never be known, but we can use our sample to learn about µ.
• In case the treatment would have no effect, the average µ would be zero.
• So, if one can show that there is (strong) evidence that µ ≠ 0, then this can be considered as evidence for a treatment effect.
• Based on our sample of 15 observations, we estimated µ by µ̂ = 9.27 mm/Hg.
• Obviously, this estimate is relatively far away from 0, suggesting that the treatment might affect BP
• On the other hand, the observed effect µ̂ = 9.27 could have occurred by pure chance, even if there would be no treatment effect at all.
• Question:
How likely would that be ?
• Only if this would be very unlikely to happen will the observed data be considered sufficient evidence for some effect of the treatment
4.2 Null and alternative hypothesis
• The procedure to decide whether there is sufficient evidence to believe the treatment did affect BP is called a test of hypothesis
• In practice, the research question is formulated in terms of a null hypothesis H0 and an alternative hypothesis HA:
H0 : µ = 0 versus HA : µ ≠ 0
• Based on our observed data, we will investigate whether H0 can be rejected in favour of HA
• If not, the null hypothesis H0 is accepted and one decides that the treatment was not effective
4.3 The p-value and level of significance
• Intuitively, it is obvious that H0 : µ = 0 will be rejected if the observed sample average µ̂ is too far away from 0
• Question:
How far is too far ?
• Answer:
If this result is very unlikely to happen by pure chance
• Equivalently:
If this result is not at all what you expect to see if µ would be 0
• One can calculate that, if Captopril would have no effect at all, there is only a 0.1% chance of observing a sample with an average change in BP at least as big as 9.27 mm/Hg.
• Hence, if Captopril would have no effect (i.e., if µ = 0), then it would be very unlikely to observe a sample with an average as extreme as 9.27. This would happen only once every 1000 times a similar experiment would be performed.
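The 0.1% is the two-sided p-value of a paired t-test. The test statistic behind it can be reproduced by hand; a sketch, using the observed changes from the table in Section 3.3 (the p-value itself then comes from t-tables or software):

```python
from math import sqrt
from statistics import mean, stdev

# Observed changes in diastolic BP (before - after) for the 15 patients
d = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

n = len(d)
t = mean(d) / (stdev(d) / sqrt(n))     # t = sample mean / standard error
print(round(mean(d), 2), round(t, 2))  # 9.27 4.17
# Referred to a t-distribution with n - 1 = 14 degrees of freedom,
# this t corresponds to the two-sided p = 0.001 quoted in this chapter.
```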
• We therefore consider the data observed in our experiment sufficient evidence to reject the null hypothesis and we conclude that the treatment effect is significantly different from 0, or equivalently, that there is a significant treatment effect
• The probability 0.1% that expresses how extreme our observations are in case the null hypothesis would be true is denoted by p, and is called the p-value.
• A small p-value is an indication that the observed results would be extreme were H0 true. One then rejects the null hypothesis
• A large p-value is an indication that the observed results are perfectly in line with what can be expected to be observed if H0 is true. One then does not reject the null hypothesis, which is equivalent to accepting the null hypothesis
• In practice, one has to decide how small p should get before the null hypothesis is rejected.
• One therefore specifies the so-called level of significance α:
p < α =⇒ reject H0
p ≥ α =⇒ accept H0
• α is typically a small value, such as 0.01, 0.05, or 0.10
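The decision rule above can be written as a one-line comparison; a trivial sketch (`decide` is just an illustrative name):

```python
def decide(p: float, alpha: float = 0.05) -> str:
    """Formal test decision at significance level alpha: reject H0 iff p < alpha."""
    return "reject H0" if p < alpha else "accept H0"

print(decide(0.001))  # reject H0   (the Captopril example)
print(decide(0.06))   # accept H0   (p only slightly larger than alpha = 0.05)
```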
• In biomedical sciences α = 0.05 = 5% is standard.
• One then rejects the null hypothesis as soon as the observed result would happen less than 5 times in 100 experiments, assuming that the null hypothesis would be correct
• Strictly speaking, one should always mention what level of significance has been used, and the conclusion would have to be formulated as “the treatment effect is significantly different from 0 at the 5% level of significance,” or equivalently, that “there is a significant treatment effect at the 5% level of significance.”
• Note that specification of α is only required if a formal decision is preferred (‘accept’ or ‘reject’).
• It is therefore not meaningful to report ‘borderline significance’ in examples where p is only slightly larger than α (e.g., p = 0.06 > α = 0.05)
4.4 Possible errors in decision making
• In our example about the Captopril treatment, we obtained p = 0.001, leading to the rejection of the null hypothesis of no treatment effect.
• This should not be considered as formal proof that there is a treatment effect
• Even if the treatment has no effect at all, a sample like ours would occur once every 1000 times.
• Maybe, our sample was indeed the extreme one that happens once every thousand experiments.
• Alternatively, suppose we would have obtained p = 0.9812. We then would not have rejected the null hypothesis, and concluded that there is no evidence for any treatment effect.
• This should not have been considered as formal proof that any treatment effect would be absent.
• Maybe, the treatment effect µ is not 0, but very close to 0. The data one then would observe would look very similar to data that would be observed if µ = 0, such that the data do not allow us to detect that µ ≠ 0
• Conclusion:
“Statistics can prove everything”
• Intuitively: Absolute certainty about population characteristics cannot be attained based on a finite sample of observations
Chapter 5
Confidence intervals
. Example
. The confidence interval
. Interpretation
. Properties of confidence intervals
. Hypothesis testing versus confidence intervals
5.1 Example: Captopril data
• Consider the Captopril data, where blood pressure was taken in 15 hypertensive patients, before and after administration of the drug Captopril:
• Interest is in estimating the average change in diastolic BP.
• Let X be the difference in diastolic BP before and after treatment:
X = BP_before − BP_after
• The observed values xi for X can be calculated from the observed values of the BP in our sample:
Before After Change
Patient DBP DBP xi
1 130 125 5
2 122 121 1
3 124 121 3
4 104 106 −2
5 112 101 11
6 101 85 16
7 121 98 23
8 124 105 19
9 115 103 12
10 102 98 4
11 98 90 8
12 119 98 21
13 106 110 −4
14 107 103 4
15 100 82 18
• Note that, in relatively small samples, the histogram can be difficult to interpret.
• One therefore prefers not to estimate the complete distribution of X
• On the other hand, there does not seem to be strong evidence for severe skewness.
• Focus will be on the estimation of the average µ of X. As before, our estimate will be the sample average:
µ̂ = x̄ = 9.27
• Since every other sample would have led to another estimate µ̂, it is of interest to know how likely it is that our estimate is far from the true value µ
• We want to derive an interval around our estimate µ = 9.27 which is very likely tocontain the true value µ
[Figure: from the POPULATION a SAMPLE is drawn at random; statistics then asks how precise µ̂ = x̄ = 9.27 is as an estimate of the average change µ in diastolic BP]
5.2 The confidence interval
• The confidence interval is an interval around the estimate µ̂ which expresses the precision with which µ has been estimated
• The interval will contain the unknown value µ with a user-defined certainty, called the confidence level:
Level Confidence Interval
90% [5.61; 12.93]
95% [4.91; 13.63]
99% [3.02; 15.52]
• In biomedical sciences, one traditionally uses 95% confidence levels
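As a minimal sketch of how such an interval can be computed, the code below uses the 15 observed differences from the table above with the normal-approximation critical value 1.96; the course's intervals may have been computed with slightly different (e.g. t-based) critical values, so small discrepancies are possible:

```python
import math

# Observed changes in diastolic BP (before - after) for the 15 patients
changes = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

def mean_ci(data, z):
    """Confidence interval for the mean, using a normal-approximation
    critical value z (e.g. 1.645 for 90%, 1.96 for 95%)."""
    n = len(data)
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    se = math.sqrt(s2 / n)                             # standard error of the mean
    return mean - z * se, mean + z * se

lo, hi = mean_ci(changes, 1.96)
print(round(lo, 2), round(hi, 2))  # close to the 95% C.I. [4.91; 13.63] quoted above
```

Rerunning with z = 1.645 or a larger critical value reproduces the pattern in the table: higher confidence levels give wider intervals.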
5.3 Interpretation
• Let us focus on the 95% confidence interval. For other confidence levels, the interpretation is similar.
• The 95% C.I. equals [4.91; 13.63]
• Obviously, this cannot be interpreted as the interval [4.91; 13.63] containing µ with 95% probability
• Indeed, [4.91; 13.63] always or never contains µ
• Correct interpretation:
→ confidence interval for mean
• Conclusion:
There is 95% probability that the experiment conducted
results in a C.I. which contains the unknown value µ
5.4 Properties of confidence intervals
• Ideally, C.I.’s are small, as this reflects a very precise estimation of the unknown population parameter µ
• Hence, a C.I. can be used as an indication of the precision of the estimation:
. short C.I.: precise estimation
. long C.I.: imprecise estimation, much uncertainty
• The length of the C.I. increases with the confidence level:
Level Confidence interval
95% [4.91; 13.63]
99% [3.02; 15.52]
• Intuitively: larger intervals are more likely to contain the unknown population parameter µ
• The length of the C.I. decreases with the sample size n
• Illustration:
→ confidence interval for mean
• Intuitively: More observations lead to more precision:
One can ‘buy’ extra precision with extra observations
• The length of the C.I. increases with the variance σ² of the original data
• Intuitively: The more the observations are alike, the more precisely the mean can be estimated:
[Figure: two distributions centred at µ; a narrow one allows precise estimation of µ, a wide one gives imprecise estimation of µ]
• What about 100% C.I.’s ?
• The 100% C.I. for µ equals [−∞; +∞], which is not informative at all
• Intuitively: Absolute certainty about population characteristics cannot be attained based on a finite sample of observations
5.5 Hypothesis testing versus confidence intervals
• For the Captopril data, we have drawn conclusions about the average treatment effect in the population, through two different statistical procedures:
. 95% confidence interval: [4.91; 13.63]
. Significance of treatment effect, p = 0.001
• We know from the C.I. that the average treatment effect is likely to be between 4.91 and 13.63, excluding 0
• The significance test has rejected the value 0 as a possible value for µ
• So, both procedures agree
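The agreement can be checked from the test statistic itself: the paired t-test compares t = x̄/(s/√n) with a critical value of the t distribution with n − 1 degrees of freedom. A minimal sketch using the observed differences:

```python
import math

# Observed changes in diastolic BP (before - after) for the 15 patients
changes = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

n = len(changes)
mean = sum(changes) / n
s = math.sqrt(sum((x - mean) ** 2 for x in changes) / (n - 1))  # sample SD
t = mean / (s / math.sqrt(n))  # paired t statistic for H0: mu = 0

# Two-sided 5% critical value of the t distribution with n - 1 = 14 d.f.
# (from a t table): 2.145
print(round(t, 2))  # about 4.17, well beyond 2.145, so H0: mu = 0 is rejected,
                    # in line with the reported p = 0.001
```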
• Question:
Do both procedures always agree ?
• Answer:
Yes, provided the levels of significance and confidence are complementary to each other:
Level of significance α    Confidence level (1 − α)100%
0.05 95%
0.10 90%
0.01 99%
• In case of accepting H0 (p ≥ α = 0.05):
[Figure: number line with the sample mean x̄; the value specified by H0 lies inside the 95% C.I.]
• In case of rejecting H0 (p < α = 0.05):
[Figure: number line with the sample mean x̄; the value specified by H0 lies outside the 95% C.I.]
• An alternative interpretation for the C.I. follows immediately:
A 95% C.I. is the collection of all null hypotheses
that would be accepted in a statistical test
• Statistical tests are to some extent equivalent to C.I.’s
• However, C.I.’s have the advantage of giving an indication of the effect size (treatment estimate µ̂), as well as of the precision of estimation (width of the C.I.)
• So, C.I.’s should be preferred over statistical tests
↔ Biomedical literature
Chapter 6
Some frequently used tests
. The unpaired t-test
. The chi-squared test
. The paired t-test
. Assumptions
6.1 The unpaired t-test
• Consider data from a rat experiment to study weight gain under a high or a low protein diet
• Group-specific histograms:
• Group-specific summary statistics:
• On average, there is an observed difference of 19g between the rats on a high protein diet and those on a low protein diet.
• Is this observed difference sufficient evidence to conclude that there indeed is an effect of diet on the weight gain ?
• It would be of interest to know how likely such a difference of 19g is to occur if weight gain were completely unrelated to the protein level of the diet.
• Based on the unpaired t-test, it can be calculated that, in case the diet would not affect the weight gain at all, one would have p = 0.0757 = 7.57% chance of observing a difference of at least 19g in a similar experiment.
• So, even if there is no relation at all between the protein content of the diet and weight gain, one can still expect to observe a difference of at least 19g in 7.6% of future similar experiments.
• Since p = 0.0757 > 0.05 = α, we consider this insufficient evidence to conclude that the protein level would indeed affect the weight gain
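The individual weight gains are not listed here; the sketch below assumes the classic Snedecor & Cochran rat-diet data, which match the summary statistics quoted (group means 120g and 101g, hence a difference of 19g, with 12 and 7 animals), and computes the pooled two-sample t statistic:

```python
import math

# Assumed raw data: classic rat weight gains in grams (Snedecor & Cochran),
# reproducing the quoted group means of 120 and 101
high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]

def unpaired_t(x, y):
    """Two-sample t statistic assuming equal variances (pooled)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    s2 = (ssx + ssy) / (nx + ny - 2)        # pooled variance estimate
    se = math.sqrt(s2 * (1 / nx + 1 / ny))  # SE of the difference in means
    return (mx - my) / se

t = unpaired_t(high_protein, low_protein)
print(round(t, 2))  # about 1.89 on 17 d.f., below the two-sided 5% critical
                    # value 2.110, consistent with p = 0.0757 > 0.05
```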
• Conclusion:
There is no significant difference (p = 0.0757) in weight gain
between rats on a high protein level diet,
and rats on a low protein level diet
6.2 The chi-squared test
• We consider data on sickness absence, collected on 585 employees with a similar job:
                   Sickness absence
                   No      Yes     Total
Gender   female    245     184       429
         male       98      58       156
Total              343     242       585
• Research question:
Is there a relation between absence and gender ?
• 184/429 = 42.9% of the females, and 58/156 = 37.2% of the males have been absent
• This suggests that females are more often absent than males
• However, even if absence due to sickness is equally frequent amongst males and females, the above results could have occurred by pure chance.
• It would therefore be of interest to calculate how likely it would be to observe such differences by pure chance
• Based on the chi-squared test, it can be calculated that, even if males and females were equally frequently absent, there would be p = 0.215 = 21.5% chance of observing a similar experiment with a difference between the groups at least equal to 0.429 − 0.372 = 0.057
• So, even if there is no relation at all between gender and absence, one can still expect to observe a difference of at least 5.7% in 21.5% of future similar experiments.
• Since p = 0.215 > 0.05 = α, we consider this insufficient evidence to conclude that the occurrence of sickness absence is related to gender
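A minimal sketch of the chi-squared computation for this 2×2 table: expected counts under independence are (row total × column total)/n, and with 1 degree of freedom the p-value can be obtained from the standard normal distribution, since a chi-squared variable with 1 d.f. is a squared standard normal:

```python
import math

# Observed 2x2 table of sickness absence by gender
observed = {("female", "no"): 245, ("female", "yes"): 184,
            ("male", "no"): 98, ("male", "yes"): 58}
row = {"female": 429, "male": 156}   # row totals
col = {"no": 343, "yes": 242}        # column totals
n = 585

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi2 = sum((o - row[g] * col[a] / n) ** 2 / (row[g] * col[a] / n)
           for (g, a), o in observed.items())

# With 1 d.f.: p = P(|Z| > sqrt(chi2)) for standard normal Z
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF
p = 2 * (1 - phi(math.sqrt(chi2)))
print(round(chi2, 2), round(p, 3))  # about 1.54 and 0.215, matching the text
```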
• Conclusion:
There is no significant difference (p = 0.215) in prevalence
of sickness absence
between males and females
6.3 The paired t-test
• The Captopril example discussed before considered paired data: Each observation before treatment uniquely corresponds to one observation after treatment (from the same patient), and vice versa
• The paired t-test analyses paired observations:
. Before and after treatment
. Married couples: male and female
. Twin studies
. Ophthalmology: left and right eye
. . . .
6.4 Assumptions
• Most statistical procedures are based on assumptions about the distribution of the observations in the population
• For example, the unpaired t-test, used before to compare weight gains under two different diets, assumed weight gains to be normally distributed, with the same amount of variability in both groups:
[Figure: two normal densities with equal spread, centred at µ2 (low protein) and µ1 (high protein)]
• If assumptions are not satisfied, wrong results can be obtained
• One will therefore always explore the observed data to check whether the assumptionsare supported by the data.
• In large samples however, results are less sensitive to the underlying assumptions.
Part II
Critical appraisal of literature
Chapter 7
Errors in statistics: Basic concepts
. Introduction
. Two types of errors
. Power
. Sample size calculation
. Examples
. Remarks
. Example from the biomedical literature
7.1 Introduction
• Re-consider the example on the weight gain in rats, where interest is in the comparison between rats fed on a high or low protein diet
• Group-specific histograms:
• Group-specific summary statistics:
• On average, there is an observed difference of 19g between the rats on a high protein diet and those on a low protein diet.
• Based on the unpaired t-test, we obtained before that this observed difference is not sufficient evidence to believe that the weight gain is really different for the two diets (p = 0.0757)
• Conclusion:
There is no significant difference (p = 0.0757) in weight gain
between rats on a high protein level diet,
and rats on a low protein level diet
• As indicated before, the result of a statistical test should be interpreted as evidence in favour of or against the null hypothesis, and should not be interpreted as formal proof.
• In our example, the difference in weight gain between a population treated with one diet and a population treated with the other diet is too small to be detected based on 12 and 7 animals, respectively.
• Alternatively, if the t-test had led to p = 0.001, this would still not formally prove that there is a difference between both populations.
• After all, p = 0.001 would only indicate that the observed difference of 19g occurs once every 1000 times, even if there is no difference at all between both populations.
• Maybe our sample was indeed the extreme one that happens once every thousand experiments.
• Hence, whenever statistical tests are used, one has to be aware that errors in the conclusions can occur.
• It is therefore important to quantify these errors, and to keep them under control
7.2 Two types of errors
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  No error        Type II error
result  Reject H0  Type I error    No error
• Type I error: H0 is incorrectly rejected
• Type II error: H0 is incorrectly accepted
7.3 Type I error
• A type I error occurs if H0 is correct but the test leads to a significant result.
• Question:
How likely is such an error to occur ?
• Suppose the test is performed at the α = 5% level of significance
• If H0 is correct, then one will observe a significant result in 5% of the cases
• Hence, in 5% of the cases, H0 would be incorrectly rejected
• The probability of making a type I error is therefore equal to the chosen level α of significance.
• In practice, the probability of making a type I error is kept under control by choosing α sufficiently small
• In biomedical sciences α = 5% is often used, thereby accepting a type I error in 5% of the cases.
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  1 − α
result  Reject H0  α
Total              1
• If H0 is correct, then the probability of making a type I error is α, while the probability of correctly accepting H0 is 1 − α.
7.4 Type II error
• A type II error occurs if H0 is incorrect but the test has not detected this, i.e., a non-significant result is obtained
• Question:
How likely is such an error to occur ?
• In contrast to the type I error, the probability of making a type II error is not easily controlled, and depends on various aspects of the sample(s) and population(s)
• In analogy to the type I error, the type II error rate is denoted by β
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  1 − α           β
result  Reject H0  α               1 − β
Total              1               1
• The power of a statistical test is 1 − β, the probability of correctly rejecting H0
7.5 Power
• In general, a specific testing procedure is acceptable only if:
. the type I error rate is sufficiently small
. the power to detect deviations from H0 is sufficiently large
• The first condition can be met by specifying α sufficiently small.
• The second condition is more difficult to meet, as the power depends on various aspects of the sample(s) and population(s)
• This will be illustrated in the context of the comparison of two groups (such as the weight gain experiment)
• As before, let µ1 and µ2 represent the average weight gain in the total population, under high and low protein diets, respectively.
• The null and alternative hypotheses are given by
H0 : µ1 = µ2 versus HA : µ1 ≠ µ2
• The power is the probability of correctly rejecting H0.
• In that case, µ1 ≠ µ2, and we denote the true difference between both populations by ∆ = µ1 − µ2
• The unpaired t-test assumes the data to be normally distributed in both populations, with equal variability σ²
• Graphically:
[Figure: two normal densities (low and high protein) centred at µ2 and µ1, with true difference ∆ and common variance σ² in each group]
7.5.1 Power as a function of α
The smaller α, the smaller the power
• Intuitively: Type I errors are less likely if the null hypothesis is rejected less often. However, in cases where H0 is truly wrong, it will also be rejected less often.
• An extreme case is obtained for α = 0:
. α = 0 implies that the null hypothesis is always accepted
. So, in case the null hypothesis is wrong, it is still accepted, leading to power 0
7.5.2 Power as a function of true difference ∆
The smaller ∆, the smaller the power
• Intuitively: Large deviations from the null hypothesis are easier to detect
[Figure: two pairs of normal densities for low and high protein; a large difference ∆ between µ2 and µ1 (top) is easier to detect than a small ∆ (bottom)]
7.5.4 Power as a function of variability σ²
The smaller σ², the larger the power
• Intuitively: Homogeneous groups are easier discriminated than heterogeneous groups
[Figure: two pairs of normal densities with the same difference ∆; groups with small within-group variance σ² (bottom) are separated more clearly than groups with large σ² (top)]
Advanced statistical methods 94
7.5.4 Power as a function of sample size(s)
The more observations, the larger the power
• Intuitively: more observations yield more information about the population(s), therefore implying more precision in the conclusions
7.5.5 Conclusion
• The power depends on various aspects:
. Level of significance α
. True difference ∆ between the populations
. Within-group variance σ²
. Sample size(s)
• Note that the sample size is the only aspect under control of the investigator.
• In practice, one can calculate the sample size needed to reach a sufficiently high power.
7.6 Sample size calculation
• As indicated before, a testing procedure is only acceptable if it has sufficient power, i.e., if the probability of making a type II error is sufficiently small.
• Since the sample size is the only aspect influencing the power that is under control of the investigator, it is important that experiments are sufficiently large in order for the power to be sufficiently large as well
• The level α of significance is chosen such that the probability of making a type I error is sufficiently small
• The within-group variance σ² is pre-specified based on earlier, similar experiments, relevant literature, or a pilot study
• To be on the safe side, usually an upper bound for σ² is used: in case the variability would be smaller, the power would be higher, hence still sufficiently high
• In practice, ∆ is not known. Instead, the smallest ∆ which would still be clinically relevant to detect is specified.
• If sufficient power is attained for the smallest meaningful ∆, we have that:
. Any larger difference will be detected with even larger power
. We are not concerned about low power for detecting smaller differences, as such differences are not relevant anyway.
• One can then calculate the number(s) of observations needed to reach a desired level of power.
7.7 Example: Weight gain data
• In the weight gain data, the observed difference of 19g was found not to be significant (p = 0.0757)
• We can calculate the power that a real difference of 19g would be found significant if a new experiment were to be conducted, again with 12 and 7 observations in the high and low protein diet groups, respectively.
• Group-specific summary statistics, from the current experiment:
• Power calculations will be based on σ = 21, and α = 0.05
• The power to detect a difference ∆ of 19g equals 43.45%
• Hence, with 12 and 7 observations respectively, there is only a 43.45% chance that a true difference of 19g would be detected.
• If a difference of 19g is considered clinically relevant, then the weight gain experiment was clearly too small, since it is very likely that such a difference would remain undetected.
• We can also calculate the power for other values of ∆
• Summary:
∆ Power to detect a difference ∆
0g 5.00%∗
10g 15.70%
19g 43.45%
30g 80.80%
40g 96.49%
∗: equal to α
• For example, 12 and 7 observations would be sufficient to show a true difference of 40g with more than 96% chance.
• Alternatively, one can also calculate how large the samples should be to detect a difference of, e.g., 20g with sufficiently high power.
• If a power of 90% is required to detect true effects as small as ∆ = 20g, at least 25 observations are needed in each group.
• With 30 observations in each group, the probability of making a type II error, when the true effect is not smaller than 20g, is approximately 5%.
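The power and sample-size figures quoted above can be reproduced with a short script. This is a sketch, not the software used for the slides: it assumes a two-sided unpaired t-test with the slide's σ = 21, and the helper name `power_two_sample_t` is ours.

```python
import numpy as np
from scipy import stats

def power_two_sample_t(delta, sigma, n1, n2, alpha=0.05):
    """Power of the two-sided unpaired t-test to detect a true difference delta."""
    se = sigma * np.sqrt(1 / n1 + 1 / n2)   # standard error of the difference
    ncp = delta / se                        # non-centrality parameter
    df = n1 + n2 - 2
    tcrit = stats.t.ppf(1 - alpha / 2, df)  # two-sided critical value
    # probability of rejecting H0 under the non-central t alternative
    return stats.nct.sf(tcrit, df, ncp) + stats.nct.cdf(-tcrit, df, ncp)

print(round(100 * power_two_sample_t(19, 21, 12, 7), 1))  # roughly 43%, as on the slide

# smallest equal group size giving 90% power for delta = 20g
n = 2
while power_two_sample_t(20, 21, n, n) < 0.90:
    n += 1
print(n)  # around 25 per group
```

Setting ∆ = 0 returns α itself, which is the footnote attached to the 0g row of the summary table.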
7.8 Example: Sickness absence
• We re-consider the data on sickness absence, collected on 585 employees with a similar job:

                  Sickness absence
                  No     Yes    Total
Gender   female   245    184      429
         male      98     58      156
Total             343    242      585
• The observed difference between the absence rates, 42.9% in females and 37.2% in males, was found not significant (chi-squared test, p = 0.215).
• In case the percentages of sickness absence would be 42% in the total female population, and 37% in the total male population, and in case a random sample of 429 females and 156 males would be taken, there would be a 19.01% chance to reach a significant effect.
• So, if the population proportions are indeed 42% and 37%, an experiment with 429 and 156 observations would detect this difference only 19 times out of 100 experiments.
• If a difference of 5% is considered clinically relevant, then the current experiment was clearly too small, since it is very likely that such a difference would remain undetected.
• We can calculate how large the samples should be in order to detect a difference between 42% and 37%, with sufficiently high power
• For example, two samples of approximately 2500 observations each are needed in order to show a difference between 37% and 42% with 95% probability
• Compared to the weight gain example, many more observations are needed:
. Different outcomes imply that the ∆ values are not comparable
. Binary data, in general, contain less information than continuous data
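The same logic applies to the comparison of two proportions. A minimal normal-approximation sketch (our own helper, not the software used for the slides; exact chi-squared-based calculations can differ slightly in the second decimal):

```python
from math import sqrt
from scipy.stats import norm

def power_two_props(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of the two-sided z-test comparing two proportions."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # SE under the alternative
    z = abs(p1 - p2) / se
    zcrit = norm.ppf(1 - alpha / 2)
    return norm.sf(zcrit - z) + norm.cdf(-zcrit - z)

print(round(100 * power_two_props(0.42, 0.37, 429, 156), 1))  # close to the 19% quoted above

# smallest equal group size giving 95% power for 42% versus 37%
n = 100
while power_two_props(0.42, 0.37, n, n) < 0.95:
    n += 1
print(n)  # roughly 2500 per group, as on the slide
```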
7.9 Remarks
• The earlier examples of power and/or sample size calculations were in the context ofthe unpaired t-test and chi-squared test.
• Similar calculations can be done in any other statistical testing situation, e.g., Fisher's exact test, paired t-test, McNemar test, . . .
• Strictly speaking, all experiments should be preceded by a realistic sample size calculation, to avoid experiments with unacceptably high type II error rates, i.e., with almost no chance at all to show clinically meaningful effects.
7.10 Example from the biomedical literature
Wong et al. [1]
• Methodology section, p.658:
• Table 2 with results:
• Discussion, p.664:
• The difference on which the sample size calculation was based was much larger than what was actually observed in the experiment
• Therefore, the power to reject equality of the groups was (much) lower than the expected 80%
• The current study cannot tell the difference between a 9% increase and a 3% decrease.
• If such differences are considered clinically important, then the current study was under-powered, due to the fact that the difference was overestimated at the time of the sample size calculation.
Chapter 8
Errors in statistics: Practical implications
. Multiple testing
. Bonferroni correction
. Tests for baseline differences
. Equivalence tests
. Significance versus relevance
. Examples from biomedical literature
8.1 Multiple testing
• Each time a test is performed, there is a probability α of making a type I error
• For example, if α = 0.05, we can expect to incorrectly reject the null hypothesis in 5 out of 100 times.
• Implication:
“The more tests one performs, the higher the probability that something is detected by pure chance”
• This problem of multiple testing occurs very frequently in the biomedical sciences, in various settings
8.1.1 Example: A classroom experiment
• On entry into the classroom, assign each student at random to be seated at the left or at the right side of the classroom
• Compare both sides with respect to 100 aspects, including weight, height, age, gender, color of hair, color of eyes, . . .
• It is to be expected that, for at least 5 of these outcomes, a significant difference is obtained at the 5% level of significance, by pure chance.
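The arithmetic behind this expectation can be sketched directly, assuming for simplicity that the 100 tests are independent:

```python
alpha, k = 0.05, 100

# expected number of false positives among k true null hypotheses
expected_false_positives = k * alpha
print(expected_false_positives)      # 5.0

# probability of at least one false positive (independence assumed)
p_at_least_one = 1 - (1 - alpha) ** k
print(round(p_at_least_one, 3))      # 0.994
```

So with 100 tests of true null hypotheses, finding "something significant" is almost certain.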
8.1.2 Example: Testing many relations
• Amin et al. [2], Table 2:
. 18 tests performed
. only 2 significant results
8.1.3 Example: Subgroup analyses
• Kaplan et al. [3], Table 5:
. Tests based on C.I.’s for odds ratios
. A C.I. containing 1 is equivalent to a non-significant test result
. 21 × 3 = 63 tests performed
. only 5 significant results
8.1.4 Example: Searching for the most significant results
• This ‘scientific finding’ was printed in the Belgian newspapers:
• It was even stated that those who wake up before 7.21am have a statistically significantly higher stress level during the day than those who wake up after 7.21am.
8.1.5 Conclusion
• Significant results obtained by multiple testing are often overinterpreted
• If the number of tests is reported, the reader knows that such results need to be interpreted with extreme care
• The problem arises when only the significant results are reported, and one does not know how many tests were performed in total
• This leads to reporting results which turn out not to be reproducible:
. For example, a new study would not find that students seated on the left are taller than those on the right. Instead, students seated on the left may weigh more than those seated on the right.
. For example, a new experiment might show no difference in stress levels between subjects waking up early and those waking up late. Or maybe a difference would be found only when waking up is later than 8.12am.
8.2 Bonferroni correction
• Suppose two tests are performed, both at the 5% level of significance.
• The probability that at least one type I error will be made can be shown not to exceed 2 × 0.05 = 0.10:

P(at least 1 type I error) ≤ 2 × 5% = 10%

• In general, if k tests are performed, all at the 5% level of significance, the probability of making at least one type I error can only be shown not to exceed k × 5%
• Obviously, controlling the overall type I error rate can be done by performing each separate test at the α/k level of significance.
• For example, performing 2 tests at the 2.5% level of significance each implies that the probability of making at least one type I error will not exceed 5%.
• In general, when k tests are performed at the α/k level of significance, one is sure that the overall probability of making at least one type I error will not exceed α.
• This correction of the significance level is called the Bonferroni correction.
• When confidence intervals are used instead of p-values, the confidence levels can be corrected in a similar way
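As a minimal sketch, the correction amounts to comparing each p-value against α/k (the helper name is ours):

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0_i only when p_i < alpha / k, keeping the overall type I error rate below alpha."""
    k = len(pvalues)
    return [p < alpha / k for p in pvalues]

# three tests: only the first survives the corrected threshold 0.05/3 ≈ 0.0167
print(bonferroni_reject([0.001, 0.020, 0.040]))  # [True, False, False]
```

Note that 0.020 and 0.040 would both have been "significant" without the correction.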
• Some examples:
Number of tests Significance level α Confidence level
1 0.05 95%
2 0.025 97.5%
5 0.01 99%
k 0.05/k (1 − 0.05/k) × 100%
• For example, if CI1, CI2, . . . , CI5 are 5 intervals with 99% confidence, for 5 unknown parameters θ1, θ2, . . . , θ5, then there is at least 95% probability that all 5 C.I.'s will contain all 5 unknown parameters:

P(CI1 contains θ1 and . . . and CI5 contains θ5) ≥ 95%
• Note that, strictly speaking, the Bonferroni correction is an overcorrection, since the overall type I error rate can only be shown not to exceed 5%, and usually will be smaller than the required 5%.
• In some specific testing situations (e.g., ANOVA), more accurate corrections are available.
8.3 Examples from the biomedical literature
• Baba et al. [4], p.1202 and p.1203:
• Kellett et al. [5], Table 2 (for example):
In the discussion, R. Roy writes:
Note that the reader cannot perform the Bonferroni correction, as the exact p-values have not been reported.
8.4 Tests for baseline differences
• In order to show causal effects, patients are often randomized into 2 or more groups
• This ensures (at least in large studies) that all treatment groups are identical, except for the treatment the patients receive
• In (relatively) small studies, imbalances can still occur by pure chance
• Therefore, one often compares the various groups with respect to important factors which are believed to be strongly related to the outcome of interest.
• This is called testing for baseline differences, as one compares the characteristics of the patients at the start of the study.
• As an example, suppose interest is in comparing two oral treatments, A and B, for hypertension.
• Suppose the change in diastolic BP is the outcome of interest
• Age is one of the factors believed to be strongly related to BP. Therefore, it is important that both treatment groups have the same age distribution
• Therefore, one often tests for age differences between A and B, e.g., based on the two-sample t-test.
• The hypothesis tested is

H0 : µA = µB versus HA : µA ≠ µB
• Note that H0 and HA express properties of the populations, not the samples
• In the populations (infinitely large), we know that, due to the randomization, µA and µB are identical
• Conclusion:

It makes no sense at all to perform baseline tests in randomized studies

• No matter how small the resulting p-value would be (e.g., < 10⁻⁸), we know that the observed difference in age between groups A and B has occurred purely by chance.
• A meaningful alternative is to calculate a C.I. for the average age difference between both groups, to ensure that the observed difference is sufficiently small to conclude that it cannot (completely) explain the observed differences in the outcome of interest.
• In our example, suppose that a 95% confidence interval for the average difference in age (years) is given by [0.1; 0.3]; then we believe that this difference would be too small to explain why patients in group A show more decrease in BP than patients in group B.
• Note also that testing for baseline differences cannot be used to check whether the randomization was done properly.
8.5 Example from the biomedical literature
Nissen et al. [6], abstract and table 1:
A two-arm randomized study
formal tests at baseline
8.6 Equivalence tests
• Suppose two groups A and B are to be compared, with the hypotheses to be tested:

H0 : µA = µB versus HA : µA ≠ µB

• In case of a non-significant test result, one often concludes that both groups are identical or equivalent
• An alternative interpretation is that the experiment did not have sufficient power to show an effect which is present.
• Conclusion:
Non-significance should not be interpreted as equivalence
• This can also be seen from the fact that, if the two-sample t-test could be used to show equivalence, it would be best to collect data on (extremely) small samples, as this would increase the chance of obtaining a non-significant result, due to lack of power.
• Instead, one should reverse H0 and HA:
H0 : |µA − µB| > ∆ versus HA : |µA − µB| ≤ ∆
where ∆ is a pre-specified constant, defining ‘equivalence’
• Note that HA is equivalent to −∆ ≤ µA − µB ≤ ∆
• Hence, in order to reject H0, one needs to show evidence that µA and µB are less than ∆ away from each other
• One way to proceed is to construct a C.I. for µA − µB and to check whether it is entirely within the interval [−∆; ∆].
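This C.I.-based check can be sketched in a few lines (the numbers below are made up for illustration; ∆ would have to be pre-specified as discussed further on):

```python
def conclude_equivalence(ci_lower, ci_upper, delta):
    """Reject H0 (non-equivalence) only if the whole C.I. lies inside [-delta, +delta]."""
    return -delta <= ci_lower and ci_upper <= delta

print(conclude_equivalence(-1.2, 0.8, delta=2.0))  # True: C.I. inside the equivalence margin
print(conclude_equivalence(-1.2, 2.5, delta=2.0))  # False: C.I. extends beyond +delta
```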
• Graphically, H0 would be rejected if:
[Figure: the 95% C.I. for µA − µB lies entirely within the interval [−∆; +∆], so H0 is rejected]
• Graphically, H0 would not be rejected if:
[Figure: the 95% C.I. for µA − µB extends beyond the interval [−∆; +∆], so H0 is not rejected]
• Obviously, the result of the equivalence test entirely depends on the choice of ∆
• Therefore, ∆ needs to be specified prior to the data collection
8.7 Example from the biomedical literature
Shatari et al. [7]:
• Title:
• Table 1:
No significant differences!
• Results and conclusions (abstract):
8.8 Significance versus relevance
• We discussed before that the power to detect some effect ∆ increases with the sample size
• This implies that any effect ∆, no matter how small, will, sooner or later, be detected, if the sample is sufficiently large.
• For example, consider the Captopril data, where the observed difference of 9.27 mmHg was found significantly different from zero (p < 0.001), based on data from only 15 patients:
• The 99% confidence interval for the average change µ in BP was found to be [3.02; 15.52].
• Suppose that the observed difference had been 0.1 mmHg.
• A p-value as small as 0.001 would be likely to be obtained, provided that the sample were sufficiently large.
• Obviously, an average change in BP as small as 0.1 mmHg is not relevant from a clinical point of view.
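How a fixed, tiny mean change becomes "significant" once n is large enough can be sketched from summary statistics alone (the standard deviation of 8 mmHg is an assumption for illustration, not taken from the Captopril data):

```python
import numpy as np
from scipy import stats

def p_value_one_sample(mean, sd, n):
    """Two-sided one-sample t-test of H0: mu = 0, from summary statistics."""
    t = mean / (sd / np.sqrt(n))
    return 2 * stats.t.sf(abs(t), n - 1)

# the same clinically irrelevant change of 0.1 mmHg, at increasing sample sizes
for n in (15, 1_000, 100_000):
    print(n, p_value_one_sample(0.1, 8.0, n))  # the p-value shrinks as n grows
```

With 15 patients the p-value is far from significant; with 100,000 it drops well below 0.001, even though the effect has not changed.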
• Conclusion:
Statistical significance ≠ Clinical relevance
• A highly significant effect can be a large effect:
[Figure: a 95% C.I. for µ lying far from 0 and covering large effect sizes; p = 0.0001]
• A highly significant effect can also be a very small effect, but estimated with high precision, due to a large sample size:
[Figure: a very narrow 95% C.I. for µ, close to 0 but excluding it; p = 0.0001]
• The p-value cannot distinguish between these two situations
• It is therefore important not to blindly overinterpret significant results without knowing the size of the effect
• This is another reason why confidence intervals are to be preferred over significance testing
Part III
Simple linear regression
Chapter 9
The Pearson correlation coefficient
. Example
. Pearson correlation
. Properties and interpretation
. Statistical inference
. Application
. Examples from the biomedical literature
9.1 Example
• In the literature, it is suggested that a decreased cognitive status implies an increased dependence in post-operative hip fracture patients.
• Therefore, we investigate the relationship between MMSE and ADL, 1 day postoperation.
• For each patient, we have two measurements:
. The MMSE score: xi for the ith patient
. The ADL score: yi for the ith patient
• Hence, the data are ordered pairs (xi, yi)
• A graphical representation of the relationship between MMSE and ADL can be obtained via a scatter plot of the yi versus the xi:
• The graph suggests a negative relationship between MMSE and ADL.
9.2 The Pearson correlation coefficient
• The relationship between two variables is often summarized using the Pearson correlation coefficient:

r = Σi (xi − x̄)(yi − ȳ) / ( √Σi (xi − x̄)² √Σi (yi − ȳ)² )

• The sample averages x̄ and ȳ of the x-observations and the y-observations, respectively, are given by:

x̄ = (1/n) Σi xi,   ȳ = (1/n) Σi yi
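The formula above translates directly into code. A sketch with made-up MMSE-like and ADL-like scores (n = 5; the data are hypothetical), cross-checked against NumPy's built-in:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient, computed exactly as in the formula above."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

x = [10, 14, 19, 23, 28]  # hypothetical MMSE-like scores
y = [21, 19, 16, 15, 12]  # hypothetical ADL-like scores
print(round(pearson_r(x, y), 3))                  # strongly negative
print(round(float(np.corrcoef(x, y)[0, 1]), 3))   # np.corrcoef gives the same value
```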
9.3 Properties and interpretation
r = Σi (xi − x̄)(yi − ȳ) / ( √Σi (xi − x̄)² √Σi (yi − ȳ)² )

[Figure: scatter plot of the yi versus the xi, divided into four quadrants (+,+), (+,–), (–,+), (–,–) around the point (x̄, ȳ)]
The correlation coefficient measures the linear relationship between X and Y, and enjoys the following properties:
• −1 ≤ r ≤ 1
• r < 0 : negative linear relationship between the xi and the yi
• r > 0 : positive linear relationship between the xi and the yi
• r = −1 : the points (xi, yi) lie perfectly on a decreasing straight line
• r = 1 : the points (xi, yi) lie perfectly on an increasing straight line
• r = 0 : there is no LINEAR relationship between the xi and the yi
9.4 Statistical inference
• The correlation coefficient is calculated based on the observations (xi, yi), and is an estimator for the theoretical correlation ρ in the population
• In practice, one wants to test whether or not there is a linear relationship between the variables X and Y, i.e., whether the correlation ρ is significantly different from zero.
• Formally, we want to test the null hypothesis

H0 : ρ = 0

versus the alternative hypothesis

HA : ρ ≠ 0

• The corresponding test procedure assumes that the variables X and Y are jointly normally distributed.
9.5 Application
• Correlation matrix for ADL and MMSE at days 1, 5, and 12 post-operatively:
• Corresponding scatter plot matrix:
• The correlation between MMSE and ADL on day 1 is r = −0.70 and is significantly different from zero (p < 0.0001).
• Hence, we can conclude that there is a strong negative linear relationship between MMSE and ADL, 1 day post operation: the lower the cognitive status of the patient, the higher the dependence.
9.6 Examples from the biomedical literature
• Serrano-Gallardo et al. [8]:
. Methodology section, p. 4:
For data analysis, we performed univariate analyses (measures of central tendency and dispersion or percentages, depending on the variables’ nature) and bivariate analyses (Student’s t-test, ANOVA, and
. Results section, p. 6:
There was no evidence of an association between the clinical learning and students’ age (Pearson
there was evidence for an association with the grades
The multiple linear regression model (adjusted
• Salehi et al. [9], Table 3:
Table 3: The correlation between various domains of social well-being in School of Midwifery and Nursing students in Shiraz University of Medical Sciences

                       Social          Social     Social       Social      Social
                       actualization   coherence  integration  acceptance  contribution
Social actualization   1               -          -            -           -
Social coherence       0.96            1          -            -           -
Social integration     0.96            0.94       1            -           -
Social acceptance      0.97            0.96       0.94         1           -
Social contribution    0.96            0.96       0.94         0.97        1
For all domains every P value is less than 0.0001; Pearson correlation
Chapter 10
Simple linear regression
. Introduction
. The method of least squares
. Application
. Statistical inference
. The ANOVA table
. Application
. Examples from the biomedical literature
10.1 Introduction
• The correlation coefficient r measures the linear relationship between two variables, X and Y. How can we describe this linear relationship?
• One possible way would be to construct the straight line that best fits the observed measurements:
• A straight line is described analytically by an equation of the form
y = β0 + β1x
• The parameter β0 is the intercept, β1 is the slope.
• If β1 > 0 :
. There is a positive relationship between x and y
. The larger β1, the faster y increases with x
• If β1 < 0 :
. There is a negative relationship between x and y
. The smaller β1, the faster y decreases with x
• In practice, one needs to estimate the parameters β0 and β1 based on the collected data (xi, yi).
10.2 The least squares method
• To estimate β0 and β1, we first need to decide which criterion should be satisfied by ‘the best’ straight line
[Figure: data points (xi, yi) scattered around a candidate straight line y = β0 + β1x with intercept β0; for each xi, the line gives a predicted value ŷi for the observed yi]
• If we knew β0 and β1, then for each observation in the data set, a predicted value for y could be calculated based on the x value:

ŷi = β0 + β1xi

• The prediction will be good if ŷi lies close to yi, and will be poor if ŷi deviates strongly from yi
• If the straight line describes the data (xi, yi) adequately, then we expect, for most points, ŷi to lie close to the true value yi.
• A possible measure of how well the straight line has been chosen is

Q = Σi (yi − ŷi)² = Σi [yi − (β0 + β1xi)]²

• Hence, Q is a measure of how closely the data lie to the straight line y = β0 + β1x.
• Note that other straight lines (i.e., other β0 and β1) will lead to different Q values.
• The straight line that describes the data best is the one for which Q is smallest.
• The least squares method calculates the values of β0 and β1 for which Q is minimal.
• It can be shown that these values are given by:

β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²,   β̂0 = ȳ − β̂1 x̄

• β̂0 and β̂1 are termed the least squares estimators for β0 and β1.
• The straight line so obtained,

ŷ = β̂0 + β̂1 x

is termed the regression line.
• Once the estimators for β0 and β1 are known, we can make a prediction for y based on x, for each observation in the data set:

ŷi = β̂0 + β̂1 xi

• We are also able, for each data point (xi, yi) in the data set, to compute the error made when predicting yi by ŷi:

ei = yi − ŷi = yi − (β̂0 + β̂1 xi)
• The quantities ei are termed residuals:
. ei > 0 : the observed yi lies above the regression line
. ei = 0 : the observed yi lies on the regression line
. ei < 0 : the observed yi lies underneath the regression line
• Further, one can show that

Σi ei = 0

i.e., the points above the regression line are ‘in equilibrium’ with those underneath the regression line.
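The least squares formulas above, applied to a small made-up data set, and checked against the residual property Σi ei = 0 (and against NumPy's `polyfit`):

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form least squares estimates for the line y = b0 + b1 * x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])       # made-up data, roughly y = 2x
b0, b1 = least_squares_fit(x, y)

residuals = y - (b0 + b1 * x)
print(abs(residuals.sum()) < 1e-9)            # True: residuals sum to zero

b1_np, b0_np = np.polyfit(x, y, 1)            # NumPy agrees with the closed form
print(np.allclose([b0, b1], [b0_np, b1_np]))  # True
```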
10.3 Application
• Regression of ADL on MMSE, one day post-operatively, yields the following regression coefficients:
• The Y variable is termed response, or also dependent variable.
• The X variable is termed covariate, or also independent variable.
• The parameter estimates are β̂0 = 23.65 and β̂1 = −0.30.
• The corresponding regression line is

ADL = 23.65 − 0.30 × MMSE

• The regression line predicts an ADL score of 23.65 if MMSE is equal to zero.
• Further, there is a negative linear relationship between MMSE and ADL: the higher the MMSE, the lower the ADL, and vice versa.
• The regression line predicts a decrease of 0.30 in ADL for a unit increase in MMSE.
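The interpretation of the fitted line can be made concrete with the coefficients from the output (the numbers come from the slide; the helper name is ours):

```python
def predicted_adl(mmse):
    """Fitted regression line from the output: ADL-hat = 23.65 - 0.30 * MMSE."""
    return 23.65 - 0.30 * mmse

print(round(predicted_adl(0), 2))                       # 23.65: predicted ADL at MMSE = 0
print(round(predicted_adl(21) - predicted_adl(20), 2))  # -0.3 per extra MMSE unit
```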
• Graphical representation:
[Figure: the fitted regression line of ADL versus MMSE; a difference of 10 units in MMSE corresponds to a difference of 10 × β̂1 units in predicted ADL]
• This ought to be interpreted as follows:
. Consider two groups of patients
. All patients in the first group have identical MMSE (e.g., 20).
. All patients in the second group have identical MMSE values too, but 1 unit higher than those in the first group (hence, 21).
. Then, we expect the difference in average ADL score between both groups to be 0.30, with the lower score for the group with the higher MMSE.
• Hence, we should not conclude that an increase of MMSE by 1 unit in a given patient will lead to a decrease of 0.30 in ADL.
• Hence, we cannot draw ‘longitudinal’ conclusions from a ‘cross-sectional’ experiment.
10.4 Statistical inference
10.4.1 Introduction
• We obtained the following regression output:
• The p-values listed test the hypotheses
H0 : β0 = 0 versus HA : β0 ≠ 0 and H0 : β1 = 0 versus HA : β1 ≠ 0
• Indeed, the least squares method allows us to calculate the straight line that best describes our observations (xi, yi).
• However, a different sample from the same population would lead to a different regression line
y = β0 + β1x
• Illustration:
→ regression plots
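The sampling variability of the fitted line can be mimicked with a small simulation. A minimal Python sketch, using illustrative population values (β0 = 12, β1 = −0.30, not the course data): each simulated sample yields a different least squares line.

```python
import random

def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the line y = b0 + b1*x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

random.seed(1)
slopes = []
for _ in range(200):  # 200 independent samples of size 50
    xs = [random.uniform(10, 30) for _ in range(50)]
    ys = [12 - 0.30 * x + random.gauss(0, 1) for x in xs]
    slopes.append(fit_line(xs, ys)[1])

# The 200 estimated slopes scatter around the population value -0.30
print(min(slopes), max(slopes))
```

The spread of the 200 slopes is exactly the sampling variability that p-values and confidence intervals have to account for.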
• Based on a sample, and hence the corresponding estimators β̂0 and β̂1, statistical inference (p-values, confidence intervals) aims to make a statement about the regression line
y = β0 + β1x
that captures the relationship in the entire population.
• This is not possible without additional assumptions about the distribution from which the data are sampled.
• The assumptions needed are described by the so-called regression model.
10.4.2 The simple linear regression model
• In realistic situations, the points (xi, yi) will never describe a perfect straight line, but rather a cloud of points.
• This implies that the observations do not satisfy
yi = β0 + β1xi
but rather
yi = β0 + β1xi + εi
where εi expresses how much an observation yi lies above or below the regression line.
• The quantities εi are termed errors, and the linear regression model assumes that they are distributed following a normal distribution with mean 0 and (unknown) variance σ²:
εi ∼ N(0, σ²)
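As a sketch, data satisfying this model can be generated directly; the values b0 = 12, b1 = −0.30, σ = 1 below are illustrative assumptions, not the course data:

```python
import random
import statistics

# Generate data from the model y_i = b0 + b1*x_i + e_i with e_i ~ N(0, sigma^2)
random.seed(7)
b0, b1, sigma = 12.0, -0.30, 1.0
xs = list(range(10, 31))                       # covariate values
errors = [random.gauss(0, sigma) for _ in xs]  # the epsilon_i
ys = [b0 + b1 * x + e for x, e in zip(xs, errors)]

# The generated errors have mean near 0 and spread near sigma,
# exactly what the model assumes about the epsilon_i
print(round(statistics.mean(errors), 2), round(statistics.stdev(errors), 2))
```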
• Note that the εi are the ‘theoretical version’ of the residuals ei
• Hence, the regression model assumes . . .
. . . . linearity: for each X, the mean of the corresponding Y-values lies on the regression line
. . . . normality: for each X, the corresponding Y-values lie symmetrically around the regression line
. . . . constant variance: the prediction errors for small X-values are neither larger nor smaller than the errors for large X-values
10.4.3 Significance tests for β0 and β1
• If the slope β1 is equal to zero, then the regression model is described by
yi = β0 + εi
which implies that there is no linear relationship between Y and X .
• In practice, if we want to test whether there is a linear relationship between X and Y, then we need to test the null hypothesis:
H0 : β1 = 0 versus HA : β1 ≠ 0
• The value observed in our sample is β̂1 = −0.30
• This value could have been obtained by chance, even if β1 = 0 would hold in the total population.
• Research question:
How likely is it to observe β̂1 = −0.30 even if β1 = 0?
• Illustration:
→ histograms of slope and intercept
• It is clear that, when β1 = 0, it is very unlikely to still observe β̂1 = −0.30.
• Note that it would be equally unlikely to observe β̂1 = +0.30.
• The chance that we would find an estimate with |β̂1| ≥ 0.30 is p < 0.0001.
• Given that this probability is so small, more specifically that p < α = 0.05 = 5%, we conclude that what has been observed (β̂1 = −0.30) is sufficient indication to believe that β1 ≠ 0.
• Hence we reject the null hypothesis and conclude that β1 is significantly different from 0, at the 5% significance level.
• Apart from testing hypotheses, the regression model also allows constructing confidence intervals.
• For example, a 95% C.I. for β1 in our example is [−0.378; −0.218].
• Given that this interval is far away from 0, this is again strong evidence that β1 ≠ 0.
• Analogously, a significance test can be constructed for
H0 : β0 = 0 versus HA : β0 ≠ 0
• In practice, one is primarily interested in tests for β1.
• Note that all tests and confidence intervals are valid only when all regression model assumptions are satisfied.
10.5 The ANOVA table
• How much better can we predict Y , given that we know X?
[Figure: scatter plot of points (xi, yi) with the fitted line y = β̂0 + β̂1x, marking for one point the observed value yi and the fitted value ŷi at xi]
• Intuitively, this should be related to how well the dataset is described by the regression line
• If we did not have the x-values, then the best possible prediction for each yi-value is the sample average ȳ.
• A measure for the error so made is the sum of squares ∑i (yi − ȳ)²
• Note that this is a measure for the variability in the yi.
• If we do use the observed xi-values to predict the y-values, then we predict each yi by means of
ŷi = β̂0 + β̂1xi
• A measure for the error so made is the sum of squares ∑i (yi − ŷi)² = ∑i ei²
• Because the use of this extra information coming from the xi leads to more precise predictions, we have that
∑i (yi − ȳ)² ≥ ∑i (yi − ŷi)²
• One can show that
∑i (yi − ȳ)² = ∑i (yi − ŷi)² + ∑i (ŷi − ȳ)²
i.e., SSTO = SSE + SSR
• SSTO: Total sum of squares. This term captures the total error made by predicting the yi without taking into account the observed values xi.
• SSE: Error sum of squares. This term captures the error made upon predicting the yi by making use of the observations xi.
• SSR: Regression sum of squares. This term captures the decrease in error by predicting the values yi with, rather than without, making use of the covariates.
• A measure for how well the data points (xi, yi) agree with the regression line is
R² = SSR / SSTO
• R² enjoys the following properties:
. 0 ≤ R² ≤ 1
. R² = 0 implies that SSR = 0 and hence that all ŷi are equal to ȳ, i.e., the regression line is flat. This is equivalent to β̂1 = 0.
. R² = 1 implies that SSE = 0. This implies that ŷi = yi for all i, and hence that all points (xi, yi) lie on the regression line.
• R² expresses ‘the fraction of the variability in the yi which can be explained by the xi’.
• One can show that R² is equal to r², the square of the correlation between the xi and yi values.
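Both facts can be checked numerically. A minimal Python sketch on a small made-up data set (not the course data):

```python
import math

def regression_anova(xs, ys):
    """Least squares fit plus the three ANOVA sums of squares."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    yhat = [b0 + b1 * x for x in xs]
    ssto = sum((y - ybar) ** 2 for y in ys)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
    ssr = sum((yh - ybar) ** 2 for yh in yhat)
    return ssto, sse, ssr

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
ssto, sse, ssr = regression_anova(xs, ys)
assert abs(ssto - (sse + ssr)) < 1e-9   # the decomposition SSTO = SSE + SSR

r2 = ssr / ssto                         # fraction of variability explained
xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / math.sqrt(
    sum((x - xbar) ** 2 for x in xs) * sum((y - ybar) ** 2 for y in ys))
assert abs(r2 - r ** 2) < 1e-9          # R^2 equals the squared correlation
```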
10.6 Application
• ANOVA table for regression of ADL on MMSE on day 1 post-operatively:
• ‘R-square’: R² = 0.4940; the regression can explain about 50% of the total variability in the yi values:
R² = SSR / SSTO = 351.23 / (351.23 + 359.76) = 0.4940
• The Pearson correlation, found before, was:
r = −√R² = −√0.4940 = −0.70
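This arithmetic can be reproduced directly from the two sums of squares:

```python
# SSR and SSE values taken from the ANOVA table above
ssr, sse = 351.23, 359.76
r2 = ssr / (ssr + sse)   # SSR / SSTO, since SSTO = SSR + SSE
r = -(r2 ** 0.5)         # negative root, because the estimated slope is negative
print(round(r2, 4), round(r, 2))   # about 0.494 and -0.70
```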
10.7 Examples from the biomedical literature
• Kiekkas et al. [10], Table 2:
Table 2. Simple linear regression in all intensive care unit patients: mean daily Projet de Recherche en Nursing Rea and its categories as dependent variables and Acute Physiology and Chronic Health Evaluation II as explanatory variable

                      R²        a (M ± SE)      b (M ± SE)
PRN Rea               0.256*    121.1 ± 4.1*     1.9 ± 0.3*
Respiration           0.103*     21.5 ± 1.5*     0.4 ± 0.1*
Nutrition             0.003       2.6 ± 0.6*    −0.1 ± 0.1
Elimination           0.021       0.1 ± 0.2      0.1 ± 0.1
Hygiene               0.009      25.4 ± 0.7*    −0.1 ± 0.1
Mobilization          0.016      10.7 ± 0.7*     0.1 ± 0.1
Communication         0.029**     9.1 ± 1.0*    −0.1 ± 0.1**
Diagnostic methods    0.194*     30.5 ± 1.9*     0.7 ± 0.1*
Treatments            0.221*     21.5 ± 2.0*     0.9 ± 0.1*

R², determination coefficient; a, b, unstandardized regression coefficients (a: intercept, b: slope); M ± SE, mean ± standard error.
*p < 0.01. **p < 0.05.
• Frilund & Fagerstrom [11], statistical methodology section:
Data analysis
In order to establish the units’ optimal NI-level certain values were needed: the mean OPC points per day and per care giver, the personnel resources for the specific day (i.e. the actual time used to meet the needs of the patients), and a mean of the PAONCIL assessments for the same day. The data was analysed by means of a simple linear regression analysis. It is possible to predict a dependent variable by means of a regression equation Y = a + bx, i.e. it was possible to calculate the optimal NI score per nurse, that is, the score which led to the average PAONCIL value zero. It is analysed to what extent the independent variable (x = the mean OPC points per care giver per day) explains the variation in values of the dependent variable (y = the mean of the PAONCIL assessments) on the basis of a linear relationship. The explanatory power determination coefficient (R²) gives the proportion of variance in Y that is counted for by x. If for instance R² is 0.3, the model explains 30% of the variation in the values of the outcome variable (Fagerstrom et al. 2000 b,
Chapter 11
Model diagnostics
. Example
. Linearity
. Constant error variance
. Normality of the errors
. Examples from biomedical literature
11.1 Example
• We wish to assess whether a patient’s dependence (ADL), one day after surgery, can be used to predict a patient’s length of stay:
• There appears to be a slight increase of length of stay, as a function of the ADL score. Is this relationship significant?
• Therefore, we fit the following regression model:
Length of stay = β0 + β1 × ADL + εi
• Regression output:
• The parameter estimates are:
. β̂0 = 9.37
. β̂1 = 0.29, p-value: 0.1173
• The fitted regression line is
Length of stay = 9.37 + 0.29 × ADL
• Note that there is no significant relationship between length of stay and ADL score, 1 day post operation.
• Further, it follows from R² = 0.0432 that ADL explains only 4% of the total variability in length of stay.
11.2 Model assumptions
• The statistical inferences, obtained for the regression parameters, are valid only if the model assumptions are satisfied, i.e.,
yi = β0 + β1xi + εi,   εi ∼ N(0, σ²)
• Hence, the regression model assumes that . . .
. . . . linearity: for each X, the mean of the corresponding Y-values lies on the regression line
. . . . normality: for each X, the corresponding Y-values lie symmetrically around the regression line
. . . . constant variance: the prediction errors for small X-values are neither larger nor smaller than the errors for large X-values
• How can these assumptions be verified?
11.3 The assumption of linearity
• To illustrate the effect of non-linearity, consider the following fictitious example:
• There clearly is a positive relationship between the xi and yi, but the relationship appears to deviate somewhat from linearity.
• What happens if we still apply linear regression?
• Regression output:
• R² = 0.85: X explains 85% of the observed variability in Y.
• The regression line is given by
Y = 1.19 + 2.06X
• The slope β1 is significantly different from zero (p < 0.001).
• The observed points all lie close to the fitted regression line (explaining the high R²), but the straight line poorly describes the relationship between the xi and yi:
. Over-estimation of the yi for small and large xi
. Under-estimation of the yi for intermediate xi values
• The graph suggests that non-linearity can be discerned by studying the residuals
ei = yi − ŷi = yi − (β̂0 + β̂1xi)
and plotting them as a function of x:
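A sketch of this residual check, using made-up concave data that mimic the fictitious example (not the course data):

```python
import math

def fit_line(xs, ys):
    """Least squares estimates (b0, b1)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    return ybar - b1 * xbar, b1

xs = list(range(1, 11))
ys = [math.sqrt(x) for x in xs]          # concave, hence non-linear, trend
b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Systematic sign pattern: negative residuals at the extremes of x
# (over-estimation), positive in the middle (under-estimation)
print([round(e, 2) for e in residuals])
```

The residuals sum to zero by construction, so only a systematic sign pattern, not their average, reveals non-linearity.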
• If the assumption of linearity were satisfied, then, for each value of X, the corresponding values of Y would lie symmetrically around the regression line. The residuals ei would then have to lie symmetrically around zero, for all possible X values.
• Clearly, this is not satisfied in the above example.
• Note that the residuals in fact suggest that the relationship between the yi and the xi is rather a quadratic function. We return to this point as part of polynomial regression.
• Oftentimes, the covariate X can be transformed so that the yi, as a function of the transformed xi, can be assumed linear.
• Frequently used transformations include ln(X), √X, 1/X, exp(X), ln(X + 1), . . .
• For our fictitious example we try a logarithmic transformation of the observed xi:
xi −→ ln(xi)
• Regression output after log-transformation of X:
• Accompanying graph:
• Residual plot:
• R² = 0.92: Our model has improved, because we now can explain more variability in the y-values by means of the x-values.
• The estimated regression curve now is
Y = 2.95 + 0.80 ln(X)
• Hence, the transformation complicates the interpretation of the regression coefficients. For example, 0.80 is the estimated increase in Y when ln(X) increases by one unit.
• At the same time, the transformation is necessary to make the assumption of linearity more realistic, which in turn implies that our statistical inferences w.r.t. β0 and β1 improve.
11.4 Example: Length of stay versus ADL
• We now check whether the linearity assumption is satisfied in the regression model employed for the prediction of length of stay by means of the ADL score, 1 day post operation.
• The residual plot does not indicate any systematic trend in the residuals:
11.5 The assumption of constant variance
• For illustration, we study the relationship between diastolic blood pressure and age, using data of 54 healthy adult women, between 20 and 60 years of age:
• We conduct a regression of blood pressure on age:
• The regression explains more than 40% of the variability in blood pressure (R² = 0.4077); there is a significant (p < 0.0001) linear relationship between age and blood pressure; the estimated regression line is:
Blood pressure = 56.16 + 0.58 × Age
• Given that the residuals ei = yi − ŷi can be interpreted as estimates of the theoretical deviations εi, we can assess the assumption of constant variance for the εi via a scatter plot of the residuals:
• The residuals show that the linearity assumption is satisfied.
• On the other hand, the residual plot suggests that the variance of the εi increases with age.
• Violation of this assumption will lead to less than optimal inferences about the parameters β0 and β1:
. The estimated regression line remains correct
. The parameters β0 and β1 are estimated less precisely. This leads to larger p-values, and hence a linear relationship between X and Y may go undetected.
• An optimal analysis is obtained through a so-called weighted least squares analysis.
• Non-constant variance is often paired with non-normality. A solution for the non-normality problem very often also solves, on the side, the non-constant-variance problem.
11.6 Example: Length of stay versus ADL
• To check the assumption of constant residual variance for the regression model, employed to predict length of stay by means of the ADL score, 1 day post operation, we re-consider the residual scatter plot, already created to assess linearity:
• Apart from the outlier in the middle, there are no systematic trends in the variability of the residuals.
• We can therefore accept the assumption of constant residual variance.
11.7 The assumption of normality
• Given that the residuals ei = yi − ŷi are estimators for the theoretical deviations εi, it is natural to assess the assumption of normality via the residuals.
• In practice, one often uses a combination of two methods:
. Graphical: a histogram of residuals
. A formal test for normality
• Both techniques are illustrated by means of the blood pressure data in 54 women.
11.7.1 A histogram of residuals
• A simple graphical way to explore the distribution of the residuals is by means of a histogram, together with the normal distribution that most closely fits the histogram:
• From this histogram, it follows that:
. There is no evidence for asymmetry in the distribution of the residuals
. The distribution appears not to be too different from the normal distribution
• We conclude that there is no graphical evidence for non-normal errors εi
11.7.2 The normality test
• Most software packages allow for a formal normality test
• One tests the null hypothesis
H0 : the data are normally distributed
versus the alternative hypothesis
HA : the data are not normally distributed
• Various testing procedures are possible, all leading to a p-value, allowing us to either reject or accept the null hypothesis
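The formal tests are best left to statistical software. As an informal complement only (not one of the formal procedures), the symmetry of residuals can be quantified by the sample skewness; a rough Python sketch with simulated residuals:

```python
import math
import random

def sample_skewness(data):
    """Standardized third moment: near 0 for symmetric (normal-like) data."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))
    return sum(((x - m) / s) ** 3 for x in data) / n

random.seed(3)
normal_resid = [random.gauss(0, 1) for _ in range(500)]       # symmetric
skewed_resid = [random.expovariate(1.0) for _ in range(500)]  # right-skewed

print(round(sample_skewness(normal_resid), 2))  # close to 0
print(round(sample_skewness(skewed_resid), 2))  # clearly positive
```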
• Histogram with results of 3 normality tests:
• We obtain a histogram with the normal approximation, but also with the results of 3 test procedures for normality: Shapiro-Wilk, Kolmogorov-Smirnov, and Lilliefors. The first two are the more common ones.
• Based on each of the 3 procedures, the null hypothesis of normality would be accepted. We conclude that the residuals ei, and hence the errors εi, are normally distributed.
11.7.3 Histogram ←→ normality test
• The histogram is an exploration technique to study the distribution of the residuals.
• The normality test is a formal procedure, allowing us to test whether the assumption of normality is acceptable.
• In (very) large samples, rejection of normality based on a statistical testing procedure is rather likely: even the smallest deviations from normality will be detected.
• It is known that small deviations from normality will still lead to correct results, as long as the errors are symmetric.
• Hence, if non-normality is not due to asymmetry, then the results obtained will still be reliable.
11.8 Example: Length of stay versus ADL
• We consider again the regression of length of stay on the ADL score, 1 day post operation, for the hip fracture patients.
• The residuals are clearly non-normally distributed.
• From the histogram, it follows that non-normality is due to asymmetry.
• In case non-normality results from asymmetry, one can sometimes transform the y-values so as to make the residuals in the new regression normally distributed.
• Frequently used transformations are ln(Y), √Y, 1/Y, exp(Y), ln(Y + 1), . . .
• In our example, we have to transform the data (the y-values) such that the larger residuals approach the bulk of the residuals.
• A possible transformation is
Length of stay −→ ln(Length of stay)
• Note that all observed values of length of stay are positive, implying that a logarithmic transformation is mathematically allowed.
• Before interpreting the regression model output, we check whether the distribution of the new residuals is closer to a normal distribution:
• Hence, we can conclude that the errors in the new regression model are normally distributed.
• New regression output:
• The regression model is slightly improved, given that the R² value has increased from 0.0432 to 0.0670.
• The regression line is:
ln(Length of stay) = 2.23 + 0.02 × ADL
• Now, we do find a significant relationship: p = 0.0497, in contrast with p = 0.1173 prior to transformation.
• Note that the relationship derived is no longer linear.
• This example underscores the need to check normality of the errors, given that possible non-normality can strongly distort the results.
• The transformation of the y-values can, in turn, distort linearity and/or induce non-constant variance of the errors εi. It is therefore useful to construct, after transformation, a scatter plot of the y-values versus the residuals:
• Linearity and constant variability remain satisfied.
11.9 General Conclusion
• Carrying out a regression is easy
• Evaluating a regression model is difficult
11.10 Examples from the biomedical literature
• Bjork et al. [12], methodology section:
Data analyses
Descriptive statistics was used to describe the characteristics of the participants (age, gender, ADL dependence and cognitive impairment) and the prevalence of resident engagement in the everyday activities. Descriptive results of categorical data are presented as actual numbers, percentages and results of continuous data are presented as means and standard deviations and median. The distribution of quantitative variables was examined for normal distribution. Simple linear regression analyses were performed with the total score for thriving as the outcome variable. A total of 26
→ normality check of original variables !
• Kiekkas et al. [10]:
. Methodology section:
collected data, and statistical significance was set at p < 0.05. Kolmogorov-Smirnov test was used to check whether continuous variables (age, APACHE II score, mean daily PRN Rea score and ICU length of stay) were normally distributed. According to APACHE II values, patients were divided into six clinical severity groups (cutoff at every five points), and analysis of variance was performed to identify differences in nursing workload among patient groups. Dunnett’s test was used for comparing the lowest clinical severity group (control group, because the researchers supposed a positive correlation between clinical severity and nursing workload) to each of the other five groups. Simple linear regression was used to estimate the variability of mean daily PRN Rea score (and categories of PRN Rea) with respect to APACHE II score, within the entire patient population as well as within patient subgroups.
→ normality check of outcome rather than residuals !
. Table 2:
Table 2. Simple linear regression in all intensive care unit patients: mean daily Projet de Recherche en Nursing Rea and its categories as dependent variables and Acute Physiology and Chronic Health Evaluation II as explanatory variable

                      R²        a (M ± SE)      b (M ± SE)
PRN Rea               0.256*    121.1 ± 4.1*     1.9 ± 0.3*
Respiration           0.103*     21.5 ± 1.5*     0.4 ± 0.1*
Nutrition             0.003       2.6 ± 0.6*    −0.1 ± 0.1
Elimination           0.021       0.1 ± 0.2      0.1 ± 0.1
Hygiene               0.009      25.4 ± 0.7*    −0.1 ± 0.1
Mobilization          0.016      10.7 ± 0.7*     0.1 ± 0.1
Communication         0.029**     9.1 ± 1.0*    −0.1 ± 0.1**
Diagnostic methods    0.194*     30.5 ± 1.9*     0.7 ± 0.1*
Treatments            0.221*     21.5 ± 2.0*     0.9 ± 0.1*

R², determination coefficient; a, b, unstandardized regression coefficients (a: intercept, b: slope); M ± SE, mean ± standard error.
*p < 0.01. **p < 0.05.
Chapter 12
Influential observations
. Example
. Cook’s distance
. Application
. What to do with influential observations ?
. Example from biomedical literature
12.1 Example
• We consider again the regression of ln(Length of stay) on the ADL score, 1 day post operation:
• Patient #20 has an ADL score of 17, and is hospitalized during 36 days, which is exceptionally long in comparison with the other patients.
• For subject #20, the residual ei = yi − ŷi is, therefore, very large.
• Given that the parameters β0 and β1 are estimated via the least squares method, it is legitimate to investigate how strongly our results β̂0 and β̂1 are influenced by this individual.
• A subject is highly influential if deleting the subject leads to strongly differing results.
• Influential observations make interpreting the results more difficult, because the conclusions become sample-dependent: a different sample would have led to different results.
• To study a subject’s influence, we can compare β̂0 and β̂1 with and without the given subject.
• To illustrate the method, we consider subject #20 and investigate the effect of deleting this patient. We also investigate what the effect would have been, had the subject not had an ‘average’ ADL score, but rather a very large (24) or very small (10, 5, 0) ADL.
• Results for ADL= 17:
• Results for ADL= 24:
• Results for ADL= 10:
• Results for ADL= 5:
• Results for ADL= 0:
• Summary of the regression results:
With subject #20 Without subject #20
ADL Parameter Estimate (p-value) Estimate (p-value)
17 Intercept (β0) 2.233 (<0.001) 2.191 (<0.001)
Slope (β1) 0.022 (0.0497) 0.024 (0.0219)
24 Intercept (β0) 2.088 (<0.001) 2.191 (<0.001)
Slope (β1) 0.030 (0.0056) 0.024 (0.0219)
10 Intercept (β0) 2.420 (<0.001) 2.191 (<0.001)
Slope (β1) 0.012 (0.2801) 0.024 (0.0219)
5 Intercept (β0) 2.541 (<0.001) 2.191 (<0.001)
Slope (β1) 0.005 (0.6246) 0.024 (0.0219)
0 Intercept (β0) 2.636 (<0.001) 2.191 (<0.001)
Slope (β1) -0.0003 (0.9764) 0.024 (0.0219)
• In general, a subject is influential if the following two conditions are satisfied:
. The subject is an outlier, i.e., the value yi is exceptionally large or small, given its xi value.
. The subject is located at the edge of the X-space; in our example this means that a large or small ADL score (day 1) is observed.
12.2 Cook’s distance
• The detection of influential subjects requires the following steps:
. Carry out the regression on all subjects
. Step 1: leave out the first subject and compare the new results with those based on all data
. Step 2: leave out the second subject and compare the new results with those based on all data
. Step 3: leave out the third subject and compare the new results with those based on all data
. . . .
. Step n: leave out the last subject and compare the new results with those based on all data
• In each step, we have to compare the results obtained in the absence of a certain subject with those obtained based on all data.
• This can be done with Cook’s distance, which measures the ‘distance’ between the results with and without such an observation.
• Cook’s distance for the ith observation is denoted by Di.
• Influential subjects correspond to large Di.
• Non-influential subjects correspond to small Di.
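The leave-one-out scheme can be implemented directly; a minimal Python sketch on a hypothetical data set with one aberrant point at the edge of the x-range (the data and the scaling Di = Σj (ŷj − ŷj(i))² / (p · MSE), with p = 2, are this sketch's assumptions):

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    return ybar - b1 * xbar, b1

def cooks_distances(xs, ys):
    """D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * MSE), p = 2 in simple regression."""
    n, p = len(xs), 2
    b0, b1 = fit_line(xs, ys)
    yhat = [b0 + b1 * x for x in xs]
    mse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat)) / (n - p)
    ds = []
    for i in range(n):
        a0, a1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        yhat_i = [a0 + a1 * x for x in xs]  # refit predictions at ALL x-values
        ds.append(sum((u - v) ** 2 for u, v in zip(yhat, yhat_i)) / (p * mse))
    return ds

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 2.0]  # last point is aberrant
d = cooks_distances(xs, ys)
print([round(v, 2) for v in d])  # the last D_i dominates
```

The aberrant point is both an outlier and at the edge of the x-range, so its Di dwarfs the others, just as described above.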
12.3 Application
• We apply this to the regression of ln(Length of stay) on the ADL score, 1 day post operation.
• Most software packages allow calculation of Cook’s distance for all observations.
• Note that D20 is relatively large.
• In particular for large data sets, an index plot of Cook’s distances can be very handy, possibly upon explicitly constructing a variable with observation numbers:
• Apart from subject #20, we also find that subject #45 has a relatively large Di.
• It is therefore of interest to carry out the analysis with each of these observations removed in turn.
• The results with all observations, without observation #20, and without observation #45, respectively, are:
12.4 What to do with influential observations ?
• Does removing influential subjects lead to qualitatively different results?
• Are the data for influential subjects correct?
. Data-entry errors
. Mixing-up of patients case forms
. . . .
• Do influential subjects satisfy the inclusion/exclusion criteria of the study?
. Are these genuine hip fracture patients?
. Could there be an additional complication/comorbidity that could explain their influence?
. . . .
• When there are no objective criteria for omission, influential subjects ought to be kept in the study.
• Possibly, the least squares criterion can be replaced by a different criterion that is less sensitive to individual observations.
=⇒ Robust regression techniques
12.5 Example from the biomedical literature
Archbold et al. [13], Results section p. 172 and Figure 2:
Part IV
One-way analysis of variance
Chapter 13
The unpaired t-test
. Example
. The unpaired t-test
. Example
. Variability within versus between groups
13.1 Example
• We study the relationship between the ADL score, 1 day post operation, and the pre-operative neuro-psychiatric condition of the patient, i.e., we want to compare the average ADL score between neuro- and non-neuro patients.
• Descriptive statistics:
• Graphical representation:
• We note that, on average, the neuro patients exhibit a higher ADL score and hence are more dependent.
• How can we test whether this difference can be ascribed to chance? In other words, to what extent is this difference significant?
• Indeed, even if there were no difference between both neuro groups (in the population), we still might observe differences, purely due to chance, in the sample.
• Illustration:
→ Anova
13.2 The unpaired t-test
• We have two independent groups of patients, and hence two sets of ADL measurements:
. y11, y12, y13, . . . , y1n1: the measurements in the first group
. y21, y22, y23, . . . , y2n2: the measurements in the second group
• Both groups do not necessarily have the same number of observations: n1 and n2.
• The unpaired t-test assumes that the measurements in both groups are normally distributed with the same spread, but perhaps a different mean:
Y1j ∼ N(µ1, σ²)
Y2j ∼ N(µ2, σ²)
• Graphically:
[Figure: two normal densities with equal spread along the ADL axis, labeled ‘Non-neuro’ (mean µ1) and ‘Neuro’ (mean µ2)]
• The null hypothesis that we aim to test is
H0 : µ1 = µ2
versus the alternative hypothesis
HA : µ1 ≠ µ2
• The test statistic employed for this purpose is:
T = (y2· − y1·) / (sp √(1/n1 + 1/n2))
where y1· and y2· are the observed means in the first and second group, respectively:
y1· = (1/n1) Σ_{i=1..n1} y1i        y2· = (1/n2) Σ_{i=1..n2} y2i
and where s²p is the ‘pooled’ sample variance, an estimate for the common variance σ²:
s²p = [(n1 − 1)s²1 + (n2 − 1)s²2] / (n1 + n2 − 2)
which is a weighted average of the sample variances in both groups separately.
• Note that the test statistic T is a measure for the separation between the observed samples.
• In our example, the T-value is:
T = (20 − 17.15) / ( √[((40 − 1)·11.51 + (20 − 1)·9.37) / (40 + 20 − 2)] · √(1/40 + 1/20) ) = 3.16
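The hand calculation above can be verified with a short Python sketch (scipy is assumed to be available; reading 11.51 and 9.37 as the sample variances of the n = 40 and n = 20 groups is an assumption based on the formula shown):

```python
from math import sqrt
from scipy.stats import t as t_dist

# Summary statistics read off the slides (assumption: 11.51 and 9.37 are the
# sample variances of the n=40 and n=20 groups, respectively)
n1, mean1, var1 = 40, 17.15, 11.51
n2, mean2, var2 = 20, 20.0, 9.37

# Pooled sample variance: weighted average of the two group variances
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Unpaired t statistic
T = (mean2 - mean1) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))

# Two-sided p-value from the t distribution with n1 + n2 - 2 = 58 df
p = 2 * t_dist.sf(abs(T), n1 + n2 - 2)
```

This reproduces the T of about 3.16 and the two-sided p of about 0.002 reported in the slides (up to rounding of the intermediate values).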
• Under the null hypothesis, i.e., when µ1 = µ2, we expect T to be small.
• We are interested in knowing to what extent T = 3.16 can be obtained purely by chance.
• We calculate the probability to observe a T at least as large as the current value 3.16, in the case that both populations in reality have equal means, i.e., when µ1 = µ2.
• Illustration:
→ Hypothesis Test (two samples)
• It is clear that, if there are no differences between both populations, it would be very unlikely to observe T = 3.16 or T = −3.16.
• The probability to observe |T | ≥ 3.16 purely by chance is p = 0.002.
• Given that this probability is so small, more specifically p < α = 0.05 = 5%, we will conclude that what has been observed, T = 3.16, is sufficient indication to accept that µ1 ≠ µ2.
• We reject the null hypothesis and conclude that µ1 and µ2 are significantly different, at the 5% significance level.
• We reject the null hypothesis that the average ADL score is equal between the neuro and non-neuro patients.
• Note that the calculation of the p-values makes use of the assumptions:
. Normality within both groups
. Common variance between both groups
• Checking these assumptions can be done in exactly the same way as with 1-way ANOVA, and will therefore be explained when model diagnostics for ANOVA are discussed.
13.3 Example
• A typical unpaired t-test output:
• The unpaired t-test assumes that the variance is the same in both groups. This assumption is tested automatically. If the hypothesis of equal variances is rejected, then an appropriately corrected t-test can be carried out.
• The hypothesis of equal variances is acceptable (p = 0.642).
13.4 Variability within versus between groups
• The unpaired t-test rejects H0 if |T | is large, which is equivalent to
T² = (y2· − y1·)² / [s²p (1/n1 + 1/n2)]
being large.
• The numerator of T² measures separation between the group averages, and is a measure for the variability between both groups.
• The denominator of T² contains s²p, which is an estimator for σ², and hence is a measure for the variability within groups.
• Hence, the unpaired t-test rejects the null hypothesis if the variability between groups is large enough, as compared with the variability within groups.
• This principle is applied in ANOVA to compare more than two groups.
Chapter 14
1-way ANOVA
. Example
. Pairwise t-tests
. 1-way ANOVA
. Example
. Model diagnostics
. Influential observations
. Examples from the biomedical literature
14.1 Example
• Because we expect that the ADL score post operation is not only influenced by operation-specific factors, but also by, for example, how dependent the patient was prior to the operation, we study the relationship between the ADL score and the patient's living condition prior to operation.
• We distinguish between the following classes:
. Single
. With partner / family / religious community
. RH/RVT (Retirement-Home / Retirement and Care Home)
. Other
• Descriptive statistics and graphical exploration:
• The fourth group contains only 1 subject, and will not be included for analysis.
• From the graph, it appears that the average ADL score in RH/RVT patients is higher than in the other two groups. Is this difference significant?
• Even if the three populations would have the same mean, it would still be possible to observe differences in the sample, purely by chance.
• How large is the probability that we observe this type of difference?
• Illustration:
→ Anova
14.2 Pairwise t-tests
• In analogy with the unpaired t-test, we assume that we now have r different sets of measurements (in the example, r = 3):
. y11, y12, y13, . . . , y1n1 the measurements in the first group
. y21, y22, y23, . . . , y2n2 the measurements in the second group
. . . .
. yr1, yr2, yr3, . . . , yrnr the measurements in the rth group
• Further, we assume that the measurements are sampled from the following distributions:
Y1j ∼ N(µ1, σ²), Y2j ∼ N(µ2, σ²), . . . , Yrj ∼ N(µr, σ²)
• The null hypothesis that we want to test is
H0 : µ1 = µ2 = . . . = µr
versus the alternative hypothesis
HA : not all µi equal
• When the above null hypothesis is not satisfied, then at least two of the means µi must be different. Therefore, we can, in principle, use unpaired t-tests. For r = 3, this would mean that we test the following hypotheses:
H0 : µ1 = µ2
H0 : µ1 = µ3
H0 : µ2 = µ3
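These three pairwise tests can be sketched in a few lines of Python. The data below are hypothetical stand-ins (the real ADL scores are not reproduced here), so the resulting p-values only illustrate the mechanics:

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
# Hypothetical ADL-like scores for the three living-condition groups
groups = {
    "Single": rng.normal(17.0, 3.0, 15),
    "Partner/family/relig.": rng.normal(17.0, 3.0, 25),
    "RH/RVT": rng.normal(21.0, 3.0, 14),
}

# One unpaired t-test (equal variances assumed, as in the slides) per pair
pairwise_p = {}
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    _, p = ttest_ind(a, b, equal_var=True)
    pairwise_p[(name_a, name_b)] = p
```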
• For our example, we obtain the following p-values:
Single Partner/family/relig. RH/RVT
Single — 0.8763 0.0013
Partner/family/relig. 0.8763 — <0.0001
RH/RVT 0.0013 <0.0001 —
• Hence, we only find significant differences between the RH/RVT patients on the one hand and the other two groups on the other hand.
• Note that, for each test conducted, there is a 5% chance of a type-I error (incorrectly rejecting H0).
• It can be shown that, for our example, the total probability for a type-I error satisfies:
P(H0 rejected | H0) = P(at least 1 significance | µ1 = µ2 = µ3) ≤ 3 × 5% = 15%
so that the chance for a type-I error is larger than the requested 5%.
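The bound above is the union (Bonferroni) bound; under the additional assumption of independent tests, the family-wise error rate can even be computed exactly. A minimal sketch:

```python
def familywise_error(k, alpha=0.05):
    """Probability of at least one type-I error among k tests at level alpha."""
    exact_independent = 1 - (1 - alpha) ** k  # exact only if the tests are independent
    union_bound = k * alpha                   # always an upper bound
    return exact_independent, union_bound

exact, bound = familywise_error(3)
# for k = 3 tests at the 5% level: about 14.3% under independence, bounded by 15%
```

(The three pairwise t-tests share data and are therefore not independent, which is why the slides use the inequality ≤ 3 × 5%.)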
• In general, when conducting k tests, the total probability for a type-I error can increase to k × α, and hence become large when the number of tests conducted is large.
• It is therefore necessary to have at our disposal a test statistic that allows us to test the null hypothesis
H0 : µ1 = µ2 = . . . = µr
without having to conduct all pairwise t-tests.
=⇒ ANOVA
14.3 1-way ANOVA
• ANOVA (Analysis of variance) is an extension of the unpaired t-test to the comparison of more than 2 groups.
• Like with the t-test, the test procedure will compare the variability between groups with the variability within groups.
• The following equality plays a central role:
Σ_{i=1..r} Σ_{j=1..ni} [yij − y··]²  =  Σ_{i=1..r} Σ_{j=1..ni} [yij − yi·]²  +  Σ_{i=1..r} ni [yi· − y··]²
          SSTO                      =              SSwithin                  +           SSbetween
[Figure: for groups 1, . . . , r, each observation yij is shown with its group mean yi· and the global mean y··, illustrating the deviations y1j − y1·, y1· − y··, and y1j − y··]
. y·· : global mean (all groups together)
. yi· : mean in the ith group
. yij : jth measurement in the ith group
• SSTO: total sum of squares. This term expresses the total variability in the data.
• SSwithin: within-group sum of squares. This term expresses the variability within the groups.
• SSbetween: between-group sum of squares. This term expresses the variability between the groups.
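The decomposition SSTO = SSwithin + SSbetween can be checked numerically on any data set; a sketch with hypothetical group data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical measurements for r = 3 groups of unequal size
groups = [rng.normal(m, 2.0, n) for m, n in [(17.0, 20), (17.5, 25), (21.0, 9)]]

y_all = np.concatenate(groups)
grand_mean = y_all.mean()

# Total, within-group, and between-group sums of squares
ss_total = ((y_all - grand_mean) ** 2).sum()
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# ss_total equals ss_within + ss_between (up to floating-point rounding)
```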
• In ANOVA, the null hypothesis is rejected if
F = [SSbetween/(r − 1)] / [SSwithin/(N − r)]
is large, where N = Σi ni is the total sample size.
• Note that F is the ratio of the variability between groups over the variability within groups, which is entirely analogous to the unpaired t-test. This motivates the terminology ‘ANOVA.’
• In our example, F = 8.59
• Under the null hypothesis, F is expected to be small.
• We wish to know to what extent F = 8.59 can be obtained purely by chance.
• We calculate the probability to observe an F at least as large as 8.59, in the case that all populations truly have equal means, i.e., when µ1 = µ2 = µ3.
• Illustration:
→ Histograms on ANOVA
• Clearly, when there is no difference between the three populations, then it is very unlikely to observe F = 8.59. More specifically, the chance to observe F ≥ 8.59 purely by chance is p = 0.0006.
• Given this chance is so small, more specifically p < α = 0.05 = 5%, we conclude that the observed value (F = 8.59) is sufficient indication to conclude that µ1, µ2, and µ3 are not all equal.
• We reject the null hypothesis and conclude that the three groups are significantly different at the 5% significance level.
• Note that the calculation of the p-values makes use of the assumptions made:
. Normality within all groups
. Equal variance for all groups
• Exactly like with linear regression, these assumptions need to be checked (see further).
14.4 Example
• ANOVA table with global F -test:
• The ‘SS MODEL’ is the SSbetween.
• The ‘SS Residual’ is the SSwithin.
• In the F statistic, SSbetween and SSwithin need to be divided by r − 1 = 3 − 1 = 2 and N − r = 54 − 3 = 51, respectively.
• These quantities are called the numbers of degrees of freedom (df) for SSbetween and SSwithin.
• The F statistic is
F = [SSbetween/(r − 1)] / [SSwithin/(N − r)] = (168.60/2) / (500.23/51) = 8.59
• The corresponding p-value is p = 0.0006, which points to significant differences between the three groups, as far as the average ADL on day 1 is concerned.
• As with regression, one can compute an R² statistic, indicating which portion of the variability in the ADL scores can be explained by the differences in living conditions (= variability between groups):
R² = SSbetween / SSTO = 168.60 / (168.60 + 500.23) = 0.252
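The table arithmetic can be reproduced from the two sums of squares alone (scipy is assumed to be available for the F tail probability):

```python
from scipy.stats import f as f_dist

# Sums of squares and degrees of freedom from the ANOVA table in the slides
ss_between, ss_within = 168.60, 500.23
df_between, df_within = 3 - 1, 54 - 3

F = (ss_between / df_between) / (ss_within / df_within)
p = f_dist.sf(F, df_between, df_within)  # P(F >= observed) under H0
r_squared = ss_between / (ss_between + ss_within)
# F is about 8.59, p about 0.0006, and R-squared about 0.25, as in the slides
```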
14.5 Model diagnostics
• With ANOVA, one implicitly assumes that the data are sampled from the following populations:
Y1j ∼ N(µ1, σ²), Y2j ∼ N(µ2, σ²), . . . , Yrj ∼ N(µr, σ²)
• Hence, we assume that . . .
. . . . constant variance: within every group the spread is equally large
. . . . normality: within each group the data are normally distributed
• When the assumptions are not satisfied, as with linear regression, erroneous statistical results can follow (p-values, confidence intervals, . . . ).
• How can the above assumptions be verified?
14.5.1 Assumption of constant variance
• Descriptive statistics and graphical exploration:
• Is there too much difference in the variance so as to doubt the assumption of equal variance?
• In other words, to what extent can the observed differences in variance be ascribed to chance?
• Statistica allows for a formal equal-variance test. The null hypothesis then is
H0 : σ²1 = σ²2 = . . . = σ²r
versus the alternative hypothesis
HA : not all σ²i equal
• A number of statistical tests are available, one of the most commonly used being Levene's test:
• Hence, we observe that the variances among the three groups are not significantlydifferent (p = 0.0808).
• When there are many groups, or when some groups contain (very) many observations, then small differences can be found to be significant by the formal testing procedure.
• At the same time, it is known that variances that are not too different pose little or no problem (→ analogy with linear regression).
• Therefore, one employs, next to the formal test for equal variances, also a rule of thumb, stating that variances should not differ by more than a factor of 5, to avoid adversely affecting the results.
• In our example, this is:
3.77² / 1.82² = 4.29
• In practice, one uses the formal test, combined with the rule of thumb, so as to assess whether the assumption of equal variance is satisfied.
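Both checks fit in a few lines of Python. Only the two standard deviations (3.77 and 1.82, read off the example) are taken from the slides; the raw data fed to Levene's test are hypothetical:

```python
import numpy as np
from scipy.stats import levene

# Rule of thumb: largest and smallest group variances should differ
# by no more than a factor of 5
sd_max, sd_min = 3.77, 1.82
variance_ratio = sd_max**2 / sd_min**2  # about 4.29, below the factor-5 limit

# Formal test on (hypothetical) raw data: Levene's test for equal variances
rng = np.random.default_rng(2)
g1, g2, g3 = (rng.normal(0.0, s, 18) for s in (3.77, 2.40, 1.82))
stat, p = levene(g1, g2, g3)
```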
14.5.2 Assumption of normality
• ANOVA assumes that the observations in every group are normally distributed, with common variance.
• Assuming common variance, how can normality be tested?
• We rewrite the ANOVA model as
Y1j = µ1 + ε1j
Y2j = µ2 + ε2j
. . .
Yrj = µr + εrj
where the ‘error terms’ εij all come from the same normal distribution with mean zero and variance σ².
• The distribution of the error terms εij can be examined after subtracting the population averages µ1, . . . , µr, which comes down to collapsing the population-specific distributions:
[Figure: the group-specific distributions N(µ1, σ²), . . . , N(µr, σ²) of the Yij are shifted by −µ1, . . . , −µr, so that they collapse onto a single N(0, σ²) distribution for the error terms εij]
• As with regression, we will check the assumption of normality for the εij via their estimates
eij = yij − µ̂i = yij − yi·
• As with regression, the eij are termed residuals: they represent the error made when the observed value yij for an individual in group i would be predicted by the group average yi·.
• Once the residuals eij have been computed, we can assess normality using their histograms, or using formal normality tests.
• This is performed in full analogy with linear regression.
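Computing the residuals and feeding them to a formal normality test takes only a few lines (hypothetical data; Shapiro-Wilk is used here as one common choice of formal test, not necessarily the one in the slides):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(3)
# Hypothetical measurements for three groups
groups = [rng.normal(m, 2.5, n) for m, n in [(17.0, 20), (17.5, 25), (21.0, 9)]]

# ANOVA residuals: each observation minus its own group mean,
# pooled over the groups
residuals = np.concatenate([g - g.mean() for g in groups])

# Formal normality test on the pooled residuals
stat, p_normality = shapiro(residuals)
```

Within each group the residuals sum to zero by construction, which is why normality is assessed on the pooled residuals rather than group by group.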
• Output:
• Hence, we can conclude that the assumption of normality is acceptable.
• Exactly as with simple regression, we have that:
. Departures from normality still lead to correct results, as long as the distribution of the errors is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so as to render the residuals in the new model normally distributed.
. However, some transformations can disrupt the constant variance, implying that this needs to be assessed again after transformation.
14.6 Influential observations
• In spite of the fact that, with ANOVA, we strictly speaking do not have regression parameters, individual observations can still have a large influence on the estimation of the group averages µi, and hence ultimately on the ANOVA results.
• Statistica allows us, exactly as with regression, to measure the influence of each observation by comparing the estimates µ̂i = yi· with those that would be obtained upon deletion of that observation.
• This results, again, in the so-called ‘Cook's distance,’ a distance between the estimates with and without a given observation.
• Exactly as with regression, we consider a scatter plot of Cook's distances versus the subject number.
• The computations are done in analogy with simple linear regression.
• Output:
• Hence, there are no observations with an unduly large influence.
14.7 Examples from the biomedical literature
• van Hooft et al. [14]:
. Analysis section p. 68:
2.4. Analysis
Descriptive data were generated for all variables. Statistical
analyses were performed using SPSS21 (SPSS Inc., Chicago, Il, USA).
Level of significance was set at p-value p < 0.05. Prior to analysis
the data was screened for repetitive response patterns (>10% of the
answers the same on the SEPPS-36; n = 5), and missing subscale
scores (>10% of the items of the subscale). The data of the
dependent variables were checked for normal distribution.
To determine self-efficacy and behavior, sum scores were
Normality check for response −→ residuals ?
. Analysis section p. 68:
Hypothesis 2. The preferred attitude.
One-way ANOVA variance analysis with a Bonferoni post hoc
test was performed to measure associations between the
descriptions of the attitude towards self-management support
and the sum scores on behavior.
∗ One-way ANOVA to compare 4 groups
∗ Bonferroni correction for pairwise comparisons
. Results section p. 69:
The most preferred attitude towards self-management support
was the coach attitude (38.0%; n = 132). Next came the educator
(32.6%; n = 113), the clinician (15.6%; n = 54), and the gatekeeper
(13.8%; n = 48) attitudes. Analysis of variance showed no significant
difference in the sum scores of behavior between the different
attitudes, implying that the preferred attitude (coach, educator,
clinician, or gatekeeper) was not significantly associated with
nurses’ self-management support behavior (hypothesis 2).
• Huang et al. [15]:
. Statistical analysis section p. 5:
Statistical analysis
Results are presented as median (IQR: interquartile range) or mean ± standard deviation. Dif-
ferences between Group N, Group M, and Group S were tested by one-way ANOVA or the
Kruskal–Wallis test (Table 1). Tukey’s post hoc tests were then performed to find significant
differences between groups (Table 1). Correlations were determined using Pearson’s correla-
∗ One-way ANOVA to compare 3 groups
∗ Tukey’s post-hoc test for pairwise comparisons
∗ Kruskal-Wallis is a non-parametric rank-sum alternative, in case the assumptions are not satisfied
. Table 1, p. 5:
Table 1. Characteristics of the participants by hot flash profiles.
Parameters Hot flash status P Value
None Mild to moderate Severe
n 52 47 52
Age, years 55 (51.5, 58.0) 53 (51, 56) 53 (51.0,55.5) .074†
MP_duration, years 4.0 (2.0, 5.0) 2.0 (1.7, 4.0) 2.5 (2.0, 5.0) .054†
SBP, mmHg 110 (98, 115) 112 (96, 116) 114 (102, 120) .065†
DBP, mmHg 71 (64, 75) 73 (64, 75) 72 (66, 75) .592†
BMI, kg/m2 22.8 ± 2.7 22.6 ± 2.6 23.5 ± 2.4 .175‡
FSH, mIU/mL 69 ± 24 67 ± 25 65 ± 21 .316‡
Estradiol, pg/mL 20 20 20 —
Fasting glucose, mg/dL 92 ± 9 95 ± 7 99 ± 7* .0001‡
Hemoglobin A1c, % 5.5 (5.2, 5.6) 5.4 (5.2, 5.7) 5.6 (5.25, 5.8) .189†
Total cholesterol, mg/dL 209 (191, 228) 202 (173, 223) 201.5 (182, 236) .291†
Triglyceride, mg/dL 110 (84, 144) 97 (72, 144) 126 (81, 192) .090†
HDL cholesterol, mg/dL 56 (46, 68) 52 (46, 58) 52 (44, 62) .345†
LDL cholesterol, mg/dL 126 (104, 145) 123 (101, 139) 125 (103, 142) .662†
Insulin, pg/ml 339 (261, 469) 394 (256, 509) 515 (388, 732)* .0001†
Leptin, ng/mL 9.2 (4.9, 11.5) 10.2 (7.4, 15.6) 16.2 (10.3, 20.6)* .0001†
Adiponectin, ug/mL 14.9 (11.3, 21.9) 14.7 (8.3, 29.0) 8.1 (6.3, 11.9)* .0001†
Leptin to Adiponetin ratio 0.63 (0.25, 1.09) 0.82 (0.36, 1.63) 1.73 (1.08, 2.84)* .0001†
Resistin, ng/mL 16.2 (12.6, 22.8) 15.8 (11.6, 21.1) 13.5 (11.3, 18.4) .095†
HOMA-IR 1.20 (0.90, 1.84) 1.56 (1.00, 2.11) 2.13 (1.64, 2.97)* .0001†
Data are presented as mean ± SD or median (Q1, Q3). Statistical analysis was conducted by ANOVA test (marked with ‡) or Kruskal-Wallis test (marked with †) to compare the mean/median differences between three groups of postmenopausal women with or without hot flashes. Tukey's post hoc tests were then performed to find significant differences between groups.
*, significant difference between Group S and Group M (p < 0.05) and Group S and Group N (p < 0.001).
Abbreviations: Q, quarter; Q1, 25th percentile; Q3, 75th percentile; MP_duration, menopause period since final menstrual period; SBP, systolic blood
pressure; DBP, Diastolic blood pressure; FSH, follicle stimulating hormone; BMI, body mass index; HDL, high density lipoprotein; LDL, low density
lipoprotein; HOMA-IR, homeostatic model assessment of insulin resistance.
. Table 2, p. 6:
Table 2. Association of hot flash status with adipocyte-derived hormones and HOMA-IR.
Variables Leptin Adiponectin Resistin Leptin/Adiponectin Ratio HOMA-IR index
Hot flashes
None
Mild to moderate 25.79 -11.36 -6.67 51.03 12.04
(4.54,51.36)a (-30.13,12.44) (-20.56,9.65) (1.89,123.87)a (-7.15,35.19)
Severe 53.18 -56.46 -16.86 140.29 53.89
(29.78,80.79)c (-71.33,-33.86)c (-29.76,-1.59) (70,239.64)c (28.16,84.79)c
Data are expressed as the percentage difference (95% CI). Adipocyte-derived hormones and leptin/adiponectin ratio were log-transformed.
Regression coefficients were back-transformed using formula (100*(exp(β)-1)) to calculate the percentage difference and the 95% CI in each adipocyte-
derived hormone for hot-flash group relative to non-hot flash group.
∗ One-way ANOVA to compare 3 groups
∗ Logarithmic transformation of response
∗ Group differences versus reference group ‘None,’ but back-transformed to original scale:
β = µ2 − µ1 =⇒ exp(β) − 1 = exp(µ2)/exp(µ1) − 1 = [exp(µ2) − exp(µ1)] / exp(µ1)
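The back-transformation in this last step is plain arithmetic; a minimal sketch (the value β = log 1.5 is an illustrative assumption, not taken from the paper):

```python
from math import exp, log

def percent_difference(beta):
    """Back-transform a log-scale group effect beta to a percentage
    difference relative to the reference group: 100 * (exp(beta) - 1)."""
    return 100.0 * (exp(beta) - 1.0)

# If the mean log response is higher by beta = log(1.5), the group mean on
# the original scale is about 50% above the reference group
effect = percent_difference(log(1.5))
```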
Part V
Multiple linear regression
Chapter 15
Multiple linear regression
. Example
. Regression model
. Application
. Interpretation
. Graphical interpretation
. Model diagnostics
. Influential observations
. Example from the biomedical literature
15.1 Example
• It has been shown that the relationship between ADL score and MMSE score, 1 day post operation, is significant.
• Hence, there is a strongly significant relationship between a patient’s cognitive statusand his/her dependence.
• Likewise, we expect the ADL score to be age-dependent.
• At the same time, there may be a relationship between MMSE score and age.
• Let us study these relationships with 3 simple regressions.
• Regression of ADL on MMSE:
• Regression of ADL on age:
• Regression of MMSE on age:
• Our findings are as follows:
. The dependence is stronger with lower cognitive status.
. The dependence is stronger with increasing age.
. The cognitive status is lower with increasing age.
• It is possible that the relationship found between ADL and MMSE is purely an effect of age, i.e., it would be possible that a better cognitive status corresponds to a lower dependence, because these patients tend to be younger.
• Hence, a simple regression is not sufficient to capture the complex rlationship betweenADL on the one hand and age and MMSE on the other hand.
=⇒ multiple (linear) regression
15.2 The multiple linear regression model
• We want to determine how the ADL score, 1 day post operation, is influenced by the MMSE score and age, simultaneously.
• Graphically, this relationship can be captured in a 3D scatter plot.
• Often, rotation of the plot is needed in order to get a clear view on the relation between the three variables plotted.
• Output (after rotation):
• A possible way to relate ADL simultaneously with MMSE and age is to extend the regression model:
ADLi = β0 + β1 MMSEi + εi
yi = β0 + β1 xi + εi
used for the regression of ADL on MMSE, to:
ADLi = β0 + β1 MMSEi + β2 Agei + εi
yi = β0 + β1 x1i + β2 x2i + εi
by which we explicitly indicate that ADL depends not only on MMSE, but possibly also on age.
• ADL is termed the dependent variable (response), while MMSE and age are the independent variables (covariates).
• The above equation describes a plane in 3D space, the so-called regression plane (two different rotations):
• As with simple linear regression, the parameters β0, β1, and β2 need to be estimated based on a sample.
• This can be done using the least squares method, which searches for the estimators β0, β1, and β2, for which the predicted ADL scores,
ÂDLi = β0 + β1 MMSEi + β2 Agei
are as close as possible to the original measurements.
• This comes down to minimizing
∑i [ADLi − ÂDLi]²
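The least squares step can be replayed numerically. A minimal sketch, not the course software or data: hypothetical (MMSE, age, ADL) triples are simulated from a known plane (the coefficients 20, −0.3, and 0.05 are invented for the illustration) and then recovered.

```python
import numpy as np

# Sketch with made-up data: simulate ADL from a known plane and recover
# the coefficients. The true values (20, -0.3, 0.05) are assumptions.
rng = np.random.default_rng(0)
n = 200
mmse = rng.uniform(0, 30, n)
age = rng.uniform(60, 95, n)
adl = 20 - 0.3 * mmse + 0.05 * age + rng.normal(0, 1, n)

# Design matrix with an intercept column; lstsq minimizes the sum of
# squared differences between observed and predicted ADL scores.
X = np.column_stack([np.ones(n), mmse, age])
beta, *_ = np.linalg.lstsq(X, adl, rcond=None)
print(beta)  # close to [20, -0.3, 0.05]
```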
• As with simple regression, it is assumed that the errors εi are normally distributed with mean 0 and variance σ².
• When the above assumptions are satisfied, significance can be tested for the regression parameters β0, β1, and β2.
• Furthermore, in analogy with simple regression, one can construct an ANOVA table based on the equality:
∑i [yi − ȳ]²  =  ∑i [yi − ŷi]²  +  ∑i [ŷi − ȳ]²
    SSTO             SSE              SSR
• SSTO: Total sum of squares. This term captures the total error made by predicting the yi without taking into account the observed values x1i and x2i for the covariates X1 and X2.
• SSE: Error sum of squares. This term captures the error made upon predicting the yi by making use of the observations x1i and x2i.
• SSR: Regression sum of squares. This term captures the decrease in error by predicting the values yi with rather than without making use of the covariates.
• A measure of the regression’s “quality” is
R² = SSR / SSTO
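The decomposition and the definition of R² can be checked numerically; the sketch below uses simulated data (any least squares fit with an intercept satisfies the identity).

```python
import numpy as np

# Numerical check, on simulated data, that SSTO = SSE + SSR for a least
# squares fit with an intercept, and that R^2 = SSR/SSTO lies in [0, 1].
rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=(2, n))
y = 1 + 2 * x1 - x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta

ssto = np.sum((y - y.mean()) ** 2)    # total sum of squares
sse = np.sum((y - yhat) ** 2)         # error sum of squares
ssr = np.sum((yhat - y.mean()) ** 2)  # regression sum of squares
print(ssr / ssto)  # R^2
```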
• Like with simple regression, R² enjoys the following properties:
. 0 ≤ R² ≤ 1
. R² = 0 implies that SSR = 0 and thus that all ŷi are equal to ȳ, i.e., the regression plane is horizontal. This is equivalent with β1 = β2 = 0.
. R² = 1 implies that SSE = 0. This implies that ŷi = yi for all i, and hence that all observations lie in the regression plane.
• It is said that R² expresses ‘which fraction of variability in the response (ADL) can be explained by the covariates’ (MMSE and age).
• With simple regression, we found that R² equals r², the square of the correlation between xi and yi. Hence R² can be seen as a generalization of the correlation coefficient to a ‘correlation’ between one variable on the one hand (the response) and multiple variables on the other hand (the covariates).
• If R² = 0, then the covariates X1 and X2 do not help us in predicting the response, which is equivalent to β1 = β2 = 0. In practice, it is therefore important to assess whether the covariates help us in predicting the response. This can be done by testing the null hypothesis
H0 : β1 = β2 = 0
versus the alternative
HA : β1 ≠ 0 or β2 ≠ 0
• In most software packages, the above hypothesis is tested automatically with every regression. This is done by way of an F test.
• Everything discussed in the context of the regression with two covariates can be extended to several covariates, where a given response is to be predicted from a set of covariates.
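A minimal sketch of how the F statistic behind this global test is formed, on simulated data (with p the number of covariates, F = (SSR/p) / (SSE/(n − p − 1)); large values lead to rejection of H0):

```python
import numpy as np

# Global F test sketch for H0: beta1 = beta2 = 0 with p = 2 covariates.
# The data are simulated for illustration only.
rng = np.random.default_rng(2)
n, p = 100, 2
x1, x2 = rng.normal(size=(2, n))
y = 0.5 + 1.5 * x1 + 0.8 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
ssr = np.sum((yhat - y.mean()) ** 2)
sse = np.sum((y - yhat) ** 2)
f_stat = (ssr / p) / (sse / (n - p - 1))
print(f_stat)  # far above the 5% critical value of F(2, 97), about 3.1
```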
15.3 Application
• The regression output obtained when regressing ADL on MMSE and age is:
• In the ANOVA table, we find SSTO, SSR, and SSE.
• The global F test in the ANOVA table tests whether the covariates improve the prediction of ADL in a significant way, i.e., it tests the hypothesis
H0 : β1 = β2 = 0
versus the alternative
HA : β1 ≠ 0 or β2 ≠ 0
• Given the strong significance (p < 0.0001), we conclude that at least β1 or β2 significantly differs from zero.
• Further, R² = 0.4946. Note that the regression of ADL on MMSE yielded R² = 0.4940.
• Hence, we see that age explains little extra variability in ADL, over and above what was already explained by MMSE.
• This suggests that, once we know the MMSE score, the patient’s age provides little extra information for the prediction of the ADL score, one day after the operation.
• The least squares estimates are
. β0 = 22.55
. β1 = −0.29
. β2 = 0.01
• Note that these values are different from what would follow from two single regressions:
Covariates
MMSE and Age MMSE Age
β0 22.55 23.65 5.93
β1 -0.29 -0.30 —
β2 0.01 — 0.15
• This suggests that the parameters change meaning, compared to those in single regression.
• Note that age in the above regression model is no longer significant (p = 0.7963), which strongly contrasts with the significant univariate regression of ADL on Age (p = 0.0053). This underscores that the results from a multiple regression are to be interpreted differently from their single-regression counterparts.
15.4 Interpretation
• Our regression of ADL on MMSE and age yielded the following regression equation:
ADL = 22.55 − 0.29 MMSE + 0.01 Age
• The estimator β1 = −0.29 can be interpreted as follows:
. Take two groups of subjects of the same age (e.g., 80 years), of which the first one has MMSE=20 and the second one MMSE=21.
. Their expected ADL scores then are
ÂDL1 = 22.55 − 0.29 × 20 + 0.01 × 80
ÂDL2 = 22.55 − 0.29 × 21 + 0.01 × 80
. The difference then is
ÂDL2 − ÂDL1 = −0.29 × (21 − 20) = −0.29
. Hence, we find that, for patients of a given age, the ADL score decreases on average with 0.29, for a unit increase of MMSE.
. Note that the effect would be the same if patients of a different age (e.g., 70 years) were taken.
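The computation above can be replayed in a few lines, using the estimated equation from the slides; the difference between the two predictions equals β1, whatever common age is chosen.

```python
# Replaying the interpretation of beta1 with the estimated equation
# ADL = 22.55 - 0.29*MMSE + 0.01*Age.
def predicted_adl(mmse, age):
    return 22.55 - 0.29 * mmse + 0.01 * age

diff_80 = predicted_adl(21, 80) - predicted_adl(20, 80)
diff_70 = predicted_adl(21, 70) - predicted_adl(20, 70)
print(round(diff_80, 2), round(diff_70, 2))  # -0.29 -0.29
```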
• Note that we cannot conclude that a unit increase of MMSE for a given patient will lead to a decrease of ADL with 0.29.
• We can only assert that, for patients of a given age, a unit difference in MMSE corresponds to an average difference in ADL of 0.29.
• We cannot draw ‘longitudinal’ conclusions from our ‘cross-sectional’ experiment.
• The estimator β1 indicates how the average ADL score varies with MMSE, for patients with the same age.
• In the regression plane, this corresponds to lines for constant age:
• Similarly, we can interpret the estimator β2 = 0.01 as the average increase of ADL per unit increase of age, for patients with the same MMSE score. In the regression plane, this corresponds to lines for constant MMSE:
• Note that these lines are almost flat, suggesting that, for patients with the same MMSE, the average ADL score is virtually without age influence.
• This explains why Age is no longer significant in the multiple regression model (p = 0.7963): Age contributes only little additional information for the prediction of ADL, whenever the MMSE score is already known.
• We find this also in the fact that adding Age to the regression model of ADL on MMSE leads to only a marginal increase in R², from 0.4940 to 0.4946.
• In practice, non-significant terms in the regression model are usually deleted, because they do not provide additional quality of prediction. In our example, this would correspond to the omission of the variable Age. The final model would then only contain ‘MMSE score on day 1.’
• Note that the p-value gives an indication of the need for one particular covariate, in addition to the ones already in the model.
• It is therefore not a good practice to remove non-significant covariates simultaneously from the model.
• Rather, the deletion of non-significant covariates has to be conducted step by step.
15.5 Graphical interpretation
• A response Y and two covariates X1 and X2 contain some information about the population of interest:

[Venn diagram: Y, X1, and X2 each shown as a circle of information about the population]

• Obviously, Y and X1 have some information about the population in common. Likewise, Y and X2 have some information about the population in common.
• A simple linear regression analysis of Y on X1 (X2) quantifies the information X1 (X2) contains about Y :

[Venn diagrams: overlap of Y with X1, and of Y with X2]

H0 : β1 = 0, p-value 0.0023        H0 : β2 = 0, p-value 0.0087
• In multiple regression, one outcome Y and multiple covariates, e.g., X1 and X2, are studied simultaneously, all containing some information about the population:

[Venn diagram: Y, X1, and X2 as overlapping circles]
• A multiple linear regression analysis of Y on X1 and X2 quantifies the information X1 and X2 jointly contain about Y :

[Venn diagram: joint overlap of X1 and X2 with Y]

H0 : β1 = β2 = 0, p-value 0.0004
• In multiple regression, the effect of an individual covariate quantifies the information it contains about Y , not already incorporated in the other covariates:

[Venn diagrams: the information X1 (X2) adds about Y , given the other covariate]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0023             0.0374
β2 = 0    0.0087             0.0187
• Multiple regression with one significant and one non-significant covariate, while both are highly significant in simple regressions:

[Venn diagram]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0002             0.0138
β2 = 0    0.0087             0.9724
• Multiple regression with equal estimates but different p-values in multiple and simple analyses:

[Venn diagram: X1 and X2 without overlap]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0139             0.0038
β2 = 0    0.0255             0.0049

• This occurs if X1 and X2 do not contain information about each other, hence are independent.
• Multiple regression with two non-significant covariates in multiple regression, while both are highly significant in simple regression:

[Venn diagram: X1 and X2 with strong overlap]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0002             0.9625
β2 = 0    0.0004             0.8259

• This occurs if X1 and X2 contain much information about each other, hence are highly dependent.
15.6 Model diagnostics
• The general multiple linear regression model with p covariates takes the form
yi = β0 + β1x1i + . . . + βpxpi + εi
where the errors εi are assumed to be zero-mean normally distributed with variance σ2.
• Our assumptions:
. Linearity: The average Y value is well described by
β0 + β1x1i + . . . + βpxpi
and the errors εi have mean zero.
. The variance of the errors is constant.
. The errors εi are normally distributed.
• All significance tests are based on the above assumptions, i.e., their failure to hold can lead to erroneous results. Hence, it is important to check them.
• In our example, we assumed that the average ADL score could be well described by
β0 + β1 MMSE + β2 Age
and that the errors εi are normally distributed with mean zero and constant variance σ².
• Verifying these assumptions is more complex than with simple regression, because, as already discussed, the relationship between ADL and, for example, Age, is also influenced by the second covariate, MMSE, in the model.
• In simple regression, the assumptions were verified using the residuals, which are estimators for the εi:
ei = ADLi − ÂDLi = ADLi − (β0 + β1 MMSEi + β2 Agei)
• If the model assumptions are correct, then we expect no systematic trends in the residuals, they have to exhibit constant variability, and they have to be normally distributed.
• In practice, it usually suffices to apply the following techniques:
. Scatter plots of the ei versus all covariates in the model.
. Scatter plot of the ei versus the predicted values ŷi.
. Normality checks for the ei.
• In most software packages, these techniques can be applied in full analogy with thesimple linear regression case.
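Two algebraic facts are worth keeping in mind when reading such residual plots, sketched below on simulated data: with an intercept in the model, least squares residuals average exactly to zero and are orthogonal to every covariate, so only patterns (curvature, changing spread), not the overall level, are informative.

```python
import numpy as np

# Residual sanity checks on a simulated fit: mean zero and orthogonality
# to the covariates hold by construction for least squares with intercept.
rng = np.random.default_rng(3)
n = 150
x1, x2 = rng.normal(size=(2, n))
y = 2 + x1 - 0.5 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta  # residuals, the estimates of the errors eps_i

print(e.mean())       # numerically zero
print(np.dot(e, x1))  # numerically zero: no linear trend left in x1
```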
15.6.1 Residuals versus covariates
• For simple regression, the residuals ei were plotted versus the model covariate.
• Now, we construct a scatter plot of the residuals ei versus each of the covariates in the model.
• If the model is correct, then we expect no further systematic trends.
• Such systematic trends can, like with simple regression, point to the need for transforming one or more covariates.
• Results:
• We find no systematic trends in the residuals. This means that we neither over- nor underestimate the average ADL score, in a systematic way, for older or for younger patients.
15.6.2 Residuals versus predicted values
• The scatter plots of residuals versus covariates allow us to verify whether or not the response is systematically over- or underestimated for certain values of the covariates.
• On the other hand, it is also important to verify whether, for example, large or small predictions would come from systematic over- or underestimation of the outcome.
• In our example, we want to check whether the model systematically over- or underestimates certain ADL values.
• This can be verified by plotting the residuals versus the predicted values ŷi.
• Result:
15.6.3 Normality of the residuals
• As with simple regression, we will check the normality assumption for the errors εi, via the residuals ei.
• This can be done graphically (histogram), or via a formal test for normality:
• The normality assumption seems acceptable.
• As with simple regression:
. Deviations from normality lead to correct results as long as the errors are symmetrically distributed.
. In case of asymmetry, the response can sometimes be transformed, so that the residuals in the ensuing model are normally distributed.
. Potential transformations can disturb linearity and constant variance, implying that, after transformation, the residual plots need to be constructed again.
15.7 Influential observations
• In analogy with simple regression, influential subjects can have a strong impact on the regression’s results.
• In principle, one could explicitly remove each observation in turn, and then assess how the results (i.e., the estimators β0, . . . , βp) change.
• This means that, each time, the results of the analysis without a particular observation need to be compared with the one based on all data.
• Like before, this can be effectuated with Cook’s distances, measuring the ‘distance’ between the results with and without a given observation.
• Cook’s distance for the ith observation is again denoted by Di, and influential subjects correspond to large Di.
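In practice the n re-analyses need not be run explicitly: Cook's distance has a closed form based on the leverages h_ii (the diagonal of the so-called hat matrix), namely D_i = e_i² h_ii / (k · MSE · (1 − h_ii)²), with k the number of estimated regression parameters. A sketch on simulated data (formula illustration, not course output):

```python
import numpy as np

# Closed-form Cook's distances on simulated data.
rng = np.random.default_rng(4)
n = 80
x1, x2 = rng.normal(size=(2, n))
y = 1 + x1 + x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1]
H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix: yhat = H y
h = np.diag(H)                        # leverages h_ii
e = y - H @ y                         # residuals
mse = np.sum(e ** 2) / (n - k)
D = e ** 2 * h / (k * mse * (1 - h) ** 2)
print(int(D.argmax()), D.max())  # the observation an index plot would flag
```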
• Calculations proceed similarly as with simple regression.
• Index plot of all Cook’s distances:
• The figure exhibits observation #43 as the only outlier. Therefore, we compare the regression with and without this subject.
• Results for the analysis with and without this subject are, respectively:
• The final conclusions do not change if we remove subject #43 from the analysis.
15.8 Example from the biomedical literature
van Hooft et al. [14]:
. Analysis section 2.4.1, p. 69:
[Figure: model diagram omitted.]
Fig. 1. The Attitude, Subjective norms, and Self-Efficacy (ASE) model (de Vries et al., 1988).
2.4.1. Predictors of self-management support behavior
To determine which factors influence the behavior of self-management support a stepwise regression analysis was executed with the significant variables of the ASE-model.
. Results section 3.8, p. 70–71, and Table 4, p. 70:
3.8. Predictors of self-management support behavior
Stepwise regression analysis showed that three factors were significant predictors for self-management support behavior. We first controlled for setting (inpatient or outpatient ward). This accounted for 3.1% of the variance (adjusted R2 2.7%). In the subsequent steps the importance of self-management support, the presumed absence of a patients’ need for self-management support, the perceived knowledge gap, and self-efficacy respectively, were entered. In the final model, importance of self-management support (attitude) and setting were mediated by self-efficacy. The final model explained 41.1% of the variance of behavior of self-management support (adjusted R2 39.9%) (Table 4).
Table 4
Determinants of self-management support behavior.
Step 1 Step 2 Step 3 Step 4
Behavior b P Value b P Value b P Value b P Value
Background
Working in an inpatient ward or outpatient department 0.18 0.005 0.14 0.020 0.13 0.025 0.06 0.274
Attitude
Importance 0.19 0.002 0.15 0.010 0.06 0.228
Subjective norms & knowledge
Patients do not have a need −0.19 0.001 −0.16 0.002
Own insufficient knowledge −0.26 <0.001 −0.14 0.005
Self-efficacy 0.53 <0.001
Explained variance R2 = 0.03 <0.05 R2 = 0.07 <0.001 R2 = 0.17 <0.001 R2 = 0.41 <0.001
F-value (df) 7.97 (253) 8.96 (252) 12.37 (250) 34.68 (249)
Note: Stepwise regression analysis; b, standardized coefficients; df, degrees of freedom.
Chapter 16
Polynomial regression
. Example
. Application
. Interpretation of the results
. Example from the biomedical literature
16.1 Example
• We revisit the fictitious example, used before, to illustrate the effect of non-linearity in simple linear regression:
• The figure clearly shows non-linearity.
• Before, this was solved by logarithmically transforming the covariate X, x −→ ln(x).
• On the other hand, we note that the relationship between yi and xi could be quadratic.
• A possible statistical model could be:
yi = β0 + β1 xi + β2 xi² + εi
where, conventionally, the error terms εi are assumed normally distributed, with mean zero and variance σ².
• Note that the above model can be considered a multiple regression model with covariates x1i = xi and x2i = xi²:
yi = β0 + β1 x1i + β2 x2i + εi
• Hence, the model can be fitted by first calculating a new variable which contains the squares of xi, whereafter a multiple linear regression is computed.
• Most software packages allow for implicit calculation of the higher order term(s).
• Output for the regression coefficients:
• The fitted regression curve is:
ŷi = 0.72 + 4.50 xi − 2.32 xi²
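The 'new variable with the squares' recipe can be sketched as follows. The data below are synthetic, generated without noise from the fitted curve above purely for illustration, so the three coefficients are recovered essentially exactly.

```python
import numpy as np

# Quadratic regression as multiple regression: design matrix with
# columns 1, x, x^2, then ordinary least squares.
x = np.linspace(0, 2, 50)
y = 0.72 + 4.50 * x - 2.32 * x ** 2  # synthetic, noiseless data

X = np.column_stack([np.ones_like(x), x, x ** 2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))  # approx [0.72, 4.5, -2.32]
```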
• The coefficient of the quadratic term, β2, is strongly significantly different from zero (p < 0.0001), establishing a strong quadratic effect.
• Graphical representation of the regression curve:
• Output for the ANOVA table:
• The R² value is now higher than R² = 0.9247, obtained from the regression model with the logarithmically transformed covariate.
16.2 Interpretation of the results
• In our fictitious example, the fitted regression curve was:
ŷi = 0.72 + 4.50 xi − 2.32 xi²
• Before, we derived that a regression coefficient indicates how the response changes on average as a function of the corresponding covariate, while keeping all other covariates constant.
• In the above example, this means, e.g., that β1 = 4.50 indicates that the response on average increases with 4.50 if X shows a unit increase, while X² remains constant.
• Now, given that X cannot vary without X² changing along, such an interpretation is meaningless.
• In general, we have to conclude that the individual regression coefficients in polynomial regression cannot be interpreted.
• The regression coefficients merely describe a polynomial, describing the average evolution of Y as a function of X.
• On the other hand, the high significance of β2 (p < 0.0001) indicates that the addition of the quadratic term has importantly improved the regression model. In other words, there is a strong quadratic effect, superimposed on the linear effect.
• The significance of the individual parameters in the polynomial regression can be interpreted, the individual regression parameters cannot.
• Note that the result of the polynomial regression is a curve rather than a plane:
16.3 Remarks
• The foregoing discussion is directly generalizable to polynomials of degree higher than two:
yi = β0 + β1 xi + . . . + βp xiᵖ + εi
• We refer to third-degree polynomials as cubic regression:
yi = β0 + β1 xi + β2 xi² + β3 xi³ + εi
• One can combine ordinary multiple regression with polynomial regression:
yi = β0 + β1 x1i + β2 x1i² + β3 x2i + εi
• Given that polynomial regression is a special case of multiple regression, all techniques for model diagnostics and influential observations apply.
16.4 Example from the biomedical literature
Bjork et al. [16]:
. Definition of outcomes:
Neuropsychiatric symptoms were assessed using the Neuropsychiatric Inventory Nursing Home Version (NPI-NH; Wood et al., 2000), which assesses the frequency and severity of 12 psychiatric and behavioral symptoms in nursing home residents. Symptom frequency is rated from 0 to 4 and symptom severity from 1 to 3. An item score is generated by multiplying frequency by severity (0–12); thus, greater NPS scores indicate greater frequency and severity.
Cognitive functioning was assessed with Gottfries’
. Relation between 12 outcomes and cognitive functioning (Fig.1):
Figure 1. Neuropsychiatric symptoms in relation to level of cognitive function with polynomial regression curves fitted to the data. Regression coefficients are presented in Table 2. The x-axis presents the cognitive score (ranging from 27 to 0), and the y-axis presents the mean item score of each particular symptom. A, delusions; B, hallucinations; C, aggression/agitation; D, depression/dysphoria; E, anxiety; F, elation/euphoria; G, apathy; H, disinhibition; I, irritability; J, aberrant motor behavior; K, night-time behaviors; L, eating changes.
. Polynomial regression results (Table 2):
Table 2 Characteristics of regression curves for NPS in relation to level of cognitive functioning
NPI-NH item / Prevalence, % (n) / Polynomial regression curve / R / R2 / p-value
Delusions 32.7 (1442) 3rd degree 0.226 0.051 0.003
Hallucinations 26.6 (1177) 3rd degree 0.219 0.048 <0.001
Aggression/agitation 39.6 (1720) 3rd degree 0.339 0.115 <0.001
Depression/dysphoria 51.8 (2157) 2nd degree 0.105 0.011 <0.001
Anxiety 40.7 (1732) 3rd degree 0.189 0.036 0.009
Elation/euphoria 17.2 (754) 1st degree 0.143 0.020 <0.001
Apathy 42.3 (1762) 2nd degree 0.297 0.088 <0.001
Disinhibition 25.7 (1133) 3rd degree 0.171 0.029 0.008
Irritability 44.4 (1855) 3rd degree 0.232 0.054 0.001
Aberrant motor behavior 29.7 (1298) 3rd degree 0.292 0.085 <0.001
Night-time behaviors 35.0 (1487) 3rd degree 0.183 0.033 0.001
Eating changes 35.2 (1406) 3rd degree 0.108 0.012 <0.001
NPS, neuropsychiatric symptoms; NPI-NH, Neuropsychiatric Inventory Nursing Home. Data correspond to diagrams in Figure 1.
∗ Prevalence is the percentage of cases with neuropsychiatric symptoms (NPS > 0)
∗ Both R and R² measures are reported
∗ R is not equal to the Pearson correlation, due to non-linearity
Chapter 17
Interaction
. Example
. Application
. Interpretation of results
. What about non-significant main effects?
. Remarks
. Example from the biomedical literature
17.1 Example
• Let us reconsider the example where we aim to predict the ADL score as a function of age and the patient’s MMSE score, 1 day post operation, with the associated multiple regression model:
ADL = 22.55 − 0.29 ×MMSE + 0.01 × Age
• This regression assumed that the effect of MMSE on ADL is independent of the effect of the patient’s age: For each age class, we have that the ADL score diminishes on average with 0.29 per unit increase of MMSE.
• Conversely, it is also assumed that the effect of Age on ADL is independent of the MMSE score of the patient: For each MMSE class, we have that ADL increases on average with 0.01 per unit increase of age.
• A regression model not making this assumption can be obtained through a so-called interaction term of Age and MMSE:
ADLi = β0 + β1 MMSEi + β2 Agei + β3 MMSEi × Agei + εi
• This means that we merely add another covariate to the model, the product of the previous two covariates.
• To demonstrate that we now no longer assume that the effect of Age is independent of MMSE, and vice versa, we rewrite the above model in two ways:
ADLi = β0 + β2 Agei + (β1 + β3 Agei) × MMSEi + εi
ADLi = β0 + β1 MMSEi + (β2 + β3 MMSEi) × Agei + εi
• From the first equation it follows that we assume a linear relationship between ADL and MMSE, but that the intercept and the slope depend on Age:
. Intercept: β0 + β2 Agei
. Slope: β1 + β3 Agei
• From the second equation, it follows that we assume a linear relationship between ADL and Age, but with intercept and slope dependent on MMSE:
. Intercept: β0 + β1 MMSEi
. Slope: β2 + β3 MMSEi
• Note also that the interaction effect implies that the effect of MMSE on ADL depends on Age, but simultaneously also that the effect of Age on ADL depends on MMSE.
• Furthermore, the assumption made before, i.e., that the effect of MMSE (Age) on ADL does not depend on Age (MMSE), can easily be checked by testing H0 : β3 = 0.
• The computation of the product term for the interaction is done implicitly in most software packages.
17.2 Application
• Resulting regression coefficients and ANOVA table:
• Like always, the global F test aims at testing the null hypothesis
H0 : β1 = β2 = β3 = 0
versus the alternative hypothesis that at least one of the above regression coefficients is different from zero.
• Adding the interaction term has increased R² from 0.4946 to 0.5235.
• Strictly speaking, the interaction term is not significant (α = 0.05), but there is evidence that the effects of MMSE and Age on ADL are not entirely independent of one another.
• The estimated regression equation is
ADL = 40.87 − 1.19 × MMSE − 0.21 × Age + 0.01 × MMSE × Age
• Graphical representation of regression surface (upon rotation):
17.3 Interpretation of results
• The estimated regression equation is
ADL = 40.87 − 1.19 × MMSE − 0.21 × Age + 0.01 × MMSE × Age
• Like with polynomial regression, we cannot interpret the individual regression coefficients.
• For example, we cannot conclude that −1.19 captures how strongly ADL changes with MMSE, while the other covariates are kept fixed. Indeed, MMSE cannot vary without also changing the product MMSE×Age, if Age is kept constant.
• To enhance insight in the effect of adding the interaction to the model, we consider the predicted evolution of ADL as a function of MMSE and as a function of Age, separately.
17.3.1 ADL as a function of MMSE
• To see how ADL evolves as a function of MMSE, we rewrite the estimated regression equation as:
ADL = 40.87 − 0.21 × Age + (−1.19 + 0.01 × Age) × MMSE
• We can now compute this for various age groups:
. 65 years: ADL = 27.22 − 0.54 ×MMSE
. 75 years: ADL = 25.12 − 0.44 ×MMSE
. 85 years: ADL = 23.02 − 0.34 ×MMSE
. 95 years: ADL = 20.92 − 0.24 ×MMSE
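These four lines can be generated directly from the estimated coefficients, a quick arithmetic check of the slide values: for fixed Age, the intercept is 40.87 − 0.21 × Age and the MMSE slope is −1.19 + 0.01 × Age.

```python
# Per-age intercept and MMSE slope from the interaction model
# ADL = 40.87 - 1.19*MMSE - 0.21*Age + 0.01*MMSE*Age.
for age in (65, 75, 85, 95):
    intercept = round(40.87 - 0.21 * age, 2)
    slope = round(-1.19 + 0.01 * age, 2)
    print(age, intercept, slope)
# 65 27.22 -0.54
# 75 25.12 -0.44
# 85 23.02 -0.34
# 95 20.92 -0.24
```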
• Each of these equations corresponds to a straight line in the regression surface, for a given level of age:
• We see that ADL decreases progressively less as a function of MMSE, with increasing age.
• For high ages, the dependence decreases less markedly as a function of the patient’s cognitive status.
17.3.2 ADL as a function of Age
• To see how ADL evolves as a function of age, we rewrite the estimated regression equation as:
ADL = 40.87 − 1.19 × MMSE + (−0.21 + 0.01 × MMSE) × Age
• We can now compute this equation for various MMSE groups:
. MMSE = 0: ADL = 40.87 − 0.21 × Age
. MMSE = 10: ADL = 28.97 − 0.11 × Age
. MMSE = 20: ADL = 17.07 − 0.01 × Age
. MMSE = 30: ADL = 5.17 + 0.09 × Age
• Each of these equations corresponds to a straight line in the regression plane, for a constant MMSE value:
• Hence, we see that, for patients with a very good cognitive status, there is a tendency for ADL to increase with age.
• For patients with a worse cognitive status, there is a tendency for ADL to decrease with age.
• The latter observation is counter-intuitive. For that reason, we wish to test whether, e.g., for patients with MMSE score equal to 10, the slope value of −0.11 is significant.
• In fact, this slope is
−0.21 + 0.01 × 10
where −0.21 is an estimate for β2 (coefficient of Age), and where 0.01 is an estimate for β3 (interaction coefficient).
• We are interested in testing the hypothesis:
H0 : β2 + 10 β3 = 0, versus HA : β2 + 10 β3 ≠ 0
• Most software packages allow specification of null hypotheses that are linear combinations of the parameters in the model.
• Result:
• We obtain an F test for the specified null hypothesis, from which it follows that there is no significant relationship between ADL and Age, for patients with an MMSE score equal to 10 (p = 0.2034).
• Note that, strictly speaking, the interaction term is non-significant (p = 0.0731), suggesting that the lines on the regression surface are parallel:
• On the other hand, the relatively small p-value hints at the presence of (a weak form of) interaction, which now, due to lack of power, is not found to be significant. It is important to verify this in a new, perhaps larger, experiment.
17.4 What about non-significant main effects?
• In our example, we obtained the following estimators for the regression parameters:
• The effects of Age and MMSE are termed ‘main effects,’ to mark the difference with the interaction MMSE×Age.
• Can we delete, in this case, the least significant term, i.e., the main effect of Age?
• The rationale would be that this is a term that does not provide additional information about the ADL response variable.
• As long as there is an interaction term, it is possible that the effect of Age depends on MMSE, and vice versa.
• This implies that no assertions can be made about the global effect of Age.
• For this reason, we will not delete non-significant main effects, as long as interaction effects are included in the model.
17.5 Remarks
• Interactions can be added as well to polynomial regression models:
yi = β0 + β1x1i + β2x21i + β3x2i + β4x1ix2i + εi
• Fitting models with several covariates, whether or not polynomial, and with or without interaction, can easily be done within the context of the so-called ‘General Linear Model’, to be discussed later.
• Given that regression models with interaction terms are, again, a special case of multiple regression, the techniques for model diagnostics and influential subjects remain valid.
• Also, implementation in software remains similar.
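• Such a model can be fitted by ordinary least squares on a design matrix whose columns are the intercept, the two covariates, the square, and the product. A self-contained sketch in pure Python (normal equations; a real analysis would of course use dedicated statistical software):

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y,
    solved with Gaussian elimination and partial pivoting."""
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(p)]
         for r in range(p)]
    v = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            v[r] -= f * v[col]
    b = [0.0] * p
    for r in range(p - 1, -1, -1):
        b[r] = (v[r] - sum(A[r][c] * b[c] for c in range(r + 1, p))) / A[r][r]
    return b

def design_row(x1, x2):
    # y = b0 + b1*x1 + b2*x1^2 + b3*x2 + b4*x1*x2
    return [1.0, x1, x1 * x1, x2, x1 * x2]

# Noise-free sanity check: data generated from known coefficients are recovered.
true_b = [1.0, 2.0, 0.5, -1.0, 0.3]
pts = [(x1, x2) for x1 in range(5) for x2 in range(3)]
X = [design_row(x1, x2) for x1, x2 in pts]
y = [sum(b * v for b, v in zip(true_b, row)) for row in X]
est = fit_ols(X, y)
```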
Advanced statistical methods 396
17.6 Example from the biomedical literature
Collard et al. [17]
. Statistical analysis section, p. 191:
Multiple linear regression analyses were conducted to
examine associations of the number of somatic diseases
(dependent variable) with depression (independent vari-
able) adjusted for socio-demographic variables (age,
gender, educational level, partner status, income) and
lifestyle factors (smoking status, alcohol use, BMI, and
physical exercise). First, we checked whether the associa-
tions between depression and somatic comorbidity were
dependent on frailty by including interaction terms
between frailty and depression in the fully adjusted
models. We tested both, frailty as a dichotomous
characteristic (present yes/no) and as a dimensional
variable based on the number of criteria present. A
significant interaction term between depression and frailty
(yes/no) implies that the association between depression
and somatic diseases is different in patients with and
without frailty. In case of a significant interaction term
with the number of frailty criteria present, the association
between depression and frailty differs among the different
levels of frailty. Subsequently, it was tested whether frailty
Advanced statistical methods 397
. Results section, p. 192:
5.2. Frailty as a moderating factor
Whether the association between depression and
number of somatic diseases was dependent on frailty
status, was examined by adding the interaction term of
depression by frailty to the fully adjusted linear regression
models. Depression neither interacted with the presence of
frailty (yes/no) (p = .57), nor with the number of frailty
components present (p = .25).
∗ Outcome: Number of somatic diseases
∗ Covariates: Severity of depression, Number of frailty components present
∗ Interaction: Severity of depression × Number of frailty components present
∗ Adjusted for: Socio-demographic variables and Lifestyle factors
Advanced statistical methods 398
Part VI
Analysis of variance with multiple factors
Advanced statistical methods 399
Chapter 18
Multiple analysis of variance
. Example
. Application
. Interpretation of results
. Model diagnostics
. Influential observations
. Examples from the biomedical literature
Advanced statistical methods 400
18.1 Example
• Let us reconsider the examples from single ANOVA:
. We found a significant difference in mean ADL score, 1 day post operation, between neuro-psychiatric patients and other patients (p = 0.0025):
Advanced statistical methods 401
. We also found that the average ADL score, 1 day post operation, is significantly different for different patient pre-operative housing situations (p = 0.0006):
• Hence, we have two factors that are related to the ADL score.
Advanced statistical methods 402
• The average ADL for each combination of housing situation and neuro-status:
• Like before, we will remove the fourth housing situation, because it contains a single observation only.
Advanced statistical methods 403
• Graphical representation:
• Note that the averages have been connected to emphasize the difference in evolution between both neuro groups.
Advanced statistical methods 404
• This difference is more easily observed in a so-called interaction plot:
Advanced statistical methods 405
• Multiple ANOVA will allow us to assess the joint effect of housing situation and neuro-status on the ADL score, 1 day post operation.
• Note that the graph suggests that the effect of neuro-status on mean ADL depends on the patient’s housing situation.
• In analogy with multiple linear regression, we need to account for possible interaction between both factors.
• The prediction of ADL, using neuro-status and housing situation, is an example of a so-called 2-way ANOVA, because we have two factors to predict the response. Like in the regression case, the entire 2-way ANOVA discussion can be generalized to more than two factors.
Advanced statistical methods 406
18.2 Application
• An ANOVA analysis for the ADL score, with neuro-status and housing situation as factors, and with potential interaction between both:
Advanced statistical methods 407
• Like before, we obtain a decomposition of the total variability in the response variable (SSTO) into a component capturing the variability between the various groups (SSbetween) and a component capturing the variability within the groups (SSwithin).
• Like before, we obtain a global F-test comparing the variability between the groups with the variability within the groups:
F = (197.89/5)/(470.94/48) = 4.03
which expresses to what extent the factors in the model assist us in predicting ADL (here significant, p = 0.0039).
• The degrees of freedom, needed to standardize SSwithin and SSbetween, are more difficult to derive with multiple ANOVA (see later).
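• The arithmetic of this global F-test can be sketched in a few lines, using the sums of squares and degrees of freedom reported above (a minimal illustration, not the software’s own computation):

```python
# Sums of squares and degrees of freedom from the 2-way ANOVA output above
ss_between, df_between = 197.89, 5
ss_within, df_within = 470.94, 48

# F compares the mean square between groups with the mean square within groups
F = (ss_between / df_between) / (ss_within / df_within)
```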
Advanced statistical methods 408
• Like before, we obtain an R², indicating the percentage of total variability in the ADL score that is explained by the ANOVA model.
• Furthermore, an F-test for each of the effects in the model can be obtained:
Advanced statistical methods 409
18.3 Interpretation of results
• The relevant ANOVA table for testing the various effects in the model is:
• We obtain an F-test for each effect specified in the model.
Advanced statistical methods 410
• Regarding interpretation, these tests are completely analogous to those in multiple linear regression, i.e., one tests for the significance of a given effect while keeping the other effects constant.
• Before making a statement about ‘the’ neuro-effect or ‘the’ effect of housing situation, we need to assess whether the effect of one factor does or does not depend on the other factor.
• In other words, we need to check whether there is an interaction between these factors.
• From the above table, it follows that there is no significant interaction (p = 0.4515).
• We can conclude that the effect of housing situation does not depend on neuro-status and, vice versa, that the neuro-effect does not depend on housing situation.
Advanced statistical methods 411
• Graphically, this means that the non-parallel structure of the means can be ascribed to randomness:
• We can assume that the mean profiles in fact are parallel.
Advanced statistical methods 412
• This assumption can be built into the analysis by removing the interaction term from the model.
• Results for model without interaction between housing situation and neuro-status:
Advanced statistical methods 413
• We can also calculate predicted averages, based on the model without interaction between both model factors.
Advanced statistical methods 414
• With the above output, we can now test for the effect of each factor, after correction for the other factor. In other words, we can test the effect of each of the factors, while keeping the other factor constant:
. The model assumed that the effect of housing situation is the same for both neuro-statuses.
. This effect is found to be highly significant (p = 0.0076), implying that the lines in the foregoing graph are not horizontal.
. The model assumed that the neuro-effect is the same for the three housing situations.
. This effect is not significant (p = 0.2459), implying that the vertical distances between the lines in the foregoing graph are due to random variability.
. This actually means that we are allowed to further simplify the graph by assuming the same evolution for both neuro groups (coinciding lines in the above graph).
• In other words, we can simplify the model by removing the factor Neuro.
Advanced statistical methods 415
• We obtain a one-way ANOVA model with Housing Situation as the only factor, as discussed before.
• The difference in neuro-effect between the housing situations, suggested by the original graph, is not significant (no interaction).
• Moreover, the neuro-effect is not only equal for all housing situations, it is not even significantly different from zero.
• How can we intuitively see why neuro-status was originally significant in the t-test analysis, but no longer after correction for housing situation?
• Apparently, differences in ADL between neuro-psychiatric patients and non-neuro-psychiatric patients can be explained by differences in housing situation.
Advanced statistical methods 416
• Graphically:
. Outcome Y : ADL (day 1 post-operatively)
. Factor X1: Housing Situation
. Factor X2: Neuro-psychiatric status
• This suggests that there is a strong relationship between neuro-status and housing situation.
Advanced statistical methods 417
• We can check this with a table-analysis (chi-squared test):
Advanced statistical methods 418
• There is a strongly significant (p = 0.008) relationship between housing situation and neuro-status of the patient: 53% of the neuro-psychiatric patients stay in a RH/RVT.
• This explains why, after correction for housing situation, the neuro-status is no longer significant.
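• What the chi-squared statistic of such a table analysis computes can be sketched as follows; the tables below are hypothetical illustrations, not the patient data, and the p-value would additionally require the chi-squared distribution with (r−1)(c−1) degrees of freedom:

```python
def chi2_stat(table):
    """Pearson chi-squared statistic for a two-way frequency table:
    sum of (observed - expected)^2 / expected over all cells."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # count under independence
            stat += (obs - expected) ** 2 / expected
    return stat

# Perfectly proportional rows: no association, statistic equals 0
independent = [[10, 20, 30], [20, 40, 60]]
# Hypothetical counts with a clear association between the two factors
associated = [[5, 10, 30], [25, 30, 10]]
```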
Advanced statistical methods 419
18.4 Model diagnostics
• With 1-way ANOVA, it is assumed that the data within each group are normally distributed, with constant variance.
• With multiple ANOVA, it is assumed that, for each combination of the factors in the model, the data are normally distributed with constant variance.
• For our example, the factor Housing Situation has three levels, and the factor Neuro has 2 possible values. We have 6 possible combinations.
• Our 2-way ANOVA model assumed that for each of these 6 combinations, the data are normally distributed, with equal variance throughout.
Advanced statistical methods 420
• The violation of these assumptions can lead to erroneous results of the statistical tests.
• Hence, it is key to check the assumptions as carefully as possible.
• Let us illustrate this for the original model (model with interaction), so as to ensure that the model simplification (based on p-values) is justified.
Advanced statistical methods 421
18.4.1 Assumption of constant variance
• The estimated standard deviation for each combination of model factors:
Advanced statistical methods 422
• Like with 1-way ANOVA, Levene’s test can be used to assess whether the 6 variances are equal:
• The variances in the six groups are not significantly different (p = 0.3059).
• Like with 1-way ANOVA, when there are many groups or when some groups contain very many observations, small differences in variances can turn out to be significant, even though small differences in variance do not drastically alter the conclusions.
• For this reason, next to a formal test, a rule of thumb for equal variance is applied: we check whether or not the variances differ by a factor larger than 5.
Advanced statistical methods 423
• In our example, this becomes:
4.69²/1.54² = 9.27
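• The rule of thumb can be checked mechanically; the two standard deviations below are the largest and smallest ones from the output above:

```python
# Largest and smallest group standard deviations from the output above
sd_max, sd_min = 4.69, 1.54

# Rule of thumb: question the constant-variance assumption if the
# variances differ by more than a factor of 5
variance_ratio = sd_max**2 / sd_min**2
exceeds_rule_of_thumb = variance_ratio > 5
```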
• Note that the variability in the group of neuro-psychiatric patients that live alone is much larger than in the other groups.
• However, it so happens that this group contains only 4 patients, with the following ADL scores:
Patient ADL (day 1)
#14 18
#25 23
#49 12
#56 15
Advanced statistical methods 424
• The large variability, therefore, is essentially due to patient #25, who exhibits a much larger ADL value than the other patients in this group.
• Hence, the large standard deviation in this sub-population does not necessarily suggest that the variability in this group would be larger than in the other sub-populations.
• On the other hand, with regression analysis, we saw that such ‘outliers’ are potentially influential. We will have to pay careful attention in our influence analysis to subject #25.
Advanced statistical methods 425
18.4.2 Normality assumption
• Multiple ANOVA assumes that the data are normally distributed for each combination of factors in the model, with constant variance. Above, we already discussed the assessment of equality of the variances. We now assume that the assumption of equal variance is satisfied. How can we test the assumption of normality?
• As discussed already, to each ANOVA corresponds a statistical model that makes assumptions about the relationship between the average response values across classes of model factors.
• For example, the 2-way ANOVA for ADL, with factors Housing Situation and Neuro-status, without interaction, corresponds to the assumption of parallel lines in the graphical representation of group-specific averages.
Advanced statistical methods 426
• The lines in the graph are mean ADL values, predicted by the ANOVA model:
• For a particular model, one can calculate a residual for each individual, which compares the predicted response with the observed response.
• These residuals can be used, like before, to assess the normality assumption.
Advanced statistical methods 427
• Once the residuals are calculated, we can assess normality by means of a histogram on the one hand, or by means of formal normality tests, similar to previous normality checks:
• Residual analysis for original 2-way ANOVA model (model with interaction):
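• For the model with interaction, the fitted value for each patient is simply the mean of their factor combination, so the residuals can be sketched as follows (the mini data set below is hypothetical, for illustration only):

```python
from collections import defaultdict

def cell_mean_residuals(rows):
    """Residuals y - fitted for a 2-way ANOVA with interaction, whose
    fitted value for each observation is the mean of its cell
    (combination of housing situation and neuro-status)."""
    cells = defaultdict(list)
    for housing, neuro, y in rows:
        cells[(housing, neuro)].append(y)
    means = {cell: sum(v) / len(v) for cell, v in cells.items()}
    return [y - means[(housing, neuro)] for housing, neuro, y in rows]

# Hypothetical mini data set: (housing situation, neuro-status, ADL score)
data = [(1, 0, 10), (1, 0, 14), (1, 1, 20), (1, 1, 22), (2, 0, 8), (2, 0, 12)]
residuals = cell_mean_residuals(data)
```

These residuals are then fed into the histogram and the formal normality tests.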
Advanced statistical methods 428
• The assumption of normality is acceptable.
• Like before, we have that:
. Departures from normality nevertheless lead to correct results, as long as the error distribution is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so as to ensure normality in the new model.
. Possibly, such transformations can disturb the constant-variance properties, so that it is imperative to re-assess this assumption after transformation.
Advanced statistical methods 429
18.5 Influential observations
• Each ANOVA model results in a prediction of the mean response for each combination of the model factors, and these predictions satisfy the assumptions implicitly made by the model.
• For example, removal of the interaction between housing situation and neuro-status led to parallel predicted means.
• Like with 1-way ANOVA and regression, it is important to assess whether there are subjects in the data set with unduly large influence on these predicted values.
• We can again calculate Cook’s distances, measuring the strength with which the predicted means change if an observation is removed from the analysis.
• This is done in full analogy with 1-way ANOVA.
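• A sketch of this leave-one-out computation for a cell-means ANOVA; it is illustrated here for a one-way layout (the two-way case works the same way on the factor combinations), using the four ADL scores listed above as one group together with a hypothetical second group:

```python
from collections import defaultdict

def cooks_distances(group_of, ys):
    """Cook's distance per observation for a cell-means ANOVA model,
    where the fitted value of each observation is its group mean."""
    n = len(ys)
    idx = defaultdict(list)
    for i, g in enumerate(group_of):
        idx[g].append(i)
    means = {g: sum(ys[i] for i in ix) / len(ix) for g, ix in idx.items()}
    p = len(means)  # number of estimated means
    sse = sum((ys[i] - means[g]) ** 2 for g, ix in idx.items() for i in ix)
    mse = sse / (n - p)
    dist = []
    for i, g in enumerate(group_of):
        n_g = len(idx[g])
        if n_g == 1:
            dist.append(float("inf"))  # only observation in its cell
            continue
        mean_wo = (means[g] * n_g - ys[i]) / (n_g - 1)  # leave-one-out mean
        # Only fitted values in the same cell change, all by the same amount
        dist.append(n_g * (means[g] - mean_wo) ** 2 / (p * mse))
    return dist

# Group 'alone' uses the four ADL scores listed above; group 'other' is hypothetical.
groups = ["alone"] * 4 + ["other"] * 2
adl = [18, 23, 12, 15, 10, 11]
distances = cooks_distances(groups, adl)
```

In this sketch the outlying score 23 (patient #25) receives the largest distance, mirroring the index plot.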
Advanced statistical methods 430
• Index plot of Cook’s distances, for the model without interaction:
Advanced statistical methods 431
• As expected, the outlier #25 exhibits a relatively large Cook’s distance, pointing towards a relatively large influence of this individual on the estimation of the mean response.
• There are two subjects that are even more influential, i.e., subjects #49 and #53 (= most influential).
• To study the influence of, e.g., subject #53 in more detail, we can compare the analysis based on all data with one where this subject has been removed.
Advanced statistical methods 432
• The corresponding outputs are:
Advanced statistical methods 433
• Estimated means, based on the analysis with #53:
Advanced statistical methods 434
• Estimated means, based on the analysis without #53:
Advanced statistical methods 435
• Given that our ANOVA model does not contain interactions, the estimated effect of neuro-status is constant across housing situations, for the analysis with subject #53 as well as for the one without this subject.
• We see that subject #53 leads to an increased effect of neuro-status. Removal of patient #53 led to a larger p-value for the neuro-effect.
• Subject #53 is a neuro-psychiatric patient in the second housing situation. There are only 4 patients in this situation, with the following ADL scores:
Patient ADL (day 1)
#11 19
#24 18
#26 16
#53 24
Advanced statistical methods 436
• This patient is an outlier in his group. The extra large ADL score leads to an increase in the estimated neuro-effect.
• Observed means (no ANOVA), obtained with #53:
Advanced statistical methods 437
• Observed means (no ANOVA), obtained without subject #53:
Advanced statistical methods 438
• Hence, our original impression of the presence of an interaction is primarily due to subject #53.
• This explains why the interaction term is far from significant (p = 0.4515).
• This p-value becomes p = 0.8717 when subject #53 is removed from the analysis.
Advanced statistical methods 439
18.6 Examples from the biomedical literature
• Blomquist et al. [18]
. Hypotheses to test (p. 380):
Advanced statistical methods 440
. Statistical analysis & results (p. 381):
∗ Tests for interactions
∗ Not clear whether main effects were tested based on the 2-way model or on separate 1-way models
Advanced statistical methods 441
• Richardson et al. [19]:
. Aim (p. 1198) & analysis (p. 1199):
Aim
The aim of this study was to identify if student nurses
studying in the child field of nursing feel a lack of comfort
in caring by providing support for adolescents who are
LGBQ and what factors influence their comfort level.
alpha. Between-group comparisons were carried out with a
two-way analysis of variance (ANOVA) (factor 1: ethnicity,
White British or Ethnic Other; factor 2: religion, Religious
or Non-Religious). A test for normality and checks for
multicollinearity were carried out. To reduce the risk of Type I
errors (false positives) due to multiple testing, each signifi-
cant value was adjusted using Bonferroni’s method. The
significance level was set at P ≤ 0.05.
Advanced statistical methods 442
. Table 3 (p. 1201):
Table 3 Effects of ethnicity and religion on comfort (two-way ANOVA).
Variable
White British Ethnic Other F (P-value)
R* NR** R* NR** Ethnicity Religion E × R†
[A1] 4.13 (0.61) 4.26 (0.51) 4.10 (0.73) 4.40 (0.52) 0.183 (0.669) 2.400 (0.124) 0.359 (0.550)
[A2] 3.83 (0.83) 3.77 (0.88) 3.45 (0.99) 4.00 (0.94) 0.125 (0.724) 1.474 (0.227) 2.203 (0.140)
[A3] 3.50 (0.89) 3.51 (0.82) 3.58 (0.89) 3.90 (0.74) 1.614 (0.206) 0.812 (0.369) 0.678 (0.412)
[A4] 3.71 (0.96) 4.17 (0.57) 3.73 (0.96) 4.00 (0.67) 0.155 (0.694) 3.897 (0.050) 0.285 (0.594)
[A5] 2.42 (0.88) 2.11 (0.90) 2.45 (1.03) 2.20 (1.14) 0.086 (0.770) 1.741 (0.189) 0.013 (0.910)
[A6] 3.13 (1.12) 3.06 (1.21) 3.29 (1.20) 3.60 (0.84) 1.995 (0.160) 0.235 (0.629) 0.561 (0.455)
[A7] 3.83 (0.64) 4.00 (0.59) 3.70 (0.87) 4.00 (0.67) 0.176 (0.675) 2.074 (0.152) 0.176 (0.675)
[A8] 2.61 (0.84) 2.29 (0.71) 2.77 (1.12) 2.30 (1.06) 0.169 (0.681) 3.479 (0.064) 0.119 (0.731)
[A9] 3.67 (0.87) 3.69 (0.76) 3.65 (0.89) 3.70 (0.82) 0.001 (0.985) 0.040 (0.841) 0.009 (0.923)
*Religious.
**Non-Religious. †E × R = Ethnicity × Religion interaction effect.
Note: Values are expressed as mean (SD).
∗ Main effects not interpretable when interactions present in the model
∗ No indication that assumptions of constant variance would not be satisfied
Advanced statistical methods 443
Part VII
Analysis of covariance and the general linear model
Advanced statistical methods 444
Chapter 19
Analysis of covariance
. Example
. Application
. Interpretation of results
. Model diagnostics
. Influential subjects
. Examples from the biomedical literature
Advanced statistical methods 445
19.1 Example
• In the context of simple regression, we have demonstrated that there is a relationship between ADL and MMSE. At the same time, we expect a relationship between the occurrence (yes/no) of complications and ADL.
• We wish to explore the relationship between dependence (ADL) on the one hand, and cognitive status (MMSE) and the occurrence of complications on the other hand.
• Overview of the number of patients with and without general post-operative complications:
Advanced statistical methods 446
• If we want to relate a patient’s dependence to the occurrence of post-operative complications, then it is not meaningful to use the ADL score on day 1 post operation as the response variable.
• Instead, we will use the highest ADL value recorded over the three recordings (days 1, 5, and 12 post operation).
• This maximal ADL score expresses the highest level of dependence, recorded for each subject in the study.
• We will relate this new variable to the cognitive status of the patient, 1 day post operation, as well as to the occurrence of general post-operative complications.
• Regression of the maximal ADL score on the MMSE score, 1 day post operation, produces a significant (p < 0.0001) result, where dependence increases as MMSE decreases.
Advanced statistical methods 447
• An unpaired t-test (unequal variances) shows that the mean ADL is significantly higher (p < 0.0001) for patients with complications than for those without complications.
• Graphically:
Advanced statistical methods 448
• These results are also visible in a graph with different symbols for the two groups:
• Note that all patients with complications feature an ADL score above the regression line that had been obtained without distinguishing between patients with and without complications.
Advanced statistical methods 449
• This suggests that a separate regression is necessary for both groups:
Advanced statistical methods 450
• The graph suggests that the relationship between the ADL score and the MMSE score is less pronounced for patients with complications than for patients without complications.
• In other words, we expect an interaction of MMSE with the occurrence of complications.
• In order to statistically test for this, we have to make use of analysis of covariance (ANOCOVA), which allows us to study the relationship between a continuous response on the one hand, and one or more covariates (regression) and one or more factors (ANOVA) on the other hand.
• ANOCOVA can be seen as a combination of regression and ANOVA.
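• To make this combination concrete: with a dummy-coded factor, the interaction model implies a separate intercept and slope per group. A minimal sketch with hypothetical coefficients (for illustration only, not the fitted values from the output):

```python
def group_line(b0, b_mmse, b_comp, b_inter, complication):
    """Intercept and slope of the ADL-versus-MMSE line implied by the model
    y = b0 + b_mmse*MMSE + b_comp*COMP + b_inter*MMSE*COMP,
    where COMP = 1 for patients with complications and 0 otherwise."""
    comp = 1 if complication else 0
    return b0 + b_comp * comp, b_mmse + b_inter * comp

# Hypothetical coefficients, for illustration only (not the fitted values):
b = dict(b0=30.0, b_mmse=-0.75, b_comp=5.0, b_inter=0.25)
line_without = group_line(**b, complication=False)
line_with = group_line(**b, complication=True)
```

Removing the interaction term (b_inter = 0) forces the two lines to share a slope, i.e., to be parallel.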
Advanced statistical methods 451
19.2 Application
• Results from fitting the ANOCOVA model:
Advanced statistical methods 452
• As with regression, we obtain a split of the total variability in the response (SSTO) into a component explained by the variability in the MMSE scores and differences attributable to the occurrence, yes or no, of complications (SSR), and a component expressing the total error if we predict ADL based on the model effects (MMSE and complications).
• Here too, we obtain a global F-test, checking whether the effects in the model contain information for the prediction of the ADL score. In our example, we have a significant result (p < 0.0001).
• Like before, the R² captures which part of the total variability in the data can be explained by the effects in the model:
R² = SSR/SSTO = 557.01/836.75 = 0.6657,
which means that the occurrence of complications and the MMSE score, 1 day post operation, together explain more than 66% of the variability in ADL.
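• The R² arithmetic from the output can be reproduced directly (values taken from the slide above):

```python
# Sums of squares from the ANOCOVA output above
ssr, ssto = 557.01, 836.75
sse = ssto - ssr            # residual (error) sum of squares
r_squared = ssr / ssto      # fraction of total variability explained
```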
Advanced statistical methods 453
19.3 Interpretation of results
• The relevant ANOVA table for testing the various effects in our model is:
• We obtain an F-test for each effect specified in the model.
• Because the fitted model contains an interaction term, it is not possible to make a claim about ‘the’ effect of complications or about ‘the’ effect of the MMSE score on the maximal ADL score.
Advanced statistical methods 454
• From the above table, there appears to be some evidence for interaction between both effects (strictly speaking, not significant, p = 0.0666 > 0.05).
• This means that there is some evidence for the effect of the MMSE score not being the same for patients with and without complications, i.e., that the two regression lines in the following figure are not parallel:
Advanced statistical methods 455
• In spite of some evidence for the presence of interaction between MMSE and the occurrence of complications, and in spite of this interaction being scientifically relevant and to be expected, it is not significant (p = 0.0666).
• Removal of the interaction results in a model with parallel lines: the effect of MMSE is the same for patients with and without complications:
Advanced statistical methods 456
• Results after the interaction term has been removed from the model:
Advanced statistical methods 457
• Hence, both remaining effects are highly significant:
. There is a significant difference between patients with and without complications, after correction for MMSE1. In other words, for patients with equal MMSE score, 1 day post operation, there will still be a significant difference between both groups.
. There is a significant effect of MMSE, after correction for GEN. In other words, both for patients with and without complications, ADL is related to the MMSE score, 1 day post operation.
Advanced statistical methods 458
19.4 Model diagnostics
• Like with regression and ANOVA, ANOCOVA also implies a statistical model to describe the data. In our first model (with interaction), we assume, both for patients with and without complications, that the relationship between the maximal ADL score and the MMSE score on day 1 is linear, but that the intercept as well as the slope are different between both groups.
• Based on this model, we can again determine, for every individual, a predicted ADL score, based on complication status and MMSE score, 1 day post operation.
• Again, we implicitly assumed that the errors made in prediction are normally distributed with mean zero and constant variance.
• When these assumptions are not satisfied, erroneous results can be obtained.
Advanced statistical methods 459
• Like always, the verification of these assumptions is based on the calculated residuals ei = yi − ŷi. These must not exhibit systematic trends, must have constant variance, and must be normally distributed.
• In practice, it usually suffices to apply the following techniques:
. Scatter plots of the ei versus all covariates in the model.
. Scatter plot of the ei versus the predicted values ŷi
. Normality checks for the ei
Advanced statistical methods 460
19.4.1 Residuals versus covariates
• In multiple linear regression, the residuals ei were plotted versus every covariate in the model.
• We now construct a scatter plot of the residuals ei versus each of the two covariates in the model, but perhaps with different symbols for patients of the different subgroups that are taken into account in the model (with/without complications).
• If the model is correct, we expect no further systematic trends, for any of the subgroups.
• Systematic trends, like with simple regression, can point to the need for a transformation of one or more covariates.
Advanced statistical methods 461
• Resulting plot:
• For none of the two groups do we find a systematic trend in the residuals. This implies that the maximal ADL score is not systematically over- or underestimated, e.g., for patients with higher or lower cognitive status.
Advanced statistical methods 462
19.4.2 Residuals versus predicted values
• The scatter plots of the residuals versus covariates allow us to check whether or not the response is systematically over- or underestimated for certain values of the covariates.
• In addition, it is also important to verify whether or not, for example, large or small predicted values point to systematic over- or underestimation.
• In our example, this comes down to verifying whether our model systematically over- or underestimates certain ADL values.
• This can be verified by plotting the residuals versus the predicted values ŷi, again with different symbols for the various subgroups in our set of data.
Advanced statistical methods 463
• Resulting plot:
• Thus, we find no systematic errors for certain ADL values, neither for the group with nor for the group without complications.
Advanced statistical methods 464
19.4.3 Normality of the residuals
• Verifying the residuals’ normality can, again, be done in a graphical fashion (histogram), or via a formal test for normality.
Advanced statistical methods 465
• We conclude that the normality assumption appears to be plausible.
• Like before, it holds that:
. Departures from normality still lead to correct results, as long as the distribution of the errors is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so that the residuals in the new model are normally distributed.
. Such transformations can distort linearity and constant variance, so that, after transformation, the residuals need to be checked again for the new model.
Advanced statistical methods 466
19.5 Influential observations
• Each ANOCOVA model produces a predicted average response for each combination of the model effects, and these predictions satisfy the assumptions implicitly made by the model.
• In our example with interaction, we had a different regression line for each group. The model without interaction led to parallel predicted regression lines.
• We can now explore the influence of each subject on the prediction, using Cook’s distance, measuring how strongly the predicted values change when a particular individual is removed from the data.
• In practice, an influence analysis is conducted in the same way as discussed before, for regression and ANOVA.
Advanced statistical methods 467
• Index plot of Cook’s distances:
• We do not observe points that are exceptionally more influential than the others.
Advanced statistical methods 468
19.6 Examples from the biomedical literature
• Collard et al. [17]
. Statistical analysis section, p. 191:
Multiple linear regression analyses were conducted to
examine associations of the number of somatic diseases
(dependent variable) with depression (independent vari-
able) adjusted for socio-demographic variables (age,
gender, educational level, partner status, income) and
lifestyle factors (smoking status, alcohol use, BMI, and
physical exercise). First, we checked whether the associa-
tions between depression and somatic comorbidity were
dependent on frailty by including interaction terms
between frailty and depression in the fully adjusted
models. We tested both, frailty as a dichotomous
characteristic (present yes/no) and as a dimensional
variable based on the number of criteria present. A
significant interaction term between depression and frailty
(yes/no) implies that the association between depression
and somatic diseases is different in patients with and
without frailty. In case of a significant interaction term
with the number of frailty criteria present, the association
between depression and frailty differs among the different
levels of frailty. Subsequently, it was tested whether frailty
Advanced statistical methods 469
. Results section, p. 192:
5.2. Frailty as a moderating factor
Whether the association between depression and
number of somatic diseases was dependent on frailty
status, was examined by adding the interaction term of
depression by frailty to the fully adjusted linear regression
models. Depression neither interacted with the presence of
frailty (yes/no) (p = .57), nor with the number of frailty
components present (p = .25).
∗ Outcome: Number of somatic diseases
∗ Covariate: Severity of depression
∗ Factor: Presence of frailty (yes / no)
∗ Interaction: Severity of depression × Presence of frailty
∗ Adjusted for: Socio-demographic variables and Lifestyle factors
Advanced statistical methods 470
• Ausili et al. [20]:
. Data analysis section p. 20-21:
Third, we compared self-care maintenance, self-care manage-
ment and self-care confidence scores between heart failure
patients with diabetes versus those without diabetes. This
comparison was performed adjusting for sociodemographic and
clinical variables known to influence self-care maintenance,
management and confidence (Bidwell et al., 2015; Clark et al.,
2014; Cocchieri et al., 2015; Tsai et al., 2015) and those variables
that were significantly different between heart failure patients
with diabetes and heart failure patients without diabetes. The
variables that were used to adjust the above comparison were: age,
gender, Charlson Comorbidity Index score, number of medications,
employment status, Mini Mental State Examination score,
caregiver presence, education, months of illness, New York Heart
Association functional class, number of hospitalizations in the last
year, and alcohol consumption.
Fourth, in order to know if the presence of diabetes influenced
. Figure 1, p. 23:
Fig. 1. Comparison of self-care maintenance, self-care management and self-care confidence means and medians between heart failure patients with diabetes mellitus
(n = 379) and without diabetes mellitus (n = 813).
Note. Sample size in self-care management dimension is lower (628) because this scale was administered only to patients who reported symptoms of heart failure in the last
month. P-values derived by multiple linear regression adjusting for age, gender, Charlson Comorbidity Index score, number of medications, employment status, Mini Mental
State Examination score, caregiver presence, education, months of illness, New York heart Association functional class, number of hospitalizations in the last year, alcohol
consumption and self-care confidence (this last one only in the regression analysis on self-care maintenance and self-care management).
. Results section, p. 21-22:
adequate self-care (Riegel et al., 2009a,b). As shown by Fig. 1, none
of these self-care scores were statistically different between heart
failure patients with versus those without diabetes mellitus (aim
1; self-care maintenance p = 0.23, adjusted p = 0.13; self-care
management p = 0.98, adjusted p = 0.21; self-care confidence
p = 0.87, adjusted p = 0.51). Accordingly, no statistically significant
associations were found between the presence of diabetes and
∗ Three outcomes
∗ Comparison of patients with and without diabetes mellitus
∗ Corrected for covariates related to the outcome and different between both groups
∗ p-value for self-care management not the same in graph as in text (p = 0.22 versus p = 0.21)
Chapter 20
The general linear model
. Introduction
. Example
. The generalized linear model
20.1 Introduction
• In the previous chapters, many statistical models have been discussed:
. Linear regression (simple, multiple, interaction)
. Polynomial regression
. ANOVA (simple, multiple, interaction)
. ANOCOVA
• These are all special cases of the so-called ‘general linear model’ (GLM).
• In practice, one will often use a combination of the above models to relate a set of covariates and/or factors to a given response variable.
• Furthermore, one will often aim to reach a final model in a step-by-step fashion, by:
. Removal of non-significant effects
. Adding significant terms
. Adding (or removing) interaction terms
• This can be done flexibly only if we can easily switch from one type of analysis to the other.
• Most software packages have therefore incorporated all those models in a single software routine.
20.2 Example
• As an illustration, we repeat the example from the ANOCOVA chapter, where we relate the maximal ADL score to the MMSE score on the first day, and to the occurrence of complications.
• Furthermore, we want to take into account that the living condition has been found to be an important factor to explain ADL (the last living condition will be deleted, as before).
• Finally, we want to correct for the fact that not all patients are of the same age, and we allow the relationship between Age and ADL to be quadratic.
• We end up with a model containing the following effects:
. MMSE (day 1): main effect of MMSE
. Gen: main effect of complication status
. Woonsi: main effect of living condition
. MMSE*Gen: interaction of complication status and MMSE
. MMSE*Woonsi: interaction of living condition and MMSE
. Woonsi*Gen: interaction of living condition and complication status
. Leeftijd and Leeftijd2: correction for age
• It is then extremely important to make a clear distinction between covariates and (categorical) factors:
• Furthermore, the model needs to be specified:
• We obtain the following ANOVA table with the tests of the individual effects:
• As an informal check of the model specification, we can verify whether the degrees of freedom satisfy the rules:
. 1 df for each covariate (Leeftijd, Leeftijd2, MMSE)
. r − 1 df for a factor with r levels (Gen, Woonsi)
. the product of the individual df for each interaction (Gen*Woonsi, Gen*MMSE, Woonsi*MMSE)
• Note that this is a confirmation of the fact that the last living condition has been deleted (2 df correspond to 3 groups).
• Our model clearly contains too many effects and therefore should be reduced in a step-by-step fashion.
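As an informal companion to these df rules, a small sketch (assuming, as in the example, that Gen has 2 levels and Woonsi has 3) checks the term-by-term df count:

```python
# Degrees-of-freedom rules of the general linear model:
# 1 df per covariate, r - 1 df for an r-level factor, and the
# product of the component df for each interaction term.

def factor_df(levels):
    return levels - 1

def interaction_df(*component_dfs):
    df = 1
    for d in component_dfs:
        df *= d
    return df

# Terms of the example model (assumed levels: Gen = 2, Woonsi = 3)
df_mmse = df_leeftijd = df_leeftijd2 = 1            # covariates
df_gen = factor_df(2)                               # 1
df_woonsi = factor_df(3)                            # 2 (3 groups)
df_gen_woonsi = interaction_df(df_gen, df_woonsi)   # 1 * 2 = 2
df_gen_mmse = interaction_df(df_gen, df_mmse)       # 1
df_woonsi_mmse = interaction_df(df_woonsi, df_mmse) # 2

total = (df_mmse + df_leeftijd + df_leeftijd2 + df_gen + df_woonsi
         + df_gen_woonsi + df_gen_mmse + df_woonsi_mmse)
print(total)  # 11 model df in total (intercept excluded)
```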
• As is always the case, the main effects can be interpreted only if they are not part of an interaction term.
• For the reported p-values to be correct, we first need to verify whether the underlying assumptions are satisfied.
• The assumptions have been verified and are satisfied (not reported).
• Model reduction:
. Step 1: deletion of Leeftijd2:
. Step 2: deletion of Leeftijd:
. Step 3: deletion of Gen*MMSE:
. Step 4: deletion of Woonsi*MMSE:
• Although the interaction between complication status and living condition is not significant at the 5% level (p = 0.0776 > 0.05), there is still some evidence for the presence of an interaction.
• If we accept this model as final, then we reach the following conclusions:
. There is no need for an age correction
. The relationship between maximal ADL and the MMSE on day 1 depends neither on living condition nor on the occurrence of complications.
. The effect of the occurrence of complications on maximal ADL depends on the living condition of the patient prior to intake.
• This can be displayed graphically by saving the predicted ADL values in a data set (as with the residuals), and then plotting them for various combinations of factors in the model.
• Result:
• Hence, we see that patients with complications are, generally speaking, more dependent than those without.
• Furthermore, we see that the effect of a complication is much larger among those living alone than among the other two groups. This is one of the aspects of the interaction between living condition and complication status.
• Had we decided to remove the interaction term from the model, other predictions would have been obtained.
• Graphically:
• We now see that, indeed, the effect of a complication is equally large in the three living conditions.
• On the other hand, the difference between the three living conditions is independent of complications having occurred or not.
20.3 The generalized linear model
• All models considered so far share a common characteristic: the outcome variable is continuous.
• This is reflected in the fact that the underlying distributional assumption is one of normality.
• Therefore, the models are termed ‘linear models.’
• For example, when one wants to analyze a binary variable as a function of covariates and/or factors, then linear models cannot be used.
• Generalized linear models are designed to generalize the linear models to cases where the outcome variable is no longer continuous and no longer normally distributed.
• The most frequently used model here is logistic regression, which allows for the analysis of binary outcomes.
• This model will be discussed later.
Chapter 21
Regression notation of a general linear model
. Introduction
. Factor with two levels
. Factor with more than two levels
. ANOCOVA model
. Examples from the biomedical literature
21.1 Introduction
• Obviously, ANOVA, regression, and ANOCOVA models are very much related:
. Special cases of the general linear model
. Based on similar assumptions (normality, constant variance)
. Similar diagnostics (model checking, influence analysis)
• One can show that all models in the general linear model family can be written as regression models, for an appropriate selection of covariates.
• Hence, from a mathematical point of view, all models are the same.
• In many publications the regression notation is used to present results (see later).
21.2 Factor with two levels
• Let us consider a simple linear regression model:
Yi = β0 + β1Xi + εi, εi ∼ N (0, σ2)
• Furthermore, let us consider the special case where Xi can take two values only, 0 and 1.
• This subdivides our sample in two subsets:
. Observations with Xi = 0
. Observations with Xi = 1
• Graphically:
[Figure: scatter plot of the observations with the fitted simple regression line; X takes the values 0 and 1 only]
• For both subsets, the following distributional assumptions hold:
Yi = β0 + β1Xi + εi =
β0 + εi, if Xi = 0
β0 + β1 + εi, if Xi = 1
=
µ1 + εi, if Xi = 0
µ2 + εi, if Xi = 1,
with µ1 = β0 and µ2 = β0 + β1.
• Hence, the following assumptions are made:
Yi ∼ N (µ1, σ2), if observation i from group Xi = 0
Yi ∼ N (µ2, σ2), if observation i from group Xi = 1
• Hence, the model coincides with the statistical model behind the two-sample t-test:
[Figure: two normal densities with common variance on the Y scale, centered at µ1 for the group Xi = 0 and at µ2 for the group Xi = 1]
• Note also that the null-hypothesis tested in the regression model,
H0 : β1 = 0, versus HA : β1 ≠ 0,
is equivalent to the null-hypothesis tested in a t-test procedure:
H0 : µ1 = µ2, versus HA : µ1 ≠ µ2.
• This shows that a t-test can be considered a special case of linear regression, i.e., a linear regression with binary covariate X.
• Note also that the assumptions are identical:
Regression −→ t-test
Normal errors Normal errors
Constant variance Equal variance
Linearity
• Note that the linearity assumption is satisfied automatically if the covariate X can take two values only, which is why no linearity assumption was present in our earlier discussion of the t-test.
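A minimal numerical sketch (with toy data, not the course data set) confirms that the fitted slope of a regression on a binary covariate equals the difference in group means:

```python
import numpy as np

# Toy data: X is a binary group indicator, Y is continuous.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([5.0, 6.0, 5.5, 6.5, 8.0, 9.0, 7.5, 8.5])

# Fit Y = b0 + b1 * X by least squares.
X = np.column_stack([np.ones_like(x), x])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

mean0 = y[x == 0].mean()   # group mean for X = 0  ->  b0
mean1 = y[x == 1].mean()   # group mean for X = 1  ->  b0 + b1

print(b0, b1)  # intercept = mean of group 0, slope = mean1 - mean0
```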
21.3 Factor with more than two levels
• The same idea can be used to write any one-way ANOVA model as a regression model
• A factor with two levels is modeled using one dummy variable:
Xi
Group 1: 0
Group 2: 1
• The regression model fitted then equals:
Yi = β0 + β1Xi + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
• A factor with three levels is modeled using two dummy variables:
X1i X2i
Group 1: 0 0
Group 2: 1 0
Group 3: 0 1
• The ANOVA model testing equality of means of the three groups is equivalent to the multiple regression model
Yi = β0 + β1X1i + β2X2i + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
β0 + β2 + εi, if Group 3
• Testing equality of all three means is equivalent to testing H0 : β1 = β2 = 0, which is done with the overall F-test reported in the ANOVA table of the multiple regression analysis.
• Group 1 is often called the reference group to which the other groups are compared. The average differences are β1 and β2 for groups 2 and 3, respectively, relative to group 1.
• Note that the choice of the reference group is not unique.
• Selecting another reference group would not affect the p-value for the comparison of all groups.
• Selecting another reference group would affect the interpretation of the parameters β1 and β2.
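A short sketch with toy data (not the course data) illustrates the dummy coding and the interpretation of the parameters relative to the reference group:

```python
import numpy as np

# Three groups, coded with two dummies (Group 1 = reference group).
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])
y = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 2.0, 3.0, 4.0])

x1 = (group == 2).astype(float)   # dummy for Group 2
x2 = (group == 3).astype(float)   # dummy for Group 3
X = np.column_stack([np.ones(len(y)), x1, x2])
(b0, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)

m1, m2, m3 = (y[group == g].mean() for g in (1, 2, 3))
print(b0, b1, b2)  # b0 = m1, b1 = m2 - m1, b2 = m3 - m1
```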
• A factor with four levels is modeled using three dummy variables:
X1i X2i X3i
Group 1: 0 0 0
Group 2: 1 0 0
Group 3: 0 1 0
Group 4: 0 0 1
• The ANOVA model testing equality of means of the four groups is equivalent to the multiple regression model
Yi = β0 + β1X1i + β2X2i + β3X3i + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
β0 + β2 + εi, if Group 3
β0 + β3 + εi, if Group 4
• In general, a factor with r levels is modeled using r − 1 dummy variables
• This explains why, in the general linear model, r − 1 degrees of freedom are associated with a factor with r levels, while only one degree of freedom is associated with a covariate.
• Indeed, a factor with r levels is implicitly identical to r − 1 covariates.
• The same principle can be applied in models with multiple factors
• Interactions between two factors, with r and s levels, are obtained by adding the products of all r − 1 dummy variables for the first factor with all s − 1 dummy variables of the second factor, leading to (r − 1)(s − 1) additional covariates in the regression model.
• In the biomedical literature, the regression analogue of ANOVA models is often used to report differences between groups (see later).
21.4 ANOCOVA model
• The same idea can be used to write an ANOCOVA model as a linear regression model.
• Let us consider a model with covariate Xi and a factor with three levels.
• As before, we introduce two dummy variables to replace the factor:
X1i X2i
Group 1: 0 0
Group 2: 1 0
Group 3: 0 1
• The ANOCOVA model without interaction is equivalent to
Yi = β0 + β1X1i + β2X2i + β3Xi + εi =
β0 + β3Xi + εi, if Group 1
β0 + β1 + β3Xi + εi, if Group 2
β0 + β2 + β3Xi + εi, if Group 3
• Hence, the model indeed assumes, for each group, a linear relation between the outcome Y and the covariate X.
• The slope is the same for all three groups (parallel lines ≡ no interaction)
• The intercepts are allowed to be different for the groups (main group effect)
• Graphically:
[Figure: three parallel regression lines of Y on X, one per group, with intercepts shifted by β1 and β2 relative to group 1]
• The ANOCOVA model with interaction is obtained by adding the product between the dummy variables and the covariate:
Yi = β0 + β1X1i + β2X2i + β3Xi + β4X1iXi + β5X2iXi + εi
=
β0 + β3Xi + εi, if Group 1
β0 + β1 + (β3 + β4)Xi + εi, if Group 2
β0 + β2 + (β3 + β5)Xi + εi, if Group 3
• Hence, the model indeed assumes, for each group, a linear relation between the outcome Y and the covariate X.
• The slope is not the same for all three groups (no parallel lines ≡ interaction)
• The intercepts are allowed to be different for the groups (main group effect)
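A small sketch with artificially constructed, noise-free toy data shows how the group-specific slopes β3, β3 + β4 and β3 + β5 are recovered from the interaction model:

```python
import numpy as np

# Construct data that follow the interaction model exactly:
# Y = b0 + b1*X1 + b2*X2 + b3*X + b4*X1*X + b5*X2*X
b_true = np.array([1.0, 0.5, -0.5, 2.0, 1.0, -1.0])

x = np.tile(np.arange(5, dtype=float), 3)     # covariate X
group = np.repeat([1, 2, 3], 5)               # three groups
x1 = (group == 2).astype(float)               # dummy for Group 2
x2 = (group == 3).astype(float)               # dummy for Group 3
X = np.column_stack([np.ones(15), x1, x2, x, x1 * x, x2 * x])
y = X @ b_true                                # noise-free outcome

b, *_ = np.linalg.lstsq(X, y, rcond=None)
slopes = {1: b[3], 2: b[3] + b[4], 3: b[3] + b[5]}
print(slopes)  # group-specific slopes: b3, b3 + b4, b3 + b5
```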
• Graphically:
[Figure: three non-parallel regression lines of Y on X, one per group, with different intercepts and different slopes]
21.5 Examples from the biomedical literature
• Ausili et al. [20], Table 3 p. 24:
Table 3
Socio-demographic and clinical determinants of self-care maintenance, self-care management and self-care confidence in heart failure patients with diabetes mellitus
(n = 379).
Variable                                   Parameter estimate (95% CI)   p-value
Determinants of self-care maintenance (n = 379); R-square = 0.34
Age                                        −0.20 (−0.40 to −0.007)       0.042
Gender                                     −1.01 (−3.83 to 1.79)         0.471
Presence of diabetes complications         −0.28 (−2.20 to 1.63)         0.771
Charlson Comorbidity Index score           −0.20 (−1.37 to 0.97)         0.735
Number of medications taken by patients    0.90 (0.15 to 1.65)           0.017
Employment status                          −0.39 (−5.60 to 4.82)         0.883
Mini Mental State Examination score        −0.02 (−0.29 to 0.24)         0.847
Presence of a caregiver                    −3.43 (−6.82 to −0.05)        0.046
Family Income                              −1.69 (−2.96 to −0.4)         0.009
Month of Illness                           0.02 (0.005 to 0.53)          0.105
New York Heart Association Class           −1.72 (−3.62 to 0.17)         0.074
Self-care confidence                       0.34 (0.26 to 0.40)           <0.001
∗ Regression parameters for factors (e.g., gender, presence of complications, etc.)
∗ Interpretation requires knowledge of the reference group
• Hahnel et al. [21]:
. Outcomes section, p. 5:
and control groups at all skin areas, except the right and left lower
leg (Supplementary online Tables 1 to 3). Results of the GLM
analysis regarding the changes in Overall Dry Skin scores at the
right lower leg in all three groups over the entire study are shown
in Table 3.
The model was adjusted for baseline Overall Dry Skin score
(right lower leg), age, nursing homes and Barthel-Index. Group I
(b = −0.643; p = 0.020) and Group II (b = −0.696; p = 0.009) had
statistical significant lower Overall Dry Skin scores compared to
the control group (Group III) over time. Visit was modelled as intra-
individual variable. In this model, the Overall Dry Skin score
decreased significantly over time (b = −1.696; p < 0.001), and
higher baseline measurements led to generally higher Overall Dry
Skin score over the entire course of time (b = 2.336; p < 0.001). The
particular nursing home was not associated with the treatment
effect.
For all other examined skin areas (left lower leg, right forearm,
. Table 3:
Table 3
Generalized linear model for the dependent variable Overall Dry Skin score at the right lower leg (n = 117).
Parameter                                         B        Std. Error   95% Wald CI        Wald Chi-Square   df   p-value
ODS = 1                                           −0.657   1.607        (−3.806, 2.493)    0.167             1    0.683
ODS = 2                                           2.071    1.601        (−1.067, 5.209)    1.673             1    0.196
ODS = 3                                           5.286    1.656        (2.040, 8.533)     10.186            1    0.001
ODS = 4                                           8.247    1.729        (4.859, 11.635)    22.762            1    <0.001
Group I                                           −0.643   0.277        (−1.186, −0.100)   5.386             1    0.020
Group II                                          −0.696   0.267        (−1.219, −0.174)   6.819             1    0.009
Group III (control)                               0.0a     .            .                  .                 .    .
Visit                                             −1.696   0.170        (−2.029, −1.363)   99.636            1    <0.001
Overall Dry Skin Score: right lower leg (Day 0)   2.336    0.2481       (1.850, 2.822)     88.683            1    <0.001
Age (years)                                       0.012    0.014        (−0.016, 0.040)    0.665             1    0.415
Barthel-Index                                     0.011    0.005        (0.002, 0.020)     5.868             1    0.015
Nursing home 1                                    −0.607   0.938        (−2.445, 1.232)    0.419             1    0.518
Nursing home 2                                    0.115    0.567        (−0.997, 1.227)    0.041             1    0.839
Nursing home 3                                    0.017    0.456        (−0.875, 0.910)    0.001             1    0.969
Nursing home 4                                    −0.138   0.408        (−0.936, 0.661)    0.114             1    0.735
Nursing home 5                                    0.539    0.463        (−0.368, 1.446)    1.357             1    0.244
Nursing home 6                                    −0.694   0.533        (−1.738, 0.350)    1.696             1    0.193
Nursing home 7                                    0.303    0.492        (−0.661, 1.267)    0.380             1    0.538
Nursing home 8                                    −0.196   0.398        (−0.977, 0.584)    0.243             1    0.622
Nursing home 9                                    0.226    0.408        (−0.573, 1.026)    0.307             1    0.579
Nursing home 10                                   0.0a     .            .                  .                 .    .
(Scale)                                           1
df: Degrees of freedom. a Set to zero.
. Two different parameterizations in the same model:
∗ Overall Dry Skin score at baseline (ODS):
Yi =
µ1 + εi, if ODS= 1
µ2 + εi, if ODS= 2
µ3 + εi, if ODS= 3
µ4 + εi, if ODS= 4,
hence no use of dummy coding with an intercept representing a reference group.
∗ Treatment group (GROUP), and similar for Barthel-Index:
Yi =
β0 + εi, if Group 3
β0 + β1 + εi, if Group 1
β0 + β2 + εi, if Group 2,
hence dummy coding with Group 3 as reference group.
Part VIII
Models for binary outcomes
Chapter 22
Simple logistic regression
. Example
. Logistic regression model
. Application
. Model diagnostics
. Influential subjects
. Odds ratio
. Examples from the biomedical literature
22.1 Example
• One of the longitudinal measures is the CAM score, measuring the extent to which patients are confused.
• This score is measured at days 1, 3, 5, 8, and 12, post-operatively. Based on these 5 measures, one can approximately assess whether the patient has been confused post-operation.
• This variable is binary (0 = not confused, 1 = confused) and is one of the responses in which researchers took a particular interest.
• Overview of the number of confused and non-confused patients:
• About 23% of patients were confused post-operation.
• At the same time, the probability of confusion may depend on factors such as age, whether or not the patient is neuro-psychiatric, . . .
=⇒ Logistic regression
22.2 The logistic regression model
• Assume that we want to study whether confusion is related to the age of the patient.
• We then have, for each patient, a pair (xi, yi) of measures:
. xi: the age of the ith patient
. yi: confusion status: 0 = has not been confused, 1 = has been confused
• A first way to describe the relationship between xi and yi would be a linear regression model:
Yi = β0 + β1xi
• Graphically:
• The graph points to some problems:
. The discrete nature of the response implies that the observed data are poorly described by the regression line, implying a low R2.
. The predicted values of the response can take every real value, which is entirely meaningless given that the response Y can assume the values 0 and 1 only.
• The first problem is solved by relating age to the probability of confusion, rather than confusion itself:
P (Yi = 1) = β0 + β1xi
• This implies that every value between 0 and 1 is meaningful as a prediction.
• Graphically:
• To further impose that the predicted probabilities would not be larger than 1 or smaller than 0, the linear relationship is replaced by a so-called logistic relationship:
P (Yi = 1) = exp(β0 + β1xi) / [1 + exp(β0 + β1xi)]
• The relationship between the probability of confusion and age is then S-shaped:
. Approximately linear in the middle
. Leveling off near the extremes
• The above model is a logistic regression.
• Graphically:
• In practice, fitting this model comes down to finding estimates of β0 and β1, such that the corresponding logistic curve describes the data best.
• The numerical method to compute these estimates will not be discussed further.
• As with simple linear regression, the logistic regression model contains two parameters.
• Intercept β0: captures the horizontal displacement of the curve. The larger the intercept, the larger the probability of a ‘success,’ which means that the regression curve shifts to the left when β0 increases.
• Slope β1: describes how strongly the chance of a ‘success’ changes as a function of the covariate X. The logistic curve increases if β1 > 0, and decreases if β1 < 0. The larger |β1|, the stronger the increase or decrease.
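The role of β0 and β1 can be explored with a minimal sketch of the logistic curve (illustrative values only, not the fitted model):

```python
import math

def logistic(x, b0, b1):
    """P(Y = 1) under the simple logistic regression model."""
    return math.exp(b0 + b1 * x) / (1.0 + math.exp(b0 + b1 * x))

xs = [x / 10.0 for x in range(-100, 101)]
ps = [logistic(x, 0.0, 1.0) for x in xs]      # b1 > 0: increasing S-curve

assert all(0.0 < p < 1.0 for p in ps)         # probabilities stay in (0, 1)
assert all(p1 <= p2 for p1, p2 in zip(ps, ps[1:]))  # monotone increasing
print(logistic(0.0, 0.0, 1.0))                # 0.5 at the curve's midpoint
```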
• Graphically:
• Graphically (β1 > 0):
• Graphically (β1 < 0):
• Note that, up to now, the probability P (Y = 1) was modeled as a function of a covariate X. A fully equivalent model is obtained by modeling P (Y = 0):
P (Yi = 0) = 1 − P (Yi = 1)
           = 1 − exp(β0 + β1xi) / [1 + exp(β0 + β1xi)]
           = 1 / [1 + exp(β0 + β1xi)]
           = 1 / {exp(β0 + β1xi) [1 + exp(−β0 − β1xi)]}
           = exp(−β0 − β1xi) / [1 + exp(−β0 − β1xi)]
• Hence, if the probability of a ‘failure’ is modeled, we obtain the same result as when modeling the probability of a ‘success,’ but with opposite sign for the regression coefficients.
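This sign-flip equivalence can be checked numerically (the rounded estimates from the confusion example are used purely as illustrative values):

```python
import math

def p_success(x, b0, b1):
    """P(Y = 1) under a simple logistic regression model."""
    return math.exp(b0 + b1 * x) / (1.0 + math.exp(b0 + b1 * x))

b0, b1 = -10.30, 0.11   # illustrative (rounded) coefficients
for x in (60.0, 75.0, 90.0):
    # Modeling 'failure' with flipped signs gives 1 - P(success).
    lhs = 1.0 - p_success(x, b0, b1)
    rhs = p_success(x, -b0, -b1)
    assert abs(lhs - rhs) < 1e-12
print("sign-flip identity verified")
```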
22.3 Application
• In most statistical software packages, model specification is completely analogous to the specification of the general linear model, but within a module for logistic regression.
• Often, logistic regression is implemented within the generalized linear models environment.
• In order to correctly interpret results, it is important to check whether P (Y = 0) is modeled, or P (Y = 1).
• Most software packages allow specification of your preference (P (Y = 0) or P (Y = 1)), and they all provide information about which model was fitted.
• For example, the SAS software package gives the following notification:
• In our analyses, P (no confusion) is modeled, implying that ‘no confusion’ is treated as ‘success’.
• The parameter estimates are given by:
• We observe a significant (p = 0.0137) relationship between age and occurrence of confusion. The logistic curve is described by the equation:
P (not confused) = exp(10.30 − 0.11 × age) / [1 + exp(10.30 − 0.11 × age)]
or, equivalently,
P (confused) = exp(−10.30 + 0.11 × age) / [1 + exp(−10.30 + 0.11 × age)]
• Hence, the probability of confusion increases with age (0.11 > 0).
• The above equation can now be used to predict, based on age, the probability of confusion:
Age        P (confused)
65 years   exp(−10.30 + 0.11 × 65) / [1 + exp(−10.30 + 0.11 × 65)] = 0.05019
75 years   exp(−10.30 + 0.11 × 75) / [1 + exp(−10.30 + 0.11 × 75)] = 0.14095
85 years   exp(−10.30 + 0.11 × 85) / [1 + exp(−10.30 + 0.11 × 85)] = 0.33751
95 years   exp(−10.30 + 0.11 × 95) / [1 + exp(−10.30 + 0.11 × 95)] = 0.61268
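A small sketch reproduces this kind of prediction from the rounded estimates; because the slide's table was computed with the unrounded estimates, the exact values differ slightly:

```python
import math

def p_confused(age):
    """Predicted confusion probability, using the rounded estimates
    b0 = -10.30 and b1 = 0.11 (the unrounded fitted values would give
    slightly different probabilities)."""
    eta = -10.30 + 0.11 * age
    return math.exp(eta) / (1.0 + math.exp(eta))

for age in (65, 75, 85, 95):
    print(age, round(p_confused(age), 4))
# The predicted probability of confusion increases with age.
```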
• We can request a table with the predicted probabilities for all individuals:
• Note that these predictions take the form of the probability of no confusion, since this was the probability modeled in our analysis.
• Graphical representation of the predicted probabilities:
• Note that we see only part of the S-curve: only those values of age for which there are actual observations:
• Extrapolated curve:
22.4 Model diagnostics
• Each statistical model is based on a number of assumptions regarding the data collected. This is true for the simple logistic regression model as well.
• In our example, we assumed that the response variable (confusion status) followed a Bernoulli distribution, where the probability of ‘success’ is described by a logistic curve:
P (confused) = exp(β0 + β1 × age) / [1 + exp(β0 + β1 × age)]
• As always, when the assumptions are not satisfied, erroneous results may follow.
22.4.1 The deviance statistic
• Every logistic regression is accompanied by a table with so-called ‘Goodness-of-fit’ statistics:
• Each statistic is some measure for the total distance between the observations and the predicted probabilities.
• A well-fitting model satisfies:
. All values Y = 1 get (very) high predictions for P (Y = 1)
. All values Y = 0 get (very) low predictions for P (Y = 1)
• We will restrict interpretation to the deviance, the most popular measure.
• The smaller the deviance, the better the model describes the data.
• As a rule of thumb, models with a deviance less than the number of associated degrees of freedom (DF) are considered well fitting. This comes down to a value of Value/DF smaller than 1.
• The DF is the number of observations in the data set minus the number of parameters in the model, in our case 60 − 2 = 58.
• This way, the deviance is corrected for the sample size and penalized for the complexity of the model, which is of particular interest in multiple logistic regression models.
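A minimal sketch of the deviance computation (with made-up fitted probabilities, not the course output) illustrates the Value/DF rule of thumb:

```python
import math

# Deviance of a fitted logistic model: -2 times the Bernoulli
# log-likelihood of the observations, judged against its DF.
y = [1, 1, 1, 0, 0, 1, 0, 1]                    # observed 0/1 responses
p = [0.9, 0.8, 0.7, 0.2, 0.3, 0.6, 0.4, 0.7]    # fitted P(Y = 1)

deviance = -2.0 * sum(
    yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    for yi, pi in zip(y, p)
)
n_obs, n_params = len(y), 2       # intercept + one slope
df = n_obs - n_params
print(deviance, deviance / df)    # rule of thumb: Value/DF < 1
```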
22.4.2 Pearson residuals
• As with regression and ANOVA models, residuals can be calculated for logistic regression models as well.
• A residual measures how well the response for a given individual is predicted by the regression curve.
• Hence, a residual should be close to zero if an observation Y = 1 has a large predicted value for P (Y = 1), or if an observation Y = 0 has a low predicted value for P (Y = 1); it should be far from zero otherwise.
• In the context of logistic regression, various types of residuals are defined. We will restrict ourselves here to the so-called Pearson residuals. They are denoted, in analogy with regression and ANOVA, by ei.
• The residual ei is a measure for how far the observed response (0 or 1) is separated from the predicted probability of ‘success’:
ei = [yi − P (Yi = 1)] / √{P (Yi = 1)[1 − P (Yi = 1)]}
• The denominator is necessary to ensure that all ei have comparable variance, such that one does not conclude bad prediction only because yi − P (Yi = 1) is estimated with a lot of uncertainty.
• In our example, this comes down to comparing confusion status with the predicted probability of being confused.
• Most software packages calculate the Pearson residuals by default.
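As a check, a small sketch reproduces (up to rounding) the Pearson residual of one of the confused outliers reported later (fitted P(confusion) = 0.11576; recall that ‘success’ here is ‘no confusion’, so this patient has y = 0):

```python
import math

def pearson_residual(y, p):
    """Pearson residual for a 0/1 outcome y with fitted P(Y = 1) = p."""
    return (y - p) / math.sqrt(p * (1.0 - p))

# Patient #18 from the slides: 'success' means 'no confusion',
# fitted P(confusion) = 0.11576, and the patient was confused (y = 0).
p_success = 1.0 - 0.11576
e = pearson_residual(0, p_success)
print(round(e, 5))   # close to the -2.76382 reported in the slides
```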
• A scatter plot of the Pearson residuals as a function of the covariate Age:
• Note that the graph does show a certain form of systematic pattern.
• In general, we even expect that, with a strongly predictive model, a large systematic trend will be observed in a scatter plot of the Pearson residuals versus a model covariate:
. All non-confused patients have a positive residual:
1 − P (Yi = 1) ≥ 0
. A very large positive residual corresponds to a patient for whom we expect confusion (relatively large predicted probability in favor of confusion), but who happened not to be confused nevertheless.
. All confused patients have a negative residual:
0 − P (Yi = 1) ≤ 0
. A very negative residual corresponds to a patient for whom we do not expect confusion (relatively large predicted probability in favor of non-confusion), but who happened to be confused nevertheless.
. In a strongly predictive model, we therefore expect to see positive residuals for small covariate values and negative residuals for large covariate values, or vice versa.
• Therefore, in practice, no scatter plot is made of Pearson residuals versus the covariates (or predicted values).
• However, an index plot can be made of the Pearson residuals, to detect which observations fit the model poorly:
• We observe that the negative values are largest in absolute value.
• Furthermore, there are a number of outliers:
Patient Age P (Confusion) Pearson residual
#16 83 0.28902 −1.56841
#18 73 0.11576 −2.76382
#33 83 0.28902 −1.56841
#34 83 0.28902 −1.56841
#38 72 0.10466 −2.92493
#51 83 0.28902 −1.56841
#52 77 0.17080 −2.20338
#53 84 0.31285 −1.48202
• All of these patients had a small predicted probability of confusion, and happened to be confused nevertheless.
• A careful analysis of these subjects established that all of them were neuro-psychiatric.
• This can be represented graphically by plotting the residuals again, but with a symbol specific to each of the two neuro-groups:
• Hence, we see that all confused neuro-psychiatric patients have a very negative residual, which implies that our model does not describe these patients adequately.
• The probability of confusion among confused neuro-psychiatric patients is systematically underestimated.
• This points to the need for multiple logistic regression, which allows several covariates to be included simultaneously when predicting confusion.
Advanced statistical methods 548
22.5 Influential observations
• Exactly as with regression, ANOVA, and ANOCOVA, it is possible to examine the influence of each individual separately on the results obtained.
• One can again make use of Cook’s distance, obtained in exactly the same way as earlier with linear models.
• For observations with relatively large influence, the analysis can again be conducted with and without these observations.
• This too is done in full analogy with the way it is done with linear models.
Advanced statistical methods 549
22.6 Odds ratio
• In a linear regression model
Yi = β0 + β1xi,
we have that the effect (on average) of a one-unit increase in the covariate X from X = x to X = x + 1 equals β1, and is independent of x.
• This does not directly generalize to the logistic regression model
P (Yi = 1) = exp(β0 + β1xi) / [1 + exp(β0 + β1xi)],
unless ‘risk’ is expressed in terms of the odds rather than the probability.
Advanced statistical methods 550
• The odds of observing a success equals:
Odds(Yi = 1) = P (Yi = 1) / [1 − P (Yi = 1)]
= { exp(β0 + β1xi) / [1 + exp(β0 + β1xi)] } / { 1 / [1 + exp(β0 + β1xi)] }
= exp(β0 + β1xi)
• The odds can be interpreted as a different scale for quantifying risk:
. Odds(Yi = 1) increases (decreases) if P (Yi = 1) increases (decreases)
. There is a 1-1 relation between Odds(Yi = 1) and P (Yi = 1):
P (Yi = 1) ←→ Odds(Yi = 1)
0.10 ←→ 1/9
0.25 ←→ 1/3
0.50 ←→ 1
0.75 ←→ 3
0.90 ←→ 9
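This one-to-one relation between probability and odds can be verified numerically; a minimal Python sketch (the helper names are ours, for illustration):

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)

def prob(o):
    """Probability corresponding to odds o."""
    return o / (1 + o)

# Reproduce the table: P(Y = 1) <-> Odds(Y = 1)
for p, o in [(0.10, 1 / 9), (0.25, 1 / 3), (0.50, 1.0), (0.75, 3.0), (0.90, 9.0)]:
    assert abs(odds(p) - o) < 1e-12
    assert abs(prob(o) - p) < 1e-12
```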
Advanced statistical methods 551
• The odds ratio is the relative change in odds, for a one-unit increase in the covariate X from X = x to X = x + 1:
OR = Odds(Y = 1|X = x + 1) / Odds(Y = 1|X = x)
= exp(β0 + β1(x + 1)) / exp(β0 + β1x) = exp(β1).
• Hence, the relative change in risk (measured on the odds scale), associated with a one-unit increase in the covariate, from X = x to X = x + 1, is independent of x.
• Therefore, many publications report the OR, i.e., exp(β1), rather than the slope β1 to express the effect of a covariate.
• Finally, the null-hypothesis H0 : β1 = 0 translates to H0 : OR = 1, hence confidence intervals for OR are compared to the value 1 rather than 0.
• Note that correct interpretation of the OR requires knowledge of the reference category.
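The fact that the OR does not depend on x can also be checked numerically. In the sketch below, β0 and β1 are hypothetical values chosen only for illustration, not fitted values from the course example:

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1 = -10.0, 0.12

def odds(x):
    """Odds of success under the logistic model, at covariate value x."""
    p = math.exp(b0 + b1 * x) / (1 + math.exp(b0 + b1 * x))
    return p / (1 - p)

# The OR for a one-unit increase equals exp(b1), whatever x is:
for x in (60, 70, 80):
    assert abs(odds(x + 1) / odds(x) - math.exp(b1)) < 1e-9
```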
Advanced statistical methods 552
22.7 Examples from the biomedical literature
• Silva et al. [22]
. Statistical analysis section, p.637:
Univariate and multivariate stepwise logistic regression analysis of the data was also performed. Gender, age and BMI (body mass index) were defined as antecedent variables, and two analytical models were then developed: (I) a model using the number of comorbidities and red cell indices individually as independent variables and (II) a model that also included inflammatory cytokine and erythropoietin levels. In both models we used the frailty classification and each of the defining criteria as outcome, and a significance level of 5% (p < 0.05) was used with a 95% confidence interval.
Advanced statistical methods 553
. Table 3, p.639:
Table 3
Univariate analysis of variables of interest associated with frailty. FIBRA Study
(n = 255).
Variable ORa (95% CI OR) p value*
Female gender 4.94 1.081–22.580 0.039
Age 0.84 0.744–0.970 <0.001
RBC 0.13 0.032–0.542 0.005
Hb 0.40 0.259–0.650 <0.001
HTC 0.77 0.660–0.913 0.008
MCV 1.00 0.900–1.127 0.891
RDW 2.15 1.263–3.683 0.019
RetAbs 0.99 0.958–1.026 0.871
Serum EPO 1.01 0.998–1.034 0.178
Serum hsCRP 4.83 1.776–13.15 0.008
Serum IL-1RA 1.00 1.000–1.003 0.081
Serum IL-6 1.02 0.910–1.143 0.458
a Ref: Non-frail. * Significance level p < 0.05.
Advanced statistical methods 554
. Interpretation:
∗ OR rather than regression parameters
∗ ORs calculated with ‘non-frailty’ group as reference
∗ Regression dummy coding for factor gender
∗ For example, the odds of belonging to the frailty group are almost five times larger for females than for males (p = 0.039)
Advanced statistical methods 555
• Moon and Lee [23]
. Analysis section p.1427:
2.9. Analysis
The Chi-square and t-tests were used for comparing the demographics and clinical characteristics between the intervention and the control groups. The effects of protocol application on delirium incidence, mortality, and re-admission to the ICU during the same hospitalization period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were
Advanced statistical methods 556
. Table 3, p.1429:
Table 3
Effects of the delirium prevention protocol on the patient outcomes.

Outcomes                                        Univariate logistic/linear regression
                                                OR/HR (CI)         b      SE
Episodes of delirium^a                          0.50 (0.22–1.14)
In-hospital mortality^a                         0.28 (0.08–0.90)
7-day in-hospital mortality^b                   0.10 (0.01–0.83)
30-day in-hospital mortality^b                  0.34 (0.10–1.13)
ICU re-admission during same hospitalization^a  0.28 (0.07–1.07)
ICU length of stay^c                                               0.80   1.95

CI = confidence interval; ICU = intensive care unit; HR = hazard ratio; OR = odds ratio.
^a Logistic regression. ^b Cox proportional hazards regression. ^c Linear regression.
Advanced statistical methods 557
. Interpretation of results:
∗ Multiple types of analyses in one table (logistic, linear, Cox regression)
∗ OR rather than regression parameters for logistic regression models
∗ No explicit mentioning of reference category used in the calculation of the OR.
∗ Most likely: Mortality / No mortality
∗ Regression dummy coding for factor intervention
∗ No explicit mentioning of reference group for the factor
∗ Most likely the control group is used as reference
∗ For example, applying the prevention protocol reduces the odds for in-hospitalmortality by a factor 0.28 (C.I.:[0.08; 0.90]).
Advanced statistical methods 558
Chapter 23
Multiple logistic regression
. Example
. Application
. Model diagnostics
. Influential observations
. Odds ratio
. Examples from the biomedical literature
Advanced statistical methods 559
23.1 Example
• With simple logistic regression, we found evidence to believe that the logistic regression model, used to relate the probability of confusion to age, systematically underestimated the probability of confusion for neuro-psychiatric patients.
• This suggests that we ought to conduct a separate logistic regression for each of the neuro-groups.
• Exactly as with linear regression and ANOVA, simple logistic regression can be extended to situations where the response variable is related to several covariates (as with multiple regression), multiple factors (as with multiple ANOVA), or several covariates and factors (as with ANOCOVA).
• Also now, potential interactions can be included in the model.
Advanced statistical methods 560
23.2 Application
• Result from fitting a logistic regression with covariate age and factor neuro status:
• Inclusion of the interaction implicitly implies two separate logistic models for the two neuro groups.
Advanced statistical methods 561
• A graphical representation of the estimated regression curves, with group-specific symbols:
Advanced statistical methods 562
• The graph suggests that, for both neuro-groups, the probability of no confusion (confusion) decreases (increases) with age.
• Further, for each age, the probability of confusion is larger for neuro- than for non-neuro-patients.
• The apparently stronger relationship between confusion and age for the non-neuro patients than for the neuro patients is not significant (p = 0.1982). In other words, one can assume that the effect of age on confusion is the same for both neuro groups.
• A logistic regression model that explicitly makes this assumption can be estimated by removing the interaction term from the model.
Advanced statistical methods 563
• Result:
• Hence, each of the two terms is significant, after correction for the other:
. There is a significant effect of age on confusion, after correction for neuro status (p = 0.0110). In other words, for both the neuro and the non-neuro patients, there is a significant relationship between age and the probability of confusion.
. There is a significant effect of neuro-status, after correction for age (p = 0.0017). In other words, for patients of a given age, there is a significant difference between the neuro groups concerning the probability of being confused.
Advanced statistical methods 564
• Based on the above model, we obtain the following graph for the predicted probability of not being confused:
Advanced statistical methods 565
• Given that our model no longer contains an interaction of neuro status with age, the logistic models have the same slope and only differ in the intercepts, hence one S-shaped curve is a horizontal translation of the other S-shaped curve.
• This is clear if we extrapolate both curves to ages outside of the range of observed ages:
Advanced statistical methods 566
• As a general conclusion, we can state that both neuro status and age have an influence on the confusion status, independently of each other:
. The probability of confusion increases with age.
. The probability of confusion is larger among neuro-psychiatric patients than among non-neuro-psychiatric patients.
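As an illustration of how such a two-term logistic model is fitted by maximum likelihood, the sketch below implements the standard Newton-Raphson (IRLS) algorithm in Python. The data, sample size, and coefficients are simulated assumptions playing the role of the confusion example; they are not the course's data or fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: centered age plus a binary neuro indicator,
# with assumed 'true' coefficients (intercept, age, neuro)
n = 4000
age_c = rng.uniform(-15, 15, n)              # age centered around its mean
neuro = rng.integers(0, 2, n).astype(float)
beta_true = np.array([-0.75, 0.15, 1.0])
X = np.column_stack([np.ones(n), age_c, neuro])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true))).astype(float)

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-np.clip(X @ beta, -30, 30)))
        W = p * (1 - p)                      # weights of the IRLS step
        # Newton step: beta += (X' W X)^{-1} X' (y - p)
        beta = beta + np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

beta_hat = fit_logistic(X, y)
# exp(beta_hat[2]) estimates the age-adjusted odds ratio for neuro status
```

With this many simulated observations, the estimates land close to the assumed coefficients, and exponentiating them gives the corresponding odds ratios.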
Advanced statistical methods 567
23.3 Model diagnostics
23.3.1 The deviance statistic
• For our logistic regression model designed to predict confusion based on age and neuro status, without interaction between both effects, the table with ‘goodness-of-fit’ statistics is:
Advanced statistical methods 568
• Compared to the deviance for our original simple logistic model, the number of degrees of freedom has been reduced from 58 to 57, because the model contains one additional parameter.
• The ratio of the deviance statistic over its degrees of freedom now is 0.7821, which implies that our model seems to predict most observations reasonably well.
• In our simple model, with age as the sole effect, this ratio equaled 0.9971, leading to the conclusion that adding neuro status considerably improved our model. This confirms what we expected based on the residual plot from simple logistic regression.
Advanced statistical methods 569
23.3.2 Pearson residuals
• Index plot of the Pearson residuals, with neuro-group-specific symbols:
• We now see a more balanced spread of the residuals; also, neuro-patients no longer exhibit systematically deviating residuals.
Advanced statistical methods 570
23.4 Influential observations
• As before, Cook’s distance can be calculated for each observation, measuring the effect of removing that observation.
• For observations with relatively large influence, the analysis can be conducted with and without these observations.
• This is done in full analogy with the analyses discussed before.
Advanced statistical methods 571
23.5 Odds ratio
• Suppose a multiple logistic regression model with two covariates X1 and X2 has been fitted:
P (Yi = 1) = exp(β0 + β1x1i + β2x2i) / [1 + exp(β0 + β1x1i + β2x2i)].
• In full analogy to the simple logistic regression model, the odds of observing a success equals:
Odds(Yi = 1) = P (Yi = 1) / [1 − P (Yi = 1)]
= { exp(β0 + β1x1i + β2x2i) / [1 + exp(β0 + β1x1i + β2x2i)] } / { 1 / [1 + exp(β0 + β1x1i + β2x2i)] }
= exp(β0 + β1x1i + β2x2i)
Advanced statistical methods 572
• In order to express the change in risk with a one-unit increase in covariate X1 from X1 = x1 to X1 = x1 + 1, while keeping the other covariate fixed, the OR can be used:
OR = Odds(Y = 1|X1 = x1 + 1, X2 = x2) / Odds(Y = 1|X1 = x1, X2 = x2)
= exp(β0 + β1(x1 + 1) + β2x2) / exp(β0 + β1x1 + β2x2) = exp(β1).
• Hence, the relative change in risk (measured on the odds scale), associated with a one-unit increase in the covariate, from X1 = x1 to X1 = x1 + 1, is not only independent of x1, but also of x2.
• This is in full analogy to the interpretation of a regression coefficient in a multiple linear regression model.
• Caution is needed, however, in polynomial models or models with interactions, where individual regression parameters cannot always be interpreted on their own.
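Numerically, the adjusted OR can be checked to be independent of both x1 and x2. The coefficients below are hypothetical, chosen only for illustration:

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1, b2 = -9.0, 0.10, 1.2

def odds(x1, x2):
    """Odds of success under the two-covariate logistic model."""
    return math.exp(b0 + b1 * x1 + b2 * x2)

# The OR for a one-unit increase in X1 equals exp(b1),
# regardless of x1 AND regardless of the value held for x2:
for x1 in (60.0, 75.0):
    for x2 in (0.0, 1.0):
        assert abs(odds(x1 + 1, x2) / odds(x1, x2) - math.exp(b1)) < 1e-9
```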
Advanced statistical methods 573
23.6 Examples from the biomedical literature
• Silva et al. [22]
. Statistical analysis section, p.637:
Univariate and multivariate stepwise logistic regression analysis of the data was also performed. Gender, age and BMI (body mass index) were defined as antecedent variables, and two analytical models were then developed: (I) a model using the number of comorbidities and red cell indices individually as independent variables and (II) a model that also included inflammatory cytokine and erythropoietin levels. In both models we used the frailty classification and each of the defining criteria as outcome, and a significance level of 5% (p < 0.05) was used with a 95% confidence interval.
Advanced statistical methods 574
. Table 4, p.639:
Table 4
Multivariate analysis of variables of interest associated with frailty status (Model I).
FIBRA study (n = 255).
Variable ORa (95% CI OR) p value*
Model I
Outcome: Frailty
Age 1.16 (1.065–1.277) 0.001
Hb 0.493 (0.270–0.899) 0.021
Weight loss
Hb 0.68 (0.490–0.956) 0.026
RetAbs 0.97 (0.946–0.997) 0.029
Fatigue
No statistically significant difference
Grip strength
Age 1.11 (1.049–1.174) <0.001
Physical activity
Age 1.07 (1.023–1.133) 0.005
RetAbs 1.02 (1.003–1.044) 0.022
Gait speed
Age 1.09 (1.043–1.157) <0.001
a Ref: Non-frail. * Significance level p < 0.05.
Advanced statistical methods 575
. Interpretation:
∗ Several outcomes: Frailty and several defining criteria
∗ Backward selection, leading to different final models for the different outcomes
Advanced statistical methods 576
• Moon and Lee [23]
. Analysis section p.1427:
2.9. Analysis
The Chi-square and t-tests were used for comparing the demographics and clinical characteristics between the intervention and the control groups. The effects of protocol application on delirium incidence, mortality, and re-admission to the ICU during the same hospitalization period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were
Advanced statistical methods 577
. Table 3, p.1429:
Table 3
Effects of the delirium prevention protocol on the patient outcomes.

                                                Univariate logistic/     Multivariate logistic/
                                                linear regression        linear regression
Outcomes                                        OR/HR (CI)          p    OR/HR (CI)          p
Episodes of delirium^a                          0.50 (0.22–1.14)   .10   0.52 (0.23–1.21)   .13
In-hospital mortality^a                         0.28 (0.08–0.90)   .02   0.32 (0.09–1.13)   .08
7-day in-hospital mortality^b                   0.10 (0.01–0.83)   .03   0.09 (0.01–0.72)   .02
30-day in-hospital mortality^b                  0.34 (0.10–1.13)   .08   0.33 (0.10–1.09)   .07
ICU re-admission during same hospitalization^a  0.28 (0.07–1.07)   .06   0.28 (0.07–1.13)   .07
ICU length of stay^c (b, SE)                    0.80, 1.95         .69   1.80, 1.92         .35

CI = confidence interval; ICU = intensive care unit; HR = hazard ratio; OR = odds ratio.
^a Logistic regression. ^b Cox proportional hazards regression. ^c Linear regression.
Adjusted variables: episodes of delirium, ventilator use, APACHE II score, excluding episodes of delirium for the analysis with episodes of delirium as the dependent variable.
∗ Multiple types of analyses in one table (logistic, linear, Cox regression)
∗ Simple and multiple analyses compared
∗ The significant difference in risk for in-hospital mortality between patients receiving the prevention protocol and control patients is no longer significant after correction for some patient characteristics (p = 0.02 → p = 0.08).
Advanced statistical methods 578
Part IX
Models for time-to-event data
Advanced statistical methods 579
Chapter 24
Survival analysis without censoring
. Example
. The survival curve
. Estimation of survival curve
Advanced statistical methods 580
24.1 Example: Survival times of cancer patients
• Cameron and Pauling [24]; Hand et al. [25] p. 255
• Patients with advanced cancer of the stomach, bronchus, colon, ovary, or breast were treated (in addition to standard treatment) with ascorbate.
• The outcome of interest is the survival time (days).
• Research question(s):
What is the prognosis for a patient with a specific type of cancer?
Do survival times differ with the organ affected?
Advanced statistical methods 581
• Dataset ‘Cancer’:
Stomach Bronchus Colon Ovary Breast
124 81 248 1234 1235
42 461 377 89 24
25 20 189 201 1581
45 450 1843 356 1166
412 246 180 2970 40
51 166 537 456 727
1112 63 519 3808
46 64 455 791
103 155 406 1804
876 859 365 3460
146 151 942 719
340 166 776
396 37 372
223 163
138 101
72 20
245 283
Average (days) Median (days)
Stomach: 286 124
Bronchus: 211.6 155
Colon: 457.4 372
Ovary: 884.3 406
Breast: 1395.9 1166
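The averages and medians can be reproduced directly from the listed survival times; for example, for the ovary group:

```python
from statistics import mean, median

# Ovary-cancer survival times (days), from the 'Cancer' dataset above
ovary = [1234, 89, 201, 356, 2970, 456]

avg = round(mean(ovary), 1)   # 884.3 days, matching the table
med = median(ovary)           # 406.0 days: far below the average,
                              # reflecting the right-skewed distribution
```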
Advanced statistical methods 582
• Note the severe differences between averages and medians, due to the skewness of the distribution.
• Comparisons between groups are therefore based on parametric tests after appropriate transformation (e.g., logarithmic), or on non-parametric tests (e.g., the Wilcoxon test).
Advanced statistical methods 583
24.2 The survival curve
• Often it is of interest to make a prognosis for specific patients, i.e., it is of interest to estimate the probability of ‘surviving’ a specific amount of time.
• In other contexts, the response is not ‘survival’, but still a ‘time to event’:
. Progression-free ‘survival’
. How long will a bulb ‘survive’?
. Time until the first tooth is affected with caries
. Time a rat needs to find the exit of a maze
. . . .
• Terminology: Survival and Failure
Advanced statistical methods 584
• In the cancer example, it may be of interest to estimate how likely it is that a patient with colon cancer, treated (in addition to standard treatment) with ascorbate, will survive 1 year, 2 years, . . .
• Interest is then in the survival function / curve:
S(t) = P (Outcome > t)
“The probability of surviving time point t”
• Properties of S(t):
. S(0) = 1: There is absolute certainty to ‘survive’ t = 0
. S(+∞) = 0: There is absolute certainty to ‘fail’ eventually
. S(t) is a decreasing function
Advanced statistical methods 585
• Examples of survival curves:
Advanced statistical methods 586
24.3 Estimation of survival curve
• As S(t) can be interpreted as a proportion, it can easily be estimated by the observed proportion of subjects surviving time point t:
S(t) = P (Outcome > t) −→ Ŝ(t) = (# subjects surviving t) / N
• As an example, we estimate the survival curve for ovary cancer patients
• The following 6 event times were recorded:
1234 89 201 356 2970 456
Advanced statistical methods 587
• Calculations:
Time (t) # Surviving t S(t)
0 6 6/6 = 1.00
30 6 6/6 = 1.00
89 5 5/6 = 0.83
100 5 5/6 = 0.83
201 4 4/6 = 0.67
356 3 3/6 = 0.50
400 3 3/6 = 0.50
556 2 2/6 = 0.33
1234 1 1/6 = 0.17
2970 0 0/6 = 0.00
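The calculations above amount to evaluating the empirical survival function; a minimal Python version (the function name is ours, for illustration):

```python
def surv_hat(times, t):
    """Estimated survival function: observed fraction surviving past t."""
    return sum(x > t for x in times) / len(times)

ovary = [1234, 89, 201, 356, 2970, 456]   # the 6 recorded event times

# Reproduce a few rows of the table above:
assert surv_hat(ovary, 0) == 6 / 6
assert surv_hat(ovary, 100) == 5 / 6
assert surv_hat(ovary, 356) == 3 / 6
assert surv_hat(ovary, 2970) == 0 / 6
```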
Advanced statistical methods 588
• Graphically:
Advanced statistical methods 589
• Remarks:
. S(t) is estimated using a step function
. Steps only at the times where events were observed
. Step size at time point t: (# subjects with event at t) / N
. The estimate is right-continuous:
Advanced statistical methods 590
Chapter 25
Survival analysis with censoring
. The problem of censoring
. Example
. Kaplan-Meier estimate of survival curve
. Comparison of survival curves
. Examples from biomedical literature
Advanced statistical methods 591
25.1 The problem of censoring
Event time cannot always be measured!
⇓
Censored observations
Various types of censoring:
. Right
. Left
. Interval
. Mixture of the above
Advanced statistical methods 592
No censoring

[Figure: Time/Age timelines for 8 subjects; each subject is followed until the event, so every true event time (•) is observed (◦).]

Advanced statistical methods 593
Right censoring due to study end

[Figure: Time/Age timelines for 8 subjects; for subjects whose event falls after the end of the study, the true event time (•) is unobserved and the observation (◦) is right-censored at the study end.]

Advanced statistical methods 594
Right censoring due to dropout

[Figure: Time/Age timelines for 8 subjects; subjects who drop out before their event are right-censored (◦) at the time of dropout, so the true event time (•) is unobserved.]

Advanced statistical methods 595
Left censoring due to late study onset

[Figure: horizontal timelines for Subjects 1–8 along a Time/Age axis.
Legend: segment before event / segment after event; •: true event time;
◦: observations. A vertical line marks the begin of the study. For
subjects whose true event time (•) falls before the begin of the study,
the event is only known to have occurred before study onset: their
event times are left-censored.]
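The figure above can be mirrored in a small data-encoding sketch (hypothetical, not from the course material): a left-censored event time is stored not as an exact value but as an interval from zero up to the study onset, since we only know the event happened before the study began. All subject data and the onset value below are fictitious.

```python
# Hypothetical illustration: encoding left-censored event times.
# When a subject's event occurred before the study began, only an
# upper bound on the event time (the study onset) is known.

STUDY_ONSET = 50.0  # fictitious begin of study (Time/Age units)

# (subject_id, observed_time, event_before_onset) for fictitious subjects
raw = [
    (1, 62.0, False),        # event observed during the study: exact time
    (2, STUDY_ONSET, True),  # event had already happened: left-censored
    (3, 71.5, False),
]

records = []
for sid, t, left_censored in raw:
    if left_censored:
        # only the upper bound is known: 0 <= T < STUDY_ONSET
        records.append({"id": sid, "lower": 0.0, "upper": STUDY_ONSET})
    else:
        # exact event time: lower and upper bounds coincide
        records.append({"id": sid, "lower": t, "upper": t})

for r in records:
    print(r)
```

Representing every event time as a (lower, upper) pair makes exact, left-censored, and (as on the next slide) interval-censored observations fit one common format.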
Interval censoring due to discrete observation times

[Figure: horizontal timelines for Subjects 1–8 along a Time/Age axis.
Legend: segment before event / segment after event; •: true event time;
◦: observations. Subjects are observed only at discrete visit times (◦),
so the true event time (•) is never seen directly: it is only known to
lie between the last visit at which the event had not yet occurred and
the first visit at which it had, i.e. the event time is
interval-censored.]
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
[Figure: observation times]

Advanced statistical methods 597
• Our focus will be on right censoring, i.e., either the true event time or a lower bound of it is observed
• Standard statistical tools for the analysis of censored observations assume random censoring:
Event time and censoring time are independent
• Counter examples:
. Patients entering the study later have a better prognosis due to increased experience of the surgeon =⇒ Negative association between censoring and event time
. Patients leaving the study because they get worse =⇒ Positive association between censoring and event time
25.2 Example: Myelomatosis
• Peto et al. [26]; Allison [27] p.26
• Data on 25 patients diagnosed with myelomatosis (Kahler’s disease), multiple malignant tumours in the bone marrow
• Patients randomly assigned to two drug treatments
• Event time is the time from moment of randomization to death
• Some event times are censored due to study termination
• Patients with normal and patients with impaired renal functioning at the moment of randomization
• Data:
Treat Duration Status Renal Treat Duration Status Renal
1 8 1 1 2 180 1 0
1 852 0 0 2 632 1 0
1 52 1 1 2 2240 0 0
1 220 1 0 2 195 1 0
1 63 1 1 2 76 1 0
1 8 1 0 2 70 1 0
1 1976 0 0 2 13 1 1
1 1296 0 0 2 1990 0 0
1 1460 0 0 2 18 1 1
1 63 1 1 2 700 1 0
1 1328 0 0 2 210 1 0
1 365 0 0 2 1296 1 0
2 23 1 1
Status:
. 0: Censored
. 1: Death
Renal:
. 0: Normal
. 1: Impaired
• Interest is in estimating and comparing the survival curves for patients with different treatments and for patients with different renal functioning at baseline
25.3 Kaplan-Meier estimate of survival curve
• Suppose interest is in estimating the survival curve for patients with treatment 1
• Observed data:
Duration: 8 852 52 220 63 8 1976 1296 1460 63 1328 365
Status: 1 0 1 1 1 1 0 0 0 1 0 0
How to account for the censoring?
• Simple ‘naive’ solutions:
. Ignoring the censored observations: Over-optimistic
. Treating censored observations as event times: Over-pessimistic
• Hence, correct account of the censoring process is necessary.
• The Kaplan-Meier (KM) estimate provides an unbiased estimate for S(t), assuming independent (non-informative) censoring.
• Most software packages allow calculation of the KM estimate.
• Input needed:
. Observed times
. The true status (‘event’ or ‘censored’)
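To make the product-limit construction concrete, here is a minimal Python sketch of the KM estimate for the treatment-1 data listed above (a from-scratch illustration, not the course software):

```python
# Kaplan-Meier product-limit estimator, sketched from first principles.
# Data: the 12 treatment-1 observations above (status 1 = death, 0 = censored).
durations = [8, 852, 52, 220, 63, 8, 1976, 1296, 1460, 63, 1328, 365]
status    = [1,   0,  1,   1,  1, 1,    0,    0,    0,  1,    0,   0]

def kaplan_meier(times, events):
    """Return {event time: S(t)}; S drops only at observed event times."""
    data = sorted(zip(times, events))
    surv, curve = 1.0, {}
    for t in sorted({tt for tt, e in data if e == 1}):
        at_risk = sum(1 for tt, _ in data if tt >= t)      # still under observation
        deaths  = sum(1 for tt, e in data if tt == t and e == 1)
        surv *= 1 - deaths / at_risk                       # product-limit step
        curve[t] = surv
    return curve

km = kaplan_meier(durations, status)
print(km)  # S(t) drops to 0.833, 0.75, 0.583, 0.5 at t = 8, 52, 63, 220
```

Censored observations never trigger a drop, but they do shrink the later risk sets; that is exactly how the censoring is ‘accounted for’.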
• KM estimate for patients receiving treatment 1:
25.4 Comparison of survival curves
• Often, interest is in the comparison of survival curves of different groups
• For the Myelomatosis data, interest may be to compare survival between the two treatment groups
• Also of interest is the comparison of survival for patients with impaired renal functioning with survival for patients with normal renal functioning.
• We will focus on the comparison of two groups, but extensions are available for the comparison of multiple groups
• For each group separately, the KM estimate for the survival curve can be calculated.
• KM estimates for both treatment groups:
• KM estimates for patients with normal and impaired renal functioning, respectively:
• Due to the censoring, classical tests such as the t-test and Wilcoxon test cannot be used for the comparison of the survival times
• Various tests have been designed for the comparison of survival curves when censoring is present
• The most popular ones are:
. Logrank test
. Wilcoxon (Gehan) test
• Note that the Wilcoxon test used here is different from the classical Mann-Whitney U test, also termed Wilcoxon test.
• The Logrank test has more power than Wilcoxon for detecting late differences
• The Logrank test has less power than Wilcoxon for detecting early differences
• Test results:
Effect of treatment: Logrank p=0.2468, Wilcoxon p=0.6260
Effect of renal functioning: Logrank p=0.0029, Wilcoxon p=0.0005
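The logrank test for the treatment comparison can be sketched from the standard observed-versus-expected formulas (the function and its details are our own illustration, not the course software):

```python
import math

# Two-sample logrank test, sketched from first principles.
# Data: all 25 myelomatosis patients, (time, status) with status 1 = death.
group1 = [(8,1), (852,0), (52,1), (220,1), (63,1), (8,1), (1976,0), (1296,0),
          (1460,0), (63,1), (1328,0), (365,0)]                    # treatment 1
group2 = [(180,1), (632,1), (2240,0), (195,1), (76,1), (70,1), (13,1),
          (1990,0), (18,1), (700,1), (210,1), (1296,1), (23,1)]   # treatment 2

def logrank(g1, g2):
    """Return (chi-square statistic, p-value) for H0: equal survival curves."""
    obs1 = sum(e for _, e in g1)                  # observed events in group 1
    exp1 = var = 0.0
    for t in sorted({tt for tt, e in g1 + g2 if e == 1}):
        n1 = sum(1 for tt, _ in g1 if tt >= t)    # at risk in group 1
        n2 = sum(1 for tt, _ in g2 if tt >= t)
        d  = sum(1 for tt, e in g1 + g2 if tt == t and e == 1)
        n  = n1 + n2
        exp1 += d * n1 / n                        # expected events in group 1
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    stat = (obs1 - exp1) ** 2 / var
    return stat, math.erfc(math.sqrt(stat / 2))   # chi-square(1) tail probability

stat, p = logrank(group1, group2)
print(round(stat, 2), round(p, 4))
```

The slides report p = 0.2468 for the treatment effect; this sketch reproduces that order of magnitude (tie-handling conventions in software can shift the last digits).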
25.5 Examples from biomedical literature
• Shatari et al. [7]:
. Methods, p.439:
. Figure 1, p.440:
• Blanchon et al. [28]:
. Statistical Methods, p.831:
. Figure 2, p.834:
• Moon and Lee [23]
. Analysis section, p.1427:
period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were analyzed by Kaplan–Meier survival and Cox proportional hazard regression analysis. The effects on reduced lengths
. Figure 3, p.1430:
Fig. 3. Kaplan–Meier survival curves for 7-day and 30-day in-hospital mortality.
∗ Confusing formulation: 7-day and 30-day mortality
∗ The outcome analysed was time till death, from which an estimate for the probability of dying within 7 (30) days results
Chapter 26
Regression for survival data
. Example
. Cox regression
. Application
. Model diagnostics
. Hazard rate
. Examples from biomedical literature
26.1 Example: Pneumonia data
• Klein and Moeschberger [29] p.14
• Data on 3470 children, to study risk factors for time to hospitalized pneumonia
• Censoring after the first year of life, if not earlier
• Overall, 73 (2.10%) of the children were reported to be hospitalized for pneumonia within the first year of life.
• We want to study the association between the time to hospitalized pneumonia and some child- and/or mother-specific characteristics
• KM estimate for time to hospitalized pneumonia:
• Note that the estimated probability of not experiencing hospitalized pneumonia is 97.7%, somewhat smaller than the observed proportion 100 − 2.10 = 97.9%, obtained without correction for the early censoring.
• The relation with the following potential risk factors is to be studied:
. Age of mother (Years)
. Presence of siblings (Yes: 48%, No: 52%)
. Smoking status mother (Yes: 34%, No: 66%)
. Urban environment (Yes: 76%, No: 24%)
. Alcohol use mother (Yes: 36%, No: 64%)
. Poverty status mother (Yes: 36%, No: 64%)
. Normal birthweight child (≥ 5.5 pounds ≈ 2.5 kg. Yes: 92%, No: 8%)
• The relation with factors can be studied using group-specific Kaplan-Meier estimates, together with Logrank and/or Wilcoxon tests
• Investigating the relation with covariates requires a regression-type model
• Relating the outcome to several factors and/or covariates simultaneously requires a multiple model allowing to include (a combination of) covariates and factors
• The most frequently used model is the Cox (proportional hazards) model
26.2 Cox regression
• Suppose interest is in studying the relation between the survival probability S(t) and some covariate X
• Examples:
X = Age mother
X = 1 if mother smokes, X = 0 if mother does not smoke
• Let S0(t) denote the survival function in case X = 0, and Sx(t) the survival function in case X = x, for a specific value x
• The Cox regression model assumes that:
Sx(t) = {S0(t)}exp(βx)
• In case β > 0:
x↗ =⇒ exp(βx)↗ =⇒ Sx(t) < S0(t)
Higher X-values associated with increased risk for event
• In case β < 0:
x↗ =⇒ exp(βx)↘ =⇒ Sx(t) > S0(t)
Higher X-values associated with reduced risk for event
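This ordering of the survival curves can be checked numerically; the short sketch below assumes an arbitrary exponential baseline curve S0(t) and an illustrative β value (neither is estimated from data):

```python
import math

# Numeric sketch of the Cox relation S_x(t) = S_0(t) ** exp(beta * x).
beta = 0.5                      # illustrative positive coefficient (assumption)

def S0(t):                      # arbitrary baseline survival curve (assumption)
    return math.exp(-0.01 * t)

def S(t, x):                    # Cox-model survival curve for covariate value x
    return S0(t) ** math.exp(beta * x)

for t in (10, 50, 100):
    # beta > 0: higher x gives a lower survival curve, i.e., higher risk
    assert S(t, 2) < S(t, 1) < S(t, 0) == S0(t)
print("ordering S_2 < S_1 < S_0 holds at every t checked")
```

With β < 0 the same check would show the reversed ordering, matching the interpretation above.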
• In case β > 0:
• In case β < 0:
• Note that the covariate model exp(βx) expresses how much higher (lower) the risk is for people with X = x than for people with X = 0.
• Hence, the model is a relative model and therefore does not contain an intercept.
• Note also that the model can easily be generalized to models with multiple covariates
• For example, a model with two covariates X1 and X2 is given by:
Sx1,x2(t) = {S0(t)}exp(β1x1+β2x2)
• Factors can easily be incorporated using the regression notation with dummy coding.
26.3 Application: Pneumonia data
• Effects of all risk factors separately, assessed using simple Cox regression models:
Effect β p-value
Age of mother −0.0985 0.0275
Urban environment −0.4523 0.0695
Alcohol use −0.0535 0.8282
Normal birthweight child −0.2412 0.5439
Smoking of mother 0.7958 0.0007
Poverty of mother 0.5963 0.0109
Presence of siblings 0.6436 0.0079
• Some evidence for positive effect (later event) of:
. Older mother
. Urban environment
• Some evidence for negative effect (earlier event) of:
. Smoking of mother
. Poverty of the mother
. Presence of siblings
• The positive effect of an urban environment is somewhat surprising and may be explained by other characteristics to which it is related.
• Also the effect of poverty of the mother might be explained by other factors related to poverty.
• We therefore fit a multiple Cox model, assessing the effect of all covariates/factors simultaneously:
Simple models Multiple model
Effect β p-value β p-value
Age of mother −0.0985 0.0275 −0.1287 0.0107
Urban environment −0.4523 0.0695 −0.3509 0.1616
Alcohol use −0.0535 0.8282 −0.1213 0.6374
Normal birthweight child −0.2412 0.5439 −0.0152 0.9697
Smoking of mother 0.7958 0.0007 0.7289 0.0028
Poverty of mother 0.5963 0.0109 0.2778 0.2586
Presence of siblings 0.6436 0.0079 0.7557 0.0042
• There is no evidence anymore for an effect of living in an urban environment. This can be explained from:
. There are significantly fewer smoking mothers (32.74%) in urban environments compared to rural environments (38.63%): Chi-squared test, p = 0.0018.
. There are significantly fewer siblings (46.95%) in urban environments compared to rural environments (51.74%): Chi-squared test, p = 0.0158.
• There is no evidence anymore for an effect of poverty of the mother. This can be explained from:
. There are significantly more smoking mothers (40.30%) in poor circumstances compared to non-poor circumstances (30.69%): Chi-squared test, p < 0.0001.
. There are significantly more siblings (58.41%) with poor mothers compared to non-poor mothers (42.30%): Chi-squared test, p < 0.0001.
• To assess the effect of the risk factors, we compare survival when the risk factors are present to survival when the risk factors are absent, keeping the others constant (rural environment, no alcohol use, normal birthweight, no poverty):
. Present: Mother 20yrs old, smoking, with other children
. Absent: Mother 30yrs old, not smoking, without other children
26.4 Model diagnostics
• The Cox regression model assumes that:
Sx(t) = {S0(t)}exp(βx)
• Checking the above assumption is difficult since:
. The event times are not always fully observed (censoring)
. The ‘baseline’ survival curve S0(t) is left unspecified
• Most software packages do not include tools to easily check the assumption.
26.5 Hazard rate
• Let us consider a Cox model with two covariates X1 and X2:
Sx1,x2(t) = {S0(t)}exp(β1x1+β2x2)
We then have that
ln[Sx1,x2(t)] = ln[S0(t)] exp(β1x1 + β2x2)
• Hence, the relative change in risk associated with a one-unit increase in X1 equals
ln[Sx1+1,x2(t)] / ln[Sx1,x2(t)] = {ln[S0(t)] exp(β1(x1 + 1) + β2x2)} / {ln[S0(t)] exp(β1x1 + β2x2)} = exp(β1)
• The term exp(β1) is called the hazard ratio (hazard rate, HR)
• The HR expresses the relative change in risk (measured on the logarithmic scale), associated with a one-unit increase in the covariate, from X1 = x1 to X1 = x1 + 1.
• This change in risk is not only independent of x1, but also of x2.
• This is in full analogy to the interpretation of a regression coefficient in a linear model, or the OR in a logistic model.
• Caution is needed, however, in polynomial models or models with interactions, where individual regression parameters cannot always be interpreted.
• Because this change in risk is also independent of t, the Cox model is also termed ‘the proportional hazards model’.
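The constancy of exp(β1) across t, x1 and x2 can be verified numerically under an assumed baseline curve (all numbers below are illustrative assumptions):

```python
import math

# Check numerically that ln S_{x1+1,x2}(t) / ln S_{x1,x2}(t) = exp(beta1)
# for every t, x1 and x2 (baseline curve and betas are assumptions).
b1, b2 = 0.3, -0.2

def S0(t):                          # arbitrary baseline survival curve
    return math.exp(-0.02 * t)

def S(t, x1, x2):                   # two-covariate Cox model survival curve
    return S0(t) ** math.exp(b1 * x1 + b2 * x2)

for t in (5, 50, 500):
    for x1 in (0, 1, 3):
        for x2 in (0, 2):
            ratio = math.log(S(t, x1 + 1, x2)) / math.log(S(t, x1, x2))
            assert abs(ratio - math.exp(b1)) < 1e-9   # constant hazard ratio
print("hazard ratio exp(b1) =", round(math.exp(b1), 4), "independent of t, x1, x2")
```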
26.6 Examples from biomedical literature
• Nawrot et al. [30]:
. Statistical analyses, p.122:
∗ Power analyses based on logrank
∗ Cox regression to adjust for potentially important covariates
∗ Sensitivity analyses to check whether results depend on covariates included.
. Figure 2, p.122:
. Results, p.124:
• Hutchins et al. [31]:
. End Point Definitions and Statistical Analysis, p.8315:
∗ OS: Overall survival
∗ DFS: Disease free survival
. Results (p.8316), Figures 4 and 5 (p.8318):
Fig 5. Overall survival (OS) by hormone receptor (HR) status with and without tamoxifen (TAM). HR+, HR positive; HR−, HR negative.
Fig 4. Disease-free survival (DFS) by hormone receptor (HR) status with and without tamoxifen (TAM). HR+, HR positive; HR−, HR negative.
HR: postmenopausal hormone receptor positive; TAM: Tamoxifen group
• Brown et al. [32]
. Statistical analysis section, p.1086:
(HR: 1.04; 95% CI: 1.01–1.07). Given the significant gender × depression interaction, gender-stratified Cox proportional hazard models were used to examine the effects of frailty characteristics, depression, and depression by frailty characteristic interactions on survival time. Dummy-coded variables for frailty, missing data, and depression status were used for single predictor models. For these single predictor models, Bonferroni correction on the false-positive error rate was used to account for multiple comparisons (α of 0.05 adjusted for four frailty characteristics: α = 0.0125).
. Table 3, p.1091:
TABLE 3. Site-Stratified Proportional Hazards Models with Multiple Predictors of Frailty Characteristics for Depressed and Nondepressed Men and Women
Frailty Predictors
Nondepressed (N = 543): Men (N = 278), Women (N = 265) — Depressed (N = 261): Men (N = 84), Women (N = 177)
Columns for each group: Wald χ², HR (95% CI), p
Low physical activities 10.48 0.005 1.25 0.535 1.81 0.404 3.66 0.160
Yes vs. no 7.27 1.76 (1.17, 2.65) 0.007 1.13 1.34 (0.78, 2.28) 0.287 1.61 1.68 (0.76, 3.74) 0.204 2.86 1.56 (0.93, 2.61) 0.091
Missing vs. no 4.85 2.80 (1.12, 7.01) 0.028 0.01 0.95 (0.36, 2.50) 0.917 0.09 1.25 (0.29, 5.46) 0.765 0.19 0.81 (0.32, 2.05) 0.662
Fatigue 4.90 0.086 3.67 0.160 6.46 0.040 5.62 0.060
Yes vs. no 3.82 1.51 (1.00, 2.28) 0.051 2.73 1.54 (0.92, 2.56) 0.098 1.02 1.48 (0.69, 3.15) 0.313 5.40 1.94 (1.11, 3.40) 0.020
Missing vs. no 2.24 1.46 (0.89, 2.41) 0.134 1.96 1.52 (0.85, 2.74) 0.161 6.29 2.90 (1.26, 6.67) 0.012 2.81 1.66 (0.92, 3.00) 0.094
Slow gait speed 5.99 0.050 1.89 0.388 2.17 0.338 6.69 0.035
Yes vs. no 4.96 1.54 (1.05, 2.26) 0.026 1.04 1.30 (0.79, 2.14) 0.307 1.01 1.52 (0.67, 3.45) 0.315 4.59 1.84 (1.05, 3.21) 0.032
Missing vs. no 1.92 1.71 (0.80, 3.64) 0.166 1.38 1.85 (0.66, 5.18) 0.240 1.95 2.67 (0.67, 10.59) 0.163 5.47 2.71 (1.18, 6.26) 0.019
Low grip strength 2.34 0.310 4.92 0.085 0.64 0.725 1.60 0.452
Yes vs. no 0.01 1.02 (0.65, 1.59) 0.934 4.64 1.83 (1.06, 3.18) 0.031 0.63 1.35 (0.64, 2.81) 0.429 0.43 1.19 (0.70, 2.02) 0.513
Missing vs. no 2.19 0.55 (0.25, 1.22) 0.139 0.85 1.52 (0.63, 3.69) 0.355 0.05 1.15 (0.33, 4.02) 0.827 1.46 1.59 (0.75, 3.36) 0.228
Notes: Cox proportional hazard models were used to explore the simultaneous effects of the baseline frailty characteristics in separate models for nondepressed and depressed men and women. Wald χ² values are listed (df = 2 for the overall effect of each frailty characteristic, df = 1 for each subgroup comparison); HRs (and 95% CI) are for multiple predictor models with the inclusion of other frailty characteristics as covariates.
∗ Separate analyses for depressed and non-depressed, males and females, due to important interaction effects
∗ Outcome (overall survival) not mentioned explicitly
∗ Dummy coding for factors, with ‘no’ group as reference group
∗ HR’s to quantify the risk
Part X
Further Topics
Chapter 27
Clustered data
. Data set: Washing without water
. Naive analysis
. Correction for clustering
. A mixed model
. Empirical Bayes estimates
. Other examples
. Examples from the biomedical literature
27.1 Data set: Washing without water
• Schoonhoven et al. [33]
• Comparison of traditional washing (soap & water) with the use of disposable wash gloves, made of non-woven material, saturated with quickly vaporizing cleaning & caring lotions
• Nursing home residents requiring bathing by nurses
• 56 nursing home wards (±500 residents) randomized:
. Usual Care (UC: traditional bathing)
. Washing without water (WWW)
• Exclusion: In bath or shower > 1 day/week
• Outcome of interest is ‘Completeness of assisted bathing (1/0)’ after 4 weeks post randomization
• Correction for dementia (1/0)
• Other covariates (age, gender, Barthel index, BMI, skin damage, . . . ) explored as well
27.2 Naive analysis
• Logistic regression with factors ‘intervention’ and ‘dementia’
• Results:
Effect OR 95% C.I. p-value
Intervention: WWW 4.739 [3.155; 7.143] <0.0001
UC
Dementia: NO 1.508 [1.005; 2.268] 0.0475
YES
• Bathing completeness more likely . . .
. . . . in WWW intervention group
. . . . in non-demented residents
27.3 Correction for clustering
• The analysis did not account for the variability between wards w.r.t. the proportion of residents with complete bathing:
[Histogram: distribution of ward-specific proportions of residents with complete bathing]
• Variability implies residents from one ward to be more alike than residents from different wards
=⇒ Correlated data
• All models discussed so far assumed all observations to be independent
• This correlation should be accounted for in the statistical analysis
=⇒ Mixed (multilevel) models
• Mixed models are the most popular models for the analysis of clustered data
27.4 A mixed model
• Let Yij be the binary outcome for patient j in ward i.
• Furthermore, let Iij be a dummy variable equal to 1 if the patient belongs to the WWW group and 0 otherwise.
• Likewise, let Dij be a dummy variable equal to 1 if the patient is not demented and 0 otherwise.
• The logistic model fitted equals:
P (Yij = 1) = exp(β0 + β1Iij + β2Dij) / [1 + exp(β0 + β1Iij + β2Dij)]
• In order to account for the fact that different wards can have different success probabilities, we add a ward-specific term bi:
P (Yij = 1) = exp(bi + β0 + β1Iij + β2Dij) / [1 + exp(bi + β0 + β1Iij + β2Dij)]
• Patients in a ward with a (very) high value for bi are (very) likely to have received complete bathing
• Patients in a ward with a (very) low value for bi are (very) unlikely to have received complete bathing
• Each ward has its own bi parameter
• Since wards are believed to be sampled from a population of wards, the parameters bi can be viewed as being sampled from a population of ward effects
• Therefore, the parameters are often assumed random:
bi ∼ N (0, σ2b)
• The normality assumption is mathematically convenient
• The assumption of mean zero is interpretationally convenient:
. bi = 0: A ward with average / median ‘risk’ for complete bathing
. bi > 0: A ward with higher than average / median ‘risk’ for complete bathing
. bi < 0: A ward with lower than average / median ‘risk’ for complete bathing
• The variance σ2b expresses the variability between wards, hence tells us how different the ‘risk’ for complete bathing is between wards
• Much (little) between-ward variability (σ2b large/small) implies much (little) correlation
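This link between between-ward variability and within-ward correlation can be made visible with a small simulation (all parameter values below are illustrative assumptions, not estimates from the WWW study):

```python
import math
import random

# Simulation sketch: ward random intercepts b_i ~ N(0, sigma_b^2) create
# between-ward variability well beyond binomial noise under independence.
random.seed(1)
sigma_b, beta0, n_wards, n_res = 2.0, 0.0, 56, 20   # illustrative values

props = []
for _ in range(n_wards):
    b = random.gauss(0.0, sigma_b)                      # ward-specific effect
    p = 1 / (1 + math.exp(-(beta0 + b)))                # ward success probability
    y = sum(random.random() < p for _ in range(n_res))  # residents' outcomes
    props.append(y / n_res)

p_bar   = sum(props) / n_wards
between = sum((q - p_bar) ** 2 for q in props) / (n_wards - 1)
binom   = p_bar * (1 - p_bar) / n_res   # variance expected if wards were identical
print(round(between, 3), round(binom, 4))
```

With a large σ2b, the observed between-ward variance of the proportions far exceeds the binomial benchmark; a naive analysis that treats all residents as independent ignores exactly this excess.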
• The different nature of the regression parameters β0, β1 and β2 on the one hand and the parameters bi on the other is reflected in the terminology:
. Fixed effects β0, β1 and β2: If the experiment were to be repeated, the same parameters would appear in the model because the same population would be studied
. Random effects bi: If the experiment were to be repeated, different parameters would appear in the model because different wards would be sampled
• A model with fixed effects as well as random effects is termed a mixed effects model, or briefly a mixed model
• Fitting the mixed model to our data set leads to the following results:
Naive Correct
Effect OR 95% C.I. p-value OR 95% C.I. p-value
Intervention: WWW 4.739 [3.155; 7.143] <0.0001 12.821 [4.566; 35.714] <0.0001
UC
Dementia: NO 1.508 [1.005; 2.268] 0.0475 1.271 [0.883; 1.828] 0.1962
YES
• Conclusion:
Effects of covariates highly affected by correlation within clusters
• The between-ward variance estimate equals σ2b = 4.4617
• The original naive logistic regression model assumed no variability between wards, i.e., assumed σ2b = 0
• In general, any clustering in the data should be accounted for in the analysis
• The impact on covariates of interest very much depends on the variability between clusters, i.e., on the correlation between observations within clusters.
• In our example, a logistic regression model was extended with random effects, yielding a logistic mixed model. Likewise, all models discussed so far can be turned into a mixed model in order to account for clustering.
• Terminologies used: linear mixed models, generalized linear mixed models, . . .
27.5 Empirical Bayes estimates
• So far, we have focused on inference for the fixed effects.
• In some cases, scientific interest may be in the random effects themselves, rather than the fixed effects.
• As an example, we consider data from the Diabetes Project Leuven (DPL), in which general practitioners (GP’s) were invited to participate in an intervention program aiming at improving care through providing support to the GP’s
• The intervention was to provide structured assistance to GP’s by a diabetes care team, consisting of a nurse educator, a dietician, an ophthalmologist and an internal medicine doctor.
• We consider the analysis of 61 GP’s with a total of 1577 patients, the number per GP ranging from 5 to 138
• The outcome studied is HbA1c, glycosylated hemoglobin, after one year in the intervention program:
. Molecule in red blood cells that attaches to glucose (blood sugar)
. High values reflect more glucose in blood
. In diabetes patients, HbA1c gives a good estimate of how well diabetes is being managed over the last 2 or 3 months
. Non-diabetics have values between 4% and 6%
. HbA1c above 7% means diabetes is poorly controlled, implying higher risk for long-term complications.
• More specifically, interest was in the dichotomized version Y = 1 if HbA1c < 7%, Y = 0 if HbA1c ≥ 7%
• A logistic mixed model can be used:
P (Yij = 1) = exp(bi + β0) / [1 + exp(bi + β0)]
• Patients with a high P (Yij = 1) value are likely to reach the target
• This probability depends on the GP effect bi
• Patients treated by a GP with a high (positive) value bi are likely to reach the target. Patients treated by a GP with a low (negative) value bi are not likely to reach the target.
• Hence, ‘successful’ GP’s are those with a high value bi, while less ‘successful’ GP’s have lower values bi
• Therefore, it is of interest to estimate the random effects bi in order to be able to identify (un-)successful GP’s
• Those estimates are called Empirical Bayes (EB) estimates.
• In order to have a fair comparison of GP’s, correction is needed for GP and patient characteristics such as:
. Practice form (1, 2, or > 2 GP’s in one practice)
. BMI of patient at the moment of enrollment in the study
. New, indicating whether patient is newly diagnosed as diabetes patient
• Correction can be done by including the factors and covariate in the logistic model
• Practice will be coded with two indicator variables P1 and P2
• BMI is a continuous covariate (B)
• New will be coded with one indicator variable N
• The logistic mixed model then becomes:
P (Yij = 1) = exp(bi + β0 + β1P1i + β2P2i + β3Bij + β4Nij) / [1 + exp(bi + β0 + β1P1i + β2P2i + β3Bij + β4Nij)]
• Studying the EB estimates for the GP effects bi allows comparison of GP’s, assuming that they all would work in the same practice form and would have patients with the same characteristics (BMI, New).
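For a single GP, the idea behind an EB estimate can be sketched as the posterior mode of bi, here found by a simple grid search with β0 and σb treated as known (an assumption; in a real fit they are estimated jointly with the fixed effects):

```python
import math

# Sketch of an Empirical Bayes estimate of one GP effect b_i: the posterior
# mode under an intercept-only logistic mixed model. beta0 and sigma_b are
# illustrative known values, not estimates from the DPL data.
beta0, sigma_b = 0.0, 1.0

def eb_estimate(successes, n):
    """Posterior mode of b_i given this GP's target counts (coarse grid)."""
    def log_post(b):
        p = 1 / (1 + math.exp(-(beta0 + b)))
        return (successes * math.log(p) + (n - successes) * math.log(1 - p)
                - b * b / (2 * sigma_b ** 2))       # N(0, sigma_b^2) prior term
    return max((i / 1000 for i in range(-5000, 5001)), key=log_post)

good_gp = eb_estimate(18, 20)   # 90% of this GP's patients reach the target
poor_gp = eb_estimate(4, 20)    # only 20% reach the target
print(good_gp, poor_gp)         # positive vs negative estimated GP effect
```

Note the shrinkage: the estimate for the first GP stays below the raw logit ln(18/2) ≈ 2.20, because the normal prior pulls extreme GP’s toward the average.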
• A histogram of EB estimates is helpful to identify well-performing and poorly performing GP’s:

[Histogram of the EB estimates of the GP effects bi]
• Conclusion:
EB estimates can be used to identify ‘outlying’ clusters, after correction for systematic effects and/or differences
27.6 Multiple levels of clustering
• In mixed models, a ‘classical’ statistical model is extended with a random effect which accounts for variability between clusters.
• The same idea can be used for the analysis of data with multiple levels of clustering.
• As an example, re-consider the logistic mixed model used for the analysis of the ‘Washing without water’ data set on nursing home wards:
P (Yij = 1) = exp(bi + β0 + β1Iij + β2Dij) / [1 + exp(bi + β0 + β1Iij + β2Dij)],
with Yij the (binary) outcome on patient j within ward i.
• Suppose some wards belong to the same nursing home.
• We then have patients clustered within wards, which themselves are clustered within nursing homes.
• This additional level of clustering can be accommodated by adding an additional random effect for each nursing home, on top of the random effect for each ward.
• Let Ykij be the (binary) outcome on patient j, within ward i, in nursing home k.
• A mixed model which accounts for variability between wards and between nursing homes is:
P (Ykij = 1) = exp(ak + bki + β0 + β1Iij + β2Dij) / [1 + exp(ak + bki + β0 + β1Iij + β2Dij)].
• As before, the random effects are assumed normally distributed, but with different variances: ak ∼ N (0, σ2a), bki ∼ N (0, σ2b).
27.7 Other examples
Clustering =⇒ Correlation
• Residents clustered within wards
• Patients clustered within hospitals
• Ophthalmology studies: Eyes within patients (−→ paired t-test)
• Longitudinal studies (see later)
• . . .
27.8 Examples from the biomedical literature
• Smeds-Alenius et al. [34]
. Statistical methods section, p.120:
We used separate adjusted multivariate logistic regression models to estimate the relationship of RN assessed quality of care and RN assessed patient safety to 30-day inpatient mortality (LaValley, 2008). In all regression analyses, a mixed model approach with random intercept was used to correct for the dependency of observations within a hospital. Confidence intervals were set at 95%. Data were analyzed using SAS 9.4.
∗ RN: registered nurse
∗ Random effect for correction due to clustering in hospitals
Advanced statistical methods 663
. Table 3, p.121:
Table 3
Relationships between RNs who report excellent quality of care and/or patient safety and the outcome of 30-day inpatient mortality.

                                                  Unadjusted model              Adjusted model (a)
                                                  OR    95% CI     Pr > ChiSq   OR    95% CI     Pr > ChiSq
Quality of care
  Middle tertile vs lowest tertile hospitals      0.82  0.66–1.00  0.055        0.86  0.72–1.04  0.112
  Highest tertile vs lowest tertile hospitals     0.79  0.61–1.02  0.067        0.77  0.65–0.91  0.002
Patient safety
  Middle tertile vs lowest tertile hospitals      0.92  0.75–1.13  0.450        0.82  0.68–1.00  0.048
  Highest tertile vs lowest tertile hospitals     0.68  0.52–0.90  0.006        0.74  0.60–0.91  0.004

(a) Adjustments were made for patient characteristics (gender, age, comorbidities, surgical DRGs, emergency room admittance) and hospital characteristics (size, level of specialization, teaching status).
∗ Outcome: Death within 30 days of admission
∗ Factors of interest: Quality of care and patient safety, as reported by RNs
∗ Regression notation for the factors, after discretisation into 3 tertile groups
∗ Analysis with and without correction for patient and hospital characteristics
Advanced statistical methods 664
• Fisher et al. [35]
. Statistical analysis section, p.1846:
[. . . ] surgeon volume category. Multilevel logistic regression with surgeons and hospitals as crossed random effects was used to estimate odds ratios (OR) of receiving mastectomy by surgeon volume, adjusting for year of diagnosis, age at diagnosis, geographic region, ER/PR status, tumor size, and nodal status, as well as for interaction of all variables with stage. Postestimation lincom commands were used to calculate the OR for the variables of interest by stage. Crossed random effects were necessary because some surgeons operated out of multiple hospitals. Interaction between year of diagnosis and surgeon volume was evaluated and found to be nonsignificant. Empirical Bayes estimation was used to estimate adjusted OR for individual surgeons and hospitals. All statistical analyses were performed by SAS 9.3 statistical software (SAS Institute, Cary, NC, USA) and Stata 12.1 (StataCorp, College Station, TX, USA).
∗ Outcome: Receiving mastectomy
∗ Covariate of interest: Surgeon volume
∗ Clustering within hospitals
∗ Clustering within surgeons
∗ 2 random effects
∗ Logistic mixed model
∗ EB estimates used to get the bi
∗ ORs computed as exp(bi)
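The last two bullets amount to a single exponentiation: a surgeon's EB-estimated random intercept bi is turned into an odds ratio relative to the average surgeon via exp(bi). A small Python illustration with made-up EB estimates:

```python
import math

# Hypothetical EB estimates of surgeon-specific random intercepts b_i
eb_estimates = {"surgeon_A": 0.69, "surgeon_B": -0.92, "surgeon_C": 0.0}

# OR relative to the 'average' surgeon (b_i = 0) is exp(b_i):
odds_ratios = {s: math.exp(b) for s, b in eb_estimates.items()}

# b_i = 0 gives OR = 1 (indistinguishable from the average surgeon);
# b_i > 0 means higher odds of mastectomy than average, b_i < 0 lower.
```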
Advanced statistical methods 665
. Figure 1, p.1846:
[Figure: two panels of adjusted odds-ratio estimates (log scale, 0.1–8) for individual surgeons (panel a) and individual hospitals (panel b), relative to the Alberta average, with axis annotations ranging from ‘More BCS’ to ‘More Mastectomy’; very low volume surgeons (N=42), low volume surgeons (N=28), and low volume hospitals (N=17) are flagged]
FIG. 1 Empirical Bayes estimates of adjusted OR of mastectomy for breast cancer patients by (a) individual surgeon and (b) individual hospital, adjusting for patient characteristics and accounting for variation by surgeon volume
∗ Much more variability between surgeons than between hospitals
Advanced statistical methods 666
Chapter 28
Longitudinal data / Repeated measures
. Example: Longitudinal MMSE evolutions
. Repeated measures ANOVA
. Model extensions
. Examples from the biomedical literature
Advanced statistical methods 667
28.1 Example: Longitudinal MMSE evolutions
• In the delirium data set, MMSE was measured 5 times post operation, at days 1, 3, 5, 8, and 12
• This allows us to study how patients have evolved over time
• This also allows us to study how such evolutions depend on patient characteristics
• As an illustration, we want to investigate whether MMSE has a different evolution for neuro-psychiatric patients than for non-neuro-psychiatric patients
Advanced statistical methods 668
• Individual trends:
[Figure: individual MMSE profiles over days 1–12, by neuro-psychiatric status]
• Obviously, there is skewness in the data
Advanced statistical methods 669
• This is also supported by a histogram at time 1:
[Histogram of MMSE at time 1 (proportion vs MMSE), showing the skewness]
• Since all models for continuous data assume normality, we use an exponential transformation:
MMSE −→ exp(MMSE/30)
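The transformation and its inverse are one-liners; a minimal Python sketch (the function names are my own):

```python
import math

def to_analysis_scale(mmse):
    """Transform an MMSE score (range 0-30) as on the slide: exp(MMSE/30)."""
    return math.exp(mmse / 30.0)

def to_mmse_scale(y):
    """Back-transform an analysis-scale value to the original MMSE scale."""
    return 30.0 * math.log(y)

# The transformed scores live in the interval [1, e], i.e. roughly [1.00, 2.72]
transformed = [to_analysis_scale(s) for s in (0, 15, 30)]
```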
Advanced statistical methods 670
• New histogram (at time 1):
[Histogram of exp(MMSE/30) at time 1 (proportion vs transformed score, roughly 1.00–2.75), much more symmetric]
• Perfect normality is not obtained, but covariates have not been included yet and symmetry is much better satisfied than before.
Advanced statistical methods 671
• Individual trends after transformation:
[Figure: individual exp(MMSE/30) profiles over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 672
28.2 Repeated measures ANOVA
• Longitudinal data can be viewed as a particular instance of clustered data, in which observations are clustered within subjects
• Hence, mixed models can be used, the most popular one being repeated measures ANOVA
• The model is an extension of the ANOVA model with subject-specific random effects to account for between-subject variability
• As an illustration, we fit a two-way repeated measures ANOVA model with fixed factors ‘time,’ ‘neuro-status,’ and the interaction between both, and with random effects for the patients
Advanced statistical methods 673
• Results:
[Table of test results, not legible in the transcript]
• Note that omitting the random effect, i.e., ignoring the clustering, leads to different results:
[Table of test results without the random effect, not legible in the transcript]
Advanced statistical methods 674
• Predicted evolutions:
[Figure: predicted average evolutions of exp(MMSE/30) over days 1–12, by neuro-psychiatric status]
• There is no evidence for any interaction between ‘time’ and ‘neuro’, suggesting that the evolution over time is not different for the two neuro groups (p = 0.3482)
Advanced statistical methods 675
• Leaving out the interaction term leads to a model which assumes the same average evolutions for both groups:
[Figure: predicted average evolutions under the model without interaction, by neuro-psychiatric status]
Advanced statistical methods 676
• The test results for the resulting model are:
[Table of test results, not legible in the transcript]
• Hence, the general conclusion is:
. Both groups have the same evolution over time (p = 0.3482)
. The trend is not constant over time (p < 0.0001)
. Neuro-psychiatric patients, on average, have lower MMSE values (p < 0.0001)
• Note that the trend seems very minor, while being highly significant.
Advanced statistical methods 677
• This can be explained by the fact that the observed trend is on the transformed scale.
• Back-transforming the average trends leads to:
[Figure: back-transformed average MMSE trends over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 678
28.3 Model extensions
28.3.1 Inclusion of additional covariates and/or factors
• The repeated measures ANOVA model can be extended in various ways.
• The repeated measures ANOVA model can easily be extended with additional covariates or factors:
. To correct for differences between groups to be compared, e.g., age, gender
. To study how evolutions depend on additional patient characteristics, e.g., age, gender
• This turns the basic ANOVA model into a general linear model, extended with random effects
• The same ideas apply when analysing categorical longitudinal outcomes. For example, a binary outcome can be analysed using a logistic mixed model.
Advanced statistical methods 679
28.3.2 Categorical or continuous time effects ?
• In our example, ‘time’ was treated as a categorical factor. If the outcome shows a linear trend over time, ‘time’ can be treated as a continuous covariate.
• In our analysis of the MMSE outcome, let Yij be the jth measurement of MMSE for patient i, measured at time tj, and let Xi be a dummy variable for the neuro-status (1 for neuro-psychiatric patients).
• A model with linear time effect is given by:
Yij = bi + β0 + β1tj + β2Xi + β3tjXi + εij
    = bi + β0 + β1tj + εij,                 if not neuro-psychiatric
    = bi + β0 + β2 + (β1 + β3)tj + εij,     if neuro-psychiatric
• As before, the random effect bi ∼ N (0, σ²b) accounts for the variability between subjects, and hence for the correlation in the data.
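To make the two implied regression lines concrete, the sketch below evaluates the population-average response for each group; the coefficient values are hypothetical, not the actual estimates from the delirium data:

```python
# Hypothetical coefficients for the linear-time model on the exp(MMSE/30) scale
beta0, beta1, beta2, beta3 = 1.60, 0.02, -0.35, 0.005

def mean_response(t, neuro):
    """Population-average transformed MMSE at day t post-operation."""
    x = 1 if neuro else 0
    return beta0 + beta1 * t + beta2 * x + beta3 * t * x

# The model implies two straight lines, with slopes beta1 and beta1 + beta3:
slope_non_neuro = beta1
slope_neuro = beta1 + beta3
```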
Advanced statistical methods 680
• Graphically, this leads to two average regression lines, one for each group:
[Figure: two average regression lines of exp(MMSE/30) versus day, one per neuro group]
• The only difference when compared to ANOCOVA is that the random effect bi now accounts for clustering
Advanced statistical methods 681
• The test results for the resulting model are:
[Table of test results, not legible in the transcript]
• Again, there is no evidence for a different trend in the two neuro groups (p = 0.2601)
• Hence the model can be simplified by omitting the interaction:
Yij = bi + β0 + β1tj + β2Xi + εij
    = bi + β0 + β1tj + εij,            if not neuro-psychiatric
    = bi + β0 + β2 + β1tj + εij,       if neuro-psychiatric
Advanced statistical methods 682
• Obtained average trends:
[Figure: fitted average linear trends of exp(MMSE/30) over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 683
• Test results for the remaining main effects:
[Table of test results, not legible in the transcript]
• Hence, the general conclusion is:
. Both groups have the same linear evolution over time (p = 0.2601)
. The trend is not constant over time (p < 0.0001)
. Neuro-psychiatric patients, on average, have lower MMSE values (p < 0.0001)
• Note that the linear trend is on the transformed scale, needed to obtain valid inferences
Advanced statistical methods 684
• Back-transforming the average trends leads to:
¸¹º
» ¼
» ½
¾ ¼
¾ ½
¿ À Á Â Ã Ä Å Æ Ç È¼ » ¾ É Ê ½ Ë Ì Í Î » ¼ » » » ¾
Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ò Ú Ø Û Ý Þ Ï Ó Ü Ý Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ò Ú Ø
• While both back-transformed trends look linear, they are not, due to the non-linear transformation
Advanced statistical methods 685
28.3.3 Adding more random effects
• Let us re-consider the model with linear time-effect used to describe the non-neuro-psychiatric patients in the delirium study:
Yij = bi + β0 + β1tj + εij
• The model can be interpreted as an ANOCOVA model with covariate ‘time’ and a random patient factor, needed to account for clustering.
• The model can also be interpreted as a subject-specific regression model with subject-specific intercepts:
Yij = (bi + β0) + β1tj + εij
• Because the bi have mean zero, the average evolution in the population is:
Yij = β0 + β1tj + εij
Advanced statistical methods 686
• Individual subjects deviate from the average by having their own intercept bi + β0
• Graphical representation of this random-intercepts model:
[Figure: parallel subject-specific profiles scattered around the average line Y = β0 + β1t, with between-subject variance σ²b]
Advanced statistical methods 687
• As a consequence, the model assumes:
. Subjects show approximately parallel profiles
. The variability does not change over time
• Obviously, this is not always/often realistic, especially in studies with many repeated measurements and/or long follow-up times.
• A possible solution is to extend the model allowing for subject-specific slopes as well:
Yij = (bi0 + β0) + (bi1 + β1)tj + εij,
where now bi0 and bi1 are both assumed normally distributed with means 0 and variances σ²b0 and σ²b1, respectively.
• Because the random effects still have mean zero, the average evolution in the population still is:
Yij = β0 + β1tj + εij
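A quick simulation confirms that the subject-specific lines average out to the population line β0 + β1t; all parameter values below are invented for illustration:

```python
import random

random.seed(2)

# Hypothetical population parameters and random-effect standard deviations
beta0, beta1 = 1.5, 0.02
sigma_b0, sigma_b1 = 0.20, 0.01

def subject_line(t, b0, b1):
    """Subject-specific mean at time t, given random intercept b0 and slope b1."""
    return (b0 + beta0) + (b1 + beta1) * t

# Average many subject-specific lines at day t = 8:
t, n = 8.0, 20000
avg = sum(subject_line(t, random.gauss(0.0, sigma_b0),
                       random.gauss(0.0, sigma_b1)) for _ in range(n)) / n
# avg should be close to the population mean beta0 + beta1 * t = 1.66
```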
Advanced statistical methods 688
• Graphical representation of the model with random intercepts and slopes:
[Figure: subject-specific lines with their own intercepts (variance σ²b0) and slopes (variance σ²b1) around the average line Y = β0 + β1t]
• As before, EB estimates can be used to identify subjects with particular evolutions, i.e., particular intercepts and/or slopes.
Advanced statistical methods 689
28.4 Examples from the biomedical literature
• Malmstrom et al. [36]:
. Title:
The effect of a nurse led telephone supportive care programme on patients’ quality of life, received information and health care contacts after oesophageal cancer surgery—A six month RCT-follow-up study
Marlene Malmström a,b,c,*, Bodil Ivarsson a,c,d, Rosemarie Klefsgård a, Kerstin Persson a,b, Ulf Jakobsson c,e, Jan Johansson a,b,c
a Skåne University Hospital, Lund, Sweden; b Department of Surgery, Skåne University Hospital, Lund, Sweden; c Lund University, Sweden; d Department of Cardio-Thoracic Surgery, Skåne University Hospital, Lund, Sweden; e Center for Primary Health Care Research, Faculty of Medicine, Lund University, Sweden
∗ Randomized controlled trial (RCT)
∗ Longitudinal, 6-month follow-up
Advanced statistical methods 690
. Study design, Figure 2, p.89:
Fig. 2. Overview of the intervention divided on control group (CG) and intervention group (IG). Measurement points, Follow-up with surgeon.
∗ Five repeated measurements
∗ Measurements at discharge, 2w, 2m, 4m, and 6m (•)
Advanced statistical methods 691
. Statistical analysis section, p.90:
[. . . ] over time. To test if there was a significant difference over time between the groups a repeated measurements analysis of variance (ANOVA) was conducted. If Mauchly’s test of sphericity indicated violation, we used the Huynh-Feldt correction of the degree of freedom to achieve valid F-ratios to the analysis. A complete cases analysis was also conducted (Bennett, 2001).
∗ Repeated measures ANOVA to compare evolutions between both groups
∗ The ‘sphericity’ assumption is the assumption of parallel profiles for all subjects
∗ When sphericity is not satisfied, a correction is applied −→ Better to extend the model with random slopes
Advanced statistical methods 692
. Table 4, p.92:
Table 4
Mean value and standard deviation (SD) at each time-point and overall between group comparison for quality of life (QLQ-OG25) by intervention (IG) and control group (CG).
Discharge 2 week 2 month 4 month 6 month Between groups
IG CG IG CG IG CG IG CG IG CG P-valueb
n=41 n=39 n=40 n=34 n=38 n=32 n=34 n=25 n=25 n=23 (n=41/n=41)
QLQ-OG25 mean (SD)a
Dysphagia 32.2 (31.5) 37.3 (33.7) 28.5 (26.5) 19.6 (19.6) 21.3 (24.7) 19.8 (22.5) 15.7 (20.9) 8.9 (11.6) 13.8 (25.3) 7.7 (9.7) 0.222
Eating 53.8 (24.7) 61.0 (24.1) 57.9 (30.3) 49.2 (26.7) 38.6 (27.4) 38.0 (25.5) 33.9 (23.2) 31.7 (23.8) 28.7 (25.8) 29.3 (21.3) 0.840
Reflux 13.0 (17.7) 16.7 (26.8) 17.5 (26.7) 16.2 (21.9) 13.6 (17.7) 15.1 (22.1) 15.2 (19.4) 20.0 (20.4) 15.3 (19.2) 22.5 (25.9) 0.352
Odynophagia 19.6 (26.7) 18.4 (17.7) 22.2 (22.7) 14.7 (22.4) 19.3 (23.4) 15.1 (13.2) 21.1 (29.4) 14.7 (14.7) 17.3 (21.8) 13.8 (11.9) 0.116
Pain and discomfort 22.9 (24.1) 22.4 (24.0) 31.2 (28.0) 19.1 (21.0) 28.9 (26.5) 20.0 (19.0) 27.9 (31.4) 24.7 (22.6) 23.3 (24.1) 24.6 (22.4) 0.163
Anxiety 53.3 (25.1) 55.1 (33.4) 48.7 (26.0) 46.6 (31.2) 43.4 (23.7) 40.9 (26.8) 41.7 (25.0) 37.3 (26.9) 41.3 (26.8) 39.9 (26.4) 0.677
Eating with others 16.7 (28.2) 14.8 (24.5) 12.3 (22.5) 18.8 (28.0) 11.4 (23.6) 8.0 (14.5) 17.6 (24.9) 5.3 (15.8) 16.0 (25.7) 11.6 (16.2) 0.358
Dry mouth 60.0 (32.2) 56.4 (35.2) 58.3 (33.5) 48.0 (33.0) 27.2 (27.8) 32.3 (31.6) 24.5 (29.9) 29.3 (29.4) 28.0 (31.4) 18.8 (24.3) 0.507
Trouble with taste 38.3 (31.6) 40.7 (32.0) 40.8 (38.9) 41.4 (32.3) 32.5 (34.2) 30.0 (29.5) 25.5 (30.8) 24.0 (31.2) 16.0 (25.7) 21.7 (29.5) 0.816
Body image 40.8 (42.4) 52.6 (35.2) 35.0 (36.2) 41.4 (37.3) 26.3 (32.1) 25.8 (29.5) 20.6 (30.7) 18.7 (27.4) 25.3 (35.1) 24.6 (25.1) 0.481
Trouble swallowing saliva 13.8 (19.7) 21.4 (26.0) 8.3 (18.1) 18.6 (29.8) 11.4 (20.9) 6.5 (15.9) 12.7 (26.0) 5.3 (12.5) 14.7 (27.4) 4.3 (11.5) 0.737
Choked when swallowing 13.3 (30.0) 12.3 (19.6) 13.3 (25.9) 14.1 (20.5) 16.7 (22.9) 20.4 (26.8) 14.7 (18.7) 12.0 (19.0) 12.0 (16.3) 11.6 (16.2) 0.978
Trouble with coughing 45.0 (29.8) 44.4 (31.8) 45.0 (28.8) 45.1 (27.1) 43.9 (28.1) 50.5 (32.1) 35.3 (25.9) 33.3 (30.4) 26.7 (30.4) 31.9 (30.9) 0.646
Trouble with talking 24.8 (28.3) 29.1 (30.8) 17.5 (23.9) 18.6 (27.5) 13.2 (27.4) 12.9 (25.4) 15.7 (27.5) 13.3 (25.5) 16.0 (29.1) 10.1 (21.2) 0.876
Weight loss 15.0 (23.1) 23.1 (30.7) 32.5 (33.3) 36.3 (32.2) 24.6 (29.7) 31.2 (29.7) 33.3 (33.8) 25.3 (30.9) 22.7 (31.5) 30.4 (33.2) 0.421
a Score range 0–100. A high score represents a higher level of symptoms/problems (worse).
b Repeated measurements ANOVA. Based on items/scales with mean value imputation.
Advanced statistical methods 693
• Kruse et al. [37]
. Title:
[Title and author/affiliation block not legible in the transcript]
Advanced statistical methods 694
. Statistical analysis section, p.1912:
[Quoted statistical analysis passage, largely illegible in the transcript]
∗ Subject-specific (random) intercepts and slopes
∗ Variances of random intercepts and slopes are allowed to be different before andafter hospitalization.
Advanced statistical methods 695
. Figure 2, p.1921:
= > ? @ A B C DE F G H I J G E K L M H I N G O M P H Q G R S H G T Q O M G T U V H G J H G R R Q P W X P T G Y Z [ I U Y G \ I ] ^ P H W _ H R Q W J ` P X GH G R Q T G W M R ` P R S Q M I Y Q a G T ^ P H ` Q S ^ H I O M _ H G P H S W G _ X P W Q I b c G O I _ R G M ` G S H Q X I H V ` P R S Q M I Y T Q I J W P R Q RQ R W P M S H G R G W M U G ^ P H G M ` G I O _ M G G F G W M d M ` G S H G e ` P R S Q M I Y M H I N G O M P H V Q R Q T G W M Q O I Y ^ P H M ` G M f PT Q I J W P R G R b [ ` G R Y P f S H G e ` P R S Q M I Y f P H R G W Q W J Q W E K L ^ _ W O M Q P W Q R ^ P Y Y P f G T U V S H G O Q S Q M P _ Rf P H R G W Q W J R _ H H P _ W T Q W J M ` G I O _ M G ` P R S Q M I Y Q a I M Q P W b [ ` G I X P _ W M P ^ f P H R G W Q W J F I H Q G R U VT Q I J W P R Q R I W T Q R G g _ I Y M P M ` G Q W M G H O G S M R Q W [ I U Y G \ I h ` Q S ^ H I O M _ H G f I R I R R P O Q I M G T f Q M ` I i b j k eS P Q W M O ` I W J G I W T S W G _ X P W Q I f I R I R R P O Q I M G T f Q M ` I l b i m e S P Q W M O ` I W J G b n W I F G H I J G d ` Q S^ H I O M _ H G S I M Q G W M R Q X S H P F G ^ P Y Y P f Q W J ` P R S Q M I Y T Q R O ` I H J G f ` Q Y G H G R Q T G W M R ` P R S Q M I Y Q a G T ^ P HS W G _ X P W Q I Z I W T M ` G P M ` G H T Q I J W P R G R ] O P W M Q W _ G M P f P H R G W b o P H S _ H S P R G R P ^ Q Y Y _ R M H I M Q P W d M ` G
Advanced statistical methods 696
Chapter 29
Missing observations
. Introduction
. How not to handle missing data ?
. How to handle missing data ?
. Examples from the biomedical literature
Advanced statistical methods 697
29.1 Introduction
• For example, the plot with individual profiles of MMSE evolutions in the delirium data set suggests dropout:
[Figure: individual MMSE profiles over days 1–12, with profiles ending early due to dropout]
Advanced statistical methods 698
• Complete data sets are rare in practice
• Missing observations not only imply loss of power, but more importantly may also imply biased results
• Problematic case:
Probability for an observation to be missing is related to the observation itself
• How to handle missingness in a data set ?
• This will be illustrated in the context of longitudinal data, but the ideas apply equally well to all other contexts
Advanced statistical methods 699
29.2 How not to handle missing data ?
• Consider data from a longitudinal study with 20 subjects, measured at baseline and followed by 6 weekly visits:
[Figure: complete data — 20 individual profiles over weeks 0–6]
Advanced statistical methods 700
• Due to dropout, not all subjects have been followed up to week 6:
[Figure: data with dropout — individual profiles over weeks 0–6, ending at dropout]
• Let us compare various common approaches to handle missingness, when interest is in estimation of the average trend
Advanced statistical methods 701
• Averaging the observed values at each visit:
[Figure: observed data with the visit-specific averages of the observed values superimposed]
=⇒
Correct at visits without missing observations
Biased at visits with missing observations
Advanced statistical methods 702
• Averaging the values of the complete cases only:
[Figure: averages based on the complete cases only]
=⇒
Biased at visits without missing observations
Biased at visits with missing observations
Advanced statistical methods 703
• Averaging after last observation carried forward (LOCF):
[Figure: averages after last observation carried forward]
=⇒
Biased at visits with missing observations
Distorted association structure (→ p-values)
Advanced statistical methods 704
• Averaging after mean imputation:
[Figure: averages after mean imputation]
=⇒
Biased at visits with missing observations
Distorted association & variance structure (→ p-values)
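The mechanics of LOCF and mean imputation are easy to mimic on toy data; a stdlib-only Python sketch (the numbers are invented, not the study data):

```python
def locf(series):
    """Last observation carried forward: fill each None with the last seen value."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out

def mean_impute(visit_columns):
    """Replace None in each visit column by the mean of the observed values."""
    filled = []
    for col in visit_columns:
        observed = [v for v in col if v is not None]
        m = sum(observed) / len(observed)
        filled.append([m if v is None else v for v in col])
    return filled

# One subject who drops out after the third visit:
print(locf([20, 25, 28, None, None]))        # -> [20, 25, 28, 28, 28]
```

Both methods fabricate values: LOCF freezes the trajectory at dropout, while mean imputation shrinks everyone towards the visit mean, which is exactly why the averaged curves on the previous slides are biased.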
Advanced statistical methods 705
29.3 How to handle missing data ?
• No uniformly best answer:
. Depends on nature of missingness
. Depends on outcome type
. Depends on research question
. Depends on model considered
. . . .
• All methods rely on assumptions about the relation between the probability for an observation to be missing and the observation itself
=⇒ Untestable assumptions
Advanced statistical methods 706
• Multiple imputation (M = 5 imputations):
Observed data −→ Imputed 1, . . . , Imputed 5   (Imputation)
Imputed m −→ Results m, for m = 1, . . . , 5    (Analysis)
Results 1, . . . , Results 5 −→ Final results   (Combination)
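The combination step is typically done with Rubin's rules: the pooled estimate is the average of the M completed-data estimates, and its variance adds the between-imputation variability to the average within-imputation variance. A minimal sketch with invented numbers:

```python
import math

def pool_rubin(estimates, variances):
    """Combine M completed-data analyses with Rubin's rules."""
    M = len(estimates)
    qbar = sum(estimates) / M                       # pooled estimate
    within = sum(variances) / M                     # average within-imputation var
    between = sum((q - qbar) ** 2 for q in estimates) / (M - 1)
    total = within + (1.0 + 1.0 / M) * between      # total variance
    return qbar, math.sqrt(total)

# Hypothetical results from M = 5 imputed data sets (estimates, variances):
est, se = pool_rubin([1.2, 1.1, 1.3, 1.15, 1.25],
                     [0.04, 0.05, 0.04, 0.05, 0.04])
```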
Advanced statistical methods 707
• Advantages:
. Correctly accounts for uncertainty about imputed values
. Imputation can be based on observed information (covariates, outcomes)
. Expert opinion
. Various imputation models can be explored (−→ sensitivity analyses)
. Relatively straightforward to implement
• Often, a small number M of imputations is sufficient (M = 3, 5)
• Alternative approaches rely on jointly modeling the outcome and the dropout/missingness process
• Such methods are less generally applicable and/or more difficult to implement
Advanced statistical methods 708
29.4 Examples from the biomedical literature
• Malmstrom et al. [36], statistical analysis section, p.90:
3.10. Statistical analysis
The responses to instrument items were transformed to scale scores according to the instructions from the providers (Fayers et al., 2001). Imputations of missing values were performed in two steps. Missing values within the forms were replaced according to the scoring manual of the instrument (Fayers et al., 2001) and missing values due to missing forms were replaced with mean value imputation. The analyses were conducted according to the Intention-To-Treat principle (Altman, 1991). A priori, we decided [. . . ]
∗ Imputation according to scoring manuals is also (single) imputation, and to be avoided !
∗ Mean value imputation !
Advanced statistical methods 709
• Zimmerman et al. [38], statistical analysis section, p.106:
[. . . ] who had fully completed the baseline assessment, were assessed again at week 8 post baseline and 12 months post baseline. If patients did not answer the invitation for assessment or could not be reached at all (i.e. if they dropped out before these assessments), their last available values were used, their last observation was carried forward (LOCF method) for the ITT-analysis. We reported the results as adjusted mean differences with their 95% confidence intervals. In a sensitivity analysis, we calculated results of the observed cases (OC) for the primary outcome. This analysis will include only those patients who did not drop out and completed their final assessment. In a second sensitivity analysis, we replaced missing values using a multiple imputation approach (N = 100 imputations). Analyses were done using Stata 14.
∗ Primary analysis based on LOCF
∗ Sensitivity analyses based on complete cases and on multiple impuation
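LOCF, as used in the primary analysis above, freezes each dropout at their last recorded value. A minimal stdlib Python sketch (the measurement values are hypothetical, for illustration only):

```python
def locf(series):
    """Last observation carried forward: replace each missing value
    (None) with the most recent observed value. Values before the
    first observation stay missing."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical scores at baseline, week 8 and month 12;
# the patient dropped out after week 8:
print(locf([12.0, 10.5, None]))   # -> [12.0, 10.5, 10.5]
```

The sketch makes the implicit assumption visible: the dropout's outcome is assumed not to change after the last visit, which is why LOCF is a strong and usually unrealistic single-imputation strategy.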
• Kruse et al. [37], statistical analysis section, p.1912:
[Quoted passage not legible in this transcript.]
∗ Considerable dropout due to death or re-admission
∗ Dropout believed to be potentially related to the outcome studied (ADL)
∗ The dropout mechanism has been jointly modeled with the longitudinal outcome:
1. ADL: mixed model with random intercepts and slopes
2. Two time-to-event models for death and re-admission; the log-normal model is an alternative to Cox regression
Bibliography
[1] C.A. Wong, B.M. Scavone, A.M. Peaceman, et al. The risk of cesarean delivery with neuraxial analgesia given early versus late in labor. The
New England Journal of Medicine, 352:655–665, 2005.
[2] A.I. Amin, O. Hallbook, A.J. Lee, R. Sexton, B.J. Moran, and R.J. Heald. A 5-cm colonic J pouch colo-anal reconstruction following anterior resection for low rectal cancer results in acceptable evacuation and continence in the long term. Colorectal Disease, 5:33–37, 2003.
[3] S. Kaplan, S. Etlin, I. Novikov, and B. Modan. Occupational risks for the development of brain tumours. American Journal of Industrial
Medicine, 31:15–20, 1997.
[4] Y. Baba, J.D. Putzke, N.R. Whaley, Z.K. Wszolek, and R.J. Uitti. Gender and the Parkinson's disease phenotype. Journal of Neurology, 252:1201–1205, 2005.
[5] K.M. Kellett, D.A. Kellett, and L.A. Nordholm. Effects of an exercise program on sick leave due to back pain. Physical Therapy, 71:283–293, 1991.
[6] S.E. Nissen, E.M. Tuzcu, P. Schoenhagen, et al. Statin therapy, LDL cholesterol, C-reactive protein, and coronary artery disease. The New
England Journal of Medicine, 352:29–38, 2005.
[7] T. Shatari, M.A. Clark, T. Yamamoto, A. Menon, C. Keh, J. Alexander-Williams, and M. Keighley. Long strictureplasty is as safe and effective as short strictureplasty in small-bowel Crohn's disease. Colorectal Disease, 6:438–441, 2004.
[8] P. Serrano-Gallardo, M. Martínez-Marcos, F. Espejo-Matorrales, T. Arakawa, G.T. Magnabosco, and I.C. Pinto. Factors associated to clinical learning in nursing students in primary health care: An analytical cross-sectional study. Revista Latino-Americana de Enfermagem, 24:e2803, 2016.
[9] A. Salehi, M. Marzban, M. Sourosh, F. Sharif, M. Nejabat, and M.H. Imanieh. Social well-being and related factors in students of school of nursing and midwifery. International Journal of Community Based Nursing and Midwifery, 5:82–90, 2017.
[10] P. Kiekkas, H. Brokalaki, E. Manolis, A. Samios, C. Skartsani, and G. Baltopoulos. Patient severity as an indicator of nursing workload in the intensive care unit. Nursing in Critical Care, 12:34–41, 2007.
[11] M. Frilund and L. Fagerstrom. Managing the optimal workload by the PAONCIL method – A challenge for nursing leadership in care of older people. Journal of Nursing Management, 17:426–434, 2009.
[12] S. Bjork, M. Lindkvist, A. Wimo, C. Juthberg, A. Bergland, and D. Edvardsson. Residents' engagement in everyday activities and its association with thriving in nursing homes. Journal of Advanced Nursing, 47:http://dx.doi.org/10.1111/jan.13275, 2017.
[13] K.H. Archbold, B. Giordani, D.L. Ruzicka, and R.D. Chervin. Cognitive executive dysfunction in children with mild sleep-disordered breathing. Biological Research for Nursing, 5:168–176, 2004.
[14] S.M. van Hooft, J. Dwarswaard, R. Bal, M. Strating, and A. van Staa. What factors influence nurses' behavior in supporting patient self-management? An explorative questionnaire study. International Journal of Nursing Studies, 63:65–72, 2016.
[15] W.Y. Huang, C.C. Chang, D.R. Chen, C.T. Kor, T.Y. Chen, and H.M. Wu. Circulating leptin and adiponectin are associated with insulin resistance in healthy postmenopausal women with hot flashes. PloS One, 12:e0176430, 2017.
[16] S. Bjork, H. Lovheim, M. Lindkvist, A. Wimo, and D. Edvardsson. Thriving in relation to cognitive impairment and neuropsychiatric symptoms in Swedish nursing home residents. International Journal of Geriatric Psychiatry, 32:http://dx.doi.org/10.1002/gps.4714, 2017.
[17] R.M. Collard, M. Arts, H.C. Comijs, P. Naarding, P.F.M. Verhaak, M.W. de Waal, and R.C. Oude Voshaar. The role of frailty in the association between depression and somatic comorbidity: Results from baseline data of an ongoing prospective cohort study. International Journal of Nursing Studies, 52:188–196, 2015.
[18] B.L. Blomquist, P.D. Cruise, and R.J. Cruise. Values of baccalaureate nursing students in secular and religious schools. Nursing Research, 29:379–383, 1980.
[19] B.P. Richardson, A.E. Ondracek, and D. Anderson. Do student nurses feel a lack of comfort in providing support for Lesbian, Gay, Bisexual or Questioning adolescents: What factors influence their comfort level? Journal of Advanced Nursing, 73:1196–1207, 2016.
[20] D. Ausili, P. Rebora, S. Di Mauro, B. Riegel, M.G. Valsecchi, M. Paturzo, R. Alvaro, and E. Vellone. Clinical and socio-demographic determinants of self-care behaviours in patients with heart failure and diabetes mellitus: A multicentre cross-sectional study. International Journal of Nursing Studies, 63:18–27, 2016.
[21] E. Hahnel, U. Blume-Peytavi, C. Trojahn, G. Dobos, A. Stroux, N. Garcia Bartels, I. Jahnke, A. Lichterfeld-Kottner, H. Neels-Herzmann, A. Klasen, and J. Kottner. The effectiveness of standardized skin care regimens on skin dryness in nursing home residents: A randomized controlled parallel-group pragmatic trial. International Journal of Nursing Studies, 70:1–10, 2017.
[22] J.C. Silva, Z. Viera de Moraes, C. Aparecida da Silva, S. de Barros Mazon, M.E. Guariento, A. Liberalesso Neri, and A. Fattori. Understanding red blood cell parameters in the context of the frailty phenotype: Interpretations of the FIBRA (Frailty in Brazilian Seniors) study. Archives of Gerontology and Geriatrics, 59:636–641, 2014.
[23] K.J. Moon and S.M. Lee. The effects of a tailored intensive care unit delirium prevention protocol: A randomized controlled trial. International
Journal of Nursing Studies, 52:1423–1432, 2015.
[24] E. Cameron and L. Pauling. Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer. Proceedings of the National Academy of Science U.S.A., 75:4538–4542, 1978.
[25] D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway, and E. Ostrowski. A handbook of small datasets. Chapman & Hall, first edition, 1989.
[26] R. Peto, M.C. Pike, P. Armitage, N.E. Breslow, D.R. Cox, S.V. Howard, N. Mantel, K. McPherson, J. Peto, and P.G. Smith. Design and analysis of randomised clinical trials requiring prolonged observation of each patient. British Journal of Cancer, 35:1–35, 1977.
[27] P.D. Allison. Survival analysis using the SAS system: A practical guide. NC: SAS Institute, 1995.
[28] F. Blanchon, M. Grivaux, B. Asselain, et al. 4-year mortality in patients with non-small-cell lung cancer: development and validation of a prognostic index. Lancet Oncology, 7:829–836, 2006.
[29] J.P. Klein and M.L. Moeschberger. Survival analysis: Techniques for censored and truncated data. Springer Verlag New York, 1997.
[30] T. Nawrot, M. Plusquin, J. Hogervorst, et al. Environmental exposure to cadmium and risk of cancer: a prospective population-based study. The
Lancet Oncology, 7:119–126, 2006.
[31] L.F. Hutchins, S.J. Green, P.M. Ravdin, D. Lew, S. Martino, M. Abeloff, A.P. Lyss, C. Allred, S.E. Rivkin, and C.K. Osborne. Randomized, controlled trial of Cyclophosphamide, Methotrexate, and Fluorouracil versus Cyclophosphamide, Doxorubicin, and Fluorouracil with and without Tamoxifen for high-risk, node-negative breast cancer: Treatment results of intergroup protocol INT-0102. Journal of Clinical Oncology, 23:8313–8321, 2005.
[32] P.J. Brown, S.P. Roose, R. Fieo, X. Liu, T. Rantanen, J. Sneed, B.R. Rutherford, D.P. Devanand, and K. Avlund. Frailty and depression in older adults: A high-risk clinical population. The American Journal of Geriatric Psychiatry, 22:1083–1095, 2014.
[33] L. Schoonhoven, B.G. van Gaal, S. Teerenstra, E. Adang, C. van der Vleuten, and T. van Achterberg. Cost-consequence analysis of "washing without water" for nursing home residents: A cluster randomized trial. International Journal of Nursing Studies, 52:112–120, 2015.
[34] L. Smedts-Alenius, C. Tishelman, R. Lindqvist, and S. Runesdotter. RN assessments of excellent quality of care and patient safety are associated with significantly lower odds of 30-day inpatient mortality: A national cross-sectional study of acute-care hospitals. International Journal of Nursing Studies, 61:117–124, 2016.
[35] S. Fisher, Y. Yasui, K. Dabbs, and M. Winget. Using multilevel models to explain variation in clinical practice: Surgeon volume and the surgical treatment of breast cancer. Annals of Surgical Oncology, 23:1845–1851, 2016.
[36] M. Malmstrom, B. Ivarsson, R. Klafsgard, and K. Persson. The effect of a nurse led telephone supportive care programme on patients' quality of life, received information and health care contacts after oesophageal cancer surgery – A six month RCT-follow-up study. International Journal of Nursing Studies, 64:86–95, 2016.
[37] R.L. Kruse, G.F. Petroski, D.R. Mehr, J. Banaszak-Holl, and O. Intrator. Activities of daily living (ADL) trajectories surrounding acute hospitalisation of long-stay nursing home residents. Journal of the American Geriatrics Society, 61:1909–1918, 2013.
[38] T. Zimmermann, E. Puschmann, H. van den Bussche, B. Wiese, A. Ernst, S. Porzelt, A. Daubmann, and M. Scherer. Collaborative nurse-led self-management support for primary care patients with anxiety, depressive or somatic symptoms: Cluster-randomized controlled trial (findings of the SMADS study). International Journal of Nursing Studies, 63:101–111, 2016.