Advanced statistical methods
Master in nursing & midwifery, KU Leuven
Geert Verbeke
Interuniversity Institute for Biostatistics and statistical Bioinformatics
http://gbiomed.kuleuven.be/biostat/geertverbeke
Table of Contents

I Introductory material
1 Introduction
2 Central data set
3 What is statistics ?
4 Hypothesis testing
5 Confidence intervals
6 Some frequently used tests

II Critical appraisal of literature
7 Errors in statistics: Basic concepts
8 Errors in statistics: Practical implications

III Simple linear regression
9 The Pearson correlation coefficient
10 Simple linear regression
11 Model diagnostics
12 Influential observations

IV One-way analysis of variance
13 The unpaired t-test
14 1-way ANOVA

V Multiple linear regression
15 Multiple linear regression
16 Polynomial regression
17 Interaction

VI Analysis of variance with multiple factors
18 Multiple analysis of variance

VII Analysis of covariance and the general linear model
19 Analysis of covariance
20 The general linear model
21 Regression notation of a general linear model

VIII Models for binary outcomes
22 Simple logistic regression
23 Multiple logistic regression

IX Models for time-to-event data
24 Survival analysis without censoring
25 Survival analysis with censoring
26 Regression for survival data

X Further Topics
27 Clustered data
28 Longitudinal data / Repeated measures
29 Missing observations

Bibliography
Part I
Introductory material
Chapter 1
Introduction
. Motivation
. Course material
. Examination and evaluation
1.1 Motivation
• Master thesis → research track
• Statistics in (bio-)medical literature → critical reading / appraisal
• Correct analysis of data collected
• Correct interpretation of results obtained
1.2 Course material
• Copies of slides: Toledo
• Publications discussed during course: Library (available online)
• Datasets used during course: Toledo
• Statistica software:
. Available in all K.U.Leuven PC classes
. Available via LUDIT: https://icts.kuleuven.be/sc/english/index
. . . .
• Other packages (SAS, SPSS, . . . ) allowed
• Vestac JAVA applets
. JAVA applets for the visualization of statistical concepts
. Download from: http://lstat.kuleuven.be/newjava/vestac/
1.3 Examination & evaluation
• Critical appraisal of literature (Part A):
. Individual task
. Critical reading of literature
• Data analysis (Part B):
. Take-home team project (3-4 students per team)
. Data analysis and reporting of results
• Reporting and presentation:
. Written reports about Part A & Part B submitted prior to oral exam
. Mid-term individual presentation of intermediate results of Part B
. Individual presentation of results of Part A & Part B at oral exam
Chapter 2
Central data set
. Introduction
. Problem setting
. Sample
. Data collected
2.1 Introduction
• These data are central to this part.
• Origin: Prof. Dr. Koen Milisen, AccentVV, KU Leuven.
• Data available to students in the context of this course, but cannot be distributed further
2.2 Problem setting
• Research into post-operative variability in the neuro-cognitive and functional status in elderly hip fracture patients.
• A surgical intervention in elderly patients often results in acute cognitive dysfunction (= delirium).
• Delirium versus dementia:
. Delirium: acute onset; usually temporary
. Dementia: no acute onset; slowly progressing; irreversible
• Delirium . . .
. leads to medical problems and problems of care
. often is the first symptom of a physical disorder or intoxication stemming from medicines
. can lead to increased mortality
. is hard to detect
• Economic implications of delirium:
. Extra care
. Longer hospital stay
. High degree of institutionalization
• Research suggests that, among elderly hip fracture patients, the increased degree of dependence is a consequence of delirium, rather than of the hip fracture itself.
2.3 Sample
• Longitudinal design: Certain variables are measured repeatedly over time.
• Prospective (e.g., complications) and retrospective (e.g., living conditions) measurements.
• Data from 2 traumatology units of University Hospital Gasthuisberg, KU Leuven.
• Inclusion criteria:
. ≥ 65 years of age
. hospitalized with hip fracture in the emergency room
. consent for participation in the study
. . . .
• Exclusion criteria:
. time between admission and operation ≥ 72 hours
. various traumas
. . . .
• Data collected 16/09/1996–28/02/1997.
2.4 Data collected
• Statistica file: delirium.sta
• Data on 60 patients
• 78 variables
• Data for every patient prior to, during, and after the operation
• Longitudinal and derived measurements
• Study questionnaire, ADL score, MMSE, and CAM scores
2.4.1 Pre-operative evaluation
Variable Description Values
nummer patient number 1–60
leeftd age (years)
gesl sex 1=male; 2=female
opnduur length of stay (days)
burgst civil status 1=single; 2=married; 3=widow(er); 4=divorced; 5=religious
opleid education 1=university/college; 2=high school; 3=lower secondary; 4=primary
zijfrc side of fracture 1=left; 2=right
typfrc type of fracture 1=intra-capsular; 2=extra-capsular
cardio cardiologic pathology 0=no; 1=yes
vascul vascular pathology 0=no; 1=yes
Variable Description Values
pulmon pulmonary pathology 0=no; 1=yes
urinai urinary pathology 0=no; 1=yes
abdom abdominal pathology 0=no; 1=yes
hyper hypertension 0=no; 1=yes
zicht vision pathology 0=no; 1=yes
gehoor auditory pathology 0=no; 1=yes
malign malignant disease 0=no; 1=yes
diabet diabetes 0=no; 1=yes
reumat rheumatological pathology 0=no; 1=yes
vrop past surgery 0=no; 1=yes
neuro neuro-psychiatric pathology 0=no; 1=yes
andere other pathology 0=no; 1=yes
2.4.2 Operative evaluation
Variable Description Values
opnop time hospitalization–operation 1=emergency; 2=<24 hours; 3=<48 hours; 4=<72 hours
soorin type of surgery 1=internal fixation; 2=THP; 3=BHP; 4=DHS
percom per-operative complications 1=yes; 2=no
opduur duration of surgery 1=<45 min; 2=45–90 min; 3=90–120 min; 4=>120 min
bloed blood loss 1=<300 ml; 2=300–1000 ml; 3=>1000 ml
anes anesthesia 1=local; 2=spinal; 3=complete
2.4.3 Post-operative evaluation
Variable Description Values
no mechanical complications 0=no; 1=yes
luxa luxation of prosthesis 0=no; 1=yes
impla implantation problems 0=no; 1=yes
anmec other mechanical problems 0=no; 1=yes
nolok local complications 0=no; 1=yes
opper superficial wound problems 0=no; 1=yes
diep deep infection 0=no; 1=yes
anlok other local complications 0=no; 1=yes
gen general complications 0=no; 1=yes
doorli decubitus 0=no; 1=yes
diephl deep phlebothrombosis 0=no; 1=yes
pulemb pulmonary embolism 0=no; 1=yes
Variable Description Values
urin urinary complications 0=no; 1=yes
ander other respiratory problems 0=no; 1=yes
cardi cardiologic complications 0=no; 1=yes
cere cerebral complications 0=no; 1=yes
autre other general complications 0=no; 1=yes
gn intake medication 0=no; 1=yes
dig intake digitalis 0=no; 1=yes
diur intake diuretics 0=no; 1=yes
bblo intake β-blocker 0=no; 1=yes
benz intake benzodiazepines 0=no; 1=yes
anti intake anticholinergics 0=no; 1=yes
neur intake neuroleptics 0=no; 1=yes
Variable Description Values
depres02 intake anti-depressants 0=no; 1=yes
other intake other medication 0=no; 1=yes
ontsl discharge to 1=home; 2=daughter/son; 3=geriatric ward; 4=revalidation unit; 5=psychiatric unit; 6=RH/RVT; 7=convent; 8=other
dood death during hospitalisation 1=yes; 2=no
2.4.4 Longitudinal and derived measures
Variable Description Values
sencam CAM result on day 1 1=delirium; 2=no delirium
sencam03 CAM result on day 3
sencam05 CAM result on day 5
senverw Was CAM result ever equal to 1 ? 0=no; 1=yes
adltot1 ADL score on day 1 6-24; 6=not dependent; 24=very dependent
adltot5 ADL score on day 5
adltot12 ADL score on day 12
MMSE1 MMSE score on day 1 0-30; 0=extreme confusion; 30=no confusion
MMSE3 MMSE score on day 3
MMSE5 MMSE score on day 5
MMSE8 MMSE score on day 8
MMSE12 MMSE score on day 12
CAM : Confusion Assessment Method, measured on days 1,3,5,8,12
ADL : Activities of Daily Living, measured on days 1,5,12
MMSE : Mini Mental State Examination, measured on days 1,3,5,8,12
Chapter 3
What is statistics ?
. Example
. Population – sample
. Random variability
3.1 Example: Captopril data
• 15 patients with hypertension
• The response of interest is the supine blood pressure, before and after treatment with CAPTOPRIL
• Research question:
How does treatment affect BP ?
• Dataset ‘Captopril’
Before After
Patient SBP DBP SBP DBP
1 210 130 201 125
2 169 122 165 121
3 187 124 166 121
4 160 104 157 106
5 167 112 147 101
6 176 101 145 85
7 185 121 168 98
8 206 124 180 105
9 173 115 147 103
10 146 102 136 98
11 174 98 151 90
12 201 119 168 98
13 198 106 179 110
14 148 107 129 103
15 154 100 131 82
Average (mm Hg)
Diastolic before: 112.3
Diastolic after: 103.1
Systolic before: 176.9
Systolic after: 158.0
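These averages can be verified directly from the table; a minimal Python sketch, with the data typed in from the table above:

```python
from statistics import mean

# Captopril data: supine blood pressure (mm Hg) in 15 hypertensive patients,
# typed in from the table above (SBP = systolic, DBP = diastolic).
sbp_before = [210, 169, 187, 160, 167, 176, 185, 206, 173, 146, 174, 201, 198, 148, 154]
sbp_after  = [201, 165, 166, 157, 147, 145, 168, 180, 147, 136, 151, 168, 179, 129, 131]
dbp_before = [130, 122, 124, 104, 112, 101, 121, 124, 115, 102,  98, 119, 106, 107, 100]
dbp_after  = [125, 121, 121, 106, 101,  85,  98, 105, 103,  98,  90,  98, 110, 103,  82]

print(round(mean(dbp_before), 1))  # 112.3
print(round(mean(dbp_after), 1))   # 103.1
print(round(mean(sbp_before), 1))  # 176.9
print(round(mean(sbp_after), 1))   # 158.0
```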
• It would be of interest to know how likely the observed changes in BP are to occur by pure chance.
• If this is very unlikely, the above data provide evidence that BP indeed decreases after treatment with Captopril. Otherwise, the above data do not provide evidence for efficacy of Captopril.
• Obviously, we are not interested in drawing conclusions about the 15 observed patients only.
• Instead, we would like to draw conclusions about the effect of Captopril on the total population of all hypertensive patients.
• Conclusion:
Statistics aims at drawing conclusions about some population, based on what has been observed in a random sample
[Diagram: a random SAMPLE is drawn from the POPULATION; STATISTICS uses the sample to draw conclusions about the population. Population level: effect of Captopril in the population. Sample level: effect of Captopril in 15 patients.]
3.2 Population versus random sample
• Population: Hypothetical group of current and future subjects, with a specific condition, about which conclusions are to be drawn
• Sample: Subgroup from the population on which observations will be taken
• In order for effects observed in the sample to be generalizable to the total population, the sample should be taken at random
3.3 Random variability
• Descriptive statistics of the observed differences in diastolic BP, after treatment with Captopril, in 15 subjects:
Before After Change
Patient DBP DBP
1 130 125 5
2 122 121 1
3 124 121 3
4 104 106 −2
5 112 101 11
6 101 85 16
7 121 98 23
8 124 105 19
9 115 103 12
10 102 98 4
11 98 90 8
12 119 98 21
13 106 110 −4
14 107 103 4
15 100 82 18
• Note that not all subjects experience the same benefit from the treatment
• An average decrease of 9.27 mm/Hg is observed in our sample
• A new, similar, experiment would lead to another sample, hence to another observed change in BP:
. More reduction (11.57 mm/Hg) ?
. Less reduction (4.78 mm/Hg) ?
. No change (0.00 mm/Hg) ?
. Increase (-5.23 mm/Hg) ?
• This shows that the observed decrease of 9.27 mm/Hg should not be overinterpreted
• This also shows that one should not hope that 9.27 mm/Hg is the gain in BP one would observe if the total population were treated with Captopril.
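This sampling variability can be made tangible with a small simulation. The sketch below assumes, purely for illustration, that individual BP changes are normally distributed with hypothetical population values µ = 9.27 and σ = 8.6 (neither is known in practice):

```python
import random
from statistics import mean

random.seed(1)
mu, sigma, n = 9.27, 8.6, 15   # hypothetical population values; n = 15 as in our sample

# Repeat the experiment 1000 times: each replicate draws 15 patients
# and records the observed average change in BP.
sample_means = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(1000)]

# Every replicate yields a different observed average, even though the true
# mean mu is fixed: 9.27 is just one realization of this random variability.
print(round(min(sample_means), 2), round(max(sample_means), 2))
```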
• Let µ be the average change in BP one would observe if the total population would be treated
• 9.27 mm/Hg can then be interpreted as an estimate for µ, based on our sample
• Question:
Is our observed change of 9.27 mm/Hg sufficient evidence to conclude that the treatment really affects the BP ?
• Answer:
Hypothesis testing
[Diagram: the same population–sample picture. Population level: is µ different from 0 ? Sample level: observed effect of 9.27 mm/Hg in 15 randomly selected patients.]
Chapter 4
Hypothesis testing
. Example
. Null and alternative hypothesis
. The p-value and level of significance
. Possible errors in decision making
4.1 Example
• As before, µ is the average change in diastolic BP one would observe if the total population of hypertensive patients would be treated with Captopril.
• Note that µ will never be known, but we can use our sample to learn about µ.
• In case the treatment would have no effect, the average µ would be zero.
• So, if one can show that there is (strong) evidence that µ ≠ 0, then this can be considered as evidence for a treatment effect.
• Based on our sample of 15 observations, we estimated µ by µ̂ = 9.27 mm/Hg.
• Obviously, this estimate is relatively far away from 0, suggesting that the treatment might affect BP
• On the other hand, the observed effect µ̂ = 9.27 could have occurred by pure chance, even if there would be no treatment effect at all.
• Question:
How likely would that be ?
• Only if this would be very unlikely to happen will the observed data be considered sufficient evidence for some effect of the treatment
4.2 Null and alternative hypothesis
• The procedure to decide whether there is sufficient evidence to believe the treatment did affect BP is called a test of hypothesis
• In practice, the research question is formulated in terms of a null hypothesis H0 and an alternative hypothesis HA:
H0 : µ = 0 versus HA : µ ≠ 0
• Based on our observed data, we will investigate whether H0 can be rejected in favour of HA
• If not, the null hypothesis H0 is accepted and one decides that the treatment was not effective
4.3 The p-value and level of significance
• Intuitively, it is obvious that H0 : µ = 0 will be rejected if the observed sample average µ̂ is too far away from 0
• Question:
How far is too far ?
• Answer:
If this result is very unlikely to happen by pure chance
• Equivalently:
If this result is not at all what you expect to see if µ would be 0
• One can calculate that, if Captopril would have no effect at all, there is only a 0.1% chance of observing a sample with an average change in BP at least as big as 9.27 mm/Hg.
• Hence, if Captopril would have no effect (i.e., if µ = 0), then it would be very unlikely to observe a sample with an average as extreme as 9.27. This would happen only once every 1000 times a similar experiment would be performed.
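The 0.1% is the two-sided p-value of a paired t-test. The test statistic behind it can be reproduced by hand; a sketch, using the observed changes from the table in Section 3.3 (the p-value itself then comes from t-tables or software):

```python
from math import sqrt
from statistics import mean, stdev

# Observed changes in diastolic BP (before - after) for the 15 patients
d = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

n = len(d)
t = mean(d) / (stdev(d) / sqrt(n))     # t = sample mean / standard error
print(round(mean(d), 2), round(t, 2))  # 9.27 4.17
# Referred to a t-distribution with n - 1 = 14 degrees of freedom,
# this t corresponds to the two-sided p = 0.001 quoted in this chapter.
```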
• We therefore consider the data observed in our experiment sufficient evidence to reject the null hypothesis and we conclude that the treatment effect is significantly different from 0, or equivalently, that there is a significant treatment effect
• The probability 0.1% that expresses how extreme our observations are in case the null hypothesis would be true is denoted by p, and is called the p-value.
• A small p-value is an indication that the observed results would be extreme were H0 true. One then rejects the null hypothesis
• A large p-value is an indication that the observed results are perfectly in line with what can be expected to be observed if H0 is true. One then does not reject the null hypothesis, which is equivalent to accepting the null hypothesis
• In practice, one has to decide how small p should get before the null hypothesis is rejected.
• One therefore specifies the so-called level of significance α:
p < α =⇒ reject H0
p ≥ α =⇒ accept H0
• α is typically a small value, such as 0.01, 0.05, or 0.10
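The decision rule above can be written as a one-line comparison; a trivial sketch (`decide` is just an illustrative name):

```python
def decide(p: float, alpha: float = 0.05) -> str:
    """Formal test decision at significance level alpha: reject H0 iff p < alpha."""
    return "reject H0" if p < alpha else "accept H0"

print(decide(0.001))  # reject H0   (the Captopril example)
print(decide(0.06))   # accept H0   (p only slightly larger than alpha = 0.05)
```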
• In biomedical sciences α = 0.05 = 5% is standard.
• One then rejects the null hypothesis as soon as the observed result would happen less than 5 times in 100 experiments, assuming that the null hypothesis would be correct
• Strictly speaking, one should always mention what level of significance has been used, and the conclusion would have to be formulated as “the treatment effect is significantly different from 0 at the 5% level of significance,” or equivalently, that “there is a significant treatment effect at the 5% level of significance.”
• Note that specification of α is only required if a formal decision is preferred (‘accept’ or ‘reject’).
• It is therefore not meaningful to report ‘borderline significance’ in examples where p is only slightly larger than α (e.g., p = 0.06 > α = 0.05)
4.4 Possible errors in decision making
• In our example about the Captopril treatment, we obtained p = 0.001, leading to the rejection of the null hypothesis of no treatment effect.
• This should not be considered as formal proof that there is a treatment effect
• Even if the treatment has no effect at all, a sample like ours would occur once every 1000 times.
• Maybe, our sample was indeed the extreme one that happens once every thousand experiments.
• Alternatively, suppose we would have obtained p = 0.9812. We then would not have rejected the null hypothesis, and concluded that there is no evidence for any treatment effect.
• This should not have been considered as formal proof that any treatment effect would be absent.
• Maybe, the treatment effect µ is not 0, but very close to 0. The data one then would observe would look very similar to data that would be observed if µ = 0, such that the data do not allow us to detect that µ ≠ 0
• Conclusion:
“Statistics can prove everything”
• Intuitively: Absolute certainty about population characteristics cannot be attained based on a finite sample of observations
Chapter 5
Confidence intervals
. Example
. The confidence interval
. Interpretation
. Properties of confidence intervals
. Hypothesis testing versus confidence intervals
5.1 Example: Captopril data
• Consider the Captopril data, where blood pressure was taken in 15 hypertensive patients, before and after administration of the drug Captopril:
• Interest is in estimating the average change in diastolic BP.
• Let X be the difference in diastolic BP before and after treatment:
X = BP_before − BP_after
• The observed values xi for X can be calculated from the observed values of the BP in our sample:
Before After Change
Patient DBP DBP xi
1 130 125 5
2 122 121 1
3 124 121 3
4 104 106 −2
5 112 101 11
6 101 85 16
7 121 98 23
8 124 105 19
9 115 103 12
10 102 98 4
11 98 90 8
12 119 98 21
13 106 110 −4
14 107 103 4
15 100 82 18
• Note that, in relatively small samples, the histogram can be difficult to interpret.
• One therefore prefers not to estimate the complete distribution of X
• On the other hand, there does not seem to be strong evidence for severe skewness.
• Focus will be on the estimation of the average µ of X. As before, our estimate will be the sample average:
µ̂ = x̄ = 9.27
• Since every other sample would have led to another estimate µ̂, it is of interest to know how likely it is that our estimate is far from the true value µ
• We want to derive an interval around our estimate µ = 9.27 which is very likely tocontain the true value µ
[Figure: from the POPULATION a SAMPLE is drawn at random; statistics then asks how precise µ̂ = x̄ = 9.27 is as an estimate of the average change µ in diastolic BP]
5.2 The confidence interval
• The confidence interval is an interval around the estimate µ̂ which expresses the precision with which µ has been estimated
• The interval will contain the unknown value µ with a user-defined certainty, called the confidence level:
Level Confidence Interval
90% [5.61; 12.93]
95% [4.91; 13.63]
99% [3.02; 15.52]
• In biomedical sciences, one traditionally uses 95% confidence levels
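As a minimal sketch of how such an interval can be computed, the code below uses the 15 observed differences from the table above with the normal-approximation critical value 1.96; the course's intervals may have been computed with slightly different (e.g. t-based) critical values, so small discrepancies are possible:

```python
import math

# Observed changes in diastolic BP (before - after) for the 15 patients
changes = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

def mean_ci(data, z):
    """Confidence interval for the mean, using a normal-approximation
    critical value z (e.g. 1.645 for 90%, 1.96 for 95%)."""
    n = len(data)
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    se = math.sqrt(s2 / n)                             # standard error of the mean
    return mean - z * se, mean + z * se

lo, hi = mean_ci(changes, 1.96)
print(round(lo, 2), round(hi, 2))  # close to the 95% C.I. [4.91; 13.63] quoted above
```

Rerunning with z = 1.645 or a larger critical value reproduces the pattern in the table: higher confidence levels give wider intervals.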
5.3 Interpretation
• Let us focus on the 95% confidence interval. For other confidence levels, the interpretation is similar.
• The 95% C.I. equals [4.91; 13.63]
• Obviously, this cannot be interpreted as the interval [4.91; 13.63] containing µ with 95% probability
• Indeed, [4.91; 13.63] always or never contains µ
• Correct interpretation:
→ confidence interval for mean
• Conclusion:
There is 95% probability that the experiment conducted
results in a C.I. which contains the unknown value µ
5.4 Properties of confidence intervals
• Ideally, C.I.’s are small, as this reflects a very precise estimation of the unknown population parameter µ
• Hence, a C.I. can be used as an indication of the precision of the estimation:
. short C.I.: precise estimation
. long C.I.: imprecise estimation, much uncertainty
• The length of the C.I. increases with the confidence level:
Level Confidence interval
95% [4.91; 13.63]
99% [3.02; 15.52]
• Intuitively: larger intervals are more likely to contain the unknown population parameter µ
• The length of the C.I. decreases with the sample size n
• Illustration:
→ confidence interval for mean
• Intuitively: More observations lead to more precision:
One can ‘buy’ extra precision with extra observations
• The length of the C.I. increases with the variance σ² of the original data
• Intuitively: The more the observations are alike, the more precisely the mean can be estimated:
[Figure: two distributions centred at µ; a narrow one allows precise estimation of µ, a wide one gives imprecise estimation of µ]
• What about 100% C.I.’s ?
• The 100% C.I. for µ equals [−∞; +∞], which is not informative at all
• Intuitively: Absolute certainty about population characteristics cannot be attained based on a finite sample of observations
5.5 Hypothesis testing versus confidence intervals
• For the Captopril data, we have drawn conclusions about the average treatment effect in the population, through two different statistical procedures:
. 95% confidence interval: [4.91; 13.63]
. Significance of treatment effect, p = 0.001
• We know from the C.I. that the average treatment effect is likely to be between 4.91 and 13.63, excluding 0
• The significance test has rejected the value 0 as a possible value for µ
• So, both procedures agree
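The agreement can be checked from the test statistic itself: the paired t-test compares t = x̄/(s/√n) with a critical value of the t distribution with n − 1 degrees of freedom. A minimal sketch using the observed differences:

```python
import math

# Observed changes in diastolic BP (before - after) for the 15 patients
changes = [5, 1, 3, -2, 11, 16, 23, 19, 12, 4, 8, 21, -4, 4, 18]

n = len(changes)
mean = sum(changes) / n
s = math.sqrt(sum((x - mean) ** 2 for x in changes) / (n - 1))  # sample SD
t = mean / (s / math.sqrt(n))  # paired t statistic for H0: mu = 0

# Two-sided 5% critical value of the t distribution with n - 1 = 14 d.f.
# (from a t table): 2.145
print(round(t, 2))  # about 4.17, well beyond 2.145, so H0: mu = 0 is rejected,
                    # in line with the reported p = 0.001
```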
• Question:
Do both procedures always agree ?
• Answer:
Yes, provided the levels of significance and confidence are complementary to each other:
Level of significance α    Confidence level (1 − α)100%
0.05 95%
0.10 90%
0.01 99%
• In case of accepting H0 (p ≥ α = 0.05):
[Figure: number line with the sample mean x̄; the value specified by H0 lies inside the 95% C.I.]
• In case of rejecting H0 (p < α = 0.05):
[Figure: number line with the sample mean x̄; the value specified by H0 lies outside the 95% C.I.]
• An alternative interpretation for the C.I. follows immediately:
A 95% C.I. is the collection of all null hypotheses
that would be accepted in a statistical test
• Statistical tests are to some extent equivalent to C.I.’s
• However, C.I.’s have the advantage of giving an indication of the effect size (treatment estimate µ̂), as well as of the precision of estimation (width of the C.I.)
• So, C.I.’s should be preferred over statistical tests
↔ Biomedical literature
Chapter 6
Some frequently used tests
. The unpaired t-test
. The chi-squared test
. The paired t-test
. Assumptions
6.1 The unpaired t-test
• Consider data from a rat experiment to study weight gain under a high or a low protein diet
• Group-specific histograms:
• Group-specific summary statistics:
• On average, there is an observed difference of 19g between the rats on a high protein diet and those on a low protein diet.
• Is this observed difference sufficient evidence to conclude that there indeed is an effect of diet on the weight gain ?
• It would be of interest to know how likely such a difference of 19g is to occur if weight gain were completely unrelated to the protein level of the diet.
• Based on the unpaired t-test, it can be calculated that, in case the diet would not affect the weight gain at all, one would have p = 0.0757 = 7.57% chance of observing a difference of at least 19g in a similar experiment.
• So, even if there is no relation at all between the protein content of the diet and weight gain, one can still expect to observe a difference of at least 19g in 7.6% of future similar experiments.
• Since p = 0.0757 > 0.05 = α, we consider this insufficient evidence to conclude that the protein level would indeed affect the weight gain
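The individual weight gains are not listed here; the sketch below assumes the classic Snedecor & Cochran rat-diet data, which match the summary statistics quoted (group means 120g and 101g, hence a difference of 19g, with 12 and 7 animals), and computes the pooled two-sample t statistic:

```python
import math

# Assumed raw data: classic rat weight gains in grams (Snedecor & Cochran),
# reproducing the quoted group means of 120 and 101
high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]

def unpaired_t(x, y):
    """Two-sample t statistic assuming equal variances (pooled)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    s2 = (ssx + ssy) / (nx + ny - 2)        # pooled variance estimate
    se = math.sqrt(s2 * (1 / nx + 1 / ny))  # SE of the difference in means
    return (mx - my) / se

t = unpaired_t(high_protein, low_protein)
print(round(t, 2))  # about 1.89 on 17 d.f., below the two-sided 5% critical
                    # value 2.110, consistent with p = 0.0757 > 0.05
```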
• Conclusion:
There is no significant difference (p = 0.0757) in weight gain
between rats on a high protein level diet,
and rats on a low protein level diet
6.2 The chi-squared test
• We consider data on sickness absence, collected on 585 employees with a similar job:
                   Sickness absence
                   No      Yes     Total
Gender   female    245     184       429
         male       98      58       156
Total              343     242       585
• Research question:
Is there a relation between absence and gender ?
• 184/429 = 42.9% of the females, and 58/156 = 37.2% of the males have been absent
• This suggests that females are more often absent than males
• However, even if absence due to sickness is equally frequent amongst males and females, the above results could have occurred by pure chance.
• It would therefore be of interest to calculate how likely it would be to observe such differences by pure chance
• Based on the chi-squared test, it can be calculated that, even if males and females were equally frequently absent, there would be p = 0.215 = 21.5% chance of observing a similar experiment with a difference between the groups at least equal to 0.429 − 0.372 = 0.057
• So, even if there is no relation at all between gender and absence, one can still expect to observe a difference of at least 5.7% in 21.5% of future similar experiments.
• Since p = 0.215 > 0.05 = α, we consider this insufficient evidence to conclude that the occurrence of sickness absence is related to gender
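A minimal sketch of the chi-squared computation for this 2×2 table: expected counts under independence are (row total × column total)/n, and with 1 degree of freedom the p-value can be obtained from the standard normal distribution, since a chi-squared variable with 1 d.f. is a squared standard normal:

```python
import math

# Observed 2x2 table of sickness absence by gender
observed = {("female", "no"): 245, ("female", "yes"): 184,
            ("male", "no"): 98, ("male", "yes"): 58}
row = {"female": 429, "male": 156}   # row totals
col = {"no": 343, "yes": 242}        # column totals
n = 585

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi2 = sum((o - row[g] * col[a] / n) ** 2 / (row[g] * col[a] / n)
           for (g, a), o in observed.items())

# With 1 d.f.: p = P(|Z| > sqrt(chi2)) for standard normal Z
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF
p = 2 * (1 - phi(math.sqrt(chi2)))
print(round(chi2, 2), round(p, 3))  # about 1.54 and 0.215, matching the text
```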
• Conclusion:
There is no significant difference (p = 0.215) in prevalence
of sickness absence
between males and females
6.3 The paired t-test
• The Captopril example discussed before considered paired data: Each observation before treatment uniquely corresponds to one observation after treatment (from the same patient), and vice versa
• The paired t-test analyses paired observations:
. Before and after treatment
. Married couples: male and female
. Twin studies
. Ophthalmology: left and right eye
. . . .
6.4 Assumptions
• Most statistical procedures are based on assumptions about the distribution of the observations in the population
• For example, the unpaired t-test, used before to compare weight gains under two different diets, assumed weight gains to be normally distributed, with the same amount of variability in both groups:
[Figure: two normal densities with equal spread, centred at µ2 (low protein) and µ1 (high protein)]
• If assumptions are not satisfied, wrong results can be obtained
• One will therefore always explore the observed data to check whether the assumptionsare supported by the data.
• In large samples however, results are less sensitive to the underlying assumptions.
Part II
Critical appraisal of literature
Chapter 7
Errors in statistics: Basic concepts
. Introduction
. Two types of errors
. Power
. Sample size calculation
. Examples
. Remarks
. Example from the biomedical literature
7.1 Introduction
• Re-consider the example on the weight gain in rats, where interest is in the comparison between rats fed on a high or low protein diet
• Group-specific histograms:
• Group-specific summary statistics:
• On average, there is an observed difference of 19g between the rats on a high protein diet and those on a low protein diet.
• Based on the unpaired t-test, we obtained before that this observed difference is not sufficient evidence to believe that the weight gain is really different for the two diets (p = 0.0757)
• Conclusion:
There is no significant difference (p = 0.0757) in weight gain
between rats on a high protein level diet,
and rats on a low protein level diet
• As indicated before, the result of a statistical test should be interpreted as evidence in favour of or against the null hypothesis, and should not be interpreted as formal proof.
• In our example, the difference in weight gain between a population treated with one diet and a population treated with the other diet is too small to be detected based on 12 and 7 animals, respectively.
• Alternatively, if the t-test had led to p = 0.001, this would still not formally prove that there is a difference between both populations.
• After all, p = 0.001 would only indicate that the observed difference of 19g occurs once every 1000 times, even if there is no difference at all between both populations.
• Maybe our sample was indeed the extreme one that happens once every thousand experiments.
• Hence, whenever statistical tests are used, one has to be aware that errors in the conclusions can occur.
• It is therefore important to quantify these errors, and to keep them under control
7.2 Two types of errors
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  No error        Type II error
result  Reject H0  Type I error    No error
• Type I error: H0 is incorrectly rejected
• Type II error: H0 is incorrectly accepted
7.3 Type I error
• A type I error occurs if H0 is correct but the test leads to a significant result.
• Question:
How likely is such an error to occur ?
• Suppose the test is performed at the α = 5% level of significance
• If H0 is correct, then one will observe a significant result in 5% of the cases
• Hence, in 5% of the cases, H0 would be incorrectly rejected
• The probability of making a type I error is therefore equal to the chosen level α of significance.
• In practice, the probability of making a type I error is kept under control by choosing α sufficiently small
• In biomedical sciences α = 5% is often used, thereby accepting a type I error in 5% of the cases.
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  1 − α
result  Reject H0  α
Total              1
• If H0 is correct, then the probability of making a type I error is α, while the probability of correctly accepting H0 is 1 − α.
7.4 Type II error
• A type II error occurs if H0 is incorrect but the test has not detected this, i.e., a non-significant result is obtained
• Question:
How likely is such an error to occur ?
• In contrast to the type I error, the probability of making a type II error is not easily controlled, and depends on various aspects of the sample(s) and population(s)
• In analogy to the type I error, the type II error rate is denoted by β
                          Reality
                   H0 correct      H0 not correct
Test    Accept H0  1 − α           β
result  Reject H0  α               1 − β
Total              1               1
• The power of a statistical test is 1 − β, the probability of correctly rejecting H0
7.5 Power
• In general, a specific testing procedure is acceptable only if:
. the type I error rate is sufficiently small
. the power to detect deviations from H0 is sufficiently large
• The first condition can be met by specifying α sufficiently small.
• The second condition is more difficult to meet, as the power depends on various aspects of the sample(s) and population(s)
• This will be illustrated in the context of the comparison of two groups (such as the weight gain experiment)
• As before, let µ1 and µ2 represent the average weight gain in the total population, under high and low protein diets, respectively.
• The null and alternative hypotheses are given by
H0 : µ1 = µ2 versus HA : µ1 ≠ µ2
• The power is the probability of correctly rejecting H0.
• In that case, µ1 ≠ µ2, and we denote the true difference between both populations by ∆ = µ1 − µ2
• The unpaired t-test assumes the data to be normally distributed in both populations, with equal variability σ²
• Graphically:
[Figure: two normal densities (low and high protein) centred at µ2 and µ1, with true difference ∆ and common variance σ² in each group]
7.5.1 Power as a function of α
The smaller α, the smaller the power
• Intuitively: Type I errors are less likely if the null hypothesis is rejected less often. However, in cases where H0 is truly wrong, it will also be rejected less often.
• An extreme case is obtained for α = 0:
. α = 0 implies that the null hypothesis is always accepted
. So, in case the null hypothesis is wrong, it is still accepted, leading to power 0
7.5.2 Power as a function of true difference ∆
The smaller ∆, the smaller the power
• Intuitively: Large deviations from the null hypothesis are easier to detect
[Figure: two pairs of normal densities for low and high protein; a large difference ∆ between µ2 and µ1 (top) is easier to detect than a small ∆ (bottom)]
7.5.4 Power as a function of variability σ²
The smaller σ², the larger the power
• Intuitively: Homogeneous groups are easier discriminated than heterogeneous groups
[Figure: two pairs of normal densities with the same difference ∆; groups with small within-group variance σ² (bottom) are separated more clearly than groups with large σ² (top)]
Advanced statistical methods 94
7.5.4 Power as a function of sample size(s)
The more observations, the larger the power
• Intuitively: more observations yield more information about the population(s), therefore implying more precision in the conclusions
7.5.5 Conclusion
• The power depends on various aspects:
. Level of significance α
. True difference ∆ between the populations
. Within-group variance σ²
. Sample size(s)
• Note that the sample size is the only aspect under control of the investigator.
• In practice, one can calculate the sample size needed to reach a sufficiently high power.
7.6 Sample size calculation
• As indicated before, a testing procedure is only acceptable if it has sufficient power, i.e., if the probability of making a type II error is sufficiently small.
• Since the sample size is the only aspect influencing the power that is under control of the investigator, it is important that experiments are sufficiently large in order for the power to be sufficiently large as well
• The level α of significance is chosen such that the probability of making a type I error is sufficiently small
• The within-group variance σ² is pre-specified based on earlier, similar experiments, relevant literature, or a pilot study
• To be on the safe side, usually an upper bound for σ² is used: in case the variability would be smaller, the power would be higher, hence still sufficiently high
• In practice, ∆ is not known. Instead, the smallest ∆ which would still be clinically relevant to detect is specified.
• If sufficient power is attained for the smallest meaningful ∆, we have that:
. Any larger difference will be detected with even larger power
. We are not concerned about low power for detecting smaller differences, as such differences are not relevant anyway.
• One can then calculate the number(s) of observations needed to reach a desired level of power.
7.7 Example: Weight gain data
• In the weight gain data, the observed difference of 19g was found not to be significant (p = 0.0757)
• We can calculate the power that a real difference of 19g would be found significant if a new experiment were to be conducted, again with 12 and 7 observations in the high and low protein diet groups, respectively.
• Group-specific summary statistics, from the current experiment:
• Power calculations will be based on σ = 21, and α = 0.05
• The power to detect a difference ∆ of 19g equals 43.45%
• Hence, with 12 and 7 observations respectively, there is only a 43.45% chance that a true difference of 19g would be detected.
• If a difference of 19g is considered clinically relevant, then the weight gain experiment was clearly too small, since it is very likely that such a difference would remain undetected.
• We can also calculate the power for other values of ∆
• Summary:
∆ Power to detect a difference ∆
0g 5.00%∗
10g 15.70%
19g 43.45%
30g 80.80%
40g 96.49%
∗: equal to α
• For example, 12 and 7 observations would be sufficient to show a true difference of 40g with more than 96% chance.
• Alternatively, one can also calculate how large the samples should be to detect a difference of, e.g., 20g with sufficiently high power.
• If a power of 90% is required to detect true effects as small as ∆ = 20g, at least 25 observations are needed in each group.
• With 30 observations in each group, the probability of making a type II error, when the true effect is not smaller than 20g, is approximately 5%.
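The power and sample-size figures quoted above can be reproduced with a short script. This is a sketch, not the software used for the slides: it assumes a two-sided unpaired t-test with the slide's σ = 21, and the helper name `power_two_sample_t` is ours.

```python
import numpy as np
from scipy import stats

def power_two_sample_t(delta, sigma, n1, n2, alpha=0.05):
    """Power of the two-sided unpaired t-test to detect a true difference delta."""
    se = sigma * np.sqrt(1 / n1 + 1 / n2)   # standard error of the difference
    ncp = delta / se                        # non-centrality parameter
    df = n1 + n2 - 2
    tcrit = stats.t.ppf(1 - alpha / 2, df)  # two-sided critical value
    # probability of rejecting H0 under the non-central t alternative
    return stats.nct.sf(tcrit, df, ncp) + stats.nct.cdf(-tcrit, df, ncp)

print(round(100 * power_two_sample_t(19, 21, 12, 7), 1))  # roughly 43%, as on the slide

# smallest equal group size giving 90% power for delta = 20g
n = 2
while power_two_sample_t(20, 21, n, n) < 0.90:
    n += 1
print(n)  # around 25 per group
```

Setting ∆ = 0 returns α itself, which is the footnote attached to the 0g row of the summary table.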
7.8 Example: Sickness absence
• We re-consider the data on sickness absence, collected on 585 employees with a similar job:

                  Sickness absence
                  No     Yes    Total
Gender   female   245    184      429
         male      98     58      156
Total             343    242      585
• The observed difference between the absence rates, 42.9% in females and 37.2% in males, was found not significant (chi-squared test, p = 0.215).
• In case the percentages of sickness absence would be 42% in the total female population, and 37% in the total male population, and in case a random sample of 429 females and 156 males would be taken, there would be a 19.01% chance to reach a significant effect.
• So, if the population proportions are indeed 42% and 37%, an experiment with 429 and 156 observations would detect this difference only 19 times out of 100 experiments.
• If a difference of 5% is considered clinically relevant, then the current experiment was clearly too small, since it is very likely that such a difference would remain undetected.
• We can calculate how large the samples should be in order to detect a difference between 42% and 37%, with sufficiently high power
• For example, two samples of approximately 2500 observations each are needed in order to show a difference between 37% and 42% with 95% probability
• Compared to the weight gain example, many more observations are needed:
. Different outcomes imply that the ∆ values are not comparable
. Binary data, in general, contain less information than continuous data
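The same logic applies to the comparison of two proportions. A minimal normal-approximation sketch (our own helper, not the software used for the slides; exact chi-squared-based calculations can differ slightly in the second decimal):

```python
from math import sqrt
from scipy.stats import norm

def power_two_props(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of the two-sided z-test comparing two proportions."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # SE under the alternative
    z = abs(p1 - p2) / se
    zcrit = norm.ppf(1 - alpha / 2)
    return norm.sf(zcrit - z) + norm.cdf(-zcrit - z)

print(round(100 * power_two_props(0.42, 0.37, 429, 156), 1))  # close to the 19% quoted above

# smallest equal group size giving 95% power for 42% versus 37%
n = 100
while power_two_props(0.42, 0.37, n, n) < 0.95:
    n += 1
print(n)  # roughly 2500 per group, as on the slide
```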
7.9 Remarks
• The earlier examples of power and/or sample size calculations were in the context ofthe unpaired t-test and chi-squared test.
• Similar calculations can be done in any other statistical testing situation, e.g., Fisher's exact test, paired t-test, McNemar test, . . .
• Strictly speaking, all experiments should be preceded by a realistic sample size calculation, to avoid experiments with unacceptably high type II error rates, i.e., with almost no chance at all to show clinically meaningful effects.
7.10 Example from the biomedical literature
Wong et al. [1]
• Methodology section, p.658:
• Table 2 with results:
• Discussion, p.664:
• The difference on which the sample size calculation was based was much larger than what was actually observed in the experiment
• Therefore, the power to reject equality of the groups was (much) lower than the expected 80%
• The current study cannot tell the difference between a 9% increase and a 3% decrease.
• If such differences are considered clinically important, then the current study was under-powered, due to the fact that the difference was overestimated at the time of the sample size calculation.
Chapter 8
Errors in statistics: Practical implications
. Multiple testing
. Bonferroni correction
. Tests for baseline differences
. Equivalence tests
. Significance versus relevance
. Examples from biomedical literature
8.1 Multiple testing
• Each time a test is performed, there is a probability α of making a type I error
• For example, if α = 0.05, we can expect to incorrectly reject the null hypothesis in 5 out of 100 times.
• Implication:
“The more tests one performs, the higher the probability that something is detected by pure chance”
• This problem of multiple testing occurs very frequently in the biomedical sciences, in various settings
8.1.1 Example: A classroom experiment
• On entry into the classroom, assign each student at random to be seated at the left or at the right side of the classroom
• Compare both sides with respect to 100 aspects, including weight, height, age, gender, color of hair, color of eyes, . . .
• It is to be expected that, for at least 5 of these outcomes, a significant difference is obtained at the 5% level of significance, by pure chance.
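The arithmetic behind this expectation can be sketched directly, assuming for simplicity that the 100 tests are independent:

```python
alpha, k = 0.05, 100

# expected number of false positives among k true null hypotheses
expected_false_positives = k * alpha
print(expected_false_positives)      # 5.0

# probability of at least one false positive (independence assumed)
p_at_least_one = 1 - (1 - alpha) ** k
print(round(p_at_least_one, 3))      # 0.994
```

So with 100 tests of true null hypotheses, finding "something significant" is almost certain.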
8.1.2 Example: Testing many relations
• Amin et al. [2], Table 2:
. 18 tests performed
. only 2 significant results
8.1.3 Example: Subgroup analyses
• Kaplan et al. [3], Table 5:
. Tests based on C.I.’s for odds ratios
. A C.I. containing 1 is equivalent to a non-significant test result
. 21 × 3 = 63 tests performed
. only 5 significant results
8.1.4 Example: Searching for the most significant results
• This ‘scientific finding’ was printed in the Belgian newspapers:
• It was even stated that those who wake up before 7.21am have a statistically significantly higher stress level during the day than those who wake up after 7.21am.
8.1.5 Conclusion
• Significant results obtained by multiple testing are often overinterpreted
• If the number of tests is reported, the reader knows that such results need to be interpreted with extreme care
• The problem arises when only the significant results are reported, and one does not know how many tests were performed in total
• This leads to reporting results which turn out not to be reproducible:
. For example, a new study would not find that students seated on the left are taller than those on the right. Instead, students seated on the left may weigh more than those seated on the right.
. For example, a new experiment might show no difference in stress levels between subjects waking up early and those waking up late. Or maybe a difference would be found only when waking up is later than 8.12am.
8.2 Bonferroni correction
• Suppose two tests are performed, both at the 5% level of significance.
• The probability that at least one type I error will be made can be shown not to exceed 2 × 0.05 = 0.10:

P(at least 1 type I error) ≤ 2 × 5% = 10%

• In general, if k tests are performed, all at the 5% level of significance, the probability of making at least one type I error can only be shown not to exceed k × 5%
• Obviously, controlling the overall type I error rate can be done by performing each separate test at the α/k level of significance.
• For example, performing 2 tests at the 2.5% level of significance each implies that the probability of making at least one type I error will not exceed 5%.
• In general, when k tests are performed at the α/k level of significance, one is sure that the overall probability of making at least one type I error will not exceed α.
• This correction of the significance level is called the Bonferroni correction.
• When confidence intervals are used instead of p-values, the confidence levels can be corrected in a similar way
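As a minimal sketch, the correction amounts to comparing each p-value against α/k (the helper name is ours):

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0_i only when p_i < alpha / k, keeping the overall type I error rate below alpha."""
    k = len(pvalues)
    return [p < alpha / k for p in pvalues]

# three tests: only the first survives the corrected threshold 0.05/3 ≈ 0.0167
print(bonferroni_reject([0.001, 0.020, 0.040]))  # [True, False, False]
```

Note that 0.020 and 0.040 would both have been "significant" without the correction.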
• Some examples:
Number of tests Significance level α Confidence level
1 0.05 95%
2 0.025 97.5%
5 0.01 99%
k 0.05/k (1 − 0.05/k) × 100%
• For example, if CI1, CI2, . . . , CI5 are 5 intervals with 99% confidence, for 5 unknown parameters θ1, θ2, . . . , θ5, then there is at least 95% probability that all 5 C.I.'s will contain all 5 unknown parameters:

P(CI1 contains θ1 and . . . and CI5 contains θ5) ≥ 95%
• Note that, strictly speaking, the Bonferroni correction is an overcorrection, since the overall type I error rate can only be shown not to exceed 5%, and usually will be smaller than the required 5%.
• In some specific testing situations (e.g., ANOVA), more accurate corrections are available.
8.3 Examples from the biomedical literature
• Baba et al. [4], p.1202 and p.1203:
• Kellett et al. [5], Table 2 (for example):
In the discussion, R. Roy writes:
Note that the reader cannot perform the Bonferroni correction, as the exact p-values have not been reported.
8.4 Tests for baseline differences
• In order to show causal effects, patients are often randomized into 2 or more groups
• This ensures (at least in large studies) that all treatment groups are identical, except for the treatment the patients receive
• In (relatively) small studies, imbalances can still occur by pure chance
• Therefore, one often compares the various groups with respect to important factors which are believed to be strongly related to the outcome of interest.
• This is called testing for baseline differences, as one compares the characteristics of the patients at the start of the study.
• As an example, suppose interest is in comparing two oral treatments, A and B, for hypertension.
• Suppose the change in diastolic BP is the outcome of interest
• Age is one of the factors believed to be strongly related to BP. Therefore, it is important that both treatment groups have the same age distribution
• Therefore, one often tests for age differences between A and B, e.g., based on the two-sample t-test.
• The hypothesis tested is

H0 : µA = µB versus HA : µA ≠ µB
• Note that H0 and HA express properties of the populations, not the samples
• In the populations (infinitely large), we know that, due to the randomization, µA and µB are identical
• Conclusion:

It makes no sense at all to perform baseline tests in randomized studies

• No matter how small the resulting p-value would be (e.g., < 10⁻⁸), we know that the observed difference in age between groups A and B has occurred purely by chance.
• A meaningful alternative is to calculate a C.I. for the average age difference between both groups, to ensure that the observed difference is sufficiently small to conclude that it cannot (completely) explain the observed differences in the outcome of interest.
• In our example, suppose that a 95% confidence interval for the average difference in age (years) is given by [0.1; 0.3]; then we believe that this difference would be too small to explain why patients in group A show more decrease in BP than patients in group B.
• Note also that testing for baseline differences cannot be used to check whether the randomization was done properly.
8.5 Example from the biomedical literature
Nissen et al. [6], abstract and table 1:
A two-arm randomized study
formal tests at baseline
8.6 Equivalence tests
• Suppose two groups A and B are to be compared, with the hypotheses to be tested:

H0 : µA = µB versus HA : µA ≠ µB

• In case of a non-significant test result, one often concludes that both groups are identical or equivalent
• An alternative interpretation is that the experiment did not have sufficient power to show an effect which is present.
• Conclusion:
Non-significance should not be interpreted as equivalence
• This can also be seen from the fact that, if the two-sample t-test could be used to show equivalence, it would be best to collect data on (extremely) small samples, as this would increase the chance of obtaining a non-significant result, due to lack of power.
• Instead, one should reverse H0 and HA:
H0 : |µA − µB| > ∆ versus HA : |µA − µB| ≤ ∆
where ∆ is a pre-specified constant, defining ‘equivalence’
• Note that HA is equivalent to −∆ ≤ µA − µB ≤ ∆
• Hence, in order to reject H0, one needs to show evidence that µA and µB are less than ∆ away from each other
• One way to proceed is to construct a C.I. for µA − µB and to check whether it is entirely within the interval [−∆; ∆].
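This C.I.-based check can be sketched in a few lines (the numbers below are made up for illustration; ∆ would have to be pre-specified as discussed further on):

```python
def conclude_equivalence(ci_lower, ci_upper, delta):
    """Reject H0 (non-equivalence) only if the whole C.I. lies inside [-delta, +delta]."""
    return -delta <= ci_lower and ci_upper <= delta

print(conclude_equivalence(-1.2, 0.8, delta=2.0))  # True: C.I. inside the equivalence margin
print(conclude_equivalence(-1.2, 2.5, delta=2.0))  # False: C.I. extends beyond +delta
```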
• Graphically, H0 would be rejected if:
[Figure: the 95% C.I. for µA − µB lies entirely within the interval [−∆; +∆], so H0 is rejected]
• Graphically, H0 would not be rejected if:
[Figure: the 95% C.I. for µA − µB extends beyond the interval [−∆; +∆], so H0 is not rejected]
• Obviously, the result of the equivalence test entirely depends on the choice of ∆
• Therefore, ∆ needs to be specified prior to the data collection
8.7 Example from the biomedical literature
Shatari et al. [7]:
• Title:
• Table 1:
No significant differences!
• Results and conclusions (abstract):
8.8 Significance versus relevance
• We discussed before that the power to detect some effect ∆ increases with the sample size
• This implies that any effect ∆, no matter how small, will, sooner or later, be detected, if the sample is sufficiently large.
• For example, consider the Captopril data, where the observed difference of 9.27 mmHg was found significantly different from zero (p < 0.001), based on data from only 15 patients:
• The 99% confidence interval for the average change µ in BP was found to be [3.02; 15.52].
• Suppose that the observed difference had been 0.1 mmHg.
• A p-value as small as 0.001 would be likely to be obtained, provided that the sample were sufficiently large.
• Obviously, an average change in BP as small as 0.1 mmHg is not relevant from a clinical point of view.
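How a fixed, tiny mean change becomes "significant" once n is large enough can be sketched from summary statistics alone (the standard deviation of 8 mmHg is an assumption for illustration, not taken from the Captopril data):

```python
import numpy as np
from scipy import stats

def p_value_one_sample(mean, sd, n):
    """Two-sided one-sample t-test of H0: mu = 0, from summary statistics."""
    t = mean / (sd / np.sqrt(n))
    return 2 * stats.t.sf(abs(t), n - 1)

# the same clinically irrelevant change of 0.1 mmHg, at increasing sample sizes
for n in (15, 1_000, 100_000):
    print(n, p_value_one_sample(0.1, 8.0, n))  # the p-value shrinks as n grows
```

With 15 patients the p-value is far from significant; with 100,000 it drops well below 0.001, even though the effect has not changed.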
• Conclusion:
Statistical significance ≠ Clinical relevance
• A highly significant effect can be a large effect:
[Figure: a 95% C.I. for µ lying far from 0 and covering large effect sizes; p = 0.0001]
• A highly significant effect can also be a very small effect, but estimated with high precision, due to a large sample size:
[Figure: a very narrow 95% C.I. for µ, close to 0 but excluding it; p = 0.0001]
• The p-value cannot distinguish between these two situations
• It is therefore important not to blindly overinterpret significant results without knowing the size of the effect
• This is another reason why confidence intervals are to be preferred over significance testing
Part III
Simple linear regression
Chapter 9
The Pearson correlation coefficient
. Example
. Pearson correlation
. Properties and interpretation
. Statistical inference
. Application
. Examples from the biomedical literature
9.1 Example
• In the literature, it is suggested that a decreased cognitive status implies an increased dependence in post-operative hip fracture patients.
• Therefore, we investigate the relationship between MMSE and ADL, 1 day postoperation.
• For each patient, we have two measurements:
. The MMSE score: xi for the ith patient
. The ADL score: yi for the ith patient
• Hence, the data are ordered pairs (xi, yi)
• A graphical representation of the relationship between MMSE and ADL can be obtained via a scatter plot of the yi versus the xi:
• The graph suggests a negative relationship between MMSE and ADL.
9.2 The Pearson correlation coefficient
• The relationship between two variables is often summarized using the Pearson correlation coefficient:

r = Σi (xi − x̄)(yi − ȳ) / ( √Σi (xi − x̄)² √Σi (yi − ȳ)² )

• The sample averages x̄ and ȳ of the x-observations and the y-observations, respectively, are given by:

x̄ = (1/n) Σi xi,   ȳ = (1/n) Σi yi
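The formula above translates directly into code. A sketch with made-up MMSE-like and ADL-like scores (n = 5; the data are hypothetical), cross-checked against NumPy's built-in:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient, computed exactly as in the formula above."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

x = [10, 14, 19, 23, 28]  # hypothetical MMSE-like scores
y = [21, 19, 16, 15, 12]  # hypothetical ADL-like scores
print(round(pearson_r(x, y), 3))                  # strongly negative
print(round(float(np.corrcoef(x, y)[0, 1]), 3))   # np.corrcoef gives the same value
```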
9.3 Properties and interpretation
r = Σi (xi − x̄)(yi − ȳ) / ( √Σi (xi − x̄)² √Σi (yi − ȳ)² )

[Figure: scatter plot of the yi versus the xi, divided into four quadrants (+,+), (+,–), (–,+), (–,–) around the point (x̄, ȳ)]
The correlation coefficient measures the linear relationship between X and Y, and enjoys the following properties:
• −1 ≤ r ≤ 1
• r < 0 : negative linear relationship between the xi and the yi
• r > 0 : positive linear relationship between the xi and the yi
• r = −1 : the points (xi, yi) lie perfectly on a decreasing straight line
• r = 1 : the points (xi, yi) lie perfectly on an increasing straight line
• r = 0 : there is no LINEAR relationship between the xi and the yi
9.4 Statistical inference
• The correlation coefficient is calculated based on the observations (xi, yi), and is an estimator for the theoretical correlation ρ in the population
• In practice, one wants to test whether or not there is a linear relationship between the variables X and Y, i.e., whether the correlation ρ is significantly different from zero.
• Formally, we want to test the null hypothesis

H0 : ρ = 0

versus the alternative hypothesis

HA : ρ ≠ 0

• The corresponding test procedure assumes that the variables X and Y are jointly normally distributed.
9.5 Application
• Correlation matrix for ADL and MMSE at days 1, 5, and 12 post-operatively:
• Corresponding scatter plot matrix:
• The correlation between MMSE and ADL on day 1 is r = −0.70 and is significantly different from zero (p < 0.0001).
• Hence, we can conclude that there is a strong negative linear relationship between MMSE and ADL, 1 day post operation: the lower the cognitive status of the patient, the higher the dependence.
9.6 Examples from the biomedical literature
• Serrano-Gallardo et al. [8]:
. Methodology section, p. 4:
For data analysis, we performed univariate analyses (measures of central tendency and dispersion or percentages, depending on the variables’ nature) and bivariate analyses (Student’s t-test, ANOVA, and
. Results section, p. 6:
There was no evidence of an association between the clinical learning and students’ age (Pearson
there was evidence for an association with the grades
The multiple linear regression model (adjusted
• Salehi et al. [9], Table 3:
Table 3: The correlation between various domains of social well-being in School of Midwifery and Nursing students in Shiraz University of Medical Sciences

                       Social          Social     Social       Social      Social
                       actualization   coherence  integration  acceptance  contribution
Social actualization   1               -          -            -           -
Social coherence       0.96            1          -            -           -
Social integration     0.96            0.94       1            -           -
Social acceptance      0.97            0.96       0.94         1           -
Social contribution    0.96            0.96       0.94         0.97        1
For all domains every P value is less than 0.0001; Pearson correlation
Chapter 10
Simple linear regression
. Introduction
. The method of least squares
. Application
. Statistical inference
. The ANOVA table
. Application
. Examples from the biomedical literature
10.1 Introduction
• The correlation coefficient r measures the linear relationship between two variables, X and Y. How can we describe this linear relationship?
• One possible way would be to construct the straight line that best fits the observed measurements:
• A straight line is described analytically by an equation of the form
y = β0 + β1x
• The parameter β0 is the intercept, β1 is the slope.
• If β1 > 0 :
. There is a positive relationship between x and y
. The larger β1, the faster y increases with x
• If β1 < 0 :
. There is a negative relationship between x and y
. The smaller β1, the faster y decreases with x
• In practice, one needs to estimate the parameters β0 and β1 based on the collected data (xi, yi).
10.2 The least squares method
• To estimate β0 and β1, we first need to decide which criterion should be satisfied by ‘the best’ straight line
[Figure: data points (xi, yi) scattered around a candidate straight line y = β0 + β1x with intercept β0; for each xi, the line gives a predicted value ŷi for the observed yi]
• If we knew β0 and β1, then for each observation in the data set, a predicted value for y could be calculated based on the x value:

ŷi = β0 + β1xi

• The prediction will be good if ŷi lies close to yi, and will be poor if ŷi deviates strongly from yi
• If the straight line describes the data (xi, yi) adequately, then we expect, for most points, ŷi to lie close to the true value yi.
• A possible measure of how well the straight line has been chosen is

Q = Σi (yi − ŷi)² = Σi [yi − (β0 + β1xi)]²

• Hence, Q is a measure of how closely the data lie to the straight line y = β0 + β1x.
• Note that other straight lines (i.e., other β0 and β1) will lead to different Q values.
• The straight line that describes the data best is the one for which Q is smallest.
• The least squares method calculates the values of β0 and β1 for which Q is minimal.
• It can be shown that these values are given by:

β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²,   β̂0 = ȳ − β̂1 x̄

• β̂0 and β̂1 are termed the least squares estimators for β0 and β1.
• The straight line so obtained,

ŷ = β̂0 + β̂1 x

is termed the regression line.
• Once the estimators for β0 and β1 are known, we can make a prediction for y based on x, for each observation in the data set:

ŷi = β̂0 + β̂1 xi

• We are also able, for each data point (xi, yi) in the data set, to compute the error made when predicting yi by ŷi:

ei = yi − ŷi = yi − (β̂0 + β̂1 xi)
• The quantities ei are termed residuals:
. ei > 0 : the observed yi lies above the regression line
. ei = 0 : the observed yi lies on the regression line
. ei < 0 : the observed yi lies underneath the regression line
• Further, one can show that

Σi ei = 0

i.e., the points above the regression line are ‘in equilibrium’ with those underneath the regression line.
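The least squares formulas above, applied to a small made-up data set, and checked against the residual property Σi ei = 0 (and against NumPy's `polyfit`):

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form least squares estimates for the line y = b0 + b1 * x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])       # made-up data, roughly y = 2x
b0, b1 = least_squares_fit(x, y)

residuals = y - (b0 + b1 * x)
print(abs(residuals.sum()) < 1e-9)            # True: residuals sum to zero

b1_np, b0_np = np.polyfit(x, y, 1)            # NumPy agrees with the closed form
print(np.allclose([b0, b1], [b0_np, b1_np]))  # True
```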
10.3 Application
• Regression of ADL on MMSE, one day post-operatively, yields the following regression coefficients:
• The Y variable is termed response, or also dependent variable.
• The X variable is termed covariate, or also independent variable.
• The parameter estimates are β̂0 = 23.65 and β̂1 = −0.30.
• The corresponding regression line is

ADL = 23.65 − 0.30 × MMSE

• The regression line predicts an ADL score of 23.65 if MMSE is equal to zero.
• Further, there is a negative linear relationship between MMSE and ADL: the higher the MMSE, the lower the ADL, and vice versa.
• The regression line predicts a decrease of 0.30 in ADL for a unit increase in MMSE.
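The interpretation of the fitted line can be made concrete with the coefficients from the output (the numbers come from the slide; the helper name is ours):

```python
def predicted_adl(mmse):
    """Fitted regression line from the output: ADL-hat = 23.65 - 0.30 * MMSE."""
    return 23.65 - 0.30 * mmse

print(round(predicted_adl(0), 2))                       # 23.65: predicted ADL at MMSE = 0
print(round(predicted_adl(21) - predicted_adl(20), 2))  # -0.3 per extra MMSE unit
```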
• Graphical representation:
[Figure: the fitted regression line of ADL versus MMSE; a difference of 10 units in MMSE corresponds to a difference of 10 × β̂1 units in predicted ADL]
• This ought to be interpreted as follows:
. Consider two groups of patients
. All patients in the first group have identical MMSE (e.g., 20).
. All patients in the second group have identical MMSE values too, but 1 unit higher than those in the first group (hence, 21).
. Then, we expect the difference in average ADL score between both groups to be 0.30, with the lower score for the group with the higher MMSE.
• Hence, we should not conclude that an increase of MMSE by 1 unit in a given patient will lead to a decrease of 0.30 in ADL.
• Hence, we cannot draw ‘longitudinal’ conclusions from a ‘cross-sectional’ experiment.
10.4 Statistical inference
10.4.1 Introduction
• We obtained the following regression output:
• The p-values listed test the hypotheses
H0 : β0 = 0 versus HA : β0 ≠ 0 and H0 : β1 = 0 versus HA : β1 ≠ 0
• Indeed, the least squares method allows us to calculate the straight line that best describes our observations (xi, yi).
• However, a different sample from the same population would lead to a different regression line
y = β0 + β1x
• Illustration:
→ regression plots
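The sampling variability of the fitted line can be mimicked with a small simulation. A minimal Python sketch, using illustrative population values (β0 = 12, β1 = −0.30, not the course data): each simulated sample yields a different least squares line.

```python
import random

def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the line y = b0 + b1*x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

random.seed(1)
slopes = []
for _ in range(200):  # 200 independent samples of size 50
    xs = [random.uniform(10, 30) for _ in range(50)]
    ys = [12 - 0.30 * x + random.gauss(0, 1) for x in xs]
    slopes.append(fit_line(xs, ys)[1])

# The 200 estimated slopes scatter around the population value -0.30
print(min(slopes), max(slopes))
```

The spread of the 200 slopes is exactly the sampling variability that p-values and confidence intervals have to account for.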
• Based on a sample, and hence the corresponding estimators β̂0 and β̂1, statistical inference (p-values, confidence intervals) aims to make a statement about the regression line
y = β0 + β1x
that captures the relationship in the entire population.
• This is not possible without additional assumptions about the distribution from which the data are sampled.
• The assumptions needed are described by the so-called regression model.
10.4.2 The simple linear regression model
• In realistic situations, the points (xi, yi) will never describe a perfect straight line, but rather a cloud of points.
• This implies that the observations do not satisfy
yi = β0 + β1xi
but rather
yi = β0 + β1xi + εi
where εi expresses how much an observation yi lies above or below the regression line.
• The quantities εi are termed errors, and the linear regression model assumes that they are distributed following a normal distribution with mean 0 and (unknown) variance σ²:
εi ∼ N(0, σ²)
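As a sketch, data satisfying this model can be generated directly; the values b0 = 12, b1 = −0.30, σ = 1 below are illustrative assumptions, not the course data:

```python
import random
import statistics

# Generate data from the model y_i = b0 + b1*x_i + e_i with e_i ~ N(0, sigma^2)
random.seed(7)
b0, b1, sigma = 12.0, -0.30, 1.0
xs = list(range(10, 31))                       # covariate values
errors = [random.gauss(0, sigma) for _ in xs]  # the epsilon_i
ys = [b0 + b1 * x + e for x, e in zip(xs, errors)]

# The generated errors have mean near 0 and spread near sigma,
# exactly what the model assumes about the epsilon_i
print(round(statistics.mean(errors), 2), round(statistics.stdev(errors), 2))
```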
• Note that the εi are the ‘theoretical version’ of the residuals ei
• Hence, the regression model assumes . . .
. . . . linearity: for each X, the mean of the corresponding Y-values lies on the regression line
. . . . normality: for each X, the corresponding Y-values lie symmetrically around the regression line
. . . . constant variance: the prediction errors for small X-values are neither larger nor smaller than the errors for large X-values
10.4.3 Significance tests for β0 and β1
• If the slope β1 is equal to zero, then the regression model is described by
yi = β0 + εi
which implies that there is no linear relationship between Y and X .
• In practice, if we want to test whether there is a linear relationship between X and Y, then we need to test the null hypothesis:
H0 : β1 = 0 versus HA : β1 ≠ 0
• The value observed in our sample is β̂1 = −0.30
• This value could have been obtained by chance, even if β1 = 0 would hold in the total population.
• Research question:
How likely is it to observe β̂1 = −0.30 even if β1 = 0?
• Illustration:
→ histograms of slope and intercept
• It is clear that, when β1 = 0, it is very unlikely to still observe β̂1 = −0.30.
• Note that it would be equally unlikely to observe β̂1 = +0.30.
• The chance that we would find an estimate with |β̂1| ≥ 0.30 is p < 0.0001.
• Given that this probability is so small, more specifically that p < α = 0.05 = 5%, we conclude that what has been observed (β̂1 = −0.30) is sufficient indication to believe that β1 ≠ 0.
• Hence we reject the null hypothesis and conclude that β1 is significantly different from 0, at the 5% significance level.
• Apart from testing hypotheses, the regression model also allows constructing confidence intervals.
• For example, a 95% C.I. for β1 in our example is [−0.378; −0.218].
• Given that this interval is far away from 0, this is again strong evidence that β1 ≠ 0.
• Analogously, a significance test can be constructed for
H0 : β0 = 0 versus HA : β0 ≠ 0
• In practice, one is primarily interested in tests for β1.
• Note that all tests and confidence intervals are valid only when all regression model assumptions are satisfied.
10.5 The ANOVA table
• How much better can we predict Y , given that we know X?
[Figure: scatter plot of points (xi, yi) with the fitted line y = β̂0 + β̂1x, marking for one point the observed value yi and the fitted value ŷi at xi]
• Intuitively, this should be related to how well the dataset is described by the regression line
• If we did not have the x-values, then the best possible prediction for each yi-value is the sample average ȳ.
• A measure for the error so made is the sum of squares ∑i (yi − ȳ)²
• Note that this is a measure for the variability in the yi.
• If we do use the observed xi-values to predict the y-values, then we predict each yi by means of
ŷi = β̂0 + β̂1xi
• A measure for the error so made is the sum of squares ∑i (yi − ŷi)² = ∑i ei²
• Because the use of this extra information coming from the xi leads to more precise predictions, we have that
∑i (yi − ȳ)² ≥ ∑i (yi − ŷi)²
• One can show that
∑i (yi − ȳ)² = ∑i (yi − ŷi)² + ∑i (ŷi − ȳ)²
i.e., SSTO = SSE + SSR
• SSTO: Total sum of squares. This term captures the total error made by predicting the yi without taking into account the observed values xi.
• SSE: Error sum of squares. This term captures the error made upon predicting the yi by making use of the observations xi.
• SSR: Regression sum of squares. This term captures the decrease in error by predicting the values yi with, rather than without, making use of the covariates.
• A measure for how well the data points (xi, yi) agree with the regression line is
R² = SSR / SSTO
• R² enjoys the following properties:
. 0 ≤ R² ≤ 1
. R² = 0 implies that SSR = 0 and hence that all ŷi are equal to ȳ, i.e., the regression line is flat. This is equivalent to β̂1 = 0.
. R² = 1 implies that SSE = 0. This implies that ŷi = yi for all i, and hence that all points (xi, yi) lie on the regression line.
• R² expresses ‘the fraction of the variability in the yi which can be explained by the xi’.
• One can show that R² is equal to r², the square of the correlation between the xi and yi values.
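Both facts can be checked numerically. A minimal Python sketch on a small made-up data set (not the course data):

```python
import math

def regression_anova(xs, ys):
    """Least squares fit plus the three ANOVA sums of squares."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    yhat = [b0 + b1 * x for x in xs]
    ssto = sum((y - ybar) ** 2 for y in ys)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
    ssr = sum((yh - ybar) ** 2 for yh in yhat)
    return ssto, sse, ssr

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
ssto, sse, ssr = regression_anova(xs, ys)
assert abs(ssto - (sse + ssr)) < 1e-9   # the decomposition SSTO = SSE + SSR

r2 = ssr / ssto                         # fraction of variability explained
xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / math.sqrt(
    sum((x - xbar) ** 2 for x in xs) * sum((y - ybar) ** 2 for y in ys))
assert abs(r2 - r ** 2) < 1e-9          # R^2 equals the squared correlation
```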
10.6 Application
• ANOVA table for regression of ADL on MMSE on day 1 post-operatively:
• ‘R-square’: R² = 0.4940; the regression can explain about 50% of the total variability in the yi values:
R² = SSR / SSTO = 351.23 / (351.23 + 359.76) = 0.4940
• The Pearson correlation, found before, was:
r = −√R² = −√0.4940 = −0.70
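This arithmetic can be reproduced directly from the two sums of squares:

```python
# SSR and SSE values taken from the ANOVA table above
ssr, sse = 351.23, 359.76
r2 = ssr / (ssr + sse)   # SSR / SSTO, since SSTO = SSR + SSE
r = -(r2 ** 0.5)         # negative root, because the estimated slope is negative
print(round(r2, 4), round(r, 2))   # about 0.494 and -0.70
```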
10.7 Examples from the biomedical literature
• Kiekkas et al. [10], Table 2:
Table 2. Simple linear regression in all intensive care unit patients: mean daily Projet de Recherche en Nursing Rea and its categories as dependent variables and Acute Physiology and Chronic Health Evaluation II as explanatory variable

                      R²        a (M ± SE)      b (M ± SE)
PRN Rea               0.256*    121.1 ± 4.1*     1.9 ± 0.3*
Respiration           0.103*     21.5 ± 1.5*     0.4 ± 0.1*
Nutrition             0.003       2.6 ± 0.6*    −0.1 ± 0.1
Elimination           0.021       0.1 ± 0.2      0.1 ± 0.1
Hygiene               0.009      25.4 ± 0.7*    −0.1 ± 0.1
Mobilization          0.016      10.7 ± 0.7*     0.1 ± 0.1
Communication         0.029**     9.1 ± 1.0*    −0.1 ± 0.1**
Diagnostic methods    0.194*     30.5 ± 1.9*     0.7 ± 0.1*
Treatments            0.221*     21.5 ± 2.0*     0.9 ± 0.1*

R², determination coefficient; a, b, unstandardized regression coefficients (a: intercept, b: slope); M ± SE, mean ± standard error.
*p < 0.01. **p < 0.05.
• Frilund & Fagerstrom [11], statistical methodology section:
Data analysis
In order to establish the units’ optimal NI-level certain values were needed: the mean OPC points per day and per care giver, the personnel resources for the specific day (i.e. the actual time used to meet the needs of the patients), and a mean of the PAONCIL assessments for the same day. The data was analysed by means of a simple linear regression analysis. It is possible to predict a dependent variable by means of a regression equation Y = a + bx, i.e. it was possible to calculate the optimal NI score per nurse, that is, the score which led to the average PAONCIL value zero. It is analysed to what extent the independent variable (x = the mean OPC points per care giver per day) explains the variation in values of the dependent variable (y = the mean of the PAONCIL assessments) on the basis of a linear relationship. The explanatory power determination coefficient (R²) gives the proportion of variance in Y that is counted for by x. If for instance R² is 0.3, the model explains 30% of the variation in the values of the outcome variable (Fagerstrom et al. 2000 b,
Chapter 11
Model diagnostics
. Example
. Linearity
. Constant error variance
. Normality of the errors
. Examples from biomedical literature
11.1 Example
• We wish to assess whether a patient’s dependence (ADL), one day after surgery, can be used to predict a patient’s length of stay:
• There appears to be a slight increase of length of stay, as a function of the ADL score. Is this relationship significant?
• Therefore, we fit the following regression model:
Length of stay = β0 + β1 × ADL + εi
• Regression output:
• The parameter estimates are:
. β̂0 = 9.37
. β̂1 = 0.29, p-value: 0.1173
• The fitted regression line is
Length of stay = 9.37 + 0.29 × ADL
• Note that there is no significant relationship between length of stay and ADL score, 1 day post operation.
• Further, it follows from R² = 0.0432 that ADL explains only 4% of the total variability in length of stay.
11.2 Model assumptions
• The statistical inferences, obtained for the regression parameters, are valid only if the model assumptions are satisfied, i.e.,
yi = β0 + β1xi + εi,   εi ∼ N(0, σ²)
• Hence, the regression model assumes that . . .
. . . . linearity: for each X, the mean of the corresponding Y-values lies on the regression line
. . . . normality: for each X, the corresponding Y-values lie symmetrically around the regression line
. . . . constant variance: the prediction errors for small X-values are neither larger nor smaller than the errors for large X-values
• How can these assumptions be verified?
11.3 The assumption of linearity
• To illustrate the effect of non-linearity, consider the following fictitious example:
• There clearly is a positive relationship between the xi and yi, but the relationship appears to deviate somewhat from linearity.
• What happens if we still apply linear regression?
• Regression output:
• R² = 0.85: X explains 85% of the observed variability in Y.
• The regression line is given by
Y = 1.19 + 2.06X
• The slope β1 is significantly different from zero (p < 0.001).
• The observed points all lie close to the fitted regression line (explaining the high R²), but the straight line poorly describes the relationship between the xi and yi:
. Over-estimation of the yi for small and large xi
. Under-estimation of the yi for intermediate xi values
• The graph suggests that non-linearity can be discerned by studying the residuals
ei = yi − ŷi = yi − (β̂0 + β̂1xi)
and plotting them as a function of x:
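A sketch of this residual check, using made-up concave data that mimic the fictitious example (not the course data):

```python
import math

def fit_line(xs, ys):
    """Least squares estimates (b0, b1)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    return ybar - b1 * xbar, b1

xs = list(range(1, 11))
ys = [math.sqrt(x) for x in xs]          # concave, hence non-linear, trend
b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Systematic sign pattern: negative residuals at the extremes of x
# (over-estimation), positive in the middle (under-estimation)
print([round(e, 2) for e in residuals])
```

The residuals sum to zero by construction, so only a systematic sign pattern, not their average, reveals non-linearity.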
• If the assumption of linearity were satisfied, then, for each value of X, the corresponding values of Y would lie symmetrically around the regression line. The residuals ei would then have to lie symmetrically around zero, for all possible X values.
• Clearly, this is not satisfied in the above example.
• Note that the residuals in fact suggest that the relationship between the yi and the xi is rather a quadratic function. We return to this point as part of polynomial regression.
• Oftentimes, the covariate X can be transformed so that the yi, as a function of the transformed xi, can be assumed linear.
• Frequently used transformations include ln(X), √X, 1/X, exp(X), ln(X + 1), . . .
• For our fictitious example we try a logarithmic transformation of the observed xi:
xi −→ ln(xi)
• Regression output after log-transformation of X:
• Accompanying graph:
• Residual plot:
• R² = 0.92: Our model has improved, because we now can explain more variability in the y-values by means of the x-values.
• The estimated regression curve now is
Y = 2.95 + 0.80 ln(X)
• Hence, the transformation complicates the interpretation of the regression coefficients. For example, 0.80 is the estimated increase in Y when ln(X) increases by one unit.
• At the same time, the transformation is necessary to make the assumption of linearity more realistic, which in turn implies that our statistical inferences w.r.t. β0 and β1 improve.
11.4 Example: Length of stay versus ADL
• We now check whether the linearity assumption is satisfied in the regression model employed for the prediction of length of stay by means of the ADL score, 1 day post operation.
• The residual plot does not indicate any systematic trend in the residuals:
11.5 The assumption of constant variance
• For illustration, we study the relationship between diastolic blood pressure and age, using data of 54 healthy adult women, between 20 and 60 years of age:
• We conduct a regression of blood pressure on age:
• The regression explains more than 40% of the variability in blood pressure (R² = 0.4077); there is a significant (p < 0.0001) linear relationship between age and blood pressure; the estimated regression line is:
Blood pressure = 56.16 + 0.58 × Age
• Given that the residuals ei = yi − ŷi can be interpreted as estimates of the theoretical deviations εi, we can assess the assumption of constant variance for the εi via a scatter plot of the residuals:
• The residuals show that the linearity assumption is satisfied.
• On the other hand, the residual plot suggests that the variance of the εi increases with age.
• Violation of this assumption will lead to less than optimal inferences about the parameters β0 and β1:
. The estimated regression line remains correct
. The parameters β0 and β1 are estimated less precisely. This leads to larger p-values, and hence a linear relationship between X and Y may go undetected.
• An optimal analysis is obtained through a so-called weighted least squares analysis.
• Non-constant variance is often paired with non-normality. A solution for the non-normality problem very often also solves, on the side, the non-constant-variance problem.
11.6 Example: Length of stay versus ADL
• To check the assumption of constant residual variance for the regression model, employed to predict length of stay by means of the ADL score, 1 day post operation, we re-consider the residual scatter plot, already created to assess linearity:
• Apart from the outlier in the middle, there are no systematic trends in the variability of the residuals.
• We can therefore accept the assumption of constant residual variance.
11.7 The assumption of normality
• Given that the residuals ei = yi − ŷi are estimators for the theoretical deviations εi, it is natural to assess the assumption of normality via the residuals.
• In practice, one often uses a combination of two methods:
. Graphical: a histogram of residuals
. A formal test for normality
• Both techniques are illustrated by means of the blood pressure data in 54 women.
11.7.1 A histogram of residuals
• A simple graphical way to explore the distribution of the residuals is by means of a histogram, together with the normal distribution that most closely fits the histogram:
• From this histogram, it follows that:
. There is no evidence for asymmetry in the distribution of the residuals
. The distribution appears not to be too different from the normal distribution
• We conclude that there is no graphical evidence for non-normal errors εi
11.7.2 The normality test
• Most software packages allow for a formal normality test
• One tests the null hypothesis
H0 : the data are normally distributed
versus the alternative hypothesis
HA : the data are not normally distributed
• Various testing procedures are possible, all leading to a p-value, allowing us to either reject or accept the null hypothesis
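The formal tests are best left to statistical software. As an informal complement only (not one of the formal procedures), the symmetry of residuals can be quantified by the sample skewness; a rough Python sketch with simulated residuals:

```python
import math
import random

def sample_skewness(data):
    """Standardized third moment: near 0 for symmetric (normal-like) data."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))
    return sum(((x - m) / s) ** 3 for x in data) / n

random.seed(3)
normal_resid = [random.gauss(0, 1) for _ in range(500)]       # symmetric
skewed_resid = [random.expovariate(1.0) for _ in range(500)]  # right-skewed

print(round(sample_skewness(normal_resid), 2))  # close to 0
print(round(sample_skewness(skewed_resid), 2))  # clearly positive
```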
• Histogram with results of 3 normality tests:
• We obtain a histogram with the normal approximation, but also with the results of 3 test procedures for normality: Shapiro-Wilk, Kolmogorov-Smirnov, and Lilliefors. The first two are the more common ones.
• Based on each of the 3 procedures, the null hypothesis of normality would be accepted. We conclude that the residuals ei, and hence the errors εi, are normally distributed.
11.7.3 Histogram ←→ normality test
• The histogram is an exploration technique to study the distribution of the residuals.
• The normality test is a formal procedure, allowing us to test whether the assumption of normality is acceptable.
• In (very) large samples, rejection of normality based on a statistical testing procedure is rather likely: even the smallest deviations from normality will be detected.
• It is known that small deviations from normality will still lead to correct results, as long as the errors are symmetric.
• Hence, if non-normality is not due to asymmetry, then the results obtained will still be reliable.
11.8 Example: Length of stay versus ADL
• We consider again the regression of length of stay on the ADL score, 1 day post operation, for the hip fracture patients.
• The residuals are clearly non-normally distributed.
• From the histogram, it follows that non-normality is due to asymmetry.
• In case non-normality results from asymmetry, one can sometimes transform the y-values so as to make the residuals in the new regression normally distributed.
• Frequently used transformations are ln(Y), √Y, 1/Y, exp(Y), ln(Y + 1), . . .
• In our example, we have to transform the data (the y-values) such that the larger residuals approach the bulk of the residuals.
• A possible transformation is
Length of stay −→ ln(Length of stay)
• Note that all observed values of length of stay are positive, implying that a logarithmic transformation is mathematically allowed.
• Before interpreting the regression model output, we check whether the distribution of the new residuals is closer to a normal distribution:
• Hence, we can conclude that the errors in the new regression model are normally distributed.
• New regression output:
• The regression model is slightly improved, given that the R² value has increased from 0.0432 to 0.0670.
• The regression line is:
ln(Length of stay) = 2.23 + 0.02 × ADL
• Now, we do find a significant relationship: p = 0.0497, in contrast with p = 0.1173 prior to transformation.
• Note that the relationship derived is no longer linear.
• This example underscores the need to check normality of the errors, given that possible non-normality can strongly distort the results.
• The transformation of the y-values can, in turn, distort linearity and/or induce non-constant variance of the errors εi. It is therefore useful to construct, after transformation, a scatter plot of the y-values versus the residuals:
• Linearity and constant variability remain satisfied.
11.9 General Conclusion
• Carrying out a regression is easy
• Evaluating a regression model is difficult
11.10 Examples from the biomedical literature
• Bjork et al. [12], methodology section:
Data analyses
Descriptive statistics was used to describe the characteristics of the participants (age, gender, ADL dependence and cognitive impairment) and the prevalence of resident engagement in the everyday activities. Descriptive results of categorical data are presented as actual numbers, percentages and results of continuous data are presented as means and standard deviations and median. The distribution of quantitative variables was examined for normal distribution. Simple linear regression analyses were performed with the total score for thriving as the outcome variable. A total of 26
→ normality check of original variables !
• Kiekkas et al. [10]:
. Methodology section:
collected data, and statistical significance was set at p < 0.05. Kolmogorov-Smirnov test was used to check whether continuous variables (age, APACHE II score, mean daily PRN Rea score and ICU length of stay) were normally distributed. According to APACHE II values, patients were divided into six clinical severity groups (cutoff at every five points), and analysis of variance was performed to identify differences in nursing workload among patient groups. Dunnett’s test was used for comparing the lowest clinical severity group (control group, because the researchers supposed a positive correlation between clinical severity and nursing workload) to each of the other five groups. Simple linear regression was used to estimate the variability of mean daily PRN Rea score (and categories of PRN Rea) with respect to APACHE II score, within the entire patient population as well as within patient subgroups.
→ normality check of outcome rather than residuals !
. Table 2:
Table 2. Simple linear regression in all intensive care unit patients: mean daily Projet de Recherche en Nursing Rea and its categories as dependent variables and Acute Physiology and Chronic Health Evaluation II as explanatory variable

                      R²        a (M ± SE)      b (M ± SE)
PRN Rea               0.256*    121.1 ± 4.1*     1.9 ± 0.3*
Respiration           0.103*     21.5 ± 1.5*     0.4 ± 0.1*
Nutrition             0.003       2.6 ± 0.6*    −0.1 ± 0.1
Elimination           0.021       0.1 ± 0.2      0.1 ± 0.1
Hygiene               0.009      25.4 ± 0.7*    −0.1 ± 0.1
Mobilization          0.016      10.7 ± 0.7*     0.1 ± 0.1
Communication         0.029**     9.1 ± 1.0*    −0.1 ± 0.1**
Diagnostic methods    0.194*     30.5 ± 1.9*     0.7 ± 0.1*
Treatments            0.221*     21.5 ± 2.0*     0.9 ± 0.1*

R², determination coefficient; a, b, unstandardized regression coefficients (a: intercept, b: slope); M ± SE, mean ± standard error.
*p < 0.01. **p < 0.05.
Chapter 12
Influential observations
. Example
. Cook’s distance
. Application
. What to do with influential observations ?
. Example from biomedical literature
12.1 Example
• We consider again the regression of ln(Length of stay) on the ADL score, 1 day post operation:
• Patient #20 has an ADL score of 17, and is hospitalized during 36 days, which is exceptionally long in comparison with the other patients.
• For subject #20, the residual ei = yi − ŷi is, therefore, very large.
• Given that the parameters β0 and β1 are estimated via the least squares method, it is legitimate to investigate how strongly our results β̂0 and β̂1 are influenced by this individual.
• A subject is highly influential if deleting the subject leads to strongly differing results.
• Influential observations make interpreting the results more difficult, because the conclusions become sample-dependent: a different sample would have led to different results.
• To study a subject’s influence, we can compare β̂0 and β̂1 with and without the given subject.
• To illustrate the method, we consider subject #20 and investigate the effect of deleting this patient. We also investigate what the effect would have been, had the subject not had an ‘average’ ADL score, but rather a very large (24) or very small (10, 5, 0) ADL.
• Results for ADL= 17:
• Results for ADL= 24:
• Results for ADL= 10:
• Results for ADL= 5:
• Results for ADL= 0:
• Summary of the regression results:
With subject #20 Without subject #20
ADL Parameter Estimate (p-value) Estimate (p-value)
17 Intercept (β0) 2.233 (<0.001) 2.191 (<0.001)
Slope (β1) 0.022 (0.0497) 0.024 (0.0219)
24 Intercept (β0) 2.088 (<0.001) 2.191 (<0.001)
Slope (β1) 0.030 (0.0056) 0.024 (0.0219)
10 Intercept (β0) 2.420 (<0.001) 2.191 (<0.001)
Slope (β1) 0.012 (0.2801) 0.024 (0.0219)
5 Intercept (β0) 2.541 (<0.001) 2.191 (<0.001)
Slope (β1) 0.005 (0.6246) 0.024 (0.0219)
0 Intercept (β0) 2.636 (<0.001) 2.191 (<0.001)
Slope (β1) -0.0003 (0.9764) 0.024 (0.0219)
• In general, a subject is influential if the following two conditions are satisfied:
. The subject is an outlier, i.e., the value yi is exceptionally large or small, given its xi value.
. The subject is located at the edge of the X-space; in our example this means that a large or small ADL score (day 1) is observed.
12.2 Cook’s distance
• The detection of influential subjects requires the following steps:
. Carry out the regression on all subjects
. Step 1: leave out the first subject and compare the new results with those based on all data
. Step 2: leave out the second subject and compare the new results with those based on all data
. Step 3: leave out the third subject and compare the new results with those based on all data
. . . .
. Step n: leave out the last subject and compare the new results with those based on all data
• In each step, we have to compare the results obtained in the absence of a certain subject with those obtained based on all data.
• This can be done with Cook’s distance, which measures the ‘distance’ between the results with and without such an observation.
• Cook’s distance for the ith observation is denoted by Di.
• Influential subjects correspond to large Di.
• Non-influential subjects correspond to small Di.
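The leave-one-out scheme can be implemented directly; a minimal Python sketch on a hypothetical data set with one aberrant point at the edge of the x-range (the data and the scaling Di = Σj (ŷj − ŷj(i))² / (p · MSE), with p = 2, are this sketch's assumptions):

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    return ybar - b1 * xbar, b1

def cooks_distances(xs, ys):
    """D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * MSE), p = 2 in simple regression."""
    n, p = len(xs), 2
    b0, b1 = fit_line(xs, ys)
    yhat = [b0 + b1 * x for x in xs]
    mse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat)) / (n - p)
    ds = []
    for i in range(n):
        a0, a1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        yhat_i = [a0 + a1 * x for x in xs]  # refit predictions at ALL x-values
        ds.append(sum((u - v) ** 2 for u, v in zip(yhat, yhat_i)) / (p * mse))
    return ds

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 2.0]  # last point is aberrant
d = cooks_distances(xs, ys)
print([round(v, 2) for v in d])  # the last D_i dominates
```

The aberrant point is both an outlier and at the edge of the x-range, so its Di dwarfs the others, just as described above.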
12.3 Application
• We apply this to the regression of ln(Length of stay) on the ADL score, 1 day post operation.
• Most software packages allow calculation of Cook’s distance for all observations.
• Note that D20 is relatively large.
• In particular for large data sets, an index plot of Cook’s distances can be very handy, possibly upon explicitly constructing a variable with observation numbers:
• Apart from subject #20, we also find that subject #45 has a relatively large Di.
• It is therefore of interest to carry out the analysis with each of these observations removed in turn.
• The results with all observations, without observation #20, and without observation #45, respectively, are:
12.4 What to do with influential observations ?
• Does removing influential subjects lead to qualitatively different results?
• Are the data for influential subjects correct?
. Data-entry errors
. Mixing-up of patients case forms
. . . .
• Do influential subjects satisfy the inclusion/exclusion criteria of the study?
. Are these genuine hip fracture patients?
. Could there be an additional complication/comorbidity that could explain their influence?
. . . .
• When there are no objective criteria for omission, influential subjects ought to be kept in the study.
• Possibly, the least squares criterion can be replaced by a different criterion that is less sensitive to individual observations.
=⇒ Robust regression techniques
12.5 Example from the biomedical literature
Archbold et al. [13], Results section p. 172 and Figure 2:
Part IV
One-way analysis of variance
Chapter 13
The unpaired t-test
. Example
. The unpaired t-test
. Example
. Variability within versus between groups
13.1 Example
• We study the relationship between the ADL score, 1 day post operation, and the pre-operative neuro-psychiatric condition of the patient, i.e., we want to compare the average ADL score between neuro- and non-neuro patients.
• Descriptive statistics:
• Graphical representation:
• We note that, on average, the neuro patients exhibit a higher ADL score and hence are more dependent.
• How can we test whether this difference can be ascribed to chance? In other words, to what extent is this difference significant?
• Indeed, even if there were no difference between both neuro groups (in the population), we still might observe differences, purely due to chance, in the sample.
• Illustration:
→ Anova
13.2 The unpaired t-test
• We have two independent groups of patients, and hence two sets of ADL measurements:
. y11, y12, y13, . . . , y1n1: the measurements in the first group
. y21, y22, y23, . . . , y2n2: the measurements in the second group
• Both groups do not necessarily have the same number of observations: n1 and n2.
• The unpaired t-test assumes that the measurements in both groups are normally distributed with the same spread, but perhaps a different mean:
Y1j ∼ N(µ1, σ²)
Y2j ∼ N(µ2, σ²)
• Graphically:
[Figure: two normal densities with equal spread along the ADL axis, labeled ‘Non-neuro’ (mean µ1) and ‘Neuro’ (mean µ2)]
• The null hypothesis that we aim to test is
H0 : µ1 = µ2
versus the alternative hypothesis
HA : µ1 ≠ µ2
• The test statistic employed for this purpose is:
T = (y2· − y1·) / (sp √(1/n1 + 1/n2))
where y1· and y2· are the observed means in the first and second group, respectively:
y1· = (1/n1) Σ_{i=1..n1} y1i        y2· = (1/n2) Σ_{i=1..n2} y2i
and where s²p is the ‘pooled’ sample variance, an estimate for the common variance σ²:
s²p = [(n1 − 1)s²1 + (n2 − 1)s²2] / (n1 + n2 − 2)
which is a weighted average of the sample variances in both groups separately.
• Note that the test statistic T is a measure for the separation between the observed samples.
• In our example, the T-value is:
T = (20 − 17.15) / ( √[((40 − 1)·11.51 + (20 − 1)·9.37) / (40 + 20 − 2)] · √(1/40 + 1/20) ) = 3.16
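The hand calculation above can be verified with a short Python sketch (scipy is assumed to be available; reading 11.51 and 9.37 as the sample variances of the n = 40 and n = 20 groups is an assumption based on the formula shown):

```python
from math import sqrt
from scipy.stats import t as t_dist

# Summary statistics read off the slides (assumption: 11.51 and 9.37 are the
# sample variances of the n=40 and n=20 groups, respectively)
n1, mean1, var1 = 40, 17.15, 11.51
n2, mean2, var2 = 20, 20.0, 9.37

# Pooled sample variance: weighted average of the two group variances
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Unpaired t statistic
T = (mean2 - mean1) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))

# Two-sided p-value from the t distribution with n1 + n2 - 2 = 58 df
p = 2 * t_dist.sf(abs(T), n1 + n2 - 2)
```

This reproduces the T of about 3.16 and the two-sided p of about 0.002 reported in the slides (up to rounding of the intermediate values).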
• Under the null hypothesis, i.e., when µ1 = µ2, we expect T to be small.
• We are interested in knowing to what extent T = 3.16 can be obtained purely by chance.
• We calculate the probability to observe a T at least as large as the current value 3.16, in the case that both populations in reality have equal means, i.e., when µ1 = µ2.
• Illustration:
→ Hypothesis Test (two samples)
• It is clear that, if there are no differences between both populations, it would be very unlikely to observe T = 3.16 or T = −3.16.
• The probability to observe |T | ≥ 3.16 purely by chance is p = 0.002.
• Given that this probability is so small, more specifically p < α = 0.05 = 5%, we will conclude that what has been observed, T = 3.16, is sufficient indication to accept that µ1 ≠ µ2.
• We reject the null hypothesis and conclude that µ1 and µ2 are significantly different, at the 5% significance level.
• We reject the null hypothesis that the average ADL score is equal between the neuro and non-neuro patients.
• Note that the calculation of the p-values makes use of the assumptions:
. Normality within both groups
. Common variance between both groups
• Checking these assumptions can be done in exactly the same way as with 1-way ANOVA, and will therefore be explained when model diagnostics for ANOVA are discussed.
13.3 Example
• A typical unpaired t-test output:
• The unpaired t-test assumes that the variance is the same in both groups. This assumption is tested automatically. If the hypothesis of equal variances is rejected, then an appropriately corrected t-test can be carried out.
• The hypothesis of equal variances is acceptable (p = 0.642).
13.4 Variability within versus between groups
• The unpaired t-test rejects H0 if |T | is large, which is equivalent to
T² = (y2· − y1·)² / [s²p (1/n1 + 1/n2)]
being large.
• The numerator of T² measures separation between the group averages, and is a measure for the variability between both groups.
• The denominator of T² contains s²p, which is an estimator for σ², and hence is a measure for the variability within groups.
• Hence, the unpaired t-test rejects the null hypothesis if the variability between groups is large enough, as compared with the variability within groups.
• This principle is applied in ANOVA to compare more than two groups.
Chapter 14
1-way ANOVA
. Example
. Pairwise t-tests
. 1-way ANOVA
. Example
. Model diagnostics
. Influential observations
. Examples from the biomedical literature
14.1 Example
• Because we expect that the ADL score post operation is not only influenced by operation-specific factors, but also by, for example, how dependent the patient was prior to the operation, we study the relationship between the ADL score and the patient's living condition prior to operation.
• We distinguish between the following classes:
. Single
. With partner / family / religious community
. RH/RVT (Retirement-Home / Retirement and Care Home)
. Other
• Descriptive statistics and graphical exploration:
• The fourth group contains only 1 subject, and will not be included for analysis.
• From the graph, it appears that the average ADL score in RH/RVT patients is higher than in the other two groups. Is this difference significant?
• Even if the three populations would have the same mean, it would still be possible to observe differences in the sample, purely by chance.
• How large is the probability that we observe this type of difference?
• Illustration:
→ Anova
14.2 Pairwise t-tests
• In analogy with the unpaired t-test, we assume that we now have r different sets of measurements (in the example, r = 3):
. y11, y12, y13, . . . , y1n1 the measurements in the first group
. y21, y22, y23, . . . , y2n2 the measurements in the second group
. . . .
. yr1, yr2, yr3, . . . , yrnr the measurements in the rth group
• Further, we assume that the measurements are sampled from the following distributions:
Y1j ∼ N(µ1, σ²), Y2j ∼ N(µ2, σ²), . . . , Yrj ∼ N(µr, σ²)
• The null hypothesis that we want to test is
H0 : µ1 = µ2 = . . . = µr
versus the alternative hypothesis
HA : not all µi equal
• When the above null hypothesis is not satisfied, then at least two of the means µi must be different. Therefore, we can, in principle, use unpaired t-tests. For r = 3, this would mean that we test the following hypotheses:
H0 : µ1 = µ2
H0 : µ1 = µ3
H0 : µ2 = µ3
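These three pairwise tests can be sketched in a few lines of Python. The data below are hypothetical stand-ins (the real ADL scores are not reproduced here), so the resulting p-values only illustrate the mechanics:

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
# Hypothetical ADL-like scores for the three living-condition groups
groups = {
    "Single": rng.normal(17.0, 3.0, 15),
    "Partner/family/relig.": rng.normal(17.0, 3.0, 25),
    "RH/RVT": rng.normal(21.0, 3.0, 14),
}

# One unpaired t-test (equal variances assumed, as in the slides) per pair
pairwise_p = {}
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    _, p = ttest_ind(a, b, equal_var=True)
    pairwise_p[(name_a, name_b)] = p
```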
• For our example, we obtain the following p-values:
Single Partner/family/relig. RH/RVT
Single — 0.8763 0.0013
Partner/family/relig. 0.8763 — <0.0001
RH/RVT 0.0013 <0.0001 —
• Hence, we only find significant differences between the RH/RVT patients on the one hand and the other two groups on the other hand.
• Note that, for each test conducted, there is a 5% chance of a type-I error (incorrectly rejecting H0).
• It can be shown that, for our example, the total probability for a type-I error satisfies:
P(H0 rejected | H0) = P(at least 1 significance | µ1 = µ2 = µ3) ≤ 3 × 5% = 15%
so that the chance for a type-I error is larger than the requested 5%.
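The bound above is the union (Bonferroni) bound; under the additional assumption of independent tests, the family-wise error rate can even be computed exactly. A minimal sketch:

```python
def familywise_error(k, alpha=0.05):
    """Probability of at least one type-I error among k tests at level alpha."""
    exact_independent = 1 - (1 - alpha) ** k  # exact only if the tests are independent
    union_bound = k * alpha                   # always an upper bound
    return exact_independent, union_bound

exact, bound = familywise_error(3)
# for k = 3 tests at the 5% level: about 14.3% under independence, bounded by 15%
```

(The three pairwise t-tests share data and are therefore not independent, which is why the slides use the inequality ≤ 3 × 5%.)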
• In general, when conducting k tests, the total probability for a type-I error can increase to k × α, and hence become large when the number of tests conducted is large.
• It is therefore necessary to have at our disposal a test statistic that allows us to test the null hypothesis
H0 : µ1 = µ2 = . . . = µr
without having to conduct all pairwise t-tests.
=⇒ ANOVA
14.3 1-way ANOVA
• ANOVA (Analysis of variance) is an extension of the unpaired t-test to the comparison of more than 2 groups.
• Like with the t-test, the test procedure will compare the variability between groups with the variability within groups.
• The following equality plays a central role:
Σ_{i=1..r} Σ_{j=1..ni} [yij − y··]²  =  Σ_{i=1..r} Σ_{j=1..ni} [yij − yi·]²  +  Σ_{i=1..r} ni [yi· − y··]²
          SSTO                      =              SSwithin                  +           SSbetween
[Figure: for groups 1, . . . , r, each observation yij is shown with its group mean yi· and the global mean y··, illustrating the deviations y1j − y1·, y1· − y··, and y1j − y··]
. y·· : global mean (all groups together)
. yi· : mean in the ith group
. yij : jth measurement in the ith group
• SSTO: total sum of squares. This term expresses the total variability in the data.
• SSwithin: within-group sum of squares. This term expresses the variability within the groups.
• SSbetween: between-group sum of squares. This term expresses the variability between the groups.
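The decomposition SSTO = SSwithin + SSbetween can be checked numerically on any data set; a sketch with hypothetical group data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical measurements for r = 3 groups of unequal size
groups = [rng.normal(m, 2.0, n) for m, n in [(17.0, 20), (17.5, 25), (21.0, 9)]]

y_all = np.concatenate(groups)
grand_mean = y_all.mean()

# Total, within-group, and between-group sums of squares
ss_total = ((y_all - grand_mean) ** 2).sum()
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# ss_total equals ss_within + ss_between (up to floating-point rounding)
```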
• In ANOVA, the null hypothesis is rejected if
F = [SSbetween/(r − 1)] / [SSwithin/(N − r)]
is large, where N = Σi ni is the total sample size.
• Note that F is the ratio of the variability between groups over the variability within groups, which is entirely analogous to the unpaired t-test. This motivates the terminology ‘ANOVA.’
• In our example, F = 8.59
• Under the null hypothesis, F is expected to be small.
• We wish to know to what extent F = 8.59 can be obtained purely by chance.
• We calculate the probability to observe an F at least as large as 8.59, in the case that all populations truly have equal means, i.e., when µ1 = µ2 = µ3.
• Illustration:
→ Histograms on ANOVA
• Clearly, when there is no difference between the three populations, then it is very unlikely to observe F = 8.59. More specifically, the chance to observe F ≥ 8.59 purely by chance is p = 0.0006.
• Given this chance is so small, more specifically p < α = 0.05 = 5%, we conclude that the observed value (F = 8.59) is sufficient indication to conclude that µ1, µ2, and µ3 are not all equal.
• We reject the null hypothesis and conclude that the three groups are significantly different at the 5% significance level.
• Note that the calculation of the p-values makes use of the assumptions made:
. Normality within all groups
. Equal variance for all groups
• Exactly like with linear regression, these assumptions need to be checked (see further).
14.4 Example
• ANOVA table with global F -test:
• The ‘SS MODEL’ is the SSbetween.
• The ‘SS Residual’ is the SSwithin.
• In the F statistic, SSbetween and SSwithin need to be divided by r − 1 = 3 − 1 = 2 and N − r = 54 − 3 = 51, respectively.
• These quantities are called the numbers of degrees of freedom (df) for SSbetween and SSwithin.
• The F statistic is
F = [SSbetween/(r − 1)] / [SSwithin/(N − r)] = (168.60/2) / (500.23/51) = 8.59
• The corresponding p-value is p = 0.0006, which points to significant differences between the three groups, as far as the average ADL on day 1 is concerned.
• As with regression, one can compute an R² statistic, indicating which portion of the variability in the ADL scores can be explained by the differences in living conditions (= variability between groups):
R² = SSbetween / SSTO = 168.60 / (168.60 + 500.23) = 0.252
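The table arithmetic can be reproduced from the two sums of squares alone (scipy is assumed to be available for the F tail probability):

```python
from scipy.stats import f as f_dist

# Sums of squares and degrees of freedom from the ANOVA table in the slides
ss_between, ss_within = 168.60, 500.23
df_between, df_within = 3 - 1, 54 - 3

F = (ss_between / df_between) / (ss_within / df_within)
p = f_dist.sf(F, df_between, df_within)  # P(F >= observed) under H0
r_squared = ss_between / (ss_between + ss_within)
# F is about 8.59, p about 0.0006, and R-squared about 0.25, as in the slides
```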
14.5 Model diagnostics
• With ANOVA, one implicitly assumes that the data are sampled from the following populations:
Y1j ∼ N(µ1, σ²), Y2j ∼ N(µ2, σ²), . . . , Yrj ∼ N(µr, σ²)
• Hence, we assume that . . .
. . . . constant variance: within every group the spread is equally large
. . . . normality: within each group the data are normally distributed
• When the assumptions are not satisfied, as with linear regression, erroneous statistical results can follow (p-values, confidence intervals, . . . ).
• How can the above assumptions be verified?
14.5.1 Assumption of constant variance
• Descriptive statistics and graphical exploration:
• Is there too much difference in the variance so as to doubt the assumption of equal variance?
• In other words, to what extent can the observed differences in variance be ascribed to chance?
• Statistica allows for a formal equal-variance test. The null hypothesis then is
H0 : σ²1 = σ²2 = . . . = σ²r
versus the alternative hypothesis
HA : not all σ²i equal
• A number of statistical tests are available, one of the most commonly used being Levene's test:
• Hence, we observe that the variances among the three groups are not significantlydifferent (p = 0.0808).
• When there are many groups, or when some groups contain (very) many observations, then small differences can be found to be significant by the formal testing procedure.
• At the same time, it is known that variances that are not too different pose little or no problem (→ analogy with linear regression).
• Therefore, one employs, next to the formal test for equal variances, also a rule of thumb, stating that variances should not differ by more than a factor of 5, to avoid adversely affecting the results.
• In our example, this is:
3.77² / 1.82² = 4.29
• In practice, one uses the formal test, combined with the rule of thumb, so as to assess whether the assumption of equal variance is satisfied.
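Both checks fit in a few lines of Python. Only the two standard deviations (3.77 and 1.82, read off the example) are taken from the slides; the raw data fed to Levene's test are hypothetical:

```python
import numpy as np
from scipy.stats import levene

# Rule of thumb: largest and smallest group variances should differ
# by no more than a factor of 5
sd_max, sd_min = 3.77, 1.82
variance_ratio = sd_max**2 / sd_min**2  # about 4.29, below the factor-5 limit

# Formal test on (hypothetical) raw data: Levene's test for equal variances
rng = np.random.default_rng(2)
g1, g2, g3 = (rng.normal(0.0, s, 18) for s in (3.77, 2.40, 1.82))
stat, p = levene(g1, g2, g3)
```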
14.5.2 Assumption of normality
• ANOVA assumes that the observations in every group are normally distributed, with common variance.
• Assuming common variance, how can normality be tested?
• We rewrite the ANOVA model as
Y1j = µ1 + ε1j
Y2j = µ2 + ε2j
. . .
Yrj = µr + εrj
where the ‘error terms’ εij all come from the same normal distribution with mean zero and variance σ².
• The distribution of the error terms εij can be examined after subtracting the population averages µ1, . . . , µr, which comes down to collapsing the population-specific distributions:
[Figure: the group-specific distributions N(µ1, σ²), . . . , N(µr, σ²) of the Yij are shifted by −µ1, . . . , −µr, so that they collapse onto a single N(0, σ²) distribution for the error terms εij]
• As with regression, we will check the assumption of normality for the εij via their estimates
eij = yij − µ̂i = yij − yi·
• As with regression, the eij are termed residuals: they represent the error made when the observed value yij for an individual in group i would be predicted by the group average yi·.
• Once the residuals eij have been computed, we can assess normality using their histograms, or using formal normality tests.
• This is performed in full analogy with linear regression.
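Computing the residuals and feeding them to a formal normality test takes only a few lines (hypothetical data; Shapiro-Wilk is used here as one common choice of formal test, not necessarily the one in the slides):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(3)
# Hypothetical measurements for three groups
groups = [rng.normal(m, 2.5, n) for m, n in [(17.0, 20), (17.5, 25), (21.0, 9)]]

# ANOVA residuals: each observation minus its own group mean,
# pooled over the groups
residuals = np.concatenate([g - g.mean() for g in groups])

# Formal normality test on the pooled residuals
stat, p_normality = shapiro(residuals)
```

Within each group the residuals sum to zero by construction, which is why normality is assessed on the pooled residuals rather than group by group.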
• Output:
• Hence, we can conclude that the assumption of normality is acceptable.
• Exactly as with simple regression, we have that:
. Departures from normality still lead to correct results, as long as the distribution of the errors is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so as to render the residuals in the new model normally distributed.
. However, some transformations can disrupt the constant variance, implying that this needs to be assessed again after transformation.
14.6 Influential observations
• In spite of the fact that, with ANOVA, we strictly speaking do not have regression parameters, individual observations can still have a large influence on the estimation of the group averages µi, and hence ultimately on the ANOVA results.
• Statistica allows us, exactly as with regression, to measure the influence of each observation by comparing the estimates µ̂i = yi· with those that would be obtained upon deletion of that observation.
• This results, again, in the so-called ‘Cook's distance,’ a distance between the estimates with and without a given observation.
• Exactly as with regression, we consider a scatter plot of Cook's distances versus the subject number.
• The computations are done in analogy with simple linear regression.
• Output:
• Hence, there are no observations with an unduly large influence.
14.7 Examples from the biomedical literature
• van Hooft et al. [14]:
. Analysis section p. 68:
2.4. Analysis
Descriptive data were generated for all variables. Statistical
analyses were performed using SPSS21 (SPSS Inc., Chicago, Il, USA).
Level of significance was set at p-value p < 0.05. Prior to analysis
the data was screened for repetitive response patterns (>10% of the
answers the same on the SEPPS-36; n = 5), and missing subscale
scores (>10% of the items of the subscale). The data of the
dependent variables were checked for normal distribution.
To determine self-efficacy and behavior, sum scores were
Normality check for response −→ residuals ?
. Analysis section p. 68:
Hypothesis 2. The preferred attitude.
One-way ANOVA variance analysis with a Bonferoni post hoc
test was performed to measure associations between the
descriptions of the attitude towards self-management support
and the sum scores on behavior.
∗ One-way ANOVA to compare 4 groups
∗ Bonferroni correction for pairwise comparisons
. Results section p. 69:
The most preferred attitude towards self-management support
was the coach attitude (38.0%; n = 132). Next came the educator
(32.6%; n = 113), the clinician (15.6%; n = 54), and the gatekeeper
(13.8%; n = 48) attitudes. Analysis of variance showed no significant
difference in the sum scores of behavior between the different
attitudes, implying that the preferred attitude (coach, educator,
clinician, or gatekeeper) was not significantly associated with
nurses’ self-management support behavior (hypothesis 2).
• Huang et al. [15]:
. Statistical analysis section p. 5:
Statistical analysis
Results are presented as median (IQR: interquartile range) or mean ± standard deviation. Dif-
ferences between Group N, Group M, and Group S were tested by one-way ANOVA or the
Kruskal–Wallis test (Table 1). Tukey’s post hoc tests were then performed to find significant
differences between groups (Table 1). Correlations were determined using Pearson’s correla-
∗ One-way ANOVA to compare 3 groups
∗ Tukey’s post-hoc test for pairwise comparisons
∗ Kruskal-Wallis is a non-parametric rank-sum alternative, in case the assumptions are not satisfied
. Table 1, p. 5:
Table 1. Characteristics of the participants by hot flash profiles.
Parameters Hot flash status P Value
None Mild to moderate Severe
n 52 47 52
Age, years 55 (51.5, 58.0) 53 (51, 56) 53 (51.0,55.5) .074†
MP_duration, years 4.0 (2.0, 5.0) 2.0 (1.7, 4.0) 2.5 (2.0, 5.0) .054†
SBP, mmHg 110 (98, 115) 112 (96, 116) 114 (102, 120) .065†
DBP, mmHg 71 (64, 75) 73 (64, 75) 72 (66, 75) .592†
BMI, kg/m2 22.8 ± 2.7 22.6 ± 2.6 23.5 ± 2.4 .175‡
FSH, mIU/mL 69 ± 24 67 ± 25 65 ± 21 .316‡
Estradiol, pg/mL 20 20 20 —
Fasting glucose, mg/dL 92 ± 9 95 ± 7 99 ± 7* .0001‡
Hemoglobin A1c, % 5.5 (5.2, 5.6) 5.4 (5.2, 5.7) 5.6 (5.25, 5.8) .189†
Total cholesterol, mg/dL 209 (191, 228) 202 (173, 223) 201.5 (182, 236) .291†
Triglyceride, mg/dL 110 (84, 144) 97 (72, 144) 126 (81, 192) .090†
HDL cholesterol, mg/dL 56 (46, 68) 52 (46, 58) 52 (44, 62) .345†
LDL cholesterol, mg/dL 126 (104, 145) 123 (101, 139) 125 (103, 142) .662†
Insulin, pg/ml 339 (261, 469) 394 (256, 509) 515 (388, 732)* .0001†
Leptin, ng/mL 9.2 (4.9, 11.5) 10.2 (7.4, 15.6) 16.2 (10.3, 20.6)* .0001†
Adiponectin, ug/mL 14.9 (11.3, 21.9) 14.7 (8.3, 29.0) 8.1 (6.3, 11.9)* .0001†
Leptin to Adiponetin ratio 0.63 (0.25, 1.09) 0.82 (0.36, 1.63) 1.73 (1.08, 2.84)* .0001†
Resistin, ng/mL 16.2 (12.6, 22.8) 15.8 (11.6, 21.1) 13.5 (11.3, 18.4) .095†
HOMA-IR 1.20 (0.90, 1.84) 1.56 (1.00, 2.11) 2.13 (1.64, 2.97)* .0001†
Data are presented as mean ± SD or median (Q1, Q3). Statistical analysis was conducted by ANOVA test (marked with ‡) or Kruskal-Wallis test (marked with †) to compare the mean/median differences between three groups of postmenopausal women with or without hot flashes. Tukey's post hoc tests were then performed to find significant differences between groups.
*, significant difference between Group S and Group M (p < 0.05) and Group S and Group N (p < 0.001).
Abbreviations: Q, quarter; Q1, 25th percentile; Q3, 75th percentile; MP_duration, menopause period since final menstrual period; SBP, systolic blood
pressure; DBP, Diastolic blood pressure; FSH, follicle stimulating hormone; BMI, body mass index; HDL, high density lipoprotein; LDL, low density
lipoprotein; HOMA-IR, homeostatic model assessment of insulin resistance.
. Table 2, p. 6:
Table 2. Association of hot flash status with adipocyte-derived hormones and HOMA-IR.
Variables Leptin Adiponectin Resistin Leptin/Adiponectin Ratio HOMA-IR index
Hot flashes
None
Mild to moderate 25.79 -11.36 -6.67 51.03 12.04
(4.54,51.36)a (-30.13,12.44) (-20.56,9.65) (1.89,123.87)a (-7.15,35.19)
Severe 53.18 -56.46 -16.86 140.29 53.89
(29.78,80.79)c (-71.33,-33.86)c (-29.76,-1.59) (70,239.64)c (28.16,84.79)c
Data are expressed as the percentage difference (95% CI). Adipocyte-derived hormones and leptin/adiponectin ratio were log-transformed.
Regression coefficients were back-transformed using formula (100*(exp(β)-1)) to calculate the percentage difference and the 95% CI in each adipocyte-
derived hormone for hot-flash group relative to non-hot flash group.
∗ One-way ANOVA to compare 3 groups
∗ Logarithmic transformation of response
∗ Group differences versus reference group ‘None,’ but back-transformed to original scale:
β = µ2 − µ1 =⇒ exp(β) − 1 = exp(µ2)/exp(µ1) − 1 = [exp(µ2) − exp(µ1)] / exp(µ1)
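The back-transformation in this last step is plain arithmetic; a minimal sketch (the value β = log 1.5 is an illustrative assumption, not taken from the paper):

```python
from math import exp, log

def percent_difference(beta):
    """Back-transform a log-scale group effect beta to a percentage
    difference relative to the reference group: 100 * (exp(beta) - 1)."""
    return 100.0 * (exp(beta) - 1.0)

# If the mean log response is higher by beta = log(1.5), the group mean on
# the original scale is about 50% above the reference group
effect = percent_difference(log(1.5))
```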
Part V
Multiple linear regression
Chapter 15
Multiple linear regression
. Example
. Regression model
. Application
. Interpretation
. Graphical interpretation
. Model diagnostics
. Influential observations
. Example from the biomedical literature
15.1 Example
• It has been shown that the relationship between ADL score and MMSE score, 1 day post operation, is significant.
• Hence, there is a strongly significant relationship between a patient’s cognitive statusand his/her dependence.
• Likewise, we expect the ADL score to be age-dependent.
• At the same time, there may be a relationship between MMSE score and age.
• Let us study these relationships with 3 simple regressions.
• Regression of ADL on MMSE:
• Regression of ADL on age:
• Regression of MMSE on age:
• Our findings are as follows:
. The dependence is stronger with lower cognitive status.
. The dependence is stronger with increasing age.
. The cognitive status is lower with increasing age.
• It is possible that the relationship found between ADL and MMSE is purely an effect of age, i.e., it would be possible that a better cognitive status corresponds to a lower dependence, because these patients tend to be younger.
• Hence, a simple regression is not sufficient to capture the complex rlationship betweenADL on the one hand and age and MMSE on the other hand.
=⇒ multiple (linear) regression
15.2 The multiple linear regression model
• We want to determine how the ADL score, 1 day post operation, is influenced by the MMSE score and age, simultaneously.
• Graphically, this relationship can be captured in a 3D scatter plot.
• Often, rotation of the plot is needed in order to get a clear view on the relation between the three variables plotted.
• Output (after rotation):
• A possible way to relate ADL simultaneously with MMSE and age is to extend the regression model:
ADLi = β0 + β1 MMSEi + εi
yi = β0 + β1 xi + εi
used for the regression of ADL on MMSE, to:
ADLi = β0 + β1 MMSEi + β2 Agei + εi
yi = β0 + β1 x1i + β2 x2i + εi
by which we explicitly indicate that ADL depends not only on MMSE, but possibly also on age.
• ADL is termed the dependent variable (response), while MMSE and age are the independent variables (covariates).
• The above equation describes a plane in 3D space, the so-called regression plane (two different rotations):
• As with simple linear regression, the parameters β0, β1, and β2 need to be estimated based on a sample.
• This can be done using the least squares method, which searches for the estimators β0, β1, and β2, for which the predicted ADL scores,
ÂDLi = β0 + β1 MMSEi + β2 Agei
are as close as possible to the original measurements.
• This comes down to minimizing
∑i [ADLi − ÂDLi]²
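The least squares step can be replayed numerically. A minimal sketch, not the course software or data: hypothetical (MMSE, age, ADL) triples are simulated from a known plane (the coefficients 20, −0.3, and 0.05 are invented for the illustration) and then recovered.

```python
import numpy as np

# Sketch with made-up data: simulate ADL from a known plane and recover
# the coefficients. The true values (20, -0.3, 0.05) are assumptions.
rng = np.random.default_rng(0)
n = 200
mmse = rng.uniform(0, 30, n)
age = rng.uniform(60, 95, n)
adl = 20 - 0.3 * mmse + 0.05 * age + rng.normal(0, 1, n)

# Design matrix with an intercept column; lstsq minimizes the sum of
# squared differences between observed and predicted ADL scores.
X = np.column_stack([np.ones(n), mmse, age])
beta, *_ = np.linalg.lstsq(X, adl, rcond=None)
print(beta)  # close to [20, -0.3, 0.05]
```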
• As with simple regression, it is assumed that the errors εi are normally distributed with mean 0 and variance σ².
• When the above assumptions are satisfied, significance can be tested for the regression parameters β0, β1, and β2.
• Furthermore, in analogy with simple regression, one can construct an ANOVA table based on the equality:
∑i [yi − ȳ]²  =  ∑i [yi − ŷi]²  +  ∑i [ŷi − ȳ]²
    SSTO             SSE              SSR
• SSTO: Total sum of squares. This term captures the total error made by predicting the yi without taking into account the observed values x1i and x2i for the covariates X1 and X2.
• SSE: Error sum of squares. This term captures the error made upon predicting the yi by making use of the observations x1i and x2i.
• SSR: Regression sum of squares. This term captures the decrease in error by predicting the values yi with rather than without making use of the covariates.
• A measure of the regression’s “quality” is
R² = SSR / SSTO
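The decomposition and the definition of R² can be checked numerically; the sketch below uses simulated data (any least squares fit with an intercept satisfies the identity).

```python
import numpy as np

# Numerical check, on simulated data, that SSTO = SSE + SSR for a least
# squares fit with an intercept, and that R^2 = SSR/SSTO lies in [0, 1].
rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=(2, n))
y = 1 + 2 * x1 - x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta

ssto = np.sum((y - y.mean()) ** 2)    # total sum of squares
sse = np.sum((y - yhat) ** 2)         # error sum of squares
ssr = np.sum((yhat - y.mean()) ** 2)  # regression sum of squares
print(ssr / ssto)  # R^2
```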
• Like with simple regression, R² enjoys the following properties:
. 0 ≤ R² ≤ 1
. R² = 0 implies that SSR = 0 and thus that all ŷi are equal to ȳ, i.e., the regression plane is horizontal. This is equivalent with β1 = β2 = 0.
. R² = 1 implies that SSE = 0. This implies that ŷi = yi for all i, and hence that all observations lie in the regression plane.
• It is said that R² expresses ‘which fraction of variability in the response (ADL) can be explained by the covariates’ (MMSE and age).
• With simple regression, we found that R² equals r², the square of the correlation between xi and yi. Hence R² can be seen as a generalization of the correlation coefficient to a ‘correlation’ between one variable on the one hand (the response) and multiple variables on the other hand (the covariates).
• If R² = 0, then the covariates X1 and X2 do not help us in predicting the response, which is equivalent to β1 = β2 = 0. In practice, it is therefore important to assess whether the covariates help us in predicting the response. This can be done by testing the null hypothesis
H0 : β1 = β2 = 0
versus the alternative
HA : β1 ≠ 0 or β2 ≠ 0
• In most software packages, the above hypothesis is tested automatically with every regression. This is done by way of an F test.
• Everything discussed in the context of the regression with two covariates can be extended to several covariates, where a given response is to be predicted from a set of covariates.
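A minimal sketch of how the F statistic behind this global test is formed, on simulated data (with p the number of covariates, F = (SSR/p) / (SSE/(n − p − 1)); large values lead to rejection of H0):

```python
import numpy as np

# Global F test sketch for H0: beta1 = beta2 = 0 with p = 2 covariates.
# The data are simulated for illustration only.
rng = np.random.default_rng(2)
n, p = 100, 2
x1, x2 = rng.normal(size=(2, n))
y = 0.5 + 1.5 * x1 + 0.8 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
ssr = np.sum((yhat - y.mean()) ** 2)
sse = np.sum((y - yhat) ** 2)
f_stat = (ssr / p) / (sse / (n - p - 1))
print(f_stat)  # far above the 5% critical value of F(2, 97), about 3.1
```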
15.3 Application
• The regression output obtained when regressing ADL on MMSE and age is:
• In the ANOVA table, we find SSTO, SSR, and SSE.
• The global F test in the ANOVA table tests whether the covariates improve the prediction of ADL in a significant way, i.e., it tests the hypothesis
H0 : β1 = β2 = 0
versus the alternative
HA : β1 ≠ 0 or β2 ≠ 0
• Given the strong significance (p < 0.0001), we conclude that at least β1 or β2 significantly differs from zero.
• Further, R² = 0.4946. Note that the regression of ADL on MMSE yielded R² = 0.4940.
• Hence, we see that age explains little extra variability in ADL, over and above what was already explained by MMSE.
• This suggests that, once we know the MMSE score, the patient’s age provides little extra information for the prediction of the ADL score, one day after the operation.
• The least squares estimates are
. β0 = 22.55
. β1 = −0.29
. β2 = 0.01
• Note that these values are different from what would follow from two single regressions:
Covariates
MMSE and Age MMSE Age
β0 22.55 23.65 5.93
β1 -0.29 -0.30 —
β2 0.01 — 0.15
• This suggests that the parameters change meaning, compared to those in single regression.
• Note that age in the above regression model is no longer significant (p = 0.7963), which strongly contrasts with the significant univariate regression of ADL on Age (p = 0.0053). This underscores that the results from a multiple regression are to be interpreted differently from their single-regression counterparts.
15.4 Interpretation
• Our regression of ADL on MMSE and age yielded the following regression equation:
ADL = 22.55 − 0.29 MMSE + 0.01 Age
• The estimator β1 = −0.29 can be interpreted as follows:
. Take two groups of subjects of the same age (e.g., 80 years), of which the first one has MMSE=20 and the second one MMSE=21.
. Their expected ADL scores then are
ÂDL1 = 22.55 − 0.29 × 20 + 0.01 × 80
ÂDL2 = 22.55 − 0.29 × 21 + 0.01 × 80
. The difference then is
ÂDL2 − ÂDL1 = −0.29 × (21 − 20) = −0.29
. Hence, we find that, for patients of a given age, the ADL score decreases on average with 0.29, for a unit increase of MMSE.
. Note that the effect would be the same if patients of a different age (e.g., 70 years) were taken.
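The computation above can be replayed in a few lines, using the estimated equation from the slides; the difference between the two predictions equals β1, whatever common age is chosen.

```python
# Replaying the interpretation of beta1 with the estimated equation
# ADL = 22.55 - 0.29*MMSE + 0.01*Age.
def predicted_adl(mmse, age):
    return 22.55 - 0.29 * mmse + 0.01 * age

diff_80 = predicted_adl(21, 80) - predicted_adl(20, 80)
diff_70 = predicted_adl(21, 70) - predicted_adl(20, 70)
print(round(diff_80, 2), round(diff_70, 2))  # -0.29 -0.29
```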
• Note that we cannot conclude that a unit increase of MMSE for a given patient will lead to a decrease of ADL with 0.29.
• We can only assert that, for patients of a given age, a unit difference in MMSE corresponds to an average difference in ADL of 0.29.
• We cannot draw ‘longitudinal’ conclusions from our ‘cross-sectional’ experiment.
• The estimator β1 indicates how the average ADL score varies with MMSE, for patients with the same age.
• In the regression plane, this corresponds to lines for constant age:
• Similarly, we can interpret the estimator β2 = 0.01 as the average increase of ADL per unit increase of age, for patients with the same MMSE score. In the regression plane, this corresponds to lines for constant MMSE:
• Note that these lines are almost flat, suggesting that, for patients with the same MMSE, the average ADL score is virtually without age influence.
• This explains why Age is no longer significant in the multiple regression model (p = 0.7963): Age contributes only little additional information for the prediction of ADL, whenever the MMSE score is already known.
• We find this also in the fact that adding Age to the regression model of ADL on MMSE leads to only a marginal increase in R², from 0.4940 to 0.4946.
• In practice, non-significant terms in the regression model are usually deleted, because they do not provide additional quality of prediction. In our example, this would correspond to the omission of the variable Age. The final model would then only contain ‘MMSE score on day 1.’
• Note that the p-value gives an indication of the need for one particular covariate, in addition to the ones already in the model.
• It is therefore not a good practice to remove non-significant covariates simultaneously from the model.
• Rather, the deletion of non-significant covariates has to be conducted step by step.
15.5 Graphical interpretation
• A response Y and two covariates X1 and X2 contain some information about the population of interest:

[Venn diagram: Y, X1, and X2 each shown as a circle of information about the population]

• Obviously, Y and X1 have some information about the population in common. Likewise, Y and X2 have some information about the population in common.
• A simple linear regression analysis of Y on X1 (X2) quantifies the information X1 (X2) contains about Y :

[Venn diagrams: overlap of Y with X1, and of Y with X2]

H0 : β1 = 0, p-value 0.0023        H0 : β2 = 0, p-value 0.0087
• In multiple regression, one outcome Y and multiple covariates, e.g., X1 and X2, are studied simultaneously, all containing some information about the population:

[Venn diagram: Y, X1, and X2 as overlapping circles]
• A multiple linear regression analysis of Y on X1 and X2 quantifies the information X1 and X2 jointly contain about Y :

[Venn diagram: joint overlap of X1 and X2 with Y]

H0 : β1 = β2 = 0, p-value 0.0004
• In multiple regression, the effect of an individual covariate quantifies the information it contains about Y , not already incorporated in the other covariates:

[Venn diagrams: the information X1 (X2) adds about Y , given the other covariate]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0023             0.0374
β2 = 0    0.0087             0.0187
• Multiple regression with one significant and one non-significant covariate, while both are highly significant in simple regressions:

[Venn diagram]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0002             0.0138
β2 = 0    0.0087             0.9724
• Multiple regression with equal estimates but different p-values in multiple and simple analyses:

[Venn diagram: X1 and X2 without overlap]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0139             0.0038
β2 = 0    0.0255             0.0049

• This occurs if X1 and X2 do not contain information about each other, hence are independent.
• Multiple regression with two non-significant covariates in multiple regression, while both are highly significant in simple regression:

[Venn diagram: X1 and X2 with strong overlap]

H0        p-value (simple)   p-value (multiple)
β1 = 0    0.0002             0.9625
β2 = 0    0.0004             0.8259

• This occurs if X1 and X2 contain much information about each other, hence are highly dependent.
15.6 Model diagnostics
• The general multiple linear regression model with p covariates takes the form
yi = β0 + β1x1i + . . . + βpxpi + εi
where the errors εi are assumed to be zero-mean normally distributed with variance σ2.
• Our assumptions:
. Linearity: The average Y value is well described by
β0 + β1x1i + . . . + βpxpi
and the errors εi have mean zero.
. The variance of the errors is constant.
. The errors εi are normally distributed.
• All significance tests are based on the above assumptions, i.e., their failure to hold can lead to erroneous results. Hence, it is important to check them.
• In our example, we assumed that the average ADL score could be well described by
β0 + β1 MMSE + β2 Age
and that the errors εi are normally distributed with mean zero and constant variance σ².
• Verifying these assumptions is more complex than with simple regression, because, as already discussed, the relationship between ADL and, for example, Age, is also influenced by the second covariate, MMSE, in the model.
• In simple regression, the assumptions were verified using the residuals, which are estimators for the εi:
ei = ADLi − ÂDLi = ADLi − (β0 + β1 MMSEi + β2 Agei)
• If the model assumptions are correct, then we expect no systematic trends in the residuals, they have to exhibit constant variability, and they have to be normally distributed.
• In practice, it usually suffices to apply the following techniques:
. Scatter plots of the ei versus all covariates in the model.
. Scatter plot of the ei versus the predicted values ŷi.
. Normality checks for the ei.
• In most software packages, these techniques can be applied in full analogy with thesimple linear regression case.
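Two algebraic facts are worth keeping in mind when reading such residual plots, sketched below on simulated data: with an intercept in the model, least squares residuals average exactly to zero and are orthogonal to every covariate, so only patterns (curvature, changing spread), not the overall level, are informative.

```python
import numpy as np

# Residual sanity checks on a simulated fit: mean zero and orthogonality
# to the covariates hold by construction for least squares with intercept.
rng = np.random.default_rng(3)
n = 150
x1, x2 = rng.normal(size=(2, n))
y = 2 + x1 - 0.5 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta  # residuals, the estimates of the errors eps_i

print(e.mean())       # numerically zero
print(np.dot(e, x1))  # numerically zero: no linear trend left in x1
```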
15.6.1 Residuals versus covariates
• For simple regression, the residuals ei were plotted versus the model covariate.
• Now, we construct a scatter plot of the residuals ei versus each of the covariates in the model.
• If the model is correct, then we expect no further systematic trends.
• Such systematic trends can, like with simple regression, point to the need for transforming one or more covariates.
• Results:
• We find no systematic trends in the residuals. This means that we neither over- nor underestimate the average ADL score, in a systematic way, for older or for younger patients.
15.6.2 Residuals versus predicted values
• The scatter plots of residuals versus covariates allow us to verify whether or not the response is systematically over- or underestimated for certain values of the covariates.
• On the other hand, it is also important to verify whether, for example, large or small predictions would come from systematic over- or underestimation of the outcome.
• In our example, we want to check whether the model systematically over- or underestimates certain ADL values.
• This can be verified by plotting the residuals versus the predicted values ŷi.
• Result:
15.6.3 Normality of the residuals
• As with simple regression, we will check the normality assumption for the errors εi, via the residuals ei.
• This can be done graphically (histogram), or via a formal test for normality:
• The normality assumption seems acceptable.
• As with simple regression:
. Deviations from normality lead to correct results as long as the errors are symmetrically distributed.
. In case of asymmetry, the response can sometimes be transformed, so that the residuals in the ensuing model are normally distributed.
. Potential transformations can disturb linearity and constant variance, implying that, after transformation, the residual plots need to be constructed again.
15.7 Influential observations
• In analogy with simple regression, influential subjects can have a strong impact on the regression’s results.
• In principle, one could explicitly remove each observation in turn, and then assess how the results (i.e., the estimators β0, . . . , βp) change.
• This means that, each time, the results of the analysis without a particular observation need to be compared with the one based on all data.
• Like before, this can be effectuated with Cook’s distances, measuring the ‘distance’ between the results with and without a given observation.
• Cook’s distance for the ith observation is again denoted by Di, and influential subjects correspond to large Di.
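In practice the n re-analyses need not be run explicitly: Cook's distance has a closed form based on the leverages h_ii (the diagonal of the so-called hat matrix), namely D_i = e_i² h_ii / (k · MSE · (1 − h_ii)²), with k the number of estimated regression parameters. A sketch on simulated data (formula illustration, not course output):

```python
import numpy as np

# Closed-form Cook's distances on simulated data.
rng = np.random.default_rng(4)
n = 80
x1, x2 = rng.normal(size=(2, n))
y = 1 + x1 + x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1]
H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix: yhat = H y
h = np.diag(H)                        # leverages h_ii
e = y - H @ y                         # residuals
mse = np.sum(e ** 2) / (n - k)
D = e ** 2 * h / (k * mse * (1 - h) ** 2)
print(int(D.argmax()), D.max())  # the observation an index plot would flag
```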
• Calculations proceed similarly as with simple regression.
• Index plot of all Cook’s distances:
• The figure exhibits observation #43 as the only outlier. Therefore, we compare the regression with and without this subject.
• Results for the analysis with and without this subject are, respectively:
• The final conclusions do not change if we remove subject #43 from the analysis.
15.8 Example from the biomedical literature
van Hooft et al. [14]:
. Analysis section 2.4.1, p. 69:
[Figure: model diagram omitted.]
Fig. 1. The Attitude, Subjective norms, and Self-Efficacy (ASE) model (de Vries et al., 1988).
2.4.1. Predictors of self-management support behavior
To determine which factors influence the behavior of self-management support a stepwise regression analysis was executed with the significant variables of the ASE-model.
. Results section 3.8, p. 70–71, and Table 4, p. 70:
3.8. Predictors of self-management support behavior
Stepwise regression analysis showed that three factors were significant predictors for self-management support behavior. We first controlled for setting (inpatient or outpatient ward). This accounted for 3.1% of the variance (adjusted R2 2.7%). In the subsequent steps the importance of self-management support, the presumed absence of a patients’ need for self-management support, the perceived knowledge gap, and self-efficacy respectively, were entered. In the final model, importance of self-management support (attitude) and setting were mediated by self-efficacy. The final model explained 41.1% of the variance of behavior of self-management support (adjusted R2 39.9%) (Table 4).
Table 4
Determinants of self-management support behavior.
Step 1 Step 2 Step 3 Step 4
Behavior b P Value b P Value b P Value b P Value
Background
Working in an inpatient ward or outpatient department 0.18 0.005 0.14 0.020 0.13 0.025 0.06 0.274
Attitude
Importance 0.19 0.002 0.15 0.010 0.06 0.228
Subjective norms & knowledge
Patients do not have a need −0.19 0.001 −0.16 0.002
Own insufficient knowledge −0.26 <0.001 −0.14 0.005
Self-efficacy 0.53 <0.001
Explained variance R2 = 0.03 <0.05 R2 = 0.07 <0.001 R2 = 0.17 <0.001 R2 = 0.41 <0.001
F-value (df) 7.97 (253) 8.96 (252) 12.37 (250) 34.68 (249)
Note: Stepwise regression analysis; b, standardized coefficients; df, degrees of freedom.
Chapter 16
Polynomial regression
. Example
. Application
. Interpretation of the results
. Example from the biomedical literature
16.1 Example
• We revisit the fictitious example, used before, to illustrate the effect of non-linearity in simple linear regression:
• The figure clearly shows non-linearity.
• Before, this was solved by logarithmically transforming the covariate X, x −→ ln(x).
• On the other hand, we note that the relationship between yi and xi could be quadratic.
• A possible statistical model could be:
yi = β0 + β1 xi + β2 xi² + εi
where, conventionally, the error terms εi are assumed normally distributed, with mean zero and variance σ².
• Note that the above model can be considered a multiple regression model with covariates x1i = xi and x2i = xi²:
yi = β0 + β1 x1i + β2 x2i + εi
• Hence, the model can be fitted by first calculating a new variable which contains the squares of xi, whereafter a multiple linear regression is computed.
• Most software packages allow for implicit calculation of the higher order term(s).
• Output for the regression coefficients:
• The fitted regression curve is:
ŷi = 0.72 + 4.50 xi − 2.32 xi²
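The 'new variable with the squares' recipe can be sketched as follows. The data below are synthetic, generated without noise from the fitted curve above purely for illustration, so the three coefficients are recovered essentially exactly.

```python
import numpy as np

# Quadratic regression as multiple regression: design matrix with
# columns 1, x, x^2, then ordinary least squares.
x = np.linspace(0, 2, 50)
y = 0.72 + 4.50 * x - 2.32 * x ** 2  # synthetic, noiseless data

X = np.column_stack([np.ones_like(x), x, x ** 2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))  # approx [0.72, 4.5, -2.32]
```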
• The coefficient of the quadratic term, β2, is strongly significantly different from zero (p < 0.0001), establishing a strong quadratic effect.
• Graphical representation of the regression curve:
• Output for the ANOVA table:
• The R² value is now higher than R² = 0.9247, obtained from the regression model with the logarithmically transformed covariate.
16.2 Interpretation of the results
• In our fictitious example, the fitted regression curve was:
ŷi = 0.72 + 4.50 xi − 2.32 xi²
• Before, we derived that a regression coefficient indicates how the response changes on average as a function of the corresponding covariate, while keeping all other covariates constant.
• In the above example, this means, e.g., that β1 = 4.50 indicates that the response on average increases with 4.50 if X shows a unit increase, while X² remains constant.
• Now, given that X cannot vary without X² changing along, such an interpretation is meaningless.
• In general, we have to conclude that the individual regression coefficients in polynomial regression cannot be interpreted.
• The regression coefficients merely describe a polynomial, describing the average evolution of Y as a function of X.
• On the other hand, the high significance of β2 (p < 0.0001) indicates that the addition of the quadratic term has importantly improved the regression model. In other words, there is a strong quadratic effect, superimposed on the linear effect.
• The significance of the individual parameters in the polynomial regression can be interpreted, the individual regression parameters cannot.
• Note that the result of the polynomial regression is a curve rather than a plane:
16.3 Remarks
• The foregoing discussion is directly generalizable to polynomials of degree higher than two:
yi = β0 + β1 xi + . . . + βp xiᵖ + εi
• We refer to third-degree polynomials as cubic regression:
yi = β0 + β1 xi + β2 xi² + β3 xi³ + εi
• One can combine ordinary multiple regression with polynomial regression:
yi = β0 + β1 x1i + β2 x1i² + β3 x2i + εi
• Given that polynomial regression is a special case of multiple regression, all techniques for model diagnostics and influential observations apply.
16.4 Example from the biomedical literature
Bjork et al. [16]:
. Definition of outcomes:
Neuropsychiatric symptoms were assessed using the Neuropsychiatric Inventory Nursing Home Version (NPI-NH; Wood et al., 2000), which assesses the frequency and severity of 12 psychiatric and behavioral symptoms in nursing home residents. Symptom frequency is rated from 0 to 4 and symptom severity from 1 to 3. An item score is generated by multiplying frequency by severity (0–12); thus, greater NPS scores indicate greater frequency and severity.
Cognitive functioning was assessed with Gottfries’
. Relation between 12 outcomes and cognitive functioning (Fig.1):
Figure 1. Neuropsychiatric symptoms in relation to level of cognitive function with polynomial regression curves fitted to the data. Regression coefficients are presented in Table 2. The x-axis presents the cognitive score (ranging from 27 to 0), and the y-axis presents the mean item score of each particular symptom. A, delusions; B, hallucinations; C, aggression/agitation; D, depression/dysphoria; E, anxiety; F, elation/euphoria; G, apathy; H, disinhibition; I, irritability; J, aberrant motor behavior; K, night-time behaviors; L, eating changes.
. Polynomial regression results (Table 2):
Table 2 Characteristics of regression curves for NPS in relation to level of cognitive functioning
NPI-NH item / Prevalence, % (n) / Polynomial regression curve / R / R2 / p-value
Delusions 32.7 (1442) 3rd degree 0.226 0.051 0.003
Hallucinations 26.6 (1177) 3rd degree 0.219 0.048 <0.001
Aggression/agitation 39.6 (1720) 3rd degree 0.339 0.115 <0.001
Depression/dysphoria 51.8 (2157) 2nd degree 0.105 0.011 <0.001
Anxiety 40.7 (1732) 3rd degree 0.189 0.036 0.009
Elation/euphoria 17.2 (754) 1st degree 0.143 0.020 <0.001
Apathy 42.3 (1762) 2nd degree 0.297 0.088 <0.001
Disinhibition 25.7 (1133) 3rd degree 0.171 0.029 0.008
Irritability 44.4 (1855) 3rd degree 0.232 0.054 0.001
Aberrant motor behavior 29.7 (1298) 3rd degree 0.292 0.085 <0.001
Night-time behaviors 35.0 (1487) 3rd degree 0.183 0.033 0.001
Eating changes 35.2 (1406) 3rd degree 0.108 0.012 <0.001
NPS, neuropsychiatric symptoms; NPI-NH, Neuropsychiatric Inventory Nursing Home. Data correspond to diagrams in Figure 1.
∗ Prevalence is the percentage of cases with neuropsychiatric symptoms (NPS > 0)
∗ Both R and R² measures are reported
∗ R is not equal to the Pearson correlation, due to non-linearity
Chapter 17
Interaction
. Example
. Application
. Interpretation of results
. What about non-significant main effects?
. Remarks
. Example from the biomedical literature
17.1 Example
• Let us reconsider the example where we aim to predict the ADL score as a function of age and the patient’s MMSE score, 1 day post operation, with the associated multiple regression model:
ADL = 22.55 − 0.29 ×MMSE + 0.01 × Age
• This regression assumed that the effect of MMSE on ADL is independent of the effect of the patient’s age: For each age class, we have that the ADL score diminishes on average with 0.29 per unit increase of MMSE.
• Conversely, it is also assumed that the effect of Age on ADL is independent of the MMSE score of the patient: For each MMSE class, we have that ADL increases on average with 0.01 per unit increase of age.
• A regression model not making this assumption can be obtained through a so-called interaction term of Age and MMSE:
ADLi = β0 + β1 MMSEi + β2 Agei + β3 MMSEi × Agei + εi
• This means that we merely add another covariate to the model, the product of the previous two covariates.
• To demonstrate that we now no longer assume that the effect of Age is independent of MMSE, and vice versa, we rewrite the above model in two ways:
ADLi = β0 + β2 Agei + (β1 + β3 Agei) × MMSEi + εi
ADLi = β0 + β1 MMSEi + (β2 + β3 MMSEi) × Agei + εi
• From the first equation it follows that we assume a linear relationship between ADL and MMSE, but that the intercept and the slope depend on Age:
. Intercept: β0 + β2 Agei
. Slope: β1 + β3 Agei
• From the second equation, it follows that we assume a linear relationship between ADL and Age, but with intercept and slope dependent on MMSE:
. Intercept: β0 + β1 MMSEi
. Slope: β2 + β3 MMSEi
• Note also that the interaction effect implies that the effect of MMSE on ADL depends on Age, but simultaneously also that the effect of Age on ADL depends on MMSE.
• Furthermore, the assumption made before, i.e., that the effect of MMSE (Age) on ADL does not depend on Age (MMSE), can easily be checked by testing H0 : β3 = 0.
• The computation of the product term for the interaction is done implicitly in most software packages.
17.2 Application
• Resulting regression coefficients and ANOVA table:
• Like always, the global F test aims at testing the null hypothesis
H0 : β1 = β2 = β3 = 0
versus the alternative hypothesis that at least one of the above regression coefficients is different from zero.
• Adding the interaction term has increased R² from 0.4946 to 0.5235.
• Strictly speaking, the interaction term is not significant (α = 0.05), but there is evidence that the effects of MMSE and Age on ADL are not entirely independent of one another.
• The estimated regression equation is
ADL = 40.87 − 1.19 × MMSE − 0.21 × Age + 0.01 × MMSE × Age
• Graphical representation of regression surface (upon rotation):
17.3 Interpretation of results
• The estimated regression equation is
ADL = 40.87 − 1.19 × MMSE − 0.21 × Age + 0.01 × MMSE × Age
• Like with polynomial regression, we cannot interpret the individual regression coefficients.
• For example, we cannot conclude that −1.19 captures how strongly ADL changes with MMSE, while the other covariates are kept fixed. Indeed, MMSE cannot vary without also changing the product MMSE×Age, if Age is kept constant.
• To enhance insight in the effect of adding the interaction to the model, we consider the predicted evolution of ADL as a function of MMSE and as a function of Age, separately.
17.3.1 ADL as a function of MMSE
• To see how ADL evolves as a function of MMSE, we rewrite the estimated regression equation as:
ADL = 40.87 − 0.21 × Age + (−1.19 + 0.01 × Age) × MMSE
• We can now compute this for various age groups:
. 65 years: ADL = 27.22 − 0.54 ×MMSE
. 75 years: ADL = 25.12 − 0.44 ×MMSE
. 85 years: ADL = 23.02 − 0.34 ×MMSE
. 95 years: ADL = 20.92 − 0.24 ×MMSE
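These four lines can be generated directly from the estimated coefficients, a quick arithmetic check of the slide values: for fixed Age, the intercept is 40.87 − 0.21 × Age and the MMSE slope is −1.19 + 0.01 × Age.

```python
# Per-age intercept and MMSE slope from the interaction model
# ADL = 40.87 - 1.19*MMSE - 0.21*Age + 0.01*MMSE*Age.
for age in (65, 75, 85, 95):
    intercept = round(40.87 - 0.21 * age, 2)
    slope = round(-1.19 + 0.01 * age, 2)
    print(age, intercept, slope)
# 65 27.22 -0.54
# 75 25.12 -0.44
# 85 23.02 -0.34
# 95 20.92 -0.24
```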
• Each of these equations corresponds to a straight line in the regression surface, for a given level of age:
• We see that ADL decreases progressively less as a function of MMSE, with increasing age.
• For high ages, the dependence decreases less markedly as a function of the patient’s cognitive status.
17.3.2 ADL as a function of Age
• To see how ADL evolves as a function of age, we rewrite the estimated regression equation as:
ADL = 40.87 − 1.19 × MMSE + (−0.21 + 0.01 × MMSE) × Age
• We can now compute this equation for various MMSE groups:
. MMSE = 0: ADL = 40.87 − 0.21 × Age
. MMSE = 10: ADL = 28.97 − 0.11 × Age
. MMSE = 20: ADL = 17.07 − 0.01 × Age
. MMSE = 30: ADL = 5.17 + 0.09 × Age
• Each of these equations corresponds to a straight line in the regression plane, for a constant MMSE value:
• Hence, we see that, for patients with a very good cognitive status, there is a tendency for ADL to increase with age.
• For patients with a worse cognitive status, there is a tendency for ADL to decrease with age.
• The latter observation is counter-intuitive. For that reason, we wish to test whether, e.g., for patients with MMSE score equal to 10, the slope value of −0.11 is significant.
• In fact, this slope is
−0.21 + 0.01 × 10
where −0.21 is an estimate for β2 (coefficient of Age), and where 0.01 is an estimate for β3 (interaction coefficient).
• We are interested in testing the hypothesis:
H0 : β2 + 10 β3 = 0, versus HA : β2 + 10 β3 ≠ 0
• Most software packages allow specification of null hypotheses that are linear combinations of the parameters in the model.
• Result:
• We obtain an F test for the specified null hypothesis, from which it follows that there is no significant relationship between ADL and Age, for patients with an MMSE score equal to 10 (p = 0.2034).
• Note that, strictly speaking, the interaction term is non-significant (p = 0.0731), suggesting that the lines on the regression surface are parallel:
• On the other hand, the relatively small p-value hints at the presence of (a weak form of) interaction, which now, due to lack of power, is not found to be significant. It is important to verify this in a new, perhaps larger, experiment.
17.4 What about non-significant main effects?
• In our example, we obtained the following estimators for the regression parameters:
• The effects of Age and MMSE are termed ‘main effects,’ to mark the difference with the interaction MMSE×Age.
• Can we delete, in this case, the least significant term, i.e., the main effect of Age?
• The rationale would be that this is a term that does not provide additional information about the ADL response variable.
• As long as there is an interaction term, it is possible that the effect of Age depends on MMSE, and vice versa.
• This implies that no assertions can be made about the global effect of Age.
• For this reason, we will not delete non-significant main effects, as long as interaction effects are included in the model.
17.5 Remarks
• Interactions can be added as well to polynomial regression models:
yi = β0 + β1x1i + β2x21i + β3x2i + β4x1ix2i + εi
• Fitting models with several covariates, whether or not polynomial, and with or without interaction, can easily be done within the context of the so-called ‘General Linear Model’, to be discussed later.
• Given that regression models with interaction terms are, again, a special case of multiple regression, the techniques for model diagnostics and influential subjects remain valid.
• Also, implementation in software remains similar.
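• Such a model can be fitted by ordinary least squares on a design matrix whose columns are the intercept, the two covariates, the square, and the product. A self-contained sketch in pure Python (normal equations; a real analysis would of course use dedicated statistical software):

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y,
    solved with Gaussian elimination and partial pivoting."""
    p = len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(p)]
         for r in range(p)]
    v = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            v[r] -= f * v[col]
    b = [0.0] * p
    for r in range(p - 1, -1, -1):
        b[r] = (v[r] - sum(A[r][c] * b[c] for c in range(r + 1, p))) / A[r][r]
    return b

def design_row(x1, x2):
    # y = b0 + b1*x1 + b2*x1^2 + b3*x2 + b4*x1*x2
    return [1.0, x1, x1 * x1, x2, x1 * x2]

# Noise-free sanity check: data generated from known coefficients are recovered.
true_b = [1.0, 2.0, 0.5, -1.0, 0.3]
pts = [(x1, x2) for x1 in range(5) for x2 in range(3)]
X = [design_row(x1, x2) for x1, x2 in pts]
y = [sum(b * v for b, v in zip(true_b, row)) for row in X]
est = fit_ols(X, y)
```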
Advanced statistical methods 396
17.6 Example from the biomedical literature
Collard et al. [17]
. Statistical analysis section, p. 191:
Multiple linear regression analyses were conducted to
examine associations of the number of somatic diseases
(dependent variable) with depression (independent vari-
able) adjusted for socio-demographic variables (age,
gender, educational level, partner status, income) and
lifestyle factors (smoking status, alcohol use, BMI, and
physical exercise). First, we checked whether the associa-
tions between depression and somatic comorbidity were
dependent on frailty by including interaction terms
between frailty and depression in the fully adjusted
models. We tested both, frailty as a dichotomous
characteristic (present yes/no) and as a dimensional
variable based on the number of criteria present. A
significant interaction term between depression and frailty
(yes/no) implies that the association between depression
and somatic diseases is different in patients with and
without frailty. In case of a significant interaction term
with the number of frailty criteria present, the association
between depression and frailty differs among the different
levels of frailty. Subsequently, it was tested whether frailty
Advanced statistical methods 397
. Results section, p. 192:
5.2. Frailty as a moderating factor
Whether the association between depression and
number of somatic diseases was dependent on frailty
status, was examined by adding the interaction term of
depression by frailty to the fully adjusted linear regression
models. Depression neither interacted with the presence of
frailty (yes/no) (p = .57), nor with the number of frailty
components present (p = .25).
∗ Outcome: Number of somatic diseases
∗ Covariates: Severity of depression, Number of frailty components present
∗ Interaction: Severity of depression × Number of frailty components present
∗ Adjusted for: Socio-demographic variables and Lifestyle factors
Advanced statistical methods 398
Part VI
Analysis of variance with multiple factors
Advanced statistical methods 399
Chapter 18
Multiple analysis of variance
. Example
. Application
. Interpretation of results
. Model diagnostics
. Influential observations
. Examples from the biomedical literature
Advanced statistical methods 400
18.1 Example
• Let us reconsider the examples from single ANOVA:
. We found a significant difference in mean ADL score, 1 day post operation, between neuro-psychiatric patients and other patients (p = 0.0025):
Advanced statistical methods 401
. We also found that the average ADL score, 1 day post operation, is significantly different for different patient pre-operative housing situations (p = 0.0006):
• Hence, we have two factors that are related to the ADL score.
Advanced statistical methods 402
• The average ADL for each combination of housing situation and neuro-status:
• Like before, we will remove the fourth housing situation, because it contains a single observation only.
Advanced statistical methods 403
• Graphical representation:
• Note that the averages have been connected to emphasize the difference in evolution between both neuro groups.
Advanced statistical methods 404
• This difference is more easily observed in a so-called interaction plot:
Advanced statistical methods 405
• Multiple ANOVA will allow us to assess the joint effect of housing situation and neuro-status on the ADL score, 1 day post operation.
• Note that the graph suggests that the effect of neuro-status on mean ADL depends on the patient’s housing situation.
• In analogy with multiple linear regression, we need to account for possible interaction between both factors.
• The prediction of ADL, using neuro-status and housing situation, is an example of a so-called 2-way ANOVA, because we have two factors to predict the response. Like in the regression case, the entire 2-way ANOVA discussion can be generalized to more than two factors.
Advanced statistical methods 406
18.2 Application
• An ANOVA analysis for the ADL score, with neuro-status and housing situation as factors, and with potential interaction between both:
Advanced statistical methods 407
• Like before, we obtain a decomposition of the total variability in the response variable (SSTO) into a component capturing the variability between the various groups (SSbetween) and a component capturing the variability within the groups (SSwithin).
• Like before, we obtain a global F-test comparing the variability between the groups with the variability within the groups:
F = (197.89/5)/(470.94/48) = 4.03
which expresses to what extent the factors in the model assist us in predicting ADL (here significant, p = 0.0039).
• The degrees of freedom, needed to standardize SSwithin and SSbetween, are more difficult to derive with multiple ANOVA (see later).
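• The arithmetic of this global F-test can be sketched in a few lines, using the sums of squares and degrees of freedom reported above (a minimal illustration, not the software’s own computation):

```python
# Sums of squares and degrees of freedom from the 2-way ANOVA output above
ss_between, df_between = 197.89, 5
ss_within, df_within = 470.94, 48

# F compares the mean square between groups with the mean square within groups
F = (ss_between / df_between) / (ss_within / df_within)
```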
Advanced statistical methods 408
• Like before, we obtain an R², indicating the percentage of total variability in the ADL score that is explained by the ANOVA model.
• Furthermore, an F-test for each of the effects in the model can be obtained:
Advanced statistical methods 409
18.3 Interpretation of results
• The relevant ANOVA table for testing the various effects in the model is:
• We obtain an F-test for each effect specified in the model.
Advanced statistical methods 410
• Regarding interpretation, these tests are completely analogous to those in multiple linear regression, i.e., one tests for the significance of a given effect while keeping the other effects constant.
• Before making a statement about ‘the’ neuro-effect or ‘the’ effect of housing situation, we need to assess whether the effect of one factor does or does not depend on the other factor.
• In other words, we need to check whether there is an interaction between these factors.
• From the above table, it follows that there is no significant interaction (p = 0.4515).
• We can conclude that the effect of housing situation does not depend on neuro-status and, vice versa, that the neuro-effect does not depend on housing situation.
Advanced statistical methods 411
• Graphically, this means that the non-parallel structure of the means can be ascribed to randomness:
• We can assume that the mean profiles in fact are parallel.
Advanced statistical methods 412
• This assumption can be built into the analysis by removing the interaction term from the model.
• Results for model without interaction between housing situation and neuro-status:
Advanced statistical methods 413
• We can also calculate predicted averages, based on the model without interaction between both model factors.
Advanced statistical methods 414
• With the above output, we can now test for the effect of each factor, after correction for the other factor. In other words, we can test the effect of each of the factors, while keeping the other factor constant:
. The model assumed that the effect of housing situation is the same for both neuro-statuses.
. This effect is found to be highly significant (p = 0.0076), implying that the lines in the foregoing graph are not horizontal.
. The model assumed that the neuro-effect is the same for the three housing situations.
. This effect is not significant (p = 0.2459), implying that the vertical distances between the lines in the foregoing graph are due to random variability.
. This actually means that we are allowed to further simplify the graph by assuming the same evolution for both neuro groups (coinciding lines in the above graph).
• In other words, we can simplify the model by removing the factor Neuro.
Advanced statistical methods 415
• We obtain a one-way ANOVA model with Housing Situation as the only factor, as discussed before.
• The difference in neuro-effect between the housing situations, suggested by the original graph, is not significant (no interaction).
• Moreover, the neuro-effect is not only equal for all housing situations, it is not even significantly different from zero.
• How can we intuitively see why neuro-status was originally significant in the t-test analysis, but no longer after correction for housing situation?
• Apparently, differences in ADL between neuro-psychiatric patients and non-neuro-psychiatric patients can be explained by differences in housing situation.
Advanced statistical methods 416
• Graphically:
. Outcome Y : ADL (day 1 post-operatively)
. Factor X1: Housing Situation
. Factor X2: Neuro-psychiatric status
• This suggests that there is a strong relationship between neuro-status and housing situation.
Advanced statistical methods 417
• We can check this with a table-analysis (chi-squared test):
Advanced statistical methods 418
• There is a strongly significant (p = 0.008) relationship between housing situation and neuro-status of the patient: 53% of the neuro-psychiatric patients stay in a RH/RVT.
• This explains why, after correction for housing situation, the neuro-status is no longer significant.
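• What the chi-squared statistic of such a table analysis computes can be sketched as follows; the tables below are hypothetical illustrations, not the patient data, and the p-value would additionally require the chi-squared distribution with (r−1)(c−1) degrees of freedom:

```python
def chi2_stat(table):
    """Pearson chi-squared statistic for a two-way frequency table:
    sum of (observed - expected)^2 / expected over all cells."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # count under independence
            stat += (obs - expected) ** 2 / expected
    return stat

# Perfectly proportional rows: no association, statistic equals 0
independent = [[10, 20, 30], [20, 40, 60]]
# Hypothetical counts with a clear association between the two factors
associated = [[5, 10, 30], [25, 30, 10]]
```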
Advanced statistical methods 419
18.4 Model diagnostics
• With 1-way ANOVA, it is assumed that the data within each group are normally distributed, with constant variance.
• With multiple ANOVA, it is assumed that, for each combination of the factors in the model, the data are normally distributed with constant variance.
• For our example, the factor Housing Situation has three levels, and the factor Neuro has 2 possible values. We have 6 possible combinations.
• Our 2-way ANOVA model assumed that for each of these 6 combinations, the data are normally distributed, with equal variance throughout.
Advanced statistical methods 420
• The violation of these assumptions can lead to erroneous results of the statistical tests.
• Hence, it is key to check the assumptions as carefully as possible.
• Let us illustrate this for the original model (model with interaction), so as to ensure that the model simplification (based on p-values) is justified.
Advanced statistical methods 421
18.4.1 Assumption of constant variance
• The estimated standard deviation for each combination of model factors:
Advanced statistical methods 422
• Like with 1-way ANOVA, Levene’s test can be used to assess whether the 6 variances are equal:
• The variances in the six groups are not significantly different (p = 0.3059).
• Like with 1-way ANOVA, when there are many groups or when some groups contain very many observations, small differences in variances can turn out to be significant, even though small differences in variance do not drastically alter the conclusions.
• For this reason, next to a formal test, a rule of thumb for equal variance is applied: we check whether or not the variances differ by a factor larger than 5.
Advanced statistical methods 423
• In our example, this becomes:
4.69²/1.54² = 9.27
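• The rule of thumb can be checked mechanically; the two standard deviations below are the largest and smallest ones from the output above:

```python
# Largest and smallest group standard deviations from the output above
sd_max, sd_min = 4.69, 1.54

# Rule of thumb: question the constant-variance assumption if the
# variances differ by more than a factor of 5
variance_ratio = sd_max**2 / sd_min**2
exceeds_rule_of_thumb = variance_ratio > 5
```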
• Note that the variability in the group of neuro-psychiatric patients that live alone is much larger than in the other groups.
• However, it so happens that this group contains only 4 patients, with the following ADL scores:
Patient ADL (day 1)
#14 18
#25 23
#49 12
#56 15
Advanced statistical methods 424
• The large variability, therefore, is essentially due to patient #25, who exhibits a much larger ADL value than the other patients in this group.
• Hence, the large standard deviation in this sub-population does not necessarily suggest that the variability in this group would be larger than in the other sub-populations.
• On the other hand, with regression analysis, we saw that such ‘outliers’ are potentially influential. We will have to pay careful attention in our influence analysis to subject #25.
Advanced statistical methods 425
18.4.2 Normality assumption
• Multiple ANOVA assumes that the data are normally distributed for each combination of factors in the model, with constant variance. Above, we already discussed the assessment of equality of the variances. We now assume that the assumption of equal variance is satisfied. How can we test the assumption of normality?
• As discussed already, to each ANOVA corresponds a statistical model that makes assumptions about the relationship between the average response values across classes of model factors.
• For example, the 2-way ANOVA for ADL, with factors Housing Situation and Neuro-status, without interaction, corresponds to the assumption of parallel lines in the graphical representation of group-specific averages.
Advanced statistical methods 426
• The lines in the graph are mean ADL values, predicted by the ANOVA model:
• For a particular model, one can calculate a residual for each individual, which compares the predicted response with the observed response.
• These residuals can be used, like before, to assess the normality assumption.
Advanced statistical methods 427
• Once the residuals are calculated, we can assess normality by means of a histogram on the one hand, or by means of formal normality tests, similar to previous normality checks:
• Residual analysis for original 2-way ANOVA model (model with interaction):
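• For the model with interaction, the fitted value for each patient is simply the mean of their factor combination, so the residuals can be sketched as follows (the mini data set below is hypothetical, for illustration only):

```python
from collections import defaultdict

def cell_mean_residuals(rows):
    """Residuals y - fitted for a 2-way ANOVA with interaction, whose
    fitted value for each observation is the mean of its cell
    (combination of housing situation and neuro-status)."""
    cells = defaultdict(list)
    for housing, neuro, y in rows:
        cells[(housing, neuro)].append(y)
    means = {cell: sum(v) / len(v) for cell, v in cells.items()}
    return [y - means[(housing, neuro)] for housing, neuro, y in rows]

# Hypothetical mini data set: (housing situation, neuro-status, ADL score)
data = [(1, 0, 10), (1, 0, 14), (1, 1, 20), (1, 1, 22), (2, 0, 8), (2, 0, 12)]
residuals = cell_mean_residuals(data)
```

These residuals are then fed into the histogram and the formal normality tests.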
Advanced statistical methods 428
• The assumption of normality is acceptable.
• Like before, we have that:
. Departures from normality nevertheless lead to correct results, as long as the error distribution is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so as to ensure normality in the new model.
. Possibly, such transformations can disturb the constant-variance properties, so that it is imperative to re-assess this assumption after transformation.
Advanced statistical methods 429
18.5 Influential observations
• Each ANOVA model results in a prediction of the mean response for each combination of the model factors, and these predictions satisfy the assumptions implicitly made by the model.
• For example, removal of the interaction between housing situation and neuro-status led to parallel predicted means.
• Like with 1-way ANOVA and regression, it is important to assess whether there are subjects in the data set with unduly large influence on these predicted values.
• We can again calculate Cook’s distances, measuring the strength with which the predicted means change if an observation is removed from the analysis.
• This is done in full analogy with 1-way ANOVA.
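• A sketch of this leave-one-out computation for a cell-means ANOVA; it is illustrated here for a one-way layout (the two-way case works the same way on the factor combinations), using the four ADL scores listed above as one group together with a hypothetical second group:

```python
from collections import defaultdict

def cooks_distances(group_of, ys):
    """Cook's distance per observation for a cell-means ANOVA model,
    where the fitted value of each observation is its group mean."""
    n = len(ys)
    idx = defaultdict(list)
    for i, g in enumerate(group_of):
        idx[g].append(i)
    means = {g: sum(ys[i] for i in ix) / len(ix) for g, ix in idx.items()}
    p = len(means)  # number of estimated means
    sse = sum((ys[i] - means[g]) ** 2 for g, ix in idx.items() for i in ix)
    mse = sse / (n - p)
    dist = []
    for i, g in enumerate(group_of):
        n_g = len(idx[g])
        if n_g == 1:
            dist.append(float("inf"))  # only observation in its cell
            continue
        mean_wo = (means[g] * n_g - ys[i]) / (n_g - 1)  # leave-one-out mean
        # Only fitted values in the same cell change, all by the same amount
        dist.append(n_g * (means[g] - mean_wo) ** 2 / (p * mse))
    return dist

# Group 'alone' uses the four ADL scores listed above; group 'other' is hypothetical.
groups = ["alone"] * 4 + ["other"] * 2
adl = [18, 23, 12, 15, 10, 11]
distances = cooks_distances(groups, adl)
```

In this sketch the outlying score 23 (patient #25) receives the largest distance, mirroring the index plot.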
Advanced statistical methods 430
• Index plot of Cook’s distances, for the model without interaction:
Advanced statistical methods 431
• As expected, the outlier #25 exhibits a relatively large Cook’s distance, pointing towards a relatively large influence of this individual on the estimation of the mean response.
• There are two subjects that are even more influential, i.e., subjects #49 and #53 (= most influential).
• To study the influence of, e.g., subject #53 in more detail, we can compare the analysis based on all data with one where this subject has been removed.
Advanced statistical methods 432
• The corresponding outputs are:
Advanced statistical methods 433
• Estimated means, based on the analysis with #53:
Advanced statistical methods 434
• Estimated means, based on the analysis without #53:
Advanced statistical methods 435
• Given that our ANOVA model does not contain interactions, the estimated effect of neuro-status is constant across housing situations, for the analysis with subject #53 as well as for the one without this subject.
• We see that subject #53 leads to an increased effect of neuro-status. Removal of patient #53 led to a larger p-value for the neuro-effect.
• Subject #53 is a neuro-psychiatric patient in the second housing situation. There are only 4 patients in this situation, with the following ADL scores:
Patient ADL (day 1)
#11 19
#24 18
#26 16
#53 24
Advanced statistical methods 436
• This patient is an outlier in his group. The extra large ADL score leads to an increase in the estimated neuro-effect.
• Observed means (no ANOVA), obtained with #53:
Advanced statistical methods 437
• Observed means (no ANOVA), obtained without subject #53:
Advanced statistical methods 438
• Hence, our original impression of the presence of an interaction is primarily due to subject #53.
• This explains why the interaction term is far from significant (p = 0.4515).
• This p-value becomes p = 0.8717 when subject #53 is removed from the analysis.
Advanced statistical methods 439
18.6 Examples from the biomedical literature
• Blomquist et al. [18]
. Hypotheses to test (p. 380):
Advanced statistical methods 440
. Statistical analysis & results (p. 381):
∗ Tests for interactions
∗ Not clear whether main effects were tested based on the 2-way model or on separate 1-way models
Advanced statistical methods 441
• Richardson et al. [19]:
. Aim (p. 1198) & analysis (p. 1199):
Aim
The aim of this study was to identify if student nurses
studying in the child field of nursing feel a lack of comfort
in caring by providing support for adolescents who are
LGBQ and what factors influence their comfort level.
alpha. Between-group comparisons were carried out with a
two-way analysis of variance (ANOVA) (factor 1: ethnicity,
White British or Ethnic Other; factor 2: religion, Religious
or Non-Religious). A test for normality and checks for
multicollinearity were carried out. To reduce the risk of Type I
errors (false positives) due to multiple testing, each signifi-
cant value was adjusted using Bonferroni’s method. The
significance level was set at P ≤ 0.05.
Advanced statistical methods 442
. Table 3 (p. 1201):
Table 3 Effects of ethnicity and religion on comfort (two-way ANOVA).
Variable
White British Ethnic Other F (P-value)
R* NR** R* NR** Ethnicity Religion E × R†
[A1] 4.13 (0.61) 4.26 (0.51) 4.10 (0.73) 4.40 (0.52) 0.183 (0.669) 2.400 (0.124) 0.359 (0.550)
[A2] 3.83 (0.83) 3.77 (0.88) 3.45 (0.99) 4.00 (0.94) 0.125 (0.724) 1.474 (0.227) 2.203 (0.140)
[A3] 3.50 (0.89) 3.51 (0.82) 3.58 (0.89) 3.90 (0.74) 1.614 (0.206) 0.812 (0.369) 0.678 (0.412)
[A4] 3.71 (0.96) 4.17 (0.57) 3.73 (0.96) 4.00 (0.67) 0.155 (0.694) 3.897 (0.050) 0.285 (0.594)
[A5] 2.42 (0.88) 2.11 (0.90) 2.45 (1.03) 2.20 (1.14) 0.086 (0.770) 1.741 (0.189) 0.013 (0.910)
[A6] 3.13 (1.12) 3.06 (1.21) 3.29 (1.20) 3.60 (0.84) 1.995 (0.160) 0.235 (0.629) 0.561 (0.455)
[A7] 3.83 (0.64) 4.00 (0.59) 3.70 (0.87) 4.00 (0.67) 0.176 (0.675) 2.074 (0.152) 0.176 (0.675)
[A8] 2.61 (0.84) 2.29 (0.71) 2.77 (1.12) 2.30 (1.06) 0.169 (0.681) 3.479 (0.064) 0.119 (0.731)
[A9] 3.67 (0.87) 3.69 (0.76) 3.65 (0.89) 3.70 (0.82) 0.001 (0.985) 0.040 (0.841) 0.009 (0.923)
*Religious.
**Non-Religious. †E × R = Ethnicity × Religion interaction effect.
Note: Values are expressed as mean (SD).
∗ Main effects not interpretable when interactions present in the model
∗ No indication that assumptions of constant variance would not be satisfied
Advanced statistical methods 443
Part VII
Analysis of covariance and the general linear model
Advanced statistical methods 444
Chapter 19
Analysis of covariance
. Example
. Application
. Interpretation of results
. Model diagnostics
. Influential subjects
. Examples from the biomedical literature
Advanced statistical methods 445
19.1 Example
• In the context of simple regression, we have demonstrated that there is a relationship between ADL and MMSE. At the same time, we expect a relationship between the occurrence (yes/no) of complications and ADL.
• We wish to explore the relationship between dependence (ADL) on the one hand, and cognitive status (MMSE) and the occurrence of complications on the other hand.
• Overview of the number of patients with and without general post-operative complications:
Advanced statistical methods 446
• If we want to relate a patient’s dependence to the occurrence of post-operative complications, then it is not meaningful to use the ADL score on day 1 post operation as the response variable.
• Instead, we will use the highest ADL value recorded over the three recordings (days 1, 5, and 12 post operation).
• This maximal ADL score expresses the highest level of dependence, recorded for each subject in the study.
• We will relate this new variable to the cognitive status of the patient, 1 day post operation, as well as to the occurrence of general post-operative complications.
• Regression of the maximal ADL score on the MMSE score, 1 day post operation, produces a significant (p < 0.0001) result, where dependence increases as MMSE decreases.
Advanced statistical methods 447
• An unpaired t-test (unequal variances) shows that the mean ADL is significantly higher (p < 0.0001) for patients with complications than for those without complications.
• Graphically:
Advanced statistical methods 448
• These results are also visible in a graph with different symbols for the two groups:
• Note that all patients with complications feature an ADL score above the regression line that had been obtained without distinguishing between patients with and without complications.
Advanced statistical methods 449
• This suggests that a separate regression is necessary for both groups:
Advanced statistical methods 450
• The graph suggests that the relationship between the ADL score and the MMSE score is less pronounced for patients with complications than for patients without complications.
• In other words, we expect an interaction of MMSE with the occurrence of complications.
• In order to statistically test for this, we have to make use of analysis of covariance (ANOCOVA), which allows us to study the relationship between a continuous response on the one hand, and one or more covariates (regression) and one or more factors (ANOVA) on the other hand.
• ANOCOVA can be seen as a combination of regression and ANOVA.
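• To make this combination concrete: with a dummy-coded factor, the interaction model implies a separate intercept and slope per group. A minimal sketch with hypothetical coefficients (for illustration only, not the fitted values from the output):

```python
def group_line(b0, b_mmse, b_comp, b_inter, complication):
    """Intercept and slope of the ADL-versus-MMSE line implied by the model
    y = b0 + b_mmse*MMSE + b_comp*COMP + b_inter*MMSE*COMP,
    where COMP = 1 for patients with complications and 0 otherwise."""
    comp = 1 if complication else 0
    return b0 + b_comp * comp, b_mmse + b_inter * comp

# Hypothetical coefficients, for illustration only (not the fitted values):
b = dict(b0=30.0, b_mmse=-0.75, b_comp=5.0, b_inter=0.25)
line_without = group_line(**b, complication=False)
line_with = group_line(**b, complication=True)
```

Removing the interaction term (b_inter = 0) forces the two lines to share a slope, i.e., to be parallel.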
Advanced statistical methods 451
19.2 Application
• Results from fitting the ANOCOVA model:
Advanced statistical methods 452
• As with regression, we obtain a split of the total variability in the response (SSTO) into a component explained by the variability in the MMSE scores and differences attributable to the occurrence, yes or no, of complications (SSR), and a component expressing the total error if we predict ADL based on the model effects (MMSE and complications).
• Here too, we obtain a global F-test, checking whether the effects in the model contain information for the prediction of the ADL score. In our example, we have a significant result (p < 0.0001).
• Like before, the R² captures which part of the total variability in the data can be explained by the effects in the model:
R² = SSR/SSTO = 557.01/836.75 = 0.6657,
which means that the occurrence of complications and the MMSE score, 1 day post operation, together explain more than 66% of the variability in ADL.
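• The R² arithmetic from the output can be reproduced directly (values taken from the slide above):

```python
# Sums of squares from the ANOCOVA output above
ssr, ssto = 557.01, 836.75
sse = ssto - ssr            # residual (error) sum of squares
r_squared = ssr / ssto      # fraction of total variability explained
```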
Advanced statistical methods 453
19.3 Interpretation of results
• The relevant ANOVA table for testing the various effects in our model is:
• We obtain an F-test for each effect specified in the model.
• Because the fitted model contains an interaction term, it is not possible to make a claim about ‘the’ effect of complications or about ‘the’ effect of the MMSE score on the maximal ADL score.
Advanced statistical methods 454
• From the above table, there appears to be some evidence for interaction between both effects (strictly speaking, not significant, p = 0.0666 > 0.05).
• This means that there is some evidence for the effect of the MMSE score not being the same for patients with and without complications, i.e., that the two regression lines in the following figure are not parallel:
Advanced statistical methods 455
• In spite of some evidence for the presence of interaction between MMSE and the occurrence of complications, and in spite of this interaction being scientifically relevant and to be expected, it is not significant (p = 0.0666).
• Removal of the interaction results in a model with parallel lines: the effect of MMSE is the same for patients with and without complications:
Advanced statistical methods 456
• Results after the interaction term has been removed from the model:
Advanced statistical methods 457
• Hence, both remaining effects are highly significant:
. There is a significant difference between patients with and without complications, after correction for MMSE1. In other words, for patients with equal MMSE score, 1 day post operation, there will still be a significant difference between both groups.
. There is a significant effect of MMSE, after correction for GEN. In other words, both for patients with and without complications, ADL is related to the MMSE score, 1 day post operation.
Advanced statistical methods 458
19.4 Model diagnostics
• Like with regression and ANOVA, ANOCOVA also implies a statistical model to describe the data. In our first model (with interaction), we assume, both for patients with and without complications, that the relationship between the maximal ADL score and the MMSE score on day 1 is linear, but that the intercept as well as the slope are different between both groups.
• Based on this model, we can again determine, for every individual, a predicted ADL score, based on complication status and MMSE score, 1 day post operation.
• Again, we implicitly assumed that the errors made in prediction are normally distributed with mean zero and constant variance.
• When these assumptions are not satisfied, erroneous results can be obtained.
Advanced statistical methods 459
• Like always, the verification of these assumptions is based on the calculated residuals ei = yi − ŷi. These must not exhibit systematic trends, must have constant variance, and must be normally distributed.
• In practice, it usually suffices to apply the following techniques:
. Scatter plots of the ei versus all covariates in the model.
. Scatter plot of the ei versus the predicted values ŷi
. Normality checks for the ei
Advanced statistical methods 460
19.4.1 Residuals versus covariates
• In multiple linear regression, the residuals ei were plotted versus every covariate in the model.
• We now construct a scatter plot of the residuals ei versus each of the two covariates in the model, but perhaps with different symbols for patients of the different subgroups that are taken into account in the model (with/without complications).
• If the model is correct, we expect no further systematic trends, for any of the subgroups.
• Systematic trends, like with simple regression, can point to the need for a transformation of one or more covariates.
Advanced statistical methods 461
• Resulting plot:
• For none of the two groups do we find a systematic trend in the residuals. This implies that the maximal ADL score is not systematically over- or underestimated, e.g., for patients with higher or lower cognitive status.
Advanced statistical methods 462
19.4.2 Residuals versus predicted values
• The scatter plots of the residuals versus covariates allow us to check whether or not the response is systematically over- or underestimated for certain values of the covariates.
• In addition, it is also important to verify whether or not, for example, large or small predicted values point to systematic over- or underestimation.
• In our example, this comes down to verifying whether our model systematically over- or underestimates certain ADL values.
• This can be verified by plotting the residuals versus the predicted values ŷi, again with different symbols for the various subgroups in our set of data.
Advanced statistical methods 463
• Resulting plot:
• Thus, we find no systematic errors for certain ADL values, neither for the group with nor for the group without complications.
Advanced statistical methods 464
19.4.3 Normality of the residuals
• Verifying the residuals’ normality can, again, be done in a graphical fashion (histogram), or via a formal test for normality.
Advanced statistical methods 465
• We conclude that the normality assumption appears to be plausible.
• Like before, it holds that:
. Departures from normality still lead to correct results, as long as the distribution of the errors is symmetric.
. In case of asymmetry, the response can sometimes be transformed, so that the residuals in the new model are normally distributed.
. Such transformations can distort linearity and constant variance, so that, after transformation, the residuals need to be checked again for the new model.
Advanced statistical methods 466
19.5 Influential observations
• Each ANOCOVA model produces a predicted average response for each combination of the model effects, and these predictions satisfy the assumptions implicitly made by the model.
• In our example with interaction, we had a different regression line for each group. The model without interaction led to parallel predicted regression lines.
• We can now explore the influence of each subject on the prediction, using Cook’s distance, measuring how strongly the predicted values change when a particular individual is removed from the data.
• In practice, an influence analysis is conducted in the same way as discussed before, for regression and ANOVA.
Advanced statistical methods 467
• Index plot of Cook’s distances:
• We do not observe points that are exceptionally more influential than the others.
Advanced statistical methods 468
19.6 Examples from the biomedical literature
• Collard et al. [17]
. Statistical analysis section, p. 191:
Multiple linear regression analyses were conducted to
examine associations of the number of somatic diseases
(dependent variable) with depression (independent vari-
able) adjusted for socio-demographic variables (age,
gender, educational level, partner status, income) and
lifestyle factors (smoking status, alcohol use, BMI, and
physical exercise). First, we checked whether the associa-
tions between depression and somatic comorbidity were
dependent on frailty by including interaction terms
between frailty and depression in the fully adjusted
models. We tested both, frailty as a dichotomous
characteristic (present yes/no) and as a dimensional
variable based on the number of criteria present. A
significant interaction term between depression and frailty
(yes/no) implies that the association between depression
and somatic diseases is different in patients with and
without frailty. In case of a significant interaction term
with the number of frailty criteria present, the association
between depression and frailty differs among the different
levels of frailty. Subsequently, it was tested whether frailty
Advanced statistical methods 469
. Results section, p. 192:
5.2. Frailty as a moderating factor
Whether the association between depression and
number of somatic diseases was dependent on frailty
status, was examined by adding the interaction term of
depression by frailty to the fully adjusted linear regression
models. Depression neither interacted with the presence of
frailty (yes/no) (p = .57), nor with the number of frailty
components present (p = .25).
∗ Outcome: Number of somatic diseases
∗ Covariate: Severity of depression
∗ Factor: Presence of frailty (yes / no)
∗ Interaction: Severity of depression × Presence of frailty
∗ Adjusted for: Socio-demographic variables and Lifestyle factors
Advanced statistical methods 470
• Ausili et al. [20]:
. Data analysis section p. 20-21:
Third, we compared self-care maintenance, self-care manage-
ment and self-care confidence scores between heart failure
patients with diabetes versus those without diabetes. This
comparison was performed adjusting for sociodemographic and
clinical variables known to influence self-care maintenance,
management and confidence (Bidwell et al., 2015; Clark et al.,
2014; Cocchieri et al., 2015; Tsai et al., 2015) and those variables
that were significantly different between heart failure patients
with diabetes and heart failure patients without diabetes. The
variables that were used to adjust the above comparison were: age,
gender, Charlson Comorbidity Index score, number of medications,
employment status, Mini Mental State Examination score,
caregiver presence, education, months of illness, New York Heart
Association functional class, number of hospitalizations in the last
year, and alcohol consumption.
Fourth, in order to know if the presence of diabetes influenced
. Figure 1, p. 23:
Fig. 1. Comparison of self-care maintenance, self-care management and self-care confidence means and medians between heart failure patients with diabetes mellitus
(n = 379) and without diabetes mellitus (n = 813).
Note. Sample size in self-care management dimension is lower (628) because this scale was administered only to patients who reported symptoms of heart failure in the last
month. P-values derived by multiple linear regression adjusting for age, gender, Charlson Comorbidity Index score, number of medications, employment status, Mini Mental
State Examination score, caregiver presence, education, months of illness, New York heart Association functional class, number of hospitalizations in the last year, alcohol
consumption and self-care confidence (this last one only in the regression analysis on self-care maintenance and self-care management).
. Results section, p. 21-22:
adequate self-care (Riegel et al., 2009a,b). As shown by Fig. 1, none
of these self-care scores were statistically different between heart
failure patients with versus those without diabetes mellitus (aim
1; self-care maintenance p = 0.23, adjusted p = 0.13; self-care
management p = 0.98, adjusted p = 0.21; self-care confidence
p = 0.87, adjusted p = 0.51). Accordingly, no statistically significant
associations were found between the presence of diabetes and
∗ Three outcomes
∗ Comparison of patients with and without diabetes mellitus
∗ Corrected for covariates related to the outcome and different between both groups
∗ p-value for self-care management not the same in graph as in text (p = 0.22 versus p = 0.21)
Chapter 20
The general linear model
. Introduction
. Example
. The generalized linear model
20.1 Introduction
• In the previous chapters, many statistical models have been discussed:
. Linear regression (simple, multiple, interaction)
. Polynomial regression
. ANOVA (simple, multiple, interaction)
. ANOCOVA
• These are all special cases of the so-called ‘general linear model’ (GLM).
• In practice, one will often use a combination of the above models to relate a set of covariates and/or factors to a given response variable.
• Furthermore, one will often aim to reach a final model in a step-by-step fashion, by:
. Removal of non-significant effects
. Adding significant terms
. Adding (or removing) interaction terms
• This can be done flexibly only if we can easily switch from one type of analysis to the other.
• Most software packages have therefore incorporated all those models in a single software routine.
20.2 Example
• As an illustration, we repeat the example from the ANOCOVA chapter, where we relate the maximal ADL score to the MMSE score on the first day, and to the occurrence of complications.
• Furthermore, we want to take into account that the living condition has been found to be an important factor to explain ADL (the last living condition will be deleted, as before).
• Finally, we want to correct for the fact that not all patients are of the same age, and we allow the relationship between Age and ADL to be quadratic.
• We end up with a model containing the following effects:
. MMSE (day 1): main effect of MMSE
. Gen: main effect of complication status
. Woonsi: main effect of living condition
. MMSE*Gen: interaction of complication status and MMSE
. MMSE*Woonsi: interaction of living condition and MMSE
. Woonsi*Gen: interaction of living condition and complication status
. Leeftijd and Leeftijd2: correction for age
• It is then extremely important to make a clear distinction between covariates and (categorical) factors:
• Furthermore, the model needs to be specified:
• We obtain the following ANOVA table with the tests of the individual effects:
• As an informal check of the model specification, we can verify whether the degrees of freedom satisfy the rules:
. 1 df for each covariate (Leeftijd, Leeftijd2, MMSE)
. r − 1 df for a factor with r levels (Gen, Woonsi)
. the product of the individual df for each interaction (Gen*Woonsi, Gen*MMSE, Woonsi*MMSE)
• Note that this is a confirmation of the fact that the last living condition has been deleted (2 df correspond to 3 groups).
• Our model clearly contains too many effects and therefore should be reduced in a step-by-step fashion.
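As an informal companion to these df rules, a small sketch (assuming, as in the example, that Gen has 2 levels and Woonsi has 3) checks the term-by-term df count:

```python
# Degrees-of-freedom rules of the general linear model:
# 1 df per covariate, r - 1 df for an r-level factor, and the
# product of the component df for each interaction term.

def factor_df(levels):
    return levels - 1

def interaction_df(*component_dfs):
    df = 1
    for d in component_dfs:
        df *= d
    return df

# Terms of the example model (assumed levels: Gen = 2, Woonsi = 3)
df_mmse = df_leeftijd = df_leeftijd2 = 1            # covariates
df_gen = factor_df(2)                               # 1
df_woonsi = factor_df(3)                            # 2 (3 groups)
df_gen_woonsi = interaction_df(df_gen, df_woonsi)   # 1 * 2 = 2
df_gen_mmse = interaction_df(df_gen, df_mmse)       # 1
df_woonsi_mmse = interaction_df(df_woonsi, df_mmse) # 2

total = (df_mmse + df_leeftijd + df_leeftijd2 + df_gen + df_woonsi
         + df_gen_woonsi + df_gen_mmse + df_woonsi_mmse)
print(total)  # 11 model df in total (intercept excluded)
```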
• As is always the case, the main effects can be interpreted only if they are not part of an interaction term.
• For the reported p-values to be correct, we first need to verify whether the underlying assumptions are satisfied.
• The assumptions have been verified and are satisfied (not reported).
• Model reduction:
. Step 1: deletion of Leeftijd2:
. Step 2: deletion of Leeftijd:
. Step 3: deletion of Gen*MMSE:
. Step 4: deletion of Woonsi*MMSE:
• Although the interaction between complication status and living condition is not significant at the 5% level (p = 0.0776 > 0.05), there is still some evidence for the presence of an interaction.
• If we accept this model as final, then we reach the following conclusions:
. There is no need for an age correction
. The relationship between maximal ADL and the MMSE on day 1 depends neither on living condition nor on the occurrence of complications.
. The effect of the occurrence of complications on maximal ADL depends on the living condition of the patient prior to intake.
• This can be displayed graphically by saving the predicted ADL values in a data set (as with the residuals), and then plotting them for various combinations of factors in the model.
• Result:
• Hence, we see that patients with complications are, generally speaking, more dependent than those without.
• Furthermore, we see that the effect of a complication is much larger among those living alone than among the other two groups. This is one of the aspects of the interaction between living condition and complication status.
• Had we decided to remove the interaction term from the model, other predictions would have been obtained.
• Graphically:
• We now see that, indeed, the effect of a complication is equally large in the three living conditions.
• On the other hand, the difference between the three living conditions is independent of complications having occurred or not.
20.3 The generalized linear model
• All models considered so far share a common characteristic: the outcome variable is continuous.
• This is reflected in the fact that the underlying distributional assumption is one of normality.
• Therefore, the models are termed ‘linear models.’
• For example, when one wants to analyze a binary variable as a function of covariates and/or factors, then linear models cannot be used.
• Generalized linear models are designed to generalize the linear models to cases where the outcome variable is no longer continuous and no longer normally distributed.
• The most frequently used model here is logistic regression, which allows for the analysis of binary outcomes.
• This model will be discussed later.
Chapter 21
Regression notation of a general linear model
. Introduction
. Factor with two levels
. Factor with more than two levels
. ANOCOVA model
. Examples from the biomedical literature
21.1 Introduction
• Obviously, ANOVA, regression, and ANOCOVA models are very much related:
. Special cases of the general linear model
. Based on similar assumptions (normality, constant variance)
. Similar diagnostics (model checking, influence analysis)
• One can show that all models in the general linear model family can be written as regression models, for an appropriate selection of covariates.
• Hence, from a mathematical point of view, all models are the same.
• In many publications the regression notation is used to present results (see later).
21.2 Factor with two levels
• Let us consider a simple linear regression model:
Yi = β0 + β1Xi + εi, εi ∼ N (0, σ2)
• Furthermore, let us consider the special case where Xi can take two values only, 0 and 1.
• This subdivides our sample in two subsets:
. Observations with Xi = 0
. Observations with Xi = 1
• Graphically:
[Figure: scatter plot of the observations with the fitted simple regression line; X takes the values 0 and 1 only]
• For both subsets, the following distributional assumptions hold:
Yi = β0 + β1Xi + εi =
β0 + εi, if Xi = 0
β0 + β1 + εi, if Xi = 1
=
µ1 + εi, if Xi = 0
µ2 + εi, if Xi = 1,
with µ1 = β0 and µ2 = β0 + β1.
• Hence, the following assumptions are made:
Yi ∼ N (µ1, σ2), if observation i from group Xi = 0
Yi ∼ N (µ2, σ2), if observation i from group Xi = 1
• Hence, the model coincides with the statistical model behind the two-sample t-test:
[Figure: two normal densities with common variance on the Y scale, centered at µ1 for the group Xi = 0 and at µ2 for the group Xi = 1]
• Note also that the null-hypothesis tested in the regression model,
H0 : β1 = 0, versus HA : β1 ≠ 0,
is equivalent to the null-hypothesis tested in a t-test procedure:
H0 : µ1 = µ2, versus HA : µ1 ≠ µ2.
• This shows that a t-test can be considered a special case of linear regression, i.e., a linear regression with binary covariate X.
• Note also that the assumptions are identical:
Regression −→ t-test
Normal errors Normal errors
Constant variance Equal variance
Linearity
• Note that the linearity assumption is satisfied automatically if the covariate X can take two values only, which is why no linearity assumption was present in our earlier discussion of the t-test.
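A minimal numerical sketch (with toy data, not the course data set) confirms that the fitted slope of a regression on a binary covariate equals the difference in group means:

```python
import numpy as np

# Toy data: X is a binary group indicator, Y is continuous.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([5.0, 6.0, 5.5, 6.5, 8.0, 9.0, 7.5, 8.5])

# Fit Y = b0 + b1 * X by least squares.
X = np.column_stack([np.ones_like(x), x])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

mean0 = y[x == 0].mean()   # group mean for X = 0  ->  b0
mean1 = y[x == 1].mean()   # group mean for X = 1  ->  b0 + b1

print(b0, b1)  # intercept = mean of group 0, slope = mean1 - mean0
```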
21.3 Factor with more than two levels
• The same idea can be used to write any one-way ANOVA model as a regression model
• A factor with two levels is modeled using one dummy variable:
Xi
Group 1: 0
Group 2: 1
• The regression model fitted then equals:
Yi = β0 + β1Xi + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
• A factor with three levels is modeled using two dummy variables:
X1i X2i
Group 1: 0 0
Group 2: 1 0
Group 3: 0 1
• The ANOVA model testing equality of means of the three groups is equivalent to the multiple regression model
Yi = β0 + β1X1i + β2X2i + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
β0 + β2 + εi, if Group 3
• Testing equality of all three means is equivalent to testing H0 : β1 = β2 = 0, which is done with the overall F-test reported in the ANOVA table of the multiple regression analysis.
• Group 1 is often called the reference group to which the other groups are compared. The average differences are β1 and β2 for groups 2 and 3, respectively, relative to group 1.
• Note that the choice of the reference group is not unique.
• Selecting another reference group would not affect the p-value for the comparison of all groups.
• Selecting another reference group would affect the interpretation of the parameters β1 and β2.
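A short sketch with toy data (not the course data) illustrates the dummy coding and the interpretation of the parameters relative to the reference group:

```python
import numpy as np

# Three groups, coded with two dummies (Group 1 = reference group).
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])
y = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 2.0, 3.0, 4.0])

x1 = (group == 2).astype(float)   # dummy for Group 2
x2 = (group == 3).astype(float)   # dummy for Group 3
X = np.column_stack([np.ones(len(y)), x1, x2])
(b0, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)

m1, m2, m3 = (y[group == g].mean() for g in (1, 2, 3))
print(b0, b1, b2)  # b0 = m1, b1 = m2 - m1, b2 = m3 - m1
```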
• A factor with four levels is modeled using three dummy variables:
X1i X2i X3i
Group 1: 0 0 0
Group 2: 1 0 0
Group 3: 0 1 0
Group 4: 0 0 1
• The ANOVA model testing equality of means of the four groups is equivalent to the multiple regression model
Yi = β0 + β1X1i + β2X2i + β3X3i + εi =
β0 + εi, if Group 1
β0 + β1 + εi, if Group 2
β0 + β2 + εi, if Group 3
β0 + β3 + εi, if Group 4
• In general, a factor with r levels is modeled using r − 1 dummy variables
• This explains why, in the general linear model, r − 1 degrees of freedom are associated with a factor with r levels, while only one degree of freedom is associated with a covariate.
• Indeed, a factor with r levels is implicitly identical to r − 1 covariates.
• The same principle can be applied in models with multiple factors
• Interactions between two factors, with r and s levels, are obtained by adding the products of all r − 1 dummy variables for the first factor with all s − 1 dummy variables of the second factor, leading to (r − 1)(s − 1) additional covariates in the regression model.
• In the biomedical literature, the regression analogue of ANOVA models is often used to report differences between groups (see later).
21.4 ANOCOVA model
• The same idea can be used to write an ANOCOVA model as a linear regression model.
• Let us consider a model with covariate Xi and a factor with three levels.
• As before, we introduce two dummy variables to replace the factor:
X1i X2i
Group 1: 0 0
Group 2: 1 0
Group 3: 0 1
• The ANOCOVA model without interaction is equivalent to
Yi = β0 + β1X1i + β2X2i + β3Xi + εi =
β0 + β3Xi + εi, if Group 1
β0 + β1 + β3Xi + εi, if Group 2
β0 + β2 + β3Xi + εi, if Group 3
• Hence, the model indeed assumes, for each group, a linear relation between the outcome Y and the covariate X.
• The slope is the same for all three groups (parallel lines ≡ no interaction)
• The intercepts are allowed to be different for the groups (main group effect)
• Graphically:
[Figure: three parallel regression lines of Y on X, one per group, with intercepts shifted by β1 and β2 relative to group 1]
• The ANOCOVA model with interaction is obtained by adding the product between the dummy variables and the covariate:
Yi = β0 + β1X1i + β2X2i + β3Xi + β4X1iXi + β5X2iXi + εi
=
β0 + β3Xi + εi, if Group 1
β0 + β1 + (β3 + β4)Xi + εi, if Group 2
β0 + β2 + (β3 + β5)Xi + εi, if Group 3
• Hence, the model indeed assumes, for each group, a linear relation between the outcome Y and the covariate X.
• The slope is not the same for all three groups (no parallel lines ≡ interaction)
• The intercepts are allowed to be different for the groups (main group effect)
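A small sketch with artificially constructed, noise-free toy data shows how the group-specific slopes β3, β3 + β4 and β3 + β5 are recovered from the interaction model:

```python
import numpy as np

# Construct data that follow the interaction model exactly:
# Y = b0 + b1*X1 + b2*X2 + b3*X + b4*X1*X + b5*X2*X
b_true = np.array([1.0, 0.5, -0.5, 2.0, 1.0, -1.0])

x = np.tile(np.arange(5, dtype=float), 3)     # covariate X
group = np.repeat([1, 2, 3], 5)               # three groups
x1 = (group == 2).astype(float)               # dummy for Group 2
x2 = (group == 3).astype(float)               # dummy for Group 3
X = np.column_stack([np.ones(15), x1, x2, x, x1 * x, x2 * x])
y = X @ b_true                                # noise-free outcome

b, *_ = np.linalg.lstsq(X, y, rcond=None)
slopes = {1: b[3], 2: b[3] + b[4], 3: b[3] + b[5]}
print(slopes)  # group-specific slopes: b3, b3 + b4, b3 + b5
```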
• Graphically:
[Figure: three non-parallel regression lines of Y on X, one per group, with different intercepts and different slopes]
21.5 Examples from the biomedical literature
• Ausili et al. [20], Table 3 p. 24:
Table 3
Socio-demographic and clinical determinants of self-care maintenance, self-care management and self-care confidence in heart failure patients with diabetes mellitus
(n = 379).
Variable                                   Parameter estimate (95% CI)   p-value
Determinants of self-care maintenance (n = 379); R-square = 0.34
Age                                        −0.20 (−0.40 to −0.007)       0.042
Gender                                     −1.01 (−3.83 to 1.79)         0.471
Presence of diabetes complications         −0.28 (−2.20 to 1.63)         0.771
Charlson Comorbidity Index score           −0.20 (−1.37 to 0.97)         0.735
Number of medications taken by patients    0.90 (0.15 to 1.65)           0.017
Employment status                          −0.39 (−5.60 to 4.82)         0.883
Mini Mental State Examination score        −0.02 (−0.29 to 0.24)         0.847
Presence of a caregiver                    −3.43 (−6.82 to −0.05)        0.046
Family Income                              −1.69 (−2.96 to −0.4)         0.009
Month of Illness                           0.02 (0.005 to 0.53)          0.105
New York Heart Association Class           −1.72 (−3.62 to 0.17)         0.074
Self-care confidence                       0.34 (0.26 to 0.40)           <0.001
∗ Regression parameters for factors (e.g., gender, presence of complications, etc.)
∗ Interpretation requires knowledge of the reference group
• Hahnel et al. [21]:
. Outcomes section, p. 5:
and control groups at all skin areas, except the right and left lower
leg (Supplementary online Tables 1 to 3). Results of the GLM
analysis regarding the changes in Overall Dry Skin scores at the
right lower leg in all three groups over the entire study are shown
in Table 3.
The model was adjusted for baseline Overall Dry Skin score
(right lower leg), age, nursing homes and Barthel-Index. Group I
(b = −0.643; p = 0.020) and Group II (b = −0.696; p = 0.009) had
statistical significant lower Overall Dry Skin scores compared to
the control group (Group III) over time. Visit was modelled as intra-
individual variable. In this model, the Overall Dry Skin score
decreased significantly over time (b = −1.696; p < 0.001), and
higher baseline measurements led to generally higher Overall Dry
Skin score over the entire course of time (b = 2.336; p < 0.001). The
particular nursing home was not associated with the treatment
effect.
For all other examined skin areas (left lower leg, right forearm,
. Table 3:
Table 3
Generalized linear model for the dependent variable Overall Dry Skin score at the right lower leg (n = 117).
Parameter                                         B        Std. Error   95% Wald CI        Wald Chi-Square   df   p-value
ODS = 1                                           −0.657   1.607        (−3.806, 2.493)    0.167             1    0.683
ODS = 2                                           2.071    1.601        (−1.067, 5.209)    1.673             1    0.196
ODS = 3                                           5.286    1.656        (2.040, 8.533)     10.186            1    0.001
ODS = 4                                           8.247    1.729        (4.859, 11.635)    22.762            1    <0.001
Group I                                           −0.643   0.277        (−1.186, −0.100)   5.386             1    0.020
Group II                                          −0.696   0.267        (−1.219, −0.174)   6.819             1    0.009
Group III (control)                               0.0a     .            .                  .                 .    .
Visit                                             −1.696   0.170        (−2.029, −1.363)   99.636            1    <0.001
Overall Dry Skin Score: right lower leg (Day 0)   2.336    0.2481       (1.850, 2.822)     88.683            1    <0.001
Age (years)                                       0.012    0.014        (−0.016, 0.040)    0.665             1    0.415
Barthel-Index                                     0.011    0.005        (0.002, 0.020)     5.868             1    0.015
Nursing home 1                                    −0.607   0.938        (−2.445, 1.232)    0.419             1    0.518
Nursing home 2                                    0.115    0.567        (−0.997, 1.227)    0.041             1    0.839
Nursing home 3                                    0.017    0.456        (−0.875, 0.910)    0.001             1    0.969
Nursing home 4                                    −0.138   0.408        (−0.936, 0.661)    0.114             1    0.735
Nursing home 5                                    0.539    0.463        (−0.368, 1.446)    1.357             1    0.244
Nursing home 6                                    −0.694   0.533        (−1.738, 0.350)    1.696             1    0.193
Nursing home 7                                    0.303    0.492        (−0.661, 1.267)    0.380             1    0.538
Nursing home 8                                    −0.196   0.398        (−0.977, 0.584)    0.243             1    0.622
Nursing home 9                                    0.226    0.408        (−0.573, 1.026)    0.307             1    0.579
Nursing home 10                                   0.0a     .            .                  .                 .    .
(Scale)                                           1
df: Degrees of freedom. a Set to zero.
. Two different parameterizations in the same model:
∗ Overall Dry Skin score at baseline (ODS):
Yi =
µ1 + εi, if ODS= 1
µ2 + εi, if ODS= 2
µ3 + εi, if ODS= 3
µ4 + εi, if ODS= 4,
hence no use of dummy coding with an intercept representing a reference group.
∗ Treatment group (GROUP), and similar for Barthel-Index:
Yi =
β0 + εi, if Group 3
β0 + β1 + εi, if Group 1
β0 + β2 + εi, if Group 2,
hence dummy coding with Group 3 as reference group.
Part VIII
Models for binary outcomes
Chapter 22
Simple logistic regression
. Example
. Logistic regression model
. Application
. Model diagnostics
. Influential subjects
. Odds ratio
. Examples from the biomedical literature
22.1 Example
• One of the longitudinal measures is the CAM score, measuring the extent to which patients are confused.
• This score is measured at days 1, 3, 5, 8, and 12, post-operatively. Based on these 5 measures, one can approximately assess whether the patient has been confused post-operation.
• This variable is binary (0 = not confused, 1 = confused) and is one of the responses in which researchers took a particular interest.
• Overview of the number of confused and non-confused patients:
• About 23% of patients were confused post-operation.
• At the same time, the probability of confusion may depend on factors such as age, whether or not the patient is neuro-psychiatric, . . .
=⇒ Logistic regression
22.2 The logistic regression model
• Assume that we want to study whether confusion is related to the age of the patient.
• We then have, for each patient, a pair (xi, yi) of measures:
. xi: the age of the ith patient
. yi: confusion status: 0 = has not been confused, 1 = has been confused
• A first way to describe the relationship between xi and yi would be a linear regression model:
Yi = β0 + β1xi
• Graphically:
• The graph points to some problems:
. The discrete nature of the response implies that the observed data are poorly described by the regression line, implying a low R2.
. The predicted values of the response can take every real value, which is entirely meaningless given that the response Y can assume the values 0 and 1 only.
• The first problem is solved by relating age to the probability of confusion, rather than confusion itself:
P (Yi = 1) = β0 + β1xi
• This implies that every value between 0 and 1 is meaningful as a prediction.
• Graphically:
• To further impose that the predicted probabilities would not be larger than 1 or smaller than 0, the linear relationship is replaced by a so-called logistic relationship:
P (Yi = 1) = exp(β0 + β1xi) / [1 + exp(β0 + β1xi)]
• The relationship between the probability of confusion and age is then S-shaped:
. Approximately linear in the middle
. Leveling off near the extremes
• The above model is a logistic regression.
• Graphically:
• In practice, fitting this model comes down to finding estimates of β0 and β1, such that the corresponding logistic curve describes the data best.
• The numerical method to compute these estimates will not be discussed further.
• As with simple linear regression, the logistic regression model contains two parameters.
• Intercept β0: captures the horizontal displacement of the curve. The larger the intercept, the larger the probability of a ‘success,’ which means that the regression curve shifts to the left when β0 increases.
• Slope β1: describes how strongly the chance of a ‘success’ changes as a function of the covariate X. The logistic curve increases if β1 > 0, and decreases if β1 < 0. The larger |β1|, the stronger the increase or decrease.
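The role of β0 and β1 can be explored with a minimal sketch of the logistic curve (illustrative values only, not the fitted model):

```python
import math

def logistic(x, b0, b1):
    """P(Y = 1) under the simple logistic regression model."""
    return math.exp(b0 + b1 * x) / (1.0 + math.exp(b0 + b1 * x))

xs = [x / 10.0 for x in range(-100, 101)]
ps = [logistic(x, 0.0, 1.0) for x in xs]      # b1 > 0: increasing S-curve

assert all(0.0 < p < 1.0 for p in ps)         # probabilities stay in (0, 1)
assert all(p1 <= p2 for p1, p2 in zip(ps, ps[1:]))  # monotone increasing
print(logistic(0.0, 0.0, 1.0))                # 0.5 at the curve's midpoint
```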
• Graphically:
• Graphically (β1 > 0):
• Graphically (β1 < 0):
• Note that, up to now, the probability P (Y = 1) was modeled as a function of a covariate X. A fully equivalent model is obtained by modeling P (Y = 0):
P (Yi = 0) = 1 − P (Yi = 1)
           = 1 − exp(β0 + β1xi) / [1 + exp(β0 + β1xi)]
           = 1 / [1 + exp(β0 + β1xi)]
           = 1 / {exp(β0 + β1xi) [1 + exp(−β0 − β1xi)]}
           = exp(−β0 − β1xi) / [1 + exp(−β0 − β1xi)]
• Hence, if the probability of a ‘failure’ is modeled, we obtain the same result as when modeling the probability of a ‘success,’ but with opposite sign for the regression coefficients.
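This sign-flip equivalence can be checked numerically (the rounded estimates from the confusion example are used purely as illustrative values):

```python
import math

def p_success(x, b0, b1):
    """P(Y = 1) under a simple logistic regression model."""
    return math.exp(b0 + b1 * x) / (1.0 + math.exp(b0 + b1 * x))

b0, b1 = -10.30, 0.11   # illustrative (rounded) coefficients
for x in (60.0, 75.0, 90.0):
    # Modeling 'failure' with flipped signs gives 1 - P(success).
    lhs = 1.0 - p_success(x, b0, b1)
    rhs = p_success(x, -b0, -b1)
    assert abs(lhs - rhs) < 1e-12
print("sign-flip identity verified")
```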
22.3 Application
• In most statistical software packages, model specification is completely analogous to the specification of the general linear model, but within a module for logistic regression.
• Often, logistic regression is implemented within the generalized linear models environment.
• In order to correctly interpret results, it is important to check whether P (Y = 0) is modeled, or P (Y = 1).
• Most software packages allow specification of your preference (P (Y = 0) or P (Y = 1)), and they all provide information about which model was fitted.
• For example, the SAS software package gives the following notification:
• In our analyses, P (no confusion) is modeled, implying that ‘no confusion’ is treated as ‘success’.
• The parameter estimates are given by:
• We observe a significant (p = 0.0137) relationship between age and occurrence of confusion. The logistic curve is described by the equation:
P (not confused) = exp(10.30 − 0.11 × age) / [1 + exp(10.30 − 0.11 × age)]
or, equivalently,
P (confused) = exp(−10.30 + 0.11 × age) / [1 + exp(−10.30 + 0.11 × age)]
• Hence, the probability of confusion increases with age (0.11 > 0).
• The above equation can now be used to predict, based on age, the probability of confusion:
Age        P (confused)
65 years   exp(−10.30 + 0.11 × 65) / [1 + exp(−10.30 + 0.11 × 65)] = 0.05019
75 years   exp(−10.30 + 0.11 × 75) / [1 + exp(−10.30 + 0.11 × 75)] = 0.14095
85 years   exp(−10.30 + 0.11 × 85) / [1 + exp(−10.30 + 0.11 × 85)] = 0.33751
95 years   exp(−10.30 + 0.11 × 95) / [1 + exp(−10.30 + 0.11 × 95)] = 0.61268
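A small sketch reproduces this kind of prediction from the rounded estimates; because the slide's table was computed with the unrounded estimates, the exact values differ slightly:

```python
import math

def p_confused(age):
    """Predicted confusion probability, using the rounded estimates
    b0 = -10.30 and b1 = 0.11 (the unrounded fitted values would give
    slightly different probabilities)."""
    eta = -10.30 + 0.11 * age
    return math.exp(eta) / (1.0 + math.exp(eta))

for age in (65, 75, 85, 95):
    print(age, round(p_confused(age), 4))
# The predicted probability of confusion increases with age.
```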
• We can request a table with the predicted probabilities for all individuals:
• Note that these predictions take the form of the probability of no confusion, since this was the probability modeled in our analysis.
• Graphical representation of the predicted probabilities:
• Note that we see only part of the S-curve: only those values of age for which there are actual observations:
• Extrapolated curve:
22.4 Model diagnostics
• Each statistical model is based on a number of assumptions regarding the data collected. This is true for the simple logistic regression model as well.
• In our example, we assumed that the response variable (confusion status) followed a Bernoulli distribution, where the probability of ‘success’ is described by a logistic curve:
P (confused) = exp(β0 + β1 × age) / [1 + exp(β0 + β1 × age)]
• As always, when the assumptions are not satisfied, erroneous results may follow.
22.4.1 The deviance statistic
• Every logistic regression is accompanied by a table with so-called ‘Goodness-of-fit’ statistics:
• Each statistic is some measure for the total distance between the observations and the predicted probabilities.
• A well-fitting model satisfies:
. All values Y = 1 get (very) high predictions for P (Y = 1)
. All values Y = 0 get (very) low predictions for P (Y = 1)
• We will restrict interpretation to the deviance, the most popular measure.
• The smaller the deviance, the better the model describes the data.
• As a rule of thumb, models with a deviance less than the number of associated degrees of freedom (DF) are considered well fitting. This comes down to a value of Value/DF smaller than 1.
• The DF is the number of observations in the data set minus the number of parameters in the model, in our case 60 − 2 = 58.
• This way, the deviance is corrected for the sample size and penalized for the complexity of the model, which is of particular interest in multiple logistic regression models.
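A minimal sketch of the deviance computation (with made-up fitted probabilities, not the course output) illustrates the Value/DF rule of thumb:

```python
import math

# Deviance of a fitted logistic model: -2 times the Bernoulli
# log-likelihood of the observations, judged against its DF.
y = [1, 1, 1, 0, 0, 1, 0, 1]                    # observed 0/1 responses
p = [0.9, 0.8, 0.7, 0.2, 0.3, 0.6, 0.4, 0.7]    # fitted P(Y = 1)

deviance = -2.0 * sum(
    yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    for yi, pi in zip(y, p)
)
n_obs, n_params = len(y), 2       # intercept + one slope
df = n_obs - n_params
print(deviance, deviance / df)    # rule of thumb: Value/DF < 1
```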
22.4.2 Pearson residuals
• As with regression and ANOVA models, residuals can be calculated for logistic regression models as well.
• A residual measures how well the response for a given individual is predicted by the regression curve.
• Hence, a residual should be close to zero if an observation Y = 1 has a large predicted value for P (Y = 1), or if an observation Y = 0 has a low predicted value for P (Y = 1); it should be far from zero otherwise.
• In the context of logistic regression, various types of residuals are defined. We will restrict ourselves here to the so-called Pearson residuals. They are denoted, in analogy with regression and ANOVA, by ei.
• The residual ei is a measure for how far the observed response (0 or 1) is separated from the predicted probability of ‘success’:
ei = [yi − P (Yi = 1)] / √{P (Yi = 1)[1 − P (Yi = 1)]}
• The denominator is necessary to ensure that all ei have comparable variance, such that one does not conclude bad prediction only because yi − P (Yi = 1) is estimated with a lot of uncertainty.
• In our example, this comes down to comparing confusion status with the predicted probability of being confused.
• Most software packages calculate the Pearson residuals by default.
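As a check, a small sketch reproduces (up to rounding) the Pearson residual of one of the confused outliers reported later (fitted P(confusion) = 0.11576; recall that ‘success’ here is ‘no confusion’, so this patient has y = 0):

```python
import math

def pearson_residual(y, p):
    """Pearson residual for a 0/1 outcome y with fitted P(Y = 1) = p."""
    return (y - p) / math.sqrt(p * (1.0 - p))

# Patient #18 from the slides: 'success' means 'no confusion',
# fitted P(confusion) = 0.11576, and the patient was confused (y = 0).
p_success = 1.0 - 0.11576
e = pearson_residual(0, p_success)
print(round(e, 5))   # close to the -2.76382 reported in the slides
```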
• A scatter plot of the Pearson residuals as a function of the covariate Age:
• Note that the graph does show a certain form of systematic pattern.
• In general, we even expect that, with a strongly predictive model, a large systematic trend will be observed in a scatter plot of the Pearson residuals versus a model covariate:
. All non-confused patients have a positive residual:
1 − P (Yi = 1) ≥ 0
. A very large positive residual corresponds to a patient for whom we expect confusion (relatively large predicted probability in favor of confusion), but who happened not to be confused nevertheless.
. All confused patients have a negative residual:
0 − P (Yi = 1) ≤ 0
. A very negative residual corresponds to a patient for whom we do not expect confusion (relatively large predicted probability in favor of non-confusion), but who happened to be confused nevertheless.
. In a strongly predictive model, we therefore expect to see positive residuals for small covariate values and negative residuals for large covariate values, or vice versa.
• Therefore, in practice, no scatter plot is made of Pearson residuals versus the covariates (or predicted values).
• However, an index plot can be made of the Pearson residuals, to detect which observations fit the model poorly:
• We observe that the negative values are largest in absolute value.
• Furthermore, there are a number of outliers:
Patient Age P (Confusion) Pearson residual
#16 83 0.28902 −1.56841
#18 73 0.11576 −2.76382
#33 83 0.28902 −1.56841
#34 83 0.28902 −1.56841
#38 72 0.10466 −2.92493
#51 83 0.28902 −1.56841
#52 77 0.17080 −2.20338
#53 84 0.31285 −1.48202
• All of these patients had a small predicted probability of confusion, and happened to be confused nevertheless.
• A careful analysis of these subjects established that all of them were neuro-psychiatric.
• This can be represented graphically by plotting the residuals again, but with a symbol specific to each of the two neuro-groups:
• Hence, we see that all confused neuro-psychiatric patients have a very negative residual, which implies that our model does not describe these patients adequately.
• The probability of confusion among confused neuro-psychiatric patients is systematically underestimated.
• This points to the need for multiple logistic regression, which allows several covariates to be included simultaneously when predicting confusion.
Advanced statistical methods 548
22.5 Influential observations
• Exactly as with regression, ANOVA, and ANOCOVA, it is possible to examine the influence of each individual separately on the results obtained.
• One can again make use of Cook’s distance, obtained in exactly the same way as earlier with linear models.
• For observations with relatively large influence, the analysis can again be conducted with and without these observations.
• This too is done in full analogy with the way it is done with linear models.
Advanced statistical methods 549
22.6 Odds ratio
• In a linear regression model
Yi = β0 + β1xi,
we have that the effect (on average) of a one-unit increase in the covariate X from X = x to X = x + 1 equals β1, and is independent of x.
• This does not directly generalize to the logistic regression model
P (Yi = 1) = exp(β0 + β1xi) / [1 + exp(β0 + β1xi)],
unless ‘risk’ is expressed in terms of the odds rather than the probability.
Advanced statistical methods 550
• The odds of observing a success equals:
Odds(Yi = 1) = P (Yi = 1) / [1 − P (Yi = 1)]
= { exp(β0 + β1xi) / [1 + exp(β0 + β1xi)] } / { 1 / [1 + exp(β0 + β1xi)] }
= exp(β0 + β1xi)
• The odds can be interpreted as a different scale for quantifying risk:
. Odds(Yi = 1) increases (decreases) if P (Yi = 1) increases (decreases)
. There is a 1-1 relation between Odds(Yi = 1) and P (Yi = 1):
P (Yi = 1) ←→ Odds(Yi = 1)
0.10 ←→ 1/9
0.25 ←→ 1/3
0.50 ←→ 1
0.75 ←→ 3
0.90 ←→ 9
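This one-to-one relation between probability and odds can be verified numerically; a minimal Python sketch (the helper names are ours, for illustration):

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)

def prob(o):
    """Probability corresponding to odds o."""
    return o / (1 + o)

# Reproduce the table: P(Y = 1) <-> Odds(Y = 1)
for p, o in [(0.10, 1 / 9), (0.25, 1 / 3), (0.50, 1.0), (0.75, 3.0), (0.90, 9.0)]:
    assert abs(odds(p) - o) < 1e-12
    assert abs(prob(o) - p) < 1e-12
```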
Advanced statistical methods 551
• The odds ratio is the relative change in odds, for a one-unit increase in the covariate X from X = x to X = x + 1:
OR = Odds(Y = 1|X = x + 1) / Odds(Y = 1|X = x)
= exp(β0 + β1(x + 1)) / exp(β0 + β1x) = exp(β1).
• Hence, the relative change in risk (measured on the odds scale), associated with a one-unit increase in the covariate, from X = x to X = x + 1, is independent of x.
• Therefore, many publications report the OR, i.e., exp(β1), rather than the slope β1 to express the effect of a covariate.
• Finally, the null-hypothesis H0 : β1 = 0 translates to H0 : OR = 1, hence confidence intervals for OR are compared to the value 1 rather than 0.
• Note that correct interpretation of the OR requires knowledge of the reference category.
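The fact that the OR does not depend on x can also be checked numerically. In the sketch below, β0 and β1 are hypothetical values chosen only for illustration, not fitted values from the course example:

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1 = -10.0, 0.12

def odds(x):
    """Odds of success under the logistic model, at covariate value x."""
    p = math.exp(b0 + b1 * x) / (1 + math.exp(b0 + b1 * x))
    return p / (1 - p)

# The OR for a one-unit increase equals exp(b1), whatever x is:
for x in (60, 70, 80):
    assert abs(odds(x + 1) / odds(x) - math.exp(b1)) < 1e-9
```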
Advanced statistical methods 552
22.7 Examples from the biomedical literature
• Silva et al. [22]
. Statistical analysis section, p.637:
Univariate and multivariate stepwise logistic regression analysis of the data was also performed. Gender, age and BMI (body mass index) were defined as antecedent variables, and two analytical models were then developed: (I) a model using the number of comorbidities and red cell indices individually as independent variables and (II) a model that also included inflammatory cytokine and erythropoietin levels. In both models we used the frailty classification and each of the defining criteria as outcome, and a significance level of 5% (p < 0.05) was used with a 95% confidence interval.
Advanced statistical methods 553
. Table 3, p.639:
Table 3
Univariate analysis of variables of interest associated with frailty. FIBRA Study
(n = 255).
Variable ORa (95% CI OR) p value*
Female gender 4.94 1.081–22.580 0.039
Age 0.84 0.744–0.970 <0.001
RBC 0.13 0.032–0.542 0.005
Hb 0.40 0.259–0.650 <0.001
HTC 0.77 0.660–0.913 0.008
MCV 1.00 0.900–1.127 0.891
RDW 2.15 1.263–3.683 0.019
RetAbs 0.99 0.958–1.026 0.871
Serum EPO 1.01 0.998–1.034 0.178
Serum hsCRP 4.83 1.776–13.15 0.008
Serum IL-1RA 1.00 1.000–1.003 0.081
Serum IL-6 1.02 0.910–1.143 0.458
a Ref: Non-frail. * Significance level p < 0.05.
Advanced statistical methods 554
. Interpretation:
∗ OR rather than regression parameters
∗ ORs calculated with ‘non-frailty’ group as reference
∗ Regression dummy coding for factor gender
∗ For example, the odds of belonging to the frailty group are almost five times larger for females than for males (p = 0.039)
Advanced statistical methods 555
• Moon and Lee [23]
. Analysis section p.1427:
2.9. Analysis
The Chi-square and t-tests were used for comparing the demographics and clinical characteristics between the intervention and the control groups. The effects of protocol application on delirium incidence, mortality, and re-admission to the ICU during the same hospitalization period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were
Advanced statistical methods 556
. Table 3, p.1429:
Table 3
Effects of the delirium prevention protocol on the patient outcomes.

Outcomes                                        Univariate logistic/linear regression
                                                OR/HR (CI)         b      SE
Episodes of delirium^a                          0.50 (0.22–1.14)
In-hospital mortality^a                         0.28 (0.08–0.90)
7-day in-hospital mortality^b                   0.10 (0.01–0.83)
30-day in-hospital mortality^b                  0.34 (0.10–1.13)
ICU re-admission during same hospitalization^a  0.28 (0.07–1.07)
ICU length of stay^c                                               0.80   1.95

CI = confidence interval; ICU = intensive care unit; HR = hazard ratio; OR = odds ratio.
^a Logistic regression. ^b Cox proportional hazards regression. ^c Linear regression.
Advanced statistical methods 557
. Interpretation of results:
∗ Multiple types of analyses in one table (logistic, linear, Cox regression)
∗ OR rather than regression parameters for logistic regression models
∗ No explicit mentioning of reference category used in the calculation of the OR.
∗ Most likely: Mortality / No mortality
∗ Regression dummy coding for factor intervention
∗ No explicit mentioning of reference group for the factor
∗ Most likely the control group is used as reference
∗ For example, applying the prevention protocol reduces the odds for in-hospitalmortality by a factor 0.28 (C.I.:[0.08; 0.90]).
Advanced statistical methods 558
Chapter 23
Multiple logistic regression
. Example
. Application
. Model diagnostics
. Influential observations
. Odds ratio
. Examples from the biomedical literature
Advanced statistical methods 559
23.1 Example
• With simple logistic regression, we found evidence to believe that the logistic regression model, used to relate the probability of confusion to age, systematically underestimated the probability of confusion for neuro-psychiatric patients.
• This suggests that we ought to conduct a separate logistic regression for each of the neuro-groups.
• Exactly as with linear regression and ANOVA, simple logistic regression can be extended to situations where the response variable is related to several covariates (as with multiple regression), multiple factors (as with multiple ANOVA), or several covariates and factors (as with ANOCOVA).
• Also now, potential interactions can be included in the model.
Advanced statistical methods 560
23.2 Application
• Result from fitting a logistic regression with covariate age and factor neuro status:
• Inclusion of the interaction implicitly implies two separate logistic models for the two neuro groups.
Advanced statistical methods 561
• A graphical representation of the estimated regression curves, with group-specific symbols:
Advanced statistical methods 562
• The graph suggests that, for both neuro-groups, the probability of no confusion (confusion) decreases (increases) with age.
• Further, for each age, the probability of confusion is larger for neuro- than for non-neuro-patients.
• The apparently stronger relationship between confusion and age for the non-neuro patients than for the neuro patients is not significant (p = 0.1982). In other words, one can assume that the effect of age on confusion is the same for both neuro groups.
• A logistic regression model that explicitly makes this assumption can be estimated by removing the interaction term from the model.
Advanced statistical methods 563
• Result:
• Hence, each of the two terms is significant, after correction for the other:
. There is a significant effect of age on confusion, after correction for neuro status (p = 0.0110). In other words, for both the neuro and the non-neuro patients, there is a significant relationship between age and the probability of confusion.
. There is a significant effect of neuro-status, after correction for age (p = 0.0017). In other words, for patients of a given age, there is a significant difference between the neuro groups concerning the probability of being confused.
Advanced statistical methods 564
• Based on the above model, we obtain the following graph for the predicted probability of not being confused:
Advanced statistical methods 565
• Given that our model no longer contains an interaction of neuro status with age, the logistic models have the same slope and only differ in the intercepts, hence one S-shaped curve is a horizontal translation of the other S-shaped curve.
• This is clear if we extrapolate both curves to ages outside of the range of observed ages:
Advanced statistical methods 566
• As a general conclusion, we can state that both neuro status and age have an influence on the confusion status, independently of each other:
. The probability of confusion increases with age.
. The probability of confusion is larger among neuro-psychiatric patients than among non-neuro-psychiatric patients.
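As an illustration of how such a two-term logistic model is fitted by maximum likelihood, the sketch below implements the standard Newton-Raphson (IRLS) algorithm in Python. The data, sample size, and coefficients are simulated assumptions playing the role of the confusion example; they are not the course's data or fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: centered age plus a binary neuro indicator,
# with assumed 'true' coefficients (intercept, age, neuro)
n = 4000
age_c = rng.uniform(-15, 15, n)              # age centered around its mean
neuro = rng.integers(0, 2, n).astype(float)
beta_true = np.array([-0.75, 0.15, 1.0])
X = np.column_stack([np.ones(n), age_c, neuro])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true))).astype(float)

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-np.clip(X @ beta, -30, 30)))
        W = p * (1 - p)                      # weights of the IRLS step
        # Newton step: beta += (X' W X)^{-1} X' (y - p)
        beta = beta + np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

beta_hat = fit_logistic(X, y)
# exp(beta_hat[2]) estimates the age-adjusted odds ratio for neuro status
```

With this many simulated observations, the estimates land close to the assumed coefficients, and exponentiating them gives the corresponding odds ratios.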
Advanced statistical methods 567
23.3 Model diagnostics
23.3.1 The deviance statistic
• For our logistic regression model designed to predict confusion based on age and neuro status, without interaction between both effects, the table with ‘goodness-of-fit’ statistics is:
Advanced statistical methods 568
• Compared to the deviance for our original simple logistic model, the number of degrees of freedom has been reduced from 58 to 57, because the model contains one additional parameter.
• The ratio of the deviance statistic over its degrees of freedom now is 0.7821, which implies that our model seems to predict most observations reasonably well.
• In our simple model, with age as the sole effect, this ratio equaled 0.9971, leading to the conclusion that adding neuro status considerably improved our model. This confirms what we expected based on the residual plot from simple logistic regression.
Advanced statistical methods 569
23.3.2 Pearson residuals
• Index plot of the Pearson residuals, with neuro-group-specific symbols:
• We now see a more balanced spread of the residuals; also, neuro-patients no longer exhibit systematically deviating residuals.
Advanced statistical methods 570
23.4 Influential observations
• As before, Cook’s distance can be calculated for each observation, measuring the effect of removing that observation.
• For observations with relatively large influence, the analysis can be conducted with and without these observations.
• This is done in full analogy with the analyses discussed before.
Advanced statistical methods 571
23.5 Odds ratio
• Suppose a multiple logistic regression model with two covariates X1 and X2 has been fitted:
P (Yi = 1) = exp(β0 + β1x1i + β2x2i) / [1 + exp(β0 + β1x1i + β2x2i)].
• In full analogy to the simple logistic regression model, the odds of observing a success equals:
Odds(Yi = 1) = P (Yi = 1) / [1 − P (Yi = 1)]
= { exp(β0 + β1x1i + β2x2i) / [1 + exp(β0 + β1x1i + β2x2i)] } / { 1 / [1 + exp(β0 + β1x1i + β2x2i)] }
= exp(β0 + β1x1i + β2x2i)
Advanced statistical methods 572
• In order to express the change in risk with a one-unit increase in covariate X1 from X1 = x1 to X1 = x1 + 1, while keeping the other covariate fixed, the OR can be used:
OR = Odds(Y = 1|X1 = x1 + 1, X2 = x2) / Odds(Y = 1|X1 = x1, X2 = x2)
= exp(β0 + β1(x1 + 1) + β2x2) / exp(β0 + β1x1 + β2x2) = exp(β1).
• Hence, the relative change in risk (measured on the odds scale), associated with a one-unit increase in the covariate, from X1 = x1 to X1 = x1 + 1, is not only independent of x1, but also of x2.
• This is in full analogy to the interpretation of a regression coefficient in a multiple linear regression model.
• Caution is needed, however, in polynomial models or models with interactions, where individual regression parameters cannot always be interpreted on their own.
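Numerically, the adjusted OR can be checked to be independent of both x1 and x2. The coefficients below are hypothetical, chosen only for illustration:

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1, b2 = -9.0, 0.10, 1.2

def odds(x1, x2):
    """Odds of success under the two-covariate logistic model."""
    return math.exp(b0 + b1 * x1 + b2 * x2)

# The OR for a one-unit increase in X1 equals exp(b1),
# regardless of x1 AND regardless of the value held for x2:
for x1 in (60.0, 75.0):
    for x2 in (0.0, 1.0):
        assert abs(odds(x1 + 1, x2) / odds(x1, x2) - math.exp(b1)) < 1e-9
```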
Advanced statistical methods 573
23.6 Examples from the biomedical literature
• Silva et al. [22]
. Statistical analysis section, p.637:
Univariate and multivariate stepwise logistic regression analysis of the data was also performed. Gender, age and BMI (body mass index) were defined as antecedent variables, and two analytical models were then developed: (I) a model using the number of comorbidities and red cell indices individually as independent variables and (II) a model that also included inflammatory cytokine and erythropoietin levels. In both models we used the frailty classification and each of the defining criteria as outcome, and a significance level of 5% (p < 0.05) was used with a 95% confidence interval.
Advanced statistical methods 574
. Table 4, p.639:
Table 4
Multivariate analysis of variables of interest associated with frailty status (Model I).
FIBRA study (n = 255).
Variable ORa (95% CI OR) p value*
Model I
Outcome: Frailty
Age 1.16 (1.065–1.277) 0.001
Hb 0.493 (0.270–0.899) 0.021
Weight loss
Hb 0.68 (0.490–0.956) 0.026
RetAbs 0.97 (0.946–0.997) 0.029
Fatigue
No statistically significant difference
Grip strength
Age 1.11 (1.049–1.174) <0.001
Physical activity
Age 1.07 (1.023–1.133) 0.005
RetAbs 1.02 (1.003–1.044) 0.022
Gait speed
Age 1.09 (1.043–1.157) <0.001
a Ref: Non-frail. * Significance level p < 0.05.
Advanced statistical methods 575
. Interpretation:
∗ Several outcomes: Frailty and several defining criteria
∗ Backward selection, leading to different final models for the different outcomes
Advanced statistical methods 576
• Moon and Lee [23]
. Analysis section p.1427:
2.9. Analysis
The Chi-square and t-tests were used for comparing the demographics and clinical characteristics between the intervention and the control groups. The effects of protocol application on delirium incidence, mortality, and re-admission to the ICU during the same hospitalization period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were
Advanced statistical methods 577
. Table 3, p.1429:
Table 3
Effects of the delirium prevention protocol on the patient outcomes.

                                                Univariate logistic/     Multivariate logistic/
                                                linear regression        linear regression
Outcomes                                        OR/HR (CI)          p    OR/HR (CI)          p
Episodes of delirium^a                          0.50 (0.22–1.14)   .10   0.52 (0.23–1.21)   .13
In-hospital mortality^a                         0.28 (0.08–0.90)   .02   0.32 (0.09–1.13)   .08
7-day in-hospital mortality^b                   0.10 (0.01–0.83)   .03   0.09 (0.01–0.72)   .02
30-day in-hospital mortality^b                  0.34 (0.10–1.13)   .08   0.33 (0.10–1.09)   .07
ICU re-admission during same hospitalization^a  0.28 (0.07–1.07)   .06   0.28 (0.07–1.13)   .07
ICU length of stay^c (b, SE)                    0.80, 1.95         .69   1.80, 1.92         .35

CI = confidence interval; ICU = intensive care unit; HR = hazard ratio; OR = odds ratio.
^a Logistic regression. ^b Cox proportional hazards regression. ^c Linear regression.
Adjusted variables: episodes of delirium, ventilator use, APACHE II score, excluding episodes of delirium for the analysis with episodes of delirium as the dependent variable.
∗ Multiple types of analyses in one table (logistic, linear, Cox regression)
∗ Simple and multiple analyses compared
∗ The significant difference in risk for in-hospital mortality between patients receiving the prevention protocol and control patients is no longer significant after correction for some patient characteristics (p = 0.02 → p = 0.08).
Advanced statistical methods 578
Part IX
Models for time-to-event data
Advanced statistical methods 579
Chapter 24
Survival analysis without censoring
. Example
. The survival curve
. Estimation of survival curve
Advanced statistical methods 580
24.1 Example: Survival times of cancer patients
• Cameron and Pauling [24]; Hand et al. [25] p. 255
• Patients with advanced cancer of the stomach, bronchus, colon, ovary, or breast were treated (in addition to standard treatment) with ascorbate.
• The outcome of interest is the survival time (days).
• Research question(s):
What is the prognosis for a patient with a specific type of cancer?
Do survival times differ with the organ affected?
Advanced statistical methods 581
• Dataset ‘Cancer’:
Stomach Bronchus Colon Ovary Breast
124 81 248 1234 1235
42 461 377 89 24
25 20 189 201 1581
45 450 1843 356 1166
412 246 180 2970 40
51 166 537 456 727
1112 63 519 3808
46 64 455 791
103 155 406 1804
876 859 365 3460
146 151 942 719
340 166 776
396 37 372
223 163
138 101
72 20
245 283
Average (days) Median (days)
Stomach: 286 124
Bronchus: 211.6 155
Colon: 457.4 372
Ovary: 884.3 406
Breast: 1395.9 1166
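The averages and medians can be reproduced directly from the listed survival times; for example, for the ovary group:

```python
from statistics import mean, median

# Ovary-cancer survival times (days), from the 'Cancer' dataset above
ovary = [1234, 89, 201, 356, 2970, 456]

avg = round(mean(ovary), 1)   # 884.3 days, matching the table
med = median(ovary)           # 406.0 days: far below the average,
                              # reflecting the right-skewed distribution
```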
Advanced statistical methods 582
• Note the severe differences between averages and medians, due to the skewness of the distribution.
• Comparisons between groups are therefore based on parametric tests after appropriate transformation (e.g., logarithmic), or on non-parametric tests (e.g., the Wilcoxon test).
Advanced statistical methods 583
24.2 The survival curve
• Often it is of interest to make a prognosis for specific patients, i.e., it is of interest to estimate the probability of ‘surviving’ a specific amount of time.
• In other contexts, the response is not ‘survival’, but still a ‘time to event’:
. Progression-free ‘survival’
. How long will a bulb ‘survive’?
. Time until the first tooth is affected with caries
. Time a rat needs to find the exit of a maze
. . . .
• Terminology: Survival and Failure
Advanced statistical methods 584
• In the cancer example, it may be of interest to estimate how likely it is that a patient with colon cancer, treated (in addition to standard treatment) with ascorbate, will survive 1 year, 2 years, . . .
• Interest is then in the survival function / curve:
S(t) = P (Outcome > t)
“The probability of surviving time point t”
• Properties of S(t):
. S(0) = 1: There is absolute certainty to ‘survive’ t = 0
. S(+∞) = 0: There is absolute certainty to ‘fail’ eventually
. S(t) is a decreasing function
Advanced statistical methods 585
• Examples of survival curves:
Advanced statistical methods 586
24.3 Estimation of survival curve
• As S(t) can be interpreted as a proportion, it can easily be estimated by the observed proportion of subjects surviving time point t:
S(t) = P (Outcome > t) −→ Ŝ(t) = (# subjects surviving t) / N
• As an example, we estimate the survival curve for ovary cancer patients
• The following 6 event times were recorded:
1234 89 201 356 2970 456
Advanced statistical methods 587
• Calculations:
Time (t) # Surviving t S(t)
0 6 6/6 = 1.00
30 6 6/6 = 1.00
89 5 5/6 = 0.83
100 5 5/6 = 0.83
201 4 4/6 = 0.67
356 3 3/6 = 0.50
400 3 3/6 = 0.50
556 2 2/6 = 0.33
1234 1 1/6 = 0.17
2970 0 0/6 = 0.00
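The calculations above amount to evaluating the empirical survival function; a minimal Python version (the function name is ours, for illustration):

```python
def surv_hat(times, t):
    """Estimated survival function: observed fraction surviving past t."""
    return sum(x > t for x in times) / len(times)

ovary = [1234, 89, 201, 356, 2970, 456]   # the 6 recorded event times

# Reproduce a few rows of the table above:
assert surv_hat(ovary, 0) == 6 / 6
assert surv_hat(ovary, 100) == 5 / 6
assert surv_hat(ovary, 356) == 3 / 6
assert surv_hat(ovary, 2970) == 0 / 6
```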
Advanced statistical methods 588
• Graphically:
Advanced statistical methods 589
• Remarks:
. S(t) is estimated using a step function
. Steps only at the times where events were observed
. Step size at time point t: (# subjects with event at t) / N
. The estimate is right-continuous:
Advanced statistical methods 590
Chapter 25
Survival analysis with censoring
. The problem of censoring
. Example
. Kaplan-Meier estimate of survival curve
. Comparison of survival curves
. Examples from biomedical literature
Advanced statistical methods 591
25.1 The problem of censoring
Event time cannot always be measured!
⇓
Censored observations
Various types of censoring:
. Right
. Left
. Interval
. Mixture of the above
Advanced statistical methods 592
No censoring

[Figure: Time/Age timelines for 8 subjects; each subject is followed until the event, so every true event time (•) is observed (◦).]

Advanced statistical methods 593
Right censoring due to study end

[Figure: Time/Age timelines for 8 subjects; for subjects whose event falls after the end of the study, the true event time (•) is unobserved and the observation (◦) is right-censored at the study end.]

Advanced statistical methods 594
Right censoring due to dropout

[Figure: Time/Age timelines for 8 subjects; subjects who drop out before their event are right-censored (◦) at the time of dropout, so the true event time (•) is unobserved.]

Advanced statistical methods 595
Left censoring due to late study onset

[Figure: horizontal timelines for Subjects 1–8 along a Time/Age axis.
Legend: segment before event / segment after event; •: true event time;
◦: observations. A vertical line marks the begin of the study. For
subjects whose true event time (•) falls before the begin of the study,
the event is only known to have occurred before study onset: their
event times are left-censored.]
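The figure above can be mirrored in a small data-encoding sketch (hypothetical, not from the course material): a left-censored event time is stored not as an exact value but as an interval from zero up to the study onset, since we only know the event happened before the study began. All subject data and the onset value below are fictitious.

```python
# Hypothetical illustration: encoding left-censored event times.
# When a subject's event occurred before the study began, only an
# upper bound on the event time (the study onset) is known.

STUDY_ONSET = 50.0  # fictitious begin of study (Time/Age units)

# (subject_id, observed_time, event_before_onset) for fictitious subjects
raw = [
    (1, 62.0, False),        # event observed during the study: exact time
    (2, STUDY_ONSET, True),  # event had already happened: left-censored
    (3, 71.5, False),
]

records = []
for sid, t, left_censored in raw:
    if left_censored:
        # only the upper bound is known: 0 <= T < STUDY_ONSET
        records.append({"id": sid, "lower": 0.0, "upper": STUDY_ONSET})
    else:
        # exact event time: lower and upper bounds coincide
        records.append({"id": sid, "lower": t, "upper": t})

for r in records:
    print(r)
```

Representing every event time as a (lower, upper) pair makes exact, left-censored, and (as on the next slide) interval-censored observations fit one common format.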
Interval censoring due to discrete observation times

[Figure: horizontal timelines for Subjects 1–8 along a Time/Age axis.
Legend: segment before event / segment after event; •: true event time;
◦: observations. Subjects are observed only at discrete visit times (◦),
so the true event time (•) is never seen directly: it is only known to
lie between the last visit at which the event had not yet occurred and
the first visit at which it had, i.e. the event time is
interval-censored.]
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
.
..
..
..
..
..
..
..
..
..
..
..
..
..
..
..
.
..
..
..
..
[Figure: observation times]

Advanced statistical methods 597
• Our focus will be on right censoring, i.e., either the true event time or a lower bound of it is observed
• Standard statistical tools for the analysis of censored observations assume random censoring:
Event time and censoring time are independent
• Counter examples:
. Patients entering the study later have a better prognosis due to increased experience of the surgeon =⇒ Negative association between censoring and event time
. Patients leaving the study because they get worse =⇒ Positive association between censoring and event time
25.2 Example: Myelomatosis
• Peto et al. [26]; Allison [27] p.26
• Data on 25 patients diagnosed with myelomatosis (Kahler’s disease), multiple malignant tumours in the bone marrow
• Patients randomly assigned to two drug treatments
• Event time is the time from moment of randomization to death
• Some event times are censored due to study termination
• Patients with normal and patients with impaired renal functioning at the moment of randomization
• Data:
Treat Duration Status Renal Treat Duration Status Renal
1 8 1 1 2 180 1 0
1 852 0 0 2 632 1 0
1 52 1 1 2 2240 0 0
1 220 1 0 2 195 1 0
1 63 1 1 2 76 1 0
1 8 1 0 2 70 1 0
1 1976 0 0 2 13 1 1
1 1296 0 0 2 1990 0 0
1 1460 0 0 2 18 1 1
1 63 1 1 2 700 1 0
1 1328 0 0 2 210 1 0
1 365 0 0 2 1296 1 0
2 23 1 1
Status:
. 0: Censored
. 1: Death
Renal:
. 0: Normal
. 1: Impaired
• Interest is in estimating and comparing the survival curves for patients with different treatments and for patients with different renal functioning at baseline
25.3 Kaplan-Meier estimate of survival curve
• Suppose interest is in estimating the survival curve for patients with treatment 1
• Observed data:
Duration: 8 852 52 220 63 8 1976 1296 1460 63 1328 365
Status: 1 0 1 1 1 1 0 0 0 1 0 0
How to account for the censoring?
• Simple ‘naive’ solutions:
. Ignoring the censored observations: Over-optimistic
. Treating censored observations as event times: Over-pessimistic
• Hence, correct account of the censoring process is necessary.
• The Kaplan-Meier (KM) estimate provides an unbiased estimate for S(t), assuming independent (non-informative) censoring.
• Most software packages allow calculation of the KM estimate.
• Input needed:
. Observed times
. The true status (‘event’ or ‘censored’)
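To make the product-limit construction concrete, here is a minimal Python sketch of the KM estimate for the treatment-1 data listed above (a from-scratch illustration, not the course software):

```python
# Kaplan-Meier product-limit estimator, sketched from first principles.
# Data: the 12 treatment-1 observations above (status 1 = death, 0 = censored).
durations = [8, 852, 52, 220, 63, 8, 1976, 1296, 1460, 63, 1328, 365]
status    = [1,   0,  1,   1,  1, 1,    0,    0,    0,  1,    0,   0]

def kaplan_meier(times, events):
    """Return {event time: S(t)}; S drops only at observed event times."""
    data = sorted(zip(times, events))
    surv, curve = 1.0, {}
    for t in sorted({tt for tt, e in data if e == 1}):
        at_risk = sum(1 for tt, _ in data if tt >= t)      # still under observation
        deaths  = sum(1 for tt, e in data if tt == t and e == 1)
        surv *= 1 - deaths / at_risk                       # product-limit step
        curve[t] = surv
    return curve

km = kaplan_meier(durations, status)
print(km)  # S(t) drops to 0.833, 0.75, 0.583, 0.5 at t = 8, 52, 63, 220
```

Censored observations never trigger a drop, but they do shrink the later risk sets; that is exactly how the censoring is ‘accounted for’.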
• KM estimate for patients receiving treatment 1:
25.4 Comparison of survival curves
• Often, interest is in the comparison of survival curves of different groups
• For the Myelomatosis data, interest may be to compare survival between the two treatment groups
• Also of interest is the comparison of survival for patients with impaired renal functioning with survival for patients with normal renal functioning.
• We will focus on the comparison of two groups, but extensions are available for the comparison of multiple groups
• For each group separately, the KM estimate for the survival curve can be calculated.
• KM estimates for both treatment groups:
• KM estimates for patients with normal and impaired renal functioning, respectively:
• Due to the censoring, classical tests such as the t-test and Wilcoxon test cannot be used for the comparison of the survival times
• Various tests have been designed for the comparison of survival curves when censoring is present
• The most popular ones are:
. Logrank test
. Wilcoxon (Gehan) test
• Note that the Wilcoxon test used here is different from the classical Mann-Whitney U test, also termed Wilcoxon test.
• The Logrank test has more power than Wilcoxon for detecting late differences
• The Logrank test has less power than Wilcoxon for detecting early differences
• Test results:
Effect of treatment: Logrank p=0.2468, Wilcoxon p=0.6260
Effect of renal functioning: Logrank p=0.0029, Wilcoxon p=0.0005
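The logrank test for the treatment comparison can be sketched from the standard observed-versus-expected formulas (the function and its details are our own illustration, not the course software):

```python
import math

# Two-sample logrank test, sketched from first principles.
# Data: all 25 myelomatosis patients, (time, status) with status 1 = death.
group1 = [(8,1), (852,0), (52,1), (220,1), (63,1), (8,1), (1976,0), (1296,0),
          (1460,0), (63,1), (1328,0), (365,0)]                    # treatment 1
group2 = [(180,1), (632,1), (2240,0), (195,1), (76,1), (70,1), (13,1),
          (1990,0), (18,1), (700,1), (210,1), (1296,1), (23,1)]   # treatment 2

def logrank(g1, g2):
    """Return (chi-square statistic, p-value) for H0: equal survival curves."""
    obs1 = sum(e for _, e in g1)                  # observed events in group 1
    exp1 = var = 0.0
    for t in sorted({tt for tt, e in g1 + g2 if e == 1}):
        n1 = sum(1 for tt, _ in g1 if tt >= t)    # at risk in group 1
        n2 = sum(1 for tt, _ in g2 if tt >= t)
        d  = sum(1 for tt, e in g1 + g2 if tt == t and e == 1)
        n  = n1 + n2
        exp1 += d * n1 / n                        # expected events in group 1
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    stat = (obs1 - exp1) ** 2 / var
    return stat, math.erfc(math.sqrt(stat / 2))   # chi-square(1) tail probability

stat, p = logrank(group1, group2)
print(round(stat, 2), round(p, 4))
```

The slides report p = 0.2468 for the treatment effect; this sketch reproduces that order of magnitude (tie-handling conventions in software can shift the last digits).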
25.5 Examples from biomedical literature
• Shatari et al. [7]:
. Methods, p.439:
. Figure 1, p.440:
• Blanchon et al. [28]:
. Statistical Methods, p.831:
. Figure 2, p.834:
• Moon and Lee [23]
. Analysis section, p.1427:
period were analyzed by logistic regression analysis. The effects on 7- and 30-day in-hospital mortality were analyzed by Kaplan–Meier survival and Cox proportional hazard regression analysis. The effects on reduced lengths
. Figure 3, p.1430:
Fig. 3. Kaplan–Meier survival curves for 7-day and 30-day in-hospital mortality.
∗ Confusing formulation: 7-day and 30-day mortality
∗ The outcome analysed was time till death, from which an estimate for the probability of dying within 7 (30) days results
Chapter 26
Regression for survival data
. Example
. Cox regression
. Application
. Model diagnostics
. Hazard rate
. Examples from biomedical literature
26.1 Example: Pneumonia data
• Klein and Moeschberger [29] p.14
• Data on 3470 children, to study risk factors for time to hospitalized pneumonia
• Censoring after the first year of life, if not earlier
• Overall, 73 (2.10%) of the children were reported to be hospitalized for pneumonia within the first year of life.
• We want to study the association between the time to hospitalized pneumonia and some child- and/or mother-specific characteristics
• KM estimate for time to hospitalized pneumonia:
• Note that the estimated probability of not experiencing hospitalized pneumonia is 97.7%, somewhat smaller than the observed proportion 100 − 2.10 = 97.9%, obtained without correction for the early censoring.
• The relation with the following potential risk factors is to be studied:
. Age of mother (Years)
. Presence of siblings (Yes: 48%, No: 52%)
. Smoking status mother (Yes: 34%, No: 66%)
. Urban environment (Yes: 76%, No: 24%)
. Alcohol use mother (Yes: 36%, No: 64%)
. Poverty status mother (Yes: 36%, No: 64%)
. Normal birthweight child (≥ 5.5 pounds ≈ 2.5 kg. Yes: 92%, No: 8%)
• The relation with factors can be studied using group-specific Kaplan-Meier estimates, together with Logrank and/or Wilcoxon tests
• Investigating the relation with covariates requires a regression-type model
• Relating the outcome to several factors and/or covariates simultaneously requires a multiple model allowing to include (a combination of) covariates and factors
• The most frequently used model is the Cox (proportional hazards) model
26.2 Cox regression
• Suppose interest is in studying the relation between the survival probability S(t) and some covariate X
• Examples:
X = Age mother
X = 1 if mother smokes, X = 0 if mother does not smoke
• Let S0(t) denote the survival function in case X = 0, and Sx(t) the survival function in case X = x, for a specific value x
• The Cox regression model assumes that:
Sx(t) = {S0(t)}exp(βx)
• In case β > 0:
x↗ =⇒ exp(βx)↗ =⇒ Sx(t) < S0(t)
Higher X-values associated with increased risk for event
• In case β < 0:
x↗ =⇒ exp(βx)↘ =⇒ Sx(t) > S0(t)
Higher X-values associated with reduced risk for event
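This ordering of the survival curves can be checked numerically; the short sketch below assumes an arbitrary exponential baseline curve S0(t) and an illustrative β value (neither is estimated from data):

```python
import math

# Numeric sketch of the Cox relation S_x(t) = S_0(t) ** exp(beta * x).
beta = 0.5                      # illustrative positive coefficient (assumption)

def S0(t):                      # arbitrary baseline survival curve (assumption)
    return math.exp(-0.01 * t)

def S(t, x):                    # Cox-model survival curve for covariate value x
    return S0(t) ** math.exp(beta * x)

for t in (10, 50, 100):
    # beta > 0: higher x gives a lower survival curve, i.e., higher risk
    assert S(t, 2) < S(t, 1) < S(t, 0) == S0(t)
print("ordering S_2 < S_1 < S_0 holds at every t checked")
```

With β < 0 the same check would show the reversed ordering, matching the interpretation above.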
• In case β > 0:
• In case β < 0:
• Note that the covariate model exp(βx) expresses how much higher (lower) the risk is for people with X = x than for people with X = 0.
• Hence, the model is a relative model and therefore does not contain an intercept.
• Note also that the model can easily be generalized to models with multiple covariates
• For example, a model with two covariates X1 and X2 is given by:
Sx1,x2(t) = {S0(t)}exp(β1x1+β2x2)
• Factors can easily be incorporated using the regression notation with dummy coding.
26.3 Application: Pneumonia data
• Effects of all risk factors separately, assessed using simple Cox regression models:
Effect β p-value
Age of mother −0.0985 0.0275
Urban environment −0.4523 0.0695
Alcohol use −0.0535 0.8282
Normal birthweight child −0.2412 0.5439
Smoking of mother 0.7958 0.0007
Poverty of mother 0.5963 0.0109
Presence of siblings 0.6436 0.0079
• Some evidence for positive effect (later event) of:
. Older mother
. Urban environment
• Some evidence for negative effect (earlier event) of:
. Smoking of mother
. Poverty of the mother
. Presence of siblings
• The positive effect of an urban environment is somewhat surprising and may be explained by other characteristics to which it is related.
• Also the effect of poverty of the mother might be explained by other factors related to poverty.
• We therefore fit a multiple Cox model, assessing the effect of all covariates/factors simultaneously:
Simple models Multiple model
Effect β p-value β p-value
Age of mother −0.0985 0.0275 −0.1287 0.0107
Urban environment −0.4523 0.0695 −0.3509 0.1616
Alcohol use −0.0535 0.8282 −0.1213 0.6374
Normal birthweight child −0.2412 0.5439 −0.0152 0.9697
Smoking of mother 0.7958 0.0007 0.7289 0.0028
Poverty of mother 0.5963 0.0109 0.2778 0.2586
Presence of siblings 0.6436 0.0079 0.7557 0.0042
• There is no evidence anymore for an effect of living in an urban environment. This can be explained from:
. There are significantly fewer smoking mothers (32.74%) in urban environments compared to rural environments (38.63%): Chi-squared test, p = 0.0018.
. There are significantly fewer siblings (46.95%) in urban environments compared to rural environments (51.74%): Chi-squared test, p = 0.0158.
• There is no evidence anymore for an effect of poverty of the mother. This can be explained from:
. There are significantly more smoking mothers (40.30%) in poor circumstances compared to non-poor circumstances (30.69%): Chi-squared test, p < 0.0001.
. There are significantly more siblings (58.41%) with poor mothers compared to non-poor mothers (42.30%): Chi-squared test, p < 0.0001.
• To assess the effect of the risk factors, we compare survival when the risk factors are present to survival when the risk factors are absent, keeping the others constant (rural environment, no alcohol use, normal birthweight, no poverty):
. Present: Mother 20yrs old, smoking, with other children
. Absent: Mother 30yrs old, not smoking, without other children
26.4 Model diagnostics
• The Cox regression model assumes that:
Sx(t) = {S0(t)}exp(βx)
• Checking the above assumption is difficult since:
. The event times are not always fully observed (censoring)
. The ‘baseline’ survival curve S0(t) is left unspecified
• Most software packages do not include tools to easily check the assumption.
26.5 Hazard rate
• Let us consider a Cox model with two covariates X1 and X2:
Sx1,x2(t) = {S0(t)}exp(β1x1+β2x2)
We then have that
ln[Sx1,x2(t)] = ln[S0(t)] exp(β1x1 + β2x2)
• Hence, the relative change in risk associated with a one-unit increase in X1 equals
ln[Sx1+1,x2(t)] / ln[Sx1,x2(t)] = {ln[S0(t)] exp(β1(x1 + 1) + β2x2)} / {ln[S0(t)] exp(β1x1 + β2x2)} = exp(β1)
• The term exp(β1) is called the hazard ratio (hazard rate, HR)
• The HR expresses the relative change in risk (measured on the logarithmic scale), associated with a one-unit increase in the covariate, from X1 = x1 to X1 = x1 + 1.
• This change in risk is not only independent of x1, but also of x2.
• This is in full analogy to the interpretation of a regression coefficient in a linear model, or the OR in a logistic model.
• Caution is needed, however, in polynomial models or models with interactions, where individual regression parameters cannot always be interpreted.
• Because this change in risk is also independent of t, the Cox model is also termed ‘the proportional hazards model’.
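The constancy of exp(β1) across t, x1 and x2 can be verified numerically under an assumed baseline curve (all numbers below are illustrative assumptions):

```python
import math

# Check numerically that ln S_{x1+1,x2}(t) / ln S_{x1,x2}(t) = exp(beta1)
# for every t, x1 and x2 (baseline curve and betas are assumptions).
b1, b2 = 0.3, -0.2

def S0(t):                          # arbitrary baseline survival curve
    return math.exp(-0.02 * t)

def S(t, x1, x2):                   # two-covariate Cox model survival curve
    return S0(t) ** math.exp(b1 * x1 + b2 * x2)

for t in (5, 50, 500):
    for x1 in (0, 1, 3):
        for x2 in (0, 2):
            ratio = math.log(S(t, x1 + 1, x2)) / math.log(S(t, x1, x2))
            assert abs(ratio - math.exp(b1)) < 1e-9   # constant hazard ratio
print("hazard ratio exp(b1) =", round(math.exp(b1), 4), "independent of t, x1, x2")
```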
26.6 Examples from biomedical literature
• Nawrot et al. [30]:
. Statistical analyses, p.122:
∗ Power analyses based on logrank
∗ Cox regression to adjust for potentially important covariates
∗ Sensitivity analyses to check whether results depend on covariates included.
. Figure 2, p.122:
. Results, p.124:
• Hutchins et al. [31]:
. End Point Definitions and Statistical Analysis, p.8315:
∗ OS: Overall survival
∗ DFS: Disease free survival
. Results (p.8316), Figures 4 and 5 (p.8318):
Fig 5. Overall survival (OS) by hormone receptor (HR) status with and without tamoxifen (TAM). HR+, HR positive; HR−, HR negative.
Fig 4. Disease-free survival (DFS) by hormone receptor (HR) status with and without tamoxifen (TAM). HR+, HR positive; HR−, HR negative.
HR: postmenopausal hormone receptor positive; TAM: Tamoxifen group
• Brown et al. [32]
. Statistical analysis section, p.1086:
(HR: 1.04; 95% CI: 1.01–1.07). Given the significant gender × depression interaction, gender-stratified Cox proportional hazard models were used to examine the effects of frailty characteristics, depression, and depression by frailty characteristic interactions on survival time. Dummy-coded variables for frailty, missing data, and depression status were used for single predictor models. For these single predictor models, Bonferroni correction on the false-positive error rate was used to account for multiple comparisons (α of 0.05 adjusted for four frailty characteristics: α = 0.0125).
. Table 3, p.1091:
TABLE 3. Site-Stratified Proportional Hazards Models with Multiple Predictors of Frailty Characteristics for Depressed and Nondepressed Men and Women
Frailty Predictors
Nondepressed (N = 543): Men (N = 278), Women (N = 265) — Depressed (N = 261): Men (N = 84), Women (N = 177)
Columns for each group: Wald χ², HR (95% CI), p
Low physical activities 10.48 0.005 1.25 0.535 1.81 0.404 3.66 0.160
Yes vs. no 7.27 1.76 (1.17, 2.65) 0.007 1.13 1.34 (0.78, 2.28) 0.287 1.61 1.68 (0.76, 3.74) 0.204 2.86 1.56 (0.93, 2.61) 0.091
Missing vs. no 4.85 2.80 (1.12, 7.01) 0.028 0.01 0.95 (0.36, 2.50) 0.917 0.09 1.25 (0.29, 5.46) 0.765 0.19 0.81 (0.32, 2.05) 0.662
Fatigue 4.90 0.086 3.67 0.160 6.46 0.040 5.62 0.060
Yes vs. no 3.82 1.51 (1.00, 2.28) 0.051 2.73 1.54 (0.92, 2.56) 0.098 1.02 1.48 (0.69, 3.15) 0.313 5.40 1.94 (1.11, 3.40) 0.020
Missing vs. no 2.24 1.46 (0.89, 2.41) 0.134 1.96 1.52 (0.85, 2.74) 0.161 6.29 2.90 (1.26, 6.67) 0.012 2.81 1.66 (0.92, 3.00) 0.094
Slow gait speed 5.99 0.050 1.89 0.388 2.17 0.338 6.69 0.035
Yes vs. no 4.96 1.54 (1.05, 2.26) 0.026 1.04 1.30 (0.79, 2.14) 0.307 1.01 1.52 (0.67, 3.45) 0.315 4.59 1.84 (1.05, 3.21) 0.032
Missing vs. no 1.92 1.71 (0.80, 3.64) 0.166 1.38 1.85 (0.66, 5.18) 0.240 1.95 2.67 (0.67, 10.59) 0.163 5.47 2.71 (1.18, 6.26) 0.019
Low grip strength 2.34 0.310 4.92 0.085 0.64 0.725 1.60 0.452
Yes vs. no 0.01 1.02 (0.65, 1.59) 0.934 4.64 1.83 (1.06, 3.18) 0.031 0.63 1.35 (0.64, 2.81) 0.429 0.43 1.19 (0.70, 2.02) 0.513
Missing vs. no 2.19 0.55 (0.25, 1.22) 0.139 0.85 1.52 (0.63, 3.69) 0.355 0.05 1.15 (0.33, 4.02) 0.827 1.46 1.59 (0.75, 3.36) 0.228
Notes: Cox proportional hazard models were used to explore the simultaneous effects of the baseline frailty characteristics in separate models for nondepressed and depressed men and women. Wald χ² values are listed (df = 2 for the overall effect of each frailty characteristic, df = 1 for each subgroup comparison); HRs (and 95% CI) are for multiple predictor models with the inclusion of other frailty characteristics as covariates.
∗ Separate analyses for depressed and non-depressed, males and females, due to important interaction effects
∗ Outcome (overall survival) not mentioned explicitly
∗ Dummy coding for factors, with ‘no’ group as reference group
∗ HR’s to quantify the risk
Part X
Further Topics
Chapter 27
Clustered data
. Data set: Washing without water
. Naive analysis
. Correction for clustering
. A mixed model
. Empirical Bayes estimates
. Other examples
. Examples from the biomedical literature
27.1 Data set: Washing without water
• Schoonhoven et al. [33]
• Comparison of traditional washing (soap & water) with the use of disposable wash gloves, made of non-woven material, saturated with quickly vaporizing cleaning & caring lotions
• Nursing home residents requiring bathing by nurses
• 56 nursing home wards (±500 residents) randomized:
. Usual Care (UC: traditional bathing)
. Washing without water (WWW)
• Exclusion: In bath or shower > 1 day/week
• Outcome of interest is ‘Completeness of assisted bathing (1/0)’ after 4 weeks post randomization
• Correction for dementia (1/0)
• Other covariates (age, gender, Barthel index, BMI, skin damage, . . . ) explored as well
27.2 Naive analysis
• Logistic regression with factors ‘intervention’ and ‘dementia’
• Results:
Effect OR 95% C.I. p-value
Intervention: WWW 4.739 [3.155; 7.143] <0.0001
UC
Dementia: NO 1.508 [1.005; 2.268] 0.0475
YES
• Bathing completeness more likely . . .
. . . . in WWW intervention group
. . . . in non-demented residents
27.3 Correction for clustering
• The analysis did not account for the variability between wards w.r.t. the proportion of residents with complete bathing:
[Histogram: distribution of ward-specific proportions of residents with complete bathing]
• Variability implies residents from one ward to be more alike than residents from different wards
=⇒ Correlated data
• All models discussed so far assumed all observations to be independent
• This correlation should be accounted for in the statistical analysis
=⇒ Mixed (multilevel) models
• Mixed models are the most popular models for the analysis of clustered data
27.4 A mixed model
• Let Yij be the binary outcome for patient j in ward i.
• Furthermore, let Iij be a dummy variable equal to 1 if the patient belongs to the WWW group and 0 otherwise.
• Likewise, let Dij be a dummy variable equal to 1 if the patient is not demented and 0 otherwise.
• The logistic model fitted equals:
P (Yij = 1) = exp(β0 + β1Iij + β2Dij) / [1 + exp(β0 + β1Iij + β2Dij)]
• In order to account for the fact that different wards can have different success probabilities, we add a ward-specific term bi:
P (Yij = 1) = exp(bi + β0 + β1Iij + β2Dij) / [1 + exp(bi + β0 + β1Iij + β2Dij)]
• Patients in a ward with a (very) high value for bi are (very) likely to have received complete bathing
• Patients in a ward with a (very) low value for bi are (very) unlikely to have received complete bathing
• Each ward has its own bi parameter
• Since wards are believed to be sampled from a population of wards, the parameters bi can be viewed as being sampled from a population of ward effects
• Therefore, the parameters are often assumed random:
bi ∼ N (0, σ2b)
• The normality assumption is mathematically convenient
• The assumption of mean zero is interpretationally convenient:
. bi = 0: A ward with average / median ‘risk’ for complete bathing
. bi > 0: A ward with higher than average / median ‘risk’ for complete bathing
. bi < 0: A ward with lower than average / median ‘risk’ for complete bathing
• The variance σ2b expresses the variability between wards, hence tells us how different the ‘risk’ for complete bathing is between wards
• Much (little) between-ward variability (σ2b large/small) implies much (little) correlation
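This link between between-ward variability and within-ward correlation can be made visible with a small simulation (all parameter values below are illustrative assumptions, not estimates from the WWW study):

```python
import math
import random

# Simulation sketch: ward random intercepts b_i ~ N(0, sigma_b^2) create
# between-ward variability well beyond binomial noise under independence.
random.seed(1)
sigma_b, beta0, n_wards, n_res = 2.0, 0.0, 56, 20   # illustrative values

props = []
for _ in range(n_wards):
    b = random.gauss(0.0, sigma_b)                      # ward-specific effect
    p = 1 / (1 + math.exp(-(beta0 + b)))                # ward success probability
    y = sum(random.random() < p for _ in range(n_res))  # residents' outcomes
    props.append(y / n_res)

p_bar   = sum(props) / n_wards
between = sum((q - p_bar) ** 2 for q in props) / (n_wards - 1)
binom   = p_bar * (1 - p_bar) / n_res   # variance expected if wards were identical
print(round(between, 3), round(binom, 4))
```

With a large σ2b, the observed between-ward variance of the proportions far exceeds the binomial benchmark; a naive analysis that treats all residents as independent ignores exactly this excess.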
• The different nature of the regression parameters β0, β1 and β2 on the one hand and the parameters bi on the other is reflected in the terminology:
. Fixed effects β0, β1 and β2: If the experiment were to be repeated, the same parameters would appear in the model because the same population would be studied
. Random effects bi: If the experiment were to be repeated, different parameters would appear in the model because different wards would be sampled
• A model with fixed effects as well as random effects is termed a mixed effects model, or briefly a mixed model
• Fitting the mixed model to our data set leads to the following results:
Naive Correct
Effect OR 95% C.I. p-value OR 95% C.I. p-value
Intervention: WWW 4.739 [3.155; 7.143] <0.0001 12.821 [4.566; 35.714] <0.0001
UC
Dementia: NO 1.508 [1.005; 2.268] 0.0475 1.271 [0.883; 1.828] 0.1962
YES
• Conclusion:
Effects of covariates highly affected by correlation within clusters
• The between-ward variance estimate equals σ2b = 4.4617
• The original naive logistic regression model assumed no variability between wards, i.e., assumed σ2b = 0
• In general, any clustering in the data should be accounted for in the analysis
• The impact on covariates of interest very much depends on the variability between clusters, i.e., on the correlation between observations within clusters.
• In our example, a logistic regression model was extended with random effects, yielding a logistic mixed model. Likewise, all models discussed so far can be turned into a mixed model in order to account for clustering.
• Terminologies used: linear mixed models, generalized linear mixed models, . . .
27.5 Empirical Bayes estimates
• So far, we have focused on inference for the fixed effects.
• In some cases, scientific interest may be in the random effects themselves, rather than the fixed effects.
• As an example, we consider data from the Diabetes Project Leuven (DPL), in which general practitioners (GP’s) were invited to participate in an intervention program aiming at improving care through providing support to the GP’s
• The intervention was to provide structured assistance to GP’s by a diabetes care team, consisting of a nurse educator, a dietician, an ophthalmologist and an internal medicine doctor.
• We consider the analysis of 61 GP’s with a total of 1577 patients, the number per GP ranging from 5 to 138
• The outcome studied is HbA1c, glycosylated hemoglobin, after one year in the intervention program:
. Molecule in red blood cells that attaches to glucose (blood sugar)
. High values reflect more glucose in blood
. In diabetes patients, HbA1c gives a good estimate of how well diabetes is being managed over the last 2 or 3 months
. Non-diabetics have values between 4% and 6%
. HbA1c above 7% means diabetes is poorly controlled, implying higher risk for long-term complications.
• More specifically, interest was in the dichotomized version Y = 1 if HbA1c < 7%, Y = 0 if HbA1c ≥ 7%
• A logistic mixed model can be used:
P (Yij = 1) = exp(bi + β0) / [1 + exp(bi + β0)]
• Patients with a high P (Yij = 1) value are likely to reach the target
• This probability depends on the GP effect bi
• Patients treated by a GP with a high (positive) value bi are likely to reach the target. Patients treated by a GP with a low (negative) value bi are not likely to reach the target.
• Hence, ‘successful’ GP’s are those with a high value bi, while less ‘successful’ GP’s have lower values bi
• Therefore, it is of interest to estimate the random effects bi in order to be able to identify (un-)successful GP’s
• Those estimates are called Empirical Bayes (EB) estimates.
• In order to have a fair comparison of GP’s, correction is needed for GP and patient characteristics such as:
. Practice form (1, 2, or > 2 GP’s in one practice)
. BMI of patient at the moment of enrollment in the study
. New, indicating whether patient is newly diagnosed as diabetes patient
• Correction can be done by including the factors and covariate in the logistic model
• Practice will be coded with two indicator variables P1 and P2
• BMI is a continuous covariate (B)
• New will be coded with one indicator variable N
• The logistic mixed model then becomes:
P (Yij = 1) = exp(bi + β0 + β1P1i + β2P2i + β3Bij + β4Nij) / [1 + exp(bi + β0 + β1P1i + β2P2i + β3Bij + β4Nij)]
• Studying the EB estimates for the GP effects bi allows comparison of GP’s, assuming that they all would work in the same practice form and would have patients with the same characteristics (BMI, New).
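For a single GP, the idea behind an EB estimate can be sketched as the posterior mode of bi, here found by a simple grid search with β0 and σb treated as known (an assumption; in a real fit they are estimated jointly with the fixed effects):

```python
import math

# Sketch of an Empirical Bayes estimate of one GP effect b_i: the posterior
# mode under an intercept-only logistic mixed model. beta0 and sigma_b are
# illustrative known values, not estimates from the DPL data.
beta0, sigma_b = 0.0, 1.0

def eb_estimate(successes, n):
    """Posterior mode of b_i given this GP's target counts (coarse grid)."""
    def log_post(b):
        p = 1 / (1 + math.exp(-(beta0 + b)))
        return (successes * math.log(p) + (n - successes) * math.log(1 - p)
                - b * b / (2 * sigma_b ** 2))       # N(0, sigma_b^2) prior term
    return max((i / 1000 for i in range(-5000, 5001)), key=log_post)

good_gp = eb_estimate(18, 20)   # 90% of this GP's patients reach the target
poor_gp = eb_estimate(4, 20)    # only 20% reach the target
print(good_gp, poor_gp)         # positive vs negative estimated GP effect
```

Note the shrinkage: the estimate for the first GP stays below the raw logit ln(18/2) ≈ 2.20, because the normal prior pulls extreme GP’s toward the average.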
• A histogram of EB estimates is helpful to identify well-performing and poorly performing GP’s:

[Histogram of the EB estimates of the GP effects bi]
• Conclusion:
EB estimates can be used to identify ‘outlying’ clusters, after correction for systematic effects and/or differences
27.6 Multiple levels of clustering
• In mixed models, a ‘classical’ statistical model is extended with a random effect which accounts for variability between clusters.
• The same idea can be used for the analysis of data with multiple levels of clustering.
• As an example, re-consider the logistic mixed model used for the analysis of the ‘Washing without water’ data set on nursing home wards:
P (Yij = 1) = exp(bi + β0 + β1Iij + β2Dij) / [1 + exp(bi + β0 + β1Iij + β2Dij)],
with Yij the (binary) outcome on patient j within ward i.
• Suppose some wards belong to the same nursing home.
• We then have patients clustered within wards, which themselves are clustered within nursing homes.
• This additional level of clustering can be accommodated by adding an additional random effect for each nursing home, on top of the random effect for each ward.
• Let Ykij be the (binary) outcome on patient j, within ward i, in nursing home k.
• A mixed model which accounts for variability between wards and between nursing homes is:
P (Ykij = 1) = exp(ak + bki + β0 + β1Iij + β2Dij) / [1 + exp(ak + bki + β0 + β1Iij + β2Dij)].
• As before, the random effects are assumed normally distributed, but with different variances: ak ∼ N (0, σ2a), bki ∼ N (0, σ2b).
27.7 Other examples
Clustering =⇒ Correlation
• Residents clustered within wards
• Patients clustered within hospitals
• Ophthalmology studies: Eyes within patients (−→ paired t-test)
• Longitudinal studies (see later)
• . . .
27.8 Examples from the biomedical literature
• Smeds-Alenius et al. [34]
. Statistical methods section, p.120:
We used separate adjusted multivariate logistic regression models to estimate the relationship of RN assessed quality of care and RN assessed patient safety to 30-day inpatient mortality (LaValley, 2008). In all regression analyses, a mixed model approach with random intercept was used to correct for the dependency of observations within a hospital. Confidence intervals were set at 95%. Data were analyzed using SAS 9.4.
∗ RN: registered nurse
∗ Random effect for correction due to clustering in hospitals
Advanced statistical methods 663
. Table 3, p.121:
Table 3
Relationships between RNs who report excellent quality of care and/or patient safety and the outcome of 30-day inpatient mortality.

                                                  Unadjusted model              Adjusted model (a)
                                                  OR    95% CI     Pr > ChiSq   OR    95% CI     Pr > ChiSq
Quality of care
  Middle tertile vs lowest tertile hospitals      0.82  0.66–1.00  0.055        0.86  0.72–1.04  0.112
  Highest tertile vs lowest tertile hospitals     0.79  0.61–1.02  0.067        0.77  0.65–0.91  0.002
Patient safety
  Middle tertile vs lowest tertile hospitals      0.92  0.75–1.13  0.450        0.82  0.68–1.00  0.048
  Highest tertile vs lowest tertile hospitals     0.68  0.52–0.90  0.006        0.74  0.60–0.91  0.004

(a) Adjustments were made for patient characteristics (gender, age, comorbidities, surgical DRGs, emergency room admittance) and hospital characteristics (size, level of specialization, teaching status).
∗ Outcome: Death within 30 days of admission
∗ Factors of interest: Quality of care and patient safety, as reported by RNs
∗ Regression notation for the factors, after discretisation into 3 tertile groups
∗ Analysis with and without correction for patient and hospital characteristics
Advanced statistical methods 664
• Fisher et al. [35]
. Statistical analysis section, p.1846:
[. . . ] surgeon volume category. Multilevel logistic regression with surgeons and hospitals as crossed random effects was used to estimate odds ratios (OR) of receiving mastectomy by surgeon volume, adjusting for year of diagnosis, age at diagnosis, geographic region, ER/PR status, tumor size, and nodal status, as well as for interaction of all variables with stage. Postestimation lincom commands were used to calculate the OR for the variables of interest by stage. Crossed random effects were necessary because some surgeons operated out of multiple hospitals. Interaction between year of diagnosis and surgeon volume was evaluated and found to be nonsignificant. Empirical Bayes estimation was used to estimate adjusted OR for individual surgeons and hospitals. All statistical analyses were performed by SAS 9.3 statistical software (SAS Institute, Cary, NC, USA) and Stata 12.1 (StataCorp, College Station, TX, USA).
∗ Outcome: Receiving mastectomy
∗ Covariate of interest: Surgeon volume
∗ Clustering within hospitals
∗ Clustering within surgeons
∗ 2 random effects
∗ Logistic mixed model
∗ EB estimates used to get the bi
∗ ORs computed as exp(bi)
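The last two bullets amount to a single exponentiation: a surgeon's EB-estimated random intercept bi is turned into an odds ratio relative to the average surgeon via exp(bi). A small Python illustration with made-up EB estimates:

```python
import math

# Hypothetical EB estimates of surgeon-specific random intercepts b_i
eb_estimates = {"surgeon_A": 0.69, "surgeon_B": -0.92, "surgeon_C": 0.0}

# OR relative to the 'average' surgeon (b_i = 0) is exp(b_i):
odds_ratios = {s: math.exp(b) for s, b in eb_estimates.items()}

# b_i = 0 gives OR = 1 (indistinguishable from the average surgeon);
# b_i > 0 means higher odds of mastectomy than average, b_i < 0 lower.
```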
Advanced statistical methods 665
. Figure 1, p.1846:
[Figure: two panels of adjusted odds-ratio estimates (log scale, 0.1–8) for individual surgeons (panel a) and individual hospitals (panel b), relative to the Alberta average, with axis annotations ranging from ‘More BCS’ to ‘More Mastectomy’; very low volume surgeons (N=42), low volume surgeons (N=28), and low volume hospitals (N=17) are flagged]
FIG. 1 Empirical Bayes estimates of adjusted OR of mastectomy for breast cancer patients by (a) individual surgeon and (b) individual hospital, adjusting for patient characteristics and accounting for variation by surgeon volume
∗ Much more variability between surgeons than between hospitals
Advanced statistical methods 666
Chapter 28
Longitudinal data / Repeated measures
. Example: Longitudinal MMSE evolutions
. Repeated measures ANOVA
. Model extensions
. Examples from the biomedical literature
Advanced statistical methods 667
28.1 Example: Longitudinal MMSE evolutions
• In the delirium data set, MMSE was measured 5 times post operation, at days 1, 3, 5, 8, and 12
• This allows us to study how patients have evolved over time
• This also allows us to study how such evolutions depend on patient characteristics
• As an illustration, we want to investigate whether MMSE has a different evolution for neuro-psychiatric patients than for non-neuro-psychiatric patients
Advanced statistical methods 668
• Individual trends:
[Figure: individual MMSE profiles over days 1–12, by neuro-psychiatric status]
• Obviously, there is skewness in the data
Advanced statistical methods 669
• This is also supported by a histogram at time 1:
[Histogram of MMSE at time 1 (proportion vs MMSE), showing the skewness]
• Since all models for continuous data assume normality, we use an exponential transformation:
MMSE −→ exp(MMSE/30)
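The transformation and its inverse are one-liners; a minimal Python sketch (the function names are my own):

```python
import math

def to_analysis_scale(mmse):
    """Transform an MMSE score (range 0-30) as on the slide: exp(MMSE/30)."""
    return math.exp(mmse / 30.0)

def to_mmse_scale(y):
    """Back-transform an analysis-scale value to the original MMSE scale."""
    return 30.0 * math.log(y)

# The transformed scores live in the interval [1, e], i.e. roughly [1.00, 2.72]
transformed = [to_analysis_scale(s) for s in (0, 15, 30)]
```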
Advanced statistical methods 670
• New histogram (at time 1):
[Histogram of exp(MMSE/30) at time 1 (proportion vs transformed score, roughly 1.00–2.75), much more symmetric]
• Perfect normality is not obtained, but covariates have not been included yet and symmetry is much better satisfied than before.
Advanced statistical methods 671
• Individual trends after transformation:
[Figure: individual exp(MMSE/30) profiles over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 672
28.2 Repeated measures ANOVA
• Longitudinal data can be viewed as a particular instance of clustered data, in which observations are clustered within subjects
• Hence, mixed models can be used, the most popular one being repeated measures ANOVA
• The model is an extension of the ANOVA model with subject-specific random effects to account for between-subject variability
• As an illustration, we fit a two-way repeated measures ANOVA model with fixed factors ‘time,’ ‘neuro-status,’ and the interaction between both, and with random effects for the patients
Advanced statistical methods 673
• Results:
[Table of test results, not legible in the transcript]
• Note that omitting the random effect, i.e., ignoring the clustering, leads to different results:
[Table of test results without the random effect, not legible in the transcript]
Advanced statistical methods 674
• Predicted evolutions:
[Figure: predicted average evolutions of exp(MMSE/30) over days 1–12, by neuro-psychiatric status]
• There is no evidence for any interaction between ‘time’ and ‘neuro’, suggesting that the evolution over time is not different for the two neuro groups (p = 0.3482)
Advanced statistical methods 675
• Leaving out the interaction term leads to a model which assumes the same average evolutions for both groups:
[Figure: predicted average evolutions under the model without interaction, by neuro-psychiatric status]
Advanced statistical methods 676
• The test results for the resulting model are:
[Table of test results, not legible in the transcript]
• Hence, the general conclusion is:
. Both groups have the same evolution over time (p = 0.3482)
. The trend is not constant over time (p < 0.0001)
. Neuro-psychiatric patients, on average, have lower MMSE values (p < 0.0001)
• Note that the trend seems very minor, while being highly significant.
Advanced statistical methods 677
• This can be explained by the fact that the observed trend is on the transformed scale.
• Back-transforming the average trends leads to:
[Figure: back-transformed average MMSE trends over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 678
28.3 Model extensions
28.3.1 Inclusion of additional covariates and/or factors
• The repeated measures ANOVA model can be extended in various ways.
• The repeated measures ANOVA model can easily be extended with additional covariates or factors:
. To correct for differences between groups to be compared, e.g., age, gender
. To study how evolutions depend on additional patient characteristics, e.g., age, gender
• This turns the basic ANOVA model into a general linear model, extended with random effects
• The same ideas apply when analysing categorical longitudinal outcomes. For example, a binary outcome can be analysed using a logistic mixed model.
Advanced statistical methods 679
28.3.2 Categorical or continuous time effects ?
• In our example, ‘time’ was treated as a categorical factor. If the outcome shows a linear trend over time, ‘time’ can be treated as a continuous covariate.
• In our analysis of the MMSE outcome, let Yij be the jth measurement of MMSE for patient i, measured at time tj, and let Xi be a dummy variable for the neuro-status (1 for neuro-psychiatric patients).
• A model with linear time effect is given by:
Yij = bi + β0 + β1tj + β2Xi + β3tjXi + εij
    = bi + β0 + β1tj + εij,                 if not neuro-psychiatric
    = bi + β0 + β2 + (β1 + β3)tj + εij,     if neuro-psychiatric
• As before, the random effect bi ∼ N (0, σ²b) accounts for the variability between subjects, and hence for the correlation in the data.
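To make the two implied regression lines concrete, the sketch below evaluates the population-average response for each group; the coefficient values are hypothetical, not the actual estimates from the delirium data:

```python
# Hypothetical coefficients for the linear-time model on the exp(MMSE/30) scale
beta0, beta1, beta2, beta3 = 1.60, 0.02, -0.35, 0.005

def mean_response(t, neuro):
    """Population-average transformed MMSE at day t post-operation."""
    x = 1 if neuro else 0
    return beta0 + beta1 * t + beta2 * x + beta3 * t * x

# The model implies two straight lines, with slopes beta1 and beta1 + beta3:
slope_non_neuro = beta1
slope_neuro = beta1 + beta3
```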
Advanced statistical methods 680
• Graphically, this leads to two average regression lines, one for each group:
[Figure: two average regression lines of exp(MMSE/30) versus day, one per neuro group]
• The only difference when compared to ANOCOVA is that the random effect bi now accounts for clustering
Advanced statistical methods 681
• The test results for the resulting model are:
[Table of test results, not legible in the transcript]
• Again, there is no evidence for a different trend in the two neuro groups (p = 0.2601)
• Hence the model can be simplified by omitting the interaction:
Yij = bi + β0 + β1tj + β2Xi + εij
    = bi + β0 + β1tj + εij,            if not neuro-psychiatric
    = bi + β0 + β2 + β1tj + εij,       if neuro-psychiatric
Advanced statistical methods 682
• Obtained average trends:
[Figure: fitted average linear trends of exp(MMSE/30) over days 1–12, by neuro-psychiatric status]
Advanced statistical methods 683
• Test results for the remaining main effects:
[Table of test results, not legible in the transcript]
• Hence, the general conclusion is:
. Both groups have the same linear evolution over time (p = 0.2601)
. The trend is not constant over time (p < 0.0001)
. Neuro-psychiatric patients, on average, have lower MMSE values (p < 0.0001)
• Note that the linear trend is on the transformed scale, needed to obtain valid inferences
Advanced statistical methods 684
• Back-transforming the average trends leads to:
¸¹º
» ¼
» ½
¾ ¼
¾ ½
¿ À Á Â Ã Ä Å Æ Ç È¼ » ¾ É Ê ½ Ë Ì Í Î » ¼ » » » ¾
Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ò Ú Ø Û Ý Þ Ï Ó Ü Ý Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ò Ú Ø
• While both back-transformed trends look linear, they are not, due to the non-linear transformation
Advanced statistical methods 685
28.3.3 Adding more random effects
• Let us re-consider the model with linear time-effect used to describe the non-neuro-psychiatric patients in the delirium study:
Yij = bi + β0 + β1tj + εij
• The model can be interpreted as an ANOCOVA model with covariate ‘time’ and a random patient factor, needed to account for clustering.
• The model can also be interpreted as a subject-specific regression model with subject-specific intercepts:
Yij = (bi + β0) + β1tj + εij
• Because the bi have mean zero, the average evolution in the population is:
Yij = β0 + β1tj + εij
Advanced statistical methods 686
• Individual subjects deviate from the average by having their own intercept bi + β0
• Graphical representation of this random-intercepts model:
[Figure: parallel subject-specific profiles scattered around the average line Y = β0 + β1t, with between-subject variance σ²b]
Advanced statistical methods 687
• As a consequence, the model assumes:
. Subjects show approximately parallel profiles
. The variability does not change over time
• Obviously, this is not always/often realistic, especially in studies with many repeated measurements and/or long follow-up times.
• A possible solution is to extend the model allowing for subject-specific slopes as well:
Yij = (bi0 + β0) + (bi1 + β1)tj + εij,
where now bi0 and bi1 are both assumed normally distributed with means 0 and variances σ²b0 and σ²b1, respectively.
• Because the random effects still have mean zero, the average evolution in the population still is:
Yij = β0 + β1tj + εij
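A quick simulation confirms that the subject-specific lines average out to the population line β0 + β1t; all parameter values below are invented for illustration:

```python
import random

random.seed(2)

# Hypothetical population parameters and random-effect standard deviations
beta0, beta1 = 1.5, 0.02
sigma_b0, sigma_b1 = 0.20, 0.01

def subject_line(t, b0, b1):
    """Subject-specific mean at time t, given random intercept b0 and slope b1."""
    return (b0 + beta0) + (b1 + beta1) * t

# Average many subject-specific lines at day t = 8:
t, n = 8.0, 20000
avg = sum(subject_line(t, random.gauss(0.0, sigma_b0),
                       random.gauss(0.0, sigma_b1)) for _ in range(n)) / n
# avg should be close to the population mean beta0 + beta1 * t = 1.66
```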
Advanced statistical methods 688
• Graphical representation of the model with random intercepts and slopes:
[Figure: subject-specific lines with their own intercepts (variance σ²b0) and slopes (variance σ²b1) around the average line Y = β0 + β1t]
• As before, EB estimates can be used to identify subjects with particular evolutions, i.e., particular intercepts and/or slopes.
Advanced statistical methods 689
28.4 Examples from the biomedical literature
• Malmstrom et al. [36]:
. Title:
The effect of a nurse led telephone supportive care programme on patients’ quality of life, received information and health care contacts after oesophageal cancer surgery—A six month RCT-follow-up study
Marlene Malmström a,b,c,*, Bodil Ivarsson a,c,d, Rosemarie Klefsgård a, Kerstin Persson a,b, Ulf Jakobsson c,e, Jan Johansson a,b,c
a Skåne University Hospital, Lund, Sweden; b Department of Surgery, Skåne University Hospital, Lund, Sweden; c Lund University, Sweden; d Department of Cardio-Thoracic Surgery, Skåne University Hospital, Lund, Sweden; e Center for Primary Health Care Research, Faculty of Medicine, Lund University, Sweden
∗ Randomized controlled trial (RCT)
∗ Longitudinal, 6-month follow-up
Advanced statistical methods 690
. Study design, Figure 2, p.89:
Fig. 2. Overview of the intervention divided on control group (CG) and intervention group (IG). Measurement points, Follow-up with surgeon.
∗ Five repeated measurements
∗ Measurements at discharge, 2w, 2m, 4m, and 6m (•)
Advanced statistical methods 691
. Statistical analysis section, p.90:
[. . . ] over time. To test if there was a significant difference over time between the groups a repeated measurements analysis of variance (ANOVA) was conducted. If Mauchly’s test of sphericity indicated violation, we used the Huynh-Feldt correction of the degree of freedom to achieve valid F-ratios to the analysis. A complete cases analysis was also conducted (Bennett, 2001).
∗ Repeated measures ANOVA to compare evolutions between both groups
∗ The ‘sphericity’ assumption is the assumption of parallel profiles for all subjects
∗ When sphericity is not satisfied, a correction is applied −→ Better to extend the model with random slopes
Advanced statistical methods 692
. Table 4, p.92:
Table 4
Mean value and standard deviation (SD) at each time-point and overall between group comparison for quality of life (QLQ-OG25) by intervention (IG) and control group (CG).
Discharge 2 week 2 month 4 month 6 month Between groups
IG CG IG CG IG CG IG CG IG CG P-valueb
n=41 n=39 n=40 n=34 n=38 n=32 n=34 n=25 n=25 n=23 (n=41/n=41)
QLQ-OG25 mean (SD)a
Dysphagia 32.2 (31.5) 37.3 (33.7) 28.5 (26.5) 19.6 (19.6) 21.3 (24.7) 19.8 (22.5) 15.7 (20.9) 8.9 (11.6) 13.8 (25.3) 7.7 (9.7) 0.222
Eating 53.8 (24.7) 61.0 (24.1) 57.9 (30.3) 49.2 (26.7) 38.6 (27.4) 38.0 (25.5) 33.9 (23.2) 31.7 (23.8) 28.7 (25.8) 29.3 (21.3) 0.840
Reflux 13.0 (17.7) 16.7 (26.8) 17.5 (26.7) 16.2 (21.9) 13.6 (17.7) 15.1 (22.1) 15.2 (19.4) 20.0 (20.4) 15.3 (19.2) 22.5 (25.9) 0.352
Odynophagia 19.6 (26.7) 18.4 (17.7) 22.2 (22.7) 14.7 (22.4) 19.3 (23.4) 15.1 (13.2) 21.1 (29.4) 14.7 (14.7) 17.3 (21.8) 13.8 (11.9) 0.116
Pain and discomfort 22.9 (24.1) 22.4 (24.0) 31.2 (28.0) 19.1 (21.0) 28.9 (26.5) 20.0 (19.0) 27.9 (31.4) 24.7 (22.6) 23.3 (24.1) 24.6 (22.4) 0.163
Anxiety 53.3 (25.1) 55.1 (33.4) 48.7 (26.0) 46.6 (31.2) 43.4 (23.7) 40.9 (26.8) 41.7 (25.0) 37.3 (26.9) 41.3 (26.8) 39.9 (26.4) 0.677
Eating with others 16.7 (28.2) 14.8 (24.5) 12.3 (22.5) 18.8 (28.0) 11.4 (23.6) 8.0 (14.5) 17.6 (24.9) 5.3 (15.8) 16.0 (25.7) 11.6 (16.2) 0.358
Dry mouth 60.0 (32.2) 56.4 (35.2) 58.3 (33.5) 48.0 (33.0) 27.2 (27.8) 32.3 (31.6) 24.5 (29.9) 29.3 (29.4) 28.0 (31.4) 18.8 (24.3) 0.507
Trouble with taste 38.3 (31.6) 40.7 (32.0) 40.8 (38.9) 41.4 (32.3) 32.5 (34.2) 30.0 (29.5) 25.5 (30.8) 24.0 (31.2) 16.0 (25.7) 21.7 (29.5) 0.816
Body image 40.8 (42.4) 52.6 (35.2) 35.0 (36.2) 41.4 (37.3) 26.3 (32.1) 25.8 (29.5) 20.6 (30.7) 18.7 (27.4) 25.3 (35.1) 24.6 (25.1) 0.481
Trouble swallowing saliva 13.8 (19.7) 21.4 (26.0) 8.3 (18.1) 18.6 (29.8) 11.4 (20.9) 6.5 (15.9) 12.7 (26.0) 5.3 (12.5) 14.7 (27.4) 4.3 (11.5) 0.737
Choked when swallowing 13.3 (30.0) 12.3 (19.6) 13.3 (25.9) 14.1 (20.5) 16.7 (22.9) 20.4 (26.8) 14.7 (18.7) 12.0 (19.0) 12.0 (16.3) 11.6 (16.2) 0.978
Trouble with coughing 45.0 (29.8) 44.4 (31.8) 45.0 (28.8) 45.1 (27.1) 43.9 (28.1) 50.5 (32.1) 35.3 (25.9) 33.3 (30.4) 26.7 (30.4) 31.9 (30.9) 0.646
Trouble with talking 24.8 (28.3) 29.1 (30.8) 17.5 (23.9) 18.6 (27.5) 13.2 (27.4) 12.9 (25.4) 15.7 (27.5) 13.3 (25.5) 16.0 (29.1) 10.1 (21.2) 0.876
Weight loss 15.0 (23.1) 23.1 (30.7) 32.5 (33.3) 36.3 (32.2) 24.6 (29.7) 31.2 (29.7) 33.3 (33.8) 25.3 (30.9) 22.7 (31.5) 30.4 (33.2) 0.421
a Score range 0–100. A high score represents a higher level of symptoms/problems (worse).
b Repeated measurements ANOVA. Based on items/scales with mean value imputation.
Advanced statistical methods 693
• Kruse et al. [37]
. Title:
[Title and author/affiliation block not legible in the transcript]
Advanced statistical methods 694
. Statistical analysis section, p.1912:
[Quoted statistical analysis passage, largely illegible in the transcript]
∗ Subject-specific (random) intercepts and slopes
∗ Variances of random intercepts and slopes are allowed to be different before andafter hospitalization.
Advanced statistical methods 695
. Figure 2, p.1921:
= > ? @ A B C DE F G H I J G E K L M H I N G O M P H Q G R S H G T Q O M G T U V H G J H G R R Q P W X P T G Y Z [ I U Y G \ I ] ^ P H W _ H R Q W J ` P X GH G R Q T G W M R ` P R S Q M I Y Q a G T ^ P H ` Q S ^ H I O M _ H G P H S W G _ X P W Q I b c G O I _ R G M ` G S H Q X I H V ` P R S Q M I Y T Q I J W P R Q RQ R W P M S H G R G W M U G ^ P H G M ` G I O _ M G G F G W M d M ` G S H G e ` P R S Q M I Y M H I N G O M P H V Q R Q T G W M Q O I Y ^ P H M ` G M f PT Q I J W P R G R b [ ` G R Y P f S H G e ` P R S Q M I Y f P H R G W Q W J Q W E K L ^ _ W O M Q P W Q R ^ P Y Y P f G T U V S H G O Q S Q M P _ Rf P H R G W Q W J R _ H H P _ W T Q W J M ` G I O _ M G ` P R S Q M I Y Q a I M Q P W b [ ` G I X P _ W M P ^ f P H R G W Q W J F I H Q G R U VT Q I J W P R Q R I W T Q R G g _ I Y M P M ` G Q W M G H O G S M R Q W [ I U Y G \ I h ` Q S ^ H I O M _ H G f I R I R R P O Q I M G T f Q M ` I i b j k eS P Q W M O ` I W J G I W T S W G _ X P W Q I f I R I R R P O Q I M G T f Q M ` I l b i m e S P Q W M O ` I W J G b n W I F G H I J G d ` Q S^ H I O M _ H G S I M Q G W M R Q X S H P F G ^ P Y Y P f Q W J ` P R S Q M I Y T Q R O ` I H J G f ` Q Y G H G R Q T G W M R ` P R S Q M I Y Q a G T ^ P HS W G _ X P W Q I Z I W T M ` G P M ` G H T Q I J W P R G R ] O P W M Q W _ G M P f P H R G W b o P H S _ H S P R G R P ^ Q Y Y _ R M H I M Q P W d M ` G
Advanced statistical methods 696
Chapter 29
Missing observations
. Introduction
. How not to handle missing data ?
. How to handle missing data ?
. Examples from the biomedical literature
Advanced statistical methods 697
29.1 Introduction
• For example, the plot with individual profiles of MMSE evolutions in the delirium data set suggests dropout:
[Figure: individual MMSE profiles over days 1–12, with profiles ending early due to dropout]
Advanced statistical methods 698
• Complete data sets are rare in practice
• Missing observations not only imply loss of power, but more importantly may also imply biased results
• Problematic case:
Probability for an observation to be missing is related to the observation itself
• How to handle missingness in a data set ?
• This will be illustrated in the context of longitudinal data, but the ideas apply equally well to all other contexts
Advanced statistical methods 699
29.2 How not to handle missing data ?
• Consider data from a longitudinal study with 20 subjects, measured at baseline and followed by 6 weekly visits:
[Figure: complete data — 20 individual profiles over weeks 0–6]
Advanced statistical methods 700
• Due to dropout, not all subjects have been followed up to week 6:
[Figure: data with dropout — individual profiles over weeks 0–6, ending at dropout]
• Let us compare various common approaches to handle missingness, when interest is in estimation of the average trend
Advanced statistical methods 701
• Averaging the observed values at each visit:
[Figure: observed data with the visit-specific averages of the observed values superimposed]
=⇒
Correct at visits without missing observations
Biased at visits with missing observations
Advanced statistical methods 702
• Averaging the values of the complete cases only:
[Figure: averages based on the complete cases only]
=⇒
Biased at visits without missing observations
Biased at visits with missing observations
Advanced statistical methods 703
• Averaging after last observation carried forward (LOCF):
[Figure: averages after last observation carried forward]
=⇒
Biased at visits with missing observations
Distorted association structure (→ p-values)
Advanced statistical methods 704
• Averaging after mean imputation:
[Figure: averages after mean imputation]
=⇒
Biased at visits with missing observations
Distorted association & variance structure (→ p-values)
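The mechanics of LOCF and mean imputation are easy to mimic on toy data; a stdlib-only Python sketch (the numbers are invented, not the study data):

```python
def locf(series):
    """Last observation carried forward: fill each None with the last seen value."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out

def mean_impute(visit_columns):
    """Replace None in each visit column by the mean of the observed values."""
    filled = []
    for col in visit_columns:
        observed = [v for v in col if v is not None]
        m = sum(observed) / len(observed)
        filled.append([m if v is None else v for v in col])
    return filled

# One subject who drops out after the third visit:
print(locf([20, 25, 28, None, None]))        # -> [20, 25, 28, 28, 28]
```

Both methods fabricate values: LOCF freezes the trajectory at dropout, while mean imputation shrinks everyone towards the visit mean, which is exactly why the averaged curves on the previous slides are biased.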
Advanced statistical methods 705
29.3 How to handle missing data ?
• No uniformly best answer:
. Depends on nature of missingness
. Depends on outcome type
. Depends on research question
. Depends on model considered
. . . .
• All methods rely on assumptions about the relation between the probability for an observation to be missing and the observation itself
=⇒ Untestable assumptions
Advanced statistical methods 706
• Multiple imputation (M = 5 imputations):
Observed data −→ Imputed 1, . . . , Imputed 5   (Imputation)
Imputed m −→ Results m, for m = 1, . . . , 5    (Analysis)
Results 1, . . . , Results 5 −→ Final results   (Combination)
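The combination step is typically done with Rubin's rules: the pooled estimate is the average of the M completed-data estimates, and its variance adds the between-imputation variability to the average within-imputation variance. A minimal sketch with invented numbers:

```python
import math

def pool_rubin(estimates, variances):
    """Combine M completed-data analyses with Rubin's rules."""
    M = len(estimates)
    qbar = sum(estimates) / M                       # pooled estimate
    within = sum(variances) / M                     # average within-imputation var
    between = sum((q - qbar) ** 2 for q in estimates) / (M - 1)
    total = within + (1.0 + 1.0 / M) * between      # total variance
    return qbar, math.sqrt(total)

# Hypothetical results from M = 5 imputed data sets (estimates, variances):
est, se = pool_rubin([1.2, 1.1, 1.3, 1.15, 1.25],
                     [0.04, 0.05, 0.04, 0.05, 0.04])
```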
Advanced statistical methods 707
• Advantages:
. Correctly accounts for uncertainty about imputed values
. Imputation can be based on observed information (covariates, outcomes)
. Expert opinion
. Various imputation models can be explored (−→ sensitivity analyses)
. Relatively straightforward to implement
• Often, a small number M of imputations is sufficient (M = 3, 5)
• Alternative approaches rely on jointly modeling the outcome and the dropout/missingness process
• Such methods are less generally applicable and/or more difficult to implement
Advanced statistical methods 708
29.4 Examples from the biomedical literature
• Malmstrom et al. [36], statistical analysis section, p.90:
3.10. Statistical analysis
The responses to instrument items were transformed to scale scores according to the instructions from the providers (Fayers et al., 2001). Imputations of missing values were performed in two steps. Missing values within the forms were replaced according to the scoring manual of the instrument (Fayers et al., 2001) and missing values due to missing forms were replaced with mean value imputation. The analyses were conducted according to the Intention-To-Treat principle (Altman, 1991). A priori, we decided [. . . ]
∗ Imputation according to scoring manuals is also (single) imputation, and to be avoided !
∗ Mean value imputation !
Advanced statistical methods 709
• Zimmerman et al. [38], statistical analysis section, p.106:
[. . . ] who had fully completed the baseline assessment, were assessed again at week 8 post baseline and 12 months post baseline. If patients did not answer the invitation for assessment or could not be reached at all (i.e. if they dropped out before these assessments), their last available values were used, their last observation was carried forward (LOCF method) for the ITT-analysis. We reported the results as adjusted mean differences with their 95% confidence intervals. In a sensitivity analysis, we calculated results of the observed cases (OC) for the primary outcome. This analysis will include only those patients who did not drop out and completed their final assessment. In a second sensitivity analysis, we replaced missing values using a multiple imputation approach (N = 100 imputations). Analyses were done using Stata 14.
∗ Primary analysis based on LOCF
∗ Sensitivity analyses based on complete cases and on multiple impuation
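LOCF, as used in the primary analysis above, freezes each dropout at their last recorded value. A minimal stdlib Python sketch (the measurement values are hypothetical, for illustration only):

```python
def locf(series):
    """Last observation carried forward: replace each missing value
    (None) with the most recent observed value. Values before the
    first observation stay missing."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical scores at baseline, week 8 and month 12;
# the patient dropped out after week 8:
print(locf([12.0, 10.5, None]))   # -> [12.0, 10.5, 10.5]
```

The sketch makes the implicit assumption visible: the dropout's outcome is assumed not to change after the last visit, which is why LOCF is a strong and usually unrealistic single-imputation strategy.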
• Kruse et al. [37], statistical analysis section, p.1912:
[Quoted passage not legible in this transcript.]
∗ Considerable dropout due to death or re-admission
∗ Dropout believed to be potentially related to the outcome studied (ADL)
∗ The dropout mechanism has been jointly modeled with the longitudinal outcome:
1. ADL: mixed model with random intercepts and slopes
2. Two time-to-event models for death and re-admission; the log-normal model is an alternative to Cox regression
Bibliography
[1] C.A. Wong, B.M. Scavone, A.M. Peaceman, et al. The risk of cesarean delivery with neuraxial analgesia given early versus late in labor. The
New England Journal of Medicine, 352:655–665, 2005.
[2] A.I. Amin, O. Hallbook, A.J. Lee, R. Sexton, B.J. Moran, and R.J. Heald. A 5-cm colonic J pouch colo-anal reconstruction following anterior resection for low rectal cancer results in acceptable evacuation and continence in the long term. Colorectal Disease, 5:33–37, 2003.
[3] S. Kaplan, S. Etlin, I. Novikov, and B. Modan. Occupational risks for the development of brain tumours. American Journal of Industrial
Medicine, 31:15–20, 1997.
[4] Y. Baba, J.D. Putzke, N.R. Whaley, Z.K. Wszolek, and R.J. Uitti. Gender and the Parkinson's disease phenotype. Journal of Neurology, 252:1201–1205, 2005.
[5] K.M. Kellett, D.A. Kellett, and L.A. Nordholm. Effects of an exercise program on sick leave due to back pain. Physical Therapy, 71:283–293, 1991.
[6] S.E. Nissen, E.M. Tuzcu, P. Schoenhagen, et al. Statin therapy, LDL cholesterol, C-reactive protein, and coronary artery disease. The New
England Journal of Medicine, 352:29–38, 2005.
[7] T. Shatari, M.A. Clark, T. Yamamoto, A. Menon, C. Keh, J. Alexander-Williams, and M. Keighley. Long strictureplasty is as safe and effective as short strictureplasty in small-bowel Crohn's disease. Colorectal Disease, 6:438–441, 2004.
[8] P. Serrano-Gallardo, M. Martínez-Marcos, F. Espejo-Matorrales, T. Arakawa, G.T. Magnabosco, and I.C. Pinto. Factors associated to clinical learning in nursing students in primary health care: An analytical cross-sectional study. Revista Latino-Americana de Enfermagem, 24:e2803, 2016.
[9] A. Salehi, M. Marzban, M. Sourosh, F. Sharif, M. Nejabat, and M.H. Imanieh. Social well-being and related factors in students of school of nursing and midwifery. International Journal of Community Based Nursing and Midwifery, 5:82–90, 2017.
[10] P. Kiekkas, H. Brokalaki, E. Manolis, A. Samios, C. Skartsani, and G. Baltopoulos. Patient severity as an indicator of nursing workload in the intensive care unit. Nursing in Critical Care, 12:34–41, 2007.
[11] M. Frilund and L. Fagerstrom. Managing the optimal workload by the PAONCIL method – A challenge for nursing leadership in care of older people. Journal of Nursing Management, 17:426–434, 2009.
[12] S. Bjork, M. Lindkvist, A. Wimo, C. Juthberg, A. Bergland, and D. Edvardsson. Residents' engagement in everyday activities and its association with thriving in nursing homes. Journal of Advanced Nursing, 47:http://dx.doi.org/10.1111/jan.13275, 2017.
[13] K.H. Archbold, B. Giordani, D.L. Ruzicka, and R.D. Chervin. Cognitive executive dysfunction in children with mild sleep-disordered breathing. Biological Research for Nursing, 5:168–176, 2004.
[14] S.M. van Hooft, J. Dwarswaard, R. Bal, M. Strating, and A. van Staa. What factors influence nurses' behavior in supporting patient self-management? An explorative questionnaire study. International Journal of Nursing Studies, 63:65–72, 2016.
[15] W.Y. Huang, C.C. Chang, D.R. Chen, C.T. Kor, T.Y. Chen, and H.M. Wu. Circulating leptin and adiponectin are associated with insulin resistance in healthy postmenopausal women with hot flashes. PloS One, 12:e0176430, 2017.
[16] S. Bjork, H. Lovheim, M. Lindkvist, A. Wimo, and D. Edvardsson. Thriving in relation to cognitive impairment and neuropsychiatric symptoms in Swedish nursing home residents. International Journal of Geriatric Psychiatry, 32:http://dx.doi.org/10.1002/gps.4714, 2017.
[17] R.M. Collard, M. Arts, H.C. Comijs, P. Naarding, P.F.M. Verhaak, M.W. de Waal, and R.C. Oude Voshaar. The role of frailty in the association between depression and somatic comorbidity: Results from baseline data of an ongoing prospective cohort study. International Journal of Nursing Studies, 52:188–196, 2015.
[18] B.L. Blomquist, P.D. Cruise, and R.J. Cruise. Values of baccalaureate nursing students in secular and religious schools. Nursing Research, 29:379–383, 1980.
[19] B.P. Richardson, A.E. Ondracek, and D. Anderson. Do student nurses feel a lack of comfort in providing support for Lesbian, Gay, Bisexual or Questioning adolescents: What factors influence their comfort level? Journal of Advanced Nursing, 73:1196–1207, 2016.
[20] D. Ausili, P. Rebora, S. Di Mauro, B. Riegel, M.G. Valsecchi, M. Paturzo, R. Alvaro, and E. Vellone. Clinical and socio-demographic determinants of self-care behaviours in patients with heart failure and diabetes mellitus: A multicentre cross-sectional study. International Journal of Nursing Studies, 63:18–27, 2016.
[21] E. Hahnel, U. Blume-Peytavi, C. Trojahn, G. Dobos, A. Stroux, N. Garcia Bartels, I. Jahnke, A. Lichterfeld-Kottner, H. Neels-Herzmann, A. Klasen, and J. Kottner. The effectiveness of standardized skin care regimens on skin dryness in nursing home residents: A randomized controlled parallel-group pragmatic trial. International Journal of Nursing Studies, 70:1–10, 2017.
[22] J.C. Silva, Z. Viera de Moraes, C. Aparecida da Silva, S. de Barros Mazon, M.E. Guariento, A. Liberalesso Neri, and A. Fattori. Understanding red blood cell parameters in the context of the frailty phenotype: Interpretations of the FIBRA (Frailty in Brazilian Seniors) study. Archives of Gerontology and Geriatrics, 59:636–641, 2014.
[23] K.J. Moon and S.M. Lee. The effects of a tailored intensive care unit delirium prevention protocol: A randomized controlled trial. International
Journal of Nursing Studies, 52:1423–1432, 2015.
[24] E. Cameron and L. Pauling. Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer. Proceedings of the National Academy of Science U.S.A., 75:4538–4542, 1978.
[25] D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway, and E. Ostrowski. A handbook of small datasets. Chapman & Hall, first edition, 1989.
[26] R. Peto, M.C. Pike, P. Armitage, N.E. Breslow, D.R. Cox, S.V. Howard, N. Mantel, K. McPherson, J. Peto, and P.G. Smith. Design and analysis of randomised clinical trials requiring prolonged observation of each patient. British Journal of Cancer, 35:1–35, 1977.
[27] P.D. Allison. Survival analysis using the SAS system: A practical guide. NC: SAS Institute, 1995.
[28] F. Blanchon, M. Grivaux, B. Asselain, et al. 4-year mortality in patients with non-small-cell lung cancer: development and validation of a prognostic index. Lancet Oncology, 7:829–836, 2006.
[29] J.P. Klein and M.L. Moeschberger. Survival analysis: Techniques for censored and truncated data. Springer Verlag New York, 1997.
[30] T. Nawrot, M. Plusquin, J. Hogervorst, et al. Environmental exposure to cadmium and risk of cancer: a prospective population-based study. The
Lancet Oncology, 7:119–126, 2006.
[31] L.F. Hutchins, S.J. Green, P.M. Ravdin, D. Lew, S. Martino, M. Abeloff, A.P. Lyss, C. Allred, S.E. Rivkin, and C.K. Osborne. Randomized, controlled trial of Cyclophosphamide, Methotrexate, and Fluorouracil versus Cyclophosphamide, Doxorubicin, and Fluorouracil with and without Tamoxifen for high-risk, node-negative breast cancer: Treatment results of intergroup protocol INT-0102. Journal of Clinical Oncology, 23:8313–8321, 2005.
[32] P.J. Brown, S.P. Roose, R. Fieo, X. Liu, T. Rantanen, J. Sneed, B.R. Rutherford, D.P. Devanand, and K. Avlund. Frailty and depression in older adults: A high-risk clinical population. The American Journal of Geriatric Psychiatry, 22:1083–1095, 2014.
[33] L. Schoonhoven, B.G. van Gaal, S. Teerenstra, E. Adang, C. van der Vleuten, and T. van Achterberg. Cost-consequence analysis of "washing without water" for nursing home residents: A cluster randomized trial. International Journal of Nursing Studies, 52:112–120, 2015.
[34] L. Smedts-Alenius, C. Tishelman, R. Lindqvist, and S. Runesdotter. RN assessments of excellent quality of care and patient safety are associated with significantly lower odds of 30-day inpatient mortality: A national cross-sectional study of acute-care hospitals. International Journal of Nursing Studies, 61:117–124, 2016.
[35] S. Fisher, Y. Yasui, K. Dabbs, and M. Winget. Using multilevel models to explain variation in clinical practice: Surgeon volume and the surgical treatment of breast cancer. Annals of Surgical Oncology, 23:1845–1851, 2016.
[36] M. Malmstrom, B. Ivarsson, R. Klafsgard, and K. Persson. The effect of a nurse led telephone supportive care programme on patients' quality of life, received information and health care contacts after oesophageal cancer surgery – A six month RCT-follow-up study. International Journal of Nursing Studies, 64:86–95, 2016.
[37] R.L. Kruse, G.F. Petroski, D.R. Mehr, J. Banaszak-Holl, and O. Intrator. Activities of daily living (ADL) trajectories surrounding acute hospitalisation of long-stay nursing home residents. Journal of the American Geriatrics Society, 61:1909–1918, 2013.
[38] T. Zimmermann, E. Puschmann, H. van den Bussche, B. Wiese, A. Ernst, S. Porzelt, A. Daubmann, and M. Scherer. Collaborative nurse-led self-management support for primary care patients with anxiety, depressive or somatic symptoms: Cluster-randomized controlled trial (findings of the SMADS study). International Journal of Nursing Studies, 63:101–111, 2016.