Health and Disease in Populations 2002
Sources of variation (1)
Paul Burton, Jane Hutton
Informal lecture objectives
Objective 1: To enable the student to distinguish between
observed data and the underlying tendencies which give rise to those data
Objective 2: To understand the concepts of sources of
variation and randomness
Formal lecture objectives for Random Variation (1) and (2)
Objective 1: Distinguish between ‘observed’ epidemiological
quantities (incidence, prevalence, incidence rate ratio etc.) and their ‘true’ or ‘underlying’ values.
Objective 2: Discuss how ‘observed’ epidemiological quantities
depart from their ‘true’ values because of random variation.
Formal lecture objectives for Sources of Variation
Objective 3: Describe how ‘observed’ values help us
towards a knowledge of the ‘true’ values by:
  allowing us to test hypotheses about the true value (SoV 1)
  allowing us to calculate a range within which the true value probably lies (SoV 2)
Drawing conclusions
Experiment: Flip a coin 10 times
Result: Observe 7 heads, 3 tails
Conclusions:
  Data wrong (e.g. a miscount)
  Artefact
  Chance
  The coin is biased towards heads
Tendency versus observation
Coins tend to produce equal numbers of
heads and tails, but what we observe may depart from this by random variation.
Random variation in health
On average, there are 4 cases of meningitis per
month in Leicester; some months we observe 10, some months 0.
Smokers tend to be less healthy than non-smokers, but if we pick a few people at random, we might find that the smokers are healthier than the non-smokers.
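To put rough numbers on the meningitis example, here is a minimal sketch. It assumes monthly case counts follow a Poisson distribution with mean 4; the slide does not name a model, so the Poisson choice and the helper `poisson_pmf` are illustrative assumptions only.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k events when the true mean is `mean`.

    Assumes a Poisson model, which the lecture does not state explicitly.
    """
    return math.exp(-mean) * mean**k / math.factorial(k)

MEAN_CASES = 4  # underlying tendency: 4 cases per month on average

# Probability of a month with no cases at all
p_zero = poisson_pmf(0, MEAN_CASES)

# Probability of a month with 10 or more cases
p_ten_or_more = 1 - sum(poisson_pmf(k, MEAN_CASES) for k in range(10))

print(f"P(0 cases)    = {p_zero:.4f}")
print(f"P(>=10 cases) = {p_ten_or_more:.4f}")
```

Under this assumed model both extremes are uncommon in any single month, yet over several years of monthly counts each can be expected to turn up without the underlying tendency having changed.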
Tendency versus observation
Epidemiologists, health planners etc. want to know about the underlying tendencies and patterns. However, as well as systematic variation, everything they observe is affected by random variation.
Underlying tendency → observed data
  Underlying tendency: The proportion of red marbles in a bag of 1000 red and black ones
  Observed data: The number of reds among ten picked at random from the bag
  Underlying tendency: The forthcoming result of a UK general election
  Observed data: The voting intentions of 1,000 UK voters picked at random
  Underlying tendency: The total number of Leicester diabetic patients who have foot problems
  Observed data: The number with foot problems in a random sample of 200 Leicester diabetics
If we know about the underlying tendency, we can predict what we may ‘reasonably’ expect to observe (probability theory).
Neonatal Intensive Care (NIC) cots
True requirement (1992 figures): 1/1,000 live births per annum
Health authority has approximately 12,000 live births per annum
On average 12 NIC ‘cots’ will be required per year (this is the true tendency)
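[Figure: distribution of the number of cots needed, with marked percentiles: 95% = 18 cots; 29/30 = 19 cots; 99% = 21 cots]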
Obstetric beds (NIC cots)
Often observe 8-16 cots being used
Need 19 or more on 1 day per month
Need 21 or more on 1% of days
Hardly ever need more than 24 cots
Provide 19 cots
On average 12 are occupied = 63% occupancy
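The cot figures can be reproduced with a short calculation. This is a sketch only: it assumes the number of cots needed at any one time follows a Poisson distribution with mean 12, which the slides do not state, and `poisson_cdf` and `cots_needed` are hypothetical helper names.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean (assumed model)."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

def cots_needed(mean, coverage):
    """Smallest number of cots that is enough on at least `coverage` of days."""
    k = 0
    while poisson_cdf(k, mean) < coverage:
        k += 1
    return k

MEAN_COTS = 12  # the true tendency from the slide

for label, coverage in [("95%", 0.95), ("29/30", 29 / 30), ("99%", 0.99)]:
    print(f"enough on {label} of days: {cots_needed(MEAN_COTS, coverage)} cots")

# Average occupancy if 19 cots are provided and 12 are in use on average
occupancy = MEAN_COTS / 19
print(f"occupancy with 19 cots: {occupancy:.0%}")
```

Under this assumption the three thresholds come out at 18, 19 and 21 cots, matching the slide. That suggests, but does not confirm, that a Poisson model lies behind the original figures.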
True tendency → observed distribution: easy
BUT how do we reverse the direction of inference?
Observed distribution → true tendency
Any questions?
Hypothesis testing
Objective 3: Describe how ‘observed’ values help us
towards a knowledge of the ‘true’ values by:
  Allowing us to test hypotheses about the true value
Hypothesis testing
A hypothesis: A statement that an underlying tendency of scientific interest takes a particular quantitative value
  The coin is fair (the probability of heads is 0.5)
  The new drug is no better than the standard treatment (the ratio of survival rates = 1.0)
  The true prevalence of tuberculosis in a given population is 2 in 10,000
Testing hypotheses
Are the observed data ‘consistent’ with the stated hypothesis? Informally? Formally?
Formally: Calculate the probability of getting an observation
as extreme as, or more extreme than, the one observed if the stated hypothesis were true.
If this probability is very small, then either something very unlikely has occurred; or the hypothesis is wrong
It is then reasonable to conclude that the data are incompatible with the hypothesis.
The probability is called a ‘p-value’
Hypothesis: this coin is fair
Observed data: 10 heads, 0 tails
P-value: 0.002 (1 in 500) (exactly 2 × 1/1,024)
Conclusion: Data inconsistent with hypothesis; strong evidence against the hypothesis
Prior beliefs are relevant here:
  10 heads, 0 tails: Is the coin biased?
  10 survivors, 0 deaths on new treatment X: Does X work, if historically 50% died?
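The exact p-value for 10 heads out of 10 can be checked with a minimal sketch. It follows the formal definition given earlier: add up the probabilities, under a fair coin, of every result at least as far from the expected 5 heads as the one observed. The function name `two_sided_p_value` is illustrative, not from the lecture.

```python
from fractions import Fraction
from math import comb

def two_sided_p_value(heads, flips):
    """Exact two-sided p-value for the hypothesis that the coin is fair.

    Sums the fair-coin probabilities of every outcome whose number of heads
    is at least as far from flips/2 as the observed number.
    """
    centre = Fraction(flips, 2)
    observed_distance = abs(heads - centre)
    total = Fraction(0)
    for k in range(flips + 1):
        if abs(k - centre) >= observed_distance:
            total += Fraction(comb(flips, k), 2**flips)
    return total

p = two_sided_p_value(10, 10)
print(p, float(p))  # the fraction 2/1,024 in lowest terms, about 0.002
```

Exact fractions avoid rounding, so the result can be compared directly with the 2 × 1/1,024 on the slide. The calculation is identical for 10 survivors out of 10 when historically 50% died; only the prior beliefs brought to the answer differ.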
An arbitrary convention
P-value: p≤0.05
  Data ‘inconsistent with hypothesis’
  ‘Substantive evidence against the hypothesis’
  ‘Reasonable to reject the hypothesis’
  ‘Statistically significant’
P-value: p>0.05
  None of the above
‘The mean surface temperature of the earth has increased by only 1°C over the last 50 years (p=0.1)’ does not prove that there is no global warming!
Hypothesis tests
The incidence of disease X in Warwickshire is significantly lower than in the rest of the UK (p=0.01)
The death rate from disease Y is significantly higher in Barnsley than in Leicester (p=0.05)
Patients on the new drug did not live significantly longer than those on the standard drug (p=0.4)
The ‘null hypothesis’
The hypothesis to be tested is often called the ‘null hypothesis’ (H0)
  The ratio of death rates is 1.0
  The prevalence in Warwickshire is the same as in Leicestershire
p≤0.05: substantial evidence against the hypothesis being tested, not that it is definitely false
p>0.05: Data not inconsistent with the hypothesis. Little or no evidence against the hypothesis being tested, not that it is definitely true
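An experiment: flip a coin 10 times
Observed result: 7 heads, 3 tails
Question: Is the coin biased?
Probability of each number of heads if the coin is fair (* = as extreme as, or more extreme than, the observed result):
  Heads  Probability
  0      0.001 *
  1      0.010 *
  2      0.044 *
  3      0.117 *
  4      0.205
  5      0.246
  6      0.205
  7      0.117 *
  8      0.044 *
  9      0.010 *
  10     0.001 *
p = 2×(0.001+0.010+0.044+0.117) = 0.344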
An experiment: flip a coin 10 times
Observed result: 7 heads, 3 tails
Data consistent with the coin being unbiased.
Weak evidence against the null hypothesis.
So: little evidence that the coin is biased
But: does not prove that the coin is unbiased
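The probabilities behind the 7 heads, 3 tails result can be recomputed from the binomial distribution for a fair coin. This is a sketch with illustrative variable names; rows marked `*` are the outcomes at least as far from 5 heads as the observed 7.

```python
from math import comb

FLIPS = 10
OBSERVED_HEADS = 7

# Probability of each possible number of heads if the coin is fair
pmf = [comb(FLIPS, k) / 2**FLIPS for k in range(FLIPS + 1)]

# Outcomes as extreme as, or more extreme than, the observed one (both tails)
observed_distance = abs(OBSERVED_HEADS - FLIPS / 2)
extreme = [k for k in range(FLIPS + 1) if abs(k - FLIPS / 2) >= observed_distance]

p_value = sum(pmf[k] for k in extreme)

for k, prob in enumerate(pmf):
    marker = "*" if k in extreme else ""
    print(f"{k:2d}  {prob:.3f} {marker}")
print(f"p = {p_value:.3f}")
```

The marked outcomes are 0 to 3 and 7 to 10 heads, and their probabilities sum to 352/1,024, which rounds to the 0.344 quoted in the lecture.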
Problems
Rejecting H0 is not always much use.
p<0.05 is arbitrary; nothing special happens between p=0.049 and p=0.051
p=0.000001 and p=0.6 are easy to interpret
False positive results
Statistical significance depends on sample size. Flip a coin 3 times: minimum p=0.25 (i.e. 2×1/8)
Statistically significant ≠ clinically important
Nevertheless, p values are used a lot
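The point about sample size can be made concrete with a short sketch. The most extreme result from n flips is all heads or all tails, so the smallest two-sided p-value a fair-coin test can ever give is 2 × (1/2)^n. The helper name `smallest_possible_p` is hypothetical.

```python
def smallest_possible_p(flips):
    """Smallest two-sided p-value achievable when testing a fair coin.

    All heads and all tails each have probability (1/2)**flips.
    """
    return min(1.0, 2 * 0.5**flips)

for flips in range(1, 9):
    print(f"{flips} flips: minimum p = {smallest_possible_p(flips):.4f}")

# Fewest flips for which p <= 0.05 is possible at all
fewest_flips = next(n for n in range(1, 100) if smallest_possible_p(n) <= 0.05)
print(f"fewest flips that can reach p <= 0.05: {fewest_flips}")
```

With 3 flips the minimum is 0.25, as on the slide. No result from fewer than 6 flips can be ‘statistically significant’ at the conventional threshold, however biased the coin really is.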
A solution
Objective 3: Describe how ‘observed’ values help us
towards a knowledge of the ‘true’ values by:
  Allowing us to test hypotheses about the true value
  Providing us with a range within which the underlying tendency probably lies
Any questions?