John Snow and Cholera: Revisiting Difference-in-Differences and Randomized Trials in Research

Thomas S. Coleman∗

January 24, 2020

Abstract

John Snow, the London doctor often considered the father of modern epidemiology, recognized that two competing water companies in 1850s South London, serving an area with over 400,000 individuals, provided a “Grand Experiment” for testing the effect of clean versus dirty water in the transmission of cholera. Snow exploited the effective randomization in the design and applied a rudimentary form of Difference-in-Differences (DiD) to compare dirty-versus-clean and before-versus-after. The nature of Snow’s design allows us to examine within-sample variability both across and within regions. The conclusion is stark: Snow’s simple DiD cholera treatment effect is only modestly significant once we incorporate the within-sample variability (“over-dispersion” relative to a Poisson error process), contrary to what we might naively presume given a sample of roughly 400,000 persons. Although Snow’s claim for the “influence which the nature of the water supply exerted over the mortality” survives – as we know it should – this re-analysis highlights the importance of design and careful error analysis for properly measuring and testing a treatment effect.

University of Chicago, Harris Public Policy, 1307 E 60th St., Suite 3037, Chicago IL 60637, 203-252-4897

Keywords: John Snow, cholera, causal inference, epidemiology, statistical methodology, difference-in-difference regression, randomized control trial error analysis

JEL Classification: C18, N33, N93, B40, C52

Snow Project data and code: https://github.com/tscoleman/SnowCholera

Declarations of interest: None

Words: 8000

Acknowledgements: I would like to thank Dan Black, Steven Durlauf, Robert Michael, Xi Song, Austin Wright, and members of the CEHD Lifecycle Working Group at the University of Chicago. Errors are my own.

∗Harris School of Public Policy, University of Chicago, [email protected]


Contents

1 Introduction
2 1855 and “On the mode of communication ...”: Origin of Difference-in-Differences
2.1 Table XII and Difference-in-Differences as 2x2 Table
2.2 Difference-in-Differences as Regression Equation
2.3 Error Analysis and Stochastic Assumptions for DiD Regressions
2.4 Interpreting DiD Results for Snow’s Data
2.5 Digression on Over-Dispersion and Randomization
3 Quasi-Randomized Trial Comparison
3.1 Table IX – Simple Randomized Comparison
3.2 Randomized Comparison Error Analysis
4 Conclusion
5 Appendix – Selected Tables from Snow [1855, 1856]

List of Tables

1 Population and Deaths from Cholera in 1849 & 1854, Summary from Snow 1855 Table VIII & Table XII
2 Mortality Rates from Cholera per 10,000 Persons in 1849 & 1854, Summary from Snow 1855 Table XII & Table VIII
3 Regressions for Sub-District Difference-in-Differences 1849 vs 1854
4 Houses, Deaths, and Mortality per 10,000 Households, First Seven Weeks of 1854 Cholera Epidemic – Table IX Snow [1855] p 86
5 Houses, Deaths, and Mortality per 10,000 Households, First Seven Weeks of 1854 Cholera Epidemic – Data Table III Snow [1856] p 86
6 Regressions for Randomized Comparison by District, Seven Weeks Ending 26 August 1854
7 Snow’s Table XII – Deaths from Cholera in 1849 & 1854 (Snow [1855] p 90) with Population in 1851 from Table VIII (p 85)
8 Mortality During the First Seven Weeks of the Epidemic, Arranged in Districts, from Snow [1856] Table III


List of Figures

1 Regions of South London Served by the Southwark & Vauxhall and the Lambeth Companies (Snow [1855])
2 1849 vs 1854 Difference-in-Differences Actual vs Predicted, Poisson Regression with Fixed Effects: Mortality per 10,000, Poisson Count Model, Different Rates for Sub-Districts, Predicted (with 95% confidence bands) and Actual 1849 & 1854 (Adjusted for Time and Single Treatment Effect)
3 1849 vs 1854 Difference-in-Differences Actual vs Predicted, Negative Binomial Regression: Mortality per 10,000, Negative Binomial, Same Rates for Sub-Districts & Single Treatment Effect, Predicted (with 95% confidence bands) and Actual 1849 & 1854 (Adjusted for Time and Single Treatment Effect)

1 Introduction

John Snow’s studies of the cholera epidemics in London in 1849 and 1854 (primarily found in Snow [1849, 1855], Westminster and London School of Hygiene and Tropical Medicine [1855], Snow [1856]) are justly famous, covered in both popular books and specialist texts. (Johnson [2007], Tufte [1997], Freedman [1999, 1991], Hempel [2007], Vinten-Johansen et al. [2003], Rothman [2002], McLeod [2000] is a very partial list.)

Cholera is a horrible and often deadly disease of the small intestine caused by the bacterium Vibrio cholerae. It causes diarrhea, vomiting, and rapid dehydration; without treatment mortality is roughly 50 percent. The prevailing theory in 1854 was that cholera was caused by bad air or miasma (smells). Although wrong, this was easy to believe: London at the time was a vile and dirty place with raw sewage collected in cesspools, emptied into open ditches, or piped by newly-constructed sewers directly into the Thames, from which many drew their drinking water. Importantly, London’s sanitation issues made the city an ideal breeding ground for cholera.

This essay focuses on one component of the evidence and analysis Snow assembled in his efforts to demonstrate the causal effect of water in cholera infection. In South London two water companies (the Southwark & Vauxhall Company and the Lambeth Water Company) supplied customers in adjacent and overlapping areas. Between the cholera epidemics in 1849 and 1854 the Lambeth company changed its water supply from dirty to clean water, providing a near-ideal experiment comparing treated versus untreated subjects. Snow recognized the value of this fortuitous circumstance: “it is obvious that no experiment could have been devised which would more thoroughly test the effect of water supply on the progress of cholera than this, which circumstances placed ready made before the observer.” (Snow [1855] p 75)

Snow applied a rudimentary form of difference-in-differences (DiD, Snow [1855] p 89 and Table XII) and a direct comparison of quasi-randomly-assigned customers (p 86 and Table IX). Snow has been credited with the first use of difference-in-differences (Angrist and Pischke [2008] p 227 and Angrist and Pischke [2014] pp 204-205). This essay examines Snow’s analysis with two goals: First, to embed Snow’s analysis in a more formal and modern statistical framework and thus more clearly articulate the assumptions behind Snow’s analysis. Second, to perform the statistical inference and testing that Snow could not perform (given that the necessary statistical theory developed after his death).


The South London data provide what is essentially a very large (400,000+ individuals) randomized trial, but one with a design that allows examination and testing of various assumptions about the underlying error structure. There are multiple geographic regions (Districts and sub-districts) and observations in 1849 (before treatment) and 1854 (after treatment). We can measure within-sample variation both across regions and within regions (across time). There is substantial cross-region variation or over-dispersion, forcing us to reject the Poisson hypothesis that mortality rates are the same across regions. We can of course incorporate cross-region differences by introducing fixed geographic (sub-district) effects. More importantly, however, we observe substantial across-time but within-region variation. This variation or over-dispersion pushes us to reject the hypothesis that average mortality rates are constant and to consider random variation in mortality rates.

The random variation substantially raises estimated standard errors. A treatment effect that naively appears strongly significant (assuming constant mortality rates) is substantially reduced in significance. This highlights the value of study design: The regional and across-time observations in this study provide the data one can use to uncover and estimate the variation in mortality rates that would otherwise be hidden.

Figure 1: Regions of South London Served by the Southwark & Vauxhall and the Lambeth Companies (Snow [1855])


2 1855 and “On the mode of communication ...”: Origin of Difference-in-Differences

Snow’s 1855 edition of “On the mode of communication of cholera” (Snow [1855]) can best be viewed as a sustained attempt to convince skeptics that water was the causal agent for cholera transmission. Snow presented and discussed multiple sources of evidence, from the small outbreak at Albion Terrace in 1849, through the justly-famous mapping of the Broad Street outbreak in central London, to the South London “Grand Experiment”. (See Coleman [2018] for a more complete discussion of the multiple strands of evidence presented by Snow.)

In London south of the Thames two companies, the Southwark and Vauxhall Waterworks Company and the Lambeth Water Company, competed and supplied customers in nearby and overlapping districts (Figure 1), often laying pipes down the same street. Most of the competition and piping apparently occurred during the 1830s and 1840s – by 1849 the distribution of customers seems to have been stable. In 1849 both companies drew their water from the lower Thames, water that was subject to pollution from the drains and cesspools of greater London. In 1852 the Lambeth Water Company moved its source upstream, above the main source of London sewage.1

With the South London data Snow performed what we would now call a difference-in-differences comparison across time and regions (and also a comparison of randomized subjects, discussed in the next section). In 12 sub-districts of southern London the Southwark and Vauxhall company alone supplied customers. In 16 sub-districts, with a population of roughly 300,000, the two companies competed directly, supplying customers side-by-side:

In many cases a single house has a supply different from that on either side. Each company supplies both rich and poor, both large houses and small; there is no difference in the condition or occupation of the persons receiving the water of the different companies. Snow [1855] p 75

Snow recognized that the close proximity and mixing of customers, together with the change in Lambeth’s source, fortuitously provided a natural experiment.

2.1 Table XII and Difference-in-Differences as 2x2 Table

Table 1: Population and Deaths from Cholera in 1849 & 1854, Summary from Snow 1855 Table VIII & Table XII

| Sub-districts | 1851 Population | 1849 Deaths | 1854 Deaths |
|---|---|---|---|
| First 12 (Southwark & Vauxhall Water Company Only) | 167,654 | 2,261 | 2,458 |
| Next 16 (Joint Southwark & Vauxhall and Lambeth Companies) | 300,149 | 3,905 | 2,547 |
| TOTAL | 467,803 | 6,166 | 5,005 |

1 The Metropolis Water Act 1852 mandated that water companies move their water sources above the tidal reaches of the Thames – effectively above London sewage sources. The deadline for moving was 1855. The Lambeth Company had apparently decided in 1847 to move and in 1852 their move to Thames Ditton was completed. The Southwark and Vauxhall Company did not move until 1855. See Johnson [2007] p 105.


Snow [1855] Table XII has the basic count (death) data for 1849 and 1854 by sub-district (summarized in Table 1, with the full data from Table XII reproduced in the first four columns of my Table 7). Focusing on the sub-totals at the bottom of his Table XII, Snow points out that “The table exhibits an increase of mortality in 1854 as compared with 1849, in the sub-districts supplied by the Southwark and Vauxhall Company only [the ‘First 12 sub-districts’], whilst there is a considerable diminution of mortality in the sub-districts partly supplied by the Lambeth Company [the ‘Next 16’].” (Snow [1855] p 89) We can emphasize Snow’s point by converting death counts to mortality rates.2

In Table 2 the highlighted section (middle three columns and rows) shows the mortality rates summarized by sub-districts. The numbers make Snow’s point clear – the 1854 mortality rate for the jointly-supplied “next 16” sub-districts fell from 130.1 to 84.9 (by a factor of 1.5). The third row and column of Table 2 shows the results of performing a “by-hand” DiD analysis.3
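The by-hand calculation can be reproduced in a few lines. The following is an illustrative Python sketch, not the R code used for the paper; the variable and function names are hypothetical, and the only inputs are the aggregate deaths and 1851 populations from Table 1.

```python
# Aggregate deaths and 1851 population from Table 1 (Snow 1855, Tables VIII & XII).
population = {"first12": 167_654, "next16": 300_149}
deaths = {
    ("first12", 1849): 2_261, ("first12", 1854): 2_458,
    ("next16", 1849): 3_905, ("next16", 1854): 2_547,
}

def rate_per_10k(region, year):
    """Deaths per 10,000 persons, using the 1851 population for both years."""
    return 10_000 * deaths[(region, year)] / population[region]

# Change over time within each region (1854 less 1849).
diff_first12 = rate_per_10k("first12", 1854) - rate_per_10k("first12", 1849)
diff_next16 = rate_per_10k("next16", 1854) - rate_per_10k("next16", 1849)

# Difference-in-differences: change in the jointly-supplied region
# less the change in the Southwark-only region.
did = diff_next16 - diff_first12

# Ratio by which mortality fell in the jointly-supplied region.
fall_factor = rate_per_10k("next16", 1849) / rate_per_10k("next16", 1854)

print(f"First 12 change: {diff_first12:+.1f} per 10,000")
print(f"Next 16 change:  {diff_next16:+.1f} per 10,000")
print(f"DiD:             {did:+.1f} per 10,000")
print(f"Fall factor:     {fall_factor:.1f}")
```

The point estimate depends only on four counts and two populations; nothing in this calculation says how precisely it is estimated.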

Table 2: Mortality Rates from Cholera per 10,000 Persons in 1849 & 1854, Summary from Snow 1855 Table XII & Table VIII

| Sub-districts | 1849 Deaths per 10,000 | 1854 Deaths per 10,000 | Diff 1854 less 1849 | Std Err of Diff | t-ratio |
|---|---|---|---|---|---|
| First 12 (Southwark & Vauxhall Water Company Only) | 134.9 | 146.6 | +11.8 | 4.07E-04 | 2.9 |
| Next 16 (Joint Southwark & Vauxhall and Lambeth Companies) | 130.1 | 84.9 | -45.2 | 2.66E-04 | -17.0 |
| Diff Water Supply Co.: Next 16 less First 12 | -4.8 | -61.8 | -57.0 | 4.86E-04 | -11.7 |
| Standard Error of Difference | 3.49E-04 | 3.38E-04 | 4.86E-04 | | |
| t-ratio | -1.4 | -18.3 | -11.7 | | |

What the rates in Table 2 (highlighted in the central section) cannot answer is how much confidence we should have in the estimate of -57.0. Table 7 shows that there is a total population of 486,936, with 167,654 and 300,149 in the two sub-regions. These are large samples and from this we might naively conclude that the mortality rates (or log-rates) are very precisely estimated.

A naive error analysis would test the difference in rates (say the difference across region in 1849 of -4.8 = 130.1 - 134.9) with a t-test for the equality of means. The individual rates will be normally distributed (for large populations such as the present case), with variance $r(1-r)/n$. The difference will be normal with standard error

$$SE(r_1 - r_2) = \sqrt{r_1(1-r_1)/n_1 + r_2(1-r_2)/n_2}.$$

The differences themselves will be normal, so we can test for the difference-in-differences also with a t-test. (The t-distribution degrees-of-freedom is given by a complicated formula, which is irrelevant in the present case since the sample sizes are so large everything is essentially normal.) The standard errors and t-ratios are shown in the last two rows and columns of Table 2. The t-ratio for the difference-in-difference treatment effect is -11.7, implying a high level of statistical significance.
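The naive standard errors and t-ratios can be checked directly from the aggregate counts. The sketch below is illustrative Python under the assumptions stated in the text (binomial variance for each rate, all four rates treated as independent); the names are hypothetical and it is not the code used to produce Table 2.

```python
from math import sqrt

# Aggregate deaths and 1851 population from Table 1.
n = {"first12": 167_654, "next16": 300_149}
deaths = {
    ("first12", 1849): 2_261, ("first12", 1854): 2_458,
    ("next16", 1849): 3_905, ("next16", 1854): 2_547,
}

def rate(region, year):
    """Mortality rate as a proportion (not per 10,000)."""
    return deaths[(region, year)] / n[region]

def var_rate(region, year):
    """Binomial variance of an estimated rate: r(1 - r)/n."""
    r = rate(region, year)
    return r * (1 - r) / n[region]

# Within-region change over time and its naive standard error.
change, se_change = {}, {}
for region in n:
    change[region] = rate(region, 1854) - rate(region, 1849)
    se_change[region] = sqrt(var_rate(region, 1849) + var_rate(region, 1854))

# DiD and its naive standard error, treating the four rates as independent.
did = change["next16"] - change["first12"]
se_did = sqrt(se_change["first12"] ** 2 + se_change["next16"] ** 2)
t_did = did / se_did

print(f"SE first 12: {se_change['first12']:.2e}")
print(f"SE next 16:  {se_change['next16']:.2e}")
print(f"SE DiD:      {se_did:.2e}")
print(f"t-ratio DiD: {t_did:.1f}")
```

The standard errors are in rate units (proportions), which is why Table 2 reports them on the order of 1E-04 while the rates themselves are shown per 10,000.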

But the conclusion that the DiD treatment effect is highly significant is wrong; we will see below that the t-ratio is closer to 2.0, implying a p-value around 4%. The simple error analysis in Table 2 is wrong because it does not account for the substantial variation in Table 7. Mortality varies across both sub-districts and time:

2 Counts are converted to rates using the population by sub-district for 1851 (the year of the census closest to 1849 and 1854) from Snow’s Table VIII, shown in the fifth column of Table 7. For some reason Snow did not convert the counts in Table XII to rates, although he did in other instances. Displaying rates would have strengthened his argument, highlighting the “treatment effect,” and made his analysis closer to a modern DiD.

3 I do not use the “last four sub-districts” supplied only by the Lambeth company in 1854 because three of these sub-districts were not supplied by any water company in 1849: “It is necessary to observe, however, that the supply of the Lambeth Company has been extended to Streatham, Norwood, and Sydenham, since 1849, in which year these places were not supplied by any water company.” Snow [1855] p 89


For example, mortality in two of the joint Vauxhall/Lambeth sub-districts (Kennington 1st and Clapham) actually increases, by quite a bit, from 1849 to 1854.

2.2 Difference-in-Differences as Regression Equation

To move from the Rates in Table 2 to a regression framework we note that:

$$\ln(\text{Rate}) = \ln(\text{count}/\text{population})$$

and then write Count as a regression equation with error:

$$\ln(\text{count}_{subdist,yr}) = \mu + \delta_{54} \cdot I_{yr=1854} + \gamma_J \cdot I_{subdist=joint} + \beta \cdot I_{subdist=joint} \cdot I_{yr=1854} + \ln(\text{population}_{subdist,yr}) + \varepsilon \qquad (1)$$

$\delta_{54}$: Year effect – change in rate for 1854 versus 1849

$\gamma_J$: “Joint Region” effect – change in rate for jointly-supplied “Next 16” region versus Southwark-only

$\beta$: Treatment effect – change in rate for joint “Next 16” treatment region when treated (1854) versus untreated (1849)

$I_{yr=1854}$ etc.: Indicator or dummy variables that are 1 for year=1854, region=joint, etc.
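To see why the interaction coefficient is the DiD effect in logs, it helps to write out what Equation 1 implies for each of the four region-year cells. The following is a short worked derivation, setting the error term aside; the cell labels $S$ (Southwark-only) and $J$ (joint) and the notation $r_{\cdot,\cdot}$ for a cell's rate are introduced here only for illustration.

```latex
\begin{aligned}
\ln r_{S,49} &= \mu \\
\ln r_{S,54} &= \mu + \delta_{54} \\
\ln r_{J,49} &= \mu + \gamma_J \\
\ln r_{J,54} &= \mu + \delta_{54} + \gamma_J + \beta \\[4pt]
\left(\ln r_{J,54} - \ln r_{J,49}\right) - \left(\ln r_{S,54} - \ln r_{S,49}\right)
  &= \left(\delta_{54} + \beta\right) - \delta_{54} = \beta
\end{aligned}
```

The year effect $\delta_{54}$ and the region effect $\gamma_J$ both difference out, leaving $\beta$ as the log of the ratio of rate ratios.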

With Equation 1 we can examine and test various assumptions about the parameters and the error process. Assuming that the errors ε are Poisson-distributed gives a Poisson regression, while assuming the errors are Negative Binomial (a Gamma-mixture of Poissons that generalizes the Poisson by allowing the variance to be different from the mean) gives a Negative Binomial regression.

To address the question of statistical variability of the estimates in Table ?? we use the DiD Equation 1 with the data from Table 7. This regression framework allows us to use the substantial within-sample variability shown in Table 7 to assess our confidence in the observed reduction in mortality – applying what Stigler [2016] calls “intercomparison”.

2.3 Error Analysis and Stochastic Assumptions for DiD Regressions

Table 3 shows various versions of the DiD regression. Column 1 shows a regression assuming the error term ε in Equation 1 is Poisson. (The estimated coefficient -0.511 matches what we would obtain “by-hand” in Table 2 if we expressed rates and differences in logs.) The Poisson regression in column 1 shows a z-ratio of -13.2, compared with the “by-hand” t-ratio in Table 2 of -11.7.4
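Because the column 1 model has only the four DiD terms, its fitted cell rates equal the aggregate rates, so the coefficient and standard error can be checked from the four totals in Table 1. The sketch below is illustrative Python, not the R glm fit itself; it assumes the standard delta-method result that a Poisson log count has approximate variance 1/count, and the names are hypothetical.

```python
from math import log, sqrt

# Aggregate deaths and 1851 population from Table 1.
population = {"first12": 167_654, "next16": 300_149}
deaths = {
    ("first12", 1849): 2_261, ("first12", 1854): 2_458,
    ("next16", 1849): 3_905, ("next16", 1854): 2_547,
}

def log_rate(region, year):
    return log(deaths[(region, year)] / population[region])

# DiD in logs: the interaction coefficient beta in Equation 1.
beta = (log_rate("next16", 1854) - log_rate("next16", 1849)) - (
    log_rate("first12", 1854) - log_rate("first12", 1849)
)

# Under the Poisson, Var(log count) is approximately 1/count, and the four
# cell totals are treated as independent, so the variances simply add.
se_beta = sqrt(sum(1 / d for d in deaths.values()))
z = beta / se_beta

print(f"beta = {beta:.3f}, SE = {se_beta:.3f}, z = {z:.1f}")
```

The standard error here is driven entirely by the size of the counts, which is exactly the assumption the rest of this section calls into question.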

The Poisson regression, however, cannot account for the variation in the observed counts (variation across and within sub-districts) – what is called in the literature “over-dispersion”. This has little impact on the

4 The two calculations are based on somewhat different approaches: standard errors in Table 2 are based on rates being normal while the regressions in Table 3 are based on counts being Poisson, a close approximation to Binomial counts for the low rates here.


Table 3: Regressions for Sub-District Difference-in-Differences 1849 vs 1854

| | (1) Poisson | (2) Poisson, sub-district Fixed Effects | (3) Negative Binomial | (4) Negative Binomial, 2 Lambeth Effects | (5) Negative Binomial, 2 Lambeth + Fixed Effects |
|---|---|---|---|---|---|
| Single Treatment | -0.511 | -0.511 | -0.500 | -0.338 | -0.354 |
| standard error | 0.039 | 0.039 | 0.246 | 0.248 | 0.101 |
| z-ratio (coeff/SE) | -13.20 | -13.20 | -2.03 | -1.36 | -3.50 |
| Robust z-ratio | -2.43 | -2.18 | -2.17 | -1.40 | -1.57 |
| “More Lambeth” Treatment | | | | -1.132 | -1.189 |
| standard error | | | | 0.353 | 0.148 |
| z-ratio (coeff/SE) | | | | -3.20 | -8.05 |
| Robust z-ratio | | | | -3.84 | -4.52 |
| Joint region (single) control* | -0.036 | | -0.032 | -0.064 | |
| Joint region (more Lambeth) control* | | | | 0.059 | |
| Time control* | 0.084 | 0.084 | 0.057 | 0.057 | 0.076 |
| Residual Deviance | 1541.6 | 456.8 | 59.8 | 60.0 | 54.3 |
| p-value | 0.00% | 0.00% | 21.45% | 15.74% | 0.06% |
| theta (Gamma “size”) | | | 4.96 | 5.57 | 41.34 |
| Pseudo-R2 | 24.2% | 77.5% | 16.8% | 25.1% | 87.5% |

Deaths by sub-district from the full outbreaks of 1849 and 1854 for the 28 sub-districts (“first 12” Southwark-only and “next 16” jointly-supplied) shown in Snow [1855] Table XII and my Table 7, with population from Snow’s Table VIII. Total 56 observations. * = not significant at the 10% level (robust errors for Poisson regression). The Poisson and Negative Binomial regressions are fitted with the R function glm(family=poisson) and the glm.nb function from the MASS package. The parameter “theta” is the size or θ for a “parametrization (1)” Negative Binomial (see Coleman [2018] appendix). Robust standard errors are calculated with the R “sandwich” package, using the default “HC3” for the adjusted variance-covariance matrix (see Zeileis [2004] and the R “sandwich” manual).

estimated coefficient, but implies that the estimated standard errors are dramatically too small and the statistical significance over-estimated.

The large value for the “Residual deviance” shows that the Poisson model does not account for the within-sample variation. The residual deviance is a generalization of the sum of squared residuals, and it is also (twice) the difference in log-likelihood between a fully unrestricted model (“saturated” with one rate per observation) and the restricted Poisson model. As a difference in log-likelihood we can use it to test the restricted (Poisson) model – a large value relative to a chi-squared implies rejection, i.e. rejection of the Poisson distribution assumption. The 5% level for a chi-squared with 52 degrees of freedom is 69.8 while the observed residual deviance is much larger – 1,541.6. The p-value shown in Table 3 column 1 is less than 0.01%.
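The deviance test can be sketched without a statistics package. The following illustrative Python uses the Wilson-Hilferty approximation to the chi-squared quantile, which is a substitution made here for convenience (the paper's figures presumably come from an exact chi-squared routine); the function name is hypothetical. The degrees of freedom are 56 observations less the 4 parameters of Equation 1.

```python
from statistics import NormalDist

def chi2_quantile_wh(p, df):
    """Approximate chi-squared quantile via the Wilson-Hilferty transformation."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3

n_obs = 56      # 28 sub-districts x 2 years
n_params = 4    # mu, delta_54, gamma_J, beta in Equation 1
df = n_obs - n_params

critical_5pct = chi2_quantile_wh(0.95, df)
residual_deviance = 1541.6   # Table 3, column 1

# A deviance above the critical value rejects the Poisson assumption.
reject_poisson = residual_deviance > critical_5pct

print(f"df = {df}, 5% critical value = {critical_5pct:.1f}")
print(f"Reject Poisson: {reject_poisson}")
```

The approximation is accurate to well within the rounding of the quoted 69.8, and the observed deviance exceeds the critical value by a factor of more than twenty.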

A natural extension to the Poisson model of column 1 is to allow rates to vary by sub-district – effectively a fixed, finite mixture of Poissons. It is quite reasonable that differences in income or crowding or other characteristics might cause average mortality to differ by sub-district. Sub-district fixed effects will absorb all such differences. Including fixed effects (column 2), the Poisson assumption is still rejected. Figure 2 shows why: within-district variation is large. The empty circles show the predicted rate, with approximate 95% confidence bands based on the Poisson assumption. The red filled circles show the 1849 actual rate. The blue triangles show the 1854 actual rates (after adjusting for the estimated time and treatment effect, to graph them on the same basis as the 1849 predicted).

There should be roughly three observations outside the 95% bands (5% of 56 observations) but in fact there


[Figure 2, two panels: “First-12 Southwark-only, Poisson, 1849 vs 1854” and “Next-16 Jointly-Supplied, Poisson, 1849 vs 1854”. Horizontal axis: mortality per 10,000 population; vertical axis: sub-district. Legend: 1849 red circle, 1854 blue triangle, predicted.]

Figure 2: 1849 vs 1854 Difference-in-Differences Actual vs Predicted, Poisson Regression with Fixed Effects: Mortality per 10,000, Poisson Count Model, Different Rates for Sub-Districts, Predicted (with 95% confidence bands) and Actual 1849 & 1854 (Adjusted for Time and Single Treatment Effect)

are 31 or 55%. Sometimes 1849 is higher (as for sub-district 3, St. John, Horsleydown) and sometimes lower (as for sub-district 4, St. James, Bermondsey). There is no consistency in 1849 being higher or lower than 1854 – and there should not be since we are controlling for a time effect.
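To gauge how far 31 of 56 is from what the Poisson bands predict, one can compute a binomial tail probability. This is a rough illustrative sketch in Python, assuming (as a simplification not made in the text) that the 56 observations fall outside their 95% bands independently, each with probability 5%; the names are hypothetical.

```python
from math import comb

n_obs = 56              # sub-district-year observations
p_outside = 0.05        # nominal chance of falling outside a 95% band
observed_outside = 31   # count reported for Figure 2

expected_outside = n_obs * p_outside
share_outside = observed_outside / n_obs

# Upper-tail binomial probability P(X >= 31) under the nominal 5% rate.
tail_prob = sum(
    comb(n_obs, k) * p_outside**k * (1 - p_outside) ** (n_obs - k)
    for k in range(observed_outside, n_obs + 1)
)

print(f"Expected outside: {expected_outside:.1f}")
print(f"Observed share:   {share_outside:.0%}")
print(f"P(X >= {observed_outside}) = {tail_prob:.1e}")
```

Under these assumptions the tail probability is vanishingly small, consistent with the formal rejection of the Poisson model by the residual deviance.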

We conclude that mean mortality rates are not constant across sub-district or across time, and the observed variation in mean rates is substantially higher than predicted by the Poisson assumption. The Poisson model gives reasonable estimates of mean rates but misleading estimates of standard errors. The standard error estimates are too low because under the Poisson the variance is equal to the mean. Although the data show that the variation is higher than implied by the mean (the large residual deviance and Figure 2) the Poisson ignores this higher observed variation.

To estimate reasonable standard errors we can proceed in two directions. First, we can estimate robust standard errors using a “sandwich” estimator – see Zeileis [2004] and the R “sandwich” manual. I have estimated robust standard errors for all the regressions in Table 3 and robust z-ratios are reported.

The second direction is to generalize the error distribution in Equation 1. A common generalization is the Negative Binomial, which can be expressed as a random mixture of Poisson means, mixing with a Gamma distribution. This gives the Negative Binomial an additional parameter, the size or shape parameter θ of the Gamma mixing distribution. It breaks the link between mean and variance: For a Poisson with mean $m$ the variance is also $m$; for Negative Binomial with mean $m$ and size or shape $\theta$ the variance is $m + m^2/\theta$. (See the appendix to Coleman [2018] for a discussion of the variety of parameterizations; also McNeil et al. [2005] section 10.2.4 and A.2.7.) One could also consider other mixtures or other mechanisms to generate generalizations to the Poisson, although I will not do so here.
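A quick calculation shows how much extra variance the Negative Binomial allows at the scale of these data. The sketch below is illustrative Python; it uses the average count per sub-district-year (total deaths from Table 1 divided by 56 observations) as a single representative mean, which is a simplification since sub-district means differ, and takes θ = 4.96 from Table 3 column 3.

```python
from math import sqrt

theta = 4.96   # Gamma "size" from Table 3, column 3

# Illustrative mean count: average deaths per sub-district-year observation.
total_deaths = 6_166 + 5_005   # 1849 + 1854 totals from Table 1
n_obs = 56
m = total_deaths / n_obs

var_poisson = m                   # Poisson: variance equals the mean
var_negbin = m + m**2 / theta     # Negative Binomial: m + m^2 / theta

# Ratio of standard deviations, Negative Binomial versus Poisson.
sd_inflation = sqrt(var_negbin / var_poisson)

print(f"mean count:        {m:.1f}")
print(f"Poisson variance:  {var_poisson:.1f}")
print(f"NegBin variance:   {var_negbin:.1f}")
print(f"SD inflation:      {sd_inflation:.2f}")
```

With counts of this size the $m^2/\theta$ term dominates, so the implied standard deviation is several times the Poisson value; this is the same order of magnitude as the ratio of the Negative Binomial to Poisson standard errors in Table 3 (0.246 versus 0.039).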

The third column of Table 3 shows the data fitted with a Negative Binomial process. The residual deviance


is 59.8 with a p-value of 21%, implying that we cannot reject the Negative Binomial error process. Figure 3 shows why: the error bars are now wide enough to capture both the across and within sub-district variation.

[Figure 3, two panels: “First-12 Southwark-only, Negative Binomial(4.9562), 1849 vs 1854” and “Next-16 Jointly-Supplied, Negative Binomial(4.9562), 1849 vs 1854”. Horizontal axis: mortality per 10,000 population; vertical axis: sub-district. Legend: 1849 red circle, 1854 blue triangle, predicted.]

Figure 3: 1849 vs 1854 Difference-in-Differences Actual vs Predicted, Negative Binomial Regression: Mortality per 10,000, Negative Binomial, Same Rates for Sub-Districts & Single Treatment Effect, Predicted (with 95% confidence bands) and Actual 1849 & 1854 (Adjusted for Time and Single Treatment Effect)

With an error process that models the observed variation we have some confidence that the standard errors and the z-ratio for the Negative Binomial model, reported in the third column of Table 3, are reasonable. This is reinforced by the observation that the z-ratio for the Negative Binomial is roughly the same as the z-ratio using robust standard errors, and roughly the same as the robust z-ratio for the Poisson models.

2.4 Interpreting DiD Results for Snow’s Data

Column 3 of Table 3 shows that, assuming counts are Negative Binomial, the overall treatment effect is still large (coefficient -0.505, a factor of 1.66 reduction) but is estimated with only a moderate degree of precision (z-ratio 2.03, with large-sample p-value 4.2%). This is dramatically less than the Poisson z-ratio of 13.2 or the naive t-ratio of 11.7.

This is a rather stark contrast with the results of Table 2, but it is hard to escape the conclusion that the observed effect (roughly -0.5 in logs or a ratio effect of 1.6) is barely large enough to separate from the noise we see in mean mortality rates. It also demonstrates the importance of careful error analysis and good design: observations across sub-districts and across time give us the data to examine the error structure and uncover the modest statistical significance of the estimate.

How do we square the modest precision of the estimate with our modern knowledge that cholera is water-borne? A good part of the answer lies in the “treatment effect” itself. In Tables 2 and ??, and the regressions so far, we are comparing untreated sub-districts (no clean water, supplied by Southwark & Vauxhall only)


with “jointly-supplied” sub-districts (partial clean water, supplied by both the Southwark & Vauxhall and the Lambeth Companies). The estimated “treatment effect” is in fact a “partial treatment effect” estimated across heterogeneous sub-districts: Some jointly-supplied sub-districts have a large fraction “treated” – a large proportion of Lambeth customers – and others a small fraction treated.

Snow did highlight differences across sub-districts: “In certain sub-districts, where I know that the supply of the Lambeth Water Company is more general than elsewhere, as Christchurch, London Road, Waterloo Road 1st, and Lambeth Church 1st, the decrease of mortality in 1854 as compared with 1849 is greatest, as might be expected.” (Snow [1855] p. 89)

The regression framework allows us to extend the single partial treatment effect to two: a “more-Lambeth” effect (the four named sub-districts) and a “less-Lambeth” effect (the remaining jointly-supplied sub-districts).

The fourth and fifth columns show the Negative Binomial regressions with two treatment effects. The fourth column shows a very large and precisely-estimated more-Lambeth treatment: coefficient -1.132 or reduction by a factor of 3.1 with z-ratio of 3.20 and p-value 0.14%. The residual deviance is 60.0 with p-value of 15.7%, implying that the Negative Binomial fits the data. (More accurately, we cannot reject the Negative Binomial as a reasonable model of the error process.)

In 1855 Snow did not have information on the populations by sub-district supplied by the two water companies and so could only measure the “partial treatment effect”. Population information was published in Simon [1856] and was utilized by Snow in Snow [1856]. With the detailed population data we could extend the DiD analysis. First and most important, we could model the treatment effect as proportional to population weight, so that the effect in a sub-district varies with the fraction of population treated with clean water. Second, we could include “Other” (pumps, wells, the Thames, and ditches) as a category of water supply in addition to Southwark & Vauxhall (dirty) and Lambeth (clean in 1854). Doing so (see Coleman [2018]) gives a large treatment effect (ratio 4 or larger) that is precisely estimated (z-ratio 10 or larger). This is, in fact, an implementation of what Snow was trying to do in Snow [1856] Table VI.

Snow’s focus in Snow [1855, 1856] was on demonstrating “the overwhelming influence which the nature of the water supply exerted over the mortality, overbearing every other circumstance which could be expected to affect the progress of the epidemic” (Snow [1856] p. 248). The DiD regression in Table 3 column 4 (two Lambeth effects) goes some way to demonstrate this. A further question is to examine the “other circumstances”. We already have some indication that there are no systematic differences between the Southwark-only versus jointly-supplied sub-districts, since the “region controls” (the coefficient γJ in Equations ?? or 1) are small and statistically insignificant.

A more rigorous test for “other circumstances” is to include sub-district fixed effects which will capture any and all fixed differences across sub-districts – housing stock, crowding, proximity to the Thames, socio-economic status or class, income. The fifth column of Table 3 shows that fixed effects do not substantially change either the size or the statistical significance of the estimated “more Lambeth” effect. We have to be a little careful because the Negative Binomial does not fit the data well (the residual deviance is 54.3, which with 25 degrees of freedom is very large, p-value 0.06%). We need to either find a better representation for the error process or, as I will do here, rely on the robust standard errors. Using the robust standard errors gives a z-ratio for the “more Lambeth” effect of 4.5.

Regarding sub-district fixed effects, it is interesting to examine the pseudo-R-squared (R2 = 1 − ResidDev/NullDev), which increases from 25.1% to 87.5%. This shows that differences across sub-districts are important in accounting for differences in observed mortality. Nonetheless, clean water continues to show a large and significant effect.

2.5 Digression on Over-Dispersion and Randomization

For measuring standard errors in Table 2 or the Poisson regression in column 1 of Table 3 one assumes that counts are Binomial or Poisson. This seems like not only the natural but almost an inevitable assumption. At an individual level mortality is a Bernoulli process: alive or dead, 0 or 1. For a population (say within one sub-district) if the Bernoulli probability is constant, the aggregate deaths will be Binomial (the sum of same-probability Bernoullis is Binomial). When probabilities vary across the population (and we have every reason to think they will) the sum is more complicated, but we can turn instead to the Poisson. The Binomial is well-approximated by the Poisson for the low rates encountered here. The sum of independent Poissons is itself Poisson with overall rate equal to the mean across the population. So even with population heterogeneity in individual rates the aggregate rate should behave as a Poisson.

Furthermore, the South London data is effectively randomized within each sub-district. This should balance the treated and control groups with respect to observed and unobserved characteristics, presumably producing the same distribution of mortality (apart from treatment) and thus the same (Poisson) rates.

Yet we have clear evidence that rejects the Poisson assumption. Why?

In the DiD regressions there are two sources of variation in the rates. The first is variation across sub-districts. Such variation is not surprising since customers were not randomized across sub-districts. We could consider another “experiment” where we did randomize across sub-districts, but in any case the sub-district variation is easily handled within a Poisson framework by estimating sub-district fixed effects. These allow for a different mean in each sub-district’s population distribution.

But the data indicate this is not sufficient. It seems that the means and the within-sub-district distributions themselves change over time, and change differently across sub-districts. (If all sub-districts changed in the same manner, this would be captured by the time fixed effect.)

Randomization within sub-districts should balance the distribution of unobservables at a point in time but we are not randomizing across time. The distribution of unobservables may be the same but the effect of unobservables could vary over time. To fix the idea take a simple but artificial example: within sub-districts there are people with different tastes for drinking water versus tea (tea involves boiling and thus purifying water). This produces variation across sub-districts: a sub-district with many people preferring tea would have lower overall mortality and also less difference between treated and control. (Fixed effects would control for this.)

But we could easily envision changes from 1849 to 1854, say the price of tea falls. This would bring in marginal tea-drinkers and change mortality rates. The change from 1849 to 1854 would be different across sub-districts because of the differences in tastes across sub-districts. This would show up as variation in mortality that is not explained by the sub-district fixed effects; it would appear as random variation in rates. We would need to randomize across time to remove the differential impact in 1849 and 1854, and we cannot randomize across time.5

5 Contagion is another mechanism to generate non-Poisson rates. Dependence across individuals will generate non-Poisson counts, contagion generates dependence, and contagion within households and neighborhoods is to be expected in a cholera epidemic.


The conclusion is that randomization within sub-districts may remove confounding effects but can still produce random changes in mean rates across time (within sub-districts) and thus over-dispersion. The observed variation across time pushes us towards using the Negative Binomial distribution, essentially treating the sub-district mean rates as random over time (and possibly sub-district). For these South London data, we should therefore consider random variation in mortality and over-dispersion, rather than Poisson, as the default or null hypothesis.
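The mechanism can be illustrated with a small simulation. This is a sketch only, under assumed values: the mean count of 50 and the Gamma "size" of 12 are hypothetical choices for illustration, not estimates from Snow's data. When the Poisson mean is itself drawn from a Gamma distribution the counts are Negative Binomial, with variance equal to mean + mean^2/theta rather than the Poisson variance equal to the mean.

```python
import random
import statistics


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def dispersion(counts):
    """Variance-to-mean ratio; close to 1 for a Poisson sample."""
    return statistics.variance(counts) / statistics.fmean(counts)


rng = random.Random(1854)   # fixed seed so the run is reproducible
mean_deaths = 50.0          # hypothetical mean count (assumption)
theta = 12.0                # hypothetical Gamma "size" (assumption)
n_draws = 4000

# Fixed rate: every draw shares the same Poisson mean.
fixed = [poisson_draw(rng, mean_deaths) for _ in range(n_draws)]

# Random rate: the mean is Gamma(shape=theta, scale=mean/theta) on each draw,
# which makes the resulting counts Negative Binomial.
mixed = [
    poisson_draw(rng, rng.gammavariate(theta, mean_deaths / theta))
    for _ in range(n_draws)
]

disp_fixed = dispersion(fixed)
disp_mixed = dispersion(mixed)
print(f"fixed rate : variance/mean = {disp_fixed:.2f}")
print(f"random rate: variance/mean = {disp_mixed:.2f}")
```

With these assumed values the theoretical variance-to-mean ratio under random rates is 1 + 50/12, roughly 5, so standard errors computed under the Poisson assumption would be too small by a factor of a little more than 2.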

3 Quasi-Randomized Trial Comparison

3.1 Table IX – Simple Randomized Comparison

In Table IX Snow directly compared those exposed to dirty versus clean water in 1854 – what we would now consider a randomized control trial. Snow is very modern in his recognition of the value of the South London quasi-randomized observations: “it is obvious that no experiment could have been devised which would more thoroughly test the effect of water supply on the progress of cholera than this, which circumstances placed ready made before the observer.” (Snow [1855] p. 75)

Customers, particularly in the jointly-supplied sub-districts, were according to Snow effectively randomized:

In many cases a single house has a supply different from that on either side. Each company supplies both rich and poor, both large houses and small; there is no difference in the condition or occupation of the persons receiving the water of the different companies ... No fewer than three hundred thousand people of both sexes, of every age and occupation, and of every rank and station, from gentlefolks down to the very poor, were divided into two groups without their choice, and, in most cases, without their knowledge; one group being supplied water containing the sewage of London, and amongst it, whatever might have come from the cholera patients, the other group having water quite free from such impurity. Snow [1855] p. 75.

The Registrar General collected the official mortality data weekly by address, and deaths were published by sub-district (sub-district populations varied from 5,000 to 27,000). Snow recognized, however, that a comparison required the water source or water supplier for each death: “To turn this grand experiment to account, all that was required was to learn the supply of water [the company, either the Southwark & Vauxhall or the Lambeth Company] to each individual house where a fatal attack of cholera might occur.” (p. 75)

Snow himself collected the supplier data (with some assistance from John Joseph Whiting) for the seven weeks ending August 26th, by visiting houses to ascertain the water supply (Snow [1855] pp 75-80). He reported the results in Snow [1855] Table VIII and Snow [1856] Tables I, II, and III. He also needed the population-at-risk to calculate mortality rates. In 1855 Snow had data only on housing by supplier and that only at the overall regional level. He could therefore only calculate mortality per house for the overall region, which he did in Table IX, reproduced here as Table 4.

Table 4: Houses, Deaths, and Mortality per 10,000 Households, First Seven Weeks of 1854 Cholera Epidemic – Table IX Snow [1855] p 86

| Water Supplier | Number of houses | Deaths from Cholera | Deaths in each 10,000 houses |
|---|---|---|---|
| Southwark & Vauxhall Co supply | 40,046 | 1,263 | 315 |
| Lambeth Co supply | 26,107 | 98 | 38 |
| Rest of London | 256,423 | 1,422 | 59 |

Note that this corrects a rounding error in the “Deaths in each 10,000 houses” for Lambeth in Snow’s original table.

Data published in 1856 provided estimates on the geographic distribution of houses and population by water supplier (Simon [1856], used in Snow [1856]). House and population estimates by the 32 detailed sub-districts have substantive problems (discussed in Snow [1856] pp. 245-246). Data by the nine Districts have lesser, though still some, problems (two districts show higher populations assigned to the two water companies than shown in the 1851 census – implying negative populations sourcing from pump-wells, ditches, etc.). Table 5 shows Snow’s 1855 Table IX re-cast using the updated aggregate housing data. House totals are lower because of better estimates by supplier. Deaths from cholera are higher because the 22 un-ascertained deaths (deaths that Snow could not directly attribute to either the Southwark & Vauxhall or the Lambeth Company) have been assigned to supplier at the district level proportionally to the ascertained cases (as suggested by Snow [1856] p. 247).6

Table 5: Houses, Deaths, and Mortality per 10,000 Households, First Seven Weeks of 1854 Cholera Epidemic – Data Table III Snow [1856] p 86

| Water Supplier | Number of houses | Deaths from Cholera | Deaths in each 10,000 houses |
|---|---|---|---|
| Southwark & Vauxhall Co supply | 39,315 | 1,282 | 326.1 |
| Lambeth Co supply | 24,829 | 99 | 39.9 |
| Reduction in Mortality due to clean water | | | -286.2 |
| Naive t-ratio | | | -29.2 |

The mortality rate upon treatment falls by 0.02862 or 286.2 per 10,000 houses. This is a large decrease but we need to determine whether it is statistically significant: Could it arise from mere chance? Just as in the DiD analysis we need to determine the standard error and thus statistical significance of the estimate. And just as with the DiD analysis, a detailed error analysis shows that our initial and naive approach is quite wrong.

The naive approach mirrors that for the DiD Table 2 above: test the difference in rates of 286.2 or 0.02862 with a t-test for the equality of two means. Under a simple Binomial assumption the individual rates will each be normally distributed with variance $r(1-r)/n$. The difference will be normal with standard error

$$SE(r_1 - r_2) = \sqrt{\frac{r_1(1-r_1)}{n_1} + \frac{r_2(1-r_2)}{n_2}}.$$

Using the rates and number of houses in Table 5 gives a standard error for the difference of 0.00098 and a t-ratio of 29.2 (with degrees of freedom large enough to assume normality). But this is wrong and the t-ratio is closer to 10 than 29. Just as with DiD the assumption that the counts are Binomial (or Poisson) is simply not consistent with the data.
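The naive calculation can be reproduced from the house and death counts in Table 5. The sketch below applies the Binomial standard-error formula stated in the text; the dictionary keys and variable names are illustrative only.

```python
import math

# Houses and cholera deaths by water supplier, from Table 5.
houses = {"southwark_vauxhall": 39315, "lambeth": 24829}
deaths = {"southwark_vauxhall": 1282, "lambeth": 99}


def rate(supplier):
    """Deaths per house for one supplier."""
    return deaths[supplier] / houses[supplier]


def binomial_variance(supplier):
    """Variance of the estimated rate under a simple Binomial assumption: r(1-r)/n."""
    r = rate(supplier)
    return r * (1.0 - r) / houses[supplier]


diff = rate("southwark_vauxhall") - rate("lambeth")
se_diff = math.sqrt(
    binomial_variance("southwark_vauxhall") + binomial_variance("lambeth")
)
t_ratio = diff / se_diff

print(f"difference per 10,000 houses: {10000 * diff:.1f}")
print(f"naive standard error        : {se_diff:.5f}")
print(f"naive t-ratio               : {t_ratio:.1f}")
```

The calculation uses only the two aggregate counts per supplier, which is exactly why it cannot reveal the variation across Districts that the regressions pick up.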

6 Koch and Denike [2006] deserve special mention here. They discuss the un-ascertained cases and Snow [1856]. Their goal is highly worthwhile, to address “a previously unacknowledged methodological and conceptual problem in Snow’s 1856 argument.” They focus on the 623 un-ascertained deaths (for the full 1854 epidemic, versus the 22 for the first seven weeks mentioned above). Unfortunately, Koch and Denike seem to mis-read or mis-interpret Snow’s data and analysis, conflating uncertainty over supplier (a recognized problem for which Snow proposed a solution) with uncertainty over location (which to my knowledge was never a problem with the data used by Snow). This misunderstanding leads them to reassign the 623 deaths across districts, thus unjustifiably altering the geographic patterns of mortality. This unfortunately invalidates their analysis and conclusions. The details are discussed in Coleman [2019] (and Coleman [2018]).


3.2 Randomized Comparison Error Analysis

The comparison of randomized customers here uses data from the first seven weeks, ending 26 August 1854 (Snow [1856] Table III), in contrast to the DiD analysis above that uses data for the full 1849 and 1854 outbreaks (Snow [1855] Table XII). I limit to the first seven weeks to minimize the number of un-ascertained cases (Snow was more careful in collecting data than were the registrar agents after 26 August) and I analyze mortality per household to match Snow [1855] Table IX. (In Coleman [2018] I examine data for the full epidemic and mortality per person, data from Snow [1856] Table V, and the conclusions are the same.)

Table 6 shows the results for count regressions. The Lambeth (treatment) effect for the Poisson column 1 matches the results of Table 5: a log coefficient of -2.101 corresponds to a reduction in mortality by the ratio 8.17 = exp(2.101), the same as the ratio in rates 8.17 = 326.1/39.9. The Poisson z-ratio is 20.15 but we cannot rely on this; the Poisson process does not capture the variation in the data as we can see from the large residual deviance (1,541.6 with p-value less than 0.01%).

Table 6: Regressions for Randomized Comparison by District, Seven Weeks Ending 26 August 1854

| | (1) Poisson | (2) Poisson, District Fixed Effects | (3) Negative Binomial | (4) Negative Binomial + Housing Density |
|---|---|---|---|---|
| Lambeth (treatment) Effect | -2.101 | -2.027 | -2.099 | -2.097 |
| standard error | 0.104 | 0.107 | 0.194 | 0.177 |
| z-ratio (coeff/SE) | -20.15 | -18.93 | -10.84 | -11.86 |
| Robust z-ratio | -9.87 | -6.90 | -8.56 | -9.20 |
| Housing Density | | | | 0.215 |
| z-ratio (coeff/SE) | | | | 2.07 |
| Robust z-ratio | | | | 1.24 |
| Residual Deviance | 114.9 | 11.8 | 18.2 | 17.3 |
| p-value | 0.00% | 6.69% | 19.60% | 18.75% |
| theta (Gamma “size”) | | | 12.08 | 16.42 |
| Pseudo-R2 | 86.4% | 98.5% | 85.9% | 89.3% |

Deaths by District from the first seven weeks of 1854 for the 9 Districts shown in Snow [1856] Table III and my Table 8, by household. The Poisson and Negative Binomial regressions are fitted with the R function glm (family=poisson) and the glm.nb function from the MASS package. The parameter “theta” is the size or θ for a “parametrization (1)” Negative Binomial (see Coleman [2018] appendix). Robust standard errors are calculated with the R “sandwich” package, using the default “HC3” for the adjusted variance-covariance matrix (see Zeileis [2004] and the R “sandwich” manual).
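Column 1 of Table 6 can be checked by hand. The sketch below assumes that the column 1 model contains only an intercept, a Lambeth indicator, and a log(houses) offset (an assumption about the specification, not something stated in the table notes). Under that assumption the Poisson maximum-likelihood estimates depend only on the supplier totals in Table 5: the coefficient is the log of the rate ratio and its standard error is sqrt(1/d1 + 1/d2), where d1 and d2 are the two death counts.

```python
import math

# Supplier totals from Table 5 (houses, cholera deaths).
houses_sv, deaths_sv = 39315, 1282      # Southwark & Vauxhall
houses_lam, deaths_lam = 24829, 99      # Lambeth

# Log rate ratio: Lambeth (treated) relative to Southwark & Vauxhall.
coef = math.log((deaths_lam / houses_lam) / (deaths_sv / houses_sv))

# Large-sample Poisson standard error of a log rate ratio.
se = math.sqrt(1.0 / deaths_sv + 1.0 / deaths_lam)

z_ratio = coef / se
rate_ratio = math.exp(-coef)            # reduction factor, dirty versus clean

print(f"coefficient   : {coef:.3f}")
print(f"standard error: {se:.3f}")
print(f"z-ratio       : {z_ratio:.2f}")
print(f"rate ratio    : {rate_ratio:.2f}")
```

The coefficient, standard error, and z-ratio agree with the reported column 1 values, and the rate ratio agrees with the 8.17 quoted in the text up to rounding of the inputs. The Poisson standard error therefore reflects nothing beyond the two death totals.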

The data in Table 8 (reproducing Snow [1856] Table III) show that there is considerable variation both in the level of mortality for Southwark & Vauxhall customers (a high of 478.9 for St. Saviour and low of 204.8 for Wandsworth) and in the ratio of dirty versus clean (Southwark versus Lambeth mortality – high of 14.8 for Newington to a low of 6.0 for the District of Lambeth). The Poisson assumption simply does not capture the observed within-sample variation.
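The variation across Districts can be computed from the houses and deaths by supplier in Table 8. This is a rough sketch: the Pearson dispersion statistic at the end is an illustrative diagnostic added here (it is not the residual deviance reported in Table 6), and the ratio is restricted to Districts with at least one Lambeth death so that it is defined.

```python
# District: (houses S&V, houses Lambeth, deaths S&V, deaths Lambeth), from Table 8.
districts = {
    "St. Saviour, Southwark": (2631, 1689, 126, 13),
    "St. Olave, Southwark": (2193, 0, 91, 0),
    "Bermondsey": (8402, 268, 266, 0),
    "St. George, Southwark": (3419, 3183, 134, 20),
    "Newington": (5224, 5473, 155, 11),
    "Lambeth": (8077, 11763, 176, 43),
    "Wandsworth": (3028, 618, 62, 1),
    "Camberwell": (4005, 1835, 185, 9),
    "Rotherhithe": (2336, 0, 68, 0),
}

# Mortality per 10,000 houses for Southwark & Vauxhall customers.
sv_rate = {d: 10000 * v[2] / v[0] for d, v in districts.items()}

# Dirty-versus-clean mortality ratio, where Lambeth deaths are non-zero.
ratio = {
    d: (v[2] / v[0]) / (v[3] / v[1])
    for d, v in districts.items()
    if v[3] > 0
}

high_sv = max(sv_rate, key=sv_rate.get)
low_sv = min(sv_rate, key=sv_rate.get)
high_ratio = max(ratio, key=ratio.get)
low_ratio = min(ratio, key=ratio.get)

# Pearson dispersion for S&V deaths under a common Poisson rate:
# sum of (observed - expected)^2 / expected, divided by degrees of freedom.
common_rate = sum(v[2] for v in districts.values()) / sum(
    v[0] for v in districts.values()
)
pearson = sum(
    (v[2] - v[0] * common_rate) ** 2 / (v[0] * common_rate)
    for v in districts.values()
)
pearson_dispersion = pearson / (len(districts) - 1)

print(f"S&V mortality: high {sv_rate[high_sv]:.1f} ({high_sv}), "
      f"low {sv_rate[low_sv]:.1f} ({low_sv})")
print(f"ratio: high {ratio[high_ratio]:.1f} ({high_ratio}), "
      f"low {ratio[low_ratio]:.1f} ({low_ratio})")
print(f"Pearson dispersion (S&V): {pearson_dispersion:.1f}")
```

A dispersion value near 1 would be consistent with a common Poisson rate; a value well above 1 indicates more variation across Districts than the Poisson assumption allows.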

Introducing District fixed effects is more problematic here than with the DiD analysis above, because we do not have repeated observations across time. We can reject the Poisson assumption with fixed effects at the 10% level but not the 5% level (the p-value is 6.69%); the robust standard errors show a reduction relative to the Poisson standard errors. On balance fixed effects do not appear very useful.

A Negative Binomial error process seems to be consistent with the data, and implies a z-ratio of 10.8 instead of 20 for the Poisson error process. The conclusion, just as with the DiD, is that the within-sample variation in mortality implies substantially lower statistical significance than we would naively assume given the large sample size – in this case a total of 55,144 houses. Instead of a z-ratio of 29 or 20 we find a z-ratio closer to 10 or 11, with similar results using either robust standard errors for the Poisson or the Negative Binomial error process.

As for the DiD analysis we would like to test for the effect of factors that vary across sub-districts. Here we do not have repeated observations across District so that fixed effects are less appropriate. We can, however, examine whether crowding has an effect since Snow [1856] Table III reports housing density. Column 4 of Table 6 shows that the effect is at best marginally significant and does not produce any substantial improvement in the pseudo-R2.

The DiD analysis above shows the contribution of variation in mortality across time (within sub-district), variation that we do not measure here. As a result we might suspect that the estimated standard errors (based only on 1854 observations) would be higher if we could also measure variation across time. These standard errors are within-sample for 1854, but the DiD analysis indicates we might find higher standard errors for an extended sample.

4 Conclusion

The primary goals for any statistical analysis of cholera mortality data are two-fold: first to remove or account for confounding factors and produce an unbiased estimate, and second to assess the reliability or statistical significance of the estimate. The circumstances of water supply to South London customers provide strong arguments that many if not all confounding factors can be removed or controlled for. First, customers were closely mixed, thus removing many confounding factors. Second, for the DiD analysis we have observations both before and after Lambeth customers were supplied with clean water, providing a natural control for confounding factors.

Although mixing or effective randomization across customers gives confidence that our estimated treatment effects are not biased by confounding factors, randomization does not imply that the error process is Binomial or Poisson and that we can rely on such a statistical assumption in measuring the standard errors. In fact the within-sample variation observed in Snow’s data implies standard errors larger (with lower statistical significance) than if we assumed a simple Binomial or Poisson process for the large sample sizes (467,803 people for the DiD analysis and 55,144 houses for the direct comparison).

Snow’s use of both DiD comparison (before versus after) and direct comparison, together with the design of the “experiment”, is important. The DiD analysis finds a treatment effect that is statistically significant even incorporating the cross-region and cross-time variation (when measuring a “more Lambeth” effect). Because we are testing the effect “against reality in a variety of settings” we have more confidence that the measured effect will hold when confronted with out-of-sample variability across regions and time.

The direct comparison is closer to a modern clinical trial – randomization of subjects. The randomization and direct comparison of treated versus control subjects is a powerful tool for controlling and removing confounding factors, providing strong evidence here that there is a statistically and substantively significant effect of drinking clean water. But the DiD analysis argues that over-dispersion may remain in the randomized data and that incorporating within-sample variation is critical.

The two studies – DiD and direct comparison of randomized subjects – mutually support and strengthen Snow’s claim for the “influence which the nature of the water supply exerted over the mortality.” (Snow [1856] p 248) This essay reinforces Freedman’s claim for “Snow’s work ... [being] a success story for scientific reasoning based on nonexperimental data,” and that “statistical technique can seldom be an adequate substitute for good design, relevant data, and testing predictions against reality in a variety of settings.” (Freedman [1991] p 291)

5 Appendix – Selected Tables from Snow [1855, 1856]

References

Joshua D. Angrist and Jorn-Steffen Pischke. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press, 1st edition, December 2008.

Joshua D. Angrist and Jorn-Steffen Pischke. Mastering ’Metrics: The Path from Cause to Effect. Princeton University Press, Princeton; Oxford, December 2014. ISBN 978-0-691-15284-4.

Thomas S. Coleman. Causality in the Time of Cholera: John Snow as a Prototype for Causal Inference. SSRN Scholarly Paper ID 3262234, Social Science Research Network, Rochester, NY, October 2018. URL https://papers.ssrn.com/abstract=3262234.

Thomas S. Coleman. A Note on Koch & Denike’s Analysis of John Snow’s 1856 "Cholera in the south district of London". SSRN Scholarly Paper, Social Science Research Network, Rochester, NY, July 2019.

David Freedman. Statistical Models and Shoe Leather. Sociological Methodology, 21:291–313, 1991. ISSN 0883-4237, 2168-8745. doi: 10.2307/270939. URL https://www-jstor-org.proxy.uchicago.edu/stable/270939.

David Freedman. From association to causation: some remarks on the history of statistics. Statistical Science, 14(3):243–258, August 1999. ISSN 0883-4237, 2168-8745. doi: 10.1214/ss/1009212409. URL https://projecteuclid.org/euclid.ss/1009212409.

Sandra Hempel. The Strange Case of the Broad Street Pump: John Snow and the Mystery of Cholera. University of California Press, Berkeley, first edition, January 2007. ISBN 978-0-520-25049-9.

Steven Johnson. The Ghost Map: The Story of London’s Most Terrifying Epidemic–and How It Changed Science, Cities, and the Modern World. Riverhead Books, New York, reprint edition, October 2007. ISBN 978-1-59448-269-4.

Thomas Koch and Kenneth Denike. Rethinking John Snow’s South London study: A Bayesian evaluation and recalculation. Social Science & Medicine, 63(1):271–283, July 2006. ISSN 0277-9536. doi: 10.1016/j.socscimed.2005.12.006. URL http://www.sciencedirect.com/science/article/pii/S0277953605006933.

K. S. McLeod. Our sense of Snow: the myth of John Snow in medical geography. Social Science & Medicine (1982), 50(7-8):923–935, April 2000. ISSN 0277-9536.

Alexander McNeil, Rudiger Frey, and Paul Embrechts. Quantitative Risk Management. Princeton University Press, Princeton, NJ, 2005.


Kenneth J. Rothman. Epidemiology: an introduction. Oxford University Press, New York, N.Y., 2002. ISBN 978-0-19-513553-4.

John Simon, editor. Report on the last two cholera-epidemics of London: as affected by the consumption of impure water addressed to the Rt. Hon. The President of the General Board of Health, by the Medical Officer of the Board. Printed by Eyre and Spottiswoode, for HMSO, London, 1856. URL https://collections-nlm-nih-gov.proxy.uchicago.edu/catalog/nlm:nlmuid-0260772-bk. OCLC: 14531255.

John Snow. On the mode of communication of cholera. John Churchill, London, 1849. OCLC: 14550757.

John Snow. On the mode of communication of cholera. London: John Churchill, 2nd edition, 1855. URL http://archive.org/details/b28985266.

John Snow. Cholera and the water supply in the south district of London in 1854. Journal of Public Health and Sanitary Review, 2:239–257, October 1856. URL http://www.ph.ucla.edu/epi/snow/cholerawatersouthlondon.html.

Stephen M. Stigler. The seven pillars of statistical wisdom. Harvard University Press, Cambridge, Massachusetts, 2016. ISBN 9780674088917 (pbk.: alk. paper).

Edward R. Tufte. Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, 1st edition, February 1997. ISBN 978-1-930824-15-7. URL https://www.edwardtufte.com/tufte/books_visex.

Peter Vinten-Johansen, Howard Brody, Nigel Paneth, Stephen Rachman, and Michael Russell Rip. Cholera, Chloroform and the Science of Medicine: A Life of John Snow. Oxford University Press, Oxford; New York, 1st edition, May 2003. ISBN 978-0-19-513544-2.

Westminster and London School of Hygiene and Tropical Medicine, editors. Report on the cholera outbreak in the parish of St. James, Westminster, during the autumn of 1854. J. Churchill, London, 1855.

Achim Zeileis. Econometric Computing with HC and HAC Covariance Matrix Estimators. Journal of Statistical Software, 11, November 2004. doi: 10.18637/jss.v011.i10. URL https://www.jstatsoft.org/article/view/v011i10.

18

Page 19: JohnSnowandCholera: RevisitingDifference-in-Differencesand ...hilerun.org/econ/papers/snow/SSM_DiD.pdf · observation) and the restricted Poisson model. As a difference in log-likelihood

Table 7: Snow’s Table XII – Deaths from Cholera in 1849 & 1854 (Snow [1855] p 90) with Population in 1851 from Table VIII (p 85)

| No. | Sub-District | Deaths from Cholera in 1849 | Deaths from Cholera in 1854 | Water Supplier | Population in 1851 |
|---|---|---|---|---|---|
| 1 | St. Saviour, Southwark | 283 | 371 | Southwark & Vauxhall | 19,709 |
| 2 | St. Olave, Southwark | 157 | 161 | Southwark & Vauxhall | 8,015 |
| 3 | St. John, Horsleydown | 192 | 148 | Southwark & Vauxhall | 11,360 |
| 4 | St. James, Bermondsey | 249 | 362 | Southwark & Vauxhall | 18,899 |
| 5 | St. Mary Magdalen | 259 | 244 | Southwark & Vauxhall | 13,934 |
| 6 | Leather Market | 226 | 237 | Southwark & Vauxhall | 15,295 |
| 7 | Rotherhithe | 352 | 282 | Southwark & Vauxhall | 17,805 |
| 8 | Battersea | 111 | 171 | Southwark & Vauxhall | 10,560 |
| 9 | Wandsworth | 97 | 59 | Southwark & Vauxhall | 9,611 |
| 10 | Putney | 8 | 9 | Southwark & Vauxhall | 5,280 |
| 11 | Camberwell | 235 | 240 | Southwark & Vauxhall | 17,742 |
| 12 | Peckham | 92 | 174 | Southwark & Vauxhall | 19,444 |
| 13 | Christchurch, Southwark | 256 | 113 | Southwark & Vauxhall and Lambeth | 16,022 |
| 14 | Kent Road | 267 | 174 | Southwark & Vauxhall and Lambeth | 18,126 |
| 15 | Borough Road | 312 | 270 | Southwark & Vauxhall and Lambeth | 15,862 |
| 16 | London Road | 257 | 93 | Southwark & Vauxhall and Lambeth | 17,836 |
| 17 | Trinity, Newington | 318 | 210 | Southwark & Vauxhall and Lambeth | 20,922 |
| 18 | St. Peter, Walworth | 446 | 388 | Southwark & Vauxhall and Lambeth | 29,861 |
| 19 | St. Mary, Newington | 143 | 92 | Southwark & Vauxhall and Lambeth | 14,033 |
| 20 | Waterloo Road (1st) | 193 | 58 | Southwark & Vauxhall and Lambeth | 14,088 |
| 21 | Waterloo Road (2nd) | 243 | 117 | Southwark & Vauxhall and Lambeth | 18,348 |
| 22 | Lambeth Church (1st) | 215 | 49 | Southwark & Vauxhall and Lambeth | 18,409 |
| 23 | Lambeth Church (2nd) | 544 | 193 | Southwark & Vauxhall and Lambeth | 26,784 |
| 24 | Kennington (1st) | 187 | 303 | Southwark & Vauxhall and Lambeth | 24,261 |
| 25 | Kennington (2nd) | 153 | 142 | Southwark & Vauxhall and Lambeth | 18,848 |
| 26 | Brixton | 81 | 48 | Southwark & Vauxhall and Lambeth | 14,610 |
| 27 | Clapham | 114 | 165 | Southwark & Vauxhall and Lambeth | 16,290 |
| 28 | St. George, Camberwell | 176 | 132 | Southwark & Vauxhall and Lambeth | 15,849 |
| 29 | Norwood | 2 | 10 | Lambeth | 3,977 |
| 30 | Streatham | 154 | 15 | Lambeth | 9,023 |
| 31 | Dulwich | 1 | 0 | Lambeth | 1,632 |
| 32 | Sydenham | 5 | 12 | Lambeth | 4,501 |
| | First 12 sub-districts | 2,261 | 2,458 | Southwark & Vauxhall | 167,654 |
| | Next 16 sub-districts | 3,905 | 2,547 | Southwark & Vauxhall and Lambeth | 300,149 |
| | Last 4 sub-districts | 162 | 37 | Lambeth | 19,133 |
| | TOTAL | 6,328 | 5,042 | | 486,936 |


Table 8: Mortality During the First Seven Weeks of the Epidemic, Arranged in Districts, from Snow [1856] Table III

| District | Houses 1851 | Pop 1851 | Pop Density | Houses Supplied by Southw | Houses Supplied by Lamb | Deaths 1854 Southw | Deaths 1854 Lamb | Deaths 1854 Thames | Deaths 1854 Pumps | Deaths 1854 Un-ascert | Deaths 1854 Total | Calculated Mortality Southw | Calculated Mortality Lamb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| St. Saviour, Southwark | 4,600 | 35,731 | 7.8 | 2,631 | 1,689 | 126 | 13 | 10 | 0 | 1 | 150 | 478.9 | 77.0 |
| St. Olave, Southwark | 2,360 | 19,375 | 8.2 | 2,193 | 0 | 91 | 0 | 8 | 0 | 5 | 104 | 415.0 | |
| Bermondsey | 7,007 | 48,128 | 6.9 | 8,402 | 268 | 266 | 0 | 25 | 0 | 0 | 291 | 316.6 | 0.0 |
| St. George, Southwark | 6,992 | 51,824 | 7.4 | 3,419 | 3,183 | 134 | 20 | 0 | 0 | 3 | 157 | 391.9 | 62.8 |
| Newington | 10,458 | 64,816 | 6.2 | 5,224 | 5,473 | 155 | 11 | 0 | 1 | 2 | 169 | 296.7 | 20.1 |
| Lambeth | 20,447 | 139,325 | 6.8 | 8,077 | 11,763 | 176 | 43 | 8 | 7 | 9 | 243 | 217.9 | 36.6 |
| Wandsworth | 8,276 | 50,764 | 6.1 | 3,028 | 618 | 62 | 1 | 16 | 17 | 0 | 96 | 204.8 | 16.2 |
| Camberwell | 9,412 | 54,667 | 5.8 | 4,005 | 1,835 | 185 | 9 | 0 | 2 | 1 | 197 | 461.9 | 49.0 |
| Rotherhithe | 2,792 | 17,805 | 6.4 | 2,336 | 0 | 68 | 0 | 35 | 0 | 0 | 103 | 291.1 | |
| Lewisham (Sub-district of Sydenham) | 801 | 4,501 | 5.6 | 0 | | 0 | 1 | 0 | 2 | 1 | 4 | | |
| Not identified | | | 6.6 | 411 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | | |
| TOTAL | 73,145 | 486,936 | 6.7 | 39,726 | 24,854 | 1,263 | 98 | 102 | 29 | 22 | 1,514 | 317.9 | 39.4 |

This partially reproduces Snow [1856] Table III. “Pop 1851” is the population estimate from the 1851 Census, reported in various tables in Snow [1855, 1856]. Death counts for 1854 are from the Registrar-General with assignment to supplier by Snow and Whiting, reported in Snow [1855] Table I and II and matching Snow [1855] Table VIII. “Calculated Mortality” are the mortality per 10,000 households. I calculate mortality per household (instead of per person as reported in Table III) to match the analysis in Snow [1855] Table IX and in the text. For brevity I do not include the population estimates by supplier reported in Table III. See https://github.com/tscoleman/SnowCholera for the complete data.
