Choices in statistical graphics: My stories

Andrew Gelman
Department of Statistics and Department of Political Science
Columbia University

New York Data Visualization Meetup
14 Jan 2013

Andrew Gelman Choices in statistical graphics: My stories

My earlier talk on tradeoffs in statistical graphics

• Originally: Infovis vs. statistical graphics
    • The best information visualizations are grabby, visually striking
    • The best statistical graphics reveal patterns and discrepancies
    • Different goals, different looks
• Lots of negative reactions
    • (Some) infovis people felt we were trivializing their work
    • (Some) statisticians felt we gave infovis too much respect
• Our new theme: tradeoffs in statistical graphics

We did not come to mock . . .

Instead, compare a bare-bones infographic . . .

To a corresponding statistical graphic . . .

Another example . . .

The statistician’s version . . .

A legendary early infographic . . .

How we would display it . . .

For those of you reading this talk off the web

• I’m not saying that the boring plots (constructed by Antony Unwin and myself using R) are better than Florence Nightingale’s beautiful images!
• Rather, I’m saying that Nightingale’s graphic and ours serve different purposes:
    • She dramatizes the problem with a unique and visually appealing image that draws the casual viewer in deeper
    • We display the data to reveal patterns, for viewers who are already interested in the problem
• In any case, this is not my main point today. We’ll spend most of our time discussing the choices involved in graphs that I’ve made over the years.
• Now, back to our regularly scheduled presentation . . .

General theme

• All graphs are comparisons
• All of statistics are comparisons

Specific recommendations

• Multiple plots per page (small multiples)
• Don’t clutter each plot
• Line plots are great: they facilitate more comparisons
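For readers following along off the web, here is a minimal sketch of how these three recommendations might fit together in code. It is written in Python with matplotlib rather than the R used for the plots in this talk, and everything in it is an assumption made for illustration: the helper names (`panel_grid`, `shared_limits`, `plot_small_multiples`), the group labels, and the made-up numbers. The idea is one uncluttered line per panel, many panels per page, and a common scale so that panels can be compared with each other.

```python
import math


def panel_grid(n_panels, ncols=3):
    """Return (nrows, ncols) for a grid big enough to hold n_panels small plots."""
    if n_panels < 1:
        raise ValueError("need at least one panel")
    ncols = min(ncols, n_panels)
    return math.ceil(n_panels / ncols), ncols


def shared_limits(series_by_panel, pad=0.05):
    """Common y-range across all panels, so heights are comparable between plots."""
    values = [y for ys in series_by_panel.values() for y in ys]
    lo, hi = min(values), max(values)
    margin = (hi - lo) * pad
    return lo - margin, hi + margin


def plot_small_multiples(x, series_by_panel, ncols=3):
    """Draw one plain line plot per group on shared axes (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the helpers above run without it

    nrows, ncols = panel_grid(len(series_by_panel), ncols)
    fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True, squeeze=False)
    lo, hi = shared_limits(series_by_panel)
    flat = axes.ravel()
    for ax, (name, ys) in zip(flat, series_by_panel.items()):
        ax.plot(x, ys, color="black", linewidth=1)  # a single line, no legend or grid
        ax.set_title(name, fontsize=9)
        ax.set_ylim(lo, hi)
    for ax in flat[len(series_by_panel):]:
        ax.set_visible(False)  # hide unused cells in the last row
    return fig


# Hypothetical data: percent support by year for five made-up groups.
years = [2000, 2004, 2008, 2012]
support = {
    "Group A": [40, 45, 50, 55],
    "Group B": [60, 58, 57, 52],
    "Group C": [30, 35, 33, 38],
    "Group D": [50, 50, 49, 51],
    "Group E": [70, 65, 62, 60],
}

print(panel_grid(len(support)))
print(shared_limits(support))
# To draw the figure: plot_small_multiples(years, support).savefig("small_multiples.png")
```

Sharing the axes is the design choice that matters here: each panel stays simple, while the common scale lets the eye compare slopes and levels across panels.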

Don’t clutter each plot: example

From Graph Design for the Eye and Mind by Stephen Kosslyn:

Redo using small multiples!

Line plots: Cleveland’s principle

• Always ask: What is the comparison?
• Example: an analysis from market research

Improvement?

Line plot is better

Consider the comparisons you can make!

Statistics is . . .

Today’s talk

• (Some of) my examples from (nearly) 30 years of applied research
• Choices involved in making the graphs
• What works, what doesn’t, and why
• You must participate!

1984: “The effects of solar flares on single event upset rates”

1986: “Reduced subboundary misalignment in SOI films scanned at low velocities”

1989: “Constrained maximum entropy methods in an image reconstruction problem”

1990: “Estimating the electoral consequences of legislative redistricting”

1991: “Systemic consequences of incumbency advantage in U.S. House elections”

2008: “Estimating incumbency advantage and its variation, as an example of a before/after study”

1992: “Inference from iterative simulation using multiple sequences”

1993: “Why are American Presidential election campaign polls so variable when votes are so predictable?”

1994: “Enhancing democracy through legislative redistricting”

1995: “Pre-election survey methodology: details from nine polling organizations, 1988 and 1992”

1996: “Physiological pharmacokinetic analysis using population modeling and informative prior distributions”

1997: “Poststratification into many categories using hierarchical logistic regression”

1998: “Estimating the probability of events that have never occurred: When is your vote decisive?”

2009: “The probability your vote will make a difference”

1999: “All maps of parameter estimates are misleading”

2000: “Type S error rates for classical and Bayesian single and multiple comparison procedures”

2002: “A probability model for golf putting”

2003: “Forming voting blocs and coalitions as a prisoner’s dilemma: a possible theoretical explanation for political instability”

2004: “Standard voting power indexes don’t work”

2005: “Multiple imputation for model checking: completed-data plots with missing and latent data”

2006: “The boxer, the wrestler, and the coin flip”

2007: “An analysis of the NYPD’s stop-and-frisk policy in the context of claims of racial bias”

2009: “Beautiful political data”

2010: “Public opinion on health care reform”

2011: “Tables as graphs: The Ramanujan principle”

2012: “Philosophy and the practice of Bayesian statistics”

2013: “Election turnout and voting patterns”

Notes

• Gradual improvements in technique . . . and understanding
• Often, what we’re plotting is not “data”
• Research vs. publications: “Let me tell you about my first wife”

Take-home points

• Small multiples
• Line plots
• Try to make a display self-contained, then add words
• Graphs are comparisons

Some references

Andrew Gelman and Antony Unwin (2013). Infovis and statistical graphics: Different goals, different looks (with discussion by Stephen Few, Robert Kosara, Paul Murrell, and Hadley Wickham, and rejoinder by Gelman and Unwin). Journal of Computational and Graphical Statistics. [Our current views on tradeoffs in statistical graphics]

Andrew Gelman (2004). Exploratory data analysis for complex models (with discussion by Andreas Buja and rejoinder by Gelman). Journal of Computational and Graphical Statistics 13, 755–787. [An expression of the idea that exploratory graphics are a form of model checking: the better the model, the more effective the graphics. Thus, statistical modeling and graphics are not competitors (as is often thought) but can work together.]

Andrew Gelman (2003). A Bayesian formulation of exploratory data analysis and goodness-of-fit testing. International Statistical Review 71, 369–382. [A more formal exploration of the unity between statistical graphics and Bayesian modeling.]

Andrew Gelman, Cristian Pasarica, and Rahul Dodhia (2002). Let’s practice what we preach: turning tables into graphs. American Statistician 56, 121–130. [Proof of concept: we went through an issue of the Journal of the American Statistical Association and converted all the tables into graphs, in each case displaying all the information using less space.]

Andrew Gelman and Gary King (1993). Why are American Presidential election campaign polls so variable when votes are so predictable? British Journal of Political Science 23, 409–451. [We resolved in writing this paper to do all the analysis using graphs, no tables. It worked well: we told a story and backed it up with evidence.]
