conversations with data

Post on 17-Dec-2014

DESCRIPTION

#dalmooc 27/10/12 slides

TRANSCRIPT

Conversations with Data

Tony Hirst

Computing and Communications, The Open University

(Recognising and addressing a skills gap)

/via Adam Cooper, “Exploratory Data Analysis” http://blogs.cetis.ac.uk/adam/2012/05/18/exploratory-data-analysis/

http://cm.bell-labs.com/cm/ms/departments/sia/tukey/memo/techtools.html

“The Technical Tools of Statistics”, read at the 125th Anniversary Meeting of the American Statistical Association, Boston, November 1964, published in April 1965 American Statistician.

John Tukey

“journeyman carpenter of data-analytical tools”

“A Boy's Work is Never Done”, KellyB. (flickr: foreverphoto/2467694199/)

ouseful.info

“Exploratory data analysis is an attitude, a flexibility, and reliance on display, not a bundle of techniques and should be so taught.”

John Tukey

http://www.ece.rice.edu/~fk1/classes/ELEC697/TukeyEDA.pdf

Tukey, John W. "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25.

“I … cannot disagree strongly enough with statements about the dangers of putting powerful tools in the hands of novices. Computer algebra, statistics, and graphics systems provide plenty of rope for novices to hang themselves and may even help to inhibit the learning of essential skills needed by researchers. The obvious problems caused by this situation do not justify blunting our tools, however. They require better education in the imaginative and disciplined use of these tools. And they call for more attention to the way powerful and sophisticated tools are presented to novice users.”

Leland Wilkinson, The Grammar of Graphics, Springer-Verlag, 1999, ISBN 0-387-98774-6, p15-16.

Data accessibility

Data sensemaking

Clean

Shape

Augment

Look

Dirty Data

10th March, 2014,

3-10-14,

10/03/14

£1,249 million

NULL, NA, ‘’

openrefine.org
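The dirty-data examples on the slide (three spellings of a date, a currency amount written in words, and several ways of saying "missing") can be tidied in code as well as in a tool like OpenRefine. The following is a minimal sketch using only the Python standard library. The function names, the `day_first` flag and the list of accepted date layouts are assumptions made for illustration, not anything prescribed by the slides.

```python
import re
from datetime import date, datetime
from typing import Optional

# Tokens treated as "missing" (NULL, NA and the empty string, as on the slide).
MISSING = {"", "NULL", "NA"}


def clean_missing(value: str) -> Optional[str]:
    """Return None for missing-value tokens, else the stripped text."""
    text = value.strip()
    return None if text in MISSING else text


def parse_date(value: str, day_first: bool = True) -> Optional[date]:
    """Parse a few date spellings; day_first resolves ambiguous numeric dates."""
    text = clean_missing(value)
    if text is None:
        return None
    # Drop ordinal suffixes ("10th" -> "10") and a trailing comma,
    # then treat "/" and "-" as the same separator.
    text = re.sub(r"(\d+)(?:st|nd|rd|th)\b", r"\1", text).rstrip(",").strip()
    text = text.replace("/", "-")
    numeric = "%d-%m-%y" if day_first else "%m-%d-%y"
    for fmt in ("%d %B, %Y", numeric):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {value!r}")


def parse_amount(value: str) -> Optional[float]:
    """Turn text such as '£1,249 million' into a plain number of pounds."""
    text = clean_missing(value)
    if text is None:
        return None
    match = re.fullmatch(r"£?\s*([\d,]+(?:\.\d+)?)\s*(million|billion)?", text)
    if match is None:
        raise ValueError(f"Unrecognised amount: {value!r}")
    scale = {None: 1, "million": 1e6, "billion": 1e9}[match.group(2)]
    return float(match.group(1).replace(",", "")) * scale
```

Note that "3-10-14" is genuinely ambiguous: read day-first it is 3 October 2014, read month-first it is 10 March 2014. The code cannot decide that for you; someone who knows the source has to.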

Shapes…

I see trees…

See also: IPython notebook demo

http://nbviewer.ipython.org/gist/psychemedia/9c54721e853403b43d21/pivotTable_demo.ipynb
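As a rough illustration of what "shape" means here, the same records can be held "long" (one row per observation) or "wide" (one row per thing, one column per category). The sketch below pivots long rows into a wide table using only the standard library. It is not taken from the linked notebook, and the departments, years and spend figures are made up for the example.

```python
from collections import defaultdict

# Hypothetical "long" records: one row per (department, year) observation.
long_rows = [
    {"dept": "Health", "year": 2012, "spend": 110},
    {"dept": "Health", "year": 2013, "spend": 120},
    {"dept": "Transport", "year": 2012, "spend": 40},
    {"dept": "Transport", "year": 2013, "spend": 35},
]


def pivot(rows, index, columns, values):
    """Reshape long rows into {index value: {column value: cell value}}."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[index]][row[columns]] = row[values]
    return dict(wide)


# One entry per department, one inner key per year.
wide = pivot(long_rows, index="dept", columns="year", values="spend")
```

The wide form is easier to read across a row; the long form is usually easier to filter, group and chart.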

“There is no more reason to expect one graph to ‘tell all’ than to expect one number to do the same.”

-- John Tukey

If quantities are conserved, can you think of them in terms of flow?

“[T]he picture examining eye is the best finder we have of the wholly unanticipated.”

John Tukey

http://www.ece.rice.edu/~fk1/classes/ELEC697/TukeyEDA.pdf

Tukey, John W. "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25.

How can we look at data?

How do we ask questions of data?

else

underspend filetype:xls site:gov.uk

Search limits

underspend filetype:xls site:gov.uk

select webPages where text like “%underspend%” and filetype=“xls” and domain=“gov.uk”

Structured queries

SQL
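The search-limit query above reads naturally as a structured query. Here is a minimal runnable sketch of that idea using Python's built-in `sqlite3` module. The `web_pages` table, its column names and the sample rows are hypothetical and exist only to make the query executable; a real search engine index is not exposed this way.

```python
import sqlite3

# An in-memory database standing in for a (hypothetical) index of web pages.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE web_pages (url TEXT, domain TEXT, filetype TEXT, page_text TEXT)"
)
conn.executemany(
    "INSERT INTO web_pages VALUES (?, ?, ?, ?)",
    [
        ("https://example.gov.uk/budget.xls", "gov.uk", "xls",
         "Departmental underspend by year"),
        ("https://example.gov.uk/report.pdf", "gov.uk", "pdf",
         "Underspend summary"),
        ("https://example.com/sales.xls", "example.com", "xls",
         "Regional underspend"),
        ("https://example.gov.uk/staff.xls", "gov.uk", "xls",
         "Headcount by grade"),
    ],
)

# The three search limits become three WHERE conditions.
query = """
    SELECT url
    FROM web_pages
    WHERE page_text LIKE ? AND filetype = ? AND domain = ?
"""
matches = [row[0] for row in conn.execute(query, ("%underspend%", "xls", "gov.uk"))]
```

Each search limit removes rows: only the spreadsheet on the gov.uk domain that mentions "underspend" survives all three conditions.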

Count things

Sort things
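Counting and sorting are often the first two questions worth asking of a column. A small sketch with the standard library follows; the list of file types is invented for the example.

```python
from collections import Counter

# Hypothetical column of values, e.g. the file type of each search result.
filetypes = ["xls", "pdf", "xls", "csv", "xls", "pdf"]

# Count things: how often does each value occur?
counts = Counter(filetypes)

# Sort things: most frequent first, or alphabetically by value.
by_frequency = counts.most_common()
by_name = sorted(counts.items())
```

Here `by_frequency` puts "xls" first with three occurrences, while `by_name` lists the same counts in alphabetical order.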

http://www.coolinfographics.com/blog/2014/8/29/false-visualizations-sizing-circles-in-infographics.html

How do we start to interpret the answers?

Look for outliers

Top 3…

…bottom 3

median

mean
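The top 3, bottom 3, median and mean can all be read off a column in a few lines. The sketch below uses made-up numbers with one deliberately extreme value, to show how a single outlier pulls the mean away from the median.

```python
import heapq
import statistics

# Hypothetical measurements; 98 is a deliberate outlier.
values = [12, 15, 14, 13, 98, 16, 11, 14]

top3 = heapq.nlargest(3, values)      # largest three, biggest first
bottom3 = heapq.nsmallest(3, values)  # smallest three, smallest first

median = statistics.median(values)    # middle of the sorted values
mean = statistics.mean(values)        # arithmetic average

# A large gap between mean and median hints at skew or outliers.
gap = mean - median
```

With these numbers the median is 14 but the mean is 24.125, which is larger than every value except the outlier itself.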

Outliers may be rare occurrences over time too…

Streaks and runs…
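One simple way to look for streaks is to collapse a sequence into runs of identical consecutive values and then ask which run is longest. A sketch using `itertools.groupby`, with an invented win/loss sequence:

```python
from itertools import groupby

# Hypothetical sequence of results in time order: W = win, L = loss.
results = ["W", "W", "L", "W", "W", "W", "L", "L"]

# groupby groups *consecutive* equal items, so each group is one run.
runs = [(outcome, len(list(group))) for outcome, group in groupby(results)]

# The longest run overall (the first one wins if there is a tie).
longest = max(runs, key=lambda run: run[1])
```

For this sequence the runs are two wins, one loss, three wins and two losses, so the longest streak is three wins.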

Look for similarities & differences

Look for trends

Look for patterns & structure

“Hand-drawing of graphs, except perhaps for reproduction in books and in some journals, is now economically wasteful, slow, and on the way out.”

– John Tukey

Recording your conversations

Rstudio.org

IPython Notebook

“I know of no person or group that is taking nearly adequate advantage of the graphical potentialities of the computer.”

– John Tukey

Hopefully, that contained some ouseful.info

-- @psychemedia