introduction to data journalism
TRANSCRIPT
DATA JOURNALISM TRAINING
Day 1
WHAT IS DATA
Asking a question
Name      Gender  Age  Height  Feeling
Mandy     F       21   150cm   Swamped
Shani     F       23   167cm   Nervous
Zizo      F       25   167cm   Curious
Ashleigh  F       22   163cm   Relaxed
Danyal    M       22   156cm   Optimistic
Jason     M       36   200cm   Flustered
Hannah    F       35   167cm   Very excited
Phumlani  M       24   180cm   Grumpy
Milena    F       29   160cm   Excited
Data types
● QUALITATIVE DATA: is everything that refers to the
quality of something. A description of the colours, texture
and feel of an object, a description of experiences, and
an interview are all qualitative data.
● QUANTITATIVE DATA: is data that refers to a number.
Data types
● DISCRETE DATA: is numerical data with values which
are distinct and separate, i.e. they can be counted.
Examples might include the number of kittens in a litter
or the number of patients in a doctor's surgery.
● CONTINUOUS DATA: is numerical data with a
continuous range. You can count, order and measure
continuous data. For example height, weight,
temperature, the amount of sugar in an orange, etc.
● CATEGORICAL DATA: puts the item you are
describing into a category. Examples include
gender, colour, size, etc.
● ORDINAL DATA: is data which can be ranked (put in
order) or have a rating scale attached. You can count
and order, but not measure, ordinal data. Example: a
scale from 1 to 5.
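As a rough illustration, the participant table from "Asking a question" can be loaded into Python to show which operations make sense for each data type. The variable names and the choice of operations are assumptions made for this sketch and are not part of the training material. Treating age in whole years as discrete is also a judgment call.

```python
from collections import Counter
from statistics import mean

# Rows from the "Asking a question" table:
# (name, gender, age in years, height in cm, feeling)
participants = [
    ("Mandy", "F", 21, 150, "Swamped"),
    ("Shani", "F", 23, 167, "Nervous"),
    ("Zizo", "F", 25, 167, "Curious"),
    ("Ashleigh", "F", 22, 163, "Relaxed"),
    ("Danyal", "M", 22, 156, "Optimistic"),
    ("Jason", "M", 36, 200, "Flustered"),
    ("Hannah", "F", 35, 167, "Very excited"),
    ("Phumlani", "M", 24, 180, "Grumpy"),
    ("Milena", "F", 29, 160, "Excited"),
]

# Categorical data (gender): we can count how many items fall in each category.
gender_counts = Counter(row[1] for row in participants)

# Continuous data (height): measuring and averaging is meaningful.
average_height = mean(row[3] for row in participants)

# Discrete data (age in whole years, an assumption): values can be counted and ordered.
oldest_age = max(row[2] for row in participants)

# Qualitative data (feeling): a description, so we only list the distinct answers.
feelings = sorted({row[4] for row in participants})

print(gender_counts["F"], gender_counts["M"])  # 6 3
print(round(average_height, 1))                # 167.8
print(oldest_age)                              # 36
```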
Data types quiz
Role: Drummer
❏ Continuous Data
❏ Categorical Data
❏ Quantitative Data
Year Born: 1963
❏ Qualitative Data
❏ Discrete Data
❏ Continuous Data
❏ Categorical Data
Name: Rick Allen
❏ Quantitative Data
❏ Qualitative Data
❏ Discrete Data
Size: M
❏ Ordinal Data
❏ Categorical Data
❏ Continuous Data
Height: 187cm
❏ Discrete Data
❏ Categorical Data
❏ Continuous Data
❏ Qualitative Data
Date: 5th of March 2014
❏ Discrete Data
❏ Categorical Data
❏ Continuous Data
Jargon busting
Data pipeline
DATA ETHICS & VERIFICATION
[Jason]
Good practices and basic ethics
● Save an original copy of the data and do not touch it.
● Paper trail - Keep a log with every step that you take in the
analysis.
● Do not change original columns. Duplicate them and make
the changes in the copies.
● Have several drafts and look at how your analysis
developed.
● Spend time to understand your data. Read the methodology.
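The first three practices in this list (an untouched original, a paper trail, and duplicated columns) could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library. The function names, the CSV format and the log layout are assumptions for illustration, not a prescribed workflow.

```python
import csv
import shutil
from datetime import datetime


def log_step(log_path, message):
    """Append one timestamped line to the analysis log (the paper trail)."""
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} {message}\n")


def make_working_copy(original, working, log_path):
    """Copy the original file so that every later edit happens on the copy."""
    shutil.copyfile(original, working)
    log_step(log_path, f"Copied {original} to {working}")


def duplicate_column(working, source_col, new_col, transform, log_path):
    """Add new_col holding transform(value); source_col itself is not changed."""
    with open(working, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames) + [new_col]
        rows = list(reader)
    for row in rows:
        row[new_col] = transform(row[source_col])
    with open(working, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    log_step(log_path, f"Added column {new_col} derived from {source_col}")
```

For example, with a hypothetical file survey_original.csv that has a Height column such as "150cm", one would first call make_working_copy on it and then call duplicate_column on the copy with a transform that strips the "cm" suffix. The original file and the original Height column both stay as they were, and the log records each step.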
Good practices and basic ethics
● Do not assume what the data is. Run an integrity check on each
column.
● Clean the data before interviewing it.
● Count the records. Cross-reference with the methodology.
Report any inconsistency and request the missing data or a
recount. Keep the total records in mind while analysing the data.
● If a result looks too good to be true, it probably is.
● Make a summary of the end results, as if you were writing a
press release. Look for mistakes.
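Counting the records and running an integrity check on a column might look like the sketch below. The helper names and the three sample rows are made up for illustration. In practice the expected total would come from the methodology of the dataset in question.

```python
def count_records(rows, expected_total):
    """Compare the number of records with the total stated in the methodology."""
    actual = len(rows)
    return {"actual": actual, "expected": expected_total,
            "matches": actual == expected_total}


def check_column(rows, column, is_valid):
    """Return (row number, value) for every value that is blank or fails is_valid."""
    problems = []
    for number, row in enumerate(rows, start=1):
        value = (row.get(column) or "").strip()
        if not value or not is_valid(value):
            problems.append((number, value))
    return problems


def is_whole_number(value):
    """A simple validity rule for a column that should hold whole numbers."""
    return value.isdigit()


# Hypothetical rows, invented only to exercise the checks.
sample = [
    {"Name": "A", "Age": "21"},
    {"Name": "B", "Age": ""},
    {"Name": "C", "Age": "twenty"},
]

print(count_records(sample, expected_total=4))
# {'actual': 3, 'expected': 4, 'matches': False}
print(check_column(sample, "Age", is_whole_number))
# [(2, ''), (3, 'twenty')]
```

A mismatch in the record count or a non-empty list of problems is a prompt to go back to the source, not something to fix silently.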
Good practices and basic ethics
● Have somebody else verify your work, preferably
somebody who knows nothing about your project.
● Check your biases and look at your data from new
angles
● Look for context that would explain your results to
yourself and to your audience
● e.g. Egypt worst country for women’s rights
● Bounce your results against experts
FINDING DATA & DATA SOURCES
Advanced search
● Google Advanced Search
● Wayback Machine – for the dead web (1996 onwards)
http://archive.org/web/
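Archived pages can also be reached directly by URL. The sketch below builds such an address, assuming the commonly used Wayback Machine pattern of a 14-digit timestamp followed by the page address, with the archive serving the capture closest to that time. The function name and the example page are hypothetical.

```python
from datetime import datetime


def wayback_url(page_url, when):
    """Build a Wayback Machine address for page_url around the given date."""
    if when.year < 1996:
        # The slide notes that the archive covers 1996 onwards.
        raise ValueError("The Wayback Machine archive starts in 1996")
    # Timestamp format: YYYYMMDDhhmmss
    return f"https://web.archive.org/web/{when:%Y%m%d%H%M%S}/{page_url}"


print(wayback_url("http://example.com/", datetime(2014, 3, 5)))
# https://web.archive.org/web/20140305000000/http://example.com/
```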
Search operators
● * (asterisk) – substitutes for a word and allows your search to
cover similar phrases
● cache: – allows you to find web pages stored in Google’s
cache
● filetype: – will look for the specified file type
● link: – helps you find all the sites that link to a particular
page
Search operators
● ‘ ’ or “ ” (quotation marks) – help you find the exact phrase
● + or AND – narrows down your search by returning only results
that contain all of the words or phrases
● OR – expands the search by including either of two search
phrases
● - or NOT – tells the search engine to exclude a term
● e.g. Monsanto -‘agent orange’
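The operators from these two slides can be combined into a single query string. The helper below is a small sketch of that idea. The function and parameter names are assumptions, and the search URL uses the ordinary Google search address with a q parameter.

```python
from urllib.parse import quote_plus


def build_query(terms=(), exact=(), any_of=(), exclude=(), filetype=None):
    """Combine search operators into one query string."""
    parts = list(terms)
    # Quotation marks: search for the exact phrase.
    parts += [f'"{phrase}"' for phrase in exact]
    # OR: match either of the alternatives.
    if any_of:
        parts.append(" OR ".join(any_of))
    # Minus sign: exclude a term; quote it when it is more than one word.
    for item in exclude:
        parts.append(f'-"{item}"' if " " in item else f"-{item}")
    # filetype: restrict results to one file type.
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


query = build_query(terms=["Monsanto"], exclude=["agent orange"])
print(query)
# Monsanto -"agent orange"
print("https://www.google.com/search?q=" + quote_plus(query))
# https://www.google.com/search?q=Monsanto+-%22agent+orange%22
```

Note the space before the minus sign: without it, the search engine reads the hyphen as part of the word and nothing is excluded.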
WHAT MAKES A GOOD VISUALISATION?
What makes a good visualisation
For each of these visualisations think of:
● What is the target audience?
● What is the key message?
● How successful are they in communicating the
message?
● What makes them stand out?
● How well are they explained?
● How simple or complex are they?
Source: The Economist
Source: BBC News
Source: The Guardian
Source: New York Times, Amanda Cox;
Source: The Functional Art, Alberto Cairo
Source: Lower Saxony State Elections
Source: Population pyramid
Source: Hans Rosling, 200 Countries, 200 Years, 4 Minutes
Source: The Wall Street Journal
Source: Where does my money go, UK
Source: Where does my money go, UK
Source: Spending stories
Source: Driven by Data, Gregor Aisch
Source: The Guardian
Source: Transparency International
Source: The Guardian
Source: Migrations Map