mac281 big data & journalism lecture 2014

Post on 15-May-2015








2009 #iranelection

Image: Gilad Lotan, ReTweet Revolution










Database Journalism and Computer Assisted Reporting

Data Today : Visualisations and Interactivity

How To Be A Data Journalist



Adam Westbrook

“I think data-driven journalism is one of the big potential growth areas in the future of journalism. A lot of the forward-thinking discussion about the future of news focuses on the ‘glamorous’ possibilities, like video journalism and interactivity, but I often see data journalism being ignored.

In fact, I believe it is journalism in its truest essence: uncovering and mining through information the public do not have enough time to do themselves, interrogating it, and making sense of it before sharing it with the audience. If more journalists did this (rather than relying on ‘data’ from press releases) we would be a far more enlightened public.


My message to the next generation of journalists - or any journalist looking for a new niche or direction - would be to learn the skills and tools of data interrogation. It’s not glamorous, but it’s a skill not many journalists have, and one which will give one an edge in the market.”



Brian Storm

One of our big goals in the storytelling process is to humanize the statistics. It’s hard for people to care about numbers, especially large numbers. How do you get your head around the death of 800,000 people in the Rwandan genocide? I think if you meet the individuals - see and hear the stories of the survivors - you can gain a better insight into the tragedy.



“Data-driven journalism is the future”

“[Journalism’s] going to be about poring over data and equipping yourself with the tools to analyse it and picking out what's interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what's going on in the country.” Sir Tim Berners-Lee, inventor of the Web, 2010




Database Journalism

Computer Assisted Reporting (CAR)

Very expensive


The Indianapolis Star

Capital Journal circa 1961


CBS: 1952, Walter Cronkite

Presidential election battle

Eisenhower vs Stevenson

Remington Rand UNIVAC

Early vote returns analysis

Predicted a landslide victory

Contrary to popular opinion


Other notable examples

Clarence Jones, The Miami Herald, 1969: criminal justice systems

David Burnham, The New York Times, 1972: police crime rates

Elliot Jaspin, The Providence Journal, 1986: school bus drivers and criminal records

Bill Dedman, The Atlanta Journal, 1988: Pulitzer Prize for The Color of Money

Since 2004



Adrian Holovaty (2005)

Chicago Transit Authority map + Firefox plug-in + Google Maps = real time updates

Chicago Police Department + Google Maps = real time police reports


Adrian Holovaty (2006)

Now working for the Washington Post

A fundamental way newspaper sites need to change

Most material collected by journalists is “structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers”


Adrian Holovaty (2006)

Traditional journalism

Articles as the finished product

Data journalism

Continually maintained and improved

Radical overhaul needed

- Employing data

- Making data available

- Storing data

- Coding data




Maps Everywhere!



2007 – Holovaty won $1.1 million from the Knight Foundation for Everyblock

2010 – SR2 Blog won the ‘most inspirational site’ accolade


Bella Hurrell, Specials Editor with BBC News Online (2011)

Proximity of “journalists, designers and developers all working together, sitting alongside each other”



“We have found that proximity really important to the success of projects. Although we have done this for a while, increasingly other organisations are reorganising along these lines after coming to realise the benefits of breaking down silos and co-locating people with different skillsets can produce more innovative solutions at a faster pace.”



“As data visualisation has come into the zeitgeist, and we have started using it more regularly in our story-telling, journalists and designers on the specials team have become much more proficient at using basic spreadsheet applications like Excel or Google Docs”


Paul Bradshaw



“It represents the convergence of a number of fields which are significant in their own right - from investigative research and statistics to design and programming. The idea of combining those skills to tell important stories is powerful - but also intimidating. Who can do all that?”



“The reality is that almost no one is doing all of that, but there are enough different parts of the puzzle for people to easily get involved in, and go from there”

Dealing with Data (Bradshaw, 2010)

4 crucial aspects


1. Finding data  

2. Interrogating data  

3. Visualizing data

4. Mashing data
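As a rough illustration of the "interrogating" and "mashing" steps, here is a minimal Python sketch. Everything in it is an assumption made for the example: the two tiny datasets are invented, and the names `read_rows`, `mash` and `interrogate` are hypothetical labels, not terms from Bradshaw. It joins two tables on a shared column and then sorts the result to see which row stands out.

```python
import csv
import io

# Hypothetical figures, invented purely for illustration.
SPENDING_CSV = """ward,spend_gbp
North,120000
South,45000
East,98000
"""

POPULATION_CSV = """ward,population
North,8000
South,9000
East,4900
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def mash(spending_rows, population_rows):
    """Mash: join the two datasets on their shared 'ward' column."""
    population = {row["ward"]: int(row["population"]) for row in population_rows}
    combined = []
    for row in spending_rows:
        ward = row["ward"]
        if ward not in population:
            continue  # skip wards that have no match in the second dataset
        spend = int(row["spend_gbp"])
        combined.append({
            "ward": ward,
            "spend_gbp": spend,
            "population": population[ward],
            "spend_per_head": round(spend / population[ward], 2),
        })
    return combined


def interrogate(rows):
    """Interrogate: sort so the highest spend per head comes first."""
    return sorted(rows, key=lambda r: r["spend_per_head"], reverse=True)


ranked = interrogate(mash(read_rows(SPENDING_CSV), read_rows(POPULATION_CSV)))
for row in ranked:
    print(row["ward"], row["spend_per_head"])
```

In this made-up data the ward with the largest total spend is not the one with the largest spend per head, which is the kind of pattern that only appears once two datasets are combined.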


Data visualisation vs data journalism


New Tools of the Trade?


Excel or Calc: sort your data

Google Refine: clean your dirty data

Yahoo Pipes: composition mash-up tool

ScraperWiki: transforms info from webpages into data

R: process and manipulate data


Google Fusion Tables: visualise data on maps, timelines, etc

Tableau Public: visualise and share

IBM’s Many Eyes: data visualisation tool

Processing: create images & interactives

Wordle: generate word clouds from bulky text


Free tools…




Is this journalism?

Journalism educators doing students a disservice?

Journalists replaced by programmers?

Wikileaks: no journalists required?



Knight Foundation, 2008, Sir Tim Berners-Lee talking about the Web at the Newseum

Bill on Capitol Hill, 2007, The Rim and the Slot

Marion Doss, 2008, Capital Journalism News Room 16 October 1961

Igorschwarzmann, 2010, NYT News Room

Mkandlez, 2009, The Billion Pound O Gram

BitBoy, 2006, The Elephant in the Room

Ravages, 2008, Links



To what extent is the traditional craft of storytelling being challenged by the emergence of big data?

What kinds of problems are created by the deluge of large data sets (e.g. MPs' expenses, the Wikileaks Iraq war logs, US cables)?

Can the use or release of big data sets have ethical implications?
