nr14: ten tips for data journalists

Post on 22-Apr-2015


10 things every data journalist should know

NR14, Hamburg

Jennifer LaFleur

Center for Investigative Reporting

A bit about CIR

• Nonprofit investigative newsroom
• Public interest investigative journalism
• Based near San Francisco
• About 80 staff
• Print, web, radio and TV

#1 data is a powerful reporting tool

It takes you beyond the anecdote

And it’s easier than dealing with this

Contrasts are in the data

Caution: This slide contains extreme nerdiness

• Contrasts are in the data
• Your most powerful figures are in the data

Source: California Health Dept. data, Medicare billing data

Findings: Some hospitals had “alarming rates of a Third World nutritional disorder among its Medicare patients.”

• Contrasts are in the data
• Your most powerful figures are in the data
• You can make connections you might not be able to make otherwise

Data: Youth prison workers, criminal convictions and grievance data

Findings: Employees with criminal backgrounds were more likely to be accused of abusing inmates.

Data: Federal bridge inspections and stimulus funding.

Findings: Some of the nation’s worst bridges did not get stimulus funds.
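Connections like the two above usually come from matching records in one dataset against another on a shared identifier. The following is a minimal sketch of that idea, not the analysis behind either story: the field names, the ratings, the funding amounts and the cutoff are all hypothetical.

```python
# Hypothetical inspection records: a lower rating means worse condition.
inspections = [
    {"bridge_id": "B1", "rating": 12.5},
    {"bridge_id": "B2", "rating": 88.0},
    {"bridge_id": "B3", "rating": 20.1},
    {"bridge_id": "B4", "rating": 45.0},
]

# Hypothetical list of funded projects, keyed by the same identifier.
funded = [
    {"bridge_id": "B2", "amount": 1_500_000},
    {"bridge_id": "B3", "amount": 750_000},
]

# A set gives fast membership tests when matching the two lists.
funded_ids = {row["bridge_id"] for row in funded}

WORST_CUTOFF = 50.0  # assumed threshold for a "worst" bridge

# Keep poorly rated bridges that never appear in the funding list,
# sorted so the worst one comes first.
worst_unfunded = sorted(
    (b for b in inspections
     if b["rating"] < WORST_CUTOFF and b["bridge_id"] not in funded_ids),
    key=lambda b: b["rating"],
)
unfunded_ids = [b["bridge_id"] for b in worst_unfunded]
```

In real data the hard part is usually the identifier itself: names and IDs rarely match cleanly across agencies, so matches need to be spot-checked by hand.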

• Contrasts are in the data
• Your most powerful figures are in the data
• You can make connections you might not be able to make otherwise
• You can test assumptions

Source: NHTSA complaint data

Findings: “…unintended acceleration has been a problem across the auto industry.”

#2 data comes from many places

If something is inspected, licensed, enforced or purchased

…There probably is a database

Where’s the data?

If there is a report or a form, there probably is a database

Sometimes data is readily available online for download

Sometimes you have to scrape it.

That usually involves programs that automate searching tasks on Web sites.
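As a rough illustration of what such a program does, here is a minimal sketch that pulls the rows out of an HTML table using only Python's standard library. The page content, the class name `TableScraper` and the column names are made up for the example; a real scraper would first download each page and would need to respect the site's terms and rate limits.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical page content; a real scraper would download this,
# for example with urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<table>
  <tr><th>Facility</th><th>Violations</th></tr>
  <tr><td>Stand A</td><td>3</td></tr>
  <tr><td>Stand B</td><td>0</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
rows = scraper.rows
```

Looping the same parser over many search-result pages is what turns a one-off lookup into a dataset.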

More often you need to go to an agency or source to get the data

Source: School district credit card purchases

Findings: District card holders made questionable purchases with their cards.

#3 people who keep data don’t always want to give it up

Getting electronic information

• Know the law.
• Know what information you want.
• Do your homework.
• Know what the appropriate cost should be.
• Know who does the data entry.
• Get to know the computer people.

Just another way of saying no

• Huge costs
• Delay tactics
• “Oh you silly little journalist”
• Sending you the wrong thing
• “Your request was unclear”
• HIPAA
• Privacy
• Privatization

#4 Sometimes holes in data can be a story

#5 Even when there is no data, you can use techniques for sampling and building a database.

• Sampling
• Physical surveys – go look at one
• Testing
• Questionnaires, polls and surveys
• Building from documents

We built a database of 500 people who had been granted or denied pardons during the Bush administration.

We started with a list of nearly 2,000 people. From that, we pulled a random sample. Then we spent months researching the individuals.

We found that even after controlling for other factors, whites were more likely to get a pardon.
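Drawing the random sample is the one step here that is easy to show in code. This is a minimal sketch under stated assumptions: the list of names is a placeholder of 2,000 made-up identifiers (the talk says only "nearly 2,000"), and the seed is arbitrary. It does not reproduce the project's actual sampling design or the later statistical controls.

```python
import random

# Hypothetical stand-in for the full list of people.
population = [f"person_{i:04d}" for i in range(2000)]

# A fixed seed makes the draw reproducible, so the same sample
# can be re-created later when the methodology is questioned.
rng = random.Random(2014)  # seed value is arbitrary

# Simple random sample without replacement: every person has the
# same chance of selection and no one is picked twice.
sample = rng.sample(population, 500)
```

Recording the seed and the exact source list alongside the story notes lets someone else regenerate the same 500 records.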

To examine food safety, the Center for Investigative Reporting in Bosnia sampled food – literally -- and had it tested in labs.

SVT surveyed 355 counties and districts about drug control – all replied (Courtesy Helena Bengtsson)

#6 Sometimes the crowd can help you

Where’s the data?

#7 There are many data tools – choose the right one

• Spreadsheets
• Databases
• Mapping
• Statistics
• Programming

Source: Salary data and other charter school records

Findings: Reporters found nepotism in charter schools and administrators earning six-figure salaries to run schools with only a few hundred or a couple of thousand students.

Source: Washington Health Department data

Findings: “MRSA has been quietly killing in hospitals for decades.” But no one had tracked it until this story.

Source: City Budget

Findings: Some neighborhoods suffer more than others as mayor cuts budgets

Source: Local health department inspection reports

Findings: At 28% of the venues, more than half of the concession stands or restaurants had been cited for at least one "critical" or "major" health violation.
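A figure like that is a two-step calculation: first the share of cited stands within each venue, then the share of venues where that figure exceeds one half. The sketch below shows the arithmetic on a handful of invented records; the venue names and results are hypothetical and are not meant to reproduce the 28% figure.

```python
from collections import defaultdict

# Hypothetical inspection results:
# (venue, stand, cited for a critical or major violation?)
stands = [
    ("Arena X", "Grill", True),
    ("Arena X", "Pizza", True),
    ("Arena X", "Drinks", False),
    ("Stadium Y", "Hot dogs", False),
    ("Stadium Y", "Nachos", True),
    ("Park Z", "Pretzels", False),
    ("Park Z", "Ice cream", False),
    ("Field W", "Burgers", False),
]

totals = defaultdict(int)  # stands per venue
cited = defaultdict(int)   # cited stands per venue
for venue, _stand, was_cited in stands:
    totals[venue] += 1
    cited[venue] += was_cited  # True counts as 1, False as 0

# "More than half" is a strict comparison, so exactly 50% does not count.
flagged = [v for v in totals if cited[v] / totals[v] > 0.5]
share = len(flagged) / len(totals)
```

Note the boundary case: a venue with exactly half of its stands cited is excluded, which is worth stating in the methodology.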

#8 Sharing data is good, but give it context and be sure it is right

Source: EPA and state data on hazardous chemical locations

Findings: Dallas County has 900+ sites that store hazardous chemicals

Source: Medicaid outcomes data for dialysis facilities

Findings: A CMS online tool did not tell the whole story about facilities. In some counties, the gaps in measures such as survival rate were vast.

Source: Dam inspection data from Texas and federal government

Findings: Dam records had not been updated to account for population growth

#9 Data intended for one purpose can be used in other ways

Source: 311 calls for downed trees

Findings: After a tornado swept across New York City, 311 calls for downed trees helped trace its path

Disparities in water usage

“Water use highest in poor areas of the city”

Mapping and statistical analysis

#10 No data is perfect

Check your data

• Read the documentation. Understand the contents of every field.

• Know how many records you should have.

• Check counts and totals against reports.

• Are all possibilities included? All states, all counties, correct ranges?

• Check for missing data, duplicates, internal problems.
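Most of the checks above can be scripted so they run every time a new file arrives. The following is a minimal sketch, assuming a small made-up CSV extract; the column names, the expected record count, the county list and the valid year range are all hypothetical values that would normally come from the documentation and published reports.

```python
import csv
import io
from collections import Counter

# Hypothetical extract; in practice this would be the file the agency sent.
RAW = """id,county,year,amount
1,Adams,2013,100
2,Baker,2013,
3,Clark,2031,250
3,Clark,2031,250
"""

EXPECTED_RECORDS = 5  # assumed figure taken from the documentation
EXPECTED_COUNTIES = {"Adams", "Baker", "Clark", "Dane"}
VALID_YEARS = range(2000, 2016)

records = list(csv.DictReader(io.StringIO(RAW)))
problems = []

# Know how many records you should have.
if len(records) != EXPECTED_RECORDS:
    problems.append(f"expected {EXPECTED_RECORDS} records, got {len(records)}")

# Are all possibilities included?
missing_counties = EXPECTED_COUNTIES - {r["county"] for r in records}
if missing_counties:
    problems.append(f"counties absent: {sorted(missing_counties)}")

# Correct ranges and missing data, record by record.
for r in records:
    if int(r["year"]) not in VALID_YEARS:
        problems.append(f"id {r['id']}: year {r['year']} out of range")
    blanks = [field for field, value in r.items() if value == ""]
    if blanks:
        problems.append(f"id {r['id']}: missing {blanks}")

# Duplicates.
id_counts = Counter(r["id"] for r in records)
duplicates = sorted(k for k, n in id_counts.items() if n > 1)
if duplicates:
    problems.append(f"duplicate ids: {duplicates}")

# A total to compare against the figure in the published report.
total = sum(int(r["amount"]) for r in records if r["amount"])
```

Each entry in `problems` is a question to take back to the agency rather than something to fix silently.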

Jennifer LaFleur

jlafleur@cironline.org

@j_la28

www.cironline.org
