(real-time) data and analytics...learning disability memory problems hearing loss lower iq lead ......

#DataSmartSummit


Post on 07-Jul-2020


Page 1:

#DataSmartSummit

Page 2:

THE VALUE OF DATA-DRIVEN DECISION-MAKING

RAYID GHANI
Director, Center for Data Science & Public Policy
University of Chicago
@rayidghani

SUMMIT ON DATA-SMART GOVERNMENT
#DataSmartSummit
November 2017

Page 3:

Rayid Ghani @rayidghani

Data-Driven Decision Making for Governments

Rayid Ghani

Center for Data Science & Public Policy

Page 4:

Rayid Ghani University of Chicago @rayidghani

Agenda

• Examples from the field
• Data Analytics Capabilities
  – What’s possible?
  – What’s hard?
  – What to watch out for?
• How to get started?
  – Identifying and Framing the problem
  – Building a team
  – Identifying data needs and maturity
  – Building, Testing, and Deploying the solution
  – Change Management

Page 5:

Preventing Lead Poisoning in Children (Chicago, IL)

Children in at least 4 million U.S. households are exposed to high levels of lead (CDC Report)

LEAD
• Impaired Attention
• Lack of Motor Skills
• Learning Disability
• Memory Problems
• Hearing Loss
• Lower IQ

Page 6:

Reducing Adverse Police Incidents (Charlotte Mecklenburg, NC and Nashville, TN)

• 50% fewer false positives
• 20% more officers identified correctly

Page 7:

More accurately and 4 years earlier

Increasing Educational Outcomes in Schools (10+ school districts across the US)

Page 8:

Reducing # of people going to Jail (Johnson County, KS)

• 11 million people move through 3,100 jails
• $22 billion in costs
• 64% suffer from mental illness
• 68% have a substance abuse disorder
• 44% suffer from chronic health problems

In the top 200 predictions, 104 individuals went to jail in the next year (19 years total jail time)
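
The "top 200 predictions" figure on this slide is a precision-at-top-k measurement: of the k highest-risk individuals flagged, what share actually had the outcome. A minimal sketch of that calculation follows. The function name, the toy scores, and the toy outcomes are assumptions for illustration and are not taken from the project.

```python
def precision_at_k(scores, outcomes, k):
    """Fraction of the k highest-scored individuals who had the outcome."""
    # Rank indices by score, highest first, and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(outcomes[i] for i in top_k) / k


# Slide figures: 104 of the top 200 predictions went to jail in the next year.
slide_precision = 104 / 200  # 0.52

# Toy data, made up for illustration (1 = outcome occurred, 0 = it did not).
scores = [0.9, 0.2, 0.8, 0.4, 0.7]
outcomes = [1, 0, 0, 1, 1]
example_precision = precision_at_k(scores, outcomes, k=3)
```

With the slide's numbers, a little over half of the 200 flagged individuals went to jail within the following year.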

Page 9:

46% of the Mexican Population lives in poverty

~7.5 million additional people can potentially be better matched with services they need

Matching citizens with social services they need (Mexico)

Page 10:

Increasing Compliance with Environmental Policies (EPA, NYSDEC)

• Before: ~400 violations per 1,000
• After: ~750 violations per 1,000
• 87% improvement
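
The improvement figure on this slide is a relative change between the before and after rates. A short worked check, assuming the rounded slide values of 400 and 750 (the slide marks both as approximate, so the result only roughly matches the reported 87%):

```python
def relative_improvement(before, after):
    """Relative change from before to after, as a fraction of before."""
    return (after - before) / before


before_rate = 400  # approximate violations per 1,000 (slide figure)
after_rate = 750   # approximate violations per 1,000 (slide figure)
improvement = relative_improvement(before_rate, after_rate)
print(f"{improvement:.1%}")  # prints 87.5%
```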

Page 11:

Preventing and Reducing Water Main Breaks (Syracuse, NY)

• 240,000 main breaks/yr in US
• $13 billion in 2010 to repair
• Expected $30 billion by 2040
• 180 breaks/yr in Syracuse, NY

64% of blocks in the top 1% of predictions were correctly predicted for last year

Page 12:

2.4 billion people don’t have sanitation facilities

946 million people defecate in the open.

2.5 million people die each year due to sanitation-related diseases.

Operate 2.5x more toilets with the same resources

Serve 45,000 more people

Increasing efficiency of waste pickup (Kenya)

Page 13:

27% of EMS incidents are under-dispatched

2,440 yearly can get to the hospital faster

Improving the effectiveness of EMS Dispatches (Cincinnati, OH)

Page 14:

More details on projects at http://dssg.uchicago.edu/projects

Page 15:

Data Analytics Capabilities

Page 16:

Capabilities: What’s possible?

Types of Data

• Program Level

• Transactional

• Spatial

• Text

• Images/Audio/Video

Types of Business Problems

• Early Warning

• Prioritization/Resource Constraints

• Routing

• Scheduling

Types of Analysis

• Description

• Detection

• Prediction

• Optimization

• Behavior Change

Page 17:

Let’s work through some examples

• Building Inspections for Code Violations
• Getting 311 requests to the right department
• Targeting employment training opportunities
• Fraud Audits
• Placement of Mobile Clinics

Early Warning

Prioritization/Resource Constraints

Routing

Scheduling

Page 18:

Capabilities: What’s hard? What’s hype?

• Anonymization (not De-Identification)
• Incorporating Social Media effectively into analytical systems
• Dealing with Bias and Fairness
• “Explaining” the Analysis
• “Deep Learning”
• Artificial Intelligence
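
One way to see why anonymization is harder than de-identification: even after names are dropped, combinations of the remaining fields can still single a person out. The sketch below counts the smallest group of records sharing the same field values (the k in k-anonymity). It is an illustration only; the records, field names, and function name are hypothetical and not from the talk.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[f] for f in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical de-identified records: names removed, other fields kept.
records = [
    {"zip": "60637", "birth_year": 1980, "sex": "F"},
    {"zip": "60637", "birth_year": 1980, "sex": "F"},
    {"zip": "60615", "birth_year": 1975, "sex": "M"},
]

# k == 1 means at least one person is unique on these fields alone.
k = smallest_group_size(records, ["zip", "birth_year", "sex"])
```

Here the third record is unique on zip, birth year, and sex, so removing names did not make the data anonymous.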

Page 19:

Capabilities: What to watch out for?

• Data Systems
  – Can I export data out of it?
  – Can I put more data in?
  – Can I link data?
• Analytics Tools
  – Can I take action based on it?
  – How does it fit into my workflow?
  – How does it compare to simple baselines?
  – Can’t Watson do it as well?
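
The question "How does it compare to simple baselines?" can be made concrete by scoring the same cases three ways and counting hits among the top k for each. The sketch below is one possible setup, not the speaker's method: all data is made up, and the "rank by prior incident count" heuristic is an assumed stand-in for whatever simple rule an agency already uses.

```python
def hits_in_top_k(scores, outcomes, k):
    """Count how many of the k highest-scored cases had the outcome."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(outcomes[i] for i in ranked[:k])


# Toy data for eight cases (1 = outcome occurred).
outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
model_scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.4, 0.1]
prior_counts = [3, 0, 1, 2, 0, 0, 0, 0]  # simple heuristic: rank by prior incidents

k = 3
model_precision = hits_in_top_k(model_scores, outcomes, k) / k
heuristic_precision = hits_in_top_k(prior_counts, outcomes, k) / k
# Expected precision when picking k cases at random is the overall base rate.
random_precision = sum(outcomes) / len(outcomes)
```

An analytics tool earns its place only if its precision clearly exceeds both the heuristic and the random base rate on held-out data.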

Page 20:

Time for a Break

When we come back, we will focus on how to get started:

● Framing the problem

● Building/Structuring a team

● Getting the data

● Testing/Deploying the Solution

● Change Management

Page 21:

Getting Started: Framing the Problem

• Define a concrete problem and a specific goal you are trying to influence
• Identify whether you have a real opportunity to influence that goal
  – What actions can you take?
  – What data do you have and will need?
  – What analysis will need to be done and how will you validate the analysis?
• Ensure you have a champion with the authority to move resources and set expectations for participation and collaboration

Page 22:

Getting Started: Building your Team

Page 23:

Getting Started: Building your Team

Page 24:

Getting Started: Building your Team

Skills

Computer Science & Programming

Statistics

Social Sciences

Experimental Design

Ethics & Legal Issues

Communication

Problem Formulation

Page 25:

Training Social Scientists in Computational Methods & Tools

Page 26:

Page 27:

The Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship

http://dssg.uchicago.edu

@datascifellows

Page 28:

Applied Data Analytics for Public Policy

www.applieddataanalytics.org

Page 29:

Getting Started: Building your Team

Centralized

Distributed

Page 30:

Getting Started: Identifying the Data

Table columns: Category, Area, Lagging, Basic, Advanced, Leading (cell contents not preserved in this transcript)

Category: How is Data Stored?
• Accessibility
• Storage
• Integration

Category: What Data is Being Collected?
• Relevancy & Sufficiency
• Quality
• Collection Frequency
• Granularity
• History

http://dsapp.uchicago.edu/resources/datamaturity/
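
The table on this slide rates each area on a four-level scale from Lagging to Leading. As a rough sketch of how a self-assessment against it might be recorded and summarized, the snippet below stores one rating per area and reports the weakest ones. The ratings are invented for illustration, and the helper function is hypothetical; the framework itself is described at the URL above.

```python
# Levels in order, taken from the slide's column headers.
LEVELS = ["Lagging", "Basic", "Advanced", "Leading"]

# Hypothetical self-assessment; these ratings are made up for illustration.
assessment = {
    "Accessibility": "Basic",
    "Storage": "Advanced",
    "Integration": "Lagging",
    "Relevancy & Sufficiency": "Basic",
    "Quality": "Basic",
    "Collection Frequency": "Advanced",
    "Granularity": "Leading",
    "History": "Basic",
}


def weakest_areas(ratings):
    """Return the areas rated at the lowest level present, sorted by name."""
    lowest = min(LEVELS.index(level) for level in ratings.values())
    return sorted(a for a, level in ratings.items() if LEVELS.index(level) == lowest)


gaps = weakest_areas(assessment)
```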

Page 31:

Getting Started: Building, Testing, and Deploying the Solution

Collaboration Tools

Privacy, Confidentiality, Security

Page 32:

Getting Started: Building, Testing, and Deploying the Solution

Page 33:

Getting Started: Building, Testing, and Deploying the Solution

• Go back to the metrics and goals defined at the beginning of the project
• Run a Pilot
• Deploy
• Set up Infrastructure and allocate resources to monitor “lift”
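
The slide does not define “lift”. One common reading, assumed here, is the ratio of the hit rate under model-based targeting to the hit rate under the existing process, measured during the pilot and tracked after deployment. A minimal sketch with made-up pilot data:

```python
def rate(outcomes):
    """Share of cases with a positive outcome."""
    return sum(outcomes) / len(outcomes)


def lift(targeted_outcomes, baseline_outcomes):
    """Hit rate under model targeting divided by hit rate under the existing process."""
    return rate(targeted_outcomes) / rate(baseline_outcomes)


# Hypothetical pilot: 1 = the visit found a problem, 0 = it did not.
pilot_model = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]     # 7 of 10 found a problem
pilot_existing = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]  # 4 of 10 found a problem
pilot_lift = lift(pilot_model, pilot_existing)   # 0.7 / 0.4
```

A lift that drifts toward 1 after deployment would suggest the model no longer outperforms the process it replaced, which is why the slide calls for dedicated monitoring resources.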

Page 34:

Getting Started: Building, Testing, and Deploying the Solution

Critical Needs
• Security
• Privacy
• Interpretability & Explanations
• Fairness & Ethics

Page 35:

Getting Started: Change Management

• Ideally, the entire organization needs to be on board at all stages of the project
• Promote core practices and behaviors: Curiosity and Rigor
• Small Iterations – Not Big Transformations
• Metrics-Driven, but the right metrics
• Test, Test, Test

Page 36:

From Open Data to

Open Policies

Page 37:

From Efficiency to

Personalized, Improved Services

Page 38:

From POLICIES to

policy policy policy policy policy

Page 39:

How are you going to use what you learned?

What’s Next?

Page 40:

Rayid Ghani

Center for Data Science & Public Policy

University of Chicago

[email protected]

Data Science for Social Good Summer Fellowship

http://dssg.uchicago.edu

Center for Data Science & Public Policy

http://dsapp.uchicago.ed

code at github.com/dssg

Page 41:

Problem Templates

• Can I detect who’s going to get lead poisoning early?
• Can I determine which home inspections to prioritize?
• How do I improve the scheduling and assignment of my medics/ambulances/firetrucks?
• Can I route citizen requests more efficiently and effectively?
• Which policies do I modify to reduce maternal mortality?
• How much impact is my after-school program having?
• Can I get data that helps me match employers with employees?