TRANSCRIPT
#DataSmartSummit
THE VALUE OF DATA-DRIVEN DECISION-MAKING
RAYID GHANI, Director
Center for Data Science & Public Policy
University of Chicago
@rayidghani
SUMMIT ON DATA-SMART GOVERNMENT #DataSmartSummit November 2017
Rayid Ghani @rayidghani
Data-Driven Decision Making for Governments
Rayid Ghani
Center for Data Science & Public Policy
Rayid Ghani University of Chicago @rayidghani
• Examples from the field
• Data Analytics Capabilities
– What’s possible?
– What’s hard?
– What to watch out for?
• How to get started?
– Identifying and Framing the problem
– Building a team
– Identifying data needs and maturity
– Building, Testing, and Deploying the solution
– Change Management
Agenda
Impaired Attention
Lack of Motor Skills
Learning Disability
Memory Problems
Hearing Loss
Lower IQ
LEAD
Children in at least 4 million U.S. households are exposed to high levels of lead (CDC Report)
Preventing Lead Poisoning in Children (Chicago, IL)
50% fewer false positives
20% more officers identified correctly
Reducing Adverse Police Incidents (Charlotte Mecklenburg, NC and Nashville, TN)
More accurately and 4 years earlier
Increasing Educational Outcomes in Schools (10+ school districts across the US)
11 million people move through 3,100 Jails
$22 Billion in costs
64% suffer from mental illness, 68% have a substance abuse disorder, 44% suffer from chronic health problems
In the top 200 predictions, 104 individuals went to jail in the next year
19 years total jail time
Reducing # of people going to Jail (Johnson County, KS)
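The figures on this slide amount to a precision-at-k measurement: of the 200 highest-risk predictions, 104 individuals went to jail in the next year. The sketch below shows that arithmetic in Python. The function name `precision_at_k` and the toy scores are illustrative assumptions, not taken from the project's code.

```python
def precision_at_k(scores, outcomes, k):
    """Fraction of the k highest-scored cases whose outcome was positive."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of cases")
    # Rank case indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(outcomes[i] for i in top_k) / k


# Figures from the slide: 104 of the top 200 predictions went to jail.
slide_precision = 104 / 200

# Toy data (hypothetical): five people with model scores and observed outcomes.
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.8]
toy_outcomes = [1, 0, 0, 1, 1]
toy_precision = precision_at_k(toy_scores, toy_outcomes, k=3)
```

Measuring only the top of the ranked list matches the resource constraint: an agency can typically act on only a fixed number of cases, so what matters is how many of those selected cases were correct.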
46% of the Mexican Population lives in poverty
~7.5 Million additional people can potentially be
better matched with services they need
Matching citizens with social services they need (Mexico)
Before: ~400 Violations per 1,000
After: ~750 Violations per 1,000
87% Improvement
Increasing Compliance with Environmental Policies (EPA, NYSDEC)
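The improvement figure on this slide can be checked as a relative change between the before and after rates. This is a minimal sketch, assuming the improvement is computed relative to the "before" rate; the helper name `relative_change` is hypothetical. With the slide's approximate inputs (~400 and ~750) the result is 87.5%, consistent with the rounded 87% shown.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


# Approximate figures from the slide (violations per 1,000).
improvement = relative_change(400, 750)
improvement_pct = round(improvement * 100, 1)
```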
240,000 main breaks/yr in US
$13 billion in 2010 to repair
Expected $30 billion by 2040
180 breaks/yr in Syracuse, NY
64% of blocks in the top 1% of predictions were correctly predicted for last year
Preventing and Reducing Water Mains breaks (Syracuse, NY)
2.4 billion people don’t have sanitation facilities
946 million people defecate in the open.
2.5 million people die each year due to sanitation-related diseases.
Operate 2.5x more toilets with the same resources
Serve 45,000 more people
Increasing efficiency of waste pickup (Kenya)
27% of EMS incidents are under-dispatched
2,440 yearly can get to the hospital faster
Improving the effectiveness of EMS Dispatches (Cincinnati, OH)
More details on projects at http://dssg.uchicago.edu/projects
Data Analytics Capabilities
Capabilities: What’s possible?
Types of Data
• Program Level
• Transactional
• Spatial
• Text
• Images/Audio/Video
Types of Business Problems
• Early Warning
• Prioritization/Resource Constraints
• Routing
• Scheduling
Types of Analysis
• Description
• Detection
• Prediction
• Optimization
• Behavior Change
• Building Inspections for Code Violations
• Getting 311 requests to the right department
• Targeting employment training opportunities
• Fraud Audits
• Placement of Mobile Clinics
Let’s work through some examples
Early Warning
Prioritization/Resource Constraints
Routing
Scheduling
• Anonymization (not De-Identification)
• Incorporating Social Media effectively into analytical systems
• Dealing with Bias and Fairness
• “Explaining” the Analysis
• “Deep Learning”
• Artificial Intelligence
Capabilities: What’s hard? What’s hype?
• Data Systems
– Can I export data out of it?
– Can I put more data in?
– Can I link data?
• Analytics Tools
– Can I take action based on it?
– How does it fit into my workflow?
– How does it compare to simple baselines?
– Can’t Watson do it as well?
Capabilities: What to watch out for?
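The question "How does it compare to simple baselines?" can be made concrete by scoring an analytics tool's ranking against a rule of thumb and against the overall base rate. The sketch below does this on toy data. The field names (`model_score`, `prior_incidents`, `outcome`) and all values are hypothetical, and precision in the top k is only one possible metric.

```python
def precision_at_k(ranked_outcomes, k):
    """Share of positive outcomes among the first k ranked cases."""
    return sum(ranked_outcomes[:k]) / k


def rank_by(cases, key):
    """Return outcomes ordered from highest to lowest value of `key`."""
    ordered = sorted(cases, key=lambda c: c[key], reverse=True)
    return [c["outcome"] for c in ordered]


# Hypothetical cases: a tool's score, a simple heuristic, and what happened.
cases = [
    {"model_score": 0.91, "prior_incidents": 3, "outcome": 1},
    {"model_score": 0.85, "prior_incidents": 0, "outcome": 1},
    {"model_score": 0.80, "prior_incidents": 1, "outcome": 1},
    {"model_score": 0.40, "prior_incidents": 5, "outcome": 0},
    {"model_score": 0.35, "prior_incidents": 4, "outcome": 1},
    {"model_score": 0.30, "prior_incidents": 0, "outcome": 0},
    {"model_score": 0.20, "prior_incidents": 2, "outcome": 0},
    {"model_score": 0.10, "prior_incidents": 0, "outcome": 0},
]

k = 3  # number of cases the team has capacity to act on
model_p = precision_at_k(rank_by(cases, "model_score"), k)
heuristic_p = precision_at_k(rank_by(cases, "prior_incidents"), k)
base_rate = sum(c["outcome"] for c in cases) / len(cases)  # random selection
```

If the tool does not clearly beat the heuristic and the base rate on historical data, its added complexity is hard to justify.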
Time for a Break
When we come back, we will focus
on how to get started:
● Framing the problem
● Building/Structuring a team
● Getting the data
● Testing/Deploying the Solution
● Change Management
• Define a concrete problem and a specific goal you are trying to influence
• Identify whether you have a real opportunity to influence that goal
– What actions can you take?
– What data do you have and will need?
– What analysis will need to be done and how will you validate the analysis?
• Ensure you have a champion with the authority to move resources and set expectations for participation and collaboration
Getting Started: Framing the Problem
Getting Started: Building your Team
Skills
Computer Science & Programming
Statistics
Social Sciences
Experimental Design
Ethics & Legal Issues
Communication
Problem Formulation
Training Social Scientists in Computational Methods & Tools
The Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship
http://dssg.uchicago.edu
@datascifellows
Applied Data Analytics for Public Policy
www.applieddataanalytics.org
Getting Started: Building your Team
Centralized
Distributed
Getting Started: Identifying the Data
Maturity levels (columns): Lagging, Basic, Advanced, Leading
How is Data Stored?
– Accessibility
– Storage
– Integration
What Data is Being Collected?
– Relevancy & Sufficiency
– Quality
– Collection Frequency
– Granularity
– History
http://dsapp.uchicago.edu/resources/datamaturity/
Getting Started: Building, Testing, and Deploying the Solution
Collaboration Tools
Privacy, Confidentiality, Security
• Go back to the metrics and goals defined at the beginning of the project
• Run a Pilot
• Deploy
• Set up Infrastructure and allocate resources to monitor “lift”
Getting Started: Building, Testing, and Deploying the Solution
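Monitoring "lift" after deployment can start with a periodic comparison of outcomes for model-selected cases against cases selected the usual way. This is a minimal sketch under one assumed definition: lift as the ratio of the hit rate in the model-selected group to the hit rate in a business-as-usual group. The log structure, month labels, and values are hypothetical.

```python
def hit_rate(outcomes):
    """Share of cases with a positive outcome (1 = problem found)."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


def lift(model_outcomes, baseline_outcomes):
    """Ratio of the model group's hit rate to the baseline group's hit rate."""
    base = hit_rate(baseline_outcomes)
    if base == 0:
        raise ValueError("baseline hit rate is zero; lift is undefined")
    return hit_rate(model_outcomes) / base


# Hypothetical pilot log: outcomes per month for each selection method.
monthly_log = {
    "month_1": {"model": [1, 1, 0, 1], "baseline": [1, 0, 0, 0]},
    "month_2": {"model": [1, 0, 1, 0], "baseline": [0, 1, 0, 0]},
}

monthly_lift = {
    month: lift(groups["model"], groups["baseline"])
    for month, groups in monthly_log.items()
}
```

Tracking the ratio over time, rather than computing it once at launch, is what makes it possible to notice when a deployed model stops helping.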
Getting Started: Building, Testing, and Deploying the Solution
Security
Privacy
Interpretability & Explanations
Fairness & Ethics
Critical Needs
• Ideally, the entire organization needs to be on board at all stages of the project
• Promote core practices and behaviors: Curiosity and Rigor
• Small Iterations – Not Big Transformations
• Metrics-Driven, but the right metrics
• Test, Test, Test
Getting Started: Change Management
From Open Data to
Open Policies
From Efficiency to
Personalized, Improved Services
From POLICIES to
policy policy policy policy policy
How are you going to use what you learned?
What’s Next?
Rayid Ghani
Center for Data Science & Public Policy
University of Chicago
Data Science for Social Good Summer Fellowship
http://dssg.uchicago.edu
Center for Data Science & Public Policy
http://dsapp.uchicago.ed
code at github.com/dssg
• Can I detect who’s going to get lead poisoning early?
• Can I determine which home inspections to prioritize?
• How do I improve the scheduling and assignment of my medics/ambulances/firetrucks?
• Can I route citizen requests more efficiently and effectively?
• Which policies do I modify to reduce maternal mortality?
• How much impact is my after-school program having?
• Can I get data that helps me match employers with employees?
Problem Templates