using big data to solve economic and social problems...spring 2019 using big data to solve economic...

47
Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D. Harvard University

Upload: others

Post on 26-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Spring 2019

Using Big Data to Solve Economic and Social Problems

Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D.Harvard University

Page 2: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Improving Health Outcomes

Research in economics typically focuses on earnings or wealth as key outcomes of interest

But most people view health and life expectancy as among the most important aspects of well-being

What interventions are most effective in improving health (holding fixed current frontier of medical technology)?

- Research on these issues spans multiple fields, from epidemiology and public health to economics

Page 3: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

This part of the class illustrates how big data is helping us learn how to improve health, in three segments:

1. Descriptive analysis of health outcomes in U.S. population [method: survival analysis]

Chetty, Stepner, Abraham, Lin, Scuderi, Bergeron, Cutler. “The Association Between Income and Life Expectancy in the United States” JAMA 2016.

Improving Health Outcomes: Overview

Page 4: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

This part of the class illustrates how big data is helping us learn how to improve health, in three segments:

1. Descriptive analysis of health outcomes in U.S. population [method: survival analysis]

2. Economics applications: impacts of food stamps (Jesse Shapiro) and health insurance [method: regression discontinuities]

Wherry, Miller, Kaestner, Meyer. “Childhood Medicaid Coverage and Later Life Health Care Utilization” REStat 2017.

Improving Health Outcomes: Overview

Page 5: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

This part of the class illustrates how big data is helping us learn how to improve health, in three segments:

1. Descriptive analysis of health outcomes in U.S. population [method: survival analysis]

2. Economics applications: impacts of food stamps (Jesse Shapiro) and health insurance [method: regression discontinuities]

3. Epidemiology application: using big data to forecast pandemics [method: predictive modeling]

Ginsberg, Mohebbi, Patel, Brammer, Smolinski, Brilliant. “Detecting Influenza Epidemics Using Search Engine Query Data.” Nature 2009.

Lazer, Kennedy, King, Vespignani. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 2014.

Improving Health Outcomes: Overview

Page 6: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Most common measure of health: mortality rates

– Crude but well measured in population data

Begin with basic descriptive facts about life expectancy in America

Chetty et al. (2016) examine relationship between life expectancy and income

– Use data on entire U.S. population from 1999-2013 (1.4 billion observations)

Income and Life Expectancy

Page 7: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Estimating Life Expectancy: Data

Mortality measured using Social Security death records

Income measured at household level using tax returns

Focus on percentile ranks in income distribution

- Rank individuals in national income distribution within birth cohort, gender, and tax year

Page 8: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Methodology to Estimate Life Expectancy

Goal: estimate expected age of death conditional on an individual’s income at age 40, controlling for differences in race and ethnicity

- Period life expectancy: life expectancy for a hypothetical individual who experiences mortality rates at each age observed in a given year

Three steps:

1. Calculate mortality rates by income rank and age for observed ages

2. Estimate a survival model to extrapolate to older ages

3. Adjust for racial differences in mortality rates

Page 9: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Survival Curve for Men at 5th Percentile

Age 76

020

4060

8010

0Su

rviv

al R

ate

(%)

40 60 80 100 120Age in Years (a)

Page 10: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Age 76

020

4060

8010

0Su

rviv

al R

ate

(%)

40 60 80 100 120Age in Years (a)

Survival Curves for Men at 5th and 95th Percentiles

Page 11: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Age 76

020

4060

8010

0Su

rviv

al R

ate

(%)

40 60 80 100 120Age in Years (a)

Survival Curves for Men at 5th and 95th Percentiles

p5 Survival Rate: 52%

p95 Survival Rate: 83%

Page 12: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Step 2: Predicting Mortality Rates at Older Ages

To calculate life expectancy, need estimates of mortality rates beyond age 76

Gompertz (1825) documented a robust empirical pattern: mortality rates grow exponentially with age

Page 13: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Mortality Rates by Gender in the United States in 2001: CDC Data

-6-4

-20

Log

Mor

talit

y R

ate

40 50 60 70 80 90 100Age in Years

Age 76

Men Women

Page 14: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

-8-6

-4-2

Log

Mor

talit

y R

ate

40 50 60 70 80 90Age in Years

Gompertz: p95

Log Mortality Rates for Men at 5th and 95th Percentiles

Data: p5 Data: p95Gompertz: p5

Page 15: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Gompertz: p95Data: p5 Data: p95Gompertz: p5

-8-6

-4-2

Log

Mor

talit

y R

ate

40 50 60 70 80 90Age in Years

Medicare Eligibility Threshold

Age 65

Log Mortality Rates for Men at 5th and 95th Percentiles

Page 16: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Age 76 Age 90

GompertzExtrapolation0

2040

6080

100

Surv

ival

Rat

e (%

)

40 60 80 100 120Age in Years (a)

Gompertz: p95Data: p5 Data: p95Gompertz: p5

Survival Curves for Men at 5th and 95th Percentiles

Page 17: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Gompertz: p95Data: p5 Data: p95Gompertz: p5

Age 76 Age 90

GompertzExtrapolation

NCHS and SSAEstimates(constant acrossincome groups)

020

4060

8010

0Su

rviv

al R

ate

(%)

40 60 80 100 120Age in Years (a)

Survival Curves for Men at 5th and 95th Percentiles

Page 18: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

National Statistics on Income and Life Expectancy

Page 19: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

7075

8085

90Ex

pect

ed A

ge a

t Dea

th fo

r 40

Year

Old

s in

Yea

rs

0 20$25k

40$47k

60$74k

80$115k

100$2.0M

Household Income Percentile

Expected Age at Death vs. Household Income PercentileFor Men at Age 40

Page 20: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

7075

8085

90Ex

pect

ed A

ge a

t Dea

th fo

r 40

Year

Old

s in

Yea

rs

0 20$25k

40$47k

60$74k

80$115k

100$2.0M

Household Income Percentile

Bottom 1%: 72.7 Years

Top 1%: 87.3 Years

Expected Age at Death vs. Household Income PercentileFor Men at Age 40

Page 21: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

U.S. Life Expectancies by Percentile in Comparison toMean Life Expectancies Across Countries

Lesotho

Zambia

IndiaIraqSudan

Pakistan

LibyaChina

United KingdomCanada

San Marino

United States - P1

United States - P25

United States - P50United States - P100

60 65 70 75 80 85 90Expected Age at Death for 40 Year Old Men

Page 22: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Women

Men

Women, Bottom 1%: 78.8Women, Top 1%: 88.9Men, Bottom 1%: 72.7

Men, Top 1%: 87.37075

8085

90Ex

pect

ed A

ge a

t Dea

th fo

r 40

Year

Old

s in

Yea

rs

0 20 40 60 80 100Household Income Percentile

Expected Age at Death vs. Household Income PercentileBy Gender at Age 40

Page 23: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Women, Bottom 1%: 78.8Women, Top 1%: 88.9Men, Bottom 1%: 72.7

Men, Top 1%: 87.37075

8085

90Ex

pect

ed A

ge a

t Dea

th fo

r 40

Year

Old

s in

Yea

rs

0 20 40 60 80 100Household Income Percentile

Bottom 1% Gender Gap6.1 years

Top 1% Gender Gap1.6 years

Expected Age at Death vs. Household Income PercentileBy Gender at Age 40

Page 24: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Time Trends

How are gaps in life expectancy changing over time?

Page 25: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Annual Change = 0.08 (0.05, 0.11)

Annual Change = 0.12 (0.08, 0.16)

Annual Change = 0.18 (0.15, 0.20)

Annual Change = 0.20 (0.17, 0.24)

7580

8590

Expe

cted

Age

at D

eath

for 4

0 Ye

ar O

lds

in Y

ears

2000 2005 2010 2015Year

Trends in Expected Age at Death by Income Quartile in the USFor Men Age 40, 2001-2014

1st Quartile 3rd Quartile2nd Quartile 4th Quartile

Page 26: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Annual Change = 0.10 (0.06, 0.13)

Annual Change = 0.17 (0.13, 0.20)

Annual Change = 0.25 (0.22, 0.28)

Annual Change = 0.23 (0.20, 0.25)

8284

8688

90Ex

pect

ed A

ge a

t Dea

th fo

r 40

Year

Old

s in

Yea

rs

2000 2005 2010 2015Year

1st Quartile 3rd Quartile2nd Quartile 4th Quartile

Trends in Expected Age at Death by Income Quartile in the USFor Women Age 40, 2001-2014

Page 27: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Local Area Variation in Life Expectancy by Income

Page 28: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

New York City

San Francisco

Dallas

Detroit

7075

8085

90

0 25$30k

50$60k

75$101k

100$683k

Household Income Percentile

Expected Age at Death vs. Household Incomefor Men in Selected Cities

Page 29: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

New York City

San Francisco

Dallas

Detroit

7075

8085

90

0$27k

50$54k

75$95k

100$653k

Household Income Percentile

Expected Age at Death vs. Household Incomefor Women in Selected Cities

Page 30: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Expected Age at Death for 40 Year Old MenBottom Quartile of U.S. Income Distribution

Note: Lighter Colors Represent Areas with Higher Life Expectancy

Page 31: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Expected Age at Death for 40 Year Old MenPooling All Income Groups

Note: Lighter Colors Represent Areas with Higher Life Expectancy

Page 32: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Expected Age at Death for 40 Year Old WomenBottom Quartile of U.S. Income Distribution

Note: Lighter Colors Represent Areas with Higher Life Expectancy

Page 33: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Expected Age at Death for 40 Year Olds in Bottom QuartileTop 10 and Bottom 10 CZs Among 100 Largest CZs

Top 10 CZs Bottom 10 CZs

Rank CZ Expected Age at Death Rank CZ Expected Age

at Death

1 New York, NY 81.8 (81.6, 82.0) 91 San Antonio, TX 78.0 (77.6, 78.4)

2 Santa Barbara, CA 81.7 (81.3, 82.1) 92 Louisville, KY 77.9 (77.7, 78.2)

3 San Jose, CA 81.6 (81.2, 82.0) 93 Toledo, OH 77.9 (77.6, 78.2)

4 Miami, FL 81.2 (80.9, 81.6) 94 Cincinnati, OH 77.9 (77.7, 78.1)

5 Los Angeles, CA 81.1 (80.9, 81.4) 95 Detroit, MI 77.7 (77.5, 77.8)

6 San Diego, CA 81.1 (80.8, 81.4) 96 Tulsa, OK 77.6 (77.4, 77.9)

7 San Francisco, CA 80.9 (80.6, 81.3) 97 Indianapolis, IN 77.6 (77.4, 77.8)

8 Santa Rosa, CA 80.8 (80.5, 81.2) 98 Oklahoma City, OK 77.6 (77.3, 77.8)

9 Newark, NJ 80.7 (80.5, 80.9) 99 Las Vegas, NV 77.6 (77.4, 77.8)

10 Port St. Lucie, FL 80.7 (80.5, 80.9) 100 Gary, IN 77.4 (77.1, 77.8)

Note: 95% confidence intervals shown in parentheses

Page 34: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Why Does Life Expectancy for Low-Income Individuals Vary Across Areas?

Page 35: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Why Does Life Expectancy for Low-Income Individuals Vary Across Areas?

Now use local area variation to explore determinants of life expectancy

Key question: is lower life expectancy in some areas driven by lack of access to health care or differences in health behavior?

Correlate life expectancy estimates with measure of health care access and health behaviors to answer this question

Page 36: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Correlations of Expected Age at Death with Health and Social FactorsFor Individuals in Bottom Quartile of Income Distribution

Page 37: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Smoking Rates for Individuals in Bottom Income Quartile

Note: Lighter Colors Represent Areas Lower Smoking Rates

Page 38: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Correlations of Expected Age at Death with Health and Social FactorsFor Individuals in Bottom Quartile of Income Distribution

Page 39: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Correlations of Expected Age at Death with Other FactorsFor Individuals in Bottom Quartile of Income Distribution

Page 40: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Local area variation suggests that differences in health behaviors are more predictive of life expectancy than differences in health care access

Further evidence for this view comes from directly examining nutritional patterns

Why Does Life Expectancy for Low-Income Individuals Vary Across Areas?

Page 41: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Alcott et al. (2018) use Nielsen homescan data on grocery store purchases to examine how nutrition varies with income

About 170,000 households who scan all of their purchases and record UPCs, which are then matched to nutritional information from the USDA

Differences in Nutrition by Income

Page 42: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Healthfulness of Grocery Purchases by Household Income

Source: Allcott, Diamond, Dube, Handbury, Rahkovsky, and Schnell 2018

4045

5055

Gra

ms

adde

d su

gar p

er 1

,000

cal

orie

s

0 50 100 150Household income ($000s)

Added sugar

Page 43: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Healthfulness of Grocery Purchases by Household Income

Source: Allcott, Diamond, Dube, Handbury, Rahkovsky, and Schnell 2018

Page 44: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

These differences in nutrition are not driven by a lack of access to health food (“food deserts”)

Differences in Nutrition by Income

Page 45: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Source: Allcott, Diamond, Dube, Handbury, Rahkovsky, and Schnell 2018

Healthfulness of Grocery Purchases by Household Income that Shop in the Same Market

Page 46: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

These differences in nutrition are not driven by a lack of access to health food (“food deserts”)

Again suggests that differences in health outcomes are not caused by a direct lack of access to resources

Instead, appear to be due to different choices made by lower-income households

Differences in Health Behaviors by Income

Page 47: Using Big Data to Solve Economic and Social Problems...Spring 2019 Using Big Data to Solve Economic and Social Problems Professor Raj Chetty Head Section Leader: Gregory Bruich, Ph.D

Why do low income households tend to have less healthy behaviors?

One hypothesis: effects of environment and resources at early ages on preferences

– Ex: Atkin (2016) studies migrants in India and shows that nutritional habits formed at young ages persist for many years after people move

Alternative hypothesis: lack of income constrains choice (unhealthy foods may be less expensive per calorie)

– Discuss this economic explanation in next lecture with Jesse Shapiro

Differences in Health Behaviors by Income