www.bea.gov
Big Data in the National Accounts
Experience in the United States
Brent Moulton
Advisory Expert Group on National Accounts
Washington, DC
9 September 2014
What are big data?
▪ Wikipedia: “Any collection of data sets so large and complex that it becomes difficult to process using… traditional data processing applications.”
▪ IBM: “Every day we create 2.5 quintillion bytes of data… This data comes from everywhere… This is big data.”
▪ Forbes: “12 big data definitions: what’s yours?”
  - #11: “The belief that the more data you have, the more insights and answers will arise automatically from the pool”
  - #12: “A new attitude… that combining data from multiple sources could lead to better decisions.”
Big data and official statistics
▪ Statistical agencies as producers of big data
  - Consistency in format and presentation
  - Catalogued in common, machine-readable format
  - Accessible in bulk
  - Desirable to make government data available on a single platform
▪ Big data as source data for national accounts
  - Administrative data, especially micro-data
  - Data from private sources
  - Web scraping
Concerns about using big data
▪ Do the concepts match those needed for national accounts?
▪ How representative are the data?
  - Selection biases
▪ Is it possible to fill the gaps in coverage?
▪ Do the data provide consistent time series and classifications?
▪ How timely are the data?
▪ How cost effective?
Defined-benefit pension funds
▪ For the SNA’s new treatment of defined-benefit pensions, BEA found it useful to work with administrative micro-data filed by pension funds
  - “Form 5500” data from Pension Benefit Guaranty Corporation
  - ~45,000 records per year covering 98% of private pension funds
  - BEA had to edit the data to remove errors and anomalies
Private source data for early estimates
▪ For the “advance” GDP estimate (released about 30 days after the end of the quarter), official monthly/quarterly indicators are not always available
▪ Examples of private source data used by BEA:
  - Ward’s/J.D. Power/Polk (auto sales/price/registrations)
  - American Petroleum Institute (oil drilling)
  - Air Transport Association of America (airlines)
  - Variety magazine (motion picture admissions)
  - Smith Travel Research (hotels and motels)
  - Investment Company Institute (mutual fund sales)
Health care satellite account
▪ Schultze Commission (At What Price? 2002) recommended that health care price indexes be based on the cost of treating a specific diagnosis
▪ BEA is preparing a health care satellite account (http://www.bea.gov/national/health_care_satellite_account.htm)
  - One approach uses insurance claims data for several million insured individuals
  - Claims grouped in disease episodes
  - Allows comparison of change in cost for treating particular diseases
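A minimal sketch of the episode-based comparison, assuming the claims have already been grouped into disease episodes. The record layout, disease names, and dollar amounts are invented for illustration; this is not BEA data, and the slides do not specify BEA's actual method.

```python
from collections import defaultdict

# Hypothetical episode records: (year, disease, total cost of the episode).
# All values are made up for illustration.
episodes = [
    (2010, "diabetes", 1000.0),
    (2010, "diabetes", 1200.0),
    (2011, "diabetes", 1150.0),
    (2011, "diabetes", 1270.0),
    (2010, "asthma", 400.0),
    (2011, "asthma", 420.0),
]

def average_cost_per_episode(records):
    """Return {(year, disease): mean cost per episode}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year, disease, cost in records:
        totals[(year, disease)] += cost
        counts[(year, disease)] += 1
    return {key: totals[key] / counts[key] for key in totals}

def cost_relative(records, disease, base_year, current_year):
    """Ratio of mean episode cost in current_year to base_year for one disease."""
    avg = average_cost_per_episode(records)
    return avg[(current_year, disease)] / avg[(base_year, disease)]

diabetes_change = cost_relative(episodes, "diabetes", 2010, 2011)
asthma_change = cost_relative(episodes, "asthma", 2010, 2011)
```

A ratio above 1 means the average cost of treating an episode of that disease rose between the two years. A real index would also need to weight diseases and control for changes in patient mix, which this sketch leaves out.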
Local area tracking system
▪ Used by BEA’s regional accounts staff for independent data on regional economies
▪ Used to vet official statistics before publishing
▪ Types of data
  - Employment data: largest employers, principal industries, recent layoffs
  - Natural events affecting the economy
  - Local real estate and financial trends
▪ Automated using web scraping methods
  - Identifying key word searches
  - Archiving relevant articles
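The key word search and archiving steps might look roughly like the sketch below. It assumes the articles have already been downloaded; the key word list, the article fields, and the example.com URLs are hypothetical, since the slides do not describe the search terms or tools BEA actually uses.

```python
import json
import re

# Hypothetical search terms, chosen only to mirror the data types listed above.
KEYWORDS = ["layoff", "plant closing", "hurricane", "foreclosure"]

def matched_keywords(text, keywords=KEYWORDS):
    """Return the keywords found in text, ignoring case."""
    found = []
    for kw in keywords:
        # \b anchors the start of the keyword, so "layoff" matches
        # "layoffs" but not "playoff".
        if re.search(r"\b" + re.escape(kw), text, flags=re.IGNORECASE):
            found.append(kw)
    return found

def archive_relevant(articles):
    """Return one JSON line per article that matches at least one keyword."""
    lines = []
    for article in articles:
        hits = matched_keywords(article["title"] + " " + article["text"])
        if hits:
            record = {
                "url": article["url"],
                "title": article["title"],
                "keywords": hits,
            }
            lines.append(json.dumps(record))
    return lines

# Made-up articles standing in for scraped local news.
sample = [
    {"url": "https://example.com/a",
     "title": "Factory announces layoffs",
     "text": "About 200 jobs will be cut."},
    {"url": "https://example.com/b",
     "title": "Team reaches playoff round",
     "text": "Fans celebrate downtown."},
]
archived = archive_relevant(sample)
```

Only the first sample article is archived. Storing the matched key words alongside the URL lets an analyst see why each article was kept.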
BEA research on depreciation
▪ Identifying depreciation in the presence of obsolescence is a long-standing issue
▪ BEA research on motor vehicle depreciation proposes to address this problem using data on “build dates,” which can differ from model years
▪ Data scraping – VIN-level data from decodethis.com combined with auction data from NADA and data from other auto websites
▪ Goal is improved estimates of depreciation
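To illustrate why build dates matter, the sketch below measures a vehicle's age from its build date and fits a constant geometric depreciation rate to price-by-age data with a log-linear least-squares regression. This is a generic textbook technique used here as an assumption; the slides do not describe the estimation method in the BEA research, and the prices are synthetic.

```python
import math
from datetime import date

def age_in_years(build_date, sale_date):
    """Age at sale measured from the build date, not the model year."""
    return (sale_date - build_date).days / 365.25

def geometric_depreciation_rate(observations):
    """Fit log(price) = a + b * age by least squares and return 1 - exp(b).

    observations is a list of (age_in_years, price) pairs.
    """
    n = len(observations)
    ages = [age for age, _ in observations]
    log_prices = [math.log(price) for _, price in observations]
    mean_age = sum(ages) / n
    mean_log_price = sum(log_prices) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (p - mean_log_price)
              for a, p in zip(ages, log_prices))
    slope = sxy / sxx
    return 1.0 - math.exp(slope)

# Synthetic prices that lose exactly 15% of their value per year.
observations = [(age, 20000.0 * 0.85 ** age) for age in (1.0, 2.0, 3.0, 4.0)]
rate = geometric_depreciation_rate(observations)

# Two vehicles of the same model year can differ in age at sale
# when their build dates differ (dates are hypothetical).
early_build = age_in_years(date(2010, 1, 1), date(2012, 1, 1))
late_build = age_in_years(date(2010, 9, 1), date(2012, 1, 1))
```

On the synthetic data the fitted rate recovers the 15% annual decline. The sketch does not separate obsolescence from physical wear, which is the harder problem the slide raises.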
Conclusions
▪ Big data will become increasingly important
▪ Priority should go to improving data quality, filling gaps, and keeping up with the changing economy
▪ Big data especially useful for research projects
▪ Big data may allow for more timely or higher frequency estimates
▪ Attention must continue to be paid to traditional data quality issues