
Post on 27-Dec-2015


1

Using Data Science as Evidence in Public Policy With Big Data and Elections

Dr. Brand Niemann
Director and Senior Enterprise Architect – Data Scientist

Semantic Community
http://semanticommunity.info/

AOL Government Blogger
http://gov.aol.com/bloggers/brand-niemann/

November 1-2, 2012
http://semanticommunity.info/CNSTAT

2

Start by Asking Questions

• Which data is by State, which by Congressional District, and which by time?
• Which is the easiest to reformat?
• Which is the most interesting?
• Where have the candidates been?
• Which data is free?
• Etc.

Note: Drew Conway (@drewconway) speaking about the joys, challenges, and power of data science: "Data science, as a discipline, is fundamentally about human behavior."
http://semanticommunity.info/AOL_Government/2012_Recorded_Future_User_Conference

3

Then Look for the Evidence

• Brainstorm:
  – What Have I Done Before?
• 2012 Annual Statistical Abstract:
  – Chapter 7. Elections
• Google Searches:
  – Election and Voting Data
• Conferences:
  – National Academy Seminars
• Television:
  – Debates, etc.

4

Begin With the End In Mind (Stephen Covey)

• Story (publicity and money)
• Research Notes (document what I did and learned)
• Conditioned Data Sets (added value)
• Spotfire Dashboard (cool visualizations)
• Lecture to Students at George Mason University (help them learn what a data scientist/data journalist does)

5

My 5-Step Method

• What I like to do to illustrate (data science) and explain (data journalism) is the following (like a recipe):
  – Put the Best Content into a Knowledge Base (e.g. MindTouch)
    • The 2012 Annual Statistical Abstract, CNSTAT, etc.
  – Put the Knowledge Base into a Spreadsheet (Excel)
    • Linked Data to Subparts of the Knowledge Base
  – Put the Spreadsheet into a Dashboard (Spotfire)
    • Data Integration and Interoperability Interface
  – Put the Dashboard into a Semantic Model (Excel)
    • Data Dictionaries and Models
  – Put the Semantic Model into Dynamic Case Management (Be Informed)
    • Structured Process for Updating Data in the Dashboard
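As an illustration of the second step (a spreadsheet whose rows link back to subparts of the knowledge base), here is a minimal Python sketch. It is not the workflow used for the slides, which relied on Excel and MindTouch. The table numbers, titles, and the section link are taken from the slides; the column names and the `build_inventory` helper are assumptions made for this example, and every row points at the chapter-level anchor because per-table anchors are not given.

```python
import csv
import io

# Section link as it appears on the slides. Per-table anchors are not given
# there, so every row points at the chapter-level anchor (an assumption).
SECTION_URL = "http://semanticommunity.info/FedStats.net#Section_7._Elections"

# Table numbers and titles as listed on the dashboard slides.
TABLES = [
    ("397", "Participation in Elections for President and U.S. Representatives"),
    ("402", "Vote Cast for President, by Major Political Party"),
    ("405", "Electoral Vote Cast for President by Major Political Party--States"),
    ("408", "Apportionment of Membership in House of Representatives, by State"),
    ("410", "Vote Cast by Congressional Districts: 2010"),
]


def build_inventory(tables, url):
    """Return CSV text with one row per table, each linked to the knowledge base."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical column names chosen for this sketch.
    writer.writerow(["table", "title", "knowledge_base_link"])
    for number, title in tables:
        writer.writerow([number, title, url])
    return buffer.getvalue()


inventory_csv = build_inventory(TABLES, SECTION_URL)
print(inventory_csv)
```

The resulting CSV text opens directly in Excel, and a dashboard tool can use the link column to send the reader back to the source metadata.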

6

Knowledge Base

http://semanticommunity.info/CNSTAT

7

2012 Annual Statistical Abstract: Chapter 7. Elections (Visualizations)

http://semanticommunity.info/FedStats.net#Section_7_ELECTIONS

8

2012 Annual Statistical Abstract: Chapter 7. Elections (Metadata)

http://semanticommunity.info/FedStats.net#Section_7._Elections

9

FedStats.net: Commemorating over 135 years of making statistics available to citizens everywhere

http://semanticommunity.info/FedStats.net#Story

10

FedStats.gov Remains Rich Source Of Government Data For Citizens

http://gov.aol.com/2012/07/26/fedstats-gov-remains-rich-source-of-government-data-for-citizens/

11

2012 Annual Statistical Abstract

http://www.census.gov/compendia/statab/

12

Data From CD-ROM to My Server

http://semanticommunity.net/StatAbs2012/

13

Spreadsheet

http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls

14

Welcome to the Campaign 2012 Interactive Dashboard

http://campaign2012.c-span.org/electoral-college-map

My Note: Not like the next slide!

15

CNN Electoral Map

http://www.cnn.com/ELECTION/2012/ecalculator

16

CNN Electoral Map in Excel

http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls
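
An electoral map in a spreadsheet reduces to a lookup from state to electoral votes and a sum by candidate. The Python sketch below shows that calculation under stated assumptions: it covers only six states, the candidate labels and state assignments are hypothetical placeholders rather than results, and the `tally` helper is a name chosen for this example. It does not reproduce the contents of the Excel file.

```python
from collections import Counter

TOTAL_ELECTORAL_VOTES = 538
VOTES_TO_WIN = TOTAL_ELECTORAL_VOTES // 2 + 1  # a majority: 270

# Electoral votes for a few states under the apportionment used in 2012.
ELECTORAL_VOTES = {"CA": 55, "TX": 38, "NY": 29, "FL": 29, "PA": 20, "OH": 18}

# Hypothetical scenario: placeholder candidates, not actual results.
scenario = {
    "CA": "Candidate A",
    "TX": "Candidate B",
    "NY": "Candidate A",
    "FL": "Candidate B",
    "PA": "Candidate A",
    "OH": "Candidate B",
}


def tally(assignments, electoral_votes):
    """Sum electoral votes per candidate for the states assigned so far."""
    totals = Counter()
    for state, candidate in assignments.items():
        totals[candidate] += electoral_votes[state]
    return totals


totals = tally(scenario, ELECTORAL_VOTES)
for candidate, votes in sorted(totals.items()):
    remaining = max(0, VOTES_TO_WIN - votes)
    print(f"{candidate}: {votes} electoral votes, {remaining} short of {VOTES_TO_WIN}")
```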

17

CNN Electoral Map in Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

18

Data Set Inventory and Results

http://semanticommunity.info/CNSTAT#Story

19

2012 Annual Statistical Abstract Election Tables Metadata

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

20

Table 397. Participation in Elections for President and U.S. Representatives and Table 402. Vote Cast for President, by Major Political Party

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

21

Table 405. Electoral Vote Cast for President by Major Political Party--States

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

22

Table 408. Apportionment of Membership in House of Representatives, by State

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

23

Table 410. Vote Cast by Congressional Districts: 2010

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

24

Cover Page

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

25

Conclusions and Suggestions

• I had the pleasure of attending three very interesting and related professional statistical meetings recently that showed that statisticians really care about current issues.
• This made me appreciate that elections are a big data problem that is approached in three basic ways: historical elections data, collection and modeling of polling survey data before the election, and use of social media.
• So I inventoried the historical and polling survey data (that I could get) to aid in selection and visualization in a dashboard, and found I needed both Congressional and State boundary files, as shown in a table.
• So imagine an election season in which we had fewer or no polls to influence voters, so they could focus on the candidates and the issues. Then, just after the polls closed (by gentleman's agreement with Congress), we would get an amazing example of big data processing that we could all participate in, by seeing the precinct voting results posted to Twitter and processed by the many apps that developers had built to bring us interesting and useful results. I am eager to see that happen in 2014 and 2016!
• I will be updating these results with the final 2012 elections data and providing another story.
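
The need for both Congressional and State boundary files comes down to a join: each data row must carry a key that also exists in the boundary file, or it cannot be drawn on the map. A minimal Python sketch of that check follows. The key format, the sample rows, and the `district_key` and `split_by_match` helpers are all hypothetical, and Spotfire's own matching of data to map shapes is not shown.

```python
def district_key(state, district):
    """Build a join key such as 'VA-08' from a state code and a district number.

    The 'ST-NN' format is an assumption made for this sketch.
    """
    return f"{state.upper()}-{int(district):02d}"


# Hypothetical keys present in a congressional-district boundary file.
boundary_keys = {"VA-08", "VA-10", "MD-08"}

# Hypothetical data rows, with inconsistent casing and padding.
data_rows = [
    {"state": "va", "district": "8"},
    {"state": "VA", "district": "10"},
    {"state": "MD", "district": "5"},
]


def split_by_match(rows, keys):
    """Separate rows whose key exists in the boundary file from those that do not."""
    matched, unmatched = [], []
    for row in rows:
        key = district_key(row["state"], row["district"])
        (matched if key in keys else unmatched).append(key)
    return matched, unmatched


matched, unmatched = split_by_match(data_rows, boundary_keys)
print("mappable:", matched)
print("missing from boundary file:", unmatched)
```

Listing the unmatched keys first is a cheap way to find out which boundary files are still needed before building the dashboard.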

26

Extra Slides

• Boundary Files:
  – US States Repositioned
  – US Counties Repositioned
  – US Congressional Districts 1
  – US Congressional Districts 2
• Sources:
  – Spotfire
    • https://silverspotfire.tibco.com/us/library
  – US Census
    • http://www.census.gov/cgi-bin/geo/shapefiles2010/main

27

US States Repositioned

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

28

US Counties Repositioned

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

29

US Congressional Districts 1

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

30

US Congressional Districts 2

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
