opening large data sets
DESCRIPTION
In this presentation, Eric Gundersen shows real-life examples of what was achieved by opening up public data sets and making the information widely accessible, and talks about how to do it. This presentation was given as part of the "Building Governmental Transparency" event hosted by the Center for American Progress on Friday, March 19, 2010. More details and video at http://www.americanprogress.org/events/2010/03/sunshine.html.
TRANSCRIPT
![Page 1: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/1.jpg)
via flickr: by www.pictobank.com
opening large data sets
Thursday, March 11, 2010
![Page 2: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/2.jpg)
twitter.com/ericg
![Page 3: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/3.jpg)
Data Sources:
• Original Polling Center Master list of 6,969 polling centers from the Independent Election Commission (IEC).
• IEC's preliminary election results from September 16th, a 2,500-page PDF.
• The Electoral Complaints Commission's (ECC) complaint data (which aggregates only to the provincial level).
![Page 4: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/4.jpg)
we needed a data browser
![Page 5: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/5.jpg)
www.AfghanistanElectionData.com
![Page 6: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/6.jpg)
![Page 7: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/7.jpg)
The system geocodes votes down to the district level. The political boundaries for this map covered 400 districts. Density point visualization shows results based on the "Highlighted Station" criteria, in this case % of stations affected.
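To make the "% of stations affected" criterion concrete, here is a minimal sketch that groups polling stations by district and computes the share flagged as affected. The station records, district names, and the affected flag are made up for illustration; they are not taken from the IEC or ECC data, and this is not necessarily how the site computes the figure.

```python
from collections import defaultdict

# Hypothetical station records: (district, affected flag). Not real IEC/ECC data.
stations = [
    ("District A", True),
    ("District A", False),
    ("District A", True),
    ("District A", False),
    ("District B", True),
    ("District B", False),
    ("District B", False),
    ("District B", False),
]


def percent_affected(records):
    """Return {district: percent of its stations flagged as affected}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for district, affected in records:
        totals[district] += 1
        if affected:
            flagged[district] += 1
    return {d: 100.0 * flagged[d] / totals[d] for d in totals}


shares = percent_affected(stations)
for district, share in sorted(shares.items()):
    print(f"{district}: {share:.1f}% of stations affected")
```

A per-district percentage like this is the kind of value a density point layer could scale its dots by.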
![Page 8: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/8.jpg)
Complex analysis: This Afghan ethnic distribution base layer is overlaid with districts won by Karzai (red dots) and Abdullah (green dots). Dot size indicates the number of votes. Ethnic data is digitized from the Soviet Atlas Narodov Mira.
![Page 9: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/9.jpg)
Interacting with the data: you can quickly drill down to any region, as the map zooms.
![Page 10: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/10.jpg)
• Percent Population Urban by District; Population by District (2003-2004), AIMS CSO Population Statistics.
• Settled Population by Province (2006-2007) Afghanistan Human Development Report 2007, Center for Policy and Human Development, Kabul University
• Estimated votes, via IEC’s Master Polling Center list
![Page 11: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/11.jpg)
Population: 22,700
Estimated voters: 53,039
Difference: 30,339
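The comparison on this slide is simple arithmetic. As a sketch, the check below reproduces the difference from the two figures shown and flags the case where estimated voters exceed the population. The function name and the flagging rule are assumptions made for illustration, not the site's actual logic.

```python
# Figures as shown on the slide.
population = 22_700
estimated_voters = 53_039


def voter_surplus(population, estimated_voters):
    """Estimated voters minus population; positive means more voters than residents."""
    return estimated_voters - population


difference = voter_surplus(population, estimated_voters)
ratio = estimated_voters / population

# Assumed rule for illustration: flag any area with more estimated voters than people.
suspicious = difference > 0

print(f"Difference: {difference:,}")
print(f"Estimated voters per resident: {ratio:.2f}")
print(f"Flag for review: {suspicious}")
```

Running the same check over every district would give a list of places where the estimate and the population statistics disagree.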
![Page 12: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/12.jpg)
Total votes: 15,023
![Page 13: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/13.jpg)
Drill down in context: “Highlighted Station” selection continues to work within both provinces + districts
![Page 14: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/14.jpg)
Per polling center data: see the affected stations and votes within a polling center
![Page 15: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/15.jpg)
![Page 16: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/16.jpg)
![Page 17: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/17.jpg)
![Page 18: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/18.jpg)
photo credit boston.com
security matters
![Page 19: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/19.jpg)
![Page 20: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/20.jpg)
![Page 21: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/21.jpg)
via flickr: by www.pictobank.com
geography matters
![Page 22: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/22.jpg)
![Page 23: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/23.jpg)
![Page 24: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/24.jpg)
Road data: OSM provides better street data than AIMS
![Page 25: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/25.jpg)
![Page 26: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/26.jpg)
via wikimedia.org
![Page 27: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/27.jpg)
![Page 28: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/28.jpg)
Snow line: 1,800 meters according to FAO
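One way to picture the snow line on an elevation layer is as a threshold mask. The sketch below applies the 1,800 meter figure from the slide to a tiny, made-up elevation grid; the grid values and function name are hypothetical, and real SRTM tiles would need a raster library to read.

```python
SNOW_LINE_M = 1800  # snow line from the slide, in meters

# Made-up elevation samples in meters, standing in for an SRTM tile.
elevation_m = [
    [950, 1200, 1750],
    [1600, 1800, 2400],
    [2100, 3100, 4200],
]


def above_snow_line(grid, threshold=SNOW_LINE_M):
    """Return a grid of booleans: True where elevation is strictly above the threshold."""
    return [[cell > threshold for cell in row] for row in grid]


mask = above_snow_line(elevation_m)
cells_above = sum(cell for row in mask for cell in row)
total_cells = sum(len(row) for row in mask)

print(f"{cells_above} of {total_cells} cells lie above the snow line")
```

Overlaying such a mask on roads and polling centers would show which locations sit above the line.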
![Page 29: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/29.jpg)
![Page 30: Opening Large Data Sets](https://reader034.vdocument.in/reader034/viewer/2022042613/554d0b3eb4c9052c5a8b4cda/html5/thumbnails/30.jpg)
Map Data Sources:
• Elevation information is from the SRTM (Shuttle Radar Topography Mission)
• Road information from OpenStreetMap
• Provincial and district data are from AIMS (Afghanistan Information Management Services)