Open Government Data & MongoDB

Luigi Montanez [email protected]

Upload: luigi-montanez

Post on 21-Jan-2015

DESCRIPTION

Given at MongoDC on June 27, 2011.

TRANSCRIPT

Page 1: Open Government Data and MongoDB

Open Government Data & MongoDB

Luigi Montanez [email protected]

Page 2: Open Government Data and MongoDB
Page 3: Open Government Data and MongoDB

Question? @LuigiMontanez

Page 4: Open Government Data and MongoDB

Open Data + Open Source = Open Government

Page 5: Open Government Data and MongoDB

MongoDB enables open data

Page 6: Open Government Data and MongoDB

Opening Up Data

✴ Gather data from disparate sources
    ✴ Data dumps (SQL, fixed-width columns)
    ✴ Web scraping
    ✴ Text/PDF parsing
✴ Serving RESTful JSON APIs
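
As a rough illustration of the "data dump to JSON" step, here is a minimal sketch that slices one fixed-width record into an object ready to be served as JSON. The column layout (`LAYOUT`) and the sample line are assumptions made up for this example, not the format of any real government dump.

```javascript
// Hypothetical column layout: names and offsets are assumptions for illustration.
const LAYOUT = [
  { name: "state", start: 0, end: 2 },
  { name: "district", start: 2, end: 4 },
  { name: "last_name", start: 4, end: 24 },
  { name: "party", start: 24, end: 25 },
];

// Slice one fixed-width line into an object, trimming the column padding.
function parseFixedWidth(line, layout) {
  const record = {};
  for (const { name, start, end } of layout) {
    record[name] = line.slice(start, end).trim();
  }
  return record;
}

// Build a sample line that matches the hypothetical layout above.
const line = "CA" + "09" + "Lee".padEnd(20) + "D";
const record = parseFixedWidth(line, LAYOUT);

// The resulting object serializes directly to JSON for an API response.
console.log(JSON.stringify(record));
```

Values stay as strings here; a real importer would likely convert types per column.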

Page 7: Open Government Data and MongoDB

JSON

✴ Tree structure, not tabular
✴ Still relational
✴ JSON for data, XML for documents
✴ Closely resembles native data structures
✴ No manual parsing needed
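
To make the "closely resembles native data structures" point concrete, here is a small sketch: a JSON string becomes a nested object with one built-in call, and the tree can be walked with ordinary property access. The sample string is an assumption assembled from field names that appear later in the slides.

```javascript
// Sample JSON text; the field names mirror the entity example later in the deck.
const body =
  '{"name": "Barack Obama", "type": "politician", ' +
  '"top_industries": ["Lawyers/Lobbyists", "Misc. Business"]}';

// One built-in call replaces any hand-written parser.
const entity = JSON.parse(body);

// The tree structure maps onto native objects and arrays.
console.log(entity.name);
console.log(entity.top_industries.length);

// Serializing back to JSON is just as direct.
const roundTrip = JSON.stringify(entity);
console.log(roundTrip);
```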

Page 8: Open Government Data and MongoDB

Three Projects

✴ Poligraft
✴ Real Time Congress API
✴ Open State Project

Page 9: Open Government Data and MongoDB

Three Projects

✴ Poligraft
✴ Real Time Congress API
✴ Open State Project

Page 10: Open Government Data and MongoDB

App design drives schema design

Page 11: Open Government Data and MongoDB
Page 12: Open Government Data and MongoDB
Page 13: Open Government Data and MongoDB
Page 14: Open Government Data and MongoDB
Page 15: Open Government Data and MongoDB

{
  "title": "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com"
}

Page 16: Open Government Data and MongoDB
Page 17: Open Government Data and MongoDB

{
  "title": "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com",
  "slug": "EOsc",
  "source_url": "http://www.politico.com/news/stories/0810/40534.html",
  "content": "................."
}

Page 18: Open Government Data and MongoDB
Page 19: Open Government Data and MongoDB
Page 20: Open Government Data and MongoDB

{
  "title": "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com",
  "slug": "EOsc",
  "source_url": "http://www.politico.com/news/stories/0810/40534.html",
  "content": ".................",
  "entities": [...]
}

Page 21: Open Government Data and MongoDB

{
  "title": "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com",
  "slug": "EOsc",
  "source_url": "http://www.politico.com/news/stories/0810/40534.html",
  "content": ".................",
  "entities": [
    {
      "name": "Barack Obama",
      "type": "politician"
    },
    ...
  ]
}

Page 22: Open Government Data and MongoDB
Page 23: Open Government Data and MongoDB

{
  "title": "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com",
  "slug": "EOsc",
  "source_url": "http://www.politico.com/news/stories/0810/40534.html",
  "content": ".................",
  "entities": [
    {
      "name": "Barack Obama",
      "type": "politician",
      "breakdown": {"indiv": "33", "pac": "67"},
      "top_industries": ["Lawyers/Lobbyists", "Finance/Insurance/Real Estate", "Misc. Business"]
    },
    ...
  ]
}
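
The slides above show one article document growing as the app learns more about it. A minimal sketch of that idea follows: entities whose names appear in the article text are embedded directly in the article document. The `annotate` function, the `KNOWN_ENTITIES` table, the second entity, and the placeholder article text are all assumptions for illustration; the slides do not show how Poligraft actually matches entities.

```javascript
// Hypothetical lookup table. The first entry copies values from the slide above;
// the second entry is invented so the filter has something to exclude.
const KNOWN_ENTITIES = [
  {
    name: "Barack Obama",
    type: "politician",
    breakdown: { indiv: "33", pac: "67" },
    top_industries: ["Lawyers/Lobbyists", "Finance/Insurance/Real Estate", "Misc. Business"],
  },
  { name: "Example Industries PAC", type: "organization" },
];

// Return a copy of the article with matching entities embedded in it.
// Plain substring matching is a stand-in for real entity extraction.
function annotate(article, knownEntities) {
  const entities = knownEntities.filter((e) => article.content.includes(e.name));
  return { ...article, entities };
}

const article = {
  title: "President Obama's climate 'Plan B' in hot water - Darren Samuelsohn - POLITICO.com",
  slug: "EOsc",
  source_url: "http://www.politico.com/news/stories/0810/40534.html",
  content: "Placeholder text that mentions Barack Obama.", // stand-in, not the real article
};

const doc = annotate(article, KNOWN_ENTITIES);
console.log(doc.entities.map((e) => e.name));
```

Because the entities live inside the article document, rendering a page needs one document rather than a join across several tables.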

Page 24: Open Government Data and MongoDB
Page 25: Open Government Data and MongoDB

Natural Schemas

Page 26: Open Government Data and MongoDB

Three Projects

✴ Poligraft
✴ Real Time Congress API
✴ Open State Project

Page 27: Open Government Data and MongoDB

Real-Time Congress API

Credit: vgm8383 on Flickr

Page 28: Open Government Data and MongoDB

Android App: “Congress”

Page 29: Open Government Data and MongoDB

Politiwidgets

Page 30: Open Government Data and MongoDB

Requirements

✴ Aggregate lots of data
    Biographical, Bills, Votes, Earmarks, Video Clips, Floor Updates, Legislative Documents, Committee Schedules, Contributions, Interest Group Ratings
✴ Lightweight responses

Page 31: Open Government Data and MongoDB

{legislator: {
    in_office: true,
    title: "Rep",
    nickname: "",
    district: "9",
    bioguide_id: "L000551",
    govtrack_id: "400237",
    phone: "202-225-2661",
    website: "http://lee.house.gov/index.html",
    twitter_id: "",
    last_name: "Lee",
    name_suffix: "",
    last_updated: "2010/04/13 00:00:14 +0000",
    party: "D",
    chamber: "house",
    state: "CA",
    youtube_url: "http://www.youtube.com/RepLee",
    first_name: "Barbara",
    gender: "F",
    congress_office: "2444 Rayburn House Office Building",
    earmarks: {
        average_number: 20,
        total_amount: 10000000,
        average_amount: 22994535,
        total_number: 28,
        last_updated: "2010-03-18",
        fiscal_year: 2010
    }
    ...
}}

Page 32: Open Government Data and MongoDB

// limit selection to a subset of fields
db.people.find( { 'first_name' : 'john' }, { 'last_name' : 1, 'address' : 1 } );

// use dot-notation to dig into an object
db.people.find( { 'state': 'CA' }, { 'address.zip_code': 1 } );

Page 33: Open Government Data and MongoDB

{legislator: {
    last_name: "Lee",
    first_name: "Barbara",
    state: "CA",
    earmarks: {
        average_number: 20,
        total_amount: 10000000,
        average_amount: 22994535,
        total_number: 28,
        last_updated: "2010-03-18",
        fiscal_year: 2010
    }
}}

?sections=last_name,first_name,state,earmarks

Page 34: Open Government Data and MongoDB

{legislator: {
    last_name: "Lee",
    first_name: "Barbara",
    state: "CA",
    earmarks: {
        total_amount: 10000000,
        total_number: 28
    }
}}

?sections=last_name,first_name,state,earmarks.total_amount,earmarks.total_number
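
One plausible way to connect the `sections` query parameter to the field-selection queries shown earlier is to translate the comma-separated list into a projection object. This is a sketch under that assumption; the function name `sectionsToProjection` is made up here, and the slides do not show how the Real Time Congress API implements it.

```javascript
// Turn "last_name,earmarks.total_amount" into
// { last_name: 1, "earmarks.total_amount": 1 }.
function sectionsToProjection(sections) {
  const projection = {};
  if (!sections) return projection; // empty projection: no field restriction
  for (const field of sections.split(",")) {
    const name = field.trim();
    if (name) projection[name] = 1; // skip blanks such as "a,,b"
  }
  return projection;
}

const projection = sectionsToProjection(
  "last_name,first_name,state,earmarks.total_amount,earmarks.total_number"
);
console.log(projection);

// In the mongo shell, the object could then be passed as the second argument
// to find(), as in the earlier slide (collection name is hypothetical):
//   db.legislators.find({ state: "CA" }, projection);
```

A production API would also want to validate the requested field names against a whitelist before using them in a query.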

Page 35: Open Government Data and MongoDB

Partial responses make payloads smaller
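
To see the size effect without a database, the sketch below applies a list of dot-notation paths to an in-memory copy of the legislator record and compares the serialized lengths. The `pick` helper is hypothetical and only approximates what a MongoDB projection does on the server.

```javascript
// Copy only the requested dot-notation paths from doc into a new object.
function pick(doc, paths) {
  const out = {};
  for (const path of paths) {
    const keys = path.split(".");
    let src = doc;
    let dst = out;
    for (let i = 0; i < keys.length; i++) {
      const key = keys[i];
      if (src === null || typeof src !== "object" || !(key in src)) break;
      if (i === keys.length - 1) {
        dst[key] = src[key]; // leaf: copy the value
      } else {
        if (typeof dst[key] !== "object" || dst[key] === null) dst[key] = {};
        dst = dst[key]; // descend in the output
        src = src[key]; // descend in the source
      }
    }
  }
  return out;
}

// Subset of the legislator record from the earlier slide.
const full = {
  last_name: "Lee",
  first_name: "Barbara",
  state: "CA",
  phone: "202-225-2661",
  website: "http://lee.house.gov/index.html",
  congress_office: "2444 Rayburn House Office Building",
  earmarks: {
    average_number: 20,
    total_amount: 10000000,
    average_amount: 22994535,
    total_number: 28,
    last_updated: "2010-03-18",
    fiscal_year: 2010,
  },
};

const partial = pick(full, [
  "last_name", "first_name", "state",
  "earmarks.total_amount", "earmarks.total_number",
]);

const fullSize = JSON.stringify(full).length;
const partialSize = JSON.stringify(partial).length;
console.log(`full: ${fullSize} chars, partial: ${partialSize} chars`);
```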

Page 36: Open Government Data and MongoDB

Three Projects

✴ Poligraft
✴ Real Time Congress API
✴ Open State Project

Page 37: Open Government Data and MongoDB
Page 38: Open Government Data and MongoDB

50 States = 50 Formats

Page 39: Open Government Data and MongoDB

Schemalessness allows for granular control

Page 40: Open Government Data and MongoDB

Custom Fields

✴ Traditional RDBMS
    ✴ Update the schema for new fields, run a migration, feel icky
    ✴ Create a custom_fields table
✴ MongoDB
    ✴ Just store it
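
A rough sketch of what "just store it" can look like: records from different states sit side by side, each carrying its own extra fields, with no shared column list to migrate. A plain array stands in for a MongoDB collection here, and the bill records and their state-specific fields (`urgency_clause`, `companion_bill`) are invented for illustration.

```javascript
// A plain array standing in for a collection; all records are made up.
const bills = [
  { state: "CA", bill_id: "AB 1", title: "Example bill", urgency_clause: true },
  { state: "TX", bill_id: "HB 1", title: "Another example bill", companion_bill: "SB 1" },
];

// Fields shared by every record can be read uniformly.
const titles = bills.map((b) => b.title);

// State-specific fields are simply present on some records and absent on others.
const withCompanion = bills.filter((b) => "companion_bill" in b);

console.log(titles);
console.log(withCompanion.map((b) => b.bill_id));

// In the mongo shell, the equivalent filter would use the $exists operator
// (collection name is hypothetical):
//   db.bills.find({ companion_bill: { $exists: true } });
```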

Page 41: Open Government Data and MongoDB

Speaking JSON natively

Page 42: Open Government Data and MongoDB

Source → Scraped JSON → Python Transform → PostgreSQL

Page 43: Open Government Data and MongoDB

Source → Scraped JSON → MongoDB

Page 44: Open Government Data and MongoDB

Three Projects

✴ Poligraft
✴ Real Time Congress API
✴ Open State Project

Page 45: Open Government Data and MongoDB

Developer Happiness

Page 46: Open Government Data and MongoDB

Thanks!

sunlightlabs.com
@LuigiMontanez