Data: A Cautionary Tale, by Daniel Katz
A Cautionary Tale
The Big Picture: Collect, Clean, Model, Store, Present
{
  "classes": [
    {
      "name": "Fundamental Process of Design",
      "professor": "Joo Youn Paek",
      "year": " 2010 ",
      "semester": "fall",
      "students": [
        {
          "student": {
            "name": "Joe Student",
            "email": "[email protected]",
            "twitter_name": "@itp4life",
            "blog_url": "http://itp4life.blogspot.co"
          }
        }
      ]
    }
  ]
}
<classes>
  <class>
    <name>Fundamental Process of Design</name>
    <professor>Joo Youn Paek</professor>
    <year>2010</year>
    <semester>Fall</semester>
    <students>
      <student>
        <name>Joe Student</name>
        <email>[email protected]</email>
        <twitter_name>@itp4life</twitter_name>
        <blog_url>http://itp4life.blogspot.com</blog_url>
      </student>
    </students>
  </class>
</classes>
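The two samples above encode the same class record, once as JSON and once as XML. As a minimal sketch, assuming a trimmed-down copy of the XML record (the helper name `class_to_dict` is hypothetical and not part of the talk), Python's standard library can read the XML form into the same nested shape the JSON form uses:

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the XML sample; the email and blog fields are left out.
XML_RECORD = """
<classes><class>
<name>Fundamental Process of Design</name><professor>Joo Youn Paek</professor>
<year>2010</year><semester>Fall</semester>
<students><student><name>Joe Student</name>
<twitter_name>@itp4life</twitter_name></student></students>
</class></classes>
"""

def class_to_dict(class_el):
    """Convert one <class> element into a dict shaped like the JSON sample."""
    # Each <student> becomes {"student": {field: text, ...}}.
    students = [
        {"student": {child.tag: child.text for child in student_el}}
        for student_el in class_el.find("students")
    ]
    # Every other direct child (name, professor, ...) becomes a plain field.
    record = {
        child.tag: child.text
        for child in class_el
        if child.tag != "students"
    }
    record["students"] = students
    return record

root = ET.fromstring(XML_RECORD)
data = {"classes": [class_to_dict(c) for c in root.findall("class")]}
print(json.dumps(data, indent=2))
```

Note that every value comes out as a string here, so a field like the year would still need converting if it is meant to be a number.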
The Open Data Movement is in Full Swing: Governments, Institutions, Scientists, Enthusiasts
http://vimeo.com/2598878
Commercial tools and open source are starting to converge
There will always be assumptions
Bring it down
Freebase (entity graph), Infochimps, Twitter, Facebook
Data.gov, MTA
Arduino, smartphone, other sensors
Don’t be intimidated by data from disparate sources
Clean up messy data: inconsistent data points, identify patterns, combine data from disparate sources
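To make "inconsistent data points" concrete, here is a small sketch in plain Python (not any particular cleaning tool). The two rows and the `clean_record` helper are made up for illustration: the same class is recorded twice, with stray whitespace and different capitalization, and the cleaning step brings both rows to one form.

```python
def clean_record(record):
    """Return a copy with strings trimmed, year as int, semester capitalized."""
    cleaned = {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }
    cleaned["year"] = int(cleaned["year"])                   # " 2010 " -> 2010
    cleaned["semester"] = cleaned["semester"].capitalize()   # "fall" -> "Fall"
    return cleaned

# Hypothetical rows describing the same class in two inconsistent ways.
raw_rows = [
    {"name": "Fundamental Process of Design", "year": " 2010 ", "semester": "fall"},
    {"name": "Fundamental Process of Design", "year": "2010", "semester": "Fall"},
]

cleaned_rows = [clean_record(row) for row in raw_rows]
print(cleaned_rows[0] == cleaned_rows[1])
```

Once both rows agree, they can be matched or merged with rows from another source.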
Collection of Twitter Responses from API
value.parseJson().user.screen_name
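The expression above parses a cell holding a tweet's JSON and pulls out the author's screen name. A rough Python equivalent, assuming a heavily trimmed, made-up tweet payload (real API responses carry many more fields) and a hypothetical helper named `screen_name`:

```python
import json

# Hypothetical, trimmed tweet payload for illustration only.
raw_value = '{"text": "hello from class", "user": {"screen_name": "itp4life"}}'

def screen_name(value):
    """Parse a tweet's JSON text and return user.screen_name, or None if absent."""
    tweet = json.loads(value)
    return tweet.get("user", {}).get("screen_name")

print(screen_name(raw_value))
```

Using `.get` keeps the lookup from failing on responses that lack a `user` object.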
Depending on the type of data you are collecting, there are appropriate places to store it
Non-programmers: Google Fusion Tables
For programmers: geo database and programming tools
PostGIS (PostgreSQL), GeoTools (Java)
Non-programmers: Google Docs (read into Processing), Microsoft Excel (internal charting tool), text-based formatting (visualize with Google Chart API)
For programmers: any relational database
MySQL, PostgreSQL
Graph database
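As a rough illustration of the relational option, the sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL or PostgreSQL. The table and column names are assumptions chosen to mirror the class and student sample, not a schema from the talk.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classes ("
    "id INTEGER PRIMARY KEY, name TEXT, professor TEXT, year INTEGER, semester TEXT)"
)
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, class_id INTEGER REFERENCES classes(id), "
    "name TEXT, twitter_name TEXT)"
)

# Insert one class, then one student that points back to it.
cur = conn.execute(
    "INSERT INTO classes (name, professor, year, semester) VALUES (?, ?, ?, ?)",
    ("Fundamental Process of Design", "Joo Youn Paek", 2010, "Fall"),
)
class_id = cur.lastrowid
conn.execute(
    "INSERT INTO students (class_id, name, twitter_name) VALUES (?, ?, ?)",
    (class_id, "Joe Student", "@itp4life"),
)

# Join the two tables to recover the nested view of the data.
rows = conn.execute(
    "SELECT s.name, c.name FROM students s JOIN classes c ON c.id = s.class_id"
).fetchall()
print(rows)
conn.close()
```

The nested students list from the JSON and XML forms becomes a second table linked by `class_id`, which is the usual trade-off when moving hierarchical data into a relational store.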
http://blog.blprnt.com/blog/blprnt/your-random-numbers-getting-started-with-processing-and-data-visualization
http://code.google.com/p/gdocjdbc/
http://www.infochimps.com/datasets/tweets-during-state-of-the-union-address
http://code.google.com/p/google-refine/
http://dev.twitter.com/doc/get/geo/search
http://flowingdata.com/2009/07/14/how-does-the-average-consumer-spend-his-money/
http://www.bls.gov/cex/
http://www.google.com/fusiontables/Home