Big Data and Me
DESCRIPTION
Talk given to Touro College's Leadership in Digital Technology innovation course, April 22nd 2013
TRANSCRIPT
![Page 1: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/1.jpg)
Big Data and Me: experiences from the front line
Sara-Jayne Farmer
Change Assembly
April 22nd 2013
![Page 2: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/2.jpg)
ME
![Page 3: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/3.jpg)
Me
• Data Scientist
• Using data to:
  – connect communities
  – improve access to information
  – so people can make better decisions
  – on both small and large scales
• It’s all about people:
  – Local people: know their needs; need more information
  – Local technologists: have skills; need connections
  – Large organisations: have resources; need guidance
![Page 4: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/4.jpg)
Some of those People
(smart, talented, dedicated hackers in Haiti, January 2013)
![Page 5: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/5.jpg)
My Personal Three Vs
• Variety
  – Data all over the place
  – CSV, JSON, XML, Excel, PDF, text, web pages, RSS, scanned pages, images, videos, audio files, maps, proprietary formats, etc.
• Velocity
  – Streams updating too fast for a mapping team (100-200 people) to handle
  – Pages updating too frequently to check by hand
• Volume
  – Can’t open the data in a spreadsheet
  – Can’t fit the data on my laptop
  – Maxes out my credit card (thank you Amazon!)
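One way to cope with pages that update too often to check by hand is to fingerprint each page and only look at it when the fingerprint changes. The sketch below is illustrative only: the `ChangeTracker` class and its method names are hypothetical, and fetching the page is left to whatever client is already in use.

```python
import hashlib


def fingerprint(page_bytes):
    """Return a SHA-256 hex digest of the raw page content."""
    return hashlib.sha256(page_bytes).hexdigest()


class ChangeTracker:
    """Remembers the last fingerprint seen for each URL (hypothetical helper)."""

    def __init__(self):
        self._seen = {}

    def update(self, url, page_bytes):
        """Record the page and return True if it is new or has changed."""
        digest = fingerprint(page_bytes)
        changed = self._seen.get(url) != digest
        self._seen[url] = digest
        return changed
```

A page counts as changed the first time it is seen; after that, only a difference in content triggers a human check. Pages with rotating ads or timestamps would need cleaning before hashing.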
![Page 6: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/6.jpg)
VARIETY
![Page 7: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/7.jpg)
“more people have mobile phones than toilets” – UN, March 2013
![Page 8: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/8.jpg)
But… but… there are always data issues…
• Datasets were difficult to find
• No data available after 2010
• Hard to track provenance – e.g. what decisions did the people creating these datasets make? What assumptions?
• Data was rounded up
• Country names didn’t match between sets
• Multiple character sets (e.g. Å, A, Ԇ)
• Messy formatting (merges, ‘explanations’, etc.)
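The character-set issue can be partly tamed with Unicode normalisation. This is a minimal sketch using Python's standard `unicodedata` module; the function names `canonical` and `strip_accents` are made up for illustration, and stripping accents is a lossy step that only suits fuzzy matching, not display.

```python
import unicodedata


def canonical(text):
    """Compose characters so that 'A' + combining ring becomes a single 'Å'."""
    return unicodedata.normalize("NFC", text)


def strip_accents(text):
    """Decompose, then drop combining marks (lossy; for matching only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Applying `canonical` to both sides before comparing strings removes one common source of mismatches between datasets. It does not fix text that was decoded with the wrong encoding in the first place.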
![Page 9: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/9.jpg)
e.g. Country Names
DR Congo in Data.UN.Org:
• “Congo, Democratic Republic of the”, “Congo Democratic”, “Democratic Republic of the Congo”, “Congo (Democratic Republic of the)”, “Congo, Dem. Rep.”, “Congo Dem. Rep.”, “Congo, Democratic Republic of”, “Dem. Rep. of Congo”, “Dem. Rep. of the Congo”
DR Congo in common standards:
• “Democratic Republic of the Congo” (UN Stats), “Congo, The Democratic Republic of the” (ISO3166), “Congo, Democratic Republic of the” (FIPS10, Stanag), “180” (UN Stats), “COD” (ISO3166, Stanag), “CG” (FIPS10)
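A common workaround for mismatched country names is to map every known variant onto one standard code before joining datasets. The sketch below uses only the DR Congo variants quoted above and the ISO 3166 code "COD"; the names `normalise_key`, `LOOKUP` and `to_iso3` are hypothetical, and a real lookup table would need to cover every country and be maintained by hand.

```python
import re

# Variants of "DR Congo" quoted on this slide, all mapped to ISO 3166 "COD".
DR_CONGO_VARIANTS = [
    "Congo, Democratic Republic of the",
    "Congo Democratic",
    "Democratic Republic of the Congo",
    "Congo (Democratic Republic of the)",
    "Congo, Dem. Rep.",
    "Congo Dem. Rep.",
    "Congo, Democratic Republic of",
    "Dem. Rep. of Congo",
    "Dem. Rep. of the Congo",
    "Congo, The Democratic Republic of the",
]


def normalise_key(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


LOOKUP = {normalise_key(variant): "COD" for variant in DR_CONGO_VARIANTS}


def to_iso3(name):
    """Return the ISO 3166 alpha-3 code for a known variant, else None."""
    return LOOKUP.get(normalise_key(name))
```

Returning `None` for unknown names is deliberate: "Republic of the Congo" is a different country, so unmatched names are better flagged for a human than guessed.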
![Page 10: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/10.jpg)
And coding
![Page 11: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/11.jpg)
And interpretation
• Hang on… don’t some people have more than one phone?
• And how do you count the people without toilets?
• What if the cities have lots of phones and toilets, and the rural areas don’t?
• Where does my composting toilet fit in this?
• How big were these surveys?
• What do we do with the zeros?
• Etc…
![Page 12: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/12.jpg)
And purpose
![Page 13: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/13.jpg)
And Communication
![Page 14: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/14.jpg)
And Alternative Data Sources
![Page 15: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/15.jpg)
And alternative alternatives…
• Social media proxies
• Grassroots maps
• Etc.
![Page 16: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/16.jpg)
VELOCITY AND VOLUME
![Page 17: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/17.jpg)
2013 Boston bombings
![Page 18: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/18.jpg)
The Humans+Tools Solution: Crisismapping
![Page 19: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/19.jpg)
Find…
![Page 20: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/20.jpg)
Listen…
![Page 21: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/21.jpg)
Estimate…
![Page 22: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/22.jpg)
Geolocate…
![Page 23: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/23.jpg)
Create maps…
![Page 24: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/24.jpg)
Analyse
![Page 25: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/25.jpg)
Explain
![Page 26: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/26.jpg)
Use
![Page 27: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/27.jpg)
BUT WE NEED MORE DATA SCIENTISTS…
![Page 28: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/28.jpg)
Build and Connect Communities
![Page 29: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/29.jpg)
Train Non-Techies
![Page 30: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/30.jpg)
Create Higher-level Tools
![Page 31: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/31.jpg)
Big Data and Me: experiences from the front line
Sara-Jayne Farmer
http://www.changeassembly.com/
@bodaceacat
![Page 32: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/32.jpg)
MORE REFERENCES
![Page 33: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/33.jpg)
strataconf.com
![Page 34: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/34.jpg)
datasciencecentral.com
![Page 35: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/35.jpg)
analytictalent.com
![Page 36: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/36.jpg)
Tools
![Page 37: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/37.jpg)
Formal (Free) Training
![Page 38: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/38.jpg)
NYC Meetups (see meetup.com)
![Page 39: Big Data and Me](https://reader035.vdocument.in/reader035/viewer/2022062513/55583b42d8b42ac6078b4b7b/html5/thumbnails/39.jpg)
Volunteering: datakind.org