ADD A DATA SCIENTIST TO YOUR STARTUP…
OR CALL IT QUITS
Justo Hidalgo, @justohidalgo
Hi!
• Co-founder, 24symbols – BizDev, Data
• Data Integration and Management, Product Strategy and Innovation
• Ph.D. in Computer Science on Data Integration and Web Automation
• Ergo: Love Data
• @justohidalgo
A service to read and discover digital books that works on any device
Part I: Why Data Science
2. Have someone taking care of your metrics
3. Only measure what you truly care about
4. Beware vanity metrics…
… they may hide an awful truth
AARRR
• Acquire: SEO, SEM, Campaigns, Email, Blogs…
• Activate: Landing Page, Product Features…
• Retain: Content (blogs, articles, …), Emails, Alerts…
• Refer: Campaigns, Emails…
• Get Revenue: Shopping cart, Subscriptions, Lead Gen…
traffic social business
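As a rough illustration of why the AARRR stages are more telling than a single headline number, the sketch below computes the conversion rate from each stage to the next. The stage counts are invented for the example, and treating the five stages as a strictly linear funnel is a simplifying assumption, not something the slides state.

```python
# Hypothetical monthly counts per AARRR stage (made-up numbers, illustration only).
FUNNEL = [
    ("Acquire", 10000),
    ("Activate", 2500),
    ("Retain", 1000),
    ("Refer", 200),
    ("Get Revenue", 100),
]


def conversion_rates(funnel):
    """Return (stage, count, rate relative to the previous stage) per stage."""
    rows = []
    previous = None
    for stage, count in funnel:
        if previous is None:
            rate = 1.0  # the first stage has nothing to be compared against
        elif previous == 0:
            rate = 0.0  # avoid dividing by zero on an empty stage
        else:
            rate = count / previous
        rows.append((stage, count, rate))
        previous = count
    return rows


rates = conversion_rates(FUNNEL)
for stage, count, rate in rates:
    print(f"{stage:<12} {count:>6} {rate:7.1%}")
```

A large "Acquire" count on its own is the kind of vanity metric the earlier slide warns about; the step-to-step rates show where users actually drop off.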
Don’t be data-driven…
… be data-informed…
… but VERY DATA-INFORMED!
#ELS2014
DATA SCIENCE
SO EASY A CAVEMAN CAN DO IT
AND NOW…
BUSINESS-SAVVY!!!
DS organization
Source: http://radar.oreilly.com/2013/06/theres-more-than-one-kind-of-data-scientist.html
Yes, Data Science is a Frankenstein monster… but start working on it…
… or else…
Challenges
• Data acquisition and standardization
• Execution time
• Data misunderstandings
Part II: Case in Point
Books are in the cloud. We can measure every action performed by our readers
Publishing dashboard for publishers and investors
• Dashboard for publishers
• Sales analysis and forecast
• Product experience
• Reader behavior
• Marketing
• Research (MSc)
Currently:
– Raw user data: 80 GB
– Book info: Hundreds of GB
– Over 1 TB of data
February, 2014:
– 700,000+ registered users on 24symbols.com
– 35,000 new registered users monthly (accelerated growth)
– Over 2,000 publishers, 200,000 books and growing
– New instances per country
➤ Whitelabel with mobile carriers => hundreds of thousands of users per country
➤ Currently: 24symbols.com + 4 projects with mobile carriers + internet.org (Colombia, but many more countries coming)
https://developers.google.com/bigquery/case-studies/safari-books
Users:
– Books read (pages, chapters, hours, geo, …)
– “Marked” books (highlights, favorites, etc.)
Books:
– Text on ebooks
– Metadata
– Categories
– Publishers
Types of info
Dataset structure
Sample queries in SQL
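The original slide's queries did not survive in this transcript. As a stand-in, here is a minimal sketch of the kind of aggregate one might run over reading data. The `books` and `reads` tables, their columns, and the sample rows are all hypothetical, and Python's built-in `sqlite3` is used only so the example runs anywhere; it is not the engine used in the talk.

```python
import sqlite3

# In-memory database with a hypothetical, heavily simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, publisher TEXT);
CREATE TABLE reads (user_id INTEGER, book_id INTEGER, pages INTEGER, country TEXT);
INSERT INTO books VALUES (1, 'Book A', 'Publisher X'), (2, 'Book B', 'Publisher Y');
INSERT INTO reads VALUES
    (10, 1, 120, 'ES'), (11, 1, 30, 'CO'), (10, 2, 15, 'ES'), (12, 2, 5, 'CO');
""")

# Distinct readers and total pages read per book, most-read first.
pages_per_book = conn.execute("""
    SELECT b.title, COUNT(DISTINCT r.user_id) AS readers, SUM(r.pages) AS pages
    FROM reads AS r
    JOIN books AS b ON b.book_id = r.book_id
    GROUP BY b.book_id, b.title
    ORDER BY pages DESC
""").fetchall()

# Total pages read per country.
pages_per_country = conn.execute("""
    SELECT country, SUM(pages) AS pages
    FROM reads
    GROUP BY country
    ORDER BY country
""").fetchall()

print(pages_per_book)
print(pages_per_country)
```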
Projects - recommendation
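The slides do not say which recommendation technique was used. Purely as an illustration, the sketch below uses simple item co-occurrence ("readers of this book also read…"), one of the most basic approaches. The reader names, book ids, and the `build_cooccurrence` and `recommend` helpers are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical reading history: reader -> set of book ids they have read.
READS = {
    "ana": {"b1", "b2", "b3"},
    "luis": {"b1", "b2"},
    "eva": {"b2", "b4"},
}


def build_cooccurrence(reads):
    """Count, for each book, how often every other book shares a reader with it."""
    co = defaultdict(Counter)
    for books in reads.values():
        for a, b in permutations(sorted(books), 2):
            co[a][b] += 1
    return co


def recommend(user, reads, co, k=3):
    """Suggest up to k unread books, scored by co-occurrence with the user's books."""
    seen = reads[user]
    scores = Counter()
    for book in seen:
        for other, count in co[book].items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties are broken by book id so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [book for book, _ in ranked[:k]]


co = build_cooccurrence(READS)
print(recommend("luis", READS, co))
print(recommend("eva", READS, co))
```

Co-occurrence counts favor popular titles, so a real system would likely normalize the scores or move to collaborative filtering; this only shows the general shape of the problem.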
Projects – Topic modelling
Projects – ad-hoc querying & BI
Thanks for your time!
@justohidalgo
Justo Hidalgo
http://www.loscuentosdelabuelo.com
Resources
• http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
• http://www.allenai.org/Content/Publications/Etzioni--Data_Scientists_Guide_to_Start-Ups.pdf
• https://www.kaggle.com/
• http://jeroenjanssens.com/2013/09/19/seven-command-line-tools-for-data-science.html
• http://www.boozallen.com/content/dam/boozallen/media/file/The-Field-Guide-to-Data-Science.pdf
• http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/
• http://www.wolfram.com/data-science-platform/
• Online courses:
– https://www.udacity.com/courses#!/data-science
– https://www.coursera.org/specialization/jhudatascience/
– https://www.datacamp.com/
Credits
• http://www.flickr.com/photos/qualityandstyle/4628275080/
• http://enpundit.s3.amazonaws.com/wp-content/uploads/2013/07/gym-teacher-yearbook-photo-1.jpg
• http://nirvacana.com/thoughts/becoming-a-data-scientist/
• http://rlv.zcache.com/little_unicorn_with_pink_mane_tie-r63d74334bf074562988a6286e0081cdb_v9whb_8byvr_324.jpg
• http://www.thebigdatainsightgroup.com/site/sites/default/files/cloud24ii_0.jpg
• https://www.flickr.com/photos/domiriel/6003445548