big data - big insights - waze @google
TRANSCRIPT
Data is Everything. Apps (and companies) win or lose based on how they use it.
Google’s Mission
Organize the world’s information and make it universally accessible and useful.
“Google excels at collecting, storing, and extracting value from big quantities of data”
Google is a (Big) Data Company
Google in just 1 minute:
● 1000 new devices
● 3M Searches
● 100 Hours
● 1B Activated Devices
● 100M GB Search Content
10+ Years of Tackling Big Data Problems
[Timeline figure, 2002-2015]
Google Papers: GFS, MapReduce, BigTable, Dremel, FlumeJava, MillWheel, PubSub
Open Source: Apache Beam, TensorFlow
Google Cloud Products: BigQuery, Pub/Sub, Dataflow, Bigtable
“Google is living a few years in the future and sending the rest of us messages”
Doug Cutting, Hadoop Co-Creator
Capture, Process, Store, Analyze
Data on your terms
Machine Learning
“Machine Learning is concerned with computer programs that automatically improve their performance through experience.”
Herbert Simon, Turing Award 1975, Nobel Prize in Economics 1978
Machine Learning
“A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Chairman, Microsoft)
“Machine learning is the next Internet” (Tony Tether, Director, DARPA)
“Machine learning is the hot new thing” (John Hennessy, President, Stanford)
Google is no stranger to ML
So what do you actually do?
Gain Actionable Insights!
Trending Locations - Hilton TLV
Trending Locations / Day of Week Breakdown
Opening Hours Inference
Optimising - Ad clicks / Time from drive start
Time to Content (US) - Day of week / Category
Irregular Events / Anomaly Detection
Major events causing out-of-the-ordinary traffic, road blocks, etc., and affecting large numbers of users.
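One simple way to flag such irregular events is a z-score test on report counts per time window. The sketch below is only an illustration under stated assumptions, not the method used at Waze: the function name `find_anomalies`, the threshold of 3.0, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev

def find_anomalies(counts, threshold=3.0):
    """Return indices of time windows whose count is far from the baseline.

    Each window is compared against the mean and standard deviation of all
    the other windows (leave-one-out), so a large spike does not inflate
    its own baseline.
    """
    anomalies = []
    for i, value in enumerate(counts):
        rest = counts[:i] + counts[i + 1:]
        if len(rest) < 2:
            continue  # not enough data to estimate a spread
        mu, sigma = mean(rest), stdev(rest)
        if sigma == 0:
            # All other windows are identical: any different value is unusual.
            if value != mu:
                anomalies.append(i)
            continue
        if abs(value - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical hourly report counts for one area; hour 5 has a spike.
hourly_reports = [12, 15, 11, 14, 13, 90, 12, 16]
print(find_anomalies(hourly_reports))  # → [5]
```

A production system would more likely compare against a per-location, per-hour-of-week baseline, since normal traffic varies strongly by time and place.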
Dangerous Places - Clustering
Find the most dangerous areas and streets using custom-developed clustering algorithms
● Alert authorities / users
● Compare & share with 3rd parties (NYPD)
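The slide does not describe the custom clustering algorithms. As a much simpler stand-in, the sketch below bins incident coordinates into a square grid and reports the densest cells. The function name `hotspot_cells`, the cell size, the minimum count, and the sample coordinates are assumptions for illustration.

```python
import math
from collections import Counter

def hotspot_cells(points, cell_size=0.01, min_reports=3):
    """Bin (lat, lon) points into square grid cells and return the dense ones.

    Returns a list of ((row, col), count) pairs, sorted by count, descending.
    This is plain grid binning, not a full clustering algorithm.
    """
    counts = Counter(
        (math.floor(lat / cell_size), math.floor(lon / cell_size))
        for lat, lon in points
    )
    dense = [(cell, n) for cell, n in counts.items() if n >= min_reports]
    return sorted(dense, key=lambda item: item[1], reverse=True)

# Hypothetical incident reports: four close together, one isolated.
incidents = [
    (40.7128, -74.0060),
    (40.7131, -74.0062),
    (40.7125, -74.0068),
    (40.7129, -74.0065),
    (40.7305, -73.9905),
]
print(hotspot_cells(incidents))
```

Grid binning is cheap and easy to aggregate, but it splits hotspots that straddle a cell boundary. A density-based method such as DBSCAN avoids that at higher cost.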
Server Distribution Optimisation
Calculate the optimal distribution of routing servers according to geographical load.
● Better experience - faster response time
● Saves money - no need for redundant elastic scaling of servers
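As a rough illustration of distributing servers by geographical load, the sketch below splits a fixed server budget across regions in proportion to their request load, using the largest-remainder method so the counts add up exactly. The slide does not say how the optimum is actually computed; the function name `allocate_servers`, the region names, and the load figures are hypothetical.

```python
def allocate_servers(load_by_region, total_servers):
    """Split total_servers across regions in proportion to their load.

    Uses the largest-remainder method: give each region the integer part of
    its proportional quota, then hand the leftover servers to the regions
    with the largest fractional remainders.
    """
    total_load = sum(load_by_region.values())
    if total_load <= 0:
        raise ValueError("total load must be positive")
    quotas = {
        region: total_servers * load / total_load
        for region, load in load_by_region.items()
    }
    allocation = {region: int(quota) for region, quota in quotas.items()}
    leftover = total_servers - sum(allocation.values())
    by_remainder = sorted(
        quotas, key=lambda region: quotas[region] - allocation[region], reverse=True
    )
    for region in by_remainder[:leftover]:
        allocation[region] += 1
    return allocation

# Hypothetical requests per second by region.
load = {"us-east": 500, "eu-west": 300, "asia": 200}
print(allocate_servers(load, 10))  # → {'us-east': 5, 'eu-west': 3, 'asia': 2}
```

A real placement would also weigh latency to users and per-region cost, which this proportional split ignores.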
Text Mining - Topic Analysis
Topic 1 - ETA   Topic 2 - Unusual   Topic 3 - Share info   Topic 4 - Reports   Topic 5 - Jams   Topic 6 - Voice
wazers          usual               road                   social              still            morgan
eta             traffic             driving                drivers             will             ang
con             stay                info                   reporting           update           freeman
zona            today               using                  helped              drive            kanan
usando          times               area                   nearby              delay            voice
real            clear               realtime               traffic             add              meter
tiempo          slower              sharing                jam                 jammed           kan
carretera       accident            soci                   drive               near             masuk
Text Mining - New Version Impressions
● Text analysis - stemming / stopword detection etc.
● Topic modeling
● Sentiment analysis
Waze V4 update:
● Good - “redesign”, ”smarter”, “cleaner”, “improved”
● Bad - “stuck”
Overall very positive score!
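The pipeline above (stopword removal followed by sentiment scoring) can be sketched with a tiny word-list approach. This is an assumption-laden illustration, not the analysis behind the slide: the stopword list and sample reviews are invented, the positive and negative lexicons reuse only the words quoted on the slide, and stemming and topic modeling are left out.

```python
import re

# Hypothetical minimal stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "is", "and", "it", "i", "in", "on", "but", "much", "this"}
# Lexicons built only from the words quoted on the slide.
POSITIVE = {"redesign", "smarter", "cleaner", "improved"}
NEGATIVE = {"stuck"}

def tokenize(text):
    """Lowercase, split into alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def sentiment_score(reviews):
    """Return (positive - negative) / (positive + negative) over all reviews.

    The score ranges from -1.0 (all negative hits) to 1.0 (all positive hits);
    0.0 is returned when no lexicon word appears at all.
    """
    pos = neg = 0
    for review in reviews:
        for token in tokenize(review):
            if token in POSITIVE:
                pos += 1
            elif token in NEGATIVE:
                neg += 1
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

# Hypothetical store reviews.
reviews = [
    "Love the redesign, much cleaner and smarter",
    "Improved routing",
    "App is stuck on the loading screen",
]
print(sentiment_score(reviews))  # → 0.6
```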
Text Mining - Store Sentiments
Text Mining - Sentiment by Time & Place
Daniel [email protected]@gmail.com