Prediction DataBase: The Need of The Hour
Devavrat Shah
Professor, EECS
Director, Stats & Data Sc
Massachusetts Institute of Technology
Co-Founder, Chief Scientist
Celect, Inc.
© 2015 Celect, Inc. All Rights Reserved. Use, reproduction, or disclosure is subject to restrictions set forth in Contract Number 2014-14031000011 and Sub Contract No. Celect 01
An Ultimate Prediction Engine?
Prediction
Confidence
Provenance
Big Data
Heterogeneous Data
Sparse Data
Up-and-Running Instantly without a Team of Data Scientists
Add Data Incrementally
Stitch Different Data Sources for Better Predictions
Anybody (Excel user) can use it!
Existing Paradigm of Statistics / Machine Learning
Application
DataStore
Manual Data Processing
Predictive Queries
“Normalized Data”
Model Learning, Prediction
Getting Rid of This!
Prediction DataBase
Application
DataStore
Predictive Queries
A Software Layer
No Manual Processing
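The layering on this slide can be sketched in a few lines. This is a minimal illustration, not Celect's implementation: the class name `PredictionLayer`, the `query` method, the provenance strings, and the mean-based placeholder predictor are all assumptions introduced here. The talk does not say how predictions are computed.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


class PredictionLayer:
    """Hypothetical software layer between an application and a datastore."""

    def __init__(
        self,
        store: Dict[Hashable, Any],
        predictor: Callable[[Dict[Hashable, Any], Hashable], Any],
    ) -> None:
        self.store = store          # stands in for the DataStore box on the slide
        self.predictor = predictor  # assumed pluggable; not specified in the talk

    def query(self, key: Hashable) -> Tuple[Any, str]:
        """Return (value, provenance), where provenance is 'stored' or 'predicted'."""
        if key in self.store:
            return self.store[key], "stored"
        # The key is missing, so the same query is answered by prediction.
        return self.predictor(self.store, key), "predicted"


def mean_of_stored(store: Dict[Hashable, Any], key: Hashable) -> float:
    """Placeholder predictor (an assumption): average of all stored values."""
    values = list(store.values())
    return sum(values) / len(values)


layer = PredictionLayer({"mon": 10, "tue": 12}, mean_of_stored)
stored_answer = layer.query("mon")     # value comes straight from the store
predicted_answer = layer.query("wed")  # value comes from the predictor
```

The application calls `query` the same way in both cases; only the provenance tag tells it whether the answer was stored or predicted.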
Predictive DataBase: A New Data Infrastructure
A Brief History of DataBase
1970s-80s: relational databases like MySQL and Postgres
1980s-90s: personal databases like Excel
1990s-00s: distributed databases like Cassandra
2000s-10s: search-engine databases like Elasticsearch
Now: Prediction database
Formal Description
Schema-less DB: key : value
Atomic Prediction: key : ?
Name Table
1: ‘Vasudha Shivamoggi’
2: ‘Devavrat Shah’
3: ‘Vishal Doshi’
4: ‘Ying-zong Huang’
5: ‘John Andrews’
6: ‘Balaji Rengarajan’
7: ‘Ritesh Madan’
8: ‘Daniel Xu’
9: ?
Formal Description
Schema-less DB: key : value
Atomic Prediction: key : ?
Name Table
1: ‘Vasudha Shivamoggi’
2: ‘Devavrat Shah’
3: ‘Vishal Doshi’
4: ‘Ying-zong Huang’
5: ‘John Andrews’
6: ‘Balaji Rengarajan’
7: ‘Ritesh Madan’
8: ‘Daniel Xu’
9: ‘John Tsitsiklis’
Gender Table
1: ‘Female’
2: ‘Male’
3: ‘Male’
4: ‘Male’
5: ‘Male’
6: ‘Male’
7: ‘Male’
8: ‘Male’
9: ?
Prediction for 9: may be ‘Male’
Formal Description
Schema-less Prediction DB: (key, table name) : value
Atomic Prediction: (key, table name) : ?
(1, ‘Name’): ‘Vasudha Shivamoggi’
(2, ‘Name’): ‘Devavrat Shah’
(3, ‘Name’): ‘Vishal Doshi’
(4, ‘Name’): ‘Ying-zong Huang’
(5, ‘Name’): ‘John Andrews’
(6, ‘Name’): ‘Balaji Rengarajan’
(7, ‘Name’): ‘Ritesh Madan’
(8, ‘Name’): ‘Daniel Xu’
(9, ‘Name’): ‘John Tsitsiklis’
(1, ‘Gender’): ‘Female’
(2, ‘Gender’): ‘Male’
(3, ‘Gender’): ‘Male’
(4, ‘Gender’): ‘Male’
(5, ‘Gender’): ‘Male’
(6, ‘Gender’): ‘Male’
(7, ‘Gender’): ‘Male’
(8, ‘Gender’): ‘Male’
(9, ‘Gender’): ?
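The atomic prediction query on this slide can be illustrated with a deliberately naive baseline. The talk does not describe the prediction method, so the sketch below swaps in a majority vote over the values already observed in the same table, and reports the vote share as a confidence. The function name `atomic_predict` and the confidence definition are assumptions made for illustration; a real system would presumably also use the ‘Name’ entry for key 9, which this baseline ignores.

```python
from collections import Counter
from typing import Any, Dict, Hashable, Optional, Tuple

Key = Tuple[Hashable, str]  # (key, table name)


def atomic_predict(
    db: Dict[Key, Any], key: Hashable, table: str
) -> Optional[Tuple[Any, float]]:
    """Answer the query (key, table name) : ? with a (value, confidence) pair.

    Assumed baseline: return the stored value if present, otherwise the most
    common value in that table, with its observed frequency as confidence.
    """
    if (key, table) in db:
        return db[(key, table)], 1.0
    observed = [v for (k, t), v in db.items() if t == table]
    if not observed:
        return None  # nothing to base a prediction on
    value, count = Counter(observed).most_common(1)[0]
    return value, count / len(observed)


db: Dict[Key, Any] = {
    (1, "Name"): "Vasudha Shivamoggi", (1, "Gender"): "Female",
    (2, "Name"): "Devavrat Shah", (2, "Gender"): "Male",
    (3, "Name"): "Vishal Doshi", (3, "Gender"): "Male",
    (4, "Name"): "Ying-zong Huang", (4, "Gender"): "Male",
    (5, "Name"): "John Andrews", (5, "Gender"): "Male",
    (6, "Name"): "Balaji Rengarajan", (6, "Gender"): "Male",
    (7, "Name"): "Ritesh Madan", (7, "Gender"): "Male",
    (8, "Name"): "Daniel Xu", (8, "Gender"): "Male",
    (9, "Name"): "John Tsitsiklis",
}

prediction = atomic_predict(db, 9, "Gender")
print(prediction)  # ('Male', 0.875): 7 of the 8 observed Gender values are 'Male'
```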
Formal Description
Schema-less Prediction DB: (key, table name) : value, or (key1, key2, table name) : value
Atomic Prediction: (key, table name) : ?, or (key1, key2, table name) : ?
Value types: text, numeric, image, GeoJSON
Graph DB: A Special Case
Schema-less Prediction DB: (key, table name) : value, or (key1, key2, table name) : value
Atomic Prediction: (key, table name) : ?, or (key1, key2, table name) : ?
[Figure: graph with nodes 1, 2, 3, 4]
(1, 2, retweet) : ‘GeoInt’
(1, 2, SMS) : ‘Meet @ Hyatt Dulles’
(1, name) : ‘Dev’
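To make the graph reading concrete, the sketch below stores the slide's three entries in one Python dict and recovers edges and node attributes from the key shape alone. The helper names `neighbors` and `node_attributes` are hypothetical, and treating (key1, key2, table name) as an edge directed from key1 to key2 is an assumption; the slides do not state a direction.

```python
from typing import Any, Dict, Hashable, List, Tuple


def neighbors(db: Dict[tuple, Any], node: Hashable) -> List[Tuple[Hashable, str, Any]]:
    """List (other node, table name, value) for each pair-keyed entry starting at node."""
    edges = []
    for key, value in db.items():
        if len(key) == 3 and key[0] == node:  # (key1, key2, table name) acts as an edge
            _, other, table = key
            edges.append((other, table, value))
    return edges


def node_attributes(db: Dict[tuple, Any], node: Hashable) -> Dict[str, Any]:
    """Collect single-keyed entries (key, table name) as attributes of the node."""
    return {key[1]: value for key, value in db.items()
            if len(key) == 2 and key[0] == node}


db: Dict[tuple, Any] = {
    (1, 2, "retweet"): "GeoInt",
    (1, 2, "SMS"): "Meet @ Hyatt Dulles",
    (1, "name"): "Dev",
}

edges_from_1 = neighbors(db, 1)
attrs_of_1 = node_attributes(db, 1)
```

Here the table name plays the role of an edge type, so the same pair of nodes can be linked by both a retweet and an SMS without any schema change.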
This is Not A Pipe Dream: Celect Has Built It
Application
DataStore
Schema Definition
Predictive Queries
Celect in Retail: At Fortune 500 Scale
5% - 15% Increase in Revenue
System in Cloud Auto-Scales as Data Grows
100 GBs/Day
100M+ Customers
100M+ Products
< 100 milliseconds API Response Time
Celect Beyond Retail
Parting Remarks
Prediction Database
New Paradigm for Modern Statistics and Machine Learning
Make Prediction a “Special” Database Query
Celect, Inc. Has Built Such an Infrastructure
Can Support Most (If Not All) Problems of Interest
Successful in Retail Industry
Intriguing Case-Studies Beyond Retail
Handles Unstructured Data: Text, GeoSpatial, Image