dunning strata-2012-27-02
![Page 1: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/1.jpg)
1©MapR Technologies - Confidential
Expect More from Hadoop!
![Page 2: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/2.jpg)
My Background
University, Startups
– Aptex, MusicMatch, ID Analytics, Veoh
– big data since before it was big
Open source
– even before the internet
– Apache Hadoop, Mahout, Zookeeper, Drill
– bought the beer at first HUG
MapR Founding member of Apache Drill
![Page 3: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/3.jpg)
MapR Technologies
Enterprise quality distribution for Hadoop
– Many extensions beyond basic Hadoop
Super strong team
– Long history of successful startups
Strong supporter of Apache Drill
– and open source in general
![Page 4: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/4.jpg)
meta-Hadoop?
![Page 5: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/5.jpg)
meta
Meta- (from Greek: μετά = "after", "beyond", "with", "adjacent", "self"), is a…
![Page 6: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/6.jpg)
Beyond ≠ Answering yesterday’s problems
![Page 7: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/7.jpg)
Philosophy First
What is History?
![Page 8: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/8.jpg)
The study of the past
(what came before now)
![Page 9: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/9.jpg)
What is the future?
(it comes after now)
![Page 10: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/10.jpg)
![Page 11: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/11.jpg)
![Page 12: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/12.jpg)
But the future also has a past!
![Page 13: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/13.jpg)
the future of the past is not
the past of the future
![Page 14: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/14.jpg)
Do you remember the future?
![Page 15: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/15.jpg)
![Page 16: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/16.jpg)
![Page 17: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/17.jpg)
![Page 18: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/18.jpg)
Those are yesterday’s answers
![Page 19: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/19.jpg)
and also the seeds of tomorrow
![Page 20: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/20.jpg)
Guys wearing Fedoras
![Page 21: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/21.jpg)
Hadoop has a history
![Page 22: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/22.jpg)
Hadoop also has a future
![Page 23: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/23.jpg)
The Old Future of Hadoop
Implementing yet another Google paper
– Map-reduce and HDFS, and Yarn and Tez
– more and more, but not really different
Eco-system additions (more Google papers)
– simpler programming (Hive and Pig and Crunch) (Sawzall, FlumeJava, etc)
– key-value store (big table)
– ad hoc query (Dremel)
– also not really different
Stands apart from other computing
– required by HDFS and other limitations
![Page 24: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/24.jpg)
The New Future of Hadoop
Real-time processing
– Combines real-time and long-time
Integration with traditional IT
– No need to stand apart
Integration with new technologies
– Solr, Node.js, Twisted all should work directly on Hadoop
Fast and flexible computation
– Drill logical plan language
![Page 25: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/25.jpg)
Example #1: Search Abuse
![Page 26: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/26.jpg)
History matrix
One row per user
One column per thing
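As a rough illustration of the history matrix described on this slide, the sketch below builds a binary user-by-item matrix from a list of (user, item) events. The function name `build_history_matrix` and the sample events are hypothetical, not taken from the talk.

```python
def build_history_matrix(events):
    """Build a binary history matrix: one row per user, one column per item.

    `events` is an iterable of (user, item) pairs. Returns (users, items, matrix)
    where matrix[r][c] is 1 if users[r] interacted with items[c], else 0.
    """
    events = list(events)
    users = sorted({user for user, _ in events})
    items = sorted({item for _, item in events})
    user_row = {user: r for r, user in enumerate(users)}
    item_col = {item: c for c, item in enumerate(items)}

    matrix = [[0] * len(items) for _ in users]
    for user, item in events:
        matrix[user_row[user]][item_col[item]] = 1
    return users, items, matrix


# Made-up sample data for illustration only.
events = [
    ("alice", "book"), ("alice", "lamp"),
    ("bob", "lamp"), ("bob", "sofa"),
    ("carol", "book"), ("carol", "lamp"),
]
users, items, history = build_history_matrix(events)
```

A real history of "all transactions, all web interaction" would be far too sparse for nested lists; a sparse representation would be the natural choice there.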
![Page 27: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/27.jpg)
Recommendation based on cooccurrence
Cooccurrence gives item-item mapping
One row and column per thing
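A minimal sketch of the item-item mapping, assuming a small binary history matrix like the one on the previous slide. It uses plain cooccurrence counts (the off-diagonal of the history matrix multiplied by its own transpose); the slides do not say which scoring the Mahout job applies, so treat the counting rule and the name `cooccurrence` as assumptions.

```python
def cooccurrence(history):
    """Count, for each pair of distinct items, how many users touched both.

    `history` is a list of rows (one per user) of 0/1 flags (one per item).
    Returns a square matrix with one row and one column per item.
    """
    n_items = len(history[0]) if history else 0
    counts = [[0] * n_items for _ in range(n_items)]
    for row in history:
        touched = [col for col, flag in enumerate(row) if flag]
        for a in touched:
            for b in touched:
                if a != b:  # leave the diagonal at zero
                    counts[a][b] += 1
    return counts


# Rows: three hypothetical users; columns: three hypothetical items.
history = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
]
cooc = cooccurrence(history)
```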
![Page 28: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/28.jpg)
Cooccurrence matrix can also be implemented as a search index
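One way to read this slide: each item becomes a document whose indicator field lists the items it cooccurs with, and a user's history becomes the query. The sketch below stands in for that with a tiny in-memory inverted index; it is not Solr, and the names `build_index`, `recommend`, and the sample indicators are hypothetical.

```python
from collections import defaultdict


def build_index(indicators):
    """Invert item -> cooccurring items into term -> items containing that term."""
    index = defaultdict(set)
    for item, terms in indicators.items():
        for term in terms:
            index[term].add(item)
    return index


def recommend(index, user_history, limit=10):
    """Score items by how many of the user's history items hit their indicators."""
    seen = set(user_history)
    scores = defaultdict(int)
    for term in seen:
        for item in index.get(term, ()):
            if item not in seen:  # do not recommend what the user already has
                scores[item] += 1
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:limit]


# Hypothetical indicator lists, e.g. the nonzero entries of a cooccurrence matrix.
indicators = {"book": ["lamp"], "lamp": ["book", "sofa"], "sofa": ["lamp"]}
index = build_index(indicators)
recs_for_book_reader = recommend(index, ["book"])
recs_for_lamp_owner = recommend(index, ["lamp"])
```

A search engine would replace the simple hit count with its own relevance scoring, which is part of the appeal of the approach.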
![Page 29: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/29.jpg)
[Diagram: Complete history, Cooccurrence (Mahout), Item meta-data, Solr indexing (Solr indexers), Index shards]
![Page 30: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/30.jpg)
[Diagram: User history, Web tier, Item meta-data, Solr search (Solr indexers), Index shards]
![Page 31: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/31.jpg)
Objective Results
At a very large credit card company
History is all transactions, all web interaction
Processing time cut from 20 hours per day to 3
Recommendation engine load time decreased from 8 hours to 3 minutes
![Page 32: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/32.jpg)
Scaling Estimates – Twitter Fire hose
Old School – 8+ separate clusters, 20-25 nodes
– >3 Kafka nodes
– >2 TwitterLogger
– 5-10 Hadoop
– >3 Storm
– 3 zookeepers (or not?)
– NAS for web storage
– >2 web servers
MapR – one platform
– 5-10 nodes total, any node does any job
– Full HA included, backups included, disaster recovery included
![Page 33: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/33.jpg)
Example #2: Web Technology
![Page 34: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/34.jpg)
Fast analysis (Storm)
Analytic output
Real-time data
Raw logs
![Page 35: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/35.jpg)
Large analysis (map-reduce)
Analytic output
Raw logs
![Page 36: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/36.jpg)
Presentation tier (d3 + node.js)
Analytic output
Browser query
Raw logs
![Page 37: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/37.jpg)
Old School Storm: Complex architecture
[Diagram: Twitter API, TwitterLogger, Kafka API, Kafka Cluster (x3), Storm, Flume, Hadoop, HDFS Data, NAS, Web Data, Web-server, http]
![Page 38: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/38.jpg)
MapR: One Platform with Streaming Writes
[Diagram: Twitter API, TwitterLogger, Catcher (x2), Topic Queue, Storm, Web Data, Web-server, http, MapR, NFS (x4), Optional MapReduce, HDFS API]
Users can also run extended analytics/MapReduce on the stored data
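To make the "streaming writes" idea concrete, here is a minimal sketch of a topic queue kept as an append-only file that one process writes and another tails. It assumes the cluster is reachable as an ordinary file system path (for instance through an NFS mount); the slides do not show how the Catcher is implemented, so the file layout and the names `append_record` and `read_new_records` are hypothetical. The demo uses a temporary local directory in place of a cluster mount.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a JSON line to the topic file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_new_records(path, offset):
    """Return (records, new_offset) for complete lines written since `offset`."""
    records = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line the writer has not finished yet
            records.append(json.loads(line))
            offset += len(line)
    return records, offset


# Stand-in for a directory on the mounted cluster (assumption for the demo).
queue_dir = tempfile.mkdtemp()
topic = os.path.join(queue_dir, "tweets.jsonl")

append_record(topic, {"id": 1, "text": "hello"})
first_batch, offset = read_new_records(topic, 0)

append_record(topic, {"id": 2, "text": "world"})
second_batch, offset = read_new_records(topic, offset)
```

Because the queue is just a file, the same data stays available afterwards for batch analysis without being copied elsewhere.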
![Page 39: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/39.jpg)
![Page 40: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/40.jpg)
Objective Results
Real-time + long-time analysis is seamless
Web tier can be rooted directly on Hadoop cluster
No need to move data
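A small sketch of what combining real-time and long-time results might look like at the presentation tier, assuming both analyses write per-key counts that the web tier can read side by side. The function name `combined_counts` and the sample numbers are made up, and the sketch assumes the batch and real-time windows do not overlap.

```python
from collections import Counter


def combined_counts(batch_counts, realtime_counts):
    """Merge long-time (batch) counts with recent (real-time) counts per key."""
    total = Counter(batch_counts)
    total.update(realtime_counts)  # adds counts key by key
    return dict(total)


# Hypothetical outputs: batch job over raw logs, plus recent streaming counts.
batch = {"#hadoop": 100, "#mapr": 40}
recent = {"#hadoop": 3, "#drill": 2}
merged = combined_counts(batch, recent)
```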
![Page 41: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/41.jpg)
The future is not what we thought it would be
![Page 42: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/42.jpg)
It is better!
![Page 43: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/43.jpg)
Get Involved!
Tweet: #strataconf #mapr @ted_dunning
![Page 44: Dunning strata-2012-27-02](https://reader035.vdocument.in/reader035/viewer/2022081518/54c6579b4a7959b93b8b45a6/html5/thumbnails/44.jpg)
Get Involved!
Join Apache Drill!
– [email protected]
– Follow @apachedrill
Join MapR!
– [email protected]
Download these slides
– http://www.mapr.com/company/events/strata-conference-2-2-27-13
Contact me:
– [email protected]
– [email protected]
– @ted_dunning