Apache Hadoop on the Open Cloud (David Dobbins, Nirmal Ranganathan)
TRANSCRIPT
![Page 1: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/1.jpg)
Apache Hadoop on the Open Cloud
David Dobbins
Nirmal Ranganathan
![Page 2: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/2.jpg)
Who is using Apache Hadoop
•Traditionally = Developers
•Increasingly = Business Users / Data Scientists
•Why does this matter?
![Page 3: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/3.jpg)
Configuring and managing a Hadoop cluster is hard
![Page 4: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/4.jpg)
Resources / Expertise
![Page 5: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/5.jpg)
Multiple Performance and Design Variables
![Page 6: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/6.jpg)
The Cloud solves some of these
![Page 7: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/7.jpg)
Advantages of using the cloud
Fast
Easy
Flexible
![Page 8: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/8.jpg)
You still require expertise
![Page 9: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/9.jpg)
Let's check out another option
![Page 10: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/10.jpg)
Hadoop in the Cloud Use Cases
![Page 11: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/11.jpg)
Development / POC Clusters
![Page 12: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/12.jpg)
Dynamic Clusters
![Page 13: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/13.jpg)
Growth Clusters
![Page 14: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/14.jpg)
Your data is already in the Cloud
![Page 15: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/15.jpg)
Demo
Run an actual job
![Page 16: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/16.jpg)
Swift Filesystem for Hadoop: HADOOP-8545
•New filesystem URL, swift://
•Read from, write to local & remote Swift clusters
•Keep long-lived data in Swift; upload while Hadoop cluster is off-line
The challenges of running Map Reduce jobs against Swift:
• Identity management
• Block size
• Object store vs file paths
• Direct API into Swift from HDFS
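As a rough illustration of the "object store vs file paths" point, the sketch below shows how a swift:// URL might be taken apart and how a directory listing can be emulated over Swift's flat object names. It assumes the `swift://<container>.<service>/<path>` URL layout and the `fs.swift.*` property names used by the hadoop-openstack module that came out of HADOOP-8545; the function names, the `myprovider` service name, and the sample object names are hypothetical and do not come from the slides.

```python
from urllib.parse import urlsplit


def parse_swift_url(url):
    """Split swift://<container>.<service>/<path> into its three parts.

    The URL layout is an assumption based on the hadoop-openstack module.
    """
    parts = urlsplit(url)
    if parts.scheme != "swift":
        raise ValueError(f"not a swift:// URL: {url}")
    container, sep, service = parts.netloc.partition(".")
    if not container or not sep or not service:
        raise ValueError(f"expected swift://<container>.<service>/..., got: {url}")
    # Swift has no real directories: the "path" is simply the object name.
    return container, service, parts.path.lstrip("/")


def list_directory(object_names, directory):
    """Emulate a one-level directory listing over flat object names."""
    stripped = directory.strip("/")
    prefix = stripped + "/" if stripped else ""
    children = set()
    for name in object_names:
        if name.startswith(prefix):
            # Keep only the first path component below the prefix.
            child = name[len(prefix):].split("/", 1)[0]
            if child:
                children.add(child)
    return sorted(children)


def service_config(service, auth_url, username, password, tenant):
    """Build Hadoop properties for one Swift service (names are assumed)."""
    base = f"fs.swift.service.{service}"
    return {
        "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
        f"{base}.auth.url": auth_url,
        f"{base}.username": username,
        f"{base}.password": password,
        f"{base}.tenant": tenant,
    }


if __name__ == "__main__":
    # "data" and "myprovider" are placeholder container and service names.
    print(parse_swift_url("swift://data.myprovider/logs/2013/part-0000"))
    objects = ["logs/2013/a", "logs/2013/b", "logs/2014/a", "tmp/x"]
    print(list_directory(objects, "logs"))
```

The identity-management bullet corresponds to the per-service credentials in `service_config`. The block-size bullet has no counterpart here, since Swift objects are not divided into HDFS-style blocks.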
![Page 17: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/17.jpg)
Map Reduce to Swift (via “HDFS”)
[Diagram: two stacks side by side. First stack: HDFS, MapReduce, Application X. Second stack: HDFS Proxy, MapReduce, Application X, with Swift as the backing store.]
![Page 18: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/18.jpg)
Hadoop + OpenStack
![Page 19: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/19.jpg)
Cloud Big Data Platform
•Hortonworks Data Platform
• HDP 1.1
• HDP 1.3
• Pig, Hive, HCatalog
• Coming soon: HDP 2.0
![Page 20: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/20.jpg)
Cloud Big Data Platform
•Secure by default
•Comes pre-optimized
•Web UI, CLI, REST API
![Page 21: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/21.jpg)
Built on OpenStack
![Page 22: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/22.jpg)
Why an Open Platform matters
Sandbox on Rackspace Cloud
Sandbox VM
RAX Resell
![Page 23: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/23.jpg)
Cool stuff
![Page 24: Apache Hadoop on the Open Cloud David Dobbins Nirmal Ranganathan](https://reader033.vdocument.in/reader033/viewer/2022051622/5697bfd11a28abf838caadf2/html5/thumbnails/24.jpg)
@caffiend @rnirmal
http://www.rackspace.com/big-data