EMC Hadoop Starter Kit - ViPR Edition
![Page 1: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/1.jpg)
© Copyright 2014 EMC Corporation. All rights reserved.
EMC Hadoop Starter Kit - ViPR Edition
EMC Open Innovation Lab
![Page 2: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/2.jpg)
The Digital Universe
Less than 1% of the World’s Data is Analyzed
By 2020, the Internet will connect 7.6B people and 200B things (sensors, machines, cars, appliances…)
Data Volumes
2000: 2 Exabytes a year
2011: 2 Exabytes a day
![Page 3: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/3.jpg)
Location & Types Of Big Data
[Diagram labels]
Types: Structured Data, Unstructured Data
Locations: Enterprise, Partner, Public
Examples: Forecast Data, Location Data, Credit Data, Shipping Data, Telemetry Data, Social and Video Data
Location & Types Of Big (& Fast!) Data
![Page 4: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/4.jpg)
Hadoop Challenges
Depends on HDFS for data repository
– Must make legacy data accessible through HDFS
Hadoop HDFS inefficiencies:
– 3 copies for protection
– No advanced data efficiency: de-duplication, thin provisioning
– Security
Integration with robust traditional data center products: compute virtualization, enterprise storage
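The "3 copies for protection" point can be made concrete with a little arithmetic. The sketch below assumes plain full-copy replication with a factor of 3; the function names and the 100 TB figure are illustrative assumptions, not values from the deck.

```python
def raw_capacity(usable, copies=3):
    """Raw capacity needed to store `usable` units of data when every
    block is kept as `copies` full replicas (assumed replication model)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return usable * copies


def redundancy_fraction(copies=3):
    """Fraction of raw capacity that holds redundant copies rather than
    unique data."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return (copies - 1) / copies


if __name__ == "__main__":
    # Illustrative figure: 100 TB of unique data.
    print("Raw TB needed:", raw_capacity(100))
    print("Redundant share of raw capacity:", round(redundancy_fraction(), 3))
```

Under these assumptions, two thirds of the raw disk in the cluster holds redundant copies, which is the inefficiency this slide points at.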
![Page 5: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/5.jpg)
Hadoop Storage Options
Hadoop HDFS
• Leverage Hadoop distro HDFS data services
• Compute and data converged on a cluster of servers
Storage Array
• Name node and Data node services from storage array (e.g. EMC Isilon)
Storage OS
• Name node and Data node services from storage OS (e.g. EMC ViPR)
![Page 6: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/6.jpg)
ViPR HDFS
HDFS is becoming the de facto file system for distributed applications
ViPR is a great platform for HDFS
– Addresses limitations of off-the-shelf HDFS
– Brings HDFS to existing storage hardware
– Enables HDFS/object/file scenarios
– Flexible software model allows colocation
![Page 7: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/7.jpg)
Support Mixed Workloads
Object, File and HDFS operations on the same data
[Diagram: VIRTUAL ARRAY over Isilon, VNX5500, and 3rd Party storage]
ViPR Data Services offer three bucket options:
– Object
– HDFS
– Object and HDFS
An Object and HDFS bucket gives the user access to the same data through either S3 or HDFS
– Full compatibility with existing object-based APIs
▪ Amazon S3, OpenStack Swift, Atmos
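The three bucket options can be read as a small access matrix. The sketch below is only a conceptual model of that matrix; the enum, the mapping, and the interface names are hypothetical and are not part of any ViPR API.

```python
from enum import Enum


class BucketType(Enum):
    """The three bucket options named on the slide."""
    OBJECT = "Object"
    HDFS = "HDFS"
    OBJECT_AND_HDFS = "Object and HDFS"


# Hypothetical mapping from bucket option to the access interfaces it
# exposes, as described on the slide. "object" stands for the object
# APIs (Amazon S3, OpenStack Swift, Atmos).
ALLOWED_INTERFACES = {
    BucketType.OBJECT: frozenset({"object"}),
    BucketType.HDFS: frozenset({"hdfs"}),
    BucketType.OBJECT_AND_HDFS: frozenset({"object", "hdfs"}),
}


def can_access(bucket_type, interface):
    """Return True if a bucket of this type is reachable through the
    given interface ("object" or "hdfs")."""
    return interface in ALLOWED_INTERFACES[bucket_type]


if __name__ == "__main__":
    for bucket_type in BucketType:
        interfaces = sorted(ALLOWED_INTERFACES[bucket_type])
        print(f"{bucket_type.value}: {', '.join(interfaces)}")
```

Only the combined option lets data written through an object API be read by a Hadoop job without copying it first, which is the mixed-workload case this slide describes.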
![Page 8: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/8.jpg)
Simple, Easy, Cost Effective
EMC Starter Kit for Hadoop – ViPR Edition
Deployment guides for major Hadoop distributions:
– Pivotal, Cloudera, and Hortonworks
Four-step deployment:
– Deploy preferred Hadoop distribution
– Deploy EMC ViPR with Object and HDFS data services
– Configure Hadoop distribution to use ViPR HDFS target
– Validation process
▪ Load data file via S3 interface
▪ Test MapReduce job
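The deck does not say which MapReduce job the validation step runs. As an illustration of what such a test job computes, here is a minimal local word-count sketch in Python. It runs in a single process with no Hadoop, ViPR, or S3 involved, and every function name in it is an assumption made for this example.

```python
from collections import defaultdict


def map_words(lines):
    """Map phase: emit a (word, 1) pair for every whitespace-separated
    word, lowercased so that counts are case-insensitive."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map and reduce phases in-process over an iterable of lines."""
    return reduce_counts(map_words(lines))


if __name__ == "__main__":
    # Stand-in for the data file that the validation step loads via S3.
    sample = ["big data", "Big Hadoop"]
    print(word_count(sample))
```

In a real validation run the input would be the file loaded through the S3 interface and read back by the Hadoop cluster through the ViPR HDFS target; comparing the job's output with a local count like this one is one simple way to check the round trip.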
![Page 9: EMC Hadoop Starter Kit - ViPR Edition](https://reader033.vdocument.in/reader033/viewer/2022061303/5492c99bb479596f4d8b46af/html5/thumbnails/9.jpg)