big data in the cloud - shekhar vemuri
![Page 1: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/1.jpg)
Big Data and the Cloud
Shekhar Vemuri
#phxdataconference
![Page 2: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/2.jpg)
ABOUT
• PRINCIPAL at CLAIRVOYANT
• PRODUCT, DATA, ANALYTICS and CLOUD
• large scale web and data systems
• simple, lightweight solutions
![Page 3: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/3.jpg)
QUICK POLL
• HADOOP, HIVE, PIG
• PUBLIC CLOUD, IaaS, SaaS
• AMAZON AWS, EC2
• ELASTICITY
• S3, EMR, KINESIS
• IoT
![Page 4: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/4.jpg)
WHAT WILL WE TALK ABOUT
![Page 5: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/5.jpg)
BIG DATA
![Page 6: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/6.jpg)
USE CASES
• RISK MODELING
• PERSONALIZED MEDICINE
• AD TARGETING
• INTERNET OF THINGS
• THREAT ANALYSIS
• RECOMMENDATIONS
• SURVEILLANCE
• RETENTION
• 360 CUSTOMER VIEW
![Page 7: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/7.jpg)
DRIVING FACTORS
• variety in data
• not just transactional data
• potential for tremendous insight - when combining transactional data with additional data sources
• LinkedIn, Twitter, Facebook, Pinterest, Open Data
• Internet of Things
![Page 8: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/8.jpg)
the CLOUD
![Page 9: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/9.jpg)
the CLOUD
• IaaS, SaaS
• on demand subscription
• subscription vs owning
• tradeoff
• ease of adoption
• powering nextgen entrepreneurship
![Page 10: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/10.jpg)
LANDSCAPE
![Page 11: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/11.jpg)
DATA VALUE CHAIN
![Page 12: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/12.jpg)
DATA VALUE CHAIN
GENERATE > (ingest) > STORE > (transform) > ANALYZE > (transform) > INSIGHTS
![Page 13: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/13.jpg)
BIG DATA + the CLOUD
![Page 14: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/14.jpg)
![Page 15: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/15.jpg)
LOG ANALYSIS
![Page 16: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/16.jpg)
AMAZON S3
AMAZON EC2
LOG FILES
ReST CLIENTS
WEB APP, REST APIs
![Page 17: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/17.jpg)
AMAZON EMR
AMAZON S3
AMAZON EC2
LOG FILES
ReST CLIENTS
WEB APP, REST APIs
AMAZON REDSHIFT
LOG FILES - STORED in S3
MAP-REDUCE, HIVE, PIG, CASCADING jobs
STORE summarized data
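The flow on this slide (raw log files in S3, a MAP-REDUCE style job, summarized data stored for querying) can be sketched locally in plain Python. This is only an illustration: the log format, the sample lines, and the function names `map_line`, `reduce_counts`, and `summarize` are assumptions, not anything shown in the talk, and a real job would run on EMR over files in S3 rather than an in-memory list.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical access-log lines; the talk does not show a log format.
SAMPLE_LOGS = [
    "2014-05-01T10:00:01 GET /api/users 200",
    "2014-05-01T10:00:02 GET /api/orders 500",
    "2014-05-01T10:00:03 POST /api/users 201",
    "2014-05-01T10:00:04 GET /api/users 200",
    "malformed line",
]

Key = Tuple[str, str]


def map_line(line: str) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((path, status), 1) for each well-formed line."""
    parts = line.split()
    if len(parts) != 4:
        return  # skip lines that do not match the assumed format
    _timestamp, _method, path, status = parts
    yield (path, status), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Counter:
    """Reduce step: sum the counts emitted for each key."""
    totals: Counter = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def summarize(lines: Iterable[str]) -> Counter:
    """Run the map step over every line, then reduce to a summary."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# The summary is the small table one might load into a warehouse.
summary = summarize(SAMPLE_LOGS)
for (path, status), hits in sorted(summary.items()):
    print(path, status, hits)
```

The point of the split is that the map step touches one line at a time, so it can be spread across many nodes, while the summarized output is small enough to store and query cheaply.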
![Page 18: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/18.jpg)
AMAZON EMR
AMAZON S3
AMAZON EC2
LOG FILES
ReST CLIENTS
WEB APP, REST APIs
LOG FILES - STORED in S3
MAP-REDUCE, HIVE, PIG, CASCADING jobs
CLOUDERA IMPALA
![Page 19: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/19.jpg)
AMAZON S3
AMAZON KINESIS
AMAZON REDSHIFT
AMAZON DYNAMODB
AMAZON RDS
AMAZON EMR
![Page 20: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/20.jpg)
AMAZON S3
DATA
![Page 21: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/21.jpg)
AMAZON S3
INPUT
AMAZON EMR
![Page 22: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/22.jpg)
AMAZON S3
INPUT
OUTPUT
AMAZON EMR
![Page 23: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/23.jpg)
AMAZON S3
INPUT
OUTPUT
AMAZON EMR
AMAZON EMR WITH SPOT instances
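One way to picture an EMR cluster with SPOT instances is as a list of instance groups, where the master and core nodes stay on-demand and extra task nodes are bid for on the spot market. The sketch below only builds that description as plain data. The key names follow the instance-group shape that boto3's EMR `run_job_flow` call accepts as far as I know, so check them against the current AWS documentation; the helper name `build_instance_groups`, the instance type, the counts, and the bid price are all hypothetical.

```python
import json
from typing import Dict, List


def build_instance_groups(
    core_count: int,
    task_count: int,
    instance_type: str,
    bid_price: str,
) -> List[Dict[str, object]]:
    """Describe a cluster: on-demand master and core, spot task nodes."""
    return [
        {
            "Name": "master",
            "InstanceRole": "MASTER",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            # Core nodes hold cluster storage, so keep them on-demand.
            "Name": "core",
            "InstanceRole": "CORE",
            "Market": "ON_DEMAND",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
        {
            # Task nodes only add compute, so losing them to a spot
            # price change slows the job rather than losing data.
            "Name": "task-spot",
            "InstanceRole": "TASK",
            "Market": "SPOT",
            "BidPrice": bid_price,  # assumed to be a string of USD/hour
            "InstanceType": instance_type,
            "InstanceCount": task_count,
        },
    ]


# Hypothetical values, chosen only for illustration.
groups = build_instance_groups(
    core_count=2, task_count=8, instance_type="m3.xlarge", bid_price="0.10"
)
print(json.dumps(groups, indent=2))
```

Because INPUT and OUTPUT both live in S3 in these slides, the cluster itself is disposable, which is what makes interruptible spot capacity a reasonable fit.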
![Page 24: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/24.jpg)
BUILDING BLOCKS
• amazon AWS
• amazon EMR
• amazon S3
• kinesis
• redshift
• spot instances
![Page 25: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/25.jpg)
![Page 26: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/26.jpg)
PROS
• like other cloud solutions - reduces the barrier to adoption
• especially if you are already in the cloud
• can provide ability to implement quick POCs
![Page 27: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/27.jpg)
![Page 28: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/28.jpg)
CONS
• depending on your current infrastructure - may end up continually replicating data
• data security, privacy
![Page 29: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/29.jpg)
LEARNINGS
• Build platforms once the need is strongly felt
• Prepare to fail fast, a couple of times, before the final version
• what you think will happen, will not
![Page 30: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/30.jpg)
LEARNINGS
• COSTS can spiral out of control
• Leverage spot instances to reduce costs, especially for bursty workloads
• S3 can be very slow when running and initializing large workloads
• especially in recovery scenarios
• but data resiliency is not an issue
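The cost points above are easy to check with back-of-the-envelope arithmetic before launching anything. The sketch below compares an all on-demand cluster with a mixed on-demand and spot cluster for one bursty job. Every number in it is a made-up assumption (prices are in cents per node-hour, not real AWS rates), and `cluster_cost_cents` is a hypothetical helper, not part of any AWS tool.

```python
def cluster_cost_cents(
    hours: int,
    on_demand_nodes: int,
    spot_nodes: int,
    on_demand_rate_cents: int,
    spot_rate_cents: int,
) -> int:
    """Total cost in cents for a cluster that runs for `hours` hours."""
    hourly = (
        on_demand_nodes * on_demand_rate_cents
        + spot_nodes * spot_rate_cents
    )
    return hours * hourly


# Assumed rates: 40 cents/hour on-demand, 10 cents/hour spot.
ON_DEMAND_RATE = 40
SPOT_RATE = 10
JOB_HOURS = 4

# Ten nodes, all on-demand.
all_on_demand = cluster_cost_cents(JOB_HOURS, 10, 0, ON_DEMAND_RATE, SPOT_RATE)

# Same ten nodes, but eight of them are spot.
mixed = cluster_cost_cents(JOB_HOURS, 2, 8, ON_DEMAND_RATE, SPOT_RATE)

savings_percent = 100 * (all_on_demand - mixed) // all_on_demand
print(all_on_demand, mixed, savings_percent)
```

Under these assumed rates the mixed cluster costs 640 cents against 1600 cents, a 60 percent saving. Real spot prices fluctuate and spot nodes can be reclaimed, so actual savings will vary.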
![Page 31: Big data in the cloud - Shekhar Vemuri](https://reader036.vdocument.in/reader036/viewer/2022062522/587850431a28ab68198b6413/html5/thumbnails/31.jpg)
www.clairvoyantsoft.com
@shekharv
linkedin.com/in/shekharvemuri