Data Analytics in Real World Geeta Chauhan @ MUM Dec 2015

Upload: geetachauhan

Post on 26-Jan-2017




Master's in Computer Application

Systems Engineer

General Manager & Technical Director

Senior Consultant

Development Director

Innovation & Research Director

Chief Technology Officer

Led 13 New Products, Features across 30+ Products

Data Driven, Multi-tier, Social Media, Mobile, Cloud, Analytics

Agile, User Centered Design, Lean Startup, Mindfulness

India

USA


Challenges for Data Analytics in Real World

Technological

Rapidly evolving Technology Stack

Shift towards Open Source to contain costs

Shift from one standard way of doing things to a contextual, use-case-driven technology stack

New types of access & usage patterns

Real Time, On-Demand, Exploratory, Internet of Things

Two different types of projects

Production Bread & Butter

Experimental - High unknowns, don’t know what you don’t know

Organizational & Cultural

ROI - lead time for first set of outcomes

Data cleansing & ingestion take 80-90% of the effort

Lack of Domain Expertise, not asking or solving for the right questions

Learning curve - crucial for successful rollout of project

Data Driven decision making still new

Comfort level with high unknowns

Test driven approach - A/B Testing
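
To make the A/B Testing point concrete, here is a minimal sketch of one common way to compare two variants: a two-proportion z-test on conversion counts. The slides do not say which test the speaker uses, so the choice of test and the function name `two_proportion_z_test` are assumptions for illustration only.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for variant A (hypothetical inputs)
    conv_b / n_b: conversions and visitors for variant B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value suggests the difference between variants is unlikely to be noise. The decision threshold and sample sizes are choices the team still has to make.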


Architectural Patterns & Solutions

Lambda Architecture

Real-time speed layer + Batch Processing layer + Serving Layer

Edge Analytics – Internet of Things

Distributed analytics closer to source

Data Center as a Computer

Cluster computing, dynamic workloads


Lambda (λ) Architecture
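
The three layers named earlier (real-time speed layer, batch processing layer, serving layer) can be sketched in a few lines. This is a toy, single-process illustration under stated assumptions: the class name `LambdaPipeline`, the event-counting workload, and the rule that no events arrive while a batch run is in progress are all hypothetical and not taken from the slides.

```python
from collections import Counter


class LambdaPipeline:
    """Toy Lambda Architecture that counts events per key."""

    def __init__(self):
        self.master_log = []         # append-only master dataset
        self.batch_view = Counter()  # view recomputed by the batch layer
        self.speed_view = Counter()  # incremental view kept by the speed layer

    def ingest(self, key):
        # Every event goes to the master dataset and to the speed layer.
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # Batch layer: recompute the view from the full master dataset.
        # Assumption: nothing is ingested while this runs, so the speed
        # view can simply be reset afterwards.
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        # Serving layer: merge the batch view with the recent speed view.
        return self.batch_view[key] + self.speed_view[key]
```

Queries stay correct between batch runs because the speed view covers exactly the events the batch view has not yet absorbed.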


Edge Analytics

Cloudlets with Edge Analytics

Video

IoT

Automotive

Source: CMU
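
The idea of running analytics closer to the source can be illustrated with a small sketch: an edge node reduces raw sensor readings to a compact summary and forwards only that summary upstream. The function name `summarize_at_edge`, the summary fields, and the threshold-based alert rule are assumptions for illustration, not something the slides or the CMU source specify.

```python
from statistics import mean


def summarize_at_edge(readings, threshold):
    """Reduce raw readings to a compact summary on the edge node.

    Only this summary, not the raw stream, would be sent to the data center.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        # Readings above the (hypothetical) threshold are kept verbatim.
        "alerts": [r for r in readings if r > threshold],
    }
```

For example, `summarize_at_edge([1.0, 2.0, 9.0], threshold=5.0)` keeps three readings' worth of information in one small record and flags only the reading above the threshold.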


Client Server Era

Small Apps, Big Servers

Static Partitioned

Cloud Era

Big Apps, Small Servers, Micro-services

Elastic Partitioned

Data Center as a Computer

Source: Andreessen Horowitz


Dynamic Workloads Resource Utilization

Distributed Systems Kernel

General Purpose dynamic shared cluster for multiple workloads

When resources become idle, can be reused by other schedulers

Source: Apache Mesos
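
The point about idle resources being reused by other schedulers can be shown with a toy model. This is not the Apache Mesos API: the class `ToyCluster` and its `offer` and `release` methods are hypothetical names, and the model tracks only CPU counts in a single process.

```python
class ToyCluster:
    """Toy shared cluster that offers idle CPUs to multiple schedulers."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.in_use = {}  # scheduler name -> CPUs currently held

    def idle_cpus(self):
        return self.total_cpus - sum(self.in_use.values())

    def offer(self, scheduler, wanted):
        """Offer idle CPUs; the scheduler takes at most what it asked for."""
        granted = min(wanted, self.idle_cpus())
        if granted > 0:
            self.in_use[scheduler] = self.in_use.get(scheduler, 0) + granted
        return granted

    def release(self, scheduler, cpus):
        """Return CPUs to the shared pool so other schedulers can use them."""
        held = self.in_use.get(scheduler, 0)
        freed = min(cpus, held)
        self.in_use[scheduler] = held - freed
        return freed
```

In this sketch a second scheduler that was short of capacity gets the CPUs as soon as the first one releases them, which is the utilization benefit the slide describes.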


Key Takeaways

Continuous Learning

Interpersonal Skills

Data Driven experimental approach

Contextual Use Case driven technology stack

Automation for rapid iterations and reproducible results

Meditation


Q & A

Contact: [email protected]


Resources

Lambda Architecture: http://lambda-architecture.net/

Edge Analytics: https://www.cs.cmu.edu/~satya/docdir/satya-edge2015.pdf

Apache Mesos Whitepaper: https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
