Page 1: Presented by Andrew Yu - Cleveland State Universitycis.csuohio.edu/~sschung/cis611/AnalyticsinMotionAndrewYu.pdf · Real-time Analytics – Druid Not Fast Enough! Crappy + Slow

Presented by Andrew Yu

Introduction

Huawei European Research Center in Munich, Germany

Problem

■ High Density Writes (100,000 events/s)

■ High Complexity Reads (100 ad-hoc queries/s)

■ High Freshness requirement (ad-hoc queries must reflect writes that are less than 1 s old)

Old Solutions: OLTP & OLAP

Scales Poorly!

Old Solutions: Apache

■ Writes

– Storm

■ Reads

– Hadoop

– Spark

■ Real-time Analytics

– Druid

Not Fast Enough!

Crappy + Slow

Analytics in Motion (AIM System)

Writes

Reads

Stores

AIM System: Goals

■ Scale separately

■ Scale seamlessly

■ Scale with performance

■ Scale

■ Scale

■ Scale

Writes

Reads

Stores

Use case (the Huawei Marketing case)

■ As a telecommunications provider

■ Gather cell-phone usage data from customers

■ Transform raw data to marketing-related attributes

■ Return real-time advertisements, promotions, and abuse warnings

■ Goal: Make a highly customized AIM System that caters to Huawei’s needs

Definitions/Eventflow

■ Event Stream Processes (ESPs): raw write data from customers

■ Business Rules (BRs): simple rules/triggers derived from ESPs for high-priority responses

■ Analytics Matrices (AMs): collection of marketing-related attributes that must be calculated from raw data

■ Real-Time Analytics (RTAs): complex BI queries derived from reads

Event Stream Processes (ESPs)

■ Get raw data -> update AMs

■ Only update attributes as necessary

■ (format AMs to be receptive to atomic updates)

■ e.g. call time, call location, caller, receiver, call duration, call cost
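
The update path above can be sketched as follows. This is a minimal illustration, not the AIM implementation: the `CallEvent` fields and the AM attribute names (`calls`, `total_duration_s`, `total_cost`, `max_duration_s`) are hypothetical, and the AM is modelled as a plain dictionary keyed by caller.

```python
from dataclasses import dataclass


@dataclass
class CallEvent:
    # Hypothetical subset of the raw event fields listed on the slide.
    caller: str
    receiver: str
    duration_s: int
    cost: float


def apply_event(am, event):
    """Update only the AM attributes that this event affects."""
    row = am.setdefault(
        event.caller,
        {"calls": 0, "total_duration_s": 0, "total_cost": 0.0, "max_duration_s": 0},
    )
    # Each attribute is a running aggregate, so one event is one small update.
    row["calls"] += 1
    row["total_duration_s"] += event.duration_s
    row["total_cost"] += event.cost
    row["max_duration_s"] = max(row["max_duration_s"], event.duration_s)
    return row


am = {}
apply_event(am, CallEvent("alice", "bob", 60, 0.5))
apply_event(am, CallEvent("alice", "carol", 120, 1.0))
```

Keeping every attribute as a running aggregate means an event never requires rescanning earlier raw data.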

Business Rules (BRs)

■ Rules must be simple

■ Evaluations must be fast

■ Optimize evaluation algorithms for fast fail/fast success

■ e.g. unusual call location/receiver -> send customer a warning about stolen device
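
A rough sketch of fast-fail rule evaluation, under stated assumptions: each rule is treated as a conjunction of simple conditions (the slide does not say whether location and receiver are combined with "and" or "or"), and the rule name, event fields, and AM attributes are all hypothetical.

```python
def rule_matches(conditions, event, am_row):
    # all() over a generator stops at the first condition that is False,
    # so a rule that cannot fire is rejected as early as possible.
    return all(cond(event, am_row) for cond in conditions)


# Hypothetical rule set: rule name -> list of conditions.
RULES = {
    "possible_stolen_device": [
        lambda e, row: e["location"] not in row["usual_locations"],
        lambda e, row: e["receiver"] not in row["known_receivers"],
    ],
}


def evaluate(event, am_row):
    """Return the names of all rules triggered by this event."""
    return [name for name, conds in RULES.items() if rule_matches(conds, event, am_row)]


am_row = {"usual_locations": {"Munich"}, "known_receivers": {"bob"}}
suspicious = evaluate({"location": "Lagos", "receiver": "mallory"}, am_row)
normal = evaluate({"location": "Munich", "receiver": "mallory"}, am_row)
```

Ordering the cheapest or most selective condition first makes the early exit more likely to pay off.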

Analytics Matrices (AMs)

■ Marketing-related base attributes

■ Lots of aggregates

■ Building blocks for RTAs

■ ~80M rows, ~2K columns

■ e.g. call-density per unit time

Real-Time Analytics (RTAs)

■ Typical BI questions that Huawei might ask

■ Ad-hoc

■ e.g. call-density at given times in a given location (roll-ups and drill-downs)
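
Roll-ups and drill-downs over AM attributes can be illustrated with a single grouping function. The row layout (`region`, `hour`, `calls`) is an assumption made for the example, not the schema used in the talk.

```python
from collections import defaultdict


def call_density(rows, keys):
    """Sum call counts grouped by the given attributes."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["calls"]
    return dict(totals)


# Hypothetical AM rows, already aggregated per region and hour.
rows = [
    {"region": "north", "hour": 9, "calls": 5},
    {"region": "north", "hour": 10, "calls": 7},
    {"region": "south", "hour": 9, "calls": 3},
]

by_region = call_density(rows, ["region"])               # roll-up
by_region_hour = call_density(rows, ["region", "hour"])  # drill-down
```

The same pre-aggregated attributes serve both queries; only the grouping keys change.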

INTEGRATING WRITES AND READS

Separate the Processes

■ Copy-on-Write

– Use the UNIX fork() paradigm: RTAs are computed on forked snapshots of the AM memory state, which are isolated from ESP updates

■ Differential Updates

– Delta “copy” is used for ESPs

– Main “copy” is used for RTAs, while periodically updating through delta copies via merge processes

– Can force immediate updates if RTA is high-priority
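
The differential-update scheme can be sketched in a few lines. This is a simplified, single-threaded illustration; the class name `DifferentialStore` and the `fresh` flag are hypothetical, and a real system would also need to coordinate concurrent writers and readers during a merge.

```python
class DifferentialStore:
    def __init__(self):
        self.main = {}   # read by RTA queries
        self.delta = {}  # written by ESP updates

    def write(self, key, value):
        # ESP path: writes only ever touch the delta.
        self.delta[key] = value

    def merge(self):
        # Periodic merge: fold the delta into main, then start a new delta.
        self.main.update(self.delta)
        self.delta.clear()

    def read(self, key, fresh=False):
        # RTA path: a high-priority query can force a merge first.
        if fresh:
            self.merge()
        return self.main.get(key)


store = DifferentialStore()
store.write("alice", 1)
before_merge = store.read("alice")             # delta not merged yet
store.merge()
after_merge = store.read("alice")
store.write("alice", 2)
forced = store.read("alice", fresh=True)       # merge forced by the query
```

The merge interval bounds how stale an ordinary query can be, which is where the freshness requirement comes in.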

Threads

■ Allow multiple threads on same data

■ Partition data, each thread gets one partition
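
A small sketch of the one-partition-per-thread idea, assuming integer subscriber IDs as keys and a simple modulo placement (both assumptions). Note that in CPython the global interpreter lock limits CPU parallelism for threads, so this only shows the structure, not the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Split (key, value) records into n partitions by key."""
    parts = [[] for _ in range(n)]
    for key, value in records:
        parts[key % n].append((key, value))
    return parts


def scan_partition(part):
    # Each worker touches only its own partition, so no locking is needed.
    return sum(value for _, value in part)


def parallel_total(records, n_threads=4):
    parts = partition(records, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(scan_partition, parts))


records = [(i, i) for i in range(100)]
total = parallel_total(records)
```

Because partitions do not overlap, the per-thread results can be combined with a plain sum at the end.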

Architectural Layering

■ ESP, Storage, and RTA nodes can each be added as necessary

Writes

Reads

Stores

Data Placement & Join Processing

■ Multiple Storage Nodes = Multiple AM Fragments

■ Each node must have BRs and dimension tables (for joins and sorts)

■ But O.K. for the use case since they are relatively small
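
The placement described above can be sketched as follows, assuming AM rows are assigned to storage nodes by subscriber ID modulo the node count and that a small region dimension table is copied to every node. `StorageNode`, `node_for`, and `query_all` are hypothetical names for this example.

```python
class StorageNode:
    def __init__(self, regions):
        self.am_fragment = {}         # subscriber_id -> AM row (this node's share)
        self.regions = dict(regions)  # full copy of the small dimension table

    def calls_by_region(self):
        # The join against the dimension table is local to the node.
        totals = {}
        for row in self.am_fragment.values():
            name = self.regions[row["region_id"]]
            totals[name] = totals.get(name, 0) + row["calls"]
        return totals


def node_for(nodes, subscriber_id):
    return nodes[subscriber_id % len(nodes)]


def query_all(nodes):
    # Merge the partial results returned by each storage node.
    merged = {}
    for node in nodes:
        for name, calls in node.calls_by_region().items():
            merged[name] = merged.get(name, 0) + calls
    return merged


REGIONS = {1: "north", 2: "south"}
nodes = [StorageNode(REGIONS) for _ in range(2)]
for sid, region_id, calls in [(10, 1, 4), (11, 2, 6), (12, 1, 1)]:
    node_for(nodes, sid).am_fragment[sid] = {"region_id": region_id, "calls": calls}
result = query_all(nodes)
```

Replicating the dimension table costs memory on every node, which is acceptable here only because the table is small.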

AIM SYSTEM IMPLEMENTATION

For Huawei Use Case

Use Case Personalization

■ Each raw data record has a primary key

■ Small BRs and dimension tables

RTA Nodes

■ Lightweight

■ Only send queries to Storage Nodes

■ Asynchronous (send answers ASAP)

ESP Nodes

■ Heavyweight

■ Lots of writes (100,000/s)

■ BR processing

■ Synchronous (Threaded processes = sensitive to OS clock)

ESP and Storage Nodes on Same Machine

■ Advantage: Share memory

■ Semi-built AMs are too big for network transfer

AM Updates

■ Custom-built Kernels

■ Native data structures for AMs

■ No need to use higher-level language like C++

■ Use ESPs on memory:

– No conditional statements

– Aggregation functions are stored in kernel, not program

– All instructions are sequential

BR Evaluation

■ 300 BRs

■ Huawei ruleset is small

■ No need for indexing

ColumnMap

■ AM data structure design

■ Fit memory cache, not memory pages

■ Whole AMs stored in MEMORY (100s of GBs)!

■ 10MB L3 cache size
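
The slides do not spell out the ColumnMap layout, so the following is only one plausible reading: rows are grouped into buckets sized to fit the cache, and values inside a bucket are stored column by column. The 8-byte value size, the interpretation of 10MB as 10 MiB, and both function names are assumptions.

```python
def rows_per_bucket(cache_bytes, n_cols, value_bytes):
    """Largest number of whole rows whose columns fit in the cache together."""
    return cache_bytes // (n_cols * value_bytes)


def columnmap_offset(row, col, n_cols, bucket_rows):
    """Index of (row, col) in a flat array laid out bucket by bucket,
    with each bucket stored column-major."""
    bucket, row_in_bucket = divmod(row, bucket_rows)
    return bucket * bucket_rows * n_cols + col * bucket_rows + row_in_bucket


# ~2K columns from the AM slide, assumed 8-byte values, 10 MiB L3 cache.
bucket_rows_for_am = rows_per_bucket(10 * 1024 * 1024, 2000, 8)

# Tiny example: 3 columns, 4 rows per bucket.
offsets = [columnmap_offset(r, c, 3, 4) for r, c in [(0, 0), (1, 0), (0, 1), (4, 0), (5, 2)]]
```

Within a bucket, consecutive rows of the same column are adjacent in memory, which suits column scans, while all columns of a row stay within one bucket, which keeps single-row updates local.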

Results & Conclusions

Does it scale well?

References

■ SnappyData: A Hybrid Transactional Analytical Store Built On Spark

– Jags Ramnarayan, Barzan Mozafari, Sumedh Wale, Sudhir Menon, Neeraj Kumar, Hemant Bhanawat, Soubhik Chakraborty, Yogesh Mahajan, Rishitesh Mishra, Kishor Bachhav

– SIGMOD Conference 2016

■ SnappyData: Streaming, Transactions, and Interactive Analytics in a Unified Engine

– Jags Ramnarayan, Barzan Mozafari, Sumedh Wale, Sudhir Menon, Neeraj Kumar, Hemant Bhanawat, Soubhik Chakraborty, Yogesh Mahajan, Rishitesh Mishra, Kishor Bachhav, 2016

■ Master's Thesis Nr. 140: Efficient Scan in Log-structured Memory Data Stores

– Kevin Bocksrocker, Markus Pilman, Donald Kossmann, Systems Group, 2015