real time in process analytics at substantially lower costs

Real Time, In-Process Analytics at Substantially Lower Costs with IBM and Redis Labs Leena Joshi VP Product Marketing

Upload: redis-labs

Post on 23-Jan-2018


TRANSCRIPT

Page 1: Real time in process analytics at substantially lower costs

Real Time, In-Process Analytics at Substantially Lower Costs with IBM and Redis Labs
Leena Joshi, VP Product Marketing

Page 2: Real time in process analytics at substantially lower costs


Agenda

• The business driver for real time, in-process analytics

• Why Redis for transactional & analytic scenarios

• IBM Power + Redis Labs : Horsepower at substantially lower costs

• Next Steps

Page 3: Real time in process analytics at substantially lower costs

The Business Driver

Page 4: Real time in process analytics at substantially lower costs

The Speed of Business

• The new standard for E2E application response time, under any load: 100 msec
• Average roundtrip internet latency: 50 msec
• Required roundtrip app response time (includes processing & multi-DB access): 50 msec
• Required DB response time: 1 msec

(Diagram: App Servers and Database)
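The budget on this slide is simple subtraction, and a short Python sketch makes it explicit. The 30 msec of application processing time used in the example call is a hypothetical figure chosen for illustration; only the 100, 50 and 1 msec values come from the slide.

```python
# Latency budget from the slide, in milliseconds.
E2E_BUDGET_MS = 100      # end-to-end response time target
INTERNET_RTT_MS = 50     # average roundtrip internet latency
DB_CALL_MS = 1           # required database response time


def app_budget_ms(e2e_ms=E2E_BUDGET_MS, rtt_ms=INTERNET_RTT_MS):
    """Time left for the application tier after the network roundtrip."""
    return e2e_ms - rtt_ms


def max_sequential_db_calls(processing_ms, db_call_ms=DB_CALL_MS):
    """Sequential database calls that still fit in the app budget.

    processing_ms is a hypothetical figure for non-database work.
    """
    remaining = app_budget_ms() - processing_ms
    return max(0, int(remaining // db_call_ms))


print(app_budget_ms())              # 50
print(max_sequential_db_calls(30))  # 20
```

With a slower database, say 10 msec per call, the same 20 msec of headroom would cover only two sequential calls, which is the argument for the 1 msec requirement.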

Page 5: Real time in process analytics at substantially lower costs

Why Do You Need Analytics At the Same Speeds?

• Lost Revenue: An offer made in context is more powerful than trying to reach the customer later. “We think you will also like...”
• Better User Experience: Anticipating intelligently what the customer is likely to do next makes for a fantastic user experience. “Let us do this for you...”
• Competitive Risk: A competitor offering a better experience or better offer will grab market and revenue share. “We do better than...”

Page 6: Real time in process analytics at substantially lower costs

Why Use Redis in Analytics

Page 7: Real time in process analytics at substantially lower costs


Who We Are

The open source home and commercial provider of Redis

Open source. The leading in-memory database platform, supporting any high performance transactional or analytics use case.

Page 8: Real time in process analytics at substantially lower costs

Redis Tops Database Popularity Rankings

• #1 NoSQL in User Satisfaction and Market Presence
• #1 NoSQL among Top 10 Data Stores
• #1 database on Docker
• #1 NoSQL database deployed in containers
• #1 in growth among top 3 NoSQL databases
• #1 database in skill demand
• #1 database in Top Paying Technologies

Page 9: Real time in process analytics at substantially lower costs

Redis is a Game Changer

• Simplicity (through Data Structures)
• Extensibility (through Redis Modules)
• Performance

Data structures: Strings, Lists, Sets, Sorted Sets, Hashes, Bitmaps, Bit fields, HyperLogLogs, Geospatial Indexes

Page 10: Real time in process analytics at substantially lower costs

Simplicity: Data Structures - Redis’ Building Blocks

• Used by developers like “Lego” blocks
• Enables data to be processed at the database level rather than the application level
• Turns complex functionality into a single command, such as:
  "Get the e-mail address of the user with the highest bid in an auction that started on July 24th at 11:00pm PST"
  ZREVRANGE 07242015_2300 0 0
• Enable solving complex problems by creating relations between data structures, using standard or custom (Lua) commands
• The result: cleaner, more elegant code and faster execution time

Page 11: Real time in process analytics at substantially lower costs


Extensibility: Modules Extend Redis Infinitely

• Add-ons that use a Redis API to seamlessly add new use cases and data structures

• Modules enjoy Redis’ simplicity, super high performance, infinite scalability and high availability

• Modules can be created by anyone. Certified by Redis Labs.

Example modules: Full Text Search, Enhanced JSON, Graph Operations, Secondary Indexes, Linear Algebra, SQL Support, Image Processing, N-Dimension Queries, …

Page 12: Real time in process analytics at substantially lower costs

Performance: The Most Powerful Database

Highest Throughput at Lowest Latency in High Volume of Writes Scenario

[Chart: Lowest number of servers needed to deliver 1 million writes/second; bars at 300, 50, 50 and 20 servers]

Benchmarks performed by Avalon Consulting Group. Benchmarks published in the Google blog.

Page 13: Real time in process analytics at substantially lower costs

Popular Redis Use Cases

• Geo Search: Location-based Applications
• Data Ingestion: High Throughput Buffering
• Social Functionality: Following, Followers, Relations
• Job & Queue: Any Business Application
• Caching: Any Web or Mobile App
• High Speed Transactions: Business Applications
• Time-Series: Time-Based Analysis
• Analytics: Real-time Computations

Page 14: Real time in process analytics at substantially lower costs

Example : Redis For Bid Management

The Application Problem

• Many users bidding on items
• Need to instantly show who’s leading, in what order and by how much
• May also need to display analytics like how many users are bidding in what range
• Disk-based DBMSs are too slow for real-time, high-scale calculations

Why Redis Rocks This

• Sorted sets automatically keep the list of users and scores updated and in order (ZADD)
• ZRANGE, ZREVRANGE will get your top users
• ZRANK will get any user’s rank instantaneously
• ZCOUNT will return a count of users in a range
• ZRANGEBYSCORE will return all the users in a range by their bids
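To make the semantics of these commands concrete, here is a minimal in-process sketch in Python. It is a stand-in that mimics the behaviour of a single sorted set, not a Redis client and not how Redis implements sorted sets; the class name `BidBoard` and its method names are hypothetical and simply echo the command names. Only non-negative indexes are handled.

```python
from bisect import bisect_left, bisect_right, insort


class BidBoard:
    """In-process stand-in for one Redis sorted set (not a Redis client)."""

    def __init__(self):
        self._score = {}    # member -> score
        self._ordered = []  # list of (score, member), kept sorted ascending

    def zadd(self, member, score):
        """Insert a member or update its score, keeping the order intact."""
        if member in self._score:
            self._ordered.remove((self._score[member], member))
        self._score[member] = score
        insort(self._ordered, (score, member))

    def zrevrange(self, start, stop):
        """Members from highest to lowest score, inclusive indexes."""
        return [m for _, m in reversed(self._ordered)][start:stop + 1]

    def zrank(self, member):
        """0-based rank counted from the lowest score."""
        return bisect_left(self._ordered, (self._score[member], member))

    def zrangebyscore(self, lo, hi):
        """Members with lo <= score <= hi, lowest score first."""
        scores = [s for s, _ in self._ordered]
        return [m for _, m in
                self._ordered[bisect_left(scores, lo):bisect_right(scores, hi)]]

    def zcount(self, lo, hi):
        """Number of members with lo <= score <= hi."""
        return len(self.zrangebyscore(lo, hi))


board = BidBoard()
for user, bid in [("id:2", 10000), ("id:1", 21000), ("id:3", 44000), ("id:4", 35000)]:
    board.zadd(user, bid)

print(board.zrevrange(0, 0))              # ['id:3']
print(board.zrank("id:1"))                # 1
print(board.zcount(20000, 40000))         # 2
print(board.zrangebyscore(20000, 40000))  # ['id:1', 'id:4']
```

Note that ZRANK counts from the lowest score; in Redis, ZREVRANK gives the position counted from the top bidder.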

Page 15: Real time in process analytics at substantially lower costs

Redis Sorted Sets

ZADD item:1 10000 id:2 21000 id:1
ZADD item:1 34000 id:3 35000 id:4
ZINCRBY item:1 10000 id:3

ZREVRANGE item:1 0 0
id:3

Resulting sorted set item:1:
id:3 44000
id:4 35000
id:1 21000
id:2 10000
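The command sequence on this slide can be replayed with plain Python dictionaries to check the final ordering. This is only an emulation of the command semantics under the assumption that all keys are meant to be `item:1` and all members follow the `id:N` pattern; the helper names mirror the Redis commands but are local functions, not a client library.

```python
from collections import defaultdict

# key -> {member: score}; a plain-dict stand-in for Redis sorted sets
store = defaultdict(dict)


def zadd(key, *score_member_pairs):
    """Mimic ZADD key score member [score member ...]."""
    for score, member in zip(score_member_pairs[::2], score_member_pairs[1::2]):
        store[key][member] = float(score)


def zincrby(key, increment, member):
    """Mimic ZINCRBY: add increment to the member's score and return it."""
    store[key][member] = store[key].get(member, 0.0) + increment
    return store[key][member]


def zrevrange_withscores(key, start, stop):
    """Mimic ZREVRANGE key start stop WITHSCORES (non-negative indexes only)."""
    ranked = sorted(store[key].items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ranked[start:stop + 1]


zadd("item:1", 10000, "id:2", 21000, "id:1")
zadd("item:1", 34000, "id:3", 35000, "id:4")
zincrby("item:1", 10000, "id:3")

print(zrevrange_withscores("item:1", 0, 0))  # [('id:3', 44000.0)]
```

The increment lifts id:3 from 34000 to 44000, which is why it overtakes id:4 and becomes the leading bid.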

Page 16: Real time in process analytics at substantially lower costs

Example : Redis For Recommendations

The Application Problem

• Users, items, likes, dislikes, similarities
• Set comparisons of user likes and user dislikes should help create similarity scores, which can then be stored in a sorted set
• Set comparisons of similar users' likes/dislikes with items not purchased by the current user should yield suggestions
• High speed and low latency requirements

Why Redis Rocks This

• Redis Sets are unordered collections of strings
• SADD to add objects to each tag
• Set operations are executed in memory, at blazing fast speeds
• SINTER, SINTERSTORE to intersect multiple sets
• SUNIONSTORE to combine (union) multiple sets
• SISMEMBER to determine membership, SMEMBERS to retrieve all values
• Sets and Sorted Sets combined are a great choice for recommendation engines
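One way to read the approach above is sketched below in Python, using built-in sets in place of Redis Sets. The slide does not name a similarity measure, so the Jaccard index used here is an assumption, as are the function names, the sample data, and the simplification of using likes only (no dislikes or purchase history).

```python
def jaccard(a, b):
    """Similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes, top_n=3):
    """Suggest items liked by similar users that `user` has not liked yet."""
    mine = likes[user]
    # Similarity score per other user (the slide would keep these in a sorted set).
    similarity = {other: jaccard(mine, theirs)
                  for other, theirs in likes.items() if other != user}
    # Weight each candidate item by the similarity of the users who like it.
    scores = {}
    for other, sim in similarity.items():
        for item in likes[other] - mine:
            scores[item] = scores.get(item, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked if score > 0][:top_n]


# Hypothetical sample data: user -> set of liked items.
likes = {
    "user:1": {"item:1", "item:3"},
    "user:2": {"item:1", "item:3", "item:14"},
    "user:3": {"item:22"},
}

print(recommend("user:1", likes))  # ['item:14']
```

In a Redis deployment the intersections and unions would map to SINTER and SUNION on per-user sets, and the similarity scores would be written with ZADD so the most similar users can be read back in order.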

Page 17: Real time in process analytics at substantially lower costs

Redis Sets

SADD item:1 tag:1 tag:22 tag:24
SADD tag:1 item:1 item:3
SADD tag:2 item:22 item:14 item:3

SINTER tag:1 tag:2
item:3

SUNIONSTORE tag:x tag:1 tag:2
SMEMBERS tag:x
item:1 item:3 item:22 item:14

Resulting sets:
item:1 = {tag:1, tag:22, tag:24}
tag:1 = {item:1, item:3}
tag:2 = {item:22, item:14, item:3}
tag:x = {item:1, item:22, item:14, item:3}
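The set operations on this slide map almost one-to-one onto Python's built-in `set`, which makes the expected results easy to check. The sketch below is an emulation, not a Redis client; it assumes the keys are `tag:1` and `tag:2` throughout and that tag:1 holds both item:1 and item:3, as the slide's diagram shows.

```python
# key -> set of members; a plain-dict stand-in for Redis Sets
store = {}


def sadd(key, *members):
    """Mimic SADD: add members to the set stored at key."""
    store.setdefault(key, set()).update(members)


def sinter(*keys):
    """Mimic SINTER: members present in every named set."""
    return set.intersection(*(store.get(k, set()) for k in keys))


def sunionstore(dest, *keys):
    """Mimic SUNIONSTORE: store the union at dest and return its size."""
    store[dest] = set().union(*(store.get(k, set()) for k in keys))
    return len(store[dest])


sadd("item:1", "tag:1", "tag:22", "tag:24")
sadd("tag:1", "item:1", "item:3")
sadd("tag:2", "item:22", "item:14", "item:3")

print(sinter("tag:1", "tag:2"))                # {'item:3'}
print(sunionstore("tag:x", "tag:1", "tag:2"))  # 4
print(sorted(store["tag:x"]))                  # ['item:1', 'item:14', 'item:22', 'item:3']
```

Because a set holds each member once, item:3 appears a single time in tag:x even though both source sets contain it.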

Page 18: Real time in process analytics at substantially lower costs

Customers Use Redis for Real Time Transactions & Analytics

Why Redis:

• Extreme throughput at low latencies – essential for simultaneous transaction and analytic processing
• At very high transactional rates, impossible to use a different datastore with equivalent ability
• Ease of implementation – built-in data structures reduce the complexity involved in both simple and complex analytics

LARGE DAILY DEALS COMPANY

• Redis used to store and serve the most up-to-date offers/coupons to customers (Strings, Hashes, Lists)
• Redis also used for personalized recommendations for customers purchasing coupons (Sets, Sorted Sets, Strings)

LARGE ONLINE MOBILE AD COMPANY

• Redis used as datastore to serve mobile ads
• Redis also used to store which ads are being served how often, top revenue generators and other drivers of financial and business reporting

Page 19: Real time in process analytics at substantially lower costs


Intuit: Redis for Recommendations

• QuickBooks Self-Employed users get rules that auto-categorize transactions

• Increases the wow factor of the app, 60% of people who use the recommended rules subscribe

• Creating and applying recommended rules is managed through the Redis backend

• Redis handles 1000s of simultaneous channels, does not even blink

Page 20: Real time in process analytics at substantially lower costs

Scopely: Redis for Probabilistic Analysis

• Scopely, next generation mobile entertainment
• Needs to generate on-the-fly game insights so games can be tailored to user preferences by location, demographic etc.
• 2.8 million events/min, 2.4 billion events/day
• Redis powers their real time system for operational monitoring/business alerts
• Ongoing analysis of current game performance, user engagement vs past
• HyperLogLog for estimation of different things – examples: cheating likelihood, anomalous installs, game play times

(Diagram: Analytics architecture)
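For readers unfamiliar with HyperLogLog, the following Python sketch shows the general idea of estimating a distinct count from a small, fixed array of registers. It is a textbook-style illustration, not Redis's implementation (which exposes the structure through PFADD and PFCOUNT and uses its own hash function and register layout); the class name, the SHA-1 hash and the choice of 4096 registers are assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook-style HyperLogLog estimator (illustrative, not Redis's code)."""

    def __init__(self, p=12):
        self.p = p                 # number of index bits
        self.m = 1 << p            # number of registers (4096 for p=12)
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from the first 8 bytes of SHA-1.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # Position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50000):
    hll.add(f"install:{i}")
    hll.add(f"install:{i}")  # duplicates do not change the estimate

print(round(hll.count()))  # close to 50000; typical error is a few percent
```

The appeal for event streams of the size quoted above is that memory stays fixed (here 4096 small registers) no matter how many events are added, at the cost of an approximate answer.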

Page 21: Real time in process analytics at substantially lower costs

New Modules for Analytics Use Cases

• redisearch: Powerful text search
• redabloom: Bloom filter
• topk: Top k most frequent
• countminsketch: Counts of observations

Page 22: Real time in process analytics at substantially lower costs

Redis & Spark

Page 23: Real time in process analytics at substantially lower costs

Spark Operation w/o Redis

(Diagram, steps 1 to 6: Data Source, Read to RDD, Deserialization, Processing, Serialization, Write to RDD, Data Sink; output feeds Analytics & BI)

Page 24: Real time in process analytics at substantially lower costs

Spark Operation with Redis

(Diagram, steps 1 and 2: the Spark-Redis connector reads filtered/sorted data from the Data Source for Processing with Spark SQL & Data Frame, and writes filtered/sorted data to the Serving Layer, which feeds Analytics & BI)

Page 25: Real time in process analytics at substantially lower costs

Accelerating Spark Time-Series with Redis

Redis is up to 100 times faster than HDFS and over 45 times faster than Tachyon or Spark

Page 26: Real time in process analytics at substantially lower costs

Cost Effective Analytics: IBM Power and Redis Labs

Page 27: Real time in process analytics at substantially lower costs

Why Redis on Power 8

• 800+ databases on a single P8 box:
  ‒ Redis is single threaded - each DB runs on a single core
  ‒ P8 supports up to 8 virtual cores on each physical core, i.e. up to 192 virtual cores in a box
  ‒ RLEC allows running multiple DBs on a single virtual core with no performance degradation
• Flash, used as RAM extender, provides significant deployment cost savings
• Strong networking capability eliminates packets/sec and bandwidth bottlenecks and keeps latencies at <1 msec

(Diagram: IBM Power 8 (24 cores, 192 vcores, RAM, Flash) with CAPI, the Coherent Accelerator Processor Interface)
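The core-count claim can be checked with a few lines of Python. The inputs are the figures quoted on the slide (24 physical cores, 8 virtual cores per physical core, 800 databases); treating "800+" as exactly 800 is a simplification for the arithmetic.

```python
import math

PHYSICAL_CORES = 24    # per IBM Power 8 box, from the slide
THREADS_PER_CORE = 8   # virtual cores per physical core
DATABASES = 800        # "800+" on the slide, taken as 800 here

vcores = PHYSICAL_CORES * THREADS_PER_CORE
dbs_per_vcore = DATABASES / vcores

print(vcores)                    # 192
print(round(dbs_per_vcore, 2))   # 4.17
print(math.ceil(dbs_per_vcore))  # 5
```

So reaching 800 databases means placing four to five databases on each virtual core, which is where the multi-database-per-core capability attributed to RLEC comes in.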

Page 28: Real time in process analytics at substantially lower costs

Redis on Flash – A New Concept

Flash used as RAM extender

Page 29: Real time in process analytics at substantially lower costs

How to Achieve Optimal Price/Performance

By dynamically setting RAM/Flash ratio

Behind the scenes…

Page 30: Real time in process analytics at substantially lower costs

Deployment Options

Standard Servers
• S822LC, 32 GB to 1 TB memory
• or S822L or S812L with external IBM Flash

IBM Data Engine for NoSQL
• S822LC + 2 TB Flash
• S812L + 4 TB Flash
• S822L + 8 TB Flash

Appliances
• RL400: 3x S812LC, 30 cores, 384 GB RAM
• RL7000: 3x S822LC, 60 cores, 768 GB RAM, 6 TB Flash
• RL14000: 3x S822L, 60 cores, 1536 GB RAM, 12 TB Flash

Page 31: Real time in process analytics at substantially lower costs

Customer Example : Redis on Flash

• Genome dataset: 31 TB of raw data
• Optimized data set through encoding and using Redis Hashes
• Resulting data runs high speed analyses with 55GB of RAM and 4.5TB of Flash
• 97% annual savings compared to a pure RAM solution

             | Redis on RAM             | Redis on Flash
RAM size     | 5TB                      | 0.5TB
Flash size   | N/A                      | 4.5TB
Servers      | on AWS: 21x r3.8xlarge   | on P8: 2x S822LC
1yr costs    | $489,333                 | $15,677
P8 savings   |                          | 97%
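The 97% figure follows directly from the two annual cost numbers in the table, as this short Python check shows. The function name is illustrative; the dollar amounts are the ones quoted on the slide.

```python
def annual_savings(baseline_cost, new_cost):
    """Fraction of the baseline cost saved by the cheaper option."""
    return 1 - new_cost / baseline_cost


ram_only_cost = 489_333  # 21x r3.8xlarge on AWS, 1 year
flash_cost = 15_677      # 2x S822LC with Flash on P8, 1 year

savings = annual_savings(ram_only_cost, flash_cost)
print(f"{savings:.1%}")  # 96.8%
```

Rounded to the nearest whole percent, this matches the 97% stated on the slide.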

Page 32: Real time in process analytics at substantially lower costs

Comparison : P8 vs Dell

Specs           | IBM Power8                                  | Dell PowerEdge
Model           | S822LC                                      | Dell PowerEdge R820
CPU             | 20 cores, 160 threads (vcores) @ 2.92 GHz   | 2x Intel® Xeon® Processor E5-4657L v2; 24 cores/48 threads (vcores) @ 2.4 GHz
RAM size        | 256GB                                       | 256GB
Flash size      | 2TB                                         | 2TB
Flash IOPS      | 700k                                        | 300k
Network         | 2x10 Gbps                                   | 2x10 Gbps
Price w/o Flash | $17,515                                     | $23,033
Price w Flash   | $23,515                                     | $26,033

• Higher number of virtual cores on equivalent boxes
• Higher throughput from IBM Flash
• 24% lower cost

Page 33: Real time in process analytics at substantially lower costs

Power 8 Outperforms on RAM

Redis on RAM                                 | IBM POWER8 | Dell PowerEdge
Price w/o Flash                              | $17,515    | $23,033
# of shards (dedicated)                      | 80         | 12
$/shard (dedicated)                          | $218.94    | $1919.42
# of shards (multi-tenant)                   | 320        | 48
$/shard (multi-tenant)                       | $54.73     | $479.85
Max throughput (ops/sec) at sub-msec latency | 4,000,000  | 2,400,000
$/transaction/sec                            | $0.004     | $0.01
IBM Throughput % Gain                        | 67%        |
IBM Cost Savings                             | 55%        |

• Higher core count translates to higher number of Redis shards, each capable of handling very high throughput
• IBM Power 8 delivers 67% higher throughput at 55% lower cost
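The derived rows of this comparison (cost per shard, cost per unit of throughput, throughput gain and cost savings) can be recomputed from the prices, shard counts and throughput figures on the slide. The Python sketch below does that; the `Server` class and its method names are illustrative. Working from unrounded values, the cost saving comes out at about 54%, slightly below the 55% shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    price_usd: float
    dedicated_shards: int
    multi_tenant_shards: int
    max_ops_per_sec: int

    def usd_per_shard(self, multi_tenant=False):
        """Purchase price divided by the number of Redis shards hosted."""
        shards = self.multi_tenant_shards if multi_tenant else self.dedicated_shards
        return self.price_usd / shards

    def usd_per_op_per_sec(self):
        """Purchase price divided by peak throughput at sub-msec latency."""
        return self.price_usd / self.max_ops_per_sec


# Figures as quoted on the slide (price without Flash).
power8 = Server("IBM POWER8", 17515, 80, 320, 4_000_000)
dell = Server("Dell PowerEdge", 23033, 12, 48, 2_400_000)

throughput_gain = power8.max_ops_per_sec / dell.max_ops_per_sec - 1
cost_savings = 1 - power8.usd_per_op_per_sec() / dell.usd_per_op_per_sec()

print(round(power8.usd_per_shard(), 2), round(dell.usd_per_shard(), 2))  # 218.94 1919.42
print(f"{throughput_gain:.0%}")  # 67%
print(f"{cost_savings:.0%}")     # 54%
```

The same two ratios applied to the with-Flash prices and throughput figures give the gain and savings quoted for the Flash configuration.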

Page 34: Real time in process analytics at substantially lower costs

Power 8 Outperforms on Flash

Redis on Flash                               | IBM Power8 | Dell PowerEdge
Price with Flash                             | $23,515    | $26,033
Max throughput (ops/sec) at sub-msec latency | 200,000    | 66,000
$/transaction/sec                            | $0.12      | $0.39
IBM Throughput % Gain                        | 200%       |
IBM Cost Savings                             | 70%        |

• More efficient IBM Flash delivers 200% higher throughput at 70% lower cost!

Page 35: Real time in process analytics at substantially lower costs

Redis Labs Appliances

Manufactured by IBM, private-labelled by Avnet

• RL400 – for OLTP use cases: 384GB RAM, no Flash, 30 cores
• RL7000 – for OLTP/OLAP use cases: 768GB RAM, 6TB Flash, 60 cores
• RL14000 – for OLAP use cases: 1,536GB RAM, 12TB Flash, 60 cores

Page 36: Real time in process analytics at substantially lower costs


Next Steps

Learn more about Redis for Transactions & Analytics by contacting [email protected]

To learn more about Redis on Power Systems visit http://ibm.co/2blGNqY or call 1-866-872-3902 (Code: Power)

Page 37: Real time in process analytics at substantially lower costs

THANK YOU!