Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
![Page 1: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/1.jpg)
RRE: Faster than SAS
Results from Benchmarking
Thomas W. Dinsmore, Revolution Analytics
John Wallace, DataSong
![Page 2: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/2.jpg)
Polling Question
Do you currently use:
– A) R or Revolution R Enterprise (RRE)
– B) SAS
– C) Both
– D) Neither
![Page 3: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/3.jpg)
Benchmarking RRE vs. SAS
Background
Approach
Results
Discussion
![Page 4: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/4.jpg)
Revolution R Enterprise
Open source R
Commercially supported distribution
Enhanced for enterprise use:
– Scalable analytics
– Developer tools
– Integration tools
– Deployment tools
![Page 5: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/5.jpg)
2012: Allstate Benchmark
Poisson Regression, 150MM rows
[Bar chart: Runtime, Minutes. SAS PROC GENMOD: 300; RRE: 6]
![Page 6: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/6.jpg)
Criticism: “Apples to Oranges”
RRE: 20 Cores
SAS: 16 Cores
![Page 7: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/7.jpg)
Most SAS/STAT PROCs (including PROC GENMOD) run single-threaded.
SAS/STAT: 91 PROCs
• 69 single-threaded
• 13 multi-threaded
• 9 distributed (if you license SAS HP Statistics)
![Page 8: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/8.jpg)
![Page 9: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/9.jpg)
2013: SAS Benchmark
PROC HPGENSELECT
– SAS/STAT
– SAS High Performance Statistics
Massive grid (140/144 nodes)
– 16 cores per node
– 2,240/2,304 cores
Conclusion: SAS on 2,304 cores is competitive with RRE on 20 cores.
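The core counts on this slide follow from the node counts, and the gap between the two setups can be put as a single ratio. The sketch below only redoes that arithmetic; the variable and function names are illustrative and not part of either benchmark.

```python
# Figures come from the slide; the names are illustrative.
CORES_PER_NODE = 16
SAS_GRID_NODES = (140, 144)  # the slide quotes two node counts
RRE_CORES = 20


def total_cores(nodes: int, cores_per_node: int = CORES_PER_NODE) -> int:
    """Total cores in a grid of identical nodes."""
    return nodes * cores_per_node


sas_cores = [total_cores(n) for n in SAS_GRID_NODES]
core_ratio = max(sas_cores) / RRE_CORES

print(sas_cores)             # [2240, 2304]
print(round(core_ratio, 1))  # 115.2
```

On these numbers, the SAS grid has roughly 115 times as many cores as the RRE setup it is described as competitive with.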
![Page 10: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/10.jpg)
Honest Benchmarking
Compare RRE and SAS/STAT performance
– Same data
– Same environment
– Same tasks
Test under real-world conditions
Make the test fair and transparent
![Page 11: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/11.jpg)
Data
Manufactured data
Reproducible in any environment
Designed to emulate “typical” working data
“Entity” tables: 1MM, 5MM rows
“Predict” tables: 10MM, 50MM rows
[Diagram labels: Fact/Predict table; Entity 1; Entity 2; Entity key; 571 Columns; 21 Columns]
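A minimal sketch of what "manufactured, reproducible" data can look like, assuming a seeded random generator. This is not the generator used in the benchmark: the function name, column names, and distributions are hypothetical. The only detail taken from the slide is the 10:1 ratio of "Predict" rows to "Entity" rows (10MM:1MM and 50MM:5MM).

```python
import random


def make_tables(n_entities, predict_rows_per_entity=10, n_entity_cols=3, seed=42):
    """Build toy 'entity' and 'predict' tables from a fixed seed.

    Hypothetical layout: each entity row has a key plus numeric columns;
    each predict row points back to an entity through entity_key.
    """
    rng = random.Random(seed)  # fixed seed: same tables in any environment
    entity = [
        {"entity_key": key,
         **{f"x{j}": rng.gauss(0, 1) for j in range(n_entity_cols)}}
        for key in range(n_entities)
    ]
    predict = [
        {"entity_key": rng.randrange(n_entities), "y": rng.random()}
        for _ in range(n_entities * predict_rows_per_entity)
    ]
    return entity, predict


first = make_tables(5)
second = make_tables(5)
print(first == second)                 # True
print(len(first[0]), len(first[1]))    # 5 50
```

Because the seed fully determines the output, anyone can regenerate identical tables rather than having to ship the data itself.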
![Page 12: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/12.jpg)
Benchmarking Environment
SAS 9.4:
• Base
• STAT
• Grid Manager
Commodity servers:
• 4 cores
• 16GB Memory
Gbit network
CentOS
RRE 7.0
Platform LSF 9
![Page 13: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/13.jpg)
Analytic Tasks
| Task | SAS Capability | RRE Capability |
| --- | --- | --- |
| Descriptive Statistics | PROC SURVEYMEANS | rxSummary |
| Median and Deciles | PROC SURVEYMEANS | rxQuantile |
| Frequency Distribution | PROC FREQ | rxCube |
| Linear Regression (Numeric predictors) | PROC REG, HPREG | rxLinMod |
| Linear Regression (Mixed predictors) | PROC GENMOD | rxLinMod |
| Stepwise Linear (100 predictors) | PROC REG | rxLinMod/rxStepControl |
| Logistic Regression | PROC LOGISTIC | rxLogit |
| Generalized Linear | PROC GENMOD | rxGLM |
| K-Means Clustering | PROC FASTCLUS | rxKMeans |
| Score | PROC SCORE | rxPredict |
![Page 14: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/14.jpg)
Preparation
Generated data with randomized procedure
Loaded data into native formats:
– RRE: XDF file
– SAS: SAS DATA set
Generation and load times not included
No meaningful differences
![Page 15: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/15.jpg)
RRE: 42 Times Faster Than SAS 9.4
Complete script: ten analytic tasks.
[Bar chart: Runtime, Seconds, N=5,000,000. SAS 9.4: 5,192 (~1 hour, 26 minutes); RRE: 124 (~2 minutes)]
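The "42 times" headline can be checked from the two runtimes on this slide. The sketch below assumes 5,192 seconds belongs to SAS 9.4 and 124 seconds to RRE, as the "~1 hour, 26 minutes" and "~2 minutes" annotations indicate; the helper names are illustrative.

```python
# Runtimes in seconds for the complete ten-task script (from the slide).
SAS_SECONDS = 5192
RRE_SECONDS = 124


def speed_multiple(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the fast run is than the slow run."""
    return slow_seconds / fast_seconds


def as_hms(seconds: int) -> str:
    """Format a whole number of seconds as hours, minutes, seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


multiple = speed_multiple(SAS_SECONDS, RRE_SECONDS)
print(round(multiple))      # 42
print(as_hms(SAS_SECONDS))  # 1h 26m 32s
print(as_hms(RRE_SECONDS))  # 0h 02m 04s
```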
![Page 16: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/16.jpg)
RRE: Linear Scalability
RRE: consistent performance with increased data volume.
[Line chart: Runtime, Seconds vs. # Rows in Entity Table]
| # Rows in Entity Table | RRE 7 (seconds) | SAS 9.4 (seconds) |
| --- | --- | --- |
| 1,000,000 | 68 | 623 |
| 5,000,000 | 124 | 5,192 |
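One way to read the scalability chart is to compare how much runtime grows when the data grows fivefold. The sketch below assumes the four plotted values pair up as RRE 7: 68 and 124 seconds, SAS 9.4: 623 and 5,192 seconds, at 1MM and 5MM entity rows. That pairing is read off the flattened chart and is an assumption.

```python
# Runtimes in seconds at 1MM and 5MM entity rows (assumed pairing, see above).
ROWS = (1_000_000, 5_000_000)
RUNTIME_SECONDS = {"RRE 7": (68, 124), "SAS 9.4": (623, 5192)}


def growth_factor(small: float, large: float) -> float:
    """Ratio of the larger measurement to the smaller one."""
    return large / small


data_growth = growth_factor(*ROWS)
growth = {
    product: growth_factor(t_small, t_large)
    for product, (t_small, t_large) in RUNTIME_SECONDS.items()
}

for product, runtime_growth in growth.items():
    print(f"{product}: data x{data_growth:.0f}, runtime x{runtime_growth:.2f}")
# RRE 7: data x5, runtime x1.82
# SAS 9.4: data x5, runtime x8.33
```

On these figures, RRE runtime grows more slowly than the data, while SAS runtime grows faster than the data.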
![Page 17: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/17.jpg)
RRE: Up to 350X Faster Than SAS
[Bar chart: RRE Speed Multiple by task, N=5MM]
| Task | RRE Speed Multiple |
| --- | --- |
| Stats | 213 |
| Quintiles | 185 |
| Freq | 351 |
| Lin Reg 1 | 39 |
| Lin Reg 2 | 37 |
| Step Lin | 19 |
| Logistic | 58 |
| GLM | 18 |
| Kmeans 1 | 101 |
| Kmeans 2 | 32 |
![Page 18: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/18.jpg)
Why is RRE faster than SAS?
RRE supports scalable computing out of the box
– Multi-threaded processing
– Distributed processing
Legacy SAS is mostly single-threaded
– DATA Step processing
– Most SAS/STAT PROCs
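To make the contrast between multi-threaded and single-threaded processing concrete, here is a generic split-and-combine sketch: the data is cut into chunks, each chunk is summarized independently, and the partial results are merged. It illustrates the general pattern only and is not how RRE or SAS is implemented. The function names are hypothetical, and Python threads here show the structure rather than a real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_summary(chunk):
    """Partial result for one chunk: (row count, sum of values)."""
    return len(chunk), sum(chunk)


def parallel_mean(values, chunk_size=1000, workers=4):
    """Mean computed from per-chunk partial results."""
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    # Each chunk is summarized independently, so workers never share state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_summary, chunks))
    total_rows = sum(n for n, _ in partials)
    total_sum = sum(s for _, s in partials)
    return total_sum / total_rows


data = list(range(1, 10_001))
print(parallel_mean(data))  # 5000.5
```

Because only small partial results are combined, the same pattern extends from threads on one machine to workers on several machines.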
![Page 19: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/19.jpg)
SAS HP PROCs
9 new SAS PROCs
Bundled into SAS 9.4
Designed for scalability
Multiple operating modes:
– Single machine
– Distributed (must license SAS HP Statistics)
![Page 20: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/20.jpg)
HP PROCs: Minimal Improvement
Linear regression, 20 predictors
HPREG running in single machine mode.
[Bar chart: Runtime, Seconds, N=5,000,000. SAS PROC REG: 267.17; SAS PROC HPREG: 253.82; RRE rxLinMod: 6.8]
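The "minimal improvement" can be quantified from the three runtimes on this slide. The sketch assumes 267.17 seconds is PROC REG, 253.82 seconds is PROC HPREG, and 6.8 seconds is rxLinMod. That assignment is inferred from the slide title and the flattened chart and is not stated explicitly.

```python
# Runtimes in seconds, N=5,000,000 (assignment to procedures is assumed).
RUNTIME_SECONDS = {"PROC REG": 267.17, "PROC HPREG": 253.82, "rxLinMod": 6.8}

# Fraction of PROC REG runtime saved by switching to PROC HPREG.
hpreg_gain = 1 - RUNTIME_SECONDS["PROC HPREG"] / RUNTIME_SECONDS["PROC REG"]

# How many times faster rxLinMod is than each SAS procedure.
multiple_vs_reg = RUNTIME_SECONDS["PROC REG"] / RUNTIME_SECONDS["rxLinMod"]
multiple_vs_hpreg = RUNTIME_SECONDS["PROC HPREG"] / RUNTIME_SECONDS["rxLinMod"]

print(f"HPREG vs REG: {hpreg_gain:.1%} faster")        # HPREG vs REG: 5.0% faster
print(f"rxLinMod vs REG: {multiple_vs_reg:.1f}x")      # rxLinMod vs REG: 39.3x
print(f"rxLinMod vs HPREG: {multiple_vs_hpreg:.1f}x")  # rxLinMod vs HPREG: 37.3x
```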
![Page 21: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/21.jpg)
Summary
RRE is faster than Legacy SAS:
– Same tasks
– Same hardware
RRE speed:
– Efficient engineering
– Multi-threaded and distributed processing
SAS performance claims:
– Massive hardware requirements
– Force you to license more software from SAS
– Don’t apply to Legacy SAS
![Page 22: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/22.jpg)
Polling Question
Which of the following analytic software benefits is most important to you:
– A) Completing projects faster
– B) Building better predictive models
– C) High performance with low infrastructure costs
![Page 23: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/23.jpg)
John Wallace, Founder & CEO
![Page 24: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/24.jpg)
DataSong at a Glance
Background
Approaching $1 trillion in revenue analyzed. $3 billion in marketing spend under our lens.
Experienced 60+ person team based in San Francisco with offices in Seattle, Los Angeles, Singapore, and India.
Founded in 2003 with a proven history of solving difficult analytics problems. Evolved from consulting through close partnerships with our clients.
Our Offerings
Customer interaction insight that powers applications for customer-level revenue attribution, targeting, and media optimization.
Descriptive and predictive modeling of hidden trends and relationships in big data.
Custom development including applications, process automation, and decision support solutions.
![Page 25: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/25.jpg)
DataSong Offerings
Hosted Applications
● Revenue Attribution
● Customer Targeting
● Marketing Planning
We know Big Data. We analyze and provide the “so what”.
![Page 26: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/26.jpg)
DataSong Architecture
• ETL
• N marketing channels
• Behavioral variables
• Promotional data
• Overlay data
• Functions to read Hadoop output; xdf creation
• Exploratory data analysis
• GAM survival models
• Scoring for inference
• Scoring for prediction
• 5 billion scores per day per customer
DATASONG DATA FORMAT (DDF)
CUSTOM VARIABLES (PMML)
![Page 27: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/27.jpg)
Where Speed Matters
3 key dimensions
● how many rows
● how many variables
● how many iterations of a model
Trade-offs for speed
● Sampling variance
● Test fewer features
● Have less understanding of the signal
The 3rd dimension (model iterations) means we must multiply any benchmark runtime by N, the number of iterations.
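As a rough illustration of multiplying a benchmark by N, the sketch below scales the complete-script runtimes reported earlier in this deck (5,192 seconds for SAS 9.4, 124 seconds for RRE) by a number of model iterations. The value N = 20 is a made-up example, not a figure from the presentation, and treating every iteration as a full script run is a simplifying assumption.

```python
# Complete-script runtimes in seconds, from the earlier benchmark slide.
SCRIPT_SECONDS = {"SAS 9.4": 5192, "RRE": 124}

N_ITERATIONS = 20  # hypothetical number of model iterations


def total_hours(seconds_per_run: float, iterations: int) -> float:
    """Wall-clock hours if every iteration costs one full run."""
    return seconds_per_run * iterations / 3600


for product, seconds in SCRIPT_SECONDS.items():
    hours = total_hours(seconds, N_ITERATIONS)
    print(f"{product}: {hours:.1f} hours for {N_ITERATIONS} iterations")
# SAS 9.4: 28.8 hours for 20 iterations
# RRE: 0.7 hours for 20 iterations
```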
![Page 28: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/28.jpg)
![Page 29: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/29.jpg)
![Page 30: Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed](https://reader030.vdocument.in/reader030/viewer/2022020720/54c62bdd4a7959b5078b4572/html5/thumbnails/30.jpg)
Thank You