Revolution R: 100% R and More


Upload: revolution-analytics

Post on 10-May-2015


TRANSCRIPT

Page 1: Revolution R: 100% R and more

Revolution Confidential

Revolution R: 100% R and More

Presented by: David Smith, VP Marketing, Revolution Analytics

Page 2: Revolution R: 100% R and more


August 24, 2011: Welcome!

Thanks for coming. Slides and replay available (soon) at:

http://bit.ly/railcj

David Smith
VP Marketing, Revolution Analytics
Editor, Revolutions blog: http://blog.revolutionanalytics.com
Twitter: @revodavid

Page 3: Revolution R: 100% R and more


In today’s webcast:

About Revolution Analytics and R

What Revolution R adds to R

Resources for getting more from R

Q&A

Introducing Revolution R

Page 4: Revolution R: 100% R and more


What is R?

• Data analysis software
• A programming language
  • Development platform designed by and for statisticians
• An environment
  • Huge library of algorithms for data access, data manipulation, analysis and graphics
• An open-source software project
  • Free, open, and active
• A community
  • Thousands of contributors, 2 million users
  • Resources and help in every domain

Download the White Paper

R is Hot

Page 5: Revolution R: 100% R and more


Source: http://r4stats.com/popularity

R is exploding in popularity and functionality

[Chart: Scholarly Activity. Google Scholar hits ('05-'09 CAGR)]

R       46%
Stata   10%
S-Plus   0%
SAS    -11%
SPSS   -27%

[Chart: Package Growth. Number of R packages listed on CRAN, 2002 to 2010 (y-axis 0 to 2500)]

“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base – without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…”

Product Marketing Manager, SAS Institute, Inc.

“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”

Deputy Editor for New Products at Forbes

Page 6: Revolution R: 100% R and more


[Chart: MSFT stock price, 2009-01-02 to 2010-03-31. Last: 29.29. Volume (millions): 63,760,000. Moving Average Convergence Divergence (12,26,9): MACD 0.702, Signal 0.712]

3000+ R Packages from the Open Source community

Time Series analysis

Portfolio Optimization

Econometrics

Genomics

Clinical Trials

Bayesian Inference

Survival analysis

Social Networks

Data Visualization

Data APIs (Twitter)

.. and more

Page 7: Revolution R: 100% R and more


R User Community

From: The R Ecosystem

bit.ly/R-ecosystem

Page 8: Revolution R: 100% R and more


Revolution R Enterprise is

Page 9: Revolution R: 100% R and more


R Productivity Environment (Windows)

• Script with type ahead and code snippets
• Solutions window for organizing code and data
• Packages installed and loaded
• Objects loaded in the R Environment
• Object details
• Sophisticated debugging with breakpoints, variable values, etc.

http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm

Page 10: Revolution R: 100% R and more


Interactive Debugging

• One-click to set a breakpoint in an R script
• Step in/out/over, inspect variables
• Eliminate the edit -> browser -> repair cycle
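For context, the edit -> browser -> repair cycle presumably refers to debugging in plain R by editing a function to add a `browser()` call, re-running it, and then removing the call again. A minimal sketch of that manual workflow follows; the function `scale_to_unit` is a hypothetical example and not taken from the slides.

```r
# Hypothetical function under investigation (not from the slides).
scale_to_unit <- function(x) {
  rng <- range(x)
  # Manual approach: edit the source to pause here and inspect `rng`,
  # then remove this line once the problem has been found.
  # Guarded so that the sketch also runs non-interactively.
  if (interactive()) browser()
  (x - rng[1]) / (rng[2] - rng[1])
}

result <- scale_to_unit(c(2, 4, 6))
print(result)
```

A breakpoint set from an IDE avoids touching the source file at all, which is the cycle the slide says is eliminated.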

Page 11: Revolution R: 100% R and more


Coming soon: Revolution R GUI

• Accessible
• Powerful
• Extensible

Page 12: Revolution R: 100% R and more


Performance: Multi-threaded Math

Open Source R vs. Revolution R Enterprise

Computation (4-core laptop)         Open Source R   Revolution R   Speedup
Linear Algebra [1]
  Matrix Multiply                   327 sec         13.4 sec       23x
  Cholesky Factorization            31.3 sec        1.8 sec        17x
  Linear Discriminant Analysis      216 sec         74.6 sec       2x
General R Benchmarks [2]
  R Benchmarks (Matrix Functions)   22 sec          3.5 sec        5x
  R Benchmarks (Program Control)    5.6 sec         5.4 sec        Not appreciable

[1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
[2] http://r.research.att.com/benchmarks/
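To get a feel for the kind of operations in the benchmark table, the sketch below times a matrix multiply, a cross-product and a Cholesky factorization in whatever R installation runs it. It is not the benchmark script behind the figures above: the matrix size and seed are arbitrary assumptions, chosen so that it finishes quickly, and the timings depend entirely on the machine and the math library R is linked against.

```r
# Time a few dense linear-algebra operations (sizes are illustrative only).
set.seed(123)
n <- 500
a <- matrix(rnorm(n * n), nrow = n)

timings <- c(
  matrix_multiply = system.time(b <- a %*% a)[["elapsed"]],
  crossproduct    = system.time(s <- crossprod(a))[["elapsed"]],
  cholesky        = system.time(r <- chol(s))[["elapsed"]]
)
print(timings)

# Sanity check: the Cholesky factor reproduces the cross-product matrix.
stopifnot(isTRUE(all.equal(crossprod(r), s)))
```

Running the same script under two R builds gives a rough like-for-like comparison of their linear-algebra performance.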

Page 13: Revolution R: 100% R and more


Three Paradigms for Big Data

Standard R engine is constrained by capacity and performance

Revolution R Enterprise offers three methods for big data with R:

• Off-line: parallel out-of-memory analytics
• Off-line, distributed analytics
• On-line, in-database analytics

Hadoop Netezza

Page 14: Revolution R: 100% R and more


Revolution R Enterprise with RevoScaleR

Big Data Statistics in R

www.revolutionanalytics.com/bigdata

Every US airline departure and arrival, 1987-2008

File: AirlineData87to08.xdf
Rows: 123.5 million
Variables: 29
Size on disk: 13.2 GB

arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime), cube = TRUE)
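The slides do not describe how RevoScaleR fits this model. As a rough illustration of why a model built only from categorical cells can be fitted without holding all rows in memory, the base R sketch below reads a data set in chunks and keeps only per-cell sums and counts. Everything in it is an assumption made for illustration: the data are synthetic, `CRSDepHour` stands in for `F(CRSDepTime)`, and the chunk size is arbitrary.

```r
# Synthetic stand-in for the airline data (hypothetical values, not the real file).
set.seed(42)
n <- 10000
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
flights <- data.frame(
  DayOfWeek  = factor(sample(days, n, replace = TRUE), levels = days),
  CRSDepHour = factor(sample(0:23, n, replace = TRUE)),
  ArrDelay   = rnorm(n, mean = 5, sd = 20)
)

# Process the rows one chunk at a time, accumulating only cell sums and counts.
chunk_rows <- split(seq_len(n), ceiling(seq_len(n) / 1000))
cell_sum <- 0
cell_n   <- 0
for (rows in chunk_rows) {
  chunk    <- flights[rows, ]
  cell_sum <- cell_sum + xtabs(ArrDelay ~ DayOfWeek + CRSDepHour, data = chunk)
  cell_n   <- cell_n   + xtabs(~ DayOfWeek + CRSDepHour, data = chunk)
}

# Mean arrival delay for every day-of-week by departure-hour cell.
cell_means <- cell_sum / cell_n

# Same result as computing the cell means on the full data at once.
full_means <- tapply(flights$ArrDelay,
                     list(flights$DayOfWeek, flights$CRSDepHour), mean)
stopifnot(isTRUE(all.equal(as.vector(cell_means), as.vector(full_means))))
```

Because the accumulators have a fixed size (7 by 24 here), memory use does not grow with the number of rows processed.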

Page 16: Revolution R: 100% R and more


[Diagram: a Master Node (RevoScaleR) and four Compute Nodes (RevoScaleR), each compute node with its own Data Partition]

• Portions of the data source are made available to each compute node

• RevoScaleR on the master node assigns a task to each compute node

• Each compute node independently processes its data, and returns its intermediate results back to the master node

• The master node aggregates the intermediate results from all compute nodes and produces the final result

RevoScaleR – Distributed Computing

*Available for Microsoft HPC Server, November 2011

Video demo: http://bit.ly/riUBgs
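The four steps on this slide can be sketched in base R for a simple linear regression. This is not RevoScaleR code: the data are synthetic, `node_task` is a hypothetical name, and `lapply` on one machine stands in for the compute nodes. Each "node" returns small intermediate matrices, and the "master" sums them and solves for the coefficients.

```r
# Synthetic data, split into four partitions (one per hypothetical compute node).
set.seed(1)
n <- 4000
dat <- data.frame(x = rnorm(n))
dat$y <- 2 + 3 * dat$x + rnorm(n)
partitions <- split(dat, rep(1:4, length.out = n))

# Task run independently on each partition: return only small summaries.
node_task <- function(part) {
  X <- cbind(1, part$x)                      # design matrix with intercept
  list(xtx = crossprod(X), xty = crossprod(X, part$y))
}
intermediate <- lapply(partitions, node_task)

# Master step: aggregate the intermediate results and produce the final fit.
xtx  <- Reduce(`+`, lapply(intermediate, `[[`, "xtx"))
xty  <- Reduce(`+`, lapply(intermediate, `[[`, "xty"))
beta <- solve(xtx, xty)
print(beta)

# Agrees with fitting the model on the full data in one pass.
stopifnot(isTRUE(all.equal(as.vector(beta), unname(coef(lm(y ~ x, data = dat))))))
```

Only the 2-by-2 and 2-by-1 summaries travel back to the master, so the amount of data exchanged does not depend on the size of each partition.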

Page 17: Revolution R: 100% R and more


Revolution Analytics with Netezza Appliance

More info: http://bit.ly/R-Netezza

Page 18: Revolution R: 100% R and more


Revolution Analytics with Hadoop

• Connectors to HDFS and HBase for interacting with data stores directly in R

• Hadoop Streaming package for executing MapReduce jobs from R

[Diagram labels: R Client, Job Tracker, Task Node, Task Tracker, Map or Reduce, R, HDFS]
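The slides do not show the package's interface, so the sketch below only illustrates the MapReduce idea itself in base R, on one machine. The input records, the `map_record` and `reduce_values` functions and the "carrier,delay" format are all hypothetical.

```r
# Hypothetical input: one "carrier,delay" record per line.
records <- c("AA,10", "UA,5", "AA,20", "DL,0", "UA,15")

# Map: turn each record into a (key, value) pair.
map_record <- function(line) {
  fields <- strsplit(line, ",", fixed = TRUE)[[1]]
  list(key = fields[1], value = as.numeric(fields[2]))
}
pairs <- lapply(records, map_record)

# Shuffle: group the values by key.
keys    <- vapply(pairs, function(p) p$key, character(1))
values  <- vapply(pairs, function(p) p$value, numeric(1))
grouped <- split(values, keys)

# Reduce: collapse each key's values to a single result (here, the mean delay).
reduce_values <- function(v) mean(v)
mean_delay <- vapply(grouped, reduce_values, numeric(1))
print(mean_delay)
```

In a real Hadoop job the map and reduce steps run on different task nodes and the framework performs the grouping between them.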

Page 19: Revolution R: 100% R and more


Enterprise Readiness: Revolution R Enterprise Server

• Multi-User Support
• Production Applications
• Integrate R analytics into web-based applications
  • Data Analysis and Visualization
  • Reporting
  • Dashboards
  • Interactive applications

Revolution R Enterprise Server with RevoDeployR

Page 20: Revolution R: 100% R and more


Deployment with Revolution R Enterprise

RevoDeployR Web Services (HTTP/HTTPS – JSON/XML)

• Session Management
• Authentication
• Data/Script Management
• Administration

Client libraries (JavaScript, Java, .NET)

• Desktop Applications (e.g. Excel)
• Business Intelligence (e.g. Jaspersoft)
• Interactive Web Applications

[Diagram labels: R, R Programmer, Application Developer, End User]

Page 21: Revolution R: 100% R and more


The Advanced Analytics Stack

Deployment / Consumption

Advanced Analytics

ETL

Data / Infrastructure

“Open Analytics Stack” White Paper: bit.ly/lC43Kw

Page 22: Revolution R: 100% R and more


• On-Call Technical Support
• Consulting: Migration | Analytics | Applications | Validation
• Training: R | Revolution R | Statistical Topics
• Systems Integration: BI | ERP | Databases | Cloud

Page 23: Revolution R: 100% R and more


Wrapping Up

Page 24: Revolution R: 100% R and more

Why R?

• Every data analysis technique at your fingertips
• Create beautiful and unique data visualizations
• Get better results faster
• Draw on the talents of data scientists worldwide
• R is hot, and growing fast

Page 25: Revolution R: 100% R and more


Revolution R Enterprise

• High-performance R for multiprocessor systems
• Modern Integrated Development Environment
• Statistical Analysis of Terabyte-Class Data Sets
• In-database R analytics with Hadoop [1] and Netezza
• Deploy R Applications via Web Services
• Telephone and email technical support
• Training and consulting services
• 100% compatible with R packages
• Easy-to-Use GUI [1]

Production-Grade Statistical Analysis for the Workplace

[1] Coming Soon

Page 27: Revolution R: 100% R and more


Revolution R Enterprise: Free to Academia

• Personal use
• Research
• Teaching
• Package development

Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php

Discounted Technical Support Subscriptions Available

Page 28: Revolution R: 100% R and more


Thank You!

Download slides, replay (from Aug 24) http://bit.ly/railcj

Learn more about Revolution R revolutionanalytics.com/products

Keep up to date with R and Revolution news revolutionanalytics.com/newsletter

Contact Revolution Analytics http://bit.ly/hey-revo

Page 29: Revolution R: 100% R and more


The leading commercial provider of software and support for the popular open source R statistics language.

www.revolutionanalytics.com

+1 (650) 330 0553

Twitter: @RevolutionR