A Gentle Introduction to Oracle R Enterprise

Lausanne, 24 November 2015

Christian Antognini, Senior Principal Consultant

BÂLE BERNE BRUGG DUSSELDORF FRANCFORT S.M. FRIBOURG E. BR. GENÈVE HAMBOURG COPENHAGUE LAUSANNE MUNICH STUTTGART VIENNE ZURICH


Page 2: A gentle introduction to Oracle R Enterprise

@ChrisAntognini

Senior principal consultant, trainer and partner at Trivadis


– http://antognini.ch

Focus: get the most out of Oracle Database

– Logical and physical database design

– Query optimizer

– Application performance management

Author of Troubleshooting Oracle Performance (Apress, 2008/14)

OakTable Network, Oracle ACE Director

Page 3: A gentle introduction to Oracle R Enterprise

What Is R?

R is a language and environment for statistical computing and graphics.

It is a GNU project.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible.

Source: https://www.r-project.org/about.html
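
As a small illustration of the statistical and graphical side of R, the sketch below fits a linear model and plots it using only base R. The `cars` data set ships with R; the variable name `fit` is chosen here for illustration and doesn't come from the talk.

```r
# Fit a simple linear regression on the built-in `cars` data set
# (stopping distance as a function of speed).
fit <- lm(dist ~ speed, data = cars)

# Inspect the estimated coefficients and the usual model summary.
print(coef(fit))
print(summary(fit))

# Scatter plot of the data with the fitted regression line on top.
plot(dist ~ speed, data = cars, main = "Stopping distance vs. speed")
abline(fit, col = "red")
```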

Page 4: A gentle introduction to Oracle R Enterprise

Agenda

1. R Technologies from Oracle

2. Oracle R Enterprise

Page 5: A gentle introduction to Oracle R Enterprise

R Technologies from Oracle

Page 6: A gentle introduction to Oracle R Enterprise

R Technologies from Oracle

Oracle has adopted R as a language and environment for performing statistical data analysis and advanced analytics, as well as generating sophisticated graphics

Oracle provides R integration through four key technologies:

– Oracle R Distribution

– ROracle

– Oracle R Enterprise (ORE)

– Oracle R Advanced Analytics for Hadoop (ORAAH)

Page 7: A gentle introduction to Oracle R Enterprise

Oracle R Distribution

Oracle's distribution of open source R

Free download

Support provided to customers of the Oracle Advanced Analytics option, Oracle Linux, and the Oracle Big Data Appliance

Page 8: A gentle introduction to Oracle R Enterprise

ROracle

Open source R package providing a DBI-compliant driver for Oracle Database

Based on the OCI library

It’s publicly available on CRAN and is maintained by Oracle
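
A minimal sketch of how a DBI-compliant driver such as ROracle is typically used follows. The helper name `query_oracle` and its parameters are hypothetical, and calling it requires the ROracle package, an Oracle client installation and a reachable database, none of which are described in the talk.

```r
# Hypothetical helper: run one query through ROracle and return a data.frame.
# Defining the function needs nothing; calling it needs ROracle and a database.
query_oracle <- function(username, password, dbname, sql) {
  # ROracle::Oracle() creates the driver object; DBI generics do the rest.
  con <- DBI::dbConnect(ROracle::Oracle(),
                        username = username,
                        password = password,
                        dbname   = dbname)
  # Close the connection even if the query fails.
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # The result set is fetched to the client as a regular R data.frame.
  DBI::dbGetQuery(con, sql)
}
```

A call could look like `query_oracle("scott", "secret", "//dbhost:1521/orcl", "SELECT sysdate FROM dual")`, where the credentials and connect string are placeholders. Unlike ORE, this moves the queried rows to the client.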

Page 9: A gentle introduction to Oracle R Enterprise

Oracle R Enterprise (ORE)

It’s a component, along with Data Mining, of the Oracle Advanced Analytics option

It’s a set of R packages and Oracle Database features

– Run R commands and scripts for analyses on data stored in the Oracle Database

– Translate R operations into SQL

– One or more R engines run on the database server

Page 10: A gentle introduction to Oracle R Enterprise

Oracle R Advanced Analytics for Hadoop (ORAAH)

It’s one of the components in the Oracle Big Data Software Connectors Suite, an option to the Big Data Appliance (BDA)

It provides an R interface to HDFS and the MapReduce programming framework

– Data manipulation

– Writing mapper and reducer functions

– Invocation of Hadoop jobs

Page 11: A gentle introduction to Oracle R Enterprise

Oracle R Enterprise

Page 12: A gentle introduction to Oracle R Enterprise

Architecture

[Architecture diagram: on the client, an R engine with the ORE packages sends SQL to Oracle Database and receives results; on the database server, Oracle Database sends R to one or more spawned R engines, each with the ORE packages, and receives results.]

Page 13: A gentle introduction to Oracle R Enterprise

Advantages of Oracle R Enterprise (According to Oracle)

Operate on database-resident data without using SQL

Eliminate data movement

Keep data secure

Use the power of the database

Use current data

Prepare data in the database

Save R objects in the database

Build models in the database

Score data in the database

Execute R scripts in the database

Integrate with the Oracle technology stack

Page 14: A gentle introduction to Oracle R Enterprise

ore.frame Class

An ore.frame object represents a relational query for an Oracle Database instance

Typically, you get ore.frame objects that are proxies for database tables

An ore.frame object can be ordered or unordered

– This is an important difference compared to an R data.frame, which always has an explicit order

– Relational data must be explicitly ordered
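
The proxy idea can be sketched as follows, assuming the ORE packages are loaded with `library(ORE)` and a session has already been opened with `ore.connect`. The function name `count_by_species` is hypothetical, and the built-in `iris` data set is used only so that the example doesn't depend on any particular schema.

```r
# Hypothetical helper; assumes library(ORE) and an open ore.connect() session.
count_by_species <- function() {
  # ore.push() copies the data.frame into a temporary database table and
  # returns an ore.frame proxy; no rows are held in the client's memory.
  iris_db <- ore.push(iris)
  stopifnot(is.ore.frame(iris_db))

  # Filtering and projection use data.frame syntax, but ORE translates them
  # into SQL that runs in the database.
  long_sepals <- iris_db[iris_db$Sepal.Length > 5, c("Sepal.Length", "Species")]

  # The aggregation is also evaluated in the database.
  counts <- aggregate(long_sepals$Sepal.Length,
                      by = list(Species = long_sepals$Species),
                      FUN = length)

  # Only the small aggregated result is pulled into a local data.frame.
  ore.pull(counts)
}
```

For a proxy of an existing table, an order can be established by assigning a key column to `row.names()`; the name of that column depends on the table and is not given in the talk.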

Page 15: A gentle introduction to Oracle R Enterprise

Persisted R Objects

R objects (incl. ORE proxy objects) exist for the duration of the current R session

The standard R functions for saving and restoring R objects, save and load, can’t be used with the ORE proxy objects

– The database objects associated with them aren’t persisted

To persist them, ORE provides datastores that store data in the database

– The ore.save and ore.load functions are available

– Regular R objects can be persisted as well
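
A minimal sketch of the datastore round trip follows, assuming the ORE packages are loaded and a connection is open. The helper names and the datastore name passed in are hypothetical.

```r
# Hypothetical helpers; both assume library(ORE) and an open ore.connect() session.
save_to_datastore <- function(datastore_name) {
  iris_db <- ore.push(iris)                             # ORE proxy object
  fit <- lm(Sepal.Length ~ Petal.Length, data = iris)   # regular R object

  # Both kinds of objects are stored in the named datastore in the database.
  ore.save(iris_db, fit, name = datastore_name, overwrite = TRUE)

  # List the datastore to confirm that it exists.
  ore.datastore(name = datastore_name)
}

load_from_datastore <- function(datastore_name, envir = parent.frame()) {
  # Restores the saved objects into `envir`, here the caller's environment.
  ore.load(name = datastore_name, envir = envir)
}
```

In a later R session, `load_from_datastore("demo_ds")` would bring `iris_db` and `fit` back, where `"demo_ds"` stands for whatever name was used when saving.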

Page 16: A gentle introduction to Oracle R Enterprise

Preparing and Exploring Data in the Database

Selecting Data

Indexing Data

Combining Data

Summarizing Data

Transforming Data

Sampling Data

Partitioning Data

Preparing Time Series Data

Correlating Data

Cross-Tabulating Data

Analyzing the Frequency of Cross-Tabulations

Building Exponential Smoothing Models on Time Series Data

Ranking Data

Sorting Data

Analyzing Distribution of Numeric Variables

Page 17: A gentle introduction to Oracle R Enterprise

Building Models and Predictions

Two categories of models are provided:

– Oracle R Enterprise models (OREmodels package: linear regression, generalized linear model, neural network)

– Oracle Data Mining models (OREdm package: association rules, decision trees, Naïve Bayes, k-means, …)

The ore.predict function is able to score data in ore.frame objects

– Degree of parallelism can be manually set
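
The two steps, building a model and scoring with ore.predict, can be sketched as follows. The sketch assumes the ORE packages are loaded and a connection is open; the function names are hypothetical, and the built-in `longley` and `iris` data sets merely stand in for real database tables.

```r
# Hypothetical helpers; both assume library(ORE) and an open ore.connect() session.
fit_in_db <- function() {
  # ore.lm() from OREmodels fits a linear regression on database-resident data.
  longley_db <- ore.push(longley)
  ore.lm(Employed ~ ., data = longley_db)
}

score_in_db <- function() {
  # A model built with plain R on the client ...
  fit <- lm(Sepal.Length ~ ., data = iris)

  # ... is used by ore.predict() to score rows that stay in the database.
  iris_db <- ore.push(iris)
  pred <- ore.predict(fit, iris_db, se.fit = TRUE, interval = "prediction")

  # Combine the original columns with the predictions; still an ore.frame.
  scored <- cbind(iris_db, pred)
  head(scored)
}
```

The OREdm functions for the Oracle Data Mining algorithms are left out here, since the talk doesn't show how they are called.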

Page 18: A gentle introduction to Oracle R Enterprise

ORE Embedded R Execution

It enables you to store and invoke R scripts in the Oracle Database server

– Both an R and a SQL API exist

When invoked, a script executes in one or more R engines that run on the database server

– Degree of parallelism can be manually set
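
The R side of the embedded execution API can be sketched as follows, assuming the ORE packages are loaded and a connection is open. The function names and the script name `demo_summary` are hypothetical, and creating a script may need additional privileges depending on the ORE release.

```r
# Hypothetical helpers; all assume library(ORE) and an open ore.connect() session.
run_on_server <- function(n = 1000) {
  # The anonymous function runs in an R engine spawned on the database server.
  res <- ore.doEval(function(n) {
    x <- rnorm(n)
    data.frame(mean = mean(x), sd = sd(x))
  }, n = n)
  ore.pull(res)  # bring the result back to the client
}

store_and_run <- function(n = 1000) {
  # Store the script in the database under a name, then invoke it by name.
  ore.scriptCreate("demo_summary", function(n) summary(rnorm(n)))
  ore.pull(ore.doEval(FUN.NAME = "demo_summary", n = n))
}

group_means_on_server <- function(degree = 2) {
  iris_db <- ore.push(iris)
  # One function call per Species partition; `parallel` requests the
  # degree of parallelism for the embedded R job.
  res <- ore.groupApply(iris_db, INDEX = iris_db$Species,
                        function(dat) mean(dat$Sepal.Length),
                        parallel = degree)
  ore.pull(res)
}
```

The SQL API mentioned on the slide exposes the same stored scripts to SQL clients and is not shown here.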

Page 19: A gentle introduction to Oracle R Enterprise

Core Messages

Easy to install

Simple to use

Expensive

A more in-depth analysis is required to judge performance and stability

Page 20: A gentle introduction to Oracle R Enterprise

Questions and Answers

Christian Antognini, Senior Principal Consultant


Page 21: A gentle introduction to Oracle R Enterprise

References

Oracle R Enterprise Installation and Administration Guide

Oracle R Enterprise User's Guide