sas and open source a match made in ... group...title valente-opensourceandsas author matt...
TRANSCRIPT
Copyright © 2015, SAS Institute Inc. All rights reserved.
SAS AND OPEN SOURCE – A MATCH MADE IN ENTERPRISE MINER
TORONTO DATA MINING USER GROUP
SAS AND OPEN SOURCE
AGENDA
• Open Source analytics in Business
• Open Source Integration Node
• Output modes
• Workflow examples to incorporate R models
• Careful considerations
OPEN SOURCE INTEGRATION
THIS IS ACHIEVED WITH SAS ANALYTICS IN ACTION
• Data is about gathering data from the different data sources and locations, unifying it, and making it ready for modeling
• Discovery is about having the flexibility to prototype analytical models to uncover business value
• Deployment is about engineering enterprise-level solutions from those prototypes, with governance measures to ensure quality
OPEN SOURCE INTEGRATION
SAS DOES IT BY “INTEGRATING” AND “EXTENDING” IT
Where do we integrate? Where do we extend?
Integrate
Extend
USING R IN SAS ENTERPRISE MINER
THE OPEN SOURCE INTEGRATION NODE
• Enables the execution of R code within an Enterprise Miner workflow
• Transfers data, metadata, and results automatically between Enterprise Miner and R
USING R IN SAS ENTERPRISE MINER
THE OPEN SOURCE INTEGRATION NODE
• Facilitates multitasking in R
• Generates text and graphical output from R
• Integrates both supervised and unsupervised learning tasks
USING R IN SAS ENTERPRISE MINER
PMML OUTPUT
Predictive Model Markup Language (PMML) is an open standard that enables certain R models to be translated into SAS DATA step code.
Currently supported R models include:
• Linear Models (lm)
• Multinomial Log-Linear Models (multinom (nnet))
• Generalized Linear Models (glm (stats))
• Decision Trees (rpart)
• Neural Networks (nnet)
• k-means Clustering (kmeans (stats))
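To make the idea of translating a PMML model into scoring code concrete, here is a minimal sketch in Python (used only for illustration; it is not what the Open Source Integration node runs). It assumes a hand-written, simplified PMML fragment for a linear model, without the XML namespace a real PMML file carries. The names `TOY_PMML`, `read_linear_model`, `to_scoring_statement`, `score`, and the output variable `P_target` are all hypothetical, and the generated statement only resembles an assignment; it is not the DATA step code that SAS produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment for a linear model.
# Real PMML files declare an XML namespace; it is omitted here for brevity.
TOY_PMML = """
<PMML version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.2"/>
      <NumericPredictor name="income" coefficient="-0.01"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def read_linear_model(pmml_text):
    """Return (intercept, {predictor name: coefficient}) from the toy PMML."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("./RegressionModel/RegressionTable")
    intercept = float(table.get("intercept"))
    coefs = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("NumericPredictor")
    }
    return intercept, coefs


def to_scoring_statement(intercept, coefs, target="P_target"):
    """Render the model as a single assignment-style statement (illustrative)."""
    terms = "".join(
        f" {'+' if c >= 0 else '-'} {abs(c):g}*{name}" for name, c in coefs.items()
    )
    return f"{target} = {intercept:g}{terms};"


def score(intercept, coefs, row):
    """Apply the linear model to one row given as a dict of inputs."""
    return intercept + sum(c * row[name] for name, c in coefs.items())


intercept, coefs = read_linear_model(TOY_PMML)
print(to_scoring_statement(intercept, coefs))
print(score(intercept, coefs, {"age": 40, "income": 100}))
```

The point of the sketch is that once a model is described in a standard, declarative form, scoring logic can be generated from it without running R again at scoring time.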
USING R IN SAS ENTERPRISE MINER
PMML MODE
USING R IN SAS ENTERPRISE MINER
MERGE OUTPUT MODE
Merge output mode enables integration with thousands of R packages that are not supported in PMML output mode.
Variables created in R are merged with SAS Enterprise Miner data sources by the user.
SAS DATA step code is not created.
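The merge step can be pictured as a keyed join: new variables computed elsewhere are attached to the original rows. The sketch below shows that idea in Python, for illustration only. The slides do not describe how the merge is performed inside Enterprise Miner, so the key column `id`, the function name `merge_new_variables`, and the choice to leave unmatched rows with a missing value are all assumptions.

```python
def merge_new_variables(source_rows, new_rows, key="id"):
    """Attach variables from new_rows to source_rows by a shared key.

    Rows in source_rows with no match keep their original columns and
    receive None for each new variable (a left join).
    """
    lookup = {row[key]: row for row in new_rows}

    # Collect the names of the new variables, in first-seen order.
    new_names = []
    for row in new_rows:
        for name in row:
            if name != key and name not in new_names:
                new_names.append(name)

    merged = []
    for row in source_rows:
        extra = lookup.get(row[key], {})
        out = dict(row)  # copy so the source rows are not modified
        for name in new_names:
            out[name] = extra.get(name)
        merged.append(out)
    return merged


# Hypothetical data: the source table and a cluster label computed elsewhere.
source = [{"id": 1, "x": 10}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]
created = [{"id": 2, "cluster": "B"}, {"id": 1, "cluster": "A"}]

for row in merge_new_variables(source, created):
    print(row)
```

Because no scoring code is produced in this mode, new data can only receive these variables by running the R step again and repeating the merge.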
USING R IN SAS ENTERPRISE MINER
MERGE MODE
USING R IN SAS ENTERPRISE MINER
SOME PRECAUTIONS
Some items to consider when running R models in the Open Source Integration node:
• Missing values may be an issue
• Ensure categorical variables are not high in cardinality
• Memory issues
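The first two precautions can be checked before any model is fitted. The following Python sketch, for illustration only, counts missing values per column and flags categorical columns with many distinct levels. The function name `check_inputs`, the use of `None` to represent a missing value, and the default limit of 50 levels are assumptions, not settings taken from SAS or R.

```python
def check_inputs(rows, categorical, max_levels=50):
    """Return a list of warnings about missing values and high cardinality.

    rows        -- list of dicts, one per observation (None marks a missing value)
    categorical -- set of column names to treat as categorical
    max_levels  -- hypothetical limit on distinct levels per categorical column
    """
    warnings = []
    columns = sorted({name for row in rows for name in row})
    for name in columns:
        values = [row.get(name) for row in rows]

        n_missing = sum(v is None for v in values)
        if n_missing:
            warnings.append(f"{name}: {n_missing} missing value(s)")

        if name in categorical:
            levels = {v for v in values if v is not None}
            if len(levels) > max_levels:
                warnings.append(
                    f"{name}: {len(levels)} levels exceeds limit of {max_levels}"
                )
    return warnings


# Hypothetical data with one missing age and a three-level categorical column.
rows = [
    {"age": 30, "city": "A"},
    {"age": None, "city": "B"},
    {"age": 41, "city": "C"},
]
for message in check_inputs(rows, categorical={"city"}, max_levels=2):
    print(message)
```

A check like this does not address memory, which depends on the size of the data passed to R and on the package being used.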
SAS AND OPEN SOURCE
ADDITIONAL RESOURCES
Video: SAS Enterprise Miner and R
SAS Webinar: How SAS Adds Value to Open Source
Whitepaper: The Use of Open Source is Growing. So Why Do Organizations Still Turn to SAS?
Whitepaper: SAS Analytics and Open Source
QUESTIONS