TRANSCRIPT
-
Luis de Sousa
12th July 2016
www.luisdesousa.co.za
SQL SERVER R SERVICES
-
AGENDA
Intelligent apps
Intro to R
SQL + R
This presentation is a summary of the Build 2016 session "Advanced Analytics with R and SQL".
https://channel9.msdn.com/Events/Build/2016/B805
-
1999: Computers work on users' behalf, filtering junk email
2004: Microsoft search engine built with machine learning
2005: SQL Server enables data mining using SSAS
2008: Bing Maps ships with ML traffic-prediction service
2010: Microsoft Kinect can watch users' gestures
2012: Successful, real-time, speech-to-speech translation
2014: Microsoft launches Azure Machine Learning
2015: Microsoft launches R Server for scalable, enterprise-grade analytics
2016: SQL '16 supports advanced analytics in-DB using R
I believe over the next decade computing will become even more ubiquitous and
intelligence will become ambient. This will be made possible by an ever-growing network of
connected devices, incredible computing capacity from the cloud, insights from big data, and
intelligence from machine learning.
Machine learning is pervasive throughout Microsoft products.
-
Value
Data, Action, Decisions
Advanced Analytics: Predictive & Prescriptive Analytics
Business Intelligence: Descriptive & Diagnostic Analytics
-
PROCESS FOR CREATING AN INTELLIGENT APP
ADVANCED ANALYTICS PROCESS
Prepare, Model, Operationalize
Prepare: Assemble, cleanse, profile and transform diverse data relevant to the subject.
Model: Use statistical and machine learning algorithms to build classifiers and predictions.
Operationalize: Apply predictions and visualizations to support business applications.
Evaluate Results & Iterate
https://azure.microsoft.com/en-us/documentation/learning-paths/cortana-analytics-process/
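The three stages above can be sketched in a few lines. This is only a toy illustration, not the process as Microsoft implements it: the sample records, the function names (prepare, fit_line, make_scorer) and the choice of a least-squares line as the "model" are all assumptions made for the example.

```python
# Toy stand-in for Prepare -> Model -> Operationalize.
# All data and function names here are hypothetical.

def prepare(raw_rows):
    """Prepare: drop incomplete records and convert fields to floats."""
    clean = []
    for row in raw_rows:
        if row.get("x") is None or row.get("y") is None:
            continue  # cleanse: skip rows with missing values
        clean.append((float(row["x"]), float(row["y"])))
    return clean

def fit_line(points):
    """Model: ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def make_scorer(model):
    """Operationalize: wrap the fitted model in a function an app can call."""
    slope, intercept = model

    def score(x):
        return slope * x + intercept

    return score

raw = [
    {"x": "1", "y": "3"},
    {"x": "2", "y": "5"},
    {"x": None, "y": "4"},  # incomplete record, removed during Prepare
    {"x": "3", "y": "7"},
]
points = prepare(raw)
model = fit_line(points)
score = make_scorer(model)
```

An application would then call score(x) for new inputs, compare predictions against observed results, and repeat the cycle when the fit degrades.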
-
INTRO TO R
THE LANGUAGE OF ADVANCED ANALYTICS
-
R Usage Growth: Rexer Data Miner Survey, 2007-2015
76% of analytic professionals report using R
36% select R as their primary tool
Language Popularity: IEEE Spectrum Top Programming Languages, 2015
http://blog.revolutionanalytics.com/2015/11/new-surveys-show-continued-popularity-of-r.html
http://blog.revolutionanalytics.com/2015/07/ieee-2015-rankings.html
-
History of R
• R is an open source (GNU) version of the S language developed by John Chambers et al. at Bell Labs in the '80s
• R was initially written in the early 1990s by Robert Gentleman and Ross Ihaka, then with the Statistics Department of the University of Auckland
• R is administered and controlled by the R Foundation
• Microsoft is a founding member and Platinum Sponsor of the R Consortium
R Reference Card from CRAN
http://www.gnu.org/copyleft/gpl.html
https://www.stat.auckland.ac.nz/~ihaka/downloads/Interface98.pdf
https://en.wikipedia.org/wiki/Robert_Gentleman_(statistician)
http://www.stat.auckland.ac.nz/~ihaka/
https://www.r-project.org/foundation/
https://www.r-consortium.org/
https://cran.r-project.org/doc/contrib/Short-refcard.pdf
-
Open Source “lingua franca”
Analytics, Computing, Modeling
CRAN Task View by Barry Rowlingson: http://www.maths.lancs.ac.uk/~rowlings/R/TaskViews/
More packages on GitHub and the Bioconductor project
http://www.github.com/
http://bioconductor.org/
-
Works With Open Source R
Enterprise Scale & Performance
– Scales from workstations to large clusters
– Scales to large data sizes
– Growing portfolio of Parallelized algorithms
Secure, Scalable R Deployment/Operationalization
Write Once Deploy Anywhere for multiple platforms
– RDBMS: SQL Server & Teradata
– Windows, Linux: Red Hat & SUSE
– Hadoop: Hortonworks, Cloudera, MapR
– Cloud: Azure VMs, Azure HDInsight
R Tools for Visual Studio IDE
R Open, Microsoft R Server, DeployR, RTVS
-
R Open, Microsoft R Server, DeployR, RTVS
ConnectR
• High-speed & direct connectors
Available for:
• High-performance XDF
• SAS, SPSS, delimited & fixed format text data files
• Hadoop HDFS (text & XDF)
• Teradata Database & Aster
• EDWs and ADWs
• ODBC
ScaleR
• Ready-to-use, high-performance big data big analytics
• Fully-parallelized analytics
• Data prep & data distillation
• Descriptive statistics & statistical tests
• Range of predictive functions
• User tools for distributing customized R algorithms across nodes
• Wide data sets supported – thousands of variables
DistributedR
• Distributed computing framework
• Delivers cross-platform portability
R+CRAN
• Open source R interpreter
• R 3.1.2
• Freely-available huge range of R algorithms
• Algorithms callable by RevoR
• Embeddable in R scripts
• 100% compatible with existing R scripts, functions and packages
Microsoft R Open
• Based on open source R
• High-performance math library to speed up linear algebra functions
• Checkpoint package to easily share R code and replicate results using specific R package versions
DeployR
• RESTful APIs for easy integration from Java, JavaScript, .NET
• Enterprise authentication & security
• Horizontal scaling
R Tools for Visual Studio
• State-of-the-art R Tools for Visual Studio IDE
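The slide does not show how ScaleR parallelizes descriptive statistics. As a rough illustration of the general idea only (summarize each chunk of data independently, then merge the partial results), here is a minimal sketch. It is not ScaleR code: the function names are hypothetical, and a local thread pool stands in for worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk):
    """Return (count, sum, sum of squares) for one chunk of numbers."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)

def merge(a, b):
    """Combine two partial summaries; the order of merging does not matter."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]

def chunked_mean_variance(chunks, workers=4):
    """Compute mean and population variance without holding all data at once."""
    # Each chunk is summarized independently (here in a thread pool,
    # standing in for separate cores or cluster nodes).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize_chunk, chunks))

    total = (0, 0.0, 0.0)
    for partial in partials:
        total = merge(total, partial)

    n, s, sq = total
    mean = s / n
    variance = sq / n - mean * mean  # population variance
    return mean, variance

# Hypothetical data split into three chunks of unequal size.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = chunked_mean_variance(chunks)
```

Because only three numbers per chunk are exchanged, the same pattern works whether the chunks are blocks of a file on one workstation or partitions spread across a cluster. The sum-of-squares formula is kept for brevity; it can lose precision on large or badly scaled data.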
-
SQL + R
IN-DATABASE ADVANCED ANALYTICS
-
Ingest: Relevant data available in real-time
Query: All relevant data available in real-time
Analytics: All relevant data available for analytics in real-time
These are 3 key ingredients to build an Intelligent Application
Prepare, Model, Operationalize
-
Working from my R IDE on my workstation, I can execute an R script that runs in-database and get the results back.
Microsoft R Open, Microsoft R Server
Data Scientist Workstation: R IDE
SQL Server 2016: sqlCompute
1. Script
2. Execution
3. Results
-
I can call a T-SQL system stored procedure from my application and have it trigger R script execution in-database. Results are then returned to my application (predictions, plots, etc.).
Application
1. Call system stored procedure
2. The stored procedure contains R code and executes in-database.
3. Results: scores, plots
exec sp_execute_external_script
  @language = N'R'
, @script = N'# R code'
SQL Server 2016
Microsoft R Open, Microsoft R Server, Advanced Analytics Extensions
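The application side of this call can be sketched as follows. This is an assumption-laden illustration, not code from the session: the helper run_r_in_database and the FakeCursor class are hypothetical, and the R script simply echoes its input rows. The function accepts any Python DB-API style cursor, so a stand-in cursor lets the sketch run without a server.

```python
# T-SQL sent to SQL Server; the embedded R script echoes its input rows back.
ECHO_SCRIPT_SQL = """
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- InputDataSet;',
    @input_data_1 = N'SELECT 1 AS x;'
WITH RESULT SETS ((x INT));
"""

def run_r_in_database(cursor, sql=ECHO_SCRIPT_SQL):
    """Step 1: call the stored procedure. Step 3: return the result rows.

    Step 2 (R execution) happens inside SQL Server, not in this process.
    """
    cursor.execute(sql)
    return cursor.fetchall()

class FakeCursor:
    """Hypothetical stand-in for a real driver cursor (no server needed)."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)

    def fetchall(self):
        # What the echo script would return for 'SELECT 1 AS x'.
        return [(1,)]

cursor = FakeCursor()
rows = run_r_in_database(cursor)
```

Against a real SQL Server 2016 instance, the cursor would come from a DB-API driver such as pyodbc, and the instance would need R Services installed with the 'external scripts enabled' option turned on.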
-
SUMMARY
Build 2016 - Advanced Analytics with R and SQL
https://channel9.msdn.com/Events/Build/2016/B805
Short URL: http://bit.ly/1TnGskj