Microsoft SQL Server 2016 R Services
TRANSCRIPT
![Page 1: Microsoft SQL Server 2016 R Services](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/1.jpg)
Microsoft
SQL Server 2016 R Services
![Page 2](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/2.jpg)
SQL Server 2016: Everything built-in
Consistent experience from on-premises to cloud
Self-service BI per user: Microsoft $120, Tableau $480, Oracle $2,230
In-memory across all workloads
at massive scale
[Chart: SQL Server, Oracle, MySQL, and SAP HANA compared; vertical axis 0 to 80, horizontal axis 1 to 6]
TPC-H: SQL Server is #1, #2, and #3; Oracle is #5
![Page 3](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/3.jpg)
From data to decisions and actions
Value
Data
$1.6 trillion
Action
Decision
![Page 4](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/4.jpg)
Microsoft advanced analytics products
Cortana Analytics Suite
SQL Server 2016
![Page 5](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/5.jpg)
Typical advanced analytics lifecycle
Ingest, Transform, Explore, Model, Deploy, Score, Visualize, Measure
Model
Score
ƒ(x)
Preparation
Modeling
Put into production
![Page 6](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/6.jpg)
Data scientists should be focused on creating and testing models
Data scientist
Ingest, Transform, Explore, Model, Deploy, Score, Visualize, Measure
Model
Score
ƒ(x)
Preparation
Modeling
Put into production
![Page 7](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/7.jpg)
But the reality is...
Data scientist focus time
Ingest, Transform, Explore, Model, Deploy, Score, Visualize, Measure
Model
Score
ƒ(x)
Preparation
Modeling
Put into production
80%
5%
15%
![Page 8](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/8.jpg)
Advanced analytics is a team sport
Preparation
Model
Put into production
Decide
![Page 9](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/9.jpg)
What is R?
Open source “lingua franca”
Analytics, computing, modeling
Global community
Millions of users
7,000+ packages
Big data
Ecosystem
Scalability
![Page 10](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/10.jpg)
CRAN: The Comprehensive R Archive Network
Open Source “lingua franca”
Analytics, Computing, Modeling
In addition to CRAN, Bioconductor, GitHub, and others distribute R packages
![Page 11](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/11.jpg)
Why R?
A large pool of talent knows how to use it
Scales with the data being computed on
Easier to protect important data
Role-based use creates efficiency
![Page 12](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/12.jpg)
Challenges of open source R
Uncertain total cost of ownership and return on investment
Integrating R with existing and ever-changing data infrastructures
Scale and Performance
Data movement restricts access for efficient data modeling
![Page 13](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/13.jpg)
Benefits of Microsoft R

| | Open source | Microsoft R | Benefit |
| --- | --- | --- | --- |
| Big Data | In-memory bound | Hybrid memory & disk scalability | Operates on bigger volumes & factors |
| Speed of Analysis | Single threaded | Parallel threading and processing | Shrinks analysis time |
| Enterprise Readiness | Community support | Commercial support | Delivers full service production support |
| Analytic Breadth & Depth | 7000+ innovative analytic packages | Leverage and optimize open source packages plus Big Data ready packages | Supercharges R |
| Commercial Viability | Risk of deployment of open source | Commercial licenses | Eliminates risk with open source |
![Page 14](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/14.jpg)
Faster And More Scalable
![Page 15](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/15.jpg)
Parallelized, Remote Executing Algorithms

Custom parallelization
- PEMA-R API
- rxDataStep
- rxExec

Data step
- Data import: delimited, fixed, SAS, SPSS, ODBC
- Variable creation & transformation
- Recode variables
- Factor variables
- Missing value handling
- Sort, merge, split
- Aggregate by category (means, sums)

Descriptive statistics
- Min/max, mean, median (approx.)
- Quantiles (approx.)
- Standard deviation
- Variance
- Correlation
- Covariance
- Sum of squares (cross-product matrix for set variables)
- Pairwise cross tabs
- Risk ratio & odds ratio
- Cross-tabulation of data (standard tables & long form)
- Marginal summaries of cross tabulations

Statistical tests
- Chi Square Test
- Kendall Rank Correlation
- Fisher’s Exact Test
- Student’s t-Test

Sampling
- Subsample (observations & variables)
- Random sampling

Predictive models
- Sum of squares (cross-product matrix for set variables)
- Multiple linear regression
- Generalized linear models (GLM): exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie); standard link functions (cauchit, identity, log, logit, probit); user-defined distributions & link functions
- Covariance & correlation matrices
- Logistic regression
- Classification & regression trees
- Predictions/scoring for models
- Residuals for all models

Simulation
- Simulation (e.g., Monte Carlo)
- Parallel random number generation

Cluster analysis
- K-Means

Classification
- Decision trees
- Decision forests
- Gradient-boosted decision trees
- Naïve Bayes
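
The slide lists parallelized algorithms but does not show how one works internally. As a rough illustration only, the sketch below computes a mean and standard deviation chunk by chunk and then merges the partial results, which is the general shape of a parallel, external-memory computation. It is plain Python, not the PEMA-R API or any RevoScaleR function; the names `chunk_stats`, `combine`, `chunked`, and `parallel_summary` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def chunk_stats(chunk):
    """Return (count, sum, sum of squares) for one chunk of numbers."""
    return len(chunk), sum(chunk), sum(x * x for x in chunk)


def combine(parts):
    """Merge per-chunk statistics into a mean and sample standard deviation.

    Note: the sum-of-squares formula is simple but can lose precision on
    badly scaled data; it is used here only to keep the sketch short.
    """
    n = sum(p[0] for p in parts)
    total = sum(p[1] for p in parts)
    total_sq = sum(p[2] for p in parts)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, sqrt(variance)


def chunked(values, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def parallel_summary(values, chunk_size=1000, workers=4):
    """Summarize each chunk independently, then combine the results.

    Threads only illustrate the map-then-combine structure; in CPython a
    process pool would be needed for a real CPU-bound speedup.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(chunk_stats, chunked(values, chunk_size)))
    return combine(parts)


if __name__ == "__main__":
    data = [float(i) for i in range(1, 10001)]
    print(parallel_summary(data))
```

Because each chunk is reduced to three numbers before merging, no step needs the whole data set in memory at once, which is the property the "hybrid memory & disk" claim relies on.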
![Page 16](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/16.jpg)
In-database advanced analytics
Data Scientist: Interacts directly with data
SQL Developer/DBA: Manage data and analytics together
Extensibility
Example solutions
Sales forecasting
Warehouse efficiency
Predictive maintenance
Credit risk protection
Relational data
Analytics library
T-SQL interface
R integration
Built into SQL Server 2016
Real-time operational analytics without moving data
R with in-memory scalability
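
To make "analytics without moving data" concrete, here is a small sketch that contrasts pulling every row out of a database with letting the database compute the aggregate itself. It uses Python's built-in `sqlite3` module purely as a stand-in; it is not SQL Server and does not use R Services or its T-SQL interface. The `sales` table and all function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [("north", 10.0), ("north", 30.0), ("south", 5.0), ("south", 15.0)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def mean_external(conn):
    """External access: fetch every row to the client, then aggregate there."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        count, total = totals.get(region, (0, 0.0))
        totals[region] = (count + 1, total + amount)
    return {region: total / count for region, (count, total) in totals.items()}


def mean_in_database(conn):
    """In-database: the engine aggregates; one row per region leaves it."""
    query = "SELECT region, AVG(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_external(conn))
    print(mean_in_database(conn))
```

Both functions return the same per-region means; the difference is how many rows cross the database boundary, which grows with table size only in the external case.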
![Page 17](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/17.jpg)
[Chart: minutes versus rows, comparing External Access with In Database]
![Page 18](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/18.jpg)
Flexibility & Agility
Write once, deploy anywhere
No model re-writes across platforms
No re-writes from modeling to scoring
Hybrid modeling & scoring
Model on premises, score on premises
Model on premises, score in the cloud
Model on cloud, score on premises
Prepare, Model, Score
SQL Server
Parallelized Models
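
The "no re-writes from modeling to scoring" idea can be sketched as follows: a model is fitted in one place, serialized, and scored somewhere else from the serialized form alone. This is a minimal plain-Python illustration under assumed names (`fit_line`, `export_model`, `score`), using a one-variable least-squares line and JSON as the exchange format; it does not reflect how Microsoft R or SQL Server actually store or move models.

```python
import json


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def export_model(model):
    """Serialize the fitted model so another environment can load it."""
    return json.dumps(model)


def score(serialized_model, xs):
    """Score new inputs using only the serialized model, with no refitting."""
    model = json.loads(serialized_model)
    return [model["intercept"] + model["slope"] * x for x in xs]


if __name__ == "__main__":
    # "Model" step, for example on premises.
    payload = export_model(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
    # "Score" step, for example in the cloud, given only the payload.
    print(score(payload, [5, 6]))
```

The scoring side depends only on the serialized payload, so where the model was trained does not matter to it.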
![Page 19](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/19.jpg)
Some Microsoft R customers
Financial Services
Digital Media & Retail
Healthcare & Pharma
Government & Academia
Analytics Service Providers
Manufacturing & High Tech
![Page 20](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/20.jpg)
SQL Server 2016 R Services (In-database)
In-DB analytics
Parallel threading and processing
Easy to operationalize
Developers, DBAs and Data Scientists can use their preferred tools
Model on-premises, score in cloud, or vice versa
Easy way to overcome memory limitations, enabling larger data sets
Included in SQL Server 2016
Reuse and optimization of existing R code
Reduced recoding and training costs
![Page 21](https://reader033.vdocument.in/reader033/viewer/2022051903/5ff49653b82227271b5c2ad8/html5/thumbnails/21.jpg)