![Page 1: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/1.jpg)
ANALYTICS ON AWS
Paul Armstrong, Solutions Architect, Amazon Web Services
![Page 2: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/2.jpg)
What to expect from this session
• AWS toolkit for analytics
• Analytics stakeholders
• Amazon Redshift and Amazon QuickSight
• Anomaly Detection
• Amazon Machine Learning – Churn Prediction Example
• Q & A
![Page 3: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/3.jpg)
Big data portfolio – focus on choice
[Diagram: AWS services arranged under three headings, Collect, Store, and Analyze. Services shown: AWS Import/Export, AWS Direct Connect, Amazon Kinesis Streams, Amazon Kinesis Firehose, AWS Database Migration Services, Amazon Glacier, Amazon S3, Amazon DynamoDB, Amazon RDS, Amazon Aurora, AWS Data Pipeline, Amazon CloudSearch, Amazon Elasticsearch Service, Amazon EMR, Amazon EC2, Amazon Redshift, Amazon Machine Learning, Amazon Kinesis Analytics, Amazon QuickSight]
![Page 5: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/5.jpg)
Match toolset to right persona
• Business intelligence (BI) analyst
  • Primary tool is SQL
  • Historical data resides in a data warehouse such as Amazon Redshift
• Data scientist
  • Uses programming languages such as R or Python
• Application developer
  • Requires an API to integrate with AWS services
![Page 6: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/6.jpg)
BI ANALYST
![Page 7: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/7.jpg)
BI analyst with existing BI tools
[Diagram labels: BI Analyst; BI tools on Amazon EC2; Amazon QuickSight; Amazon QuickSight API; Amazon Redshift]
• Primary tool is SQL
• Data is largely structured with well-known data sources
• Primary concern is fast, consistent performance
• Need to extend SQL with custom functions
![Page 8: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/8.jpg)
Amazon Redshift system architecture
Leader node
• SQL endpoint
• Stores metadata
• Coordinates query execution
Compute nodes
• Local, columnar storage
• Execute queries in parallel
• Load, backup, restore via Amazon S3; load from Amazon DynamoDB, Amazon EMR, or SSH
Two hardware platforms
• Optimized for data processing
• DS2: HDD; scale from 2 TB to 2 PB
• DC1: SSD; scale from 160 GB to 356 TB
[Diagram labels: JDBC/ODBC; 10 GigE (HPC)]
![Page 9: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/9.jpg)
New SQL functions
We add SQL functions regularly to expand Amazon Redshift’s query capabilities
Added 25+ window and aggregate functions since launch, including:
LISTAGG
[APPROXIMATE] COUNT
DROP IF EXISTS, CREATE IF NOT EXISTS
REGEXP_SUBSTR, _COUNT, _INSTR, _REPLACE
PERCENTILE_CONT, _DISC, MEDIAN
PERCENT_RANK, RATIO_TO_REPORT
We’ll continue iterating but also want to enable you to write your own
Window function examples: http://docs.aws.amazon.com/redshift/latest/dg/r_Window_function_examples.html
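To make a few of the listed functions concrete, here is a minimal Python sketch of what PERCENTILE_CONT, PERCENTILE_DISC, MEDIAN, and RATIO_TO_REPORT compute over a single group of non-null values. The helper names and the `sales` list are illustrative assumptions; in Amazon Redshift these run as SQL aggregate or window functions, not Python.

```python
import math

def percentile_cont(values, p):
    """Continuous percentile: linear interpolation between neighbouring rows."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)              # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def percentile_disc(values, p):
    """Discrete percentile: first value whose cumulative distribution >= p."""
    xs = sorted(values)
    n = len(xs)
    for i, x in enumerate(xs, start=1):
        if i / n >= p:
            return x
    return xs[-1]

def median(values):
    """MEDIAN is equivalent to PERCENTILE_CONT(0.5)."""
    return percentile_cont(values, 0.5)

def ratio_to_report(values):
    """Each value divided by the sum of all values in the group."""
    total = sum(values)
    return [v / total for v in values]

sales = [10, 20, 30, 40]                 # hypothetical sample group
print(percentile_cont(sales, 0.5))       # interpolates between 20 and 30
print(percentile_disc(sales, 0.5))       # returns an actual member of the set
print(ratio_to_report(sales))
```

The difference worth noticing is that the continuous variant can return a value that is not in the data, while the discrete variant always returns one that is.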
![Page 10: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/10.jpg)
Scalar user defined functions
You can write UDFs using Python 2.7
• Syntax is largely identical to PostgreSQL UDF
• Python execution is performed in parallel
• System and network calls within UDFs are prohibited
Comes integrated with Pandas, NumPy, SciPy, DateUtil, and Pytz analytic libraries
• Import your own libraries for even more flexibility
• Take advantage of thousands of functions available through Python libraries to perform operations not easily expressed in SQL
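A minimal sketch of a scalar UDF, assuming the simple "greater of two values" example: the logic is written as a plain Python function so it can be tried locally, and the same body is then wrapped in CREATE FUNCTION. The DDL is held in a string here and is not executed; the function name is illustrative.

```python
# UDF logic as an ordinary function, so it can be exercised outside the cluster.
def f_py_greater(a, b):
    """Return the larger of two numbers."""
    if a > b:
        return a
    return b

# DDL that would register the same body as a scalar UDF in Amazon Redshift.
# Shown for reference only; nothing in this script sends it to a cluster.
CREATE_UDF_SQL = """
CREATE FUNCTION f_py_greater (a float, b float)
  RETURNS float
STABLE
AS $$
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
"""

print(f_py_greater(3.5, 2.0))
```

Once registered, the function would be called from SQL like any built-in, for example `SELECT f_py_greater(col_a, col_b) FROM some_table;` (table and column names hypothetical).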
![Page 11: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/11.jpg)
What is Amazon QuickSight?
A very fast, cloud-powered business intelligence service for 1/10 the cost of traditional BI software
![Page 12: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/12.jpg)
[Diagram: Amazon QuickSight architecture. Labels: Business User; Amazon QuickSight UI; Amazon QuickSight API; Mobile Devices; Web Browsers; Partner BI Products. Components: SPICE, Connectors, Data Prep, Metadata, Suggestions. Data sources: Amazon S3, Amazon Kinesis, Amazon DynamoDB, Amazon EMR, Amazon Redshift, Amazon RDS, Files, Third-party]
![Page 13: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/13.jpg)
DATA SCIENTIST
![Page 14: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/14.jpg)
Data scientist with existing toolsets
[Diagram: Data scientist using toolkits like SAS or RStudio installed on Amazon EC2; unstructured data in Amazon S3; structured data in Amazon Redshift]
• Work with unstructured datasets
• Use existing toolsets to connect to Amazon Redshift
![Page 15: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/15.jpg)
Querying Amazon Redshift with R packages
• RJDBC: supports SQL queries
• dplyr: uses R code for data analysis
• RPostgreSQL: R compliant driver or Database Interface (DBI)
[Diagram labels: R User; R Studio; Amazon EC2; Unstructured data, Amazon S3; User profile, Amazon RDS; Amazon Redshift]
Connecting R with Amazon Redshift blog post: https://aws.amazon.com/blogs/big-data/connecting-r-with-amazon-redshift/
![Page 16: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/16.jpg)
Querying Amazon Redshift with R packages example
![Page 17: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/17.jpg)
APPLICATION DEVELOPER
![Page 18: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/18.jpg)
Application developers can build smart applications using Amazon Machine Learning
[Diagram labels: Application; Amazon Machine Learning; Amazon Redshift; Amazon QuickSight; Structured data/predictions; Generate/query predictions; Visualize]
• All skill levels
• Amazon Machine Learning technology is accessed through APIs and SDKs
• Embed visualizations in applications
![Page 19: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/19.jpg)
Resources
Amazon Redshift Getting Started Guide: http://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html
Scalar UDF Documentation: http://docs.aws.amazon.com/redshift/latest/dg/user-defined-functions.html
Introduction to Python UDFs in Amazon Redshift: https://blogs.aws.amazon.com/bigdata/post/Tx1IHV1G67CY53T/Introduction-to-Python-UDFs-in-Amazon-Redshift
Connecting R with Amazon Redshift: https://blogs.aws.amazon.com/bigdata/post/Tx1G8828SPGX3PK/Connecting-R-with-Amazon-Redshift
Databricks Apache Spark–Amazon Redshift Tutorial: https://github.com/databricks/spark-redshift/tree/master/tutorial
Amazon ML Getting Started Guide: https://aws.amazon.com/machine-learning/getting-started/
Amazon QuickSight: https://aws.amazon.com/quicksight/
![Page 20: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/20.jpg)
Real-Time Anomaly Detection
• Ingest data from website through API Gateway and Amazon Kinesis Streams
• Use Amazon Kinesis Analytics to produce an anomaly score for each data point and identify trends in data
• Send users and machines notifications through Amazon SNS
[Diagram: stages "Ingest clickstream data", "Detect anomalies & take action", "Notify users". Components in order: Amazon API Gateway, Amazon Kinesis Streams, Amazon Kinesis Streams, Amazon Kinesis Analytics, Lambda function, Amazon SNS topic, SMS notification, users]
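To illustrate what "an anomaly score for each data point" means, here is a small Python sketch that scores each point against a sliding window of the previous points. It uses a rolling z-score as a stand-in; it is not the algorithm Amazon Kinesis Analytics uses, and the window size, threshold, and sample data are assumptions chosen for the example.

```python
from collections import deque
from math import sqrt

def anomaly_scores(values, window=5):
    """Yield (value, score); score is |z| relative to the previous `window` points.

    Points seen before the window is full get a score of 0.0.
    """
    history = deque(maxlen=window)
    for v in values:
        if len(history) < window:
            score = 0.0
        else:
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = sqrt(var)
            score = abs(v - mean) / std if std > 0 else 0.0
        history.append(v)
        yield v, score

# Hypothetical clickstream metric, e.g. clicks per second.
clicks_per_second = [10, 11, 9, 10, 12, 11, 95, 10, 11]
THRESHOLD = 3.0  # assumed cut-off; would be tuned per workload

flagged = [v for v, s in anomaly_scores(clicks_per_second) if s > THRESHOLD]
print(flagged)
```

In the pipeline on the slide, the points above the threshold are the ones that would trigger the Lambda function and the Amazon SNS notification.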
![Page 21: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/21.jpg)
Predicting Customer Churn with Amazon ML
![Page 22: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/22.jpg)
![Page 26: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/26.jpg)
Supervised Learning
[Diagram: known historical data, shown as Input/Outcome pairs, feeds supervised learning in Amazon ML; an unseen input is then mapped to the same outcome]
![Page 27: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/27.jpg)
Amazon Machine Learning Service
![Page 31: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/31.jpg)
Telco Churn Dataset
• US telco customers, their cell phone plans and usage
• 21 attributes, 3333 rows:
  • Customer: State, Area_Code, Phone
  • Plan: Intl_Plan, VMail_Plan
  • Behavior: VMail_Messages, Day_Mins, Day_Calls, Day_Charge, Eve_Mins, Eve_Calls, Eve_Charge, Night_Mins, Night_Calls, Night_Charge, Intl_Mins, Intl_Calls, Intl_Charge
  • Other: Account_Length, CustServ_Calls, Churn
![Page 33: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/33.jpg)
Telco Churn Dataset
KS, 128, 415, 382-4657, 0, 1, 25, 265.100000, 110, 45.070000, 197.400000, 99, 16.780000, 244.700000, 91, 11.010000, 10.000000, 3, 2.700000, 1, 0
OH, 107, 415, 371-7191, 0, 1, 26, 161.600000, 123, 27.470000, 195.500000, 103, 16.620000, 254.400000, 103, 11.450000, 13.700000, 3, 3.700000, 1, 0
NJ, 137, 415, 358-1921, 0, 0, 0, 243.400000, 114, 41.380000, 121.200000, 110, 10.300000, 162.600000, 104, 7.320000, 12.200000, 5, 3.290000, 0, 0
OH, 84, 408, 375-9999, 1, 0, 0, 299.400000, 71, 50.900000, 61.900000, 88, 5.260000, 196.900000, 89, 8.860000, 6.600000, 7, 1.780000, 2, 0
OK, 75, 415, 330-6626, 1, 0, 0, 166.700000, 113, 28.340000, 148.300000, 122, 12.610000, 186.900000, 121, 8.410000, 10.100000, 3, 2.730000, 3, 0
AL, 118, 510, 391-8027, 1, 0, 0, 223.400000, 98, 37.980000, 220.600000, 101, 18.750000, 203.900000, 118, 9.180000, 6.300000, 6, 1.700000, 0, 0
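The rows above can be read into named fields with a short Python sketch. The column order below is an assumption inferred from the attribute list and the sample values (the slides do not show a header row), and only two of the rows are reproduced.

```python
import csv
import io

# Assumed column order; inferred, not stated in the source.
COLUMNS = [
    "State", "Account_Length", "Area_Code", "Phone", "Intl_Plan", "VMail_Plan",
    "VMail_Messages", "Day_Mins", "Day_Calls", "Day_Charge",
    "Eve_Mins", "Eve_Calls", "Eve_Charge",
    "Night_Mins", "Night_Calls", "Night_Charge",
    "Intl_Mins", "Intl_Calls", "Intl_Charge", "CustServ_Calls", "Churn",
]

SAMPLE = """\
KS, 128, 415, 382-4657, 0, 1, 25, 265.100000, 110, 45.070000, 197.400000, 99, 16.780000, 244.700000, 91, 11.010000, 10.000000, 3, 2.700000, 1, 0
OH, 107, 415, 371-7191, 0, 1, 26, 161.600000, 123, 27.470000, 195.500000, 103, 16.620000, 254.400000, 103, 11.450000, 13.700000, 3, 3.700000, 1, 0
"""

# skipinitialspace drops the blank after each comma in the sample rows.
reader = csv.reader(io.StringIO(SAMPLE), skipinitialspace=True)
records = [dict(zip(COLUMNS, row)) for row in reader]

for rec in records:
    print(rec["State"], rec["Account_Length"], rec["CustServ_Calls"], rec["Churn"])
```

Churn is the target attribute; every other column is a candidate input.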
![Page 34: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/34.jpg)
Console: Creating Datasource for Amazon ML
![Page 37: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/37.jpg)
Console: Building the Amazon ML Model
![Page 38: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/38.jpg)
Recipe
{
  "groups": {
    "NUMERIC_VARS_NORM": "group('Intl_Charge','Night_Calls','Day_Calls','Eve_Calls','Eve_Mins','Intl_Mins','VMail_Message','Intl_Calls','Day_Mins','Night_Mins','Day_Charge','Night_Charge','Eve_Charge','Account_Length')"
  },
  "assignments": {},
  "outputs": [
    "ALL_BINARY",
    "State",
    "Area_Code",
    "normalize(NUMERIC_VARS_NORM)",
    "CustServ_Calls"
  ]
}
![Page 39: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/39.jpg)
Recipe: normalize() function
Account_Length Normalized Value
128 0.808771865
107 -0.047574816
137 1.175777586
84 -0.985478323
75 -1.352484044
118 0.400987732
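The normalized values in this table are consistent with a z-score, (x - mean) / standard deviation, computed over just the six Account_Length values shown and using the sample standard deviation. The Python sketch below reproduces them under that assumption; it illustrates the arithmetic and is not Amazon ML's implementation.

```python
from statistics import mean, stdev

account_length = [128, 107, 137, 84, 75, 118]

mu = mean(account_length)
sigma = stdev(account_length)   # sample standard deviation (n - 1 denominator)

# Shift to zero mean, scale to unit variance.
normalized = [(x - mu) / sigma for x in account_length]

for x, z in zip(account_length, normalized):
    print(x, round(z, 9))
```

After normalization, attributes measured on very different scales (minutes, call counts, charges) contribute on a comparable footing.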
![Page 40: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/40.jpg)
Building the Amazon ML Model
![Page 41: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/41.jpg)
Cost of Errors
• Cost of Customer Churn and Acquisition (false negative):
  • foregone cashflow
  • advertising costs
  • POS and sign-up admin costs
• Customer Retention Cost (false + true positive)
  • Discounts
  • Phone upgrades
  • etc.
![Page 42: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/42.jpg)
Financial Outcome of Applying a Model
| Prior Churn | Churn Cost | Cost without ML |
| --- | --- | --- |
| 14.49% | $500.00 | $72.46 |

| False Negative | True + False Pos | Retention Cost | Cost with ML |
| --- | --- | --- | --- |
| 4.80% | 12.10% + 14.30% | $100.00 | $50.40 |

• Threshold 0.3 0.17
• $22.06 of savings per customer
• With 100,000 customers over $2MM in savings with ML
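The per-customer figures can be reproduced with a few lines of Python. Variable names are illustrative. With the rounded 14.49% churn rate the cost without ML comes to about $72.45, one cent below the slide's $72.46, which presumably uses the unrounded rate.

```python
# Figures taken from the slide.
prior_churn = 0.1449       # share of customers who churn
churn_cost = 500.00        # cost of losing and replacing one customer
retention_cost = 100.00    # cost of a retention offer

false_negative = 0.0480    # churners the model misses
true_positive = 0.1210     # churners the model flags
false_positive = 0.1430    # non-churners the model flags

# Without a model, every churner costs the full churn cost.
cost_without_ml = prior_churn * churn_cost

# With a model, missed churners cost the full churn cost, and every
# flagged customer (true or false positive) receives a retention offer.
cost_with_ml = (false_negative * churn_cost
                + (true_positive + false_positive) * retention_cost)

savings_per_customer = cost_without_ml - cost_with_ml
customers = 100000

print(round(cost_without_ml, 2))
print(round(cost_with_ml, 2))
print(round(savings_per_customer, 2))
print(round(savings_per_customer * customers))
```

Lowering the score threshold trades false negatives ($500 each) for more flagged customers ($100 each), which is why a lower threshold can reduce total cost.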
![Page 45: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/45.jpg)
What Next?
• https://aws.amazon.com/getting-started/projects/build-machine-learning-model/
• https://aws.amazon.com/machine-learning/developer-resources/
• Cost Threshold Calculation https://github.com/dbatalov/cost_based_ml
• Apache Spark on EMR https://aws.amazon.com/emr/details/spark/
• Artificial Intelligence on AWS https://aws.amazon.com/amazon-ai/
• Amazon AMIs for Deep Learning https://aws.amazon.com/amazon-ai/amis/
![Page 46: ANALYTICS ON AWS - Amazon Web Services, Inc. · Store Analyze Amazon Glacier Amazon S3 Amazon DynamoDB Amazon RDS, Amazon Aurora Big data portfolio –focus on choice AWS Data Pipeline](https://reader033.vdocument.in/reader033/viewer/2022053023/60536fe661d8c8075b26e776/html5/thumbnails/46.jpg)
THANK YOU!