© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Deploying and managing machine learning models at scale
AIM348
Sireesha Muppala
AI/ML Specialist SA
Amazon Web Services
Nitin Wagh
Sr. BDM, Amazon SageMaker
Amazon Web Services
Our awesome support experts
AIM348
Arun Nagarajan
Sr. SDE, AI Platforms
Kiran Bakshi
Consultant, ProServ
Piyush Bothra
Sr. Solutions Architect
Workshop map
Related Breakouts
AIM307 - Amazon SageMaker deep dive: A modular solution for ML
AIM311 - Choose the right instance type in Amazon SageMaker
AIM318 - Amazon SageMaker: Automatically tune hyperparameters
AIM306 - How to build high-performance machine learning solutions at low cost
Workshop map
https://tinyurl.com/w5wu595
Workshop map
You are a data scientist at Media Company
• Build music recommendation model for customers
• Dataset provides user purchase/listening patterns
• Develop model monitoring solution to ensure it is up-to-date
• Build movie recommendation model
• Dataset provides user purchase/viewing patterns
You are an engineer responsible for deployment at Media Company
• Deploy models as real-time endpoints at scale
• Set up model drift detection pipeline that triggers training if required
• Save cost and efficiently run a large number of models
Workshop map
Amazon SageMaker at re:Invent 2019
Build, Train, Deploy Machine Learning Models Quickly at Scale
Amazon SageMaker
• Ground Truth
• Algorithms & Frameworks
• Notebooks
• Training & Tuning
• Deployment & Hosting
• RL
• ML Marketplace
• Neo
NEW!
• SageMaker Studio
• Quick-start Notebooks (Preview)
• Experiments
• Debugger
• Autopilot
• Model Monitor
• Processing
Model deployment in SageMaker - Overview
Model deployment in SageMaker – Key features
Model deployment – Security and compliance
Amazon Confidential – Do not share or distribute
Deploying a model is not the end. You need to continuously monitor models in production and iterate.
• Concept drift due to divergence of data
+ Model performance can change due to unknown factors
+ Continuous monitoring involves a lot of tooling and expense
= Model monitoring is cumbersome but critical
Introducing Amazon SageMaker Model Monitor
Continuous monitoring of models in production
• Automatic data collection
• Continuous monitoring
• CloudWatch integration
• Visual data analysis
• Flexibility with rules
Workshop map
Deploy trained model (XGBoost movie recommendation model)
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications]
Enable data capture for Amazon SageMaker Endpoint
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; data captured: requests, predictions]
Create Endpoint with data capture enabled
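As a minimal sketch of what enabling data capture can look like, the helper below builds the `DataCaptureConfig` structure that the SageMaker `CreateEndpointConfig` API accepts. The bucket name, config name, and helper name are hypothetical, and the boto3 call is left commented out so the snippet runs without AWS credentials.

```python
def build_data_capture_config(destination_s3_uri, sampling_percentage=100):
    """Return a DataCaptureConfig dict for the CreateEndpointConfig API."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        # Capture both the request payload and the model response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


# Hypothetical destination; replace with your own bucket and prefix.
capture_config = build_data_capture_config("s3://example-bucket/datacapture")

# With credentials in place, the dict would be passed along these lines:
# import boto3
# boto3.client("sagemaker").create_endpoint_config(
#     EndpointConfigName="example-config",
#     ProductionVariants=[...],
#     DataCaptureConfig=capture_config,
# )
```

Lowering `sampling_percentage` is one way to limit storage cost on high-traffic endpoints, at the price of a sparser monitoring sample.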
s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl
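The path layout above can be turned into a prefix for listing one hour of captured files. This is a sketch only: the function name, bucket, endpoint, and variant values are hypothetical, and it assumes the hour component is expressed in UTC.

```python
from datetime import datetime, timezone


def capture_prefix(destination_prefix, endpoint_name, variant_name, when):
    """Build the S3 prefix holding capture files for the hour containing `when`."""
    hour_path = when.strftime("%Y/%m/%d/%H")
    return f"{destination_prefix.rstrip('/')}/{endpoint_name}/{variant_name}/{hour_path}/"


prefix = capture_prefix(
    "s3://example-bucket/datacapture",  # hypothetical destination prefix
    "example-endpoint",                 # hypothetical endpoint name
    "AllTraffic",                       # hypothetical variant name
    datetime(2019, 12, 3, 17, tzinfo=timezone.utc),
)
print(prefix)
```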
Data captured from SageMaker Endpoint
Example of collected prediction request and response
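The slide's example record is not reproduced in this transcript. As a hedged illustration, the sketch below parses one JSON Lines record, assuming a layout with `captureData.endpointInput` and `captureData.endpointOutput` entries that carry CSV-encoded `data`. The sample values are invented for illustration and the field layout should be checked against a real capture file.

```python
import json

# Illustrative record only; the field layout is an assumption, not taken from the slide.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {"observedContentType": "text/csv", "mode": "INPUT",
                          "data": "1,0,42.5", "encoding": "CSV"},
        "endpointOutput": {"observedContentType": "text/csv", "mode": "OUTPUT",
                           "data": "0.87", "encoding": "CSV"},
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2019-12-03T17:00:00Z"},
})


def parse_capture_line(line):
    """Return (features, prediction) from one captured JSON Lines record."""
    capture = json.loads(line)["captureData"]
    features = [float(v) for v in capture["endpointInput"]["data"].split(",")]
    prediction = float(capture["endpointOutput"]["data"])
    return features, prediction


features, prediction = parse_capture_line(sample_line)
```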
Workshop map
2. Run predictions and view captured data
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; view captured data]
Workshop map
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; analyze baseline results]
Generate baseline: Create a ProcessingJob
Generate baseline: Under the hood ProcessingJob
Baseline results – Statistics
Baseline results – Constraints (suggested)
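To make the idea of baseline statistics and suggested constraints concrete, here is a conceptual stand-in in plain Python. It is not the ProcessingJob that Model Monitor runs: the function name and the output keys are hypothetical, chosen only to show how per-feature statistics can lead to constraint suggestions.

```python
import statistics


def baseline_feature(values):
    """Compute simple statistics and suggest constraints for one numeric feature.

    `values` may contain None for missing entries.
    """
    present = [v for v in values if v is not None]
    stats = {
        "num_present": len(present),
        "num_missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "min": min(present),
        "max": max(present),
    }
    suggested = {
        # Fraction of non-missing values seen in the baseline data.
        "completeness": len(present) / len(values),
        # Suggest a sign constraint only if the baseline never went negative.
        "is_non_negative": min(present) >= 0,
    }
    return stats, suggested


stats, suggested = baseline_feature([3.0, 4.0, None, 5.0])
```

Suggestions like these are a starting point; a reviewer would normally loosen or tighten them before they are used for monitoring.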
Workshop map
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; results: statistics and violations]
MonitoringSchedule Job
Model monitoring: Under the hood
ProcessingJob
Monitoring Schedule Execution Summary
MonitoringSchedule execution: constraint_violations.json
MonitoringSchedule execution: Violation sample
{ "violations": [{
    "feature_name" : "string",
    "constraint_check_type" :
          "data_type_check"
        | "completeness_check"
        | "baseline_drift_check"
        | "missing_column_check"
        | "extra_column_check"
        | "categorical_values_check",
    "description" : "string"
  }]
}
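A report with the structure above can be summarized by check type. The sketch below is one way to do that; the sample violations and feature names are invented for illustration.

```python
import json
from collections import defaultdict


def group_violations(report_text):
    """Map each constraint_check_type to the list of affected feature names."""
    report = json.loads(report_text)
    grouped = defaultdict(list)
    for violation in report.get("violations", []):
        grouped[violation["constraint_check_type"]].append(violation["feature_name"])
    return dict(grouped)


# Illustrative content for a constraint_violations.json file.
sample_report = json.dumps({"violations": [
    {"feature_name": "feature_a", "constraint_check_type": "baseline_drift_check",
     "description": "illustrative"},
    {"feature_name": "feature_b", "constraint_check_type": "data_type_check",
     "description": "illustrative"},
    {"feature_name": "feature_c", "constraint_check_type": "baseline_drift_check",
     "description": "illustrative"},
]})

grouped = group_violations(sample_report)
```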
MonitoringSchedule execution: Violation types
CloudWatch metrics
/aws/sagemaker/Endpoints/data-metric namespace with EndpointName and ScheduleName dimensions
For numerical fields:
Metric: Max → query for MetricName: feature_data_{feature_name}, Stat: Max
Metric: Min → query for MetricName: feature_data_{feature_name}, Stat: Min
Metric: Sum → query for MetricName: feature_data_{feature_name}, Stat: Sum
Metric: SampleCount → query for MetricName: feature_data_{feature_name}, Stat: SampleCount
Metric: Average → query for MetricName: feature_data_{feature_name}, Stat: Average
For both numerical and string fields:
Metric: Completeness → query for MetricName: feature_non_null_{feature_name}, Stat: Sum
Metric: BaselineDrift → query for MetricName: feature_baseline_drift_{feature_name}, Stat: Sum
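The metric naming scheme above can be mapped to arguments for the CloudWatch `GetMetricStatistics` API. This sketch takes the namespace and dimension names exactly as written on the slide, so verify them against the metrics that actually appear in your account. The helper name and sample values are hypothetical. Note that the CloudWatch API spells the Max and Min statistics as `Maximum` and `Minimum`.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "/aws/sagemaker/Endpoints/data-metric"  # as written on the slide

# Slide metric label -> CloudWatch statistic name, for numerical fields.
NUMERIC_STATS = {"Max": "Maximum", "Min": "Minimum", "Sum": "Sum",
                 "SampleCount": "SampleCount", "Average": "Average"}


def metric_query(feature_name, metric, endpoint_name, schedule_name, end_time, hours=24):
    """Build keyword arguments for cloudwatch.get_metric_statistics."""
    if metric in NUMERIC_STATS:
        metric_name, stat = f"feature_data_{feature_name}", NUMERIC_STATS[metric]
    elif metric == "Completeness":
        metric_name, stat = f"feature_non_null_{feature_name}", "Sum"
    elif metric == "BaselineDrift":
        metric_name, stat = f"feature_baseline_drift_{feature_name}", "Sum"
    else:
        raise ValueError(f"unknown metric: {metric}")
    return {
        "Namespace": NAMESPACE,
        "MetricName": metric_name,
        "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name},
                       {"Name": "ScheduleName", "Value": schedule_name}],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": 3600,  # one datapoint per hour
        "Statistics": [stat],
    }


query = metric_query("age", "Max", "example-endpoint", "example-schedule",
                     datetime(2019, 12, 3, tzinfo=timezone.utc))
# import boto3
# boto3.client("cloudwatch").get_metric_statistics(**query)
```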
Workshop map
Alert and automate the training trigger
[Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; results: statistics and violations; Amazon CloudWatch metrics; notifications; analysis of results]
• Model updates
• Training data updates
• Retraining
MonitoringSchedule execution: CloudWatch Alarms
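One way to alarm on drift is to build arguments for the CloudWatch `PutMetricAlarm` API on a `feature_baseline_drift_{feature_name}` metric. The sketch below assumes the namespace and dimension names given earlier in this deck; the alarm name pattern, threshold, and topic ARN are hypothetical.

```python
def drift_alarm(feature_name, endpoint_name, schedule_name, threshold, topic_arn):
    """Build keyword arguments for cloudwatch.put_metric_alarm on a drift metric."""
    return {
        "AlarmName": f"{endpoint_name}-{feature_name}-baseline-drift",
        "Namespace": "/aws/sagemaker/Endpoints/data-metric",
        "MetricName": f"feature_baseline_drift_{feature_name}",
        "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name},
                       {"Name": "ScheduleName", "Value": schedule_name}],
        "Statistic": "Sum",
        "Period": 3600,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # Notify an SNS topic when the alarm fires.
        "AlarmActions": [topic_arn],
    }


alarm = drift_alarm("age", "example-endpoint", "example-schedule", 0.1,
                    "arn:aws:sns:us-east-1:123456789012:example-topic")
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```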
Take corrective action: Retrigger model training
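A retraining trigger can be sketched as an AWS Lambda handler subscribed to the SNS topic that receives alarm notifications. The code below only decides whether to retrain; the `start_training` callback is a hypothetical placeholder for whatever launches the training job, and the sample event is trimmed to the fields the handler reads.

```python
import json


def should_retrain(sns_event):
    """Return True if any SNS record carries a CloudWatch alarm in the ALARM state."""
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            return True
    return False


def handle(event, start_training):
    """Invoke the start_training callback when the event signals drift."""
    if should_retrain(event):
        start_training()
        return True
    return False


# Trimmed illustrative event; real SNS events carry more fields.
sample_event = {"Records": [{"Sns": {"Message": json.dumps(
    {"AlarmName": "example-alarm", "NewStateValue": "ALARM"})}}]}

calls = []
triggered = handle(sample_event, lambda: calls.append("training-requested"))
```

Keeping the decision logic separate from the training call makes it easy to test the trigger without touching AWS.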
Workshop map
Number of models can add up quickly …
• Price a house
• Find regulatory violations (USA, Brazil, Singapore, …)
• Next best action (C-000001, C-000002, C-945821, …)
Multi-model endpoints
Flexible cost savings as number of models scale
Separate endpoints: EP-1 (Model 1), EP-2 (Model 2), …, EP-10 (Model 10)
Multi-model endpoint: EP (Model 1, Model 2, …, Model 10)
Sample scenario: ml.c5.xlarge, $0.238/hr, 2 instances running 24/7
10 separate endpoints: $3,430/mo
1 multi-model endpoint: $343/mo
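The sample scenario's arithmetic can be reproduced as follows, assuming a 30-day month. The slide's figures appear to round one endpoint to $343 and multiply by ten to get $3,430; the unrounded values are slightly lower.

```python
HOURLY_RATE = 0.238        # ml.c5.xlarge, from the slide
INSTANCES = 2              # instances per endpoint, from the slide
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month


def monthly_cost(endpoints):
    """Monthly cost in dollars for the given number of always-on endpoints."""
    return endpoints * INSTANCES * HOURLY_RATE * HOURS_PER_MONTH


one_endpoint = monthly_cost(1)
ten_endpoints = monthly_cost(10)
print(f"1 multi-model endpoint: ${one_endpoint:,.2f}/mo")
print(f"10 separate endpoints:  ${ten_endpoints:,.2f}/mo")
```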
Multi-Model Endpoints
Mode:
Artifact location: s3://bucket/your-endpoint-models/
[Diagram: Amazon SageMaker multi-model endpoint serves predict requests and loads models from S3 model storage: new_york.tar.gz, texas.tar.gz, florida.tar.gz, nevada.tar.gz]
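As a sketch of how this looks through the API, the helpers below build a container definition in `MultiModel` mode that points at the shared S3 prefix, plus arguments for `InvokeEndpoint` that select one artifact with `TargetModel`. The image URI, endpoint name, and payload are hypothetical; the `Mode` value comes from the SageMaker API rather than from the slide.

```python
def multi_model_container(image_uri, model_prefix):
    """Container definition for create_model, serving all artifacts under a prefix."""
    return {"Image": image_uri, "Mode": "MultiModel", "ModelDataUrl": model_prefix}


def invoke_args(endpoint_name, target_model, csv_payload):
    """Keyword arguments for sagemaker-runtime invoke_endpoint on one target model."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "text/csv",
        # Artifact path relative to the prefix given in ModelDataUrl.
        "TargetModel": target_model,
        "Body": csv_payload,
    }


container = multi_model_container(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",  # hypothetical
    "s3://bucket/your-endpoint-models/",
)
request = invoke_args("example-endpoint", "new_york.tar.gz", "3,2,1500")
# import boto3
# boto3.client("sagemaker-runtime").invoke_endpoint(**request)
```

Adding a model then amounts to copying a new archive under the same S3 prefix and naming it in `TargetModel`.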
multi-model-endpoint
Thank you!
Sireesha Muppala
AI/ML Specialist SA
Amazon Web Services
Nitin Wagh
Sr. BDM, Amazon SageMaker
Amazon Web Services