The Long Road to Model Deployment · 2018-04-11
The Long Road to Model Deployment
…or how to make a good (machine learning) model great.
GTC, March 2018
Greg Heinrich
Exemplar model | A less successful trial
Same model design, same data, different parameters.
MODELS IN ACTION
Object Detector | Concorde
It is as easy as flying a Concorde.
PARAMETER TUNING
• Topology parameters
▪ Number of layers and their width
▪ Choice of activations
• Training parameters
▪ Learning rate
▪ Batch size
▪ Choice of optimizer
▪ Normalization
▪ Number of iterations
• Data parameters
▪ Spatial augmentation
▪ Color augmentation
▪ Rasterization
▪ Number of training samples
Source: Christian Kath
Hypothetical example
What is the scale of the problem?
PARAMETER TUNING
• Data
▪ 5 cars, 6 cameras, 1 month, 10 frames per minute per camera → 1.3M images
▪ 60 epochs
• Hyper-parameters
▪ 5 categorical parameters with 4 values each
▪ 5 continuous parameters
• Quasi-exhaustive (“grid”) search
▪ Explore only 3 values of each continuous parameter → 250k jobs
▪ 10ms/image → 6 thousand years (!)
• Random search
▪ May reduce total time by orders of magnitude: 50 jobs → 1.25 years
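The arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope check: every input is a number quoted on the slide, and the function and constant names are illustrative, not from the talk. It assumes each job runs on a single device and that jobs run one after another.

```python
# Back-of-the-envelope cost of the hyper-parameter search described above.
# All inputs are the figures quoted on the slide.
IMAGES = 1.3e6               # training images
EPOCHS = 60
SECONDS_PER_IMAGE = 0.010    # 10 ms/image

CATEGORICAL_PARAMS, CATEGORICAL_VALUES = 5, 4
CONTINUOUS_PARAMS, CONTINUOUS_VALUES = 5, 3   # 3 grid points per continuous parameter

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def job_seconds():
    """Time for one training job on a single device."""
    return IMAGES * EPOCHS * SECONDS_PER_IMAGE


def grid_jobs():
    """Number of points in the full grid."""
    return (CATEGORICAL_VALUES ** CATEGORICAL_PARAMS
            * CONTINUOUS_VALUES ** CONTINUOUS_PARAMS)


def total_years(n_jobs):
    """Sequential wall-clock time, in years, for n_jobs jobs."""
    return n_jobs * job_seconds() / SECONDS_PER_YEAR


print("grid jobs:", grid_jobs())
print("days per job:", job_seconds() / SECONDS_PER_DAY)
print("grid search, years:", total_years(grid_jobs()))
print("random search (50 jobs), years:", total_years(50))
```

One job takes about 9 days, so the grid of roughly 250k jobs lands near 6 thousand years, while 50 randomly chosen jobs take a little over a year.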
[Diagram: Workflow linking Model, Experiment and Dataset]
Divide & conquer
• Run the 50 jobs in parallel
• Use 8 GPUs per job
• Total time → 1 day
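A rough check of the one-day figure, as a sketch only. It assumes multi-GPU training scales perfectly linearly, which is an idealization; the `scaling_efficiency` parameter and the function name are hypothetical, added here to make that assumption explicit.

```python
import math


def wall_clock_days(n_jobs, job_days, concurrent_jobs, gpus_per_job,
                    scaling_efficiency=1.0):
    """Wall-clock days when jobs run in waves of `concurrent_jobs`,
    each job spread over `gpus_per_job` GPUs.

    scaling_efficiency=1.0 means perfect linear scaling (an assumption).
    """
    waves = math.ceil(n_jobs / concurrent_jobs)
    days_per_job = job_days / (gpus_per_job * scaling_efficiency)
    return waves * days_per_job


# Single-GPU job time from the slide's figures:
# 1.3M images x 60 epochs x 10 ms/image, about 9 days.
SINGLE_GPU_JOB_DAYS = 1.3e6 * 60 * 0.010 / 86_400

total = wall_clock_days(n_jobs=50, job_days=SINGLE_GPU_JOB_DAYS,
                        concurrent_jobs=50, gpus_per_job=8)
print(f"{total:.2f} days")
```

With all 50 jobs running at once on 8 GPUs each, the estimate comes out slightly above one day; a lower `scaling_efficiency` stretches it proportionally.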
Parallelism comes to the rescue
PARAMETER TUNING
[Diagram: steps Build Experiments, Use Datasets, Run Training, Analyze Results; components Dataset Service, Experiment Service, Workflow Manager, Training Cluster (tens of thousands of GPUs), Metrics]
One service, multiple identical workers
PARAMETER TUNING
[Diagram: each Worker talks to the Experiment/hyperopt Service and loops Get Params → Train → Evaluate → Report Metrics → Continue?]
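The worker loop can be sketched as below. Everything here is a hypothetical stand-in: `ExperimentService` is an in-memory toy rather than the service used in the talk, the parameter names and ranges are invented, and `train`/`evaluate` are placeholders that return a made-up score instead of running a real model.

```python
import math
import random


class ExperimentService:
    """Hypothetical in-memory stand-in for the experiment/hyperopt service."""

    def __init__(self, max_trials, seed=0):
        self.rng = random.Random(seed)
        self.max_trials = max_trials
        self.issued = 0
        self.results = []  # list of (params, metrics) pairs

    def get_params(self):
        """Random search: draw each parameter independently."""
        self.issued += 1
        return {
            "optimizer": self.rng.choice(["sgd", "adam", "rmsprop", "adagrad"]),
            "learning_rate": 10 ** self.rng.uniform(-5, -1),  # log-uniform
            "batch_size": self.rng.choice([16, 32, 64, 128]),
        }

    def report_metrics(self, params, metrics):
        self.results.append((params, metrics))

    def should_continue(self):
        return self.issued < self.max_trials


def train(params):
    """Placeholder for a real training run; returns a fake 'model'."""
    return {"params": params}


def evaluate(model):
    """Placeholder metric: a made-up score that peaks near learning_rate=1e-3."""
    lr = model["params"]["learning_rate"]
    return {"accuracy": 1.0 / (1.0 + abs(math.log10(lr) + 3))}


def worker(service):
    """Get Params -> Train -> Evaluate -> Report Metrics -> Continue?"""
    while True:
        params = service.get_params()
        model = train(params)
        metrics = evaluate(model)
        service.report_metrics(params, metrics)
        if not service.should_continue():
            break


service = ExperimentService(max_trials=50)
worker(service)
best_params, best_metrics = max(service.results, key=lambda r: r[1]["accuracy"])
print(best_params, best_metrics)
```

Because the workers are identical and keep no state of their own, several of them could poll the same service concurrently; this sketch runs a single worker for simplicity.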
Jupyter notebook
Collecting and analyzing results
PARAMETER TUNING
Overall parameter sensitivity | When Accuracy is over 60%
Which parameters have the most impact on metrics?
PARAMETER TUNING
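One simple way to rank parameters by impact is sketched below. The slide does not say which method produced its plots, so this is an assumption: it groups trials by parameter value (binning continuous values at hypothetical `edges`) and reports the spread between the best and worst group means. The trial data is made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def bucket(value, edges):
    """Index of the first edge that `value` does not exceed (len(edges) if none)."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


def sensitivity(trials, param, metric="accuracy", edges=None):
    """Spread (max - min) of the mean metric across the values of one parameter."""
    groups = defaultdict(list)
    for trial in trials:
        key = trial[param] if edges is None else bucket(trial[param], edges)
        groups[key].append(trial[metric])
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)


# Made-up trial results, for illustration only.
trials = [
    {"learning_rate": 1e-4, "batch_size": 16, "accuracy": 0.62},
    {"learning_rate": 1e-4, "batch_size": 64, "accuracy": 0.64},
    {"learning_rate": 1e-2, "batch_size": 16, "accuracy": 0.31},
    {"learning_rate": 1e-2, "batch_size": 64, "accuracy": 0.35},
]

ranking = sorted(
    [
        ("learning_rate", sensitivity(trials, "learning_rate", edges=[1e-3])),
        ("batch_size", sensitivity(trials, "batch_size")),
    ],
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

To mirror the second panel, the same function can be applied after filtering `trials` down to those above an accuracy threshold.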
Learning rate | Batch size
Zooming in on important parameters
PARAMETER TUNING
Eliminating underperformers
PARAMETER TUNING
Diminishing returns
• Adding data helps
• Increasing accuracy through more data is increasingly expensive
Do we need more?
DATA
Active learning
• Collecting data is relatively cheap
• Annotating data is very expensive
• Use trained model to select next frame to annotate
• Strategies:
▪ Maximum variance
▪ Maximum entropy
Selecting better data
DATA
[Diagram: Annotated dataset → Train → Trained model → Select image to annotate (from Unannotated dataset) → Annotate → Annotated dataset]
See: Adam Lesnikowsky’s talk on Deep Active Learning
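The maximum-entropy strategy can be sketched as follows. This is a minimal illustration, assuming the trained model outputs a class-probability vector per unannotated frame; the frame ids and probabilities are invented, and the function names are not from the talk.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, k=1):
    """Return the ids of the k unannotated frames with the highest predictive entropy.

    `predictions` maps a frame id to the model's class probabilities for that frame.
    """
    ranked = sorted(predictions, key=lambda fid: entropy(predictions[fid]),
                    reverse=True)
    return ranked[:k]


# Made-up model outputs on three unannotated frames.
predictions = {
    "frame_001": [0.98, 0.01, 0.01],   # confident
    "frame_002": [0.40, 0.35, 0.25],   # most uncertain
    "frame_003": [0.70, 0.20, 0.10],
}

chosen = select_for_annotation(predictions, k=1)
print(chosen)
```

The frame the model is least sure about is sent to annotation first, so the expensive labeling effort goes where the current model is expected to learn the most.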
Inference time
• Using DL framework to run inference
• DrivePX2 platform
• Time/frame (excluding data loading): 73ms
• 6 cameras → one prediction every 438ms
Great accuracy, slow response time
SELECTING BEST MODEL
Unpruned network | Pruned network
Pruning unimportant neurons
REDUCING MODEL COMPLEXITY
Unpruned: 12 neurons, 32 connections
Pruned: 11 neurons, 24 connections
Pruning implementation
REDUCING MODEL COMPLEXITY
Selecting neurons to prune
• Exhaustive search
• Random
• Minimum weight
• Minimum activations
• Gradient based
Workflow
Train → Prune → Re-train
Results
• Min weight method, single pass
• 83% of weights removed
• Inference time: 73ms → 26ms
• 2.7x speed-up
See: Jose Alvarez’s talk on Model Compression
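The minimum-weight criterion can be sketched on a single dense layer as below. This is an assumption-laden toy, not the implementation from the talk: it scores each neuron by the L1 norm of its incoming weights, and the layer, its weights and the function names are invented for illustration.

```python
def l1_norm(weights):
    """Sum of absolute values of a neuron's incoming weights."""
    return sum(abs(w) for w in weights)


def prune_min_weight(layer, fraction):
    """Drop the `fraction` of neurons with the smallest L1 norm of incoming weights.

    `layer` maps a neuron name to its list of incoming weights.
    Returns (pruned_layer, removed_names).
    """
    n_remove = int(len(layer) * fraction)
    by_norm = sorted(layer, key=lambda name: l1_norm(layer[name]))
    removed = set(by_norm[:n_remove])
    kept = {name: w for name, w in layer.items() if name not in removed}
    return kept, sorted(removed)


# Made-up layer with four neurons and three inputs each.
layer = {
    "n0": [0.90, -0.70, 0.50],
    "n1": [0.01, 0.02, -0.01],
    "n2": [-0.40, 0.60, 0.30],
    "n3": [0.05, -0.03, 0.02],
}

pruned, removed = prune_min_weight(layer, fraction=0.5)
print(removed, list(pruned))
```

In a real network, removing a neuron also removes its outgoing connections in the next layer, and the pruned model is then re-trained to recover accuracy, as in the Train → Prune → Re-train workflow.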
TensorRT conversion
• Inference-specific optimizations
• Platform-specific optimizations
Inference optimization
TENSORRT
[Diagram: Trained Neural Network → Import Model → TensorRT Optimizer → Serialize Engine → Optimized Plans (Plan 1, Plan 2, Plan 3)]
Results (FP32 precision)
• 26ms → 8.5ms/frame
• 3x speed-up
TensorRT conversion
• Store weights and activations in 8 bits
• Accumulations in FP16
INT8 precision
TENSORRT
Results (INT8 precision)
• 8.5ms → 3.9ms/frame
• 2.2x speed-up
• Even greater speed-up with Tensor Cores on Xavier SoC!
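The per-frame times quoted across the talk can be put side by side with a short sketch. The stage labels are paraphrased and the helper name is illustrative; the six-camera cycle time assumes cameras are processed one after another, as the earlier 6 × 73ms = 438ms figure implies.

```python
# Per-frame inference times quoted in the talk, in milliseconds.
stages = [
    ("DL framework", 73.0),
    ("pruned", 26.0),
    ("TensorRT FP32", 8.5),
    ("TensorRT reduced precision", 3.9),
]
CAMERAS = 6


def summarize(stages, cameras):
    """Per-stage cycle time over all cameras and speed-up relative to the first stage."""
    baseline = stages[0][1]
    return [
        {
            "stage": name,
            "ms_per_frame": ms,
            "ms_per_cycle": ms * cameras,          # sequential processing assumed
            "speedup_vs_baseline": baseline / ms,
        }
        for name, ms in stages
    ]


rows = summarize(stages, CAMERAS)
for row in rows:
    print(f'{row["stage"]:28s} {row["ms_per_frame"]:5.1f} ms/frame '
          f'{row["ms_per_cycle"]:6.1f} ms/cycle '
          f'{row["speedup_vs_baseline"]:5.1f}x')
```

Stacking pruning and both TensorRT steps takes the six-camera cycle from 438ms to a few tens of milliseconds.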
[Diagram: Dataset, Pre-processor, Pre-processed images, INT8 calibrator, INT8 cal file, Train, Model, TensorRT Optimizer, TensorRT engine, Evaluator, Metrics. Legend: Process, Artifact]
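What calibration produces can be illustrated with a small sketch of symmetric 8-bit quantization. It uses max-abs calibration, a deliberately simple stand-in; TensorRT's own calibrators may choose the scale differently. The sample activations and function names are made up.

```python
def calibrate_scale(samples):
    """Max-abs calibration: map the largest observed magnitude to 127."""
    return max(abs(x) for x in samples) / 127.0


def quantize(values, scale):
    """Symmetric quantization to signed 8-bit integers, clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(quantized, scale):
    """Map 8-bit integers back to approximate real values."""
    return [q * scale for q in quantized]


# Made-up activations collected while running pre-processed images through the model.
calibration_activations = [0.02, -1.27, 0.64, 0.9, -0.33]
scale = calibrate_scale(calibration_activations)

q = quantize([0.5, -1.27, 2.0], scale)
print(q)
print(dequantize(q, scale))
```

Values inside the calibrated range survive with an error of at most half a quantization step, while values outside it (2.0 here) saturate, which is why the calibration set should be representative of real inputs.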
Automated workflows enable traceability
AUTOMATION
[Diagram: inputs Code (SCM), Data, Config; stages Build, Data Loader, Train (several in parallel), Prune, Re-Train, Pick best model, TensorRT, Evaluate; Knowledge base; Traceability firewall]
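One way traceability could be recorded is sketched below: each result is stored together with the code revision, data version and configuration that produced it. The record layout, field names and example values are all hypothetical; the talk does not describe the format of its knowledge base.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def provenance_record(code_revision, data_version, config, metrics):
    """Bundle a result with everything needed to reproduce it."""
    return {
        "code_revision": code_revision,   # e.g. an SCM commit id
        "data_version": data_version,
        "config": config,
        "config_hash": fingerprint(config),
        "metrics": metrics,
    }


knowledge_base = []

# Made-up values, for illustration only.
record = provenance_record(
    code_revision="abc1234",
    data_version="dataset-v1",
    config={"learning_rate": 1e-3, "batch_size": 32},
    metrics={"accuracy": 0.9},
)
knowledge_base.append(record)
print(record["config_hash"])
```

Hashing the configuration with sorted keys gives the same fingerprint regardless of key order, so identical experiments can be recognized when they are submitted again.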
Q&A
For more information, contact: Poonam Chitale ([email protected]), NVIDIA AV Perception Infrastructure Product Manager