DeePar: A Hybrid Device-Edge-Cloud Execution Framework for Mobile Deep Learning Applications
Yutao Huang1, Feng Wang2, Fangxin Wang1, Jiangchuan Liu1
1 School of Computing Science, Simon Fraser University, Canada
2 Department of Computer and Information Science, The University of Mississippi, USA
2019/04/29
Outline
Introduction
− Today’s challenges for mobile deep learning applications
DeePar: Layer-level Partitioning Optimization for DNNs
− Enabling layer-level partitioning optimization for DNN inference
− Scheduling tasks for optimized total delay
− Experiment and simulation results
Conclusion
Introduction
AI is changing our lives
• Face recognition
• Machine translation
• Driverless cars
• Self-service supermarkets
The rise of mobile deep learning applications
• Apple Siri
• Face ID
• Uber routing
• Migraine Buddy
Models are getting larger:
• AlexNet (2012): 8 layers, ~16% error
• VGG (2014): 19 layers, ~7.5% error
• ResNet (2015): 152 layers, ~3.5% error
Cloud: the current solution for mobile ML apps
• Amazon instances with GPU computing
Disadvantages of cloud computing
• Huge volume of Internet traffic
• Limited network bandwidth
• High-latency response
• Security issues
Edge Computing:
Advantages of edge computing
• Real-time or near-real-time reaction
• Lower operating costs
• Reduced core network traffic
• Improved application performance
Edge-assisted learning
Mobile device, network edge, remote cloud
• For each task, which edge server should be assigned to receive the offloaded data and computation?
• Which part of the task should be processed on the edge?
• How should resources be allocated to each task?
Motivation
AlexNet layer-level performance
AlexNet performance comparison
DeePar: A collaborative execution approach
DeePar framework for a single task
Online multi-task scheduling
Mobile devices $i \in I$, network edge servers $e \in E$, remote cloud $C$
• Network resource indicator (from the device to the edge): $x_{ie}$
• Edge computation resource indicator: $z_{ie}$
• Network resource indicator (from the edge to the cloud): $y_{ie}$
Constraints
• Bandwidth constraint (device to edge): $\sum_{i \in I} b_i^1 x_{ie} \le B_e^1$
• Computation resource constraint: $\sum_{i \in I} z_{ie} \le r_e$
• Bandwidth constraint (edge to cloud): $\sum_{i \in I} b_i^2 y_{ie} \le B_e^2$
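The three constraints above can be checked mechanically for a candidate assignment. The following is a minimal sketch, not the authors' implementation: it assumes that x and y are 0/1 indicators, that z is the amount of edge computation resource granted to a task, and that each constraint is summed over all tasks per edge server. The function name `is_feasible` and all numbers are hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (device i, edge server e)


def is_feasible(
    x: Dict[Pair, int],    # x[i, e] = 1 if device i sends data to edge e (assumed binary)
    y: Dict[Pair, int],    # y[i, e] = 1 if task i goes on from edge e to the cloud (assumed binary)
    z: Dict[Pair, float],  # z[i, e] = computation resource granted to task i on edge e
    b1: Dict[str, float],  # b1[i] = device-to-edge bandwidth demand of task i
    b2: Dict[str, float],  # b2[i] = edge-to-cloud bandwidth demand of task i
    B1: Dict[str, float],  # B1[e] = device-to-edge bandwidth capacity of edge e
    B2: Dict[str, float],  # B2[e] = edge-to-cloud bandwidth capacity of edge e
    r: Dict[str, float],   # r[e]  = computation capacity of edge e
) -> bool:
    """Return True if every edge server respects all three capacity limits."""
    for e in r:
        uplink = sum(b1[i] * x.get((i, e), 0) for i in b1)
        compute = sum(z.get((i, e), 0.0) for i in b1)
        backhaul = sum(b2[i] * y.get((i, e), 0) for i in b1)
        if uplink > B1[e] or compute > r[e] or backhaul > B2[e]:
            return False
    return True


# Made-up numbers: two tasks "a" and "b", one edge server "e1".
b1 = {"a": 4.0, "b": 3.0}
b2 = {"a": 1.0, "b": 2.0}
B1, B2, r = {"e1": 6.0}, {"e1": 5.0}, {"e1": 1.0}

only_a = {("a", "e1"): 1}
both = {("a", "e1"): 1, ("b", "e1"): 1}
z_a = {("a", "e1"): 0.5}
z_both = {("a", "e1"): 0.5, ("b", "e1"): 0.5}

print(is_feasible(only_a, only_a, z_a, b1, b2, B1, B2, r))  # task a alone fits
print(is_feasible(both, both, z_both, b1, b2, B1, B2, r))   # uplink 4 + 3 exceeds 6
```

In this toy setting the second assignment fails only on the device-to-edge bandwidth limit, which is the kind of conflict a scheduler has to resolve by moving a task to another edge server or postponing it.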
Objective
Target: minimizing the total execution delay
• Computation delay on the device, edge server, and cloud
• Data transmission delay between the device and the edge server, and between the edge server and the cloud
• Final result transmission delay (from the cloud to the device)
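For a single task, the delay terms listed above can be added up for every pair of cut points in a layer chain. The sketch below is an illustration under stated assumptions rather than the DeePar algorithm itself: it assumes a purely sequential DNN, known per-layer compute times on each tier, a fixed result-return delay, and exhaustive search over cut pairs. The names `total_delay` and `best_double_partition` and all numbers are hypothetical.

```python
from typing import List, Tuple


def total_delay(p: int, q: int, dev: List[float], edge: List[float],
                cloud: List[float], data: List[float],
                bw_de: float, bw_ec: float, result_delay: float) -> float:
    """Delay when layers [0, p) run on the device, [p, q) on the edge, [q, n) in the cloud.

    data[k] is the size of the tensor entering layer k (data[0] is the raw input).
    """
    n = len(dev)
    delay = sum(dev[:p]) + sum(edge[p:q]) + sum(cloud[q:])  # computation delay
    if p < n:   # something is offloaded: device-to-edge transfer plus result return
        delay += data[p] / bw_de + result_delay
    if q < n:   # something runs in the cloud: edge-to-cloud transfer
        delay += data[q] / bw_ec
    return delay


def best_double_partition(dev, edge, cloud, data, bw_de, bw_ec,
                          result_delay) -> Tuple[float, int, int]:
    """Try every cut pair 0 <= p <= q <= n and return (delay, p, q) with minimal delay."""
    n = len(dev)
    return min(
        (total_delay(p, q, dev, edge, cloud, data, bw_de, bw_ec, result_delay), p, q)
        for p in range(n + 1)
        for q in range(p, n + 1)
    )


# Made-up profile of a 3-layer network (times in ms, sizes in MB, bandwidth in MB/ms).
dev = [4.0, 6.0, 8.0]
edge = [2.0, 2.0, 4.0]
cloud = [1.0, 1.0, 1.0]
data = [10.0, 2.0, 1.0, 0.1]
bw_de, bw_ec, result_delay = 2.0, 0.5, 0.5

delay, p, q = best_double_partition(dev, edge, cloud, data, bw_de, bw_ec, result_delay)
print(f"device runs layers [0,{p}), edge [{p},{q}), cloud [{q},3): {delay} ms")
```

With these toy numbers the best split places the first layer on the device, the second on the edge, and the third in the cloud, and it beats the device-only (p = q = 3), edge-only (p = 0, q = 3), and cloud-only (p = q = 0) special cases. Exhaustive search is quadratic in the number of layers, which is affordable for chains the size of AlexNet or VGG.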
Delayed-start strategy
[Figure: task timeline along a Time axis, labeled "Delayed-start" and "Shorter delay"]
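The slide gives no further detail on the strategy, so the following toy comparison only illustrates one plausible reading: a task that waits for a busy but fast resource may still finish earlier than one that starts at once on a slower fallback. This interpretation, the function name `immediate_vs_delayed`, and the numbers are assumptions, not taken from the talk.

```python
from typing import Tuple


def immediate_vs_delayed(arrival: float, edge_free_at: float,
                         fast_time: float, slow_time: float) -> Tuple[str, float]:
    """Compare two options for one task and return (choice, delay since arrival).

    immediate: start at arrival on a slower fallback that takes slow_time
    delayed:   wait until the fast resource is free, then take fast_time
    """
    immediate_finish = arrival + slow_time
    delayed_finish = max(arrival, edge_free_at) + fast_time
    if delayed_finish < immediate_finish:
        return "delayed", delayed_finish - arrival
    return "immediate", immediate_finish - arrival


# Made-up case: the fast resource frees up 2 s after the task arrives.
print(immediate_vs_delayed(arrival=0.0, edge_free_at=2.0, fast_time=5.0, slow_time=10.0))
# Made-up case: the wait is too long, so starting immediately wins.
print(immediate_vs_delayed(arrival=0.0, edge_free_at=8.0, fast_time=5.0, slow_time=10.0))
```

In the first case waiting 2 s and then running for 5 s gives a 7 s delay instead of 10 s; in the second the 8 s wait makes the immediate start the better choice.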
Single-task experiment
• DeepFace
• VGG-16
• LeNet
Multi-task simulation
• Time interval within 300 s, 10 edge servers
• Time interval within 50 s, 10 edge servers
• Time interval within 50 s, 40 edge servers
Conclusion
We propose DeePar, a double-partition layer-level neural network partitioning optimization framework for edge inference tasks.
We formulate a multi-task scheduling problem for DeePar and propose an online algorithm with a delayed-start strategy.
Experiments and simulations show that DeePar outperforms device-only, edge-only, and cloud-only execution, reducing delay by 20% to 80%.