learning relaxed belady for cdn caching
TRANSCRIPT
Learning Relaxed Belady for CDN Caching
COS 316: Principles of Computer System Design, Lecture 10
Amit Levy & Wyatt Lloyd
Edge Cache with Different Algorithms
• Clairvoyant (Bélády) shows we can do much better!
[Figure: edge cache performance under different algorithms, with an "Infinite Cache" reference]
Cutting-Edge Research from Princeton!
Learning Relaxed Belady for Content Distribution Network Caching.
Zhenyu Song, Daniel S. Berger, Kai Li, and Wyatt Lloyd.
In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20), February 2020.
CDN Caching Goal: Minimize WAN Traffic
[Diagram: User → Requests → Edge cache; each request is a Hit or a Miss]
Wide Area Network (WAN) traffic is expensive
Key metric: hit ratio
Caching Remains Challenging
Heuristic-based algorithms (1965–): LRU, LFU, GDSF, ARC, ...
● Work well for some workloads, but poorly for others
ML-based adaptation of heuristics (2017–): UCB, LeCAR, ...
● Also work well for some workloads, but poorly for others
The Belady algorithm (1966)
● Offline optimal: requires future knowledge
● Large gap in miss ratio between the state of the art and Belady: 20–40% on production traces
Introducing Learning Relaxed Belady (LRB)
New approach: mimic Belady using machine learning
● Machine-Learning-for-Systems (ML-for-Systems)
○ Enabling technologies
○ When does it make sense?
General Overview of Our Approach
[Diagram: a stream of requests (R R R ...) reaches the cache at "Now"; four design components are labeled: past information, training data, ML architecture, eviction candidates]
Challenge 1: Past Information
What past information to use?
More data improves training but increases memory overhead
Challenge 2: Generate Online Training Data
What past information to use?
Generate online training data?
Challenge 3: ML Architecture
What past information to use?
Generate online training data?
What ML architecture to select?
Large design space: features, model, prediction target, loss function
Challenge 4: Eviction Candidates
How to select eviction candidates?
What past information to use?
Generate online training data?
What ML architecture to select?
Solution: Relaxed Belady Algorithm
How to select eviction candidates?
What past information to use?
Generate online training data?
What ML architecture to select?
→ Relaxed Belady algorithm
Challenge: Hard to Mimic Belady Algorithm
Mimicking exact Belady is impractical
● Needs predictions for all objects → prohibitive computational cost
● Needs exact prediction of next access → predictions further into the future are harder
Belady: evict the object whose next access is farthest in the future
[Diagram: cache (now) holds A, B, C, D; ordered by time to next request: D, B, A, C; Belady evicts the object farthest in the future]
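The Belady rule stated on this slide can be sketched in a few lines. This is only an illustration: the function name `belady_victim` and the representation of future knowledge as a plain list are assumptions made for the example, and the request order reuses the slide's D, B, A, C layout.

```python
def belady_victim(cache, future_requests):
    """Return the cached object whose next request is farthest in the future.

    `future_requests` is the (offline) list of upcoming requests in order;
    objects that never appear again are treated as infinitely far away.
    """
    def time_to_next_request(obj):
        try:
            return future_requests.index(obj)
        except ValueError:
            return float("inf")  # never requested again

    return max(cache, key=time_to_next_request)


cache = ["A", "B", "C", "D"]
future = ["D", "B", "A", "C"]  # order of next requests, as in the slide
print(belady_victim(cache, future))  # C
```

The sketch makes the two costs visible: it needs future knowledge, and it scores every cached object on every eviction.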
Introducing the Relaxed Belady Algorithm
Observation: many objects are good candidates for eviction
Relaxed Belady evicts a random object beyond the Belady boundary
[Diagram: cache (now) holds A, B, C, D; ordered by time to next request: D, B, A, C; a "Belady boundary" marks the objects eligible for eviction]
● Do not need predictions for all objects → reasonable computation
● No need to differentiate beyond the boundary → simplifies the prediction
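A minimal sketch of the relaxed rule, assuming next-request times are known (as in the offline setting). The names `relaxed_belady_victim`, `next_request_time`, and `boundary` are hypothetical, and the fallback to plain Belady when nothing lies beyond the boundary is an assumption of this sketch, not something the slide states.

```python
import random


def relaxed_belady_victim(next_request_time, now, boundary, rng=random):
    """Pick a random object whose next request lies beyond the boundary.

    next_request_time: dict mapping object -> absolute time of its next request.
    boundary: the Belady boundary, as a distance into the future from `now`.
    """
    beyond = [obj for obj, t in next_request_time.items() if t - now > boundary]
    if beyond:
        # Any object beyond the boundary is an equally good eviction choice.
        return rng.choice(beyond)
    # Assumed fallback: nothing is beyond the boundary, so use plain Belady.
    return max(next_request_time, key=next_request_time.get)


times = {"A": 5, "B": 12, "C": 30, "D": 40}
print(relaxed_belady_victim(times, now=0, boundary=20))  # "C" or "D"
```

Because every object beyond the boundary is treated the same, a predictor only has to decide which side of the boundary an object falls on.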
Challenge 1: Past Information
What past information to use?
More data improves training but increases memory overhead
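Track Objects within a Sliding Memory Window
[Diagram: request stream (R R R ...) up to "Now", with a sliding memory window over recent requests and per-object features kept for objects inside it]
Sliding memory window mimics the Belady boundary
Only track objects within the memory window
Window size is LRB’s main hyperparameter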
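One way to picture the sliding memory window is a structure that forgets any object not requested within the last `window_size` requests. The class below is a sketch under that assumption; `MemoryWindow`, its methods, and the use of a logical request counter as time are all hypothetical choices for illustration.

```python
from collections import OrderedDict


class MemoryWindow:
    """Track only objects requested within the last `window_size` time units."""

    def __init__(self, window_size):
        self.window_size = window_size
        # object -> time of last request, ordered from oldest to newest
        self.last_seen = OrderedDict()

    def on_request(self, obj, now):
        """Record a request and return the objects that fell out of the window."""
        self.last_seen.pop(obj, None)  # move obj to the newest position
        self.last_seen[obj] = now
        expired = []
        while self.last_seen:
            oldest, t = next(iter(self.last_seen.items()))
            if now - t < self.window_size:
                break
            self.last_seen.popitem(last=False)  # drop the oldest entry
            expired.append(oldest)
        return expired


window = MemoryWindow(window_size=3)
for now, obj in enumerate(["A", "B", "C", "B"]):
    print(obj, window.on_request(obj, now))
```

A larger window keeps metadata for more objects, which is the training-versus-memory trade-off named under Challenge 1.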
Challenge 2: Training Data
What past information to use?
Generate online training data?
Sample Training Data & Label on Access or Boundary
[Diagram: objects in the sliding memory window are sampled, with their per-object features, into unlabeled training data; an entry becomes labeled training data on access or once it is past the memory window]
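The labeling step can be sketched as follows. This is a simplified illustration, not the LRB implementation: the class and method names are hypothetical, ages are measured from the time of sampling, and the label given to entries that pass the window (twice the window size by default) is an assumed placeholder meaning "beyond the boundary".

```python
class TrainingDataGenerator:
    """Turn sampled objects into labeled (features, time-to-next-request) pairs."""

    def __init__(self, window_size, beyond_label=None):
        self.window_size = window_size
        # Assumption: any value larger than the window marks "beyond the boundary".
        self.beyond_label = 2 * window_size if beyond_label is None else beyond_label
        self.unlabeled = {}  # object -> (features, sample_time)
        self.labeled = []    # list of (features, label)

    def sample(self, obj, features, now):
        """Snapshot an object's features; the label is not known yet."""
        self.unlabeled[obj] = (features, now)

    def on_request(self, obj, now):
        # Label on access: the time to the next request is now known.
        if obj in self.unlabeled:
            features, t = self.unlabeled.pop(obj)
            self.labeled.append((features, now - t))
        # Label on boundary: entries older than the window get the sentinel label.
        stale = [o for o, (_, t) in self.unlabeled.items()
                 if now - t >= self.window_size]
        for o in stale:
            features, _ = self.unlabeled.pop(o)
            self.labeled.append((features, self.beyond_label))


gen = TrainingDataGenerator(window_size=10)
gen.sample("A", {"size": 1}, now=0)
gen.sample("B", {"size": 2}, now=1)
gen.on_request("A", now=4)   # A labeled on access with 4
gen.on_request("C", now=11)  # B passes the window and gets the sentinel label
print(gen.labeled)
```

Labeling on the boundary means every sample is eventually labeled, even for objects that are never requested again.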
Challenge 3: ML Architecture
Large potential design space
What past information to use?
Generate online training data?
What ML architecture to select?
Solution 3: Feature & Model Selection
Gradient boosting decision trees
Lightweight, with a high good decision ratio
Training ~300 ms, prediction ~30 µs
Features
Object size
Object type
Inter-request distances (recency)
Exponential decay counters (long-term frequencies)
Use good decision ratio to evaluate new designs
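To make the feature list concrete, here is a sketch of per-object state that maintains recent inter-request distances and exponential decay counters. The class name, the number of distances kept, the half-life values, and the exact decay formula are illustrative assumptions, not the parameters used by LRB.

```python
from collections import deque


class ObjectFeatures:
    """Per-object features: size, type, recent deltas, decayed counters."""

    def __init__(self, size, obj_type, now, max_deltas=4, half_lives=(16, 256)):
        self.size = size
        self.obj_type = obj_type
        self.last_request = now
        self.deltas = deque(maxlen=max_deltas)  # most recent distance first
        self.half_lives = half_lives            # hypothetical decay horizons
        self.edc = [1.0] * len(half_lives)      # one counter per horizon

    def on_request(self, now):
        delta = now - self.last_request
        self.last_request = now
        self.deltas.appendleft(delta)  # the oldest distance falls off the end
        # Decay each counter by the elapsed time, then count this request.
        self.edc = [1.0 + c * 2.0 ** (-delta / h)
                    for c, h in zip(self.edc, self.half_lives)]

    def vector(self, now, missing=-1.0):
        """Flatten into a fixed-length feature vector for a tree model."""
        pad = [missing] * (self.deltas.maxlen - len(self.deltas))
        age = now - self.last_request
        return [self.size, self.obj_type, age] + list(self.deltas) + pad + self.edc


f = ObjectFeatures(size=100, obj_type=1, now=0, max_deltas=2, half_lives=(16,))
f.on_request(16)
print(f.vector(now=20))  # [100, 1, 4, 16, -1.0, 1.5]
```

Short half-lives make a counter track recent bursts, while long ones approximate long-term frequency.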
Challenge 4: Eviction Candidates
How to select eviction candidates?
What past information to use?
Generate online training data?
What ML architecture to select?
Solution 4: Random Sampling for Eviction
Can mimic relaxed Belady if we can find 1 object beyond the boundary
k = 64 candidates; more does not improve good decision ratio
[Diagram: k random candidates are drawn from the cache at "Now"]
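The sampling step can be sketched as below. The function name `choose_victim` and the `predict_time_to_next_request` callback are hypothetical stand-ins for the trained model; only the value k = 64 comes from the slide.

```python
import random


def choose_victim(cached_objects, predict_time_to_next_request, k=64, rng=random):
    """Sample k random candidates and evict the one predicted farthest away.

    cached_objects: a list of objects currently in the cache.
    predict_time_to_next_request: callable standing in for the trained model.
    """
    candidates = rng.sample(cached_objects, min(k, len(cached_objects)))
    return max(candidates, key=predict_time_to_next_request)


cached = list(range(1000))
# Toy predictor for illustration: pretend the object id is the predicted time.
victim = choose_victim(cached, lambda obj: obj, k=64)
print(victim)
```

Only k predictions are made per eviction, which keeps the cost independent of the number of cached objects.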
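Learning Relaxed Belady
[Diagram: requests in the memory window are sampled into an unlabeled dataset, labeled into a labeled dataset, and used to train the model; at eviction time, candidates are sampled from the cache, the model predicts, and one candidate is evicted]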
Implementation
● Simulator implementation
○ LRB + 14 other algorithms
● Prototype implementation
○ C++ on top of production system (Apache Traffic Server)
○ Many optimizations
Evaluation Setup
● Q1: Learning Relaxed Belady (LRB) traffic reduction vs. state of the art
● Q2: Overhead of LRB vs. CDN production system
● Traces: 6 production traces from 3 CDNs
● Hyperparameters (memory window / model / ...) tuned on 20% of the trace
LRB Reduces WAN Traffic
20% traffic reduction over B-LRU
10% reduction over the best state of the art (SOA)
[Figure: Wikipedia trace; annotation: "Industry standard"]
LRB Consistently Improves on the State of the Art
[Figure panels: Wikipedia, CDN-A1, CDN-A2, CDN-B1, CDN-B2, CDN-B3]
LRB Overhead Is Modest
Throughput: 11.7 Gbps vs. 11.7 Gbps (unmodified)
Memory overhead = 1–3% of cache size
Peak CPU: 16% vs. 9% (unmodified)
Conclusion
● LRB reduces WAN traffic with modest overhead
● ML-for-systems is generally promising for replacing heuristics
● Key insight: relaxed Belady
→ Simplifies machine learning & reduces system overhead