Learning Relaxed Belady for CDN Caching

Post on 16-Oct-2021


Learning Relaxed Belady for CDN Caching

COS 316: Principles of Computer System Design, Lecture 10

Amit Levy & Wyatt Lloyd

Edge Cache with Different Algos

• Clairvoyant (Bélády) shows we can do much better!

[Figure: comparison of caching algorithms at an edge cache; axis and legend text did not survive extraction, the only legible label is "Infinite Cache"]

Cutting Edge Research From Princeton!

Learning Relaxed Belady for Content Distribution Network Caching.

Zhenyu Song, Daniel S. Berger, Kai Li, and Wyatt Lloyd.

In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20), February 2020.

CDN Caching Goal: Minimize WAN Traffic

[Diagram: a User sends Requests to the Edge cache; each request is a Hit or a Miss]

Wide Area Network (WAN) traffic is expensive

Key metric: hit ratio

Caching Remains Challenging

Heuristic-based algorithms (1965–): LRU, LFU, GDSF, ARC, ...
● Work well for some workloads, but poorly for others

ML-based adaptation of heuristics (2017–): UCB, LeCaR, ...
● Also work well for some workloads, but poorly for others

The Belady algorithm (1966)
● Offline optimal: requires future knowledge
● Large gap in miss ratio between state-of-the-art and Belady: 20–40% on production traces
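To make the gap between a heuristic and Belady concrete, here is a minimal sketch that counts misses for LRU and for Belady on a toy trace. It assumes unit-size objects and a capacity counted in objects; the function names and the trace are made up for illustration and do not come from the lecture.

```python
from collections import OrderedDict

def lru_misses(trace, capacity):
    """Count misses for an LRU cache holding `capacity` unit-size objects."""
    cache = OrderedDict()
    misses = 0
    for obj in trace:
        if obj in cache:
            cache.move_to_end(obj)          # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict the least recently used object
            cache[obj] = True
    return misses

def belady_misses(trace, capacity):
    """Count misses for Belady: evict the object whose next access is farthest away."""
    # next_use[i] = index of the next request for trace[i], or infinity if there is none
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    cache = {}                              # object -> index of its next request
    misses = 0
    for i, obj in enumerate(trace):
        if obj not in cache:
            misses += 1
            if len(cache) >= capacity:
                victim = max(cache, key=cache.get)   # farthest next access
                del cache[victim]
        cache[obj] = next_use[i]            # requires future knowledge (offline only)
    return misses

if __name__ == "__main__":
    trace = list("ABCABCABC")               # toy cyclic trace, hypothetical
    print("LRU misses:   ", lru_misses(trace, 2))
    print("Belady misses:", belady_misses(trace, 2))
```

A cyclic trace slightly larger than the cache is a worst case for LRU, which is why the difference shows up even on nine requests.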

Introducing Learning Relaxed Belady (LRB)

New approach: mimic Belady using machine learning

● Machine-Learning-for-Systems (ML-for-Systems)
  ○ Enabling technologies
  ○ When does it make sense?

General Overview of Our Approach

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

Challenge 1: Past Information

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

More data improves training but increases memory overhead

Challenge 2: Generate Online Training Data

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

Generate online training data?

Challenge 3: ML Architecture

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

Generate online training data?

What ML architecture to select?

Large design space: features, model, prediction target, loss function

Challenge 4: Eviction Candidates

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

How to select eviction candidates?

What past information to use?

Generate online training data?

What ML architecture to select?

Solution: Relaxed Belady Algorithm

How to select eviction candidates?

What past information to use?

Generate online training data?

What ML architecture to select?

Relaxed Belady algorithm

Challenge: Hard to Mimic Belady Algorithm

Belady: evict object with next access farthest in the future

[Diagram: Cache (now) holds objects A, B, C, D; ordered by time to next request they appear as D, B, A, C; an "Evict" marker is shown]

Mimicking exact Belady is impractical
● Need predictions for all objects → prohibitive computational cost
● Need exact prediction of next access → predictions further into the future are harder

Introducing the Relaxed Belady Algorithm

Observation: many objects are good candidates for eviction

Relaxed Belady evicts a random object beyond the boundary

[Diagram: Cache (now) holds objects A, B, C, D; ordered by time to next request they appear as D, B, A, C; markers labeled "Belady boundary" and "Evict"]

● Do not need predictions for all objects → reasonable computation
● No need to differentiate beyond boundary → simplifies the prediction
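The relaxed rule can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: `next_access` stands in for oracle knowledge of each cached object's next request time, `boundary` is expressed as a distance from `now`, and the fallback to exact Belady when nothing lies beyond the boundary is an assumption of this sketch.

```python
import random

def relaxed_belady_victim(next_access, now, boundary, rng=random):
    """Pick an eviction victim among cached objects.

    next_access: dict mapping object -> time of its next request (oracle knowledge)
    now:         current time
    boundary:    Belady boundary, as a distance from `now`
    """
    beyond = [obj for obj, t in next_access.items() if t - now > boundary]
    if beyond:
        # Any object beyond the boundary is an equally good victim
        return rng.choice(beyond)
    # Assumed fallback: nothing is beyond the boundary, so use exact Belady
    return max(next_access, key=next_access.get)

if __name__ == "__main__":
    # Hypothetical next-request times for the four cached objects
    next_access = {"A": 30, "B": 12, "C": 55, "D": 5}
    print(relaxed_belady_victim(next_access, now=0, boundary=20))
```

Because the choice among objects beyond the boundary is random, a predictor only has to answer "beyond or not" rather than rank every object exactly.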

Challenge 1: Past Information

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

More data improves training but increases memory overhead

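Track Objects within a Sliding Memory Window

[Diagram: request stream (R R R ... up to Now) with a sliding memory window over the most recent requests; per-object features are kept for objects inside the window]

Sliding memory window mimics Belady boundary

Only track objects within memory window

Window size is LRB's main hyperparameter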
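One way to picture the sliding memory window is a structure that forgets an object once its last request is older than the window. The sketch below assumes logical time (one tick per request) and a window measured in requests; the class and method names are hypothetical.

```python
from collections import OrderedDict

class MemoryWindow:
    """Tracks only objects requested within the last `window` requests."""

    def __init__(self, window):
        self.window = window
        self.now = 0                       # logical time: one tick per request
        self.last_seen = OrderedDict()     # object -> time of last request, oldest first

    def on_request(self, obj):
        self.now += 1
        self.last_seen[obj] = self.now
        self.last_seen.move_to_end(obj)    # most recently requested object goes last
        # Forget objects whose last request has slid out of the window
        while self.last_seen:
            oldest, t = next(iter(self.last_seen.items()))
            if self.now - t < self.window:
                break
            del self.last_seen[oldest]

    def tracked(self):
        """Objects currently inside the window, oldest first."""
        return list(self.last_seen)

if __name__ == "__main__":
    w = MemoryWindow(window=3)
    for obj in "ABCDA":
        w.on_request(obj)
    print(w.tracked())
```

A larger window keeps metadata for more objects, which is the memory versus training-data trade-off the slides describe.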

Challenge 2: Training Data

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

Generate online training data?

Sample Training Data & Label on Access or Boundary

[Diagram: per-object features inside the sliding memory window are sampled into unlabeled training data; an entry becomes labeled training data on its next access or once it is past the memory window]
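The labeling rule in this slide can be sketched as follows. Several details are assumptions of the sketch rather than facts from the slides: requests are sampled with a fixed probability (the slide samples from the window), time is logical, and an object that passes the window without being requested is given the label `2 * window` to mark it as "beyond the boundary". All names are hypothetical.

```python
import random

class TrainingSampler:
    """Samples feature snapshots and labels them on access or at the window boundary."""

    def __init__(self, window, sample_prob=0.5, seed=0):
        self.window = window
        self.sample_prob = sample_prob
        self.rng = random.Random(seed)
        self.now = 0
        self.pending = {}    # object -> (features, time sampled): unlabeled data
        self.labeled = []    # (features, label) pairs ready for training

    def on_request(self, obj, features):
        self.now += 1
        # Label on access: the true time to next request is now known
        if obj in self.pending:
            feats, t = self.pending.pop(obj)
            self.labeled.append((feats, self.now - t))
        # Label on boundary: not requested for a full window (linear scan for brevity)
        expired = [o for o, (_, t) in self.pending.items() if self.now - t >= self.window]
        for o in expired:
            feats, _ = self.pending.pop(o)
            self.labeled.append((feats, 2 * self.window))   # assumed "far away" label
        # Sample this request as new unlabeled training data
        if self.rng.random() < self.sample_prob:
            self.pending[obj] = (features, self.now)

if __name__ == "__main__":
    s = TrainingSampler(window=3, sample_prob=1.0)
    for obj in "ABACDE":
        s.on_request(obj, features="f:" + obj)
    print(s.labeled)
```

The point of the second rule is that labels never have to wait longer than one window, so training data stays available online.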

Challenge 3: ML Architecture

Large potential design space

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

What past information to use?

Generate online training data?

What ML architecture to select?

Solution 3: Feature & Model Selection

Gradient boosting decision trees

Lightweight & high good decision ratio

Training ~300 ms, prediction ~30 µs

Features

Object size

Object type

Inter-request distances (recency)

Exponential decay counters (long-term frequencies)
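As an illustration of the two dynamic features, the sketch below keeps the most recent inter-request distances and a few exponentially decayed counters per object. The number of distances, the half-lives, the `MISSING` sentinel, and all names are assumptions chosen for the example, not values from the lecture.

```python
MISSING = -1.0   # hypothetical sentinel for inter-request distances not yet observed

class ObjectFeatures:
    """Per-object features: size, type, recent inter-request distances, decay counters."""

    def __init__(self, size, obj_type, now, n_deltas=4, half_lives=(16, 256, 4096)):
        self.size = size
        self.obj_type = obj_type
        self.last_request = now
        self.n_deltas = n_deltas
        self.deltas = []                       # most recent inter-request distance first
        self.half_lives = half_lives
        self.edcs = [1.0] * len(half_lives)    # one counter per time scale

    def on_request(self, now):
        delta = now - self.last_request
        self.last_request = now
        self.deltas = ([delta] + self.deltas)[: self.n_deltas]
        # Decay each counter by the elapsed time, then count this request
        self.edcs = [1.0 + c * 2.0 ** (-delta / h)
                     for c, h in zip(self.edcs, self.half_lives)]

    def vector(self, now):
        """Flat feature vector suitable as input to a tree model."""
        padded = self.deltas + [MISSING] * (self.n_deltas - len(self.deltas))
        age = now - self.last_request
        return [self.size, self.obj_type, age] + padded + self.edcs

if __name__ == "__main__":
    f = ObjectFeatures(size=100, obj_type=1, now=0, n_deltas=2, half_lives=(4,))
    f.on_request(4)
    f.on_request(6)
    print(f.vector(now=7))
```

Short half-lives react to bursts while long ones approximate long-term frequency, so a small set of counters covers several time scales at constant memory per object.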

Use good decision ratio to evaluate new designs
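The slides do not spell out how the good decision ratio is computed. A plausible reading, given the relaxed Belady slides, is the fraction of evictions whose victim is next requested beyond the Belady boundary. The sketch below computes the metric under that assumption; the function name and input format are hypothetical.

```python
def good_decision_ratio(evictions, boundary):
    """Fraction of evictions whose victim is next requested beyond the boundary.

    evictions: list of (eviction_time, next_request_time), where next_request_time
               is None if the object is never requested again
    boundary:  Belady boundary, as a distance from the eviction time
    """
    if not evictions:
        return 0.0
    good = sum(
        1
        for t_evict, t_next in evictions
        if t_next is None or t_next - t_evict > boundary
    )
    return good / len(evictions)

if __name__ == "__main__":
    # Hypothetical eviction log from a simulator run
    log = [(10, 200), (20, 25), (30, None), (40, 60)]
    print(good_decision_ratio(log, boundary=50))
```

Because the metric only needs a trace and the eviction log, it can be evaluated offline for a candidate design without running a full cache simulation to convergence.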

Challenge 4: Eviction Candidates

[Diagram: request stream (R R R ... up to Now) feeding the Cache; components labeled Past information, Training data, ML architecture, Eviction candidates]

How to select eviction candidates?

What past information to use?

Generate online training data?

What ML architecture to select?

Solution 4: Random Sampling for Eviction

Can mimic relaxed Belady if we can find 1 object beyond the boundary

k = 64 candidates; more does not improve good decision ratio

[Diagram: request stream (R R R ... up to Now) feeding the Cache; k random candidates are drawn from the cache using past information]
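Sampled eviction can be sketched as: draw k random cached objects, predict the time to next request for each, and evict the one predicted farthest away. In this sketch `predict_time_to_next` is a stand-in for the trained model and `cached_objects` is assumed to be a list; both names are hypothetical.

```python
import random

def choose_victim(cached_objects, predict_time_to_next, k=64, rng=random):
    """Sample up to k cached objects and return the one predicted to be requested last.

    cached_objects:       list of objects currently in the cache
    predict_time_to_next: callable mapping an object to a predicted time to next request
    """
    candidates = rng.sample(cached_objects, min(k, len(cached_objects)))
    return max(candidates, key=predict_time_to_next)

if __name__ == "__main__":
    objects = list(range(1000))
    # Toy predictor for illustration only: larger id means farther next request
    victim = choose_victim(objects, predict_time_to_next=lambda o: o,
                           k=64, rng=random.Random(1))
    print(victim)
```

If a fraction p of cached objects lies beyond the boundary, the chance that at least one of k uniform samples does is 1 - (1 - p)^k; for a hypothetical p = 0.1 and k = 64 that is already above 99%, which is consistent with larger k bringing no further benefit.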

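Learning Relaxed Belady

[Diagram: requests (R R R ... up to Now) enter the Cache; objects within the memory window are sampled into an unlabeled dataset, which is labeled to form the labeled dataset; the labeled dataset trains the model; at eviction time, sampled eviction candidates are passed to the model, which predicts, and one candidate is evicted]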

Implementation

● Simulator implementation
  ○ LRB + 14 other algorithms

● Prototype implementation
  ○ C++ on top of production system (Apache Traffic Server)
  ○ Many optimizations

Evaluation Setup

● Q1: Learning Relaxed Belady (LRB) traffic reduction vs. state of the art

● Q2: Overhead of LRB vs. CDN production system

● Traces: 6 production traces from 3 CDNs

● Hyperparameters (memory window / model / ...) tuned on 20% of the trace

LRB Reduces WAN Traffic

20% traffic reduction over B-LRU

10% reduction over the best SOA

[Figure labels: Wikipedia trace; Industry standard]

LRB Consistently Improves on the State of the Art

[Figure panels: Wikipedia, CDN-A1, CDN-A2, CDN-B1, CDN-B2, CDN-B3]

LRB Overhead Is Modest

Throughput: 11.7 Gbps vs 11.7 Gbps (unmodified)

Memory overhead = 1–3% of cache size

Peak CPU: 16% vs 9% (unmodified)

Conclusion

[Diagram labels: User, Requests, Edge cache]

● LRB reduces WAN traffic with modest overhead

● ML-for-systems generally promising to replace heuristics

● Key insight: relaxed Belady
  → Simplifies machine learning & reduces system overhead