Page 1: An Imitation Learning Approach for Cache Replacement

Evan Z. Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, Junwhan Ahn

An Imitation Learning Approach for Cache Replacement

Page 2: An Imitation Learning Approach for Cache Replacement

The Need for Faster Compute

(https://openai.com/blog/ai-and-compute/)

Small cache improvements can make large differences! (Beckman, 2019)
● E.g., a 1% cache hit rate improvement → a 35% decrease in latency (Cidon et al., 2016)

Caches are everywhere:
● CPU chips
● Operating systems
● Databases
● Web applications

Our goal: Faster applications via better cache replacement policies

Page 3: An Imitation Learning Approach for Cache Replacement

TL;DR:

I. We approximate the optimal cache replacement policy by (implicitly) predicting the future

II. Caching is an attractive benchmark for the general reinforcement learning / imitation learning communities

Page 4: An Imitation Learning Approach for Cache Replacement

Cache Replacement

[Diagram: a cache holding lines A, B, C receives an access stream (D, A, ...). Hits are served ~100x faster than misses; on a miss, one cached line must be evicted to make room for the new line.]

Goal: Evict the cache lines to maximize cache hits
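To make the setup concrete, here is a minimal sketch (not from the slides; all names are illustrative) of a fully associative cache simulator with a pluggable replacement policy, which the policies on the later slides can drop into:

# Hypothetical sketch: a tiny fully associative cache simulator that asks a
# pluggable policy which line to evict on a miss and reports the hit rate.
def simulate(accesses, capacity, evict_fn):
    """evict_fn(cache, t, accesses) -> the cached line to evict at access t."""
    cache, hits = [], 0
    for t, line in enumerate(accesses):
        if line in cache:
            hits += 1                      # hit: served ~100x faster than a miss
        else:
            if len(cache) >= capacity:     # miss on a full cache: evict one line
                cache.remove(evict_fn(cache, t, accesses))
            cache.append(line)             # then admit the new line
    return hits / len(accesses)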

Page 5: An Imitation Learning Approach for Cache Replacement

Cache Replacement

[Diagram: the same cache and access stream. Evicting a line that is about to be accessed again turns that next access into another miss, a mistake.]

Page 6: An Imitation Learning Approach for Cache Replacement

Cache Replacement

[Diagram: the same cache and access stream. Evicting the line whose next use is farthest in the future avoids the extra miss, the optimal decision.]

Page 7: An Imitation Learning Approach for Cache Replacement

Cache Replacement

[Diagram: cache contents A, B, C, with each line's next use marked in the access stream.]

Reuse distance dt(line): the number of accesses from access t until the line is reused. In the example, d0(A) = 1, d0(B) > 2, d0(C) = 2.

Optimal Policy (Belady’s): Evict the line with the greatest reuse distance (Belady, 1966)
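As a concrete (illustrative) sketch of Belady's under this definition, assuming the simulate/evict_fn interface from the earlier sketch:

# Hypothetical sketch: Belady's oracle; note that it needs the *future* accesses.
def reuse_distance(line, t, accesses):
    # Number of accesses after access t until `line` is reused (infinite if never).
    for d, future_line in enumerate(accesses[t + 1:], start=1):
        if future_line == line:
            return d
    return float("inf")

def belady_evict(cache, t, accesses):
    # Evict the cached line whose next use is farthest in the future.
    return max(cache, key=lambda line: reuse_distance(line, t, accesses))

On the slide's example (cache A, B, C with d0(A) = 1, d0(C) = 2, d0(B) > 2), belady_evict would pick B.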

Page 8: An Imitation Learning Approach for Cache Replacement

Belady’s Requires Future Information

Reuse distance dt(line): number of accesses from access t until the line is reused

Problem: Computing reuse distance requires knowing the future

So in practice, we use heuristics, e.g.:
● Least-recently used (LRU)
● Most-recently used (MRU)

… but these perform poorly on complex access patterns
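For contrast, the two heuristics above expressed in the same illustrative interface (real hardware approximates recency with per-set metadata rather than scanning the trace):

# Hypothetical sketch: recency-based heuristics that need no future information.
def last_use(line, t, accesses):
    # Most recent access index of `line` before access t (every cached line has one).
    return max(i for i in range(t) if accesses[i] == line)

def lru_evict(cache, t, accesses):
    return min(cache, key=lambda line: last_use(line, t, accesses))   # least recently used

def mru_evict(cache, t, accesses):
    return max(cache, key=lambda line: last_use(line, t, accesses))   # most recently used

For example, simulate(accesses, capacity, lru_evict) and simulate(accesses, capacity, belady_evict) can then be compared directly on the same trace.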

Page 9: An Imitation Learning Approach for Cache Replacement

Leveraging Belady’s

Idea: approximate Belady’s from past accesses

[Diagram: at training time, the learned model sees only the past accesses and the current access, while Belady’s sees the future accesses; the model’s predicted decision is trained to match Belady’s optimal decision.]
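A sketch of the training-data generation this picture implies (illustrative; the state features and model are described on the following slides): run Belady's offline over a trace and record a (state, optimal eviction) pair at every miss, for supervised imitation.

# Hypothetical sketch: label each eviction decision in a trace with Belady's choice.
def make_imitation_dataset(accesses, capacity, history=16):
    cache, examples = [], []
    for t, line in enumerate(accesses):
        if line in cache:
            continue                                           # hits need no eviction decision
        if len(cache) >= capacity:
            state = (tuple(accesses[max(0, t - history):t]),   # past accesses
                     line,                                      # current access
                     tuple(cache))                              # current cache contents
            target = belady_evict(cache, t, accesses)           # optimal decision
            examples.append((state, target))
            cache.remove(target)
        cache.append(line)
    return examples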

Page 10: An Imitation Learning Approach for Cache Replacement

Prior Work

Hawkeye / Glider, the current state of the art (Shi et al., ’19; Jain et al., ’18)

[Diagram: a learned predictor reads the past accesses and the current access and classifies the current line as cache friendly or cache averse; it is trained on Belady’s. A traditional algorithm then combines this prediction with the current cache state to decide which line X to evict.]

Page 11: An Imitation Learning Approach for Cache Replacement

Prior Work

Hawkeye / Glider, the current state of the art (Shi et al., ’19; Jain et al., ’18)

[Same diagram as the previous slide.]

+ binary classification is relatively easy to learn
- the traditional algorithm can’t express the optimal policy

Page 12: An Imitation Learning Approach for Cache Replacement

Our Approach

Our contribution: directly approximate Belady’s via imitation learning

[Diagram: our proposal replaces the two-stage pipeline with a single model that reads the past accesses, the current access, and the current cache state and directly outputs "Evict line X"; it is trained on Belady’s. Shown for comparison against the Hawkeye / Glider pipeline (Shi et al., ’19; Jain et al., ’18) from the previous slides.]

Page 13: An Imitation Learning Approach for Cache Replacement

Cache Replacement Markov Decision Process

[Diagram: the cache-replacement loop (cache contents, access stream, hits/misses, evictions) framed as a Markov decision process.]

Similar to Wang et al., 2019

Page 14: An Imitation Learning Approach for Cache Replacement

Cache Replacement Markov Decision Process

[Diagram: the state consists of the past accesses, the current access, and the current cache contents.]

Similar to Wang et al., 2019

Page 15: An Imitation Learning Approach for Cache Replacement

Cache Replacement Markov Decision Process

[Diagram: the same MDP, continued: each access either hits or misses in the cache.]

Similar to Wang et al., 2019

Page 16: An Imitation Learning Approach for Cache Replacement

Cache Replacement Markov Decision Process

[Diagram: the same MDP, continued: on a miss, the action is choosing which cached line to evict.]

Similar to Wang et al., 2019
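A minimal sketch of that MDP with illustrative names (the paper's exact state features and reward are design choices; this just mirrors the slides): the state bundles the access history and cache contents, the action is which line to evict on a miss, and a cache hit gives reward.

# Hypothetical sketch of one MDP transition for cache replacement.
from collections import namedtuple

State = namedtuple("State", ["past_accesses", "current_access", "cache"])

def transition(state, evicted_line):
    cache = list(state.cache)
    if state.current_access in cache:
        reward = 1                               # cache hit: no eviction needed
    else:
        reward = 0                               # cache miss
        if evicted_line is not None:
            cache.remove(evicted_line)           # apply the eviction action
        cache.append(state.current_access)
    return cache, reward                          # the next access then forms the next state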

Page 17: An Imitation Learning Approach for Cache Replacement

Leveraging the Optimal Policy

Typical imitation learning setting (Pomerleau, 1991; Ross et al., 2011; Kim et al., 2013):

[Diagram: the learned policy maps each state to an action and is optimized, e.g. with a supervised loss, to match the optimal action, so that it approximates the optimal policy.]

Observation: Not all errors are equally bad
● Learning from the optimal policy yields greater training signal

Concretely: minimize a ranking loss
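One way to read "minimize a ranking loss" (an illustrative sketch, not the paper's exact loss): score every cached line for eviction and require the line Belady's evicts to out-score the rest by a margin, so near-misses in the ranking are penalized less than gross errors.

import numpy as np

# Hypothetical sketch: pairwise hinge ranking loss over per-line eviction scores.
def ranking_loss(scores, oracle_index, margin=1.0):
    # scores: predicted eviction score for each cached line (1-D array).
    # oracle_index: index of the line Belady's would evict.
    diffs = scores[oracle_index] - np.delete(scores, oracle_index)
    return np.maximum(0.0, margin - diffs).sum()   # 0 only if the oracle's line wins by >= margin

For example, ranking_loss(np.array([0.2, 1.5, 0.3]), oracle_index=1) is 0.0, while swapping the scores of lines 0 and 1 gives a large penalty.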

Page 18: An Imitation Learning Approach for Cache Replacement

Reuse Distance as an Auxiliary Task

Observation: predicting reuse distance is correlated with cache replacement
● Cast this as an auxiliary task (Jaderberg et al., 2016)

[Diagram: the state st is encoded into a state embedding, which feeds both the eviction policy and a reuse-distance prediction; both terms contribute to the loss.]
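A sketch of the two-headed architecture this slide describes (illustrative layer sizes and losses, not the paper's exact model), in PyTorch:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: a shared state embedding feeding an eviction-policy head
# and a reuse-distance head used purely as an auxiliary training signal.
class ReplacementNet(nn.Module):
    def __init__(self, state_dim, num_lines, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, num_lines)   # scores: which line to evict
        self.reuse_head = nn.Linear(hidden, num_lines)    # predicted (log) reuse distance per line

    def forward(self, state):
        h = self.encoder(state)
        return self.policy_head(h), self.reuse_head(h)

def loss_fn(model, state, oracle_evict, log_reuse, aux_weight=0.5):
    logits, pred_reuse = model(state)
    policy_loss = F.cross_entropy(logits, oracle_evict)   # imitate Belady's (or the ranking loss above)
    aux_loss = F.mse_loss(pred_reuse, log_reuse)           # auxiliary reuse-distance regression
    return policy_loss + aux_weight * aux_loss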

Page 19: An Imitation Learning Approach for Cache Replacement

Results

[Bar chart: cache-hit rates per approach, with the LRU cache-hit rate and the optimal (Belady’s) cache-hit rate shown for reference.]

~19% cache-hit rate increase over Glider (Shi et al., 2019) on memory-intensive SPEC2006 applications (Jaleel et al., 2009)

~64% cache-hit rate increase over LRU on Google Web Search

Page 20: An Imitation Learning Approach for Cache Replacement

A Note on Practicality

This work: Establish a proof-of-concept

[Diagram: per-byte address embedding: each byte of the address (e.g., 0x C5 A1 12 ...) is embedded separately, and a linear layer combines the byte embeddings into the address embedding.]

Per-byte address embedding
● Reduce embedding size from 100MB to <10KB
● ~6% cache-hit rate increase on SPEC2006 vs. Glider
● ~59% cache-hit rate increase on Google Web Search vs. LRU
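A sketch of the per-byte embedding trick (illustrative sizes): rather than one embedding row per distinct address, which is what balloons to ~100MB, each byte of the address indexes a small shared table and a linear layer mixes the byte embeddings.

import torch
import torch.nn as nn

# Hypothetical sketch: a 64-bit address is split into 8 bytes; the table has
# 256 rows regardless of how many distinct addresses appear in the trace.
class PerByteAddressEmbedding(nn.Module):
    def __init__(self, dim=64, num_bytes=8):
        super().__init__()
        self.num_bytes = num_bytes
        self.byte_table = nn.Embedding(256, dim)            # a few KB of parameters
        self.mix = nn.Linear(num_bytes * dim, dim)           # combine the byte embeddings

    def forward(self, addresses):                             # addresses: int64 tensor, shape [batch]
        shifts = torch.arange(self.num_bytes, device=addresses.device) * 8
        bytes_ = torch.bitwise_right_shift(addresses.unsqueeze(-1), shifts) & 0xFF
        emb = self.byte_table(bytes_)                          # [batch, num_bytes, dim]
        return self.mix(emb.flatten(start_dim=1))              # [batch, dim] address embedding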

Page 21: An Imitation Learning Approach for Cache Replacement

A Note on Practicality

[Same per-byte address embedding diagram and results as the previous slide.]

Future work: Production-ready learned policies
● Smaller models via distillation (Hinton et al., 2015), pruning (Janowsky, 1989; Han et al., 2015; Sze et al., 2017), or quantization
● Target domains with longer latency and larger caches (e.g., software caches)

Page 22: An Imitation Learning Approach for Cache Replacement

A New Imitation / Reinforcement Learning Benchmark

● Game playing (Bellemare et al., 2012; Silver et al., 2017; OpenAI, 2019; Vinyals et al., 2019): + plentiful data, - delayed real-world utility
● Robotics (Levine et al., 2016; Lillicrap et al., 2015): - limited / expensive data, + immediate real-world impact
● Cache replacement: + plentiful data, + immediate real-world impact

Open-source cache replacement Gym environment coming soon!
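Until that environment is released, a hypothetical skeleton of what a Gym-style interface could look like (class name, observation format, and reward are guesses, not the actual release):

# Hypothetical sketch of a Gym-style cache-replacement environment.
class CacheReplacementEnv:
    def __init__(self, accesses, capacity):
        self.accesses, self.capacity = list(accesses), capacity

    def reset(self):
        self.t, self.cache = 0, []
        return self._obs()

    def step(self, action):                       # action: index of the cache slot to evict
        line, reward = self.accesses[self.t], 0
        if line in self.cache:
            reward = 1                             # cache hit (action is ignored)
        else:
            if len(self.cache) >= self.capacity:
                self.cache.pop(action)             # cache miss: evict the chosen slot
            self.cache.append(line)
        self.t += 1
        done = self.t >= len(self.accesses)
        return self._obs(), reward, done, {}

    def _obs(self):
        return {"cache": tuple(self.cache),
                "next_access": self.accesses[self.t] if self.t < len(self.accesses) else None}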

Page 23: An Imitation Learning Approach for Cache Replacement

Takeaways

● A new state-of-the-art approach for cache replacement by imitating the oracle policy
   ○ Future work: making this production ready
● A new benchmark for imitation learning / reinforcement learning research

