PIPP: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches

Yuejian Xie, Gabriel H. Loh
Georgia Institute of Technology

Presented by: Yingying Tian

36th ACM/IEEE International Symposium on Computer Architecture (ISCA '09)

Manage Shared Caches

Last Level Caches (LLCs) are shared by all cores in Chip Multi-Processors (CMPs).

Multiple cores compete for the limited LLC capacity.

[Figure: Core0 and Core1, each with its own L1I and L1D, above a shared Last Level Cache (LLC) that holds both Core0's data and Core1's data.]

LRU is a sharing-oblivious cache management policy, which leads to poor performance and fairness.

Previous works tried to allocate LLC resources fairly via:
- Capacity Management: way-partitioning (UCP)
- Dead-Time Management: LRU insertion (TADIP)

PIPP: do both capacity and dead-time management better, at the same time!

Outline

- Background and Motivation
- Previous Work
- PIPP
- Evaluation
- Conclusion

UCP (Utility-based Cache Partitioning)

[Figure: the ways of a cache set divided between Core0 and Core1.]

Core 0 gets 5 ways

Core 1 gets 3 ways

*Some materials are taken from original presentation slides.

DIP (Dynamic Insertion Policy)

[Figure sequence: a recency stack from MRU to LRU with an incoming block.]

- MRU insertion: a block that is never re-referenced occupies one cache block for a long time with no benefit!
- LRU insertion: a useless block is evicted at the next eviction; a useful block is moved to the MRU position.

Cache Replacement Policy

Eviction: Which block should be replaced when a cache miss occurs?
- LRU block

Insertion: For an incoming block, where should it be inserted in the corresponding set?
- MRU insertion (default LRU replacement policy)
- LRU insertion (dead-on-arrival blocks)

Promotion: If a block is re-referenced, how should its position be adjusted?
- Move to MRU position
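
The three decisions above (eviction, insertion, promotion) can be sketched for a single cache set as follows. This is a minimal illustration, not code from the paper: the class name `RecencySet`, the 4-way set, and the access trace are all hypothetical, and the LRU-insertion variant shown is the fixed policy only, without the dynamic selection that DIP performs.

```python
class RecencySet:
    """One cache set kept as a list ordered from MRU (index 0) to LRU (last)."""

    def __init__(self, ways, insert_at_mru=True):
        self.ways = ways
        self.insert_at_mru = insert_at_mru
        self.stack = []  # index 0 is the MRU position, the last index is LRU

    def access(self, block):
        """Return True on a hit and False on a miss."""
        if block in self.stack:
            # Promotion: a re-referenced block moves to the MRU position.
            self.stack.remove(block)
            self.stack.insert(0, block)
            return True
        if len(self.stack) == self.ways:
            # Eviction: the block in the LRU position is replaced.
            self.stack.pop()
        # Insertion: MRU insertion (plain LRU policy) or LRU insertion.
        if self.insert_at_mru:
            self.stack.insert(0, block)
        else:
            self.stack.append(block)
        return False


def count_hits(cache, trace):
    """Replay a trace of block names and count the hits."""
    return sum(cache.access(block) for block in trace)


# Hypothetical trace: three reused blocks (a, b, c) interleaved with
# streaming blocks (s0, s1, ...) that are never referenced again.
trace = []
for i in range(10):
    trace += ["a", "b", "c", f"s{2 * i}", f"s{2 * i + 1}"]

mru_insert_hits = count_hits(RecencySet(4, insert_at_mru=True), trace)
lru_insert_hits = count_hits(RecencySet(4, insert_at_mru=False), trace)
print(mru_insert_hits, lru_insert_hits)
```

On this trace, MRU insertion lets the streaming blocks push the reused blocks out before they are referenced again, while LRU insertion keeps the dead-on-arrival blocks at the eviction end of the stack.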

PIPP: Promotion/Insertion Pseudo-Partitioning

Insertion:
- Target partitioning: $\Pi = \{\pi_1, \pi_2, \ldots, \pi_n\}$, with $\sum_i \pi_i = w$ ($w$ is the associativity of the cache).
- On insertion, core $i$ inserts its incoming block at position $\pi_i$. (The target partitioning is computed dynamically via UCP monitors or other means.)

Promotion:
- On a hit, the block moves one step toward the MRU position with probability $P$ and stays unchanged with probability $1-P$.

[Figure: a recency stack from MRU to LRU. A new block is inserted at position 3 (its core's target allocation), a hit promotes a block, and the block at the LRU end is the one to evict.]
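
The insertion and promotion rules can be sketched for one cache set as below. This is a hedged illustration rather than the authors' implementation: the class name `PIPPSet`, the default promotion probability, the stack contents, and the convention that position 1 is the LRU end are assumptions made for the example.

```python
import random


class PIPPSet:
    """One cache set with PIPP-style insertion and promotion.

    self.stack is ordered from lowest priority (index 0, next to evict)
    to highest priority (last index, the MRU end).
    """

    def __init__(self, ways, targets, p_prom=0.75, rng=None):
        # The per-core targets must add up to the associativity w.
        assert sum(targets.values()) == ways
        self.ways = ways
        self.targets = targets   # core id -> target allocation pi_i
        self.p_prom = p_prom     # illustrative value, not taken from the slides
        self.rng = rng or random.Random(0)
        self.stack = []          # entries are (core, block) tuples

    def access(self, core, block):
        """Return True on a hit and False on a miss."""
        entry = (core, block)
        if entry in self.stack:
            i = self.stack.index(entry)
            # Promotion: one step toward MRU with probability p_prom.
            if i < len(self.stack) - 1 and self.rng.random() < self.p_prom:
                self.stack[i], self.stack[i + 1] = self.stack[i + 1], self.stack[i]
            return True
        if len(self.stack) == self.ways:
            self.stack.pop(0)    # evict the lowest-priority block
        # Insertion: the new block becomes the pi_i-th entry from the LRU end
        # (assumed convention: position 1 is the LRU end).
        pos = min(self.targets[core] - 1, len(self.stack))
        self.stack.insert(pos, entry)
        return False


# Hypothetical 8-way set shared by core 0 (target 5) and core 1 (target 3).
# p_prom=1.0 makes the walk-through deterministic.
demo = PIPPSet(ways=8, targets={0: 5, 1: 3}, p_prom=1.0)
demo.stack = [(1, "C"), (1, "B"), (0, 5), (0, 4), (0, 3), (0, 2), (1, "A"), (0, 1)]

was_hit = demo.access(1, "D")    # miss: evicts C, inserts D third from the LRU end
after_insert = list(demo.stack)
demo.access(1, "D")              # hit: D moves one step toward MRU
after_promote = list(demo.stack)
print(after_insert)
print(after_promote)
```

No explicit partition boundary is stored anywhere: a core's share of the set emerges from where its blocks enter the stack and how slowly they climb.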

PIPP Example

Core0 quota: 5 blocks
Core1 quota: 3 blocks

[Figure sequence: one recency stack from MRU to LRU holding Core0's blocks (numbers) and Core1's blocks (letters). Initial stack, MRU to LRU: 1 A 2 3 4 5 B C.]

Requests shown, in order:
- Request D (Core1's quota = 3)
- Request 6 (Core0's quota = 5)
- Request 7 (Core0's quota = 5)
- Request D
- Request E (Core1's quota = 3)
- Request 2

Pseudo-Partition Benefit

Core0 quota: 5 blocks
Core1 quota: 3 blocks

[Figure, strict partition: each core has its own recency stack (MRU0 to LRU0 for Core0, MRU1 to LRU1 for Core1); a request brings in a new block.]

[Figure, pseudo partition: Core0's and Core1's blocks share a single recency stack from MRU to LRU; a request brings in a new block.]
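
The figures for this slide did not survive in the transcript. One plausible reading, stated here as an assumption, is that a pseudo partition lets a core use capacity beyond its quota when the other core leaves it idle, whereas a strict partition caps the core at its quota. The sketch below compares the two on a hypothetical trace; the function names, the 7-block working set, and the promotion probability of 1 are illustrative choices only.

```python
def strict_partition_hits(trace, quota):
    """LRU inside a private partition of `quota` ways."""
    stack, hits = [], 0              # index 0 is the LRU end
    for block in trace:
        if block in stack:
            hits += 1
            stack.remove(block)
            stack.append(block)      # move to the MRU end
        else:
            if len(stack) == quota:
                stack.pop(0)         # evict this core's own LRU block
            stack.append(block)
    return hits


def pseudo_partition_hits(trace, quota, ways):
    """Shared stack of `ways` entries; new blocks enter at position `quota`."""
    stack, hits = [], 0              # index 0 is the LRU end
    for block in trace:
        if block in stack:
            hits += 1
            i = stack.index(block)
            if i < len(stack) - 1:   # promote by one step (P = 1 here)
                stack[i], stack[i + 1] = stack[i + 1], stack[i]
        else:
            if len(stack) == ways:
                stack.pop(0)         # evict the shared LRU block
            stack.insert(min(quota - 1, len(stack)), block)
    return hits


# Hypothetical scenario: Core0 cycles over 7 blocks while Core1 is idle.
trace = list(range(7)) * 10
strict_hits = strict_partition_hits(trace, quota=5)
pseudo_hits = pseudo_partition_hits(trace, quota=5, ways=8)
print(strict_hits, pseudo_hits)
```

With 5 private ways, the cyclic 7-block working set thrashes under LRU. In the shared 8-entry stack the same working set fits, so every access after the first pass hits.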

Methodology

- SimpleScalar simulator for x86 Intel Core 2 processor
- 32KB, 8-way, 3-cycle L1I and L1D for each core
- A shared 4MB, 16-way, 11-cycle LLC
- Multi-programmed workloads from SPEC CPU benchmarks (2-core and 4-core workloads)
- 500M instructions warmup, 250M instructions simulation
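
As a quick check on the cache geometry listed above, the number of sets follows from capacity, associativity, and line size. The slides do not state the line size, so the 64-byte value below is an assumption, and `num_sets` is a hypothetical helper.

```python
LINE_BYTES = 64  # assumption: the slides do not state the cache line size


def num_sets(capacity_bytes, ways, line_bytes=LINE_BYTES):
    """Number of sets = capacity / (associativity * line size)."""
    return capacity_bytes // (ways * line_bytes)


l1_sets = num_sets(32 * 1024, ways=8)           # per-core L1I or L1D
llc_sets = num_sets(4 * 1024 * 1024, ways=16)   # shared LLC
print(l1_sets, llc_sets)
```

Under that assumption, each 16-way LLC set is one recency stack of the kind the insertion and promotion policies operate on.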

Evaluation: 2-Core Weighted Speedup

[Figure: 2-core weighted speedup results, with UCP-friendly and TADIP-friendly workload groups labeled.]

PIPP outperforms LRU by 19.0%, UCP by 10.6%, and TADIP by 10.1%.
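
The slides do not define the metric. Weighted speedup is commonly computed as the sum, over co-running programs, of each program's IPC when sharing the cache divided by its IPC when running alone. The sketch below uses that definition with made-up IPC values, so none of the numbers correspond to the reported results.

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Sum over programs of IPC when sharing the cache / IPC when running alone."""
    return sum(shared / alone for shared, alone in zip(ipc_shared, ipc_alone))


def improvement_pct(ws_new, ws_base):
    """Relative improvement of one policy's weighted speedup over another's."""
    return 100.0 * (ws_new / ws_base - 1.0)


# Hypothetical 2-core workload (all values invented for illustration).
ipc_alone = [2.0, 1.0]        # each program running alone
ipc_baseline = [1.0, 0.5]     # sharing the LLC under a baseline policy
ipc_candidate = [1.5, 0.5]    # sharing the LLC under a candidate policy

ws_baseline = weighted_speedup(ipc_baseline, ipc_alone)
ws_candidate = weighted_speedup(ipc_candidate, ipc_alone)
gain = improvement_pct(ws_candidate, ws_baseline)
print(ws_baseline, ws_candidate, gain)
```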

4-Core Weighted Speedup

[Figure: 4-core weighted speedup results, with UCP-friendly and TADIP-friendly workload groups labeled.]

PIPP outperforms LRU by 21.9%, UCP by 12.1%, and TADIP by 17.5%.


Occupancy Control

For most workloads, the partitioning deviation is within 1.0 of the target allocation, similar to UCP.


Conclusion

Novel proposal on Insertion and Promotion

A single unified mechanism provides both capacity and dead time management

Outperforms prior UCP and TADIP

Thank you!

Questions?

