Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Seongbeom Kim, Dhruba Chandra, and Yan Solihin
Dept. of Electrical and Computer Engineering, North Carolina State University
{skim16, dchandr, solihin}@ncsu.edu

Transcript
  • Slide 1

Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Seongbeom Kim, Dhruba Chandra, and Yan Solihin
Dept. of Electrical and Computer Engineering, North Carolina State University
{skim16, dchandr, solihin}@ncsu.edu

  • Slide 2

Cache Sharing in CMP (diagram: two processor cores, each with a private L1 cache, sharing an L2 cache)

  • Slide 3

Cache Sharing in CMP (diagram: thread t1 runs on Processor Core 1 and fills the shared L2)

  • Slide 4

Cache Sharing in CMP (diagram: thread t2 runs on Processor Core 2 and fills the shared L2)

  • Slide 5

Cache Sharing in CMP (diagram: t1 and t2 co-scheduled; t1 occupies most of the shared L2)
t2's throughput is significantly reduced due to unfair cache sharing.

  • Slide 6

Shared L2 cache space contention (figure)

  • Slide 7

Shared L2 cache space contention (figure)

  • Slide 8

Impact of unfair cache sharing
(diagram: uniprocessor scheduling vs. 2-core CMP scheduling of threads t1-t4 on processors P1 and P2, divided into time slices)
Problems of unfair cache sharing:
  - Sub-optimal throughput
  - Thread starvation
  - Priority inversion
  - Thread-mix dependent throughput
Fairness: uniform slowdown for co-scheduled threads

  • Slide 9

Contributions
  - Cache fairness metrics: easy to measure, approximate uniform slowdown well
  - Fair caching algorithms: static/dynamic cache partitioning, optimizing fairness, simple hardware modifications
  - Simulation results: fairness improved 4x, throughput improved 15%, comparable to the cache miss minimization approach

  • Slide 10

Related Work
  - Cache miss minimization in CMP: G. Suh, S. Devadas, L. Rudolph, HPCA 2002
  - Balancing throughput and fairness in SMT: K. Luo, J. Gummaraju, M. Franklin, ISPASS 2001; A. Snavely and D. Tullsen, ASPLOS 2000

  • Slide 11

Outline
  - Fairness Metrics
  - Static Fair Caching Algorithms (see paper)
  - Dynamic Fair Caching Algorithms
  - Evaluation Environment
  - Evaluation
  - Conclusions

  • Slide 12

Fairness Metrics: uniform slowdown
Execution time of t_i when it runs alone.

  • Slide 13

Fairness Metrics: uniform slowdown
Execution time of t_i when it shares the cache with others.

  • Slides 14-16

Fairness Metrics: uniform slowdown
We want to minimize: [equation] Ideally: [equation] (see the sketch below)
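The formulas on slides 12-16 are not legible in this transcript. Below is a minimal sketch of the uniform-slowdown objective they refer to, assuming T_i^alone and T_i^shared denote thread t_i's execution time when it runs alone and when it shares the cache; the exact notation is a reconstruction, not copied from the slides.

```latex
% Hedged reconstruction of the fairness objective on slides 12-16.
% X_i measures how much thread t_i is slowed down by sharing the cache.
\[
  X_i \;=\; \frac{T_i^{\mathrm{shared}}}{T_i^{\mathrm{alone}}}
\]
% Ideally, co-scheduled threads suffer a uniform slowdown:
\[
  X_i \;=\; X_j \qquad \text{for all co-scheduled threads } t_i,\, t_j
\]
% So the quantity to minimize is the pairwise slowdown gap:
\[
  \sum_{i \neq j} \left| X_i - X_j \right|
\]
```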
  • Slide 17

Outline
  - Fairness Metrics
  - Static Fair Caching Algorithms (see paper)
  - Dynamic Fair Caching Algorithms
  - Evaluation Environment
  - Evaluation
  - Conclusions

  • Slide 18

Partitionable Cache Hardware
Modified LRU cache replacement policy (G. Suh et al., HPCA 2002)
Current partition: P1: 448B, P2: 576B. Target partition: P1: 384B, P2: 640B. A P2 miss occurs.

  • Slide 19

Partitionable Cache Hardware
Modified LRU cache replacement policy (G. Suh et al., HPCA 2002)
The P2 miss replaces a P1 block (marked LRU*); afterwards the current partition (P1: 384B, P2: 640B) matches the target partition.

  • Slide 20

Dynamic Fair Caching Algorithm (example: optimizing the M3 metric)
State tracked over each repartitioning interval: MissRate alone (P1, P2), MissRate shared (P1, P2), and Target Partition (P1, P2).

  • Slide 21

Dynamic Fair Caching Algorithm: 1st interval
MissRate alone: P1: 20%, P2: 5%. Measured MissRate shared: P1: 20%, P2: 15%. Target partition: P1: 256KB, P2: 256KB.

  • Slide 22

Dynamic Fair Caching Algorithm: repartition!
Evaluate M3 (MissRate shared / MissRate alone): P1: 20% / 20%, P2: 15% / 5%. New target partition: P1: 192KB, P2: 320KB. Partition granularity: 64KB.

  • Slide 23

Dynamic Fair Caching Algorithm: 2nd interval
MissRate alone: P1: 20%, P2: 5%. Newly measured MissRate shared: P1: 20%, P2: 10% (previous interval: P1: 20%, P2: 15%). Target partition: P1: 192KB, P2: 320KB.

  • Slide 24

Dynamic Fair Caching Algorithm: repartition!
Evaluate M3: P1: 20% / 20%, P2: 10% / 5%. New target partition: P1: 128KB, P2: 384KB.

  • Slide 25

Dynamic Fair Caching Algorithm: 3rd interval
MissRate alone: P1: 20%, P2: 5%. Newly measured MissRate shared: P1: 25%, P2: 9%. Target partition: P1: 128KB, P2: 384KB.

  • Slide 26

Dynamic Fair Caching Algorithm: repartition!
Do Rollback if: P2:
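Slides 18-19 show the partition-aware replacement mechanism only through a single eviction example. The sketch below illustrates that style of modified LRU replacement, assuming per-block owner tags and occupancy tracked in ways per set; the class and member names, and the way-granularity model, are illustrative and not taken from the slides or from Suh et al.'s implementation.

```cpp
// Sketch of a partition-aware ("modified LRU") replacement policy in the
// spirit of slides 18-19: on a miss, the victim is chosen so that each
// core's cache occupancy drifts toward its target partition.
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

struct Block {
    bool     valid   = false;
    uint64_t tag     = 0;
    int      owner   = -1;  // core that brought the block in
    uint64_t lastUse = 0;   // LRU timestamp
};

class PartitionedSet {
public:
    PartitionedSet(int ways, std::vector<int> targetWays)
        : blocks_(ways), targetWays_(std::move(targetWays)) {}

    // Called on a miss by `core`; returns the way to evict.
    int chooseVictim(int core) const {
        int owned = countOwned(core);
        // If the requester is at or above its target share, replace one of
        // its own blocks; otherwise take a block from another core so the
        // current partition moves toward the target partition.
        bool evictOwn = owned >= targetWays_[core];
        int victim = -1;
        uint64_t oldest = std::numeric_limits<uint64_t>::max();
        for (int w = 0; w < static_cast<int>(blocks_.size()); ++w) {
            const Block& b = blocks_[w];
            if (!b.valid) return w;  // a free way always wins
            bool mine = (b.owner == core);
            if ((evictOwn && !mine) || (!evictOwn && mine)) continue;
            if (b.lastUse < oldest) { oldest = b.lastUse; victim = w; }
        }
        // Fall back to plain global LRU if no candidate matched (e.g. the
        // set holds no blocks from the preferred owner class yet).
        if (victim < 0) {
            for (int w = 0; w < static_cast<int>(blocks_.size()); ++w)
                if (blocks_[w].lastUse < oldest) { oldest = blocks_[w].lastUse; victim = w; }
        }
        return victim;
    }

private:
    int countOwned(int core) const {
        int n = 0;
        for (const Block& b : blocks_) n += (b.valid && b.owner == core);
        return n;
    }
    std::vector<Block> blocks_;
    std::vector<int>   targetWays_;  // target partition, in ways per core
};
```

The slides track partitions in bytes (e.g. a 448B current share against a 384B target); tracking occupancy in ways per set, as above, is a common simplification rather than necessarily what the authors built.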

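Slides 20-26 walk through the dynamic algorithm only by example. The sketch below shows one plausible form of the repartitioning loop for a 2-core CMP, using the 64KB granularity from slide 22 and treating M3 as the shared-to-alone miss-rate ratio implied by the slide numbers. The rollback test is hedged because the transcript cuts off before slide 26 finishes; its threshold and all names here are assumptions for illustration.

```cpp
// Sketch of the interval-based dynamic fair caching loop from slides 20-26.
// At the end of each repartitioning interval, the thread that is slowed
// down more (larger shared/alone miss-rate ratio) receives capacity from
// the other thread in 64KB steps; a rollback undoes a shift that hurt.
#include <cstddef>
#include <cstdio>

struct ThreadStats {
    double missRateAlone;   // profiled when the thread runs alone
    double missRateShared;  // measured during the last interval
    size_t targetBytes;     // current target partition
};

constexpr size_t kGranularity = 64 * 1024;   // 64KB, as on slide 22
constexpr double kRollbackThreshold = 0.05;  // illustrative value only

// X_i for the M3-style metric: miss-rate growth relative to running alone.
static double slowdownProxy(const ThreadStats& t) {
    return t.missRateShared / t.missRateAlone;
}

// Called once per repartitioning interval.
void repartition(ThreadStats& a, ThreadStats& b) {
    double xa = slowdownProxy(a);
    double xb = slowdownProxy(b);
    // Shift one granule from the less-slowed thread to the more-slowed one.
    ThreadStats* loser  = (xa > xb) ? &b : &a;   // gives up capacity
    ThreadStats* winner = (xa > xb) ? &a : &b;   // receives capacity
    if (loser->targetBytes >= kGranularity) {
        loser->targetBytes  -= kGranularity;
        winner->targetBytes += kGranularity;
    }
}

// Hedged rollback check: if the thread that gave up capacity got noticeably
// worse since the previous interval, undo the last shift.
bool shouldRollback(double prevMissRate, double currMissRate) {
    return (currMissRate - prevMissRate) > kRollbackThreshold;
}

int main() {
    // Numbers from slides 21-22: after the 1st interval P2 is slowed down
    // more (15%/5% vs. 20%/20%), so 64KB moves from P1 to P2.
    ThreadStats p1{0.20, 0.20, 256 * 1024};
    ThreadStats p2{0.05, 0.15, 256 * 1024};
    repartition(p1, p2);
    std::printf("P1 target: %zuKB, P2 target: %zuKB\n",
                p1.targetBytes / 1024, p2.targetBytes / 1024);  // 192, 320
}
```

Running main() reproduces the slide-22 step, moving the target partition from 256KB/256KB to 192KB for P1 and 320KB for P2.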
