Caches

Transcript
Page 1:

Page 2:

Caches
• Where is a block placed in a cache?
  – Three possible answers → three different types:
    • Anywhere → Fully associative
    • Only into one block → Direct mapped
    • Into a subset of blocks → Set associative

Page 3:

How is a block found?
• The cache has an address tag for each block
• Tags are checked in parallel for a match
• Each block also has a valid bit

Processor address: | Tag | Index | Block offset |
  – Tag: identifies the block
  – Index: identifies the set
  – Block offset: identifies the data within the block
  (Tag and index together form the block address)
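
The split can be illustrated with a short sketch. The geometry used here (64-byte blocks, 512 sets) is an illustrative assumption, not taken from the slides; a direct-mapped cache has one block per set, and a fully associative cache has a single set, so it needs no index bits at all.

    /* Splitting a processor address into tag, index and block offset.
       Block size and set count below are illustrative assumptions. */
    #include <stdint.h>
    #include <stdio.h>

    #define BLOCK_SIZE 64    /* bytes per block  -> 6 offset bits */
    #define NUM_SETS   512   /* sets in the cache -> 9 index bits */

    int main(void)
    {
        uint64_t addr   = 0x7ffe12345678ULL;                 /* arbitrary address */
        uint64_t offset = addr % BLOCK_SIZE;                 /* data within block */
        uint64_t index  = (addr / BLOCK_SIZE) % NUM_SETS;    /* selects the set   */
        uint64_t tag    = addr / ((uint64_t)BLOCK_SIZE * NUM_SETS); /* compared with stored tags */

        printf("tag=%#llx index=%llu offset=%llu\n",
               (unsigned long long)tag, (unsigned long long)index,
               (unsigned long long)offset);
        return 0;
    }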

Page 4:

Which block should be replaced on a miss?

• Direct mapped:
  – Simple (there can only be one!)
• Associative caches:
  – A choice is involved
  – Three techniques:
    • Random
    • Least-recently used (LRU), often only approximated
    • FIFO (approximates LRU)
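
As a concrete illustration of LRU bookkeeping, here is a minimal sketch of victim selection in one set of a 4-way cache, using per-way "last used" counters. Real hardware usually approximates this with far fewer bits (the Alpha 21264 later in these slides keeps just one FIFO bit per set); all names and sizes here are illustrative assumptions.

    #include <stdint.h>

    #define WAYS 4

    struct way {
        int      valid;
        uint64_t tag;
        uint64_t last_used;   /* pseudo-timestamp updated on every access */
    };

    /* Pick the way to evict: an invalid way if one exists,
       otherwise the least-recently-used one. */
    int choose_victim(const struct way set[WAYS])
    {
        int victim = 0;
        for (int w = 0; w < WAYS; w++) {
            if (!set[w].valid)
                return w;                               /* free slot: nothing to evict   */
            if (set[w].last_used < set[victim].last_used)
                victim = w;                             /* older access -> better victim */
        }
        return victim;
    }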

Page 5:

Random vs LRU (16kB cache)

[Bar chart: miss rate (%) for 2-way, 4-way and 8-way associativity, comparing LRU and Random replacement; y-axis 0–6%]

Page 6:

Random vs LRU (256kB cache)

[Bar chart: miss rate (%) for 2-way, 4-way and 8-way associativity, comparing LRU and Random replacement; y-axis 0–1.2%]

Page 7:

What happens on a write?

• Reads predominate
  – Instruction fetches, and more loads than stores
  – MIPS instruction mix:
    • 10% stores
    • 37% loads
  – Writes are 7% of memory traffic and 21% of data traffic

• Amdahl’s Law: we can’t ignore them!
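
A quick check of where those percentages come from (assuming, as the mix implies, one instruction fetch per instruction): data accesses per 100 instructions = 37 loads + 10 stores = 47, so writes are 10 / 47 ≈ 21% of data traffic; total memory accesses = 100 fetches + 47 data accesses = 147, so writes are 10 / 147 ≈ 7% of all memory traffic.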

Page 8:

Write Strategy

• A write must complete the tag check before it starts to write
  – A read can sometimes proceed safely while the tags are checked
• A write must modify only part of the block
  – A read can safely read more than is required

Page 9:

Write Strategy

• Two main approaches:
  – Write through: the CPU’s write updates both the cache and main memory
  – Write back: the write updates only the cache (main memory is not updated immediately); the block is marked with a dirty bit and written to memory later

Page 10:

Advantages

• Write back
  – Writes occur at cache speed
  – Only one memory access after multiple writes to a block
  – Lower memory bandwidth
• Write through
  – Efficient read misses (no dirty block to write back)
  – Simple implementation
  – Memory and cache are consistent: good for multiprocessors!

Page 11:

Optimising Write Through

• Reduce write stalls with a write buffer
  – The processor continues while the write buffer updates memory

Page 12:

Handling Write Misses

• Write allocate
  – Fetch the block into the cache on a write miss
  – Good with write back
• No-write allocate
  – Memory is updated without loading the block into the cache
  – Good with write through
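
The two common pairings above can be sketched with a single direct-mapped block; everything here (sizes, names, the tiny memory array) is an illustrative assumption, not the behaviour of any particular machine.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK 16                                    /* bytes per block */

    static uint8_t memory[1024];
    static struct { bool valid, dirty; uint32_t tag; uint8_t data[BLOCK]; } line;

    static bool hit(uint32_t a) { return line.valid && line.tag == a / BLOCK; }

    static void fill(uint32_t a)            /* fetch block, writing back a dirty victim first */
    {
        if (line.valid && line.dirty)
            memcpy(&memory[line.tag * BLOCK], line.data, BLOCK);
        line.tag = a / BLOCK;
        line.valid = true;
        line.dirty = false;
        memcpy(line.data, &memory[line.tag * BLOCK], BLOCK);
    }

    /* Write-through + no-write-allocate: memory is always updated,
       the cache only if the block is already present. */
    static void write_through(uint32_t a, uint8_t v)
    {
        if (hit(a))
            line.data[a % BLOCK] = v;
        memory[a] = v;                                  /* typically via a write buffer */
    }

    /* Write-back + write-allocate: a miss first fetches the block (like a
       read miss), then the write goes to the cache only and sets the dirty bit. */
    static void write_back(uint32_t a, uint8_t v)
    {
        if (!hit(a))
            fill(a);
        line.data[a % BLOCK] = v;
        line.dirty = true;                              /* memory updated on eviction */
    }

    int main(void)
    {
        write_through(40, 1);    /* memory[40] updated immediately       */
        write_back(40, 2);       /* only the cached copy holds the new 2 */
        printf("memory[40]=%d cached=%d\n", memory[40], line.data[40 % BLOCK]);
        return 0;
    }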

Page 13:

Alpha 21264 Data Cache

• Data cache
  – 64kB
  – 64-byte blocks
  – 2-way set associative
  – Write back, write allocate
  – Victim buffer (similar to a write buffer): 8 blocks

Page 14:

Alpha Data Cache Hit

Page 15:

Data Cache

• Replacement uses FIFO (one bit per set)
• If the victim buffer is full, the CPU must stall
• Write miss:
  – Write allocate
  – Handled similarly to a read miss

Page 16:

Performance

• Hit
  – 3 cycles (a three-cycle load delay)
• Miss
  – 9ns to transfer data from the next level (6 cycles @ 667MHz)
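
A quick sanity check on those figures: at 667MHz one cycle is 1 / 667MHz ≈ 1.5ns, so a 9ns transfer takes 9 / 1.5 = 6 cycles, matching the quoted miss penalty.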

Page 17:

Alpha 21264 Instruction Cache

• Instruction cache
  – Separate from the data cache
  – 64kB

Page 18:

Separate Caches

• Doubles available bandwidth
  – Prevents the fetch unit stalling on data accesses
• Caches can be optimised separately
  – UltraSPARC:
    • Data cache: 16kB, direct mapped, 2 × 16-byte sub-blocks
    • Instruction cache: 16kB, 2-way set associative, 32-byte blocks

Page 19:

Unified Caches

• Hold both data and instructions

• Miss rates for instructions are much lower than for data (roughly an order of magnitude)

• A unified cache may have a slightly better overall miss rate
  – 16kB data cache: 11.4%
  – 16kB instruction cache: 0.4%
  – 32kB unified cache: 3.18%, vs 3.24% overall for the split pair

• BUT: the unified cache suffers an extra cycle of stall (instruction and data accesses contend for it), so average memory access time is slower (4.44 rather than 4.24 cycles)

Page 20:

5.3. Cache Performance

• Miss rate can be misleading
  – See the last example!

• A better measure is average memory access time (AMAT):

  AMAT = Hit time + Miss rate × Miss penalty
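
A worked example with illustrative numbers (not from the slides): with a 1-cycle hit time, a 5% miss rate and a 100-cycle miss penalty,

  AMAT = 1 + 0.05 × 100 = 6 cycles

so even a small miss rate dominates the average once the miss penalty is large.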

Page 21:

Performance Issues
• The cache is a very significant factor in performance
  – Example: CPU time increased by a factor of 4

• Particularly for:
  – Low-CPI machines
  – Fast clock speeds

• The simplicity of a direct mapped cache may allow a faster clock rate

Page 22:

Miss Penalty and Out of Order Execution

• Processor may be able to do useful work during cache miss

• Makes analysis of cache performance very difficult!

• Can have a significant impact

Page 23:

Improving Cache Performance

• A very important topic
  – 1600 papers in 6 years! (2nd Edition)
  – 5000 papers in 13 years! (3rd Edition)

Page 24:

Improving Cache Performance

• Four categories of optimisation:
  – Reduce miss rate
  – Reduce miss penalty
  – Reduce miss rate or miss penalty using parallelism
  – Reduce hit time

AMAT = Hit time + Miss rate × Miss penalty

Page 25:

5.4. Reducing Miss Penalty

• Traditionally, the focus has been on miss rate

• But the cost of misses (the miss penalty) is increasing dramatically

Page 26:

Multi-level Caches

• Two caches
  – A small, fast one (L1) close to the CPU
  – A big, slower one (L2) between the first cache and memory

CPU ↔ L1 cache ↔ L2 cache ↔ Main Memory

Page 27:

Second-level caches

• Complicates analysis:

AMAT = Hit time_L1 + Miss rate_L1 × Miss penalty_L1

Miss penalty_L1 = Hit time_L2 + Miss rate_L2 × Miss penalty_L2

Page 28:

Analysis of two-level caches

• Local miss rate
  – Number of misses / number of accesses to this cache
  – Artificially high for the L2 cache
• Global miss rate
  – Number of misses / number of accesses made by the CPU
  – For the L2 cache this is Miss rate_L1 × Miss rate_L2 (local)
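
A small worked example with illustrative numbers (not from the slides): suppose that per 1000 memory accesses issued by the CPU, L1 misses 40 times and L2 misses 20 times. Then

  Miss rate_L1 (local = global) = 40 / 1000 = 4%
  Miss rate_L2 (local) = 20 / 40 = 50% (artificially high)
  Miss rate_L2 (global) = 20 / 1000 = 2% = 4% × 50%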

Page 29:

Design of two-level caches

• The second-level cache should be large
  – Minimises the local miss rate
  – Big blocks are more feasible (reducing miss rate)
• Multilevel inclusion property
  – All data in L1 is also in L2
  – Useful for multiprocessor consistency
  – Can be enforced at L2

Page 30:

Early restart & critical word first

• Both aim to minimise CPU waiting time
• Early restart
  – As soon as the requested word arrives, send it to the CPU
• Critical word first
  – Request the required word from memory first, then fill the rest of the cache block

Page 31:

Prioritising read misses

• Write-through caches normally make use of a write buffer

• Problem: this may lead to read-after-write (RAW) hazards

• Simple solution: stall a read miss until the write buffer empties
  – May increase the read miss penalty by as much as 50%

• Better solution: check the write buffer for a conflicting address
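
A minimal sketch of that better solution: on a read miss, scan the pending buffered writes for the same block, and only treat it as a hazard when a match is found, instead of always waiting for the buffer to drain. The structure names and sizes are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    #define WB_ENTRIES 4
    #define BLOCK      64

    struct wb_entry     { bool valid; uint64_t addr; /* plus the buffered data */ };
    struct write_buffer { struct wb_entry e[WB_ENTRIES]; };

    /* Returns true if a pending write targets the same block as the read
       miss, i.e. servicing the read from memory now could return stale data. */
    bool wb_conflicts(const struct write_buffer *wb, uint64_t read_addr)
    {
        for (int i = 0; i < WB_ENTRIES; i++)
            if (wb->e[i].valid && wb->e[i].addr / BLOCK == read_addr / BLOCK)
                return true;
        return false;
    }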

Page 32:

Prioritising read misses

• Write-back caches
  – Read misses are long when a dirty block must be written back first

• Solution:
  – Copy the dirty block to a write buffer
  – Handle the read miss, then write back the dirty block
  – The same conflict checking is needed (or stall until the write buffer drains)

Page 33:

Merging Write Buffer

• Write buffers can merge data being written to the same area of memory

• Benefits:
  – More efficient use of the buffer
  – Fewer stalls due to the write buffer being full
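
A minimal sketch of merging: a new store first looks for an existing buffer entry covering its block and, if found, drops its word into that entry instead of taking a fresh slot. The entry layout and sizes are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    #define ENTRIES         4
    #define WORDS_PER_ENTRY 8        /* 8 x 8-byte words per 64-byte entry */

    struct merge_entry {
        bool     valid;
        uint64_t block_addr;                 /* 64-byte aligned block address */
        uint64_t data[WORDS_PER_ENTRY];
        uint8_t  word_valid;                 /* one bit per word              */
    };

    static struct merge_entry buf[ENTRIES];

    /* Returns true if the store was merged or placed;
       false means the buffer is full and the CPU must stall. */
    bool buffer_store(uint64_t addr, uint64_t value)
    {
        uint64_t block = addr & ~(uint64_t)63;
        int word = (int)((addr >> 3) & (WORDS_PER_ENTRY - 1));

        for (int i = 0; i < ENTRIES; i++)            /* try to merge first    */
            if (buf[i].valid && buf[i].block_addr == block) {
                buf[i].data[word] = value;
                buf[i].word_valid |= (uint8_t)(1u << word);
                return true;
            }
        for (int i = 0; i < ENTRIES; i++)            /* otherwise take a slot */
            if (!buf[i].valid) {
                buf[i] = (struct merge_entry){ .valid = true, .block_addr = block };
                buf[i].data[word] = value;
                buf[i].word_valid = (uint8_t)(1u << word);
                return true;
            }
        return false;                                /* full: stall           */
    }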

Page 34:

Victim Caches

• A small (≈5 entries), fully associative cache on the refill path
  – Holds recently discarded blocks
  – Exploits temporal locality

• Experiment (4kB, direct-mapped cache):
  – A 4-entry victim cache removed 20% to 95% of conflict misses

• AMD Athlon: 8-entry victim cache
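
A minimal sketch of how a victim cache sits on the refill path: on a main-cache miss, the small fully associative buffer of recently evicted blocks is searched before going to the next level. Names and sizes are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    #define VC_ENTRIES 4

    struct victim_entry { bool valid; uint64_t block_addr; /* plus the block data */ };
    static struct victim_entry victim[VC_ENTRIES];
    static int next_slot;                        /* simple FIFO replacement */

    /* Called on a main-cache miss: returns true if the block is still in
       the victim cache, so it can be swapped back without a memory access. */
    bool victim_lookup(uint64_t block_addr)
    {
        for (int i = 0; i < VC_ENTRIES; i++)
            if (victim[i].valid && victim[i].block_addr == block_addr)
                return true;
        return false;
    }

    /* Called when the main cache evicts a block: remember it here. */
    void victim_insert(uint64_t block_addr)
    {
        victim[next_slot] = (struct victim_entry){ .valid = true, .block_addr = block_addr };
        next_slot = (next_slot + 1) % VC_ENTRIES;
    }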

Page 35:
