16406810 Machine Architecture 14 Cache Memory Principles Elements of Cache Design


TMA 1271

INTRODUCTION TO MACHINE ARCHITECTURE

Week 9 and 10 – Lec 10

Cache Memory – Principles, Elements of Cache Design



RK 2

What are you going to study?

Cache Memory

Typical organization

Operation - overview

Elements of Cache Design

Mapping - Direct, Associative, Set Associative

Replacement Algorithms

Write Policy

Block Size

Number of Caches



RK 3

Cache

Small amount of fast memory

Sits between normal main memory and CPU

May be located on CPU chip or module



RK 4

Cache



RK 5

Cache operation - overview



RK 6

Cache operation - overview

CPU requests contents of memory location

Check cache for this data

If present, get from cache (fast)

If not present, read required block from main memory to cache

Then deliver from cache to CPU

Cache includes tags to identify which block of main memory is in each cache slot

C << M (number of cache lines << number of main memory blocks)
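The read sequence above can be sketched in a few lines of Python. This is only an illustrative model, not a description of real hardware: the names `read`, `BLOCK_SIZE`, `memory` and `stats` are hypothetical, the block size of 4 words is an assumption, and the cache is modelled as an unbounded dictionary so no replacement is needed.

```python
# Hypothetical model: main memory is a list of words, the cache is a dict
# keyed by block number (the block number plays the role of the tag).
BLOCK_SIZE = 4  # words per block, assumed for illustration


def read(address, cache, memory, stats):
    """Return the word at `address`, loading its block into the cache on a miss."""
    block_number = address // BLOCK_SIZE
    offset = address % BLOCK_SIZE
    if block_number in cache:
        stats["hits"] += 1          # present: served from cache (fast)
    else:
        stats["misses"] += 1        # not present: read the whole block from memory
        start = block_number * BLOCK_SIZE
        cache[block_number] = memory[start:start + BLOCK_SIZE]
    return cache[block_number][offset]  # always delivered from the cache


memory = list(range(100, 132))      # 32 words holding the values 100..131
cache, stats = {}, {"hits": 0, "misses": 0}
values = [read(a, cache, memory, stats) for a in (5, 6, 7, 20, 5)]
```

Addresses 5, 6 and 7 share one block, so only the first of them misses; this is the locality that makes a small cache worthwhile.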



RK 7

Typical Cache Organization



RK 8

Elements of Cache Design

Size

Mapping Function

Replacement Algorithm

Write Policy

Block Size

Number of Caches



RK 9

Size does matter

Cost

More cache is expensive

Speed

More cache is faster (up to a point)

A larger cache involves more gates, which slows it down

Checking cache for data takes time

Studies suggest that cache sizes between 1K and 512K words are effective



RK 10

Mapping Function

There are fewer cache lines than main memory blocks, so an algorithm is needed for mapping main memory blocks into cache lines (Direct / Associative / Set Associative).

A means is also needed to determine which main memory block currently occupies a cache line.

Example: assume a cache of 64 KBytes, with data transferred between main memory and cache in blocks of 4 bytes,

i.e. the cache is organized as 16K (2^14) lines of 4 bytes each

Assume 16 MBytes of main memory,

each byte directly addressable by a 24 bit address

(2^24 = 16M)
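The figures in this example can be checked with a short calculation. The constant names below are hypothetical; the values are the ones assumed on the slide.

```python
CACHE_BYTES = 64 * 1024            # 64 KByte cache
BLOCK_BYTES = 4                    # 4-byte blocks (lines)
MEMORY_BYTES = 16 * 1024 * 1024    # 16 MByte main memory

cache_lines = CACHE_BYTES // BLOCK_BYTES       # number of lines in the cache
memory_blocks = MEMORY_BYTES // BLOCK_BYTES    # number of blocks in main memory
address_bits = (MEMORY_BYTES - 1).bit_length() # bits needed to address every byte
line_bits = (cache_lines - 1).bit_length()     # bits needed to number every line

print(cache_lines, memory_blocks, address_bits, line_bits)
```

With these values there are 2^14 lines but 2^22 blocks, so on average 2^8 = 256 memory blocks compete for each cache line.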



RK 11

Direct Mapping

Each block of main memory maps to only one cache line

i.e. if a block is in cache, it must be in one specific place

Address is in two parts

Least significant w bits identify a unique word or byte within a block of main memory

Most significant s bits specify one of 2^s memory blocks

The MSBs are split into a cache line field of r bits and a tag of s-r bits (most significant)



RK 12

Direct Mapping

Address Structure

Tag (s-r)    Line or Slot (r)    Word (w)
8 bits       14 bits             2 bits

24 bit address

2 bit word identifier (4 byte block)

22 bit block identifier

8 bit tag (= 22 - 14)

14 bit slot or line

No two blocks in the same line have the same Tag field

Check contents of cache by finding line and checking Tag
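As a rough illustration of this address structure, the sketch below splits a 24 bit address into its tag, line and word fields using shifts and masks. The function name `split_direct` and the sample address are assumptions chosen for the example.

```python
WORD_BITS, LINE_BITS, TAG_BITS = 2, 14, 8   # field widths from the slide


def split_direct(address):
    """Split a 24-bit address into (tag, line, word) for the direct-mapped cache."""
    word = address & ((1 << WORD_BITS) - 1)                  # lowest 2 bits
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)   # next 14 bits
    tag = address >> (WORD_BITS + LINE_BITS)                 # top 8 bits
    return tag, line, word


tag, line, word = split_direct(0x16339C)    # sample address, chosen arbitrarily
print(f"tag={tag:02X} line={line:04X} word={word}")
```

To check for a hit, the cache would index the line given by `line` and compare the tag stored there with `tag`.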



RK 13

Direct Mapping

Cache Line Table

Cache line    Main memory blocks assigned
0             0, m, 2m, 3m, …, 2^s - m
1             1, m+1, 2m+1, …, 2^s - m + 1
…             …
m-1           m-1, 2m-1, 3m-1, …, 2^s - 1

m = 2^r = number of lines in cache

mapping: i = j modulo m, where i is the cache line number and j is the main memory block number
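The mapping rule can be tried out directly. This is a small sketch under the same example sizes (r = 14, s = 22); the helper names `line_for_block` and `blocks_for_line` are hypothetical.

```python
R_BITS, S_BITS = 14, 22       # line-field and block-number widths from the example
m = 1 << R_BITS               # number of cache lines


def line_for_block(j):
    """Cache line i that main memory block j maps to: i = j modulo m."""
    return j % m


def blocks_for_line(i, count=4):
    """First `count` main memory blocks assigned to cache line i."""
    return [i + k * m for k in range(count)]


print(blocks_for_line(0))                   # blocks 0, m, 2m, 3m
print(line_for_block((1 << S_BITS) - m))    # last block assigned to line 0
```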



RK 14

Direct Mapping Cache Organization



RK 15

Direct Mapping Example



RK 16

Direct Mapping pros & cons

Simple

Inexpensive

Fixed location for given block

If a program repeatedly accesses 2 blocks that map to the same line, cache misses are very high
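The conflict problem can be shown with a minimal simulation, assuming the 14 bit line field and 2 bit word field used earlier. The function `count_misses` and the two address sequences are hypothetical, chosen only to contrast the two cases.

```python
def count_misses(addresses, line_bits=14, word_bits=2):
    """Count misses in a direct-mapped cache for a sequence of byte addresses."""
    lines = {}   # line number -> tag currently stored in that line
    misses = 0
    for address in addresses:
        line = (address >> word_bits) & ((1 << line_bits) - 1)
        tag = address >> (word_bits + line_bits)
        if lines.get(line) != tag:
            misses += 1          # wrong block (or nothing) in the line: replace it
            lines[line] = tag
    return misses


conflicting = [0x000000, 0x010000] * 4   # both map to line 0, with different tags
friendly = [0x000000, 0x000004] * 4      # map to lines 0 and 1

print(count_misses(conflicting), count_misses(friendly))
```

In the conflicting sequence every access evicts the block the next access needs, so all eight accesses miss; in the friendly sequence only the first access to each block misses.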



RK 17

Associative Mapping

A main memory block can load into any line of cache

Memory address is interpreted as tag and word

Tag uniquely identifies block of memory

Every line’s tag is examined for a match

Cache searching gets expensive



RK 18

Fully Associative Cache Organization



RK 19

Associative Mapping Example



RK 20

Associative Mapping Address Structure

Tag: 22 bits    Word: 2 bits

22 bit tag stored with each 32 bit block of data

Compare tag field with tag entry in cache to check for hit

Least significant 2 bits of address identify which byte is required from the 32 bit (4 byte) data block

e.g.

Address Tag Data
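A minimal sketch of an associative lookup, assuming the 22 bit tag and 2 bit word fields above. The names `split_associative` and `lookup`, the sample address and the stored data bytes are all hypothetical. The loop stands in for what hardware would do as a parallel comparison across every line.

```python
def split_associative(address):
    """Split a 24-bit address into (tag, word): 22-bit tag, 2-bit word."""
    return address >> 2, address & 0x3


def lookup(cache_lines, address):
    """Search every line for a matching tag; return the byte or None on a miss."""
    tag, word = split_associative(address)
    for stored_tag, block in cache_lines:   # hardware compares all tags at once
        if stored_tag == tag:
            return block[word]
    return None


# One cached 4-byte block; the tag and data bytes are made up for illustration.
cache_lines = [(0x058CE7, bytes([0x11, 0x22, 0x33, 0x44]))]
print(lookup(cache_lines, 0x16339C), lookup(cache_lines, 0x000000))
```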



RK 21

Set Associative Mapping

Cache is divided into a number of sets

Each set contains a number of lines

A given block maps to any line in a given set

e.g. Block B can be in any line of set i

e.g. 2 lines per set

2 way associative mapping

A given block can be in one of 2 lines in only one set



RK 22


Example

13 bit set number

Set number is the block number in main memory modulo 2^13

000000, 008000, 010000, 018000 … map to same set



RK 23

Two Way Set Associative Cache Organization



RK 24


Address Structure

Tag: 9 bits    Set: 13 bits    Word: 2 bits

Use set field to determine cache set to look in

Compare tag field to see if we have a hit

e.g.

Address     Tag    Data        Set number
1FF 7FFC    1FF    12345678    1FFF
001 7FFC    001    11223344    1FFF
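The two example rows can be reproduced with a short sketch. It assumes the address column is written as the 9 bit tag followed by the remaining 15 bits, so "1FF 7FFC" is the 24 bit address FFFFFC and "001 7FFC" is 00FFFC. The function name `split_set_associative` is hypothetical.

```python
WORD_BITS, SET_BITS = 2, 13   # field widths from the slide; the tag is the top 9 bits


def split_set_associative(address):
    """Split a 24-bit address into (tag, set_number, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_number = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_number, word


for address in (0xFFFFFC, 0x00FFFC):
    tag, set_number, word = split_set_associative(address)
    print(f"{address:06X}: tag={tag:03X} set={set_number:04X} word={word}")
```

Both addresses land in set 1FFF with different tags, so a two-way cache can hold both blocks at the same time.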



RK 25

Two Way Set Associative Mapping Example



RK 26

Replacement Algorithms (1)

Direct mapping

No choice

Each block only maps to one line

Replace that line



RK 27

Replacement Algorithms (2)

Associative & Set Associative

Hardware implemented algorithm (speed)

Least recently used (LRU)

e.g. in 2 way set associative

Which of the 2 blocks is LRU?

First in first out (FIFO)

replace block that has been in cache longest

Least frequently used

replace block which has had fewest hits

Random
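As an illustration of LRU within one set, here is a small software sketch. Real caches implement this in hardware (for a 2 way set, a single bit per set is enough); the `LRUSet` class below is a hypothetical model that uses Python's `OrderedDict` to keep the lines in order of use.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with `ways` lines and least-recently-used replacement."""

    def __init__(self, ways=2):
        self.ways = ways
        self.lines = OrderedDict()   # tag -> block, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, load `tag`, evicting the LRU line if full."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[tag] = None               # block contents omitted in this sketch
        return False


two_way = LRUSet(ways=2)
results = [two_way.access(tag) for tag in (0x1FF, 0x001, 0x1FF, 0x002, 0x001)]
print(results, [hex(tag) for tag in two_way.lines])
```

When tag 002 arrives, tag 001 is evicted rather than 1FF, because 1FF was used more recently. FIFO would instead have evicted 1FF, the block loaded first.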



RK 28

Write Policy

Must not overwrite a cache block unless main memory is up to date

Multiple CPUs may have individual caches

I/O may address main memory directly



RK 29

Write through

All writes go to main memory as well as cache

Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date

Lots of traffic

Slows down writes

Remember bogus write through caches!



RK 30

Write back

Updates initially made in cache only

Update bit for cache slot is set when update occurs

If block is to be replaced, write to main memory only if update bit is set

Other caches get out of sync

I/O must access main memory through cache

N.B. 15% of memory references are writes
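The role of the update bit can be sketched with a one-line cache. Everything here is a simplified, hypothetical model: `WriteBackLine`, `write` and `evict` are illustrative names, a whole block is treated as a single value, and `log` only records which blocks were written back to main memory.

```python
class WriteBackLine:
    """A single cache line with an update (dirty) bit."""

    def __init__(self):
        self.block = None      # number of the block currently held
        self.data = None       # contents of that block
        self.updated = False   # update bit: set when the cached copy is modified


def evict(line, memory, log):
    """Write the line back to main memory only if its update bit is set."""
    if line.block is not None and line.updated:
        memory[line.block] = line.data
        log.append(line.block)


def write(line, block, value, memory, log):
    """Write `value` to `block` in the cache only; main memory is updated later."""
    if line.block != block:                    # miss: replace the current block
        evict(line, memory, log)
        line.block, line.data, line.updated = block, memory.get(block), False
    line.data = value
    line.updated = True


memory = {0: "a", 1: "b"}
line, log = WriteBackLine(), []
write(line, 0, "A", memory, log)
write(line, 0, "AA", memory, log)              # memory[0] is still "a" here
stale_copy = memory[0]
write(line, 1, "B", memory, log)               # replacing block 0 forces a write-back
print(stale_copy, memory, log)
```

Two writes to block 0 cost only one memory write, which is the traffic saving over write through. The stale value in main memory in between is why other caches and I/O can get out of sync.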



RK 31

Line Size

As the block size increases from very small to larger sizes, the cache hit ratio will at first increase because of the principle of locality.

As the block size increases, more useful data are brought into the cache.

However, the cache hit ratio will begin to decrease as the block becomes even larger:

Larger blocks reduce the number of blocks that fit into a cache. Because each block fetch overwrites older cache contents, a small number of blocks results in data being overwritten shortly after they are fetched.

As a block becomes larger, each additional word is farther from the requested word, therefore less likely to be needed in the near future.



RK 32

Multi-Level Caches

Increases in transistor densities have allowed for caches to be placed inside the processor chip

Internal caches have very short wires (within the chip itself) and are therefore quite fast, even faster than any zero wait-state memory accesses outside of the chip

This means that a super fast internal cache (level 1) can be inside of the chip while an external cache (level 2) can provide access faster than main memory



RK 33

Unified versus Split Caches

Split into two caches – one for instructions, one for data

Disadvantages

Questionable, since a unified cache balances data and instructions simply by hit rate.

Hardware is simpler with unified cache

Advantage

What a split cache is really doing is providing one cache for the instruction decoder and one for the execution unit.

This supports pipelined architectures.



RK 34

Problem

A set associative cache consists of 64 lines, or slots, divided into four-line sets. Main memory contains 4K blocks of 128 words each. Show the format of main memory addresses.

A two-way set associative cache has lines of 16 bytes and a total size of 8 Kbytes. The 64-Mbyte main memory is byte-addressable. Show the format of main memory addresses.
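One way to check an answer to these problems is to compute the field widths mechanically. The helper below is a hypothetical sketch, not part of the lecture. It assumes K = 1024, that every size is a power of two, and that memory in the first problem is word-addressable.

```python
def set_associative_fields(memory_units, block_units, cache_lines, ways):
    """Return (tag_bits, set_bits, word_bits) for a set associative cache.

    `memory_units` and `block_units` are counted in addressable units
    (words or bytes); all sizes are assumed to be powers of two.
    """
    def log2(n):
        assert n > 0 and n & (n - 1) == 0, "size must be a power of two"
        return n.bit_length() - 1

    word_bits = log2(block_units)
    set_bits = log2(cache_lines // ways)
    tag_bits = log2(memory_units) - set_bits - word_bits
    return tag_bits, set_bits, word_bits


# Problem 1: 64 lines in four-line sets, 4K blocks of 128 words.
first = set_associative_fields(4096 * 128, 128, 64, 4)
# Problem 2: two-way, 16-byte lines, 8 Kbytes total, 64 Mbytes of memory.
second = set_associative_fields(64 * 2**20, 16, (8 * 1024) // 16, 2)
print(first, second)
```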

Date post:	30-May-2018
Upload:	abul-asar-sayyad