
Rev. 3 (2005-06) by Enrico Nardelli. William Stallings, Computer Organization and Architecture, Chapter 4: Internal Memory

Date post: 21-Dec-2015

William Stallings Computer Organization and Architecture

Chapter 4 Internal Memory


Memory

• How much? As much as possible

• How fast? As fast as possible

• How expensive? As cheap as possible

• Fast memory is expensive
• Large memory is expensive
• The larger the memory, the slower the access


Memory Hierarchy

• CPU registers
• L1 cache (on chip)
• L2 cache (on board)
• Main memory
• Disk cache
• Disk
• Optical
• Tape

[Figure: arrows alongside the hierarchy labelled Access time, Size, Access frequency, Cost per bit]


Characteristics

• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation


Location

• CPU Registers

• Internal: accessed directly by the CPU
  - Cache
  - RAM

• External: accessed through an I/O module
  - Disks
  - CD-ROM, …


Capacity

• Word size
  - The natural unit of organisation
  - Usually equal to the number of bits used to represent numbers or instructions
  - Typical word sizes: 8 bits, 16 bits, 32 bits

• Number of words (or Bytes)
  - 1 Byte = 8 bits = $2^3$ bits
  - 1 KByte = $2^{10}$ Bytes = $2^{10} \times 2^3$ bits = 1024 Bytes (Kilo)
  - 1 MByte = $2^{10}$ KBytes = 1024 KBytes (Mega)
  - 1 GByte = $2^{10}$ MBytes = $2^{30}$ Bytes (Giga)
  - 1 TByte = $2^{10}$ GBytes = 1024 GBytes (Tera)


Unit of Transfer

• Number of bits that can be read or written at the same time

• Internal
  - Usually governed by the data bus width
  - Bus width may be equal to the word size or (often) larger
  - Typical bus widths: 64, 128, 256 bits

• External
  - Usually a block, which is much larger than a word

• A related concept: addressable unit
  - Smallest location which can be uniquely addressed
  - Word internally
  - Cluster on M$ disks


Access Methods (1)

• Sequential
  - Start at the beginning and read through in order
  - Access time depends on the location of the data and on the previous location
  - e.g. tape

• Direct
  - Individual blocks have a unique address
  - Access is by jumping to the vicinity plus a sequential search
  - Access time depends on the location and on the previous location
  - e.g. disk


Access Methods (2)

• Random
  - Individual addresses identify locations exactly
  - Access time is independent of location or previous access
  - e.g. RAM

• Associative
  - Data is located by a comparison with the contents of a portion of the store
  - Access time is independent of location or previous access
  - e.g. cache


Performance

• Access time
  - Time between presenting the address and getting the valid data

• Memory cycle time
  - Time may be required for the memory to “recover” before the next access
  - Cycle time is access + recovery

• Transfer rate
  - Rate at which data can be moved
  - $T_N = T_A + N/R$
    - $N$: number of bits
    - $T_A$: access time
    - $T_N$: time needed to read $N$ bits
    - $R$: transfer rate
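The transfer-time relation above can be checked numerically. The following is a minimal sketch; the function name `time_to_read` and the example figures are illustrative assumptions, not part of the slides.

```python
def time_to_read(n_bits, access_time_s, rate_bits_per_s):
    """Return T_N = T_A + N / R, the time needed to read n_bits.

    access_time_s   -- T_A, access time in seconds
    rate_bits_per_s -- R, transfer rate in bits per second
    """
    if rate_bits_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    return access_time_s + n_bits / rate_bits_per_s


if __name__ == "__main__":
    # Hypothetical device: 1 ms access time, 1 Mbit/s transfer rate.
    print(time_to_read(1000, 0.001, 1_000_000))
```

For short transfers the access time dominates, while for long transfers the N/R term does.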


Physical Types

• Semiconductor
  - RAM, ROM, EPROM, Cache

• Magnetic
  - Disk & Tape

• Optical
  - CD & DVD

• Others
  - Bubble
  - Hologram


Semiconductor Memory

• RAM (Random Access Memory)
  - Misnamed, as all semiconductor memories are random access
  - Read/Write
  - Volatile
  - Temporary storage
  - Static or dynamic

• ROM (Read Only Memory)
  - Permanent storage
  - Read only


Dynamic RAM

• Bits stored as charge in capacitors
• Charges leak
• Needs refreshing even when powered
• Simpler construction
• Smaller per bit
• Less expensive
• Needs refresh circuits
• Slower
• Main memory (static RAM would be too expensive)


Static RAM

• Bits stored as on/off switches
• No charges to leak
• No refreshing needed when powered
• More complex construction
• Larger per bit
• More expensive
• Does not need refresh circuits
• Faster
• Cache (here the faster the better)


Read Only Memory (ROM)

• Permanent storage
• Microprogramming (see later)
• Library subroutines
• Systems programs (BIOS)
• Function tables


Types of ROM

• Written during manufacture
  - Very expensive for small runs

• Programmable (once)
  - PROM
  - Needs special equipment to program

• Read “mostly”
  - Erasable Programmable (EPROM)
    - Erased by UV (it can take up to 20 minutes)
  - Electrically Erasable (EEPROM)
    - Takes much longer to write than to read
    - A single byte can be erased
  - Flash memory
    - Memory is erased electrically, “block-at-a-time”


Physical Characteristics

• Decay (refresh time)
• Volatility (needs power source)
• Erasable
• Power consumption


Organisation

• Physical arrangement of bits into words
• Not always obvious
  - e.g. interleaved


Basic Organization (1)

• Basic element: memory cell
  - has 2 stable states: one represents 0, the other 1
  - can be written at least once
  - can be read

[Figure: a memory cell during Write (Select, R/W Control, Input Data) and during Read (Select, R/W Control, Output Data)]


Basic Organization (2)

• Basic organization of a 512x512 bits chip

[Figure: array of memory cells (512x512); address lines A0-A8 and A9-A17 (9 bits each) feed the row address decoder and the column address decoder; sense amplifier and I/O gate with a 1-bit data line D0; timing and control]


Module Organisation

• Basic organization of a 256KB chip

• 8 times a 512x512 bits chip

• For a 1 MB chip, replicate this organization 4 times


Module Organisation (1 MByte)


Organisation for larger sizes

• The larger the size, the higher the number of address pins

• For $2^k$ words, $k$ pins are needed

• A solution to reduce the number of address pins
  - Multiplex row address and column address
  - $k/2$ pins to address $2^k$ Bytes
  - Adding one more pin doubles the range of both the row and the column values, so x4 capacity
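The multiplexing idea can be sketched in a few lines. This is only an illustration: the names `split_address` and `capacity_with_pins` are hypothetical, and sending the high half of the address as the row is an assumption, since the slides do not say which half goes first.

```python
def split_address(addr, k):
    """Split a k-bit address (k even) into (row, column), k/2 bits each.

    Assumption: the high half is the row, the low half is the column.
    Both halves travel over the same k/2 pins, one after the other.
    """
    if k % 2 != 0 or not 0 <= addr < (1 << k):
        raise ValueError("need an even k and an address that fits in k bits")
    half = k // 2
    row = addr >> half
    column = addr & ((1 << half) - 1)
    return row, column


def capacity_with_pins(pins):
    """Addressable units with `pins` multiplexed address pins: 2^(2 * pins)."""
    return 1 << (2 * pins)


if __name__ == "__main__":
    # 18-bit address on 9 multiplexed pins, as in a 512x512 array.
    print(split_address((5 << 9) | 3, 18))
    # One more pin quadruples the capacity.
    print(capacity_with_pins(10) // capacity_with_pins(9))
```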


Typical 16 Mb DRAM (4M x 4)



Refreshing (Dynamic RAM)

• Refresh circuit included on chip
• Disable chip
• Count through rows
• Read & write back
• Takes time
• Slows down apparent performance


Packaging



Error Correction

• Hard failure
  - Permanent defect

• Soft error
  - Random, non-destructive
  - No permanent damage to memory

• Detected using a Hamming error correcting code
  - It is able to detect and correct 1-bit errors


Error Correcting Code Function


A simple example of correction (1)

• Correcting errors in 4-bit words
• 3 control groups
• In each control group add 1 parity bit

[Figure: Venn diagrams with three circles A, B, C holding the 4 data bits and the added parity bits]


A simple example of correction (2)

• One of the bits changes value

• Using the control bits, the right value is restored

[Figure: the same Venn diagrams with circles A, B, C, shown with one bit changed and then restored]


Compare Circuit

• It takes two K-length binary strings X, Y as input
  - $X = X_K \dots X_1$
  - $Y = Y_K \dots Y_1$

• It returns a K-length binary string Z (syndrome)
  - $Z = Z_K \dots Z_1$
  - $Z_i = X_i \oplus Y_i$ for each $i = 1, \dots, K$

• Z = 0…0 means no error


Relation between M and K

• Z may assume $2^K$ values
• The value Z = 0…0 means no error
• The error may be in any bit among the M+K bits
• It must be $2^K - 1 \ge M + K$

Data bits (M)   Control bits (K)   Additional memory (%)
4               3                  75
8               4                  50
16              5                  31.25
32              6                  18.75
64              7                  10.94
128             8                  6.25
256             9                  3.52
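The table follows directly from the inequality $2^K - 1 \ge M + K$. As a rough check, the sketch below searches for the smallest K for each M; the function name `control_bits` is an illustrative assumption.

```python
def control_bits(m):
    """Smallest K such that 2^K - 1 >= M + K."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k


if __name__ == "__main__":
    for m in (4, 8, 16, 32, 64, 128, 256):
        k = control_bits(m)
        # Overhead is K / M expressed as a percentage.
        print(m, k, round(100 * k / m, 2))
```

Doubling M adds only one control bit, which is why the relative overhead shrinks as words get longer.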


How to arrange the M+K bits

• The M+K bits are arranged so that
  - if Z contains a single bit equal to 1
    - the error occurred in the corresponding control bit
  - if Z contains more than one bit equal to 1
    - the error occurred in the i-th bit, where i is the value (in binary) of Z


The case M=4

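bit position      7     6     5     4     3     2     1
position number   111   110   101   100   011   010   001
data bits         D4    D3    D2          D1
control bits                        C4          C2    C1

$C_1 = D_1 \oplus D_2 \oplus D_4$
$C_2 = D_1 \oplus D_3 \oplus D_4$
$C_4 = D_2 \oplus D_3 \oplus D_4$

[Figure: Venn diagram with circles C1, C2, C4 containing the data bits D1, D2, D3, D4]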
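The M=4 layout can be exercised end to end. The sketch below assumes the bit positions from the table (1 to 7 holding C1, C2, D1, C4, D2, D3, D4) and XOR as the parity operation; the function names `encode` and `correct` are hypothetical.

```python
def encode(d1, d2, d3, d4):
    """Return the 7-bit word as a list; index 0 is bit position 1."""
    c1 = d1 ^ d2 ^ d4
    c2 = d1 ^ d3 ^ d4
    c4 = d2 ^ d3 ^ d4
    return [c1, c2, d1, c4, d2, d3, d4]


def correct(word):
    """Recompute the control bits, build the syndrome Z, fix a 1-bit error.

    Returns (corrected_word, z), where z is the position in error (0 = none).
    """
    w = list(word)
    c1, c2, d1, c4, d2, d3, d4 = w
    z1 = c1 ^ d1 ^ d2 ^ d4
    z2 = c2 ^ d1 ^ d3 ^ d4
    z4 = c4 ^ d2 ^ d3 ^ d4
    z = 4 * z4 + 2 * z2 + z1
    if z:
        w[z - 1] ^= 1  # flip the bit the syndrome points at
    return w, z


if __name__ == "__main__":
    stored = encode(1, 0, 1, 1)
    damaged = list(stored)
    damaged[5] ^= 1  # corrupt bit position 6 (D3)
    print(correct(damaged))
```

A syndrome of 1, 2 or 4 points at a control bit, and any other non-zero value points at a data bit, matching the arrangement rule stated earlier.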


Exercise

• Design a Hamming error correcting code for 8-bit words

• See the textbook for the solution


Cache

• Small amount of fast memory
• Sits between normal main memory and CPU
• May be located on CPU chip or module


Cache operation - overview

• CPU requests contents of memory location
• Check cache for this data
• If present (hit), get from cache (fast)
• If not present (miss), read required block from main memory to cache
• Then deliver from cache to CPU


Cache Performance

• Cache access time: t = 1
• Memory access time: T = 10
• Hit probability: H

$T_{\text{average access}} = t \cdot H + (T + t) \cdot (1 - H) = t + (1 - H) \cdot T$

[Figure: plot of T average access (vertical axis, 0 to 10) against H (horizontal axis, 0.10 to 1.00)]
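The average access time formula is easy to tabulate. The sketch below uses the slide's values t = 1 and T = 10 as defaults; the function name `average_access_time` is an illustrative assumption.

```python
def average_access_time(hit_prob, t_cache=1.0, t_memory=10.0):
    """Return t*H + (T + t)*(1 - H), which simplifies to t + (1 - H)*T."""
    if not 0.0 <= hit_prob <= 1.0:
        raise ValueError("hit probability must be between 0 and 1")
    return t_cache + (1.0 - hit_prob) * t_memory


if __name__ == "__main__":
    for h in (0.5, 0.9, 0.99):
        print(h, average_access_time(h))
```

The time falls linearly with H, so each extra point of hit probability saves the same amount of time.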


Locality of Reference (Denning’68)

• Spatial locality
  - Memory cells physically close to those just accessed tend to be accessed

• Temporal locality
  - During the execution of a program, accesses to the same memory cells tend to be close in time

• e.g. loops, arrays


Typical Cache Organization


Cache Design

• Size
• Mapping function
• Replacement algorithm
• Write policy
• Block size
• Number of caches


Size does matter

• Cost
  - More cache is expensive

• Speed
  - More cache is faster (up to a point)
  - Checking cache for data takes time


Cache-memory mapping

• Main memory has $M = 2^n / K$ blocks (n address bits, blocks of K words); the cache has C lines, with C << M

• Each block is mapped to a cache line


Mapping Function

• Word size: 1 Byte
• Cache of 64 KBytes ($2^{16}$ Bytes)
• Cache block of 4 bytes
  - 64 KB / 4 = 16K ($2^{14}$) lines of 4 bytes

• 16 MBytes ($2^{24}$) main memory
  - $2^{24} / 4$ = 4M ($2^{22}$) blocks in main memory

• Map $2^{22}$ blocks to $2^{14}$ lines of cache


A simple example of Direct Mapping

[Figure: 32 memory words with 5-bit addresses 00000 to 11111, grouped in pairs into Block 0 to Block 15; the address is split into the fields s-r, r, w; Blocks 0, 1, 2, 3 map to Lines 0, 1, 2, 3, Block 4 maps to Line 0 and Block 15 maps to Line 3]


Direct Mapping (1)

• Each block of main memory is mapped to a specific cache line
  - i.e. if a block is in cache, it must be in one specific place

• In a cache of C lines, block j is stored into line i, where:
  - i = j mod C


Direct Mapping (2)

• Address is in two parts
  - w least significant bits (LSB) identify a unique word
  - s most significant bits (MSB) specify one memory block

• The MSBs are split into
  - a cache line field of r bits (least significant)
  - a tag of s-r bits (most significant)


Direct Mapping: Summarizing

• Address length: n = s+w bits
• Number of addressable units (words): $2^{s+w}$
• Block size = cache line size = $2^w$ words
• Number of memory blocks: $2^{s+w} / 2^w = 2^s$
• Number of cache lines: $C = 2^r$
• Tag length: (s-r) bits


Cache Line Mapping Table

Cache line   Main memory blocks held
0            0, C, 2C, …, $2^s - C$
1            1, C+1, 2C+1, …, $2^s - C + 1$
…
C-1          C-1, 2C-1, 3C-1, …, $2^s - 1$


Direct Mapping Address Structure

Tag (s-r)   Line or Slot (r)   Word (w)
8 bits      14 bits            2 bits

• 24-bit address: 16 MBytes ($2^{24}$) main memory
• 2-bit word identifier (4-byte block)
• Cache: 64 KB / 4 = 16K ($2^{14}$) lines of 4 bytes
• 22-bit block identifier
  - 8-bit tag (= 22 - 14)
  - 14-bit slot or line

• No two blocks mapping to the same line have the same Tag field

• Check contents of cache by finding the line and checking the Tag
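The 8/14/2 split can be reproduced with shifts and masks. This is a minimal sketch; the name `split_direct` and the sample address are illustrative assumptions.

```python
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2


def split_direct(addr):
    """Split a 24-bit address into (tag, line, word) for the cache above."""
    if not 0 <= addr < (1 << (TAG_BITS + LINE_BITS + WORD_BITS)):
        raise ValueError("address does not fit in 24 bits")
    word = addr & ((1 << WORD_BITS) - 1)
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word


if __name__ == "__main__":
    tag, line, word = split_direct(0x16339C)
    print(hex(tag), hex(line), word)
```

The line field is the block number (address divided by 4) taken modulo $2^{14}$, which is the rule i = j mod C expressed on bit fields.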


Direct Mapping Cache Organization


Direct Mapping pros & cons

• Simple
• Inexpensive
• Fixed location for a given block
  - If a program repeatedly accesses 2 distinct blocks that map to the same line, the cache miss rate is very high (thrashing)
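Thrashing is easy to reproduce in a toy model. The sketch below counts misses in a direct-mapped cache that tracks only which block each line holds; the function name `direct_mapped_misses` and the reference strings are illustrative assumptions.

```python
def direct_mapped_misses(block_refs, num_lines):
    """Count misses for a sequence of block numbers, using i = j mod C."""
    lines = [None] * num_lines  # block currently held by each line
    misses = 0
    for j in block_refs:
        i = j % num_lines
        if lines[i] != j:
            misses += 1
            lines[i] = j  # no choice: the block replaces that line
    return misses


if __name__ == "__main__":
    # Blocks 0 and 4 collide on line 0 of a 4-line cache: every access misses.
    print(direct_mapped_misses([0, 4] * 10, 4))
    # Blocks 0 and 1 use different lines: only the first two accesses miss.
    print(direct_mapped_misses([0, 1] * 10, 4))
```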


Associative Mapping

• A main memory block can load into any line of cache

• Memory address is interpreted as tag and word

• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive


A simple example of Associative Mapping

[Figure: 32 memory words with 5-bit addresses 00000 to 11111, grouped in pairs into Block 0 to Block 15; the address is split into the fields s (tag) and w; cache Lines 0 to 3 hold the tags 0011, 0001, 0000, 0100, each with words w=0 and w=1]

Note: a replacement algorithm is needed (see later)


Associative Mapping: Summarizing

• Address length: n = s+w bits
• Number of addressable units (words): $2^{s+w}$
• Block size = cache line size = $2^w$ words
• Number of memory blocks: $2^{s+w} / 2^w = 2^s$
• Number of cache lines: not specified
• Tag length: s bits


Associative Mapping Address Structure

Tag       Word
22 bits   2 bits

• 22-bit tag stored with each 4-byte block of data

• Compare tag field with tag entry in cache to check for hit

• Least significant 2 bits of address identify which byte is required from the 4-byte data block


Fully Associative Cache Organization


Set Associative Mapping

• Cache is divided into v sets
• Each set contains k lines
• Number of cache lines: C = v·k
• A given block maps to any line in a given set
  - Block j can be in any line of set i, where i = j mod v

• There are k lines in a set (k-way set associative mapping)
  - k = 1: direct mapping; k = C: associative mapping

• The best choice in practice is 2 lines per set
  - 2-way set associative mapping
  - A given block can be in only one set, but in any of its 2 lines


A simple example of Set Associative Mapping

[Figure: 32 memory words with 5-bit addresses 00000 to 11111, grouped in pairs into Block 0 to Block 15; the address is split into the fields s-d (tag), d (set), w; blocks alternate between Set 0 and Set 1; cache Lines 0 to 3 hold the tags 010, 000, 111, 000, each with words w=0 and w=1, two lines per set]

Note: a replacement algorithm is needed (see later)


Set Associative Mapping

• Address is in two parts
  - w least significant bits (LSB) identify a unique word
  - s most significant bits (MSB) specify one memory block

• The MSBs are split into
  - a cache set field of d bits (least significant)
  - a tag of s-d bits (most significant)


Set Associative Mapping: Summarizing

• Address length: n = s+w bits
• Number of addressable units (words): $2^{s+w}$
• Block size = cache line size = $2^w$ words
• Number of memory blocks: $2^{s+w} / 2^w = 2^s$
• Number of lines for each cache set: k
• Number of sets: $v = 2^d$
• Number of cache lines: $C = k \cdot v = k \cdot 2^d$
• Tag length: (s-d) bits


Set Associative Mapping Address Structure

Tag      Set       Word
9 bits   13 bits   2 bits

• Number of cache lines: $2^{14}$
• Number of cache sets: $2^{13}$
• Each cache set has two lines: 2-way set associative mapping
• Use the set field to determine which cache set to look in
• Compare the Tag field with all lines in the set to see if we have a hit
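The 9/13/2 split works the same way as in direct mapping, with a set field in place of the line field. A minimal sketch follows; the name `split_set_associative` and the sample address are illustrative assumptions.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2


def split_set_associative(addr):
    """Split a 24-bit address into (tag, set_index, word)."""
    if not 0 <= addr < (1 << (TAG_BITS + SET_BITS + WORD_BITS)):
        raise ValueError("address does not fit in 24 bits")
    word = addr & ((1 << WORD_BITS) - 1)
    set_index = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


if __name__ == "__main__":
    tag, set_index, word = split_set_associative(0x16339C)
    print(hex(tag), hex(set_index), word)
```

Compared with the direct-mapped split, the set field is one bit shorter and the tag one bit longer, because the $2^{14}$ lines are grouped two per set.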


Two Way Set Associative Cache Organization


Replacement Algorithms (1)
Direct mapping

• No choice
• Each block only maps to one line
• Replace that line


Replacement Algorithms (2)
Associative & Set Associative

• Hardware implemented algorithm (to obtain speed)
• Least recently used (LRU)
  - e.g. in 2-way set associative: which of the 2 blocks is LRU?
• First in first out (FIFO)
  - Replace the block that has been in cache longest
• Least frequently used
  - Replace the block which has had the fewest hits
• Random
  - Almost as good as LRU
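LRU replacement in a set associative cache can be modelled in software, although real caches implement it in hardware. The sketch below is only a toy model that tracks tags; the class name `SetAssociativeCache` and the use of `OrderedDict` to keep the recency order are assumptions made for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """k-way set associative cache of block tags with LRU replacement."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One ordered dict per set: oldest (least recently used) tag first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, block):
        """Access a block number; return True on a hit, False on a miss."""
        lines = self.sets[block % self.num_sets]  # i = j mod v
        tag = block // self.num_sets
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used tag
        lines[tag] = True
        return False


if __name__ == "__main__":
    cache = SetAssociativeCache(num_sets=2, ways=2)
    for block in [0, 4] * 10:  # both blocks fall in set 0
        cache.access(block)
    print(cache.hits, cache.misses)
```

With 2 ways, the two conflicting blocks that thrash a direct-mapped cache coexist in one set. Three blocks cycling through the same set would still miss on every access under LRU.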


Write Policy

• Must not overwrite a cache block unless main memory is up to date

• Multiple CPUs may have individual caches
• I/O may address main memory directly


Write through

• All writes go to main memory as well as cache

• Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date

• Lots of traffic
• Slows down writes


Write back

• Updates initially made in cache only
• Update bit for the cache slot is set when an update occurs
• If a block is to be replaced, write to main memory only if the update bit is set
• I/O must access main memory through the cache
• N.B. 15% of memory references are writes
• Caches of other devices get out of sync
  - Cache coherency problem (a general problem in distributed systems!)
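The difference in memory traffic between the two write policies can be illustrated with a deliberately tiny model. The sketch below assumes a cache with a single line, allocation of the block on a write miss, and a final flush of any updated block; the name `count_memory_writes` and the operation format are hypothetical.

```python
def count_memory_writes(ops, write_back):
    """Count writes reaching main memory for a one-line cache.

    ops        -- list of (op, block) pairs, op being "r" or "w"
    write_back -- True for write back, False for write through
    """
    cached = None   # block currently held in the single cache line
    dirty = False   # the "update bit" of that line
    writes = 0
    for op, block in ops:
        if block != cached:
            if write_back and dirty:
                writes += 1  # replaced block was updated: write it back
            cached, dirty = block, False
        if op == "w":
            if write_back:
                dirty = True  # update the cache only, set the update bit
            else:
                writes += 1   # write through: every write goes to memory
    if write_back and dirty:
        writes += 1  # flush the last updated block
    return writes


if __name__ == "__main__":
    ops = [("w", 0)] * 5 + [("r", 1)]
    print(count_memory_writes(ops, write_back=False))
    print(count_memory_writes(ops, write_back=True))
```

Repeated writes to the same block cost one memory write under write back and one per write under write through, at the price of main memory being out of date in between.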


Block Size

• Too small
  - Locality of reference is not used

• Too large
  - Locality of reference is lost


Number of Caches

• 2 levels of cache
  - L1 on chip (since technology allows it)
  - L2 on board (to fill the speed gap)

• 2 kinds of cache
  - Data cache
  - Instruction cache
  - So that parallel instruction processing and data fetching do not interfere with each other

