
Computer Architecture

Date post: 03-Jan-2016
Upload: miriam-gallagher
Page 1: Computer Architecture

Computer Architecture

Lecture 8: Memory hierarchy. Cache memory

Piotr Bilski

Page 2: Computer Architecture

Characteristics of the memory systems

• Location
• Capacity
• Transfer unit
• Access mode
• Performance
• Physical structure
• Physical characteristics
• Organization

Page 3: Computer Architecture

Memory location

• Processor (registers, L1 cache memory)

• Internal (main) memory (RAM)

• External memory (auxiliary – disk drives)

Page 4: Computer Architecture

Memory capacity

• Word size

• Number of words

• Memory capacity is expressed in bytes and their multiples, so:

1 B = 8 b

1 KB = 1024 B, 1 MB = 1024 KB etc.

Page 5: Computer Architecture

Transfer unit

• Number of data lines connected to the memory module (normally equal to the word length), but:
– The word is the basic unit of memory organization
– The addressable unit is the unit used for direct memory addressing (byte or word)
– The transfer unit can be equal to the word or to the addressable unit

Page 6: Computer Architecture

Memory access modes

• Sequential access (e.g. tape memory)

• Direct access (disk memory)

• Random access (main memory)

• Associative access (cache memory)

Page 7: Computer Architecture

Memory performance

• Access time – time between placing the address on the address bus and receiving the data on the data bus

• Cycle time – access time plus the additional time that must elapse before the next access can begin

• Transfer speed – for RAM: 1 / cycle time

Page 8: Computer Architecture

Physical memory structure

• Semiconductor (RAM, ROM)

• Magnetic (hard disks, floppy disks, streamers)

• Optical (CD-ROM, DVD-ROM)

• Magnetooptical (WORM)

Page 9: Computer Architecture

Physical characteristics

• Volatility
– Volatile memory (RAM)
– Non-volatile memory (ROM)

• Content modification
– Erasable (e.g. RAM, EPROM)
– Non-erasable (ROM)

Page 10: Computer Architecture

Memory organization

• One level ("flat")

• Multilevel (e.g. cache)

[Figure: access time of a two-level memory plotted against the hit ratio (0 to 1); the access-time axis is marked T1, T2 and T1 + T2]
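The relation between hit ratio and access time can be sketched with the usual two-level model, assuming that a hit costs T1 and a miss costs T1 + T2 (the faster level is checked first). The function name and the sample timings below are illustrative assumptions, not values from the lecture.

```python
def average_access_time(hit_ratio, t1, t2):
    """Average access time of a two-level memory.

    Assumed model: a hit costs t1, a miss costs t1 + t2.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie in [0, 1]")
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2)


# Illustrative numbers only: t1 = 10 ns (fast level), t2 = 100 ns (slow level).
for h in (0.0, 0.5, 0.95, 1.0):
    print(f"H = {h:.2f}: {average_access_time(h, 10.0, 100.0):.1f} ns")
```

Under this model the average falls linearly from T1 + T2 at hit ratio 0 to T1 at hit ratio 1.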

Page 11: Computer Architecture

Memory hierarchy

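• Levels, from the top: processor registers, cache memory, main (operational) memory, external memory

• Quantities that change across the levels: capacity, speed, access time, cost

• Trade-offs: access time vs. cost per bit, capacity vs. cost per bit, capacity vs. access time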

Page 12: Computer Architecture

Why do we need cache memory?

• Locality of reference – an executing program consists of fragments that lie next to each other in memory and are executed one after another, so memory references tend to cluster

• Temporal locality

• Spatial locality
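A rough way to see why locality matters is to count hits in a toy cache for different access patterns. The sketch below assumes a direct-mapped cache with 64 rows of 4 words each; these sizes, the function name and the three address sequences are hypothetical and chosen only for illustration.

```python
def hit_ratio(addresses, num_rows=64, words_per_block=4):
    """Hit ratio of a toy direct-mapped cache for a sequence of word addresses."""
    rows = [None] * num_rows           # each row remembers which block it holds
    hits = 0
    for addr in addresses:
        block = addr // words_per_block
        row = block % num_rows
        if rows[row] == block:
            hits += 1                  # the block is already cached
        else:
            rows[row] = block          # miss: load the block into its row
    return hits / len(addresses)


sequential = list(range(1024))               # neighbouring words: spatial locality
loop = list(range(16)) * 64                  # same words reused: temporal locality
scattered = [i * 1024 for i in range(1024)]  # no reuse, no neighbours

for name, seq in (("sequential", sequential), ("loop", loop), ("scattered", scattered)):
    print(name, hit_ratio(seq))
```

The sequential scan hits on three of every four words, the loop hits on almost every access, and the scattered pattern never hits.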

Page 13: Computer Architecture

Cache memory work regime

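[Figure: structure of the cache memory and of the main memory]

• Cache memory has C rows, numbered 0 to C-1; each row holds a flag and a block of K words

• Main memory is addressed using n bits (2^n words in total, addresses 0 to 2^n - 1)

• Main memory is divided into blocks of K words each (Block 1 ... Block N)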

Page 14: Computer Architecture

Cache memory work regime (cont.)

• Between the processor and the cache memory: transfer of words

• Between the cache memory and the main memory: transfer of blocks

Page 15: Computer Architecture

Reading from cache memory

• START: the address is acquired from the CPU

• Is the block containing this address in the cache memory?
– YES: the word is transferred to the CPU
– NO: main memory is accessed for the addressed block, the block is assigned to a cache memory row, the block is transferred into the cache memory and the word is transferred to the CPU

• EXECUTION
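The read procedure can be sketched as a small class. This is a minimal illustration that assumes direct mapping for the row assignment step and uses a Python list as a stand-in for main memory; `ToyCache` and its parameters are hypothetical names.

```python
class ToyCache:
    """Toy cache in front of a list that stands in for main memory."""

    def __init__(self, memory, num_rows=4, words_per_block=4):
        self.memory = memory
        self.num_rows = num_rows
        self.k = words_per_block
        self.rows = {}  # row index -> (block number, list of K words)

    def read(self, address):
        """Return (word, hit) for a word address."""
        block = address // self.k
        row = block % self.num_rows             # row assignment (direct mapping assumed)
        entry = self.rows.get(row)
        hit = entry is not None and entry[0] == block
        if not hit:
            # NO branch: fetch the whole block from main memory into the row.
            start = block * self.k
            self.rows[row] = (block, self.memory[start:start + self.k])
        # Both branches end by transferring the word to the CPU.
        return self.rows[row][1][address % self.k], hit


memory = list(range(100, 164))   # 64 words holding the values 100..163
cache = ToyCache(memory)
print(cache.read(5))    # first access to block 1: miss
print(cache.read(6))    # same block: hit
print(cache.read(21))   # block 5 maps to the same row and replaces block 1
print(cache.read(5))    # block 1 has to be fetched again
```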

Page 16: Computer Architecture

Details of the cache memory

• Size

• Mapping

• Replacement algorithm

• Writing algorithm

• Row size

• Number of the cache memories

Page 17: Computer Architecture

Size of the cache memory

• Minimization of the memory cost
• Maximization of the processor’s speed

Processor | Type | Prod. year | L1 cache (instruction / data) | L2 memory | L3 memory
IBM 360 | Mainframe | 1968 | 16-32 KB | None | None
IBM 3033 | Mainframe | 1978 | 64 KB | None | None
80486 | PC | 1989 | 8 KB | None | None
Pentium | PC | 1993 | 8 KB / 8 KB | 256/512 KB | None
PowerPC G4 | PC/server | 1999 | 32 KB / 32 KB | 256 KB/1 MB | 2 MB
Pentium 4 | PC/server | 2000 | 8 KB / 8 KB | 256 KB | None
Itanium | PC/server | 2001 | 16 KB / 16 KB | 96 KB | 4 MB
Athlon XP | PC/server | 1999 | 64 KB / 64 KB | 512 KB | None
Athlon 64 | PC/server | 2002 | 64 KB / 64 KB | 1 MB | None

Rows with a single L1 value list only one L1 cache size.

Page 18: Computer Architecture

Mapping function

• The number of the rows in the cache is smaller than the number of the blocks in the main memory

• Three methods exist:
– Direct
– Associative
– Set-associative

Page 19: Computer Architecture

Cache memory with direct mapping

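[Figure: cache memory with direct mapping. The (s+w)-bit memory address is divided into a flag field (s-r bits), a row field (r bits) and a word field (w bits). The row field selects one cache row (L0 ... Li), and the flag stored in that row is compared with the flag field of the address. A match is a hit and the word is read from the cache; otherwise (miss) the block is read from main memory (block B0 is shown with words W0 to W3).]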

Page 20: Computer Architecture

Direct mapping (cont.)

• i – number of the row in the cache memory
• j – number of the block in the main memory
• m – number of rows in the cache memory

i = j mod m

• Address length: s+w bits
• Number of addressable units: 2^(s+w) words
• Block size = row size: 2^w words
• Number of blocks in the main memory: 2^s
• Number of rows in the cache memory: m = 2^r

Page 21: Computer Architecture

Result of the direct mapping

Row in the cache memory | Assigned blocks in the main memory
0 | 0, m, 2m, ..., 2^s - m
1 | 1, m+1, 2m+1, ..., 2^s - m + 1
... | ...
m-1 | m-1, 2m-1, 3m-1, ..., 2^s - 1
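The assignment of blocks to rows follows directly from i = j mod m. The sketch below lists the blocks for each row of a deliberately tiny configuration (s = 4, r = 2, so 16 blocks and m = 4 rows); these values and the function name are assumptions chosen to keep the output short.

```python
def blocks_for_row(i, s, r):
    """Main memory blocks j (0 <= j < 2**s) that map to cache row i, using i = j mod m."""
    m = 2 ** r
    return [j for j in range(2 ** s) if j % m == i]


s, r = 4, 2   # illustrative sizes only
for row in range(2 ** r):
    print(row, blocks_for_row(row, s, r))
```

Row 0 ends at block 2^s - m = 12 and row m-1 ends at block 2^s - 1 = 15, matching the last column of the table.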

Page 22: Computer Architecture

Example of the direct mapping

• For a cache memory with 2^14 rows (4 B each) and a main memory of 16 MB capacity:

Row in the cache memory | Assigned main memory blocks (starting addresses, hexadecimal)
0 | 000000, 010000, ..., FF0000
1 | 000004, 010004, ..., FF0004
... | ...
2^14 - 1 | 00FFFC, 01FFFC, ..., FFFFFC

Row width: 8 b flag, 32 b data
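The example corresponds to a 24-bit byte address (16 MB = 2^24 B) split into an 8-bit flag, a 14-bit row number and a 2-bit word field. The following sketch performs that split; the function and constant names are hypothetical.

```python
W_BITS = 2    # 4 B per row  -> 2 bits select the byte within the row
R_BITS = 14   # 2**14 rows   -> 14 bits select the row
# The remaining 24 - 14 - 2 = 8 bits form the flag.


def split_direct(address):
    """Split a 24-bit address into (flag, row, word) for direct mapping."""
    word = address & ((1 << W_BITS) - 1)
    row = (address >> W_BITS) & ((1 << R_BITS) - 1)
    flag = address >> (W_BITS + R_BITS)
    return flag, row, word


for addr in (0x000000, 0x010004, 0xFF0004, 0xFFFFFC):
    flag, row, word = split_direct(addr)
    print(f"{addr:06X}: flag={flag:02X} row={row} word={word}")
```

Addresses 010004 and FF0004 both land in row 1 and differ only in the flag, which is why the flag has to be stored and compared.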

Page 23: Computer Architecture

Cache memory of associative mapping

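[Figure: cache memory with associative mapping. The (s+w)-bit memory address is divided into a flag field (s bits) and a word field (w bits). The flag field is compared with the flags of all cache rows (L0 ... Li). A match is a hit and the word is read from the cache; otherwise (miss) the block is read from main memory (block B0 is shown with words W0 to W3).]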

Page 24: Computer Architecture

Associative mapping (cont.)

• Address length: s+w bits
• Number of addressable units: 2^(s+w) words
• Block size = row size: 2^w words
• Number of the main memory blocks: 2^s
• Number of rows in the cache memory: any (not determined by the address format)
• Flag size: s bits

Page 25: Computer Architecture

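Example of the associative mapping

• Address format: flag (22 b), word (2 b)
• Cache row format: flag (22 b), data (32 b)
• Main memory addresses shown (hexadecimal): 000000, 000004, 12357A, FFFFF4, FFFFF8, FFFFFC
• Data words shown in main memory: 35281987, F235A72C, 3982FB1A

Cache contents, in the order listed on the slide:

Flag | Data
000000 | 35281987
3FFFFF | 3982FB1A
048D5E | F235A72C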
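With associative mapping the flag is simply the address without its 2-bit word field, so the flags in the example can be recomputed from the addresses. The sketch below does this and uses a dictionary as a software stand-in for the parallel flag comparison; the function names are hypothetical, and the flag-to-data pairing is read off the order of the slide's cache columns, which is an assumption.

```python
W_BITS = 2  # width of the word field in the example


def split_associative(address):
    """Split a 24-bit address into (flag, word) for associative mapping."""
    return address >> W_BITS, address & ((1 << W_BITS) - 1)


# flag -> data; pairing assumed from the order in which the slide lists them
cache = {0x000000: 0x35281987, 0x3FFFFF: 0x3982FB1A, 0x048D5E: 0xF235A72C}


def lookup(address):
    """Return the cached data word, or None on a miss."""
    flag, _ = split_associative(address)
    return cache.get(flag)


for addr in (0x000000, 0x12357A, 0xFFFFFC):
    flag, word = split_associative(addr)
    print(f"{addr:06X} -> flag {flag:06X}, word {word}")
```

The computed flags 000000, 048D5E and 3FFFFF are the three flags stored in the example cache.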

Page 26: Computer Architecture

Cache memory with set-associative mapping

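[Figure: cache memory with set-associative mapping. The (s+w)-bit memory address is divided into a flag field (s-d bits), a section field (d bits) and a word field (w bits). The section field selects one section (S0 ... Si), and the flag field is compared with the flags of the rows in that section. A match is a hit; otherwise (miss) the block is read from main memory (words W0 to W3 are shown).]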

Page 27: Computer Architecture

Set-associative mapping (cont.)

• i – number of the section in the cache memory
• j – number of the block in the main memory
• m – number of rows in the cache memory
• v – number of sections
• k – number of rows in each section

m = v × k
i = j mod v

• Address length: s+w bits
• Number of addressable units: 2^(s+w) words
• Block size = row size: 2^w words
• Number of blocks in the main memory: 2^s

Page 28: Computer Architecture

Set-associative mapping (cont.)

Number of rows in a section: k

Number of sections: v = 2^d

Number of rows in the cache memory: kv = k × 2^d

Flag size: (s-d) bits

Page 29: Computer Architecture

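Example of the set-associative mapping

• Address format: flag (9 b), section (13 b), word (2 b)
• Cache row format: flag (9 b), data (32 b), shown for two rows per section
• Section labels shown (hexadecimal): 0000, 0004, 7FFC
• Flags shown in main memory: 000, 01A, 1FF
• Data words shown in main memory: 35281987, F235A72C, 67321342, 3982FB1A
• Cache flags shown: 000, 01A, 01A
• Cache data shown: 35281987, 67321342, F235A72C (the last one in the row with flag 01A)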
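The field widths of this example (9-bit flag, 13-bit section, 2-bit word) again add up to a 24-bit address. A minimal sketch of the split follows; the names are hypothetical, and the test addresses were chosen to reproduce flags that appear on the slide.

```python
W_BITS = 2    # word field
D_BITS = 13   # section field: 2**13 sections
# The remaining 24 - 13 - 2 = 9 bits form the flag.


def split_set_associative(address):
    """Split a 24-bit address into (flag, section, word)."""
    word = address & ((1 << W_BITS) - 1)
    section = (address >> W_BITS) & ((1 << D_BITS) - 1)
    flag = address >> (W_BITS + D_BITS)
    return flag, section, word


for addr in (0x000000, 0x0D0004, 0xFFFFFC):
    flag, section, word = split_set_associative(addr)
    # The low 15 bits (section and word together) match the slide's labels 0000, 0004, 7FFC.
    print(f"{addr:06X}: flag={flag:03X} section={section} low15={addr & 0x7FFF:04X}")
```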

Page 30: Computer Architecture

Algorithms of the cache memory content replacement

• Least recently used (LRU)

• First in - first out (FIFO)

• Least frequently used (LFU)

• Random choice
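As an illustration of one of these policies, LRU for a single section of a set-associative cache can be sketched with an ordered dictionary. The class name and the `load` callback (a stand-in for fetching a block from main memory) are hypothetical; real caches implement LRU in hardware, typically with a few status bits per row.

```python
from collections import OrderedDict


class LRUSet:
    """One cache section with k rows and least-recently-used replacement."""

    def __init__(self, k):
        self.k = k
        self.rows = OrderedDict()   # flag -> data, least recently used first

    def access(self, flag, load):
        """Return (data, hit); on a miss, call load(flag) and evict if full."""
        if flag in self.rows:
            self.rows.move_to_end(flag)       # mark as most recently used
            return self.rows[flag], True
        if len(self.rows) >= self.k:
            self.rows.popitem(last=False)     # evict the least recently used row
        self.rows[flag] = load(flag)
        return self.rows[flag], False


section = LRUSet(k=2)
fetch = lambda flag: flag * 10                # pretend block contents
for flag in (1, 2, 1, 3):
    print(flag, section.access(flag, fetch))
print(list(section.rows))                     # flag 2 was evicted, not flag 1
```

FIFO would differ only in skipping the `move_to_end` call, so that a hit does not protect a row from eviction.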

Page 31: Computer Architecture

Algorithms of writing into the cache memory

• write through

• write back

• Keeping caches consistent (multiprocessor system with cache)
– Bus watching with write through
– Hardware transparency
– Non-cacheable memory (memory not mapped by the cache memory)
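The difference between write through and write back can be shown with a toy model in which a dictionary stands in for main memory. The class and method names are hypothetical; the sketch tracks single addresses rather than whole blocks to stay short.

```python
class WriteCache:
    """Toy cache that supports write through or write back."""

    def __init__(self, memory, write_back):
        self.memory = memory        # dict: address -> value (stand-in for main memory)
        self.write_back = write_back
        self.data = {}              # cached address -> value
        self.dirty = set()          # addresses modified but not yet written to memory

    def write(self, address, value):
        self.data[address] = value
        if self.write_back:
            self.dirty.add(address)           # defer the memory update
        else:
            self.memory[address] = value      # write through: update memory now

    def evict(self, address):
        if address in self.dirty:             # write back only if modified
            self.memory[address] = self.data[address]
            self.dirty.discard(address)
        self.data.pop(address, None)


mem_wt, mem_wb = {0: 0}, {0: 0}
wt = WriteCache(mem_wt, write_back=False)
wb = WriteCache(mem_wb, write_back=True)
wt.write(0, 7)
wb.write(0, 7)
print(mem_wt[0], mem_wb[0])   # memory is current only under write through
wb.evict(0)
print(mem_wb[0])              # write back updates memory at eviction time
```

Until the eviction, main memory under write back holds a stale value, which is the reason multiprocessor systems need a consistency mechanism.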

Page 32: Computer Architecture

Other problems

• Row size and block size

• Number of the cache memories
– The higher-level cache is integrated on one chip with the processor and runs at the processor's frequency
– The lower-level cache runs at the bus frequency (it is located on the mainboard)

Page 33: Computer Architecture

Pentium 4 cache memory

Page 34: Computer Architecture

Pentium 4 processor core

• Instruction fetching/decoding unit
– Fetches instructions from L2 cache memory
– Decodes them into microoperations
– Transfers microoperations to L1 cache memory

• Non-sequential (out-of-order) instruction execution unit
– Queues microoperations

• Execution units
– Execute microoperations
– Fetch data from the L1 cache
– Write results into the registers

• Memory subsystem
– Communicates with the system bus and L2 cache memory

Page 35: Computer Architecture

PowerPC cache memory

Processor | Size | Bytes per row | Organization
PowerPC 601 | 1 x 32 KB | 32 | 8-way
PowerPC 603 | 2 x 8 KB | 32 | 2-way
PowerPC 604 | 2 x 16 KB | 32 | 4-way
PowerPC 620 | 2 x 32 KB | 64 | 8-way
PowerPC G3 | 2 x 32 KB | 64 | 8-way
PowerPC G4 | 2 x 32 KB | 32 | 8-way
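From the table one can derive the number of sections of each cache as size / (bytes per row × ways). The sketch below does this arithmetic with the table's figures, reading "2 x" as two separate caches of the stated size (an assumption); the function name is hypothetical.

```python
KB = 1024


def number_of_sections(cache_bytes, bytes_per_row, ways):
    """Sections (sets) in one cache: rows = size / row size, sections = rows / ways."""
    rows = cache_bytes // bytes_per_row
    return rows // ways


# (size of one cache, bytes per row, ways), copied from the table
table = {
    "PowerPC 601": (32 * KB, 32, 8),
    "PowerPC 603": (8 * KB, 32, 2),
    "PowerPC 604": (16 * KB, 32, 4),
    "PowerPC 620": (32 * KB, 64, 8),
    "PowerPC G3": (32 * KB, 64, 8),
    "PowerPC G4": (32 * KB, 32, 8),
}

for name, (size, row_bytes, ways) in table.items():
    print(name, number_of_sections(size, row_bytes, ways))
```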

Page 36: Computer Architecture

PowerPC cache memory (cont.)

