Directory-Based Cache Coherence

Mengjia Yan
Computer Science & Artificial Intelligence Lab
M.I.T.

Based on slides from Daniel Sanchez

MIT 6.823 Spring 2020
April 9, 2020 (L15-1)

Valid/Invalid Example (slides L14-2 to L14-6, April 7, 2020)

1. Core 0: LD 0xA
2. Core 1: LD 0xA, which issues BusRd 0xA

Cache contents after both loads:
Cache 0: Tag 0xA, State V, Data 2
Cache 1: Tag 0xA, State V, Data 2

Maintaining Cache Coherence

It is sufficient to have hardware such that
• only one processor at a time has write permission for a location
• no processor can load a stale copy of the location after a write

⇒ A correct approach could be (e.g., MSI):

write request: The address is invalidated in all other caches before the write is performed

read request: If a dirty copy is found in some cache, a write-back is performed before the memory is read
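The two rules above can be sketched as a toy model. This is only an illustration under stated assumptions, not a protocol from the slides: `ToyCoherentMemory` and its methods are hypothetical names, each location is a single word, uninitialized memory reads as 0, and every operation is atomic.

```python
class ToyCoherentMemory:
    """Toy model (hypothetical): per-core caches over one shared memory."""

    def __init__(self, num_cores):
        self.memory = {}
        # caches[core][addr] = [value, dirty]
        self.caches = [dict() for _ in range(num_cores)]

    def write(self, core, addr, value):
        # Write request: invalidate the address in all other caches first.
        for other, cache in enumerate(self.caches):
            if other != core:
                cache.pop(addr, None)
        # Only now perform the write; the writer holds the single dirty copy.
        self.caches[core][addr] = [value, True]

    def read(self, core, addr):
        if addr in self.caches[core]:
            return self.caches[core][addr][0]
        # Read request: if some cache holds a dirty copy, write it back
        # before memory is read.
        for cache in self.caches:
            if addr in cache and cache[addr][1]:
                self.memory[addr] = cache[addr][0]
                cache[addr][1] = False
        value = self.memory.get(addr, 0)  # assumption: memory defaults to 0
        self.caches[core][addr] = [value, False]
        return value
```

Because a write removes every other copy before it completes, a later read on another core always misses and is served only after the dirty copy has been written back.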

Directory-Based Coherence (Censier and Feautrier, 1978)

Snoopy Protocols (Goodman 1983)

[Figure: four processors (P), each with a cache ($), attached to a shared bus with memory (Mem.)]

• Snoopy schemes broadcast requests over memory bus
• Difficult to scale to large numbers of processors
• Requires additional bandwidth to cache tags for snoop requests

Directory Protocols

[Figure: four processors (P), each with a cache ($), connected through an interconnect network to a directory (Dir.) and memory (Mem.)]

• Directory schemes send messages to only those caches that might have the line
• Can scale to large numbers of processors
• Requires extra directory storage to track possible sharers

An MSI Directory Protocol

• Cache states: Modified (M) / Shared (S) / Invalid (I)
• Directory states:
– Uncached (Un): No sharers
– Shared (Sh): One or more sharers with read permission (S)
– Exclusive (Ex): A single sharer with read & write permissions (M)
• Transient states not drawn for clarity; for now, assume no racing requests

[Figure: Core 0 through Core N, each with a cache (Cache 0 through Cache N) whose entries hold Tag, State, Data; main memory with a directory whose entries hold Tag, State, Sharers]
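One way to picture the cache and directory entries described above is as plain records. The sketch below is an assumption-laden illustration: the class and field names (`CacheLine`, `DirectoryEntry`, `is_consistent`) are hypothetical, and the sharer list is modeled as a Python set rather than any particular hardware encoding.

```python
from dataclasses import dataclass, field
from enum import Enum


class CacheState(Enum):
    M = "Modified"
    S = "Shared"
    I = "Invalid"


class DirState(Enum):
    UN = "Uncached"
    SH = "Shared"
    EX = "Exclusive"


@dataclass
class CacheLine:
    """One cache entry: Tag, State, Data."""
    tag: int
    state: CacheState = CacheState.I
    data: int = 0


@dataclass
class DirectoryEntry:
    """One directory entry: Tag, State, Sharers."""
    tag: int
    state: DirState = DirState.UN
    sharers: set = field(default_factory=set)

    def is_consistent(self):
        """Check the sharer count against the directory state definitions."""
        if self.state is DirState.UN:
            return len(self.sharers) == 0   # no sharers
        if self.state is DirState.SH:
            return len(self.sharers) >= 1   # one or more sharers
        return len(self.sharers) == 1       # Ex: a single sharer
```

For example, `DirectoryEntry(0xA, DirState.SH, {0, 2})` describes a line shared by caches 0 and 2.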

MSI Protocol: Caches (1/3)

Transitions initiated by processor accesses:

I -> S: PrRd / ShReq
I -> M: PrWr / ExReq
S -> S: PrRd / --
S -> M: PrWr / ExReq
M -> M: PrRd / --, PrWr / --

Actions: Processor Read (PrRd), Processor Write (PrWr), Shared Request (ShReq), Exclusive Request (ExReq)
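The processor-initiated transitions can be written as a lookup table. This is a minimal sketch, assuming the usual MSI reading of the state diagram (reads from I fetch a shared copy, writes from I or S request exclusive access, and hits send nothing); `PROCESSOR_TRANSITIONS` and `on_processor_access` are hypothetical names, and transient states are ignored.

```python
# (current state, processor event) -> (next state, request sent to the directory)
# None stands for "--" on the slide: the access hits and no message is sent.
PROCESSOR_TRANSITIONS = {
    ("I", "PrRd"): ("S", "ShReq"),
    ("I", "PrWr"): ("M", "ExReq"),
    ("S", "PrRd"): ("S", None),
    ("S", "PrWr"): ("M", "ExReq"),
    ("M", "PrRd"): ("M", None),
    ("M", "PrWr"): ("M", None),
}


def on_processor_access(state, event):
    """Return (next_state, request) for a processor read or write."""
    return PROCESSOR_TRANSITIONS[(state, event)]
```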

MSI Protocol: Caches (2/3)

Transitions initiated by directory requests:

M -> I: InvReq / InvResp (with data)
M -> S: DownReq / DownResp (with data)
S -> I: InvReq / InvResp (without data)

Actions: Invalidation Request (InvReq), Downgrade Request (DownReq), Invalidation Response (InvResp), Downgrade Response (DownResp)
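A cache's reaction to directory requests can be sketched the same way. The names below (`DIRECTORY_REQUEST_TRANSITIONS`, `on_directory_request`) are hypothetical, and the sketch assumes that only a Modified line carries data in its response, since a Shared line matches memory.

```python
# (current state, directory request) -> (next state, response, response carries data?)
DIRECTORY_REQUEST_TRANSITIONS = {
    ("M", "InvReq"): ("I", "InvResp", True),
    ("M", "DownReq"): ("S", "DownResp", True),
    ("S", "InvReq"): ("I", "InvResp", False),
}


def on_directory_request(state, request, data):
    """Return (next_state, (response, payload)); payload is None without data."""
    next_state, response, with_data = DIRECTORY_REQUEST_TRANSITIONS[(state, request)]
    payload = data if with_data else None
    return next_state, (response, payload)
```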

MSI Protocol: Caches (3/3)

Transitions initiated by evictions:

M -> I: Eviction / WbReq (with data)
S -> I: Eviction / WbReq (without data)

Actions: Writeback Request (WbReq)
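The eviction transitions reduce to a small function. This is a sketch with a hypothetical name (`on_eviction`); the handling of an Invalid line is an assumption, since the slide only covers evictions from M and S.

```python
def on_eviction(state, data):
    """Return (next_state, message) when a line is evicted from the cache."""
    if state == "M":
        # Dirty line: the writeback must carry the data.
        return "I", ("WbReq", data)
    if state == "S":
        # Clean line: notify the directory, but send no data.
        return "I", ("WbReq", None)
    # Assumption: an Invalid line holds nothing, so no message is needed.
    return "I", None
```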

MSI Protocol: Caches

[Figure: combined M / S / I state diagram showing the transitions initiated by processor accesses, by directory requests, and by evictions]

MSI Protocol: Directory (1/2)

Transitions initiated by data requests:

Un -> Sh: ShReq / Sharers = {P}; ShResp
Sh -> Sh: ShReq / Sharers = Sharers + {P}; ShResp
Un -> Ex: ExReq / Sharers = {P}; ExResp
Sh -> Ex: ExReq / Inv(Sharers - {P}); Sharers = {P}; ExResp
Ex -> Sh: ShReq / Down(Sharer); Sharers = Sharer + {P}; ShResp
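The directory's handling of data requests can be sketched as one function over (state, sharers). The function name and the message tuples are hypothetical, requests are assumed to be processed atomically (no racing requests), and an ExReq arriving in state Ex is left unimplemented because the slide does not draw that transition.

```python
def on_data_request(state, sharers, request, p):
    """Return (next_state, next_sharers, messages) for a ShReq/ExReq from cache p.

    messages is a list of (message, destination cache) in the order sent.
    """
    if request == "ShReq":
        if state == "Un":
            return "Sh", {p}, [("ShResp", p)]
        if state == "Sh":
            return "Sh", sharers | {p}, [("ShResp", p)]
        if state == "Ex":
            (owner,) = sharers  # Ex means exactly one sharer
            return "Sh", {owner, p}, [("DownReq", owner), ("ShResp", p)]
    if request == "ExReq":
        if state == "Un":
            return "Ex", {p}, [("ExResp", p)]
        if state == "Sh":
            # Invalidate every sharer except the requester.
            invs = [("InvReq", c) for c in sorted(sharers - {p})]
            return "Ex", {p}, invs + [("ExResp", p)]
    # Not shown on the slide (e.g. ExReq while in Ex), so not modeled here.
    raise NotImplementedError((state, request))
```

For instance, an ExReq from cache 1 while caches 0 and 2 share the line yields two invalidations followed by the exclusive response.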

MSI Protocol: Directory (2/2)

Transitions initiated by writeback requests:

Ex -> Un: WbReq / Sharers = {}; WbResp
Sh -> Sh: WbReq && |Sharers| > 1 / Sharers = Sharers - {P}; WbResp
Sh -> Un: WbReq && |Sharers| == 1 / Sharers = {}; WbResp
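The writeback transitions depend on both the directory state and the number of sharers. A minimal sketch follows, with a hypothetical function name and the assumption that the writeback always comes from a cache currently listed in Sharers.

```python
def on_writeback(state, sharers, p):
    """Return (next_state, next_sharers, messages) for a WbReq from cache p."""
    if state == "Ex":
        # The single owner gives up the line.
        return "Un", set(), [("WbResp", p)]
    if state == "Sh" and len(sharers) > 1:
        # Other sharers remain, so the line stays Shared.
        return "Sh", sharers - {p}, [("WbResp", p)]
    if state == "Sh" and len(sharers) == 1:
        # The last sharer leaves, so the line becomes Uncached.
        return "Un", set(), [("WbResp", p)]
    raise ValueError("WbReq is not expected in state " + state)
```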

MSI Directory Protocol Example

Core 0 executes LD 0xA (caches and directory start empty):

1. Core 0: LD 0xA; Cache 0 allocates 0xA in transient state I->S
2. Cache 0 to Directory: ShReq 0xA; the directory entry becomes 0xA, Sh, {0}
3. Directory reads main memory: Mem[0xA] = 3
4. Directory to Cache 0: ShResp 0xA, data=3

Resulting state:
Cache 0: 0xA, S, 3
Directory: 0xA, Sh, {0}

MSI Directory Protocol Example

Core 2 executes LD 0xA (Cache 0 holds 0xA, S, 3; Directory holds 0xA, Sh, {0}):

1. Core 2: LD 0xA; Cache 2 allocates 0xA in transient state I->S
2. Cache 2 to Directory: ShReq 0xA; the directory entry becomes 0xA, Sh, {0,2}
3. Directory reads main memory: Mem[0xA] = 3
4. Directory to Cache 2: ShResp 0xA, data=3

Resulting state:
Cache 0: 0xA, S, 3
Cache 2: 0xA, S, 3
Directory: 0xA, Sh, {0,2}

Page 66: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

MSI Directory Protocol Example

L15-66April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA S 3

DirectoryTag State Sharers

0xA Sh {0,2}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

0xA S 3

ST 0xA1

Tag State Data

0xA I->M

Tag State Data

0xA I 3

2 ExReq 0xA

Tag State Sharers

0xA Ex {1}

Mem[0xA] = 35

3 InvReq 0xA3 InvReq 0xA

4 InvResp 0xA 4 InvResp 0xA

6 ExResp 0xAdata = 3

Tag State Data

0xA I 3

Tag State Data

0xA M 5
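The store walkthrough above can be sketched in a few lines of Python. This is only an illustrative model, not the course's reference implementation: the names `DirEntry` and `handle_ex_req`, the tuple layout `(state, data)` for cache lines, and the message log format are all assumptions made for this sketch. It covers only the Un/Sh to Ex path shown on the slide and does not model retrieving data from a previous exclusive owner.

```python
from dataclasses import dataclass, field


@dataclass
class DirEntry:
    state: str = "Un"                          # "Un", "Sh", or "Ex"
    sharers: set = field(default_factory=set)  # core ids holding the line


def handle_ex_req(directory, caches, memory, addr, requester):
    """Serve an ExReq: invalidate other sharers, then grant exclusive access.

    caches maps core id -> {addr: (state, data)} (a hypothetical layout).
    Returns the list of protocol messages exchanged, in order.
    """
    entry = directory.setdefault(addr, DirEntry())
    if entry.state == "Ex" and entry.sharers != {requester}:
        # Fetching the dirty copy from another owner is outside this sketch.
        raise NotImplementedError("owner writeback path is not modeled here")
    log = []
    for core in sorted(entry.sharers - {requester}):
        _, old_data = caches[core][addr]
        caches[core][addr] = ("I", old_data)   # sharer drops to Invalid
        log.append(f"InvReq {hex(addr)} -> cache {core}")
        log.append(f"InvResp {hex(addr)} <- cache {core}")
    entry.state, entry.sharers = "Ex", {requester}
    data = memory[addr]                        # directory reads memory
    caches[requester][addr] = ("M", data)      # requester now owns the line
    log.append(f"ExResp {hex(addr)}, data={data} -> cache {requester}")
    return log
```

Starting from the slide's initial state (Caches 0 and 2 in S with data 3, directory Sh {0,2}), calling `handle_ex_req(directory, caches, memory, 0xA, requester=1)` leaves the directory at Ex {1}; the store itself then updates Cache 1's copy locally.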

MSI Directory Protocol Example (continued)

Initial state:
  Cache 0:   0xA  I  3
  Cache 1:   0xA  M  5
  Cache 2:   0xA  I  3
  Directory: 0xA  Ex  {1}

Steps:
  1. Core 1 issues ST 0xB; Cache 1 must first evict 0xA (M->I)
  2. Cache 1 sends WbReq 0xA, data=5 to the directory
  3. Directory writes Mem[0xA] = 5; directory entry becomes 0xA Un {}
  4. Directory sends WbResp 0xA to Cache 1; Cache 1 allocates 0xB in transient state I->M
  5. Cache 1 sends ExReq 0xB; directory entry becomes 0xB Ex {1}
  6. Directory reads Mem[0xB] = 10
  7. Directory sends ExResp 0xB, data=10; Cache 1 holds 0xB M 10

Why are 0xA’s wb and 0xB’s req serialized? Structural dependence
Possible solutions? Buffer outside of cache to hold write data
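The directory side of the writeback in this example is small enough to sketch directly. The function name `handle_wb_req` and the `(state, sharers)` tuple layout are assumptions for illustration only; the sketch simply mirrors the slide's steps (update memory, mark the line uncached, acknowledge).

```python
def handle_wb_req(directory, memory, addr, data, owner):
    """Serve a WbReq from the exclusive owner of a line.

    directory maps addr -> (state, sharers); this layout is hypothetical.
    Returns the acknowledgement message sent back to the owner.
    """
    state, sharers = directory[addr]
    if state != "Ex" or sharers != {owner}:
        raise ValueError("WbReq from a cache that is not the exclusive owner")
    memory[addr] = data               # dirty data goes back to memory
    directory[addr] = ("Un", set())   # line is no longer cached anywhere
    return f"WbResp {hex(addr)}"
```

In the slide's scenario the cache cannot issue ExReq 0xB until this acknowledgement arrives, because the evicted line still occupies the only place that can hold its data; that is the structural dependence the question refers to.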

Miss Status Handling Register

MSHR – Holds load misses and writes outside of cache

MSHR entry fields: V, X, Addr, Data
Per ld/st slots (several per MSHR entry), each with fields: V, Inum, Block Offset, L/S

• On eviction/writeback
– No free MSHR entry: stall
– Allocate new MSHR entry
– When channel available send WBReq and data
– Deallocate entry on WBResp

• On cache load miss
– Look for matching address in MSHR
  • If not found
    – If no free MSHR entry: stall
    – Allocate new MSHR entry and fill in
  • If found, just fill in per ld/st slot
– Send ShReq (or ExReq)
– On *Resp forward data to CPU and cache
– Deallocate MSHR

Per ld/st slots allow servicing multiple requests with one entry
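A minimal sketch of the load-miss path with per ld/st slots follows. The class name `MSHRFile`, its method names, and the string return values are assumptions for illustration. One behavior is not stated on the slides and is assumed here: when an entry's ld/st slots are all in use, the new access stalls.

```python
class MSHRFile:
    """Toy MSHR file: one entry per outstanding block, several slots per entry."""

    def __init__(self, num_entries, slots_per_entry):
        self.num_entries = num_entries
        self.slots_per_entry = slots_per_entry
        self.entries = {}  # block address -> list of (inum, offset, is_load)

    def on_miss(self, block_addr, inum, offset, is_load):
        """Return 'allocated' (send ShReq/ExReq), 'merged', or 'stall'."""
        slots = self.entries.get(block_addr)
        if slots is not None:
            # Matching address found: just fill in another ld/st slot.
            if len(slots) >= self.slots_per_entry:
                return "stall"  # assumption: no free slot means stall
            slots.append((inum, offset, is_load))
            return "merged"
        if len(self.entries) >= self.num_entries:
            return "stall"      # no free MSHR entry
        self.entries[block_addr] = [(inum, offset, is_load)]
        return "allocated"

    def on_response(self, block_addr):
        """On *Resp: deallocate the entry and return the waiting ld/st slots."""
        return self.entries.pop(block_addr)
```

Only the `'allocated'` outcome generates network traffic; merged accesses are all satisfied by the single response, which is the point of the per ld/st slots.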

Directory Organization

• Requirement: Directory needs to keep track of all the cores that are sharing a cache block

• Challenge: For each block, the space needed to hold the list of sharers grows with number of possible sharers…

Flat, Memory-based Directories

• Dedicate a few bits of main memory to store the state and sharers of every line

• Encode sharers using a bit-vector

Example: each 64-byte line in main memory carries a 10-bit directory entry: State = Sh, Sharer Set = 0 1 0 0 1 1 0 0

✓ Simple
✗ Slow
✗ Very inefficient with many processors (~P bits/line)
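To make the "~P bits/line" cost concrete, the storage overhead of a full bit-vector can be computed directly. The function name is hypothetical, and the 2 state bits are an assumption inferred from the slide's example (an 8-bit sharer vector in a 10-bit entry).

```python
def flat_directory_overhead(num_cores, line_bytes=64, state_bits=2):
    """Return (directory bits per line, overhead as a fraction of line size).

    Assumes one sharer bit per core plus state_bits of protocol state.
    """
    dir_bits = num_cores + state_bits
    return dir_bits, dir_bits / (line_bytes * 8)
```

With 8 cores this gives 10 bits per 512-bit line, roughly 2% overhead; with 510 cores the directory entry is as large as the data line it describes.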

Sparse Full-Map Directories

• Not every line in the system needs to be tracked – only those in private caches!

• Idea: Organize directory as a cache

Directory is set-associative (Way 1, Way 2, Way 3, Way 4)
Directory Entry Format: Line Address = 0xF00, State = Sh, Sharer Set = 0 1 0 0 1 1 0 0

✓ Low latency, energy-efficient
✗ Bit-vectors grow with # cores → Area scales poorly
✗ Limited associativity → Directory-induced invalidations

Directory-Induced Invalidations

• To retain inclusion, must invalidate all sharers of an entry before reusing it for another address
• Example: 2-way set-associative sparse directory

Initial state:
  Cache 0:   0xA  S  3
  Cache 1:   0xF  M  1
  Cache 2:   (empty)
  Directory: 0xA  Sh  {0}
             0xF  Ex  {1}

Steps:
  1. Core 2 issues LD 0xB; Cache 2 allocates 0xB in transient state I->S
  2. Cache 2 sends ShReq 0xB to the directory; both directory ways are occupied
  3. Directory sends InvReq 0xA to Cache 0; Cache 0 sets 0xA to I
  4. Cache 0 replies with InvResp 0xA; the directory entry for 0xA is reused as 0xB Sh {2}
  5. Directory reads Mem[0xB] = 5
  6. Directory sends ShResp 0xB, data=5 to Cache 2

Final state:
  Cache 0:   0xA  I  3
  Cache 1:   0xF  M  1
  Cache 2:   0xB  S  5
  Directory: 0xB  Sh  {2}
             0xF  Ex  {1}

How many entries should the directory have?
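The directory-induced invalidation in this example can be modeled with one set of a sparse directory. The class and method names are hypothetical, and the victim choice (oldest entry first) is an assumption; the slides do not specify a replacement policy. A victim in the Ex state would also require retrieving dirty data, which this sketch leaves out.

```python
class SparseDirectorySet:
    """One set of a set-associative sparse directory (illustrative only)."""

    def __init__(self, ways):
        self.ways = ways
        self.entries = {}  # line address -> (state, sharers), insertion-ordered

    def allocate(self, addr, state, sharers):
        """Install an entry; return the (victim_addr, core) InvReqs needed first."""
        inv_reqs = []
        if addr not in self.entries and len(self.entries) >= self.ways:
            victim = next(iter(self.entries))  # oldest entry (assumed policy)
            _, victim_sharers = self.entries.pop(victim)
            # Inclusion: every sharer of the victim must drop its copy.
            inv_reqs = [(victim, core) for core in sorted(victim_sharers)]
        self.entries[addr] = (state, set(sharers))
        return inv_reqs
```

Replaying the slide's sequence on a 2-way set (install 0xA, then 0xF, then 0xB) yields a single InvReq for 0xA to core 0, even though no core asked to write 0xA.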

Inexact Representations of Sharer Sets

• Coarse-grain bit-vectors (e.g., 1 bit per 4 cores)
  Sharer Set: one bit per group of cores: 0-3: 0, 4-7: 0, 8-11: 0, 12-15: 0, 16-19: 0, 20-23: 0

• Limited pointers: Maintain a few sharer pointers, on overflow mark ‘all’ and broadcast (or invalidate another sharer)
  Sharer Set: all = 0, sharer 1 = 8, sharer 2 = 14, sharer 3 = 33

• Allow false positives (e.g., Bloom filters)

✓ Reduced area & energy
✗ Overheads still not scalable (these techniques simply play with constant factors)
✗ Inexact sharers → Broadcasts, invalidations or spurious invalidations and downgrades
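Two of these inexact encodings are easy to sketch. All function names below are hypothetical, and the overflow policy shown for limited pointers is the "mark all and broadcast" option from the bullet above (the alternative, invalidating an existing sharer, is not modeled).

```python
def coarse_encode(sharers, cores_per_bit=4):
    """Pack a set of core ids into a coarse bit-vector (bit i covers one group)."""
    bits = 0
    for core in sharers:
        bits |= 1 << (core // cores_per_bit)
    return bits


def coarse_decode(bits, num_cores, cores_per_bit=4):
    """Return every core that *might* be a sharer; this is a superset."""
    return {c for c in range(num_cores) if (bits >> (c // cores_per_bit)) & 1}


def limited_pointer_add(pointers, all_flag, core, max_pointers=3):
    """Add a sharer to a limited-pointer entry; returns (pointers, all_flag)."""
    if all_flag or core in pointers:
        return pointers, all_flag
    if len(pointers) < max_pointers:
        return pointers + [core], False
    return [], True  # overflow: give up precision and broadcast from now on
```

For example, encoding sharers {5, 9} sets the bits for groups 4-7 and 8-11, so a later invalidation must be sent to all eight cores in those groups, six of which never held the line. That is the source of the spurious invalidations listed above.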

Protocol Races

• Directory serializes multiple requests for the same address
– Same-address requests are queued or NACKed and retried

• But races still exist due to conflicting requests
• Example: Upgrade race

Initial state:
  Cache 0:   0xA  S  3
  Cache 1:   0xA  S  3
  Directory: 0xA  Sh  {0,1}
  ReqQ:      (empty)

Steps:
  1, 1’. Cores 0 and 1 each issue ST 0xA; both caches move 0xA to transient state S->M
  2, 2’. Caches 0 and 1 issue simultaneous ExReqs
         Directory starts serving cache 0’s ExReq, queues cache 1’s (ReqQ: 1, ExReq 0xA)
  3. Directory sends InvReq 0xA to Cache 1
     Cache 1 expected ExResp, but got InvReq!
     Cache 1 should transition from S->M to I->M and send InvResp
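The fix described for the upgrade race amounts to one extra row in the cache controller's transition table. The sketch below is an assumption-laden illustration: the function name and table layout are hypothetical, and only the two states relevant to this example are listed.

```python
def on_inv_req(state):
    """Return (new_state, reply) for a cache line that receives an InvReq."""
    transitions = {
        "S":    ("I",    "InvResp"),  # ordinary invalidation of a shared copy
        "S->M": ("I->M", "InvResp"),  # lost the upgrade race: the copy is gone,
                                      # but the pending ExReq is still queued
    }
    if state not in transitions:
        raise ValueError(f"InvReq not handled in state {state!r} in this sketch")
    return transitions[state]
```

After the transition to I->M, the cache simply waits: its queued ExReq will eventually be served and the ExResp will now carry fresh data rather than relying on the stale shared copy.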

Extra Hops and 3-Hop Protocols

Reducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

Initial state:
  Cache 0:   0xA  M  3
  Cache 1:   (empty)
  Cache 2:   (empty)
  Directory: 0xA  Ex  {0}

Steps:
  1. Core 2 issues ST 0xA; Cache 2 allocates 0xA in transient state I->M
  2. Cache 2 sends ExReq 0xA to the directory; directory entry becomes 0xA Ex->Ex {2}

Page 118: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

Extra Hops and 3-Hop ProtocolsReducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

L15-118April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA M 3

DirectoryTag State Sharers

0xA Ex {0}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

ST 0xA1

Tag State Data

0xA I->M

2 ExReq 0xA

Tag State Sharers

0xA Ex->Ex {2}

3 ExFwd 0xA, req=2

Page 119: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

Extra Hops and 3-Hop ProtocolsReducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

L15-119April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA M 3

DirectoryTag State Sharers

0xA Ex {0}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

ST 0xA1

Tag State Data

0xA I->M

Tag State Data

0xA I 3

2 ExReq 0xA

Tag State Sharers

0xA Ex->Ex {2}

3 ExFwd 0xA, req=2

Page 120: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

Extra Hops and 3-Hop ProtocolsReducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

L15-120April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA M 3

DirectoryTag State Sharers

0xA Ex {0}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

ST 0xA1

Tag State Data

0xA I->M

Tag State Data

0xA I 3

2 ExReq 0xA

Tag State Sharers

0xA Ex->Ex {2}

3 ExFwd 0xA, req=2

3 ExResp 0xA,data=3

Page 121: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

Extra Hops and 3-Hop ProtocolsReducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

L15-121April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA M 3

DirectoryTag State Sharers

0xA Ex {0}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

ST 0xA1

Tag State Data

0xA I->M

Tag State Data

0xA I 3

2 ExReq 0xA

Tag State Sharers

0xA Ex->Ex {2}

3 ExFwd 0xA, req=2

Tag State Data

0xA M 3

3 ExResp 0xA,data=3

Page 122: Directory-Based Cache Coherencecsg.csail.mit.edu/6.823/Lectures/L15split.pdf · Directory-Based Coherence (Censierand Feautrier, 1978) April 9, 2020 L15-14 •Snoopy schemes broadcast

MIT 6.823 Spring 2020

Extra Hops and 3-Hop ProtocolsReducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

L15-122April 9, 2020

Core 0

Main Memory

Cache 0Tag State Data

0xA M 3

DirectoryTag State Sharers

0xA Ex {0}

Core 1

Cache 1Tag State Data

Core 2

Cache 2Tag State Data

ST 0xA1

Tag State Data

0xA I->M

Tag State Data

0xA I 3

2 ExReq 0xA

Tag State Sharers

0xA Ex->Ex {2}

3 ExFwd 0xA, req=2

Tag State Data

0xA M 3

3 ExResp 0xA,data=3

ExAck 0xA4

Extra Hops and 3-Hop Protocols: Reducing Protocol Latency

• Problem: Data in another cache needs to pass through the directory, adding latency

• Optimization: Forward data to requester directly

Example: core 2 stores to 0xA while cache 0 holds the line in M

Initial state: Cache 0: 0xA M 3; Directory: 0xA Ex {0}; Caches 1 and 2: empty

1 ST 0xA (core 2); Cache 2: 0xA I->M

2 ExReq 0xA (cache 2 to directory); Directory: 0xA Ex->Ex {2}

3 ExFwd 0xA, req=2 (directory to cache 0); Cache 0: 0xA I 3

3 ExResp 0xA, data=3 (cache 0 to cache 2); Cache 2: 0xA M 3

4 ExAck 0xA (to directory); Directory: 0xA Ex {2}
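
The forwarding example can be replayed as a message trace. This is a minimal Python sketch: the dictionary layout and the function name `three_hop_store` are hypothetical, and the sender of ExAck (here the requester) is an assumption, since the slide only shows that the directory receives it.

```python
def three_hop_store(caches, directory, requester, addr):
    """Replay a store miss to `addr` whose only valid copy is in another cache.

    Returns the list of (sender, receiver, message) tuples in order.
    """
    owner = next(iter(directory[addr]["sharers"]))
    trace = []

    # 1-2: requester misses and asks the directory for exclusive access.
    caches[requester][addr] = {"state": "I->M", "data": None}
    trace.append((f"cache{requester}", "dir", "ExReq"))
    directory[addr] = {"state": "Ex->Ex", "sharers": {requester}}

    # 3: the directory forwards the request to the current owner ...
    trace.append(("dir", f"cache{owner}", f"ExFwd req={requester}"))
    data = caches[owner][addr]["data"]
    caches[owner][addr] = {"state": "I", "data": None}

    # ... and the owner sends the data straight to the requester.
    trace.append((f"cache{owner}", f"cache{requester}", f"ExResp data={data}"))
    caches[requester][addr] = {"state": "M", "data": data}

    # 4: the directory leaves its transient state once it sees ExAck.
    # ASSUMPTION: the requester sends ExAck; the slide does not say who does.
    trace.append((f"cache{requester}", "dir", "ExAck"))
    directory[addr]["state"] = "Ex"
    return trace

caches = {0: {0xA: {"state": "M", "data": 3}}, 1: {}, 2: {}}
directory = {0xA: {"state": "Ex", "sharers": {0}}}
for sender, receiver, msg in three_hop_store(caches, directory, 2, 0xA):
    print(f"{sender} -> {receiver}: {msg}")
```

The data reaches cache 2 after three messages (ExReq, ExFwd, ExResp); the ExAck is off the requester's critical path.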

In-Cache Directories

• Common multicore memory hierarchy:
  – 1+ levels of private caches
  – A shared last-level cache
  – Need to enforce coherence among private caches

• Idea: Embed the directory information in shared cache tags
  – Shared cache must be inclusive
  – Need extended directory if non-inclusive

✓ Avoids tag overheads & separate lookups

✗ Can be inefficient if shared cache size >> sum(private cache sizes)

[Figure: Cores 0 to N, each with a private cache, above a shared cache backed by main memory]
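
Why the shared cache must be inclusive can be seen in a toy model where the sharer set lives in the shared-cache tag. This is a minimal Python sketch under stated assumptions: the class `SharedCache`, its FIFO replacement, and its fully associative layout are all hypothetical simplifications.

```python
class SharedCache:
    """Inclusive shared cache whose tags carry a sharer set (the in-cache directory)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}  # addr -> set of private-cache ids; dict order = insertion age

    def access(self, core, addr):
        """Record that `core` now caches `addr`.

        Returns the (core, addr) back-invalidations caused by an eviction, if any.
        """
        invalidations = []
        if addr not in self.lines:
            if len(self.lines) == self.num_lines:
                # Evicting the tag also discards its sharer set, so every
                # private copy must be invalidated to keep tracking exact.
                victim = next(iter(self.lines))  # oldest line (FIFO, for brevity)
                sharers = self.lines.pop(victim)
                invalidations = [(c, victim) for c in sorted(sharers)]
            self.lines[addr] = set()
        self.lines[addr].add(core)
        return invalidations

llc = SharedCache(num_lines=2)
llc.access(0, 0xA)
llc.access(1, 0xA)
llc.access(0, 0xB)
print(llc.access(1, 0xC))  # evicts 0xA and invalidates both private copies
```

Every shared-cache line pays for a sharer set even when no private cache holds it, which is the inefficiency noted above when the shared cache is much larger than the private caches combined.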

Coherence in Multi-Level Hierarchies

• Can use the same or different protocols to keep coherence across multiple levels

• Key invariant: Ensure sufficient permissions in all intermediate levels

• Example: 8-socket Xeon E7 (8 cores/socket)

[Figure: Chips 0 to 7, each with cores 0 to 7 (private L1I, L1D, and L2 per core) sharing an L3; the chips connect through an interconnect to main memory. Within a chip: MESI protocol, L3 in-cache directory. Across chips: MESIF protocol, snooping (QPI).]
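
One way to read the key invariant is that an outer level must hold at least the permission of every level inside it. The check below is a minimal Python sketch of that reading; the ranking `RANK` and the function name `sufficient_permissions` are assumptions for illustration, not part of the lecture.

```python
# ASSUMPTION: E and M both count as exclusive permission; S is read-only; I is none.
RANK = {"I": 0, "S": 1, "E": 2, "M": 2}

def sufficient_permissions(states_inner_to_outer):
    """Check one line's states listed from innermost to outermost level.

    Example input: ["M", "M", "M"] for L1, L2, L3 of one core's path.
    """
    ranks = [RANK[s] for s in states_inner_to_outer]
    return all(inner <= outer for inner, outer in zip(ranks, ranks[1:]))

print(sufficient_permissions(["M", "M", "M"]))  # L1 may write: outer levels are exclusive
print(sufficient_permissions(["M", "S", "S"]))  # violation: L1 writes under a shared L2
```

Under this reading, a store that hits in S at L1 has to obtain exclusive permission at each outer level before the L1 line may move to M.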

Avoiding Protocol Deadlock

• Protocols can cause deadlocks even if network is deadlock-free! (more on this later)

• Solution: Separate virtual networks
  – Different sets of virtual channels and endpoint buffers
  – Same physical routers and links

• Most protocols require at least 2 virtual networks (for requests and replies), often >2 needed

Example: Both nodes saturate all intermediate buffers with requests to each other, blocking responses from entering the network

[Figure: Node 0 and Node 1 with every buffer between them full of req messages; each node's resp is stuck waiting to enter the network]
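
The example can be reproduced with a toy two-node model. This is a minimal Python sketch under stated assumptions: one bounded FIFO per direction stands in for all intermediate buffers, a node serves a request only if its response can enter the network, and the function name `simulate` and its parameters are hypothetical.

```python
from collections import deque

def simulate(separate_vnets, capacity=2, pending=4):
    """Return True if every request gets its response, False on deadlock."""
    req_buf = [deque(), deque()]  # req_buf[n]: requests waiting at node n
    # With a single network, responses share the same bounded buffers.
    resp_buf = [deque(), deque()] if separate_vnets else req_buf
    to_send = [pending, pending]
    completed = [0, 0]

    while completed != [pending, pending]:
        progress = False
        for n in (0, 1):
            other = 1 - n
            # Inject as many new requests as the destination buffer accepts.
            while to_send[n] > 0 and len(req_buf[other]) < capacity:
                req_buf[other].append("req")
                to_send[n] -= 1
                progress = True
            if resp_buf[n] and resp_buf[n][0] == "resp":
                # Responses are always consumed by the endpoint.
                resp_buf[n].popleft()
                completed[n] += 1
                progress = True
            elif (req_buf[n] and req_buf[n][0] == "req"
                  and len(resp_buf[other]) < capacity):
                # Serve a request only if the response fits in the network.
                req_buf[n].popleft()
                resp_buf[other].append("resp")
                progress = True
        if not progress:
            return False  # nothing can move: protocol deadlock
    return True

print("shared network:", simulate(separate_vnets=False))
print("separate virtual networks:", simulate(separate_vnets=True))
```

With shared buffers, both FIFOs fill with requests and neither response can be injected; giving responses their own buffers lets them drain, which in turn frees the request buffers.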

Thank you!

Next Lecture: On-chip Networks

Load-reserve & Store-conditional

Special register(s) to hold reservation flag and address, and the outcome of store-conditional

Load-reserve R, (a):
    <flag, adr> ← <1, a>;
    R ← M[a];

Store-conditional (a), R:
    if <flag, adr> == <1, a>
    then cancel other procs’ reservation on a;
         M[a] ← <R>;
         status ← succeed;
    else status ← fail;

If the cache receives an invalidation to the address in the reserve register, the reserve bit is set to 0

• Several processors may reserve ‘a’ simultaneously

• These instructions are like ordinary loads and stores with respect to the bus traffic

Load-Reserve/Store-Conditional

Swap implemented with Ld-Reserve/St-Conditional

# Swap(R1, mutex):

L:  Ld-Reserve R2, (mutex)
    St-Conditional (mutex), R1
    if (status == fail) goto L
    R1 <- R2
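
The swap loop can be exercised against a toy model of the reservation semantics. This is a minimal Python sketch: the class `Memory` and its method names are hypothetical, it runs single-threaded, and clearing the storing processor's own reservation on success is an assumption beyond what the slides state.

```python
class Memory:
    """Toy shared memory with one reservation register per processor."""

    def __init__(self):
        self.mem = {}
        self.reservation = {}  # proc id -> reserved address (flag is 1 iff present)

    def load_reserve(self, proc, addr):
        self.reservation[proc] = addr  # <flag, adr> <- <1, a>
        return self.mem.get(addr, 0)

    def store_conditional(self, proc, addr, value):
        if self.reservation.get(proc) != addr:
            return False  # status <- fail
        # Success: cancel all reservations on addr.
        # ASSUMPTION: this includes the storing processor's own reservation.
        self.reservation = {p: a for p, a in self.reservation.items() if a != addr}
        self.mem[addr] = value
        return True  # status <- succeed

def swap(memory, proc, addr, new_value):
    """Atomically exchange `new_value` with memory[addr]; returns the old value."""
    while True:
        old = memory.load_reserve(proc, addr)
        if memory.store_conditional(proc, addr, new_value):
            return old

m = Memory()
m.mem["mutex"] = 0
print(swap(m, proc=0, addr="mutex", new_value=1))  # old value of the mutex
```

If two processors both reserve the mutex, only the first store-conditional succeeds; the second finds its reservation cancelled and retries from the load-reserve.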

Performance: Load-reserve & Store-conditional

The total number of coherence transactions is not necessarily reduced, but splitting an atomic instruction into load-reserve & store-conditional:

• increases utilization (and reduces processor stall time), especially in split-transaction buses and directories

• reduces cache ping-pong effect because processors trying to acquire a semaphore do not have to perform stores each time

