CSCS: A Concise Implementation of User-Level Distributed Shared Memory
Zhi Zhai, Feng Shen
Computer Science and Engineering, University of Notre Dame
Dec. 11, 2009
Final Presentation
DSM Overview
DSM Characteristics:
• Physically: distributed memory
• Logically: a single shared address space
Figure 1 DSM architecture
Related Work
Models and Main Features:
• IVY (Yale) - Divided space: shared and private space
• Mirage (UCLA) - Time interval d: avoids page thrashing
• TreadMarks (Rice) - Lazy release consistency: improves efficiency
• SAM (Stanford)
System Design
• Client
– Physical memory owner
– UI/Work/Page Fetch Thread
– Fixed-home Protocol
– Not Aware of Peer Clients
Implementation
• Server/Client Page Table
– Server holds the most up-to-date metadata
– Server manages the whole virtual memory space
– Server records the IDs and addresses of all nodes
– Client owns the most up-to-date local memory segment
– Client caches referenced pages from peer nodes
Client ID   IP Address
0           129.74.155.107 (e.g.)
1           129.74.155.122
…           …
Figure 7 Connection Table

Page #   Frame #   Access Bits             Page Owner
0        57        PROT_READ               1
1        67        PROT_READ|PROT_WRITE    1
2        57        PROT_READ               3
…        …         …                       …
Figure 8 Server Page Table
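The two server-side tables can be modelled roughly as follows. This is only an illustrative sketch, not the CSCS code: the names `ServerPageEntry`, `connection_table`, `server_page_table`, and `locate_owner` are hypothetical, the rows are the ones shown in Figures 7 and 8, and the `PROT_*` values are assumed to match the usual Linux definitions.

```python
from dataclasses import dataclass

# POSIX-style protection flags (values assumed to match Linux <sys/mman.h>).
PROT_READ = 0x1
PROT_WRITE = 0x2


@dataclass
class ServerPageEntry:
    """One row of the server page table (hypothetical layout)."""
    frame: int
    access_bits: int
    owner: int


# Connection table: client ID -> IP address (rows listed in Figure 7).
connection_table = {0: "129.74.155.107", 1: "129.74.155.122"}

# Server page table: page number -> entry (rows listed in Figure 8).
server_page_table = {
    0: ServerPageEntry(57, PROT_READ, 1),
    1: ServerPageEntry(67, PROT_READ | PROT_WRITE, 1),
    2: ServerPageEntry(57, PROT_READ, 3),
}


def locate_owner(page):
    """Return (owner ID, owner address) for a page.

    The address is None when the owner is not among the listed connections.
    """
    entry = server_page_table[page]
    return entry.owner, connection_table.get(entry.owner)
```

Keeping the owner ID in the page table and the address in a separate table means a node's address is stored once, however many pages it owns.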
Implementation
Page #   Frame #   Access Bits             Page Owner   Ref Count
0        30        PROT_READ               1            0
1        31        PROT_READ               1            0
2        32        PROT_READ               1            4
3        60        PROT_READ|PROT_WRITE    1            1
4        200       PROT_READ               5            0
…        …         …                       …            …
Figure 9 Client Page Table
Implementation
• Page fault handler
– Client to server
  • Check the access right
  • Fetch the page owner ID/address
  • Update global access bits
– Client to client
  • Connect to the page owner
  • Cache the referenced page
  • Update local access bits
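The two-stage exchange listed above can be sketched as below. This is a simplified, in-process model under several assumptions: the classes `Server` and `Client`, the function `handle_fault`, and the `allowed`/`access` fields are hypothetical names, and a dictionary of peer objects stands in for real network connections.

```python
PROT_NONE, PROT_READ, PROT_WRITE = 0x0, 0x1, 0x2


class Server:
    def __init__(self, page_table, addresses):
        # page -> {"owner": id, "allowed": bits, "access": bits} (assumed layout)
        self.page_table = page_table
        self.addresses = addresses  # client id -> address

    def lookup(self, page, wanted):
        """Client-to-server stage of the fault."""
        entry = self.page_table[page]
        # Check the access right.
        if wanted & ~entry["allowed"]:
            raise PermissionError(f"page {page}: access {wanted:#x} not allowed")
        # Update the global access bits.
        entry["access"] = wanted
        # Return the page owner's ID and address.
        owner = entry["owner"]
        return owner, self.addresses[owner]


class Client:
    def __init__(self, client_id, memory=None):
        self.client_id = client_id
        self.memory = memory or {}  # pages this client owns
        self.cache = {}             # pages cached from peers
        self.access = {}            # local access bits per page

    def serve_page(self, page):
        return self.memory[page]


def handle_fault(client, server, peers, page, wanted=PROT_READ):
    """Resolve a fault on `page`: ask the server, then fetch from the owner."""
    owner_id, _address = server.lookup(page, wanted)
    owner = peers[owner_id]                      # connect to the page owner
    client.cache[page] = owner.serve_page(page)  # cache the referenced page
    client.access[page] = wanted                 # update local access bits
    return client.cache[page]
```

In this model the server never touches page contents; it only answers "who owns this page and may I access it", which is consistent with the server holding metadata while clients hold the memory.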
Implementation
• Page fault handler
– Page fault type
  • Read remote page
  • Write on a page
– Assumption
  • Reading happens more often than writing
  • Writing needs the most up-to-date copy more than reading does
Implementation
Figure 10 Page fault handler workflow
• Assume the fault is a read of a remote page; DSM call: dsm_do_no_page()
• Truly a remote reading fault?
– YES: continue
– NO: double page fault; DSM call: dsm_do_wrt_page()
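One way to read this workflow is that every first fault is optimistically handled as a remote read, and a write then faults a second time on the now read-only page. The sketch below simulates that reading without real signals or `mprotect`. Only the names `dsm_do_no_page` and `dsm_do_wrt_page` come from the slides; their bodies, the `Page` class, and the `access` helper are assumptions. A real write handler would presumably also obtain the most up-to-date copy.

```python
PROT_NONE, PROT_READ, PROT_WRITE = 0x0, 0x1, 0x2


class Page:
    def __init__(self):
        self.prot = PROT_NONE  # not mapped locally yet
        self.data = None


def dsm_do_no_page(page, fetch):
    """First fault: assume a remote read, fetch a copy, map it read-only."""
    page.data = fetch()
    page.prot = PROT_READ


def dsm_do_wrt_page(page):
    """Second fault on the same access: it was a write, so grant write access."""
    page.prot = PROT_READ | PROT_WRITE


def access(page, is_write, fetch, log):
    """Retry the access until the protection allows it, as a CPU re-executes
    the faulting instruction after each handler returns."""
    needed = PROT_WRITE if is_write else PROT_READ
    while not (page.prot & needed):
        if page.prot == PROT_NONE:
            log.append("dsm_do_no_page")
            dsm_do_no_page(page, fetch)
        else:
            log.append("dsm_do_wrt_page")  # the double page fault
            dsm_do_wrt_page(page)
    return page.data
```

Under the stated assumption that reads dominate, this keeps the common path to a single handler call and pays the extra fault only on writes.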
Implementation
• Memory Consistency Model
– Assumptions revisited
  • Reading happens more often than writing
  • Writing needs the most up-to-date copy more than reading does
– Multi-Reader/Single-Writer
  • Snapshot for reading
  • Every write triggers a page fault
– Locks on pages being referenced
  • Semaphore-like reference counts: if ref_count > 0, waiting/re-random
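The reference-count rule can be illustrated with a small single-threaded sketch. It reads "waiting/re-random" as waiting for a random delay before retrying, which is an interpretation rather than something the slides state. The names `PageEntry` and `try_write`, and the retry limit and delay bound, are hypothetical.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class PageEntry:
    """Client page table row reduced to the field used for locking."""
    ref_count: int = 0  # number of peers currently referencing the page


def try_write(entry, write, max_attempts=5, max_delay=0.001, sleep=time.sleep):
    """Run `write` only while no peer references the page.

    Returns True if the write ran, False if every attempt found ref_count > 0.
    Not thread-safe: a real implementation would need to make the check and
    the write atomic.
    """
    for _ in range(max_attempts):
        if entry.ref_count == 0:
            write()
            return True
        # Page is referenced: wait for a random interval, then retry.
        sleep(random.uniform(0, max_delay))
    return False
```

With the rows of Figure 9, a write to page 2 (ref count 4) would keep waiting, while a write to page 0 (ref count 0) would proceed immediately.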
Future Work
• Enhance system robustness
• Evaluate scalability boundary
• Provide better programmability