Lessons Learned During the Development of the CapoOne Deterministic Multiprocessor Replay System
Department of Computer Science, University of Illinois at Urbana-Champaign
Pablo Montesinos, Matthew Hicks, Wonsun Ahn, Samuel T. King and Josep Torrellas
Pablo Montesinos Lessons Learned during the CapoOne Development
Motivation: Time Travel
Allows us to visit and recreate past states and events in a computer
Wide range of uses:
Debugging
Security
High availability
Enabled by using Deterministic Replay of Execution
How Deterministic Replay Works
Phase I: Initial Execution (a.k.a. Recording)
Execute and record certain non-deterministic events into a log
Sources of non-determinism: interrupts, memory access interleaving, ...
Phase II: Replay
Restore to a previous checkpoint
Re-execute and use the log to force software down the same execution path
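The two phases can be illustrated with a minimal Python sketch. It is not how CapoOne is implemented: the `EventSource` class, its `next_event` method and the toy `run` workload are hypothetical names, and random numbers stand in for non-deterministic events such as interrupts.

```python
import random


class EventSource:
    """Hypothetical source of non-deterministic events (stand-in for interrupts, inputs, ...)."""

    def __init__(self, log=None):
        # Recording mode when no log is supplied, replay mode otherwise.
        self.replaying = log is not None
        self.log = list(log) if self.replaying else []
        self._cursor = 0

    def next_event(self):
        if self.replaying:
            # Phase II: take the value from the log instead of the outside world.
            value = self.log[self._cursor]
            self._cursor += 1
            return value
        # Phase I: observe a non-deterministic value and append it to the log.
        value = random.randint(0, 9)
        self.log.append(value)
        return value


def run(source, steps=5):
    """Toy workload whose execution path depends on the events it receives."""
    state = 0  # plays the role of the checkpointed initial state
    path = []
    for _ in range(steps):
        event = source.next_event()
        if event % 2 == 0:
            state += event
            path.append("even")
        else:
            state -= event
            path.append("odd")
    return state, path


recorder = EventSource()                  # Phase I: record
original = run(recorder)

replayer = EventSource(log=recorder.log)  # Phase II: replay from the same start state
replayed = run(replayer)
```

Because every non-deterministic value comes from the log during replay, `replayed` follows the same path and reaches the same state as `original`.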
SW-Based vs HW-Based Deterministic Replay
SW-Based Schemes: flexible, integrate well with OS and apps; very slow on multiprocessors
HW-Based Schemes: fast multiprocessor execution; poor integration with SW
HW-Assisted Deterministic Replay
HW-Assisted Schemes (CapoOne): flexible, integrate well with OS and apps; fast multiprocessor execution
Today’s Agenda
Overview: Capo and CapoOne
From full-system replay to sphere-based replay
Exiting the replay sphere
System Issues
Capo (ASPLOS '09)
SW-HW interface for HW-assisted deterministic replay
Integrates HW-based replay systems with the OS and applications
Narrow: compatible with any HW-based replay system
Replay Sphere: new abstraction
Isolates SW that is being recorded (or replayed) from the rest
Separates the responsibilities of the HW and SW components
CapoOne: first implementation of Capo
Replay Sphere
Set of threads recorded and replayed as a unit, together with their address space
Only user-mode threads run inside spheres: R-threads
[Figure: an OS runs FFT (threads 103 and 128), GCC (thread 26) and Vi (thread 39) on four CPUs (CPU1 to CPU4) equipped with Replay HW. The FFT threads appear as R-thread 1 and R-thread 2 and the GCC thread as R-thread 1, inside Replay Sphere 1 (recording) and Replay Sphere 2 (replaying); Vi has no R-thread. A Replay Sphere Manager is also shown.]
Replay Sphere: Separating Responsibilities
Replay HW:
Records memory access interleaving of R-threads running within the same sphere
Produces a per-sphere Memory Interleaving Log
Enforces the same memory interleaving during replay
Replay Sphere Manager:
Logs the other sources of non-determinism that affect the sphere
Produces a per-sphere Input Log
Includes system call return values, signals, data copied into the sphere, ...
Injects data from the log into the sphere during replay
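The Replay Sphere Manager's side of this split can be sketched in a few lines of Python. This is an illustration under assumptions, not CapoOne's kernel code: `InputLogEntry`, `ReplaySphereManager` and its `syscall` method are hypothetical names, and a callable stands in for the real system call.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class InputLogEntry:
    r_thread: int   # R-thread ID within the sphere
    kind: str       # "syscall", "signal", ...
    payload: bytes  # return value or data copied into the sphere


class ReplaySphereManager:
    """Hypothetical, simplified RSM for a single sphere."""

    def __init__(self, input_log=None):
        # Recording mode when no log is supplied, replay mode otherwise.
        self.replaying = input_log is not None
        self.input_log = list(input_log) if self.replaying else []
        # During replay, each R-thread consumes its own entries in order.
        self._pending = defaultdict(deque)
        for entry in self.input_log:
            self._pending[entry.r_thread].append(entry)

    def syscall(self, r_thread, do_syscall):
        if self.replaying:
            # Inject the logged result; the real system call is not executed.
            entry = self._pending[r_thread].popleft()
            if entry.kind != "syscall":
                raise RuntimeError("replay diverged from the Input Log")
            return entry.payload
        # Recording: run the call outside the sphere and log what enters the sphere.
        payload = do_syscall()
        self.input_log.append(InputLogEntry(r_thread, "syscall", payload))
        return payload


recording = ReplaySphereManager()
recorded = recording.syscall(1, lambda: b"bytes read from a socket")

replaying = ReplaySphereManager(input_log=recording.input_log)
replayed = replaying.syscall(1, lambda: b"whatever the socket holds now")
```

During replay the R-thread sees the logged bytes, so the state of the outside world at replay time does not matter.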
CapoOne: First Capo Implementation
Records and replays multithreaded Linux apps running on multiprocessors
Ubuntu Linux with modified kernel:
Added support for spheres, R-threads
Made some functions more deterministic
Split Replay Sphere Manager
Simulated HW-based replay system:
DeLorean [Montesinos et al. ISCA’08]
[Figure: Ubuntu Linux on DeLorean HW. Replay Sphere 1 records FFT (threads 1863 and 1864, as R-thread 1 and R-thread 2) and Replay Sphere 2 replays Apache (threads 8765 and 8777, as R-thread 1 and R-thread 2). The Replay Sphere Manager is split into a Kernel-Level RSM and a User-Level RSM, with Log 1 and Log 2.]
DeLorean: Chunk-Based Record/Replay
HW groups consecutive dynamic instructions into chunks
Execute chunks atomically
Execute chunks in isolation
Record chunk commit order
HW records chunk commit order in Processor Interleaving Log
Also records size of some irregular chunks in Chunk Size Log
During replay: generate same chunks and commit them in same order
[Figure: processors P0 and P1 execute chunks (st X ... st Y and ld T ... st W) and commit them. A later chunk containing ld X ... ld X collides on X with a chunk containing st X ... st Y and is shown again after it. The log records the chunk commit order (0, 1).]
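The idea of logging only the chunk commit order can be sketched in Python. This is a toy model under assumptions, not DeLorean's hardware: chunks are modeled as functions that update a dictionary standing in for memory, the names `record`, `replay` and `make_chunks` are hypothetical, and squashing and re-execution after a collision are not modeled.

```python
import random


def make_chunks():
    """Per-processor chunk lists; each chunk applies all of its updates at once."""
    p0 = [lambda m: m.update(X=1, Y=1), lambda m: m.update(X=m["X"] + 10)]
    p1 = [lambda m: m.update(W=m["T"]), lambda m: m.update(Z=m["X"] * 2)]
    return {0: p0, 1: p1}


def record(chunks_per_proc, memory, rng):
    """Commit pending chunks in an arbitrary order and return the commit-order log."""
    pending = {p: list(chunks) for p, chunks in chunks_per_proc.items()}
    log = []
    while any(pending.values()):
        proc = rng.choice([p for p, chunks in pending.items() if chunks])
        pending[proc].pop(0)(memory)  # commit the processor's oldest chunk
        log.append(proc)              # one log entry per committed chunk
    return log


def replay(chunks_per_proc, memory, log):
    """Commit the same chunks in the logged order."""
    pending = {p: list(chunks) for p, chunks in chunks_per_proc.items()}
    for proc in log:
        pending[proc].pop(0)(memory)


initial = {"X": 0, "Y": 0, "T": 5, "W": 0, "Z": 0}

recorded_memory = dict(initial)
log = record(make_chunks(), recorded_memory, random.Random())

replayed_memory = dict(initial)
replay(make_chunks(), replayed_memory, log)
```

The final value of `Z` depends on whether P1's second chunk commits before or after P0's chunks, yet `replayed_memory` always matches `recorded_memory` because the log fixes that order.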
From Full-System Replay to Sphere-Based Replay
Only log non-deterministic events that affect a sphere
Log entries record R-thread IDs, not processor IDs
Information about other non-deterministic events is discarded
Memory Interleaving Log must only include interleaving of instructions from R-threads within same sphere
Changed DeLorean’s chunk truncation rules to enforce isolation
Example: System calls cause chunk truncation in CapoOne, not in DeLorean
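A small Python sketch can make the first three points concrete. It assumes a hypothetical stream of chunk commits in which each commit is tagged with the sphere and R-thread that were running on the processor; the `Commit` tuple and `per_sphere_logs` function are illustrative names, not part of CapoOne.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    cpu: int                 # processor that committed the chunk
    sphere: Optional[int]    # sphere ID, or None for code outside any sphere
    r_thread: Optional[int]  # R-thread ID within that sphere


def per_sphere_logs(commits):
    """Build one interleaving log per sphere, keyed by R-thread ID."""
    logs = defaultdict(list)
    for commit in commits:
        if commit.sphere is None:
            continue  # not part of any sphere: the event is discarded
        # Record the R-thread ID, so the log stays valid if the thread migrates.
        logs[commit.sphere].append(commit.r_thread)
    return dict(logs)


commits = [
    Commit(cpu=0, sphere=1, r_thread=1),
    Commit(cpu=1, sphere=1, r_thread=2),
    Commit(cpu=3, sphere=None, r_thread=None),  # OS or an unrecorded process
    Commit(cpu=2, sphere=1, r_thread=1),        # R-thread 1 migrated to another CPU
    Commit(cpu=1, sphere=2, r_thread=1),
]
logs = per_sphere_logs(commits)
```

Here sphere 1's log is `[1, 2, 1]` regardless of which processors the R-threads ran on, and the commit from outside any sphere leaves no trace.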
New Chunking Rules Provide R-Thread Isolation
Inform HW to stop logging interleaving of instructions from the current processor
Always do it when execution leaves the replay sphere
Otherwise, OS instructions might become part of the interleaving log
RSM and HW must work together to keep OS instructions from polluting the Memory Interleaving Log
[Figure: in DeLorean, a chunk on processor P1 (labels 500, 501, inst m, ..., 999) spans the syscall and the OS syscall handler, and the Processor Interleaving Log records P1. In CapoOne, the chunk of R-thread R1 (labels 500, 501, inst m, ...) ends at the syscall, the OS syscall handler runs outside it, and the R-thread Interleaving Log records the R-thread.]
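The difference between the two chunking rules can be sketched in Python. The sketch assumes a hypothetical instruction stream in which each instruction is tagged "user", "syscall" or "os"; the `chunk_stream` function and its `sphere_isolation` flag are illustrative names, and placing the syscall instruction itself in the truncated chunk is a choice made here for illustration only.

```python
def chunk_stream(stream, chunk_size, sphere_isolation):
    """Split a tagged instruction stream into chunks.

    With sphere_isolation=False, chunks are cut only when they reach
    chunk_size, so OS instructions end up inside chunks.
    With sphere_isolation=True, a system call truncates the current chunk
    and OS instructions never enter a chunk.
    """
    chunks, current = [], []
    for inst in stream:
        if sphere_isolation and inst == "os":
            continue  # executed outside the sphere, so it is not chunked or logged
        current.append(inst)
        truncate = sphere_isolation and inst == "syscall"
        if len(current) == chunk_size or truncate:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


stream = ["user"] * 3 + ["syscall"] + ["os"] * 4 + ["user"] * 2

full_system = chunk_stream(stream, chunk_size=8, sphere_isolation=False)
sphere_based = chunk_stream(stream, chunk_size=8, sphere_isolation=True)

full_system_sizes = [len(chunk) for chunk in full_system]    # [8, 2]
sphere_based_sizes = [len(chunk) for chunk in sphere_based]  # [4, 2]
```

In the sphere-based run the first chunk is irregular (4 instructions instead of 8), which is the kind of chunk whose size would have to be recorded so that replay can regenerate the same chunks.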
Handling Interrupts
Ensure interrupt handler code is not part of Memory Interleaving Log
Balance conflicting demands because of chunk-based execution:
Size of Memory Interleaving Log
Interrupt Latency
Wasted Work
Handling Interrupts: Three Approaches
Criteria: Log Size, Wasted Work, Interrupt Latency
Finish First
[Figure: Original Execution of R-thread R1 and its Chunk Size Log. An interrupt arrives at inst m; R1 first finishes the current chunk (counter 999 ... 0), and only then does the OS handler run. The Chunk Size Log stays empty.]
Handling Interrupts: Three Approaches
Criteria: Log Size, Wasted Work, Interrupt Latency
Commit Now
[Figure: Original Execution of R-thread R1 and its Chunk Size Log. An interrupt arrives at inst m after 500 instructions of the chunk. The chunk commits at once, the log gains the entry (ChunkID 7, ChunkSize 500), the OS interrupt handler runs, and R1 resumes with a fresh chunk (counter 999 ... 0).]
Handling Interrupts: Three Approaches
Criteria: Log Size, Wasted Work, Interrupt Latency
Squash Now
[Figure: Original Execution of R-thread R1 and its Chunk Size Log. An interrupt arrives at inst m after 500 instructions of the chunk. The partial chunk is squashed, the OS interrupt handler runs, and the chunk then re-executes from its start (counter 999 ... 0), inst m included. The Chunk Size Log stays empty.]
Highly non-deterministic events, such as interrupts, can be treated as deterministic events
CapoOne uses Squash Now: easy to implement and little overhead
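The three approaches can be compared with a small model. The sketch below is illustrative only and is not CapoOne code: it assumes a standard chunk of 1000 instructions (the slides count 999 down to 0), measures latency and wasted work in instructions, and uses hypothetical names (`Outcome`, `handle_interrupt`).

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 1000  # assumed standard chunk size; the slides count 999 down to 0


@dataclass
class Outcome:
    log_entries: list = field(default_factory=list)  # (chunk_id, chunk_size) pairs
    wasted_work: int = 0        # instructions executed, then discarded
    interrupt_latency: int = 0  # instructions run between arrival and handler


def handle_interrupt(policy, chunk_id, executed):
    """Model one interrupt that arrives after `executed` instructions of a chunk."""
    if not 0 <= executed < CHUNK_SIZE:
        raise ValueError("the interrupt must arrive inside the chunk")
    if policy == "finish_first":
        # Run the chunk to its normal end, then take the interrupt.
        return Outcome(interrupt_latency=CHUNK_SIZE - executed)
    if policy == "commit_now":
        # Commit the truncated chunk and log its irregular size.
        return Outcome(log_entries=[(chunk_id, executed)])
    if policy == "squash_now":
        # Discard the partial chunk; it re-executes in full after the handler.
        return Outcome(wasted_work=executed)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("finish_first", "commit_now", "squash_now"):
        print(policy, handle_interrupt(policy, chunk_id=7, executed=500))
```

With the interrupt arriving after 500 instructions of chunk 7, only Commit Now adds a log entry, only Finish First delays the handler, and only Squash Now discards work.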
Handling Faults
Commit all instructions up to faulting one and log chunk size
[Figure: Original Execution of R-thread R1 and its Chunk Size Log. inst n faults after 500 instructions of the chunk (inst m ...) have executed. Those 500 instructions commit, the log records (7, 500), the OS fault handler runs, and inst n re-executes at the start of a new chunk (counter 999 ... 0), followed by inst z.]
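The recording rule for faults can be sketched as follows. This is a toy model rather than the CapoOne implementation: instructions are plain Python values, `faulting` names the ones that fault the first time they execute, chunk IDs are simply chunk indices, and `record_with_faults` is a hypothetical name.

```python
def record_with_faults(instructions, faulting, chunk_size=1000):
    """Split `instructions` into committed chunks and build the Chunk Size Log.

    `faulting` holds the instructions that fault the first time they execute.
    """
    chunks, size_log = [], []
    current = []
    pending = set(faulting)
    for inst in instructions:
        if inst in pending:
            pending.discard(inst)
            if current:
                # Commit everything before the faulting instruction and log
                # the irregular size under the chunk's ID.
                size_log.append((len(chunks), len(current)))
                chunks.append(current)
                current = []
            # The OS fault handler would run here; the faulting instruction
            # then re-executes as the first instruction of a new chunk.
        current.append(inst)
        if len(current) == chunk_size:
            chunks.append(current)  # regular chunk: nothing is logged
            current = []
    if current:
        chunks.append(current)  # trailing partial chunk, flushed for the sketch
    return chunks, size_log


if __name__ == "__main__":
    chunks, log = record_with_faults(range(2000), faulting={500})
    print([len(c) for c in chunks], log)
```

Only the truncated chunk produces a log entry; regular chunks are implied by the standard chunk size.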
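Handling Faults
Commit all instructions up to faulting one and log chunk size
Handling faults during replay can be tricky
[Figure: Original Execution and three replay cases for R-thread R1.
Original Execution: inst n faults after 500 instructions; the chunk commits, the Chunk Size Log records (7, 500), the OS fault handler runs, and inst n and inst z execute in the next chunk (counter 999 ... 0).
Replay Case I: the fault recurs at inst n; the 500-instruction chunk commits, the OS fault handler runs, and the next chunk executes (999 ... 0).
Replay Case II: inst n does not fault; the 500-instruction chunk still commits as logged, and inst n and inst z execute in the next chunk (999 ... 0) with no handler in between.
Replay Case III: a fault that was not recorded occurs at inst r, about 800 instructions into a chunk; the OS fault handler runs while no other R-thread can commit, and the remaining instructions of the chunk (counter 199 ... 0) then execute.]
Faults are synchronous events, but they are not necessarily deterministic
A fault can occur during replay but not during recording, or vice versa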
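One way to read the three replay cases is that chunk boundaries always come from the Chunk Size Log, whether or not the faults recur. The sketch below illustrates that reading under loud assumptions: every logged chunk is taken to have ended because the next instruction faulted during recording, instructions are identified by their index, and the names `chunk_bounds` and `classify_faults` are hypothetical.

```python
def chunk_bounds(num_instructions, size_log, chunk_size=1000):
    """Rebuild [start, end) chunk boundaries from the Chunk Size Log alone."""
    bounds, pc, chunk_id = [], 0, 0
    while pc < num_instructions:
        size = size_log.get(chunk_id, chunk_size)
        if size <= 0:
            raise ValueError("logged chunk sizes must be positive")
        end = min(pc + size, num_instructions)
        bounds.append((pc, end))
        pc, chunk_id = end, chunk_id + 1
    return bounds


def classify_faults(num_instructions, size_log, replay_faults, chunk_size=1000):
    """Label each fault site as replay case "I", "II", or "III"."""
    bounds = chunk_bounds(num_instructions, size_log, chunk_size)
    # Assumption: each logged chunk ended because the next instruction faulted.
    recorded = {bounds[cid][1] for cid in size_log if cid < len(bounds)}
    cases = {}
    for pc in sorted(recorded | set(replay_faults)):
        if pc in recorded and pc in replay_faults:
            cases[pc] = "I"    # fault recurs where it was recorded
        elif pc in recorded:
            cases[pc] = "II"   # recorded fault does not recur
        else:
            cases[pc] = "III"  # fault appears only during replay
    return bounds, cases


if __name__ == "__main__":
    log = {0: 500}  # chunk 0 was truncated to 500 instructions
    for faults in ({500}, set(), {500, 1300}):
        print(sorted(faults), classify_faults(2000, log, faults))
```

In all three runs the boundaries are identical; only the points at which the fault handler would run differ, which is why the fault itself does not need to be replayed at a fixed place.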
Handling Traps and Programmed Exceptions
Commit all instructions including the one raising the trap
No need to record the size of irregular chunk
[Figure: R-thread R1 and its Chunk Size Log. R1 executes inst m ... syscall (positions 500, 501); the chunk commits with the syscall as its last instruction, the OS syscall handler runs, and the log stays empty.]
Traps and programmed exceptions are deterministic events: no need to record irregular-sized chunks
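Because a trap is part of the instruction stream, recording and replay can both derive the same chunk boundary from the stream itself. The sketch below illustrates this; the instruction strings, the `"syscall"` marker, and the function name `split_chunks` are assumptions made for the example.

```python
def split_chunks(instructions, chunk_size=1000, trap="syscall"):
    """Cut a stream into chunks; a trap ends its chunk and is committed with it."""
    chunks, current = [], []
    for inst in instructions:
        current.append(inst)
        # The trapping instruction is included in the chunk it terminates.
        if inst == trap or len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)  # trailing partial chunk, flushed for the sketch
    return chunks


if __name__ == "__main__":
    stream = ["inst"] * 500 + ["syscall"] + ["inst"] * 1200
    recorded = split_chunks(stream)
    replayed = split_chunks(stream)  # no log consulted
    print([len(c) for c in recorded], recorded == replayed)
```

The 501-instruction chunk is irregular, yet it is reproduced without any log entry because the boundary depends only on the stream.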
Injecting Data into Spheres
Problem: interleaving between OS copies and R-threads not recorded
Solution: insert copy_to_user into sphere:
    HW can log memory access interleaving
    copy_to_user exits sphere once copy is over
[Figure: Replay Sphere 1 (recording) with R-threads R1 and R2, the Replay Sphere Manager, the OS, and Log 1. After read(&buf), the OS runs copy_to_user to overwrite buf ("BYE\0") with "HI!\0" while R-threads access the same buffer (X = buf[2], buf[3] = Y).]
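The problem can be made concrete with a deterministic toy schedule. In the sketch below the copy proceeds one byte per step, which is a simplification of the real copy_to_user, and the names `run`, `"os"`, and `"r"` are hypothetical.

```python
def run(schedule, old=b"BYE\0", new=b"HI!\0"):
    """Apply `schedule`, a sequence of "os" and "r" steps, to a shared buffer.

    "os": copy_to_user writes the next byte of `new` into buf.
    "r":  an R-thread executes X = buf[2].
    Returns (X, final buffer contents).
    """
    buf = bytearray(old)
    copied = 0
    x = None
    for step in schedule:
        if step == "os":
            if copied == len(new):
                raise ValueError("more copy steps than bytes to copy")
            buf[copied] = new[copied]
            copied += 1
        elif step == "r":
            x = chr(buf[2])
        else:
            raise ValueError(f"unknown step: {step}")
    return x, bytes(buf)


if __name__ == "__main__":
    print(run(["r", "os", "os", "os", "os"]))  # read happens before the copy
    print(run(["os", "os", "os", "r", "os"]))  # read happens after byte 2 is copied
```

Both schedules leave the same final buffer, but X differs, so a replay that does not reproduce the interleaving of the copy with the R-thread accesses can diverge. Running the copy inside the sphere lets the hardware log that interleaving.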
Pablo Montesinos Lessons Learned during the CapoOne Development
RSM
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
31
R-thread A R-thread B R-thread A R-thread B
RSM
RSM
copy_to_user
read( ) read( )
copy_to_user
R-threadInterleaving
Log
A
A
B
InputLogInputLog
A read_enter
A read_exit inject before returning
A A read_enter
A read_exit inject before returning
Pablo Montesinos Lessons Learned during the CapoOne Development
RSM
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
31
R-thread A R-thread B R-thread A R-thread B
RSM
RSM
copy_to_user
read( ) read( )
copy_to_user
R-threadInterleaving
Log
A
A
B
InputLogInputLog
A read_enter
A read_exit inject before returning
A
A
A read_enter
A read_exit inject before returning
Pablo Montesinos Lessons Learned during the CapoOne Development
RSM
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
31
R-thread A R-thread B R-thread A R-thread B
RSM
RSM
copy_to_user
read( ) read( )
copy_to_user
R-threadInterleaving
Log
A
A
B
InputLogInputLog
A read_enter
A read_exit inject before returning
A
A
B
A read_enter
A read_exit inject before returning
Pablo Montesinos Lessons Learned during the CapoOne Development
RSM
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
31
R-thread A R-thread B R-thread A R-thread B
RSM
RSM
copy_to_user
read( ) read( )
copy_to_user
R-threadInterleaving
Log
A
A
B
InputLogInputLog
A read_enter
A read_exit inject before returning
A
A
B
A read_enter
A read_exit inject before returning
Pablo Montesinos Lessons Learned during the CapoOne Development
RSM
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
31
R-thread A R-thread B R-thread A R-thread B
RSM
RSM
copy_to_user
read( ) read( )
copy_to_user
R-threadInterleaving
Log
A
A
B
InputLogInputLog
A read_enter
A read_exit inject before returning
A
A
B
A read_enter
A read_exit inject before returning
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
[Figure: timeline for R-thread A and R-thread B with the RSM. Labels: read( ), copy_to_user, write( ). R-thread Interleaving Log: A, A, B. Input Log: A read_enter, B write_enter, A read_exit (inject before returning).]
Injecting Data into Spheres (II)
Look for interactions between R-thread Interleaving Log and Input Log
[Figure: two timelines, each for R-thread A and R-thread B with the RSM. First timeline labels: read( ), copy_to_user, write( ). R-thread Interleaving Log: A, A, B. Input Log: A read_enter, B write_enter, A read_exit (inject before returning). Second timeline labels: read( ), write( ); entries shown beside it: A with A read_enter, then A with B write_enter.]
Circular dependences between Memory Interleaving Log and Input Log cause deadlocks
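The circular wait can be illustrated with a toy replayer. This is only a sketch under stated assumptions: the two logs are the ones shown in the figure, but the per-thread step orders, the function name `replay`, and the rule "a step may run only when it is at the head of its log" are hypothetical and chosen to make the problem visible.

```python
from collections import deque

def replay(programs, interleaving_log, input_log):
    """Replay per-thread steps while honoring both logs.

    A "chunk" step may run only if its thread is at the head of the
    interleaving log; a "syscall" step may run only if (thread, name)
    is at the head of the input log. Returns (status, trace).
    """
    inter = deque(interleaving_log)
    inp = deque(input_log)
    pc = {t: 0 for t in programs}          # next step index per thread
    trace = []
    while inter or inp:
        progressed = False
        for t in sorted(programs):
            if pc[t] >= len(programs[t]):
                continue                    # this thread has finished
            kind, name = programs[t][pc[t]]
            if kind == "chunk" and inter and inter[0] == t:
                inter.popleft()
            elif kind == "syscall" and inp and inp[0] == (t, name):
                inp.popleft()
            else:
                continue                    # blocked on one of the logs
            trace.append((t, kind, name))
            pc[t] += 1
            progressed = True
        if not progressed:
            return "deadlock", trace        # every thread waits on a log head
    return "ok", trace

# Hypothetical step orders for the two R-threads.
programs = {
    "A": [("chunk", "c1"), ("syscall", "read_enter"),
          ("syscall", "read_exit"), ("chunk", "c2")],
    "B": [("chunk", "c1"), ("syscall", "write_enter")],
}
interleaving_log = ["A", "A", "B"]
input_log = [("A", "read_enter"), ("B", "write_enter"), ("A", "read_exit")]
status, trace = replay(programs, interleaving_log, input_log)
```

In this model R-thread A waits for B's write_enter to leave the head of the Input Log, while R-thread B waits for A's second chunk to leave the head of the Interleaving Log, so neither can move.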
Cache Overflows
CapoOne keeps speculative data in cache until commit
A chunk may access more lines that map to one cache set than the set has ways
BulkSC-based systems do not allow storing speculative state in memory
When a cache would overflow, CapoOne must commit current chunk
Non-deterministic event: must log chunk size
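The overflow rule above can be sketched with a toy recorder. This is an illustrative model, not the CapoOne hardware: the class `ChunkRecorder`, its parameters, the choice to count one instruction per memory access, and the choice to log only chunks that commit early are all assumptions.

```python
class ChunkRecorder:
    """Toy model: one speculative chunk at a time in a set-associative cache."""

    def __init__(self, num_sets, ways, max_chunk_size):
        self.num_sets = num_sets
        self.ways = ways
        self.max_chunk_size = max_chunk_size
        self.chunk_size_log = []     # (chunk index, size), early commits only
        self.chunk_index = 0
        self._start_chunk()

    def _start_chunk(self):
        self.size = 0                # accesses executed in the current chunk
        self.spec_lines = [set() for _ in range(self.num_sets)]

    def _commit(self, early):
        if early:
            # The size depends on cache state, so it has to be logged.
            self.chunk_size_log.append((self.chunk_index, self.size))
        self.chunk_index += 1
        self._start_chunk()

    def access(self, line_addr):
        lines = self.spec_lines[line_addr % self.num_sets]
        if line_addr not in lines and len(lines) == self.ways:
            # The set is full of speculative lines: commit before this access.
            self._commit(early=True)
            lines = self.spec_lines[line_addr % self.num_sets]
        lines.add(line_addr)
        self.size += 1
        if self.size == self.max_chunk_size:
            self._commit(early=False)    # standard-size chunk, nothing to log

rec = ChunkRecorder(num_sets=2, ways=2, max_chunk_size=1000)
for line in (0, 2, 4):                   # all three lines map to set 0
    rec.access(line)
```

The third access finds set 0 full, so the chunk commits with two accesses and that size goes into the log; the access itself becomes the first one of the next chunk.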
Cache Overflows
The instruction causing the overflow might not be at the head of ROB
Can happen during initial execution and/or during replay
[Figure: Original Execution and Replay of R-thread R1. Labels: instruction stream with inst m, inst n, inst r, inst t, inst z; ROB holding inst t, inst n and inst m, with a Head marker; Chunk Size Log (entry shown: 7 500). Original Execution: inst t is marked "overflow"; counts 500 and 999 are marked. Replay: an "overflow" marker with counts 400, 199 and 0, and the note "No other R-thread can commit".]
The instruction that caused cache overflow might not be the first instruction of next chunk
HW must ensure that chunk boundaries are consistent
Overall DeLorean System
[Figure: Proc 0 (Chunk A) and Proc 1 (Chunk B), each with a CS Log, an Interrupt Log and an I/O Log; an Arbiter with the PI Log; DMA with a DMA Log; a Network connecting them.]
Interrupt, I/O and DMA logs are common to other HW-based schemes
CapoOne: HW Implementation
No need for DeLorean’s Interrupt Log, DMA Log nor I/O Log
PI Log becomes the per-sphere Interleaving Log
CS Log becomes a per-R-thread Log
Chunks only have instructions from one application (or the kernel)
[Figure: the DeLorean diagram (DMA, Network, Proc 0 with Chunk A, Proc 1 with Chunk B, CS Logs, Arbiter, PI Log) with the Interrupt, I/O and DMA logs removed and an Interleaving Log added.]
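One way to picture the change in log ownership is the following sketch. It is a hypothetical data layout, not CapoOne's actual structures: the names `RThread`, `ReplaySphere` and `commit_chunk`, and the choice to log only early-committed chunk sizes, are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class RThread:
    tid: int
    chunks_committed: int = 0
    # Per-R-thread log (the role played by DeLorean's per-processor CS Log).
    chunk_size_log: list = field(default_factory=list)

@dataclass
class ReplaySphere:
    sphere_id: int
    # Per-sphere log (the role played by DeLorean's PI Log).
    interleaving_log: list = field(default_factory=list)
    r_threads: dict = field(default_factory=dict)

    def commit_chunk(self, cpu, tid, size, early=False):
        """Record a chunk commit; `cpu` is deliberately not logged."""
        rt = self.r_threads.setdefault(tid, RThread(tid))
        self.interleaving_log.append(tid)      # R-thread id, not processor id
        if early:
            rt.chunk_size_log.append((rt.chunks_committed, size))
        rt.chunks_committed += 1

sphere = ReplaySphere(sphere_id=1)
sphere.commit_chunk(cpu=0, tid=1, size=1000)
sphere.commit_chunk(cpu=1, tid=2, size=1000)
sphere.commit_chunk(cpu=1, tid=1, size=500, early=True)   # R-thread 1 migrated
```

Because the logs are keyed by sphere and R-thread rather than by processor, the entries stay meaningful when the OS schedules an R-thread on a different CPU.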
Emulating vs Re-Executing System Calls
During replay, the RSM emulates most system calls:
RSM injects return values from Sphere Input Log, squashes outputs
Some have to be re-executed
Thread management (clone)
Address space modification (mprotect)
[Figure: Replay Sphere 1 (replaying) with R-thread1 and R-thread2 above the OS and the RSM. A read() is served from Log 1; a fork() goes to sys_fork(), which creates thread674, shown joining the sphere as R-thread3.]
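The split between emulated and re-executed system calls can be sketched as a small dispatcher. This is a minimal sketch under assumptions: the function `replay_syscall`, the log entry fields (`name`, `ret`, `data`), the `reexecute` callback, and the exact contents of `REEXECUTED` are hypothetical; only the two categories (thread management, address space modification) come from the slide.

```python
from collections import deque

# Hypothetical set of calls that must really run during replay.
REEXECUTED = {"clone", "fork", "mprotect"}

def replay_syscall(name, args, input_log, reexecute, user_buf=None):
    """Return the result of a system call issued by an R-thread in replay."""
    if name in REEXECUTED:
        # Kernel state (threads, page tables) must actually change.
        return reexecute(name, args)
    entry = input_log.popleft()
    if entry["name"] != name:
        raise RuntimeError(f"replay diverged: expected {entry['name']}, got {name}")
    data = entry.get("data")
    if data is not None and user_buf is not None:
        user_buf[:len(data)] = data     # inject logged input into user memory
    return entry["ret"]                 # outputs are squashed: nothing is sent

log = deque([
    {"name": "read", "ret": 5, "data": b"hello"},
    {"name": "write", "ret": 5},
])
buf = bytearray(8)
calls = []

def fake_reexecute(name, args):
    calls.append(name)                  # stands in for the real kernel path
    return 674

r1 = replay_syscall("read", (3, 8), log, fake_reexecute, buf)
r2 = replay_syscall("write", (1, b"hello"), log, fake_reexecute)
r3 = replay_syscall("fork", (), log, fake_reexecute)
```

Here the read and the write never reach the kernel path, while the fork does, which mirrors the read() and fork() cases in the figure.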
Implicit Dependencies
One R-thread changes the mapping or protection of the address space, and another R-thread uses the changed address space
RSM can express these dependencies to hardware so these interactions can be recorded
[Figure: Sphere 1 with a shared Page Table. R-thread 1 on CPU 1 executes mprotect X while R-thread 2 on CPU 2 runs while(1){ *x = *x+1}.]
Replay Sphere
A set of threads that are recorded and replayed as a unit, together with their address space
Only user-mode threads run inside spheres
Threads inside a sphere: R-threads
R-threads that share memory run within the same sphere
[Figure: Replay HW with CPU1 to CPU4 beneath the OS and the Replay Sphere Manager. Thread103, Thread128, Thread26 and Thread39 run on the CPUs. Replay Sphere 1 (Recording) and Replay Sphere 2 (Replaying) enclose some of the threads, labeled R-thread1, R-thread2 and R-thread 1.]
Self-Modifying Code
Handling self-modifying code in chunk-based systems is laborious:
HW adds instruction fetches to read set
Instruction fetches are checked against local write set
In CapoOne, when a processor detects it has modified code
However,
CapoOne: First Capo Implementation
Records and replays parallel Linux apps
Simulated HW-Based replay system:
DeLorean [Montesinos et al. ISCA’08]
Ubuntu Linux with modified kernel:
Added support for spheres, R-threads
Made some functions more deterministic
Split Replay Sphere Manager
[Figure: DeLorean HW running Ubuntu Linux. Replay Sphere 1 (Recording FFT) holds Thread1863 and Thread1864 as R-thread1 and R-thread2; Replay Sphere 2 (Replaying Apache) holds Thread8765 and Thread8777 as R-thread1 and R-thread2. The RSM appears as a Kernel-Level RSM and a User-Level RSM, with Log 1 and Log 2.]