Constructive Computer Architecture
Virtual Memory and Interrupts
Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
November 13, 2015 http://csg.csail.mit.edu/6.175 L21-1
Caching vs. Demand Paging
[Figure: CPU <-> cache <-> primary memory (caching); CPU <-> primary memory <-> secondary memory (demand paging)]

Caching                          Demand paging
cache slot                       page frame
cache line (~32 bytes)           page (~4K bytes)
cache miss rate (1% to 20%)      page miss rate (<0.001%)
cache hit (~1 cycle)             page hit (~100 cycles)
cache miss (~100 cycles)         page miss (~5M cycles)
miss is handled in hardware      miss is handled mostly in software
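The numbers in the comparison can be combined into an average access cost. The sketch below assumes the usual model in which every miss pays the miss penalty on top of the hit time; the function name is hypothetical, and the inputs are the approximate figures quoted on the slide.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per access, assuming a miss costs the hit time plus the penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Caching: ~1-cycle hit, ~100-cycle miss, miss rate between 1% and 20%.
cache_low = average_access_cycles(1, 0.01, 100)    # roughly 2 cycles
cache_high = average_access_cycles(1, 0.20, 100)   # roughly 21 cycles

# Demand paging: ~100-cycle hit, ~5M-cycle miss, miss rate below 0.001%.
# Using 0.001% itself gives an upper bound of roughly 150 cycles.
paging_bound = average_access_cycles(100, 0.00001, 5_000_000)
```

Even at a miss rate of 0.001%, page misses add about half again to the average cost of a memory access, which is why the page miss rate has to be so much lower than the cache miss rate.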
Address Translation: putting it all together
Virtual Address
  -> TLB Lookup (hardware)
       hit  -> Protection Check
       miss -> Page Table Walk (hardware or software)
                 the page is not in memory -> Page Fault (OS loads page) (software)
                 the page is in memory     -> Update TLB
               then resume the instruction
  -> Protection Check (hardware)
       permitted -> Physical Address (to cache)
       denied    -> Protection Fault (software): SEGFAULT
Where?
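The translation flow can be sketched as a single function. This is a minimal illustration, assuming 4 KB pages, a TLB and page table modeled as dictionaries keyed by virtual page number, and a single hypothetical `writable` permission bit; none of these names come from the slides.

```python
from dataclasses import dataclass

PAGE_BITS = 12  # assumption: 4 KB pages


class PageFault(Exception):
    """Page is not in memory; the OS loads it and the instruction is resumed."""


class ProtectionFault(Exception):
    """Access denied by the protection check."""


@dataclass
class Entry:
    ppn: int         # physical page number
    writable: bool   # hypothetical permission bit, for illustration only


def translate(va, is_write, tlb, page_table):
    """Return the physical address for va, or raise a fault."""
    vpn, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    entry = tlb.get(vpn)                  # TLB lookup
    if entry is None:                     # TLB miss: page table walk
        entry = page_table.get(vpn)
        if entry is None:                 # page is not in memory
            raise PageFault(hex(va))
        tlb[vpn] = entry                  # update TLB
    if is_write and not entry.writable:   # protection check
        raise ProtectionFault(hex(va))
    return (entry.ppn << PAGE_BITS) | offset   # physical address (to cache)
```

In a real machine the page fault handler would load the page and then re-execute the faulting instruction, which would call the equivalent of `translate` again.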
Address Translation in Pipeline Machines
Software handlers need a restartable exception on page fault or protection violation
Handling a TLB miss needs a hardware or software mechanism to refill TLB
Methods to overcome the additional latency of a TLB:
- slow down the clock
- pipeline the TLB and cache access
- virtual address caches
- parallel TLB/cache access
[Figure: pipeline PC -> Inst TLB -> Inst. Cache -> Decode -> Execute -> Data TLB -> Data Cache -> Writeback. Both the Inst TLB and the Data TLB can raise: TLB miss? Page Fault? Protection violation?]
Physical vs Virtual Address Caches?
[Figure: CPU -(VA)-> TLB -(PA)-> Physical Cache -> Primary Memory]
Alternative: place the cache before the TLB
[Figure: CPU -(VA)-> Virtual Cache -> TLB -(PA)-> Primary Memory (StrongARM)]
- One cycle in case of a hit (+)
- Cache needs to be flushed on a context switch unless address space identifiers (ASIDs) are included in tags (-)
- Aliasing problems due to the sharing of pages (-)
Aliasing in Virtual-Address Caches
[Figure: the page table maps VA1 and VA2 to the same data page at PA; the virtual cache (Tag, Data) holds a 1st copy of the data at PA tagged VA1 and a 2nd copy of the data at PA tagged VA2]
Two virtual pages share one physical page
Virtual cache can have two copies of same physical data. Writes to one copy are not visible to reads of the other!
General Solution: Disallow aliases to coexist in cache
Software (i.e., OS) solution for direct-mapped cache
VAs of shared pages must agree in cache index bits; this ensures all VAs accessing same PA will conflict in direct-mapped cache (early SPARCs)
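The index-agreement rule can be checked with a few lines of arithmetic. The sketch below assumes a 16 KB direct-mapped cache with 32-byte lines and 4 KB pages, so two index bits come from the virtual page number; the sizes and function names are illustrative assumptions, not taken from the slides.

```python
BLOCK_BITS = 5   # assumption: 32-byte cache lines
INDEX_BITS = 9   # assumption: 512 lines, i.e. a 16 KB direct-mapped cache
PAGE_BITS = 12   # assumption: 4 KB pages, so index bits 12 and 13 come from the VPN


def cache_index(va):
    """Line that a virtual address maps to in the direct-mapped virtual cache."""
    return (va >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)


def aliases_can_coexist(va1, va2):
    """True if two VAs naming the same physical data land in different lines."""
    return cache_index(va1) != cache_index(va2)


# Same page offset 0x040, shared physical page, different virtual pages.
print(aliases_can_coexist(0x1040, 0x2040))  # VPNs 1 and 2 differ in the index bits
print(aliases_can_coexist(0x1040, 0x5040))  # VPNs 1 and 5 agree in the index bits
```

When the OS only hands out shared mappings whose VPNs agree in the index bits, as in the second call, the two aliases compete for the same line and can never both be resident.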
Concurrent Access to TLB & Cache
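Index L is available without consulting the TLB, so cache and TLB accesses can begin simultaneously
Tag comparison is made after both accesses are completed
Cases:
- L + b = k
- L + b < k
- L + b > k: what happens here? Partially VA cache!
[Figure: the VA is split into a VPN and a k-bit page offset; the virtual index (L bits) and block offset (b bits) address a direct-mapped cache of 2^L blocks, each a 2^b-byte block; the TLB translates the VPN to a PPN in parallel, and the PPN is compared with the physical tag read from the cache to decide hit]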
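The three cases differ in how many index bits lie above the page offset. A minimal sketch, using the slide's symbols L, b, and k as parameters; the function names are hypothetical.

```python
def virtual_index_bits(L, b, k):
    """Number of cache index bits taken from the VPN rather than the page offset.

    L: index bits (2**L blocks), b: block-offset bits (2**b-byte blocks),
    k: page-offset bits (2**k-byte pages).
    """
    return max(0, L + b - k)


def describe(L, b, k):
    extra = virtual_index_bits(L, b, k)
    if extra == 0:
        # L + b <= k: the whole index lies in the untranslated page offset.
        return "index is untranslated"
    # L + b > k: some index bits are virtual, so aliases may map to different lines.
    return f"{extra} index bit(s) are virtual"
```

For example, with 4 KB pages (k = 12) and 32-byte blocks (b = 5), a cache with L = 7 needs no virtual index bits, while one with L = 9 uses two.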
Virtual-Index Physical-Tag Caches: Associative Organization
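[Figure: the VA is split into a VPN and a k-bit page offset, with L = k - b index bits and b block-offset bits; W ways, each a direct-mapped array of 2^L blocks, are read in parallel using the virtual index while the TLB translates the VPN to a PPN; the PPN is compared with the physical tag from each way to decide hit]
After the PPN is known, W physical tags are compared
Allows cache size to be greater than 2^(L+b) bytes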
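With L = k - b, each way holds exactly one page's worth of data (2^k bytes), so the total cache size is W times the page size. The sketch below, with a hypothetical function name, computes the smallest W that keeps the index within the page offset for a given cache size.

```python
def min_ways(cache_bytes, page_bytes):
    """Smallest associativity W such that each way is no larger than one page."""
    return max(1, -(-cache_bytes // page_bytes))  # ceiling division


# Assumed example sizes: 4 KB pages with three different cache capacities.
ways_4k = min_ways(4 * 1024, 4096)
ways_32k = min_ways(32 * 1024, 4096)
ways_64k = min_ways(64 * 1024, 4096)
```

Under these assumptions a 32 KB cache needs 8 ways and a 64 KB cache needs 16, which is one reason large first-level caches tend to be highly associative.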
Exception handling in a pipeline machine
Exception Handling
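[Figure: pipeline PC -> Inst. Mem -> Decode -> Execute -> Data Mem -> Writeback. Exception sources: PC address exception, illegal opcode, overflow, data address exceptions, and external interrupts. An exception flag (Ex) and the PC travel with each instruction through the D, E, and M pipeline registers to the commit point, where Cause and EPC are written, the handler PC is selected, the F, D, and E stages are killed, and writeback is killed]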
1. An instruction may cause multiple exceptions; which one should we process? The one from the earliest stage.
2. When multiple instructions are causing exceptions, which one should we process first? The one from the oldest instruction.
Interrupt processing
Internal interrupts can happen at any stage but cause a redirection only at Commit
External interrupts are considered only at Commit
If an instruction causes an interrupt, then the external interrupt, if present, is given priority and the instruction is executed again
Some instructions, like Store, cannot be undone once launched. So an instruction is considered to have completed before an external interrupt is taken
Exception Handling
When instruction x in stage i raises an exception, its cause is recorded and passed down the pipeline
For a given instruction, exceptions from the later stages of the pipeline do not override cause of exception from the earlier stages
If an exception is present at commit: Cause and EPC registers are set, and pc is redirected to the handler PC
Epoch mechanism takes care of redirecting the pc
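The recording rule can be sketched in a few lines. This is an illustration only, with hypothetical names (`InFlight`, `record_exception`, `commit`) and a dictionary standing in for the Cause and EPC registers; it does not model the epoch mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InFlight:
    pc: int
    cause: Optional[str] = None   # exception cause carried down the pipeline


def record_exception(inst, cause):
    """Record a cause unless an earlier stage already recorded one."""
    if inst.cause is None:
        inst.cause = cause
    return inst


def commit(inst, csrs, handler_pc):
    """Return the redirect target if the instruction carries an exception, else None."""
    if inst.cause is not None:
        csrs["cause"] = inst.cause   # Cause register
        csrs["epc"] = inst.pc        # EPC register
        return handler_pc            # pc is redirected to the handler
    return None
```

An instruction that raises an illegal-opcode exception at Decode and would also raise a data address exception at Memory therefore reaches Commit with the Decode cause.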
Killing vs Poisoning
[Figure: pipeline with PC, Next Addr Pred, Inst Memory, Decode, Register File, scoreboard, Execute, Data Memory, and Epoch, connected by the FIFOs f12f2, f2d, d2e, e2m, and m2c. External interrupts are considered at Commit]
Killing: wrong path insts are dropped
Poisoning: wrong path insts are poisoned
This affects whether an instruction is removed from sb in case of an interrupt
Interrupt processing at Execute
Incoming Interrupt?
  no:  -if (mem type) issue Ld/St
       -if (mispred) redirect
       -pass eInst to M stage
  yes: -pass eInst to M stage unmodified
eInst will contain information about any newly detected interrupts at Execute
Interrupt processing at Memory stage
Incoming Interrupt?
  yes: -pass eInst to Commit unmodified
  no:  Memory Interrupt?
         no:  -pass eInst with modified data to Commit
         yes: -pass new Cause to Commit
Interrupt processing at Commit
External Interrupt?
  no:  Incoming interrupt?
         no:  -commit -sb.rm
         yes: EPC <= pc; causeR <= inCause; if (inCause after Reg Fetch) sb.rm; mode <= privilege; Redirect
  yes: Incoming interrupt?
         no:  commit; sb.rm; EPC <= ppc; causeR <= Ext; mode <= privilege; Redirect
         yes: EPC <= pc; causeR <= Ext; if (inCause after Reg Fetch) sb.rm; mode <= privilege; Redirect
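The four outcomes at Commit can be written as one decision function. The sketch below returns the actions as strings rather than modeling the processor state; the function name and parameter names are hypothetical, and the action text follows the slide.

```python
def commit_actions(external, incoming, cause_after_reg_fetch=False):
    """List the Commit-stage actions for one instruction.

    external: an external interrupt is pending.
    incoming: the instruction arrived carrying an interrupt cause (inCause).
    cause_after_reg_fetch: inCause was raised after Reg Fetch, so the
        instruction holds a scoreboard entry that must be removed.
    """
    if not external and not incoming:
        return ["commit", "sb.rm"]
    actions = []
    if incoming:
        # The instruction is not committed; it restarts from its own pc.
        actions.append("EPC <= pc")
        actions.append("causeR <= Ext" if external else "causeR <= inCause")
        if cause_after_reg_fetch:
            actions.append("sb.rm")
    else:
        # External interrupt only: finish the instruction, resume at the next pc.
        actions += ["commit", "sb.rm", "EPC <= ppc", "causeR <= Ext"]
    actions += ["mode <= privilege", "Redirect"]
    return actions
```

Note that when both kinds of interrupt are present the external one is recorded in causeR, and EPC points at the interrupted instruction so that it is executed again afterwards.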
Final comment
There is generally a lot of machinery associated with a plethora of exceptions in ISAs
Precise exceptions are difficult to implement correctly in pipelined machines
Performance is usually not the issue and therefore sometimes exceptions are implemented using microcode
RISC-V Virtual Memory
Privileged ISA v. 1.9.1
RISC-V Privilege Levels
Separation between low-level access to the hardware and high-level user programs
Machine-mode (M) – all addresses are physical addresses; has access to all addresses, including memory-mapped IO devices
Supervisor-mode (S) – addresses are typically virtual addresses; can switch the page table in use (sptbr)
User-mode (U) – addresses are virtual; access to devices only through system calls
RISC-V Memory Maps
Machine-Mode Physical Addresses: DRAM, Boot ROM, MMIO, Debug Unit
  Only part of the address space is DRAM
User-Mode Virtual Addresses: DRAM
  Demand Paging makes the entire address space look like DRAM
RISC-V Paged Virtual Memory
Different modes for different systems:
Sv32: 32-bit VA, 34-bit PA
- 4 GB virtual address space
- 16 GB physical address space
- 2-level page table
Sv39: 39-bit VA, 50-bit PA
- Requires RV64, 64-bit ISA
- 512 GB virtual address space
- 1 PB physical address space
- 3-level page table
Sv32 Addresses
Virtual Addresses:
  VPN[1] (10 bits) | VPN[0] (10 bits) | page offset (12 bits)
  VPN[1]: 1st level Page Table Index
  VPN[0]: 2nd level Page Table Index
Physical Addresses:
  PPN[1] (12 bits) | PPN[0] (10 bits) | page offset (12 bits)
  PPN[1] and PPN[0] both come from the 2nd level Page Table Entry
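The field boundaries translate directly into shifts and masks. A minimal sketch, assuming the Sv32 layout of a 12-bit page offset, two 10-bit VPN fields, and a 12-bit plus 10-bit PPN; the function names are hypothetical.

```python
def split_sv32_va(va):
    """Split a 32-bit Sv32 virtual address into (VPN[1], VPN[0], page offset)."""
    offset = va & 0xFFF          # low 12 bits
    vpn0 = (va >> 12) & 0x3FF    # next 10 bits: 2nd level page table index
    vpn1 = (va >> 22) & 0x3FF    # top 10 bits: 1st level page table index
    return vpn1, vpn0, offset


def make_sv32_pa(ppn1, ppn0, offset):
    """Assemble a 34-bit physical address from PPN[1] (12 bits), PPN[0] (10 bits), offset."""
    return (ppn1 << 22) | (ppn0 << 12) | offset


vpn1, vpn0, offset = split_sv32_va(0x80402ABC)
largest_pa = make_sv32_pa(0xFFF, 0x3FF, 0xFFF)
```

The largest address `make_sv32_pa` can produce is 2^34 - 1, matching the 16 GB physical address space of Sv32.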
Sv32 Page Table Entries
PPN[1] (12 bits) | PPN[0] (10 bits) | SW Reserved (2 bits) | D | A | G | U | X | W | R | V
Dirty – This page has been written to
Accessed – This page has been accessed
Global – Mapping exists in all virtual address spaces
User – User-mode programs can access this page
eXecute – This page can be executed
Write – This page can be written to
Read – This page can be read from
Valid – This page is valid and in memory
Sv32 Page Table Entries
If V = 1, but X, W, R == 0, PPN[] points to the 2nd level page table
  PPN[1] (12 bits) | PPN[0] (10 bits) | SW Reserved (2 bits) | D | A | G | U | X | W | R | V
If V = 0, the page is either invalid or on disk. If it is on disk, the OS can reuse bits in the PTE to store the disk address (or part of it).
  Disk Address (26 bits) | G | U | X | W | R | 0
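A two-level walk that follows these rules can be sketched as below. It assumes the flag bits occupy PTE bits 7 down to 0 in the order D A G U X W R V, with the 22-bit PPN above the two software-reserved bits. `read_pte` is a hypothetical callback that returns the 32-bit PTE stored at a physical address; permission checks and A/D updates are left out, and a valid first-level entry that is not a pointer is deliberately not handled.

```python
# PTE flag bits 0..7, least significant first: V R W X U G A D
PTE_V, PTE_R, PTE_W, PTE_X, PTE_U, PTE_G, PTE_A, PTE_D = (1 << i for i in range(8))
PTE_SIZE = 4  # Sv32 PTEs are 32 bits wide


def pte_ppn(pte):
    """22-bit physical page number (PPN[1] and PPN[0] concatenated)."""
    return pte >> 10


def is_pointer(pte):
    """Valid PTE with X, W, R all zero: it points to the next-level page table."""
    return bool(pte & PTE_V) and not pte & (PTE_R | PTE_W | PTE_X)


def walk_sv32(va, root_ppn, read_pte):
    """Translate va through a two-level table rooted at physical page root_ppn."""
    vpn1, vpn0, offset = (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF
    pte = read_pte((root_ppn << 12) + vpn1 * PTE_SIZE)
    if not pte & PTE_V:
        raise LookupError("page fault at 1st level")
    if not is_pointer(pte):
        raise NotImplementedError("1st-level leaf entries are outside this sketch")
    pte = read_pte((pte_ppn(pte) << 12) + vpn0 * PTE_SIZE)
    if not pte & PTE_V or is_pointer(pte):
        raise LookupError("page fault at 2nd level")
    return (pte_ppn(pte) << 12) | offset
```

A dictionary from physical address to PTE value is enough to try it out, for example `walk_sv32(va, root_ppn, lambda pa: memory.get(pa, 0))`, where an absent entry reads as an invalid PTE.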
RISC-V Pipeline with VM
[Figure: pipeline PC -> Inst TLB -> Inst. Cache -> Decode -> Execute -> Data TLB -> Data Cache -> Writeback. On a miss, each TLB invokes a Page Table Walker, which performs memory accesses and returns a translation or fault]
On a fault, an exception is raised and the OS takes over
RISC-V VM Instructions
SFence.VM
  Privileged instruction to synchronize TLB translation. Ensures that stores to data cache are seen by hardware page table walker
CSRs – Control and Status Registers
  Privileged registers for processor configuration
  sptbr – Page Table Base Register
  mstatus.vm – Virtual Memory mode (e.g. Sv32)
  mstatus.mxr, mstatus.pum, mstatus.mprv – Fields for modifying privileges for memory accesses to emulate accesses at low privilege levels