Flexible Timing Simulation of RISC-V Processors with Sniper
Neethu Bal Mallya¹, Cecilia Gonzalez-Alvarez², Trevor E. Carlson¹
¹National University of Singapore, Singapore
²Ghent University, Belgium
Outline
• Need for Simulation
• Sniper Simulator Overview
• Our enhancements to Sniper
• Initial Processor Performance Analysis
• Conclusion
2/6/2018 2
Why do we need Simulation?
Performance analysis of next-generation systems
Pre-silicon software optimizations
Architecture design space exploration
Trade-offs in Simulation
[Figure: simulation approaches on a spectrum from Verilog/RTL to High-level]
Efficient MLP processor
[Figure: processor pipeline with blocks Fetch, Decode, Instr. Queue, Bypass Queue, ALUs and Caches, Writeback, Commit, and Learn Slice]
Source: T. E. Carlson et al., “Load Slice Core” [ISCA 2015]
Sniper Simulator – An Overview
• Parallel simulator based on Interval Simulation
• Models multi-/many-cores running multithreaded¹ and multi-program workloads
• Hardware validated for x86
• Flexible simulation options
¹Currently not supported for RISC-V
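As a rough illustration of the interval idea named above: execution is treated as steady dispatch at the core's width, interrupted by miss events (such as branch mispredictions and cache misses) that each add a penalty. The sketch below is a toy first-order estimate and not Sniper's implementation; the function name, the event list, and the penalty values are assumptions chosen only for illustration.

```python
import math


def interval_cycles(num_instructions, dispatch_width, miss_penalties):
    """Toy first-order interval estimate (hypothetical, not Sniper's model).

    Base time is the instruction count divided by the dispatch width;
    every miss event then adds its own penalty in cycles.
    """
    base = math.ceil(num_instructions / dispatch_width)
    return base + sum(miss_penalties)


# Hypothetical workload: 1000 instructions on a 2-wide core with two
# branch mispredictions (12 cycles each) and one long cache miss (200 cycles).
cycles = interval_cycles(1000, 2, [12, 12, 200])
ipc = 1000 / cycles
print(f"{cycles} cycles, IPC {ipc:.2f}")
```

The point of the sketch is only that the cost is computed per interval between miss events rather than per cycle, which is what makes this style of model fast.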
Sniper – Beyond Traditional Simulation
• Strong adoption in industry and academia
  • 550+ citations
  • 800+ researcher downloads
  • 64+ countries
• Actively used since 2011
• Belgium-based team
• Supports next generation Xeon Phi (KNL++)
• HiPEAC TechTransfer Award
Sniper – Key Differentiators
• Fast development time
• Enables Limit Studies
  • Branch Prediction
  • Memory Dependence Prediction
  • Shared Multi-level Cache Hierarchy
• High Performance and Scalability
  • Almost 10 MIPS
  • 1000 cores
• Average error of just 11% with HW
Sniper - Interacting with the Simulator
• Python interfaces
• SimAPI
  • Magic Instructions
  • SimRoiStart() - SimRoiEnd()
[Figure: code excerpt from $SNIPER_HOME/include/sim_api.h]
Sniper - Interacting with the Simulator
• Energy Stats
  • Run McPAT
  • Update the statistics
Sniper - Interacting with the Simulator
• Loop Tracer
[Figure: loop tracer output; axis labels: cycles, instructions]
Sniper + RISC-V ecosystem
• RISC-V
  • Open, Extensible ISA
  • Collection of related software tools
• Existing Architecture-level Software implementations
  • Functional simulators: Spike, rv8
• Many additional things
Comparison with existing solutions
                         Sniper + RISC-V                    gem5 (RISC5)      FireSim / Chisel / Verilog
Development Methodology  C++ based (SW)                     C++ based (HW)    RTL based (HW)
Dev-time                 +++                                ++                +
Sim-time                 +++                                ++                ++++/+/+
Simulation model         Cycle-level + Cycle-approximate    Cycle-level       Cycle-exact + Cycle-approximate

Flexibility: Ease-of-use / modification; Requires RTL / abstract models
Fidelity: Sophisticated models require hardware validation; Cycle-exact models derived from synthesizable RTL
Simulation Flow
[Figure: simulation flow with stages Development, Short Testing, and Final Check shown for both Sniper and RTL; RTL is additionally labeled "Validation Only"]
Sniper Architecture
[Figure: Sniper architecture]
• Sniper Frontend: Emulation / Binary Instrumentation, producing events for thread 1 … thread M
• SIFT pipes: SIFT 1 … SIFT M
• Sniper Backend
  • Decoder Library
  • Thread Scheduler
  • Core Performance Models: Core Model 1 … Core Model M
  • Cache Performance Models: L1, L2
  • NoC
How did we enhance Sniper?
1. RISC-V functional simulators (rv8 / Spike) were updated to support SIFT generation
2. Decoder Library: architecture-agnostic methods were added to implement the decoding phase of the processor
3. Core Model: parameters such as the description of ports / functional units, latencies, etc. were updated
4. Configuration files were added to resemble a BOOM processor
[Figure: Frontend, SIFT pipes, and Backend, annotated with markers 1 to 4 for the changes above]
Sniper Instruction Trace File Format (SIFT)
• Dynamic Instruction stream generated by the Frontend
  • Instruction Execution Order
  • Memory Addresses for Loads and Stores
  • Branch Directions (taken/not taken)
  • Executed/masked info for Predicated instructions
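To make the four kinds of information listed above concrete, here is a minimal sketch of what one entry in such a dynamic instruction stream could carry. It is not the SIFT encoding itself; the class, its field names, and the sample addresses are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DynamicInstruction:
    """Hypothetical trace record; the real SIFT layout is not shown here."""
    pc: int                                                  # position in execution order is the list index
    mem_addresses: List[int] = field(default_factory=list)  # load/store addresses
    is_branch: bool = False
    branch_taken: Optional[bool] = None                      # only meaningful for branches
    executed: bool = True                                    # False when a predicated instruction is masked


def summarize(trace):
    """Count executed memory accesses and taken branches in an ordered trace."""
    mem = sum(len(i.mem_addresses) for i in trace if i.executed)
    taken = sum(1 for i in trace if i.is_branch and i.branch_taken)
    return {"instructions": len(trace), "memory_accesses": mem, "taken_branches": taken}


# Made-up three-instruction trace: a load, a taken branch, a masked store.
trace = [
    DynamicInstruction(pc=0x1000, mem_addresses=[0x8000]),
    DynamicInstruction(pc=0x1004, is_branch=True, branch_taken=True),
    DynamicInstruction(pc=0x2000, mem_addresses=[0x8008], executed=False),
]
print(summarize(trace))
```

A timing backend needs exactly this kind of per-instruction information because it does not execute the program itself; it only replays what the functional frontend observed.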
How to add new Frontend?
• Instruction Instrumentation
  • Sift::Writer::InstructionCount()
  • Sift::Writer::CacheOnly()
  • Sift::Writer::Instruction() // addresses, branch direction, etc.
• Control
  • Sift::Writer::Magic()
[Figure: rv8 / Spike as the Frontend, using Sift::Writer to send SIFT through the SIFT pipes to the Backend]
How to update Backend?
• Decoder Library: $SNIPER_HOME/decoder_lib
  • 2 classes
    • Decoder
    • InstructionDecoded
• Config Files: $SNIPER_HOME/config
• Core Model: $SNIPER_HOME/common/performance_model
How to run Sniper?
./run-sniper --frontend=[pin|dr|spike|rv8|legacy] --config

[SNIPER] Start
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Sniper using SIFT/trace-driven frontend
[SNIPER] Running full application in DETAILED mode
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Enabling performance models
[SNIPER] Setting instrumentation mode to DETAILED
Trace Monitor Started
[TRACE:0] -- DONE --
[SNIPER] Disabling performance models
[SNIPER] Leaving ROI after 18.26 seconds
OUT: RUN: TraceThread
[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC
[SNIPER] Simulation speed 273.4 KIPS (273.4 KIPS / target core - 3657.1ns/instr)
[SNIPER] Setting instrumentation mode to FAST_FORWARD
[SNIPER] End
[SNIPER] Elapsed time: 18.41 seconds
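The summary figures in the run output above can be cross-checked with a few lines of Python. This is a minimal sketch: the two log lines are copied from the output, while the helper names and the regular expressions are assumptions that match only this exact wording (including the "M" suffix), not a general Sniper log parser.

```python
import re

# Lines copied from the run output shown above.
SUMMARY = "[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC"
SPEED = "[SNIPER] Simulation speed 273.4 KIPS (273.4 KIPS / target core - 3657.1ns/instr)"


def parse_summary(line):
    """Return (instructions, cycles, reported IPC) from the summary line."""
    m = re.search(r"Simulated ([\d.]+)M instructions, ([\d.]+)M cycles, ([\d.]+) IPC", line)
    if m is None:
        raise ValueError("not a summary line")
    instructions, cycles, ipc = (float(g) for g in m.groups())
    return instructions * 1e6, cycles * 1e6, ipc


def parse_kips(line):
    """Return the simulation speed in KIPS from the speed line."""
    m = re.search(r"Simulation speed ([\d.]+) KIPS", line)
    if m is None:
        raise ValueError("not a speed line")
    return float(m.group(1))


instructions, cycles, reported_ipc = parse_summary(SUMMARY)
recomputed_ipc = instructions / cycles      # instructions per simulated cycle
kips = parse_kips(SPEED)
ns_per_instr = 1e9 / (kips * 1e3)           # host time spent per simulated instruction

print(f"IPC: reported {reported_ipc}, recomputed {recomputed_ipc:.3f}")
print(f"Host time per instruction: {ns_per_instr:.1f} ns")
```

Small differences between the recomputed and the printed values are expected, because the log rounds the instruction and cycle counts.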
Experimental Setup
• Sniper multi-core simulator
  • Similar to BOOM v1 DefaultConfig
  • Dispatch width: 2, Issue width: 3, ROB: 80
  • 32KB L1s, 1MB L2
  • 2.0GHz
• SPEC CPU2006 benchmarks
  • First 5M instructions
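The core parameters listed above can be collected in one place, which also makes derived quantities such as the cycle time explicit. This is only a sketch: the class and field names are hypothetical and do not correspond to Sniper's configuration keys, and the cycle count in the usage line is a made-up example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSetup:
    """Hypothetical container for the setup on this slide (not Sniper's config format)."""
    dispatch_width: int = 2
    issue_width: int = 3
    rob_entries: int = 80
    l1_size_kb: int = 32        # size of each L1 cache
    l2_size_kb: int = 1024      # 1MB L2
    frequency_ghz: float = 2.0

    @property
    def cycle_time_ns(self):
        """Duration of one core clock cycle in nanoseconds."""
        return 1.0 / self.frequency_ghz

    def simulated_time_us(self, cycles):
        """Target time in microseconds covered by a number of simulated cycles."""
        return cycles * self.cycle_time_ns / 1000.0


setup = CoreSetup()
# Example with an arbitrary cycle count of one million.
print(setup.cycle_time_ns, "ns per cycle;", setup.simulated_time_us(1_000_000), "us simulated")
```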
Initial Processor Performance Analysis

Testcase          IPC    KIPS
470.lbm           0.15   97.899
444.namd          1      304.719
450.soplex        1.52   343.668
456.hmmer         2.71   523.41
462.libquantum    2.65   611.968
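The two columns can be turned into more tangible quantities: simulated cycles follow from the IPC, and approximate host time follows from the KIPS. The sketch below uses the table values as given and assumes that exactly 5M instructions were simulated per benchmark and that the KIPS figure covers that same region; the helper names are hypothetical.

```python
# (IPC, KIPS) per benchmark, copied from the table above.
RESULTS = {
    "470.lbm": (0.15, 97.899),
    "444.namd": (1.0, 304.719),
    "450.soplex": (1.52, 343.668),
    "456.hmmer": (2.71, 523.41),
    "462.libquantum": (2.65, 611.968),
}

INSTRUCTIONS = 5_000_000  # assumption: exactly the first 5M instructions


def simulated_cycles(ipc, instructions=INSTRUCTIONS):
    """Target cycles needed to retire the instructions at the given IPC."""
    return instructions / ipc


def estimated_seconds(kips, instructions=INSTRUCTIONS):
    """Approximate host seconds at the given simulation speed (KIPS)."""
    return instructions / (kips * 1e3)


for name, (ipc, kips) in RESULTS.items():
    print(f"{name:16s} {simulated_cycles(ipc) / 1e6:6.2f}M cycles  ~{estimated_seconds(kips):5.1f} s")
```

Low-IPC workloads also simulate more slowly here, since more target cycles have to be modeled for the same number of instructions.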
[Figure: chart labels not recoverable from the extracted text (fragments: "130.", "8713", "0.96")]
Source: Tuan Ta et al., “Simulating Multi-Core RISC-V Systems in gem5” [CARRV 2018]
Conclusion
• An infrastructure extension of Sniper
  • Sniper + RISC-V is now available
• Next steps
  • Improve the simulator features to allow for a detailed comparison with cycle-level processor implementations
• Thank you
• Download Today!
  • http://snipersim.org/w/Download
• Questions?
  • http://groups.google.com/group/snipersim