Verification
Sungho Kang
Yonsei University
CS&RSOC YONSEI UNIVERSITY
Outline
Hardware Acceleration
Emulation
Co-Verification
Formal Verification
Acceleration: Why Simulation Engine
Software simulation is difficult to speed up
Parallel processing
  Multi-processing
  Pipelining
  Array processing
Hardware implementation
Simulation engine / hardware accelerator
  Compiled
  Event driven
  Multi-processor
  Array
Acceleration: Simulation Engine Performance
Engine      | Architecture                | Max. evaluation units | Simulation algorithm          | Maximum gates                   | Announced speed
YSE (IBM)   | Multi-processor, pipelining | 256                   | Compiled                      | 1M (4-input / 1-output gates)   | 2000M gates/sec
HAL (NEC)   | Multi-processor, pipelining | 31                    | Level-controlled event driven | 1.5M                            | 300M gates/sec
LE (ZYCAD)  | Multi-processor, pipelining | 16                    | Event driven                  | 1.6M (2-input / 1-output gates) | 16M gates/sec
Acceleration: TEGAS Accelerator
[Figure: TEGAS accelerator block diagram. Labels: Control & Statistics Processor, Status Register, Instruction Memory, Command Register, Result Buffer Processor, Activity Search Processor, Evaluation Processors, Result, Flag Memory, Pin Memory, Update Processor, Host Interface Processor, Input Event, Result Buffer, Update LIFO, 672-bit-wide Simulation Processing Memory, Simulation Data Bus, Maintenance Processor, Clock/Clear, I/O Ports, To Host.]
Acceleration: TEGAS Accelerator
[Figure: functional-level block diagram. Blocks: Control and Statistics Processor, Update Pass Processor, Fault List Processor, Host Interface Buffers, Maintenance Processor and Clock Logic, Evaluation Pass Processor, Host Interface Processor, Simulation Processing Memories.]
Acceleration: TEGAS Accelerator
[Figure: accelerator Update Processor. Internal blocks: Master Controller, LIFO Address and Access Control, Simulation Processing Memory Access Controller, Descriptor Address Pipe, Support Bus. External connections: Simulation Processing Memory, Result Buffer Memory, Control and Statistics Processor, Fault List Processor, Host Interface Processor, Update Processor LIFO Memory, Evaluation Processor.]
Acceleration: TEGAS Accelerator
[Figure: accelerator Evaluation Processor. Internal blocks: Behavioral Language Processor, Behavior Processor Instruction Memory, Behavioral Buffer, Structural Evaluation Processor, Structural Processor Instruction Memory, Structural Buffer, Structural Processor Scheduler, Scheduler Memory, Pin Attribute Memory, Level 0, Support Bus. External connections: Fault List Processor, Control and Statistics Processor, Time Queue Processor, Time Queue Memory, Activity Search Processor, Simulation Processing Memory.]
Acceleration: TEGAS Accelerator
[Figure: accelerator Time Queue Processor. Internal blocks: Time Queue Processor Controller, Time Queue Search Page Activity Logic (three instances), Time Queue Memory Availability Memory, Time Queue Memory Availability Logic, Address Bus, Support Bus. External connections: Evaluation Processor, Time Queue Memory, Control and Statistics Processor, Activity Search Processor, Simulation Processing Memory.]
Acceleration: Yorktown Simulation Engine
[Figure: YSE architecture (compiled). Labels: 256 x 256 Switch, Logic Proc 0, Logic Proc 1, Logic Proc 2, ..., Logic Proc 256, Array Simulator, Bus Control, Control Proc., Host.]
Acceleration: Yorktown Simulation Engine
Circuit is partitioned across 256 processing units (PUs)
  Each PU simulates a subcircuit of up to 4K gates
  Specialized PUs for RAMs and ROMs
All PUs are synchronized by a common clock
  A PU can evaluate one gate in every clock cycle
Partitioning
  Minimize the waiting time
Control processor: interface from the host to the YSE
Acceleration: Yorktown Simulation Engine
Logic processor
  The program counter (PC) provides the index of the next gate to be evaluated (compiled)
  Signal values (0, 1, X, Z) are stored in data memory
  Up to 4 inputs for each gate
Generalized DeMorgan (GDM) code
  16 functions of 4-valued variables
  Evaluation is done by zoom table
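The compiled, table-driven evaluation described on this slide can be sketched in software. The following is a minimal illustration only, not the YSE's actual GDM encoding: the 2-bit value encoding, the gate names, the treatment of Z as X at gate inputs, and the `const0`/`const1` signals used to tie off unused inputs are all assumptions made for the example.

```python
from functools import reduce
from itertools import product

ZERO, ONE, X, Z = 0, 1, 2, 3  # assumed 2-bit encoding of the four signal values


def and2(a, b):
    """4-valued AND; a floating input (Z) is treated as unknown (X)."""
    if a == ZERO or b == ZERO:
        return ZERO
    if a == ONE and b == ONE:
        return ONE
    return X


def or2(a, b):
    """4-valued OR; a floating input (Z) is treated as unknown (X)."""
    if a == ONE or b == ONE:
        return ONE
    if a == ZERO and b == ZERO:
        return ZERO
    return X


def inv(a):
    """4-valued inversion; X and Z both invert to X."""
    return {ZERO: ONE, ONE: ZERO}.get(a, X)


def pack(values):
    """Pack four 2-bit signal values into one 8-bit table index."""
    index = 0
    for v in values:
        index = (index << 2) | v
    return index


def build_table(fn2, invert):
    """Precompute the gate output for every combination of four inputs."""
    table = [X] * 256
    for values in product((ZERO, ONE, X, Z), repeat=4):
        out = reduce(fn2, values)
        table[pack(values)] = inv(out) if invert else out
    return table


# One lookup table per gate type: evaluation becomes a single indexed read.
ZOOM = {
    "AND": build_table(and2, False),
    "NAND": build_table(and2, True),
    "OR": build_table(or2, False),
    "NOR": build_table(or2, True),
}


def run_compiled(program, memory):
    """Evaluate every gate once, in the fixed (levelized) program order."""
    for pc in range(len(program)):  # pc is the index of the next gate
        gate_type, inputs, output = program[pc]
        values = [memory[name] for name in inputs]
        memory[output] = ZOOM[gate_type][pack(values)]
    return memory


# y = NAND(a, b); z = OR(y, c). Unused inputs are tied to constants.
program = [
    ("NAND", ("a", "b", "const1", "const1"), "y"),
    ("OR", ("y", "c", "const0", "const0"), "z"),
]
memory = {"a": ONE, "b": X, "c": ONE, "const0": ZERO, "const1": ONE}
result = run_compiled(program, memory)
```

Because the gate order is fixed at compile time, no event queue is needed; the cost is that every gate is evaluated on every pass, whether or not its inputs changed.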
Acceleration: AAP-1
[Figure: AAP-1 architecture. Labels: Array Control Unit, Instruction Memory, Interface Unit, Data Buffer Memory, 256 x 256 PE Array, Host Computer; bus widths of 256 and 16 are marked.]
Acceleration: AAP-1
[Figure: AAP-1 processing element. Labels: MUX-ID1, MUX-ID2, MUX-RUT, MUX-OD, MUX-ALDS, MUX-CRY, REG-RS, REG-10, REG-C, RAM-B (32W x 1b), RAM-B (64 x 1b), LAT-A, LAT-B, LAT-S, ALU, RF-A, RF-B, DTU1, DTU2, DO1, DO2, PE level, bypass; connections to 8 neighbor PEs and to upper and lower PEs.]
Emulation: What is Emulation?
Turnkey rapid prototyping systems
  Read the user's design and automatically partition and map it to an array of FPGAs
  Enable the user to run at system level and verify with application software
  Full internal visibility for debugging: thousands of probes
  Modify the design in minutes
[Figure labels: CPU, Compiler, Design Mapping, Logic Design, Emulator, Target system]
Emulation: Bugs Found with Emulation
Functional ASIC bugs
Board/system-level bugs
Software and firmware bugs
Synthesis bugs
Bugs that require rich, real-world stimulus or high throughput to find
Bugs caused by misinterpretation of the specification
Emulation: Comparison with Co-simulation
The performance potential of a simulation accelerator is not achievable with current testbench strategies
  Speed of the testbench (workstation)
  Channel latency and bandwidth
  Frequency of communication
  Execution speed of the design under test
Emulation: Comparison with FPGA
FPGA
  High chip capacity
  Slow compilation
  Low I/O-to-gate ratio
Emulation
  Fast compile speed
  Productive debugging
  High I/O-to-gate ratio
  On-board logic analyzer
Generic FPGAs used for emulation
  Unpredictable capacity and highly variable routing delays, with poor debuggability
Emulation: Anatomy of an Emulator
[Figure: emulator hardware with the following callouts]
  Emulation modules with FPGAs and crosspoint chips
  Special memory cards for mapping very complex or deep memories
  Instrumentation cards for debug
  Inter-card connection: crossbar backplane to allow modular capacity addition
  Specialized add-on cards for cores
  Target interface hardware
  Cable for target system interface or debug
Emulation: Emulator Architecture
Hierarchical multiplexed architecture simplifies the design mapping process
[Figure labels: Automated Design Mapping, Mux Backplane, Emulation Module]
Emulation: Definition
A logic emulator is a system of
  Programmable hardware with capacity much greater than one FPGA
  Software that automatically programs the hardware according to a gate-level design representation
  Software and hardware to support operation and analysis of the emulated design as a component in real hardware
Emulation: System Overview: SW Components
Design compiler
  Netlist reader and parser: reads and parses gate-level design netlists
  Technology mapper: maps design components into optimal emulator equivalents
  System-level partitioner and placer: partitions the mapped design into boxes, boards and ultimately FPGA netlists
  System-level interconnect router: determines the programming of the interconnect hardware to complete the nets cut by the partitioner
  FPGA compiler: reads each FPGA netlist; maps, partitions, places and routes the FPGA
  Timing analysis (optional): analyzes the compiled design on the emulation hardware for speed and hold violations
Runtime download and analysis controller
Graphical user interface
Hardware diagnostics
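The relationship between the partitioner and the interconnect router can be illustrated with a small sketch: given an assignment of cells to FPGAs, the nets that span more than one FPGA are the "cut" nets the router must complete, and each one consumes an I/O pin on every FPGA it touches. The netlist representation and the function names below are hypothetical, chosen only for this example.

```python
def cut_nets(nets, assignment):
    """Return the names of nets whose cells land in more than one FPGA."""
    cut = []
    for net_name, cells in nets.items():
        fpgas = {assignment[cell] for cell in cells}
        if len(fpgas) > 1:
            cut.append(net_name)
    return sorted(cut)


def pins_per_fpga(nets, assignment):
    """Count, for each FPGA, how many cut nets need one of its I/O pins."""
    pins = {}
    for net_name in cut_nets(nets, assignment):
        for fpga in {assignment[cell] for cell in nets[net_name]}:
            pins[fpga] = pins.get(fpga, 0) + 1
    return pins


# Four gates connected in a ring, split across two FPGAs.
nets = {
    "n1": ["g1", "g2"],
    "n2": ["g2", "g3"],
    "n3": ["g3", "g4"],
    "n4": ["g1", "g4"],
}
assignment = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
print(cut_nets(nets, assignment))       # nets the interconnect router must complete
print(pins_per_fpga(nets, assignment))  # I/O pins consumed on each FPGA
```

A real partitioner would search for an assignment that keeps these pin counts within each FPGA's I/O limit; this sketch only evaluates a given assignment.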
Emulation: System Overview: HW Components
Logic emulation boards: FPGAs and interconnect chips
Memory emulation boards: RAMs, FPGAs and interconnect
System interconnect board: chips that interconnect the emulation boards
I/O connectors and pods: connect to in-circuit interfaces and external components
Instrumentation: stimulus generator, logic analyzer, vector interface
Controller: downloads configurations, operates instruments
Interface: to the host computer
Emulation: Advantages of Emulation
Emulation performance is not a function of design size
[Figure: time to verify design for a simulator, an accelerator, a cycle-based simulator and emulation; annotations: "Point of Emulation", "Deep Sub-micron Zone"]
Emulation: Disadvantages of Logic Emulation
A hardware emulation system is required
Speed is 5-10X slower than real design speed
  System emulation speeds of 1 to 4 MHz are common today
  The target system must be slowed down for emulation
Delays do not match those of the real design
  Timing-induced errors are possible, e.g. hold-time violations
  Delay-dependent functionality may not operate correctly
Emulation: Logic Emulation
FPGA-based hardware emulation
  Contains a large pool of general-purpose logic blocks
  Design preparation time and compilation time are costly
Processor-based hardware emulation
  An array of basic CPUs or simple Boolean processors that perform basic logic operations on a time-sharing basis
  The design under verification is converted into a simulation data structure, similar to that of a software simulator
  Slower than FPGA-based emulation
Co-verification: Mixed Implementation
[Figure: a behavioral specification plus constraints (cost, performance) leads to a mixed implementation. Labels in the example mixed implementation: Software, Hardware, Program, ASIC, µP, Memory, Interface, Analog interface.]
Co-verification: Co-Design
Integrated design of systems implemented using both hardware and software components
Why
  Advances in enabling technologies
    System-level specification / simulation
    High-level synthesis and CAD frameworks
  Advanced design methods are required due to increased diversity and complexity
  Cost and performance of HW/SW systems should be optimized for market competitiveness
  Time-to-market is vital
  Exploiting concurrency among design threads and tools will result in significant gains
Difficulties
  The target moves (e.g. in size and complexity)
  The tolerance level for errors decreases
Co-verification: Co-Design
Integration of HW and SW design techniques
  The HW and SW components are interdependent
  HW and SW are typically described and designed using different methodologies, languages and tools
Advantages
  Acceleration of the design process
  A lengthy system integration and test phase can be avoided
  Dynamic HW/SW trade-offs in the design process
Co-design methodologies differ widely
  Widely differing assumptions
  Interface/communication techniques
  Design goals
Example types
  A microprocessor and its associated glue logic
  A microprocessor and a special-purpose computing engine
Co-verification: Co-Simulation
Simulation of heterogeneous systems whose HW and SW components interface with each other
Roles of co-simulation
  Verification of the system specification before system synthesis
  Verification of the mixed system after system synthesis and integration
  System performance estimation for system partitioning
Issues of HW-SW co-simulation
  Timing accuracy
  Processor model
  Performance
  Interface transparency
  Transition to co-synthesis
  Integrated user interface and internal representation
Co-verification: Reason for Co-Simulation
Processor transition to embedded core
  No existing hardware
Increased software content
  Software controls more functions
  Development and debug times increased
More complex hardware interfaces
  Processor interactions with DSP
  Processor interactions with increasingly complex hardware
Co-verification: Reason for Co-Simulation
Design problem found
  Something is wrong due to a logic error
  The problem is visible, but not apparent in hardware
  Found while running software
Brings SW and HW together
Co-verification: Co-Design Problems
Instruction set processors
  Codesign to design well-balanced, long-lasting processors
  Instruction set selection
  Cache design
  Pipeline control
    HW mechanism: flush the pipelines
    SW solution: reorder instructions or insert no-operations
ASIP
  SW: retargetable code generation for the ASIP data path
  HW: library binding
  High performance
  Desirable programmability
  Lower unit cost than ASIC
Embedded systems and controllers
  Real-time systems with peripheral devices (sensors, actuators)
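The software solution to pipeline control mentioned on this slide (inserting no-operations) can be sketched as follows. This is a deliberately simplified model, not a real compiler pass: the instruction tuple format `(op, dest, sources)` and the single `gap` parameter, meaning the number of instruction slots that must separate a register write from a dependent read, are assumptions made for the example.

```python
NOP = ("nop", None, ())


def insert_nops(program, gap):
    """Insert NOPs so that at least `gap` slots separate a write from a dependent read."""
    scheduled = []
    for op, dest, sources in program:
        needed = 0
        # Look back over the last `gap` slots for a producer of one of our sources.
        for back, (_, prev_dest, _) in enumerate(reversed(scheduled), start=1):
            if back > gap:
                break
            if prev_dest is not None and prev_dest in sources:
                needed = max(needed, gap - back + 1)
        scheduled.extend([NOP] * needed)
        scheduled.append((op, dest, sources))
    return scheduled


program = [
    ("load", "r1", ("r0",)),
    ("add", "r2", ("r1", "r3")),  # depends on the load result
    ("sub", "r4", ("r5", "r6")),  # independent of both
]
for instr in insert_nops(program, gap=2):
    print(instr)
```

In this example the independent `sub` could instead be moved between `load` and `add`, which is the reordering alternative: it fills one of the two slots with useful work rather than a NOP.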
Co-verification: Co-Design Methodologies
Automation of conventional codesign
  Clear early binding
  Easy design decisions
[Figure: flow chart with the following boxes]
  Constraints/requirements analysis
  System specification
  HW/SW partitioning
  HW synthesis, SW generation, interface synthesis
  HW/SW cosimulation
  Integrated system evaluation/verification
Co-verification: Co-Design Methodologies
Model-based codesign
  Late partitioning/binding after refining
  Easy to handle design changes
  Component reuse
  Modular hierarchical models
[Figure: flow chart with the following boxes]
  Constraints/requirements analysis
  System specification
  System modeling
  Validation/simulation
  Verified model (desired granularity)
  Technology assignment (into HW/SW/interface components)
  Model base
  Refinement (decompose into submodels)
Co-verification: HW-SW Partitioning
Performance requirements
  Some functions may need to be implemented in HW
  The overhead of synchronization and data transfer should be considered
Implementation costs
  HW can be shared
Modifiability
  SW can be easily changed
Nature of computation
  Some functions may have an affinity for either HW or SW
  Degree of data parallelism
Co-verification: HW-SW Partitioning
Software-preferred
  Tasks that call the OS often
Hardware-preferred
  Arithmetic operations
  High degree of data parallelism
  Multiple threads of control
  Customized memory architecture
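The partitioning criteria on these slides (performance requirement, implementation cost, and the overhead of data transfer) can be combined in a simple greedy heuristic. The sketch below is one possible illustration, not a method taken from the slides: the task fields (`sw_time`, `hw_time`, `comm`, `hw_area`), the purely sequential timing model, and the "time saved per unit area" ranking are all assumptions.

```python
def partition(tasks, deadline):
    """Greedy HW/SW partitioning: move tasks to hardware until the deadline is met."""
    in_hw = set()

    def total_time():
        # Sequential model: a hardware task pays its run time plus transfer overhead.
        return sum(
            t["hw_time"] + t["comm"] if name in in_hw else t["sw_time"]
            for name, t in tasks.items()
        )

    while total_time() > deadline:
        best, best_ratio = None, 0.0
        for name, t in tasks.items():
            if name in in_hw:
                continue
            saving = t["sw_time"] - (t["hw_time"] + t["comm"])
            ratio = saving / t["hw_area"]  # time saved per unit of hardware area
            if saving > 0 and ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            return None  # this heuristic cannot meet the deadline
        in_hw.add(best)

    area = sum(tasks[name]["hw_area"] for name in in_hw)
    return in_hw, area, total_time()


tasks = {
    "fir": {"sw_time": 50, "hw_time": 5, "comm": 5, "hw_area": 40},
    "ui": {"sw_time": 10, "hw_time": 8, "comm": 6, "hw_area": 30},
    "fft": {"sw_time": 40, "hw_time": 10, "comm": 10, "hw_area": 100},
}
print(partition(tasks, deadline=70))
```

Note that the hypothetical `ui` task is never moved to hardware: its transfer overhead exceeds the time it would save, which mirrors the slide's point that communication overhead must be considered.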
Co-verification: Performance Estimation
At a low abstraction level
  Easy and accurate
  Long iteration time
At a higher level of abstraction
  Necessary to reduce the exploration time
  Currently quick and dirty
  Goal: quick and accurate
Performance and cost estimation is important for HW-SW partitioning and for HW/SW synthesis/optimization
Formal: Complexity Trends
Rapidly growing design size
  Doubling of million-gate designs
  50% reduction in designs under 500K gates
  3X reduction in designs under 100K gates
Shrinking process geometries
  Nearly 10X reduction in 0.5 micron designs
  Most designs going to 0.35 micron or below
Architectural complexity
  Before: simple instruction pipelines, single functional units, simple stalls/holds, simple caches/TLBs, protocols at the pins
  Now: deep instruction pipelines, super-scalar design, multiple (pipelined) functional units, instruction re-circulation, speculative execution, complex stalls/holds, complex protocols on chip and at the pins, integration of external IP
Formal: Informal Verification
Simulation
  Compare against an executable version of the specification, also known as the golden model
  Simulate in software
  Simulate in hardware
Test cases
  Hand-written by the designers
  Randomly generated test vectors
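Simulation against a golden model with randomly generated test vectors can be sketched as below. The 4-bit adder, its bit-level "design under test" and the function names are hypothetical examples chosen for illustration; a real flow would drive an HDL simulator rather than Python functions.

```python
import random


def golden_add(a, b):
    """Executable specification: 4-bit addition producing a 5-bit result."""
    return (a + b) & 0x1F


def ripple_adder(a, b):
    """Design under test: bit-level ripple-carry adder."""
    carry, result = 0, 0
    for i in range(4):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result | (carry << 4)


def buggy_adder(a, b):
    """A faulty design that drops the carry-out bit."""
    return (a + b) & 0x0F


def random_test(dut, golden, n_vectors, seed=0):
    """Apply random vectors to both models and collect the mismatching inputs."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(n_vectors):
        a, b = rng.randrange(16), rng.randrange(16)
        if dut(a, b) != golden(a, b):
            mismatches.append((a, b))
    return mismatches


print(random_test(ripple_adder, golden_add, 200))          # expected to be empty
print(len(random_test(buggy_adder, golden_add, 200)) > 0)  # the bug is exposed
```

An empty mismatch list only means that no error was found among the vectors tried; it is not a proof of correctness.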
Formal: Trends: Verification Problem
The number of test vectors is proportional to design complexity
  Simulation cannot guarantee correctness
  Confidence is based on the proportion of the design space explored
Hardware and software systems are becoming increasingly complex
  The number of basic components is growing exponentially
  Designs are increasingly aggressive
Formal: Why Formally Specify?
Showing that a property holds globally of the entire system
  Want to characterize the correctness condition I can promise the user of my system
  Want to show this property is really a system invariant
Error handling
  Want to specify what happens if an error occurs
  Want to specify that the right thing happens if an error occurs
  Want to make sure this error never occurs
Completeness
  Want to make sure that I’ve covered all the cases, including error cases, for this protocol
  Would like to know that this language I’ve designed is computationally complete
Formal: What to Specify?
Correctness conditions
Invariants
Observable behaviors
Properties of state entities
Formal: Formal Specification
A concise description of the behavior and properties of a system, written in a mathematically based language
Specifies what a system is supposed to do, as abstractly as possible
Eliminates distracting detail and provides a general description resistant to future system modification
Formal: Formal Verification
Mathematically proves correctness
  Shows that the specification satisfies the required properties
  Shows that the implementation satisfies the properties required by the specification
Higher performance
  Use symmetry and decomposition
  Collapse sets of similar behaviors into a single case
Formal: Definitions
Formal verification
  Use of mathematics to automate logic verification
  Formal, mathematical proof that circuit = specification
Specification
  A trusted description of any portion of a circuit’s behavior
Equivalence checking
  Formal tool that proves whether a circuit description is functionally equal to a reference circuit description
Model checking
  Formal tool that proves whether a particular functional design detail operates as specified
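For small combinational circuits, equivalence checking can be illustrated by exhaustively comparing two descriptions over all input combinations. This is only a sketch of the idea: production equivalence checkers rely on techniques such as BDDs or SAT solving rather than enumeration, and the three example functions below are hypothetical.

```python
from itertools import product


def equivalent(f, g, n_inputs):
    """Return (True, None) if f and g agree on every input, else (False, counterexample)."""
    for bits in product((0, 1), repeat=n_inputs):
        if f(*bits) != g(*bits):
            return False, bits
    return True, None


def reference(a, b, c):
    """Reference description: a AND (b OR c)."""
    return a & (b | c)


def optimized(a, b, c):
    """Restructured description: (a AND b) OR (a AND c)."""
    return (a & b) | (a & c)


def faulty(a, b, c):
    """Incorrect restructuring: (a AND b) OR c."""
    return (a & b) | c


print(equivalent(reference, optimized, 3))
print(equivalent(reference, faulty, 3))
```

Unlike random simulation, a positive answer here is a proof for the given functions, and a negative answer comes with a concrete counterexample input.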
Formal: Definition of Formal Verification
Formal verification is proving that the functions in the specification are the same as the functions in the implementation
  The proof is done mathematically, not experimentally
Functional correctness
  Any piece of hardware is functionally correct if we can prove that its implementation realizes the specification
Formal: Correctness Preserving Transformation
Useful when integrating verification and automated synthesis in a cooperative approach to correct design
Takes a correct implementation of a specification and derives another correct implementation
Generates design alternatives to improve the quality of some original solution
Proves the equivalence of two hardware descriptions
Formal: Disadvantages of Formal Verification
Often difficult to apply
Tends to take too long
Verifies only what is specified
Gap between abstraction and real implementation
  One can only verify what is specified
  Discrepancies between the ideal and real worlds
  Assumptions about the environment
Formal: Verification Paradigms
Equivalence checking
  Considers an implementation and a specification
  The specification and implementation are equivalent in a suitable sense
  Requires a complete specification
Property verification (model checking)
  The specification consists of a set of properties that are to be proved about the implementation model
  Assumes that the implementation is functionally correct for the typical cases and that the goal is to discover the infrequently occurring corner cases that result in deadlocks, access conflicts, etc.
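Property verification can be illustrated with an explicit-state search: enumerate every reachable state of a small model and check that an invariant holds in each one, returning a trace to the violating state if it does not. The two-client model, its state names and the `check_other` switch are hypothetical; real model checkers use symbolic representations to cope with much larger state spaces.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Breadth-first search of reachable states; return a counterexample trace or None."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def make_successors(check_other):
    """Two clients cycling idle -> wait -> crit -> idle, one moving per step."""
    def successors(state):
        result = []
        for i in (0, 1):
            me, other = state[i], state[1 - i]
            if me == "idle":
                new = "wait"
            elif me == "wait":
                if check_other and other == "crit":
                    continue  # blocked until the other client leaves
                new = "crit"
            else:
                new = "idle"
            nxt = list(state)
            nxt[i] = new
            result.append(tuple(nxt))
        return result
    return successors


def mutual_exclusion(state):
    """Invariant: both clients are never in the critical state together."""
    return state != ("crit", "crit")


start = ("idle", "idle")
print(check_invariant(start, make_successors(True), mutual_exclusion))   # None: holds
print(check_invariant(start, make_successors(False), mutual_exclusion))  # violating trace
```

The faulty variant is correct for typical single-client runs; the access conflict only appears in the corner case where both clients wait at the same time, which is the kind of scenario this paradigm targets.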
Formal: Temporal Logic
A special type of modal logic, a branch of philosophy
Provides a formal system for qualitatively describing and reasoning about how the truth values of assertions change over time
A useful formalism for specifying and verifying the correctness of computer programs
Formal: Temporal Logic
Considers time and temporal evolution
Introduces the concepts of possibility and necessity in the future
Can capture time and dynamic behaviors while avoiding the introduction of an explicit time function
Races and hazards can be represented
Includes all the usual connectives and adds some typical operators
  Henceforth □
  Eventually ◊
  Next ○
  Until U
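The four temporal operators can be given a concrete reading by evaluating them over a recorded trace of states. The sketch below assumes a finite-trace interpretation, which is a simplification (temporal logic is usually defined over infinite behaviors), and the request/acknowledge trace is a hypothetical example.

```python
def henceforth(p, trace, i=0):
    """Henceforth p: p holds at every position from i to the end of the trace."""
    return all(p(s) for s in trace[i:])


def eventually(p, trace, i=0):
    """Eventually p: p holds at some position from i onward."""
    return any(p(s) for s in trace[i:])


def next_(p, trace, i=0):
    """Next p: p holds at position i + 1 (false if the trace ends at i)."""
    return i + 1 < len(trace) and p(trace[i + 1])


def until(p, q, trace, i=0):
    """p Until q: q holds at some position k >= i and p holds at every position before k."""
    for s in trace[i:]:
        if q(s):
            return True
        if not p(s):
            return False
    return False


trace = [
    {"req": 1, "ack": 0},
    {"req": 1, "ack": 0},
    {"req": 0, "ack": 1},
    {"req": 0, "ack": 0},
]


def req(s):
    return s["req"] == 1


def ack(s):
    return s["ack"] == 1


print(eventually(ack, trace))                                   # the request is acknowledged
print(until(req, ack, trace))                                   # req stays high until ack
print(henceforth(lambda s: not (req(s) and ack(s)), trace))     # never both at once
```

No explicit time variable appears in the properties themselves; ordering along the trace carries the temporal meaning.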
Formal: Theorem Proving
A theorem prover is based on a logic: a formal language for stating mathematical propositions
A logic is equipped with a proof system: a set of axioms and inference rules that make it possible to reason in a step-by-step manner from premises to conclusions
Depending on how powerful the logic is, the proof system may or may not be complete
Most theorem provers are interactive, requiring guidance from the user in order to generate proofs
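As a small illustration of step-by-step reasoning from premises to a conclusion, the following Lean 4 snippet states one inference rule and a two-step derivation. Lean is used here only as an example of an interactive prover; the slides do not name a specific tool, and the theorem names are arbitrary.

```lean
-- Inference rule (modus ponens): from p and p → q, conclude q.
theorem modus_ponens (p q : Prop) (hp : p) (hpq : p → q) : q :=
  hpq hp

-- A two-step derivation: from p, p → q and q → r, conclude r.
theorem chain (p q r : Prop) (hp : p) (hpq : p → q) (hqr : q → r) : r :=
  hqr (hpq hp)
```

Each proof term is checked mechanically by the prover, so an accepted theorem follows from its premises by the rules of the logic alone.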
Formal: Combining Formal and Informal
Goals
  Should fit into the existing verification methodology
  Robust: graceful degradation with increased design complexity
  Better coverage than simulation/FV
  Highly automated
Hybrid
  Use formal techniques and exhaustive simulation
  Try to balance proof power and computational efficiency
  Enumeration on a restricted set of variables