[email protected] www.verificationacademy.com
Metrics in SoC Verification What Can Metrics Tell Us?
Andreas Meyer Verification Architect
WHAT CAN METRICS TELL US?
We are interested in a broad set of metrics that give us insight into the entire verification flow:
- The build, simulation, and regression processes
- Various aspects of the overall project
Yet metrics must be actionable; otherwise the process of measuring and storing metrics data wastes project resources.
Example of Various Verification Metrics: Process and Focused Areas
Each process or focused area is listed with the attributes and information its associated metrics can provide:
- Design: abstraction level; simulated performance; list of instantiated blocks (and versions)
- Stimulus: source of stimulus; type of stimulus (CR, firmware, graph, legacy, etc.)
- Checking: source of checkers; results of checkers; checker abstraction levels
- IP: interface activity; key internal states
- Coverage: categories of coverage; RTL/stimulus/checker reference model; abstraction level of coverage
- Build: source and rev of files; initial configuration used
- Run: simulator/emulator; host machine info (memory, disk image distance, etc.); simulation performance; revision of tools
- Debugging: area of failure; commonality of cases where many tests report the same failure
- Regress: which simulations; errors found; errors re-found (i.e., wasted simulation); improvements in coverage results
- Bug Status: open bugs; bug discovery info (stimulus, abstraction level, checker); metrics used to isolate a bug; bug closure information (sim time, engineer time, number of runs)
Correlating multiple metrics
Real value emerges only when multiple metrics are correlated during analysis
For example, for a newly built revision of a system we might want to answer the question:
What coverage was hit for a specific type of applied stimulus, at a particular level of abstraction of the design, and for a particular revision of the firmware?
Correlating multiple metrics gives us a clearer view of the circumstances that allowed us to hit a specific coverage item
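The question above can be sketched as a simple query over per-run metric records. This is a minimal illustration, not a real metrics database: the record fields, run names, and coverage-item names are all hypothetical.

```python
# Hypothetical metrics store: one flat record per regression run,
# combining stimulus type, design abstraction, firmware revision,
# and the set of coverage items that run hit.
runs = [
    {"test": "t1", "stimulus": "constrained-random", "abstraction": "RTL",
     "fw_rev": "2.1", "coverage_hits": {"cov.cache.evict", "cov.fabric.retry"}},
    {"test": "t2", "stimulus": "firmware", "abstraction": "RTL",
     "fw_rev": "2.1", "coverage_hits": {"cov.cache.evict"}},
    {"test": "t3", "stimulus": "firmware", "abstraction": "TLM",
     "fw_rev": "2.0", "coverage_hits": {"cov.fabric.retry"}},
]

def coverage_for(runs, stimulus, abstraction, fw_rev):
    """Union of coverage items hit under one specific combination of metrics."""
    hit = set()
    for r in runs:
        if (r["stimulus"], r["abstraction"], r["fw_rev"]) == (stimulus, abstraction, fw_rev):
            hit |= r["coverage_hits"]
    return hit

# Coverage hit by firmware stimulus, RTL abstraction, firmware rev 2.1:
coverage_for(runs, "firmware", "RTL", "2.1")  # {'cov.cache.evict'}
```

In practice the same correlation would run against a coverage database rather than in-memory records, but the join across stimulus, abstraction, and revision is the essential step.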
Metrics as part of the build process
In large SoC environments with significant code churn, metrics can give us insight into the build process
For example, we might be interested in knowing:
- Which IP blocks were used during the build process?
- Where did each IP block originate?
- Which version number was associated with each IP?
- What level of abstraction was used for the build?
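The four questions above amount to capturing a build manifest as non-event-based metrics. A minimal sketch, with hypothetical block names, sources, and field choices:

```python
# Hypothetical build manifest: one record per instantiated IP block,
# answering where it came from, which version, and at what abstraction.
from dataclasses import dataclass

@dataclass(frozen=True)
class IpRecord:
    name: str
    source: str       # e.g. internal repo vs. third-party vendor
    version: str
    abstraction: str  # e.g. "RTL" or "TLM"

def build_manifest(records):
    """Non-event-based build metrics: one entry per instantiated IP block."""
    return {r.name: {"source": r.source, "version": r.version,
                     "abstraction": r.abstraction}
            for r in records}

manifest = build_manifest([
    IpRecord("coherent_cache", "internal", "3.4", "RTL"),
    IpRecord("network_switch", "vendor_x", "1.2", "TLM"),
])
```

Storing this manifest alongside each build lets later queries filter event-based metrics (coverage, bugs) by IP version or abstraction level.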
Event-based and non-event-based metrics
Metrics need not count multiple events to be useful
Non-event-based build metrics can be used to qualify queries about specific code
Correlating event-based with non-event-based metrics may be useful in checking the completeness of overall verification
For instance, we could determine the coverage metrics associated with a specific IP version, or identify what configuration or randomization was done during the build
Metrics as part of the simulation process
Metrics as part of simulation enable us to determine:
- What happened during the simulation process
- Which pieces of the simulation environment were used
- How the pieces played together
Aspects of the simulation process that can benefit from metrics include:
- Stimulus sources
- Checking methods
- Coverage metrics
- Domain-specific performance
- Simulation performance
- Simulation configuration
Metrics and stimulus sources
Large SoC designs are likely to use a number of stimulus sources within a single simulation
Metrics can help us:
- Measure which sources were used, and identify the type and frequency of traffic generated by each source
- Understand how the system was tested
- Measure the productivity of various stimulus methods
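Measuring the type and frequency of traffic per stimulus source can be as simple as counting tagged transactions. A small sketch with hypothetical source and traffic-type names:

```python
# Hypothetical per-transaction log emitted during one simulation:
# (stimulus_source, traffic_type) pairs.
from collections import Counter

traffic_log = [
    ("cr_gen", "mem_write"),    # constrained-random generator
    ("cr_gen", "mem_read"),
    ("firmware", "mem_read"),   # firmware-driven stimulus
    ("cr_gen", "mem_write"),
]

# How much traffic each source generated:
by_source = Counter(src for src, _ in traffic_log)

# Frequency of each (source, type) combination:
by_type = Counter(traffic_log)
```

A source that appears with near-zero counts is an immediate flag that part of the intended stimulus never ran.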
Metrics and checking methods
Large SoC designs are also likely to use a number of different checking methods
Metrics can help us:
- Ensure that the desired checkers are in place and are receiving traffic
- Identify the number and types of checks performed
Checking metrics, when correlated with other metrics, provide a deeper understanding of the simulation environment's effectiveness and productivity
Metrics and coverage metrics
Coverage metrics are useful for identifying stimulus holes
Yet they cannot answer the question: Did a specific test activate an event and then propagate it to a specific checker?
[Diagram: example SoC with processor cores, coherent caches, IP cores, a memory-intensive IP, a network switch, and RAM, annotated with metric sources such as memory operations, TLB tables, cache state, fabric ops, open pages, RAM op queues, and memory accesses]
Metrics and domain-specific performance
Understanding performance characteristics can be critical
Metrics can be used to measure domain-specific performance, which can then be used to calculate system performance
Metrics and simulation performance
Metrics can help us track performance and provide an early detection mechanism to identify issues
By correlating performance with code revisions, it may be possible to link performance degradations to the introduction of specific blocks of code
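Correlating performance with code revisions can be sketched as a simple scan over a revision history. The revision names, numbers, and 10% threshold below are hypothetical:

```python
# Hypothetical nightly history: (code_revision, simulated cycles per second).
history = [("r100", 950.0), ("r101", 940.0), ("r102", 610.0), ("r103", 605.0)]

def flag_degradations(history, threshold=0.10):
    """Return revisions whose simulation performance dropped by more than
    `threshold` relative to the immediately preceding revision."""
    flagged = []
    for (_, prev_perf), (rev, perf) in zip(history, history[1:]):
        if perf < prev_perf * (1.0 - threshold):
            flagged.append(rev)
    return flagged

flag_degradations(history)  # ['r102']
```

The flagged revision points directly at the check-ins to inspect for the code that introduced the slowdown.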
Metrics and simulation configuration
Each simulation may have run-specific configurations that can affect other aspects of the verification process
Reporting each configuration option as a part of metrics allows configuration changes to be correlated with other metrics such as coverage, simulation performance and bug statistics
Probably the most obvious configuration option is the random seed generated for each run
That seed or other parameters may be used to select tests or change the configuration of the simulation
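Recording the seed together with every derived configuration choice makes a run reproducible and correlatable. A minimal sketch; the configuration parameters shown are hypothetical:

```python
import random

# Hypothetical: derive every run-specific configuration choice from one
# recorded seed, so the run can be replayed and its config correlated
# with coverage, performance, or bug metrics.
def configure_run(seed=None):
    if seed is None:
        seed = random.randrange(2**32)  # generated seed for this run
    rng = random.Random(seed)
    return {
        "seed": seed,  # report the seed itself as a metric
        "cache_size_kb": rng.choice([256, 512, 1024]),
        "enable_prefetch": rng.random() < 0.5,
    }

# Replaying the same seed reproduces the same configuration.
assert configure_run(seed=42) == configure_run(seed=42)
```

Because the seed is part of the reported metrics, a failing run's exact configuration can always be reconstructed for debug.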
Metrics as part of the regression process
Metrics help ensure regressions are efficient and productive
Two types of regression metrics are often tracked:
- Information on the regression run
- Information on the simulation farm
Regression run information may include test names, frequency of tests, random seeds, configuration choices, and so on
By looking at the tests that provide the most coverage or are most effective at uncovering bugs, test run order or test frequency can be adjusted to increase verification productivity
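Adjusting test order by coverage contribution can be sketched as a greedy ranking: repeatedly pick the test that adds the most not-yet-seen coverage. The test names and coverage items are hypothetical:

```python
# Hypothetical sketch: order regression tests greedily by marginal coverage
# so the most productive tests run first.
def order_by_coverage(test_coverage):
    """test_coverage: {test_name: set of coverage items that test hits}.
    Returns test names ordered by how much new coverage each adds."""
    remaining = dict(test_coverage)
    covered, order = set(), []
    while remaining:
        best = max(remaining, key=lambda t: len(remaining[t] - covered))
        if not remaining[best] - covered:
            # No test adds new coverage; append the rest in a stable order.
            order.extend(sorted(remaining))
            break
        covered |= remaining.pop(best)
        order.append(best)
    return order

tests = {"a": {1, 2}, "b": {2, 3, 4}, "c": {4}}
order_by_coverage(tests)  # ['b', 'a', 'c']
```

Running this ordering nightly against the latest coverage results pushes redundant tests (those re-finding the same errors) to the back of the queue.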
Metrics as part of the overall project
Useful non-simulation-based metrics related to bugs
Understanding bug status can provide insight into the simulation, testing, coverage and overall progress
Knowing which simulation reported the bug provides information about the effectiveness of stimulus, checking, versions and abstraction level used to find the bug
Knowing when the bug was closed can provide information about the project progress and insight into when a particular test does not need to be run in regressions
Additional project-wide metrics
Examples of other external measures of project-wide progress:
- Code check-in rates (stability, maturity, churn)
- Team status metrics (are teams getting everything they need?)
- Activity by geographic location
- Simulation farm efficiency (bug rates per cycle)
Determining which metrics are useful is likely to depend on many aspects of the project, the expected lifespan of the IP being developed, and corporate culture
Summary Metrics Considerations
Metrics provide visibility into all aspects of build, stimulus, checking, coverage, and regression
Metrics must be actionable
Insight into the project generally comes from correlating measurements
Measurements from any part of a project may be useful, not just those from simulation