HARDWARE EMULATION OF SEQUENTIAL ATPG-BASED

BOUNDED MODEL CHECKING

GREGORY FICK FORD

Submitted in partial fulfilment of the requirements

for the degree of Master of Science

Thesis Advisor: Dr. Daniel Saab

Department of Electrical Engineering & Computer Science

CASE WESTERN RESERVE UNIVERSITY

January, 2014

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis of

Gregory Fick Ford

candidate for the Master of Science degree.

Dr. Daniel Saab

Dr. Francis Merat

Dr. Christos Papachristou

(date) 11/08/2013

* We also certify that written approval has been obtained for any proprietary material contained

therein.

Contents

1 Background

1.1 Properties & Temporal Logic

1.1.1 Linear Temporal Logic

1.1.2 Computation Tree Logic

1.1.3 SUGAR

1.1.4 OpenVera

1.1.5 Property Examples

1.2 Model Checking

1.2.1 Ordered Binary Decision Diagrams

1.2.2 SAT Modeling

1.2.3 Bounded Model Checking

1.2.4 Automatic Test Pattern Generation

1.3 Prior Work

2 Algorithm

3 Architecture

3.1 PI/PPI Decision Block

3.2 Objective Decision Block

3.3 Forward Network Derivation

3.4 Backward Network Derivation

3.4.1 Backward Network Fanout Handling

3.4.2 Backward Network Conflict Detection

3.4.3 Backward Network Decoder

3.4.4 Backward Network Encoder

3.5 Translating Circuits into Forward/Backward Networks

3.5.1 Processing Input Data

3.5.2 In-Memory Data Structure

3.5.3 Writing Output Networks

4 Results

5 Conclusion & Future Work

A Example Input Circuit and Network Translations

B Example DONE Simulation for c17 Benchmark Circuit

C Example FAIL Simulation for s27 Benchmark Circuit

D FPGA Emulation Algorithm Implementation Base Verilog Code

E Forward / Backward Network Generation Program C++ Code

F FPGA Algorithm & Network Integration TCL Script

G Formal CTL Rules Generation TCL Script

Bibliography

List of Figures

1.1 CTL Time/State Representation

1.2 Example Simple Search System Model

1.3 OBDD Graphical Representation Example

1.4 Variable Order Dependency in OBDDs

1.5 Example of the One-Literal Rule in SAT

1.6 Propagation and Consistency in the D-Algorithm

2.1 ILA Model of Sequential Circuit for k Time Frames

2.2 Objective Tracing Within a Frame

2.3 Algorithm Flow Diagram

3.1 Top Level Architecture Block Diagram

3.2 PI/PPI Decision Block Structure Diagram

3.3 State Transition Model of “More Justification” Operation

3.4 State Transition Model of “Move to Ti-1” Operation

3.5 State Containment Check Implementation

3.6 State Transition Model of “Backtrack” Operation

3.7 State Transition Model of “Move to Ti+1” Operation

3.8 Objective Decision Block Structure Diagram

3.9 Objective Decision Block PPI Objective Read In

3.10 Assembly of Frame-k Objective Word

3.11 Objective Decision Block Conflict/Done Decision Logic

3.12 Objective Decision Block Sequencer Operation

3.13 Abstract Backward Gate Values Dependence

3.14 Backward Fanout Signal Contention

3.15 Priority Encoder Structure for Required Objective Bit

3.16 Priority Encoder Structure for Required Objective Bit with Reset

3.17 Priority Encoder Structure for Objective Value Bit

3.18 Complete Structure for Backward Network Priority Encoder

3.19 Trace-blocking Conflict

3.20 Non-blocking Conflict

3.21 State Transition Diagram for Backward Network Decoder

3.22 Backward Network Decoder Value Assignment

3.23 State Transition Diagram of Backward Network Encoder

3.24 Backward Network Encoder Shiftout Operation

3.25 Network Translation Flow

3.26 Internally Consistent Pre-Processing

3.27 Gate Data Structure

3.28 Gate Fan-out List Structure

3.29 Level List Structure

4.1 Virtex-6 Utilization vs. ISCAS89 Benchmark Size

4.2 FPGA vs. Software Solve Time for ISCAS89 Benchmarks

4.3 FPGA vs. Software Total Time for ISCAS89 Benchmarks

B.1 c17 Benchmark Circuit Structure

B.2 c17 Simulation Circuit Structure

B.3 c17 Simulation Backtrace

B.4 Contents of c17 Block RAM After Objective 1

B.5 c17 Simulation Trace / Implication

B.6 Final c17 Block RAM Contents

C.1 s27 Benchmark Circuit Structure

C.2 s27 Benchmark Simulation Structure

C.3 s27 Simulation Frame k Backtrace

C.4 Contents of Block RAM After Frame k

C.5 Contents of Block RAM After Move to Frame k-1

C.6 Contents of RF in Frame k-1

C.7 s27 Simulation Frame k-1 Backtrace 1

C.8 Contents of RAM in Frame k-1 with Backtrace 1

C.9 s27 Simulation Frame k-1 Imply 1

C.10 Contents of RF and PO/PPO in Frame k-1, Imply 1

C.11 s27 Simulation Frame k-1 Backtrace 2

C.12 s27 Simulation Frame k-1 Imply 2

C.13 Contents of RF and PO in Frame k-1, Imply 2

C.14 State Check Comparison for Frame k-1

C.15 Contents of RF in Frame k-2

C.16 Contents of Block RAM at Frame k-2 Start

C.17 s27 Simulation Frame k-2 Backtrace 1

C.18 s27 Simulation Frame k-2 Backtrace 2

C.19 Contents of Block RAM after First Clear Top

C.20 Contents of Block RAM after First Swap Value

C.21 s27 Simulation Frame k-2, Backtrack 1 Imply

C.22 Simulation Frame k-2, Backtrack 1 Backtrace

C.24 Block RAM Contents after Move to Ti+1 and Swap Value

C.26 RF Contents and PO/PPO after Move to Ti+1, Imply 1

C.27 Block RAM Contents after Move to Ti+1 and Backtrack

C.28 Block RAM Contents after Second Move to Ti+1 and Swap Value

C.29 Simulation Frame k, Backtrack 1 Imply

List of Tables

1.1 Propositional Operators for LTL Formulae

1.2 Temporal Operators for LTL Formulae

1.3 Temporal Operators for CTL Formulae

1.4 Operators of the “Temporal Layer” in Sugar 2.0

1.5 Directive Operators in OpenVera

1.6 Temporal Operators in OpenVera

1.7 Language Representations of a Safety Property

1.8 Language Representations of a Liveness Property

3.1 PI/PPI Decision Block Control Logic States

3.2 More Justification Operation Pseudo-code

3.3 Pseudo-code for “Move to Ti-1” Operation

3.4 “Backtrack” Operation Pseudo-code

3.5 “Move to Ti+1” Operation Pseudo-code

3.6 Full One-Bit Truth Table for AND Gate

3.7 Two-Bit Truth Table for AND Gate

3.8 Split K-maps for AND Gate Output Bits

3.9 Minterm Expressions for AND Gate Output Bits

3.10 Forward Translation Equations for Basic Gates

3.11 Pseudo-code for AND Gate Backward Model Outputs

3.12 Truth Table for Backward Model of AND Gate

3.13 Backward Translation Equations for Basic Gates

3.14 Input Network Format

3.15 Forward Network Module Interface Example

4.1 Virtex-6 Resource Utilization for ISCAS89 Benchmarks

4.2 Runtime Comparison for ISCAS89 Benchmarks

A.1 Benchmark Code for s27 Circuit

A.2 Forward Network Verilog for s27 Benchmark Circuit

A.3 Backward Network Verilog for s27 Benchmark Circuit

A.4 Backward Network Priority Encoder Verilog Example

B.1 Circuit c17 Simulation Input Stimulus

B.2 Circuit c17 Initial Reset

B.3 Circuit c17 Simulation Cycle 1

B.4 Circuit c17 Simulation Cycles 2-4

B.5 Circuit c17 Simulation Cycles 5-7

B.6 Circuit c17 Simulation Cycles 8-12

B.7 Circuit c17 Simulation Cycles 13-20

B.8 Circuit c17 Simulation Cycle 21

B.9 Circuit c17 Simulation Cycles 22-23

C.1 Circuit s27 Simulation Cycles 1-20

C.2 Circuit s27 Simulation Cycles 21-24

C.14 Circuit s27 Simulation Cycles 96-105

C.15 Circuit s27 Simulation Cycles 106-114

Hardware Emulation of Sequential ATPG-Based

Bounded Model Checking

Abstract

GREGORY FICK FORD

The size and complexity of integrated circuits are continually increasing, in accordance with Moore’s law. Along with this growth comes greater exposure to subtle design errors, which places a greater burden on the process of formal verification. Existing methods for formal verification, including Automatic Test Pattern Generation (ATPG), are susceptible to exploding model sizes and run times for larger and more complex circuits. In this thesis, a method is presented for emulating the process of sequential ATPG-based Bounded Model Checking on reconfigurable hardware. This achieves a speedup over software-based methods, due to the fine-grain massive parallelism inherent to hardware.

1. Background

The complexity of integrated circuits is continually increasing, and with it, the chance of subtle errors in a design. This growth also increases the amount of work needed in verification, driving up the total time that projects spend in verification and affecting the bottom line of time-to-market. In 2010, the Wilson Research Group published a study on functional verification showing, among other things, that verification now accounts for over half of the total time of digital design projects [1]. Moreover, this share is continuing to increase, from 50% in 2007 to 55% in 2010.

Historically, simulation has been the main means of discovering bugs in a design, but as designs grow larger, the chance of finding these bugs through simulation alone diminishes. Attention has turned toward formal verification as a means to augment design verification in the face of growing designs. Formal verification is well suited to the problem of comparing the register transfer level (RTL) with the logic level for a design, as this is simply proving combinational equivalence. On the other hand, comparing the RTL with the behavioral level is a much more complicated problem, as the behavioral model is generally defined using code that follows a natural language structure.

1.1. Properties & Temporal Logic

To bridge the gap in formal verifiability between the RTL and the behavioral level, the concept of properties is introduced. These are statements about the function of small portions of the total design, written by the designers. These functional statements can then be used as a means of comparison against the RTL. As described by Lamport in [2], properties can be divided into two main categories. The first is a “safety property”, which is a statement that a defined bad event will never happen. The second is a “liveness property”, which is a statement that a defined good event will eventually happen. This, then, presents the issue of how to formally define properties such that they can be verified.

1.1.1. Linear Temporal Logic

These properties are generally represented using “temporal logic”, which provides a means for

applying an assertion over a period of time. For example, if condition “p” is true in the present,

then another condition “q” must be true at some point in the future. Pnueli examined the

applicability of temporal logic to programs in [3], which can similarly be transferred to digital

design. In “Linear Temporal Logic” (LTL), time is represented as a linearly ordered set, or in

the context of digital design, a sequence of design states. As defined by Huth and Ryan in [4], a

formula in LTL is constructed from three different parts. The first are “propositional atoms”,

which are representative of conditions in the digital system (as “p” and “q”, above). The second

are “propositional operators”, which function as modifiers to propositional atoms, outside any

concept of time. Table 1.1 provides the definition of these operators.

Operator   Name          Description
⊤          True          Always True.
⊥          False         Always False.
¬p         Negation      True when p is False.
p ∧ q      Conjunction   True when both p AND q are True.
p ∨ q      Disjunction   True when p OR q (or both) is True.
p → q      Implication   IF p is True THEN q must also be True.

Table 1.1: Propositional Operators for LTL Formulae

The third, and final, are “temporal operators”, which function as modifiers to propositional atoms

based on a concept of time represented by design states occurring either before or after the

current state in a sequence. Table 1.2 provides the definition of these operators.

Operator   Name          Description
X p        Next          p is true at the next point in time.
F p        Eventuality   p is true at some time in the future.
G p        Globally      p is true at all future points in time.
p U q      Until         p is true, until a point in the future when q becomes true.

Table 1.2: Temporal Operators for LTL Formulae

It is important to note that the “Until” operator has two variants, which differ in whether q is required to occur. In both variants, p must hold at every point before the first point at which q holds. If q must be true at some point in the future (F q), then p U q is called a “Strong Until”. If q is not necessarily true at any point in the future, then p U q is called a “Weak Until”. That is, a weak until is also satisfied when q is never true at any point in the future, provided that p remains true at all future points.
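The temporal operators of Table 1.2 can be illustrated with a small evaluator. This is a minimal sketch under stated assumptions, not part of the thesis: it evaluates formulas over a finite trace (a list of sets of atoms that are true in each state), whereas LTL is defined over infinite sequences, and it treats "the future" as including the current state. All function names are hypothetical.

```python
def atom(name):
    """A propositional atom: true at position i if `name` is in that state."""
    return lambda trace, i: name in trace[i]

def X(p, trace, i):
    """Next: p holds at position i + 1 (false at the end of a finite trace)."""
    return i + 1 < len(trace) and p(trace, i + 1)

def F(p, trace, i):
    """Eventuality: p holds at some position from i onward."""
    return any(p(trace, j) for j in range(i, len(trace)))

def G(p, trace, i):
    """Globally: p holds at every position from i onward."""
    return all(p(trace, j) for j in range(i, len(trace)))

def strong_until(p, q, trace, i):
    """p holds at every position before the first q, and q must occur."""
    for j in range(i, len(trace)):
        if q(trace, j):
            return True
        if not p(trace, j):
            return False
    return False  # q never occurred within the trace

def weak_until(p, q, trace, i):
    """Like strong until, but also satisfied if p holds forever without q."""
    return strong_until(p, q, trace, i) or G(p, trace, i)

p, q = atom("p"), atom("q")
print(strong_until(p, q, [{"p"}, {"p"}, {"q"}], 0))
print(strong_until(p, q, [{"p"}, {"p"}], 0), weak_until(p, q, [{"p"}, {"p"}], 0))
```

The second trace never reaches q, so it distinguishes the two variants: the strong form fails while the weak form holds because p is true throughout.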

1.1.2. Computation Tree Logic

Another system for constructing temporal logic formulae is Computation Tree Logic (CTL), as

defined by Clarke, et al. in [5]. CTL’s representation is based on the fact that for any point in

time, there are many possible futures; or for any design state, there are many possible sequences

of design states that follow, based on inputs. This is realized for any starting state, S0, where

each possible next state becomes a branch from S0 in the tree representation, and each further

state from those next states becomes a branch from them, ad infinitum. An example of this

construction is shown in Figure 1.1.

Figure 1.1: CTL Time/State Representation

CTL has its own set of temporal operators, which provide a means to formulate equations that

can assert properties based on conditions in branches of the tree. These are used in addition to

the base set of LTL operators, which are used to assert properties within a branch of the tree.

Table 1.3 provides the definitions of these operators.

Operator   Name        Description
A q        Necessary   q will always be true, along all branches.
E q        Possible    q is sometimes true, along some branches.

Table 1.3: Temporal Operators for CTL Formulae
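The branch quantifiers of Table 1.3 are normally paired with an LTL-style operator, giving combinations such as EF q (q is reachable along some branch) and AG q (q holds everywhere along all branches). The following is a minimal sketch of those two combinations on an explicit state graph; the graph, the state names and the helper names are assumptions made for illustration only.

```python
def pre_exists(states, succ, target):
    """States that have at least one successor inside `target` (EX)."""
    return {s for s in states if succ[s] & target}

def EF(states, succ, q_states):
    """States from which some branch reaches a q-state (least fixpoint)."""
    reach = set(q_states)
    while True:
        grown = reach | pre_exists(states, succ, reach)
        if grown == reach:
            return reach
        reach = grown

def AG(states, succ, q_states):
    """States where q holds on every branch forever: AG q = not EF (not q)."""
    return states - EF(states, succ, states - q_states)

# Hypothetical four-state tree rooted at S0, with self-loops at the leaves.
states = {"S0", "S1", "S2", "S3"}
succ = {"S0": {"S1", "S2"}, "S1": {"S1"}, "S2": {"S3"}, "S3": {"S3"}}

print(sorted(EF(states, succ, {"S3"})))              # states that can reach S3
print(sorted(AG(states, succ, {"S0", "S1", "S2"})))  # states that never reach S3
```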

1.1.3. SUGAR

Sugar 2.0 is another example of a formal specification language that can be used for verification,

as introduced by Eisner and Fisman in [6]. Sugar expressions are composed in four different

layers. The first layer is the “boolean layer”, which consists of Boolean operations on

propositional atoms, where any such expression can evaluate to a true or false logic value. This

can be thought of as equivalent to the “propositional operators” component of LTL, as outlined

in Table 1.1. The second layer is the “temporal layer”, which is used to define relationships

between expressions in the Boolean layer, over a period of time. The operators used in the

temporal layer are defined in Table 1.4.

Operator       Description
always p       p is true at all times.
never p        p is false at all times.
next p         p is true in the following cycle.
eventually p   p is true at some cycle in the future.
p until q      p is true until a point when q becomes true.
p before q     p is true at a point in time before q is true.

Table 1.4: Operators of the “Temporal Layer” in Sugar 2.0

In addition to these temporal operators, Sugar supports the suffixes “!” and “_” in appropriate cases. The exclamation point suffix is supported by “eventually”, “until” and “before”. It defines the operator as being strong, as opposed to the default of weak. This is the same sense as the “strong until” defined earlier: a strong operator requires that the awaited condition (the second argument of “until” or “before”) be true at some point in time, which excludes the case where the expression is satisfied only because that condition never occurs. The underscore suffix is supported by “until” and “before”. It specifies that an overlap of one cycle is allowed between the two arguments. In the context of “until”, this means that the first argument is true up to and including the first cycle in which the second argument is true. In the context of “before”, this means that the first argument must be true before, or at the same time as, the second argument.

The third layer is the “verification layer”, which provides direction to a verification tool reading

the Sugar expressions. If an expression from the “temporal layer” is defined with “assert” in this

layer, then that instructs the verification tool that it must verify the defined property. If an

expression is defined with “assume”, then that instructs the verification tool that it can assume

the behavior defined in the property to be true. The final layer is the “modeling layer”, which

allows for definition of the behavior of the propositional atoms in the expressions. In the context

of circuit verification, this can be thought of as assigning behavior to signals in the design.

1.1.4. OpenVera

A final example of another formal specification language used in verification is OpenVera, as

defined in [7]. OpenVera Assertions (OVAs) can be thought of as being divided into two

distinct components: directives and events. Directives make statements about events, defining

what conditions the verification tool should be checking for. This is similar in function to the

“verification layer” of Sugar 2.0. Table 1.5 shows the available directives.

Directive   Description
check(e)    Event e should always hold true.
forbid(e)   Event e should never hold true.

Table 1.5: Directive Operators in OpenVera

Events are separately defined entities that contain expressions comprised of Boolean and

temporal logic. Events can be thought of as similar to a combination of the “boolean layer” and

“temporal layer” in Sugar 2.0. The operators used at this layer are summarized in Table 1.6.

Operator          Description
#n p              After n cycles, p is true.
#[n..m] p         After between n and m cycles, p is true.
p followed_by q   q is true at some point after p is true.
p triggers q      q is true immediately after p is true.
p until q         p is true until a point where q is true.
next p            p is true in the next cycle.

Table 1.6: Temporal Operators in OpenVera

1.1.5. Property Examples

To illustrate the use of the different formal specification languages discussed here, consider a

simple searching system, as illustrated in Figure 1.2. This system has a memory containing

arbitrary data that can be searched. The system remains in an idle state until a request (req)

arrives with a pattern to be searched for. The system moves to a search state, and examines

locations in the memory for the pattern until one is found, at which point it acknowledges (ack)

that the pattern exists in memory.

Figure 1.2: Example Simple Search System Model

In this system, a safety property could be that the search never overflows from the memory.

That is, “index” should never exceed the number of locations in memory (defined here as

mem_rows). Table 1.7 shows how this property can be realized in each of the languages

discussed.

Language    Representation
LTL         G ¬(index > mem_rows)
CTL         AG ¬(index > mem_rows)
Sugar 2.0   never (index > mem_rows)
OpenVera    assert a_overflow : forbid(e_overflow) ;
            event e_overflow : (index > mem_rows) ;

Table 1.7: Language Representations of a Safety Property

In LTL, the overflow condition (index > mem_rows) can be used as a negated argument to the

global operator G; that is, for all points in time the overflow condition should not be true. In

CTL, similarly, the overflow condition is used as a negated argument to the necessary operator;

that is, for all branches the overflow condition must not be true. In Sugar, the overflow condition

can be directly used as an argument to the “never” operator, stating that the condition can never

occur. In OpenVera, a directive is defined with the “forbid” operator, stating that the event

“e_overflow” can never happen. The event e_overflow is then defined as the overflow

condition.

In this system, a liveness property could be that the system always returns to its idle state. That

is, “ack” should eventually be asserted as a response to “req”. Table 1.8 shows how this

property can be implemented in each of the languages discussed.

Language    Representation
LTL         G (req → F ack)
CTL         AG (req → AF ack)
Sugar 2.0   always (req → eventually! ack)
OpenVera    assert a_return : check(e_return) ;
            event e_return : req followed_by ack ;

Table 1.8: Language Representations of a Liveness Property

In LTL, the “return to idle” condition can be represented as req implying that ack will eventually

be true. This statement is then asserted to be globally true. In CTL, the same set of statements

can be applied, with the addition of the necessary operator, adding that the condition must be true

for all branches of the tree. In Sugar, the “return to idle” condition is represented with req

implying ack with a strong eventually. Explicitly defining this as a strong eventually prevents

the case where req stays true forever, and ack never becomes true (thus potentially masking a

bug). In OpenVera, a directive is defined with the “check” operator, stating that the event

“e_return” should always happen. The event e_return then uses the “followed_by” operator to

state that the signal req becoming true must be followed by the signal ack becoming true at some

point in the future.
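To make the two example properties concrete, the sketch below simulates one request to a simplified version of the search system and checks both properties against the resulting trace. The simulated behavior is an assumption made for illustration (in particular, the index is deliberately left unbounded so that a missing pattern exposes the overflow bug), and the checks run over a finite trace only, so they approximate the temporal formulas rather than prove them.

```python
def simulate_search(memory, pattern, cycles):
    """Return per-cycle signal values for one request (hypothetical model).

    The index advances by one location per cycle and is NOT bounded, so a
    pattern that is absent from memory eventually overflows the memory.
    """
    trace = [{"req": True, "ack": False, "index": 0}]  # request arrives
    index = 0
    for _ in range(cycles):
        found = index < len(memory) and memory[index] == pattern
        trace.append({"req": False, "ack": found, "index": index})
        if found:
            break
        index += 1
    return trace

def safety_holds(trace, mem_rows):
    """G not(index > mem_rows): no cycle may show an overflowing index."""
    return all(step["index"] <= mem_rows for step in trace)

def liveness_holds(trace):
    """G (req -> F ack), checked only within the finite trace."""
    return all(
        any(later["ack"] for later in trace[i:])
        for i, step in enumerate(trace)
        if step["req"]
    )

memory = [3, 7, 9]
good = simulate_search(memory, pattern=9, cycles=10)
bad = simulate_search(memory, pattern=5, cycles=10)
print(safety_holds(good, len(memory)), liveness_holds(good))
print(safety_holds(bad, len(memory)), liveness_holds(bad))
```

When the pattern is present, both checks pass; when it is absent, the index runs past mem_rows and ack is never asserted, so both properties are reported as violated.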

1.2. Model Checking

Properties expressed in temporal logic provide one of the inputs necessary for model checking.

The other input needed is a model of the circuit being verified. In model checking, the supplied

design model is analyzed with respect to the input properties. The result of the process is either a

sequence of states showing that the property is satisfied, or a statement that no satisfying

sequence could be found. One of the early implementations of model checking is “temporal

logic model checking”, as described by Clarke and Emerson in [8]. In this system, properties are

expressed in CTL, and the design is modeled as a state-transition diagram. The major limitation

of this early implementation is that both the CTL expressions needed to represent properties and

the design state model grow with the size of the design being verified; the number of design states, in particular, can grow exponentially with the number of state elements.

This is known as the “state explosion problem”.
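The basic flow of explicit-state checking can be sketched as a breadth-first search over the state-transition model that returns either a sequence of states reaching a target condition or nothing. This is an illustrative sketch only, not the algorithm of [8]; the function name and the two-bit counter model are assumptions.

```python
from collections import deque

def find_sequence(initial, successors, satisfies):
    """Breadth-first search for a state sequence ending in a satisfying state.

    Returns the list of states from `initial` to the first satisfying state,
    or None if no reachable state satisfies the condition.
    """
    parent = {initial: None}          # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if satisfies(state):
            path = []
            while state is not None:  # walk parent links back to the start
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

# Hypothetical design: a two-bit counter that wraps from 3 back to 0.
counter_next = lambda s: [(s + 1) % 4]
print(find_sequence(0, counter_next, lambda s: s == 3))
print(find_sequence(0, counter_next, lambda s: s == 5))
```

The `parent` dictionary holds one entry per reachable state, which is exactly the structure that becomes unmanageable as the number of design states grows.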

1.2.1. Ordered Binary Decision Diagrams

The first major development in combating the state explosion problem was the SMV system,

introduced by McMillan in [9]. SMV reduces the growth of the design state space by applying

Ordered Binary Decision Diagrams (OBDDs) to the modeling of design states. OBDDs were

first detailed by Bryant in [10]. OBDDs for a given Boolean function can be represented as a

binary tree, where each level in the tree is assigned to a variable from the function, in order. At

any non-terminal vertex in the tree (representing a variable), there will be two possible

transitions to a following level of the tree. A transition to the left indicates a logical low, and a

transition to the right represents a logical high. If for a given transition, the overall value of the

function remains indeterminate, the transition will be to the next variable in order. If the

transition provides determination of the function value, then the transition will be to a terminal

node in the tree (0 or 1). Figure 1.3 shows an example OBDD graph representation, where

circles are non-terminal (variable) nodes and squares are terminal nodes.

Figure 1.3: OBDD Graphical Representation Example

Evaluation for the function (w·x + y·z) begins with variable w. If w is false, then the value of x does not matter, so the transition proceeds to y. If w is true, then the transition proceeds to x. In this case, there is only one instance of x, so if x is true, the function is true (since both w and x are true); the transition will then proceed to the “1” terminal vertex. If x is false, then evaluation continues and the transition will proceed to y. Again, there is only one vertex for y, and the precondition for being at vertex y is that the first term in the equation is false. Then, if y is false, the second term is false, and the entire function is also false; the transition will then proceed to terminal vertex “0”. If y is true, then evaluation continues by transitioning to vertex z. Since vertex z is unique, the condition for being at the vertex is that the first term is false and y is true, meaning that the overall function value will be determined by the value of z. If z is false, then the second term is also false, making the overall function false; the transition then proceeds to terminal vertex “0”. If z is true, then the second term is true, and the overall function is true; the transition proceeds to terminal vertex “1”.
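The walk described above can be written as a table-driven traversal. The sketch below encodes the four vertices exactly as the text describes them (one vertex per variable, with a low and a high transition); the dictionary layout and function name are assumptions made for illustration.

```python
from itertools import product

# Each non-terminal vertex maps to (variable, low transition, high transition).
# A transition is either another vertex name or a terminal value (False/True).
OBDD = {
    "w": ("w", "y", "x"),     # w false: skip x and go to y; w true: go to x
    "x": ("x", "y", True),    # x true (with w true): function is 1
    "y": ("y", False, "z"),   # y false (first term already false): function is 0
    "z": ("z", False, True),  # z decides the function value
}

def evaluate(obdd, root, assignment):
    """Follow low/high transitions from `root` until a terminal is reached."""
    node = root
    while not isinstance(node, bool):
        var, low, high = obdd[node]
        node = high if assignment[var] else low
    return node

# Compare the diagram against the Boolean function for all 16 assignments.
for w, x, y, z in product([False, True], repeat=4):
    values = {"w": w, "x": x, "y": y, "z": z}
    assert evaluate(OBDD, "w", values) == ((w and x) or (y and z))
print("OBDD matches w·x + y·z on all assignments")
```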

The usefulness of OBDDs for reducing the state explosion problem does have limitations,

though. One key issue faced in OBDD construction is that the resulting graphical structure is

directly linked to the variable ordering used in the Boolean equations being evaluated. The fact

that variables are evaluated in order can cause large variations in graph representation efficiency

among Boolean equations of the same structure. This problem is illustrated in Figure 1.4.

Figure 1.4: Variable Order Dependency in OBDDs

Consider a new equation, based on the one described in Figure 1.3, where the order of two of the

variables is changed. In both cases, the variable ordering w-x-y-z is used in evaluation, but vastly

different graph efficiencies result, even though the Boolean equation structure is the same. One

observation that can be made for this situation is that when related variables are not close

together in evaluation, the complexity of the graph grows. In the case of the new equation, w

and y are closely related, as well as x and z. But, both of these pairs of related variables are

separated in the variable evaluation ordering. Observations like this can be used as heuristics to

improve variable ordering, helping to maintain high efficiency in OBDDs.
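The effect of variable ordering can be reproduced by counting the non-terminal vertices of the reduced OBDD for each equation. The sketch below is a brute-force illustration that enumerates every assignment, which is only practical for a handful of variables; the function name is hypothetical, and the second equation (w·y + x·z) is an assumption about the reordered equation of Figure 1.4, based on the statement that w and y, and x and z, are the related pairs.

```python
def robdd_size(func, order):
    """Count non-terminal vertices of the reduced OBDD of `func` under `order`."""
    unique = {}  # (variable, low, high) -> vertex id; shares identical vertices

    def build(level, partial):
        if level == len(order):
            return "1" if func(**partial) else "0"   # terminal vertex
        var = order[level]
        low = build(level + 1, {**partial, var: False})
        high = build(level + 1, {**partial, var: True})
        if low == high:
            return low                               # test is redundant: skip it
        return unique.setdefault((var, low, high), len(unique))

    build(0, {})
    return len(unique)

order = ["w", "x", "y", "z"]
grouped = lambda w, x, y, z: (w and x) or (y and z)    # related variables adjacent
separated = lambda w, x, y, z: (w and y) or (x and z)  # related variables apart

print(robdd_size(grouped, order), robdd_size(separated, order))
```

Under the same ordering w-x-y-z, the first equation needs four vertices while the second needs six, and the gap widens as more product terms are added.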

As an example, Fujita et al. in [11] and Malik et al. in [12] show that a depth-first traversal of a circuit being verified can often provide a reasonable variable ordering. In situations where heuristics fail to give reasonable results, Rudell presents a solution in [13] called “dynamic reordering”. In this solution, a “shifting algorithm” is periodically run within the OBDD in an attempt to minimize it. In each shifting operation, given n variables, one variable is selected for optimization and the order of the n-1 other variables is fixed. The selected variable is then shifted to the best of its n possible positions. Additionally, further enhancements to the OBDD model have been proposed by Brace et al. in [14]. Specifically, more complicated sets of Boolean equations can be modeled as a multi-rooted OBDD, where the different functions have opportunities to share sub-trees (instead of distinct single-rooted trees for each equation).

1.2.2. SAT Modeling

OBDDs provide some mitigation of the state explosion problem, but they are still complete state-space representations of a circuit being verified, and as such the model size can still quickly grow too large as circuit complexity increases. Another approach that has been used to get around this problem is to model the circuit as a satisfiability (SAT) problem. In this case, the circuit being verified is modeled as a set of Boolean propositions, as opposed to a full expansion of design states. One of the major methods for solving SAT problems is the Davis-Putnam method [15]. This method consists of two parts. The first is the QFL-Generator, which uses the formula for the property being verified to create a growing propositional calculus formula. The second part is the Processor, which continually checks the propositional calculus formula for consistency. If, at any point, the formula is found to be inconsistent, that provides a proof of the original formula being verified. SATO is one implementation of a SAT solver that leverages the Davis-Putnam method, as presented by Zhang in [16].

The process of checking a propositional calculus formula for consistency in this context involves reduction of the formula via elimination of terms. This reduction is achieved with a variety of rules. One-literal clauses can assist in reduction using the one-literal rule (also known as unit propagation). This rule states that for a set of clauses containing a unit clause (a clause that is a single literal), every other clause containing that literal can be eliminated, and each occurrence of the negation of that literal in the remaining clauses can be deleted. The resulting reduced set of clauses will be logically equivalent to the starting set. An example of this process is shown in Figure 1.5.

Figure 1.5: Example of the One-Literal Rule in SAT

In this example, the one-literal clause is “x”. Therefore, each remaining clause in the set is examined for occurrences of “x”. The clause “y ∨ z” does not contain “x”, and remains the same. The clause “x ∨ z” contains an affirmative reference to x, and thus the clause is dropped. The clause “¬x ∨ y” contains a negative reference to x, so the negative reference to x is removed from the clause, leaving “y”. This then yields the reduced set of clauses {x, y ∨ z, y}.
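The one-literal rule is mechanical enough to express in a few lines. The sketch below is an illustration only: it assumes each clause is a disjunction of literals stored as a frozenset of strings, with a leading "~" marking negation, and the function names are hypothetical.

```python
def negate(lit):
    """Return the complementary literal ('x' <-> '~x')."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def one_literal_rule(clauses, unit):
    """Apply the one-literal rule for the unit literal `unit`."""
    unit_clause = frozenset([unit])
    reduced = [unit_clause]                      # the unit clause itself is kept
    for clause in clauses:
        if clause == unit_clause:
            continue                             # already recorded above
        if unit in clause:
            continue                             # clause is satisfied: drop it
        reduced.append(clause - {negate(unit)})  # delete the negated literal
    return reduced

clauses = [
    frozenset(["x"]),
    frozenset(["y", "z"]),
    frozenset(["x", "z"]),
    frozenset(["~x", "y"]),
]
for clause in one_literal_rule(clauses, "x"):
    print(sorted(clause))
```

Applied to the clause set of the example, the sketch keeps x and y ∨ z, drops x ∨ z, and shortens ¬x ∨ y to y.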

Another rule for formula reduction is the affirmative-negative rule (also known as pure literal

elimination). This rule states that if a propositional variable only appears in a single form (either

affirmative or negated) across all uses in a set of clauses, then all clauses containing that variable

can be eliminated. If a propositional variable only appears with a single polarity (also called a

pure variable), then an assignment can always be made to make all clauses containing the pure

variable true.
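A single pass of the affirmative-negative rule might look as follows. This is a sketch under the same assumed integer encoding of literals; the function name is hypothetical. Removing clauses can make further variables pure, so a full solver would repeat the pass until nothing changes.

```python
def eliminate_pure_literals(clauses):
    """Remove every clause that contains a pure literal.

    Returns (remaining_clauses, pure_literals) for one pass over the clause set.
    """
    literals = {lit for clause in clauses for lit in clause}
    # A literal is pure when its negation never appears anywhere in the set.
    pure = {lit for lit in literals if -lit not in literals}
    remaining = [clause for clause in clauses if not (clause & pure)]
    return remaining, pure


clauses = [frozenset([1, 2]), frozenset([1, -3]), frozenset([-2, 3])]
remaining, pure = eliminate_pure_literals(clauses)
```

In this example variable 1 only appears in the affirmative form, so both clauses containing it are removed.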

A final rule that can be applied when the previous two rules have been exhausted is the splitting

rule, which allows for a re-structuring of clauses. Davis, et al examine this rule more closely in

[17]. The rule states that a formula F should first be put into the form of (A ˅ p) ˄ (B ˅ ⌐p) ˄ R.

This can be achieved by creating three groups of clauses: those containing p, those containing ⌐p,

and those containing neither. The literals p and ⌐p can then be factored out of the first two groups of clauses

to create the desired form. It can then be stated that formula F is inconsistent if and only if (A ˅

B) ˄ R is inconsistent. This can also be represented in another form, stating that formula F is

inconsistent if and only if A ˄ R and B ˄ R are both inconsistent. The splitting rule does,

though, also present one of the limitations of the model. The problem of how to select “p” for

splitting is difficult to solve, as the answer will vary depending on the model being verified.

Many heuristics exist to assist in selecting a reasonable “p”, as a poor selection of “p” can reduce

the performance of the Davis-Putnam SAT model by orders of magnitude.
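The second form of the splitting rule maps directly onto a recursive consistency check. The sketch below uses only the splitting rule, with a deliberately naive choice of p (the first variable encountered); it is an illustration under the assumed integer literal encoding, not a description of any particular solver, and the function names are hypothetical.

```python
def assign(clauses, lit):
    """Simplify the clause set under the assumption that literal `lit` is true."""
    return [clause - {-lit} for clause in clauses if lit not in clause]


def is_consistent(clauses):
    """Return True if the clause set is satisfiable, using only the splitting rule."""
    if not clauses:
        return True                      # no clauses left: consistent
    if any(len(clause) == 0 for clause in clauses):
        return False                     # an empty clause cannot be satisfied
    p = abs(next(iter(clauses[0])))      # naive choice of p, no heuristic
    # Setting p false leaves A AND R; setting p true leaves B AND R.
    # F is inconsistent if and only if both of these are inconsistent.
    return is_consistent(assign(clauses, -p)) or is_consistent(assign(clauses, p))


unsat = [frozenset([1, 2]), frozenset([-1, 2]), frozenset([-2])]
sat = [frozenset([1, 2]), frozenset([-1])]
```

Replacing the line that selects `p` with a better heuristic changes only the amount of work done, not the result.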

1.2.3. Bounded Model Checking

A further technique for combating the problem of state explosion is Bounded Model Checking

(BMC), as introduced by Biere et al. in [18]. In the BMC technique, a limit of k is set on the

number of state transitions within which a property must hold. This means that the paths to be

searched in the model can have at most k + 1 states. Biere et al. also propose that in this model, k

should begin with a value of 0 (searching for a single-state counterexample). The value of k can then be

increased until either an imposed limit is hit, implying that there is no

counterexample within that bound, or a counterexample is found with a length of k + 1. This imposed upper limit

on k is information that would be provided by a user of the BMC system. As logic designers

generally know the bounds within which a given property should hold, expecting this input for

BMC is a reasonable assumption. Copty et al. investigated the effectiveness of BMC in an

industry setting in [19]. Portions of the Pentium 4™ were used to test BMC with a SAT solver

against an OBDD symbolic model checker. BMC was found to provide improved productivity

over the OBDD solution, mainly due to the high amount of manual tuning required with OBDD

to optimize its performance.
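The iteration over k can be illustrated with a small explicit-state sketch. Note that this enumerates paths directly instead of encoding the unrolled model as a SAT problem, so it shows only the bounding strategy, not the BMC method of [18]. The function name, its parameters and the counter example are all assumptions made for illustration.

```python
def bounded_check(initial_states, successors, holds, max_k):
    """Return a counterexample path with at most max_k transitions, or None."""
    for k in range(max_k + 1):                 # k = 0, 1, ..., max_k
        paths = [[s] for s in initial_states]
        for _ in range(k):                     # extend every path to k transitions
            paths = [p + [n] for p in paths for n in successors(p[-1])]
        for path in paths:                     # each path now has k + 1 states
            if not holds(path[-1]):
                return path
    return None


# Hypothetical 3-bit wrapping counter; property: "the count never reaches 5".
counterexample = bounded_check([0], lambda s: [(s + 1) % 8], lambda s: s != 5, max_k=10)
```

With a bound of 10 the search stops at k = 5 and returns the six-state path from 0 to 5; with a bound of 4 it returns `None`, which only shows that no counterexample exists within that bound.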

1.2.4. Automatic Test Pattern Generation

Automatic Test Pattern Generation (ATPG) is another method that avoids the state-explosion

problem by employing a different approach to model checking. ATPG focuses on the stuck-at

fault model, which is designed to detect faults where a signal in the circuit being verified is stuck

at a constant value, regardless of circuit inputs. A Boolean model of the circuit being verified is

stored in memory, but a full state expansion is not required. This then presents the necessity to

have a method for modeling a circuit to be verified in memory for the ATPG algorithm to work

on. Armstrong presents a method for applying ATPG to combinational circuits in [20]. Hsiao

and Chia then extended this to implement a solution for ATPG in sequential circuits in [21].

One of the first ATPG methods was the D-Algorithm as introduced by Roth et al. in [22]. In this

method, combinations of primary input (PI) assignments are examined by making assignments at

internal circuit nodes, based on the fault being tested. A new logic value of “D” is introduced in

these assignments, which represents a value of 1 in a good circuit when testing a stuck-at-0

fault (D̄ represents 0 in a good circuit when testing a stuck-at-1 fault). When working with a fault at some

internal net in a circuit, this method will both “propagate” the D value forward, and maintain

“consistency” by implying values back through the circuit based on propagated values. An

example of this is shown in Figure 1.6.

Figure 1.6: Propagation and Consistency in the D-Algorithm

Given a stuck-at-0 fault under test for net N, the first implication is that in a good circuit, gate 1

must output a 1 (D). This then requires inputs A and B to both be 1. Further, to observe this

fault on N, the value of D must propagate forward to output Z, passing through gate 2. For gate

2 to have a value of D on its output (Z), its other input must be 1, which implies that net P must

have a value of 1 to maintain consistency. That then requires gate 3 to have both its inputs be 1,

meaning that both net M and input D must be 1. Similarly, gate 4 must output a 1, implying that

input C must have a value of 0. This has then generated a complete assignment on all inputs.
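The propagation behaviour of D can be sketched by modelling each logic value as a pair (value in the good circuit, value in the faulty circuit). This pair representation is a common way to explain the D-calculus and is an assumption of this example, as are the gate function names; the unknown value X is left out for brevity, and the gates are generic rather than those of Figure 1.6.

```python
# Each value is (good-circuit value, faulty-circuit value).
ZERO, ONE = (0, 0), (1, 1)
D = (1, 0)       # 1 in the good circuit, 0 in the faulty circuit
D_BAR = (0, 1)   # complement of D


def and_gate(a, b):
    return (a[0] & b[0], a[1] & b[1])


def or_gate(a, b):
    return (a[0] | b[0], a[1] | b[1])


def not_gate(a):
    return (1 - a[0], 1 - a[1])


propagated = and_gate(D, ONE)    # side input at 1 lets D pass through an AND gate
blocked = and_gate(D, ZERO)      # side input at 0 masks the fault
```

This shows why the side input of an AND gate must be 1 for D to propagate: any other assignment hides the difference between the good and faulty circuits.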

One weakness in the D-Algorithm was first seen when exercising the method on circuits that

included error correction code (ECC) logic. With ECC logic, an XOR tree to compute parity

exists that is then reconvergent with the main logic being checked. This presents an efficiency

issue when testing a stuck-at fault in the main logic, as the entire ECC parity tree will need to be

evaluated to make a consistent assignment. To solve this problem, Goel presented the Path

Oriented Decision Making (PODEM) system in [23]. PODEM differs in approach from the D-

Algorithm, as instead of evaluating from the point of the stuck-at fault, it directly assigns PI

values and tracks their effects to generate a complete PI assignment. Initially, all PIs to the

circuit are assigned as “don’t care” (X). PODEM then chooses a PI to make an assignment on,

and implies that assignment forward through the circuit (similar to D-Algorithm propagation). If

the assignment made is consistent with the required stuck-at test, then further assignments are

selected, continuing until a complete assignment is made that satisfies the test. If an assignment

is determined to be inconsistent with the stuck-at fault being tested, that assignment will be

undone. One of the limitations of PODEM is that a good choice of which PI to assign and which

logic value to use is critical to finding a complete PI assignment without extraneous evaluation.

To assist with this, Goel also presents a heuristic for selecting a good assignment, which is based

on finding a gate that has the stuck-at fault value (D) as an input, a don’t care (X) on its output, and is

close to a primary output (PO) of the circuit. The logic is then backtraced from this gate to

find the closest PI related to that gate, which becomes the PI for which an assignment will be made.

A further efficiency improvement of PODEM is the fan-out oriented test generation algorithm

(FAN), as presented by Fujiwara et al. in [24]. FAN improves efficiency primarily

by limiting the extent of backtrace operations. This is done using the concepts of “head-lines”

and “fan-out points”. A head-line is a net in the circuit that is assigned a value of X (and

all of its generation logic is also assigned X), and that is adjacent to another net with an assigned

value. For example, consider a two-input AND gate that has one input with an assigned value and another

input with an X value. The X-valued input would then be a head-line, which means that backtracing can

stop after a value assignment is made to the head-line. Since all final PI assignments associated

with the head-line assignment are directly implied by the head-line, they can be deferred until the

very end of the operation. Fan-out points are nets in the circuit that fan out to multiple gates.

These are convergence points, where consistency must be maintained while back-tracing, since

all of the fan-out points on the net must have values that do not present a conflict, or the

backtrace must be stopped and a new assignment will be tried. In FAN, backtracing is stopped at

the fan-out points, until all other objectives are exhausted. If one of the other objectives were to

present a conflict at a fan-out point, that will be detected, and a new assignment can be tried,

without having to have backtraced through all of the fan-out points.

Yet another efficiency improvement, building off of FAN, is the Structure-Oriented Cost-

Reducing Automatic TESt pattern generation system (SOCRATES) as presented by Schulz et al.

in [25]. One of the main ways that SOCRATES improves on FAN is by the addition of

implication learning. During value propagation, values that are being assigned are implied by

the original stuck-at value being propagated. SOCRATES evaluates these assignments to find

non-local implications; that is, implications that aren’t trivially determined by the normal FAN

implication procedures. These “learned” implications are then stored, such that they can help

speed up the future implication process.

1.3. Prior Work

Keller et al provided one of the first looks at the use of ATPG engines in [26], where they were

applied to general problems where a search space needed to be examined. Boppana et al

investigated the use of sequential ATPG for model checking in [27]. Their work focused on

safety properties and noted the fact that ATPG does not require explicit state-space storage

between time-frames as a major advantage. Cheng et al discussed the use of ATPG for property

checking in [28]. Their method involved mapping the property being checked into a

combinational circuit, where the output would be tested for a stuck-at fault.

Parthasarathy et al compared SAT and ATPG algorithms on combinational circuits in [29],

finding that there is no performance gap between the two. This comparison is extended by

Abraham et al. in [30]. A software implementation for sequential ATPG was presented, and

tested on benchmark circuits for checking temporal logic properties. In this case it was shown

that ATPG outperformed SAT on smaller circuits, and further, SAT was unable to model some

of the largest circuits. Qiang et al show in [31] that even with ATPG, larger circuits can cause

software solvers to fail.

An effective way to push beyond these limitations is to leverage parallelism. Czutro et al applied

this idea with TIGUAN in [32]. TIGUAN applies a two-stage approach to SAT solving, where

the first stage is a single-threaded run, quickly working out the easy-to-solve faults. The second

stage applies multi-threaded test generation to achieve a speedup on the hard-to-solve faults. Cai

et al extended the application of threading in [33]. By creating a system where test generation

and good/fault simulation were all threaded, they were able to achieve a linear speedup of

ATPG, up to the maximum of 8 CPUs used in testing.

Another way to approach parallelism is to take advantage of the inherent parallelism of

hardware. Sarmiento and Fernandez applied this method for fault emulation in [34]. By

translating a circuit under test onto reconfigurable hardware, they were able to take advantage of

hardware parallelism during propagation. This resulted in emulation being 27 to 2200 times

faster than the associated software-based simulation, in their testing. Dunbar and Nepal used a

similar strategy in [35], and found that by implementing multiple instantiations of a circuit under

test on an FPGA, they could reduce the final test pattern set by 13% on average, while

maintaining fault coverage and run time.

Abramovici et al apply emulation on reconfigurable hardware to a PODEM-like instance specific

SAT solver in [36], though only one objective is backtraced at a time. They put forth a new

architecture in [37] where multiple objectives are backtraced in parallel, taking greater advantage

of the parallelism available in hardware, and achieving an average 10x speed-up over software-

based solvers.

Gulati et al take this method further in [38], implementing an application specific SAT solver on

reconfigurable hardware. To achieve this, the Conjunctive Normal Form clauses of the problem

are split up into bins, which are then sequentially evaluated by the FPGA-based solver. This

allows very large problems, which normally may not fit onto an FPGA, to be evaluated with this

architecture. Ultimately, they demonstrated an average 17x speed-up over the best software-

based solvers.

Kocan and Saab also leverage reconfigurable hardware to implement a concurrent D-algorithm

in [39], handling both propagation and justification. They found this to be 3.25 to 14.8 times

faster than equivalent software, with higher speed-up for larger circuits.

This work extends the idea of reconfigurable hardware emulation to the more complex problem

of sequential ATPG. One of the unique core requirements of sequential ATPG is that multiple

frames must be modeled to account for the temporal aspect of the circuit. Like the previously

discussed works in reconfigurable hardware emulation, fine-grain massive parallelism is

leveraged to gain a significant benefit over software solvers.

2. Algorithm

The first step in designing an algorithm to support hardware emulation of sequential ATPG is

determining how to handle modeling the circuit being verified. Any sequential circuit can be

thought of as being composed of two parts, combinational logic and flip-flops (state elements).

To model the circuit’s behavior over time, it can be “unrolled” into an Iterative Logic Array

(ILA), as described by Abramovici et al in [40]. In this process, all flip-flops are removed from

the circuit, with their inputs and outputs being re-purposed as pseudo-primary inputs and outputs

of the circuit. The ILA consists of k instances of the circuit (where k is the search bound), with

each instance being referred to as a “time frame”. Each of these frames is connected together, in

order, by the state input/outputs from the removed flip-flops. This structure is illustrated in

Figure 2.1.

Figure 2.1: ILA Model of Sequential Circuit for k Time Frames
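The chaining of time frames can be sketched as a forward evaluation of the unrolled model. This is only an illustration of the ILA structure: the function `simulate_ila`, the `frame(pi, ppi)` signature and the toggle flip-flop used as the circuit are assumptions made for this example.

```python
def simulate_ila(frame, initial_state, pi_sequence):
    """Evaluate one copy of the combinational frame per entry in pi_sequence.

    frame(pi, ppi) returns (po, ppo). The PPO of each copy feeds the PPI of the next.
    """
    ppi = initial_state
    outputs = []
    for pi in pi_sequence:           # one iteration per time frame
        po, ppo = frame(pi, ppi)
        outputs.append(po)
        ppi = ppo                    # state lines connect adjacent frames
    return outputs, ppi


def toggle_frame(pi, ppi):
    """Hypothetical circuit with flip-flops removed: output is the state, next state toggles on pi."""
    return ppi, ppi ^ pi


outputs, final_state = simulate_ila(toggle_frame, 0, [1, 1, 0])
```

The length of `pi_sequence` plays the role of k, the number of frames in the ILA.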

As part of the ILA modeling method, the property that is being checked is also transformed into

a structural monitor block on the primary outputs of the final frame of the ILA model. This

monitor is designed such that the property is achieved by assigning a value of 1 to “line k” (the

output of the monitor). Then, an ATPG-based justification algorithm can be applied to find a set

of conditions to satisfy line k = 1.

This property justification uses a PODEM-like approach to trace an objective on the output of a

frame, back to a set of required inputs to the frame. To distinguish the original combinational

inputs/outputs of the circuit from the new, state-based inputs/outputs, the originals are called

Primary Inputs and Primary Outputs (PIs and POs) and the state-based are called Pseudo-

Primary Inputs and Pseudo-Primary Outputs (PPIs and PPOs). The objective tracing function

within each frame is shown in Figure 2.2.

Figure 2.2: Objective Tracing Within a Frame
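A PODEM-like justification within a single frame can be sketched with three-valued logic (0, 1, X). The example below is a simplified illustration and not the hardware algorithm: the circuit Z = (A AND B) AND (NOT C) is hypothetical, the choice of which input to assign is naive, and all function names are assumptions.

```python
X = None  # unassigned / don't care


def and3(a, b):
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X


def not3(a):
    return X if a is X else 1 - a


def imply(pi):
    """Forward implication for the hypothetical circuit Z = (A AND B) AND (NOT C)."""
    return and3(and3(pi["A"], pi["B"]), not3(pi["C"]))


def justify(pi, target):
    """Assign inputs one at a time, undoing any assignment that contradicts the target."""
    z = imply(pi)
    if z == target:
        return dict(pi)                        # objective satisfied
    if z is not X:
        return None                            # conflict: caller tries another value
    name = next(n for n in pi if pi[n] is X)   # naive input choice, no heuristic
    for value in (1, 0):
        pi[name] = value
        found = justify(pi, target)
        if found is not None:
            return found
    pi[name] = X                               # undo the assignment
    return None


assignment = justify({"A": X, "B": X, "C": X}, 1)
```

In this run the first value tried for C produces a conflict, so it is swapped, which mirrors the backtracking behaviour described for the algorithm.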

The algorithm works on one frame at a time, starting with frame k. As the first objective (line k

= 1) is traced back, any objective values that propagate to a PPI of the frame then imply an

objective on the PPO of the previous frame that must also be traced. Objective values traced

back to PI values require no further justification, as these values can be set independent of the

time frame. Once all objective values for the current frame have been successfully traced back,

processing will make a decision on what more needs to be done. If only PI objectives were

traced back in the current frame (or the PPI objectives match an initial state PPI1), then the

algorithm has successfully found a test pattern to generate the original objective of line k = 1, and

is done. If there are any PPI objectives that require further justification, processing will move to

the previous frame in the ILA. This strategy is called Reverse-Time Processing (RTP), as

described by Marlett in [41].
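The frame-by-frame movement of Reverse-Time Processing can be illustrated with a brute-force sketch. It replaces the ATPG backtrace with exhaustive enumeration of each frame's inputs, so it shows only the order in which frames are visited and the two stopping conditions. The function names, parameters and the 2-bit counter used as the circuit are assumptions made for this example.

```python
from itertools import product


def reverse_time_justify(next_state, num_pi_bits, num_state_bits,
                         target_state, initial_state, k):
    """Work backwards from target_state for at most k frames.

    Returns the PI values for each frame in forward order, or None on failure.
    """
    def search(required_ppo, frames_left):
        if required_ppo == initial_state:
            return []                          # PPI objectives match the initial state
        if frames_left == 0:
            return None                        # search bound exhausted
        for ppi in product((0, 1), repeat=num_state_bits):
            for pi in product((0, 1), repeat=num_pi_bits):
                if next_state(ppi, pi) == required_ppo:
                    earlier = search(ppi, frames_left - 1)   # move to the previous frame
                    if earlier is not None:
                        return earlier + [pi]
        return None

    return search(target_state, k)


def counter(ppi, pi):
    """Hypothetical 2-bit counter with an enable input."""
    value = (ppi[0] * 2 + ppi[1] + pi[0]) % 4
    return (value // 2, value % 2)


sequence = reverse_time_justify(counter, 1, 2, (1, 0), (0, 0), k=3)
```

Reaching a count of 2 from the reset state needs two frames with the enable input high, which is what the search returns.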

If, while tracing objectives in a frame, a conflict is found (two objective traces require different

values on the same signal), a “Backtrack” operation will remove the conflicting assignment.

This will move processing back to the previous assignment. If there were no other assignments

remaining in the current frame, then processing will return to the prior frame (frame Ti+1) to get

back to the previous assignment. If conflicts are encountered such that line k = 1 cannot be

justified in frame k, then the algorithm fails. This decision flow is shown in Figure 2.3.

Figure 2.3: Algorithm Flow Diagram

The implementation of the algorithm flow chart shown in Figure 2.3 can be thought of in four

separate categories. The first category consists of decisions related to the current time frame: based on

the current state of objectives, should more processing be done on the current frame, or should

processing move to a different frame? These decisions fall to the “PI/PPI Decision Block”. The

second category consists of decisions based on objectives in the current frame: which objective

should be backtraced, are there any conflicts, and are all objectives satisfied? These are part of

the “Objective Decision Block”. The third category handles the “implication” portion of the

PODEM-based logic tracing. This is where current input objective values are propagated

forward through the circuit under test to verify the consistency of the output objective values

being traced. This is handled by the “Forward Network”. The fourth, and final, category

handles the “backtrace” portion of the PODEM-based logic tracing. This is where an output

objective value is backtraced to necessary objective values on inputs. This is handled by the

“Backward Network”.

3. Architecture

As discussed in the previous section, the algorithm presented here can be divided up into four

discrete functional groups. These groups directly correspond to the different modules used in

implementing the algorithm on reconfigurable hardware. The required connections in the

algorithm flowchart then become data signals between each of the modules, as shown in Figure

3.1. The Verilog code implementing all of the modules described in this section is included in

Appendix D.

Figure 3.1: Top Level Architecture Block Diagram

The PI/PPI Decision Block controls time-frame based decisions, which are contingent on the

current state of objectives in the frame. Thus, the Objective Decision Block communicates this

state using Done, Conflict, line_k_X, and line_k_1. The PI/PPI Decision Block must set up each

new time frame when moving to a new one, so it must set the new values in the Forward

Network (via the PI and PPI busses) as well as providing the objectives for the new frame to the

Objective Decision Block (via the ppi bus).

The Objective Decision Block is responsible for all objective-based operations. It must examine

the output of the Forward Network to ensure that all current objectives are consistent (via the PO

and PPO busses). When it selects the next objective to be worked on, it passes that objective to

the Backward Network over the obj bus. It must then also signal the PI/PPI Decision Block to

make a decision on the current frame, once the objective operations are complete.

The Forward and Backward Networks implement the logical tracing of objective values for the

circuit being tested. To maintain proper consistency when objectives are being backtraced, all

node values in the Forward Network must be passed to the Backward Network (via the STATE

busses). Finally, backtraced objective values from the Backward Network must be passed back

to the PI/PPI Decision Block for storage (via the in bus), as they represent objectives for the next

frame.

3.1. PI/PPI Decision Block

The PI/PPI Decision Block provides the central control system for the architecture, as well as the

main storage. As such, this block can be thought of in terms of two major components: the

control logic, and the results storage RAM. The full structure of this block can be seen in Figure 3.2.

Figure 3.2: PI/PPI Decision Block Structure Diagram

The results storage RAM component of the block is a 14 bit wide, by 8k deep FPGA block

RAM. The 14 bit word width of the RAM is fixed, based on a 14 bit objective encoding scheme,

described in section 3.2. The depth of the RAM is a variable limit on the total number of

objectives that can be stored, constrained by the available block RAM sizes on the FPGA being

utilized.

The control logic takes input signals from the Objective Decision Block (Done, Conflict,

line_k_1, line_k_X) and uses these to make decisions on how to proceed in the algorithm. These

decisions will result in one of four main operations performed by the PI/PPI Decision Block.

Inside the control logic is a state machine which executes these operations through multiple state

transitions. Table 3.1 summarizes these states and transitions.

State  Transition(s)    Operation         Description
s0     init             Idle / NoOp       Waiting for external trigger.
s1     s16, s17         Make Decision     Current frame assignment incomplete.
s2     s12              Pop Value         Pop the top obj value from the RAM.
s3     s3, s7, s8, s22  Frame Readout     Handle frame value readout when moving between frames.
s4     s0               Set Top Mark      Set frame top mark bit for move to frame Ti-1.
s5     s3               Move to Ti+1      Setup to start move to frame Ti+1 (back off frame).
s6     s12              Swap Val          Swap the obj value of top word in current frame.
s7     s6, s9           Backtrack Return  Handle returning from Ti+1 to Backtrack.
s8     s11              Start Counter     Start counting up or down depending on cnt_dir.
s9     s12              Clear Val         Clear top word from in current frame from RAM.
s10    s0, s2, s19      VF Assign (top)   Assign current value on top into fwd value buffer registers.
s11    s4               Stop Counter      Stop counter after one decrement.
s12    s10, s18, s20    Backtrack Update  Update VF based on last Backtrack operation.
s13    s13 (end)        Fail              Justification failed.
s14    s14 (end)        Done              Justification complete. Results in RAM.
s15    s3               Move to Ti-1      Setup to start move to frame Ti-1 (start new frame).
s16    s0               Done with Objs    All objs in received. Push onto Forward Network.
s17    s10              Request Obj       Request next obj from Backward Network Decoder.
s18    s5, s6, s9       Backtrack Op      Selects current operation to perform in Backtrack.
s19    s10              Read RAM          Read current addr in RAM.
s20    s10              VF Assign (ow)    Assign current value on ow into fwd value buffer registers.
s21    s9, s15          State Check       Verify that current frame is unique before moving to Ti-1.
s22    s3               VF Assign (top)   Assign current value on top into fwd value buffer registers.

Table 3.1: PI/PPI Decision Block Control Logic States
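The transition column of Table 3.1 can be captured as a lookup table and used to check that a proposed state sequence is legal. This is a sketch for reasoning about the table, not part of the Verilog implementation; the names `TRANSITIONS` and `is_valid_path` are hypothetical, and transitions out of the idle state s0 are omitted because the table lists them only as "init".

```python
# Outgoing transitions copied from Table 3.1 (s0 leaves idle on an external trigger).
TRANSITIONS = {
    "s0": set(),
    "s1": {"s16", "s17"}, "s2": {"s12"}, "s3": {"s3", "s7", "s8", "s22"},
    "s4": {"s0"}, "s5": {"s3"}, "s6": {"s12"}, "s7": {"s6", "s9"},
    "s8": {"s11"}, "s9": {"s12"}, "s10": {"s0", "s2", "s19"}, "s11": {"s4"},
    "s12": {"s10", "s18", "s20"}, "s13": {"s13"}, "s14": {"s14"},
    "s15": {"s3"}, "s16": {"s0"}, "s17": {"s10"}, "s18": {"s5", "s6", "s9"},
    "s19": {"s10"}, "s20": {"s10"}, "s21": {"s9", "s15"}, "s22": {"s3"},
}


def is_valid_path(path):
    """Return True if every consecutive pair of states is a listed transition."""
    return all(b in TRANSITIONS.get(a, set()) for a, b in zip(path, path[1:]))


more_justification_loop = ["s1", "s17", "s10", "s0"]
move_to_previous_frame = ["s21", "s15", "s3", "s8", "s11", "s4", "s0"]
```

Both example paths follow the state sequences described in the text for the “More Justification” and “Move to Ti-1” operations.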

The first major operation performed by the PI/PPI Decision Block is the “More Justification”

operation. This operation occurs when working inside a frame, once a new objective has been

backtraced, and new objectives are available on the Backward Network. Since the Backward

Network may have backtraced multiple new objectives, the goal of the “More Justification”

operation is to iterate as many times as necessary to push all objectives from the Backward

Network into the block RAM, and onto the Forward Network. The pseudo-code for this

operation is shown in Table 3.2.

while (in != 14'b11111111111111) begin
    push in onto Block RAM;
    push in[2:1] onto fwd buffer addr in[13:3];
    signal back_encoder for next in value;
end
push fwd buffer values onto Fwd Network;

Table 3.2: More Justification Operation Pseudo-code
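The loop in Table 3.2 can be modelled in software as follows. The bit fields in[13:3] and in[2:1] come from the pseudo-code above; bit 0 is left uninterpreted here because its meaning is not given in this section. The function name, the use of a Python list for the block RAM and a dictionary for the forward value buffer are assumptions made for this sketch.

```python
END_OF_OBJECTIVES = 0b11111111111111   # all 1s on the 14-bit `in` bus


def more_justification(encoder_words, ram, fwd_buffer):
    """Consume 14-bit words until the all-1s word; return the values to push forward."""
    for word in encoder_words:
        if word == END_OF_OBJECTIVES:
            break
        ram.append(word)                 # push in onto block RAM
        addr = (word >> 3) & 0x7FF       # in[13:3]: 11-bit address
        value = (word >> 1) & 0b11       # in[2:1]: 2-bit objective value
        fwd_buffer[addr] = value         # bit 0 is not interpreted in this sketch
    return dict(fwd_buffer)              # values pushed onto the Forward Network


ram, fwd_buffer = [], {}
words = [(5 << 3) | (0b01 << 1), (9 << 3) | (0b10 << 1), END_OF_OBJECTIVES]
pushed = more_justification(words, ram, fwd_buffer)
```

An 11-bit address field allows up to 2048 distinct addresses, which follows directly from the field width.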

The operation is implemented as a loop through four states, with an additional state that runs

when looping completes, as illustrated in Figure 3.3.

Figure 3.3: State Transition Model of “More Justification” Operation

From the idle state of s0, when the Backward Network Encoder receives backtraced objectives

from the Backward Network, it saves them into a buffer and sets the NReady signal low. This

triggers the PI/PPI Decision Block to move into state s1. In this state, the PI/PPI Decision Block

sets the genobj signal high, which triggers the Backward Network Encoder to send an objective

value. The encoder scans through its buffer, starting from the last position of the objective

pointer (last objective sent out). If another objective is encountered, the 14 bit objective value is

pushed onto the in bus, back to the PI/PPI Decision Block. If the end of the buffer is reached (no

further objectives), in is set to all 1s. The PI/PPI Decision Block makes a transition based on

this. In the case where an objective value was set, the state moves to s17, where the objective

value on in is pushed into the block RAM. In the next clock cycle the state always proceeds to

s10, where the value that was just pushed into the RAM is also pushed through to the forward

network value buffers (within the PI/PPI Decision Block). The state then returns to s0; idle. In

the case where all 1s are set on in, the state moves to s16, which pushes all values in the Forward

Network values buffer onto the Forward Network, starting the “Imply” operation.

The second major operation of the PI/PPI Decision Block is “Move to Ti-1”. This operation is

triggered when all objectives in the current frame have been satisfied. In frame_k, this means

that line_k has a value of 1 (signal line_k_1 high and line_k_X low). Beyond frame_k, this means

that all PPI objectives from the previous frame match the PPO values of the current frame (signal

Done high and Conflict low). The signals that trigger this operation are all generated by the

Objective Decision Block. The goal of the “Move to Ti-1” operation is to set up the system to

start working on the next frame in the justification. The “Move to Ti-1” operation consists of

three main parts. The first is state containment, where the assignment made in the current frame

is checked against all past frames to ensure a loop has not been encountered. The second is the

process of pushing all PPO objective values from the current frame out to the RF in the

Objective Decision Block, to be justified as PPIs in the frame that is being moved to. The third,

and last, is the clearing of the Forward Network, which then triggers the start of the “Imply”

operation. The pseudo-code for this operation is shown in Table 3.3:

//State Containment
while (addr != 14'b11111111111111) begin
    if (top flag set) begin
        if (current_frame == past_frame) begin
            Stop Move to Ti-1;
            Start Backtrack;
        end
    end
    addr--;
end

//Push PPOs to Obj Dec Block
Reset addr to top;
while (!top flag set) begin
    if (val at addr is PPI) begin
        Push value to Obj Dec Block RF;
    end
    addr--;
end

//Clear Fwd Network
Set all values in Fwd Network buffer to 2'b11;
Push Fwd Network buffer values to Fwd Network;

//Update frame counter
frame_count--;

Table 3.3: Pseudo-code for “Move to Ti-1” Operation

The implementation of this operation utilizes 7 states in sequence, with 2 states having internal

loops to iterate on multiple objective values inside of and across frames. This logic flow is

shown in Figure 3.4:

Figure 3.4: State Transition Model of “Move to Ti-1” Operation

The triggering for the “Move to Ti-1” operation begins within the Objective Decision Block.

Whenever the “Imply” operation completes on the Forward Network (new PO/PPO values arrive

at the Objective Decision Block), these values are checked to see if the current objectives have

been met. If the system is currently in frame_k, then the only check is if line_k is 1. If so,

line_k_1 is set high and line_k_X is set low. If the system is beyond frame_k, then each PPO

value from the Forward Network is compared against each PPI value stored in the Objective

Decision Block’s RF. If all values match, then Done is set high and Conflict is set low. Once

this check completes positively, the Objective Decision Block sets the newframe_ready signal

high. This signal triggers the Backward Network Encoder, which controls the NReady signal.

The raising of the newframe_ready signal translates into a raising of the NReady signal, which

triggers the PI/PPI Decision Block to move out of idle. Given the signals set by the Objective

Decision Block, the PI/PPI Decision Block will begin the “Move to Ti-1” operation by moving to

state 21. This state implements the state containment check, as shown in Figure 3.5:

Figure 3.5: State Containment Check Implementation

The check keys off the fact that at the end of working on a frame, the full PI/PPI assignment for

that frame is contained in the buffer to the Forward Network (valuestoforward). A duplicate of

that buffer is used, called pastframevalues. A value from past frames stored in the block RAM is

read into pastframevalues each clock cycle. When the “top” mark bit is set on a value being read

out, this indicates that the last value read out was the last value of a complete frame, which

triggers the endofframe signal. When endofframe is set, the two sets of values in the buffers are

compared to check for an exact match, which becomes the statecheckresult value. If the end of

the values in the block RAM has not yet been reached, the values in pastframevalues will be

cleared and the next frame’s values will continue to be read in for another round of comparison.

If all values in the block RAM have been read out, and statecheckresult remains 0, the check

passes. When the check ends, either passing or failing, the donewithstatecheck signal will be

raised. When donewithstatecheck is raised, statecheckresult will determine the next state

transition; if 1, the state containment check failed, and the state will transition over to the

“Backtrack” operation – if 0, the state containment check passed, and the state will transition to

state 15 to continue the “Move to Ti-1” operation.
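The comparison performed by the state containment check can be sketched in software. The RAM layout used here is an assumption made for illustration: past frames are stored as (address, value, top mark) entries in push order, with the mark set on the last word pushed for each completed frame, and the current frame is held separately in the forward buffer. The hardware reads the RAM one word per clock cycle, which this sketch does not model.

```python
def split_frames(ram):
    """Group (addr, value, top_mark) entries into per-frame assignments.

    Assumed layout: a set top mark identifies the last word pushed for a frame.
    """
    frames, current = [], {}
    for addr, value, top_mark in ram:
        current[addr] = value
        if top_mark:
            frames.append(current)
            current = {}
    return frames


def state_check_result(values_to_forward, ram):
    """Return 1 if the current assignment repeats a past frame (check fails), else 0."""
    return int(any(values_to_forward == past for past in split_frames(ram)))


ram = [(0, 1, False), (3, 0, True), (0, 0, False), (3, 0, True)]
repeated = state_check_result({0: 0, 3: 0}, ram)
unique = state_check_result({0: 1, 3: 1}, ram)
```

A result of 1 corresponds to the case where the state machine moves to the “Backtrack” operation, and 0 to the case where it continues to state 15.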

State 15 will always transition to state 3 on the next clock cycle. In state 3, the block RAM is

enabled in read mode, and one objective value per clock cycle is read out from the current frame.

These values are examined by the “isppi” module, which compares the address bits of the

objective to the number of PIs in the circuit to decide if the objective is a PPI. If the objective is

a PPI, the value is set on the ppi signal. This signal travels to the Objective Decision Block,

where the value will be read into the RF. When the “top” mark bit is encountered in the block

RAM, the current frame has been completely read out, and the endofframe signal is raised. This

triggers state 3 to transition to state 8.

State 8 enables the frame counter to count down (to track the move back by one frame to Ti-1) by

setting Cnt_enable high and Cnt_dir low. On the next clock cycle, the state will always

transition to state 11, which disables the counter by setting Cnt_enable low. As the counter is

allowed to run for one clock cycle, it will count down by 1, tracking to the new frame being

moved to; Ti-1. State 11 also sets the nclear signal to 2’b00, which triggers the valuestoforward

buffer to be cleared out, and the clear values to be pushed onto the Forward Network. This then

begins the next frame’s “Imply” operation. The PI/PPI Decision Block then completes one final

action by proceeding to state 4, which triggers the block RAM to set the “top” mark value on the

top word by setting rewrite1 high and rewrite0 low. This, then, marks the last word from the

frame pushed into the block RAM as the end of the current frame. The next clock cycle will

always transition back to state 0, idle, for the PI/PPI Decision Block to wait for its next trigger

based on the new “Imply” operation that was started.

The third major operation of the PI/PPI Decision Block is “Backtrack”. This operation is

triggered by the Conflict signal from the Objective Decision Block. This signal being raised

indicates that either a PPO from the Forward Network conflicts with a stored PPI, or a

conflicting objective assignment has caused an objective propagation failure inside the Backward

Network. The “Backtrack” operation consists of two functions that are used in a complementary

fashion. The “Swap Val” function will change the value of the top objective between 0 and 1, as

well as set the “flag” bit to indicate that “Swap Val” has been run on this objective. The “Clear

Top” operation removes the top objective from the stack for the current frame. Whenever a

Conflict is encountered, any objectives that have already been swapped are cleared, and then the

next un-swapped objective value is swapped. This operation is detailed in the pseudo code

shown in Table 3.4:

//Clear Top
while (flag set on top word) begin
    Rewrite top word: value=2’b11;
    Push top to Fwd Network buffer;
    Pop top off block RAM;
    if (mark set on top word) begin
        Start “Move to Ti+1” Operation;
    end
    addr--;
end
//Swap Val
Rewrite top word: swap value, set flag;
Push top to Fwd Network buffer;

Table 3.4: “Backtrack” Operation Pseudo-code

The “Backtrack” operation is implemented over a total of 7 states, with 4 handling the functions

and 2 handling the control flow (plus the idle state). This construction is shown as a state

transition model in Figure 3.6:

Figure 3.6: State Transition Model of “Backtrack” Operation

The “Backtrack” operation begins external to the PI/PPI Decision Block, with a conflict in

objectives being detected. In one case, if a PPO value read out of the Forward Network

disagrees with a PPI value stored in the Objective Decision Block’s RF, the Conflict signal will

be raised and the newframe_ready signal will toggle. The toggling of newframe_ready triggers

the Backward Network Encoder to set the NReady signal low. In the other case, a new obj is

pushed onto the Backward Network. If this does not resolve into a change in the output of the

Backward Network by the next clock cycle, the Backward Network Encoder will time out and

raise the propfail signal and set delayed_nready. The propfail signal triggers the Objective

Decision Block to raise the Conflict signal. The delayed_nready signal will have the Backward

Network Encoder wait until the next clock cycle before setting NReady low, which allows time

for the Objective Decision Block to receive propfail and assert the Conflict signal.

Once the PI/PPI Decision Block receives the NReady signal, it is triggered to leave the idle state,

s0. In this case, with Conflict set, there are three possible transitions that it could make. If the

current top word on the block RAM has its “mark” bit set, the “Move to Ti+1” operation will

start, as the “mark” bit indicates that the current top word is the last of a previous frame.

Otherwise, a function will be run on the top word based on its “flag” bit. If the “flag” bit is not
set, the process moves to state 6, where the value will be swapped, the “flag” bit will be set and
the word will be written back to the block RAM. If the “flag” bit is set, this word has already
been swapped, so the process moves to state 9, where the value will be cleared (set to 2’b11), the
clearvaluescheck control var will be set and the word will be written back to the block RAM.
Both states then transition on to state 12.

State 12 acts as the main control state for the “Backtrack” operation. State 12 will always set the

control value sendvaluestoforward=0, which will suppress any value updates in the Forward

Network buffer from being sent through to the Forward Network. This allows for multiple value

updates (i.e., “Clear Top” and then “Swap Val”) before finally triggering the Forward Network

with a new “Imply” operation. When first visiting state 12, no control bits are set, so the process

continues on to state 10. In state 10, the current top word value is pushed to the valuestoforward

buffer. If the previous function that was run was a “Swap Val”, then the operation is complete

and the state will transition back to the idle state s0. In this transition the sendvaluestoforward

control value will be set back to 1, which will then trigger the “Imply” operation to start onto the

Forward Network with the updated values on the Forward Network buffer. If the previous

function run was “Clear Top”, then the clearvaluescheck control value was set, which will cause

state 10 to transition to state 2 (clearvaluescheck will also be set back to 0 at this point).

State 2 will begin a pop operation in the block RAM, which will remove the current top word

from the stack. The popcheck control var will be set to 1 at this point, and the process will then

transition back to state 12. On this second visit to state 12, the control var popcheck has been set,

which will cause state 12 to redirect the process flow to state 18 (as well as resetting the

popcheck var to 0). In state 18, the new top word will be examined to determine what function

runs next (the same initial decision that was made transitioning out of state 0). If the “mark” bit

is set, the “Move to Ti+1” operation will start. If the “flag” bit is set, processing will return again

to state 9 to clear this word. In the default case, neither bit is set, indicating a fresh word.

Processing will then proceed to state 6.

In state 6, the value of the top word will be swapped and written back to the block RAM, along

with the “flag” bit being set. The process continues on, back to state 12. This time through,

again no control vars are set, so processing continues on to state 10. Assuming that this time

through, state 6 was visited and a value was swapped, the clearvaluescheck control var will not

be set and processing will finally return to state 0. As part of this transition, the

sendvaluestoforward control value will be set back to 1, which will trigger the updated Forward

Network buffer to send its values onto the Forward Network, beginning a new “Imply”

operation.
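The “Backtrack” flow described above can be sketched in software as a loop over a stack of objective words. This is only an illustrative model, assuming the stack holds the words of the block RAM with the top word last; the names Word, backtrack and the "exhausted" outcome are hypothetical and do not appear in the hardware design.

```python
from dataclasses import dataclass

X = 0b11  # cleared ("don't care") value, 2'b11 in the text


@dataclass
class Word:
    addr: int            # PI/PPI index the objective targets
    value: int           # 0b01 = logic 0, 0b10 = logic 1, 0b11 = X
    flag: bool = False   # set once "Swap Val" has run on this word
    mark: bool = False   # set on the last word of a previous frame


def backtrack(stack, fwd_buffer):
    """Run Clear Top / Swap Val on `stack` (top of stack is stack[-1]).

    Returns "imply" once a value has been swapped, "move_to_ti_plus_1"
    when a marked word is reached, or "exhausted" (a hypothetical
    outcome for this sketch) if the stack empties.
    """
    while stack:
        top = stack[-1]
        if top.mark:
            return "move_to_ti_plus_1"
        if top.flag:
            fwd_buffer[top.addr] = X   # Clear Top: push the cleared value
            stack.pop()                # then pop the word off the stack
            continue
        top.value ^= 0b11              # Swap Val: 01 <-> 10
        top.flag = True
        fwd_buffer[top.addr] = top.value
        return "imply"
    return "exhausted"
```

In this model the forward buffer is only released to the Forward Network when the function returns "imply", which mirrors the role of sendvaluestoforward in state 12.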

The fourth major operation of the PI/PPI Decision Block is “Move to Ti+1”. This operation is

called as part of the “Backtrack” operation, when all objectives in the current frame have been

exhausted. That is triggered when the top word on the block RAM has its “mark” bit set,

indicating that it is the last objective of the previous frame in the stack. The core function of the

“Move to Ti+1” operation is to restore the state of the system to what it was at the end of the

previous frame in the stack (Ti+1). The first step to accomplish this is to un-mark the top word

for frame Ti+1. Then, all objective values in frame Ti+1 need to be read back onto the Forward

Network. Finally, all PPI objective values from frame Ti+2 need to be loaded back into the RF in

the Objective Decision Block. Once this is complete, the “Backtrack” operation which started

the “Move to Ti+1” operation can continue. The function described here is illustrated in pseudo-

code in Table 3.5:

Rewrite top word: mark=1’b0;
//Read Ti+1 objs to Fwd Network Buffer
while (!endofframe) begin
    Read obj at block RAM addr to Fwd Network Buffer;
    addr--;
end
//Read Ti+2 PPIs to Obj Dec RF
while (!endofframe) begin
    if (val at addr is PPI) begin
        Push val to Obj Dec Block RF;
    end
    addr--;
end
//Update frame counter
frame_count++;
//Return to Backtrack operation
Return to calling Backtrack;

Table 3.5: “Move to Ti+1” Operation Pseudo-code

The “Move to Ti+1” operation is implemented in a total of 4 states: one handles the initial
unmarking of the top word, another handles the frame counter update and return to “Backtrack”,
and the remaining two handle the looping read-out of objectives from the block RAM. The

processing flow of these states is shown with state transitions in Figure 3.7:

Figure 3.7: State Transition Model of “Move to Ti+1” Operation

The “Move to Ti+1” operation always starts coming from a “Backtrack” operation, where the top

word on the block RAM has its “mark” bit set, indicating that all objective words for the current

frame have been exhausted. The first way for this to occur is transitioning from state 0. If the

Conflict signal is set, as discussed in the startup of the “Backtrack” operation, but the top

objective word on the block RAM has its “mark” bit set, processing proceeds directly into the

“Move to Ti+1” operation, going to state 5. The alternate case is when the “Backtrack” operation

is proceeding, but after clearing all “flagged” objectives, the next one encountered on the block

RAM has its “mark” bit set. In this case, state 18 in the “Backtrack” operation will transition to

state 5 to start up the “Move to Ti+1” operation, so that “Backtrack” can continue.

State 5 completes a rewrite operation for the top word in the block RAM, reading the word out,

setting the “mark” bit to 1’b0 and then writing it back to the same address. Once the unmarking

is complete, state 5 transitions to state 3 in the next clock cycle. State 3 handles reading out

objectives from the memory to restore the system state to that of the previous frame. It puts the

block RAM into read mode, sets the current addr to the top word on the block RAM, and will

decrement addr every clock cycle, to read out one objective per cycle. When first transitioning

to state 5, the blockisppi2 control var is set, which blocks the output of the RAM from being

evaluated by the “isppi” block (forwarding PPIs to the Objective Decision Block RF). The first

set of objectives to be read out is frame Ti+1, so nothing goes through “isppi” to the RF. Since
blockisppi2 is set, each clock cycle the process transitions to state 22. In this state, the

objective word that was just read out of the block RAM (into the top signal) is assigned into the

Forward Network buffer. State 22 then always transitions back to state 3 for the next clock

cycle.

In state 3, once the first “mark” bit is encountered while reading off objective words, the

blockisppi2 control var is set to 0. This “mark” bit indicates that the addr is now pointing into

frame Ti+2, so objective words should no longer be assigned into the Forward Network buffer and

should instead be passed through “isppi” to the Objective Decision Block RF. Processing will

now stay in state 3, as each clock cycle a new objective word is read from the block RAM into

the top signal, which feeds into the “isppi” block. For each of these objective words, if their

address value is greater than the number of PIs in the circuit under test, the objective value will

be forwarded on to the Objective Decision Block via the ppi signal, where it will be read into the
RF.

When the second “mark” bit is encountered in state 3, processing will proceed to state 7. The

second “mark” bit indicates that the current address is pointing into frame Ti+3, so there are no

more objective words to be read out in this operation. State 7 enables the frame counter to count

up by setting Cnt_enable to 1’b1 and Cnt_dir to 1’b1. The “flag” bit on the top word in the

block RAM is then examined to decide on which state in the “Backtrack” operation to return to.

If the “flag” bit is set, the current objective word has already been swapped, so processing
transitions to state 9 to clear the value and continue on in the “Backtrack” flow. If the “flag” bit

is not set, the current objective has not yet been swapped, so processing transitions to state 6 to

swap the value and continue with “Backtrack”. Note that both state 6 and state 9 set Cnt_enable

to 1’b0, disabling the counter, which locks the counter back in after incrementing by 1 to track to

the new current frame.
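The read-out performed by states 5, 3, 22 and 7 can be modeled as one pass down the stack. The sketch below is a software approximation under stated assumptions: the stack is a list of dictionaries with the top word last, addresses are zero-based so that a PPI is any word with addr >= num_pis, and the function name move_to_ti_plus_1 is hypothetical.

```python
def move_to_ti_plus_1(stack, num_pis, frame_count):
    """Restore frame Ti+1 from `stack` (top of stack is stack[-1]).

    Each word is a dict with keys "addr", "value" and "mark".
    Returns (fwd_buffer, rf, new_frame_count).
    """
    fwd_buffer, rf = {}, {}
    stack[-1]["mark"] = False            # state 5: un-mark the top word
    marks_seen = 0
    for word in reversed(stack):         # state 3: addr-- every cycle
        if word["mark"]:
            marks_seen += 1
            if marks_seen == 2:          # now pointing into Ti+3: stop
                break
        if marks_seen == 0:
            # Frame Ti+1: every objective goes to the Forward Network buffer
            fwd_buffer[word["addr"]] = word["value"]
        elif word["addr"] >= num_pis:
            # Frame Ti+2: only PPIs are forwarded to the RF
            rf[word["addr"] - num_pis] = word["value"]
    return fwd_buffer, rf, frame_count + 1   # state 7: count up by one
```

The marks_seen counter plays the role of the blockisppi2 control var: before the first mark is seen, words are routed to the Forward Network buffer, and afterwards they are routed through the PPI check.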

3.2 Objective Decision Block

The Objective Decision Block serves as the main interface between the Forward Network and

Backward Network, providing two core functions. The first is selection of objective values to be

pushed onto the Backward Network as input to the Backtrace operation. The second is signaling

the PI/PPI Decision Block as to the status of objective satisfaction within the current frame being

operated on. To provide these functions, the Objective Decision Block must store a full set of

objectives for the current frame. To this end, the heart of the Objective Decision Block is a

register file which is sized to be able to contain as many objectives as there are PPI/PPOs in the

circuit under test (l). The high level structure of implementation for the Objective Decision

Block is illustrated in Figure 3.8:

Figure 3.8: Objective Decision Block Structure Diagram

Storing objectives for the current frame is the first action the Objective Decision Block takes
when a new frame is being evaluated. When the PI/PPI Decision

Block begins moving to a new frame, a signal will be sent to the Objective Decision Block. In

the case of moving to frame Ti+1, the tiplus1 signal will be set to 1. In the case of moving to

frame Ti-1, the output signals from the “State Check” sub-module of the PI/PPI Decision Block

will provide indication to the Objective Decision Block (donewithstatecheck=1’b1 and

statecheckresult=1’b0 indicates a successful move to frame Ti-1). When this “new frame” signal

is received, any current values in the RF are flushed out, replacing all words with 14’b0.

After the RF has been cleared out, the PI/PPI Decision Block begins shifting all objectives for

the current frame to the Objective Decision Block over the ppi bus. The idle state for the ppi bus

is 14’b11111111111111, so any time this is assigned to a different value, that indicates that a

new objective word is being pushed from the PI/PPI Decision Block. This value will then be

read from the ppi bus into the Objective Decision Block RF. The reading process involves

decoding the objective word, to insert the objective value into the proper associated RF address.

Since the value of the address represents the position of the objective value across all PIs and

PPIs, the address must be corrected by the number of PIs in the circuit under test (represented by

the parameter m). This shifts the address of the objective word to have a value from 0 to l, which

matches the size of the Objective Decision Block RF. This value read-in operation is illustrated

in Figure 3.9:

Figure 3.9: Objective Decision Block PPI Objective Read In
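The read-in step can be illustrated with a short sketch. The bit layout used here (address in bits [13:4], value in bits [3:2], “flag” in bit 1, “mark” in bit 0) is an assumption made purely for illustration, as are the function names; only the 14-bit width, the idle pattern and the correction by m come from the text.

```python
IDLE = 0b11111111111111  # idle pattern on the 14-bit ppi bus


def unpack(word):
    """Split a 14-bit objective word into (address, value).

    Hypothetical layout: [13:4] address, [3:2] value, [1] flag, [0] mark.
    """
    return (word >> 4) & 0x3FF, (word >> 2) & 0b11


def read_ppi(rf, word, m):
    """Store one objective from the ppi bus into the register file `rf`.

    `m` is the number of PIs; subtracting it maps the PI/PPI address
    onto an index into the PPI-only register file. Returns False when
    the bus is idle and nothing was stored.
    """
    if word == IDLE:
        return False
    addr, value = unpack(word)
    rf[addr - m] = value
    return True
```

For example, with m = 3 an objective word addressed to position 5 lands in RF entry 2.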

Selection of an objective value with which to begin a “Backtrace” operation can be broken down into

two distinct cases. The first, simpler, case is when operating in frame k. In this case, there are

no PPI values from a prior frame to be satisfied; there is only one objective, which is the base

objective that the test is attempting to find a pattern for; line_k = 1. With line_k being the first

PO (based on the bit order of the PO bus), the objective is assembled with an address component

of 0, and an objective value of 1 (encoded with the standard three-value 2-bit encoding as 10).

The “flag” and “mark” bits are set to 0, as the PI/PPI Decision Block is the only place where

these bits can be modified to 1. The assembly of this bit data into the 14-bit objective word

format is shown in Figure 3.10:

Figure 3.10: Assembly of Frame-k Objective Word

The second, more complicated, case for selecting an objective is in a frame other than frame k.

In this case, there is a set of objectives that must be satisfied for the current frame; so the

Objective Decision Block must evaluate the current satisfaction of objectives in the frame and

select the next un-met objective for Backtrace. This evaluation occurs as a logical comparison

between the current frame objective values in the RF against the current PPO values from the

Forward Network. These comparisons can all occur in parallel, as shown in Figure 3.11:

Figure 3.11: Objective Decision Block Conflict/Done Decision Logic

The results of all comparison operations are combined together to produce composite Done and

Conflict signals. Done indicates that all current frame objective values match their associated

values on the Forward Network PPO, hence the current frame is completely satisfied. Conflict

indicates that a value on the Forward Network PPO is incompatible with the associated current

frame objective in the RF. If either Done or Conflict is set, then the newframe_ready signal will

also be toggled. This is seen by the Backward Network Encoder, which subsequently sets the

NReady signal low. The setting of the NReady signal triggers the PI/PPI Decision Block to start

evaluation. When Done is seen while operating outside of frame k, a Move to Ti-1 operation will

begin. When Conflict is seen outside of frame k, the PI/PPI Decision Block will take corrective

action via the “Backtrack” operation.

In the case where neither the Done nor the Conflict signal is set, more justification is required for the

current frame. This situation will trigger the sequencer to select a new objective value to start a

“Backtrace” on. The sequencer examines each objective value in the RF, until one is found

associated with an “X” (2’b11) value on the Forward Network PPO. This objective value is then

selected as the next objective to be pushed to the Backward Network for “Backtrace”. This

function of the sequencer is illustrated in Figure 3.12:

Figure 3.12: Objective Decision Block Sequencer Operation
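The Done/Conflict decision and the sequencer selection can be summarized in a small software model. This is a sketch under assumptions: RF entries of 00 are treated as “no objective stored” (following the flush to all zeros), a conflict is taken to mean that both the objective and the PPO hold definite but different values, and the function name evaluate_frame is hypothetical.

```python
NONE, ZERO, ONE, X = 0b00, 0b01, 0b10, 0b11


def evaluate_frame(rf, ppo):
    """Return (done, conflict, next_index) for the current frame.

    `rf` holds the stored objective values, `ppo` the values currently
    on the Forward Network PPO, position by position.
    """
    active = [(i, o, p) for i, (o, p) in enumerate(zip(rf, ppo))
              if o in (ZERO, ONE)]
    # Conflict: a definite PPO value disagrees with its objective.
    conflict = any(p in (ZERO, ONE) and p != o for _, o, p in active)
    # Done: every stored objective is matched by its PPO value.
    done = all(p == o for _, o, p in active)
    next_index = None
    if not done and not conflict:
        # Sequencer: pick the first objective whose PPO is still X.
        next_index = next((i for i, _, p in active if p == X), None)
    return done, conflict, next_index
```

In hardware the comparisons run in parallel and the sequencer scans the RF; the list comprehension and next() call here are only a sequential stand-in for that behavior.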

In addition to selecting objective values, the Objective Decision Block is responsible for

maintaining state control signals that are output to the PI/PPI Decision Block. These include

Done and Conflict (as previously discussed), as well as line_k_X and line_k_1. Similar to how

Done and Conflict provide signaling to the PI/PPI Decision Block when not in frame k, both of

the line_k_* signals are used when in frame k. If the value of line k currently coming out of the

Forward Network is X/2’b11 (this is the state after initial reset), the Objective Decision Block

sets the objective of line_k=1, and also sets the line_k_X signal to the PI/PPI Decision Block. In

frame k, line_k_X will trigger the PI/PPI Decision Block to accept backtraced objective values

that will be coming from the Backward Network based on the selected objective. If in frame k,

and the value on line k from the Forward Network is 1, then frame k is completely satisfied. The

line_k_1 signal is set, and the newframe_ready signal is toggled. The toggle of newframe_ready

is detected by the Backward Network Encoder, which will consequently set the NReady signal

low. This then triggers the PI/PPI Decision Block, reading line_k_1 in frame k, causing a “Move

to Ti-1” operation to start.

One final control signal interlock managed by the Objective Decision Block is the propfail

signal, originating from the Backward Network Encoder. When an objective pushed onto the

Backward Network as part of a “Backtrace” operation does not produce any objective values, an

objective propagation failure has occurred. When this happens, a “Backtrack” operation is

needed from the PI/PPI Decision Block to undo the cause of the propagation failure. So, when

the Backward Network Encoder detects a propagation failure, the first action it takes is sending

the propfail signal to the Objective Decision Block. This signal triggers the Objective Decision

Block to set the Conflict signal. The Backward Network Encoder’s next action is then to set the

NReady signal low. This causes the PI/PPI Decision Block to begin an operation. As the

Conflict signal has been set, the PI/PPI Decision Block will begin the needed “Backtrack”

operation.

3.3 Forward Network Derivation

One important facet of the algorithm described here is the need to model 0, 1 and X values in the

circuit, and hence the need for signals in the network to be represented by 2-bit values. This

means that to be modeled in this architecture, circuits under test must first be transformed into

two-bit equivalent networks to be compatible. The more straightforward of the two required

network translations is the Forward Network, as it is a direct mapping of function from one bit to
two bits.

In mapping the Forward Network, the representations that will be used are 01 as a logic 0, 10 as

a logic 1 and 11 as a logic X. The first step in translation is to expand a gate’s one-bit truth table

into a full form, including the value X in all possible combinations. This expansion is shown in

Table 3.6, using an AND gate as an example.

Table 3.6: Full One-Bit Truth Table for AND Gate

The fully expanded truth table can then be directly mapped into a two-bit representation. Note

that in this translation to two-bit representation, the meaning of X changes. Xs from Table 3.6

are translated to 11, but since 00 is an unassigned value in the Forward Network, when either A

or B is 00, the output will be “don’t care” or X. Table 3.7 continues the example of this using an

AND gate.

A1  A0  B1  B0  Y1  Y0
0   0   0   0   X   X
0   0   0   1   X   X
0   0   1   0   X   X
0   0   1   1   X   X
0   1   0   0   X   X
0   1   0   1   0   1
0   1   1   0   0   1
0   1   1   1   0   1
1   0   0   0   X   X
1   0   0   1   0   1
1   0   1   0   1   0
1   0   1   1   1   1
1   1   0   0   X   X
1   1   0   1   0   1
1   1   1   0   1   1
1   1   1   1   1   1

Table 3.7: Two-Bit Truth Table for AND Gate

The expanded two-bit truth table can then be broken up into two separate Karnaugh maps, one

for the output bit Y1 and another for the output bit Y0. Table 3.8 continues the example using

the AND gate, showing the associated K-maps with minimum groupings highlighted.

Table 3.8: Split K-maps for AND Gate Output Bits

From these Karnaugh maps, minterm expressions can be derived for both Y1 and Y0. Note

again that 00 is an unassigned value in the Forward Network, thus Xs are assigned to those spots

in the K-map, which can then be used to extend minterm groups for further simplification. Table

3.9 completes the example using the AND gate.

Y1 = A1 B1 Y0 = A0 + B0

Table 3.9: Minterm Expressions for AND Gate Output Bits

Using this method for each of the three basic gates, a full set of translation equations for the

Forward Network can be obtained. These equations are summarized in Table 3.10.

AND Gate          OR Gate           NOT Gate
Y1 = A1 B1        Y1 = A1 + B1      Y1 = A0
Y0 = A0 + B0      Y0 = A0 B0        Y0 = A1

Table 3.10: Forward Translation Equations for Basic Gates
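The forward translation equations can be checked against ordinary three-valued logic with a few lines of code. The sketch below represents each signal as a (bit 1, bit 0) pair using the encodings given above (0 = 01, 1 = 10, X = 11); the function names and the reference function three_valued_and are illustrative only.

```python
ZERO, ONE, X = (0, 1), (1, 0), (1, 1)   # (bit 1, bit 0) pairs
VALUES = (ZERO, ONE, X)


def and2(a, b):
    # Y1 = A1 B1, Y0 = A0 + B0
    return (a[0] & b[0], a[1] | b[1])


def or2(a, b):
    # Y1 = A1 + B1, Y0 = A0 B0
    return (a[0] | b[0], a[1] & b[1])


def not2(a):
    # Y1 = A0, Y0 = A1
    return (a[1], a[0])


def three_valued_and(a, b):
    """Reference AND over {0, 1, X}, used to check and2."""
    if ZERO in (a, b):
        return ZERO          # a controlling 0 forces the output to 0
    if a == ONE and b == ONE:
        return ONE
    return X                 # otherwise the output is unknown


def forward_equations_hold():
    """True when the two-bit equations agree with three-valued logic."""
    for a in VALUES:
        for b in VALUES:
            if and2(a, b) != three_valued_and(a, b):
                return False
            # OR is checked through De Morgan's law on the same reference.
            if or2(a, b) != not2(three_valued_and(not2(a), not2(b))):
                return False
    return True
```

Note that swapping the two bits is enough to invert a signal, and that X (11) is its own complement, which is why the NOT gate needs no logic at all in the Forward Network.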

3.4 Backward Network Derivation

The more complex of the two network translations is the Backward Network, due to the fact that

the function of each gate must be transformed, as opposed to being directly mapped between bit

representations. One important consideration in backtracing is that consistency must be

maintained. That is, if a backtrace is attempting to assign a value in the Backward Network that

does not match a value for that node that has already been assigned in the Forward Network, that

assignment must be prevented. In the context of backtracing through a gate, this means that

propagating an objective value requires the corresponding signal be “don’t care” (X) in the

Forward Network. This can either be implemented as a check against the inputs of each gate

being evaluated, or the output of each gate being evaluated. Here, the output of each gate is used

as the check to minimize the number of required consistency checks. This dependence is

illustrated in Figure 3.13.

Figure 3.13: Abstract Backward Gate Values Dependence

From this abstract model of a gate in the Backward Network, a functional definition must be
derived. To this end, pseudo-code for the values of the A and B outputs can be written, as

in Table 3.11.

A Obj Output
If (Z_obj is required obj) then
    If (Z_fwd = X) then
        A_obj = Z_obj

B Obj Output
If (Z_obj is required obj) then
    If (Z_fwd = X) then
        If (Z_obj = 1) then
            B_obj = Z_obj

Table 3.11: Pseudo-code for AND Gate Backward Model Outputs

Note that the behaviors of the A and B objective values differ based on the Z objective value.

This is based on the requirement to complete a minimized backtrace operation. In the case of the

AND gate, when an objective value of 1 is required on Z, both A and B must also have a

required objective value of 1 to achieve this. When an objective value of 0 is required on Z,

however, only one input needs to be 0 to achieve this. In this case, A is used for propagating the

required objective of 0 and B is ignored.

The next step from this point is to derive a truth table to model the described functionality of this

gate. One important point in modeling this gate is that while the A/B state inputs from the

Forward Network already have a format definition (0=01, 1=10, X=11), the objective

inputs/outputs that are part of the flow of the Backward Network need a separate format. This is

required because while the Forward Network models 3 signal values, the backward network must

model 2 signal values, as well as whether or not the value is a required objective being

backtraced. To this end, the Backward Network also uses 2-bit representations of signals, where

the higher order bit represents whether a value is an objective (1=yes, 0=no) and the lower order

bit represents the value of the signal. Using this signal modeling, the truth table for a backward

model of a gate can be derived, as illustrated in Table 3.12 continuing the AND gate example.

Z_obj_1  Z_obj_0  Z_1  Z_0  A_obj_1  A_obj_0  B_obj_1  B_obj_0
0        0        0    0    X        X        X        X
0        0        0    1    X        X        X        X
0        0        1    0    X        X        X        X
0        0        1    1    X        X        X        X
0        1        0    0    X        X        X        X
0        1        0    1    X        X        X        X
0        1        1    0    X        X        X        X
0        1        1    1    X        X        X        X
1        0        0    0    X        X        X        X
1        0        0    1    1        0        0        0
1        0        1    0    0        1        1        0
1        0        1    1    1        0        0        0
1        1        0    0    X        X        X        X
1        1        0    1    0        0        0        1
1        1        1    0    1        1        0        1
1        1        1    1    1        1        1        1

Table 3.12: Truth Table for Backward Model of AND Gate

Many of the output values defined in the truth table are “don’t care”. This occurs in cases
where the Z objective value is not a required objective, or where the Z value in the Forward
Network is in an unassigned/illegal state (00). This indicates that equations defining this

backwards gate model are likely to have a reasonably compact final form. In this case, the

pseudo-code presented earlier can be directly translated into the simplified Boolean equations

defining the backward model of the gate, as shown in Table 3.13:

AND Gate                 OR Gate                  NOT Gate
A1 = Z1 Z0 Zo1           A1 = Z1 Z0 Zo1           A1 = Z1 Z0 Zo1
A0 = Zo0                 A0 = Zo0                 A0 = !Zo0
B1 = Z1 Z0 Zo1 Zo0       B1 = Z1 Z0 Zo1 !Zo0
B0 = Zo0                 B0 = Zo0

Table 3.13: Backward Translation Equations for Basic Gates

Since in the Backward Network encoding, the value is completely represented by the “0” bit, that

value can always be directly passed from the gate output to the gate inputs (inverted in the case

of NOT). The real decision being made in the backward model is with the “1” bit, whether or

not the required objective propagates. For all inputs, the Forward Network gate value must be X

(Z1 = 1, Z0 = 1). For the “A” input, the current output objective value in the Backward Network

must also be 1 (Zo1 = 1). For the “B” input, on top of all previous checks, the value of the

required objective in the Backward Network must be such that all inputs of the gate must be set

to achieve the objective (Zo0 = 1 for AND, Zo0 = 0 for OR).
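The backward translation equations can be exercised directly in software. In this sketch, z is the gate output value from the Forward Network as a (Z1, Z0) pair and zo is the output objective from the Backward Network as a (Zo1, Zo0) pair; the function names are illustrative and not part of the design.

```python
def and_back(z, zo):
    """Backward model of an AND gate; returns (A_obj, B_obj)."""
    z1, z0 = z
    zo1, zo0 = zo
    propagate = z1 & z0 & zo1            # forward value is X and obj required
    a = (propagate, zo0)                 # A1 = Z1 Z0 Zo1,     A0 = Zo0
    b = (propagate & zo0, zo0)           # B1 = Z1 Z0 Zo1 Zo0, B0 = Zo0
    return a, b


def or_back(z, zo):
    """Backward model of an OR gate; returns (A_obj, B_obj)."""
    z1, z0 = z
    zo1, zo0 = zo
    propagate = z1 & z0 & zo1
    a = (propagate, zo0)                 # A1 = Z1 Z0 Zo1,      A0 = Zo0
    b = (propagate & (1 - zo0), zo0)     # B1 = Z1 Z0 Zo1 !Zo0, B0 = Zo0
    return a, b


def not_back(z, zo):
    """Backward model of a NOT gate; returns A_obj."""
    z1, z0 = z
    zo1, zo0 = zo
    return (z1 & z0 & zo1, 1 - zo0)      # A1 = Z1 Z0 Zo1, A0 = !Zo0
```

For instance, and_back((1, 1), (1, 0)) asks for a 0 on the output of an unassigned AND gate: the A input receives the required objective (1, 0), while the required bit on B stays 0, so B is ignored as described above.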

3.4.1 Backward Network Fanout Handling

One added complexity in the Backward Network is logic fanout. The output of one gate driving

multiple inputs is a fairly common occurrence in the forward logic, but this presents a problem in

the Backward Network. When the logic flow is reversed, this would result in multiple gate

outputs driving single inputs, resulting in contention on the input, as illustrated in Figure 3.14.

Figure 3.14: Backward Fanout Signal Contention

An important consideration in resolving this issue is that delay through different paths will vary,

so different values may arrive at the fanout junctions at an arbitrary time. This opens the

possibility of having transient value states at the junction that could potentially propagate value

glitches to the output. To behave like a single input / single output node, these multiple input

backward nodes must switch once and retain that value until the backward network inputs are

changed. To this end, a priority encoder scheme is employed, such that the first objective value

to arrive at a multiple input node is locked in on the output and any further input changes are

blocked from propagating.

Since signals in the Backward Network are composed of two bits, one determining if a value is a

required objective and the other carrying the objective value, the priority encoding scheme will

require two modules. The first module will generate the higher order, required objective bit.

This is the simpler of the two from an implementation standpoint. The first time a required

objective arrives to the priority encoder, the 1 value should pass through to the output and lock in

at that value. This can be implemented in a straightforward way by using an OR gate to merge

required objective signals, and a feedback of the output to lock in a value of 1. This structure is

illustrated in Figure 3.15.

Figure 3.15: Priority Encoder Structure for Required Objective Bit

This architecture does present one additional challenge, though. With the loopback value lock

directly feeding back into its own generating OR gate, this effects a permanent lock of the gate

output value. Given that the Backward Network will be required to execute many backtraces in

any given justification operation, there needs to be a way to unlock these priority encoded

values. This can be implemented by a new priority_reset signal that is ANDed together with the

Final Req Obj output before the loopback value lock. In normal operation, the priority_reset is

held at 1, which allows the Final Req Obj to pass through. When the Backward Network values

need to be cleared out, the priority_reset signal is set to 0, which causes the loopback value lock

to reset. The priority encoder is then reset to an unlocked state, and is ready to re-lock when the

next required objective propagates. This addition to the structure is illustrated in Figure 3.16.

Figure 3.16: Priority Encoder Structure for Required Objective Bit with Reset

The second module will generate the lower order, objective value bit. This structure will be

slightly more complicated, because in addition to combining and locking values, values must

only pass through to the final locking stage if their corresponding required objective bit is set.

Given the requirement of the value passing with a control signal of 1 and being blocked with a

control signal of 0, a simple AND between the required objective bit and objective value bit will

generate the output value bit into the combination logic. These output value bits will remain 0

except in a case where a required objective of 1 arrives at the input of the node. In that case, the

corresponding output value bit will reflect the input objective value bit. When this happens, the

value must pass through as the final value bit into the locking logic. To achieve this, all output

value bits are ORed together. Since all non-required objective bits are locked to 0, this then

allows the value of the now required objective to pass through this stage of logic as well. This

structure is illustrated in Figure 3.17.

Figure 3.17: Priority Encoder Structure for Objective Value Bit

This half of the priority encoder will require locking and reset logic as well. One important note

in considering this implementation is that unlike the required objective bit, the objective value bit

does not have a set transition when it needs to be locked. For this reason, the locking bit from

the required objective bit must also be used to control value locking for the objective value bit.

Since both halves of the priority encoder share common locking and reset control signals, the

final structure can be represented in a single, complete unit, as shown in Figure 3.18.

Figure 3.18: Complete Structure for Backward Network Priority Encoder

This final structure achieves the required goal for resolving fanout node values in the Backward

Network. When the backward network is idle / cleared, the priority encoder passes through the

required objective bit of 0, and a 0 value bit. When a required objective value arrives at the

encoder, if the encoder is unlocked, the objective value will pass through the required AND filter

and the OR value combination and be output as the final objective value. The required objective

bit will pass through the OR value combination and be passed on to the output as the final

required objective bit. This signal will also lock in both the objective value bit and required

objective bit, so that further input changes are blocked from affecting the priority encoder output,

and hence the final network outputs. Finally, the priority reset signal serves to unlock the

priority encoder, so that the next new required objective can be locked in, by forcing the lock

value bit to 0, and allowing values from the inputs again to pass through to the outputs.
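
The behavior described above can be sketched as a small behavioral model. This is only an illustration in Python, not the thesis Verilog: the class name PriorityEncoder, the step method and the choice to apply the reset before sampling the inputs are assumptions made for this sketch.

```python
class PriorityEncoder:
    """Behavioral sketch of one fanout-resolving priority encoder (hypothetical names)."""

    def __init__(self):
        self.locked = False   # lock bit, driven by the required objective output
        self.required = 0     # final required objective bit
        self.value = 0        # final objective value bit

    def step(self, required_bits, value_bits, priority_reset=False):
        if priority_reset:
            # The reset forces the lock bit to 0 so inputs can pass through again.
            self.locked = False
        if not self.locked:
            # AND filter: a value bit passes only when its required bit is 1.
            filtered = [r & v for r, v in zip(required_bits, value_bits)]
            # OR combination of all branches.
            self.required = int(any(required_bits))
            self.value = int(any(filtered))
            # A required objective locks both halves of the encoder.
            self.locked = bool(self.required)
        return self.required, self.value
```

In this sketch an idle encoder returns (0, 0), the first required objective is latched, and later input changes are ignored until priority_reset is asserted.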

3.4.2 Backward Network Conflict Detection

While backtracing an objective, it is possible for a conflict to occur. This is a situation where a

prior objective in the same frame requires the output of a gate to be the opposite of the current

objective’s requirement. Since the Backward Network is designed to produce a single minimal

objective assignment, a conflict condition means that the frame cannot converge with its current

set of objectives.

There are two possible conflict scenarios that can occur in backtracing. The first, as shown in

Figure 3.19, is a trace-blocking conflict. In this situation, the objective Z2=0 is backtraced first,

generating two new objectives (A=1, C=0). The second backtrace is for the objective Z1=1,

which encounters a conflict on the assignment of the output of G1. As this is the only active

backtracing path for Z1, propagation is completely blocked. With the objective propagation

blocked, no new values are propagated through to the Backward Network Encoder module. The

Objective Decision module is signaled (via the prop_fail signal), causing it to raise the Conflict

signal.

Figure 3.19: Trace-blocking Conflict

The second possible conflict scenario, as shown in Figure 3.20, is the non-blocking conflict. In

this situation, the first objective backtraced is Z1=1. The second objective is Z2=0. This

objective requires the output of G3 (an OR gate) to be 0, hence all inputs must propagate the

objective of 0. Like in the previous example, a conflict occurs at the output of G1, but in this

case there are two active backtracing paths. While the path backtracing to A=1 is blocked, the

path backtracing to C=0 is not, so the Backward Network Encoder module still receives an

objective. The Backward Network Encoder will then provide this objective to the PI/PPI

Decision block as normal. Once the PI/PPI Decision Block pushes the new set of objectives to

the Forward Network, though, the Objective Decision block will see that the objective of Z2=0

was not achieved (it will remain 1 due to the output of G1). This will then directly cause the

Objective Decision block to raise the Conflict signal.

Figure 3.20: Non-blocking conflict
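
The distinction between the two scenarios can be summarized in a short sketch. The function name and the representation of each active backtracing path as a boolean (True when the path is blocked by a prior, opposite requirement) are assumptions made for illustration only.

```python
def classify_backtrace(paths_blocked):
    """Classify a backtrace from one flag per active backtracing path.

    paths_blocked: list of bools, True if that path hits a conflicting
    prior requirement and therefore propagates nothing.
    """
    if not paths_blocked:
        raise ValueError("a backtrace needs at least one active path")
    if not any(paths_blocked):
        return "no_conflict"
    if all(paths_blocked):
        # Nothing reaches the Backward Network Encoder: propagation failure.
        return "trace_blocking"
    # Some objective still reaches the encoder; the conflict only shows up
    # when the Forward Network fails to achieve the objective afterwards.
    return "non_blocking"
```

With a single blocked path, as for Z1 in Figure 3.19, the result is "trace_blocking"; with one blocked and one open path, as for Z2 in Figure 3.20, the result is "non_blocking".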

3.4.3 Backward Network Decoder

Due to the previously described need for the Backward Network to use a different value

encoding from the rest of the circuit, encoding and decoding modules are needed as wrappers

around the Backward Network. The Backward Network Decoder module acts as the interface

between the Objective Decision Block and the Backward Network. In addition to translating

objective value encoding between these two blocks, the decoder must also handle resetting the

priority encoding logic that is used to lock in values inside traces in the Backward Network. The

entire operation to assign a new objective takes 3 clock cycles to complete, implemented as a

state machine (controlled by the mode register) as shown in Figure 3.21:

Figure 3.21: State Transition Diagram for Backward Network Decoder

Operation in the Backward Network Decoder is triggered by a change in the obj_set signal from

the Objective Decision Block (a new objective value has been assigned to be backtraced). This

causes the decoder to move from idle into the Backward Network reset state. In this state, the

priority_reset signal is toggled, which causes priority encoder value locks within the Backward

Network to release. At the same time, all PI/PPI input values to the Backward Network are

cleared (value 2’b00) to flush out the network for a fresh value trace.

On the next clock cycle, the decoder moves into the assignment state. In this state, the decoder

makes an assignment to each PI/PPI Backward Network input bit. For the bit that corresponds to

the address in the new obj from the Objective Decision Block, that value is assigned into the

PI/PPI bus as a required objective. For all other bits, values are directly assigned from the

corresponding PO/PPO values output by the Forward Network. These values are not required

objectives, so their higher order “objective” bit is set to 0. This value assignment is detailed in

Figure 3.22:

Figure 3.22: Backward Network Decoder Value Assignment

Each address in the PI/PPI Backward Network input bus is compared to the address contained in

obj[13:4]. When these match, the associated objective value (obj[2]) is assigned into that PI/PPI

bus bit as a required objective (lower order bit is set to obj[2]; higher order bit is set to 1’b1).

For any other bits, the associated higher order bit on the PO/PPO output bus from the Forward

Network is assigned through into the lower order bits on the PI/PPI Backward Network input

bus. Note that because of the way values are encoded outside the Backward Network, and

because the Backward Network only deals with 0/1 (not X), the higher order value bit will

always represent the binary value of that 3-state value.
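
A minimal sketch of this assignment rule follows, assuming only the field positions given above (address in obj[13:4], value in obj[2]). The function names and the list-of-pairs bus representation are hypothetical.

```python
def decode_objective(obj):
    """Split a 14-bit objective word into (address, value)."""
    address = (obj >> 4) & 0x3FF   # obj[13:4]
    value = (obj >> 2) & 0x1       # obj[2]
    return address, value


def assign_backward_inputs(obj, fwd_high_bits):
    """Build the Backward Network input bus as (objective_bit, value_bit) pairs.

    fwd_high_bits holds the higher order bit of each Forward Network output.
    """
    address, value = decode_objective(obj)
    bus = []
    for index, high_bit in enumerate(fwd_high_bits):
        if index == address:
            bus.append((1, value))      # required objective
        else:
            bus.append((0, high_bit))   # plain value, not a required objective
    return bus
```

For example, an objective word with address 2 and value 1 marks only bus position 2 as required and copies the Forward Network values into every other position.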

Once all bits in the PI/PPI Backward Network input bus have been assigned, the trace_start

signal is toggled. This signal is output to the Backward Network Encoder, and notifies it that a

new backtrace has started, so the Backward Network outputs should be monitored for a value

change. After toggling the signal to the encoder, the decoder returns to an idle state.
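
The three-state sequence can be modeled as below. This is a sketch under assumptions: the state names, the class name and the exact cycle in which each toggle takes effect are not taken from the thesis Verilog.

```python
class DecoderFsm:
    """Sketch of the idle -> reset -> assign -> idle sequence (hypothetical names)."""

    IDLE, RESET, ASSIGN = "idle", "reset", "assign"

    def __init__(self):
        self.mode = self.IDLE
        self.last_obj_set = 0
        self.priority_reset = 0
        self.trace_start = 0

    def clock(self, obj_set):
        if self.mode == self.IDLE:
            # A change on obj_set means a new objective is waiting.
            if obj_set != self.last_obj_set:
                self.last_obj_set = obj_set
                self.mode = self.RESET
        elif self.mode == self.RESET:
            # Toggle priority_reset to release the priority encoder locks.
            self.priority_reset ^= 1
            self.mode = self.ASSIGN
        else:
            # Inputs are assigned in this state, then the encoder is notified.
            self.trace_start ^= 1
            self.mode = self.IDLE
        return self.mode
```

Clocking the model with obj_set changing once walks through reset and assign and returns to idle, with both priority_reset and trace_start toggled exactly once.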

3.4.4 Backward Network Encoder

The Backward Network Encoder operates on the opposite side of the Backward Network, taking

the output from the Backward Network, encoding it back into the 14-bit objective scheme of the

architecture, and passing the data on to the PI/PPI Decision Block. In addition to value

translation, the encoder must also be able to track the output values of the backward network to

detect when a backtrace operation has failed to produce results. In order to implement these

functions, the Backward Network Encoder is constructed via three states (tracked by the state

register), as illustrated in Figure 3.23:

Figure 3.23: State Transition Diagram of Backward Network Encoder

The Backward Network Encoder begins operation in the idle state. When a “Backtrace”

operation is started by the Backward Network Decoder, the trace_start signal is toggled. This

signal triggers the encoder to move to the “check” state. At the start of the next clock cycle, in

the check state, the current output of the Backward Network is compared against the last saved

output from the network (the encoder saves the values output from the Backward Network every

cycle that it is not in the “check” state). If the values differ, that indicates that the “Backtrace”

operation successfully produced results. If that is the case, the higher order output bits are then

checked to ensure that at least one is set to 1’b1 (indicating a required objective). Once that has

been verified, the encoder moves on to the “shiftout” state. If the output values from the

Backward Network have not changed, or there are no required objectives, that indicates a

“Backtrace” propagation failure. The propfail signal is set, which triggers the Objective

Decision Block, which in turn triggers the PI/PPI Decision Block to start a corrective

“Backtrack” operation.
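
The decision made in the “check” state can be sketched as follows. The function name, the pair representation of the network outputs and the return to idle after a failure are assumptions for illustration.

```python
def check_backtrace(current, saved):
    """Decide the outcome of the check state.

    current, saved: lists of (required_bit, value_bit) pairs read from the
    Backward Network outputs now and on the last saved cycle.
    Returns (next_state, propfail).
    """
    changed = current != saved
    has_required = any(required == 1 for required, _ in current)
    if changed and has_required:
        return "shiftout", 0
    # No change, or no required objective: backtrace propagation failure.
    return "idle", 1
```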

The “shiftout” operation is where communication of objective values happens between the

Backward Network Encoder and the PI/PPI Decision Block. Each objective value saved from

the Backward Network is sequentially shifted out to the PI/PPI Decision Block over the in signal

bus. Because the operation of the PI/PPI Decision Block to write a received objective value into

the block RAM takes multiple clock cycles, the triggering signals NReady and genobj are used to

communicate readiness in the operation. The structure of this communication is illustrated in

Figure 3.24:

Figure 3.24: Backward Network Encoder Shiftout Operation

When first transitioning to the “shiftout” state the NReady signal is set low, which communicates

to the PI/PPI Decision Block that a backtrace result will be ready. The PI/PPI Decision Block

responds by setting genobj high, indicating that it is ready to accept a new required objective

value on the in bus. This signal triggers the internal index in the encoder to start incrementing,

searching for the next required objective in the saved values. Once the next required objective

value is found, the value is encoded onto the in bus, and the NReady signal is again set to 0. This

triggers the PI/PPI Decision Block to read the new value on in and store it into the block RAM.

Once this operation is complete, the PI/PPI Decision Block again sets genobj, triggering the

Backward Network Encoder to start scanning for the next required objective.

When the index reaches the end of the PO/PPOs saved from the Backward Network, indicating

that there are no more required objectives, the encoder sets the in bus to all 1s, and sets NReady

low. This value on the in bus then causes the PI/PPI Decision Block to move on to start the next

“Imply” operation on the Forward Network.
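
The scan-and-terminate behavior of the shiftout state can be sketched with a generator, where each request for the next item stands in for the NReady / genobj handshake. The word layout reuses the address (bits 13:4) and value (bit 2) positions mentioned in section 3.4.3; leaving the remaining bits at 0 is an assumption, as are all names used here.

```python
ALL_ONES = 0x3FFF  # 14-bit terminator placed on the in bus


def encode_objective(address, value):
    """Pack an address and a value into a 14-bit word (other bits assumed 0)."""
    return ((address & 0x3FF) << 4) | ((value & 0x1) << 2)


def shiftout(saved):
    """Yield one encoded word per required objective, then the terminator.

    saved: list of (required_bit, value_bit) pairs saved from the
    Backward Network outputs.
    """
    for index, (required, value) in enumerate(saved):
        if required:
            yield encode_objective(index, value)
    yield ALL_ONES
```

For saved values [(0, 1), (1, 0), (1, 1)] the generator produces the words for positions 1 and 2 and then the all-ones word that starts the next “Imply” operation.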

One additional function provided by the Backward Network Encoder, because it controls the

NReady signal, is forwarding trigger requests on to the PI/PPI Decision Block. Specifically, this

relates to the newframe_ready signal, which comes from the Objective Decision Block. When it

is determined that the current frame is Done (all previous frame PPIs match current frame PPOs)

or in Conflict (value conflict between previous frame PPI and current frame PPO), the

corresponding signals are set from the Objective Decision Block to the PI/PPI Decision Block.

Once complete, the NReady signal still needs to be set low to trigger the PI/PPI Decision Block

to start an operation and examine those set values. To this end, the Objective Decision block

toggles the newframe_ready signal, which signals the Backward Network Encoder to set NReady low.

3.5 Translating Circuits into Forward/Backward Networks

In order to place a specific circuit under test in the ATPG architecture outlined here, the network

must first be translated from its basic form into a structure that is compatible to properly

interface with adjacent blocks in the ATPG architecture. To accomplish this in an automated

fashion, a translation program is implemented in C++ code to perform the mapping, which is

included in Appendix E. This section outlines the operation and flow of this utility, as shown in

Figure 3.25.

Figure 3.25: Network Translation Flow

3.5.1 Processing Input Data

The translation program is designed around reading a specific function-level format for networks

to be translated. In this format, each net in the circuit is functionally defined with one line of

code. The exception is output nets, which are defined by two lines: one with their

logical definition and the other defining them as an output net. Appendix A contains a sample

input circuit (along with translated circuits output) and Table 3.14 shows the general definition

of this format.

Input Code           Functional Description
INPUT(G0)            Defines G0 as an input to the circuit.
OUTPUT(G1)           Defines G1 as an output of the circuit.
G1 = not(G0)         Defines G1 as the inverse of G0.
G2 = and(G0,G1)      Defines G2 as a 2-input and of G0 and G1.
G3 = or(G0,G1,G2)    Defines G3 as a 3-input or of G0, G1 and G2.
G4 = xor(G2,G3)      Defines G4 as the exclusive or of G2 and G3.
G5 = dff(G4)         Defines a D-flip-flop with input G4 and output G5.

Table 3.14: Input Network Format
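
A reader for this format can be sketched in a few lines. The thesis utility is written in C++ (Appendix E); the Python below is only an illustration, and the function name, return layout and handling of “#” comment lines (as seen in Appendix A) are assumptions.

```python
import re

# "G2 = and(G0,G1)" style definitions and "INPUT(G0)" style declarations.
GATE_RE = re.compile(r"^(\w+)\s*=\s*(\w+)\s*\(([^)]*)\)$")
IO_RE = re.compile(r"^(INPUT|OUTPUT)\s*\(\s*(\w+)\s*\)$", re.IGNORECASE)


def parse_netlist(text):
    """Return (inputs, outputs, gates); gates maps name -> (function, fan-in list)."""
    inputs, outputs, gates = [], [], {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        io = IO_RE.match(line)
        if io:
            target = inputs if io.group(1).upper() == "INPUT" else outputs
            target.append(io.group(2))
            continue
        gate = GATE_RE.match(line)
        if gate is None:
            raise ValueError(f"unrecognized line: {raw!r}")
        name, func, args = gate.groups()
        fanin = [a.strip() for a in args.split(",") if a.strip()]
        gates[name] = (func.lower(), fanin)
    return inputs, outputs, gates
```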

When reading in circuit data, the program does some pre-processing of structures before storing

them in memory. This pre-processing consists of breaking arbitrary complex logic gates down

into constituent base components (AND, OR, NOT). By employing this method of only

modeling the circuit with the most basic components, the task of later translating these circuit

gates into the functional blocks that make up the Forward/Backward Networks is simplified to a

limited number of transforms. This also allows for full flexibility to support any arbitrary

complex gate that is employed in the circuit to be translated. Figure 3.26 provides a simple

example of how the pre-processing is handled in an internally consistent way.

Figure 3.26: Internally Consistent Pre-Processing
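
The decomposition step might look like the following sketch. The “_base” suffix matches the naming visible in the generated Verilog of Appendix A; the helper names used for the XOR expansion and the function name itself are assumptions.

```python
def decompose(name, func, fanin):
    """Rewrite one gate as a list of (name, function, fan-in) AND/OR/NOT gates."""
    func = func.lower()
    if func in ("and", "or", "not"):
        return [(name, func, list(fanin))]
    if func in ("nand", "nor"):
        # Base gate followed by an inverter that keeps the original name.
        base = name + "_base"
        return [(base, func[1:], list(fanin)), (name, "not", [base])]
    if func == "xor" and len(fanin) == 2:
        a, b = fanin
        # a xor b = (a and not b) or (not a and b); helper names are hypothetical.
        return [
            (name + "_na", "not", [a]),
            (name + "_nb", "not", [b]),
            (name + "_t0", "and", [a, name + "_nb"]),
            (name + "_t1", "and", [name + "_na", b]),
            (name, "or", [name + "_t0", name + "_t1"]),
        ]
    raise ValueError(f"unsupported gate function: {func}")
```

Because the final gate keeps the original net name, downstream gates need no renaming after decomposition.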

3.5.2 In-Memory Data Structure

Once the input circuit has been pre-processed, the resulting gates must be stored in memory to

maintain a model of the circuit. Each gate that is read in is modeled into a custom data structure.

This structure contains essential information about the gate being stored, such as name, function,

logical level in the circuit, other gates that are inputs to this one and other gates that are outputs

of this one. This data structure is shown in Figure 3.27.

Figure 3.27: Gate Data Structure

Multiple gates are strung together in linked lists, using gate-nodes as the linking elements in the

list. These lists are employed in many functions, including global lists of inputs, outputs, DFFs

and circuit logic levels. Also of note is the fact that a gate may have multiple gates fanning in to

it or out from it, so these attributes of a gate are also modeled in a linked fashion, as shown in

Figure 3.28.

Figure 3.28: Gate Fan-out List Structure

The main global data structure that ties together gates into the model used is the level list. This

is a two-dimensional linked list, which in its first dimension links together each logical level of the

circuit and in its second dimension links together the list of gates associated with a given logic

level. Inputs and DFF outputs (pseudo-primary inputs) are placed on level 0. Remaining levels

are populated by placing each gate on a level that is one higher than the highest level of its entire

set of fan in gates. This structure is illustrated in Figure 3.29.

Figure 3.29: Level List Structure
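
The level assignment rule can be sketched as below, using dictionaries and lists in place of the linked lists of the C++ implementation. Function names are hypothetical.

```python
def levelize(level0, gates):
    """Assign a logic level to every gate.

    level0: names of primary inputs and DFF outputs (level 0).
    gates: mapping of combinational gate name -> list of fan-in names.
    """
    level = {name: 0 for name in level0}
    pending = dict(gates)
    while pending:
        progressed = False
        for name, fanin in list(pending.items()):
            if all(f in level for f in fanin):
                # One higher than the highest level among the fan-in gates.
                level[name] = 1 + max(level[f] for f in fanin)
                del pending[name]
                progressed = True
        if not progressed:
            raise ValueError("combinational loop or undefined net")
    return level


def level_list(level):
    """Group names by level: index d holds the sorted names on level d."""
    depth = max(level.values())
    return [sorted(n for n, lv in level.items() if lv == d) for d in range(depth + 1)]
```

Applied to the s27 circuit of Appendix A (before decomposition), G14 and G12 land on level 1 and G10 and G17 on level 6.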

Other structured lists of gates include the input list, which links together all input records for the

circuit, the output list, which links together all gates that are also drivers of outputs, the DFF list,

which links together all DFF gates, and the gate list, which links together all combinational gates

in the circuit. Note that the gate list and DFF list are mutually exclusive, because DFFs are

translated into inputs/outputs in the final network output, unlike combinational gates that are

translated into Forward/Backward Network equation equivalents.

3.5.3 Writing Output Networks

The final output of the translation program is the Forward and Backward Network Verilog,

which then interface with the rest of the ATPG architecture. As the in-memory model of the

circuit only contains AND, OR and NOT gates, only three transforms are needed at this point to

map the circuit model into the Forward or Backward Network. To write out a network Verilog,

standard module headers are first written out, which include parameterized interface constructs

for primary inputs, pseudo-primary inputs, primary outputs, pseudo-primary outputs and state

bits for all gates. Table 3.15 shows an example of how the interface is defined.

module fwd_net (PI_1, PI_0, PPI_1, PPI_0, STATE_1, STATE_0,
PO_1, PO_0, PPO_1, PPO_0);
parameter n = 1;
parameter m = 4;
parameter l = 3;
parameter s = 20;
input [m-1:0] PI_1;
input [m-1:0] PI_0;
input [l-1:0] PPI_1;
input [l-1:0] PPI_0;
output [s-1:0] STATE_1;
output [s-1:0] STATE_0;
output [n-1:0] PO_1;
output [n-1:0] PO_0;
output [l-1:0] PPO_1;
output [l-1:0] PPO_0;

Table 3.15: Forward Network Module Interface Example

The parameters defined at the top of the module (n, m, l, s) are used to size the interface busses

based on the attributes of the circuit in memory. The parameter n represents the number of

primary outputs in the circuit, and thus sizes the PO_1 and PO_0 interface busses. The

parameter m represents the number of primary inputs and thus sizes the PI_1 and PI_0 interface

busses. The parameter l represents the number of D-flip-flops and thus sizes both the

PPO_1/PPO_0 and PPI_1/PPI_0 interface busses. The parameter s represents the total number

of gates in the circuit and thus sizes the STATE_1 and STATE_0 interface busses.
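
Writing the header from these four parameters can be sketched as follows. The port order and declaration style follow Table 3.15; the function name is hypothetical and the thesis utility itself is C++.

```python
def forward_header(n, m, l, s):
    """Return the Forward Network module header as Verilog text."""
    ports = ["PI_1", "PI_0", "PPI_1", "PPI_0", "STATE_1", "STATE_0",
             "PO_1", "PO_0", "PPO_1", "PPO_0"]
    lines = ["module fwd_net (" + ", ".join(ports) + ");"]
    for pname, value in (("n", n), ("m", m), ("l", l), ("s", s)):
        lines.append(f"parameter {pname} = {value};")
    # Each bus exists twice: one rail per bit of the 2-bit value encoding.
    buses = (("input", "m", "PI"), ("input", "l", "PPI"),
             ("output", "s", "STATE"), ("output", "n", "PO"),
             ("output", "l", "PPO"))
    for direction, width, bus in buses:
        for rail in ("1", "0"):
            lines.append(f"{direction} [{width}-1:0] {bus}_{rail};")
    return "\n".join(lines)
```

Calling forward_header(1, 4, 3, 20) yields the parameter values shown in Table 3.15.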

Once the module header information has been written out, the next phase is to handle assignment

of all bits in the input busses to internal signals. Since the input busses consist of both circuit

inputs (in the case of the Forward Network) and circuit DFFs, the program loops through all

defined inputs and then all defined DFFs to generate a complete internal assignment from the

module inputs.

The next step is to loop through all gates in the design, in level order. For the Forward Network,

levels are traversed in ascending order and for the Backward Network, in descending order.

Each gate modeled in memory is translated to its equivalent network equations based on the

transforms outlined in sections 3.3 and 3.4.
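
For the Forward Network, the three transforms can be sketched from the pattern visible in the generated Verilog of Table A.2 (AND: rail 1 is ANDed and rail 0 is ORed; OR: the reverse; NOT: the rails are swapped). The function name is hypothetical, and using the “_z1” / “_z0” suffixes for every operand is a simplification, since primary inputs in Table A.2 use “_1” / “_0”.

```python
def forward_equations(name, func, fanin):
    """Return the two Verilog wire lines for one AND/OR/NOT gate."""
    ones = [f"{f}_z1" for f in fanin]
    zeros = [f"{f}_z0" for f in fanin]
    if func == "and":
        z1, z0 = " & ".join(ones), " | ".join(zeros)
    elif func == "or":
        z1, z0 = " | ".join(ones), " & ".join(zeros)
    elif func == "not":
        z1, z0 = zeros[0], ones[0]   # swap the rails
    else:
        raise ValueError(f"unsupported gate function: {func}")
    return [f"wire {name}_z1 = {z1};", f"wire {name}_z0 = {z0};"]
```

The Backward Network transforms of section 3.4 are more involved and are not covered by this sketch.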

Once equations for each gate have been written out, the final phase is to assign internal network

bit values onto the module output busses. In the case of the Forward Network, all primary

outputs and DFF inputs (pseudo-primary outputs) also have a corresponding internal gate driving

them that defines their value. So, once equations are written out for all gates, values are

available for all POs and PPOs. By traversing the list of all outputs and all DFFs, corresponding

gate output bits are assigned into the module output busses.

4. Results

When setting up testing for the implementation of the architecture described here, it is important

to note that the architecture itself has no function without a test circuit being integrated in the

form of Forward and Backward Networks. To this end, the ISCAS89 benchmark circuits are

integrated as test circuits. This provides a wide range of size, structure and complexity which

will then produce a robust set of testing results.

The first property of interest in the architecture proposed here is the intent that it be implemented

on an FPGA. This means that to be consistent with that goal, the architecture must both

successfully synthesize with an FPGA library, and also fit within the resource constraints of the

FPGA. To test against this constraint, the Xilinx Virtex-6 board was selected, and synthesis was

completed using Mentor Graphics Precision RTL Synthesis 2012b.10_64-bit. Table 4.1

summarizes utilization of the Virtex-6 resources for the ISCAS89 circuits, once integrated into

the architecture.

The “s35932”, “s38417” and “s38584” benchmark circuits are not listed in the compiled results.

As described in section 3.2, the 14-bit encoding scheme for data in this system uses 10 bits for

address. Addresses in this context index the set of PI/PPIs for the circuit being tested. That then

places a limit of 1024 on the number of PIs plus sequential elements in a circuit to be tested. The

three benchmark circuits excluded all have more than 1024 flip-flops, which then overflows the

current addressing scheme, leading to unpredictable results.
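
The limit follows directly from the address width, as this small check illustrates. The function name is hypothetical; the s27 counts (4 inputs, 3 flip-flops) come from Appendix A.

```python
def fits_address_space(num_pi, num_ff, addr_bits=10):
    """True if every PI and PPI can be given a distinct address."""
    return num_pi + num_ff <= 2 ** addr_bits


# s27 has 4 primary inputs and 3 flip-flops, so it fits easily.
S27_FITS = fits_address_space(4, 3)
```

Any circuit with more than 1024 flip-flops fails this check regardless of its input count.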

It is important to note that while all resource utilization increases along with the relative size /

complexity of the benchmark circuit, LUT and CLB utilization represents the real constraint on

size, as DFF usage is much lower. Aside from DFF usage for the RF within the Objective

Decision Block, and DFF buffering of busses between modules, the size of the non-network

portion of the architecture is largely static.

Circuit LUTs Slices DFF/Latches

c17 1.25% 1.25% 0.30%

c432 2.97% 2.97% 0.62%

c499 4.05% 4.05% 0.66%

c880 4.02% 4.02% 0.84%

c1355 7.12% 7.12% 0.96%

c1908 6.44% 6.44% 1.00%

c2670 13.96% 13.96% 2.72%

c3540 9.42% 9.42% 1.11%

c5315 14.05% 14.05% 2.37%

c6288 24.52% 24.52% 2.06%

c7552 24.55% 24.55% 3.19%

s27 1.35% 1.35% 0.35%

s208 2.20% 2.21% 0.54%

s298 2.47% 2.47% 0.58%

s344 2.88% 2.88% 0.69%

s349 2.93% 2.94% 0.69%

s386 2.37% 2.37% 0.46%

s420 3.46% 3.46% 0.82%

s444 3.14% 3.14% 0.74%

s510 3.20% 3.20% 0.60%

s526 3.06% 3.07% 0.71%

s526n 3.09% 3.09% 0.71%

s641 4.34% 4.34% 1.02%

s713 4.59% 4.59% 1.05%

s820 3.24% 3.24% 0.55%

s832 3.17% 3.17% 0.55%

s838 6.02% 6.02% 1.35%

s953 5.77% 5.77% 1.09%

s1196 3.30% 3.31% 0.72%

s1238 3.49% 3.50% 0.72%

s1423 8.00% 8.00% 2.08%

s1488 3.66% 3.66% 0.49%

s1494 3.75% 3.75% 0.49%

s5378 17.92% 17.92% 4.80%

s9234 22.76% 22.76% 4.46%

s13207 62.20% 62.21% 13.20%

s15850 59.57% 59.57% 12.46%

Table 4.1: Virtex-6 Resource Utilization for ISCAS89 Benchmarks

This then points to the generated Forward / Backward Networks as the core contributor to

utilization, which directly relates to the size and complexity of the input benchmark circuit. The

relationship between benchmark circuit size and utilization can be seen plotted in Figure 4.1.

Figure 4.1: Virtex-6 Utilization vs. ISCAS89 Benchmark Size

The data shows a roughly linear relationship between benchmark circuit size and FPGA

utilization. It is also important to note that the variance from the relationship increases with the

size of the circuit. This behavior is expected, as larger circuits have more nodes where variations

in net fanout can occur. Higher net fanouts result in more required circuits for the Backward

Network, as discussed in section 3.4.1, hence the more nets there are in the benchmark circuit,

the more potential variation in resulting utilization of the translated circuit.

The second point of interest for the architecture is simulating the resulting benchmark circuits.

This determines the number of clock cycles required for operation to complete in each of the

benchmark circuits, as well as a check that the intended behavior of the architecture is followed.

For this purpose, the ISCAS89 benchmark circuits that were synthesized were also simulated in

Mentor Graphics QuestaSim-64 6.5f r2010.06. This data is then combined with the maximum

achievable clock frequency reported in Precision during synthesis to obtain the total runtime on

the FPGA. Finally, as a point of comparison, the Formal software ATPG solver tool was run for

each of the same benchmark circuits. This data is summarized in Table 4.2.

Circuit | ATPG Emulation: Freq (MHz) | Sim Cycles | Gen (s) | Compile (s) | Synthesis (s) | Sim Time (s) | Total Time (s) | Software Solver: Rules Gen (s) | Model Gen (s) | Solve (s) | Total Time (s)

s298 78.046 483 0.036 2 56.5 6.19E-06 58.5 0.001 0.002 6.028 6.031

s344 75.884 56 0.038 3 59.6 7.38E-07 62.6 0.001 0.002 0.001 0.004

s349 66.702 56 0.033 3 57 8.40E-07 60.0 0.001 0.002 0.001 0.004

s386 86.957 56 0.034 2 53.4 6.44E-07 55.4 0.001 0.002 0.012 0.015

s444 67.668 207 0.04 4 68 3.06E-06 72.0 0.001 0.002 3.192 3.195

s510 71.922 36 0.045 3 58.4 5.01E-07 61.4 0.001 0.002 0.028 0.031

s526 83.977 483 0.042 4 69.2 5.75E-06 73.2 0.001 0.002 8.964 8.967

s526n 86.505 483 0.047 4 64 5.58E-06 68.0 0.001 0.002 9.02 9.023

s641 38.728 939 0.046 4 83.9 2.42E-05 87.9 0.001 0.002 0.028 0.031

s713 41.416 12688 0.146 5 90 3.06E-04 95.1 0.001 0.002 0.04 0.043

s820 59.144 84 0.057 4 61.8 1.42E-06 65.9 0.001 0.002 0.001 0.004

s832 56.893 84 0.057 4 62.8 1.48E-06 66.9 0.001 0.002 0.001 0.004

s953 56.825 352 0.059 6 97.7 6.19E-06 103.8 0.001 0.002 14.96 14.967

s1196 49.039 100 0.069 6 83.4 2.04E-06 89.5 0.001 0.002 0.004 0.007

s1238 58.306 104 0.072 6 85.6 1.78E-06 91.7 0.001 0.002 0.004 0.007

s1423 32.951 77 0.092 15 166.8 2.34E-06 181.9 0.001 0.002 0.044 0.047

s1488 59.963 64 0.09 7 65 1.07E-06 72.1 0.001 0.002 0.004 0.007

s1494 57.478 64 0.085 7 71.6 1.11E-06 78.7 0.001 0.002 0.004 0.007

s5378 45.116 332 0.668 57 597.1 7.36E-06 654.8 0.009 0.002 0.016 0.027

s9234 29.462 128 1.655 81 784.2 4.34E-06 866.9 0.008 0.002 0.012 0.022

s13207 17.493 540 4.139 605 9024.5 3.09E-05 9633.6 0.013 0.002 0.012 0.027

s15850 16.876 461 5.196 416 6057.5 2.73E-05 6478.7 0.014 0.002 0.092 0.108

Table 4.2: Runtime Comparison for ISCAS89 Benchmarks

The runtime data is broken down among the required processes from start to finish for each of

the solvers. In the case of the ATPG emulation architecture, this starts out with the time it takes

to generate a set of Forward/Backward Network Verilog code from the input benchmark circuit.

The complete Verilog must then be compiled and synthesized before it can finally be loaded onto

an FPGA and run to complete the solve operation.
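
The ATPG emulation columns of Table 4.2 can be reproduced with simple arithmetic: the solve (“Sim Time”) figure is the simulated cycle count divided by the reported clock frequency, and the total adds the generation, compile and synthesis times. The function name below is hypothetical.

```python
def fpga_times(freq_mhz, sim_cycles, gen_s, compile_s, synth_s):
    """Return (sim_time_s, total_time_s) for one benchmark row."""
    sim_s = sim_cycles / (freq_mhz * 1e6)
    total_s = gen_s + compile_s + synth_s + sim_s
    return sim_s, total_s


# s298 row of Table 4.2: 78.046 MHz, 483 cycles, 0.036 s, 2 s, 56.5 s.
S298_SIM, S298_TOTAL = fpga_times(78.046, 483, 0.036, 2, 56.5)
```

For s298 this gives a solve time of about 6.19E-06 s and a total of about 58.5 s, matching the table to its printed precision.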

In the case of the Formal software solver, the first step is to generate a CTL rule statement based

on the circuit to match what is being solved by the FPGA (in this case, the ANDing of all POs).

This rule generation is automated for this test case with the TCL script included in Appendix G.

This rule is then merged with the benchmark circuit, and the combined code is translated into a

model that Formal understands. That model is read into Formal and the solve operation is then

run. Note that the Formal solve operation has an internal time limit before giving up early. The

solve times highlighted in Table 4.2 indicate runs where Formal gave up early. For that reason,

those benchmarks are not included in further analysis, as no direct comparison could be drawn.

There are two interesting sets of data to compare within the run time results. The first is the raw

solve time between the FPGA and the software-based solver, as shown in Figure 4.2.

Figure 4.2: FPGA vs. Software Solve Time for ISCAS89 Benchmarks

The solve times here were plotted using a logarithmic scale due to the large difference between

the two. It is seen, then, that on average the FPGA architecture solves the ATPG problem 3

orders of magnitude faster than the software-based solver (average of 6991x faster, with a

minimum of 131x for s713 and a maximum of 55939x for s510). In addition to this result,

though, the complete process run time must also be considered, as shown in Figure 4.3.

Figure 4.3: FPGA vs. Software Total Time for ISCAS89 Benchmarks

When the complete process time is considered, the results are the complete reverse of the solve

time alone, with the software-based solver being 4 orders of magnitude faster than the FPGA

based solution (average of 36309x faster, with a minimum of 2048x for s510 and a maximum of

356801x for s13207). This discrepancy shows that the FPGA based solution still has limitations

inherited from pre-processing required before the FPGA can actually solve a problem. Compile

and synthesis time dwarf all other run time considerations with the FPGA solution, and present

the largest barrier between the FPGA and software solutions.
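
The quoted ratios can be checked against the rows of Table 4.2. Because the table values are rounded, ratios recomputed from them differ slightly from the figures quoted in the text. The function names are hypothetical.

```python
import math


def ratio(slower_s, faster_s):
    """How many times longer the slower time is than the faster one."""
    return slower_s / faster_s


def orders_of_magnitude(factor):
    """Whole orders of magnitude contained in a speed-up factor."""
    return math.floor(math.log10(factor))


# s713 solve time: software 0.04 s vs. FPGA 3.06E-04 s.
S713_SOLVE_SPEEDUP = ratio(0.04, 3.06e-4)
# s13207 total time: FPGA flow 9633.6 s vs. software 0.027 s.
S13207_TOTAL_SLOWDOWN = ratio(9633.6, 0.027)
```

The averages of 6991x and 36309x quoted above correspond to 3 and 4 whole orders of magnitude respectively.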

5. Conclusion & Future Work

Circuits are continually increasing in size and complexity, and this growth increases exposure to

subtle bugs in circuit function. This leads to increased reliance on methods of formal verification

to catch design flaws, and provide assurance of function. Formal verification methods also

suffer issues such as “state explosion” and increasing runtime with larger circuits. To keep up,

formal verification techniques continue to evolve, as described in section 1.2. This has led up

to the current use of ATPG-based methods for formal verification.

As circuit sizes approach the limits of even ATPG-based method feasibility, further solutions are

required. A method has been presented here for implementing an ATPG-based algorithm for

formal verification in reconfigurable hardware (FPGA). This implementation has been shown to

have a linear relationship between the size of the circuit being verified and ultimate FPGA

resource utilization. This implies a reasonable bound on the size of the implementation, as

opposed to an exponential utilization explosion as circuit size increases.

One limitation encountered that prevented simulation with the three largest benchmark circuits

was the limit of 1024 PI/flops for a circuit under test, due to the 10-bit addressing scheme used in

the current implementation. With larger FPGAs to accommodate larger circuits for testing, this

limit could be increased. Increasing the bit width of the data words across the emulation

implementation to support a larger address size is mostly trivial. The only portion that would

require more re-work is the interface with the block RAM, as the address / data bit allocation

configuration would need to change to a different implementation that supported the new desired

size for the address.

This method has been shown to be an average of 3 orders of magnitude faster than a similar

software-based approach, based on the time for solving a given ATPG problem. At the same

time, though, total runtime for the FPGA emulation based implementation is significantly limited

by the parts of its process still in software (mainly compilation and synthesis). One future

enhancement that could be made to improve this limitation would be to split the property

monitor portion of the circuit under test into a separate module. Currently the property monitor

for a given CTL rule is integrated as part of the Forward and Backward Networks, and as such

the whole set of networks must be re-compiled and re-synthesized for each new property to be

tested. If the property monitor portion were to be separated out, then only that relatively small

portion of the total circuit would need to be re-compiled and re-synthesized for each iteration of

different properties on the same circuit. This would reduce the impact of high compile and

synthesis time overhead, and make FPGA based emulation a more attractive substitute for

software based solvers, with that benefit being directly proportional to circuit size.

Appendix A: Example Input Circuit and Network Translations

The code used to generate Forward and Backward Networks for the architecture described here

is designed to accept a specific input format. The specific constructs used in this format were

described in section 3.5.1. Those constructs can then be applied to create an input circuit for

translation, as exemplified in Table A.1, which is the ISCAS89 circuit “s27”.

# 4 inputs

# 1 outputs

# 3 D-type flipflops

# 2 inverters

# 8 gates (1 ANDs + 1 NANDs + 2 ORs + 4 NORs)

INPUT(G0)

INPUT(G1)

INPUT(G2)

INPUT(G3)

OUTPUT(G17)

G5 = DFF(G10)

G6 = DFF(G11)

G7 = DFF(G13)

G14 = NOT(G0)

G17 = NOT(G11)

G8 = AND(G14, G6)

G15 = OR(G12, G8)

G16 = OR(G3, G8)

G9 = NAND(G16, G15)

G10 = NOR(G14, G11)

G11 = NOR(G5, G9)

G12 = NOR(G1, G7)

G13 = NOR(G2, G12)

Table A.1: Benchmark Code for s27 Circuit

This input circuit code is read into the in-memory model of the translation code, which then

generates the networks. The first network to be generated is the Forward Network, which is a

direct translation of the circuit, with the DFFs being mapped to PPIs/PPOs and the functional bit

encoding being changed to a 2-bit representation (to model 3 logic values; 0, 1 and X). Given

the direct nature of this translation, each “wire” line defining a gate in the forward network is

directly linked to (and has the same base name as) a gate from the input circuit, in a one-to-one

relationship. Table A.2 shows the generated forward network code for “s27”.

module fwd_net (PI_1, PI_0, PPI_1, PPI_0, STATE_1, STATE_0, PO_1, PO_0, PPO_1,

PPO_0);

parameter n = 1;

parameter m = 4;

parameter l = 3;

parameter s = 20;

input [m-1:0] PI_1;

input [m-1:0] PI_0;

wire G3_1, G2_1, G1_1, G0_1, G3_0, G2_0, G1_0, G0_0, G7_z1, G6_z1, G5_z1, G7_z0,

G6_z0, G5_z0; assign {G3_1, G2_1, G1_1, G0_1} = PI_1;

assign {G3_0, G2_0, G1_0, G0_0} = PI_0;

assign {G7_z1, G6_z1, G5_z1} = PPI_1;

assign {G7_z0, G6_z0, G5_z0} = PPI_0;

wire G14_z1 = G0_0;

wire G14_z0 = G0_1;

wire G12_base_z1 = G7_z1 | G1_1;

wire G12_base_z0 = G7_z0 & G1_0;

wire G12_z1 = G12_base_z0;

wire G8_z1 = G6_z1 & G14_z1;

wire G8_z0 = G6_z0 | G14_z0;

wire G16_z1 = G8_z1 | G3_1;

wire G16_z0 = G8_z0 & G3_0;

wire G15_z1 = G8_z1 | G12_z1;

wire G15_z0 = G8_z0 & G12_z0;

wire G13_base_z1 = G12_z1 | G2_1;

wire G13_base_z0 = G12_z0 & G2_0;

wire G9_base_z1 = G15_z1 & G16_z1;

wire G9_base_z0 = G15_z0 | G16_z0;

wire G17_z1 = G11_z0;

wire G17_z0 = G11_z1;

assign STATE_1 = {G3_1, G2_1, G1_1, G0_1, G7_z1, G6_z1, G5_z1, G14_z1, G12_base_z1,

G12_z1, G8_z1, G16_z1, G15_z1, G13_base_z1, G13_z1, G9_base_z1, G9_z1,

G11_base_z1, G11_z1, G10_base_z1, G17_z1, G10_z1};

assign STATE_0 = {G3_0, G2_0, G1_0, G0_0, G7_z0, G6_z0, G5_z0, G14_z0, G12_base_z0,

G12_z0, G8_z0, G16_z0, G15_z0, G13_base_z0, G13_z0, G9_base_z0, G9_z0,

G11_base_z0, G11_z0, G10_base_z0, G17_z0, G10_z0};

assign PO_1 = {G17_z1};

assign PO_0 = {G17_z0};

assign PPO_1 = {G13_z1, G11_z1, G10_z1};

assign PPO_0 = {G13_z0, G11_z0, G10_z0};

endmodule

Table A.2: Forward Network Verilog for s27 Benchmark Circuit
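The 2-bit encoding used in Table A.2 can be modeled in a few lines. The sketch below is a software illustration, not part of the generated networks. It assumes each value is a pair (z1, z0) with X = 2'b11 and 1 = 2'b10 as stated in Appendix B; the encoding of 0 as 2'b01 is inferred from the listings. The function names are hypothetical.

```python
# Values are (z1, z0) bit pairs.
ONE, ZERO, X = (1, 0), (0, 1), (1, 1)

def and3(a, b):
    # Mirrors: G8_z1 = G6_z1 & G14_z1;  G8_z0 = G6_z0 | G14_z0
    return (a[0] & b[0], a[1] | b[1])

def or3(a, b):
    # Mirrors: G16_z1 = G8_z1 | G3_1;  G16_z0 = G8_z0 & G3_0
    return (a[0] | b[0], a[1] & b[1])

def not3(a):
    # Mirrors: G14_z1 = G0_0;  G14_z0 = G0_1 (the two bits are swapped)
    return (a[1], a[0])

def nor3(a, b):
    # Inverting gates appear as a "base" gate followed by an inverter.
    return not3(or3(a, b))

controlled = and3(X, ZERO)   # a 0 on one AND input decides the output
undecided = and3(X, ONE)     # otherwise the unknown propagates
```

With this encoding a controlling value on any input determines the gate output even when the other input is still X, which is what allows partial assignments to be implied through the Forward Network.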

The Backward Network is more complex in its relationship back to the input circuit, since each input gate maps to multiple Backward Network gates, and special circuitry to handle fanout conditions needs to be inserted. The same base names from the input circuit are still used for the related gates in the Backward Network, though many postfixes are used to handle the one-to-many mapping. Table A.3 shows the generated Backward Network code for “s27”.

module back_net (priority_reset, PO_1, PO_0, PPO_1, PPO_0, STATE_1, STATE_0, PI_1,

PI_0, PPI_1, PPI_0);

parameter n = 1;

parameter m = 4;

parameter l = 3;

parameter s = 20;

input priority_reset;

input [n-1:0] PO_1;

input [n-1:0] PO_0;

input [l-1:0] PPO_1;

input [l-1:0] PPO_0;

input [s-1:0] STATE_1;

input [s-1:0] STATE_0;

output [m-1:0] PI_1;

output [m-1:0] PI_0;

output [l-1:0] PPI_1;

output [l-1:0] PPI_0;

wire G17_zo1, G17_zo0, G3_1, G2_1, G1_1, G0_1, G7_z1, G6_z1, G5_z1, G14_z1,

G12_base_z1, G12_z1, G8_z1, G16_z1, G15_z1, G13_base_z1, G13_z1, G9_base_z1,

G9_z1, G11_base_z1, G11_z1, G10_base_z1, G17_z1, G10_z1, G3_0, G2_0, G1_0,

G0_0, G7_z0, G6_z0, G5_z0, G14_z0, G12_base_z0, G12_z0, G8_z0, G16_z0,

G15_z0, G13_base_z0, G13_z0, G9_base_z0, G9_z0, G11_base_z0, G11_z0,

G10_base_z0, G17_z0, G10_z0, G13_zo1, G11_zo1, G10_zo1, G13_zo0, G11_zo0,

G10_zo0;

assign {G17_zo1} = PO_1;

assign {G17_zo0} = PO_0;

assign {G13_zo1, G11_zo1, G10_zo1} = PPO_1;

assign {G3_1, G2_1, G1_1, G0_1, G7_z1, G6_z1, G5_z1, G14_z1, G12_base_z1, G12_z1,

G8_z1, G16_z1, G15_z1, G13_base_z1, G13_z1, G9_base_z1, G9_z1, G11_base_z1,

G11_z1, G10_base_z1, G17_z1, G10_z1} = STATE_1;

assign {G3_0, G2_0, G1_0, G0_0, G7_z0, G6_z0, G5_z0, G14_z0, G12_base_z0, G12_z0,

G8_z0, G16_z0, G15_z0, G13_base_z0, G13_z0, G9_base_z0, G9_z0, G11_base_z0,

G11_z0, G10_base_z0, G17_z0, G10_z0} = STATE_0;

wire G17_zo1_1 = G17_zo1;

wire G17_zo0_1 = G17_zo0;

wire G10_base_zo1 = G10_zo1 & G10_base_z1 & G10_base_z0;

wire G10_base_zo0 = ~G10_zo0;

wire G11_zo1_1 = G10_base_zo1 & G11_z1 & G11_z0;

wire G11_zo0_1 = G10_base_zo0;

wire G14_zo1 = G10_base_zo1 & ~G10_base_zo0 & G14_z1 & G14_z0;

wire G14_zo0 = G10_base_zo0;

reg G17_priority_0;

reg G17_priority_1;

reg G17_priority_last_reset;

always @(priority_reset, G17_zo1, G17_priority_1, G17_priority_0, G17_zo1_1) begin

if (G17_priority_last_reset != priority_reset) begin

G17_priority_0 = 1'b0;

G17_priority_last_reset = priority_reset;

else begin

G17_priority_0 = G17_zo1 & ~G17_priority_1;

G17_priority_1 = ~G17_priority_0 & G17_zo1_1;

wire G17_zo1_2 = G17_priority_0 | G17_priority_1;

wire G17_zo0_2 = (G17_priority_0 & G17_zo0) | (G17_priority_1 & G17_zo0_1);

wire G11_zo1_2 = G17_zo1_2 & G11_z1 & G11_z0;

wire G11_zo0_2 = ~G17_zo0_2;

reg G11_priority_0;

reg G11_priority_1;

reg G11_priority_2;

always @(priority_reset, G11_zo1, G11_priority_1, G11_priority_2, G11_priority_0,

G11_zo1_1, G11_priority_2, G11_priority_0, G11_priority_1, G11_zo1_2) begin

if (G11_priority_last_reset != priority_reset) begin

else begin

G11_priority_0 = G11_zo1 & ~G11_priority_1 & ~G11_priority_2;

G11_priority_1 = ~G11_priority_0 & G11_zo1_1 & ~G11_priority_2;

G11_priority_2 = ~G11_priority_0 & ~G11_priority_1 & G11_zo1_2;

wire G11_zo1_3 = G11_priority_0 | G11_priority_1 | G11_priority_2;

wire G11_zo0_3 = (G11_priority_0 & G11_zo0) | (G11_priority_1 & G11_zo0_1) |

(G11_priority_2 & G11_zo0_2);

wire G11_base_zo1 = G11_zo1_3 & G11_base_z1 & G11_base_z0;

wire G11_base_zo0 = ~G11_zo0_3;

wire G9_zo1 = G11_base_zo1 & G9_z1 & G9_z0;

wire G5_o1_0 = G11_base_zo1 & ~G11_base_zo0 & G5_z1 & G5_z0;

wire G5_o0_0 = G11_base_zo0;

wire G15_zo1 = G9_base_zo1 & G15_z1 & G15_z0;

wire G16_zo1 = G9_base_zo1 & G9_base_zo0 & G16_z1 & G16_z0;

wire G8_zo1 = G16_zo1 & G8_z1 & G8_z0;

wire G8_zo0 = G16_zo0;

wire G3_o1_0 = G16_zo1 & ~G16_zo0 & G3_1 & G3_0;

wire G3_o0_0 = G16_zo0;

wire G8_zo1_1 = G15_zo1 & G8_z1 & G8_z0;

wire G8_zo0_1 = G15_zo0;

wire G12_zo1 = G15_zo1 & ~G15_zo0 & G12_z1 & G12_z0;

wire G12_zo0 = G15_zo0;

wire G2_o1_0 = G13_base_zo1 & ~G13_base_zo0 & G2_1 & G2_0;

reg G12_priority_0;

reg G12_priority_1;

else begin

wire G12_base_zo1 = G12_zo1_2 & G12_base_z1 & G12_base_z0;

wire G12_base_zo0 = ~G12_zo0_2;

reg G8_priority_0;

reg G8_priority_1;

else begin

wire G6_o1_0 = G8_zo1_2 & G6_z1 & G6_z0;

wire G6_o0_0 = G8_zo0_2;

wire G14_zo1_1 = G8_zo1_2 & G8_zo0_2 & G14_z1 & G14_z0;

wire G14_zo0_1 = G8_zo0_2;

reg G14_priority_0;

reg G14_priority_1;

else begin

wire G0_o1_0 = G14_zo1_2 & G0_1 & G0_0;

wire G0_o0_0 = ~G14_zo0_2;

wire G7_o1_0 = G12_base_zo1 & G7_z1 & G7_z0;

wire G1_o1_0 = G12_base_zo1 & ~G12_base_zo0 & G1_1 & G1_0;

wire G3_o1 = G3_o1_0;

wire G3_o0 = G3_o0_0;

wire G2_o1 = G2_o1_0;

wire G2_o0 = G2_o0_0;

wire G1_o1 = G1_o1_0;

wire G1_o0 = G1_o0_0;

wire G0_o1 = G0_o1_0;

wire G0_o0 = G0_o0_0;

wire G7_zo1 = G7_o1_0;

assign PI_1 = {G3_o1, G2_o1, G1_o1, G0_o1};

assign PI_0 = {G3_o0, G2_o0, G1_o0, G0_o0};

assign PPI_1 = {G7_zo1, G6_zo1, G5_zo1};

assign PPI_0 = {G7_zo0, G6_zo0, G5_zo0};

endmodule

Table A.3: Backward Network Verilog for s27 Benchmark Circuit

One piece of repeated code in the Backward Network that is important to note is the priority encoder logic, which handles logical fanout when the network is translated into the reverse direction. Table A.4 shows an excerpt from the Backward Network code that implements the priority encoder.

wire G11_zo1_2 = G17_zo1_2 & G11_z1 & G11_z0;

wire G11_zo0_2 = ~G17_zo0_2;

reg G11_priority_0;

reg G11_priority_1;

reg G11_priority_2;

always @(priority_reset, G11_zo1, G11_priority_1, G11_priority_2, G11_priority_0,

G11_zo1_1, G11_priority_2, G11_priority_0, G11_priority_1, G11_zo1_2) begin

if (G11_priority_last_reset != priority_reset) begin

else begin

G11_priority_0 = G11_zo1 & ~G11_priority_1 & ~G11_priority_2;

G11_priority_1 = ~G11_priority_0 & G11_zo1_1 & ~G11_priority_2;

G11_priority_2 = ~G11_priority_0 & ~G11_priority_1 & G11_zo1_2;

wire G11_zo1_3 = G11_priority_0 | G11_priority_1 | G11_priority_2;

wire G11_zo0_3 = (G11_priority_0 & G11_zo0) | (G11_priority_1 & G11_zo0_1) |

(G11_priority_2 & G11_zo0_2);

...

Table A.4: Backward Network Priority Encoder Verilog Example

Each time the translation code processes a gate input while generating the backward network, the

name of the driving cell is recorded. If another instance of a gate input being sourced by the

same driver cell is encountered, the next incremental post-fix is selected to reference that version

of the driver cell output in the backward model.
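The bookkeeping described above can be sketched as a small counter per driver cell. This is an illustrative Python model, not the translation code itself. The naming pattern (no postfix for the first sink, then _1, _2, and so on, with the merged signal taking the next index) is read off Table A.3, for example G11_zo1, G11_zo1_1 and G11_zo1_2 merging into G11_zo1_3; the class and method names are hypothetical.

```python
from collections import defaultdict

class FanoutNamer:
    """Hypothetical helper: hands out one signal-name pair per sink of a driver."""

    def __init__(self):
        self._count = defaultdict(int)  # driver name -> versions issued so far

    def next_version(self, driver):
        k = self._count[driver]
        self._count[driver] += 1
        suffix = "" if k == 0 else f"_{k}"
        return (f"{driver}_zo1{suffix}", f"{driver}_zo0{suffix}")

    def merged(self, driver):
        # With n > 1 versions the merged pair takes index n; a driver
        # with a single sink needs no encoder and keeps the plain name.
        n = self._count[driver]
        suffix = "" if n <= 1 else f"_{n}"
        return (f"{driver}_zo1{suffix}", f"{driver}_zo0{suffix}")

namer = FanoutNamer()
versions = [namer.next_version("G11")[0] for _ in range(3)]
merged_zo1, merged_zo0 = namer.merged("G11")
```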

Once processing arrives at the driving gate that has multiple sinks, and thus multiple versions

created of its output, those multiple signals need to be resolved into a single signal to continue

propagation through the Backward Network. So, at this point, a priority encoder is instantiated.

This is modeled as a block that triggers off any change in the “required objective” bits for any of

the versions of the driver cell output signal (or the global priority_reset signal used for clearing

the Backward Network). Whenever the first change occurs in those signals, the priority bits in

the encoder lock in, preventing any changes until a priority reset occurs.

The final portion of the priority encoder occurs outside of the detect/lock code block. In the

following two “wire” statements, the merged driver cell signal is defined. The “zo1” (required

objective) bit is defined by ORing together all the priority signals. Since one of these will lock

in with a value of 1 once the first one arrives, this will then cause the final output to also lock in

with a value of 1. The “zo0” (objective value) bit is defined by ANDing together each objective

value with its priority bit, and then ORing them all together. Since the AND operation will act as

a pass-through for the objective value only when the priority input is 1, only the term with the

locked-in priority bit will be passed through (all other terms will always be 0). These are then all

ORed together, effectively propagating the single term representing the locked-in objective from

the encoder into the final merged objective value bit.
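The lock-and-merge behavior described above can be summarized in a short behavioral model. The Python sketch below follows the prose description rather than the exact event ordering of the Verilog block; the class name is hypothetical, and resolving simultaneous arrivals in favor of the lowest index is an assumption based on the statement order in Table A.4.

```python
class PriorityEncoder:
    """Behavioral model of the lock-and-merge logic (illustrative only)."""

    def __init__(self, n):
        self.priority = [0] * n  # one lock bit per version of the signal

    def reset(self):
        # Corresponds to a toggle of priority_reset: unlock all bits.
        self.priority = [0] * len(self.priority)

    def update(self, zo1, zo0):
        """zo1[i] / zo0[i]: required-objective and value bits of version i."""
        if not any(self.priority):
            # Lock onto the first version that carries an objective.
            for i, required in enumerate(zo1):
                if required:
                    self.priority[i] = 1
                    break
        # zo1: OR of all priority bits.
        merged_zo1 = int(any(self.priority))
        # zo0: each value ANDed with its priority bit, then ORed together.
        merged_zo0 = int(any(p & v for p, v in zip(self.priority, zo0)))
        return merged_zo1, merged_zo0

enc = PriorityEncoder(3)
first = enc.update([0, 1, 0], [0, 1, 0])   # version 1 arrives and locks in
second = enc.update([1, 1, 0], [0, 1, 0])  # later arrival on version 0 is ignored
```

Only the locked-in version contributes to the merged value, so an objective arriving later on another fanout branch cannot change the result until the encoder is reset.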

Appendix B: Example DONE Simulation for c17 Benchmark Circuit

The following results are taken from simulation of the “c17” ISCAS85 benchmark circuit using Mentor Graphics QuestaSim-64 6.5f r2010.06. For reference, the logical structure of “c17” is shown in Figure B.1. Note that, as a c* class benchmark circuit, it contains no sequential elements and thus has no PPIs/PPOs. Also note that the model of the circuit used in the verification architecture uses separate AND + NOT structures in place of the NANDs defined in the base circuit. For the sake of diagram simplicity, these remain abstracted as single NAND gates.

Figure B.1: c17 Benchmark Circuit Structure

One final note about the structure of the circuit to keep in mind is that in the case of this

simulation, a property monitor is used which ANDs together all POs, effectively translating the

test of line_k=1 to testing if all POs can be 1 at the same time. Thus, the structure that is being

simulated is the one shown in Figure B.2.

Figure B.2: c17 Simulation Circuit Structure

To begin simulation of the circuit, the “clk” input is defined with a 100ps period (first rising edge

arriving at +50ps) and the “global_reset” signal is pulsed high for 5ps to initialize the circuit:

force -freeze sim/:top:clk 0 0, 1 {50 ps} -r 100

force -freeze sim/:top:global_reset 1 0 -cancel 5

Table B.1: Circuit c17 Simulation Input Stimulus

When global_reset goes high, the first operation in the circuit is executed, with all modules

resetting to their initial states.

# fourcounter | reseting the frame counter

# statecheck | reset

# obj-dec | RESET

# back-bec | RESET

# back-enc | RESET

# fourcounter | global reset

# PPI Decision Block / Reset of PPI Decision Block

# memram | ADDRESS=0=0000000000

# mycontrol | executing state 0

Table B.2: Circuit c17 Initial Reset

As part of this reset, the values sent from the PI/PPI Decision Block to the Forward Network are

cleared to all Xs (2’b11). Thus in the first clock cycle, the Objective Decision Block sees that it

is in frame_k with line_k=X. Thus, an objective of line_k=1 is set, and pushed to the Backward

Network Decoder.

# mycontrol | in state 0

# obj-dec | line_k = X

# obj-dec | push obj: 00000000000100

Table B.3: Circuit c17 Simulation Cycle 1

The Backward Network Decoder receives this objective value and begins the process of a

“Backtrace” on the Backward Network. This involves two cycles of operation. In the first, the

Backward Network is cleared, where all priority encoders are unlocked, so that new propagation

can occur. In the second cycle, the objective is pushed onto the Backward Network, and the

Backward Network Encoder is signaled that a “Backtrace” operation has started via toggling of

the trace_start signal.

# back-dec | received new obj: 00000000000100

# ------------------------------

# back-dec | clearing back-net

# ------------------------------

# back-dec | pushing obj onto back-net: 00000000000100

Table B.4: Circuit c17 Simulation Cycles 2-4

Once the objective is pushed onto the Backward Network, tracing takes place in a single cycle,

while the Backward Network Encoder waits for trace results. Figure B.3 shows the result of the

“Backtrace” on the circuit structure. Note that in the circuit diagram, the bottom inputs to gates

are considered the “A” input, and as such when a gate output objective only requires one input to

be set as an objective, the bottom input will be set.

Figure B.3: c17 Simulation Backtrace
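The input-selection rule used during “Backtrace” can be expressed per gate type. The sketch below is a Python illustration of the equations that appear in Table A.3 (for example the G6/G14 pair feeding G8, and the G11/G14 pair feeding G10_base); it is not the generated Verilog. An objective is modeled as a pair (zo1, zo0), where zo1 marks that an objective is required and zo0 carries its value. The function names are hypothetical, and "A" denotes the input that always inherits the objective.

```python
X = (1, 1)  # forward-network encoding of an unassigned line

def is_x(state):
    # An input only accepts an objective while both of its state bits are 1.
    return state[0] & state[1]

def backtrace_and(obj, state_a, state_b):
    zo1, zo0 = obj
    a = (zo1 & is_x(state_a), zo0)
    # Input B is only needed when the AND output must be 1.
    b = (zo1 & zo0 & is_x(state_b), zo0)
    return a, b

def backtrace_or(obj, state_a, state_b):
    zo1, zo0 = obj
    a = (zo1 & is_x(state_a), zo0)
    # Input B is only needed when the OR output must be 0.
    b = (zo1 & (1 - zo0) & is_x(state_b), zo0)
    return a, b

def backtrace_not(obj, state_in):
    zo1, zo0 = obj
    return (zo1 & is_x(state_in), 1 - zo0)

# AND output must be 0: only input A receives an objective.
a_obj, b_obj = backtrace_and((1, 0), X, X)
```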

While this “Backtrace” is happening, the Backward Network Encoder is waiting for results. Once the results are available, the Backward Network Encoder sets the NReady signal low to indicate to the PI/PPI Decision Block that backtraced results are ready to be sent. The PI/PPI Decision Block then moves to state 1 to prepare to accept a new objective value from the Backward Network.

# back-enc | trace started, waiting for results

# ------------------------------

# back-enc | traced values received

# ------------------------------

# back-enc | reset nready signal

From state 1, the PI/PPI Decision Block moves to state 16, where it sets genobj to high,

indicating to the Backward Network Encoder that it is ready to accept an objective value. In the

following cycle the PI/PPI Decision Block returns to state 0 (idle), while the Backward Network

Encoder finds the first objective value on the Backward Network outputs. It finds an objective

value of “1” (encoded from “11” to “10”) at PI index 1. This corresponds to the assignment of

“G2gat” to 1. This objective is pushed onto the in bus and NReady is set low, triggering the

PI/PPI Decision Block to action. The PI/PPI Decision Block moves through state 1 and on to

state 17.

# ------------------------------

# back-enc | found obj at PI index 0000000001: 10

# ------------------------------

# memram | beginning push

# ------------------------------

# memram | pushing on to ram

# input word=00000000010100

# memram | ADDRESS=li=0000

# ------------------------------

# PPI Decision Block | Assign vf (from top)

# addr=0000000001

# val=10

In state 17, the PI/PPI Decision Block pushes the value on the in bus onto the Block RAM, which will then contain its first value, as shown in Figure B.4.

Figure B.4: Contents of c17 Block RAM After Objective 1
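The "input word", "addr" and "val" fields printed in the simulation log are related in a simple way. The Python sketch below decodes a log word under a layout inferred purely from the log lines in this appendix: the upper 10 bits match the printed address, and bits 2 and 1 match the printed value. The meaning of the remaining two bits is not established here, so they are returned uninterpreted. The function name is hypothetical.

```python
def decode_word(word):
    """Split a 14-bit log word into (address, value bits, other bits)."""
    if len(word) != 14 or set(word) - {"0", "1"}:
        raise ValueError("expected a 14-character binary string")
    addr = int(word[:10], 2)      # matches the 'addr=' field in the log
    val = word[11:13]             # matches the 'val=' field in the log
    other = word[10] + word[13]   # meaning not established in this sketch
    return addr, val, other

# 2-bit value encodings as used throughout the appendices.
VALUE_NAMES = {"10": "1", "01": "0", "11": "X"}

first_objective = decode_word("00000000010100")
```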

The PI/PPI Decision Block then moves on to state 10, where the same objective value is loaded

into the Forward Network input buffer. At the same time, genobj is set, signaling the Backward

Network Encoder that the PI/PPI Decision Block will be ready to accept another objective value.

The PI/PPI Decision Block then returns to state 0, and the same cycle that just completed runs 2

more times to pass on the objective values set on G6gat and G7gat.

# ------------------------------

# input word=00000000110010

# ------------------------------

# addr=0000000011

# val=01

# ------------------------------

# input word=00000001000100

# ------------------------------

# addr=0000000100

# val=10

After the final objective value has been passed from the Backward Network Encoder to the

PI/PPI Decision Block, genobj is again set high. This time, the Backward Network Encoder has

no more objectives to pass. Detecting that it is done, it sets the in bus to all 1s, indicating no

value, and again triggers the PI/PPI Decision Block by setting NReady to 0. Receiving this

signal that there are no further objectives, the PI/PPI Decision Block pushes the values in the

buffer onto the Forward Network, starting a new trace. The newvaluestoforward signal is also

toggled to signal the Objective Decision Block that a new trace has started.

# back-enc | no objectives; done

# addr=0000000100

# val=10

# PPI Decision Block | Sending Values to Forward Network

Table B.8: Circuit c17 Simulation Cycle 21
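The hand-off of objectives from the Backward Network Encoder, one per request and ending with an all-ones word, can be sketched as a generator. This Python model is an illustration under several assumptions: the 14-bit word layout is inferred from the log output, the re-encoding of a required objective with value 1 from “11” to “10” is taken from the text, and the corresponding mapping of value 0 to “01” is inferred from the logged values. The function name is hypothetical.

```python
NO_OBJECTIVE = "1" * 14  # all-ones word on the in bus signals "no value"

def encode_objectives(outputs):
    """outputs: list of (o1, o0) pairs indexed by PI/PPI position."""
    for index, (o1, o0) in enumerate(outputs):
        if o1:                            # an objective is required here
            val = "10" if o0 else "01"    # forward-network encoding
            yield f"{index:010b}0{val}0"
    yield NO_OBJECTIVE                    # nothing left to pass on

# c17 example from this appendix: objectives on PI indices 1, 3 and 4.
c17_outputs = [(0, 0), (1, 1), (0, 0), (1, 0), (1, 1)]
words = list(encode_objectives(c17_outputs))
```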

The same three objective values that were backtraced are pushed onto the Forward Network,

which propagates the values from PI to PO. The flow of this trace on the circuit structure is

shown in Figure B.5.

Figure B.5: c17 Simulation Trace / Implication
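The implication step can be checked in software with the same 2-bit encoding. The Python sketch below assumes the standard c17 netlist (six two-input NAND gates) with inputs named G1gat, G2gat, G3gat, G6gat and G7gat, and assumes that this matches Figure B.1; the internal node names and the function names are hypothetical. The property monitor is modeled as an AND of both POs.

```python
ONE, ZERO, X = (1, 0), (0, 1), (1, 1)  # (z1, z0) encodings of 1, 0 and X

def and3(a, b):
    return (a[0] & b[0], a[1] | b[1])

def nand3(a, b):
    # AND in the 2-bit encoding followed by NOT (swap of the two bits).
    z1, z0 = and3(a, b)
    return (z0, z1)

def imply_c17(pi):
    """pi: dict of the five primary inputs; returns (PO22, PO23, monitor)."""
    n10 = nand3(pi["G1gat"], pi["G3gat"])
    n11 = nand3(pi["G3gat"], pi["G6gat"])
    n16 = nand3(pi["G2gat"], n11)
    n19 = nand3(n11, pi["G7gat"])
    n22 = nand3(n10, n16)
    n23 = nand3(n16, n19)
    return n22, n23, and3(n22, n23)  # the monitor ANDs all POs together

# The three backtraced objectives; the other inputs stay unassigned.
assignment = {"G1gat": X, "G2gat": ONE, "G3gat": X, "G6gat": ZERO, "G7gat": ONE}
po22, po23, final_zo = imply_c17(assignment)
```

Under these assumptions both outputs resolve to 1 even though two inputs remain X, which is consistent with the monitor output changing from X to 1 in the next cycle.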

In the next cycle, the Objective Decision Block sees newvaluestoforward has toggled, and checks

the values on the Forward Network. The PO value of “final_zo” has changed from X (2’b11) to

1 (2’b10), indicating a successful trace. Since the circuit is in frame_1 (circuits without

sequential elements only operate in a single frame; frame_k=frame_1), the Objective Decision

Block checks the Forward Network state against the objective of line_k=1. Since line_k

(final_zo) is set to 1, the final objective in frame_1 has been satisfied. The Done signal is set

high, indicating a done state, and the newframe_ready signal is toggled indicating that the current

frame is complete (in a sequential circuit this would mean moving to the next frame).

# obj-dec | line_k = 1

# obj-dec | DONE with frame

# ------------------------------

# back-enc | forwarding newframe nready

# DONE!!

# cycle= 23 FAIL=0 DONE=1

# back-enc | recovering nready in idle

In the final cycle, the Backward Network Encoder receives the toggle of the newframe_ready

signal, which triggers it to set NReady low, passing on the signal to the PI/PPI Decision Block.

The PI/PPI Decision Block receives the NReady signal and also sees that the signals Done=1,

Conflict=0 and frame_1=1, indicating that the final objective has been satisfied in frame_1. The

PI/PPI Decision Block then moves to the final state 14, where DONE=1 is passed to the global

output, completing the simulation. At this point, the full set of PI/PPI assignment vectors

required to reproduce the line_k=1 objective are located in the Block RAM for extraction.

Figure B.6: Final c17 Block RAM Contents

Appendix C: Example FAIL Simulation for s27 Benchmark Circuit

The following results are taken from simulation of the “s27” ISCAS89 benchmark circuit using

Mentor Graphics QuestaSim-64 6.5f r2010.06. For reference, the logical structure of “s27” is

shown in Figure C.1.

Figure C.1: s27 Benchmark Circuit Structure

Note that as part of the translation into the Forward and Backward Networks, the sequential elements are removed and converted into PPIs/PPOs of the circuit. In the illustration, the PPIs are located along the bottom of the circuit, using lower case notation. Their corresponding PPOs are along the right side of the circuit, using the same name in upper case notation. Also note that inverting gates are converted into a non-inverting gate and a separate NOT gate, but for the sake of illustration simplicity, these gates remain single gates in this example. The final network structure for tracing in simulation is shown in Figure C.2.

Figure C.2: s27 Benchmark Simulation Structure

Simulation begins with an input clock defined with a 100ps period (first rising edge at 50ps), and

a pulse of global_reset to high for 5ps, to trigger initialization/reset of the circuit. From this

point, simulation of the first frame begins. Since the first frame objective is always line_k=1,

simulation proceeds exactly as in Appendix B, up until the point that the first frame is complete.

Simulation output from the first frame is shown in Table C.1, but is not discussed in detail for

this reason.

# fourcounter | reseting the frame counter

# statecheck | reset

# obj-dec | RESET

# back-bec | RESET

# back-enc | RESET

# ------------------------------

# obj-dec | line_k = X

# obj-dec | push obj: 00000000000100

# ------------------------------

# back-enc | found obj at PPI index

00000000000000000000000000000101: 01

# ------------------------------

# input word=00000001010010

# ------------------------------

# addr=0000000101

# val=01

# ------------------------------

00000000000000000000000000000110: 10

# ------------------------------

# input word=00000001100100

# ------------------------------

# addr=0000000110

# val=10

# ------------------------------

# addr=0000000110

# val=10

# ------------------------------

# obj-dec | line_k = 1

# ------------------------------

Table C.1: Circuit s27 Simulation Cycles 1-20

The only difference in the first frame between this simulation and the one in Appendix B is the result of the “Backtrace” operation in the Backward Network, and hence the set of values passed to the PI/PPI Decision Block to be stored in the Block RAM. The “Backtrace” of the circuit for the first processed frame (frame_k; line_k=1) is shown in Figure C.3, where G17 is line_k, being the only PO of the circuit.

Figure C.3: s27 Simulation Frame k Backtrace

At this point, the Block RAM contains two entries, storing the two PPI objective values that were

backtraced. These contents are shown in Figure C.4. Note that the “top” bit for the last

objective read onto the memory is currently 0.

Figure C.4: Contents of Block RAM After Frame k

This is where processing diverges from the example in Appendix B. Since sequential elements

exist, the current frame is k, but not 1. Thus, the PI/PPI Decision Block, having PPIs to justify,

moves to state 15 to begin a move to frame k-1. From here processing moves to state 3, where

each PPI assignment from the previous frame is read from the RAM onto the ppi bus to the

Objective Decision Block. The Objective Decision Block reads these values into the RF.

# memram | ADDRESS=addr=00000000000001

# Reading the ram

# ------------------------------

# mycontrol | in state 3 (Ti-1)

# memram | ram reading out - =00000001100100

# Reading the ram

# isppi | evaluating output to RF; in=00000001100100

# ------------------------------

# obj-dec | read in PPI: 00000001100100

# Reading the ram

# ------------------------------

# fourcounter | counting a frame DOWN

# memram | setting Top in RAM

Within the PI/PPI Decision Block, the frame counter counts down by 1. After this, the “top mark” bit is set on the last objective from this frame. This process requires reading out and writing back to the memory over multiple cycles, and thus it occurs concurrently with the other processes over the next 3 cycles. The contents of the Block RAM after this operation are shown in Figure C.5.

Figure C.5: Contents of Block RAM After Move to Frame k-1

The PI/PPI Decision Block then finishes the “Move to Ti-1” operation by clearing the Forward

Network and toggling newvaluestoforward, which triggers the Objective Decision Block to

action.

# PPI Decision Block | Clearing ValuestoForward

# ------------------------------

# obj-dec | not frame k, check for conflict/done

# obj-dec | push obj: 00000001010010

# memram | beginning rewrite1

The Objective Decision Block again has all Xs from the output of the cleared Forward Network,

but this time it is no longer frame_k, so the contents of the RF (PPI objectives from previous

frame; PPO objectives for this frame) must be checked against the current Forward Network

PPOs to determine the state of the frame. The current contents of the RF are shown in Figure C.6.

Figure C.6: Contents of RF in Frame k-1

Since all PPOs are currently X, the first objective value in the RF is selected to be the next

objective, and pushed to the Backward Network Decoder.
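The frame check performed by the Objective Decision Block outside of frame_k can be sketched as a comparison between the RF entries and the current PPO values. The Python model below is an illustration, not the hardware implementation. The "push" and "done" outcomes follow the behavior described in this appendix; treating a PPO that is assigned the opposite value as a conflict is an assumption of this sketch, as are the function name and the use of addresses 5 and 6 as RF keys.

```python
ONE, ZERO, X = (1, 0), (0, 1), (1, 1)  # 2-bit encodings of 1, 0 and X

def check_frame(rf, ppo):
    """rf: list of (index, required value); ppo: dict index -> current value."""
    pending = None
    for index, required in rf:
        current = ppo[index]
        if current == X:
            if pending is None:
                pending = (index, required)  # first unjustified objective
        elif current != required:
            # Assumption: an opposite assignment is reported as a conflict.
            return ("conflict", (index, required))
    if pending is not None:
        return ("push", pending)
    return ("done", None)

# RF contents in frame k-1 of the s27 example.
rf = [(5, ZERO), (6, ONE)]
cleared = check_frame(rf, {5: X, 6: X})
after_imply_1 = check_frame(rf, {5: ZERO, 6: X})
after_imply_2 = check_frame(rf, {5: ZERO, 6: ONE})
```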

# memram | setting the flag in RAM

# ------------------------------

The Backward Network Decoder receives the new objective and starts by clearing the Backward

Network of values locked into the priority encoders from the previous “Backtrace”. Once

complete, the new objective is pushed into the Backward Network, and the trace_start signal is

toggled, triggering the Backward Network Encoder to begin waiting for “Backtrace” results. The

first “Backtrace” in the current frame (k-1) is shown in Figure C.7.

Figure C.7: s27 Simulation Frame k-1 Backtrace 1

The “Backtrace” completes, and again the objective values are passed on to the PI/PPI Decision

Block, where they are pushed into the Block RAM.

# ------------------------------

00000000000000000000000000000101: 01

# ------------------------------

# input word=00000001010010

# ------------------------------

# addr=0000000101

# val=01

# ------------------------------

00000000000000000000000000000110: 10

# ------------------------------

# input word=00000001100100

# ------------------------------

# addr=0000000110

# val=10

At this point the Block RAM contains a completed frame k assignment, and a partial assignment

for frame k-1 (only the justification of the first objective has been completed) as shown in Figure C.8.

Figure C.8: Contents of RAM in Frame k-1 with Backtrace 1

After receiving the final objective, the PI/PPI Decision Block pushes the received values into the

Forward Network to complete the “Imply” operation, and generate the new objective’s resultant

PO/PPO values. The PI/PPI Decision Block also toggles the newvaluestoforward signal,

triggering the Objective Decision Block to take action. This forward trace is shown in Figure C.9.

Figure C.9: s27 Simulation Frame k-1 Imply 1

# addr=0000000110

# val=10

# ------------------------------

# obj-dec | push obj: 00000001100100

In the following cycle, the Objective Decision Block again checks the state of the current frame.

This time, the first value in the RF is satisfied by having an equal assignment on its

corresponding PPO from the Forward Network, as shown in Figure C.10.

Figure C.10: Contents of RF and PO/PPO in Frame k-1, Imply 1

Thus, the Objective Decision Block selects the second (and final) value in the RF as the

objective for further justification, as its corresponding PPO value from the Forward Network is

still X. The objective value is pushed to the Backward Network Decoder, which again clears the

Backward Network for a new “Backtrace” operation. The objective is then pushed onto the

Backward Network, and trace_start is toggled, triggering the Backward Network Encoder to

begin waiting for “Backtrace” results.

# ------------------------------

Note that this time in “Backtrace”, the Forward Network is not in its cleared state (all Xs), so there are current STATE values for each gate that factor into whether or not an objective will continue propagating in the “Backtrace”. These values are shown in blue in the “Backtrace” 2 illustration, Figure C.11.

The value being backtraced on “G7” does eventually interact with a current STATE value from

the Forward Network, at G12. Here the gate is already assigned a value of 0 in the Forward

Network. Since the current value is not “X”, the “Backtrace” stops on this path. Since the

values are the same, the justification for this part of the path was already completed and further

evaluation along this path is not required. If the values were not equal, that would present a

conflict. Evaluation on the path would still stop, but a conflict would then be detected by the

Objective Decision Block in the subsequent “Imply” operation on the Forward Network, as a

complete assignment for the backtraced objective will not have been generated.
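The three cases described above (unassigned, already justified, conflicting) can be written as a small classification function. This Python sketch is only a restatement of the paragraph in code; the function name and the returned labels are hypothetical.

```python
ONE, ZERO, X = (1, 0), (0, 1), (1, 1)  # 2-bit encodings of 1, 0 and X

def classify(objective_value, state):
    """Compare a backtraced objective (0 or 1) with a line's current STATE value."""
    if state == X:
        return "propagate"           # unassigned: the backtrace continues
    wanted = ONE if objective_value == 1 else ZERO
    if state == wanted:
        return "already justified"   # stop; nothing more to do on this path
    return "conflict"                # stop; surfaces in the following Imply

# The G12 case from the example: objective 0 meets an existing value of 0.
g12_case = classify(0, ZERO)
```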

As shown, one objective is found in the second “Backtrace”. This value is received by the

encoder and passed to the PI/PPI Decision Block. The PI/PPI Decision Block stores this value in

memory, and adds it into the current Forward Network output buffer (in addition to the values

already in place from the first forward trace).

# ------------------------------

# input word=00000000100010

# ------------------------------

# addr=0000000010

# val=01

# ------------------------------

# addr=0000000010

# val=01

Once it is determined that there are no more objective values from the Backward Network

Encoder, the current values in the buffer are pushed onto the Forward Network and

newvaluestoforward is toggled to inform the Objective Decision Block that a trace has started

and action will be necessary. This second trace / “Imply” operation is shown in Figure C.12.

Figure C.12: s27 Simulation Frame k-1 Imply 2

In the next cycle, the “Imply” operation is complete and updated values are available on the

output of the Forward Network. The current PPO values are checked against their associated

values in the RF, as shown in Figure C.13. This time, both values in the RF are satisfied by their

corresponding Forward Network PPO values, so the Done signal is set and newframe_ready is

toggled, making the Backward Network Encoder signal the PI/PPI Decision Block to take action

via NReady.

# obj-dec | signal newframe_ready

# ------------------------------

Figure C.13: Contents of RF and PO in Frame k-1, Imply 2

Since Done is asserted, it is not frame_1, and there are no PPOs in the current frame to be further

justified, processing must move back yet another frame, to k-2. This time, though, before

proceeding, the PI/PPI Decision Block goes to state 21. This is the State Check, which is run for

each “Move to Ti-1” operation beyond frame_k.

# Reading the ram

# ------------------------------

# statecheck | store value in Out: 11111111111111

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# statecheck | end of frame compare (Addr=00000000000000)

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# statecheck | check finished with no duplicates

# memram | ram reading out - =xxxxxxxxxxxxxx

# Reading the ram

The Forward Network output buffer contains the PI/PPI assignment that defines the current

frame to be locked into memory. To check this, State Check has a separate buffer that each past

frame in the memory is read out to. Each cycle, another past objective from the RAM is read out

into the State Check buffer. When a “top” mark bit is hit, it indicates the current RAM entry is

the start of a different frame. The current values in the State Check buffer are then compared to

the Forward Network output buffer. If the first entry in the RAM is reached and the final State

Check comparison passes, then no duplicates were found, and the “Move to Ti-1” process will

begin. In this example, the current assignment for frame k-1 is compared to the assignment for

frame k. It is found to be different, as shown in Figure C.14, so processing continues.

Figure C.14: State Check Comparison for Frame k-1
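The State Check amounts to comparing the current frame's assignment against every completed frame stored in the RAM. The Python sketch below models that comparison at a high level and does not reproduce the cycle-by-cycle read-out. It assumes RAM entries can be represented as (addr, val, top) tuples listed oldest first, that a set “top” bit closes a frame, and that trailing unmarked entries belong to the current frame; the function names are hypothetical.

```python
def split_frames(ram):
    """Group RAM entries into completed frames, each ending at a top mark."""
    frames, frame = [], {}
    for addr, val, top in ram:
        frame[addr] = val
        if top:
            frames.append(frame)
            frame = {}
    return frames  # unmarked trailing entries are not part of any past frame

def state_check(ram, current):
    """Return True when no completed frame repeats the current assignment."""
    return all(past != current for past in split_frames(ram))

# RAM during the move out of frame k-1 in the s27 example.
ram = [(5, "01", False), (6, "10", True),                      # frame k
       (5, "01", False), (6, "10", False), (2, "01", False)]   # frame k-1
current = {5: "01", 6: "10", 2: "01"}
passes = state_check(ram, current)

# Hypothetical later frame that reproduces the frame k-1 assignment.
ram_marked = ram[:-1] + [(2, "01", True)]
repeats = not state_check(ram_marked, current)
```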

With the state check complete, the PI/PPI Decision Block continues through the “Move to Ti-1”

operation. Upon seeing the donewithstatecheck signal, the Objective Decision Block clears the

values currently in the RF, in preparation for the next frame to begin. The PI/PPI Decision

Block then begins passing the PPI objectives from the last frame to the Objective Decision

Block, which again stores them in the RF as the PPO objectives for the next frame.

# obj-dec | clearing RF for new frame

# memram | ram reading out - =xxxxxxxxxxxxxx

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# fourcounter | counting a frame DOWN

# ------------------------------

# PPI Decision Block | Clearing ValuestoForward

Note that although three objectives were written to the Block RAM as part of frame k-1, only

two objective values are transferred to the RF for frame k-2. This is because one of the

objectives in frame k-1 is a PI, which can be set arbitrarily, and thus does not require further

justification. Thus, the “isppi” module filters that objective out while the memory is being read, and it

is not passed on to the Objective Decision Block. The contents of the RF after these new values

have been shifted in can be seen in Figure C.15.

Figure C.15: Contents of RF in Frame k-2
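The PI filtering described above can be sketched as follows. This is an illustrative model only: the function name `load_rf` is hypothetical, and the test “address >= m” is an assumed stand-in for the “isppi” module. The slot calculation mirrors the `RF[ppi[13:4]-m]` indexing in the obj_dec module of Appendix D, with m = 4 and l = 3 taken from the parameters listed there.

```python
M = 4  # number of primary inputs (parameter m); PPI addresses start at M
L = 3  # number of pseudo-primary inputs (parameter l)


def load_rf(frame_entries, m=M, l=L):
    """Return the RF contents for the next frame.

    frame_entries: list of (address, value) pairs, value "01" or "10".
    Entries with address < m are PIs and are dropped; a PPI at
    address a lands in RF slot a - m.
    """
    rf = [None] * l
    for addr, val in frame_entries:
        if addr >= m:  # assumed form of the "isppi" test
            rf[addr - m] = (addr, val)
    return rf
```

For the three objectives at addresses 5, 6, and 2, only the entries at addresses 5 and 6 reach the RF, matching the two values transferred for frame k-2.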

The PI/PPI Decision Block then clears the Forward Network and proceeds to mark the last

objective in the Block RAM with the “top” bit, indicating the end of the last frame. The contents

of the RAM at this point can be seen in Figure C.16.

Figure C.16: Contents of Block RAM at Frame k-2 Start

At the same time that the “top” bit is being marked, the Objective Decision Block has again been

triggered to action. All PO/PPOs from the Forward Network are cleared (values of X), and it is

not frame_1, so an objective from the RF will be selected for justification. The first value is

selected and passed to the Backward Network Decoder. The decoder clears the Backward

Network, pushes the new objective on, and signals the Backward Network Encoder that a new

“Backtrace” is starting.

# obj-dec | push obj: 00000001010010

# ------------------------------

# memram | setting the flag in RAM

# ------------------------------
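The objective selection step above can be sketched as a scan over the RF. This is a simplified reading of the sequencer loop in the obj_dec module of Appendix D, not a cycle-accurate model; the function name and the list-based data layout are assumptions made for illustration.

```python
X = (1, 1)  # two-rail encoding of "unassigned", as read from PPO_1/PPO_0


def pick_objective(rf, ppo):
    """Return the index of the first RF objective whose PPO is still X.

    rf:  list of two-bit strings per PPO; "00" means no objective stored.
    ppo: list of (ppo_1, ppo_0) pairs from the Forward Network.
    Returns None when nothing is left to justify.
    """
    for idx, (want, have) in enumerate(zip(rf, ppo)):
        if want == "00":
            continue      # no objective at this index
        if have == X:
            return idx    # objective pending and output unassigned
    return None
```

When all PPO values are X, the first stored objective is returned, as happens at the start of frame k-2.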

Note that this first objective is the same as the first objective Backtraced as part of frame k-1,

and as such, the results of the “Backtrace” operation will be the same, as shown in Figure C.17.

The same traced values as before are received and passed back to the PI/PPI Decision Block.

These values are again stored in the Block RAM, and pushed onto the Forward Network, leading

to the same situation as in frame k-1.

# ------------------------------

00000000000000000000000000000101: 01

# ------------------------------

# input word=00000001010010

# ------------------------------

# addr=0000000101

# val=01

# ------------------------------

00000000000000000000000000000110: 10

# ------------------------------

# input word=00000001100100

# ------------------------------

# addr=0000000110

# val=10

# ------------------------------

# addr=0000000110

# val=10

# ------------------------------

# obj-dec | push obj: 00000001100100

# ------------------------------
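The log lines above pair each 14-bit objective word with its address and value fields. Judging from the Backward Network Encoder code in Appendix D, bits [13:4] carry the index, bits [2:1] carry the two-bit value (01 for logic 0, 10 for logic 1), and bits 3 and 0 are written as 0. The helper names below are hypothetical and serve only to make that layout explicit.

```python
def decode_objective(word: str):
    """Split a 14-bit objective word (MSB first) into (address, value)."""
    if len(word) != 14 or set(word) - {"0", "1"}:
        raise ValueError("expected a 14-character binary string")
    addr = int(word[0:10], 2)  # bits [13:4]
    val = word[11:13]          # bits [2:1]
    return addr, val


def encode_objective(addr: int, val: str) -> str:
    """Inverse of decode_objective; bits 3 and 0 are written as 0."""
    return f"{addr:010b}0{val}0"
```

Decoding the pushed objective 00000001010010 yields address 5 with value 01, in agreement with the addr and val lines of the log.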

The second objective to be pushed to the Backward Network is also the same, and the state of the

Forward Network is the same, so the second “Backtrace” operation is also identical, as shown in

Figure C.18.

Again, the same objective is backtraced and returned to the PI/PPI Decision Block, which in turn

pushes the update onto the Forward Network. The Objective Decision Block is now in the exact

same state as it was in frame k-1. Since both objectives in the RF are satisfied, the Done signal

is set, signaling the PI/PPI Decision Block that the current frame is done.

# ------------------------------

# input word=00000000100010

# ------------------------------

# addr=0000000010

# val=01

# ------------------------------

# addr=0000000010

# val=01

# ------------------------------

Since it is not frame_k or frame_1 and there are PPI objectives from the last frame, a “Move to

Ti-1” operation is desired. This then again triggers the State Check operation, which begins

reading out past frame values to the State Check buffer for comparison vs. the last frame’s

assignment.

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# statecheck | duplicate found

# Reading the ram

This time in the State Check, a duplicate is found, as frame k-1 was exactly the same as the

current frame k-2. The presence of the duplicate indicates a loop in frame assignments, so a

Backtrack operation is started. In this case of a State Check failure, the PI/PPI Decision Block

goes directly to state 9, which completes a “Clear Top” operation, removing the last set objective

on the RAM.

# memram | beginning cleartop

# ------------------------------

# mycontrol | in state 9 (cycle 1)

# memram | clearing top value in RAM

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000010

# val=11

# ------------------------------

# memram | beginning pop

# ------------------------------

# memram | popping off of RAM

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

After clearing the value in the RAM, the Forward Network output buffer is also updated to clear

the associated value to “X”. Finally, the “top” mark bit is updated to the second to last objective

value from the last frame, as the cleared value is no longer part of the last frame. The contents of

the Block RAM after this clear operation are shown in Figure C.19.

Figure C.19: Contents of Block RAM after First Clear Top

After the “Clear Top” operation completes, the PI/PPI Decision Block moves to state 6 to

continue the Backtrack with a “Swap Value” operation. In this process the current top objective

in the RAM for the last frame is read out, its value is swapped from 1 to 0, and it is written back

into the Block RAM. The contents of the Block RAM after the “Swap Value” operation are

shown in Figure C.20.

# memram | beginning swapwrite

# ------------------------------

# memram | swapping values in RAM

# ram_DATA_IN before swap: xx0000000001100100

# memram | ram_DATA_IN after swap: xx0000000001100011

# memram | ram_ADDRESS=0000000110

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=01

# ------------------------------

Figure C.20: Contents of Block RAM after First Swap Value

After the objective value is updated in both the Block RAM and the Forward Network output

buffer, a new trace is started on the Forward Network, and the Objective Decision Block is

signaled to expect trace results. The new Forward Trace with the swapped value is shown in

Figure C.21. Note that the value that was cleared was associated with “G2” and the value that was swapped was associated with “g7” (address 0000000110 in the log above). The value of “g6” remains the same.

Figure C.21: s27 Simulation Frame k-2, Backtrack 1 Imply

Although two objective values are pushed onto the Forward Network, they both stop propagating

within the circuit, which results in the PO/PPO feeding the Objective Decision Block with all Xs.

Thus, for the Objective Decision Block, the current state looks the same as the start of frame k-2,

with two objectives in the RF to justify, and no current PPO values from the Forward Network.

The first objective (G6=0) is selected.

# obj-dec | push obj: 00000001010010

# ------------------------------

The Backward Network Decoder clears the Backward Network and pushes this new objective on

to be backtraced. Note that this time for the “Backtrace” of G6=0, there are different STATE

values coming from the Forward Network, which change the behavior of the operation, as shown

in Figure C.22.

Figure C.22: Simulation Frame k-2, Backtrack 1 Backtrace

One portion of the Backtrace is stopped at gate G8 due to a prior assignment in the Forward Network. The other portion of the Backtrace, going to “g7”, is also stopped, though the reason is not apparent because of another abstraction in the illustration. As part of DFF handling, isolation buffers are added to all DFF outputs. This prevents no-logic paths from being introduced into the networks by DFFs feeding other DFFs. A side effect is that these buffers are present on all PPI inputs in the networks. The “virtual” buffer on “g7” therefore inherits a value of 0 from the Forward Network due to the traced assignment, and the value being Backtraced to “g7” is blocked.

This situation leads to no change in the Backward Network output. This trace failure is caught

by the Backward Network Encoder, which signals the Objective Decision Block via the propfail

signal. The Objective Decision Block sees this signal and raises Conflict to the PI/PPI Decision

Block. At the same time, the Backward Network Encoder also sets NReady low to trigger the

PI/PPI Decision Block to action.

# back-enc | no obj propagated; signaling propagation failure

# ------------------------------

# obj-dec | back-net propagation failure

# back-enc | asserting delayed NReady
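The failure detection performed by the Backward Network Encoder can be summarized with a small sketch. It mirrors the two tests visible in the encoder's waiting state in Appendix D (no value change, and no objective present on any PI or PPI); the function name and the use of plain integers as bit vectors are assumptions made for illustration.

```python
def backtrace_failed(pi_1, pi_0, ppi_1, ppi_0, last):
    """Mirror the two failure tests in the encoder's waiting state.

    Each argument is an int used as a bit vector; `last` is the
    (pi_1, pi_0, ppi_1, ppi_0) tuple captured before the trace.
    """
    unchanged = (pi_1, pi_0, ppi_1, ppi_0) == last
    nothing_propagated = pi_1 == 0 and ppi_1 == 0
    return unchanged or nothing_propagated
```

In the case traced here, nothing reaches a PI or PPI, so the second test fires and propfail is raised.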

With Conflict asserted, the PI/PPI Decision Block again moves to state 9, starting another

Backtrack operation with a “Clear Top”. After that, “Swap Value” is again performed, updating

both the Block RAM and Forward Network output buffer. Upon completion of this “Backtrack”,

the latest values are again pushed onto the Forward Network.

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=11

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=10

# ------------------------------

Note that in this second “Backtrack” operation the value of G7 was cleared, and the value of G6

was swapped from 0 to 1. Once these updated values are pushed onto the Forward Network, the

resulting “Imply” operation is shown in Figure C.23.

This time in the “Imply” operation, the remaining value of g6=1 is immediately stopped, as it

directly feeds a single AND gate, and a value of 1 just passes through the X on the other input.

No PO/PPO values change from the Forward Network, and the Objective Decision Block then

detects a propagation failure on the Forward Network. The Conflict signal is again raised, with

newframe_ready being sent to the Backward Network Encoder to signal the PI/PPI Decision

Block with NReady. The PI/PPI Decision Block, having Conflict set again, begins another “Backtrack” operation.

# obj_dec | fwd network propagation failure

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=11

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

This time, there is only one objective left in the current frame k-2. Once the “Clear Top” operation has completed, the PI/PPI Decision Block sees the “top” bit set on the next objective on which it needs to execute a “Swap Value”. This indicates that all options in the current frame have been exhausted and a “Move to Ti+1” operation must be completed to continue the “Backtrack”.

When the “Move to Ti+1” operation begins, the PI/PPI Decision Block sets the tiplus1 signal,

which triggers the Objective Decision Block to clear the RF, in preparation for a new frame. The

first operation that the PI/PPI Decision Block needs to complete is restoring the final state from

the frame that is being moved to. To do this, each objective value from the new frame is read out

from the Block RAM and into the Forward Network output buffer, cycling between states 3 and

22. Once complete, the previous PPO objective values in the RF must be restored. The

objective values from the frame prior to the one being moved to are also read out, and passed

over the ppi bus to the Objective Decision Block to read into the RF. Once all objective values

have been read out, the “top” mark bit is cleared from the last objective in the RAM, “unlocking”

the frame being moved to, and the counter increments by 1 to reflect the new current state.

# clearing the flag in RAM

# ------------------------------

# Reading the ram

# ------------------------------

# mycontrol | in state 3 (Ti+1)

# PPI Decision Block | Assign vf (from Out)

# addr=1111111111

# val=11

# Reading the ram

# addr=0000000010

# val=01

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=10

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=01

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# fourcounter | counting a frame UP

Once the “Move to Ti+1” operation has completed, the Backtrack that was in progress can

continue. It left off waiting to run a “Swap Value” on the last value in the frame that is now the

current frame (k-1). That “Swap Value” is now executed, and the current memory state after

both the “Move to Ti+1” and “Swap Value” is shown in Figure C.24.

Figure C.24: Block RAM Contents after Move to Ti+1 and Swap Value
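The “Clear Top”, “Move to Ti+1”, and “Swap Value” steps walked through above can be modeled as operations on a stack. The sketch below is a simplified software model under stated assumptions, not the state machine of the PI/PPI Decision Block: the `Entry` class and `backtrack` function are hypothetical names, the frame is tracked as an offset from frame k, and the model does not record whether an entry has already been swapped.

```python
from dataclasses import dataclass
from typing import List, Tuple

SWAP = {"01": "10", "10": "01"}  # "01" = logic 0, "10" = logic 1


@dataclass
class Entry:
    addr: int
    val: str
    top: bool = False  # set on the last entry of a completed frame


def backtrack(ram: List[Entry], frame: int) -> Tuple[int, bool]:
    """Run one Backtrack on the RAM stack, in place.

    frame is the offset from frame k (0 = k, -1 = k-1, ...).
    Returns (new frame offset, whether a Move to Ti+1 happened).
    """
    if len(ram) < 2:
        raise ValueError("nothing left to swap")
    ram.pop()                        # Clear Top: drop the last objective
    moved = ram[-1].top
    if moved:                        # next entry closes the earlier frame
        ram[-1].top = False          # Move to Ti+1: "unlock" that frame
        frame += 1
    ram[-1].val = SWAP[ram[-1].val]  # Swap Value
    return frame, moved
```

Starting from three frames that hold the assignments seen in the log, three successive calls reproduce the sequence in this example: the first two stay in frame k-2, and the third crosses the “top” mark, returns to frame k-1, and swaps the value at address 2.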

Once the value is swapped in the Block RAM, it is also updated in the Forward Network output

buffer. The PI/PPI Decision Block then pushes these updated values into the Forward Network

to start a new “Imply” operation. Note that this time, the value that was swapped was that of G2,

changing from 0 to 1, while g7 remains 1 and g6 remains 0, for frame k-1. This “Imply”

operation is shown in Figure C.25.

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000010

# val=10

# ------------------------------

Note that this time when the “Imply” operation completes the PO/PPO assignment from the

Forward Network includes the assignment G7=0. Comparing this with the contents of the RF, as

shown in Figure C.26, there is now a value conflict between the required PPO value of G7 from

the previous frame (1), and the assigned value in the current frame (0).

Figure C.26: RF Contents and PO/PPO after Move to Ti+1, Imply 1
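The comparison that produces this conflict follows the Conflict/Done loop in the obj_dec module of Appendix D. A minimal sketch of that loop is given below; the function name and the string-based data layout are assumptions made for illustration, with each PPO written as the two characters PPO_1 then PPO_0.

```python
def check_frame(rf, ppo):
    """Compare stored objectives against Forward Network PPO values.

    rf:  per-PPO two-bit strings; "10" = 1, "01" = 0, "00" = none.
    ppo: per-PPO two-bit strings "<PPO_1><PPO_0>"; "11" = X.
    Returns (conflict, done).
    """
    conflict, matched = False, 0
    for want, have in zip(rf, ppo):
        if want == "00":
            matched += 1          # nothing required at this index
        elif want in ("01", "10"):
            if have == want:
                matched += 1      # required value already produced
            elif have in ("01", "10"):
                conflict = True   # opposite value produced
    return conflict, matched == len(rf)
```

With the RF requiring G7 = 1 and the Forward Network producing G7 = 0, the function reports a conflict, as in Figure C.26.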

The Objective Decision Block detects this conflict, causing it to raise the Conflict signal again.

The newframe_ready signal is sent to the Backward Network Encoder, which again sets the

NReady signal, triggering the PI/PPI Decision Block to take action. With the Conflict signal

raised, the PI/PPI Decision Block will run another iteration of the “Backtrack” operation.

# obj-dec | CONFLICT found

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000010

# val=11

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=01

Note that after this “Backtrack” operation, the contents of the Block RAM for frame k-1 are

exactly the same as the contents of the RAM were for frame k-2 after the first “Backtrack”

operation. The current contents of the Block RAM at this point are shown in Figure C.27.

Figure C.27: Block RAM Contents after Move to Ti+1 and Backtrack

Since the current state in frame k-1 is the same as it was in frame k-2 after the first “Backtrack”,

the simulation will proceed exactly as it did in that state, triggering two more “Backtrack”

operations, until the end of frame k-1 is encountered.

# ------------------------------

# obj-dec | push obj: 00000001010010

# ------------------------------

# back-enc | no obj propagated; signaling propagation failure

# ------------------------------

# back-enc | asserting delayed NReady

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=11

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=10

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=11

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

Once the third “Backtrack” operation in frame k-1 starts, it encounters the same situation as in

frame k-2. There are no more objectives in frame k-1, and the next objective to be set is part of

frame k. Thus the PI/PPI Decision Block starts another “Move to Ti+1” operation. Note that this

time, the Forward Network output buffer is updated with the frame k PI/PPI assignments, but no

PPO objective values are sent to the RF in the Objective Decision Block, once it clears itself for

the new frame. Since the new frame is frame k, there is no frame before it from which PPO objectives would have been inherited. The only objective in frame k is line_k=1, which is specially

checked by the Objective Decision Block (instead of checking the RF) when in frame_k.

# clearing the flag in RAM

# ------------------------------

# Reading the ram

# ------------------------------

# addr=1111111111

# val=11

# Reading the ram

# addr=0000000110

# val=10

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000101

# val=01

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# fourcounter | counting a frame UP

# ------------------------------

After the “Move to Ti+1” operation and the following “Swap Value” occur, the circuit is back in

frame_k, with a changed value for the second objective that was part of that frame (g7). The

state of the Block RAM at this point can be seen in Figure C.28.

Figure C.28: Block RAM Contents after Second Move to Ti+1 and Swap Value

After swapping the value in the Block RAM, the value is also updated in the Forward Network

output buffer, which is then pushed onto the Forward Network to start a new “Imply” operation.

This new “Imply” with g7 swapped to 0 and g6 remaining 0 is shown in Figure C.29.

Figure C.29: Simulation Frame k, Backtrack 1 Imply

# Reading the ram

# ------------------------------

# Reading the ram

# ------------------------------

# addr=0000000110

# val=01

# ------------------------------

Propagation of both objectives is stopped early, as each presents a single 0 input to an OR gate, which passes through a value of X. Thus, the PO/PPO outputs of the Forward Network remain all Xs,

causing the Objective Decision Block to detect a Forward Network propagation failure. The

Objective Decision Block raises Conflict and toggles newframe_ready, triggering the Backward

Network Encoder to set NReady low. The PI/PPI Decision Block is triggered by NReady, and

sees the current state of Conflict and frame_k, which indicates a final FAIL. The PI/PPI Decision

Block moves to a final state of 13, which sets the FAIL output of the circuit high.

# ------------------------------

# FAIL!!!

# cycle= 247 FAIL=1 DONE=0
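The two stalled “Imply” operations in this walkthrough (a 1 on an AND gate whose other input is X, and a 0 on an OR gate whose other input is X) follow from three-valued gate evaluation. The sketch below uses a standard two-rail formulation that is consistent with how obj_dec reads PO_1/PO_0, where (1,0) is logic 1 and (1,1) is X. It is an assumption for illustration only; the actual Forward Network gate logic is generated per circuit and is not reproduced here.

```python
ZERO, ONE, X = (0, 1), (1, 0), (1, 1)  # (v1, v0) rail pairs


def and3(a, b):
    """AND: can be 1 only if both can be 1; can be 0 if either can be 0."""
    return (a[0] & b[0], a[1] | b[1])


def or3(a, b):
    """OR: can be 1 if either can be 1; can be 0 only if both can be 0."""
    return (a[0] | b[0], a[1] & b[1])
```

Under this encoding a non-controlling input paired with X leaves the output at X, which is why neither assignment reaches a PO or PPO.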

Appendix D: FPGA Emulation Algorithm Implementation Base Verilog Code

Included in this appendix is the base Verilog code implementing the FPGA emulation for the

algorithm presented here. Each module is implemented as described in section 3. Forward and

Backward Networks are excluded from this code, as they are specific to the circuit under test.

Example network code can be found in Appendix A. Note that the n/m/l/s parameter settings in

the code are not generic, and are updated for each different circuit under test via the script in

Appendix F.

//
//
// STATIC CODE
//
//

module vcontrol (Top, In, Out);
    input [13:0] Top;
    input [13:0] In;

    output reg [1:0] Out;

    //reg [13:0] last_Top;
    //reg [13:0] last_In;

    always @(Top, In) begin
        //if (Top != last_Top) begin
        //    Out = Top[2:1];
        //end
        //else begin
        //    Out = In[2:1];
        //end

        //last_Top is not controlled, so this always reads Top
        Out = Top[2:1];
    end

endmodule

module obj_dec(global_reset, clk, ppi, PO_0, PO_1, PPO_0, PPO_1, frame_k, obj, Done,
               Conflict, line_k_1, line_k_X, obj_set, propfail, newframe_ready, newvaluestoforward,
               donewithstatecheck, statecheckresult, tiplus1);
    //parameters - filled in generation
    parameter n = 1;
    parameter m = 4;
    parameter l = 3;
    parameter s = 22;

    //io
    input global_reset;
    input clk;
    input [13:0] ppi;
    input [n-1:0] PO_0; //gen (n)
    input [n-1:0] PO_1; //gen (n)
    input [l-1:0] PPO_0; //gen (l)
    input [l-1:0] PPO_1; //gen (l)
    input frame_k;
    input propfail;
    input newvaluestoforward;
    input donewithstatecheck;
    input statecheckresult;
    input tiplus1;
    output reg [13:0] obj;
    output reg Done;
    output reg Conflict;
    output reg line_k_1;
    output reg line_k_X;
    output reg obj_set;
    output reg newframe_ready;

    //vars
    reg [13:0] RF [l-1:0];
    reg [l-1:0] val_ppo_1;
    reg [l-1:0] val_ppo_0;
    reg [n-1:0] val_po_1;
    reg [n-1:0] val_po_0;
    reg last_newvaluestoforward;
    reg [31:0] i;
    reg [31:0] j;
    reg rf_clear;


    //At clock, determine what to do
    always @(posedge global_reset, posedge clk) begin
        //$display("obj-dec | TRIGGER");
        //$display(" global_reset=%b", global_reset);
        //$display(" propfail=%b", propfail);
        //$display(" ppi=%b", ppi);
        //$display(" PO_1=%b", PO_1);
        //$display(" PO_0=%b", PO_0);
        //$display(" PPO_1=%b", PPO_1);
        //$display(" PPO_0=%b", PPO_0);

        if (global_reset) begin
            $display("obj-dec | RESET");
            val_ppo_1 = {l{1'b0}};
            val_ppo_0 = {l{1'b0}};
            val_po_1 = {n{1'b0}};
            val_po_0 = {n{1'b0}};
            i = 32'b0;
            j = 32'b0;
            newframe_ready = 1'b0;
            obj = 14'b00000000000000;
            obj_set = 1'b0;
            last_newvaluestoforward = 1'b0;
            Done = 1'b0;
            Conflict = 1'b0;
            line_k_1 = 1'b0;
            line_k_X = 1'b0;

            //clear out RF
            while(i<l) begin
                RF[i] = 14'b0;
                i = i + 1;
            end

            rf_clear = 1'b1;
            i = 32'b0;
        end
        else if (propfail) begin
            $display("obj-dec | back-net propagation failure");
            Conflict = 1'b1;
        end
        else if (ppi != 14'b11111111111111) begin
            //PPI value being sent from Dec block
            if (ppi[2:1] == 2'b11) begin
                //Value is 11; clear entry
                $display("obj-dec | clear RF for PPI: %b", ppi);
                RF[ppi[13:4]-m] = 14'b0;
            end
            else begin
                //Read in ppi value
                $display("obj-dec | read in PPI: %b", ppi);
                RF[ppi[13:4]-m] = ppi;
                rf_clear = 1'b0;
            end
        end
        else if ((donewithstatecheck && !statecheckresult) || tiplus1) begin
            //Moving between frames - clear the RF
            if (!rf_clear) begin
                $display("obj-dec | clearing RF for new frame");
                i = 32'b0;
                while(i<l) begin
                    RF[i] = 14'b0;
                    i = i + 1;
                end
                rf_clear = 1'b1;
            end
        end

        else if (PPO_1 != val_ppo_1 || PPO_0 != val_ppo_0 || PO_1 != val_po_1 || PO_0 != val_po_0) begin
            //Reset signals
            //Ready = 1'b1;
            //newframe_ready = 1'b0;
            last_newvaluestoforward = newvaluestoforward;

            //Data on fwd, read in
            val_ppo_1 = PPO_1;
            val_ppo_0 = PPO_0;
            val_po_1 = PO_1;
            val_po_0 = PO_0;

            //Is this frame k?
            if(frame_k == 1'b1) begin
                //Frame k, check line_k
                if(val_po_1[0] == 1'b1 && val_po_0[0] == 1'b1) begin
                    $display("obj-dec | line_k = X");
                    //Line_k = X
                    line_k_X = 1'b1;
                    line_k_1 = 1'b0;
                    //Set line_k=1 on objective
                    $display("obj-dec | push obj: %b", 14'b00000000000100);
                    obj = 14'b00000000000100;
                    obj_set = ~obj_set;
                    //Need more justification, stay in frame
                end
                else if(val_po_1[0] == 1'b1 && val_po_0[0] == 1'b0) begin
                    $display("obj-dec | line_k = 1");
                    $display("obj-dec | DONE with frame");
                    //Line_k = 1
                    line_k_X = 1'b0;
                    line_k_1 = 1'b1;
                    newframe_ready = ~newframe_ready;
                    if(l == 1) begin
                        Done = 1'b1;
                        Conflict = 1'b0;
                    end
                    //Done with Tk, move to new frame
                end
                else begin
                    $display("obj-dec | line_k = 0");
                    $display("obj-dec | FAIL");
                    //Line_k = 0
                    line_k_X = 1'b0;
                    line_k_1 = 1'b0;
                    //FAIL
                end
            end

            else begin
                $display("obj-dec | not frame k, check for conflict/done");
                //Not frame k, check control logic
                line_k_X = 1'b0;
                line_k_1 = 1'b0;

                //Conflict/Done logic - created in generation
                // If generated PPO matches past PPI (RF), then Done
                // If PPO/RF mismatch, then Conflict
                // Else, continue to try and assign 11 (X) PPOs
                Conflict = 1'b0;
                Done = 1'b0;
                i = 32'b0;
                j = 32'b0;
                while(i < l) begin
                    if((RF[i][2] == 1'b1 && RF[i][1] == 1'b0) && (PPO_1[i] == 1'b1 && PPO_0[i] == 1'b0)) begin
                        //RF = 10, PPO = 10 : match
                        j = j + 1;
                    end
                    else if((RF[i][2] == 1'b0 && RF[i][1] == 1'b1) && (PPO_1[i] == 1'b0 && PPO_0[i] == 1'b1)) begin
                        //RF = 01, PPO = 01 : match
                        j = j + 1;
                    end
                    else if((RF[i][2] == 1'b1 && RF[i][1] == 1'b0) && (PPO_1[i] == 1'b0 && PPO_0[i] == 1'b1)) begin
                        //RF = 10, PPO = 01 : CONFLICT
                        $display("obj-dec | CONFLICT found");
                        Conflict = 1'b1;
                    end
                    else if((RF[i][2] == 1'b0 && RF[i][1] == 1'b1) && (PPO_1[i] == 1'b1 && PPO_0[i] == 1'b0)) begin
                        //RF = 01, PPO = 10 : CONFLICT
                        $display("obj-dec | CONFLICT found");
                        Conflict = 1'b1;
                    end
                    else if(RF[i][2] == 1'b0 && RF[i][1] == 1'b0) begin
                        //RF = 00 : NO ASSIGNMENT TO MATCH
                        j = j + 1;
                    end

                    i = i + 1;
                end

                if(j == l) begin
                    //All RF entries match PPO, we are DONE
                    $display("obj-dec | DONE with frame");
                    Done = 1'b1;
                end



                //If no conflict and not done, pick new obj
                if((Conflict | Done) == 0) begin
                    //Start sequencer
                    // Find unassigned PPO (11)
                    // Where associated PPI is assigned (not 00)
                    i = 32'b0;
                    j = 32'b0;
                    while(j < l && i == 0) begin
                        if((RF[j][2] == 1'b0) && (RF[j][1] == 1'b0)) begin
                            //No objective at this index in RF, skip
                        end
                        else if((PPO_1[j] == 1'b1) && (PPO_0[j] == 1'b1)) begin
                            //RF for this index is not empty and currently PPO is X, pick up as obj and halt sequencer
                            i = j;
                        end
                        else begin
                            //Current RF[j] = PPO[j], already traced, pick next possible objective
                        end
                        j = j + 1;
                    end
                    //Found unassigned ppo with index i
                    $display("obj-dec | push obj: %b", RF[i]);
                    obj = RF[i];
                    obj_set = ~obj_set;
                end
                else begin
                    $display("obj-dec | signal newframe_ready");
                    //Either DONE or in CONFLICT
                    // Signal the Dec_Block
                end
            end
        end
        else if (newvaluestoforward != last_newvaluestoforward) begin
            //Failure to trace on forward network
            $display("obj_dec | fwd network propagation failure");
            Conflict = 1'b1;
            last_newvaluestoforward = newvaluestoforward;
        end
        else begin
            //$display("obj-dec | doing nothing");
        end
    end
endmodule



module back_encoder(global_reset, clk, trace_start, PI_1, PI_0, PPI_1, PPI_0, NReady,
                    in, mode, genobj, propfail, newframe_ready);
    ////parameters - filled in generation
    parameter s = 22;

    //io
    input clk;
    input trace_start;
    input genobj;
    input [m-1:0] PI_1;
    input [m-1:0] PI_0;
    input [l-1:0] PPI_1;
    input [l-1:0] PPI_0;
    input newframe_ready;
    output reg NReady;
    output reg [13:0] in;
    output reg mode;
    output reg propfail;

    //vars
    reg [31:0] i;
    reg last_trace_start;
    reg [1:0] state;
    reg [m-1:0] PI_1_save;
    reg [m-1:0] PI_0_save;
    reg [l-1:0] PPI_1_save;
    reg [l-1:0] PPI_0_save;
    reg [m-1:0] PI_1_last;
    reg [m-1:0] PI_0_last;
    reg [l-1:0] PPI_1_last;
    reg [l-1:0] PPI_0_last;
    reg last_newframe_ready;
    reg delayed_nready;


    always @(posedge clk, posedge global_reset) begin
        if (global_reset) begin
            $display("back-enc | RESET");
            i = 32'b0;
            last_trace_start = 1'b0;
            mode = 1'b0;
            state = 2'b00;
            NReady = 1'b1;
            last_newframe_ready = 1'b0;
            in = 14'b11111111111111;
            delayed_nready = 1'b0;
            propfail = 1'b0;
            PI_1_save = {m{1'b0}};
            PI_0_save = {m{1'b0}};
            PPI_1_save = {l{1'b0}};
            PPI_0_save = {l{1'b0}};
            PI_1_last = {m{1'b0}};
            PI_0_last = {m{1'b0}};
            PPI_1_last = {l{1'b0}};
            PPI_0_last = {l{1'b0}};
        end

        else begin
            if (state == 2'b00) begin
                //STATE=IDLE
                if (trace_start != last_trace_start) begin
                    //Ready has toggled, a back-trace is in progress
                    $display("back-enc | trace started, waiting for results");
                    mode = 1'b1;
                    last_trace_start = trace_start;
                    state = 2'b01;
                    NReady = 1'b1;
                    propfail = 1'b0;
                end
                else if (newframe_ready != last_newframe_ready) begin
                    //Trigger Dec to move to new frame
                    $display("back-enc | forwarding newframe nready");
                    last_newframe_ready = newframe_ready;
                    NReady = 1'b0;
                end
                else if (delayed_nready == 1'b1) begin
                    //Assert delayed NReady signal
                    $display("back-enc | asserting delayed NReady");
                    NReady = 1'b0;
                end
                else if (NReady == 1'b0) begin
                    //Reset NReady to 1
                    $display("back-enc | recovering nready in idle");
                    NReady = 1'b1;
                    if (propfail) begin
                        propfail = 1'b0;
                    end
                end
            end
            else if (state == 2'b01) begin
                //STATE=WAITING ON TRACE
                if (PI_1 == PI_1_last && PI_0 == PI_0_last && PPI_1 == PPI_1_last && PPI_0 == PPI_0_last) begin
                    //Backtrace has failed (no value change)
                    $display("back-enc | no values change; signaling propagation failure");
                    propfail = 1'b1;
                    state = 2'b00;
                    mode = 1'b0;
                end
                else if (PI_1 == {m{1'b0}} && PPI_1 == {l{1'b0}}) begin
                    //Backtrace has failed (no objective propagation)
                    $display("back-enc | no obj propagated; signaling propagation failure");
                    propfail = 1'b1;
                    state = 2'b00;
                    mode = 1'b0;
                end
                else begin
                    //Backtrace success, save values
                    $display("back-enc | traced values received");
                    PI_1_save = PI_1;
                    PI_0_save = PI_0;
                    PPI_1_save = PPI_1;
                    PPI_0_save = PPI_0;
                    state = 2'b10;
                    NReady = 1'b0;
                end
            end

            else if (state == 2'b10) begin
                //STATE=PUSHING TO DEC BLOCK
                if (NReady) begin
                    //Ready to push an objective out
                    if (genobj) begin
                        //Obj encode requested
                        i = 32'b0;
                        while ((i < m) && (PI_1_save[i] != 1'b1)) begin
                            i = i + 1;
                        end

                        if (i >= m) begin
                            //No objective on PI, check PPI
                            if (l > 1) begin
                                i = 32'b0;
                                while ((i < l) && (PPI_1_save[i] != 1'b1)) begin
                                    i = i + 1;
                                end

                                if (i != l) begin
                                    //Found objective input with index i, PPI(i), encode
                                    $display("back-enc | found obj at PPI index %b: %b%b", i+m, PPI_0_save[i], !PPI_0_save[i]);
                                    in[13:4] = i + m;
                                    in[3] = 1'b0;
                                    in[2] = PPI_0_save[i];
                                    in[1] = !PPI_0_save[i];
                                    in[0] = 1'b0;
                                    PPI_1_save[i] = 1'b0;
                                    NReady = 1'b0;
                                end
                                else begin
                                    //No objective found, set mode to done
                                    $display("back-enc | no objectives; done");
                                    in = 14'b11111111111111;
                                    mode = 1'b0;
                                    state = 2'b00;
                                end
                            end
                            else begin
                                $display("back-enc | no objectives; done");
                                in = 14'b11111111111111;
                                mode = 1'b0;
                                state = 2'b00;
                            end
                        end
                        else begin
                            //Found objective input with index i, encode
                            $display("back-enc | found obj at PI index %b: %b%b", i, PI_0_save[i], !PI_0_save[i]);
                            in[13:4] = i;
                            in[3] = 1'b0;
                            in[2] = PI_0_save[i];
                            in[1] = !PI_0_save[i];
                            in[0] = 1'b0;
                            PI_1_save[i] = 1'b0;
                            NReady = 1'b0;
                        end
                    end
                end
                else begin
                    //Reset ready cycle
                    $display("back-enc | reset nready signal");
                    NReady = 1'b1;
                end
            end

            if (mode != 2'b01) begin
                //Not waiting on trace - save current back net values
                PI_1_last = PI_1;
                PI_0_last = PI_0;
                PPI_1_last = PPI_1;
                PPI_0_last = PPI_0;
            end
        end
    end
endmodule

module back_decoder(global_reset, clk, obj, obj_set, PO_1, PO_0, PPO_1, PPO_0, PI_1,
                    PI_0, PPI_1, PPI_0, priority_reset, trace_start);
    ////parameters - filled in generation
    parameter s = 22;

    //io
    input global_reset;
    input clk;
    input [13:0] obj;
    input obj_set;
    input [n-1:0] PO_1;
    input [n-1:0] PO_0;
    input [l-1:0] PPO_1;
    input [l-1:0] PPO_0;
    output reg [n-1:0] PI_1;
    output reg [n-1:0] PI_0;
    output reg [l-1:0] PPI_1;
    output reg [l-1:0] PPI_0;
    output reg priority_reset;
    output reg trace_start;

    //vars
    reg [31:0] i;
    reg [1:0] mode;
    reg last_obj_set;

    always @(posedge clk, posedge global_reset) begin
        if (global_reset) begin
            $display("back-dec | RESET");
            priority_reset = 1'b0;
            trace_start = 1'b0;
            i = 32'b0;
            mode = 2'b00;
            last_obj_set = 1'b0;
            PI_1 = {n{1'b0}};
            PI_0 = {n{1'b0}};
            PPI_1 = {l{1'b0}};
            PPI_0 = {l{1'b0}};
        end
        else begin
            if (mode == 2'b00 && obj_set != last_obj_set) begin
                //Obj has changed - start clear/encode sequence
                $display("back-dec | received new obj: %b", obj);
                mode = 2'b01;
                last_obj_set = obj_set;
            end
            else if(mode == 2'b01) begin
                //Clear the back net
                $display("back-dec | clearing back-net");
                PI_1 = {n{1'b0}};
                PI_0 = {n{1'b0}};
                PPI_1 = {l{1'b0}};
                PPI_0 = {l{1'b0}};
                priority_reset = ~priority_reset;
                mode = 2'b10;
            end
            else if(mode == 2'b10) begin
                $display("back-dec | pushing obj onto back-net: %b", obj);
                //Do back-net assignment
                for(i = 0; i < n; i = i + 1) begin
                    if(i == obj[13:4]) begin
                        //found obj index, insert obj
                        PI_1[i] = 1'b1;
                        PI_0[i] = obj[2];
                    end
                    else begin
                        //insert regular ppo
                        PI_1[i] = 1'b0;
                        PI_0[i] = PO_1[i];
                    end
                end

                if(l > 1) begin
                    for(i = m; i < m+l; i = i + 1) begin
                        if(i == obj[13:4]) begin
                            //found obj index, insert obj
                            PPI_1[i-m] = 1'b1;
                            PPI_0[i-m] = obj[2];
                        end
                        else begin
                            //insert regular ppo
                            PPI_1[i-m] = 1'b0;
                            PPI_0[i-m] = PPO_1[i-m];
                        end
                    end
                end

                trace_start = ~trace_start;
                mode = 2'b00;
            end
        end
    end
endmodule

module control (global_reset, clk, push, line_k_1, cnt_dir, nready, DONE, done,
                comp_tried, rw, line_k_X, cnt_enable, pop, empty, conflict, frame_k, frame_1, addr, FAIL,
                top_mark, value, rewrite0, rewrite1, swapwrite, cleartop, notclear, last_in, next_top,
                ninputs, genobj, sendvaluestoforward, inputwordallones, readingow, endofframe,
                beginstatecheck, donewithstatecheck, statecheckresult, blockisppi, owmark, tiplus1);
    input global_reset;
    input clk;
    input conflict;
    input line_k_1;
    input nready;
    input frame_k;
    input done;
    input comp_tried;
    input line_k_X;
    input frame_1;
    input [1:0] value;
    input top_mark;
    input empty;
    input [31:0] last_in;
    input next_top;
    input inputwordallones;
    input endofframe;
    input donewithstatecheck; //Is the dup state check done?
    input statecheckresult; //Did the dup state check pass?
    input owmark; //Passes in the mark bit from the output word from memram

    output reg push;
    output reg cnt_dir;
    output reg DONE;
    output reg [1:0] rw;
    output reg cnt_enable;
    output reg [13:0] addr;
    output reg FAIL;
    output reg pop;
    output reg rewrite0;
    output reg rewrite1;
    output reg swapwrite;
    output reg cleartop;
    output reg [1:0] notclear;
    output reg [2:0] ninputs;
    output reg genobj;
    output reg sendvaluestoforward;
    output reg readingow; //signals top if it should read Top or ow into VF
    output reg beginstatecheck; //signals to start dup state check prior to move back to Ti-1
    output reg blockisppi;
    output reg tiplus1;

    reg blockisppi2; //flag bit to control output to isppi during Ti+1
    reg [4:0] state;
    reg wwait;
    reg timinus1;
    reg waits5; //add another clock cycle to state 5
    reg waits6;
    reg waits9;
    reg waits12;
    //reg swapchecks12; //handle comp_tried checking in s12 -> s2?s10
    reg popchecks12; //handle top update after pop operation
    reg clearvalueschecks12; //handle vf update in backtrack top clear operation

    parameter s0 = 5'b00000; parameter s1 = 5'b00001; parameter s2 = 5'b00010; parameter s3 = 5'b00011;
    parameter s4 = 5'b00100; parameter s5 = 5'b00101; parameter s6 = 5'b00110; parameter s7 = 5'b00111;
    parameter s8 = 5'b01000; parameter s9 = 5'b01001; parameter s10 = 5'b01010; parameter s11 = 5'b01011;
    parameter s12 = 5'b01100; parameter s13 = 5'b01101; parameter s14 = 5'b01110; parameter s15 = 5'b01111;
    parameter s16 = 5'b10000; parameter s17 = 5'b10001; parameter s18 = 5'b10010; parameter s19 = 5'b10011;
    parameter s20 = 5'b10100; parameter s21 = 5'b10101; parameter s22 = 5'b10110;

    always @(posedge clk, posedge global_reset) begin
        if(global_reset) begin
            state = s0;
            tiplus1 = 0;
            timinus1 = 0;
            beginstatecheck = 1'b0;
            blockisppi = 1'b0;
            blockisppi2 = 1'b0;
            waits5 = 1'b1;
            waits6 = 1'b1;
            waits9 = 1'b1;
            waits12 = 1'b1;
            addr = 14'b11111111111111;
            wwait = 1'b0;
        end
        else if(nready == 1'b0 && wwait == 1'b0) begin
            wwait = 1'b1;
        end
        else begin
            case(state)
                s1: begin
                    if (inputwordallones) begin
                        state = s16;
                    end
                    else begin
                        state = s17;
                    end
                    $display ("mycontrol | in state 1");
                end
                s2: begin
                    addr = 14'b11111111111111;
                    state = s12;
                end
                s3: begin
                    if (tiplus1) begin
                        //Moving to Ti+1
                        $display ("mycontrol | in state 3 (Ti+1)");
                        if (!blockisppi2 && addr == 14'b11111111111111) begin
                            //In frame k - no Ti-2 to read to
                            tiplus1 = 1'b0;
                            state = s7;
                        end
                        else if (!owmark && !blockisppi2) begin
                            //Reading Ti-1 into VF
                            addr = addr - 1;
                            state = s22;
                        end
                        else if (owmark) begin
                            //Switch to frame Ti-2
                            blockisppi2 = 1'b1;
                            addr = addr - 1;
                        end
                        else if (blockisppi2 && (owmark || addr == 14'b11111111111111)) begin
                            //Done reading Ti-2
                            tiplus1 = 1'b0;
                            state = s7;
                        end
                        else begin
                            //Reading Ti-2 into RF
                            state = s3;
                            addr = addr - 1;
                        end
                    end
                    else begin
                        //Moving to Ti-1
                        $display ("mycontrol | in state 3 (Ti-1)");
                        if (addr == 14'b11111111111111 || endofframe == 1'b1) begin
                            state = s8;
                        end
                        else begin
                            state = s3;
                            addr = addr - 1;
                        end
                    end
                end
                s4: begin
                    blockisppi2 = 1'b0;
                    state = s0;
                end
                s5: begin
                    if (waits5) begin
                        $display ("mycontrol | in state 5 (cycle 1)");
                        waits5 = 1'b0;
                    end
                    else begin
                        waits5 = 1'b1;
                        addr = last_in;
                        state = s3;
                    end
                end
                s6: begin
                    if (waits6) begin
                        waits6 = 1'b0;
                    end
                    else begin
                        waits6 = 1'b1;
                        addr = 14'b11111111111111;
                        state = s12;
                    end
                end
                s7: begin
                    if(comp_tried) begin
                        state = s9;
                    end
                    else begin
                        state = s6;
                    end
                end
                s8: begin
                    state = s11;
                end
                s9: begin
                    if (waits9) begin
                        waits9 = 1'b0;
                    end
                    else begin
                        waits9 = 1'b1;
                        addr = 14'b11111111111111;
                        state = s12;
                    end
                end
                s10: begin
                    if (tiplus1) begin
                        state = s19;
                        addr = addr - 1;
                    end
                    else if (clearvalueschecks12) begin
                        state = s2;
                    end
                    else begin
                        blockisppi2 = 1'b0;
                        state = s0;
                    end
                end
                s11: begin
                    state = s4;
                end
                s12: begin
                    if (waits12) begin
                        waits12 = 1'b0;
                    end
                    else begin
                        waits12 = 1'b1;
                        if (popchecks12) begin
                            state = s18;
                        end
                        /*else if(!swapchecks12) begin
                            swapchecks12 = 1'b1;
                            state = s2;
                        end*/
                        else if(tiplus1) begin
                            state = s20;
                        end
                        else begin
                            state = s10;
                        end
                    end
                end
                s13: state = s13;
                s14: state = s14;
                s15: begin
                    state = s3;
                end
                s16: begin
                    $display("mycontrol | in state 16");
                    blockisppi2 = 1'b0;
                    state = s0;
                end
                s17: begin
                    state = s10;
                end
                s18: begin
                    if(top_mark) begin
                        tiplus1 = 1'b1;
                        timinus1 = 1'b0;
                        state = s5;
                    end
                    else if(comp_tried) begin
                        state = s9;
                    end
                    else begin
                        state = s6;
                    end
                end
                s19: begin
                    if (addr == 14'b11111111111111) begin
                        tiplus1 = 1'b0;
                    end
                    state = s10;
                end
                s20: begin
                    addr = last_in + 1;
                    state = s10;
                end
                s21: begin
                    if (donewithstatecheck) begin
                        if (statecheckresult) begin
                            //State check found duplicate - backtrack
                            state = s9;
                        end
                        else begin
                            //State check clean - move to Ti-1
                            state = s15;
                        end
                    end
                    else begin
                        addr = addr - 1;
                    end
                end
                s22: begin
                    state = s3;
                end
                default: begin
                    //Default case is assumed to be IDLE (s0)
                    if(frame_1 && !conflict && done) begin
                        state = s14;
                    end
                    else if(wwait) begin
                        if(!frame_k && !conflict && done && frame_1) state = s14;
                        if(!frame_k && !conflict && done && !frame_1) begin
                            //state = s15;
                            state = s21;
                            tiplus1 = 1'b0;
                            timinus1 = 1'b1;
                        end
                        if(!frame_k && !conflict && !done) state
                        if(!frame_k && conflict && !empty && comp_tried) state = s9;
                        if(!frame_k && conflict && !empty && !comp_tried) state = s6;
                        if(!frame_k && conflict && empty) begin
                            tiplus1 = 1'b1;
                            timinus1 = 1'b0;
                            state = s5;
                        end
                        if(frame_k && !line_k_X && !line_k_1) state = s13;
                        if(frame_k && !line_k_X && line_k_1) begin
                            state = s15;
                            tiplus1 = 1'b0;
                            timinus1 = 1'b1;
                        end
                        if(frame_k && line_k_X) state = s1;
                        if(frame_k && conflict) state = s13;

                        wwait = 1'b0;
                    end
                    else begin
                        blockisppi2 = 1'b0;
                        state = s0;
                    end
                end
            endcase
        end
    end

    always @(global_reset, state, waits6, waits9) begin
        if (global_reset) begin
            notclear = 2'b11;
            genobj = 1'b0;
            ninputs = 0;
        end
        else begin
            case(state)
                s0: begin
                    $display("mycontrol | executing state 0");
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    notclear = 2'b11;
                    genobj = 1'b0;
                    sendvaluestoforward = 1'b1;
                    //swapchecks12 = 1'b0;
                    popchecks12 = 1'b0;
                    readingow = 1'b0;
                    clearvalueschecks12 = 1'b0;
                end
                s1: begin
                    push = 1'b1;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    ninputs = ninputs + 1;
                end
                s2: begin
                    push = 1'b0;
                    pop = 1'b1;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    popchecks12 = 1'b1;
                end
                s3: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b10;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    ninputs = 1'b0;
                    notclear = 2'b11;
                    readingow = 1'b1;
                end
                s4: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b1;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                end
                s5: begin
                    ninputs = 1'b0;
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    //This had to be moved from above
                        rewrite0 = 1'b1;
                    end
                    else begin
                        rewrite0 = 1'b0;
                    end
                end
                s6: begin
                        push = 1'b0;
                        pop = 1'b0;
                        rw = 2'b00;
                        rewrite0 = 1'b0;
                        rewrite1 = 1'b0;
                        swapwrite = 1'b1;
                        cleartop = 1'b0;
                        cnt_enable = 1'b0;
                        DONE = 1'b0;
                        FAIL = 1'b0;
                        //swapchecks12 = 1'b1;
                    end
                    else begin
                        swapwrite = 1'b0;
                    end
                end
                s7: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b1;
                    cnt_dir = 1'b1;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    readingow = 1'b0;
                end
                s8: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b1;
                    cnt_dir = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                end
                s9: begin
                        cleartop = 1'b1;
                        push = 1'b0;
                        pop = 1'b0;
                        rw = 2'b00;
                        rewrite0 = 1'b0;
                        rewrite1 = 1'b0;
                        swapwrite = 1'b0;
                        cnt_enable = 1'b0;
                        DONE = 1'b0;
                        FAIL = 1'b0;
                    end
                    else begin
                        cleartop = 1'b0;
                    end
                end
                s10: begin
                    push = 1'b0;
                    pop = 1'b0;
                    if(!tiplus1) begin
                        rw = 2'b00;
                    end
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    genobj = 1'b1;
                    notclear = 2'b01;
                end
                s11: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    notclear = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                end
                s12: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b10;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    notclear = 2'b11;
                end
                s13: begin
                    $display("FAIL!!!");
                    FAIL = 1'b1;
                end
                s14: begin
                    $display("DONE!!");
                    DONE = 1'b1;
                end
                s15: begin
                end
                s16: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    genobj = 1'b1;
                    notclear = 2'b11;
                end
                s17: begin
                    push = 1'b0;
                end
                s18: begin
                    popchecks12 = 1'b0;
                end
                s19: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b10;
                    notclear = 2'b11;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    ninputs = 1'b0;
                end
                s20: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b00;
                    notclear = 2'b10;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    readingow = 1'b1;
                end
                s21: begin
                    push = 1'b0;
                    pop = 1'b0;
                    rw = 2'b10;
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    ninputs = 1'b0;
                end
                s22: begin
                    push = 1'b0;
                    pop = 1'b0;
                    if(!tiplus1) begin
                        rw = 2'b00;
                    end
                    rewrite0 = 1'b0;
                    rewrite1 = 1'b0;
                    swapwrite = 1'b0;
                    cleartop = 1'b0;
                    cnt_enable = 1'b0;
                    DONE = 1'b0;
                    FAIL = 1'b0;
                    genobj = 1'b1;
                    notclear = 2'b01;
                end
            endcase
        end
    end

endmodule

module fourcounter (global_reset, frame_1, cnt_dir, cnt_enable, frame_k);
    parameter s = 22;

    input cnt_dir;
    input cnt_enable;
    input global_reset;
    output reg frame_1;
    output reg frame_k;

    reg [l-1:0] tmp;

    always @(posedge cnt_enable, posedge global_reset) begin
        if (global_reset) begin
            $display ("fourcounter | resetting the frame counter");
            tmp = 2**l-1;
            if(l-1 == 0) begin
                frame_k = 1'b1;
                frame_1 = 1'b1;
            end
            else begin
                frame_k = 1'b1;
                frame_1 = 1'b0;
            end
        end
        else begin
            if(cnt_dir) begin
                if(tmp == 0) begin
                    frame_1 = 1'b1;
                end
                else if (tmp == 2**l-2) begin
                    frame_k = 1'b1;
                end
                else begin
                    frame_1 = 1'b0;
                    frame_k = 1'b0;
                end
                $display ("fourcounter | counting a frame UP");
                tmp = tmp + 1'b1;
            end
            else begin
                if(tmp == 2) begin
                    frame_1 = 1'b1;
                end
                else if (tmp == 2**l) begin
                    frame_k = 1'b1;
                end
                else begin
                    frame_1 = 1'b0;
                    frame_k = 1'b0;
                end
                $display ("fourcounter | counting a frame DOWN");
                tmp = tmp - 1'b1;
            end
        end
    end

endmodule

module decoder (top, value, top_mark, comp_tried);
    input [13:0] top;
    output reg [1:0] value;
    output reg top_mark;
    output reg comp_tried;

    always @(top) begin
        value = top[2:1];
        top_mark = top[3];
        comp_tried = top[0];
    end

endmodule
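
// Illustrative testbench sketch for the decoder module above. It is not part
// of the thesis sources: the module name decoder_tb, the instance name dut and
// the test words are assumptions made for this example. The words follow the
// field layout that decoder reads and that the back encoder writes:
// index in [13:4], mark in [3], value in [2:1], comp_tried in [0].
module decoder_tb;
    reg  [13:0] top;
    wire [1:0]  value;
    wire        top_mark;
    wire        comp_tried;

    decoder dut (.top(top), .value(value), .top_mark(top_mark), .comp_tried(comp_tried));

    initial begin
        // Wait one time unit so the decoder's always block is already
        // waiting on top before the first assignment is made
        #1;

        // Index 5, unmarked, value 2'b10, complement not yet tried
        top = {10'd5, 1'b0, 2'b10, 1'b0};
        #1;
        if (value !== 2'b10 || top_mark !== 1'b0 || comp_tried !== 1'b0)
            $display("decoder_tb | MISMATCH for top=%b", top);

        // Same index and value with the mark and comp_tried bits set
        top = {10'd5, 1'b1, 2'b10, 1'b1};
        #1;
        if (value !== 2'b10 || top_mark !== 1'b1 || comp_tried !== 1'b1)
            $display("decoder_tb | MISMATCH for top=%b", top);

        $display("decoder_tb | finished");
        $finish;
    end
endmodule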


module isppi (out, in, blockisppi);
    parameter s = 22;

    input [13:0] in;
    input blockisppi;
    output reg [13:0] out;

    always @(in, blockisppi) begin
        if(!blockisppi && in != 14'b11111111111111 && in[13:4] > m) begin
            $display("isppi | evaluating output to RF; in=%b", in);
            out = in;
        end
        else begin
            out = 14'b11111111111111;
        end
    end

endmodule

// Module: XC2V_RAMB_1_PORT
// Description: 18Kb Block SelectRAM-II example
// Single Port 512 x 36 bits
// Use template "SelectRAM_A36.v"
//
// Device: Virtex-II Pro Family
//-------------------------------------------------------------------
module XC2V_RAMB_1_PORT (CLK, SET_RESET, ENABLE, WRITE_EN, ADDRESS, DATA_IN, DATA_OUT);
    input CLK, SET_RESET, ENABLE, WRITE_EN;
    input [17:0] DATA_IN;
    input [9:0] ADDRESS;
    output [17:0] DATA_OUT;
    wire CLK_BUFG, INV_SET_RESET;

    //Use of the free inverter on SSR pin
    assign INV_SET_RESET = ~SET_RESET;

    // initialize block ram for simulation
    // synopsys translate_off
    defparam

        //"Read during Write" attribute for functional simulation
        U_RAMB16_S36.WRITE_MODE = "READ_FIRST", //WRITE_FIRST(default)/READ_FIRST/NO_CHANGE

        //Output value after configuration
        U_RAMB16_S36.INIT = 36'h000000000,

        //Output value if SSR active
        U_RAMB16_S36.SRVAL = 36'h012345678,

        //Plus bits initial content
        U_RAMB16_S36.INITP_00 = 256'h0123456789ABCDEF000000000000000000000000000000000000000000000000,

00001380 U_RAMB16_S36.INITP_01 =