Asynchronous Design Using Commercial HDL Synthesis Tools

Michiel Ligthart

Karl Fant

Ross Smith

Alexander Taubin

Alex Kondratyev

Outline

Added Value of NCL - Simplification of design
Canonical form of gates - The key for optimization
NCL in CAD flow. An example
Validation of optimization
Experimental results
Conclusion and future work

Added Value of NCL - Simplification of design

Potential NCL Advantages

Inherent to asynchronous design
- no clock system
- low EMI
- free stand-by mode, etc.

Particular to NULL Convention Logic (NCL)
- ease of design (reduced time to market): use standard HDL and commercial tools to simulate and synthesize asynchronous circuits
- nicely fits current/future (DSM) technology

Inherent to delay-insensitive design
- easy to reuse design
- plug-'n'-play SoC design
- easily portable among technologies

Canonical form of gates - The key for optimization

Data Communication Based on DI Encoding

[Figure: combinational circuitry between two registers, with completion detection producing the request for DATA/NULL; completion is detected by codeword. Signal labels: DATA, NULL]

• DI protocol with spacer (NULL)
  - NULL propagation / NULL acknowledge
  - Data propagation / Data acknowledge

NCL: Pushing Two-phase Behavior Down to the Level of Each Gate

[Figure: a logic gate shown in two states: no data present, and complete data present]

Gate output acknowledges input changes
Simplest DI encoding - dual-rail [Sims'58]
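
The dual-rail encoding with a NULL spacer can be sketched in a few lines of Python. This is only an illustrative model, not code from the presentation: the names `encode`, `decode`, `is_complete`, and `is_null` are assumptions, and each signal is written as a pair (rail1, rail0).

```python
from typing import Optional, Sequence, Tuple

Rails = Tuple[int, int]          # (rail1, rail0), an assumed ordering

NULL: Rails = (0, 0)             # spacer: no data present
DATA0: Rails = (0, 1)            # logic 0: rail0 asserted
DATA1: Rails = (1, 0)            # logic 1: rail1 asserted


def encode(bit: Optional[int]) -> Rails:
    """Map None to NULL, 0 to DATA0, and 1 to DATA1."""
    if bit is None:
        return NULL
    return DATA1 if bit else DATA0


def decode(rails: Rails) -> Optional[int]:
    """Return the bit carried by a codeword, or None for NULL."""
    if rails == NULL:
        return None
    if rails == (1, 1):
        raise ValueError("both rails asserted: not a dual-rail codeword")
    return 1 if rails == DATA1 else 0


def is_complete(word: Sequence[Rails]) -> bool:
    """Completion by codeword: every signal carries valid data."""
    return all(r in (DATA0, DATA1) for r in word)


def is_null(word: Sequence[Rails]) -> bool:
    """True when every signal in the word has returned to NULL."""
    return all(r == NULL for r in word)
```

A word such as `[encode(1), NULL]` is neither complete nor NULL, which is the intermediate state a completion detector must wait out before acknowledging.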

General Implementation of Hysteresis Gates in CMOS

[Figure: CMOS gate with a p-tree and an n-tree, Set function and Reset function blocks, inputs x1 ... xn, output g]

g = S + gR

Dual-rail circuits under two-phase operation:
• A transition from NULL to Data is monotonic
• An input transition to NULL resets all gates to NULL

Set is positively unate
Reset: R(x1, ..., xn) = x1 + x2 + ... + xn
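
A behavioral sketch of the hysteresis equation g = S + gR, assuming the reset function is the OR of all inputs, so that the gate holds its output until every input has returned to NULL. The class name `HysteresisGate` and its interface are illustrative, not taken from the presentation.

```python
from typing import Callable, Sequence


class HysteresisGate:
    """Behavioral model of g = S + g*R with R = x1 + x2 + ... + xn."""

    def __init__(self, set_fn: Callable[[Sequence[int]], int]):
        self.set_fn = set_fn   # positively unate set function S
        self.g = 0             # assumed initial state: NULL

    def update(self, x: Sequence[int]) -> int:
        s = self.set_fn(x)
        r = int(any(x))                      # R is 1 while any input is 1
        self.g = int(s or (self.g and r))    # set, hold, or reset
        return self.g


# Example: a two-input C-element has S = x1 * x2.
c_element = HysteresisGate(lambda x: int(all(x)))
trace = [c_element.update(v) for v in [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)]]
```

In this trace the output rises only when both inputs are 1, stays at 1 while one input is still 1, and returns to 0 only when both inputs are 0.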

Refined Implementation of NCL Hysteresis Gates in CMOS

[Figure: CMOS gate with an n-tree implementing the Set function, inputs x1 ... xn, output g; the set function is marked "Room for optimization"]

g = S + gR

R(x1, ..., xn) = x1 + x2 + ... + xn
Depends only on the number of inputs

Canonical form of reset is the key to use synchronous optimization tools

Reset of each individual gate scales up to the whole network

Family of Logic Gates

M of N threshold gates with hysteresis behavior

[Figure: family of threshold gate symbols labeled with thresholds 1 to 5, with the C-element equivalents and the OR gate equivalents marked]

DIMS [Muller'62][Sparso'92]

Example: 2-of-3 Threshold Gate with Hysteresis

z = ab + ac + bc + z(a + b + c)

• The gate switches
  - to data when M inputs are data
  - to NULL when all inputs are NULL
• It is possible to use "negative logic" by reversing the pull-up and pull-down networks

[Figure: transistor-level schematic with inputs a, b, c and output z]
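
The switching rule of an M-of-N threshold gate with hysteresis can be checked against the 2-of-3 equation with a small behavioral model. This is a sketch under the assumption that inputs are plain 0/1 values; `ThresholdGate` and `th23_equation` are illustrative names.

```python
from itertools import product


class ThresholdGate:
    """M-of-N threshold gate with hysteresis (behavioral model)."""

    def __init__(self, m: int, n: int):
        if not 1 <= m <= n:
            raise ValueError("need 1 <= m <= n")
        self.m, self.n, self.z = m, n, 0

    def update(self, *inputs: int) -> int:
        if len(inputs) != self.n:
            raise ValueError("wrong number of inputs")
        ones = sum(inputs)
        if ones >= self.m:      # M inputs are data: switch to data
            self.z = 1
        elif ones == 0:         # all inputs are NULL: switch to NULL
            self.z = 0
        return self.z           # otherwise hold the previous value


def th23_equation(z: int, a: int, b: int, c: int) -> int:
    """z = ab + ac + bc + z(a + b + c)."""
    return int((a and b) or (a and c) or (b and c) or (z and (a or b or c)))


def model_matches_equation() -> bool:
    """Compare the model and the equation for every state and input."""
    for z0, a, b, c in product((0, 1), repeat=4):
        gate = ThresholdGate(2, 3)
        gate.z = z0
        if gate.update(a, b, c) != th23_equation(z0, a, b, c):
            return False
    return True
```

`model_matches_equation()` walks all 16 combinations of previous output and inputs, so agreement here means the threshold description and the Boolean equation describe the same gate.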

NCL in CAD flow. An example

RTL Design Flow - Combinational Optimization

Separate combinational logic and registers

[Figure: a combinational process and a sequential process, with "request for data/null" and reset signals]

- Sequential process: replaced by NCL registration in RTL code
- Combinational process: subject of synthesis and optimization (the topic of this presentation)

Two-Step Synthesis Flow (Using Synopsys' Design Compiler)

Step 1. Translate HDL into "synchronous" netlist
Step 2. Convert intermediate netlist into NCL netlist

[Figure: flow diagram. VHDL and the generic library feed the first Synthesis step, which produces the intermediate netlist; the dual-rail definition and the NCL library feed the second Synthesis step, which produces the NCL netlist]

Input to Step 1: RTL Description (Multiplexer Example)

• RTL description (MUX)

    entity test is
      port (a, b, s : in  ncl_logic;
            z       : out ncl_logic);
    end entity test;

    -- architecture name is illustrative
    architecture rtl of test is
    begin
      process (a, b, s) is
      begin
        if s = '1' then
          z <= a;
        else
          z <= b;
        end if;
      end process;
    end architecture rtl;

[Figure: multiplexer symbol with inputs a, b, select s, and output z]

MUX Example: Output of Step 1 / Input to Step 2: Intermediate Netlist

[Figure: the multiplexer (inputs a, b, s; output z) implemented with two-input NAND gates, with internal nodes x and y]

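Dual-rail Package

• Define type

    type dual_rail_logic is record
      rail1 : std_logic;
      rail0 : std_logic;
    end record;

[Figure: signal a with values {0, 1, N} carried on two rails a.0 and a.1, each with values {0, 1}]

• Overload operators

  function "not": inputs a.0, a.1 are wired to outputs z.1, z.0 (the rails are swapped)
  function "nand": four th22 gates and one th13 gate produce z.0, z.1 from a.0, a.1, b.0, b.1

th22 = two-input C-element
th13 = three-input OR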
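
A behavioral sketch of the overloaded "not" and "nand" operators, mirroring the dual_rail_logic record. The wiring shown is the usual DIMS arrangement and is an assumption here: one th22 per input combination, with the three combinations that give NAND = 1 merged by a th13. The names `DualRail`, `TH`, `dr_not`, and `DualRailNand` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DualRail:
    rail1: int
    rail0: int


NULL = DualRail(0, 0)
ZERO = DualRail(0, 1)
ONE = DualRail(1, 0)


class TH:
    """Threshold gate with hysteresis: set at m ones, reset at zero ones."""

    def __init__(self, m: int):
        self.m, self.z = m, 0

    def __call__(self, *x: int) -> int:
        ones = sum(x)
        if ones >= self.m:
            self.z = 1
        elif ones == 0:
            self.z = 0
        return self.z


def dr_not(a: DualRail) -> DualRail:
    """Inversion needs no gates: the two rails are swapped."""
    return DualRail(rail1=a.rail0, rail0=a.rail1)


class DualRailNand:
    """Four th22 gates (C-elements) and one th13 gate (three-input OR)."""

    def __init__(self):
        self.c00, self.c01, self.c10, self.c11 = (TH(2) for _ in range(4))
        self.or3 = TH(1)

    def __call__(self, a: DualRail, b: DualRail) -> DualRail:
        m00 = self.c00(a.rail0, b.rail0)   # a = 0, b = 0
        m01 = self.c01(a.rail0, b.rail1)   # a = 0, b = 1
        m10 = self.c10(a.rail1, b.rail0)   # a = 1, b = 0
        m11 = self.c11(a.rail1, b.rail1)   # a = 1, b = 1
        # NAND is 0 only for a = b = 1; the other three cases give 1.
        return DualRail(rail1=self.or3(m00, m01, m10), rail0=m11)
```

Because every C-element waits for both of its inputs, the output stays NULL until both operands carry data, and it returns to NULL only after both operands have returned to NULL.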

Optimizing with Design CompilerOptimizing with Design Compiler

Dual-rail expansion

Two phases (set and reset) are separated

Set phase ensures circuit functionality

Reset phase is implied

Optimizations are applied to the set phase

Dual-rail Expansion of MUX

[Figure: the NAND gates of the intermediate netlist (inputs a, s, b; nodes x, y, z) are replaced by three dual-rail NANDs (D-R NAND) with rails a.t/a.f, b.t/b.f, s.t/s.f, x.t/x.f, y.t/y.f, z.t/z.f]

Naive semi-static DIMS implementation: 114 transistors (can be reduced to 63 transistors by merging C-elements with OR-gates) versus 14 for a synchronous circuit

"Images" - Boolean Gates Implementing Set Functions

NCL gates                                  Boolean gates (images)
(hysteresis, sequential behavior)          (combinational behavior)

th22:    z = ab + z(a + b)                 z = ab
OR:      z = a + b                         z = a + b
th33w2:  z = a(b + c) + z(a + b + c)       z = a(b + c)
...                                        ...

The two columns are equivalent for the set phase.
In the initial state: z = a = b = c = 0

Projection for optimization: NCL gates to images
Mapping for implementation: images to NCL gates

Image of Dual-rail NAND Gate

[Figure: dual-rail NAND (D-R NAND) with inputs a.t, a.f, b.t, b.f and outputs out.t, out.f, built from four C-elements (C), shown next to its image]

C-element equation: z = ab + z(a + b), initially z = a = b = 0

In a set phase it behaves like an AND gate: z = ab
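
The claim that a hysteresis gate behaves like its Boolean image during the set phase can be checked exhaustively for small gates. The sketch below assumes the reset function is the OR of all inputs, starts from the all-zero state, and raises inputs one at a time in every possible order; the function names are illustrative.

```python
from itertools import permutations


def hysteresis_step(set_fn, z, x):
    """One evaluation of g = S + g*R with R = x1 + ... + xn."""
    return int(set_fn(x) or (z and any(x)))


def set_phase_matches_image(set_fn, n):
    """Check every monotonic 0 -> 1 input ordering from the all-zero state."""
    for final in range(2 ** n):
        rising = [i for i in range(n) if (final >> i) & 1]
        for order in permutations(rising):
            x, z = [0] * n, 0
            for i in order:
                x[i] = 1
                z = hysteresis_step(set_fn, z, x)
                if z != set_fn(x):      # image = plain Boolean set function
                    return False
    return True


def th22_image(x):
    return int(x[0] and x[1])               # z = ab


def th33w2_image(x):
    return int(x[0] and (x[1] or x[2]))     # z = a(b + c)
```

The equivalence holds only while inputs rise. Once an input falls, the gate holds its output while the image does not: `hysteresis_step(th22_image, 1, [1, 0])` is 1, whereas `th22_image([1, 0])` is 0.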

Dual-rail Expansion for MUX

[Figure: gate-level dual-rail MUX with rails a.t/a.f, b.t/b.f, s.t/s.f, x.t/x.f, y.t/y.f, z.t/z.f]

Twelve 2-input C-gates and three 3-input OR-gates
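
The gate count can be reproduced with a small two-phase simulation. This is a sketch under stated assumptions: the NAND wiring x = NAND(a, s), y = NAND(b, NOT s), z = NAND(x, y) is assumed rather than read from the slide, inversion is modeled as a rail swap, and signals are pairs (true rail, false rail). All class and function names are illustrative.

```python
from collections import Counter

NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)    # (true rail, false rail)


class TH:
    """M-of-N threshold gate with hysteresis."""

    def __init__(self, m, n):
        self.m, self.z, self.name = m, 0, f"th{m}{n}"

    def __call__(self, *x):
        ones = sum(x)
        if ones >= self.m:
            self.z = 1
        elif ones == 0:
            self.z = 0
        return self.z


class DRNand:
    """Dual-rail NAND: four th22 gates feeding one th13 gate."""

    def __init__(self):
        self.c = [TH(2, 2) for _ in range(4)]
        self.o = TH(1, 3)

    def __call__(self, a, b):
        (at, af), (bt, bf) = a, b
        ff, ft = self.c[0](af, bf), self.c[1](af, bt)
        tf, tt = self.c[2](at, bf), self.c[3](at, bt)
        return (self.o(ff, ft, tf), tt)     # output is false only for tt


class DRMux:
    """z = a when s = 1, else b, built from three dual-rail NANDs."""

    def __init__(self):
        self.nx, self.ny, self.nz = DRNand(), DRNand(), DRNand()

    def __call__(self, a, b, s):
        s_not = (s[1], s[0])                # inversion is a rail swap
        x = self.nx(a, s)
        y = self.ny(b, s_not)
        return self.nz(x, y)


def gate_counts(mux):
    """Count the threshold gates used by the three NANDs, by gate name."""
    counts = Counter()
    for nand in (mux.nx, mux.ny, mux.nz):
        for gate in nand.c + [nand.o]:
            counts[gate.name] += 1
    return dict(counts)
```

Under these assumptions `gate_counts(DRMux())` reports twelve th22 gates and three th13 gates, matching the count above, and alternating NULL and data phases on the inputs yields NULL and the selected operand on z.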

Image Circuit of Dual-rail Expansion for MUX

[Figure: image circuit of the dual-rail MUX with rails a.t/a.f, b.t/b.f, s.t/s.f, x.t/x.f, y.t/y.f, z.t/z.f]

Optimized with Design Compiler

The MUX circuit passes technology-independent optimization and is mapped to "images" of gates from the NCL library.

[Figure: optimized circuit with inputs a.t, a.f, b.t, b.f, s.t, s.f and outputs z.t, z.f]

- image of th33w2: A(B+C)
- image of thXOR: AB+CD

Technology Mapping with Design Compiler

NCL circuit: images are replaced by gates with hysteresis

[Figure: mapped MUX with inputs a.t, a.f, b.t, b.f, s.t, s.f and outputs z.t, z.f, built from thXOR and th33w2 gates]

[Figure: semi-static CMOS implementation of thXOR. Other labels in the figure: th22, th24w2, and nodes e, f, k, m, n]

44 transistors - 30% better than optimized DIMS

Validation of optimization

Optimization Flow

Boolean circuit
  -> translation (dual-rail package) -> Dual-rail image
  -> optimization (Design Compiler) -> Optimized circuit
  -> tech. mapping (Design Compiler) -> Mapped to images

[Figure: other labels in the flow diagram: Synchronous, Asynchronous, DIMS circuit, Hysteresis gates, DI equivalence, Virtual object, Real object]

Validation of Optimization

The validity of transformations (DI equivalence) is based on two properties:
- Functional equivalence of optimized and original circuits (under two-phase operation)
- Maintenance of DI properties in optimized circuit

Both are based on the properties of prime and irredundant networks and properties of algebraic factorization [Brayton'90, Hachtel'92]

Validation of Optimization: Idea of the Proof

Starting point: prime and irredundant Boolean network (known to be 100% stuck-at testable [Scherz'72])

Algebraic transformations: the set of test vectors for stuck-at faults is maintained [Hachtel'92]

Induction by topology order

Testability: each gate acknowledges input changes (delay insensitivity)

Same for tree-based technology mapping

Experimental results

Manual vs. Synthesized Designs

[Figure: bar chart of area (transistor number) for manual and synthesized designs; axis from 0 to 4500]

For bigger circuits the Synthesis/Manual ratio is better (22% improvement for biggest example)

Synchronous vs. NCL design

[Figure: bar charts comparing clocked and NCL designs in gates (axis from 0 to 2500) and in transistors (axis from 0 to 35000)]

Penalty in transistors:
- Dual-rail implementation
- Effective delay-insensitivity

To reduce transistor count:
- Use four-rail encoding
- Improve architectural solutions: e.g., OR instead of MUX
- Compromise delay insensitivity

Conclusion and future work

Conclusions

• First methodology to use standard HDL and commercial tools both to simulate and synthesize asynchronous circuits

• The methodology is formally validated

• The results of the synthesis are acceptable

Future Tasks

Reduce area/power without losing delay insensitivity (e.g., four-rail design)

Relax DI requirements to reduce area (e.g., using timing assumptions)

Use peephole optimizations (e.g., merge gates used for registration with their input gates etc.)

Write DesignWare components to get better performance for arithmetic units (infer hand designed components)

