Copyright © 2000 K. Keutzer


Technology Dependent Logic Optimization

Prof. Kurt Keutzer

EECS

University of California

Berkeley, CA

Thanks to S. Devadas


RTL Design Flow

[Figure: RTL design flow. Labels: HDL, RTL synthesis, netlist, logic optimization, netlist, library, physical design, layout, module generators, manual design. The example netlists carry signals a, b, s, d, clk, q.]


Logic Optimization

Perform a variety of transformations and optimizations

– Structural graph transformations

– Boolean transformations

– Mapping into a physical library

smaller, faster, less power

[Figure: logic optimization block with a netlist and a library as inputs and a netlist as output.]

Combinational Logic Optimization

Input:

• Initial Boolean network

• Timing characterization for the module
  - input arrival times and drive factors
  - output loading factors

• Optimization goals
  - output required times

• Target library description

Output:

• Minimum-area netlist of library gates which meets timing constraints

A very difficult optimization problem!


Modern Approach to Logic Optimization

Divide logic optimization into two subproblems:

• Technology-independent optimization
  - determine overall logic structure
  - estimate costs (mostly) independent of technology
  - simplified cost modeling

• Technology-dependent optimization (technology mapping)
  - binding onto the gates in the library
  - detailed technology-specific cost model

Orchestration of various optimization/transformation techniques for each subproblem


Logic Optimization

[Figure: logic optimization, netlist in and netlist out, divided into tech-independent and tech-dependent stages. Other labels: 2-level logic opt, multilevel logic opt, Library, Timing Constraints.]

“Closed Book” Technology Library

A standard cell technology or library may contain many hundreds of cells

Typical cells are NAND, NOR, NOT, AOI (And-Or-Invert), OAI (Or-And-Invert), etc.

[Figure: gate symbols for example cells with inputs A, B, C; one cell is labeled AB+C.]

Library

Contains for each cell:

– Functional information: cell = a * b * c

– Timing information: function of
  • input slew
  • intrinsic delay
  • output capacitance
  (non-linear models used in tabular approach)

– Physical footprint (area)

– Power characteristics

Wire-load models - function of

– Block size

– Wiring
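A minimal sketch of how one library entry might be held in memory, assuming a two-axis delay table indexed by input slew and output capacitance. The class name `Cell`, its field names, the bilinear interpolation, and all numbers in the table are illustrative assumptions, not taken from any real library format. Intrinsic delay, power, and wire-load models are left out to keep the sketch short.

```python
from bisect import bisect_right
from dataclasses import dataclass


def _segment(axis, x):
    """Index i of the table segment [axis[i], axis[i+1]] used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def _fraction(axis, i, x):
    """Position of x inside segment i, clamped to [0, 1] (no extrapolation)."""
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return max(0.0, min(1.0, t))


@dataclass
class Cell:
    name: str
    function: str   # functional information
    area: float     # physical footprint
    slews: list     # table axis: input slew
    loads: list     # table axis: output capacitance
    delays: list    # delays[i][j] is the delay at slews[i], loads[j]

    def delay(self, slew, load):
        """Bilinear interpolation in the tabular delay model."""
        i, j = _segment(self.slews, slew), _segment(self.loads, load)
        ts = _fraction(self.slews, i, slew)
        tl = _fraction(self.loads, j, load)
        low = (1 - tl) * self.delays[i][j] + tl * self.delays[i][j + 1]
        high = (1 - tl) * self.delays[i + 1][j] + tl * self.delays[i + 1][j + 1]
        return (1 - ts) * low + ts * high


# Made-up table values, for illustration only.
nand3 = Cell("NAND3", "!(a * b * c)", 4, [0.1, 0.5], [1.0, 3.0],
             [[1.0, 2.0], [2.0, 3.0]])
```

With this table, `nand3.delay(0.3, 2.0)` falls in the middle of the single table segment and so returns the average of the four corner values.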


Elements of a library - 1

Element     Area cost
INVERTER    2
NAND2       3
NAND3       4
NAND4       5

Elements of a library - 2

Element     Area cost
AOI21       4
AOI22       5

Reasonable Library

Inverter, Buffer

ND2 - ND4; NOR2 - NOR4; AND2 - AND4

AOI21 - AOI333; OAI21 - OAI333

XOR, XNOR

MUX, Full Adder

Neg-Edge Triggered D-Flip-Flop

Pos-Edge Triggered D-FF

J-K FF

Above with various clears, enables

Scan versions of each of the above

Most of the above in 6 different power sizes:

– 1x, 2x, 4x, 6x, 8x, 16x

Input Circuit Netlist

[Figure: the input circuit netlist, the “subject DAG”.]

Problem statement

Find an “optimal” (in area, delay, power) mapping of a circuit into the technology library (simple example below):

Is there a problem? Trivial Covering #1

[Figure: subject DAG covered using only NAND2 and INV cells.]

7 NAND2 (3) = 21
5 INV (2) = 10

Area cost 31

Covering #2

2 INV = 4
2 NAND2 = 6
1 NAND3 = 4
1 NAND4 = 5

Area cost 19

Covering #3

1 INV = 2
1 NAND2 = 3
2 NAND3 = 8
1 AOI21 = 4

Area cost 17

Costs: 31, 19, 17

Yes, there’s a problem!
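The three coverings above can be checked with a few lines of arithmetic. The sketch below uses the area costs from the library slides; the function name `cover_area` and the dictionary layout are illustrative choices only.

```python
# Area costs from the "Elements of a library" slides.
AREA = {"INV": 2, "NAND2": 3, "NAND3": 4, "NAND4": 5, "AOI21": 4, "AOI22": 5}


def cover_area(cell_counts):
    """Total area of a covering given as {cell name: number of instances}."""
    return sum(AREA[cell] * count for cell, count in cell_counts.items())


coverings = {
    "trivial (#1)": {"NAND2": 7, "INV": 5},
    "#2": {"INV": 2, "NAND2": 2, "NAND3": 1, "NAND4": 1},
    "#3": {"INV": 1, "NAND2": 1, "NAND3": 2, "AOI21": 1},
}

costs = {name: cover_area(cells) for name, cells in coverings.items()}
for name, cost in costs.items():
    print(f"covering {name}: area {cost}")
```

The same circuit costs 31, 19, or 17 area units depending only on which library cells are chosen.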


History of the Problem - 1

Technology mapping in 1986 was a big problem

• Almost every design group (e.g. AT&T) had their own library

– ASIC – 400 cells

– Microprocessor/DSP – 200 base cells

– Government – 200+ cells

• Every group had their own approach to mapping

– “Do what you have to do!”: handcrafted mappers tied to particular libraries and optimization tools

– “Rule-based” systems, e.g. GE Socrates: very slow “expert systems” that made no guarantee on the final quality of result

History of the Problem - 2

Yes, there are two problems:

– Technology mapping can significantly affect the area, speed, and power dissipation of a circuit

– There are over 200 different semiconductor vendors, each with multiple internal libraries. How do we create a tool that can utilize a diverse set of libraries?


A similar problem – code generation

Example of code generation in compilers using tree-covering

• Handles complex instruction sets ⇒ Handles complex libraries

• Easily portable to other instruction sets ⇒ Easily portable to other libraries

Problem Formulation: DAG Covering

Represent input netlist in normal form ⇒ subject DAG

Represent each library gate with normal forms for the logic function ⇒ primitive DAGs

Each primitive DAG has a cost

Goal: Find a minimum cost covering of the subject DAG by the primitive DAGs

Normal form: 2-input NAND gates and inverters

K. Keutzer, DAGON: Technology Binding and Local Optimization by DAG Matching, in Proceedings of the 24th Design Automation Conference, 1987 and 25 Years of Design Automation


Step 1: Extract Combinational Logic

[Figure: circuit drawn as a block of flip-flops and a block of combinational logic, with inputs and outputs.]

Since FFs don’t need to be optimized with the surrounding combinational logic, we can partition them out

Step 2: Normalize Circuit Netlist

Reduce the netlist into ND2 gates

[Figure: the normalized netlist, the “subject DAG”.]

Step 3a: Normalize library

Element     Area cost   Tree representation (normal form)
INVERTER    2           [figure]
NAND2       3           [figure]
NAND3       4           [figure]
NAND4       5           [figure]

Step 3b: Normalize library

Element     Area cost   Tree representation (normal form)
AOI21       4           [figure]
AOI22       5           [figure]

Sound Algorithmic approach

NP-hard optimization problem

Tree covering heuristic: if the subject and primitive DAGs are trees, an efficient algorithm can find the optimum cover ⇒ dynamic programming formulation

Step 4: DAG Covering

[Figure: subject DAG with a multiple-fanout point marked.]

K. Keutzer, D. Richards, Computational Complexity of Logic Synthesis and Optimization, in Proceedings of the International Workshop on Logic Synthesis, 1989

Solution formulation

1) Partition input netlist into forest of trees

2) Solve each tree optimally using tree covering

3) Stitch trees back together

Resulting Trees

Break at multiple fanout points


For each tree - Dynamic Programming

Principle of optimality: Optimal cover for a tree consists of a best match at the root of the tree plus the optimal cover for the sub-trees starting at each input of the match

[Figure: a tree with two candidate matches at its root. One match has inputs x, y, z; the other has inputs p, z.]

• Best cover for this match uses best covers for x, y, z

• Best cover for this match uses best covers for p, z

Choose the least-cost tree cover at the root

K. Keutzer, DAGON: Technology Binding and Local Optimization by DAG Matching, in Proceedings of the 24th Design Automation Conference, 1987
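The principle of optimality can be sketched as a short memoized recursion. This is an illustrative sketch, not the DAGON implementation: the node names `n1` to `n3`, the `MATCHES` table, and the function `best_cover` are hypothetical, and the candidate matches are assumed to have been generated already. The tiny subject tree is a NAND3 written in normal form, NAND2(INV(NAND2(a, b)), c).

```python
# Area costs from the library slides.
AREA = {"INV": 2, "NAND2": 3, "NAND3": 4}

# Candidate matches per gate: (cell, nodes feeding the inputs of that match).
# Names that never appear as keys (a, b, c) are primary inputs.
MATCHES = {
    "n1": [("NAND2", ["a", "b"])],
    "n2": [("INV", ["n1"])],
    "n3": [("NAND2", ["n2", "c"]), ("NAND3", ["a", "b", "c"])],
}


def best_cover(node, memo=None):
    """Return (cost, [(gate, cell), ...]) for the cheapest cover rooted at node."""
    if memo is None:
        memo = {}
    if node not in MATCHES:  # primary input: nothing to cover
        return 0, []
    if node not in memo:
        options = []
        for cell, inputs in MATCHES[node]:
            cost, chosen = AREA[cell], [(node, cell)]
            for source in inputs:
                sub_cost, sub_chosen = best_cover(source, memo)
                cost += sub_cost
                chosen = chosen + sub_chosen
            options.append((cost, chosen))
        # Keep only the least-cost option at this root.
        memo[node] = min(options, key=lambda option: option[0])
    return memo[node]


cost, cover = best_cover("n3")
print(cost, cover)
```

At the root, the NAND2 match costs 3 plus 5 for its subtree, while the NAND3 match covers all three gates for 4, so the NAND3 is chosen.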


Example of Optimal Tree Covering

[Figure: subject tree with each gate labeled by its best match and cumulative cost.]

NAND2: 3
AOI21: 4 + 3 = 7
INV: 11 + 2 = 13
NAND2: 2 + 6 + 3 = 11
NAND2: 3 + 3 = 6
NAND2: 3
INV: 2

DAG covering in detail

1) partition DAG into a forest of trees

2) normalize netlist

3) optimally cover each tree

a) generate all candidate matches

b) find the optimal match using dynamic programming


Partition DAG into Forest of trees

Each gate with fanout >1 becomes root of a new tree
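A minimal sketch of this partitioning step, assuming the netlist is given as a dictionary from each gate to the list of signals it reads. The function name `partition_into_trees` and the example netlist are hypothetical. Primary outputs are also treated as tree roots here, which is an assumption of the sketch.

```python
from collections import Counter


def partition_into_trees(fanins, outputs):
    """Split a DAG into trees rooted at outputs and at gates with fanout > 1.

    fanins maps each gate to the names it reads (gates or primary inputs).
    Returns {root: sorted list of the gates belonging to that tree}.
    """
    fanout = Counter(src for sources in fanins.values() for src in sources)
    roots = [g for g in fanins if fanout[g] > 1 or g in outputs]
    root_set = set(roots)
    trees = {}
    for root in roots:
        members, stack = [], [root]
        while stack:
            gate = stack.pop()
            members.append(gate)
            for src in fanins[gate]:
                # Stop at primary inputs and at the roots of other trees.
                if src in fanins and src not in root_set:
                    stack.append(src)
        trees[root] = sorted(members)
    return trees


# g1 feeds both g2 and g3, so it becomes the root of its own tree.
fanins = {
    "g1": ["a", "b"],
    "g2": ["g1", "c"],
    "g3": ["g1", "d"],
    "g4": ["g2", "g3"],
}
trees = partition_into_trees(fanins, outputs=["g4"])
print(trees)
```

Each resulting tree can then be covered independently, with the other roots acting as its leaves.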


Normalize netlist

Re-express the netlist using 2-input NAND gates and inverters

Make each tree left-oriented
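One way to picture the re-expression step is a recursive rewrite over expression trees. The sketch below assumes a nested-tuple representation with two-input AND and OR and single-input NOT; the helper names `normalize`, `inv`, and `evaluate` are hypothetical. It cancels double inverters, which is a simplifying choice of this sketch, and it does not perform the left-orientation step.

```python
from itertools import product


def inv(expr):
    """Invert expr, cancelling a double inversion."""
    if not isinstance(expr, str) and expr[0] == "INV":
        return expr[1]
    return ("INV", expr)


def normalize(expr):
    """Rewrite AND/OR/NOT expression trees into NAND2 and INV nodes only."""
    if isinstance(expr, str):  # primary input
        return expr
    op, *args = expr
    args = [normalize(arg) for arg in args]
    if op == "NOT":
        return inv(args[0])
    if op == "AND":  # a * b = INV(NAND2(a, b))
        return inv(("NAND2", args[0], args[1]))
    if op == "OR":   # a + b = NAND2(INV(a), INV(b))
        return ("NAND2", inv(args[0]), inv(args[1]))
    raise ValueError(f"unsupported operator: {op}")


def evaluate(expr, env):
    """Evaluate either form of expression under a {name: bool} assignment."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(arg, env) for arg in args]
    if op in ("NOT", "INV"):
        return not vals[0]
    if op == "AND":
        return vals[0] and vals[1]
    if op == "OR":
        return vals[0] or vals[1]
    if op == "NAND2":
        return not (vals[0] and vals[1])
    raise ValueError(f"unsupported operator: {op}")


original = ("OR", ("AND", "a", "b"), "c")
normalized = normalize(original)
print(normalized)
```

For `a * b + c` the result is `NAND2(NAND2(a, b), INV(c))`, and `evaluate` can be used to confirm that both forms agree on every input assignment.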


Generate candidate matches - 1

[Figure: subject tree.]

At the end of this segment each gate in the subject tree is annotated with every possible library cell that could be rooted at that gate

What are some ways we can generate matches?

Generating candidate matches -2

Naïve approach

– try to match each cell in the library with each node of the tree (libraries can be large! beware of large constants!!)

Better approach

– build tables such that only potential candidate matches are checked

Best approach

– fancy string matching: Introduction to Algorithms, T. Cormen, C. Leiserson, R. Rivest, The MIT Press, Second Printing, 1996, pp. 862-869

What’s the complexity of each approach?
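The naïve approach can be sketched directly: try every library pattern at every gate of the subject tree. The representation below (nested tuples, `"*"` as a pattern leaf), the `LIBRARY` patterns, and the function names are assumptions made for illustration. Only one orientation of each pattern is tried, which relies on the trees having been left-oriented.

```python
# Library cells as pattern trees in normal form; "*" matches any subtree.
LIBRARY = {
    "INV": ("INV", "*"),
    "NAND2": ("NAND2", "*", "*"),
    "NAND3": ("NAND2", ("INV", ("NAND2", "*", "*")), "*"),
    "AOI21": ("INV", ("NAND2", ("NAND2", "*", "*"), ("INV", "*"))),
}


def match(pattern, node):
    """Return the subject subtrees bound to the pattern leaves, or None."""
    if pattern == "*":
        return [node]
    if isinstance(node, str) or node[0] != pattern[0] or len(node) != len(pattern):
        return None
    leaves = []
    for sub_pattern, child in zip(pattern[1:], node[1:]):
        bound = match(sub_pattern, child)
        if bound is None:
            return None
        leaves.extend(bound)
    return leaves


def all_matches(subject):
    """List (gate subtree, [cells that can be rooted there]) for every gate."""
    found = []

    def visit(node):
        if isinstance(node, str):  # primary input
            return
        cells = [name for name, pattern in LIBRARY.items()
                 if match(pattern, node) is not None]
        found.append((node, cells))
        for child in node[1:]:
            visit(child)

    visit(subject)
    return found


subject = ("NAND2", ("INV", ("NAND2", "a", "b")), "c")
annotated = all_matches(subject)
for gate, cells in annotated:
    print(gate[0], cells)
```

Every gate is tested against every cell, so the work grows with the product of library size, tree size, and pattern size, which is the large constant the slide warns about.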


Optimal tree covering - 1

[Figure: “subject tree” with cost labels 3, 2, 2, 3.]

Optimal tree covering - 2

[Figure: “subject tree” with cost labels 5, 8, 3, 2, 2, 3.]

Optimal tree covering - 3

Cover with ND2 or ND3?

[Figure: “subject tree” with cost labels 3, 2, 2, 3, 8, 13, 5.]

ND2: 1 NAND2 (3) + subtree (5); area cost 8

ND3: 1 NAND3 = 4

Optimal tree covering – 3b

[Figure: “subject tree” with cost labels 3, 2, 2, 3, 8, 13, 5, 4.]

Label the root of the sub-tree with optimal match and cost

Optimal tree covering – 4a

Cover with INV or AOI21?

[Figure: “subject tree” with cost labels 5, 4, 3, 8, 2, 2, 13, 2.]

INV: 1 inverter (2) + subtree (13); area cost 15

AOI21: 1 AOI21 (4) + subtree 1 (3) + subtree 2 (2); area cost 9

Optimal tree covering – 4b

[Figure: “subject tree” with cost labels 5, 4, 3, 8, 2, 2, 13, 2, 9.]

Label the root of the sub-tree with optimal match and cost

Optimal tree covering - 5

Cover with ND2 or ND3?

[Figure: “subject tree” with cost labels 8, 4, 9, 2.]

NAND2: subtree 1 (9) + subtree 2 (4) + 1 NAND2 (3); area cost 16

NAND3: subtree 1 (8) + subtree 2 (2) + subtree 3 (4) + 1 NAND3 (4); area cost 18

Optimal tree covering – 5b

[Figure: “subject tree” with cost labels 16, 8, 4, 9, 2.]

Label the root of the sub-tree with optimal match and cost

Optimal tree covering - 6

Cover with INV or AOI21?

[Figure: “subject tree” with cost labels 5, 16, 13.]

INV: subtree 1 (16) + 1 INV (2); area cost 18

AOI21: subtree 1 (13) + subtree 2 (5) + 1 AOI21 (4); area cost 22

Optimal tree covering – 6b

[Figure: “subject tree” with cost labels 5, 16, 18, 13.]

Label the root of the sub-tree with optimal match and cost

Optimal tree covering - 7

[Figure: “subject tree”.]

Cover with ND2 or ND3 or ND4?

Cover 1 - NAND2

Cover with ND2?

[Figure: “subject tree” with cost labels 16, 18, 4, 9.]

subtree 1 (18) + subtree 2 (0) + 1 NAND2 (3); area cost 21

Cover 2 - NAND3

Cover with ND3?

[Figure: “subject tree” with cost labels 9, 4.]

subtree 1 (9) + subtree 2 (4) + subtree 3 (0) + 1 NAND3 (4); area cost 17

Cover 3 - NAND4

Cover with ND4?

[Figure: “subject tree” with cost labels 8, 4, 2.]

subtree 1 (8) + subtree 2 (2) + subtree 3 (4) + subtree 4 (0) + 1 NAND4 (5); area cost 19

Optimal Cover was Cover 2

Cover with ND3?

[Figure: “subject tree” with the chosen cells marked: AOI21, ND2, INV, ND3, ND3.]

INV 2
ND2 3
2 ND3 8
AOI21 4

Area cost 17

Clear that greedy doesn’t work well

What’s the complexity?
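The final choice at the root can be replayed as arithmetic. The sketch below takes the subtree costs quoted for Cover 1, Cover 2, and Cover 3 above and adds the area of the root cell; the variable names are illustrative only.

```python
# Area costs from the library slides.
AREA = {"INV": 2, "NAND2": 3, "NAND3": 4, "NAND4": 5, "AOI21": 4}

# Best-cover costs of the subtrees feeding each candidate match at the root.
root_options = {
    "NAND2": [18, 0],
    "NAND3": [9, 4, 0],
    "NAND4": [8, 2, 4, 0],
}

totals = {cell: AREA[cell] + sum(subtrees)
          for cell, subtrees in root_options.items()}
best = min(totals, key=totals.get)

# Cells used by the winning cover, as listed on the slide.
final_cover = {"INV": 1, "NAND2": 1, "NAND3": 2, "AOI21": 1}
final_area = sum(AREA[cell] * count for cell, count in final_cover.items())

print(totals, best, final_area)
```

The totals are 21, 17, and 19, so the NAND3 match wins, and the cells of the winning cover sum to the same 17.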


Computational Complexity

To determine the optimal cover for a tree we only need to consider a best-cost match at the root of the tree

– This is constant-time in the number of matched cells

Plus the optimal cover for the sub-trees starting at each input of the match

– This is constant-time in the indegree/fan-in of each match

[Figure: a tree with two candidate matches at its root. One match has inputs x, y, z; the other has inputs p, z.]

• Best cover for this match uses best covers for x, y, z

• Best cover for this match uses best covers for p, z

Choose the least-cost tree cover at the root

What’s the complexity?

O(n) - amazing!
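The argument can be written as a recurrence. The notation here is an assumption introduced for illustration: M(v) is the set of candidate matches rooted at gate v, inputs(m) is the set of subject nodes feeding match m, and area(m) is the area cost of the matched cell.

```latex
\mathrm{cost}(v) \;=\; \min_{m \in M(v)} \Bigl( \mathrm{area}(m) \;+\; \sum_{u \in \mathrm{inputs}(m)} \mathrm{cost}(u) \Bigr),
\qquad
\mathrm{cost}(v) = 0 \ \text{ if } v \text{ is a primary input.}
```

Each gate is evaluated once. If the number of matches per gate and the fan-in of each match are bounded by constants that depend only on the library, the total work is proportional to the number of gates n.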


Enhancements to DAG covering

Many enhancements incorporated over the last decade

• Timing optimization incorporating load-dependent delays
  – Rudell - UCB

• Optimization for low power

• Application to FPGAs
  – J. Rose - Chortle
  – J. Cong - Flowmap

• Optimal direct DAG covering without tree covering approximation (didn’t net much)

Summary of Technology Mapping

DAG covering formulation

– Separated library issues from mapping algorithm

Heuristics based on tree covering for area and delay

– surprisingly efficient final result - for technology/library dependent reasons

Very efficient

– linear time

Very flexible approach

– applicable to a wide range of libraries (standard cell, gate array) and technologies (FPGAs)

Best enhancement is integration of technology decomposition

Also requires “follow up” rule-based approaches for best final circuit efficiency

Why does this approximation work well?

Each gate with fanout >1 becomes root of a new tree


Why does this approximation work well?

Few non-tree cells – XOR, MUX – one-level deep


Why does this approximation work well?

Non-tree matching usually requires duplication – rarely a benefit for area

Retrospective

DAG covering by tree-covering is effective for four reasons

• Separates library definition and characterization from the mapping algorithm

• Duplication of logic is not a win in terms of area optimization; the advantage of duplicating logic for timing is very (physical) context dependent

• Provides an efficient mapping in what appears to be a relatively flat solution space

• Very computationally efficient, so suitable for VLSI-scale (millions of gates) netlists

Principal weaknesses

• Problems handling multiplexor-trees, full-adders, other DAG patterns

• Problems in performing performance optimization tricks in tight pipelined logic


Extra Slides


Typical library costs

[Figure: library elements annotated with costs 2, 3, 4 and 3, 3, 7.]

But what if?

[Figure: library elements annotated with costs 2, 3, 4 and 3, 3, 4.]

Strong (or Boolean) Division

Given a function f to be strong divided by g

– Add an extra input to f corresponding to g, namely G, and obtain function h as follows:

$h_{DC} = G\,\overline{g} + \overline{G}\,g$

$h_{ON} = f_{ON} - h_{DC}$

$h_{OFF} = \overline{f_{ON} + h_{DC}}$

– Minimize h using a two-level minimizer
