CSE 242A Integrated Circuit Layout Automation
Lecture: Partitioning
Winter 2009
Chung-Kuan Cheng
Outline
Motivation
Formulations: 2-way partitioning, multi-way partitioning, multi-level partitioning, replication cut, clustering
Net Modeling
Algorithms
Optimal Methods: special cases, branch and bound
Heuristic Methods: group migration, network flow, clustering, simulated annealing, genetic approaches
Motivation
Huge designs: 100 million transistors
Design analysis
Engineering change orders
Good partitioning makes a difference in design quality
Motivation: Applications
Physical Hierarchy
Divide and Conquer
Project decomposition
Complexity reduction of each level
Emulation
Hardware & Software codesign
Management of design resources
Formulations: Two Way Partitioning
Random two-way partition
Probability p that an n-pin net is cut: $p = \frac{2^n - 2}{2^n}$
(Figure: pins of a net randomly assigned to sides a and b.)
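The probability that a net is cut under a random two-way partition can be checked numerically. The sketch below assumes each pin is placed independently and uniformly on one of the two sides, so a net stays uncut only when all of its pins land on the same side. The function names are illustrative and not from the lecture.

```python
from itertools import product

def cut_probability(n):
    """Closed form: 2 of the 2^n side assignments leave the net uncut."""
    return 1 - 2 / 2**n

def cut_probability_enum(n):
    """Brute-force check: enumerate all 2^n side assignments of the pins."""
    cut = sum(1 for sides in product((0, 1), repeat=n) if len(set(sides)) == 2)
    return cut / 2**n
```

Under this assumption the probability approaches 1 quickly as the pin count grows, which is why large nets are almost always cut by a random partition.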
Min-Cut
$$\min_{X} C(X,\bar{X}) = \sum_{i \in X,\ j \in \bar{X}} C_{ij}, \qquad s \in X,\ t \in \bar{X}$$
$X = \{s\}$ or $\bar{X} = \{t\}$
Theorem:
There exists an optimal linear placement s.t. $X$ and $\bar{X}$ are separated.
Trend:
Min-Cut
Opt->Linear Placement
Trend: X=V
S
S
( , ) ( , )min
| |
C X X C X s
X
Bisection
$$\min C(X,\bar{X}) \quad \text{s.t.} \quad |X| + |\bar{X}| = \text{Const}, \quad L_1 \le |X| \le U_1, \quad L_2 \le |\bar{X}| \le U_2$$
Trend: $|X| = L_1$ or $|\bar{X}| = L_2$
Ratio Cut
$$\min_{X} \frac{C(X,\bar{X})}{|X|\,|\bar{X}|}$$
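To make the difference between the min-cut and ratio-cut objectives concrete, here is a minimal sketch that evaluates both on a small weighted graph. The edge-list representation `(u, v, weight)` and the function names are assumptions made for illustration.

```python
def cut_cost(edges, X):
    """C(X, X-bar): total weight of edges with exactly one endpoint in X."""
    X = set(X)
    return sum(w for u, v, w in edges if (u in X) != (v in X))

def ratio_cut(edges, nodes, X):
    """C(X, X-bar) / (|X| * |X-bar|); both sides must be non-empty."""
    X = set(X)
    rest = set(nodes) - X
    if not X or not rest:
        raise ValueError("both sides of the partition must be non-empty")
    return cut_cost(edges, X) / (len(X) * len(rest))

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
EDGES = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
NODES = range(6)
```

On this example, splitting off the single node 0 cuts 2 edges for a ratio of 2/5, while the balanced split along the bridge cuts 1 edge for a ratio of 1/9, so the ratio objective prefers the natural clusters without an explicit size constraint.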
Multi-Way Partitioning
K-Way Partitioning
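(Figure: node set divided into subsets X1, X2, ..., Xk.)
(i) $\min C(X_1, X_2, \ldots, X_k) = \sum_{\text{net}} C_{\text{net}}$, summed over every net that is not contained in any subset, subject to $L_i \le |X_i| \le U_i$
(ii) $E(X_i) \le C_i$, where $E(X_i) = \sum_{n:\ n \cap X_i \ne \emptyset,\ n \not\subseteq X_i} C_n$
(iii) $\min \max_i E(X_i)$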
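The k-way cost measures can be sketched as follows, assuming a net counts toward the total cost when its pins fall in more than one subset, and toward the external cost E(X_i) of every subset it touches without being contained in it. The data layout (nets as `(pins, weight)` pairs, subsets as Python sets) is a hypothetical choice for illustration.

```python
def kway_costs(nets, parts):
    """Return (total cut cost, list of external costs E(X_i)).

    nets:  list of (pins, weight) pairs, pins being an iterable of node names
    parts: list of disjoint sets X_1..X_k covering all nodes
    """
    where = {v: idx for idx, part in enumerate(parts) for v in part}
    total = 0
    external = [0] * len(parts)
    for pins, weight in nets:
        touched = {where[v] for v in pins}
        if len(touched) > 1:          # net is not contained in any single subset
            total += weight
            for idx in touched:       # it is external to every subset it touches
                external[idx] += weight
    return total, external
```

Objective (iii) is then `max(external)` minimized over candidate partitions.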
Cluster Ratio Cut
$$R_c(k) = \min_{X_1, X_2, \ldots, X_k} \frac{C(X_1, X_2, \ldots, X_k)}{\sum_{i<j} |X_i|\,|X_j|}$$
Example: $|V| = 15$, $|E| = 24$
Multi-Level Partitioning
K-Level Partitioning
$L_i \le |X_i| \le U_i$
Min connection cost: $E(T_i) \le C_i$
External connection cost <= threshold at level i
Generic Binary Tree
$L \le |X_i| \le U$
Level of node = longest path to leaves
Connection cost of node i at level l: $F_{l,i} = \sum_{n \text{ goes through node } i} C_n$
Obj: $\min \sum_{l,i} 2^{l} F_{l,i}$
(Figure: binary tree with levels L0 to L4.)
Replication Cut
(Figure: nodes divided into X, R, and Y, with R replicated on both sides of the cut.)
Cost: C(X,Y)+C(Y,X)+C(Y,R)+C(X,R)
Performance-Driven Partitioning
Needs a fast incremental timing analysis to reflect the performance
Replication helps
(Figure: partition into X, R, and Y.)
Retiming (Pipelined process)
Allocate one clock cycle for interpartition communication
System performance is dominated by loops:
$$\max_{\text{loop}} \frac{d_{\text{loop}}}{\#\text{reg}_{\text{loop}}}$$
1
1
2
2
2
#
#
l
l
l
l
d T
reg
d
reg
1 2 1 2
1 2 1 2
2, ,
# # #l l l l
l l l l
d d d d T
reg reg reg reg
Clustering
K-way partitioning with K >> 10
Objective strongly depends on applications
Performance driven: Obj: min max # cuts between registers
Complexity reduction: $\min \sum_i \frac{E_i}{I_i}$
$E_i$: external connection
$I_i$: internal connection
Net Modeling
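Shifting: for each shift, we update k and n-k, the numbers of pins of the net on the two sides. The cost of the net changes only when k = 0, 1 or n-k = 0, 1.
Two-pin net (clique) model of a k-pin net: edge weight 2/k gives total weight k-1; alternative edge weight 1/(k-1) (S.K.)
(Figure: net with k pins on one side and n-k pins on the other.)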
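The clique model can be sketched in a few lines. The code below expands a k-pin net into two-pin edges of weight 2/k, so that the total weight is k-1 as stated above, and then measures what the expanded net contributes to a cut. Function names are illustrative.

```python
from itertools import combinations

def clique_edges(pins):
    """Expand one k-pin net into k(k-1)/2 two-pin edges of weight 2/k."""
    k = len(pins)
    return [(u, v, 2 / k) for u, v in combinations(pins, 2)]

def clique_cut(pins, left):
    """Cut weight of the expanded net when the pins in `left` are on one side."""
    left = set(left)
    return sum(w for u, v, w in clique_edges(pins) if (u in left) != (v in left))
```

For a 4-pin net the clique weights sum to 3, and a 1|3 split costs 1.5 while a 2|2 split costs 2, although the true net is cut exactly once in both cases. This is the kind of distortion a two-pin net model introduces.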
Net Modeling: Loop Model
If the relative positions of the pins are given, we can use a loop model
The model remains correct if any two adjacent (in order) pins swap
Each edge of the loop has weight 1/2
Net Modeling: Hypernet Model (Flow Approach)
star
n
Xin0
in ni
in nii i
x C
x x
Optimal Methods: Branch & Bound
(Figure: search tree in which nodes 1, 2, 3, ... are assigned in turn to the left (L) or right (R) side.)
Prune a branch when
the size constraint is violated, or
partial cost >= existing cost
For U = L = |V|/2:
# combinations = |V|! / ((|V|/2)! (|V|/2)!)
With an elegant implementation, |V| <= 60 is feasible
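The two pruning rules can be sketched as a small recursive search for an exact bisection. This is a minimal illustration under stated assumptions (nodes numbered 0..n-1, n even, two-pin edges given as `(u, v, weight)`), not the tuned implementation the slide alludes to. Node 0 is fixed on the left to avoid exploring mirrored solutions.

```python
from math import comb

def num_bisections(n):
    """Number of ways to choose the left half: |V|! / ((|V|/2)! (|V|/2)!)."""
    return comb(n, n // 2)

def bisect_branch_and_bound(n, edges):
    """Exact min-cut bisection of nodes 0..n-1 (n even) by branch and bound."""
    half = n // 2
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = {"cost": float("inf"), "side": None}
    side = [None] * n

    def visit(i, left, cost):
        if cost >= best["cost"]:                 # bound: partial cost >= existing cost
            return
        if left > half or (i - left) > half:     # size constraint violated
            return
        if i == n:
            best["cost"], best["side"] = cost, side[:]
            return
        for s in (0, 1):
            side[i] = s
            # Only edges to already-assigned nodes add to the partial cost.
            extra = sum(w for v, w in adj[i] if v < i and side[v] != s)
            visit(i + 1, left + (s == 0), cost + extra)
        side[i] = None

    side[0] = 0                                  # break left/right symmetry
    visit(1, 1, 0)
    return best["cost"], best["side"]
```

The partial cost only grows as more nodes are assigned, which is what makes the bound test safe.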
Optimal Methods: Series-Parallel Graph
Dynamic programming on series-parallel graph G(V, E, s, t)
C(a, i, j): case a, s and t on different sides; the left side has i nodes, the right side has j nodes
C(b, i, j): case b, s and t both on the left side; the left side has i nodes, the right side has j nodes
$C_1$ and $C_2$ denote the tables of the subgraphs $G_1$ and $G_2$.
Parallel composition $G = G_1 \,//\, G_2$ (s1 = s2, t1 = t2):
$$C(a,i,j) = \min_{k,l}\ \bigl[C_1(a,i-k,j-l) + C_2(a,k,l)\bigr]$$
$$C(b,i,j) = \min_{k,l}\ \bigl[C_1(b,i-k,j-l) + C_2(b,k,l)\bigr]$$
Series composition of $G_1$ and $G_2$ (t1 = s2):
$$C(a,i,j) = \min\Bigl(\min_{k,l}\ \bigl[C_1(a,i-k,j-l) + C_2(b,k,l)\bigr],\ \min_{k,l}\ \bigl[C_1(b,i-k,j-l) + C_2(a,k,l)\bigr]\Bigr)$$
$$C(b,i,j) = \min\Bigl(\min_{k,l}\ \bigl[C_1(a,i-k,j-l) + C_2(a,l,k)\bigr],\ \min_{k,l}\ \bigl[C_1(b,i-k,j-l) + C_2(b,l,k)\bigr]\Bigr)$$
Heuristic Methods
Group Migration: Kernighan & Lin, Fiduccia-Mattheyses
Programming
Network Flow
Replication Cut
Clustering
Group Migration: Kernighan & Lin
Bisection
Cost Ci: change of #cuts by moving node i to the other side
Heapsort nodes in each partition according to Ci
Repeat
  Repeat
    Choose among the top k the best pair to swap
    Update the cost, lock the moved nodes
  Until all nodes are locked
  Find the best sequence to swap
Until no more improvement
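Gain of swapping i and j: $S_{ij} = (E_i - I_i) + (E_j - I_j) - 2C_{ij}$
Complexity: $O(k\, n \log n) \times \#\text{passes}$ (assuming #pins = O(n))
Probability of Opt: $P(n) = 2^{-n/30}$
(Figure: cut cost over the sequence of swaps 1, 2, 3.)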
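One pass of the Kernighan & Lin procedure can be sketched as below. For brevity this version searches all unlocked pairs exhaustively instead of keeping heaps and examining only the top k candidates, so it is slower than the scheme described above. It uses the swap gain (E_i - I_i) + (E_j - I_j) - 2C_ij, with `D[i]` standing for E_i - I_i. The weight-table layout and function names are assumptions made for this example.

```python
def build_weights(edges):
    """Symmetric weight table w[u][v] from (u, v, weight) triples."""
    w = {}
    for u, v, wt in edges:
        for a, b in ((u, v), (v, u)):
            row = w.setdefault(a, {})
            row[b] = row.get(b, 0) + wt
    return w

def kl_pass(w, A, B):
    """One Kernighan-Lin pass; returns (new A, new B, gain of the pass)."""
    def c(u, v):
        return w.get(u, {}).get(v, 0)

    side = {u: 0 for u in A}
    side.update({u: 1 for u in B})
    # D[u] = external cost minus internal cost of node u.
    D = {u: sum(wt if side[v] != side[u] else -wt
                for v, wt in w.get(u, {}).items())
         for u in side}
    free_a, free_b = sorted(A), sorted(B)
    swaps, gains = [], []
    while free_a and free_b:
        gain, a, b = max(((D[a] + D[b] - 2 * c(a, b), a, b)
                          for a in free_a for b in free_b),
                         key=lambda t: t[0])
        free_a.remove(a)                      # lock the moved nodes
        free_b.remove(b)
        for x in free_a:                      # update costs as if a and b were swapped
            D[x] += 2 * c(x, a) - 2 * c(x, b)
        for y in free_b:
            D[y] += 2 * c(y, b) - 2 * c(y, a)
        swaps.append((a, b))
        gains.append(gain)
    # Find the best prefix of the swap sequence; later swaps may have negative gain.
    best_k, best_gain, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_k, best_gain = k, running
    A, B = set(A), set(B)
    for a, b in swaps[:best_k]:
        A.remove(a); A.add(b)
        B.remove(b); B.add(a)
    return A, B, best_gain
```

Calling `kl_pass` repeatedly until the returned gain is 0 corresponds to the outer "Until no more improvement" loop.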
Group Migration
Hill climbing to jump over local optima. Locking mechanism to avoid repeated moves.
Fiduccia-Mattheyses
No swapping: move a single node each time
Replace the heap with an array indexed by cost, from $-d_{\max} b_{\max}$ to $+d_{\max} b_{\max}$
$d$: degree
$b$: cost of each net
$-d_{\max} b_{\max} \le C_i \le d_{\max} b_{\max}$
Total increase $\le b_{\max}\, p$
Total decrease $\le 2 b_{\max}\, p + b_{\max} d_{\max}$
# top shifting = $O(b_{\max} d_{\max} + p\, b_{\max})$
$p$: #terminals
High Order Gains
, ,
,
,
( ) ( ) ( )
{ | , ( ) , ( ) 0
{ | , ( ) 0, ( ) 1
( ) 2
( ) 1
#nodes in A, B
pos ki
ke E e Eneg ki
pos ki A B
neg ki A B
A
B
D i C e C e
E e E i e e k e
E e E i e e e k
e
e
1 1 1 1
0 0 0
-1
B
i
eA
Probabilistic Model
( ,1), ( ,2)
( ) ( ( ) ( ))
( ) ( )
( ) ( ( ))
( ) 1 if S is an empty set
n x n i ii n i x i n
x n xn
x x
ii S
g V C p V p V
g V g V
p V f g V
p V
P
G(Vx)
Pmax
Pmin
For any locked i, $p(V_i) = 0$
Move all nodes according to a single net
Adv: moves more nodes in the same operation. Good for multi-way and hierarchical partitioning, where the cost function is dominated by the way the nets are handled
Dis: complicated
Simulated Annealing
$$C(V_1, V_2) = \sum_{e \in \text{crossing nets}} C(e) + \lambda \bigl(S(V_1) - S(V_2)\bigr)^2$$
$S(V_i)$ = size of $V_i$
Simulated Annealing
s, nexts: configurations; T: real; count: integer
Begin
  s = random initial configuration
  T = T0
  Repeat
    count = 0
    Repeat
      count = count + 1
      nexts = generate(s)
      If c(nexts) <= c(s) or f(c(s), c(nexts), T) > random(0, 1)
      Then s = nexts
    Until equilibrium(count, s, T)
    T = update(T)
  Until frozen(T)
End
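The annealing loop above can be turned into a small runnable sketch. Several choices here are assumptions that the slides leave open: the acceptance function f is taken to be the usual Metropolis rule exp((c(s) - c(nexts)) / T), generate(s) flips one random node to the other side, equilibrium is a fixed number of moves per temperature, update(T) is geometric cooling, and the cost is the crossing weight plus a penalty `lam` times the squared size difference of the two sides.

```python
import math
import random

def cost(edges, side, lam):
    """Crossing weight plus lam * (size difference)^2."""
    cut = sum(w for u, v, w in edges if side[u] != side[v])
    imbalance = side.count(0) - side.count(1)
    return cut + lam * imbalance ** 2

def anneal(n, edges, lam=1.0, t0=2.0, alpha=0.9, moves_per_t=50,
           t_frozen=0.01, seed=0):
    """Return (best configuration seen, its cost) for nodes 0..n-1."""
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n)]    # random initial configuration
    c_s = cost(edges, s, lam)
    best, c_best = s[:], c_s
    t = t0
    while t > t_frozen:                          # until frozen(T)
        for _ in range(moves_per_t):             # until equilibrium
            i = rng.randrange(n)
            s[i] ^= 1                            # generate(s): flip one node
            c_next = cost(edges, s, lam)
            if c_next <= c_s or math.exp((c_s - c_next) / t) > rng.random():
                c_s = c_next                     # accept the move
                if c_s < c_best:
                    best, c_best = s[:], c_s
            else:
                s[i] ^= 1                        # reject: undo the flip
        t *= alpha                               # update(T)
    return best, c_best
```

Keeping the best configuration seen is a common practical addition. The pseudocode above simply ends with the current configuration.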
Programming
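For each node i: $x_i = 1$ if $n_i \in X$, and $x_i = -1$ if $n_i \in \bar{X}$; then
$$C(X,\bar{X}) = \frac{1}{4}\sum_{i<j} C_{ij}(x_i - x_j)^2 = \frac{1}{4} X^T B X, \qquad b_{ij} = -C_{ij}\ (i \ne j), \quad b_{ii} = \sum_j C_{ij}$$
Constraint set: $\mathbf{1}^T X = 0$, $X^T X = |V|$
The eigenvalues of B: $0 = \lambda_0 \le \lambda_1 \le \cdots \le \lambda_{|V|-1}$
The opt solution of the programming, $X^T B X = \lambda_1 |V|$, gives a lower bound of bisection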
Improving the lower bound
~2
1
1
Let C( , ) ( , )
The cost function won't change the opt.
Solution of bisection
max (nonlinear programming)
/ | |
2nd smallest eigenvalue of B+D , 0
Adv: global v
i
i ii
d
ii
ii i ij
X X C X X d x
d V
d d d
1
iew
Disadv: ... are near (numerical problem)
many points are near
no fixed can not be done recursively
k
i
i
x
x
Network Flow Approaches
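(Figure: flow f from source s to sink t; $x_{ij}$ is the flow on edge (i, j).)
Primal, with the dual variable of each constraint in brackets:
$$\max f \quad \text{s.t.} \quad x_{ij} \le C_{ij}\ [\lambda_{ij}], \quad \sum_j x_{sj} - \sum_j x_{js} = f\ [\pi_s], \quad \sum_j x_{tj} - \sum_j x_{jt} = -f\ [\pi_t], \quad \sum_j x_{ij} - \sum_j x_{ji} = 0\ [\pi_i,\ i \ne s, t], \quad x_{ij} \ge 0$$
Equivalently, with f moved to the left-hand side:
$$\max f \quad \text{s.t.} \quad x_{ij} \le C_{ij}, \quad \sum_j x_{sj} - \sum_j x_{js} - f = 0, \quad \sum_j x_{tj} - \sum_j x_{jt} + f = 0, \quad \sum_j x_{ij} - \sum_j x_{ji} = 0, \quad x_{ij} \ge 0$$
Dual:
$$\min \sum_{i,j} C_{ij}\lambda_{ij} \quad \text{s.t.} \quad \lambda_{ij} + \pi_i - \pi_j \ge 0, \quad \pi_t - \pi_s = 1, \quad \lambda_{ij} \ge 0$$
Equivalently:
$$\min \sum_{i,j} C_{ij}\lambda_{ij}, \qquad \lambda_{ij} = \max(0,\ \pi_j - \pi_i), \qquad \pi_t - \pi_s = 1$$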
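The max-flow formulation and its min-cut dual can be illustrated with a short augmenting-path routine. The sketch below uses the Edmonds-Karp variant (shortest augmenting paths found by breadth-first search), which is one standard way to solve the problem and is not necessarily the method used in the lecture. The nodes still reachable from s in the final residual graph form the side X of a minimum cut.

```python
from collections import deque

def max_flow_min_cut(n, arcs, s, t):
    """Return (max flow value, set X of nodes on the source side of a min cut).

    n:    number of nodes, labelled 0..n-1
    arcs: list of directed (u, v, capacity) triples
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in arcs:
        cap[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break                                # no augmenting path is left
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                            # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
    X = {v for v in range(n) if parent[v] != -1}  # reachable from s at the end
    return flow, X
```

An undirected edge can be modelled as two opposite arcs of the same capacity, as in the test graph of two triangles joined by one edge.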
Multiple Commodity Flow
i
a
b
jijf
ijfijab
ij
x
max
| | 1
0
kij ij ij
k
k k kk jk jk
j j
k k ki jk ij
j j
kij
f
x C
x x f
fx x
V
x
max
0
0| | 1
0
kij ij
k
k kjk jk
j j
k kjk ij
j j
kij
f
x C
x x f
fx x
V
x
Xl Xj
Xi
ij
min
0
1| | 1
0
ij ijij
k kij i j
kkik
k i k k
ij
C
V
min
( ) | | 1
0
ij ijij
k kij i j
k ki k
k i k
ij
C
V
( ) | | 1
Triangular inequality
longest path the distance of the longest path from i to k
def: distance(i,j) =
k kij i jk ki k
k k i
ij jl il
k ki k
ij
A V
B
B constraints can be used to replace A
Weighted Cluster Ratio Cut
-
i i
,
,
Theorem:
If k 4, there exists an opt ratio cut R(X,X), s.t.
X or X
Weighted cluster ratio cut
( , )
min| || |
: distance between and
: distance of the longest path from
ij i ji j
ij i ji j
ij i j
ij
X X
C X X
d X X
X X
d
to
Theorem: Multi-Commodity flow derives an opt
solution of weighted cluster ratio cut
i jX X
Replication Cut
( , ) ( , )
: min
0
0
0
1
1
0
0
, 0
ij ij ji iji j E j i E
ij i j
ij i j
i i
s
s
t
t
ij ji
Obj C W C U
W p p
U q q
q p
p
q
p
q
W U
R
S T
R R
S T
S T
Replication Graph
(1) Edges or nets are directed
(2) Duplicate the graph: create $i'$ for each $i \in V$, and create net $n'$ for each $n \in E$, where $n'$ has the reverse direction and equal weight $C_n$
(3) Link directed edge $(i, i')$ with weight $\infty$
(4) Min cut separating s & t' derives X, R, Y
Heuristic Flow Approaches
MFMC (Max-Flow Min-Cut) for bisection
1 Find two seeds as s, t
2 Find Min Cut separating s & t
3 If $|X| < |\bar{X}|$, find a seed and merge it with s
4 Else if $|X| > |\bar{X}|$, find a seed and merge it with t
5 Repeat 2-4 until $|X| = |\bar{X}|$
Adv: Use parametric flow approach
The whole process is equivalent to one cut
Dis: The choice of the seeds
Cluster Ratio
1 2,
0.5
( , ,..., )min ( )
| || |
Complexity log ,
(1) Randomly choose two nodes
find the shortest path
(2) Add weight on the path
(3) Repeat (1)-(2) r times
kR C C
i ji j
C V V VR k
V V
rn n r n
k
2
2
2
ij
ij
ij
Radomly choose two nodes i & j
Prof that i & j X
| |
| |
| || |
| |
If are small,
dNet weight
C
d : #pahts went through wire (i, j)
kk
k kk
k
XP
V
X V XP
V
X P P
Clustering (Performance-Driven)
(Figure: example network with node labels 1, 2, and 3.)
Cap <= 4
(1) Min delay
(2) Min #clusters
(3) Min #clusters, T>=delay
(4) Min delay #clusters <= C
(5) Min max #inputs, #cluster<= C
(6) Min #clusters, #inputs <= K
Clustering Heuristics
Cluster during target operation
Recursive partitioning (ratio cut) to find clusters. Then treat each cluster as a single node to perform partitioning.
Linear Placement Iteration
Do linear placement
Cluster adjacent nodes
Treat each cluster as a single node; repeat linear placement
Adv: Works on the right target
Dis: Needs an efficient & effective target operation
# Crossing
a b
a
a b
b
L
ba is stronger than bL
Max Pair
Given pairing cost, find the best pair to cluster
Treat each cluster as a single node. Repeat the process
The cost function should not encourage the largest node to keep pairing with other nodes.
Max Matching
Match the n/2 pairs simultaneously
Adv: max matching can be solved optimally in polynomial time
Dis: forces unnatural pairs to merge
(Figure: nodes a, b, c, d, e, f.)
c and f are merged because their preferred choices are taken by others
Variations of cost function
Similarity of signatures (Data Path)
Bit #, pin sequence
Control signal, function of gate
The conductivity $\sigma_{ij}$ between nodes i & j
Replacing nets with resistance $1/C_{ij}$
We can use random walk to derive $\sigma_{ij}$ between pairs of nodes
Random walk: at node i, the probability to go to j is $C_{ij} / \sum_k C_{ik}$
Expected # of hops between i & j is proportional to $1/\sigma_{ij}$:
$$h_{ij} + h_{ji} = \frac{2|E|}{\sigma_{ij}}$$
Kth-connectivity
$C = [c_{ij}]$, $c_{ii} = 0$
$c_{ij}$: connectivity of i, j with one hop
$C^2 = [c^{(2)}_{ij}]$
$c^{(2)}_{ij}$: connectivity of i, j with 2 hops
$C^k = [c^{(k)}_{ij}]$
$c^{(k)}_{ij}$: connectivity of i, j with k hops
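The k-hop connectivity can be computed by repeated matrix multiplication. The sketch below assumes the connectivity matrix is a plain list of lists with a zero diagonal, and that k-hop connectivity is read off the k-th matrix power. Function names are illustrative.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def k_hop_connectivity(C, k):
    """k-th power of the connectivity matrix C, for k >= 1."""
    result = C
    for _ in range(k - 1):
        result = mat_mul(result, C)
    return result

# Path graph 0 - 1 - 2 with unit connectivity.
PATH = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
```

In the path example, nodes 0 and 2 have no one-hop connectivity but a two-hop connectivity of 1 through node 1, so a clustering cost based on the second power would still see them as related.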
Research Directions
(Figure: a logic hierarchy and a layout hierarchy built over the same nodes a, b, c, d, e, with different groupings.)
Logic Hierarchy vs. Layout Hierarchy
(1) Obj: min distortion, max performance
(2) Linkage between two: eg ECO revision
(3) Improvement of operations by exploring the given hierarchy