May 1, 2013
Challenges of Giant Silicons in Advanced Process Technology Node
Presenter: Sorel Horovitz, Design Manager, Marvell Israel
Introduction
Challenge:
• Physical implementation and management of giant silicon devices: device partitioning, flat full-chip timing, inter-block connectivity, project management
• Implementation methods: reduce time to market using new techniques
The level of functionality that can be squeezed into a "reasonable" die size almost doubles from one process generation to the next.
28 nm: 600 mm², 60 macros
40 nm: 600 mm², 30 macros
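As a rough cross-check of the "almost doubles" statement, the sketch below compares the macro counts quoted above with an ideal geometric shrink. It assumes that feature area scales with the square of the node name, which real processes only approximate; the function name is illustrative.

```python
# Ideal area scaling between two process nodes. This is a geometric
# estimate only (assumption), not a foundry density figure.

def ideal_density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal logic-density gain when shrinking from old_nm to new_nm."""
    return (old_nm / new_nm) ** 2


gain = ideal_density_gain(40, 28)   # about 2.04
macro_ratio = 60 / 30               # macro counts quoted on the slide

print(f"ideal density gain, 40 nm -> 28 nm: {gain:.2f}x")
print(f"macro count ratio at the same die size: {macro_ratio:.1f}x")
```

Both numbers land close to 2x, which matches the doubling of macros at a constant die size.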
[Figure: full-chip floorplan, 21.5 x 23.7 = 509.55. Labeled blocks include DDR CNTL with DATA lanes, PLL, PEX, SD, CG, ILK, MiniGop 0-3, Network PCL, Network L2I MT, Network IPVX POLICER, BMEM lower, BMEM upper 0-3, EMC LookUp, EMC Wide, PCL TCAM Macro, Action Table (19Mb), FCU, BMA, TXQ, LL, CNC 0-1, FB Control, MPPM, CMOS 0-5 and dfx, each annotated with its width x height and area.]
Size & Technology
• Floor Plan
• Power Grid
• Partition
• Clocks
• FC Timing
• Wire Sampling
• Database
• Management
Backend Challenges
Clusters: each contains a different number of macros
Management
Physical Cluster
• Pin assignment
• Routing channels
• Clocks
Logical Cluster
• FC timing (static timing)
• Run time / complexity
• Macro duplication
• Inter-block connectivity
Our Approach: Project partitioning
Clusters: each contains a different number of macros
Management
• Macro size / aspect ratio
• Top routing
• Clock structure
• Long wire sampling
Physical Partition
• Pin / routing density
  o Almost no room for buffers in the top-level channels
  o Sampling macros: ~45K ports
  o Regular macros: ~26K ports
• Top routing channels are very small, which also reduces chip area
• Main clocks (rings/meshes) are routed in the top channels
• Hold slack between two macros is closed inside the macros
Top Routing channels Physical Partition
Clock Structure Physical Partition
• Long connectivity wires between macros
• Developed a tool for sampling inside macros
  o Optimal feed-through paths
  o Number of sampling levels inside each macro
  o Automatic budget (SDC), depending on the distance between the port and the FF placement
  o Automatic pin location
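The distance-based budgeting idea can be illustrated with a minimal sketch. It is not the tool described in the talk: the wire delay per millimetre, the clock period, the usable fraction, the margin, and all function names are assumptions chosen for illustration. Only `set_input_delay` with `-clock` and `get_ports` is standard SDC syntax.

```python
import math

# All constants are illustrative assumptions, not values from the talk.
WIRE_DELAY_NS_PER_MM = 0.25   # assumed delay of a buffered wire per mm
CLOCK_PERIOD_NS = 1.0         # assumed clock period
USABLE_FRACTION = 0.8         # assumed share of a cycle available to wire delay


def manhattan_mm(a, b):
    """Manhattan distance between two (x, y) points given in mm."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sampling_levels(distance_mm):
    """Intermediate sampling flops needed so that every hop fits in one cycle."""
    per_hop_ns = CLOCK_PERIOD_NS * USABLE_FRACTION
    total_ns = distance_mm * WIRE_DELAY_NS_PER_MM
    return max(0, math.ceil(total_ns / per_hop_ns) - 1)


def input_delay_ns(port_xy, ff_xy, margin_ns=0.05):
    """External budget: the period minus what the port-to-FF wire needs inside."""
    internal_ns = manhattan_mm(port_xy, ff_xy) * WIRE_DELAY_NS_PER_MM + margin_ns
    return max(0.0, CLOCK_PERIOD_NS - internal_ns)


def sdc_line(port, clock, port_xy, ff_xy):
    """Emit one SDC input-delay constraint for a macro port."""
    delay = input_delay_ns(port_xy, ff_xy)
    return f"set_input_delay -clock {clock} {delay:.3f} [get_ports {{{port}}}]"


print(sampling_levels(12.0))
print(sdc_line("din[0]", "clk", (0.0, 1.0), (0.4, 1.2)))
```

The closer the sampling FF sits to the port, the more of the clock period is left for the wire outside the macro, which is why the budget depends on FF placement.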
Wire Sampling Physical Partition
Management
• FC timing
  o Static timing
  o Flat / ETM / QTM
  o Overlap macros
• Run time / complexity
• Macro duplication
• Inter-block connectivity
Logical Partition
[Figure: chip divided into four timing clusters (Top Left, Top Right, Bottom Left, Bottom Right); legend: Flat, ETM, QTM]
Logical Partition
• The static timing partition is independent of the physical partition.
  o Fast timing cluster definition using Flat / ETM / QTM models
  o Each partition runs independently
• Budget and slack allocation within each cluster and between clusters
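One way to picture a per-cluster timing run is a table that assigns a model type to every macro. The sketch below assumes a simple rule that is not stated in the talk: macros inside the cluster under analysis are flat, macros in adjacent clusters use ETMs, and the rest use QTMs. The macro names, the cluster map, and the adjacency table are hypothetical.

```python
from enum import Enum


class Model(Enum):
    FLAT = "flat"
    ETM = "etm"
    QTM = "qtm"


# Hypothetical macro-to-cluster assignment.
CLUSTER_OF = {
    "macro_a": "top_left",
    "macro_b": "top_left",
    "macro_c": "top_right",
    "macro_d": "bottom_left",
    "macro_e": "bottom_right",
}

# Assumed adjacency between the four timing clusters.
NEIGHBOURS = {
    "top_left": {"top_right", "bottom_left"},
    "top_right": {"top_left", "bottom_right"},
    "bottom_left": {"top_left", "bottom_right"},
    "bottom_right": {"top_right", "bottom_left"},
}


def models_for_run(cluster):
    """Pick a timing model per macro for the STA run of one cluster."""
    plan = {}
    for macro, home in CLUSTER_OF.items():
        if home == cluster:
            plan[macro] = Model.FLAT      # analysed in full detail
        elif home in NEIGHBOURS[cluster]:
            plan[macro] = Model.ETM       # extracted timing model
        else:
            plan[macro] = Model.QTM       # quick timing model
    return plan


for macro, model in models_for_run("top_left").items():
    print(macro, model.value)
```

Because each run loads only its own cluster flat, the four runs can proceed independently and in parallel.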
Full-chip timing Logical Partition
• Clock balancing is done per cluster and between clusters.
• Database collector:
  o Timing interface (setup / hold) per macro port is recorded
  o Tool analyzes the database timing and decides which fixes to make
• Duplicated macros
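The database collector can be sketched as a table of per-port slacks plus a rule for assigning fixes. The record layout and the decision rule here are assumptions for illustration: slacks are taken relative to port budgets that together cover one clock period, so the slack of an inter-macro path is the sum of the driver-side and receiver-side slacks, and a violation is assigned to the macro with the worse side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortTiming:
    macro: str
    port: str
    setup_slack_ns: float   # slack against the port's setup budget (assumed)
    hold_slack_ns: float    # slack against the port's hold budget (assumed)


def collect(records):
    """Index the per-port records by (macro, port)."""
    return {(r.macro, r.port): r for r in records}


def decide_fixes(db, connections):
    """Return (kind, macro, port, path_slack) for every violating connection.

    connections: iterable of ((drv_macro, drv_port), (rcv_macro, rcv_port)).
    """
    fixes = []
    for drv, rcv in connections:
        for kind in ("setup", "hold"):
            d = getattr(db[drv], f"{kind}_slack_ns")
            r = getattr(db[rcv], f"{kind}_slack_ns")
            if d + r < 0:
                owner = drv if d <= r else rcv   # fix on the worse side
                fixes.append((kind, owner[0], owner[1], round(d + r, 3)))
    return fixes


db = collect([
    PortTiming("A", "out", setup_slack_ns=-0.10, hold_slack_ns=0.02),
    PortTiming("B", "in", setup_slack_ns=0.04, hold_slack_ns=-0.05),
])
fixes = decide_fixes(db, [(("A", "out"), ("B", "in"))])
for fix in fixes:
    print(fix)
```

Assigning each violation to one macro keeps the fix inside a macro rather than in the top-level channels, where the deck notes there is almost no room for buffers.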
Full-chip timing Logical Partition
• Each cluster has its own database (Vout / SPEF / SDC).
• Common database for the top level, read by all clusters
• Each macro needs to have Flat / ETM / QTM models
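The requirement that every macro ships all three model types lends itself to an automated check. This is a minimal sketch with hypothetical macro names and an in-memory inventory; a real flow would build the inventory by scanning the cluster databases.

```python
REQUIRED_VIEWS = ("flat", "etm", "qtm")


def missing_views(available):
    """Map each incomplete macro to the model views it still lacks.

    available: dict of macro name -> set of view names found in the database.
    """
    return {
        macro: [v for v in REQUIRED_VIEWS if v not in views]
        for macro, views in available.items()
        if any(v not in views for v in REQUIRED_VIEWS)
    }


# Hypothetical inventory of delivered models.
available = {
    "macro_a": {"flat", "etm", "qtm"},
    "macro_b": {"flat", "etm"},
}
print(missing_views(available))  # {'macro_b': ['qtm']}
```

Running such a check before each full-chip timing run avoids discovering a missing model midway through a long job.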
Database Management
• Cluster partitioning enables efficient management
  o Divides responsibility and accountability between the backend leader and the cluster leaders
  o Physical and logical partition
• Shortens time to market
Summary
Questions?
THANK YOU!