8/6/2019 So Cde Sign
SoC Design
SoC Design
Synthesis
DFT Insertion
Floorplanning
Power Planning
Clock tree insertion
Place and Route
RC extraction
Timing check
Design Tools
System Architecture: C/C++, SystemC, Matlab
Synthesis: RTL Compiler, BuildGates, PrimePower
RTL: Verilog-XL, NC-Verilog, NC-VHDL, Debussy
Physical Design: Silicon Ensemble, SoC Encounter, Magma, Mentor
Simplified Flow
[Figure: simplified design flow, divided into Front End and Back End. Steps shown: Logic Simulation, Logic Synthesis, Formal Verification, Test (ATPG), Static Timing Analysis, Floor planning, Place & Route, Clock Tree Synthesis, RC Extraction, Static Timing Analysis, DRC/LVS. Data shown: RTL, Timing Constraints, .lib, LEF, Netlist, SPEF, SDF, GDSII.]
TSMC's Design Flow
Flow with Multi-Vendor Tools
Design Abstraction Levels
[Figure: design abstraction levels: DEVICE (transistor with S, G and D terminals and n+ regions), CIRCUIT, GATE, MODULE, SYSTEM]
Logic Synthesis
Synthesis is the process by which an abstract description of the circuit behaviour (known as RTL, generally in VHDL) is mapped to a set of primitive standard cells in a library for a particular process technology.
Steps
  Translation of the RTL description into an intermediate format
  Optimization of the logic
  Mapping of the optimized netlist to the gates of the target library
The synthesis tool requires
  RTL code
  Target ASIC cell library
  User constraints (timing and area; environmental: power, load, etc.)
The output of synthesis is a gate-level netlist in the target technology
[Figure: Idea, Functional Description, Behavioral HDL, RTL, Gate-Level Netlist]
[Figure labels: Logic Synthesis; DFT Architecture; Netlist Synthesis]
RTL Coding
RTL stands for Register Transfer Level
An RTL description of a design describes the design in terms of registers and the logic that resides between them
This captures the timing constraints of the design efficiently
Verilog and VHDL are the two most popular hardware description languages commonly used to write RTL descriptions
An RTL description captures the change in data at each clock cycle
All the registers are updated at the same time in a clock cycle
RTL captures the data flow
Logic synthesis tools translate an RTL model more efficiently than a behavioral model
Sample RTL code
    if IR(3) = '0' then
        PC := PC + 1;
    else
        DBUF := MEM(PC);
        MEM(SP) := PC + 1;
        SP := SP - 1;
        PC := DBUF;
    end if;
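The point that all registers are updated at the same time in a clock cycle can be illustrated with a small software model. This is a minimal sketch in Python, not taken from the slides: the function name `clock_cycle` and the two-register swap example are assumptions chosen only for illustration.

```python
def clock_cycle(regs, next_state):
    """Advance a register set by one clock cycle.

    Every next value is computed from the *current* register values first;
    only then are all registers replaced together, as in an RTL description.
    """
    new_values = {name: fn(regs) for name, fn in next_state.items()}
    updated = dict(regs)        # registers without a next-state rule keep their value
    updated.update(new_values)
    return updated


# Hypothetical example: two registers that exchange their contents each cycle.
swap = {
    "A": lambda r: r["B"],
    "B": lambda r: r["A"],
}

state = {"A": 1, "B": 2}
state = clock_cycle(state, swap)
print(state)  # {'A': 2, 'B': 1}
```

If the registers were updated one after another instead, A would be overwritten before B could read it and both would end up holding the same value.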
Logic Synthesis
Process (CLK, RST)
if (RST = 1) then
Q
Logic Synthesis: Technology Mapping
Z = (not S and A) or (S and B)
[Figure: the generic-gate implementation of Z (inputs A, B, S) is mapped to the standard cells ANDOR-001 and I-002]
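For a function as small as Z = (not S and A) or (S and B), the result of technology mapping can be checked exhaustively. The sketch below assumes (the slide does not say so) that the ANDOR cell computes (a and b) or (c and d) and that the I cell is an inverter; all Python names are illustrative.

```python
from itertools import product


def generic_z(a, b, s):
    """Z = (not S and A) or (S and B), written with generic gates."""
    return ((not s) and a) or (s and b)


def inv_cell(x):
    """Assumed behaviour of the inverter cell (I-002)."""
    return not x


def andor_cell(a, b, c, d):
    """Assumed behaviour of the AND-OR cell (ANDOR-001): (a and b) or (c and d)."""
    return (a and b) or (c and d)


def mapped_z(a, b, s):
    """The same function built only from the two assumed library cells."""
    return andor_cell(inv_cell(s), a, s, b)


def equivalent():
    """Compare both versions over all 8 input combinations."""
    return all(
        bool(generic_z(a, b, s)) == bool(mapped_z(a, b, s))
        for a, b, s in product([False, True], repeat=3)
    )


print(equivalent())  # True
```

The function is a 2:1 multiplexer: Z follows A when S is 0 and B when S is 1.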
DfT Insertion
Testable flip-flops
Scan chain generation
Chain propagation from core to output pin
[Figure: DfT flow; labels: DfT Insertion, Test generation, DfT Insertion and Synthesis, DfT Analysis, Handoff deliverables, ATPG / Expansion, test validation]
Backend Design
Technology information and physical libraries: Corelib.lef, IOlib.lef, Rams.vclef
Timing libraries: Corelib_slow.lib, Corelib_fast.lib, Corelib_typ.lib, IOlib_slow.lib, RAM timing libraries
Timing constraints (user defined)
Design netlist
  Add IO pads, power pads
  Verilog design netlist
  IO pad location file
[Figure: back-end flow; labels: Chip Physical Architecture; Hierarchical STA; I/O & Hierarchical Planning; Floorplan; Power Grid Design Analysis; Implementation; Placement; DFT; Physical Synthesis; Clock Tree Synthesis; Post Placement Optimisation; Routing and Final Optimisation; Signal Routing; Antennas; Decap, Fillers; Crosstalk Fixing; Post Route Fix Editing; Chip Assembly]
Floorplanning
Floorplanning is the task of deciding how the chip area is to be utilized by the leaf modules, taking care of wiring considerations
Two methods of floorplanning:
  Top down: the chip is partitioned during the development of the RTL-level model. Area is assigned on the basis of estimated block areas and shapes, and blocks are placed relative to each other depending on connectivity.
  Bottom up: the design is first synthesised, and the resulting gates are then clustered together into blocks on the basis of connectivity.
Most designs use a combination of both techniques, but the emphasis is increasingly on the first.
[Figure labels: IP Block; Std. Cells; Pads]
Floorplanning
Calculating core size, width and height
When calculating the core size of standard cells, the core utilization must be decided first. Usually the core utilization is higher than 85%.
The core size is calculated as follows:
    Core size of standard cells = standard cell area / core utilization
The recommended core shape is a square, i.e. core aspect ratio = 1:
    Width = Height = (core size of standard cells)^0.5
Example
    Standard cell area = 2,000,000 um^2
    Core utilization demanded = 85%
    No macros
    Core size of standard cells = 2,000,000 / 0.85 = 2,352,941 um^2
    Width = Height = (2,352,941)^0.5 = 1534 um
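The core size calculation can be reproduced in a few lines of Python. This is a minimal sketch using the slide's example numbers; the function names are illustrative and the case with macros is not covered.

```python
import math


def core_size(std_cell_area_um2, utilization):
    """Core size of standard cells = standard cell area / core utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return std_cell_area_um2 / utilization


def square_core_side(core_area_um2):
    """Width = height for a square core (aspect ratio = 1)."""
    return math.sqrt(core_area_um2)


area = core_size(2_000_000, 0.85)   # standard cell area in um^2, 85% utilization
side = square_core_side(area)
print(round(area), round(side))     # 2352941 1534
```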
Floorplanning
Core margins
  Space for power and ground routing
Core limited / pad limited designs
  When pad width > (core width + core margin), the die size is decided by the pads. This is called a pad limited design.
  When pad width < (core width + core margin), the die size is decided by the core. This is called a core limited design.
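The pad limited / core limited rule is a simple comparison, sketched below in Python. The function name, the example dimensions and the handling of the equal case (which the slide does not define) are assumptions.

```python
def limiting_factor(pad_width, core_width, core_margin):
    """Classify a design using the comparison stated on the slide.

    All three arguments must use the same length unit.
    """
    core_total = core_width + core_margin
    if pad_width > core_total:
        return "pad limited"
    if pad_width < core_total:
        return "core limited"
    return "balanced"  # assumption: the slide does not cover equality


# Hypothetical dimensions in um.
print(limiting_factor(pad_width=2000, core_width=1534, core_margin=200))  # pad limited
print(limiting_factor(pad_width=1500, core_width=1534, core_margin=200))  # core limited
```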
Power Planning
Metal migration (also known as electromigration)
  Under high currents, electron collisions with metal grains cause the metal to move. The metal wire may become an open circuit or a short circuit.
  Prevention: size the power supply lines to ensure that the chip does not fail
  Experience: keep the current density of the power ring < 1 mA/um
IR drop
  IR drop is the voltage drop on the power and ground lines caused by high current flowing through the resistive power-ground network
  When there are excessive voltage drops in the power network or voltage rises in the ground network, the device will run at a slower speed
  IR drop can cause the chip to fail due to
    Performance (circuit running slower than specification)
    Functionality problems (setup or hold violations)
    Unreliable operation (less noise margin)
    Power consumption (leakage power)
    Latch-up
  Prevention: add stripes to avoid IR drop on the cells' power lines
Power Planning: IR Drop
[Figure: waveform v(t) against t, with a counter and its enable signal]
[Plot: C2 counts vs. DSP activity (Fc = 20 MHz, Tambient = 27 C); x-axis: tester clock cycles, 0 to 250; y-axis: C2 counts, 691 to 699; annotation: counts = 6]
Number of counts inversely proportional to DSP clock frequency
FC = 10, 20 and 25 MHz
Ring oscillator (ringo) frequency: 115 MHz at VDD = 1.8 V
DSP-induced power supply noise (PSN) is clearly detected
Average PSN = 6 counts x 2.4 mV/count = 14.4 mV
Source: J. Rius, UPC
Voltage Drop Verification
[Figure: voltage drop verification flow using VoltageStorm (Cadence) and SoC Encounter. Flow steps: Create Hierarchy, Block-level PG Analysis, Top-level Chip PG Sign-off. Block-level analysis: Encounter Power Analysis (block power consumption), VoltageStorm, power grid view library. Top-level analysis: Encounter Power Analysis (instance power consumption), VoltageStorm. Other labels: Virtual Prototype (flat implementation); SoC Encounter block power grid view; IP Block; Partition 1; Partition 2. Results are displayed in the SoC Encounter interface.]
Power Grid Design
Power Ring Width
Experience
  Gate count = 70 k
  4000 flip-flops
  80% of the flip-flops with a dynamic gated clock
  Current needed = 0.2 mA/MHz
  Note: multiply this value by 1.8 to 2 for a design without gated clocks
Example
  Gate count = 200 k
  No gated clock
  Clock frequency = 20 MHz
  Current needed = (200/70) * 0.2 * 20 * 2 = 22.86 mA
  Current density < 1 mA/um
  Width of the P/G ring > 22.86 um
  To avoid the slot rule for wide metal, the largest width is 20 um (process dependent)
  Use two sets of P/G rings for this case
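The ring width example can be written as a short calculation. The Python sketch below only scales the rule of thumb from the slide (0.2 mA/MHz per 70 k gates, 1 mA/um, 20 um maximum width); the function names and default values are illustrative, and real limits are process dependent.

```python
import math


def required_current_ma(gate_count_k, freq_mhz, gated_clock, ungated_factor=2.0):
    """Scale the 70 k-gate reference (0.2 mA/MHz) to the design size and frequency.

    ungated_factor is the 1.8 to 2 multiplier for designs without gated clocks.
    """
    current = (gate_count_k / 70.0) * 0.2 * freq_mhz
    return current if gated_clock else current * ungated_factor


def ring_plan(current_ma, max_density_ma_per_um=1.0, max_width_um=20.0):
    """Return (minimum total ring width in um, number of P/G ring sets)."""
    min_width = current_ma / max_density_ma_per_um
    ring_sets = math.ceil(min_width / max_width_um)
    return min_width, ring_sets


current = required_current_ma(200, 20, gated_clock=False)
width, sets = ring_plan(current)
print(round(current, 2), round(width, 2), sets)  # 22.86 22.86 2
```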
Power Stripe Calculation
Experience
  Add one strap set per 100 um
Example
  Core width = height = 1600 um
  Stripe sets added = 15
Core/IO power pad selection
  Core power pad: one set of core power pads (PVDDC along with PVSSC) can provide 40~50 mA of current
  IO power pad: one set of IO power pads (PVDDR along with PVSSR) can provide the power for 3~4 output pads, or 6~8 input pads
[Figure labels: Power ring; Stripes; Core power connection]
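The stripe and core power pad rules of thumb can be sketched in Python as follows. The slide gives 15 stripe sets for a 1600 um core at one set per 100 um; the sketch assumes this means one set at every interior 100 um position, with none at the core edges where the ring sits. That interpretation, the function names and the conservative 40 mA per pad set are assumptions.

```python
import math


def stripe_sets(core_width_um, pitch_um=100.0):
    """One stripe set per pitch, counting interior positions only (assumption)."""
    return max(0, math.ceil(core_width_um / pitch_um) - 1)


def core_power_pad_sets(current_ma, ma_per_set=40.0):
    """Core power pad sets needed, using the low end of the 40~50 mA range."""
    return max(1, math.ceil(current_ma / ma_per_set))


print(stripe_sets(1600))           # 15
print(core_power_pad_sets(22.86))  # 1
```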
Placement
Placement decides the positions of components within allocated blocks
One cannot route until the components have been placed
The quality of a placement is judged solely by the quality of routing it allows
Placement is performed using simple estimates of the final routing
Timing-driven P&R is the state of the art
Gates and flip-flops/latches are the common placement objects
Smaller elements like logic gates are placed in a single row
Larger blocks are placed in multiple rows
[Figure labels: Std cells; Low utilization core]
Placement
Source: Magma
Clock Tree Synthesis
The goals of clock tree synthesis include
  Creating the clock tree spec file
  Building a buffer distribution network
In automatic CTS mode, Encounter will do the following
  Build the clock buffer tree according to the clock tree specification file
  Balance the clock phase delay with appropriately sized, inserted clock buffers
The clock signal is used as a timing reference in a synchronous digital system for the movement of data within that system
The clock tree, or clock distribution network, distributes the clock signal(s) from a common point to all the elements that need it
Properties of clock signals
  They are loaded with the greatest fanout
  They travel over the greatest distances
  They operate at the highest speeds
Clock Tree Synthesis
Routing
Routing is the process of building the physical connections between blocks as defined by the logical connections
Routing takes place in more than one layer, the exact number available depending on the process and design conventions
Layers are connected together using vias
Global routing
  Assigns wires to channels defined during the floorplanning phase
Detailed routing
  Assigns nets to individual tracks in the channel
[Figure labels: Routing and Final Optimisation; Signal Routing; Antennas; Decap, Fillers; Crosstalk Fixing; Post Route Fix Editing]
Routing: Signal Integrity
Cross-talk
  Parallel repeater insertion does not reduce the cross-talk peak noise
  For a 10 mm communication bus, the delay noise is lowered by about 77%
  Staggered repeaters reduce delay noise by about 88%
Source: M. Meijer and A. Katoch, Philips
[Plots: Peak Noise, 20 mm wire; Propagation Delay, 20 mm wire]
Routing: SI Prevention
[Figure labels: Timing & Crosstalk Analysis; Power Distribution Analysis; Verification Signoff; Parasitic Extraction]
Static Timing Analysis
This involves three main steps:
  The design is broken down into sets of timing paths
  The delay of each path is calculated
  All path delays are checked to see if timing constraints have been met
[Figure: example circuit with input A, a flip-flop (D, Q, CLK), output Z, cells D1 and U33, and timing paths Path 1, Path 2 and Path 3; annotated delays: 1.0, 0.54, 0.32, 0.66, 0.23, 0.43, 0.25]
Path delay calculation
  path_delay = (1.0 + 0.54 + 0.32 + 0.66 + 0.23 + 0.43 + 0.25) = 3.43 ns
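The second and third STA steps (calculate each path delay, then check it against the constraint) can be sketched in Python. The delay values are those in the calculation above; the function names and the 4.0 ns and 3.0 ns limits are hypothetical, chosen only to show a passing and a failing check.

```python
def path_delay(segment_delays_ns):
    """Delay of one timing path: the sum of its cell and net delays."""
    return sum(segment_delays_ns)


def slack(segment_delays_ns, required_ns):
    """Positive slack means the path meets the constraint."""
    return required_ns - path_delay(segment_delays_ns)


def meets_timing(segment_delays_ns, required_ns):
    return slack(segment_delays_ns, required_ns) >= 0


example_path = [1.0, 0.54, 0.32, 0.66, 0.23, 0.43, 0.25]

print(round(path_delay(example_path), 2))  # 3.43
print(meets_timing(example_path, 4.0))     # True  (hypothetical 4.0 ns limit)
print(meets_timing(example_path, 3.0))     # False (hypothetical 3.0 ns limit)
```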
Physical Verification
DRC: Design Rule Checking
LVS: Layout vs. Schematic verification
Chip Finishing
Seal-ring & Artefact Generation
  The seal ring helps to make the circuit moisture resistant and prevents the generation of cracks in the die while the wafer is sawn
  Artefacts: critical dimension structures, mask IDs, fuse markers, etc.
Tiling (dummy fill / pattern fill)
  The fabs' stringent minimum and maximum rules on layer densities on active, poly and metal must be met by all designs
Sometimes this step is simply called Design Chip Finishing
Currently a back-end operation
Each step is followed by a Physical Verification step
[Figure labels: tiles; Seal ring]
Package Fitting
Selection of an appropriate package
Route pads to pins
  Wire length is important
  Rule checking
GDS2: the minimum required information is the nitride or pad opening layer, or the pad boundary layer
[Figure: package options]
Packaging