Digital Integrated Circuits: A Design Perspective
Design Methodologies
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
A System-on-a-Chip: Example
Courtesy: Philips
Impact of Implementation Choices
[Figure: energy efficiency (in MOPS/mW) versus flexibility (or application scope). Energy-efficiency bands: 0.1-1, 1-10, 10-100, 100-1000. Flexibility axis: none, somewhat flexible, fully flexible. Implementation styles shown: hardwired custom; configurable/parameterizable; domain-specific processor (e.g. DSP); embedded microprocessor.]
Design Methodology
• Design process traverses iteratively between three abstractions: behavior, structure, and geometry
• More and more automation for each of these steps
Implementation Choices
Digital Circuit Implementation Approaches
• Custom
• Semicustom
– Cell-based: standard cells, compiled cells, macro cells
– Array-based: pre-diffused (gate arrays), pre-wired (FPGAs)
The Custom Approach
Intel 4004
Courtesy Intel
Transition to Automation and Regular Structures
Intel 4004 ('71), Intel 8080, Intel 8085, Intel 80286, Intel 80486
Courtesy Intel
Standard Cell — Example
[Brodersen92]
Standard Cell – The New Generation
Cell structure hidden under interconnect layers
Standard Cell - Example
3-input NAND cell (from ST Microelectronics):
C = load capacitance
T = input rise/fall time
Automatic Cell Generation
Courtesy Acadabra
[Figure: cell generation steps: initial transistor geometries, placed transistors, routed cell, compacted cell, finished cell.]
A Historical Perspective: the PLA
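[Figure: PLA structure. Inputs x0, x1, x2 feed the AND plane, which generates the product terms; the OR plane combines the product terms into outputs f0, f1.]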
Two-Level Logic
Every logic function can be expressed in sum-of-products format (AND-OR); a product term that contains every input is a minterm.
Inverting format (NOR-NOR) is more effective.
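The equivalence between the AND-OR and NOR-NOR forms can be checked with a small sketch. The function names (`sop`, `nor_nor`) and the majority-function example are illustrative assumptions, not taken from the slides; the NOR-NOR version assumes complemented inputs are available and that an output inverter follows the second NOR plane.

```python
from itertools import product

def nor(*xs):
    """NOR gate: 1 only when every input is 0."""
    return int(not any(xs))

def sop(minterms, n):
    """AND-OR form: output is 1 exactly on the listed minterm indices."""
    def f(*bits):
        terms = []
        for m in minterms:
            # AND plane: literal i is the input or its complement,
            # depending on bit i of the minterm index (MSB = first input).
            lits = [b if (m >> (n - 1 - i)) & 1 else 1 - b
                    for i, b in enumerate(bits)]
            terms.append(int(all(lits)))
        return int(any(terms))  # OR plane
    return f

def nor_nor(minterms, n):
    """Same function in NOR-NOR form (De Morgan applied to both planes)."""
    def f(*bits):
        terms = []
        for m in minterms:
            # A NOR of the complemented literals equals the AND of the literals.
            comp = [1 - b if (m >> (n - 1 - i)) & 1 else b
                    for i, b in enumerate(bits)]
            terms.append(nor(*comp))
        # The second NOR plane yields the complement; an inverter restores f.
        return 1 - nor(*terms)
    return f

# Hypothetical example: 3-input majority, minterms 3, 5, 6, 7.
majority_sop = sop({3, 5, 6, 7}, 3)
majority_nor = nor_nor({3, 5, 6, 7}, 3)
agree = all(majority_sop(*b) == majority_nor(*b)
            for b in product((0, 1), repeat=3))
```

Both forms use the same plane structure; only the gate type in each plane changes, which is why the inverting form maps directly onto the same regular layout.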
PLA Layout – Exploiting Regularity
[Figure: PLA layout with an AND-plane and an OR-plane, pull-up devices on each plane, two input columns each for x0, x1, x2, outputs f0 and f1, and VDD and GND rails.]
Breathing Some New Life in PLAs
River PLAs
• A cascade of multiple-output PLAs.
• Adjacent PLAs are connected via river routing.
• No placement and routing needed.
• Output buffers and the input buffers of the next stage are shared.
[Figure: cascade of PLA stages, each with pre-charge and buffer blocks.]
Courtesy B. Brayton
Experimental Results
Layout of C2670 in four implementations:
• Network of PLAs (NPLA), 4 layers, OTC
• River PLA (RPLA), 2 layers, no additional routing
• Standard cell (SC), 2 layers, channel routing
• Standard cell (SC), 3 layers, OTC
[Figure: delay versus area for SC, NPLA, and RPLA.]
Area: RPLAs (2 layers) 1.23, SCs (3 layers) 1.00, NPLAs (4 layers) 1.31
Delay: RPLAs 1.04, SCs 1.00, NPLAs 1.09
Synthesis time: for RPLAs, synthesis time equals design time; SCs and NPLAs still need P&R.
Also: RPLAs are regular and predictable
MacroModules
256×32 (or 8192 bit) SRAM
Generated by hard-macro module generator
“Soft” MacroModules
Synopsys DesignCompiler
“Intellectual Property”
A Protocol Processor for Wireless
Semicustom Design Flow
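[Figure: semicustom design flow.]
Main flow: HDL (design capture), logic synthesis, floorplanning, placement, routing, tape-out
Verification steps: pre-layout simulation, circuit extraction, post-layout simulation
Abstraction levels: behavioral, structural, physical
Design iteration: results feed back to earlier steps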
The “Design Closure” Problem
Courtesy Synopsys
Iterative Removal of Timing Violations (white lines)
Integrating Synthesis with Physical Design
[Figure: physical synthesis flow. Inputs: RTL, (timing) constraints, and macromodules (fixed netlists). Physical synthesis produces a netlist with place-and-route info; place-and-route optimization turns it into artwork.]
Late-Binding Implementation
Array-based
• Pre-diffused (gate arrays)
• Pre-wired (FPGAs)
Gate Array — Sea-of-gates
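[Figure: gate-array floorplan with rows of uncommitted cells separated by routing channels. Cell detail: VDD and GND rails, polysilicon, metal, and possible contact positions. An uncommitted cell is shown next to a committed cell (4-input NOR with inputs In1, In2, In3, In4 and output Out).]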
Sea-of-gate Primitive Cells
[Figure: primitive cells built from NMOS and PMOS rows, one using oxide isolation and one using gate isolation.]
Sea-of-gates
Random Logic
Memory Subsystem
LSI Logic LEA300K (0.6 µm CMOS)
Courtesy LSI Logic
The return of gate arrays?
[Figure: via-programmable cross-point between metal-5 and metal-6, with a programmable via.]
Via-programmable gate array (VPGA)
[Pileggi02]
Exploits regularity of interconnect
Prewired Arrays
Classification of prewired arrays (or field-programmable devices):
• Based on Programming Technique
– Fuse-based (program-once)
– Non-volatile EPROM based
– RAM based
• Programmable Logic Style
– Array-Based
– Look-up Table
• Programmable Interconnect Style
– Channel-routing
– Mesh networks
Fuse-Based FPGA
[Figure: antifuse cross-section showing antifuse polysilicon, ONO dielectric, and n+ antifuse diffusion; marked dimension 2λ.]
From Smith97
Open by default, closed by applying current pulse
Array-Based Programmable Logic
• PLA: programmable AND array, programmable OR array
• PROM: fixed AND array, programmable OR array
• PAL: programmable AND array, fixed OR array
[Figure: the three array styles drawn with inputs I0, I1, ... and outputs O0-O3; one symbol indicates a programmable connection, another a fixed connection.]
Programming a PROM
[Figure: PROM with inputs X0, X1, X2 and outputs f0 and f1 (two further outputs marked NA); dots mark the programmed nodes.]
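Because the AND array of a PROM is fixed (a full decoder, one line per input combination), programming it amounts to storing a truth table in the OR array. A minimal sketch of that idea follows; the helper names (`program_prom`, `read_prom`) and the two example functions are hypothetical and are not the functions programmed on the slide.

```python
def program_prom(functions, n_inputs):
    """Build PROM contents: one word per address, one bit per output function."""
    rom = []
    for address in range(2 ** n_inputs):
        # The fixed AND array decodes the address; the first input is the MSB.
        bits = [(address >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        # The programmable OR array stores a 1 wherever the output is true.
        rom.append(tuple(int(bool(f(*bits))) for f in functions))
    return rom

def read_prom(rom, *bits):
    """Look up the output word selected by the input bits."""
    address = 0
    for b in bits:
        address = (address << 1) | b
    return rom[address]

# Hypothetical 3-input example: f0 = parity, f1 = majority.
parity = lambda x2, x1, x0: x2 ^ x1 ^ x0
majority = lambda x2, x1, x0: (x2 + x1 + x0) >= 2
rom = program_prom([parity, majority], 3)
```

Every output costs one stored bit per input combination, so the array size grows as 2^n with the number of inputs regardless of how simple the function is.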
2-input mux as programmable logic block
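[Figure: 2-input multiplexer with data input A (selected when S = 0), data input B (selected when S = 1), select S, and output F.]
Configuration (a prime denotes the complement):
A  B  S  | F =
0  0  0  | 0
0  X  1  | X
0  Y  1  | Y
0  Y  X  | XY
X  0  Y  | XY'
Y  0  X  | X'Y
Y  1  X  | X + Y
1  0  X  | X'
1  0  Y  | Y'
1  1  1  | 1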
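The configurations of the multiplexer can be checked by enumeration. The sketch below assumes the usual multiplexer behavior F = A when S = 0 and F = B when S = 1; the helper names (`mux`, `realize`) are illustrative.

```python
from itertools import product

def mux(a, b, s):
    """2-input multiplexer: output a when s is 0, b when s is 1."""
    return b if s else a

def realize(a_src, b_src, s_src):
    """Truth table over (X, Y) in the order 00, 01, 10, 11 for a mux
    whose pins are tied to the constants 0 or 1 or to the signals 'X' or 'Y'."""
    def pick(src, x, y):
        return {0: 0, 1: 1, "X": x, "Y": y}[src]
    return tuple(
        mux(pick(a_src, x, y), pick(b_src, x, y), pick(s_src, x, y))
        for x, y in product((0, 1), repeat=2)
    )

and_table = realize(0, "Y", "X")    # F = XY
or_table = realize("Y", 1, "X")     # F = X + Y
not_x_table = realize(1, 0, "X")    # F = X'
x_and_not_y = realize("X", 0, "Y")  # F = XY'
```

A single multiplexer therefore covers AND, OR, and inversion, which is why it can serve as a small programmable logic block.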
LUT-Based Logic Cell
Courtesy Xilinx
[Figure: Xilinx 4000 series logic cell. Inputs D1-D4, F1-F4, and C1...C4; three blocks each labeled "logic function of ..."; control bits; multiplexers controlled by the configuration program.]
Xilinx 4000 Series
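A look-up table implements logic by storing the truth table and using the inputs as an address. The sketch below assumes a 4-input table (suggested by the F1-F4 inputs in the figure); `make_lut` and `lut_eval` are hypothetical names, and the 4-input XOR is only an example function.

```python
def make_lut(func, k=4):
    """Fill the 2**k configuration bits of a k-input LUT from a Boolean function.
    Input i of the function corresponds to bit i of the address."""
    return [
        int(bool(func(*[(addr >> i) & 1 for i in range(k)])))
        for addr in range(2 ** k)
    ]

def lut_eval(config_bits, *inputs):
    """Evaluate the LUT: the inputs form the address of the stored bit."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return config_bits[addr]

# Example: configure the table as a 4-input XOR.
xor4 = lambda a, b, c, d: a ^ b ^ c ^ d
xor_config = make_lut(xor4)
```

Any function of the four inputs costs the same 16 configuration bits and the same lookup delay, which is the main attraction of this logic style.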
Array-Based Programmable Wiring
[Figure: cells with input/output pins connected to horizontal and vertical tracks; interconnect points mark where a programmed interconnection can be made.]
Mesh-based Interconnect Network
[Figure: mesh with switch boxes, connect boxes, and interconnect points.]
Courtesy DeHon and Wawrzynek
Transistor Implementation of Mesh
Courtesy DeHon and Wawrzynek
Hierarchical Mesh Network
Use overlaid mesh to support longer connections
Reduced fanout and reduced resistance
Courtesy DeHon and Wawrzynek
EPLD Block Diagram
[Figure labels: macrocell, primary inputs.]
Courtesy Altera
Altera MAX
From Smith97
Altera MAX Interconnect Architecture
• Array-based (MAX 3000-7000): LABs (LAB1, LAB2, LAB6 shown) connected through the PIA, with delay tPIA
• Mesh-based (MAX 9000): LABs connected through row channels and column channels
Courtesy Altera
Field-Programmable Gate Arrays
Fuse-based
[Figure: standard-cell-like floorplan. Rows of logic modules separated by routing channels, with vertical routes, surrounded by I/O buffers and a program/test/diagnostics block.]
Xilinx 4000 Interconnect Architecture
[Figure: a CLB and its routing resources: single, double, quad, and long lines, direct connect, global clock, and carry chain, each annotated with its number of tracks.]
Courtesy Xilinx
RAM-based FPGA
Xilinx XC4000ex
Courtesy Xilinx
A Low-Energy FPGA (UC Berkeley)
• Array size: 8x8 (2 x 4 LUT)
• Power supply: 1.5V & 0.8V
• Configuration: mapped as RAM
• Toggle frequency: 125MHz
• Area: 3mm x 3mm
Larger Granularity FPGAs
1-µm 2-metal CMOS tech
1.2 x 1.2 mm2
600k transistors
208-pin PGA
fclock = 50 MHz
Pav = 3.6 W @ 5V
Basic Module: Datapath
PADDI-2 (UC Berkeley)
Design at a crossroad
System-on-a-Chip
[Figure: multi-spectral imager system-on-a-chip. Blocks: RAM; 500 k gates FPGA + 1 Gbit DRAM (preprocessing); µC system + 2 Gbit DRAM (recognition); analog; 64 SIMD processor array + SRAM (image conditioning, 100 GOPS).]
• Embedded applications where cost, performance, and energy are the real issues!
• DSP and control intensive
• Mixed-mode
• Combines programmable and application-specific modules
• Software plays crucial role
Addressing the Design Complexity Issue
Architecture Reuse
Reuse comes in generations:
Generation | Reuse element  | Status
1st        | Standard cells | Well established
2nd        | IP blocks      | Being introduced
3rd        | Architecture   | Emerging
4th        | IC             | Early research
Source: Theo Claasen (Philips) – DAC 00
Architecture ReUse
• Silicon System Platform
– Flexible architecture for hardware and software
– Specific (programmable) components
– Network architecture
– Software modules
– Rules and guidelines for design of HW and SW
• Has been successful in PCs
– Dominance of a few players who specify and control architecture
• Application-domain specific (difference in constraints)
– Speed (compute power)
– Dissipation
– Costs
– Real / non-real time data
Platform-Based Design
• A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
• New platforms will be defined at the architecture-micro-architecture boundary
• They will be component-based, and will provide a range of choices from structured-custom to fully programmable implementations
• Key to such approaches is the representation of communication in the platform model
“Only the consumer gets freedom of choice; designers need freedom from choice”
(Orfali, et al, 1996, p. 522)
Source: R. Newton
Berkeley Pleiades Processor
• 0.25 µm 6-level metal CMOS
• 5.2mm x 6.7mm
• 1.2 million transistors
• 40 MHz at 1V
• 2 extra supplies: 0.4V, 1.5V
• 1.5~2 mW power dissipation
[Die photo labels: interface, reconfigurable data-path, FPGA, ARM8 core.]
Heterogeneous Programmable Platforms
Xilinx Virtex-II Pro
Courtesy Xilinx
[Figure labels: high-speed I/O, embedded PowerPC, embedded memories, hardwired multipliers, FPGA fabric.]
Summary
• Digital CMOS design is kicking and healthy
• Some major challenges down the road caused by deep sub-micron
– Super GHz design
– Power consumption!!!!
– Reliability – making it work
• Some new circuit solutions are bound to emerge
• Who can afford design in the years to come? Some major design methodology change in the making!