c 411 l 20 Multiplier

Date post: 26-Nov-2015
Upload: nagarjuncherukupalli
Sp09 CMPEN 411 L20 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 20: Multiplier Design [Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
Review: Basic Building Blocks
Multiplexers, decoders
Interconnect
The Binary Multiplication
Multiply Operation
[Figure: multiplicand, multiplier, and the partial product array]
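To make the figure labels concrete, here is a minimal sketch of how the partial product array can be formed for unsigned operands. The function name `partial_product_rows` and the 4-bit operands are illustrative assumptions, not taken from the slides.

```python
def partial_product_rows(multiplicand: int, multiplier: int, n: int) -> list[int]:
    """Return the n rows of the partial product array for unsigned n-bit operands.

    Row i is the multiplicand ANDed with multiplier bit i, shifted left by i.
    """
    rows = []
    for i in range(n):
        bit = (multiplier >> i) & 1          # multiplier bit i selects the row
        rows.append((multiplicand if bit else 0) << i)
    return rows


if __name__ == "__main__":
    n = 4
    rows = partial_product_rows(0b1101, 0b1011, n)   # 13 x 11
    for r in rows:
        print(f"{r:0{2 * n}b}")
    print("product:", sum(rows))                     # adding all rows gives the product
```

Summing the rows gives the double-width product; the rest of the lecture is about how to do that summation quickly.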
Multiplication Approaches
Right shift and add
Partial product array rows are accumulated from top to bottom on an N-bit adder
After each addition, right shift (by one bit) the accumulated partial product to align it with the next row to add
Time for N bits: $T_{serial\_mult} = O(N \cdot T_{adder}) = O(N^2)$ for an RCA
Making it faster
Use multiplier recoding to simplify multiple formation (Booth)
Form the partial product array in parallel and add it in parallel
Making it smaller (i.e., slower)
Use a serial-parallel multiplier
Use an array multiplier
Very regular structure with only short wires to nearest-neighbor cells; thus, a very simple and efficient layout in VLSI. Can be easily and efficiently pipelined.
The right-shift approach is (almost) always used because the left-shift approach requires a 2N-bit adder
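The right-shift-and-add scheme described on this slide can be sketched as follows. This is a behavioral model under stated assumptions (unsigned operands, a hypothetical function name `shift_add_multiply`, and a register pair `high`/`low` that is not named in the slides), not a description of any particular circuit.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned n-bit multiply using an n-bit add followed by a 1-bit right shift.

    `high` holds the accumulated partial product (n bits plus the adder carry),
    `low` starts as the multiplier and collects finished product bits.
    """
    mask = (1 << n) - 1
    assert 0 <= multiplicand <= mask and 0 <= multiplier <= mask
    high, low = 0, multiplier
    for _ in range(n):
        if low & 1:                      # current multiplier bit selects the row
            high += multiplicand         # one pass through the n-bit adder
        # shift the (high, low) pair right by one bit to align with the next row
        low = (low >> 1) | ((high & 1) << (n - 1))
        high >>= 1
    return (high << n) | low


if __name__ == "__main__":
    print(shift_add_multiply(13, 11, 4))
```

Each of the N iterations uses the adder once, which is where the O(N T_adder) time on the slide comes from.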
Serial-parallel multiplier structure
One simple, small implementation is the serial-parallel multiplier, so called because the n-bit multiplier is fed in serially while the m-bit multiplicand is held in parallel.
The Array Multiplier
The MxN Array Multiplier: Critical Path
Critical Path 1 & 2
Carry-Save Multiplier
Multiplier Floorplan
Booth multiplier
Encoding scheme to reduce number of stages in multiplication.
Performs two bits of multiplication at once, so it requires half the stages.
Each stage is slightly more complex than in a simple multiplier, but an adder/subtractor is almost as small and fast as an adder.
Booth encoding
$y = -2^n y_n + 2^{n-1} y_{n-1} + 2^{n-2} y_{n-2} + \dots$ (the first bit, $y_n$, is the sign bit)
(Example: $y = 18 = 010010$, $y = -18 = 101110$)
Rewrite using $2^a = 2^{a+1} - 2^a$:
$y = 2^n (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots$
Consider the first two terms: by looking at three bits of y, we can determine whether to add or subtract x or 2x (or do nothing) when updating the partial product.
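As a worked step (not on the original slide), the first two terms of the rewritten sum can be grouped under a common factor of $2^{n-1}$, which shows why three adjacent bits are enough to select the multiple of x:

```latex
\begin{align*}
2^n (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1})
  &= 2^{n-1} \bigl( 2 y_{n-1} - 2 y_n + y_{n-2} - y_{n-1} \bigr) \\
  &= 2^{n-1} \bigl( -2 y_n + y_{n-1} + y_{n-2} \bigr)
\end{align*}
```

Because each bit is 0 or 1, the bracketed factor takes only the values -2, -1, 0, 1, 2.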
Booth actions
$y = 2^n (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots$
Consider the first two terms: by looking at three bits of y, we can determine whether to add or subtract x or 2x (or do nothing) when updating the partial product.
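The action table for this slide did not survive in the transcript. The sketch below regenerates it from the first two terms of the sum, which combine to -2·y_hi + y_mid + y_lo for three adjacent bits. The names `booth_digit`, `ACTIONS`, and `booth_action_table` are illustrative, not from the slides.

```python
def booth_digit(y_hi: int, y_mid: int, y_lo: int) -> int:
    """Signed digit selected by three adjacent multiplier bits (high bit first)."""
    return -2 * y_hi + y_mid + y_lo


# Human-readable action for each possible digit value.
ACTIONS = {0: "add 0", 1: "add x", 2: "add 2x", -1: "subtract x", -2: "subtract 2x"}


def booth_action_table() -> dict[str, str]:
    """Map every 3-bit pattern (as a string, high bit first) to its action."""
    table = {}
    for bits in range(8):
        y_hi, y_mid, y_lo = (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
        table[f"{bits:03b}"] = ACTIONS[booth_digit(y_hi, y_mid, y_lo)]
    return table


if __name__ == "__main__":
    for pattern, action in booth_action_table().items():
        print(pattern, "->", action)
```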
Booth example
P0 = 00000000
x shifted left by 2 bits becomes 100100
y3y2y1 = 011, P2 = P1 + (10 × 100100) =
11110111 + 01001000 = 00111111 ($63_{10}$; the carry out of the 8-bit sum is discarded)
An array multiplier needs N additions; a Booth multiplier needs only N/2 additions
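The example can be replayed with a short behavioral sketch. The operand values x = 1001 (9) and y = 0111 (7) are not stated in the surviving transcript; they are inferred here from the shifted value 100100, the intermediate 11110111, and the result 63, so treat them as an assumption. The function name `booth_multiply` and its parameters are likewise illustrative.

```python
def booth_multiply(x: int, y: int, n: int, width: int) -> list[int]:
    """Two-bits-at-a-time Booth multiply; returns the trace [P0, P1, ...].

    y is an n-bit two's-complement bit pattern (n even), x is a non-negative
    multiplicand, and every P value is kept as a `width`-bit pattern.
    """
    mask = (1 << width) - 1
    p = 0
    trace = [p]
    y_lo = 0                                  # implicit bit y_{-1} = 0
    for k in range(0, n, 2):
        y_mid = (y >> k) & 1
        y_hi = (y >> (k + 1)) & 1
        digit = -2 * y_hi + y_mid + y_lo      # one of -2, -1, 0, 1, 2
        p = (p + digit * (x << k)) & mask     # add or subtract the shifted multiple
        trace.append(p)
        y_lo = y_hi                           # adjacent groups overlap by one bit
    return trace


if __name__ == "__main__":
    for i, p in enumerate(booth_multiply(0b1001, 0b0111, n=4, width=8)):
        print(f"P{i} = {p:08b}")
```

Only n/2 additions are performed (two for this 4-bit multiplier), which matches the N/2 figure on the slide.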
Review: A 64-bit Adder/Subtractor
Ripple Carry Adder (RCA) built out of 64 FAs
Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in to 1
RCA
advantage: simple logic, so small (low cost)
disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
[Figure: 64-bit ripple-carry adder with inputs A0/B0 through A63/B63 and an add/subt control]
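A bit-level sketch of the adder/subtractor reviewed here, assuming a single control bit that both drives the XOR gates on the B inputs and supplies the low-order carry-in. The names `full_adder` and `ripple_add_sub` are illustrative.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One FA cell: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add_sub(a: int, b: int, subtract: bool, n: int = 64) -> tuple[int, int]:
    """n-bit ripple-carry add (a + b) or subtract (a - b); returns (result, carry_out)."""
    ctrl = 1 if subtract else 0
    carry = ctrl                          # low-order carry-in is 1 when subtracting
    result = 0
    for i in range(n):                    # the carry ripples through n FAs: O(n) delay
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ ctrl       # XOR gate complements b when subtracting
        s, carry = full_adder(a_i, b_i, carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    print(ripple_add_sub(5, 3, subtract=False, n=8))
    print(ripple_add_sub(5, 3, subtract=True, n=8))
```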
Booth structure
Wallace-Tree Multiplier
Wallace-Tree Multiplier
Making it Faster: Tree Multiplier Structure
[Figure labels: mux, P (product), Q (multiplier), D (multiplicand)]
(4,2) Counter
Built out of two (3,2) counters (just FA’s!)
all of the inputs (4 external plus one internal) have the same weight (i.e., are in the same bit position)
the internal carry output is fed to the next higher weight position (indicated in the figure)
[Figure: two (3,2) counters arranged as a balanced delay tree, 2 CSA delays total]
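A minimal sketch of a (4,2) counter built from two full adders, as the slide describes. The port names (`x1`..`x4`, `cin`, `cout_internal`) are assumptions chosen for illustration.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """(3,2) counter: three equal-weight bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """(4,2) counter from two (3,2) counters.

    Returns (sum, carry, cout_internal). `sum` has the same weight as the
    inputs; `carry` and `cout_internal` both have twice that weight.
    """
    s1, cout_internal = full_adder(x1, x2, x3)   # first FA: independent of cin
    s, carry = full_adder(s1, x4, cin)           # second FA absorbs x4 and cin
    return s, carry, cout_internal


if __name__ == "__main__":
    print(counter_4_2(1, 1, 1, 1, 1))
```

Note that `cout_internal` depends only on `x1`, `x2`, `x3`, so the internal carry never ripples further than one position.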
Tiling (4,2) Counters
Tiles with neighboring (4,2) counters
Internal carry in at same “level” (i.e., bit position weight) as the internal carry out
[Figure: three neighboring (4,2) counters, each built from two (3,2) counters]
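The tiling can be sketched as one row of (4,2) counters, one per bit position, that reduces four operands to two. This is an illustrative model: the function names are hypothetical, and for simplicity the carries leaving the most significant position are added into the carry word at weight 2^width rather than modeled as extra wires.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """(3,2) counter: three equal-weight bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_4_2(x1, x2, x3, x4, cin):
    """(4,2) counter from two (3,2) counters; returns (sum, carry, cout_internal)."""
    s1, cout_internal = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout_internal


def reduce_4_to_2(a: int, b: int, c: int, d: int, width: int) -> tuple[int, int]:
    """Reduce four `width`-bit operands to a (sum_word, carry_word) pair."""
    sum_word, carry_word, cin = 0, 0, 0
    for i in range(width):
        bits = [(v >> i) & 1 for v in (a, b, c, d)]
        s, carry, cout = counter_4_2(*bits, cin)
        sum_word |= s << i
        carry_word += carry << (i + 1)    # external carry: next higher weight
        cin = cout                        # internal carry: sideways to the neighbor tile
    carry_word += cin << width            # internal carry out of the top tile
    return sum_word, carry_word


if __name__ == "__main__":
    s, c = reduce_4_to_2(5, 6, 7, 3, width=3)
    print(s, c, s + c)
```

A single carry-propagate addition of the two returned words then gives the sum of all four operands.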
[Figure: multiplicand, multiplier, partial product array, and double precision product; five (4,2) counters reduce the array, with the result going to a 13-bit fast CPA]
Completely populating the array costs more in terms of (4,2) counters, but the CPA doesn’t have to be as wide, so the multiplier is faster and the reduction tree is more “regular”
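A rough behavioral sketch of the tree idea: reduce the partial product rows four at a time (each group through two carry-save stages, i.e. one level of (4,2) counters) until two rows remain, then finish with a single carry-propagate addition. It works on whole Python integers rather than individual cells, and the names `csa` and `tree_multiply` are assumptions for illustration.

```python
def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """Word-level carry-save adder: a + b + c == sum_word + carry_word."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def tree_multiply(x: int, y: int, n: int) -> tuple[int, int]:
    """Unsigned n-bit multiply; returns (product, number_of_reduction_levels)."""
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]
    levels = 0
    while len(rows) > 2:
        nxt = []
        for i in range(0, len(rows), 4):
            group = rows[i:i + 4]
            if len(group) == 4:               # (4,2) level: two chained CSAs
                s, c = csa(group[0], group[1], group[2])
                nxt.extend(csa(s, c, group[3]))
            elif len(group) == 3:             # leftover rows: a single (3,2) level
                nxt.extend(csa(*group))
            else:                             # one or two rows pass through unchanged
                nxt.extend(group)
        rows = nxt
        levels += 1
    return sum(rows), levels                  # final sum stands in for the fast CPA


if __name__ == "__main__":
    print(tree_multiply(200, 123, 8))
```

For an 8-bit multiplier this takes two reduction levels (8 rows to 4 to 2), compared with N sequential additions in the shift-and-add approach.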
An 8x8 Multiplier Layout
[Figure labels: multiplicand, multiplier]
Why Not Recode ?
Multiplier recoding (modified Booth’s, canonical, …) recodes the multiplier to allow base-4 multiplication with simple multiple formation
With recoding, the base-4 multiplier digit set is -2, -1, 0, 1, 2
Thus, with recoding the initial partial product array is only N/2 high
[Figure labels: N, 2N, N/2]
But the first level of (4,2) counters also reduces the partial product array to N/2 high
Which is better depends on the logic delay (recoding wins) and interconnect complexity (counters win big)
Hitachi 54X54b Multiplier
Hitachi Multiplier: Booth encoder and PPG
Hitachi multiplier: 4-2 compressor
What is the state of the art?
ISSCC 2003
Multipliers: Summary
Next Lecture and Reminders
[Figure: array of HA and FA cells]
through the complete array.
• Once Again: Identify Critical Path
• Other possible techniques
- Data encoding (Booth)