Date posted: 20-Dec-2015
VLSI Arithmetic: Adders
Prof. Vojin G. Oklobdzija
University of California
http://www.ece.ucdavis.edu/acsel
Oklobdzija 2004 Computer Arithmetic 2
Introduction
• Digital computer arithmetic belongs to computer architecture, but it is also an aspect of logic design.
• The objective of computer arithmetic is to develop algorithms that use the available hardware as efficiently as possible.
• Speed, power, and chip area are the measures used most often, which ties the algorithms closely to the implementation technology.
Basic Operations
• Addition
• Multiplication
• Multiply-Add
• Division
• Evaluation of Functions
• Multi-Media
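Addition of Binary Numbers
Full Adder. The full adder is the fundamental building block of most arithmetic circuits.
[Figure: full adder block with inputs a_i, b_i, C_in and outputs s_i, C_out.]
The sum and carry outputs are described as:
$c_{i+1} = a_i b_i \bar{c}_i + a_i \bar{b}_i c_i + \bar{a}_i b_i c_i + a_i b_i c_i = a_i b_i + a_i c_i + b_i c_i$
$s_i = \bar{a}_i \bar{b}_i c_i + \bar{a}_i b_i \bar{c}_i + a_i \bar{b}_i \bar{c}_i + a_i b_i c_i$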
Addition of Binary Numbers
Inputs            Outputs
c_i  a_i  b_i     s_i  c_{i+1}
 0    0    0       0     0
 0    0    1       1     0       Propagate
 0    1    0       1     0       Propagate
 0    1    1       0     1       Generate
 1    0    0       1     0
 1    0    1       0     1       Propagate
 1    1    0       0     1       Propagate
 1    1    1       1     1       Generate
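The truth table can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `full_adder` and the dictionary `table` are not from the slides, and the carry is written in its majority form.

```python
from itertools import product

def full_adder(a, b, c):
    """Return (s, c_out) for one-bit operands a, b and carry-in c."""
    s = a ^ b ^ c                        # sum is 1 for an odd number of ones
    c_out = (a & b) | (a & c) | (b & c)  # carry is the majority of the inputs
    return s, c_out

# Rebuild the table, keyed by (c_i, a_i, b_i) to match the column order.
table = {(c, a, b): full_adder(a, b, c) for c, a, b in product((0, 1), repeat=3)}

for (c, a, b), (s, c_out) in table.items():
    assert a + b + c == 2 * c_out + s    # agrees with integer addition
    print(c, a, b, s, c_out)
```

Each printed row lists c_i, a_i, b_i, s_i, c_{i+1} in that order.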
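Full-Adder Implementation
Full adder operation is defined by the equations:
$s_i = \bar{a}_i \bar{b}_i c_i + \bar{a}_i b_i \bar{c}_i + a_i \bar{b}_i \bar{c}_i + a_i b_i c_i = a_i \oplus b_i \oplus c_i = p_i \oplus c_i$
$c_{i+1} = a_i b_i + a_i \bar{b}_i c_i + \bar{a}_i b_i c_i = g_i + p_i c_i$
Carry-propagate: $p_i = a_i \oplus b_i$, and carry-generate: $g_i = a_i b_i$
A one-bit adder could be implemented as shown:
[Figure: one-bit adder with inputs a_i, b_i, c_in and outputs s_i, c_out.]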
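High-Speed Addition
$s_i = p_i \oplus c_i$
$c_{i+1} = g_i + p_i c_i$
where $p_i = a_i \oplus b_i$ and $g_i = a_i b_i$.
A one-bit adder could be implemented more efficiently with a multiplexer, because the MUX is faster.
[Figure: one-bit adder with inputs a_i, b_i, c_in, a multiplexer (data inputs 0 and 1, select s) driving c_out, and output s_i.]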
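A minimal sketch of the multiplexer-based carry follows. The wiring is an assumption, since the figure is not recoverable from the text: here the propagate signal selects between the incoming carry and a_i. The names `mux` and `one_bit_adder_mux` are illustrative only.

```python
from itertools import product

def mux(sel, in0, in1):
    """2-to-1 multiplexer: in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0

def one_bit_adder_mux(a, b, c_in):
    p = a ^ b                  # carry-propagate
    s = p ^ c_in               # sum
    # p = 1: the incoming carry passes through.
    # p = 0: a equals b, so the carry-out is simply a (which equals g here).
    c_out = mux(p, a, c_in)
    return s, c_out

def matches_generate_propagate_form():
    """Check the MUX form against s = p xor c, c_out = g + p c for all inputs."""
    for a, b, c_in in product((0, 1), repeat=3):
        g, p = a & b, a ^ b
        if one_bit_adder_mux(a, b, c_in) != (p ^ c_in, g | (p & c_in)):
            return False
    return True

print(matches_generate_propagate_form())
```

The check confirms that, under this assumed wiring, the multiplexer output is logically identical to g_i + p_i c_i, so only the circuit realization changes.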
The Ripple-Carry Adder
[Figure: four full adders (FA) in series with inputs A0 B0 ... A3 B3, sums S0 ... S3, and carries C_{i,0}, C_{o,0} (= C_{i,1}), C_{o,1}, C_{o,2}, C_{o,3}.]
Worst-case delay is linear in the number of bits:
$t_d = O(N)$
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
Goal: make the carry path circuit as fast as possible.
From Rabaey
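The ripple-carry structure and its linear delay can be sketched as below. The helper names and the unit values chosen for t_carry and t_sum are assumptions for illustration; bit lists are little-endian.

```python
def to_bits(x, n):
    """Little-endian list of the n low-order bits of x."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Chain one full adder per bit; the carry ripples from LSB to MSB."""
    s_bits, c = [], c_in
    for a, b in zip(a_bits, b_bits):
        s_bits.append(a ^ b ^ c)
        c = (a & b) | ((a ^ b) & c)
    return s_bits, c

def worst_case_delay(n, t_carry=1.0, t_sum=1.0):
    """t_adder = (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum

s_bits, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s_bits), c_out)   # 11 + 6 = 17 = 1 + 16 * carry
print(worst_case_delay(4), worst_case_delay(32))
```

With unit delays, going from 4 to 32 bits raises the worst-case delay from 4 to 32 units, which is the linear growth the slide refers to.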
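Inversion Property
[Figure: a full adder (FA) with inputs A, B, C_i and outputs S, C_o is equivalent to the same cell with all inputs and outputs inverted.]
$\bar{S}(A, B, C_i) = S(\bar{A}, \bar{B}, \bar{C}_i)$
$\bar{C}_o(A, B, C_i) = C_o(\bar{A}, \bar{B}, \bar{C}_i)$
From Rabaey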
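The inversion property is easy to confirm by exhaustive enumeration. The sketch below is illustrative only; `full_adder`, `inv`, and `inversion_property_holds` are names chosen here.

```python
from itertools import product

def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def inv(x):
    """Logical inversion of a single bit."""
    return 1 - x

def inversion_property_holds():
    """Inverting all inputs of a full adder inverts both outputs."""
    for a, b, c in product((0, 1), repeat=3):
        s, c_out = full_adder(a, b, c)
        s_inv, c_out_inv = full_adder(inv(a), inv(b), inv(c))
        if (s_inv, c_out_inv) != (inv(s), inv(c_out)):
            return False
    return True

print(inversion_property_holds())
```

Because the property holds for every input combination, inverters between stages can be dropped as long as alternate cells work on inverted signals.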
Minimize Critical Path by Reducing Inverting Stages
[Figure: 4-bit ripple-carry adder built from FA' cells, alternating even cells and odd cells, with inputs A0 B0 ... A3 B3, sums S0 ... S3, and carries C_{i,0}, C_{o,0} ... C_{o,3}.]
Exploit the inversion property.
Note: need 2 different types of cells.
From Rabaey
Ripple Carry Adder
Carry chain of an RCA implemented using a multiplexer from the standard cell library:
[Figure: three bit slices with inputs a_i b_i, a_{i+1} b_{i+1}, a_{i+2} b_{i+2} and sums s_i, s_{i+1}, s_{i+2}; the critical path runs from c_in through c_i and c_{i+1} to c_out.]
Oklobdzija, ISCAS’88
Manchester Carry-Chain Realization of the Carry Path
• Simple and very popular scheme for implementing the carry signal path
[Figure: Manchester carry chain between Carry in and Carry out, with a propagate device, a generate device, and a predischarge & kill device per stage, connected to V_dd.]
Original Design
T. Kilburn, D. B. G. Edwards, D. Aspinall, "Parallel Addition in Digital Computers: A New Fast 'Carry' Circuit", Proceedings of IEE, Vol. 106, pt. B, p. 464, September 1959.
Carry-Skip Adder
MacSorley, Proc. IRE, 1/61
Lehman, Burla, IRE Trans. on Comp., 12/61
Carry-Skip Adder
[Figure: two 4-bit chains of FA cells with propagate and generate signals P_0 G_0 ... P_3 G_3 and carries C_{i,0}, C_{o,0} ... C_{o,3}; in the second chain a multiplexer controlled by the bypass signal BP selects the carry-out.]
Bypass: $BP = P_0 P_1 P_2 P_3$
Idea: if ($P_0$ and $P_1$ and $P_2$ and $P_3$) = 1 then $C_{o,3} = C_{i,0}$, else “kill” or “generate”.
From Rabaey
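The bypass idea can be sketched for a single 4-bit block as follows. This is a behavioral model under assumed names (`carry_skip_block`, `to_bits`, `from_bits`), not the circuit from the figure; bit lists are little-endian.

```python
from itertools import product

def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def carry_skip_block(a_bits, b_bits, c_in):
    """One carry-skip block: ripple inside, bypass multiplexer at the output."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    s_bits, c = [], c_in
    for p_i, g_i in zip(p, g):
        s_bits.append(p_i ^ c)
        c = g_i | (p_i & c)          # rippled carry
    bp = int(all(p))                 # BP = P0 P1 P2 P3
    c_out = c_in if bp else c        # multiplexer: skip when every bit propagates
    return s_bits, c_out, bp

# Exhaustive check of the 4-bit block against integer addition.
for a, b, c_in in product(range(16), range(16), (0, 1)):
    s_bits, c_out, _ = carry_skip_block(to_bits(a, 4), to_bits(b, 4), c_in)
    assert from_bits(s_bits) + 16 * c_out == a + b + c_in
```

The bypass never changes the logical result, since the rippled carry also equals the carry-in when all bits propagate. Its benefit is timing: the carry-in reaches the block output through one multiplexer instead of four carry stages.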
Carry-Skip Adder: N bits, k bits/group, r = N/k groups
[Figure: N-bit carry-skip adder divided into groups of k bits, with inputs a_0 b_0 ... a_{N-1} b_{N-1}, sums S_0 ... S_{N-1}, group propagate signals P_0 ... P_{r-1}, and AND-OR skip logic along the carry path from C_in to C_out.]
Critical path delay = 2(k-1) + (N/k - 2): the carry ripples through k-1 bits in the first and last groups and skips the r-2 groups in between.
Carry-Skip Adder
$t_d = 2(k-1)\,t_{RCA} + \left(\frac{N}{k} - 2\right) t_{SKIP}$
[Figure: delay t_p versus N for the ripple adder and the bypass adder; the curves cross at N = 4..8.]
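As a worked step, the group size that minimizes the delay can be derived by treating k as a continuous variable. The derivation below assumes the delay model $t_d = 2(k-1)\,t_{RCA} + (N/k - 2)\,t_{SKIP}$ with equal-size groups; the symbol $k_{opt}$ is introduced here for illustration.

```latex
\begin{align*}
t_d(k) &= 2(k-1)\,t_{RCA} + \left(\frac{N}{k} - 2\right) t_{SKIP} \\
\frac{d t_d}{d k} &= 2\,t_{RCA} - \frac{N}{k^2}\,t_{SKIP} = 0
  \quad\Longrightarrow\quad
  k_{opt} = \sqrt{\frac{N\,t_{SKIP}}{2\,t_{RCA}}} \\
t_d(k_{opt}) &= 2\sqrt{2N\,t_{RCA}\,t_{SKIP}} - 2\,(t_{RCA} + t_{SKIP})
\end{align*}
```

For N = 32 and unit delays (t_RCA = t_SKIP = 1) this gives k_opt = 4 and a delay of 12, so under this model the delay of a fixed-group carry-skip adder grows with the square root of N.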
Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
[Figure: carry chain divided into groups G_0 ... G_m with group propagate signals P_0 ... P_m, inputs a_0 b_0 ... a_{N-1} b_{N-1}, and sums S_0 ... S_{N-1}; the carry signal path from C_in to C_out alternates between rippling and skipping.]
Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
[Figure: 32-bit carry chain with variable group sizes; the group-size labels are 1, 1, 3, 3, 4, 4, 5, 5, 6 (32 bits in total).]
Any-point-to-any-point delay = 9, as compared to 12 for CSKA.
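The comparison can be reproduced with a simple unit-delay model. Everything in this sketch is an assumption made for illustration: one delay unit per rippled bit and per skipped group, a symmetric ordering of the group sizes (the order is not recoverable from the figure labels), and the function name `worst_case_carry_delay`.

```python
def worst_case_carry_delay(sizes):
    """Worst any-point-to-any-point carry delay for the given group sizes.

    Unit-delay model: a carry generated at the LSB of group i ripples to the
    end of that group, skips the groups in between, and ripples to the MSB
    of group j.
    """
    worst = max(size - 1 for size in sizes)      # ripple inside a single group
    for i in range(len(sizes)):
        for j in range(i + 1, len(sizes)):
            ripple_out = sizes[i] - 1            # leave group i
            skips = j - i - 1                    # groups bypassed in between
            ripple_in = sizes[j] - 1             # reach the MSB of group j
            worst = max(worst, ripple_out + skips + ripple_in)
    return worst

uniform = [4] * 8                                # fixed-size carry-skip, k = 4
variable = [1, 3, 4, 5, 6, 5, 4, 3, 1]           # assumed symmetric ordering

print(sum(uniform), worst_case_carry_delay(uniform))
print(sum(variable), worst_case_carry_delay(variable))
```

Under these assumptions the 32-bit fixed-size adder has a worst-case delay of 12 units and the variable-size arrangement 9 units, matching the figures quoted on the slide. Small groups at the ends shorten the ripple portions of the longest paths.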
Delay Calculation for Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
[Figure: 4-bit carry-skip stage with signals C_{i,0}, P_0 ... P_3, G_0 ... G_3, bypass BP, and C_{o,3}.]
Delay model:
Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Variable Group Length
Oklobdzija, Barnes, Arith’85
[Equation: delay t_d as a function of N with constants c_1, c_2, c_3.]
Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Variable Block Lengths
• No closed-form solution for delay
• It is a dynamic programming problem
Delay Comparison: Variable Block Adder
[Figure: delay (axis 0 to 16) versus size N (axis 4 to 60) for CLA, VBA, and multi-level VBA.]
VLSI Arithmetic: Lecture 4
Carry-Lookahead Adder (Weinberger and Smith, 1958)
Ref: A. Weinberger and J. L. Smith, “A Logic for High-Speed Addition”, National Bureau of Standards, Circ. 591, pp. 3-12, 1958.
ARITH-13: Presenting Achievement Award to Arnold Weinberger of IBM (who invented the CLA adder in 1958)
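CLA Definitions: One-bit adder
$s_i = p_i \oplus c_i$
$c_{i+1} = g_i + p_i c_i$
where $p_i = a_i \oplus b_i$ and $g_i = a_i b_i$.
[Figure: one-bit adder with inputs a_i, b_i, c_in, a multiplexer (data inputs 0 and 1, select s) driving c_out, and output s_i.]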
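CLA Definitions: 4-bit Adder
[Figure: four one-bit cells with inputs a_i b_i ... a_{i+3} b_{i+3}, each producing g and p, with carries C_i, C_{i+1}, C_{i+2}, C_{i+3}, C_{i+4}.]
$c_{i+1} = a_i b_i + a_i \bar{b}_i c_i + \bar{a}_i b_i c_i = g_i + p_i c_i$
$c_{i+2} = g_{i+1} + p_{i+1} c_{i+1} = g_{i+1} + p_{i+1}(g_i + p_i c_i) = g_{i+1} + p_{i+1} g_i + p_{i+1} p_i c_i$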
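Carry-Lookahead Adder: 4-bits
[Figure: four one-bit cells with inputs a_i b_i ... a_{i+3} b_{i+3}, each producing g and p, with carries C_i, C_{i+1}, C_{i+2}, C_{i+3}, C_{i+4}.]
$c_{i+3} = g_{i+2} + p_{i+2} c_{i+2} = g_{i+2} + p_{i+2}(g_{i+1} + p_{i+1} g_i + p_{i+1} p_i c_i) = g_{i+2} + p_{i+2} g_{i+1} + p_{i+2} p_{i+1} g_i + p_{i+2} p_{i+1} p_i c_i$
$c_{i+4} = g_{i+3} + p_{i+3} c_{i+3} = \underbrace{g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i}_{G_j} + \underbrace{p_{i+3} p_{i+2} p_{i+1} p_i}_{P_j}\, c_i$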
Carry-Lookahead Adder
$G_j = g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i$
$P_j = p_{i+3} p_{i+2} p_{i+1} p_i$
$c_{4(j+1)} = G_j + P_j c_{4j}$
One gate delay to calculate p, g.
One to calculate P and two for G.
Three gate delays to calculate C_{4(j+1)}.
Compare that to 8 in RCA!
[Figure: 4-bit P, G group with inputs a_i b_i ... a_{i+3} b_{i+3}, bit-level g and p signals, carry-in C_{4j}, internal carries C_{4j+1}, C_{4j+2}, C_{4j+3}, carry-out C_{4(j+1)}, and group outputs G_j, P_j.]
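The lookahead equations for one 4-bit group can be sketched and checked exhaustively as follows. The function names are illustrative, bit lists are little-endian, and the code models only the logic, not gate delays.

```python
from itertools import product

def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]

def cla_group(a_bits, b_bits, c_in):
    """4-bit lookahead group; returns (sum_bits, c_out, G, P)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    # Every carry is written directly in terms of c_in (no rippling).
    c1 = g[0] | (p[0] & c_in)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    c4 = G | (P & c_in)                      # c_{4(j+1)} = G_j + P_j c_{4j}
    s_bits = [p[0] ^ c_in, p[1] ^ c1, p[2] ^ c2, p[3] ^ c3]
    return s_bits, c4, G, P

# Exhaustive check against integer addition.
for a, b, c_in in product(range(16), range(16), (0, 1)):
    s_bits, c4, G, P = cla_group(to_bits(a, 4), to_bits(b, 4), c_in)
    total = sum(bit << i for i, bit in enumerate(s_bits)) + 16 * c4
    assert total == a + b + c_in
```

G and P depend only on the operand bits, so they are available before the group's carry-in arrives. That is what allows the carry-out to be formed in a fixed number of gate levels.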
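Carry-Lookahead Adder (Weinberger and Smith)
$G^*_j = G_{i+3} + P_{i+3} G_{i+2} + P_{i+3} P_{i+2} G_{i+1} + P_{i+3} P_{i+2} P_{i+1} G_i$
$P^*_j = P_{i+3} P_{i+2} P_{i+1} P_i$
$c_{4(j+1)} = G^*_k + P^*_k c_{4j}$
[Figure: second-level lookahead block taking G_j, P_j ... G_{j+3}, P_{j+3} and C_{4j}, and producing G*, P*, C_{4j+1}, C_{4j+2}, C_{4j+3}, and C_{4(j+1)}.]
Additional two gate delays.
C_16 will take a total of 5 gate delays vs. 32 for RCA!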
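A sketch of the two-level scheme for a 16-bit adder follows: the same lookahead function is applied first to bit-level (g, p) pairs and then to group-level (G, P) pairs. The names `lookahead` and `carry16` are assumptions for illustration, and the check uses random samples rather than an exhaustive sweep.

```python
import random

def lookahead(g, p):
    """Combine four (generate, propagate) pairs, index 0 least significant."""
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return G, P

def carry16(x, y, c0):
    """Carry out of bit 15, computed with two levels of lookahead."""
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(16)]
    g = [((x >> i) & (y >> i)) & 1 for i in range(16)]
    # First level: one (G, P) pair per 4-bit group.
    groups = [lookahead(g[k:k + 4], p[k:k + 4]) for k in range(0, 16, 4)]
    # Second level: the same equations applied to the group signals.
    G_star, P_star = lookahead([G for G, _ in groups], [P for _, P in groups])
    return G_star | (P_star & c0)

rng = random.Random(0)
for _ in range(2000):
    x, y, c0 = rng.randrange(1 << 16), rng.randrange(1 << 16), rng.randrange(2)
    assert carry16(x, y, c0) == (x + y + c0) >> 16
```

Reusing one function at both levels mirrors the point of the slide: the group equations have the same form as the bit-level ones, with capital G and P in place of g and p.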
32-bit Carry Lookahead Adder
[Figure: 32-bit CLA with inputs a_i, b_i and carries C_in, C_4, C_8, C_12, C_16, C_20, C_24, C_28, C_out, built from the following levels.]
• Individual adders generating g_i, p_i, and sum S_i
• Carry-lookahead blocks of 4 bits generating G_i, P_i, and C_in for the adders
• Carry-lookahead super-blocks of 4-bit blocks generating G*_i, P*_i, and C_in for the 4-bit blocks
• Group producing final carry C_out and C_16
Critical path delay = 1 (for g_i, p_i) + 2x2 (for G, P) + 3x2 (for C_in) + 1 XOR (for Sum) = approx. 12 units of delay
Carry-Lookahead Adder (Weinberger and Smith: original derivation, 1958)
Carry-Lookahead Adder (Weinberger and Smith)
Please notice the similarity with Parallel-Prefix Adders!
Motorola: CLA Implementation Example
A. Naini, D. Bearden and W. Anderson, “A 4.5nS 96b CMOS Adder Design”,
Proceedings of the IEEE Custom Integrated Circuits Conference, May 3-6, 1992.
Critical path in Motorola's 64-bit CLA
Critical path: A, B - G0 - G3:0 - G15:0 - G47:0 - C48 - C60 - C63 - S63
[Figure: block diagram of the 64-bit CLA built from PG blocks and a carry block. Bit-level signals G0, P0 ... G63, P63 are combined into group signals (for example G3:0, G7:4, G11:8, G15:12, G15:0, G31:16, G47:32, G47:0, G63:48, G63:0, with matching P signals), which produce carries including C0, C4, C8, C12, C16, C32, C48, C52, C56, C60, C61, C62, C63, C64. Timing annotations: 1.05 ns, 1.7 ns, 2.0 ns, 2.35 ns, 2.7 ns, 3.75 ns, 4.8 ns.]
Motorola's 64-bit CLA
Conventional PG block:
• Carry ripples locally
• 5 transistors in the path
• No better situation here!
Basically, this is MCC (Manchester carry-chain) performance with carry-skip. One should not expect results any better than VBA.
Motorola's 64-bit CLA
Modified PG block:
• Intermediate propagate signals P_{i:0} are generated to speed up C_3
• The critical path still resembles MCC
Critical path: A, B - G0 - G3:0 - G15:0 - G47:0 - C48 - C60 - C63 - S63
[Figure: the 64-bit CLA block diagram (PG blocks and carry block) repeated with two sets of timing annotations: 1.05 ns, 1.7 ns, 2.0 ns, 2.35 ns, 2.7 ns, 3.75 ns, 4.8 ns, and 1.8 ns, 2.2 ns, 2.9 ns, 3.2 ns, 3.55 ns, 3.9 ns.]