An IEEE 754-2008 Decimal Parallel and Pipelined FPGA Floating-Point Multiplier
Malte Baesler, Sven-Ole Voigt, Thomas Teufel
Institute for Reliable Computing, Hamburg University of Technology
September 1st, 2010
Decimal Floating-Point Multiplier, M. Baesler, S. Voigt, T. Teufel, 09/01/2010
Agenda
1. Introduction
a) Why Decimal Floating-Point Arithmetic?
b) What are the Requirements on the Decimal Multiplier?
2. Decimal Fixed-Point Multiplier
3. Decimal Floating-Point Multiplier
4. Post Place & Route Results
a) Fixed-Point Multiplier
b) Floating-Point Multiplier
Introduction
Why decimal floating-point arithmetic?
● avoid conversion errors
● human-centric applications
● required for commercial applications, e.g. interest calculation
IEEE Standard 754-2008 for Floating-Point Arithmetic
● published in August 2008
● replaces IEEE 754-1985 and IEEE 854-1987
● binary and decimal floating-point arithmetic
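The "conversion errors" bullet can be made concrete with a small Python sketch. It uses the standard `decimal` module as a software stand-in for decimal hardware; the interest figures are made up for illustration and do not come from the slides.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no finite binary expansion,
# so only nearby binary fractions are stored and the sum misses 0.3.
binary_sum = 0.1 + 0.2
binary_exact = (binary_sum == 0.3)

# Decimal floating point: the same literals are represented exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_exact = (decimal_sum == Decimal("0.3"))

# Hypothetical interest calculation: 5% on 0.70 currency units.
interest = Decimal("0.70") * Decimal("1.05")

print(binary_exact, decimal_exact, interest)
```

The decimal product keeps all four fractional digits, which is the behaviour commercial rounding rules are usually written against.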
Floating-Point Arithmetic
IEEE 754-2008 Floating-Point Arithmetic
decimal64 data format
● radix b = 10
● significand precision p = 16
● exponent range $q_{\min} = -398$, $q_{\max} = 369$
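The exponent range above is stated for an integer significand. As a cross-check, the sketch below sets up a Python `decimal` context with the decimal64 limits for the scientific exponent (emin = −383 and emax = 384; these two values are taken from IEEE 754-2008, not from the slides) and reads back the corresponding quantum exponents.

```python
from decimal import Context

P = 16  # decimal64 significand precision in digits

# emin/emax refer to the exponent of a d.ddd... significand.
ctx = Context(prec=P, Emin=-383, Emax=384)

# For an integer significand the exponent is offset by p - 1 digits:
q_min = ctx.Etiny()  # Emin - (P - 1)
q_max = ctx.Etop()   # Emax - (P - 1)

print(q_min, q_max)
```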
Requirements on the multiplier
● fast
● low resource usage
● IEEE 754-2008 compliant
● pipelined due to reuse in accurate scalar product
→ fully combinational
● optimized for FPGA architecture (Virtex-5)
– internal fast carry chain
– DSP48E slices
Decimal Fixed-Point Multiplier
Fixed-Point Multiplier
How does multiplication work? School method:
● partial product generation
● accumulation of partial products
1234 ⋅ 5678 = 5000⋅1234 + 600⋅1234 + 70⋅1234 + 8⋅1234
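The school method can be sketched in a few lines of Python: one partial product per multiplier digit, followed by accumulation. The function name is hypothetical, and the sketch works on plain integers rather than BCD words.

```python
def school_multiply(a, b):
    """Return the partial products of a*b (one per digit of b) and their sum."""
    partial_products = []
    position = 0
    while b > 0:
        digit = b % 10                                  # current multiplier digit
        partial_products.append(digit * a * 10 ** position)
        b //= 10
        position += 1
    return partial_products, sum(partial_products)

partial_products, product = school_multiply(1234, 5678)
print(partial_products, product)
```

In hardware the partial products are generated in parallel and reduced by an adder tree instead of being summed one after another.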
Fixed-Point Multiplier
● based on concepts of A. Vazquez, E. Antelo, P. Montuschi ¹
● fully combinational
● BCD recoding schemes
● fast partial product generation
● fast BCD-4221 carry save adder reduction tree
¹ "A new family of high-performance parallel decimal multipliers", 18th IEEE Symposium on Computer Arithmetic, June 2007
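Fixed-Point Multiplier
[Block diagram: operands A and B (BCD-8421, p digits each); Decimal Recoding Unit (DRec); Partial Product Generator (PPGen) with partial products P0, P1, ..., Pp+1 (BCD-4221); Carry Save Adder Tree (CSAT) with outputs S_s and S_w (BCD-4221, 2p digits each); Carry Propagation Adder (CPA) with result S (BCD-8421, 2p digits)]
CSAT: Carry Save Adder Tree, CPA: Carry Propagation Adder
PPGen: Partial Product Generator, DRec: Decimal Recoding Unit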
Decimal Recoding
● transforms the multiplier's digit set {0, ..., 9} into {−5, ..., 5}
● reduces number of multiplicand multiples to A×1, A×2, A×3, A×4, A×5
● very fast operation, no ripple carry
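The slide does not spell out the recoding rule. The Python sketch below shows one common signed-digit radix-10 recoding, assumed here purely for illustration: a digit of 5 or more is replaced by the digit minus 10 and hands a transfer of 1 to its left neighbour. The transfer depends only on the digit itself, so nothing ripples. All function names are hypothetical.

```python
def digits_lsb_first(value, p):
    """Split a non-negative integer into p decimal digits, least significant first."""
    return [(value // 10 ** i) % 10 for i in range(p)]

def recode_signed_digits(digits):
    """Map digits from {0..9} to {-5..5}; returns p+1 signed digits."""
    transfers = [1 if d >= 5 else 0 for d in digits]   # decided per digit, no ripple
    recoded = []
    transfer_in = 0
    for d, t in zip(digits, transfers):
        recoded.append(d - 10 * t + transfer_in)
        transfer_in = t
    recoded.append(transfer_in)                        # extra leading digit
    return recoded

def multiply_recoded(a, b, p):
    """Multiply using only the multiples A*1 .. A*5 and their negations."""
    recoded = recode_signed_digits(digits_lsb_first(b, p))
    return sum(d * a * 10 ** i for i, d in enumerate(recoded))

print(recode_signed_digits([8, 7, 6, 5]))  # digits of 5678, LSB first
print(multiply_recoded(1234, 5678, 4))
```

With this rule a p-digit multiplier produces p+1 signed digits, and negative digits select a negated multiple of the multiplicand.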
Partial Product Generator
● calculates multiples A×1, A×2, A×3, A×4, A×5
– exploits correlation between shift operation and constant value multiplication
● $X_{5421} \ll 1 = (X \cdot 2)_{8421}$
● $X_{8421} \ll 3 = (X \cdot 5)_{5421}$
– BCD recoding is fast
– fixed-value shift operation is for free
– A×3 only requires one carry propagate adder
● generates partial products $P_0 \dots P_{p+1}$ by selection of A×1 ... A×5
● 10's complement for $B_k < 0$: $-X_n \dots X_0 = \overline{X_n \dots X_0} + 1$
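The two shift identities can be checked bit by bit. The sketch below packs decimal digits into 4-bit nibbles, using weights 8-4-2-1 or 5-4-2-1, and verifies that a plain left shift of the whole word yields the doubled or quintupled value in the other code. The encoders and decoder are illustrative helpers, not part of the presented design.

```python
W8421 = (8, 4, 2, 1)
W5421 = (5, 4, 2, 1)

def encode_8421(value, ndigits):
    """Pack decimal digits into nibbles, standard BCD-8421."""
    word = 0
    for i in range(ndigits):
        word |= ((value // 10 ** i) % 10) << (4 * i)
    return word

def encode_5421(value, ndigits):
    """Pack decimal digits into nibbles with bit weights 5, 4, 2, 1."""
    word = 0
    for i in range(ndigits):
        d = (value // 10 ** i) % 10
        nibble = (0b1000 | (d - 5)) if d >= 5 else d
        word |= nibble << (4 * i)
    return word

def decode(word, ndigits, weights):
    """Interpret each nibble with the given bit weights (MSB first)."""
    value = 0
    for i in range(ndigits):
        nibble = (word >> (4 * i)) & 0xF
        digit = sum(w for mask, w in zip((8, 4, 2, 1), weights) if nibble & mask)
        value += digit * 10 ** i
    return value

x = 1234
double_x = decode(encode_5421(x, 4) << 1, 5, W8421)  # X_5421 << 1 read as 8421
five_x = decode(encode_8421(x, 4) << 3, 5, W5421)    # X_8421 << 3 read as 5421
print(double_x, five_x)
```

Because the shift amount is fixed, it costs only wiring; the recoding between the two BCD codes is a per-digit operation.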
BCD-4221 Carry Save Adder Tree
[Figure: carry save adder tree sums up the p+1 partial products P1, P2, P3, ..., Pp+1]
[Figure: CSA tree with respect to decimal recoding: partial products P1 ... Pp+1, each with sign extension, and additional inputs C1, C2, ..., Cp]
[Figure: improved CSA tree with respect to decimal recoding: partial products P1 ... Pp+1, inputs C1 ... Cp, and a single improved sign extension]
Improved Sign Extension
● adding several words composed of leading nines and following zeros always yields a word composed of 0, 8, and 9. For example:
999999990000 + 999900000000 + 990000000000 = x989899990000
● position of 0, 8, and 9 can be calculated very fast by means of the FPGA's fast carry chain
$$X_k^{NegDC} = \begin{cases} 9 & \text{for } c_k^{in} = 0 \wedge sign_k = 1 \\ 8 & \text{for } c_k^{in} = 1 \wedge sign_k = 1 \\ 0 & \text{else} \end{cases}$$
$$c_k^{out} = c_{k+1}^{in} = \begin{cases} 1 & \text{for } sign_k = 1 \\ c_k^{in} & \text{else} \end{cases}$$
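The claim about the digits 0, 8, and 9 can be checked numerically. The Python sketch below reproduces the example and then tests random combinations, under the assumption that each word's run of nines starts at a different digit position (as for partial products that are each shifted by one digit) and that the overflow digit x is discarded. It checks only the digit-set property, not the selection formula. Helper names are hypothetical.

```python
import random

def sign_extension_word(start, width):
    """Word of `width` digits: nines from digit position `start` upward, zeros below."""
    return 10 ** width - 10 ** start

def sum_digits(words, width):
    """Sum the words, drop the overflow, return the result as a digit string."""
    return f"{sum(words) % 10 ** width:0{width}d}"

WIDTH = 12
example = [sign_extension_word(s, WIDTH) for s in (4, 8, 10)]
example_sum = sum_digits(example, WIDTH)
print(example_sum)

random.seed(0)  # deterministic trials
only_089 = True
for _ in range(1000):
    starts = random.sample(range(WIDTH), k=random.randint(1, WIDTH))
    result = sum_digits([sign_extension_word(s, WIDTH) for s in starts], WIDTH)
    only_089 = only_089 and set(result) <= {"0", "8", "9"}
print(only_089)
```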
Decimal Floating-Point Multiplier
● additional units for rounding, exponent computation and data format encoding/decoding
● based on M. Erle, B. Hickmann, M. Schulte ²
● early estimation of shift left amount
● fully IEEE 754-2008 compliant
● support for gradual underflow and all rounding modes
● adapted to FPGA technology
² "Decimal Floating-Point Multiplication", IEEE Transactions on Computers, vol. 58, no. 7, July 2009
[Block diagram of the decimal floating-point multiplier: inputs X and Y; Densely Packed Decimal (DPD) Decoder; Leading Zeros Count / Shift Left Amount Computation; Exponent Computation; Decimal Fixed-Point Multiplier; Left Shift Register; Carry Propagate Adder; Overflow / Underflow Correction; Rounding Unit; Round-Up Detection; Exception Unit; DPD Encoder; outputs X•Y and exception signals]
Example operands:
X = 0x03C80000534B9C1E
Y = 0x0250000277CB0D10
X = +0000001234567890 EXP −156
Y = +0000009876543210 EXP −250
X•Y = +12193263111263526900 EXP −406
Z = significand(X•Y)
Z = 00000000000012193263111263526900
Zs = 66888846846688648888664609006600
Zc = 33111153153323544414446654520300
LZ(X) = 6, LZ(Y) = 6, SLA = min(6+6, p) = 12
Z = 1219326311126352.690000000000000
Zs = 8864888866460900.660000000000000
Zc = 2354441444665452.030000000000000
Z' = 1219326311126352, G = 6, R = 9, sb = '0'
exponent = −406 + p − SLA = −402
Z'' = 0000121932631112, G = 6, R = 3, sb = '1'
exponent = −398
round up → Z''' = 0000121932631113 EXP −398
Z = 0x000000285BCCC493
exception signals: invalid, inexact, overflow, underflow
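The example can be replayed in software. The sketch below is a simplified integer model of the steps shown (shift left amount, exponent, gradual underflow, guard/round/sticky, rounding) and compares its result with Python's `decimal` module configured for decimal64. It assumes positive operands and round-half-even, and it ignores overflow and the case where the estimated shift leaves a leading zero; it is not the hardware algorithm itself. Function names are hypothetical.

```python
from decimal import Context, Decimal, Inexact, Underflow, ROUND_HALF_EVEN

P, Q_MIN = 16, -398  # decimal64 precision and smallest quantum exponent

def leading_zeros(coeff):
    """Leading zeros of a coefficient stored in P digits."""
    return P - len(str(coeff)) if coeff else P

def multiply_model(cx, qx, cy, qy):
    z = cx * cy                                   # exact product, up to 2p digits
    sla = min(leading_zeros(cx) + leading_zeros(cy), P)
    z_shifted = z * 10 ** sla                     # left-aligned 2p-digit word
    q = qx + qy + P - sla                         # exponent of the upper p digits
    drop = P + max(0, Q_MIN - q)                  # extra right shift on underflow
    coeff, rem = divmod(z_shifted, 10 ** drop)
    guard = rem // 10 ** (drop - 1)
    round_digit = (rem // 10 ** (drop - 2)) % 10
    sticky = rem % 10 ** (drop - 2) != 0
    half = 5 * 10 ** (drop - 1)
    if rem > half or (rem == half and coeff % 2 == 1):
        coeff += 1                                # round half to even
    return coeff, max(q, Q_MIN), (guard, round_digit, sticky), sla

coeff, q, grs, sla = multiply_model(1234567890, -156, 9876543210, -250)

# Reference: software decimal64 arithmetic.
ctx = Context(prec=P, Emin=-383, Emax=384, rounding=ROUND_HALF_EVEN)
reference = ctx.multiply(Decimal("1234567890E-156"), Decimal("9876543210E-250"))
print(coeff, q, grs, reference)
```

In the reference context the result is subnormal, so the inexact and underflow flags are raised there.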
 | Type 1 | Type 2 | Type 3
fixed-point multiplier output | redundant (delayed CPA) | redundant (delayed CPA) | non-redundant
CPA length (digits) | p+2 = 18 | p+2 = 18 | 2·p = 32
shift register | multiplier based | multiplexer based | multiplexer based
[Figure, delayed CPA (Type 1 and Type 2): decimal fixed-point multiplier with redundant outputs Ps and Pc; shift register with outputs Qsu, Qsl, Qcu, Qcl; CPA (p+2) and CPA (p−2); OR; outputs: product, R, G, sticky bit]
[Figure, no delayed CPA (Type 3): decimal fixed-point multiplier with outputs Ps and Pc; CPA (2·p); shift register; OR; outputs: product, R, G, sticky bit]
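Shifting through multiplication:
● $X \ll n \equiv X \cdot 2^n$
● requires two DSP48Es per 32-bit shift
● saves LUTs
[Figure: X(31:16) and X(15:0) are each fed to a multiplier (MUL, one DSP48E each) together with the shift factor 2^k; an adder (ADD) combines the results into Y(31:16) and Y(15:0)]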
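The idea of shifting by multiplication can be sketched as follows: the 32-bit word is split into two 16-bit halves, each half is multiplied by the one-hot factor 2^k, and an addition merges the two products. The limit 0 ≤ k ≤ 16 and the function name are assumptions of this sketch, not details taken from the slides.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def shift_left_via_mul(x, k):
    """32-bit left shift by k bits (0 <= k <= 16) using two multiplications."""
    factor = 1 << k                       # 2^k, decoded from the shift amount
    low = (x & MASK16) * factor           # first multiplier:  X(15:0)  * 2^k
    high = (x >> 16) * factor             # second multiplier: X(31:16) * 2^k
    return (low + (high << 16)) & MASK32  # adder merges both halves into Y

y = shift_left_via_mul(0x00012345, 8)     # shift by two decimal digits (8 bits)
print(hex(y))
```

A shift by d BCD digits corresponds to k = 4·d bits, so only a few values of k are needed in the decimal case.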
Post Place & Route Results
Decimal Fixed-Point Multiplier with CPA output
● Xilinx Virtex-5, speed grade -2
● up to 13 pipeline registers, configurable via VHDL generics
● 5350 – 6500 LUTs, 0 – 4900 FFs
● 5350 – 7600 combined LUTs and FFs
Decimal Floating-Point Multiplier
 | Type 1: mul-based shifting, delayed CPA | Type 2: mux-based shifting, delayed CPA | Type 3: mux-based shifting, no delayed CPA
#LUTs | 6300 – 8400 | 7900 – 9400 | 7500 – 9400
#FFs | 0 – 4100 | 0 – 4500 | 0 – 4400
#(LUT + FFs) | 6500 – 8400 | 8300 – 9300 | 7600 – 9600
#DSP48E | 17 | 0 | 0
● approx. 70% of the LUTs are used by the fixed-point multiplier (for Type 2 and Type 3)
● medium Virtex-5 XC5VLX110T: 8000 – 9000 LUTs ≈ 11.5% – 13%
Comparison to binary floating-point multiplier
● 64-bit binary floating-point multiplier generated with CoreGen
● no DSP48E
● Type 2 decimal vs. CoreGen binary multiplier
[Figure: two plots, decimal vs. binary, over the number of pipeline registers (0 to 9): number of LUTs (axis 0 to 10000) and max. frequency in MHz (axis 0 to 400)]
● decimal mult.: 3.2 – 3.5 times more LUTs
● binary mult.: 1.6 – 2.2 times faster
Summary
● decimal fixed-point multiplier
– parallel, fully combinational
– configurable number of pipeline stages
● decimal floating-point multiplier
– configurable number of pipeline stages
– three different implementations
– trade-off: area vs. speed
● future work: fully IEEE 754-2008 compliant coprocessor
Thank you for your attention!!!