Arithmetic III CPSC 321 Andreas Klappenecker. Any Questions?


Today’s Menu

• Addition
• Multiplication
• Floating Point Numbers

Recall: Full Adder

3 gate delays for the first adder, 2(n-1) for the remaining adders

Ripple Carry Adders

• Each gate causes a delay
  • our example: 3 gates for carry generation
  • book has example with 2 gates

• Carry might ripple through all n adders
  • O(n) gates causing delay
  • intolerable delay if n is large

• Carry lookahead adders
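The ripple-carry behaviour described above can be sketched in a few lines. This is a minimal software model, not a gate-level design; the function names and the default width `n=4` are assumptions made for illustration, and `ripple_delay` simply restates the slide's count of 3 gate delays for the first adder plus 2 for each remaining one.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, n=4, cin=0):
    """Add two n-bit numbers; the carry ripples from bit 0 upward."""
    result = 0
    carry = cin
    for i in range(n):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)  # bit i must wait for carry i
        result |= s << i
    return result, carry


def ripple_delay(n):
    """Gate delays until the last carry is valid (count used on the slide)."""
    return 3 + 2 * (n - 1)


print(ripple_carry_add(0b0111, 0b0001))  # worst case: carry crosses three bits
print(ripple_delay(32))
```

The loop makes the O(n) dependency visible: bit i cannot be finished before the carry out of bit i-1 is known.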

Faster Adders

cin  a  b | cout  s
 0   0  0 |  0    0
 0   0  1 |  0    1
 0   1  0 |  0    1
 0   1  1 |  1    0
 1   0  0 |  0    1
 1   0  1 |  1    0
 1   1  0 |  1    0
 1   1  1 |  1    1

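$$
\begin{aligned}
c_{out} &= ab + c_{in}(a \oplus b) \\
        &= ab + a\,c_{in} + b\,c_{in} \\
        &= ab + (a + b)\,c_{in} \\
        &= g + p\,c_{in}
\end{aligned}
$$

Generate: $g = ab$
Propagate: $p = a + b$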

Why are they called like that?

Fast Adders

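Iterate the idea, generate and propagate:

$$
\begin{aligned}
c_{i+1} &= g_i + p_i c_i \\
        &= g_i + p_i (g_{i-1} + p_{i-1} c_{i-1}) \\
        &= g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1} \\
        &= g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \dots + p_i p_{i-1} \cdots p_1 g_0 \\
        &\quad + p_i p_{i-1} \cdots p_1 p_0 c_0
\end{aligned}
$$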

Two-level AND-OR circuit
Carry is known early!
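As a rough illustration of the expanded carry formula, the sketch below computes every carry directly from the generate and propagate signals and the initial carry, without waiting for the previous carry. The function names and the width `n=4` are assumptions for the example; real lookahead adders are usually built in small groups rather than as one flat expression.

```python
def lookahead_carries(x, y, n=4, c0=0):
    """Return [c_0, ..., c_n], each computed only from g, p and c_0."""
    g = [(x >> i) & (y >> i) & 1 for i in range(n)]      # generate  g_i = a_i b_i
    p = [((x >> i) | (y >> i)) & 1 for i in range(n)]    # propagate p_i = a_i + b_i
    carries = [c0]
    for i in range(n):
        c = 0
        # OR of the AND terms g_j p_{j+1} ... p_i for j = i, i-1, ..., 0
        for j in range(i, -1, -1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        # final term p_i ... p_0 c_0
        term = c0
        for k in range(i + 1):
            term &= p[k]
        c |= term
        carries.append(c)
    return carries


def lookahead_add(x, y, n=4, c0=0):
    """Sum bits use s_i = a_i xor b_i xor c_i with the precomputed carries."""
    carries = lookahead_carries(x, y, n, c0)
    result = 0
    for i in range(n):
        s = ((x >> i) ^ (y >> i) ^ carries[i]) & 1
        result |= s << i
    return result, carries[n]


print(lookahead_carries(0b0111, 0b0001))
print(lookahead_add(0b0111, 0b0001))
```

Each carry is an OR of AND terms, which is the two-level structure named on the slide.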

• Need to support the set-on-less-than instruction

• remember: slt is an arithmetic instruction

• produces 1 if rs < rt and 0 otherwise

• use subtraction: (a-b) < 0 implies a < b

• Need to support test for equality (beq $t5, $t6,

• use subtraction: (a-b) = 0 implies a = b

A Simple ALU for MIPS

000 = and
001 = or
010 = add
110 = subtract
111 = slt

• Note: zero is a 1 when the result is zero!

[Figure: 32-bit ALU built from the 1-bit ALUs ALU0, ALU1, ALU2, …, ALU31, chained through CarryIn and CarryOut. Labels: inputs a0, a1, a2, …, Bnegate, Operation, Less; outputs Result0, Result1, Result2, …, Result31, Set, Overflow.]
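A behavioural sketch of the simple ALU may help here. It models only the five control values listed above and the zero output; the function names are assumptions, and the slt case relies on Python's unbounded integers, so it sidesteps the overflow handling that real hardware needs.

```python
MASK = 0xFFFFFFFF


def to_signed(v):
    """Interpret a 32-bit pattern as a two's complement integer."""
    v &= MASK
    return v - (1 << 32) if v & 0x80000000 else v


def alu(op, a, b):
    """Return (result, zero) for a 3-bit control value."""
    a &= MASK
    b &= MASK
    if op == 0b000:
        result = a & b
    elif op == 0b001:
        result = a | b
    elif op == 0b010:
        result = (a + b) & MASK
    elif op == 0b110:
        result = (a - b) & MASK
    elif op == 0b111:
        # slt: (a - b) < 0 implies a < b; computed without overflow here
        result = 1 if to_signed(a) - to_signed(b) < 0 else 0
    else:
        raise ValueError("unsupported ALU control value")
    zero = 1 if result == 0 else 0   # zero is 1 when the result is zero
    return result, zero


print(alu(0b111, 5, 7))   # slt
print(alu(0b110, 9, 9))   # subtract, as used for an equality test
```

For an equality test, the caller subtracts and looks only at the zero output.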

Multipliers

• More complicated than addition
• accomplished via shifting and addition
• Let's look at 3 versions based on the grade school algorithm

        0010   (multiplicand)
    x   1011   (multiplier)
    --------
        0010   x 1
       00100   x 1
      001000   x 0
     0010000   x 1
    --------
    00010110

• Shift and add if multiplier bit equals 1

Multiplication

1. Test Multiplier0
   • Multiplier0 = 1: go to step 1a
   • Multiplier0 = 0: go to step 2

1a. Add multiplicand to product and place the result in Product register

2. Shift the Multiplicand register left 1 bit

3. Shift the Multiplier register right 1 bit

32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, done

[Figure: first-version datapath. Multiplicand register (64 bits, shift left), 64-bit ALU, Product register (64 bits, write), Multiplier register (32 bits, shift right), control test.]
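The three steps of this first version translate almost directly into code. This is a sketch for unsigned operands; the function name and the width parameter `n` (32 on the slides, 4 in the small example) are assumptions added for illustration.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """First version: 2n-bit multiplicand register shifts left each step."""
    mask_n = (1 << n) - 1
    mask_2n = (1 << (2 * n)) - 1
    mcand = multiplicand & mask_n    # starts in the right half of a 2n-bit register
    mplier = multiplier & mask_n
    product = 0
    for _ in range(n):               # n repetitions
        if mplier & 1:               # 1.  test Multiplier0
            product = (product + mcand) & mask_2n   # 1a. 2n-bit add
        mcand = (mcand << 1) & mask_2n              # 2.  shift multiplicand left
        mplier >>= 1                                # 3.  shift multiplier right
    return product


print(bin(multiply_v1(0b0010, 0b1011, n=4)))  # the slide's example
```

With three steps per repetition and 32 repetitions, the count of "almost 100 clock cycles" follows if every step takes one cycle.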


Multiplication

If each step took a clock cycle, this algorithm would use almost 100 clock cycles to multiply two 32-bit numbers.

Requires 64-bit wide adder
Multiplicand register 64-bit wide

Variations on a Theme

• Product register has to be 64-bit
• Nothing we can do about that!
• Can we take advantage of that fact?
• Yes! Add multiplicand to 32 MSBs
• product = product >> 1
• Repeat last steps


Second Version

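[Figure: second-version datapath. Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, shift right, write), Multiplier register (32 bits, shift right), control test.]

1. Test Multiplier0
   • Multiplier0 = 1: go to step 1a
   • Multiplier0 = 0: go to step 2

1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register

2. Shift the Product register right 1 bit

3. Shift the Multiplier register right 1 bit

32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, done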

Version 1 versus Version 2

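[Figure: the two datapaths side by side. Version 1: Multiplicand register (64 bits, shift left), 64-bit ALU, Product register (64 bits, write), Multiplier register (32 bits), control test. Version 2: Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, shift right), Multiplier register (32 bits), control test.]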

Critique

• Registers needed for
  • multiplicand
  • multiplier
  • product

• Use lower 32 bits of product register:
  • place multiplier in lower 32 bits
  • add multiplicand to higher 32 bits
  • product = product >> 1
  • repeat

Final Version

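[Figure: final-version datapath. Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, shift right, write) whose right half initially holds the multiplier (shifts right), control test.]

1. Test Product0
   • Product0 = 1: go to step 1a
   • Product0 = 0: go to step 2

1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register

2. Shift the Product register right 1 bit

32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, done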
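The final version can be sketched the same way. Again this is an unsigned model with an assumed function name and width parameter `n`; the Python integer briefly holds the carry out of the n-bit add, which hardware would have to shift into the product register.

```python
def multiply_final(multiplicand, multiplier, n=32):
    """Final version: multiplier lives in the right half of the product register."""
    mask_n = (1 << n) - 1
    mcand = multiplicand & mask_n
    product = multiplier & mask_n     # left half 0, right half = multiplier
    for _ in range(n):                # n repetitions
        if product & 1:               # 1.  test Product0
            product += mcand << n     # 1a. add multiplicand to the left half
        product >>= 1                 # 2.  shift the product register right
    return product


print(bin(multiply_final(0b0010, 0b1011, n=4)))  # the slide's example
```

Only an n-bit adder is exercised per step, and no separate multiplier register is needed.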

Summary

It was possible to improve upon the well-known grade school algorithm by
• reducing the adder from 64 to 32 bits
• keeping the multiplicand fixed
• shifting the product register
• omitting the multiplier register

The Booth Multiplier

Let’s kick it up a notch!

Runs of 1’s

• $01110_2 = 14 = 8 + 4 + 2 = 16 - 2$

• Runs of 1s (current bit, bit to the right):
  • 10 beginning of run
  • 11 middle of a run
  • 01 end of a run of 1s
  • 00 middle of a run of 0s

Runs of 1’s

• $0111\,1111\,1100_2 = 2044$
• How do you get this conversion quickly?
• $0111\,1111_2 = 128 - 1 = 127$
• $0111\,1111\,1111_2 = 2048 - 1$
• $0111\,1111\,1100_2 = 2048 - 1 - 3 = 2048 - 4$
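The shortcut above amounts to one subtraction of two powers of two. A small sketch, with a hypothetical helper name and bit positions counted from 0 at the right:

```python
def run_of_ones_value(high, low):
    """Value of a number with 1s in bit positions low..high and 0s elsewhere."""
    return (1 << (high + 1)) - (1 << low)


print(run_of_ones_value(3, 1))    # 01110
print(run_of_ones_value(6, 0))    # 0111 1111
print(run_of_ones_value(10, 2))   # 0111 1111 1100
```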

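Example

0010 x 0110, treating the run of 1s as one subtract and one add:

           0000     shift
    -     0010      sub
         0000       shift
    +   0010        add
        --------
        00001100

0010 x 0110, conventional shift and add:

           0000     shift
    +     0010      add
    +    0010       add
        0000        shift
        --------
        00001100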

Booth Multiplication

Current and previous bit

00: middle of run of 0s, no action
01: end of a run of 1s, add multiplicand
10: beginning of a run of 1s, subtract mcnd
11: middle of string of 1s, no action

Example: 0010 x 0110

Iteration  Multiplicand  Step                 Product
0          0010          Initial values       0000 0110,0
1          0010          00: no op            0000 0110,0
           0010          arith >> 1           0000 0011,0
2          0010          10: prod -= Mcand    1110 0011,0
           0010          arith >> 1           1111 0001,1
3          0010          11: no op            1111 0001,1
           0010          arith >> 1           1111 1000,1
4          0010          01: prod += Mcand    0001 1000,1
           0010          arith >> 1           0000 1100,0
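The table can be reproduced with a short sketch of Booth's algorithm on a 2n-bit product register plus one remembered bit. The function name and the default `n=4` are assumptions. One known limitation of this plain form is stated as an assumption too: the multiplicand must not be the most negative n-bit value, because its negation does not fit in n bits.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Booth multiplication of two n-bit two's complement numbers."""
    mask_n = (1 << n) - 1
    mask_2n = (1 << (2 * n)) - 1
    mcand = multiplicand & mask_n
    product = multiplier & mask_n      # multiplier starts in the right half
    prev = 0                           # the extra bit to the right of the product
    for _ in range(n):
        cur = product & 1
        if (cur, prev) == (1, 0):      # beginning of a run of 1s: subtract
            product = (product - (mcand << n)) & mask_2n
        elif (cur, prev) == (0, 1):    # end of a run of 1s: add
            product = (product + (mcand << n)) & mask_2n
        prev = cur                     # bit shifted out becomes the extra bit
        sign = product >> (2 * n - 1)  # arithmetic shift right keeps the sign
        product = (product >> 1) | (sign << (2 * n - 1))
    if product >> (2 * n - 1):         # reinterpret the 2n bits as signed
        product -= 1 << (2 * n)
    return product


print(booth_multiply(0b0010, 0b0110))  # 2 x 6
print(booth_multiply(2, -3))           # 2 x -3
```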

Negative numbers

Booth’s multiplication also works with negative numbers:
2 x -3 = -6
$0010_2 \times 1101_2 = 1111\,1010_2$

Negative Numbers

$0010_2 \times 1101_2 = 1111\,1010_2$

0) Mcnd 0010 Prod 0000 1101,0
1) Mcnd 0010 Prod 1110 1101,0 sub
1) Mcnd 0010 Prod 1111 0110,1 >>
2) Mcnd 0010 Prod 0001 0110,1 add
2) Mcnd 0010 Prod 0000 1011,0 >>
3) Mcnd 0010 Prod 1110 1011,0 sub
3) Mcnd 0010 Prod 1111 0101,1 >>
4) Mcnd 0010 Prod 1111 0101,1 nop
4) Mcnd 0010 Prod 1111 1010,1 >>

Summary

• Extends the final version of the grade school algorithm

• Simple change: add, subtract, or do nothing, depending on the current and previous bit: 01 add; 10 subtract; 00 or 11 do nothing

• $0111\,1100_2 = 128 - 4 = 1000\,0000_2 - 0000\,0100_2$
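Another way to read the rule is as a recoding of the multiplier into digits -1, 0 and +1, one per bit position. The sketch below is an illustration with an assumed function name; it only recodes and does not multiply.

```python
def booth_recode(multiplier, n=8):
    """Return digits d_0..d_{n-1} in {-1, 0, +1}, least significant first."""
    digits = []
    prev = 0                          # bit to the right of bit 0
    for i in range(n):
        cur = (multiplier >> i) & 1
        digits.append(prev - cur)     # 10 -> -1, 01 -> +1, 00 and 11 -> 0
        prev = cur
    return digits


digits = booth_recode(0b01111100)
print(digits)
print(sum(d << i for i, d in enumerate(digits)))  # weighted sum of the digits
```

For 0111 1100 the only nonzero digits are -1 at position 2 and +1 at position 7, which is the 128 - 4 decomposition from the slide.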

Floating Point Numbers

We often use calculations based on real numbers, such as
• e = 2.71828…
• Pi = 3.14159…
We represent approximations to such numbers by floating point numbers
• $1.xxxxxxxxxx_2 \times 2^{yyyy}$

Floating-Point Representation: float

We need to distribute the 32 bits among sign, exponent, and significand
• seeeeeeeexxxxxxxxxxxxxxxxxxxxxxx
The general form of such a number is
• $(-1)^s \times F \times 2^E$
• s is the sign, F is derived from the significand field, and E is derived from the exponent field

Floating Point Representation: double

• 1 bit sign, 11 bits for exponent, 52 bits for significand

• seeeeeeeeeeexxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Range of float: $2.0 \times 10^{-38}$ … $2.0 \times 10^{38}$

Range of double: $2.0 \times 10^{-308}$ … $2.0 \times 10^{308}$

IEEE 754 Floating-Point Standard

• Makes the leading bit of a normalized binary number implicit: the value uses 1 + significand

• If significand is $s_1 s_2 s_3 s_4 s_5 s_6 \dots$ then the value is

$$(-1)^s \times \left(1 + \frac{s_1}{2} + \frac{s_2}{4} + \frac{s_3}{8} + \dots\right) \times 2^E$$

• Design goal of IEEE 754: Integer comparisons should yield meaningful comparisons for floating point numbers

IEEE 754 Standard

• Negative exponents are a difficulty for sorting

• Idea: order the exponent fields from most positive (1111 1111) to most negative (0000 0000)
• IEEE 754 uses a bias of 127 for single precision.
• Exponent -1 is represented by -1 + 127 = 126
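The effect of the bias can be checked with Python's standard `struct` module, which packs a value as an IEEE 754 single. The helper names are assumptions for this sketch, and the ordering check is restricted to positive numbers because the sign bit is stored separately.

```python
import struct


def float_bits(x):
    """32-bit pattern of x stored as an IEEE 754 single, as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def exponent_field(x):
    """The 8-bit biased exponent field of x."""
    return (float_bits(x) >> 23) & 0xFF


print(exponent_field(0.5))   # 0.5 = 1.0 x 2^-1, so the field is -1 + 127
print(exponent_field(1.0))   # exponent 0 is stored as 127

values = [3.5, 0.5, 1e10, 0.75, 2.0, 1.0]
# For positive floats, sorting by bit pattern matches sorting by value.
print(sorted(values, key=float_bits) == sorted(values))
```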

IEEE 754 Example

Represent -0.75 in single precision format.
$-0.75 = -3/4 = -11_2 / 4 = -0.11_2$

In scientific notation: $-0.11_2 \times 2^0 = -1.1_2 \times 2^{-1}$

the latter form is normalized scientific notation

Value: $(-1)^s \times (1 + \text{significand}) \times 2^{(\text{Expnt} - 127)}$

Example (cont’d)

• $-1.1_2 \times 2^{-1} = (-1)^1 \times (1 + .1000\,0000\,0000\,0000\,0000\,000_2) \times 2^{(126 - 127)}$

The single precision representation is 1 0111 1110 1000 0000 0000 0000 0000 000
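The worked example can be cross-checked in both directions. The sketch below uses the standard `struct` module to obtain the stored bits and a hand-written decoder for the value formula; the function names are assumptions, and the decoder covers normalized numbers only (no zeros, denormals, infinities or NaNs).

```python
import struct


def float_bits(x):
    """32-bit pattern of x stored as an IEEE 754 single, as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decode_single(bits):
    """(-1)^s x (1 + significand) x 2^(exponent - 127), normalized numbers only."""
    s = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return (-1) ** s * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


bits = float_bits(-0.75)
print(f"{bits:032b}")         # sign, 8 exponent bits, 23 significand bits
print(decode_single(bits))
```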

Conclusion

• We learned how to multiply
• Three variations on the grade school algorithm
• Booth multiplication
• Floating point representation a la IEEE 754

(Photos are courtesy of www.emerils.com; some graphs are due to Patterson and Hennessy)
