Computer arithmetic
Second try at third grade
What is stored and intended
• Bit patterns of several sizes, nothing more
  – Almost all modern machines store and manipulate 1-, 2-, 4-, and 8-byte quantities
  – Older machines had 12-, 36-, 40-, or 60-bit words, or decimal digit strings, or variable-length BCD
• The bit patterns can represent:
  – Numbers – fixed or floating point
  – Text data – ASCII, extended ASCII, or Unicode
  – Graphic data – pixels
  – Bit patterns – I/O register contents – often packed
  – Specialized data – sound or other signals, genomes
What operations are done
• These operations are usually in the ISA:
  – Arithmetic operations on numbers – fixed or floating point
    • The standard four functions (+, -, *, /) and remainder (%)
    • Relational operations and comparison
    • Conversions – float to fixed
  – Logical operations on bit patterns – I/O register contents – often packed
    • Bit operations – set, clear, flip
    • Shifting and other bit-field isolating methods
    • Packing and unpacking are done with logic and shifting
Data for which few operations are defined
– Text data – ASCII, extended ASCII, or Unicode
– Graphic data – pixels, bitmaps, vector graphics
– Specialized data – sound or other signals, genomes
Number representation
• Several common varieties of multidigit number representation exist
• Most are directly descended from multidigit integer (positional notation)
• Several examples
  – Sign and magnitude (people think this way)
  – Two's complement (or ten's complement)
  – Gray code (used to convert mechanical motion to glitchless binary)
Positional notation
A number is represented by a string of digits in an understood base: 21345₁₀

The value of the number is the value of the resulting polynomial:
2×10⁴ + 1×10³ + 3×10² + 4×10¹ + 5×10⁰

In nested (Horner) form this is ((((2×10 + 1)×10 + 3)×10 + 4)×10 + 5
The operations are the polynomial operations – addition, subtraction, multiplication, divisionwith carry propagation so the value is preserved, but theresult contains only valid digits again.
Example – 21345 + 32767 = 54112

  2×10⁴ + 1×10³ +  3×10² +  4×10¹ +  5×10⁰
+ 3×10⁴ + 2×10³ +  7×10² +  6×10¹ +  7×10⁰
= 5×10⁴ + 3×10³ + 10×10² + 10×10¹ + 12×10⁰

Or, after carry propagation: 5×10⁴ + 4×10³ + 1×10² + 1×10¹ + 2×10⁰

Notice this carry propagation is recursive and is done from right to left
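The digit-wise add followed by right-to-left carry propagation can be sketched in a few lines of Python (the function name is mine, not from the slides):

```python
def add_positional(a_digits, b_digits, base=10):
    """Add two equal-length digit lists (most significant digit first),
    propagating carries from right to left so every output digit is valid."""
    carry = 0
    result = []
    for da, db in zip(reversed(a_digits), reversed(b_digits)):
        s = da + db + carry
        result.append(s % base)   # reduce to a valid digit again
        carry = s // base         # carry into the next position
    if carry:
        result.append(carry)
    return list(reversed(result))

# the slide's example: 21345 + 32767 = 54112
print(add_positional([2, 1, 3, 4, 5], [3, 2, 7, 6, 7]))  # [5, 4, 1, 1, 2]
```

The same function works unchanged in base 2 or base 16, which is the point of the "any base" remark above.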
Operations on numbers
• Numbers are represented in positional notation – a string of digits
• Numbers are interpreted as a value
• Operations in a computer manipulate the bit strings so as to generate a result that has the expected value – A + B, etc.
• The algorithms are what you learned early in life – the same in any number base
The primitive addition element
[Figure: a one-digit adder cell with digit inputs Ai and Bi, carry-in Cin, sum output Si, and carry-out Cout]

This configuration is valid for any base – 2 (binary), 16 (hex), 10 (decimal), or 2^m

Note that base 2^m corresponds to word sizes of 16, 32, or 64 bits; long word sizes are needed for many cryptographic algorithms

This logic can be used in writing multiword arithmetic operations
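A behavioral sketch of the one-digit adder cell (the function name is mine) makes the base-independence concrete:

```python
def digit_adder(a, b, cin, base=10):
    """One-digit adder cell: returns (sum_digit, carry_out).
    Works for any base -- 2, 10, 16, or 2**m."""
    total = a + b + cin
    return total % base, total // base

print(digit_adder(1, 1, 0, base=2))   # (0, 1) -- the binary 1+1 case
print(digit_adder(7, 6, 1, base=10))  # (4, 1) -- decimal digit with carry in
```

With base=2**32 this same cell is exactly the add-with-carry step used for multiword arithmetic.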
The multiply primitive
[Figure: a one-digit multiplier cell (inputs Ai, Bi; outputs Qi and a carry) feeding a one-digit adder (inputs the product and Qi−1; output Si, with carry in and carry out)]

This element can be used as a cell of an n × n array multiplier

Although the number of cells is O(n²), for n = 32 this is not too large for VLSI

Note that the carry propagation properties are similar to the same number of digits of addition

Other fast or multidigit multiply configurations exist
Multidigit add – any base
• Cn ⊕ Cn−1 indicates signed overflow in binary
• Note the long path for carry propagation

[Figure: a multidigit ripple-carry adder – a chain of one-digit adder cells from stage 0 (inputs A0, B0, carry-in C0; output S0) up through stage n−1 (inputs An−1, Bn−1; outputs Sn−1 and Cn), with many stages in between, each stage's carry-out feeding the next stage's carry-in]
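The chain above can be sketched behaviorally (names are mine); note how the signed-overflow test falls out of the last two carries:

```python
def ripple_add(a_bits, b_bits, c0=0):
    """n-stage ripple-carry adder, bit 0 (LSB) first. Returns the sum bits,
    the full carry chain, and c[n] XOR c[n-1], which flags two's complement
    (signed) overflow."""
    n = len(a_bits)
    c = [c0]
    s = []
    for i in range(n):
        total = a_bits[i] + b_bits[i] + c[i]
        s.append(total & 1)       # sum bit of this stage
        c.append(total >> 1)      # carry into the next stage
    signed_overflow = c[n] ^ c[n - 1]
    return s, c, signed_overflow

# 0111 + 0001 in 4 bits: 7 + 1 = 8, which overflows the signed range
s, c, ovf = ripple_add([1, 1, 1, 0], [1, 0, 0, 0])
print(s, ovf)  # [0, 0, 0, 1] 1
```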
Why sign-and-magnitude notation is avoided
  2 1  3  4  5
+ 3 2  7  6  7
  5 3 10 10 12  →  5 4 1 1 2

Subtraction is done the same way, but not exactly. First the two numbers are compared; then the one with the smaller magnitude is subtracted from the other, and the sign is that associated with the larger magnitude:

  2 1 3 4 5        3 2 7 6 7
- 3 2 7 6 7  →   - 2 1 3 4 5  =  1 1 4 2 2, taking the sign of 32767: -1 1 4 2 2

This requires at least one more comparison, so most integer arithmetic is done in complement notation rather than sign-and-magnitude.
Complement notation is easier and faster
Note that (for example) 0 = -100000 + 99999 + 1
Or in binary 0 = -100000000 + 11111111 + 1
Subtracting a number from all 9’s or all ones is just flipping all thedigits (or bits) and is done bit-by-bit, with no carry propagation
Thus –A in binary is 11111111 – A + 1 or NOT (A) + 1, ignoring the carry
Also, A – B is A + NOT(B) + 1 and no comparison step is needed
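The NOT(A) + 1 rule is easy to check in a sketch (the helper name is mine):

```python
def negate(x, bits=8):
    """Two's complement negation: -A == NOT(A) + 1, carry out ignored."""
    mask = (1 << bits) - 1
    return ((x ^ mask) + 1) & mask  # flip every bit, then add one

print(f"{negate(0b00000011):08b}")  # 11111101, i.e. -3 in 8 bits

def subtract(a, b, bits=8):
    """A - B computed as A + NOT(B) + 1, with no comparison step."""
    mask = (1 << bits) - 1
    return (a + (b ^ mask) + 1) & mask

print(subtract(5, 3))  # 2
```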
[Figure: an add/subtract unit – operand B passes through a complementer controlled by an Add/Subtract line, then joins operand A at the ADDER, which produces S]
What's wrong with S&M?

• Addition in S&M takes multiple steps
  – Compare signs
  – If signs are different, compare magnitudes
    • Sign is that of the larger quantity
    • Subtract smaller from larger

[Figure: an S&M ALU – a "compare and redirect" stage routes Input 1 and Input 2 before the ALU produces its output]
2’s complement representation
Representation – note that 111111 − V is the bit flip (NOT) of V.
To represent A (now positive or negative) we use A itself if A >= 0, and 2^m + A if A < 0. Since 2^m is 1 + 111111 = 1000000, 2^m + A is 1 + NOT(|A|).

Examples in 6 bits:
  -3  = -000011 → 1 + 111100 = 111101
  -31 = -011111 → 1 + 100000 = 100001

Note that negative numbers are represented by bit patterns that represent large positive numbers – for example, in 6 bits we represent negatives by unsigned numbers > 31 and positives by themselves.

Addition in 6 bits:
   5 + (-3)  is  000101 + 111101 = 000010 =  2
  -5 +   3   is  111011 + 000011 = 111110 = -2
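The 6-bit examples above can be reproduced directly (helper names are mine):

```python
BITS, MASK = 6, 0b111111

def to2c(v):
    """Encode a signed value as a 6-bit two's complement pattern."""
    return v & MASK

def from2c(p):
    """Interpret a 6-bit pattern as a signed value."""
    return p - (1 << BITS) if p & (1 << (BITS - 1)) else p

s = (to2c(5) + to2c(-3)) & MASK      # 000101 + 111101, carry out discarded
print(f"{s:06b} = {from2c(s)}")      # 000010 = 2
s = (to2c(-5) + to2c(3)) & MASK      # 111011 + 000011
print(f"{s:06b} = {from2c(s)}")      # 111110 = -2
```

The mask after the add is the hardware's finite word width: the carry out of bit 5 simply falls off the end.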
The multiply primitive - again
[Figure: the one-digit multiplier cell again – inputs Ai, Bi; outputs Qi and a carry – feeding a one-digit adder (inputs the product and Qi−1; output Si, with carry in and carry out)]

This element can be used as a cell of an n × n array multiplier

Although the number of cells is O(n²), for n = 32 this is not too large for VLSI

Note that the carry propagation properties are similar to the same number of digits of addition

Other fast or multidigit multiply configurations exist
Human multiplication is much like the computer variety, but not exactly
Multiplication is actually repeated addition, but we are used to its format in nonbinary bases. In the multiplication primitive (previous slide) a one-digit multiplier is used. The human also has such a multiplier, installed by the Department of Education.

   2  1  3  4  5             21345
   3  2  7  6  7           × 32767
  14  7 21 28 35            149415
  12  6 18 24 30           128070
  14  7 21 28 35          149415
   4  2  6  8 10         42690
   6  3  9 12 15        64035
                     699411615

Note: the human can add a column of one-digit numbers; the computer can't add more than two numbers at a time, but they can be multidigit.
How a decimal machine multiplies this example
21345 × 32767 with a 5-digit adder, low-order multiplier digit first:

  accumulator   partial product         after add-and-shift
      00000   + 149415 (7 × 21345)  →   14941 | 5
      14941   + 128070 (6 × 21345)  →   14301 | 15
      14301   + 149415 (7 × 21345)  →   16371 | 615
      16371   +  42690 (2 × 21345)  →    5906 | 1615
       5906   +  64035 (3 × 21345)  →    6994 | 11615

Result: 699411615

Note that the adder width is only 5 digits plus carry logic.
The digits shifted out are not changed in later steps.
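The add-and-shift scheme fits in a short sketch (the function name is mine). The narrow accumulator adds one partial product, then one decimal digit shifts out and never changes again:

```python
def decimal_shift_add(multiplicand, multiplier_digits_lsd_first):
    """Shift-add multiply the way a decimal machine does it: a narrow
    accumulator adds digit * multiplicand, then shifts one decimal place
    right; the shifted-out digits are final."""
    high, low = 0, ""
    for d in multiplier_digits_lsd_first:
        high += d * multiplicand          # narrow add (plus carry room)
        high, out = divmod(high, 10)      # shift one decimal digit out
        low = str(out) + low
    return int(str(high) + low)

# the slide's example: 21345 * 32767, multiplier digits low-order first
print(decimal_shift_add(21345, [7, 6, 7, 2, 3]))  # 699411615
```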
A binary unsigned example
  1101 = 13₁₀
× 1011 = 11₁₀

  0000          starting accumulator
+ 1101          multiplier bit 0 is 1: add, then shift
  1101  → 0110 | 1
+ 1101          multiplier bit 1 is 1: add, then shift
 10011  → 1001 | 11
+ 0000          multiplier bit 2 is 0: shift only
  1001  → 0100 | 111
+ 1101          multiplier bit 3 is 1: add, then shift
 10001  → 1000 | 1111

Result: 10001111 = 143₁₀
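The same trace as code (the function name is mine) – multiplier bits examined LSB first, with one result bit shifting out per step:

```python
def binary_shift_add(a, b, bits=4):
    """Binary shift-add multiply: for each multiplier bit (LSB first),
    add the multiplicand into the high half when the bit is 1, then
    shift the accumulator right, capturing the bit that falls out."""
    high, low = 0, []
    for _ in range(bits):
        if b & 1:
            high += a
        b >>= 1
        low.insert(0, high & 1)  # this low-order result bit is now final
        high >>= 1
    result = high
    for bit in low:              # reattach the shifted-out bits
        result = (result << 1) | bit
    return result

print(binary_shift_add(0b1101, 0b1011))  # 143
```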
[Figure: a shift-add multiplier datapath – an Adder feeds a Result register whose low-order bits shift toward the Multiplier register, alongside the Multiplicand register]

All three of these registers shift at each step

This can be done with one long shift register
[Figure: the same datapath drawn with one long shift register – result low-order bits coming in at one end while multiplier bits go out the other, with the Multiplicand feeding the Adder]
Division is almost the inverse of multiplication
[Figure: a division datapath – the Adder (used as a subtractor) connects the Divisor to a Remainder register; remainder bits go out the high end while quotient bits come in at the low-order end]

Subtraction tells if the sign changes: if so, this quotient bit is zero and the result is ignored; if not, the result is retained and the quotient bit is one

Both registers shift left at each step
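That trial-subtract rule is restoring division; a minimal sketch (names mine, unsigned operands only):

```python
def restoring_divide(dividend, divisor, bits=8):
    """Restoring division: shift the remainder left, bring in the next
    dividend bit, trial-subtract the divisor; if the sign would change,
    restore (ignore the result) and emit quotient bit 0, else keep it
    and emit 1."""
    rem, quo = 0, 0
    for i in range(bits - 1, -1, -1):
        rem = (rem << 1) | ((dividend >> i) & 1)  # next dividend bit in
        trial = rem - divisor
        if trial < 0:
            quo = (quo << 1) | 0   # sign changed: result ignored
        else:
            rem = trial            # sign unchanged: result retained
            quo = (quo << 1) | 1
    return quo, rem

print(restoring_divide(143, 13))  # (11, 0)
```

Running it forward on the earlier product 13 × 11 = 143 shows the sense in which division inverts multiplication.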
Some comments
• Algorithms
  – The same form in any base – you learned the format many years ago
  – The inverse of a multistep process is usually the inverse steps in inverse order
  – Binary means only additions are needed, rather than one-digit multiplications
• Representations
  – Memory contents are just bit strings – meaning is assumed when you do operations, logical or arithmetic
More comments
• Addition/subtraction are basic
  – Multiplication and division are just repeated adds and subtracts
  – Clock rate of a pipelined machine is limited by the time needed for an addition
  – Thus, fast carry propagation techniques are essential
• In earlier times several formats were used
  – Bit-serial, binary, sign-and-magnitude
  – Now integer arithmetic is always two's complement binary – floating point uses several representations
  – http://www.bitsavers.org has data on many historic machines
• Don't waste time doing computer arithmetic by hand except to learn principles
  – Especially don't do binary-decimal or such conversions
Speeding Up Addition With Carry Lookahead
• Speed of digital addition depends on carries
• A base b = 2^k divides the length of the carry chain by k
• Two-level logic for a base-b digit becomes complex quickly as k increases
• If we could compute the carries quickly, the full adders compute the result with 2 more gate delays
• Carry lookahead computes carries quickly
• It is based on two ideas:
  – a digit position generates a carry
  – a position propagates a carry in to the carry out

This and the next six slides are taken from the textbook slides
Binary Propagate and Generate Signals
• In binary, the generate for digit j is Gj = xj·yj
• Propagate for digit j is Pj = xj + yj
  – Of course xj + yj covers xj·yj, but it still corresponds to a carry out for a carry in
• Carries can then be written:
  c1 = G0 + P0·c0
  c2 = G1 + P1·G0 + P1·P0·c0
  c3 = G2 + P2·G1 + P2·P1·G0 + P2·P1·P0·c0
  c4 = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0 + P3·P2·P1·P0·c0
• In words, the c2 logic is: c2 is one if digit 1 generates a carry, or if digit 0 generates one and digit 1 propagates it, or if digits 0&1 both propagate a carry in
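A behavioral check of those equations (names mine): the recurrence c_{j+1} = G_j + P_j·c_j computes exactly the same values the flattened two-level forms above would produce in parallel.

```python
def lookahead_carries(x_bits, y_bits, c0=0):
    """Compute every carry from generate/propagate signals:
    G_j = x_j AND y_j, P_j = x_j OR y_j, c_{j+1} = G_j OR (P_j AND c_j).
    Hardware flattens the recurrence into two-level logic per carry;
    this loop just verifies the values."""
    g = [x & y for x, y in zip(x_bits, y_bits)]
    p = [x | y for x, y in zip(x_bits, y_bits)]
    carries = [c0]
    for j in range(len(x_bits)):
        carries.append(g[j] | (p[j] & carries[j]))
    return carries

# 0111 + 0001 (LSB first): digit 0 generates, digits 1 and 2 propagate
print(lookahead_carries([1, 1, 1, 0], [1, 0, 0, 0]))  # [0, 1, 1, 1, 0]
```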
Speed Gains With Carry Lookahead
• It takes one gate to produce a G or P, two levels of gates for any carry, and 2 more for the full adders
• The number of OR gate inputs (terms) and AND gate inputs (literals in a term) grows as the number of carries generated by lookahead
• The real power of this technique comes from applying it recursively
• For a group of, say, 4 digits an overall generate is
  G¹0 = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0
• An overall propagate is P¹0 = P3·P2·P1·P0
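The group signals translate directly into bitwise logic (the function name is mine):

```python
def group_gp(g, p):
    """Group generate/propagate over 4 digit positions (index 0 = LSB):
    G = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0,  P = P3·P2·P1·P0."""
    G = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    P = p[3] & p[2] & p[1] & p[0]
    return G, P

# digit 1 generates and digits 2, 3 propagate -> the whole group generates
print(group_gp([0, 1, 0, 0], [0, 1, 1, 1]))  # (1, 0)
```

These (G, P) pairs are exactly what the next level of lookahead consumes, which is what makes the scheme recursive.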
Recursive Carry Lookahead Scheme
• If level-1 generates G¹j and propagates P¹j are defined for all groups j, then we can also define level-2 signals G²j and P²j over groups of groups
• If k things are grouped together at each level, there will be log_k m levels, where m is the number of bits in the original addition
• Each extra level introduces 2 more gate delays into the worst-case carry calculation
• k is chosen to trade off reduced delay against the complexity of the G and P logic
• It is typically 4 or more, but the structure is easier to see for k = 2
Fig. 6.4 Carry Lookahead Adder for Group Size k = 2
Fast multiplication methods
• Array multipliers
  – Silicon is cheap – use an array of the primitive one-digit multipliers shown previously
• Booth's algorithm
  – Two bits at a time means half as many adds
• Carry-save multiplication
  – Avoids carry propagation except at the last step
Fig. 6.5 Digital Multiplication Schema
p: product pp: partial product
[Figure: the schema – multiplicand x3 x2 x1 x0 times multiplier y3 y2 y1 y0 forms four shifted partial products pp0 = x·y0 through pp3 = x·y3, which sum to the product p7 … p0]
Signed and unsigned operations
• Addition and subtraction are the same for signed and unsigned operands in two's complement integers
• Unsigned and signed multiplication are somewhat different – the same basic structure is used, but the overflow handling and last step change
Table 6.5 Radix-4 Booth Encoding (Bit-Pair Encoding)
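The table itself is not reproduced in this text version, but the recoding it describes can be sketched as follows (function names are mine; this sketch handles multipliers whose top recoded group is non-negative, e.g. unsigned values with a 0 sign bit):

```python
def booth_radix4(multiplier, bits=8):
    """Radix-4 Booth (bit-pair) recoding: examine overlapping triples of
    multiplier bits and emit one digit from {-2,-1,0,1,2} per bit pair,
    halving the number of add/subtract steps."""
    table = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
             0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}
    extended = multiplier << 1            # implicit 0 to the right of bit 0
    digits = []
    for i in range(0, bits, 2):
        digits.append(table[(extended >> i) & 0b111])
    return digits                         # least significant digit first

def booth_multiply(a, b, bits=8):
    """Multiply by summing the recoded digits times powers of 4."""
    return sum(d * a * 4**i for i, d in enumerate(booth_radix4(b, bits)))

print(booth_radix4(0b01101011))  # recoded digits, LSD first
print(booth_multiply(13, 107))   # 1391
```

Each nonzero digit costs one add or subtract of a (possibly doubled) multiplicand, so an n-bit multiply needs at most n/2 such steps.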
Carry-save multiplication
• Basic idea
  – Three numbers can be added to make two
  – This is done without any carry-propagation delay
  – So 32 numbers can be added as follows:
    32 → 22 → 16 → 12 → 8 → 6 → 4 → 3 → 2
  – But then the last two must be added conventionally
  – The entire operation takes about as long as two normal additions
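The 3-to-2 reduction is just a full adder applied bitwise, with the carries kept as a separate word instead of propagated (function names are mine):

```python
def carry_save(a, b, c):
    """3:2 compressor: reduce three numbers to two (a sum word and a
    carry word) with no carry propagation -- every bit position is
    computed independently."""
    s = a ^ b ^ c                                # bitwise sum
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carries, shifted left
    return s, carry

def csa_tree(numbers):
    """Reduce a list of numbers 3 -> 2 repeatedly; only the final pair
    needs a conventional carry-propagating addition."""
    nums = list(numbers)
    while len(nums) > 2:
        a, b, c = nums[:3]
        nums = list(carry_save(a, b, c)) + nums[3:]
    return nums[0] + nums[1]                     # one normal add at the end

print(csa_tree(range(1, 33)))  # 1 + 2 + ... + 32 = 528
```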
A refinement to carry-save
• The basic block is now 4 → 3 → 2
  – Groups of 4 numbers are added through two stages to make 2 numbers
  – In our previous terms: 32 → 24 → 16 → 12 → 8 → 6 → 4 → 3 → 2
  – Compare with: 32 → 22 → 16 → 12 → 8 → 6 → 4 → 3 → 2
  – This version needs much shorter interconnect lines
Condition codes
• A condition code register usually remembers the results of the last significant operation
  – SRC doesn't have one; the 80x86 does
  – Significant operation usually means an arithmetic or logical one (including shifts) but not a move, pop, or push
  – The usual bits are C (Carry), V (oVerflow), N (Negative), Z (Zero)
  – The 80x86 instructions are based on useful logical combinations of these
    • G (Greater), E (Equal), L (Less) relate to the signed representation
    • A (Above), E (Equal), B (Below) relate to the unsigned representation
  – Note that compare works correctly even if the result overflows
Summer 2002 ICOM 4206 – Floating point arithmetic
Floating-point basics
• Floating-point numbers and operations are like scientific notation
  – A number has sign, mantissa, exponent
    • Example: -1.453 × 10⁻⁴
  – Addition and subtraction have similar stages
    • Align the smaller number (in absolute value) to the larger
    • Add/subtract
    • Renormalize if needed (so the mantissa is smaller than the base and >= 1)
  – Multiplication and division also have similar stages
    • Add/subtract exponents
    • Multiply magnitudes
    • XOR for the result sign
• Consequences
  – The number has sign, exponent, and magnitude fields
  – S&M is used for magnitudes since the absolute value is needed
  – Separate ALUs are needed from those for fixed-point operations
IEEE formats
[Figure: the field layout – sign bit S, exponent field, then the implied "1." and the fractional part of the mantissa]

type      MSB        Exp size   Mant. size   Total
short     implicit       8          23         32
long      implicit      11          52         64
extended  explicit      15          64         80

Comments:
– An implicit MSB implies the number is normalized; since this bit is always 1, why store it? Use the bit for an extra bit of precision instead.
– The exponent is stored in excess format – 127, 1023, 16383
– Extended precision is used internally to some FPUs; it is not standard.
Numeric examples
Number              Exp   Exp field   Mantissa   Fields (s, exp, frac)   Hex
1.0                   0   127 = 7f    1.0        0, 7f, 000000           3f800000
0.2 = 1.6/8          -3   124 = 7c    1.6        0, 7c, 99999a           3e4ccccd
-0.4 = -1.6/4        -2   125 = 7d    1.6        1, 7d, 99999a           becccccd
26 = 1.625 × 16       4   131 = 83    1.625      0, 83, a00000           41d00000
-6.25 = -1.5625 × 4   2   129 = 81    1.5625     1, 81, 480000           c0c80000

Notes: This shows the general pattern – don't try to duplicate it by hand.
Sign-and-magnitude, not complementation, is used.
Fixed-point sign-and-magnitude comparison, not floating point, is sufficient to compare.
All integers (unless beyond the precision range) are exactly represented.
Fractions repeat unless the denominator is a power of 2 (.6 = 0x999999…, rounded here to 99999a).
All examples are in short (32-bit) format.
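These encodings are easy to spot-check with Python's struct module (the helper name is mine):

```python
import struct

def ieee_hex(x):
    """Hex image of the IEEE short (32-bit) encoding of x,
    big-endian so sign / exponent / fraction read left to right."""
    return struct.pack('>f', x).hex()

print(ieee_hex(1.0))    # 3f800000
print(ieee_hex(26.0))   # 41d00000
print(ieee_hex(-6.25))  # c0c80000
print(ieee_hex(0.2))    # 3e4ccccd  (repeating fraction, rounded)
```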
General form of a FPU for addition
[Block diagram for floating-point addition:]

First operand, second operand
        ↓
Sign compare / exponent subtraction
        ↓
Swap numbers if the left has the smaller exponent
        ↓
Prenormalize – shift right by the exponent difference
        ↓
S&M add of mantissas
        ↓
Postnormalize and adjust exponent
        ↓
Reassemble and store
Comments on the unusual blocks
[The same block diagram, with annotations on the unusual blocks:
– the exponent subtraction is just a normal complement subtraction
– extra rounding digits must be kept through the prenormalize shift and the mantissa add]
Floating point add/subtract steps explained
• Sign comparison
  – Since mantissas are in S&M, the second argument's sign is flipped for subtraction, then S&M addition rules are followed
• Exponent comparison
  – Normal complement subtraction of the exponents cancels out the excess in the representation
  – The operands are swapped if the second operand has the larger exponent
• Prenormalization shift
  – The second operand is shifted right by the amount of the exponent difference
  – Three extra bits, called guard, round, and sticky, must be retained
• Addition
  – This is a standard S&M addition, but it must include the extra bits
• Postnormalization
  – This shift can be from one place right to many places left if the difference is small
  – Rounding decisions are made here and not earlier
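A stripped-down sketch of the swap / prenormalize / add / postnormalize pipeline (names mine; mantissas are signed integers scaled by 2⁻²³, and the guard/round/sticky bits are deliberately omitted, so this shows the data flow rather than correct rounding):

```python
def fp_add(e1, m1, e2, m2):
    """Simplified floating add on (exponent, signed scaled mantissa) pairs.
    Swap so the left operand has the larger exponent, shift the smaller
    mantissa right to align (prenormalize), add, then shift the result
    until 1.0 <= |mantissa| < 2.0 again (postnormalize)."""
    if e2 > e1:                                   # swap step
        e1, m1, e2, m2 = e2, m2, e1, m1
    m2 >>= (e1 - e2)                              # prenormalize (align)
    m = m1 + m2                                   # mantissa add
    if m == 0:
        return (0, 0)
    while abs(m) >= (1 << 24):                    # postnormalize right
        m >>= 1
        e1 += 1
    while abs(m) < (1 << 23):                     # postnormalize left
        m <<= 1
        e1 -= 1
    return (e1, m)

# 1.0 + 1.0: mantissas 1.0 (scaled to 2**23) with exponent 0 give 2.0
print(fp_add(0, 1 << 23, 0, 1 << 23))  # (1, 8388608)
```

The big left shift in the last loop is the "many places left" case: subtracting two nearly equal numbers cancels the high bits, which is exactly why the real hardware keeps extra rounding bits.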
Floating point multiply/divide
[Block diagram for floating-point multiply/divide:]
– Sign comparison and XOR
– Exponent add or subtract, with an exponent adjust after normalization
– Mantissa multiply or divide (full precision is kept here)
– Postnormalizing shifts (rounding is done only here)
– Reassembly
IEEE Floating-point format

• Previous floating-point formats had problems
  – IBM 360 – used a base-16 exponent – poor precision
  – All IBM – biased rounding destabilized numerical algorithms
  – Some HP – same problems; also used floating decimal, which ruined speed and was incompatible
  – CDC – strange word lengths – 60 bits, etc.
  – In general, accuracy, control of rounding, and incompatibility were the problems – read the book for descriptions of incidents
  – Weather prediction and other big numerical algorithms depend heavily on good floating-point arithmetic
• IEEE floating-point features
  – Standardized format
  – Programmer control of rounding methods
  – Space in the format for denormalized and not-a-number (NaN) values
• Implementations
  – The 8087 (Intel) coprocessor was defined before the standard, and has a misdefined stack – programming problems ever since (especially in compiler design)
IEEE format – some details

• Mantissa is in S&M format with an understood one in the MSB
  – The understood 1 requires S&M
  – The understood 1 gives one more digit of precision
  – Prenormalization is easier, because incoming numbers are almost always normalized
• Exponent field is in excess format, coming before the mantissa
  – With S&M format this means a fixed-point S&M compare can find which number is larger (if they are normalized)
  – Exponent renormalization doesn't require sign logic
• All-zero and all-one exponents
  – These are the extremes, very rare in real numbers
  – All zeroes is used to code denormalized numbers
  – All ones is used to code not-a-numbers (usually resulting from exceptions)
• Rounding varieties
  – Toward 0
  – Toward plus infinity
  – Toward minus infinity
  – Unbiased (round to nearest even)
Register structure of MIPS
MIPS coprocessor

• Operations
  – Add and subtract (single, double) – add.s, sub.s, add.d, sub.d
  – Multiply and divide (single, double) – mul.s, div.d, etc.
  – Load/store (single) – lwc1, swc1 – note, this is coprocessor 1
  – Compare (single, double) – c.lt.s, c.lt.d
  – Branch conditional – bc1t, bc1f
  – Compares and branches use a special condition register
  – Moves between coprocessor and CPU registers – mfc1, mtc1
• Registers
  – 32 coprocessor registers – separate from the others
  – Condition register – used only by coprocessor branch and compare
  – No lo and hi registers in the coprocessor
  – For double, registers are used in pairs – $f2 means $f2 and $f3 in double
• Notes
  – Data moves don't really use floating arithmetic, just floating registers
  – Note the register specifications in Appendix A or on the MIPS instruction sheet
  – Don't trust anything except Appendix A – the MIPS instruction sheet doesn't have all the instructions
  – Programming is otherwise like expression programming
Floating-point arithmetic

• Floating-point basics
• The IEEE standard
• The MIPS floating-point coprocessor
• MIPS floating-point instructions and registers
The basics of the MIPS floating point instructions
Operation               Variants           Comments
arith (add.d, etc.)     size (.s, .d)      All are three-register instructions
load/store              which coprocessor  Memory to/from coprocessor
move to/from coproc                        Between CPU and coprocessor registers
move inside coproc      .s, .d
compares                relationals        Result goes to coprocessor status bit, not a register
branches                true/false         Based on coprocessor status flag
absolute, negate        .s, .d             Finds absolute value / negates
converts                .s, .d, .w         Any of single, double, word (integer)
The opcodes themselves

Operation   Individual opcodes
add         add.s  add.d
sub         sub.s  sub.d
mul         mul.s  mul.d
div         div.s  div.d
abs         abs.s  abs.d
neg         neg.s  neg.d
compare     c.eq.s  c.eq.d  c.le.s  c.le.d  c.lt.s  c.lt.d
branch      bc1t  bc1f  (note the 1 isn't an l)
convert     cvt.d.s  cvt.s.d  cvt.s.w
move        mov.s  mov.d

• Notes
  – The floating processor is coprocessor 1 – the digit 1, not the letter l
  – .d, .s, .w refer to double, single, word
  – The order of convert specifiers is destination, source