
1 CSE1301 Computer Programming Lecture 33: Real Number Representation.

Date post: 20-Dec-2015

CSE1301 Computer Programming

Lecture 33: Real Number Representation


Topics

• Terminology

• IEEE standard for floating-point representation

• Floating point arithmetic

• Limitations


Some Terminology

• All digits in a number following any leading zeros are significant digits:

12.345 -0.12345 0.00012345


Some Terminology (cont)

• The scientific notation for real numbers is:

mantissa × base^exponent

In C, the expression: 12.456e-2

means: 12.456 × 10^-2


Some Terminology (cont)

• The mantissa is always normalized between 1 and the base (i.e., exactly one significant digit before the point)

Unnormalized               Normalized
2997.9 × 10^5              2.9979 × 10^8
B1.39FC × 16^11            B.139FC × 16^12
0.010110110101 × 2^-1      1.0110110101 × 2^-3


Some Terminology (cont)

• The precision of a number is how many digits (or bits) we use to represent it

• For example:

3
3.14
3.1415926
3.1415926535897932384626433832795028


Representing Numbers

• A real number n is represented by a floating-point approximation n*

• The computer uses 32 bits (or more) to store each approximation

• It needs to store

– the mantissa

– the sign of the mantissa

– the exponent (with its sign)


Representing Numbers (cont)

• The standard way to allocate 32 bits (specified by IEEE Standard 754) is:

– 23 bits for the mantissa

– 1 bit for the mantissa's sign

– 8 bits for the exponent

[Figure: 32-bit word layout; bit 31 holds the sign, bits 30 to 23 the exponent, bits 22 to 0 the mantissa]


Representing the Mantissa

• The mantissa has to be in the range 1 ≤ mantissa < base

• Therefore

– If we use base 2, the digit before the point must be a 1

– So we don't have to worry about storing it

• We get 24 bits of precision using 23 bits


Representing the Mantissa (cont)

• 24 bits of precision are equivalent to a little over 7 decimal digits:

24 / log2(10) ≈ 7.2


Representing the Mantissa (cont)

• Suppose we want to represent π: 3.1415926535897932384626433832795.....

• That means that we can only represent it as:

3.141592 (if we truncate)
3.141593 (if we round)


Representing the Mantissa (cont)

• Even if the computer appears to represent more than 7 decimal places, only the first 7 places are meaningful

• For example:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        float pi = 2 * asin(1);   /* asin(1) is pi/2; the result is rounded to float */
        printf("%.35f\n", pi);
        return 0;
    }

Prints out:

    3.14159274191257320000000000000000000


Representing the Exponent

• The exponent is represented as excess-127. E.g.,

Actual Exponent    Stored Value
-127               00000000
-126               00000001
. . .
0                  01111111
+1                 10000000
. . .
i                  (i+127) in base 2
. . .
+128               11111111


Representing the Exponent (cont)

• The IEEE standard restricts exponents to the range:

–126 ≤ exponent ≤ +127

• The exponents –127 and +128 have special meanings (when the mantissa bits are all zero):

– If exponent = –127, the stored value is 0

– If exponent = +128, the stored value is ∞


Representing Numbers -- Example 1

What is 01011011 (8-bit machine)?

0      101    1011
sign   exp    mantissa

• Mantissa: 1.1011

• Exponent (excess-3 format): 5 - 3 = 2

1.1011₂ × 2^2 = 110.11₂

110.11₂ = 2^2 + 2^1 + 2^-1 + 2^-2
        = 4 + 2 + 0.5 + 0.25 = 6.75


Representing Numbers -- Example 2

Represent -10.375 (32-bit machine)

10.375₁₀ = 10 + 0.25 + 0.125
         = 2^3 + 2^1 + 2^-2 + 2^-3
         = 1010.011₂ = 1.010011₂ × 2^3

• Sign: 1

• Mantissa: 010011

• Exponent (excess-127 format): 3 + 127 = 130₁₀ = 10000010₂

1 10000010 01001100000000000000000


Floating Point Overflow

• Floating point representations can overflow, e.g.,

    1.111111 × 2^127
  + 1.111111 × 2^127
  ------------------
   11.111110 × 2^127

 = 1.1111110 × 2^128


Floating Point Underflow

• Floating point numbers can also get too small, e.g.,

   10.010000 × 2^-126
 ÷ 11.000000 × 2^0
 --------------------
    0.110000 × 2^-126

 =  1.100000 × 2^-127


Floating Point Addition

Five steps to add two floating point numbers:

1. Express the numbers with the same exponent (denormalize)

2. Add the mantissas

3. Adjust the mantissa to one digit/bit before the point (renormalize)

4. Round or truncate to required precision

5. Check for overflow/underflow

Floating Point Addition -- Example 1 (Assume precision 4 decimal digits)

x = 9.876 × 10^7

y = 1.357 × 10^6

1. Use the same exponents:

x = 9.876 × 10^7

y = 0.1357 × 10^7

2. Add the mantissas (y is first rounded to three decimal places so that it lines up with x):

x = 9.876 × 10^7

y = 0.136 × 10^7

x+y = 10.012 × 10^7

3. Renormalize the sum:

x+y = 1.0012 × 10^8

4. Truncate or round:

x+y = 1.001 × 10^8

5. Check overflow and underflow:

x+y = 1.001 × 10^8

Floating Point Addition -- Example 2 (Assume precision 4 decimal digits)

x = 3.506 × 10^-5

y = -3.497 × 10^-5

1. Use the same exponents:

x = 3.506 × 10^-5

y = -3.497 × 10^-5

2. Add the mantissas:

x+y = 0.009 × 10^-5

3. Renormalize the sum:

x+y = 9.000 × 10^-8

4. Truncate or round:

x+y = 9.000 × 10^-8 (no change)

5. Check overflow and underflow:

x+y = 9.000 × 10^-8


Floating Point Multiplication

Five steps to multiply two floating point numbers:

1. Multiply the mantissas

2. Add the exponents

3. Renormalize the mantissa

4. Round or truncate to required precision

5. Check for overflow/underflow

Floating Point Multiplication -- Example (Assume precision 4 decimal digits)

x = 9.001 × 10^5

y = 8.001 × 10^-3

1&2. Multiply mantissas and add exponents:

x × y = 72.017001 × 10^2

3. Renormalize the mantissa:

x × y = 7.2017001 × 10^3

4. Truncate or round:

x × y = 7.201 × 10^3 (if we truncate)

x × y = 7.202 × 10^3 (if we round)

5. Check overflow and underflow:

x × y = 7.202 × 10^3


Limitations

• Floating-point representations only approximate real numbers

• The normal laws of arithmetic don't always hold, e.g., associativity is not guaranteed

Limitations -- Example (Assume precision 4 decimal digits)

x = 3.002 × 10^3

y = -3.000 × 10^3

z = 6.531 × 10^0

Adding x and y first:

x+y = 2.000 × 10^0

(x+y)+z = 8.531 × 10^0

Adding y and z first:

y+z = -2.993 × 10^3

x+(y+z) = 0.009 × 10^3 = 9.000 × 10^0

The two results differ:

x+(y+z) = 9.000 × 10^0

(x+y)+z = 8.531 × 10^0


Limitations -- Exercise: Laws of Arithmetic

• Consider the laws of arithmetic:

– Commutativity (additive and multiplicative)

– Associativity

– Distributivity

– Identity (additive and multiplicative)

• Try to work out which ones always hold for floating-point numbers


Reading (for the Very Keen)

• Goldberg, D., What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, Vol.23, No.1, March 1991

• Knuth, D.E., The Art of Computer Programming (Vol 2) -- Seminumerical Algorithms, Section 4.4, pp. 319-329 (ed 3)

