Page 1: Lecture 2 ENGR-516 Spring 2011

ENGR-516 Spring 2011

Adjunct Prof. Michael A. Soderstrand

[email protected]

405-334-8329

Lecture #2 January 24, 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 1

Page 2: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 2

Homework #1 -- Due Monday January 24, 2011

Do the following problems from Chapters 1 and 2:

Problem 1.3, p. 21 of the text

Rather than the linear relationship of Eq. (1.7), you might choose to model the upward force on the parachutist as a second-order relationship,

Fu = -c'v^2

where c' = a second-order drag coefficient (kg/m).

a) Using calculus, obtain the closed-form solution for the case where the jumper is initially at rest (v = 0 at t = 0).
b) Repeat the numerical calculation in Example 1.2 with the same initial condition and parameter values. Use a value of 0.225 kg/m for c'.

Page 3: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 3

Homework #1 -- Due Monday January 24, 2011

Problem 1.19, p. 24 of the text

The velocity is equal to the rate of change of distance x(m),

dx/dt = v(t) [Eq P1.19]

a) Substitute Eq. (1.10) and develop an analytical solution for the distance as a function of time. Assume that x(0) = 0.
b) Use Euler's method to numerically integrate Eqs. (P1.19) and (1.9) in order to determine both the velocity and distance fallen as a function of time for the first 10 s of free fall, using the same parameters as in Example 1.2.
c) Develop a plot of your numerical results together with the analytical solutions.

Page 4: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 4

Homework #1 -- Due Monday January 24, 2011

Problem 2.3, p. 47 of the text

Develop, debug, and document a program to determine the roots of a quadratic equation ax^2 + bx + c = 0, in either a high-level language or a macro language of your choice (MatLab strongly suggested). Use a subroutine procedure to compute the roots (either real or complex). Perform test runs for the cases: a) a=1, b=6, c=2; b) a=0, b=-4, c=1.6; c) a=3, b=2.5, c=7.

Page 5: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 5

Homework #1 -- Due Monday January 24, 2011

Problem 2.25, p. 51 of the text

The pseudocode below computes the factorial. Express this algorithm as a well-structured function in the language of your choice (MatLab is strongly recommended). Test it by computing 0! and 5!. In addition, test the error trap by trying to evaluate (-2)!.

Pseudocode

FUNCTION fac(n)
  IF n ≥ 0 THEN
    x = 1
    DOFOR i = 1, n
      x = x · i
    END DO
    fac = x
  ELSE
    display error message
    terminate
  ENDIF
END fac

Page 6: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 6

Chapter 3 – Approximation and Round-Off Errors

Errors are inherent in any numerical solution

Even when we have an exact analytic solution (as in the parachute or circuit example of Chapter 1),

as soon as we use a computer to calculate solutions, those solutions have error in them.

Often we do not have exact or analytic solutions; numerical techniques then give us approximations – but how much error is there?

Page 7: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 7

Sources of Error

This lecture will deal with all of the major errors associated with numerical analysis

Chapter 3 discusses accuracy and precision and errors due to number representation in the

computer.

Chapter 4 deals with truncation errors and, in particular, the errors associated with Taylor series approximations.

Page 8: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 8

3.1 Significant Figures

Below is a car odometer (see text Fig 3.1)

How many significant digits are there?

Page 9: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 9

There are eight significant digits.

However, only the first seven

can be used with confidence.

We can approximate the 8th

digit (126,462.25).

Page 10: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 10

3.1 Significant Figures

Below is a car speedometer (see text Fig 3.1)

How many significant digits are there?

Page 11: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 11

There are two significant digits.

However, only the first can be

used with confidence.

We can approximate the 2nd

digit (52).

Page 12: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 12

3.2 Accuracy and Precision

Accuracy refers to how closely a

computed or measured value agrees with the true value.

Precision refers to how closely computed or measured values agree

with each other.

Page 13: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 13

Illustration of Accuracy and Precision

Figure 3.2 p. 55 of text

Page 14: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 14

Computer Precision

Precision on a computer comes from

the word-length used for computation.

Accuracy on a computer comes from

the quality of the algorithms used to

perform the calculations.

Page 15: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 15

3.3 Error Definitions

Numerical errors include truncation

errors due to inexact mathematical operations and round-off errors due

to significant figure limitations.

True Value = Calculated Value +

True Error

Calculated Value is referred to in the

text as the Approximation

Page 16: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 16

Error Definitions

Et = True Value – Approximation

Relative Error expressed as a percentage is given by:

εt = (True Error / True Value) × 100% = (Et / TV) × 100%

NOTE: This assumes we know the true

value.

Page 17: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 17

Example 3.1

Problem Statement We measure a bridge and a rivet. We measure

10,000cm for the bridge and 11cm for the rivet. If the true values are

9,999cm and 10cm respectively, calculate a) the true error and b) the

relative percent error. (Note: This is

slightly different than in the text.)

Page 18: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 18

Example 3.1

For the Bridge:

Et = 9999 cm − 10000 cm = −1 cm

εt = (Et / TV) × 100% = (−1 cm / 9999 cm) × 100% ≈ −0.01%

For the Rivet:

Et = 10 cm − 11 cm = −1 cm

εt = (Et / TV) × 100% = (−1 cm / 10 cm) × 100% = −10%
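As a quick check, here is a minimal MATLAB sketch (not from the text; the variable names are mine) that reproduces these numbers:

true_vals = [9999 10];           % true lengths (cm): bridge, rivet
meas_vals = [10000 11];          % measured lengths (cm)
Et    = true_vals - meas_vals;            % true errors: -1 cm in both cases
eps_t = Et ./ true_vals * 100;            % percent relative errors: about -0.01% and -10%
fprintf('Et = %g cm, eps_t = %.4f%%\n', [Et; eps_t])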

Page 19: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 19

Approximate Error

Often we do not know the true value.

When we do not know the true value, we use the following equation to calculate the approximate error:

εa = (Approximate Error / Approximation) × 100%

In the above formula, the denominator is our

approximation and the numerator is the approximate error.

We know the approximation, but how do we find the approximate error?

Page 20: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 20

Finding the Approximate Error

When we do not know the true value, it is often challenging to find the

approximate error.

However, in iterative algorithms, we can use the difference in successive

approximations as a reasonable indication of the approximate error:

εa = [(Current Approximation − Previous Approximation) / Current Approximation] × 100%

In the above formula, the denominator is our approximation and the numerator is the approximate error.

Page 21: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 21

Sign of the Error

The error is negative if the approximation is larger than the

true value

The error is positive if the

approximation is smaller than the true value

Page 22: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 22

Absolute Error

Often we are not concerned with the sign of the error.

In the rest of this class, we will use absolute error.

|εa| = |Approximate Error / Approximation| × 100%

Page 23: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 23

Error & Significant Digits

Often we want the result to be correct to at least a certain number of significant digits n.

We can be assured that the result is correct to at least n significant digits if:

|εa| < εs = (0.5 × 10^(2−n))%

Page 24: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 24


Let’s Take a

TEN MINUTE Break

Page 25: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 25

Example 3.2 Error Estimates for Iterative Methods

Problem Statement. In mathematics, functions can often be represented by infinite series. For example, the exponential function can be computed using

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!

Thus, as more terms are added to the sequence, the approximation becomes a better and better estimate of the true value of e^x. This is called a Maclaurin series expansion.

Page 26: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 26

Example 3.2 Error Estimates for Iterative Methods

Problem Statement Continued:

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!

Starting with the simplest version, e^x = 1, add terms one at a time to estimate e^0.5 = 1.648721... Add terms until the absolute value of the approximate error estimate εa falls below a prespecified error criterion εs conforming to three significant figures.

Page 27: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 27

Example 3.2 SOLUTION

First solve for the error criterion equivalent to three significant digits:

εs = (0.5 × 10^(2−n))% = (0.5 × 10^(2−3))% = 0.05%

The true error can be calculated as:

εt = [(e^0.5 − approximation) / e^0.5] × 100%

Page 28: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 28

Example 3.2 SOLUTION

However, we usually do not know the true value and therefore cannot calculate the true error.

In such cases we must use the approximation error:

εa = [(approx(n) − approx(n−1)) / approx(n)] × 100%

Page 29: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 29

x = 0.5, εs = 0.05%, e^0.5 = 1.64872127

n   Approximation   εt           εa
0   1.00000000      39.34693%    —
1   1.50000000       9.02040%    33.33333333%
2   1.62500000       1.43877%     7.69230769%
3   1.64583333       0.17516%     1.26582278%
4   1.64843750       0.01721%     0.15797788%
5   1.64869792       0.00142%     0.01579529%
6   1.64871962       0.00010%     0.00131626%
7   1.64872117       0.00001%     0.00009402%
8   1.64872127       0.00000%     0.00000588%
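The table can be regenerated with a short MATLAB script like the one below (a sketch of mine, not from the text); it accumulates the Maclaurin terms and computes εt and εa at each step:

x = 0.5;  true_val = exp(x);
approx = 1;                                   % the n = 0 approximation
fprintf('%2d  %.8f  %9.5f%%\n', 0, approx, (true_val-approx)/true_val*100)
for n = 1:8
    old = approx;
    approx = approx + x^n/factorial(n);       % add the n-th Maclaurin term
    et = (true_val - approx)/true_val*100;    % true percent relative error
    ea = (approx - old)/approx*100;           % approximate percent relative error
    fprintf('%2d  %.8f  %9.5f%%  %12.8f%%\n', n, approx, et, ea)
end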

Page 30: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 30

3.3.1 Iterative Calculations

Most of the methods in this course

are iterative using successive approximation to the true value

The computer implementation involves LOOPS usually ending when

the error drops below a specified value.

Page 31: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 31

function [v,ea,iter] = IterMeth(x,es,maxit)
% Implements a general series expansion
% INPUT:
%   x     = independent variable
%   es    = specified (stopping) error, in percent
%   maxit = maximum number of iterations
%   Relies on a function term(x,n) that defines the n-th term of the series
% OUTPUT:
%   v     = approximate value after iter iterations
%   ea    = approximate percent error after iter iterations
%   iter  = number of iterations
iter = 1; sol = 1; ea = 100;
while ea > es && iter < maxit
    solold = sol;
    termn = term(x,iter);
    sol = sol + termn;
    iter = iter + 1;
    if sol ~= 0
        ea = abs((sol-solold)/sol)*100;
    end
end
v = sol;
end
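With the term function from the next slide on the MATLAB path, a call such as the following (my example values, not from the slides) reproduces the e^0.5 result above; the loop stops once ea falls below es, with v ≈ 1.6487:

[v, ea, iter] = IterMeth(0.5, 0.05, 20)   % approximate e^0.5 to three significant figures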

Page 32: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 32

function [val] = term(x,n)
% Calculates the n-th term of the series for e^x
% INPUT:  x, n
% OUTPUT: val = x^n/n!
val = x^n/factorial(n);
end

Page 33: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 33

3.4 Round-Off Errors

Round-off errors originate from the fact that computers retain only a fixed number of significant digits.

Irrational numbers cannot be expressed exactly.

Many numbers that are exact in decimal are not exact in binary used by computers.

Page 34: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 34

3.4.1 Computer Representation of Numbers

Integers are represented in most computers in two's complement representation.

MatLab Integer Representations

Class                     Range of Values       Name
Signed 8-bit integer      -2^7 to 2^7 - 1       int8
Signed 16-bit integer     -2^15 to 2^15 - 1     int16
Signed 32-bit integer     -2^31 to 2^31 - 1     int32
Signed 64-bit integer     -2^63 to 2^63 - 1     int64
Unsigned 8-bit integer    0 to 2^8 - 1          uint8
Unsigned 16-bit integer   0 to 2^16 - 1         uint16
Unsigned 32-bit integer   0 to 2^32 - 1         uint32
Unsigned 64-bit integer   0 to 2^64 - 1         uint64
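In MATLAB these ranges can be checked directly with intmin and intmax (a quick sketch of mine, not from the text):

intmin('int8'),  intmax('int8')      % -128 and 127
intmin('int16'), intmax('int16')     % -32768 and 32767
intmax('uint8')                      % 255
intmax('uint64')                     % 18446744073709551615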

Page 35: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 35

3.4.1 Computer Representation of Numbers

Floating Point numbers are represented in most computers by IEEE Standard 754.

Name        Common name           Base   Digits   E min     E max     Decimal digits   Decimal E max
binary16    Half precision        2      10+1     -14       +15       3.31             4.51
binary32    Single precision      2      23+1     -126      +127      7.22             38.23
binary64    Double precision      2      52+1     -1022     +1023     15.95            307.95
binary128   Quadruple precision   2      112+1    -16382    +16383    34.02            4931.77
decimal32                         10     7        -95       +96       7                96
decimal64                         10     16       -383      +384      16               384
decimal128                        10     34       -6143     +6144     34               6144

Page 36: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 36

3.4.1 Computer Representation of Numbers

MatLab uses primarily double precision IEEE Standard 754 arithmetic.

Single precision is available, but half, quadruple, and the decimal representations are not.


Page 37: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 37

Binary and HEX

Binary HEX Binary HEX Binary HEX Binary HEX

0000 0 0100 4 1000 8 1100 C

0001 1 0101 5 1001 9 1101 D

0010 2 0110 6 1010 A 1110 E

0011 3 0111 7 1011 B 1111 F

Page 38: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 38

3.4.1 Integer Representation in MatLab

uint8: 8-bit unsigned integer (0 – 255)

Binary      HEX   Decimal
0110 1010   6A    6·16 + 10 = 106
0011 1011   3B    3·16 + 11 = 59
1100 0101   C5    12·16 + 5 = 197
1111 0111   F7    15·16 + 7 = 247

Page 39: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 39

3.4.1 Integer Representation in MatLab

uint8: 8-bit unsigned integer (0 – 255)

Decimal   HEX                            Binary
16        16/16 = 1 R 0   →  10          0001 0000
40        40/16 = 2 R 8   →  28          0010 1000
228       228/16 = 14 R 4 →  E4          1110 0100
157       157/16 = 9 R 13 →  9D          1001 1101

Page 40: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 40

3.4.1 Integer Representation in MatLab

int8: 8-bit signed integer (-128 – 127)

Binary      HEX   Decimal
0110 1010   6A    6·16 + 10 = 106
0011 1011   3B    3·16 + 11 = 59
1100 0101   C5    4·16 + 5 − 128 = −59
1111 0111   F7    7·16 + 7 − 128 = −9

Page 41: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 41

3.4.1 Integer Representation in MatLab

int8: 8-bit signed integer (-128 – 127)

Decimal   HEX                                          Binary
16        16/16 = 1 R 0  →  10                         0001 0000
40        40/16 = 2 R 8  →  28                         0010 1000
-28       (256 − 28) = 228;  228/16 = 14 R 4 →  E4     1110 0100
-99       (256 − 99) = 157;  157/16 = 9 R 13 →  9D     1001 1101

Page 42: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 42

3.4.1 Integer Representation in MatLab

int16: 16-bit signed integer (-32,768 – 32,767)

Binary                HEX    Decimal
0110 1010 0011 1101   6A3B   6·16^3 + 10·16^2 + 3·16 + 11 = 27,195
                             (MatLab: hex2dec('6A3B'))
1100 0101 1111 0111   C5F7   4·16^3 + 5·16^2 + 15·16 + 7 − 32,768 = −14,857
                             (MatLab: hex2dec('45F7') + intmin('int16'))

Page 43: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 43

3.4.1 Integer Representation in MatLab

int16: 16-bit signed integer (-32,768 – 32,767)

Decimal   HEX                                                Binary
4136      4136/16 = 258 R 8;  258/16 = 16 R 2;               0001 0000 0010 1000
          16/16 = 1 R 0   →  1028
-7011     (65,536 − 7011) = 58,525;  58,525/16 = 3657 R 13;  1110 0100 1001 1101
          3657/16 = 228 R 9;  228/16 = 14 R 4   →  E49D
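These hand conversions can be verified in MATLAB; typecast reinterprets the stored bits, which is one way (my suggestion, not from the slides) to recover the two's-complement hex pattern of a negative integer:

dec2hex(4136)                                         % '1028'
dec2bin(4136, 16)                                     % '0001000000101000'
dec2hex(typecast(int16(-7011), 'uint16'))             % 'E49D', the bit pattern of -7011 as int16
double(typecast(uint16(hex2dec('E49D')), 'int16'))    % back to -7011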

Page 44: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 44


Let’s Take a

TEN MINUTE Break

Page 45: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 45

3.4.1 Floating Point Representation

Computer equivalent of scientific notation.

I = 2,356    F = 2.356 × 10^3   (normalization)
I = 2,356    F = 2.36 × 10^3    (normalization & rounding)

Binary equivalent:

I = 1011     F = 1.011 × 2^3    (normalization)
I = 1011     F = 1.10 × 2^3     (normalization & rounding)

In binary, we save a bit, because the leading (whole-number) digit of a normalized value is always 1 – we only need to store the fractional part.

Page 46: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 46

IEEE 754 Double Precision

MSB is the sign of the number (0 = +; 1 = -)

The exponent is 11 bits in a biased number system (subtract 1023 from the stored exponent to get the actual exponent)

The mantissa is formed by 1 + the 52-bit fraction. (53 bits total)

Page 47: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 47

Exponent encoding

The double-precision binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 1023; this is known as the exponent bias in the IEEE 754 standard. Examples of such representations:

■ Emin: stored 1 → 1 − 1023 = −1022
■ stored 50 → 50 − 1023 = −973
■ Emax: stored 2046 → 2046 − 1023 = +1023

Thus, as defined by the offset-binary representation, the exponent bias of 1023 has to be subtracted from the stored exponent to get the true exponent. The stored exponents 0x000 and 0x7ff have special meanings:

■ 0x000 is used to represent zero (if F = 0) and subnormals (if F ≠ 0); and
■ 0x7ff is used to represent infinity (if F = 0) and NaNs (if F ≠ 0),

where F is the fraction (mantissa). All bit patterns are valid encodings.

Page 48: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 48

IEEE 754 Double Precision

(−1)^sign × 2^(exponent − 1023) × 1.mantissa

0x3ff0 0000 0000 0000 = 1

0x3ff0 0000 0000 0001 = 1.0000000000000002,

the next higher number > 1

0x3ff0 0000 0000 0002 = 1.0000000000000004

0x4000 0000 0000 0000 = 2

0xc000 0000 0000 0000 = –2
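MATLAB's num2hex and hex2num expose these bit patterns directly, so the examples above can be checked (a sketch of mine, not from the text):

num2hex(1)                    % '3ff0000000000000'
num2hex(2)                    % '4000000000000000'
num2hex(-2)                   % 'c000000000000000'
hex2num('3ff0000000000001')   % 1.0000000000000002, the next double above 1
1 + eps                       % the same value: eps = 2^-52 is the spacing at 1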

Page 49: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 49

IEEE 754 Double Precision

(−1)^sign × 2^(exponent − 1023) × 1.mantissa

0x0000 0000 0000 0000 = 0

0x8000 0000 0000 0000 = –0

0x7ff0 0000 0000 0000 = Infinity

0xfff0 0000 0000 0000 = -Infinity

Page 50: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 50

IEEE 754 Double Precision

(−1)^sign × 2^(exponent − 1023) × 1.mantissa

0x0000 0000 0000 0001 ≈ 4.9406564584124654 × 10^−324   (min subnormal positive double)

0x0010 0000 0000 0000 ≈ 2.2250738585072014 × 10^−308   (min normal positive double)

0x7fef ffff ffff ffff ≈ 1.7976931348623157 × 10^308    (max double)

Note: Subnormals fill the gap between zero and smallest number (see text pp. 65-67 for explanation of this gap).
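The corresponding MATLAB constants (my illustration, not from the text) are:

realmax          % 1.7977e+308, largest finite double
realmin          % 2.2251e-308, smallest normalized positive double
realmin/2^52     % 4.9407e-324, smallest subnormal positive double
eps              % 2.2204e-16, spacing between 1 and the next larger double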

Page 51: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 51

IEEE 754 Double Precision

(−1)^sign × 2^(exponent − 1023) × 1.mantissa

0x7ff0 0000 0000 0001 = sNaN (signaling NaN)

0xfff8 0000 0000 0000 = qNaN (quiet NaN)

Page 52: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 52

There are three kinds of operations which return NaN:

Operations with a NaN as at least one operand.

Indeterminate forms:
o The divisions 0/0, ∞/∞, ∞/−∞, −∞/∞, and −∞/−∞
o The multiplications 0×∞ and 0×(−∞)
o The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
o The standard has alternative functions for powers: the standard pow function and the integer-exponent pown function define 0^0, 1^∞, and ∞^0 as 1, while the powr function defines all three as invalid operations (NaN).

Real operations with complex results, for example:
o The square root of a negative number
o The logarithm of a negative number
o The inverse sine or cosine of a number which is less than −1 or greater than +1.

Page 53: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 53

3.4.2 Arithmetic Manipulations

Floating-point addition and subtraction can cause significant error.

Exponents must be the same to add or subtract

  1.557 × 10^4
+ 4.381 × 10^2

Can't Add!

The mantissa of the number with the smaller exponent is modified to make the exponents the same:

  1.557   × 10^4
+ 0.04381 × 10^4
  1.60081 × 10^4

Page 54: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 54

3.4.2 Arithmetic Manipulations

  1.557   × 10^4
+ 0.04381 × 10^4
  1.60081 × 10^4

However, we must now either truncate (chop) or round to the machine precision (4 digits):

  1.557   × 10^4
+ 0.04381 × 10^4
  1.600   × 10^4   (chopped)

  1.557   × 10^4
+ 0.04381 × 10^4
  1.601   × 10^4   (rounded)

Page 55: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 55

3.4.2 Arithmetic Manipulations

Floating-point subtraction of two numbers that are very close to each other causes major problems.

Exponents must be the same to add or subtract

  7.642 × 10^3
− 7.641 × 10^3
  0.001 × 10^3

Renormalization then creates three non-significant digits:

  7.642 × 10^3
− 7.641 × 10^3
  1.000 × 10^0
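The same effect is easy to reproduce in double precision in MATLAB (a sketch of mine, not from the text): subtracting two nearly equal numbers exposes the rounding of the operands, and a sufficiently small addend can be lost entirely:

a = 1 + 1e-15;  b = 1;
a - b                     % 1.1102e-15, not 1e-15: cancellation exposes the rounding of a
x = 1e16;
(x + 1) - x               % 0: the added 1 is lost, since 1e16 + 1 rounds back to 1e16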

Page 56: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 56

Example 3.7

Investigate the effect of round-off error on large numbers of interdependent computations.

The text suggests adding 1 and 0.00001 in single precision to themselves 100,000 times.

The first sum gives exactly 100,000, but the second will not equal the expected 1 due to round-off errors.

The text then suggests doing the second sum in double precision.

Page 57: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 57

Example 3.7 Generalized

% Program Fig0312 (Page 72 of text)
% INPUTS:  x1 = the first number to add (e.g., 1)
%          x2 = the second number to add (e.g., 0.1), summed in single precision
%          x3 = the third number to add (e.g., 0.1), summed in double precision
%          n  = number of iterations (e.g., 10,000,000)
% OUTPUTS: sum1 = x1 summed n times (single precision)
%          sum2 = x2 summed n times (single precision)
%          sum3 = x3 summed n times (double precision)

sum1=single(0);

sum2=sum1;

sum3=0;

for i = 1:n

sum1=sum1+x1;

sum2=sum2+x2;

sum3=sum3+x3;

end

Page 58: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 58

Example 3.7 Generalized

% Program Fig0312 (Page 72 of text), continued

sum1=single(0);

sum2=sum1;

sum3=0;

for i = 1:n

sum1=sum1+x1;

sum2=sum2+x2;

sum3=sum3+x3;

end

t1=sprintf('n = %0.10g, ',n);

t2=sprintf('x1 = %0.8g, x2 = %0.8g, x3 = %0.8g\n',x1,x2,x3);

t3=sprintf('sum1 = %0.8g\n',sum1);

t4=sprintf('sum2 = %0.8g\n',sum2);

t5=sprintf('sum3 = %0.8g\n',sum3);

t=sprintf('%s%s%s%s%s',t1,t2,t3,t4,t5);

sprintf(t)

Output:
n = 10000000, x1 = 1, x2 = 0.1, x3 = 0.1
sum1 = 10000000
sum2 = 1087937
sum3 = 1000000
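To run the fragment above, the inputs just need to exist in the workspace first; for example (assuming the code is saved as a script named Fig0312.m, which is my assumption, not something stated in the slides):

x1 = 1;  x2 = 0.1;  x3 = 0.1;  n = 10000000;
Fig0312     % prints n, the inputs, and the three sums shown above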

Page 59: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 59

Example 3.8

Find the roots of the quadratic equation with a=1, b=30000000.0000001, c=3, i.e., x^2 + bx + c = (x + 30000000)(x + 0.0000001), so the exact roots are −3×10^7 and −1×10^−7.

MatLab and Excel have trouble with this

EDU>> a=1;b=30000000.0000001;c=3; EDU>> r1=(-b+sqrt(b^2-4*a*c))/2/a

r1 = -1.00582838058472e-007

EDU>> r2=(-b-sqrt(b^2-4*a*c))/2/a

r2 = -30000000

EDU>> r1=2*c/(-b-sqrt(b^2-4*a*c))

r1 = -1e-007

a = 1

b = 30000000.0000001

c = 3

r1 = -0.000000100582838

r2 = -30000000

r1 = -0.000000100000000
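A common remedy, sketched below in MATLAB (the function name is mine), is to compute the large-magnitude root with the usual formula and the small one from the alternative form r = 2c/(-b - sqrt(b^2 - 4ac)), which avoids subtracting two nearly equal numbers:

function [r1, r2] = quadroots(a, b, c)
% Roots of a*x^2 + b*x + c = 0, avoiding subtractive cancellation
d  = sqrt(b^2 - 4*a*c);
q  = -(b + sign(b)*d)/2;   % the well-conditioned intermediate quantity
r1 = q/a;                  % large-magnitude root
r2 = c/q;                  % small-magnitude root, computed without cancellation
end

For a = 1, b = 30000000.0000001, c = 3 this returns r1 ≈ -3e7 and r2 ≈ -1e-7.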

Page 60: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 60

Example 3.9

Using a Taylor series to calculate e^x when x is negative is not recommended.

Instead calculate 1/e^|x|.

However, the example in the text on pp. 74-75 works fine in both MatLab and Excel.

You need to use a larger magnitude of x to see the effect (try x = -19 or -20).

EDU>> IterMeth(x,es,maxit)
ans = 2.55376446514562e-009
EDU>> exp(-19)
ans = 5.60279643753727e-009

Page 61: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 61


Let’s Take a

TEN MINUTE Break

Page 62: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 62

Chapter 4 – Truncation errors and the Taylor Series

Truncation Errors are those that result from using

an approximation in place of an exact mathematical procedure

Euler's method is an example of a first-order approximation to the true next value of a function.

Most approximation methods including Euler’s Method make use of a Taylor Series Approximation.

Page 63: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 63

4.1 – The Taylor Series

If a function f and its first n+1 derivatives are continuous on an interval containing x and x+h, then:

f(x+h) = f(x) + f'(x)·h + [f''(x)/2!]·h^2 + ... + [f^(n)(x)/n!]·h^n + Rn

where the nth-order approximation error (remainder) is given by:

Rn = ∫ from t = x to x+h of [(x+h−t)^n / n!]·f^(n+1)(t) dt = [f^(n+1)(ξ)/(n+1)!]·h^(n+1)

where ξ is some value between x and x+h.

Page 64: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 64

4.1 – Zero, 1st and 2nd Order Approximations

The remainder Rn carries the truncated approximation to the exact prediction:

Page 65: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 65

Example 4.1 – Exact Taylor Series Approximation

For polynomials of order n, the n-th order Taylor series approximation is exact.

Consider f(x) = −0.1x^4 − 0.15x^3 − 0.5x^2 − 0.25x + 1.2   (n = 4)

f'(x) = −0.4x^3 − 0.45x^2 − x − 0.25
f''(x) = −1.2x^2 − 0.9x − 1
f'''(x) = −2.4x − 0.9
f''''(x) = −2.4
f^(5)(x) = 0, hence Rn = 0:

R4 = [f^(5)(ξ)/(4+1)!]·h^(4+1) = (0/5!)·h^5 = 0

Page 66: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 66

Example 4.1 – Successive Approximation

Here are the successive approximations for x = 0 and h = 1 of f(x) = −0.1x^4 − 0.15x^3 − 0.5x^2 − 0.25x + 1.2:

f'(0) = −0.25,  f''(0) = −1,  f'''(0) = −0.9,  f''''(0) = −2.4,  f^(5)(0) = 0

In general, f0(h) = f(x) and fn(h) = fn−1(h) + [f^(n)(x)/n!]·h^n. For x = 0 and h = 1:

f0(1) = f(0) = 1.2
f1(1) = f0(1) + f'(0)·(1) = 1.2 − 0.25 = 0.95
f2(1) = f1(1) + [f''(0)/2!]·(1)^2 = 0.95 − 1/2 = 0.45
f3(1) = f2(1) + [f'''(0)/3!]·(1)^3 = 0.45 − 0.9/6 = 0.30
f4(1) = f3(1) + [f''''(0)/4!]·(1)^4 = 0.30 − 2.4/24 = 0.20 = f(1)
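The successive approximations above can be generated with a few lines of MATLAB (my sketch, not from the text):

f = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
d = [-0.25, -1, -0.9, -2.4];          % f'(0), f''(0), f'''(0), f''''(0)
h = 1;  fn = f(0);                    % zero-order approximation: 1.2
fprintf('f0(1) = %.2f\n', fn)
for n = 1:4
    fn = fn + d(n)/factorial(n)*h^n;  % add the n-th Taylor term about x = 0
    fprintf('f%d(1) = %.2f\n', n, fn)
end
f(1)                                  % exact value: 0.2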

Page 67: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 67

Example 4.2 Infinite Taylor Series Approximation

Use Taylor series expansions with n = 0 through n = 6 to approximate f(x) = cos(x) at x = π/3 on the basis of f(x) and its derivatives at π/4 (h = π/3 − π/4).

Order n   f^(n)(x)    f^(n)(π/4)     nth-order estimate of f(π/3)   εt
0         cos(x)       0.70710678    0.70710678                     41.4214%
1         −sin(x)     −0.70710678    0.52198666                      4.3973%
2         −cos(x)     −0.70710678    0.49775449                      0.4491%
3         sin(x)       0.70710678    0.49986915                      0.0262%
4         cos(x)       0.70710678    0.50000755                      0.0015%
5         −sin(x)     −0.70710678    0.50000030                      0.0001%
6         −cos(x)     −0.70710678    0.49999999                      0.0000%
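A MATLAB sketch of mine that reproduces the table; it uses the fact that the nth derivative of cos(x) is cos(x + n·π/2), so the derivatives can be evaluated directly at π/4:

xi = pi/4;  h = pi/3 - pi/4;  true_val = cos(pi/3);
approx = 0;
for n = 0:6
    dn = cos(xi + n*pi/2);                 % n-th derivative of cos at pi/4
    approx = approx + dn/factorial(n)*h^n; % n-th order Taylor estimate of cos(pi/3)
    et = abs((true_val - approx)/true_val)*100;
    fprintf('n = %d   estimate = %.8f   et = %.4f%%\n', n, approx, et)
end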

Page 68: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 68

4.1.1 – The Remainder for the Taylor Series Expansion

Suppose we truncate the Taylor series expansion after the first term (zero-order approximation):

f(x+h) = f(x) + R0

where the zero-order approximation error is given by:

R0 = ∫ from t = x to x+h of f'(t) dt = f'(ξ)·h

where ξ is some value between x and x+h.

Page 69: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 69

4.1.1 – The Remainder for the Taylor Series Expansion

R0 = f'(ξ)·h: the slope f'(ξ) times the step h carries the zero-order prediction to the exact value:

Figure 4.2 Zero-order Taylor Series Prediction

Page 70: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 70

4.1.1 – The Remainder for the Taylor Series Expansion

By the derivative mean-value theorem, some point ξ between x and x+h has a slope f'(ξ) that, multiplied by h, gives R0 exactly:

Figure 4.3 Derivative Mean-Value theorem

Page 71: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 71

4.1.2 Using the Taylor Series to Estimate Truncation Error

The remainder Rn is an exact measure of the error of truncating a Taylor series after term n.

Even though Rn usually cannot be calculated exactly, it is of order h^(n+1), i.e., Rn = O(h^(n+1)).

This allows us to estimate the error with good accuracy.

In iterative procedures this estimate serves as the stopping criterion.

Page 72: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 72

Figure 4.4 Increasing nonlinearity of a function requires higher- and higher-order Taylor series approximations.

Page 73: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 73

Figure 4.5 Log-log plot of R1 vs. h, showing that as h gets smaller, R1 decreases with h^2, i.e., R1 = O(h^2).

Page 74: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 74

4.1.3 Numerical Differentiation

This will be covered in much more detail in Chapters 23 and 24.

Here we will introduce the three approximations typically used for a derivative o The forward difference o The backward difference o The centered difference

We will also look briefly at second derivatives.

Page 75: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 75

4.1.3 Numerical Differentiation

The first forward difference approximation:

f'(xi) = [f(xi+1) − f(xi)] / h + O(h) = Δfi/h + O(h)

The first backward difference approximation:

f'(xi) = [f(xi) − f(xi−1)] / h + O(h) = ∇fi/h + O(h)

Page 76: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 76

4.1.3 Numerical Differentiation

The first centered difference approximation:

f'(xi) = [f(xi+1) − f(xi−1)] / (2h) + O(h^2)

The second-derivative (centered) approximation:

f''(xi) = [f(xi+1) − 2f(xi) + f(xi−1)] / h^2 + O(h^2)

Page 77: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 77

Example 4.4 (pages 92-93 of the text)

f(x) = Ax^4 + Bx^3 + Cx^2 + Dx + E,  with A = −0.1, B = −0.15, C = −0.5, D = −0.25, E = 1.2
True derivative: f'(0.5) = −0.9125

x       f(x)
0.00    1.20000000
0.25    1.10351563
0.50    0.92500000
0.75    0.63632813
1.00    0.20000000

For h = 0.5:                             error
  forward   Δf/h = −1.45                 58.90%
  backward  ∇f/h = −0.55                 39.73%
  centered       = −1.00                  9.59%

For h = 0.25:                            error     error reduction
  forward   Δf/h = −1.15469              26.54%    45.06%
  backward  ∇f/h = −0.71406              21.75%    54.74%
  centered       = −0.93438               2.40%    25.00%
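These numbers follow directly from the difference formulas; a MATLAB sketch of mine:

f  = @(x) -0.1*x.^4 - 0.15*x.^3 - 0.5*x.^2 - 0.25*x + 1.2;
fp = -0.9125;                              % true derivative f'(0.5)
for h = [0.5 0.25]
    fwd = (f(0.5+h) - f(0.5))/h;           % forward difference
    bwd = (f(0.5)   - f(0.5-h))/h;         % backward difference
    ctr = (f(0.5+h) - f(0.5-h))/(2*h);     % centered difference
    err = abs(([fwd bwd ctr] - fp)/fp)*100;
    fprintf('h = %.2f:  %9.5f  %9.5f  %9.5f   errors: %5.2f%% %5.2f%% %5.2f%%\n', ...
            h, fwd, bwd, ctr, err)
end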

Page 78: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 78

4.2 – Error Propagation

The purpose of this section is to investigate how errors propagate through mathematical functions.

For example, if we multiply two numbers together, each with a known error, what is the error of the

product?

We will look at error propagation for:
o Single-variable functions
o Multi-variable functions
o Stability and condition

Page 79: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 79

4.2.1 – Single-Variable Functions

Assume we have a function f(x) that is dependent on a single variable x.

We have a known value x̃ that is an approximation to x.

Then the error introduced by using f(x̃) rather than f(x) is estimated by:

Δf(x̃) = |f'(x̃)|·Δx̃

Page 80: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 80

4.2.1 – Single-Variable Functions

Then the error in using f(x̃) rather than f(x) is estimated by: Δf(x̃) = |f'(x̃)|·Δx̃

Page 81: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 81

Example 4.5

Given x̃ = 2.5 with Δx̃ = 0.01, estimate the resulting error in the function f(x) = x^3.

Solution: We have f'(x) = 3x^2, hence f'(x̃) = 3(2.5)^2 = 18.75. Therefore:

Δf(x̃) = |f'(x̃)|·Δx̃ = 18.75 × 0.01 = 0.1875

Prediction: (2.5)^3 − 0.1875 ≤ x^3 ≤ (2.5)^3 + 0.1875, i.e., 15.4375 ≤ x^3 ≤ 15.8125
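A quick numerical check in MATLAB (my sketch, not from the text), comparing the first-order estimate with the actual spread of x^3 over the interval:

f = @(x) x.^3;  df = @(x) 3*x.^2;
xt = 2.5;  dx = 0.01;
est = abs(df(xt))*dx                 % first-order error estimate: 0.1875
[f(xt) - est, f(xt) + est]           % predicted interval: [15.4375, 15.8125]
[f(xt - dx), f(xt + dx)]             % actual values: about 15.4382 and 15.8133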

Page 82: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 82

4.2.2 – Multi-Variable Functions

The concept of 4.2.1 can be extended to multi-variable functions:

Δf(x̃1, x̃2, ..., x̃n) ≅ |∂f/∂x1|·Δx̃1 + |∂f/∂x2|·Δx̃2 + ... + |∂f/∂xn|·Δx̃n

See Example 4.6 pp. 96-97.

Page 83: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 83

4.2.3 – Stability and Condition

The condition of a mathematical problem relates to its sensitivity to changes in input values.

We say that a computation is numerically

unstable if the uncertainty of the input values is

grossly magnified by the numerical method.

A Condition Number can be defined by:

Condition Number = x̃·f'(x̃) / f(x̃)

Page 84: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 84

4.3 – Total Numerical Error

The Total Numerical Error is the summation of the truncation and round-off (or chopping) errors.

Round-off and chopping errors increase with subtractive cancellation and increased number of

calculations.

Truncation errors decrease with smaller step size,

but smaller step sizes increase the number of

calculations. Hence: There is a trade-off!

Page 85: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 85

4.3 – Total Numerical Error

Truncation errors decrease with smaller step size, but smaller step sizes increase the number of

calculations. Hence: There is a trade-off!

Figure 4.8 Trade-off between Truncation and Round-Off errors.

Page 86: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 86

Example 4.8 Truncation errors decrease with smaller step size,

but smaller step sizes increase the number of calculations. Hence: There is a trade-off!

Figure 4.9 Trade-off between Truncation and Round-Off errors in Example 4.8.

Page 87: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 87

4.4 – Blunders, Formulation Errors and Uncertainty

Programming blunders are often a cause of

error that is difficult to eliminate.

Modeling Errors (also called Formulation

Errors) are also difficult to eliminate.

Data Uncertainty also causes errors, but we

do have some measures for this that can be very helpful in at least accounting for these

types of errors.

Page 88: Lecture 2 ENGR-516 Spring 2011

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011

End of Tonight’s Lecture

Any questions?

Homework for next week: 3.1, 3.3, 3.6

(note: use e-19 rather than e-5), 3.13, 4.1

and 4.23

Quiz for this week is NOW online –

Complete before class next week

Quiz for next week will be online next

Monday.

