ENGR-516 Spring 2011
Adjunct Prof. Michael A. Soderstrand
soderstrand@ieee.org
405-334-8329
Lecture #2 January 24, 2011
1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 1
Homework #1 -- Due Monday January 24, 2011
Do the following problems from Chapters 1 and 2:

Problem 1.3, p. 21 of the text
Rather than the linear relationship of Eq. (1.7), you might choose to model the upward force on the parachutist as a second-order relationship:

Fu = -c'v^2

where c' = a second-order drag coefficient (kg/m).
a) Using calculus, obtain the closed-form solution for the case where the jumper is initially at rest (v=0 at t=0).
b) Repeat the numerical calculation in Example 1.2 with the same initial condition and parameter values. Use a value of 0.225 kg/m for c'.
Homework #1 -- Due Monday January 24, 2011

Problem 1.19, p. 24 of the text
The velocity is equal to the rate of change of distance x (m):

dx/dt = v(t)    [Eq. P1.19]

a) Substitute Eq. (1.10) and develop an analytical solution for the distance as a function of time. Assume that x(0)=0.
b) Use Euler's method to numerically integrate Eqs. (P1.19) and (1.9) in order to determine both the velocity and the distance fallen as a function of time for the first 10 s of free fall, using the same parameters as in Example 1.2.
c) Develop a plot of your numerical results together with the analytical solutions.
Homework #1 -- Due Monday January 24, 2011

Problem 2.3, p. 47 of the text
Develop, debug, and document a program to determine the roots of a quadratic equation ax^2 + bx + c, in either a high-level language or a macro language of your choice (MatLab strongly suggested). Use a subroutine procedure to compute the roots (either real or complex). Perform test runs for the cases: a) a=1, b=6, c=2; b) a=0, b=-4, c=1.6; c) a=3, b=2.5, c=7.
Homework #1 -- Due Monday January 24, 2011

Problem 2.25, p. 51 of the text
The pseudocode below computes the factorial. Express this algorithm as a well-structured function in the language of your choice (MatLab is strongly recommended). Test it by computing 0! and 5! In addition, test the error trap by trying to evaluate -2!

Pseudocode:
FUNCTION fac(n)
  IF n ≥ 0 THEN
    x = 1
    DOFOR i = 1, n
      x = x · i
    END DO
    fac = x
  ELSE
    display error message
    terminate
  ENDIF
END fac
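The pseudocode above maps directly onto code. The assignment itself asks for MatLab; the following is only an illustrative Python sketch of the same structure, with the error trap implemented as an exception:

```python
# A Python sketch of the factorial pseudocode above (the assignment asks
# for MatLab); the error trap is implemented as an exception.
def fac(n):
    if n >= 0:
        x = 1
        for i in range(1, n + 1):   # DOFOR i = 1, n
            x = x * i
        return x
    else:
        raise ValueError("fac(n) is undefined for n < 0")
```

With this sketch, fac(0) returns 1 and fac(5) returns 120, while fac(-2) triggers the trap.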
Chapter 3 – Approximation and Round-Off Errors

Errors are inherent in any numerical solution.
Even when we have an exact analytic solution (as in the parachute or circuit examples of Chapter 1), as soon as we use a computer to calculate solutions, those solutions have error in them.
Often we do not have exact or analytic solutions; numerical techniques then give us approximations, but how much error is there?
Sources of Error

This lecture will deal with all of the major errors associated with numerical analysis.
Chapter 3 discusses accuracy and precision and errors due to number representation in the computer.
Chapter 4 deals with truncation errors and, in detail, the errors associated with Taylor's Series approximations.
3.1 Significant Figures

Below is a car odometer (see text Fig 3.1). How many significant digits are there?

There are eight significant digits. However, only the first seven can be used with confidence. We can approximate the 8th digit (126,462.25).
3.1 Significant Figures

Below is a car speedometer (see text Fig 3.1). How many significant digits are there?

There are two significant digits. However, only the first can be used with confidence. We can approximate the 2nd digit (52).
3.2 Accuracy and Precision

Accuracy refers to how closely a computed or measured value agrees with the true value.
Precision refers to how closely computed or measured values agree with each other.
Illustration of Accuracy and Precision
Figure 3.2 p. 55 of text
Computer Precision

Precision on a computer comes from the word-length used for computation.
Accuracy on a computer comes from the quality of the algorithms used to perform the calculations.
3.3 Error Definitions

Numerical errors include truncation errors due to inexact mathematical operations and round-off errors due to significant-figure limitations.

True Value = Calculated Value + True Error

The Calculated Value is referred to in the text as the Approximation.
Error Definitions

Et = True Value - Approximation

The relative error expressed as a percentage is given by:

εt = (True Error / True Value) × 100% = (Et / TV) × 100%

NOTE: This assumes we know the true value.
Example 3.1

Problem Statement: We measure a bridge and a rivet. We measure 10,000 cm for the bridge and 11 cm for the rivet. If the true values are 9,999 cm and 10 cm respectively, calculate a) the true error and b) the relative percent error. (Note: This is slightly different from the version in the text.)
Example 3.1

For the bridge:
Et = 9999 cm - 10000 cm = -1 cm
εt = (Et / TV) × 100% = (-1 cm / 9999 cm) × 100% ≈ -0.01%

For the rivet:
Et = 10 cm - 11 cm = -1 cm
εt = (Et / TV) × 100% = (-1 cm / 10 cm) × 100% = -10%
Approximate Error

Often we do not know the true value. When we do not know the true value, we use the following equation to calculate the approximate relative error:

εa = (Approximate Error / Approximation) × 100%

In the above formula, the denominator is our approximation and the numerator is the approximate error. We know the approximation, but how do we find the approximate error?
Finding the Approximate Error

When we do not know the true value, it is often challenging to find the approximate error. However, in iterative algorithms, we can use the difference in successive approximations as a reasonable indication of the approximate error:

εa = ((Current Approximation - Previous Approximation) / Current Approximation) × 100%

In the above formula, the denominator is our approximation and the numerator is the approximate error.
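This stopping criterion can be sketched with any convergent iteration. Heron's square-root iteration below is an illustrative example (not from the text), written as a Python sketch:

```python
# Illustrative sketch (not from the text): Heron's iteration for sqrt(a),
# stopping when the approximate percent error falls below es.
def heron_sqrt(a, es=1e-4, maxit=50):
    x = a                      # initial guess
    ea = 100.0
    for _ in range(maxit):
        x_new = 0.5 * (x + a / x)
        ea = abs((x_new - x) / x_new) * 100   # approximate percent error
        x = x_new
        if ea < es:
            break
    return x, ea

root, ea = heron_sqrt(2.0)
```

The loop never consults the true value of sqrt(2); the change between successive iterates alone drives the stopping decision.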
Sign of the Error

The error is negative if the approximation is larger than the true value.
The error is positive if the approximation is smaller than the true value.
Absolute Error

Often we are not concerned with the sign of the error. In the rest of this class, we will use the absolute error:

|εa| = |Approximate Error / Approximation| × 100%
Error & Significant Digits

Often we want the result to be correct to a certain number of significant digits n. The result will be correct to at least n significant digits if:

|εa| < εs = (0.5 × 10^(2-n))%
Let’s Take a TEN MINUTE Break
Example 3.2 Error Estimates for Iterative Methods

Problem Statement: In mathematics, functions can often be represented by infinite series. For example, the exponential function can be computed using

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!

Thus, as more terms are added, the approximation becomes a better and better estimate of the true value of e^x. This is called a Maclaurin series expansion.
Example 3.2 Error Estimates for Iterative Methods

Problem Statement continued:

e^x = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!

Starting with the simplest version, e^x = 1, add terms one at a time to estimate e^0.5 = 1.648721... Add terms until the absolute value of the approximate error estimate εa falls below a prespecified error criterion εs conforming to three significant figures.
Example 3.2 SOLUTION

First solve for the error criterion equivalent to three significant digits:

εs = (0.5 × 10^(2-3))% = 0.05%

The true error can be calculated as:

εt = ((e^0.5 - approx) / e^0.5) × 100%
Example 3.2 SOLUTION

However, we usually do not know the true value and therefore cannot calculate the true error. In such cases we must use the approximate error:

εa = ((approx(n) - approx(n-1)) / approx(n)) × 100%
x = 0.5, εs = 0.05%, e^x = 1.64872127

n   Approximation   εt          εa
0   1.00000000      39.34693%
1   1.50000000       9.02040%   33.33333333%
2   1.62500000       1.43877%    7.69230769%
3   1.64583333       0.17516%    1.26582278%
4   1.64843750       0.01721%    0.15797788%
5   1.64869792       0.00142%    0.01579529%
6   1.64871962       0.00010%    0.00131626%
7   1.64872117       0.00001%    0.00009402%
8   1.64872127       0.00000%    0.00000588%
3.3.1 Iterative Calculations

Most of the methods in this course are iterative, using successive approximation to the true value. The computer implementation involves LOOPS, usually ending when the error drops below a specified value.
function [v,ea,iter] = IterMeth(x,es,maxit)
% Implements a general series expansion
% INPUT:
%   x     = independent variable
%   es    = specified error (stopping criterion, in percent)
%   maxit = maximum number of iterations
%   Relies on a function term(x,n) that defines the n-th term of the series
% OUTPUT:
%   v    = approximate value after iter iterations
%   ea   = approximate error after iter iterations
%   iter = number of iterations
iter=1; sol=1; ea=100;
while ea>es && iter<maxit
    solold=sol;
    termn=term(x,iter);     % next term of the series
    sol=sol+termn;
    iter=iter+1;
    if sol ~= 0
        ea=abs((sol-solold)/sol)*100;   % approximate percent error
    end
end
v=sol;
end
function [val] = term(x,n)
% Calculates the n-th term of the Maclaurin series for e^x
% INPUT:  x, n
% OUTPUT: val
val=x^n/factorial(n);
end
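For readers following along without MatLab, here is a Python sketch of the same IterMeth/term pair (the names simply mirror the MatLab code above):

```python
import math

# A Python sketch of the MatLab IterMeth/term pair above.
def term(x, n):
    """n-th term of the Maclaurin series for e^x."""
    return x**n / math.factorial(n)

def iter_meth(x, es, maxit):
    it, sol, ea = 1, 1.0, 100.0
    while ea > es and it < maxit:
        sol_old = sol
        sol += term(x, it)
        it += 1
        if sol != 0:
            ea = abs((sol - sol_old) / sol) * 100  # approximate percent error
    return sol, ea, it

v, ea, iters = iter_meth(0.5, 0.05, 50)   # the 3-significant-figure criterion
```

With es = 0.05% this stops at v ≈ 1.64869792 with εa ≈ 0.0158%, matching the n=5 row of the table above.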
3.4 Round-Off Errors

Round-off errors originate from the fact that computers retain only a fixed number of significant digits.
Irrational numbers cannot be expressed exactly.
Many numbers that are exact in decimal are not exact in the binary representation used by computers.
3.4.1 Computer Representation of Numbers

Integers are represented in most computers in two's complement representation.

MatLab Integer Representations
Class                    Range of Values     Name
Signed 8-bit integer     -2^7 to 2^7-1       int8
Signed 16-bit integer    -2^15 to 2^15-1     int16
Signed 32-bit integer    -2^31 to 2^31-1     int32
Signed 64-bit integer    -2^63 to 2^63-1     int64
Unsigned 8-bit integer   0 to 2^8-1          uint8
Unsigned 16-bit integer  0 to 2^16-1         uint16
Unsigned 32-bit integer  0 to 2^32-1         uint32
Unsigned 64-bit integer  0 to 2^64-1         uint64
3.4.1 Computer Representation of Numbers

Floating-point numbers are represented in most computers by IEEE Standard 754.

Name        Common name          Base  Digits  E min    E max    Decimal digits  Decimal E max
binary16    Half precision       2     10+1    -14      +15      3.31            4.51
binary32    Single precision     2     23+1    -126     +127     7.22            38.23
binary64    Double precision     2     52+1    -1022    +1023    15.95           307.95
binary128   Quadruple precision  2     112+1   -16382   +16383   34.02           4931.77
decimal32                        10    7       -95      +96      7               96
decimal64                        10    16      -383     +384     16              384
decimal128                       10    34      -6143    +6144    34              6144
3.4.1 Computer Representation of Numbers

MatLab uses primarily double-precision IEEE Standard 754 arithmetic. Single precision is available, but half, quadruple, and the decimal representations are not.
Binary and HEX

Binary  HEX | Binary  HEX | Binary  HEX | Binary  HEX
0000    0   | 0100    4   | 1000    8   | 1100    C
0001    1   | 0101    5   | 1001    9   | 1101    D
0010    2   | 0110    6   | 1010    A   | 1110    E
0011    3   | 0111    7   | 1011    B   | 1111    F
3.4.1 Integer Representation in MatLab

uint8: 8-bit unsigned integer (0 – 255)

Binary      HEX   Decimal
0110 1010   6A    6×16+10 = 106
0011 1011   3B    3×16+11 = 59
1100 0101   C5    12×16+5 = 197
1111 0111   F7    15×16+7 = 247
3.4.1 Integer Representation in MatLab

uint8: 8-bit unsigned integer (0 – 255)

Decimal   HEX                         Binary
16        16/16 = 1 R 0   -> 10      0001 0000
40        40/16 = 2 R 8   -> 28      0010 1000
228       228/16 = 14 R 4 -> E4      1110 0100
157       157/16 = 9 R 13 -> 9D      1001 1101
3.4.1 Integer Representation in MatLab

int8: 8-bit signed integer (-128 – 127)

Binary      HEX   Decimal
0110 1010   6A    6×16+10 = 106
0011 1011   3B    3×16+11 = 59
1100 0101   C5    4×16+5-128 = -59
1111 0111   F7    7×16+7-128 = -9
3.4.1 Integer Representation in MatLab

int8: 8-bit signed integer (-128 – 127)

Decimal   HEX                              Binary
16        16/16 = 1 R 0        -> 10       0001 0000
40        40/16 = 2 R 8        -> 28       0010 1000
-28       (256-28)/16 = 14 R 4 -> E4       1110 0100
-99       (256-99)/16 = 9 R 13 -> 9D       1001 1101
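The two interpretations of the same bit pattern can be sketched in a few lines. This is a Python sketch (the function names are illustrative, not MatLab builtins):

```python
# Sketch of the byte interpretations in the tables above.
def as_uint8(bits):
    """Interpret an 8-character bit string as an unsigned byte."""
    return int(bits, 2)

def as_int8(bits):
    """Interpret an 8-character bit string as a two's-complement byte."""
    v = int(bits, 2)
    return v - 256 if v >= 128 else v   # sign bit set -> subtract 2^8
```

For the row C5 above, as_uint8('11000101') gives 197 while as_int8('11000101') gives -59: the same bits, two different conventions.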
3.4.1 Integer Representation in MatLab

int16: 16-bit signed integer (-32,768 – 32,767)

Binary                 HEX    Decimal
0110 1010 0011 1101    6A3B   6×16^3+10×16^2+3×16+11 = 27,195
                              (MatLab: hex2dec('6A3B'))
1100 0101 1111 0111    C5F7   4×16^3+5×16^2+15×16+7-32768 = -14,857
                              (MatLab: hex2dec('45F7') + intmin('int16'))
3.4.1 Integer Representation in MatLab

int16: 16-bit signed integer (-32,768 – 32,767)

Decimal   HEX                                Binary
4136      4136/16 = 258 R 8                  0001 0000 0010 1000
          258/16 = 16 R 2
          16/16 = 1 R 0          -> 1028
-7011     (65,536-7011)/16 = 3657 R 13       1110 0100 1001 1101
          3657/16 = 228 R 9
          228/16 = 14 R 4        -> E49D
Let’s Take a TEN MINUTE Break
3.4.1 Floating Point Representation

The computer equivalent of scientific notation.

I = 2,356    F = 2.356 × 10^3   (normalization)
I = 2,356    F = 2.36 × 10^3    (normalization & rounding to 3 digits)

Binary equivalent:

I = 1011     F = 1.011 × 2^3    (normalization)
I = 1011     F = 1.10 × 2^3     (normalization & rounding to 3 bits)

In binary we save a bit, because the whole-number part of a normalized value is always 1 – we only need to store the fractional part.
IEEE 754 Double Precision

The MSB is the sign of the number (0 = +; 1 = -).
The exponent is 11 bits in a biased number system (subtract 1023 from the stored value to get the actual exponent).
The mantissa is formed by 1 + the 52-bit fraction (53 bits total).
Exponent Encoding

The double-precision exponent is encoded using an offset-binary representation, with the zero offset being 1023; this is known as the exponent bias in the IEEE 754 standard. Examples of such representations:

  Emin: stored 1    -> -1022
  E:    stored 50   -> -973
  Emax: stored 2046 -> +1023

Thus, as defined by the offset-binary representation, to get the true exponent the bias of 1023 has to be subtracted from the stored exponent. The exponents 0x000 and 0x7ff have a special meaning:

  0x000 is used to represent zero (if F=0) and subnormals (if F≠0)
  0x7ff is used to represent infinity (if F=0) and NaNs (if F≠0)

where F is the fraction (mantissa). All other bit patterns are valid encodings.
IEEE 754 Double Precision

(-1)^sign × 2^(exponent-1023) × 1.mantissa

0x3ff0 0000 0000 0000 = 1
0x3ff0 0000 0000 0001 = 1.0000000000000002, the next representable number > 1
0x3ff0 0000 0000 0002 = 1.0000000000000004
0x4000 0000 0000 0000 = 2
0xc000 0000 0000 0000 = -2
IEEE 754 Double Precision

(-1)^sign × 2^(exponent-1023) × 1.mantissa

0x0000 0000 0000 0000 = 0
0x8000 0000 0000 0000 = -0
0x7ff0 0000 0000 0000 = Infinity
0xfff0 0000 0000 0000 = -Infinity
IEEE 754 Double Precision

(-1)^sign × 2^(exponent-1023) × 1.mantissa

0x0000 0000 0000 0001 ≈ 4.9406564584124654 × 10^-324  (min subnormal positive double)
0x0010 0000 0000 0000 ≈ 2.2250738585072014 × 10^-308  (min normal positive double)
0x7fef ffff ffff ffff ≈ 1.7976931348623157 × 10^308   (max double)

Note: Subnormals fill the gap between zero and the smallest normal number (see text pp. 65-67 for an explanation of this gap).
IEEE 754 Double Precision

(-1)^sign × 2^(exponent-1023) × 1.mantissa

0x7ff0 0000 0000 0001 = sNaN (signaling NaN)
0xfff8 0000 0000 0000 = qNaN (quiet NaN)
There are three kinds of operations which return NaN:

1. Operations with a NaN as at least one operand.
2. Indeterminate forms:
   o The divisions 0/0, ∞/∞, ∞/-∞, -∞/∞, and -∞/-∞
   o The multiplications 0×∞ and 0×(-∞)
   o The additions ∞ + (-∞), (-∞) + ∞ and equivalent subtractions
   o The standard has alternative functions for powers:
     - The standard pow function and the integer-exponent pown function define 0^0, 1^∞, and ∞^0 as 1.
     - The powr function defines all three as invalid operations (NaN).
3. Real operations with complex results, for example:
   o The square root of a negative number
   o The logarithm of a negative number
   o The inverse sine or cosine of a number which is less than -1 or greater than +1
3.4.2 Arithmetic Manipulations

Floating-point addition and subtraction can cause significant error. Exponents must be the same to add or subtract:

  1.557 × 10^4
+ 4.381 × 10^2
  Can't add!

The mantissa of the number with the smaller exponent is modified to make the exponents the same:

  1.557   × 10^4
+ 0.04381 × 10^4
  1.60081 × 10^4
3.4.2 Arithmetic Manipulations

  1.557   × 10^4
+ 0.04381 × 10^4
  1.60081 × 10^4

However, we must now either truncate (chop) or round to the machine precision (4 digits):

  1.557   × 10^4
+ 0.04381 × 10^4
  1.600   × 10^4   (chopped)

  1.557   × 10^4
+ 0.04381 × 10^4
  1.601   × 10^4   (rounded)
3.4.2 Arithmetic Manipulations

Floating-point subtraction of two numbers that are very close to each other causes major problems. Exponents must be the same to add or subtract:

  7.642 × 10^3
- 7.641 × 10^3
  0.001 × 10^3

Renormalization creates three non-significant digits:

  7.642 × 10^3
- 7.641 × 10^3
  1.000 × 10^0
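This subtractive cancellation shows up in binary double precision too. A minimal Python sketch (Python floats are IEEE doubles):

```python
# Illustrative sketch: subtracting nearly equal numbers discards the
# leading digits, so the tiny difference that remains is mostly
# rounding noise.
x = 1.0 + 1e-15
d = x - 1.0                       # exact arithmetic would give 1e-15
rel_err = abs(d - 1e-15) / 1e-15  # roughly 10% error from one subtraction
```

The subtraction itself is exact; the damage was done when 1e-15 was absorbed into the 53-bit mantissa of x, and the cancellation exposes that loss.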
Example 3.7

Investigate the effect of round-off error on large numbers of interdependent computations. The text suggests adding 1 and 0.00001 to themselves 100,000 times in single precision. The first sum is 100,000, but the second will not be 1 due to round-off errors. The text then suggests doing the second sum in double precision.
Example 3.7 Generalized
% Program Fig0312 (Page 72 of text)
% INPUTS:  x1 = the first number to add (eg: 1)
%          x2 = the second number to add (eg: 0.1), summed in single precision
%          x3 = the third number to add (eg: 0.1), summed in double precision
%          n  = number of iterations (eg: 10,000,000)
% OUTPUTS: sum1 = x1 summed n times (single precision)
%          sum2 = x2 summed n times (single precision)
%          sum3 = x3 summed n times (double precision)
sum1=single(0);
sum2=sum1;
sum3=0;
for i = 1:n
    sum1=sum1+x1;
    sum2=sum2+x2;
    sum3=sum3+x3;
end
Example 3.7 Generalized
% Program Fig0312 (Page 72 of text)
sum1=single(0);
sum2=sum1;
sum3=0;
for i = 1:n
    sum1=sum1+x1;
    sum2=sum2+x2;
    sum3=sum3+x3;
end
t1=sprintf('n = %0.10g, ',n);
t2=sprintf('x1 = %0.8g, x2 = %0.8g, x3 = %0.8g\n',x1,x2,x3);
t3=sprintf('sum1 = %0.8g\n',sum1);
t4=sprintf('sum2 = %0.8g\n',sum2);
t5=sprintf('sum3 = %0.8g\n',sum3);
t=sprintf('%s%s%s%s%s',t1,t2,t3,t4,t5);
disp(t)

Output:
n = 10000000, x1 = 1, x2 = 0.1, x3 = 0.1
sum1 = 10000000
sum2 = 1087937
sum3 = 1000000
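The same accumulation effect can be sketched in Python. Python floats are IEEE doubles, so the drift is far smaller than in the single-precision sum2 above, but it is still nonzero, because 0.1 has no exact binary representation:

```python
# Sketch of Example 3.7 in double precision: repeatedly adding 0.1
# accumulates a small but nonzero round-off drift.
n = 1_000_000
s = 0.0
for _ in range(n):
    s += 0.1
drift = abs(s - 100_000)   # would be exactly 0 in exact arithmetic
```

The drift is tiny in double precision, which is exactly the text's point about using sum3 (double) rather than sum2 (single).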
Example 3.8

Find the roots of the quadratic equation with a=1, b=30000000.0000001, c=3:
(x+30000000)(x+0.0000001)

MatLab and Excel have trouble with this:

EDU>> a=1;b=30000000.0000001;c=3;
EDU>> r1=(-b+sqrt(b^2-4*a*c))/2/a
r1 = -1.00582838058472e-007
EDU>> r2=(-b-sqrt(b^2-4*a*c))/2/a
r2 = -30000000
EDU>> r1=2*c/(-b-sqrt(b^2-4*a*c))
r1 = -1e-007

In format long:
a = 1
b = 30000000.0000001
c = 3
r1 = -0.000000100582838
r2 = -30000000
r1 = -0.000000100000000
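The same comparison can be sketched in Python. When b > 0, the "-b + sqrt(...)" numerator suffers subtractive cancellation, so the small root is computed from the rationalized form 2c / (-b - sqrt(...)) instead:

```python
import math

# Sketch of Example 3.8: the two ways to compute the small root.
a, b, c = 1, 30000000.0000001, 3
disc = math.sqrt(b*b - 4*a*c)

r1_naive = (-b + disc) / (2*a)   # cancellation: only a few good digits
r1_good  = 2*c / (-b - disc)     # well-conditioned alternative
r2       = (-b - disc) / (2*a)   # the large root is fine either way
```

The rationalized form adds the two large quantities instead of subtracting them, so no leading digits are lost.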
Example 3.9

Using a Taylor's Series to calculate e^x when x is negative is not recommended. Instead calculate 1/e^|x|. However, the example in the text on pp. 74-75 works fine in both MatLab and Excel. You need to use a larger-magnitude value of x to see the effect (try -19 and -20):

EDU>> IterMeth(x,es,maxit)
ans = 2.55376446514562e-009
EDU>> exp(-19)
ans = 5.60279643753727e-009
Let’s Take a TEN MINUTE Break
Chapter 4 – Truncation Errors and the Taylor Series

Truncation errors are those that result from using an approximation in place of an exact mathematical procedure.
Euler's Method is an example of a first-order approximation to the true next value of a function.
Most approximation methods, including Euler's Method, make use of a Taylor Series approximation.
4.1 – The Taylor Series

If a function f and its first n+1 derivatives are continuous on an interval containing x and x+h:

f(x+h) = f(x) + f'(x)h + (f''(x)/2!)h^2 + ... + (f^(n)(x)/n!)h^n + Rn

where the nth-order approximation error is given by:

Rn = ∫ from t=x to x+h of ((x+h-t)^n/n!) f^(n+1)(t) dt = (f^(n+1)(ξ)/(n+1)!) h^(n+1)

where ξ is some value between x and x+h.
4.1 – Zero, 1st and 2nd Order Approximations

Rn is the discrepancy between the nth-order approximation and the exact prediction:
Example 4.1 – Exact Taylor Series Approximation

For polynomials of order n, the n-th order Taylor Series approximation is exact.

Consider f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2   (n=4)

f'(x)    = -0.4x^3 - 0.45x^2 - x - 0.25
f''(x)   = -1.2x^2 - 0.9x - 1
f'''(x)  = -2.4x - 0.9
f''''(x) = -2.4
f^(5)(x) = 0      Hence, R4 = 0:

R4 = (f^(5)(ξ)/5!)h^5 = (0/5!)h^5 = 0
Example 4.1 – Successive Approximation

Here are the successive approximations for x=0 and h=1 of
f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2

f'(0)    = -0.25
f''(0)   = -1
f'''(0)  = -0.9
f''''(0) = -2.4
f^(5)(0) = 0

f0(1) = f(0) = 1.2
f1(1) = f0(1) + f'(0)(1)           = 1.2 - 0.25   = 0.95
f2(1) = f1(1) + (f''(0)/2!)(1)^2   = 0.95 - 1/2   = 0.45
f3(1) = f2(1) + (f'''(0)/3!)(1)^3  = 0.45 - 0.9/6 = 0.3
f4(1) = f3(1) + (f''''(0)/4!)(1)^4 = 0.3 - 2.4/24 = 0.2 = f(1)

In general: f0(h) = f(0);  fn(h) = fn-1(h) + (f^(n)(0)/n!)h^n
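The successive approximations above can be sketched with a short loop. A Python sketch (MatLab would be analogous):

```python
import math

# Sketch of the successive Taylor approximations above for
# f(x) = -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2, about x=0 with h=1.
derivs_at_0 = [1.2, -0.25, -1.0, -0.9, -2.4]  # f(0), f'(0), ..., f''''(0)
h = 1.0

approximations = []
total = 0.0
for n, d in enumerate(derivs_at_0):
    total += d * h**n / math.factorial(n)   # add the n-th Taylor term
    approximations.append(total)
# climbs through 1.2, 0.95, 0.45, 0.3 and ends at f(1) = 0.2 exactly
```

Because f is a 4th-order polynomial, the 4th-order approximation reproduces f(1) exactly, as the slide shows.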
Example 4.2 Infinite Taylor Series Approximation

Use Taylor Series expansions with n=0 through n=6 to approximate f(x) = cos(x) at x = pi/3 on the basis of the value of f(x) and its derivatives at pi/4 (h = pi/3 - pi/4). The true value is cos(pi/3) = 0.5.

Order n   f^(n)(pi/4)              f(pi/3) estimate   εt
0          cos(x) =  0.70710678    0.70710678         41.4214%
1         -sin(x) = -0.70710678    0.52198666          4.3973%
2         -cos(x) = -0.70710678    0.49775449          0.4491%
3          sin(x) =  0.70710678    0.49986915          0.0262%
4          cos(x) =  0.70710678    0.50000755          0.0015%
5         -sin(x) = -0.70710678    0.50000030          0.0001%
6         -cos(x) = -0.70710678    0.49999999          0.0000%
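The table above can be regenerated with a few lines. A Python sketch, using the fact that the derivatives of cos cycle with period 4:

```python
import math

# Sketch of Example 4.2: expand cos about x0 = pi/4 and evaluate at pi/3.
x0 = math.pi / 4
h = math.pi / 3 - math.pi / 4
true = math.cos(math.pi / 3)                # 0.5

# derivatives of cos at x0 cycle: cos, -sin, -cos, sin
cycle = [math.cos(x0), -math.sin(x0), -math.cos(x0), math.sin(x0)]

est = 0.0
rows = []
for n in range(7):
    est += cycle[n % 4] * h**n / math.factorial(n)
    rows.append((n, est, abs((true - est) / true) * 100))  # (n, estimate, εt%)
```

Each pass through the loop adds one more Taylor term and records the estimate and its true percent error, reproducing the table's columns.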
4.1.1 – The Remainder for the Taylor Series Expansion

Suppose we truncate the Taylor Series expansion after the first term (zero-order approximation):

f(x+h) = f(x) + R0

where the 0th-order approximation error is given by (setting n=0 in the remainder formula):

R0 = ∫ from t=x to x+h of ((x+h-t)^0/0!) f'(t) dt = f'(ξ)h

where ξ is some value between x and x+h.
4.1.1 – The Remainder for the Taylor Series Expansion

R0 is the discrepancy between the zero-order prediction and the exact value:

Figure 4.2 Zero-order Taylor Series Prediction
4.1.1 – The Remainder for the Taylor Series Expansion

By the derivative mean-value theorem, R0 = f'(ξ)h; that is, R0/h is the slope f'(ξ):

Figure 4.3 Derivative Mean-Value Theorem
4.1.2 Using the Taylor Series to Estimate Truncation Error

The remainder Rn is an exact measure of the error of truncating a Taylor Series at term n.
Even though Rn usually cannot be calculated exactly, its order is h^(n+1) [i.e., O(h^(n+1))].
This allows us to estimate the error with good accuracy. In iterative procedures this is the stopping criterion.
Figure 4.4 Increasing non-linearity of a function requires higher and higher order Taylor Series approximations
Figure 4.5 Log-log plot of R1 vs h, showing that as h gets smaller, R1 decreases as h^2, i.e., O(h^2)
4.1.3 Numerical Differentiation

This will be covered in much more detail in Chapters 23 and 24. Here we will introduce the three approximations typically used for a derivative:
  o The forward difference
  o The backward difference
  o The centered difference
We will also look briefly at second derivatives.
4.1.3 Numerical Differentiation

The first forward difference approximation:

f'(xi) = (f(xi+1) - f(xi))/h + O(h) = Δfi/h + O(h)

The first backward difference approximation:

f'(xi) = (f(xi) - f(xi-1))/h + O(h) = ∇fi/h + O(h)
4.1.3 Numerical Differentiation

The first centered difference approximation:

f'(xi) = (f(xi+1) - f(xi-1))/(2h) + O(h^2)

The second derivative approximation:

f''(xi) = (f(xi+1) - 2f(xi) + f(xi-1))/h^2 + O(h^2)
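The three first-derivative formulas can be sketched directly on the text's example polynomial. A Python sketch at x = 0.5 with h = 0.5 (the true value is f'(0.5) = -0.9125):

```python
# Sketch of the three first-derivative estimates for the text's example
# polynomial at x = 0.5 with h = 0.5; true value f'(0.5) = -0.9125.
def f(x):
    return -0.1*x**4 - 0.15*x**3 - 0.5*x**2 - 0.25*x + 1.2

x, h = 0.5, 0.5
forward  = (f(x + h) - f(x)) / h            # O(h)
backward = (f(x) - f(x - h)) / h            # O(h)
centered = (f(x + h) - f(x - h)) / (2 * h)  # O(h^2), closest to the truth
```

The centered difference is the most accurate of the three at the same h, consistent with its O(h^2) truncation error.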
Example 4.4 (Pages 92-93 of the text)

f(x) = Ax^4 + Bx^3 + Cx^2 + Dx + E, with A = -0.1, B = -0.15, C = -0.5, D = -0.25, E = 1.2
True value: f'(0.5) = -0.9125

x      f(x)
0.00   1.20000000
0.25   1.10351563
0.50   0.92500000
0.75   0.63632813
1.00   0.20000000

For h = 0.5:             estimate    |error|
  forward   Δf/h         -1.45       58.90%
  backward  ∇f/h         -0.55       39.73%
  centered               -1          9.59%

For h = 0.25:            estimate    |error|   error reduction
  forward   Δf/h         -1.15469    26.54%    45.06%
  backward  ∇f/h         -0.71406    21.75%    54.74%
  centered               -0.93438    2.40%     25.00%
4.2 – Error Propagation

The purpose of this section is to investigate how errors propagate through mathematical functions. For example, if we multiply two numbers together, each with a known error, what is the error of the product?

We will look at error propagation for:
  o Single-variable functions
  o Multi-variable functions
  o Stability and condition
4.2.1 – Single-Variable Functions

Assume we have a function f(x) that is dependent on a single variable x, and a known value x~ that is an approximation to x. Then the error in using f(x~) rather than f(x) is estimated by:

Δf(x~) = |f'(x~)| Δx~
4.2.1 – Single-Variable Functions

The error in using f(x~) rather than f(x) is estimated by: Δf(x~) = |f'(x~)| Δx~
Example 4.5

Given x~ = 2.5 with Δx~ = 0.01, estimate the resulting error in the function f(x) = x^3.

Solution: We have f'(x) = 3x^2. Hence, f'(x~) = 3(2.5)^2 = 18.75. Therefore:

Δf(x~) = |f'(x~)| Δx~ = 18.75 × 0.01 = 0.1875

Predicted bounds: (2.5)^3 - 0.1875 ≤ x^3 ≤ (2.5)^3 + 0.1875
                  15.4375 ≤ x^3 ≤ 15.8125
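The arithmetic of Example 4.5 can be checked in a few lines. A Python sketch:

```python
# Sketch of Example 4.5: first-order error propagation for f(x) = x^3
# at x~ = 2.5 with uncertainty Δx~ = 0.01.
xt, dx = 2.5, 0.01
df = abs(3 * xt**2) * dx        # |f'(x~)| Δx~ = 0.1875
lo = xt**3 - df                 # 15.4375
hi = xt**3 + df                 # 15.8125
```

Note that the actual values f(2.49) ≈ 15.438 and f(2.51) ≈ 15.813 land essentially on these first-order bounds, since f is smooth and Δx~ is small.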
4.2.2 – Multi-Variable Functions

The concept of 4.2.1 can be extended to multi-variable functions:

Δf(x1~, x2~, ..., xn~) ≅ |∂f/∂x1| Δx1~ + |∂f/∂x2| Δx2~ + ... + |∂f/∂xn| Δxn~

See Example 4.6, pp. 96-97.
4.2.3 – Stability and Condition

The condition of a mathematical problem relates to its sensitivity to changes in its input values. We say that a computation is numerically unstable if the uncertainty of the input values is grossly magnified by the numerical method.

A Condition Number can be defined by:

Condition Number = x~ f'(x~) / f(x~)
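As an illustrative sketch of this definition (f(x) = tan(x) near pi/2 is a classic ill-conditioned case, used here only as an example):

```python
import math

# Illustrative sketch: condition number |x f'(x) / f(x)| for f(x) = tan(x),
# which becomes ill-conditioned as x approaches pi/2.
def cond(x, f, fprime):
    return abs(x * fprime(x) / f(x))

sec2 = lambda x: 1 / math.cos(x)**2            # derivative of tan
well_conditioned = cond(0.5, math.tan, sec2)   # modest: inputs far from pi/2
ill_conditioned  = cond(1.57, math.tan, sec2)  # huge: inputs near pi/2
```

A large condition number means a tiny relative change in x produces an enormous relative change in f(x), regardless of how carefully f is computed.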
4.3 – Total Numerical Error

The Total Numerical Error is the sum of the truncation and round-off (or chopping) errors. Round-off and chopping errors increase with subtractive cancellation and with an increased number of calculations. Truncation errors decrease with smaller step size, but smaller step sizes increase the number of calculations. Hence: there is a trade-off!
Figure 4.8 Trade-off between Truncation and Round-Off errors.
Example 4.8
Figure 4.9 Trade-off between Truncation and Round-Off errors in Example 4.8.
4.4 – Blunders, Formulation Errors and Uncertainty

Programming blunders are often a cause of error that is difficult to eliminate.
Modeling errors (also called formulation errors) are also difficult to eliminate.
Data uncertainty also causes errors, but we do have some measures that can be very helpful in at least accounting for these types of errors.
End of Tonight’s Lecture

Any questions?

Homework for next week: 3.1, 3.3, 3.6 (note: use e-19 rather than e-5), 3.13, 4.1 and 4.23.

Quiz for this week is NOW online – complete before class next week.
Quiz for next week will be online next Monday.