Date post: | 03-Apr-2018 |
Category: |
Documents |
Upload: | mahmoud-el-mahdy |
7/29/2019 Lecture 006
Roots of Equations
1.1 Introduction
The following problem may be used as an introduction to the problem of root finding. An electrical cable
is suspended from two towers that are 50 meters apart. The cable is allowed to dip 10 meters in the
middle. How long is the cable? We know that the curve assumed by a suspended cable is a catenary
(see Figure 0.0).
Figure 0.0. Cable suspended between two towers (left and right in the figure); horizontal axis x, vertical axis y.
When the y-axis passes through the lowest point, we can assume an equation of the form
$y = k \cosh(x/k)$. Here k is a parameter to be determined. The conditions of the problem are that $y(25) = y(0) + 10$. Hence
$$k \cosh\left(\frac{25}{k}\right) = k + 10.$$
From this equation, k can be determined by the methods discussed in this chapter. The result is
k = 32.79. The question now is how can we find this value and what are the procedures to calculate it.
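As a rough preview of the methods in this chapter, the catenary condition can be solved numerically. The following is a minimal sketch in Python (not the Mathematica used in these notes); the function names, the starting bracket [20, 50], and the tolerance are assumptions made for illustration. It also evaluates the cable length from the standard arc-length formula for a catenary, $2k \sinh(25/k)$.

```python
import math

def catenary_residual(k):
    """g(k) = k*cosh(25/k) - k - 10; a root of g is the catenary parameter."""
    return k * math.cosh(25.0 / k) - k - 10.0

def solve_k(lo=20.0, hi=50.0, tol=1e-8):
    """Bisection on [lo, hi]; the bracket is an assumption and is checked."""
    g_lo = catenary_residual(lo)
    if g_lo * catenary_residual(hi) > 0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = catenary_residual(mid)
        if g_lo * g_mid <= 0:
            hi = mid                  # sign change is in [lo, mid]
        else:
            lo, g_lo = mid, g_mid     # sign change is in [mid, hi]
    return 0.5 * (lo + hi)

k = solve_k()
# Arc length of y = k*cosh(x/k) between x = -25 and x = 25.
cable_length = 2.0 * k * math.sinh(25.0 / k)
print(f"k = {k:.2f}, cable length = {cable_length:.2f} m")
```

The computed k agrees with the value quoted above to the digits shown there.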
Another example of this kind of problem is the following missile-intercept problem. The movement of
an object in the xy-plane is described by the parametrized equations
$$x_1(t) = t \quad \text{and} \quad y_1(t) = 1 - e^{-t}.$$
A second object moves according to the equations
$$x_2(t) = 1 - t \cos a \quad \text{and} \quad y_2(t) = t \sin a - 0.1\, t^2.$$
Is it possible to choose a value for a so that both objects will be in the same place at some time?
2012 G. Baumann
When we set the x and y coordinates equal to each other, we get the system
$$t = 1 - t \cos a \quad \text{and} \quad 1 - e^{-t} = t \sin a - 0.1\, t^2$$
that needs to be solved for the unknowns a and t. If real values exist for these unknowns that satisfy the
two equations, both objects will be in the same place at some time t. But even though the problem is a rather simple one that yields a small system, there is no obvious way to get the answer, or even to see
if there is a solution. However, if we graphically represent the two curves, we observe that there is an
intersection, which means that a solution exists (see Figure 0.0).
Figure 0.0. Two objects are crossing on a common point.
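One way to check numerically that an intercept exists is sketched below in Python (not the Mathematica of these notes). It assumes the system has the form $t \cos a = 1 - t$ and $t \sin a = 1 - e^{-t} + 0.1 t^2$; squaring and adding eliminates a and leaves a single equation h(t) = 0. The elimination step, the helper names, and the bracket [0.5, 1] are choices made for this illustration and are not part of the notes.

```python
import math

def h(t):
    """Residual after eliminating a: (t cos a)^2 + (t sin a)^2 - t^2."""
    t_sin_a = 1.0 - math.exp(-t) + 0.1 * t * t
    t_cos_a = 1.0 - t
    return t_cos_a ** 2 + t_sin_a ** 2 - t * t

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection; requires g(lo) and g(hi) to differ in sign."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

t_hit = bisect(h, 0.5, 1.0)
# Recover the angle from its sine and cosine (both scaled by t_hit > 0).
a_hit = math.atan2(1.0 - math.exp(-t_hit) + 0.1 * t_hit ** 2, 1.0 - t_hit)

# Residuals of the two original equations; both should be close to zero.
res_x = t_hit - (1.0 - t_hit * math.cos(a_hit))
res_y = (1.0 - math.exp(-t_hit)) - (t_hit * math.sin(a_hit) - 0.1 * t_hit ** 2)
print(t_hit, a_hit, res_x, res_y)
```

The reduction to one unknown works only because of the special structure of this system; in general a system of nonlinear equations cannot be collapsed this way.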
The numerical solution of a system of nonlinear equations is one of the more challenging tasks in
numerical analysis and, as we will see, no completely satisfactory method exists for it. To understand the
difficulties, we start with what at first seems to be a rather easy problem, the solution of a single equation
in one variable
f(x) = 0.
The values of x that satisfy this equation are called the zeros or roots of the function f. In what is to follow
we will assume that f is continuous and sufficiently differentiable where needed.
1.2 Simple Root Finding Methods
To find the roots of a function of one variable is straightforward enough: just plot the function and see
where it crosses the x-axis. The simplest methods are in fact little more than that and only carry out this
suggestion in a systematic and efficient way. Relying on the intuitive insight of the graph of the function f,
we can discover many different and apparently viable methods for finding the roots of a function of one
variable.
Suppose we have two values a and b such that f(a) and f(b) have opposite signs. Then, because it is assumed that f is continuous, we know that there is a root somewhere in the interval [a, b]. To localize it, we take the midpoint c = (a + b)/2 of the interval and compute f(c). Depending on the sign of f(c), we can then place the root in one of the two intervals [a, c] or [c, b]. We can repeat this procedure until the region in which the root is known to be located is sufficiently small. The algorithm is known as the bisection method. This method is based on the intermediate-value theorem, which is illustrated in Figure 0.0. In
this figure the graph of a function that is continuous on the closed interval [a, b] is shown. The figure suggests that if we draw any horizontal line y = k, where k is between f(a) and f(b), then that line will cross the curve y = f(x) at least once over the interval [a, b].
Figure 0.0. Graph of a function with continuous behavior in the interval [a, b].
Stated in numerical terms, if f is continuous on [a, b], then the function f must take on every value k between f(a) and f(b) at least once as x varies from a to b. For example, the polynomial
p[x_] := x^7 - x + 3
has a value of 3 at x = 1 and a value of 129 at x = 2. Thus it follows from the continuity of p that the
equation 3 - x + x^7 == k has at least one solution in the interval [1, 2] for every value of k between 3 and 129. This idea is stated more precisely in the following theorem.
Theorem 0.0. Intermediate-Value Theorem
If f is continuous on a closed interval [a, b] and k is any number between f(a) and f(b), inclusive, then there is at least one number x in the interval [a, b] such that f(x) = k.
Although this theorem is intuitively obvious, its proof depends on a mathematically precise development
of the real number system, which is beyond the scope of this text.
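The statement of the theorem can be illustrated numerically for the polynomial $x^7 - x + 3$ used above. The following is a minimal sketch in Python (not the Mathematica of these notes); the function name solve_level, its defaults, and the sample values of k are assumptions made for illustration. For a given k between p(1) = 3 and p(2) = 129 it locates an x in [1, 2] with p(x) = k by repeatedly halving the interval.

```python
def p(x):
    return x ** 7 - x + 3

def solve_level(k, a=1.0, b=2.0, tol=1e-10):
    """Find x in [a, b] with p(x) = k, assuming p(a) <= k <= p(b)."""
    if not (p(a) <= k <= p(b)):
        raise ValueError("k must lie between p(a) and p(b)")
    # Invariant: p(a) <= k <= p(b), so a solution stays inside [a, b].
    while b - a > tol:
        c = 0.5 * (a + b)
        if p(c) < k:
            a = c
        else:
            b = c
    return 0.5 * (a + b)

for k in (3, 50, 129):
    x = solve_level(k)
    print(k, x, p(x))
```

The loop never needs p to be monotone; it only relies on the sign information that the Intermediate-Value Theorem uses.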
A variety of problems can be reduced to solving an equation f(x) = 0 for its roots. Sometimes it is possible to solve for the roots exactly using algebra, but often this is not possible and one must settle for
decimal approximations of the roots. One procedure for approximating roots is based on the following
consequences of the Intermediate-Value Theorem.
Theorem 0.0. Root Approximation
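If f is continuous on [a, b], and if f(a) and f(b) are nonzero and have opposite signs, then there is at least
one solution of the equation f(x) = 0 in the interval (a, b).
This result, which is illustrated in Figure 0.0, can be proved as follows. Because f(a) and f(b) have opposite signs, 0 lies between f(a) and f(b). By the Intermediate-Value Theorem there is therefore at least one x in [a, b] with f(x) = 0, and since f(a) and f(b) are nonzero, this x lies strictly between a and b.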
Figure 0.0. Graph of y = f(x) with f(a) > 0 and f(b) < 0, crossing the x-axis at a point where f(x) = 0.
Figure 0.0. Graph of a function f(x) allowing a root in the interval [a, b], plotted here for x in [-2.2, -1.9].
The polynomial is defined by
p[x_] := x^5 + 8 x^2 - x + 1
The following sequence of intervals shrinks the interval length in such a way that the conditions of Theorem 0.0 are satisfied. The intervals are given in curly braces in the second argument of Map[].
Map[{p[#[[1]]], p[#[[2]]]} &, {{-2.1, -2}, {-2.1, -1.8}, {-2.06, -2.05}, {-2.059, -2.056}, {-2.0585, -2.058}}] //
  TableForm[#, TableHeadings -> {{}, {"p(a)", "p(b)"}}] &
p(a)           p(b)
-2.46101       3
-2.46101       9.82432
-0.0879704     0.464937
-0.031969      0.135084
-0.00402787    0.0238737
Each row lists the values of p at the left and right endpoint of one interval. The table shows that every interval is chosen in such a way that the sign of the polynomial p(x) changes between its endpoints. A more accurate numerical value of the root can be determined by
FindRoot[p[x] == 0, {x, -3.1}]
{x -> -2.05843}
This states that the real root x ≈ -2.0584 is the point where the graph of the polynomial p(x) intersects the x-axis.
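The sign check on the shrinking intervals can be reproduced with a few lines of code. The sketch below uses Python rather than the Mathematica of these notes and assumes the polynomial $p(x) = x^5 + 8x^2 - x + 1$ and the five intervals listed above; the variable names are illustrative.

```python
def p(x):
    return x ** 5 + 8 * x ** 2 - x + 1

intervals = [(-2.1, -2.0), (-2.1, -1.8), (-2.06, -2.05),
             (-2.059, -2.056), (-2.0585, -2.058)]

# For each interval store the endpoint values of p.
rows = [(a, b, p(a), p(b)) for a, b in intervals]

for a, b, pa, pb in rows:
    # A negative product means p changes sign, so [a, b] contains a root.
    print(f"[{a}, {b}]  p(a) = {pa:.6g}  p(b) = {pb:.6g}  sign change: {pa * pb < 0}")
```

Every interval reports a sign change, so each of them satisfies the conditions of the root approximation theorem.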
1.2.1 The Bisection Method
The bisection method is very simple and intuitive, but has all the major characteristics of other root-finding methods. The simplest numerical procedure for finding a root is to repeatedly halve the interval
[a, b], keeping the half on which f(x) changes sign. This procedure is called the bisection method. It is guaranteed to converge to a root.
The bisection method uses the idea introduced above: the sign of the product of the function values
at two different locations distinguishes three cases. If the product is positive, the two values have the same sign
and the interval is not guaranteed to contain a root; if the product is negative, the two values differ in sign and the interval contains a root; if
the product is zero, one of the two locations is itself a root. In general the following steps are used:
Step 1: Choose the lower and upper boundary $x_l$ and $x_u$ of an interval including the root. This means $f(x_u) f(x_l) < 0$.
Step 2: Estimate the root by the arithmetic mean $x_r = (x_l + x_u)/2$.
Step 3: Make the following calculations to determine in which subinterval the root lies:
If $f(x_l) f(x_r) < 0$, the root lies in the lower interval. Therefore, set $x_u = x_r$ and return to step 2.
If $f(x_l) f(x_r) > 0$, the root lies in the upper interval. Therefore, set $x_l = x_r$ and return to step 2.
If $f(x_l) f(x_r) = 0$, the root equals $x_r$; terminate the computation.
To be more precise in our definition, suppose that we are given an interval [a, b] satisfying f(a) f(b) < 0 and an error tolerance ε > 0. Then the bisection method consists of the following steps:
1. Define c = (a + b)/2.
2. If b - c ≤ ε, then accept c as the root and stop.
3. If sign(f(b)) · sign(f(c)) ≤ 0, then set a = c. Otherwise, set b = c. Return to step 1.
These algorithmic steps are implemented in the following lines
bisection[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, m = 1, results = {}},
  While[0 == 0,
   (* first step: find the midpoint *)
   c = (ain + bin)/2;
   (* second step: select the root *)
   If[bin - c <= ε, Return[results]];
   AppendTo[results, {m, c}];
   (* third step: select the interval *)
   If[(f /. x -> bin) (f /. x -> c) <= 0, ain = N[c], bin = N[c]];
   m = m + 1]]
The application of the function to a polynomial shows us the following results
bisection[x^6 - x - 1, 1, 1.4] // TableForm[#, TableHeadings -> {{}, {"m", "c"}}] &
m     c
1     1.2
2     1.1
3     1.15
4     1.125
5     1.1375
6     1.13125
7     1.13438
8     1.13594
9     1.13516
10    1.13477
11    1.13457
12    1.13467
13    1.13472
14    1.13474
15    1.13473
where m is the iteration step and c represents the approximation of the root at iteration step m. The
graphical representation of the function shows that there is in fact an intersection with the x-axis.
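For readers without Mathematica, the same three steps can be sketched in Python. This is an illustrative translation, not the implementation of these notes; the function name, the default tolerance $10^{-5}$, and the explicit bracket check are assumptions. It follows the same stopping rule (stop when b - c ≤ ε, before storing c) and the same interval update.

```python
def bisection(f, a, b, eps=1e-5):
    """Return the list of (m, c) pairs produced by the bisection method."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    results = []
    m = 1
    while True:
        c = 0.5 * (a + b)            # step 1: midpoint
        if b - c <= eps:             # step 2: interval small enough, stop
            return results
        results.append((m, c))
        if f(b) * f(c) <= 0:         # step 3: root lies in [c, b]
            a = c
        else:                        # otherwise root lies in [a, c]
            b = c
        m += 1

steps = bisection(lambda x: x ** 6 - x - 1, 1.0, 1.4)
for m, c in steps:
    print(m, round(c, 5))
```

Applied to $x^6 - x - 1$ on [1, 1.4], the sketch stores the same number of midpoints as the table above.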
Plot[x^6 - x - 1, {x, 1, 1.4}]
Figure 0.0. Graph of the function f(x) = x^6 - x - 1 allowing a root in the interval [1, 1.3].
In general, an iteration produces a sequence of approximate solutions; we will denote these iterates by
$x_0, x_1, x_2, \ldots$. This sequence of approximations is illustrated in the following Figure 0.0.
Figure 0.0. Sequence of approximations of the root for the function f(x) = x^6 - x - 1.
The difference between the various root-finding methods lies in what is computed at each step and how
the next iterate is chosen.
Figure 0.0. The bisection method. After three steps the root is known to lie in the interval $[x_3, x_4]$.
To estimate the error bound of the bisection method we can proceed as follows. Let $a_n$, $b_n$ and $c_n$ denote
the nth computed value of a, b, and c, respectively. Then we easily get
$$b_{n+1} - a_{n+1} = \frac{1}{2}(b_n - a_n) \quad \text{for } n \ge 1$$
and it is straightforward to deduce that
$$b_n - a_n = \frac{1}{2^{n-1}}(b - a) \quad \text{for } n \ge 1$$
where b - a denotes the length of the original interval with which we started. Since the root $\alpha$ is in either
the interval $[a_n, c_n]$ or $[c_n, b_n]$, we know that
$$|\alpha - c_n| \le c_n - a_n = b_n - c_n = \frac{1}{2}(b_n - a_n).$$
This is the error bound for $c_n$ that is used in the second step of the bisection algorithm. Combining it with
our estimation, we obtain the further bound
$$|\alpha - c_n| \le \frac{1}{2^n}(b - a).$$
This shows that the iterates $c_n$ converge to $\alpha$ as $n \to \infty$.
To see how many iterations will be necessary, suppose we want to have
$$|\alpha - c_n| \le \varepsilon.$$
This will be satisfied if
$$\frac{1}{2^n}(b - a) \le \varepsilon.$$
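Taking logarithms of both sides, we can solve this to give
$$n \ge \frac{\log\left(\frac{b - a}{\varepsilon}\right)}{\log 2}.$$
For the example we discussed above, with b - a = 1.4 - 1 = 0.4, the number of iterations for an accuracy of $10^{-5}$ follows from
$$n \ge \frac{\log\left(\frac{0.4}{0.00001}\right)}{\log 2} = 15.2877.$$
Thus we need n = 16 iterations, which is in agreement with the calculation: the table lists 15 midpoints, and the stopping test is met at the 16th.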
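The iteration count follows directly from the bound $(b - a)/2^n \le \varepsilon$. The short Python sketch below (Python rather than the Mathematica of these notes; the helper name bisection_steps is made up for this illustration) returns the smallest integer n satisfying that bound.

```python
import math

def bisection_steps(a, b, eps):
    """Smallest integer n with (b - a) / 2**n <= eps."""
    return math.ceil(math.log2((b - a) / eps))

# Interval [1, 1.4] used for x^6 - x - 1, tolerance 1e-5.
n_first = bisection_steps(1.0, 1.4, 1e-5)
# A unit interval such as [0, 1] with the same tolerance.
n_unit = bisection_steps(0.0, 1.0, 1e-5)
print(n_first, n_unit)
```

Because the bound depends only on the interval length and the tolerance, the count can be determined before any function evaluation is made.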
There are several advantages to the bisection method. The principal one is that the method is guaranteed to converge. In addition, the error bound given above is guaranteed to decrease by one-half with each
iteration. Many other numerical methods have variable rates of decrease for the error, and these may be
worse than the bisection method for some equations. The principal disadvantage of the bisection method is that it generally converges more slowly than most other methods. For functions f(x) that have a continuous derivative, other methods are usually faster. These methods may not always converge; when they do
converge, however, they are almost always much faster than the bisection method.
1.2.2 Method of False Position
Suppose we have two iterates $x_0$ and $x_1$ that enclose the root. We can then approximate f(x) by a straight line in the interval and find the place where this line cuts the x-axis. We take this as the new
iterate
$$x_2 = x_1 - \frac{(x_1 - x_0)\, f(x_1)}{f(x_1) - f(x_0)}.$$
When this process is repeated, we have to decide which of the three points $x_0$, $x_1$, or $x_2$ to select for
starting the next iteration. There are two plausible choices. In the first, we retain the last iterate and one
point from the previous ones so that the two new points enclose the solution (Figure 0.0). This is the
method of false position.
Figure 0.0. The method of false position. After the second iteration, the root is known to lie in the interval $[x_3, x_0]$.
The formula for the false position algorithm is based on the similarity of the two triangles involved in the
iteration. Using the triangles generated by the straight line connecting the upper and lower value of the
function in the interval $[x_n, x_{n-1}]$, we can write down the relation
$$\frac{f(x_n)}{x_{n+1} - x_n} = \frac{f(x_{n-1})}{x_{n+1} - x_{n-1}}.$$
This equation is equivalent to
$$(x_{n+1} - x_{n-1})\, f(x_n) = f(x_{n-1})\, (x_{n+1} - x_n)$$
which is written by collecting terms as
$$x_{n+1} \left( f(x_n) - f(x_{n-1}) \right) = x_{n-1} f(x_n) - x_n f(x_{n-1})$$
which is equivalent to
$$x_{n+1} = \frac{x_{n-1} f(x_n)}{f(x_n) - f(x_{n-1})} - \frac{x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})}.$$
If we add and subtract $x_n$ on the right-hand side, we find
$$\begin{aligned}
x_{n+1} &= x_n + \frac{x_{n-1} f(x_n)}{f(x_n) - f(x_{n-1})} - x_n - \frac{x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})} \\
&= x_n + \frac{x_{n-1} f(x_n)}{f(x_n) - f(x_{n-1})} + \frac{-x_n f(x_n) + x_n f(x_{n-1}) - x_n f(x_{n-1})}{f(x_n) - f(x_{n-1})} \\
&= x_n + \frac{x_{n-1} f(x_n)}{f(x_n) - f(x_{n-1})} + \frac{-x_n f(x_n)}{f(x_n) - f(x_{n-1})}
\end{aligned}$$
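$$= x_n + \frac{(x_{n-1} - x_n)\, f(x_n)}{f(x_n) - f(x_{n-1})} = x_n - \frac{(x_n - x_{n-1})\, f(x_n)}{f(x_n) - f(x_{n-1})}.$$
The successive iterates of the false position method are then simply computed by
$$x_{n+1} = x_n - \frac{(x_n - x_{n-1})\, f(x_n)}{f(x_n) - f(x_{n-1})}.$$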
We use this form because it involves one less function evaluation and one less multiplication than the
original relation (0.0) we started from.
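The iteration formula can be combined with a bracketing update as in the following Python sketch (Python rather than the Mathematica of these notes). It is the classical variant in which the two retained points always enclose the root; the function name, the iteration cap, and the stopping test on successive iterates are assumptions for illustration, and its iterates need not coincide with those printed by any other listing in these notes.

```python
def false_position(f, a, b, eps=1e-5, max_iter=200):
    """Classical false position: [a, b] always brackets the root."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    iterates = []
    c_old = b
    for _ in range(max_iter):
        # Root of the straight line through (a, fa) and (b, fb).
        c = b - fb * (b - a) / (fb - fa)
        iterates.append(c)
        if abs(c - c_old) < eps:     # heuristic test on successive iterates
            break
        c_old = c
        fc = f(c)
        if fa * fc <= 0:             # sign change in [a, c]: replace b
            b, fb = c, fc
        else:                        # sign change in [c, b]: replace a
            a, fa = c, fc
    return iterates

approx = false_position(lambda x: x ** 6 - x - 1, 1.0, 2.0)
print(len(approx), approx[-1])
```

For $x^6 - x - 1$ on [1, 2] the endpoint 2 is never replaced, so convergence is only linear, and the difference of successive iterates underestimates the true error; this is why the test is labeled a heuristic.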
The algorithm for the false position method consists of three steps:
1. Generate the approximated root c by the derived iteration formula.
2. Check if the error requirements are satisfied; if yes, stop and return the value.
3. If sign(f(a)) · sign(f(c)) ≤ 0, the root lies in [a, c]: set b = c. Otherwise, set a = c. Return to step 1.
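The following lines are an implementation of the false position method
falsePositionMethod[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, cold = b, k = 0, results = {}},
  While[0 == 0,
   k = k + 1;
   (* first step: find the approximation *)
   c = bin - (f /. x -> bin) (bin - ain)/((f /. x -> bin) - (f /. x -> ain));
   (* second step: select the root and terminate *)
   If[Abs[cold - c] < ε, Return[results], cold = c];
   AppendTo[results, {k, c}];
   (* third step: select the interval *)
   (* caution: as listed, the case f(ain) f(c) <= 0 replaces ain, so the pair (ain, bin) *)
   (* need not enclose the root; a bracketing update would replace bin in this case *)
   If[(f /. x -> ain) (f /. x -> c) <= 0, ain = N[c], bin = N[c]]]]
The application of this function shows the iteration steps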
falsePositionMethod[x^6 - x - 1, 1, 2] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &
k     c
1     63/62
2     1.19058
3     1.11766
4     1.14056
5     1.1328
6     1.13537
7     1.13451
8     1.1348
9     1.1347
10    1.13473
11    1.13472
The same example was used previously as an example for the bisection method. The results are given
above. The last iterate equals the root $\alpha$
rounded to 5 significant digits. The false position method converges only a little bit faster than the bisection method. But as the iterates become closer to $\alpha$, the speed
of convergence increases.
1.2.3 Secant Method
The secant method and the false position method are known as straight-line approximations to the given
function y = f(x). Assume that two initial guesses to the root $\alpha$ are known and denoted by $x_0$ and $x_1$. They may occur on opposite sides of $\alpha$ or on the same side of $\alpha$. The two points $(x_0, f(x_0))$ and $(x_1, f(x_1))$, on the graph of y = f(x), determine a straight line, called a secant line. This line is an approximation to the graph of y = f(x), and its root $x_2$ is an approximation of $\alpha$ (see Figure 0.0).
To derive a formula for $x_2$, we proceed in a manner similar to that used to derive the false position
formulas: find the equation of the line and then find its root $x_2$. The equation of the line is given by
$$y = p(x) = f(x_1) + (x - x_1)\, \frac{f(x_1) - f(x_0)}{x_1 - x_0}.$$
Solving $p(x_2) = 0$, we obtain
$$x_2 = x_1 - f(x_1)\, \frac{x_1 - x_0}{f(x_1) - f(x_0)}.$$
Having found $x_2$, we can drop $x_0$ and use $x_1$, $x_2$ as a new set of approximate values for $\alpha$. This leads to an
improved value $x_3$; and this process can be continued indefinitely.
Doing so, we obtain the general iteration formula
$$x_{n+1} = x_n - f(x_n)\, \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \quad \text{for } n \ge 1.$$
This is the secant method. It is called a two-point method, since two approximate values are needed to
obtain an improved value. The bisection method is also a two-point method, but the secant method will
almost always converge faster than bisection.
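The general iteration formula translates almost line by line into code. The following Python sketch (Python rather than the Mathematica of these notes) is one possible arrangement; the function name, the iteration cap, the guard against a horizontal secant line, and the choice to store the final iterate before stopping are assumptions made for illustration.

```python
def secant(f, x0, x1, eps=1e-5, max_iter=50):
    """Secant iteration; returns the list of new iterates x2, x3, ..."""
    iterates = []
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:
            break                     # horizontal secant line: no new iterate
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        iterates.append(x2)
        if abs(x2 - x1) < eps:        # successive iterates agree: stop
            break
        x0, x1 = x1, x2               # drop the oldest point
    return iterates

xs = secant(lambda x: x ** 6 - x - 1, 1.0, 2.0)
for n, x in enumerate(xs, start=2):
    print(n, x)
```

Unlike a bracketing method, nothing in this loop keeps the root between the two retained points, which is the price paid for the faster convergence.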
Figure 0.0 illustrates how the secant method works and shows the difference between it and the method
of false position. From this example we can see that now the successive iterates are no longer guaranteed to enclose the root.
Figure 0.0. The secant method.
The algorithm for the secant method consists of three steps:
1. Generate the approximated root c by the derived iteration formula.
2. Change the boundary values: a = b and b = c.
3. Check if the error requirements are satisfied; if yes, stop and return the value; if not, return to step 1.
The following lines are an implementation of the secant method
secantMethod[f_, a_, b_] := Block[{c, ε = 10^-5, ain = a, bin = b, cold = b, k = 0, results = {}},
  While[0 == 0,
   k = k + 1;
   (* first step: find the approximation *)
   c = bin - (f /. x -> bin) (bin - ain)/((f /. x -> bin) - (f /. x -> ain));
   (* second step: shift the two points *)
   ain = N[bin];
   bin = N[c];
   (* third step: check the error and terminate *)
   If[Abs[cold - c] < ε, Return[results], cold = c];
   AppendTo[results, {k, c}]]]
The application of the secant method shows the iteration steps
secantMethod[x^6 - x - 1, 1, 2] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &
k    c
1    63/62
2    1.03067
3    1.17569
4    1.12368
5    1.13367
6    1.13475
7    1.13472
The same example was used previously as an example for both the bisection and false position method.
The results are given in the table above. The last iterate equals the root $\alpha$ rounded to 5 significant
digits. In contrast to the bisection method, the secant method converges very rapidly. As the iterates
become closer to $\alpha$, the speed of convergence increases, so that fewer steps are needed.
Example 0.0. Secant and False Position Method
The function
f[x_] := x^2 Exp[x] - 1
has a root in the interval [0, 1] since f(0) f(1) < 0.
Solution 0.2. The results for all three methods discussed so far, the bisection, the false position, and
the secant methods, are demonstrated in the following. The function has a root near x ≈ 0.7 as shown in the
following Figure 0.0.
Figure 0.0. Graph of the function f(x) = x^2 e^x - 1 for x ∈ [0, 1].
All methods start with two points x0 = 0 and x1 = 1. The following tables show the steps needed to derive
the root.
First the bisection method is applied to the problem
bisection[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &
k     c
1     1/2
2     0.75
3     0.625
4     0.6875
5     0.71875
6     0.703125
7     0.710938
8     0.707031
9     0.705078
10    0.704102
11    0.703613
12    0.703369
13    0.703491
14    0.70343
15    0.703461
16    0.703476
Next we use the false position method
falsePositionMethod[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &
k     c
1     1 - (-1 + e)/e  (≈ 0.367879)
2     1.8816
3     0.420725
4     0.941745
5     0.589956
6     0.78112
7     0.660269
8     0.730769
9     0.68747
10    0.713283
11    0.697609
12    0.707023
13    0.701331
14    0.704759
15    0.70269
16    0.703937
17    0.703184
18    0.703638
19    0.703364
20    0.70353
21    0.70343
22    0.70349
23    0.703454
24    0.703476
25    0.703462
Finally the secant method is used.
secantMethod[f[x], 0, 1] // TableForm[#, TableHeadings -> {{}, {"k", "c"}}] &
k    c
1    1 - (-1 + e)/e  (≈ 0.367879)
2    0.569456
3    0.797357
4    0.685539
5    0.701245
6    0.703524
7    0.703467
The results for the different methods show that the bisection method needs the expected number of
iterations. However, the false position method needs more steps than expected. If we look at the results
generated during the iteration, we observe that the root is approached, but the iterates oscillate around the root,
which makes the convergence indirect. The second iterate, 1.8816, even lies outside the starting interval [0, 1], which shows that the pair of points retained by the implementation used here does not always enclose the root. In contrast to the false position method, the secant method converges quite fast to the true root and shows no prolonged oscillations.
By using techniques from calculus and some algebraic manipulation, it is possible to show that the
iterates $x_n$ satisfy
$$\alpha - x_{n+1} = (\alpha - x_n)(\alpha - x_{n-1}) \left[ \frac{-f''(\xi_n)}{2 f'(\zeta_n)} \right].$$
The unknown number $\zeta_n$ is between $x_n$ and $x_{n-1}$, and the unknown number $\xi_n$ is between the largest and
the smallest of the numbers $\alpha$, $x_n$, and $x_{n-1}$. The error formula closely resembles the Newton error
formula which is discussed in the next section. This kind of formula should be expected, since the secant
method can be considered as an approximation of Newton's method, based on the difference quotient
$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.$$
Check as an exercise that the use of this expression in Newton's formula (0.XXX) will yield (see next
subsection)
$$x_{n+1} = x_n - f(x_n)\, \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \quad \text{for } n \ge 1.$$
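The error formula can be checked numerically. The Python sketch below (Python rather than the Mathematica of these notes) assumes the test function $f(x) = x^6 - x - 1$ with starting values 1 and 2, computes a reference root by Newton's method, and compares the ratio $(\alpha - x_{n+1}) / ((\alpha - x_n)(\alpha - x_{n-1}))$ with the limiting value $-f''(\alpha) / (2 f'(\alpha))$ that the bracketed factor in the formula approaches as the iterates converge. The variable names and the fixed count of eight secant steps are illustrative choices; more steps would push the errors below machine precision and make the ratio meaningless.

```python
def f(x):
    return x ** 6 - x - 1

def df(x):
    return 6 * x ** 5 - 1

def d2f(x):
    return 30 * x ** 4

# Reference root from Newton's method, iterated well past convergence.
alpha = 1.2
for _ in range(50):
    alpha -= f(alpha) / df(alpha)

# Value the bracketed factor approaches as x_n tends to alpha.
limit = -d2f(alpha) / (2 * df(alpha))

x_prev, x_curr = 1.0, 2.0
ratios = []
for _ in range(8):
    x_next = x_curr - f(x_curr) * (x_curr - x_prev) / (f(x_curr) - f(x_prev))
    ratios.append((alpha - x_next) / ((alpha - x_curr) * (alpha - x_prev)))
    x_prev, x_curr = x_curr, x_next

print(limit)
print(ratios)
```

The later ratios settle close to the limiting value, which is consistent with the product form of the error.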