Nonlinear Equations of One Variable I
Solve $f(x) = 0$, $x \in R^1$
Finding a root
The difference: $f$ is nonlinear, not linear
Bisection method: Given $[a_i, b_i]$, compute
$$p_i = \frac{a_i + b_i}{2}$$
If $f(p_i) = 0$, stop
Chih-Jen Lin (National Taiwan Univ.) 1 / 54
Nonlinear Equations of One Variable II
Set
$$a_{i+1} = a_i \text{ and } b_{i+1} = p_i, \quad \text{if } f(a_i) f(p_i) < 0,$$
and
$$a_{i+1} = p_i \text{ and } b_{i+1} = b_i, \quad \text{otherwise}$$
Example:
Nonlinear Equations of One Variable III
>> bisect21
This is the Bisection Method.
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'
Input endpoints A < B on separate lines
0
1
F(A) and F(B) have same sign
Input endpoints A < B on separate lines
0
0.5
F(A) and F(B) have same sign
Input endpoints A < B on separate lines
0
2
Nonlinear Equations of One Variable IV
Input tolerance
0.001
Input maximum number of iterations - no decimal point
50
Select output destination
1. Screen
2. Text file
Enter 1 or 2
1
Select amount of output
1. Answer only
2. All intermediate approximations
Enter 1 or 2
2
Bisection Method
I    P                 F(P)
Nonlinear Equations of One Variable V
 1   1.00000000e+00    5.4030231e-01
 2   1.50000000e+00    7.0737202e-02
 3   1.75000000e+00   -1.7824606e-01
 4   1.62500000e+00   -5.4177135e-02
 5   1.56250000e+00    8.2962316e-03
 6   1.59375000e+00   -2.2951658e-02
 7   1.57812500e+00   -7.3286076e-03
 8   1.57031250e+00    4.8382678e-04
 9   1.57421875e+00   -3.4224165e-03
10   1.57226562e+00   -1.4692977e-03
11   1.57128906e+00   -4.9273569e-04
Approximate solution P = 1.57128906
with F(P) = -0.00049274
Number of iterations = 11
Tolerance = 1.00000000e-03
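The bisection steps can be sketched in Python as follows. This is only a minimal illustration, not the program used for the session above: the function name `bisection`, the parameters `tol` and `max_iter`, and the stopping rule (stop when half the interval width drops below `tol`) are assumptions made here.

```python
import math


def bisection(f, a, b, tol=1e-3, max_iter=50):
    """Bisection on [a, b]; returns (approximate root, iterations used).

    Assumed stopping rule: f(p) == 0 or (b - a) / 2 < tol.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) have the same sign")
    for i in range(1, max_iter + 1):
        p = (a + b) / 2          # midpoint p_i
        fp = f(p)
        if fp == 0 or (b - a) / 2 < tol:
            return p, i
        if fa * fp < 0:          # sign change in [a, p]: keep a, set b = p
            b = p
        else:                    # otherwise the root is in [p, b]
            a, fa = p, fp
    raise RuntimeError("maximum number of iterations reached")


p, iterations = bisection(math.cos, 0.0, 2.0)
print(p, iterations)
```

With cos(x) on [0, 2] and tolerance 0.001, this sketch stops at the same midpoint and iteration count as the session shown above. Calling it on [0, 1] raises the same-sign error, as in the session.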
Newton’s method I
[Figure: plot of f(x) versus x]
Newton’s method II
Solve $f(x) = 0$
Finding the tangent line at $x_k$:
$$\frac{y - f(x_k)}{x - x_k} = f'(x_k)$$
$x_k$: the current iterate
Let $y = 0$:
$$x_{k+1} = x_k - f(x_k)/f'(x_k)$$
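The update $x_{k+1} = x_k - f(x_k)/f'(x_k)$ can be sketched in Python as below. The function name `newton`, its parameters, and the stopping rule (stop when two successive iterates differ by less than `tol`) are assumptions for illustration only.

```python
import math


def newton(f, fprime, x0, tol=1e-3, max_iter=50):
    """Newton's method for f(x) = 0; returns (approximate root, iterations).

    Assumed stopping rule: |x_{k+1} - x_k| < tol.
    A zero derivative raises ZeroDivisionError.
    """
    x = x0
    for k in range(1, max_iter + 1):
        x_new = x - f(x) / fprime(x)   # root of the tangent line at x
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("maximum number of iterations reached")


root, iterations = newton(math.cos, lambda x: -math.sin(x), 1.0)
print(root, iterations)
```

Starting from 1.0 with f(x) = cos(x), the sketch reaches a value close to pi/2 in a few iterations. A different starting point can lead to a different root of cos(x).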
Newton’s method III
The original idea of Newton’s method:
$$\min_x f(x)$$
Equivalent to $f'(x) = 0$
$$f(x + d) = f(x) + f'(x)d + \frac{1}{2}d^2 f''(x) + \cdots \approx f(x) + f'(x)d + \frac{1}{2}d^2 f''(x)$$
Newton’s method IV
Second-order approximation
$$\min_d \; f(x) + f'(x)d + \frac{1}{2}d^2 f''(x)$$
Then
$$f''(x)d + f'(x) = 0$$
$$d = -\frac{f'(x)}{f''(x)}$$
Taking $x = x_k$:
$$x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}.$$
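The minimization form of the update, $x_{k+1} = x_k - f'(x_k)/f''(x_k)$, can be sketched in the same way. The test function $f(x) = e^x - 2x$ (with minimizer $\ln 2$), the name `newton_minimize`, and the stopping rule are illustrative assumptions, not taken from the slides.

```python
import math


def newton_minimize(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton's method applied to f'(x) = 0; returns a stationary point.

    df and d2f are the first and second derivatives of f.
    Assumed stopping rule: |d| < tol for the step d = -f'(x) / f''(x).
    """
    x = x0
    for _ in range(max_iter):
        d = -df(x) / d2f(x)   # minimizer of the second-order approximation
        x += d
        if abs(d) < tol:
            return x
    raise RuntimeError("maximum number of iterations reached")


# Hypothetical example: f(x) = exp(x) - 2x, f'(x) = exp(x) - 2, f''(x) = exp(x)
x_min = newton_minimize(lambda x: math.exp(x) - 2.0, math.exp, 0.0)
print(x_min, math.log(2.0))
```

Because $f''(x) > 0$ everywhere for this example, the stationary point found is a minimizer. In general the iteration only finds a point with $f'(x) = 0$.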
Newton’s method V
Newton’s method may not converge
{xk} may diverge
>> newton
This is Newtons Method
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'
Input the derivative of F(x) in terms of x
'-sin(x)'
Input initial approximation
Newton’s method VI
1
Input tolerance
0.001
Input maximum number of iterations - no decimal point
50
Select output destination
1. Screen
2. Text file
Enter 1 or 2
1
Select amount of output
1. Answer only
Newton’s method VII
2. All intermediate approximations
Enter 1 or 2
2
Newtons Method
I    P                 F(P)
1    1.64209262e+00   -7.1235903e-02
2    1.57067528e+00    1.2104963e-04
3    1.57079633e+00   -5.9124355e-13
Approximate solution = 1.5707963268e+00
with F(P) = -5.9124355058e-13
Number of iterations = 3
Tolerance = 1.0000000000e-03
Newton’s method VIII
>> newton
This is Newtons Method
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'
Input the derivative of F(x) in terms of x
'-sin(x)'
Input initial approximation
0.1
Input tolerance
0.001
Input maximum number of iterations - no decimal point
50
Select output destination
1. Screen
2. Text file
Newton’s method IX
Enter 1 or 2
1
Select amount of output
1. Answer only
2. All intermediate approximations
Enter 1 or 2
2
Newtons Method
I    P                 F(P)
1    1.00666444e+01   -8.0097972e-01
2    1.14045284e+01    3.9764987e-01
3    1.09711401e+01   -2.4431758e-02
4    1.09955792e+01    4.8638065e-06
5    1.09955743e+01   -4.2862638e-16
Approximate solution = 1.0995574288e+01
Newton’s method X
with F(P) = -4.2862637970e-16
Number of iterations = 5
Tolerance = 1.0000000000e-03
>>
Newton’s method needs fewer iterations than bisection here (3 and 5 versus 11)
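The earlier remark that $\{x_k\}$ may diverge can be illustrated with a small experiment. The choice of $f(x) = \arctan(x)$ and the starting points 1.5 and 0.5 are assumptions made here for illustration; they do not come from the slides.

```python
import math


def newton_iterates(f, fprime, x0, n):
    """Return [x_0, x_1, ..., x_n] produced by n Newton steps."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs


def datan(x):
    """Derivative of atan(x)."""
    return 1.0 / (1.0 + x * x)


# Far from the root x* = 0, the iterates overshoot and grow in magnitude.
diverging = newton_iterates(math.atan, datan, 1.5, 5)
# Closer to the root, the same iteration converges quickly.
converging = newton_iterates(math.atan, datan, 0.5, 5)
print(diverging)
print(converging)
```

The only difference between the two runs is the initial point, which is why the selection of initial solutions matters for Newton's method.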
Convergence Rate I
Which algorithm needs fewer iterations?
We need to analyze the convergence rate
Assume $x^*$ is a solution
Linear convergence: if
$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} \le r < 1.$$
Example: $r = 0.1$, and
$$x_1 - x^* = 0.1, \quad x_2 - x^* = 0.01, \quad x_3 - x^* = 0.001$$
Convergence Rate II
Superlinear convergence: if
$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = 0$$
Example:
$$\frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = 0.1,\ 0.01,\ 0.001$$
Convergence Rate III
$$x_1 - x^* = 0.1$$
$$x_2 - x^* = 0.1 \times 0.1 = 0.01$$
$$x_3 - x^* = 0.01 \times 0.01 = 0.0001$$
$$x_4 - x^* = 0.001 \times 0.0001 = 10^{-7}$$
Quadratic convergence: if
$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} \le r \quad (1)$$
Convergence Rate IV
Example: $r = 0.1$
$$x_1 - x^* = 0.1$$
$$x_2 - x^* = (0.1)^2 \times 0.1 = 10^{-3}$$
$$x_3 - x^* = (10^{-3})^2 \times 0.1 = 10^{-7}$$
$$x_4 - x^* = (10^{-7})^2 \times 0.1 = 10^{-15}$$
No need to have $r < 1$
(1) implies superlinear convergence
As long as $\|x_k - x^*\| \to 0$, the convergence is still faster than linear.
Convergence Rate V
Example:
$$\frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} = 2^2, \text{ and start from } \|x_k - x^*\| = 2^{-3}$$
$$2^{-6} \cdot 2^2 = 2^{-4}, \quad 2^{-8} \cdot 2^2 = 2^{-6}$$
Newton’s method: Quadratic convergence
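One way to see these rates numerically is to print the error ratios along a Newton run. The sketch below uses $f(x) = x^2 - 2$ with known root $\sqrt{2}$; this test function, the starting point 2.0, and the helper name `newton_errors` are assumptions chosen for illustration.

```python
import math


def newton_errors(x0, steps):
    """Errors |x_k - sqrt(2)| for Newton's method on f(x) = x^2 - 2."""
    root = math.sqrt(2.0)
    x = x0
    errors = [abs(x - root)]
    for _ in range(steps):
        x = x - (x * x - 2.0) / (2.0 * x)
        errors.append(abs(x - root))
    return errors


errors = newton_errors(2.0, 4)
# Ratio used in the definition of linear / superlinear convergence.
linear_ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]
# Ratio used in the definition of quadratic convergence, see (1).
quadratic_ratios = [e1 / e0 ** 2 for e0, e1 in zip(errors, errors[1:])]
print(linear_ratios)
print(quadratic_ratios)
```

The first list shrinks toward zero (superlinear), while the second stays bounded, which is the behaviour (1) describes. Only a few steps are used because the error soon reaches machine precision and the ratios then stop being meaningful.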
Homework 7-1 I
Write C or Matlab programs for the bisection and Newton methods
Find roots of some functions (also draw these functions)
You don’t want to try the same functions as others
Try different initial solutions and see what happens
Check their convergence rates
Newton Method: Quadratic Convergence I
Use Newton’s method to solve
$$f(x) = 0$$
Assume $f$ satisfies
1. $f$ is continuously differentiable
2. $f'$ is Lipschitz continuous:
$$|f'(y) - f'(x)| \le \alpha |y - x|, \quad \forall x, y$$
where $\alpha > 0$ is the Lipschitz constant
Newton Method: Quadratic Convergence II
Theorem 1
If $\{x_k\} \to x^*$ and $f'(x^*) \ne 0$, then
1. $f(x^*) = 0$
2. $\exists L \ge 1, \delta > 0$ such that $\forall k \ge L$
$$|x_{k+1} - x^*| \le \delta |x_k - x^*|^2$$
Proof.
Newton Method: Quadratic Convergence III
From Lemma 3,
$$f(x_{k+1}) = f(x_k) + f'(x_k)(x_{k+1} - x_k) + e(x_{k+1}, x_k) = 0 + e(x_{k+1}, x_k),$$
where
$$|e(x_{k+1}, x_k)| \le \frac{1}{2}\alpha |x_{k+1} - x_k|^2$$
Thus
$$|f(x_{k+1})| \le \frac{1}{2}\alpha |x_{k+1} - x_k|^2.$$
Newton Method: Quadratic Convergence IV
Since $\{x_k\} \to x^*$, we have $|x_{k+1} - x_k| \to 0$ and hence
$$|f(x_{k+1})| \to 0$$
Since $f$ is continuous,
$$f(x_{k+1}) \to f(x^*) \text{ implies } f(x^*) = 0.$$
Define $\bar{\beta} \equiv |f'(x^*)^{-1}|$.
Newton Method: Quadratic Convergence V
Since $|x_k - x^*| \to 0$ and $f'$ is Lipschitz continuous, $\exists L \ge 1$ such that $\forall k \ge L$,
$$|f'(x^*)^{-1}(f'(x_k) - f'(x^*))| \le |f'(x^*)^{-1}|\,|f'(x_k) - f'(x^*)| \le \alpha\bar{\beta}|x_k - x^*| \le \frac{1}{2}.$$
Newton Method: Quadratic Convergence VI
For $k \ge L$, from Lemma 2, $f'(x_k)^{-1}$ exists and
$$|f'(x_k)^{-1}| \le 2\bar{\beta}$$
By the definition of Newton’s method:
$$x_{k+1} - x^* = x_k - x^* - f'(x_k)^{-1} f(x_k) \quad (2)$$
Since $f(x^*) = 0$,
$$f(x^*) = f(x_k) + f'(x_k)(x^* - x_k) + e(x^*, x_k) = 0$$
Newton Method: Quadratic Convergence VII
Then multiply each term by $f'(x_k)^{-1}$:
$$f'(x_k)^{-1} f(x_k) = x_k - x^* - f'(x_k)^{-1} e(x^*, x_k) \quad (3)$$
Thus (2) and (3) imply
$$x_{k+1} - x^* = f'(x_k)^{-1} e(x^*, x_k)$$
Newton Method: Quadratic Convergence VIII
For $k \ge L$,
$$|x_{k+1} - x^*| \le |f'(x_k)^{-1}|\,|e(x^*, x_k)| \le 2\bar{\beta} \cdot \frac{1}{2}\alpha |x_k - x^*|^2 = \delta |x_k - x^*|^2$$
by defining $\delta = \bar{\beta}\alpha$.
Then the proof is complete.
Newton Method: Quadratic Convergence IX
Lemma 2
If
$$|f'(x^*)^{-1}(f'(x_k) - f'(x^*))| < 1 \quad (4)$$
then
$$|f'(x_k)^{-1}| \le \frac{|f'(x^*)^{-1}|}{1 - |f'(x^*)^{-1}(f'(x_k) - f'(x^*))|} \quad (5)$$
Proof.
Newton Method: Quadratic Convergence X
(5) can be written as
$$|f'(x_k)^{-1}| \le \frac{|f'(x^*)^{-1}|}{1 - |f'(x^*)^{-1} f'(x_k) - 1|} \quad (6)$$
Multiplying both sides by the denominator gives
$$|f'(x_k)^{-1}| - |f'(x^*)^{-1} - f'(x_k)^{-1}| \le |f'(x^*)^{-1}| \quad (7)$$
which holds by the triangle inequality.
Note that we need (4) only for ensuring that the denominator of the right-hand side of (5) is positive.
Newton Method: Quadratic Convergence XI
Lemma 3
If $f'$ is Lipschitz continuous with constant $\alpha > 0$ and
$$f(y) = f(x) + f'(x)(y - x) + e(y, x),$$
then $\forall x, y$
$$|e(y, x)| \le \frac{1}{2}\alpha |y - x|^2$$
Newton Method: Quadratic Convergence XII
Proof.
$$f(y) - f(x) - f'(x)(y - x) = \int_0^1 \left(f'(x + t(y - x)) - f'(x)\right)(y - x)\,dt$$
Note that
$$\frac{d\,f(x + t(y - x))}{dt} = f'(x + t(y - x))(y - x)$$
Newton Method: Quadratic Convergence XIII
The integral identity holds because, at $t = 1$:
$$f(x + t(y - x)) = f(y)$$
and at $t = 0$:
$$f(x + t(y - x)) = f(x)$$
Newton Method: Quadratic Convergence XIV
Then
$$
\begin{aligned}
|e(y, x)| &\le \int_0^1 \left|\left(f'(x + t(y - x)) - f'(x)\right)(y - x)\right| dt && (8) \\
&\le \int_0^1 \alpha \left|(x + t(y - x)) - x\right| |y - x|\,dt && (9) \\
&\le \int_0^1 \alpha t |y - x|^2\,dt = \frac{1}{2}\alpha |y - x|^2
\end{aligned}
$$
(8) to (9): using the Lipschitz condition (by assumption)
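The bound of Lemma 3 can be spot-checked numerically. The sketch below assumes $f(x) = \cos(x)$, for which $f'(x) = -\sin(x)$ is Lipschitz continuous with constant $\alpha = 1$; the grid of test points and the helper name `remainder` are arbitrary choices made here.

```python
import math


def remainder(f, fprime, y, x):
    """e(y, x) = f(y) - f(x) - f'(x)(y - x)."""
    return f(y) - f(x) - fprime(x) * (y - x)


def dcos(t):
    """Derivative of cos(t)."""
    return -math.sin(t)


alpha = 1.0                                   # Lipschitz constant of -sin
points = [i * 0.25 for i in range(-20, 21)]   # grid on [-5, 5]
violations = 0
for x in points:
    for y in points:
        e = remainder(math.cos, dcos, y, x)
        # Small slack added for floating-point rounding.
        if abs(e) > 0.5 * alpha * (y - x) ** 2 + 1e-12:
            violations += 1
print(violations)
```

A check like this does not prove the lemma, but it is a quick way to catch a wrong constant or a sign error in an implementation.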
Nonlinear Equations with more than One Variable I
n equalities, n variables
$$
\begin{aligned}
f_1(x_1, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
f_n(x_1, \ldots, x_n) &= 0
\end{aligned}
$$
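Nonlinear Equations with more than One Variable II
The Jacobian $J(x)$
$$J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_n(x)}{\partial x_1} & \cdots & \frac{\partial f_n(x)}{\partial x_n} \end{bmatrix}$$
$J(x)$ is a square matrix but in general not symmetric
Newton’s method
$$x_{k+1} = x_k - J(x_k)^{-1} f(x_k)$$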
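A minimal sketch of the multivariate update for a 2-by-2 system follows. As an illustration it uses the two-equation system that appears in the fixed-point slides later in this deck; the function names, the starting point (0, 0), the max-norm stopping rule, and the use of Cramer's rule instead of a library linear solver are all assumptions made here.

```python
def f(x1, x2):
    """The two residuals f_1, f_2 of the example system."""
    return (x1 ** 3 + 10.0 * x1 - x2 - 5.0,
            x1 + x2 ** 3 - 10.0 * x2 + 1.0)


def jacobian(x1, x2):
    """Rows of J(x): partial derivatives of f_1 and f_2."""
    return ((3.0 * x1 ** 2 + 10.0, -1.0),
            (1.0, 3.0 * x2 ** 2 - 10.0))


def newton_system(x1, x2, tol=1e-7, max_iter=50):
    """Newton's method for the 2x2 system; returns (x1, x2, iterations)."""
    for k in range(1, max_iter + 1):
        f1, f2 = f(x1, x2)
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J s = -f by Cramer's rule instead of forming J^{-1}.
        s1 = (-f1 * d + b * f2) / det
        s2 = (-a * f2 + c * f1) / det
        x1, x2 = x1 + s1, x2 + s2
        if max(abs(s1), abs(s2)) < tol:
            return x1, x2, k
    raise RuntimeError("maximum number of iterations reached")


print(newton_system(0.0, 0.0))
```

For larger systems one would solve $J(x_k) s = -f(x_k)$ with a linear solver at each step rather than inverting $J(x_k)$ explicitly.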
Nonlinear Equations with more than One Variable III
Of course, whether $J(x_k)$ is invertible is an issue
Homework 7-2: Consider
$$f_1(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 - 1 = 0$$
$$f_2(x_1, x_2, x_3) = x_1^2 + x_3^2 - 1/4 = 0$$
$$f_3(x_1, x_2, x_3) = x_1^2 + x_2^2 - 4x_3 = 0$$
With different initial solutions, use Newton’s method to solve it until
$$\|x_{k+1} - x_k\| < 10^{-7}$$
Nonlinear Equations with more than One Variable IV
Use Matlab, C, or Python.
Remaining questions about Newton’s method:
Selection of initial solutions
Under what conditions do we know the sequence converges?
If there are m equalities and n variables, but $m \ne n$:
$$
\begin{aligned}
f_1(x_1, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
f_m(x_1, \ldots, x_n) &= 0
\end{aligned}
$$
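Nonlinear Equations with more than One Variable V
Then we cannot use Newton’s method, as $J(x)$ is $m \times n$ and not invertible
$$J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$
It’s not a square matrix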
Nonlinear Equations with more than One Variable VI
Consider the following reformulation
$$\min_x \|f(x)\|^2 \quad (10)$$
That is
$$\min_x f(x)^T f(x)$$
or
$$\min_x f_1(x)^2 + \cdots + f_m(x)^2$$
Nonlinear Equations with more than One Variable VII
Now $f : R^n \to R^m$
If there is an $x$ such that $f(x) = 0$, then the minimal value of (10) must be zero.
Consider
$$g(x) \equiv \|f(x)\|^2$$
Then
$$\min g(x) \text{ means solving } \nabla g(x) = 0$$
Nonlinear Equations with more than One Variable VIII
That is
$$\frac{\partial g(x)}{\partial x_1} = 0, \ldots, \frac{\partial g(x)}{\partial x_n} = 0$$
Now
$$g : R^n \to R^1, \quad \nabla g : R^n \to R^n$$
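Nonlinear Equations with more than One Variable IX
What is $\nabla g(x)$?
$$\frac{\partial g(x)}{\partial x_1} = \frac{\partial (f_1(x)^2 + \cdots + f_m(x)^2)}{\partial x_1} = 2 f_1(x)\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2 f_m(x)\frac{\partial f_m(x)}{\partial x_1}$$
$$\vdots$$
$$\frac{\partial g(x)}{\partial x_n} = 2 f_1(x)\frac{\partial f_1(x)}{\partial x_n} + \cdots + 2 f_m(x)\frac{\partial f_m(x)}{\partial x_n}$$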
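Nonlinear Equations with more than One Variable X
Therefore,
$$\nabla g(x) = 2 \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial f_1(x)}{\partial x_n} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix} \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix} = 2 J(x)^T f(x)$$
Then we use Newton’s method to solve $\nabla g(x) = 0$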
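The identity $\nabla g(x) = 2 J(x)^T f(x)$ can be checked against central finite differences. The map $f : R^2 \to R^3$ below, the evaluation point, and the step size `h` are hypothetical choices made only for this check.

```python
def f(x):
    """Hypothetical f: R^2 -> R^3 (m = 3 equations, n = 2 variables)."""
    x1, x2 = x
    return [x1 ** 2 + x2 - 1.0, x1 - x2 ** 2, x1 * x2 - 0.5]


def jacobian(x):
    """The 3x2 Jacobian of f, one row per equation."""
    x1, x2 = x
    return [[2.0 * x1, 1.0], [1.0, -2.0 * x2], [x2, x1]]


def g(x):
    """g(x) = ||f(x)||^2."""
    return sum(fk * fk for fk in f(x))


def grad_g(x):
    """Gradient from the formula 2 J(x)^T f(x)."""
    fx, J = f(x), jacobian(x)
    return [2.0 * sum(J[k][i] * fx[k] for k in range(len(fx)))
            for i in range(len(x))]


def grad_fd(x, h=1e-6):
    """Gradient of g by central finite differences."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((g(xp) - g(xm)) / (2.0 * h))
    return grad


x = [0.7, -0.3]
print(grad_g(x))
print(grad_fd(x))
```

The two printed vectors should agree up to finite-difference error, which gives a simple sanity check before using the gradient inside Newton's method.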
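Nonlinear Equations with more than One Variable XI
The iteration
$$x_{k+1} = x_k - \nabla^2 g(x_k)^{-1} \nabla g(x_k)$$
What is $\nabla^2 g(x)$? An $n \times n$ matrix:
$$\nabla^2 g(x) = \begin{bmatrix} \frac{\partial^2 g(x)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 g(x)}{\partial x_n \partial x_1} \\ \vdots & & \vdots \\ \frac{\partial^2 g(x)}{\partial x_1 \partial x_n} & \cdots & \frac{\partial^2 g(x)}{\partial x_n \partial x_n} \end{bmatrix}$$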
Nonlinear Equations with more than One Variable XII
As in general
$$\frac{\partial^2 g(x)}{\partial x_i \partial x_j} = \frac{\partial^2 g(x)}{\partial x_j \partial x_i},$$
$\nabla^2 g(x)$ is symmetric
How is $\nabla^2 g(x)$ expressed in terms of $f$?
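Nonlinear Equations with more than One Variable XIII
Remember that
$$\frac{\partial g(x)}{\partial x_1} = 2 f_1(x)\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2 f_m(x)\frac{\partial f_m(x)}{\partial x_1}$$
Thus
$$\frac{\partial^2 g(x)}{\partial x_1 \partial x_1} = 2 f_1(x)\frac{\partial^2 f_1(x)}{\partial x_1 \partial x_1} + 2\frac{\partial f_1(x)}{\partial x_1}\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2 f_m(x)\frac{\partial^2 f_m(x)}{\partial x_1 \partial x_1} + 2\frac{\partial f_m(x)}{\partial x_1}\frac{\partial f_m(x)}{\partial x_1}$$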
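Nonlinear Equations with more than One Variable XIV
$$\frac{\partial^2 g(x)}{\partial x_1 \partial x_2} = 2 f_1(x)\frac{\partial^2 f_1(x)}{\partial x_1 \partial x_2} + 2\frac{\partial f_1(x)}{\partial x_2}\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2 f_m(x)\frac{\partial^2 f_m(x)}{\partial x_1 \partial x_2} + 2\frac{\partial f_m(x)}{\partial x_2}\frac{\partial f_m(x)}{\partial x_1}$$
In matrix form: Define
$$(G_k)_{ij} \equiv f_k(x)\frac{\partial^2 f_k(x)}{\partial x_i \partial x_j}$$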
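Nonlinear Equations with more than One Variable XV
Then
$$\nabla^2 g(x) = 2\left(\sum_{k=1}^m G_k + J(x)^T J(x)\right)$$
Now
$$J(x)^T = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial f_1(x)}{\partial x_n} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}, \quad J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$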
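A small worked case may help to see that the gradient and Hessian formulas agree with direct differentiation. The example below, with $n = 1$ variable and $m = 2$ equations, is a hypothetical choice made here and is not from the slides.

```latex
% Hypothetical example: n = 1, m = 2
f_1(x) = x, \qquad f_2(x) = x^2 - 1, \qquad
J(x) = \begin{bmatrix} 1 \\ 2x \end{bmatrix}

% Direct differentiation of g
g(x) = x^2 + (x^2 - 1)^2 = x^4 - x^2 + 1, \qquad
g'(x) = 4x^3 - 2x, \qquad g''(x) = 12x^2 - 2

% Gradient formula: 2 J(x)^T f(x)
2 J(x)^T f(x) = 2\left(x + 2x(x^2 - 1)\right) = 4x^3 - 2x

% Hessian formula: 2 (G_1 + G_2 + J(x)^T J(x))
G_1 = f_1(x) f_1''(x) = 0, \qquad
G_2 = f_2(x) f_2''(x) = 2(x^2 - 1), \qquad
J(x)^T J(x) = 1 + 4x^2

2\left(G_1 + G_2 + J(x)^T J(x)\right) = 2(6x^2 - 1) = 12x^2 - 2
```

Both formulas reproduce $g'(x)$ and $g''(x)$. Note that this system has no exact solution ($x = 0$ and $x^2 = 1$ cannot both hold), so the minimum of $g$ is positive here.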
Fixed-Point Iteration for Nonlinear Systems I
$$f_1(x_1, x_2) = x_1^3 + 10x_1 - x_2 - 5 = 0$$
$$f_2(x_1, x_2) = x_1 + x_2^3 - 10x_2 + 1 = 0$$
Fixed-Point Iteration for Nonlinear Systems II
Converting to
$$x_1 = g_1(x_1, x_2), \quad x_2 = g_2(x_1, x_2)$$
For example,
$$x_1 = -0.1x_1^3 + 0.1x_2 + 0.5$$
$$x_2 = 0.1x_1 + 0.1x_2^3 + 0.1$$
Then using the following iteration
$$x_1^{k+1} = -0.1(x_1^k)^3 + 0.1x_2^k + 0.5$$
$$x_2^{k+1} = 0.1x_1^k + 0.1(x_2^k)^3 + 0.1$$
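The iteration above can be run directly. In this sketch the function names, the starting point (0, 0), and the max-norm stopping rule with `tol` are assumptions; only the map `g` comes from the slide.

```python
def g(x1, x2):
    """The fixed-point map (g_1, g_2) from the slide."""
    return (-0.1 * x1 ** 3 + 0.1 * x2 + 0.5,
            0.1 * x1 + 0.1 * x2 ** 3 + 0.1)


def fixed_point(x1, x2, tol=1e-10, max_iter=100):
    """Iterate x^{k+1} = g(x^k); returns (x1, x2, iterations)."""
    for k in range(1, max_iter + 1):
        y1, y2 = g(x1, x2)
        if max(abs(y1 - x1), abs(y2 - x2)) < tol:
            return y1, y2, k
        x1, x2 = y1, y2
    raise RuntimeError("maximum number of iterations reached")


x1, x2, iterations = fixed_point(0.0, 0.0)
# Residuals of the original equations f_1 = 0 and f_2 = 0.
print(x1 ** 3 + 10.0 * x1 - x2 - 5.0, x1 + x2 ** 3 - 10.0 * x2 + 1.0)
print(x1, x2, iterations)
```

The residuals of the original system are printed to confirm that the fixed point of `g` is also a solution of $f_1 = 0$ and $f_2 = 0$.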
Fixed-Point Iteration for Nonlinear Systems III
What is a fixed point?
If $x = g(x)$, then $x$ is a fixed point of $g$
Does the iteration converge?
Fixed-Point Iteration for Nonlinear Systems IV
Theorem 4
If $p$ is a fixed point of $g(x)$,
$$\|\nabla g(p)\| < 1,$$
and the initial point $x_0$ is sufficiently close to $p$, then the algorithm
$$x_{k+1} = g(x_k)$$
converges