
Numerical Methods in Science and Engineering

Thomas R. Bewley

UC San Diego


Contents

Preface

1 A short review of linear algebra
  1.1 Notation
      1.1.1 Vectors
      1.1.2 Vector addition
      1.1.3 Vector multiplication
      1.1.4 Matrices
      1.1.5 Matrix addition
      1.1.6 Matrix/vector multiplication
      1.1.7 Matrix multiplication
      1.1.8 Identity matrix
      1.1.9 Inverse of a square matrix
      1.1.10 Other definitions
  1.2 Determinants
      1.2.1 Definition of the determinant
      1.2.2 Properties of the determinant
      1.2.3 Computing the determinant
  1.3 Eigenvalues and Eigenvectors
      1.3.1 Physical motivation for eigenvalues and eigenvectors
      1.3.2 Eigenvector decomposition
  1.4 Matrix norms
  1.5 Condition number

2 Solving linear equations
  2.1 Introduction to the solution of Ax = b
      2.1.1 Example of solution approach
  2.2 Gaussian elimination algorithm
      2.2.1 Forward sweep
      2.2.2 Back substitution
      2.2.3 Operation count
      2.2.4 Matlab implementation
      2.2.5 LU decomposition
      2.2.6 Testing the Gaussian elimination code
      2.2.7 Pivoting
  2.3 Thomas algorithm
      2.3.1 Forward sweep
      2.3.2 Back substitution
      2.3.3 Operation count
      2.3.4 Matlab implementation
      2.3.5 LU decomposition
      2.3.6 Testing the Thomas code
      2.3.7 Parallelization
  2.4 Review

3 Solving nonlinear equations
  3.1 The Newton-Raphson method for nonlinear root finding
      3.1.1 Scalar case
      3.1.2 Quadratic convergence
      3.1.3 Multivariable case: systems of nonlinear equations
      3.1.4 Matlab implementation
      3.1.5 Dependence of Newton-Raphson on a good initial guess
  3.2 Bracketing approaches for scalar root finding
      3.2.1 Bracketing a root
      3.2.2 Refining the bracket: bisection
      3.2.3 Refining the bracket: false position
      3.2.4 Testing the bracketing algorithms

4 Interpolation
  4.1 Lagrange interpolation
      4.1.1 Solving n+1 equations for the n+1 coefficients
      4.1.2 Constructing the polynomial directly
      4.1.3 Matlab implementation
  4.2 Cubic spline interpolation
      4.2.1 Constructing the cubic spline interpolant
      4.2.2 Matlab implementation
      4.2.3 Tension splines
      4.2.4 B-splines

5 Minimization
  5.1 Motivation
      5.1.1 Solution of large linear systems of equations
      5.1.2 Solution of nonlinear systems of equations
      5.1.3 Optimization and control of dynamic systems
  5.2 The Newton-Raphson method for nonquadratic minimization
  5.3 Bracketing approaches for minimization of scalar functions
      5.3.1 Bracketing a minimum
      5.3.2 Refining the bracket: the golden section search
      5.3.3 Refining the bracket: inverse parabolic interpolation
      5.3.4 Refining the bracket: Brent's method
  5.4 Gradient-based approaches for minimization of multivariable functions
      5.4.1 Steepest descent for quadratic functions
      5.4.2 Conjugate gradient for quadratic functions
      5.4.3 Preconditioned conjugate gradient
      5.4.4 Extension to non-quadratic functions

6 Differentiation
  6.1 Derivation of finite difference formulae
  6.2 Taylor Tables
  6.3 Padé Approximations
  6.4 Modified wavenumber analysis
  6.5 Alternative derivation of differentiation formulae

7 Integration
  7.1 Basic quadrature formulae
      7.1.1 Techniques based on Lagrange interpolation
      7.1.2 Extension to several gridpoints
  7.2 Error Analysis of Integration Rules
  7.3 Romberg integration
  7.4 Adaptive Quadrature

8 Ordinary differential equations
  8.1 Taylor-series methods
  8.2 The trapezoidal method
  8.3 A model problem
      8.3.1 Simulation of an exponentially-decaying system
      8.3.2 Simulation of an undamped oscillating system
  8.4 Stability
      8.4.1 Stability of the explicit Euler method
      8.4.2 Stability of the implicit Euler method
      8.4.3 Stability of the trapezoidal method
  8.5 Accuracy
  8.6 Runge-Kutta methods
      8.6.1 The class of second-order Runge-Kutta methods (RK2)
      8.6.2 A popular fourth-order Runge-Kutta method (RK4)
      8.6.3 An adaptive Runge-Kutta method (RKM4)
      8.6.4 A low-storage Runge-Kutta method (RKW3)

A Getting started with Matlab
  A.1 What is Matlab?
  A.2 Where to find Matlab
  A.3 How to start Matlab
  A.4 How to run Matlab: the basics
  A.5 Commands for matrix factoring and decomposition
  A.6 Commands used in plotting
  A.7 Other Matlab commands
  A.8 Hardcopies
  A.9 Matlab programming procedures: m-files
  A.10 Sample m-file


Preface

The present text provides a brief (one quarter) introduction to efficient and effective numerical methods for solving typical problems in scientific and engineering applications. It is intended to provide a succinct and modern guide for a senior or first-quarter masters level course on this subject, assuming only a prior exposure to linear algebra, a knowledge of complex variables, and rudimentary skills in computer programming.

I am indebted to several sources for the material compiled in this text, which draws from class notes by Prof. Parviz Moin at Stanford University, class notes by Prof. Harv Lomax at Stanford University, and material presented in Numerical Recipes by Press et al. and Matrix Computations by Golub & van Loan. The latter two textbooks are highly recommended as supplemental texts to the present notes. We do not attempt to duplicate these excellent texts, but rather attempt to build up to and introduce the subjects discussed at greater length in these more exhaustive texts at a metered pace.

The present text was prepared as supplemental material for courses at UC San Diego.


Chapter 1

A short review of linear algebra

Linear algebra forms the foundation upon which efficient numerical methods may be built to solve both linear and nonlinear systems of equations. Consequently, it is useful to review briefly some relevant concepts from linear algebra.

1.1 Notation

Over the years, a fairly standard notation has evolved for problems in linear algebra. For clarity, this notation is now reviewed.

1.1.1 Vectors

A vector is defined as an ordered collection of numbers or algebraic variables:

    c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}.

All vectors in the present notes will be assumed to be arranged in a column unless indicated otherwise. Vectors are represented with lower-case letters and denoted in writing (i.e., on the blackboard) with an arrow above the letter (\vec{c}) and in print (as in these notes) with boldface (c). The vector c shown above is n-dimensional, and its i-th element is referred to as c_i. For simplicity, the elements of all vectors and matrices in these notes will be assumed to be real unless indicated otherwise. However, all of the numerical tools we will develop extend to complex systems in a straightforward manner.

1.1.2 Vector addition

Two vectors of the same size are added by adding their individual elements:

    c + d = \begin{pmatrix} c_1 + d_1 \\ c_2 + d_2 \\ \vdots \\ c_n + d_n \end{pmatrix}.


1.1.3 Vector multiplication

In order to multiply a vector with a scalar, operations are performed on each element:

    α c = \begin{pmatrix} α c_1 \\ α c_2 \\ \vdots \\ α c_n \end{pmatrix}.

The inner product of two real vectors of the same size, also known as the dot product, is defined as the sum of the products of the corresponding elements:

    (u, v) = u \cdot v = \sum_{i=1}^{n} u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.

The 2-norm of a vector, also known as the Euclidean norm or the vector length, is defined by the square root of the inner product of the vector with itself:

    \|u\| = \sqrt{(u, u)} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}.

The angle between two vectors may be defined using the inner product such that

    \cos(\angle(u, v)) = \frac{(u, v)}{\|u\| \, \|v\|}.

In summation notation, any term in an equation with lower-case English letter indices repeated twice implies summation over all values of that index. Using this notation, the inner product is written simply as u_i v_i. To avoid implying summation notation, Greek indices are usually used; thus, u_ι v_ι does not imply summation over ι.
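These definitions translate directly into Matlab; the following minimal sketch (with arbitrary example vectors) computes the inner product, the 2-norm, and the angle between two vectors:

% vector_demo.m
% A minimal sketch of the definitions above; u and v are arbitrary.
u = [1; 2; 3];  v = [4; 5; 6];
ip    = u' * v;                          % inner product (u,v) = sum u_i v_i
normu = sqrt(u' * u);                    % 2-norm; identical to norm(u)
theta = acos(ip / (norm(u)*norm(v)));    % angle between u and v, in radians
% end vector_demo.m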

1.1.4 Matrices

A matrix is defined as a two-dimensional ordered array of numbers or algebraic variables:

    A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.

The matrix above has m rows and n columns, and is referred to as an m × n matrix. Matrices are represented with upper-case letters, with their elements represented with lower-case letters. The element of the matrix A in the i-th row and the j-th column is referred to as a_{ij}.

1.1.5 Matrix addition

Two matrices of the same size are added by adding their individual elements. Thus, if C = A + B, then c_{ij} = a_{ij} + b_{ij}.


1.1.6 Matrix/vector multiplication

The product of a matrix A with a vector x, which results in another vector b, is denoted Ax = b. It may be defined in index notation as:

    b_i = \sum_{j=1}^{n} a_{ij} x_j.

In summation notation, it is written:

    b_i = a_{ij} x_j.

Recall that, as the j index is repeated in the above expression, summation over all values of j is implied without being explicitly stated. The first few elements of the vector b are given by:

    b_1 = a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n,
    b_2 = a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n,

etc. The vector b may be written:

    b = x_1 \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \cdots + x_n \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}.

Thus, b is simply a linear combination of the columns of A with the elements of x as weights.
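This column interpretation is easy to verify numerically; in the following minimal Matlab sketch (A and x arbitrary), the product A*x and the weighted sum of the columns of A agree to round-off:

% column_view_demo.m
% Verify that b = A*x is a linear combination of the columns of A,
% with the elements of x as weights (A and x are arbitrary examples).
A = [1 2; 3 4; 5 6];  x = [10; 20];
b1 = A * x;                          % matrix/vector product
b2 = x(1)*A(:,1) + x(2)*A(:,2);      % weighted sum of the columns
norm(b1 - b2)                        % zero, to round-off
% end column_view_demo.m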

1.1.7 Matrix multiplication

Given two matrices A and B, where the number of columns of A is the same as the number of rows of B, the product C = AB is defined in summation notation, for the (i, j)-th element of the matrix C, as

    c_{ij} = a_{ik} b_{kj}.

Again, as the index k is repeated in this expression, summation is implied over the index k. In other words, c_{ij} is just the inner product of row i of A with column j of B. For example, if we write

    \underbrace{\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{m1} & c_{m2} & \cdots & c_{mn} \end{pmatrix}}_{C} = \underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1l} \\ a_{21} & a_{22} & \cdots & a_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{ml} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{l1} & b_{l2} & \cdots & b_{ln} \end{pmatrix}}_{B},

then we can see that c_{11} is the inner product of row 1 of A with column 1 of B. Note that usually AB ≠ BA; matrix multiplication usually does not commute.
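A quick Matlab check (with two arbitrary 2 × 2 matrices) illustrates this lack of commutativity:

% commute_demo.m
% AB and BA generally differ (A and B are arbitrary examples).
A = [1 2; 3 4];  B = [0 1; 1 0];
A*B, B*A                 % display both products; note that they differ
% end commute_demo.m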


1.1.8 Identity matrix

The identity matrix is a square matrix with ones on the diagonal and zeros off the diagonal:

    I = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}, \qquad Ix = x, \quad IA = AI = A.

In the notation for I used at left, in which there are several blank spots in the matrix, the zeros are assumed to act like "paint" and fill up all unmarked entries. Note that a matrix or a vector is not changed when multiplied by I. The elements of the identity matrix are equal to the Kronecker delta:

    δ_{ij} = \begin{cases} 1 & i = j, \\ 0 & \text{otherwise.} \end{cases}

1.1.9 Inverse of a square matrix

If BA = I, we may refer to B as A^{-1}. (Note, however, that for a given square matrix A, it is not always possible to compute its inverse; when such a computation is possible, we refer to the matrix A as being "nonsingular", or "invertible".) If we take Ax = b, we may multiply this equation from the left by A^{-1}, which results in

    A^{-1} \left[ Ax = b \right] \;\Rightarrow\; x = A^{-1} b.

Computation of the inverse of a matrix thus leads to one method for determining x given A and b; unfortunately, this method is extremely inefficient. Note that, since matrix multiplication does not commute, one always has to be careful when multiplying an equation by a matrix to multiply out all terms consistently (either from the left, as illustrated above, or from the right).

If we take AB = I and CA = I, then we may premultiply the former equation by C, leading to

    C \left[ AB = I \right] \;\Rightarrow\; \underbrace{CA}_{I} B = C \;\Rightarrow\; B = C.

Thus, the left and right inverses are identical.

If we take AX = I and AY = I (noting by the above argument that YA = I), it follows that

    Y \left[ AX = I \right] \;\Rightarrow\; \underbrace{YA}_{I} X = Y \;\Rightarrow\; X = Y.

Thus, the inverse is unique.
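The inefficiency of solving Ax = b via the explicit inverse is easy to observe in Matlab, where A\b solves the system directly by Gaussian elimination (the subject of §2). A minimal sketch, on an arbitrary random test system:

% inverse_demo.m
% Solving Ax = b via the explicit inverse vs. solving it directly.
% (A and b are arbitrary; A\b is the preferred approach in practice.)
n = 500;  A = rand(n);  b = rand(n,1);
x1 = inv(A) * b;       % via the (expensive) explicit inverse
x2 = A \ b;            % direct solve; much cheaper
norm(x1 - x2)          % the two solutions agree to round-off
% end inverse_demo.m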

1.1.10 Other definitions

The transpose of a matrix A, denoted A^T, is found by swapping the rows and the columns:

    A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \;\Rightarrow\; A^T = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.

In index notation, we say that b_{ij} = a_{ji}, where B = A^T.


The adjoint of a matrix A, denoted A^*, is found by taking the conjugate transpose of A:

    A = \begin{pmatrix} 1 & 2i \\ 3 & 4+5i \end{pmatrix} \;\Rightarrow\; A^* = \begin{pmatrix} 1 & 3 \\ -2i & 4-5i \end{pmatrix}.

The main diagonal of a matrix A is the collection of elements along the line from a_{11} to a_{nn}. The first subdiagonal is immediately below the main diagonal (from a_{21} to a_{n,n-1}), the second subdiagonal is immediately below the first subdiagonal, etc.; the first superdiagonal is immediately above the main diagonal (from a_{12} to a_{n-1,n}), the second superdiagonal is immediately above the first superdiagonal, etc.

A banded matrix has nonzero elements only near the main diagonal. Such matrices arise in discretization of differential equations. As we will show, the narrower the width of the band of nonzero elements, the easier it is to solve the problem Ax = b with an efficient numerical algorithm. A diagonal matrix is one in which only the main diagonal of the matrix is nonzero; a tridiagonal matrix is one in which only the main diagonal and the first subdiagonal and superdiagonal are nonzero; etc. An upper triangular matrix is one for which all subdiagonals are zero, and a lower triangular matrix is one for which all superdiagonals are zero. Generically, such matrices look like:

[Sketch: three matrix silhouettes, with the nonzero elements of a banded matrix A clustered in a band about the main diagonal, those of an upper triangular matrix U on and above the main diagonal, and those of a lower triangular matrix L on and below the main diagonal.]

A block banded matrix is a banded matrix in which the nonzero elements themselves are naturally grouped into smaller submatrices. Such matrices arise when discretizing systems of partial differential equations in more than one direction. For example, as shown in class, the following is the block tridiagonal matrix that arises when discretizing the Laplacian operator in two dimensions on a uniform grid:

    M = \begin{pmatrix} B & C & & & \\ A & B & C & & \\ & \ddots & \ddots & \ddots & \\ & & A & B & C \\ & & & A & B \end{pmatrix} \quad \text{with} \quad B = \begin{pmatrix} -4 & 1 & & & \\ 1 & -4 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -4 & 1 \\ & & & 1 & -4 \end{pmatrix}, \quad A = C = I.

We have, so far, reviewed some of the notation of linear algebra that will be essential in the development of numerical methods. In §2, we will discuss various methods of solution of nonsingular square systems of the form Ax = b for the unknown vector x. We will need to solve systems of this type repeatedly in the numerical algorithms we will develop, so we will devote a lot of attention to this problem. With this machinery, and a bit of analysis, we will see in the first homework that we are already able to analyze important systems of engineering interest with a reasonable degree of accuracy; in homework 1, we analyze the static forces in a truss subject to some significant simplifying assumptions.

In order to further our understanding of the statics and dynamics of phenomena important in physical systems, we need to review a few more elements from linear algebra: determinants, eigenvalues, matrix norms, and the condition number.


1.2 Determinants

1.2.1 Definition of the determinant

An extremely useful method to characterize a square matrix is by making use of a scalar quantity called the determinant, denoted |A|. The determinant may be defined by induction as follows:

1. The determinant of a 1 × 1 matrix A = [a_{11}] is just |A| = a_{11}.

2. The determinant of a 2 × 2 matrix A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} is |A| = a_{11} a_{22} - a_{21} a_{12}.

n. The determinant of an n × n matrix is defined as a function of the determinant of several (n-1) × (n-1) matrices as follows: the determinant of A is a linear combination of the elements of row ι (for any ι such that 1 ≤ ι ≤ n) and their corresponding cofactors:

    |A| = a_{ι1} A_{ι1} + a_{ι2} A_{ι2} + \cdots + a_{ιn} A_{ιn},

where the cofactor A_{ιj} is defined as the determinant of M_{ιj} with the correct sign:

    A_{ιj} = (-1)^{ι+j} |M_{ιj}|,

where the minor M_{ιj} is the matrix formed by deleting row ι and column j of the matrix A.
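The inductive definition above may be transcribed directly into a recursive Matlab function; the following minimal sketch (the function name cofactor_det is ours, not a built-in) expands along the first row. It costs O(n!) flops, so it is for illustration only; in practice, use the approach described in §1.2.3 below.

% cofactor_det.m
% Determinant by recursive cofactor expansion along the first row,
% a direct (and very expensive) transcription of the definition above.
function d = cofactor_det(A)
n = size(A,1);
if n == 1
  d = A(1,1);
else
  d = 0;
  for j = 1:n
    M = A(2:n, [1:j-1, j+1:n]);               % minor: delete row 1, column j
    d = d + (-1)^(1+j) * A(1,j) * cofactor_det(M);
  end
end
end
% end cofactor_det.m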

1.2.2 Properties of the determinant

When defined in this manner, the determinant has several important properties:

1. Adding a multiple of one row of the matrix to another row leaves the determinant unchanged:

    \begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c + αa & d + αb \end{vmatrix}.

2. Exchanging two rows of the matrix flips the sign of the determinant:

    \begin{vmatrix} a & b \\ c & d \end{vmatrix} = - \begin{vmatrix} c & d \\ a & b \end{vmatrix}.

3. If A is triangular (or diagonal), then |A| is the product a_{11} a_{22} \cdots a_{nn} of the elements on the main diagonal. In particular, the determinant of the identity matrix is |I| = 1.

4. If A is nonsingular (i.e., if Ax = b has a unique solution), then |A| ≠ 0.
   If A is singular (i.e., if Ax = b does not have a unique solution), then |A| = 0.

1.2.3 Computing the determinant

For a large matrix A, the determinant is most easily computed by performing the row operations mentioned in properties 1 and 2 discussed in the previous section to reduce A to an upper triangular matrix U. (In fact, this is the heart of the Gaussian elimination procedure, which will be described in detail in §2.2.) Taking properties 1, 2, and 3 together, it follows that

    |A| = (-1)^r |U| = (-1)^r u_{11} u_{22} \cdots u_{nn},

where r is the number of row exchanges performed, and the u_{ιι} are the elements of U which are on the main diagonal. By property 4, we see that whether or not the determinant of a matrix is zero is the litmus test for whether or not that matrix is singular.

The command det(A) is used to compute the determinant in Matlab.
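For example, a minimal sketch of this formula using Matlab's built-in LU factorization (such factorizations are discussed in §2.2.5; here [L,U,P] = lu(A) gives PA = LU, and det(P) = (-1)^r accounts for the r row exchanges):

% det_via_lu.m
% Determinant from the row-reduced upper triangular form, as above.
A = rand(5);                    % an arbitrary test matrix
[L, U, P] = lu(A);              % P*A = L*U, with partial pivoting
dA = det(P) * prod(diag(U));    % (-1)^r * u_11 u_22 ... u_nn
[dA, det(A)]                    % agrees with the built-in det
% end det_via_lu.m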


1.3 Eigenvalues and Eigenvectors

Consider the equation

    A ξ = λ ξ.    (1.1)

We want to solve for both a scalar λ and some corresponding vector ξ (other than the trivial solution ξ = 0) such that, when ξ is multiplied from the left by A, it is equivalent to simply scaling ξ by the factor λ. Such a situation has the important physical interpretation as a natural mode of a system when A represents the "system matrix" for a given dynamical system, as will be illustrated in §1.3.1.

The easiest way to determine for which λ it is possible to solve the equation A ξ = λ ξ for ξ ≠ 0 is to rewrite this equation as

    (A - λI) ξ = 0.

If (A - λI) is a nonsingular matrix, then this equation has a unique solution, and since the right-hand side is zero, that solution must be ξ = 0. However, for those values of λ for which (A - λI) is singular, this equation admits other solutions with ξ ≠ 0. The values of λ for which (A - λI) is singular are called the eigenvalues of the matrix A, and the corresponding vectors ξ are called the eigenvectors.

Making use of property 4 of the determinant, we see that the eigenvalues must therefore be exactly those values of λ for which

    |A - λI| = 0.

This expression, when multiplied out, turns out to be a polynomial in λ of degree n for an n × n matrix A; this is referred to as the characteristic polynomial of A. By the fundamental theorem of algebra, there are exactly n roots to this equation, though these roots need not be distinct. Once the eigenvalues λ are found by finding the roots of the characteristic polynomial of A, the eigenvectors ξ may be found by solving the equation (A - λI) ξ = 0. Note that the ξ in this equation may be determined only up to an arbitrary constant, which cannot be determined because (A - λI) is singular; in other words, if ξ is an eigenvector corresponding to a particular eigenvalue λ, then cξ is also an eigenvector for any scalar c. Note also that, if all of the eigenvalues of A are distinct (different), all of the eigenvectors of A are linearly independent (i.e., no one of them may be constructed as a linear combination of the others).

The command [V,D] = eig(A) is used to compute eigenvalues and eigenvectors in Matlab.

1.3.1 Physical motivation for eigenvalues and eigenvectors

In order to realize the significance of eigenvalues and eigenvectors for characterizing physical systems, and to foreshadow some of the developments in later chapters, it is enlightening at this point to diverge for a bit and discuss the time evolution of a taut wire which has just been struck (as with a piano wire) or plucked (as with the wire of a guitar or a harp). Neglecting damping, the deflection of the wire, f(x,t), obeys the linear partial differential equation (PDE)

    \frac{\partial^2 f}{\partial t^2} = σ^2 \frac{\partial^2 f}{\partial x^2},    (1.2)

subject to boundary conditions

    f = 0 at x = 0,    f = 0 at x = L,

and initial conditions

    f = c(x) at t = 0,    \frac{\partial f}{\partial t} = d(x) at t = 0.


We will solve this system using the separation of variables (SOV) approach. With this approach, we seek "modes" of the solution, f_ι, which satisfy the boundary conditions on f and which decouple into the form

    f_ι = X_ι(x) T_ι(t).    (1.3)

(No summation is implied over the Greek index ι, pronounced "iota".) If we can find enough nontrivial (nonzero) solutions of (1.2) which fit this form, we will be able to reconstruct a solution of (1.2) which also satisfies the initial conditions as a superposition of these modes. Inserting (1.3) into (1.2), we find that

    X_ι T_ι'' = σ^2 X_ι'' T_ι \;\Rightarrow\; \frac{T_ι''}{σ^2 T_ι} = \frac{X_ι''}{X_ι} = -κ_ι^2 \;\Rightarrow\; \begin{cases} X_ι'' = -κ_ι^2 X_ι, \\ T_ι'' = -κ_ι^2 σ^2 T_ι, \end{cases}

where the constant κ_ι^2 must be independent of both x and t due to the center equation combined with the facts that X_ι = X_ι(x) and T_ι = T_ι(t). The two systems at right are solved with

    X_ι = A_ι \cos(κ_ι x) + B_ι \sin(κ_ι x), \qquad T_ι = C_ι \cos(κ_ι σ t) + D_ι \sin(κ_ι σ t).

Due to the boundary condition at x = 0, it must follow that A_ι = 0. Due to the boundary condition at x = L, it follows for most κ_ι that B_ι = 0 as well, and thus f_ι(x,t) = 0 for all x, t. However, for certain specific values of κ_ι (specifically, for κ_ι = ιπ/L for integer values of ι), X_ι satisfies the homogeneous boundary condition at x = L even for nonzero values of B_ι.

We now attempt to form a superposition of the nontrivial f_ι that solves the initial conditions given for f. Defining ĉ_ι = B_ι C_ι and d̂_ι = B_ι D_ι, we take

    f = \sum_{ι=1}^{∞} f_ι = \sum_{ι=1}^{∞} \left[ ĉ_ι \sin(κ_ι x) \cos(κ_ι σ t) + d̂_ι \sin(κ_ι x) \sin(κ_ι σ t) \right],

where κ_ι = ιπ/L. The coefficients ĉ_ι and d̂_ι may be determined by enforcing the initial conditions:

    f(x, t=0) = c(x) = \sum_{ι=1}^{∞} ĉ_ι \sin(κ_ι x),
    \frac{\partial f}{\partial t}(x, t=0) = d(x) = \sum_{ι=1}^{∞} d̂_ι κ_ι σ \sin(κ_ι x).

Noting the orthogonality of the sine functions¹, we multiply both of the above equations by \sin(ῑπx/L) and integrate over the domain x ∈ [0, L], which results in

    \int_0^L c(x) \sin\left(\frac{ιπx}{L}\right) dx = ĉ_ι \frac{L}{2} \;\Rightarrow\; ĉ_ι = \frac{2}{L} \int_0^L c(x) \sin\left(\frac{ιπx}{L}\right) dx,
    \int_0^L d(x) \sin\left(\frac{ιπx}{L}\right) dx = d̂_ι κ_ι σ \frac{L}{2} \;\Rightarrow\; d̂_ι = \frac{2}{κ_ι σ L} \int_0^L d(x) \sin\left(\frac{ιπx}{L}\right) dx,

for ι = 1, 2, 3, ... The ĉ_ι and d̂_ι are referred to as the discrete sine transforms of c(x) and d(x) on the interval x ∈ [0, L].

¹This orthogonality principle states that, for ι, ῑ integers,

    \int_0^L \sin\left(\frac{ιπx}{L}\right) \sin\left(\frac{ῑπx}{L}\right) dx = \begin{cases} L/2 & ι = ῑ, \\ 0 & \text{otherwise.} \end{cases}


Thus, the solutions of (1.2) which satisfy the boundary conditions and initial conditions may be found as a linear combination of modes of the simple decoupled form given in (1.3), which may be determined analytically. But what if, for example, σ is a function of x? Then we can no longer represent the mode shapes analytically with sines and cosines. In such cases, we can still seek decoupled modes of the form f = X(x)T(t), but we now must determine the X(x) numerically. Consider again an equation of the form

    X'' = -\frac{ω^2}{σ^2} X \qquad \text{with} \quad X = 0 \ \text{at} \ x = 0 \ \text{and} \ x = L.

Consider now the values of X only at N+1 discrete locations ("grid points") located at x = jΔx for j = 0, ..., N, where Δx = L/N. Note that, at these grid points, the second derivative may be approximated by

    \left. \frac{\partial^2 X}{\partial x^2} \right|_{x_j} \approx \left[ \frac{X_{j+1} - X_j}{Δx} - \frac{X_j - X_{j-1}}{Δx} \right] \bigg/ Δx = \frac{X_{j+1} - 2X_j + X_{j-1}}{(Δx)^2},

where, for clarity, we have switched to the notation X_j = X(x_j). By the boundary conditions, X_0 = X_N = 0. The differential equation at each of the N-1 grid points on the interior may be approximated by the relation

    σ_j^2 \, \frac{X_{j+1} - 2X_j + X_{j-1}}{(Δx)^2} = -ω^2 X_j,

which may be written in the matrix form

    \frac{1}{(Δx)^2} \begin{pmatrix} -2σ_1^2 & σ_1^2 & & & \\ σ_2^2 & -2σ_2^2 & σ_2^2 & & \\ & \ddots & \ddots & \ddots & \\ & & σ_{N-2}^2 & -2σ_{N-2}^2 & σ_{N-2}^2 \\ & & & σ_{N-1}^2 & -2σ_{N-1}^2 \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N-2} \\ X_{N-1} \end{pmatrix} = \left[ -ω^2 \right] \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N-2} \\ X_{N-1} \end{pmatrix},

or, more simply, as

    A ξ = λ ξ,

where λ = -ω². This is exactly the matrix eigenvalue problem discussed at the beginning of this section, and can be solved in Matlab for the eigenvalues λ_i and the corresponding mode shapes ξ_i using the eig command, as illustrated in the code wire.m provided at the class web site. Note that, for constant σ and a sufficiently large number of gridpoints, the first several eigenvalues returned by wire.m closely match the analytic solution (κ_ι = ιπ/L, with λ_ι = -κ_ι²σ²), and the first several eigenvectors ξ_ι are of the same shape as the analytic mode shapes \sin(κ_ι x).
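The code wire.m is not reproduced here, but a minimal sketch along the same lines (assuming, for the test, constant σ = 1 and L = 1; the variable names are ours) might look like:

% wire_sketch.m
% A minimal sketch (not the class code wire.m itself) of the discretized
% eigenproblem above, assuming constant sigma = 1 and L = 1.
N = 100;  L = 1;  dx = L/N;
s2 = ones(N-1,1);                  % sigma_j^2 at the interior grid points
A = (diag(-2*s2) + diag(s2(1:end-1),1) + diag(s2(2:end),-1)) / dx^2;
[V, D] = eig(A);                   % columns of V are the mode shapes
lam = sort(diag(D), 'descend');    % eigenvalues; lowest modes first
[sqrt(-lam(1:5)), (1:5)'*pi/L]     % sqrt(-lambda) closely matches iota*pi/L
% end wire_sketch.m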

1.3.2 Eigenvector decomposition

If all of the eigenvectors ξ_i of a given matrix A are linearly independent, then any vector x may be uniquely decomposed in terms of contributions parallel to each eigenvector such that

    x = S x̃, \qquad \text{where} \quad S = \begin{pmatrix} | & | & & | \\ ξ_1 & ξ_2 & \cdots & ξ_n \\ | & | & & | \end{pmatrix}.

Such a change of variables often simplifies a dynamical equation significantly. For example, if a given dynamical system may be written in the form ẋ = Ax where the eigenvalues of A are distinct, then, by substitution of x = S x̃ and multiplication from the left by S^{-1}, we may write

    \dot{x̃} = Λ x̃, \qquad \text{where} \quad Λ = S^{-1} A S = \begin{pmatrix} λ_1 & & & \\ & λ_2 & & \\ & & \ddots & \\ & & & λ_n \end{pmatrix}.

In this representation, as Λ is diagonal, the dynamical evolution of each mode of the system is completely decoupled (i.e., \dot{x̃}_1 = λ_1 x̃_1, \dot{x̃}_2 = λ_2 x̃_2, etc.).
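A minimal Matlab sketch (A here is an arbitrary example with distinct eigenvalues) confirms that this change of variables diagonalizes the system matrix:

% decouple_demo.m
% Verify that S^{-1} A S is diagonal when the eigenvectors of A are
% linearly independent (A is an arbitrary example, eigenvalues -1 and -2).
A = [0 1; -2 -3];
[S, D] = eig(A);           % columns of S are the eigenvectors of A
Lambda = S \ A * S         % = S^{-1} A S; diagonal, to round-off
% end decouple_demo.m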

1.4 Matrix norms

The norm of A, denoted \|A\|, is defined by

    \|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|},

where \|x\| is the Euclidean norm of the vector x. In other words, \|A\| is an upper bound on the amount the matrix A can "amplify" the vector x:

    \|Ax\| \le \|A\| \, \|x\| \quad \forall x.

In order to compute the norm of A, we square both sides of the expression for \|A\|, which results in

    \|A\|^2 = \max_{x \ne 0} \frac{\|Ax\|^2}{\|x\|^2} = \max_{x \ne 0} \frac{x^T (A^T A) x}{x^T x}.

The peak value of the expression on the right is attained when (A^T A)x = λ_{max} x. In other words, the matrix norm may be computed by taking the square root of the maximum eigenvalue of the matrix (A^T A). In order to compute a matrix norm in Matlab, one may type in sqrt(max(eig(A'*A))), or, more simply, just call the command norm(A).

1.5 Condition number

Let Ax = b and consider a small perturbation to the right-hand side. The perturbed system is written

    A(x + δx) = (b + δb) \;\Rightarrow\; A δx = δb.

We are interested in bounding the change δx in the solution x resulting from the change δb to the right-hand side b. Note by the definition of the matrix norm that

    \|x\| \ge \|b\| / \|A\| \qquad \text{and} \qquad \|δx\| \le \|A^{-1}\| \, \|δb\|.

Dividing the equation on the right by the equation on the left, we see that the relative change in x is bounded by the relative change in b according to the following:

    \frac{\|δx\|}{\|x\|} \le c \, \frac{\|δb\|}{\|b\|},

where c = \|A\| \, \|A^{-1}\| is known as the condition number of the matrix A. If the condition number is small, the matrix is referred to as well conditioned, meaning that small errors in the values of b on the right-hand side will result in small errors in the computed values of x. However, if the condition number is large, then the matrix is poorly conditioned, and the solution x computed for the problem Ax = b is often unreliable.

In order to compute the condition number of a matrix in Matlab, one may type in norm(A)*norm(inv(A)), or, more simply, just call the command cond(A).
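A minimal illustration of the effect of poor conditioning (using Matlab's built-in Hilbert matrix, a classic poorly conditioned example):

% cond_demo.m
% Effect of conditioning on the solution of Ax = b: a tiny perturbation
% of b produces a greatly amplified change in x when cond(A) is large.
n = 10;  A = hilb(n);  x = ones(n,1);  b = A*x;
db = 1e-10 * randn(n,1);         % a tiny perturbation to the RHS
dx = A\(b + db) - A\b;           % the resulting change in the solution
cond(A)                          % the condition number: very large
norm(dx) / norm(x)               % relative change in x: greatly amplified
% end cond_demo.m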


Chapter 2

Solving linear equations

Systems of linear algebraic equations may be represented efficiently in the form Ax = b. For example,

    u + 2v + 3w = 6
    u       -  w = 0
    u +  v +  w = 3

    \Longleftrightarrow \quad \underbrace{\begin{pmatrix} 1 & 2 & 3 \\ 1 & 0 & -1 \\ 1 & 1 & 1 \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} u \\ v \\ w \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} 6 \\ 0 \\ 3 \end{pmatrix}}_{b}.

Given an A and b, one often needs to solve such a system for x. Systems of this form need to be solved frequently, so these notes will devote substantial attention to numerical methods which solve this type of problem efficiently.

2.1 Introduction to the solution of Ax = b

If A is diagonal, the solution may be found by inspection:

    \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 6 \\ 8 \end{pmatrix} \;\Rightarrow\; \begin{cases} x_1 = 1, \\ x_2 = 2, \\ x_3 = 2. \end{cases}

If A is upper triangular, the problem is almost as easy. Consider the following:

    \begin{pmatrix} 2 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 5 \\ 2 \end{pmatrix}.

The solution for x_3 may be found by inspection: x_3 = 2/2 = 1.
Substituting this result into the equation implied by the second row, the solution for x_2 may then be found:

    2 x_2 + \underbrace{x_3}_{1} = 5 \;\Rightarrow\; x_2 = (5 - 1)/2 = 2.

Finally, substituting the resulting values for x_2 and x_3 into the equation implied by the first row, the solution for x_1 may then be found:

    2 x_1 + \underbrace{x_2}_{2} + \underbrace{x_3}_{1} = 9 \;\Rightarrow\; x_1 = (9 - 2 - 1)/2 = 3.


Thus, upper triangular matrices naturally lend themselves to solution via a march up from the bottom row. Similarly, lower triangular matrices naturally lend themselves to solution via a march down from the top row.

Note that if there is a zero in the i-th element on the main diagonal when attempting to solve a triangular system, we are in trouble. There are either:

• zero solutions (if, when solving the i-th equation, one reaches an equation like 1 = 0, which cannot be made true for any value of x_i), or there are

• infinitely many solutions (if, when solving the i-th equation, one reaches the truism 0 = 0, in which case the corresponding element x_i can take any value).

The matrix A is called singular in such cases. When studying science and engineering problems on a computer, generally one should first identify nonsingular problems before attempting to solve them numerically.

To solve a general nonsingular matrix problem Ax = b, we would like to reduce the problem to a triangular form, from which the solution may be found by the marching procedure illustrated above. Such a reduction to triangular form is called Gaussian elimination. We first illustrate the Gaussian elimination procedure by example, then present the general algorithm.

2.1.1 Example of solution approach

Consider the problem

    \begin{pmatrix} 0 & 2 & -1 \\ 1 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 6 \\ 5 \end{pmatrix}.

Considering this matrix equation as a collection of rows, each representing a separate equation, we can perform simple linear combinations of the rows and still have the same system. For example, we can perform the following manipulations:

1. Interchange the first two rows:

    \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & -1 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \\ 5 \end{pmatrix}

2. Multiply the first row by 2 and subtract from the last row:

    \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & -1 \\ 0 & -2 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \\ -7 \end{pmatrix}

3. Add second row to third:

    \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \\ -6 \end{pmatrix}

This is an upper triangular matrix, so we can solve this by inspection (as discussed earlier). Alternatively (and equivalently), we continue to combine rows until the matrix becomes the identity; this is referred to as the Gauss-Jordan process.


4. Divide the last row by -2, then add the result to the second row:

    \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 4 \\ 3 \end{pmatrix}

5. Divide the second row by 2:

    \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \\ 3 \end{pmatrix}

6. Subtract second and third rows from the first:

    \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \;\Rightarrow\; x = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.

The letters x_1, x_2, and x_3 clutter this process, so we may devise a shorthand augmented matrix in which we can conduct the same series of operations without the extraneous symbols:

    \left(\begin{array}{ccc|c} 0 & 2 & -1 & 1 \\ 1 & 1 & 1 & 6 \\ 2 & 0 & 1 & 5 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 2 & -1 & 1 \\ 0 & -2 & -1 & -7 \end{array}\right) \to \cdots \to \left(\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{array}\right),

where the left block begins as A and ends as I, and the right column begins as b and ends as the solution x.

An advantage of this notation is that we can solve it simultaneously for several right-hand sides b^i comprising a right-hand-side matrix B. A particular case of interest is the several columns that make up the identity matrix. Example: construct three vectors x^1, x^2, and x^3 such that

    A x^1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad A x^2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad A x^3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.

This problem is solved (for the A of the previous example) as follows:

    \left(\begin{array}{ccc|ccc} 0 & 2 & -1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 2 & 0 & 1 & 0 & 0 & 1 \end{array}\right) \to \cdots \to \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1/4 & -1/2 & 3/4 \\ 0 & 1 & 0 & 1/4 & 1/2 & -1/4 \\ 0 & 0 & 1 & -1/2 & 1 & -1/2 \end{array}\right),

where the left block begins as A and ends as I, and the right block begins as B = I and ends as X. Defining

    X = \begin{pmatrix} | & | & | \\ x^1 & x^2 & x^3 \\ | & | & | \end{pmatrix},

we have AX = I by construction, and thus X = A^{-1}.

The above procedure is time consuming, but is just a sequence of mechanical steps. In the following section, the procedure is generalized so that we can teach the computer to do the work for us.
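This construction is easy to check in Matlab; a minimal sketch using the A of the example above (in practice, of course, one rarely needs the inverse explicitly):

% inverse_check.m
% Solve AX = I column-by-column, as in the example above.
A = [0 2 -1; 1 1 1; 2 0 1];
X = A \ eye(3)         % solves the three systems A x^i = e^i at once
A * X                  % recovers the identity, to round-off
% end inverse_check.m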


2.2 Gaussian elimination algorithm

This section discusses the Gaussian elimination algorithm to find the solution x of the system Ax = b, where A and b are given. The following notation is used for the augmented matrix:

    (A \,|\, b) = \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{array}\right).

2.2.1 Forward sweep

1. Eliminate everything below a_{11} (the first "pivot") in the first column:
Let m_{21} = -a_{21}/a_{11}. Multiply the first row by m_{21} and add to the second row.
Let m_{31} = -a_{31}/a_{11}. Multiply the first row by m_{31} and add to the third row.
... etc. The modified augmented matrix soon has the form

    \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ 0 & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & a_{n2} & \cdots & a_{nn} & b_n \end{array}\right),

where all elements except those in the first row have been changed.

2. Repeat step 1 for the new (smaller) augmented matrix, i.e., the submatrix excluding the first row and column. The pivot for the second column is a_{22}.

... etc. The modified augmented matrix eventually takes the form

    \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ 0 & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & a_{nn} & b_n \end{array}\right).

Note that at each stage we need to divide by the "pivot", so it is pivotal that the pivot is nonzero. If it is not, exchange the row with the zero pivot with one of the lower rows that has a nonzero element in the pivot column. Such a procedure is referred to as "partial pivoting". We can always complete the Gaussian elimination procedure with partial pivoting if the matrix we are solving is nonsingular, i.e., if the problem we are solving has a unique solution.

2.2.2 Back substitution

The process of back substitution is straightforward. Initiate with

    b_n \leftarrow b_n / a_{nn}.

Starting from i = n-1 and working back to i = 1, update the other b_i as follows:

    b_i \leftarrow \left( b_i - \sum_{k=i+1}^{n} a_{ik} b_k \right) \Big/ a_{ii},

where summation notation is not implied. Once finished, the vector b contains the solution x of the original system Ax = b.


2.2.3 Operation count

Let's now determine how expensive the Gaussian elimination algorithm is.

Operation count for the forward sweep:

                                             divisions    multiplications    additions
    To eliminate a_{21}:                     1            n                  n
    To eliminate the entire first column:    n-1          (n-1)n             (n-1)n
    To eliminate a_{32}:                     1            n-1                n-1
    To eliminate the entire second column:   n-2          (n-2)(n-1)         (n-2)(n-1)
    ... etc.

The total number of divisions is thus

    \sum_{k=1}^{n-1} (n-k).

The total number of multiplications is

    \sum_{k=1}^{n-1} (n-k+1)(n-k).

The total number of additions is

    \sum_{k=1}^{n-1} (n-k+1)(n-k).

Two useful identities here are

    \sum_{k=1}^{n} k = \frac{n(n+1)}{2} \qquad \text{and} \qquad \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6},

both of which may be verified by induction. Applying these identities, we see that:

    The total number of divisions is: n(n-1)/2.
    The total number of multiplications is: (n^3 - n)/3.
    The total number of additions is: (n^3 - n)/3.

⟹ For large n, the total number of flops for the forward sweep is thus O(2n^3/3).

Operation count for the back substitution:

    The total number of divisions is: n.
    The total number of multiplications is: \sum_{k=1}^{n-1} (n-k) = n(n-1)/2.
    The total number of additions is: \sum_{k=1}^{n-1} (n-k) = n(n-1)/2.

⟹ For large n, the total number of flops for the back substitution is thus O(n^2).

Thus, we see that the forward sweep is much more expensive than the back substitution for large n.


2.2.4 Matlab implementation

The following code is an efficient Matlab implementation of Gaussian elimination. The "partial pivoting" checks necessary to ensure success of the approach have been omitted for simplicity, and are left as an exercise for the motivated reader. Thus, the following algorithm may fail even on nonsingular problems if pivoting is required. Note that, unfortunately, Matlab refers to the elements of A as A(i,j), though the accepted convention is to use lowercase letters for the elements of matrices.

% gauss.m
% Solves the system Ax=b for x using Gaussian elimination without
% pivoting. The matrix A is replaced by the m_ij and U on exit, and
% the vector b is replaced by the solution x of the original system.

% -------------- FORWARD SWEEP --------------
for j = 1:n-1,               % For each column j<n,
  for i = j+1:n,             % loop through the elements a_ij below the pivot a_jj.
    % Compute m_ij. Note that we can store m_ij in the location
    % (below the diagonal!) that a_ij used to sit without disrupting
    % the rest of the algorithm, as a_ij is set to zero by construction
    % during this iteration.
    A(i,j) = - A(i,j) / A(j,j);
    % Add m_ij times the upper triangular part of the j-th row of
    % the augmented matrix to the i-th row of the augmented matrix.
    A(i,j+1:n) = A(i,j+1:n) + A(i,j) * A(j,j+1:n);
    b(i) = b(i) + A(i,j) * b(j);
  end
end
% ------------ BACK SUBSTITUTION ------------
b(n) = b(n) / A(n,n);        % Initialize the backwards march
for i = n-1:-1:1,
  % Note that an inner product is performed at the multiplication
  % sign here, accounting for all values of x already determined.
  b(i) = ( b(i) - A(i,i+1:n) * b(i+1:n) ) / A(i,i);
end
% end gauss.m


2.2.5 LU decomposition

We now show that the forward sweep of the Gaussian elimination algorithm inherently constructs an LU decomposition of A. Through several row operations, the matrix A is transformed by the Gaussian elimination procedure into an upper triangular form, which we will call U. Furthermore, each row operation (which is simply the multiplication of one row by a number and adding the result to another row) may also be denoted by the premultiplication of A by a simple transformation matrix E_{ij}. It turns out that the transformation matrix which does the trick at each step is simply an identity matrix with the (i,j)-th component replaced by m_{ij}. For example, if we define

    E_{21} = \begin{pmatrix} 1 & & & \\ m_{21} & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix},

then E_{21} A means simply to multiply the first row of A by m_{21} and add it to the second row, which is exactly the first step of the Gaussian elimination process. To "undo" the multiplication of a matrix by E_{21}, we simply multiply the first row of the resulting matrix by -m_{21} and add it to the second row, so that

    E_{21}^{-1} = \begin{pmatrix} 1 & & & \\ -m_{21} & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}.

The forward sweep of Gaussian elimination (without pivoting) involves simply the premultiplication of A by several such matrices:

    \underbrace{E_{n,n-1} (E_{n,n-2} E_{n-1,n-2}) \cdots (E_{n2} \cdots E_{42} E_{32}) (E_{n1} \cdots E_{31} E_{21})}_{E} \, A = U.

To "undo" the effect of this whole string of multiplications, we may simply multiply by the inverse of E, which, it is easily verified, is given by

    E^{-1} = \begin{pmatrix} 1 & & & & \\ -m_{21} & 1 & & & \\ -m_{31} & -m_{32} & 1 & & \\ \vdots & \vdots & & \ddots & \\ -m_{n1} & -m_{n2} & \cdots & -m_{n,n-1} & 1 \end{pmatrix}.

Defining L = E^{-1} and noting that EA = U, it follows at once that A = LU.


We thus see that both L and U may be extracted from the matrix that has replaced A after the forward sweep of the Gaussian elimination procedure. The following Matlab code constructs these two matrices from the value of A returned by gauss.m.

% extract_LU.m
% Extract the LU decomposition of A from the modified version of A
% returned by gauss.m. Note that this routine does not make efficient
% use of memory; it is for demonstration purposes only.

% First, construct L with 1's on the diagonal and the negative of the
% factors m_ij used during the Gaussian elimination below the diagonal.
L = eye(n);
for j = 1:n-1,
  for i = j+1:n,
    L(i,j) = -A(i,j);
  end
end
% U is simply the upper-triangular part of the modified A.
U = zeros(n);
for i = 1:n,
  for j = i:n,
    U(i,j) = A(i,j);
  end
end
% end extract_LU.m

As opposed to the careful implementation of the Gaussian elimination procedure in gauss.m, in which the entire operation is done "in place" in memory, the code extract_LU.m is not efficient with memory. It takes the information stored in the array A and spreads it out over two arrays L and U, constructing the LU decomposition of A. In codes for which memory storage is a limiting factor, this is probably not a good idea. Leaving the nontrivial components of L and U in a single array A, though it makes the code a bit difficult to interpret, is an effective method of saving memory space.

Once we have the LU decomposition of A (e.g., once we run the full Gaussian elimination procedure once), we can solve a system with a new right-hand side, Ax = b', with a very inexpensive algorithm. We note that we may first solve an intermediate problem

    L y = b'

for the vector y. As L is (lower) triangular, this system can be solved inexpensively (O(n^2) flops). Once y is found, we may then solve the system

    U x = y

for the vector x. As U is (upper) triangular, this system can also be solved inexpensively (O(n^2) flops). Substituting the second equation into the first, and noting that A = LU, we see that what we have solved by this two-step process is equivalent to solving the desired problem Ax = b', but at a significantly lower cost (O(2n^2) flops instead of O(2n^3/3) flops) because we were able to leverage the LU decomposition of the matrix A. Thus, if you are going to get several right-hand-side vectors b', with A remaining fixed, it is a very good idea to reuse the LU decomposition of A rather than repeatedly running the Gaussian elimination routine from scratch.
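A minimal sketch of this two-step solve, reusing the L and U produced by extract_LU.m (bprime here denotes an arbitrary new right-hand side, and the two triangular marches are written out explicitly):

% lu_solve_sketch.m
% Solve A x = bprime via L y = bprime, then U x = y, reusing L and U.
y = zeros(n,1);
for i = 1:n,                % forward substitution; L has 1's on the diagonal
  y(i) = bprime(i) - L(i,1:i-1) * y(1:i-1);
end
x = zeros(n,1);
for i = n:-1:1,             % back substitution
  x(i) = ( y(i) - U(i,i+1:n) * x(i+1:n) ) / U(i,i);
end
% end lu_solve_sketch.m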


2.2.6 Testing the Gaussian elimination code

The following code tests gauss.m and extract_LU.m with random A and b.

% test_gauss.m
echo on
% This code tests the Gaussian elimination and LU decomposition
% routines. First, create a random A and b.
clear; n=4; A=rand(n); b=rand(n,1); pause
% Recall that A and b are destroyed in gauss.m.
% Let's hang on to them here.
Asave=A; bsave=b;
% Run the Gaussian elimination code to find the solution x of
% Ax=b, and extract the LU decomposition of A.
echo off; gauss; extract_LU; echo on; pause
% Now let's see how good x is. If we did well, the value of Ax
% should be about the same as the value of b.
% Recall that the solution x is returned in b by gauss.m.
x=b; Ax = Asave*x, b=bsave, pause
% Now let's see how good L and U are. If we did well, L should be
% lower triangular with 1's on the diagonal, U should be upper
% triangular, and the value of L*U should be about the same as
% the value of A.
% Note that the product L*U is not done efficiently below, as both
% L and U have structure which is not being leveraged.
L, U, LU=L*U, A=Asave
% end test_gauss.m

2.2.7 Pivoting

As you run the code test_gauss.m on several random matrices A, you may be lulled into a false sense of security that pivoting isn't all that important. I will shatter this dream for you with homework 2. Just because a routine works well on several random matrices does not mean it will work well in general!

As mentioned earlier, any nonsingular system may be solved by the Gaussian elimination procedure if partial pivoting is implemented. Recall that partial pivoting involves simply swapping rows whenever a zero pivot is encountered. This can sometimes lead to numerical inaccuracies, as small (but nonzero) pivots may be encountered by this algorithm. This can lead to subsequent row combinations which involve the difference of two large numbers which are almost equal. On a computer, in which all numbers are represented with only finite precision, taking the difference of two numbers which are almost equal can lead to significant magnification of round-off error. To alleviate this situation, we can develop a procedure which swaps columns, in addition to swapping the rows, to maximize the size of the pivot at each step. One has to bookkeep carefully when swapping columns, because the elements of the solution vector are also swapped.

2.3 Thomas algorithm

This section discusses the Thomas algorithm that finds the solution x of the system Ax = g, where A and g are given, with A assumed to be tridiagonal and diagonally dominant¹. The algorithm is based on the Gaussian elimination algorithm of the previous section, but capitalizes on the structure of A. The following notation is used for the augmented matrix:

    (A \,|\, g) = \left(\begin{array}{ccccc|c} b_1 & c_1 & & & & g_1 \\ a_2 & b_2 & c_2 & & & g_2 \\ & \ddots & \ddots & \ddots & & \vdots \\ & & a_{n-1} & b_{n-1} & c_{n-1} & g_{n-1} \\ & & & a_n & b_n & g_n \end{array}\right).

2.3.1 Forward sweep

1. Eliminate everything below b_1 (the first pivot) in the first column:
Let m_2 = -a_2/b_1. Multiply the first row by m_2 and add to the second row.

2. Repeat step 1 for the new (smaller) augmented matrix, as in the Gaussian elimination procedure. The pivot for the second column is b_2.

... Continue iterating to eliminate the a_i until the modified augmented matrix takes the form

    \left(\begin{array}{ccccc|c} b_1 & c_1 & & & & g_1 \\ & b_2 & c_2 & & & g_2 \\ & & \ddots & \ddots & & \vdots \\ & & & b_{n-1} & c_{n-1} & g_{n-1} \\ & & & & b_n & g_n \end{array}\right).

Again, at each stage it is pivotal that the pivot is nonzero. A good numerical discretization of a differential equation will result in matrices A which are diagonally dominant, in which case the pivots are always nonzero and we may proceed without worrying about the tedious (and numerically expensive) chore of pivoting.

¹Diagonal dominance means that the magnitude of the element on the main diagonal in each row is larger than the sum of the magnitudes of the other elements in that row.


2.3.2 Back substitution

As before, initiate the back substitution with

g_n ← g_n / b_n.

Starting from i = n−1 and working back to i = 1, update the other gᵢ as follows:

gᵢ ← (gᵢ − cᵢ gᵢ₊₁) / bᵢ,

where summation notation is not implied. Once finished, the vector g contains the solution x of the original system Ax = g.

2.3.3 Operation count

Let's now determine how expensive the Thomas algorithm is.

Operation count for the forward sweep:
• To eliminate a₂: 1 division, 2 multiplications, and 2 additions.
• To eliminate the entire subdiagonal: (n−1) divisions, 2(n−1) multiplications, and 2(n−1) additions.
⇒ For large n, the total number of flops for the forward sweep is thus O(5n).

Operation count for the back substitution:
The total number of divisions is: n.
The total number of multiplications is: (n−1).
The total number of additions is: (n−1).
⇒ For large n, the total number of flops for the back substitution is thus O(3n).

This is a lot cheaper than Gaussian elimination!


2.3.4 Matlab implementation

The following code is an efficient Matlab implementation of the Thomas algorithm. The matrix A is assumed to be diagonally dominant, so that the need for pivoting is obviated.

% thomas.m
% Solves the system Ax=g for x using the Thomas algorithm,
% assuming A is tridiagonal and diagonally dominant.  It is
% assumed that {a,b,c,g} are previously-defined vectors of
% length n, where a is the subdiagonal, b is the main diagonal,
% and c is the superdiagonal of the matrix A.  The vectors
% {a,b,c} are replaced by the m(i) and U on exit, and the vector
% g is replaced by the solution x of the original system.
% -------------- FORWARD SWEEP --------------
for j = 1:n-1,        % For each column j<n:
%  Compute m(j+1).  Note that we can put m(j+1) in the location
%  (below the diagonal) that a(j+1) used to sit without disrupting
%  the rest of the algorithm, as a(j+1) is set to zero by construction
%  during this iteration.
   a(j+1) = - a(j+1) / b(j);
%  Add m(j+1) times the upper triangular part of the j'th row of
%  the augmented matrix to the (j+1)'th row of the augmented
%  matrix.
   b(j+1) = b(j+1) + a(j+1) * c(j);
   g(j+1) = g(j+1) + a(j+1) * g(j);
end
% ------------ BACK SUBSTITUTION ------------
g(n) = g(n) / b(n);
for i = n-1:-1:1,
   g(i) = ( g(i) - c(i) * g(i+1) ) / b(i);
end
% end thomas.m

2.3.5 LU decomposition

The forward sweep of the Thomas algorithm again inherently constructs an LU decomposition of A. This decomposition may (inefficiently) be constructed with the code extract_LU.m used for the Gaussian elimination procedure. Note that most of the elements of both L and U are zero.

Once we have the LU decomposition of A, we can solve a system with a new right-hand side, A x′ = g′, with a two-step procedure as before. The cost of efficiently solving

L y = g′

for the vector y is O(2n) flops (similar to the cost of the back substitution in the Thomas algorithm, but noting that the divisions are not required because the diagonal elements are unity), and the cost of efficiently solving

U x′ = y

for the vector x′ is O(3n) flops (the same as the cost of back substitution in the Thomas algorithm). Thus, solving A x′ = g′ by reusing the LU decomposition of A costs O(5n) flops, whereas solving it by the Thomas algorithm costs O(8n) flops. This is a measurable savings which should not be discounted.
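A minimal sketch of this two-step solve is given below (an addition to these notes, not one of the class codes). It assumes thomas.m has already been run, so that the vector a holds the multipliers mᵢ in its subdiagonal slots and b and c hold the two diagonals of U; g_prime is a hypothetical new right-hand side for the same matrix A.

% solve_tri_LU.m (an added sketch; not one of the class codes)
% Reuses the LU decomposition left behind by thomas.m: a(2:n) holds the
% multipliers m(i), and b,c hold the main and upper diagonals of U.
% g_prime is a new right-hand side for the same tridiagonal matrix A.
y(1) = g_prime(1);
for j = 1:n-1,
   y(j+1) = g_prime(j+1) + a(j+1)*y(j);     % solve L y = g': O(2n) flops
end
x(n) = y(n) / b(n);
for i = n-1:-1:1,
   x(i) = ( y(i) - c(i)*x(i+1) ) / b(i);    % solve U x' = y: O(3n) flops
end
% end solve_tri_LU.m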

2.3.6 Testing the Thomas code

The following code tests thomas.m with extract_LU.m for random A and g.

% test_thomas.m
echo on
%  This code tests the implementation of the Thomas algorithm.
%  First, create a random a, b, c, and g.
clear; n=5; a=rand(n,1); b=rand(n,1); c=rand(n,1); g=rand(n,1);
%  Construct A.  Note that this is an insanely inefficient way of
%  storing the nonzero elements of A, and is done for demonstration
%  purposes only.
A = diag(a(2:n),-1) + diag(b,0) + diag(c(1:n-1),1);
%  Hang on to A and g for later use.
Asave=A; gsave=g;
%  Run the Thomas algorithm to find the solution x of Ax=g,
%  put the diagonals back into matrix form, and extract L and U.
echo off; thomas;
A = diag(a(2:n),-1) + diag(b,0) + diag(c(1:n-1),1);
extract_LU; echo on; pause;
%  Now let's see how good x is.  If we did well, the value of Ax
%  should be about the same as the value of g.
%  Recall that the solution x is returned in g by thomas.m.
x=g, Ax = Asave*x, g=gsave, pause;
%  Now let's see how good L and U are.  If we did well, L should be
%  lower triangular with 1's on the diagonal, U should be upper
%  triangular, and the value of L*U should be about the same as
%  the value of A.  Note the structure of L and U.
L, U, LU=L*U, A=Asave
% end test_thomas.m


2.4 Parallelization

The Thomas algorithm, which is O(n), is very efficient and, thus, widely used.

Many modern computers achieve their speed by finding many things to do at the same time. This is referred to as parallelization of an algorithm. However, each step of the Thomas algorithm depends upon the previous step. For example, iteration j = 1 of the forward sweep must be complete before iteration j = 2 can begin. Thus, the Thomas algorithm does not parallelize, which is unfortunate. In many problems, however, one can find several different systems of the form Ax = g and work on them all simultaneously to achieve the proper "load balancing" of a well-parallelized code. One must always use great caution when parallelizing a code to not accidentally reference a variable before it is actually calculated.

2.5 Review

We have seen that Gaussian elimination is a general tool that may be used whenever A is nonsingular (regardless of the structure of A), and that partial pivoting (exchanging rows) is sometimes required to make it work. Gaussian elimination is also a means of obtaining an LU decomposition of A, with which more efficient solution algorithms may be developed if several systems of the form Ax = b must be solved, where A is fixed and several values of b will be encountered.

Most of the systems we will encounter in our numerical algorithms, however, will not be full. When the equations and unknowns of the system are enumerated in such a manner that the nonzero elements lie only near the main diagonal, resulting in what we call a banded matrix, more efficient solution techniques are available. A particular example of interest is tridiagonal systems, which are amenable to very efficient solution via the Thomas algorithm.


Chapter 3

Solving nonlinear equations

Many problems in engineering require solution of nonlinear algebraic equations. In what follows, we will show that the most popular numerical methods for solving such equations involve linearization, which leads to repeatedly solving linear systems of the form Ax = b. Solution of nonlinear equations always requires iterations. That is, unlike linear systems where, if solutions exist, they can be obtained exactly with Gaussian elimination, with nonlinear equations, only approximate solutions are obtained. However, in principle, the approximations can be improved by increasing the number of iterations.

A single nonlinear equation may be written in the form f(x) = 0. The objective is to find the value(s) of x for which f is zero. In terms of a geometrical interpretation, we want to find the crossing point(s), x = x_opt, where f(x) crosses the x-axis in a plot of f vs. x. Unfortunately, there are no systematic methods to determine when a nonlinear equation f(x) = 0 will have zero, one, or several values of x which satisfy f(x) = 0.

3.1 The Newton-Raphson method for nonlinear root finding

3.1.1 Scalar case

Iterative techniques start from an initial "guess" for the solution and seek successive improvements to this guess. Suppose the initial guess is x = x^(0). Linearize f(x) about x^(0) using the Taylor series expansion

f(x) = f(x^(0)) + (x − x^(0)) f′(x^(0)) + · · ·    (3.1)

Instead of finding the roots of the resulting polynomial of degree ∞, we settle for the root of the approximate polynomial consisting of the first two terms on the right-hand side of (3.1). Let x^(1) be this root, which can be easily obtained (when f′(x^(0)) ≠ 0) by taking f(x) = 0 and solving (3.1) for x, which results in

x^(1) = x^(0) − f(x^(0)) / f′(x^(0)).

Thus, starting with x^(0), the next approximation is x^(1). In general, successive approximations are obtained from

x^(j+1) = x^(j) − f(x^(j)) / f′(x^(j))    for j = 0, 1, 2, . . .    (3.2)


until (hopefully) we obtain a solution that is sufficiently accurate. The geometrical interpretation of the method is shown below.

[Figure: successive tangent-line approximations to f(x), with the iterates x^(2), x^(1), x^(0) marching toward the root.]

The function f at point x^(0) is approximated by a straight line which is tangent to f with slope f′(x^(0)). The intersection of this line with the x-axis gives x^(1); the function at x^(1) is then approximated by a tangent straight line which intersects the x-axis at x^(2), etc.
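In Matlab, the scalar iteration (3.2) takes only a few lines; the following minimal sketch (an addition to these notes) applies (3.2) to the hypothetical example f(x) = x³ − 2, with the derivative coded inline:

% newt_scalar.m (an added sketch of the scalar iteration (3.2);
% the function f(x)=x^3-2 used here is hypothetical, for illustration)
x = 1;                       % initial guess x^(0)
for j = 1:10,
   f  = x^3 - 2;             % evaluate f(x^(j))
   fp = 3*x^2;               % evaluate f'(x^(j))
   x  = x - f/fp;            % the Newton-Raphson update (3.2)
end
x                            % converges to the root 2^(1/3)
% end newt_scalar.m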

3.1.2 Quadratic convergence

We now show that, once the solution is near the exact value x = x_opt, the Newton-Raphson method converges quadratically. Let x^(j) indicate the solution at the j'th iteration, and consider the Taylor series expansion

$$f(x_{\rm opt}) = 0 = f(x^{(j)}) + (x_{\rm opt} - x^{(j)})\,f'(x^{(j)}) + \frac{(x_{\rm opt} - x^{(j)})^2}{2}\,f''(x^{(j)}) + \cdots$$

If f′(x^(j)) ≠ 0, divide by f′(x^(j)) and rearrange:

$$x^{(j)} - x_{\rm opt} - \frac{f(x^{(j)})}{f'(x^{(j)})} = \frac{(x_{\rm opt} - x^{(j)})^2}{2}\,\frac{f''(x^{(j)})}{f'(x^{(j)})} + \cdots$$

Combining this with the Newton-Raphson formula (3.2) leads to

$$x^{(j+1)} - x_{\rm opt} = \frac{(x^{(j)} - x_{\rm opt})^2}{2}\,\frac{f''(x^{(j)})}{f'(x^{(j)})}. \qquad (3.3)$$

Defining the error at iteration j as ε^(j) ≜ x^(j) − x_opt, the error at the iteration j+1 is related to the error at the iteration j by

$$\varepsilon^{(j+1)} = \frac{f''(x^{(j)})}{2\,f'(x^{(j)})}\,\big(\varepsilon^{(j)}\big)^2.$$

That is, convergence is quadratic. Note that convergence is guaranteed only if the initial guess is fairly close to the exact root; otherwise, the neglected higher-order terms dominate the above expression and we may encounter divergence. If f′ is near zero at the root, we will also have problems.
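The error-squaring behavior of (3.3) is easy to observe numerically; the brief sketch below (an addition to these notes) applies (3.2) to f(x) = x² − 2, whose root is √2, and the number of correct digits roughly doubles at each iteration:

% An added sketch illustrating quadratic convergence: Newton-Raphson
% applied to f(x)=x^2-2, whose root is sqrt(2).
format long; x = 1;
for j = 1:5,
   x   = x - (x^2-2)/(2*x);    % the update (3.2)
   err = abs(x - sqrt(2))      % the error roughly squares each iteration
end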


3.1.3 Multivariable case: systems of nonlinear equations

A system of n nonlinear equations in n unknowns can be written in the general form

fᵢ(x₁, x₂, . . . , xₙ) = 0    for i = 1, 2, . . . , n,

or, more compactly, as f(x) = 0. As usual, subscripts denote vector components and boldface denotes whole vectors. Generalization of the Newton-Raphson method to such systems is achieved by using the multi-dimensional Taylor series expansion:

$$\begin{aligned}
f_i(x_1, x_2, \ldots, x_n) &= f_i(x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}) + (x_1 - x_1^{(0)}) \left.\frac{\partial f_i}{\partial x_1}\right|_{x=x^{(0)}} + (x_2 - x_2^{(0)}) \left.\frac{\partial f_i}{\partial x_2}\right|_{x=x^{(0)}} + \cdots \\
&= f_i(x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}) + \sum_{j=1}^{n} (x_j - x_j^{(0)}) \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(0)}} + \cdots
\end{aligned}$$

where, as in the previous section, superscripts denote the iteration number. Thus, for example, the components of the initial guess vector x^(0) are xᵢ^(0) for i = 1, 2, . . . , n. Let x^(1) be the solution of the above equation with only the linear terms retained and with fᵢ(x₁, x₂, . . . , xₙ) set to zero, as desired. Then

$$\sum_{j=1}^{n} \underbrace{\left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(0)}}}_{a_{ij}^{(0)}} \; \underbrace{\left(x_j^{(1)} - x_j^{(0)}\right)}_{h_j^{(0)}} = -f_i(x^{(0)}) \qquad \text{for } i = 1, 2, \ldots, n. \qquad (3.4)$$

Equation (3.4) constitutes n linear equations for the n components of h^(0) ≜ x^(1) − x^(0), which may be considered as the desired update to x^(0). In matrix notation, we have

$$A^{(0)} h^{(0)} = -f(x^{(0)}), \qquad \text{where } a_{ij}^{(0)} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(0)}}.$$

Note that the elements of the Jacobian matrix A^(0) are evaluated at x = x^(0). The solution at the first iteration, x^(1), is obtained from

x^(1) = x^(0) + h^(0).

Successive approximations are obtained from

$$A^{(k)} h^{(k)} = -f(x^{(k)}), \qquad x^{(k+1)} = x^{(k)} + h^{(k)}, \qquad \text{where } a_{ij}^{(k)} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(k)}}.$$

Note that the elements of the Jacobian matrix A^(k) are evaluated at x = x^(k). As an example, consider the nonlinear system of equations

$$\mathbf{f}(\mathbf{x}) = \begin{pmatrix} x_1^3 + 2\cos x_2 + 1 \\ x_2^2 + 3\sin x_1 + 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

The Jacobian matrix is given by

$$A^{(k)} = \begin{pmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{pmatrix}_{x=x^{(k)}} = \begin{pmatrix} 3\,(x_1^{(k)})^2 & -2\sin x_2^{(k)} \\ 3\cos x_1^{(k)} & 2\,x_2^{(k)} \end{pmatrix}.$$


Let the initial guess be

x^(0) = (x₁^(0), x₂^(0))ᵀ.

The function f and Jacobian A are first evaluated at x = x^(0). Next, the following system of equations is solved for h^(0):

A^(0) h^(0) = −f(x^(0)).

We then update x according to

x^(1) = x^(0) + h^(0).

The function f and Jacobian A are then evaluated at x = x^(1), and we solve

A^(1) h^(1) = −f(x^(1)),    x^(2) = x^(1) + h^(1).

The process continues in an iterative fashion until convergence. A numerical solution to this example nonlinear system is found in the following sections.

3.1.4 Matlab implementation

The following code is a modular Matlab implementation of the Newton-Raphson method. By using a modular programming style, we are most easily able to adapt segments of the code written here for later use on different nonlinear systems.

% newt.m
% Given an initial guess for the solution x and the auxiliary functions
% compute_f.m and compute_A.m to compute a function and the corresponding
% Jacobian, solve a nonlinear system using Newton-Raphson.  Note that f may
% be a scalar function or a system of nonlinear functions of any dimension.
res=1; i=1;
while (res > 1e-12)
   f=compute_f(x); A=compute_A(x);   % Compute function and Jacobian
   res=norm(f);                      % Compute residual
   x_save(i,:)=x';                   % Save x, f, and the residual
   f_save(i,:)=f'; res_save(i,1)=res;
   x=x-A\f;                          % Solve system for next x
   i=i+1;                            % Increment index
end
evals=i-1;
% end newt.m

The auxiliary functions (listed below) are written as functions rather than as simple Matlab scripts, and all variables are passed in as arguments and passed back out via the function calls. Such handshaking between subprograms eases debugging considerably when your programs become large.


function [f] = compute_f(x)
f=[x(1)^3+2*cos(x(2))+1;       % Evaluate the function f(x).
   x(2)^2+3*sin(x(1))+1];
% end function compute_f.m

function [A] = compute_A(x)
A=[3*x(1)^2,    -2*sin(x(2));  % Evaluate the Jacobian A
   3*cos(x(1)),  2*x(2)];      % of the function in compute_f.
% end function compute_A.m

To prevent confusion with other cases presented in these notes, these case-specific functions are stored at the class web site under case-specific names, and must be saved as compute_f.m and compute_A.m before the test code in the following section will run.

3.1.5 Dependence of Newton-Raphson on a good initial guess

The following code tests our implementation of the Newton-Raphson algorithm.

% test_newt.m
% Tests Newton's algorithm for nonlinear root finding.
% First, provide a sufficiently accurate initial guess for the root
clear; format long; x=init_x;
% Find the root with Newton-Raphson and print the convergence
newt; x_save, res_save
% end test_newt.m

function [x] = init_x
x=[1; 1];    % Initial guess for x
% end function init_x.m

As before, the case-specific function to compute the initial guess is stored at the class web site, and must be saved as init_x.m before the test code will run.

Typical results when applying the Newton-Raphson approach to solve a nonlinear system of equations are shown below. It is seen that a "good" ("lucky"?) choice of initial conditions will converge very rapidly to the solution with this scheme. A poor choice will not converge smoothly, and may not converge at all; this is a major drawback to the Newton-Raphson approach. As nonlinear systems do not necessarily have unique solutions, the final root to which the system converges is dependent on the choice of initial conditions. It is apparent in the results shown that, in this example, the solution is not unique.


[Table: convergence history (iteration, x₁, x₂, residual) for a good initial guess, showing very rapid convergence of the residual.]

[Table: convergence history (iteration, x₁, x₂) for a poor initial guess, showing the iterates wandering erratically before eventually settling toward a different root.]

3.2 Bracketing approaches for scalar root finding

It was shown in the previous section that the Newton-Raphson technique is very efficient for finding the solution of both scalar nonlinear equations of the form f(x) = 0 and multivariable nonlinear systems of equations of the form f(x) = 0 when good initial guesses are available. Unfortunately, good initial guesses are not always readily available. When they are not, it is desirable to seek the roots of such equations by more pedestrian means which guarantee success. The techniques we will present in this section, though they can not achieve quadratic convergence, are guaranteed to converge to a solution of a scalar nonlinear equation so long as:

a) the function is continuous, and
b) an initial bracketing pair may be found.

They also have the added benefit that they are based on function evaluations alone (i.e., you don't need to write compute_A.m), which makes them simple to implement. Unfortunately, these techniques are based on a bracketing principle that does not extend readily to multi-dimensional functions.

3.2.1 Bracketing a root

The class of scalar nonlinear functions we will consider are continuous. Our first task is to find a pair of values of x which bracket the root, i.e., we want to find an x_lower and an x_upper such that f(x_lower) and f(x_upper) have opposite signs. This may often be done by hand with a minor amount of trial and error. At times, however, it is convenient to have an automatic procedure to find such a bracketing pair. For example, for functions which have opposite sign for sufficiently large and small arguments, a simple approach is to start with an initial guess for the bracket and then geometrically increase the distance between these points until a bracketing pair is found. This may be implemented with the following codes (or variants thereof):


function [x_lower,x_upper] = find_brack
[x_lower,x_upper] = init_brack;
while compute_f(x_lower) * compute_f(x_upper) > 0
   interval=x_upper-x_lower;
   x_upper=x_upper+0.5*interval; x_lower=x_lower-0.5*interval;
end
% end function find_brack.m

The auxiliary functions used by this code may be written as

function [x_lower,x_upper] = init_brack
x_lower=-1; x_upper=1;     % Initialize a guess for the bracket.
% end function init_brack.m

function [f] = compute_f(x)
f=x^3 + x^2 + 3*x - 10;    % Evaluate the function f(x).
% end function compute_f.m

To prevent confusion with other cases presented in these notes, these case-specific functions are stored at the class web site under case-specific names, to be saved as init_brack.m and compute_f.m.

3.2.2 Refining the bracket - bisection

Once a bracketing pair is found, the task that remains is simply to refine this bracket until a desired degree of precision is obtained. The most straightforward approach is to repeatedly chop the interval in half, keeping those two values of x that bracket the root. The convergence of such an algorithm is linear: at each iteration, the bounds on the solution are reduced by exactly a factor of 2.

The bisection algorithm may be implemented with the following code:

% bisection.m
f_lower=compute_f(x_lower); f_upper=compute_f(x_upper); evals=2;
interval=x_upper-x_lower; i=1;
while interval > x_tol
   x = (x_upper + x_lower)/2;
   f = compute_f(x); evals=evals+1;
   res_save(i,1)=norm(f);
   plot(x,f,'ko');
   if f_lower*f < 0
      x_upper = x;
      f_upper = f;
   else
      x_lower = x;
      f_lower = f;
   end
   interval=interval/2;
   i=i+1; pause;
end
x = (x_upper + x_lower)/2;
% end bisection.m


3.2.3 Refining the bracket - false position

A technique that is usually slightly faster than the bisection technique, but that retains the safety of maintaining and refining a bracketing pair, is to compute each new point by a numerical approximation to the Newton-Raphson formula (3.2) such that

$$x_{\rm new} = x_{\rm lower} - \frac{f(x_{\rm lower})}{\Delta f / \Delta x}, \qquad (3.5)$$

where the quantity Δf/Δx is a simple difference approximation to the slope of the function f:

$$\frac{\Delta f}{\Delta x} = \frac{f(x_{\rm upper}) - f(x_{\rm lower})}{x_{\rm upper} - x_{\rm lower}}.$$

As shown in class, this approach sometimes stalls, so it is useful to put in an ad hoc check to keep the progress moving.

% false_pos.m
f_lower=compute_f(x_lower); f_upper=compute_f(x_upper); evals=2;
interval=x_upper-x_lower; i=1;
while interval > x_tol
   fprime = (f_upper-f_lower)/interval;
   x = x_lower - f_lower / fprime;
%  Ad hoc check:  stick with bisection technique if updates
%  provided by false position technique are too small.
   tol = interval/10;
   if ((x-x_lower) < tol | (x_upper-x) < tol)
      x = (x_lower+x_upper)/2;
   end
   f = compute_f(x); evals = evals+1;
   res_save(i,1)=norm(f);
   plot([x_lower x_upper],[f_lower f_upper],'k--',x,f,'ko');
   if f_lower*f < 0
      x_upper = x;
      f_upper = f;
   else
      x_lower = x;
      f_lower = f;
   end
   interval=x_upper-x_lower;
   i=i+1; pause;
end
x = (x_upper + x_lower)/2;
% end false_pos.m


3.2.4 Testing the bracketing algorithms

The following code tests the bracketing algorithms for scalar nonlinear root finding and compares them with the Newton-Raphson algorithm. Convergence of the false position algorithm is usually found to be faster than bisection, and both are safer (and slower) than Newton-Raphson. Don't forget to update init_x.m and compute_A.m appropriately (i.e., with the case-specific versions from the class web site) in order to get Newton-Raphson to work. You might have to fiddle with init_x.m in order to obtain convergence of the Newton-Raphson algorithm.

% test_brack.m
% Tests the bisection, false position, and Newton routines for
% finding the root of a scalar nonlinear function.
% First, compute a bracket of the root.
clear; [x_lower,x_upper] = find_brack;
% Prepare to make some plots of the function on this interval
xx=[x_lower : (x_upper-x_lower)/100 : x_upper];
for i=1:101; yy(i)=compute_f(xx(i)); end
x1=[x_lower x_upper]; y1=[0 0];
x_tol = 0.0001;               % Set tolerance desired for x.
x_lower_save=x_lower; x_upper_save=x_upper;
fprintf('\n Now testing the bisection algorithm.\n');
figure(1); clf; plot(xx,yy,'k-',x1,y1,'b-'); hold on; grid;
title('Convergence of the bisection routine')
bisection
evals, hold off
fprintf(' Now testing the false position algorithm.\n');
figure(2); clf; plot(xx,yy,'k-',x1,y1,'b-'); hold on; grid;
title('Convergence of false position routine')
x_lower=x_lower_save;
x_upper=x_upper_save;
false_pos
evals, hold off
fprintf(' Now testing the Newton-Raphson algorithm.\n');
figure(3); clf; plot(xx,yy,'k-'); hold on; grid;
title('Convergence of the Newton-Raphson routine')
x=init_x;
newt
% Finally, an extra bit of plotting stuff to show what happens.
x_lower_t=min(x_save); x_upper_t=max(x_save);
xx=[x_lower_t : (x_upper_t-x_lower_t)/100 : x_upper_t];
for i=1:101; yy(i)=compute_f(xx(i)); end
plot(xx,yy,'k-',x_save(1),f_save(1),'ko'); pause;
for i=1:size(x_save,1)-1
   plot([x_save(i) x_save(i+1)],[f_save(i) 0],'k--', ...
        x_save(i+1),f_save(i+1),'ko')
   pause;
end;
evals, hold off
% end test_brack.m


Chapter 4

Interpolation

One often encounters the problem of constructing a smooth curve which passes through discrete data points. This situation arises when developing differentiation and integration strategies, as we will discuss in the following chapters, as well as when simply estimating the value of a function between known values. In this handout, we will present two techniques for this procedure: polynomial (Lagrange) interpolation and cubic spline interpolation.

Note that the process of interpolation passes a curve exactly through each data point. This is sometimes exactly what is desired. However, if the data is from an experiment and has any appreciable amount of uncertainty, it is best to use a least-squares technique to fit a low-order curve in the general vicinity of several data points. This technique minimizes a weighted sum of the square distance from each data point to this curve without forcing the curve to pass through each data point individually. Such an approach generally produces a much smoother curve (and a more physically-meaningful result) when the measured data is noisy.
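As a brief aside (added here for illustration), Matlab's built-in polyfit and polyval implement exactly this sort of least-squares fit; the noisy data below is hypothetical:

% An added sketch of least-squares fitting of noisy (hypothetical) data
% with Matlab's built-in polyfit/polyval.
x_data = [0:0.5:5];
y_data = x_data.^2 + 0.3*randn(size(x_data));   % noisy samples of x^2
p  = polyfit(x_data,y_data,2);    % least-squares fit of a quadratic
xx = [0:0.05:5];
plot(x_data,y_data,'ko',xx,polyval(p,xx),'k-')  % fit need not hit each point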

4.1 Lagrange interpolation

Suppose we have a set of n+1 data points {xᵢ, yᵢ}. The process of Lagrange interpolation involves simply fitting an n'th degree polynomial (i.e., a polynomial with n+1 degrees of freedom) exactly through this data. There are two ways of accomplishing this: solving a system of n+1 simultaneous equations for the n+1 coefficients of this polynomial, or constructing the polynomial directly in factored form. We will present both of these techniques.

4.1.1 Solving n+1 equations for the n+1 coefficients

Consider the polynomial

P(x) = a₀ + a₁ x + a₂ x² + · · · + aₙ xⁿ.

At each point xᵢ, this polynomial must take the value yᵢ, i.e.,

yᵢ = P(xᵢ) = a₀ + a₁ xᵢ + a₂ xᵢ² + · · · + aₙ xᵢⁿ    for i = 0, 1, 2, . . . , n.


In matrix form, we may write this system as

$$\underbrace{\begin{pmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
\vdots & & & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{pmatrix}}_{V}
\underbrace{\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}}_{a}
=
\underbrace{\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix}}_{y}$$

This system is of the form V a = y, where V is commonly referred to as Vandermonde's matrix, and may be solved for the vector a containing the coefficients aᵢ of the desired polynomial. Unfortunately, Vandermonde's matrix is usually quite poorly conditioned, and thus this technique of finding an interpolating polynomial is unreliable at best.
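A minimal sketch of this approach (added here; x_data and y_data are assumed to be previously-defined column vectors of equal length) makes the poor conditioning easy to observe:

% An added sketch of the Vandermonde approach; x_data and y_data are
% assumed to be previously-defined column vectors of equal length.
n = length(x_data)-1;
V = zeros(n+1,n+1);
for j = 0:n,
   V(:,j+1) = x_data.^j;    % column j+1 of V holds the values x_i^j
end
a = V\y_data;               % solve Va=y for the polynomial coefficients
cond(V)                     % typically enormous, even for modest n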

4.1.2 Constructing the polynomial directly

Consider the n'th degree polynomial given by the factored expression

$$L_k(x) = \lambda_k (x-x_0)(x-x_1)\cdots(x-x_{k-1})(x-x_{k+1})\cdots(x-x_n) = \lambda_k \prod_{\substack{i=0 \\ i \ne k}}^{n} (x - x_i),$$

where λ_k is a constant yet to be determined. This expression (by construction) is equal to zero if x is equal to any of the data points except x_k. In other words, L_k(xᵢ) = 0 for i ≠ k. Choosing λ_k to normalize this polynomial at x = x_k, we define

$$\lambda_k = \Bigg[\prod_{\substack{i=0 \\ i \ne k}}^{n} (x_k - x_i)\Bigg]^{-1}, \qquad (4.1)$$

which results in the very useful relationship

$$L_k(x_i) = \delta_{ik} = \begin{cases} 1, & i = k, \\ 0, & i \ne k. \end{cases}$$

Scaling this result, the polynomial y_k L_k(x) (no summation implied) passes through zero at every data point x = xᵢ except at x = x_k, where it has the value y_k. Finally, a linear combination of n+1 of these polynomials,

$$P(x) = \sum_{k=0}^{n} y_k L_k(x),$$

provides an n'th degree polynomial which exactly passes through all of the data points, by construction.

A code which implements this constructive technique to determine the interpolating polynomial is given in the following subsection.

Unfortunately, high-order polynomials tend to wander wildly between the data points, even if the data appears to be fairly regular, as shown in Figure 4.1. Thus, Lagrange interpolation should be thought of as dangerous for anything more than a few data points, and avoided in favor of other techniques, such as spline interpolation.


Figure 4.1: The interpolation problem: a continuous curve representing the function of interest; several points on this curve (circles); and the Lagrange interpolation of these points (dashed).

4.1.3 Matlab implementation

function [p] = lagrange(x,x_data,y_data)
% Computes the Lagrange polynomial p(x) that passes through the given
% data {x_data,y_data}.
n=size(x_data,1);
p=0;
for k=1:n                 % For each data point {x_data(k),y_data(k)}:
   L=1;
   for i=1:k-1
      L=L*(x-x_data(i))/(x_data(k)-x_data(i));   % Compute L_k
   end
   for i=k+1:n
      L=L*(x-x_data(i))/(x_data(k)-x_data(i));
   end
   p = p + y_data(k)*L;   % Add L_k's contribution to p(x)
end
% end lagrange.m

4.2 Cubic spline interpolation

Instead of forcing a high-order polynomial through the entire dataset (which often has the spurious effect shown in Figure 4.1), we may instead construct a continuous, smooth, piecewise cubic function


through the data. We will construct this function to be smooth in the sense of having continuous first and second derivatives at each data point. These conditions, together with the appropriate conditions at each end, uniquely determine a piecewise cubic function through the data which is usually reasonably smooth; we will call this function the cubic spline interpolant.

Defining the interpolant in this manner is akin to deforming a thin piece of wood or metal to pass over all of the data points plotted on a large block of wood and marked with thin nails. In fact, the first definition of the word "spline" in Webster's dictionary is "a thin wood or metal strip used in building construction"; this is where the method takes its name. The elasticity equation governing the deformation f of the spline is

f′′′′ = G,    (4.2)

where G is a force localized near each nail (i.e., with a delta function) which is sufficient to pass the spline through the data. As G is nonzero only in the immediate vicinity of each nail, such a spline takes an approximately piecewise cubic shape between the data points. Thus, between the data points, we have

f′′′′ = 0,   f′′′ = C₁,   f′′ = C₁ x + C₂,
f′ = (C₁/2) x² + C₂ x + C₃,   and   f = (C₁/6) x³ + (C₂/2) x² + C₃ x + C₄.

4.2.1 Constructing the cubic spline interpolant

Let fᵢ(x) denote the cubic in the interval xᵢ ≤ x ≤ xᵢ₊₁, and let f(x) denote the collection of all the cubics for the entire range x₀ ≤ x ≤ xₙ. As noted above, f″ᵢ varies linearly with x between each data point. At each data point, by (4.2) with G being a linear combination of delta functions, we have:

(a) continuity of the function f, i.e., fᵢ₋₁(xᵢ) = fᵢ(xᵢ) = f(xᵢ) = yᵢ,
(b) continuity of the first derivative f′, i.e., f′ᵢ₋₁(xᵢ) = f′ᵢ(xᵢ) = f′(xᵢ), and
(c) continuity of the second derivative f″, i.e., f″ᵢ₋₁(xᵢ) = f″ᵢ(xᵢ) = f″(xᵢ).

We now describe a procedure to determine an f which satisfies conditions (a) and (c) by construction, in a manner analogous to the construction of the Lagrange interpolant in §4.1.2, and which satisfies condition (b) by setting up and solving the appropriate system of equations for the value of f″ at each data point xᵢ.

To begin the constructive procedure for determining f, note that on each interval xᵢ ≤ x ≤ xᵢ₊₁, for i = 0, 1, . . . , n−1, we may write a linear equation for f″ᵢ(x) as a function of its value at the endpoints, f″(xᵢ) and f″(xᵢ₊₁), which are (as yet) undetermined. The following form (which is linear in x) fits the bill by construction:

for i � �� �� � � � � n � �� we may write a linear equation for f ��i �x� as a function of its value at theendpoints� f ���xi� and f ���xi���� which are �as yet� undetermined� The following form �which islinear in x� ts the bill by construction�

f ��i �x� � f ���xi�x� xi��xi � xi��

� f ���xi���x� xi

xi�� � xi�

Note that this rst degree polynomial is in fact just a Lagrange interpolation of the two datapoints fxi� f ���xi�g and fxi��� f ���xi���g� By construction� condition �c� is satis ed� Integrating thisequation twice and de ning $i � xi�� � xi� it follows that

$$f'_i(x) = -f''(x_i)\,\frac{(x_{i+1}-x)^2}{2\Delta_i} + f''(x_{i+1})\,\frac{(x-x_i)^2}{2\Delta_i} + C_1,$$
$$f_i(x) = f''(x_i)\,\frac{(x_{i+1}-x)^3}{6\Delta_i} + f''(x_{i+1})\,\frac{(x-x_i)^3}{6\Delta_i} + C_1 x + C_2.$$


The undetermined constants of integration are obtained by matching the end conditions

fᵢ(xᵢ) = yᵢ   and   fᵢ(xᵢ₊₁) = yᵢ₊₁.

A convenient way of constructing the linear and constant terms in the expression for fᵢ(x) in such a way that the desired end conditions are met is by writing fᵢ(x) in the form

$$f_i(x) = \frac{f''(x_i)}{6}\left[\frac{(x_{i+1}-x)^3}{\Delta_i} - \Delta_i\,(x_{i+1}-x)\right] + \frac{f''(x_{i+1})}{6}\left[\frac{(x-x_i)^3}{\Delta_i} - \Delta_i\,(x-x_i)\right] + y_i\,\frac{x_{i+1}-x}{\Delta_i} + y_{i+1}\,\frac{x-x_i}{\Delta_i}, \quad \text{where } x_i \le x \le x_{i+1}. \qquad (4.3)$$

By construction, condition (a) is satisfied. Finally, an expression for f′ᵢ(x) may now be found by differentiating this expression for fᵢ(x), which gives

$$f'_i(x) = \frac{f''(x_i)}{6}\left[-\frac{3(x_{i+1}-x)^2}{\Delta_i} + \Delta_i\right] + \frac{f''(x_{i+1})}{6}\left[\frac{3(x-x_i)^2}{\Delta_i} - \Delta_i\right] + \frac{y_{i+1}}{\Delta_i} - \frac{y_i}{\Delta_i}.$$

The second derivative of f at each node, f″(xᵢ), is still undetermined. A system of equations from which the f″(xᵢ) may be found is obtained by imposing condition (b), which is achieved by setting

f′ᵢ(xᵢ) = f′ᵢ₋₁(xᵢ)   for i = 1, 2, . . . , n−1.

Substituting appropriately from the above expression for f′ᵢ(x), noting that Δᵢ = xᵢ₊₁ − xᵢ, leads to

$$\frac{\Delta_{i-1}}{6}\,f''(x_{i-1}) + \frac{\Delta_{i-1}+\Delta_i}{3}\,f''(x_i) + \frac{\Delta_i}{6}\,f''(x_{i+1}) = \frac{y_{i+1}-y_i}{\Delta_i} - \frac{y_i-y_{i-1}}{\Delta_{i-1}} \qquad (4.4)$$

for i = 1, 2, . . . , n−1. This is a diagonally-dominant tridiagonal system of n−1 equations for the n+1 unknowns f″(x₀), f″(x₁), . . . , f″(xₙ). We find the two remaining equations by prescribing conditions on the interpolating function at each end. We will consider three types of end conditions:

• Parabolic run-out:
  f″(x₀) = f″(x₁),
  f″(xₙ) = f″(xₙ₋₁);    (4.5a)

• Free run-out (also known as natural splines):
  f″(x₀) = 0,
  f″(xₙ) = 0;    (4.5b)

• Periodic end conditions:
  f″(x₀) = f″(xₙ₋₁),
  f″(x₁) = f″(xₙ).    (4.5c)

Equation (4.4) may be taken together with equation (4.5a), (4.5b), or (4.5c) (making the appropriate choice for the end conditions depending upon the problem at hand) to give us n+1 equations for the n+1 unknowns f″(xᵢ). This set of equations is then solved for the f″(xᵢ), which thereby ensures that condition (b) is satisfied. Once this system is solved for the f″(xᵢ), the cubic spline interpolant follows immediately from equation (4.3).


[Table 4.1 lists eleven data points {xᵢ, yᵢ}, with xᵢ running from −10 to 10; the numeric values are those generated by input_data.m in §4.2.2.]

Table 4.1: Data points from which we want to reproduce (via interpolation) a continuous function. These data points all lie near the familiar curve generated by the function sin(x)/x.

Note that, when equation (4.4) is taken together with parabolic or free run-out at the ends, a tridiagonal system results which can be solved efficiently with the Thomas algorithm. When equation (4.4) is taken together with periodic end conditions, a tridiagonal circulant system (which is not diagonally dominant) results. A code which solves these systems to determine the cubic spline interpolation with any of the above three end conditions is given in the following subsection.

The results of applying the cubic spline interpolation formula to the set of data given in Table 4.1 are shown in Figure 4.2. All three end conditions on the cubic spline do reasonably well, but the Lagrange interpolation is again spurious. Note that applying periodic end conditions on the spline in this case (which is not periodic) leads to a non-physical "wiggle" in the interpolated curve near the left end; in general, the periodic end conditions should be reserved for those cases for which the function being interpolated is indeed periodic. The parabolic run-out simply places a parabola between x₀ and x₁, whereas the free run-out tapers the curvature down to zero near x₀. Both end conditions provide reasonably smooth interpolations.


Figure 4.2: Various interpolations of the data of Table 4.1: the data (circles), the Lagrange interpolation, the cubic spline (parabolic run-out), the cubic spline (free run-out), and the cubic spline (periodic).


4.2.2 Matlab implementation

% spline_setup.m
% Determines the quantity g=f'' for constructing a cubic spline
% interpolation of a function.  Note that this algorithm calls
% thomas.m, which was developed earlier in the course, and assumes
% the x_data is already sorted in ascending order.
% Compute the size of the problem.
n=size(x_data,1);
% Set up the delta(i) = x(i+1)-x(i) for i=1:n-1.
delta(1:n-1)=x_data(2:n)-x_data(1:n-1);
% Set up and solve a tridiagonal system for the g at each data point.
for i=2:n-1
   a(i)=delta(i-1)/6;
   b(i)=(delta(i-1)+delta(i))/3;
   c(i)=delta(i)/6;
   g(i)=(y_data(i+1)-y_data(i))/delta(i) - ...
        (y_data(i)-y_data(i-1))/delta(i-1);
end
if end_conditions==1        % Parabolic run-out
   b(1)=1; c(1)=-1; g(1)=0;
   b(n)=1; a(n)=-1; g(n)=0;
   thomas;                  % <--- solve system
elseif end_conditions==2    % Free run-out ("natural" spline)
   b(1)=1; c(1)=0;  g(1)=0;
   b(n)=1; a(n)=0;  g(n)=0;
   thomas;                  % <--- solve system
elseif end_conditions==3    % Periodic end conditions
   a(1)=-1; b(1)=1; c(1)=0;  g(1)=0;
   a(n)=0;  b(n)=1; c(n)=-1; g(n)=0;
%  Note: the following is an inefficient way to solve this circulant
%  (but NOT diagonally dominant!) system via full-blown Gaussian
%  elimination.
   A = diag(a(2:n),-1) + diag(b,0) + diag(c(1:n-1),1);
   A(1,n-1)=a(1); A(n,2)=c(n);
   g=A\g';                  % <--- solve system
end
% end spline_setup.m

%-----------------------------------------------------------------


function [f] = spline_interp(x,x_data,y_data,g,delta)
n=size(x_data,1);
% Find that value of i such that x_data(i) <= x <= x_data(i+1)
for i=1:n-1
   if x_data(i+1) >= x, break, end
end
% Compute the cubic spline approximation of the function.
f=g(i)  /6*((x_data(i+1)-x)^3/delta(i)-delta(i)*(x_data(i+1)-x)) + ...
  g(i+1)/6*((x-x_data(i))^3/delta(i)-delta(i)*(x-x_data(i)))     + ...
  (y_data(i)*(x_data(i+1)-x) + y_data(i+1)*(x-x_data(i))) / delta(i);
% end spline_interp.m

%-----------------------------------------------------------------

% test_interp.m
clear; clf;
[x_data, y_data]=input_data;
plot(x_data,y_data,'ko'); axis([-11 11 -0.4 1.2]);
hold on; pause;
x_lower=min(x_data); x_upper=max(x_data);
xx=[x_lower : (x_upper-x_lower)/200 : x_upper];
for i=1:201,
   y_lagrange(i)=lagrange(xx(i),x_data,y_data);
end
plot(xx,y_lagrange,'b--'); pause;
for end_conditions=1:3,   % Set the end conditions you want to try here.
   spline_setup;
   for i=1:201,
      y_spline(i)=spline_interp(xx(i),x_data,y_data,g,delta);
   end
   plot(xx,y_spline); pause;
end
% end test_interp.m

%-----------------------------------------------------------------

function [x_data,y_data] = input_data
x_data=[-10:2:10]';
y_data=sin(x_data+eps)./(x_data+eps);
% end input_data.m


4.2.3 Tension splines

For some special cases, cubic splines aren't even smooth enough. In such cases, it is helpful to put "tension" on the spline to straighten out the curvature between the "nails" (data points). In the limit of large tension, the interpolant becomes almost piecewise linear.

Tensioned splines obey the differential equation

f′′′′ − σ² f′′ = G,

where σ is the tension of the spline. This leads to the following relationships between the data points:

[f″ − σ²f]″ = 0,   [f″ − σ²f]′ = C₁,   [f″ − σ²f] = C₁ x + C₂.

Solving the ODE on the right leads to an equation of the form

f = −(1/σ²)(C₁ x + C₂) + C₃ e^{σx} + C₄ e^{−σx}.

Proceeding with a constructive process to satisfy condition (a) as discussed in §4.2.1, we assemble the linear and constant terms of f″ − σ²f such that

$$f''_i(x) - \sigma^2 f_i(x) = \left[f''(x_i) - \sigma^2 y_i\right]\frac{x - x_{i+1}}{x_i - x_{i+1}} + \left[f''(x_{i+1}) - \sigma^2 y_{i+1}\right]\frac{x - x_i}{x_{i+1} - x_i}.$$

Similarly, we assemble the exponential terms in the solution of this ODE for f in a constructive manner such that condition (c) is satisfied. Rewriting the exponentials as sinh functions, the desired solution may be written

$$f_i(x) = \frac{1}{\sigma^2}\Big\{ -\big[f''(x_i)-\sigma^2 y_i\big]\frac{x_{i+1}-x}{\Delta_i} - \big[f''(x_{i+1})-\sigma^2 y_{i+1}\big]\frac{x-x_i}{\Delta_i} + f''(x_i)\,\frac{\sinh \sigma(x_{i+1}-x)}{\sinh \sigma\Delta_i} + f''(x_{i+1})\,\frac{\sinh \sigma(x-x_i)}{\sinh \sigma\Delta_i} \Big\}, \quad \text{where } x_i \le x \le x_{i+1}. \qquad (4.6)$$

Differentiating once and applying condition (b) leads to the tridiagonal system

$$\left[\frac{1}{\Delta_{i-1}} - \frac{\sigma}{\sinh \sigma\Delta_{i-1}}\right] f''(x_{i-1}) + \left[\frac{\sigma \cosh \sigma\Delta_{i-1}}{\sinh \sigma\Delta_{i-1}} - \frac{1}{\Delta_{i-1}} + \frac{\sigma \cosh \sigma\Delta_i}{\sinh \sigma\Delta_i} - \frac{1}{\Delta_i}\right] f''(x_i) + \left[\frac{1}{\Delta_i} - \frac{\sigma}{\sinh \sigma\Delta_i}\right] f''(x_{i+1}) = \sigma^2\left[\frac{y_{i+1}-y_i}{\Delta_i} - \frac{y_i-y_{i-1}}{\Delta_{i-1}}\right]. \qquad (4.7)$$

The tridiagonal system (4.7) can be set up and solved exactly as was done with (4.4), even though the coefficients have a slightly more complicated form. The tensioned-spline interpolant is then given by (4.6).
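A minimal sketch of the interior rows of (4.7) is given below (an addition to these notes, patterned after spline_setup.m; the tension sigma, x_data, and y_data are assumed already defined, and the end conditions and the call to thomas.m are handled as in spline_setup.m):

% An added sketch of the interior rows of (4.7), patterned after
% spline_setup.m; sigma, x_data, and y_data are assumed already defined.
n=size(x_data,1);
delta(1:n-1)=x_data(2:n)-x_data(1:n-1);
for i=2:n-1
   a(i)=1/delta(i-1) - sigma/sinh(sigma*delta(i-1));
   b(i)=sigma*cosh(sigma*delta(i-1))/sinh(sigma*delta(i-1)) - 1/delta(i-1) ...
       +sigma*cosh(sigma*delta(i))  /sinh(sigma*delta(i))   - 1/delta(i);
   c(i)=1/delta(i)   - sigma/sinh(sigma*delta(i));
   g(i)=sigma^2*((y_data(i+1)-y_data(i))/delta(i) - ...
                 (y_data(i)-y_data(i-1))/delta(i-1));
end
% As sigma -> 0, these coefficients recover (a multiple of) those of (4.4).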

4.2.4 B-splines

It is interesting to note that we may write the cubic spline interpolant in a similar form to how we constructed the Lagrange interpolant, that is,

$$f(x) = \sum_{k=0}^{n} y_k\, b_k(x),$$


where the b_k(x) correspond to the various cubic spline interpolations of the Kronecker deltas such that b_k(xᵢ) = δ_ik, as discussed in §4.1.2 for the functions L_k(x). The b_k(x) in this representation are referred to as the basis functions, and are found to have localized support (in other words, b_k(x) ≈ 0 for large |x − x_k|).

By relaxing some of the continuity constraints, we may confine each of the basis functions to have compact support (in other words, we can set b_k(x) = 0 exactly for |x − x_k| > R for some R). With such functions, it is easier both to compute the interpolations themselves and to project the interpolated function onto a different mesh of gridpoints. The industry of computer animation makes heavy use of efficient algorithms for this type of interpolation extended to three-dimensional problems.


Chapter 5

Minimization

5.1 Motivation

It is often the case that one needs to minimize a scalar function of one or several variables. Three cases of particular interest which motivate the present chapter are described here.

5.1.1 Solution of large linear systems of equations

Consider the n × n problem Ax = b, where n is so large that an O(n³) algorithm such as Gaussian elimination is out of the question, and the problem can not be put into such a form that A has a banded structure. In such cases, it is sometimes useful to construct a function J(x) such that, once (approximately) minimized via an iterative technique, the desired condition Ax = b is (approximately) satisfied. If A is symmetric positive definite (i.e., if a_kj = a_jk and all eigenvalues of A are positive), then we can accomplish this by defining

$$J(x) = \tfrac{1}{2}\, x^T A x - b^T x = \tfrac{1}{2}\, x_i\, a_{ij}\, x_j - b_i\, x_i.$$

Requiring that A be symmetric positive definite insures that J → ∞ for |x| → ∞ in all directions, and thus that a minimum point indeed exists. (We will extend this approach to more general problems at the end of this chapter.) Differentiating J with respect to an arbitrary component of x, we find that

$$\frac{\partial J}{\partial x_k} = \tfrac{1}{2}\left(\delta_{ik}\, a_{ij}\, x_j + x_i\, a_{ij}\, \delta_{jk}\right) - b_i\, \delta_{ik} = a_{kj}\, x_j - b_k.$$

The unique minimum of J(x) is characterized by

$$\frac{\partial J}{\partial x_k} = a_{kj}\, x_j - b_k = 0, \qquad \text{or} \qquad \nabla J = Ax - b = 0.$$

Thus, the solution of large linear systems of the form Ax = b (where A is symmetric positive definite) may be found by minimization of J(x).
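As a brief numerical check of this equivalence, the sketch below (an addition to these notes) minimizes J by repeatedly stepping down the gradient ∇J = Ax − b; the symmetric positive definite matrix, b, and the fixed step size alpha are all illustrative choices, not from the class codes:

% An added sketch: minimize J(x)=x'*A*x/2-b'*x by stepping downhill
% along grad J = A*x-b.  A, b, and alpha here are illustrative choices.
n=20; R=randn(n); A=R'*R+n*eye(n);   % a symmetric positive definite A
b=randn(n,1); x=zeros(n,1);
alpha=1/norm(A);                     % small enough step for stability
for iter=1:1000,
   x = x - alpha*(A*x-b);            % step opposite the gradient of J
end
norm(A*x-b)                          % small: minimizing J has solved Ax=b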

5.1.2 Solution of nonlinear systems of equations

Recall from §3.1 that the Newton-Raphson method was an effective technique to find the root (when one exists) of a nonlinear system of equations f(x) = 0 when a sufficiently-accurate initial guess is available. When such a guess is not available, an alternative technique is to examine the square of the norm of the vector f:

$$J(x) = [f(x)]^T f(x).$$

Note that this quantity is never negative, so any point x for which f(x) = 0 minimizes J(x). Thus, seeking a minimum of this J(x) with respect to x might result in an x such that f(x) = 0. However, there are quite likely many minimum points of J(x) (at which ∇J = 0), only some of which (if any!) will correspond to f(x) = 0. Root finding in systems of nonlinear equations is very difficult; though this method has significant drawbacks, variants of this method are really about the best one can do when one doesn't have a good initial guess for the solution.
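A brief sketch of this idea (added here) uses Matlab's built-in fminsearch, a derivative-free minimizer, applied to the example system of §3.1.3 via compute_f.m; the starting point is an arbitrary choice:

% An added sketch: minimize J(x)=f(x)'*f(x) with Matlab's built-in
% fminsearch, using compute_f.m from Section 3.1.3.
J = @(x) compute_f(x)'*compute_f(x);   % square of the norm of f
x = fminsearch(J,[0; 0]);              % derivative-free minimization
compute_f(x)     % near zero only if the minimum found is actually a root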

5.1.3 Optimization and control of dynamic systems

In control and optimization problems, one can often represent the desired objective mathematically as a "cost function" to be minimized. When the cost function is formulated properly, its minimization with respect to the control parameters results in an effectively-controlled dynamic system. Such cost functions may often be put in the form

$$J(u) = x^T Q\, x + u^T R\, u,$$

where x = x(u) is the state of the system and u is the control. The second term in the above expression measures the amount of control used, and the first term is an observation of the property of interest of the dynamic system which, in turn, is a function of the applied control. When the control is both efficient (small in magnitude) and has the desired effect on the dynamic system, both of these terms are small. Thus, minimization of this cost function with respect to u results in the determination of an effective set of control parameters.

5.2 The Newton-Raphson method for nonquadratic minimization

If a good initial guess is available (near the desired minimum point), minimization of J(x) can be accomplished simply by applying the Newton-Raphson method developed previously to the gradient of J(x), given by the function f(x) ≜ ∇J, in order to find the solution to the equation f(x) = 0. This works just as well for scalar or vector x, and converges quite rapidly. Recall from equation (3.4) that the Newton-Raphson method requires both evaluation of the function f(x), in this case taken to be the gradient of J(x), and the Jacobian of f(x), given in this case by

$$a_{ij}^{(k)} = \left.\frac{\partial f_i}{\partial x_j}\right|_{x=x^{(k)}} = \left.\frac{\partial^2 J}{\partial x_i \partial x_j}\right|_{x=x^{(k)}}.$$

This matrix of second derivatives is referred to as the Hessian of the function J. Unfortunately, for very large N, computation and storage of the Hessian matrix, which has N² elements, can be prohibitively expensive.
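For a small problem, however, the Newton-Raphson minimization step is straightforward; the sketch below (an addition to these notes, using the assumed example J(x) = x₁⁴ + x₂⁴ + (x₁ + x₂)², with its gradient and Hessian coded by hand) iterates x ← x − H⁻¹∇J:

% An added sketch of Newton-Raphson minimization for the assumed example
% J(x)=x(1)^4+x(2)^4+(x(1)+x(2))^2, whose minimum is at the origin.
x=[1; 2];
for k=1:20,
   g=[4*x(1)^3+2*(x(1)+x(2)); 4*x(2)^3+2*(x(1)+x(2))];  % gradient of J
   H=[12*x(1)^2+2, 2;  2, 12*x(2)^2+2];                 % Hessian of J
   x = x - H\g;        % Newton step on the equation grad J = 0
end
x                      % approaches the minimum point [0; 0]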


5.3 Bracketing approaches for minimization of scalar functions

5.3.1 Bracketing a minimum

We now seek a reliable but pedestrian approach to minimize a scalar function J(x) when a good initial guess for the minimum is not available. To do this, we begin with a "bracketing" approach analogous to that which we used for finding the root of a nonlinear scalar equation. Recall from §3.2 that bracketing a root means finding a pair {x_lower, x_upper} for which f(x_lower) and f(x_upper) have opposite signs (so a root must exist between x_lower and x_upper if the function is continuous and bounded).

Analogously, bracketing a minimum means finding a triplet {x₁, x₂, x₃}, with x₂ between x₁ and x₃, and for which J(x₂) < J(x₁) and J(x₂) < J(x₃) (so a minimum must exist between x₁ and x₃ if the function is continuous and bounded). Such an initial bracketing triplet may often be found by a minor amount of trial and error. At times, however, it is convenient to have an automatic procedure to find such a bracketing triplet. For example, for functions which are large and positive for sufficiently large |x|, a simple approach is to start with an initial guess for the bracket and then geometrically scale out each end until a bracketing triplet is found.

Matlab implementation

% find_triplet.m
% Initialize and expand a triplet until the minimum is bracketed.
% Should work if J -> inf as |x| -> inf.
[x1,x2,x3] = init_triplet;
J1=compute_J(x1); J2=compute_J(x2); J3=compute_J(x3);
while (J2>J1)
%  Compute a new point x0 to the left of the triplet
   x0=x1-2*(x2-x1); J0=compute_J(x0);
%  Center new triplet on x1
   x3=x2; J3=J2;
   x2=x1; J2=J1;
   x1=x0; J1=J0;
end
while (J2>J3)
%  Compute new point x4 to the right of the triplet
   x4=x3+2*(x3-x2); J4=compute_J(x4);
%  Center new triplet on x3
   x1=x2; J1=J2;
   x2=x3; J2=J3;
   x3=x4; J3=J4;
end
% end find_triplet.m

Case-specific auxiliary functions are given below:

function [J] = compute_J(x)
J=-cos(x)*sin(3*x)+.01*x^2;   % compute function you are trying to minimize.
plot(x,J,'ko');               % plot a circle at the data point.
% end compute_J.m


[Figure 5.1 consists of two sketches of J(x), labeled "k'th iteration" and "(k+1)'th iteration", each marking the triplet points x₁, x₂, x₃, the trial point x₄, and the interval fractions W, 1−W, and Z.]

Figure 5.1: Naming of points used in the golden section search procedure. {x₁^(k), x₂^(k), x₃^(k)} is referred to as the bracketing triplet at iteration k, where x₂^(k) is between x₁^(k) and x₃^(k), and x₂^(k) is assumed to be closer to x₁^(k) than it is to x₃^(k). A new guess is made at point x₄^(k), and the bracket is refined by retaining those three of the four points which maintain the tightest bracket. The reduction of the interval continues at the following iterations in a self-similar fashion.

function [x1,x2,x3] = init_triplet
x1=-2; x2=0; x3=2;   % Initialize guess for the bracketing triplet
% end init_triplet.m

5.3.2 Refining the bracket - the golden section search

Once the minimum of the non-quadratic function is bracketed, all that remains to be done is to refine these brackets. A simple algorithm to accomplish this, analogous to the bisection technique developed for scalar root finding, is called the golden section search. As illustrated in Figure 5.1, let W be defined as the ratio of the smaller interval to the width of the bracketing triplet {x₁, x₂, x₃}, such that

$$W = \frac{x_2 - x_1}{x_3 - x_1} \quad \Rightarrow \quad 1 - W = \frac{x_3 - x_2}{x_3 - x_1}.$$

We now pick a new trial point x₄ and define

$$Z = \frac{x_4 - x_2}{x_3 - x_1} \quad \Rightarrow \quad x_4 = x_2 + Z\,(x_3 - x_1).$$

There are two possibilities:

if J(x₄) > J(x₂), then {x₁, x₂, x₄} is a new bracketing triplet (as in Figure 5.1), and
if J(x₄) < J(x₂), then {x₂, x₄, x₃} is a new bracketing triplet.

Minimizing the width of this new (refined) bracketing triplet in the worst case, we should take the width of both of these triplets as identical, so that

$$W + Z = 1 - W \quad \Rightarrow \quad Z = 1 - 2W. \qquad (5.1)$$

If the same algorithm is used for the refinement at each iteration k, then a self-similar situation develops in which the ratio W is constant from one iteration to the next, i.e., W^(k) = W^(k+1). Note that, in terms of the quantities at iteration k, we have either


W^(k+1) = Z^(k) / (W^(k) + Z^(k)),  if {x₁^(k), x₂^(k), x₄^(k)} is the new bracketing triplet, or

W^(k+1) = Z^(k) / (1 − W^(k)),  if {x₂^(k), x₄^(k), x₃^(k)} is the new bracketing triplet.

Dropping the superscripts on W and Z, which we assume to be independent of k, and inserting (5.1), both of these conditions reduce to the relation

W² − 3W + 1 = 0,

which (because 0 < W < 1) implies that

W = (3 − √5)/2 ≈ 0.3819660,   1 − W = (√5 − 1)/2 ≈ 0.6180340,   and   Z = √5 − 2 ≈ 0.2360680.

These proportions are referred to as the golden section.

To summarize, the golden section algorithm takes an initial bracketing triplet {x₁^(0), x₂^(0), x₃^(0)}, computes a new data point at x₄^(0) = x₂^(0) + Z (x₃^(0) − x₁^(0)), where Z = 0.2360680, and then:

if J(x₄^(0)) > J(x₂^(0)), the new triplet is {x₁^(1), x₂^(1), x₃^(1)} = {x₁^(0), x₂^(0), x₄^(0)}, or
if J(x₄^(0)) < J(x₂^(0)), the new triplet is {x₁^(1), x₂^(1), x₃^(1)} = {x₂^(0), x₄^(0), x₃^(0)}.

The process continues on the new (refined) bracketing triplet in an iterative fashion until the desired tolerance is reached, such that |x₃ − x₁| < ε. Even if the initial bracketing triplet is not in the ratio of the golden section, repeated application of this algorithm quickly brings the triplet into this ratio as it is refined. Note that convergence is attained linearly: each bound on the minimum is 0.61803 times the previous bound. This is slightly slower than the convergence of the bisection algorithm for nonlinear root finding.

Matlab implementation

% golden.m
% Input assumes {x1,x2,x3} are a bracketing triplet with function
% values {J1,J2,J3}.  On output, x2 is the best guess of the minimum.
Z=sqrt(5)-2;                       % initialize the golden section ratio
if (abs(x2-x1) > abs(x3-x2))       % insure proper ordering
   swap=x1; x1=x3; x3=swap;
   swap=J1; J1=J3; J3=swap;
end
pause;
while (abs(x3-x1) > x_tol)
   x4 = x2 + Z*(x3-x1);            % compute and plot new point
   J4 = compute_J(x4);
   evals = evals+1;
%  Note that a couple of lines are commented out below because some
%  of the data is already in the right place.
   if (J4>J2)
      x3=x1; J3=J1;                % Center new triplet on x2
%     x2=x2; J2=J2;
      x1=x4; J1=J4;
   else
      x1=x2; J1=J2;                % Center new triplet on x4
      x2=x4; J2=J4;
%     x3=x3; J3=J3;
   end
   pause;
end
% end golden.m

The implementation of the golden section search algorithm may be tested as follows:

% test_golden.m
% Tests the golden search routine
clear; find_triplet;      % Initialize a bracket of the minimum.
% Prepare to make some plots of the function over the width of the triplet
xx=[x1 : (x3-x1)/100 : x3];
for i=1:101; yy(i)=compute_J(xx(i)); end
figure(1); clf; plot(xx,yy,'k-'); hold on; grid;
title('Convergence of the golden section search')
plot(x1,J1,'ko'); plot(x2,J2,'ko'); plot(x3,J3,'ko');
x_tol = 0.0001;           % Set desired tolerance for x.
evals=0;                  % Reset a counter for tracking function evaluations.
golden; hold off
x2, J2, evals
% end test_golden.m

5.1.3 Refining the bracket: inverse parabolic interpolation

Recall from our discussion of scalar root finding that, when a function f(x) is "locally linear" (meaning that its shape is well approximated by a linear function), the false position method is an efficient technique to find the root of the function based on function evaluations alone. The false position method is based on the construction of successive linear interpolations of recent function evaluations, taking each new estimate of the root of f(x) as that value of x for which the value of the linear interpolant is zero.

Analogously, when a function J(x) is "locally quadratic" (meaning that its shape is well approximated by a quadratic function), the minimum point of the function may be found via an efficient technique based on function evaluations alone. At the heart of this technique is the construction of successive quadratic interpolations based on recent function evaluations, taking each new estimate of the minimum of J(x) as that value of x for which the value of the quadratic interpolant is minimum. For example, given data points {x1, y1}, {x2, y2}, and {x3, y3}, the quadratic interpolant is given by the Lagrange interpolation formula

$$P(x) = y_1 \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + y_2 \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + y_3 \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)},$$

as described in our earlier discussion of Lagrange interpolation. Setting dP(x)/dx = 0 to find the critical point of this quadratic yields

$$0 = y_1 \frac{2x - x_2 - x_3}{(x_1 - x_2)(x_1 - x_3)} + y_2 \frac{2x - x_1 - x_3}{(x_2 - x_1)(x_2 - x_3)} + y_3 \frac{2x - x_1 - x_2}{(x_3 - x_1)(x_3 - x_2)}.$$


Multiplying by (x1 - x2)(x2 - x3)(x3 - x1) and then solving for x gives the desired value of x which is a critical point of the interpolating quadratic:

$$x = \frac{y_1 (x_2^2 - x_3^2) + y_2 (x_3^2 - x_1^2) + y_3 (x_1^2 - x_2^2)}{2\left[\,y_1 (x_2 - x_3) + y_2 (x_3 - x_1) + y_3 (x_1 - x_2)\,\right]}.$$

Note that the critical point found may be either a minimum point or a maximum point, depending on the relative orientation of the data, but the update will always maintain a valid bracketing triplet.

Matlab implementation

% inv_quad.m
% Finds the minimum or maximum of a function by repeatedly moving to the
% critical point of a quadratic interpolant of three recently-computed data
% points, in an attempt to home in on the critical point of a nonquadratic function.
res=1; i=1;
J1=compute_J(x1); J2=compute_J(x2); J3=compute_J(x3);
while (abs(x3-x1) > x_tol)
  x4 = 0.5*( J1*(x2^2-x3^2) + J2*(x3^2-x1^2) + J3*(x1^2-x2^2) ) / ...
           ( J1*(x2-x3)     + J2*(x3-x1)     + J3*(x1-x2)     );  % compute the critical point
  % The following plotting stuff is done for demonstration purposes only.
  x_lower=min([x1 x2 x3]); x_upper=max([x1 x2 x3]);
  xx=[x_lower : (x_upper-x_lower)/100 : x_upper];
  for i=1:101
    J_lagrange(i)=lagrange(xx(i),[x1 x2 x3],[J1 J2 J3]);
  end
  plot(xx,J_lagrange,'b-');               % plot a curve through the data
  Jinterp=lagrange(x4,[x1 x2 x3],[J1 J2 J3]);   % plot a * at the critical point
  plot(x4,Jinterp,'r*');                  % of the Lagrange interpolant.
  pause;
  J4 = compute_J(x4);      % Compute function J at new point and then proceed as with
  evals = evals+1;         % the golden section search.
  % Note that a couple of lines are commented out below because some
  % of the data is already in the right place.
  if (J4<J2)
    x1=x2; J1=J2;          % Center new triplet on x4
    x2=x4; J2=J4;
    % x3=x3; J3=J3;
  else
    x3=x4; J3=J4;          % Center new triplet on x2
    % x2=x2; J2=J2;
    % x1=x1; J1=J1;
  end
end
% end inv_quad.m

The test_golden.m code is easily modified (by replacing the call to golden.m with a call to inv_quad.m) to test the efficiency of the inverse parabolic algorithm.


5.1.4 Refining the bracket: Brent's method

As with the false position technique for accelerated bracket refinement for the problem of scalar root finding, the inverse parabolic technique can also stall for a variety of scalar functions J(x) one might attempt to minimize.

A hybrid technique, referred to as Brent's method, combines the reliable convergence benefits of the golden section search with the ability of the inverse parabolic interpolation technique to "home in" quickly on the solution when the minimum point is approached. Switching in a reliable fashion from one technique to the other without the possibility of stalling requires a certain degree of heuristics, and results in a rather long code which is not very elegant looking. Don't let this dissuade you, however: the algorithm (when called correctly) is reliable, fast, and requires a minimum number of function evaluations. The algorithm was developed by Brent in 1973 and implemented in Fortran and C by the authors of Numerical Recipes, where it is discussed in further detail. (A Matlab implementation of the algorithm is included at the class web site, with the name brent.m. It is functionally equivalent to golden.m and inv_quad.m, discussed above, and can be called in the same manner.) Of the methods discussed in these notes, Brent's method is the best "all-purpose" scalar minimization tool for a wide variety of applications.
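Since brent.m is written to be functionally equivalent to golden.m and inv_quad.m, a driver in the style of test_golden.m needs only trivial modification to use it. The following sketch assumes the same workspace conventions as above (find_triplet.m initializing {x1,x2,x3} and {J1,J2,J3}, and brent.m refining them in place):

% test_brent.m (sketch; assumes the same conventions as test_golden.m)
clear; find_triplet;     % initialize a bracketing triplet {x1,x2,x3},{J1,J2,J3}
x_tol = 1e-4;            % set desired tolerance for x
evals = 0;               % reset the function evaluation counter
brent;                   % refine the bracket with Brent's method
x2, J2, evals            % best guess of the minimum, and the cost of obtaining it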

5.2 Gradient-based approaches for minimization of multivariable functions

We now seek a reliable technique to minimize a multivariable function J(x) which (a) does not require a good initial guess, (b) is efficient for high-dimensional systems, and (c) does not require computation and storage of the Hessian matrix. As opposed to the problem of root finding in the multivariable case, some very good techniques are available to solve this problem.

A straightforward approach to minimizing a scalar function J of an n-dimensional vector x is to update the vector x iteratively, proceeding at each step in a downhill direction p a distance which minimizes J(x) in this direction. In the simplest such algorithm (referred to as the steepest descent or simple gradient algorithm), the direction p is taken to be the direction r of maximum decrease of the function J(x), i.e., the direction opposite to the gradient vector grad J(x). As the iteration k goes to infinity, this approach should converge to one of the minima of J(x) (if such a minimum exists). Note that, if J(x) has multiple minima, this technique will only find one of the minimum points, and the one it finds (a "local" minimum) might not necessarily be the one with the smallest value of J(x) (the "global" minimum).

Though the above approach is simple, it is often quite slow. As we will show, it is not always the best idea to proceed in the direction of steepest descent of the cost function. A descent direction p^(k) chosen to be a linear combination of the direction of steepest descent r^(k) and the step taken at the previous iteration p^(k-1) is often much more effective. The "momentum" carried by such an approach allows the iteration to turn more directly down narrow valleys without oscillating between one descent direction and another, a phenomenon often encountered when momentum is lacking. A particular choice of the momentum term results in a remarkable orthogonality property amongst the set of various descent directions (namely, p^(k)T A p^(j) = 0 for j not equal to k), and the set of descent directions is then referred to as a conjugate set. Searching in a series of mutually conjugate directions leads to exact convergence of the iterative algorithm in n iterations (assuming a quadratic cost function and no numerical round-off errors).

Footnote: for large n, the accumulating round-off error due to the finite-precision arithmetic of the calculations is significant, so exact convergence in n iterations usually can not be obtained.


In the following, we will develop the steepest descent (§5.2.1) and conjugate gradient (§5.2.2) approaches for quadratic functions first, then discuss their extension to nonquadratic functions (§5.2.4). Though the quadratic and nonquadratic cases are handled essentially identically in most regards, the line minimizations required by the algorithms may be done directly for quadratic functions, but should be accomplished by a more reliable bracketing procedure (e.g., Brent's method) for nonquadratic functions. Exact convergence in n iterations (again, neglecting numerical round-off errors) is possible only in the quadratic case, though nonquadratic functions may also be minimized quite effectively with the conjugate gradient algorithm when the appropriate modifications are made.


[Figure 5.1: Geometry of the minimization problem for quadratic functions: J(x) versus x for a function of one variable, and oblique and top views of J(x1, x2) for a function of two variables. The ellipses in the two figures on the right indicate isosurfaces of constant J.]

5.2.1 Steepest descent for quadratic functions

Consider a quadratic function J(x) of the form

$$J(x) = \tfrac12\, x^T A\, x - b^T x,$$

where A is positive definite. The geometry of this problem is illustrated in Figure 5.1. We will begin at some initial guess x^(0), and move at each step of the iteration k in a downhill direction r^(k) such that

$$x^{(k+1)} = x^{(k)} + \alpha^{(k)} r^{(k)},$$

where alpha^(k) is a parameter for the descent which will be determined. In this manner, we proceed iteratively towards the minimum of J(x). Recalling the form of the gradient grad J derived earlier, define r^(k) as the direction of steepest descent, such that

$$r^{(k)} = -\nabla J\big(x^{(k)}\big) = -\big(A x^{(k)} - b\big) = b - A x^{(k)}.$$

Now that we have figured out the direction in which we will update x^(k), we need to figure out the parameter of descent alpha^(k), which governs the distance we will update x^(k) in this direction. This may be found by minimizing J(x^(k) + alpha^(k) r^(k)) with respect to alpha^(k). Dropping the superscript (k) for the time being for notational clarity, note first that

$$J(x + \alpha r) = \tfrac12 (x + \alpha r)^T A (x + \alpha r) - b^T (x + \alpha r),$$

and thus

$$\frac{\partial J(x + \alpha r)}{\partial \alpha} = \tfrac12\, r^T A (x + \alpha r) + \tfrac12 (x + \alpha r)^T A\, r - b^T r = \alpha\, r^T A\, r + r^T A x - r^T b = \alpha\, r^T A\, r - r^T (b - A x) = \alpha\, r^T A\, r - r^T r.$$

Setting this derivative to zero yields

$$\alpha = \frac{r^T r}{r^T A\, r}.$$


Thus, from the value of x^(k) at each iteration, we can determine explicitly both the direction of descent r^(k) and the parameter alpha^(k) which minimizes J.

Matlab implementation

% sd_quad.m
% Minimize a quadratic function J(x) = (1/2) x'*A*x - b'*x
% using the steepest descent method.
clear res_save x_save; epsilon=1e-8; x=zeros(size(b));
for iter=1:itmax
  r=b-A*x;                                  % determine gradient
  res=r'*r;                                 % compute residual
  res_save(iter)=res; x_save(:,iter)=x;     % save some stuff
  if (res<epsilon | iter==itmax), break, end  % exit yet?
  alpha=res/(r'*A*r);                       % compute alpha
  x=x+alpha*r;                              % update x
end
% end sd_quad.m

Operation count

The operation count for each iteration of the steepest descent algorithm may be determined by inspection of the above code, and is calculated in the following table:

    Operation                       flops
    To compute A*x - b:             O(2n^2)
    To compute r'*r:                O(2n)
    To compute r'*A*r:              O(2n^2)
    To compute x + alpha*r:         O(2n)
    TOTAL:                          O(4n^2)

A cheaper technique for computing such an algorithm is discussed at the end of the next section.

5.2.2 Conjugate gradient for quadratic functions

As discussed earlier, and shown in Figure 5.2, proceeding in the direction of steepest descent at each iteration is not necessarily the most efficient strategy. By so doing, the path of the algorithm can be very jagged: due to the successive line minimizations and the lack of momentum from one iteration to the next, the steepest descent algorithm must tack back and forth 90 degrees at each turn. We now show that, by slight modification of the steepest descent algorithm, we arrive at the vastly improved conjugate gradient algorithm. This improved algorithm retains the correct amount of momentum from one iteration to the next to successfully negotiate functions J(x) with narrow valleys.

Note that in "easy" cases for which the condition number is approximately unity, the level surfaces of J are approximately circular, and convergence with either the steepest descent or the conjugate gradient algorithm will be quite rapid. In poorly conditioned problems, the level surfaces become highly elongated ellipses, and the zig-zag behavior is amplified.

Instead of minimizing in a single search direction at each iteration, as we did for the method of steepest descent, now consider searching simultaneously in m directions, which we will denote


[Figure 5.2: Convergence of the simple gradient and the conjugate gradient algorithms when applied to find minima of two test functions of two scalar control variables x1 and x2 (horizontal and vertical axes): (a) a quadratic J(x1, x2); (b) a non-quadratic J(x1, x2). The starting point and the minimum point are marked in each case. Contours illustrate the level surfaces of the test functions; contours corresponding to the smallest isovalues are solid, those corresponding to higher isovalues are dotted.]

p^(0), p^(1), ..., p^(m-1). Take

$$x^{(m)} = x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)},$$

and note that

$$J\big(x^{(m)}\big) = \frac12 \Big(x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)}\Big)^T A \Big(x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)}\Big) - b^T \Big(x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)}\Big).$$

Taking the derivative of this expression with respect to alpha^(k),

$$\frac{\partial J(x^{(m)})}{\partial \alpha^{(k)}} = \frac12 \big(p^{(k)}\big)^T A \Big(x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)}\Big) + \frac12 \Big(x^{(0)} + \sum_{j=0}^{m-1} \alpha^{(j)} p^{(j)}\Big)^T A\, p^{(k)} - b^T p^{(k)}$$

$$= \alpha^{(k)} \big(p^{(k)}\big)^T A\, p^{(k)} + \big(p^{(k)}\big)^T A\, x^{(0)} - \big(p^{(k)}\big)^T b + \sum_{j=0,\ j \neq k}^{m-1} \alpha^{(j)} \big(p^{(k)}\big)^T A\, p^{(j)}.$$

We seek a technique to select all the p^(j) in such a way that they are orthogonal through A, or conjugate, such that

$$\big(p^{(k)}\big)^T A\, p^{(j)} = 0 \quad \text{for } j \neq k.$$


If we can find such a sequence of p^(j), then we obtain

$$\frac{\partial J(x^{(m)})}{\partial \alpha^{(k)}} = \alpha^{(k)} \big(p^{(k)}\big)^T A\, p^{(k)} + \big(p^{(k)}\big)^T \big(A x^{(0)} - b\big) = \alpha^{(k)} \big(p^{(k)}\big)^T A\, p^{(k)} - \big(p^{(k)}\big)^T r^{(0)},$$

and thus setting this derivative to zero results in

$$\alpha^{(k)} = \frac{\big(p^{(k)}\big)^T r^{(0)}}{\big(p^{(k)}\big)^T A\, p^{(k)}}.$$

The remarkable thing about this result is that it is independent of p^(j) for j not equal to k! Thus, so long as we can find a way to construct a sequence of p^(k) which are all conjugate, then each of these minimizations may be done separately.

The conjugate gradient technique is simply an efficient technique to construct a sequence of p^(k) which are conjugate. It entails just redefining the descent direction p^(k) at each iteration after the first to be a linear combination of the direction of steepest descent, r^(k), and the descent direction at the previous iteration, p^(k-1), such that

$$p^{(k)} = r^{(k)} + \beta\, p^{(k-1)} \qquad \text{and} \qquad x^{(k+1)} = x^{(k)} + \alpha\, p^{(k)},$$

where beta and alpha are given by

$$\beta = \frac{\big(r^{(k)}\big)^T r^{(k)}}{\big(r^{(k-1)}\big)^T r^{(k-1)}}, \qquad \alpha = \frac{\big(r^{(k)}\big)^T r^{(k)}}{\big(p^{(k)}\big)^T A\, p^{(k)}}.$$

Verification that this choice of beta results in conjugate directions, and that this choice of alpha is equivalent to the one mentioned previously (minimizing J in the direction p^(k) from the point x^(k)), involves a straightforward (but lengthy) proof by induction. The interested reader may find this proof in Golub and van Loan (1996).

Matlab implementation

% cg_quad.m
% Minimize a quadratic function J = (1/2) x'*A*x - b'*x
% using the conjugate gradient method.
clear res_save x_save; epsilon=1e-8; x=zeros(size(b));
for iter=1:itmax
  r=b-A*x;                                  % determine gradient
  res=r'*r;                                 % compute residual
  res_save(iter)=res; x_save(:,iter)=x;     % save some stuff
  if (res<epsilon | iter==itmax), break, end  % exit yet?
  if (iter==1)                              % compute update direction
    p = r;                                  % set up a S.D. step
  else
    beta = res / res_save(iter-1);          % compute momentum term
    p = r + beta*p;                         % set up a C.G. step
  end
  alpha=res/(p'*A*p);                       % compute alpha
  x=x+alpha*p;                              % update x
end
% end cg_quad.m

As seen by comparison of cg_quad.m to sd_quad.m, implementation of the conjugate gradient algorithm involves only a slight modification of the steepest descent algorithm. The operation count is virtually the same (cg_quad.m requires 2n more flops per iteration), and the storage requirements are slightly increased (cg_quad.m defines an extra n-dimensional vector p).

Matlab implementation: cheaper version

A cheaper version of cg_quad may be written by leveraging the fact that an iterative procedure may be developed for the computation of r by combining the following two equations:

$$x^{(k)} = x^{(k-1)} + \alpha^{(k-1)} p^{(k-1)}, \qquad r^{(k)} = b - A x^{(k)}.$$

Putting these two equations together leads to an update equation for r at each iteration:

$$r^{(k)} = b - A\big(x^{(k-1)} + \alpha^{(k-1)} p^{(k-1)}\big) = r^{(k-1)} - \alpha^{(k-1)} A\, p^{(k-1)}.$$

The nice thing about this update equation for r, versus the direct calculation of r, is that this update equation depends on the matrix-vector product A p^(k-1), which needs to be computed anyway during the computation of alpha. Thus, for quadratic functions, an implementation which costs only O(2n^2) per iteration is possible, as given by the following code.

% cg_cheap_quad.m
% Minimize a quadratic function J = (1/2) x'*A*x - b'*x
% using the conjugate gradient method.
clear res_save x_save; epsilon=1e-8; x=zeros(size(b));
for iter=1:itmax
  if (iter==1)
    r=b-A*x;                                % determine r directly
  else
    r=r-alpha*d;                            % determine r by the iterative formula
  end
  res=r'*r;                                 % compute residual
  res_save(iter)=res; x_save(:,iter)=x;     % save some stuff
  if (res<epsilon | iter==itmax), break, end  % exit yet?
  if (iter==1)                              % compute update direction
    p = r;                                  % set up a S.D. step
  else
    beta = res / res_save(iter-1);          % compute momentum term
    p = r + beta*p;                         % set up a C.G. step
  end
  d=A*p;                                    % perform the (expensive) matrix-vector product
  alpha=res/(p'*d);                         % compute alpha
  x=x+alpha*p;                              % update x
end
% end cg_cheap_quad.m

A test code named test_sd_cg.m, which is available at the class web site, may be used to check to make sure these steepest descent and conjugate gradient algorithms work as advertised. These codes also produce some nice plots to give a graphical interpretation of the operation of the algorithms.

5.2.3 Preconditioned conjugate gradient

Assuming exact arithmetic, the conjugate gradient algorithm converges in exactly N iterations for an N-dimensional quadratic minimization problem. For large N, however, we often can not afford to perform N iterations; we often seek to perform approximate minimization of an N-dimensional problem with a total number of iterations m << N. Unfortunately, convergence of the conjugate gradient algorithm to the minimum of J, though monotonic, is often highly nonuniform, so large reductions in J might not occur until iterations well after the iteration m at which we would like to truncate the iteration sequence.

The uniformity of the convergence is governed by the condition number c of the matrix A, which (for symmetric positive-definite A) is just equal to the ratio of its maximum and minimum eigenvalues, lambda_max/lambda_min. For small c, convergence of the conjugate gradient algorithm is quite rapid even for high-dimensional problems with N >> 1.

We therefore seek to solve a better-conditioned but equivalent problem Ã x̃ = b̃ which, once solved, will allow us to easily extract the solution of the original problem Ax = b for a poorly-conditioned symmetric positive definite A. To accomplish this, premultiply Ax = b by P^{-1} for some symmetric preconditioning matrix P:

$$P^{-1} A\, x = P^{-1} b \quad\Longrightarrow\quad \underbrace{\big(P^{-1} A P^{-1}\big)}_{\tilde A}\, \underbrace{P x}_{\tilde x} = \underbrace{P^{-1} b}_{\tilde b}.$$

Note that the matrix P^{-1} A P^{-1} is symmetric positive definite. We will defer discussion of the construction of an appropriate P to the end of the section; suffice it to say for the moment that, if P^2 is somehow "close" to A, then the problem Ã x̃ = b̃ is a well-conditioned problem (because Ã = P^{-1} A P^{-1} is approximately the identity I) and can be solved rapidly (with a small number of iterations) using the conjugate gradient approach.

The computation of Ã might be prohibitively expensive, and might destroy any sparsity structure of A. We now show that it is not actually necessary to compute Ã and b̃ in order to solve the original problem Ax = b in a well-conditioned manner. To begin, we write the conjugate gradient algorithm for the well-conditioned problem Ã x̃ = b̃. For simplicity, we use a short-hand "pseudo-code" notation:


    for i = 1 : m
        r̃ = b̃ - Ã x̃                                      (i = 1)
        r̃ = r̃ - α̃ d̃                                      (i ≠ 1)
        res_old = res,   res = r̃ᵀ r̃
        p̃ = r̃                                            (i = 1)
        p̃ = r̃ + β̃ p̃   where β̃ = res/res_old             (i ≠ 1)
        α̃ = res/[p̃ᵀ d̃]   where d̃ = Ã p̃
        x̃ = x̃ + α̃ p̃
    end

For clarity of notation, we have introduced a tilde over each vector involved in this optimization. Note that, in converting the poorly-conditioned problem Ax = b to the well-conditioned problem Ã x̃ = b̃, we made the following definitions: Ã = P^{-1} A P^{-1}, x̃ = P x, and b̃ = P^{-1} b. Define now some new intermediate variables r = P r̃, p = P^{-1} p̃, and d = P d̃. With these definitions, we now rewrite exactly the above algorithm for solving the well-conditioned problem Ã x̃ = b̃, but substitute in the non-tilde variables:

    for i = 1 : m
        P⁻¹r = P⁻¹b - (P⁻¹AP⁻¹) Px                          (i = 1)
        P⁻¹r = P⁻¹r - α P⁻¹d                                (i ≠ 1)
        res_old = res,   res = (P⁻¹r)ᵀ (P⁻¹r)
        Pp = P⁻¹r                                           (i = 1)
        Pp = P⁻¹r + β Pp   where β = res/res_old            (i ≠ 1)
        α = res/[(Pp)ᵀ (P⁻¹d)]   where P⁻¹d = (P⁻¹AP⁻¹) Pp
        Px = Px + α Pp
    end

Now define M = P^2 and simplify:

    for i = 1 : m
        r = b - Ax                                          (i = 1)
        r = r - α d                                         (i ≠ 1)
        res_old = res,   res = rᵀ s   where s = M⁻¹ r
        p = s                                               (i = 1)
        p = s + β p   where β = res/res_old                 (i ≠ 1)
        α = res/[pᵀ d]   where d = A p
        x = x + α p
    end

This is practically identical to the original conjugate gradient algorithm for solving the problem Ax = b. The new variable s = M^{-1} r may be found by solution of the system M s = r. Thus, when implementing this method, we seek an M for which we can solve this system quickly (e.g., an M which is the product of sparse triangular matrices). Recall that if M = P^2 is somehow "close" to A, the problem here (which is actually the solution of Ã x̃ = b̃ via standard conjugate gradient) is well conditioned and converges in a small number of iterations. There are a variety of heuristic techniques to construct an appropriate M. One of the most popular is incomplete Cholesky factorization, which constructs a triangular H with H Hᵀ = M ≈ A with the following strategy:

    H = A
    for k = 1 : n
        H(k,k) = sqrt(H(k,k))
        for i = k+1 : n
            if H(i,k) ≠ 0,  H(i,k) = H(i,k)/H(k,k)
        end
        for j = k+1 : n
            for i = j : n
                if H(i,j) ≠ 0,  H(i,j) = H(i,j) - H(i,k) H(j,k)
            end
        end
    end

Once H is obtained with this approach such that H Hᵀ = M, solving the system M s = r for s is similar to completing a Gaussian elimination by leveraging an LU decomposition: one first solves the triangular system H f = r for the intermediate variable f, then solves the triangular system Hᵀ s = f for the desired quantity s.

Note that the triangular factors H and Hᵀ are zero everywhere A is zero. Thus, if A is sparse, the above algorithm can be rewritten in a manner that leverages the sparsity structure of H (akin to the back substitution in the Thomas algorithm). Though it sometimes takes a bit of effort to write an algorithm that efficiently leverages such sparsity structure, as it usually must be done on a case-by-case basis, the benefits of preconditioning are often quite significant and well worth the coding effort which it necessitates.

Motivation and further discussion of incomplete Cholesky factorization, as well as other preconditioning techniques, is deferred to Golub and van Loan (1996).
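For concreteness, a minimal Matlab sketch of the preconditioned iteration described above is given below. It is not part of the codes discussed previously; it assumes that A, b, and itmax are already defined and that a triangular factor H with H*H' = M ≈ A has been computed (e.g., by the incomplete Cholesky strategy above). It mirrors cg_quad.m, except that the system M s = r is solved by two triangular solves at each iteration.

% pcg_quad.m (a sketch of preconditioned conjugate gradient; assumes
% A, b, itmax, and a triangular factor H with H*H' = M ~ A are given)
clear res_save x_save; epsilon=1e-8; x=zeros(size(b));
for iter=1:itmax
  r=b-A*x;                          % determine gradient
  s=H'\(H\r);                       % solve M*s = r by two triangular solves
  res=r'*s;                         % compute (preconditioned) residual
  res_save(iter)=res; x_save(:,iter)=x;
  if (res<epsilon | iter==itmax), break, end
  if (iter==1)
    p = s;                          % set up a S.D. step
  else
    beta = res / res_save(iter-1);  % compute momentum term
    p = s + beta*p;                 % set up a C.G. step
  end
  d=A*p;
  alpha=res/(p'*d);                 % compute alpha
  x=x+alpha*p;                      % update x
end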

5.2.4 Extension to non-quadratic functions

At each iteration of the conjugate gradient method, there are five things to be done:

1. determine the (negative of the) gradient direction, r = -grad J,
2. compute the residual r'*r,
3. determine the necessary momentum beta and the corresponding update direction p = r + beta*p,
4. determine the (scalar) parameter of descent alpha which minimizes J(x + alpha*p), and
5. update x: x = x + alpha*p.


In the homework, you will extend the codes developed here for quadratic problems to nonquadratic problems, creating two new routines sd_nq.m and cg_nq.m. Essentially, the algorithm is the same, but J(x) now lacks the special quadratic structure we assumed in the previous sections. Generalizing the results of the previous sections to nonquadratic problems entails only a few modifications:

(1) Replace the line which determines the gradient direction with a call to a function, which we will call compute_grad.m, which calculates the gradient grad J of the nonquadratic function J.

(2) As the function is not quadratic, but the momentum term in the conjugate gradient algorithm is computed using a local quadratic approximation of J, the momentum sometimes builds up in the wrong direction. Thus, the momentum should be reset to zero (i.e., take beta = 0) every R iterations in cg_nq.m (a modest value of R, on the order of ten, is often a good choice).

(3) Replace the direct computation of alpha with a call to an (appropriately modified) version of Brent's method to determine alpha based on a series of function evaluations.

Finally, when working with nonquadratic functions, it is often advantageous to compute the momentum term beta according to the formula

$$\beta = \frac{\big(r^{(k)} - r^{(k-1)}\big)^T r^{(k)}}{\big(r^{(k-1)}\big)^T r^{(k-1)}}.$$

The "correction" term -(r^(k-1))' r^(k) is zero using the conjugate gradient approach when the function J is quadratic. When the function J is not quadratic, this additional term often serves to nudge the descent direction towards that of a steepest descent step in regions of the function which are highly nonquadratic. This approach is referred to as the Polak-Ribiere variant of the conjugate gradient method for nonquadratic functions.


Chapter 6

Differentiation

6.1 Derivation of finite difference formulae

In the simulation of physical systems, one often needs to compute the derivative of a function f(x) which is known only at a discrete set of grid points x0, x1, ..., xN. Effective formulae for computing such approximations, known as finite difference formulae, may be derived by combination of one or more Taylor series expansions. For example, defining f_j = f(x_j), the Taylor series expansion for f at the point x_{j+1} in terms of f and its derivatives at the point x_j is given by

$$f_{j+1} = f_j + (x_{j+1} - x_j)\, f'_j + \frac{(x_{j+1} - x_j)^2}{2!} f''_j + \frac{(x_{j+1} - x_j)^3}{3!} f'''_j + \cdots$$

Defining h_j = x_{j+1} - x_j, rearrangement of the above equation leads to

$$f'_j = \frac{f_{j+1} - f_j}{h_j} - \frac{h_j}{2} f''_j - \cdots$$

In these notes, we will indicate a uniform mesh by denoting its (constant) grid spacing as h (without a subscript), and we will denote a nonuniform ("stretched") mesh by denoting the grid spacing as h_j (with a subscript). We will assume the mesh is uniform unless indicated otherwise. For a uniform mesh, we may write the above equation as

$$f'_j = \frac{f_{j+1} - f_j}{h} + O(h),$$

where O(h) denotes the contribution from all terms which have a power of h which is greater than or equal to one. Neglecting these higher-order terms for sufficiently small h, we can approximate the derivative as

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{f_{j+1} - f_j}{h},$$

which is referred to as the first-order forward difference formula for the first derivative. The neglected term with the highest power of h (in this case, -(h/2) f''_j) is referred to as the leading-order error. The exponent of h in the leading-order error is the order of accuracy of the method. For a sufficiently fine initial grid, if we refine the grid spacing further by a factor of two, the truncation error of this method is also reduced by approximately a factor of 2, indicating "first-order" behavior.


Similarly, by expanding f_{j-1} about the point x_j and rearranging, we obtain

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{f_j - f_{j-1}}{h},$$

which is referred to as the first-order backward difference formula for the first derivative. Higher-order (more accurate) schemes can be derived by combining Taylor series of the function f at various points near the point x_j. For example, the widely used second-order central difference formula for the first derivative can be obtained by subtraction of the two Taylor series expansions

$$f_{j+1} = f_j + h\, f'_j + \frac{h^2}{2!} f''_j + \frac{h^3}{3!} f'''_j + \cdots$$

$$f_{j-1} = f_j - h\, f'_j + \frac{h^2}{2!} f''_j - \frac{h^3}{3!} f'''_j + \cdots,$$

leading to

$$f_{j+1} - f_{j-1} = 2h\, f'_j + \frac{h^3}{3} f'''_j + \cdots$$

and thus

$$f'_j = \frac{f_{j+1} - f_{j-1}}{2h} - \frac{h^2}{6} f'''_j - \cdots \quad\Longrightarrow\quad \left(\frac{\delta f}{\delta x}\right)_j = \frac{f_{j+1} - f_{j-1}}{2h}.$$

Similar formulae can be derived for second-order derivatives (and higher). For example, by adding the above Taylor series expansions instead of subtracting them, we obtain the second-order central difference formula for the second derivative, given by

$$f''_j = \frac{f_{j+1} - 2 f_j + f_{j-1}}{h^2} - \frac{h^2}{12} f^{(iv)}_j - \cdots \quad\Longrightarrow\quad \left(\frac{\delta^2 f}{\delta x^2}\right)_j = \frac{f_{j+1} - 2 f_j + f_{j-1}}{h^2}.$$

In general, we can obtain higher accuracy if we include more points. By appropriate linear combination of four different Taylor series expansions, a fourth-order central difference formula for the first derivative may be found, which takes the form

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{-f_{j+2} + 8 f_{j+1} - 8 f_{j-1} + f_{j-2}}{12 h}.$$

The main difficulty with higher-order formulae occurs near boundaries of the domain: they require the functional values at points outside the domain, which are not available. Near boundaries, one usually resorts to lower-order formulae.
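The claimed orders of accuracy are easy to confirm numerically. The short test below (for the hypothetical choice f(x) = sin x, differentiated at x = 1) evaluates the first-order forward difference and the second-order central difference for a sequence of grid spacings; halving h should reduce the respective errors by factors of about 2 and 4.

% Check the order of accuracy of two finite difference formulae
% for the test function f(x)=sin(x), whose exact derivative is cos(x).
x=1; f=@(x) sin(x); fp_exact=cos(x);
for h=10.^(-1:-1:-5)
  err_fwd  = abs((f(x+h)-f(x))/h       - fp_exact);   % first order
  err_cent = abs((f(x+h)-f(x-h))/(2*h) - fp_exact);   % second order
  fprintf('h=%8.1e   forward: %8.1e   central: %8.1e\n', h, err_fwd, err_cent)
end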

6.2 Taylor Tables

There is a general technique for constructing finite difference formulae using a tool known as a Taylor table. This technique is best illustrated by example. Suppose we want to construct the most accurate finite difference scheme for the first derivative that involves the function values f_j, f_{j+1}, and f_{j+2}. Given this restriction on the information used, we seek the highest order of accuracy that can be achieved. That is, if we take

$$f'_j - \sum_{k=0}^{2} a_k\, f_{j+k} = \epsilon, \tag{6.1}$$


where the a_k are the coefficients of the finite difference formula sought, then we desire to select the a_k such that the error epsilon is as high a power of h as possible (and thus will vanish rapidly upon refinement of the grid). It is convenient to organize the Taylor series of the terms in the above formula using a "Taylor table" of the form

                        f_j      f'_j        f''_j            f'''_j
    f'_j                 0        1           0                0
    -a0 f_j             -a0       0           0                0
    -a1 f_{j+1}         -a1      -a1 h       -a1 h^2/2!       -a1 h^3/3!
    -a2 f_{j+2}         -a2      -a2 (2h)    -a2 (2h)^2/2!    -a2 (2h)^3/3!

The leftmost column of this table contains all of the terms on the left-hand side of (6.1). The elements to the right, when multiplied by the corresponding terms at the top of each column and summed, yield the Taylor series expansion of each of the terms on the left. Summing up all of these terms, we get the error epsilon expanded in terms of powers of the grid spacing h. By proper choice of the available degrees of freedom {a0, a1, a2}, we can set several of the coefficients of this polynomial equal to zero, thereby making epsilon as high a power of h as possible. For small h, this is a good way of making this error small. In the present case, we have three free coefficients, and can set the coefficients of the first three terms to zero:

$$-a_0 - a_1 - a_2 = 0, \qquad 1 - a_1 h - a_2 (2h) = 0, \qquad -a_1 \frac{h^2}{2} - a_2 \frac{(2h)^2}{2} = 0$$

$$\Longrightarrow\quad \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 1/2 & 2 \end{pmatrix} \begin{pmatrix} a_0 h \\ a_1 h \\ a_2 h \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} -3/(2h) \\ 2/h \\ -1/(2h) \end{pmatrix}.$$

This simple linear system may be solved either by hand or with Matlab. The resulting second-order forward difference formula for the first derivative is

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{-3 f_j + 4 f_{j+1} - f_{j+2}}{2h},$$

and the leading-order error, which can be determined by multiplying the first non-zero column sum by the term at the top of the corresponding column, is

$$\left(-a_1 \frac{h^3}{3!} - a_2 \frac{(2h)^3}{3!}\right) f'''_j = \frac{h^2}{3} f'''_j,$$

revealing that this scheme is second-order accurate.
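As noted above, the linear system arising from a Taylor table may be solved with Matlab; for example, the following lines (a small sketch) reproduce the coefficients of the second-order forward difference formula just derived.

% Solve the Taylor table system for {a0,a1,a2} of the forward difference
% formula; the unknowns are scaled as (a0*h, a1*h, a2*h).
A=[1 1 1; 0 1 2; 0 1/2 2]; rhs=[0; 1; 0];
c=A\rhs                   % yields c = (-3/2, 2, -1/2), i.e. a_k = c_k/h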

6.3 Padé Approximations

By including both nearby function evaluations and nearby gradient approximations on the left-hand side of an expression like (6.1), we can derive a banded system of equations which can easily be solved to determine a numerical approximation of the f'_j. This approach is referred to as Padé approximation. Again illustrating by example, consider the equation

$$b_{-1}\, f'_{j-1} + f'_j + b_1\, f'_{j+1} - \sum_{k=-1}^{1} a_k\, f_{j+k} = \epsilon. \tag{6.2}$$


Leveraging the Taylor series expansions

$$f_{j+1} = f_j + h\, f'_j + \frac{h^2}{2!} f''_j + \frac{h^3}{3!} f'''_j + \frac{h^4}{4!} f^{(iv)}_j + \frac{h^5}{5!} f^{(v)}_j + \cdots$$

$$f'_{j+1} = f'_j + h\, f''_j + \frac{h^2}{2!} f'''_j + \frac{h^3}{3!} f^{(iv)}_j + \frac{h^4}{4!} f^{(v)}_j + \cdots,$$

the corresponding Taylor table is

                      f_j      f'_j      f''_j           f'''_j           f^(iv)_j          f^(v)_j
    b_-1 f'_{j-1}      0       b_-1     -b_-1 h          b_-1 h^2/2!     -b_-1 h^3/3!       b_-1 h^4/4!
    f'_j               0       1         0               0                0                 0
    b_1 f'_{j+1}       0       b_1       b_1 h           b_1 h^2/2!       b_1 h^3/3!        b_1 h^4/4!
    -a_-1 f_{j-1}     -a_-1    a_-1 h   -a_-1 h^2/2!     a_-1 h^3/3!     -a_-1 h^4/4!       a_-1 h^5/5!
    -a_0 f_j          -a_0     0         0               0                0                 0
    -a_1 f_{j+1}      -a_1    -a_1 h    -a_1 h^2/2!     -a_1 h^3/3!      -a_1 h^4/4!       -a_1 h^5/5!

We again use the available degrees of freedom {b_-1, b_1, a_-1, a_0, a_1} to obtain the highest possible accuracy in (6.2). Setting the sums of the first five columns equal to zero leads to the linear system

$$\begin{pmatrix} 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & -1 \\ -1 & 1 & -1/2 & 0 & -1/2 \\ 1 & 1 & 1/3 & 0 & -1/3 \\ -1 & 1 & -1/4 & 0 & -1/4 \end{pmatrix} \begin{pmatrix} b_{-1} \\ b_1 \\ a_{-1} h \\ a_0 h \\ a_1 h \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \begin{pmatrix} b_{-1} \\ b_1 \\ a_{-1} \\ a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 1/4 \\ -3/(4h) \\ 0 \\ 3/(4h) \end{pmatrix}.$$

Thus, the system to be solved to determine the numerical approximation of the derivatives at each grid point has a typical row given by

$$\frac14 \left(\frac{\delta f}{\delta x}\right)_{j-1} + \left(\frac{\delta f}{\delta x}\right)_j + \frac14 \left(\frac{\delta f}{\delta x}\right)_{j+1} = \frac{3}{4h} \big(f_{j+1} - f_{j-1}\big),$$

and has a leading-order error of h^4 f^(v)_j / 120, and thus is fourth-order accurate. Writing out this equation for all values of j on the interior leads to the tridiagonal, diagonally-dominant system

$$\begin{pmatrix} \ddots & \ddots & & & \\ \ddots & 1 & \tfrac14 & & \\ & \tfrac14 & 1 & \tfrac14 & \\ & & \tfrac14 & 1 & \ddots \\ & & & \ddots & \ddots \end{pmatrix} \begin{pmatrix} \vdots \\ (\delta f/\delta x)_{j-1} \\ (\delta f/\delta x)_j \\ (\delta f/\delta x)_{j+1} \\ \vdots \end{pmatrix} = \begin{pmatrix} \vdots \\ \tfrac{3}{4h}(f_j - f_{j-2}) \\ \tfrac{3}{4h}(f_{j+1} - f_{j-1}) \\ \tfrac{3}{4h}(f_{j+2} - f_j) \\ \vdots \end{pmatrix}.$$


At the endpoints, a different treatment is needed, as a central expression (Padé or otherwise) cannot be used at x0 because x_{-1} is outside of our available grid of data. One commonly settles on a lower-order forward (backward) expression at the left (right) boundary in order to close this set of equations in a nonsingular manner. This system can then be solved efficiently for the numerical approximation of the derivative at all of the gridpoints using the Thomas algorithm.
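To make the above concrete, the following sketch assembles and solves this tridiagonal system with Matlab's sparse machinery, using simple first-order one-sided closures at the two boundaries (one choice among several); a Thomas-algorithm implementation would exploit the same banded structure with O(n) work. The grid, test function, and boundary closures here are illustrative assumptions, not part of the original derivation.

% Sketch: fourth-order Pade approximation of f' on a uniform grid,
% with first-order one-sided closures at the two endpoints.
N=100; h=2*pi/N; x=(0:N)'*h; f=sin(x);
e=ones(N+1,1);
A=spdiags([e/4 e e/4],-1:1,N+1,N+1);
A(1,1:2)=[1 0]; A(N+1,N:N+1)=[0 1];       % boundary rows
rhs=[ (f(2)-f(1))/h;                      % forward difference at left end
      3/(4*h)*(f(3:N+1)-f(1:N-1));        % Pade right-hand side, interior
      (f(N+1)-f(N))/h ];                  % backward difference at right end
fp=A\rhs;                                 % numerical approximation of f'
max(abs(fp(2:N)-cos(x(2:N))))             % interior error (should be small)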

6.4 Modified wavenumber analysis

The order of accuracy is usually the primary indicator of the accuracy of finite-difference formulae; it tells us how mesh refinement improves the accuracy. For example, for a sufficiently fine grid, mesh refinement by a factor of two improves the accuracy of a second-order finite-difference scheme by a factor of four, and improves the accuracy of a fourth-order scheme by a factor of sixteen. Another method for quantifying the accuracy of a finite difference formula that yields further information is called the modified wavenumber approach. To illustrate this approach, consider the harmonic function given by

$$f(x) = e^{i k x}. \tag{6.3}$$

(Alternatively, we can also do this derivation with sines and cosines, but complex exponentials tend to make the algebra easier.) The exact derivative of this function is

$$f' = i k\, e^{i k x} = i k\, f. \tag{6.4}$$

We now ask how accurately the second-order central finite-difference scheme, for example, computes the derivative of f. Let us discretize the x axis with a uniform mesh,

$$x_j = h\, j \quad\text{where}\quad j = 0, 1, 2, \ldots, N, \quad\text{and}\quad h = \frac{L}{N}.$$

The finite-difference approximation for the derivative which we consider here is

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{f_{j+1} - f_{j-1}}{2h}.$$

Substituting for f_j = e^{i k x_j}, noting that x_{j+1} = x_j + h and x_{j-1} = x_j - h, we obtain

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{e^{i k (x_j + h)} - e^{i k (x_j - h)}}{2h} = \frac{e^{i k h} - e^{-i k h}}{2h} f_j = i\, \frac{\sin(h k)}{h}\, f_j \equiv i\, k'\, f_j, \tag{6.5}$$

where

$$k' = \frac{\sin(h k)}{h} \quad\Longleftrightarrow\quad h k' = \sin(h k).$$

By analogy with (6.4), k' is called the modified wavenumber for this second-order finite difference scheme. In an analogous manner, one can derive modified wavenumbers for any finite difference formula. A measure of accuracy of the finite-difference scheme is provided by comparing the modified wavenumber k', which appears in the numerical approximation of the derivative (6.5), with the actual wavenumber k, which appears in the exact expression for the derivative (6.4). For small wavenumbers, the numerical approximation of the derivative on our discrete grid is usually pretty good (k' is approximately k), but for large wavenumbers, the numerical approximation is degraded. As k approaches pi/h, the numerical approximation of the derivative is completely off.


As another example, consider the modified wavenumber for the fourth-order expression for the first derivative given by

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{-f_{j+2} + 8 f_{j+1} - 8 f_{j-1} + f_{j-2}}{12 h} = \frac{8}{12 h}\big(f_{j+1} - f_{j-1}\big) - \frac{1}{12 h}\big(f_{j+2} - f_{j-2}\big).$$

Inserting (6.3) and manipulating as before, we obtain

$$\left(\frac{\delta f}{\delta x}\right)_j = \frac{8}{12 h}\big(e^{i k h} - e^{-i k h}\big) f_j - \frac{1}{12 h}\big(e^{2 i k h} - e^{-2 i k h}\big) f_j = i \left[\frac{4}{3 h} \sin(h k) - \frac{1}{6 h} \sin(2 h k)\right] f_j \equiv i\, k'\, f_j$$

$$\Longrightarrow\quad h k' = \frac{4}{3} \sin(h k) - \frac{1}{6} \sin(2 h k).$$

Consider now our fourth-order Padé approximation,

$$\frac14 \left(\frac{\delta f}{\delta x}\right)_{j-1} + \left(\frac{\delta f}{\delta x}\right)_j + \frac14 \left(\frac{\delta f}{\delta x}\right)_{j+1} = \frac{3}{4h} \big(f_{j+1} - f_{j-1}\big).$$

Approximating the derivatives at points x_{j-1} and x_{j+1} with their corresponding numerical approximations,

$$\left(\frac{\delta f}{\delta x}\right)_{j+1} = i k' e^{i k x_{j+1}} = i k' e^{i k h} f_j \quad\text{and}\quad \left(\frac{\delta f}{\delta x}\right)_{j-1} = i k' e^{i k x_{j-1}} = i k' e^{-i k h} f_j,$$

and inserting (6.3) and manipulating as before, we obtain

$$i k' \left[\frac14 e^{i k h} + 1 + \frac14 e^{-i k h}\right] f_j = \frac{3}{4h} \big(e^{i k h} - e^{-i k h}\big) f_j$$

$$i k' \left[1 + \frac12 \cos(h k)\right] f_j = i\, \frac{3}{2h} \sin(h k)\, f_j \quad\Longrightarrow\quad h k' = \frac{3 \sin(h k)}{2 + \cos(h k)}.$$

6.5 Alternative derivation of differentiation formulae

Consider now the Lagrange interpolation of the three points {x_{i-1}, f_{i-1}}, {x_i, f_i}, and {x_{i+1}, f_{i+1}}, given by

$$f(x) = \frac{(x - x_i)(x - x_{i+1})}{(x_{i-1} - x_i)(x_{i-1} - x_{i+1})} f_{i-1} + \frac{(x - x_{i-1})(x - x_{i+1})}{(x_i - x_{i-1})(x_i - x_{i+1})} f_i + \frac{(x - x_{i-1})(x - x_i)}{(x_{i+1} - x_{i-1})(x_{i+1} - x_i)} f_{i+1}.$$

Differentiating this expression with respect to x and then evaluating at x = x_i gives (the contribution from the f_i term vanishes)

$$f'(x_i) = \frac{-h}{(-h)(-2h)} f_{i-1} + \frac{h}{(2h)(h)} f_{i+1} = \frac{f_{i+1} - f_{i-1}}{2h},$$

which is the same as what we get with the Taylor table.


Chapter 7

Integration

Differentiation and integration are two essential tools of calculus which we need to solve engineering problems. The previous chapter discussed methods to approximate derivatives numerically; we now turn to the problem of numerical integration. In the setting we discuss in the present chapter, in which we approximate the integral of a given function over a specified domain, this procedure is usually referred to as numerical quadrature.

7.1 Basic quadrature formulae

7.1.1 Techniques based on Lagrange interpolation

Consider the problem of integrating a function f on the interval [a, c] when the function is evaluated only at a limited number of discrete gridpoints. One approach to approximating the integral of f is to integrate the lowest-order polynomial that passes through a specified number of function evaluations, using the formulae of Lagrange interpolation. For example, if the function is evaluated at the midpoint b = (a + c)/2, then (defining h = c - a)

we can integrate a constant approximation of the function over the interval [a, c], leading to the midpoint rule:

$$\int_a^c f(x)\,dx \approx \int_a^c f(b)\,dx = h\, f(b) \equiv M(f). \tag{7.1}$$

If the function is evaluated at the two endpoints a and c, we can integrate a linear approximation of the function over the interval [a, c], leading to the trapezoidal rule:

$$\int_a^c f(x)\,dx \approx \int_a^c \left[\frac{x - c}{a - c} f(a) + \frac{x - a}{c - a} f(c)\right] dx = h\, \frac{f(a) + f(c)}{2} \equiv T(f). \tag{7.2}$$

If the function is known at all three points a, b, and c, we can integrate a quadratic approximation of the function over the interval [a, c], leading to Simpson's rule:

$$\int_a^c f(x)\,dx \approx \int_a^c \left[\frac{(x-b)(x-c)}{(a-b)(a-c)} f(a) + \frac{(x-a)(x-c)}{(b-a)(b-c)} f(b) + \frac{(x-a)(x-b)}{(c-a)(c-b)} f(c)\right] dx = h\, \frac{f(a) + 4 f(b) + f(c)}{6} \equiv S(f). \tag{7.3}$$


Don't forget that symbolic manipulation packages like Maple (which is included with Matlab) and Mathematica are well suited for this sort of algebraic manipulation.
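For instance, Simpson's rule (7.3) may be recovered symbolically in a few lines; the sketch below (using Matlab's Symbolic Math Toolbox) integrates the quadratic Lagrange interpolant exactly.

% Derive Simpson's rule symbolically by integrating the quadratic
% Lagrange interpolant through (a,fa), (b,fb), (c,fc).
syms x a h fa fb fc
b=a+h/2; c=a+h;                  % midpoint and right endpoint
P = fa*(x-b)*(x-c)/((a-b)*(a-c)) + fb*(x-a)*(x-c)/((b-a)*(b-c)) ...
  + fc*(x-a)*(x-b)/((c-a)*(c-b));
simplify(int(P,x,a,c))           % should simplify to h*(fa + 4*fb + fc)/6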

7.1.2 Extension to several gridpoints

Recall that Lagrange interpolations are ill-behaved near the endpoints when the number of gridpoints is large. Thus, it is generally ill-advised to continue the approach of the previous section to function approximations which are higher order than quadratic. Instead, better results are usually obtained by applying the formulae of §7.1.1 repeatedly over several smaller subintervals. Defining a numerical grid of points {x0, x1, ..., xn} distributed over the interval [L, R], the intermediate gridpoints x_{i-1/2} = (x_{i-1} + x_i)/2, the grid spacings h_i = x_i - x_{i-1}, and the function evaluations f_i = f(x_i), the numerical approximation of the integral of f(x) over the interval [L, R] via the midpoint rule takes the form

$$\int_L^R f(x)\,dx \approx \sum_{i=1}^{n} h_i\, f_{i-1/2}, \tag{7.4}$$

numerical approximation of the integral via the trapezoidal rule takes the form

$$\int_L^R f(x)\,dx \approx \sum_{i=1}^{n} h_i\, \frac{f_{i-1} + f_i}{2} = h \left[\frac{f_0 + f_n}{2} + \sum_{i=1}^{n-1} f_i\right], \tag{7.5}$$

and numerical approximation of the integral via Simpson's rule takes the form

$$\int_L^R f(x)\,dx \approx \sum_{i=1}^{n} h_i\, \frac{f_{i-1} + 4 f_{i-1/2} + f_i}{6} = \frac{h}{6} \left[f_0 + f_n + 4 \sum_{i=1}^{n} f_{i-1/2} + 2 \sum_{i=1}^{n-1} f_i\right], \tag{7.6}$$

where the rightmost expressions assume a uniform grid in which the grid spacing h is constant.

7.2 Error Analysis of Integration Rules

In order to quantify the accuracy of the integration rules we have proposed so far, we may again turn to Taylor series analysis. For example, replacing f(x) with its Taylor series approximation about b and integrating, we obtain

$$\int_a^c f(x)\,dx = \int_a^c \left[f(b) + (x - b)\, f'(b) + \frac{(x-b)^2}{2!} f''(b) + \frac{(x-b)^3}{3!} f'''(b) + \cdots\right] dx$$

$$= h\, f(b) + \frac{1}{2!} \left[\frac{(x-b)^3}{3}\right]_a^c f''(b) + \frac{1}{4!} \left[\frac{(x-b)^5}{5}\right]_a^c f^{(iv)}(b) + \cdots = h\, f(b) + \frac{h^3}{24} f''(b) + \frac{h^5}{1920} f^{(iv)}(b) + \cdots \tag{7.7}$$

(the odd-power terms integrate to zero by symmetry about b).

Thus, if the integral is approximated by the midpoint rule (7.1), the leading-order error is proportional to h^3, and the approximation of the integral over this single interval is third-order accurate. Note also that if all even derivatives of f happen to be zero (for example, if f(x) is linear in x), the midpoint rule integrates the function exactly.

The question of the accuracy of a particular integration rule over a single interval is often not of much interest, however. A more relevant measure is the rate of convergence of the integration rule when applied over several gridpoints on a given interval [L, R] as the numerical grid is refined. For example, consider the formula (7.4) on n subintervals (n + 1 gridpoints). As h is proportional to 1/n (the width of each subinterval is inversely proportional to the number of subintervals), the error over the entire interval [L, R] of the numerical integration will be proportional to n h^3, and hence to h^2. Thus, for approximations of the integral on a given interval [L, R] as the computational grid is refined, the midpoint rule is second-order accurate.

Consider now the Taylor series approximations of f(a) and f(c) about b:

$$f(a) = f(b) - \frac{h}{2} f'(b) + \frac{1}{2!} \frac{h^2}{4} f''(b) - \frac{1}{3!} \frac{h^3}{8} f'''(b) + \cdots$$

$$f(c) = f(b) + \frac{h}{2} f'(b) + \frac{1}{2!} \frac{h^2}{4} f''(b) + \frac{1}{3!} \frac{h^3}{8} f'''(b) + \cdots$$

Combining these expressions gives

$$\frac{f(a) + f(c)}{2} = f(b) + \frac{h^2}{8} f''(b) + \frac{h^4}{384} f^{(iv)}(b) + \cdots$$

Solving for f(b) and substituting into (7.7) yields

$$\int_a^c f(x)\,dx = h\, \frac{f(a) + f(c)}{2} - \frac{h^3}{12} f''(b) - \frac{h^5}{480} f^{(iv)}(b) - \cdots \tag{7.8}$$

As with the midpoint rule, the leading-order error of the trapezoidal rule (7.2) is proportional to h^3, and thus the trapezoidal approximation of the integral over this single interval is third-order accurate. Again, the most relevant measure is the rate of convergence of the integration rule (7.5) when applied over several gridpoints on a given interval [L, R] as the number of gridpoints is increased; in such a setting, as with the midpoint rule, the trapezoidal rule is second-order accurate.

Note from (7.1)-(7.3) that S(f) = (2/3) M(f) + (1/3) T(f). Adding 2/3 times equation (7.7) plus 1/3 times equation (7.8) gives

$$\int_a^c f(x)\,dx = h\, \frac{f(a) + 4 f(b) + f(c)}{6} - \frac{h^5}{2880} f^{(iv)}(b) - \cdots$$

The leading-order error of Simpson's rule (7.3) is therefore proportional to h^5, and thus the approximation of the integral over this single interval using this rule is fifth-order accurate. Note that if all even derivatives of f higher than three happen to be zero (e.g., if f(x) is cubic in x), then Simpson's rule integrates the function exactly. Again, the most relevant measure is the rate of convergence of the integration rule (7.6) when applied over several gridpoints on the given interval [L, R] as the number of gridpoints is increased; in such a setting, Simpson's rule is fourth-order accurate.

7.3 Romberg integration

In the previous derivation, it became evident that keeping track of the leading-order error of a particular numerical formula can be a useful thing to do. In fact, Simpson's rule can be constructed simply by determining the specific linear combination of the midpoint rule and the trapezoidal rule for which the leading-order error term vanishes. We now pursue such a constructive procedure further, with a technique known as Richardson extrapolation, in order to determine even higher-order approximations of the integral on the interval [L, R] using linear combinations of several trapezoidal approximations of the integral on a series of successively finer grids. The approach we will present is commonly referred to as Romberg integration.

Recall first that the error of the trapezoidal approximation of the integral (7.5) on the given interval [L, R] may be written

$$I = \int_L^R f(x)\,dx = h \left[\frac{f_0 + f_n}{2} + \sum_{i=1}^{n-1} f_i\right] + c_2 h^2 + c_4 h^4 + c_6 h^6 + c_8 h^8 + \cdots$$

Let us start with n_1 = 2 and h_1 = (R - L)/n_1, and iteratively refine the grid. Define the trapezoidal approximation of the integral on a numerical grid with n = n_l = 2^l (i.e., h = h_l = (R - L)/n_l) as

$$I_{l,1} = h_l \left[\frac{f_0 + f_{n_l}}{2} + \sum_{i=1}^{n_l - 1} f_i\right].$$

We now examine the truncation error (in terms of h_1) as the grid is refined. Note that at the first level we have

$$I_{1,1} = I - c_2 h_1^2 - c_4 h_1^4 - c_6 h_1^6 - \cdots,$$

whereas at the second level we have

$$I_{2,1} = I - c_2 h_2^2 - c_4 h_2^4 - c_6 h_2^6 - \cdots = I - c_2 \frac{h_1^2}{4} - c_4 \frac{h_1^4}{16} - c_6 \frac{h_1^6}{64} - \cdots$$

Assuming the coefficients c_i (which are proportional to the various derivatives of f) vary only slowly in space, we can eliminate the error proportional to h_1^2 by taking a linear combination of I_{1,1} and I_{2,1} to obtain

$$I_{2,2} = \frac{4 I_{2,1} - I_{1,1}}{3} = I + \frac{1}{4} c_4 h_1^4 + \frac{5}{16} c_6 h_1^6 + \cdots$$

(This results in Simpson's rule, if you do all of the appropriate substitutions.) Continuing to the third level of grid refinement, the trapezoidal approximation of the integral satisfies

$$I_{3,1} = I - c_2 h_3^2 - c_4 h_3^4 - c_6 h_3^6 - \cdots = I - c_2 \frac{h_1^2}{16} - c_4 \frac{h_1^4}{256} - c_6 \frac{h_1^6}{4096} - \cdots$$

First, eliminate terms proportional to h_1^2 by linear combination with I_{2,1}:

$$I_{3,2} = \frac{4 I_{3,1} - I_{2,1}}{3} = I + \frac{1}{64} c_4 h_1^4 + \frac{5}{1024} c_6 h_1^6 + \cdots$$

Then, eliminate terms proportional to h_1^4 by linear combination with I_{2,2}:

$$I_{3,3} = \frac{16 I_{3,2} - I_{2,2}}{15} = I - \frac{1}{64} c_6 h_1^6 + \cdots$$


This process may be repeated to provide increasingly higher-order approximations to the integral I. The structure of the refinements is:

    Gridpoints     2nd-order        4th-order     6th-order     8th-order
                   approximation    correction    correction    correction
    n_1 = 2:       I_{1,1}
    n_2 = 4:       I_{2,1}          I_{2,2}
    n_3 = 8:       I_{3,1}          I_{3,2}       I_{3,3}
    n_4 = 16:      I_{4,1}          I_{4,2}       I_{4,3}       I_{4,4}

The general form for the correction term (for k > 1) is:

$$I_{l,k} = \frac{4^{k-1}\, I_{l,k-1} - I_{l-1,k-1}}{4^{k-1} - 1}.$$

Matlab implementation

A Matlab implementation of Romberg integration is given below. A straightforward test code is provided at the class web site.

function [int,evals] = int_romb(L,R,refinements)
% Integrate the function defined in compute_f.m from x=L to x=R using
% Romberg integration to provide the maximal order of accuracy with a
% given number of grid refinements.
evals=0; toplevel=refinements+1;
for level=1:toplevel
  % Approximate the integral with the trapezoidal rule on 2^level subintervals.
  n=2^level;
  [I(level,1),evals_temp] = int_trap(L,R,n);
  evals=evals+evals_temp;
  % Perform several corrections based on I at the previous level.
  for k=2:level
    I(level,k) = (4^(k-1)*I(level,k-1) - I(level-1,k-1)) / (4^(k-1) - 1);
  end
end
int=I(toplevel,toplevel);
% end int_romb.m

A simple function to perform the trapezoidal integrations is given by:

function [int,evals] = int_trap(L,R,n)
% Integrate the function defined in compute_f.m from x=L to x=R on
% n equal subintervals using the trapezoidal rule.
h=(R-L)/n;
int=0.5*(compute_f(L) + compute_f(R));
for i=1:n-1
  x=L+h*i; int=int+compute_f(x);
end
int=h*int; evals=n+1;
% end int_trap.m

7.4 Adaptive Quadrature

Often, it is wasteful to use the same grid spacing h everywhere in the interval of integration [L, R]. Ideally, one would like to use a fine grid in the regions where the integrand varies quickly and a coarse grid where the integrand varies slowly. As we now show, adaptive quadrature techniques automatically adjust the grid spacing in just such a manner.

Suppose we seek a numerical approximation Ĩ of the integral I such that

$$|\tilde I - I| \le \epsilon,$$

where epsilon is the error tolerance provided by the user. The idea of adaptive quadrature is to spread out this error in our approximation of the integral proportionally across the subintervals spanning [L, R]. To demonstrate this technique, we will use Simpson's rule as the base method. First, divide the interval [L, R] into several subintervals with the numerical grid {x0, x1, ..., xn}. Evaluating the integral on a particular subinterval [x_{i-1}, x_i] with Simpson's rule yields

$$S_i = \frac{h_i}{6} \left[f(x_i - h_i) + 4 f\Big(x_i - \frac{h_i}{2}\Big) + f(x_i)\right].$$

Dividing this particular subinterval in half and summing Simpson's approximations of the integral on each of these smaller subintervals yields

$$S^{(2)}_i = \frac{h_i}{12} \left[f(x_i - h_i) + 4 f\Big(x_i - \frac{3 h_i}{4}\Big) + 2 f\Big(x_i - \frac{h_i}{2}\Big) + 4 f\Big(x_i - \frac{h_i}{4}\Big) + f(x_i)\right].$$

The essential idea is to compare the two approximations S_i and S^(2)_i to obtain an estimate for the accuracy of S^(2)_i. If the accuracy is acceptable, we will use S^(2)_i for the approximation of the integral on this interval; otherwise, the adaptive procedure further subdivides the interval, and the process is repeated. Let I_i denote the exact integral on [x_{i-1}, x_i]. From our error analysis, we know that

$$I_i = S_i + c\, h_i^5\, f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots \tag{7.9}$$

and

$$I_i = S^{(2)}_i + c \Big(\frac{h_i}{2}\Big)^5 \left[f^{(iv)}\Big(x_i - \frac{3 h_i}{4}\Big) + f^{(iv)}\Big(x_i - \frac{h_i}{4}\Big)\right] + \cdots$$

Each of the terms in the bracket in the last expression can be expanded in a Taylor series about the point x_i - h_i/2:

$$f^{(iv)}\Big(x_i - \frac{3 h_i}{4}\Big) = f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) - \frac{h_i}{4} f^{(v)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots$$

$$f^{(iv)}\Big(x_i - \frac{h_i}{4}\Big) = f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) + \frac{h_i}{4} f^{(v)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots$$


Thus,

$$I_i = S^{(2)}_i + 2 c \Big(\frac{h_i}{2}\Big)^5 f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots = S^{(2)}_i + \frac{c\, h_i^5}{16}\, f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots \tag{7.10}$$

Subtracting (7.10) from (7.9), we obtain

$$S^{(2)}_i - S_i = \frac{15}{16}\, c\, h_i^5\, f^{(iv)}\Big(x_i - \frac{h_i}{2}\Big) + \cdots,$$

and substituting into the RHS of (7.10) reveals that

$$I_i - S^{(2)}_i \approx \frac{1}{15}\Big(S^{(2)}_i - S_i\Big).$$

Thus, the error in S^(2)_i is, to leading order, about 1/15 of the difference between S_i and S^(2)_i. The good news is that this difference can easily be computed. If

$$\frac{1}{15}\Big|S^{(2)}_i - S_i\Big| \le \frac{h_i}{R - L}\, \epsilon,$$

then S^(2)_i is sufficiently accurate for the subinterval [x_{i-1}, x_i], and we move on to the next subinterval.

If this condition is not satisfied, the subinterval [x_{i-1}, x_i] will be subdivided further. The essential idea of adaptive quadrature is thus to spread evenly the error of the numerical approximation of the integral (or, at least, an approximation of this error) over the entire interval [L, R] by selective refinements of the numerical grid. Similar schemes may also be pursued for the other base integration schemes, such as the trapezoidal rule. As with Richardson extrapolation, knowledge of the truncation error can be leveraged to estimate the accuracy of the numerical solution without knowing the exact solution.
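A compact recursive implementation of this strategy is sketched below (assuming, as in the earlier quadrature codes, that the integrand is defined in compute_f.m); each call either accepts the refined estimate or splits the interval in two and allocates half the error budget to each half.

function int = int_adapt(a,b,tol)
% Sketch of adaptive Simpson quadrature on [a,b] with error budget tol.
% The integrand is assumed to be defined in compute_f.m.
m=(a+b)/2; h=b-a;
S  = h/6 *(compute_f(a)+4*compute_f(m)+compute_f(b));
S2 = h/12*(compute_f(a)+4*compute_f((a+m)/2)+2*compute_f(m) ...
          +4*compute_f((m+b)/2)+compute_f(b));
if abs(S2-S)/15 <= tol                 % error estimate derived above
  int = S2;
else                                   % subdivide, splitting the budget
  int = int_adapt(a,m,tol/2) + int_adapt(m,b,tol/2);
end
% end int_adapt.m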


Chapter 8

Ordinary differential equations

Consider first a scalar, first-order ordinary differential equation (ODE) of the form

$$\frac{dy}{dt} = f(y, t) \qquad\text{with}\qquad y(t_0) = y_0. \tag{8.1}$$

The problem we address now is the advancement of such a system in time by integration of this differential equation. As the quantity being integrated, f, is itself a function of the result of the integration, y, the problem of integration of an ODE is fundamentally different than the problem of numerical quadrature discussed in §7, in which the function being integrated was given. Note that ODEs with higher-order derivatives and systems of ODEs present a straightforward generalization of the present discussion, as will be shown in due course. Note also that we refer to the independent variable in this chapter as time, t, but this is done without loss of generality, and other interpretations of the independent variable are also possible.

The ODE given above may be "solved" numerically by marching it forward in time, step by step. In other words, we seek to approximate the solution y of (8.1) at timestep t_{n+1} = t_n + h_n, given the solution at the initial time t_0 and the solution at the previously-computed timesteps t_1 to t_n. For simplicity of notation, we will focus our discussion initially on the case with constant stepsize h; generalization to the case with nonconstant h_n is straightforward.

Taylor-series methods

One of the simplest approaches to march the ODE forward in time is to appeal to a Taylor series expansion in time, such as

$$y(t_{n+1}) = y(t_n) + h\, y'(t_n) + \frac{h^2}{2}\, y''(t_n) + \frac{h^3}{6}\, y'''(t_n) + \cdots.$$

From our ODE, we have

$$y' = \frac{dy}{dt} = f,$$

$$y'' = \frac{dy'}{dt} = \frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial y}\frac{dy}{dt} = f_t + f f_y,$$

$$y''' = \frac{dy''}{dt} = \frac{d}{dt}(f_t + f f_y) = \frac{\partial}{\partial t}(f_t + f f_y) + \frac{\partial}{\partial y}(f_t + f f_y)\frac{dy}{dt} = f_{tt} + f_t f_y + 2 f f_{yt} + f_y^2 f + f^2 f_{yy},$$


etc. Denoting the numerical approximation of $y(t_n)$ as $y_n$, the time integration method based on the first two terms of the above Taylor series is given by

$$y_{n+1} = y_n + h\, f(y_n, t_n).$$

This is referred to as the explicit Euler method, and is the simplest of all time integration schemes. Note that this method neglects terms which are proportional to $h^2$, and thus is "second-order" accurate over a single time step. As with the problem of numerical quadrature, however, a more relevant measure is the accuracy achieved when marching the ODE over a given time interval $[t_0, t_0 + T]$ as the timesteps $h$ are made smaller. In such a setting, we lose one in the order of accuracy (as in the quadrature problem), and thus, over a specified time interval $[t_0, t_0 + T]$, explicit Euler is first-order accurate.

We can also base a time integration scheme on the first three terms of the Taylor series:

$$y_{n+1} = y_n + h\, f(y_n, t_n) + \frac{h^2}{2}\big[f_t(y_n, t_n) + f(y_n, t_n)\, f_y(y_n, t_n)\big].$$

Even higher-order Taylor series methods are also possible. We do not pursue such high-order Taylor series approaches in the present text, however, as their computational expense is relatively high (due to all of the cross derivatives required) and their stability and accuracy are not as good as some of the other methods which we will develop.

Note that a Taylor series expansion in time may also be written around $t_{n+1}$:

$$y(t_n) = y(t_{n+1}) - h\, y'(t_{n+1}) + \frac{h^2}{2}\, y''(t_{n+1}) - \frac{h^3}{6}\, y'''(t_{n+1}) + \cdots.$$

The time integration method based on the first two terms of this Taylor series is given by

$$y_{n+1} = y_n + h\, f(y_{n+1}, t_{n+1}).$$

This is referred to as the implicit Euler method. It also neglects terms which are proportional to $h^2$, and thus is "second-order" accurate over a single time step. As with explicit Euler, over a specified time interval $[t_0, t_0 + T]$, implicit Euler is first-order accurate.

If $f$ is nonlinear in $y$, implicit methods such as the implicit Euler method given above are difficult to use, because knowledge of $y_{n+1}$ is needed (before it is computed!) in order to evaluate $f$ to advance from $y_n$ to $y_{n+1}$. Typically, such problems are approximated by some type of linearization or iteration, as will be discussed further in class. On the other hand, if $f$ is linear in $y$, implicit strategies such as implicit Euler are easily solved for $y_{n+1}$.
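For instance, a single implicit Euler step for a scalar nonlinear $f$ may be implemented with a Newton iteration on the update equation; in the following minimal sketch, the function name, the handle fy returning df/dy, the iteration limit, and the convergence threshold are all illustrative assumptions.

function y = ieulerstep(f, fy, yn, tnew, h)
% IEULERSTEP  one implicit Euler step, solving the update equation
%   g(y) = y - h*f(y,tnew) - yn = 0   by Newton iteration.
y = yn;                          % initial guess: the previous step
for it = 1:20
  g  = y - h*f(y, tnew) - yn;    % residual of the implicit update
  dg = 1 - h*fy(y, tnew);        % derivative dg/dy
  y  = y - g/dg;                 % Newton correction
  if abs(g) < 1e-12, break, end
end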

The trapezoidal method

The formal solution of the ODE over the interval $[t_n, t_{n+1}]$ is given by

$$y_{n+1} = y_n + \int_{t_n}^{t_{n+1}} f(y, t)\, dt.$$

Approximating this integral with the trapezoidal rule of numerical quadrature gives

$$y_{n+1} = y_n + \frac{h}{2}\big[f(y_n, t_n) + f(y_{n+1}, t_{n+1})\big].$$

This is referred to as the trapezoidal or Crank-Nicolson method. We defer discussion of the accuracy of this method to the section on accuracy below, after we first discuss an illustrative model problem.


A model problem

A scalar model problem which is very useful for characterizing various time integration strategies is

$$y' = \lambda y \qquad \text{with} \qquad y(t_0) = y_0,$$

where $\lambda$ is, in general, allowed to be complex. The exact solution of this problem is $y = y_0\, e^{\lambda (t - t_0)}$.

The utility of this model problem is that the exact solution is available, so we can compare the numerical approximation produced by a particular numerical method to the exact solution in order to quantify the pros and cons of that method. The insight we gain by studying the application of our chosen numerical method to this simple model problem allows us to predict how the method will work on more difficult problems for which the exact solution is not available.

Note that, for $\Re(\lambda) > 0$, the magnitude of the exact solution grows without bound. We thus refer to the exact solution as being unstable if $\Re(\lambda) > 0$ and stable if $\Re(\lambda) \le 0$. Graphically, we denote the region of stability of the exact solution in the complex plane $\lambda$ by the shaded region shown in the figure below.

[Figure: Stability of the exact solution to the model problem $y' = \lambda y$ in the complex plane $\lambda$.]

Simulation of an exponentially-decaying system

Consider now the model problem with $\lambda = -1$. The exact solution of this system is simply a decaying exponential. In the figure below we show the application of the explicit Euler method, the implicit Euler method, and the trapezoidal method to this problem. Note that the explicit Euler method appears to be unstable for the large values of $h$. Note also that all three methods are more accurate as $h$ is refined, with the trapezoidal method appearing to be the most accurate.

Simulation of an undamped oscillating system

Consider now the second-order ODE for a simple mass-spring system given by

$$y'' = -\omega^2 y \qquad \text{with} \qquad y(t_0) = y_0, \quad y'(t_0) = 0,$$

where $\omega > 0$. The exact solution is $y = y_0 \cos[\omega(t - t_0)] = (y_0/2)\big[e^{i\omega(t - t_0)} + e^{-i\omega(t - t_0)}\big]$.


[Figure: Simulation of the model problem $y' = \lambda y$ with $\lambda = -1$ using the explicit Euler method (top), the implicit Euler method (middle), and the trapezoidal method (bottom); numerical solutions with a coarse and a refined stepsize $h$ are compared with the exact solution.]
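The three marches compared in this figure are simple to reproduce. The following sketch exploits the linearity of the model problem, so that the implicit updates can be solved in closed form; the stepsize and interval length chosen here are illustrative.

lambda = -1; h = 0.5; N = round(15/h); t = (0:N)*h;
yEE = ones(1,N+1); yIE = ones(1,N+1); yTR = ones(1,N+1);
for n = 1:N
  yEE(n+1) = (1 + lambda*h) * yEE(n);                     % explicit Euler
  yIE(n+1) = yIE(n) / (1 - lambda*h);                     % implicit Euler
  yTR(n+1) = (1 + lambda*h/2)/(1 - lambda*h/2) * yTR(n);  % trapezoidal
end
plot(t, yEE, 'x', t, yIE, 'o', t, yTR, 's', t, exp(lambda*t), '-')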

We may easily write this second-order ODE as a first-order system of ODEs by defining $y_1 = y$ and $y_2 = y'$ and writing

$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'}_{\textstyle y'} = \underbrace{\begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}}_{\textstyle A}\; \underbrace{\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}}_{\textstyle y}.$$

The eigenvalues of $A$ are $\pm i\omega$. Note that the eigenvalues are imaginary; had we started with the equation for a damped oscillator, the eigenvalues would have a negative real part as well. Note also that $A$ may be diagonalized by its matrix of eigenvectors:

$$A = S \Lambda S^{-1} \qquad \text{where} \qquad \Lambda = \begin{pmatrix} i\omega & 0 \\ 0 & -i\omega \end{pmatrix}.$$

Thus, we have

$$y' = S \Lambda S^{-1} y \quad\Rightarrow\quad S^{-1} y' = \Lambda S^{-1} y \quad\Rightarrow\quad z' = \Lambda z,$$


[Figure: Simulation of the oscillatory system $y'' = -\omega^2 y$ with $\omega = 1$ using the explicit Euler method (top), the implicit Euler method (middle), and the trapezoidal method (bottom); numerical solutions with a coarse and a refined stepsize $h$ are compared with the exact solution.]

where we have defined $z = S^{-1} y$. In terms of the components of $z$, we have decoupled the dynamics of the system:

$$z_1' = i\omega\, z_1, \qquad z_2' = -i\omega\, z_2.$$

Each of these equations is exactly of the same form as our scalar model problem, with complex (in this case, pure imaginary) values for $\lambda$. Thus, eigenmode decompositions of physical systems (like mass-spring systems) motivate us to study the scalar model problem over the entire complex plane $\lambda$. In fact, our original second-order system, as re-expressed in the first-order form above, is stable iff there are no eigenvalues of $A$ with $\Re(\lambda) > 0$.

In the figure above we show the application of the explicit Euler method, the implicit Euler method, and the trapezoidal method to this first-order system of equations. Note that the explicit Euler method appears to be unstable for both large and small values of $h$. Note also that all three methods are more accurate as $h$ is refined, with the trapezoidal method appearing to be the most accurate. We see that some numerical methods for time integration of ODEs are more accurate than others, and some numerical techniques are sometimes unstable, even for ODEs with stable exact solutions.
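This eigenmode decomposition is easily checked numerically; a brief sketch, with an illustrative value of omega:

omega = 1;
A = [0 1; -omega^2 0];   % first-order form of y'' = -omega^2 * y
[S, Lambda] = eig(A)     % columns of S are eigenvectors; the diagonal
                         % of Lambda contains the eigenvalues +/- i*omega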


In the next two sections, we develop techniques to quantify both the stability and the accuracy of numerical methods for time integration of ODEs by applying these numerical methods to the model problem.

Stability

For stability of a numerical method for time integration of an ODE, we want to ensure that, if the exact solution is bounded, the numerical solution is also bounded. We often need to restrict the timestep $h$ in order to ensure this. To make this discussion concrete, consider a system whose exact solution is bounded and define:

(1) a stable numerical scheme: one which does not blow up for any $h$;
(2) an unstable numerical scheme: one which blows up for any $h$; and
(3) a conditionally stable numerical scheme: one which blows up for some $h$.

Stability of the explicit Euler method

Applying the explicit Euler method to the model problem, we see that

$$y_{n+1} = y_n + \lambda h\, y_n = (1 + \lambda h)\, y_n.$$

Thus, assuming constant $h$, the solution at time step $n$ is

$$y_n = (1 + \lambda h)^n\, y_0 = \sigma^n y_0 \qquad \text{with} \qquad \sigma = 1 + \lambda h.$$

For large $n$, the numerical solution remains stable iff

$$|\sigma| \le 1 \quad\Leftrightarrow\quad (1 + \lambda_R h)^2 + (\lambda_I h)^2 \le 1.$$

The region of the complex plane which satisfies this stability constraint is shown in the figure below. Note that this region of stability in the complex plane $\lambda h$ is consistent with the numerical simulations shown earlier: for real, negative $\lambda$, this numerical method is conditionally stable (i.e., it is stable for sufficiently small $h$), whereas for pure imaginary $\lambda$, this numerical method is unstable for any $h$, though the instability is mild for small $h$.

[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using the explicit Euler method.]


Stability of the implicit Euler method

Applying the implicit Euler method to the model problem, we see that

$$y_{n+1} = y_n + \lambda h\, y_{n+1} \quad\Rightarrow\quad y_{n+1} = (1 - \lambda h)^{-1} y_n.$$

Thus, assuming constant $h$, the solution at time step $n$ is

$$y_n = \Big(\frac{1}{1 - \lambda h}\Big)^n y_0 = \sigma^n y_0 \qquad \text{with} \qquad \sigma = \frac{1}{1 - \lambda h}.$$

For large $n$, the numerical solution remains stable iff

$$|\sigma| \le 1 \quad\Leftrightarrow\quad (1 - \lambda_R h)^2 + (\lambda_I h)^2 \ge 1.$$

The region of the complex plane which satisfies this stability constraint is shown in the figure below. Note that this region of stability in the complex plane $\lambda h$ is consistent with the numerical simulations shown earlier: this method is stable for any stable ODE for any $h$, and is even stable for some cases in which the ODE itself is unstable.

[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using the implicit Euler method.]

Stability of the trapezoidal method

Applying the trapezoidal method to the model problem, we see that

$$y_{n+1} = y_n + \frac{\lambda h}{2}(y_n + y_{n+1}) \quad\Rightarrow\quad y_{n+1} = \frac{1 + \lambda h/2}{1 - \lambda h/2}\, y_n.$$

Thus, assuming constant $h$, the solution at time step $n$ is

$$y_n = \Big(\frac{1 + \lambda h/2}{1 - \lambda h/2}\Big)^n y_0 = \sigma^n y_0 \qquad \text{with} \qquad \sigma = \frac{1 + \lambda h/2}{1 - \lambda h/2}.$$

For large $n$, the numerical solution remains stable iff

$$|\sigma| \le 1 \quad\Leftrightarrow\quad \Re(\lambda h) \le 0.$$


The region of the complex plane which satisfies this stability constraint coincides exactly with the region of stability of the exact solution, as shown in the figure below. Note that this region of stability in the complex plane $\lambda h$ is consistent with the numerical simulations shown earlier, which are stable for systems with $\Re(\lambda) < 0$ and marginally stable for systems with $\Re(\lambda) = 0$.

[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using the trapezoidal method.]
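The stability boundaries shown in the preceding figures may be traced numerically by evaluating $|\sigma|$ on a grid in the complex plane $\lambda h$ and contouring at the level $|\sigma| = 1$, as in this sketch:

[x, y] = meshgrid(-4:0.05:2, -3:0.05:3);
lh = x + 1i*y;                            % grid of lambda*h values
sigEE = abs(1 + lh);                      % explicit Euler
sigIE = abs(1 ./ (1 - lh));               % implicit Euler
sigTR = abs((1 + lh/2) ./ (1 - lh/2));    % trapezoidal
contour(x, y, sigEE, [1 1]), hold on      % each curve is |sigma| = 1
contour(x, y, sigIE, [1 1])
contour(x, y, sigTR, [1 1]), grid on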

Accuracy

Revisiting the model problem $y' = \lambda y$, the exact solution (assuming $t_0 = 0$ and $h$ = constant) is

$$y(t_n) = e^{\lambda t_n} y_0 = (e^{\lambda h})^n y_0 = \Big(1 + \lambda h + \frac{\lambda^2 h^2}{2} + \frac{\lambda^3 h^3}{6} + \cdots\Big)^n y_0.$$

On the other hand, solving the model problem with explicit Euler led to

$$y_n = (1 + \lambda h)^n\, y_0 = \sigma^n y_0;$$

solving the model problem with implicit Euler led to

$$y_n = \Big(\frac{1}{1 - \lambda h}\Big)^n y_0 = \big(1 + \lambda h + \lambda^2 h^2 + \lambda^3 h^3 + \cdots\big)^n y_0 = \sigma^n y_0;$$

and solving the model problem with the trapezoidal method led to

$$y_n = \Big(\frac{1 + \lambda h/2}{1 - \lambda h/2}\Big)^n y_0 = \Big(1 + \lambda h + \frac{\lambda^2 h^2}{2} + \frac{\lambda^3 h^3}{4} + \cdots\Big)^n y_0 = \sigma^n y_0.$$

To quantify the accuracy of these three methods, we can compare the amplification factor $\sigma$ in each of the numerical approximations to the exact value $e^{\lambda h}$. The leading-order error of the explicit Euler and implicit Euler methods is seen to be proportional to $h^2$, as noted earlier, and the leading-order error of the trapezoidal method is proportional to $h^3$. Thus, over a specified time interval $[t_0, t_0 + T]$, explicit Euler and implicit Euler are first-order accurate and the trapezoidal method is second-order accurate. The higher order of accuracy of the trapezoidal method implies an improved rate of convergence of this scheme to the exact solution as the timestep $h$ is refined, as observed in the simulations shown earlier.
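These convergence rates are easy to verify numerically by comparing each amplification factor with $e^{\lambda h}$ as $h$ is refined; in the following sketch (with an illustrative value of lambda), the three columns of errors should shrink by factors of about 4, 4, and 8, respectively, each time $h$ is halved.

lambda = -1;
for h = [0.1 0.05 0.025]
  ex  = exp(lambda*h);
  err = [abs(1 + lambda*h - ex), ...                    % explicit Euler
         abs(1/(1 - lambda*h) - ex), ...                % implicit Euler
         abs((1 + lambda*h/2)/(1 - lambda*h/2) - ex)]   % trapezoidal
end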


Runge-Kutta methods

An important class of explicit methods, called Runge-Kutta methods, is given by the general form

$$\begin{aligned}
k_1 &= f(y_n,\, t_n), \\
k_2 &= f(y_n + \beta_{21} h\, k_1,\; t_n + \alpha_2 h), \\
k_3 &= f(y_n + \beta_{31} h\, k_1 + \beta_{32} h\, k_2,\; t_n + \alpha_3 h), \\
&\;\;\vdots \\
y_{n+1} &= y_n + \gamma_1 h\, k_1 + \gamma_2 h\, k_2 + \gamma_3 h\, k_3 + \cdots,
\end{aligned}$$

where the constants $\alpha_i$, $\beta_{ij}$, and $\gamma_i$ are selected to match as many terms as possible of the exact solution

$$y(t_{n+1}) = y(t_n) + h\, y'(t_n) + \frac{h^2}{2}\, y''(t_n) + \frac{h^3}{6}\, y'''(t_n) + \cdots,$$

where

$$y' = f, \qquad y'' = f_t + f f_y, \qquad y''' = f_{tt} + f_t f_y + 2 f f_{yt} + f_y^2 f + f^2 f_{yy},$$

etc. Runge-Kutta methods are explicit and "self-starting", as they don't require any information about the numerical approximation of the solution before time $t_n$; this typically makes them quite easy to use. As the number of intermediate steps $k_i$ in the Runge-Kutta method is increased, the order of accuracy of the method can also be increased. The stability properties of higher-order Runge-Kutta methods are also generally quite favorable, as will be shown.

The class of second-order Runge-Kutta methods (RK2)

Consider first the family of two-step schemes of the above form:

$$\begin{aligned}
k_1 &= f(y_n, t_n), \\
k_2 &= f(y_n + \beta h\, k_1,\; t_n + \alpha h) \\
    &\approx f(y_n, t_n) + f_y(y_n, t_n)\, \beta h\, f(y_n, t_n) + f_t(y_n, t_n)\, \alpha h, \\
y_{n+1} &= y_n + \gamma_1 h\, k_1 + \gamma_2 h\, k_2 \\
    &\approx y_n + (\gamma_1 + \gamma_2)\, h\, f(y_n, t_n) + \gamma_2 \beta h^2\, f_y(y_n, t_n)\, f(y_n, t_n) + \gamma_2 \alpha h^2\, f_t(y_n, t_n).
\end{aligned}$$

Note that the approximations given above are exact if $f$ is linear in $y$ and $t$, as it is in our model problem. The exact solution we seek to match with this scheme is given by

$$y(t_{n+1}) = y(t_n) + h\, f(y_n, t_n) + \frac{h^2}{2}\big(f_t(y_n, t_n) + f(y_n, t_n)\, f_y(y_n, t_n)\big) + \cdots$$


Matching coefficients to as high an order as possible, we require that

$$\gamma_1 + \gamma_2 = 1, \qquad \gamma_2 \beta h^2 = \frac{h^2}{2}, \qquad \gamma_2 \alpha h^2 = \frac{h^2}{2} \quad\Rightarrow\quad \beta = \alpha, \qquad \gamma_2 = \frac{1}{2\alpha}, \qquad \gamma_1 = 1 - \frac{1}{2\alpha}.$$

Thus, the general form of the two-step second-order Runge-Kutta method (RK2) is

$$\begin{aligned}
k_1 &= f(y_n, t_n), \\
k_2 &= f(y_n + \alpha h\, k_1,\; t_n + \alpha h), \\
y_{n+1} &= y_n + \Big(1 - \frac{1}{2\alpha}\Big) h\, k_1 + \frac{1}{2\alpha}\, h\, k_2,
\end{aligned}$$

where $\alpha$ is a free parameter. A popular choice is $\alpha = 1/2$, which is known as the midpoint method and has a clear geometric interpretation of approximating a central difference formula in the integration of the ODE from $t_n$ to $t_{n+1}$. Another popular choice is $\alpha = 1$, which is equivalent to perhaps the most common so-called "predictor-corrector" scheme, and may be computed in the following order:

$$\text{predictor:}\quad y^*_{n+1} = y_n + h\, f(y_n, t_n), \qquad \text{corrector:}\quad y_{n+1} = y_n + \frac{h}{2}\big[f(y_n, t_n) + f(y^*_{n+1}, t_{n+1})\big].$$

The "predictor" (which is simply an explicit Euler estimate of $y_{n+1}$) is only "stepwise 2nd-order accurate". However, as shown below, calculation of the "corrector" (which looks roughly like a recalculation of $y_{n+1}$ with a trapezoidal rule) results in a value for $y_{n+1}$ which is "stepwise 3rd-order accurate" (and thus the scheme is globally 2nd-order accurate).
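In code, one such predictor-corrector step takes only a few lines; a minimal sketch for a generic right-hand side supplied as a function handle (the function name is an illustrative choice):

function y = rk2step(f, y, t, h)
% RK2STEP  one step of the alpha = 1 RK2 (predictor-corrector) scheme.
ystar = y + h*f(y, t);                       % predictor: explicit Euler
y = y + (h/2)*(f(y, t) + f(ystar, t + h));   % corrector: trapezoidal-like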

Applying an RK2 method (for any value of the free parameter $\alpha$) to the model problem $y' = \lambda y$ yields

$$y_{n+1} = y_n + \Big(1 - \frac{1}{2\alpha}\Big) h\, \lambda y_n + \frac{1}{2\alpha}\, h\, \lambda (1 + \alpha \lambda h)\, y_n = \Big(1 + \lambda h + \frac{\lambda^2 h^2}{2}\Big) y_n = \sigma\, y_n \qquad \text{with} \qquad \sigma = 1 + \lambda h + \frac{\lambda^2 h^2}{2}.$$

The amplification factor $\sigma$ is seen to be a truncation of the Taylor series of the exact value $e^{\lambda h} = 1 + \lambda h + \frac{\lambda^2 h^2}{2} + \frac{\lambda^3 h^3}{6} + \cdots$. We thus see that the leading-order error of this method (for any value of $\alpha$) is proportional to $h^3$ and, over a specified time interval $[t_0, t_0 + T]$, an RK2 method is second-order accurate. Over a large number of timesteps, the method is stable iff $|\sigma| \le 1$; the domain of stability of this method is illustrated in the figure below.


[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using RK2.]

A popular fourth-order Runge-Kutta method (RK4)

The most popular fourth-order Runge-Kutta method is

$$\begin{aligned}
k_1 &= f(y_n,\, t_n), \\
k_2 &= f\Big(y_n + \frac{h}{2} k_1,\; t_{n+1/2}\Big), \\
k_3 &= f\Big(y_n + \frac{h}{2} k_2,\; t_{n+1/2}\Big), \\
k_4 &= f(y_n + h\, k_3,\; t_{n+1}), \\
y_{n+1} &= y_n + \frac{h}{6} k_1 + \frac{h}{3}(k_2 + k_3) + \frac{h}{6} k_4.
\end{aligned}$$

This scheme usually performs very well, and is the workhorse of many ODE solvers. This particular RK4 scheme also has a reasonably clear geometric interpretation, as discussed further in class.
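A minimal sketch of an RK4 march over an interval of length T, for a scalar right-hand side supplied as a function handle; the function name and argument list are illustrative choices.

function [t, y] = rk4march(f, y0, t0, T, N)
% RK4MARCH  march dy/dt = f(y,t) from t0 to t0+T in N steps of RK4.
h = T/N; t = t0 + (0:N)*h;
y = zeros(1, N+1); y(1) = y0;
for n = 1:N
  k1 = f(y(n),          t(n));
  k2 = f(y(n) + h/2*k1, t(n) + h/2);
  k3 = f(y(n) + h/2*k2, t(n) + h/2);
  k4 = f(y(n) + h*k3,   t(n) + h);
  y(n+1) = y(n) + h/6*k1 + h/3*(k2 + k3) + h/6*k4;
end

For example, [t, y] = rk4march(@(y,t) -y, 1, 0, 15, 100) marches the model problem with lambda = -1.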

A derivation similar to that in the previous section confirms that the constants chosen above indeed provide fourth-order accuracy, with the $\sigma$-$\lambda$ relationship again given by a truncated Taylor series of the exact value:

$$\sigma = 1 + \lambda h + \frac{\lambda^2 h^2}{2} + \frac{\lambda^3 h^3}{6} + \frac{\lambda^4 h^4}{24}.$$

Over a large number of timesteps, the method is stable iff $|\sigma| \le 1$; the domain of stability of this method is illustrated in the figure below.


[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using RK4.]

An adaptive Runge-Kutta method (RKM4)

Another popular fourth-order scheme, known as the Runge-Kutta-Merson method, is

$$\begin{aligned}
k_1 &= f(y_n,\, t_n), \\
k_2 &= f\Big(y_n + \frac{h}{3} k_1,\; t_n + \frac{h}{3}\Big), \\
k_3 &= f\Big(y_n + \frac{h}{6}(k_1 + k_2),\; t_n + \frac{h}{3}\Big), \\
k_4 &= f\Big(y_n + \frac{h}{8}(k_1 + 3 k_3),\; t_n + \frac{h}{2}\Big), \\
y^*_{n+1} &= y_n + \frac{h}{2} k_1 - \frac{3h}{2} k_3 + 2h\, k_4, \\
k_5 &= f(y^*_{n+1},\, t_{n+1}), \\
y_{n+1} &= y_n + \frac{h}{6} k_1 + \frac{2h}{3} k_4 + \frac{h}{6} k_5.
\end{aligned}$$

Note that one extra computation of $f$ is required in this method as compared with the RK4 method given above. With the same sort of analysis as we did for RK2, it may be shown that both $y^*_{n+1}$ and $y_{n+1}$ are "stepwise 5th-order accurate", meaning that using either to advance in time over a given interval gives global 4th-order accuracy. In fact, if $\hat y(t)$ is the exact solution to an ODE and $y_n$ takes this exact value $\hat y(t_n)$ at $t = t_n$, then it follows after a bit of analysis that the errors in $y^*_{n+1}$ and $y_{n+1}$ are

$$y^*_{n+1} - \hat y(t_{n+1}) = -\frac{h^5}{120}\, \hat y^{(v)} + O(h^6),$$

$$y_{n+1} - \hat y(t_{n+1}) = -\frac{h^5}{720}\, \hat y^{(v)} + O(h^6).$$

Subtracting the first of these expressions from the second gives

$$y_{n+1} - y^*_{n+1} = \frac{h^5}{144}\, \hat y^{(v)} + O(h^6),$$


which may be substituted on the RHS of the expression for the error in $y_{n+1}$ to give

$$y_{n+1} - \hat y(t_{n+1}) = -\frac{1}{5}\,(y_{n+1} - y^*_{n+1}) + O(h^6).$$

The quantity on the LHS of this relation is the error of our current "best guess" for $y_{n+1}$. The first term on the RHS is something we can compute, even if we don't know the exact solution $\hat y(t)$. Thus, even if the exact solution $\hat y(t)$ is unknown, we can still estimate the error of our best guess of $y_{n+1}$ with quantities which we have computed. We may use this estimate to decide whether or not to refine or coarsen the stepsize $h$ to attain a desired degree of accuracy on the entire interval. As with the procedure of adaptive quadrature, it is straightforward to determine whether or not the error on any particular step is small enough such that, when the entire (global) error is added up, it will be within a predefined acceptable level of tolerance.
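A minimal sketch of one such adaptive step follows; the stepsize halving and doubling rules and their thresholds are illustrative choices, and production codes use more refined stepsize logic.

function [y, t, h] = rkm4step(f, y, t, h, tol)
% RKM4STEP  one adaptive Runge-Kutta-Merson step; tol is the error
% allowed per unit time, so tol*h is the budget for this step.
while 1
  k1 = f(y, t);
  k2 = f(y + h/3*k1,          t + h/3);
  k3 = f(y + h/6*(k1 + k2),   t + h/3);
  k4 = f(y + h/8*(k1 + 3*k3), t + h/2);
  ystar = y + h/2*k1 - 3*h/2*k3 + 2*h*k4;
  k5 = f(ystar, t + h);
  ynew = y + h/6*k1 + 2*h/3*k4 + h/6*k5;
  err = abs(ynew - ystar)/5;        % estimate of the error in ynew
  if err <= tol*h, break, end       % accept the step...
  h = h/2;                          % ...or refine and retry
end
y = ynew; t = t + h;
if err < tol*h/32, h = 2*h; end     % coarsen when comfortably accurate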

A low-storage Runge-Kutta method (RKW3)

Amongst people doing very large simulations with specialized solvers, a third-order scheme which is rapidly gaining popularity, known as the Runge-Kutta-Wray method, is

$$\begin{aligned}
k_1 &= f(y_n,\, t_n), \\
k_2 &= f(y_n + \beta_{21} h\, k_1,\; t_n + \alpha_2 h), \\
k_3 &= f(y_n + \beta_{31} h\, k_1 + \beta_{32} h\, k_2,\; t_n + \alpha_3 h), \\
y_{n+1} &= y_n + \gamma_1 h\, k_1 + \gamma_2 h\, k_2 + \gamma_3 h\, k_3,
\end{aligned}$$

where

$$\alpha_2 = \frac{8}{15}, \quad \alpha_3 = \frac{2}{3}, \qquad \beta_{21} = \frac{8}{15}, \quad \beta_{31} = \frac{1}{4}, \quad \beta_{32} = \frac{5}{12}, \qquad \gamma_1 = \frac{1}{4}, \quad \gamma_2 = 0, \quad \gamma_3 = \frac{3}{4}.$$

[Figure: Stability of the numerical solution to $y' = \lambda y$ in the complex plane $\lambda h$ using third-order Runge-Kutta.]


A derivation similar to that in the RK2 section above confirms that the constants chosen here indeed provide third-order accuracy, with the $\sigma$-$\lambda$ relationship again given by a truncated Taylor series of the exact value:

$$\sigma = 1 + \lambda h + \frac{\lambda^2 h^2}{2} + \frac{\lambda^3 h^3}{6}.$$

Over a large number of timesteps, the method is stable iff $|\sigma| \le 1$; the domain of stability of this method is illustrated in the figure above. This scheme is particularly useful because it may be computed in the following order:

$$\begin{aligned}
k_1 &= f(y_n,\, t_n) && \\
y^* &= y_n + \beta_{21} h\, k_1 && \text{(overwrite } y_n\text{)} \\
y^{**} &= y^* + \zeta_1 h\, k_1 && \text{(overwrite } k_1\text{)} \\
k_2 &= f(y^*,\; t_n + \alpha_2 h) && \text{(overwrite } y^*\text{)} \\
y^{**} &= y^{**} + \beta_{32} h\, k_2 && \text{(update } y^{**}\text{)} \\
y_{n+1} &= y^{**} + \zeta_2 h\, k_2 && \text{(overwrite } k_2\text{)} \\
k_3 &= f(y^{**},\; t_n + \alpha_3 h) && \text{(overwrite } y^{**}\text{)} \\
y_{n+1} &= y_{n+1} + \gamma_3 h\, k_3 && \text{(update } y_{n+1}\text{)}
\end{aligned}$$

with $\zeta_1 = -\frac{17}{60}$ and $\zeta_2 = -\frac{5}{12}$. Verification that these two forms of the RKW3 scheme are identical is easily obtained by substitution. The above ordering of the RKW3 scheme is useful when the dimension $N$ of the vector $y$ is huge. In the above scheme, we first compute the vector $k_1$ (also of dimension $N$), as we do, for example, for explicit Euler. However, every operation that follows either updates an existing vector or may overwrite the memory of an existing vector! (Pointers are very useful to set up multiple variables at the same memory location.) Thus, we only need enough room in the computer for two vectors of length $N$, not the four that one might expect are needed upon examination of the scheme as first written. Amazingly, though this scheme is third-order accurate, it requires no more memory than explicit Euler. Note that one has to be very careful when computing an operation like $f(y^*)$ in place (i.e., in a manner which immediately overwrites $y^*$) when the function $f$ is a complicated nonlinear function. A two-register sketch of this ordering is given below.
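In Matlab-style pseudocode, the ordering above maps directly onto two stored arrays; in the following sketch, r1 and r2 play the roles of the two vectors of length N (the function name is illustrative, and f is assumed to return the derivative of the full state vector):

function y = rkw3step(f, y, t, h)
% RKW3STEP  one low-storage RKW3 step using only two registers.
r2 = f(y, t);                 % r2 holds k1
r1 = y + (8/15)*h*r2;         % r1 holds y*    (y_n overwritten)
r2 = r1 - (17/60)*h*r2;       % r2 holds y**   (k1 overwritten)
r1 = f(r1, t + (8/15)*h);     % r1 holds k2    (y* overwritten)
r2 = r2 + (5/12)*h*r1;        % y** updated
r1 = r2 - (5/12)*h*r1;        % r1 holds partial y_{n+1} (k2 overwritten)
r2 = f(r2, t + (2/3)*h);      % r2 holds k3    (y** overwritten)
y  = r1 + (3/4)*h*r2;         % y_{n+1}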

Low-storage schemes such as this are essential in both computational mechanics (e.g., the finite element modeling of the stresses and temperatures of an engine block) and computational fluid dynamics (e.g., the finite-difference simulation of the turbulent flow in a jet engine). In these and many other problems of engineering interest, accurate discretization of the governing PDE necessitates big state vectors and efficient numerical methods. It is my hope that, in the past 10 weeks, you have begun to get a flavor for how such numerical techniques can be derived and implemented.


Appendix A

Getting started with Matlab

A.1 What is Matlab?

Matlab, short for Matrix Laboratory, is a high-level language excellent for, among many other things, linear algebra, data analysis, two- and three-dimensional graphics, and numerical computation of small-scale systems. In addition, extensive "toolboxes" are available which contain useful routines for a wide variety of disciplines, including control design, system identification, optimization, signal processing, and adaptive filters.

In short, Matlab is a very rapid way to solve problems which don't require intensive computations, or to create plots of functions or data sets. Because a lot of the problems encountered in the engineering world are fairly small, Matlab is an important tool for the engineer to have in his or her arsenal. Matlab is also useful as a simple programming language in which one can experiment (on small problems) with efficient numerical algorithms (designed for big problems) in a user-friendly environment. Problems which do require intensive computations, such as those often encountered in industry, are more efficiently solved in other languages, such as Fortran 90.

A.2 Where to find Matlab

For those who own a personal computer, the "student edition" of Matlab is available at the bookstore for just over what it costs to obtain the manual, which is included with the program itself. This is a bargain price (less than $100) for excellent software, and it is highly recommended that you get your own (legal) copy of it to take with you when you leave UCSD.

For those who don't own a personal computer, or if you choose not to make this investment, you will find that Matlab is available on a wide variety of platforms around campus. The most stable versions are on various Unix machines around campus, including the ACS Unix boxes in the student labs (accessible by undergrads) and most of the Unix machines in the research groups of the MAE department. Unix machines running Matlab can be remotely accessed with any computer running X-windows on campus. There are also several Macs and PCs in the department (including those run by ACS) with Matlab already loaded.

Footnote: The student edition is actually an almost full-fledged version of the latest release of the professional version of Matlab, limited only in its printing capabilities and the maximum matrix size allowed. Note that the use of m-files (described at the end of this chapter) allows you to rerun finished Matlab codes at high resolution on the larger Unix machines to get report-quality printouts, if so desired, after debugging your code on your PC.


A.3 How to start Matlab

Running Matlab is the same as running any software on the respective platform. For example, on a Mac or PC, just find the Matlab icon and double click on it. Instructions on how to open an account on one of the ACS Unix machines will be discussed in class. To run Matlab once logged in to one of these machines, just type matlab. The PATH variable should be set correctly so it will start on the first try; if it doesn't, that machine probably doesn't have Matlab. If running Matlab remotely, the DISPLAY variable must be set to the local computer before starting Matlab for the subsequent plots to appear on your screen. For example, if you are sitting at a machine named turbulence, you need to:

A. log in to turbulence;

B. open a window and telnet to an ACS machine on which you have an account:

telnet iacs5

C. once logged in, set the DISPLAY environmental variable, with

setenv DISPLAY turbulence:0.0

D. run Matlab, with

matlab

If you purchase the student edition, of course, full instructions on how to install the program will be included in the manual.

A.4 How to run Matlab: the basics

Now that you have found Matlab and gotten it running, you are probably on a roll and will stop reading the manual and this handout (you are, after all, an engineer!). The following section is designed to give you just enough knowledge of the language to be dangerous. From there, you can:

A. read the manual, available with the student edition or on its own at the bookstore, or

B. use the online help, by typing help <command name>, for a fairly thorough description of how any particular command works. Don't be timid about using this approach, as the online help in Matlab is very extensive.

To begin with, Matlab can function as an ordinary (but expensive!) calculator. At the >> prompt, try typing

>> 2+2

Matlab will reassure you that the universe is still in good order:

ans =

     4

To enter a matrix, type

>> A = [1 2 3; 4 5 6; 7 8 10]


Matlab responds with

A =
     1     2     3
     4     5     6
     7     8    10

By default, Matlab operates in an echo mode; that is, the elements of a matrix or vector will be printed out as it is created. This may become tedious for large operations; to suppress it, type a semicolon after entering commands, such as:

>> A = [1 2 3; 4 5 6; 7 8 10];

Matrix elements are separated by spaces or commas, and a semicolon indicates the end of a row. Three periods in a row means that the present command is continued on the following line, as in:

>> A = [1 2 3; ...
4 5 6; ...
7 8 10];

When typing in a long expression, this can greatly improve the legibility of what you just typed. Elements of a matrix can also be arithmetic expressions, such as 3*pi, 2*sqrt(3), etc. A column vector may be constructed with

>> y = [2; 1; 4]

resulting in

y =
     2
     1
     4

To automatically build a row vector, a statement such as

>> z = [0:2:10]

results in

z =
     0     2     4     6     8    10

Vector y can be premultiplied by matrix A and the result stored in vector z with

>> z = A*y

Multiplication of a vector by a scalar may be accomplished with

>> z = 2*y

Matrix A may be transposed as

>> C = A'

The inverse of a matrix is obtained by typing

>> D = inv(A)

A 3 x 3 identity matrix may be constructed with

>> E = eye(3)

Tridiagonal matrices may be constructed by the following command and variations thereof:

>> diag(ones(m-1,1),-1) - 2*diag(ones(m,1),0) + diag(ones(m-1,1),1)


There are two "matrix division" symbols in Matlab, \ and /; if A is a nonsingular square matrix, then A\B and B/A correspond formally to left and right multiplication of B (which must be of the appropriate size) by the inverse of A, that is, inv(A)*B and B*inv(A), but the result is obtained directly (via Gaussian elimination with partial pivoting) without the computation of the inverse (which is a very expensive computation). Thus, to solve a system Ax = b for the unknown vector x, type

>> A = [1 2 3; 4 5 6; 7 8 10];
>> b = [1; 2; 3];
>> x = A\b

which results in

x =
   -0.3333
    0.6667
         0

To check this result, just type

>> A*x

which verifies that

ans =
    1.0000
    2.0000
    3.0000

Starting with the innermost group of operations nested in parentheses and working outward, the usual precedence rules are observed by Matlab. First, all the exponentials are calculated. Then, all the multiplications and divisions are calculated. Finally, all the additions and subtractions are calculated. In each of these three categories, the calculation proceeds from left to right through the expression. Thus

>> 2^3/4

ans =

     2

and

>> 2^(3/4)

ans =

    1.6818

It is best to make frequent use of parentheses to ensure the order of operations is as you intend. Note that the matrix sizes must be correct for the requested operations to be defined or an error will result. Suppose we have two vectors x and y and we wish to perform the componentwise operation z_i = x_i * y_i for i = 1, ..., n. The Matlab command to execute this is

>> x = [1 2 3];
>> y = [4 5 6];
>> z = x.*y


Note that z = x*y will result in an error, since this implies matrix multiplication and is undefined in this case. (A row vector times a column vector, however, is a well-defined operation, so z = x*y' is successful. Try it!) The period distinguishes the matrix operations *, ^, and / from the componentwise operations .*, .^, and ./. One of the most common bugs when starting out with Matlab is defining and using row vectors where column vectors are in fact needed.

Matlab also has control flow statements, similar to many programming languages:

>> for i = 1:6, x(i) = i; end, x

creates:

x =
     1     2     3     4     5     6

Each for must be matched by an end. Note in this example that commas are used to include several commands on a single line, and the trailing x is used to write the final value of x to the screen. Note also that this for loop builds a row vector, not a column vector. An if statement may be used as follows:

>> n = 10;
>> if n > 0, j = 1; elseif n == 0, j = 0; else j = -1; end

The format of a while statement is similar to that of the for, but exits at the control of a logical condition:

>> m = 10;
>> while m > 1, m = m/2; end, m

Matlab has several advanced matrix functions which will become useful to you as your knowledge of linear algebra grows. These are summarized in the following section; don't be concerned if most of these functions are unfamiliar to you now. One command which we will encounter early in the quarter is LU decomposition:

>> A = [1 2 3; 4 5 6; 7 8 10];
>> [L,U,P] = lu(A)

This results in a lower triangular matrix L, an upper triangular matrix U, and a permutation matrix P such that PA = LU:

L =
    1.0000         0         0
    0.1429    1.0000         0
    0.5714    0.5000    1.0000

U =
    7.0000    8.0000   10.0000
         0    0.8571    1.5714
         0         0   -0.5000

P =
     0     0     1
     1     0     0
     0     1     0


Checking this with

>> P'*L*U

confirms that the original matrix A is recovered.

A.5 Commands for matrix factoring and decomposition

The following commands are but a small subset of what Matlab has to offer:

>> R = chol(X)

produces an upper triangular R so that R'*R = X. If X is not positive definite, an error message is printed.

>> [V,D] = eig(X)

produces a diagonal matrix D of eigenvalues and a full matrix V whose columns are the corresponding eigenvectors so that X*V = V*D.

>> p = poly(A)

If A is an N by N matrix, poly(A) is a row vector with N+1 elements which are the coefficients of the characteristic polynomial, det(lambda*eye(size(A)) - A). If A is a vector, poly(A) is a vector whose elements are the coefficients of the polynomial whose roots are the elements of A.

>> [P,H] = hess(A)

produces a unitary matrix P and a Hessenberg matrix H so that A = P*H*P' and P'*P = eye(size(P)).

>> [L,U,P] = lu(X)

produces a lower triangular matrix L, an upper triangular matrix U, and a permutation matrix P so that P*X = L*U.

>> Q = orth(A)

produces an orthonormal basis for the range of A. Note that Q'*Q = I, the columns of Q span the same space as the columns of A, and the number of columns of Q is the rank of A.

>> [Q,R] = qr(X)

produces an upper triangular matrix R of the same dimension as X and a unitary matrix Q so that X = Q*R.

>> [U,S,V] = svd(X)

produces a diagonal matrix S, of the same dimension as X, with nonnegative diagonal elements in decreasing order, and unitary matrices U and V so that X = U*S*V'.


A.6 Commands used in plotting

For a two-dimensional plot, try for example:

>> x = [0:0.1:10];
>> y1 = sin(x);
>> y2 = cos(x);
>> plot(x,y1,'-',x,y2,'--')

This results in the plot:

[Figure: plot of sin(x) and cos(x) versus x on the interval [0, 10].]

Three-dimensional plots are also possible:

>> [x,y] = meshgrid(-8:.5:8, -8:.5:8);
>> R = sqrt(x.^2 + y.^2) + eps;
>> Z = sin(R)./R;
>> mesh(Z)

This results in the plot:

[Figure: mesh plot of the surface Z = sin(R)./R.]

Axis rescaling and labelling can be controlled with loglog, semilogx, semilogy, title, xlabel, and ylabel; see the help pages for more information.


A.7 Other Matlab commands

Typing who lists all the variables created up to that point. Typing whos gives detailed information showing sizes of arrays and vectors. In order to save the entire variable set computed in a session, type save session prior to quitting. All variables will be saved in a file named session.mat. At a later time the session may be resumed by typing load session. Save and load commands are menu-driven on the Mac and PC.

Some additional useful functions, which are for the most part self-explanatory (check the help page if not), are: abs, conj, sin, cos, tan, asin, acos, atan, sinh, cosh, tanh, exp, log, log10, eps.

Matlab also has many special functions built in to aid in linear problem-solving; an index of these along with their usage is in the Matlab manual. Note that many of the 'built-in' functions are just prewritten m-files stored in subdirectories, and can easily be opened and accessed by the user for examination.

A.8 Hardcopies

Hardcopies of graphs on the Unix stations are best achieved using the print command. This creates postscript files which may then be sent directly to a postscript printer using the local print command (usually lp or lpr) at the Unix level. If using the -deps option of the print command in Matlab, then the encapsulated postscript file (e.g., figure1.eps) may be included in TeX documents as done here.

Hardcopies of graphs on a Mac may be obtained from the options available under the 'File' menu. On the student edition for the PC, the prtscr command may be used for a screen dump.

Text hardcopy on all platforms is best achieved by copying the text in the Matlab window, pasting it into the editor of your choosing, and then printing from there.

A.9 Matlab programming procedures: m-files

In addition to the interactive, or "Command", mode, you can execute a series of Matlab commands that are stored in 'm-files', which have a file extension '.m'. An example m-file is shown in the following section. When working on more complicated problems, it is a very good idea to work from m-files so that set-up commands don't need to be retyped repeatedly.

Any text editor may be used to generate m-files. It is important to note that these should be plain ASCII text files; type in the sequence of commands exactly as you would if running interactively. The symbol % is used in these files to indicate that the rest of that line is a comment; comment all m-files clearly, sufficiently, and succinctly, so that you can come back to the code later and understand how it works. To execute the commands in an m-file called sample.m, simply type sample at the Matlab >> prompt. Note that an m-file may call other m-files. Note also that there are two types of m-files: scripts and functions. A script is just a list of Matlab commands which run just as if you had typed them in one at a time (though it sounds pedestrian, this simple method is a useful and straightforward mode of operation in Matlab). A function is a set of commands that is "called" (as in a real programming language), only inheriting those variables in the argument list with which it is called and only returning the variables specified in the function declaration. These two types of m-files will be illustrated thoroughly by example as the course proceeds.

On Unix machines, Matlab should be started from the directory containing your m-files so that Matlab can find these files. On all platforms, I recommend making a new directory for each new problem you work on, to keep things organized and to keep from overwriting previously-written


m-files. On the Mac and the PC, the search path must be updated using the path command after starting Matlab so that Matlab can find your m-files (see the help page for details).

A.10 Sample m-file

% sample.m

echo on, clc

% This code uses MATLAB's eig program to find eigenvalues of
% a few random matrices which we will construct with special
% structure.

% press any key to begin
pause

R = rand(3)
eig(R)

% As R has random entries, it may have real or complex eigenvalues.
% press any key
pause

eig(R + R')

% Notice that R + R' is symmetric, with real eigenvalues.
% press any key
pause

eig(R - R')

% The matrix R - R' is skew-symmetric, with imaginary eigenvalues.
% press any key
pause

[V,D] = eig(R' * R)

% This matrix R' * R is symmetric, with real POSITIVE eigenvalues.
% press any key
pause

[V,D] = eig(R * R')

% R * R' has the same eigenvalues as R' * R, but different eigenvectors.
% press any key
pause

% You can create matrices with desired eigenvalues (1,2,3,4) from any
% invertible S times lambda times S inverse.

S = rand(4,4);
lambda = diag([1 2 3 4]);
A = S * lambda * inv(S)
eig(A)

echo off

% end sample.m

