
The Pythagorean Theorem and Its Consequences

Jim Emery

Edited: 8/4/13

Contents

1 Pythagoras: Biographical Sketch
2 Eight Proofs of the Pythagorean Theorem
2.1 Proof I: Euclid’s Elements
2.2 Proof II: The Ascent of Man
2.3 Proof III: Garfield
2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
2.5 Proof V: An Arrangement of Four Triangles in a Square of side c
2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs
2.7 Proof VI: Equating Two Square Arrangements
2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared
2.9 Proof VIII: Similarity and Proportion
3 A Crisis in Greek Mathematics: What is a real number?
4 Inequalities
5 Euclidean Distance
6 Distance Functions and the Metric Space
7 Vector Spaces and Inner Product Spaces
8 Normed Linear Spaces
9 Normed Linear Spaces and Functional Analysis
10 Hilbert Space and $\ell^2$
11 Orthogonality, Orthogonal Polynomials, Fourier Series
12 Projections
13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem
14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting
15 A Geometric View of the Least Squares Problem
16 Bibliography

List of Figures

1 The Pythagorean Theorem
2 Proof I: Euclid’s Elements
3 Proof II: The Ascent of Man
4 Proof II: The Ascent of Man
5 Proof III: Garfield’s Proof
6 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
7 Proof V: An Arrangement of Four Triangles in a Square of side c
8 An Arrangement That Leads to a Geometric Proof
9 Proof VI: Equating Two Square Arrangements
10 Proof VII: Area Proportional to the Hypotenuse Squared
11 Proof VIII: Similarity and Proportion

1 Pythagoras: Biographical Sketch

Pythagoras proclaimed that "All is Number" (that is, all is Mathematics). Pythagoras was born in Samos about 570 BC and died about 495 BC. Knowledge about him is vague and uncertain. He is said to have related mathematics to music, believed in reincarnation, and founded a secret religion in southern Italy in the town of Croton, a Greek colony. Much that is said about him may be apocryphal. But perhaps he was the first to call himself a philosopher (lover of wisdom). Many later philosophers claimed to have been influenced by his ideas. The Pythagorean theorem itself may have originated in the cultures of the Babylonians and Indians, although he may have been the first to write down a formal proof of this theorem, earlier versions being folklore and tradition.

2 Eight Proofs of the Pythagorean Theorem

Most of the proofs are evident from the geometrical figures. Some proofs are algebraic, and many use the concepts of similar triangles and proportion.

2.1 Proof I: Euclid’s Elements

In our figure for Euclid’s proof, which appears in his work The Elements, two overlaid shaded triangles are congruent, and so have equal areas. Corresponding to each triangle is a rectangle of double its area. One such rectangle is a square on a side of the original right triangle. The other makes up a portion of the square on the hypotenuse. So suppose the triangle sides are $a$ and $b$ and the hypotenuse is $c$. Then $a^2$ is equal to the area of a sub-rectangle of the square on the hypotenuse $c$. Similarly $b^2$ is equal to the area of the remainder of the square on the hypotenuse. Thus

$$a^2 + b^2 = c^2.$$

2.2 Proof II: The Ascent of Man

Jacob Bronowski devotes several pages to a proof of the Pythagorean theorem in his book, The Ascent of Man, and in his television series. This occurs in the chapter called The Music of the Spheres and in an episode similarly titled in the television series. See the figures captioned The Ascent of Man.

Figure 1: The Pythagorean Theorem. The area of the square on the hypotenuse of a right triangle is equal to the sum of the areas of the squares on the sides: $a^2 + b^2 = c^2$, where here $a = 6$, $b = 8$, $c = 10$.

Figure 2: Proof I: Euclid’s Elements. The short side of the triangle is $a$, the long side is $b$ and the hypotenuse is $c$. The more darkly shaded triangle, rotated counterclockwise by 90 degrees, will fall exactly on the more lightly shaded triangle. So these two triangles are congruent. The line from the top vertex divides the square on the hypotenuse $c$ into a left rectangle $L$ and a right one $R$. The dark triangle has area $b^2/2$, because its base has length $b$, as does its height. The area of the lightly shaded triangle is 1/2 that of the right sub-rectangle $R$. Therefore the area of $R$ is $b^2$. Repeating the argument on the left side of the figure with two new triangles, we find the area of $L$ is $a^2$. Therefore $c^2 = a^2 + b^2$.

Figure 3: Proof II: The Ascent of Man. Jacob Bronowski in his book The Ascent of Man discusses this proof on pages 156-162. The book is based on the 1972 BBC television series of the same name. The small side of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the pieces of the left figure we get the right figure consisting of a small square of area $a^2$, and a larger composite square of area $b^2$. Therefore $a^2 + b^2 = c^2$.

Figure 4: Proof II: The Ascent of Man. Jacob Bronowski in his book The Ascent of Man discusses this proof on pages 156-162. The book is based on the 1972 BBC television series of the same name. The small side of the triangle is $a$, the long side $b$, the hypotenuse $c$. The area of the left figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the pieces of the left figure we get the right figure consisting of a square region of area $b^2$ (shaded region), and a square region of area $a^2$ (unshaded region). Therefore $a^2 + b^2 = c^2$.

2.3 Proof III: Garfield

James A. Garfield contributed an original proof of the Pythagorean theorem. Of course most proofs of this theorem are rather similar. I had heard about Garfield’s proof many times, but had not actually seen it. However, his proof is presented in the book: Welchons, Krickenberger, Pearson, Plane Geometry.

I graduated from James A. Garfield elementary school in Long Beach, California, a few years back, so I am closely connected to Garfield. Garfield was one of our assassinated presidents, a rather interesting person, an exception to our rather dull and dim-witted group of presidents in general. His assassin Charles Guiteau had a connection with the Oneida Community in Oneida, New York. This was a 19th century social experiment devoted to "free" love. For an interesting treatment of these matters see Sarah Vowell’s book Assassination Vacation. If you are not familiar with Sarah, her quirky personality and her squeaky voice, as heard on This American Life, you are really missing out.

Garfield’s proof uses two copies of the triangle, which has short side $a$, long side $b$, and hypotenuse $c$. We rest one copy on its short side $a$, the other on its long side $b$, so that the two triangles touch at a point. Then we add a line joining the top vertex of the first triangle to the top vertex of the second triangle, getting a trapezoid (see the Garfield figure). A trapezoid is a quadrilateral with two parallel opposite sides. The area of the trapezoid is the average length of its two parallel sides times the perpendicular distance between them (this can be shown by decomposing the trapezoid into two triangles by drawing a diagonal). So the area of the trapezoid is

$$A = (a + b)\,\frac{a + b}{2} = \frac{1}{2}(a^2 + 2ab + b^2) = \frac{a^2}{2} + ab + \frac{b^2}{2}.$$

On the other hand, writing the area as the sum of the areas of the three triangles, we have

$$A = \frac{ab}{2} + \frac{ab}{2} + \frac{c^2}{2} = ab + \frac{c^2}{2}.$$

The term $c^2/2$ is the area of the third triangle, which lies between the two copies and has a right angle between its two sides of length $c$.

Equating these two expressions for $A$, we obtain

$$a^2 + b^2 = c^2.$$

Figure 5: Proof III: Garfield’s Proof. The long side of the triangle is $b$, the short side $a$, the hypotenuse $c$. The area of the trapezoid is $A = (a + b)(a + b)/2 = a^2/2 + ab + b^2/2$. The area as the sum of the three triangles is $A = ab + c^2/2$. Equating the two expressions for $A$ we obtain the result $a^2 + b^2 = c^2$.

2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b

Consider the figure called An Arrangement of Four Triangles in a Square of side a + b. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = (a + b)^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.
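The equating step for Proof IV can be written out line by line. This is only the expansion of the square of $a + b$, using the symbols already defined in the text:

```latex
\begin{align*}
(a + b)^2 &= 2ab + c^2 \\
a^2 + 2ab + b^2 &= 2ab + c^2 \\
a^2 + b^2 &= c^2 .
\end{align*}
```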

2.5 Proof V: An Arrangement of Four Triangles in a Square of side c

Consider the figure called An Arrangement of Four Triangles in a Square of side c. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = c^2$. The area of the four triangles and the inner square, which has side $b - a$, is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.
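The simplification used in Proof V, where the cross terms cancel, can be spelled out as follows. Only the symbols of the text are used:

```latex
\begin{align*}
4\left(\frac{ab}{2}\right) + (b - a)^2
  &= 2ab + \left(b^2 - 2ab + a^2\right) \\
  &= a^2 + b^2 .
\end{align*}
```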

2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs

Euclid’s proof is purely geometric, with no reliance on algebra. The figure titled An Arrangement That Leads to a Geometric Proof leads to another purely geometric proof. Most of the other proofs involve a slight amount of algebra.

2.7 Proof VI: Equating Two Square Arrangements

Referring to the figure for Proof VI, the left enclosing square and the right enclosing square have the same area, and each contains four copies of the triangle in addition to its shaded squares. Therefore the sum of the areas of the two shaded squares on the left is equal to the area of the shaded square on the right. That is, the sum of the square on the short side of the triangle, plus the square on the long side of the triangle, is equal to the area of the square on the hypotenuse.

Figure 6: Proof IV: An Arrangement of Four Triangles in a Square of side a + b. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = (a + b)^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

Figure 7: Proof V: An Arrangement of Four Triangles in a Square of side c. The short side of the triangle is $a$, the long side $b$, and the hypotenuse $c$. The area of the enclosing square is $A = c^2$. The area of the four triangles and the inner square is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these two expressions for the area $A$ we have $a^2 + b^2 = c^2$.

Figure 8: An Arrangement That Leads to a Geometric Proof. By equating this square with a certain second square also of side $a + b$ we arrive at a clearly geometric proof of the Pythagorean Theorem that uses no algebra, and could have been employed by the Greeks, who did not have algebra available. They used number in their arguments, but to them numbers were line segment lengths. Through this means Euclid treated the concept of proportional numbers.

Figure 9: Proof VI: Equating Two Square Arrangements. The left enclosing square and the right enclosing square have the same area. Therefore the sum of the areas of the two shaded squares on the left is equal to the area of the shaded square on the right. That is, the sum of the square on the short side of the triangle, plus the square on the long side of the triangle, is equal to the area of the square on the hypotenuse.

2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared

See the figure for Proof VII. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. Similar right triangles have areas proportional to the squares of their hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have corresponding sides that are proportional, so the ratios of triangle sides are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular similar triangles have the same acute angles. Let one of them be $\theta$. Then the area of the triangle is $A = ab/2 = \frac{1}{2}\,c\cos(\theta)\,c\sin(\theta) = \alpha c^2$, where $\alpha = \cos(\theta)\sin(\theta)/2$. The vertical line, which is the altitude to the hypotenuse, divides the triangle into a left triangle and a right triangle, each similar to the original. The hypotenuse of the left sub-triangle is $a$, and that of the right one is $b$. Thus their areas are $\alpha a^2$ and $\alpha b^2$. The area of the original triangle is $\alpha c^2$, and it is the sum of the areas of the two sub-triangles. So $\alpha a^2 + \alpha b^2 = \alpha c^2$. Thus $a^2 + b^2 = c^2$.

2.9 Proof VIII: Similarity and Proportion

Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. The vertical line divides the triangle into two sub-triangles, both similar to the original. Referring to the figure, corresponding sides are proportional. The hypotenuse $c$ is divided into two segments, $c_1$ on the left and $c_2$ on the right. We have

$$\frac{a}{c_1} = \frac{c}{a},$$

so $a^2 = c_1 c$. Similarly

$$\frac{b}{c_2} = \frac{c}{b},$$

so $b^2 = c_2 c$. Then

$$a^2 + b^2 = (c_1 + c_2)\,c = c^2.$$
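The proportions in Proof VIII can be checked numerically on the 6, 8, 10 triangle of Figure 1. The sketch below is only a sanity check, not part of the proof: the function name `altitude_segments` is an illustrative assumption, and the hypotenuse is taken from `math.hypot`, which itself relies on the theorem.

```python
import math


def altitude_segments(a, b):
    """Return (c, c1, c2) for a right triangle with legs a and b.

    c is the hypotenuse; c1 and c2 are the two segments of c obtained
    from the proportions a/c1 = c/a and b/c2 = c/b used in the text.
    """
    c = math.hypot(a, b)
    c1 = a * a / c  # from a^2 = c1 * c
    c2 = b * b / c  # from b^2 = c2 * c
    return c, c1, c2


c, c1, c2 = altitude_segments(6.0, 8.0)
# The two segments should add up to the whole hypotenuse.
print(c, c1, c2, math.isclose(c1 + c2, c))
```

For these legs the segments come out as 3.6 and 6.4, which sum to the hypotenuse 10.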

Figure 10: Proof VII: Area Proportional to the Hypotenuse Squared. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. Similar right triangles have areas proportional to the squares of their hypotenuses, with the same proportionality constant, say $\alpha$. This follows because similar triangles have corresponding sides that are proportional, so the ratios of triangle sides are the same for similar triangles. This is established in Euclidean geometry, and is the basis of trigonometry. In particular similar triangles have the same acute angles. Let one of them be $\theta$. Then the area of the triangle is $A = ab/2 = \frac{1}{2}\,c\cos(\theta)\,c\sin(\theta) = \alpha c^2$, where $\alpha = \cos(\theta)\sin(\theta)/2$. The vertical line divides the triangle into two similar triangles, a left one and a right one. The hypotenuse of the left sub-triangle is $a$, and that of the right one is $b$. Thus their areas are $\alpha a^2$ and $\alpha b^2$. The area of the original triangle is $\alpha c^2$. So $\alpha a^2 + \alpha b^2 = \alpha c^2$. Thus $a^2 + b^2 = c^2$.

Figure 11: Proof VIII: Similarity and Proportion. Let the short side of the triangle be $a$, the long side $b$, and the hypotenuse $c$. The vertical line divides the triangle into two similar triangles. Corresponding sides are proportional. The hypotenuse $c$ is divided into two segments, $c_1$ on the left and $c_2$ on the right. We have $a/c_1 = c/a$, so $a^2 = c_1 c$. And $b/c_2 = c/b$, so $b^2 = c_2 c$. Then $a^2 + b^2 = (c_1 + c_2)c = c^2$.

3 A Crisis in Greek Mathematics: What is a real number?

For the Greeks, numbers were lengths of line segments. Fractions (rational numbers) are obtained by dividing line segments into equal pieces. They discovered that the diagonal of a square cannot be equal to any whole-number multiple of a fractional division of the side of the square. That is, the ratio of the diagonal to the side is not a rational number. This was a big problem for their concept of number!

We show that the square root of a prime number is not rational. Suppose the integer $p$ is a prime, having no factors other than 1 and itself. Suppose $\sqrt{p}$ could be written as a rational number, as a fraction say $m/n$, where $m$ and $n$ have no common factor (if they did, we could divide out the common factors). Then

$$\sqrt{p} = \frac{m}{n}.$$

Squaring we have

$$p = \frac{m^2}{n^2}.$$

Then

$$n^2 p = m^2.$$

Hence $p$ divides $m^2$, and since $p$ is prime, $p$ must be a factor of $m$, say

$$m = pr.$$

Then

$$n^2 p = p^2 r^2,$$

so $n^2 = p r^2$. But this implies that $p$ is a factor of $n$. This contradicts our assumption that $m$ and $n$ had no common factor. Therefore the square root of a prime is not a rational number.
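The key equation $n^2 p = m^2$ can be explored with exact integer arithmetic. The following sketch is not a proof, only a finite search over denominators up to an arbitrary bound; the function name and the bound are illustrative assumptions.

```python
import math


def rational_sqrt_candidates(p, max_n):
    """Return all pairs (m, n) with 1 <= n <= max_n and m*m == p*n*n."""
    hits = []
    for n in range(1, max_n + 1):
        target = p * n * n
        m = math.isqrt(target)  # exact integer floor of the square root
        if m * m == target:
            hits.append((m, n))
    return hits


# For primes the search finds nothing, in agreement with the argument above.
for p in (2, 3, 5, 7):
    print(p, rational_sqrt_candidates(p, 10_000))

# For a perfect square such as 4 every denominator gives a solution.
print(4, rational_sqrt_candidates(4, 3))
```

Because `math.isqrt` works on integers, there is no floating point rounding to worry about in the comparison.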

The real numbers can be defined as Dedekind cuts, or as equivalence classes of Cauchy sequences of rational numbers.

Least Upper Bound Axiom. If $A$ is any nonempty set of real numbers that is bounded above, then $A$ has a least upper bound in $\mathbb{R}$.

4 Inequalities

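The Cauchy-Schwarz and Minkowski inequalities are stated here for sequences (see Goldberg).

Cauchy-Schwarz:

$$\sum_{n=1}^{\infty} s_n t_n \le \left[\sum_{n=1}^{\infty} s_n^2\right]^{1/2} \left[\sum_{n=1}^{\infty} t_n^2\right]^{1/2}$$

Minkowski:

$$\left[\sum_{n=1}^{\infty} (s_n + t_n)^2\right]^{1/2} \le \left[\sum_{n=1}^{\infty} s_n^2\right]^{1/2} + \left[\sum_{n=1}^{\infty} t_n^2\right]^{1/2}$$

If $a$, $b$, $c$ are vectors in a normed vector space, the triangle inequality is

$$\|c - a\| \le \|b - a\| + \|c - b\|.$$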
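Both sequence inequalities can be tried on finite sequences, which are the special case where all later terms are zero. The sketch below is a numerical illustration under that assumption; the helper names are hypothetical and do not come from the text.

```python
import math


def l2_norm(s):
    """Square root of the sum of squares of a finite sequence."""
    return math.sqrt(sum(x * x for x in s))


def cauchy_schwarz_gap(s, t):
    """Right side minus left side of the Cauchy-Schwarz inequality."""
    dot = sum(x * y for x, y in zip(s, t))
    return l2_norm(s) * l2_norm(t) - dot


def minkowski_gap(s, t):
    """Right side minus left side of the Minkowski inequality."""
    return l2_norm(s) + l2_norm(t) - l2_norm([x + y for x, y in zip(s, t)])


s = [1.0, -2.0, 3.0]
t = [4.0, 0.5, -1.0]
# Both gaps should be nonnegative for any choice of s and t.
print(cauchy_schwarz_gap(s, t) >= 0, minkowski_gap(s, t) >= 0)
```

When one sequence is a positive multiple of the other, both gaps shrink to zero up to rounding, which is the equality case.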

5 Euclidean Distance

From the Pythagorean Theorem we are able to define the Euclidean distance between points. So if we have two points with respective coordinates $p_1 = (x_1, y_1, z_1)$ and $p_2 = (x_2, y_2, z_2)$, the distance between the points is

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

So now we can talk about the nearness of points, and thus about concepts such as continuity and differentiability. Also we are able to formulate the ideas of analytic geometry.
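The distance formula translates directly into a few lines of code. This is a minimal sketch; the function name `euclidean_distance` is an illustrative choice, and it accepts points of any common dimension rather than only three.

```python
import math


def euclidean_distance(p1, p2):
    """Euclidean distance between two points given as coordinate tuples."""
    if len(p1) != len(p2):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p1, p2)))


# Coordinate differences are 3, 4 and 12, and 9 + 16 + 144 = 169.
print(euclidean_distance((1.0, 2.0, 3.0), (4.0, 6.0, 15.0)))  # prints 13.0
```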

For example we are able to define the ellipse as the locus of points for which the sum of the distances to two fixed points, called the foci, is constant. Doing this we arrive at the canonical representation of an ellipse with the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

and, in three dimensions, the standard equation of the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$

6 Distance Functions and the Metric Space

A metric is a distance function $\rho$ defined on some set of points $M$ with the following four properties. For $a$, $b$, $c$ points of $M$,

$$\rho(a, b) \ge 0 \quad \text{(i)}$$

$$\rho(a, b) = \rho(b, a) \quad \text{(ii)}$$

$$\rho(a, b) = 0 \text{ if and only if } a = b \quad \text{(iii)}$$

and

$$\rho(a, c) \le \rho(a, b) + \rho(b, c) \quad \text{(iv)}$$

An open ball about the point $a$ of radius $r$, $B(a, r)$, is the set of all points $p$ such that

$$\rho(a, p) < r.$$

A metric space $(M, \rho)$ consists of a set $M$ with a metric $\rho$. An open set in a metric space is a set $A$ such that for every point $a$ in $A$ there exists some open ball about $a$ that is a subset of $A$. A metric space $M$ and the class of all its open subsets form a topological space.

The metric for ordinary Euclidean two dimensional space is defined by the Pythagorean Theorem. So let point $p_1 = (x_1, y_1)$ and point $p_2 = (x_2, y_2)$. Then the Euclidean distance between the points is the square root of the sum of the squares of the differences of the coordinates,

$$d(p_1, p_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2},$$

which by the Pythagorean Theorem is the length of the line segment connecting the two points.

So the triangle inequality says that the sum of the lengths of two sides of a triangle is at least the length of the third side. This is metric property (iv):

$$\rho(a, c) \le \rho(a, b) + \rho(b, c) \quad \text{(iv)}$$

For this simple two dimensional case it follows from the law of cosines. If a triangle has sides of lengths $a$, $b$, $c$, and $\theta$ is the angle opposite the side $c$, then

$$c^2 = a^2 + b^2 - 2ab\cos(\theta) \le a^2 + b^2 + 2ab = (a + b)^2,$$

so $c \le a + b$.

For more general arguments see lineara.pdf, Topics In Linear Algebra and Its Applications by James Emery.
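The four metric properties can be spot-checked for the two dimensional Euclidean distance on a small set of sample points. This is a hedged illustration, not a proof: the function names and the sample points are assumptions, and property (iii) is tested in the usual form that the distance is zero only for identical points.

```python
import itertools
import math


def rho(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def satisfies_metric_properties(points, tol=1e-12):
    """Check properties (i) to (iv) for every triple of sample points."""
    for a, b, c in itertools.product(points, repeat=3):
        if rho(a, b) < 0:                         # (i) nonnegativity
            return False
        if abs(rho(a, b) - rho(b, a)) > tol:      # (ii) symmetry
            return False
        if (rho(a, b) == 0) != (a == b):          # (iii) zero only when a == b
            return False
        if rho(a, c) > rho(a, b) + rho(b, c) + tol:  # (iv) triangle inequality
            return False
    return True


# (0,0), (3,4), (6,8) are collinear, so (iv) holds with equality there.
points = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.5), (6.0, 8.0)]
print(satisfies_metric_properties(points))
```

The small tolerance only guards against floating point rounding in the equality case of the triangle inequality.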

7 Vector Spaces and Inner Product Spaces

8 Normed Linear Spaces

9 Normed Linear Spaces and Functional Analysis

10 Hilbert Space and $\ell^2$

11 Orthogonality, Orthogonal Polynomials, Fourier Series

12 Projections

13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem

14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting

The traditional way of deriving least squares equations is to write the expression for the sum of the squared differences between the given "data" and the approximating function, and then to set the partial derivatives with respect to the coefficients of the approximating function to zero. Let us do this for the case of fitting a straight line to given data. Assume the model $f(x) = ax + b$ and minimize

$$r(a, b) = \sum_{i=1}^{n} (a x_i + b - y_i)^2.$$

The conditions for a minimum are

$$\frac{\partial r}{\partial a} = \sum_{i=1}^{n} 2 x_i (a x_i + b - y_i) = 0,$$

$$\frac{\partial r}{\partial b} = \sum_{i=1}^{n} 2 (a x_i + b - y_i) = 0.$$

We get a two by two system of equations:

$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i,$$

$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 = \sum_{i=1}^{n} y_i.$$

These equations are known as the normal equations of the problem. They have a unique solution if the determinant is not zero, that is if

$$n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 \ne 0.$$

If the $x$ values are not all equal this follows from the Cauchy-Schwarz inequality applied to the vectors $(1, 1, \ldots, 1)$ and $(x_1, x_2, \ldots, x_n)$, which holds with strict inequality when the two vectors are not proportional. The general problem can be viewed more naturally as being geometric.
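The two by two normal equations for the straight line can be solved in closed form by Cramer's rule. The sketch below assumes that approach; the function name `fit_line` and the sample data are illustrative and not taken from the text.

```python
def fit_line(xs, ys):
    """Least squares fit of f(x) = a*x + b; returns (a, b).

    Solves the normal equations
        a*sum(x^2) + b*sum(x) = sum(x*y)
        a*sum(x)   + b*n      = sum(y)
    by Cramer's rule.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # the determinant discussed in the text
    if det == 0:
        raise ValueError("x values are all equal; normal equations are singular")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Data lying exactly on y = 2x + 1 should be recovered exactly.
print(fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))  # prints (2.0, 1.0)
```

The singular case raised as an error is exactly the situation where all the $x$ values coincide.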

15 A Geometric View of the Least Squares Problem

The abstract linear least squares problem may be formulated as approximation in a vector space by some element of a subspace. Often this vector space is a space of functions. As examples, the subspace could be generated by a basis such as

$$1, x, x^2, x^3, \ldots,$$

or such as

$$1, \cos(\omega t), \sin(\omega t), \cos(2\omega t), \sin(2\omega t), \ldots$$

The first case would be a polynomial, or power series, approximation. And the second would be a Fourier or trigonometric approximation. So consider a vector space $V$ with an inner product of $u$ with $v$, written as $(u, v)$. Given a subspace $S$ and an arbitrary element $g$ of $V$, we are to find the element in $S$ that best approximates $g$ in the norm corresponding to the inner product. The $L^2$ norm for functions is based on the inner product

$$(f, g) = \int f g,$$

and for sequences is based on the inner product

$$(f, g) = \sum_{i=1}^{n} f_i g_i.$$

This $L^2$ norm corresponds directly to the "squares" part of the least squares approximation. But the theory carries through for an arbitrary inner product. The norm defined by an inner product is

$$\|f\| = (f, f)^{1/2}.$$

A solution $f \in S$ minimizes

$$(f - g, f - g) = \|f - g\|^2.$$

We will show that the problem is solved by the orthogonal projection of a vector into a subspace. One can think of this as analogous to the simple geometric problem of projecting a vector in space onto a plane. Think of a vector from the origin to a point, and think of a plane through the origin, not containing this vector. The plane is a vector space. The vector in the plane closest to the original vector is the orthogonal projection of the vector onto the plane. The same thing happens in the general problem, where the plane becomes the subspace. For example the subspace might be the set of all cubic polynomials, and the problem is to best fit the data to a cubic polynomial.

Two vectors are orthogonal, i.e. perpendicular, if their inner product is zero. We require a preliminary theorem to prove the main proposition.

Pythagorean Theorem. If $v_1$ is orthogonal to $v_2$, then

$$\|v_1 + v_2\|^2 = \|v_1\|^2 + \|v_2\|^2.$$

Proof.

$$(v_1 + v_2, v_1 + v_2) = (v_1, v_1) + 2(v_1, v_2) + (v_2, v_2) = (v_1, v_1) + (v_2, v_2).$$

Proposition. If $f \in S$ and $(g - f, h) = 0$ for all $h \in S$, then $f$ is a solution to the least squares problem.

Proof. Let $s \in S$. We have

$$\|g - s\|^2 = \|(g - f) + (f - s)\|^2 = \|g - f\|^2 + \|f - s\|^2 \ge \|g - f\|^2.$$

By assumption, $g - f$ is orthogonal to the subspace $S$, and $f - s$ is in $S$. So the second equality is a consequence of the Pythagorean Theorem. We have shown that

$$\|g - s\| \ge \|g - f\|, \quad \forall s \in S,$$

so $f$ is the best approximation to $g$ in $S$ and this completes the proof.

Notice that a unique solution always exists because $f$ is the unique orthogonal projection of $g$ into $S$. For finite dimensional subspaces the solution can be formulated as a solution to a set of $n$ linear equations in $n$ unknowns. Let $S$ equal the span of $f_1, \ldots, f_n$. Let the solution be

$$f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n.$$

Then the minimum condition is equivalent to

$$(f_i, c_1 f_1 + c_2 f_2 + \cdots + c_n f_n - g) = 0, \quad i = 1, \ldots, n.$$

This is the same as

$$c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n.$$

These $n$ linear equations in $n$ unknowns are called the normal equations of the problem. In the usual case, $S$ is a space of discrete functions. These are functions defined on a finite domain. Suppose there are $m$ data values so that the domain is

$$\{p_1, p_2, \ldots, p_m\}.$$

We identify the function $f_i$ with the vector

$$\begin{bmatrix} f_i(p_1) \\ f_i(p_2) \\ \vdots \\ f_i(p_m) \end{bmatrix}.$$

So $f_i$ is an $m$ dimensional column vector of values of the $i$th function. We can formulate the minimum conditions with matrices. The inner product is then the transpose of the first vector times the second. We write the transpose of a vector $v$ as $v^t$. We have $(f_i, f_j) = f_i^t f_j$. Then

$$c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n.$$

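Thus

$$\begin{bmatrix} f_i^t f_1 & \cdots & f_i^t f_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = f_i^t g.$$

If we let $A$ be the $m$ row by $n$ column matrix whose $i$th column is $f_i$, then

$$A = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix}.$$

Written out

$$A = \begin{bmatrix} f_1(p_1) & f_2(p_1) & \cdots & f_n(p_1) \\ f_1(p_2) & f_2(p_2) & \cdots & f_n(p_2) \\ \vdots & \vdots & & \vdots \\ f_1(p_m) & f_2(p_m) & \cdots & f_n(p_m) \end{bmatrix}.$$

Also let

$$B = \begin{bmatrix} g(p_1) \\ \vdots \\ g(p_m) \end{bmatrix}.$$

The normal equations become

$$A^t A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = A^t B.$$

Note that the original approximation problem in this form is a system of $m$ equations in $n$ unknowns

$$A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \approx B.$$

Any linear system of this form with $m > n$ can be interpreted as a least squares problem and has an approximate least squares solution. The matrices $A$ and $B$ are a convenient input set to a general linear least squares solver (see the listing of subroutine llsq).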

There is always a unique solution to the linear least squares problem. The solution is the orthogonal projection into the subspace. But there will be more than one solution to the normal equations if the given functions spanning the subspace are not linearly independent. The normal equations have a solution, so they are consistent. From the theory of linear equations, if the determinant $D$ of the coefficient matrix of the normal equations is not zero, then there is a unique solution. Then we can solve the equations either by inverting the coefficient matrix, or by Gaussian elimination. If $D$ is zero, then there is more than one solution, and the general solution involves one or more variables of arbitrary value. Gaussian elimination will fail. The $D = 0$ solution can be computed by using elementary row operations, which can be done numerically or with various computer algebra programs.

When we are concerned only with the discrete space, it does not matter that there are multiple solutions to the normal equations, because any solution set of coefficients gives a linear combination equal to the unique projection into the subspace. The various solutions just give different linear combinations of dependent vectors that equal the same vector. On the other hand, if points other than the sample points are in the relevant domain of the functions, then the multiple solutions may give function solutions that are not the same on this extended domain. To illustrate, compare functions $f$ and $g$, where $f(x) = x(x - 1)$ is equal to zero on the domain consisting of $x = 0$ and $x = 1$, but is not zero on the extended domain of all real numbers. Let $g$ be the true zero function, $g(x) = 0$. The two functions agree on $\{0, 1\}$, but give different values on an extended domain. Frequently we want to use the least squares solution for interpolation between the given data points, and so the case of multiple solutions to the normal equations does have consequence.

We will show that if $f_1, \ldots, f_n$ are linearly independent then the normal equations have a unique solution. This is clear because in this case $f_1, \ldots, f_n$ is a basis of $S$ and the unique solution $f$ in $S$ has unique components with respect to this basis. It is also a direct consequence of the following proposition.

Proposition. If $f_1, \ldots, f_n$ are the linearly independent columns of a matrix $A$, which has $m > n$ rows, then $\det(A^t A)$ is not equal to zero.

Proof. Suppose the determinant is zero. Then there exist $c_1, c_2, \ldots, c_n$, not all zero, such that

$$c_1 \begin{bmatrix} (f_1, f_1) \\ (f_2, f_1) \\ \vdots \\ (f_n, f_1) \end{bmatrix} + c_2 \begin{bmatrix} (f_1, f_2) \\ (f_2, f_2) \\ \vdots \\ (f_n, f_2) \end{bmatrix} + \cdots + c_n \begin{bmatrix} (f_1, f_n) \\ (f_2, f_n) \\ \vdots \\ (f_n, f_n) \end{bmatrix} = 0.$$

Let

$$v = c_1 f_1 + \cdots + c_n f_n.$$

The first equation shows that $(f_i, v) = 0$, for $i = 1, \ldots, n$. It follows that $(v, v) = \sum_{i=1}^{n} c_i (f_i, v) = 0$. This implies $v = 0$, and so, because the $f_i$ are linearly independent, each $c_i$ is zero. This is a contradiction, so the proposition is true.

Example 1. We are to fit the function

$$y = f(x) = a \sin(x) + b \cos(x)$$

to the data

$$\begin{array}{cc} x & y \\ \hline 1.0 & 3.0 \\ 2.5 & 5.6 \\ 3.4 & 7.8 \end{array}$$

Apply the sin function to the $x$ values to get the first column of matrix $A$ and the cos function to get the second column. Let vector $B$ be the $y$ values. The normal equations are

$$A^t A C = A^t B$$

or in terms of the components

$$\begin{bmatrix} 1.13154358 & 0.22224325 \\ 0.22224325 & 1.86845642 \end{bmatrix} C = \begin{bmatrix} 3.882636366 \\ -10.40652323 \end{bmatrix}.$$

The solution is

$$C = \begin{bmatrix} 4.6334245 \\ -6.120705005 \end{bmatrix}.$$

So

$$f(x) = 4.6334245 \sin(x) - 6.120705005 \cos(x).$$

The following program does the linear least squares computations.

c+ llsq least squares solution of a*c=b (solving for c)
      subroutine llsq(a,ia,m,n,ws,c,b,ier)
c parameters
c   a   - m by n matrix. declared row dimension ia.
c         on return its leading n by n block is overwritten
c         by transpose(a)*a.
c   ws  - working storage vector of length m
c   c   - vector of size n
c   b   - vector of size m
c   ier - return parameter: ier=0 normal return, ier=1 normal
c         equations nearly singular, ier=2 normal equations
c         singular.
c gausse is an external linear equation solver that is not
c part of this listing.
c
      dimension a(ia,1),b(1),c(1),ws(1)
c compute lower elements of jth column of transpose(a)*a
      do 50 j=1,n
         do 18 i=j,n
            s=0.
            do 15 k=1,m
               s=s+a(k,i)*a(k,j)
   15       continue
   18    ws(i)=s
c
c compute jth element of right side vector
         s=0.
         do 40 k=1,m
   40    s=s+a(k,j)*b(k)
         c(j)=s
c
c store lower elements of jth column in a
         do 19 i=j,n
   19    a(i,j)=ws(i)
c
   50 continue
c fill in upper values
      do 60 i=1,n
         do 60 j=i,n
            a(i,j)=a(j,i)
   60 continue
      ib=1
      mm=1
      eps=1.e-12
      inv=0
c solve normal equations
      call gausse(a,ia,c,ib,n,mm,inv,eps,det,ier)
      return
      end
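As a cross-check on Example 1 that does not depend on the Fortran listing, the normal equations can be formed and solved in a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `normal_equations` and `solve_2x2` are hypothetical, and the two by two solve uses Cramer's rule in place of the Gaussian elimination routine called by llsq.

```python
import math


def normal_equations(columns, b):
    """Form A^t A and A^t B from the columns f_1..f_n of A and the vector B."""
    ata = [[sum(u * v for u, v in zip(fi, fj)) for fj in columns]
           for fi in columns]
    atb = [sum(u * v for u, v in zip(fi, b)) for fi in columns]
    return ata, atb


def solve_2x2(m, r):
    """Solve a 2 by 2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("singular system")
    c1 = (r[0] * m[1][1] - m[0][1] * r[1]) / det
    c2 = (m[0][0] * r[1] - m[1][0] * r[0]) / det
    return c1, c2


# Data of Example 1; the columns of A are sin(x) and cos(x).
xs = [1.0, 2.5, 3.4]
ys = [3.0, 5.6, 7.8]
columns = [[math.sin(x) for x in xs], [math.cos(x) for x in xs]]

ata, atb = normal_equations(columns, ys)
a, b = solve_2x2(ata, atb)
print(ata)
print(atb)
print(a, b)
```

The printed coefficients should agree with the values of $C$ quoted in Example 1 to the digits shown there.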

16 Bibliography

[1] Heath T. L. (translator), Euclid’s Elements, 3 Volumes, Dover, 1956.

[2] Welchons A. M., Krickenberger W. R., Pearson Helen R., Plane Geometry, Ginn and Company, 1958. Garfield proof p. 253.

[3] Bronowski Jacob, The Ascent of Man, Little Brown and Company, 1973.

[4] Halmos Paul R., Introduction to Hilbert Space: And the Theory of Spectral Multiplicity, Chelsea, 1951. Halmos was a student of John Von Neumann.

[5] Halmos Paul R., Finite Dimensional Vector Spaces, Springer-Verlag, 1975.

[6] Diggins Julia E., String, Straight-Edge and Shadow: The Story of Geometry, The Viking Press, 1965. This is a book for junior high school students and elementary school teachers. A very nice short book with pictures, a history of the Greeks and Pythagoras, as well as some interesting mathematical discussions I had not seen elsewhere.

[7] Pedoe Dan, Geometry and the Liberal Arts, St Martins Press, 1976.

[8] Vowell Sarah, Assassination Vacation, Simon and Schuster, 2005.

[9] Goldberg Richard R., Methods of Real Analysis, Blaisdell Publishing Company, 1964.
