7/30/2019 Practical Linear Algebra
1/253
Practical Linear Algebra
Class Notes
Spring 2009
Robert van de GeijnMaggie Myers
Department of Computer SciencesThe University of Texas at Austin
Austin, TX 78712
Draft
January 18, 2012
Copyright 2009Robert van de Geijn and Maggie Myers
Contents

1. Introduction
   1.1. A Motivating Example: Predicting the Weather
   1.2. Vectors
        1.2.1. Equality (=)
        1.2.2. Copy (copy)
        1.2.3. Scaling (scal)
        1.2.4. Vector addition (add)
        1.2.5. Scaled vector addition (axpy)
        1.2.6. Dot product (dot)
        1.2.7. Vector length (norm2)
        1.2.8. A Simple Linear Algebra Package: SLAP
   1.3. A Bit of History
   1.4. Matrices and Matrix-Vector Multiplication
        1.4.1. First: Linear transformations
        1.4.2. From linear transformation to matrix-vector multiplication
        1.4.3. Special Matrices
        1.4.4. Computing the matrix-vector multiply
        1.4.5. Cost of matrix-vector multiplication
        1.4.6. Scaling and adding matrices
        1.4.7. Partitioning matrices and vectors into submatrices (blocks) and subvectors
   1.5. A High Level Representation of Algorithms
        1.5.1. Matrix-vector multiplication
        1.5.2. Transpose matrix-vector multiplication
        1.5.3. Triangular matrix-vector multiplication
        1.5.4. Symmetric matrix-vector multiplication
   1.6. Representing Algorithms in Code
   1.7. Outer Product and Rank-1 Update
   1.8. A growing library
        1.8.1. General matrix-vector multiplication (gemv)
        1.8.2. Triangular matrix-vector multiplication (trmv)
        1.8.3. Symmetric matrix-vector multiplication (symv)
        1.8.4. Rank-1 update (ger)
        1.8.5. Symmetric Rank-1 update (syr)
   1.9. Answers

2. Matrix-Matrix Multiplication
   2.1. Motivating Example: Rotations
   2.2. Composing Linear Transformations
   2.3. Special Cases of Matrix-Matrix Multiplication
   2.4. Properties of Matrix-Matrix Multiplication
   2.5. Multiplying partitioned matrices
   2.6. Summing it all up
   2.7. Additional Exercises

3. Gaussian Elimination
   3.1. Solving a System of Linear Equations via Gaussian Elimination (GE, Take 1)
   3.2. Matrix Notation (GE, Take 2)
   3.3. Towards Gauss Transforms (GE, Take 3)
   3.4. Gauss Transforms (GE, Take 4)
   3.5. Gauss Transforms Continued (GE, Take 5)
   3.6. Toward the LU Factorization (GE, Take 6)
   3.7. Coding Up Gaussian Elimination

4. LU Factorization
   4.1. Gaussian Elimination Once Again
   4.2. LU factorization
   4.3. Forward Substitution = Solving a Unit Lower Triangular System
   4.4. Backward Substitution = Solving an Upper Triangular System
   4.5. Solving the Linear System
   4.6. When LU Factorization Breaks Down
   4.7. Permutations
   4.8. Back to When LU Factorization Breaks Down
   4.9. The Inverse of a Matrix
        4.9.1. First, some properties
        4.9.2. That's about all we will say about determinants
        4.9.3. Gauss-Jordan method
        4.9.4. Inverting a matrix using the LU factorization
        4.9.5. Inverting the LU factorization
        4.9.6. In practice, do not use inverted matrices!
        4.9.7. More about inverses

5. Vector Spaces: Theory and Practice
   5.1. Vector Spaces
   5.2. Why Should We Care?
        5.2.1. A systematic procedure (first try)
        5.2.2. A systematic procedure (second try)
   5.3. Linear Independence
   5.4. Bases
   5.5. Exercises
   5.6. The Answer to Life, The Universe, and Everything
   5.7. Answers

6. Orthogonality
   6.1. Orthogonal Vectors and Subspaces
   6.2. Motivating Example, Part I
   6.3. Solving a Linear Least-Squares Problem
   6.4. Motivating Example, Part II
   6.5. Computing an Orthonormal Basis
   6.6. Motivating Example, Part III
   6.7. What does this all mean?
   6.8. Exercises

7. The Singular Value Decomposition
   7.1. The Theorem
   7.2. Consequences of the SVD Theorem
   7.3. Projection onto a column space
   7.4. Low-rank Approximation of a Matrix

8. QR factorization
   8.1. Classical Gram-Schmidt process
   8.2. Modified Gram-Schmidt process
   8.3. Householder QR factorization
        8.3.1. Householder transformations (reflectors)
        8.3.2. Algorithms
        8.3.3. Forming Q
   8.4. Solving Linear Least-Squares Problems

9. Eigenvalues and Eigenvectors
   9.1. Motivating Example
   9.2. Problem Statement
        9.2.1. Eigenvalues and eigenvectors of a 2 × 2 matrix
Preface
Chapter 1

Introduction
In this chapter, we give a motivating example and use this to introduce linear algebra
notation.
1.1 A Motivating Example: Predicting the Weather
Let us assume that on any day, the following table tells us how the weather on that day predicts the weather the next day:
                        Today
              sunny   cloudy   rainy
 Tomorrow
   sunny       0.4     0.3      0.1
   cloudy      0.4     0.3      0.6
   rainy       0.2     0.4      0.3
Table 1.1: Table that predicts the weather.
This table is interpreted as follows: if it is rainy today, then the probability that it will be cloudy tomorrow is 0.6, etc.
Example 1.1 If today is cloudy, what is the probability that tomorrow is sunny? cloudy? rainy?

To answer this, we simply consult Table 1.1: the probabilities that it will be sunny, cloudy, and rainy are given by 0.3, 0.3, and 0.4, respectively.
Example 1.2 If today is cloudy, what is the probability that the day after tomorrow is sunny? cloudy? rainy?
Now this gets a bit more difficult. Let us focus on the probability that it is sunny the day after tomorrow. We don't know what the weather will be tomorrow. What we do know is:
- The probability that it will be sunny the day after tomorrow and sunny tomorrow is 0.4 × 0.3.

- The probability that it will be sunny the day after tomorrow and cloudy tomorrow is 0.3 × 0.3.

- The probability that it will be sunny the day after tomorrow and rainy tomorrow is 0.1 × 0.4.

Thus, the probability that it will be sunny the day after tomorrow if it is cloudy today is

    0.4 × 0.3 + 0.3 × 0.3 + 0.1 × 0.4 = 0.25.
Exercise 1.3 Work out the probabilities that it will be cloudy/rainy the day after tomorrow.
Example 1.4 If today is cloudy, what is the probability that a week from today it is sunny? cloudy? rainy?
We will not answer this question now. We insert it to make the point that things can get messy.

As is usually the case when we are trying to answer quantitative questions, it helps to introduce some notation.
- Let χ_s^(k) denote the probability that it will be sunny k days from now.

- Let χ_c^(k) denote the probability that it will be cloudy k days from now.

- Let χ_r^(k) denote the probability that it will be rainy k days from now.

Here, χ is the Greek lower case letter chi, pronounced [kai] in English. Now, Table 1.1, Example 1.2, and Exercise 1.3 motivate the equations

    χ_s^(k+1) = 0.4 χ_s^(k) + 0.3 χ_c^(k) + 0.1 χ_r^(k)
    χ_c^(k+1) = 0.4 χ_s^(k) + 0.3 χ_c^(k) + 0.6 χ_r^(k)      (1.1)
    χ_r^(k+1) = 0.2 χ_s^(k) + 0.4 χ_c^(k) + 0.3 χ_r^(k)
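The equations in (1.1) translate directly into code. The notes build their examples in M-script; the following Python sketch (illustrative only, not part of the notes' library) evaluates one step of the recurrence:

```python
# One step of the weather recurrence in (1.1).
# chi = [sunny, cloudy, rainy] probabilities on day k.
def next_day(chi):
    s, c, r = chi
    return [0.4 * s + 0.3 * c + 0.1 * r,   # sunny tomorrow
            0.4 * s + 0.3 * c + 0.6 * r,   # cloudy tomorrow
            0.2 * s + 0.4 * c + 0.3 * r]   # rainy tomorrow

# Today is cloudy: (0, 1, 0).
print(next_day([0.0, 1.0, 0.0]))  # [0.3, 0.3, 0.4]
```

Starting from "cloudy today", one application of the recurrence reproduces the probabilities of Example 1.1.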
The probabilities for the weather on day k, and the table that summarizes the transition probabilities, are often represented as a (state) vector, x, and a (transition) matrix, P, respectively:

    x^(k) = ( χ_s^(k) )             ( 0.4 0.3 0.1 )
            ( χ_c^(k) )    and  P = ( 0.4 0.3 0.6 ).
            ( χ_r^(k) )             ( 0.2 0.4 0.3 )

The transition from day k to day k + 1 is then written as the matrix-vector product (multiplication)

    ( χ_s^(k+1) )   ( 0.4 0.3 0.1 ) ( χ_s^(k) )
    ( χ_c^(k+1) ) = ( 0.4 0.3 0.6 ) ( χ_c^(k) ),
    ( χ_r^(k+1) )   ( 0.2 0.4 0.3 ) ( χ_r^(k) )

or x^(k+1) = P x^(k), which is simply a more compact representation (way of writing) of the equations in (1.1).
Assume again that today is cloudy, so that the probability that it is sunny, cloudy, or rainy today is 0, 1, and 0, respectively:

    x^(0) = ( χ_s^(0) )   ( 0 )
            ( χ_c^(0) ) = ( 1 ).      (1.2)
            ( χ_r^(0) )   ( 0 )

Then the vector of probabilities for tomorrow's weather, x^(1), is given by

    ( χ_s^(1) )   ( 0.4 0.3 0.1 ) ( χ_s^(0) )   ( 0.4 0.3 0.1 ) ( 0 )
    ( χ_c^(1) ) = ( 0.4 0.3 0.6 ) ( χ_c^(0) ) = ( 0.4 0.3 0.6 ) ( 1 )      (1.3)
    ( χ_r^(1) )   ( 0.2 0.4 0.3 ) ( χ_r^(0) )   ( 0.2 0.4 0.3 ) ( 0 )

                  ( 0.4 × 0 + 0.3 × 1 + 0.1 × 0 )   ( 0.3 )
                = ( 0.4 × 0 + 0.3 × 1 + 0.6 × 0 ) = ( 0.3 ).      (1.4)
                  ( 0.2 × 0 + 0.4 × 1 + 0.3 × 0 )   ( 0.4 )
The vector of probabilities for the day after tomorrow, x^(2), is given by

    ( χ_s^(2) )   ( 0.4 0.3 0.1 ) ( χ_s^(1) )   ( 0.4 0.3 0.1 ) ( 0.3 )
    ( χ_c^(2) ) = ( 0.4 0.3 0.6 ) ( χ_c^(1) ) = ( 0.4 0.3 0.6 ) ( 0.3 )      (1.5)
    ( χ_r^(2) )   ( 0.2 0.4 0.3 ) ( χ_r^(1) )   ( 0.2 0.4 0.3 ) ( 0.4 )

                  ( 0.4 × 0.3 + 0.3 × 0.3 + 0.1 × 0.4 )   ( 0.25 )
                = ( 0.4 × 0.3 + 0.3 × 0.3 + 0.6 × 0.4 ) = ( 0.45 ).      (1.6)
                  ( 0.2 × 0.3 + 0.4 × 0.3 + 0.3 × 0.4 )   ( 0.30 )
Repeating this process, we can find the probabilities for the weather for the next seven days, under the assumption that today is cloudy:

    k        0     1      2       3        4         5         6         7
    x^(k)    0    0.3    0.25    0.265    0.2625    0.26325   0.26312   0.26316
             1    0.3    0.45    0.415    0.4225    0.42075   0.42112   0.42104
             0    0.4    0.30    0.320    0.3150    0.31600   0.31575   0.31580
Exercise 1.5 Follow the instructions for this problem given on the class wiki. For the example described in this section,

1. Recreate the above table by programming it up with Matlab or Octave, starting with the assumption that today is cloudy.

2. Create two similar tables starting with the assumption that today is sunny and rainy, respectively.

3. Compare how x^(7) differs depending on today's weather.

4. What do you notice if you compute x^(k) starting with today being sunny/cloudy/rainy and you let k get large?

5. What does x^(∞) represent?
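The exercise asks for Matlab or Octave; purely for illustration, the iteration x^(k+1) = P x^(k) can also be sketched in a few lines of Python, which makes the behavior for large k easy to observe:

```python
# Transition matrix P from the weather example.
P = [[0.4, 0.3, 0.1],
     [0.4, 0.3, 0.6],
     [0.2, 0.4, 0.3]]

chi = [0.0, 1.0, 0.0]                 # today is cloudy
for k in range(50):                   # x^(k) settles long before k = 50
    chi = [sum(P[i][j] * chi[j] for j in range(3)) for i in range(3)]
print([round(p, 5) for p in chi])     # [0.26316, 0.42105, 0.31579]
```

The iterates settle toward a fixed vector, matching the trend visible in the table above; the same limit is reached no matter which starting weather is assumed.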
Alternatively, (1.5)-(1.6) could have been stated as

    ( χ_s^(2) )   ( 0.4 0.3 0.1 ) ( χ_s^(1) )   ( 0.4 0.3 0.1 ) ( 0.4 0.3 0.1 ) ( χ_s^(0) )       ( χ_s^(0) )
    ( χ_c^(2) ) = ( 0.4 0.3 0.6 ) ( χ_c^(1) ) = ( 0.4 0.3 0.6 ) ( 0.4 0.3 0.6 ) ( χ_c^(0) )  =  Q ( χ_c^(0) ),      (1.7)
    ( χ_r^(2) )   ( 0.2 0.4 0.3 ) ( χ_r^(1) )   ( 0.2 0.4 0.3 ) ( 0.2 0.4 0.3 ) ( χ_r^(0) )       ( χ_r^(0) )

where Q is the transition matrix that tells us how the weather today predicts the weather the day after tomorrow.
Exercise 1.6 Given Table 1.1, create the following table, which predicts the weather the day after tomorrow given the weather today:

                          Today
                  sunny   cloudy   rainy
    Day after
    tomorrow
      sunny
      cloudy
      rainy
This then tells us the entries in Q in (1.7).
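One way to check the table you construct: each entry of Q is a sum over tomorrow's three possible states, which is exactly the matrix-matrix product Q = P·P. A Python sketch, for illustration only (the course itself works in M-script):

```python
P = [[0.4, 0.3, 0.1],
     [0.4, 0.3, 0.6],
     [0.2, 0.4, 0.3]]

# Q[i][j] = sum over k of P[i][k] * P[k][j]: the weather two days out.
Q = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
print([round(q, 2) for q in Q[0]])  # first row: [0.3, 0.25, 0.25]
```

Note that the entry Q[0][1] = 0.25 recovers the probability, computed in Example 1.2, of a sunny day after tomorrow given a cloudy today.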
1.2 Vectors
We will call a one-dimensional array of numbers a (real-valued) (column) vector:

    x = ( χ_0     )
        ( χ_1     )
        (  ...    )
        ( χ_{n-1} ),

where χ_i ∈ R for 0 ≤ i < n. The set of all such vectors is denoted by R^n. A vector has a direction and a length:

- Its direction can be visualized by drawing an arrow from the origin to the point (χ_0, χ_1, ..., χ_{n-1}).

- Its length is given by the Euclidean length of this arrow: the length of x ∈ R^n is given by ‖x‖_2 = sqrt(χ_0² + χ_1² + ... + χ_{n-1}²), which is often called the two-norm.

A number of operations with vectors will be encountered in the course of this book. In the remainder of this section, let x, y, z ∈ R^n and α ∈ R, with

    x = ( χ_0     )       ( ψ_0     )            ( ζ_0     )
        ( χ_1     ),  y = ( ψ_1     ),  and  z = ( ζ_1     ).
        (  ...    )       (  ...    )            (  ...    )
        ( χ_{n-1} )       ( ψ_{n-1} )            ( ζ_{n-1} )
1.2.1 Equality (=)
Two vectors are equal if all their components are element-wise equal: x = y if and only if, for 0 ≤ i < n, χ_i = ψ_i.
1.2.2 Copy (copy)
The copy operation assigns the content of one vector to another. In our mathematical notation we will denote this by the symbol := (pronounced "becomes"). An algorithm and M-script for the copy operation are given in Figure 1.1.

Remark 1.7 Unfortunately, M-script starts indexing at one, which explains the difference in indexing between the algorithm and the M-script implementation.
Cost: Copying one vector to another requires 2n memory operations (memops): The vectorx of length n must be read, requiring n memops, and the vector y must be written, whichaccounts for the other n memops.
    for i = 0, ..., n-1        for i=1:n
        ψ_i := χ_i                 y(i) = x( i );
    endfor                     end

Figure 1.1: Algorithm and M-script for the vector copy operation y := x.

    for i = 0, ..., n-1        for i=1:n
        ψ_i := α χ_i               y(i) = alpha * x( i );
    endfor                     end

Figure 1.2: Algorithm and M-script for the vector scale operation y := αx.

    for i = 0, ..., n-1        for i=1:n
        ζ_i := χ_i + ψ_i           z(i) = x(i) + y( i );
    endfor                     end

Figure 1.3: Algorithm and M-script for the vector addition operation z := x + y.

    for i = 0, ..., n-1        for i=1:n
        ζ_i := α χ_i + ψ_i         z(i) = alpha * x(i) + y(i);
    endfor                     end

Figure 1.4: Algorithm and M-script for scaled vector addition z := αx + y.

    α := 0                     alpha = 0;
    for i = 0, ..., n-1        for i=1:n
        α := χ_i ψ_i + α           alpha = x(i) * y(i) + alpha;
    endfor                     end

Figure 1.5: Algorithm and M-script for the dot product α := xᵀy.
1.2.3 Scaling (scal)
Multiplying vector x by scalar α yields a new vector, αx, in the same direction as x, scaled by a factor α. Scaling a vector by α means each of its components, χ_i, is scaled by α:

    αx = α ( χ_0     )   ( αχ_0     )
           ( χ_1     ) = ( αχ_1     )
           (  ...    )   (  ...     ).
           ( χ_{n-1} )   ( αχ_{n-1} )
An algorithm and M-script for the scal operation are given in Figure 1.2.
Cost: On a computer, real numbers are stored as floating point numbers, and real arithmetic is approximated with floating point arithmetic. Thus, we count floating point computations (flops): a multiplication or an addition each costs one flop. Scaling a vector requires n flops and 2n memops.
1.2.4 Vector addition (add)
The vector addition x + y is defined by

    x + y = ( χ_0     )   ( ψ_0     )   ( χ_0 + ψ_0         )
            ( χ_1     ) + ( ψ_1     ) = ( χ_1 + ψ_1         )
            (  ...    )   (  ...    )   (  ...              ).
            ( χ_{n-1} )   ( ψ_{n-1} )   ( χ_{n-1} + ψ_{n-1} )
In other words, the vectors are added element-wise, yielding a new vector of the same length. An algorithm and M-script for the add operation are given in Figure 1.3.

Cost: Vector addition requires 3n memops (x and y are read and the resulting vector is written) and n flops.
Exercise 1.8 Let x, y ∈ R^n. Show that vector addition commutes: x + y = y + x.
1.2.5 Scaled vector addition (axpy)
One of the most commonly encountered operations when implementing more complex linear algebra operations is the scaled vector addition y := αx + y:

    αx + y = α ( χ_0     )   ( ψ_0     )   ( αχ_0 + ψ_0         )
               ( χ_1     ) + ( ψ_1     ) = ( αχ_1 + ψ_1         )
               (  ...    )   (  ...    )   (  ...               ).
               ( χ_{n-1} )   ( ψ_{n-1} )   ( αχ_{n-1} + ψ_{n-1} )
It is often referred to as the axpy operation, which stands for "alpha times x plus y". An algorithm and M-script for the axpy operation are given in Figure 1.4. We emphasize that typically it is used in situations where the output vector overwrites the input vector y.

Cost: The axpy operation requires 3n memops and 2n flops. Notice that by combining the scaling and vector addition into one operation, there is the opportunity to reduce the number of memops that are incurred separately by the scal and add operations.
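For illustration, here is the axpy operation in Python rather than M-script. This is a sketch, not the SLAP routine itself; it mirrors the algorithm of Figure 1.4:

```python
# Scaled vector addition z := alpha*x + y (the "axpy" operation).
def axpy(alpha, x, y):
    assert len(x) == len(y), "axpy: incompatible lengths"
    # One multiply and one add per component: 2n flops.
    return [alpha * xi + yi for xi, yi in zip(x, y)]

print(axpy(-1, [1, -1, 2], [2, 1, 0]))  # [1, 2, -2]
```

The printed result matches the Octave session shown later in Figure 1.7, where -1*x + y is computed for the same two vectors.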
1.2.6 Dot product (dot)
The other commonly encountered operation is the dot (inner) product. It is defined by

    xᵀy = Σ_{i=0}^{n-1} χ_i ψ_i = χ_0 ψ_0 + χ_1 ψ_1 + ... + χ_{n-1} ψ_{n-1}.
We have already encountered the dot product in, for example, (1.1) (the right-hand side of each equation is a dot product) and (1.6). An algorithm and M-script for the dot operation are given in Figure 1.5.
Cost: A dot product requires approximately 2n memops and 2n flops.
Exercise 1.9 Let x, y ∈ R^n. Show that xᵀy = yᵀx.
1.2.7 Vector length (norm2)
The Euclidean length of a vector x (the two-norm) is given by ‖x‖_2 = sqrt(Σ_{i=0}^{n-1} χ_i²). Clearly ‖x‖_2 = sqrt(xᵀx), so that the dot operation can be used to compute this length.
Cost: If computed with a dot product, it requires approximately n memops and 2n flops.
Remark 1.10 In practice the two-norm is typically not computed via the dot operation, since squaring the entries can cause overflow or underflow in floating point arithmetic. Instead, the vector is first scaled to prevent such unpleasant side effects.
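A sketch of the scaling trick the remark alludes to, written in Python for illustration: divide each entry by the largest magnitude entry before squaring, then scale the square root back up.

```python
import math

# Two-norm computed the naive way (dot product, then sqrt) can overflow:
# chi_i**2 may exceed the largest representable floating point number even
# though ||x||_2 itself is representable. Scaling avoids this.
def norm2(x):
    m = max(abs(xi) for xi in x)
    if m == 0.0:
        return 0.0
    # Each xi/m lies in [-1, 1], so the sum of squares cannot overflow.
    return m * math.sqrt(sum((xi / m) ** 2 for xi in x))

print(norm2([3.0, 4.0]))        # 5.0
print(norm2([1e200, 1e200]))    # about 1.41e200; the naive sum would overflow
```

With the naive formula, `1e200**2` already evaluates to infinity, so the second result would be lost.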
1.2.8 A Simple Linear Algebra Package: SLAP
As part of this course, we will demonstrate how linear algebra libraries are naturally layered by incrementally building a small library in M-script. For example, the axpy operation can be implemented as a function as illustrated in Figure 1.7.
A couple of comments:
- In this text, we distinguish between a (column) vector and a row vector (a vector written as a row of numbers or, equivalently, a column vector that has been transposed).

- In M-script, every variable is viewed as a multi-dimensional array. For our purposes, it suffices to think of all arrays as being two dimensional. A scalar is the special case where the array has one row and one column. A vector is the case where the array consists of one row or one column. If variable x consists of one column, then x( i )
    Operation               Abbrev.   Definition     Function                        Approx. cost
                                                                                    flops   memops
    Vector-vector operations
    Copy                    (copy)    y := x         y = SLAP_Copy( x )              0       2n
    Vector scaling          (scal)    y := αx        y = SLAP_Scal( alpha, x )       n       2n
    Adding                  (add)     z := x + y     z = SLAP_Add( x, y )            n       3n
    Scaled addition         (axpy)    z := αx + y    z = SLAP_Axpy( alpha, x, y )    2n      3n
    Dot product             (dot)     α := xᵀy       alpha = SLAP_Dot( x, y )        2n      2n
    Length                  (norm2)   α := ‖x‖_2     alpha = SLAP_Norm2( x )         2n      n

Figure 1.6: A summary of the more commonly encountered vector-vector routines in our library.
equals the ith component of that column. If variable x consists of one row, then x( i ) equals the ith component of that row. Thus, in this sense, M-script is blind to whether a vector is a row or a column vector.
- It may seem a bit strange that we are creating functions for the mentioned operations. For example, M-script can compute the inner product of column vectors x and y via the expression x' * y, and thus there is no need for the function. The reason we are writing explicit functions is to make the experience of building the library closer to how one would do it in practice with a language like the C programming language.
- We expect the reader to be able to catch on without excessive explanation. M-script is a very simple and intuitive language.
When in doubt, try help or doc at the prompt.
A list of functions for performing common vector-vector operations is given in Figure 1.6.
Exercise 1.11 Start building your library by implementing the functions in Figure 1.6. (See directions on the class wiki page.)
1.3 A Bit of History

The functions in Figure 1.6 are very similar in functionality to Fortran routines known as the (level-1) Basic Linear Algebra Subprograms (BLAS) that are commonly used in scientific libraries. These were first proposed in the 1970s and were used in the development of one of the first linear algebra libraries, LINPACK.

See
function [ z ] = SLAP_Axpy( alpha, x, y )
%
% Compute z = alpha * x + y
%
% If y is a column/row vector, then
% z is a matching vector. This is
% because the operation is almost always
% used to overwrite y: y = axpy( alpha, x, y )
%
[ m_x, n_x ] = size( x );
[ m_y, n_y ] = size( y );

if ( n_x == 1 )       % x is a column
    n = m_x;
else                  % x is a row
    n = n_x;
end

if ( n_y == 1 & m_y ~= n )
    error( 'axpy: incompatible lengths' );
elseif ( m_y == 1 & n_y ~= n )
    error( 'axpy: incompatible lengths' );
end

z = y;
for i=1:n
    z(i) = alpha * x( i ) + z(i);
end

return
end
octave:1> x = [
> 1
> -1
> 2
> ]
x =

   1
  -1
   2

octave:2> y = [
> 2
> 1
> 0
> ]
y =

   2
   1
   0

octave:3> SLAP_Axpy( -1, x, y )
ans =

   1
   2
  -2

octave:4> -1 * x + y
ans =

   1
   2
  -2

Figure 1.7: Left: A routine for computing axpy. Right: Testing the axpy function in Octave.
- C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, Basic Linear Algebra Subprograms for Fortran Usage, ACM Trans. on Math. Soft., 5 (1979) 305-325.

- J. J. Dongarra, J. R. Bunch, C. B. Moler, and G. W. Stewart, LINPACK Users' Guide, SIAM, Philadelphia, 1979.
1.4 Matrices and Matrix-Vector Multiplication
To understand what a matrix is in the context of linear algebra, we have to start with linear transformations, after which we can discuss how matrices and matrix-vector multiplication
represent such transformations.
1.4.1 First: Linear transformations
Definition 1.12 (Linear transformation) A function L : R^n → R^m is said to be a linear transformation if for all x, y ∈ R^n and α ∈ R:

- L(αx) = αL(x) (transforming a scaled vector is the same as scaling the transformed vector), and

- L(x + y) = L(x) + L(y) (the transformation is distributive).
Example 1.13 A rotation is a linear transformation. For example, let R_θ : R² → R² be the transformation that rotates a vector through angle θ. Let x, y ∈ R² and α ∈ R.

- Figure 1.8 illustrates that scaling a vector first by α and then rotating it yields the same result as rotating the vector first and then scaling it.

- Figure 1.10 illustrates that rotating x + y yields the same vector as adding the vectors after rotating them first.
Lemma 1.14 Let x, y ∈ R^n, α, β ∈ R, and let L : R^n → R^m be a linear transformation. Then

    L(αx + βy) = αL(x) + βL(y).

Proof: L(αx + βy) = L(αx) + L(βy) = αL(x) + βL(y).
Lemma 1.15 Let {x_0, x_1, ..., x_{n-1}} ⊂ R^n and let L : R^n → R^m be a linear transformation. Then

    L(x_0 + x_1 + ... + x_{n-1}) = L(x_0) + L(x_1) + ... + L(x_{n-1}).      (1.8)
[Figure 1.8 shows six panels: (a)-(c) under the heading "Scale then rotate" and (d)-(f) under "Rotate then scale".]

Figure 1.8: An illustration that for a rotation, R_θ : R² → R², one can scale the input first and then apply the transformation, or one can transform the input first and then scale the result: R_θ(2x) = 2R_θ(x).
Figure 1.9: A generalization of Figure 1.8: R_θ(αx) = αR_θ(x).
While it is tempting to say that this is simply obvious, we are going to prove it rigorously. Whenever one tries to prove a result for general n, where n is a natural number, one often uses a proof by induction. We are going to give the proof first, and then we will try to explain it.
Proof: Proof by induction on n.
Base case: n = 1. For this case, we must show that L(x0) = L(x0). This is trivially true.
Inductive step: Inductive Hypothesis (IH): Assume that the result is true for n = k, where k ≥ 1:

    L(x_0 + x_1 + ... + x_{k-1}) = L(x_0) + L(x_1) + ... + L(x_{k-1}).

We will show that the result is then also true for n = k + 1:

    L(x_0 + x_1 + ... + x_k) = L(x_0) + L(x_1) + ... + L(x_k).

Assume that k ≥ 1. Then

    L(x_0 + x_1 + ... + x_k)
      = (expose extra term)
    L(x_0 + x_1 + ... + x_{k-1} + x_k)
      = (associativity of vector addition)
    L((x_0 + x_1 + ... + x_{k-1}) + x_k)
      = (Lemma 1.14)
    L(x_0 + x_1 + ... + x_{k-1}) + L(x_k)
      = (Inductive Hypothesis)
    L(x_0) + L(x_1) + ... + L(x_{k-1}) + L(x_k).
By the Principle of Mathematical Induction the result holds for all n.
[Figure 1.10 shows six panels: (a)-(c) under the heading "Add then rotate" and (d)-(f) under "Rotate then add".]

Figure 1.10: An illustration that for a rotation, R_θ : R² → R², one can add vectors first and then apply the transformation, or one can transform the vectors first and then add the results: R_θ(x + y) = R_θ(x) + R_θ(y).
The idea is as follows: The base case shows that the result is true for n = 1: L(x_0) = L(x_0). The inductive step shows that if the result is true for n = 1, then the result is true for n = 1 + 1 = 2, so that L(x_0 + x_1) = L(x_0) + L(x_1). Since the result is indeed true for n = 1 (as proven by the base case), we now know that the result is also true for n = 2. The inductive step also implies that if the result is true for n = 2, then it is also true for n = 3. Since we just reasoned that it is true for n = 2, we now know it is also true for n = 3: L(x_0 + x_1 + x_2) = L(x_0) + L(x_1) + L(x_2). And so forth.
Remark 1.16 The Principle of Mathematical Induction says that if

- one can show that a property holds for n = n_b, and

- one can show that if it holds for n = k, where k ≥ n_b, then it also holds for n = k + 1,

then one can conclude that the property holds for all n ≥ n_b. In the above example (and quite often) n_b = 1.
Example 1.17 Show that Σ_{i=0}^{n-1} i = n(n - 1)/2.

Proof by induction:

Base case: n = 1. For this case, we must show that Σ_{i=0}^{1-1} i = 1(0)/2.

    Σ_{i=0}^{1-1} i
      = (definition of summation)
    0
      = (arithmetic)
    1(0)/2.

This proves the base case.

Inductive step: Inductive Hypothesis (IH): Assume that the result is true for n = k, where k ≥ 1:

    Σ_{i=0}^{k-1} i = k(k - 1)/2.

We will show that the result is then also true for n = k + 1:

    Σ_{i=0}^{(k+1)-1} i = (k + 1)((k + 1) - 1)/2.

Assume that k ≥ 1. Then

    Σ_{i=0}^{(k+1)-1} i
      = (arithmetic)
    Σ_{i=0}^{k} i
      = (split off last term)
    Σ_{i=0}^{k-1} i + k
      = (Inductive Hypothesis)
    k(k - 1)/2 + k
      = (algebra)
    (k² - k)/2 + 2k/2
      = (algebra)
    (k² + k)/2
      = (algebra)
    (k + 1)k/2
      = (arithmetic)
    (k + 1)((k + 1) - 1)/2.

This proves the inductive step.

By the Principle of Mathematical Induction the result holds for all n.
As we become more proficient, we will start combining steps. For now, we give lots of detail
to make sure everyone stays on board.
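A closed form like the one just proved can also be spot-checked numerically. This quick Python check (illustrative only; it is no substitute for the induction proof, which holds for all n) compares both sides for a range of n:

```python
# Spot-check the closed form: sum_{i=0}^{n-1} i = n(n-1)/2.
# range(n) produces exactly the indices 0, 1, ..., n-1.
for n in range(1, 100):
    assert sum(range(n)) == n * (n - 1) // 2
print("formula verified for n = 1, ..., 99")
```

A failed assertion here would indicate an error in the closed form (or the check); silence means the two sides agree on every tested n.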
Exercise 1.18 Use mathematical induction to prove that Σ_{i=0}^{n-1} i² = (n - 1)n(2n - 1)/6.
Theorem 1.19 Let {x_0, x_1, ..., x_{n-1}} ⊂ R^n, {α_0, α_1, ..., α_{n-1}} ⊂ R, and let L : R^n → R^m be a linear transformation. Then

    L(α_0 x_0 + α_1 x_1 + ... + α_{n-1} x_{n-1}) = α_0 L(x_0) + α_1 L(x_1) + ... + α_{n-1} L(x_{n-1}).      (1.9)

Proof:

    L(α_0 x_0 + α_1 x_1 + ... + α_{n-1} x_{n-1})
      = (Lemma 1.15)
    L(α_0 x_0) + L(α_1 x_1) + ... + L(α_{n-1} x_{n-1})
      = (definition of linear transformation)
    α_0 L(x_0) + α_1 L(x_1) + ... + α_{n-1} L(x_{n-1}).
Example 1.23 The transformation F( (χ_0, χ_1)ᵀ ) = (χ_0 χ_1, χ_1)ᵀ is not a linear transformation.

The first check should be whether F(0) = 0. The answer in this case is yes. However,

    F( 2 (1, 1)ᵀ ) = F( (2, 2)ᵀ ) = (2 × 2, 2)ᵀ = (4, 2)ᵀ  ≠  (2, 2)ᵀ = 2 (1, 1)ᵀ = 2 F( (1, 1)ᵀ ).

Hence, there is a vector x ∈ R² and α ∈ R such that F(αx) ≠ αF(x). This means that F is not a linear transformation.
Exercise 1.24 For each of the following, determine whether it is a linear transformation or not:

- F( (χ_0, χ_1, χ_2)ᵀ ) = (χ_0, 0, χ_2)ᵀ.

- F( (χ_0, χ_1)ᵀ ) = (2χ_0, 0)ᵀ.
1.4.2 From linear transformation to matrix-vector multiplication
Now we are ready to link linear transformations to matrices and matrix-vector multiplication. The definitions of vector scaling and addition mean that any x ∈ R^n can be written as

    x = ( χ_0     )       ( 1 )       ( 0 )                 ( 0 )
        ( χ_1     ) = χ_0 ( 0 ) + χ_1 ( 1 ) + ... + χ_{n-1} ( 0 )  =  Σ_{j=0}^{n-1} χ_j e_j,
        (  ...    )       (...)       (...)                 (...)
        ( χ_{n-1} )       ( 0 )       ( 0 )                 ( 1 )

where e_j denotes the vector with a one in position j and zeroes everywhere else.
Definition 1.25 The vectors e_j ∈ R^n mentioned above are called the unit basis vectors, and the notation e_j is reserved for these vectors.
Exercise 1.26 Let x, e_i ∈ R^n. Show that e_iᵀ x = xᵀ e_i = χ_i (the ith element of x).
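The claim of Exercise 1.26 is easy to check numerically. An illustrative Python sketch (the helper names e and dot are ours, not from the SLAP library):

```python
def e(i, n):
    """The unit basis vector e_i in R^n: a one in position i, zeroes elsewhere."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def dot(x, y):
    """The dot product x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

x = [4.0, -2.0, 7.0]
print(dot(e(1, 3), x))  # -2.0: e_1^T x picks out chi_1
print(dot(x, e(1, 3)))  # -2.0: the same, since x^T e_i = e_i^T x
```

Every term of the sum vanishes except the one where the basis vector holds its single one, which is exactly the argument the exercise asks for in general.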
    Matrix   Vector   Scalar symbol   LaTeX code   Note
    A        a        α               \alpha
    B        b        β               \beta
    C        c        γ               \gamma
    D        d        δ               \delta
    E        e        ε               \epsilon     e_j = jth unit basis vector.
    F        f        φ               \phi
    G        g        ξ               \xi
    H        h        η               \eta
    I                                              Used for identity matrix.
    K        k        κ               \kappa
    L        l        λ               \lambda
    M        m        μ               \mu          m(·) = row dimension.
    N        n        ν               \nu          ν is shared with V. n(·) = column dimension.
    P        p        π               \pi
    Q        q        θ               \theta
    R        r        ρ               \rho
    S        s        σ               \sigma
    T        t        τ               \tau
    U        u        υ               \upsilon
    V        v        ν               \nu          ν shared with N.
    W        w        ω               \omega
    X        x        χ               \chi
    Y        y        ψ               \psi
    Z        z        ζ               \zeta

Figure 1.11: Correspondence between letters used for matrices (uppercase Roman)/vectors (lowercase Roman) and the symbols used to denote their scalar entries (lowercase Greek letters).
Definition 1.29 (Matrix-vector multiplication)

    ( α_{0,0}    α_{0,1}    ...  α_{0,n-1}   ) ( χ_0     )   ( α_{0,0} χ_0 + α_{0,1} χ_1 + ... + α_{0,n-1} χ_{n-1}       )
    ( α_{1,0}    α_{1,1}    ...  α_{1,n-1}   ) ( χ_1     ) = ( α_{1,0} χ_0 + α_{1,1} χ_1 + ... + α_{1,n-1} χ_{n-1}       )
    (  ...                        ...        ) (  ...    )   (  ...                                                      ).  (1.10)
    ( α_{m-1,0}  α_{m-1,1}  ...  α_{m-1,n-1} ) ( χ_{n-1} )   ( α_{m-1,0} χ_0 + α_{m-1,1} χ_1 + ... + α_{m-1,n-1} χ_{n-1} )
Example 1.30 Compute Ax when

    A = ( 1 0 2 )            ( 1 )
        ( 3 1 1 )   and  x = ( 0 ).
        ( 2 1 2 )            ( 0 )

Answer: Ax = (1, 3, 2)ᵀ, the first column of the matrix!
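Definition 1.29 translates directly into two nested loops, one over the rows and one over the columns. An illustrative Python sketch (the notes' own implementations are in M-script; the matrix below matches Example 1.30 as printed here):

```python
# Matrix-vector multiply y = A x, directly from Definition 1.29:
# y_i = sum over j of alpha_{i,j} * chi_j.
def matvec(A, x):
    m, n = len(A), len(x)
    y = [0.0] * m
    for i in range(m):          # one dot product per row
        for j in range(n):
            y[i] += A[i][j] * x[j]
    return y

A = [[1, 0, 2],
     [3, 1, 1],
     [2, 1, 2]]
print(matvec(A, [1, 0, 0]))  # [1.0, 3.0, 2.0] -- the first column of A
```

Multiplying by the unit basis vector e_0 extracts the first column, exactly as the example observes; each of the m rows costs a dot product of length n, which is where the 2mn flop count of Section 1.4.5 comes from.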
It is not hard to see that if e_j is the jth unit basis vector (as defined in Definition 1.25), then

    A e_j = ( a_0  a_1  ...  a_j  ...  a_{n-1} ) e_j = 0 · a_0 + 0 · a_1 + ... + 1 · a_j + ... + 0 · a_{n-1} = a_j,

where a_j denotes the jth column of A. Also, given a vector x, the dot product e_iᵀ x equals the ith entry in x, χ_i:

    e_iᵀ x = ( 0  ...  0  1  0  ...  0 ) ( χ_0, ..., χ_i, ..., χ_{n-1} )ᵀ = 0 · χ_0 + ... + 1 · χ_i + ... + 0 · χ_{n-1} = χ_i.
[Figure 1.12 shows three panels: (a), (b), and (c).]

Figure 1.12: Computing the matrix that represents rotation R_θ. (a)-(b) Trigonometry tells us the coefficients of R_θ(e_0) and R_θ(e_1). This then motivates the matrix that is given in (c), so that the rotation can be written as a matrix-vector multiplication.
Example 1.31
\[
\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}^T
\left[
\begin{pmatrix} 1 & 0 & 2 \\ 3 & 1 & 1 \\ 2 & 1 & 2 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\right]
= \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}^T \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix} = 2,
\]
the $(2,0)$ element of the matrix. We notice that $\alpha_{i,j} = e_i^T (A e_j)$.
Later, we will see that $e_i^T A$ equals the $i$th row of matrix $A$ and that $\alpha_{i,j} = e_i^T(Ae_j) = e_i^T A e_j = (e_i^T A)e_j$.
Example 1.32 Recall that a rotation $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear transformation. We now show how to compute the matrix, $Q$, that represents this rotation.
Given that the transformation is from $\mathbb{R}^2$ to $\mathbb{R}^2$, we know that the matrix will be a $2 \times 2$ matrix: it takes vectors of size two as input and produces vectors of size two.

We have learned that the first column of the matrix $Q$ will equal $R_\theta(e_0)$ and the second column will equal $R_\theta(e_1)$. In Figure 1.12 we motivate that
\[
R_\theta(e_0) = \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix}
\quad\text{and}\quad
R_\theta(e_1) = \begin{pmatrix} -\sin(\theta) \\ \cos(\theta) \end{pmatrix}.
\]
We conclude that
\[
Q = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.
\]
This means that a vector $x = \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}$ is transformed into
\[
R_\theta(x) = Qx = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}
\begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}
= \begin{pmatrix} \cos(\theta)\chi_0 - \sin(\theta)\chi_1 \\ \sin(\theta)\chi_0 + \cos(\theta)\chi_1 \end{pmatrix}.
\]
This is a formula you may have seen in a precalculus or physics course when discussing change of coordinates, except with the coordinates $\chi_0$ and $\chi_1$ replaced by $x$ and $y$, respectively.
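The formula can be spot-checked numerically: rotating $e_0$ by 90 degrees should yield $e_1$. A quick Python sketch of the matrix-vector product above (our own helper, not part of the text's library):

```python
import math

def rotate(theta, x):
    # Apply Q = [[cos, -sin], [sin, cos]] to x = (chi0, chi1).
    c, s = math.cos(theta), math.sin(theta)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1]]

# Rotating e0 = (1, 0) by 90 degrees gives (0, 1), i.e., e1:
print(rotate(math.pi / 2, [1, 0]))
```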
Example 1.33 The transformation $F\left(\begin{pmatrix}\chi_0\\\chi_1\end{pmatrix}\right) = \begin{pmatrix}\chi_0+\chi_1\\\chi_0\end{pmatrix}$ was shown in Example 1.20 to be a linear transformation. Let us give an alternative proof of this: we will compute a possible matrix, $A$, that represents this linear transformation. We will then show that $F(x) = Ax$, which means that $F$ is a linear transformation.

To compute a possible matrix consider:
\[
F\left(\begin{pmatrix}1\\0\end{pmatrix}\right) = \begin{pmatrix}1\\1\end{pmatrix}
\quad\text{and}\quad
F\left(\begin{pmatrix}0\\1\end{pmatrix}\right) = \begin{pmatrix}1\\0\end{pmatrix}.
\]
Thus, if $F$ is a linear transformation, then $F(x) = Ax$ where $A = \begin{pmatrix}1&1\\1&0\end{pmatrix}$. Now,
\[
Ax = \begin{pmatrix}1&1\\1&0\end{pmatrix}\begin{pmatrix}\chi_0\\\chi_1\end{pmatrix}
= \begin{pmatrix}\chi_0+\chi_1\\\chi_0\end{pmatrix}
= F\left(\begin{pmatrix}\chi_0\\\chi_1\end{pmatrix}\right) = F(x),
\]
which finishes the proof that $F$ is a linear transformation.
Exercise 1.34 Show that the transformation in Example 1.21 is not a linear transformation by computing a possible matrix that represents it, and then showing that this matrix does not represent it.

Exercise 1.35 Show that the transformation in Example 1.23 is not a linear transformation by computing a possible matrix that represents it, and then showing that this matrix does not represent it.
Exercise 1.36 For each of the transformations in Exercise 1.24 compute a possible matrixthat represents it and use it to show whether the transformation is linear.
1.4.3 Special Matrices
The identity matrix. Let $L_I : \mathbb{R}^n \to \mathbb{R}^n$ be the function defined for every $x \in \mathbb{R}^n$ as $L_I(x) = x$. Clearly $L_I(\alpha x + \beta y) = \alpha x + \beta y = \alpha L_I(x) + \beta L_I(y)$, so that we recognize it as a linear transformation. We will denote the matrix that represents $L_I$ by the letter $I$ and call it the identity matrix. By the definition of a matrix, the $j$th column of $I$ is given by $L_I(e_j) = e_j$. Thus, the identity matrix is given by
\[
I = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}.
\]
Notice that clearly $Ix = x$, and a column partitioning of $I$ yields $I = \begin{pmatrix} e_0 & e_1 & \cdots & e_{n-1} \end{pmatrix}$.

In M-script, an $n \times n$ identity matrix is generated by the command eye( n ).
Diagonal matrices
Definition 1.37 (Diagonal matrix) A matrix $A \in \mathbb{R}^{n \times n}$ is said to be diagonal if $\alpha_{i,j} = 0$ for all $i \neq j$.
> y = diag( A )
y =
   1
   2
   3

In linear algebra an element-wise vector-vector product is not a meaningful operation: when $x, y \in \mathbb{R}^n$ the product $xy$ has no meaning. However, in M-script, if x and y are both column vectors (or matrices, for that matter), the operation x .* y is an element-wise multiplication. Thus, diag( x ) * y can alternatively be written as x .* y.
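The equivalence of diag( x ) * y and x .* y can be checked with a few lines of code; here is a Python sketch (our own helpers, standing in for the M-script built-ins):

```python
def diag(x):
    # Square matrix with the entries of x on the diagonal, zeros elsewhere.
    n = len(x)
    return [[x[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

x, y = [1, 2, 3], [4, 5, 6]
# Multiplying by diag(x) scales the ith entry of y by x[i],
# which is exactly the element-wise product:
print(matvec(diag(x), y))             # [4, 10, 18]
print([a * b for a, b in zip(x, y)])  # [4, 10, 18]
```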
Triangular matrices
Definition 1.41 (Triangular matrix) A matrix $A \in \mathbb{R}^{n \times n}$ is said to be

- lower triangular if $\alpha_{i,j} = 0$ for all $i < j$;
- strictly lower triangular if $\alpha_{i,j} = 0$ for all $i \leq j$;
- unit lower triangular if $\alpha_{i,j} = 0$ for all $i < j$ and $\alpha_{i,j} = 1$ if $i = j$;
- upper triangular if $\alpha_{i,j} = 0$ for all $i > j$;
- strictly upper triangular if $\alpha_{i,j} = 0$ for all $i \geq j$;
- unit upper triangular if $\alpha_{i,j} = 0$ for all $i > j$ and $\alpha_{i,j} = 1$ if $i = j$.

If a matrix is either lower or upper triangular, it is said to be triangular.

Exercise 1.42 Give examples for each of the triangular matrices in Definition 1.41.

Exercise 1.43 Show that a matrix that is both lower and upper triangular is in fact a diagonal matrix.

In M-script, given an $n \times n$ matrix A, its lower and upper triangular parts can be extracted by the calls tril( A ) and triu( A ), respectively. Its strictly lower and strictly upper triangular parts can be extracted by the calls tril( A, -1 ) and triu( A, 1 ), respectively.
Exercise 1.44 Add the functions trilu( A ) and triuu( A ) to your SLAP library. These functions return the lower and upper triangular part of A, respectively, with the diagonal set to ones. Thus,

> A = [
   1 -2  1
  -1  2  0
   2  3  3
];
> trilu( A )
ans =
   1   0   0
  -1   1   0
   2   3   1

Hint: use the tril() and eye() functions. You will also want to use the size() function to extract the dimensions of A, to pass in to eye().
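To see the idea behind the hint, here is one way trilu could be realized, sketched in Python rather than M-script (tril_, eye_, and trilu_ are our own stand-ins for the M-script built-ins and the requested routine; the actual exercise should of course be done in M-script):

```python
def eye_(n):
    # n x n identity matrix, like eye( n ).
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def tril_(A, k=0):
    # Keep entries on or below the kth diagonal, zero the rest, like tril( A, k ).
    return [[A[i][j] if j - i <= k else 0 for j in range(len(A[0]))]
            for i in range(len(A))]

def trilu_(A):
    # Strictly lower triangular part of A plus the identity.
    n = len(A)
    S, I = tril_(A, -1), eye_(n)
    return [[S[i][j] + I[i][j] for j in range(n)] for i in range(n)]

A = [[1, -2, 1],
     [-1, 2, 0],
     [2, 3, 3]]
print(trilu_(A))  # [[1, 0, 0], [-1, 1, 0], [2, 3, 1]]
```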
Transpose matrix
Definition 1.45 (Transpose matrix) Let $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$. Then $B$ is said to be the transpose of $A$ if, for $0 \leq i < m$ and $0 \leq j < n$, $\beta_{j,i} = \alpha_{i,j}$. The transpose of a matrix $A$ is denoted by $A^T$.

We have already used $(\cdot)^T$ to indicate a row vector, which is consistent with the above definition.

Rather than supplying our own function for transposing a matrix (or vector), we will rely on the M-script operation: B = A' creates a matrix that equals the transpose of matrix A. There is a reason for this: rarely do we explicitly transpose a matrix. Instead, we will try to always write our routines so that the operation is computed as if the matrix is transposed, but without performing the transpose, so that we do not pay the overhead (in time and memory) associated with the transpose.
Example 1.46 Let $A = \begin{pmatrix} 1 & 0 & 2 & 1 \\ 2 & 1 & 1 & 2 \\ 3 & 1 & 1 & 3 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}$. Then
\[
A^T = \begin{pmatrix} 1 & 0 & 2 & 1 \\ 2 & 1 & 1 & 2 \\ 3 & 1 & 1 & 3 \end{pmatrix}^T
= \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 2 & 1 & 1 \\ 1 & 2 & 3 \end{pmatrix}
\quad\text{and}\quad
x^T = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}^T = \begin{pmatrix} 1 & 2 & 4 \end{pmatrix}.
\]
Example 1.50 Let $A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & -1 \\ 3 & -1 & 1 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$. Then
\[
Ax = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & -1 \\ 3 & -1 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}
= \begin{pmatrix}
\begin{pmatrix}1\\0\\2\end{pmatrix}^T \begin{pmatrix}1\\2\\1\end{pmatrix} \\[6pt]
\begin{pmatrix}2\\1\\-1\end{pmatrix}^T \begin{pmatrix}1\\2\\1\end{pmatrix} \\[6pt]
\begin{pmatrix}3\\-1\\1\end{pmatrix}^T \begin{pmatrix}1\\2\\1\end{pmatrix}
\end{pmatrix}
= \begin{pmatrix}
(1)(1) + (0)(2) + (2)(1) \\
(2)(1) + (1)(2) + (-1)(1) \\
(3)(1) + (-1)(2) + (1)(1)
\end{pmatrix}
= \begin{pmatrix} 3 \\ 3 \\ 2 \end{pmatrix}.
\]
The algorithm for computing y := Ax + y is given in Figure 1.13(left). To its right is theM-script function that implements the matrix-vector multiplication. If initially y = 0, thenthese compute y := Ax.
Now, let us revisit the fact that the matrix-vector multiply can be computed as dot products of the rows of $A$ with the vector $x$. Think of the matrix $A$ as individual rows:
\[
A = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix},
\]
where $a_i$ is the (column) vector which, when transposed, becomes the $i$th row of the matrix. Then
\[
Ax = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix} x
= \begin{pmatrix} a_0^T x \\ a_1^T x \\ \vdots \\ a_{m-1}^T x \end{pmatrix},
\]
for i = 0, . . . , m − 1
  for j = 0, . . . , n − 1
    ψ_i := ψ_i + α_{i,j} χ_j
  endfor
endfor

for i=1:m
  for j=1:n
    y( i ) = y( i ) + A( i, j ) * x( j );
  end
end

Figure 1.13: Algorithm and M-script for computing y := Ax + y.
for i = 0, . . . , m − 1
  ψ_i := a_i^T x + ψ_i
endfor

for i=1:m
  y( i ) = y( i ) + FLA_Dot( A( i,: ), x );
end

Figure 1.14: Algorithm and M-script for computing y := Ax + y via dot products. A( i,: ) equals the ith row of array A.
for j = 0, . . . , n − 1
  for i = 0, . . . , m − 1
    ψ_i := ψ_i + α_{i,j} χ_j
  endfor
endfor

for j=1:n
  for i=1:m
    y( i ) = y( i ) + A( i, j ) * x( j );
  end
end

Figure 1.15: Algorithm and M-script for computing y := Ax + y. This is exactly the algorithm in Figure 1.13, except with the order of the loops interchanged.
for j = 0, . . . , n − 1
  y := χ_j a_j + y
endfor

for j=1:n
  y = FLA_Axpy( x( j ), A( :, j ), y );
end

Figure 1.16: Algorithm and M-script for computing y := Ax + y via axpy operations. A( :, j ) equals the jth column of array A.
7/30/2019 Practical Linear Algebra
40/253
32 Chapter 1. Introduction
which is exactly what we reasoned before. An algorithm and corresponding M-script imple-mentation that exploit this insight are given in Figure 1.14.
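In a language with explicit loops, the dot-product formulation looks as follows (a Python sketch of our own, with dot standing in for the text's FLA_Dot):

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec_by_dots(A, x, y):
    # y := A x + y, computed with one dot product per row of A.
    return [dot(A[i], x) + y[i] for i in range(len(A))]

A = [[1, 0, 2],
     [3, 1, 1]]
print(matvec_by_dots(A, [1, 1, 1], [0, 0]))  # [3, 5]
```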
Via axpy operations: Next, we note that, by definition,
\[
\begin{pmatrix}
\alpha_{0,0}\chi_0 + \alpha_{0,1}\chi_1 + \cdots + \alpha_{0,n-1}\chi_{n-1} \\
\alpha_{1,0}\chi_0 + \alpha_{1,1}\chi_1 + \cdots + \alpha_{1,n-1}\chi_{n-1} \\
\vdots \\
\alpha_{m-1,0}\chi_0 + \alpha_{m-1,1}\chi_1 + \cdots + \alpha_{m-1,n-1}\chi_{n-1}
\end{pmatrix}
= \chi_0 \begin{pmatrix}\alpha_{0,0}\\\alpha_{1,0}\\\vdots\\\alpha_{m-1,0}\end{pmatrix}
+ \chi_1 \begin{pmatrix}\alpha_{0,1}\\\alpha_{1,1}\\\vdots\\\alpha_{m-1,1}\end{pmatrix}
+ \cdots
+ \chi_{n-1} \begin{pmatrix}\alpha_{0,n-1}\\\alpha_{1,n-1}\\\vdots\\\alpha_{m-1,n-1}\end{pmatrix}.
\]
This suggests the alternative algorithm and M-script for computing $y := Ax + y$ given in Figure 1.15, which are exactly the algorithm and M-script given in Figure 1.13 but with the two loops interchanged.
If we let $a_j$ denote the vector that equals the $j$th column of $A$, then $A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}$ and
\[
Ax = \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1} a_{n-1}.
\]
This suggests the algorithm and M-script for computing $y := Ax + y$ given in Figure 1.16. It illustrates how matrix-vector multiplication can be expressed in terms of axpy operations.
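The column (axpy) formulation accesses $A$ by columns instead of by rows; a Python sketch of our own (axpy mirrors the text's FLA_Axpy):

```python
def axpy(alpha, a, y):
    # y := alpha * a + y, the scaled vector addition of Section 1.2.5.
    return [alpha * ai + yi for ai, yi in zip(a, y)]

def matvec_by_axpys(A, x, y):
    # y := A x + y, computed with one axpy per column of A.
    m, n = len(A), len(x)
    for j in range(n):
        col_j = [A[i][j] for i in range(m)]
        y = axpy(x[j], col_j, y)
    return y

A = [[1, 0, 2],
     [3, 1, 1]]
print(matvec_by_axpys(A, [1, 1, 1], [0, 0]))  # [3, 5]
```

For column-major storage, this variant touches memory contiguously, which is why the text favors it when A is stored by columns.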
Example 1.51 Let $A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & -1 \\ 3 & -1 & 1 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$. Then
\[
Ax = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & -1 \\ 3 & -1 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}
= (1)\begin{pmatrix}1\\2\\3\end{pmatrix} + (2)\begin{pmatrix}0\\1\\-1\end{pmatrix} + (1)\begin{pmatrix}2\\-1\\1\end{pmatrix}
= \begin{pmatrix}(1)(1)\\(1)(2)\\(1)(3)\end{pmatrix} + \begin{pmatrix}(2)(0)\\(2)(1)\\(2)(-1)\end{pmatrix} + \begin{pmatrix}(1)(2)\\(1)(-1)\\(1)(1)\end{pmatrix}
= \begin{pmatrix}
(1)(1) + (0)(2) + (2)(1) \\
(2)(1) + (1)(2) + (-1)(1) \\
(3)(1) + (-1)(2) + (1)(1)
\end{pmatrix}
= \begin{pmatrix} 3 \\ 3 \\ 2 \end{pmatrix}.
\]
1.4.5 Cost of matrix-vector multiplication
Computing y := Ax + y, where A Rmn, requires mn multiplies and mn adds, for a totalof 2mn floating point operations (flops). This count is the same regardless of the order ofthe loops (i.e., regardless of whether the matrix-vector multiply is organized by computingdot operations with the rows or axpy operations with the columns).
1.4.6 Scaling and adding matrices
Theorem 1.52 Let $L_A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, let $\gamma \in \mathbb{R}$, and, for all $x \in \mathbb{R}^n$, define the function $L_B : \mathbb{R}^n \to \mathbb{R}^m$ by $L_B(x) = \gamma L_A(x)$. Then $L_B(x)$ is a linear transformation.

Proof: Let $x, y \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$. Then
\[
L_B(\alpha x + \beta y) = \gamma L_A(\alpha x + \beta y) = \gamma(\alpha L_A(x) + \beta L_A(y)) = (\gamma\alpha)L_A(x) + (\gamma\beta)L_A(y)
= (\alpha\gamma)L_A(x) + (\beta\gamma)L_A(y) = \alpha(\gamma L_A(x)) + \beta(\gamma L_A(y)) = \alpha L_B(x) + \beta L_B(y).
\]
Hence $L_B$ is a linear transformation.
Now, let $A$ be the matrix that represents $L_A$. Then, for all $x \in \mathbb{R}^n$, $\gamma(Ax) = \gamma L_A(x) = L_B(x)$. Since $L_B$ is a linear transformation, there should be a matrix $B$ such that, for all $x \in \mathbb{R}^n$, $Bx = L_B(x) = \gamma(Ax)$. One way to find how that matrix relates to $\gamma$ and $A$ is to recall that $b_j = Be_j$, the $j$th column of $B$. Thus, $b_j = Be_j = \gamma(Ae_j) = \gamma a_j$, where $a_j$ equals the $j$th column of $A$. We conclude that $B$ is computed from $A$ by scaling each column by $\gamma$. But that simply means that each element of $B$ is scaled by $\gamma$. This motivates the following definition.

Definition 1.53 (Scaling a matrix) If $A \in \mathbb{R}^{m \times n}$ and $\gamma \in \mathbb{R}$, then
\[
\gamma\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}
= \begin{pmatrix}
\gamma\alpha_{0,0} & \gamma\alpha_{0,1} & \cdots & \gamma\alpha_{0,n-1} \\
\gamma\alpha_{1,0} & \gamma\alpha_{1,1} & \cdots & \gamma\alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\gamma\alpha_{m-1,0} & \gamma\alpha_{m-1,1} & \cdots & \gamma\alpha_{m-1,n-1}
\end{pmatrix}.
\]
An alternative motivation for this definition is to consider
\[
\gamma(Ax) = \gamma\begin{pmatrix}
\alpha_{0,0}\chi_0 + \alpha_{0,1}\chi_1 + \cdots + \alpha_{0,n-1}\chi_{n-1} \\
\vdots \\
\alpha_{m-1,0}\chi_0 + \alpha_{m-1,1}\chi_1 + \cdots + \alpha_{m-1,n-1}\chi_{n-1}
\end{pmatrix}
= \begin{pmatrix}
\gamma(\alpha_{0,0}\chi_0 + \alpha_{0,1}\chi_1 + \cdots + \alpha_{0,n-1}\chi_{n-1}) \\
\vdots \\
\gamma(\alpha_{m-1,0}\chi_0 + \alpha_{m-1,1}\chi_1 + \cdots + \alpha_{m-1,n-1}\chi_{n-1})
\end{pmatrix}
\]
\[
= \begin{pmatrix}
\gamma\alpha_{0,0}\chi_0 + \gamma\alpha_{0,1}\chi_1 + \cdots + \gamma\alpha_{0,n-1}\chi_{n-1} \\
\vdots \\
\gamma\alpha_{m-1,0}\chi_0 + \gamma\alpha_{m-1,1}\chi_1 + \cdots + \gamma\alpha_{m-1,n-1}\chi_{n-1}
\end{pmatrix}
= \begin{pmatrix}
\gamma\alpha_{0,0} & \gamma\alpha_{0,1} & \cdots & \gamma\alpha_{0,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma\alpha_{m-1,0} & \gamma\alpha_{m-1,1} & \cdots & \gamma\alpha_{m-1,n-1}
\end{pmatrix}
\begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
= (\gamma A)x.
\]

Remark 1.54 Since, by design, $\gamma(Ax) = (\gamma A)x$, we can drop the parentheses and write $\gamma Ax$ (which also equals $A(\gamma x)$ since $L(x) = Ax$ is a linear transformation).
Theorem 1.55 Let $L_A : \mathbb{R}^n \to \mathbb{R}^m$ and $L_B : \mathbb{R}^n \to \mathbb{R}^m$ both be linear transformations and, for all $x \in \mathbb{R}^n$, define the function $L_C : \mathbb{R}^n \to \mathbb{R}^m$ by $L_C(x) = L_A(x) + L_B(x)$. Then $L_C(x)$ is a linear transformation.

Exercise 1.56 Prove Theorem 1.55.

Now, let $A$, $B$, and $C$ be the matrices that represent $L_A$, $L_B$, and $L_C$ from Theorem 1.55, respectively. Then, for all $x \in \mathbb{R}^n$, $Cx = L_C(x) = L_A(x) + L_B(x)$. One way to find how matrix $C$ relates to matrices $A$ and $B$ is to exploit the fact that
\[
c_j = Ce_j = L_C(e_j) = L_A(e_j) + L_B(e_j) = Ae_j + Be_j = a_j + b_j,
\]
where $a_j$, $b_j$, and $c_j$ equal the $j$th columns of $A$, $B$, and $C$, respectively. Thus, the $j$th column of $C$ equals the sum of the corresponding columns of $A$ and $B$. But that simply means that each element of $C$ equals the sum of the corresponding elements of $A$ and $B$:
Definition 1.57 (Matrix addition) If $A, B \in \mathbb{R}^{m \times n}$, then
\[
\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}
+
\begin{pmatrix}
\beta_{0,0} & \beta_{0,1} & \cdots & \beta_{0,n-1} \\
\beta_{1,0} & \beta_{1,1} & \cdots & \beta_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\beta_{m-1,0} & \beta_{m-1,1} & \cdots & \beta_{m-1,n-1}
\end{pmatrix}
=
\begin{pmatrix}
\alpha_{0,0}+\beta_{0,0} & \alpha_{0,1}+\beta_{0,1} & \cdots & \alpha_{0,n-1}+\beta_{0,n-1} \\
\alpha_{1,0}+\beta_{1,0} & \alpha_{1,1}+\beta_{1,1} & \cdots & \alpha_{1,n-1}+\beta_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0}+\beta_{m-1,0} & \alpha_{m-1,1}+\beta_{m-1,1} & \cdots & \alpha_{m-1,n-1}+\beta_{m-1,n-1}
\end{pmatrix}.
\]
Exercise 1.58 Note: I have changed this question from before. Give a motivation for matrix addition by considering a linear transformation $L_C(x) = L_A(x) + L_B(x)$.
1.4.7 Partitioning matrices and vectors into submatrices (blocks) and subvectors
Theorem 1.59 Let $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$, and $y \in \mathbb{R}^m$. Let $m = m_0 + m_1 + \cdots + m_{M-1}$, $m_i \geq 0$ for $i = 0, \ldots, M-1$; and $n = n_0 + n_1 + \cdots + n_{N-1}$, $n_j \geq 0$ for $j = 0, \ldots, N-1$. Partition
\[
A = \begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix}, \quad
x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix}, \quad\text{and}\quad
y = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{M-1} \end{pmatrix}
\]
with $A_{i,j} \in \mathbb{R}^{m_i \times n_j}$, $x_j \in \mathbb{R}^{n_j}$, and $y_i \in \mathbb{R}^{m_i}$. Then $y_i = \sum_{j=0}^{N-1} A_{i,j} x_j$.

This theorem is intuitively true but messy to prove carefully, so we will not give its proof, relying on examples instead.
Remark 1.60 If one partitions matrix $A$, vector $x$, and vector $y$ into blocks, and one makes sure the dimensions match up, then blocked matrix-vector multiplication proceeds exactly as does a regular matrix-vector multiplication, except that individual multiplications of scalars commute while (in general) individual multiplications with matrix and vector blocks (submatrices and subvectors) do not.
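Theorem 1.59 can be illustrated for a $2 \times 2$ blocking. The sketch below (plain Python with our own helper names) computes $y$ block by block and checks it against the unblocked product:

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def block(A, rows, cols):
    # Extract the submatrix of A with the given row and column indices.
    return [[A[i][j] for j in cols] for i in rows]

A = [[1, 2, 4, 1],
     [1, 0, 1, 2],
     [2, 1, 3, 1],
     [1, 2, 3, 4]]
x = [1, 2, 3, 4]

# Partition A into 2x2 blocks and x into two subvectors:
top, bot = range(0, 2), range(2, 4)
x0, x1 = x[:2], x[2:]
# y_i = A_{i,0} x_0 + A_{i,1} x_1, exactly as in Theorem 1.59:
y0 = add(matvec(block(A, top, range(0, 2)), x0), matvec(block(A, top, range(2, 4)), x1))
y1 = add(matvec(block(A, bot, range(0, 2)), x0), matvec(block(A, bot, range(2, 4)), x1))
print(y0 + y1 == matvec(A, x))  # True
```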
Example 1.61 Consider
\[
A = \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & \alpha_{11} & a_{12}^T \\ A_{20} & a_{21} & A_{22} \end{pmatrix}
= \left(\begin{array}{cc|c|cc}
1 & 2 & 4 & 1 & 0 \\
1 & 0 & 1 & 2 & 1 \\ \hline
2 & 1 & 3 & 1 & 2 \\ \hline
1 & 2 & 3 & 4 & 3 \\
1 & 2 & 0 & 1 & 2
\end{array}\right), \tag{1.11}
\]
\[
x = \begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix},
\quad\text{and}\quad
y = \begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix}, \tag{1.12}
\]
where $y_0, y_2 \in \mathbb{R}^2$. Then
\[
y = \begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix}
= \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & \alpha_{11} & a_{12}^T \\ A_{20} & a_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} A_{00}x_0 + a_{01}\chi_1 + A_{02}x_2 \\ a_{10}^T x_0 + \alpha_{11}\chi_1 + a_{12}^T x_2 \\ A_{20}x_0 + a_{21}\chi_1 + A_{22}x_2 \end{pmatrix}
\]
\[
= \begin{pmatrix}
\begin{pmatrix}1&2\\1&0\end{pmatrix}\begin{pmatrix}1\\2\end{pmatrix} + \begin{pmatrix}4\\1\end{pmatrix}3 + \begin{pmatrix}1&0\\2&1\end{pmatrix}\begin{pmatrix}4\\5\end{pmatrix} \\[6pt]
\begin{pmatrix}2&1\end{pmatrix}\begin{pmatrix}1\\2\end{pmatrix} + (3)3 + \begin{pmatrix}1&2\end{pmatrix}\begin{pmatrix}4\\5\end{pmatrix} \\[6pt]
\begin{pmatrix}1&2\\1&2\end{pmatrix}\begin{pmatrix}1\\2\end{pmatrix} + \begin{pmatrix}3\\0\end{pmatrix}3 + \begin{pmatrix}4&3\\1&2\end{pmatrix}\begin{pmatrix}4\\5\end{pmatrix}
\end{pmatrix}
= \begin{pmatrix}
\begin{pmatrix}5\\1\end{pmatrix} + \begin{pmatrix}12\\3\end{pmatrix} + \begin{pmatrix}4\\13\end{pmatrix} \\[4pt]
4 + 9 + 14 \\[4pt]
\begin{pmatrix}5\\5\end{pmatrix} + \begin{pmatrix}9\\0\end{pmatrix} + \begin{pmatrix}31\\14\end{pmatrix}
\end{pmatrix}
= \begin{pmatrix} 21 \\ 17 \\ 27 \\ 45 \\ 19 \end{pmatrix}.
\]
Remark 1.62 The labeling of the submatrices and subvectors in (1.11) and (1.12) was carefully chosen to convey information: the letters that are used convey information about the shapes. For example, for $a_{01}$ and $a_{21}$ the use of a lowercase Roman letter indicates they are column vectors, while the ${}^T$s in $a_{10}^T$ and $a_{12}^T$ indicate that they are row vectors. Symbols $\alpha_{11}$ and $\chi_1$ indicate these are scalars. We will use these conventions consistently to enhance readability.
y := Mvmult_unb_var1(A, x, y)
Partition A → (AT / AB), y → (yT / yB), where AT is 0 × n and yT is 0 × 1
while m(AT) < m(A) do
  Repartition
    (AT / AB) → (A0 / a1^T / A2), (yT / yB) → (y0 / ψ1 / y2), where a1^T is a row
  ψ1 := a1^T x + ψ1
  Continue with
    (AT / AB) ← (A0 / a1^T / A2), (yT / yB) ← (y0 / ψ1 / y2)
endwhile

y := Mvmult_unb_var2(A, x, y)
Partition A → (AL | AR), x → (xT / xB), where AL is m × 0 and xT is 0 × 1
while m(xT) < m(x) do
  Repartition
    (AL | AR) → (A0 | a1 | A2), (xT / xB) → (x0 / χ1 / x2), where a1 is a column
  y := χ1 a1 + y
  Continue with
    (AL | AR) ← (A0 | a1 | A2), (xT / xB) ← (x0 / χ1 / x2)
endwhile

Figure 1.17: The algorithms in Figures 1.14 and 1.16, represented in FLAME notation.
1.5 A High Level Representation of Algorithms
It is our experience that many, if not most, errors in coding are related to indexing mistakes.We now discuss how matrix algorithms can be described without excessive indexing.
1.5.1 Matrix-vector multiplication

In Figure 1.17, we present algorithms for computing $y := Ax + y$, already given in Figures 1.14 and 1.16, using a notation that is meant to hide indexing. Let us discuss the algorithm on the left in detail.

We compare and contrast the partitioning of matrix $A$ and vector $y$ in Figure 1.14,
\[
A = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix}
\quad\text{and}\quad
y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix},
\]
with the partitioning in Figure 1.17 (left),
\[
A = \begin{pmatrix} A_0 \\ a_1^T \\ A_2 \end{pmatrix}
\quad\text{and}\quad
y = \begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix}.
\]
y := Mvmult_t_unb_var1(A, x, y)
Partition A → (AL | AR), y → (yT / yB), where AL is m × 0 and yT is 0 × 1
while m(yT) < m(y) do
  Repartition
    (AL | AR) → (A0 | a1 | A2), (yT / yB) → (y0 / ψ1 / y2), where a1 is a column
  ψ1 := a1^T x + ψ1
  Continue with
    (AL | AR) ← (A0 | a1 | A2), (yT / yB) ← (y0 / ψ1 / y2)
endwhile

y := Mvmult_t_unb_var2(A, x, y)
Partition A → (AT / AB), x → (xT / xB), where AT is 0 × n and xT is 0 × 1
while m(AT) < m(A) do
  Repartition
    (AT / AB) → (A0 / a1^T / A2), (xT / xB) → (x0 / χ1 / x2), where a1^T is a row
  y := χ1 a1 + y
  Continue with
    (AT / AB) ← (A0 / a1^T / A2), (xT / xB) ← (x0 / χ1 / x2)
endwhile

Figure 1.18: Algorithms for computing y := A^T x + y. Notice that matrix A is not explicitly transposed. Be sure to compare and contrast with the algorithms in Figure 1.17.
1.5.2 Transpose matrix-vector multiplication
Example 1.63 Let $A = \begin{pmatrix} 1 & 2 & 0 \\ -2 & -1 & -1 \\ 1 & 2 & 3 \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$. Then
\[
A^T x = \begin{pmatrix} 1 & 2 & 0 \\ -2 & -1 & -1 \\ 1 & 2 & 3 \end{pmatrix}^T \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
= \begin{pmatrix} 1 & -2 & 1 \\ 2 & -1 & 2 \\ 0 & -1 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
= \begin{pmatrix} 0 \\ 6 \\ 7 \end{pmatrix}.
\]
The thing to notice is that what was a column in $A$ becomes a row in $A^T$.

The above example motivates the observation that if
\[
A = \begin{pmatrix} A_T \\ A_B \end{pmatrix} = \begin{pmatrix} A_0 \\ a_1^T \\ A_2 \end{pmatrix}
\quad\text{then}\quad
A^T = \begin{pmatrix} A_T^T & A_B^T \end{pmatrix} = \begin{pmatrix} A_0^T & a_1 & A_2^T \end{pmatrix}.
\]
Moreover, if
\[
A = \begin{pmatrix} A_L & A_R \end{pmatrix} = \begin{pmatrix} A_0 & a_1 & A_2 \end{pmatrix}
\]
y := Mvmult_unb_var1b(A, x, y)
Partition A → (ATL ATR / ABL ABR), x → (xT / xB), y → (yT / yB),
  where ATL is 0 × 0 and xT, yT are 0 × 1
while m(ATL) < m(A) do
  Repartition
    (ATL ATR / ABL ABR) → (A00 a01 A02 / a10^T α11 a12^T / A20 a21 A22),
    (xT / xB) → (x0 / χ1 / x2), (yT / yB) → (y0 / ψ1 / y2),
    where α11, χ1, and ψ1 are scalars
  ψ1 := a10^T x0 + α11 χ1 + a12^T x2 + ψ1
  Continue with
    (ATL ATR / ABL ABR) ← (A00 a01 A02 / a10^T α11 a12^T / A20 a21 A22),
    (xT / xB) ← (x0 / χ1 / x2), (yT / yB) ← (y0 / ψ1 / y2)
endwhile

y := Mvmult_unb_var2b(A, x, y)
Partition and repartition as in Mvmult_unb_var1b
while m(ATL) < m(A) do
  y0 := χ1 a01 + y0
  ψ1 := χ1 α11 + ψ1
  y2 := χ1 a21 + y2
endwhile

Figure 1.19: The algorithms in Figure 1.17, exposing more submatrices and vectors. Works for square matrices and prepares for the discussion on multiplying with triangular and symmetric matrices.
then
\[
A^T = \begin{pmatrix} A_L^T \\ A_R^T \end{pmatrix} = \begin{pmatrix} A_0^T \\ a_1^T \\ A_2^T \end{pmatrix}.
\]
This motivates the algorithms for computing $y := A^T x + y$ in Figure 1.18.
y := Trmv_un_unb_var1(U, x, y)
Partition U → (UTL UTR / 0 UBR), x → (xT / xB), y → (yT / yB),
  where UTL is 0 × 0 and xT, yT are 0 × 1
while m(UTL) < m(U) do
  Repartition
    (UTL UTR / 0 UBR) → (U00 u01 U02 / 0 υ11 u12^T / 0 0 U22),
    (xT / xB) → (x0 / χ1 / x2), (yT / yB) → (y0 / ψ1 / y2),
    where υ11, χ1, and ψ1 are scalars
  ψ1 := υ11 χ1 + u12^T x2 + ψ1
  Continue with
    (UTL UTR / 0 UBR) ← (U00 u01 U02 / 0 υ11 u12^T / 0 0 U22),
    (xT / xB) ← (x0 / χ1 / x2), (yT / yB) ← (y0 / ψ1 / y2)
endwhile

y := Trmv_un_unb_var2(U, x, y)
Partition and repartition as in Trmv_un_unb_var1
while m(UTL) < m(U) do
  y0 := χ1 u01 + y0
  ψ1 := χ1 υ11 + ψ1
endwhile

Figure 1.20: The algorithms in Figure 1.19, modified to take advantage of the upper triangular structure of U: the updates involving the (zero) subdiagonal parts have been dropped. They compute y := Ux if y = 0 upon entry.
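The structural savings are easy to see in an indexed sketch: each row of the product only needs the entries on and to the right of the diagonal. A Python illustration of the upper triangular case (our own naming; as in the figure, with y = 0 on entry this computes y := Ux):

```python
def trmv_upper(U, x):
    # y_i = upsilon_{i,i} chi_i + ... + upsilon_{i,n-1} chi_{n-1}:
    # entries below the diagonal are zero and are never touched,
    # which roughly halves the flops compared with a general matvec.
    n = len(x)
    return [sum(U[i][j] * x[j] for j in range(i, n)) for i in range(n)]

U = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]
print(trmv_upper(U, [1, 1, 1]))  # [6, 9, 6]
```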
Cost. Let us analyze the algorithm in Figure 1.20 (left). The cost is in the update $\psi_1 := \upsilon_{11}\chi_1 + u_{12}^T x_2 + \psi_1$, which is typically computed in two steps: $\psi_1 := \upsilon_{11}\chi_1 + \psi_1$ followed by a dot product $\psi_1 := u_{12}^T x_2 + \psi_1$. During the $i$th iteration (starting with $i = 0$), $u_{12}$ and $x_2$ are of length $n - i - 1$, so that the cost of that iteration is 2 flops for the first step and $2(n - i - 1)$ flops for the second. Thus, if $A$ is an $n \times n$ matrix, then the total cost is given by
\[
\sum_{i=0}^{n-1} \left[2 + 2(n - i - 1)\right] = 2n + 2\sum_{i=0}^{n-1} i = 2n + 2\,\frac{n(n-1)}{2} = 2n + n^2 - n = n^2 + n \approx n^2 \text{ flops.}
\]
Exercise 1.71 Compute the cost, in flops, of the algorithm in Figure 1.20(right).
y := Symv_u_unb_var1(A, x, y)
Partition A → (ATL ATR / ABL ABR), x → (xT / xB), y → (yT / yB),
  where ATL is 0 × 0 and xT, yT are 0 × 1
while m(ATL) < m(A) do
  Repartition
    (ATL ATR / ABL ABR) → (A00 a01 A02 / a10^T α11 a12^T / A20 a21 A22),
    (xT / xB) → (x0 / χ1 / x2), (yT / yB) → (y0 / ψ1 / y2),
    where α11, χ1, and ψ1 are scalars
  ψ1 := a01^T x0 + α11 χ1 + a12^T x2 + ψ1     (a01 takes the place of a10, since a10 = a01)
  Continue with
    (ATL ATR / ABL ABR) ← (A00 a01 A02 / a10^T α11 a12^T / A20 a21 A22),
    (xT / xB) ← (x0 / χ1 / x2), (yT / yB) ← (y0 / ψ1 / y2)
endwhile

y := Symv_u_unb_var2(A, x, y)
Partition and repartition as in Symv_u_unb_var1
while m(ATL) < m(A) do
  y0 := χ1 a01 + y0
  ψ1 := χ1 α11 + ψ1
  y2 := χ1 a12 + y2     (a12 takes the place of a21, since a21 = a12)
endwhile

Figure 1.21: The algorithms in Figure 1.19, modified to compute y := Ax + y where A is symmetric and only the upper triangular part of A is stored.
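The trick in Figure 1.21 is that an entry $\alpha_{i,j}$ with $i > j$ is never read from memory; the mirrored entry $\alpha_{j,i}$ from the stored upper triangle is used instead. A Python sketch of this idea (our own naming, not the text's SLAP routines):

```python
def symv_upper(A, x):
    # y := A x where A is symmetric and only entries with i <= j are trusted.
    n = len(x)
    y = [0] * n
    for i in range(n):
        for j in range(n):
            # For i > j, read the mirrored entry A[j][i] from the upper triangle.
            a_ij = A[i][j] if i <= j else A[j][i]
            y[i] += a_ij * x[j]
    return y

# Only the upper triangle is meaningful; the -9s below it are never read:
A = [[1, 2, 3],
     [-9, 4, 5],
     [-9, -9, 6]]
print(symv_upper(A, [1, 1, 1]))  # [6, 11, 14]
```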
1.5.4 Symmetric matrix-vector multiplication

Theorem 1.72 Let $A$ be a symmetric matrix. Partition
\[
A \to \begin{pmatrix} A_{TL} & A_{TR} \\ A_{BL} & A_{BR} \end{pmatrix}
= \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & \alpha_{11} & a_{12}^T \\ A_{20} & a_{21} & A_{22} \end{pmatrix},
\]
where $A_{TL}$ and $A_{00}$ are square matrices. Then

- $A_{BL} = A_{TR}^T$, $a_{10} = a_{01}$, $A_{20} = A_{02}^T$, and $a_{21} = a_{12}$.
- $A_{TL}$ and $A_{BR}$ are symmetric matrices.

Again, we don't prove this, giving an example instead:
1.6 Representing Algorithms in Code
function [ y_out ] = Symv_u_unb_var1( A, x, y )
  [ ATL, ATR, ...
    ABL, ABR ] = FLA_Part_2x2( A, 0, 0, FLA_TL );
  [ xT, ...
    xB ] = FLA_Part_2x1( x, 0, FLA_TOP );
  [ yT, ...
    yB ] = FLA_Part_2x1( y, 0, FLA_TOP );
  while ( size( ATL, 1 ) < size( A, 1 ) )
    [ A00, a01, A02, ...
      a10t, alpha11, a12t, ...
      A20, a21, A22 ] = FLA_Repart_2x2_to_3x3( ATL, ATR, ...
                                               ABL, ABR, 1, 1, FLA_BR );
    [ x0, ...
      chi1, ...
      x2 ] = FLA_Repart_2x1_to_3x1( xT, xB, 1, FLA_BOTTOM );
    [ y0, ...
      psi1, ...
      y2 ] = FLA_Repart_2x1_to_3x1( yT, yB, 1, FLA_BOTTOM );
    %------------------------------------------------------------%
    psi1 = FLA_Dot( a01, x0 ) + psi1;
    psi1 = alpha11 * chi1 + psi1;
    psi1 = FLA_Dot( a12t, x2 ) + psi1;
    %------------------------------------------------------------%
    [ ATL, ATR, ...
      ABL, ABR ] = FLA_Cont_with_3x3_to_2x2( A00, a01, A02, ...
                                             a10t, alpha11, a12t, ...
                                             A20, a21, A22, FLA_TL );
    [ xT, ...
      xB ] = FLA_Cont_with_3x1_to_2x1( x0, chi1, x2, FLA_TOP );
    [ yT, ...
      yB ] = FLA_Cont_with_3x1_to_2x1( y0, psi1, y2, FLA_TOP );
  end
  y_out = [ yT
            yB ];
return

Figure 1.22: FLAME@lab code for the algorithm in Figure 1.21(left).
1.7 Outer Product and Rank-1 Update
Definition 1.75 (Outer product) Let $y \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$. Then the outer product of vectors $y$ and $x$ is given by
\[
yx^T = \begin{pmatrix}\psi_0\\\psi_1\\\vdots\\\psi_{m-1}\end{pmatrix}
\begin{pmatrix}\chi_0\\\chi_1\\\vdots\\\chi_{n-1}\end{pmatrix}^T
= \begin{pmatrix}\psi_0\\\psi_1\\\vdots\\\psi_{m-1}\end{pmatrix}
\begin{pmatrix}\chi_0 & \chi_1 & \cdots & \chi_{n-1}\end{pmatrix}
= \begin{pmatrix}
\psi_0\chi_0 & \psi_0\chi_1 & \cdots & \psi_0\chi_{n-1} \\
\psi_1\chi_0 & \psi_1\chi_1 & \cdots & \psi_1\chi_{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\psi_{m-1}\chi_0 & \psi_{m-1}\chi_1 & \cdots & \psi_{m-1}\chi_{n-1}
\end{pmatrix}.
\]
We note that $yx^T$ can be written in a number of equivalent ways:
\[
yx^T = \begin{pmatrix} \chi_0 y & \chi_1 y & \cdots & \chi_{n-1} y \end{pmatrix}
\quad\text{and}\quad
yx^T = \begin{pmatrix} \psi_0 x^T \\ \psi_1 x^T \\ \vdots \\ \psi_{m-1} x^T \end{pmatrix}.
\]
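A small Python sketch of the outer product (our own helper names), which also shows the "scaled columns" reading $yx^T = (\chi_0 y \;\; \chi_1 y \;\cdots)$:

```python
def outer(y, x):
    # Entry (i, j) of y x^T is psi_i * chi_j.
    return [[psi * chi for chi in x] for psi in y]

y = [1, 2, 3]  # in R^3
x = [4, 5]     # in R^2
print(outer(y, x))  # [[4, 5], [8, 10], [12, 15]]
# Column j of y x^T equals chi_j * y; here column 0 is 4 * y:
print([row[0] for row in outer(y, x)])  # [4, 8, 12]
```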
Definition 1.76 (Rank-1 update) Let $A \in \mathbb{R}^{m \times n}$, $y \in \mathbb{R}^m$, $x \in \mathbb{R}^n$, and $\alpha \in \mathbb{R}$. Then
\[
A + \alpha yx^T
= \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}
+ \alpha\begin{pmatrix}\psi_0\\\psi_1\\\vdots\\\psi_{m-1}\end{pmatrix}
\begin{pmatrix}\chi_0\\\chi_1\\\vdots\\\chi_{n-1}\end{pmatrix}^T
\]
\[
= \begin{pmatrix}
\alpha_{0,0} & \cdots & \alpha_{0,n-1} \\
\vdots & \ddots & \vdots \\
\alpha_{m-1,0} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}
+ \begin{pmatrix}
\alpha\psi_0\chi_0 & \cdots & \alpha\psi_0\chi_{n-1} \\
\vdots & \ddots & \vdots \\
\alpha\psi_{m-1}\chi_0 & \cdots & \alpha\psi_{m-1}\chi_{n-1}
\end{pmatrix}
= \begin{pmatrix}
\alpha_{0,0} + \alpha\psi_0\chi_0 & \cdots & \alpha_{0,n-1} + \alpha\psi_0\chi_{n-1} \\
\vdots & \ddots & \vdots \\
\alpha_{m-1,0} + \alpha\psi_{m-1}\chi_0 & \cdots & \alpha_{m-1,n-1} + \alpha\psi_{m-1}\chi_{n-1}
\end{pmatrix}.
\]
This operation is called a rank-1 update.
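Definition 1.76 in code: a rank-1 update overwrites each entry of $A$ with $\alpha_{i,j} + \alpha\,\psi_i\chi_j$. A Python sketch (names are ours, not the SLAP library's):

```python
def rank1_update(A, alpha, y, x):
    # A := A + alpha * y x^T, updating one entry at a time.
    for i in range(len(y)):
        for j in range(len(x)):
            A[i][j] += alpha * y[i] * x[j]
    return A

A = [[1, 1],
     [1, 1]]
print(rank1_update(A, 2, [1, 2], [3, 4]))  # [[7, 9], [13, 17]]
```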
1.8 A growing library
We now describe some additions to the SLAP library you are developing.
1.8.1 General matrix-vector multiplication (gemv)
In our library, we add three routines that together implement commonly encountered casesof matrix-vector multiplication.
SLAP Gemv n( alpha, A, x, y ) and SLAP Gemv t( alpha, A, x, y )

These routines respectively implement $z := \alpha Ax + y$ and $z := \alpha A^T x + y$, where $A \in \mathbb{R}^{m \times n}$.
Assume that matrix A is stored by columns. Then, pick the algorithm that favors
access by column, as discussed in class.
Implement the algorithm with calls to the SLAP Dot and/or SLAP Axpy routinesyou wrote.
You will want to be careful to add the multiplication by $\alpha$ at just the right place so as not to incur unnecessary computations or copies of vectors and/or matrices. For example, creating a temporary matrix $B = \alpha A$ is not a good idea. (Why?)
7/30/2019 Practical Linear Algebra
59/253
1.8. A growing library 51
SLAP Gemv( trans, alpha, A, x, beta, y )
Depending on whether trans equals the constant SLAP NO TRANSPOSE or SLAP TRANSPOSE, this routine computes $z := \alpha Ax + \beta y$ or $z := \alpha A^T x + \beta y$, respectively.
Implement this routine as a wrapper routine to the two routines given above: have it check the parameter trans, perform some minimal computation, and then call one of the other two routines defined above.

In this wrapper routine, you must accommodate the possibility that x and/or y are row vectors. Vectors z and y are either both row vectors or both column vectors.
Exercise 1.77 Follow the directions on the wiki to implement the above routines for com-puting the general matrix-vector multiply.
1.8.2 Triangular matrix-vector multiplication (trmv)
We add five routines that together implement all commonly encountered cases of triangularmatrix-vector multiplication.
SLAP Trmv un( diag, U, x ), SLAP Trmv ut( diag, U, x ), SLAP Trmv ln( diag, L, x ), SLAP Trmv lt( diag, L, x )

These routines respectively implement $z := Ux$, $z := U^T x$, $z := Lx$, and $z := L^T x$.
Some notes:
As always, assume that matrices U and L are stored by columns and pick thealgorithm that favors access by column, as discussed in class.
Implement the algorithm with calls to the SLAP Dot and/or SLAP Axpy routinesyou wrote.
The parameter diag allows one to pass in one of three values: SLAP NON UNIT if thevalues stored on the diagonal are to be used, SLAP UNIT if the values stored on thediagonal are to be ignored, and the computation assumes they are (implicitly) allequal to one, or SLAP ZERO if the values stored on the diagonal are to be ignored,
and the computation assumes they are (implicitly) all equal to zero (meaning thatthe matrix is strictly upper or lower triangular).
You may assume that x and z are column vectors.

SLAP Trmv( uplo, trans, diag, alpha, A, x )

Depending on whether trans equals the constant SLAP NO TRANSPOSE or SLAP TRANSPOSE, this routine computes either $z := \alpha Ax$ or $z := \alpha A^T x$.
If uplo equals SLAP UPPER TRIANGULAR then A is upper triangular (stored in the upper triangular part of array A), while if uplo equals SLAP LOWER TRIANGULAR it is lower triangular.
The parameter diag is passed on to one of the above four routines.
Vectors x and z are either both row vectors or both column vectors.
Exercise 1.78 Follow the directions on the wiki to implement the above routines for com-puting the triangular matrix-vector multiply.
1.8.3 Symmetric matrix-vector multiplication (symv)
We add three routines that together implement all commonly encountered cases of symmetricmatrix-vector multiplication.
SLAP Symv u( alpha, A, x, y ), SLAP Symv l( alpha, A, x, y )

These routines implement $z := \alpha Ax + y$.
Again, assume that matrix A is stored by columns and pick the algorithm thatfavors access by column.
Implement the algorithm with calls to the SLAP Dot and/or SLAP Axpy routinesyou wrote.
The u and l in the name of the routine indicate whether the matrix is stored
only in the upper or lower triangular part of array A, respectively.
You may assume that x, y, and z are all column vectors.

SLAP Symv( uplo, alpha, A, x, beta, y )
This routine is the wrapper to the above routines. It computes $z := \alpha Ax + \beta y$, where $A$ is symmetric.
Matrix A is stored only in the upper or the lower triangular part of array A, de-pending on whether uplo equals SLAP UPPER TRIANGULAR or SLAP LOWER TRIANGULAR,respectively.
Vector x can be a row or a column vector. Vectors y and z are either both row vectors or both column vectors.
Exercise 1.79 Follow the directions on the wiki to implement the above routines for com-puting the symmetric matrix-vector multiply.
1.9 Answers
Exercise 1.3 Work out the probabilities that it will be cloudy/rainy the day after tomorrow.
Exercise 1.5 Follow the instructions for this problem given on the class wiki. For the example described in this section,

1. Recreate the above table by programming it up with Matlab or Octave, starting with the assumption that today is cloudy.
2. Create two similar tables starting with the assumption that today is sunny and rainy, respectively.
3. Compare how $x^{(7)}$ differs depending on today's weather.
4. What do you notice if you compute $x^{(k)}$ starting with today being sunny/cloudy/rainy and you let $k$ get large?
5. What does $x^{(\infty)}$ represent?

Exercise 1.6 Given Table 1.1, create the following table, which predicts the weather the day after tomorrow given the weather today:

                          Today
                sunny   cloudy   rainy
Day after  sunny
Tomorrow   cloudy
           rainy

This then tells us the entries in $Q$ in (1.7).

Exercise 1.8 Let $x, y \in \mathbb{R}^n$. Show that vector addition commutes: $x + y = y + x$.

Exercise 1.9 Let $x, y \in \mathbb{R}^n$. Show that $x^T y = y^T x$.
Exercise 1.11 Start building your library by implementing the functions in Figure 1.6. (See directions on the class wiki page.)
Exercise 1.18 Use mathematical induction to prove that $\sum_{i=0}^{n-1} i^2 = (n-1)n(2n-1)/6$.

Answer:

Proof by induction:

Base case: $n = 1$. For this case, we must show that $\sum_{i=0}^{1-1} i^2 = (0)(1)(2 \cdot 1 - 1)/6$.
\[
\sum_{i=0}^{1-1} i^2 = 0 = (0)(1)(2 \cdot 1 - 1)/6.
\]
This proves the base case.

Inductive step: Inductive Hypothesis (IH): Assume that the result is true for $n = k$ where $k \geq 1$:
\[
\sum_{i=0}^{k-1} i^2 = (k-1)k(2k-1)/6.
\]
We will show that the result is then also true for $n = k + 1$:
\[
\sum_{i=0}^{(k+1)-1} i^2 = ((k+1)-1)(k+1)(2(k+1)-1)/6.
\]
Assume that $k \geq 1$. Then
\[
\sum_{i=0}^{(k+1)-1} i^2 = \sum_{i=0}^{k} i^2
= \sum_{i=0}^{k-1} i^2 + k^2
\quad\text{(split off the last term)}
\]
\[
= (k-1)k(2k-1)/6 + k^2
\quad\text{(Inductive Hypothesis)}
\]
\[
= ((2k^3 - 3k^2 + k) + 6k^2)/6
= (2k^3 + 3k^2 + k)/6
= k(k+1)(2k+1)/6
= ((k+1)-1)(k+1)(2(k+1)-1)/6.
\]
This proves the inductive step.

By the Principle of Mathematical Induction the result holds for all $n$.
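The closed form can also be spot-checked numerically; a two-line Python check (our own, just as a sanity test of the formula):

```python
# Verify sum_{i=0}^{n-1} i^2 == (n-1) n (2n-1) / 6 for a range of n:
for n in range(1, 50):
    assert sum(i * i for i in range(n)) == (n - 1) * n * (2 * n - 1) // 6
print("formula verified for n = 1, ..., 49")
```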
Exercise 1.24 For each of the following, determine whether it is a linear transformation or not:

- $F\left(\begin{pmatrix}\chi_0\\\chi_1\\\chi_2\end{pmatrix}\right) = \begin{pmatrix}\chi_0\\0\\\chi_2\end{pmatrix}$.
- $F\left(\begin{pmatrix}\chi_0\\\chi_1\end{pmatrix}\right) = \begin{pmatrix}\chi_0^2\\0\end{pmatrix}$.

Answer

- $F\left(\begin{pmatrix}\chi_0\\\chi_1\\\chi_2\end{pmatrix}\right) = \begin{pmatrix}\chi_0\\0\\\chi_2\end{pmatrix}$.

  First check if $F(0) = 0$: $F\left(\begin{pmatrix}0\\0\\0\end{pmatrix}\right) = \begin{pmatrix}0\\0\\0\end{pmatrix}$. So it COULD be a linear transformation.
  Next, check if $F(\alpha x) = \alpha F(x)$. Let $x = \begin{pmatrix}\chi_0\\\chi_1\\\chi_2\end{pmatrix}$ be an arbitrary vector and $\alpha$ an arbitrary scalar. Then
\[
F(\alpha x) = F\left(\begin{pmatrix}\alpha\chi_0\\\alpha\chi_1\\\alpha\chi_2\end{pmatrix}\right)
= \begin{pmatrix}\alpha\chi_0\\0\\\alpha\chi_2\end{pmatrix}
= \alpha\begin{pmatrix}\chi_0\\0\\\chi_2\end{pmatrix} = \alpha F(x).
\]
  Next, check if $F(x + y) = F(x) + F(y)$. Let $x = \begin{pmatrix}\chi_0\\\chi_1\\\chi_2\end{pmatrix}$ and $y = \begin{pmatrix}\psi_0\\\psi_1\\\psi_2\end{pmatrix}$ be arbitrary vectors. Then
\[
F(x + y) = F\left(\begin{pmatrix}\chi_0+\psi_0\\\chi_1+\psi_1\\\chi_2+\psi_2\end{pmatrix}\right)
= \begin{pmatrix}\chi_0+\psi_0\\0\\\chi_2+\psi_2\end{pmatrix}
= \begin{pmatrix}\chi_0\\0\\\chi_2\end{pmatrix} + \begin{pmatrix}\psi_0\\0\\\psi_2\end{pmatrix}
= F(x) + F(y).
\]
  Thus it is a linear transformation.
- $F\left(\begin{pmatrix}\chi_0\\\chi_1\end{pmatrix}\right) = \begin{pmatrix}\chi_0^2\\0\end{pmatrix}$.

  First check if $F(0) = 0$: $F\left(\begin{pmatrix}0\\0\end{pmatrix}\right) = \begin{pmatrix}0\\0\end{pmatrix}$. So it COULD be a linear transformation.

  Now, looking at it, I suspect this is not a linear transformation because of the $\chi_0^2$. So, I will try to construct an example where $F(\alpha x) \neq \alpha F(x)$ or $F(x+y) \neq F(x) + F(y)$. Let $x = \begin{pmatrix}1\\0\end{pmatrix}$ and $\alpha = 2$. Then
\[
F(\alpha x) = F\left(2\begin{pmatrix}1\\0\end{pmatrix}\right) = F\left(\begin{pmatrix}2\\0\end{pmatrix}\right) = \begin{pmatrix}4\\0\end{pmatrix}.
\]
  Also,
\[
2F(x) = 2F\left(\begin{pmatrix}1\\0\end{pmatrix}\right) = 2\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}2\\0\end{pmatrix}.
\]
  Thus, for this choice of $\alpha$ and $x$ we find that $F(\alpha x) \neq \alpha F(x)$.

  Thus it is not a linear transformation.
Exercise 1.26 Let \(x, e_i \in \mathbb{R}^n\). Show that \(e_i^T x = x^T e_i = \chi_i\) (the \(i\)th element of \(x\)).

Answer

Recall that \(e_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}\), where the 1 occurs in the \(i\)th entry, and \(x = \begin{pmatrix} \chi_0 \\ \vdots \\ \chi_{i-1} \\ \chi_i \\ \chi_{i+1} \\ \vdots \\ \chi_{n-1} \end{pmatrix}\). Then
\[
e_i^T x = 0 \cdot \chi_0 + \cdots + 0 \cdot \chi_{i-1} + 1 \cdot \chi_i + 0 \cdot \chi_{i+1} + \cdots + 0 \cdot \chi_{n-1} = \chi_i.
\]
The same computation, with the factors reversed, shows that \(x^T e_i = \chi_i\).
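A brief numerical illustration of this fact (not from the notes): multiplying by \(e_i\) on either side picks out the \(i\)th element.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

n, i = 5, 2
e_i = [1.0 if j == i else 0.0 for j in range(n)]   # unit basis vector e_2
x = [10.0, 20.0, 30.0, 40.0, 50.0]
assert dot(e_i, x) == x[i] == dot(x, e_i)          # all equal chi_2 = 30.0
print(dot(e_i, x))                                 # prints 30.0
```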
Exercise 1.34 Show that the transformation in Example 1.21 is not a linear transformation by computing a possible matrix that represents it, and then showing that it does not represent it.

Answer

Let \(F\left( \begin{pmatrix} \chi \\ \psi \end{pmatrix} \right) = \begin{pmatrix} \chi + \psi \\ \chi + 1 \end{pmatrix}\). If \(F\) were a linear transformation, then there would be a corresponding matrix, \(A = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\ \alpha_{10} & \alpha_{11} \end{pmatrix}\), such that
\[
F\left( \begin{pmatrix} \chi \\ \psi \end{pmatrix} \right) = A \begin{pmatrix} \chi \\ \psi \end{pmatrix}.
\]
This matrix would be computed by computing its columns:
\[
\begin{pmatrix} \alpha_{00} \\ \alpha_{10} \end{pmatrix} = F\left( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 1 + 0 \\ 1 + 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}
\quad\mbox{and}\quad
\begin{pmatrix} \alpha_{01} \\ \alpha_{11} \end{pmatrix} = F\left( \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} 0 + 1 \\ 0 + 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\]
so that \(A = \begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix}\). But
\[
\begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} \chi \\ \psi \end{pmatrix} = \begin{pmatrix} \chi + \psi \\ 2\chi + \psi \end{pmatrix} \neq \begin{pmatrix} \chi + \psi \\ \chi + 1 \end{pmatrix}.
\]
Thus \(F\) cannot be a linear transformation. (There is no matrix that has the same action as \(F\).)
Exercise 1.35 Show that the transformation in Example 1.23 is not a linear transformation by computing a possible matrix that represents it, and then showing that it does not represent it.

Answer

Let \(F\left( \begin{pmatrix} \chi \\ \psi \end{pmatrix} \right) = \begin{pmatrix} \chi \psi \\ \chi \end{pmatrix}\). If \(F\) were a linear transformation, then there would be a corresponding matrix, \(A = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\ \alpha_{10} & \alpha_{11} \end{pmatrix}\), such that
\[
F\left( \begin{pmatrix} \chi \\ \psi \end{pmatrix} \right) = A \begin{pmatrix} \chi \\ \psi \end{pmatrix}.
\]
This matrix would be computed by computing its columns:
\[
\begin{pmatrix} \alpha_{00} \\ \alpha_{10} \end{pmatrix} = F\left( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\quad\mbox{and}\quad
\begin{pmatrix} \alpha_{01} \\ \alpha_{11} \end{pmatrix} = F\left( \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\]
so that \(A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\). But
\[
\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \chi \\ \psi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix} \neq F\left( \begin{pmatrix} \chi \\ \psi \end{pmatrix} \right).
\]
Thus \(F\) cannot be a linear transformation. (There is no matrix that has the same action as \(F\).)
Exercise 1.36 For each of the transformations in Exercise 1.24 compute a possible matrix that represents it and use it to show whether the transformation is linear.

Answer

Let \(F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \right) = \begin{pmatrix} \chi_0 \\ 0 \\ \chi_2 \end{pmatrix}\). If \(F\) were a linear transformation, then there would be a corresponding matrix, \(A = \begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} \\ \alpha_{10} & \alpha_{11} & \alpha_{12} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} \end{pmatrix}\), such that
\[
F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \right) = \begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} \\ \alpha_{10} & \alpha_{11} & \alpha_{12} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} \end{pmatrix} \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix}.
\]
This matrix would be computed by computing its columns:
\[
\begin{pmatrix} \alpha_{00} \\ \alpha_{10} \\ \alpha_{20} \end{pmatrix} = F\left( \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} \alpha_{01} \\ \alpha_{11} \\ \alpha_{21} \end{pmatrix} = F\left( \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\quad\mbox{and}\quad
\begin{pmatrix} \alpha_{02} \\ \alpha_{12} \\ \alpha_{22} \end{pmatrix} = F\left( \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},
\]
so that \(\begin{pmatrix} \alpha_{00} & \alpha_{01} & \alpha_{02} \\ \alpha_{10} & \alpha_{11} & \alpha_{12} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\). Checking now:
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} = \begin{pmatrix} \chi_0 \\ 0 \\ \chi_2 \end{pmatrix} = F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \right).
\]
Thus there is a matrix that corresponds to \(F\), which is therefore a linear transformation.
Let \(F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix} \right) = \begin{pmatrix} \chi_0^2 \\ 0 \end{pmatrix}\). If \(F\) were a linear transformation, then there would be a corresponding matrix, \(A = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\ \alpha_{10} & \alpha_{11} \end{pmatrix}\), such that
\[
F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix} \right) = \begin{pmatrix} \alpha_{00} & \alpha_{01} \\ \alpha_{10} & \alpha_{11} \end{pmatrix} \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}.
\]
This matrix would be computed by computing its columns:
\[
\begin{pmatrix} \alpha_{00} \\ \alpha_{10} \end{pmatrix} = F\left( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad\mbox{and}\quad
\begin{pmatrix} \alpha_{01} \\ \alpha_{11} \end{pmatrix} = F\left( \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\]
so that \(\begin{pmatrix} \alpha_{00} & \alpha_{01} \\ \alpha_{10} & \alpha_{11} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\). Checking now:
\[
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix} = \begin{pmatrix} \chi_0 \\ 0 \end{pmatrix} \neq F\left( \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix} \right).
\]
Thus it is not a linear transformation.
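The procedure used in Exercises 1.34-1.36 (build a candidate matrix from the action of F on the unit basis vectors, then test whether that matrix reproduces F) can be sketched in Python. This is an illustration only; the function names are ours, not from the notes:

```python
def possible_matrix(F, n):
    """The candidate matrix for F: its columns are F(e_0), ..., F(e_{n-1})."""
    cols = [F([1.0 if j == i else 0.0 for j in range(n)]) for i in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]  # m x n, row-major

def matvec(A, x):
    """Compute A x for a row-major matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

F = lambda x: [x[0], 0.0, x[2]]     # first map from Exercise 1.24
A = possible_matrix(F, 3)
assert A == [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
assert matvec(A, [3.0, 7.0, -2.0]) == F([3.0, 7.0, -2.0])  # A reproduces F

G = lambda x: [x[0] ** 2, 0.0]      # second map from Exercise 1.24
B = possible_matrix(G, 2)           # B = [[1, 0], [0, 0]]
assert matvec(B, [2.0, 0.0]) != G([2.0, 0.0])  # (2, 0) vs (4, 0): not linear
```

Passing the check for one (or many) inputs does not by itself prove linearity, but failing it for a single input proves nonlinearity, which is exactly how the answer above uses the candidate matrix.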
Exercise 1.39 Let \(D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}\). What linear transformation, \(L\), does this matrix represent? In particular, answer the following questions:
- \(L : \mathbb{R}^? \rightarrow \mathbb{R}^{??}\). Give ? and ??.
- A linear transformation can be described by how it transforms the unit basis vectors: \(L(e_0) = ?\), \(L(e_1) = ?\), \(L(e_2) = ?\).
- \(L\left( \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \right) = ?\)

Answer
- \(L : \mathbb{R}^3 \rightarrow \mathbb{R}^3\).
- A linear transformation can be described by how it transforms the unit basis vectors:
\[
L(e_0) = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix}, \quad
L(e_1) = \begin{pmatrix} 0 \\ 3 \\ 0 \end{pmatrix}, \quad
L(e_2) = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
\]
-
\[
L\left( \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} \right)
= L(\chi_0 e_0 + \chi_1 e_1 + \chi_2 e_2)
= \chi_0 L(e_0) + \chi_1 L(e_1) + \chi_2 L(e_2)
= \chi_0 \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} + \chi_1 \begin{pmatrix} 0 \\ 3 \\ 0 \end{pmatrix} + \chi_2 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 2\chi_0 \\ 3\chi_1 \\ \chi_2 \end{pmatrix}.
\]

Thus, the elements of the input vector are scaled by the corresponding elements of the diagonal of the matrix.
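Because of this, a diagonal matrix never needs to be applied as a full matrix-vector multiply; elementwise scaling suffices. A minimal Python sketch of this observation (not from the notes):

```python
def diag_apply(d, x):
    """Compute D x where D = diag(d), without forming the full matrix."""
    return [di * xi for di, xi in zip(d, x)]

d = [2.0, 3.0, 1.0]          # the diagonal of D from Exercise 1.39
x = [1.0, -1.0, 4.0]
print(diag_apply(d, x))      # prints [2.0, -3.0, 4.0]
```

This costs n multiplications instead of the roughly 2n^2 flops of a general matrix-vector multiply.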
Exercise 1.42 Give examples for each of the triangular matrices in Definition 1.41.

Answer
- lower triangular (\(\alpha_{i,j} = 0\) for all \(i < j\)): \(\begin{pmatrix} 1 & 0 \\ 2 & 3 \end{pmatrix}\)
- strictly lower triangular (\(\alpha_{i,j} = 0\) for all \(i \leq j\)): \(\begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}\)
- unit lower triangular (\(\alpha_{i,j} = 0\) for all \(i < j\) and \(\alpha_{i,j} = 1\) if \(i = j\)): \(\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}\)
- upper triangular (\(\alpha_{i,j} = 0\) for all \(i > j\)): \(\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}\)
- strictly upper triangular (\(\alpha_{i,j} = 0\) for all \(i \geq j\)): \(\begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}\)
- unit upper triangular (\(\alpha_{i,j} = 0\) for all \(i > j\) and \(\alpha_{i,j} = 1\) if \(i = j\)): \(\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\)
Exercise 1.43 Show that a matrix that is both lower and upper triangular is in fact a diagonal matrix.

Answer
If a matrix is both lower and upper triangular, then \(\alpha_{i,j} = 0\) whenever \(i \neq j\). Thus the matrix is diagonal.
Exercise 1.44 Add the functions

    trilu( A )

and

    triuu( A )

to your SLAP library. These functions return the lower and upper triangular part of A, respectively, with the diagonal set to ones. Thus,

    > A = [
    1 -2 1
    -1 2 0
    2 3 3
    ];
    > trilu( A )
    ans =
    1 0 0
    -1 1 0
    2 3 1

Hint: use the tril() and eye() functions. You will also want to use the size() function to extract the dimensions of A, to pass in to eye().
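One way to realize trilu, offered as a sketch rather than the official SLAP answer: combine the strictly lower triangular part with an identity diagonal. Following the hint, in Octave this is essentially `tril(A, -1) + eye(size(A))`. Below is the same idea in plain Python so it can be run standalone:

```python
def trilu(A):
    """Lower triangular part of A (a list of rows) with the diagonal set to ones.
    Equivalent in spirit to the Octave hint: tril(A, -1) + eye(size(A))."""
    m, n = len(A), len(A[0])
    return [[1.0 if i == j else (A[i][j] if i > j else 0.0) for j in range(n)]
            for i in range(m)]

A = [[1, -2, 1],
     [-1, 2, 0],
     [2,  3, 3]]
for row in trilu(A):
    print(row)
```

Running this on the matrix from the exercise reproduces the expected answer: rows (1, 0, 0), (-1, 1, 0), (2, 3, 1). The companion triuu would mirror this with the strictly upper part.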
Algorithm: y := Ltrmvmult_unb_var1b(A, x, y)
  Partition A → (ATL, ATR; ABL, ABR), x → (xT; xB), y → (yT; yB)
    where ATL is 0 × 0 and xT, yT are 0 × 1
  while m(ATL) < m(A) do
    Repartition
      (ATL, ATR; ABL, ABR) → (A00, a01, A02; a10^T, α11, a12^T; A20, a21, A22),
      (xT; xB) → (x0; χ1; x2),  (yT; yB) → (y0; ψ1; y2)
      where α11, χ1, and ψ1 are scalars

    ψ1 := a10^T x0 + α11 χ1 + ψ1        (the term a12^T x2 = 0 is skipped)

    Continue with
      (ATL, ATR; ABL, ABR) ← (A00, a01, A02; a10^T, α11, a12^T; A20, a21, A22),
      (xT; xB) ← (x0; χ1; x2),  (yT; yB) ← (y0; ψ1; y2)
  endwhile

Algorithm: y := Ltrmvmult_unb_var2b(A, x, y)
  Partition A → (ATL, ATR; ABL, ABR), x → (xT; xB), y → (yT; yB)
    where ATL is 0 × 0 and xT, yT are 0 × 1
  while m(ATL) < m(A) do
    Repartition (as in variant 1)
      where α11, χ1, and ψ1 are scalars

    y0 := χ1 a01 + y0 is skipped         (since a01 = 0)
    ψ1 := χ1 α11 + ψ1
    y2 := χ1 a21 + y2

    Continue with (as in variant 1)
  endwhile

Figure 1.23: Answer to Exercise 1.67. Compare and contrast to Figure 1.19. Executing either of these algorithms with A = L will compute y := Lx + y where L is lower triangular. Notice that the algorithms would become even clearer if {A, a, α} were replaced by {L, l, λ}.
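In conventional loop notation, the two variants of Figure 1.23 can be sketched as follows (a Python illustration, not from the notes): variant 1 computes each ψ_i with a dot product against a row of L, variant 2 sweeps columns with axpy-like updates, and both skip the zeroes above the diagonal.

```python
def ltrmv_var1(L, x, y):
    """y := L x + y with L lower triangular; variant 1 (dot products with rows)."""
    n = len(x)
    for i in range(n):
        # psi_i := L[i, 0:i+1] . x[0:i+1] + psi_i; entries j > i of row i are zero.
        y[i] += sum(L[i][j] * x[j] for j in range(i + 1))
    return y

def ltrmv_var2(L, x, y):
    """y := L x + y with L lower triangular; variant 2 (column sweeps)."""
    n = len(x)
    for j in range(n):
        # y[j:] := chi_j * L[j:, j] + y[j:]; entries i < j of column j are zero.
        for i in range(j, n):
            y[i] += x[j] * L[i][j]
    return y

L = [[1.0, 0.0], [2.0, 3.0]]
assert ltrmv_var1(L, [1.0, 1.0], [0.0, 0.0]) == [1.0, 5.0]
assert ltrmv_var2(L, [1.0, 1.0], [0.0, 0.0]) == [1.0, 5.0]
```

Both variants perform the same flops; they differ only in the order in which the nonzero part of L is traversed.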
Exercise 1.68 Modify the algorithms in Figure 1.20 so that x is overwritten with x := Ux, without using the vector y.

Answer

The answer is in Figure 1.24. It computes x := Ux.

Some explanation: Let us first consider y := Ux + y. If we partition the matrix and vectors we get
\[
\begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix} :=
\begin{pmatrix} U_{00} & u_{01} & U_{02} \\ 0 & \alpha_{11} & u_{12}^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}
+
\begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix}
=
\begin{pmatrix} U_{00} x_0 + u_{01} \chi_1 + U_{02} x_2 + y_0 \\ \alpha_{11} \chi_1 + u_{12}^T x_2 + \psi_1 \\ U_{22} x_2 + y_2 \end{pmatrix}.
\]
Notice that the algorithm in Figure 1.20 (left) has the property that when the current iteration starts, \(y_0\) already contains its final result, \(U_{00} x_0 + u_{01} \chi_1 + U_{02} x_2 + y_0\), while \(\psi_1\) and \(y_2\) have not yet been touched. The update
\[
\psi_1 := \alpha_{11} \chi_1 + u_{12}^T x_2 + \psi_1
\]
then makes it so that one more element of y has been updated. Now, turn to the computation x := Ux:
\[
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} :=
\begin{pmatrix} U_{00} x_0 + u_{01} \chi_1 + U_{02} x_2 \\ \alpha_{11} \chi_1 + u_{12}^T x_2 \\ U_{22} x_2 \end{pmatrix}.
\]
Notice that to update one more element of x we must compute
\[
\chi_1 := \alpha_{11} \chi_1 + u_{12}^T x_2,
\]
which justifies the algorithm in Figure 1.24 (left).

Now, let's turn to the algorithm in Figure 1.20 (right). That algorithm has the property that when the current iteration starts, y contains the contributions of the columns that have already been visited:
\[
\begin{pmatrix} y_0 \\ \psi_1 \\ y_2 \end{pmatrix}
\mbox{ currently holds }
\begin{pmatrix} U_{00} x_0 + y_0 \\ \psi_1 \\ y_2 \end{pmatrix}.
\]
The updates
\[
y_0 := \chi_1 u_{01} + y_0, \qquad \psi_1 := \alpha_{11} \chi_1 + \psi_1
\]
then make it so that y also contains the contribution of the current column:
\[
\begin{pmatrix} U_{00} x_0 + u_{01} \chi_1 + y_0 \\ \alpha_{11} \chi_1 + \psi_1 \\ y_2 \end{pmatrix}.
\]
Now, turn to the computation x := Ux:
\[
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} :=
\begin{pmatrix} U_{00} x_0 + u_{01} \chi_1 + U_{02} x_2 \\ \alpha_{11} \chi_1 + u_{12}^T x_2 \\ U_{22} x_2 \end{pmatrix}.
\]
Notice that the computations
\[
x_0 := \chi_1 u_{01} + x_0, \qquad \chi_1 := \alpha_{11} \chi_1
\]
Algorithm: x := Trmv_un_unb_var1b(U, x)
  Partition U → (UTL, UTR; 0, UBR), x → (xT; xB)
    where UTL is 0 × 0 and xT is 0 × 1
  while m(UTL) < m(U) do
    Repartition
      (UTL, UTR; 0, UBR) → (U00, u01, U02; 0, α11, u12^T; 0, 0, U22),
      (xT; xB) → (x0; χ1; x2)
      where α11 and χ1 are scalars

    χ1 := α11 χ1 + u12^T x2

    Continue with
      (UTL, UTR; 0, UBR) ← (U00, u01, U02; 0, α11, u12^T; 0, 0, U22),
      (xT; xB) ← (x0; χ1; x2)
  endwhile

Algorithm: x := Trmv_un_unb_var2(U, x)
  Partition U → (UTL, UTR; 0, UBR), x → (xT; xB)
    where UTL is 0 × 0 and xT is 0 × 1
  while m(UTL) < m(U) do
    Repartition (as in variant 1)
      where α11 and χ1 are scalars

    x0 := χ1 u01 + x0
    χ1 := α11 χ1

    Continue with (as in variant 1)
  endwhile

Figure 1.24: Answer to Exercise 1.68.
make it so that the contribution of the current column has also been accumulated into x:
\[
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} :=
\begin{pmatrix} U_{00} x_0 + u_{01} \chi_1 + U_{02} x_2 \\ \alpha_{11} \chi_1 + u_{12}^T x_2 \\ U_{22} x_2 \end{pmatrix},
\]
which justifies the algorithm in Figure 1.24 (right).
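The in-place algorithm of Figure 1.24 (left) reads as follows in conventional loop notation (a Python sketch, not from the notes). Traversing from the top ensures that when χ_i is overwritten, the elements x_{i+1}, ..., x_{n-1} that it needs still hold their original values:

```python
def trmv_upper_inplace(U, x):
    """Overwrite x with U x, U upper triangular, using no extra vector y."""
    n = len(x)
    for i in range(n):
        # chi_i := alpha_{ii} chi_i + U[i, i+1:] . x[i+1:];
        # x[i+1:] has not been overwritten yet, so original values are used.
        x[i] = sum(U[i][j] * x[j] for j in range(i, n))
    return x

U = [[1.0, 2.0], [0.0, 3.0]]
assert trmv_upper_inplace(U, [1.0, 1.0]) == [3.0, 3.0]   # U x = (1+2, 3)
```

This is exactly why the lower triangular case discussed in the next exercise cannot reuse the same traversal order.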
Exercise 1.69 Reason why the algorithm you developed for Exercise 1.67 cannot be trivially changed so that x := Lx without requiring y. What is the solution?

Answer
Let us focus on the algorithm on the left in Figure 1.23. Consider the update
\[
\psi_1 := a_{10}^T x_0 + \alpha_{11} \chi_1 + \psi_1
\]
(in which the term \(a_{12}^T x_2 = 0\) is skipped) and reason about what would happen if we blindly changed this to
\[
\chi_1 := a_{10}^T x_0 + \alpha_{11} \chi_1 .
\]
Then x_0 on the right of the := refers