In Chapter 3 our problem was to solve $f(x) = 0$ where $f:R \rightarrow R$. Of course, this also handles problems of the type $f(x) = g(x)$ and $f(x) = b$.
In Chapter 4 we begin with linear systems of equations:
$$~ \begin{align} a_{11} x_1 + a_{12}x_2 + \cdots + a_{1n} x_n &= b_1\\ a_{21} x_1 + a_{22}x_2 + \cdots + a_{2n} x_n &= b_2\\ &\vdots\\ a_{m1} x_1 + a_{m2}x_2 + \cdots + a_{mn} x_n &= b_m \end{align} ~$$A solution is a set of values $x_1, x_2, \dots, x_n$ so that each equation is true simultaneously.
We see three sets of numbers: the $a$s, the $b$s and the $x$s. We represent these as matrices:
$$~ A = \left( \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ & \vdots & & \\ a_{m1} & a_{m2} & \cdots & a_{mn}\\ \end{array} \right) ~$$$$~ x = \left( \begin{array}{c} x_1\\ x_2\\ \vdots\\ x_n \end{array} \right) ~$$$$~ b = \left( \begin{array}{c} b_1\\ b_2\\ \vdots\\ b_m \end{array} \right) ~$$This is because matrices have their own, nearly familiar, algebra that allows us to represent the system of equations as a single equation:
$$~ Ax = b. ~$$Here are two main problems in linear algebra that repeat themselves all the time: solving $Ax = b$ for $x$, and deciding whether a square matrix $A$ has an inverse.
Consider the system of equations:
$$~ \begin{align} 1x_1 + 2x_2 + 3x_3 &= 4\\ 5x_1 + 6x_2 + 7x_3 &= 8\\ 9x_1 + 10x_2 + 11x_3 &= 12 \end{align} ~$$Then we can do the following:
```julia
A = [1 2 3; 5 6 7; 9 10 11]
b = [4, 8, 12]
A
```

```
3x3 Array{Int64,2}:
 1   2   3
 5   6   7
 9  10  11
```
and
```julia
b
```

```
3-element Array{Int64,1}:
  4
  8
 12
```
To put in the variables, we can use symbolic math:
```julia
using SymPy
x = [symbols("x$i") for i in 1:3]
```

```
3-element Array{Any,1}:
 x1
 x2
 x3
```
And from here:
```julia
A*x - b
```

```
3-element Array{Any,1}:
 x1 + 2*x2 + 3*x3 - 4
 5*x1 + 6*x2 + 7*x3 - 8
 9*x1 + 10*x2 + 11*x3 - 12
```
We saw $A$ has $m$ rows and $n$ columns.
Here we can view $x$ and $b$ as vectors or as matrices with 1 column. This is an identification; they are not mathematically the same thing.
A basic matrix in Julia is stored in contiguous memory, so it has an order (column major). Here we can see what it is by looking at the first 4 elements:
```julia
A[1:4]
```

```
4-element Array{Int64,1}:
 1
 5
 9
 2
```
A vector, on the other hand, is not stored as a matrix in this sense:
```julia
isa(b, Matrix)
```

```
false
```
This is because `b` has only one dimension, whereas a matrix has two. This is different from the "row vector" and "column vector" below:
```julia
A[1, :]
```

```
1x3 Array{Int64,2}:
 1  2  3
```
```julia
A[:, 1:1]
```

```
3x1 Array{Int64,2}:
 1
 5
 9
```
But note that the following is similar, yet different. How?
```julia
A[:, 1]
```

```
3-element Array{Int64,1}:
 1
 5
 9
```
We can add and subtract matrices of the same size. The arithmetic is done entry by entry; for subtraction, the result is the matrix $(a_{ij} - b_{ij})$.
```julia
A = [1 2 3; 4 5 6; 7 8 9]
B = [1 4 7; 2 5 8; 3 6 9]
A
```

```
3x3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9
```
and
```julia
B
```

```
3x3 Array{Int64,2}:
 1  4  7
 2  5  8
 3  6  9
```
And here we have:
```julia
A - B
```

```
3x3 Array{Int64,2}:
 0  -2  -4
 2   0  -2
 4   2   0
```
```julia
B - A
```

```
3x3 Array{Int64,2}:
  0   2   4
 -2   0   2
 -4  -2   0
```
Multiplication is possible for matrices, but the size constraint is different: $AB$ is defined if the number of columns of $A$ matches the number of rows of $B$. That is, if $A$ is $m \times n$, then $B$ must be $n \times p$ for some $p$; the product is then $m \times p$.
The definition is $(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$.
```julia
A * B
```

```
3x3 Array{Int64,2}:
 14   32   50
 32   77  122
 50  122  194
```
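To see that the built-in `*` really computes this sum, here is a minimal sketch of the definition (the name `matmul` is our own, not a built-in, and it assumes integer entries, as in our examples):

```julia
# Compute (AB)_ij = a_i1*b_1j + a_i2*b_2j + ... + a_in*b_nj directly.
# `matmul` is our own helper; zeros(Int, ...) assumes integer entries.
function matmul(A, B)
    m, n = size(A)
    q, p = size(B)
    n == q || error("columns of A must match rows of B")
    C = zeros(Int, m, p)
    for i in 1:m, j in 1:p
        for k in 1:n
            C[i, j] += A[i, k] * B[k, j]
        end
    end
    C
end

matmul(A, B) == A * B   # should be true
```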
If we think of $A = [a_1; a_2; \cdots; a_m]$ as comprised of row vectors and $B = [b_1 b_2 \cdots b_p]$ as comprised of column vectors, then the $ij$ term is the product $a_i \cdot b_j$. If we identify these row and column vectors as plain vectors, this is the dot product.
```julia
A[1, :] * B[:, 2]
```

```
1-element Array{Int64,1}:
 32
```
```julia
(A*B)[1, 2]
```

```
32
```
Though matrix multiplication is defined and uses the same notation as multiplication of numbers, it lacks two key properties: it is not commutative, and a product of two nonzero matrices can be the zero matrix. For the first:
```julia
A*B - B*A
```

```
3x3 Array{Int64,2}:
 -52  -46  -40
 -46  -16   14
 -40   14   68
```
For the second, consider this matrix:

```julia
C = [1 0 0; -2 0 0; 1 0 0]
```

```
3x3 Array{Int64,2}:
  1  0  0
 -2  0  0
  1  0  0
```
So $C$ is non-zero, $A$ is non-zero, and yet:
```julia
A*C
```

```
3x3 Array{Int64,2}:
 0  0  0
 0  0  0
 0  0  0
```
We don't have "division" defined for matrices. For some square matrices there is an analog, the inverse $A^{-1}$, for which $AA^{-1} = I$, the identity matrix with ones on the diagonal and zeros elsewhere. This is sort of like $a \cdot (1/a) = 1$ when $a \neq 0$.
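Julia's `inv` function computes this inverse when it exists. A quick sketch, using a small invertible matrix of our own choosing:

```julia
M = [2 1; 1 2]   # invertible, since det(M) = 3 is nonzero
Minv = inv(M)
M * Minv         # approximately the 2x2 identity matrix
```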
There is a built-in operator like division, `\`. The expression `A \ b` will be a solution, `x`, to `Ax = b`:
```julia
A * (A \ b)   # should be b = [4, 8, 12]
```

```
3-element Array{Float64,1}:
  4.0
  8.0
 12.0
```
The transpose of the matrix $A=(a_{ij})_{1 \leq i \leq m, 1 \leq j \leq n}$ is found by swapping rows and columns:
$$~ A^T = (a_{ji})_{1 \leq i \leq n, 1 \leq j \leq m} ~$$On the computer we can see this easily, using `'` for transpose.
Here is $A$:
```julia
A
```

```
3x3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9
```
and its transpose
```julia
A'
```

```
3x3 Array{Int64,2}:
 1  4  7
 2  5  8
 3  6  9
```
The transpose is an involution – meaning do it twice and you haven't changed anything:
```julia
(A')'
```

```
3x3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9
```
A matrix is symmetric if $A^T = A$. This means $A$ needs to be square and have symmetry along its main diagonal.
Our example `A` is not symmetric:
```julia
A' - A
```

```
3x3 Array{Int64,2}:
  0   2   4
 -2   0   2
 -4  -2   0
```
However, $A^TA$ is always symmetric, since $(A^TA)^T = A^T(A^T)^T = A^TA$:
```julia
B = A' * A
```

```
3x3 Array{Int64,2}:
 66   78   90
 78   93  108
 90  108  126
```
We can see this visually, or confirm it as follows:
```julia
B' - B
```

```
3x3 Array{Int64,2}:
 0  0  0
 0  0  0
 0  0  0
```
A matrix $A$ is positive definite if for every nonzero vector $x$, $x^T A x > 0$.
The example in the book is
```julia
A = [2 1; 1 2]
```

```
2x2 Array{Int64,2}:
 2  1
 1  2
```
For which $x^T A x = (x_1 + x_2)^2 + x_1^2 + x_2^2 > 0$ (if $x \neq 0$).
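We can confirm this algebra with SymPy, loaded earlier. A quick sketch; forming $x^TAx$ as `sum(xvec .* (A*xvec))` is one way to handle symbolic entries:

```julia
# Expand x'Ax symbolically; it should equal 2x1^2 + 2x1*x2 + 2x2^2,
# which is the same as (x1 + x2)^2 + x1^2 + x2^2.
x1, x2 = symbols("x1 x2")
xvec = [x1, x2]
expand(sum(xvec .* (A*xvec)))
```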
Let $I$ be an $n \times n$ identity matrix, that is, $I$ is all $0$s with $1$s on the diagonal.
Then $AI = A$ and $IA = A$, when defined. For example:
$$~ (AI)_{ij} = a_{i1} I_{1j} + \cdots + a_{ij} I_{jj} + \cdots + a_{in} I_{nj} = a_{i1} \cdot 0 + \cdots + a_{ij} \cdot 1 + \cdots + a_{in} \cdot 0 = a_{ij} ~$$Let $A$ be given (and $m \times n$). Then $B$ is a right inverse if $AB = I$.
Thm: A square matrix, $A$, can possess at most one right inverse.
Proof: This is a linear algebra proof that rests on one fact: the representation of a vector in terms of a basis is unique.
Suppose $B$ is a right inverse. We aim to show $B$ is unique. The product satisfies
$$~ (AB)_{:k} = \sum_l b_{lk} A_{:l}, ~$$where $A_{:k}$ denotes the $k$th column vector of the matrix $A$. Since $AB = I$:
$$~ I_{:k} = \sum_l b_{lk}A_{:l} ~$$So the $k$th column vector of $I$ is a linear combination of the column vectors of $A$. Since this holds for every $k$, the columns of $A$ span the same space as the columns of $I$, which is $R^n$. So the $n$ columns of $A$ form a basis for $R^n$, and so the values $b_{lk}$ are uniquely defined.
Theorem: If $A$ and $B$ are square matrices, then a right inverse is a left inverse.
Suppose $B$ is a right inverse ($AB=I$). Set $C = BA - I + B$ and note: $AC = ABA - A + AB = (AB)A - A + I = A - A + I = I$. So $C$ is also a right inverse, and by the previous theorem $C = B$. But then $BA - I + B = B$, and consequently $BA = I$ as well.
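A numeric sketch of these ideas, using a small invertible matrix of our own choosing: each column of the inverse can be found by solving $M b_k = e_k$, and the resulting right inverse is also a left inverse:

```julia
# Build a right inverse column by column: column k solves M * b_k = e_k.
M = [2 1; 1 2]
B = hcat(M \ [1, 0], M \ [0, 1])
M * B    # approximately the 2x2 identity: B is a right inverse
B * M    # also approximately the identity: B is a left inverse too
```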
So, if a square $n \times n$ matrix $A$ has a right inverse, it has a unique inverse, and $A$ is called invertible. When it exists, the inverse is written $A^{-1}$.
We return to systems of equations. Writing $E_i$ for the $i$th equation, we know a few basic facts: the solutions are unchanged if we

* exchange two equations, $E_i$ and $E_j$;
* multiply an equation by a nonzero constant, $E_i' = \lambda E_i$ with $\lambda \neq 0$;
* replace an equation by itself plus a multiple of another, $E_i' = E_i + \lambda E_j$.
The last one is the only tricky one. If we use $E_i' = E_i + E_j$ instead, do we get a difference?
Well, if a set of values solves both $E_i$ and $E_j$, then it also solves $E_i'$, as $0 + 0$ is still $0$. For the reverse, form $E_j' = -E_j$ and add it to $E_i'$ to get $E_i'' = E_i' + E_j' = E_i + E_j - E_j = E_i$, and so the original set of equations will have the same solutions if we replace $E_i$ as specified.
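We can confirm this invariance numerically. A small sketch, with a hypothetical invertible system of our own choosing: replacing the second equation by itself plus the third does not change the solution.

```julia
# E is the elementary matrix that adds row 3 to row 2; applying it to
# both sides of Mx = y should leave the solution unchanged.
M = [2 1 0; 1 2 1; 0 1 2]
y = [1.0, 2.0, 3.0]
E = [1 0 0; 0 1 1; 0 0 1]
isapprox(M \ y, (E*M) \ (E*y))   # should be true
```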
Systems of linear equations lend themselves to matrices, so elementary operations on equations lend themselves to matrix operations.
To illustrate:
```julia
A = [1 2 3; 4 5 6; 7 8 9]
```

```
3x3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9
```
We can switch rows 2 and 3 via left multiplication by the identity matrix with rows $2$ and $3$ swapped:
```julia
[1 0 0; 0 0 1; 0 1 0] * A
```

```
3x3 Array{Int64,2}:
 1  2  3
 7  8  9
 4  5  6
```
We can multiply row 2 by a constant ($\pi$, say) as follows:
```julia
[1 0 0; 0 pi 0; 0 0 1] * A
```

```
3x3 Array{Float64,2}:
  1.0      2.0     3.0
 12.5664  15.708  18.8496
  7.0      8.0     9.0
```
We can replace row $3$ with row $2$ plus row $3$ via:
```julia
[1 0 0; 0 1 0; 0 1 1] * A
```

```
3x3 Array{Int64,2}:
  1   2   3
  4   5   6
 11  13  15
```
If we think of these elementary operations in terms of left multiplication by an elementary matrix, we can write the transformed matrix as $E_m E_{m-1} \cdots E_2 E_1 A$.
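For instance, performing the three operations above in sequence is the same as a single left multiplication by the product of the three elementary matrices. A quick sketch with the matrices just shown:

```julia
E1 = [1 0 0; 0 0 1; 0 1 0]     # swap rows 2 and 3
E2 = [1 0 0; 0 pi 0; 0 0 1]    # scale row 2 by pi
E3 = [1 0 0; 0 1 0; 0 1 1]     # replace row 3 by row 2 plus row 3
isapprox((E3 * E2 * E1) * A, E3 * (E2 * (E1 * A)))   # should be true
```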
Thm: If $A$ is invertible, we can find $A^{-1}$ from a sequence of elementary row operations.
Idea: If we can find a sequence such that $E_m E_{m-1} \cdots E_2 E_1 A = I$, then $E_m E_{m-1} \cdots E_2 E_1 \cdot I = A^{-1}$. Why?
$$~ A^{-1}A = (E_m E_{m-1} \cdots E_2 E_1 \cdot I) A = E_m E_{m-1} \cdots E_2 E_1 (I \cdot A) = E_m E_{m-1} \cdots E_2 E_1 \cdot A = I. ~$$For an $n \times n$ matrix $A$, all of these are equivalent:

* $A$ is invertible;
* for every $b$, the equation $Ax = b$ has a unique solution;
* the only solution of $Ax = 0$ is $x = 0$;
* the columns of $A$ form a basis for $R^n$.

The theorem above also gives a way to compute $A^{-1}$, sketched next.
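Here is a minimal sketch of that computation: row reduce the augmented matrix $[A \ I]$ with the three elementary row operations until the left block is $I$; the right block is then $A^{-1}$. The name `gauss_jordan_inv` is our own, and the code assumes its input is square and invertible.

```julia
# Row reduce [A I] to [I A^(-1)] using elementary row operations.
# `gauss_jordan_inv` is our own helper; it assumes A is invertible.
function gauss_jordan_inv(A)
    n = size(A, 1)
    M = zeros(n, 2n)
    M[:, 1:n] = A
    for i in 1:n
        M[i, n+i] = 1.0              # augment with the identity
    end
    for j in 1:n
        p = j                        # find the largest pivot in column j
        for i in (j+1):n
            abs(M[i, j]) > abs(M[p, j]) && (p = i)
        end
        M[[j, p], :] = M[[p, j], :]  # elementary op: swap rows
        M[j, :] = M[j, :] / M[j, j]  # elementary op: scale the pivot row
        for i in 1:n
            i == j && continue
            M[i, :] = M[i, :] - M[i, j] * M[j, :]  # elementary op: eliminate
        end
    end
    M[:, (n+1):2n]
end

gauss_jordan_inv([2 1; 1 2])   # agrees with inv([2 1; 1 2])
```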
Matrices can be partitioned into blocks. So for example:
```julia
B = [1 2; 3 4]
C = [5 6; 6 5]
O = [0 0; 0 0]
I = [1 0; 0 1]
A = [B C; O I]
```

```
4x4 Array{Int64,2}:
 1  2  5  6
 3  4  6  5
 0  0  1  0
 0  0  0  1
```
Problem 7 has one show that $A^k$ is also a block matrix, with blocks $B^k$ and $(B^k-I)(B-I)^{-1}C$ to go with $O$ and $I$. Let's see with $k = 2$:
We need $B-I$ to have an inverse. We check:
```julia
det(B - I)
```

```
-6.0
```
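It does, since the determinant is nonzero. Here is a quick sketch of the $k=2$ check (we use `isapprox` to allow for the floating point introduced by `inv`):

```julia
k = 2
out = A^k
out[1:2, 1:2] == B^k                                  # expect true
isapprox((B^k - I) * inv(B - I) * C, out[1:2, 3:4])   # expect true
```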
Now let $k=3$; we have:
```julia
k = 3
out = A^k
```

```
4x4 Array{Int64,2}:
 37   54  117  114
 81  118  252  243
  0    0    1    0
  0    0    0    1
```
And now check the pieces:
```julia
out[1:2, 1:2] == B^k
```

```
true
```
and
```julia
(B^k - I) * inv(B - I) * C == out[1:2, 3:4]
```

```
true
```