Review for test 2

11/11: Please let me know if something seems wrong. I may add more questions, but wanted to get this up sooner than later.

This is some of the material that will be up for testing on test 2.

Chapter 3

In Chapter 3 we discussed various means to solve the equation $f(x)=0$ where $f: R \rightarrow R$. (That is, $f$ is a scalar-valued function of a real variable.) This formulation applies to finding solutions of each of these equations:

$$~ \begin{align} \sin(x) -x + 1/2 &= 0\\ \tan(x) + x &= 1\\ a\cos(x) &= x \end{align} ~$$

Here is a summary of the methods we discussed:

Bisection method (repeatedly halve a bracketing interval $[a,b]$ where $f(a)$ and $f(b)$ have opposite signs)

Newton's method ($x_{n+1} = x_n - f(x_n)/f'(x_n)$)

Secant method ($x_{n+1} = x_n - f(x_n) \cdot (x_n - x_{n-1})/(f(x_n) - f(x_{n-1}))$)

Fixed-point iteration ($x_{n+1} = g(x_n)$, which converges to a solution of $g(x) = x$ when $|g'| < 1$ near the fixed point)
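For concreteness, here is a minimal Julia sketch of Newton's method applied to the first equation above. (This is just an illustration, not code from class; the function name `newton` and the stopping rule are my own choices.)

# Newton's method: iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is small
function newton(f, fp, x0; tol = 1e-12, maxsteps = 100)
    x = x0
    for _ in 1:maxsteps
        dx = f(x) / fp(x)
        x -= dx
        abs(dx) < tol && return x
    end
    error("no convergence")
end

f(x)  = sin(x) - x + 1/2
fp(x) = cos(x) - 1
newton(f, fp, 1.0)    # a root of sin(x) - x + 1/2 = 0, near 1.5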

Some sample problems

$$~ f(x_n) = f(r) + f'(r)e_n + (1/2) f''(r) e_n^2 + \mathcal{O}(e_n^3) ~$$

Divide this by $e_n = x_n - r$, then use the result for both the case $n$ and the case $n-1$. Use these pieces to identify the limit, as $n$ goes to $\infty$, of

$$~ \frac{f(x_n)/e_n - f(x_{n-1})/e_{n-1}}{x_n - x_{n-1}}. ~$$

Another sample problem concerns the value of the nested radical

$$~ \sqrt{p + \sqrt{p + \sqrt{p + \cdots}}} ~$$

(As the book says, this is $x_{n+1} = \sqrt{p + x_n}$.)
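Here is a quick Julia check of that iteration. With $p=2$ the limit should be $2$, the positive root of $x^2 - x - 2 = 0$ (the choice $p=2$ and the function name are just for illustration).

# iterate x_{n+1} = sqrt(p + x_n) starting from x0
function nested_sqrt(p, x0; nsteps = 30)
    x = x0
    for _ in 1:nsteps
        x = sqrt(p + x)
    end
    x
end

nested_sqrt(2, 1.0)   # ≈ 2.0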

Chapter 4

Chapter 4 deals with the problem of simultaneous solutions of linear equations expressed in matrix form: $Ax=b$.

4.1

In the first section, there is a review of linear algebra concepts.

4.2

We looked at solving $Ax=b$ in special cases: diagonal matrices, upper triangular, or lower triangular.
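As an illustration of why these cases are easy, here is a minimal Julia sketch of forward substitution for a lower-triangular system (my own sketch, not code from class):

# solve Lx = b for lower-triangular L by forward substitution
function forwardsub(L, b)
    n = length(b)
    x = zeros(n)
    for i in 1:n
        s = float(b[i])
        for j in 1:i-1
            s -= L[i,j] * x[j]    # subtract the already-known terms
        end
        x[i] = s / L[i,i]
    end
    x
end

L = [2 0 0; 1 3 0; 1 1 4]
forwardsub(L, [2, 7, 10])   # should agree with L \ [2, 7, 10]

Back substitution for an upper-triangular system is the same idea, working from the last row up.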

We looked at forming an LU decomposition of a matrix $A$. The first step of the method basically forms this matrix product:

$$~ A = \left[ \begin{array}{cc} a_{11} & c^T\\ b & A_1 \end{array} \right] = L_1 \cdot U_1 = \left[ \begin{array}{cc} 1 & 0\\ b/a_{11} & L_2 \end{array} \right] \cdot \left[ \begin{array}{cc} a_{11} & c^T\\ 0 & R_2 \end{array} \right] ~$$

(We won't use this method, but you should know how to do Gaussian Elimination to find $L$ and $U$, with a possible permutation matrix $P$.)
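For reference, here is a bare-bones Julia sketch of Gaussian elimination without pivoting that records the multipliers to produce $L$ and $U$ (an illustrative sketch only; it assumes no zero pivot is encountered):

using LinearAlgebra

# LU by Gaussian elimination without pivoting; assumes every pivot is nonzero
function lu_nopivot(A)
    n = size(A, 1)
    U = float(copy(A))
    L = Matrix{Float64}(I, n, n)
    for i in 1:n-1
        for k in i+1:n
            L[k,i] = U[k,i] / U[i,i]       # the multiplier for row k
            U[k,:] .-= L[k,i] .* U[i,:]    # eliminate the entry below the pivot
        end
    end
    L, U
end

A = [4 1 2; 1 2 3; 2 1 2]
L, U = lu_nopivot(A)
L * U ≈ A    # true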

We saw a block-matrix multiplication method to find the Cholesky decomposition. It starts with

$$~ A = \left[ \begin{array}{cc} a_{11} & w^T\\ w & K \end{array} \right], \quad L_1 = \left[ \begin{array}{cc} \sqrt{a_{11}} & 0^T\\ w/\sqrt{a_{11}} & I \end{array} \right] \quad\text{ and } B_1= \left[ \begin{array}{cc} 1 & 0^T\\ 0 & K - \frac{ww^T}{a_{11}} \end{array} \right]. ~$$

And then we have $A = L_1 B_1 L_1^T$, and we can repeat the work on $B_1$ to get a decomposition.
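Here is a short Julia sketch of the Cholesky computation, written as the usual element-by-element loop rather than the block form above (an illustration only; it assumes $A$ is symmetric positive definite):

using LinearAlgebra   # for dot

# Cholesky factorization A = L * L' for symmetric positive definite A
function cholesky_ll(A)
    n = size(A, 1)
    L = zeros(n, n)
    for j in 1:n
        L[j,j] = sqrt(A[j,j] - dot(L[j,1:j-1], L[j,1:j-1]))   # positive diagonal
        for i in j+1:n
            L[i,j] = (A[i,j] - dot(L[i,1:j-1], L[j,1:j-1])) / L[j,j]
        end
    end
    L
end

A = [4 2; 2 3]
L = cholesky_ll(A)
L * L' ≈ A    # true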

We had these theorems:

If all $n$ leading principal minors of $A$ are non-singular, then $A$ has an $LU$ decomposition (without pivoting).

If $A$ is real, symmetric and positive definite then it has a unique (Cholesky) factorization $A=LL^T$ and $L$ has a positive diagonal.

4.3 Gaussian Elimination

The Gaussian Elimination process produces a matrix $U$, possibly with row swaps, and, if one records the row multipliers appropriately, a matrix $L$, where $LU=PA$, $P$ being a permutation matrix.

The decomposition is not unique. In addition to the degrees of freedom in picking the diagonal values of $L$ and $U$, it depends on the choice of pivot rows. We discussed briefly three pivoting strategies:

You should know partial pivoting, where the row whose entry $a_{li}$ in column $i$ has the largest magnitude, among the rows not already used for the previous $i-1$ pivots, is taken to be the pivot row. This choice ensures the multipliers are all bounded by 1 in magnitude, so $L$ will be unit lower triangular with no entry exceeding 1 in magnitude.

One compelling reason to solve $Ax=b$ via $LUx=b$ rather than by forming $x = A^{-1}b$ is that it takes fewer operations. The following theorem allows us to count:

Theorem 4 (p176) on Long Operations: To solve $Ax=b$ for $m$ vectors $b$ where $A$ is $n\times n$ involves approximately this many ops:

$$~ \frac{n^3}{3} + (\frac{1}{2} + m) n^2. ~$$
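This formula is easy to evaluate for a sense of scale (just plugging into the theorem's formula):

# approximate count of long operations from Theorem 4
longops(n, m) = n^3/3 + (1/2 + m) * n^2

longops(100, 1)      # one right-hand side: about 3.5 * 10^5 operations
longops(100, 100)    # the factorization is reused for all 100 right-hand sides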

Section 4.4

Section 4.4 introduces the concept of a norm. This makes talking about convergence possible, as it gives a sense of size.

We saw that a norm on a vector space is a function $\| \cdot \|$ satisfying three properties: $\|x\| > 0$ for $x \neq 0$; $\|\lambda x\| = |\lambda| \cdot \|x\|$ for any scalar $\lambda$; and the triangle inequality $\|x + y\| \leq \|x\| + \|y\|$.

The main vector norms are indexed: $l_1$, $l_2$, and $l_\infty$. More generally, for $p \geq 1$ we have

$$~ \| x \|_p = (|x_1|^p +|x_2|^p +|x_3|^p +\cdots + |x_n|^p)^{1/p}. ~$$
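In Julia these are computed with `norm` from the LinearAlgebra standard library (in current versions; older versions had `norm` in Base):

using LinearAlgebra

x = [3, -4, 0]
norm(x, 1), norm(x, 2), norm(x, Inf)   # (7.0, 5.0, 4.0)
norm(x, 3)                             # the general l_p norm with p = 3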

The induced matrix norm is a norm on the set of matrices that is derived from a norm on vectors. It is specified by:

$$~ \| A \|_p = \text{The largest value of } \{\|Ax\|_p: \|x \|_p = 1\}. ~$$

We commented that when $p=2$, this is the largest singular value of $A$; the singular values are the square roots of the eigenvalues of $A^TA$.

When $p=\infty$, the $\| \cdot \|_{\infty}$ norm is the largest of the $l_1$ norms of the row vectors of $A$ (the maximum absolute row sum).
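These facts can be checked numerically. In current Julia the induced matrix norm is `opnorm` (in older versions `norm` on a matrix played this role):

using LinearAlgebra

A = [1 2; 3 4]
opnorm(A, Inf)                      # 7.0, the largest absolute row sum (|3| + |4|)
maximum(sum(abs.(A), dims=2))       # the same thing, computed directly
opnorm(A, 2)                        # the largest singular value of A
sqrt(maximum(eigvals(A' * A)))      # agrees: square root of the largest eigenvalue of A'A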

A useful inequality for matrix norms is $\| A B \| \leq \| A \| \cdot \| B \|$; similarly, $\|Ax\| \leq \| A \| \cdot \| x \|$. These hold for the matrix norm induced by the vector norm.

The use of norms makes it possible to prove this inequality on the relative errors in solving $Ax=b$ when there is an error in $b$:

$$~ \frac{\| x - \tilde{x} \|}{\|x\|} \leq \kappa(A) \cdot \frac{\| b - \tilde{b} \|}{\|b\|}. ~$$

The condition number is $\kappa(A) = \| A \| \cdot \|A^{-1}\|$. It depends on the norm used.
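In Julia, `cond` computes the condition number (the default is the 2-norm; a second argument picks a different norm):

using LinearAlgebra

A = [1 1/2; 1/3 1/4]
cond(A)                                   # condition number in the 2-norm
cond(A, Inf)                              # in the infinity norm
opnorm(A, Inf) * opnorm(inv(A), Inf)      # the definition, computed directly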

Iterative schemes

We spoke about a few iterative schemes. The first is used to "polish" approximate answers. It goes like this:

Suppose we have an approximate answer $x^{(0)}$ to $Ax=b$. Then the residual is

$$~ r^{(0)} = b - A x^{(0)} ~$$

From here we solve for the error $e^{(0)}$ in $A e^{(0)} = r^{(0)}$. Finally, we get

$$~ x^{(1)} = x^{(0)} + e^{(0)}. ~$$

If everything is exact, then this is the answer; but if there is some loss in solving, then it is an improved answer. In that case the process can be repeated.

This can be made concrete by assuming we have an approximate inverse to $A$, called $B$. Then, when solving $Ae^{(0)} = r^{(0)}$ by multiplying by $B$, we lose some information. In that case, the value of $x^{(1)}$ satisfies:

$$~ r^{(0)} = b - A x^{(0)}; \quad e^{(0)} = B r^{(0)}; \quad x^{(1)} =x^{(0)} + e^{(0)} ~$$

And in general, the value of $x^{(m)}$ satisfies:

$$~ x^{(m)} = B \sum_{k=0}^m (I - AB)^k b. ~$$

The sum on the right is convergent if $\|I - AB \| < 1$ and so $x^{(m)}$ will converge.

This requires knowing that when $\|A\|<1$, then $I-A$ is invertible and $(I-A)^{-1}$ can be written as a Neumann series.
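Here is a small Julia sketch of this improvement scheme with a deliberately crude approximate inverse $B$ (the matrices and the rounding are just my own illustration):

using LinearAlgebra

# iterative improvement: x^(k+1) = x^(k) + B*(b - A*x^(k))
function improve(A, B, b; steps = 10)
    x = B * b                     # initial guess x^(0) = B*b
    for k in 1:steps
        r = b - A * x             # residual
        x = x + B * r             # correct using the approximate inverse
    end
    x
end

A = [2.0 1 0; 1 3 1; 0 1 4]
B = round.(inv(A), digits = 1)    # a rough approximate inverse
b = [1.0, 2, 3]

opnorm(I - A * B)                 # less than 1, so the iteration converges
norm(A * improve(A, B, b) - b)    # small; more steps make it smaller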

Other convergent algorithms.

If we have a splitting matrix $Q$, then we can rewrite $Ax=b$ as $Qx = (Q - A)x + b$, which suggests an iterative scheme of the type:

$$~ Q x^{(k+1)} = (Q- A)x^{(k)} + b. ~$$

For various choices of $Q$ this will converge:

As long as $\| I - Q^{-1}A \| < 1$ for some subordinate norm, this sequence will converge.

More generally, if the iterative scheme is

$$~ x^{(k+1)} = Gx^{(k)} + c ~$$

then this will converge provided the magnitude of the largest eigenvalue of $G$ (its spectral radius) is less than 1.
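A Julia sketch tying these together, using the Jacobi choice where $Q$ is the diagonal of $A$ (the matrix here is just an example I made up; `Diagonal` is the current-Julia way to build $Q$):

using LinearAlgebra

# splitting iteration: Q x^(k+1) = (Q - A) x^(k) + b
function splitting_solve(A, b, Q; steps = 50)
    x = zeros(length(b))
    for k in 1:steps
        x = Q \ ((Q - A) * x + b)
    end
    x
end

A = [4.0 1; 2 5]
b = [1.0, 2.0]
Q = Diagonal(diag(A))             # the Jacobi splitting matrix

G = I - inv(Q) * A
maximum(abs.(eigvals(G)))         # the spectral radius; here < 1, so it converges

splitting_solve(A, b, Q)          # compare with A \ b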

Some sample problems

Solve $Ax = b$ where

$$~ A = \left[ \begin{array}{ccc} 1 & 0 & 0\\ 2 & 3 & 0\\ 4 & 5 & 6 \end{array} \right], \quad b = [7,8,9]^T. ~$$

Find an $LU$ decomposition (via Gaussian Elimination) of

$$~ A = \left[ \begin{array}{ccc} 2 & 0 & -8\\ -4 & -1 & 0\\ 7 & 4 & -5 \end{array} \right]. ~$$

Did you need to use a permutation matrix?

Use your answer to solve $Ax=[1,2,3]^T$.

$$~ A = \left[ \begin{array}{cc} 1 & 2\\ 2 & 1 \end{array} \right] ~$$

Does $A$ have a Cholesky decomposition?

using SymPy
A = [ Sym(4) 1 2; 1 2 3; 2 1 2]

$$~ \left[ \begin{array}{ccc} 4 & 1 & 2\\ 1 & 2 & 3\\ 2 & 1 & 2 \end{array} \right] ~$$

The LU decomposition is given by:

L, U, p = lu(A)
(

⎡ 1    0   0⎤
⎢           ⎥
⎢1/4   1   0⎥
⎢           ⎥
⎣1/2  2/7  1⎦,


⎡4   1    2 ⎤
⎢           ⎥
⎢0  7/4  5/2⎥
⎢           ⎥
⎣0   0   2/7⎦,

[1,2,3])

From this, we have inverses:

inv(L)

$$~ \left[ \begin{array}{ccc} 1 & 0 & 0\\ -\frac{1}{4} & 1 & 0\\ -\frac{3}{7} & -\frac{2}{7} & 1 \end{array} \right] ~$$

and

inv(U)

$$~ \left[ \begin{array}{ccc} \frac{1}{4} & -\frac{1}{7} & -\frac{1}{2}\\ 0 & \frac{4}{7} & -5\\ 0 & 0 & \frac{7}{2} \end{array} \right] ~$$

Show that we can find $A^{-1}$ using these two matrices. What is the value?

$$~ A = \left[ \begin{array}{ccc} 2 & 0 & -8\\ -4 & -1 & 0\\ 7 & 4 & -5 \end{array} \right]. ~$$

$$~ A = \left[ \begin{array}{ccc} 1 & 0 & 0\\ 2 & 3 & 0\\ 4 & 5 & 6 \end{array} \right]. ~$$

Suppose $A = Q - B$ is a splitting of $A$. Then $Ax=b$ can be rearranged to $Qx = Bx + b$, which suggests the iteration

$$~ x^{(k+1)} = Q^{-1} B x^{(k)} + Q^{-1}b. ~$$

This is of the general form $x^{(k+1)} = Gx^{(k)} + c$. Using the criterion that the spectral radius of $G$ must be less than 1, how does this translate into a condition on $Q$ and $B$ that will ensure convergence?

A = [1 2; 3 4]
2x2 Array{Int64,2}:
 1  2
 3  4

and

b = [1,2]
2-element Array{Int64,1}:
 1
 2

A guess at an answer is $x^{(0)} = [-1/4, 1/4]$. Compute the residual $r^{(0)} = b - Ax^{(0)}$. The error $e^{(0)}$ satisfies $Ae^{(0)} = r^{(0)}$. Find $e^{(0)}$.

A = [1 1/2; 1/3 1/4]
2x2 Array{Float64,2}:
 1.0       0.5 
 0.333333  0.25

The Jacobi method takes $Q$ to be the diagonal of $A$. Is it guaranteed to converge for any $b$? (Use the $\|\cdot \|_\infty$ norm.)

You might be interested in this computation:

Q = diagm(diag(A)) # [1 0; 0 1/4]
I - inv(Q) * A
2x2 Array{Float64,2}:
  0.0      -0.5
 -1.33333   0.0