Convergence of iterative methods

Theorem

The fixed point method given by

$$g(x) = Tx + b, \quad x^{(k)} = g(x^{(k-1)}), \quad x^{(0)} \in \mathbb R^{n}, k \geq 1$$

converges to the unique solution of $(I-T)x = b$ for any initial guess $x^{(0)}$, provided $\rho(T) < 1$.

Proof

The iterates of the fixed-point method are

\begin{align*} x^{(0)} &= x^{(0)},\\ x^{(1)} &= Tx^{(0)} + b,\\ x^{(2)} &= T^2x^{(0)} + Tb + b,\\ x^{(3)} &= T^3x^{(0)} + T^2b + Tb + b,\\ &\;\;\vdots\\ x^{(k)} &= T^k x^{(0)} + \sum_{j=0}^{k-1} T^jb. \end{align*}

As $k \to \infty$, $\|T^k\| \to 0$ since $\rho(T) < 1$, and the Neumann series converges:

$$\sum_{j=0}^{k-1} T^jb \to (I-T)^{-1} b.$$

Therefore

$$x = \lim_{k \to \infty} x^{(k)} = (I-T)^{-1} b,$$

that is,

$$(I-T)x = b.$$
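To see the theorem in action, here is a small numerical check. This is only a sketch: the matrix $T$ is random and rescaled so that $\rho(T) = 0.5$, and none of these values come from the text.

% Verify convergence to (I - T)^{-1} b for a random T with rho(T) < 1
T = rand(4); T = 0.5*T/max(abs(eig(T)));  % rescale so that rho(T) = 0.5
b = rand(4,1);
x = zeros(4,1);                           % initial guess x^{(0)}
for k = 1:100
    x = T*x + b;                          % x^{(k)} = T x^{(k-1)} + b
end
norm(x - (eye(4) - T)\b)                  % near machine precision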

Corollary (Error estimate)

The fixed point method given by

$$g(x) = Tx + b, \quad x^{(k)}= g(x^{(k-1)}), \quad x^{(0)} \in \mathbb R^{n}, k \geq 1,$$

with $\rho(T) < 1$ has an error that satisfies

$$\|x^{(k)}- x\| \leq \|T\|^k \|x^{(0)} - x\|,\quad x = (I-T)^{-1}b,$$

in any norm. Additionally, for any $\delta > 0$

$$ \|x^{(k)}- x\| = O( (\rho(T) + \delta)^k ).$$

Proof

From the proof of the previous theorem,

$$x^{(k)} - x = T^k x^{(0)} + \sum_{j=0}^{k-1} T^jb - \sum_{j=0}^{\infty}T^j b = T^k x^{(0)} - \sum_{j=k}^{\infty}T^j b = T^k\left( x^{(0)} - \sum_{j=0}^{\infty}T^j b \right) = T^k (x^{(0)} -x).$$

Therefore

$$\|x^{(k)} - x \| \leq \|T^k\| \|x^{(0)}-x\| \leq \|T\|^k \|x^{(0)}-x\|.$$

The proof of the last fact requires the following two results that we will not prove:

Theorem A

For any $n\times n$ matrix $A$ and $\delta > 0$, there exists an induced matrix norm $\|\cdot\|_A$ (depending on $A$) such that

$$ \|A\|_A \leq \rho(A) + \delta. $$

Theorem B

Given any two induced matrix norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ on $n \times n$ matrices, there exist constants $0 < C_1 \leq C_2$ such that

$$C_1 \|A\|_\beta \leq \|A\|_\alpha \leq C_2 \|A\|_\beta,$$

for all matrices $A$.

To finish the proof of the corollary, given $\delta > 0$, we use Theorem A to choose a matrix norm $\|\cdot\|_T$ such that $\|T\|_T \leq \rho(T) + \delta$. Then from Theorem B we find some constant $C > 0$ such that

$$\|x^{(k)} - x \| \leq \|T^k\| \|x^{(0)}-x\| \leq C \|T^k\|_T \|x^{(0)}-x\| \leq C \|T\|^k_T \|x^{(0)}-x\| \leq C (\rho(T) + \delta)^k \|x^{(0)}-x\|. $$

Therefore

$$ \|x^{(k)}- x\| = O( (\rho(T) + \delta)^k ).$$
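The linear rate is also visible numerically. The sketch below (same illustrative setup as before, not code from the text) prints the ratios $\|x^{(k)} - x\|/\|x^{(k-1)} - x\|$, which typically settle near $\rho(T)$:

% Observe the rate rho(T) predicted by the corollary
T = rand(4); T = 0.5*T/max(abs(eig(T)));  % rho(T) = 0.5
b = rand(4,1);
xstar = (eye(4) - T)\b;                   % exact solution
x = zeros(4,1); err = zeros(30,1);
for k = 1:30
    x = T*x + b;
    err(k) = norm(x - xstar);
end
err(2:end)./err(1:end-1)                  % ratios settle near rho(T) = 0.5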

Jacobi's Method

We want to solve the equation $Ax=b$. If $A$ were a diagonal matrix with non-zero diagonal entries, computing $A^{-1} = \mathrm{diag}(a_{11}^{-1},a_{22}^{-1},\ldots,a_{nn}^{-1})$ would be simple. But, of course, most linear systems are not diagonal. The Jacobi iterative method is given by the following iteration formula:

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= \sum_{j=1, ~~ j\neq i}^n \left( - \frac{a_{ij} x_j^{(k-1)}}{a_{ii}} \right) + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}
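The examples below call a helper Jacobi(A,b,tol,Nmax) whose listing is not included in the text. Here is a minimal sketch consistent with that call signature; the stopping rule on the step size and the printed message are assumptions inferred from the output below.

function x = Jacobi(A, b, tol, Nmax)
% JACOBI  Sketch of the Jacobi iteration for A*x = b.
%   Stops when norm(x - xold) < tol or after Nmax iterations.
n = length(b);
x = zeros(n,1);                           % initial guess x^{(0)} = 0
for k = 1:Nmax
    xold = x;
    for i = 1:n
        j = [1:i-1, i+1:n];               % all indices except i
        x(i) = (b(i) - A(i,j)*xold(j))/A(i,i);
    end
    if norm(x - xold) < tol
        return
    end
end
fprintf('Maximum number of iterations reached, Nmax = %d\n', Nmax)
end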
In [11]:
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1,1,1,1]'; n = 4; x = zeros(n,1);
x = Jacobi(A,b,.000001,40)
norm(A*x-b)
x =

    0.0876
    0.0786
    0.1011
    0.1082


ans =

   4.9674e-07

Jacobi's method is easier to understand when written in matrix form, even though it should be coded as written above. Let $A = D - L - U$ where $L$ is strictly lower-triangular, $D$ is diagonal and $U$ is strictly upper-triangular. Then define

$$ x^{(k)} = D^{-1}(L + U) x^{(k-1)} + D^{-1} b.$$

You should check that this is the same as the componentwise equations for Jacobi's method above; a numerical check appears after the displays below. A fixed-point of this iteration would satisfy

$$ x = D^{-1}(L + U) x + D^{-1} b,$$

equivalently

$$ (D-L-U)x = b,$$

that is,

$$ Ax = b.$$
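One way to do the suggested check numerically, using the example matrix from above (a sketch, not code from the text):

% One Jacobi step in matrix form vs. componentwise form
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1,1,1,1]';
D = diag(diag(A)); LplusU = D - A;        % A = D - L - U
xold = rand(4,1);
xmat = D\(LplusU*xold) + D\b;             % matrix form
xcomp = zeros(4,1);
for i = 1:4                               % componentwise form
    j = [1:i-1, i+1:4];
    xcomp(i) = (b(i) - A(i,j)*xold(j))/A(i,i);
end
norm(xmat - xcomp)                        % zero up to rounding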

Theorem

Jacobi's method converges if $\rho(D^{-1}(L + U))< 1$.

In particular, this holds whenever $A$ is strictly diagonally dominant, that is, $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for each $i$, since then $\rho(D^{-1}(L+U)) \leq \|D^{-1}(L+U)\|_\infty < 1$. Informally: Jacobi converges when $D$ has large (diagonal) entries relative to the entries of $L$ and $U$.
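For the example matrix used above, the spectral radius of the iteration matrix is easy to compute directly (a sketch; the value is not quoted in the text):

% Spectral radius of the Jacobi iteration matrix T = D^{-1}(L + U)
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
D = diag(diag(A));
T = D\(D - A);                            % D^{-1}(L + U), since L + U = D - A
max(abs(eig(T)))                          % strictly less than 1, so Jacobi converges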

Gauss-Seidel Method

The Gauss-Seidel iterative method is a slight modification of Jacobi's method. Recall Jacobi's method

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= \sum_{j=1, ~~ j\neq i}^n \left( - \frac{a_{ij} x_j^{(k-1)}}{a_{ii}} \right) + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}

We can modify this by replacing the "$k-1$" on the right-hand side with $k$ for $j < i$, since $x_j^{(k)}$ has already been computed for those indices:

\begin{align*} x^{(0)} &= \text{ an initial guess}\\ x^{(k)}_{i} &= -\frac{1}{a_{ii}}\left[\sum_{j=1}^{i-1} {a_{ij} x_j^{(k)}} + \sum_{j=i+1}^n {a_{ij} x_j^{(k-1)}} \right] + \frac{b_i}{a_{ii}}, \quad {i = 1,2,\ldots,n}, ~~ k > 0. \end{align*}
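As with Jacobi, the helper GS called below is not listed in the text; here is a minimal sketch consistent with the call GS(A,b,tol,Nmax), with the same assumed stopping rule as the Jacobi sketch above:

function x = GS(A, b, tol, Nmax)
% GS  Sketch of the Gauss-Seidel iteration for A*x = b.
%   Stops when norm(x - xold) < tol or after Nmax iterations.
n = length(b);
x = zeros(n,1);                           % initial guess x^{(0)} = 0
for k = 1:Nmax
    xold = x;
    for i = 1:n
        % x(1:i-1) already holds the new iterates x_j^{(k)}
        s = A(i,1:i-1)*x(1:i-1) + A(i,i+1:n)*xold(i+1:n);
        x(i) = (b(i) - s)/A(i,i);
    end
    if norm(x - xold) < tol
        return
    end
end
fprintf('Maximum number of iterations reached, Nmax = %d\n', Nmax)
end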
In [5]:
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
b = [1,1,1,1]'; n = 4; x = zeros(n,1);
x = Jacobi(A,b,0,10);
norm(A*x-b)
x = GS(A,b,0,10);
norm(A*x-b)
Maximum number of iterations reached, Nmax = 10
ans =

   8.2524e-05

Maximum number of iterations reached, Nmax = 10
ans =

   1.4308e-10

Let $A = D - L - U$ where $L$ is strictly lower-triangular, $D$ is diagonal and $U$ is strictly upper-triangular. Then define

$$ x^{(k)} = (D-L)^{-1}U x^{(k-1)} + (D-L)^{-1} b.$$

You should check that this gives the same iteration as the formula for Gauss-Seidel. A fixed-point of this iteration would satisfy

$$ x = (D-L)^{-1}U x + (D-L)^{-1}b,$$

equivalently

$$ (D-L-U)x = b,$$

that is,

$$ Ax = b.$$

Theorem

  1. The Gauss-Seidel method converges if $\rho((D-L)^{-1}U)< 1$.
  2. If the entries of $L$ and $U$ in $A = D-L-U$ are non-negative and $\rho(D^{-1}(L +U)) < 1$ (Jacobi converges) then
$$ \rho((D-L)^{-1}U) < \rho(D^{-1}(L +U)) \quad \text{(Gauss-Seidel converges faster)}.$$
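For the example matrix used earlier, both spectral radii can be compared directly. Note that this particular $A$ has some positive off-diagonal entries, so part 2 of the theorem does not strictly apply; the sketch below simply shows what happens for this $A$:

% Compare spectral radii of the Jacobi and Gauss-Seidel iteration matrices
A = [10, -1, 2, 0; -1, 11, -1, 3; 2, -1, 10, -1; 0, 3, -1, 8];
D = diag(diag(A));
L = -tril(A,-1); U = -triu(A,1);          % A = D - L - U
rhoJ  = max(abs(eig(D\(L + U))))          % Jacobi iteration matrix
rhoGS = max(abs(eig((D - L)\U)))          % Gauss-Seidel iteration matrix
% the runs above suggest rhoGS < rhoJ here, matching the faster convergence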