Direct Methods for Solving Linear Systems

A review of linear algebra

Example

Find a quadratic polynomial $p(x)$ such that $p(-1) = 0$, $p(1) = 0$ and $p'(0) = 0$.

Let $p(x) = a x^2 + b x + c$ then we have to solve the following equations

$$a - b + c = 0\\ a + b + c =0\\ 0 + b + 0 = 0$$

To solve this linear system we perform row operations. The row operation $R_2- R_1 \to R_2$ states that we should subtract row 1 from row 2 and replace row 2 with the difference:

$$\begin{array}{cccc} a - b + c = 0 & &a - b + c = 0\\ a + b + c =0 & R_2-R_1 \to R_2 &0 + 2b + 0 =0\\ 0 + b + 0 = 0 && 0 + b + 0 = 0\\ \end{array}$$

The last two equations give $b = 0$, and then the top equation gives $a = -c$. So, letting $a = \alpha \in \mathbb R$, the polynomial $p(x)$ is given by:

$$ p(x) = \alpha(x^2 -1).$$

We develop the proper definitions so that we can write the above system in matrix form as

$$\begin{bmatrix} 1 & -1 & 1\\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
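As a quick sanity check, a short pure-Python sketch can verify that the coefficient vector $(a,b,c) = (1, 0, -1)$ (i.e. $\alpha = 1$) is sent to the zero vector by this matrix:

```python
# Verify that (a, b, c) = (1, 0, -1), the alpha = 1 solution found above,
# satisfies the 3 x 3 homogeneous system written in matrix form.
A = [[1, -1, 1],
     [1,  1, 1],
     [0,  1, 0]]
x = [1, 0, -1]          # coefficients of p(x) = x^2 - 1

# Matrix-vector product: each entry of y is a row of A dotted with x.
y = [sum(row[i] * x[i] for i in range(3)) for row in A]
print(y)                # [0, 0, 0]
```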

Definition

A matrix is an $n \times m$ array of real (or complex) numbers, where ordering matters. Here $n$ refers to the number of rows in the matrix and $m$ is the number of columns.

For an $n \times m$ matrix $A$ we use $a_{ij}$ to refer to the element that is in the $i$th row and $j$th column. Rows and columns are counted from the top left entry. We use the notation

$$A = (a_{ij})_{1 \leq i \leq n, 1 \leq j \leq m}$$

to refer to the matrix $A$ by its entries. We also use $A = (a_{ij})$ when $n$ and $m$ are implied.

The diagonal entries of a matrix $A$ are $a_{ii}$ for $1 \leq i \leq \min\{m,n\}$.

A matrix $A$ is triangular if either $a_{ij} = 0$ for $i < j$ (lower triangular) or $a_{ij} = 0$ for $j < i$ (upper triangular). If a matrix is both lower and upper triangular it is said to be diagonal.
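These definitions translate directly into code. Below is a small Python sketch (the helper names are my own, not from the text) that tests the triangularity conditions entrywise:

```python
def is_upper_triangular(A):
    """True if a_ij = 0 whenever j < i (all entries below the diagonal vanish)."""
    return all(A[i][j] == 0
               for i in range(len(A)) for j in range(len(A[0])) if j < i)

def is_lower_triangular(A):
    """True if a_ij = 0 whenever i < j (all entries above the diagonal vanish)."""
    return all(A[i][j] == 0
               for i in range(len(A)) for j in range(len(A[0])) if i < j)

U = [[1, 2], [0, 3]]    # upper triangular
D = [[4, 0], [0, 5]]    # both upper and lower triangular, hence diagonal
print(is_upper_triangular(U))                             # True
print(is_upper_triangular(D) and is_lower_triangular(D))  # True
```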

A column vector is an $n \times 1$ matrix.

A row vector is a $1 \times m$ matrix.

If $x$ is either a row or column vector, we use $x_i$ ($1 \leq i \leq n$ or $1 \leq i \leq m$, resp.) to refer to its entries.

Definition (matrix-vector multiplication)

Let $A = (a_{ij})$ be an $n \times m$ matrix and let $x$ be an $m \times 1$ column vector. The product $y = Ax$ is an $n \times 1$ column vector given by

$$ y_j = \sum_{i=1}^m a_{ji} x_i, \quad 1 \leq j \leq n.$$

With this notation, the following linear system of equations

$$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = y_1\\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = y_2\\ \vdots ~~~~~~~~~~~~~~~ \vdots\\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = y_n$$

is equivalent to $Ax = y$.

In [6]:
A = [1,2,3; 4,5,6]; % 2 x 3 matrix
x = [1;1;1]; % 3 x 1 vector
A*x  % gives a 2 x 1 vector
ans =

     6
    15
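The same computation can be written in pure Python directly from the definition (a sketch; the function name `matvec` is my own):

```python
def matvec(A, x):
    """Compute y = Ax from the definition: y_j = sum_{i=1}^m a_ji x_i."""
    return [sum(a_ji * x_i for a_ji, x_i in zip(row, x)) for row in A]

A = [[1, 2, 3], [4, 5, 6]]   # 2 x 3 matrix
x = [1, 1, 1]                # length-3 vector
print(matvec(A, x))          # [6, 15], matching the MATLAB output above
```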

Gaussian elimination with backward substitution

The goal of Gaussian elimination is to solve a system of equations by performing what are called elementary row operations to reduce a general matrix to an upper-triangular matrix. Then a procedure called backward substitution is applied to the upper-triangular matrix.

Elementary row operations are:

  1. Multiply any row by a non-zero constant. $c R_1 \to R_1$
  2. Given a row, any multiple of another row can be added, replacing the given row. $R_2 - 3 R_1 \to R_2$
  3. Interchange two rows: $R_1 \leftrightarrow R_2$
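The three elementary row operations can be sketched as small Python helpers acting on a matrix stored as a list of rows (the helper names here are my own):

```python
def scale(A, i, c):
    """c R_i -> R_i : multiply row i by a nonzero constant c."""
    A[i] = [c * v for v in A[i]]

def add_multiple(A, i, j, c):
    """R_i + c R_j -> R_i : add c times row j to row i, replacing row i."""
    A[i] = [v + c * w for v, w in zip(A[i], A[j])]

def swap(A, i, j):
    """R_i <-> R_j : interchange rows i and j."""
    A[i], A[j] = A[j], A[i]

M = [[1, -1], [1, 1]]
add_multiple(M, 1, 0, -1)    # R_2 - R_1 -> R_2 (0-based indices in code)
print(M)                     # [[1, -1], [0, 2]]
```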

Assume we want to solve the following linear system of equations $Ax = b$

$$\begin{bmatrix} 1 & -1 & 1 & 0\\ 1 & 1 & 2 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & -1 &1 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0\end{bmatrix}.$$

We consider the associated augmented matrix

$$ [A, b] = \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & -1 &1 & -1 & 0\end{array} \right]. $$

The process of Gaussian elimination reduces this augmented matrix to an upper-triangular matrix:

$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & -1 &1 & -1 & 0\end{array} \right] & R_2 - R_1 \to R_2 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & -1 &1 & -1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & -1 &1 & -1 & 0\end{array} \right] & R_4 - R_1 \to R_4 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 &0 & -1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 &0 & -1 & 0\end{array} \right]& R_3 - 1/2R_2 \to R_3 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & -1/2 & -1/2 & -1/2 \\ 0 & 0 &0 & -1 & 0\end{array} \right] \end{array}$$
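The elimination steps above can be sketched in Python as two nested loops over the augmented matrix. This version performs no row interchanges, which is fine here because every pivot happens to be nonzero:

```python
# Forward elimination on the augmented matrix [A, b] of the example.
M = [[1.0, -1.0, 1.0,  0.0, 0.0],
     [1.0,  1.0, 2.0,  1.0, 1.0],
     [0.0,  1.0, 0.0,  0.0, 0.0],
     [1.0, -1.0, 1.0, -1.0, 0.0]]

n = 4
for k in range(n):                 # for each pivot M[k][k] ...
    for i in range(k + 1, n):      # ... zero out the entries below it
        m = M[i][k] / M[k][k]      # multiplier in R_i - m R_k -> R_i
        M[i] = [a - m * c for a, c in zip(M[i], M[k])]

for row in M:
    print(row)
# last row: [0.0, 0.0, 0.0, -1.0, 0.0]
```

The result matches the upper-triangular augmented matrix obtained by hand above.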

The process of backward substitution starts with the final line. Let's take this equation out of augmented form and write it as $Ux = y$. The last equation is:

$$u_{44} x_4 = -x_4 = 0 \quad \Longrightarrow \quad x_4 = 0.$$

Then the next ones:

$$ u_{33}x_3 + u_{34} x_4 = -1/2 x_3 -1/2 x_4 = -1/2 \quad \Longrightarrow \quad x_3 = 1.$$$$ u_{22}x_2 + u_{23}x_3+ u_{24} x_4 = 2x_2 + x_3 + x_4 = 1 \quad \Longrightarrow \quad x_2 = 0.$$$$ u_{11}x_1 + u_{12}x_2 + u_{13}x_3+ u_{14} x_4 = x_1-x_2+x_3 = 0 \quad \Longrightarrow \quad x_1 = -1.$$

In general, we perform these row operations to turn $Ax = b$ (typically when $A$ is $n \times n$) into $Ux = y$, and then backward substitution is given by

$$ x_i = \frac{y_i - \sum_{j = i+1}^n u_{ij}x_j}{u_{ii}},\quad i = n, n-1, \ldots, 2,1$$

Note that this is a well-defined solution procedure because $x_i$ is given in terms of $x_j$ for $j > i$. If we start with $x_n$ and work our way down to $x_1$, we know everything on the right-hand side of this equation.
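This formula can be sketched directly in Python (the function name is my own). Applied to the upper-triangular system from the example, it recovers the solution found by hand:

```python
def back_substitution(U, y):
    """Solve Ux = y for upper-triangular U via
    x_i = (y_i - sum_{j > i} u_ij x_j) / u_ii, for i = n, n-1, ..., 1."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # work from the last row upward
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

# Upper-triangular system Ux = y from the elimination example:
U = [[1.0, -1.0,  1.0,  0.0],
     [0.0,  2.0,  1.0,  1.0],
     [0.0,  0.0, -0.5, -0.5],
     [0.0,  0.0,  0.0, -1.0]]
y = [0.0, 1.0, -0.5, 0.0]
print(back_substitution(U, y))   # solution x = (-1, 0, 1, 0)
```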

WARNING: This solution procedure is not well-defined if one of the diagonal entries vanishes as row operations are performed. If this happens, one has to interchange rows.

The full algorithm for this basic Gaussian elimination, including row swaps, is given in Algorithm 6.1 in the text.
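In the spirit of that algorithm, here is a Python sketch of the full procedure, with a row interchange whenever a pivot vanishes. It is my own condensed version, not a transcription of Algorithm 6.1:

```python
def gauss_solve(A, b):
    """Gaussian elimination with row interchanges, then backward substitution.
    A sketch in the spirit of the textbook's Algorithm 6.1."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix [A, b]
    for k in range(n):
        if M[k][k] == 0:   # pivot vanished: swap in a lower row (R_k <-> R_p)
            p = next(i for i in range(k + 1, n) if M[i][k] != 0)
            M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):                  # R_i - m R_k -> R_i
            m = M[i][k] / M[k][k]
            M[i] = [a - m * c for a, c in zip(M[i], M[k])]
    x = [0.0] * n                                  # backward substitution
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[1.0, -1.0, 1.0,  0.0],
     [1.0,  1.0, 2.0,  1.0],
     [0.0,  1.0, 0.0,  0.0],
     [1.0, -1.0, 1.0, -1.0]]
b = [0.0, 1.0, 0.0, 0.0]
print(gauss_solve(A, b))    # solution x = (-1, 0, 1, 0), as computed by hand
```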

Row-reduced echelon form

In the above example, further row operations can be performed to be able to read off the solution:

$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & -1/2 & -1/2 & -1/2 \\ 0 & 0 &0 & -1 & 0\end{array} \right]& -R_4 \to R_4 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & -1/2 & -1/2 & -1/2 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & -1/2 & -1/2 & -1/2 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & -2R_3 \to R_3 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & R_3-R_4 \to R_3 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & R_2-R_4 \to R_2 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & R_2-R_3 \to R_2 & \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & R_1-R_3 \to R_1 & \left[\begin{array}{cccc|c} 1 & -1 & 0 & 0 & -1 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 0 & 0 & -1 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & 1/2R_2 \to R_2 & \left[\begin{array}{cccc|c} 1 & -1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$$$\begin{array}{ccccc} \left[\begin{array}{cccc|c} 1 & -1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] & R_1 + R_2 \to R_1 & \left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 &0 & 1 & 0\end{array} \right] \end{array}$$

If you now read off the equations that this augmented system represents, we have $x_1 = -1, x_2 = 0, x_3 = 1, x_4 = 0$. Note that this was a lot more work than solving the system with backward substitution, which is an indication that backward substitution is preferable on a computer.

The inverse matrix

Definition

The $n\times n$ identity matrix $I = I_n$ ($n$ is often suppressed) is the $n\times n$ matrix with ones on the diagonal and zero everywhere else:

$$ I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

So, in the example above, we used row operations on the equation $Ax = b$ to change $[A,b]$ to $[I,x]$.

Theorem

Suppose $A$ is an $n \times n$ matrix. If the only solution of $Ax = 0$ is $x = 0$ ($x$ is a vector of all zeros) then there exists a unique matrix $A^{-1}$ called the inverse matrix such that $A^{-1}A = AA^{-1} = I$.

Computing $A^{-1}$

Use row operations to transform $[A,I]$ to $[I,B]$ and then $B = A^{-1}$.
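This procedure can be sketched in Python by running Gauss-Jordan elimination on the augmented block $[A, I]$ (the function name is my own; the code assumes $A$ is invertible):

```python
def inverse(A):
    """Row-reduce [A, I] to [I, B]; then B = A^{-1}. Assumes A is invertible."""
    n = len(A)
    # Augment A with the n x n identity matrix.
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for k in range(n):
        if M[k][k] == 0:                      # swap rows if the pivot vanishes
            p = next(i for i in range(k + 1, n) if M[i][k] != 0)
            M[k], M[p] = M[p], M[k]
        M[k] = [v / M[k][k] for v in M[k]]    # (1/pivot) R_k -> R_k
        for i in range(n):
            if i != k:                        # R_i - m_ik R_k -> R_i
                M[i] = [a - M[i][k] * c for a, c in zip(M[i], M[k])]
    return [row[n:] for row in M]             # right-hand block is A^{-1}

print(inverse([[1.0, 2.0], [3.0, 4.0]]))   # [[-2.0, 1.0], [1.5, -0.5]]
```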
