The Power Method

The Power method is an iterative technique to compute the dominant eigenvalue of a matrix $A$. The eigenvalue $\lambda_1$ of an $n \times n$ matrix $A$ is said to be dominant if

$$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|.$$

Note that not all matrices have a dominant eigenvalue, but we will assume here that we are dealing with such a matrix. For example, $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ has eigenvalues $1$ and $-1$, so neither eigenvalue is dominant. We also assume that

$$A = S D S^{-1}$$

for an invertible matrix $S$ and a diagonal matrix

$$ D = \begin{bmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\&&& \lambda_n \end{bmatrix}.$$

In other words, we assume that $A$ is diagonalizable.

Let $x$ be a non-zero vector in $\mathbb R^n$ and consider the sequence

$$x, Ax, A^2x, A^3x,\ldots, A^kx, \ldots.$$

Computing these powers:

$$A = SDS^{-1},$$
$$A^2 = SDS^{-1}SDS^{-1} = SD^2S^{-1},$$
$$A^3 = SD^2S^{-1}SDS^{-1} = SD^3S^{-1},$$
$$\vdots$$
$$A^k = SD^{k-1}S^{-1}SDS^{-1} = SD^kS^{-1}.$$
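
As a quick numerical sanity check of this identity, here is a sketch using NumPy; the particular $S$ and eigenvalues below are arbitrary choices:

```python
import numpy as np

# Arbitrary diagonalizable example: eigenvalues 3, 1, 0.5, with 3 dominant.
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
D = np.diag([3.0, 1.0, 0.5])
A = S @ D @ np.linalg.inv(S)

k = 10
# Compare A^k computed directly with S D^k S^{-1}.
direct = np.linalg.matrix_power(A, k)
factored = S @ np.diag(np.diag(D) ** k) @ np.linalg.inv(S)
print(np.allclose(direct, factored))  # True (up to rounding)
```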

Factoring $\lambda_1^k$ out of $D^k$, we write

$$A^k x = \lambda_1^k S \begin{bmatrix} 1 \\ & \left(\frac{\lambda_2}{\lambda_1}\right)^k \\ && \ddots \\&&& \left(\frac{\lambda_n}{\lambda_1}\right)^k \end{bmatrix}S^{-1}x.$$

Because $\lambda_1$ is the dominant eigenvalue, $(\lambda_j/\lambda_1)^k \to 0$ as $k \to \infty$ for all $j > 1$.

For large $k$ we can write

$$A^k x \approx \lambda_1^k S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\&&& 0 \end{bmatrix}S^{-1}x.$$

But this will only converge to something non-zero if $\lambda_1 = 1$.

So, we take a different approach, and renormalize with each application of $A$. Note that $A^kx$ grows (or decays) like $\lambda_1^k$, and $A^{k-1}x$ like $\lambda_1^{k-1}$. So, a suitable ratio of these two vectors should isolate $\lambda_1$. We use the $l_\infty$ norm to do this.

Define $x^{(k)} = A^k x$. Let $p = p_{k-1}$ be the smallest integer such that $|x^{(k-1)}_p| = \|x^{(k-1)}\|_\infty$. Define

$$\lambda^{(k)} = \frac{x^{(k)}_p}{x^{(k-1)}_p}.$$

If $A$ is diagonalizable with $\lambda_1$ being its dominant eigenvalue then (typically) $\lambda^{(k)} \to \lambda_1$ as $k \to \infty$.
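
For illustration, here is a minimal sketch that forms the powers $A^k x$ directly and prints the ratios $\lambda^{(k)}$; the matrix and starting vector are arbitrary choices, and the practical, normalized version appears further below:

```python
import numpy as np

# Arbitrary 2x2 example; eigenvalues (5 ± sqrt(5))/2 ≈ 3.618, 1.382.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x_prev = np.array([1.0, 1.0])

for k in range(1, 9):
    x_next = A @ x_prev                # x^(k) = A x^(k-1)
    p = np.argmax(np.abs(x_prev))      # smallest p with |x_p| = ||x^(k-1)||_inf
    print(k, x_next[p] / x_prev[p])    # lambda^(k) -> 3.618...
    x_prev = x_next
```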

Define

$$y = S \begin{bmatrix} 1 \\ & 0 \\ && \ddots \\&&& 0 \end{bmatrix}S^{-1}x.$$

Then

$$A^k x = \lambda_1^k \left( y + O(|\lambda_2/\lambda_1|^k) \right).$$

We then express

$$x^{(k)}_{p_{k-1}} = \lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right),$$

and therefore

$$\lim_{k \to \infty} \frac{x^{(k)}_{p_{k-1}}}{x^{(k-1)}_{p_{k-1}}} = \lim_{k\to \infty} \frac{\lambda_1^k y_{p_{k-1}}\left( 1 + O(|\lambda_2/\lambda_1|^k) \right)}{\lambda_1^{k-1} y_{p_{k-1}} \left( 1 + O(|\lambda_2/\lambda_1|^{k-1}) \right)} = \lambda_1.$$

This can only fail if $y = 0$ (which can happen!). In other words:

Theorem

Let $\lambda^{(k)}$ be the approximation of the dominant eigenvalue $\lambda_1$ of a diagonalizable matrix computed by the Power method. Then (typically)

$$\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{k}).$$

In practice, one does not want to simply compute $A^kx$ because, if $|\lambda_1| \neq 1$, overflow or underflow can affect the computation. Instead, we do the following:

  1. Generate a random vector $x$, set $x = x/\|x\|_\infty$.
  2. Let $p$ be the smallest integer such that $|x_p| = \|x\|_\infty$.
  3. For $k =1,2,\ldots,N_{\mathrm{max}}$ do
    1. Set $y = Ax$.
    2. Set $\lambda^{(k)} = y_p/x_p$.
    3. If $k > 1$ and $|\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon$, return $\lambda^{(k)}$.
    4. Set $x = y/\|y\|_\infty$.
    5. Let $p$ be the smallest integer such that $|x_p| = \|x\|_\infty$.
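
A direct translation of these steps into code might look as follows; this is a sketch in Python with NumPy, and the names `power_method`, `Nmax`, and `eps` are our own choices:

```python
import numpy as np

def power_method(A, Nmax=500, eps=1e-10, seed=0):
    """Approximate the dominant eigenvalue of A with the Power method."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x, np.inf)               # step 1: scale so ||x||_inf = 1
    p = np.argmax(np.abs(x))                     # step 2: smallest p with |x_p| = ||x||_inf
    lam_prev = None
    for k in range(1, Nmax + 1):
        y = A @ x                                # step 3.1
        lam = y[p] / x[p]                        # step 3.2
        if lam_prev is not None and abs(lam - lam_prev) < eps:
            return lam                           # step 3.3: converged
        x = y / np.linalg.norm(y, np.inf)        # step 3.4
        p = np.argmax(np.abs(x))                 # step 3.5
        lam_prev = lam
    return lam                                   # no convergence within Nmax steps
```

On the $2 \times 2$ example above, `power_method(np.array([[2.0, 1.0], [1.0, 3.0]]))` returns roughly $3.618$, the dominant eigenvalue $(5+\sqrt{5})/2$.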

The symmetric Power method

When the matrix $A$ is symmetric, the error term can be reduced further with the help of the fact that $\|Ax\|^2_2 = x^T A^T A x = x^T A^2 x$. For the symmetric Power method we perform the following:

  1. Generate a random vector $x$, set $x = x/\|x\|_2$.
  2. For $k =1,2,\ldots,N_{\mathrm{max}}$ do
    1. Set $y = Ax$.
    2. Set $\displaystyle \lambda^{(k)} = {x^Ty}$.
    3. If $k > 1$ and $|\lambda^{(k)} - \lambda^{(k-1)}| < \epsilon$, return $\lambda^{(k)}$.
    4. Set $x = y/\|y\|_2$.
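
In code, again as a hedged sketch with `symmetric_power_method` being our own name:

```python
import numpy as np

def symmetric_power_method(A, Nmax=500, eps=1e-10, seed=0):
    """Approximate the dominant eigenvalue of a symmetric matrix A."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)                       # step 1: scale so ||x||_2 = 1
    lam_prev = None
    for k in range(1, Nmax + 1):
        y = A @ x                                # step 2.1
        lam = x @ y                              # step 2.2: x^T A x (Rayleigh quotient)
        if lam_prev is not None and abs(lam - lam_prev) < eps:
            return lam                           # step 2.3: converged
        x = y / np.linalg.norm(y)                # step 2.4
        lam_prev = lam
    return lam                                   # no convergence within Nmax steps
```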

To analyze the error for the symmetric Power method, we define $x^{(k)}$ to be the value of $x$ that is computed at iteration $k$, with $x^{(0)}$ being the initial unit vector computed in step 1. By induction it follows that $x^{(k)} = \gamma_k A^{k} x^{(0)}$, where $\gamma_k$ is chosen so that $\|x^{(k)}\|_2 = 1$, i.e., $\gamma_k = \|A^{k}x^{(0)}\|_2^{-1}$. It then follows that

$$ \lambda^{(k+1)} = (x^{(k)})^T (A x^{(k)}) = \frac{ (x^{(0)})^T (A^{k})^T A^{k+1} x^{(0)} }{(x^{(0)})^T (A^{k})^T A^{k} x^{(0)}} = \frac{ (x^{(0)})^T A^{2k+1} x^{(0)} }{(x^{(0)})^T A^{2k} x^{(0)}}, \quad k \geq 0.$$

Because $A$ is symmetric we can apply the Spectral Theorem: $A = Q^T D Q$ for an orthogonal matrix $Q$ and a diagonal matrix $D$ that contains the eigenvalues. Setting $c = Q x^{(0)}$, we have

$$\lambda^{(k+1)} = \frac{c^T D^{2k+1} c}{c^T D^{2k} c} = \frac{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 \lambda_j^{2k}}.$$
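
One can sanity-check this identity numerically; the following sketch uses an arbitrary spectrum and compares the Rayleigh quotient produced by the iteration against the sum formula:

```python
import numpy as np

# Check lambda^(k+1) = (c^T D^(2k+1) c) / (c^T D^(2k) c) on A = Q^T D Q.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))     # random orthogonal Q
lams = np.array([3.0, 1.5, -1.0, 0.25])
A = Q.T @ np.diag(lams) @ Q

x = rng.standard_normal(4)
x /= np.linalg.norm(x)
c = Q @ x

k = 5
for _ in range(k):                                   # k normalized iterations
    y = A @ x
    x = y / np.linalg.norm(y)
rayleigh = x @ (A @ x)                               # lambda^(k+1) from the algorithm
formula = (c**2 * lams**(2*k + 1)).sum() / (c**2 * lams**(2*k)).sum()
print(np.isclose(rayleigh, formula))                 # True
```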

Dividing numerator and denominator by $\lambda_1^{2k}$, where $\lambda_1$ is assumed to be the dominant eigenvalue, and then factoring out $c_1^2$ (assuming $c_1 \neq 0$), we have

$$\lambda^{(k+1)} = \lambda_1 \frac{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle \sum_{j=1}^n c_j^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{\displaystyle 1 + \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k+1}}{\displaystyle 1+ \sum_{j=2}^n (c_j/c_1)^2 (\lambda_j/\lambda_1)^{2k}} = \lambda_1 \frac{1 + O(|\lambda_2/\lambda_1|^{2k+1})}{1 +O(|\lambda_2/\lambda_1|^{2k})}.$$

From this it follows that:

Theorem

Let $\lambda^{(k)}$ be the approximation of the dominant eigenvalue $\lambda_1$ of a symmetric matrix computed by the symmetric Power method. Then (typically)

$$\lambda^{(k)} = \lambda_1 + O(|\lambda_2/\lambda_1|^{2k}).$$
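
To see the difference between the $O(|\lambda_2/\lambda_1|^{k})$ and $O(|\lambda_2/\lambda_1|^{2k})$ rates in practice, here is a small experiment; the test matrix, with known eigenvalues $4, 2, 1$, is an arbitrary choice:

```python
import numpy as np

# Symmetric matrix with eigenvalues 4, 2, 1, so |lambda_2/lambda_1| = 1/2:
# expect errors ~ (1/2)^k for the basic estimate vs. (1/2)^(2k) for the symmetric one.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = Q @ np.diag([4.0, 2.0, 1.0]) @ Q.T

x = rng.standard_normal(3)
x /= np.linalg.norm(x)
for k in range(1, 11):
    y = A @ x
    p = np.argmax(np.abs(x))
    lam_ratio = y[p] / x[p]          # basic Power method estimate (scale cancels in the ratio)
    lam_sym = x @ y                  # symmetric Power method estimate
    print(k, abs(lam_ratio - 4.0), abs(lam_sym - 4.0))
    x = y / np.linalg.norm(y)
```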