Eigenvalues and eigenvectors¶

Definition¶

Given a square matrix $A$, the characteristic polynomial of $A$ is given by

$$p(\lambda) = \det(A - \lambda I).$$
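As a small numerical illustration of this definition, the sketch below compares $\det(A - \lambda I)$ evaluated directly with the polynomial returned by MATLAB's `poly`. The matrix and the test point `lam` are arbitrary choices made here for illustration. Note that `poly(A)` returns the coefficients of $\det(\lambda I - A)$, which differs from $p(\lambda)$ by the factor $(-1)^n$.

```matlab
A = [1,2;2,-1];                    % illustrative 2x2 matrix (an assumption of this sketch)
n = size(A,1);
c = poly(A);                       % coefficients of det(lambda*I - A), highest power first
lam = 2;                           % arbitrary test point
pDirect = det(A - lam*eye(n));     % p(lambda) straight from the definition
pPoly   = (-1)^n*polyval(c,lam);   % det(A - lambda*I) = (-1)^n * det(lambda*I - A)
diffp = abs(pDirect - pPoly)       % should be at roundoff level
```

For this matrix $p(\lambda) = \lambda^2 - 5$, so both evaluations at $\lambda = 2$ give $-1$ up to roundoff.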

Definition¶

The zeros of the characteristic polynomial are called the eigenvalues of $A$. Given an eigenvalue $\lambda$, a non-zero vector $v$ such that

$$ Av = \lambda v$$

is called an eigenvector for the eigenvalue $\lambda$.

Note: If $v$ is an eigenvector, then $\alpha v$ is also an eigenvector for any non-zero $\alpha \in \mathbb R$. Given any norm $\|\cdot\|$, you can therefore always choose $v$ so that $\|v\| = 1$.
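The note above can be checked numerically. The following is a minimal sketch: the matrix, the scalar `alpha`, and the choice of the 1-norm are all arbitrary assumptions made for illustration.

```matlab
A = [1,2;2,-1];                    % illustrative matrix
[V,D] = eig(A);                    % columns of V: eigenvectors; diag(D): eigenvalues
v = V(:,1); lambda = D(1,1);

alpha = -3.5;                      % any non-zero scalar (arbitrary choice)
w = alpha*v;                       % rescaled eigenvector
residual = norm(A*w - lambda*w)    % should be at roundoff level

w1 = w/norm(w,1);                  % renormalize in the 1-norm
norm(w1,1)                         % equals 1 up to roundoff
```

The rescaled vector `w1` is still an eigenvector for the same eigenvalue; only its length has changed.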

Example¶

Find the eigenvalues and eigenvectors of

$$ A = \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix}.$$

A = [1,2;2,-1];
eigs(A)                  % eigenvalues only
[vs,lambdas] = eigs(A)   % columns of vs: eigenvectors; diagonal of lambdas: eigenvalues

ans =

   -2.2361
    2.2361


vs =

    0.5257   -0.8507
   -0.8507   -0.5257


lambdas =

   -2.2361         0
         0    2.2361
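As a check on the numerical output above, the same eigenvalues can be found by hand from the characteristic polynomial:

$$ p(\lambda) = \det\begin{bmatrix} 1-\lambda & 2 \\ 2 & -1-\lambda \end{bmatrix} = (1-\lambda)(-1-\lambda) - 4 = \lambda^2 - 5, $$

so $\lambda = \pm\sqrt 5 \approx \pm 2.2361$. For $\lambda = -\sqrt 5$, the first row of $(A - \lambda I)v = 0$ reads $(1+\sqrt 5)v_1 + 2 v_2 = 0$, so $v$ is a multiple of $[2, -(1+\sqrt 5)]^T$. Normalizing in the 2-norm gives approximately $[0.5257, -0.8507]^T$, which matches the first column of `vs`.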

Example¶

Find the eigenvalues of

$$ A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & -1 & 2 \\ -1 & 1 & -1 \end{bmatrix}.$$
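One way to work this example by hand, sketched here as a check rather than as the intended solution: expand the determinant along the first row, whose only non-zero entry is $2-\lambda$:

$$ p(\lambda) = \det\begin{bmatrix} 2-\lambda & 0 & 0 \\ 1 & -1-\lambda & 2 \\ -1 & 1 & -1-\lambda \end{bmatrix} = (2-\lambda)\left[(1+\lambda)^2 - 2\right]. $$

Setting $p(\lambda) = 0$ gives $\lambda = 2$ and $\lambda = -1 \pm \sqrt 2$, that is, approximately $2$, $0.4142$, and $-2.4142$.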

Definition¶

The spectral radius of a square matrix $A$ is given by

$$ \rho(A) = \max_i \left|\lambda_i\right|$$

where the maximum is taken over all eigenvalues $\lambda_i$ of $A$.

Recall that if $\lambda = \alpha + i \beta$ is a complex number then $|\lambda| = \sqrt{\alpha^2 + \beta^2}$.
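A minimal sketch of this definition in MATLAB, assuming the dense eigenvalue routine `eig` is acceptable for small matrices. The second matrix `B` is an illustrative choice, not taken from the notes, whose eigenvalues are purely imaginary.

```matlab
A = [2,0,0;1,-1,2;-1,1,-1];   % 3x3 matrix from the example above
rho = max(abs(eig(A)))        % largest eigenvalue modulus, 1 + sqrt(2)

B = [0,-1;1,0];               % rotation by 90 degrees; eigenvalues are +i and -i
rhoB = max(abs(eig(B)))       % the modulus of +i and -i is 1
```

Because `abs` returns the complex modulus, the same one-liner handles real and complex eigenvalues.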

Spectral Theorem¶

If $A$ is an $n \times n$ real symmetric matrix, $A = A^T$, then all the eigenvalues of $A$ are real and there exists a matrix $U$ with $U^T = U^{-1}$ such that

$$A = U \begin{bmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{bmatrix} U^T.$$
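The factorization can be verified numerically on a small symmetric matrix. This is a sketch under the assumption that `eig`, when given a symmetric input, returns orthonormal eigenvectors as the columns of its first output.

```matlab
A = [1,2;2,-1];                     % symmetric matrix from the first example
[U,Lambda] = eig(A);                % columns of U: eigenvectors; Lambda: diagonal
orthErr  = norm(U'*U - eye(2))      % checks U^T = U^{-1}
reconErr = norm(A - U*Lambda*U')    % checks A = U*Lambda*U^T
allReal  = isreal(diag(Lambda))     % eigenvalues are real
```

Both error measures should be at roundoff level.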

Recall that in the previous lecture we came up with explicit formulae for $\|A\|_1$ and $\|A\|_\infty$ but we mentioned nothing about $\|A\|_2$. One needs to use eigenvalues to characterize it:

Theorem¶

If $A$ is an $n\times n$ matrix, then

1. $\displaystyle \|A\|_2 = [\rho(A^TA)]^{1/2}$,
2. $\rho(A) \leq \|A\|$ for any induced norm $\|\cdot\|$.
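Both statements can be sanity-checked numerically before reading the proof. The sketch below uses an arbitrary non-symmetric matrix chosen for illustration and compares the built-in `norm(A,2)` with the eigenvalue formula.

```matlab
A = [4,1,1;0,2,1;-2,0,9];              % illustrative non-symmetric matrix
rho    = max(abs(eig(A)));             % spectral radius of A
two    = norm(A,2);                    % built-in 2-norm
viaAtA = sqrt(max(abs(eig(A'*A))));    % [rho(A^T A)]^(1/2)
gap = abs(two - viaAtA)                % should be at roundoff level
bounds = [rho, norm(A,1), two, norm(A,inf)]   % rho should be the smallest entry
```

A numerical check on one matrix is of course not a proof; it only confirms that the formulas are being read correctly.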

Proof¶

$\displaystyle \|A\|_2 = [\rho(A^TA)]^{1/2}$

Because $A^TA$ is a symmetric matrix, the Spectral Theorem gives $A^T A = U \Lambda U^T$, where $U^T = U^{-1}$ and $\Lambda$ is a diagonal matrix that contains the eigenvalues of $A^TA$.

For such a matrix $U$, note that

$$\|x\|_2^2 = x^T x = x^T U^T U x = \|Ux\|_2^2,$$
$$\|x\|_2^2 = x^T x = x^T U U^T x = \|U^Tx\|_2^2.$$

So $\|x\|_2 = 1$ if and only if $\|Ux\|_2= 1$, and likewise if and only if $\|U^Tx\|_2 = 1$. Let $v$ be an eigenvector of $A^TA$ corresponding to an eigenvalue $\lambda$:

$A^TA v = \lambda v$.

We note that for any vector $y$, $\|y\|_2^2 = y^T y$. We multiply the above equation on the left by $v^T$:

$$ v^TA^T A v = \lambda v^T v \Leftrightarrow \|Av\|_2^2 = \lambda \|v\|_2^2.$$

Since $v \neq 0$, this implies that $\lambda \geq 0$. So $\rho(A^TA) = \max_i \lambda_i$. Going back to the definition of the norm,

\begin{align}
\|A\|^2_2 &= \max_{\|x\|_2 = 1} \|Ax\|^2_2 = \max_{\|x\|_2 = 1} x^T A^TAx \\
& = \max_{\|x\|_2 = 1} x^T U \Lambda U^T x = \max_{\|U^T x\|_2 = 1} x^T U \Lambda U^T x \\
& = \max_{\|y\|_2 = 1} y^T \Lambda y = \max_{\|y\|_2 = 1} y^T \Lambda^{1/2} \Lambda^{1/2} y \\
& = \max_{\|y\|_2 = 1} \| \Lambda^{1/2} y \|_2^2 = \|\Lambda^{1/2}\|_2^2 = \|\Lambda\|_2.
\end{align}

Here $y = U^T x$, and the last equality holds because $\Lambda$ is diagonal with non-negative entries.

Assume $\Lambda = \mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)$ where $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$. If we choose $x = e_n = [0,0,\ldots,0,1]^T$ then $\|x\|_2 =1$ and

$$ \|\Lambda x\|_2 = \lambda_n = \max_i \lambda_i = \rho(\Lambda).$$

Since $\|A\|_2^2 = \|\Lambda\|_2 \geq \|\Lambda e_n\|_2$, this shows that $\|A\|_2 \geq [\rho(\Lambda)]^{1/2}$. Now, for any vector $x \in \mathbb R^n$ with $\|x\|_2 = 1$,

$$ \|\Lambda x\|_2^2 = \sum_{i=1}^n \lambda^2_i x^2_i \leq \|x\|_2^2 \max_i \lambda_i^2 = \max_i \lambda_i^2. $$

This last inequality shows that $\|\Lambda\|_2 \leq \rho(\Lambda)$ and hence $\|A\|_2 \leq [\rho(\Lambda)]^{1/2}$. And so, we find that

$$ \|A\|_2 = [\rho(\Lambda)]^{1/2} = [\rho(A^T A)]^{1/2},$$

because $\Lambda$ and $A^TA$ have the same eigenvalues.

$\rho(A) \leq \|A\|$ for any induced norm $\|\cdot\|$.

Let $\|\cdot\|$ be any induced matrix norm:

$$\|A\| = \max_{\|x\|=1} \|Ax\|.$$

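Let $\lambda$ be an eigenvalue of $A$ with $|\lambda| = \rho(A)$, and let $v$ be a corresponding eigenvector normalized so that $\|v\| = 1$. Then

$\|Av\| = \|\lambda v\| = |\lambda| \|v\| = \rho(A)$.

Since $\|A\| \geq \|Av\|$ for any specific vector $v$ with $\|v\|=1$, it follows that $\|A\| \geq \rho(A)$.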

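% Plot the Gershgorin circles of A together with its eigenvalues
hold on
%A = [10 2 3 1; 1 -3 1 -1; -1 -3 -1 4; 2 2 2 5];
%A = rand(10)-1/2; A = A + 1j*(rand(10)-1/2);
A = [4,1,1;0,2,1;-2,0,9];
theta = linspace(0,2*pi,100);
for i = 1:size(A,1)
    a = A(i,i);                      % center: diagonal entry
    r = norm(A(i,:),1)-abs(a);       % radius: sum of off-diagonal magnitudes in row i
    x = r*cos(theta); y = r*sin(theta);
    plot(x+real(a),y + imag(a),'k')
end
lambda = eig(A);                     % all eigenvalues (eigs returns at most 6 by default)
plot(real(lambda),imag(lambda),'*')
axis equal                           % so the circles are not distorted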
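The plotting code above draws, for each row $i$, a circle centered at the diagonal entry $a_{ii}$ with radius equal to the sum of the absolute values of the off-diagonal entries in that row. These are the discs of the Gershgorin Circle Theorem, which is not restated in this section. In its standard form it says that every eigenvalue of $A$ lies in the union of the discs

$$ D_i = \Big\{ z \in \mathbb C : |z - a_{ii}| \leq r_i \Big\}, \qquad r_i = \sum_{j \neq i} |a_{ij}|, \qquad i = 1,\ldots,n. $$

In the plot, each star should therefore fall inside at least one circle.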

Example¶

Estimate the spectral radius of $A$ using the Gershgorin Circle Theorem

$$ A = \begin{bmatrix} 4 & 1 & 2 \\ 0 & 2 &1 \\ 2 & 1 & 9 \end{bmatrix}.$$
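A sketch of one way to carry out this estimate, assuming the standard form of the theorem: each eigenvalue satisfies $|\lambda - a_{ii}| \leq r_i$ for some row $i$, where $r_i$ is the sum of the absolute values of the off-diagonal entries in row $i$. The triangle inequality then gives $|\lambda| \leq |a_{ii}| + r_i$, so $\rho(A) \leq \max_i (|a_{ii}| + r_i)$. For this matrix the three rows give $4 + 3 = 7$, $2 + 1 = 3$, and $9 + 3 = 12$, so $\rho(A) \leq 12$. The variable names below are chosen for illustration.

```matlab
A = [4,1,2;0,2,1;2,1,9];      % matrix from the example
c = abs(diag(A));             % |a_ii|, moduli of the disc centers
r = sum(abs(A),2) - c;        % disc radii: off-diagonal absolute row sums
bound = max(c + r)            % Gershgorin upper bound on rho(A); here 12
rho = max(abs(eig(A)))        % actual spectral radius, for comparison
```

The bound $\max_i (|a_{ii}| + r_i)$ is the largest absolute row sum, so it coincides with $\|A\|_\infty$ and is consistent with $\rho(A) \leq \|A\|$ for induced norms.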