Example Problems

Calculation

Let us derive a method for solving for the eigenvalues, eigenvectors, and eigenspaces of a given matrix.

We know that for a given eigenvector $\vec{v}$, $\textbf{A}\vec{v} = \lambda\vec{v}$.

However, we have 2 unknowns, $\vec{v}$ and $\lambda$, so we try to eliminate one of them.

We can rewrite the equation as

$$\textbf{A}\vec{v} = \lambda\textbf{I}\vec{v}$$

where $\textbf{I}$ is the identity matrix. We then subtract $\textbf{A}\vec{v}$ from both sides:

$$\lambda\textbf{I}\vec{v} - \textbf{A}\vec{v} = \vec{0}$$

Factoring out the $\vec{v}$, we get

$$(\lambda\textbf{I} - \textbf{A})\vec{v} = \vec{0}$$

We observe that $\vec{v}$ is in the null space of the matrix $\lambda\textbf{I} - \textbf{A}$. Since any eigenvector $\vec{v}$ is non-zero by definition, we have to make sure that the null space of $\lambda\textbf{I} - \textbf{A}$ is non-trivial as well.

Recall the Invertible Matrix Theorem, which states that if the null space of a matrix is non-trivial, its determinant must be 0. Therefore, to solve for the eigenvalues, we use the equation $\text{det}(\lambda\textbf{I} - \textbf{A}) = 0$.

$\text{det}(\lambda\textbf{I} - \textbf{A})$ is called the characteristic polynomial of $\textbf{A}$ because it is a polynomial in $\lambda$. To find the eigenvalues of a matrix, we find the roots of the characteristic polynomial, i.e. we solve $\text{det}(\lambda\textbf{I} - \textbf{A}) = 0$.

Sometimes, an eigenvalue $\lambda_i$ can appear multiple times as a root. We define the algebraic multiplicity of the eigenvalue $\lambda_i$ to be the number of times $\lambda_i$ appears as a root of the characteristic polynomial.

Notes:

  1. Observe what happens if an eigenvalue of $\textbf{A}$ is 0. Then, $\text{det}(0\textbf{I} - \textbf{A}) = \text{det}(-\textbf{A}) = 0$, and since $\text{det}(-\textbf{A}) = (-1)^n\text{det}(\textbf{A})$, this implies $\text{det}(\textbf{A}) = 0$. Therefore, a matrix $\textbf{A}$ has an eigenvalue of 0 if and only if $\textbf{A}$ is non-invertible.

  2. If $\textbf{A}$ is a triangular matrix, we can easily find the eigenvalues of $\textbf{A}$. Assume that $d_1, d_2, \ldots, d_n$ are the entries along the diagonal of such a matrix. From cofactor expansion, we know that $\text{det}(\textbf{A}) = d_1d_2\ldots d_n$. Since $\lambda\textbf{I} - \textbf{A}$ is also triangular, with diagonal entries $\lambda - d_1, \lambda - d_2, \ldots, \lambda - d_n$, we get $\text{det}(\lambda\textbf{I} - \textbf{A}) = (\lambda - d_1)(\lambda - d_2)\cdots(\lambda - d_n) = 0$, so $d_1, d_2, \ldots, d_n$ are the roots of the characteristic polynomial, i.e. the eigenvalues. The eigenvalues of a triangular matrix are its diagonal entries, as the sketch below illustrates.
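
A quick numerical check of this fact (a minimal NumPy sketch; the matrix `T` and its entries are arbitrary):

```python
import numpy as np

# An upper-triangular matrix with arbitrary diagonal entries.
T = np.array([[4.0, 7.0, 1.0],
              [0.0, 5.0, 3.0],
              [0.0, 0.0, 6.0]])

# Its eigenvalues are exactly its diagonal entries.
assert np.allclose(np.sort(np.linalg.eigvals(T)), [4.0, 5.0, 6.0])
```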

Once we have all of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $\textbf{A}$, we can find the eigenvectors associated with each eigenvalue by using the equation $(\lambda_i\textbf{I} - \textbf{A})\vec{v} = \vec{0}$. In fact, we are simply finding the basis vectors for the null space of $\lambda_i\textbf{I} - \textbf{A}$ for $i = 1, 2, \ldots, n$.

If the dimension of the null space of $\lambda_i\textbf{I} - \textbf{A}$ for an eigenvalue $\lambda_i$ is larger than 1, we expect multiple linearly independent eigenvectors associated with the same eigenvalue $\lambda_i$.

We therefore define the geometric multiplicity of an eigenvalue $\lambda_i$ to be the dimension of the null space of $\lambda_i\textbf{I} - \textbf{A}$.
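
The eigenvalue step of this procedure can be sketched numerically (assuming NumPy; the matrix here is a hypothetical $2 \times 2$ example, not one from the text):

```python
import numpy as np

# Hypothetical 2x2 matrix, used only to illustrate the procedure.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.poly(A) returns the coefficients of the characteristic
# polynomial det(lambda*I - A), highest-degree term first.
coeffs = np.poly(A)       # [1., -4., 3.]  ->  lambda^2 - 4*lambda + 3

# The eigenvalues are the roots of the characteristic polynomial.
print(np.roots(coeffs))   # [3. 1.]
```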

Example

Let us try to find the eigenvalues, eigenvectors, and eigenspaces of the matrix $\textbf{A} = \begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}$.

We first find the roots of the characteristic polynomial.

$$\text{det}(\lambda\textbf{I} - \textbf{A}) = \text{det}\left(\begin{bmatrix}\lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda\end{bmatrix} - \begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}\right) = \text{det}\left(\begin{bmatrix}\lambda - 1 & 1 & -2 \\ 0 & \lambda & -2 \\ 0 & 1 & \lambda - 3\end{bmatrix}\right) = 0$$

Using cofactor expansion along the first column, we get

$$(\lambda - 1)(\lambda(\lambda - 3) + 2) = 0$$

$$(\lambda - 1)(\lambda^2 - 3\lambda + 2) = 0$$

$$(\lambda - 1)(\lambda - 1)(\lambda - 2) = (\lambda - 1)^2(\lambda - 2) = 0$$

We see that $\textbf{A}$ has the eigenvalues $\lambda_1 = 1$ with algebraic multiplicity of 2 and $\lambda_2 = 2$ with algebraic multiplicity of 1.

We then find the basis vectors for the null space of $\lambda_i\textbf{I} - \textbf{A}$ for each eigenvalue.

$\lambda_1 = 1$:

$$\text{Null}(\textbf{I} - \textbf{A}) = \text{Null}\left(\begin{bmatrix}0 & 1 & -2 \\ 0 & 1 & -2 \\ 0 & 1 & -2\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}\right\}$$

Therefore, the eigenvectors associated with $\lambda_1 = 1$ are $\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}$ and $\begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}$. Since there are 2 linearly independent eigenvectors associated with $\lambda_1 = 1$, we say that the eigenvalue $\lambda_1 = 1$ has a geometric multiplicity of 2. The eigenspace corresponding to $\lambda_1 = 1$ is $E_1 = \text{span}\left\{\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}\right\}$.

$\lambda_2 = 2$:

$$\text{Null}(2\textbf{I} - \textbf{A}) = \text{Null}\left(\begin{bmatrix}1 & 1 & -2 \\ 0 & 2 & -2 \\ 0 & 1 & -1\end{bmatrix}\right) = \text{Null}\left(\begin{bmatrix}1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}\right\}$$

$\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}$ is the eigenvector associated with $\lambda_2 = 2$, which has a geometric multiplicity of 1. The eigenspace is $E_2 = \text{span}\left\{\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}\right\}$.
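
As a sanity check, NumPy agrees with this example (a sketch assuming NumPy; `np.linalg.eig` may return a different basis for the 2-dimensional eigenspace of $\lambda_1 = 1$, but the eigenvalues and the spans match):

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])

vals, vecs = np.linalg.eig(A)
print(np.round(vals, 6))   # 1, 1, and 2, possibly in another order

# Every returned column is an eigenvector: A v = lambda v.
for lam, v in zip(vals, vecs.T):
    assert np.allclose(A @ v, lam * v)
```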

Diagonalization

What and why

Imagine that a matrix $\textbf{A}$ represents a state transition matrix and we want to find the steady state. We would have to calculate $\lim_{k \to \infty} \textbf{A}^k$, but matrix multiplication is a very tedious process. If we could express $\textbf{A}$ as a product of 3 matrices, $\textbf{P}\textbf{D}\textbf{P}^{-1}$, where $\textbf{P}$ is an invertible matrix and $\textbf{D}$ is a diagonal matrix, then we could easily find the limit. Assume that $\textbf{A} = \textbf{P}\textbf{D}\textbf{P}^{-1}$; that is, $\textbf{A}$ is diagonalizable.

$$\textbf{A}^k = (\textbf{P}\textbf{D}\textbf{P}^{-1})^k = \textbf{P}\textbf{D}\textbf{P}^{-1}\textbf{P}\textbf{D}\textbf{P}^{-1} \cdots \textbf{P}\textbf{D}\textbf{P}^{-1} = \textbf{P}\textbf{D}(\textbf{P}^{-1}\textbf{P})\textbf{D}(\textbf{P}^{-1}\textbf{P}) \cdots (\textbf{P}^{-1}\textbf{P})\textbf{D}\textbf{P}^{-1} = \textbf{P}\textbf{D}^k\textbf{P}^{-1}$$

Calculating $\textbf{D}^k$ is very easy; in fact, for any diagonal matrix $\textbf{D} = \begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$, $\textbf{D}^k = \begin{bmatrix}d_1^k & 0 & \cdots & 0 \\ 0 & d_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n^k\end{bmatrix}$, so we just need to find the $k$-th power of each diagonal entry in $\textbf{D}$.
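
For instance (a minimal NumPy sketch with arbitrary diagonal entries):

```python
import numpy as np

d = np.array([1.0, 2.0, 0.5])   # arbitrary diagonal entries
D = np.diag(d)
k = 10

# Raising D to the k-th power is the same as raising each
# diagonal entry to the k-th power.
assert np.allclose(np.linalg.matrix_power(D, k), np.diag(d ** k))
```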

Calculation

Let $\textbf{A} \in \mathbb{R}^{n \times n}$, $\textbf{D} = \begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$, and $\textbf{P} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}$.

$$\textbf{A} = \textbf{P}\textbf{D}\textbf{P}^{-1}$$

Multiplying both sides on the right by $\textbf{P}$,

$$\textbf{A}\textbf{P} = \textbf{P}\textbf{D}$$

$$\textbf{A}\begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n\end{bmatrix}$$

We then get $n$ equations:

$$\textbf{A}\vec{p}_1 = d_1\vec{p}_1$$

$$\textbf{A}\vec{p}_2 = d_2\vec{p}_2$$

$$\vdots$$

$$\textbf{A}\vec{p}_n = d_n\vec{p}_n$$

We recognize these equations as the definitions of eigenvectors and eigenvalues. Therefore, the diagonal matrix $\textbf{D}$ is just a matrix with the eigenvalues of $\textbf{A}$ along its diagonal, and the invertible matrix $\textbf{P}$ is just a matrix with the eigenvectors of $\textbf{A}$ as its column vectors.

If $(\lambda_1, \vec{v}_1), (\lambda_2, \vec{v}_2), \ldots, (\lambda_n, \vec{v}_n)$ are the eigenpairs of the matrix $\textbf{A}$, then $\textbf{D} = \begin{bmatrix}\lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}\vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n\end{bmatrix}$.

Note: Order matters! The eigenvector $\vec{v}_i$ in the $i$-th column of $\textbf{P}$ must correspond to the eigenvalue $\lambda_i$ in the $i$-th diagonal entry of $\textbf{D}$.

We would then need to compute $\textbf{P}^{-1}$, the inverse of $\textbf{P}$.

Let's now see what happens if we want to calculate $\textbf{A}^k$. For this, we will use the row vectors of $\textbf{P}^{-1}$, which we define as $\textbf{P}^{-1} = \begin{bmatrix}\vec{q}_1^T \\ \vec{q}_2^T \\ \vdots \\ \vec{q}_n^T\end{bmatrix}$. We know that

$$\textbf{A}^k = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k\end{bmatrix}\begin{bmatrix}\vec{q}_1^T \\ \vec{q}_2^T \\ \vdots \\ \vec{q}_n^T\end{bmatrix} = \begin{bmatrix}\vec{p}_1 & \vec{p}_2 & \cdots & \vec{p}_n\end{bmatrix}\begin{bmatrix}\lambda_1^k\vec{q}_1^T \\ \lambda_2^k\vec{q}_2^T \\ \vdots \\ \lambda_n^k\vec{q}_n^T\end{bmatrix}$$

$$\textbf{A}^k = \lambda_1^k\vec{p}_1\vec{q}_1^T + \lambda_2^k\vec{p}_2\vec{q}_2^T + \cdots + \lambda_n^k\vec{p}_n\vec{q}_n^T$$

We can thus express $\textbf{A}^k$ as a sum of matrices $\vec{p}_i\vec{q}_i^T$, each scaled by the respective eigenvalue raised to the $k$-th power, $\lambda_i^k$. If $|\lambda_i| < 1$, then $\lim_{k \to \infty} \lambda_i^k = 0$, so we can effectively eliminate the matrix associated with that $\lambda_i$ from the above equation, because its scaling factor $\lambda_i^k$ approaches $0$.
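
Here is a short sketch of this rank-one expansion, reusing the running example (assuming NumPy; `np.linalg.eig` supplies the $\vec{p}_i$ and its inverse supplies the $\vec{q}_i^T$):

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])
k = 8

vals, P = np.linalg.eig(A)   # columns of P are the p_i
Q = np.linalg.inv(P)         # rows of Q are the q_i^T

# A^k as a sum of rank-one matrices p_i q_i^T scaled by lambda_i^k.
Ak = sum(vals[i] ** k * np.outer(P[:, i], Q[i]) for i in range(len(vals)))

assert np.allclose(Ak, np.linalg.matrix_power(A, k))
```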

Example

Taking our previous example, we have the following eigenvalues and eigenvectors.

$\lambda_1 = 1$: $\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 2 \\ 1\end{bmatrix}$

$\lambda_2 = 2$: $\begin{bmatrix}1 \\ 1 \\ 1\end{bmatrix}$

We can therefore diagonalize $\begin{bmatrix}1 & -1 & 2 \\ 0 & 0 & 2 \\ 0 & -1 & 3\end{bmatrix}$ as $\textbf{P}\textbf{D}\textbf{P}^{-1}$, where $\textbf{D} = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}1 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 1 & 1\end{bmatrix}$. Finally, we have to calculate $\textbf{P}^{-1}$.

Also note that there are multiple ways to diagonalize the above matrix. We could have just as well diagonalized it with $\textbf{D} = \begin{bmatrix}1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1\end{bmatrix}$ and $\textbf{P} = \begin{bmatrix}1 & 1 & 1 \\ 2 & 1 & 0 \\ 1 & 1 & 0\end{bmatrix}$. Just make sure that the eigenvector in the first column of $\textbf{P}$ is associated with the first diagonal entry of $\textbf{D}$, that the eigenvector in the second column of $\textbf{P}$ is associated with the second diagonal entry of $\textbf{D}$, etc.
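
We can verify this factorization numerically (a NumPy sketch using the first choice of $\textbf{P}$ and $\textbf{D}$ above):

```python
import numpy as np

A = np.array([[1.0, -1.0, 2.0],
              [0.0,  0.0, 2.0],
              [0.0, -1.0, 3.0]])
D = np.diag([1.0, 1.0, 2.0])
P = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])

# P D P^{-1} reproduces A.
assert np.allclose(P @ D @ np.linalg.inv(P), A)
```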

Diagonalizability

We note that a matrix $\textbf{A} \in \mathbb{R}^{n \times n}$ is diagonalizable only if it is square and has $n$ linearly independent eigenvectors. This leads to two conclusions about the diagonalizability of a matrix $\textbf{A}$:

  1. If $\textbf{A}$ has $n$ distinct eigenvalues, then it must have $n$ linearly independent eigenvectors, so it is automatically diagonalizable.

  2. If the sum of the geometric multiplicities of all eigenvalues of $\textbf{A}$ equals $n$, i.e. $\textbf{A}$ has $n$ linearly independent eigenvectors, then $\textbf{A}$ is diagonalizable.

Note: Even if the sum of the algebraic multiplicities equals $n$, this does not mean that $\textbf{A}$ has $n$ linearly independent eigenvectors. Take $\textbf{A} = \begin{bmatrix}1 & 1 \\ 0 & 1\end{bmatrix}$ for example. The characteristic equation is $(\lambda - 1)^2 = 0$, so $\lambda_1 = 1$ has an algebraic multiplicity of 2. However, $\text{Null}\left(\begin{bmatrix}0 & 1 \\ 0 & 0\end{bmatrix}\right) = \text{span}\left\{\begin{bmatrix}1 \\ 0\end{bmatrix}\right\}$, so the geometric multiplicity of $\lambda_1 = 1$ is only 1. In general,

$$\text{number of columns in } \textbf{A} \geq \text{sum of algebraic multiplicities} \geq \text{sum of geometric multiplicities}$$
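
A quick numerical illustration of this failure mode (a sketch; `np.linalg.eig` still returns $n$ columns for a defective matrix, but they are not linearly independent):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

vals, P = np.linalg.eig(A)
print(vals)                       # [1. 1.]: algebraic multiplicity 2

# The eigenvector matrix is numerically singular (rank 1, not 2),
# so the geometric multiplicity is 1 and A is not diagonalizable.
print(np.linalg.matrix_rank(P))   # 1
```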
