Estimated reading time: 10 min

8.2 Diagonalization and similarity

Treat diagonalization as a basis change built from eigenvectors, then use similarity to explain when a matrix can be simplified without changing its essential eigenvalue data.

MATH1030: Linear algebra I

Rigorous linear algebra notes on systems, matrices, structure, and proof, with interaction used only where it clarifies the mathematics.

One eigenvector is useful. A whole basis of eigenvectors is transformative.

If a square matrix has enough linearly independent eigenvectors, then there is a coordinate system in which the matrix becomes diagonal. In that coordinate system, powers, inverses, and many structural questions become almost trivial.

Why diagonalization matters

Suppose a matrix acts on $\mathbb{R}^n$. In the standard basis, the action may look complicated because different coordinates mix together. But if you build a new basis from eigenvectors, then the action along each basis vector becomes pure scalar multiplication.

That is exactly what a diagonal matrix does.
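To make this concrete, here is a minimal NumPy sketch (the diagonal entries are arbitrary, chosen only for illustration): a diagonal matrix scales each coordinate independently, with no mixing.

```python
import numpy as np

# Illustrative only: diag(2, 5, -1) multiplies the k-th coordinate of a
# vector by the k-th diagonal entry, with no mixing between coordinates.
D = np.diag([2.0, 5.0, -1.0])
x = np.array([1.0, 1.0, 1.0])

y = D @ x
print(y)  # [ 2.  5. -1.]
```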

Definition

Similarity

Two $n\times n$ matrices $A$ and $B$ are similar if there exists a nonsingular matrix $S$ such that

$$S^{-1}AS = B.$$

Similarity means that $A$ and $B$ represent the same linear transformation in two different bases.
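A quick numerical sanity check (the matrices below are hypothetical examples, not taken from the text): conjugating $A$ by an invertible $S$ produces a matrix that looks different entrywise but carries the same spectral data.

```python
import numpy as np

# Hypothetical 2x2 example: B = S^{-1} A S is similar to A.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
S = np.array([[1.0, 1.0],
              [1.0, 2.0]])   # det(S) = 1, so S is invertible

B = np.linalg.inv(S) @ A @ S
# B's entries differ from A's, but its eigenvalues are still 2 and 3.
assert np.allclose(np.sort(np.linalg.eigvals(B)), [2.0, 3.0])
```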

Definition

Diagonalization and diagonalizability

Let $A$ be an $n\times n$ matrix.

If there exist an invertible matrix $S$ and scalars $\lambda_1,\dots,\lambda_n$ such that

$$S^{-1}AS=\operatorname{diag}(\lambda_1,\dots,\lambda_n),$$

then we say that $A$ is diagonalizable and that the displayed equality is a diagonalization of $A$.

So diagonalization is a special case of similarity in which the target matrix is diagonal.

Eigenvectors are exactly what fill the diagonalization matrix

Theorem

Characterization of diagonalization

Let $A$ be an $n\times n$ matrix, and let

$$S=[\,v_1\ v_2\ \cdots\ v_n\,]$$

be an invertible matrix built from column vectors $v_1,\dots,v_n$.

Then the following are equivalent:

  1. each $v_j$ is an eigenvector of $A$ with eigenvalue $\lambda_j$;
  2. $S^{-1}AS=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$.

This theorem is the heart of diagonalization. The columns of the change-of-basis matrix are not arbitrary. They must be eigenvectors.

Equivalently, if $D=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$, then

$$AS=SD.$$

That equation says:

  • the first column of $AS$ is $Av_1$, while the first column of $SD$ is $\lambda_1 v_1$;
  • the second column of $AS$ is $Av_2$, while the second column of $SD$ is $\lambda_2 v_2$;
  • and so on.

So the single matrix identity $AS=SD$ packages all the eigenvector equations at once.
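A small NumPy sketch (with a hypothetical 2×2 matrix) confirms this column-by-column reading of $AS=SD$:

```python
import numpy as np

# Hypothetical example: A has eigenvectors v1 = (1,0) with eigenvalue 2
# and v2 = (1,1) with eigenvalue 3.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # columns v1, v2
D = np.diag([2.0, 3.0])

# Column j of AS is A v_j; column j of SD is lambda_j v_j.
assert np.allclose((A @ S)[:, 0], 2.0 * S[:, 0])   # A v1 = 2 v1
assert np.allclose((A @ S)[:, 1], 3.0 * S[:, 1])   # A v2 = 3 v2
assert np.allclose(A @ S, S @ D)                   # all equations at once
```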

Theorem

When is a matrix diagonalizable?

An $n\times n$ matrix $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors.

This criterion is the one you should remember. Diagonalization is not about guessing a lucky matrix SS; it is about finding a full basis of eigenvectors.
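In code, the criterion amounts to checking whether a matrix of eigenvectors has full rank. A NumPy sketch (numerically computed eigenvectors, so this is a floating-point check rather than a proof), using an upper-triangular matrix with distinct eigenvalues as the illustration:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])

# np.linalg.eig returns eigenvalues and a matrix whose columns are eigenvectors.
eigvals, V = np.linalg.eig(A)

# Full rank (here 3) means n linearly independent eigenvectors exist,
# so A is diagonalizable.
print(np.linalg.matrix_rank(V))  # 3
```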

First diagonalization examples

Worked example

A diagonalizable upper-triangular matrix

Let

$$A=\begin{bmatrix}1&1&1\\0&2&2\\0&0&3\end{bmatrix}.$$

Suppose we have eigenvectors

$$u_1=\begin{bmatrix}1\\0\\0\end{bmatrix},\qquad u_2=\begin{bmatrix}1\\1\\0\end{bmatrix},\qquad u_3=\begin{bmatrix}3\\4\\2\end{bmatrix},$$

with eigenvalues 1, 2, and 3, respectively.

If we set

$$U=[\,u_1\ u_2\ u_3\,],$$

then the three vectors are linearly independent, so $U$ is invertible. Hence

$$U^{-1}AU=\operatorname{diag}(1,2,3).$$

The original matrix is not diagonal, but it becomes diagonal in the eigenvector basis.
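The claim is easy to verify numerically; a sketch with NumPy (any CAS would do):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])
U = np.array([[1.0, 1.0, 3.0],
              [0.0, 1.0, 4.0],
              [0.0, 0.0, 2.0]])   # columns u1, u2, u3

# U is triangular with nonzero diagonal entries, hence invertible.
result = np.linalg.inv(U) @ A @ U
assert np.allclose(result, np.diag([1.0, 2.0, 3.0]))
```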

Worked example

A matrix that is not diagonalizable

Consider

$$J=\begin{bmatrix}1&4\\0&1\end{bmatrix}.$$

Its only eigenvalue is 1, because

$$\det(J-\lambda I)=\begin{vmatrix}1-\lambda&4\\0&1-\lambda\end{vmatrix}=(1-\lambda)^2.$$

Now solve $(J-I)x=0$:

$$J-I=\begin{bmatrix}0&4\\0&0\end{bmatrix}.$$

So $x_2=0$ and $x_1$ is free. The eigenspace is therefore

$$\operatorname{span}\left\{\begin{bmatrix}1\\0\end{bmatrix}\right\},$$

which is only one-dimensional. A $2\times 2$ matrix needs two linearly independent eigenvectors to be diagonalizable, so $J$ is not diagonalizable.
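The deficiency can also be read off computationally: the geometric multiplicity of the eigenvalue 1 is the dimension of the null space of $J-I$. A sketch:

```python
import numpy as np

J = np.array([[1.0, 4.0],
              [0.0, 1.0]])

# dim null(J - I) = 2 - rank(J - I); here rank(J - I) = 1, so the
# eigenspace for the eigenvalue 1 is one-dimensional.
geom_mult = 2 - np.linalg.matrix_rank(J - np.eye(2))
print(geom_mult)  # 1 -> fewer than 2 independent eigenvectors
```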

Similar matrices preserve eigenvalue data

Similarity is not arbitrary conjugation. It preserves the essential eigenvalue structure of a matrix.

Theorem

Similar matrices have the same characteristic polynomial

If $A$ and $B$ are similar, then

$$p_A(x)=p_B(x).$$

In particular, $A$ and $B$ have the same eigenvalues.

This is what makes diagonalization meaningful. The diagonal matrix obtained from AA is not a different spectral object. It is the same linear transformation written in a basis that exposes its eigenvalues visibly on the diagonal.

Common mistake

Having the same characteristic polynomial is not enough for similarity

Similarity implies equal characteristic polynomials, but the converse is false. Two matrices can share the same characteristic polynomial and still fail to be similar.

For example,

$$\begin{bmatrix}0&1\\0&0\end{bmatrix}\qquad\text{and}\qquad\begin{bmatrix}0&0\\0&0\end{bmatrix}$$

both have characteristic polynomial $x^2$, but the first matrix is not similar to the zero matrix: $S^{-1}0S=0$ for every invertible $S$, so the only matrix similar to the zero matrix is the zero matrix itself.
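The failure of the converse can also be seen through a similarity invariant that differs here: similarity preserves rank, and these two matrices have different ranks. A numerical sketch:

```python
import numpy as np

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
Z = np.zeros((2, 2))

# Both characteristic polynomials are x^2 (coefficients, leading first).
assert np.allclose(np.poly(N), [1.0, 0.0, 0.0])
assert np.allclose(np.poly(Z), [1.0, 0.0, 0.0])

# But similarity preserves rank, and the ranks differ,
# so N cannot be similar to Z.
print(np.linalg.matrix_rank(N), np.linalg.matrix_rank(Z))  # 1 0
```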

Diagonalization makes powers and inverses easy

Suppose

$$A=SDS^{-1},\qquad D=\operatorname{diag}(\lambda_1,\dots,\lambda_n).$$

Then matrix algebra becomes much simpler.

Theorem

Powers, inverse, and transpose of a diagonalizable matrix

If $A=SDS^{-1}$ with $D$ diagonal, then for each positive integer $m$,

$$A^m=SD^mS^{-1}.$$

If $A$ is invertible, then every diagonal entry of $D$ is nonzero and

$$A^{-1}=SD^{-1}S^{-1}.$$

Also, $A^T$ is diagonalizable and has the same eigenvalues as $A$.

The key point is that diagonal matrices are easy to power:

$$\operatorname{diag}(\lambda_1,\dots,\lambda_n)^m=\operatorname{diag}(\lambda_1^m,\dots,\lambda_n^m).$$
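A one-line check of the diagonal power rule (entries chosen arbitrarily, for illustration):

```python
import numpy as np

D = np.diag([2.0, 3.0, 5.0])
# Powering a diagonal matrix just powers each diagonal entry.
assert np.allclose(np.linalg.matrix_power(D, 4),
                   np.diag([2.0**4, 3.0**4, 5.0**4]))
```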

Worked example

Compute a power through diagonalization

Let

$$A=\begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix}.$$

Suppose $A$ is diagonalized as

$$A=S\operatorname{diag}(4,1,1)S^{-1}.$$

Then

$$A^m=S\operatorname{diag}(4^m,1,1)S^{-1}.$$

So the hard part is finding the diagonalization once. After that, every positive power is controlled by replacing 4 with $4^m$ and leaving the repeated 1 entries alone.
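Since this $A$ is symmetric, NumPy's `eigh` produces an orthogonal eigenvector matrix $S$ (so $S^{-1}=S^T$), and the power rule can be checked directly. The code below is a sketch of that computation, not the method used in the text; `eigh` happens to return the eigenvalues in ascending order, here $1, 1, 4$.

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# eigh: real eigenvalues (ascending) and orthonormal eigenvectors for symmetric A.
eigvals, S = np.linalg.eigh(A)        # eigvals close to [1, 1, 4]

m = 5
A_m = S @ np.diag(eigvals**m) @ S.T   # S^{-1} = S^T since S is orthogonal
assert np.allclose(A_m, np.linalg.matrix_power(A, m))
```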

Quick checks

Quick check

What must the columns of a diagonalizing matrix S be?

Think about the equation $AS=SD$.


Quick check

Can a 3×3 matrix be diagonalizable with only two linearly independent eigenvectors?

Use the characterization theorem.


Quick check

If A is similar to D and D is diagonal, do A and D have the same eigenvalues?

Use the similarity theorem.


Exercises

Quick check

Suppose $A$ has eigenvectors $v_1, v_2, v_3$ with eigenvalues $2, 5, -1$, and these vectors are linearly independent. What diagonal matrix appears in a diagonalization of $A$?

Order the diagonal entries to match the chosen order of the eigenvectors.


Quick check

Why is $\begin{bmatrix}1&1\\0&1\end{bmatrix}$ not diagonalizable?

Check how many linearly independent eigenvectors it has.


Quick check

If $A=SDS^{-1}$ with $D=\operatorname{diag}(2,-1)$, what is $A^3$?

Use the power rule for diagonalizable matrices.


Read 8.1 Eigenvalues, eigenvectors, and eigenspaces first if the homogeneous-system viewpoint is not yet solid.

Continue with 8.3 Characteristic polynomials and diagonalization tests for the polynomial tools that decide whether enough eigenvectors exist.

The basis language here also depends on 6.5 Basis and dimension.

Section mastery checkpoint

Skills: diagonalization, eigenvector, basis

What is the correct criterion for an $n\times n$ matrix $A$ to be diagonalizable?