1.10 — Eigenvalues & Eigenvectors

Date: 2026-03-02 | Block: 1 — Linear Algebra

The idea in plain English

Apply a matrix to a random vector and it usually gets knocked into a new direction — rotated, stretched, changed completely. But most matrices have a handful of special vectors that only get scaled — they don't rotate at all. They stay on the same line through the origin, just longer or shorter. These are eigenvectors. The scaling factor is the eigenvalue.

The intuition

Imagine you're looking at a transformation that stretches space horizontally and vertically. Most arrows get knocked into new directions. But an arrow pointing purely horizontally just gets longer horizontally — same direction, bigger. An arrow pointing purely vertically just gets longer vertically. These "axis-aligned" arrows are the eigenvectors.

"Eigen" is German for "own" — these are the transformation's own special directions. Every square matrix has eigenvalues over the complex numbers, though they need not be real: a 2D rotation turns every real vector, so its eigenvalues come out complex. (Finding them takes work either way.)

Most vectors:             Eigenvectors:
              Av                       λv (scaled, same direction)
             ↗                        ↗
v ─────────→            v ─────────→
      rotated                not rotated!

The math

Definition:

A · v = λ · v

v = eigenvector (non-zero), λ = eigenvalue (a scalar).

How to find eigenvalues — rearrange to get:

(A − λI)·v = 0

For a non-zero solution to exist, (A − λI) must be singular:

det(A − λI) = 0     ← the characteristic equation

Solve this polynomial equation for λ.

How to find eigenvectors — once you have λ, solve (A − λI)·v = 0 — find the null space of (A − λI).
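The recipe above can be sketched numerically. For a 2×2 matrix the characteristic polynomial is λ² − trace(A)·λ + det(A) = 0, so its roots are the eigenvalues. This is a minimal sketch using a small upper-triangular matrix as a stand-in example:

```python
import numpy as np

# For a 2x2 matrix, det(A - lambda*I) = lambda^2 - trace(A)*lambda + det(A).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Coefficients of the characteristic polynomial, highest degree first.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
eigenvalues = np.roots(coeffs)   # roots of the characteristic equation

print(np.sort(eigenvalues))      # [2. 3.]
```

In practice you would call `np.linalg.eig(A)` directly, but building the characteristic polynomial by hand mirrors the derivation above.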

Two elegant properties:

det(A) = λ₁ · λ₂ · ... · λₙ    (product of all eigenvalues)
trace(A) = λ₁ + λ₂ + ... + λₙ  (sum of diagonal = sum of eigenvalues)

Quick sanity checks when you've found eigenvalues.
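These two identities hold for any square matrix, so they make an easy automated check. A minimal sketch on a random 4×4 matrix (eigenvalues may be complex, but conjugate pairs cancel, so the sum and product are real):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))   # arbitrary random square matrix

lams = np.linalg.eigvals(M)       # possibly complex eigenvalues

# trace = sum of eigenvalues, det = product of eigenvalues
print(np.isclose(lams.sum().real, np.trace(M)))        # True
print(np.isclose(lams.prod().real, np.linalg.det(M)))  # True
```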

A worked example

A = [ 2  1 ]
    [ 0  3 ]

Step 1 — characteristic equation:

det(A − λI) = (2−λ)(3−λ) − 1·0 = (2−λ)(3−λ) = 0
→  λ₁ = 2,   λ₂ = 3

Step 2 — eigenvectors:

For λ₁ = 2: solve (A − 2I)v = [[0,1],[0,1]]v = 0 → v₂ = 0, v₁ free → v₁ = [1, 0]
For λ₂ = 3: solve (A − 3I)v = [[-1,1],[0,0]]v = 0 → v₁ = v₂ → v₂ = [1, 1]

Check: A·[1,0] = [2,0] = 2·[1,0] ✓ and A·[1,1] = [3,3] = 3·[1,1] ✓
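The same check in NumPy (note that `np.linalg.eig` returns unit-norm eigenvectors as the columns of `V`, so [1, 1] comes back scaled to [0.707, 0.707] — compare directions, not exact entries):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

lams, V = np.linalg.eig(A)        # eigenvalues and eigenvector columns

# Verify A v = lambda v for each eigenpair.
for lam, v in zip(lams, V.T):
    assert np.allclose(A @ v, lam * v)

print(np.sort(lams))              # [2. 3.]
```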

What the eigenvalue's value tells you:
- λ > 1: stretched (gets bigger)
- λ = 1: unchanged
- 0 < λ < 1: shrunk
- λ = 0: collapses to zero → matrix is singular!
- λ < 0: flipped direction + scaled

Why this matters for ML

PCA: the principal components are the eigenvectors of the data's covariance matrix. The eigenvalues tell you how much variance lies in each direction. Large eigenvalue = important direction. Near-zero eigenvalue = noise.
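A minimal PCA sketch, assuming data rows are samples. The covariance matrix is built from synthetic data stretched far more along one axis than the other, so one eigenvalue should dominate:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data: standard deviation ~3 along x, ~0.5 along y.
X = rng.standard_normal((200, 2)) @ np.diag([3.0, 0.5])

C = np.cov(X, rowvar=False)          # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C) # eigh: for symmetric matrices
order = np.argsort(eigvals)[::-1]    # largest variance first

print(eigvals[order])  # one large eigenvalue (~9), one small (~0.25)
```

The columns of `eigvecs[:, order]` are the principal components, sorted by how much variance they capture.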

Vanishing/exploding gradients in RNNs: apply the same weight matrix W at every time step. After t steps (ignoring nonlinearities): hₜ = Wᵗ·h₀. Write h₀ in W's eigenvector basis: each component gets multiplied by λᵢᵗ. If |λ| > 1, that component explodes over time. If |λ| < 1, it vanishes. The eigenvalues of W determine whether an RNN can remember long-range dependencies. This is the root cause of the vanishing gradient problem.
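A toy sketch of this effect, using a diagonal matrix as a stand-in for W so each component scales by exactly λᵗ:

```python
import numpy as np

W = np.diag([1.1, 0.9])  # eigenvalues 1.1 (>1) and 0.9 (<1)
h = np.array([1.0, 1.0])

for _ in range(50):      # 50 "time steps"
    h = W @ h

# First component grows like 1.1**50 (~117), second shrinks
# like 0.9**50 (~0.005) — explosion and vanishing in one vector.
print(h)
```

Even eigenvalues close to 1 diverge dramatically after enough steps, which is why RNN training is so sensitive to the spectrum of W.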

PageRank: Google ranks web pages by finding the dominant eigenvector of the page-link matrix — the eigenvector with the largest eigenvalue. Each page's importance is its component in that special direction.
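The dominant eigenvector can be found without solving the characteristic equation at all: repeatedly multiply any starting vector by the matrix and normalize, and it converges to the dominant eigenvector (power iteration, the idea behind PageRank). A sketch on a tiny made-up 3-page link matrix, not real link data:

```python
import numpy as np

# Hypothetical column-stochastic link matrix: entry (i, j) is the
# probability of following a link from page j to page i.
A = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])

v = np.ones(3) / 3               # arbitrary starting vector
for _ in range(100):
    v = A @ v
    v /= np.linalg.norm(v)       # keep the vector from growing/shrinking

lam = v @ A @ v                  # Rayleigh-quotient estimate of the eigenvalue
print(lam, v)                    # lam ~= 1 for a stochastic matrix
```

The components of `v` are the page importances; for a stochastic matrix the dominant eigenvalue is 1.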

The one thing to remember

Eigenvectors are special directions that only get scaled (not rotated) by the transformation. The eigenvalue is the scaling factor. They reveal the transformation's natural coordinate system.

← Previous 1.9 — Projections and Orthogonality Next → 1.11 — Eigendecomposition & Diagonalization