
1.5 — Rank, Column Space & Null Space

Date: 2026-02-27 | Block: 1 — Linear Algebra

The idea in plain English

Every matrix transformation either preserves the full richness of space, or squashes some of it away. Rank measures how much survives. The column space is the "territory" the transformation can reach. The null space is what gets completely destroyed (sent to zero).

The intuition

Imagine shining a light through space onto a wall. The shadow on the wall is the column space — everything the transformation can output. The things that cast no shadow (they're aligned with the light beam) are the null space — they get completely flattened.

A 2D matrix might:

- Map the full plane to the full plane (rank 2 — nothing lost)
- Squash the full plane onto a single line (rank 1 — one dimension destroyed)
- Collapse everything to a point (rank 0 — total destruction)

The higher the rank, the more the transformation preserves.
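The three cases can be checked directly with NumPy's `matrix_rank` (a small sketch; the specific matrices here are illustrative):

```python
import numpy as np

# Three 2x2 transformations with rank 2, 1, and 0.
full     = np.array([[1.0, 0.0],
                     [0.0, 1.0]])   # identity: the whole plane survives
squash   = np.array([[1.0, 2.0],
                     [2.0, 4.0]])   # maps the plane onto a single line
collapse = np.zeros((2, 2))         # sends everything to the origin

ranks = [int(np.linalg.matrix_rank(M)) for M in (full, squash, collapse)]
print(ranks)  # [2, 1, 0]
```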

The math

Column space: the set of all possible outputs of A·x (for any x).

Col(A) = span of the columns of A

This is the "landing zone" — every output lives here. It's called the column space because A·x is always a linear combination of A's columns.
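Because A·x is a weighted sum of A's columns, this is easy to verify numerically (a quick sketch; the matrix and vector are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [2.0, 4.0]])
x = np.array([5.0, 6.0])

# A·x equals x1 * (column 1) + x2 * (column 2).
via_matmul  = A @ x
via_columns = x[0] * A[:, 0] + x[1] * A[:, 1]

print(via_matmul)                            # [23. 34.]
print(np.allclose(via_matmul, via_columns))  # True
```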

Null space (kernel): the set of all inputs that get sent to zero.

Null(A) = { x : A·x = 0 }

Rank: the dimension of the column space = number of linearly independent columns.

Rank-Nullity Theorem:

rank(A) + nullity(A) = n     (n = number of columns)

Surviving dimensions + destroyed dimensions = total input dimensions. Always.
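The theorem can be checked numerically on any matrix (a sketch; the 2×3 example and the 1e-10 tolerance are illustrative choices):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2x3: n = 3 input dimensions

n    = A.shape[1]
rank = int(np.linalg.matrix_rank(A))

# Nullity = input dimensions minus surviving dimensions; equivalently,
# n minus the number of nonzero singular values.
s       = np.linalg.svd(A, compute_uv=False)
nullity = n - int(np.count_nonzero(s > 1e-10))

print(rank, nullity, rank + nullity == n)  # 2 1 True
```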

A worked example

A = [ 1  2 ]
    [ 2  4 ]

Notice column 2 = 2 × column 1 — they're the same direction. So the column space is just a single line (all multiples of [1,2]). Rank = 1.

Now find the null space (solve A·x = 0):

x₁ + 2x₂ = 0   →   x₁ = -2x₂

So x = t·[-2, 1] for any scalar t — a whole line of vectors all collapse to zero. Nullity = 1.

Check: rank + nullity = 1 + 1 = 2 = n ✓
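The worked example can be reproduced with NumPy; one standard way to get a null-space basis is from the SVD, as the right singular vectors paired with (near-)zero singular values, up to sign and scale:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

rank = int(np.linalg.matrix_rank(A))   # 1

# Rows of Vh whose singular value is (near) zero span Null(A).
_, s, Vh = np.linalg.svd(A)
null_basis = Vh[s < 1e-10]             # proportional to [-2, 1]

print(rank)                                # 1
print(np.allclose(A @ null_basis.T, 0))   # True: these inputs map to zero
```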

Why this matters for ML

Rank = expressive power. A weight matrix of rank r can only output in an r-dimensional subspace, no matter how high-dimensional the input. Low rank means the layer is a bottleneck — it can only express a limited range of transformations.

LoRA (Low-Rank Adaptation) fine-tunes giant language models by adding a low-rank update A·B where A is (d×r) and B is (r×d) with r tiny. The update has rank at most r — it only shifts the model in a small subspace of weight space. This is why you can fine-tune a 7B parameter model using a tiny fraction of the usual memory.
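A minimal sketch of the shape and rank arithmetic only (the dimensions d and r and the random initialization are illustrative, not the actual LoRA training setup):

```python
import numpy as np

d, r = 512, 8                          # illustrative sizes, r much smaller than d
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))        # frozen pretrained weight
A = rng.standard_normal((d, r))        # trainable low-rank factors
B = rng.standard_normal((r, d))

delta = A @ B                          # the update: rank at most r
print(np.linalg.matrix_rank(delta))    # 8 (trainable params: 2*d*r vs d*d)

W_adapted = W + delta                  # effective weight after adaptation
```

The memory win comes from training only A and B: 2·d·r parameters instead of d², a huge reduction when r is tiny.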

Multicollinear features break linear regression because the data matrix X becomes rank-deficient — its columns are linearly dependent, so XᵀX is singular and the normal equations have no unique solution.
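A two-feature sketch of the failure (the feature values are made up; the duplicated column stands in for any exact collinearity):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0])
X  = np.column_stack([x1, 2 * x1])   # column 2 = 2 x column 1: collinear

print(np.linalg.matrix_rank(X))      # 1, not 2: rank-deficient
print(np.linalg.det(X.T @ X))        # ~0: the normal equations are singular
```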

The one thing to remember

Rank tells you how many dimensions the transformation keeps. Null space is what it destroys. They always sum to the input dimension.
