1.3 — Matrices as Transformations
Date: 2026-02-22 | Block: 1 — Linear Algebra
The idea in plain English
A matrix is not a grid of numbers — it is a function that transforms space. Feed it a vector, it gives you back a new vector. But more than that: it acts on every point in space simultaneously, moving the entire plane according to fixed rules.
The intuition
Think of the matrix as a machine. You put in an arrow and it gives you back a different arrow. But it does this consistently to every arrow in space — it's like physically grabbing the plane and moving it.
The key insight: to understand what a matrix does, just watch where the two basis vectors land.
The basis vectors e₁ = [1,0] (points right) and e₂ = [0,1] (points up) are like the coordinate axes. Every other vector is built from them. So if you know where these two go, you know where everything goes.
The columns of the matrix are exactly where the basis vectors land.
A = [ 2 1 ]    → e₁ (right) lands at [2, 0] (column 1)
    [ 0 3 ]    → e₂ (up) lands at [1, 3] (column 2)
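A quick sketch of this fact in plain Python (the matrix is the A from above, stored as a list of rows):

```python
def matvec(A, x):
    """Multiply a 2x2 matrix (list of rows) by a 2-vector."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

A = [[2, 1],
     [0, 3]]

e1, e2 = [1, 0], [0, 1]

print(matvec(A, e1))  # → [2, 0], which is column 1 of A
print(matvec(A, e2))  # → [1, 3], which is column 2 of A
```

Transforming a basis vector simply picks out the corresponding column — that is the whole "columns tell you where the basis lands" rule.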
The math
Matrix-vector multiplication:
A = [ a b ]    x = [ x₁ ]    A·x = [ a·x₁ + b·x₂ ]
    [ c d ]        [ x₂ ]          [ c·x₁ + d·x₂ ]
Each row of A dot-products with x to give one output number.
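The row-by-row recipe translates directly into code. A sketch with arbitrary entries a, b, c, d chosen just for illustration:

```python
def dot(row, x):
    """Dot product of a 2-entry row with a 2-vector."""
    return row[0] * x[0] + row[1] * x[1]

def matvec(A, x):
    # One output number per row of A, exactly as in the formula.
    return [dot(A[0], x), dot(A[1], x)]

a, b, c, d = 1, 2, 3, 4          # arbitrary entries for illustration
A = [[a, b], [c, d]]
x = [5, 6]
print(matvec(A, x))  # → [1*5 + 2*6, 3*5 + 4*6] = [17, 39]
```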
Reading a matrix geometrically: just look at the columns — they tell you where right and up land after the transformation.
Linear transformation rules: a transformation is linear if:
1. A·(u + v) = A·u + A·v (add then transform = transform then add)
2. A·(c·v) = c·(A·v) (scale then transform = transform then scale)
Geometrically: straight lines stay straight, and the origin never moves.
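Both rules are easy to spot-check numerically. This is a sketch for one matrix and a couple of example vectors, not a proof:

```python
def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def add(u, v):
    return [u[0] + v[0], u[1] + v[1]]

def scale(c, v):
    return [c * v[0], c * v[1]]

A = [[2, 1], [0, 3]]
u, v, c = [1, 2], [-3, 4], 5

# Rule 1: A·(u + v) == A·u + A·v
assert matvec(A, add(u, v)) == add(matvec(A, u), matvec(A, v))
# Rule 2: A·(c·v) == c·(A·v)
assert matvec(A, scale(c, v)) == scale(c, matvec(A, v))
print("both linearity rules hold for this example")
```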
A worked example
A = [ 3 0 ]    ← what does this do?
    [ 0 2 ]
Column 1 = [3, 0]: e₁ (right) gets stretched 3× to the right
Column 2 = [0, 2]: e₂ (up) gets stretched 2× upward
→ This matrix stretches space: 3× wide, 2× tall.
Apply to v = [1, 3]:
A·[1,3] = [3·1 + 0·3, 0·1 + 2·3] = [3, 6] ✓
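The same arithmetic as a runnable check:

```python
def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

A = [[3, 0],
     [0, 2]]
v = [1, 3]
print(matvec(A, v))  # → [3, 6]: the x-part tripled, the y-part doubled
```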
Four common transformation types:
| Matrix | What it does |
|--------|-------------|
| [[1,0],[0,1]] | Identity — nothing moves |
| [[3,0],[0,2]] | Scale 3× wide, 2× tall |
| [[-1,0],[0,1]] | Reflect (flip left-right) |
| [[0,-1],[1,0]] | Rotate 90° counterclockwise |
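Applying all four matrices from the table to the same test vector makes the geometric differences concrete:

```python
def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

v = [1, 0]  # "right" — easy to picture

print(matvec([[1, 0], [0, 1]], v))    # identity   → [1, 0]
print(matvec([[3, 0], [0, 2]], v))    # scale      → [3, 0]
print(matvec([[-1, 0], [0, 1]], v))   # reflect    → [-1, 0]
print(matvec([[0, -1], [1, 0]], v))   # rotate 90° → [0, 1]
```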
Why this matters for ML
Every layer of a neural network is a matrix transformation. The operation output = W · input + b is a linear transformation (W) followed by a shift (b). When a network learns, it is learning the matrix W — learning which transformation of the input space best separates the data.
Understanding W geometrically gives intuition for what a layer is doing: compressing, rotating, reflecting the data cloud into a more useful shape.
A translation (shifting everything by [3,1]) is not linear (the origin would move). That's why the bias b is added separately.
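A minimal sketch of a single layer, output = W·x + b, with made-up weights. Feeding in the origin shows why the shift has to live outside the matrix:

```python
def matvec(W, x):
    return [W[0][0] * x[0] + W[0][1] * x[1],
            W[1][0] * x[0] + W[1][1] * x[1]]

def layer(W, b, x):
    # Linear transformation W, then shift by b.
    y = matvec(W, x)
    return [y[0] + b[0], y[1] + b[1]]

W = [[2, 1], [0, 3]]   # made-up weights for illustration
b = [3, 1]

origin = [0, 0]
print(matvec(W, origin))   # → [0, 0]: a linear map always fixes the origin
print(layer(W, b, origin)) # → [3, 1]: the bias moves it, so W·x + b is not linear
```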
The one thing to remember
A matrix is a transformation of space. Its columns tell you where the basis vectors land — and from that, you know where everything goes.