1.2 — Linear Combinations, Span & Linear Independence

Date: 2026-02-21 | Block: 1 — Linear Algebra

The idea in plain English

A linear combination is what you get when you scale some vectors and add them together. The span is the full set of vectors you can reach by doing this. Linear independence asks whether all your vectors are genuinely contributing — or whether some are secretly redundant copies of others.

The intuition

Linear combination: think of vectors as ingredients and scalars as quantities in a recipe. Different quantities of the same ingredients produce different results.

Span: imagine you can walk in two directions (your two vectors). The span is every place you could possibly reach by walking any combination of those two directions.

Independence: if one of your directions is just a scaled version of another (e.g. "north" and "double-north"), you haven't actually gained a new direction. You're stuck on the same line. Independent vectors genuinely point in new directions.

INDEPENDENT (span the plane):     DEPENDENT (stuck on a line):
v₂ ↑                              v₁ and v₂ on the same line:
   |  ↗ v₁                           ↗ v₂
   | /         → reach anywhere      ↗ v₁
   |/                                ↗
───+────                         ────+────

The math

Linear combination:

result = c₁·v₁ + c₂·v₂ + ... + cₙ·vₙ

The c values are scalars (plain numbers), the v values are vectors.
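As a quick sketch in NumPy (the particular vectors and scalars below are made up for illustration):

```python
import numpy as np

# Two vectors and two scalars — arbitrary values, just to show the mechanics.
v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, -1.0])
c1, c2 = 2.0, 0.5

# Scale each vector by its scalar, then add: a linear combination.
result = c1 * v1 + c2 * v2
print(result)  # [3.5 3.5]
```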

Span: the set of all possible linear combinations of a set of vectors.

Linear independence: vectors v₁, v₂, ..., vₖ are independent if the only solution to:

c₁·v₁ + c₂·v₂ + ... + cₖ·vₖ = 0

is c₁ = c₂ = ... = cₖ = 0. If you can make the zero vector with non-zero scalars, they are dependent.

A worked example

v₁ = [1, 2]   v₂ = [2, 4]    ← notice v₂ = 2 × v₁

Try: 2·v₁ + (−1)·v₂ = [2,4] + [−2,−4] = [0,0]

Non-zero scalars (2 and −1) produced the zero vector → dependent. v₂ gives us no new direction.

v₁ = [1, 0]   v₂ = [0, 1]

The only way to get c₁·[1,0] + c₂·[0,1] = [0,0] is c₁=0, c₂=0 → independent. These span the entire 2D plane.
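Both worked examples can be checked numerically. One standard approach (not the only one) is to stack the vectors as columns of a matrix and compare its rank to the number of vectors: full column rank means independent.

```python
import numpy as np

# Stack vectors as matrix columns; rank == number of columns ⇔ independent.
dep = np.column_stack(([1, 2], [2, 4]))   # v2 = 2·v1, the dependent pair
ind = np.column_stack(([1, 0], [0, 1]))   # the standard basis pair

print(np.linalg.matrix_rank(dep))  # 1 of 2 columns → dependent
print(np.linalg.matrix_rank(ind))  # 2 of 2 columns → independent
```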

Why this matters for ML

Redundant features are linearly dependent. If your dataset has both income_monthly and income_annual, one column is exactly 12× the other: the two are linearly dependent, and one is completely redundant. Linear models struggle with this (the problem is called multicollinearity, and it makes coefficient estimates unstable).
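The income example can be seen directly with a rank check (the numbers here are invented):

```python
import numpy as np

# Hypothetical feature columns: one is an exact multiple of the other.
income_monthly = np.array([3000.0, 4500.0, 5200.0, 2800.0])
income_annual = 12 * income_monthly

# Two feature columns, but rank 1: only one independent direction of information.
X = np.column_stack((income_monthly, income_annual))
print(np.linalg.matrix_rank(X))  # 1
```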

Neural network layers compute linear combinations. Every neuron takes a weighted sum of its inputs — that's a linear combination. The weights are the scalars, and the input values form the vector.
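A single neuron's pre-activation output, sketched with made-up weights and inputs:

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # weights: the scalars c_i
x = np.array([1.0, 2.0, 0.5])    # inputs: the vector entries
b = 0.1                          # bias term

# The weighted sum w·x is exactly a linear combination of the inputs.
z = np.dot(w, x) + b
print(z)  # ≈ -0.4
```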

PCA removes dependence. PCA finds directions in data that are independent — directions that each carry new, non-redundant information.
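One way to see this, using SVD on centered data (the dataset below is synthetic: two nearly dependent columns):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
# Second column is almost exactly 2× the first, plus tiny noise.
data = np.column_stack((x, 2 * x + 0.01 * rng.normal(size=100)))

centered = data - data.mean(axis=0)
# SVD finds orthogonal directions ordered by how much variance they carry —
# the same directions PCA uses as principal components.
_, s, _ = np.linalg.svd(centered, full_matrices=False)
print(s)  # first singular value dominates: the data is essentially one-dimensional
```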

The one thing to remember

If you can make one vector by scaling and adding the others, it's dependent and redundant. Independent vectors genuinely point in new directions.
