Type Theory

Humans classify. We sort animals into species, books into genres, people into roles, and programmers into those who have mass-assigned any to silence the compiler and those who are lying about it. Classification is how finite minds manage infinite variety. Types are classification for computation: every value belongs to a type, and the type determines what operations make sense. You can add numbers but not strings. You can take the length of a list but not the length of an integer. The type system enforces these distinctions before your program runs, which sounds obvious until you remember that most of the world’s software runs on languages where the type system’s considered opinion is “looks plausible” right up until production catches fire.

This seems pedestrian until you push it. What if types could say not just “this is a list” but “this is a list of exactly five elements”? What if they could say not just “this function returns an integer” but “this function returns a positive integer”? What if the type of a function could express its complete specification, so that any implementation with that type is correct by construction?

Dependent type theory answers yes to all of these. It is the most expressive type system in common use, and it blurs the line between programming and mathematics. A type becomes a proposition. A program becomes a proof. The compiler becomes a theorem checker. This is not metaphor; it is the Curry-Howard correspondence that we met in the previous article, now unleashed to its full power.

The Ladder of Expressiveness

Type systems form a ladder. Each rung lets you say more.

Simple types (C, Java): Values have types. int, string, bool. You cannot add a string to an integer. This catches typos and category errors, but nothing deeper.

Polymorphic types (Haskell, OCaml): Types can be parameterized. List α works for any element type. You write one reverse function that works on lists of integers, strings, or custom objects. The type ∀ α. List α → List α says “for any type α, give me a list of α and I’ll return a list of α.”

Dependent types (Lean, Coq, Agda): Types can depend on values. Vector α n is a list of exactly n elements. The number n is a value that appears in the type. Now the type system can express array bounds, matrix dimensions, protocol states, and any property you can state precisely.

The jump from polymorphic to dependent types is where things get interesting. Consider matrix multiplication. Two matrices can only be multiplied if the columns of the first equal the rows of the second. With dependent types:

Matrix m n → Matrix n p → Matrix m p

The shared n enforces compatibility at compile time. Multiply a 3×4 by a 5×2? Type error. The bug is caught before any code runs. Your linear algebra homework now has compile errors, which is somehow both better and worse.
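
Here is a minimal sketch of both ideas in Lean. The names Vec, Mat, and MatMulSig are invented for this illustration, not standard library definitions.

-- A length-indexed vector: the length lives in the type
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : {n : Nat} → α → Vec α n → Vec α (n + 1)

-- head is total: a Vec α (n + 1) cannot be empty, so no Option is needed
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons a _ => a

-- A dimension-indexed matrix as an indexing function
def Mat (m n : Nat) : Type := Fin m → Fin n → Nat

-- The signature from the text: the shared n is the compatibility constraint,
-- so multiplying a 3×4 by a 5×2 is rejected before anything runs
def MatMulSig : Type := ∀ (m n p : Nat), Mat m n → Mat n p → Mat m p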

Types as First-Class Citizens

In simple type systems, types and values live in separate worlds. You cannot write a function that takes a type as an argument or returns a type as a result. The wall between them is absolute.

Dependent types tear down this wall. Types become values. You can compute with them, pass them to functions, store them in data structures. The function that constructs Vector Int n takes a number n and returns a type. This uniformity is what makes the whole system work: if types can depend on values, then types must be values.
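
To make this concrete, here is a function from values to types; NTuple is a name made up for this example. The recursion happens at the type level: the type checker evaluates NTuple Nat 3 down to Nat × Nat × Nat × Unit.

-- A function that takes a number and returns a type
def NTuple (α : Type) : Nat → Type
  | 0 => Unit
  | n + 1 => α × NTuple α n

-- The nested tuple matches the computed type exactly
example : NTuple Nat 3 := (1, 2, 3, ())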

The theoretical foundations trace through the 20th century: Russell’s type distinctions to resolve paradoxes, Church’s simply typed lambda calculus, Martin-Löf’s intuitionistic type theory that unified logic and computation. Lean implements a refinement called the Calculus of Inductive Constructions, which adds inductive types and a hierarchy of universes to keep everything consistent.

The practical experience differs from conventional programming. Types become more informative but also more demanding. You must often provide proofs alongside your code, demonstrating that values satisfy required properties. The compiler becomes an adversary that checks your reasoning at every step, as we saw with tactics. When a program type-checks, you gain strong guarantees about its behavior. When it fails, the error messages guide you toward the gap between what you claimed and what you proved.

Universes and Russell’s Paradox

When you write universe u v w in Lean, you are declaring universe level variables. If that sentence meant nothing to you, good - it shouldn’t yet. But why do universes exist at all? The answer involves one of the most famous disasters in the history of mathematics, a polite letter that ended a career, and the dawning realization that self-reference is the serpent in every formal garden.

In 1901, Bertrand Russell sent a letter to Gottlob Frege, who had just completed his life’s work: a logical foundation for all of mathematics. Russell’s letter contained a single question. Consider the set R of all sets that do not contain themselves. Does R contain itself? If yes, then by definition it should not. If no, then by definition it should. Frege’s system was inconsistent. His life’s work collapsed. He wrote back: “Hardly anything more unfortunate can befall a scientific writer than to have one of the foundations of his edifice shaken after the work is finished.”

This is the danger of self-reference. A set that asks about its own membership. A sentence that asserts its own falsehood. A type that contains itself. These constructions look innocent but harbor contradictions. Mathematics needed walls to prevent them.

Type theory builds those walls through stratification. Types are organized into a hierarchy of universes. In Lean, Prop sits at Sort 0, Type 0 sits at Sort 1, Type 1 sits at Sort 2, and so on. A type at level n can only mention types at levels below n. The type Type 0 itself has type Type 1, not Type 0. This breaks the self-reference. You cannot ask whether Type contains itself because Type is not a single thing; it is an infinite ladder, and each rung can only see the rungs below.

universe u v w

-- Universe polymorphism (explicit universe level)
def polyIdentity (α : Sort u) (a : α) : α := a

-- Universe level expressions
def maxLevel (α : Type u) (β : Type v) : Type (max u v) := α × β

-- Type 0 contains types, Type 1 contains Type 0, etc.
example : Type 0 = Sort 1 := rfl
example : Prop = Sort 0 := rfl

-- Impredicative Prop: functions into Prop stay in Prop
def propPredicate (P : Type u → Prop) : Prop := ∀ α, P α

-- Predicative Type: function types take maximum universe
def typePredicate (P : Type u → Type v) : Type (max (u+1) v) :=
  ∀ α, P α

The declaration universe u v w introduces universe level variables. When you write def polyIdentity (α : Sort u) (a : α) : α := a, you are defining a function that works at any universe level. Sort u covers both Prop (when u = 0) and Type n (when u = n + 1). Universe polymorphism lets you write a single definition that works across the entire hierarchy.
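
To see this concretely, the same polyIdentity instantiates at both ends of the hierarchy; these checks use only the definition above.

-- At u = 0, Sort 0 is Prop: the identity transports proofs
example (P : Prop) (h : P) : P := polyIdentity P h

-- At u = 1, Sort 1 is Type 0: the identity works on data
example : Nat := polyIdentity Nat 42

#check @polyIdentity.{0}  -- (α : Prop) → α → α
#check @polyIdentity.{1}  -- (α : Type) → α → α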

Predicativity and Impredicativity

Here is a rule that sounds obvious until you think about it: you cannot be in the photograph you are taking. The photographer stands outside the frame. A committee that selects its own members creates paradoxes of legitimacy. A definition that refers to a collection containing itself is suspect. This intuition, that the definer must stand apart from the defined, is called predicativity.

Imagine a monastery where knowledge is organized into concentric walls. Acolytes in the outer ring may study only texts from their own ring. Scholars who wish to reference the entire outer collection must do so from the second ring, looking inward. Those who would survey the second ring must stand in the third. And so on, each level permitted to see only what lies below. No scholar may cite a collection that includes their own work. The hierarchy prevents the serpent from eating its tail.

This is how predicative universes work. When you quantify over all types at level n, the resulting type lives at level n+1. The definition “for all types α in Type 0, the type α → α” must itself live in Type 1 because it speaks about the entirety of Type 0. You cannot make universal claims about a collection while remaining inside it. The quantification must ascend.

-- Predicative: quantifying over Type u produces Type (u+1)
-- The result must be "larger" than what we quantify over
def predicativeExample : Type 1 := ∀ (α : Type 0), α → α

-- Check the universe levels explicitly
#check (∀ (α : Type 0), α → α : Type 1)  -- Lives in Type 1
#check (∀ (α : Type 1), α → α : Type 2)  -- Lives in Type 2

-- Impredicative: quantifying over Prop still produces Prop
-- Prop can "talk about itself" without ascending universe levels
def impredicativeExample : Prop := ∀ (P : Prop), P → P

#check (∀ (P : Prop), P → P : Prop)  -- Still in Prop, not Type 0!

-- Why this matters: classical logic requires impredicativity
-- "For all propositions P, P or not P" is itself a proposition
def excludedMiddleType : Prop := ∀ (P : Prop), P ∨ ¬P

-- If Prop were predicative, this would land in Type 0, breaking classical reasoning
-- Impredicativity lets us define propositions that quantify over all propositions

-- The danger: unrestricted impredicativity leads to paradox
-- Girard's paradox shows Type : Type is inconsistent
-- Lean avoids this by making only Prop impredicative

Lean’s Type hierarchy is predicative: ∀ (α : Type 0), α → α has type Type 1, not Type 0. This prevents Girard’s paradox, a type-theoretic ouroboros that arises when Type : Type. The infinite regress of universes is the price of consistency.

But Prop breaks the rule. Lean’s Prop is impredicative: ∀ (P : Prop), P → P has type Prop, staying at the same level despite quantifying over all propositions. The monastery has a secret inner sanctum where the old restrictions do not apply. How is this safe?

Proof irrelevance is the answer. In Prop, all proofs of the same proposition are equal. You cannot extract computational content from an impredicative definition over propositions because there is nothing to extract; all witnesses are indistinguishable. The dangerous circularity is defanged. The serpent may eat its tail here because the tail has no substance.

This matters for classical logic. The law of excluded middle, ∀ (P : Prop), P ∨ ¬P, quantifies over all propositions. If Prop were predicative, this would live in Type 0, making it a computational object rather than a logical axiom. The monks in the inner sanctum can do things forbidden to those outside because what happens in Prop stays in Prop. The walls are there to protect computation from paradox; proof lives by different rules.

-- PLift: lifts any type by exactly one level
def liftedFalse : Type := PLift False

-- ULift: lifts types by any amount
def liftedNat : Type u := ULift.{u} Nat

-- Lifting and unlifting values
def liftExample : ULift.{1} Nat := ⟨42⟩
example : liftExample.down = 42 := rfl

-- Non-cumulativity: types exist at exactly one level
def needsLifting (α : Type 1) : Type 2 := ULift.{2} α

Non-Cumulativity

In a cumulative type theory, every type at universe level n is automatically also a type at level n+1 and all higher levels. Coq and Idris work this way: if you have Nat : Type 0, you can use Nat anywhere a Type 1 is expected. The type “flows upward” through the hierarchy without explicit intervention. This makes polymorphic code more convenient since you rarely need to think about universe levels.

Lean takes the opposite approach. Each type lives at exactly one universe level. Nat has type Type 0 and only Type 0. If a function expects a Type 1 argument, you cannot pass Nat directly. You must explicitly lift it using ULift or PLift, wrapper types that move values to higher universes.

-- In Lean, Nat has type Type 0, not Type 1
#check (Nat : Type 0)    -- works
-- #check (Nat : Type 1) -- would fail: Nat is not in Type 1

-- A function expecting Type 1 cannot accept Nat directly
def wantsType1 (α : Type 1) : Type 1 := α

-- This fails: Nat lives in Type 0, not Type 1
-- def broken := wantsType1 Nat

-- Solution: explicitly lift Nat to Type 1
def works := wantsType1 (ULift Nat)

-- In a cumulative system (like Coq), this would work:
-- def coqStyle := wantsType1 Nat  -- Coq allows this

-- Practical example: polymorphic container at higher universe
def Container (α : Type 1) := List α

-- Cannot directly use Nat
-- def natContainer := Container Nat  -- fails

-- Must lift first
def natContainer := Container (ULift Nat)

-- Extracting values requires unlifting
def sumLifted (xs : List (ULift Nat)) : Nat :=
  xs.foldl (fun acc x => acc + x.down) 0

Note

This explicit lifting makes universe structure visible in your code. You always know exactly which universe level you are working at. The tradeoff is verbosity: code that would “just work” in Coq requires explicit lifts in Lean. In practice, most Lean code stays within Type 0 and Prop, so non-cumulativity rarely causes friction.

Proof Irrelevance

A bear catching a salmon does not care whether the fish swam upstream via the left channel or the right. Philosophers have spent centuries on equivalent questions about identity and equivalence, producing volumes that the bear would find unpersuasive. Proof irrelevance applies this principle to mathematics: any two proofs of the same proposition are equal. If you have two proofs p1 and p2 that both establish proposition P, then p1 = p2 holds definitionally. We care that the theorem is true, not which path led there. This is either profound or obvious depending on how much time you’ve spent arguing about equality, which is to say, depending on whether you’ve ever debugged a language with more than three equality operators.

-- Basic logical connectives
theorem and_intro (P Q : Prop) (hp : P) (hq : Q) : P ∧ Q := ⟨hp, hq⟩

theorem or_elim (P Q R : Prop) (h : P ∨ Q) (hp : P → R) (hq : Q → R) : R :=
  h.elim hp hq

theorem iff_intro (P Q : Prop) (hpq : P → Q) (hqp : Q → P) : P ↔ Q :=
  ⟨hpq, hqp⟩

-- Proof irrelevance demonstration
theorem proof_irrel_demo (P : Prop) (p1 p2 : P) : p1 = p2 := rfl

-- Classical logic (via choice)
open Classical in
theorem excluded_middle (P : Prop) : P ∨ ¬P := Classical.em P

Proof irrelevance has profound computational implications. Because all proofs of a proposition are equal, the compiler can erase proofs at runtime. A function that takes a proof argument does not actually need to receive any runtime data for that argument. This erasure is essential for performance: without it, complex proofs would bloat compiled code with useless proof terms. Your elaborate justification for why the code is correct compiles down to nothing, much like comments but with mathematical guarantees.
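
A small sketch of what erasure buys (safeGet is a name invented here): the proof argument below satisfies the type checker but carries no runtime data.

-- h discharges the bounds check statically; at runtime the compiled
-- function receives only the array and the index
def safeGet (xs : Array Nat) (i : Nat) (h : i < xs.size) : Nat := xs[i]

#eval safeGet #[10, 20, 30] 1 (by decide)  -- 20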

The technical foundation is that Prop is a subsingleton universe. A subsingleton is a type with at most one element. For any proposition P, there is at most one proof of P up to definitional equality. This contrasts with Type, where Bool has two distinct elements true and false, and Nat has infinitely many.

Proof irrelevance also enables powerful automation. When a tactic constructs a proof term, the exact structure of that term does not matter. The tactic can use whatever construction is convenient, and the result will be equal to any other proof of the same statement. This freedom simplifies tactic implementation and allows for aggressive optimization of proof search.
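
For instance, two proofs produced by entirely different routes are definitionally equal:

-- One proof via the omega decision procedure, one by kernel evaluation
theorem le_ten  : 3 ≤ 10 := by omega
theorem le_ten' : 3 ≤ 10 := by decide

-- Proof irrelevance makes them the same term as far as the logic is concerned
example : le_ten = le_ten' := rfl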

Constructive and Classical Logic

Lean’s type theory is constructive at its core. A constructive proof of existence must provide a witness: to prove ∃ n, P n, you must exhibit a specific n and show P n holds. You cannot merely argue that non-existence leads to contradiction. This discipline has a profound consequence: constructive proofs are programs. A proof of ∃ n, n * n = 4 contains the number 2. You can extract it and compute with it.

-- Constructive proof: we provide an explicit witness
theorem constructive_exists : ∃ n : Nat, n * n = 4 :=
  ⟨2, rfl⟩  -- The witness is 2, and we can compute 2 * 2 = 4

-- Constructive: we can extract and run the witness
def constructive_even : { n : Nat // n % 2 = 0 } :=
  ⟨4, rfl⟩  -- The subtype bundles value with proof

#eval constructive_even.val  -- Outputs: 4

-- The law of excluded middle itself (classical axiom)
theorem lem (P : Prop) : P ∨ ¬P := Classical.em P

-- Double negation elimination (classical, not constructive)
-- In constructive logic, ¬¬P does not imply P
theorem dne (P : Prop) : ¬¬P → P := Classical.byContradiction

-- Proving a negation: assume both facts and derive False
-- (constructively valid; no classical axiom is needed here)
theorem even_not_odd (n : Nat) : n % 2 = 0 → ¬(n % 2 = 1) := by
  intro heven hodd
  omega

Classical logic adds axioms that break this computational interpretation. The law of excluded middle (P ∨ ¬P for any proposition) lets you prove existence by contradiction without producing a witness. Double negation elimination (¬¬P → P) lets you escape a double negation without constructing a direct proof. These principles are mathematically sound but computationally empty. When you prove something exists using excluded middle, the proof does not contain the thing that exists.

Lean provides classical axioms through the Classical namespace. When you use Classical.em or tactics like by_contra, you are stepping outside constructive logic. Lean tracks this: definitions that depend on classical axioms are marked noncomputable, meaning they cannot be evaluated at runtime.

-- Classical choice: given existence, extract a witness
-- This is noncomputable because we cannot run it
noncomputable def classical_witness (P : Nat → Prop) (h : ∃ n, P n) : Nat :=
  Classical.choose h

-- The witness satisfies the property (but we cannot compute what it is)
theorem classical_witness_spec (P : Nat → Prop) (h : ∃ n, P n) :
    P (classical_witness P h) :=
  Classical.choose_spec h

-- Contrast: decidable existence gives computable witnesses
def decidable_witness (p : Nat → Bool) (bound : Nat) : Nat :=
  -- We can search by enumeration because the domain is finite
  (List.range bound).find? (fun n => p n) |>.getD 0

-- The key insight: constructive proofs compute, classical proofs assert

Why does this matter? For pure mathematics, classical reasoning is often more convenient. Many standard proofs use contradiction freely. But for verified programming, constructive proofs have an advantage: they produce code. A constructive proof that a sorting algorithm returns a sorted list can be extracted into an actual sorting function. A classical proof merely asserts the sorted list exists.

The practical guidance: use constructive methods when you can, classical when you must. Lean supports both. When you see noncomputable on a definition, you know it relies on classical axioms and cannot be executed. When a definition lacks that marker, it is constructive and can run. The type system tracks the distinction so you always know which world you are in.

Quotients and Parametricity

Quotient types allow you to define new types by identifying elements of an existing type according to an equivalence relation. The integers modulo n, for example, identify natural numbers that have the same remainder when divided by n. Quotients are essential for constructing mathematical objects like rational numbers, real numbers, and algebraic structures.

-- Simple modulo equivalence relation
def ModRel (n : Nat) : Nat → Nat → Prop :=
  fun a b => a % n = b % n

-- Prove it's an equivalence relation
theorem ModRel.refl (n : Nat) : ∀ x, ModRel n x x :=
  fun _ => rfl

theorem ModRel.symm (n : Nat) : ∀ x y, ModRel n x y → ModRel n y x :=
  fun _ _ h => Eq.symm h

theorem ModRel.trans (n : Nat) : ∀ x y z, ModRel n x y → ModRel n y z → ModRel n x z :=
  fun _ _ _ hxy hyz => Eq.trans hxy hyz

-- Create setoid instance
instance ModSetoid (n : Nat) : Setoid Nat where
  r := ModRel n
  iseqv := {
    refl := ModRel.refl n
    symm := @ModRel.symm n
    trans := @ModRel.trans n
  }

-- Define the quotient type (integers modulo n)
def ZMod (n : Nat) : Type := Quotient (ModSetoid n)

However, quotients break parametricity. Parametricity is the principle that polymorphic functions must treat their type arguments uniformly. A function of type ∀ α, α → α can only be the identity function because it has no way to inspect what α is. It must work the same way for Nat, String, and any other type. This uniformity yields powerful “free theorems” about polymorphic functions.
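
A small illustration of the intuition (mustBeId is our own name for it):

-- With α abstract, the function has nothing to inspect; up to
-- extensionality, returning the argument is the only implementation
def mustBeId : ∀ (α : Type), α → α := fun _ a => a

-- The corresponding free theorem (any f of this type behaves as the
-- identity) is a metatheorem about the language, not provable inside it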

Quotients violate this uniformity through the Quot.lift operation. When you lift a function to operate on a quotient type, you must prove that the function respects the equivalence relation. This proof obligation means that functions on quotients can behave differently depending on the specific equivalence relation, breaking the uniformity that parametricity requires.

namespace ZMod

-- Constructor respecting equivalence
def mk (n : Nat) (a : Nat) : ZMod n :=
  Quotient.mk (ModSetoid n) a

-- Addition via lifting: the second argument proves that addition
-- respects the equivalence relation, so the lift is well-defined
def add {n : Nat} : ZMod n → ZMod n → ZMod n :=
  Quotient.lift₂
    (fun a b => mk n ((a + b) % n))
    (fun a₁ b₁ a₂ b₂ h₁ h₂ => by
      apply Quotient.sound
      -- The goal is the relation on representatives, up to unfolding
      show (a₁ + b₁) % n % n = (a₂ + b₂) % n % n
      have h₁' : a₁ % n = a₂ % n := h₁
      have h₂' : b₁ % n = b₂ % n := h₂
      rw [Nat.add_mod, h₁', h₂', ← Nat.add_mod])

-- Quotient.sound: related elements are equal
theorem mk_eq_of_rel {n : Nat} (a b : Nat) (h : ModRel n a b) :
  mk n a = mk n b :=
  Quotient.sound h

-- Quotient induction principle
theorem ind_on {n : Nat} {P : ZMod n → Prop} (q : ZMod n)
  (h : ∀ a, P (mk n a)) : P q :=
  Quotient.ind h q

end ZMod

Why is this acceptable? The trade-off is deliberate. Quotients are necessary for mathematics: you cannot construct the integers, rationals, or reals without them. The loss of parametricity is confined to quotient types and does not affect ordinary polymorphic functions. Moreover, the requirement to prove that lifted functions respect equivalence ensures that quotient operations are well-defined. You cannot accidentally distinguish between equivalent elements.
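
The discipline is visible in the API. A function that respects the relation lifts cleanly; toNat is a name invented here for the canonical-representative map.

-- The obligation is exactly h : a % n = b % n, which is what ModRel
-- provides, so the lift is accepted
def ZMod.toNat {n : Nat} : ZMod n → Nat :=
  Quotient.lift (fun a => a % n) (fun _ _ h => h)

-- By contrast, lifting the raw identity fun a => a would demand
-- a = b from a % n = b % n, an unprovable obligation: no function
-- on ZMod n can observe which representative you started from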

Comparative Type Systems

Different languages make different design choices in their type systems. The following table summarizes key features across proof assistants and programming languages.

| Feature | Lean 4 | Coq | Agda | Idris 2 | Haskell | Rust |
|---------|--------|-----|------|---------|---------|------|
| Dependent Types | Full | Full | Full | Full | Limited | No |
| Universe Hierarchy | Predicative | Predicative | Predicative | Predicative | None | None |
| Universe Cumulativity | No | Yes | No | Yes | N/A | N/A |
| Proof Irrelevance | Yes (Prop) | Yes (Prop) | Optional | Yes | N/A | N/A |
| Tactic Language | Lean DSL | Ltac | No | Elab | N/A | N/A |
| Type Inference | Partial | Partial | Partial | Partial | Sorta Full | Full |
| Termination Checking | Required | Required | Required | Optional | No | No |
| Linear Types | No | No | No | QTT | Extension | Ownership |
| Effects System | Monad | Monad | Monad | Algebraic | Monad | Ownership |
| Code Generation | Native | OCaml/Haskell | Haskell | Native | Native | Native |
| Cubical Type Theory | No | No | Yes | No | No | No |
| Decidable Type Checking | No | No | No | No | Sorta | Yes |

Glossary:

  • Ltac: Coq’s original tactic language, a dynamically-typed scripting language for proof automation
  • QTT: Quantitative Type Theory, tracks how many times each variable is used to enable linear resource management
  • Predicative: A universe is predicative if quantifying over types at level n produces a type at level n+1 or higher
  • Cumulativity: Whether a type at level n is automatically also at level n+1
  • Sorta Full: Haskell has full type inference for base Haskell 2010, but enabling type system extensions (GADTs, TypeFamilies, RankNTypes, etc.) may require type annotations
  • Sorta (Decidable): Haskell 2010 has decidable type checking, but extensions like UndecidableInstances and TypeFamilies can make type checking undecidable or non-terminating

Lean and Coq provide full dependent types with rich proof automation, making them suitable for formal verification. Agda emphasizes explicit proof terms and supports cubical type theory for constructive equality. Idris 2 uses quantitative type theory to track resource usage, bridging the gap between theorem proving and systems programming.

Haskell approaches dependent types through extensions like GADTs, DataKinds, and type families. Base Haskell maintains decidable type checking, but common extensions can introduce undecidability. Rust’s ownership system provides memory safety guarantees through affine types, with decidable checking and predictable compile times.

A common critique of Lean is its lack of linear or affine types, which would enable compile-time guarantees about resource usage and in-place mutation. The Lean developers chose instead to rely on runtime reference counting with FBIP ("functional but in-place") optimizations, trading static linearity guarantees for simpler types and the ability to share data freely without borrow checker complexity.

The table above looks like a feature comparison. It is actually a map of philosophical commitments. Each row is a question about the nature of computation; each column is an answer. The language you choose chooses what thoughts are easy to think.

The fundamental trade-off is expressiveness versus automation. Full dependent types let you express arbitrary properties but require manual proof effort. Decidable type systems like Rust and Haskell infer types automatically but cannot express many important invariants. Choose based on whether you need machine-checked proofs or just strong static guarantees.

In short: Lean and Coq make you prove everything is correct, Rust makes you prove memory is safe, Haskell makes you prove effects are tracked, and most other languages just trust you not to ship on Friday.