Proofs
You have written functions. You have defined types. You have pattern matched, recursed, and composed. But you have not yet proved anything.
The difference matters. When you write a function, the compiler checks that types align, but it does not verify that your code does what you claim. You say this function sorts? The compiler shrugs. In theorem proving, you make claims and then justify them. The compiler verifies that your justification actually establishes the claim. You cannot bluff your way through a proof.
A bear learns to fish by watching the stream, understanding where salmon pause, developing patience for the moment when motion becomes certainty. Proving is similar. You learn to read the goal state, understand where progress stalls, develop patience for the tactic that transforms confusion into clarity.
Notation
Before we write our first proof, we need a shared language. The notation below bridges three worlds: the mathematical symbols you find in logic textbooks, the inference rules used in programming language theory (as in Pierce’s Types and Programming Languages and Harper’s Practical Foundations for Programming Languages), and the Lean syntax you will write. Learning to read all three simultaneously is the key to fluency.
| Symbol | Name | Meaning |
|---|---|---|
| \(\vdash\) | turnstile | “proves” or “entails” |
| \(\Gamma\) | Gamma | the context (hypotheses we can use) |
| \(\to\) | arrow | implication or function type |
| \(\forall\) | for all | universal quantification |
| \(\exists\) | exists | existential quantification |
| \(\land\) | and | conjunction |
| \(\lor\) | or | disjunction |
| \(\top\) | top | truth (trivially provable) |
| \(\bot\) | bottom | falsehood (unprovable) |
A judgment \(\Gamma \vdash P\) reads “from context \(\Gamma\), we can prove \(P\).” An inference rule shows how to derive new judgments from existing ones:
\[ \frac{\Gamma \vdash P \quad \Gamma \vdash Q}{\Gamma \vdash P \land Q} \text{(∧-intro)} \]
This rule says: if you can prove \(P\) and you can prove \(Q\), then you can prove \(P \land Q\). The premises sit above the line; the conclusion below. The name on the right identifies the rule. Every tactic you learn corresponds to one or more such rules. The tactic is the mechanism; the rule is the justification.
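In Lean, this rule is not merely notation: And.intro is a function that takes a proof of \(P\) and a proof of \(Q\) and returns a proof of \(P \land Q\). A minimal illustration (its tactic counterpart, constructor, appears in the next section):
-- ∧-intro as a Lean term: And.intro pairs two proofs
example (P Q : Prop) (hp : P) (hq : Q) : P ∧ Q := And.intro hp hq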
Tactics as Proof-State Transformers
You may have repressed the trauma of high school algebra, but the core idea was sound: you start with \(2x + 5 = 11\) and apply operations until you reach \(x = 3\). Subtract 5, divide by 2, each step transforming the equation into something simpler. The tedium was doing it by hand, error-prone and joyless. But the method itself, symbolic manipulation through mechanical transformation, turns out to be extraordinarily powerful when the machine handles the bookkeeping.
Tactics work the same way. You start with a goal (what you want to prove) and a context (what you already know). Each tactic transforms the goal into simpler subgoals. You keep applying tactics until no goals remain. The proof is the sequence of transformations, not a single flash of insight.
Think of it as a game. Your current position is the proof state: the facts you hold and the destination you seek. Each tactic is a legal move that changes your position. Some moves split one goal into two (like constructor creating two subgoals). Some moves close a goal entirely (like rfl finishing with a checkmate). You win when the board is empty.
Formally, a proof state is a judgment \(\Gamma \vdash G\): context \(\Gamma\), goal \(G\). A tactic transforms one proof state into zero or more new proof states. When no goals remain, the proof is complete. This table is your Rosetta Stone:
| Tactic | Before | After | Rule |
|---|---|---|---|
| intro h | \(\Gamma \vdash P \to Q\) | \(\Gamma, h:P \vdash Q\) | \(\to\)-intro |
| apply f | \(\Gamma \vdash Q\) | \(\Gamma \vdash P\) | \(\to\)-elim (given \(f : P \to Q\)) |
| exact h | \(\Gamma, h:P \vdash P\) | \(\square\) | assumption |
| rfl | \(\Gamma \vdash t = t\) | \(\square\) | refl |
| constructor | \(\Gamma \vdash P \land Q\) | \(\Gamma \vdash P\), \(\Gamma \vdash Q\) | \(\land\)-intro |
| left | \(\Gamma \vdash P \lor Q\) | \(\Gamma \vdash P\) | \(\lor\)-intro₁ |
| right | \(\Gamma \vdash P \lor Q\) | \(\Gamma \vdash Q\) | \(\lor\)-intro₂ |
| cases h | \(\Gamma, h:P \lor Q \vdash R\) | \(\Gamma, h:P \vdash R\), \(\Gamma, h:Q \vdash R\) | \(\lor\)-elim |
| induction n | \(\Gamma \vdash \forall n,\, P(n)\) | \(\Gamma \vdash P(0)\), \(\Gamma, ih:P(k) \vdash P(k{+}1)\) | Nat-ind |
| rw [h] | \(\Gamma, h:a=b \vdash P[a]\) | \(\Gamma, h:a=b \vdash P[b]\) | subst |
| simp | \(\Gamma \vdash G\) | \(\Gamma \vdash G'\) | rewrite* |
| contradiction | \(\Gamma, h:\bot \vdash P\) | \(\square\) | \(\bot\)-elim |
The symbol \(\square\) marks a completed goal. Multiple entries in the After column mean the tactic created subgoals. Read left to right: you have the state on the left, you apply the tactic, you must now prove everything on the right. This is the algebra of proof. Each tactic is a function from proof states to proof states, and a complete proof is a composition that maps your theorem to \(\square\).
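To watch the algebra in motion, here is a short proof with the state after each move recorded in comments. Every tactic in it is explained later in this article; for now, just follow the transformations.
-- Tracing the proof state through a composition of tactics
example (P Q : Prop) : P → Q → P ∧ Q := by
  -- ⊢ P → Q → P ∧ Q
  intro hp
  -- hp : P ⊢ Q → P ∧ Q
  intro hq
  -- hp : P, hq : Q ⊢ P ∧ Q
  constructor
  -- two goals: ⊢ P and ⊢ Q
  · exact hp
  · exact hq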
If the table above looks like both logic and programming, that is not a coincidence.
Proving vs Programming
The surprising insight is that proving and programming are the same activity viewed differently. A proof is a program. A theorem is a type. When you prove \(P \to Q\), you are writing a function that transforms evidence for \(P\) into evidence for \(Q\). This correspondence, the Curry-Howard isomorphism, means that logic and computation are two views of the same underlying structure:
| Logic | Programming |
|---|---|
| proposition | type |
| proof | program |
| \(P \to Q\) | function from P to Q |
| \(P \land Q\) | pair (P, Q) |
| \(P \lor Q\) | either P or Q |
| \(\top\) | unit type |
| \(\bot\) | empty type |
Every function you have written so far was secretly a proof. Every proof you write from now on is secretly a program. Two cultures, mathematicians and programmers, spoke the same language for decades without knowing it.
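You can watch the correspondence in a single pair of declarations: the same lambda term type-checks as a program and as a proof, with only the universe (Type versus Prop) changing. A small sketch, with illustrative names:
-- One lambda term, two readings: a program returning its first
-- argument, and a proof that A implies (B implies A)
def firstOfTwo (A B : Type) : A → B → A := fun a _ => a
theorem a_implies_b_implies_a (A B : Prop) : A → B → A := fun a _ => a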
Your First Proof
Let us prove something undeniably true: one plus one equals two.
-- Your very first proof: 1 + 1 = 2
theorem one_plus_one : 1 + 1 = 2 := by
rfl
-- Without tactics, just a direct proof term
theorem one_plus_one' : 1 + 1 = 2 := rfl
Whitehead and Russell famously required 362 pages of Principia Mathematica before reaching this result. We have done it in three characters. This is not because we are cleverer than Russell; it is because we inherited infrastructure. The Principia was an attempt to place all of mathematics on rigorous foundations, to banish the intuition and hand-waving that had allowed paradoxes to creep into set theory. It was a heroic, doomed effort: the notation was unreadable, the proofs were uncheckable by any human in finite time, and Gödel would soon prove that the program could never fully succeed. But the ambition was right. The ambition was to make mathematics a science of proof rather than a craft of persuasion.
A century later, the ambition survives in different form. We do not write proofs in Russell’s notation; we write them in languages that machines can check. The 362 pages compress to three characters not because the mathematics got simpler but because the verification got automated. What mathematicians have been writing all along was pseudocode: informal instructions meant for human execution, full of implicit steps and assumed context, correct only if the reader filled in the gaps charitably. We are finally compiling that pseudocode.
The keyword by enters tactic mode. Instead of writing a proof term directly, you give commands that build the proof incrementally. The tactic rfl (reflexivity) says “both sides of this equation compute to the same value, so they are equal.” Lean evaluates 1 + 1, gets 2, sees that 2 = 2, and accepts the proof. No faith required. No appeals to authority. The machine checked, and the machine does not lie.
Or does it? Ken Thompson’s Reflections on Trusting Trust demonstrated that a compiler can be trojaned to insert backdoors into code it compiles, including into future versions of itself. Turtles all the way down. At some point you trust the hardware, the firmware, the operating system, the compiler that compiled your proof assistant. We choose to stop the regress somewhere, not because the regress ends but because we must act in the world despite uncertainty. This is the stoic’s bargain: do the work carefully, verify what can be verified, and accept that perfection is not on offer. The alternative is paralysis, and paralysis builds nothing.
The Goal State
When you write proofs in Lean, the editor shows you the current goal state. This is your map, your honest accounting of where you stand. Unlike tests that can pass while bugs lurk, unlike documentation that drifts from reality, the goal state cannot lie. It tells you exactly what you have (hypotheses) and exactly what you need to prove (the goal). The gap between aspiration and achievement is always visible.
-- Demonstrating how the goal state changes
theorem add_zero (n : Nat) : n + 0 = n := by
rfl
theorem zero_add (n : Nat) : 0 + n = n := by
simp
When you place your cursor after by in add_zero, you see:
n : Nat
⊢ n + 0 = n
The line n : Nat is your context: the facts you know, the tools you have. The symbol ⊢ (turnstile) separates what you have from what you need. The goal n + 0 = n is your obligation. After applying rfl, the goal disappears. No goals means the proof is complete. You have caught your fish.
Reflexivity: rfl
The rfl tactic proves goals of the form \(a = a\) where both sides are definitionally equal. In inference rule notation:
\[ \frac{}{\Gamma \vdash a = a} \text{(refl)} \]
No premises above the line means the rule is an axiom: equality is reflexive, always, unconditionally. “Definitionally equal” means Lean can compute both sides to the same value without any lemmas. This is equality by computation, the most basic form of truth: run the program on both sides and see if you get the same answer.
-- rfl works when both sides compute to the same value
theorem two_times_three : 2 * 3 = 6 := by rfl
theorem list_length : [1, 2, 3].length = 3 := by rfl
theorem string_append : "hello " ++ "world" = "hello world" := by rfl
theorem bool_and : true && false = false := by rfl
def double (n : Nat) : Nat := n + n
theorem double_two : double 2 = 4 := by rfl
When rfl works, it means the equality is “obvious” to Lean’s computation engine. When it fails, you need other tactics to transform the goal into something rfl can handle.
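The boundary is worth seeing once. Nat.add recurses on its second argument, so n + 0 reduces to n, but 0 + n is stuck when n is a variable. The first proof below succeeds; the commented one fails:
-- rfl succeeds when reduction can proceed, fails when it is stuck
theorem add_zero_by_rfl (n : Nat) : n + 0 = n := by rfl
-- theorem zero_add_by_rfl (n : Nat) : 0 + n = n := by rfl  -- error!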
Triviality: trivial
The trivial tactic handles goals that are straightforwardly true. It combines several simple tactics and works well for basic logical facts.
theorem true_is_true : True := by
trivial
theorem one_le_two : 1 ≤ 2 := by
trivial
theorem and_true : True ∧ True := by
trivial
Simplification: simp
The simp tactic is your workhorse. It applies a database of hundreds of rewrite rules, accumulated over years by the mathlib community, to simplify the goal. This is collective knowledge made executable: every time someone proved that x + 0 = x or xs.reverse.reverse = xs, they added to the arsenal that simp deploys on your behalf.
theorem add_zero_simp (n : Nat) : n + 0 = n := by
simp
theorem zero_add_simp (n : Nat) : 0 + n = n := by
simp
theorem silly_arithmetic : (1 + 0) * (2 + 0) + 0 = 2 := by
simp
theorem list_append_nil {α : Type*} (xs : List α) : xs ++ [] = xs := by
simp
theorem use_hypothesis (a b : Nat) (h : a = b) : a + 1 = b + 1 := by
simp [h]
When simp alone does not suffice, you can give it additional lemmas: simp [lemma1, lemma2]. You can also tell it to use hypotheses from your context: simp [h].
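For instance, Mathlib's two_mul lemma (2 * a = a + a) is not in the default simp set, but you can hand it to simp directly. A small sketch, assuming Mathlib is imported:
-- Extending simp with a lemma outside its default set
theorem simp_with_lemma (n : Nat) : 2 * n + 0 = n + n := by
  simp [two_mul]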
Tip
When stuck, try simp first. It solves a surprising number of goals. If it does not solve the goal completely, look at what remains.
Introducing Assumptions: intro
When your goal is an implication \(P \to Q\), you assume \(P\) and prove \(Q\). This is the introduction rule for implication:
\[ \frac{\Gamma, P \vdash Q}{\Gamma \vdash P \to Q} \text{(→-intro)} \]
The comma means “extended context”: if you can prove \(Q\) assuming \(P\), then you can prove \(P \to Q\) outright. The intro tactic performs this transformation, moving the antecedent from goal to hypothesis.
-- intro: move assumptions from goal into context
theorem imp_self (P : Prop) : P → P := by
intro hp
exact hp
-- apply: use a lemma whose conclusion matches the goal
theorem imp_trans (P Q R : Prop) : (P → Q) → (Q → R) → P → R := by
intro hpq hqr hp
apply hqr
apply hpq
exact hp
-- intro with universal quantifiers
theorem forall_self (P : Nat → Prop) : (∀ n, P n) → (∀ n, P n) := by
intro h n
exact h n
After intro hp, the goal changes from P → P to just P, and you gain hypothesis hp : P. Multiple intro commands can be combined: intro h1 h2 h3 introduces three assumptions at once.
Applying Lemmas: apply and exact
The apply tactic uses the elimination rule for implication, also called modus ponens:
\[ \frac{\Gamma \vdash P \to Q \quad \Gamma \vdash P}{\Gamma \vdash Q} \text{(→-elim)} \]
If you have a proof of \(P \to Q\) and a proof of \(P\), you can derive \(Q\). When your goal is \(Q\) and you apply a hypothesis \(h : P \to Q\), Lean transforms your goal to \(P\). The exact tactic says “this term is exactly what we need” and closes the goal.
In the imp_trans proof, apply hqr transforms the goal from R to Q, because hqr : Q → R. Then apply hpq transforms Q to P. Finally exact hp provides the P we need. Each step is modus ponens, chained backward.
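The Curry-Howard reading makes this concrete: the same theorem, written as a proof term, is ordinary function composition, with each application a forward use of modus ponens.
-- imp_trans as a proof term: modus ponens is function application
theorem imp_trans' (P Q R : Prop) : (P → Q) → (Q → R) → P → R :=
  fun hpq hqr hp => hqr (hpq hp)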
Intermediate Steps: have
Sometimes you want to prove a helper fact before using it. The have tactic introduces a new hypothesis with its own proof. This is how knowledge accumulates: you establish a stepping stone, name it, and build on it.
-- have: introduce intermediate results
theorem have_demo (a b c : Nat) (h1 : a = b) (h2 : b = c) : a = c := by
have step : a = b := h1
rw [step, h2]
theorem sum_example (n : Nat) : n + n = 2 * n := by
have h : 2 * n = n + n := by ring
rw [h]
The pattern have name : type := proof adds name : type to your context.
Case Analysis: cases
When you have a value of an inductive type, cases splits the proof into one case per constructor. This is exhaustive reasoning: you consider every possible form the value could take, and you prove your claim holds in each. The compiler ensures you miss nothing. This is how careful decisions should be made: enumerate the possibilities, handle each one, leave no branch unexamined.
theorem bool_cases (b : Bool) : b = true ∨ b = false := by
cases b with
| true => left; rfl
| false => right; rfl
theorem nat_zero_or_succ (n : Nat) : n = 0 ∨ n ≥ 1 := by
cases n with
| zero => left; rfl
| succ m => right; simp
theorem option_destruct (o : Option Nat) : o = none ∨ ∃ n, o = some n := by
cases o with
| none => left; rfl
| some n => right; exact ⟨n, rfl⟩
For Bool, there are two cases: true and false. For Nat, there are two cases: zero and succ m. For Option, there are none and some n.
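The same tactic eliminates a disjunction hypothesis, the ∨-elim row of the Rosetta Stone table: one subgoal per way the disjunction could hold. A minimal example:
-- cases on a hypothesis h : P ∨ Q splits into the inl and inr cases
example (P Q : Prop) (h : P ∨ Q) : Q ∨ P := by
  cases h with
  | inl hp => right; exact hp
  | inr hq => left; exact hq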
Induction
For properties of natural numbers, mathematical induction is the fundamental principle:
\[ \frac{\Gamma \vdash P(0) \quad \Gamma, P(n) \vdash P(n+1)}{\Gamma \vdash \forall n,\, P(n)} \text{(Nat-ind)} \]
Prove the base case \(P(0)\). Then prove the inductive step: assuming \(P(n)\), show \(P(n+1)\). From these two finite proofs, you derive a statement about infinitely many numbers. The induction tactic generates both proof obligations automatically. The principle dates to Pascal and Fermat, but the mechanization is new.
theorem sum_twice (n : Nat) : n + n = 2 * n := by
induction n with
| zero => rfl
| succ n ih => omega
theorem length_append {α : Type*} (xs ys : List α) :
(xs ++ ys).length = xs.length + ys.length := by
induction xs with
| nil => simp
| cons x xs ih => simp only [List.cons_append, List.length_cons, ih]; omega
In the succ case, you get an induction hypothesis ih that assumes the property holds for n, and you must prove it holds for n + 1.
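Here is the induction hypothesis at work once more, with the context and goal spelled out as comments. The theorem name is illustrative, a sketch using lemmas from the core library:
-- The induction hypothesis in action, states traced in comments
theorem map_length' {α β : Type*} (f : α → β) (xs : List α) :
    (xs.map f).length = xs.length := by
  induction xs with
  | nil => rfl
  | cons x xs ih =>
    -- ih   : (xs.map f).length = xs.length
    -- goal : ((x :: xs).map f).length = (x :: xs).length
    simp only [List.map_cons, List.length_cons, ih]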
Arithmetic: omega
For goals involving linear arithmetic over natural numbers or integers, omega is powerful. It implements a decision procedure for Presburger arithmetic, a fragment of number theory that is provably decidable. Within its domain, omega does not search or guess; it decides.
theorem omega_simple (n : Nat) (h : n < 10) : n < 100 := by
omega
theorem omega_transitive (a b c : Int) (h1 : a < b) (h2 : b < c) : a < c := by
omega
theorem omega_sum (x y : Nat) (h : x + y = 10) : x ≤ 10 := by
omega
If your goal involves only addition, subtraction, multiplication by constants, and comparisons, try omega.
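The boundary matters: multiplication by a literal constant is still linear, but a product of two variables is not. A small sketch of what stays in scope:
-- Multiplication by constants is linear, so omega handles it;
-- a goal mentioning x * y would be outside its domain
theorem omega_scaled (x y : Nat) (h : 2 * x + 3 * y ≤ 12) : x ≤ 6 := by
  omega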
Decision Procedures: decide
For decidable propositions, decide simply computes the answer. Is 7 less than 10? Run the comparison. Is this list empty? Check. Some questions have algorithms that answer them definitively, and decide invokes those algorithms. When it works, there is nothing to prove; the computation is the proof.
theorem three_lt_five : (3 : Nat) < 5 := by
decide
theorem bool_compute : (true && false) = false := by
decide
theorem list_membership : 3 ∈ [1, 2, 3, 4, 5] := by
decide
theorem fin_in_bounds : (2 : Fin 5).val < 5 := by
decide
Putting It Together
Real proofs combine multiple tactics. You introduce assumptions, simplify, split cases, apply lemmas, and close with computation. The art is knowing which tool fits which moment. With practice, patterns emerge: implications call for intro, equalities for rw or simp, inductive types for cases or induction. The goal state guides you.
theorem worked_example (n : Nat) : n + 0 = 0 + n := by
simp
theorem worked_example2 (a b : Nat) (h : a = b) : a + 1 = b + 1 := by
rw [h]
theorem combined_proof (n : Nat) (h : n > 0) : n - 1 + 1 = n := by
omega
theorem list_nonempty (xs : List Nat) (h : xs ≠ []) : xs.length > 0 := by
cases xs with
| nil => contradiction
| cons x xs' => simp
The Tactics You Need
| Tactic | Purpose |
|---|---|
| rfl | Prove a = a when both sides compute to the same value |
| trivial | Prove obviously true goals |
| simp | Simplify using rewrite rules |
| intro | Assume hypotheses from implications and universals |
| apply | Use a lemma whose conclusion matches the goal |
| exact | Provide exactly the term needed |
| have | Introduce intermediate results |
| cases | Split on constructors of inductive types |
| induction | Prove by induction on recursive types |
| omega | Solve linear arithmetic |
| decide | Compute decidable propositions |
| rw | Rewrite using an equality |
These twelve tactics will carry you through most of what follows.
Exercises
The best way to learn tactics is to use them. These exercises progress from straightforward applications of single tactics to combinations that require reading the goal state carefully.
-- Exercise 1: Use rfl to prove this computation
theorem ex_rfl : 3 * 4 = 12 := by
rfl
-- Exercise 2: Use simp to simplify this expression
theorem ex_simp (n : Nat) : n * 1 + 0 = n := by
simp
-- Exercise 3: Use intro and exact
theorem ex_intro_exact (P Q : Prop) (h : P) (hpq : P → Q) : Q := by
exact hpq h
-- Exercise 4: Use cases to prove this about booleans
theorem ex_bool_not_not (b : Bool) : !!b = b := by
cases b <;> rfl
-- Exercise 5: Use induction to prove addition is commutative
theorem ex_add_comm (n m : Nat) : n + m = m + n := by
induction n with
| zero => simp
| succ n ih => omega
-- Exercise 6: Use omega to prove this inequality
theorem ex_omega (x y : Nat) (h1 : x ≤ 5) (h2 : y ≤ 3) : x + y ≤ 8 := by
omega
-- Exercise 7: Combine multiple tactics
theorem ex_combined (xs : List Nat) : ([] ++ xs).length = xs.length := by
simp
-- Exercise 8: Prove implication transitivity
theorem ex_imp_chain (A B C D : Prop) : (A → B) → (B → C) → (C → D) → A → D := by
intro hab hbc hcd ha
exact hcd (hbc (hab ha))
-- Exercise 9: Use cases on a natural number
theorem ex_nat_lt (n : Nat) : n = 0 ∨ 0 < n := by
cases n with
| zero => left; rfl
| succ m => right; omega
-- Exercise 10: Prove a property about list reversal
theorem ex_reverse_nil : ([] : List Nat).reverse = [] := by
rfl
If you get stuck, ask yourself: what is the shape of my goal? What tactic handles that shape? What hypotheses do I have available? The Infoview is your guide.
What Comes Next
You can now prove things. The proofs have been simple, but the mental model is in place. You understand goals, hypotheses, and the tactic dance that connects them. Next we introduce type theory and dependent types, the language for stating claims worth proving.
Project: The Liar’s Trap
Try to prove something false:
-- Try to prove something false. Every tactic will fail.
theorem liar : 0 = 1 := by
sorry -- Try: rfl, simp, omega, decide. Nothing works.
-- The goal state shows: ⊢ 0 = 1
-- This goal is unprovable because it is false.
Every tactic fails. rfl cannot make 0 equal 1. simp finds nothing to simplify. omega knows arithmetic and refuses. decide computes the answer and it is false. The goal state sits there, immovable: ⊢ 0 = 1. You can stare at it, curse at it, try increasingly desperate combinations. Nothing works because nothing can work. The machine will not let you prove a falsehood.
This is the point. The compiler is not your collaborator; it is your adversary. It checks every step and rejects handwaving. When someone tells you their code is correct, you can ask: does it typecheck? When someone tells you their proof is valid, you can ask: did the machine accept it? The answers are not always the same, but when they are, you know something real.
Project: De Morgan’s Little Theorem
Augustus De Morgan formalized the laws that bear his name in the 1850s: the negation of a conjunction is the disjunction of negations, and vice versa. Every programmer knows these laws intuitively from boolean expressions. Let us prove one.
-- De Morgan's Law: ¬(P ∧ Q) → (¬P ∨ ¬Q)
theorem demorgan (P Q : Prop) (h : ¬(P ∧ Q)) : ¬P ∨ ¬Q := by
by_cases hp : P
· -- Case: P is true
right
intro hq
apply h
constructor
· exact hp
· exact hq
· -- Case: P is false
left
exact hp
The proof proceeds by case analysis. We have h : ¬(P ∧ Q), a proof that P ∧ Q is false. We must show ¬P ∨ ¬Q. The by_cases tactic splits on whether P holds, an appeal to classical logic's excluded middle:
- If P is true (call this hp), we go right and prove ¬Q. Why? If Q were true, then P ∧ Q would be true, contradicting h. So ¬Q.
- If P is false, the hypothesis hp now carries a proof of ¬P; we go left and prove ¬P directly. We have it: hp.
Each branch uses tactics from this article: intro, apply, exact, left, right, constructor. Read the proof slowly, watch the goal state at each step, and trace how the logical structure maps to the tactic sequence. This is the texture of real mathematics: case splits, contradictions, and the steady narrowing of possibilities until only truth remains.
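For a last glimpse of the Curry-Howard view, the same theorem compresses to a proof term. A sketch using the dependent if-then-else, with open Classical supplying decidability for an arbitrary proposition:
-- demorgan as a compact proof term
open Classical in
theorem demorgan' (P Q : Prop) (h : ¬(P ∧ Q)) : ¬P ∨ ¬Q :=
  if hp : P then Or.inr (fun hq => h ⟨hp, hq⟩) else Or.inl hp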
De Morgan died in 1871. His laws persist in every boolean expression, every logic gate, every conditional branch. The proofs you write today will outlast you the same way. There is comfort in that. The work endures because the truths it captures are timeless. You are not just writing code that might be obsolete next year. You are building something permanent.