Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Why?

Software keeps getting more complex. Every year brings more dependencies, more attack surface, more emergent behavior no one designed or intended. We build systems by stacking abstractions, trusting that each layer does what it claims. And this mostly works, except when it doesn’t. Sometimes an algorithm trades against itself and the stock market loses a trillion dollars in thirty-six minutes. Sometimes a missing bounds check in a cryptographic library lets attackers read arbitrary memory from half the servers on Earth. The complexity is the natural consequence of building systems larger than any individual can hold in their head, maintained by teams that turn over faster than the code does.

And now we are increasingly automating the production of code itself. Large language models now generate plausible programs at unprecedented scale. They have read every GitHub repository, every Stack Overflow answer, every tutorial and textbook. They produce code that looks right, compiles often, and works sometimes. They are plausibility generators: statistical engines that have learned what code typically looks like without understanding what code actually means. The gap between “looks correct” and “is correct” has always mattered. It is about to matter more.

One response is to build richer formal systems that can constrain the chaos. Languages where the compiler verifies rather than merely translates. Type systems that encode invariants the programmer used to track mentally. Proof assistants that check whether code does what it claims, mechanically and exhaustively. The map must match the territory, and these tools force you to make the correspondence explicit. They also give us something to point at the output of language models: a verifier that can check whether generated code satisfies its specification, turning plausibility into proof.

Lean sits at this intersection: a theorem prover backed by a million-line mathematical library and a general-purpose programming language fast enough for production use. You can prove theorems about prime numbers in the morning and write a web server in the afternoon, using the same language, the same tools, the same mental model. Write a claim, and the compiler tells you whether it holds. No trust required. No hand-waving accepted.

If Dwarf Fortress is the most complex simulation ever built and Factorio is crystallized obsession in game form, then Lean is what happens when you point that same energy at the foundations of mathematics. Like Dwarf Fortress, losing is fun, because even a failed proof attempt teaches you something about why it failed. Like Factorio, there is always one more lemma to optimize, one more proof to refactor, one more elegant solution lurking just beyond the current mess.

So why learn Lean then? Maybe you want to get a taste of what the future of mathematics looks like. Maybe you want to learn what the engineers at top quant firms and AI labs are building, because that seems like a reasonable bet on where things are heading. Maybe you want to build high-assurance software to model and trade markets. Maybe you have no practical reason at all and just want to learn something new. All of these work. There is no entrance exam.

There are also practical reasons. Start small: prove that a parser handles every edge case, that a compiler preserves meaning, that a refactor did not silently break an invariant. These are theorems about code you actually write. Scale up from there. The world runs on systems that allocate resources: power grids, packet routing, markets matching buyers and sellers. A bug in market infrastructure misallocates capital, distorts prices, rewards the wrong participants. The matching algorithm that assigns medical residents to hospitals, the slot allocation system that schedules aircraft across congested airspace, high-assurance trading systems: these are theorems waiting to be verified. The gap between “we tested it extensively” and “we proved it correct” is the gap between confidence and certainty.

Most people do not need dependent type theory or formal verification for their day jobs. But necessity is a boring criterion for what to learn. Humans decipher Linear B, collide particle jets to conjure baryons from vacuum, compose fugues, map the cosmic microwave background to the first seconds after time began, and verify QED’s prediction of the electron magnetic moment to twelve decimal places. None of these have direct economic outcomes (yet), yet we do them anyway. Much of this text is about similar pursuits that are still early and straddle the line between practical and theoretical, but this is where the future is always born.

But what makes this cluster of ideas so rewarding is the overarching big ideas and trends. When a proof type-checks, it is correct. The compiler does not care about your reputation or your confidence. It cares whether the logic holds. You discover what you actually understand versus what you thought you understood. A proof assistant is a bicycle for the mind: it amplifies your ability to think correctly.

Lean will not let you wave your hands. It will not accept “it is obvious that”, “because of abstract nonsense” or “this margin is too narrow” as justifications. The feedback loop is immediate and unforgiving. This can be frustrating, but it is also what makes the successes satisfying. When a proof finally compiles, you know it works.

There is a kind of peace in this. The world outside is full of claims that cannot be verified, arguments that cannot be resolved, systems that fail in ways no one predicted. Inside a proof assistant, the rules are clear. A theorem holds or it does not. The machine tells you which. You can build something that will still be correct in a hundred years, long after the context that motivated it has faded. The difficulty is part of the appeal. Easy things do not teach you much, and they do not last.

If this sounds interesting, keep reading. If not, there are plenty of other things worth learning.

Why Lean Specifically?

Among proof assistants, several mature options exist. Coq has the largest ecosystem and decades of industrial use. Isabelle offers powerful automation and a massive library of formalized mathematics. Agda provides elegant dependent types with a minimalist core. Each has devoted communities and hard-won expertise. The choice between them involves tradeoffs: Coq’s tactic language is battle-tested but showing its age; Isabelle’s automation can feel like magic until it fails mysteriously; Agda prioritizes purity over pragmatism.

Lean 4 occupies a distinctive position. The system emerged from Microsoft Research in 2013 and has evolved through four major versions. Lean 4, released in 2021, was a ground-up rewrite that reimagined the tool as both proof assistant and practical programming language. The implementation is largely written in Lean itself, a feat of bootstrapping that demonstrates the language’s capabilities. The mathematical library, Mathlib, contains over a million lines of formalized mathematics spanning algebra, analysis, topology, number theory, and beyond.

Lean has better tooling than other theorem provers. The error messages are informative. The IDE integration works. You can ask for hints, search the library, and see exactly what remains to prove.

This article series is my (admittedly flawed) attempt to take you from zero knowledge of Lean to writing your own proofs and programs, with enough depth to tackle real problems. Whether you finish it or abandon it halfway through, you will have spent your time on something worthwhile.