Artificial Intelligence

In 2024, a computer solved one of the hardest problems at the International Mathematical Olympiad with a formally verified proof. In 2025, another hit gold-medal standard. The proofs were checked down to the axioms. No trust required. The interesting part is not that machines beat humans at competition math. The interesting part is that reason is becoming executable at scale, and that changes everything.

Mathlib now contains 1.9 million lines of formalized mathematics spanning algebra, analysis, topology, and number theory. It grows by thousands of theorems monthly. No single person understands all of it, and no single person needs to. The theorem you formalize today may be imported by a researcher fifty years from now working on problems we cannot imagine. The proof will still check. Meanwhile, neural networks have learned to propose proof steps that formal systems verify. The model guesses, the kernel checks. DeepSeek-Prover and LeanDojo make this practical today. PhysLean is formalizing physics itself: Maxwell’s equations, quantum mechanics, field theory. The tooling has matured faster than most expected.

We should be honest about limits. Higher-order logic is undecidable. Church and Turing settled this in 1936. Formalization is expensive: the Polynomial Freiman-Ruzsa conjecture required 20,000 lines of Lean for a 50-page paper. Some domains resist entirely. Physics says “for large N” and expects you to understand. But within scope, something remarkable becomes possible: certainty. Not high confidence. Certainty. The proof typechecks or it does not. seL4 runs in critical systems with mathematical proof it cannot crash. CompCert compiles C with proof the assembly matches the source. These are not research toys. They are deployed in critical infrastructure.

This work gets funded because it pushes the frontiers of human knowledge. Science foundations fund theorem provers because they see infrastructure for the future of mathematics. Trading firms fund them because they need systems that actually work. Both are right. Knight Capital lost $440 million in 45 minutes from a deployment bug. The code did exactly what it was written to do. It was simply written wrong. For firms whose existence depends on algorithm correctness, formal methods are existential.

This is where it gets interesting, at least if you find the intersection of mechanism design, game theory, and formal methods interesting, which you should, because it is genuinely one of the most exciting open problems in theoretical computer science and also immediately practical. Markets are mathematical objects. Combinatorial auctions, where bidders express preferences over bundles rather than individual items, turn resource allocation into constraint satisfaction problems. (Shameless plug: OneChronos builds these for financial markets, and if you are the kind of person who reads articles like this for fun, we should talk.) The winner determination problem reduces to weighted set packing: find the best non-overlapping selection from exponentially many candidates. NP-hard in general, but tractable instances exist, and the boundary between hard and easy is where the interesting mathematics lives. Proving properties about these systems, that they are incentive-compatible, that they converge, that they allocate efficiently under stated assumptions, is exactly the kind of problem where formal verification shines. Every improvement in market mechanism design, every formally verified property of an auction protocol, translates into real systems allocating real resources. Better reasoning about markets means systems that allocate capital more efficiently, and efficient allocation is the difference between prosperity and stagnation.
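To make the reduction concrete, here is a brute-force sketch of winner determination as weighted set packing. The types and names (BundleBid, winners) are toy scaffolding for illustration, not anyone's production mechanism:

/-- A combinatorial bid: a bundle of item indices and a value for the
    whole bundle. Toy types for illustration. -/
structure BundleBid where
  items : List Nat
  value : Nat

/-- Two bids conflict if their bundles share an item. -/
def conflicts (a b : BundleBid) : Bool :=
  a.items.any (b.items.contains ·)

/-- A selection is feasible if its bundles are pairwise disjoint. -/
def feasible : List BundleBid → Bool
  | [] => true
  | b :: rest => rest.all (fun b' => !conflicts b b') && feasible rest

/-- All subsets of a list: the source of the exponential blowup. -/
def subsets : List α → List (List α)
  | [] => [[]]
  | x :: xs => subsets xs ++ (subsets xs).map (x :: ·)

/-- Winner determination by brute force: the best feasible subset.
    Exponential in the number of bids, which is the point. The general
    problem is NP-hard; real solvers exploit structure instead. -/
def winners (bids : List BundleBid) : List BundleBid :=
  ((subsets bids).filter feasible).foldl
    (fun best s =>
      if (s.map (·.value)).sum > (best.map (·.value)).sum then s else best)
    []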

The functional programming shops of Wall Street, the quant firms in London and Hong Kong, the AI labs in China, all contribute to this ecosystem. DeepSeek’s open-source theorem provers emerged from it. The competition is global but the infrastructure is shared. A trading firm in New York open-sources a proof automation library; researchers in Beijing build on it. An AI lab in Hangzhou releases trained models; mathematicians in Paris use them. Private incentive aligns with public good. The tools developed for trading algorithms can verify medical devices. The techniques refined for financial models can prove properties of cryptographic protocols. And as AI infrastructure itself becomes tradeable, as markets emerge for GPU compute, data center capacity, and model inference, the same auction theory applies. The resources that train the models are allocated by the mechanisms the models might one day verify.

There is a future taking shape. AI agents will increasingly act in markets: trading, lending, allocating, optimizing. This is already happening. The question is not whether but how. An AI agent can be constrained by rules, but only if those rules are precise enough to check. Natural language policies are suggestions. Formally verified constraints are guarantees. Imagine market infrastructure where agents must prove, before executing, that their actions satisfy regulatory constraints, risk limits, fairness properties. Not “we reviewed the code” but “the system verified the proof.” The agent that cannot demonstrate compliance cannot act. Formal constraints are not a limitation on AI autonomy. They are the condition that makes AI autonomy safe.
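As a minimal sketch of what proof-carrying actions could look like. The names and the risk predicate here are hypothetical, not a real rulebook:

/-- A proposed trade, in toy form. -/
structure ProposedTrade where
  price : Nat
  quantity : Nat

/-- A hypothetical risk limit: notional must not exceed a cap. -/
abbrev withinRiskLimit (t : ProposedTrade) (maxNotional : Nat) : Prop :=
  t.price * t.quantity ≤ maxNotional

/-- A trade the infrastructure will accept. The compliance proof is a
    field of the value itself: an agent that cannot produce the proof
    cannot construct the action at all. -/
structure VerifiedTrade (maxNotional : Nat) where
  trade : ProposedTrade
  compliant : withinRiskLimit trade maxNotional

/-- The obligation is often dischargeable automatically. -/
def ok : VerifiedTrade 1000000 :=
  ⟨⟨100, 50⟩, by decide⟩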

We are building that infrastructure now, whether we recognize it or not. Every verified auction protocol, every theorem about market equilibria, becomes a potential constraint on future autonomous systems. AI agents in markets are not a hypothetical. They are here, and more are coming. The practical question is not whether to allow them but how to make them work well. Formal verification offers something concrete: constraints that actually constrain, rules that cannot be silently violated, guarantees that hold regardless of what the model learned. Skip the existential risk debates. The point is engineering systems that do what they are supposed to do.

If you are reading this as a student or someone early in your career: this stuff is fun. Watching a proof come together, seeing the goal state shrink to nothing, getting that green checkmark from the compiler when everything finally clicks. It is like solving puzzles, except the puzzles are deep and the solutions last. The theorems you formalize will still be valid when you are gone. That is a strange thing to be able to say about your work. The field is small enough that you can make real contributions and growing fast enough that there is plenty to do.

The work is hard. The learning curve is real. There will be days when the goal state mocks you and nothing seems to work. This is normal. The difficulty is not a bug; it is the cost of building things that matter. Scientists have always endured frustration because progress depends on it. The stoic response is not to complain but to continue: one lemma at a time, one proof at a time, one small piece of certainty added to the edifice. The edifice grows. It has been growing for centuries, and you can be part of it.

Jump in.


Open Problem: Verified Auction Theory

Here is a concrete challenge. The code below implements a simple batch auction: buyers and sellers submit orders, the market clears at a uniform price. We can prove safety properties (buyers never pay more than they bid, sellers never receive less than they asked). But the hard problem remains open.

/-- A limit order: price and quantity -/
structure Order where
  price : Nat      -- Price in basis points (avoids floats)
  quantity : Nat   -- Number of units
  deriving DecidableEq, Repr

/-- An order book: sorted buy orders (descending) and sell orders (ascending) -/
structure OrderBook where
  bids : List Order  -- Buy orders, highest price first
  asks : List Order  -- Sell orders, lowest price first
  deriving Repr

/-- Result of a batch auction -/
structure ClearingResult where
  clearingPrice : Nat
  volume : Nat           -- Total units traded
  buyFills : List Nat    -- Quantities filled per bid (same order as input)
  sellFills : List Nat   -- Quantities filled per ask
  deriving Repr

The clearing mechanism finds a price that maximizes trading volume:

/-- Find total demand at a given price -/
def demandAtPrice (bids : List Order) (p : Nat) : Nat :=
  (bids.filter (·.price ≥ p)).map (·.quantity) |>.sum

/-- Find total supply at a given price -/
def supplyAtPrice (asks : List Order) (p : Nat) : Nat :=
  (asks.filter (·.price ≤ p)).map (·.quantity) |>.sum

/-- Simple clearing: find price where supply meets demand -/
def clearBatch (book : OrderBook) : ClearingResult :=
  -- Collect all prices as potential clearing prices
  let prices := (book.bids.map (·.price) ++ book.asks.map (·.price)).eraseDups
  -- Find price that maximizes volume (simplified)
  let bestPrice := prices.foldl (fun best p =>
    let vol := min (demandAtPrice book.bids p) (supplyAtPrice book.asks p)
    let bestVol := min (demandAtPrice book.bids best) (supplyAtPrice book.asks best)
    if vol > bestVol then p else best) 0
  let volume := min (demandAtPrice book.bids bestPrice) (supplyAtPrice book.asks bestPrice)
  -- Simplified fill allocation (pro-rata would be more realistic)
  let buyFills := book.bids.map (fun o => if o.price ≥ bestPrice then o.quantity else 0)
  let sellFills := book.asks.map (fun o => if o.price ≤ bestPrice then o.quantity else 0)
  { clearingPrice := bestPrice, volume, buyFills, sellFills }
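A quick sanity check with made-up numbers shows both the mechanism and the caveat in the fill-allocation comment above:

/-- A tiny hypothetical book: two buyers, two sellers. -/
def exampleBook : OrderBook :=
  { bids := [⟨10200, 5⟩, ⟨10100, 3⟩],   -- buyers, highest price first
    asks := [⟨10000, 4⟩, ⟨10150, 6⟩] }  -- sellers, lowest price first

#eval clearBatch exampleBook
-- Expected: { clearingPrice := 10200, volume := 5,
--             buyFills := [5, 0], sellFills := [4, 6] }
-- Note the simplified fill rule: total sell fills (10) exceed volume (5).
-- A pro-rata allocation would cap fills at the cleared volume.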

Safety properties are tractable:

/-- Safety: buyers never pay more than their bid -/
theorem buyers_pay_at_most_bid (book : OrderBook) :
    let result := clearBatch book
    ∀ i : Fin book.bids.length,
      (result.buyFills.getD i 0 > 0) →
      result.clearingPrice ≤ (book.bids.get i).price := by
  sorry  -- Exercise: prove this

/-- Safety: sellers receive at least their ask -/
theorem sellers_receive_at_least_ask (book : OrderBook) :
    let result := clearBatch book
    ∀ i : Fin book.asks.length,
      (result.sellFills.getD i 0 > 0) →
      result.clearingPrice ≥ (book.asks.get i).price := by
  sorry  -- Exercise: prove this

The open problem is incentive compatibility: proving that truthful bidding is optimal. A mechanism is dominant-strategy incentive compatible (DSIC) if no bidder can improve their outcome by misreporting their true value, regardless of what others do.

/-- Utility for a buyer: units acquired times surplus -/
def buyerUtility (trueValue : Nat) (unitsFilled : Nat) (pricePaid : Nat) : Int :=
  unitsFilled * (trueValue - pricePaid : Int)

/-!
## The Open Problem: Incentive Compatibility

The safety proofs above are tractable. The hard question is:
**Is truthful bidding optimal?**

A mechanism is dominant-strategy incentive compatible (DSIC) if:
for any bidder, bidding their true value maximizes utility
regardless of what others bid.

To prove DSIC in Lean requires quantifying over ALL alternative bids
and showing none improve utility. This is where dependent types meet
mechanism design, and no one has done it comprehensively.

The VCG mechanism is DSIC but not budget-balanced.
Real exchanges use uniform-price auctions which are NOT exactly DSIC.
The open problem: characterize the manipulation bounds and prove them.

If you formalize incentive compatibility for batch auctions in Lean,
you will have done something genuinely new.
-/

/-- The conjecture: truthful bidding is approximately optimal.
    Proving this requires formalizing "approximately" and handling
    the full space of strategic deviations. -/
theorem incentive_compatibility
    (book : OrderBook)
    (i : Fin book.bids.length)
    (trueValue : Nat)
    (alternativeBid : Order) :
    -- Utility from bidding true value ≥ utility from any alternative
    -- (This statement is incomplete - a real proof needs the full setup)
    True := by
  sorry  -- Open problem: make this statement precise and prove it

This is where mechanism design meets dependent types. The Vickrey-Clarke-Groves (VCG) mechanism is the textbook solution: each participant pays the externality they impose on others, making truthful bidding a dominant strategy. The second-price auction is the simplest VCG instance. You bid your true value because winning at your bid costs you exactly what you would have paid anyway, and shading your bid only risks losing an auction you should have won. The math is elegant and the incentives are aligned.
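The two-bidder second-price case is small enough to state and prove outright, which makes a nice contrast with the open problem. A sketch, with names of my own choosing:

/-- Second-price utility for one bidder against one rival: win and pay
    the rival's bid, or lose and get nothing. -/
def vickreyUtility (trueValue myBid otherBid : Nat) : Int :=
  if myBid > otherBid then (trueValue : Int) - otherBid else 0

/-- Truthful bidding weakly dominates every alternative bid. -/
theorem vickrey_truthful (trueValue otherBid altBid : Nat) :
    vickreyUtility trueValue altBid otherBid ≤
    vickreyUtility trueValue trueValue otherBid := by
  unfold vickreyUtility
  split <;> split <;> omega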

But VCG has problems in practice. It is not budget-balanced: the auctioneer may collect less than they pay out, or vice versa. It requires solving the winner determination problem optimally, which is NP-hard for combinatorial auctions. And it is vulnerable to collusion and shill bidding in ways that uniform-price auctions are not. Real exchanges use uniform-price batch auctions, which sacrifice exact incentive compatibility for computational tractability and budget balance. The open problem is characterizing exactly how much manipulation is possible and proving bounds on strategic gain.

Formalizing this in Lean requires expressing “for all alternative bids, utility does not increase” as a dependent type, then proving it holds. The quantification over counterfactual bids is where the difficulty lives. This is genuinely open research at the intersection of mechanism design and formal verification.
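To show roughly what that statement looks like, here is one way to set up the quantification. The utility parameter is assumed scaffolding standing in for the full pipeline of clearBatch, fills, and buyerUtility; making it precise is exactly the open problem:

/-- Replace the i-th bid in a book (a helper for stating DSIC). -/
def replaceBid (book : OrderBook) (i : Nat) (o : Order) : OrderBook :=
  { book with bids := book.bids.set i o }

/-- A DSIC skeleton: for every book, bidder, true value, and alternative
    bid, truthful bidding yields at least as much utility. -/
def DSIC (utility : OrderBook → Nat → Nat → Int) : Prop :=
  ∀ (book : OrderBook) (i trueValue : Nat) (alt : Order),
    utility (replaceBid book i alt) i trueValue ≤
    utility (replaceBid book i ⟨trueValue, alt.quantity⟩) i trueValue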

The intersection of neural theorem proving and auction theory is also wide open. Interesting things happen when you point AI at mechanism design.

If you have worked your way to the end of this article, you are exactly the kind of person we want to talk to. OneChronos builds verified auction infrastructure for financial markets. We work on exactly these problems: combinatorial auctions, mechanism design, formal verification of trading systems. If formalizing incentive compatibility sounds like fun rather than work, reach out to OneChronos careers or to me directly: Stephen Diehl.


Resources

Formalized Mathematics

  • Mathlib: The formalized mathematics library
  • PhysLean: Formalizing physics

Neural Theorem Proving

  • DeepSeek-Prover: Open-source models for neural theorem proving in Lean
  • LeanDojo: Open-source tooling for training and evaluating LLM provers on Lean

Interactive Proving with MCP

The Model Context Protocol (MCP) standardizes how AI assistants interact with external tools. For theorem proving, this means LLMs can read goal states, query for relevant lemmas, and build proofs interactively rather than generating them blind. The lean-lsp-mcp server exposes Lean’s language server to AI agents, enabling access to diagnostics, goal states, hover documentation, and search tools like Loogle and LeanSearch.

Tip

Setup for Claude Code (run in your Lean project root):

claude mcp add lean-lsp uvx lean-lsp-mcp

Setup for Cursor/VS Code (add to MCP settings):

{ "mcpServers": { "lean-lsp": { "command": "uvx", "args": ["lean-lsp-mcp"] } } }

Requires the uv package manager. Run lake build before starting the server.

Tools