Implementation
Ok, now we move beyond the trivial type systems from the 1970s and into the fun stuff from the 1980s! System F’s bidirectional type checking represents a approach to handling polymorphic types without requiring full type annotations everywhere. The bidirectional approach splits type checking into two complementary modes: inference (synthesizing types from expressions) and checking (verifying that expressions conform to expected types). This division allows the system to gracefully handle situations where types are partially known or completely unknown, making the language more ergonomic while preserving type safety.
Typing Rules
Before diving into the implementation details, let’s establish the formal typing rules that govern System F. Buckle up, because we’re about to embark into the fun magical land of type-level wizardry! We’ll be introducing a few new symbols that might look intimidating at first, but they’re really not that scary once you get used to them.
-
\( \Gamma \) (Gamma) - The typing context, which is like a dictionary that maps variables to their types and keeps track of what we know so far.
-
\( \vdash \) (Turnstile) - The “proves” or “entails” symbol. When we write \( \Gamma \vdash e \Rightarrow A \), we’re saying “given context \( \Gamma \), expression \( e \) synthesizes type \( A \).”
-
\( \Rightarrow \) (Double Right Arrow) - Inference mode, where we’re asking “what type does this expression have?” The type checker figures it out for us.
-
\( \Leftarrow \) (Double Left Arrow) - Checking mode, where we’re saying “please verify this expression has the expected type.” We already know what type we want.
-
\( \forall \) (Forall) - Universal quantification, meaning “for any type.” When we see \( \forall \alpha. A \), it means “for any type \( \alpha \), we have type \( A \).”
-
\( \hat{\alpha} \) (Hat Alpha) - Existential type variables, which are like type-level unknowns that the system solves during inference. Think of them as placeholders that get filled in later.
-
\( \bullet \) (Bullet) - The application judgment symbol used in our inference rules. When we write \( A \bullet e \Rightarrow B \), we’re saying “applying type \( A \) to expression \( e \) yields type \( B \).”
-
\( <: \) (Subtype) - The subtyping relation, expressing that one type is “more specific” than another. For example, \( \text{Int} <: \forall \alpha. \alpha \) would mean Int is a subtype of the polymorphic type.
-
\( [B/\alpha]A \) - Type substitution, replacing all occurrences of type variable \( \alpha \) with type \( B \) in type \( A \). This is how we instantiate polymorphic types.
Now that we’ve equipped ourselves with this symbolic toolkit, let’s see how these pieces combine to create the elegant machinery of System F type checking.
Basic Rules
The variable rule looks up types from the context:
\[ \frac{x : A \in \Gamma}{\Gamma \vdash x \Rightarrow A} \text{(T-Var)} \]
Application checks that the function type matches the argument:
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \to B \quad \Gamma \vdash e_2 \Leftarrow A}{\Gamma \vdash e_1 ; e_2 \Rightarrow B} \text{(T-App)} \]
Lambda abstraction introduces a new variable binding:
\[ \frac{\Gamma, x : A \vdash e \Leftarrow B}{\Gamma \vdash \lambda x. e \Leftarrow A \to B} \text{(T-Abs)} \]
Polymorphic Rules
Universal introduction allows us to generalize over type variables:
\[ \frac{\Gamma, \alpha \vdash e \Leftarrow A}{\Gamma \vdash e \Leftarrow \forall \alpha. A} \text{(T-ForallI)} \]
Universal elimination instantiates polymorphic types:
\[ \frac{\Gamma \vdash e \Rightarrow \forall \alpha. A}{\Gamma \vdash e \Rightarrow [B/\alpha]A} \text{(T-ForallE)} \]
Type annotation allows switching from checking to inference mode:
\[ \frac{\Gamma \vdash e \Leftarrow A}{\Gamma \vdash (e : A) \Rightarrow A} \text{(T-Instr)} \]
Primitive Type Rules
Integer literals have type Int:
\[ \frac{}{\Gamma \vdash n \Rightarrow \text{Int}} \text{(T-LitInt)} \]
Boolean literals have type Bool:
\[ \frac{}{\Gamma \vdash \text{true} \Rightarrow \text{Bool}} \text{(T-LitBool)} \]
Control Flow Rules
Let bindings introduce local variables:
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \quad \Gamma, x : A \vdash e_2 \Rightarrow B}{\Gamma \vdash \text{let } x = e_1 \text{ in } e_2 \Rightarrow B} \text{(T-Let)} \]
Conditional expressions require Bool conditions and matching branch types:
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Bool} \quad \Gamma \vdash e_2 \Rightarrow A \quad \Gamma \vdash e_3 \Leftarrow A}{\Gamma \vdash \text{if } e_1 \text{ then } e_2 \text{ else } e_3 \Rightarrow A} \text{(T-If)} \]
Binary Operation Rules
Arithmetic operations take two integers and return an integer:
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Int} \quad \Gamma \vdash e_2 \Leftarrow \text{Int}}{\Gamma \vdash e_1 \oplus e_2 \Rightarrow \text{Int}} \text{(T-Arith)} \]
Boolean operations take two booleans and return a boolean:
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Bool} \quad \Gamma \vdash e_2 \Leftarrow \text{Bool}}{\Gamma \vdash e_1 \land e_2 \Rightarrow \text{Bool}} \text{(T-Bool)} \]
Comparison operations take two integers and return a boolean:
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Int} \quad \Gamma \vdash e_2 \Leftarrow \text{Int}}{\Gamma \vdash e_1 < e_2 \Rightarrow \text{Bool}} \text{(T-Cmp)} \]
Equality operations are polymorphic and work on any type:
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \quad \Gamma \vdash e_2 \Leftarrow A}{\Gamma \vdash e_1 = e_2 \Rightarrow \text{Bool}} \text{(T-Eq)} \]
Bidirectional Rules
The mode switch allows inference results to be checked:
\[ \frac{\Gamma \vdash e \Rightarrow A}{\Gamma \vdash e \Leftarrow A} \text{(T-Sub)} \]
Existential variables are introduced for unknown types:
\[ \frac{\Gamma, \hat{\alpha} \vdash e \Rightarrow A}{\Gamma \vdash e \Rightarrow [\hat{\alpha}/\alpha]A} \text{(T-InstL)} \]
Application Inference Rules
Application inference handles complex cases where the function type is not immediately known:
Application with arrow types: \[ \frac{\Gamma \vdash e_2 \Leftarrow A}{\Gamma \vdash A \to B \bullet e_2 \Rightarrow B} \text{(T-AppArrow)} \]
Application with existential variables: \[ \frac{\Gamma[\hat{\alpha} := \hat{\alpha_1} \to \hat{\alpha_2}], \hat{\alpha_1}, \hat{\alpha_2} \vdash e_2 \Leftarrow \hat{\alpha_1}}{\Gamma \vdash \hat{\alpha} \bullet e_2 \Rightarrow \hat{\alpha_2}} \text{(T-AppEVar)} \]
In these rules, \( \Rightarrow \) indicates inference mode (synthesizing a type), while \( \Leftarrow \) indicates checking mode (verifying against an expected type). The hat notation \( \hat{\alpha} \) denotes existential type variables that the system solves during inference.
Core Data Structures
Context and Environment Management
The bidirectional algorithm maintains a context that tracks multiple kinds of bindings and constraints. Our context system needs to handle not just term variables and their types, but also type variables, existential variables, and the relationships between them.
#![allow(unused)] fn main() { use std::collections::HashSet; use std::fmt; use crate::ast::{BinOp, Expr, Type}; use crate::errors::{TypeError, TypeResult}; pub type TyVar = String; pub type TmVar = String; impl fmt::Display for Entry { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { match self { Entry::VarBnd(x, ty) => write!(f, "{}: {}", x, ty), Entry::TVarBnd(a) => write!(f, "{}", a), Entry::ETVarBnd(a) => write!(f, "^{}", a), Entry::SETVarBnd(a, ty) => write!(f, "^{} = {}", a, ty), Entry::Mark(a) => write!(f, "${}", a), } } } #[derive(Debug, Clone)] pub struct Context(Vec<Entry>); impl Default for Context { fn default() -> Self { Self::new() } } impl Context { pub fn new() -> Self { Context(Vec::new()) } pub fn push(&mut self, entry: Entry) { self.0.push(entry); } pub fn find<F>(&self, predicate: F) -> Option<&Entry> where F: Fn(&Entry) -> bool, { self.0.iter().find(|entry| predicate(entry)) } pub fn break3<F>(&self, predicate: F) -> (Vec<Entry>, Option<Entry>, Vec<Entry>) where F: Fn(&Entry) -> bool, { if let Some(pos) = self.0.iter().position(predicate) { let left = self.0[..pos].to_vec(); let middle = self.0[pos].clone(); let right = self.0[pos + 1..].to_vec(); (left, Some(middle), right) } else { (self.0.clone(), None, Vec::new()) } } pub fn from_parts(left: Vec<Entry>, middle: Entry, right: Vec<Entry>) -> Self { let mut ctx = left; ctx.push(middle); ctx.extend(right); Context(ctx) } } impl fmt::Display for Context { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { let entries: Vec<String> = self.0.iter().rev().map(|e| e.to_string()).collect(); write!(f, "{}", entries.join(", ")) } } #[derive(Debug)] pub struct InferenceTree { pub rule: String, pub input: String, pub output: String, pub children: Vec<InferenceTree>, } impl InferenceTree { fn new(rule: &str, input: &str, output: &str, children: Vec<InferenceTree>) -> Self { Self { rule: rule.to_string(), input: input.to_string(), output: output.to_string(), children, } } } impl fmt::Display for InferenceTree { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { self.display_with_indent(f, 0) } } impl InferenceTree { fn display_with_indent(&self, f: &mut fmt::Formatter, indent: usize) -> fmt::Result { let prefix = " ".repeat(indent); writeln!( f, "{}{}: {} => {}", prefix, self.rule, self.input, self.output )?; for child in &self.children { child.display_with_indent(f, indent + 1)?; } Ok(()) } } pub struct BiDirectional { counter: usize, } impl Default for BiDirectional { fn default() -> Self { Self::new() } } #[allow(clippy::only_used_in_recursion)] impl BiDirectional { pub fn new() -> Self { Self { counter: 0 } } fn fresh_tyvar(&mut self) -> TyVar { let var = format!("α{}", self.counter); self.counter += 1; var } fn is_mono(&self, ty: &Type) -> bool { match ty { Type::Int | Type::Bool | Type::Var(_) | Type::ETVar(_) => true, Type::Arrow(t1, t2) => self.is_mono(t1) && self.is_mono(t2), Type::Forall(_, _) => false, } } fn free_vars(&self, ty: &Type) -> HashSet<TyVar> { match ty { Type::Var(name) | Type::ETVar(name) => { let mut set = HashSet::new(); set.insert(name.clone()); set } Type::Arrow(t1, t2) => { let mut set = self.free_vars(t1); set.extend(self.free_vars(t2)); set } Type::Forall(var, ty) => { let mut set = self.free_vars(ty); set.remove(var); set } Type::Int | Type::Bool => HashSet::new(), } } fn subst_type(&self, var: &TyVar, replacement: &Type, ty: &Type) -> Type { match ty { Type::Var(name) if name == var => replacement.clone(), Type::ETVar(name) if name == var => replacement.clone(), Type::Var(_) | Type::ETVar(_) | Type::Int | Type::Bool => ty.clone(), Type::Arrow(t1, t2) => Type::Arrow( Box::new(self.subst_type(var, replacement, t1)), Box::new(self.subst_type(var, replacement, t2)), ), Type::Forall(bound_var, body) => { if bound_var == var { ty.clone() // Variable is shadowed } else { Type::Forall( bound_var.clone(), Box::new(self.subst_type(var, replacement, body)), ) } } } } pub fn apply_ctx_type(&self, ctx: &Context, ty: &Type) -> Type { let mut current = ty.clone(); let mut changed = true; // Keep applying substitutions until no more changes occur while changed { changed = false; let new_type = self.apply_ctx_type_once(ctx, ¤t); if new_type != current { changed = true; current = new_type; } } current } fn apply_ctx_type_once(&self, ctx: &Context, ty: &Type) -> Type { match ty { Type::ETVar(a) => { if let Some(Entry::SETVarBnd(_, replacement)) = ctx.find(|entry| matches!(entry, Entry::SETVarBnd(name, _) if name == a)) { // Recursively apply substitutions to the replacement type self.apply_ctx_type_once(ctx, replacement) } else { ty.clone() } } Type::Arrow(t1, t2) => Type::Arrow( Box::new(self.apply_ctx_type_once(ctx, t1)), Box::new(self.apply_ctx_type_once(ctx, t2)), ), Type::Forall(var, body) => { Type::Forall(var.clone(), Box::new(self.apply_ctx_type_once(ctx, body))) } _ => ty.clone(), } } fn before(&self, ctx: &Context, a: &TyVar, b: &TyVar) -> bool { let pos_a = ctx .0 .iter() .position(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let pos_b = ctx .0 .iter() .position(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); match (pos_a, pos_b) { (Some(pa), Some(pb)) => pa > pb, // Later in the context means earlier in ordering _ => false, } } /// Check if a type variable occurs in a type (occurs check for unification) /// This prevents creating infinite types when solving ^α := τ by ensuring α /// ∉ τ fn occurs_check(&self, var: &TyVar, ty: &Type) -> bool { match ty { Type::Var(name) | Type::ETVar(name) => name == var, Type::Arrow(t1, t2) => self.occurs_check(var, t1) || self.occurs_check(var, t2), Type::Forall(bound_var, body) => { // If the variable is shadowed by the forall binding, it doesn't occur if bound_var == var { false } else { self.occurs_check(var, body) } } Type::Int | Type::Bool => false, } } // ========== TYPING RULE METHODS ========== // Each method implements a specific typing rule from System F /// T-Var: Variable lookup rule /// Γ, x:A ⊢ x ⇒ A fn infer_var( &self, ctx: &Context, x: &str, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { if let Some(Entry::VarBnd(_, ty)) = ctx.find(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)) { let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx); Ok(( ty.clone(), ctx.clone(), InferenceTree::new("InfVar", input, &output, vec![]), )) } else { Err(TypeError::UnboundVariable { name: x.to_string(), expr: None, }) } } /// T-LitInt: Integer literal rule /// ⊢ n ⇒ Int fn infer_lit_int( &self, ctx: &Context, _n: i64, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Int ⊣ {}", input, ctx); Ok(( Type::Int, ctx.clone(), InferenceTree::new("InfLitInt", input, &output, vec![]), )) } /// T-LitBool: Boolean literal rule /// ⊢ b ⇒ Bool fn infer_lit_bool( &self, ctx: &Context, _b: bool, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Bool ⊣ {}", input, ctx); Ok(( Type::Bool, ctx.clone(), InferenceTree::new("InfLitBool", input, &output, vec![]), )) } /// T-Abs: Lambda abstraction rule /// Γ,x:A ⊢ e ⇐ B /// ───────────────────── /// Γ ⊢ λx:A.e ⇒ A → B fn infer_abs( &mut self, ctx: &Context, x: &str, param_ty: &Type, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let b = self.fresh_tyvar(); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), param_ty.clone())); new_ctx.push(Entry::ETVarBnd(b.clone())); let (ctx1, tree) = self.check(&new_ctx, body, &Type::ETVar(b.clone()))?; let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Arrow(Box::new(param_ty.clone()), Box::new(Type::ETVar(b))); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfLam", input, &output, vec![tree]), )) } /// T-App: Function application rule /// Γ ⊢ e1 ⇒ A Γ ⊢ e2 • A ⇒⇒ C /// ──────────────────────────────── /// Γ ⊢ e1 e2 ⇒ C fn infer_application( &mut self, ctx: &Context, func: &Expr, arg: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; let func_ty_applied = self.apply_ctx_type(&ctx1, &func_ty); let (result_ty, ctx2, tree2) = self.infer_app(&ctx1, &func_ty_applied, arg)?; let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx2); Ok(( result_ty, ctx2, InferenceTree::new("InfApp", input, &output, vec![tree1, tree2]), )) } /// T-Let: Let binding rule /// Γ ⊢ e1 ⇒ A Γ,x:A ⊢ e2 ⇒ B /// ───────────────────────────── /// Γ ⊢ let x = e1 in e2 ⇒ B fn infer_let( &mut self, ctx: &Context, x: &str, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let mut new_ctx = ctx1.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), ty1)); let (ty2, ctx2, tree2) = self.infer(&new_ctx, e2)?; let (left, _, right) = ctx2.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let output = format!("{} ⇒ {} ⊣ {}", input, ty2, final_ctx); Ok(( ty2, final_ctx, InferenceTree::new("InfLet", input, &output, vec![tree1, tree2]), )) } /// T-If: Conditional rule /// Γ ⊢ e1 ⇐ Bool Γ ⊢ e2 ⇒ A Γ ⊢ e3 ⇒ A /// ────────────────────────────────────────── /// Γ ⊢ if e1 then e2 else e3 ⇒ A fn infer_if( &mut self, ctx: &Context, e1: &Expr, e2: &Expr, e3: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ty2, ctx2, tree2) = self.infer(&ctx1, e2)?; let (ty3, ctx3, tree3) = self.infer(&ctx2, e3)?; // Ensure both branches have the same type let (unified_ctx, tree_unify) = self.subtype(&ctx3, &ty2, &ty3)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty2, unified_ctx); Ok(( ty2, unified_ctx, InferenceTree::new( "InfIf", input, &output, vec![tree1, tree2, tree3, tree_unify], ), )) } /// T-BinOp: Binary operation rules fn infer_binop( &mut self, ctx: &Context, op: &BinOp, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { match op { // T-Arith: Int → Int → Int BinOp::Add | BinOp::Sub | BinOp::Mul | BinOp::Div => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Int ⊣ {}", input, ctx2); Ok(( Type::Int, ctx2, InferenceTree::new("InfArith", input, &output, vec![tree1, tree2]), )) } // T-Bool: Bool → Bool → Bool BinOp::And | BinOp::Or => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Bool)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfBool", input, &output, vec![tree1, tree2]), )) } // T-Cmp: Int → Int → Bool BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfCmp", input, &output, vec![tree1, tree2]), )) } // T-Eq: ∀α. α → α → Bool BinOp::Eq | BinOp::Ne => { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let (ctx2, tree2) = self.check(&ctx1, e2, &ty1)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfEq", input, &output, vec![tree1, tree2]), )) } } } fn inst_l( &mut self, ctx: &Context, a: &TyVar, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ ^{} :=< {}", ctx, a, ty); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_r(&ctx1, t1, &a1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_l(&ctx2, &a2, &t2_applied)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstLArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.inst_l(&new_ctx, a, t)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstLAllR", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } fn inst_r( &mut self, ctx: &Context, ty: &Type, a: &TyVar, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} :=< ^{}", ctx, ty, a); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_l(&ctx1, &a1, t1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_r(&ctx2, &t2_applied, &a2)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstRArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let subst_t = self.subst_type(b, &Type::ETVar(b.clone()), t); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(b.clone())); new_ctx.push(Entry::Mark(b.clone())); let (ctx1, tree) = self.inst_r(&new_ctx, &subst_t, a)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstRAllL", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } fn subtype( &mut self, ctx: &Context, t1: &Type, t2: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} <: {}", ctx, t1, t2); match (t1, t2) { (Type::Int, Type::Int) | (Type::Bool, Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("SubRefl", &input, &format!("{}", ctx), vec![]), )), (Type::Var(a), Type::Var(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflTVar", &input, &format!("{}", ctx), vec![]), )), (Type::ETVar(a), Type::ETVar(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflETVar", &input, &format!("{}", ctx), vec![]), )), (Type::Arrow(a1, a2), Type::Arrow(b1, b2)) => { let (ctx1, tree1) = self.subtype(ctx, b1, a1)?; // Contravariant in argument let (ctx2, tree2) = self.subtype(&ctx1, a2, b2)?; // Covariant in result let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("SubArr", &input, &output, vec![tree1, tree2]), )) } (_, Type::Forall(b, t2_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.subtype(&new_ctx, t1, t2_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllR", &input, &output, vec![tree]), )) } (Type::Forall(a, t1_body), _) => { let subst_t1 = self.subst_type(a, &Type::ETVar(a.clone()), t1_body); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(a.clone())); new_ctx.push(Entry::Mark(a.clone())); let (ctx1, tree) = self.subtype(&new_ctx, &subst_t1, t2)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllL", &input, &output, vec![tree]), )) } (Type::ETVar(a), _) if !self.free_vars(t2).contains(a) => { let (ctx1, tree) = self.inst_l(ctx, a, t2)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstL", &input, &output, vec![tree]), )) } (_, Type::ETVar(a)) if !self.free_vars(t1).contains(a) => { let (ctx1, tree) = self.inst_r(ctx, t1, a)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstR", &input, &output, vec![tree]), )) } _ => Err(TypeError::SubtypingError { left: t1.clone(), right: t2.clone(), expr: None, }), } } fn check( &mut self, ctx: &Context, expr: &Expr, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {:?} ⇐ {}", ctx, expr, ty); match (expr, ty) { (Expr::LitInt(_), Type::Int) => Ok(( ctx.clone(), InferenceTree::new("ChkLitInt", &input, &format!("{}", ctx), vec![]), )), (Expr::LitBool(_), Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("ChkLitBool", &input, &format!("{}", ctx), vec![]), )), (Expr::Abs(x, _param_ty, body), Type::Arrow(expected_param, result_ty)) => { // For simplicity, we assume the parameter type matches let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.clone(), *expected_param.clone())); let (ctx1, tree) = self.check(&new_ctx, body, result_ty)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkLam", &input, &output, vec![tree]), )) } (_, Type::Forall(a, ty_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.clone())); let (ctx1, tree) = self.check(&new_ctx, expr, ty_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkAll", &input, &output, vec![tree]), )) } _ => { // Fallback to inference + subtyping let (inferred_ty, ctx1, tree1) = self.infer(ctx, expr)?; let inferred_applied = self.apply_ctx_type(&ctx1, &inferred_ty); let ty_applied = self.apply_ctx_type(&ctx1, ty); let (ctx2, tree2) = self.subtype(&ctx1, &inferred_applied, &ty_applied)?; let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("ChkSub", &input, &output, vec![tree1, tree2]), )) } } } pub fn infer( &mut self, ctx: &Context, expr: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?}", ctx, expr); match expr { Expr::Var(x) => self.infer_var(ctx, x, &input), Expr::Ann(expr, ty) => self.infer_ann(ctx, expr, ty, &input), Expr::LitInt(n) => self.infer_lit_int(ctx, *n, &input), Expr::LitBool(b) => self.infer_lit_bool(ctx, *b, &input), Expr::Abs(x, param_ty, body) => self.infer_abs(ctx, x, param_ty, body, &input), Expr::App(func, arg) => self.infer_application(ctx, func, arg, &input), Expr::TAbs(a, body) => self.infer_tabs(ctx, a, body, &input), Expr::TApp(func, ty_arg) => self.infer_tapp(ctx, func, ty_arg, &input), Expr::Let(x, e1, e2) => self.infer_let(ctx, x, e1, e2, &input), Expr::IfThenElse(e1, e2, e3) => self.infer_if(ctx, e1, e2, e3, &input), Expr::BinOp(op, e1, e2) => self.infer_binop(ctx, op, e1, e2, &input), } } /// T-Instr: Type annotation rule /// Γ ⊢ e ⇐ A /// ────────────────── /// Γ ⊢ (e : A) ⇒ A fn infer_ann( &mut self, ctx: &Context, expr: &Expr, ty: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree) = self.check(ctx, expr, ty)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx1); Ok(( ty.clone(), ctx1, InferenceTree::new("InfAnn", input, &output, vec![tree]), )) } /// T-ForallI: Type abstraction rule /// Γ, α ⊢ e ⇒ A /// ────────────────────── /// Γ ⊢ Λα. e ⇒ ∀α. A fn infer_tabs( &mut self, ctx: &Context, a: &str, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.to_string())); let (body_ty, ctx1, tree) = self.infer(&new_ctx, body)?; // Apply context substitutions to resolve existential variables before removing // type binding let resolved_body_ty = self.apply_ctx_type(&ctx1, &body_ty); let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Forall(a.to_string(), Box::new(resolved_body_ty)); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfTAbs", input, &output, vec![tree]), )) } /// T-ForallE: Type application rule /// Γ ⊢ e ⇒ ∀α. A /// ────────────────────── /// Γ ⊢ e[B] ⇒ [B/α]A fn infer_tapp( &mut self, ctx: &Context, func: &Expr, ty_arg: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; match func_ty { Type::Forall(a, body_ty) => { let result_ty = self.subst_type(&a, ty_arg, &body_ty); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty, ctx1, InferenceTree::new("InfTApp", input, &output, vec![tree1]), )) } _ => Err(TypeError::TypeApplicationError { actual: func_ty.clone(), expr: None, }), } } fn infer_app( &mut self, ctx: &Context, func_ty: &Type, arg: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?} • {}", ctx, arg, func_ty); match func_ty { Type::Arrow(param_ty, result_ty) => { let (ctx1, tree) = self.check(ctx, arg, param_ty)?; let output = format!("{} ⇒⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty.as_ref().clone(), ctx1, InferenceTree::new("InfAppArr", &input, &output, vec![tree]), )) } Type::ETVar(a) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree) = self.check(&ctx1, arg, &Type::ETVar(a1))?; let output = format!("{} ⇒⇒ ^{} ⊣ {}", input, a2, ctx2); Ok(( Type::ETVar(a2), ctx2, InferenceTree::new("InfAppETVar", &input, &output, vec![tree]), )) } _ => Err(TypeError::ApplicationTypeError { actual: func_ty.clone(), expr: None, }), } } } pub fn run_bidirectional(expr: &Expr) -> TypeResult<(Type, Context, InferenceTree)> { let mut bi = BiDirectional::new(); let ctx = Context::new(); let (ty, final_ctx, tree) = bi.infer(&ctx, expr)?; // Apply the final context to resolve existential variables let resolved_ty = bi.apply_ctx_type(&final_ctx, &ty); Ok((resolved_ty, final_ctx, tree)) } #[derive(Debug, Clone)] pub enum Entry { VarBnd(TmVar, Type), // x: A TVarBnd(TyVar), // α ETVarBnd(TyVar), // ^α SETVarBnd(TyVar, Type), // ^α = τ Mark(TyVar), // $α } }
The context entries represent different kinds of information the type checker needs to track throughout the bidirectional inference process. Variable bindings, represented as VarBnd(TmVar, Type)
, associate term variables with their types, such as recording that x
has type Int
in the current scope. Type variable bindings, denoted as TVarBnd(TyVar)
, introduce type variables into scope, such as the \( \alpha \) that appears in universal quantification \( \forall \alpha. \ldots \).
Existential type variable bindings, written as ETVarBnd(TyVar)
, introduce existential type variables that represent unknown types to be determined through constraint solving. Solved existential type variable bindings, represented as SETVarBnd(TyVar, Type)
, record the concrete solutions discovered for existential variables during the inference process. Finally, marker entries, denoted as Mark(TyVar)
, mark the beginning of a scope to enable proper garbage collection of type variables when their scope is exited.
The context itself maintains these entries in a stack-like structure where order matters crucially for scoping and variable resolution:
#![allow(unused)] fn main() { use std::collections::HashSet; use std::fmt; use crate::ast::{BinOp, Expr, Type}; use crate::errors::{TypeError, TypeResult}; pub type TyVar = String; pub type TmVar = String; #[derive(Debug, Clone)] pub enum Entry { VarBnd(TmVar, Type), // x: A TVarBnd(TyVar), // α ETVarBnd(TyVar), // ^α SETVarBnd(TyVar, Type), // ^α = τ Mark(TyVar), // $α } impl fmt::Display for Entry { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { match self { Entry::VarBnd(x, ty) => write!(f, "{}: {}", x, ty), Entry::TVarBnd(a) => write!(f, "{}", a), Entry::ETVarBnd(a) => write!(f, "^{}", a), Entry::SETVarBnd(a, ty) => write!(f, "^{} = {}", a, ty), Entry::Mark(a) => write!(f, "${}", a), } } } impl Default for Context { fn default() -> Self { Self::new() } } impl Context { pub fn new() -> Self { Context(Vec::new()) } pub fn push(&mut self, entry: Entry) { self.0.push(entry); } pub fn find<F>(&self, predicate: F) -> Option<&Entry> where F: Fn(&Entry) -> bool, { self.0.iter().find(|entry| predicate(entry)) } pub fn break3<F>(&self, predicate: F) -> (Vec<Entry>, Option<Entry>, Vec<Entry>) where F: Fn(&Entry) -> bool, { if let Some(pos) = self.0.iter().position(predicate) { let left = self.0[..pos].to_vec(); let middle = self.0[pos].clone(); let right = self.0[pos + 1..].to_vec(); (left, Some(middle), right) } else { (self.0.clone(), None, Vec::new()) } } pub fn from_parts(left: Vec<Entry>, middle: Entry, right: Vec<Entry>) -> Self { let mut ctx = left; ctx.push(middle); ctx.extend(right); Context(ctx) } } impl fmt::Display for Context { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { let entries: Vec<String> = self.0.iter().rev().map(|e| e.to_string()).collect(); write!(f, "{}", entries.join(", ")) } } #[derive(Debug)] pub struct InferenceTree { pub rule: String, pub input: String, pub output: String, pub children: Vec<InferenceTree>, } impl InferenceTree { fn new(rule: &str, input: &str, output: &str, children: Vec<InferenceTree>) -> Self { Self { rule: rule.to_string(), input: input.to_string(), output: output.to_string(), children, } } } impl fmt::Display for InferenceTree { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { self.display_with_indent(f, 0) } } impl InferenceTree { fn display_with_indent(&self, f: &mut fmt::Formatter, indent: usize) -> fmt::Result { let prefix = " ".repeat(indent); writeln!( f, "{}{}: {} => {}", prefix, self.rule, self.input, self.output )?; for child in &self.children { child.display_with_indent(f, indent + 1)?; } Ok(()) } } pub struct BiDirectional { counter: usize, } impl Default for BiDirectional { fn default() -> Self { Self::new() } } #[allow(clippy::only_used_in_recursion)] impl BiDirectional { pub fn new() -> Self { Self { counter: 0 } } fn fresh_tyvar(&mut self) -> TyVar { let var = format!("α{}", self.counter); self.counter += 1; var } fn is_mono(&self, ty: &Type) -> bool { match ty { Type::Int | Type::Bool | Type::Var(_) | Type::ETVar(_) => true, Type::Arrow(t1, t2) => self.is_mono(t1) && self.is_mono(t2), Type::Forall(_, _) => false, } } fn free_vars(&self, ty: &Type) -> HashSet<TyVar> { match ty { Type::Var(name) | Type::ETVar(name) => { let mut set = HashSet::new(); set.insert(name.clone()); set } Type::Arrow(t1, t2) => { let mut set = self.free_vars(t1); set.extend(self.free_vars(t2)); set } Type::Forall(var, ty) => { let mut set = self.free_vars(ty); set.remove(var); set } Type::Int | Type::Bool => HashSet::new(), } } fn subst_type(&self, var: &TyVar, replacement: &Type, ty: &Type) -> Type { match ty { Type::Var(name) if name == var => replacement.clone(), Type::ETVar(name) if name == var => replacement.clone(), Type::Var(_) | Type::ETVar(_) | Type::Int | Type::Bool => ty.clone(), Type::Arrow(t1, t2) => Type::Arrow( Box::new(self.subst_type(var, replacement, t1)), Box::new(self.subst_type(var, replacement, t2)), ), Type::Forall(bound_var, body) => { if bound_var == var { ty.clone() // Variable is shadowed } else { Type::Forall( bound_var.clone(), Box::new(self.subst_type(var, replacement, body)), ) } } } } pub fn apply_ctx_type(&self, ctx: &Context, ty: &Type) -> Type { let mut current = ty.clone(); let mut changed = true; // Keep applying substitutions until no more changes occur while changed { changed = false; let new_type = self.apply_ctx_type_once(ctx, ¤t); if new_type != current { changed = true; current = new_type; } } current } fn apply_ctx_type_once(&self, ctx: &Context, ty: &Type) -> Type { match ty { Type::ETVar(a) => { if let Some(Entry::SETVarBnd(_, replacement)) = ctx.find(|entry| matches!(entry, Entry::SETVarBnd(name, _) if name == a)) { // Recursively apply substitutions to the replacement type self.apply_ctx_type_once(ctx, replacement) } else { ty.clone() } } Type::Arrow(t1, t2) => Type::Arrow( Box::new(self.apply_ctx_type_once(ctx, t1)), Box::new(self.apply_ctx_type_once(ctx, t2)), ), Type::Forall(var, body) => { Type::Forall(var.clone(), Box::new(self.apply_ctx_type_once(ctx, body))) } _ => ty.clone(), } } fn before(&self, ctx: &Context, a: &TyVar, b: &TyVar) -> bool { let pos_a = ctx .0 .iter() .position(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let pos_b = ctx .0 .iter() .position(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); match (pos_a, pos_b) { (Some(pa), Some(pb)) => pa > pb, // Later in the context means earlier in ordering _ => false, } } /// Check if a type variable occurs in a type (occurs check for unification) /// This prevents creating infinite types when solving ^α := τ by ensuring α /// ∉ τ fn occurs_check(&self, var: &TyVar, ty: &Type) -> bool { match ty { Type::Var(name) | Type::ETVar(name) => name == var, Type::Arrow(t1, t2) => self.occurs_check(var, t1) || self.occurs_check(var, t2), Type::Forall(bound_var, body) => { // If the variable is shadowed by the forall binding, it doesn't occur if bound_var == var { false } else { self.occurs_check(var, body) } } Type::Int | Type::Bool => false, } } // ========== TYPING RULE METHODS ========== // Each method implements a specific typing rule from System F /// T-Var: Variable lookup rule /// Γ, x:A ⊢ x ⇒ A fn infer_var( &self, ctx: &Context, x: &str, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { if let Some(Entry::VarBnd(_, ty)) = ctx.find(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)) { let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx); Ok(( ty.clone(), ctx.clone(), InferenceTree::new("InfVar", input, &output, vec![]), )) } else { Err(TypeError::UnboundVariable { name: x.to_string(), expr: None, }) } } /// T-LitInt: Integer literal rule /// ⊢ n ⇒ Int fn infer_lit_int( &self, ctx: &Context, _n: i64, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Int ⊣ {}", input, ctx); Ok(( Type::Int, ctx.clone(), InferenceTree::new("InfLitInt", input, &output, vec![]), )) } /// T-LitBool: Boolean literal rule /// ⊢ b ⇒ Bool fn infer_lit_bool( &self, ctx: &Context, _b: bool, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Bool ⊣ {}", input, ctx); Ok(( Type::Bool, ctx.clone(), InferenceTree::new("InfLitBool", input, &output, vec![]), )) } /// T-Abs: Lambda abstraction rule /// Γ,x:A ⊢ e ⇐ B /// ───────────────────── /// Γ ⊢ λx:A.e ⇒ A → B fn infer_abs( &mut self, ctx: &Context, x: &str, param_ty: &Type, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let b = self.fresh_tyvar(); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), param_ty.clone())); new_ctx.push(Entry::ETVarBnd(b.clone())); let (ctx1, tree) = self.check(&new_ctx, body, &Type::ETVar(b.clone()))?; let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Arrow(Box::new(param_ty.clone()), Box::new(Type::ETVar(b))); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfLam", input, &output, vec![tree]), )) } /// T-App: Function application rule /// Γ ⊢ e1 ⇒ A Γ ⊢ e2 • A ⇒⇒ C /// ──────────────────────────────── /// Γ ⊢ e1 e2 ⇒ C fn infer_application( &mut self, ctx: &Context, func: &Expr, arg: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; let func_ty_applied = self.apply_ctx_type(&ctx1, &func_ty); let (result_ty, ctx2, tree2) = self.infer_app(&ctx1, &func_ty_applied, arg)?; let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx2); Ok(( result_ty, ctx2, InferenceTree::new("InfApp", input, &output, vec![tree1, tree2]), )) } /// T-Let: Let binding rule /// Γ ⊢ e1 ⇒ A Γ,x:A ⊢ e2 ⇒ B /// ───────────────────────────── /// Γ ⊢ let x = e1 in e2 ⇒ B fn infer_let( &mut self, ctx: &Context, x: &str, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let mut new_ctx = ctx1.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), ty1)); let (ty2, ctx2, tree2) = self.infer(&new_ctx, e2)?; let (left, _, right) = ctx2.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let output = format!("{} ⇒ {} ⊣ {}", input, ty2, final_ctx); Ok(( ty2, final_ctx, InferenceTree::new("InfLet", input, &output, vec![tree1, tree2]), )) } /// T-If: Conditional rule /// Γ ⊢ e1 ⇐ Bool Γ ⊢ e2 ⇒ A Γ ⊢ e3 ⇒ A /// ────────────────────────────────────────── /// Γ ⊢ if e1 then e2 else e3 ⇒ A fn infer_if( &mut self, ctx: &Context, e1: &Expr, e2: &Expr, e3: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ty2, ctx2, tree2) = self.infer(&ctx1, e2)?; let (ty3, ctx3, tree3) = self.infer(&ctx2, e3)?; // Ensure both branches have the same type let (unified_ctx, tree_unify) = self.subtype(&ctx3, &ty2, &ty3)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty2, unified_ctx); Ok(( ty2, unified_ctx, InferenceTree::new( "InfIf", input, &output, vec![tree1, tree2, tree3, tree_unify], ), )) } /// T-BinOp: Binary operation rules fn infer_binop( &mut self, ctx: &Context, op: &BinOp, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { match op { // T-Arith: Int → Int → Int BinOp::Add | BinOp::Sub | BinOp::Mul | BinOp::Div => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Int ⊣ {}", input, ctx2); Ok(( Type::Int, ctx2, InferenceTree::new("InfArith", input, &output, vec![tree1, tree2]), )) } // T-Bool: Bool → Bool → Bool BinOp::And | BinOp::Or => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Bool)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfBool", input, &output, vec![tree1, tree2]), )) } // T-Cmp: Int → Int → Bool BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfCmp", input, &output, vec![tree1, tree2]), )) } // T-Eq: ∀α. α → α → Bool BinOp::Eq | BinOp::Ne => { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let (ctx2, tree2) = self.check(&ctx1, e2, &ty1)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfEq", input, &output, vec![tree1, tree2]), )) } } } fn inst_l( &mut self, ctx: &Context, a: &TyVar, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ ^{} :=< {}", ctx, a, ty); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_r(&ctx1, t1, &a1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_l(&ctx2, &a2, &t2_applied)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstLArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.inst_l(&new_ctx, a, t)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstLAllR", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } fn inst_r( &mut self, ctx: &Context, ty: &Type, a: &TyVar, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} :=< ^{}", ctx, ty, a); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_l(&ctx1, &a1, t1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_r(&ctx2, &t2_applied, &a2)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstRArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let subst_t = self.subst_type(b, &Type::ETVar(b.clone()), t); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(b.clone())); new_ctx.push(Entry::Mark(b.clone())); let (ctx1, tree) = self.inst_r(&new_ctx, &subst_t, a)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstRAllL", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } fn subtype( &mut self, ctx: &Context, t1: &Type, t2: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} <: {}", ctx, t1, t2); match (t1, t2) { (Type::Int, Type::Int) | (Type::Bool, Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("SubRefl", &input, &format!("{}", ctx), vec![]), )), (Type::Var(a), Type::Var(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflTVar", &input, &format!("{}", ctx), vec![]), )), (Type::ETVar(a), Type::ETVar(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflETVar", &input, &format!("{}", ctx), vec![]), )), (Type::Arrow(a1, a2), Type::Arrow(b1, b2)) => { let (ctx1, tree1) = self.subtype(ctx, b1, a1)?; // Contravariant in argument let (ctx2, tree2) = self.subtype(&ctx1, a2, b2)?; // Covariant in result let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("SubArr", &input, &output, vec![tree1, tree2]), )) } (_, Type::Forall(b, t2_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.subtype(&new_ctx, t1, t2_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllR", &input, &output, vec![tree]), )) } (Type::Forall(a, t1_body), _) => { let subst_t1 = self.subst_type(a, &Type::ETVar(a.clone()), t1_body); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(a.clone())); new_ctx.push(Entry::Mark(a.clone())); let (ctx1, tree) = self.subtype(&new_ctx, &subst_t1, t2)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllL", &input, &output, vec![tree]), )) } (Type::ETVar(a), _) if !self.free_vars(t2).contains(a) => { let (ctx1, tree) = self.inst_l(ctx, a, t2)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstL", &input, &output, vec![tree]), )) } (_, Type::ETVar(a)) if !self.free_vars(t1).contains(a) => { let (ctx1, tree) = self.inst_r(ctx, t1, a)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstR", &input, &output, vec![tree]), )) } _ => Err(TypeError::SubtypingError { left: t1.clone(), right: t2.clone(), expr: None, }), } } fn check( &mut self, ctx: &Context, expr: &Expr, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {:?} ⇐ {}", ctx, expr, ty); match (expr, ty) { (Expr::LitInt(_), Type::Int) => Ok(( ctx.clone(), InferenceTree::new("ChkLitInt", &input, &format!("{}", ctx), vec![]), )), (Expr::LitBool(_), Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("ChkLitBool", &input, &format!("{}", ctx), vec![]), )), (Expr::Abs(x, _param_ty, body), Type::Arrow(expected_param, result_ty)) => { // For simplicity, we assume the parameter type matches let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.clone(), *expected_param.clone())); let (ctx1, tree) = self.check(&new_ctx, body, result_ty)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkLam", &input, &output, vec![tree]), )) } (_, Type::Forall(a, ty_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.clone())); let (ctx1, tree) = self.check(&new_ctx, expr, ty_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkAll", &input, &output, vec![tree]), )) } _ => { // Fallback to inference + subtyping let (inferred_ty, ctx1, tree1) = self.infer(ctx, expr)?; let inferred_applied = self.apply_ctx_type(&ctx1, &inferred_ty); let ty_applied = self.apply_ctx_type(&ctx1, ty); let (ctx2, tree2) = self.subtype(&ctx1, &inferred_applied, &ty_applied)?; let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("ChkSub", &input, &output, vec![tree1, tree2]), )) } } } pub fn infer( &mut self, ctx: &Context, expr: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?}", ctx, expr); match expr { Expr::Var(x) => self.infer_var(ctx, x, &input), Expr::Ann(expr, ty) => self.infer_ann(ctx, expr, ty, &input), Expr::LitInt(n) => self.infer_lit_int(ctx, *n, &input), Expr::LitBool(b) => self.infer_lit_bool(ctx, *b, &input), Expr::Abs(x, param_ty, body) => self.infer_abs(ctx, x, param_ty, body, &input), Expr::App(func, arg) => self.infer_application(ctx, func, arg, &input), Expr::TAbs(a, body) => self.infer_tabs(ctx, a, body, &input), Expr::TApp(func, ty_arg) => self.infer_tapp(ctx, func, ty_arg, &input), Expr::Let(x, e1, e2) => self.infer_let(ctx, x, e1, e2, &input), Expr::IfThenElse(e1, e2, e3) => self.infer_if(ctx, e1, e2, e3, &input), Expr::BinOp(op, e1, e2) => self.infer_binop(ctx, op, e1, e2, &input), } } /// T-Instr: Type annotation rule /// Γ ⊢ e ⇐ A /// ────────────────── /// Γ ⊢ (e : A) ⇒ A fn infer_ann( &mut self, ctx: &Context, expr: &Expr, ty: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree) = self.check(ctx, expr, ty)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx1); Ok(( ty.clone(), ctx1, InferenceTree::new("InfAnn", input, &output, vec![tree]), )) } /// T-ForallI: Type abstraction rule /// Γ, α ⊢ e ⇒ A /// ────────────────────── /// Γ ⊢ Λα. e ⇒ ∀α. A fn infer_tabs( &mut self, ctx: &Context, a: &str, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.to_string())); let (body_ty, ctx1, tree) = self.infer(&new_ctx, body)?; // Apply context substitutions to resolve existential variables before removing // type binding let resolved_body_ty = self.apply_ctx_type(&ctx1, &body_ty); let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Forall(a.to_string(), Box::new(resolved_body_ty)); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfTAbs", input, &output, vec![tree]), )) } /// T-ForallE: Type application rule /// Γ ⊢ e ⇒ ∀α. A /// ────────────────────── /// Γ ⊢ e[B] ⇒ [B/α]A fn infer_tapp( &mut self, ctx: &Context, func: &Expr, ty_arg: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; match func_ty { Type::Forall(a, body_ty) => { let result_ty = self.subst_type(&a, ty_arg, &body_ty); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty, ctx1, InferenceTree::new("InfTApp", input, &output, vec![tree1]), )) } _ => Err(TypeError::TypeApplicationError { actual: func_ty.clone(), expr: None, }), } } fn infer_app( &mut self, ctx: &Context, func_ty: &Type, arg: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?} • {}", ctx, arg, func_ty); match func_ty { Type::Arrow(param_ty, result_ty) => { let (ctx1, tree) = self.check(ctx, arg, param_ty)?; let output = format!("{} ⇒⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty.as_ref().clone(), ctx1, InferenceTree::new("InfAppArr", &input, &output, vec![tree]), )) } Type::ETVar(a) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree) = self.check(&ctx1, arg, &Type::ETVar(a1))?; let output = format!("{} ⇒⇒ ^{} ⊣ {}", input, a2, ctx2); Ok(( Type::ETVar(a2), ctx2, InferenceTree::new("InfAppETVar", &input, &output, vec![tree]), )) } _ => Err(TypeError::ApplicationTypeError { actual: func_ty.clone(), expr: None, }), } } } pub fn run_bidirectional(expr: &Expr) -> TypeResult<(Type, Context, InferenceTree)> { let mut bi = BiDirectional::new(); let ctx = Context::new(); let (ty, final_ctx, tree) = bi.infer(&ctx, expr)?; // Apply the final context to resolve existential variables let resolved_ty = bi.apply_ctx_type(&final_ctx, &ty); Ok((resolved_ty, final_ctx, tree)) } #[derive(Debug, Clone)] pub struct Context(Vec<Entry>); }
Context management requires careful attention to scoping rules. When we enter a polymorphic function or introduce new type variables, we must ensure they are properly cleaned up when we exit their scope. The context provides methods for breaking it apart and reconstructing it to handle these operations efficiently.
Type Substitution and Application
Type substitution forms the computational heart of our System F implementation. When we instantiate a polymorphic type or solve existential variables, we need to systematically replace type variables with concrete types throughout complex type expressions.
#![allow(unused)] fn main() { fn subst_type(&self, var: &TyVar, replacement: &Type, ty: &Type) -> Type { match ty { Type::Var(name) if name == var => replacement.clone(), Type::ETVar(name) if name == var => replacement.clone(), Type::Var(_) | Type::ETVar(_) | Type::Int | Type::Bool => ty.clone(), Type::Arrow(t1, t2) => Type::Arrow( Box::new(self.subst_type(var, replacement, t1)), Box::new(self.subst_type(var, replacement, t2)), ), Type::Forall(bound_var, body) => { if bound_var == var { ty.clone() // Variable is shadowed } else { Type::Forall( bound_var.clone(), Box::new(self.subst_type(var, replacement, body)), ) } } } } }
Substitution must handle variable capture correctly. When substituting into a Forall
type, we must ensure that the bound variable doesn’t conflict with the replacement type. This is analogous to alpha-conversion in lambda calculus but operates at the type level.
Context application extends substitution by applying the current state of existential variable solutions to a type:
#![allow(unused)] fn main() { pub fn apply_ctx_type(&self, ctx: &Context, ty: &Type) -> Type { let mut current = ty.clone(); let mut changed = true; // Keep applying substitutions until no more changes occur while changed { changed = false; let new_type = self.apply_ctx_type_once(ctx, ¤t); if new_type != current { changed = true; current = new_type; } } current } }
This operation is essential because our algorithm incrementally builds up solutions to existential variables. As we learn more about unknown types, we need to propagate this information through all the types we’re working with.
Bidirectional Algorithm Core
Inference Rules
The inference mode synthesizes types from expressions. Our implementation uses a modular approach where the main inference function delegates to specialized methods for each syntactic form.
#![allow(unused)] fn main() { pub fn infer( &mut self, ctx: &Context, expr: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?}", ctx, expr); match expr { Expr::Var(x) => self.infer_var(ctx, x, &input), Expr::Ann(expr, ty) => self.infer_ann(ctx, expr, ty, &input), Expr::LitInt(n) => self.infer_lit_int(ctx, *n, &input), Expr::LitBool(b) => self.infer_lit_bool(ctx, *b, &input), Expr::Abs(x, param_ty, body) => self.infer_abs(ctx, x, param_ty, body, &input), Expr::App(func, arg) => self.infer_application(ctx, func, arg, &input), Expr::TAbs(a, body) => self.infer_tabs(ctx, a, body, &input), Expr::TApp(func, ty_arg) => self.infer_tapp(ctx, func, ty_arg, &input), Expr::Let(x, e1, e2) => self.infer_let(ctx, x, e1, e2, &input), Expr::IfThenElse(e1, e2, e3) => self.infer_if(ctx, e1, e2, e3, &input), Expr::BinOp(op, e1, e2) => self.infer_binop(ctx, op, e1, e2, &input), } } }
The inference function delegates to specialized methods that implement individual typing rules:
Variable Lookup follows the T-Var rule:
\[ \frac{x : A \in \Gamma}{\Gamma \vdash x \Rightarrow A} \text{(T-Var)} \]
#![allow(unused)] fn main() { /// T-Var: Variable lookup rule /// Γ, x:A ⊢ x ⇒ A fn infer_var( &self, ctx: &Context, x: &str, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { if let Some(Entry::VarBnd(_, ty)) = ctx.find(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)) { let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx); Ok(( ty.clone(), ctx.clone(), InferenceTree::new("InfVar", input, &output, vec![]), )) } else { Err(TypeError::UnboundVariable { name: x.to_string(), expr: None, }) } } }
Integer Literals use the T-LitInt rule:
\[ \frac{}{\Gamma \vdash n \Rightarrow \text{Int}} \text{(T-LitInt)} \]
#![allow(unused)] fn main() { /// T-LitInt: Integer literal rule /// ⊢ n ⇒ Int fn infer_lit_int( &self, ctx: &Context, _n: i64, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Int ⊣ {}", input, ctx); Ok(( Type::Int, ctx.clone(), InferenceTree::new("InfLitInt", input, &output, vec![]), )) } }
Boolean Literals use the T-LitBool rule:
\[ \frac{}{\Gamma \vdash \text{true} \Rightarrow \text{Bool}} \text{(T-LitBool)} \]
#![allow(unused)] fn main() { /// T-LitBool: Boolean literal rule /// ⊢ b ⇒ Bool fn infer_lit_bool( &self, ctx: &Context, _b: bool, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let output = format!("{} ⇒ Bool ⊣ {}", input, ctx); Ok(( Type::Bool, ctx.clone(), InferenceTree::new("InfLitBool", input, &output, vec![]), )) } }
Lambda Abstraction implements the T-Abs rule:
\[ \frac{\Gamma, x : A \vdash e \Leftarrow B}{\Gamma \vdash \lambda x. e \Leftarrow A \to B} \text{(T-Abs)} \]
#![allow(unused)] fn main() { /// T-Abs: Lambda abstraction rule /// Γ,x:A ⊢ e ⇐ B /// ───────────────────── /// Γ ⊢ λx:A.e ⇒ A → B fn infer_abs( &mut self, ctx: &Context, x: &str, param_ty: &Type, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let b = self.fresh_tyvar(); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), param_ty.clone())); new_ctx.push(Entry::ETVarBnd(b.clone())); let (ctx1, tree) = self.check(&new_ctx, body, &Type::ETVar(b.clone()))?; let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Arrow(Box::new(param_ty.clone()), Box::new(Type::ETVar(b))); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfLam", input, &output, vec![tree]), )) } }
Function Application uses the T-App rule:
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \to B \quad \Gamma \vdash e_2 \Leftarrow A}{\Gamma \vdash e_1 ; e_2 \Rightarrow B} \text{(T-App)} \]
#![allow(unused)] fn main() { /// T-App: Function application rule /// Γ ⊢ e1 ⇒ A Γ ⊢ e2 • A ⇒⇒ C /// ──────────────────────────────── /// Γ ⊢ e1 e2 ⇒ C fn infer_application( &mut self, ctx: &Context, func: &Expr, arg: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; let func_ty_applied = self.apply_ctx_type(&ctx1, &func_ty); let (result_ty, ctx2, tree2) = self.infer_app(&ctx1, &func_ty_applied, arg)?; let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx2); Ok(( result_ty, ctx2, InferenceTree::new("InfApp", input, &output, vec![tree1, tree2]), )) } }
Let Bindings implement the T-Let rule:
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \quad \Gamma, x : A \vdash e_2 \Rightarrow B}{\Gamma \vdash \text{let } x = e_1 \text{ in } e_2 \Rightarrow B} \text{(T-Let)} \]
#![allow(unused)] fn main() { /// T-Let: Let binding rule /// Γ ⊢ e1 ⇒ A Γ,x:A ⊢ e2 ⇒ B /// ───────────────────────────── /// Γ ⊢ let x = e1 in e2 ⇒ B fn infer_let( &mut self, ctx: &Context, x: &str, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let mut new_ctx = ctx1.clone(); new_ctx.push(Entry::VarBnd(x.to_string(), ty1)); let (ty2, ctx2, tree2) = self.infer(&new_ctx, e2)?; let (left, _, right) = ctx2.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let output = format!("{} ⇒ {} ⊣ {}", input, ty2, final_ctx); Ok(( ty2, final_ctx, InferenceTree::new("InfLet", input, &output, vec![tree1, tree2]), )) } }
Conditional Expressions use the T-If rule:
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Bool} \quad \Gamma \vdash e_2 \Rightarrow A \quad \Gamma \vdash e_3 \Leftarrow A}{\Gamma \vdash \text{if } e_1 \text{ then } e_2 \text{ else } e_3 \Rightarrow A} \text{(T-If)} \]
#![allow(unused)] fn main() { /// T-If: Conditional rule /// Γ ⊢ e1 ⇐ Bool Γ ⊢ e2 ⇒ A Γ ⊢ e3 ⇒ A /// ────────────────────────────────────────── /// Γ ⊢ if e1 then e2 else e3 ⇒ A fn infer_if( &mut self, ctx: &Context, e1: &Expr, e2: &Expr, e3: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ty2, ctx2, tree2) = self.infer(&ctx1, e2)?; let (ty3, ctx3, tree3) = self.infer(&ctx2, e3)?; // Ensure both branches have the same type let (unified_ctx, tree_unify) = self.subtype(&ctx3, &ty2, &ty3)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty2, unified_ctx); Ok(( ty2, unified_ctx, InferenceTree::new( "InfIf", input, &output, vec![tree1, tree2, tree3, tree_unify], ), )) } }
Binary Operations implement the T-BinOp rules (T-Arith, T-Bool, T-Cmp, T-Eq):
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Int} \quad \Gamma \vdash e_2 \Leftarrow \text{Int}}{\Gamma \vdash e_1 \oplus e_2 \Rightarrow \text{Int}} \text{(T-Arith)} \]
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Bool} \quad \Gamma \vdash e_2 \Leftarrow \text{Bool}}{\Gamma \vdash e_1 \land e_2 \Rightarrow \text{Bool}} \text{(T-Bool)} \]
\[ \frac{\Gamma \vdash e_1 \Leftarrow \text{Int} \quad \Gamma \vdash e_2 \Leftarrow \text{Int}}{\Gamma \vdash e_1 < e_2 \Rightarrow \text{Bool}} \text{(T-Cmp)} \]
\[ \frac{\Gamma \vdash e_1 \Rightarrow A \quad \Gamma \vdash e_2 \Leftarrow A}{\Gamma \vdash e_1 = e_2 \Rightarrow \text{Bool}} \text{(T-Eq)} \]
#![allow(unused)] fn main() { /// T-BinOp: Binary operation rules fn infer_binop( &mut self, ctx: &Context, op: &BinOp, e1: &Expr, e2: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { match op { // T-Arith: Int → Int → Int BinOp::Add | BinOp::Sub | BinOp::Mul | BinOp::Div => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Int ⊣ {}", input, ctx2); Ok(( Type::Int, ctx2, InferenceTree::new("InfArith", input, &output, vec![tree1, tree2]), )) } // T-Bool: Bool → Bool → Bool BinOp::And | BinOp::Or => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Bool)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Bool)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfBool", input, &output, vec![tree1, tree2]), )) } // T-Cmp: Int → Int → Bool BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge => { let (ctx1, tree1) = self.check(ctx, e1, &Type::Int)?; let (ctx2, tree2) = self.check(&ctx1, e2, &Type::Int)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfCmp", input, &output, vec![tree1, tree2]), )) } // T-Eq: ∀α. α → α → Bool BinOp::Eq | BinOp::Ne => { let (ty1, ctx1, tree1) = self.infer(ctx, e1)?; let (ctx2, tree2) = self.check(&ctx1, e2, &ty1)?; let output = format!("{} ⇒ Bool ⊣ {}", input, ctx2); Ok(( Type::Bool, ctx2, InferenceTree::new("InfEq", input, &output, vec![tree1, tree2]), )) } } } }
Type Annotations use the T-Instr rule:
\[ \frac{\Gamma \vdash e \Leftarrow A}{\Gamma \vdash (e : A) \Rightarrow A} \text{(T-Instr)} \]
#![allow(unused)] fn main() { /// T-Instr: Type annotation rule /// Γ ⊢ e ⇐ A /// ────────────────── /// Γ ⊢ (e : A) ⇒ A fn infer_ann( &mut self, ctx: &Context, expr: &Expr, ty: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (ctx1, tree) = self.check(ctx, expr, ty)?; let output = format!("{} ⇒ {} ⊣ {}", input, ty, ctx1); Ok(( ty.clone(), ctx1, InferenceTree::new("InfAnn", input, &output, vec![tree]), )) } }
Type Abstraction implements the T-TAbs rule:
\[ \frac{\Gamma, \alpha \vdash e \Rightarrow A}{\Gamma \vdash \Lambda \alpha. e \Rightarrow \forall \alpha. A} \text{(T-TAbs)} \]
#![allow(unused)] fn main() { /// T-ForallI: Type abstraction rule /// Γ, α ⊢ e ⇒ A /// ────────────────────── /// Γ ⊢ Λα. e ⇒ ∀α. A fn infer_tabs( &mut self, ctx: &Context, a: &str, body: &Expr, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.to_string())); let (body_ty, ctx1, tree) = self.infer(&new_ctx, body)?; // Apply context substitutions to resolve existential variables before removing // type binding let resolved_body_ty = self.apply_ctx_type(&ctx1, &body_ty); let (left, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); // Preserve solved existential variable bindings from the left context let mut final_ctx_entries = left .into_iter() .filter(|entry| matches!(entry, Entry::SETVarBnd(_, _))) .collect::<Vec<_>>(); final_ctx_entries.extend(right); let final_ctx = Context(final_ctx_entries); let result_ty = Type::Forall(a.to_string(), Box::new(resolved_body_ty)); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, final_ctx); Ok(( result_ty, final_ctx, InferenceTree::new("InfTAbs", input, &output, vec![tree]), )) } }
Type Application uses the T-TApp rule:
\[ \frac{\Gamma \vdash e \Rightarrow \forall \alpha. A}{\Gamma \vdash e[B] \Rightarrow [B/\alpha]A} \text{(T-TApp)} \]
#![allow(unused)] fn main() { /// T-ForallE: Type application rule /// Γ ⊢ e ⇒ ∀α. A /// ────────────────────── /// Γ ⊢ e[B] ⇒ [B/α]A fn infer_tapp( &mut self, ctx: &Context, func: &Expr, ty_arg: &Type, input: &str, ) -> TypeResult<(Type, Context, InferenceTree)> { let (func_ty, ctx1, tree1) = self.infer(ctx, func)?; match func_ty { Type::Forall(a, body_ty) => { let result_ty = self.subst_type(&a, ty_arg, &body_ty); let output = format!("{} ⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty, ctx1, InferenceTree::new("InfTApp", input, &output, vec![tree1]), )) } _ => Err(TypeError::TypeApplicationError { actual: func_ty.clone(), expr: None, }), } } }
Each method includes comments linking it to the formal typing rule it implements, making the correspondence between theory and implementation explicit.
Checking Rules
The checking mode verifies that expressions conform to expected types. This is where the algorithm can make progress even when types aren’t fully determined.
#![allow(unused)] fn main() { fn check( &mut self, ctx: &Context, expr: &Expr, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {:?} ⇐ {}", ctx, expr, ty); match (expr, ty) { (Expr::LitInt(_), Type::Int) => Ok(( ctx.clone(), InferenceTree::new("ChkLitInt", &input, &format!("{}", ctx), vec![]), )), (Expr::LitBool(_), Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("ChkLitBool", &input, &format!("{}", ctx), vec![]), )), (Expr::Abs(x, _param_ty, body), Type::Arrow(expected_param, result_ty)) => { // For simplicity, we assume the parameter type matches let mut new_ctx = ctx.clone(); new_ctx.push(Entry::VarBnd(x.clone(), *expected_param.clone())); let (ctx1, tree) = self.check(&new_ctx, body, result_ty)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::VarBnd(name, _) if name == x)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkLam", &input, &output, vec![tree]), )) } (_, Type::Forall(a, ty_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(a.clone())); let (ctx1, tree) = self.check(&new_ctx, expr, ty_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("ChkAll", &input, &output, vec![tree]), )) } _ => { // Fallback to inference + subtyping let (inferred_ty, ctx1, tree1) = self.infer(ctx, expr)?; let inferred_applied = self.apply_ctx_type(&ctx1, &inferred_ty); let ty_applied = self.apply_ctx_type(&ctx1, ty); let (ctx2, tree2) = self.subtype(&ctx1, &inferred_applied, &ty_applied)?; let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("ChkSub", &input, &output, vec![tree1, tree2]), )) } } } }
Checking mode provides specialized handling that takes advantage of known type information to guide the inference process more efficiently. For lambda expressions, when checking \( \lambda x:\tau_1. e \) against type \( \tau_1 \to \tau_2 \), the algorithm can immediately verify that the parameter types match and then recursively check the body \( e \) against the expected return type \( \tau_2 \). This direct decomposition avoids the need to synthesize types and then verify compatibility.
When checking against universal types \( \forall\alpha. \tau \), the algorithm introduces the type variable \( \alpha \) into the context and then checks the expression against the instantiated type \( \tau \). This approach ensures that the universal quantification is handled correctly while maintaining the scoping discipline required for sound type checking. When direct checking strategies are not applicable to the current expression and expected type combination, the algorithm falls back to a synthesis-plus-subtyping approach, where it first synthesizes a type for the expression and then verifies that this synthesized type is a subtype of the expected type.
Subtyping and Instantiation
Subtyping Relations
System F includes a subtyping system that handles the relationships between polymorphic types. The key insight is that ∀α. τ
is more general than any specific instantiation of τ
.
#![allow(unused)] fn main() { fn subtype( &mut self, ctx: &Context, t1: &Type, t2: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} <: {}", ctx, t1, t2); match (t1, t2) { (Type::Int, Type::Int) | (Type::Bool, Type::Bool) => Ok(( ctx.clone(), InferenceTree::new("SubRefl", &input, &format!("{}", ctx), vec![]), )), (Type::Var(a), Type::Var(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflTVar", &input, &format!("{}", ctx), vec![]), )), (Type::ETVar(a), Type::ETVar(b)) if a == b => Ok(( ctx.clone(), InferenceTree::new("SubReflETVar", &input, &format!("{}", ctx), vec![]), )), (Type::Arrow(a1, a2), Type::Arrow(b1, b2)) => { let (ctx1, tree1) = self.subtype(ctx, b1, a1)?; // Contravariant in argument let (ctx2, tree2) = self.subtype(&ctx1, a2, b2)?; // Covariant in result let output = format!("{}", ctx2); Ok(( ctx2, InferenceTree::new("SubArr", &input, &output, vec![tree1, tree2]), )) } (_, Type::Forall(b, t2_body)) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.subtype(&new_ctx, t1, t2_body)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllR", &input, &output, vec![tree]), )) } (Type::Forall(a, t1_body), _) => { let subst_t1 = self.subst_type(a, &Type::ETVar(a.clone()), t1_body); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(a.clone())); new_ctx.push(Entry::Mark(a.clone())); let (ctx1, tree) = self.subtype(&new_ctx, &subst_t1, t2)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == a)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("SubAllL", &input, &output, vec![tree]), )) } (Type::ETVar(a), _) if !self.free_vars(t2).contains(a) => { let (ctx1, tree) = self.inst_l(ctx, a, t2)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstL", &input, &output, vec![tree]), )) } (_, Type::ETVar(a)) if !self.free_vars(t1).contains(a) => { let (ctx1, tree) = self.inst_r(ctx, t1, a)?; let output = format!("{}", ctx1); Ok(( ctx1, InferenceTree::new("SubInstR", &input, &output, vec![tree]), )) } _ => Err(TypeError::SubtypingError { left: t1.clone(), right: t2.clone(), expr: None, }), } } }
The subtyping rules capture several essential relationships that govern type compatibility in System F. Function types exhibit the classic contravariant-covariant pattern, where a function that accepts more general arguments and returns more specific results is considered a subtype of a function with more specific argument requirements and more general return types. This means that a function of type \( A_1 \to B_1 \) is a subtype of \( A_2 \to B_2 \) when \( A_2 \leq A_1 \) (contravariant in arguments) and \( B_1 \leq B_2 \) (covariant in results).
Universal quantification follows the principle that \( \forall\alpha. \tau_1 \leq \tau_2 \) holds if \( \tau_1 \leq \tau_2 \) when \( \alpha \) is instantiated with a fresh existential variable, effectively allowing polymorphic types to be related through their instantiations. When subtyping involves existential variables, the algorithm must solve constraints about what these variables should be instantiated to in order to satisfy the subtyping relationship, often leading to the generation of additional constraints that propagate through the system.
Variable Instantiation
The instantiation judgments handle the core complexity of polymorphic type checking. When we have constraints like ^α ≤ τ
(an existential variable should be at most as general as some type), we need to find appropriate instantiations.
#![allow(unused)] fn main() { fn inst_l( &mut self, ctx: &Context, a: &TyVar, ty: &Type, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ ^{} :=< {}", ctx, a, ty); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_r(&ctx1, t1, &a1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_l(&ctx2, &a2, &t2_applied)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstLArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let mut new_ctx = ctx.clone(); new_ctx.push(Entry::TVarBnd(b.clone())); let (ctx1, tree) = self.inst_l(&new_ctx, a, t)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::TVarBnd(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstLAllR", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstLSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } }
Left instantiation (inst_l
) handles cases where the existential variable is on the left side of a constraint. This typically means we’re looking for the most general type that satisfies the constraint.
#![allow(unused)] fn main() { fn inst_r( &mut self, ctx: &Context, ty: &Type, a: &TyVar, ) -> TypeResult<(Context, InferenceTree)> { let input = format!("{} ⊢ {} :=< ^{}", ctx, ty, a); match ty { Type::ETVar(b) if self.before(ctx, a, b) => { let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == b)); let new_ctx = Context::from_parts( left, Entry::SETVarBnd(b.clone(), Type::ETVar(a.clone())), right, ); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRReach", &input, &output, vec![]), )) } Type::Arrow(t1, t2) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree1) = self.inst_l(&ctx1, &a1, t1)?; let t2_applied = self.apply_ctx_type(&ctx2, t2); let (ctx3, tree2) = self.inst_r(&ctx2, &t2_applied, &a2)?; let output = format!("{}", ctx3); Ok(( ctx3, InferenceTree::new("InstRArr", &input, &output, vec![tree1, tree2]), )) } Type::Forall(b, t) => { let subst_t = self.subst_type(b, &Type::ETVar(b.clone()), t); let mut new_ctx = ctx.clone(); new_ctx.push(Entry::ETVarBnd(b.clone())); new_ctx.push(Entry::Mark(b.clone())); let (ctx1, tree) = self.inst_r(&new_ctx, &subst_t, a)?; let (_, _, right) = ctx1.break3(|entry| matches!(entry, Entry::Mark(name) if name == b)); let final_ctx = Context(right); let output = format!("{}", final_ctx); Ok(( final_ctx, InferenceTree::new("InstRAllL", &input, &output, vec![tree]), )) } _ if self.is_mono(ty) => { // Occurs check: ensure ^α doesn't occur in τ to prevent infinite types if self.occurs_check(a, ty) { return Err(TypeError::OccursCheck { var: a.clone(), ty: ty.clone(), expr: None, }); } let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let new_ctx = Context::from_parts(left, Entry::SETVarBnd(a.clone(), ty.clone()), right); let output = format!("{}", new_ctx); Ok(( new_ctx, InferenceTree::new("InstRSolve", &input, &output, vec![]), )) } _ => Err(TypeError::InstantiationError { var: a.clone(), ty: ty.clone(), expr: None, }), } } }
Right instantiation (inst_r
) handles the dual case where the existential variable is on the right. This typically means we’re looking for the most specific type that satisfies the constraint.
The instantiation algorithms include careful handling of several complex scenarios that arise during constraint solving. The reach relationship occurs when two existential variables are constrained against each other, requiring the algorithm to determine which variable should be solved in terms of the other while maintaining the proper ordering constraints. Arrow type instantiation requires breaking function types apart into their component argument and return types, creating separate instantiation constraints for each component that must be solved consistently.
The interaction between instantiation and universal quantification presents particular challenges, as the algorithm must ensure that polymorphic types are instantiated correctly while preserving the scoping discipline that prevents type variables from escaping their intended scope. These cases require constraint management to ensure that all relationships are maintained throughout the solving process.
Occurs Check
A critical component of sound type inference is the occurs check, which prevents infinite types from arising during unification. When solving constraints like ^α := τ
, we must ensure that α
does not occur within τ
, as this would create a cyclic type.
#![allow(unused)] fn main() { /// Check if a type variable occurs in a type (occurs check for unification) /// This prevents creating infinite types when solving ^α := τ by ensuring α /// ∉ τ fn occurs_check(&self, var: &TyVar, ty: &Type) -> bool { match ty { Type::Var(name) | Type::ETVar(name) => name == var, Type::Arrow(t1, t2) => self.occurs_check(var, t1) || self.occurs_check(var, t2), Type::Forall(bound_var, body) => { // If the variable is shadowed by the forall binding, it doesn't occur if bound_var == var { false } else { self.occurs_check(var, body) } } Type::Int | Type::Bool => false, } } }
The occurs check is applied during the InstLSolve and InstRSolve cases of instantiation. Without this check, the type system could accept programs that would lead to infinite types, violating the decidability of type checking.
Application Inference
Function application in System F requires careful handling because the function type might not be immediately apparent. The infer_app
judgment implements the T-AppArrow and T-AppEVar rules defined earlier:
#![allow(unused)] fn main() { fn infer_app( &mut self, ctx: &Context, func_ty: &Type, arg: &Expr, ) -> TypeResult<(Type, Context, InferenceTree)> { let input = format!("{} ⊢ {:?} • {}", ctx, arg, func_ty); match func_ty { Type::Arrow(param_ty, result_ty) => { let (ctx1, tree) = self.check(ctx, arg, param_ty)?; let output = format!("{} ⇒⇒ {} ⊣ {}", input, result_ty, ctx1); Ok(( result_ty.as_ref().clone(), ctx1, InferenceTree::new("InfAppArr", &input, &output, vec![tree]), )) } Type::ETVar(a) => { let a1 = self.fresh_tyvar(); let a2 = self.fresh_tyvar(); let (left, _, right) = ctx.break3(|entry| matches!(entry, Entry::ETVarBnd(name) if name == a)); let arrow_type = Type::Arrow( Box::new(Type::ETVar(a1.clone())), Box::new(Type::ETVar(a2.clone())), ); let mut new_ctx = left; new_ctx.push(Entry::SETVarBnd(a.clone(), arrow_type)); new_ctx.push(Entry::ETVarBnd(a1.clone())); new_ctx.push(Entry::ETVarBnd(a2.clone())); new_ctx.extend(right); let ctx1 = Context(new_ctx); let (ctx2, tree) = self.check(&ctx1, arg, &Type::ETVar(a1))?; let output = format!("{} ⇒⇒ ^{} ⊣ {}", input, a2, ctx2); Ok(( Type::ETVar(a2), ctx2, InferenceTree::new("InfAppETVar", &input, &output, vec![tree]), )) } _ => Err(TypeError::ApplicationTypeError { actual: func_ty.clone(), expr: None, }), } } }
The implementation handles two core application scenarios through distinct inference rules. The T-AppArrow rule applies when the function has a known arrow type \( A \to B \), allowing the algorithm to check the argument against \( A \) and return \( B \) as the result type. This straightforward case corresponds to the Type::Arrow
pattern in the implementation and represents the standard function application scenario.
The T-AppEVar rule handles the more complex case where the function type is an existential variable \( \hat{\alpha} \). In this situation, the algorithm instantiates the existential variable as \( \hat{\alpha_1} \to \hat{\alpha_2} \) with fresh existential variables, then checks the argument against \( \hat{\alpha_1} \) and returns \( \hat{\alpha_2} \) as the result type. This corresponds to the Type::ETVar
case and enables type inference even when the function type is initially unknown.
When the function has a polymorphic Forall
type, the instantiation is handled through the subtyping mechanism using the SubAllL rule rather than directly in application inference. This design choice ensures soundness by routing polymorphic instantiation through the well-established subtyping infrastructure and follows the standard bidirectional algorithm design patterns.
Error Handling
Our implementation provides comprehensive error reporting that distinguishes between parse errors and type errors. Parse errors use source-located reporting with ariadne, while type errors provide contextual information about the expressions being typed.
#![allow(unused)] fn main() { use ariadne::{Color, Label, Report, ReportKind}; use thiserror::Error; use crate::ast::Type; /// Source span for error reporting #[derive(Debug, Clone, PartialEq)] pub struct Span { pub start: usize, pub end: usize, } impl Span { pub fn new(start: usize, end: usize) -> Self { Self { start, end } } pub fn point(pos: usize) -> Self { Self { start: pos, end: pos + 1, } } } impl TypeError {} impl std::fmt::Display for TypeError { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { TypeError::UnboundVariable { name, expr } => { write!(f, "Variable '{}' not found in context", name)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::ApplicationTypeError { actual, expr } => { write!(f, "Expected function type in application, got {}", actual)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::TypeApplicationError { actual, expr } => { write!( f, "Expected forall type in type application, got {}", actual )?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::OccursCheck { var, ty, expr } => { write!( f, "Occurs check failed: variable '{}' occurs in type {}", var, ty )?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::SubtypingError { left, right, expr } => { write!(f, "No matching rule for subtyping {} <: {}", left, right)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::InstantiationError { var, ty, expr } => { write!(f, "No matching rule for instantiation {} :=< {}", var, ty)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::CheckingError { expected, expr } => { write!(f, "No matching rule for checking against type {}", expected)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::InferenceError { expr } => { write!(f, "No matching rule for inference")?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::ContextError { message, expr } => { write!(f, "Context error: {}", message)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } } } } #[derive(Error, Debug)] #[allow(dead_code)] pub enum ParseError { #[error("Parse error: {message}")] LalrpopError { message: String, span: Span }, } impl ParseError { pub fn to_ariadne_report<'a>( &self, filename: &'a str, ) -> Report<'a, (&'a str, std::ops::Range<usize>)> { match self { ParseError::LalrpopError { message, span } => { Report::build(ReportKind::Error, filename, span.start) .with_message("Parse Error") .with_label( Label::new((filename, span.start..span.end)) .with_message(message) .with_color(Color::Red), ) .finish() } } } } pub type TypeResult<T> = Result<T, TypeError>; #[allow(dead_code)] pub type ParseResult<T> = Result<T, ParseError>; #[derive(Debug)] #[allow(dead_code, clippy::enum_variant_names)] pub enum TypeError { UnboundVariable { name: String, expr: Option<String>, }, ApplicationTypeError { actual: Type, expr: Option<String>, }, TypeApplicationError { actual: Type, expr: Option<String>, }, OccursCheck { var: String, ty: Type, expr: Option<String>, }, SubtypingError { left: Type, right: Type, expr: Option<String>, }, InstantiationError { var: String, ty: Type, expr: Option<String>, }, CheckingError { expected: Type, expr: Option<String>, }, InferenceError { expr: Option<String>, }, ContextError { message: String, expr: Option<String>, }, } }
The type error system includes expression context to help developers understand where failures occur during type checking. Each error variant includes an expr
field that stores the expression being typed when the error occurred, providing valuable debugging information.
Parse errors receive enhanced treatment with source location information for precise error reporting:
#![allow(unused)] fn main() { use ariadne::{Color, Label, Report, ReportKind}; use thiserror::Error; use crate::ast::Type; /// Source span for error reporting #[derive(Debug, Clone, PartialEq)] pub struct Span { pub start: usize, pub end: usize, } impl Span { pub fn new(start: usize, end: usize) -> Self { Self { start, end } } pub fn point(pos: usize) -> Self { Self { start: pos, end: pos + 1, } } } #[derive(Debug)] #[allow(dead_code, clippy::enum_variant_names)] pub enum TypeError { UnboundVariable { name: String, expr: Option<String>, }, ApplicationTypeError { actual: Type, expr: Option<String>, }, TypeApplicationError { actual: Type, expr: Option<String>, }, OccursCheck { var: String, ty: Type, expr: Option<String>, }, SubtypingError { left: Type, right: Type, expr: Option<String>, }, InstantiationError { var: String, ty: Type, expr: Option<String>, }, CheckingError { expected: Type, expr: Option<String>, }, InferenceError { expr: Option<String>, }, ContextError { message: String, expr: Option<String>, }, } impl TypeError {} impl std::fmt::Display for TypeError { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { TypeError::UnboundVariable { name, expr } => { write!(f, "Variable '{}' not found in context", name)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::ApplicationTypeError { actual, expr } => { write!(f, "Expected function type in application, got {}", actual)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::TypeApplicationError { actual, expr } => { write!( f, "Expected forall type in type application, got {}", actual )?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::OccursCheck { var, ty, expr } => { write!( f, "Occurs check failed: variable '{}' occurs in type {}", var, ty )?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::SubtypingError { left, right, expr } => { write!(f, "No matching rule for subtyping {} <: {}", left, right)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::InstantiationError { var, ty, expr } => { write!(f, "No matching rule for instantiation {} :=< {}", var, ty)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::CheckingError { expected, expr } => { write!(f, "No matching rule for checking against type {}", expected)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::InferenceError { expr } => { write!(f, "No matching rule for inference")?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } TypeError::ContextError { message, expr } => { write!(f, "Context error: {}", message)?; if let Some(expr) = expr { write!(f, "\n When typing expression: {}", expr)?; } Ok(()) } } } } impl ParseError { pub fn to_ariadne_report<'a>( &self, filename: &'a str, ) -> Report<'a, (&'a str, std::ops::Range<usize>)> { match self { ParseError::LalrpopError { message, span } => { Report::build(ReportKind::Error, filename, span.start) .with_message("Parse Error") .with_label( Label::new((filename, span.start..span.end)) .with_message(message) .with_color(Color::Red), ) .finish() } } } } pub type TypeResult<T> = Result<T, TypeError>; #[allow(dead_code)] pub type ParseResult<T> = Result<T, ParseError>; #[derive(Error, Debug)] #[allow(dead_code)] pub enum ParseError { #[error("Parse error: {message}")] LalrpopError { message: String, span: Span }, } }
This dual approach ensures that syntax errors receive precise source location feedback while type errors focus on logical relationships between expressions and types.
End-to-end
To demonstrate the complete System F implementation with all the modular typing rules working together, consider this lambda expression that doubles its integer argument:
$ cargo run -- check "\x : Int -> x + x"
The system produces the following output showing the complete inference process:
Parsed expression: Abs("x", Int, BinOp(Add, Var("x"), Var("x")))
Type checking successful!
Final type: Int -> Int
InfLam: ⊢ Abs("x", Int, BinOp(Add, Var("x"), Var("x"))) => ⊢ Abs("x", Int, BinOp(Add, Var("x"), Var("x"))) ⇒ Int -> ^α0 ⊣ ^α0 = Int
ChkSub: ^α0, x: Int ⊢ BinOp(Add, Var("x"), Var("x")) ⇐ ^α0 => ^α0 = Int, x: Int
InfArith: ^α0, x: Int ⊢ BinOp(Add, Var("x"), Var("x")) => ^α0, x: Int ⊢ BinOp(Add, Var("x"), Var("x")) ⇒ Int ⊣ ^α0, x: Int
ChkSub: ^α0, x: Int ⊢ Var("x") ⇐ Int => ^α0, x: Int
InfVar: ^α0, x: Int ⊢ Var("x") => ^α0, x: Int ⊢ Var("x") ⇒ Int ⊣ ^α0, x: Int
SubRefl: ^α0, x: Int ⊢ Int <: Int => ^α0, x: Int
ChkSub: ^α0, x: Int ⊢ Var("x") ⇐ Int => ^α0, x: Int
InfVar: ^α0, x: Int ⊢ Var("x") => ^α0, x: Int ⊢ Var("x") ⇒ Int ⊣ ^α0, x: Int
SubRefl: ^α0, x: Int ⊢ Int <: Int => ^α0, x: Int
SubInstR: ^α0, x: Int ⊢ Int <: ^α0 => ^α0 = Int, x: Int
InstRSolve: ^α0, x: Int ⊢ Int :=< ^α0 => ^α0 = Int, x: Int
The final result shows Final type: Int -> Int
, correctly inferring that this lambda expression is a function from integers to integers! The existential variable ^α0
in the proof tree gets resolved to Int
through the constraint solving process, and the final type application ensures all existential variables are properly substituted in the output.
For non-trivial expressions, the constraint solving process may involve more complex reasoning and may require additional inference steps but using the proof tree method we can visualize the inference steps and understand the constraints being solved. This is a very powerful technique.
Mission Accomplished
And there we have it! A complete bidirectional type checker for System F with polymorphic types, existential variables, and constraint solving! The algorithm handles the complex interplay between synthesis and checking modes, managing existential variables and subtyping relationships with the precision needed for a production-quality type system.
The bidirectional approach transforms what could be an overwhelming inference problem into a systematic, decidable process that gives us both powerful expressiveness and reliable type safety. System F’s polymorphism opens up a whole new world of type-safe programming that feels almost magical once you see it in action!