Implementation
Rows
The row is the spine of every record type. A row is one of three things: the closed empty row, a polymorphic tail variable, or an extension that prepends a labelled field to another row. The same data type is reused by the row effects crate with the field payload dropped.
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq)] pub enum Expr { Var(String), Lit(Lit), Abs(String, Box<Expr>), App(Box<Expr>, Box<Expr>), Let(String, Box<Expr>, Box<Expr>), EmptyRecord, Extend(String, Box<Expr>, Box<Expr>), Select(Box<Expr>, String), Restrict(Box<Expr>, String), } #[derive(Debug, Clone, PartialEq)] pub enum Lit { Int(i64), Bool(bool), } #[derive(Debug, Clone, PartialEq)] pub enum Type { Var(String), Int, Bool, Arrow(Box<Type>, Box<Type>), Record(Box<Row>), } #[derive(Debug, Clone, PartialEq)] pub struct Scheme { pub type_vars: Vec<String>, pub row_vars: Vec<String>, pub ty: Type, } impl std::fmt::Display for Expr { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Expr::Var(name) => write!(f, "{}", name), Expr::Lit(Lit::Int(n)) => write!(f, "{}", n), Expr::Lit(Lit::Bool(b)) => write!(f, "{}", b), Expr::Abs(param, body) => write!(f, "\\{} -> {}", param, body), Expr::App(func, arg) => match (func.as_ref(), arg.as_ref()) { (Expr::Abs(_, _), _) => write!(f, "({}) {}", func, arg), (_, Expr::App(_, _)) => write!(f, "{} ({})", func, arg), (_, Expr::Abs(_, _)) => write!(f, "{} ({})", func, arg), _ => write!(f, "{} {}", func, arg), }, Expr::Let(var, value, body) => { write!(f, "let {} = {} in {}", var, value, body) } Expr::EmptyRecord => write!(f, "{{}}"), Expr::Extend(_, _, _) => { let (fields, tail) = collect_extend(self); write!(f, "{{")?; for (i, (label, value)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} = {}", label, value)?; } match tail { Expr::EmptyRecord => write!(f, "}}"), other => write!(f, " | {}}}", other), } } Expr::Select(e, label) => write!(f, "{}.{}", e, label), Expr::Restrict(e, label) => write!(f, "{{{} - {}}}", e, label), } } } fn collect_extend(mut e: &Expr) -> (Vec<(String, &Expr)>, &Expr) { let mut fields = Vec::new(); while let Expr::Extend(l, v, rest) = e { fields.push((l.clone(), v.as_ref())); e = rest.as_ref(); } (fields, e) } impl std::fmt::Display for Type { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Type::Var(name) => write!(f, "{}", name), Type::Int => write!(f, "Int"), Type::Bool => write!(f, "Bool"), Type::Arrow(t1, t2) => match t1.as_ref() { Type::Arrow(_, _) => write!(f, "({}) -> {}", t1, t2), _ => write!(f, "{} -> {}", t1, t2), }, Type::Record(row) => write!(f, "{{{}}}", row), } } } impl std::fmt::Display for Row { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { let (fields, tail) = collect_row(self); for (i, (label, ty)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} : {}", label, ty)?; } match tail { Row::Empty => Ok(()), Row::Var(name) => { if fields.is_empty() { write!(f, "{}", name) } else { write!(f, " | {}", name) } } Row::Extend(_, _, _) => unreachable!("collect_row drains all extensions"), } } } fn collect_row(mut r: &Row) -> (Vec<(String, &Type)>, &Row) { let mut fields = Vec::new(); while let Row::Extend(l, t, rest) = r { fields.push((l.clone(), t.as_ref())); r = rest.as_ref(); } (fields, r) } impl std::fmt::Display for Scheme { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { if self.type_vars.is_empty() && self.row_vars.is_empty() { write!(f, "{}", self.ty) } else { write!(f, "forall")?; for v in &self.type_vars { write!(f, " {}", v)?; } for v in &self.row_vars { write!(f, " {}", v)?; } write!(f, ". {}", self.ty) } } } pub fn build_extend(fields: Vec<(String, Box<Expr>)>, tail: Box<Expr>) -> Box<Expr> { let mut acc = tail; for (label, value) in fields.into_iter().rev() { acc = Box::new(Expr::Extend(label, value, acc)); } acc } #[derive(Debug, Clone, PartialEq)] pub enum Row { Empty, Var(String), Extend(String, Box<Type>, Box<Row>), } }
Empty is the closed row containing no labels. Var is a polymorphic tail standing for some unknown row. Extend adds a label with its field type on top of an existing row. The row {x : Int, y : Bool | r} is Extend("x", Int, Extend("y", Bool, Var("r"))). Two occurrences of the same label are distinct, so {x : Int, x : Bool} is a record with two x fields, the leftmost shadowing the rightmost under selection and restriction.
A record type wraps a row, and the type language is otherwise the small core of variables, base types, and arrows.
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq)] pub enum Expr { Var(String), Lit(Lit), Abs(String, Box<Expr>), App(Box<Expr>, Box<Expr>), Let(String, Box<Expr>, Box<Expr>), EmptyRecord, Extend(String, Box<Expr>, Box<Expr>), Select(Box<Expr>, String), Restrict(Box<Expr>, String), } #[derive(Debug, Clone, PartialEq)] pub enum Lit { Int(i64), Bool(bool), } #[derive(Debug, Clone, PartialEq)] pub enum Row { Empty, Var(String), Extend(String, Box<Type>, Box<Row>), } #[derive(Debug, Clone, PartialEq)] pub struct Scheme { pub type_vars: Vec<String>, pub row_vars: Vec<String>, pub ty: Type, } impl std::fmt::Display for Expr { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Expr::Var(name) => write!(f, "{}", name), Expr::Lit(Lit::Int(n)) => write!(f, "{}", n), Expr::Lit(Lit::Bool(b)) => write!(f, "{}", b), Expr::Abs(param, body) => write!(f, "\\{} -> {}", param, body), Expr::App(func, arg) => match (func.as_ref(), arg.as_ref()) { (Expr::Abs(_, _), _) => write!(f, "({}) {}", func, arg), (_, Expr::App(_, _)) => write!(f, "{} ({})", func, arg), (_, Expr::Abs(_, _)) => write!(f, "{} ({})", func, arg), _ => write!(f, "{} {}", func, arg), }, Expr::Let(var, value, body) => { write!(f, "let {} = {} in {}", var, value, body) } Expr::EmptyRecord => write!(f, "{{}}"), Expr::Extend(_, _, _) => { let (fields, tail) = collect_extend(self); write!(f, "{{")?; for (i, (label, value)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} = {}", label, value)?; } match tail { Expr::EmptyRecord => write!(f, "}}"), other => write!(f, " | {}}}", other), } } Expr::Select(e, label) => write!(f, "{}.{}", e, label), Expr::Restrict(e, label) => write!(f, "{{{} - {}}}", e, label), } } } fn collect_extend(mut e: &Expr) -> (Vec<(String, &Expr)>, &Expr) { let mut fields = Vec::new(); while let Expr::Extend(l, v, rest) = e { fields.push((l.clone(), v.as_ref())); e = rest.as_ref(); } (fields, e) } impl std::fmt::Display for Type { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Type::Var(name) => write!(f, "{}", name), Type::Int => write!(f, "Int"), Type::Bool => write!(f, "Bool"), Type::Arrow(t1, t2) => match t1.as_ref() { Type::Arrow(_, _) => write!(f, "({}) -> {}", t1, t2), _ => write!(f, "{} -> {}", t1, t2), }, Type::Record(row) => write!(f, "{{{}}}", row), } } } impl std::fmt::Display for Row { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { let (fields, tail) = collect_row(self); for (i, (label, ty)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} : {}", label, ty)?; } match tail { Row::Empty => Ok(()), Row::Var(name) => { if fields.is_empty() { write!(f, "{}", name) } else { write!(f, " | {}", name) } } Row::Extend(_, _, _) => unreachable!("collect_row drains all extensions"), } } } fn collect_row(mut r: &Row) -> (Vec<(String, &Type)>, &Row) { let mut fields = Vec::new(); while let Row::Extend(l, t, rest) = r { fields.push((l.clone(), t.as_ref())); r = rest.as_ref(); } (fields, r) } impl std::fmt::Display for Scheme { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { if self.type_vars.is_empty() && self.row_vars.is_empty() { write!(f, "{}", self.ty) } else { write!(f, "forall")?; for v in &self.type_vars { write!(f, " {}", v)?; } for v in &self.row_vars { write!(f, " {}", v)?; } write!(f, ". {}", self.ty) } } } pub fn build_extend(fields: Vec<(String, Box<Expr>)>, tail: Box<Expr>) -> Box<Expr> { let mut acc = tail; for (label, value) in fields.into_iter().rev() { acc = Box::new(Expr::Extend(label, value, acc)); } acc } #[derive(Debug, Clone, PartialEq)] pub enum Type { Var(String), Int, Bool, Arrow(Box<Type>, Box<Type>), Record(Box<Row>), } }
Generalisation tracks type and row variables separately because they live in disjoint substitutions and instantiate against different kinds of structure.
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq)] pub enum Expr { Var(String), Lit(Lit), Abs(String, Box<Expr>), App(Box<Expr>, Box<Expr>), Let(String, Box<Expr>, Box<Expr>), EmptyRecord, Extend(String, Box<Expr>, Box<Expr>), Select(Box<Expr>, String), Restrict(Box<Expr>, String), } #[derive(Debug, Clone, PartialEq)] pub enum Lit { Int(i64), Bool(bool), } #[derive(Debug, Clone, PartialEq)] pub enum Type { Var(String), Int, Bool, Arrow(Box<Type>, Box<Type>), Record(Box<Row>), } #[derive(Debug, Clone, PartialEq)] pub enum Row { Empty, Var(String), Extend(String, Box<Type>, Box<Row>), } impl std::fmt::Display for Expr { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Expr::Var(name) => write!(f, "{}", name), Expr::Lit(Lit::Int(n)) => write!(f, "{}", n), Expr::Lit(Lit::Bool(b)) => write!(f, "{}", b), Expr::Abs(param, body) => write!(f, "\\{} -> {}", param, body), Expr::App(func, arg) => match (func.as_ref(), arg.as_ref()) { (Expr::Abs(_, _), _) => write!(f, "({}) {}", func, arg), (_, Expr::App(_, _)) => write!(f, "{} ({})", func, arg), (_, Expr::Abs(_, _)) => write!(f, "{} ({})", func, arg), _ => write!(f, "{} {}", func, arg), }, Expr::Let(var, value, body) => { write!(f, "let {} = {} in {}", var, value, body) } Expr::EmptyRecord => write!(f, "{{}}"), Expr::Extend(_, _, _) => { let (fields, tail) = collect_extend(self); write!(f, "{{")?; for (i, (label, value)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} = {}", label, value)?; } match tail { Expr::EmptyRecord => write!(f, "}}"), other => write!(f, " | {}}}", other), } } Expr::Select(e, label) => write!(f, "{}.{}", e, label), Expr::Restrict(e, label) => write!(f, "{{{} - {}}}", e, label), } } } fn collect_extend(mut e: &Expr) -> (Vec<(String, &Expr)>, &Expr) { let mut fields = Vec::new(); while let Expr::Extend(l, v, rest) = e { fields.push((l.clone(), v.as_ref())); e = rest.as_ref(); } (fields, e) } impl std::fmt::Display for Type { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Type::Var(name) => write!(f, "{}", name), Type::Int => write!(f, "Int"), Type::Bool => write!(f, "Bool"), Type::Arrow(t1, t2) => match t1.as_ref() { Type::Arrow(_, _) => write!(f, "({}) -> {}", t1, t2), _ => write!(f, "{} -> {}", t1, t2), }, Type::Record(row) => write!(f, "{{{}}}", row), } } } impl std::fmt::Display for Row { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { let (fields, tail) = collect_row(self); for (i, (label, ty)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} : {}", label, ty)?; } match tail { Row::Empty => Ok(()), Row::Var(name) => { if fields.is_empty() { write!(f, "{}", name) } else { write!(f, " | {}", name) } } Row::Extend(_, _, _) => unreachable!("collect_row drains all extensions"), } } } fn collect_row(mut r: &Row) -> (Vec<(String, &Type)>, &Row) { let mut fields = Vec::new(); while let Row::Extend(l, t, rest) = r { fields.push((l.clone(), t.as_ref())); r = rest.as_ref(); } (fields, r) } impl std::fmt::Display for Scheme { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { if self.type_vars.is_empty() && self.row_vars.is_empty() { write!(f, "{}", self.ty) } else { write!(f, "forall")?; for v in &self.type_vars { write!(f, " {}", v)?; } for v in &self.row_vars { write!(f, " {}", v)?; } write!(f, ". {}", self.ty) } } } pub fn build_extend(fields: Vec<(String, Box<Expr>)>, tail: Box<Expr>) -> Box<Expr> { let mut acc = tail; for (label, value) in fields.into_iter().rev() { acc = Box::new(Expr::Extend(label, value, acc)); } acc } #[derive(Debug, Clone, PartialEq)] pub struct Scheme { pub type_vars: Vec<String>, pub row_vars: Vec<String>, pub ty: Type, } }
Unification
Row unification follows Figure 2 of Extensible records with scoped labels. Equating two rows reduces to equating their head labels in turn. When the left side exposes a label l, we hoist l to the head of the right side, unify the field types, then unify the residual tails.
#![allow(unused)] fn main() { fn unify_row(&mut self, r1: &Row, r2: &Row) -> Result<(Subst, InferenceTree)> { let input = format!("{{{}}} ~ {{{}}}", r1, r2); match (r1, r2) { (Row::Empty, Row::Empty) => { let tree = InferenceTree::new("Unify-RowEmpty", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Row::Var(a), Row::Var(b)) if a == b => { let tree = InferenceTree::new("Unify-RowVar-Same", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Row::Var(v), other) | (other, Row::Var(v)) => { let mut fv = FreeVars::default(); ftv_row(other, &mut fv); if fv.rows.contains(v) { Err(InferenceError::RowOccursCheck { var: v.clone(), row: other.clone(), }) } else { let subst = Subst::singleton_row(v.clone(), other.clone()); let output = format!("{{{{{}}}/{}}}", other, v); let tree = InferenceTree::new("Unify-RowVar", &input, &output, vec![]); Ok((subst, tree)) } } (Row::Extend(l, t1, rest1), other) => { let (t2, rest2, s1) = self.rewrite_row(other, l)?; if let Some(tv) = row_tail(rest1) { if s1.rows.contains_key(tv) { return Err(InferenceError::RecursiveRow { var: tv.to_string(), }); } } let (s2, tree2) = self.unify(&apply_type(&s1, t1), &apply_type(&s1, &t2))?; let s12 = s2.compose(&s1); let (s3, tree3) = self.unify_row(&apply_row(&s12, rest1), &apply_row(&s12, &rest2))?; let final_subst = s3.compose(&s12); let output = self.pretty_subst(&final_subst); let tree = InferenceTree::new("Unify-RowExtend", &input, &output, vec![tree2, tree3]); Ok((final_subst, tree)) } (Row::Empty, Row::Extend(l, _, _)) => Err(InferenceError::MissingLabel { label: l.clone(), row: Row::Empty, }), } } }
The variable cases are standard. A tail variable Var(α) unifies with any row that does not mention α in its tail, which is the row occurs check. The interesting case is (Extend(l, t1, rest1), other). We call rewrite_row to bring l to the front of other, recovering the matching field type t2 and a new tail rest2, then recurse on the field types and on the tails.
The side condition guards against the divergent case identified by Wand. If rest1’s own tail variable was bound by rewrite_row, recursing would force the same variable to be bound again and the algorithm would loop. We detect this by walking rest1 to its tail and checking whether that variable appears in the substitution that rewrite_row just produced.
#![allow(unused)] fn main() { use std::collections::{BTreeMap, HashMap, HashSet}; use std::fmt; use crate::ast::{Expr, Lit, Row, Scheme, Type}; use crate::errors::{InferenceError, Result}; pub type Env = BTreeMap<String, Scheme>; #[derive(Debug)] pub struct InferenceTree { pub rule: String, pub input: String, pub output: String, pub children: Vec<InferenceTree>, } impl InferenceTree { fn new(rule: &str, input: &str, output: &str, children: Vec<InferenceTree>) -> Self { Self { rule: rule.to_string(), input: input.to_string(), output: output.to_string(), children, } } } impl fmt::Display for InferenceTree { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { self.display_with_indent(f, 0) } } impl InferenceTree { fn display_with_indent(&self, f: &mut fmt::Formatter, indent: usize) -> fmt::Result { let prefix = " ".repeat(indent); writeln!( f, "{}{}: {} => {}", prefix, self.rule, self.input, self.output )?; for child in &self.children { child.display_with_indent(f, indent + 1)?; } Ok(()) } } #[derive(Debug, Clone, Default)] pub struct Subst { pub types: HashMap<String, Type>, pub rows: HashMap<String, Row>, } impl Subst { fn empty() -> Self { Self::default() } fn singleton_type(var: String, ty: Type) -> Self { let mut s = Self::empty(); s.types.insert(var, ty); s } fn singleton_row(var: String, row: Row) -> Self { let mut s = Self::empty(); s.rows.insert(var, row); s } // s2 ∘ s1: apply s1 first, then s2. fn compose(&self, s1: &Self) -> Self { let mut types: HashMap<String, Type> = s1 .types .iter() .map(|(k, v)| (k.clone(), apply_type(self, v))) .collect(); for (k, v) in &self.types { types.entry(k.clone()).or_insert_with(|| v.clone()); } let mut rows: HashMap<String, Row> = s1 .rows .iter() .map(|(k, v)| (k.clone(), apply_row(self, v))) .collect(); for (k, v) in &self.rows { rows.entry(k.clone()).or_insert_with(|| v.clone()); } Subst { types, rows } } } fn apply_type(s: &Subst, ty: &Type) -> Type { match ty { Type::Var(name) => match s.types.get(name) { Some(t) => apply_type(s, t), None => ty.clone(), }, Type::Int | Type::Bool => ty.clone(), Type::Arrow(a, b) => Type::Arrow(Box::new(apply_type(s, a)), Box::new(apply_type(s, b))), Type::Record(r) => Type::Record(Box::new(apply_row(s, r))), } } fn apply_row(s: &Subst, row: &Row) -> Row { match row { Row::Empty => Row::Empty, Row::Var(name) => match s.rows.get(name) { Some(r) => apply_row(s, r), None => row.clone(), }, Row::Extend(l, t, r) => Row::Extend( l.clone(), Box::new(apply_type(s, t)), Box::new(apply_row(s, r)), ), } } fn apply_scheme(s: &Subst, scheme: &Scheme) -> Scheme { let mut filtered = s.clone(); for v in &scheme.type_vars { filtered.types.remove(v); } for v in &scheme.row_vars { filtered.rows.remove(v); } Scheme { type_vars: scheme.type_vars.clone(), row_vars: scheme.row_vars.clone(), ty: apply_type(&filtered, &scheme.ty), } } fn apply_env(s: &Subst, env: &Env) -> Env { env.iter() .map(|(k, v)| (k.clone(), apply_scheme(s, v))) .collect() } #[derive(Default)] struct FreeVars { types: HashSet<String>, rows: HashSet<String>, } fn ftv_type(ty: &Type, out: &mut FreeVars) { match ty { Type::Var(n) => { out.types.insert(n.clone()); } Type::Int | Type::Bool => {} Type::Arrow(a, b) => { ftv_type(a, out); ftv_type(b, out); } Type::Record(r) => ftv_row(r, out), } } fn ftv_row(row: &Row, out: &mut FreeVars) { match row { Row::Empty => {} Row::Var(n) => { out.rows.insert(n.clone()); } Row::Extend(_, t, rest) => { ftv_type(t, out); ftv_row(rest, out); } } } fn ftv_scheme(s: &Scheme) -> FreeVars { let mut fv = FreeVars::default(); ftv_type(&s.ty, &mut fv); for v in &s.type_vars { fv.types.remove(v); } for v in &s.row_vars { fv.rows.remove(v); } fv } fn ftv_env(env: &Env) -> FreeVars { let mut fv = FreeVars::default(); for s in env.values() { let s_fv = ftv_scheme(s); fv.types.extend(s_fv.types); fv.rows.extend(s_fv.rows); } fv } pub struct TypeInference { counter: usize, } impl Default for TypeInference { fn default() -> Self { Self::new() } } impl TypeInference { pub fn new() -> Self { Self { counter: 0 } } fn fresh_type(&mut self) -> Type { let v = format!("t{}", self.counter); self.counter += 1; Type::Var(v) } fn fresh_row(&mut self) -> Row { let v = format!("r{}", self.counter); self.counter += 1; Row::Var(v) } fn pretty_env(&self, env: &Env) -> String { if env.is_empty() { "{}".to_string() } else { let entries: Vec<String> = env.iter().map(|(k, v)| format!("{}: {}", k, v)).collect(); format!("{{{}}}", entries.join(", ")) } } fn pretty_subst(&self, subst: &Subst) -> String { if subst.types.is_empty() && subst.rows.is_empty() { "{}".to_string() } else { let mut entries: Vec<String> = Vec::new(); for (k, v) in &subst.types { entries.push(format!("{}/{}", v, k)); } for (k, v) in &subst.rows { entries.push(format!("{}/{}", v, k)); } format!("{{{}}}", entries.join(", ")) } } fn instantiate(&mut self, scheme: &Scheme) -> Type { let mut s = Subst::empty(); for v in &scheme.type_vars { s.types.insert(v.clone(), self.fresh_type()); } for v in &scheme.row_vars { s.rows.insert(v.clone(), self.fresh_row()); } apply_type(&s, &scheme.ty) } fn generalize(&self, env: &Env, ty: &Type) -> Scheme { let mut fv = FreeVars::default(); ftv_type(ty, &mut fv); let env_fv = ftv_env(env); let mut tvars: Vec<_> = fv.types.difference(&env_fv.types).cloned().collect(); let mut rvars: Vec<_> = fv.rows.difference(&env_fv.rows).cloned().collect(); tvars.sort(); rvars.sort(); Scheme { type_vars: tvars, row_vars: rvars, ty: ty.clone(), } } fn unify(&mut self, t1: &Type, t2: &Type) -> Result<(Subst, InferenceTree)> { let input = format!("{} ~ {}", t1, t2); match (t1, t2) { (Type::Int, Type::Int) | (Type::Bool, Type::Bool) => { let tree = InferenceTree::new("Unify-Base", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Type::Var(a), Type::Var(b)) if a == b => { let tree = InferenceTree::new("Unify-Var-Same", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Type::Var(v), other) | (other, Type::Var(v)) => { let mut fv = FreeVars::default(); ftv_type(other, &mut fv); if fv.types.contains(v) { Err(InferenceError::OccursCheck { var: v.clone(), ty: other.clone(), }) } else { let subst = Subst::singleton_type(v.clone(), other.clone()); let output = format!("{{{}/{}}}", other, v); let tree = InferenceTree::new("Unify-Var", &input, &output, vec![]); Ok((subst, tree)) } } (Type::Arrow(a1, b1), Type::Arrow(a2, b2)) => { let (s1, tree1) = self.unify(a1, a2)?; let (s2, tree2) = self.unify(&apply_type(&s1, b1), &apply_type(&s1, b2))?; let final_subst = s2.compose(&s1); let output = self.pretty_subst(&final_subst); let tree = InferenceTree::new("Unify-Arrow", &input, &output, vec![tree1, tree2]); Ok((final_subst, tree)) } (Type::Record(r1), Type::Record(r2)) => { let (s, child) = self.unify_row(r1, r2)?; let output = self.pretty_subst(&s); let tree = InferenceTree::new("Unify-Record", &input, &output, vec![child]); Ok((s, tree)) } _ => Err(InferenceError::UnificationFailure { expected: t1.clone(), actual: t2.clone(), }), } } // Row unification rule (uni-row): rewrite the RHS so the head label of LHS // comes first, then unify field types and tails. The side condition // `tail(r) ∉ dom(θ1)` rules out non-terminating cases like // \r -> if c then {x=1|r} else {y=2|r} fn unify_row(&mut self, r1: &Row, r2: &Row) -> Result<(Subst, InferenceTree)> { let input = format!("{{{}}} ~ {{{}}}", r1, r2); match (r1, r2) { (Row::Empty, Row::Empty) => { let tree = InferenceTree::new("Unify-RowEmpty", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Row::Var(a), Row::Var(b)) if a == b => { let tree = InferenceTree::new("Unify-RowVar-Same", &input, "{}", vec![]); Ok((Subst::empty(), tree)) } (Row::Var(v), other) | (other, Row::Var(v)) => { let mut fv = FreeVars::default(); ftv_row(other, &mut fv); if fv.rows.contains(v) { Err(InferenceError::RowOccursCheck { var: v.clone(), row: other.clone(), }) } else { let subst = Subst::singleton_row(v.clone(), other.clone()); let output = format!("{{{{{}}}/{}}}", other, v); let tree = InferenceTree::new("Unify-RowVar", &input, &output, vec![]); Ok((subst, tree)) } } (Row::Extend(l, t1, rest1), other) => { let (t2, rest2, s1) = self.rewrite_row(other, l)?; if let Some(tv) = row_tail(rest1) { if s1.rows.contains_key(tv) { return Err(InferenceError::RecursiveRow { var: tv.to_string(), }); } } let (s2, tree2) = self.unify(&apply_type(&s1, t1), &apply_type(&s1, &t2))?; let s12 = s2.compose(&s1); let (s3, tree3) = self.unify_row(&apply_row(&s12, rest1), &apply_row(&s12, &rest2))?; let final_subst = s3.compose(&s12); let output = self.pretty_subst(&final_subst); let tree = InferenceTree::new("Unify-RowExtend", &input, &output, vec![tree2, tree3]); Ok((final_subst, tree)) } (Row::Empty, Row::Extend(l, _, _)) => Err(InferenceError::MissingLabel { label: l.clone(), row: Row::Empty, }), } } // Row rewriting rule: r ≃ (l :: τ | s) : θ. Hoists label `l` to the head // of `r`, returning the field type, the remainder, and the substitution. fn rewrite_row(&mut self, row: &Row, label: &str) -> Result<(Type, Row, Subst)> { match row { Row::Empty => Err(InferenceError::MissingLabel { label: label.to_string(), row: Row::Empty, }), Row::Extend(l, t, rest) if l == label => { Ok((*t.clone(), *rest.clone(), Subst::empty())) } Row::Extend(l, t, rest) => { let (t2, rest2, s) = self.rewrite_row(rest, label)?; Ok((t2, Row::Extend(l.clone(), t.clone(), Box::new(rest2)), s)) } Row::Var(alpha) => { let beta = self.fresh_row(); let gamma = self.fresh_type(); let new_row = Row::Extend( label.to_string(), Box::new(gamma.clone()), Box::new(beta.clone()), ); let s = Subst::singleton_row(alpha.clone(), new_row); Ok((gamma, beta, s)) } } } pub fn infer(&mut self, env: &Env, expr: &Expr) -> Result<(Subst, Type, InferenceTree)> { match expr { Expr::Lit(Lit::Int(_)) => self.infer_lit_int(env, expr), Expr::Lit(Lit::Bool(_)) => self.infer_lit_bool(env, expr), Expr::Var(name) => self.infer_var(env, expr, name), Expr::Abs(param, body) => self.infer_abs(env, expr, param, body), Expr::App(f, a) => self.infer_app(env, expr, f, a), Expr::Let(name, value, body) => self.infer_let(env, expr, name, value, body), Expr::EmptyRecord => self.infer_empty_record(env, expr), Expr::Extend(label, value, rest) => self.infer_extend(env, expr, label, value, rest), Expr::Select(record, label) => self.infer_select(env, expr, record, label), Expr::Restrict(record, label) => self.infer_restrict(env, expr, record, label), } } /// T-Int: ───────────────── /// Γ ⊢ n : Int fn infer_lit_int(&mut self, env: &Env, expr: &Expr) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let tree = InferenceTree::new("T-Int", &input, "Int", vec![]); Ok((Subst::empty(), Type::Int, tree)) } /// T-Bool: ───────────────── /// Γ ⊢ b : Bool fn infer_lit_bool(&mut self, env: &Env, expr: &Expr) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let tree = InferenceTree::new("T-Bool", &input, "Bool", vec![]); Ok((Subst::empty(), Type::Bool, tree)) } /// T-Var: x : σ ∈ Γ τ = inst(σ) /// ───────────────────────── /// Γ ⊢ x : τ fn infer_var( &mut self, env: &Env, expr: &Expr, name: &str, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); match env.get(name) { Some(scheme) => { let instantiated = self.instantiate(scheme); let output = format!("{}", instantiated); let tree = InferenceTree::new("T-Var", &input, &output, vec![]); Ok((Subst::empty(), instantiated, tree)) } None => Err(InferenceError::UnboundVariable { name: name.to_string(), }), } } /// T-Abs: Γ, x : α ⊢ e : τ α fresh /// ───────────────────────────── /// Γ ⊢ λx. e : α → τ fn infer_abs( &mut self, env: &Env, expr: &Expr, param: &str, body: &Expr, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let param_ty = self.fresh_type(); let mut new_env = env.clone(); new_env.insert( param.to_string(), Scheme { type_vars: vec![], row_vars: vec![], ty: param_ty.clone(), }, ); let (s, body_ty, tree1) = self.infer(&new_env, body)?; let result = Type::Arrow(Box::new(apply_type(&s, ¶m_ty)), Box::new(body_ty)); let output = format!("{}", result); let tree = InferenceTree::new("T-Abs", &input, &output, vec![tree1]); Ok((s, result, tree)) } /// T-App: Γ ⊢ e₁ : τ₁ Γ ⊢ e₂ : τ₂ α fresh S = unify(τ₁, τ₂ → α) /// ────────────────────────────────────────────────────────────── /// Γ ⊢ e₁ e₂ : S(α) fn infer_app( &mut self, env: &Env, expr: &Expr, func: &Expr, arg: &Expr, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let result_ty = self.fresh_type(); let (s1, f_ty, tree1) = self.infer(env, func)?; let env1 = apply_env(&s1, env); let (s2, a_ty, tree2) = self.infer(&env1, arg)?; let expected = Type::Arrow(Box::new(a_ty), Box::new(result_ty.clone())); let (s3, tree3) = self.unify(&apply_type(&s2, &f_ty), &expected)?; let s = s3.compose(&s2.compose(&s1)); let final_ty = apply_type(&s, &result_ty); let output = format!("{}", final_ty); let tree = InferenceTree::new("T-App", &input, &output, vec![tree1, tree2, tree3]); Ok((s, final_ty, tree)) } /// T-Let: Γ ⊢ e₁ : τ₁ σ = gen(Γ, τ₁) Γ, x : σ ⊢ e₂ : τ₂ /// ────────────────────────────────────────────────────── /// Γ ⊢ let x = e₁ in e₂ : τ₂ fn infer_let( &mut self, env: &Env, expr: &Expr, name: &str, value: &Expr, body: &Expr, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let (s1, value_ty, tree1) = self.infer(env, value)?; let env1 = apply_env(&s1, env); let scheme = self.generalize(&env1, &value_ty); let mut env2 = env1; env2.insert(name.to_string(), scheme); let (s2, body_ty, tree2) = self.infer(&env2, body)?; let final_subst = s2.compose(&s1); let output = format!("{}", body_ty); let tree = InferenceTree::new("T-Let", &input, &output, vec![tree1, tree2]); Ok((final_subst, body_ty, tree)) } /// T-EmptyRecord: ───────────────── /// Γ ⊢ {} : {} fn infer_empty_record( &mut self, env: &Env, expr: &Expr, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let ty = Type::Record(Box::new(Row::Empty)); let output = format!("{}", ty); let tree = InferenceTree::new("T-EmptyRecord", &input, &output, vec![]); Ok((Subst::empty(), ty, tree)) } /// T-Extend: Γ ⊢ e : τ Γ ⊢ r : {ρ} /// ───────────────────────────── /// Γ ⊢ {l = e | r} : {l : τ | ρ} fn infer_extend( &mut self, env: &Env, expr: &Expr, label: &str, value: &Expr, rest: &Expr, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let row_var = match self.fresh_row() { Row::Var(v) => v, _ => unreachable!(), }; let (s1, v_ty, tree1) = self.infer(env, value)?; let env1 = apply_env(&s1, env); let (s2, r_ty, tree2) = self.infer(&env1, rest)?; let expected = Type::Record(Box::new(Row::Var(row_var.clone()))); let (s3, tree3) = self.unify(&apply_type(&s2, &r_ty), &expected)?; let s = s3.compose(&s2.compose(&s1)); let v_final = apply_type(&s, &v_ty); let row = Row::Extend( label.to_string(), Box::new(v_final), Box::new(apply_row(&s, &Row::Var(row_var))), ); let result = Type::Record(Box::new(row)); let output = format!("{}", result); let tree = InferenceTree::new("T-Extend", &input, &output, vec![tree1, tree2, tree3]); Ok((s, result, tree)) } /// T-Select: Γ ⊢ r : {l : α | ρ} /// ───────────────────── /// Γ ⊢ r.l : α fn infer_select( &mut self, env: &Env, expr: &Expr, record: &Expr, label: &str, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let field_ty = self.fresh_type(); let rest = self.fresh_row(); let (s1, r_ty, tree1) = self.infer(env, record)?; let expected = Type::Record(Box::new(Row::Extend( label.to_string(), Box::new(field_ty.clone()), Box::new(rest), ))); let (s2, tree2) = self.unify(&r_ty, &expected)?; let s = s2.compose(&s1); let final_ty = apply_type(&s, &field_ty); let output = format!("{}", final_ty); let tree = InferenceTree::new("T-Select", &input, &output, vec![tree1, tree2]); Ok((s, final_ty, tree)) } /// T-Restrict: Γ ⊢ r : {l : α | ρ} /// ───────────────────── /// Γ ⊢ {r - l} : {ρ} fn infer_restrict( &mut self, env: &Env, expr: &Expr, record: &Expr, label: &str, ) -> Result<(Subst, Type, InferenceTree)> { let input = format!("{} ⊢ {} ⇒", self.pretty_env(env), expr); let field_ty = self.fresh_type(); let rest_var = match self.fresh_row() { Row::Var(v) => v, _ => unreachable!(), }; let (s1, r_ty, tree1) = self.infer(env, record)?; let expected = Type::Record(Box::new(Row::Extend( label.to_string(), Box::new(field_ty), Box::new(Row::Var(rest_var.clone())), ))); let (s2, tree2) = self.unify(&r_ty, &expected)?; let s = s2.compose(&s1); let result = Type::Record(Box::new(apply_row(&s, &Row::Var(rest_var)))); let output = format!("{}", result); let tree = InferenceTree::new("T-Restrict", &input, &output, vec![tree1, tree2]); Ok((s, result, tree)) } } pub fn run_inference(expr: &Expr) -> Result<InferenceTree> { let mut inf = TypeInference::new(); let env = Env::new(); let (_, _, tree) = inf.infer(&env, expr)?; Ok(tree) } pub fn infer_type(expr: &Expr) -> Result<Type> { let mut inf = TypeInference::new(); let env = Env::new(); let (s, ty, _) = inf.infer(&env, expr)?; let final_ty = apply_type(&s, &ty); let scheme = inf.generalize(&env, &final_ty); Ok(rename_scheme(&scheme)) } fn rename_scheme(scheme: &Scheme) -> Type { let mut subst = Subst::empty(); let mut letters = ('a'..='z').map(|c| c.to_string()); for v in &scheme.type_vars { let fresh = letters.next().unwrap_or_else(|| v.clone()); subst.types.insert(v.clone(), Type::Var(fresh)); } let mut rletters = ('r'..='z').map(|c| c.to_string()); for v in &scheme.row_vars { let fresh = rletters.next().unwrap_or_else(|| v.clone()); subst.rows.insert(v.clone(), Row::Var(fresh)); } apply_type(&subst, &scheme.ty) } fn row_tail(row: &Row) -> Option<&str> { match row { Row::Empty => None, Row::Var(name) => Some(name), Row::Extend(_, _, rest) => row_tail(rest), } } }
If the tail is in the substitution we raise RecursiveRow instead of looping. The classic offender is a conditional that demands two different label extensions of the same row variable, like \r -> if c then {x = 1 | r} else {y = 2 | r}. The two branches force r to extend with x on one side and y on the other, and the side condition rejects this without diverging.
Rewriting
rewrite_row is the row-rewrite rule. Given a row and a label l, it produces an equivalent row whose head is l, the field type that sits under that head, and a substitution recording how the row was specialised.
#![allow(unused)] fn main() { fn rewrite_row(&mut self, row: &Row, label: &str) -> Result<(Type, Row, Subst)> { match row { Row::Empty => Err(InferenceError::MissingLabel { label: label.to_string(), row: Row::Empty, }), Row::Extend(l, t, rest) if l == label => { Ok((*t.clone(), *rest.clone(), Subst::empty())) } Row::Extend(l, t, rest) => { let (t2, rest2, s) = self.rewrite_row(rest, label)?; Ok((t2, Row::Extend(l.clone(), t.clone(), Box::new(rest2)), s)) } Row::Var(alpha) => { let beta = self.fresh_row(); let gamma = self.fresh_type(); let new_row = Row::Extend( label.to_string(), Box::new(gamma.clone()), Box::new(beta.clone()), ); let s = Subst::singleton_row(alpha.clone(), new_row); Ok((gamma, beta, s)) } } } }
Empty cannot expose l, so we report a missing label. Extend(l, t, rest) already has l at the head, so the field type is t, the residual is rest, and no substitution is needed. Extend(l', t, rest) with a different label recurses into rest and reattaches l' to the rewritten residual, which is what preserves scoped-label semantics: a later occurrence of l stays buried under any earlier l'. Var(α) is the open case. We generate a fresh field type γ and a fresh tail β, bind α := {l : γ | β}, and return γ, β, and the binding as the substitution.
The fresh-tail substitution in the open case is what lets rows grow during inference. When a record whose row is {x : Int | r} is unified with another record that needs a y label, rewrite_row extends r to {y : β_y | β} and the rows continue to unify against β. Without this rewrite, unifying two records with disjoint label sets through a common row variable would fail.
Records
Records are the surface form built on top of rows. A record literal {x = 1, y = true} has type {x : Int, y : Bool}. A row-polymorphic function \r -> r.x accepts any record with at least a field x, without committing to the rest of the shape. The term language adds four constructs to the lambda calculus core: the empty record, record extension, selection, and restriction.
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq)] pub enum Lit { Int(i64), Bool(bool), } #[derive(Debug, Clone, PartialEq)] pub enum Type { Var(String), Int, Bool, Arrow(Box<Type>, Box<Type>), Record(Box<Row>), } #[derive(Debug, Clone, PartialEq)] pub enum Row { Empty, Var(String), Extend(String, Box<Type>, Box<Row>), } #[derive(Debug, Clone, PartialEq)] pub struct Scheme { pub type_vars: Vec<String>, pub row_vars: Vec<String>, pub ty: Type, } impl std::fmt::Display for Expr { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Expr::Var(name) => write!(f, "{}", name), Expr::Lit(Lit::Int(n)) => write!(f, "{}", n), Expr::Lit(Lit::Bool(b)) => write!(f, "{}", b), Expr::Abs(param, body) => write!(f, "\\{} -> {}", param, body), Expr::App(func, arg) => match (func.as_ref(), arg.as_ref()) { (Expr::Abs(_, _), _) => write!(f, "({}) {}", func, arg), (_, Expr::App(_, _)) => write!(f, "{} ({})", func, arg), (_, Expr::Abs(_, _)) => write!(f, "{} ({})", func, arg), _ => write!(f, "{} {}", func, arg), }, Expr::Let(var, value, body) => { write!(f, "let {} = {} in {}", var, value, body) } Expr::EmptyRecord => write!(f, "{{}}"), Expr::Extend(_, _, _) => { let (fields, tail) = collect_extend(self); write!(f, "{{")?; for (i, (label, value)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} = {}", label, value)?; } match tail { Expr::EmptyRecord => write!(f, "}}"), other => write!(f, " | {}}}", other), } } Expr::Select(e, label) => write!(f, "{}.{}", e, label), Expr::Restrict(e, label) => write!(f, "{{{} - {}}}", e, label), } } } fn collect_extend(mut e: &Expr) -> (Vec<(String, &Expr)>, &Expr) { let mut fields = Vec::new(); while let Expr::Extend(l, v, rest) = e { fields.push((l.clone(), v.as_ref())); e = rest.as_ref(); } (fields, e) } impl std::fmt::Display for Type { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Type::Var(name) => write!(f, "{}", name), Type::Int => write!(f, "Int"), Type::Bool => write!(f, "Bool"), Type::Arrow(t1, t2) => match t1.as_ref() { Type::Arrow(_, _) => write!(f, "({}) -> {}", t1, t2), _ => write!(f, "{} -> {}", t1, t2), }, Type::Record(row) => write!(f, "{{{}}}", row), } } } impl std::fmt::Display for Row { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { let (fields, tail) = collect_row(self); for (i, (label, ty)) in fields.iter().enumerate() { if i > 0 { write!(f, ", ")?; } write!(f, "{} : {}", label, ty)?; } match tail { Row::Empty => Ok(()), Row::Var(name) => { if fields.is_empty() { write!(f, "{}", name) } else { write!(f, " | {}", name) } } Row::Extend(_, _, _) => unreachable!("collect_row drains all extensions"), } } } fn collect_row(mut r: &Row) -> (Vec<(String, &Type)>, &Row) { let mut fields = Vec::new(); while let Row::Extend(l, t, rest) = r { fields.push((l.clone(), t.as_ref())); r = rest.as_ref(); } (fields, r) } impl std::fmt::Display for Scheme { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { if self.type_vars.is_empty() && self.row_vars.is_empty() { write!(f, "{}", self.ty) } else { write!(f, "forall")?; for v in &self.type_vars { write!(f, " {}", v)?; } for v in &self.row_vars { write!(f, " {}", v)?; } write!(f, ". {}", self.ty) } } } pub fn build_extend(fields: Vec<(String, Box<Expr>)>, tail: Box<Expr>) -> Box<Expr> { let mut acc = tail; for (label, value) in fields.into_iter().rev() { acc = Box::new(Expr::Extend(label, value, acc)); } acc } #[derive(Debug, Clone, PartialEq)] pub enum Expr { Var(String), Lit(Lit), Abs(String, Box<Expr>), App(Box<Expr>, Box<Expr>), Let(String, Box<Expr>, Box<Expr>), EmptyRecord, Extend(String, Box<Expr>, Box<Expr>), Select(Box<Expr>, String), Restrict(Box<Expr>, String), } }
EmptyRecord is the literal {}. Extend(label, value, rest) is the desugared form of {l = e | r}, prepending a single labelled field to an existing record. A record literal {x = 1, y = 2} parses as a chain of Extend nodes ending in EmptyRecord. Select(record, label) is r.l, projecting the leftmost field with the matching label. Restrict(record, label) is {r - l}, removing the leftmost occurrence and exposing whatever lay beneath it.
Inference
The empty record has type {}, the record type wrapping the closed empty row. Extension introduces a fresh row variable for the tail, infers the value and the rest, unifies the rest with {ρ} to recover the underlying row, and assembles {l : τ | ρ} as the result. Inferring {x = 1 | r} for a free r gives the principal scheme forall r. {r} -> {x : Int | r}, which is what justifies free extension never failing on the right.
Selection has the principal type forall r a. {l : a | r} -> a. The inference rule fabricates a fresh field type α and a fresh tail ρ, then unifies the record’s inferred type with {l : α | ρ}. If the record already exposes l at the head the unifier accepts it directly. If the record’s row is a variable the unifier uses rewrite_row to bind that variable to {l : α | ρ}, which is how \r -> r.x becomes row-polymorphic in the rest.
Restriction has the principal type forall r a. {l : a | r} -> {r}. The mechanics mirror selection except the result type is the residual record {ρ} rather than the field type. Combined with extension this gives an update operation: {l = new | {r - l}} overwrites the leftmost l in r without disturbing fields beneath it.
Scoped Duplicate Labels
We follow the scoped-label semantics rather than the more common set-of-labels semantics. Duplicate labels are allowed and behave like a stack ordered left to right. Selection always returns the leftmost occurrence, and restriction always peels the leftmost occurrence off, exposing the next one. From the integration snapshot for 02_scoped_labels.fun:
{x = 1, x = true} : {x : Int, x : Bool}
{x = 1, x = true}.x : Int
{{x = 1, x = true} - x}.x : Bool
The first line shows that the duplicate labels are preserved in the type, including their distinct field types. The second line shows that .x picks the leftmost, which here is the Int. The third line shows that one round of restriction exposes the inner occurrence, which is the Bool. The same mechanism lets free extension shadow a polymorphic field type: \r -> {x = 1 | r} has type {r} -> {x : Int | r}, so applying it to a record that already has x : Bool produces {x : Int, x : Bool}, not a type error.
The implementation does not need a separate rule for duplicates. rewrite_row walks past labels with a non-matching name and stops at the first occurrence of the target label. A later x lying behind an earlier x' is invisible to the rewriter, which is exactly the behaviour scoped semantics demand. The Examples chapter walks through the full record test suite, including extension, restriction, and update.