# Code Generation
This section is optional, but we’d like to explore how to go from high-level lambda calculus (i.e. System F) all the way down to executable machine code. Doing so requires several transformations that preserve the semantics of our programs while adapting them to the realities of modern CPU architectures.
We’re going to build a minimalist generation pipeline that transforms our typed functional programs into imperative machine code through three major phases:
- Type erasure: first, we remove the type information that guided our type checking
- Closure conversion: second, we make the implicit environment capture of nested functions explicit
- Code generation: finally, our code generation framework (in our case Cranelift) generates optimized machine code for the target architecture from our closure-converted code
## Choosing a Code Generation Backend
Before diving into our implementation, we should understand why we chose Cranelift as our code generation backend. When implementing a compiler, you face a fundamental choice about how to generate executable code, and each approach comes with distinct tradeoffs.
Bespoke Backend: The most educational approach involves directly emitting assembly instructions for your target architecture. This gives you complete control and deep understanding of the machine, but requires implementing register allocation, instruction selection, and optimization passes from scratch. For production compilers targeting multiple architectures, this quickly becomes intractable. You would need to understand the intricacies of x86-64, ARM64, RISC-V, and other instruction sets, along with their calling conventions and performance characteristics. This can be a non-trivial project.
Virtual Machine Approach: Many languages choose to compile to bytecode for a custom virtual machine. Over the decades, numerous specialized VMs have been developed for functional languages:
| VM | Languages | Key Features | Evaluation |
|---|---|---|---|
| SECD | ML, Scheme | Stack-based, formal basis | Call-by-value |
| CEK | Scheme, Racket | Control/environment/continuation | Call-by-value |
| Krivine | OCaml | De Bruijn indices, minimal | Call-by-name |
| STG | Haskell | Lazy graph reduction, closures, parallelism | Call-by-need |
| ZINC | OCaml | Efficient currying, stack/env/closure | Call-by-value |
| JVM | Scala, Clojure, Kotlin, Eta | Closures, immutable structures, concurrency | Configurable |
| BEAM | Erlang, Elixir | Concurrency, hot swapping, lightweight processes | Strict |
| FLVM | Curry | Functional logic, non-determinism, pools | Mixed |
| Uxn | Funktal | Minimalism, 8-bit opcodes, functional core | Custom |
This approach provides excellent portability and simplifies the compiler, but sacrifices performance due to interpretation overhead. Even with just-in-time compilation, the VM approach typically cannot match native code performance for compute-intensive tasks.
LibJIT: Libraries like LibJIT provide a middle ground, offering a simple API for generating machine code without building a full compiler backend. These work well for domain-specific languages and embedded scripting, but typically lack sophisticated optimizations and broad architecture support. They excel at simplicity but plateau quickly in terms of performance.
LLVM: The dominant choice for production compilers, LLVM provides industrial-strength optimization passes and supports virtually every architecture. Rust, Swift, and Clang all use LLVM. However, LLVM’s comprehensiveness comes with significant costs: massive binary sizes (hundreds of megabytes), slow compilation times, and a complex C++ API that requires deep expertise. LLVM’s optimization passes, while extremely powerful, can take longer than the rest of your compiler combined. There are good Rust bindings in the inkwell crate.
MLIR: Multi-Level Intermediate Representation is like a higher-level LLVM, which extends LLVM with better support for domain-specific optimizations and heterogeneous computing (think GPUs). While powerful for machine learning compilers and specialized domains, MLIR adds even more complexity than LLVM and is still fairly early and undocumented (although I’ve tried somewhat to remedy that). There are excellent Rust bindings in the melior crate.
Cranelift: Originally designed for WebAssembly, Cranelift occupies a sweet spot for language experimentation. It generates good quality code quickly, has a clean Rust API, produces small binaries, and supports the major architectures (x86-64, ARM64). While it cannot match LLVM’s peak optimization quality, Cranelift compiles code an order of magnitude faster. For a teaching compiler where iteration speed and code clarity matter more than squeezing out the last 10% of performance, Cranelift is awesome.
We’re going to use Cranelift, because it’s simple, uses the Rust build system with no external dependencies, and its clean API and fast compile times make it ideal for experimentation and learning. But our approach would work with any of these backends; only the final code generation phase would change if you chose a different one.
## Type Erasure
System Fω’s type system serves its purpose during type checking, ensuring program correctness and enabling powerful abstractions. But as we discussed before, types exist purely for compile-time verification and carry no computational content at runtime. The type erasure phase strips away all type information, leaving only the essential computational structure.
```rust
use crate::core::CoreBinOp;

/// Untyped lambda calculus after type erasure
#[derive(Debug, Clone, PartialEq)]
pub enum Erased {
    /// Variable
    Var(String),
    /// Lambda abstraction (no type annotation)
    Lam(String, Box<Erased>),
    /// Application
    App(Box<Erased>, Box<Erased>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Erased>, Box<Erased>),
    /// Conditional
    If(Box<Erased>, Box<Erased>, Box<Erased>),
    /// Call a runtime intrinsic
    IntrinsicCall { name: String, args: Vec<Erased> },
}

impl Erased {
    /// Pretty print erased terms
    pub fn pretty(&self) -> String {
        match self {
            Erased::Var(name) => name.to_string(),
            Erased::Lam(name, body) => format!("λ{}. {}", name, body.pretty()),
            Erased::App(f, x) => {
                let f_str = match &**f {
                    Erased::Lam(_, _) => format!("({})", f.pretty()),
                    _ => f.pretty(),
                };
                let x_str = match &**x {
                    Erased::App(_, _) | Erased::Lam(_, _) => format!("({})", x.pretty()),
                    _ => x.pretty(),
                };
                format!("{} {}", f_str, x_str)
            }
            Erased::Int(n) => n.to_string(),
            Erased::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Erased::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Erased::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }
}
```
The erased representation captures the essence of computation without types. Variables become simple names, lambda abstractions lose their type annotations, and applications remain as the fundamental operation of function invocation. Integer literals and binary operations pass through unchanged, as they represent actual runtime computations. Critically, type abstractions and type applications vanish entirely during erasure, as they exist solely to guide type checking.
```rust
use crate::core::CoreTerm;

/// Erase types from Core representation
pub fn erase(term: &CoreTerm) -> Erased {
    match term {
        CoreTerm::Var(name) => Erased::Var(name.clone()),
        CoreTerm::LitInt(n) => Erased::Int(*n),
        CoreTerm::Lambda {
            param,
            param_ty: _,
            body,
        } => {
            // Erase type annotation
            Erased::Lam(param.clone(), Box::new(erase(body)))
        }
        CoreTerm::App { func, arg } => Erased::App(Box::new(erase(func)), Box::new(erase(arg))),
        CoreTerm::TypeLambda { param: _, body } => {
            // Type abstractions are erased - just keep the body
            erase(body)
        }
        CoreTerm::Constructor { .. } => {
            // For now, we don't handle constructors
            panic!("Constructor not yet supported in code generation")
        }
        CoreTerm::Case { .. } => {
            panic!("Pattern matching not yet supported in code generation")
        }
        CoreTerm::BinOp { op, left, right } => {
            Erased::BinOp(op.clone(), Box::new(erase(left)), Box::new(erase(right)))
        }
        CoreTerm::If {
            cond,
            then_branch,
            else_branch,
        } => Erased::If(
            Box::new(erase(cond)),
            Box::new(erase(then_branch)),
            Box::new(erase(else_branch)),
        ),
        CoreTerm::IntrinsicCall { name, args } => Erased::IntrinsicCall {
            name: name.clone(),
            args: args.iter().map(erase).collect(),
        },
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::core::CoreType;

    #[test]
    fn test_erase_simple() {
        // λx:Int. x
        let term = CoreTerm::Lambda {
            param: "x".to_string(),
            param_ty: CoreType::Con("Int".to_string()),
            body: Box::new(CoreTerm::Var("x".to_string())),
        };
        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }

    #[test]
    fn test_erase_type_abstraction() {
        // Λα. λx:α. x
        let term = CoreTerm::TypeLambda {
            param: "α".to_string(),
            body: Box::new(CoreTerm::Lambda {
                param: "x".to_string(),
                param_ty: CoreType::Var("α".to_string()),
                body: Box::new(CoreTerm::Var("x".to_string())),
            }),
        };
        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }
}
```
The erasure algorithm traverses the typed abstract syntax tree and systematically removes all type information. Lambda abstractions lose their parameter type annotations, becoming untyped functions that operate solely on runtime values. Type abstractions disappear entirely, with only their bodies remaining, since type parameters have no runtime representation. Type applications similarly vanish, as they merely instantiate type variables that no longer exist after erasure.
This phase demonstrates a fundamental principle of type systems: types are a compile-time discipline that ensures program correctness without imposing runtime overhead. The erased program retains exactly the computational content of the original while shedding the scaffolding that ensured its correctness.
## Closure Conversion
Functional languages allow functions to be defined within other functions, creating nested scopes where inner functions can reference variables from enclosing scopes. This natural programming style poses a challenge for compilation to machine code, where functions are typically compiled to fixed addresses with no implicit access to their defining environment.
Consider this simple example of a function that creates an adder:
```
makeAdder :: Int -> (Int -> Int);
makeAdder x = λy. x + y;

add5 :: Int -> Int;
add5 = makeAdder 5;
```
The inner lambda `λy. x + y` references the variable `x` from its enclosing scope. When `makeAdder` returns this function, the value of `x` must somehow be preserved so that when `add5` is later called with an argument, it still has access to the value `5` that was passed to `makeAdder`.
In traditional assembly language or C, functions are just code addresses with no associated data. A function like `add` would be compiled to a fixed location in memory:

```
add_function:
    ; expects two arguments in registers
    ; adds them and returns result
```

But our `add5` function needs to remember that `x = 5`. This is where closure conversion comes in.
Closure conversion solves this problem by transforming every function into a pair of a code pointer and an environment containing captured values. The transformation makes environment capture explicit by:
- Identifying free variables in each function (variables used but not defined locally)
- Creating an environment structure to hold these captured values
- Rewriting functions to take an extra parameter (the closure) and extract captured values from it
- Transforming function calls to pass both the closure and the regular argument
After closure conversion, our example becomes something like:
```
// Original: λy. x + y where x is free
// Converted: λ(closure, y). project(closure, 0) + y

// Creating add5:
//   1. Allocate closure: [code_ptr=add_function, env=[5]]
//   2. Return this closure

// Calling add5(3):
//   1. Extract code pointer from closure
//   2. Call code_ptr(closure, 3)
//   3. Inside function: project(closure, 0) gives us 5
//   4. Return 5 + 3 = 8
```
This transformation turns the implicit variable capture of nested functions into explicit data structures that can be allocated and manipulated at runtime.
```rust
use crate::core::CoreBinOp;

/// The parameter name used for the closure environment in converted functions
pub const CLOSURE_ENV_PARAM: &str = "$closure";

/// The parameter name used for thunks (non-lambda top-level definitions)
pub const THUNK_PARAM: &str = "_unit";

/// A unique identifier for each function
pub type FunctionId = usize;

/// Closure-converted representation
#[derive(Debug, Clone, PartialEq)]
pub enum Closed {
    /// Variable reference
    Var(String),
    /// Environment projection: project(closure, index)
    Proj(Box<Closed>, usize),
    /// Create closure: closure(function_id, [captured_vars])
    MakeClosure(FunctionId, Vec<Closed>),
    /// Direct function call: call(closure, arg)
    Call(Box<Closed>, Box<Closed>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Closed>, Box<Closed>),
    /// Conditional
    If(Box<Closed>, Box<Closed>, Box<Closed>),
    /// Intrinsic function call
    IntrinsicCall { name: String, args: Vec<Closed> },
}

/// A top-level function definition after closure conversion
#[derive(Debug, Clone)]
pub struct Function {
    pub id: FunctionId,
    pub param: String,
    pub body: Closed,
    pub free_vars: Vec<String>,
}

/// Result of closure conversion
#[derive(Debug)]
pub struct Program {
    pub functions: Vec<Function>,
    pub main: Closed,
}

impl Closed {
    /// Pretty print closed terms
    pub fn pretty(&self) -> String {
        match self {
            Closed::Var(name) => name.to_string(),
            Closed::Proj(closure, idx) => format!("{}[{}]", closure.pretty(), idx),
            Closed::MakeClosure(id, captured) => {
                let caps: Vec<String> = captured.iter().map(|c| c.pretty()).collect();
                format!("closure({}, [{}])", id, caps.join(", "))
            }
            Closed::Call(f, x) => format!("{}({})", f.pretty(), x.pretty()),
            Closed::Int(n) => n.to_string(),
            Closed::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Closed::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Closed::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }
}
```
The closed representation introduces several new constructs that make environment manipulation explicit. The `MakeClosure` construct creates a closure by pairing a function identifier with its captured environment. The `Proj` construct extracts values from a closure’s environment using positional indices. The `Call` construct invokes a closure by passing both the closure itself and the argument.
```rust
use std::collections::HashMap;

use super::erase::Erased;

/// State for closure conversion
struct ClosureConverter {
    next_id: FunctionId,
    functions: Vec<Function>,
}

impl ClosureConverter {
    fn new() -> Self {
        Self {
            next_id: 0,
            functions: Vec::new(),
        }
    }

    fn fresh_id(&mut self) -> FunctionId {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    /// Convert with knowledge of module-level functions
    fn convert_with_modules(
        &mut self,
        term: &Erased,
        env: &[(String, usize)],
        module_funcs: &HashMap<String, FunctionId>,
    ) -> Closed {
        match term {
            Erased::Var(name) => {
                // Check if it's a module function
                if let Some(&func_id) = module_funcs.get(name) {
                    // Check if this is a thunk (non-lambda definition)
                    // by looking at the function we've already created
                    let is_thunk = self
                        .functions
                        .iter()
                        .any(|f| f.id == func_id && f.param == THUNK_PARAM);
                    if is_thunk {
                        // Call the thunk with a dummy argument (0)
                        Closed::Call(
                            Box::new(Closed::MakeClosure(func_id, vec![])),
                            Box::new(Closed::Int(0)),
                        )
                    } else {
                        // Regular function - create a closure with no captured variables
                        Closed::MakeClosure(func_id, vec![])
                    }
                } else if let Some((_, idx)) = env.iter().find(|(n, _)| n == name) {
                    // This is a captured variable - project from closure
                    Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                } else {
                    // Free variable or parameter
                    Closed::Var(name.clone())
                }
            }
            Erased::Lam(param, body) => {
                // Collect free variables (excluding the parameter and module functions)
                let mut free_vars = body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping for the body
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body with the new environment
                let body_closed = self.convert_with_modules(body, &body_env, module_funcs);

                // Create a new function
                let id = self.fresh_id();
                self.functions.push(Function {
                    id,
                    param: param.clone(),
                    body: body_closed,
                    free_vars: free_vars.clone(),
                });

                // Create closure capturing the free variables
                let captured = free_vars
                    .into_iter()
                    .map(|fv| {
                        if let Some((_, idx)) = env.iter().find(|(n, _)| n == &fv) {
                            Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                        } else {
                            Closed::Var(fv)
                        }
                    })
                    .collect();

                Closed::MakeClosure(id, captured)
            }
            Erased::App(f, x) => {
                let f_closed = self.convert_with_modules(f, env, module_funcs);
                let x_closed = self.convert_with_modules(x, env, module_funcs);
                Closed::Call(Box::new(f_closed), Box::new(x_closed))
            }
            Erased::Int(n) => Closed::Int(*n),
            Erased::BinOp(op, l, r) => {
                let l_closed = self.convert_with_modules(l, env, module_funcs);
                let r_closed = self.convert_with_modules(r, env, module_funcs);
                Closed::BinOp(op.clone(), Box::new(l_closed), Box::new(r_closed))
            }
            Erased::If(c, t, e) => {
                let c_closed = self.convert_with_modules(c, env, module_funcs);
                let t_closed = self.convert_with_modules(t, env, module_funcs);
                let e_closed = self.convert_with_modules(e, env, module_funcs);
                Closed::If(Box::new(c_closed), Box::new(t_closed), Box::new(e_closed))
            }
            Erased::IntrinsicCall { name, args } => {
                let args_closed = args
                    .iter()
                    .map(|arg| self.convert_with_modules(arg, env, module_funcs))
                    .collect();
                Closed::IntrinsicCall {
                    name: name.clone(),
                    args: args_closed,
                }
            }
        }
    }
}

/// Perform closure conversion with module context
pub fn closure_convert_module(functions: Vec<(&str, &Erased)>, main: &Erased) -> Program {
    let mut converter = ClosureConverter::new();

    // First, assign IDs to all module functions
    let mut module_funcs = HashMap::new();
    for (name, _) in &functions {
        let id = converter.fresh_id();
        module_funcs.insert(name.to_string(), id);
    }

    // Convert each function body
    for (name, body) in functions {
        match body {
            Erased::Lam(param, func_body) => {
                // Get free variables
                let mut free_vars = func_body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body
                let body_closed =
                    converter.convert_with_modules(func_body, &body_env, &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: param.clone(),
                    body: body_closed,
                    free_vars,
                });
            }
            _ => {
                // For non-lambda definitions (like add5 = add 5),
                // convert them to thunks (lambdas that take a dummy parameter)
                let body_closed = converter.convert_with_modules(body, &[], &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: THUNK_PARAM.to_string(),
                    body: body_closed,
                    free_vars: vec![],
                });
            }
        }
    }

    // Convert main
    let main_closed = converter.convert_with_modules(main, &[], &module_funcs);

    Program {
        functions: converter.functions,
        main: main_closed,
    }
}
```
After closure conversion, each function exists as a separate top-level definition with an explicit parameter for its closure environment. The function body can access captured variables through projections from this environment parameter. This flat structure maps directly to the function model of assembly language, where each function has a fixed address and explicit parameters.
As an aside, there are alternative approaches to handling nested functions. Defunctionalization converts higher-order programs into first-order ones by replacing function values with data tags and a single dispatch function, eliminating closures entirely but requiring a whole-program transformation. Lambda lifting (also called closure elimination) turns nested functions into top-level functions by adding their free variables as extra parameters, avoiding heap allocation of closures but potentially increasing the number of parameters passed at each call site. Alternatively, the continuation-passing style (CPS) transformation rewrites all functions so that they never return directly, instead passing their results to explicit continuation functions that represent “what to do next”. This makes control flow explicit and can simplify closure representation, but produces code that is harder to optimize and debug. We chose traditional closure conversion because it produces straightforward code that maps well to machine calling conventions, supports separate compilation, and integrates naturally with existing runtime systems, while avoiding the parameter bloat of lambda lifting.
Free Variable Analysis
Closure conversion requires identifying which variables each function captures from its enclosing scope. This free variable analysis traverses each function body to determine which variables are referenced but not bound within the function itself.
#![allow(unused)]
fn main() {
/// Collect all free variables
pub fn free_vars(&self) -> Vec<String> {
    self.free_vars_impl(&mut Vec::new())
}
}
The free variable analysis maintains a set of bound variables as it traverses expressions. Variables that appear in the expression but not in the bound set are free and must be captured in the function’s closure. This analysis handles the scoping rules correctly, adding variables to the bound set when entering their scope and removing them when leaving.
Environment Representation
The closure conversion algorithm maintains an environment that maps variables to their locations in closure environments. When a function captures variables, they are assigned positions in its closure, and references to these variables become projections at the appropriate indices.
#![allow(unused)] fn main() { /// Convert with knowledge of module-level functions fn convert_with_modules( &mut self, term: &Erased, env: &[(String, usize)], module_funcs: &HashMap<String, FunctionId>, ) -> Closed { match term { Erased::Var(name) => { // Check if it's a module function if let Some(&func_id) = module_funcs.get(name) { // Check if this is a thunk (non-lambda definition) // by looking at the function we've already created let is_thunk = self .functions .iter() .any(|f| f.id == func_id && f.param == THUNK_PARAM); if is_thunk { // Call the thunk with a dummy argument (0) Closed::Call( Box::new(Closed::MakeClosure(func_id, vec![])), Box::new(Closed::Int(0)), ) } else { // Regular function - create a closure with no captured variables Closed::MakeClosure(func_id, vec![]) } } else if let Some((_, idx)) = env.iter().find(|(n, _)| n == name) { // This is a captured variable - project from closure Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx) } else { // Free variable or parameter Closed::Var(name.clone()) } } Erased::Lam(param, body) => { // Collect free variables (excluding the parameter and module functions) let mut free_vars = body.free_vars(); free_vars.retain(|v| v != param && !module_funcs.contains_key(v)); // Create environment mapping for the body let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)]; for (i, fv) in free_vars.iter().enumerate() { body_env.push((fv.clone(), i)); } // Convert the body with the new environment let body_closed = self.convert_with_modules(body, &body_env, module_funcs); // Create a new function let id = self.fresh_id(); self.functions.push(Function { id, param: param.clone(), body: body_closed, free_vars: free_vars.clone(), }); // Create closure capturing the free variables let captured = free_vars .into_iter() .map(|fv| { if let Some((_, idx)) = env.iter().find(|(n, _)| n == &fv) { Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx) } else { Closed::Var(fv) } }) .collect(); 
Closed::MakeClosure(id, captured) } Erased::App(f, x) => { let f_closed = self.convert_with_modules(f, env, module_funcs); let x_closed = self.convert_with_modules(x, env, module_funcs); Closed::Call(Box::new(f_closed), Box::new(x_closed)) } Erased::Int(n) => Closed::Int(*n), Erased::BinOp(op, l, r) => { let l_closed = self.convert_with_modules(l, env, module_funcs); let r_closed = self.convert_with_modules(r, env, module_funcs); Closed::BinOp(op.clone(), Box::new(l_closed), Box::new(r_closed)) } Erased::If(c, t, e) => { let c_closed = self.convert_with_modules(c, env, module_funcs); let t_closed = self.convert_with_modules(t, env, module_funcs); let e_closed = self.convert_with_modules(e, env, module_funcs); Closed::If(Box::new(c_closed), Box::new(t_closed), Box::new(e_closed)) } Erased::IntrinsicCall { name, args } => { let args_closed = args .iter() .map(|arg| self.convert_with_modules(arg, env, module_funcs)) .collect(); Closed::IntrinsicCall { name: name.clone(), args: args_closed, } } } } }
The conversion process transforms each expression based on its structure and the current environment. Lambda abstractions become closure allocations that capture the current values of their free variables. Variable references check whether the variable is in the local environment or needs to be accessed through closure projection. Applications become calls that pass both the closure and the argument, following the calling convention for closure-converted code.
Cranelift Code Generation
With closure conversion complete, the program consists of a collection of first-order functions and explicit closure operations. The Cranelift compiler framework transforms this representation into optimized machine code for the target architecture.
use std::collections::HashMap; use cranelift::prelude::codegen::ir::BlockArg; use cranelift::prelude::{ codegen, types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC, Signature, Value, }; use cranelift_module::{FuncId, Linkage, Module}; use super::closure::{Closed, Function, FunctionId, Program, CLOSURE_ENV_PARAM}; use super::runtime::RuntimeFunctions; use crate::core::CoreBinOp; impl<M: Module> CodeGen<M> { /// Finish compilation and return the module pub fn finish(self) -> M { self.module } /// Get mutable reference to the module pub fn module_mut(&mut self) -> &mut M { &mut self.module } pub fn new(module: M, runtime_funcs: RuntimeFunctions) -> Self { let ctx = module.make_context(); Self { module, builder_context: FunctionBuilderContext::new(), ctx, function_map: HashMap::new(), runtime_funcs, } } /// Compile a program pub fn compile_program(&mut self, program: &Program) -> Result<FuncId, String> { // First pass: declare all functions for func in &program.functions { let func_id = self.declare_function(func)?; self.function_map.insert(func.id, func_id); } // Second pass: compile function bodies for func in &program.functions { self.compile_function(func)?; } // Compile main expression as a function self.compile_main(&program.main) } /// Declare a function fn declare_function(&mut self, func: &Function) -> Result<FuncId, String> { let name = format!("func_{}", func.id); let func_id = self .module .declare_function(&name, Linkage::Local, &self.function_signature()) .map_err(|e| format!("Failed to declare function: {}", e))?; Ok(func_id) } /// Get the standard function signature (closure, arg) -> value fn function_signature(&self) -> Signature { let mut sig = self.module.make_signature(); let pointer_type = self.module.target_config().pointer_type(); sig.params.push(AbiParam::new(pointer_type)); // closure sig.params.push(AbiParam::new(pointer_type)); // argument sig.returns.push(AbiParam::new(pointer_type)); // result sig } /// Compile a 
function fn compile_function(&mut self, func: &Function) -> Result<(), String> { self.ctx.func.signature = self.function_signature(); // Ensure proper alignment for ARM64 self.ctx.set_disasm(true); { let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context); let entry_block = builder.create_block(); builder.append_block_params_for_function_params(entry_block); builder.switch_to_block(entry_block); builder.seal_block(entry_block); // Set up variable environment let mut env = Environment::new(); // Get parameters let closure_val = builder.block_params(entry_block)[0]; let arg_val = builder.block_params(entry_block)[1]; env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val); env.bind(func.param.clone(), arg_val); // Compile body - use helper to avoid borrow issues let result = compile_expr( &mut self.module, &self.runtime_funcs, &self.function_map, &mut builder, &func.body, &mut env, )?; builder.ins().return_(&[result]); builder.finalize(); } // Define the function let func_id = self.function_map[&func.id]; self.module .define_function(func_id, &mut self.ctx) .map_err(|e| format!("Failed to define function: {}", e))?; self.module.clear_context(&mut self.ctx); Ok(()) } /// Compile the main expression fn compile_main(&mut self, main: &Closed) -> Result<FuncId, String> { let func_id = self .module .declare_function("main_compiled", Linkage::Export, &self.main_signature()) .map_err(|e| format!("Failed to declare main: {}", e))?; self.ctx.func.signature = self.main_signature(); // Ensure proper alignment for ARM64 self.ctx.set_disasm(true); { let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context); let entry_block = builder.create_block(); builder.switch_to_block(entry_block); builder.seal_block(entry_block); let mut env = Environment::new(); let result = compile_expr( &mut self.module, &self.runtime_funcs, &self.function_map, &mut builder, main, &mut env, )?; builder.ins().return_(&[result]); builder.finalize(); } 
self.module .define_function(func_id, &mut self.ctx) .map_err(|e| format!("Failed to define main: {}", e))?; Ok(func_id) } /// Get the main function signature () -> value fn main_signature(&self) -> Signature { let mut sig = self.module.make_signature(); let pointer_type = self.module.target_config().pointer_type(); sig.returns.push(AbiParam::new(pointer_type)); sig } } /// Compile a closed expression (extracted to avoid borrow checker issues) fn compile_expr<M: Module>( module: &mut M, runtime_funcs: &RuntimeFunctions, function_map: &HashMap<FunctionId, FuncId>, builder: &mut FunctionBuilder, expr: &Closed, env: &mut Environment, ) -> Result<Value, String> { match expr { Closed::Var(name) => env .lookup(name) .ok_or_else(|| format!("Unbound variable: {}", name)), Closed::Proj(closure, idx) => { let closure_val = compile_expr(module, runtime_funcs, function_map, builder, closure, env)?; let idx_val = builder.ins().iconst(types::I64, *idx as i64); let project_func = module.declare_func_in_func(runtime_funcs.project_env, builder.func); let inst = builder.ins().call(project_func, &[closure_val, idx_val]); Ok(builder.inst_results(inst)[0]) } Closed::MakeClosure(func_id, captured) => { // Get function pointer let cranelift_func_id = function_map[func_id]; let func_ref = module.declare_func_in_func(cranelift_func_id, builder.func); let func_ptr = builder .ins() .func_addr(module.target_config().pointer_type(), func_ref); // Compile captured values (up to 4 for now) let mut captured_vals = Vec::new(); for cap in captured.iter().take(4) { captured_vals.push(compile_expr( module, runtime_funcs, function_map, builder, cap, env, )?); } // Pad with zeros if needed let zero = builder.ins().iconst(types::I64, 0); while captured_vals.len() < 4 { captured_vals.push(zero); } // Number of captured values let env_size = builder.ins().iconst(types::I64, captured.len() as i64); // Call make_closure with fixed signature let make_closure_func = 
module.declare_func_in_func(runtime_funcs.make_closure, builder.func); let mut args = vec![func_ptr, env_size]; args.extend(captured_vals); let inst = builder.ins().call(make_closure_func, &args); Ok(builder.inst_results(inst)[0]) } Closed::Call(f, x) => { let f_val = compile_expr(module, runtime_funcs, function_map, builder, f, env)?; let x_val = compile_expr(module, runtime_funcs, function_map, builder, x, env)?; let apply_func = module.declare_func_in_func(runtime_funcs.apply, builder.func); let inst = builder.ins().call(apply_func, &[f_val, x_val]); Ok(builder.inst_results(inst)[0]) } Closed::Int(n) => { let n_val = builder.ins().iconst(types::I64, *n); let make_int_func = module.declare_func_in_func(runtime_funcs.make_int, builder.func); let inst = builder.ins().call(make_int_func, &[n_val]); Ok(builder.inst_results(inst)[0]) } Closed::BinOp(op, l, r) => { // For integer operations, we need to extract the integer values let l_compiled = compile_expr(module, runtime_funcs, function_map, builder, l, env)?; let r_compiled = compile_expr(module, runtime_funcs, function_map, builder, r, env)?; // Extract integer values (arithmetic shift right by 3 to remove tag and // preserve sign) let l_int = builder.ins().sshr_imm(l_compiled, 3); let r_int = builder.ins().sshr_imm(r_compiled, 3); // Perform the operation let result_int = match op { CoreBinOp::Add => builder.ins().iadd(l_int, r_int), CoreBinOp::Sub => builder.ins().isub(l_int, r_int), CoreBinOp::Mul => builder.ins().imul(l_int, r_int), CoreBinOp::Div => builder.ins().sdiv(l_int, r_int), CoreBinOp::Lt => { let cmp = builder.ins().icmp(IntCC::SignedLessThan, l_int, r_int); // Convert boolean to 0/1 let zero = builder.ins().iconst(types::I64, 0); let one = builder.ins().iconst(types::I64, 1); builder.ins().select(cmp, one, zero) } CoreBinOp::Le => { let cmp = builder .ins() .icmp(IntCC::SignedLessThanOrEqual, l_int, r_int); // Convert boolean to 0/1 let zero = builder.ins().iconst(types::I64, 0); let one = 
builder.ins().iconst(types::I64, 1); builder.ins().select(cmp, one, zero) } }; // Re-tag the result let shifted = builder.ins().ishl_imm(result_int, 3); let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag Ok(tagged) } Closed::IntrinsicCall { name, args } => match name.as_str() { "printInt" => { if args.len() != 1 { return Err(format!( "printInt expects exactly 1 argument, got {}", args.len() )); } let arg_val = compile_expr(module, runtime_funcs, function_map, builder, &args[0], env)?; let print_int_func = module.declare_func_in_func(runtime_funcs.print_int, builder.func); let inst = builder.ins().call(print_int_func, &[arg_val]); Ok(builder.inst_results(inst)[0]) } _ => Err(format!("Unknown intrinsic: {}", name)), }, Closed::If(c, t, e) => { // Compile condition let cond_val = compile_expr(module, runtime_funcs, function_map, builder, c, env)?; // Extract boolean value (shift right by 3, then check if non-zero) let cond_int = builder.ins().ushr_imm(cond_val, 3); let zero = builder.ins().iconst(types::I64, 0); let cond_bool = builder.ins().icmp(IntCC::NotEqual, cond_int, zero); // Create blocks let then_block = builder.create_block(); let else_block = builder.create_block(); let merge_block = builder.create_block(); // Add block parameter to merge block for the result builder.append_block_param(merge_block, types::I64); // Branch on condition builder .ins() .brif(cond_bool, then_block, &[], else_block, &[]); // Compile then branch builder.switch_to_block(then_block); builder.seal_block(then_block); let then_val = compile_expr(module, runtime_funcs, function_map, builder, t, env)?; builder.ins().jump(merge_block, &[BlockArg::from(then_val)]); // Compile else branch builder.switch_to_block(else_block); builder.seal_block(else_block); let else_val = compile_expr(module, runtime_funcs, function_map, builder, e, env)?; builder.ins().jump(merge_block, &[BlockArg::from(else_val)]); // Continue at merge block builder.switch_to_block(merge_block); 
builder.seal_block(merge_block); // Return the phi value Ok(builder.block_params(merge_block)[0]) } } } /// Variable environment for compilation struct Environment { bindings: Vec<(String, Value)>, } impl Environment { fn new() -> Self { Self { bindings: Vec::new(), } } fn bind(&mut self, name: String, value: Value) { self.bindings.push((name, value)); } fn lookup(&self, name: &str) -> Option<Value> { self.bindings .iter() .rev() .find(|(n, _)| n == name) .map(|(_, v)| *v) } } /// Code generator state pub struct CodeGen<M: Module> { module: M, builder_context: FunctionBuilderContext, ctx: codegen::Context, /// Maps function IDs to Cranelift function IDs function_map: HashMap<FunctionId, FuncId>, /// Runtime function declarations runtime_funcs: RuntimeFunctions, }
The code generator maintains the state necessary for compilation, including the Cranelift module that accumulates compiled functions, the function builder context for constructing individual functions, and mappings between our function identifiers and Cranelift’s function references. The runtime functions structure provides access to the runtime system primitives that support memory allocation and other essential operations.
Value Representation
Our implementation uses a uniform representation for all runtime values, enabling polymorphic functions to operate on values of any type. This representation uses tagged pointers where the low bits indicate the value’s type and the high bits contain the actual data or pointer.
The tagging scheme exploits the fact that heap-allocated objects are word-aligned, leaving the low bits available for type tags. Here’s how different values are represented in our 64-bit system:
Integer Representation:
Original integer: 42
Binary representation of 42: 0000000000101010
Shift left by 3: 0000000101010000
Add tag 1: 0000000101010001
64-bit tagged value:
┌─────────────────────────────────────────────────────────┬───┐
│ Value (42) │001│
└─────────────────────────────────────────────────────────┴───┘
63 3 2 0
tag
Closure Representation:
Closure in heap at address 0x7fff8000:
Heap memory layout:
┌────────────────┬────────────────┬────────────────┬────────────────┐
│ Code pointer │ Environment │ Captured │ Captured │
│ 0x400500 │ size: 2 │ value 1 │ value 2 │
└────────────────┴────────────────┴────────────────┴────────────────┘
0x7fff8000 0x7fff8008 0x7fff8010 0x7fff8018
Tagged pointer (address already 8-byte aligned, tag 0):
┌─────────────────────────────────────────────────────────┬───┐
│ 0x7fff8000 │000│
└─────────────────────────────────────────────────────────┴───┘
63 3 2 0
tag
Extracting Values:
To check if integer: value & 0x7 == 1
To extract integer: value >> 3 (arithmetic shift)
To extract pointer: value & ~0x7 (clear low 3 bits)
Examples:
Integer -5:
Actual value: -5
Binary: 1111111111111111111111111111111111111111111111111111111111111011
Shifted left 3: 1111111111111111111111111111111111111111111111111111111111011000
Tagged: 1111111111111111111111111111111111111111111111111111111111011001
Hex: 0xffffffffffffffd9
Integer 1000:
Tagged decimal: 8001
Tagged hex: 0x1f41
Extract: 8001 >> 3 = 1000
Closure at 0x7000:
Tagged: 0x7000 (low bits already 000)
Code pointer: *(uint64_t*)0x7000
Environment: (uint64_t*)(0x7000 + 8)
This representation allows our runtime to efficiently distinguish between integers and heap-allocated objects using a single bit test, while keeping integers unboxed for performance. The 3-bit tag space could be extended to support additional immediate types like booleans or characters, though our current implementation only uses integers and pointers.
Function Compilation
Each closure-converted function compiles to a Cranelift function that follows a uniform calling convention. Functions receive two parameters: the closure containing captured variables and the function argument. They return a single value using the same tagged representation.
#![allow(unused)] fn main() { /// Compile a function fn compile_function(&mut self, func: &Function) -> Result<(), String> { self.ctx.func.signature = self.function_signature(); // Ensure proper alignment for ARM64 self.ctx.set_disasm(true); { let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context); let entry_block = builder.create_block(); builder.append_block_params_for_function_params(entry_block); builder.switch_to_block(entry_block); builder.seal_block(entry_block); // Set up variable environment let mut env = Environment::new(); // Get parameters let closure_val = builder.block_params(entry_block)[0]; let arg_val = builder.block_params(entry_block)[1]; env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val); env.bind(func.param.clone(), arg_val); // Compile body - use helper to avoid borrow issues let result = compile_expr( &mut self.module, &self.runtime_funcs, &self.function_map, &mut builder, &func.body, &mut env, )?; builder.ins().return_(&[result]); builder.finalize(); } // Define the function let func_id = self.function_map[&func.id]; self.module .define_function(func_id, &mut self.ctx) .map_err(|e| format!("Failed to define function: {}", e))?; self.module.clear_context(&mut self.ctx); Ok(()) } }
Function compilation begins by creating the appropriate function signature and setting up the entry block. The function parameters are bound to variables in the compilation environment, making them available for use in the function body. The body compilation generates instructions that compute the function result, which is then returned using Cranelift’s return instruction.
Expression Compilation
The expression compiler transforms closure-converted expressions into sequences of Cranelift instructions. Each expression type requires specific handling to generate correct and efficient machine code.
use std::collections::HashMap; use cranelift::prelude::codegen::ir::BlockArg; use cranelift::prelude::{ codegen, types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC, Signature, Value, }; use cranelift_module::{FuncId, Linkage, Module}; use super::closure::{Closed, Function, FunctionId, Program, CLOSURE_ENV_PARAM}; use super::runtime::RuntimeFunctions; use crate::core::CoreBinOp; /// Code generator state pub struct CodeGen<M: Module> { module: M, builder_context: FunctionBuilderContext, ctx: codegen::Context, /// Maps function IDs to Cranelift function IDs function_map: HashMap<FunctionId, FuncId>, /// Runtime function declarations runtime_funcs: RuntimeFunctions, } impl<M: Module> CodeGen<M> { /// Finish compilation and return the module pub fn finish(self) -> M { self.module } /// Get mutable reference to the module pub fn module_mut(&mut self) -> &mut M { &mut self.module } pub fn new(module: M, runtime_funcs: RuntimeFunctions) -> Self { let ctx = module.make_context(); Self { module, builder_context: FunctionBuilderContext::new(), ctx, function_map: HashMap::new(), runtime_funcs, } } /// Compile a program pub fn compile_program(&mut self, program: &Program) -> Result<FuncId, String> { // First pass: declare all functions for func in &program.functions { let func_id = self.declare_function(func)?; self.function_map.insert(func.id, func_id); } // Second pass: compile function bodies for func in &program.functions { self.compile_function(func)?; } // Compile main expression as a function self.compile_main(&program.main) } /// Declare a function fn declare_function(&mut self, func: &Function) -> Result<FuncId, String> { let name = format!("func_{}", func.id); let func_id = self .module .declare_function(&name, Linkage::Local, &self.function_signature()) .map_err(|e| format!("Failed to declare function: {}", e))?; Ok(func_id) } /// Get the standard function signature (closure, arg) -> value fn function_signature(&self) -> Signature { let mut 
sig = self.module.make_signature(); let pointer_type = self.module.target_config().pointer_type(); sig.params.push(AbiParam::new(pointer_type)); // closure sig.params.push(AbiParam::new(pointer_type)); // argument sig.returns.push(AbiParam::new(pointer_type)); // result sig } /// Compile a function fn compile_function(&mut self, func: &Function) -> Result<(), String> { self.ctx.func.signature = self.function_signature(); // Ensure proper alignment for ARM64 self.ctx.set_disasm(true); { let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context); let entry_block = builder.create_block(); builder.append_block_params_for_function_params(entry_block); builder.switch_to_block(entry_block); builder.seal_block(entry_block); // Set up variable environment let mut env = Environment::new(); // Get parameters let closure_val = builder.block_params(entry_block)[0]; let arg_val = builder.block_params(entry_block)[1]; env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val); env.bind(func.param.clone(), arg_val); // Compile body - use helper to avoid borrow issues let result = compile_expr( &mut self.module, &self.runtime_funcs, &self.function_map, &mut builder, &func.body, &mut env, )?; builder.ins().return_(&[result]); builder.finalize(); } // Define the function let func_id = self.function_map[&func.id]; self.module .define_function(func_id, &mut self.ctx) .map_err(|e| format!("Failed to define function: {}", e))?; self.module.clear_context(&mut self.ctx); Ok(()) } /// Compile the main expression fn compile_main(&mut self, main: &Closed) -> Result<FuncId, String> { let func_id = self .module .declare_function("main_compiled", Linkage::Export, &self.main_signature()) .map_err(|e| format!("Failed to declare main: {}", e))?; self.ctx.func.signature = self.main_signature(); // Ensure proper alignment for ARM64 self.ctx.set_disasm(true); { let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context); let entry_block = 
builder.create_block(); builder.switch_to_block(entry_block); builder.seal_block(entry_block); let mut env = Environment::new(); let result = compile_expr( &mut self.module, &self.runtime_funcs, &self.function_map, &mut builder, main, &mut env, )?; builder.ins().return_(&[result]); builder.finalize(); } self.module .define_function(func_id, &mut self.ctx) .map_err(|e| format!("Failed to define main: {}", e))?; Ok(func_id) } /// Get the main function signature () -> value fn main_signature(&self) -> Signature { let mut sig = self.module.make_signature(); let pointer_type = self.module.target_config().pointer_type(); sig.returns.push(AbiParam::new(pointer_type)); sig } } /// Variable environment for compilation struct Environment { bindings: Vec<(String, Value)>, } impl Environment { fn new() -> Self { Self { bindings: Vec::new(), } } fn bind(&mut self, name: String, value: Value) { self.bindings.push((name, value)); } fn lookup(&self, name: &str) -> Option<Value> { self.bindings .iter() .rev() .find(|(n, _)| n == name) .map(|(_, v)| *v) } } /// Compile a closed expression (extracted to avoid borrow checker issues) fn compile_expr<M: Module>( module: &mut M, runtime_funcs: &RuntimeFunctions, function_map: &HashMap<FunctionId, FuncId>, builder: &mut FunctionBuilder, expr: &Closed, env: &mut Environment, ) -> Result<Value, String> { match expr { Closed::Var(name) => env .lookup(name) .ok_or_else(|| format!("Unbound variable: {}", name)), Closed::Proj(closure, idx) => { let closure_val = compile_expr(module, runtime_funcs, function_map, builder, closure, env)?; let idx_val = builder.ins().iconst(types::I64, *idx as i64); let project_func = module.declare_func_in_func(runtime_funcs.project_env, builder.func); let inst = builder.ins().call(project_func, &[closure_val, idx_val]); Ok(builder.inst_results(inst)[0]) } Closed::MakeClosure(func_id, captured) => { // Get function pointer let cranelift_func_id = function_map[func_id]; let func_ref = 
module.declare_func_in_func(cranelift_func_id, builder.func); let func_ptr = builder .ins() .func_addr(module.target_config().pointer_type(), func_ref); // Compile captured values (up to 4 for now) let mut captured_vals = Vec::new(); for cap in captured.iter().take(4) { captured_vals.push(compile_expr( module, runtime_funcs, function_map, builder, cap, env, )?); } // Pad with zeros if needed let zero = builder.ins().iconst(types::I64, 0); while captured_vals.len() < 4 { captured_vals.push(zero); } // Number of captured values let env_size = builder.ins().iconst(types::I64, captured.len() as i64); // Call make_closure with fixed signature let make_closure_func = module.declare_func_in_func(runtime_funcs.make_closure, builder.func); let mut args = vec![func_ptr, env_size]; args.extend(captured_vals); let inst = builder.ins().call(make_closure_func, &args); Ok(builder.inst_results(inst)[0]) } Closed::Call(f, x) => { let f_val = compile_expr(module, runtime_funcs, function_map, builder, f, env)?; let x_val = compile_expr(module, runtime_funcs, function_map, builder, x, env)?; let apply_func = module.declare_func_in_func(runtime_funcs.apply, builder.func); let inst = builder.ins().call(apply_func, &[f_val, x_val]); Ok(builder.inst_results(inst)[0]) } Closed::Int(n) => { let n_val = builder.ins().iconst(types::I64, *n); let make_int_func = module.declare_func_in_func(runtime_funcs.make_int, builder.func); let inst = builder.ins().call(make_int_func, &[n_val]); Ok(builder.inst_results(inst)[0]) } Closed::BinOp(op, l, r) => { // For integer operations, we need to extract the integer values let l_compiled = compile_expr(module, runtime_funcs, function_map, builder, l, env)?; let r_compiled = compile_expr(module, runtime_funcs, function_map, builder, r, env)?; // Extract integer values (arithmetic shift right by 3 to remove tag and // preserve sign) let l_int = builder.ins().sshr_imm(l_compiled, 3); let r_int = builder.ins().sshr_imm(r_compiled, 3); // Perform the operation 
let result_int = match op { CoreBinOp::Add => builder.ins().iadd(l_int, r_int), CoreBinOp::Sub => builder.ins().isub(l_int, r_int), CoreBinOp::Mul => builder.ins().imul(l_int, r_int), CoreBinOp::Div => builder.ins().sdiv(l_int, r_int), CoreBinOp::Lt => { let cmp = builder.ins().icmp(IntCC::SignedLessThan, l_int, r_int); // Convert boolean to 0/1 let zero = builder.ins().iconst(types::I64, 0); let one = builder.ins().iconst(types::I64, 1); builder.ins().select(cmp, one, zero) } CoreBinOp::Le => { let cmp = builder .ins() .icmp(IntCC::SignedLessThanOrEqual, l_int, r_int); // Convert boolean to 0/1 let zero = builder.ins().iconst(types::I64, 0); let one = builder.ins().iconst(types::I64, 1); builder.ins().select(cmp, one, zero) } }; // Re-tag the result let shifted = builder.ins().ishl_imm(result_int, 3); let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag Ok(tagged) } Closed::IntrinsicCall { name, args } => match name.as_str() { "printInt" => { if args.len() != 1 { return Err(format!( "printInt expects exactly 1 argument, got {}", args.len() )); } let arg_val = compile_expr(module, runtime_funcs, function_map, builder, &args[0], env)?; let print_int_func = module.declare_func_in_func(runtime_funcs.print_int, builder.func); let inst = builder.ins().call(print_int_func, &[arg_val]); Ok(builder.inst_results(inst)[0]) } _ => Err(format!("Unknown intrinsic: {}", name)), }, Closed::If(c, t, e) => { // Compile condition let cond_val = compile_expr(module, runtime_funcs, function_map, builder, c, env)?; // Extract boolean value (shift right by 3, then check if non-zero) let cond_int = builder.ins().ushr_imm(cond_val, 3); let zero = builder.ins().iconst(types::I64, 0); let cond_bool = builder.ins().icmp(IntCC::NotEqual, cond_int, zero); // Create blocks let then_block = builder.create_block(); let else_block = builder.create_block(); let merge_block = builder.create_block(); // Add block parameter to merge block for the result 
builder.append_block_param(merge_block, types::I64); // Branch on condition builder .ins() .brif(cond_bool, then_block, &[], else_block, &[]); // Compile then branch builder.switch_to_block(then_block); builder.seal_block(then_block); let then_val = compile_expr(module, runtime_funcs, function_map, builder, t, env)?; builder.ins().jump(merge_block, &[BlockArg::from(then_val)]); // Compile else branch builder.switch_to_block(else_block); builder.seal_block(else_block); let else_val = compile_expr(module, runtime_funcs, function_map, builder, e, env)?; builder.ins().jump(merge_block, &[BlockArg::from(else_val)]); // Continue at merge block builder.switch_to_block(merge_block); builder.seal_block(merge_block); // Return the phi value Ok(builder.block_params(merge_block)[0]) } } }
Variable compilation looks up the variable in the environment to find its Cranelift value. Integer literals are tagged and returned directly. Binary operations extract integer values from their tagged representations, perform the operation, and retag the result. Closure creation allocates memory for the closure structure and initializes it with captured values. Function calls invoke the target function with the appropriate closure and argument.
Runtime System
The generated machine code cannot stand alone. It requires a runtime system that provides essential services not expressible in our high-level functional language. This runtime bridges the gap between pure functional code and the underlying operating system.
#![allow(unused)] fn main() { use cranelift::prelude::{ types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC, MemFlags, Type, }; use cranelift_module::{FuncId, Linkage, Module}; /// Closure representation in memory: /// [code_ptr | env_size | env[0] | env[1] | ... ] /// Each field is 8 bytes (pointer-sized) /// Compile runtime support functions pub fn compile_runtime<M: Module>(module: &mut M) -> Result<RuntimeFunctions, String> { let pointer_type = module.target_config().pointer_type(); // Declare the allocator function let alloc_func = declare_alloc(module, pointer_type)?; // Compile runtime functions let make_int = compile_make_int(module, pointer_type)?; let make_closure = compile_make_closure(module, pointer_type, alloc_func)?; let project_env = compile_project_env(module, pointer_type)?; let apply = compile_apply(module, pointer_type)?; let print_int = compile_print_int(module)?; Ok(RuntimeFunctions { make_int, make_closure, project_env, apply, print_int, }) } /// Declare the allocator function (implemented in C) fn declare_alloc<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> { let mut sig = module.make_signature(); sig.params.push(AbiParam::new(types::I64)); // size in bytes sig.returns.push(AbiParam::new(pointer_type)); // pointer to allocated memory module .declare_function("rt_alloc", Linkage::Import, &sig) .map_err(|e| format!("Failed to declare rt_alloc: {}", e)) } /// Compile make_int: tag an integer fn compile_make_int<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> { let mut sig = module.make_signature(); sig.params.push(AbiParam::new(types::I64)); sig.returns.push(AbiParam::new(pointer_type)); let func_id = module .declare_function("make_int", Linkage::Local, &sig) .map_err(|e| format!("Failed to declare make_int: {}", e))?; let mut ctx = module.make_context(); ctx.func.signature = sig; ctx.set_disasm(true); let mut func_ctx = FunctionBuilderContext::new(); let mut builder = 
FunctionBuilder::new(&mut ctx.func, &mut func_ctx); let entry = builder.create_block(); builder.append_block_params_for_function_params(entry); builder.switch_to_block(entry); builder.seal_block(entry); let n = builder.block_params(entry)[0]; let shifted = builder.ins().ishl_imm(n, 3); let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag = 1 builder.ins().return_(&[tagged]); builder.finalize(); module .define_function(func_id, &mut ctx) .map_err(|e| format!("Failed to define make_int: {}", e))?; Ok(func_id) } /// Compile make_closure: allocate a closure with captured environment fn compile_make_closure<M: Module>( module: &mut M, pointer_type: Type, alloc_func: FuncId, ) -> Result<FuncId, String> { // For simplicity, we'll create a fixed signature that takes a code pointer // and up to 4 captured values. Real implementation would be variadic. let mut sig = module.make_signature(); sig.params.push(AbiParam::new(pointer_type)); // code pointer sig.params.push(AbiParam::new(types::I64)); // number of captured values sig.params.push(AbiParam::new(pointer_type)); // captured value 0 (optional) sig.params.push(AbiParam::new(pointer_type)); // captured value 1 (optional) sig.params.push(AbiParam::new(pointer_type)); // captured value 2 (optional) sig.params.push(AbiParam::new(pointer_type)); // captured value 3 (optional) sig.returns.push(AbiParam::new(pointer_type)); let func_id = module .declare_function("make_closure", Linkage::Local, &sig) .map_err(|e| format!("Failed to declare make_closure: {}", e))?; let mut ctx = module.make_context(); ctx.func.signature = sig; ctx.set_disasm(true); let mut func_ctx = FunctionBuilderContext::new(); let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx); let entry = builder.create_block(); builder.append_block_params_for_function_params(entry); builder.switch_to_block(entry); builder.seal_block(entry); let code_ptr = builder.block_params(entry)[0]; let env_size = builder.block_params(entry)[1]; let cap0 = 
builder.block_params(entry)[2]; let cap1 = builder.block_params(entry)[3]; let _cap2 = builder.block_params(entry)[4]; let _cap3 = builder.block_params(entry)[5]; // Calculate allocation size: (2 + env_size) * 8 bytes let two = builder.ins().iconst(types::I64, 2); let total_fields = builder.ins().iadd(two, env_size); let eight = builder.ins().iconst(types::I64, 8); let alloc_size = builder.ins().imul(total_fields, eight); // Allocate memory let alloc_ref = module.declare_func_in_func(alloc_func, builder.func); let call = builder.ins().call(alloc_ref, &[alloc_size]); let closure_ptr = builder.inst_results(call)[0]; // Store code pointer at offset 0 builder .ins() .store(MemFlags::new(), code_ptr, closure_ptr, 0); // Store env_size at offset 8 builder .ins() .store(MemFlags::new(), env_size, closure_ptr, 8); // Store captured values based on env_size // We'll need to generate conditional code for each possible size let zero = builder.ins().iconst(types::I64, 0); let one = builder.ins().iconst(types::I64, 1); let _two_const = builder.ins().iconst(types::I64, 2); let _three = builder.ins().iconst(types::I64, 3); // If env_size > 0, store cap0 let cmp0 = builder.ins().icmp(IntCC::SignedGreaterThan, env_size, zero); let then_block0 = builder.create_block(); let merge_block0 = builder.create_block(); builder .ins() .brif(cmp0, then_block0, &[], merge_block0, &[]); builder.switch_to_block(then_block0); builder.seal_block(then_block0); builder.ins().store(MemFlags::new(), cap0, closure_ptr, 16); builder.ins().jump(merge_block0, &[]); builder.switch_to_block(merge_block0); builder.seal_block(merge_block0); // Similar for cap1, cap2, cap3... 
// For brevity, I'll just implement up to 2 captures let cmp1 = builder.ins().icmp(IntCC::SignedGreaterThan, env_size, one); let then_block1 = builder.create_block(); let merge_block1 = builder.create_block(); builder .ins() .brif(cmp1, then_block1, &[], merge_block1, &[]); builder.switch_to_block(then_block1); builder.seal_block(then_block1); builder.ins().store(MemFlags::new(), cap1, closure_ptr, 24); builder.ins().jump(merge_block1, &[]); builder.switch_to_block(merge_block1); builder.seal_block(merge_block1); // Tag the closure pointer (tag = 0, so no change needed) builder.ins().return_(&[closure_ptr]); builder.finalize(); module .define_function(func_id, &mut ctx) .map_err(|e| format!("Failed to define make_closure: {}", e))?; Ok(func_id) } /// Compile project_env: extract a value from closure environment fn compile_project_env<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> { let mut sig = module.make_signature(); sig.params.push(AbiParam::new(pointer_type)); // closure sig.params.push(AbiParam::new(types::I64)); // index sig.returns.push(AbiParam::new(pointer_type)); let func_id = module .declare_function("project_env", Linkage::Local, &sig) .map_err(|e| format!("Failed to declare project_env: {}", e))?; let mut ctx = module.make_context(); ctx.func.signature = sig; ctx.set_disasm(true); let mut func_ctx = FunctionBuilderContext::new(); let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx); let entry = builder.create_block(); builder.append_block_params_for_function_params(entry); builder.switch_to_block(entry); builder.seal_block(entry); let closure_ptr = builder.block_params(entry)[0]; let index = builder.block_params(entry)[1]; // Calculate offset: (2 + index) * 8 // For now, we'll use dynamic address calculation let two = builder.ins().iconst(types::I64, 2); let field_index = builder.ins().iadd(two, index); let eight = builder.ins().iconst(types::I64, 8); let offset = builder.ins().imul(field_index, eight); // Add 
offset to base pointer let addr = builder.ins().iadd(closure_ptr, offset); // Load the value let value = builder.ins().load(pointer_type, MemFlags::new(), addr, 0); builder.ins().return_(&[value]); builder.finalize(); module .define_function(func_id, &mut ctx) .map_err(|e| format!("Failed to define project_env: {}", e))?; Ok(func_id) } /// Compile apply: call a closure with an argument fn compile_apply<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> { let mut sig = module.make_signature(); sig.params.push(AbiParam::new(pointer_type)); // closure sig.params.push(AbiParam::new(pointer_type)); // argument sig.returns.push(AbiParam::new(pointer_type)); let func_id = module .declare_function("apply", Linkage::Local, &sig) .map_err(|e| format!("Failed to declare apply: {}", e))?; let mut ctx = module.make_context(); ctx.func.signature = sig.clone(); let mut func_ctx = FunctionBuilderContext::new(); let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx); let entry = builder.create_block(); builder.append_block_params_for_function_params(entry); builder.switch_to_block(entry); builder.seal_block(entry); let closure = builder.block_params(entry)[0]; let arg = builder.block_params(entry)[1]; // Load the code pointer from offset 0 let code_ptr = builder .ins() .load(pointer_type, MemFlags::new(), closure, 0); // Call the function with closure and argument let call_sig = builder.import_signature(sig); let result = builder .ins() .call_indirect(call_sig, code_ptr, &[closure, arg]); let result_val = builder.inst_results(result)[0]; builder.ins().return_(&[result_val]); builder.finalize(); module .define_function(func_id, &mut ctx) .map_err(|e| format!("Failed to define apply: {}", e))?; Ok(func_id) } /// Compile print_int: print an integer fn compile_print_int<M: Module>(module: &mut M) -> Result<FuncId, String> { let pointer_type = module.target_config().pointer_type(); // Declare external rt_print_int that's implemented in C let mut sig = 
module.make_signature(); sig.params.push(AbiParam::new(types::I64)); // integer value to print sig.returns.push(AbiParam::new(pointer_type)); // returns unit (as tagged 0) let rt_print_func = module .declare_function("rt_print_int", Linkage::Import, &sig) .map_err(|e| format!("Failed to declare rt_print_int: {}", e))?; // Create print_int function that takes a tagged integer let mut sig = module.make_signature(); sig.params.push(AbiParam::new(pointer_type)); // tagged integer sig.returns.push(AbiParam::new(pointer_type)); // returns unit let func_id = module .declare_function("print_int", Linkage::Local, &sig) .map_err(|e| format!("Failed to declare print_int: {}", e))?; let mut ctx = module.make_context(); ctx.func.signature = sig; ctx.set_disasm(true); let mut func_ctx = FunctionBuilderContext::new(); let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx); let entry = builder.create_block(); builder.append_block_params_for_function_params(entry); builder.switch_to_block(entry); builder.seal_block(entry); let tagged_int = builder.block_params(entry)[0]; // Extract the integer value (arithmetic shift right by 3 to preserve sign) let int_val = builder.ins().sshr_imm(tagged_int, 3); // Call rt_print_int let rt_print_ref = module.declare_func_in_func(rt_print_func, builder.func); let call = builder.ins().call(rt_print_ref, &[int_val]); let unit_result = builder.inst_results(call)[0]; builder.ins().return_(&[unit_result]); builder.finalize(); module .define_function(func_id, &mut ctx) .map_err(|e| format!("Failed to define print_int: {}", e))?; Ok(func_id) } pub struct RuntimeFunctions { pub make_int: FuncId, pub make_closure: FuncId, pub project_env: FuncId, pub apply: FuncId, pub print_int: FuncId, } }
The runtime provides several categories of essential services through functions that follow our calling conventions and value representations. Each runtime function is declared to Cranelift and can be called directly from generated code.
The actual implementation of these runtime functions lives in a separate support library written in Rust with `no_std` to minimize dependencies:
#![allow(unused)] fn main() { //! Runtime support functions for System F-ω //! //! This file is automatically compiled by build.rs during the build process. #![no_std] #![no_main] // Simple bump allocator for runtime static mut HEAP: [u8; 1024 * 1024] = [0; 1024 * 1024]; // 1MB heap static mut HEAP_PTR: usize = 0; // Out of memory error string static OOM: &[u8] = b"Out of memory!\n"; /// Runtime allocator function #[no_mangle] pub unsafe extern "C" fn rt_alloc(size: usize) -> *mut u8 { // Align to 8 bytes let aligned_size = (size + 7) & !7; if HEAP_PTR + aligned_size > HEAP.len() { // Just write to stderr directly libc::write(2, OOM.as_ptr(), OOM.len()); libc::exit(1); } let result = HEAP.as_mut_ptr().add(HEAP_PTR); HEAP_PTR += aligned_size; result } /// Runtime print function - just use printf from libc #[no_mangle] pub unsafe extern "C" fn rt_print_int(value: i64) -> u64 { libc::printf(b"%lld\n\0".as_ptr() as *const i8, value); 0 // Return Unit (tagged 0) } // Minimal panic handler for no_std #[panic_handler] fn panic(_: &core::panic::PanicInfo) -> ! { unsafe { libc::exit(1) } } // Link with libc mod libc { extern "C" { pub fn printf(format: *const i8, ...) -> i32; pub fn write(fd: i32, buf: *const u8, count: usize) -> isize; pub fn exit(status: i32) -> !; } } }
The runtime support library serves several core purposes:
Memory Allocation: Our functional language creates closures and other data structures dynamically, but has no concept of memory management. The runtime provides `rt_alloc`, a simple bump allocator that manages a fixed 1MB heap. This allocator is deliberately minimal: it only allocates and never frees, which is sufficient for our demonstration language. The bump allocator maintains a pointer into a static array and advances it for each allocation, ensuring 8-byte alignment for all allocations.
Value Creation: The `make_int` function tags raw integers for use in our tagged representation, while `make_closure` allocates and initializes closure structures with their code pointers and captured environments. The `project_env` function extracts values from closure environments during execution.
Function Application: The `apply` function implements the calling convention for closure invocation. It extracts the code pointer from a closure and calls it with the closure and argument, handling the low-level details of our calling convention.
Input/Output: Pure functional languages have no inherent notion of effects like printing to the console. The runtime provides `rt_print_int`, which unwraps our tagged integer representation and calls the C library’s `printf` function to display the value. This is our only connection to the outside world, allowing programs to produce observable output.
Error Handling: When the heap is exhausted, the runtime writes an error message directly to stderr using the POSIX `write` system call and terminates the program. The panic handler similarly ensures clean termination if any runtime invariant is violated.
Operating System Interface: The runtime uses direct foreign function interface calls to libc for its interactions with the operating system. This includes `printf` for formatted output, `write` for error messages, and `exit` for program termination.
The `no_std` and `no_main` attributes tell Rust not to include its standard library or generate a main function, keeping the runtime minimal. The entire runtime compiles to a small object file that gets linked with our generated code. During the build process, `build.rs` invokes the Rust compiler directly:
rustc --crate-type=staticlib --emit=obj -C opt-level=2 -C panic=abort runtime_support.rs
This produces an object file containing just our runtime functions, which the system linker combines with the Cranelift-generated code to create the final executable. The beauty of this approach is that we get exactly the runtime support we need while leveraging existing system libraries for the heavy lifting.
Memory Management
As an aside, note that our toy runtime uses a naive bump allocator that never frees memory. This is simple, but, to put it mildly, it isn’t ideal. It would quickly exhaust memory in real programs, since functional languages allocate closures and immutable data structures prolifically. A real implementation requires careful thinking about memory management, and there are several approaches.
Garbage Collection: The simplest solution is to bolt on a garbage collector like the Boehm-Demers-Weiser collector, a conservative collector that works with minimal compiler modifications. Boehm GC scans memory for values that look like pointers and conservatively assumes they might be live references. It works pretty much off the shelf: we just replace `malloc` with `GC_malloc`, and it handles root identification, reachability analysis, and memory reclamation automatically. The conservative approach occasionally retains dead memory but avoids the complexity of precise pointer tracking. There are Rust bindings, and it merely requires installing the `libgc` C library with:
brew install libgc
More sophisticated garbage collectors offer better performance through various techniques. Generational collectors exploit the observation that most objects die young, segregating new allocations into a nursery that gets collected frequently; this design is used in languages like Haskell and OCaml. Concurrent collectors run collection in parallel with the program, reducing pause times. Incremental collectors spread collection work across many small pauses rather than stopping the world. Each approach requires deeper runtime integration and careful handling of write barriers and safepoints.
Modern Approaches: Recent years have seen renewed interest in static memory management techniques that avoid garbage collection entirely. Affine types (or move semantics), pioneered by languages like Rust, ensure that each value has exactly one owner and is used exactly once. This enables compile-time memory management without runtime overhead. The type system tracks ownership and borrowing, preventing use-after-free and data races while enabling safe manual memory management.
Region-based memory management groups related allocations into regions that can be deallocated together. Languages like Cyclone and more recently Koka use effect systems to track region lifetimes. This approach works particularly well for functional languages where data often has stack-like lifetimes corresponding to function calls.
Reference counting with cycle detection offers another alternative, used by languages like Swift and Python. Modern reference counting implementations use deferred reference counting to reduce overhead and can achieve performance competitive with tracing garbage collectors for many workloads.
The choice of memory management strategy profoundly impacts language design. Tracing garbage collection enables simple APIs and unrestricted sharing but imposes runtime overhead and can cause unpredictable pauses. Linear types and ownership systems provide predictable performance and memory usage but complicate the programming model. As hardware evolves and programming patterns change, the tradeoffs continue to shift. A good overview of this deep area is Jones, Hosking, and Moss’s The Garbage Collection Handbook: The Art of Automatic Memory Management.
Executable Generation
The final phase links the generated code with the runtime system to produce a standalone executable. This process involves generating object files, linking with the runtime support library, and producing an executable for the target platform.
#![allow(unused)] fn main() { use std::process::Command; use std::str::FromStr; use cranelift::codegen::isa::lookup; use cranelift::codegen::settings::{builder, Flags}; use cranelift::prelude::{ types, AbiParam, Configurable, FunctionBuilder, FunctionBuilderContext, InstBuilder, }; use cranelift_module::{Linkage, Module}; use cranelift_object::{ObjectBuilder, ObjectModule}; use target_lexicon::Triple; use super::{closure, compile, erase, runtime}; use crate::core::CoreModule; /// Add an entry point that calls main and prints the result fn add_entry_point<M: Module>(codegen: &mut compile::CodeGen<M>) -> Result<(), String> { // Get the module and create entry point function let module = codegen.module_mut(); // Create signature for entry point - takes no arguments, returns i32 (exit // code) let mut sig = module.make_signature(); sig.returns.push(AbiParam::new(types::I32)); // Declare the entry point function with export linkage let entry_func = module .declare_function("main", Linkage::Export, &sig) .map_err(|e| format!("Failed to declare entry point: {}", e))?; // Get reference to our compiled main function // main_compiled returns a pointer (tagged unit value) let mut main_sig = module.make_signature(); let pointer_type = module.target_config().pointer_type(); main_sig.returns.push(AbiParam::new(pointer_type)); let main_func = module .declare_function("main_compiled", Linkage::Import, &main_sig) .map_err(|e| format!("Failed to declare main_compiled reference: {}", e))?; // Create the function body let mut ctx = module.make_context(); ctx.func.signature = sig; { let mut func_ctx = FunctionBuilderContext::new(); let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx); // Create entry block let block = builder.create_block(); builder.switch_to_block(block); builder.seal_block(block); // Call our compiled main function let main_ref = module.declare_func_in_func(main_func, builder.func); builder.ins().call(main_ref, &[]); // Return 0 as exit code let zero 
= builder.ins().iconst(types::I32, 0); builder.ins().return_(&[zero]); // Finalize builder.finalize(); } // Define the function module .define_function(entry_func, &mut ctx) .map_err(|e| format!("Failed to define entry point: {}", e))?; Ok(()) } /// Link the object file to create an executable fn link_executable(obj_path: &str, output_path: &str) -> Result<(), String> { // Get the pre-compiled runtime object path from build.rs let runtime_obj_path = env!("RUNTIME_OBJ"); // Link the generated code with the runtime support let output = Command::new("cc") .args(["-o", output_path, obj_path, runtime_obj_path]) .output() .map_err(|e| format!("Failed to run linker: {}", e))?; if !output.status.success() { return Err(format!( "Linker failed: {}", String::from_utf8_lossy(&output.stderr) )); } Ok(()) } /// Compile a module to a standalone executable pub fn compile_executable(module: &CoreModule, output_path: &str) -> Result<(), String> { println!("=== Compiling Module ==="); // Find main function let main_def = module .term_defs .iter() .find(|def| def.name == "main") .ok_or("No main function found")?; // Type erase all functions let mut erased_functions = Vec::new(); for def in &module.term_defs { if def.name != "main" { let erased = erase::erase(&def.body); erased_functions.push((def.name.as_str(), erased)); } } // Type erase main let erased_main = erase::erase(&main_def.body); println!("=== Type Erased Functions ==="); for (name, body) in &erased_functions { println!("{}: {}", name, body.pretty()); } println!("main: {}", erased_main.pretty()); // Closure conversion with module context let program = closure::closure_convert_module( erased_functions.iter().map(|(n, b)| (*n, b)).collect(), &erased_main, ); println!("=== Closure Converted ==="); for func in &program.functions { println!( "Function {}: {} -> {}", func.id, func.param, func.body.pretty() ); println!(" Free vars: {:?}", func.free_vars); } println!("Main: {}", program.main.pretty()); // Create cranelift module 
let triple = if cfg!(target_arch = "aarch64") && cfg!(target_os = "macos") { // Use the proper ARM64 macOS triple target_lexicon::Triple::from_str("aarch64-apple-darwin").unwrap() } else { Triple::host() }; let mut settings = builder(); settings.set("opt_level", "none").unwrap(); settings.set("is_pic", "true").unwrap(); if cfg!(target_arch = "aarch64") { // Ensure proper alignment for ARM64 settings .set("enable_heap_access_spectre_mitigation", "false") .unwrap(); settings.set("use_colocated_libcalls", "false").unwrap(); } let flags = Flags::new(settings); let isa = lookup(triple.clone()) .map_err(|e| format!("Error looking up ISA: {}", e))? .finish(flags) .map_err(|e| format!("Error creating ISA: {}", e))?; let builder = ObjectBuilder::new( isa, "system_f_omega", cranelift_module::default_libcall_names(), ) .map_err(|e| format!("Error creating object builder: {}", e))?; let mut obj_module = ObjectModule::new(builder); // Compile runtime functions let runtime_funcs = runtime::compile_runtime(&mut obj_module)?; // Compile the program let mut codegen = compile::CodeGen::new(obj_module, runtime_funcs); let _main_func = codegen.compile_program(&program)?; // Add entry point that calls our main add_entry_point(&mut codegen)?; // Finalize let module = codegen.finish(); let object = module.finish(); let bytes = object .emit() .map_err(|e| format!("Error emitting object: {}", e))?; // Write object file let obj_path = format!("{}.o", output_path); std::fs::write(&obj_path, bytes).map_err(|e| format!("Error writing object file: {}", e))?; // Link to executable using system linker link_executable(&obj_path, output_path)?; println!("=== Compilation Complete ==="); println!("Executable written to: {}", output_path); Ok(()) } }
The executable generation process coordinates the entire compilation pipeline. Type erasure removes type information from the core language representation. Closure conversion transforms nested functions into explicit closures. Cranelift compilation generates machine code for each function. The main function wraps the program’s entry point with appropriate initialization. Finally, the system linker combines the generated code with the runtime library to produce an executable program.
And that’s it, that’s a complete compiler! You’ve gone from a pure platonic mathematical construct all the way down to the messy real world of silicon and electrons.
Compiling a Simple Program
So how do we invoke it? Just like any other compiler: we feed it an input file and get a machine-native executable out the other end.
Let’s trace through the compilation of a simple recursive Fibonacci function to see how all these pieces work together:
-- factorial.fun
fib :: Int -> Int;
fib n = if n Le 1 then n else fib (n Sub 1) Add fib (n Sub 2);
main :: Int;
main = printInt(fib 10);
First, compile and run the program:
$ cd system-f-omega
$ cargo run -- compile factorial.fun -o factorial
$ ./factorial
55
The compilation process transforms this high-level program through several stages:
After Type Erasure: The polymorphic type information disappears, leaving only the computational structure:
fib = λn. if n Le 1 then n else fib (n Sub 1) Add fib (n Sub 2)
main = printInt(fib 10)
After Closure Conversion: Functions become explicit closures with captured environments:
Function 0: n -> if n Le 1 then n else closure(0, [])(n Sub 1) Add closure(0, [])(n Sub 2)
Free vars: []
Main: printInt(closure(0, [])(10))
Machine Code Generation: Cranelift generates optimized assembly for the target architecture, handling the tagged value representation, closure allocation, and function calls according to our calling convention.
And that’s the entire journey from theoretical type system to machine code. The lambda calculus need not remain an abstract concept; with the right set of abstractions, it can be compiled down to run efficiently on real hardware, just like C or Rust. Compilers need not be hard; in fact, they are quite fun!