Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Code Generation

This section is optional, but we’d like to explore a little bit how we can go from high-level lambda calculus (i.e. System F) all the way down to executable machine code requires several transformations that preserve the semantics of our programs while adapting them to the realities of modern CPU architectures.

We’re going to build a minimalist generation pipeline that transforms our typed functional programs into imperative machine code through three major phases:

  • Type erasure First we removes the type information that guided our type checking
  • Closure conversion Second we make the implicit environment capture of nested functions explicit
  • Code generation Finally our code generation framework (in our case Cranelift) generates optimized machine code for the target architecture from our closure converted code.

Choosing a Code Generation Backend

Before diving into our implementation, we should understand why we chose Cranelift as our code generation backend. When implementing a compiler, you face a fundamental choice about how to generate executable code, each with distinct tradeoffs.

Bespoke Backend: The most educational approach involves directly emitting assembly instructions for your target architecture. This gives you complete control and deep understanding of the machine, but requires implementing register allocation, instruction selection, and optimization passes from scratch. For production compilers targeting multiple architectures, this quickly becomes intractable. You would need to understand the intricacies of x86-64, ARM64, RISC-V, and other instruction sets, along with their calling conventions and performance characteristics. This can be a non-trivial project.

Virtual Machine Approach: Many languages choose to compile to bytecode for a custom virtual machine. Over the decades, numerous specialized VMs have been developed for functional languages:

VMLanguagesKey FeaturesEvaluation
SECDML, SchemeStack-based, formal basisCall-by-value
CEKScheme, RacketControl/environment/continuationCall-by-value
KrivineOCamlDe Bruijn indices, minimalCall-by-name
STGHaskellLazy graph reduction, closures, parallelismCall-by-need
ZINCOCamlEfficient currying, stack/env/closureCall-by-value
JVMScala, Clojure, Kotlin, EtaClosures, immutable structures, concurrencyConfigurable
BEAMErlang, ElixirConcurrency, hot swapping, lightweight processesStrict
FLVMCurryFunctional logic, non-determinism, poolsMixed
UxnFunktalMinimalism, 8-bit opcodes, functional coreCustom

This approach provides excellent portability and simplifies the compiler, but sacrifices performance due to interpretation overhead. Even with just-in-time compilation, the VM approach typically cannot match native code performance for compute-intensive tasks.

LibJIT: Libraries like LibJIT provide a middle ground, offering a simple API for generating machine code without building a full compiler backend. These work well for domain-specific languages and embedded scripting, but typically lack sophisticated optimizations and broad architecture support. They excel at simplicity but plateau quickly in terms of performance.

LLVM: The dominant choice for production compilers, LLVM provides industrial-strength optimization passes and supports virtually every architecture. Rust, Swift, and Clang all use LLVM. However, LLVM’s comprehensiveness comes with significant costs: massive binary sizes (hundreds of megabytes), slow compilation times, and a complex C++ API that requires deep expertise. LLVM’s optimization passes, while extremely powerful, can take longer than the rest of your compiler combined. There are good Rust bindings in the inkwell crate.

MLIR: Multi-Level Intermediate Representation is like a higher-level LLVM, which extends LLVM with better support for domain-specific optimizations and heterogeneous computing (think GPUs). While powerful for machine learning compilers and specialized domains, MLIR adds even more complexity than LLVM and is still fairly early and undocumented (although I’ve tired somewhat to remedy that). There are excellent Rust bindings in melior crate.

Cranelift: Originally designed for WebAssembly, Cranelift occupies a sweet spot for language experimentation. It generates good quality code quickly, has a clean Rust API, produces small binaries, and supports the major architectures (x86-64, ARM64). While it cannot match LLVM’s peak optimization quality, Cranelift compiles code an order of magnitude faster. For a teaching compiler where iteration speed and code clarity matter more than squeezing out the last 10% of performance, Cranelift is awesome.

We’re going to use Cranelift, because it’s simple and uses the Rust build system with no external dependencies. It also has a clean API and compiles quickly, making it ideal for experimentation and learning. But our approach would work with any of these backends … only the final code generation phase would change if you choose a different backend.

Type Erasure

System Fω’s type system serves its purpose during type checking, ensuring program correctness and enabling powerful abstractions. But as we discussed before types exist purely for compile-time verification and carry no computational content at runtime. The type erasure phase strips away all type information, leaving only the essential computational structure.

#![allow(unused)]
fn main() {
use crate::core::{CoreBinOp, CoreTerm};
/// Erase types from Core representation
pub fn erase(term: &CoreTerm) -> Erased {
    match term {
        CoreTerm::Var(name) => Erased::Var(name.clone()),

        CoreTerm::LitInt(n) => Erased::Int(*n),

        CoreTerm::Lambda {
            param,
            param_ty: _,
            body,
        } => {
            // Erase type annotation
            Erased::Lam(param.clone(), Box::new(erase(body)))
        }

        CoreTerm::App { func, arg } => Erased::App(Box::new(erase(func)), Box::new(erase(arg))),

        CoreTerm::TypeLambda { param: _, body } => {
            // Type abstractions are erased - just keep the body
            erase(body)
        }

        CoreTerm::Constructor { .. } => {
            // For now, we don't handle constructors
            panic!("Constructor not yet supported in code generation")
        }

        CoreTerm::Case { .. } => {
            panic!("Pattern matching not yet supported in code generation")
        }

        CoreTerm::BinOp { op, left, right } => {
            Erased::BinOp(op.clone(), Box::new(erase(left)), Box::new(erase(right)))
        }

        CoreTerm::If {
            cond,
            then_branch,
            else_branch,
        } => Erased::If(
            Box::new(erase(cond)),
            Box::new(erase(then_branch)),
            Box::new(erase(else_branch)),
        ),

        CoreTerm::IntrinsicCall { name, args } => Erased::IntrinsicCall {
            name: name.clone(),
            args: args.iter().map(erase).collect(),
        },
    }
}
impl Erased {
    /// Pretty print erased terms
    pub fn pretty(&self) -> String {
        match self {
            Erased::Var(name) => name.to_string(),
            Erased::Lam(name, body) => format!("λ{}. {}", name, body.pretty()),
            Erased::App(f, x) => {
                let f_str = match &**f {
                    Erased::Lam(_, _) => format!("({})", f.pretty()),
                    _ => f.pretty(),
                };
                let x_str = match &**x {
                    Erased::App(_, _) | Erased::Lam(_, _) => format!("({})", x.pretty()),
                    _ => x.pretty(),
                };
                format!("{} {}", f_str, x_str)
            }
            Erased::Int(n) => n.to_string(),
            Erased::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Erased::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Erased::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }

    /// Collect all free variables
    pub fn free_vars(&self) -> Vec<String> {
        self.free_vars_impl(&mut Vec::new())
    }

    fn free_vars_impl(&self, bound: &mut Vec<String>) -> Vec<String> {
        match self {
            Erased::Var(name) => {
                if bound.contains(name) {
                    vec![]
                } else {
                    vec![name.clone()]
                }
            }
            Erased::Lam(name, body) => {
                bound.push(name.clone());
                let result = body.free_vars_impl(bound);
                bound.pop();
                result
            }
            Erased::App(f, x) => {
                let mut fvs = f.free_vars_impl(bound);
                let x_fvs = x.free_vars_impl(bound);
                for fv in x_fvs {
                    if !fvs.contains(&fv) {
                        fvs.push(fv);
                    }
                }
                fvs
            }
            Erased::Int(_) => vec![],
            Erased::BinOp(_, l, r) => {
                let mut fvs = l.free_vars_impl(bound);
                let r_fvs = r.free_vars_impl(bound);
                for fv in r_fvs {
                    if !fvs.contains(&fv) {
                        fvs.push(fv);
                    }
                }
                fvs
            }
            Erased::If(c, t, e) => {
                let mut fvs = c.free_vars_impl(bound);
                for term in [t, e] {
                    let term_fvs = term.free_vars_impl(bound);
                    for fv in term_fvs {
                        if !fvs.contains(&fv) {
                            fvs.push(fv);
                        }
                    }
                }
                fvs
            }
            Erased::IntrinsicCall { args, .. } => {
                let mut fvs = vec![];
                for arg in args {
                    let arg_fvs = arg.free_vars_impl(bound);
                    for fv in arg_fvs {
                        if !fvs.contains(&fv) {
                            fvs.push(fv);
                        }
                    }
                }
                fvs
            }
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    use crate::core::CoreType;

    #[test]
    fn test_erase_simple() {
        // λx:Int. x
        let term = CoreTerm::Lambda {
            param: "x".to_string(),
            param_ty: CoreType::Con("Int".to_string()),
            body: Box::new(CoreTerm::Var("x".to_string())),
        };

        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }

    #[test]
    fn test_erase_type_abstraction() {
        // Λα. λx:α. x
        let term = CoreTerm::TypeLambda {
            param: "α".to_string(),
            body: Box::new(CoreTerm::Lambda {
                param: "x".to_string(),
                param_ty: CoreType::Var("α".to_string()),
                body: Box::new(CoreTerm::Var("x".to_string())),
            }),
        };

        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }
}
/// Untyped lambda calculus after type erasure
#[derive(Debug, Clone, PartialEq)]
pub enum Erased {
    /// Variable
    Var(String),
    /// Lambda abstraction (no type annotation)
    Lam(String, Box<Erased>),
    /// Application
    App(Box<Erased>, Box<Erased>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Erased>, Box<Erased>),
    /// Conditional
    If(Box<Erased>, Box<Erased>, Box<Erased>),
    /// Call a runtime intrinsic
    IntrinsicCall { name: String, args: Vec<Erased> },
}
}

The erased representation captures the essence of computation without types. Variables become simple names, lambda abstractions lose their type annotations, and applications remain as the fundamental operation of function invocation. Integer literals and binary operations pass through unchanged, as they represent actual runtime computations. Critically, type abstractions and type applications vanish entirely during erasure, as they exist solely to guide type checking.

#![allow(unused)]
fn main() {
use crate::core::{CoreBinOp, CoreTerm};
/// Untyped lambda calculus after type erasure
#[derive(Debug, Clone, PartialEq)]
pub enum Erased {
    /// Variable
    Var(String),
    /// Lambda abstraction (no type annotation)
    Lam(String, Box<Erased>),
    /// Application
    App(Box<Erased>, Box<Erased>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Erased>, Box<Erased>),
    /// Conditional
    If(Box<Erased>, Box<Erased>, Box<Erased>),
    /// Call a runtime intrinsic
    IntrinsicCall { name: String, args: Vec<Erased> },
}
impl Erased {
    /// Pretty print erased terms
    pub fn pretty(&self) -> String {
        match self {
            Erased::Var(name) => name.to_string(),
            Erased::Lam(name, body) => format!("λ{}. {}", name, body.pretty()),
            Erased::App(f, x) => {
                let f_str = match &**f {
                    Erased::Lam(_, _) => format!("({})", f.pretty()),
                    _ => f.pretty(),
                };
                let x_str = match &**x {
                    Erased::App(_, _) | Erased::Lam(_, _) => format!("({})", x.pretty()),
                    _ => x.pretty(),
                };
                format!("{} {}", f_str, x_str)
            }
            Erased::Int(n) => n.to_string(),
            Erased::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Erased::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Erased::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }

    /// Collect all free variables
    pub fn free_vars(&self) -> Vec<String> {
        self.free_vars_impl(&mut Vec::new())
    }

    fn free_vars_impl(&self, bound: &mut Vec<String>) -> Vec<String> {
        match self {
            Erased::Var(name) => {
                if bound.contains(name) {
                    vec![]
                } else {
                    vec![name.clone()]
                }
            }
            Erased::Lam(name, body) => {
                bound.push(name.clone());
                let result = body.free_vars_impl(bound);
                bound.pop();
                result
            }
            Erased::App(f, x) => {
                let mut fvs = f.free_vars_impl(bound);
                let x_fvs = x.free_vars_impl(bound);
                for fv in x_fvs {
                    if !fvs.contains(&fv) {
                        fvs.push(fv);
                    }
                }
                fvs
            }
            Erased::Int(_) => vec![],
            Erased::BinOp(_, l, r) => {
                let mut fvs = l.free_vars_impl(bound);
                let r_fvs = r.free_vars_impl(bound);
                for fv in r_fvs {
                    if !fvs.contains(&fv) {
                        fvs.push(fv);
                    }
                }
                fvs
            }
            Erased::If(c, t, e) => {
                let mut fvs = c.free_vars_impl(bound);
                for term in [t, e] {
                    let term_fvs = term.free_vars_impl(bound);
                    for fv in term_fvs {
                        if !fvs.contains(&fv) {
                            fvs.push(fv);
                        }
                    }
                }
                fvs
            }
            Erased::IntrinsicCall { args, .. } => {
                let mut fvs = vec![];
                for arg in args {
                    let arg_fvs = arg.free_vars_impl(bound);
                    for fv in arg_fvs {
                        if !fvs.contains(&fv) {
                            fvs.push(fv);
                        }
                    }
                }
                fvs
            }
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    use crate::core::CoreType;

    #[test]
    fn test_erase_simple() {
        // λx:Int. x
        let term = CoreTerm::Lambda {
            param: "x".to_string(),
            param_ty: CoreType::Con("Int".to_string()),
            body: Box::new(CoreTerm::Var("x".to_string())),
        };

        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }

    #[test]
    fn test_erase_type_abstraction() {
        // Λα. λx:α. x
        let term = CoreTerm::TypeLambda {
            param: "α".to_string(),
            body: Box::new(CoreTerm::Lambda {
                param: "x".to_string(),
                param_ty: CoreType::Var("α".to_string()),
                body: Box::new(CoreTerm::Var("x".to_string())),
            }),
        };

        let erased = erase(&term);
        assert_eq!(
            erased,
            Erased::Lam("x".to_string(), Box::new(Erased::Var("x".to_string())))
        );
    }
}
/// Erase types from Core representation
pub fn erase(term: &CoreTerm) -> Erased {
    match term {
        CoreTerm::Var(name) => Erased::Var(name.clone()),

        CoreTerm::LitInt(n) => Erased::Int(*n),

        CoreTerm::Lambda {
            param,
            param_ty: _,
            body,
        } => {
            // Erase type annotation
            Erased::Lam(param.clone(), Box::new(erase(body)))
        }

        CoreTerm::App { func, arg } => Erased::App(Box::new(erase(func)), Box::new(erase(arg))),

        CoreTerm::TypeLambda { param: _, body } => {
            // Type abstractions are erased - just keep the body
            erase(body)
        }

        CoreTerm::Constructor { .. } => {
            // For now, we don't handle constructors
            panic!("Constructor not yet supported in code generation")
        }

        CoreTerm::Case { .. } => {
            panic!("Pattern matching not yet supported in code generation")
        }

        CoreTerm::BinOp { op, left, right } => {
            Erased::BinOp(op.clone(), Box::new(erase(left)), Box::new(erase(right)))
        }

        CoreTerm::If {
            cond,
            then_branch,
            else_branch,
        } => Erased::If(
            Box::new(erase(cond)),
            Box::new(erase(then_branch)),
            Box::new(erase(else_branch)),
        ),

        CoreTerm::IntrinsicCall { name, args } => Erased::IntrinsicCall {
            name: name.clone(),
            args: args.iter().map(erase).collect(),
        },
    }
}
}

The erasure algorithm traverses the typed abstract syntax tree and systematically removes all type information. Lambda abstractions lose their parameter type annotations, becoming untyped functions that operate solely on runtime values. Type abstractions disappear entirely, with only their bodies remaining, since type parameters have no runtime representation. Type applications similarly vanish, as they merely instantiate type variables that no longer exist after erasure.

This phase demonstrates a fundamental principle of type systems: types are a compile-time discipline that ensures program correctness without imposing runtime overhead. The erased program retains exactly the computational content of the original while shedding the scaffolding that ensured its correctness.

Closure Conversion

Functional languages allow functions to be defined within other functions, creating nested scopes where inner functions can reference variables from enclosing scopes. This natural programming style poses a challenge for compilation to machine code, where functions are typically compiled to fixed addresses with no implicit access to their defining environment.

Consider this simple example of a function that creates an adder:

makeAdder :: Int -> (Int -> Int);
makeAdder x = λy. x + y;

add5 :: Int -> Int;
add5 = makeAdder 5;

The inner lambda λy. x + y references the variable x from its enclosing scope. When makeAdder returns this function, the value of x must somehow be preserved so that when add5 is later called with an argument, it still has access to the value 5 that was passed to makeAdder.

In traditional assembly language or C, functions are just code addresses with no associated data. A function like add would be compiled to a fixed location in memory:

add_function:
    ; expects two arguments in registers
    ; adds them and returns result

But our add5 function needs to remember that x = 5. This is where closure conversion comes in.

Closure conversion solves this problem by transforming every function into a pair of a code pointer and an environment containing captured values. The transformation makes environment capture explicit by:

  1. Identifying free variables in each function (variables used but not defined locally)
  2. Creating an environment structure to hold these captured values
  3. Rewriting functions to take an extra parameter (the closure) and extract captured values from it
  4. Transforming function calls to pass both the closure and the regular argument

After closure conversion, our example becomes something like:

// Original: λy. x + y where x is free
// Converted: λ(closure, y). project(closure, 0) + y

// Creating add5:
// 1. Allocate closure: [code_ptr=add_function, env=[5]]
// 2. Return this closure

// Calling add5(3):
// 1. Extract code pointer from closure
// 2. Call code_ptr(closure, 3)
// 3. Inside function: project(closure, 0) gives us 5
// 4. Return 5 + 3 = 8

This transformation turns the implicit variable capture of nested functions into explicit data structures that can be allocated and manipulated at runtime.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use super::erase::Erased;
use crate::core::CoreBinOp;
/// The parameter name used for the closure environment in converted functions
pub const CLOSURE_ENV_PARAM: &str = "$closure";
/// The parameter name used for thunks (non-lambda top-level definitions)
pub const THUNK_PARAM: &str = "_unit";
/// A unique identifier for each function
pub type FunctionId = usize;
/// A top-level function definition after closure conversion
#[derive(Debug, Clone)]
pub struct Function {
    pub id: FunctionId,
    pub param: String,
    pub body: Closed,
    pub free_vars: Vec<String>,
}
/// Result of closure conversion
#[derive(Debug)]
pub struct Program {
    pub functions: Vec<Function>,
    pub main: Closed,
}
/// State for closure conversion
struct ClosureConverter {
    next_id: FunctionId,
    functions: Vec<Function>,
}
impl ClosureConverter {
    fn new() -> Self {
        Self {
            next_id: 0,
            functions: Vec::new(),
        }
    }

    fn fresh_id(&mut self) -> FunctionId {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    /// Convert with knowledge of module-level functions
    fn convert_with_modules(
        &mut self,
        term: &Erased,
        env: &[(String, usize)],
        module_funcs: &HashMap<String, FunctionId>,
    ) -> Closed {
        match term {
            Erased::Var(name) => {
                // Check if it's a module function
                if let Some(&func_id) = module_funcs.get(name) {
                    // Check if this is a thunk (non-lambda definition)
                    // by looking at the function we've already created
                    let is_thunk = self
                        .functions
                        .iter()
                        .any(|f| f.id == func_id && f.param == THUNK_PARAM);

                    if is_thunk {
                        // Call the thunk with a dummy argument (0)
                        Closed::Call(
                            Box::new(Closed::MakeClosure(func_id, vec![])),
                            Box::new(Closed::Int(0)),
                        )
                    } else {
                        // Regular function - create a closure with no captured variables
                        Closed::MakeClosure(func_id, vec![])
                    }
                } else if let Some((_, idx)) = env.iter().find(|(n, _)| n == name) {
                    // This is a captured variable - project from closure
                    Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                } else {
                    // Free variable or parameter
                    Closed::Var(name.clone())
                }
            }

            Erased::Lam(param, body) => {
                // Collect free variables (excluding the parameter and module functions)
                let mut free_vars = body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping for the body
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body with the new environment
                let body_closed = self.convert_with_modules(body, &body_env, module_funcs);

                // Create a new function
                let id = self.fresh_id();
                self.functions.push(Function {
                    id,
                    param: param.clone(),
                    body: body_closed,
                    free_vars: free_vars.clone(),
                });

                // Create closure capturing the free variables
                let captured = free_vars
                    .into_iter()
                    .map(|fv| {
                        if let Some((_, idx)) = env.iter().find(|(n, _)| n == &fv) {
                            Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                        } else {
                            Closed::Var(fv)
                        }
                    })
                    .collect();

                Closed::MakeClosure(id, captured)
            }

            Erased::App(f, x) => {
                let f_closed = self.convert_with_modules(f, env, module_funcs);
                let x_closed = self.convert_with_modules(x, env, module_funcs);
                Closed::Call(Box::new(f_closed), Box::new(x_closed))
            }

            Erased::Int(n) => Closed::Int(*n),

            Erased::BinOp(op, l, r) => {
                let l_closed = self.convert_with_modules(l, env, module_funcs);
                let r_closed = self.convert_with_modules(r, env, module_funcs);
                Closed::BinOp(op.clone(), Box::new(l_closed), Box::new(r_closed))
            }

            Erased::If(c, t, e) => {
                let c_closed = self.convert_with_modules(c, env, module_funcs);
                let t_closed = self.convert_with_modules(t, env, module_funcs);
                let e_closed = self.convert_with_modules(e, env, module_funcs);
                Closed::If(Box::new(c_closed), Box::new(t_closed), Box::new(e_closed))
            }

            Erased::IntrinsicCall { name, args } => {
                let args_closed = args
                    .iter()
                    .map(|arg| self.convert_with_modules(arg, env, module_funcs))
                    .collect();
                Closed::IntrinsicCall {
                    name: name.clone(),
                    args: args_closed,
                }
            }
        }
    }
}
/// Perform closure conversion with module context
pub fn closure_convert_module(functions: Vec<(&str, &Erased)>, main: &Erased) -> Program {
    let mut converter = ClosureConverter::new();

    // First, assign IDs to all module functions
    let mut module_funcs = HashMap::new();
    for (name, _) in &functions {
        let id = converter.fresh_id();
        module_funcs.insert(name.to_string(), id);
    }

    // Convert each function body
    for (name, body) in functions {
        match body {
            Erased::Lam(param, func_body) => {
                // Get free variables
                let mut free_vars = func_body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body
                let body_closed =
                    converter.convert_with_modules(func_body, &body_env, &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: param.clone(),
                    body: body_closed,
                    free_vars,
                });
            }
            _ => {
                // For non-lambda definitions (like add5 = add 5),
                // convert them to thunks (lambdas that take a dummy parameter)
                let body_closed = converter.convert_with_modules(body, &[], &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: THUNK_PARAM.to_string(),
                    body: body_closed,
                    free_vars: vec![],
                });
            }
        }
    }

    // Convert main
    let main_closed = converter.convert_with_modules(main, &[], &module_funcs);

    Program {
        functions: converter.functions,
        main: main_closed,
    }
}
impl Closed {
    /// Pretty print closed terms
    pub fn pretty(&self) -> String {
        match self {
            Closed::Var(name) => name.to_string(),
            Closed::Proj(closure, idx) => format!("{}[{}]", closure.pretty(), idx),
            Closed::MakeClosure(id, captured) => {
                let caps: Vec<String> = captured.iter().map(|c| c.pretty()).collect();
                format!("closure({}, [{}])", id, caps.join(", "))
            }
            Closed::Call(f, x) => format!("{}({})", f.pretty(), x.pretty()),
            Closed::Int(n) => n.to_string(),
            Closed::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Closed::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Closed::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }
}
/// Closure-converted representation
#[derive(Debug, Clone, PartialEq)]
pub enum Closed {
    /// Variable reference
    Var(String),
    /// Environment projection: project(closure, index)
    Proj(Box<Closed>, usize),
    /// Create closure: closure(function_id, [captured_vars])
    MakeClosure(FunctionId, Vec<Closed>),
    /// Direct function call: call(closure, arg)
    Call(Box<Closed>, Box<Closed>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Closed>, Box<Closed>),
    /// Conditional
    If(Box<Closed>, Box<Closed>, Box<Closed>),
    /// Intrinsic function call
    IntrinsicCall { name: String, args: Vec<Closed> },
}
}

The closed representation introduces several new constructs that make environment manipulation explicit. The MakeClosure construct creates a closure by pairing a function identifier with its captured environment. The Proj construct extracts values from a closure’s environment using positional indices. The Call construct invokes a closure by passing both the closure itself and the argument.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use super::erase::Erased;
use crate::core::CoreBinOp;
/// The parameter name used for the closure environment in converted functions
pub const CLOSURE_ENV_PARAM: &str = "$closure";
/// The parameter name used for thunks (non-lambda top-level definitions)
pub const THUNK_PARAM: &str = "_unit";
/// A unique identifier for each function
pub type FunctionId = usize;
/// Closure-converted representation
#[derive(Debug, Clone, PartialEq)]
pub enum Closed {
    /// Variable reference
    Var(String),
    /// Environment projection: project(closure, index)
    Proj(Box<Closed>, usize),
    /// Create closure: closure(function_id, [captured_vars])
    MakeClosure(FunctionId, Vec<Closed>),
    /// Direct function call: call(closure, arg)
    Call(Box<Closed>, Box<Closed>),
    /// Integer literal
    Int(i64),
    /// Binary operation
    BinOp(CoreBinOp, Box<Closed>, Box<Closed>),
    /// Conditional
    If(Box<Closed>, Box<Closed>, Box<Closed>),
    /// Intrinsic function call
    IntrinsicCall { name: String, args: Vec<Closed> },
}
/// Result of closure conversion
#[derive(Debug)]
pub struct Program {
    pub functions: Vec<Function>,
    pub main: Closed,
}
/// State for closure conversion
struct ClosureConverter {
    next_id: FunctionId,
    functions: Vec<Function>,
}
impl ClosureConverter {
    fn new() -> Self {
        Self {
            next_id: 0,
            functions: Vec::new(),
        }
    }

    fn fresh_id(&mut self) -> FunctionId {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    /// Convert with knowledge of module-level functions
    fn convert_with_modules(
        &mut self,
        term: &Erased,
        env: &[(String, usize)],
        module_funcs: &HashMap<String, FunctionId>,
    ) -> Closed {
        match term {
            Erased::Var(name) => {
                // Check if it's a module function
                if let Some(&func_id) = module_funcs.get(name) {
                    // Check if this is a thunk (non-lambda definition)
                    // by looking at the function we've already created
                    let is_thunk = self
                        .functions
                        .iter()
                        .any(|f| f.id == func_id && f.param == THUNK_PARAM);

                    if is_thunk {
                        // Call the thunk with a dummy argument (0)
                        Closed::Call(
                            Box::new(Closed::MakeClosure(func_id, vec![])),
                            Box::new(Closed::Int(0)),
                        )
                    } else {
                        // Regular function - create a closure with no captured variables
                        Closed::MakeClosure(func_id, vec![])
                    }
                } else if let Some((_, idx)) = env.iter().find(|(n, _)| n == name) {
                    // This is a captured variable - project from closure
                    Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                } else {
                    // Free variable or parameter
                    Closed::Var(name.clone())
                }
            }

            Erased::Lam(param, body) => {
                // Collect free variables (excluding the parameter and module functions)
                let mut free_vars = body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping for the body
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body with the new environment
                let body_closed = self.convert_with_modules(body, &body_env, module_funcs);

                // Create a new function
                let id = self.fresh_id();
                self.functions.push(Function {
                    id,
                    param: param.clone(),
                    body: body_closed,
                    free_vars: free_vars.clone(),
                });

                // Create closure capturing the free variables
                let captured = free_vars
                    .into_iter()
                    .map(|fv| {
                        if let Some((_, idx)) = env.iter().find(|(n, _)| n == &fv) {
                            Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                        } else {
                            Closed::Var(fv)
                        }
                    })
                    .collect();

                Closed::MakeClosure(id, captured)
            }

            Erased::App(f, x) => {
                let f_closed = self.convert_with_modules(f, env, module_funcs);
                let x_closed = self.convert_with_modules(x, env, module_funcs);
                Closed::Call(Box::new(f_closed), Box::new(x_closed))
            }

            Erased::Int(n) => Closed::Int(*n),

            Erased::BinOp(op, l, r) => {
                let l_closed = self.convert_with_modules(l, env, module_funcs);
                let r_closed = self.convert_with_modules(r, env, module_funcs);
                Closed::BinOp(op.clone(), Box::new(l_closed), Box::new(r_closed))
            }

            Erased::If(c, t, e) => {
                let c_closed = self.convert_with_modules(c, env, module_funcs);
                let t_closed = self.convert_with_modules(t, env, module_funcs);
                let e_closed = self.convert_with_modules(e, env, module_funcs);
                Closed::If(Box::new(c_closed), Box::new(t_closed), Box::new(e_closed))
            }

            Erased::IntrinsicCall { name, args } => {
                let args_closed = args
                    .iter()
                    .map(|arg| self.convert_with_modules(arg, env, module_funcs))
                    .collect();
                Closed::IntrinsicCall {
                    name: name.clone(),
                    args: args_closed,
                }
            }
        }
    }
}
/// Perform closure conversion with module context
pub fn closure_convert_module(functions: Vec<(&str, &Erased)>, main: &Erased) -> Program {
    let mut converter = ClosureConverter::new();

    // First, assign IDs to all module functions
    let mut module_funcs = HashMap::new();
    for (name, _) in &functions {
        let id = converter.fresh_id();
        module_funcs.insert(name.to_string(), id);
    }

    // Convert each function body
    for (name, body) in functions {
        match body {
            Erased::Lam(param, func_body) => {
                // Get free variables
                let mut free_vars = func_body.free_vars();
                free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

                // Create environment mapping
                let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
                for (i, fv) in free_vars.iter().enumerate() {
                    body_env.push((fv.clone(), i));
                }

                // Convert the body
                let body_closed =
                    converter.convert_with_modules(func_body, &body_env, &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: param.clone(),
                    body: body_closed,
                    free_vars,
                });
            }
            _ => {
                // For non-lambda definitions (like add5 = add 5),
                // convert them to thunks (lambdas that take a dummy parameter)
                let body_closed = converter.convert_with_modules(body, &[], &module_funcs);

                converter.functions.push(Function {
                    id: module_funcs[name],
                    param: THUNK_PARAM.to_string(),
                    body: body_closed,
                    free_vars: vec![],
                });
            }
        }
    }

    // Convert main
    let main_closed = converter.convert_with_modules(main, &[], &module_funcs);

    Program {
        functions: converter.functions,
        main: main_closed,
    }
}
impl Closed {
    /// Pretty print closed terms
    pub fn pretty(&self) -> String {
        match self {
            Closed::Var(name) => name.to_string(),
            Closed::Proj(closure, idx) => format!("{}[{}]", closure.pretty(), idx),
            Closed::MakeClosure(id, captured) => {
                let caps: Vec<String> = captured.iter().map(|c| c.pretty()).collect();
                format!("closure({}, [{}])", id, caps.join(", "))
            }
            Closed::Call(f, x) => format!("{}({})", f.pretty(), x.pretty()),
            Closed::Int(n) => n.to_string(),
            Closed::BinOp(op, l, r) => format!("{} {:?} {}", l.pretty(), op, r.pretty()),
            Closed::If(c, t, e) => {
                format!("if {} then {} else {}", c.pretty(), t.pretty(), e.pretty())
            }
            Closed::IntrinsicCall { name, args } => {
                let args_str = args
                    .iter()
                    .map(|a| a.pretty())
                    .collect::<Vec<_>>()
                    .join(", ");
                format!("{}({})", name, args_str)
            }
        }
    }
}
/// A top-level function definition after closure conversion
#[derive(Debug, Clone)]
pub struct Function {
    pub id: FunctionId,
    pub param: String,
    pub body: Closed,
    pub free_vars: Vec<String>,
}
}

After closure conversion, each function exists as a separate top-level definition with an explicit parameter for its closure environment. The function body can access captured variables through projections from this environment parameter. This flat structure maps directly to the function model of assembly language, where each function has a fixed address and explicit parameters.

As an aside, there are alternative approaches to handling nested functions. The so-called defunctionalization approach converts higher-order functions into first-order code by replacing function values with data tags and a dispatch function, eliminating closures entirely but requiring whole-program transformation. Lambda lifting (also called closure elimination) transforms nested functions into top-level functions by adding their free variables as extra parameters, avoiding heap allocation of closures but potentially increasing the number of parameters passed at each call site. Alternatively the continuation-passing style transformation rewrites all functions to never return directly, instead passing their results to explicit continuation functions that represent “what to do next”. This transformation makes control flow explicit and can simplify closure representation, but produces code that’s harder to optimize and debug. We chose traditional closure conversion because it produces straightforward code that maps well to machine calling conventions, supports separate compilation, and integrates naturally with existing runtime systems while avoiding the parameter bloat of lambda lifting.

Free Variable Analysis

Closure conversion requires identifying which variables each function captures from its enclosing scope. This free variable analysis traverses each function body to determine which variables are referenced but not bound within the function itself.

#![allow(unused)]
fn main() {
/// Collect all free variables
pub fn free_vars(&self) -> Vec<String> {
    self.free_vars_impl(&mut Vec::new())
}
}

The free variable analysis maintains a set of bound variables as it traverses expressions. Variables that appear in the expression but not in the bound set are free and must be captured in the function’s closure. This analysis handles the scoping rules correctly, adding variables to the bound set when entering their scope and removing them when leaving.

Environment Representation

The closure conversion algorithm maintains an environment that maps variables to their locations in closure environments. When a function captures variables, they are assigned positions in its closure, and references to these variables become projections at the appropriate indices.

#![allow(unused)]
fn main() {
/// Convert with knowledge of module-level functions
fn convert_with_modules(
    &mut self,
    term: &Erased,
    env: &[(String, usize)],
    module_funcs: &HashMap<String, FunctionId>,
) -> Closed {
    match term {
        Erased::Var(name) => {
            // Check if it's a module function
            if let Some(&func_id) = module_funcs.get(name) {
                // Check if this is a thunk (non-lambda definition)
                // by looking at the function we've already created
                let is_thunk = self
                    .functions
                    .iter()
                    .any(|f| f.id == func_id && f.param == THUNK_PARAM);

                if is_thunk {
                    // Call the thunk with a dummy argument (0)
                    Closed::Call(
                        Box::new(Closed::MakeClosure(func_id, vec![])),
                        Box::new(Closed::Int(0)),
                    )
                } else {
                    // Regular function - create a closure with no captured variables
                    Closed::MakeClosure(func_id, vec![])
                }
            } else if let Some((_, idx)) = env.iter().find(|(n, _)| n == name) {
                // This is a captured variable - project from closure
                Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
            } else {
                // Free variable or parameter
                Closed::Var(name.clone())
            }
        }

        Erased::Lam(param, body) => {
            // Collect free variables (excluding the parameter and module functions)
            let mut free_vars = body.free_vars();
            free_vars.retain(|v| v != param && !module_funcs.contains_key(v));

            // Create environment mapping for the body
            let mut body_env = vec![(CLOSURE_ENV_PARAM.into(), 0)];
            for (i, fv) in free_vars.iter().enumerate() {
                body_env.push((fv.clone(), i));
            }

            // Convert the body with the new environment
            let body_closed = self.convert_with_modules(body, &body_env, module_funcs);

            // Create a new function
            let id = self.fresh_id();
            self.functions.push(Function {
                id,
                param: param.clone(),
                body: body_closed,
                free_vars: free_vars.clone(),
            });

            // Create closure capturing the free variables
            let captured = free_vars
                .into_iter()
                .map(|fv| {
                    if let Some((_, idx)) = env.iter().find(|(n, _)| n == &fv) {
                        Closed::Proj(Box::new(Closed::Var(CLOSURE_ENV_PARAM.into())), *idx)
                    } else {
                        Closed::Var(fv)
                    }
                })
                .collect();

            Closed::MakeClosure(id, captured)
        }

        Erased::App(f, x) => {
            let f_closed = self.convert_with_modules(f, env, module_funcs);
            let x_closed = self.convert_with_modules(x, env, module_funcs);
            Closed::Call(Box::new(f_closed), Box::new(x_closed))
        }

        Erased::Int(n) => Closed::Int(*n),

        Erased::BinOp(op, l, r) => {
            let l_closed = self.convert_with_modules(l, env, module_funcs);
            let r_closed = self.convert_with_modules(r, env, module_funcs);
            Closed::BinOp(op.clone(), Box::new(l_closed), Box::new(r_closed))
        }

        Erased::If(c, t, e) => {
            let c_closed = self.convert_with_modules(c, env, module_funcs);
            let t_closed = self.convert_with_modules(t, env, module_funcs);
            let e_closed = self.convert_with_modules(e, env, module_funcs);
            Closed::If(Box::new(c_closed), Box::new(t_closed), Box::new(e_closed))
        }

        Erased::IntrinsicCall { name, args } => {
            let args_closed = args
                .iter()
                .map(|arg| self.convert_with_modules(arg, env, module_funcs))
                .collect();
            Closed::IntrinsicCall {
                name: name.clone(),
                args: args_closed,
            }
        }
    }
}
}

The conversion process transforms each expression based on its structure and the current environment. Lambda abstractions become closure allocations that capture the current values of their free variables. Variable references check whether the variable is in the local environment or needs to be accessed through closure projection. Applications become calls that pass both the closure and the argument, following the calling convention for closure-converted code.

Cranelift Code Generation

With closure conversion complete, the program consists of a collection of first-order functions and explicit closure operations. The Cranelift compiler framework transforms this representation into optimized machine code for the target architecture.

use std::collections::HashMap;
use cranelift::prelude::codegen::ir::BlockArg;
use cranelift::prelude::{
    codegen, types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC,
    Signature, Value,
};
use cranelift_module::{FuncId, Linkage, Module};
use super::closure::{Closed, Function, FunctionId, Program, CLOSURE_ENV_PARAM};
use super::runtime::RuntimeFunctions;
use crate::core::CoreBinOp;
impl<M: Module> CodeGen<M> {
    /// Finish compilation and return the module
    pub fn finish(self) -> M {
        self.module
    }

    /// Get mutable reference to the module
    pub fn module_mut(&mut self) -> &mut M {
        &mut self.module
    }

    pub fn new(module: M, runtime_funcs: RuntimeFunctions) -> Self {
        let ctx = module.make_context();

        Self {
            module,
            builder_context: FunctionBuilderContext::new(),
            ctx,
            function_map: HashMap::new(),
            runtime_funcs,
        }
    }

    /// Compile a program
    pub fn compile_program(&mut self, program: &Program) -> Result<FuncId, String> {
        // First pass: declare all functions
        for func in &program.functions {
            let func_id = self.declare_function(func)?;
            self.function_map.insert(func.id, func_id);
        }

        // Second pass: compile function bodies
        for func in &program.functions {
            self.compile_function(func)?;
        }

        // Compile main expression as a function
        self.compile_main(&program.main)
    }

    /// Declare a function
    fn declare_function(&mut self, func: &Function) -> Result<FuncId, String> {
        let name = format!("func_{}", func.id);
        let func_id = self
            .module
            .declare_function(&name, Linkage::Local, &self.function_signature())
            .map_err(|e| format!("Failed to declare function: {}", e))?;
        Ok(func_id)
    }

    /// Get the standard function signature (closure, arg) -> value
    fn function_signature(&self) -> Signature {
        let mut sig = self.module.make_signature();
        let pointer_type = self.module.target_config().pointer_type();
        sig.params.push(AbiParam::new(pointer_type)); // closure
        sig.params.push(AbiParam::new(pointer_type)); // argument
        sig.returns.push(AbiParam::new(pointer_type)); // result
        sig
    }

    /// Compile a function
    fn compile_function(&mut self, func: &Function) -> Result<(), String> {
        self.ctx.func.signature = self.function_signature();

        // Ensure proper alignment for ARM64
        self.ctx.set_disasm(true);

        {
            let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context);
            let entry_block = builder.create_block();

            builder.append_block_params_for_function_params(entry_block);
            builder.switch_to_block(entry_block);
            builder.seal_block(entry_block);

            // Set up variable environment
            let mut env = Environment::new();

            // Get parameters
            let closure_val = builder.block_params(entry_block)[0];
            let arg_val = builder.block_params(entry_block)[1];

            env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val);
            env.bind(func.param.clone(), arg_val);

            // Compile body - use helper to avoid borrow issues
            let result = compile_expr(
                &mut self.module,
                &self.runtime_funcs,
                &self.function_map,
                &mut builder,
                &func.body,
                &mut env,
            )?;

            builder.ins().return_(&[result]);
            builder.finalize();
        }

        // Define the function
        let func_id = self.function_map[&func.id];
        self.module
            .define_function(func_id, &mut self.ctx)
            .map_err(|e| format!("Failed to define function: {}", e))?;

        self.module.clear_context(&mut self.ctx);
        Ok(())
    }

    /// Compile the main expression
    fn compile_main(&mut self, main: &Closed) -> Result<FuncId, String> {
        let func_id = self
            .module
            .declare_function("main_compiled", Linkage::Export, &self.main_signature())
            .map_err(|e| format!("Failed to declare main: {}", e))?;

        self.ctx.func.signature = self.main_signature();

        // Ensure proper alignment for ARM64
        self.ctx.set_disasm(true);

        {
            let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context);
            let entry_block = builder.create_block();

            builder.switch_to_block(entry_block);
            builder.seal_block(entry_block);

            let mut env = Environment::new();
            let result = compile_expr(
                &mut self.module,
                &self.runtime_funcs,
                &self.function_map,
                &mut builder,
                main,
                &mut env,
            )?;

            builder.ins().return_(&[result]);
            builder.finalize();
        }

        self.module
            .define_function(func_id, &mut self.ctx)
            .map_err(|e| format!("Failed to define main: {}", e))?;

        Ok(func_id)
    }

    /// Get the main function signature () -> value
    fn main_signature(&self) -> Signature {
        let mut sig = self.module.make_signature();
        let pointer_type = self.module.target_config().pointer_type();
        sig.returns.push(AbiParam::new(pointer_type));
        sig
    }
}
/// Compile a closed expression (extracted to avoid borrow checker issues)
fn compile_expr<M: Module>(
    module: &mut M,
    runtime_funcs: &RuntimeFunctions,
    function_map: &HashMap<FunctionId, FuncId>,
    builder: &mut FunctionBuilder,
    expr: &Closed,
    env: &mut Environment,
) -> Result<Value, String> {
    match expr {
        Closed::Var(name) => env
            .lookup(name)
            .ok_or_else(|| format!("Unbound variable: {}", name)),

        Closed::Proj(closure, idx) => {
            let closure_val =
                compile_expr(module, runtime_funcs, function_map, builder, closure, env)?;
            let idx_val = builder.ins().iconst(types::I64, *idx as i64);
            let project_func = module.declare_func_in_func(runtime_funcs.project_env, builder.func);
            let inst = builder.ins().call(project_func, &[closure_val, idx_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::MakeClosure(func_id, captured) => {
            // Get function pointer
            let cranelift_func_id = function_map[func_id];
            let func_ref = module.declare_func_in_func(cranelift_func_id, builder.func);
            let func_ptr = builder
                .ins()
                .func_addr(module.target_config().pointer_type(), func_ref);

            // Compile captured values (up to 4 for now)
            let mut captured_vals = Vec::new();
            for cap in captured.iter().take(4) {
                captured_vals.push(compile_expr(
                    module,
                    runtime_funcs,
                    function_map,
                    builder,
                    cap,
                    env,
                )?);
            }

            // Pad with zeros if needed
            let zero = builder.ins().iconst(types::I64, 0);
            while captured_vals.len() < 4 {
                captured_vals.push(zero);
            }

            // Number of captured values
            let env_size = builder.ins().iconst(types::I64, captured.len() as i64);

            // Call make_closure with fixed signature
            let make_closure_func =
                module.declare_func_in_func(runtime_funcs.make_closure, builder.func);
            let mut args = vec![func_ptr, env_size];
            args.extend(captured_vals);
            let inst = builder.ins().call(make_closure_func, &args);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::Call(f, x) => {
            let f_val = compile_expr(module, runtime_funcs, function_map, builder, f, env)?;
            let x_val = compile_expr(module, runtime_funcs, function_map, builder, x, env)?;
            let apply_func = module.declare_func_in_func(runtime_funcs.apply, builder.func);
            let inst = builder.ins().call(apply_func, &[f_val, x_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::Int(n) => {
            let n_val = builder.ins().iconst(types::I64, *n);
            let make_int_func = module.declare_func_in_func(runtime_funcs.make_int, builder.func);
            let inst = builder.ins().call(make_int_func, &[n_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::BinOp(op, l, r) => {
            // For integer operations, we need to extract the integer values
            let l_compiled = compile_expr(module, runtime_funcs, function_map, builder, l, env)?;
            let r_compiled = compile_expr(module, runtime_funcs, function_map, builder, r, env)?;

            // Extract integer values (arithmetic shift right by 3 to remove tag and
            // preserve sign)
            let l_int = builder.ins().sshr_imm(l_compiled, 3);
            let r_int = builder.ins().sshr_imm(r_compiled, 3);

            // Perform the operation
            let result_int = match op {
                CoreBinOp::Add => builder.ins().iadd(l_int, r_int),
                CoreBinOp::Sub => builder.ins().isub(l_int, r_int),
                CoreBinOp::Mul => builder.ins().imul(l_int, r_int),
                CoreBinOp::Div => builder.ins().sdiv(l_int, r_int),
                CoreBinOp::Lt => {
                    let cmp = builder.ins().icmp(IntCC::SignedLessThan, l_int, r_int);
                    // Convert boolean to 0/1
                    let zero = builder.ins().iconst(types::I64, 0);
                    let one = builder.ins().iconst(types::I64, 1);
                    builder.ins().select(cmp, one, zero)
                }
                CoreBinOp::Le => {
                    let cmp = builder
                        .ins()
                        .icmp(IntCC::SignedLessThanOrEqual, l_int, r_int);
                    // Convert boolean to 0/1
                    let zero = builder.ins().iconst(types::I64, 0);
                    let one = builder.ins().iconst(types::I64, 1);
                    builder.ins().select(cmp, one, zero)
                }
            };

            // Re-tag the result
            let shifted = builder.ins().ishl_imm(result_int, 3);
            let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag
            Ok(tagged)
        }

        Closed::IntrinsicCall { name, args } => match name.as_str() {
            "printInt" => {
                if args.len() != 1 {
                    return Err(format!(
                        "printInt expects exactly 1 argument, got {}",
                        args.len()
                    ));
                }
                let arg_val =
                    compile_expr(module, runtime_funcs, function_map, builder, &args[0], env)?;
                let print_int_func =
                    module.declare_func_in_func(runtime_funcs.print_int, builder.func);
                let inst = builder.ins().call(print_int_func, &[arg_val]);
                Ok(builder.inst_results(inst)[0])
            }
            _ => Err(format!("Unknown intrinsic: {}", name)),
        },

        Closed::If(c, t, e) => {
            // Compile condition
            let cond_val = compile_expr(module, runtime_funcs, function_map, builder, c, env)?;

            // Extract boolean value (shift right by 3, then check if non-zero)
            let cond_int = builder.ins().ushr_imm(cond_val, 3);
            let zero = builder.ins().iconst(types::I64, 0);
            let cond_bool = builder.ins().icmp(IntCC::NotEqual, cond_int, zero);

            // Create blocks
            let then_block = builder.create_block();
            let else_block = builder.create_block();
            let merge_block = builder.create_block();

            // Add block parameter to merge block for the result
            builder.append_block_param(merge_block, types::I64);

            // Branch on condition
            builder
                .ins()
                .brif(cond_bool, then_block, &[], else_block, &[]);

            // Compile then branch
            builder.switch_to_block(then_block);
            builder.seal_block(then_block);
            let then_val = compile_expr(module, runtime_funcs, function_map, builder, t, env)?;
            builder.ins().jump(merge_block, &[BlockArg::from(then_val)]);

            // Compile else branch
            builder.switch_to_block(else_block);
            builder.seal_block(else_block);
            let else_val = compile_expr(module, runtime_funcs, function_map, builder, e, env)?;
            builder.ins().jump(merge_block, &[BlockArg::from(else_val)]);

            // Continue at merge block
            builder.switch_to_block(merge_block);
            builder.seal_block(merge_block);

            // Return the phi value
            Ok(builder.block_params(merge_block)[0])
        }
    }
}
/// Variable environment for compilation
struct Environment {
    bindings: Vec<(String, Value)>,
}
impl Environment {
    fn new() -> Self {
        Self {
            bindings: Vec::new(),
        }
    }

    fn bind(&mut self, name: String, value: Value) {
        self.bindings.push((name, value));
    }

    fn lookup(&self, name: &str) -> Option<Value> {
        self.bindings
            .iter()
            .rev()
            .find(|(n, _)| n == name)
            .map(|(_, v)| *v)
    }
}
/// Code generator state
pub struct CodeGen<M: Module> {
    module: M,
    builder_context: FunctionBuilderContext,
    ctx: codegen::Context,
    /// Maps function IDs to Cranelift function IDs
    function_map: HashMap<FunctionId, FuncId>,
    /// Runtime function declarations
    runtime_funcs: RuntimeFunctions,
}

The code generator maintains the state necessary for compilation, including the Cranelift module that accumulates compiled functions, the function builder context for constructing individual functions, and mappings between our function identifiers and Cranelift’s function references. The runtime functions structure provides access to the runtime system primitives that support memory allocation and other essential operations.

Value Representation

Our implementation uses a uniform representation for all runtime values, enabling polymorphic functions to operate on values of any type. This representation uses tagged pointers where the low bits indicate the value’s type and the high bits contain the actual data or pointer.

The tagging scheme exploits the fact that heap-allocated objects are word-aligned, leaving the low bits available for type tags. Here’s how different values are represented in our 64-bit system:

Integer Representation:

Original integer: 42

Binary representation of 42:      0000000000101010
Shift left by 3:                  0000000101010000
Add tag 1:                        0000000101010001

64-bit tagged value:
┌─────────────────────────────────────────────────────────┬───┐
│                     Value (42)                          │001│
└─────────────────────────────────────────────────────────┴───┘
 63                                                      3 2 0
                                                           tag

Closure Representation:

Closure in heap at address 0x7fff8000:

Heap memory layout:
┌────────────────┬────────────────┬────────────────┬────────────────┐
│ Code pointer   │ Environment    │ Captured       │ Captured       │
│ 0x400500       │ size: 2        │ value 1        │ value 2        │
└────────────────┴────────────────┴────────────────┴────────────────┘
 0x7fff8000       0x7fff8008       0x7fff8010       0x7fff8018

Tagged pointer (address already 8-byte aligned, tag 0):
┌─────────────────────────────────────────────────────────┬───┐
│                 0x7fff8000                              │000│
└─────────────────────────────────────────────────────────┴───┘
 63                                                      3 2 0
                                                           tag

Extracting Values:

To check if integer:     value & 0x7 == 1
To extract integer:      value >> 3 (arithmetic shift)
To extract pointer:      value & ~0x7 (clear low 3 bits)

Examples:

Integer -5:
  Actual value:     -5
  Binary:           1111111111111111111111111111111111111111111111111111111111111011
  Shifted left 3:   1111111111111111111111111111111111111111111111111111111111011000
  Tagged:           1111111111111111111111111111111111111111111111111111111111011001
  Hex:              0xffffffffffffffd9

Integer 1000:
  Tagged decimal:   8001
  Tagged hex:       0x1f41
  Extract:          8001 >> 3 = 1000

Closure at 0x7000:
  Tagged:           0x7000 (low bits already 000)
  Code pointer:     *(uint64_t*)0x7000
  Environment:      (uint64_t*)(0x7000 + 8)

This representation allows our runtime to efficiently distinguish between integers and heap-allocated objects using a single bit test, while keeping integers unboxed for performance. The 3-bit tag space could be extended to support additional immediate types like booleans or characters, though our current implementation only uses integers and pointers.

Function Compilation

Each closure-converted function compiles to a Cranelift function that follows a uniform calling convention. Functions receive two parameters: the closure containing captured variables and the function argument. They return a single value using the same tagged representation.

#![allow(unused)]
fn main() {
/// Compile a function
fn compile_function(&mut self, func: &Function) -> Result<(), String> {
    self.ctx.func.signature = self.function_signature();

    // Ensure proper alignment for ARM64
    self.ctx.set_disasm(true);

    {
        let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context);
        let entry_block = builder.create_block();

        builder.append_block_params_for_function_params(entry_block);
        builder.switch_to_block(entry_block);
        builder.seal_block(entry_block);

        // Set up variable environment
        let mut env = Environment::new();

        // Get parameters
        let closure_val = builder.block_params(entry_block)[0];
        let arg_val = builder.block_params(entry_block)[1];

        env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val);
        env.bind(func.param.clone(), arg_val);

        // Compile body - use helper to avoid borrow issues
        let result = compile_expr(
            &mut self.module,
            &self.runtime_funcs,
            &self.function_map,
            &mut builder,
            &func.body,
            &mut env,
        )?;

        builder.ins().return_(&[result]);
        builder.finalize();
    }

    // Define the function
    let func_id = self.function_map[&func.id];
    self.module
        .define_function(func_id, &mut self.ctx)
        .map_err(|e| format!("Failed to define function: {}", e))?;

    self.module.clear_context(&mut self.ctx);
    Ok(())
}
}

Function compilation begins by creating the appropriate function signature and setting up the entry block. The function parameters are bound to variables in the compilation environment, making them available for use in the function body. The body compilation generates instructions that compute the function result, which is then returned using Cranelift’s return instruction.

Expression Compilation

The expression compiler transforms closure-converted expressions into sequences of Cranelift instructions. Each expression type requires specific handling to generate correct and efficient machine code.

use std::collections::HashMap;
use cranelift::prelude::codegen::ir::BlockArg;
use cranelift::prelude::{
    codegen, types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC,
    Signature, Value,
};
use cranelift_module::{FuncId, Linkage, Module};
use super::closure::{Closed, Function, FunctionId, Program, CLOSURE_ENV_PARAM};
use super::runtime::RuntimeFunctions;
use crate::core::CoreBinOp;
/// Code generator state
pub struct CodeGen<M: Module> {
    module: M,
    builder_context: FunctionBuilderContext,
    ctx: codegen::Context,
    /// Maps function IDs to Cranelift function IDs
    function_map: HashMap<FunctionId, FuncId>,
    /// Runtime function declarations
    runtime_funcs: RuntimeFunctions,
}
impl<M: Module> CodeGen<M> {
    /// Finish compilation and return the module
    pub fn finish(self) -> M {
        self.module
    }

    /// Get mutable reference to the module
    pub fn module_mut(&mut self) -> &mut M {
        &mut self.module
    }

    pub fn new(module: M, runtime_funcs: RuntimeFunctions) -> Self {
        let ctx = module.make_context();

        Self {
            module,
            builder_context: FunctionBuilderContext::new(),
            ctx,
            function_map: HashMap::new(),
            runtime_funcs,
        }
    }

    /// Compile a program
    pub fn compile_program(&mut self, program: &Program) -> Result<FuncId, String> {
        // First pass: declare all functions
        for func in &program.functions {
            let func_id = self.declare_function(func)?;
            self.function_map.insert(func.id, func_id);
        }

        // Second pass: compile function bodies
        for func in &program.functions {
            self.compile_function(func)?;
        }

        // Compile main expression as a function
        self.compile_main(&program.main)
    }

    /// Declare a function
    fn declare_function(&mut self, func: &Function) -> Result<FuncId, String> {
        let name = format!("func_{}", func.id);
        let func_id = self
            .module
            .declare_function(&name, Linkage::Local, &self.function_signature())
            .map_err(|e| format!("Failed to declare function: {}", e))?;
        Ok(func_id)
    }

    /// Get the standard function signature (closure, arg) -> value
    fn function_signature(&self) -> Signature {
        let mut sig = self.module.make_signature();
        let pointer_type = self.module.target_config().pointer_type();
        sig.params.push(AbiParam::new(pointer_type)); // closure
        sig.params.push(AbiParam::new(pointer_type)); // argument
        sig.returns.push(AbiParam::new(pointer_type)); // result
        sig
    }

    /// Compile a function
    fn compile_function(&mut self, func: &Function) -> Result<(), String> {
        self.ctx.func.signature = self.function_signature();

        // Ensure proper alignment for ARM64
        self.ctx.set_disasm(true);

        {
            let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context);
            let entry_block = builder.create_block();

            builder.append_block_params_for_function_params(entry_block);
            builder.switch_to_block(entry_block);
            builder.seal_block(entry_block);

            // Set up variable environment
            let mut env = Environment::new();

            // Get parameters
            let closure_val = builder.block_params(entry_block)[0];
            let arg_val = builder.block_params(entry_block)[1];

            env.bind(CLOSURE_ENV_PARAM.to_string(), closure_val);
            env.bind(func.param.clone(), arg_val);

            // Compile body - use helper to avoid borrow issues
            let result = compile_expr(
                &mut self.module,
                &self.runtime_funcs,
                &self.function_map,
                &mut builder,
                &func.body,
                &mut env,
            )?;

            builder.ins().return_(&[result]);
            builder.finalize();
        }

        // Define the function
        let func_id = self.function_map[&func.id];
        self.module
            .define_function(func_id, &mut self.ctx)
            .map_err(|e| format!("Failed to define function: {}", e))?;

        self.module.clear_context(&mut self.ctx);
        Ok(())
    }

    /// Compile the main expression
    fn compile_main(&mut self, main: &Closed) -> Result<FuncId, String> {
        let func_id = self
            .module
            .declare_function("main_compiled", Linkage::Export, &self.main_signature())
            .map_err(|e| format!("Failed to declare main: {}", e))?;

        self.ctx.func.signature = self.main_signature();

        // Ensure proper alignment for ARM64
        self.ctx.set_disasm(true);

        {
            let mut builder = FunctionBuilder::new(&mut self.ctx.func, &mut self.builder_context);
            let entry_block = builder.create_block();

            builder.switch_to_block(entry_block);
            builder.seal_block(entry_block);

            let mut env = Environment::new();
            let result = compile_expr(
                &mut self.module,
                &self.runtime_funcs,
                &self.function_map,
                &mut builder,
                main,
                &mut env,
            )?;

            builder.ins().return_(&[result]);
            builder.finalize();
        }

        self.module
            .define_function(func_id, &mut self.ctx)
            .map_err(|e| format!("Failed to define main: {}", e))?;

        Ok(func_id)
    }

    /// Get the main function signature () -> value
    fn main_signature(&self) -> Signature {
        let mut sig = self.module.make_signature();
        let pointer_type = self.module.target_config().pointer_type();
        sig.returns.push(AbiParam::new(pointer_type));
        sig
    }
}
/// Variable environment for compilation
struct Environment {
    bindings: Vec<(String, Value)>,
}
impl Environment {
    fn new() -> Self {
        Self {
            bindings: Vec::new(),
        }
    }

    fn bind(&mut self, name: String, value: Value) {
        self.bindings.push((name, value));
    }

    fn lookup(&self, name: &str) -> Option<Value> {
        self.bindings
            .iter()
            .rev()
            .find(|(n, _)| n == name)
            .map(|(_, v)| *v)
    }
}
/// Compile a closed expression (extracted to avoid borrow checker issues)
fn compile_expr<M: Module>(
    module: &mut M,
    runtime_funcs: &RuntimeFunctions,
    function_map: &HashMap<FunctionId, FuncId>,
    builder: &mut FunctionBuilder,
    expr: &Closed,
    env: &mut Environment,
) -> Result<Value, String> {
    match expr {
        Closed::Var(name) => env
            .lookup(name)
            .ok_or_else(|| format!("Unbound variable: {}", name)),

        Closed::Proj(closure, idx) => {
            let closure_val =
                compile_expr(module, runtime_funcs, function_map, builder, closure, env)?;
            let idx_val = builder.ins().iconst(types::I64, *idx as i64);
            let project_func = module.declare_func_in_func(runtime_funcs.project_env, builder.func);
            let inst = builder.ins().call(project_func, &[closure_val, idx_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::MakeClosure(func_id, captured) => {
            // Get function pointer
            let cranelift_func_id = function_map[func_id];
            let func_ref = module.declare_func_in_func(cranelift_func_id, builder.func);
            let func_ptr = builder
                .ins()
                .func_addr(module.target_config().pointer_type(), func_ref);

            // Compile captured values (up to 4 for now)
            let mut captured_vals = Vec::new();
            for cap in captured.iter().take(4) {
                captured_vals.push(compile_expr(
                    module,
                    runtime_funcs,
                    function_map,
                    builder,
                    cap,
                    env,
                )?);
            }

            // Pad with zeros if needed
            let zero = builder.ins().iconst(types::I64, 0);
            while captured_vals.len() < 4 {
                captured_vals.push(zero);
            }

            // Number of captured values
            let env_size = builder.ins().iconst(types::I64, captured.len() as i64);

            // Call make_closure with fixed signature
            let make_closure_func =
                module.declare_func_in_func(runtime_funcs.make_closure, builder.func);
            let mut args = vec![func_ptr, env_size];
            args.extend(captured_vals);
            let inst = builder.ins().call(make_closure_func, &args);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::Call(f, x) => {
            let f_val = compile_expr(module, runtime_funcs, function_map, builder, f, env)?;
            let x_val = compile_expr(module, runtime_funcs, function_map, builder, x, env)?;
            let apply_func = module.declare_func_in_func(runtime_funcs.apply, builder.func);
            let inst = builder.ins().call(apply_func, &[f_val, x_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::Int(n) => {
            let n_val = builder.ins().iconst(types::I64, *n);
            let make_int_func = module.declare_func_in_func(runtime_funcs.make_int, builder.func);
            let inst = builder.ins().call(make_int_func, &[n_val]);
            Ok(builder.inst_results(inst)[0])
        }

        Closed::BinOp(op, l, r) => {
            // For integer operations, we need to extract the integer values
            let l_compiled = compile_expr(module, runtime_funcs, function_map, builder, l, env)?;
            let r_compiled = compile_expr(module, runtime_funcs, function_map, builder, r, env)?;

            // Extract integer values (arithmetic shift right by 3 to remove tag and
            // preserve sign)
            let l_int = builder.ins().sshr_imm(l_compiled, 3);
            let r_int = builder.ins().sshr_imm(r_compiled, 3);

            // Perform the operation
            let result_int = match op {
                CoreBinOp::Add => builder.ins().iadd(l_int, r_int),
                CoreBinOp::Sub => builder.ins().isub(l_int, r_int),
                CoreBinOp::Mul => builder.ins().imul(l_int, r_int),
                CoreBinOp::Div => builder.ins().sdiv(l_int, r_int),
                CoreBinOp::Lt => {
                    let cmp = builder.ins().icmp(IntCC::SignedLessThan, l_int, r_int);
                    // Convert boolean to 0/1
                    let zero = builder.ins().iconst(types::I64, 0);
                    let one = builder.ins().iconst(types::I64, 1);
                    builder.ins().select(cmp, one, zero)
                }
                CoreBinOp::Le => {
                    let cmp = builder
                        .ins()
                        .icmp(IntCC::SignedLessThanOrEqual, l_int, r_int);
                    // Convert boolean to 0/1
                    let zero = builder.ins().iconst(types::I64, 0);
                    let one = builder.ins().iconst(types::I64, 1);
                    builder.ins().select(cmp, one, zero)
                }
            };

            // Re-tag the result
            let shifted = builder.ins().ishl_imm(result_int, 3);
            let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag
            Ok(tagged)
        }

        Closed::IntrinsicCall { name, args } => match name.as_str() {
            "printInt" => {
                if args.len() != 1 {
                    return Err(format!(
                        "printInt expects exactly 1 argument, got {}",
                        args.len()
                    ));
                }
                let arg_val =
                    compile_expr(module, runtime_funcs, function_map, builder, &args[0], env)?;
                let print_int_func =
                    module.declare_func_in_func(runtime_funcs.print_int, builder.func);
                let inst = builder.ins().call(print_int_func, &[arg_val]);
                Ok(builder.inst_results(inst)[0])
            }
            _ => Err(format!("Unknown intrinsic: {}", name)),
        },

        Closed::If(c, t, e) => {
            // Compile condition
            let cond_val = compile_expr(module, runtime_funcs, function_map, builder, c, env)?;

            // Extract boolean value (shift right by 3, then check if non-zero)
            let cond_int = builder.ins().ushr_imm(cond_val, 3);
            let zero = builder.ins().iconst(types::I64, 0);
            let cond_bool = builder.ins().icmp(IntCC::NotEqual, cond_int, zero);

            // Create blocks
            let then_block = builder.create_block();
            let else_block = builder.create_block();
            let merge_block = builder.create_block();

            // Add block parameter to merge block for the result
            builder.append_block_param(merge_block, types::I64);

            // Branch on condition
            builder
                .ins()
                .brif(cond_bool, then_block, &[], else_block, &[]);

            // Compile then branch
            builder.switch_to_block(then_block);
            builder.seal_block(then_block);
            let then_val = compile_expr(module, runtime_funcs, function_map, builder, t, env)?;
            builder.ins().jump(merge_block, &[BlockArg::from(then_val)]);

            // Compile else branch
            builder.switch_to_block(else_block);
            builder.seal_block(else_block);
            let else_val = compile_expr(module, runtime_funcs, function_map, builder, e, env)?;
            builder.ins().jump(merge_block, &[BlockArg::from(else_val)]);

            // Continue at merge block
            builder.switch_to_block(merge_block);
            builder.seal_block(merge_block);

            // Return the phi value
            Ok(builder.block_params(merge_block)[0])
        }
    }
}

Variable compilation looks up the variable in the environment to find its Cranelift value. Integer literals are tagged and returned directly. Binary operations extract integer values from their tagged representations, perform the operation, and retag the result. Closure creation allocates memory for the closure structure and initializes it with captured values. Function calls invoke the target function with the appropriate closure and argument.

Runtime System

The generated machine code cannot stand alone. It requires a runtime system that provides essential services not expressible in our high-level functional language. This runtime bridges the gap between pure functional code and the underlying operating system.

#![allow(unused)]
fn main() {
use cranelift::prelude::{
    types, AbiParam, FunctionBuilder, FunctionBuilderContext, InstBuilder, IntCC, MemFlags, Type,
};
use cranelift_module::{FuncId, Linkage, Module};
/// Closure representation in memory:
/// [code_ptr | env_size | env[0] | env[1] | ... ]
/// Each field is 8 bytes (pointer-sized)
/// Compile runtime support functions
pub fn compile_runtime<M: Module>(module: &mut M) -> Result<RuntimeFunctions, String> {
    let pointer_type = module.target_config().pointer_type();

    // Declare the allocator function
    let alloc_func = declare_alloc(module, pointer_type)?;

    // Compile runtime functions
    let make_int = compile_make_int(module, pointer_type)?;
    let make_closure = compile_make_closure(module, pointer_type, alloc_func)?;
    let project_env = compile_project_env(module, pointer_type)?;
    let apply = compile_apply(module, pointer_type)?;
    let print_int = compile_print_int(module)?;

    Ok(RuntimeFunctions {
        make_int,
        make_closure,
        project_env,
        apply,
        print_int,
    })
}
/// Declare the allocator function (implemented in C)
fn declare_alloc<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> {
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(types::I64)); // size in bytes
    sig.returns.push(AbiParam::new(pointer_type)); // pointer to allocated memory

    module
        .declare_function("rt_alloc", Linkage::Import, &sig)
        .map_err(|e| format!("Failed to declare rt_alloc: {}", e))
}
/// Compile make_int: tag an integer
fn compile_make_int<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> {
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(types::I64));
    sig.returns.push(AbiParam::new(pointer_type));

    let func_id = module
        .declare_function("make_int", Linkage::Local, &sig)
        .map_err(|e| format!("Failed to declare make_int: {}", e))?;

    let mut ctx = module.make_context();
    ctx.func.signature = sig;
    ctx.set_disasm(true);

    let mut func_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

    let entry = builder.create_block();
    builder.append_block_params_for_function_params(entry);
    builder.switch_to_block(entry);
    builder.seal_block(entry);

    let n = builder.block_params(entry)[0];
    let shifted = builder.ins().ishl_imm(n, 3);
    let tagged = builder.ins().bor_imm(shifted, 1); // Integer tag = 1
    builder.ins().return_(&[tagged]);

    builder.finalize();
    module
        .define_function(func_id, &mut ctx)
        .map_err(|e| format!("Failed to define make_int: {}", e))?;

    Ok(func_id)
}
/// Compile make_closure: allocate a closure with captured environment
fn compile_make_closure<M: Module>(
    module: &mut M,
    pointer_type: Type,
    alloc_func: FuncId,
) -> Result<FuncId, String> {
    // For simplicity, we'll create a fixed signature that takes a code pointer
    // and up to 4 captured values. Real implementation would be variadic.
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(pointer_type)); // code pointer
    sig.params.push(AbiParam::new(types::I64)); // number of captured values
    sig.params.push(AbiParam::new(pointer_type)); // captured value 0 (optional)
    sig.params.push(AbiParam::new(pointer_type)); // captured value 1 (optional)
    sig.params.push(AbiParam::new(pointer_type)); // captured value 2 (optional)
    sig.params.push(AbiParam::new(pointer_type)); // captured value 3 (optional)
    sig.returns.push(AbiParam::new(pointer_type));

    let func_id = module
        .declare_function("make_closure", Linkage::Local, &sig)
        .map_err(|e| format!("Failed to declare make_closure: {}", e))?;

    let mut ctx = module.make_context();
    ctx.func.signature = sig;
    ctx.set_disasm(true);

    let mut func_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

    let entry = builder.create_block();
    builder.append_block_params_for_function_params(entry);
    builder.switch_to_block(entry);
    builder.seal_block(entry);

    let code_ptr = builder.block_params(entry)[0];
    let env_size = builder.block_params(entry)[1];
    let cap0 = builder.block_params(entry)[2];
    let cap1 = builder.block_params(entry)[3];
    let _cap2 = builder.block_params(entry)[4];
    let _cap3 = builder.block_params(entry)[5];

    // Calculate allocation size: (2 + env_size) * 8 bytes
    let two = builder.ins().iconst(types::I64, 2);
    let total_fields = builder.ins().iadd(two, env_size);
    let eight = builder.ins().iconst(types::I64, 8);
    let alloc_size = builder.ins().imul(total_fields, eight);

    // Allocate memory
    let alloc_ref = module.declare_func_in_func(alloc_func, builder.func);
    let call = builder.ins().call(alloc_ref, &[alloc_size]);
    let closure_ptr = builder.inst_results(call)[0];

    // Store code pointer at offset 0
    builder
        .ins()
        .store(MemFlags::new(), code_ptr, closure_ptr, 0);

    // Store env_size at offset 8
    builder
        .ins()
        .store(MemFlags::new(), env_size, closure_ptr, 8);

    // Store captured values based on env_size
    // We'll need to generate conditional code for each possible size
    let zero = builder.ins().iconst(types::I64, 0);
    let one = builder.ins().iconst(types::I64, 1);
    let _two_const = builder.ins().iconst(types::I64, 2);
    let _three = builder.ins().iconst(types::I64, 3);

    // If env_size > 0, store cap0
    let cmp0 = builder.ins().icmp(IntCC::SignedGreaterThan, env_size, zero);
    let then_block0 = builder.create_block();
    let merge_block0 = builder.create_block();
    builder
        .ins()
        .brif(cmp0, then_block0, &[], merge_block0, &[]);

    builder.switch_to_block(then_block0);
    builder.seal_block(then_block0);
    builder.ins().store(MemFlags::new(), cap0, closure_ptr, 16);
    builder.ins().jump(merge_block0, &[]);

    builder.switch_to_block(merge_block0);
    builder.seal_block(merge_block0);

    // Similar for cap1, cap2, cap3...
    // For brevity, I'll just implement up to 2 captures
    let cmp1 = builder.ins().icmp(IntCC::SignedGreaterThan, env_size, one);
    let then_block1 = builder.create_block();
    let merge_block1 = builder.create_block();
    builder
        .ins()
        .brif(cmp1, then_block1, &[], merge_block1, &[]);

    builder.switch_to_block(then_block1);
    builder.seal_block(then_block1);
    builder.ins().store(MemFlags::new(), cap1, closure_ptr, 24);
    builder.ins().jump(merge_block1, &[]);

    builder.switch_to_block(merge_block1);
    builder.seal_block(merge_block1);

    // Tag the closure pointer (tag = 0, so no change needed)
    builder.ins().return_(&[closure_ptr]);

    builder.finalize();
    module
        .define_function(func_id, &mut ctx)
        .map_err(|e| format!("Failed to define make_closure: {}", e))?;

    Ok(func_id)
}
/// Compile project_env: extract a value from closure environment
fn compile_project_env<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> {
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(pointer_type)); // closure
    sig.params.push(AbiParam::new(types::I64)); // index
    sig.returns.push(AbiParam::new(pointer_type));

    let func_id = module
        .declare_function("project_env", Linkage::Local, &sig)
        .map_err(|e| format!("Failed to declare project_env: {}", e))?;

    let mut ctx = module.make_context();
    ctx.func.signature = sig;
    ctx.set_disasm(true);

    let mut func_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

    let entry = builder.create_block();
    builder.append_block_params_for_function_params(entry);
    builder.switch_to_block(entry);
    builder.seal_block(entry);

    let closure_ptr = builder.block_params(entry)[0];
    let index = builder.block_params(entry)[1];

    // Calculate offset: (2 + index) * 8
    // For now, we'll use dynamic address calculation
    let two = builder.ins().iconst(types::I64, 2);
    let field_index = builder.ins().iadd(two, index);
    let eight = builder.ins().iconst(types::I64, 8);
    let offset = builder.ins().imul(field_index, eight);

    // Add offset to base pointer
    let addr = builder.ins().iadd(closure_ptr, offset);

    // Load the value
    let value = builder.ins().load(pointer_type, MemFlags::new(), addr, 0);
    builder.ins().return_(&[value]);

    builder.finalize();
    module
        .define_function(func_id, &mut ctx)
        .map_err(|e| format!("Failed to define project_env: {}", e))?;

    Ok(func_id)
}
/// Compile apply: call a closure with an argument
fn compile_apply<M: Module>(module: &mut M, pointer_type: Type) -> Result<FuncId, String> {
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(pointer_type)); // closure
    sig.params.push(AbiParam::new(pointer_type)); // argument
    sig.returns.push(AbiParam::new(pointer_type));

    let func_id = module
        .declare_function("apply", Linkage::Local, &sig)
        .map_err(|e| format!("Failed to declare apply: {}", e))?;

    let mut ctx = module.make_context();
    ctx.func.signature = sig.clone();

    let mut func_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

    let entry = builder.create_block();
    builder.append_block_params_for_function_params(entry);
    builder.switch_to_block(entry);
    builder.seal_block(entry);

    let closure = builder.block_params(entry)[0];
    let arg = builder.block_params(entry)[1];

    // Load the code pointer from offset 0
    let code_ptr = builder
        .ins()
        .load(pointer_type, MemFlags::new(), closure, 0);

    // Call the function with closure and argument
    let call_sig = builder.import_signature(sig);
    let result = builder
        .ins()
        .call_indirect(call_sig, code_ptr, &[closure, arg]);
    let result_val = builder.inst_results(result)[0];
    builder.ins().return_(&[result_val]);

    builder.finalize();
    module
        .define_function(func_id, &mut ctx)
        .map_err(|e| format!("Failed to define apply: {}", e))?;

    Ok(func_id)
}
/// Compile print_int: print an integer
fn compile_print_int<M: Module>(module: &mut M) -> Result<FuncId, String> {
    let pointer_type = module.target_config().pointer_type();

    // Declare external rt_print_int that's implemented in C
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(types::I64)); // integer value to print
    sig.returns.push(AbiParam::new(pointer_type)); // returns unit (as tagged 0)

    let rt_print_func = module
        .declare_function("rt_print_int", Linkage::Import, &sig)
        .map_err(|e| format!("Failed to declare rt_print_int: {}", e))?;

    // Create print_int function that takes a tagged integer
    let mut sig = module.make_signature();
    sig.params.push(AbiParam::new(pointer_type)); // tagged integer
    sig.returns.push(AbiParam::new(pointer_type)); // returns unit

    let func_id = module
        .declare_function("print_int", Linkage::Local, &sig)
        .map_err(|e| format!("Failed to declare print_int: {}", e))?;

    let mut ctx = module.make_context();
    ctx.func.signature = sig;
    ctx.set_disasm(true);

    let mut func_ctx = FunctionBuilderContext::new();
    let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

    let entry = builder.create_block();
    builder.append_block_params_for_function_params(entry);
    builder.switch_to_block(entry);
    builder.seal_block(entry);

    let tagged_int = builder.block_params(entry)[0];

    // Extract the integer value (arithmetic shift right by 3 to preserve sign)
    let int_val = builder.ins().sshr_imm(tagged_int, 3);

    // Call rt_print_int
    let rt_print_ref = module.declare_func_in_func(rt_print_func, builder.func);
    let call = builder.ins().call(rt_print_ref, &[int_val]);
    let unit_result = builder.inst_results(call)[0];

    builder.ins().return_(&[unit_result]);

    builder.finalize();
    module
        .define_function(func_id, &mut ctx)
        .map_err(|e| format!("Failed to define print_int: {}", e))?;

    Ok(func_id)
}
pub struct RuntimeFunctions {
    pub make_int: FuncId,
    pub make_closure: FuncId,
    pub project_env: FuncId,
    pub apply: FuncId,
    pub print_int: FuncId,
}
}

The runtime provides several categories of essential services through functions that follow our calling conventions and value representations. Each runtime function is declared to Cranelift and can be called directly from generated code.

The actual implementation of these runtime functions lives in a separate support library written in Rust with no_std to minimize dependencies:

#![allow(unused)]
fn main() {
//! Runtime support functions for System F-ω
//! 
//! This file is automatically compiled by build.rs during the build process.

#![no_std]
#![no_main]

// Simple bump allocator for runtime
static mut HEAP: [u8; 1024 * 1024] = [0; 1024 * 1024]; // 1MB heap
static mut HEAP_PTR: usize = 0;

// Out of memory error string
static OOM: &[u8] = b"Out of memory!\n";

/// Runtime allocator function
#[no_mangle]
pub unsafe extern "C" fn rt_alloc(size: usize) -> *mut u8 {
    // Align to 8 bytes
    let aligned_size = (size + 7) & !7;
    
    if HEAP_PTR + aligned_size > HEAP.len() {
        // Just write to stderr directly
        libc::write(2, OOM.as_ptr(), OOM.len());
        libc::exit(1);
    }
    
    let result = HEAP.as_mut_ptr().add(HEAP_PTR);
    HEAP_PTR += aligned_size;
    result
}

/// Runtime print function - just use printf from libc
#[no_mangle]
pub unsafe extern "C" fn rt_print_int(value: i64) -> u64 {
    libc::printf(b"%lld\n\0".as_ptr() as *const i8, value);
    0 // Return Unit (tagged 0)
}

// Minimal panic handler for no_std
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    unsafe { libc::exit(1) }
}

// Link with libc
mod libc {
    extern "C" {
        pub fn printf(format: *const i8, ...) -> i32;
        pub fn write(fd: i32, buf: *const u8, count: usize) -> isize;
        pub fn exit(status: i32) -> !;
    }
}
}

The runtime support library serves several core purposes:

Memory Allocation: Our functional language creates closures and other data structures dynamically, but has no concept of memory management. The runtime provides rt_alloc, a simple bump allocator that manages a fixed 1MB heap. This allocator is deliberately minimal, it only allocates, never frees, which is sufficient for our demonstration language. The bump allocator maintains a pointer into a static array and advances it for each allocation, ensuring 8-byte alignment for all allocations.

Value Creation: The make_int function tags raw integers for use in our tagged representation, while make_closure allocates and initializes closure structures with their code pointers and captured environments. The project_env function extracts values from closure environments during execution.

Function Application: The apply function implements the calling convention for closure invocation. It extracts the code pointer from a closure and calls it with the closure and argument, handling the low-level details of our calling convention.

Input/Output: Pure functional languages have no inherent notion of effects like printing to the console. The runtime provides rt_print_int which unwraps our tagged integer representation and calls the C library’s printf function to display the value. This is our only connection to the outside world, allowing programs to produce observable output.

Error Handling: When the heap is exhausted, the runtime writes an error message directly to stderr using the POSIX write system call and terminates the program. The panic handler similarly ensures clean termination if any runtime invariant is violated.

Operating System Interface: The runtime uses direct foreign function interface calls to libc for its interactions with the operating system. This includes printf for formatted output, write for error messages, and exit for program termination.

The no_std and no_main attributes tell Rust not to include its standard library or generate a main function, keeping the runtime minimal. The entire runtime compiles to a small object file that gets linked with our generated code. During the build process, build.rs invokes the Rust compiler directly:

rustc --crate-type=staticlib --emit=obj -C opt-level=2 -C panic=abort runtime_support.rs

This produces an object file containing just our runtime functions, which the system linker combines with the Cranelift-generated code to create the final executable. The beauty of this approach is that we get exactly the runtime support we need while leveraging existing system libraries for the heavy lifting.

Memory Management

As an aside, note that our toy runtime uses a naive bump allocator that never frees memory. Which is simple, but isn’t ideal to put it mildly. This approach would quickly exhaust memory in real programs since functional languages allocate closures and immutable data structures prolifically. A real implementation would require some careful thinking about memory management, and there are several approaches.

Garbage Collection: The simplest solution is to bolt on a garbage collector like the Boehm-Demers-Weiser garbage collector, which is a conservative collector that works with minimal compiler modifications. Boehm GC scans memory for values that look like pointers and conservatively assumes they might be live references. It pretty much works off-the-shell, just requiring us to replace malloc with GC_malloc and handles root identification, reachability analysis, and memory reclamation automatically. The conservative approach occasionally retains dead memory but avoids the complexity of precise pointer tracking. There are Rust bindings and it merely requires installing the libgc C library with:

brew install libgc

More sophisticated garbage collectors offer better performance through various techniques. Generational collectors exploit the observation that most objects die young, segregating new allocations into a nursery that gets collected frequently, this is used in languages like Haskell and Ocaml. Concurrent collectors run collection in parallel with the program, reducing pause times. Incremental collectors spread collection work across many small pauses rather than stopping the world. Each approach requires deeper runtime integration and careful handling of write barriers and safepoints.

Modern Approaches: Recent years have seen renewed interest in static memory management techniques that avoid garbage collection entirely. Affine types (or move semantics), pioneered by languages like Rust, ensure that each value has exactly one owner and is used exactly once. This enables compile-time memory management without runtime overhead. The type system tracks ownership and borrowing, preventing use-after-free and data races while enabling safe manual memory management.

Region-based memory management groups related allocations into regions that can be deallocated together. Languages like Cyclone and more recently Koka use effect systems to track region lifetimes. This approach works particularly well for functional languages where data often has stack-like lifetimes corresponding to function calls.

Reference counting with cycle detection offers another alternative, used by languages like Swift and Python. Modern reference counting implementations use deferred reference counting to reduce overhead and can achieve performance competitive with tracing garbage collectors for many workloads.

The choice of memory management strategy will profoundly impact language design. Tracing garbage collection enables simple APIs and unrestricted sharing but requires runtime overhead and can cause unpredictable pauses. Linear types and ownership systems provide predictable performance and memory usage but complicate the programming model. As hardware evolves and programming patterns change, the tradeoffs continue to shift. A good overview of this deep area is Hosking’s book The Garbage Collection Handbook: The Art of Automatic Memory Management.

Executable Generation

The final phase links the generated code with the runtime system to produce a standalone executable. This process involves generating object files, linking with the runtime support library, and producing an executable for the target platform.

#![allow(unused)]
fn main() {
use std::process::Command;
use std::str::FromStr;
use cranelift::codegen::isa::lookup;
use cranelift::codegen::settings::{builder, Flags};
use cranelift::prelude::{
    types, AbiParam, Configurable, FunctionBuilder, FunctionBuilderContext, InstBuilder,
};
use cranelift_module::{Linkage, Module};
use cranelift_object::{ObjectBuilder, ObjectModule};
use target_lexicon::Triple;
use super::{closure, compile, erase, runtime};
use crate::core::CoreModule;
/// Add an entry point that calls main and prints the result
fn add_entry_point<M: Module>(codegen: &mut compile::CodeGen<M>) -> Result<(), String> {
    // Get the module and create entry point function
    let module = codegen.module_mut();

    // Create signature for entry point - takes no arguments, returns i32 (exit
    // code)
    let mut sig = module.make_signature();
    sig.returns.push(AbiParam::new(types::I32));

    // Declare the entry point function with export linkage
    let entry_func = module
        .declare_function("main", Linkage::Export, &sig)
        .map_err(|e| format!("Failed to declare entry point: {}", e))?;

    // Get reference to our compiled main function
    // main_compiled returns a pointer (tagged unit value)
    let mut main_sig = module.make_signature();
    let pointer_type = module.target_config().pointer_type();
    main_sig.returns.push(AbiParam::new(pointer_type));

    let main_func = module
        .declare_function("main_compiled", Linkage::Import, &main_sig)
        .map_err(|e| format!("Failed to declare main_compiled reference: {}", e))?;

    // Create the function body
    let mut ctx = module.make_context();
    ctx.func.signature = sig;

    {
        let mut func_ctx = FunctionBuilderContext::new();
        let mut builder = FunctionBuilder::new(&mut ctx.func, &mut func_ctx);

        // Create entry block
        let block = builder.create_block();
        builder.switch_to_block(block);
        builder.seal_block(block);

        // Call our compiled main function
        let main_ref = module.declare_func_in_func(main_func, builder.func);
        builder.ins().call(main_ref, &[]);

        // Return 0 as exit code
        let zero = builder.ins().iconst(types::I32, 0);
        builder.ins().return_(&[zero]);

        // Finalize
        builder.finalize();
    }

    // Define the function
    module
        .define_function(entry_func, &mut ctx)
        .map_err(|e| format!("Failed to define entry point: {}", e))?;

    Ok(())
}
/// Link the object file to create an executable
fn link_executable(obj_path: &str, output_path: &str) -> Result<(), String> {
    // Get the pre-compiled runtime object path from build.rs
    let runtime_obj_path = env!("RUNTIME_OBJ");

    // Link the generated code with the runtime support
    let output = Command::new("cc")
        .args(["-o", output_path, obj_path, runtime_obj_path])
        .output()
        .map_err(|e| format!("Failed to run linker: {}", e))?;

    if !output.status.success() {
        return Err(format!(
            "Linker failed: {}",
            String::from_utf8_lossy(&output.stderr)
        ));
    }

    Ok(())
}
/// Compile a module to a standalone executable
pub fn compile_executable(module: &CoreModule, output_path: &str) -> Result<(), String> {
    println!("=== Compiling Module ===");

    // Find main function
    let main_def = module
        .term_defs
        .iter()
        .find(|def| def.name == "main")
        .ok_or("No main function found")?;

    // Type erase all functions
    let mut erased_functions = Vec::new();
    for def in &module.term_defs {
        if def.name != "main" {
            let erased = erase::erase(&def.body);
            erased_functions.push((def.name.as_str(), erased));
        }
    }

    // Type erase main
    let erased_main = erase::erase(&main_def.body);

    println!("=== Type Erased Functions ===");
    for (name, body) in &erased_functions {
        println!("{}: {}", name, body.pretty());
    }
    println!("main: {}", erased_main.pretty());

    // Closure conversion with module context
    let program = closure::closure_convert_module(
        erased_functions.iter().map(|(n, b)| (*n, b)).collect(),
        &erased_main,
    );

    println!("=== Closure Converted ===");
    for func in &program.functions {
        println!(
            "Function {}: {} -> {}",
            func.id,
            func.param,
            func.body.pretty()
        );
        println!("  Free vars: {:?}", func.free_vars);
    }
    println!("Main: {}", program.main.pretty());

    // Create cranelift module
    let triple = if cfg!(target_arch = "aarch64") && cfg!(target_os = "macos") {
        // Use the proper ARM64 macOS triple
        target_lexicon::Triple::from_str("aarch64-apple-darwin").unwrap()
    } else {
        Triple::host()
    };

    let mut settings = builder();
    settings.set("opt_level", "none").unwrap();
    settings.set("is_pic", "true").unwrap();
    if cfg!(target_arch = "aarch64") {
        // Ensure proper alignment for ARM64
        settings
            .set("enable_heap_access_spectre_mitigation", "false")
            .unwrap();
        settings.set("use_colocated_libcalls", "false").unwrap();
    }
    let flags = Flags::new(settings);

    let isa = lookup(triple.clone())
        .map_err(|e| format!("Error looking up ISA: {}", e))?
        .finish(flags)
        .map_err(|e| format!("Error creating ISA: {}", e))?;

    let builder = ObjectBuilder::new(
        isa,
        "system_f_omega",
        cranelift_module::default_libcall_names(),
    )
    .map_err(|e| format!("Error creating object builder: {}", e))?;

    let mut obj_module = ObjectModule::new(builder);

    // Compile runtime functions
    let runtime_funcs = runtime::compile_runtime(&mut obj_module)?;

    // Compile the program
    let mut codegen = compile::CodeGen::new(obj_module, runtime_funcs);
    let _main_func = codegen.compile_program(&program)?;

    // Add entry point that calls our main
    add_entry_point(&mut codegen)?;

    // Finalize
    let module = codegen.finish();
    let object = module.finish();
    let bytes = object
        .emit()
        .map_err(|e| format!("Error emitting object: {}", e))?;

    // Write object file
    let obj_path = format!("{}.o", output_path);
    std::fs::write(&obj_path, bytes).map_err(|e| format!("Error writing object file: {}", e))?;

    // Link to executable using system linker
    link_executable(&obj_path, output_path)?;

    println!("=== Compilation Complete ===");
    println!("Executable written to: {}", output_path);

    Ok(())
}
}

The executable generation process coordinates the entire compilation pipeline. Type erasure removes type information from the core language representation. Closure conversion transforms nested functions into explicit closures. Cranelift compilation generates machine code for each function. The main function wraps the program’s entry point with appropriate initialization. Finally, the system linker combines the generated code with the runtime library to produce an executable program.

And that’s it, that’s a full compiler! You’ve made a full optimizing compiler from a pure platonic mathematical construct all the way down to the messy real world of silicon and electrons.

Compiling a Simple Program

So how do invoke it? Basically just like any other compile. We take an input file and we get a machine-native executable out the other end.

Let’s trace through the compilation of a simple recursive factorial function to see how all these pieces work together:

-- factorial.fun
fib :: Int -> Int;
fib n = if n Le 1 then n else fib (n Sub 1) Add fib (n Sub 2);

main :: Int;
main = printInt(fib 10);

First, compile and run the program:

$ cd system-f-omega
$ cargo run -- compile factorial.fun -o factorial
$ ./factorial
55

The compilation process transforms this high-level program through several stages:

After Type Erasure: The polymorphic type information disappears, leaving only the computational structure:

fib = λn. if n Le 1 then n else fib (n Sub 1) Add fib (n Sub 2)
main = printInt(fib 10)

After Closure Conversion: Functions become explicit closures with captured environments:

Function 0: n -> if n Le 1 then n else closure(0, [])(n Sub 1) Add closure(0, [])(n Sub 2)
  Free vars: []
Main: printInt(closure(0, [])(10))

Machine Code Generation: Cranelift generates optimized assembly for the target architecture, handling the tagged value representation, closure allocation, and function calls according to our calling convention.

And that’s the entire journey from theoretical type system to efficient machine code. The lambda calculus need not remain an abstract concept; it can be compiled down to run precisely as efficiently as C or Rust code with the right set of abstractions! Compilers need not be hard, in fact they are quite fun!