Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Prism

Prism

Prism is a small, strict, impure functional language in the ML family whose type system tracks side effects. Effects are inferred, extensible rows that combine structurally as functions call one another, and they track observability: an effect handled inside a function vanishes from its type. The core is a call-by-push-value calculus in A-normal form that compiles to native code through LLVM, with memory managed by deterministic reference counting and fully-in-place update rather than a garbage collector.

This book has two parts:

  • Language Specification defines the surface language: lexical structure, grammar, types, effects, and evaluation.
  • Compiler documents the implementation: the pipeline, the core calculus, effect lowering, reference counting, the backends, and the verification harness.

Use the Playground to edit, run, and inspect Prism code via an interpreter run in the browser.

The Prism Language Specification

Prism is a strict, impure functional language in the ML family whose type system tracks side effects. This document defines the surface language: its lexical structure, grammar, type system, and evaluation. It describes the language as the prism compiler accepts it. Where the implementation is incomplete or carries known debt, the limitation is stated as such, not hidden.

1. Introduction

A Prism program is a set of modules, each a file of declarations. The surface language elaborates to a strict, call-by-push-value core (Levy, 2004) in A-normal form (the companion Compiler document), compiles to native code through LLVM, and is managed by deterministic reference counting rather than a garbage collector.

Two things distinguish Prism from its ML and Haskell ancestors. Side effects are inferred, extensible effect rows (Section 7) that combine structurally across calls instead of through monads, and they track observability: an operation handled inside a function does not appear in its type, so internally effectful code is reused as pure. The same reference-count discipline both frees memory and performs fully-in-place (FBIP) update (Section 10), compiling record updates and derived setters to in-place writes on uniquely owned values (those that a reference count proves have no other live reference; see Compiler).

This specification proceeds in dependency order: notation, lexical structure, grammar, types, then the constructs the grammar describes.

2. Notation

Grammar is given in the following EBNF. A terminal is a literal token written in double quotes; a nonterminal is a lower-case name. The character classes are the ASCII letters (letter), the two cases (lower, upper), the decimal digits (digit), any printable character (graphic), and any character other than ", \, or a newline (strchar). These are primitives, not grammar nonterminals.

x  y      juxtaposition: x followed by y
x | y     alternative: x or y
[ x ]     option: zero or one x
{ x }     repetition: zero or more x
( x )     grouping
"fn"      a terminal (literal token), in double quotes
varid     a nonterminal, in lower case
x , ...   a comma-separated list of one or more x

Identifiers in productions name the tokens defined in 3 (varid, conid, qualid, integer, float, char, string) and the character classes defined just above. Layout (Section 3.6) inserts block delimiters that the grammar then treats as ordinary terminals.

3. Lexical Structure

Source text is UTF-8. Tokens are lexed by longest match, then the stream is rewritten by the layout algorithm of Section 3.6. Whitespace and comments separate tokens and are otherwise insignificant except as layout boundaries.

varid    ->  (lower | "_") { idchar }
conid    ->  upper { idchar }
qualid   ->  conid "." ident { "." ident }
ident    ->  varid | conid
idchar   ->  letter | digit | "_"

integer  ->  digit { digit } [ "i64" | "u64" ]
float    ->  digit { digit } "." digit { digit } [ exponent ]
          |  digit { digit } exponent
exponent ->  ( "e" | "E" ) [ "+" | "-" ] digit { digit }
char     ->  "'" ( graphic | escape | "\\'" ) "'"
string   ->  "\"" { strchar | escape | interp } "\""
interp   ->  "{" expr "}"
escape   ->  "\\" ( "n" | "t" | "r" | "\\" | "{" | "}" | "\"" )
comment  ->  "--" { any character other than newline }

-- An interpolated string "a{e}b{e}c" is lexed into the pieces below, each
-- carrying its literal segment; the holes are re-lexed as expressions.
istart   ->  "\"" { strchar } "{"
imid     ->  "}" { strchar } "{"
iend     ->  "}" { strchar } "\""

-- Virtual layout tokens, synthesized by the offside pass; no source spelling.
vopen    ->  "v{"   -- start of an indented block
vclose   ->  "v}"   -- end of an indented block
vsep     ->  "v;"   -- a newline at the same column (item / statement separator)

3.1 Identifiers

Prism distinguishes identifiers by initial case. A varid begins with a lower-case letter or underscore and names a variable, function, parameter, or record field. A conid begins with an upper-case letter and names a type, data constructor, type class, or effect. A qualid is a dotted path such as Data.Map or Map.insert; it is lexed as a single token so that a module path never collides with field access.

3.2 Keywords

The following are reserved and may not be used as identifiers.

fnfipfbippubimport
astypenewtypeopaquealias
effecterrorthrowtrycatch
transactclassinstancepatternderiving
wheregivenhandlewithhandler
maskctlfinalfunval
returnletvarborrowin
fordoifthenelse
elifmatchofforalltrue
falsewhileloopbreakcontinue
usingcanonical

The built-in type names Int, I64, U64, Bool, Unit, Float, Char, and String are also reserved.

3.3 Operators and Punctuation

The operator set is fixed; the language has no user-defined operators. Every comparison operator, and every arithmetic operator except % and ^, also has a floating-point form suffixed with a dot. Exponentiation ^ is a single operator over both Int and Float (Section 8.4).

ClassOperators
Arithmetic+ - * / % ^ and float +. -. *. /.
Comparison== /= < <= > >= and float ==. /=. <. <=. >. >=.
Logical&& ||
Pipeline|> >> <<
Failure?? ?. ?
Arrows-> <- =>
Binding= := : and compound += -= *= %=
Effect!
Brackets( ) { } [ ]
Other, . .. | \

3.4 Literals

An integer is a sequence of decimal digits. A value that fits in a machine word is an immediate; a larger literal is an arbitrary-precision integer (bignum). The suffix i64 or u64 selects a fixed-width 64-bit lane that wraps on overflow. A float is an IEEE-754 double. A char is a single Unicode scalar in single quotes. A string is double-quoted UTF-8.

The escape sequences \n, \t, \r, \\, \", \{, and \} are recognized in both character and string literals; a character literal additionally accepts \'.

3.5 String Interpolation

Within a string, an unescaped { expr } is an interpolation hole. The hole text is re-lexed at its source position and elaborated as an expression whose Show value is spliced into the string. A hole runs to its matching }, balancing nested braces and string literals, so a hole may itself contain a string with braces. A literal brace outside a hole is written \{ or \}. An empty hole, an unterminated hole, and an unterminated string are each lexical errors. The catch arms of the error example in Section 7.5 use interpolation, as in "no such key: {k}".

3.6 Layout

Prism uses the offside rule: indentation, not explicit braces, delimits a block. A layout block opens after any of the keywords or symbols =, then, else, =>, of, with, handler, do, where, try, catch, transact, loop, and after fn (a while block opens at its do). The first token after such an opener sets the block’s indentation column; a later line at that column starts a new item in the block, and a line indented less closes the block. Explicit { } override layout and may always be used in place of an implicit block, as in the brace-delimited handler arms of the masking example (Section 7.3).

4. Surface Grammar

A program is a layout-delimited sequence of top-level declarations.

program   ->  { topdecl }                       -- separated by layout

topdecl   ->  [ "pub" | "opaque" ] item
item      ->  import | datatype | newtype | synonym | rowalias
           |  classdecl | instancedecl | canonicaldecl | effectdecl
           |  errordecl | patterndecl | constdecl | fundecl

import    ->  "import" modpath [ "as" conid ] [ "(" name , ... ")" ]
           |  "pub" "import" modpath [ "(" name , ... ")" ]
modpath   ->  conid { "." conid }
name      ->  varid | conid

datatype  ->  "type" conid [ tyvars ] "=" ctor { "|" ctor } [ deriving ]
newtype   ->  "newtype" conid [ tyvars ] "=" ctor [ deriving ]
synonym   ->  "alias" conid [ tyvars ] "=" type
rowalias  ->  "alias" conid "=" "{" [ label , ... ] "}"
tyvars    ->  "(" varid , ... ")"
ctor      ->  conid [ "(" type , ... ")" ]
           |  conid "{" field , ... "}"
field     ->  varid ":" type
deriving  ->  "deriving" "(" conid , ... ")"

constdecl ->  "let" varid [ ":" type ] "=" expr
fundecl   ->  [ "fip" | "fbip" ] "fn" varid "(" [ param , ... ] ")"
                [ retann ] [ given ] "=" expr [ wheres ]
param     ->  [ "borrow" ] varid [ ":" type ] [ ":=" expr ]
retann    ->  ":" "!" "{" [ label , ... ] "}" type
           |  ":" "!" type
           |  ":" type
given     ->  "given" constraint , ...
constraint -> conid "(" type ")"
wheres    ->  "where" "{" { varid "=" expr } "}"
classdecl    ->  "class" conid "(" varid ")" [ given ] "{" sig , ... "}"
instancedecl ->  "instance" varid ":" conid "(" type ")" [ given ]
                   "{" fundecl , ... "}"
canonicaldecl -> "canonical" conid "(" type ")" "=" varid
sig          ->  varid ":" type

effectdecl   ->  "effect" conid [ tyvars ] "{" effop , ... "}"
effop        ->  "ctl" varid "(" [ type , ... ] ")" ":" type
errordecl    ->  "error" conid [ "(" type , ... ")" ]

patterndecl  ->  "pattern" conid "(" [ varid , ... ] ")" "for" conid "="
                   "view" expr [ "make" expr ]

Type syntax. A function type carries an optional effect row on its codomain (Section 7); the row binds to a function type only.

type    ->  "forall" varid { varid } "." type
         |  arrow [ row ]                       -- row applies to a function type
arrow   ->  "(" [ type , ... ] ")" "->" arrow   -- n-ary domain
         |  atype "->" arrow                     -- single domain
         |  atype
atype   ->  scalar
         |  "[" type "]"                          -- List(type)
         |  conid [ "(" type , ... ")" ]          -- type constructor
         |  varid [ "(" type , ... ")" ]          -- variable, possibly applied
         |  "(" type ")"
         |  "(" type "," type , ... ")"           -- tuple
scalar  ->  "Int" | "I64" | "U64" | "Bool" | "Unit" | "Float" | "Char" | "String"

row     ->  "!" "{" [ label , ... ] [ "|" varid ] "}"
         |  "!"                                   -- empty row, effectful position
label   ->  conid [ "(" type , ... ")" ]

Expressions, patterns, and the handler block of handle/try (used in Section 7).

expr    ->  "if" expr "then" expr { "elif" expr "then" expr } "else" expr
         |  "let" pattern "=" expr  expr            -- bind, then continue
         |  "var" varid ":=" expr  expr             -- local mutable, then continue
         |  "\\" "(" [ varid , ... ] ")" "->" expr  -- lambda
         |  "match" expr "of" { arm }
         |  "handle" expr "with" handler
         |  "throw" conid [ "(" expr , ... ")" ]
         |  "try" expr "catch" { catcharm }
         |  "transact" expr "else" expr
         |  "for" gen { "," qual } "do" expr
         |  "while" expr "do" expr                    -- while loop
         |  "loop" expr                               -- unconditional loop
         |  "return" expr                             -- early return
         |  varid ":=" expr                          -- assignment
         |  varid ( "+=" | "-=" | "*=" | "%=" ) expr  -- compound assignment
         |  opexpr

opexpr  ->  opexpr binop opexpr                      -- binop and binding per 4.1
         |  appexpr
appexpr ->  appexpr "(" [ expr , ... ] [ "using" iname , ... ] ")"
                                                     -- application; `using` selects instances
         |  appexpr "[" expr "]"                      -- indexed read a[i]
         |  appexpr "." varid                         -- field access
         |  appexpr "." varid "(" [ expr , ... ] ")" -- method call (UFCS)
         |  appexpr "?." varid                        -- optional chaining
         |  appexpr "?"                               -- try marker
         |  atom
atom    ->  integer | float | char | string | "true" | "false" | "(" ")"
         |  "break" | "continue"                      -- loop control
         |  varid | conid | qualid
         |  "(" expr ")" | "(" expr ":" type ")"
         |  "(" expr "," expr , ... ")"               -- tuple
         |  "[" [ expr , ... ] "]"                    -- list
         |  "[" expr ".." expr "]"                    -- range
         |  "[" expr "for" gen { "," qual } "]"       -- comprehension
         |  conid "{" fieldval , ... "}"              -- record
         |  conid "{" ".." expr "," fieldval , ... "}" -- record update
         |  "{" expr "|" path "=" expr , ... "}"      -- path update
         |  "mask" "<" conid ">" "(" expr ")"

arm      ->  pattern [ "if" expr ] "=>" expr
gen      ->  varid "in" expr
qual     ->  "if" expr | "let" varid "=" expr
iname    ->  varid | qualid                          -- an instance name
fieldval ->  varid "=" expr                          -- a record field binding
path     ->  varid { "." varid }                     -- a nested field path
binop    ->  "??" | "|>" | ">>" | "<<" | "||" | "&&" -- binding given in 4.1
         |  "==" | "/=" | "<" | "<=" | ">" | ">="
         |  "+" | "-" | "*" | "/" | "%" | "^"
         |  "==." | "/=." | "<." | "<=." | ">." | ">=."
         |  "+." | "-." | "*." | "/."
pattern  ->  conid [ "(" pattern , ... ")" ]        -- constructor
          |  conid "{" fieldpat , ... "}"            -- record, ".." allowed
          |  varid                                    -- variable binding
          |  "_"                                      -- wildcard
          |  integer | float | char | "true" | "false"
          |  "[" [ pattern , ... ] "]"                -- list
          |  "(" pattern "," pattern , ... ")"        -- tuple
fieldpat ->  varid "=" pattern  |  ".."
handler   ->  "{" hclause , ... "}"  |  hclause { hclause }   -- braces or layout

hclause   ->  "return" varid "=>" expr                 -- result transform
           |  varid "(" [ varid , ... ] "," varid ")" "=>" expr  -- op, last arg is resume
           |  "fun" varid "(" [ varid , ... ] ")" "=>" expr      -- tail-resumptive sugar
           |  "final" "ctl" varid "(" [ varid , ... ] ")" "=>" expr  -- non-resumable
           |  "val" varid "=" expr                     -- install-time constant

catcharm  ->  conid [ "(" varid , ... ")" ] "=>" expr

4.1 Operator Precedence

The table gives the binding of each operator, loosest to tightest. Levels 1 to 9 are the binop operators of the grammar; level 10 is application, field access, and the postfix failure operators, which bind tighter than every binop.

LevelOperatorsAssociativity
1??right
2|>left
3>> <<left
4||left
5&&left
6== /= < <= > >= (and float forms)none
7+ - (and float forms)left
8* / %, and float *. /.left
9^right
10f(...) a[i] .field ?.field ?left

5. Types and Kinds

Prism infers types by the bidirectional, higher-rank inference algorithm of Dunfield & Krishnaswami (2013). An unannotated declaration infers its principal type; an annotated one is checked against the annotation. Annotations are required for rank-N polymorphism, since a nested forall cannot be inferred.

Quantification is predicative: a type-constructor argument and an inferred type variable range over monomorphic types, so a forall may not be written directly as a type argument (List(forall a. (a) -> a) is rejected as impredicative). Higher-rank types are allowed wherever they are not a type argument, namely as a function parameter, a function result, and a declared data field; a polymorphic value can be carried through a generic container by wrapping it in a data type with a polymorphic field.

5.1 Types

The scalar types are Int (arbitrary precision), I64, U64, Float, Bool, Char, String, and Unit. A type constructor applied to arguments is written Con(t, ...); the list type has the sugar [t] for List(t). A tuple type is (t, ...). A function type is (t, ...) -> u, optionally carrying an effect row on u. A universally quantified type is forall a. t. Type variables are varids.

5.2 Kinds

A type has kind * (a type of values) or * -> * (a type constructor awaiting one argument), and so on; List has kind * -> *, since List(Int) is a type only once Int is supplied. A class parameter may range over a constructor of kind * -> *, applied as f(a) in method signatures; see Section 6. There is no explicit kind-checking phase: well-kindedness is enforced during unification, which requires a constructor and its arguments to agree in arity.

5.3 Inference, Generalization, and Defaulting

A row is built from labels, the effect names of Section 7 (a parametric effect’s label carries type arguments). It is closed when it ends in a fixed set of labels and open when it ends in a row variable (! {L | r}), which stands for further labels the caller may add. An unannotated binding is generalized over its free type and row variables not fixed by the surrounding scope. Two cases default rather than generalize, both resolved in one pass at generalization. A numeric operand of an arithmetic or comparison operator left otherwise unconstrained defaults to Int; because the default is deferred to that pass rather than applied at the operator, a later use that fixes the operand to a fixed-width lane (I64/U64) takes precedence, so x + y followed by an i64 use of x is fixed-width, not Int. An open row left unconstrained at a monomorphic declaration (one with no remaining free row variable) defaults to empty (pure); an effect-polymorphic declaration keeps its row variable, as traverse does in the prelude (Section 12).

5.4 Subsumption and Row Equivalence

Checking a value against an expected type uses subsumption, not equality. A more polymorphic type is accepted where a less polymorphic one is expected: a forall on the expected side introduces a rigid variable the value must satisfy for all instances, and a forall on the value side is instantiated to meet the expectation. Function subtyping is contravariant in the arguments and covariant in the result, so a function accepting more and returning less may stand in for one accepting less and returning more.

Effect rows are compared up to reordering: ! {A, B} and ! {B, A} are the same row, because unification hoists a demanded label to the head of the other row before matching the tails. An open row ! {A | r} unifies with any row that provides A by binding r to the remainder; for instance ! {A | r} unifies with ! {A, B} by binding r to {B}. This is how a caller’s row absorbs a callee’s. A unification that would make a row contain itself is rejected, so recursive effect rows do not arise.

5.5 Fixed-Width Integers

Int is arbitrary precision. I64 and U64 are the signed two’s-complement and unsigned 64-bit lanes; they wrap on overflow rather than promoting to a bignum. Their operations are named builtins, not operators, since the surface +, -, and so on target Int and Float. Each takes two operands of the lane type.

FamilyOperations (and the u64_* counterparts)
Arithmetici64_add i64_sub i64_mul i64_div i64_rem i64_cmp
Bitwisei64_and i64_or i64_xor
Shifti64_shl i64_shr

and, or, and xor share a single bit pattern across both lanes; i64_shr is an arithmetic (sign-extending) shift while u64_shr is logical; a shift count is taken modulo 64. to_i64/to_u64 and int_of_i64/int_of_u64 convert between Int and the fixed-width lanes.

5.6 Algebraic Data Types

A type declaration introduces an algebraic data type: a sum of constructors, each a product of fields. A constructor is named with an upper-case identifier and applied like a function to build a value; a match (Section 9) destructures a value by constructor. A type may take type parameters and may be recursive, including mutually so.

-- A sum type: a value is exactly one of the listed constructors, each with its
-- own fields. Construction names a constructor; destruction matches on it.
type Shape = Circle(Int) | Rect(Int, Int)

fn area(s) =
  match s of
    Circle(r) => 3 * r * r
    Rect(w, h) => w * h

-- A recursive sum type, the canonical binary tree.
type Tree = Leaf | Node(Tree, Int, Tree)

fn sum_tree(t) =
  match t of
    Leaf => 0
    Node(l, v, r) => sum_tree(l) + v + sum_tree(r)

fn main() =
  println(area(Circle(10)))
  println(area(Rect(3, 4)))
  println(sum_tree(Node(Node(Leaf, 1, Leaf), 2, Node(Leaf, 3, Leaf))))

A newtype is a data type with exactly one single-field constructor: a type distinct from its payload, with no runtime wrapper. An alias on a type expression is a transparent synonym, interchangeable with its definition. A deriving (C, ...) clause generates the named instances structurally (Section 6); Eq, Ord, Show, and Lens are derivable.

5.7 Records

A constructor may instead take named fields, C { f : T, ... }, making the type a record. A field is read with e.f; records are built and updated by the expressions of Section 8.2. deriving (Lens) synthesizes a getter f_of and a setter with_f per field.

-- A record is a single-constructor product type with named fields. Fields are
-- read with `.f`, built with `C { f = e, ... }`, and updated functionally.
type Point = Point { x: Int, y: Int }

fn main() =
  let p = Point { x = 3, y = 4 }
  let q = Point { ..p, y = 9 }
  println(p.x)
  println(q.y)

6. Type Classes

A class declares a single-parameter constraint and a set of method signatures. An instance is a named value providing those methods for one head type. A constrained function receives its dictionaries as hidden arguments resolved at each call site. The following program declares a second Ord(Int) instance named ordDesc that reverses the ordering, designates the prelude’s ascending ordInt as canonical, and selects each explicitly.

-- Typeclasses with Lean-style named instances. A class declares a dictionary
-- shape, instances are named values, and constrained functions take hidden
-- dictionaries resolved at the call site. When a head has two instances one is
-- canonical (here the prelude's ascending ordInt, designated below); implicit
-- resolution picks it, and the other is selected explicitly with
-- f(args, using instName). Coherence: resolution is never silently ambiguous.

fn maximum_by_ord(xs : List(a)) : Option(a) given Ord(a) =
  match xs of
    Nil => None
    Cons(x, rest) =>
      match maximum_by_ord(rest) of
        None => Some(x)
        Some(y) =>
          if cmp(x, y) >= 0 then
            Some(x)
          else
            Some(y)

instance ordDesc : Ord(Int) {
  fn cmp(x, y) = int_cmp(y, x)
}

-- The prelude's ascending ordInt is the canonical Ord(Int); ordDesc stays the
-- explicit yardstick. Without this, two Ord(Int) instances would be a coherence
-- error at definition rather than a silent ambiguity at the use site.
canonical Ord(Int) = ordInt

fn join_shown(xs : List(a)) : String given Show(a) =
  match xs of
    Nil => ""
    Cons(x, rest) =>
      match rest of
        Nil => show_c(x)
        Cons(_, _) => concat(show_c(x), concat(", ", join_shown(rest)))

fn main() : !{IO} Unit =
  let xs = [3, 1, 4, 1, 5, 9, 2, 6]
  println(join_shown(sort_by_ord(xs, using ordInt)))
  println(join_shown(sort_by_ord(xs, using ordDesc)))
  println(unwrap_or(0, maximum_by_ord(xs, using ordInt)))
  println(unwrap_or(0, maximum_by_ord(xs, using ordDesc)))
  println(join_shown(sort_by_ord(["pear", "apple", "quince"])))

6.1 Coherence and Resolution

An instance is selected by the head constructor of the constraint type (the outermost constructor, for example List in List(Int)). Resolution is coherent: a program’s meaning never silently depends on which instance the resolver happened to pick. For each (class, type-head) there is exactly one canonical instance, and implicit resolution always selects it, so resolution is deterministic.

With a single instance for a head, that instance is canonical automatically. When two or more instances share a head, one must be designated canonical with a top-level declaration:

canonical Class(Head) = instanceName

Two instances for one head with no designation is a coherence error reported at definition, not a silent ambiguity deferred to the use site. The designated instance is what implicit resolution selects; the others remain reachable only through an explicit override.

An explicit override is visible at the use site and changes nothing else’s resolution: pass the chosen instance as a trailing using argument, f(args, using instanceName), as sort_by_ord(xs, using ordDesc) does above. (This is the same using form reserved for first-class dictionary passing.) There is no ambient, scoped instance mechanism: an override is always written where it is used.

The preferred way to obtain a different instance for a type is a newtype with its own canonical instance (newtype Down = Down(Int) for reverse order, a folded-case wrapper for case-insensitive comparison) rather than a non-canonical instance of the base type. This changes the type, not the instance-for-a-type, so coherence is preserved exactly and the difference is visible in the signature; an explicit using override is the second-line tool when a newtype is too heavy.

Resolution recurses through instance contexts up to a fixed depth.

6.2 Superclasses

A class may require another as a superclass with given. Each instance then stores a resolved superclass dictionary as the leading field of its dictionary cell, and a given Ord(a) constraint discharges an Eq(a) obligation by projecting that field. The superclass witness is found automatically from the instances in scope.

-- Superclasses: a class may declare another class as a superclass with
-- `given`. Every instance then carries a resolved superclass dictionary, stored
-- as a leading field of its dict cell, and a `given Ord2(a)` constraint can
-- discharge an `Eq2(a)` obligation by projecting that field. The superclass
-- witness is found automatically from the instances in scope, so the instance
-- declaration does not repeat it.
--
-- Prints:
--   equal
--   less
--   greater

class Eq2(a) {
  eq2 : (a, a) -> Bool
}

class Ord2(a) given Eq2(a) {
  cmp2 : (a, a) -> Int
}

instance eqInt2 : Eq2(Int) {
  fn eq2(x, y) = x == y
}

instance ordInt2 : Ord2(Int) {
  fn cmp2(x, y) = if x < y then 0 - 1 elif x > y then 1 else 0
}

-- Polymorphic over any Ord2(a). It calls cmp2 (its own method, taken from the
-- Ord2 dictionary parameter) AND eq2 (a superclass method, projected out of the
-- same dictionary). No Eq2 constraint is written: Ord2 entails it.
fn describe(x : a, y : a) : String given Ord2(a) =
  if eq2(x, y) then
    "equal"
  elif cmp2(x, y) < 0 then
    "less"
  else
    "greater"

fn main() : !{IO} Unit =
  println(describe(3, 3))
  println(describe(3, 5))
  println(describe(7, 2))

6.3 Higher-Kinded Classes

A class parameter may be a type constructor of kind * -> *, applied as f(a) in method signatures and resolved on the head constructor of each instance. The prelude’s Functor/Applicative/Monad/Foldable/Traversable tower is built this way. Its methods are effect-polymorphic (defined in Section 7.6): a per-element effect row threads through in place of an Applicative wrapper, so effectful traversal needs no monad and no do-notation.

-- Higher-kinded type classes over the language's own data types: the class
-- parameter ranges over a type constructor (kind * -> *), applied as `f(a)` in
-- the method signatures. The Functor/Applicative/Monad/Foldable/Traversable
-- tower is in the prelude with instances for List and Option, resolved by the
-- head constructor of each instance. `fmap`/`traverse` are effect-polymorphic:
-- the per-element effect row threads through instead of an Applicative wrapper,
-- so effectful traversal needs no monad and no do-notation.
--
-- Prints:
--   [2, 3, 4]
--   Some(42)
--   10
--   [2, 3, 10, 20]
--   Some(15)
--   [1, 1, 2, 2, 3, 3]
--   Some(9)
--   [10, 20, 30]

fn main() : !{IO} Unit =
  -- Functor: map over either container with one `fmap`.
  println(show(fmap(\(x) -> x + 1, [1, 2, 3])))
  println(show(fmap(\(x) -> x * 2, Some(21))))
  -- Foldable: an effect-polymorphic right fold.
  println(show(fold_r(\(x, acc) -> x + acc, 0, [1, 2, 3, 4])))
  -- Applicative: `ap` is cartesian on List, positional on Option.
  println(show(ap([\(x) -> x + 1, \(x) -> x * 10], [1, 2])))
  println(show(ap(Some(\(x) -> x + 5), Some(10))))
  -- Monad: structural `bind` (List is nondeterminism, Option is failure).
  println(show(bind([1, 2, 3], \(x) -> [x, x])))
  println(show(bind(Some(3), \(x) -> Some(x * x))))
  -- Traversable: effectful map (here the effect is none).
  println(show(traverse(\(x) -> x * 10, [1, 2, 3])))

Classes remain single-parameter; multi-parameter classes are not supported.

7. Effects and Handlers

An effect declares a set of operations; each ctl operation has an argument list and a result type. Performing an operation is an ordinary call to its name. A function’s effect row is the set of effects whose operations it may perform and has not handled, written ! {L, ...} on its result type, with an optional row variable tail ! {L | r}. A bare ! is an explicit empty row. A row is inferred when omitted.

effect State {
  ctl get(Unit) : Int,
  ctl put(Int) : Unit
}

-- Algebraic State, the Koka way.
--
-- The handler interprets get/put by parameter passing. Each clause returns a
-- function s -> result, and `k(v)(s)` threads the state forward. The handled
-- block becomes a state transformer that we run with an initial s.
--
-- counter() never mentions a state value, it just performs get and put. The
-- row is inferred and discharged at the handler, so the same ops could be
-- reinterpreted (logging, bounding) without touching counter.

fn tick() : !{State} Int =
  let n = get(())
  put(n + 1)
  n

fn counter() : !{State} Int =
  tick()
  tick()
  tick()
  get(())

fn run_counter(init) =
  let f =
    handle counter() with
      get(u, k) => \(s) -> k(s)(s)
      put(s2, k) => \(_s) -> k(())(s2)
      return r => \(_s) -> r
  f(init)

fn main() = println(run_counter(0))

A handle e with block discharges operations; its grammar is the handler nonterminal of Section 4. Each operation clause names an operation and binds its arguments and the resumption k (the captured continuation, explained below); calling k(v) resumes the suspended computation with v, and k may be called zero times (abort), once (the common case), or many times (multishot). A return r clause transforms the final value. The handler in eff_state.pr interprets get/put by threading a state parameter, so counter, which only performs the operations, never mentions a state value.

Operations and handlers are delimited control: the handle block is the delimiter (a prompt), and the resumption k is the delimited continuation it captures, the slice of computation between the perform site and the handler. Being first-class, k reinstalls that slice under the same handler when invoked. This is the typed, named generalization of shift/reset: a single prompt with one anonymous continuation becomes a row of named operations, each with its own clause, and the effect row is the static record of which delimiters a computation still requires.

A clause may invoke k any number of times; more than once makes the continuation multishot: each call re-runs the captured slice from the perform site with a different result, so one handler can pursue several futures of the same computation. This is how nondeterminism or search handlers explore alternatives (an amb operation whose clause calls k once per choice and combines the outcomes) and how generators yield and continue. Never invoking k discards the captured slice, which is exactly how raise (Section 7.1) and a final ctl clause abort.

7.1 Observability

The defining property of the row discipline: an operation handled inside a function is discharged, so it does not appear in that function’s inferred row. In the example below, checked carries the row ! {Exn}, but attempt, which handles raise, is pure.

effect Exn {
  ctl raise(Int) : Int
}

-- Exceptions as an effect: raise and catch.
--
-- raise aborts the computation. The handler clause drops the captured
-- continuation k and returns a recovery value, so control never comes back.
-- checked() raises deep inside a helper, far from the catching handler: the
-- effect propagates across calls and is discharged where the handle sits.

fn safe_div(n, d) : !{Exn} Int =
  if d == 0 then
    raise(0 - 1)
  else
    n / d

fn checked(n) : !{Exn} Int =
  let a = safe_div(100, n)
  let b = safe_div(a, n - 5)
  a + b

fn attempt(n) =
  handle checked(n) with
    raise(code, k) => code
    return r => r

fn main() =
  let good = attempt(2)
  let bad = attempt(5)
  println(good * 100 + bad)

7.2 Clause Sugar

Two clause forms abbreviate common shapes. fun op(x) => e is tail-resumptive sugar for op(x, k) => k(e), resuming exactly once. val v = e is an install-time constant: e runs once when the handler installs, and every use of v returns it.

-- Koka-style fun and val handler clauses, pure desugaring sugar.
--
-- `fun op(x) => e` is a tail-resumptive clause, identical to
-- `op(x, k) => k(e)`, resuming exactly once with e. `val v = e` is an
-- install-time constant: e runs once when the handler installs (its effects
-- land on the installer, not the handled block) and every v() returns it.
--
-- Expected output:
--   999     (printed once at install, from val tag)
--   prism
--   ready
--   70      (tag 0 + width(3) + width(4), each * 10)

effect App {
  ctl name() : String,
  ctl tag() : Int,
  ctl width(Int) : Int,
  ctl log(String) : Unit
}

fn render() : !{App} Int =
  log(name())
  log("ready")
  tag() + width(3) + width(4)

fn run_app() : !{IO} Int =
  handle render() with
    val name = "prism"
    val tag =
      println(999)
      0
    fun width(w) => w * 10
    log(s, k) =>
      println(s)
      k(())
    return r => r

fn main() = println(run_app())

A final ctl op(x) => e clause is non-resumable: it discards the continuation. This is the shape that error, throw, try, and catch desugar to (Section 7.5).

7.3 Masking

mask<E>(e) makes every operation of effect E performed in e bypass the innermost enclosing handler of E and reach the next one out. Masks nest, so a double mask skips two handlers. The masked expression still demands an enclosing handler, so E remains in its row.

-- Koka-style effect masking. `mask<Eff>(e)` makes every Eff operation in e
-- bypass the innermost enclosing Eff handler and reach the next one out.
-- Masks nest, so a double mask skips two handlers. The row type still injects
-- Eff, so a masked expression demands an enclosing handler. Dispatch skips
-- one matching frame per mask.
--
-- Expected output:
--   12
--   300
--   13

effect Ask {
  ctl ask() : Int
}

-- inner answers 2, outer 10. ask() + mask<Ask>(ask()) = 2 + 10 = 12
fn shadowed() =
  handle (handle ask() + mask<Ask>(ask()) with {
      ask(k) => k(2),
      return r => r
    }) with
    ask(k) => k(10)
    return r => r

-- three handlers deep (inner 1, middle 100, outer 200). The single mask
-- reaches the middle, the double mask the outer. 100 + 200 = 300
fn deep() =
  handle (handle handle mask<Ask>(ask()) + mask<Ask>(mask<Ask>(ask())) with {
      ask(k) => k(1),
      return r => r
    } with {
      ask(k) => k(100),
      return r => r
    }) with
    ask(k) => k(200)
    return r => r

-- the inner clause re-performs ask, answered by the outer (3). The masked ask
-- in the body skips to the outer too. 3 + 3 + 7 = 13
fn reask() =
  handle (handle ask() + mask<Ask>(ask()) + 7 with {
      ask(k) => k(ask()),
      return r => r
    }) with
    ask(k) => k(3)
    return r => r

fn main() =
  println(shadowed())
  println(deep())
  println(reask())

7.4 Local Mutation

A var mutates, yet the function holding it stays pure. fib_iter below updates two locals in a loop but has type (Int) -> Int with an empty row, so it is accepted where only a pure function is allowed. Prism has no mutation primitive; var is sugar over the effect system.

A var x := e desugars to a private two-operation effect (a get and a set); each read of x becomes a perform of get, each x := v a perform of set. In the same pass, a handler that threads the value as a hidden parameter is wrapped around the block. That handler discharges the get and set labels (Section 7.1), so they never reach the function’s type: the state is implemented but not observable. Because an escape analysis (below) has proved the state never leaves its block, effect lowering then erases the whole handler to a mutable cell, turning each get into a cell read and each set into a cell write, and the loop into a constant-stack loop, so the lowered code allocates nothing per iteration.

-- twice demands a pure (Int) -> Int. fib_iter mutates two locals in a loop,
-- yet is accepted here: its inferred row is empty.
fn twice(f : (Int) -> Int) = f(f(6))

fn fib_iter(n : Int) : Int =
  var a := 0
  var b := 1
  repeat(n) fn
    let t = a + b
    a := b
    b := t
  a

fn main() = println(twice(fib_iter))

An escape analysis keeps the purity honest: the compiler rejects any closure or returned value that would carry the var out of its block, so the state cannot outlive its handler.

7.5 Errors and Failure

Prism has no built-in exception type. Errors and failure are two related mechanisms, both resting on the non-resumable final ctl clause of Section 7.2.

Extensible errors. An error N(t) declaration introduces a one-operation effect whose operation never resumes; throw N(x) performs it. A function’s error row is exactly the set of errors it may raise and has not caught, and distinct error declarations union structurally as functions compose, with no umbrella sum type and no conversion glue: find_port carrying {NotFound} and parse_port carrying {Malformed} compose to {NotFound, Malformed}. try e catch { ... } is subtractive handler sugar (one nested final ctl per arm): a partial catch discharges the labels it names and lets the rest flow to an enclosing handler, and an uncaught error is an unhandled-effect error naming exactly the labels that remain. Each catch arm names an error and binds its fields to variables.

-- Errors as effect rows. Each `error` declaration is its own effect label and
-- `throw` performs it, so a function's row is its error set. find_port carries
-- {NotFound}, parse_port carries {Malformed}, and their composition unions to
-- {NotFound, Malformed} with no umbrella enum or conversion glue.
--
-- `try`/`catch` is subtractive handler sugar (one nested `final ctl` per arm).
-- A partial catch discharges its label and passes the rest to an outer try. An
-- uncaught case is an unhandled-effect error naming exactly what remains.
--
-- Expected output:
--   8080
--   default 80
--   bad config: oops
--   443
--   no such key: tls
--
error NotFound(String)

error Malformed(String)

fn find_port(cfg, key) =
  match cfg of
    Nil => throw NotFound(key)
    Cons(pair, rest) =>
      let (k, v) = pair
      if k == key then
        v
      else
        find_port(rest, key)

fn parse_port(s) =
  if s == "8080" then
    8080
  elif s == "443" then
    443
  else
    throw Malformed(s)

-- Composed: the row unions to {NotFound, Malformed}.
fn port_of(cfg, key) : !{NotFound, Malformed} Int =
  parse_port(find_port(cfg, key))

fn describe(cfg, key) =
  try
    show(port_of(cfg, key))
  catch
    NotFound(k) => "no such key: {k}"
    Malformed(s) => "bad config: {s}"

fn main() =
  let cfg = [("http", "8080"), ("https", "443"), ("smtp", "oops")]
  println(describe(cfg, "http"))
  let port =
    try
      try port_of(cfg, "gopher") catch { Malformed(_s) => 0 }
    catch
      NotFound(_k) => 80
  println("default {show(port)}")
  println(describe(cfg, "smtp"))
  println(describe(cfg, "https"))
  println(describe(cfg, "tls"))

These idioms span the recovery spectrum: the built-in Exn effect, raised by error(code) and uncatchable (it aborts); Result with the postfix e? propagation of Section 8; a plain match on Ok/Err; and a custom non-resumable effect.

-- Exceptions tour, four idioms from fatal to fully recoverable.
--
-- 1. Built-in Exn row: `error(code)` aborts the process, uncatchable. Any
--    caller carries `!{Exn}`. checked_div sees only safe input here.
-- 2. `?` propagation over Result: `half(n)?` unwraps Ok and short-circuits
--    the chain on the first Err. report matches the outcome at the boundary.
-- 3. Recovery is a plain match on Ok/Err (or result_or for a default).
-- 4. Custom exception effect via a `final ctl` clause: abort never resumes,
--    the continuation is discarded, and the clause's value becomes the
--    result. Cheap throws, no Result plumbing in the throwing code.
--
-- Expected output:
--   42
--   3
--   odd
--   3
--   0
--   pass
--   fail
--   invalid: negative
--   invalid: too big

fn checked_div(n, d) : !{Exn} Int =
  if d == 0 then
    error(1)
  else
    n / d

fn half(n) =
  if even(n) then
    Ok(n / 2)
  else
    Err("odd")

fn quarter(n) =
  let h = half(n)?
  let q = half(h)?
  Ok(q)

fn report(r) =
  match r of
    Ok(v) => println(v)
    Err(e) => println(e)

fn quarter_or_zero(n) =
  match quarter(n) of
    Ok(v) => v
    Err(_) => 0

effect Abort {
  ctl abort(String) : Unit
}

fn grade(n) : !{Abort} String =
  if n < 0 then
    abort("negative")
  if n > 100 then
    abort("too big")
  if n >= 60 then
    "pass"
  else
    "fail"

fn safe_grade(n) =
  handle grade(n) with
    final ctl abort(msg) => concat("invalid: ", msg)
    return r => r

fn main() =
  println(checked_div(84, 2))
  report(quarter(12))
  report(quarter(6))
  println(quarter_or_zero(12))
  println(quarter_or_zero(6))
  println(safe_grade(72))
  println(safe_grade(40))
  println(safe_grade(0 - 5))
  println(safe_grade(200))

The failure axis. Beyond named errors, Prism has an anonymous, recoverable fail(), the deterministic-functional-logic failure of the Verse calculus (Augustsson et al., 2023). guard(b) fails when b is false; a ?? b runs a under a failure handler and falls back to b; e?.field chains through options, failing on None; optional/succeeds/default reify a failing computation as an Option, a Bool, or a default; and a comprehension guard may itself fail, pruning the element (Section 8). transact body else fallback snapshots every live var, runs the body under a failure handler, and restores the snapshots on failure, so an aborted attempt leaves observable state unchanged. The whole axis is final ctl handlers over a Fail effect, so an unhandled fail() is the ordinary unhandled-effect error, and “failable only in a failure context” falls out of the row discipline for free.

-- Transactional rollback. A purchase that fails on insufficient funds leaves
-- balance and stock untouched, as if it never happened. The first buy succeeds.
-- The second overspends, so `guard` fails inside the `transact` and both vars
-- roll back to their pre-attempt values.
fn main() =
  var balance := 100
  var stock := 5
  let r1 =
    transact
      balance -= 40
      stock -= 1
      guard(balance >= 0)
      1
    else
      0
  println(balance)
  println(stock)
  println(r1)
  let r2 =
    transact
      balance -= 90
      stock -= 1
      guard(balance >= 0)
      1
    else
      0
  println(balance)
  println(stock)
  println(r2)

7.6 Effect Polymorphism

A function can be generic over the effects of a thunk it is given by quantifying over a row variable in the argument’s type. Below, twice accepts any (Unit) -> Int thunk and adds an open row {| e} for whatever that thunk performs; each call unifies e with the actual row (empty, {Tick}, or {Say}), and a handler discharges only the label it names, leaving the rest in e. This is the mechanism the prelude’s fmap and traverse use to thread a per-element effect (Section 6.3), so an effectful traversal needs no Applicative wrapper.

effect Tick {
  ctl tick(Unit) : Int
}

effect Say {
  ctl say(Int) : Unit
}

-- Effect polymorphism: one higher-order function, any effect row.
--
-- twice is written once, with a row variable e for whatever its argument
-- performs. Each call site unifies e with that thunk's row: {} for the pure
-- thunk, {Tick} and {Say} for the effectful ones. A handler discharges only
-- the label it names. e carries the rest along.

fn twice(f : (Unit) -> Int ! {| e}) = f(()) + f(())

-- e = {}: pure code needs no handler at all.
fn pure_use() = twice(\(_u) -> 21)

-- e = {Tick}: a counting handler hands out 1 then 2, so the same thunk
-- answers differently per force.
fn tick_use() =
  let g =
    handle twice(\(_u) -> tick(())) with
      tick(u, k) => \(n) -> k(n)(n + 1)
      return r => \(_n) -> r
  g(1)

-- e = {Say}: each force shouts before answering.
fn say_use() =
  handle twice(\(_u) -> let _ = say(9) in 9) with
    say(m, k) =>
      println(m)
      k(())
    return r => r

fn main() =
  println(pure_use())
  println(tick_use())
  println(say_use())

8. Expressions

The expression grammar is in Section 4 and the effect and failure forms are in Section 7; the forms below are those the grammar alone does not settle.

A method call e.m(args) is uniform-function-call sugar for m(e, args). A trailing block argument, e.m(args) fn (x) { body }, appends a lambda as the last argument; this is how the stream consumers in streams.pr chain. Field access is e.field.

8.1 Comprehensions

A comprehension [ e for x in s, q, ... ] collects e for each element; a qualifier q is a guard if g or a binder let y = e. A guard is evaluated in a failure context, so an element is pruned both when g is false and when computing g fails: a failable accessor such as at_list (a prelude lookup, Section 12) past the end of a list prunes that element rather than aborting. The statement form for x in s, q, ... do body runs body per survivor. Both desugar to the prelude’s stream combinators (the Emit effect of Section 12), so they fuse without building an intermediate list.

-- List comprehensions and qualified for-loops. A comprehension
-- `[ e for x in s, <quals> ]` collects e per element. Qualifiers filter with
-- `if g` and bind with `let y = e`. A guard may be failable: it prunes the
-- element when computing g raises Fail, not only when g returns false. The
-- statement form `for x in s, <quals> do body` runs body per survivor.
-- Expected output:
--
--   1
--   4
--   9
--   16
--   25
--   30
--   40
--   50
--   2
--   4
--   10
--   30

fn main() =
  for sq in sof([x * x for x in srange(1, 6)]) do
    println(sq)
  for y in sof([y for x in srange(1, 6), let y = x * 10, if y > 20]) do
    println(y)
  for x in srange(1, 6),
      if even(x) do
    println(x)
  -- Failable guard: `at_list` fails past the end, so the guard prunes
  -- out-of-range indices just as it prunes the zero price. Keeps 10 and 30.
  let prices = [10, 0, 30]
  for i in srange(0, 6),
      if at_list(prices, i) > 0 do
    println(at_list(prices, i))

8.2 Records

Record construction C { f = e, ... }, functional update C { ..base, f = e }, and nested path update { base | a.b = e, ... } build and modify the records of Section 5.7; each is an in-place write on a uniquely owned value. The deriving (Lens) getters and setters compose with them for deeper access. A path generalizes past nested fields to traversals, indices, prisms, filters, and a read form (Section 8.6).

-- `deriving (Lens)` synthesizes a getter `<f>_of` and a functional setter
-- `with_<f>` per field. They are ordinary functions, no optic types needed.
-- On a uniquely owned value the setter is FBIP-reused.
--
-- Expected output:
-- 3
-- 7
-- 9
-- 4
type Vec2 = Vec2 { x: Int, y: Int } deriving (Lens)

fn main() : !{IO} Unit =
  let v = Vec2 { x = 3, y = 4 }
  println(x_of(v))
  let v2 = with_x(v, 7)
  println(x_of(v2))
  let v3 = with_y(with_x(v, 9), 4)
  println(v3.x)
  println(v3.y)

8.3 Imperative control flow

Loops and early exit are surface sugar over tail recursion and effects, so they cost nothing beyond what an explicit recursion would. while cond do body and loop body (an unconditional loop) lower to a tail-recursive driver applied to the condition and body as thunks; because a var is a State effect (Section 12) the body mutates freely and the loop runs in constant stack with no per-iteration allocation. break and continue (valid inside while, loop, and for) and statement-form return e (which exits the enclosing function) compile to non-resumable performs of internal, fully-handled control effects, installed only for the keyword a body actually uses; a nested loop captures its own break/continue. Because each control effect is discharged at its loop or function boundary, none appears in the surfaced effect row: a loop is as pure as its body, and a function using return infers the same row as the equivalent recursion. Compound assignment x += e (and -=, *=, %=) on a var is shorthand for x := x <op> e.

Each form desugars to an existing construct:

SurfaceDesugaring
x += e (and -=, *=, %=)x := x <op> e
while cond do bodyrepeat_while(\() -> cond, \() -> body)
loop body (reachable break)repeat_while(\() -> true, \() -> body)
loop body (no break)forever(\() -> body), whose result is a bottom type
break / continuea final ctl perform of an internal Break/Continue effect handled at the loop
return ea final ctl perform of an internal Return(a) effect handled at the function body
-- Imperative control flow recovered from effects and tail recursion: `while`,
-- `loop`, `break`, `continue`, early `return`, and compound assignment, all
-- desugaring to the existing core. `var` is a State effect, so loop bodies mutate
-- freely yet each function below stays observably pure `(Int) -> Int`.
fn sum_evens(limit : Int) : Int =
  var total := 0
  var i := 0
  while i < limit do
    i += 1
    if i % 2 == 1 then
      continue
    total += i
  total

fn first_factor(n : Int) : Int =
  var d := 2
  loop
    if d * d > n then
      return n
    if n % d == 0 then
      return d
    d += 1

fn countdown(start : Int) : Int =
  var n := start
  var ticks := 0
  loop
    if n <= 0 then
      break
    n -= 1
    ticks += 1
  ticks

fn main() : !{IO} Unit =
  println(show_int(sum_evens(10)))
  println(show_int(first_factor(91)))
  println(show_int(first_factor(97)))
  println(show_int(countdown(5)))

8.4 Exponentiation

a ^ b raises a to the power b. It binds tighter than * and is right-associative, so 2 ^ 3 ^ 2 is 2 ^ (3 ^ 2). It is the method of the Pow class (Section 12) with Int and Float instances, so it desugars to pow(a, b): over Int it is bignum-correct (the instance multiplies), over Float it is a pow_float call. A mixed Int ^ Float is a type error, resolved by an explicit to_float, exactly as 2 + 3.0 is (Prism never coerces between Int and Float implicitly).

8.5 Indexing

a[i] reads, a[i] := v writes, and a[i] += e updates an indexed container. The form is dispatched on the receiver’s type (not a class, so no inference change): Array is indexed by Int, HashMap by String, String by Int (yielding the byte), and List by Int. Array, HashMap, and List are writable; String is read-only. Array and HashMap rewrite the cell in place (FBIP); a List write is the functional list_set, rebuilding the spine.

A read is failable: a missing index or key performs the Fail effect (Section 7.5), so a[i] has type Elem ! {Fail} and the partiality surfaces in the row rather than in an Option wrapper. It therefore composes with ??, ?., default, and the rest of the failure axis: a[i] ?? d supplies a default, and the counter idiom is m[k] := (m[k] ?? 0) + 1, honest that an absent key starts at zero. A plain write a[i] := v is total; a[i] += e reads first, so it is ! {Fail}. Writes rebind the underlying var and rewrite the cell in place when it is uniquely owned (FBIP, Section 10); nested grid[i][j] := v composes the same way. a[i] := v requires a to be an assignable var.

8.6 Optic Paths

Prism has no optic library: no Lens type, no over/set/toListOf to compose, no profunctor encodings. It has one rule instead. Between the | and the operator of a record update (Section 8.2), or inside s.[ ... ], a path is a sequence of steps read left to right. The path is the optic, spelled at the use site rather than reified as a value. Every form is sugar over map/with/match, so in-place reuse and fusion come for free and nothing new reaches the core: this is the language’s “effects instead of monads” stance applied to optics, paths instead of optic combinators.

A step is one of:

StepMeaning
.fielddescend into a record field
eachtraverse every element of a functor (lowers to fmap)
[i]focus one element of a list or array, by index
?Ctorfocus through a sum constructor; others pass through
(steps where p)keep only the foci satisfying the predicate p

End a path with = v to set the focus or ~ f to modify it (apply f); wrap a path in s.[ path ] to read every focus it selects into a list. each is a reserved keyword; the other steps reuse existing tokens.

Each form lowers to ordinary code, shown here over type Player = Player { name: String, pos: Vec2, hp: Int, bag: List(Int) } and type Shape = Circle { radius: Int } | Square { side: Int }. A field sets through the derived setter, { p | hp = 100 } to with_hp(p, 100), and nests, { pl | pos.x = 30 } to with_pos(pl, with_x(pl.pos, 30)). Modify reads the focus, { p | hp ~ heal } to with_hp(p, heal(p.hp)). each fans out over any functor, { players | each.hp ~ heal } to fmap(\p -> with_hp(p, heal(p.hp)), players), and composes with descent, { world | party.each.pos.x = 0 }. An index focuses one element, { world | party[0].hp = 100 }, lowering through list_set (or in-place array_set); an out-of-range index leaves the container unchanged. A prism rebuilds a matched constructor and passes the others through, the prism law for update:

{ shape | ?Circle.radius ~ double }
  =>  match shape of
        Circle { radius = r } => Circle { radius = double(r) }
        other                 => other

A filter guards a traversal, { world | party.(each where alive).hp ~ heal } applying the rest only to the foci alive keeps and passing the rest through. The whole vocabulary composes in one path:

{ world | party.(each where alive).bag.each.count ~ \(n) -> n + 5 }
  =>  with_party(world,
        fmap(\p -> if alive(p)
                   then with_bag(p, fmap(\it -> with_count(it, it.count + 5), p.bag))
                   else p,
             world.party))

The read form s.[ path ] collects every focus a path selects into a list, the read twin of the update: players.[each.hp] is the list of each hp, each flat-maps so world.[party.each.bag.each.count] flattens, ?Ctor previews zero or one focus, and a single-focus path yields a one-element list.

Paths are deliberately use-site syntax, not first-class values: there is no Optic type, no passing an optic to a function, no library of named composable optics, and optic kinds are not tracked in the type system (that a read-only path is read-only is a structural fact of the desugaring, not a typed law). This is the explicit trade: paths cover the great majority of real optic use and give up abstracting over which optic. The mental model is one breath: steps read left to right, = v/~ f to write, s.[ ... ] to read, nothing escaping into a new core construct.

-- A tour of optics by path. A path is a sequence of steps between `|` and the
-- operator (or inside `s.[ ... ]`): `.field` descends, `each` traverses a
-- functor, `[i]` indexes, `?Ctor` focuses a constructor, `(each where p)`
-- filters. End with `= v` to set or `~ f` to modify; wrap in `s.[ path ]` to
-- read every focus into a list. Every form is sugar over `map`/`with`/`match`,
-- so in-place reuse and fusion come for free; nothing reaches a new core form.
--
-- Expected output:
--   Vec2(1, 4)
--   Player(hero, Vec2(7, 4), 30, [])
--   [Player(a, Vec2(0, 0), 15, []), Player(b, Vec2(0, 0), 10, [])]
--   World([Player(a, Vec2(1, 2), 100, []), Player(b, Vec2(1, 2), 0, [])], 1)
--   Circle(20)
--   Square(3)
--   [Player(a, Vec2(1, 2), 15, []), Player(b, Vec2(1, 2), 0, [])]
--   [5, 0]
--   [5]

type Vec2 = Vec2 { x: Int, y: Int }

type Player = Player { name: String, pos: Vec2, hp: Int, bag: List(Int) }

type World = World { party: List(Player), turn: Int }

type Shape = Circle { radius: Int } | Square { side: Int }

fn alive(p : Player) : Bool = p.hp > 0

fn heal(h : Int) : Int = h + 10

fn mk(n : String, h : Int) : Player =
  Player { name = n, pos = Vec2 { x = 1, y = 2 }, hp = h, bag = Nil }

fn main() =
  -- Set and modify a (possibly nested) field.
  let v = Vec2 { x = 3, y = 4 }
  println({ v | x ~ \(n) -> n - 2 })
  let hero = Player { name = "hero", pos = v, hp = 20, bag = Nil }
  println({ hero | pos.x = 7, hp ~ heal })
  -- Traverse with `each`, then descend; modify and set per element.
  let party = Cons(mk("a", 5), Cons(mk("b", 0), Nil))
  println({ party | each.hp ~ heal, each.pos = Vec2 { x = 0, y = 0 } })
  -- A field, an index, then a field, mixed with a plain field.
  let world = World { party = party, turn = 1 }
  println({ world | party[0].hp = 100 })
  -- A prism focuses one constructor and passes the others through.
  println({ Circle { radius = 10 } | ?Circle.radius ~ \(r) -> r * 2 })
  println({ Square { side = 3 } | ?Circle.radius = 0 })
  -- Filter a traversal: only the living are healed.
  println({ party | (each where alive).hp ~ heal })
  -- Read every focus a path selects into a list.
  println(party.[each.hp])
  println(party.[(each where alive).hp])

9. Patterns

Patterns appear in match arms, let bindings, lambda and function parameters, and catch arms; their grammar is the pattern nonterminal of Section 4. A constructor pattern destructures a value of the algebraic data type that built it (Section 5.6), binding its fields; literal, tuple, wildcard, and record patterns match the remaining forms.

A match arm may carry a guard, pat if cond => body; when the guard is false, control falls through to the next arm. Matches are checked for exhaustiveness and redundancy by the usefulness algorithm of Maranget (2007): an unreachable arm is an error, and a non-exhaustive match is an error naming a missing pattern. A guarded arm does not count toward exhaustiveness, since its guard may fail at run time.

-- Pattern guards: `pat if cond => body`. The guard sees the pattern's
-- variables; when it is false the value falls through to the next arm.
-- Coverage counts only unguarded arms toward exhaustiveness, so a match
-- covered solely by guarded arms is rejected at compile time.

-- The classic: literal-free fizzbuzz, first true guard wins.
fn fizzbuzz(n : Int) : String =
  match n of
    k if k % 15 == 0 => "FizzBuzz"
    k if k % 3 == 0 => "Fizz"
    k if k % 5 == 0 => "Buzz"
    k => show(k)

-- Guarded insertion keeps the list sorted. The catchall handles both Nil and
-- the spot where x belongs.
fn insert(x : Int, xs : List(Int)) : List(Int) =
  match xs of
    Cons(h, t) if h < x => Cons(h, insert(x, t))
    _ => Cons(x, xs)

fn isort(xs : List(Int)) : List(Int) = foldr(insert, [], xs)

-- Expected output:
--   1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz
--   [1, 2, 5, 7, 9]
fn main() =
  println(str_join(" ", map(fizzbuzz, range(1, 16))))
  println(show(isort([5, 2, 9, 1, 7])))

A pattern N(x) for T = view ... make ... declaration defines a bidirectional pattern synonym: in match position it runs view and succeeds when that returns Some (the present case of Option, Section 12); in expression position it runs make. Here view and make are contextual keywords, significant only inside a pattern declaration. A synonym with both halves is a prism (a composable view-and-build pair); one with only view is a view pattern.

type Vec2 = Vec2 { x: Int, y: Int }

-- a bidirectional pattern synonym: matches when y == 0, binding x
pattern OnXAxis(x) for Vec2 =
  view \(v) -> if v.y == 0 then Some(v.x) else None
  make \(x) -> Vec2 { x = x, y = 0 }

fn describe(v) =
  match v of
    OnXAxis(x) => x
    _ => 0 - 1

fn main() =
  println(describe(Vec2 { x = 5, y = 0 }))
  println(describe(Vec2 { x = 5, y = 3 }))

10. Declarations and Programs

A function is declared with fn; a parameter may carry a type annotation, a default value := e, or the borrow modifier, which lets a pure function read a parameter without taking ownership of it. A return annotation is written : !{R} T for result type T and effect row R, : ! T for an explicit empty row, or : T to leave the row inferred. A parameter with a default may be omitted, and any argument may be passed by name as f(p = e); the call is rewritten to positional form, filling omitted defaults. Defaults and named arguments are honored on top-level functions. A top-level let is a constant: its references are inlined. A where block attaches non-recursive, lexically scoped definitions to a function body.

A function may be annotated fip or fbip to assert the fully-in-place discipline of Lorenzen et al. (2023). fbip proves the body allocates no fresh cell and calls only annotated, allocation-free functions. fip additionally proves linearity (each owned, non-immediate binding is consumed at most once) and bounded stack (each recursive call in the group is a tail call or a single tail-modulo-cons or tail-modulo-add). These are static checks that reject a non-conforming body; the mechanism is described in Compiler.

-- FP^2 in-place list operations (Lorenzen/Leijen/Swierstra, ICFP 2023),
-- statically checked. The annotation makes the compiler PROVE the function
-- allocates no fresh cell: every `Cons` it builds reuses one it just matched and
-- dropped. Run with PRISM_REUSE_STATS=1 to watch the reuse hits, or
-- PRISM_CHECK_LEAKS=1 to confirm zero live cells at exit.
--
-- `rev_onto` and `bump` are `fip`: linear (each binding used once) AND bounded
-- stack. `rev_onto` is a plain tail call and `bump` is a tail-modulo-constructor
-- (`Cons(.., bump(t))`), so both lower to a constant-stack loop, not recursion.
-- `cap_at` is only `fbip`: its `h` is read by the guard AND rebuilt into the
-- `Cons`, two uses of one value, so it is not linear (its element type is generic
-- `a`, so `h` cannot be assumed an immediate). Zero-allocation still holds.
--
-- Prints:
--   10
--   6
--   22

-- Reverse onto an accumulator, the canonical fip. Each input `Cons` is matched,
-- freed, and immediately reused as the next accumulator cell, so the reversal
-- runs in place with zero allocation, and the tail call makes it a loop.
fip fn rev_onto(xs, acc) =
  match xs of
    Nil => acc
    Cons(h, t) => rev_onto(t, Cons(h, acc))

-- The seeding wrapper is not fbip: that initial `Nil` is a genuine fresh
-- allocation with no cell to reuse, so it stays un-annotated.
fn reverse_ip(xs) = rev_onto(xs, Nil)

-- Spine-rebuilding map: drop each `Cons`, rebuild it around the bumped head.
fip fn bump(xs) =
  match xs of
    Nil => Nil
    Cons(h, t) => Cons(h + 1, bump(t))

-- Saturating transform: rebuild every cell, head chosen by a test. Both `if`
-- branches end in a `Cons` reusing the same freed cell, so the reuse credit is
-- spent exactly once on every path. (A `filter` that DROPS cells could not be
-- fbip: the discard path frees a cell with no local allocation to reuse, and
-- this runtime has no cross-call reuse credit, so it would allocate.)
fbip fn cap_at(xs) =
  match xs of
    Nil => Nil
    Cons(h, t) =>
      if h > 9 then
        Cons(9, cap_at(t))
      else
        Cons(h, cap_at(t))

fn main() =
  println(sum(reverse_ip([1, 2, 3, 4])))
  println(sum(bump([1, 1, 1])))
  println(sum(cap_at([1, 20, 3, 40])))

11. Modules

A file is a module and a directory is a namespace prefix: import Data.Map loads Data/Map.pr. A project is a prism.toml manifest plus a source tree, resolved from the project root. A single-file program is one module.

import M brings M’s exports into scope under qualified names; import M (a, b) also brings a and b into bare scope; import M as N adds the alias N. The pub modifier on a declaration makes it visible to importers; pub import M (x) re-exports x through the importing module. An opaque type exports its name but not its constructors.

Name resolution rewrites every top-level definition to a canonical, module-qualified symbol (an export as Data.Map.insert, a private as the unforgeable, source-unwritable Data.Map@helper) and merges all modules into one program keyed by those symbols. Because identity is the canonical symbol, two modules may export the same short name and coexist. This is namespacing, not separate compilation: there are no per-module artifacts, and changing one module recompiles the whole program.

Instances are global, but each records its defining module. An orphan instance (defined apart from both its class and its head type) and instances that overlap across modules are reported as warnings; an ambiguity names each candidate’s module.

12. The Standard Prelude

The prelude in lib/prelude.pr is in scope in every module. It is ordinary Prism, not built-in. Its contents, by category:

  • Data types. Option(a), Result(a, e), List(a), the balanced-tree Map(k, v), and the hash table HashMap(v). A set is a Map(k, Unit), with a set_* API rather than a distinct type.
  • List combinators. map, filter, foldl, foldr, length, append, reverse, zip/unzip, take/drop, sort, range, and the rest of the usual vocabulary.
  • Option and Result combinators. map_option, and_then, unwrap_or, map_result, and_then_result, result_or, and conversions between the two.
  • The class tower. Eq, Ord (with Eq as superclass), Show, Pow (exponentiation ^, with Int and Float instances), and the higher-kinded Functor, Applicative, Monad, Foldable, Traversable, with instances for List and Option.
  • Strings and characters. Classifiers (is_digit, is_alpha, …), case mapping, starts_with/ends_with/contains/index_of, split, trim, chars, and single-allocation joining.
  • Arrays and maps. The growable Array(a) API (array_new, array_get, array_set, array_push, with in-place update on unique ownership), the AVL Map and set_* API, and the HashMap API over string keys.
  • Streams. The Emit(a) effect and the producer/transformer/consumer combinators srange, sof, smap, skeep, stake, sfold, ssum, scollect, which fuse without intermediate collections.
  • Numerics and failure. Fixed-width i64_*/u64_* operations, common math, and the failure helpers guard, optional, succeeds, default.

The Prism Compiler

This document describes the prism compiler, from source text to native binary across its three backends.

1. Architecture

Compilation is a pipeline from source text to a native binary. Each phase is a total function over the program, and there are no per-module artifacts.

PhaseRoleOwner
Lextext to tokens, then layoutsrc/lex/
Parsetokens to surface ASTsrc/parse/, src/syntax/grammar.lalrpop
Resolveload imports, canonicalize names, mergesrc/resolve/
Desugarsurface sugar to core surfacesrc/syntax/desugar/
Checktype and effect inferencesrc/tc/
Elaboratesurface to CBPV / ANF core (match compilation, Section 8)src/core/elaborate/
Effect lowerremove handlers and operationssrc/core/effect_lower/
Reference countinsert dup/drop, then reusesrc/core/fbip.rs
Codegencore to interpreter, LLVM, or MLIRsrc/eval/, src/codegen/

The driver (src/driver/) exposes these as subcommands: a bare prism <file.pr> compiles a single file to a native binary named after the source (override with -o), prism build compiles the enclosing project (the nearest prism.toml) and fails outside one, prism run interprets, prism check runs the front end only, prism fmt formats, and prism dump <phase> prints an intermediate form, where <phase> is tokens, ast, types, core, core-json (the core as a JSON tree the Lean model reads, Section 13), core-hash (a content-addressed hash of each definition’s elaborated core, src/core/hash.rs), fbip (core after reference-count insertion and reuse), lowered (after effect lowering), llvm, or mlir (the last gated on the MLIR backend feature).

2. Lexing and Layout

The lexer produces a token stream and trivia (comments and spacing) that the formatter uses to reproduce source faithfully. An interpolated string is lexed by re-lexing each { ... } hole at its absolute source offset, so spans inside holes remain accurate. A layout pass then rewrites the stream, inserting virtual block-open, block-close, and separator tokens according to the offside rule of Spec 3.6, which the grammar consumes as ordinary terminals.

3. Parsing

The grammar is an LALR(1) grammar in LALRPOP (src/syntax/grammar.lalrpop), with two entry points: a whole program, and a single expression for the REPL. Parsing produces the surface AST of src/syntax/ast.rs. Type and parse errors are rendered with a source caret.

4. Name Resolution and Modules

Resolution loads every transitively imported module, rewrites each top-level definition to a globally unique canonical symbol (an export as Data.Map.insert, a private as Data.Map@helper), resolves qualified and re-exported references to those symbols, and merges all modules into one flat program. This is a whole-program renamer: the entire program is checked and compiled from source on every build. The canonical-symbol scheme makes the merge sound, since two modules can export the same short name without collision.

Moving to incremental, per-module compilation is planned but not implemented; the interprocedural analyses below (effect lowering, borrow signatures, instance coherence) are what make it nontrivial, since each crosses module boundaries.

5. Desugaring

Desugaring rewrites surface sugar into the smaller core-surface language the checker and elaborator handle (src/syntax/desugar/), each rule shown as surface form and the form it lowers to.

Function composition lowers to a lambda, kept as sugar only so the operator survives formatting.

fn inc(x) = x + 1

fn dbl(x) = x * 2

-- forward composition: inc, then dbl
fn main() = println((inc >> dbl)(10))

An arithmetic sequence lowers to a prelude enumeration call.

fn main() =
  println(sum([1..5]))
  println(sum([1, 3..9]))

A list comprehension (and the statement for) lowers to a stream (a producer performing the Emit effect, Section 9) that emits each surviving element, collected with scollect (a stream consumer that gathers the emissions into a list), so it fuses with no intermediate list.

fn main() = println(sum([x * x for x in srange(1, 6)]))

A record update rebuilds the constructor along the named fields; on a uniquely owned value the rebuild is the in-place write of Section 10.

type Vec2 = Vec2 { x: Int, y: Int }

fn main() =
  let v = Vec2 { x = 1, y = 2 }
  let w = Vec2 { ..v, x = 7 }
  println(w.x)
  println(w.y)

deriving (Lens) synthesizes a getter and a functional setter per field.

-- `deriving (Lens)` synthesizes a getter `<f>_of` and a functional setter
-- `with_<f>` per field. They are ordinary functions, no optic types needed.
-- On a uniquely owned value the setter is FBIP-reused.
--
-- Expected output:
-- 3
-- 7
-- 9
-- 4
type Vec2 = Vec2 { x: Int, y: Int } deriving (Lens)

fn main() : !{IO} Unit =
  let v = Vec2 { x = 3, y = 4 }
  println(x_of(v))
  let v2 = with_x(v, 7)
  println(x_of(v2))
  let v3 = with_y(with_x(v, 9), 4)
  println(v3.x)
  println(v3.y)

The failure fallback a ?? b runs a under a Fail handler that yields b if a fails.

-- at_list fails past the end of the list; ?? supplies the fallback
fn main() = println(at_list([10, 20, 30], 5) ?? 99)

A method call e.m(args) is uniform-function-call sugar: the receiver becomes the first argument.

fn inc(x) = x + 1

fn double(x) = x * 2

-- a dot-chain reads left to right: inc(5), then double(...)
fn main() = println(5.inc().double())

A string with interpolation holes becomes a concatenation of its literal pieces and the show of each hole.

fn main() =
  let name = "Prism"
  let n = 42
  println("hello {name}, n = {n}")

try/catch/throw is subtractive handler sugar: one nested final ctl clause (the non-resumable handler clause of Spec 7.2) per arm, each discharging one error label.

error NotFound(String)

fn lookup(k : String) : !{NotFound} Int =
  if k == "a" then
    1
  else
    throw NotFound(k)

fn main() =
  println(try lookup("a") catch { NotFound(k) => 0 })
  println(try lookup("z") catch { NotFound(k) => 0 })

transact body else fallback snapshots every live var, runs the body under a Fail handler, and restores the snapshots on failure, so a failed attempt leaves observable state unchanged.

-- Transactional rollback. A purchase that fails on insufficient funds leaves
-- balance and stock untouched, as if it never happened. The first buy succeeds.
-- The second overspends, so `guard` fails inside the `transact` and both vars
-- roll back to their pre-attempt values.
fn main() =
  var balance := 100
  var stock := 5
  let r1 =
    transact
      balance -= 40
      stock -= 1
      guard(balance >= 0)
      1
    else
      0
  println(balance)
  println(stock)
  println(r1)
  let r2 =
    transact
      balance -= 90
      stock -= 1
      guard(balance >= 0)
      1
    else
      0
  println(balance)
  println(stock)
  println(r2)

Optional chaining a?.b is force(a).b, where force raises fail() on None, so a path short-circuits at the first None and an enclosing ?? supplies the default.

-- The `??` operator runs its left side in a `Fail` context and falls back to
-- the right on failure. It reads as Verse's `or`. The failable accessors
-- at_list / at_map / force raise `fail()` on a miss, so lookups default
-- cleanly. They chain: `a ?? b ?? c` nests right, each `??` its own handler.
-- Optional chaining `a?.b` is `force(a).b`, so a path short-circuits on the
-- first `None` and defaults with `??`. Expected output:
--
--   10
--   -1
--   80
--   8080
--   443
--   7
--   99
--   42
--   -1
--   -1

type City = City { zip: Int }

type Addr = Addr { city: Option(City) }

fn main() =
  let xs = [10, 20, 30]
  println(xs.at_list(0) ?? 0 - 1)
  println(xs.at_list(9) ?? 0 - 1)
  let cfg = map_from_list([("host", 80)])
  println(cfg.at_map("host") ?? 8080)
  println(cfg.at_map("port") ?? 8080)
  println(cfg.at_map("port") ?? cfg.at_map("https") ?? 443)
  println(force(Some(7)) ?? 0)
  println(force(None) ?? force(None) ?? 99)
  -- Optional chaining: the whole path short-circuits if any link is `None`.
  let full = Some(Addr { city = Some(City { zip = 42 }) })
  println(full?.city?.zip ?? 0 - 1)
  let middle = Some(Addr { city = None })
  println(middle?.city?.zip ?? 0 - 1)
  let outer = (None : Option(Addr))
  println(outer?.city?.zip ?? 0 - 1)

A with f <- handler { .. } block binds a first-class handler instance over a fresh private effect; f.op(..) targets it by name.

effect Read {
  ctl read() : String
}

-- `with conf <- handler { .. }` binds a first-class handler instance;
-- `conf.read()` dispatches to it by name.
fn main() =
  with conf <- handler
    read(k) => k("conf.toml")
    return r => r
  println(conf.read())

A trailing block argument is appended as the call’s last argument.

fn twice(f : (Int) -> Int) = f(f(0))

-- the trailing `fn (x) ...` block becomes twice's last argument
fn main() =
  let r = twice(\(x) -> x + 10)
  println(r)

A bidirectional pattern synonym desugars to a view call in match position and a make call in expression position.

type Vec2 = Vec2 { x: Int, y: Int }

-- a bidirectional pattern synonym: matches when y == 0, binding x
pattern OnXAxis(x) for Vec2 =
  view \(v) -> if v.y == 0 then Some(v.x) else None
  make \(x) -> Vec2 { x = x, y = 0 }

fn describe(v) =
  match v of
    OnXAxis(x) => x
    _ => 0 - 1

fn main() =
  println(describe(Vec2 { x = 5, y = 0 }))
  println(describe(Vec2 { x = 5, y = 3 }))

A nested path update rebuilds the single-constructor spine (the chain of nested constructor cells) along the path.

type Vec2 = Vec2 { x: Int, y: Int }

type Line = Line { from: Vec2, to: Vec2 }

fn main() =
  let l = Line { from = Vec2 { x = 0, y = 0 }, to = Vec2 { x = 1, y = 1 } }
  let l2 = { l | from.x = 9 }
  println(l2.from.x)

deriving (Eq, Ord, Show) generates one structural instance per class.

type Color = Red | Green | Blue deriving (Eq, Ord, Show)

fn main() =
  println(show(Green))
  println(Red == Red)

The postfix e? unwraps Ok and short-circuits on Err.

fn half(n) =
  if even(n) then
    Ok(n / 2)
  else
    Err("odd")

-- `half(n)?` unwraps Ok and short-circuits the chain on the first Err.
fn quarter(n) =
  let h = half(n)?
  let q = half(h)?
  Ok(q)

fn main() = println(result_or(0 - 1, quarter(8)))

The var desugaring is shown with full Source / Desugared / Core stage tabs in Spec 7.4; default and named arguments lower to positional ones in the same pass.

6. Type and Effect Inference

Type inference is the bidirectional, higher-rank algorithm of Dunfield & Krishnaswami (2013) (src/tc/); the surface rules are in Spec 5. Type classes elaborate to dictionary-passing: a constraint becomes a hidden parameter, resolved to a global instance, a passed dictionary, or a projection of a superclass dictionary.

Instances are global, but each records its defining module, so coherence is checked by provenance (src/tc/classes.rs). Resolution is coherent: for each (class, type-head) there is exactly one canonical instance, and implicit resolution always selects it. A single instance for a head is canonical automatically. When two or more instances share a head, one must be designated with a top-level canonical Class(Head) = name declaration (Spec 6.1). An undesignated overlap is a hard error reported at definition, naming the candidates and their modules, with a source caret when they point into the program being compiled. An orphan instance (defined apart from both its class and its head type) is reported as a warning. An explicit override is written at the use site as a trailing using argument, f(args, using name), which changes nothing else’s resolution.

Indexing (a[i], a[i] := v) is resolved the same way the Show and ^ lowerings are, by type-directed dispatch at elaboration: the checker records each sub-expression’s type in a span-keyed table, and the elaborator reads the receiver’s head type back and emits the matching builtin or accessor (Array to at_array/array_set, HashMap to at_hashmap/hm_insert, String to at_byte, List to at_list). A receiver whose type is still an unsolved existential when first synthesized (a var indexed before its initializer fixes its state type) defers to one pass at the end of the declaration, after the initializer has constrained it. No class or type-system extension is involved, so concrete indexing ships today; the desugar target is index/index_set, leaving room for a user-extensible Index class later.

Effect-row inference is principal: each declaration infers its most general row from its body alone. The row unifier in src/tc/subsume.rs discovers every label on its own (a row is a function’s effect set; Spec 5) from direct performs, applied effect-carrying callees, builtin rows, and mask. At a call it adds the callee’s row to the caller’s ambient row (the effect set accumulated for the body so far), and a handler removes the operations it discharges. The row is the single source of truth: there is no separate set-pass seed and no subset reconciliation against one.

A syntactic set-pass (a pass that computes a set of operation labels by a call-graph fixpoint, src/types/effects.rs) still runs, but only to feed the syntactic purity checks: it confirms a konst declaration and a declared-pure instance method perform nothing. It no longer seeds the row. After lowering, reconcile_effects checks the operations the lowered code actually performs against the inferred row, and the interpreter parity oracle (Section 13) is the final backstop. Effect lowering computes its own per-function latent operation set by an independent call-graph fixpoint (Section 9), so the two phases no longer share the set-pass result.

7. The Core Calculus

Elaboration lowers the surface language to a call-by-push-value core (Levy, 2004; src/core/cbpv.rs) in A-normal form. CBPV separates values, which are inert, from computations, which can be run; Thunk freezes a computation into a value and Force runs it. A-normal form names every intermediate result with a Bind, making evaluation order explicit and each operation and allocation syntactically distinguished, enabling the later effect and reference-counting passes. The grammar below is the elaborated core; the reference-count pass (Section 10) later adds dup, drop, and reuse nodes to it.

value  ->  Var sym
        |  Int i | I64 i | U64 i | Float f | Bool b | Unit | Str s
        |  Ctor sym tag [ value, ... ]
        |  Tuple [ value, ... ]
        |  Thunk comp                         -- suspend a computation as a value

comp   ->  Return value                       -- produce a value
        |  Bind comp sym comp                  -- run, name the result, continue
        |  Force value                         -- run a thunk
        |  Lam [ sym, ... ] comp
        |  App comp [ value, ... ]
        |  Call sym [ value, ... ]             -- direct call to a top-level fn
        |  If value comp comp
        |  Case value [ (pat, comp), ... ]
        |  Prim op value value                 -- primitive arithmetic / compare
        |  Do sym [ value, ... ]               -- perform an effect operation
        |  Handle { body, return_var, return_body, ops }   -- install a handler
        |  Mask [ sym, ... ] comp              -- bypass matching handlers

For example, a constructor applied to a call elaborates so the call is named before the constructor is built:

-- surface
fn f(y) = Cons(g(y), Nil)

-- core (every intermediate result is named by a Bind; arguments are values)
Lam [y]
  (Bind (Call g [Var y]) x
        (Return (Ctor Cons 1 [Var x, Ctor Nil 0 []])))

This calculus is modeled in Lean 4 (de Moura & Ullrich, 2021): models/Prism.lean mirrors the core one variant at a time with a substitution small-step relation, on top of which the model adds an executable abstract machine that mirrors the interpreter and is proved to agree with it. Section 13 describes the model and how it is run as a third differential oracle.

8. Pattern-Match Compilation

A match is compiled to a decision tree (src/core/elaborate/match_compile.rs). The arms form a matrix whose rows are arms and columns are argument positions. The compiler selects a column, partitions the arms by the head of that column’s patterns, and emits a test: a Case on the constructor tag of the scrutinee (the value being matched) for a constructor column, or a chain of equality tests for a scalar column. Wildcard rows form a default sub-matrix shared by the branches that fall through. A guarded arm compiles to a conditional that re-enters the remaining arms when the guard fails. Exhaustiveness, proven by the checker (Spec 9), guarantees every scrutinee reaches an arm.

9. Effect Lowering

Effect lowering compiles away the Handle, Do, and Mask nodes of the core. An operation is delimited control (an effect suspended and resumed within a handler’s scope): Handle is the delimiter, and the resumption k is the continuation captured between a perform site and its handler (Spec 7). The three strategies below are three compilations of that one mechanism, differing in how much of k they make manifest, from nothing to a heap-allocated tree. The compiler tries them in that order and takes the first that applies, so it reifies as little of the continuation as the program allows; a check then confirms no effect construct survives.

Two erasure pre-passes run before the strategy cascade, each recognizing a statically fixed handler shape and rewriting it to direct code, leaving everything else for the strategies. Var erasure (src/core/effect_lower/erase_var.rs) rewrites an escape-checked local var (a closed two-operation State handler, Spec 7.4) to a mutable cell: get becomes a cell read, set a cell write, and the block is wrapped in a fresh-cell allocation. It is sound exactly because the escape analysis proved the var’s continuation is never resumed more than once, so the shared cell and pure-state copies agree; a multishot use disables it. Control erasure (src/core/effect_lower/erase_control.rs) rewrites the internal break/continue/return effects (Spec 8.3), whose final ctl handlers have fixed templates, back to direct control flow. It runs after var erasure, so a pure imperative loop has lost all of its effect operations by the time the cascade classifies it and falls into the trivial pure path (no effect constructs remain), compiling to a musttail loop with no per-iteration allocation.

Evidence passing is the fast path for tail-resumptive handlers (every clause calls k exactly once, in tail position, so the continuation need never be captured at all). Each operation is assigned a stable numeric id by sorting the operation names, and a call-graph fixpoint computes each function’s latent set, the operations still performed anywhere in its call-graph closure. An effectful function then gains one extra parameter per latent operation, ev@<id>, a thunk holding the active handler clause. Performing an operation forces its evidence thunk directly; a handle binds fresh evidence for its body’s latent operations; and every call site appends the callee’s evidence, in ascending id order, so the convention is positional and stable. A first-class thunk that escapes carries evidence parameters for its own latent operations, threaded at each force site. No continuation is reified and no per-operation cell is allocated. What evidence to thread where is computed by an interprocedural least-fixpoint flow analysis (src/core/effect_lower/flow.rs) that derives, for every function, the operation signature of the thunk it returns and of each thunk-valued parameter.

State threading and stream fusion is the path for a uniform single-operation handler, the shape a stream consumer takes: a handler that folds every emit into an accumulator. Such a handler clause is rewritten to an accumulator transformer \acc -> acc', and the producer it wraps becomes a loop that threads the accumulator through each emission instead of allocating a value per step. A consumer that can stop early, like stake, returns a two-state tag (continue or done) that the producer checks, so the loop exits without unwinding. This reifies one small tag cell per early-terminating handler and, like evidence passing, no free-monad cell, so a smap/skeep/stake/ssum pipeline allocates neither an intermediate list nor a per-operation cell.

-- Streams as effects: a stream is a producer performing Emit(a), the
-- transformers smap/skeep/stake are handlers that re-emit, and the consumers
-- ssum/scollect/for are handlers that fold. A dot chain nests the handlers over
-- one producer, with no intermediate lists. stake stops the source early by
-- dropping its continuation. Expected output:
--
--   220
--   lo
--   hi
--   0
--   3
--   6

fn square(n) = n * n

fn main() =
  println(srange(1, 1000).smap(square).skeep(even).stake(5).ssum())
  for w in sof(["lo", "hi"]) do
    println(w)
  let xs = srange(0, 1000000).smap(\(n) -> n * 3).stake(3).scollect()
  for x in sof(xs) do
    println(x)

The free-monad fallback applies when an effect escapes static tracking: buried in data, dynamically applied, masked, genuinely multishot (a clause that resumes k more than once), or self-referential (a handler whose own body performs the effect it handles). A multishot handler forces this path because the two fast paths erase k, and a continuation invoked more than once must exist as a reusable value. Here the delimited continuation is reified in full: each computation becomes a tree of EPure and EOp cells threaded by ebind (shown below), and k is an explicit closure field of each EOp, so a clause can hold it, drop it, or apply it repeatedly. A handle becomes a generated driver function that case-dispatches the reified tree: an EPure runs the return clause, an EOp whose id the handler names and whose skip count is zero runs the matching clause, and any other EOp is re-emitted outward with a re-entry continuation, which is how an inner handler forwards an operation it does not catch. An EOp carries a skip field, its mask depth, the number of matching handlers it must still bypass; a mask driver increments it and the handler driver only fires when it is zero. This is exactly the interpreter’s dispatch (Section 11), so the two agree by construction. Each EOp allocation bumps the PRISM_EFFOP_STATS counter, so the fallback’s cost is observable, and a default-on warning (silenceable with PRISM_QUIET) names the functions that lost fusion and the cause when a program takes this path, so a pipeline meant to stay fused can be steered back. The generated drivers are closed by construction: a per-handler driver takes exactly its clauses’ captured free variables as parameters, and the fixed-binder templates (ebind, the mask drivers) use a reserved binder band and never nest, so a binder cannot capture a free occurrence. Lowering is local: when an effectful thunk escapes, only the connected component entangled with it (closed over the call graph, but leaving pure closure-inert helpers shared, and over shared operations) is converted to the free-monad form, while unrelated functions stay on their fused paths, provided the component’s operations are disjoint from the rest; when the split is not clean lowering falls back to converting every effectful function together. A convention-boundary check, run in both modes, validates the split and turns a missed monadic/direct boundary into a compile-time internal error.

Constant-stack driving changes how a closed handler on this fallback is run, not what it reifies. By default such a handler is driven by a single self-tail-recursive loop, {n}@region, rather than a pair of mutually recursive driver functions: the loop case-dispatches the same EPure/EOp tree but re-enters itself by a musttail self-call on the resumed continuation, so an iterative or deeply nested resumption runs in constant native stack where the mutually recursive driver grew it per step. Two clause shapes qualify. A tail-resumptive clause (every resume is the head of a tail application) re-drives the operation’s continuation queue with qApply. A function-answer state clause, the parameter-passing pattern whose answer is a function S -> A (rd(u, r) => \s -> r(s)(s), wr(v, r) => \s -> r(())(v)) applied once at the handler’s use site, threads the state in an accumulator parameter and folds that use-site application into the loop’s entry, so the pending-apply chain that would otherwise grow the stack per iteration lives in the accumulator instead. The reification is unchanged, so the per-operation EOp cost stays and the only zero-cell routes remain the evidence and state paths above; the gain is purely that a parameter-passing loop no longer overflows (the bounded-stack performance gate pins a million-iteration State loop completing in a 2 MB stack). An open handler, a multishot or escaping resume, or any clause outside these shapes keeps the mutually recursive driver, whose qApply the loop reuses, so the free-monad machinery is the substrate it drives rather than a thing it replaces. This is on by default and reverts under PRISM_NATIVE_EFFECTS=0; the interpreter oracle’s whole-corpus parity holds byte-for-byte either way.

-- Reified computation tree (the free-monad fallback)
EPure v                      -- a finished result
EOp id skip arg k            -- a suspended operation:
                             --   id   : numeric identity of the operation
                             --   skip : matching handlers still to bypass (mask depth)
                             --   arg  : the operation argument
                             --   k    : resumption closure \resume_value -> Eff

-- Sequencing extends the tree without forcing it:
ebind (EPure v)          f  =  f v
ebind (EOp id skip a k)  f  =  EOp id skip a (\r -> ebind (k r) f)

The example below exercises this path: an inner handler catches Log and forwards raise outward to an Exn handler, the two effects interleaving across the nesting.

effect Log {
  ctl log(Int) : Unit
}

effect Exn {
  ctl raise(Int) : Int
}

-- Cross-effect forwarding: an inner handler re-emits what it does not catch.
--
-- work() performs both Log and Exn. The inner handler interprets Log (prints
-- and resumes) but knows nothing about raise. When raise fires inside the Log
-- handler, that handler forwards it outward to the enclosing Exn handler, which
-- discharges it. Resumption threads back through the inner handler, so the two
-- effects interleave across nested handlers. The native free-monad lowering
-- drives this the same way the interpreter does.

fn work() : !{Log, Exn} Int =
  log(1)
  log(2)
  let x = raise(0 - 7)
  log(x)
  x

fn run() =
  handle (handle work() with {
      log(n, k) => let _ = println(n) in k(()),
      return r => r
    }) with
    raise(code, k) => code
    return r => r

fn main() = println(run())

The fallback reifies one cell per pending operation, so its cost is proportional to the operations in flight; the fast paths avoid it where they apply.

10. Reference Counting and FBIP Reuse

Reference counting runs after effect lowering, over the handler-free core, so it counts evidence parameters and any reified cells as ordinary values. Memory is managed by Perceus-style reference counting (Reinking et al., 2021; src/core/fbip.rs): every parameter and binding is owned and consumed exactly once on every control-flow path from its binding to the end of its scope; a second use inserts a dup and an unused value inserts a drop. Perceus places these operations precisely rather than conservatively at scope exit, which frees a cell at the earliest point the reuse pass below can claim it. Closure captures are borrowed (read without being consumed) and duplicated before a consuming use, as is a borrow parameter (Spec 10). The parameters a function borrows are recorded as a per-function bit vector, its interprocedural borrow signature, which every caller consults to place its dup/drop correctly. Because that signature crosses call sites, it is one of the analyses that complicates the move to separate compilation (Section 4).

The reuse pass then turns drops into in-place updates. When a uniquely owned scrutinee is dropped and the continuation rebuilds a constructor of the same or smaller size, the drop becomes a scoped reuse node, WithReuse { token, freed, body }: it frees the cell once and binds a reuse token over the continuation, and the rebuild spends that token with an in-place Reuse(token, ctor), so map and tree rebuilds mutate the spine in place. The token is a binder that only a Reuse may name, and the rewrite spends it on every control path or declines wholesale (keeping the safe no-reuse body), so freeing a cell once and spending its token at exactly one allocation are well-formedness properties of the term rather than a condition checked afterward.

The fip/fbip annotations (Spec 10) are the fully-in-place discipline of Lorenzen et al. (2023), here static checks layered on these passes. fbip proves zero fresh allocation and a call-graph closure over annotated, allocation-free callees. fip adds two further properties: linearity (each owned binding is consumed at most once, checked on the source term, with scalars exempt because adjusting the count of an unboxed word costs nothing) and bounded stack. The tail-call and tail-modulo-cons (a tail call whose result is wrapped in one constructor) classification (src/core/tailrec.rs) is shared with codegen, so an accepted fip function always lowers to a loop; acceptance never outruns what the backend emits.

-- FP^2 in-place list operations (Lorenzen/Leijen/Swierstra, ICFP 2023),
-- statically checked. The annotation makes the compiler PROVE the function
-- allocates no fresh cell: every `Cons` it builds reuses one it just matched and
-- dropped. Run with PRISM_REUSE_STATS=1 to watch the reuse hits, or
-- PRISM_CHECK_LEAKS=1 to confirm zero live cells at exit.
--
-- `rev_onto` and `bump` are `fip`: linear (each binding used once) AND bounded
-- stack. `rev_onto` is a plain tail call and `bump` is a tail-modulo-constructor
-- (`Cons(.., bump(t))`), so both lower to a constant-stack loop, not recursion.
-- `cap_at` is only `fbip`: its `h` is read by the guard AND rebuilt into the
-- `Cons`, two uses of one value, so it is not linear (its element type is generic
-- `a`, so `h` cannot be assumed an immediate). Zero-allocation still holds.
--
-- Prints:
--   10
--   6
--   22

-- Reverse onto an accumulator, the canonical fip. Each input `Cons` is matched,
-- freed, and immediately reused as the next accumulator cell, so the reversal
-- runs in place with zero allocation, and the tail call makes it a loop.
fip fn rev_onto(xs, acc) =
  match xs of
    Nil => acc
    Cons(h, t) => rev_onto(t, Cons(h, acc))

-- The seeding wrapper is not fbip: that initial `Nil` is a genuine fresh
-- allocation with no cell to reuse, so it stays un-annotated.
fn reverse_ip(xs) = rev_onto(xs, Nil)

-- Spine-rebuilding map: drop each `Cons`, rebuild it around the bumped head.
fip fn bump(xs) =
  match xs of
    Nil => Nil
    Cons(h, t) => Cons(h + 1, bump(t))

-- Saturating transform: rebuild every cell, head chosen by a test. Both `if`
-- branches end in a `Cons` reusing the same freed cell, so the reuse credit is
-- spent exactly once on every path. (A `filter` that DROPS cells could not be
-- fbip: the discard path frees a cell with no local allocation to reuse, and
-- this runtime has no cross-call reuse credit, so it would allocate.)
fbip fn cap_at(xs) =
  match xs of
    Nil => Nil
    Cons(h, t) =>
      if h > 9 then
        Cons(9, cap_at(t))
      else
        Cons(h, cap_at(t))

fn main() =
  println(sum(reverse_ip([1, 2, 3, 4])))
  println(sum(bump([1, 1, 1])))
  println(sum(cap_at([1, 20, 3, 40])))

11. Backends

Prism has three backends over one core: a tree-walking interpreter that is the reference oracle, and two native backends that must match it byte for byte. The native backends share a single generic emitter, so the differences below are narrow.

11.1 The Interpreter

The tree-walking interpreter (src/eval/) is a flat CEK (control, environment, continuation-stack) machine. Pending work lives on an explicit heap stack of frames rather than the host call stack, so object-program recursion never overflows it. A frame is one of: Bind (await a result, then continue with the rest of a sequence), Args (await a function before applying it), Handle (an installed handler), Mask (a masking frame), and Restore (unwind a name binding; a Restore already on top marks tail position, which is where the machine recognizes a tail call).

This machine makes the delimited continuation of Spec 7 concrete: performing an operation searches the frame stack outward for a matching Handle, decrementing the skip count past masked frames, and the captured continuation is exactly the slice of frames between the do and that handler, the handler included. Resuming pushes a clone of that slice back onto the stack, so the same resumption can be pushed again, which makes k multishot. The free-monad backend reifies this same frame slice as the k closure of an EOp (Section 9); evidence passing never materializes it.

11.2 The Shared Emitter

Both native backends drive one generic emitter (src/codegen/emit.rs) behind an Isa trait that abstracts instruction emission, so they differ only in instruction spelling. The emitter owns case dispatch, constructor allocation and reuse, and tail-call lowering: a self-tail call of equal arity becomes a musttail loop, and a constructor- or accumulator-shaped tail call (one whose result feeds a constructor or an integer accumulator) becomes a destination-passing loop, one that writes its result into an address passed as a hidden parameter rather than returning it, using the same classification the fip check reads (Section 10).

11.3 LLVM

The LLVM backend (src/codegen/llvm.rs) implements Isa over inkwell, emitting LLVM IR that clang compiles and links against the runtime. This is the default native path.

11.4 MLIR

The MLIR backend (src/codegen/mlir.rs) implements the same Isa by writing textual MLIR in the llvm dialect. Sharing the emitter makes its output byte-identical to the LLVM backend’s, which the parity gate (Section 13) enforces.

11.5 WebAssembly

The compiler front end and the interpreter also compile to WebAssembly (src/wasm.rs), so Prism type-checks and runs in the browser. This target hosts the interpreter, not the native code generators; the LLVM and MLIR backends are absent there.

12. The Runtime

The C runtime (runtime/prism_rt.c) is linked with the code each backend emits. It assumes an LP64 target (64-bit pointers and long) and uses mimalloc when available. The data representation below is shared by the backends and the runtime.

12.1 Value Representation

Every value occupies one 64-bit word, tagged by its low bit so that a single representation serves both scalars and pointers under polymorphism.

A Prism value is one 64-bit word, distinguished by its low bit:

  ...xxx1   immediate: a small integer or other unboxed scalar (value in the
            high 63 bits; a small Int is (n << 1) | 1)
  00000000  unit (the zero word)
  ...xxx0   a pointer to a heap cell (word-aligned, so the low bit is clear)

prism_rc_inc / prism_rc_dec take the raw word: an immediate or unit is skipped
without a dereference, so dup and drop are no-ops on non-cell values under
polymorphism.

A float does not fit the immediate scheme, so it is boxed: wrapped in a one-field cell holding the raw double bits, which are read back out (unboxed) at every float operation. Boxing makes a float field self-describing, so the collector frees it without interpreting its payload.

12.2 Cell Layout

A heap cell is a three-word header followed by its fields.

offset  field      meaning
   0    refcount   number of live references to this cell
   8    tag        constructor tag; reserved values mark String and bignum cells
  16    arity      number of fields (or byte length for a String)
  24    fields     arity words, each a value or pointer (UTF-8 bytes for a String)

Constructor tags follow declaration order (for Option(a) = None | Some(a), None is 0 and Some is 1). Two tag values are reserved, 0x53545200 for a string and 0x42494700 for a bignum (Section 12.5), marking cells whose payload is raw bytes or limbs rather than child values; the collector and the reuse pass (Section 10) read the tag to avoid recursing into them.

12.3 Reference Counting

prism_rc_inc and prism_rc_dec take the raw value word and return immediately on an immediate or unit, so counting is a no-op on non-cell values. Decrement to a nonzero count just decrements. Decrement to zero frees the cell, but freeing is iterative, not recursive: the dead cell’s now-zero refcount word is reused as a link field in an intrusive worklist of cells pending free, so a structure of any depth is reclaimed in constant auxiliary space without growing the C stack. A string or bignum tag short-circuits the child traversal.

12.4 In-Place Reuse

The reuse pass of Section 10 emits two runtime calls. prism_reuse_token(v) inspects a cell about to be dropped: if it is uniquely owned (refcount 1), it drops the cell’s children and returns the shell as a token, leaving the live-cell count untouched; otherwise it decrements and returns null. prism_reuse_alloc(token, n) overwrites the token’s header for the new constructor when the token is non-null, and falls back to a fresh allocation when it is null. A uniquely owned spine is therefore mutated in place, and a shared one transparently copies.

12.5 Integers

A small integer is an immediate, (n << 1) | 1. An operation whose fixed-width result would overflow promotes to a bignum: a cell tagged 0x42494700 storing the value in sign-magnitude form (sign and magnitude kept separate). Its header word is a signed limb count whose sign is the value’s sign; the magnitude follows as that many little-endian u64 limbs (base-2^64 digits) with no leading zero limb. Zero is a count of zero with no limbs. Each surface arithmetic operation takes a fast path on two immediates with a checked-overflow primitive and falls back to magnitude routines (add, subtract, multiply, and a shift-subtract long division) that renormalize the result, demoting back to an immediate when it again fits. The surface Int is this unbounded integer. The I64 and U64 lanes are raw machine words and wrap rather than promote.

12.6 Strings

A string is a cell tagged 0x53545200 whose field words hold its UTF-8 bytes inline, length-prefixed by the arity word and NUL-terminated for C interop. Each string the program builds, including a literal at each use, is a counted cell, so the leak counter (Section 12.7) accounts for strings like any other allocation. Two indexing families coexist: char_at, substring, and str_len work in Unicode codepoints, walking the UTF-8 encoding (and so are O(n)), while byte_at and byte_len give O(1) raw-byte access for a scanner or hash.

12.7 Instrumentation

Three environment-gated counters report to stderr at exit, leaving stdout (the parity-checked channel) untouched. PRISM_CHECK_LEAKS reports the live-cell balance, which a clean run drives to zero. PRISM_REUSE_STATS reports how many cells the reuse pass rewrote in place. PRISM_EFFOP_STATS reports how many free-monad EOp cells were allocated, which the performance gate asserts is zero on the fusion corpus.

12.8 Growable Arrays

The growable Array(a) (Spec 12) is an ordinary cell, { rc, tag 0, arity cap+1, len, elem0 .. }, with the length word stored odd-tagged (low bit set, so the collector skips it as an immediate per Section 12.1) and unused slots held at zero. Because it is a normal cell, reference counting recurses into its live elements with no special case. Every array operation borrows its array argument. array_get returns a counted element; array_set, array_push, and array_pop write in place when the array is uniquely owned (refcount 1) and copy otherwise, so functional array code runs as mutation exactly when ownership permits. array_push doubles the capacity when full, making appends amortized O(1). The prelude’s HashMap is a separate-chaining hash table layered on this array, with an FNV-1a hash written in Prism (so iteration order is a deterministic function of the inserts); it is library code, not a runtime primitive.

12.9 Primitive Sort

sort is a runtime primitive (prism_sort_prim) that borrows a list and returns it sorted, dispatched on a key kind. Arbitrary-precision Int keys use a bignum-aware stable bottom-up merge sort, ping-ponging between two buffers; fixed-width keys use a radix sort over a derived key. When the input spine is uniquely owned, the sorted heads are written back into the existing cells with no allocation; a shared spine is copied with its elements shared. The Cons and Nil tags are read off the input spine, so no list layout is baked into the runtime.

12.10 Input, Output, and Randomness

The runtime provides the impure builtins. read_int and read_line read stdin; read_file, write_file, append_file, and file_exists operate on files; getenv reads the environment; system runs a shell command and returns its exit code; eprint and eprintln write to stderr, leaving the parity-checked stdout untouched; and args_count and arg (wrapped by the prelude’s args) read the command line. Randomness is a SplitMix64 generator: rand advances it and srand seeds it, so a seeded run is deterministic and reproducible. Because these touch the world, the parity harness (Section 13) runs only the programs that avoid them.

13. Verification

Two test gates hold the implementation to its claims. The parity harness (tests/parity.rs) is differential testing with the interpreter as the reference: it runs every example on the interpreter and each native backend and asserts byte-identical output, and with PRISM_CHECK_LEAKS set, zero leaked cells.

The performance gate (tests/perf_gate.rs) asserts that the optimizations actually fire, so a regression that leaves output unchanged is still caught. With PRISM_EFFOP_STATS set, it requires zero free-monad cells allocated on the fusion corpus (the stream and multi-handler programs such as streams.pr), confirming that the evidence and state paths of Section 9 reify nothing. It also pins local monadification: a program that pairs an escaping effectful closure with an unrelated fused pipeline must allocate no more cells than the escape alone, so the pipeline stays fused despite the escape. With PRISM_REUSE_STATS set, it requires in-place reuse to fire on the reuse corpus (list.pr), confirming the reuse pass of Section 10 rewrites drops into in-place updates.

13.1 The Lean Model

Beyond the differential gates, the core calculus is mechanized in Lean 4 (the models/ directory, built with lake). Prism.lean defines the syntax and a substitution-based small-step relation Step with its determinism theorem (Step.deterministic). CEK.lean then defines the abstract machine the compiler actually runs (Section 11.1): an environment machine with a continuation stack, Rv runtime values carrying closures and thunks, curried application, and the deep, mask-aware handler capture that makes resume multishot. The machine is a total, executable step function, so it is deterministic by construction and runnable.

The model’s central theorem connects the two. A big-step natural semantics specifies what a program evaluates to, and bigstep_runs proves the machine implements it (a forward simulation under any continuation stack), so the abstract machine is a faithful realization of the specification rather than an independent artifact. Meta.lean adds the supporting metatheory: a unique-normal-form corollary, substitution lemmas, and a progress trichotomy (every computation is a value, takes a step, or is an explicit Stuck error, with stuckNoStep confirming the classification is a genuine partition). Dynamics.lean covers the effect machinery, proving the machine reaches a handler exactly when one is in scope (effect_progress) and is stuck on an unhandled operation otherwise (effect_unhandled). Every theorem is sorry-free; the proofs depend only on propext/Quot.sound, with Classical.choice added only where the model evaluates IEEE floats (whose arithmetic Lean defines non-constructively).

13.2 The Model as a Differential Oracle

The Lean machine is run as a third oracle alongside the interpreter and native backends. prism dump core-json <file> serializes the elaborated core to a JSON tree (src/core/json.rs), which models/Json.lean decodes back into the Lean syntax, and the oracle executable (models/Oracle.lean) runs the verified machine on it and prints the result, rendering floats through a port of the runtime’s fmt_g so output is byte-identical. models/diff_against_rust.sh pipes each fixture through prism dump core-json | oracle and compares against prism run, so the verified model is checked against the interpreter on the compiler’s actual core, not a hand-transcription. models/Certificates.lean records the same agreements as kernel-checked rfl theorems for the curated set. The grammar in the specification is itself single-sourced from models/grammar.ebnf.

14. The Interactive Shell

Running prism with no arguments starts a read-eval-print loop (src/repl/) backed by the interpreter of Section 11. It is a typed REPL: an entered expression is parsed through the expression entry point of Section 3, inferred, elaborated, and evaluated, and its type and effect row are shown above the value.

A session accumulates state. An expression is evaluated and its result bound to it; a let binds a name for reuse; and a top-level fn, type, class, instance, or effect declaration is added to the session so later input sees it. Declarations entered for a name shadow earlier ones.

Commands begin with :; any unambiguous prefix resolves to its command, GHCi-style, so :r is :reload and :lo is :load.

CommandAction
:type eshow the type and effect row of expression e
:kind Tshow the kind of a type constructor
:info ndescribe a binding, type, or class
:browselist the bindings this session has added over the prelude
:coredump the lowered core IR of the session
:load fload declarations from a file, making it the active file
:reloadre-read the active file from disk
:edit [f]open a file (or a scratch buffer) in $EDITOR, then load it
:set [+-]flagstoggle options; bare :set lists them
:quitleave the shell

Two :set toggles exist: t (types) shows the inferred type and effect row of each result, on by default, and s (timing) reports evaluation time. A multi-line block runs between :{ and :}, or is auto-detected when a line opens a layout block that is not yet closed.

15. The Formatter

prism fmt parses a file to the surface AST and prints that tree back to text. It is not a token reflow: parsing discards whitespace, so the printer reconstructs all layout from scratch, and the only things carried across from the original are the trivia the lexer recorded (Section 2) and, as a last resort, raw byte ranges. An already-formatted file is a fixed point, which is what prism fmt --check verifies by comparing the printed output to the input byte for byte. The implementation lives in src/fmt/, with mod.rs driving expression and block layout, decl.rs and pat.rs handling declarations and patterns, and ops.rs the operator-precedence parenthesization.

Trivia reattachment. Comments and blank lines are not nodes in the AST. The lexer records them in marginalia’s TriviaTable, a side table keyed by byte offset that the parser fills alongside the token stream. The printer queries that table by source range: between(lo, hi) yields the trivia events lying in a gap and after(end) yields whatever trails the final declaration. Because every AST node carries its source span, “the comment above this arm” or “the blank line between these two statements” is recovered as the trivia lying in the gap between one node’s end offset and the next node’s start offset. This is why trivia survives not just between top-level declarations but inside function bodies, handler arms, and match arms, each printer threads the byte offset where the previous construct ended and re-emits the comments found before the next one begins. Line comments and deliberate blank-line grouping round-trip; block comments carry no placeable layout and are dropped.

Two layout engines. Expressions use a hand-rolled width model: each node is first rendered inline by fmt_expr_inline, and if that string fits the 80-column budget at the current indent it is kept, otherwise the node falls back to a broken multi-line form via fmt_expr_break. Patterns instead reuse marginalia’s Doc combinator engine (block, concat, comma, pretty_at), which makes the same fit-or-break choice generically over their nested constructor, tuple, and record structure. The two coexist because expression breaking is context-sensitive (statement position, offside blocks, trailing-lambda calls, operator precedence) in ways the generic combinators do not capture, whereas patterns are purely structural and a stock pretty-printer suffices.

Layout versus flat. A Mode is threaded through every printer. Layout emits Prism’s offside blocks, the indentation-significant let chains and match/handle/if/for bodies. Flat is for bracketed contexts (tuples, argument lists, an inline match { ... }) where the virtual layout tokens are suppressed, so only inline let ... in and braced arms are legal there. In statement position match and if always lay out across lines even when they would fit on one, since stacked arms and branches read better, matching how those forms are written in other languages.

Surface restoration. Desugaring (Section 5) rewrites several surface forms into a smaller core before formatting ever sees them: UFCS recv.f(x) becomes f(recv, x), e? and string interpolation become marker calls, and pattern-let and ?-binding become single-arm matches. The parser tags each lowered node with a synthetic span or a synth flag, and the formatter keys off those flags to reprint the original surface rather than the lowered shape. This is what makes formatting idempotent: the printed output desugars back to the same core, so a second pass changes nothing.

Never destroy code. The formatter borrows the original source for one purpose, a verbatim fallback: any node it cannot otherwise print is emitted as its exact original byte range. Together with parse-or-fail (an unparseable file yields an error, never a mangled rewrite), this guarantees prism fmt is information-preserving even on constructs the printer does not yet model.

Bibliography

Augustsson, L. (1985). Compiling pattern matching. Functional Programming Languages and Computer Architecture (FPCA).

Augustsson, L., Breitner, J., Claessen, K., Jhala, R., Peyton Jones, S., Shivers, O., Steele, G. L., & Sweeney, T. (2023). The Verse calculus: A core calculus for deterministic functional logic programming. Proceedings of the ACM on Programming Languages, 7(ICFP).

Bauer, A., & Pretnar, M. (2015). Programming with algebraic effects and handlers. Journal of Logical and Algebraic Methods in Programming, 84(1), 108-123.

Barendsen, E., & Smetsers, S. (1993). Conventional and uniqueness typing in graph rewrite systems. Foundations of Software Technology and Theoretical Computer Science (FSTTCS).

Biernacki, D., Piróg, M., Polesiuk, P., & Sieczkowski, F. (2018). Handle with care: Relational interpretation of algebraic effects and handlers. Proceedings of the ACM on Programming Languages, 2(POPL).

Bour, F., Clément, B., & Scherer, G. (2021). Tail modulo cons. Journées Francophones des Langages Applicatifs (JFLA).

Coutts, D., Leshchinskiy, R., & Stewart, D. (2007). Stream fusion: From lists to streams to nothing at all. International Conference on Functional Programming (ICFP).

Damas, L., & Milner, R. (1982). Principal type-schemes for functional programs. Principles of Programming Languages (POPL), 207-212.

de Moura, L., & Ullrich, S. (2021). The Lean 4 theorem prover and programming language. Conference on Automated Deduction (CADE).

Dreyer, D., Harper, R., Chakravarty, M. M. T., & Keller, G. (2007). Modular type classes. Principles of Programming Languages (POPL).

Dunfield, J., & Krishnaswami, N. R. (2013). Complete and easy bidirectional typechecking for higher-rank polymorphism. International Conference on Functional Programming (ICFP).

Felleisen, M., & Friedman, D. P. (1986). Control operators, the SECD-machine, and the lambda-calculus. Formal Description of Programming Concepts III, 193-217.

Foster, J. N., Greenwald, M. B., Moore, J. T., Pierce, B. C., & Schmitt, A. (2007). Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem. ACM Transactions on Programming Languages and Systems, 29(3).

Friedman, D. P., & Wise, D. S. (1975). Unwinding stylized recursions into iterations (Technical Report 19). Indiana University.

Gibbons, J., & Oliveira, B. C. d. S. (2009). The essence of the iterator pattern. Journal of Functional Programming, 19(3-4), 377-402.

Gill, A., Launchbury, J., & Peyton Jones, S. L. (1993). A short cut to deforestation. Functional Programming Languages and Computer Architecture (FPCA).

Harper, R., & Lillibridge, M. (1994). A type-theoretic approach to higher-order modules with sharing. Principles of Programming Languages (POPL).

Hindley, R. (1969). The principal type-scheme of an object in combinatory logic. Transactions of the American Mathematical Society, 146, 29-60.

Kiselyov, O., & Ishii, H. (2015). Freer monads, more extensible effects. Haskell Symposium.

Kiselyov, O., Sabry, A., & Swords, C. (2013). Extensible effects: An alternative to monad transformers. Haskell Symposium.

Lattner, C., & Adve, V. (2004). LLVM: A compilation framework for lifelong program analysis & transformation. International Symposium on Code Generation and Optimization (CGO).

Lattner, C., Amini, M., Bondhugula, U., Cohen, A., Davis, A., Pienaar, J., Riddle, R., Shpeisman, T., Vasilache, N., & Zinenko, O. (2021). MLIR: Scaling compiler infrastructure for the end of Moore’s law. International Symposium on Code Generation and Optimization (CGO).

Leijen, D. (2005). Extensible records with scoped labels. Trends in Functional Programming (TFP).

Leijen, D. (2017). Type directed compilation of row-typed algebraic effects. Principles of Programming Languages (POPL).

Leroy, X. (1994). Manifest types, modules, and separate compilation. Principles of Programming Languages (POPL).

Levy, P. B. (2004). Call-by-push-value: A functional/imperative synthesis. Springer.

Lorenzen, A., & Leijen, D. (2021). Reference counting with frame-limited reuse. International Conference on Functional Programming (ICFP).

Lorenzen, A., Leijen, D., & Swierstra, W. (2023). FP^2: Fully in-place functional programming. Proceedings of the ACM on Programming Languages, 7(ICFP).

Maranget, L. (2007). Warnings for pattern matching. Journal of Functional Programming, 17(3), 387-421.

Maranget, L. (2008). Compiling pattern matching to good decision trees. ML Workshop.

McBride, C., & Paterson, R. (2008). Applicative programming with effects. Journal of Functional Programming, 18(1), 1-13.

Milner, R. (1978). A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17(3), 348-375.

Pickering, M., Érdi, G., Peyton Jones, S., & Eisenberg, R. A. (2016). Pattern synonyms. Haskell Symposium.

Pierce, B. C., & Turner, D. N. (2000). Local type inference. ACM Transactions on Programming Languages and Systems, 22(1), 1-44.

Plotkin, G. D. (1981). A structural approach to operational semantics (Technical Report DAIMI FN-19). Aarhus University.

Plotkin, G., & Power, J. (2001). Adequacy for algebraic effects. Foundations of Software Science and Computation Structures (FoSSaCS).

Plotkin, G., & Power, J. (2003). Algebraic operations and generic effects. Applied Categorical Structures, 11(1), 69-94.

Plotkin, G., & Pretnar, M. (2009). Handlers of algebraic effects. European Symposium on Programming (ESOP).

Pnueli, A., Siegel, M., & Singerman, E. (1998). Translation validation. Tools and Algorithms for the Construction and Analysis of Systems (TACAS).

Reinking, A., Xie, N., de Moura, L., & Leijen, D. (2021). Perceus: Garbage free reference counting with reuse. Programming Language Design and Implementation (PLDI).

Swierstra, W. (2008). Data types à la carte. Journal of Functional Programming, 18(4), 423-436.

Wand, M. (1987). Complete type inference for simple objects. Logic in Computer Science (LICS).

White, L., Bour, F., & Yallop, J. (2015). Modular implicits. Electronic Proceedings in Theoretical Computer Science, 198, 22-63.

Wu, N., Schrijvers, T., & Hinze, R. (2014). Effect handlers in scope. Haskell Symposium.

Xie, N., Brachthäuser, J. I., Hillerström, D., Schuster, P., & Leijen, D. (2020). Effect handlers, evidently. Proceedings of the ACM on Programming Languages, 4(ICFP).

Xie, N., & Leijen, D. (2021). Generalized evidence passing for effect handlers. Proceedings of the ACM on Programming Languages, 5(ICFP).