pest

Pest is a PEG (Parsing Expression Grammar) parser generator that uses a dedicated grammar syntax to generate parsers at compile time. Unlike parser combinators, pest separates grammar definition from parsing logic, enabling clear and maintainable parser specifications. The library excels at parsing complex languages with automatic whitespace handling, built-in error reporting, and elegant precedence climbing for expressions.

The core philosophy of pest centers on declarative grammar definitions that closely resemble formal language specifications. Grammars are written in separate .pest files using a clean, readable syntax that supports modifiers for different parsing behaviors. The pest_derive macro generates efficient parsers from these grammars at compile time.
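One PEG property is worth keeping in mind when reading the grammars that follow: choice is ordered, so the first alternative that matches wins and later ones are never tried. The sketch below imitates that behaviour with plain string prefixes. The helper name first_match is hypothetical and is not part of pest; it only illustrates why operator_token lists "+=" before "+".

```rust
/// Return the first alternative that is a prefix of `input`,
/// imitating PEG ordered choice. Hypothetical helper, not a pest API.
fn first_match<'a>(alternatives: &[&'a str], input: &str) -> Option<&'a str> {
    alternatives
        .iter()
        .copied()
        .find(|alt| input.starts_with(*alt))
}

fn main() {
    // Shorter operator listed first: "<" wins and "=" is left unconsumed.
    assert_eq!(first_match(&["<", "<="], "<= 3"), Some("<"));
    // Longer operator listed first: the whole "<=" is consumed.
    assert_eq!(first_match(&["<=", "<"], "<= 3"), Some("<="));
    // No alternative matches at all.
    assert_eq!(first_match(&["<=", "<"], "+ 3"), None);
}
```

When one literal is a prefix of another, listing the longer literal first is the usual way to keep the shorter one from shadowing it.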

Grammar Fundamentals

#![allow(unused)]
fn main() {
// Whitespace handling - pest implicitly skips WHITESPACE and COMMENT between
// the elements of sequences and repetitions in normal (non-atomic) rules
WHITESPACE = _{ " " | "\t" | "\n" | "\r" }
COMMENT = _{ "//" ~ (!NEWLINE ~ ANY)* }

// ===== EXPRESSION PARSER =====
// Demonstrates precedence climbing and operators

// Main expression entry point using precedence climbing
expression = { term ~ (binary_op ~ term)* }

// Terms in expressions
term = { number | identifier | "(" ~ expression ~ ")" }

// Binary operators for precedence climbing
// PEG choice is ordered, so "<=" and ">=" must be tried before "<" and ">"
binary_op = _{ add | subtract | multiply | divide | power | eq | ne | le | lt | ge | gt }
add = { "+" }
subtract = { "-" }
multiply = { "*" }
divide = { "/" }
power = { "^" }
eq = { "==" }
ne = { "!=" }
lt = { "<" }
le = { "<=" }
gt = { ">" }
ge = { ">=" }

// Basic tokens
number = @{ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }
identifier = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }

// ===== JSON PARSER =====
// Demonstrates recursive structures and string handling

json_value = { object | array | string | number | boolean | null }

object = { "{" ~ (pair ~ ("," ~ pair)*)? ~ "}" }
pair = { string ~ ":" ~ json_value }

array = { "[" ~ (json_value ~ ("," ~ json_value)*)? ~ "]" }

string = ${ "\"" ~ inner_string ~ "\"" }
inner_string = @{ char* }
char = {
    !("\"" | "\\") ~ ANY
    | "\\" ~ ("\"" | "\\" | "/" | "b" | "f" | "n" | "r" | "t")
    | "\\" ~ ("u" ~ ASCII_HEX_DIGIT{4})
}

boolean = { "true" | "false" }
null = { "null" }

// ===== PROGRAMMING LANGUAGE CONSTRUCTS =====
// Demonstrates complex language parsing with statements and control flow

program = { SOI ~ statement* ~ EOI }

statement = {
    if_statement
    | while_statement
    | function_def
    | assignment
    | expression_statement
    | block
}

// Control flow statements
if_statement = { "if" ~ expression ~ block ~ ("else" ~ (if_statement | block))? }
while_statement = { "while" ~ expression ~ block }

// Function definition
function_def = { "fn" ~ identifier ~ "(" ~ parameter_list? ~ ")" ~ "->" ~ type_name ~ block }
parameter_list = { parameter ~ ("," ~ parameter)* }
parameter = { identifier ~ ":" ~ type_name }

// Assignment and expressions
assignment = { identifier ~ "=" ~ expression ~ ";" }
expression_statement = { expression ~ ";" }

// Block structure
block = { "{" ~ statement* ~ "}" }

// Type system
type_name = { "int" | "float" | "bool" | "string" | identifier }

// ===== CALCULATOR WITH PRECEDENCE =====
// Encodes operator precedence directly in the grammar, one rule per level

calculation = { SOI ~ calc_expression ~ EOI }
calc_expression = { calc_term ~ (calc_add_op ~ calc_term)* }
calc_term = { calc_factor ~ (calc_mul_op ~ calc_factor)* }
calc_factor = { calc_power }
calc_power = { calc_atom ~ (calc_pow_op ~ calc_power)? }
calc_atom = { calc_number | "(" ~ calc_expression ~ ")" | calc_unary }
calc_unary = { (calc_plus | calc_minus) ~ calc_number | calc_number }

calc_add_op = { calc_plus | calc_minus }
calc_mul_op = { calc_multiply | calc_divide }
calc_pow_op = { calc_power_op }

calc_plus = { "+" }
calc_minus = { "-" }
calc_multiply = { "*" }
calc_divide = { "/" }
calc_power_op = { "^" }

calc_number = @{ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }

// ===== CUSTOM ERROR HANDLING =====
// Demonstrates custom error messages and recovery

error_prone = { SOI ~ error_statement* ~ EOI }
error_statement = {
    good_statement
    | expected_semicolon
}

good_statement = { identifier ~ "=" ~ number ~ ";" }
expected_semicolon = { identifier ~ "=" ~ number ~ !(";" | NEWLINE) }

// ===== LEXER-LIKE TOKENS =====
// Demonstrates atomic rules and token extraction

token_stream = { SOI ~ token* ~ EOI }
token = _{ keyword | operator_token | punctuation | literal | identifier_token }

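// The negative lookahead keeps identifiers such as "iffy" from being split into "if" + "fy"
keyword = @{ ("if" | "else" | "while" | "fn" | "let" | "return" | "mut") ~ !(ASCII_ALPHANUMERIC | "_") }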
operator_token = { "+=" | "-=" | "==" | "!=" | "<=" | ">=" | "&&" | "||" | "++" | "--" | "->" | "+" | "-" | "*" | "/" | "=" | "<" | ">" | "!" }
punctuation = { "(" | ")" | "{" | "}" | "[" | "]" | ";" | "," | ":" }
literal = { string_literal | number_literal | boolean_literal }
string_literal = ${ "\"" ~ string_content ~ "\"" }
string_content = @{ (!"\"" ~ ANY)* }
number_literal = @{ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }
boolean_literal = { "true" | "false" }
identifier_token = @{ ASCII_ALPHA ~ (ASCII_ALPHANUMERIC | "_")* }
}

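The grammar file demonstrates pest’s PEG syntax with several rule types. Silent rules, marked with _, match input but produce no pair of their own, which simplifies AST construction. Atomic rules, marked with @, disable implicit whitespace and comment skipping inside the rule and also suppress the pairs of any inner rules, so the whole match surfaces as a single token. Compound atomic rules, marked with $, likewise disable implicit whitespace but still produce pairs for their inner rules, which is why string can expose inner_string as a child pair.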
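The effect of implicit whitespace on a sequence can be imitated without pest. The sketch below is a simplified model, assuming only spaces, tabs and newlines count as whitespace; match_sequence is a hypothetical helper, not a pest function. It shows why number is declared atomic: without @, an input such as "1 . 5" would be accepted as one number.

```rust
/// Match `parts` in order against `input` and return the bytes consumed.
/// When `atomic` is false, whitespace between parts is skipped, imitating
/// pest's implicit WHITESPACE handling in normal rules.
/// Hypothetical helper for illustration only.
fn match_sequence(parts: &[&str], input: &str, atomic: bool) -> Option<usize> {
    let mut pos = 0;
    for (i, part) in parts.iter().enumerate() {
        if i > 0 && !atomic {
            // Skip whitespace only between parts, never before the first one.
            let rest = &input[pos..];
            let trimmed =
                rest.trim_start_matches(|c: char| matches!(c, ' ' | '\t' | '\n' | '\r'));
            pos += rest.len() - trimmed.len();
        }
        if !input[pos..].starts_with(*part) {
            return None;
        }
        pos += part.len();
    }
    Some(pos)
}

fn main() {
    let number_parts = ["1", ".", "5"];
    // Normal rule: inner whitespace is silently accepted.
    assert_eq!(match_sequence(&number_parts, "1 . 5", false), Some(5));
    // Atomic rule: the same input is rejected.
    assert_eq!(match_sequence(&number_parts, "1 . 5", true), None);
    // Atomic rule: contiguous input still matches.
    assert_eq!(match_sequence(&number_parts, "1.5", true), Some(3));
}
```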

AST Construction

#![allow(unused)]
fn main() {
use std::fmt;
use pest::error::Error;
use pest::iterators::{Pair, Pairs};
use pest::pratt_parser::{Assoc, Op, PrattParser};
use pest::Parser;
use pest_derive::Parser;
use thiserror::Error;
#[derive(Parser)]
#[grammar = "grammar.pest"]
pub struct GrammarParser;
#[derive(Error, Debug)]
pub enum ParseError {
    #[error("Pest parsing error: {0}")]
    Pest(#[from] Box<Error<Rule>>),
    #[error("Invalid number format: {0}")]
    InvalidNumber(String),
    #[error("Unknown operator: {0}")]
    UnknownOperator(String),
    #[error("Invalid JSON value")]
    InvalidJson,
    #[error("Unexpected end of input")]
    UnexpectedEOF,
}
pub type Result<T> = std::result::Result<T, ParseError>;
#[derive(Debug, Clone, PartialEq)]
pub enum BinOperator {
    Add,
    Subtract,
    Multiply,
    Divide,
    Power,
    Eq,
    Ne,
    Lt,
    Le,
    Gt,
    Ge,
}
impl fmt::Display for BinOperator {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BinOperator::Add => write!(f, "+"),
            BinOperator::Subtract => write!(f, "-"),
            BinOperator::Multiply => write!(f, "*"),
            BinOperator::Divide => write!(f, "/"),
            BinOperator::Power => write!(f, "^"),
            BinOperator::Eq => write!(f, "=="),
            BinOperator::Ne => write!(f, "!="),
            BinOperator::Lt => write!(f, "<"),
            BinOperator::Le => write!(f, "<="),
            BinOperator::Gt => write!(f, ">"),
            BinOperator::Ge => write!(f, ">="),
        }
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Program {
    pub statements: Vec<Statement>,
}
#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    If {
        condition: Expr,
        then_block: Vec<Statement>,
        else_block: Option<Vec<Statement>>,
    },
    While {
        condition: Expr,
        body: Vec<Statement>,
    },
    Function {
        name: String,
        parameters: Vec<Parameter>,
        return_type: String,
        body: Vec<Statement>,
    },
    Assignment {
        name: String,
        value: Expr,
    },
    Expression(Expr),
    Block(Vec<Statement>),
}
#[derive(Debug, Clone, PartialEq)]
pub struct Parameter {
    pub name: String,
    pub type_name: String,
}
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Operator(String),
    Punctuation(String),
    Literal(LiteralValue),
    Identifier(String),
}
#[derive(Debug, Clone, PartialEq)]
pub enum LiteralValue {
    String(String),
    Number(f64),
    Boolean(bool),
}
impl GrammarParser {
    /// Parse an expression using pest's built-in precedence climbing
    pub fn parse_expression(input: &str) -> Result<Expr> {
        let pairs = Self::parse(Rule::expression, input).map_err(Box::new)?;
        Self::build_expression(pairs)
    }

    fn build_expression(pairs: Pairs<Rule>) -> Result<Expr> {
        let pratt = PrattParser::new()
            .op(Op::infix(Rule::eq, Assoc::Left) | Op::infix(Rule::ne, Assoc::Left))
            .op(Op::infix(Rule::lt, Assoc::Left)
                | Op::infix(Rule::le, Assoc::Left)
                | Op::infix(Rule::gt, Assoc::Left)
                | Op::infix(Rule::ge, Assoc::Left))
            .op(Op::infix(Rule::add, Assoc::Left) | Op::infix(Rule::subtract, Assoc::Left))
            .op(Op::infix(Rule::multiply, Assoc::Left) | Op::infix(Rule::divide, Assoc::Left))
            .op(Op::infix(Rule::power, Assoc::Right));

        pratt
            .map_primary(|primary| match primary.as_rule() {
                Rule::term => {
                    let inner = primary
                        .into_inner()
                        .next()
                        .ok_or(ParseError::UnexpectedEOF)?;
                    match inner.as_rule() {
                        Rule::number => {
                            let num = inner.as_str().parse::<f64>().map_err(|_| {
                                ParseError::InvalidNumber(inner.as_str().to_string())
                            })?;
                            Ok(Expr::Number(num))
                        }
                        Rule::identifier => Ok(Expr::Identifier(inner.as_str().to_string())),
                        Rule::expression => Self::build_expression(inner.into_inner()),
                        _ => unreachable!("Unexpected term rule: {:?}", inner.as_rule()),
                    }
                }
                Rule::number => {
                    let num = primary
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(primary.as_str().to_string()))?;
                    Ok(Expr::Number(num))
                }
                Rule::identifier => Ok(Expr::Identifier(primary.as_str().to_string())),
                Rule::expression => Self::build_expression(primary.into_inner()),
                _ => unreachable!("Unexpected primary rule: {:?}", primary.as_rule()),
            })
            .map_infix(|left, op, right| {
                let op = match op.as_rule() {
                    Rule::add => BinOperator::Add,
                    Rule::subtract => BinOperator::Subtract,
                    Rule::multiply => BinOperator::Multiply,
                    Rule::divide => BinOperator::Divide,
                    Rule::power => BinOperator::Power,
                    Rule::eq => BinOperator::Eq,
                    Rule::ne => BinOperator::Ne,
                    Rule::lt => BinOperator::Lt,
                    Rule::le => BinOperator::Le,
                    Rule::gt => BinOperator::Gt,
                    Rule::ge => BinOperator::Ge,
                    _ => return Err(ParseError::UnknownOperator(op.as_str().to_string())),
                };
                Ok(Expr::BinOp {
                    left: Box::new(left?),
                    op,
                    right: Box::new(right?),
                })
            })
            .parse(pairs)
    }

    /// Parse a calculator expression with explicit precedence rules
    pub fn parse_calculation(input: &str) -> Result<f64> {
        let pairs = Self::parse(Rule::calculation, input).map_err(Box::new)?;
        Self::evaluate_calculation(
            pairs
                .into_iter()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?
                .into_inner()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?,
        )
    }

    fn evaluate_calculation(pair: Pair<Rule>) -> Result<f64> {
        match pair.as_rule() {
            Rule::calc_expression => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(term)) = (pairs.next(), pairs.next()) {
                    let term_val = Self::evaluate_calculation(term)?;
                    match op.as_rule() {
                        Rule::calc_add_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_plus => result += term_val,
                                Rule::calc_minus => result -= term_val,
                                _ => unreachable!("Unexpected add op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_plus => result += term_val,
                        Rule::calc_minus => result -= term_val,
                        _ => unreachable!("Unexpected calc expression op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_term => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(factor)) = (pairs.next(), pairs.next()) {
                    let factor_val = Self::evaluate_calculation(factor)?;
                    match op.as_rule() {
                        Rule::calc_mul_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_multiply => result *= factor_val,
                                Rule::calc_divide => result /= factor_val,
                                _ => unreachable!("Unexpected mul op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_multiply => result *= factor_val,
                        Rule::calc_divide => result /= factor_val,
                        _ => unreachable!("Unexpected calc term op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_factor => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_power => {
                let mut pairs = pair.into_inner();
                let base =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                if let Some(op) = pairs.next() {
                    if op.as_rule() == Rule::calc_pow_op {
                        let exponent = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(base.powf(exponent))
                    } else {
                        unreachable!("Expected calc_pow_op, got: {:?}", op.as_rule());
                    }
                } else {
                    Ok(base)
                }
            }
            Rule::calc_atom => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_unary => {
                let mut pairs = pair.into_inner();
                let first = pairs.next().ok_or(ParseError::UnexpectedEOF)?;

                match first.as_rule() {
                    Rule::calc_minus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(-val)
                    }
                    Rule::calc_plus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(val)
                    }
                    Rule::calc_number => first
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(first.as_str().to_string())),
                    _ => Self::evaluate_calculation(first),
                }
            }
            Rule::calc_number => pair
                .as_str()
                .parse::<f64>()
                .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string())),
            _ => unreachable!("Unexpected rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse JSON input into a JsonValue AST
    pub fn parse_json(input: &str) -> Result<JsonValue> {
        let pairs = Self::parse(Rule::json_value, input).map_err(Box::new)?;
        Self::build_json_value(pairs.into_iter().next().ok_or(ParseError::UnexpectedEOF)?)
    }

    fn build_json_value(pair: Pair<Rule>) -> Result<JsonValue> {
        match pair.as_rule() {
            Rule::json_value => {
                Self::build_json_value(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::object => {
                let mut object = Vec::new();
                for pair in pair.into_inner() {
                    if let Rule::pair = pair.as_rule() {
                        let mut inner = pair.into_inner();
                        let key =
                            Self::parse_string(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        let value =
                            Self::build_json_value(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        object.push((key, value));
                    }
                }
                Ok(JsonValue::Object(object))
            }
            Rule::array => {
                let mut array = Vec::new();
                for pair in pair.into_inner() {
                    array.push(Self::build_json_value(pair)?);
                }
                Ok(JsonValue::Array(array))
            }
            Rule::string => Ok(JsonValue::String(Self::parse_string(pair)?)),
            Rule::number => {
                let num = pair
                    .as_str()
                    .parse::<f64>()
                    .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string()))?;
                Ok(JsonValue::Number(num))
            }
            Rule::boolean => Ok(JsonValue::Boolean(pair.as_str() == "true")),
            Rule::null => Ok(JsonValue::Null),
            _ => Err(ParseError::InvalidJson),
        }
    }

    fn parse_string(pair: Pair<Rule>) -> Result<String> {
        let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
        Ok(inner.as_str().to_string())
    }
}
impl GrammarParser {
    /// Parse a complete program
    pub fn parse_program(input: &str) -> Result<Program> {
        let pairs = Self::parse(Rule::program, input).map_err(Box::new)?;
        let mut statements = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                statements.push(Self::build_statement(pair)?);
            }
        }

        Ok(Program { statements })
    }

    fn build_statement(pair: Pair<Rule>) -> Result<Statement> {
        match pair.as_rule() {
            Rule::statement => {
                Self::build_statement(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::if_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let then_block = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                let else_block = inner
                    .next()
                    .map(|p| match p.as_rule() {
                        Rule::block => Self::build_block(p),
                        Rule::if_statement => Ok(vec![Self::build_statement(p)?]),
                        _ => unreachable!(),
                    })
                    .transpose()?;

                Ok(Statement::If {
                    condition,
                    then_block,
                    else_block,
                })
            }
            Rule::while_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::While { condition, body })
            }
            Rule::function_def => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();

                let mut parameters = Vec::new();
                let mut next = inner.next().ok_or(ParseError::UnexpectedEOF)?;

                if next.as_rule() == Rule::parameter_list {
                    for param_pair in next.into_inner() {
                        let mut param_inner = param_pair.into_inner();
                        let param_name = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        let param_type = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        parameters.push(Parameter {
                            name: param_name,
                            type_name: param_type,
                        });
                    }
                    next = inner.next().ok_or(ParseError::UnexpectedEOF)?;
                }

                let return_type = next.as_str().to_string();
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::Function {
                    name,
                    parameters,
                    return_type,
                    body,
                })
            }
            Rule::assignment => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();
                let value = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;

                Ok(Statement::Assignment { name, value })
            }
            Rule::expression_statement => {
                let expr = Self::build_expression_from_pair(
                    pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                Ok(Statement::Expression(expr))
            }
            Rule::block => Ok(Statement::Block(Self::build_block(pair)?)),
            _ => unreachable!("Unexpected statement rule: {:?}", pair.as_rule()),
        }
    }

    fn build_block(pair: Pair<Rule>) -> Result<Vec<Statement>> {
        let mut statements = Vec::new();
        for stmt_pair in pair.into_inner() {
            statements.push(Self::build_statement(stmt_pair)?);
        }
        Ok(statements)
    }

    fn build_expression_from_pair(pair: Pair<Rule>) -> Result<Expr> {
        Self::build_expression(pair.into_inner())
    }
}
impl GrammarParser {
    /// Parse input into a stream of tokens
    pub fn parse_tokens(input: &str) -> Result<Vec<Token>> {
        let pairs = Self::parse(Rule::token_stream, input).map_err(Box::new)?;
        let mut tokens = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                tokens.push(Self::build_token(pair)?);
            }
        }

        Ok(tokens)
    }

    fn build_token(pair: Pair<Rule>) -> Result<Token> {
        match pair.as_rule() {
            Rule::keyword => Ok(Token::Keyword(pair.as_str().to_string())),
            Rule::operator_token => Ok(Token::Operator(pair.as_str().to_string())),
            Rule::punctuation => Ok(Token::Punctuation(pair.as_str().to_string())),
            Rule::literal => {
                let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                match inner.as_rule() {
                    Rule::string_literal => {
                        let content = inner
                            .into_inner()
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str();
                        Ok(Token::Literal(LiteralValue::String(content.to_string())))
                    }
                    Rule::number_literal => {
                        let num = inner
                            .as_str()
                            .parse::<f64>()
                            .map_err(|_| ParseError::InvalidNumber(inner.as_str().to_string()))?;
                        Ok(Token::Literal(LiteralValue::Number(num)))
                    }
                    Rule::boolean_literal => Ok(Token::Literal(LiteralValue::Boolean(
                        inner.as_str() == "true",
                    ))),
                    _ => unreachable!(),
                }
            }
            Rule::identifier_token => Ok(Token::Identifier(pair.as_str().to_string())),
            _ => unreachable!("Unexpected token rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse and print pest parse tree for debugging
    pub fn debug_parse(rule: Rule, input: &str) -> Result<()> {
        let pairs = Self::parse(rule, input).map_err(Box::new)?;
        for pair in pairs {
            Self::print_pair(&pair, 0);
        }
        Ok(())
    }

    fn print_pair(pair: &Pair<Rule>, indent: usize) {
        let indent_str = "  ".repeat(indent);
        println!("{}{:?}: \"{}\"", indent_str, pair.as_rule(), pair.as_str());

        for inner_pair in pair.clone().into_inner() {
            Self::print_pair(&inner_pair, indent + 1);
        }
    }

    /// Extract all identifiers from an expression
    pub fn extract_identifiers(expr: &Expr) -> Vec<String> {
        match expr {
            Expr::Identifier(name) => vec![name.clone()],
            Expr::BinOp { left, right, .. } => {
                let mut ids = Self::extract_identifiers(left);
                ids.extend(Self::extract_identifiers(right));
                ids
            }
            Expr::Number(_) => vec![],
        }
    }

    /// Check if a rule matches the complete input
    pub fn can_parse(rule: Rule, input: &str) -> bool {
        match Self::parse(rule, input) {
            Ok(pairs) => {
                // Check that the entire input is consumed
                let input_len = input.len();
                let parsed_len = pairs.as_str().len();
                parsed_len == input_len
            }
            Err(_) => false,
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_expression_parsing() {
        let expr = GrammarParser::parse_expression("2 + 3 * 4").unwrap();
        match expr {
            Expr::BinOp {
                op: BinOperator::Add,
                ..
            } => (),
            _ => panic!("Expected addition at top level"),
        }
    }

    #[test]
    fn test_calculation() {
        assert_eq!(GrammarParser::parse_calculation("2 + 3 * 4").unwrap(), 14.0);
        assert_eq!(
            GrammarParser::parse_calculation("(2 + 3) * 4").unwrap(),
            20.0
        );
        assert_eq!(
            GrammarParser::parse_calculation("2 ^ 3 ^ 2").unwrap(),
            512.0
        );
    }

    #[test]
    fn test_json_parsing() {
        let json = r#"{"name": "test", "value": 42, "active": true}"#;
        let result = GrammarParser::parse_json(json).unwrap();

        if let JsonValue::Object(obj) = result {
            assert_eq!(obj.len(), 3);
        } else {
            panic!("Expected JSON object");
        }
    }

    #[test]
    fn test_program_parsing() {
        let program = r#"
            fn add(x: int, y: int) -> int {
                x + y;
            }

            if x > 0 {
                y = 42;
            }
        "#;

        let result = GrammarParser::parse_program(program).unwrap();
        assert_eq!(result.statements.len(), 2);
    }

    #[test]
    fn test_token_parsing() {
        let input = "if x == 42 { return true; }";
        let tokens = GrammarParser::parse_tokens(input).unwrap();
        assert!(tokens.len() > 5);

        match &tokens[0] {
            Token::Keyword(kw) => assert_eq!(kw, "if"),
            _ => panic!("Expected keyword"),
        }
    }

    #[test]
    fn test_identifier_extraction() {
        let expr = GrammarParser::parse_expression("x + y * z").unwrap();
        let ids = GrammarParser::extract_identifiers(&expr);
        assert_eq!(ids, vec!["x", "y", "z"]);
    }

    #[test]
    fn test_debug_features() {
        assert!(GrammarParser::can_parse(Rule::expression, "2 + 3"));
        assert!(!GrammarParser::can_parse(Rule::expression, "2 +"));
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
    BinOp {
        left: Box<Expr>,
        op: BinOperator,
        right: Box<Expr>,
    },
}
}

The expression type represents the abstract syntax tree that parsers construct from pest’s parse pairs. Each variant corresponds to grammatical constructs defined in the pest grammar.
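To make the shape of this AST concrete, here is a minimal sketch of a tree-walking evaluator over it. The types are a trimmed mirror of Expr and BinOperator so that the block compiles on its own; the eval function, the sample_ast helper and the variable environment are assumptions added for illustration and do not appear in the original code.

```rust
use std::collections::HashMap;

// Trimmed mirror of the AST types above (comparison operators omitted).
#[derive(Debug, Clone, PartialEq)]
enum BinOperator {
    Add,
    Subtract,
    Multiply,
    Divide,
    Power,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Number(f64),
    Identifier(String),
    BinOp {
        left: Box<Expr>,
        op: BinOperator,
        right: Box<Expr>,
    },
}

/// Hypothetical evaluator: returns None when an identifier is unbound.
fn eval(expr: &Expr, env: &HashMap<String, f64>) -> Option<f64> {
    match expr {
        Expr::Number(n) => Some(*n),
        Expr::Identifier(name) => env.get(name).copied(),
        Expr::BinOp { left, op, right } => {
            let (l, r) = (eval(left, env)?, eval(right, env)?);
            Some(match op {
                BinOperator::Add => l + r,
                BinOperator::Subtract => l - r,
                BinOperator::Multiply => l * r,
                BinOperator::Divide => l / r,
                BinOperator::Power => l.powf(r),
            })
        }
    }
}

/// Hand-built AST for `x + 3 * 4`, in the shape a parser would produce.
fn sample_ast() -> Expr {
    Expr::BinOp {
        left: Box::new(Expr::Identifier("x".to_string())),
        op: BinOperator::Add,
        right: Box::new(Expr::BinOp {
            left: Box::new(Expr::Number(3.0)),
            op: BinOperator::Multiply,
            right: Box::new(Expr::Number(4.0)),
        }),
    }
}

fn main() {
    let env = HashMap::from([("x".to_string(), 2.0)]);
    assert_eq!(eval(&sample_ast(), &env), Some(14.0));
    // An empty environment leaves `x` unbound.
    assert_eq!(eval(&sample_ast(), &HashMap::new()), None);
}
```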

#![allow(unused)]
fn main() {
/// Parse an expression using pest's built-in precedence climbing
pub fn parse_expression(input: &str) -> Result<Expr> {
    let pairs = Self::parse(Rule::expression, input).map_err(Box::new)?;
    Self::build_expression(pairs)
}
}

Expression parsing demonstrates the transformation from pest’s generic parse tree to a typed AST. The PrattParser resolves operator precedence and associativity over the flat sequence of term and operator pairs that the expression rule produces: map_primary converts each operand into an Expr, and map_infix combines two operands with the operator between them. Parenthesized subexpressions recurse into build_expression, so nested expressions go through the same code path.

Precedence Climbing

The Pratt parser configuration is built inline within build_expression, defining operator precedence and associativity declaratively. Each call to op adds a precedence level that binds tighter than the previous one, so the equality operators bind loosest and exponentiation binds tightest. Left associative operators like addition and multiplication use Op::infix with Assoc::Left, while right associative exponentiation uses Op::infix with Assoc::Right. The expression rule defines no unary operators, but PrattParser also supports them through Op::prefix and map_prefix.
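The underlying technique can be sketched without pest. The block below is a hand-written precedence-climbing loop over a token list, not pest's implementation; the function names, the numeric precedence table and the choice to render the result as a fully parenthesized string are all assumptions made for illustration. It supports only unsigned integers and the five arithmetic operators.

```rust
/// Precedence and right-associativity per operator.
/// Higher numbers bind tighter, like later precedence levels.
fn op_info(op: char) -> Option<(u8, bool)> {
    match op {
        '+' | '-' => Some((1, false)),
        '*' | '/' => Some((2, false)),
        '^' => Some((3, true)),
        _ => None,
    }
}

/// Split input into number and single-character tokens, ignoring whitespace.
fn tokenize(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut number = String::new();
    for c in input.chars() {
        if c.is_ascii_digit() {
            number.push(c);
            continue;
        }
        if !number.is_empty() {
            tokens.push(std::mem::take(&mut number));
        }
        if !c.is_whitespace() {
            tokens.push(c.to_string());
        }
    }
    if !number.is_empty() {
        tokens.push(number);
    }
    tokens
}

/// Parse one operand, then absorb operators with precedence >= `min_prec`.
fn climb(tokens: &[String], pos: &mut usize, min_prec: u8) -> Option<String> {
    let mut left = tokens.get(*pos)?.clone();
    if !left.chars().all(|c| c.is_ascii_digit()) {
        return None; // this sketch only accepts numbers as operands
    }
    *pos += 1;
    while let Some(tok) = tokens.get(*pos) {
        let op = tok.chars().next()?;
        let (prec, right_assoc) = match op_info(op) {
            Some(info) => info,
            None => break,
        };
        if prec < min_prec {
            break;
        }
        *pos += 1;
        // Right-associative operators let the same level recurse on the right.
        let next_min = if right_assoc { prec } else { prec + 1 };
        let right = climb(tokens, pos, next_min)?;
        left = format!("({} {} {})", left, op, right);
    }
    Some(left)
}

/// Return the fully parenthesized form, or None on malformed input.
fn parenthesize(input: &str) -> Option<String> {
    let tokens = tokenize(input);
    let mut pos = 0;
    let result = climb(&tokens, &mut pos, 1)?;
    if pos == tokens.len() { Some(result) } else { None }
}

fn main() {
    assert_eq!(parenthesize("2 + 3 * 4").as_deref(), Some("(2 + (3 * 4))"));
    assert_eq!(parenthesize("2 ^ 3 ^ 2").as_deref(), Some("(2 ^ (3 ^ 2))"));
    assert_eq!(parenthesize("8 - 3 - 2").as_deref(), Some("((8 - 3) - 2)"));
    assert_eq!(parenthesize("2 +"), None);
}
```

The only difference between left and right associativity in this sketch is the minimum precedence passed to the recursive call for the right operand.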

impl GrammarParser {
    /// Parse a calculator expression with explicit precedence rules
    pub fn parse_calculation(input: &str) -> Result<f64> {
        let pairs = Self::parse(Rule::calculation, input).map_err(Box::new)?;
        // Step from the `calculation` pair into its first inner pair and evaluate it.
        Self::evaluate_calculation(
            pairs
                .into_iter()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?
                .into_inner()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?,
        )
    }
}

The calculator evaluates the parse tree directly instead of building an AST. Precedence here comes from the layering of the calc_* grammar rules (calc_expression, calc_term, calc_power) rather than from PrattParser. Exponentiation is right associative, so 2 ^ 3 ^ 2 evaluates to 512, while the other operators are applied left to right. This approach suits simple expression languages where only the result is needed.
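The same idea, precedence encoded in rule layering with evaluation done while descending, can be sketched without pest as a small recursive-descent calculator. The layer names below (`expression`, `term`, `power`, `atom`) and the `Cursor` type are assumptions made for this illustration and are not the chapter's calc_* rules.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Minimal cursor over the input characters.
struct Cursor<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { chars: input.chars().peekable() }
    }

    /// Skips whitespace and returns the next character without consuming it.
    fn peek(&mut self) -> Option<char> {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
        self.chars.peek().copied()
    }

    /// Consumes `expected` if it is the next non-whitespace character.
    fn eat(&mut self, expected: char) -> bool {
        if self.peek() == Some(expected) {
            self.chars.next();
            true
        } else {
            false
        }
    }
}

// expression = term (("+" | "-") term)*   lowest precedence, left associative
fn expression(c: &mut Cursor<'_>) -> Option<f64> {
    let mut value = term(c)?;
    loop {
        if c.eat('+') {
            value += term(c)?;
        } else if c.eat('-') {
            value -= term(c)?;
        } else {
            return Some(value);
        }
    }
}

// term = power (("*" | "/") power)*
fn term(c: &mut Cursor<'_>) -> Option<f64> {
    let mut value = power(c)?;
    loop {
        if c.eat('*') {
            value *= power(c)?;
        } else if c.eat('/') {
            value /= power(c)?;
        } else {
            return Some(value);
        }
    }
}

// power = atom ("^" power)?   recursing on the right gives right associativity
fn power(c: &mut Cursor<'_>) -> Option<f64> {
    let base = atom(c)?;
    if c.eat('^') {
        Some(base.powf(power(c)?))
    } else {
        Some(base)
    }
}

// atom = "(" expression ")" | "-" atom | number
fn atom(c: &mut Cursor<'_>) -> Option<f64> {
    if c.eat('(') {
        let value = expression(c)?;
        return if c.eat(')') { Some(value) } else { None };
    }
    if c.eat('-') {
        return atom(c).map(|v| -v);
    }
    c.peek()?; // skip whitespace; fail on end of input
    let mut text = String::new();
    while let Some(&ch) = c.chars.peek() {
        if ch.is_ascii_digit() || ch == '.' {
            text.push(ch);
            c.chars.next();
        } else {
            break;
        }
    }
    text.parse().ok()
}

/// Evaluates the whole input, rejecting trailing characters.
pub fn calculate(input: &str) -> Option<f64> {
    let mut c = Cursor::new(input);
    let value = expression(&mut c)?;
    if c.peek().is_none() { Some(value) } else { None }
}
```

Each layer only calls the layer below it, so tighter-binding operators are always evaluated first, and `power` calling itself on the right-hand side is what makes `2 ^ 3 ^ 2` group as `2 ^ (3 ^ 2)`.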

JSON Parsing

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}
}

JSON parsing shows how pest handles recursive data structures. The grammar defines objects, arrays, strings, numbers, booleans, and null, and objects and arrays nest further values to any depth.
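The recursion in JsonValue can be exercised without pest. The sketch below repeats the enum and adds two helpers, `to_json` and `depth`, which are hypothetical names introduced for this example; `to_json` writes strings as stored and does not escape them.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}

/// Hypothetical helper: serializes a value back to compact JSON text.
/// Strings are written as stored, without escaping.
pub fn to_json(value: &JsonValue) -> String {
    match value {
        JsonValue::Object(pairs) => {
            let body: Vec<String> = pairs
                .iter()
                .map(|(key, val)| format!("\"{}\":{}", key, to_json(val)))
                .collect();
            format!("{{{}}}", body.join(","))
        }
        JsonValue::Array(items) => {
            let body: Vec<String> = items.iter().map(to_json).collect();
            format!("[{}]", body.join(","))
        }
        JsonValue::String(s) => format!("\"{}\"", s),
        JsonValue::Number(n) => n.to_string(),
        JsonValue::Boolean(b) => b.to_string(),
        JsonValue::Null => "null".to_string(),
    }
}

/// Hypothetical helper: nesting depth, where scalars count as 0.
pub fn depth(value: &JsonValue) -> usize {
    match value {
        JsonValue::Object(pairs) => {
            1 + pairs.iter().map(|(_, val)| depth(val)).max().unwrap_or(0)
        }
        JsonValue::Array(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
        _ => 0,
    }
}
```

Both helpers follow the same pattern as the parser: one match arm per variant, with the Object and Array arms calling the function again on their children.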

impl GrammarParser {
    /// Parse JSON input into a JsonValue AST
    pub fn parse_json(input: &str) -> Result<JsonValue> {
        let pairs = Self::parse(Rule::json_value, input).map_err(Box::new)?;
        Self::build_json_value(pairs.into_iter().next().ok_or(ParseError::UnexpectedEOF)?)
    }
}

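The JSON parser transforms pest pairs into a typed representation. A match on each pair’s rule drives the recursive construction of nested objects and arrays. Number text is converted with parse::<f64> during AST construction rather than in the grammar. The parse_string helper returns the inner text of a string exactly as matched, so any escape sequences remain undecoded in the resulting JsonValue::String.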
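Because the string helper returns the matched text unchanged, decoding escape sequences would be a separate step. The function below is one possible sketch of that step; `unescape` is a hypothetical name that is not part of the chapter's code, it assumes the input is the raw text between the quotes, and it does not combine `\u` surrogate pairs.

```rust
/// Hypothetical helper: decodes JSON escape sequences in the raw text
/// found between the quotes of a string literal.
pub fn unescape(raw: &str) -> Result<String, String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash must be followed by a known escape character.
        match chars.next() {
            Some('"') => out.push('"'),
            Some('\\') => out.push('\\'),
            Some('/') => out.push('/'),
            Some('b') => out.push('\u{0008}'),
            Some('f') => out.push('\u{000C}'),
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            Some('u') => {
                // Exactly four hex digits; surrogate pairs are not combined here.
                let hex: String = chars.by_ref().take(4).collect();
                if hex.len() != 4 || !hex.chars().all(|h| h.is_ascii_hexdigit()) {
                    return Err(format!("invalid \\u escape: {}", hex));
                }
                let code = u32::from_str_radix(&hex, 16)
                    .map_err(|_| format!("invalid \\u escape: {}", hex))?;
                let ch = char::from_u32(code)
                    .ok_or_else(|| format!("unsupported code point: {}", hex))?;
                out.push(ch);
            }
            Some(other) => return Err(format!("unknown escape: \\{}", other)),
            None => return Err("dangling backslash".to_string()),
        }
    }
    Ok(out)
}
```

A helper like this could be applied to the inner text before it is stored in JsonValue::String, with its error mapped onto one of the ParseError variants.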

Programming Language Constructs

#![allow(unused)]
fn main() {
use std::fmt;
use pest::error::Error;
use pest::iterators::{Pair, Pairs};
use pest::pratt_parser::{Assoc, Op, PrattParser};
use pest::Parser;
use pest_derive::Parser;
use thiserror::Error;
#[derive(Parser)]
#[grammar = "grammar.pest"]
pub struct GrammarParser;
#[derive(Error, Debug)]
pub enum ParseError {
    #[error("Pest parsing error: {0}")]
    Pest(#[from] Box<Error<Rule>>),
    #[error("Invalid number format: {0}")]
    InvalidNumber(String),
    #[error("Unknown operator: {0}")]
    UnknownOperator(String),
    #[error("Invalid JSON value")]
    InvalidJson,
    #[error("Unexpected end of input")]
    UnexpectedEOF,
}
pub type Result<T> = std::result::Result<T, ParseError>;
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
    BinOp {
        left: Box<Expr>,
        op: BinOperator,
        right: Box<Expr>,
    },
}
#[derive(Debug, Clone, PartialEq)]
pub enum BinOperator {
    Add,
    Subtract,
    Multiply,
    Divide,
    Power,
    Eq,
    Ne,
    Lt,
    Le,
    Gt,
    Ge,
}
impl fmt::Display for BinOperator {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BinOperator::Add => write!(f, "+"),
            BinOperator::Subtract => write!(f, "-"),
            BinOperator::Multiply => write!(f, "*"),
            BinOperator::Divide => write!(f, "/"),
            BinOperator::Power => write!(f, "^"),
            BinOperator::Eq => write!(f, "=="),
            BinOperator::Ne => write!(f, "!="),
            BinOperator::Lt => write!(f, "<"),
            BinOperator::Le => write!(f, "<="),
            BinOperator::Gt => write!(f, ">"),
            BinOperator::Ge => write!(f, ">="),
        }
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Program {
    pub statements: Vec<Statement>,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Parameter {
    pub name: String,
    pub type_name: String,
}
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Operator(String),
    Punctuation(String),
    Literal(LiteralValue),
    Identifier(String),
}
#[derive(Debug, Clone, PartialEq)]
pub enum LiteralValue {
    String(String),
    Number(f64),
    Boolean(bool),
}
impl GrammarParser {
    /// Parse an expression using pest's built-in precedence climbing
    pub fn parse_expression(input: &str) -> Result<Expr> {
        let pairs = Self::parse(Rule::expression, input).map_err(Box::new)?;
        Self::build_expression(pairs)
    }

    fn build_expression(pairs: Pairs<Rule>) -> Result<Expr> {
        let pratt = PrattParser::new()
            .op(Op::infix(Rule::eq, Assoc::Left) | Op::infix(Rule::ne, Assoc::Left))
            .op(Op::infix(Rule::lt, Assoc::Left)
                | Op::infix(Rule::le, Assoc::Left)
                | Op::infix(Rule::gt, Assoc::Left)
                | Op::infix(Rule::ge, Assoc::Left))
            .op(Op::infix(Rule::add, Assoc::Left) | Op::infix(Rule::subtract, Assoc::Left))
            .op(Op::infix(Rule::multiply, Assoc::Left) | Op::infix(Rule::divide, Assoc::Left))
            .op(Op::infix(Rule::power, Assoc::Right));

        pratt
            .map_primary(|primary| match primary.as_rule() {
                Rule::term => {
                    let inner = primary
                        .into_inner()
                        .next()
                        .ok_or(ParseError::UnexpectedEOF)?;
                    match inner.as_rule() {
                        Rule::number => {
                            let num = inner.as_str().parse::<f64>().map_err(|_| {
                                ParseError::InvalidNumber(inner.as_str().to_string())
                            })?;
                            Ok(Expr::Number(num))
                        }
                        Rule::identifier => Ok(Expr::Identifier(inner.as_str().to_string())),
                        Rule::expression => Self::build_expression(inner.into_inner()),
                        _ => unreachable!("Unexpected term rule: {:?}", inner.as_rule()),
                    }
                }
                Rule::number => {
                    let num = primary
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(primary.as_str().to_string()))?;
                    Ok(Expr::Number(num))
                }
                Rule::identifier => Ok(Expr::Identifier(primary.as_str().to_string())),
                Rule::expression => Self::build_expression(primary.into_inner()),
                _ => unreachable!("Unexpected primary rule: {:?}", primary.as_rule()),
            })
            .map_infix(|left, op, right| {
                let op = match op.as_rule() {
                    Rule::add => BinOperator::Add,
                    Rule::subtract => BinOperator::Subtract,
                    Rule::multiply => BinOperator::Multiply,
                    Rule::divide => BinOperator::Divide,
                    Rule::power => BinOperator::Power,
                    Rule::eq => BinOperator::Eq,
                    Rule::ne => BinOperator::Ne,
                    Rule::lt => BinOperator::Lt,
                    Rule::le => BinOperator::Le,
                    Rule::gt => BinOperator::Gt,
                    Rule::ge => BinOperator::Ge,
                    _ => return Err(ParseError::UnknownOperator(op.as_str().to_string())),
                };
                Ok(Expr::BinOp {
                    left: Box::new(left?),
                    op,
                    right: Box::new(right?),
                })
            })
            .parse(pairs)
    }

    /// Parse a calculator expression with explicit precedence rules
    pub fn parse_calculation(input: &str) -> Result<f64> {
        let pairs = Self::parse(Rule::calculation, input).map_err(Box::new)?;
        Self::evaluate_calculation(
            pairs
                .into_iter()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?
                .into_inner()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?,
        )
    }

    fn evaluate_calculation(pair: Pair<Rule>) -> Result<f64> {
        match pair.as_rule() {
            Rule::calc_expression => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(term)) = (pairs.next(), pairs.next()) {
                    let term_val = Self::evaluate_calculation(term)?;
                    match op.as_rule() {
                        Rule::calc_add_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_plus => result += term_val,
                                Rule::calc_minus => result -= term_val,
                                _ => unreachable!("Unexpected add op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_plus => result += term_val,
                        Rule::calc_minus => result -= term_val,
                        _ => unreachable!("Unexpected calc expression op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_term => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(factor)) = (pairs.next(), pairs.next()) {
                    let factor_val = Self::evaluate_calculation(factor)?;
                    match op.as_rule() {
                        Rule::calc_mul_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_multiply => result *= factor_val,
                                Rule::calc_divide => result /= factor_val,
                                _ => unreachable!("Unexpected mul op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_multiply => result *= factor_val,
                        Rule::calc_divide => result /= factor_val,
                        _ => unreachable!("Unexpected calc term op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_factor => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_power => {
                let mut pairs = pair.into_inner();
                let base =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                if let Some(op) = pairs.next() {
                    if op.as_rule() == Rule::calc_pow_op {
                        let exponent = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(base.powf(exponent))
                    } else {
                        unreachable!("Expected calc_pow_op, got: {:?}", op.as_rule());
                    }
                } else {
                    Ok(base)
                }
            }
            Rule::calc_atom => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_unary => {
                let mut pairs = pair.into_inner();
                let first = pairs.next().ok_or(ParseError::UnexpectedEOF)?;

                match first.as_rule() {
                    Rule::calc_minus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(-val)
                    }
                    Rule::calc_plus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(val)
                    }
                    Rule::calc_number => first
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(first.as_str().to_string())),
                    _ => Self::evaluate_calculation(first),
                }
            }
            Rule::calc_number => pair
                .as_str()
                .parse::<f64>()
                .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string())),
            _ => unreachable!("Unexpected rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse JSON input into a JsonValue AST
    pub fn parse_json(input: &str) -> Result<JsonValue> {
        let pairs = Self::parse(Rule::json_value, input).map_err(Box::new)?;
        Self::build_json_value(pairs.into_iter().next().ok_or(ParseError::UnexpectedEOF)?)
    }

    fn build_json_value(pair: Pair<Rule>) -> Result<JsonValue> {
        match pair.as_rule() {
            Rule::json_value => {
                Self::build_json_value(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::object => {
                let mut object = Vec::new();
                for pair in pair.into_inner() {
                    if let Rule::pair = pair.as_rule() {
                        let mut inner = pair.into_inner();
                        let key =
                            Self::parse_string(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        let value =
                            Self::build_json_value(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        object.push((key, value));
                    }
                }
                Ok(JsonValue::Object(object))
            }
            Rule::array => {
                let mut array = Vec::new();
                for pair in pair.into_inner() {
                    array.push(Self::build_json_value(pair)?);
                }
                Ok(JsonValue::Array(array))
            }
            Rule::string => Ok(JsonValue::String(Self::parse_string(pair)?)),
            Rule::number => {
                let num = pair
                    .as_str()
                    .parse::<f64>()
                    .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string()))?;
                Ok(JsonValue::Number(num))
            }
            Rule::boolean => Ok(JsonValue::Boolean(pair.as_str() == "true")),
            Rule::null => Ok(JsonValue::Null),
            _ => Err(ParseError::InvalidJson),
        }
    }

    fn parse_string(pair: Pair<Rule>) -> Result<String> {
        let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
        Ok(inner.as_str().to_string())
    }
}
impl GrammarParser {
    /// Parse a complete program
    pub fn parse_program(input: &str) -> Result<Program> {
        let pairs = Self::parse(Rule::program, input).map_err(Box::new)?;
        let mut statements = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                statements.push(Self::build_statement(pair)?);
            }
        }

        Ok(Program { statements })
    }

    fn build_statement(pair: Pair<Rule>) -> Result<Statement> {
        match pair.as_rule() {
            Rule::statement => {
                Self::build_statement(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::if_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let then_block = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                let else_block = inner
                    .next()
                    .map(|p| match p.as_rule() {
                        Rule::block => Self::build_block(p),
                        Rule::if_statement => Ok(vec![Self::build_statement(p)?]),
                        _ => unreachable!("Unexpected else branch rule: {:?}", p.as_rule()),
                    })
                    .transpose()?;

                Ok(Statement::If {
                    condition,
                    then_block,
                    else_block,
                })
            }
            Rule::while_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::While { condition, body })
            }
            Rule::function_def => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();

                let mut parameters = Vec::new();
                let mut next = inner.next().ok_or(ParseError::UnexpectedEOF)?;

                if next.as_rule() == Rule::parameter_list {
                    for param_pair in next.into_inner() {
                        let mut param_inner = param_pair.into_inner();
                        let param_name = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        let param_type = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        parameters.push(Parameter {
                            name: param_name,
                            type_name: param_type,
                        });
                    }
                    next = inner.next().ok_or(ParseError::UnexpectedEOF)?;
                }

                let return_type = next.as_str().to_string();
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::Function {
                    name,
                    parameters,
                    return_type,
                    body,
                })
            }
            Rule::assignment => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();
                let value = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;

                Ok(Statement::Assignment { name, value })
            }
            Rule::expression_statement => {
                let expr = Self::build_expression_from_pair(
                    pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                Ok(Statement::Expression(expr))
            }
            Rule::block => Ok(Statement::Block(Self::build_block(pair)?)),
            _ => unreachable!("Unexpected statement rule: {:?}", pair.as_rule()),
        }
    }

    fn build_block(pair: Pair<Rule>) -> Result<Vec<Statement>> {
        let mut statements = Vec::new();
        for stmt_pair in pair.into_inner() {
            statements.push(Self::build_statement(stmt_pair)?);
        }
        Ok(statements)
    }

    fn build_expression_from_pair(pair: Pair<Rule>) -> Result<Expr> {
        Self::build_expression(pair.into_inner())
    }
}
impl GrammarParser {
    /// Parse input into a stream of tokens
    pub fn parse_tokens(input: &str) -> Result<Vec<Token>> {
        let pairs = Self::parse(Rule::token_stream, input).map_err(Box::new)?;
        let mut tokens = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                tokens.push(Self::build_token(pair)?);
            }
        }

        Ok(tokens)
    }

    fn build_token(pair: Pair<Rule>) -> Result<Token> {
        match pair.as_rule() {
            Rule::keyword => Ok(Token::Keyword(pair.as_str().to_string())),
            Rule::operator_token => Ok(Token::Operator(pair.as_str().to_string())),
            Rule::punctuation => Ok(Token::Punctuation(pair.as_str().to_string())),
            Rule::literal => {
                let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                match inner.as_rule() {
                    Rule::string_literal => {
                        let content = inner
                            .into_inner()
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str();
                        Ok(Token::Literal(LiteralValue::String(content.to_string())))
                    }
                    Rule::number_literal => {
                        let num = inner
                            .as_str()
                            .parse::<f64>()
                            .map_err(|_| ParseError::InvalidNumber(inner.as_str().to_string()))?;
                        Ok(Token::Literal(LiteralValue::Number(num)))
                    }
                    Rule::boolean_literal => Ok(Token::Literal(LiteralValue::Boolean(
                        inner.as_str() == "true",
                    ))),
                    _ => unreachable!("Unexpected literal rule: {:?}", inner.as_rule()),
                }
            }
            Rule::identifier_token => Ok(Token::Identifier(pair.as_str().to_string())),
            _ => unreachable!("Unexpected token rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse and print pest parse tree for debugging
    pub fn debug_parse(rule: Rule, input: &str) -> Result<()> {
        let pairs = Self::parse(rule, input).map_err(Box::new)?;
        for pair in pairs {
            Self::print_pair(&pair, 0);
        }
        Ok(())
    }

    fn print_pair(pair: &Pair<Rule>, indent: usize) {
        let indent_str = "  ".repeat(indent);
        println!("{}{:?}: \"{}\"", indent_str, pair.as_rule(), pair.as_str());

        for inner_pair in pair.clone().into_inner() {
            Self::print_pair(&inner_pair, indent + 1);
        }
    }

    /// Extract all identifiers from an expression
    pub fn extract_identifiers(expr: &Expr) -> Vec<String> {
        match expr {
            Expr::Identifier(name) => vec![name.clone()],
            Expr::BinOp { left, right, .. } => {
                let mut ids = Self::extract_identifiers(left);
                ids.extend(Self::extract_identifiers(right));
                ids
            }
            Expr::Number(_) => vec![],
        }
    }

    /// Check if a rule matches the complete input
    pub fn can_parse(rule: Rule, input: &str) -> bool {
        match Self::parse(rule, input) {
            Ok(pairs) => {
                // Check that the entire input is consumed
                let input_len = input.len();
                let parsed_len = pairs.as_str().len();
                parsed_len == input_len
            }
            Err(_) => false,
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_expression_parsing() {
        let expr = GrammarParser::parse_expression("2 + 3 * 4").unwrap();
        match expr {
            Expr::BinOp {
                op: BinOperator::Add,
                ..
            } => (),
            _ => panic!("Expected addition at top level"),
        }
    }

    #[test]
    fn test_calculation() {
        assert_eq!(GrammarParser::parse_calculation("2 + 3 * 4").unwrap(), 14.0);
        assert_eq!(
            GrammarParser::parse_calculation("(2 + 3) * 4").unwrap(),
            20.0
        );
        assert_eq!(
            GrammarParser::parse_calculation("2 ^ 3 ^ 2").unwrap(),
            512.0
        );
    }

    #[test]
    fn test_json_parsing() {
        let json = r#"{"name": "test", "value": 42, "active": true}"#;
        let result = GrammarParser::parse_json(json).unwrap();

        if let JsonValue::Object(obj) = result {
            assert_eq!(obj.len(), 3);
        } else {
            panic!("Expected JSON object");
        }
    }

    #[test]
    fn test_program_parsing() {
        let program = r#"
            fn add(x: int, y: int) -> int {
                x + y;
            }

            if x > 0 {
                y = 42;
            }
        "#;

        let result = GrammarParser::parse_program(program).unwrap();
        assert_eq!(result.statements.len(), 2);
    }

    #[test]
    fn test_token_parsing() {
        let input = "if x == 42 { return true; }";
        let tokens = GrammarParser::parse_tokens(input).unwrap();
        assert!(tokens.len() > 5);

        match &tokens[0] {
            Token::Keyword(kw) => assert_eq!(kw, "if"),
            _ => panic!("Expected keyword"),
        }
    }

    #[test]
    fn test_identifier_extraction() {
        let expr = GrammarParser::parse_expression("x + y * z").unwrap();
        let ids = GrammarParser::extract_identifiers(&expr);
        assert_eq!(ids, vec!["x", "y", "z"]);
    }

    #[test]
    fn test_debug_features() {
        assert!(GrammarParser::can_parse(Rule::expression, "2 + 3"));
        assert!(!GrammarParser::can_parse(Rule::expression, "2 +"));
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    If {
        condition: Expr,
        then_block: Vec<Statement>,
        else_block: Option<Vec<Statement>>,
    },
    While {
        condition: Expr,
        body: Vec<Statement>,
    },
    Function {
        name: String,
        parameters: Vec<Parameter>,
        return_type: String,
        body: Vec<Statement>,
    },
    Assignment {
        name: String,
        value: Expr,
    },
    Expression(Expr),
    Block(Vec<Statement>),
}
}

Statement types demonstrate parsing of control flow and program structure. The grammar supports if statements with an optional else branch, while loops, function definitions, assignments, expression statements, and nested blocks.
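To make the shape of these variants concrete, the sketch below builds a small AST by hand and measures how deeply its statement lists nest. It is a standalone illustration rather than part of the parser: Expr and Statement are trimmed copies of the types above (the Function variant and BinOp are left out) so that the block compiles without the rest of the crate, and sample_if and nesting_depth are hypothetical helpers introduced only for this example.

```rust
// Trimmed copies of the AST types, so this sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    If {
        condition: Expr,
        then_block: Vec<Statement>,
        else_block: Option<Vec<Statement>>,
    },
    While {
        condition: Expr,
        body: Vec<Statement>,
    },
    Assignment {
        name: String,
        value: Expr,
    },
    Expression(Expr),
    Block(Vec<Statement>),
}

/// Hypothetical helper: hand-built AST for `if x { while y { z = 1; } }`.
pub fn sample_if() -> Statement {
    Statement::If {
        condition: Expr::Identifier("x".to_string()),
        then_block: vec![Statement::While {
            condition: Expr::Identifier("y".to_string()),
            body: vec![Statement::Assignment {
                name: "z".to_string(),
                value: Expr::Number(1.0),
            }],
        }],
        // No else branch, which is why the field is an Option.
        else_block: None,
    }
}

/// Hypothetical helper: counts how many levels of nested statement lists
/// sit beneath `stmt`. Leaf statements have depth 0.
pub fn nesting_depth(stmt: &Statement) -> usize {
    // Deepest nesting among the statements of one list (0 for an empty list).
    fn deepest(stmts: &[Statement]) -> usize {
        stmts.iter().map(nesting_depth).max().unwrap_or(0)
    }
    match stmt {
        Statement::If { then_block, else_block, .. } => {
            let else_depth = else_block.as_deref().map(deepest).unwrap_or(0);
            1 + deepest(then_block).max(else_depth)
        }
        Statement::While { body, .. } => 1 + deepest(body),
        Statement::Block(stmts) => 1 + deepest(stmts),
        Statement::Assignment { .. } | Statement::Expression(_) => 0,
    }
}
```

Because every body is a Vec<Statement>, a single recursive function covers if, while, and bare blocks alike; the parser's builder functions rely on the same property.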

impl GrammarParser {
    /// Parse a complete program
    pub fn parse_program(input: &str) -> Result<Program> {
        let pairs = Self::parse(Rule::program, input).map_err(Box::new)?;
        let mut statements = Vec::new();

        // The first top-level pair is `program`; its children are the
        // statements followed by the end-of-input marker.
        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            // EOI carries no statement, so it is skipped.
            if pair.as_rule() != Rule::EOI {
                statements.push(Self::build_statement(pair)?);
            }
        }

        Ok(Program { statements })
    }
}

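Program parsing builds an AST from multiple statement types. parse_program takes the single program pair, iterates over its children, skips the end-of-input marker (EOI), and hands every other pair to build_statement. Nested blocks and control structures are handled recursively: build_statement calls build_block for each body, and build_block calls build_statement for each statement inside it. Matching on rule names keeps a one-to-one correspondence between grammar rules and the code that handles them.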
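The mutual recursion between build_statement and build_block can be sketched without pest. In the block below, Node and MockRule are hypothetical stand-ins for pest's Pair<Rule> and the generated Rule enum, the AST is reduced to three variants, and a while node keeps its condition as plain text; only the control flow is meant to mirror the real functions.

```rust
// Hypothetical stand-ins so the sketch compiles without the pest crate.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MockRule {
    Program,
    Statement,
    While,
    Assignment,
    Block,
    Eoi,
}

#[derive(Debug, Clone)]
pub struct Node {
    pub rule: MockRule,
    pub text: String,
    pub children: Vec<Node>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    While { condition: String, body: Vec<Stmt> },
    Assignment(String),
    Block(Vec<Stmt>),
}

/// Convenience constructor for the mock tree.
pub fn node(rule: MockRule, text: &str, children: Vec<Node>) -> Node {
    Node { rule, text: text.to_string(), children }
}

/// Mirrors parse_program: walk the children of the root, skipping EOI.
pub fn build_program(root: &Node) -> Result<Vec<Stmt>, String> {
    root.children
        .iter()
        .filter(|child| child.rule != MockRule::Eoi)
        .map(build_statement)
        .collect()
}

/// Mirrors build_statement: one match arm per rule name.
pub fn build_statement(n: &Node) -> Result<Stmt, String> {
    match n.rule {
        // A wrapper rule just forwards to its only child.
        MockRule::Statement => {
            let inner = n.children.first().ok_or("empty statement")?;
            build_statement(inner)
        }
        MockRule::While => {
            let body = n.children.first().ok_or("while without body")?;
            Ok(Stmt::While { condition: n.text.clone(), body: build_block(body)? })
        }
        MockRule::Assignment => Ok(Stmt::Assignment(n.text.clone())),
        MockRule::Block => Ok(Stmt::Block(build_block(n)?)),
        other => Err(format!("unexpected rule: {:?}", other)),
    }
}

/// Mirrors build_block: recurse into every child statement.
pub fn build_block(block: &Node) -> Result<Vec<Stmt>, String> {
    block.children.iter().map(build_statement).collect()
}

/// Hypothetical sample: the tree for `while x { y = 1; { z = 2; } }` plus EOI.
pub fn sample_tree() -> Node {
    use MockRule as R;
    let assign = |name: &str| node(R::Statement, "", vec![node(R::Assignment, name, vec![])]);
    let nested = node(R::Statement, "", vec![node(R::Block, "", vec![assign("z")])]);
    let body = node(R::Block, "", vec![assign("y"), nested]);
    let while_stmt = node(R::Statement, "", vec![node(R::While, "x", vec![body])]);
    node(R::Program, "", vec![while_stmt, node(R::Eoi, "", vec![])])
}
```

Calling build_program on sample_tree yields a single While statement whose body holds an assignment and a nested block, which shows how arbitrarily deep nesting falls out of two short functions calling each other. Unlike the real code, this sketch returns an error for an unexpected rule instead of calling unreachable!, a choice made here only to keep the example easy to test.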

Token Stream Parsing

#![allow(unused)]
fn main() {
use std::fmt;
use pest::error::Error;
use pest::iterators::{Pair, Pairs};
use pest::pratt_parser::{Assoc, Op, PrattParser};
use pest::Parser;
use pest_derive::Parser;
use thiserror::Error;
#[derive(Parser)]
#[grammar = "grammar.pest"]
pub struct GrammarParser;
#[derive(Error, Debug)]
pub enum ParseError {
    #[error("Pest parsing error: {0}")]
    Pest(#[from] Box<Error<Rule>>),
    #[error("Invalid number format: {0}")]
    InvalidNumber(String),
    #[error("Unknown operator: {0}")]
    UnknownOperator(String),
    #[error("Invalid JSON value")]
    InvalidJson,
    #[error("Unexpected end of input")]
    UnexpectedEOF,
}
pub type Result<T> = std::result::Result<T, ParseError>;
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
    BinOp {
        left: Box<Expr>,
        op: BinOperator,
        right: Box<Expr>,
    },
}
#[derive(Debug, Clone, PartialEq)]
pub enum BinOperator {
    Add,
    Subtract,
    Multiply,
    Divide,
    Power,
    Eq,
    Ne,
    Lt,
    Le,
    Gt,
    Ge,
}
impl fmt::Display for BinOperator {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BinOperator::Add => write!(f, "+"),
            BinOperator::Subtract => write!(f, "-"),
            BinOperator::Multiply => write!(f, "*"),
            BinOperator::Divide => write!(f, "/"),
            BinOperator::Power => write!(f, "^"),
            BinOperator::Eq => write!(f, "=="),
            BinOperator::Ne => write!(f, "!="),
            BinOperator::Lt => write!(f, "<"),
            BinOperator::Le => write!(f, "<="),
            BinOperator::Gt => write!(f, ">"),
            BinOperator::Ge => write!(f, ">="),
        }
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Program {
    pub statements: Vec<Statement>,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Parameter {
    pub name: String,
    pub type_name: String,
}
#[derive(Debug, Clone, PartialEq)]
pub enum LiteralValue {
    String(String),
    Number(f64),
    Boolean(bool),
}
impl GrammarParser {
    /// Parse an expression using pest's built-in precedence climbing
    pub fn parse_expression(input: &str) -> Result<Expr> {
        let pairs = Self::parse(Rule::expression, input).map_err(Box::new)?;
        Self::build_expression(pairs)
    }

    fn build_expression(pairs: Pairs<Rule>) -> Result<Expr> {
        let pratt = PrattParser::new()
            .op(Op::infix(Rule::eq, Assoc::Left) | Op::infix(Rule::ne, Assoc::Left))
            .op(Op::infix(Rule::lt, Assoc::Left)
                | Op::infix(Rule::le, Assoc::Left)
                | Op::infix(Rule::gt, Assoc::Left)
                | Op::infix(Rule::ge, Assoc::Left))
            .op(Op::infix(Rule::add, Assoc::Left) | Op::infix(Rule::subtract, Assoc::Left))
            .op(Op::infix(Rule::multiply, Assoc::Left) | Op::infix(Rule::divide, Assoc::Left))
            .op(Op::infix(Rule::power, Assoc::Right));

        pratt
            .map_primary(|primary| match primary.as_rule() {
                Rule::term => {
                    let inner = primary
                        .into_inner()
                        .next()
                        .ok_or(ParseError::UnexpectedEOF)?;
                    match inner.as_rule() {
                        Rule::number => {
                            let num = inner.as_str().parse::<f64>().map_err(|_| {
                                ParseError::InvalidNumber(inner.as_str().to_string())
                            })?;
                            Ok(Expr::Number(num))
                        }
                        Rule::identifier => Ok(Expr::Identifier(inner.as_str().to_string())),
                        Rule::expression => Self::build_expression(inner.into_inner()),
                        _ => unreachable!("Unexpected term rule: {:?}", inner.as_rule()),
                    }
                }
                Rule::number => {
                    let num = primary
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(primary.as_str().to_string()))?;
                    Ok(Expr::Number(num))
                }
                Rule::identifier => Ok(Expr::Identifier(primary.as_str().to_string())),
                Rule::expression => Self::build_expression(primary.into_inner()),
                _ => unreachable!("Unexpected primary rule: {:?}", primary.as_rule()),
            })
            .map_infix(|left, op, right| {
                let op = match op.as_rule() {
                    Rule::add => BinOperator::Add,
                    Rule::subtract => BinOperator::Subtract,
                    Rule::multiply => BinOperator::Multiply,
                    Rule::divide => BinOperator::Divide,
                    Rule::power => BinOperator::Power,
                    Rule::eq => BinOperator::Eq,
                    Rule::ne => BinOperator::Ne,
                    Rule::lt => BinOperator::Lt,
                    Rule::le => BinOperator::Le,
                    Rule::gt => BinOperator::Gt,
                    Rule::ge => BinOperator::Ge,
                    _ => return Err(ParseError::UnknownOperator(op.as_str().to_string())),
                };
                Ok(Expr::BinOp {
                    left: Box::new(left?),
                    op,
                    right: Box::new(right?),
                })
            })
            .parse(pairs)
    }

    /// Parse a calculator expression with explicit precedence rules
    pub fn parse_calculation(input: &str) -> Result<f64> {
        let pairs = Self::parse(Rule::calculation, input).map_err(Box::new)?;
        Self::evaluate_calculation(
            pairs
                .into_iter()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?
                .into_inner()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?,
        )
    }

    fn evaluate_calculation(pair: Pair<Rule>) -> Result<f64> {
        match pair.as_rule() {
            Rule::calc_expression => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(term)) = (pairs.next(), pairs.next()) {
                    let term_val = Self::evaluate_calculation(term)?;
                    match op.as_rule() {
                        Rule::calc_add_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_plus => result += term_val,
                                Rule::calc_minus => result -= term_val,
                                _ => unreachable!("Unexpected add op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_plus => result += term_val,
                        Rule::calc_minus => result -= term_val,
                        _ => unreachable!("Unexpected calc expression op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_term => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(factor)) = (pairs.next(), pairs.next()) {
                    let factor_val = Self::evaluate_calculation(factor)?;
                    match op.as_rule() {
                        Rule::calc_mul_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_multiply => result *= factor_val,
                                Rule::calc_divide => result /= factor_val,
                                _ => unreachable!("Unexpected mul op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_multiply => result *= factor_val,
                        Rule::calc_divide => result /= factor_val,
                        _ => unreachable!("Unexpected calc term op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_factor => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_power => {
                let mut pairs = pair.into_inner();
                let base =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                if let Some(op) = pairs.next() {
                    if op.as_rule() == Rule::calc_pow_op {
                        let exponent = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(base.powf(exponent))
                    } else {
                        unreachable!("Expected calc_pow_op, got: {:?}", op.as_rule());
                    }
                } else {
                    Ok(base)
                }
            }
            Rule::calc_atom => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_unary => {
                let mut pairs = pair.into_inner();
                let first = pairs.next().ok_or(ParseError::UnexpectedEOF)?;

                match first.as_rule() {
                    Rule::calc_minus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(-val)
                    }
                    Rule::calc_plus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(val)
                    }
                    Rule::calc_number => first
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(first.as_str().to_string())),
                    _ => Self::evaluate_calculation(first),
                }
            }
            Rule::calc_number => pair
                .as_str()
                .parse::<f64>()
                .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string())),
            _ => unreachable!("Unexpected rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse JSON input into a JsonValue AST
    pub fn parse_json(input: &str) -> Result<JsonValue> {
        let pairs = Self::parse(Rule::json_value, input).map_err(Box::new)?;
        Self::build_json_value(pairs.into_iter().next().ok_or(ParseError::UnexpectedEOF)?)
    }

    fn build_json_value(pair: Pair<Rule>) -> Result<JsonValue> {
        match pair.as_rule() {
            Rule::json_value => {
                Self::build_json_value(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::object => {
                let mut object = Vec::new();
                for pair in pair.into_inner() {
                    if let Rule::pair = pair.as_rule() {
                        let mut inner = pair.into_inner();
                        let key =
                            Self::parse_string(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        let value =
                            Self::build_json_value(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                        object.push((key, value));
                    }
                }
                Ok(JsonValue::Object(object))
            }
            Rule::array => {
                let mut array = Vec::new();
                for pair in pair.into_inner() {
                    array.push(Self::build_json_value(pair)?);
                }
                Ok(JsonValue::Array(array))
            }
            Rule::string => Ok(JsonValue::String(Self::parse_string(pair)?)),
            Rule::number => {
                let num = pair
                    .as_str()
                    .parse::<f64>()
                    .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string()))?;
                Ok(JsonValue::Number(num))
            }
            Rule::boolean => Ok(JsonValue::Boolean(pair.as_str() == "true")),
            Rule::null => Ok(JsonValue::Null),
            _ => Err(ParseError::InvalidJson),
        }
    }

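    /// Return the text between the quotes of a string pair.
    /// The text is copied as matched; escape sequences such as `\n` are not decoded here.
    fn parse_string(pair: Pair<Rule>) -> Result<String> {
        let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
        Ok(inner.as_str().to_string())
    }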
}
impl GrammarParser {
    /// Parse a complete program
    pub fn parse_program(input: &str) -> Result<Program> {
        let pairs = Self::parse(Rule::program, input).map_err(Box::new)?;
        let mut statements = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                statements.push(Self::build_statement(pair)?);
            }
        }

        Ok(Program { statements })
    }

    fn build_statement(pair: Pair<Rule>) -> Result<Statement> {
        match pair.as_rule() {
            Rule::statement => {
                Self::build_statement(pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?)
            }
            Rule::if_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let then_block = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;
                let else_block = inner
                    .next()
                    .map(|p| match p.as_rule() {
                        Rule::block => Self::build_block(p),
                        Rule::if_statement => Ok(vec![Self::build_statement(p)?]),
                        _ => unreachable!(),
                    })
                    .transpose()?;

                Ok(Statement::If {
                    condition,
                    then_block,
                    else_block,
                })
            }
            Rule::while_statement => {
                let mut inner = pair.into_inner();
                let condition = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::While { condition, body })
            }
            Rule::function_def => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();

                let mut parameters = Vec::new();
                let mut next = inner.next().ok_or(ParseError::UnexpectedEOF)?;

                if next.as_rule() == Rule::parameter_list {
                    for param_pair in next.into_inner() {
                        let mut param_inner = param_pair.into_inner();
                        let param_name = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        let param_type = param_inner
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str()
                            .to_string();
                        parameters.push(Parameter {
                            name: param_name,
                            type_name: param_type,
                        });
                    }
                    next = inner.next().ok_or(ParseError::UnexpectedEOF)?;
                }

                let return_type = next.as_str().to_string();
                let body = Self::build_block(inner.next().ok_or(ParseError::UnexpectedEOF)?)?;

                Ok(Statement::Function {
                    name,
                    parameters,
                    return_type,
                    body,
                })
            }
            Rule::assignment => {
                let mut inner = pair.into_inner();
                let name = inner
                    .next()
                    .ok_or(ParseError::UnexpectedEOF)?
                    .as_str()
                    .to_string();
                let value = Self::build_expression_from_pair(
                    inner.next().ok_or(ParseError::UnexpectedEOF)?,
                )?;

                Ok(Statement::Assignment { name, value })
            }
            Rule::expression_statement => {
                let expr = Self::build_expression_from_pair(
                    pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
                )?;
                Ok(Statement::Expression(expr))
            }
            Rule::block => Ok(Statement::Block(Self::build_block(pair)?)),
            _ => unreachable!("Unexpected statement rule: {:?}", pair.as_rule()),
        }
    }

    fn build_block(pair: Pair<Rule>) -> Result<Vec<Statement>> {
        let mut statements = Vec::new();
        for stmt_pair in pair.into_inner() {
            statements.push(Self::build_statement(stmt_pair)?);
        }
        Ok(statements)
    }

    fn build_expression_from_pair(pair: Pair<Rule>) -> Result<Expr> {
        Self::build_expression(pair.into_inner())
    }
}
impl GrammarParser {
    /// Parse input into a stream of tokens
    pub fn parse_tokens(input: &str) -> Result<Vec<Token>> {
        let pairs = Self::parse(Rule::token_stream, input).map_err(Box::new)?;
        let mut tokens = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                tokens.push(Self::build_token(pair)?);
            }
        }

        Ok(tokens)
    }

    fn build_token(pair: Pair<Rule>) -> Result<Token> {
        match pair.as_rule() {
            Rule::keyword => Ok(Token::Keyword(pair.as_str().to_string())),
            Rule::operator_token => Ok(Token::Operator(pair.as_str().to_string())),
            Rule::punctuation => Ok(Token::Punctuation(pair.as_str().to_string())),
            Rule::literal => {
                let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                match inner.as_rule() {
                    Rule::string_literal => {
                        let content = inner
                            .into_inner()
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str();
                        Ok(Token::Literal(LiteralValue::String(content.to_string())))
                    }
                    Rule::number_literal => {
                        let num = inner
                            .as_str()
                            .parse::<f64>()
                            .map_err(|_| ParseError::InvalidNumber(inner.as_str().to_string()))?;
                        Ok(Token::Literal(LiteralValue::Number(num)))
                    }
                    Rule::boolean_literal => Ok(Token::Literal(LiteralValue::Boolean(
                        inner.as_str() == "true",
                    ))),
                    _ => unreachable!(),
                }
            }
            Rule::identifier_token => Ok(Token::Identifier(pair.as_str().to_string())),
            _ => unreachable!("Unexpected token rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse and print pest parse tree for debugging
    pub fn debug_parse(rule: Rule, input: &str) -> Result<()> {
        let pairs = Self::parse(rule, input).map_err(Box::new)?;
        for pair in pairs {
            Self::print_pair(&pair, 0);
        }
        Ok(())
    }

    fn print_pair(pair: &Pair<Rule>, indent: usize) {
        let indent_str = "  ".repeat(indent);
        println!("{}{:?}: \"{}\"", indent_str, pair.as_rule(), pair.as_str());

        for inner_pair in pair.clone().into_inner() {
            Self::print_pair(&inner_pair, indent + 1);
        }
    }

    /// Extract all identifiers from an expression
    pub fn extract_identifiers(expr: &Expr) -> Vec<String> {
        match expr {
            Expr::Identifier(name) => vec![name.clone()],
            Expr::BinOp { left, right, .. } => {
                let mut ids = Self::extract_identifiers(left);
                ids.extend(Self::extract_identifiers(right));
                ids
            }
            Expr::Number(_) => vec![],
        }
    }

    /// Check if a rule matches the complete input
    pub fn can_parse(rule: Rule, input: &str) -> bool {
        match Self::parse(rule, input) {
            Ok(pairs) => {
                // Check that the entire input is consumed
                let input_len = input.len();
                let parsed_len = pairs.as_str().len();
                parsed_len == input_len
            }
            Err(_) => false,
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_expression_parsing() {
        let expr = GrammarParser::parse_expression("2 + 3 * 4").unwrap();
        match expr {
            Expr::BinOp {
                op: BinOperator::Add,
                ..
            } => (),
            _ => panic!("Expected addition at top level"),
        }
    }

    #[test]
    fn test_calculation() {
        assert_eq!(GrammarParser::parse_calculation("2 + 3 * 4").unwrap(), 14.0);
        assert_eq!(
            GrammarParser::parse_calculation("(2 + 3) * 4").unwrap(),
            20.0
        );
        assert_eq!(
            GrammarParser::parse_calculation("2 ^ 3 ^ 2").unwrap(),
            512.0
        );
    }

    #[test]
    fn test_json_parsing() {
        let json = r#"{"name": "test", "value": 42, "active": true}"#;
        let result = GrammarParser::parse_json(json).unwrap();

        if let JsonValue::Object(obj) = result {
            assert_eq!(obj.len(), 3);
        } else {
            panic!("Expected JSON object");
        }
    }

    #[test]
    fn test_program_parsing() {
        let program = r#"
            fn add(x: int, y: int) -> int {
                x + y;
            }

            if x > 0 {
                y = 42;
            }
        "#;

        let result = GrammarParser::parse_program(program).unwrap();
        assert_eq!(result.statements.len(), 2);
    }

    #[test]
    fn test_token_parsing() {
        let input = "if x == 42 { return true; }";
        let tokens = GrammarParser::parse_tokens(input).unwrap();
        assert!(tokens.len() > 5);

        match &tokens[0] {
            Token::Keyword(kw) => assert_eq!(kw, "if"),
            _ => panic!("Expected keyword"),
        }
    }

    #[test]
    fn test_identifier_extraction() {
        let expr = GrammarParser::parse_expression("x + y * z").unwrap();
        let ids = GrammarParser::extract_identifiers(&expr);
        assert_eq!(ids, vec!["x", "y", "z"]);
    }

    #[test]
    fn test_debug_features() {
        assert!(GrammarParser::can_parse(Rule::expression, "2 + 3"));
        assert!(!GrammarParser::can_parse(Rule::expression, "2 +"));
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Operator(String),
    Punctuation(String),
    Literal(LiteralValue),
    Identifier(String),
}
}

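Token extraction shows that pest can also serve as a lexer. The grammar classifies each token as a keyword, operator, punctuation mark, literal, or identifier, and the Token enum keeps the matched text for every kind except literals, which are stored as parsed values.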

#![allow(unused)]
fn main() {
/// Parse input into a stream of tokens
pub fn parse_tokens(input: &str) -> Result<Vec<Token>> {
    let pairs = Self::parse(Rule::token_stream, input).map_err(Box::new)?;
    let mut tokens = Vec::new();

    for pair in pairs
        .into_iter()
        .next()
        .ok_or(ParseError::UnexpectedEOF)?
        .into_inner()
    {
        if pair.as_rule() != Rule::EOI {
            tokens.push(Self::build_token(pair)?);
        }
    }

    Ok(tokens)
}
}
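
The vector returned by parse_tokens is plain data, so later stages can consume it without depending on pest. The following is a minimal std-only sketch of such a stage. The Token and LiteralValue enums are copied from this chapter; the identifiers and sample_tokens functions are hypothetical helpers introduced only for illustration, and the sample stream is built by hand rather than produced by the grammar.

```rust
// Token and LiteralValue mirror the enums defined in this chapter.
#[derive(Debug, Clone, PartialEq)]
pub enum LiteralValue {
    String(String),
    Number(f64),
    Boolean(bool),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Operator(String),
    Punctuation(String),
    Literal(LiteralValue),
    Identifier(String),
}

/// Hypothetical helper: collect the distinct identifier names in a token
/// stream, in order of first appearance.
pub fn identifiers(tokens: &[Token]) -> Vec<&str> {
    let mut seen: Vec<&str> = Vec::new();
    for token in tokens {
        if let Token::Identifier(name) = token {
            if !seen.contains(&name.as_str()) {
                seen.push(name.as_str());
            }
        }
    }
    seen
}

/// Hand-built token stream used as a stand-in for lexer output
/// (an assumption for this sketch, not the result of running the grammar).
pub fn sample_tokens() -> Vec<Token> {
    vec![
        Token::Keyword("if".to_string()),
        Token::Identifier("x".to_string()),
        Token::Operator("==".to_string()),
        Token::Literal(LiteralValue::Number(42.0)),
        Token::Punctuation("{".to_string()),
        Token::Keyword("return".to_string()),
        Token::Identifier("y".to_string()),
        Token::Punctuation(";".to_string()),
        Token::Identifier("x".to_string()),
        Token::Punctuation(";".to_string()),
        Token::Punctuation("}".to_string()),
    ]
}
```

Because the helper only pattern-matches on the enum, it can be unit tested without a grammar file or a parser instance.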

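Token stream parsing extracts lexical elements from source text. Every pest Pair carries its span (available through as_span()), which can be used for error reporting and source mapping. The Token values built here keep only the matched text or the parsed literal value, so a span has to be stored alongside each token if later stages need it. Running this pass on its own separates lexical analysis from syntactic parsing when that is needed.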
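
To illustrate how span information supports error reporting, here is a std-only sketch that does not use pest. The Spanned struct and the line_col and describe functions are hypothetical names chosen for this example. With pest, the two byte offsets would come from the start and end of the span returned by a pair's as_span() method.

```rust
/// Hypothetical wrapper pairing a value with the byte range it was read from.
#[derive(Debug, Clone, PartialEq)]
pub struct Spanned<T> {
    pub value: T,
    pub start: usize, // byte offset of the first byte of the token
    pub end: usize,   // byte offset one past the last byte of the token
}

/// Convert a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes.
pub fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut col = 1;
    for (i, ch) in source.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

/// Render a "line:column: text" label for a spanned token,
/// the kind of prefix a diagnostic message would start with.
pub fn describe(source: &str, token: &Spanned<String>) -> String {
    let (line, col) = line_col(source, token.start);
    format!("{}:{}: {}", line, col, token.value)
}
```

Storing only byte offsets keeps each token small; line and column numbers are computed on demand, when a diagnostic is actually printed.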

Error Handling

#![allow(unused)]
fn main() {
use std::fmt;
use pest::error::Error;
use pest::iterators::{Pair, Pairs};
use pest::pratt_parser::{Assoc, Op, PrattParser};
use pest::Parser;
use pest_derive::Parser;
use thiserror::Error;
#[derive(Parser)]
#[grammar = "grammar.pest"]
pub struct GrammarParser;
pub type Result<T> = std::result::Result<T, ParseError>;
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
    BinOp {
        left: Box<Expr>,
        op: BinOperator,
        right: Box<Expr>,
    },
}
#[derive(Debug, Clone, PartialEq)]
pub enum BinOperator {
    Add,
    Subtract,
    Multiply,
    Divide,
    Power,
    Eq,
    Ne,
    Lt,
    Le,
    Gt,
    Ge,
}
impl fmt::Display for BinOperator {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BinOperator::Add => write!(f, "+"),
            BinOperator::Subtract => write!(f, "-"),
            BinOperator::Multiply => write!(f, "*"),
            BinOperator::Divide => write!(f, "/"),
            BinOperator::Power => write!(f, "^"),
            BinOperator::Eq => write!(f, "=="),
            BinOperator::Ne => write!(f, "!="),
            BinOperator::Lt => write!(f, "<"),
            BinOperator::Le => write!(f, "<="),
            BinOperator::Gt => write!(f, ">"),
            BinOperator::Ge => write!(f, ">="),
        }
    }
}
#[derive(Debug, Clone, PartialEq)]
pub enum JsonValue {
    Object(Vec<(String, JsonValue)>),
    Array(Vec<JsonValue>),
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
}
#[derive(Debug, Clone, PartialEq)]
pub struct Program {
    pub statements: Vec<Statement>,
}
#[derive(Debug, Clone, PartialEq)]
pub enum Statement {
    If {
        condition: Expr,
        then_block: Vec<Statement>,
        else_block: Option<Vec<Statement>>,
    },
    While {
        condition: Expr,
        body: Vec<Statement>,
    },
    Function {
        name: String,
        parameters: Vec<Parameter>,
        return_type: String,
        body: Vec<Statement>,
    },
    Assignment {
        name: String,
        value: Expr,
    },
    Expression(Expr),
    Block(Vec<Statement>),
}
#[derive(Debug, Clone, PartialEq)]
pub struct Parameter {
    pub name: String,
    pub type_name: String,
}
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Keyword(String),
    Operator(String),
    Punctuation(String),
    Literal(LiteralValue),
    Identifier(String),
}
#[derive(Debug, Clone, PartialEq)]
pub enum LiteralValue {
    String(String),
    Number(f64),
    Boolean(bool),
}
impl GrammarParser {
    /// Parse an expression using pest's built-in precedence climbing
    pub fn parse_expression(input: &str) -> Result<Expr> {
        let pairs = Self::parse(Rule::expression, input).map_err(Box::new)?;
        Self::build_expression(pairs)
    }

    fn build_expression(pairs: Pairs<Rule>) -> Result<Expr> {
        let pratt = PrattParser::new()
            .op(Op::infix(Rule::eq, Assoc::Left) | Op::infix(Rule::ne, Assoc::Left))
            .op(Op::infix(Rule::lt, Assoc::Left)
                | Op::infix(Rule::le, Assoc::Left)
                | Op::infix(Rule::gt, Assoc::Left)
                | Op::infix(Rule::ge, Assoc::Left))
            .op(Op::infix(Rule::add, Assoc::Left) | Op::infix(Rule::subtract, Assoc::Left))
            .op(Op::infix(Rule::multiply, Assoc::Left) | Op::infix(Rule::divide, Assoc::Left))
            .op(Op::infix(Rule::power, Assoc::Right));

        pratt
            .map_primary(|primary| match primary.as_rule() {
                Rule::term => {
                    let inner = primary
                        .into_inner()
                        .next()
                        .ok_or(ParseError::UnexpectedEOF)?;
                    match inner.as_rule() {
                        Rule::number => {
                            let num = inner.as_str().parse::<f64>().map_err(|_| {
                                ParseError::InvalidNumber(inner.as_str().to_string())
                            })?;
                            Ok(Expr::Number(num))
                        }
                        Rule::identifier => Ok(Expr::Identifier(inner.as_str().to_string())),
                        Rule::expression => Self::build_expression(inner.into_inner()),
                        _ => unreachable!("Unexpected term rule: {:?}", inner.as_rule()),
                    }
                }
                Rule::number => {
                    let num = primary
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(primary.as_str().to_string()))?;
                    Ok(Expr::Number(num))
                }
                Rule::identifier => Ok(Expr::Identifier(primary.as_str().to_string())),
                Rule::expression => Self::build_expression(primary.into_inner()),
                _ => unreachable!("Unexpected primary rule: {:?}", primary.as_rule()),
            })
            .map_infix(|left, op, right| {
                let op = match op.as_rule() {
                    Rule::add => BinOperator::Add,
                    Rule::subtract => BinOperator::Subtract,
                    Rule::multiply => BinOperator::Multiply,
                    Rule::divide => BinOperator::Divide,
                    Rule::power => BinOperator::Power,
                    Rule::eq => BinOperator::Eq,
                    Rule::ne => BinOperator::Ne,
                    Rule::lt => BinOperator::Lt,
                    Rule::le => BinOperator::Le,
                    Rule::gt => BinOperator::Gt,
                    Rule::ge => BinOperator::Ge,
                    _ => return Err(ParseError::UnknownOperator(op.as_str().to_string())),
                };
                Ok(Expr::BinOp {
                    left: Box::new(left?),
                    op,
                    right: Box::new(right?),
                })
            })
            .parse(pairs)
    }

    /// Parse a calculator expression with explicit precedence rules
    pub fn parse_calculation(input: &str) -> Result<f64> {
        let pairs = Self::parse(Rule::calculation, input).map_err(Box::new)?;
        Self::evaluate_calculation(
            pairs
                .into_iter()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?
                .into_inner()
                .next()
                .ok_or(ParseError::UnexpectedEOF)?,
        )
    }

    fn evaluate_calculation(pair: Pair<Rule>) -> Result<f64> {
        match pair.as_rule() {
            Rule::calc_expression => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(term)) = (pairs.next(), pairs.next()) {
                    let term_val = Self::evaluate_calculation(term)?;
                    match op.as_rule() {
                        Rule::calc_add_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_plus => result += term_val,
                                Rule::calc_minus => result -= term_val,
                                _ => unreachable!("Unexpected add op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_plus => result += term_val,
                        Rule::calc_minus => result -= term_val,
                        _ => unreachable!("Unexpected calc expression op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_term => {
                let mut pairs = pair.into_inner();
                let mut result =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                while let (Some(op), Some(factor)) = (pairs.next(), pairs.next()) {
                    let factor_val = Self::evaluate_calculation(factor)?;
                    match op.as_rule() {
                        Rule::calc_mul_op => {
                            let inner_op =
                                op.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                            match inner_op.as_rule() {
                                Rule::calc_multiply => result *= factor_val,
                                Rule::calc_divide => result /= factor_val,
                                _ => unreachable!("Unexpected mul op: {:?}", inner_op.as_rule()),
                            }
                        }
                        Rule::calc_multiply => result *= factor_val,
                        Rule::calc_divide => result /= factor_val,
                        _ => unreachable!("Unexpected calc term op: {:?}", op.as_rule()),
                    }
                }
                Ok(result)
            }
            Rule::calc_factor => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_power => {
                let mut pairs = pair.into_inner();
                let base =
                    Self::evaluate_calculation(pairs.next().ok_or(ParseError::UnexpectedEOF)?)?;

                if let Some(op) = pairs.next() {
                    if op.as_rule() == Rule::calc_pow_op {
                        let exponent = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(base.powf(exponent))
                    } else {
                        unreachable!("Expected calc_pow_op, got: {:?}", op.as_rule());
                    }
                } else {
                    Ok(base)
                }
            }
            Rule::calc_atom => Self::evaluate_calculation(
                pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?,
            ),
            Rule::calc_unary => {
                let mut pairs = pair.into_inner();
                let first = pairs.next().ok_or(ParseError::UnexpectedEOF)?;

                match first.as_rule() {
                    Rule::calc_minus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(-val)
                    }
                    Rule::calc_plus => {
                        let val = Self::evaluate_calculation(
                            pairs.next().ok_or(ParseError::UnexpectedEOF)?,
                        )?;
                        Ok(val)
                    }
                    Rule::calc_number => first
                        .as_str()
                        .parse::<f64>()
                        .map_err(|_| ParseError::InvalidNumber(first.as_str().to_string())),
                    _ => Self::evaluate_calculation(first),
                }
            }
            Rule::calc_number => pair
                .as_str()
                .parse::<f64>()
                .map_err(|_| ParseError::InvalidNumber(pair.as_str().to_string())),
            _ => unreachable!("Unexpected rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse input into a stream of tokens
    pub fn parse_tokens(input: &str) -> Result<Vec<Token>> {
        let pairs = Self::parse(Rule::token_stream, input).map_err(Box::new)?;
        let mut tokens = Vec::new();

        for pair in pairs
            .into_iter()
            .next()
            .ok_or(ParseError::UnexpectedEOF)?
            .into_inner()
        {
            if pair.as_rule() != Rule::EOI {
                tokens.push(Self::build_token(pair)?);
            }
        }

        Ok(tokens)
    }

    fn build_token(pair: Pair<Rule>) -> Result<Token> {
        match pair.as_rule() {
            Rule::keyword => Ok(Token::Keyword(pair.as_str().to_string())),
            Rule::operator_token => Ok(Token::Operator(pair.as_str().to_string())),
            Rule::punctuation => Ok(Token::Punctuation(pair.as_str().to_string())),
            Rule::literal => {
                let inner = pair.into_inner().next().ok_or(ParseError::UnexpectedEOF)?;
                match inner.as_rule() {
                    Rule::string_literal => {
                        let content = inner
                            .into_inner()
                            .next()
                            .ok_or(ParseError::UnexpectedEOF)?
                            .as_str();
                        Ok(Token::Literal(LiteralValue::String(content.to_string())))
                    }
                    Rule::number_literal => {
                        let num = inner
                            .as_str()
                            .parse::<f64>()
                            .map_err(|_| ParseError::InvalidNumber(inner.as_str().to_string()))?;
                        Ok(Token::Literal(LiteralValue::Number(num)))
                    }
                    Rule::boolean_literal => Ok(Token::Literal(LiteralValue::Boolean(
                        inner.as_str() == "true",
                    ))),
                    _ => unreachable!("Unexpected literal rule: {:?}", inner.as_rule()),
                }
            }
            Rule::identifier_token => Ok(Token::Identifier(pair.as_str().to_string())),
            _ => unreachable!("Unexpected token rule: {:?}", pair.as_rule()),
        }
    }
}
impl GrammarParser {
    /// Parse and print pest parse tree for debugging
    pub fn debug_parse(rule: Rule, input: &str) -> Result<()> {
        let pairs = Self::parse(rule, input).map_err(Box::new)?;
        for pair in pairs {
            Self::print_pair(&pair, 0);
        }
        Ok(())
    }

    fn print_pair(pair: &Pair<Rule>, indent: usize) {
        let indent_str = "  ".repeat(indent);
        println!("{}{:?}: \"{}\"", indent_str, pair.as_rule(), pair.as_str());

        for inner_pair in pair.clone().into_inner() {
            Self::print_pair(&inner_pair, indent + 1);
        }
    }

    /// Extract all identifiers from an expression
    pub fn extract_identifiers(expr: &Expr) -> Vec<String> {
        match expr {
            Expr::Identifier(name) => vec![name.clone()],
            Expr::BinOp { left, right, .. } => {
                let mut ids = Self::extract_identifiers(left);
                ids.extend(Self::extract_identifiers(right));
                ids
            }
            Expr::Number(_) => vec![],
        }
    }

    /// Check if a rule matches the complete input
    pub fn can_parse(rule: Rule, input: &str) -> bool {
        match Self::parse(rule, input) {
            Ok(pairs) => {
                // Check that the entire input is consumed
                let input_len = input.len();
                let parsed_len = pairs.as_str().len();
                parsed_len == input_len
            }
            Err(_) => false,
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_expression_parsing() {
        let expr = GrammarParser::parse_expression("2 + 3 * 4").unwrap();
        match expr {
            Expr::BinOp {
                op: BinOperator::Add,
                ..
            } => (),
            _ => panic!("Expected addition at top level"),
        }
    }

    #[test]
    fn test_calculation() {
        assert_eq!(GrammarParser::parse_calculation("2 + 3 * 4").unwrap(), 14.0);
        assert_eq!(
            GrammarParser::parse_calculation("(2 + 3) * 4").unwrap(),
            20.0
        );
        assert_eq!(
            GrammarParser::parse_calculation("2 ^ 3 ^ 2").unwrap(),
            512.0
        );
    }

    #[test]
    fn test_json_parsing() {
        let json = r#"{"name": "test", "value": 42, "active": true}"#;
        let result = GrammarParser::parse_json(json).unwrap();

        if let JsonValue::Object(obj) = result {
            assert_eq!(obj.len(), 3);
        } else {
            panic!("Expected JSON object");
        }
    }

    #[test]
    fn test_program_parsing() {
        let program = r#"
            fn add(x: int, y: int) -> int {
                x + y;
            }

            if x > 0 {
                y = 42;
            }
        "#;

        let result = GrammarParser::parse_program(program).unwrap();
        assert_eq!(result.statements.len(), 2);
    }

    #[test]
    fn test_token_parsing() {
        let input = "if x == 42 { return true; }";
        let tokens = GrammarParser::parse_tokens(input).unwrap();
        assert!(tokens.len() > 5);

        match &tokens[0] {
            Token::Keyword(kw) => assert_eq!(kw, "if"),
            _ => panic!("Expected keyword"),
        }
    }

    #[test]
    fn test_identifier_extraction() {
        let expr = GrammarParser::parse_expression("x + y * z").unwrap();
        let ids = GrammarParser::extract_identifiers(&expr);
        assert_eq!(ids, vec!["x", "y", "z"]);
    }

    #[test]
    fn test_debug_features() {
        assert!(GrammarParser::can_parse(Rule::expression, "2 + 3"));
        assert!(!GrammarParser::can_parse(Rule::expression, "2 +"));
    }
}
#[derive(Error, Debug)]
pub enum ParseError {
    #[error("Pest parsing error: {0}")]
    Pest(#[from] Box<Error<Rule>>),
    #[error("Invalid number format: {0}")]
    InvalidNumber(String),
    #[error("Unknown operator: {0}")]
    UnknownOperator(String),
    #[error("Invalid JSON value")]
    InvalidJson,
    #[error("Unexpected end of input")]
    UnexpectedEOF,
}
}

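Custom error types provide context-specific error messages beyond pest’s built-in reporting. The Pest variant wraps the boxed pest error raised when the grammar fails to match, while InvalidNumber, UnknownOperator, InvalidJson, and UnexpectedEOF cover failures that occur afterwards, while the AST is being built from the parse tree.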
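
To make the role of the variants concrete, the following is a minimal sketch of the same idea written against the standard library only. It is a hand-written stand-in, not the chapter’s derived type: SketchError, parse_number, and precedence are hypothetical names, the precedence values are illustrative, and the Pest variant is omitted so that the example has no external dependencies.

```rust
use std::fmt;

/// Hypothetical stand-in for the chapter's `ParseError`, minus the `Pest` variant.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    InvalidNumber(String),
    UnknownOperator(String),
    UnexpectedEOF,
}

// Roughly what the `#[error("...")]` attributes provide through the derive macro.
impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::InvalidNumber(text) => write!(f, "Invalid number format: {}", text),
            SketchError::UnknownOperator(op) => write!(f, "Unknown operator: {}", op),
            SketchError::UnexpectedEOF => write!(f, "Unexpected end of input"),
        }
    }
}

impl std::error::Error for SketchError {}

/// Convert matched text into a number, reporting the offending text on failure.
pub fn parse_number(text: &str) -> Result<f64, SketchError> {
    text.parse::<f64>()
        .map_err(|_| SketchError::InvalidNumber(text.to_string()))
}

/// Map operator text to an (illustrative) precedence level, rejecting anything else.
pub fn precedence(op: &str) -> Result<u8, SketchError> {
    match op {
        "+" | "-" => Ok(1),
        "*" | "/" => Ok(2),
        "^" => Ok(3),
        other => Err(SketchError::UnknownOperator(other.to_string())),
    }
}
```

Each variant carries just enough data to explain the failure to a user, which is the same design choice the chapter’s enum makes.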

#![allow(unused)]
fn main() {
/// Parse and print pest parse tree for debugging
pub fn debug_parse(rule: Rule, input: &str) -> Result<()> {
    let pairs = Self::parse(rule, input).map_err(Box::new)?;
    for pair in pairs {
        Self::print_pair(&pair, 0);
    }
    Ok(())
}
}

The debug_parse function is a development aid: it parses the input with the requested rule and prints the resulting tree through print_pair. If the grammar does not match, the pest error is boxed and converted into ParseError::Pest by the ? operator, so the line and column information that pest records still reaches the caller through the custom error type.
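
Pest tracks positions itself, so the following is only an illustration of how line and column information can be combined with custom formatting. It is a sketch using the standard library alone; line_col and format_error are hypothetical helpers, and the output layout is an assumption rather than pest’s own format.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes; `offset` must lie on a char boundary.
pub fn line_col(input: &str, offset: usize) -> (usize, usize) {
    let before = &input[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}

/// Render a message with the offending source line and a caret under the column.
/// `context` names what was being parsed, e.g. "assignment".
pub fn format_error(input: &str, offset: usize, context: &str, message: &str) -> String {
    let (line, col) = line_col(input, offset);
    let source_line = input.lines().nth(line - 1).unwrap_or("");
    format!(
        "error while parsing {}: {}\n --> {}:{}\n  | {}\n  | {}^",
        context,
        message,
        line,
        col,
        source_line,
        " ".repeat(col - 1)
    )
}
```

Keeping the position as a byte offset until the last moment means the line and column only have to be computed when an error is actually displayed.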

Utility Functions

#![allow(unused)]
fn main() {
/// Extract all identifiers from an expression
pub fn extract_identifiers(expr: &Expr) -> Vec<String> {
    match expr {
        Expr::Identifier(name) => vec![name.clone()],
        Expr::BinOp { left, right, .. } => {
            let mut ids = Self::extract_identifiers(left);
            ids.extend(Self::extract_identifiers(right));
            ids
        }
        Expr::Number(_) => vec![],
    }
}
}

Helper functions simplify common processing patterns. The extract_identifiers function walks the Expr AST recursively and collects identifier names in left-to-right order, duplicates included. The same recursive match, or a full visitor, works equally well on pest’s pairs when information is needed before an AST exists.
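
A reusable traversal can serve several such helpers at once. The sketch below assumes a simplified Expr with the same three variants used in this chapter, with the operator reduced to a plain char; walk, unique_identifiers, and count_operations are hypothetical names introduced for illustration.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the chapter's `Expr`; the operator is a plain `char`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Identifier(String),
    BinOp { left: Box<Expr>, op: char, right: Box<Expr> },
}

/// Call `visit` on every node in pre-order (parent, then left, then right).
pub fn walk<'a, F: FnMut(&'a Expr)>(expr: &'a Expr, visit: &mut F) {
    visit(expr);
    if let Expr::BinOp { left, right, .. } = expr {
        walk(left, visit);
        walk(right, visit);
    }
}

/// Collect each distinct identifier once, in sorted order.
pub fn unique_identifiers(expr: &Expr) -> Vec<&str> {
    let mut seen = BTreeSet::new();
    walk(expr, &mut |node| {
        if let Expr::Identifier(name) = node {
            seen.insert(name.as_str());
        }
    });
    seen.into_iter().collect()
}

/// Count the binary operations in the tree using the same traversal.
pub fn count_operations(expr: &Expr) -> usize {
    let mut count = 0;
    walk(expr, &mut |node| {
        if matches!(node, Expr::BinOp { .. }) {
            count += 1;
        }
    });
    count
}
```

Separating the traversal from the per-node action keeps each analysis to a few lines, at the cost of one extra level of indirection.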

Debug utilities help understand pest’s parse tree structure during development. The debug_parse function shown earlier relies on print_pair, which recursively prints the tree with indentation showing nesting levels. The rule name and captured text on each line show how the grammar matched the input.
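
The same indentation scheme can also be rendered into a String, which makes the dump usable in assertions or snapshot tests. This is a sketch under the assumption of a simplified tree: Node is a hypothetical stand-in for a pest pair, and render is not part of the chapter’s code.

```rust
/// Hypothetical stand-in for a pest pair: a rule name, the matched text, and children.
pub struct Node {
    pub rule: &'static str,
    pub text: String,
    pub children: Vec<Node>,
}

/// Append one line per node to `out`, indenting two spaces per nesting level.
pub fn render(node: &Node, depth: usize, out: &mut String) {
    out.push_str(&"  ".repeat(depth));
    // `{:?}` quotes the text and escapes newlines, keeping one node per line.
    out.push_str(&format!("{}: {:?}\n", node.rule, node.text));
    for child in &node.children {
        render(child, depth + 1, out);
    }
}
```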

Grammar Design Patterns

Pest grammars benefit from consistent structure and naming conventions. Use snake_case for rule names and UPPER_CASE for token constants, matching pest’s own built-ins such as ASCII_DIGIT and WHITESPACE. Group related rules together with comments explaining their purpose. Silent rules, written with the _ modifier before the opening brace as in binary_op = _{ ... }, hide implementation details from the parse tree.

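Whitespace handling deserves special attention in grammar design. When a grammar defines the special WHITESPACE rule, pest skips it implicitly between the elements of sequences and repetitions in normal rules. Atomic rules, marked with @ or $, disable this implicit skipping when exact matching is required, as in the number and identifier rules. For significant indentation in languages like Python, the PUSH, PEEK, and POP stack operations can record an indentation level and match it again on later lines.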
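
The stack discipline behind indentation handling can be illustrated outside of pest. The sketch below is a hand-written analogue, not pest’s API: layout and the Layout enum are hypothetical, only spaces count as indentation, and blank lines are skipped.

```rust
#[derive(Debug, PartialEq)]
pub enum Layout {
    Indent,
    Dedent,
    Line(String),
}

/// Turn indentation changes into explicit `Indent`/`Dedent` markers using a
/// stack of the indentation widths that are currently open.
pub fn layout(source: &str) -> Result<Vec<Layout>, String> {
    let mut stack = vec![0usize];
    let mut out = Vec::new();
    for (index, raw) in source.lines().enumerate() {
        if raw.trim().is_empty() {
            continue;
        }
        let width = raw.len() - raw.trim_start_matches(' ').len();
        let current = *stack.last().unwrap();
        if width > current {
            stack.push(width); // a deeper block opens: push its width
            out.push(Layout::Indent);
        } else {
            while width < *stack.last().unwrap() {
                stack.pop(); // a block closes: pop back to an outer level
                out.push(Layout::Dedent);
            }
            if width != *stack.last().unwrap() {
                return Err(format!("line {}: inconsistent indentation", index + 1));
            }
        }
        out.push(Layout::Line(raw.trim().to_string()));
    }
    // Close every block that is still open at the end of the input.
    while stack.len() > 1 {
        stack.pop();
        out.push(Layout::Dedent);
    }
    Ok(out)
}
```

The bottom entry of the stack is always zero, so popping never empties it and a dedent to a width that was never opened is reported as an error.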

Comments can be handled uniformly through the COMMENT rule. Single-line and multi-line comment patterns integrate naturally with automatic skipping. This approach keeps the main grammar rules clean and focused on language structure.
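
What automatic skipping amounts to can be sketched by hand. The function below is an illustration only, written against the standard library; skip_trivia is a hypothetical name, and the comment syntax (// to end of line, /* ... */ blocks) is an assumption chosen for the example.

```rust
/// Return the byte offset of the first character at or after `pos` that is
/// neither whitespace nor part of a `//` line comment or `/* ... */` block comment.
/// An unterminated block comment is treated as running to the end of the input.
pub fn skip_trivia(input: &str, mut pos: usize) -> usize {
    loop {
        let rest = &input[pos..];
        let trimmed = rest.trim_start();
        pos += rest.len() - trimmed.len();
        if trimmed.starts_with("//") {
            // Stop at the newline; the next iteration trims it as whitespace.
            pos += trimmed.find('\n').unwrap_or(trimmed.len());
        } else if trimmed.starts_with("/*") {
            // `end` is relative to the text after "/*", so add 2 for "/*" and 2 for "*/".
            pos += trimmed[2..].find("*/").map_or(trimmed.len(), |end| end + 4);
        } else {
            return pos;
        }
    }
}
```

A hand-written parser would have to call such a function between every pair of tokens, which is the bookkeeping that the WHITESPACE and COMMENT rules remove from the main grammar.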

Performance Optimization

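Pest generates parsers at compile time, so no grammar interpretation happens at run time. The generated code follows PEG semantics: alternatives in a choice are tried in order, and the parser backtracks to the start of the choice only when an alternative fails. Pest does not memoize rule results the way a packrat parser does, so a rule that fails late can cause the same input to be scanned more than once.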

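Rule ordering impacts both meaning and performance in choice expressions. Because the first matching alternative wins, place more common alternatives first to reduce backtracking, but keep longer literals ahead of their own prefixes. Use atomic rules to prevent unnecessary whitespace checking in tight loops. Factoring a prefix shared by several alternatives into its own rule avoids re-parsing that prefix for each alternative.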
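
Ordered choice can be modelled in a few lines to show why ordering matters. This is a sketch rather than pest’s generated code: first_match is a hypothetical helper that tries string literals in sequence and reports how many alternatives were attempted.

```rust
/// Try each literal in order and return the first one that prefixes `input`,
/// together with the number of alternatives attempted. Once an alternative
/// succeeds, later ones are never tried, as in PEG ordered choice.
pub fn first_match<'a>(alternatives: &[&'a str], input: &str) -> (Option<&'a str>, usize) {
    for (tried, lit) in alternatives.iter().enumerate() {
        if input.starts_with(*lit) {
            return (Some(*lit), tried + 1);
        }
    }
    (None, alternatives.len())
}
```

With the alternatives "<", "<=", ">", ">=" the input "<= 3" matches "<" on the first attempt, so "<=" is unreachable; listing "<=" before "<" fixes the match. The attempt count shows the performance side: an operator listed last costs one failed comparison per alternative before it.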

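The precedence climbing algorithm keeps expression parsing efficient. The grammar matches a flat sequence of terms and operators, and the climbing step builds the tree in a single pass over that sequence. Unlike naive recursive descent, it handles left-associative operators in a loop rather than through deep recursion, and new precedence levels are added as table entries without grammar transformations.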
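
The algorithm itself is short enough to show in full. The following is a self-contained sketch that evaluates arithmetic directly instead of building an AST; it does not use pest’s own precedence helpers, and Climber, op_info, and evaluate are hypothetical names. The operator table (+ and - lowest, then * and /, then right-associative ^) is an assumption matching the expression examples in this chapter.

```rust
/// Precedence and right-associativity for each operator; higher binds tighter.
fn op_info(op: char) -> Option<(u8, bool)> {
    match op {
        '+' | '-' => Some((1, false)),
        '*' | '/' => Some((2, false)),
        '^' => Some((3, true)),
        _ => None,
    }
}

pub struct Climber<'a> {
    input: &'a [u8],
    pos: usize,
}

impl<'a> Climber<'a> {
    pub fn new(input: &'a str) -> Self {
        Climber { input: input.as_bytes(), pos: 0 }
    }

    fn skip_spaces(&mut self) {
        while self.pos < self.input.len() && self.input[self.pos] == b' ' {
            self.pos += 1;
        }
    }

    /// Parse a number or a parenthesised expression.
    fn primary(&mut self) -> Option<f64> {
        self.skip_spaces();
        if self.input.get(self.pos) == Some(&b'(') {
            self.pos += 1;
            let value = self.expression(1)?;
            self.skip_spaces();
            if self.input.get(self.pos) != Some(&b')') {
                return None;
            }
            self.pos += 1;
            return Some(value);
        }
        let start = self.pos;
        while self.pos < self.input.len()
            && (self.input[self.pos].is_ascii_digit() || self.input[self.pos] == b'.')
        {
            self.pos += 1;
        }
        std::str::from_utf8(&self.input[start..self.pos]).ok()?.parse::<f64>().ok()
    }

    /// Parse and evaluate operators whose precedence is at least `min_prec`.
    pub fn expression(&mut self, min_prec: u8) -> Option<f64> {
        let mut left = self.primary()?;
        loop {
            self.skip_spaces();
            let op = match self.input.get(self.pos) {
                Some(&byte) => byte as char,
                None => return Some(left),
            };
            let (prec, right_assoc) = match op_info(op) {
                Some(info) if info.0 >= min_prec => info,
                _ => return Some(left),
            };
            self.pos += 1;
            // Left-associative operators demand strictly higher precedence on the
            // right; right-associative ones let the same level recurse.
            let next_min = if right_assoc { prec } else { prec + 1 };
            let right = self.expression(next_min)?;
            left = match op {
                '+' => left + right,
                '-' => left - right,
                '*' => left * right,
                '/' => left / right,
                _ => left.powf(right),
            };
        }
    }
}

/// Evaluate a whole input, rejecting trailing text.
pub fn evaluate(input: &str) -> Option<f64> {
    let mut climber = Climber::new(input);
    let value = climber.expression(1)?;
    climber.skip_spaces();
    if climber.pos == input.len() { Some(value) } else { None }
}
```

A chain such as 8 - 3 - 2 is consumed by the loop in a single call to expression, which is why left-associative operators do not deepen the recursion; only tighter-binding or right-associative operators recurse.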

Integration Patterns

Pest integrates well with other Rust compiler infrastructure. Every pair exposes a span with start and end byte offsets, which can be handed to error reporting libraries like codespan or ariadne. The AST types can implement serde traits for serialization or visitor patterns for analysis passes.

Pest has no built-in incremental mode, but one can be layered on top by caching parse results for unchanged input sections. Generated parsers hold no state between calls, so independent input chunks can be parsed in parallel. Custom pair processing can extract only the needed information without building a full AST.
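
Both ideas can be sketched with the standard library. In the example below, parse_chunk is a hypothetical stand-in that sums the integers in a chunk; a real implementation would call the generated parser there and return an owned AST, since pest’s pairs borrow from the input string. ChunkCache and parse_parallel are likewise illustrative names, and splitting the input into independent chunks is assumed to have happened already.

```rust
use std::collections::HashMap;
use std::thread;

/// Stand-in "parser": sums the whitespace-separated integers in one chunk.
fn parse_chunk(chunk: &str) -> Result<i64, String> {
    chunk
        .split_whitespace()
        .map(|word| word.parse::<i64>().map_err(|_| format!("bad integer: {}", word)))
        .sum()
}

/// Cache keyed by chunk text, so unchanged chunks are not parsed again.
#[derive(Default)]
pub struct ChunkCache {
    results: HashMap<String, Result<i64, String>>,
    pub misses: usize,
}

impl ChunkCache {
    pub fn parse(&mut self, chunk: &str) -> Result<i64, String> {
        if let Some(hit) = self.results.get(chunk) {
            return hit.clone();
        }
        self.misses += 1;
        let result = parse_chunk(chunk);
        self.results.insert(chunk.to_string(), result.clone());
        result
    }
}

/// Parse independent chunks on separate threads; results keep the input order.
pub fn parse_parallel(chunks: &[&str]) -> Vec<Result<i64, String>> {
    thread::scope(|scope| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| scope.spawn(move || parse_chunk(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("parser thread panicked"))
            .collect()
    })
}
```

Spawning one thread per chunk keeps the sketch short; for many small chunks a thread pool would be the more realistic choice.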

Testing pest grammars requires attention to both positive and negative cases. Pest’s parses_to! and fails_with! macros check the tokens produced for a given input and the position of an expected failure. Integration tests should verify AST construction and error handling. Property-based testing can validate grammar properties like precedence and associativity.
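
A table of accepted and rejected inputs is a simple way to cover both kinds of case. The sketch below is independent of pest: check_cases is a hypothetical helper that takes any recognizer function, and balanced is a stand-in recognizer used only so the example is self-contained. With a pest grammar, a non-capturing closure around a helper like can_parse could be passed instead.

```rust
/// Stand-in recognizer for the sketch: accepts strings of balanced parentheses.
pub fn balanced(input: &str) -> bool {
    let mut depth = 0usize;
    for c in input.chars() {
        match c {
            '(' => depth += 1,
            ')' if depth > 0 => depth -= 1,
            _ => return false,
        }
    }
    depth == 0
}

/// Run accept/reject tables against a recognizer and describe every mismatch.
/// An empty result means all cases behaved as expected.
pub fn check_cases(
    recognizer: fn(&str) -> bool,
    accept: &[&str],
    reject: &[&str],
) -> Vec<String> {
    let mut failures = Vec::new();
    for &case in accept {
        if !recognizer(case) {
            failures.push(format!("should accept {:?}", case));
        }
    }
    for &case in reject {
        if recognizer(case) {
            failures.push(format!("should reject {:?}", case));
        }
    }
    failures
}
```

Returning the list of mismatches, rather than panicking on the first one, shows every failing case in a single test run.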

Best Practices

Keep grammars readable and maintainable by avoiding overly complex rules. Break down complicated patterns into named sub-rules that document their purpose. Use meaningful rule names that correspond to language concepts rather than implementation details.

Version control grammar files alongside implementation code. Document grammar changes and their rationale in commit messages. Consider grammar compatibility when evolving languages to avoid breaking existing code.

Profile parser performance on representative input to identify bottlenecks. Complex backtracking patterns or excessive rule nesting can impact performance. A parse tree dump such as debug_parse helps explain parsing behavior on problematic input.

Handle errors gracefully with informative messages. Pest’s automatic error reporting provides good default messages, but custom errors can add domain-specific context. Consider recovery strategies for IDE integration where partial results are valuable.
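
One common recovery strategy is to resynchronise at a statement terminator. A pest parse call reports a single failure, so in this sketch recovery is layered on top by parsing one statement at a time. Everything here is hypothetical and standard-library only: the toy statement form name = integer, the ; terminator, and the names Recovered, parse_assignment, and parse_with_recovery.

```rust
/// Outcome of a tolerant parse: everything that parsed, plus every error seen.
#[derive(Debug, Default, PartialEq)]
pub struct Recovered {
    pub assignments: Vec<(String, i64)>,
    pub errors: Vec<String>,
}

/// Parse one `name = integer` statement (without the trailing `;`).
fn parse_assignment(text: &str) -> Result<(String, i64), String> {
    let (name, value) = text
        .split_once('=')
        .ok_or_else(|| format!("expected `=` in `{}`", text.trim()))?;
    let name = name.trim();
    if name.is_empty() || !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err(format!("invalid name `{}`", name));
    }
    let value = value
        .trim()
        .parse::<i64>()
        .map_err(|_| format!("invalid integer `{}`", value.trim()))?;
    Ok((name.to_string(), value))
}

/// Parse statement by statement, resynchronising at each `;` after a failure
/// instead of stopping at the first error.
pub fn parse_with_recovery(source: &str) -> Recovered {
    let mut out = Recovered::default();
    for statement in source.split(';') {
        if statement.trim().is_empty() {
            continue;
        }
        match parse_assignment(statement) {
            Ok(assignment) => out.assignments.push(assignment),
            Err(message) => out.errors.push(message),
        }
    }
    out
}
```

For the input x = 1; y = oops; z = 3; this keeps the assignments to x and z and records one error for y, which is the kind of partial result an editor integration can still use.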