syn
Syn is a parsing library for Rust source code that exposes a complete syntax tree for the language. Although it was designed primarily for procedural macros, its parsing capabilities are also useful for compiler construction, particularly when building languages that integrate with Rust or when analyzing Rust code itself.
The library parses token streams into strongly typed abstract syntax trees. Unlike traditional parser generators that work from external grammar files, syn expresses the Rust grammar as Rust types (the complete set of syntax tree types is available when the crate's full feature is enabled), so code that handles syntax trees is checked at compile time and works well with IDE tooling. This approach makes syn well suited to building domain-specific languages that extend Rust’s syntax and to writing compiler tools that analyze and transform Rust code.
Core Concepts
Syn operates on TokenStreams, which represent sequences of Rust tokens. Inside a procedural macro, these tokens come from the Rust compiler and pass through proc-macro2 into syn for parsing; proc-macro2 also makes the same token types usable outside a macro, for example in tests and standalone tools. The library provides three primary ways to work with syntax: parsing tokens into predefined AST types, implementing custom parsers using the Parse trait, and transforming existing AST nodes.
    use proc_macro2::TokenStream;
    use syn::{FnArg, ItemFn, Pat, Result};

    /// Summary of a function signature extracted from its syntax tree.
    #[derive(Debug, Clone)]
    pub struct FunctionAnalysis {
        pub name: String,
        pub param_count: usize,
        pub params: Vec<String>,
        pub is_async: bool,
        pub is_unsafe: bool,
        pub has_generics: bool,
        pub visibility: String,
    }

    /// Example: Parsing and analyzing a Rust function
    pub fn analyze_function(input: TokenStream) -> Result<FunctionAnalysis> {
        // Parse the token stream into syn's predefined function item type.
        let func: ItemFn = syn::parse2(input)?;

        let param_count = func.sig.inputs.len();
        let is_async = func.sig.asyncness.is_some();
        let is_unsafe = func.sig.unsafety.is_some();
        let has_generics = !func.sig.generics.params.is_empty();

        // Collect the names of parameters bound by a plain identifier pattern;
        // receivers (`self`) and destructuring patterns are skipped.
        let params = func
            .sig
            .inputs
            .iter()
            .filter_map(|arg| match arg {
                FnArg::Typed(pat_type) => {
                    if let Pat::Ident(ident) = pat_type.pat.as_ref() {
                        Some(ident.ident.to_string())
                    } else {
                        None
                    }
                }
                _ => None,
            })
            .collect();

        Ok(FunctionAnalysis {
            name: func.sig.ident.to_string(),
            param_count,
            params,
            is_async,
            is_unsafe,
            has_generics,
            // Debug formatting of syntax tree types needs syn's `extra-traits` feature.
            visibility: format!("{:?}", func.vis),
        })
    }
The Parse trait forms the foundation of syn’s extensibility. By implementing this trait, you can create parsers for custom syntax that integrates seamlessly with Rust’s token system. This capability proves essential when building domain-specific languages or extending Rust with new syntactic constructs.
Custom Language Parsing
One of syn’s most powerful features is its ability to parse custom languages that feel native to Rust. By defining custom keywords and implementing the Parse trait, you can create domain-specific languages that reuse Rust’s tokenization while introducing new syntax.
    use proc_macro2::Ident;
    use syn::Expr;

    /// Example: Custom DSL parsing - Simple state machine language
    pub struct StateMachine {
        pub name: Ident,
        pub states: Vec<State>,
        pub initial: Ident,
    }
    /// A named state together with the transitions that leave it.
    pub struct State {
        pub name: Ident,
        pub transitions: Vec<Transition>,
    }
pub struct Transition {
    pub event: Ident,
    pub target: Ident,
    pub action: Option<Expr>,
}
The Parse implementations for these types demonstrate how to build recursive descent parsers using syn’s parsing infrastructure:
impl Parse for StateMachine {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::state>()?;
        input.parse::<kw::machine>()?;
        let name: Ident = input.parse()?;

        let content;
        syn::braced!(content in input);

        // Parse initial state
        content.parse::<kw::initial>()?;
        content.parse::<Token![:]>()?;
        let initial: Ident = content.parse()?;
        content.parse::<Token![;]>()?;

        // Parse states
        let mut states = Vec::new();
        while !content.is_empty() {
            states.push(content.parse()?);
        }

        Ok(StateMachine {
            name,
            states,
            initial,
        })
    }
}
This approach lets you create languages that feel natural within Rust’s syntax while keeping full control over parsing and error reporting. The custom keywords are defined with syn’s custom_keyword! macro inside a dedicated kw module, which keeps them scoped and avoids collisions with other identifiers.
AST Transformation
Compiler construction often requires transforming abstract syntax trees to implement optimizations, add instrumentation, or change program behavior. Syn provides comprehensive facilities for traversing and modifying Rust ASTs while preserving source location information.
/// Example: AST transformation - Add logging to functions
pub fn inject_logging(mut func: ItemFn) -> ItemFn {
    let fn_name = &func.sig.ident;
    let log_entry: Stmt = parse_quote! {
        println!("Entering function: {}", stringify!(#fn_name));
    };

    // Insert at the beginning of the function body
    func.block.stmts.insert(0, log_entry);

    // Add exit logging before each return
    let log_exit: Stmt = parse_quote! {
        println!("Exiting function: {}", stringify!(#fn_name));
    };

    let mut new_stmts = Vec::new();
    for stmt in func.block.stmts.drain(..) {
        match &stmt {
            Stmt::Expr(Expr::Return(_), _) => {
                new_stmts.push(log_exit.clone());
                new_stmts.push(stmt);
            }
            _ => new_stmts.push(stmt),
        }
    }

    // Add exit log at the end if there's no explicit return
    if !matches!(new_stmts.last(), Some(Stmt::Expr(Expr::Return(_), _))) {
        new_stmts.push(log_exit);
    }

    func.block.stmts = new_stmts;
    func
}
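This transformation demonstrates several common patterns for AST manipulation. The function takes ownership of the ItemFn, edits its statement list, and returns it, so every node it does not touch keeps its original tokens and spans. The parse_quote! macro embeds Rust syntax directly in transformation code and parses it into whatever AST type the context expects (here Stmt), which makes new nodes easy to construct. The version shown is deliberately simple: it only recognizes return statements at the top level of the function body, so a return nested inside an if (as in the calculate test) gets no exit log, and it appends the final exit log after the last statement even when that statement is a trailing expression that supplies the function's value.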
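A function body often ends in a tail expression rather than a return statement, and a log statement appended after it would no longer leave that expression last. The following is a minimal sketch of one way to order the inserted statements, written against a toy `Stmt` enum rather than syn's real types so that it runs with the standard library alone. The enum, the `inject_exit_log` name, and the `__result` binding are all hypothetical; with syn, the tail case would correspond to a `Stmt::Expr` whose semicolon is absent.

```rust
/// Toy stand-in for a function body statement (hypothetical, not syn's Stmt).
#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    /// Any ordinary statement that ends in a semicolon.
    Plain(String),
    /// `return <expr>;`
    Return(String),
    /// Trailing expression without a semicolon; only meaningful as the last statement.
    Tail(String),
}

/// Inserts `log` before every top-level return and before the tail value is
/// yielded, keeping the value-producing expression in final position.
fn inject_exit_log(body: Vec<Stmt>, log: &str) -> Vec<Stmt> {
    let ends_with_exit = matches!(body.last(), Some(Stmt::Return(_)) | Some(Stmt::Tail(_)));
    let mut out = Vec::with_capacity(body.len() + 2);

    for stmt in body {
        match stmt {
            Stmt::Return(expr) => {
                out.push(Stmt::Plain(log.to_string()));
                out.push(Stmt::Return(expr));
            }
            Stmt::Tail(expr) => {
                // Evaluate the tail first, log, then yield the saved value.
                out.push(Stmt::Plain(format!("let __result = {expr}")));
                out.push(Stmt::Plain(log.to_string()));
                out.push(Stmt::Tail("__result".to_string()));
            }
            other => out.push(other),
        }
    }

    // A body that simply falls off the end still needs one exit log.
    if !ends_with_exit {
        out.push(Stmt::Plain(log.to_string()));
    }
    out
}
```

This sketch still ignores returns nested inside other expressions. In syn, reaching those usually means walking the whole tree, for example with the `VisitMut` trait from the `visit-mut` feature, while taking care not to descend into closures or nested functions.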
Type Analysis
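Understanding how a function's types are written is useful for many compiler optimizations. Syn represents every type as a syn::Type value, with variants such as Type::Path and Type::Reference, which makes it possible to classify the types that appear in a signature. This is a syntactic view: syn sees the type as spelled in the source, not as resolved by the compiler.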
    use std::collections::HashMap;
    use syn::{FnArg, ItemFn, Pat};

    #[derive(Debug, Clone)]
    pub struct TypeInfo {
        pub is_primitive: bool,
        pub is_reference: bool,
        pub is_mutable: bool,
        pub type_string: String,
    }

    /// Example: Type analysis for compiler optimizations
    pub fn analyze_types_in_function(func: &ItemFn) -> HashMap<String, TypeInfo> {
        let mut type_info = HashMap::new();

        // Analyze parameter types
        for input in &func.sig.inputs {
            if let FnArg::Typed(pat_type) = input {
                if let Pat::Ident(ident) = pat_type.pat.as_ref() {
                    let info = analyze_type(&pat_type.ty);
                    type_info.insert(ident.ident.to_string(), info);
                }
            }
        }

        type_info
    }
    use quote::quote;
    use syn::Type;

    fn analyze_type(ty: &Type) -> TypeInfo {
        match ty {
            Type::Path(type_path) => {
                let type_string = quote!(#type_path).to_string();
                let is_primitive = matches!(
                    type_string.as_str(),
                    "i8" | "i16" | "i32" | "i64" | "i128"
                        | "u8" | "u16" | "u32" | "u64" | "u128"
                        | "f32" | "f64" | "bool" | "char"
                );

                TypeInfo {
                    is_primitive,
                    is_reference: false,
                    is_mutable: false,
                    type_string,
                }
            }
            Type::Reference(type_ref) => {
                // Classify the referent, then mark the result as a reference.
                let inner = analyze_type(&type_ref.elem);
                TypeInfo {
                    is_reference: true,
                    is_mutable: type_ref.mutability.is_some(),
                    ..inner
                }
            }
            _ => TypeInfo {
                is_primitive: false,
                is_reference: false,
                is_mutable: false,
                type_string: quote!(#ty).to_string(),
            },
        }
    }
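This type analysis can inform optimization decisions, such as flagging parameters that are small primitives, distinguishing shared from mutable references, or identifying candidates for specialization. Because syn works on tokens alone, the analysis is purely name-based: it cannot resolve type aliases or imports, and it cannot tell whether a type implements a particular trait. Questions of that kind need the compiler's own type checker.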
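A name-based check of this kind is sensitive to how the type was spelled. A printed token stream typically separates tokens with spaces, so a qualified path may come out as `core :: primitive :: u32` and a generic type as `Vec < u8 >`, neither of which equals a bare primitive name. The sketch below is one hedged way to normalize such a string before comparing it. It uses only the standard library, `last_segment` and `is_primitive_name` are hypothetical helpers, and it adds `isize` and `usize` to the list of names as an assumption about what should count as primitive.

```rust
/// Returns the final path segment of a printed type, without generic arguments.
/// For example, "std :: string :: String" becomes "String".
fn last_segment(printed: &str) -> String {
    // Drop generic arguments: everything from the first '<' onward.
    let without_generics = printed.split('<').next().unwrap_or(printed);
    // Keep the text after the last "::" separator.
    let last = without_generics
        .rsplit("::")
        .next()
        .unwrap_or(without_generics);
    // Remove the spaces that token printing puts between tokens.
    last.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Name-based primitive check that tolerates qualified paths.
/// Still a heuristic: a user-defined type or alias with the same name
/// is indistinguishable at this level.
fn is_primitive_name(printed: &str) -> bool {
    matches!(
        last_segment(printed).as_str(),
        "i8" | "i16" | "i32" | "i64" | "i128" | "isize"
            | "u8" | "u16" | "u32" | "u64" | "u128" | "usize"
            | "f32" | "f64" | "bool" | "char"
    )
}
```

When the syn types are at hand, the same idea is usually expressed without string handling, by inspecting the last entry of `type_path.path.segments` and comparing its `ident`.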
Constant Folding
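Compile-time evaluation of expressions is a fundamental compiler optimization. Syn’s expression types make it straightforward to implement constant folding and other algebraic simplifications. The example folds a binary operation into a single literal when both operands are integer literals, after first folding the operands themselves, and leaves every other expression unchanged.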
    use syn::{parse_quote, Expr, Lit, Stmt};

    /// Example: Generate optimized code based on const evaluation
    pub fn const_fold_binary_ops(expr: Expr) -> Expr {
        match expr {
            Expr::Binary(mut binary) => {
                // Recursively fold sub-expressions
                binary.left = Box::new(const_fold_binary_ops(*binary.left));
                binary.right = Box::new(const_fold_binary_ops(*binary.right));

                // Try to fold if both operands are literals
                if let (Expr::Lit(left_lit), Expr::Lit(right_lit)) =
                    (binary.left.as_ref(), binary.right.as_ref())
                {
                    if let (Lit::Int(l), Lit::Int(r)) = (&left_lit.lit, &right_lit.lit) {
                        if let (Ok(l_val), Ok(r_val)) =
                            (l.base10_parse::<i64>(), r.base10_parse::<i64>())
                        {
                            use syn::BinOp;
                            let result = match binary.op {
                                BinOp::Add(_) => Some(l_val + r_val),
                                BinOp::Sub(_) => Some(l_val - r_val),
                                BinOp::Mul(_) => Some(l_val * r_val),
                                BinOp::Div(_) if r_val != 0 => Some(l_val / r_val),
                                _ => None,
                            };

                            if let Some(val) = result {
                                return parse_quote!(#val);
                            }
                        }
                    }
                }

                Expr::Binary(binary)
            }
            // Recursively process other expression types
            Expr::Paren(mut paren) => {
                paren.expr = Box::new(const_fold_binary_ops(*paren.expr));
                Expr::Paren(paren)
            }
            Expr::Block(mut block) => {
                if let Some(Stmt::Expr(expr, _semi)) = block.block.stmts.last_mut() {
                    *expr = const_fold_binary_ops(expr.clone());
                }
                Expr::Block(block)
            }
            other => other,
        }
    }
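This example shows how to traverse an expression tree recursively and rewrite it from the bottom up: operands are folded first, then the enclosing binary expression. The folder is deliberately simple. It parses every operand as `i64` and uses plain arithmetic, so a production pass would also need overflow checks and would have to preserve the type of the original literals. The same pattern extends to more sophisticated optimizations such as strength reduction, algebraic simplification, and dead code elimination.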
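The same bottom-up pattern can be sketched for algebraic simplification and strength reduction. The block below is a std-only illustration under stated assumptions: `ToyExpr` and `simplify` are hypothetical names, and the toy enum stands in for `syn::Expr` so the sketch compiles without external crates. It is not part of syn, and the rewrite rules assume side-effect-free integer expressions.

```rust
/// Hypothetical toy expression tree standing in for `syn::Expr`.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyExpr {
    Int(i64),
    Var(String),
    Add(Box<ToyExpr>, Box<ToyExpr>),
    Mul(Box<ToyExpr>, Box<ToyExpr>),
    /// Left shift by a constant amount, produced by strength reduction.
    Shl(Box<ToyExpr>, u32),
}

/// Bottom-up rewrite: simplify the children first, then the current node.
pub fn simplify(expr: ToyExpr) -> ToyExpr {
    use ToyExpr::*;
    match expr {
        Add(l, r) => match (simplify(*l), simplify(*r)) {
            // Constant folding, skipped when the sum would overflow.
            (Int(a), Int(b)) if a.checked_add(b).is_some() => Int(a + b),
            // Algebraic identity: x + 0 == 0 + x == x.
            (e, Int(0)) | (Int(0), e) => e,
            (l, r) => Add(Box::new(l), Box::new(r)),
        },
        Mul(l, r) => match (simplify(*l), simplify(*r)) {
            // Constant folding, skipped when the product would overflow.
            (Int(a), Int(b)) if a.checked_mul(b).is_some() => Int(a * b),
            // Algebraic identity: x * 1 == 1 * x == x.
            (e, Int(1)) | (Int(1), e) => e,
            // x * 0 == 0, valid here because toy expressions have no side effects.
            (_, Int(0)) | (Int(0), _) => Int(0),
            // Strength reduction: multiplying by 2^k becomes a shift by k.
            (e, Int(n)) | (Int(n), e) if n > 1 && n.count_ones() == 1 => {
                Shl(Box::new(e), n.trailing_zeros())
            }
            (l, r) => Mul(Box::new(l), Box::new(r)),
        },
        Shl(e, k) => Shl(Box::new(simplify(*e)), k),
        // Literals and variables are already in simplest form.
        other => other,
    }
}
```

Because children are simplified before their parent, one pass lets a rewrite in a subtree enable another rewrite higher up: in `(x * 8) + 0`, the multiplication first becomes a shift, and the addition of zero is then dropped.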
Custom Attributes and Directives
Compilers often need to process custom attributes that control optimization, linking, or other compilation aspects. Syn makes it easy to define and parse such attributes with full type safety.
    /// Example: Custom attribute parsing
    #[derive(Debug)]
    pub struct CompilerDirective {
        pub optimization_level: u8,
        pub inline: bool,
        pub target_features: Vec<String>,
    }
    use proc_macro2::Ident;
    use syn::parse::{Parse, ParseStream};
    use syn::punctuated::Punctuated;
    use syn::{Error, LitInt, LitStr, Result, Token};

    impl Parse for CompilerDirective {
        fn parse(input: ParseStream) -> Result<Self> {
            let mut optimization_level = 0;
            let mut inline = false;
            let mut target_features = Vec::new();

            // A directive is a comma-separated list of items.
            let vars = Punctuated::<MetaItem, Token![,]>::parse_terminated(input)?;
            for var in vars {
                match var.name.to_string().as_str() {
                    "opt_level" => optimization_level = var.value,
                    "inline" => inline = true,
                    // `LitStr::value` already strips the surrounding quotes.
                    "features" => target_features = var.list,
                    _ => {
                        return Err(Error::new(
                            var.name.span(),
                            format!("Unknown directive: {}", var.name),
                        ))
                    }
                }
            }

            Ok(CompilerDirective {
                optimization_level,
                inline,
                target_features,
            })
        }
    }

    /// One item of a directive: `name = 3`, `name("a", "b")`, or bare `name`.
    struct MetaItem {
        name: Ident,
        value: u8,
        list: Vec<String>,
    }

    impl Parse for MetaItem {
        fn parse(input: ParseStream) -> Result<Self> {
            let name: Ident = input.parse()?;

            // `name = <integer>`: anything other than an integer literal is
            // reported as an error instead of being silently treated as a flag.
            if input.peek(Token![=]) {
                input.parse::<Token![=]>()?;
                let int: LitInt = input.parse()?;
                let value = int.base10_parse::<u8>()?;
                return Ok(MetaItem { name, value, list: vec![] });
            }

            // `name("a", "b", ...)`: a parenthesized list of string literals.
            if input.peek(syn::token::Paren) {
                let content;
                syn::parenthesized!(content in input);
                let list = Punctuated::<LitStr, Token![,]>::parse_terminated(&content)?
                    .into_iter()
                    .map(|s| s.value())
                    .collect();
                return Ok(MetaItem { name, value: 0, list });
            }

            // Bare `name`: a flag, recorded with value 1.
            Ok(MetaItem { name, value: 1, list: vec![] })
        }
    }
These custom attributes can control various aspects of compilation, from optimization levels to target-specific features, providing a clean interface between source code and compiler behavior.
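As a rough illustration of that interface, the sketch below shows how a later compiler stage might turn a parsed directive into a pass plan. For input such as `opt_level = 2, inline, features("avx2")`, the parser above should yield an optimization level of 2, `inline` set, and one target feature. Everything else here is an assumption made for the example: `Directive` is a std-only mirror of the `CompilerDirective` fields, and `plan_passes` and the pass names are hypothetical.

```rust
/// Std-only mirror of the `CompilerDirective` fields (hypothetical stand-in).
#[derive(Debug, Default)]
pub struct Directive {
    pub optimization_level: u8,
    pub inline: bool,
    pub target_features: Vec<String>,
}

/// Translate a directive into an ordered list of pass names.
/// The pass names are invented for illustration only.
pub fn plan_passes(directive: &Directive) -> Vec<String> {
    let mut passes = Vec::new();

    // Higher optimization levels enable additional passes cumulatively.
    if directive.optimization_level >= 1 {
        passes.push("const-fold".to_string());
    }
    if directive.optimization_level >= 2 {
        passes.push("dead-code-elim".to_string());
    }

    // The bare `inline` flag requests the inliner.
    if directive.inline {
        passes.push("inline".to_string());
    }

    // Each requested target feature becomes its own configuration step.
    for feature in &directive.target_features {
        passes.push(format!("enable-feature:{feature}"));
    }

    passes
}
```

Keeping the directive as a plain data structure means the parsing code is the only place that needs to know the attribute syntax; later stages depend only on the fields.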
Error Handling and Diagnostics
High-quality error messages are essential for any compiler. Syn provides detailed span information for every AST node, enabling precise error reporting that points directly to problematic source code.
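To make concrete what a span buys a diagnostic, here is a minimal std-only sketch of rendering an error that underlines the offending source range. It is an assumption-laden illustration, not syn's mechanism: `ByteSpan` and `render_diagnostic` are hypothetical names, the span is a plain byte range rather than a `proc_macro2::Span`, and column counting assumes ASCII source.

```rust
/// Hypothetical byte-offset span; syn itself uses `proc_macro2::Span`.
#[derive(Debug, Clone, Copy)]
pub struct ByteSpan {
    pub start: usize,
    pub end: usize,
}

/// Render `message` with the source line and a caret underline for `span`.
/// Assumes ASCII source and a span that lies within `source`.
pub fn render_diagnostic(source: &str, span: ByteSpan, message: &str) -> String {
    // Find the bounds of the line that contains the start of the span.
    let line_start = source[..span.start].rfind('\n').map_or(0, |i| i + 1);
    let line_end = source[span.start..]
        .find('\n')
        .map_or(source.len(), |i| span.start + i);

    // Line numbers are 1-based: count the newlines before the span.
    let line_no = source[..span.start].matches('\n').count() + 1;
    let col = span.start - line_start;

    // Underline at least one character, and never past the end of the line.
    let width = span.end.min(line_end).saturating_sub(span.start).max(1);

    format!(
        "error: {message}\n --> line {line_no}, column {}\n{}\n{}{}",
        col + 1,
        &source[line_start..line_end],
        " ".repeat(col),
        "^".repeat(width)
    )
}
```

With a span covering only the function name, the underline lands on the identifier itself rather than on the whole item, which is the level of precision the per-node spans described above make possible.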
#![allow(unused)] fn main() { use std::collections::HashMap; use proc_macro2::{Ident, TokenStream}; use quote::quote; use syn::parse::{Parse, ParseStream}; use syn::punctuated::Punctuated; use syn::spanned::Spanned; use syn::{ parse_quote, Error, Expr, ExprLit, FnArg, ItemFn, Lit, Pat, Result, Stmt, Token, Type, Visibility, }; /// Example: Parsing and analyzing a Rust function pub fn analyze_function(input: TokenStream) -> Result<FunctionAnalysis> { let func: ItemFn = syn::parse2(input)?; let param_count = func.sig.inputs.len(); let is_async = func.sig.asyncness.is_some(); let is_unsafe = func.sig.unsafety.is_some(); let has_generics = !func.sig.generics.params.is_empty(); let params = func .sig .inputs .iter() .filter_map(|arg| match arg { FnArg::Typed(pat_type) => { if let Pat::Ident(ident) = pat_type.pat.as_ref() { Some(ident.ident.to_string()) } else { None } } _ => None, }) .collect(); Ok(FunctionAnalysis { name: func.sig.ident.to_string(), param_count, params, is_async, is_unsafe, has_generics, visibility: format!("{:?}", func.vis), }) } #[derive(Debug, Clone)] pub struct FunctionAnalysis { pub name: String, pub param_count: usize, pub params: Vec<String>, pub is_async: bool, pub is_unsafe: bool, pub has_generics: bool, pub visibility: String, } /// Example: Custom DSL parsing - Simple state machine language pub struct StateMachine { pub name: Ident, pub states: Vec<State>, pub initial: Ident, } pub struct State { pub name: Ident, pub transitions: Vec<Transition>, } pub struct Transition { pub event: Ident, pub target: Ident, pub action: Option<Expr>, } impl Parse for StateMachine { fn parse(input: ParseStream) -> Result<Self> { input.parse::<kw::state>()?; input.parse::<kw::machine>()?; let name: Ident = input.parse()?; let content; syn::braced!(content in input); // Parse initial state content.parse::<kw::initial>()?; content.parse::<Token![:]>()?; let initial: Ident = content.parse()?; content.parse::<Token![;]>()?; // Parse states let mut states = 
Vec::new();
        while !content.is_empty() {
            states.push(content.parse()?);
        }

        Ok(StateMachine {
            name,
            states,
            initial,
        })
    }
}

impl Parse for State {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::state>()?;
        let name: Ident = input.parse()?;

        let content;
        syn::braced!(content in input);

        let mut transitions = Vec::new();
        while !content.is_empty() {
            transitions.push(content.parse()?);
        }

        Ok(State { name, transitions })
    }
}

impl Parse for Transition {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::on>()?;
        let event: Ident = input.parse()?;
        input.parse::<Token![=>]>()?;
        let target: Ident = input.parse()?;

        let action = if input.peek(Token![,]) {
            input.parse::<Token![,]>()?;
            Some(input.parse()?)
        } else {
            None
        };

        input.parse::<Token![;]>()?;

        Ok(Transition {
            event,
            target,
            action,
        })
    }
}

mod kw {
    use syn::custom_keyword;

    custom_keyword!(state);
    custom_keyword!(machine);
    custom_keyword!(initial);
    custom_keyword!(on);
}

/// Example: AST transformation - Add logging to functions
pub fn inject_logging(mut func: ItemFn) -> ItemFn {
    let fn_name = &func.sig.ident;
    let log_entry: Stmt = parse_quote! {
        println!("Entering function: {}", stringify!(#fn_name));
    };

    // Insert at the beginning of the function body
    func.block.stmts.insert(0, log_entry);

    // Add exit logging before each return
    let log_exit: Stmt = parse_quote! {
        println!("Exiting function: {}", stringify!(#fn_name));
    };

    let mut new_stmts = Vec::new();
    for stmt in func.block.stmts.drain(..) {
        match &stmt {
            Stmt::Expr(Expr::Return(_), _) => {
                new_stmts.push(log_exit.clone());
                new_stmts.push(stmt);
            }
            _ => new_stmts.push(stmt),
        }
    }

    // Add exit log at the end if there's no explicit return
    if !matches!(new_stmts.last(), Some(Stmt::Expr(Expr::Return(_), _))) {
        new_stmts.push(log_exit);
    }

    func.block.stmts = new_stmts;
    func
}

/// Example: Custom attribute parsing
#[derive(Debug)]
pub struct CompilerDirective {
    pub optimization_level: u8,
    pub inline: bool,
    pub target_features: Vec<String>,
}

impl Parse for CompilerDirective {
    fn parse(input: ParseStream) -> Result<Self> {
        let mut optimization_level = 0;
        let mut inline = false;
        let mut target_features = Vec::new();

        let vars = Punctuated::<MetaItem, Token![,]>::parse_terminated(input)?;

        for var in vars {
            match var.name.to_string().as_str() {
                "opt_level" => optimization_level = var.value,
                "inline" => inline = true,
                "features" => {
                    target_features = var
                        .list
                        .into_iter()
                        .map(|s| s.trim_matches('"').to_string())
                        .collect();
                }
                _ => {
                    return Err(Error::new(
                        var.name.span(),
                        format!("Unknown directive: {}", var.name),
                    ))
                }
            }
        }

        Ok(CompilerDirective {
            optimization_level,
            inline,
            target_features,
        })
    }
}

struct MetaItem {
    name: Ident,
    value: u8,
    list: Vec<String>,
}

impl Parse for MetaItem {
    fn parse(input: ParseStream) -> Result<Self> {
        let name: Ident = input.parse()?;

        if input.peek(Token![=]) {
            input.parse::<Token![=]>()?;
            if let Ok(lit) = input.parse::<ExprLit>() {
                if let Lit::Int(int) = lit.lit {
                    let value = int.base10_parse::<u8>()?;
                    return Ok(MetaItem {
                        name,
                        value,
                        list: vec![],
                    });
                }
            }
        }

        if input.peek(syn::token::Paren) {
            let content;
            syn::parenthesized!(content in input);
            let list = Punctuated::<ExprLit, Token![,]>::parse_terminated(&content)?
                .into_iter()
                .filter_map(|lit| {
                    if let Lit::Str(s) = lit.lit {
                        Some(s.value())
                    } else {
                        None
                    }
                })
                .collect();
            return Ok(MetaItem {
                name,
                value: 0,
                list,
            });
        }

        Ok(MetaItem {
            name,
            value: 1,
            list: vec![],
        })
    }
}

/// Example: Type analysis for compiler optimizations
pub fn analyze_types_in_function(func: &ItemFn) -> HashMap<String, TypeInfo> {
    let mut type_info = HashMap::new();

    // Analyze parameter types
    for input in &func.sig.inputs {
        if let FnArg::Typed(pat_type) = input {
            if let Pat::Ident(ident) = pat_type.pat.as_ref() {
                let info = analyze_type(&pat_type.ty);
                type_info.insert(ident.ident.to_string(), info);
            }
        }
    }

    type_info
}

#[derive(Debug, Clone)]
pub struct TypeInfo {
    pub is_primitive: bool,
    pub is_reference: bool,
    pub is_mutable: bool,
    pub type_string: String,
}

fn analyze_type(ty: &Type) -> TypeInfo {
    match ty {
        Type::Path(type_path) => {
            let type_string = quote!(#type_path).to_string();
            let is_primitive = matches!(
                type_string.as_str(),
                "i8" | "i16"
                    | "i32"
                    | "i64"
                    | "i128"
                    | "u8"
                    | "u16"
                    | "u32"
                    | "u64"
                    | "u128"
                    | "f32"
                    | "f64"
                    | "bool"
                    | "char"
            );

            TypeInfo {
                is_primitive,
                is_reference: false,
                is_mutable: false,
                type_string,
            }
        }
        Type::Reference(type_ref) => {
            let inner = analyze_type(&type_ref.elem);
            TypeInfo {
                is_reference: true,
                is_mutable: type_ref.mutability.is_some(),
                ..inner
            }
        }
        _ => TypeInfo {
            is_primitive: false,
            is_reference: false,
            is_mutable: false,
            type_string: quote!(#ty).to_string(),
        },
    }
}

/// Example: Generate optimized code based on const evaluation
pub fn const_fold_binary_ops(expr: Expr) -> Expr {
    match expr {
        Expr::Binary(mut binary) => {
            // Recursively fold sub-expressions
            binary.left = Box::new(const_fold_binary_ops(*binary.left));
            binary.right = Box::new(const_fold_binary_ops(*binary.right));

            // Try to fold if both operands are literals
            if let (Expr::Lit(left_lit), Expr::Lit(right_lit)) =
                (binary.left.as_ref(), binary.right.as_ref())
            {
                if let (Lit::Int(l), Lit::Int(r)) = (&left_lit.lit, &right_lit.lit) {
                    if let (Ok(l_val), Ok(r_val)) =
                        (l.base10_parse::<i64>(), r.base10_parse::<i64>())
                    {
                        use syn::BinOp;
                        let result = match binary.op {
                            BinOp::Add(_) => Some(l_val + r_val),
                            BinOp::Sub(_) => Some(l_val - r_val),
                            BinOp::Mul(_) => Some(l_val * r_val),
                            BinOp::Div(_) if r_val != 0 => Some(l_val / r_val),
                            _ => None,
                        };

                        if let Some(val) = result {
                            return parse_quote!(#val);
                        }
                    }
                }
            }

            Expr::Binary(binary)
        }
        // Recursively process other expression types
        Expr::Paren(mut paren) => {
            paren.expr = Box::new(const_fold_binary_ops(*paren.expr));
            Expr::Paren(paren)
        }
        Expr::Block(mut block) => {
            if let Some(Stmt::Expr(expr, _semi)) = block.block.stmts.last_mut() {
                *expr = const_fold_binary_ops(expr.clone());
            }
            Expr::Block(block)
        }
        other => other,
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_function_analysis() {
        let input = quote! {
            pub async unsafe fn process_data<T>(input: &str, count: usize) -> Result<T> {
                todo!()
            }
        };

        let analysis = analyze_function(input).unwrap();
        assert_eq!(analysis.name, "process_data");
        assert_eq!(analysis.param_count, 2);
        assert!(analysis.is_async);
        assert!(analysis.is_unsafe);
        assert!(analysis.has_generics);
        assert_eq!(analysis.params, vec!["input", "count"]);
    }

    #[test]
    fn test_inject_logging() {
        let input: ItemFn = parse_quote! {
            fn calculate(x: i32, y: i32) -> i32 {
                if x > y {
                    return x - y;
                }
                x + y
            }
        };

        let modified = inject_logging(input);
        let output = quote!(#modified).to_string();
        assert!(output.contains("Entering function"));
        assert!(output.contains("Exiting function"));
    }

    #[test]
    fn test_const_folding() {
        // Test simple constant folding
        let expr: Expr = parse_quote! { 2 + 3 };
        let folded = const_fold_binary_ops(expr);
        match &folded {
            Expr::Lit(lit) => {
                if let Lit::Int(int) = &lit.lit {
                    assert_eq!(int.base10_parse::<i64>().unwrap(), 5);
                } else {
                    panic!("Expected integer literal");
                }
            }
            _ => panic!(
                "Expected literal after folding, got: {:?}",
                quote!(#folded).to_string()
            ),
        }

        // Test division
        let expr: Expr = parse_quote! { 10 / 2 };
        let folded = const_fold_binary_ops(expr);
        if let Expr::Lit(lit) = &folded {
            if let Lit::Int(int) = &lit.lit {
                assert_eq!(int.base10_parse::<i64>().unwrap(), 5);
            }
        }

        // Test non-foldable expression (variable)
        let expr: Expr = parse_quote! { x + 3 };
        let folded = const_fold_binary_ops(expr);
        assert!(matches!(folded, Expr::Binary(_)));
    }

    #[test]
    fn test_type_analysis() {
        let func: ItemFn = parse_quote! {
            fn example(x: i32, s: &str, data: &mut Vec<u8>) {}
        };

        let types = analyze_types_in_function(&func);
        assert!(types["x"].is_primitive);
        assert!(types["s"].is_reference);
        assert!(!types["s"].is_mutable);
        assert!(types["data"].is_reference);
        assert!(types["data"].is_mutable);
    }
}

/// Error handling with span information
pub fn validate_function(func: &ItemFn) -> std::result::Result<(), Vec<Error>> {
    let mut errors = Vec::new();

    // Check function name conventions
    let name = func.sig.ident.to_string();
    if name.starts_with('_') && func.vis != Visibility::Inherited {
        errors.push(Error::new(
            func.sig.ident.span(),
            "Public functions should not start with underscore",
        ));
    }

    // Check for missing documentation
    if !func.attrs.iter().any(|attr| attr.path().is_ident("doc")) {
        errors.push(Error::new(
            func.sig.ident.span(),
            "Missing documentation comment",
        ));
    }

    // Check parameter conventions
    for input in &func.sig.inputs {
        let FnArg::Typed(pat_type) = input else {
            continue;
        };
        let Type::Reference(type_ref) = pat_type.ty.as_ref() else {
            continue;
        };
        if type_ref.mutability.is_some() {
            continue;
        }
        let Type::Path(path) = type_ref.elem.as_ref() else {
            continue;
        };
        let Some(ident) = path.path.get_ident() else {
            continue;
        };

        let type_name = ident.to_string();
        if matches!(type_name.as_str(), "String" | "Vec" | "HashMap") {
            errors.push(Error::new(
                pat_type.ty.span(),
                format!(
                    "Consider using &{} instead of {} for better performance",
                    type_name, type_name
                ),
            ));
        }
    }

    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
}
The Error type in syn includes span information that integrates with Rust’s diagnostic system, producing error messages that feel native to the Rust compiler. This integration is particularly valuable when building tools that extend the Rust compiler or when creating lints and code analysis tools.
Integration with Quote
Syn works hand-in-hand with the quote crate for code generation. While syn parses TokenStreams into ASTs, quote converts ASTs back into TokenStreams. This bidirectional conversion enables powerful metaprogramming patterns.
The quote! macro supports interpolation of syn types, making it easy to construct complex code fragments. The parse_quote! macro combines both operations, parsing tokens directly into syn types. This combination provides a complete toolkit for reading, analyzing, transforming, and generating Rust code.
Advanced Patterns
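Building production compilers with syn involves several advanced patterns. The visitor traits Visit and VisitMut enable systematic traversal of large ASTs, read-only and in place respectively. The Fold trait supports functional transformations that consume a node and return its replacement. All three are gated behind Cargo features (visit, visit-mut, and fold). The punctuated module handles separator-delimited lists such as comma-separated arguments, including an optional trailing separator.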
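Syn's syntax tree types own their data; there is no borrowed, zero-copy parsing mode. For performance-critical applications, the practical levers are to enable only the Cargo features you need (full, visit, visit-mut, fold, and extra-traits are all opt-in) and to parse only the parts of the input you actually inspect, leaving the rest as an opaque TokenStream. Both measures reduce compile time and parsing work when processing large codebases.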
Best Practices
When using syn for compiler construction, organize your code to separate parsing, analysis, and transformation phases. Define clear AST types for your domain-specific constructs. Preserve span information throughout transformations to maintain high-quality error messages.
Test your parsers thoroughly using syn’s parsing functions directly. The library’s strong typing catches many errors at compile time, but runtime testing remains essential for ensuring correct parsing of edge cases.
Consider performance implications when designing AST transformations. While syn is highly optimized, traversing large ASTs multiple times can impact compilation speed. Combine related transformations when possible to minimize traversal overhead.
Common Patterns
Several patterns appear repeatedly in syn-based compiler tools. The parse-transform-generate pipeline forms the basis of most procedural macros. Custom parsing often combines syn’s built-in types with domain-specific structures. Hygiene preservation ensures that generated code doesn’t accidentally capture or shadow user identifiers.
Error accumulation allows reporting multiple problems in a single compilation pass. Span manipulation enables precise error messages and suggestions. Integration with the broader Rust ecosystem through traits and standard types ensures that syn-based tools compose well with other compiler infrastructure.
Syn provides a solid foundation for building sophisticated compiler tools that integrate seamlessly with Rust. Whether you’re creating procedural macros, building development tools, or implementing entirely new languages, syn’s combination of power, safety, and ergonomics makes it an invaluable tool in the compiler writer’s toolkit.