Make A Language

Part Eight: Function Definitions

8 October 2020

First, we need to decide on a syntax. Let’s start with Rust’s syntax:

fn frobnicate(foo: String, bar: String) -> Vec<Bar> {
    ...
}

Eldiro doesn’t have types (yet), so we need to remove those:1

fn frobnicate(foo, bar) {
    ...
}

Parentheses and commas are annoying, so let’s get rid of those:

fn frobnicate foo bar {
    ...
}

Often we have functions whose body is just a single statement. Here’s a contrived example:

fn add x y {
    x + y
}

It can be convenient to not force function bodies to be blocks in these cases. Using the syntax from C#:

fn add x y => x + y

Since blocks are expressions (which are statements), we can also do this:

fn add x y => {
    x + y
}

To reduce the number of ways to do one thing, we’ll remove the ‘block-only’ syntax, instead forcing the use of the => everywhere.

Why not use closures for everything?

In the past I would have preferred for all function definitions to be binding definitions whose value is a closure. Here’s an example in Rust:

// Function-specific syntax
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// More general closure syntax
let add = |x: i32, y: i32| -> i32 { x + y };

A comment from u/witty___name clarified that this approach is problematic, though. (I’ll be paraphrasing their explanation here.)

Eldiro doesn’t have any way to catch undefined functions until runtime at the moment, which means that this approach works fine, even with recursion. However, if Eldiro gains a way to check for undefined functions without running the program, then using closures for function definitions exclusively would lead to problems. Consider this closure definition in Rust:

let overflow_stack = || overflow_stack();

This doesn’t compile, since bindings aren’t in scope on their left-hand side. If they were, it would allow things like

let x = x;

which is clearly nonsensical.

Because of this, I’ve chosen for function definitions to have a dedicated syntax in Eldiro.

Parsing

Let’s get started on the parser. Function definitions aren’t an expression – they’re a statement, so let’s go to stmt.rs and add a variant to that enum:

use crate::binding_def::BindingDef;
use crate::env::Env;
use crate::expr::Expr;
use crate::func_def::FuncDef;
use crate::val::Val;

// snip

#[derive(Debug, PartialEq)]
pub(crate) enum Stmt {
    BindingDef(BindingDef),
    FuncDef(FuncDef),
    Expr(Expr),
}

To get our project compiling as quickly as possible, let’s add a case for Stmt::FuncDef to Stmt::eval:

impl Stmt {
    // snip

    pub(crate) fn eval(&self, env: &mut Env) -> Result<Val, String> {
        match self {
            Self::BindingDef(binding_def) => {
                binding_def.eval(env)?;
                Ok(Val::Unit)
            }
            Self::FuncDef(_) => todo!(),
            Self::Expr(expr) => expr.eval(env),
        }
    }
}

Let’s add a test, too:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::expr::{BindingUsage, Number, Op};

    // snip

    #[test]
    fn parse_func_def() {
        assert_eq!(
            Stmt::new("fn identity x => x"),
            Ok((
                "",
                Stmt::FuncDef(FuncDef {
                    name: "identity".to_string(),
                    params: vec!["x".to_string()],
                    body: Stmt::Expr(Expr::BindingUsage(BindingUsage {
                        name: "x".to_string(),
                    })),
                }),
            )),
        );
    }

    // snip
}

BindingUsage isn’t re-exported from crate::expr; let’s fix that. We’ll also re-export Block for consistency:

// expr.rs

mod binding_usage;
mod block;

pub(crate) use binding_usage::BindingUsage;
pub(crate) use block::Block;

use crate::env::Env;
use crate::utils;
use crate::val::Val;

The field name on BindingUsage is pub(super), when it should be pub(crate) so we can use it for our tests:

// binding_usage.rs

#[derive(Debug, PartialEq)]
pub(crate) struct BindingUsage {
    pub(crate) name: String,
}

We now need to create crate::func_def::FuncDef. Let’s start by declaring and creating the module:

// lib.rs

mod binding_def;
mod env;
mod expr;
mod func_def; // new!
mod stmt;
mod utils;
mod val;

Open src/func_def.rs and declare FuncDef:

use crate::stmt::Stmt;

#[derive(Debug, PartialEq)]
pub(crate) struct FuncDef {
    pub(crate) name: String,
    pub(crate) params: Vec<String>,
    pub(crate) body: Stmt,
}

Let’s try compiling Eldiro:

$ cargo c
error[E0072]: recursive type `func_def::FuncDef` has infinite size
 --> crates/eldiro/src/func_def.rs:4:1
  |
4 | pub(crate) struct FuncDef {
  | ^^^^^^^^^^^^^^^^^^^^^^^^^ recursive type has infinite size
...
7 |     pub(crate) body: Stmt,
  |                      ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `func_def::FuncDef` representable
  |
7 |     pub(crate) body: Box<Stmt>,
  |                      ^^^^    ^

error[E0072]: recursive type `stmt::Stmt` has infinite size
  --> crates/eldiro/src/stmt.rs:8:1
   |
8  | pub(crate) enum Stmt {
   | ^^^^^^^^^^^^^^^^^^^^ recursive type has infinite size
9  |     BindingDef(BindingDef),
10 |     FuncDef(FuncDef),
   |             ------- recursive without indirection
   |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `stmt::Stmt` representable
   |
10 |     FuncDef(Box<FuncDef>),
   |             ^^^^       ^

error[E0391]: cycle detected when computing drop-check constraints for `stmt::Stmt`
  --> crates/eldiro/src/stmt.rs:8:1
   |
8  | pub(crate) enum Stmt {
   | ^^^^^^^^^^^^^^^^^^^^
   |
note: ...which requires computing drop-check constraints for `func_def::FuncDef`...
  --> crates/eldiro/src/func_def.rs:4:1
   |
4  | pub(crate) struct FuncDef {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^
   = note: ...which again requires computing drop-check constraints for `stmt::Stmt`, completing the cycle
note: cycle used when computing drop-check constraints for `Parse`
  --> crates/eldiro/src/lib.rs:13:1
   |
13 | pub struct Parse(stmt::Stmt);
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 3 previous errors

Whoa! We have a bunch of errors because we’ve accidentally created a recursive type – a type that contains itself. Rust doesn’t like this because, when it’s calculating the size of Stmt, it falls into the following loop:

  1. To calculate Stmt’s size, I need to find the size of Stmt’s largest variant.
  2. To calculate Stmt::BindingDef’s size, I need to find the size of all its fields.
  3. Calculates the size of all of BindingDef’s fields.
  4. To calculate Stmt::FuncDef’s size, I need to find the size of all its fields.
  5. To calculate the size of the body field of FuncDef, I need to find the size of Stmt.
  6. Go to 1.

In other words – we have a recursive type because Stmt could contain a FuncDef depending on which variant it is, and FuncDef in turn contains a Stmt through the body field. The Rust compiler very kindly suggests how to remove the recursion; namely, add indirection somewhere. I’ll opt for Boxing the body of FuncDef:

use crate::stmt::Stmt;

#[derive(Debug, PartialEq)]
pub(crate) struct FuncDef {
    pub(crate) name: String,
    pub(crate) params: Vec<String>,
    pub(crate) body: Box<Stmt>,
}

A Box<Stmt> is a pointer to some memory on the heap that contains a Stmt. Rust’s process for calculating the size of Stmt now goes like this:

  1. To calculate Stmt’s size, I need to find the size of Stmt’s largest variant.
  2. To calculate Stmt::BindingDef’s size, I need to find the size of all its fields.
  3. Calculates the size of all of BindingDef’s fields.
  4. To calculate Stmt::FuncDef’s size, I need to find the size of all its fields.
  5. To calculate the size of the name field of FuncDef, I need to find the size of String (it’s 24 bytes).
  6. To calculate the size of the params field of FuncDef, I need to find the size of Vec (it’s 24 bytes).
  7. To calculate the size of the body field of FuncDef, I need to find the size of Box (depends on your architecture, e.g. on a 64-bit system it’s 8 bytes).
  8. To calculate Stmt::Expr’s size, I need to find the size of Expr.

No more recursion, so Rust doesn’t get stuck when trying to calculate Stmt’s size.

We now need to update that test we wrote earlier:

// stmt.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_func_def() {
        assert_eq!(
            Stmt::new("fn identity x => x"),
            Ok((
                "",
                Stmt::FuncDef(FuncDef {
                    name: "identity".to_string(),
                    params: vec!["x".to_string()],
                    body: Box::new(Stmt::Expr(Expr::BindingUsage(BindingUsage {
                        name: "x".to_string(),
                    }))),
                }),
            )),
        );
    }

    // snip
}

Let’s write another test for parsing a function definition, but this time in func_def.rs:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::expr::{Block, Expr};

    #[test]
    fn parse_func_def_with_no_params_and_empty_body() {
        assert_eq!(
            FuncDef::new("fn nothing => {}"),
            Ok((
                "",
                FuncDef {
                    name: "nothing".to_string(),
                    params: Vec::new(),
                    body: Box::new(Stmt::Expr(Expr::Block(Block { stmts: Vec::new() }))),
                },
            )),
        );
    }
}

We need to make the stmts field of Block pub(crate):

// block.rs

#[derive(Debug, PartialEq)]
pub(crate) struct Block {
    pub(crate) stmts: Vec<Stmt>,
}

Let’s implement FuncDef::new:

// func_def.rs

impl FuncDef {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("fn", s)?;
        let (s, _) = utils::extract_whitespace1(s)?;

        let (s, name) = utils::extract_ident(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let s = utils::tag("=>", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, body) = Stmt::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                params: Vec::new(),
                body: Box::new(body),
            },
        ))
    }
}

We should run the tests to see how we went:

$ cargo t -q
running 48 tests
..........................................F.....
failures:

---- stmt::tests::parse_func_def stdout ----
thread 'stmt::tests::parse_func_def' panicked at 'assertion failed: `(left == right)`
  left: `Ok((" identity x => x", Expr(BindingUsage(BindingUsage { name: "fn" }))))`,
 right: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(BindingUsage(BindingUsage { name: "x" })) })))`', crates/eldiro/src/stmt.rs:54:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    stmt::tests::parse_func_def

test result: FAILED. 47 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Looks like the parse_func_def_with_no_params_and_empty_body test passed, but the test we wrote in stmt.rs is failing. To make the test pass, we need to handle parameters:

impl FuncDef {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("fn", s)?;
        let (s, _) = utils::extract_whitespace1(s)?;

        let (s, name) = utils::extract_ident(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let mut s = s;
        let mut params = Vec::new();

        while let Ok((new_s, param)) = utils::extract_ident(s) {
            s = new_s;
            params.push(param.to_string());

            let (new_s, _) = utils::extract_whitespace(s);
            s = new_s;
        }

        let s = utils::tag("=>", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, body) = Stmt::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                params,
                body: Box::new(body),
            },
        ))
    }
}

We also use this approach of ‘keep extracting something (in this case identifiers) and stuff them into a Vec until you can’t continue’ in Block::new. Let’s see if it works:

$ cargo t -q
running 48 tests
..........................................F.....
failures:

---- stmt::tests::parse_func_def stdout ----
thread 'stmt::tests::parse_func_def' panicked at 'assertion failed: `(left == right)`
  left: `Ok((" identity x => x", Expr(BindingUsage(BindingUsage { name: "fn" }))))`,
 right: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(BindingUsage(BindingUsage { name: "x" })) })))`', crates/eldiro/src/stmt.rs:55:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    stmt::tests::parse_func_def

test result: FAILED. 47 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Hmm, I thought we’d fixed it. thinking emoji …

Ah, I figured it out! The test that’s failing is using Stmt::new, which we haven’t updated to try parsing a function definition. Here’s the current implementation:

// stmt.rs

impl Stmt {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        BindingDef::new(s)
            .map(|(s, binding_def)| (s, Self::BindingDef(binding_def)))
            .or_else(|_| Expr::new(s).map(|(s, expr)| (s, Self::Expr(expr))))
    }

    // snip
}

We need to add just one line to make the test pass:

impl Stmt {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        BindingDef::new(s)
            .map(|(s, binding_def)| (s, Self::BindingDef(binding_def)))
            .or_else(|_| FuncDef::new(s).map(|(s, func_def)| (s, Self::FuncDef(func_def))))
            .or_else(|_| Expr::new(s).map(|(s, expr)| (s, Self::Expr(expr))))
    }

    // snip
}

Remember that technique we used in both FuncDef::new and Block::new to parse multiple things? Let’s extract that logic into a helper function in crate::utils to remove the duplication and make the code easier to read:

pub(crate) fn sequence<T>(
    parser: impl Fn(&str) -> Result<(&str, T), String>,
    mut s: &str,
) -> Result<(&str, Vec<T>), String> {
    let mut items = Vec::new();

    while let Ok((new_s, item)) = parser(s) {
        s = new_s;
        items.push(item);

        let (new_s, _) = extract_whitespace(s);
        s = new_s;
    }

    Ok((s, items))
}

The function is generic over the type of the thing being parsed. It isn’t generic over the type of error the parser returns, though, because our project only has one type of parser error: String. Take a look at this article to see more of the rationale behind this decision.

We can make use of our shiny new sequence function in Block::new:

// block.rs

impl Block {
    pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("{", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, stmts) = utils::sequence(Stmt::new, s)?;

        let (s, _) = utils::extract_whitespace(s);
        let s = utils::tag("}", s)?;

        Ok((s, Block { stmts }))
    }

    // snip
}

and FuncDef::new:

// func_def.rs

impl FuncDef {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("fn", s)?;
        let (s, _) = utils::extract_whitespace1(s)?;

        let (s, name) = utils::extract_ident(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, params) = utils::sequence(utils::extract_ident, s)?;

        let s = utils::tag("=>", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, body) = Stmt::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                params,
                body: Box::new(body),
            },
        ))
    }
}

Wait! That doesn’t compile, because params is now of type Vec<&str> instead of Vec<String>. params is a Vec<&str> because the type that utils::extract_ident returns (ignoring the usual leftover input &str) is a &str, which is picked up and propagated by the <T> in utils::sequence. We can fix this by passing utils::sequence a wrapper closure around utils::extract_ident that turns the identifier it extracts into a String:

impl FuncDef {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("fn", s)?;
        let (s, _) = utils::extract_whitespace1(s)?;

        let (s, name) = utils::extract_ident(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, params) = utils::sequence(
            |s| utils::extract_ident(s).map(|(s, ident)| (s, ident.to_string())),
            s,
        )?;

        let s = utils::tag("=>", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, body) = Stmt::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                params,
                body: Box::new(body),
            },
        ))
    }
}

And now this works perfectly:

$ cargo t -q
running 48 tests
................................................
test result: ok. 48 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Let’s write another test:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_func_def_with_one_param_and_empty_body() {
        assert_eq!(
            FuncDef::new("fn greet name => {}"),
            Ok((
                "",
                FuncDef {
                    name: "greet".to_string(),
                    params: vec!["name".to_string()],
                    body: Box::new(Stmt::Expr(Expr::Block(Block { stmts: Vec::new() }))),
                },
            )),
        );
    }
}

It passes:

$ cargo t -q
running 49 tests
.................................................
test result: ok. 49 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Let’s write a test with multiple parameters and a non-empty body:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::expr::{BindingUsage, Block, Expr, Op};

    // snip

    #[test]
    fn parse_func_def_with_multiple_params() {
        assert_eq!(
            FuncDef::new("fn add x y => x + y"),
            Ok((
                "",
                FuncDef {
                    name: "add".to_string(),
                    params: vec!["x".to_string(), "y".to_string()],
                    body: Box::new(Stmt::Expr(Expr::Operation {
                        lhs: Expr::BindingUsage(BindingUsage {
                            name: "x".to_string()
                        }),
                        rhs: Expr::BindingUsage(BindingUsage {
                            name: "y".to_string()
                        }),
                        op: Op::Add
                    }))
                }
            ))
        );
    }
}

Aaaand it doesn’t compile. It seems that the fields lhs and rhs of Expr::Operation can only be crate::expr::Numbers. Although this does adhere to our initial design decision to disallow nested expressions, it prevents us from doing things as basic as defining an add function! We need to fix this.

Allowing operations on expressions

It might be helpful to run git stash – if you’re using Git – to temporarily hide the modifications we’ve made to existing files in our quest to add support for function definitions. Note that this will leave func_def.rs intact.

Let’s remind ourselves of Expr’s definition:

#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
    Number(Number),
    Operation { lhs: Number, rhs: Number, op: Op },
    BindingUsage(BindingUsage),
    Block(Block),
}

Your first instinct might be to change that to

#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
    Number(Number),
    Operation { lhs: Self, rhs: Self, op: Op },
    BindingUsage(BindingUsage),
    Block(Block),
}

Remember, though, that that would lead to the recursive type error we got earlier. We need to add Boxes:

#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
    Number(Number),
    Operation {
        lhs: Box<Self>,
        rhs: Box<Self>,
        op: Op,
    },
    BindingUsage(BindingUsage),
    Block(Block),
}

Let’s start by fixing all the tests:

// binding_def.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_binding_def() {
        assert_eq!(
            BindingDef::new("let a = 10 / 2"),
            Ok((
                "",
                BindingDef {
                    name: "a".to_string(),
                    val: Expr::Operation {
                        lhs: Box::new(Expr::Number(Number(10))),
                        rhs: Box::new(Expr::Number(Number(2))),
                        op: Op::Div,
                    },
                },
            )),
        );
    }

    // snip
}
// block.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn eval_block_with_multiple_exprs() {
        assert_eq!(
            Block {
                stmts: vec![
                    Stmt::Expr(Expr::Number(Number(100))),
                    Stmt::Expr(Expr::Number(Number(30))),
                    Stmt::Expr(Expr::Operation {
                        lhs: Box::new(Expr::Number(Number(10))),
                        rhs: Box::new(Expr::Number(Number(7))),
                        op: Op::Sub,
                    }),
                ],
            }
            .eval(&Env::default()),
            Ok(Val::Number(3)),
        );
    }

    // snip
}
// expr.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_one_plus_two() {
        assert_eq!(
            Expr::new("1+2"),
            Ok((
                "",
                Expr::Operation {
                    lhs: Box::new(Expr::Number(Number(1))),
                    rhs: Box::new(Expr::Number(Number(2))),
                    op: Op::Add,
                },
            )),
        );
    }

    #[test]
    fn parse_expr_with_whitespace() {
        assert_eq!(
            Expr::new("2 * 2"),
            Ok((
                "",
                Expr::Operation {
                    lhs: Box::new(Expr::Number(Number(2))),
                    rhs: Box::new(Expr::Number(Number(2))),
                    op: Op::Mul,
                },
            )),
        );
    }

    // snip

    #[test]
    fn eval_add() {
        assert_eq!(
            Expr::Operation {
                lhs: Box::new(Expr::Number(Number(10))),
                rhs: Box::new(Expr::Number(Number(10))),
                op: Op::Add,
            }
            .eval(&Env::default()),
            Ok(Val::Number(20)),
        );
    }

    #[test]
    fn eval_sub() {
        assert_eq!(
            Expr::Operation {
                lhs: Box::new(Expr::Number(Number(1))),
                rhs: Box::new(Expr::Number(Number(5))),
                op: Op::Sub,
            }
            .eval(&Env::default()),
            Ok(Val::Number(-4)),
        );
    }

    #[test]
    fn eval_mul() {
        assert_eq!(
            Expr::Operation {
                lhs: Box::new(Expr::Number(Number(5))),
                rhs: Box::new(Expr::Number(Number(6))),
                op: Op::Mul,
            }
            .eval(&Env::default()),
            Ok(Val::Number(30)),
        );
    }

    #[test]
    fn eval_div() {
        assert_eq!(
            Expr::Operation {
                lhs: Box::new(Expr::Number(Number(200))),
                rhs: Box::new(Expr::Number(Number(20))),
                op: Op::Div,
            }
            .eval(&Env::default()),
            Ok(Val::Number(10)),
        );
    }

    // snip
}
// func_def.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    #[test]
    fn parse_func_def_with_multiple_params() {
        assert_eq!(
            FuncDef::new("fn add x y => x + y"),
            Ok((
                "",
                FuncDef {
                    name: "add".to_string(),
                    params: vec!["x".to_string(), "y".to_string()],
                    body: Box::new(Stmt::Expr(Expr::Operation {
                        lhs: Box::new(Expr::BindingUsage(BindingUsage {
                            name: "x".to_string(),
                        })),
                        rhs: Box::new(Expr::BindingUsage(BindingUsage {
                            name: "y".to_string(),
                        })),
                        op: Op::Add,
                    })),
                },
            )),
        );
    }
}
// stmt.rs

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_expr() {
        assert_eq!(
            Stmt::new("1+1"),
            Ok((
                "",
                Stmt::Expr(Expr::Operation {
                    lhs: Box::new(Expr::Number(Number(1))),
                    rhs: Box::new(Expr::Number(Number(1))),
                    op: Op::Add,
                }),
            )),
        );
    }

    // snip
}

Now that we’ve updated all those tests, it’s time to tackle the four outstanding compilation errors. Two of these are from parsing an Expr::Operation, and the other two are from evaluating it.

Let’s fix the parsing first. Here’s how an Expr::Operation is currently parsed:

// expr.rs

impl Expr {
    // snip

    fn new_operation(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s)?;

        Ok((s, Self::Operation { lhs, rhs, op }))
    }

    // snip
}

As you can see, lhs and rhs are of type Number, not of type Box<Expr>. We can add the Box part:

impl Expr {
    // snip

    fn new_operation(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s)?;

        Ok((
            s,
            Self::Operation {
                lhs: Box::new(lhs),
                rhs: Box::new(rhs),
                op,
            },
        ))
    }

    // snip
}

Let’s try making lhs and rhs Exprs by calling Expr::new:

impl Expr {
    // snip

    fn new_operation(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Self::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Self::new(s)?;

        Ok((
            s,
            Self::Operation {
                lhs: Box::new(lhs),
                rhs: Box::new(rhs),
                op,
            },
        ))
    }

    // snip
}

This looks reasonable enough. Let’s run the tests:

$ cargo t -q
error[E0308]: mismatched types
  --> crates/eldiro/src/expr.rs:89:21
   |
89 |                 let Number(lhs) = lhs;
   |                     ^^^^^^^^^^^   --- this expression has type `&std::boxed::Box<expr::Expr>`
   |                     |
   |                     expected struct `std::boxed::Box`, found struct `expr::Number`
   |
   = note: expected struct `std::boxed::Box<expr::Expr>`
              found struct `expr::Number`

error[E0308]: mismatched types
  --> crates/eldiro/src/expr.rs:90:21
   |
90 |                 let Number(rhs) = rhs;
   |                     ^^^^^^^^^^^   --- this expression has type `&std::boxed::Box<expr::Expr>`
   |                     |
   |                     expected struct `std::boxed::Box`, found struct `expr::Number`
   |
   = note: expected struct `std::boxed::Box<expr::Expr>`
              found struct `expr::Number`

error: aborting due to 2 previous errors

Oh, yeah, we still have that compilation error. Let’s add a todo!() and comment out the rest so we can get on with running our tests:

impl Expr {
    // snip

    pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
        match self {
            Self::Number(Number(n)) => Ok(Val::Number(*n)),
            Self::Operation { lhs, rhs, op } => {
                todo!();

                // let Number(lhs) = lhs;
                // let Number(rhs) = rhs;

                // let result = match op {
                //     Op::Add => lhs + rhs,
                //     Op::Sub => lhs - rhs,
                //     Op::Mul => lhs * rhs,
                //     Op::Div => lhs / rhs,
                // };

                // Ok(Val::Number(result))
            }
            Self::BindingUsage(binding_usage) => binding_usage.eval(env),
            Self::Block(block) => block.eval(env),
        }
    }
}
$ cargo t -q
running 46 tests
.........F.F.FF.F.
thread 'binding_def::tests::parse_binding_def' has overflowed its stack
fatal runtime error: stack overflow

Oh, no. Oh, no no no. What’s happened here is that Expr::new_operation has called Expr::new, which calls Expr::new_operation, and so forth. The solution to this is to create an intermediary function that tries parsing all the possibilities apart from Expr::Operation. We use this function to parse lhs and rhs, and use it and Expr::new_operation in Expr::new. It’s difficult to explain with words; take look at the code:

impl Expr {
    pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
        Self::new_operation(s).or_else(|_| Self::new_non_operation(s))
    }

    fn new_non_operation(s: &str) -> Result<(&str, Self), String> {
        Self::new_number(s)
            .or_else(|_| {
                BindingUsage::new(s)
                    .map(|(s, binding_usage)| (s, Self::BindingUsage(binding_usage)))
            })
            .or_else(|_| Block::new(s).map(|(s, block)| (s, Self::Block(block))))
    }

    // snip

    fn new_operation(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Self::new_non_operation(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Self::new_non_operation(s)?;

        Ok((
            s,
            Self::Operation {
                lhs: Box::new(lhs),
                rhs: Box::new(rhs),
                op,
            },
        ))
    }

    // snip
}
$ cargo t -q
running 46 tests
.............F.F.F.F..F.......................
failures:

---- expr::block::tests::eval_block_with_multiple_exprs stdout ----
thread 'expr::block::tests::eval_block_with_multiple_exprs' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- expr::tests::eval_add stdout ----
thread 'expr::tests::eval_add' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17

---- expr::tests::eval_div stdout ----
thread 'expr::tests::eval_div' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17

---- expr::tests::eval_mul stdout ----
thread 'expr::tests::eval_mul' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17

---- expr::tests::eval_sub stdout ----
thread 'expr::tests::eval_sub' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17


failures:
    expr::block::tests::eval_block_with_multiple_exprs
    expr::tests::eval_add
    expr::tests::eval_div
    expr::tests::eval_mul
    expr::tests::eval_sub

test result: FAILED. 41 passed; 5 failed; 0 ignored; 0 measured; 0 filtered out

Ah, those failures are from the todo!() in Expr::eval. Regardless, all the parsers are passing, which means that we’ve managed to fix both of our issues at once: we no longer have a stack overflow, and we have also prevented nested operations (since lhs and rhs call Expr::new_non_operation, there’s no chance that they could be Expr::Operations themselves).

Here’s the current state of Expr::eval:

impl Expr {
    // snip

    pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
        match self {
            Self::Number(Number(n)) => Ok(Val::Number(*n)),
            Self::Operation { lhs, rhs, op } => {
                todo!();

                // let Number(lhs) = lhs;
                // let Number(rhs) = rhs;

                // let result = match op {
                //     Op::Add => lhs + rhs,
                //     Op::Sub => lhs - rhs,
                //     Op::Mul => lhs * rhs,
                //     Op::Div => lhs / rhs,
                // };

                // Ok(Val::Number(result))
            }
            Self::BindingUsage(binding_usage) => binding_usage.eval(env),
            Self::Block(block) => block.eval(env),
        }
    }
}

To evaluate an Expr::Operation, we must first evaluate the lhs and the rhs:

impl Expr {
    // snip

    pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
        match self {
            Self::Number(Number(n)) => Ok(Val::Number(*n)),
            Self::Operation { lhs, rhs, op } => {
                let lhs = lhs.eval(env)?;
                let rhs = rhs.eval(env)?;

                let result = match op {
                    Op::Add => lhs + rhs,
                    Op::Sub => lhs - rhs,
                    Op::Mul => lhs * rhs,
                    Op::Div => lhs / rhs,
                };

                Ok(Val::Number(result))
            }
            Self::BindingUsage(binding_usage) => binding_usage.eval(env),
            Self::Block(block) => block.eval(env),
        }
    }
}

We’ve got errors now because you can’t add, subtract, multiply or divide Vals. What we want is for lhs and rhs to be integers. Before, when we knew that lhs and rhs were Numbers, we could simply use pattern matching to extract the contained integer. Now, though, they could be any type that Val can contain. Although at the moment this means that lhs and rhs could both either be Numbers or Units, we’ll add strings and other data types in the future. To solve this issue, we need to check that lhs and rhs are indeed Val::Numbers, and if they aren’t return an error.

impl Expr {
    // snip

    pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
        match self {
            Self::Number(Number(n)) => Ok(Val::Number(*n)),
            Self::Operation { lhs, rhs, op } => {
                let lhs = lhs.eval(env)?;
                let rhs = rhs.eval(env)?;

                let (lhs, rhs) = match (lhs,rhs) {
                    (Val::Number(lhs), Val::Number(rhs)) => (lhs, rhs),
                    _ => return Err("cannot evaluate operation whose left-hand side and right-hand side are not both numbers".to_string()),
                };

                let result = match op {
                    Op::Add => lhs + rhs,
                    Op::Sub => lhs - rhs,
                    Op::Mul => lhs * rhs,
                    Op::Div => lhs / rhs,
                };

                Ok(Val::Number(result))
            }
            Self::BindingUsage(binding_usage) => binding_usage.eval(env),
            Self::Block(block) => block.eval(env),
        }
    }
}

Let’s write a test to verify that the error message kicks in at the right time:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn eval_non_number_operation() {
        assert_eq!(
            Expr::Operation {
                lhs: Box::new(Expr::Number(Number(10))),
                rhs: Box::new(Expr::Block(Block { stmts: Vec::new() })),
                op: Op::Add,
            }
            .eval(&Env::default()),
            Err("cannot evaluate operation whose left-hand side and right-hand side are not both numbers".to_string()),
        );
    }

    // snip
}
$ cargo t -q
running 47 tests
...............................................
test result: ok. 47 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Run git stash pop to get back the changes we made for function definitions. Let’s run the tests again to see if the new feature of Expr::Operation is being used successfully:

$ cargo t -q
running 51 tests
...................................................
test result: ok. 51 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Beautiful! With that diversion complete, we can get back to testing the parsing of function definitions.

Back to parsing

So, what tests do we still need for the parsing? Well, we have one with no parameters and an empty body. We also have one with one parameter and an empty body, as well as one with multiple parameters and a non-empty body. Although we haven’t tested every possible combination, we’ve tested enough scenarios for me to be confident that the parser is working correctly.

Evaluation

Here’s the definition of Env:

#[derive(Debug, PartialEq, Default)]
pub struct Env<'parent> {
    bindings: HashMap<String, Val>,
    parent: Option<&'parent Self>,
}

One way to store function definitions in Env is to add another field, funcs, and have its type be

HashMap<
    String,
    /* everything inside of FuncDef apart from name */,
>

Alternatively, you could have only one field for both bindings and functions. The type of this field is HashMap<String, V>, where V is an enum that holds either a binding’s value, or information about a function. This way, you can’t ever have both a function and a binding with the same name – whichever one is declared last takes priority. This method also prevents duplication of the ‘recursively search parents’ logic we wrote originally for bindings.2

A nice place to start is removing the bindings field and replacing it with a field called … well, it can contain bindings and functions … what do they both have in common? … names?

#[derive(Debug, PartialEq, Default)]
pub struct Env<'parent> {
    named: HashMap<String, NamedInfo>,
    parent: Option<&'parent Self>,
}

Let’s declare NamedInfo:

// still in env.rs

#[derive(Debug, PartialEq)]
enum NamedInfo {
    Binding(Val),
    Func { params: Vec<String>, body: Stmt },
}

We need to update all of Env’s methods – at the same time, we can also add methods for functions:

impl<'parent> Env<'parent> {
    pub(crate) fn create_child(&'parent self) -> Self {
        Self {
            named: HashMap::new(),
            parent: Some(self),
        }
    }

    pub(crate) fn store_binding(&mut self, name: String, val: Val) {
        self.named.insert(name, NamedInfo::Binding(val));
    }

    pub(crate) fn store_func(&mut self, name: String, params: Vec<String>, body: Stmt) {
        self.named.insert(name, NamedInfo::Func { params, body });
    }

    pub(crate) fn get_binding(&self, name: &str) -> Result<Val, String> {
        self.get_named_info(name)
            .and_then(|named_info| match named_info {
                NamedInfo::Binding(val) => Some(val),
                _ => None,
            })
            .ok_or_else(|| format!("binding with name ‘{}’ does not exist", name))
    }

    pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
        self.get_named_info(name)
            .and_then(|named_info| match named_info {
                NamedInfo::Func { params, body } => Some((params, body)),
                _ => None,
            })
            .ok_or_else(|| format!("function with name ‘{}’ does not exist", name))
    }

    fn get_named_info(&self, name: &str) -> Option<NamedInfo> {
        self.named
            .get(name)
            .cloned()
            .or_else(|| self.parent.and_then(|parent| parent.get_named_info(name)))
    }
}

That code is a touch repetitive. Let‘s add some methods to NamedInfo to make converting it into its variants easier:

impl NamedInfo {
    fn into_binding(self) -> Option<Val> {
        if let Self::Binding(val) = self {
            Some(val)
        } else {
            None
        }
    }

    fn into_func(self) -> Option<(Vec<String>, Stmt)> {
        if let Self::Func { params, body } = self {
            Some((params, body))
        } else {
            None
        }
    }
}

Let’s make use of that in Env::get_binding and Env::get_func:

impl<'parent> Env<'parent> {
    // snip

    pub(crate) fn get_binding(&self, name: &str) -> Result<Val, String> {
        self.get_named_info(name)
            .and_then(NamedInfo::into_binding)
            .ok_or_else(|| format!("binding with name ‘{}’ does not exist", name))
    }

    pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
        self.get_named_info(name)
            .and_then(NamedInfo::into_func)
            .ok_or_else(|| format!("function with name ‘{}’ does not exist", name))
    }

    // snip
}

The first of the two compilation errors we have left is from us not being allowed to call .cloned() in Env::get_named_info, since NamedInfo doesn’t implement Clone. Let’s fix that:

#[derive(Debug, Clone, PartialEq)]
enum NamedInfo {
    Binding(Val),
    Func { params: Vec<String>, body: Stmt },
}

This doesn’t compile, since Stmt doesn’t implement Clone. I’ll leave you to go through all the ‘blah doesn’t implement Clone’ errors, since adding #[derive(Clone)] to almost all the types we’ve written so far is too laborious to write out here. As I did in the last article, here’s a link to the diff of the commit that adds this on GitHub in case you get lost.

We only have one error left; ‘no method named get_binding_value’. Let’s fix it:

// binding_usage.rs

impl BindingUsage {
    // snip

    pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
        env.get_binding(&self.name)
    }
}

Now that all those shenanigans with Env are done, we can finally implement evaluation of function definitions. Let‘s start with a test in stmt.rs:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn eval_func_def() {
        assert_eq!(
            Stmt::FuncDef(FuncDef {
                name: "always_return_one".to_string(),
                params: Vec::new(),
                body: Box::new(Stmt::Expr(Expr::Number(Number(1)))),
            })
            .eval(&mut Env::default()),
            Ok(Val::Unit),
        );
    }

    // snip
}

To stay consistent with crate::stmt::tests::eval_binding_def, the test doesn’t examine the effect it’s had on the Env – instead, it just checks that evaluating a function definition gives you a Unit.

$ cargo t -q
running 52 tests
.............................................F......
failures:

---- stmt::tests::eval_func_def stdout ----
thread 'stmt::tests::eval_func_def' panicked at 'not yet implemented', crates/eldiro/src/stmt.rs:28:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    stmt::tests::eval_func_def

test result: FAILED. 51 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

Let’s remove the todo!() from Stmt::eval, and fix this:

impl Stmt {
    // snip

    pub(crate) fn eval(&self, env: &mut Env) -> Result<Val, String> {
        match self {
            Self::BindingDef(binding_def) => {
                binding_def.eval(env)?;
                Ok(Val::Unit)
            }
            Self::FuncDef(func_def) => {
                func_def.eval(env)?;
                Ok(Val::Unit)
            }
            Self::Expr(expr) => expr.eval(env),
        }
    }
}

Let’s create an empty FuncDef::eval so that we can run our tests:

// func_def.rs

use crate::env::Env;
use crate::stmt::Stmt;
use crate::utils;

// snip

impl FuncDef {
    // snip

    pub(crate) fn eval(&self, env: &mut Env) -> Result<(), String> {
        Ok(())
    }
}
$ cargo t -q
running 52 tests
....................................................
test result: ok. 52 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

We don’t need to write a test for evaluating function definitions, since the code to do so is trivial:3

impl FuncDef {
    // snip

    pub(crate) fn eval(&self, env: &mut Env) -> Result<(), String> {
        env.store_func(self.name.clone(), self.params.clone(), *self.body.clone());
        Ok(())
    }
}

See you all next time, when we’ll add function calls to Eldiro.


  1. The syntax highlighting has gone away because this isn’t Rust anymore, and my website doesn’t have Eldiro syntax highlighting. ↩︎

  2. I spent almost an hour trying to refactor this logic out so that it can be reused in an attempt to use the first approach without copy-pasting. My first try at this used a macro, but the macro took up more code than copy-pasting the code. The same thing happened when I tried using a generic function instead. ↩︎

  3. Famous last words. ↩︎