Writing a lambda calculus interpreter in a new language and an old language

Reading time: 19 minutes.
2026-05-04

Back to Basics: Lambda Calculus Interpreter

You should definitely lay down your new, shiniest tool that will unlock infinite productivity and learn something just for yourself, for fun, slowly, clumsily. You'll be left with something that can't be ever taken away: experience!

I wanted to learn Zig, so I decided I would write a lambda calculus interpreter with just the standard library. I also wrote a Haskell implementation acting as a baseline; Haskell is an entirely different class of language and it'll be fun to talk about the differences in implementation.

You can follow along in any language and try to write your own lambda calculus interpreter. I'll first highlight the general architecture used in the Haskell implementation and then present how the same steps were implemented in Zig.

And a note: I apologize for formatting and code examples. Following from source might be easier.

You can skip the next section if you know what lambda calculus is.

What's a Lambda Calculus?

Lambda Calculus was invented in the '30s by logician Alonzo Church as he was attempting to improve on Russell's type-theoretical foundation of mathematics. Many notations for function substitution were used before, but Lambda Calculus was the first one where substitution and alpha conversion were formalized.

In essence, lambda calculus defines "lambda terms" which can be "reduced" or computed, and there is a single computational rule. Having a single computational rule seems very limited, but we'll discover we can define anything in terms of this single operation.

A lambda term can either be:

- a variable, like x
- an application of one term to another, written side by side like a b
- an abstraction, written λx.<body>, which binds the variable x inside its body

When multiple terms are written in succession, like a b c, it is understood as a sequence of applications with left-associativity, so a b c = ((a b) c) and not (a (b c)).

Abstractions behave like anonymous functions, so f(x) = x (the identity function) could be written in lambda calculus as λx.x. To continue our analogy, beta reduction is like function application, and f(3) = 3 (function evaluation) would be written as the following reduction (λx.x) 3 -> 3. (The arrow is not official.)

To "reduce" an application, we substitute the variable that follows the lambda inside the function body with the second term of the application. For instance, to reduce (λx.x x) k we have to substitute every instance of x (the variable to the right of the lambda symbol) inside the abstraction's body, x x in this case, with the second term k. This becomes just k k, which is not very interesting, but consider the following expression and its reduction steps:

(λa.a a) (λx.x)
(λx.x) (λx.x)
(λx.x)

We can often perform multiple reductions until a term becomes irreducible. In this example, we substituted a inside a a with the identity (λx.x). Then, we applied the identity to itself, which yields the identity again.

Sometimes we can perform reductions forever:

(λx.x x) (λx.x x)
(λx.x x) (λx.x x)
(λx.x x) (λx.x x)
(λx.x x) (λx.x x)
(λx.x x) (λx.x x)

Trying to reduce this term results in a copy of the original term, which sets us back at square one.

There are many more interesting lambda terms we'll cover later, but this information is enough to write an interpreter.

We won't go too much into alpha renaming, but suffice it to say it's how we can formally state that λx.x is the same identity function as λy.y, and that we can safely interchange the two. This is like stating that the functions f(x) = x*2 and g(y) = y*2 are the same, which is intuitively true but laid down formally in lambda calculus.
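As a concrete aside (this helper is mine, not part of the interpreter we'll build), alpha equivalence can be checked mechanically by comparing binding depths instead of names. A sketch, assuming an AST like the one we'll define later, specialized to String:

```haskell
data Expr = Var String | Appl Expr Expr | Abs String Expr
    deriving (Eq, Show)

-- Two terms are alpha-equivalent when bound variables sit at the same
-- binding depth and free variables share the same name.
alphaEq :: Expr -> Expr -> Bool
alphaEq = go [] []
  where
    depth v env = lookup v (zip env [0 :: Int ..]) -- innermost binder = depth 0
    go env1 env2 (Var a) (Var b) = case (depth a env1, depth b env2) of
        (Just i, Just j)   -> i == j  -- both bound: compare depths
        (Nothing, Nothing) -> a == b  -- both free: compare names
        _                  -> False
    go env1 env2 (Appl f1 a1) (Appl f2 a2) =
        go env1 env2 f1 f2 && go env1 env2 a1 a2
    go env1 env2 (Abs v1 e1) (Abs v2 e2) =
        go (v1 : env1) (v2 : env2) e1 e2
    go _ _ _ _ = False
```

With this, alphaEq (Abs "x" (Var "x")) (Abs "y" (Var "y")) holds, matching the λx.x = λy.y intuition.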

General Strategy and Constraints

We're aiming to implement our lambda calculus interpreter as a library whose functionality can be used separately. We'll code as library authors and be our own library consumers in a minimal "main" file that provides a basic REPL (read-eval-print-loop). Follow the same guidelines in your language for extra fun.

Each implementation should provide parsing, pretty printing, and evaluation, which should be enough to implement the aforementioned REPL.

We'll use the standard library (or distribution) only, and try as much as possible to write "idiomatic" code.

Parsing and Data Modeling

We'll start with parsing lambda terms such as λc.c(λx.x), λx.x, y or λa.(a b c) into an abstract syntax tree. We also need to define such a tree.

Lambda terms can contain other lambda terms, which makes it tempting to use a recursive abstract syntax tree. This could be encouraged or discouraged depending on your language.

Parsing won't be very hard, just a bit uninteresting. Our general strategy will be implementing bits of code that each parse one of the three lambda terms and then writing a backtracking parser: it will attempt to parse each variant by calling itself recursively, picking another choice on failure.

Backtracking, recursive descent parsers are easy to reason about and write but can get very slow. If you follow along I suggest doing whatever is simplest to parse your strings into AST nodes.

I won't derive a formal grammar for lambda calculus myself; instead, here's a simple EBNF grammar for lambda calculus according to the University of Maryland:

x ::= any variable name
e ::= x | ( e e ) | ( λ ( x ) e )
v ::= ( λ ( x ) e )

A nice feature we'd love to support that isn't present in this specific grammar is lists of applications, such as (a b c d e) -> ((((a b) c) d) e). You can definitely write lambda calculus without it, and it might be challenging to implement depending on how you write your parser.
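The trick that makes application lists manageable is to parse a flat list of terms first and then fold it left. A sketch (the Expr type and the chain name are mine, for illustration):

```haskell
import Data.List (foldl')

-- A cut-down AST: just variables and applications.
data Expr = Var String | Appl Expr Expr
    deriving (Eq, Show)

-- (a b c d e) parses into a head and a list [b, c, d, e];
-- folding left rebuilds ((((a b) c) d) e).
chain :: Expr -> [Expr] -> Expr
chain = foldl' Appl
```

For example, chain (Var "a") [Var "b", Var "c"] produces Appl (Appl (Var "a") (Var "b")) (Var "c").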

Evaluation

We'll implement a function to substitute a variable inside an expression. We'll use this subst function to implement beta reduction, and lastly we'll provide a method that reduces terms until no further reduction can happen (or until we're stuck).

Care must be taken in the subst function to avoid substituting variables captured by lambda abstractions.

This is quite fun and in the implementation we'll have to ponder deeply about how programming languages in general implement "evaluation".

Environment, Repl

Lastly, we'll cheat and let the user actually name a lambda term for later, like this: id = λx.x, and write a second evaluator function that takes into account these user defined variables, allowing them to write id 3 and get 3 as expected.

In both my implementations I defined a new data type for declarations and used some kind of HashMap to model the environment.

If we did well on the first steps this should be a fairly straightforward extension to our current code, and it will teach us a little about stateful computation (the state being the user-defined env) when implementing the "resolve" function. Care must be taken not to substitute captured variables, e.g. \id.id mustn't change even if {id: \x.x} is in our env.

The best part about naming lambda expressions is that it lets us write massive lambda terms without going insane, by just composing together many simple named terms.

Writing an interpreter (Haskell Baseline Version)

Haskell was practically invented to let undergraduates write lambda calculus interpreters; a joke that isn't too far from reality, as its main influence, Standard ML (Meta Language), was in fact engineered for writing compilers and interpreters.

I also have prior experience with Haskell and know for a fact I was once able to write a lambda calculus interpreter with it. I'm sort of rusty, though.

Remembering Haskell

Haskell is sadly in a sorry state where the most famous compiler essentially defines the language, as the old Haskell 2010 standard remains the last official reference. The ecosystem, which is entirely built on top of GHC, has evolved and thrived with compiler-specific extensions. Any serious, modern Haskell code requires 10+ compiler extensions to work. The most popular extensions were supposed to be included in a Haskell 2020 standard, but interest died out. Contributors wanted.

Sometimes I have trouble finding official documentation. The standard library has really fancy and clean pages on Hackage, and the official documentation for the language is here, I guess. The standard is over a decade old. I wish someone could point me to something akin to the beautiful Rust reference book, where I can get a quick and friendly refresher on the language basics.

AST in Haskell

This is the boring AST we came up with. We're using T.Text from the text package, which isn't the standard library but is widely recommended in place of the default String.

Before Rust, Haskell pioneered the practice of having many confusing string types, since the default String type is a lazy linked list of single characters. A lazy linked list of single characters is great for writing streaming, lazy functions, but not so great for memory layout and almost never what you want for representing a string. T.Text is a boring packed array of text (UTF-8 internally since text-2.0).

Haskell
type Ident = T.Text

data Expr  = Var Ident | Appl Expr Expr | Abs Ident Expr 
    deriving Eq

Haskell does not complain about the recursive nature of our data structure. It has three |-separated variants, the variable, the application, and the abstraction.
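To make the encoding concrete, the self-application example from earlier, (λa.a a) (λx.x), becomes the following value (a standalone copy of the AST; OverloadedStrings turns the literals into T.Text):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

type Ident = T.Text

data Expr = Var Ident | Appl Expr Expr | Abs Ident Expr
    deriving (Eq, Show)

-- (λa.a a) (λx.x): an application of one abstraction to another
example :: Expr
example = Appl (Abs "a" (Appl (Var "a") (Var "a")))
               (Abs "x" (Var "x"))
```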

Parsing in Haskell

Instead of reaching for a parser library (probably the best idea) we'll try to implement an instance of the Read typeclass. This lets us call the function read :: Read a => String -> a with our custom type as the return type, resolving to our Read implementation here.

This led to the following implementation, which I'm happy with despite not being completely sure about it.

Haskell
instance Read Expr where
  readPrec = readExpr
  readListPrec = readListPrecDefault

readExpr :: ReadPrec (Expr)
readExpr = readAppl <++ readSingle

readSingle :: ReadPrec (Expr)
readSingle = parens readLambda <++ (fmap Var readIdent) <++ paren readExpr

readAppl :: ReadPrec (Expr)
readAppl =  do
    first <- readSingle
    lift skipSpaces
    items <- lift $ many1 $ do
        x <- readPrec_to_P readSingle 0
        skipSpaces
        return x
    return $ foldl' Appl first items

readLambda :: ReadPrec (Expr)
readLambda = do
    _ <- choice [readChar 'λ', readChar '\\']
    ident <- readIdent
    _ <- readChar '.'
    lift skipSpaces
    rest <- readExpr
    return $ Abs ident rest

readIdent :: ReadPrec T.Text
readIdent = do
    ident <- lift (munch1 isAlphaNum)
    return $ T.pack ident
    
readChar :: Char ->  ReadPrec Char
readChar c = lift (char c)

The suggested way to implement read was using the ReadPrec monad: a function with type ReadPrec (Type) is a parser that returns said type, and you can compose them using do notation and other functions like choice, many, or the <++ operator.

Our readExpr is a composition of simpler and simpler parsers, down to readChar and readIdent, which match a single character and an identifier respectively.

All is good except for readAppl: it's very annoying to create a left-associative chain of operations using a recursive descent parser: we have to parse a b c as ((a b) c), meaning the first term could be "very deep". To fix this, we actually parse a flat list of terms, then we foldl' over that list to build the final left-associative object.

Inside do blocks we can also use the var <- readExpr syntax, which means "run this parser and bind its result, or fail and short-circuit". A lot of control flow is encoded this way!

There is some confusing interplay with an older parsing monad called ReadP, which underlies the ReadS type mentioned in the standard but is much slower than the efficient, GHC-only ReadPrec. The lift function can be seen translating parsers written for the old ReadP monad, like skipSpaces, into ReadPrec ones. This was kind of confusing and just adds clutter.

It's worth noting that most Haskell parsing libraries also make use of do notation and monads, providing a very clean and nice interface. I've used attoparsec a lot, along with other parsec descendants.

I'm interested in hearing from a seasoned Haskeller what I could've done better here.

Eval in Haskell

Haskell
sub :: Ident -> Expr -> Expr -> Expr 
sub id0 arg (Var id1) | id0 == id1 = arg
sub _   _   (Var id1)  = Var id1
sub id0 _   (Abs id1 expr) | id0 == id1 = Abs id1 expr
sub id0 arg (Abs id1 expr) = Abs id1 (sub id0 arg expr)
sub id0 arg0 (Appl expr arg1) = Appl (sub id0 arg0 expr) (sub id0 arg0 arg1)

-- normal order
eval :: Expr -> Expr
eval (Appl (Abs ident expr) arg) = sub ident arg expr
eval (Appl expr arg) = Appl (eval expr) arg
eval x = x

steps :: Expr -> [Expr]
steps = fix (\rec x -> if x == eval x then [x] else x:(rec $ eval x))

The substitution function is simple: it only needs to avoid captured variables.
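To see the capture rule at work, here is the same sub specialized to String identifiers (a standalone copy for illustration), with the shadowing case exercised:

```haskell
type Ident = String
data Expr = Var Ident | Appl Expr Expr | Abs Ident Expr
    deriving (Eq, Show)

sub :: Ident -> Expr -> Expr -> Expr
sub id0 arg (Var id1) | id0 == id1 = arg
sub _   _   (Var id1) = Var id1
sub id0 _   (Abs id1 expr) | id0 == id1 = Abs id1 expr  -- rebound: stop here
sub id0 arg (Abs id1 expr) = Abs id1 (sub id0 arg expr)
sub id0 arg (Appl f a) = Appl (sub id0 arg f) (sub id0 arg a)

-- Substituting k for x inside λx.x does nothing: the inner x belongs to the binder.
shadowed :: Expr
shadowed = sub "x" (Var "k") (Abs "x" (Var "x"))

-- But inside λy.x the free x is replaced as expected.
replaced :: Expr
replaced = sub "x" (Var "k") (Abs "y" (Var "x"))
```

Here shadowed stays λx.x, while replaced becomes λy.k.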

When implementing the evaluation function, a question naturally arises: we need to find a suitable application to reduce, but what if there are multiple possible candidates? In the following term we have two:

((λx.λy.y) ((λx.x x) (λx.x x))) b

We could apply the leftmost, outermost reduction by substituting ((λx.x x) (λx.x x)) in place of x inside λy.y. There's no x in there, so the argument gets discarded:

((λx.λy.y) ((λx.x x) (λx.x x))) b ->
((λy.y)) b ->
b

Or we could stubbornly apply the innermost, leftmost beta reduction available to us, in the (λx.x x) (λx.x x) part. As we saw earlier, this reduces to itself, leaving the original expression unchanged.

((λx.λy.y) ((λx.x x) (λx.x x))) b ->
((λx.λy.y) ((λx.x x) (λx.x x))) b ->
((λx.λy.y) ((λx.x x) (λx.x x))) b ->
... no progress

The order in which we pick reductions can actually dictate whether we'll ever reach a final state. Good news! Always picking the leftmost, outermost reduction WILL always take you to the same final term, if such a term exists.

Most programming languages tend to evaluate innermost first: when you call a function, all its arguments are evaluated before the function body. Haskell is one of the few languages where this isn't true: bodies are "substituted" before their arguments are evaluated.

After the interpreter is done, you'll be able to play around with different evaluation strategies and see how they differ.

The steps function will repeatedly call eval until x == eval x. There's a simpler way to implement this, but I admittedly just wanted to use fix here.
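That simpler version is just direct recursion to a fixpoint; here is a generic sketch (untilFix is my name, not from the interpreter), so that steps could be written as untilFix eval:

```haskell
-- Iterate f until the value stops changing, collecting every step,
-- just like steps stops at an irreducible term.
untilFix :: Eq a => (a -> a) -> a -> [a]
untilFix f x
    | x' == x   = [x]
    | otherwise = x : untilFix f x'
  where
    x' = f x
```

For example, untilFix (\n -> min 5 (n + 1)) 0 yields [0,1,2,3,4,5].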

There is no function to get the normal form, as steps suffices:

normal :: Expr -> Expr
normal = last . steps

Declarations and State in Haskell

At this point we can test our functions with GHCi or a simple main, since we have all the components for basic read -> eval -> print.

However, it gets very tedious to write long lambda terms manually. Let's implement a simple dialect of our lambda calculus for interactive use: every line is either an expression to evaluate or a definition.

The syntax for definitions will be simply name = <lambda term>. We define a new data type to reflect this that can either be a declaration or a bare expression, along with Read, Show implementations.

Haskell
data EnvExpr = Decl Ident Expr | Base Expr
instance Read EnvExpr where
  readPrec = readEnvExpr
  readListPrec = readListPrecDefault

instance Show EnvExpr where
    show (Decl ident expr) = printf "%s = %s" ident (show expr)
    show (Base expr) = show expr

readEnvExpr :: ReadPrec (EnvExpr)
readEnvExpr = readDecl <++ (Base <$> readExpr)

readDecl :: ReadPrec (EnvExpr)
readDecl = do
    ident <- readIdent
    lift skipSpaces
    _ <- readChar '='
    lift skipSpaces
    expr <- readExpr
    return $ Decl ident expr

Now we just need to store definitions as we parse them, and to implement a new function that replaces all variables (as in const id a) with the terms stored in our env (like {const: \x.\y.x, id: \x.x}) until we are left with a bare expression again (like ((\x.\y.x) (\x.x) a)).

I modeled the environment using a classic Data.Map from containers. To model stateful computation, I just had to use the State monad.

Haskell
type Env = Map Ident Expr

readAndStore :: [String] -> [String]
readAndStore = fst . evalAndStore empty
    where evalAndStore = flip inner
          inner = runState . read_store_monad
        
read_store_monad :: [String] -> State Env [String]
read_store_monad []     = 
    return []
read_store_monad (!x:xs) = do
    !x' <- case ((read x)::EnvExpr) of
        Decl i e -> (\s -> s ++ " -> " ++ (show . last .steps $ e)) <$> put_print i e
        Base e -> (\s -> show s ++ " -> " ++ (show . last . steps $ s)) <$> eval_env e
    xs'<- read_store_monad xs
    return $ x':xs'


put_print :: Ident -> Expr -> State Env String
put_print i e = do
    env <- get
    put (insert i e env)
    pure $ show $ Decl i e

eval_env :: Expr -> State Env Expr
eval_env (Var i) = do
    env <- get
    case MP.lookup i env of
        Just x -> return x
        Nothing -> return $ Var i
eval_env (Appl a b) = do
    a' <- eval_env a;
    b' <- eval_env b;
    return $ Appl a' b'
eval_env (Abs k b) = do
    env <- get
    b' <- case MP.lookup k env of
        Just x -> do
            put (delete k env)
            eval_env x
        Nothing -> eval_env b
    return $ Abs k b'

The State A B monad lets us write functions that return B and have read/write access to a state A. Inside of do notation we can <- get the state or put (<new_state>) as we see fit, and return our desired types.

eval_env does the heavy lifting of inserting new declarations in the environment or removing them temporarily when they're captured by a lambda expression.

put_print inserts and prints in a single operation: when a user types id = \x.x in the repl they'll receive a printout of the declaration again.

read_store_monad performs the fun task of reading the whole input line-by-line and returning a list of output lines inside a State monad. Since Haskell lists are lazy, these will actually work as streams if needed.

readAndStore doesn't have the State monad in its signature; instead it takes a list of lines as input, runs the state monad to perform the computation, and returns the output.

Now, because the output of our State monad is a list, Haskell won't bother computing the full list: it just fetches elements as they are needed. Laziness is preserved in the State monad and in our functions too, meaning they work with streaming data for free.
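This laziness is easy to demo in isolation. A toy sketch (the counter stands in for our Env; names are mine): a lazy State computation whose result is a list can feed take even on infinite input:

```haskell
import Control.Monad.State  -- lazy State monad (from mtl/transformers)

-- Number each input line while threading a counter through State,
-- building the output list lazily.
numberLines :: [String] -> State Int [String]
numberLines [] = return []
numberLines (x : xs) = do
    n <- get
    put (n + 1)
    rest <- numberLines xs
    return ((show n ++ ": " ++ x) : rest)

-- Only the first two steps are ever computed, despite the infinite input.
firstTwo :: [String]
firstTwo = take 2 (evalState (numberLines (repeat "id")) 0)
```

firstTwo evaluates to ["0: id","1: id"] and terminates, which is exactly the property readAndStore relies on.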

IO

Haskell
prompt :: String
prompt = "λ " 

main :: IO ()
main = do
  putStr prompt
  hFlush stdout
  hSetBuffering stdout NoBuffering
  interact $ prompt_inner

prompt_inner :: String -> String
prompt_inner = foldMap (\s -> s ++ ("\n" ++ prompt)) . readAndStore . lines

In our IO module we invoke readAndStore inside another function which is just String -> String. These two strings are the whole output and whole input of our program. They're also default Haskell strings: a lazy linked list of single characters.

Feeding the function to interact will hook it up to stdin -> stdout, and thanks to laziness it will start producing output as soon as enough input is available.

Final Haskell Review

Monads are very nice and not just some gimmick. I've used three just in this small example (State, ReadPrec, and IO!) in wildly different contexts, and do notation makes sense in each of them.

The tooling has improved tremendously and is on par with Rust: while stack has always been very mature, easy, and an inspiration for cargo and friends, the HLS (Haskell Language Server) is now working great and out-of-the-box for most projects.

This style of code is still mostly unreadable for most programmers used to other programming paradigms. Could you follow in your language of choice?

Writing an interpreter (Zig version)

Getting started is okay. I used mise to handle the zig and zls installation (0.16.0).

Zig has a single-page HTML language reference and ships the standard library's documentation offline via zig std. I consulted the language reference, the standard library, and my friend TRGWII for Zig wisdom. I wrote an initial version, then he gave me a lot of stylistic suggestions and sent me a modified parser design. I then took that design and finished the interpreter.

AST in Zig

Zig

const Ident = struct {
    name: []const u8,
    fn deinit(_: Ident, _: std.mem.Allocator) void {}
};

const Appl = struct {
    f: *AST,
    a: *AST,
    fn deinit(self: Appl, alloc: std.mem.Allocator) void {
        self.f.deinit(alloc);
        alloc.destroy(self.f);
        self.a.deinit(alloc);
        alloc.destroy(self.a);
    }
};

const Abs = struct {
    v: []const u8,
    e: *AST,
    fn deinit(self: Abs, alloc: std.mem.Allocator) void {
        self.e.deinit(alloc);
        alloc.destroy(self.e);
    }
};

pub const AST = union(enum) {
    ident: Ident,
    appl: Appl,
    abs: Abs,
    /// many methods here

The lack of a string type is frustrating until you actually embrace the []const u8 oriented philosophy, where everything is just bytes and you assign meaning to them. Usually my compiler greatly aids me in this task; in Zig that task is once again on you, so my identifier is once again just a list of numbers. At some point I must've learned that a C char is a number before it is a character. No facade here.

Parsing in Zig

I could not find anything helpful in the standard library for building parsers, so I made my own. My initial design was a recursive descent parser built with a bunch of functions that take a string and return (Result, how_much_of_input_I_parsed). TRGWII suggested using a Parser struct which takes an allocator just once, keeping the position inside the input string as internal state. Fair enough.

Zig
pub const Parser = struct {
    src: []const u8,
    i: usize = 0,
    alloc: std.mem.Allocator,

    // many methods and entire AST definition actually

    fn ident(self: *Self) ParseError!Ident {
        if (self.i >= self.src.len) return ParseError.EOF;
        var i = self.i;
        for (self.src[self.i..]) |c| {
            if (std.ascii.isAlphabetic(c) or c == '-' or c == '_') {
                i += 1;
            } else {
                break;
            }
        }
        if (self.i == i) return ParseError.MissingId;
        const name = self.src[self.i..i];
        self.i = i;
        return .{ .name = name };
    }

    fn abs(self: *Self) !Abs {
        const restore = self.i;
        errdefer self.i = restore;
        if (self.src.len <= self.i) return error.EOF;
        if (self.src[self.i] == '\\')
            self.i += 1
        else if (std.mem.startsWith(u8, self.src[self.i..], "λ"))
            self.i += 2
        else
            return error.MissingLambda;
        const id = try self.ident();
        if (self.src.len <= self.i) return error.EOF;
        if (self.src[self.i] != '.') return error.MissingDot;
        self.i += 1;
        const ast = try self.alloc.create(AST);
        errdefer self.alloc.destroy(ast);
        ast.* = try self.parse();
        return .{ .v = id.name, .e = ast };
    }

    // other methods are omitted for brevity

While in the Haskell version this was hidden inside control flow, here we backtrack very explicitly: at the start of each parsing function we store the current i, our index into the source string. Using errdefer, i is restored to its initial value on error: if parsing fails, we go back to where we started (maybe to try another variant?).

I thought of my AST as immutable, owning its own subnodes. My design turned out to be very noisy, with a lot of allocations and frees taking place. I know this is not the best Zig style. The strings in identifiers are not owned: they belong to the input string, so they are not freed or handled by the parser.

In fact, the .deinit method for identifiers kind of doesn't do anything.

This is the proper way in Zig to switch on variants: implement the .deinit method (or whatever shared method you need) on each variant, then write something like this:

Zig
    fn deinit(self: AST, allocator: std.mem.Allocator) void {
        switch (self) {
            inline else => |variant| variant.deinit(allocator),
        }
    }

This will get inlined into a proper switch that calls the .deinit method of each variant.

Evaluating in Zig

I then wrote my equivalent functions to perform a single substitution, do a single evaluation step, and reduce until normal form:

Zig

pub fn subst(self: AST, id: Ident, arg: *const AST, allocator: std.mem.Allocator) error{OutOfMemory}!*AST {
    // std.debug.print("I put {f} in {f} in place of {s}\n", .{ arg, self, id.name });
    switch (self) {
        .abs => |a| {
            var cloned_expr = try allocator.create(AST);
            cloned_expr.* = self;
            if (!std.mem.eql(u8, id.name, a.v)) {
                cloned_expr.abs.e = try a.e.subst(id, arg, allocator);
                cloned_expr.abs.v = a.v;
            } else {
                cloned_expr.abs.e = try a.e.clone(allocator);
                cloned_expr.abs.v = a.v;
            }
            return cloned_expr;
        },
        .appl => |app| {
            var cloned_expr = try allocator.create(AST);
            cloned_expr.* = self;
            cloned_expr.appl.f = try app.f.subst(id, arg, allocator);
            cloned_expr.appl.a = try app.a.subst(id, arg, allocator);
            return cloned_expr;
        },
        .ident => |idn| {
            if (std.mem.eql(u8, id.name, idn.name)) {
                return try arg.clone(allocator);
            } else {
                var cloned_expr = try allocator.create(AST);
                cloned_expr.* = self;
                cloned_expr.ident = Ident{ .name = self.ident.name };
                return cloned_expr;
            }
        },
    }
}

const EvalError = error{ MalformedAst, OutOfMemory };


pub fn eval(self: AST, allocator: std.mem.Allocator) EvalError!*AST {

    switch (self) {
        .appl => |app| {
            switch (app.f.*) {
                .abs => |a| {
                    return try a.e.subst(.{ .name = a.v }, app.a, allocator);
                },
                else => {
                    var cloned_expr = try allocator.create(AST);
                    cloned_expr.* = self;
                    cloned_expr.appl.a = try app.a.clone(allocator); // deep clone arg
                    cloned_expr.appl.f = try app.f.eval(allocator); // eval function body

                    return cloned_expr;
                },
            }
        },
        .abs => |a| {
            var cloned_expr = try allocator.create(AST);
            cloned_expr.* = self;
            cloned_expr.abs.e = try a.e.eval(allocator); // eval function body
            cloned_expr.abs.v = a.v;

            return cloned_expr;
        },
        else => {
            return try self.clone(allocator);
        },
    }
}

pub fn normal(self: *AST, allocator: std.mem.Allocator) EvalError!*AST {
    var res = try self.eval(allocator);
    var prev = try self.clone(allocator);
    while (!res.eql(prev.*)) {
        prev.deinit(allocator);
        prev = res;
        res = try res.eval(allocator);
    }
    prev.deinit(allocator);
    return res;
}

They're very similar to what I did in Haskell, just written in an imperative style and with more allocations than I would like.

Environment in Zig

Just like in Haskell, I added a new parser and AST for declarations. This parser also keeps track of its position in the source string with an i field. You can actually mix and match parsers with no issues if you create two with the same .src string and keep track of i; since Zig offers no encapsulation for fields, you can just edit i to your liking.

I also integrated the environment inside the parser, which is a bit of a cheat. I used a boring std.StringHashMap and had to create an OwnedAST version of my previous AST to put inside the env.

The parser has a resolve() method which takes a lambda term and returns a lambda term with all variables "resolved". To avoid capturing variables, this time I modify the env, removing the term and adding it back when I'm done. This is silly, and the usual technique is to create a more sophisticated env (such as a linked list of environments), but it worked fine for my small Zig program.

Zig
pub const EnvParser = struct {
    src: []const u8,
    i: usize = 0,
    alloc: std.mem.Allocator,
    env: std.StringHashMap(Parser.OwnedAST),
    parser: Parser,
    const Self = @This();
    const ParseError = error{MissingEqual} || Parser.ParseError || std.mem.Allocator.Error;
    const EvalError = Parser.AST.EvalError;

    pub fn init(src: []const u8, alloc: std.mem.Allocator) Self {
        return Self{ .src = src, .i = 0, .alloc = alloc, .env = std.StringHashMap(Parser.OwnedAST).init(alloc), .parser = Parser{ .alloc = alloc, .src = src, .i = 0 } };
    }

    pub fn deinit(self: *Self) void {
        var iter = self.env.iterator();
        while (iter.next()) |entry| {
            entry.value_ptr.deinit(self.alloc);
            self.alloc.free(entry.key_ptr.*);
        }
        self.env.deinit();
    }

    pub fn loadnew(self: *Self, newline: []const u8) void {
        self.i = 0;
        self.src = newline;
        self.parser.src = newline;
        self.parser.i = 0;
    }

    pub const DEFAST = union(enum) {
        definition: Definition,
        bare: Parser.AST,
        pub fn normal(self: DEFAST, allocator: std.mem.Allocator) EvalError!*Parser.AST {
            switch (self) {
                inline else => |variant| return variant.normal(allocator),
            }
        }
        pub fn format(self: DEFAST, writer: anytype) !void {
            try switch (self) {
                .definition => |def| writer.print("{s} = {f}", .{ def.name, def.exp }),
                .bare => |bare| bare.format(writer),
            };
        }
    };

    pub fn parse(self: *Self) EvalError!DEFAST {
        if (lambda_debug) {
            depth += 1;
            std.debug.print(".{s} {s}: with {s} and s {d}\n", .{ depth_str[0 .. depth * 2], @src().fn_name, self.src[self.i..], self.i });
        }
        defer if (lambda_debug) {
            depth -= 1;
        };
        if (self.definition()) |def| {
            return .{ .definition = def };
        } else |_| {
            self.parser.i = 0; // reset parser
            if (self.parser.parse()) |ast| {
                return .{ .bare = (try self.resolve(&ast)).* };
            } else |err| return err;
        }
    }

    pub fn resolve(self: *Self, expr: *const Parser.AST) EvalError!*Parser.AST {
        switch (expr.*) {
            .ident => |id| {
                if (self.env.contains(id.name)) {
                    const env_expr = self.env.get(id.name) orelse return EvalError.NotInEnv;
                    return env_expr.ast.clone(self.alloc);
                } else {
                    return expr.clone(self.alloc);
                }
            },
            .appl => |a| {
                const cloned_expr = try self.alloc.create(Parser.AST);
                errdefer {
                    self.alloc.destroy(cloned_expr);
                }
                cloned_expr.* = expr.*;
                cloned_expr.appl.f = try self.resolve(a.f);
                cloned_expr.appl.a = try self.resolve(a.a);
                return cloned_expr;
            },
            .abs => |a| {
                const cloned_expr = try self.alloc.create(Parser.AST);
                errdefer {
                    self.alloc.destroy(cloned_expr);
                }
                if (self.env.contains(a.v)) {
                    const cache = self.env.getEntry(a.v) orelse return EvalError.NotInEnv;
                    if (!self.env.remove(a.v)) return EvalError.NotInEnv;

                    cloned_expr.* = expr.*;
                    cloned_expr.abs.v = a.v;
                    cloned_expr.abs.e = try self.resolve(a.e);
                    try self.env.put(cache.key_ptr.*, cache.value_ptr.*);
                } else {
                    cloned_expr.* = expr.*;
                    cloned_expr.abs.v = a.v;
                    cloned_expr.abs.e = try self.resolve(a.e);
                }
                return cloned_expr;
            },
        }
    }

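For contrast with the Haskell baseline mentioned at the start, the same resolution step compresses to a few lines in Haskell. This is a minimal sketch, not the article's actual Haskell code: the AST constructors and the resolve name are my own assumptions, mirroring the Zig Parser.AST.

```haskell
import qualified Data.Map as M

-- A minimal AST mirroring the Zig Parser.AST (names are assumptions).
data AST = Ident String | Appl AST AST | Abs String AST
  deriving (Eq, Show)

type Env = M.Map String AST

-- Replace free identifiers by their environment bindings. Where the Zig
-- version temporarily removes the bound variable from a mutable hash map
-- and re-inserts it afterwards, the immutable Map lets us simply pass a
-- shrunken map to the recursive call.
resolve :: Env -> AST -> AST
resolve env (Ident n)  = M.findWithDefault (Ident n) n env
resolve env (Appl f a) = Appl (resolve env f) (resolve env a)
resolve env (Abs v e)  = Abs v (resolve (M.delete v env) e)
```

Note how shadowing falls out for free here: M.delete v env is a brand-new map, so there is nothing to restore after the recursive call, unlike the stash-and-restore dance the mutable Zig hash map forces.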


REPL in Zig

In main we create an allocator to hand to the library. We don't need to worry about freeing individual objects since we are using the wonderful arena allocator, which frees everything in one swoop with arena.deinit();.

We also set up several buffers to drive the new (and somewhat confusing) IO interface, opening stdin() and stdout() manually and reading until a newline for user input. Then: parse, eval, print.

Zig
const std = @import("std");
const lambda = @import("root.zig");

test {
    std.testing.refAllDecls(@This());
}

const prompt = "λ ";

const lambda_debug = true;

pub fn main(init: std.process.Init) !void {
    var arena = std.heap.ArenaAllocator.init(init.gpa);
    const io = init.io;

    defer arena.deinit();
    const allocator = arena.allocator();
    var stdin_buf: [1024]u8 = undefined;
    var stdout_buf: [1024]u8 = undefined;
    var stdin = std.Io.File.stdin().reader(io, &stdin_buf);
    var stdout = std.Io.File.stdout().writer(io, &stdout_buf);
    var parser = lambda.EnvParser.init("", allocator);
    try stdout.interface.writeAll(prompt);
    try stdout.interface.flush();
    while (true) {
        var line_buffer: [1024]u8 = undefined;
        var w: std.Io.Writer = .fixed(&line_buffer);

        const line_length = try stdin.interface.streamDelimiterLimit(&w, '\n', .unlimited);
        stdin.interface.toss(1);

        const input_line = line_buffer[0..line_length];
        parser.loadnew(input_line);
        var parsed = try parser.parse();
        try stdout.interface.writeAll(try std.fmt.allocPrint(allocator, "{f} -> {f}\n{s}", .{ parsed, try parsed.normal(allocator), prompt }));
        try stdout.interface.flush();
    }
}

Testing in Zig

The Zig version demanded at least one new test for every new piece of code I wrote. Once tests exist, your code actually gets run, and you get a chance to catch memory leaks and other common memory bugs (the testing allocator reports leaks for you). If you don't write tests, it's entirely possible your code won't even compile when you start using it, since Zig analyzes code lazily and unused declarations are effectively discarded during compilation. This also means your Zig code is only as good as your tests, which is true of many languages. I don't think I could write Zig without roughly one line of tests for each line of code (or whatever it takes to reach 100% coverage).
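For comparison, the Haskell baseline gets a similar safety net from plain executable assertions, with no allocator to leak from. This is an illustrative sketch rather than the article's code: subst and beta are my own names for capture-naive substitution and a single root beta step.

```haskell
-- A minimal AST (constructor names are assumptions, not the article's code).
data AST = Ident String | Appl AST AST | Abs String AST
  deriving (Eq, Show)

-- Capture-naive substitution: replace free occurrences of v with arg.
-- An abstraction binding the same name shadows v, so we stop there.
subst :: String -> AST -> AST -> AST
subst v arg (Ident n)  | n == v    = arg
                       | otherwise = Ident n
subst v arg (Appl f a) = Appl (subst v arg f) (subst v arg a)
subst v arg (Abs b e)  | b == v    = Abs b e
                       | otherwise = Abs b (subst v arg e)

-- One beta-reduction step at the root, if the term is a redex.
beta :: AST -> AST
beta (Appl (Abs v e) a) = subst v a e
beta t                  = t

main :: IO ()
main = do
  -- (λx.x) y  ->  y
  let result = beta (Appl (Abs "x" (Ident "x")) (Ident "y"))
  if result == Ident "y"
    then putStrLn "ok: (λx.x) y -> y"
    else error ("unexpected: " ++ show result)
```

A test suite built from checks like this exercises the reducer the same way Zig's test blocks exercise the interpreter, minus the memory-safety angle that Zig's testing allocator covers.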

Overall Zig Review

I expected a much harder time than I actually had: once you accept Zig's hubris and opinionated design, it's a very powerful and self-contained language! I haven't even tried the integrated fuzzer yet. After this experiment I feel I could be more productive in Zig than in C, where you hit the same problems but with more awkward tools to solve them.

The program I picked was intentionally a poor fit for Zig's imperative style and very easy in Haskell, and despite this I really enjoyed the experience.

Next Up

This was very fun and a lot could be improved. Please contact me if you have strong opinions about anything I said.

In the next episodes we will:
