
4 The Lambda Calculus

Today we will see a phase transition in programming language design. So far, the languages we’ve seen – LetLang, IteLang, CalcLang – are tiny, and seem unrepresentative of the kinds of languages that we would actually program in. Today, we will see the smallest interesting programming language, the lambda calculus. The lambda calculus is a Turing-complete language: it is capable – more or less – of representing all of the programs that your computer can run. Despite this expressive power, the lambda calculus has only a handful of rules that determine its behavior. This combination of expressivity and simplicity has made it the historical and intellectual starting point for much of programming languages research.

4.1 Syntax of the Lambda Calculus

The lambda calculus has only three syntactic forms:
  • identifiers,

  • lambda abstractions (also called "functions"), and

  • lambda application (also called "function calls").

Notice what this language doesn’t have: no numbers, no Booleans, no conditionals. Only functions and function calls. Essentially, the lambda calculus is the untyped fragment of Plait restricted to only these three syntactic forms.

Let’s see some examples. First, the identity function:

> (lambda (x) x)

- ('_a -> '_a)

#<procedure>

Syntactically, a lambda expression has two parts: an argument, which is a variable name; and a body, which is an expression. In the lambda calculus, functions are the only kind of value. We can of course call a function:

> ((lambda (x) x) (lambda (y)
                    (lambda (z) y)))

- ('_a -> ('_b -> '_a))

#<procedure>

Syntactically, an application has two parts: a callee, which is the function being called, and an argument, which is the expression being passed to the callee. In the above, (lambda (x) x) is the callee and (lambda (y) (lambda (z) y)) is the argument.

That’s it: that’s the entire language. Just functions and function calls. So, let’s design an abstract syntax datastructure for representing programs like these:

> (define-type Lam
    [identE (id : Symbol)]
    [lamE (arg : Symbol) (body : Lam)]
    [appE (e1 : Lam) (e2 : Lam)])
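
For example, the application we evaluated above, ((lambda (x) x) (lambda (y) (lambda (z) y))), could be written with these constructors as follows (the name example is introduced here purely for illustration):

> (define example
    (appE (lamE 'x (identE 'x))
          (lamE 'y (lamE 'z (identE 'y)))))
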
4.1.1 Surface Syntax

In this class we will be writing a lot of lambda calculus terms, so it’s important that we have good and concise notation for representing them. We will use the following surface syntax grammar:

 

expr ::= ( expr )
      |  λ ident . expr
      |  expr expr
      |  ident

Figure 3: Surface syntax for the lambda calculus.

The parentheses in this surface syntax act like they do in infix notation: they disambiguate binding order. The surface syntax λ ident . expr denotes a lambda abstraction with argument ident and body expr; the . separates the two. The surface syntax expr expr denotes an application. Note that this application rule is ambiguous, similar to addition in infix grammars, so we specify a left-associative binding order, i.e. x~y~z is equivalent surface syntax to (x~y)~z. This grammar is likely unfamiliar to you, so let’s look at some examples of surface syntax and the corresponding abstract syntax datastructures:

Surface syntax                         Abstract syntax datastructure
x                                      (identE 'x)
\lambda x . x                          (lamE 'x (identE 'x))
\lambda x . \lambda y . x              (lamE 'x (lamE 'y (identE 'x)))
x~y                                    (appE (identE 'x) (identE 'y))
(x~y)~z                                (appE (appE (identE 'x) (identE 'y)) (identE 'z))
x~y~z                                  (appE (appE (identE 'x) (identE 'y)) (identE 'z))
(\lambda x . x)~(\lambda y . y)        (appE (lamE 'x (identE 'x)) (lamE 'y (identE 'y)))
(\lambda x . x~x)~(\lambda y . y)      (appE (lamE 'x (appE (identE 'x) (identE 'x))) (lamE 'y (identE 'y)))
\lambda x . (x~(\lambda y . y))        (lamE 'x (appE (identE 'x) (lamE 'y (identE 'y))))

4.2 Semantics of the Lambda Calculus

This is a critical section; read it carefully. Several concept check questions will involve stepping and parsing \lambda-calculus terms.

Now let’s get a feel for how to run lambda calculus programs. We will start with a stepper for lambda calculus terms. There aren’t very many cases, so we will inspect each of them. First, since functions are values, they step to themselves, for example: \lambda x . x \rightarrow \lambda x . x

Next, let’s consider how to step function calls. Similar to how we implemented LetLang, we will use substitution:

(\lambda x . x)~(\lambda y . \lambda z . y) \xrightarrow[]{\text{substitute } (\lambda y . \lambda z . y) \text{ for }x} (\lambda y . \lambda z . y)

The above step is so important that it has a very esoteric-sounding name: \beta-reduction.

A critical property of substitution is that it respects lexical scope, just like in LetLang. So, in a lambda term like (\lambda x . (\lambda x . x)), the identifier x refers to its innermost binding. Plait works similarly: observe that the first argument, 10, is ignored in the following snippet:

> (((lambda (x)
      (lambda (x)
        (+ x 10))) 10) 200)

- Number

210

Let’s see some examples.

(\lambda x . (\lambda x . x)) ~ (\lambda z . z~z) \rightarrow (\lambda x . x)

Recall that, in our calculator language, if we had an expression like (+ (+ 1 2) (+ 3 4)), we would start by stepping the left argument to the outer-most addition, so this would step to (+ 3 (+ 3 4)). A similar thing happens here: in order to call a function, we might sometimes need to simplify the callee first. Consider the following: \Big((\lambda x . x) ~ (\lambda y . y~y) \Big) ~ (\lambda z . z) \xrightarrow{\text{Step } (\lambda x . x) ~ (\lambda y . y~y)} (\lambda y . y~y) ~ (\lambda z . z)
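
For concreteness, the remaining steps of this example, worked out to a value, are: (\lambda y . y~y) ~ (\lambda z . z) \rightarrow (\lambda z . z) ~ (\lambda z . z) \rightarrow \lambda z . z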

So, one of our rules is that to step (appE e1 e2), first fully simplify e1.

Now, let’s consider the other case: what happens if, for an application like (appE (lamE 'id body) e2), the argument e2 is not yet fully simplified? Here we have a design decision: we can either substitute e2 without simplifying it first, or we can first simplify e2 and then substitute it. Similar to what we did with LetLang, we will opt to first simplify e2 before substituting it; this is called the call-by-value semantics of the \lambda-calculus. So, we have the following step in the call-by-value semantics:

(\lambda x . x) ~ ((\lambda y . y)~(\lambda z . z~z)) \xrightarrow{\text{Step }((\lambda y . y)~(\lambda z . z~z))} (\lambda x . x) ~ (\lambda z . z~z)
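
From there, one more \beta-reduction finishes the evaluation: (\lambda x . x) ~ (\lambda z . z~z) \rightarrow \lambda z . z~z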

It is sometimes the case that the body of a lambda is not fully simplified, for instance in the following lambda term: \lambda x . (x~(\lambda y . y))

There are variants of the lambda calculus that do simplify the bodies of lambdas even if they are not called. This is sometimes called full-\beta-reduction. Our language does not do this.

Again we have a design decision: how do we step such functions? We will define it so that these functions do not simplify any further, and immediately step to themselves, i.e.:

\lambda x . (x~(\lambda y . y)) \rightarrow \lambda x . (x~(\lambda y . y))

So, to recap our rules:
  • Lambda abstractions are values and they immediately step to themselves without performing any further simplification.

  • To step an application e1 e2, first fully simplify e1 to a lambda \lambda x . e, then fully simplify e2 to a value v, then perform \beta-reduction, substituting v for x in e.

4.3 A Definitional Interpreter

Now we are ready to implement an interpreter for the lambda calculus. First, substitution, which looks a lot like it did for LetLang.

> (define-type Value
    [lamV (id : Symbol) (body : Lam)])
> (subst : (Symbol Value Lam -> Lam))
> (define (subst id assignment body)
    (type-case Lam body
               [(identE x)
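                ; assignment is a Value (a lamV), so rebuild it as a Lam term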
                (if (equal? id x)
                    (lamE (lamV-id assignment) (lamV-body assignment))
                    (identE x))]
               [(appE e1 e2)
                (appE (subst id assignment e1)
                      (subst id assignment e2))]
               [(lamE x body)
                (if (equal? x id)
                    (lamE x body)
                    (lamE x (subst id assignment body)))]))

Note how we handle shadowing by checking to see if x and id are equal in the (lamE x body) case. Now, we can make a definitional interpreter that runs lambda calculus programs using substitution:

> (interp : (Lam -> Value))
> (define (interp e)
    (type-case Lam e
               [(identE x) (error 'runtime "free variable")]
               [(lamE id body) (lamV id body)]
               [(appE e1 e2)
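                ; call-by-value: evaluate the callee, then the argument, then substitute and continue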
                (let* [(v1 (interp e1))
                       (v2 (interp e2))
                       (id (lamV-id v1))
                       (body (lamV-body v1))
                       (subst-body (subst id v2 body))]
                  (interp subst-body))]))
> (test (interp (lamE 'x (identE 'x)))
        (lamV 'x (identE 'x)))

- Void

good (interp (lamE (quote x) (identE (quote x)))) at line 11

  expected: (lamV 'x (identE 'x))

  given: (lamV 'x (identE 'x))

> (test (interp (lamE 'x (appE (identE 'x) (lamE 'y (identE 'y)))))
        (lamV 'x (appE (identE 'x) (lamE 'y (identE 'y)))))

- Void

good (interp (lamE (quote x) (appE (identE (quote x)) (lamE (quote y) (identE (quote y)))))) at line 12

  expected: (lamV 'x (appE (identE 'x) (lamE 'y (identE 'y))))

  given: (lamV 'x (appE (identE 'x) (lamE 'y (identE 'y))))
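
The two rules recapped at the end of the previous section can also be implemented directly as a one-step reducer. Here is a minimal sketch (the name step is ours, not part of the notes above); it reuses Lam, Value, and subst from this section:

> (step : (Lam -> Lam))
> (define (step e)
    (type-case Lam e
               [(identE x) (error 'runtime "free variable")]
               ; rule 1: lambdas are values; they step to themselves
               [(lamE arg body) e]
               [(appE e1 e2)
                (cond
                  ; rule 2: first fully simplify the callee ...
                  [(not (lamE? e1)) (appE (step e1) e2)]
                  ; ... then the argument (call-by-value) ...
                  [(not (lamE? e2)) (appE e1 (step e2))]
                  ; ... then perform beta-reduction by substitution
                  [else (subst (lamE-arg e1)
                               (lamV (lamE-arg e2) (lamE-body e2))
                               (lamE-body e1))])]))

For instance, (step (appE (lamE 'x (identE 'x)) (lamE 'y (identE 'y)))) should return (lamE 'y (identE 'y)), and repeatedly applying step until a lamE is reached should agree with interp on programs that terminate.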

4.4 Programming in the Lambda Calculus

I told you in the beginning that the lambda calculus is a very expressive language. It might not seem that way since it consists of so little syntax. So, let’s see how we might encode a richer feature into the lambda calculus. In future lectures, we will see even more examples of how to program effectively in the lambda calculus, including how to compile many expressive high-level language features into it.

4.4.1 Multiple arguments

Plait lets us write functions with multiple arguments like this:

> (define (callx x y)
    (x y))

How do we do this in the lambda calculus? Well, it’s easy enough: we can write a function that returns another function, like so:

(\lambda x . (\lambda y . x~y))

The process of transforming a function that takes multiple arguments into a sequence of functions, each of which takes a single argument, is called currying. In some languages, like OCaml and Haskell, all functions are curried by default.
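
In Plait, we can apply this transformation to callx by hand. A sketch (the name callx-curried is introduced here; it is not a built-in):

> (define (callx-curried x)
    (lambda (y) (x y)))

Now ((callx-curried (lambda (n) (+ n 1))) 41) evaluates to 42, just as (callx (lambda (n) (+ n 1)) 41) does.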

4.4.2 Non-termination

One thing the lambda calculus seems to be lacking is loops or any form of recursion. Let’s conclude by seeing that, even though the lambda calculus does not have an explicit looping construct, it is nonetheless possible to make lambda calculus programs that run forever!

Let’s make a program that runs forever. First, let’s define a lambda calculus term \omega that takes a function as an argument and calls that function with itself as an argument:

\omega ::= \lambda x . x~x

Clearly, \omega doesn’t run forever: running it immediately terminates because lambdas are values. The magic happens when we call \omega with itself as an argument. Look what happens when we perform one step of \beta-reduction on this:

(\lambda x . x~x) ~ (\lambda x . x~x) \xrightarrow{\text{Substitute }(\lambda x . x~x) \text{ for } x} (\lambda x . x~x) ~ (\lambda x . x~x)

"I am the Alpha and the Omega, the Beginning and the End, the First and the Last.”

King James Bible. Revelation 22:13.

Observe: The program did not change after one step of computation! So, this program will run forever, repeatedly stepping to itself. This program is usually called \Omega: it is the while (true) {} of the lambda calculus.
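
We can also encode \Omega using the abstract syntax constructors from our interpreter. A sketch (the names omega and Omega are introduced here for illustration):

> (define omega (lamE 'x (appE (identE 'x) (identE 'x))))
> (define Omega (appE omega omega))

Calling (interp Omega) would never return: each \beta-reduction rebuilds \Omega itself, so the interpreter keeps substituting forever.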