18 Abstract Machines II
We continue our exploration of abstract machines by extending the CK machine from the previous lecture with an environment, denoted E. This environment enables us the machine to efficiently deal with local variables. After we see the basic structure and form of the CEK machine, we will extend extend it with lambdas to see our first efficient implementation of the lambda calculus.
18.1 CEK Machine
The control register C that holds the current instruction being evaluated.
The environment register E that associates variable identifiers with values.
The call stacK K that holds a stack of frames holding pending evaluation.
We begin as by defining the programs that the CEK machine will run; these will go in the machine’s control register. Here we consider a simple calculator language extended with let-bindings:
(define-type Calc (addE [e1 : Calc] [e2 : Calc]) (letE [id : Symbol] [e1 : Calc] [e2 : Calc]) (identE [id : Symbol]) (numE [n : Number]))
The environment E is a map from variable identifiers to values (i.e., numbers):
(define-type-alias Env (Hashof Symbol Number))
Next we define what the frames of our machine should be. Intuitively, we need a frame for each case of the control register where the machine needs to recursively evaluate a sub-program. Looking at our grammar, there are two recursively defined control register states: addE and letE. We introduce frames handling the recursion for these cases:
(define-type Frame ;;; frame for C = (+ n e) when e is not a number (add1E [E : (Hashof Symbol Number)] [n : Number]) ;;; frame for C = (+ e1 e2) when e1 is not a number (add2E [E : (Hashof Symbol Number)] [e : Calc]) ;;; frame for (let x e1 e2) when e1 is not a number (let1E [E : (Hashof Symbol Number)] [id : Symbol] [e : Calc]))
You may wonder: why do we only need a single frame case for let1E even though it has two recursively-defined subexpressions in it? You could do it this way, but it would result in a slightly more complicated machine definition in what follows.
Now we are ready to give the definition of the CEK machine state for the Calc langauge with let-bindings:
(define-type State (state [C : Lam] [E : (Hashof Symbol Number)] [K : (Listof Frame)]))
Once the state of the machine is given, we can describe its transition system using a table:
#
Current C
Current E
Current K
Guard
Next C
Next E
Next K
Final?
1
(addE e1 e2)
E
K
e1 = n1, e2 = n2
(numE (+ n1 n2))
∅
K
N
2
(addE e1 e2)
E
K
e1 not a number
e1
E
(cons (add2E E e2) K)
N
2
(addE e1 e2)
E
K
e1 = n1
e2
E
(cons (add1E E n1) K)
N
3
(letE x e1 e2)
E
K
e1
E
(cons (let1E E x e2) K)
N
4
(identE x)
E
K
x in E
E[x]
∅
K
N
5
(numE n)
∅
(cons (add1E E n1) K)
K not empty
(addE (numE n) (numE n1))
E
K
N
6
(numE n)
∅
(cons (add2E E e2) K)
K not empty
(addE (numE n) e2)
E
K
N
7
(numE n)
∅
(cons (let1E E id e) K)
K not empty
e
E[id↦n]
K
N
8
(numE n)
∅
K
K empty
(numE n)
∅
K
Y
Figure 28: A tabular representation of the CEK machine for Calc.
Example of running the machine (where ∅ denotes the empty callstack and environment):
C
E
K
Applied rule
(letE ’x (addE (numE 1) (numE 2)) (identE ’x))
∅
∅
(addE (numE 1) (numE 2))
∅
(let1E ∅ (identE ’x))
3
(numE 3)
∅
(let1E ∅ (identE ’x))
1
(identE ’x)
{x↦3}
7
(numE 3)
∅
∅
4
Figure 29: A small example of running the CEK machine for the calculator language with let-bindings.
Here is a link to a full implementation of the CEK machine in Plait.
18.2 Adding Functions to the CEK Machine
Now we extend our language with first-class functions. The key difficulty is with variable capture. Consider the following small Plait program:
> ((let [(my-val 10)] (lambda (x) my-val)) 20) - Number
10
The lambda (lambda (x) my-val) refers to a variable after it has gone out of scope. We say that the lambda has captured the variable my-val. To handle this, we need to add additional data to lambda terms in our abstrat machine that tracks the values of variables that are in scope when they are defined.
Now let’s define our language:
(define-type Lam (lamE [id : Symbol] [body : Lam]) (appE [e1 : Lam] [e2 : Lam]) (identE [id : Symbol]) (cloE [closure : Env] [id : Symbol] [body : Lam])) (define-type Value (cloV [closure : Env] [id : Symbol] [body : Lam])) (define-type-alias Env (Hashof Symbol Value)) (define-type Frame ; frame for (app e1 e2) when e2 is not a value (app1E [E : Env] [e : Lam]) ;;; frame for (appE e1 v) when v is a value (app2E [E : Env] [v : Value]))
Now we are ready to define our machine’s transition function and run some example programs on it:
Current C
Current E
Current K
Guard
Next C
Next E
Next K
Final?
(lamE x e)
E
K
(cloE E x e)
∅
K
N
(identE x)
E
K
x in E
E[x]
∅
K
N
v
∅
(cons (app1E E e2) K)
(appE v e2)
E
K
N
v2
∅
(cons (app2E E v1) K)
(appE v1 v2)
E
K
N
v
∅
∅
v
∅
∅
Y
(appE (cloE E1 x e) v)
E2
K
v is a value
e
E1[x↦v]
K
N
(appE v e2)
E
K
e2 not a value
e2
E
(cons (app2E E v) K)
N
(appE e1 e2)
E
K
e1 not a value
e1
E
(cons (app1E E e2) K)
N
Figure 30: A tabular representation of the CEK machine for Lam.
Example of running the machine:
C
E
K
(appE (lamE ’x (identE ’x)) (lamE ’y (identE ’y)))
∅
∅
(lamE ’x (identE ’x))
∅
(app1E ∅ (lamE ’y (identE ’y)))
(cloE ∅ ’x (identE ’x))
∅
(app1E ∅ (lamE ’y (identE ’y)))
(appE (cloE ∅ ’x (identE ’x)) (lamE ’y (identE ’y)))
∅
∅
(lamE ’y (identE ’y))
∅
(app2E ∅ (cloE ∅ ’x (identE ’x)))
(cloE ∅ ’y (identE ’y))
∅
(app2E ∅ (cloE ∅ ’x (identE ’x)))
(appE (cloE ∅ ’x (identE ’x)) (cloE ∅ ’y (identE ’y)))
∅
∅
(identE ’x)
x↦(cloE ∅ ’y (identE ’y))
∅
(cloE ∅ ’y (identE ’y))
∅
∅
Figure 31: An example of running the CEK machine for the lambda calculus.
Here is a link to a full implementation of the CEK machine implementing the lambda calculus in Plait. Notice: this implementation does not make use of the Plait’s higher-order functions to implement lambdas. This implementation has been defunctionalized: all the higher-order functions have been converted into normal recursive calls. This makes this abstract machine model very amenable to implementing on modern hardware that does not have a first-class notion of higher-order functions.