13 Control
Control flow determines which part of the program executes next. So far, all of the programming languages we have seen have had simple control-flow structure. We have seen two syntactic constructs that give the programmer control over the control flow of a program: function cals and if-then-else expressions. Practical programming languages often have more interesting ways of modifying control flow, for example exceptions. Consider the following Java code:
public class Main { |
public static void main(String[ ] args) { |
try { |
int[] myNumbers = {1, 2, 3}; |
System.out.println(myNumbers[10]); |
} catch (Exception e) { |
System.out.println("Something went wrong."); |
} |
} |
} |
This code does not simply execute one expression after the other in sequence like all of our interpreters so far; it relies on the ability to jump from one place of execution to another. This module is about implementing languages with richer control flow constructs than the ones we have seen so far. To do this, we will rely heavily on an idea called a continuation, which we will see today.
13.1 The Control Context
Thus far, we have seen how we can write programs that manipulate names and mutable state by enriching our interpreter with heaps and environments. Now, we will enrich our interpreters with a notion of control context in order to implement interesting control-flow structure. A control context tracks which operations remain to be performed in order to finish a computation. If you’re familiar with the notion, a control context is analogous to a call stack. As an example, consider the following definition of a factorial function:
> (define (fact n) (if (equal? n 0) 1 (* n (fact (- n 1)))))
Let’s step through evaluating the recursive call (fact 4):
fact 4 |
--> 4 * (fact 3) |
^^^ control context |
--> 4 * 3 * (fact 2) |
^^^^^^^ control context |
--> 4 * 3 * 2 * (fact 1) |
^^^^^^^^^^^ control context |
--> 4 * 3 * 2 * 1 * (fact 0) |
^^^^^^^^^^^^^^^ control context |
--> 4 * 3 * 2 * 1 * 1 |
--> 24 |
Each time fact n is called, that comes with a promise that the eventual value returned will be multiplied by n. This "promise to do something with the return value of a function call" is stored in the control context, which is a control-flow feature of a processor that keeps track of the context in which functions are called. The control context of each call to fact is annotated above. Observe how above the control context grows with n: This can result in practical performance problems: if you’ve ever hit a "stack overflow" error in your code, it is typically because of accidental unbounded control context.
13.2 Tail Form
There is an alternative way to write fact that does not grow the control context by using an accumulator acc:
> (define (fact-tc n acc) (if (equal? n 0) acc (fact-tc (- n 1) (* n acc))))
Let’s examine the control context when we evaluate this function:
fact-tc 4 1 |
--> fact-tc 3 2 |
--> fact-tc 2 12 |
--> fact-tc 1 24 |
--> fact-tc 0 24 |
--> 24 |
Observe how instead of storing all the necessary data to finish the factorial computation in the control context, it is instead stored in the accumulator. A function that requires no additional control context in order to execute exhibits iterative control behavior. A function that does require control context is said to exhibit recursive control behavior.
How can you tell if a function has iterative or recursive control? If a function invokes itself in operand position (i.e., a recursive call is made as an argument to some other function), then it requires recursive control. Observe how in fact, the recursive call is invoked as an argument to the multiplication *.
A tail call is a function call performed as the final act of a function. Observe that fact-acc above only ever calls itself as a tail call. We say fact-acc is in tail form. Tail calls enable a host of powerful compiler optimizations and are often essential for writing high-performance functional programs.
13.3 Continuations
In the fact example above, we are able to capture all the context necessary to finish the computation using an accumulator. This is not always possible; sometimes you need to keep track of more complicated behavior in the control context. This brings us to the notion of a continuation. Instead of using an accumulator, let’s instead use a function to keep track of the remaining steps of computation. We will call this function a continuation (it “continues the computation”).
(fact-cps : (Number (Number -> 'a) -> 'a)) (define (fact-cps n k) (if (equal? n 0) (k 1) ; call continuation with base case 1 (fact-cps (- n 1) (λ (r) ; call continuation with n * r (k (* n r)))))) (define (fact-cont n) (fact-cps n (λ (r) r)))
In the above function we’ve labeled the continuation as k. Notice its type: its argument has type Number, which is the return type of fact-cont. The return type of the continuation is a type parameter 'a.
The best way to understand this code is to step through it and see how it avoids using control context (read this carefully! the scoping rules for the continuation k are very important):
fact-cps 2 (λ (r) r) ; label (λ (r) r) as id |
--> fact-cps 1 (λ (r) (id (* 2 r))) ; label (λ (r) (id (* 2 r))) as k1 |
--> fact-cps 0 (λ (r) (k1 (* 1 r))) |
--> (λ (r) (k1 (* 1 r))) 1 |
--> (λ (r) (id (* 2 r))) 1 |
--> (id (* 2 1)) |
--> 2 |
The function fact-cps is initially called with a continuation (λ (r) r). We typically denote the argument to a continuation as r, short for "returned value". We initialize the continuation to the identity function, indicating that we have no further work to. Think of this as the empty call stack. Note the order in which the multiplications are performed: first the multiplication 1 * 1 is performed; then 2 * 1.
13.3.1 Continuation-passing Style
Later on we will see algorithms for transforming any program into continuation-passing style; for now, we will practice doing this manually by hand.
For instance, in the above example we saw the direct style function fact had type (Number -> Number) and CPS style fact-cps has type (Number (Number -> 'a) -> 'a), which is the Plait syntax way of writing the type \forall X . \texttt{Num} \rightarrow (\texttt{Num} \rightarrow X) \rightarrow X. The type \texttt{Num} \rightarrow X is the type of the continuation; its argument is the return type of the direct-style implementation, and its return type is some unknown type X; this type is unknown because we do not yet know the context in which fact-cps will be called.
The direct and CPS version of a function must behave the same way for the same inputs. For instance, for the factorial function, it is the case that for any n it holds that (fact n) = (fact-cps n (λ (r) r)).
Why "essentially"? It’s possible that the function is written in a way to unnecessarily delay returning a value of type X, for instance by performing a recursive call and then doing some unnecessary computation before returning X. So, this function can in fact be treated as if it’s in tail-form, since it’s not possible to do any meaningful computation after a recursive call except immediately returning.
13.3.2 Multiple Self-Calls
Let’s get a feel for how to manually translate a program into continuation-passing style by working through a more interesting example.
Consider the following definition of the Fibonacci function:
> (fib : (Number -> Number))
> (define (fib n) (cond [(equal? n 0) 0] [(equal? n 1) 1] [else (+ (fib (- n 1)) (fib (- n 2)))]))
Now, let’s write the CPS’d version of fib. Start by writing down its type and signature:
(fib-cps : (Number (Number -> 'a) -> 'a)) (define (fib-cps n k) ...)
Now, we can start filling in the body. The types do the work for us here: if fib-cps is well-typed and heaves like fib for the identity continuation, then we’re done. Let’s start filling in the arms of the cond:
(fib-cps : (Number -> (Number -> 'a) -> 'a)) (define (fib-cps n k) (cond [(equal? n 0) (k 0)] [(equal? n 1) (k 1)] [else ...]))
Notice how for the base cases we call the continuation. The type of fib-cps forces us to call the continuation like this: we have essentially no other well-typed way of filling in these arms.
Now, what about the else branch? In order to return something of type 'a, we are forced to perform either a recursive call or directly invoke the continuation. Clearly here we want to perform a recursive call, so we start this way:
(fib-cps : (Number (Number -> 'a) -> 'a)) (define (fib-cps n k) (cond [(equal? n 0) (k 0)] [(equal? n 1) (k 1)] [else (fib-cps (- n 1) (λ (r1) ...))]))
Notice what’s happening here: we’ve started filling in the continuation for the recursive call for fib-cps. We’ve labeled the argument to this continuation r1; it is going to hold the result for evaluating the recursive call to fib-cps (- n 1). Now we can continue with the second recursive call to fib:
Most common mistake: forgetting to call the continuation inside the recursive cases (i.e., forgetting to include k in (k (+ r1 r2))). Just beecause it typechecks doesn’t mean it’s correctly implemented: be sure to test your CPS’d functions to make sure they match the behavior of your direct-style functions.
> (fib-cps : (Number (Number -> 'a) -> 'a))
> (define (fib-cps n k) (cond [(equal? n 0) (k 0)] [(equal? n 1) (k 1)] [else (fib-cps (- n 1) (λ (r1) (fib-cps (- n 2) (λ (r2) (k (+ r1 r2))))))]))
> (test (fib-cps 3 (λ (r) r)) (fib 3))
- Void
good (fib-cps 3 (λ (r) r)) at line 8
expected: 2
given: 2
13.4 Continuation-passing Interpreters
Explicit representation of the continuation is a powerful tool for implementing control-flow operators in interpreters. First, let’s see how to implement an interpreter in continuation-passing form: this way, we have the control context explicitly available. Then, we will make use of this explicit control context in order to do interesting control flow operations.
Recall the usual definitional interpreter for a calculator language:
> (define-type Calc (numE [n : Number]) (addE [e1 : Calc] [e2 : Calc])) > (interp : (Calc -> Number))
> (define (interp e) (type-case Calc e [(numE n) n] [(addE e1 e2) (+ (interp e1) (interp e2))]))
The function interp is in direct style and it is not in tail form. Let’s transform it into continuation-passing style:
> (interp-cps : (Calc (Number -> 'a) -> 'a))
> (define (interp-cps e k) (type-case Calc e [(numE n) (k n)] [(addE e1 e2) (interp-cps e1 (λ (r1) (interp-cps e2 (λ (r2) (k (+ r1 r2))))))]))
> (test (interp-cps (addE (numE 1) (numE 2)) (λ (r) r)) (interp (addE (numE 1) (numE 2))))
- Void
good (interp-cps (addE (numE 1) (numE 2)) (λ (r) r)) at line 14
expected: 3
given: 3
13.4.1 Implementing return
Continuations enable implementing interesting control-flow operations in our interpreters. For example, many programming languages have a return operation that immediately ends a function by returning a value. We can implement a return construct in our calculator language using continuations. Let’s extend our abstract syntax to accomodate this new construct:
> (define-type CalcRet (numR [n : Number]) (addR [e1 : CalcRet] [e2 : CalcRet]) (returnR [n : Number]))
For example, we would like the program (addE (numE 10) (returnE 20)) to evaluate to 20.
So far, you might have the impression that implementing programs in direct style is easy and writing them in CPS form is quite awkward. But, this isn’t always the case: sometimes, it is much easier to write certain programs if you are in continuation-passing style. Implementing an interpreter for CalcRet in direct style is quite tricky! However, if our interpreter is written in continuation-passing style, then we can quite elegantly implement this construct by having two seperate continuations: a default continuation defaultk that is invoked during normal evaluation, and a return continuation retk that is only called when returning:
Ben Lerner calls this a "double-barreled continuation".
> (interp-ret-h : (CalcRet (Number -> 'a) (Number -> 'a) -> 'a))
> (define (interp-ret-h e retk defaultk) (type-case CalcRet e [(numR n) (defaultk n)] [(addR e1 e2) (interp-ret-h e1 retk (λ (r1) (interp-ret-h e2 retk (λ (r2) (defaultk (+ r1 r2))))))] [(returnR n) (retk n)]))
> (define (interp-ret e) (interp-ret-h e (λ (r) r) (λ (r) r)))
> (test (interp-ret (addR (numR 10) (returnR 20))) 20)
- Void
good (interp-ret (addR (numR 10) (returnR 20))) at line 19
expected: 20
given: 20
As a final remark, notice how this expression is evaluated:
> (interp-ret (addR (returnR 10) (returnR 20))) - Number
10
The use of continuation-passing style to disambiguate the evaluation order of target languages in host languages is one of the classic applications of continuations and is due to Reynolds, see "Definitional interpreters for higher-order programming languages".