Lecture 5: cond and let

Goals for today:

Develop cond: extend calc with conditionals
- Explore the design-space of conditionals
- Create an evaluator for cond
- The value datatype
Extend calc with local binding
- Syntax and semantics of let-binding
- Understand scope: be able to identify when a particular variable is in scope, and what it is bound to
- Understand substitution and how it relates to scope

Logistics:

Next homework due Friday
Grades for homework 1 are posted
Will be releasing the next homework on Wednesday, due Friday Feb 9
First quiz will run from next Thursday Feb 8 (8am) to Friday Feb 9 (11:59pm)

Quiz logistics:

Goal is to not have time-pressure. It should take no more than 2 or 3 hours
No public Piazza questions or office-hours questions relating to quiz. Private questions are OK.
Will cover basics of Plait and the SMoL language (calculator, cond, let, and fun), up to and including Lecture 6 “First-class functions”
Open-book, open-note, open-DrRacket exam
A mix of programming, multiple-choice, short-answer, and open-ended design questions

Cond

Recall our desired syntax for our language with conditionals:

(define-type Exp
  [num (n : Number)]
  [bool (b : Boolean)]
  [plus (left : Exp) (right : Exp)]
  [cnd (test : Exp) (thn : Exp) (els : Exp)])

Semantics of cnd: If test evaluates to #t, evaluate thn; if test evaluates to #f, evaluate els; otherwise, do what?
Suppose we want to make an evaluator for this language. What should its type be?
- Ponder: what do we do when we encounter Booleans?
We need a Value type that captures what cond programs can evaluate to: either numbers or Booleans

(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])

Then, we can give a type to our cond interpreter:

(calc : (Exp -> Value))
(define (calc e)
  (error 'not-impl ""))

Now, we can start filling our interpreter in:

(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
   [(num n) (vnum n)]
   [(bool b) (vbool b)]
   [(plus l r) ...]
   [else (error 'notimpl "")]))

What should we do for +? At this point it is useful to add some helper functions that extract the data that we need from values:

(define (value->num v)
  (type-case Value v
   [(vbool b) (error 'value "invalid value")]
   [(vnum n) n]))

(define (value->bool v)
  (type-case Value v
   [(vnum n) (error 'value "invalid value")]
   [(vbool b) b]))

And now, we can use these helpers to fill in + and cnd:

(define (add v1 v2)
  (+ (value->num v1) (value->num v2)))

(define (calc e)
  (type-case Exp e
    [(num n) (vnum n)]
    [(bool b) (vbool b)]
    [(plus l r) (vnum (add (calc l) (calc r)))]
    [(cnd test thn els) (if (value->bool (calc test))
                            (calc thn)
                            (calc els))]))
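
As a quick sanity check, the pieces above can be assembled into one file and exercised with Plait's test forms. This is only a sketch: the #lang line, the type annotations, and the specific test cases are additions for illustration, not part of the lecture code.

```racket
#lang plait

(define-type Exp
  [num (n : Number)]
  [bool (b : Boolean)]
  [plus (left : Exp) (right : Exp)]
  [cnd (test : Exp) (thn : Exp) (els : Exp)])

(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])

;; Extract the underlying number, or fail on a Boolean
(define (value->num [v : Value]) : Number
  (type-case Value v
    [(vnum n) n]
    [(vbool b) (error 'value "invalid value")]))

;; Extract the underlying Boolean, or fail on a number
(define (value->bool [v : Value]) : Boolean
  (type-case Value v
    [(vbool b) b]
    [(vnum n) (error 'value "invalid value")]))

(define (add [v1 : Value] [v2 : Value]) : Number
  (+ (value->num v1) (value->num v2)))

(define (calc [e : Exp]) : Value
  (type-case Exp e
    [(num n) (vnum n)]
    [(bool b) (vbool b)]
    [(plus l r) (vnum (add (calc l) (calc r)))]
    [(cnd tst thn els) (if (value->bool (calc tst))
                           (calc thn)
                           (calc els))]))

;; A true test selects the first branch; a false test selects the second
(test (calc (cnd (bool #t) (num 1) (num 2))) (vnum 1))
(test (calc (cnd (bool #f) (num 1) (plus (num 2) (num 3)))) (vnum 5))
;; A number in test position is a runtime error in this design
(test/exn (calc (cnd (num 0) (num 1) (num 2))) "invalid value")
```

Note that only the selected branch is ever passed to calc, so an ill-typed expression in the other branch goes unnoticed.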

Design decisions

We have some design decisions we made here implicitly: can you spot them?
First, we decided adding a Boolean and a number should be an error; not all languages do this!
Second, we inherited Plait’s semantics for if
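
To make the design space concrete, here is a sketch contrasting the strict choice made above with one hypothetical alternative in which numbers are allowed in test position (0 counts as false, anything else as true). The name value->bool/truthy and the zero-is-false rule are assumptions chosen for illustration; the lecture's evaluator uses only the strict version.

```racket
#lang plait

(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])

;; Strict design (the one used in these notes): a number is never a valid test
(define (value->bool [v : Value]) : Boolean
  (type-case Value v
    [(vbool b) b]
    [(vnum n) (error 'value "invalid value")]))

;; Hypothetical permissive design: 0 is false, every other number is true
(define (value->bool/truthy [v : Value]) : Boolean
  (type-case Value v
    [(vbool b) b]
    [(vnum n) (not (= n 0))]))

(test/exn (value->bool (vnum 0)) "invalid value")
(test (value->bool/truthy (vnum 0)) #f)
(test (value->bool/truthy (vnum 7)) #t)
```

Swapping one helper for the other changes the language's semantics without touching calc, which is one reason to keep these decisions isolated in helpers.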

Local binding and scope

Binding associates a value with a name
We have already seen instances of binding when we call functions:

(define (addone x) (+ x 1))
> (addone 2)
3

This function call binds the value 2 to the name x
Local binding tells us that a name is restricted in visibility to some part of the program
Again, we are familiar with this notion: a function’s argument is not visible outside the body of the function:

(define (addone x) (+ x 1))
> x
x: free variable while typechecking in: x

We say that x is not in scope outside of the body of the function. Scope is all about determining which variables are visible where. It is a semantic notion: the implementation of our evaluator will determine the scope of a variable.

Syntax of let

Now we want to extend our calculator language with the ability to introduce local variables
For example, we want the following syntax:

(let1 (x 10)     (+ x x))
       ^ ^^      ^^^^^^^
       | |          |
       | assignment body
       |
       name

We can express this new syntax using BNF:

<expr> ::= <num>
          | (+ <expr> <expr>)
          | <var>
          | (let1 (<var> <expr>) <expr>)

And, as usual, we can write down a Plait type to capture ASTs for this new syntax:

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])
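
To connect the concrete syntax to this AST, here is a minimal parser sketch using Plait's S-expression matching. The function name parse and its error message are assumptions for illustration; the notes themselves build ASTs by hand.

```racket
#lang plait

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])

;; parse (hypothetical helper): turn concrete syntax into an Exp,
;; with one cond clause per production of the BNF above
(define (parse [s : S-Exp]) : Exp
  (cond
    [(s-exp-match? `NUMBER s) (numE (s-exp->number s))]
    [(s-exp-match? `SYMBOL s) (varE (s-exp->symbol s))]
    [(s-exp-match? `(+ ANY ANY) s)
     (let ([parts (s-exp->list s)])
       (plusE (parse (second parts)) (parse (third parts))))]
    [(s-exp-match? `(let1 (SYMBOL ANY) ANY) s)
     (let* ([parts (s-exp->list s)]
            [binding (s-exp->list (second parts))])
       (let1E (s-exp->symbol (first binding))
              (parse (second binding))
              (parse (third parts))))]
    [else (error 'parse "invalid input")]))

(test (parse `(let1 (x 10) (+ x x)))
      (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))))
```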

Semantics of let

Intuitively, what is the semantics of (let1 (var assignment) body)? It is surprisingly tricky, so we will develop it in stages.
Let’s run one by hand:

(let1 (x 10) (let1 (y 20) (+ x y)))
--> (let1 (y 20) (+ 10 y))
--> (+ 10 20)
--> 30

Let’s try another:

(let1 (x (+ 2 3)) x)
--> (let1 (x 5) x)
--> 5

Intuitively, the semantics of let should be: (1) evaluate the assignment to a value v, (2) substitute v for all instances of the symbol var in body, then (3) evaluate body
Ponder: What would happen if we didn’t evaluate the assignment before substituting? Would anything change?

Implementing substitution: Try 1

Substitution try #1: write a function subst e id v that replaces all instances of the symbol id with expression v in e:

(subst : (Exp Symbol Exp -> Exp))
(define (subst substE substId substV)
  (type-case Exp substE
    [(varE name) (if (symbol=? name substId) substV (varE name))]
    [(plusE l r) (plusE (subst l substId substV) (subst r substId substV))]
    [(numE n) (numE n)]
    [(let1E var assignment body)
     ;; Try 1: substitute into both the assignment and the body, unconditionally
     (let1E var
            (subst assignment substId substV)
            (subst body substId substV))]))

Running subst on a few inputs:

> (subst (varE 'hello) 'hello (numE 10))
- Exp
(numE 10)

> (subst (let1E 'x (numE 10) (varE 'x)) 'x (numE 20))
- Exp
(let1E 'x (numE 10) (numE 20))

Using this subst function, we can implement an evaluator for let:

(interp : (Exp -> Number))
(define (interp e)
  (type-case Exp e
    [(numE n) n]
    [(plusE l r) (+ (interp l) (interp r))]
    [(varE x) (error 'runtime "unrecognized symbol")]
    [(let1E id assignment body)
     (let [(substbody (subst body id (numE (interp assignment))))]
       (interp substbody))]))

Let’s try running this on a few inputs:

> (interp (numE 10))
- Number
10

> (interp (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))))
- Number
20

Now let’s try a trickier one:

> (interp (let1E 'x (numE 10)
               (let1E 'x (numE 20)
                      (varE 'x))))
- Number
10

Uh oh! What happened here? Is this what we expect?
Probably not! We intuitively expect x to refer to its innermost binding, which is the one that binds it to 20
Ponder: Describe in English what the scoping rules are for this evaluator.
Terminology: We say that the inner x shadows the outer x
Here is an even trickier one: (let1 (x 10) (+ x (let1 (x 20) x))), written out as an AST:

(let1E 'x (numE 10)
       (plusE (varE 'x)
              (let1E 'x (numE 20) (varE 'x))))

What should happen here? Draw two diagrams: one with the scoping rules for our broken substitution, and the one that you intuitively think should be correct.

Fixing substitution

We need to update the specification of our subst function. How?
Substitution try #2: write a function subst e id v that replaces all instances of the symbol id with expression v in e, but stops wherever a new binding of id enters scope:

(subst : (Exp Symbol Exp -> Exp))
(define (subst substE substId substV)
  (type-case Exp substE
    [(varE name) (if (symbol=? name substId) substV (varE name))]
    [(plusE l r) (plusE (subst l substId substV) (subst r substId substV))]
    [(numE n) (numE n)]
    [(let1E var assignment body)
     (let [(newAssignment (subst assignment substId substV))]
       (if (symbol=? var substId)
           (let1E var newAssignment body) ; do *not* substitute the body, var is shadowed
           (let1E var newAssignment       ; do substitute the body, no shadowing
                  (subst body substId substV))))]))

Now let’s try running our program again:

> (interp (let1E 'x (numE 10)
                 (let1E 'x (numE 20)
                        (varE 'x))))
- Number
20
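
The shadowing rule is easy to get subtly wrong, so it is worth checking on more than one program. Below is a self-contained sketch of a shadowing-aware subst together with the substitution-based interp. It is a simplified arrangement written for illustration (shorter parameter names, and a let1E case that decides only whether to descend into the body), not necessarily the exact code from lecture.

```racket
#lang plait

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol) (assignment : Exp) (body : Exp)])

;; Replace free occurrences of id in e with v
(define (subst [e : Exp] [id : Symbol] [v : Exp]) : Exp
  (type-case Exp e
    [(numE n) (numE n)]
    [(varE name) (if (symbol=? name id) v (varE name))]
    [(plusE l r) (plusE (subst l id v) (subst r id v))]
    [(let1E var assignment body)
     (let1E var
            ;; the assignment is outside the scope of var, so always substitute
            (subst assignment id v)
            (if (symbol=? var id)
                body                  ; id is shadowed inside body: leave it alone
                (subst body id v)))]))

(define (interp [e : Exp]) : Number
  (type-case Exp e
    [(numE n) n]
    [(plusE l r) (+ (interp l) (interp r))]
    [(varE x) (error 'runtime "unrecognized symbol")]
    [(let1E id assignment body)
     (interp (subst body id (numE (interp assignment))))]))

;; The inner x shadows the outer x
(test (interp (let1E 'x (numE 10) (let1E 'x (numE 20) (varE 'x)))) 20)
;; The outer x is still visible in the assignment of an inner binding
(test (interp (let1E 'x (numE 5) (let1E 'y (varE 'x) (varE 'y)))) 5)
```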

Static vs. Dynamic Scope

A key property of our scoping rules so far is that they are static, meaning that we can always determine which binding a particular variable reference refers to just by reading the program text, regardless of the program’s behavior at runtime.
This might seem like an obvious requirement – most languages you have used satisfy this requirement – but there are examples of languages where which variables are in-scope can depend on the runtime behavior of a program
This is an actual Common Lisp program that prints out 5 5:

(defvar x 100)

(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))

(fun1 5)

What is happening!?
- The defvar command introduces a special global variable x.
- Then, when fun1 is called, it introduces a variable x into scope, and binds it to the value 5
- Then, it prints x, which has value 5 so the first 5 gets printed and seems normal.
- This is where things get really weird. Next, fun2 is called, which takes no arguments. It also prints x, which outputs 5, but we surely expect 100 to be printed!
- This is because variables declared with defvar (called special variables) are dynamically scoped in Common Lisp: a reference to x resolves to the most recent binding of x that is still active at runtime, which here is the one made by the call to fun1. Once fun1 returns, x refers to 100 again.
- (Note: ordinary Common Lisp variables, such as parameters and let-bound variables that have not been declared special, are lexically scoped)
Dynamic scope is quite unintuitive and almost certainly a bad design choice
- Ponder: what are some reasons why dynamic scope is undesirable?
- Ponder: What properties of our evaluator ensure that our scope cannot be dynamic?
Historical note: early implementations of Python and JavaScript had dynamic scope, but few modern examples exist
A nice blogpost on scope for more reading: https://prl.khoury.northeastern.edu/blog/2019/09/05/lexical-and-dynamic-scope/

Implementing let with environments

Q: What is the runtime of our substitution algorithm?
- A: Linear in the size of the program!
This runtime seems quite undesirable; every time we introduce a new variable, we will have to scan over the whole program text!
No real language implementations use substitution this way: most implementations make use of an environment for keeping track of which variables are in-scope
An environment is a map from identifiers to values. We write these as [x |-> v1, y |-> v2, …]
We can run an interpreter by hand that manipulates environments. Now our arrow rules produce a sequence of (environment, program) pairs:

[], (let1 (x 10) (let1 (y 20) (+ x y)))
--> [x |-> 10], (let1 (y 20) (+ x y))
--> [x |-> 10, y |-> 20], (+ x y)
--> [x |-> 10, y |-> 20], (+ 10 y)
--> [x |-> 10, y |-> 20], (+ 10 20)
--> [x |-> 10, y |-> 20], 30

Implementing an interpreter

To model the environment, we use an immutable hash table:

(define-type-alias Env (Hashof Symbol Number))
(define mt-env (hash empty)) ;; "empty environment"

There are two key functions for hash tables of type (Hashof Keytype Valuetype):
- hash-set table newkey newvalue, which creates a new hash table out of table with an entry that maps newkey to newvalue
- hash-ref table key, which returns an (Optionof Valuetype): (none) if key is not in the hash table, and (some v) if key maps to v
We will implement some helper functions to deal with environments:

(define (lookup (s : Symbol) (n : Env))
  (type-case (Optionof Number) (hash-ref n s)
    [(none) (error s "not bound")]
    [(some v) v]))

(extend : (Env Symbol Number -> Env))
(define (extend old-env new-name value)
  (hash-set old-env new-name value))
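
A small sketch of how these helpers behave, assuming an environment that maps symbols to numbers. The names outer and inner and the type annotation on mt-env are illustrative additions. The point to notice is that extend returns a new environment and leaves the old one untouched.

```racket
#lang plait

(define-type-alias Env (Hashof Symbol Number))
(define mt-env : Env (hash empty)) ;; "empty environment"

(define (lookup [s : Symbol] [nv : Env]) : Number
  (type-case (Optionof Number) (hash-ref nv s)
    [(none) (error s "not bound")]
    [(some v) v]))

(define (extend [old-env : Env] [new-name : Symbol] [value : Number]) : Env
  (hash-set old-env new-name value))

;; outer and inner are hypothetical names for two nested scopes
(define outer (extend mt-env 'x 10))
(define inner (extend outer 'x 20))

(test (lookup 'x inner) 20)              ; the newer binding wins
(test (lookup 'x outer) 10)              ; the old environment is unchanged
(test/exn (lookup 'y outer) "not bound") ; unbound names are an error
```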

Now, we are ready to implement our evaluator. What should it do?

(define (interp e nv)
  (type-case Exp e
    [(numE n) n]
    [(varE s) (lookup s nv)]
    [(plusE l r) (+ (interp l nv) (interp r nv))]
    [(let1E var val body)
     (let ([new-env (extend nv
                            var
                            (interp val nv))])
       (interp body new-env))]))

And we can run this on some examples:

(test (interp (let1E 'x (numE 10)
               (let1E 'x (numE 20)
                      (varE 'x))) mt-env) 20)

(test (interp (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))) mt-env) 20)

(test (interp (let1E 'x (numE 10)
                   (plusE (varE 'x)
                          (let1E 'x (numE 20) (varE 'x)))) mt-env) 30)

(test (interp (plusE (let1E 'x (numE 10) (varE 'x)) (let1E 'x (numE 15) (varE 'x))) mt-env) 25)

Ponder: How does our interpreter deal with variables going out of scope? Why did that last example work?
