Lecture 5: cond and let

Goals for today:

  • Develop cond: extend calc with conditionals
    • Explore the design-space of conditionals
    • Create an evaluator for cond
    • The value datatype
  • Extend calc with local binding
    • Syntax and semantics of let-binding
    • Understand scope: be able to identify when a particular variable is in scope, and what it is bound to
    • Understand substitution and how it relates to scope

Logistics:

  • Next homework due Friday
  • Grades for homework 1 are posted
  • Will be releasing the next homework on Wednesday, due Friday Feb 9
  • First quiz will run from next Thursday Feb 8 (8am) to Friday Feb 9 (11:59pm)

Quiz logistics:

  • Goal is to not have time-pressure. It should take no more than 2 or 3 hours
  • No public Piazza questions or office-hours questions relating to quiz. Private questions are OK.
  • Will cover basics of Plait and the SMoL language (calculator, cond, let, and fun), up to and including Lecture 6 “First-class functions”
  • Open-book, open-note, open-DrRacket exam
  • A mix of programming, multiple-choice, short-answer, and open-ended design questions

Cond

  • Recall our desired syntax for our language with conditionals:
(define-type Exp
  [num (n : Number)]
  [bool (b : Boolean)]
  [plus (left : Exp) (right : Exp)]
  [cnd (test : Exp) (thn : Exp) (els : Exp)])
  • Semantics of cnd: If test evaluates to #t, evaluate thn; if test evaluates to #f, evaluate els; otherwise, do what?
  • Suppose we want to make an evaluator for this language. What should its type be?
    • Ponder: what do we do when we encounter Booleans?
  • We need a Value type that captures what cond programs can evaluate to: either numbers or Booleans
(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])
  • Then, we can give a type to our cond interpreter:
(calc : (Exp -> Value))
(define (calc e)
  (error 'not-impl ""))
  • Now, we can start filling our interpreter in:
(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
   [(num n) (vnum n)]
   [(bool b) (vbool b)]
   [(plus l r) ...]
   [else (error 'notimpl "")]))
  • What should we do for +? At this point it is useful to add some helper functions that extract the data that we need from values:
(define (value->num v)
  (type-case Value v
   [(vbool b) (error 'value "invalid value")]
   [(vnum n) n]))

(define (value->bool v)
  (type-case Value v
   [(vnum n) (error 'value "invalid value")]
   [(vbool b) b]))
  • And now, we can use these helpers to fill in + and cnd:
(define (add v1 v2)
  (+ (value->num v1) (value->num v2)))

(define (calc e)
  (type-case Exp e
    [(num n) (vnum n)]
    [(bool b) (vbool b)]
    [(plus l r) (vnum (add (calc l) (calc r)))]
    [(cnd test thn els) (if (value->bool (calc test))
                            (calc thn)
                            (calc els))]))
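  • To see the evaluator in action, here is a self-contained sketch that repeats the definitions above and adds a few tests. The test cases are illustrative additions, not part of the lecture code; the pattern variable tst is just a local name for the test field.

```racket
#lang plait

(define-type Exp
  [num (n : Number)]
  [bool (b : Boolean)]
  [plus (left : Exp) (right : Exp)]
  [cnd (test : Exp) (thn : Exp) (els : Exp)])

(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])

(define (value->num [v : Value]) : Number
  (type-case Value v
    [(vbool b) (error 'value "invalid value")]
    [(vnum n) n]))

(define (value->bool [v : Value]) : Boolean
  (type-case Value v
    [(vnum n) (error 'value "invalid value")]
    [(vbool b) b]))

(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
    [(num n) (vnum n)]
    [(bool b) (vbool b)]
    [(plus l r) (vnum (+ (value->num (calc l)) (value->num (calc r))))]
    [(cnd tst thn els) (if (value->bool (calc tst))
                           (calc thn)
                           (calc els))]))

;; the test picks which branch is evaluated
(test (calc (cnd (bool #t) (num 1) (num 2))) (vnum 1))
(test (calc (cnd (bool #f) (num 1) (num 2))) (vnum 2))
;; only the chosen branch runs, so the ill-formed else branch is never evaluated
(test (calc (cnd (bool #t) (num 1) (plus (bool #t) (num 2)))) (vnum 1))
;; a non-Boolean test and a Boolean operand to plus are both runtime errors
(test/exn (calc (cnd (num 0) (num 1) (num 2))) "invalid value")
(test/exn (calc (plus (num 1) (bool #t))) "invalid value")
```

  • Note that the "otherwise" case of cnd (a test that is not a Boolean) surfaces here as the error raised by value->bool.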

Design decisions

  • We made some design decisions here implicitly: can you spot them?
  • First, we decided adding a Boolean and a number should be an error; not all languages do this!
  • Second, we inherited Plait’s semantics for if
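  • To make the design space concrete, here is a sketch of one alternative some languages choose: treating numbers as "truthy" or "falsy" instead of raising an error. The helper value->bool/truthy and the convention that 0 counts as false are assumptions made purely for illustration; they are not what our evaluator does.

```racket
#lang plait

(define-type Value
  [vbool (v : Boolean)]
  [vnum (n : Number)])

;; Hypothetical alternative to value->bool that never raises an error.
;; Assumed convention: 0 is false, every other number is true.
(define (value->bool/truthy [v : Value]) : Boolean
  (type-case Value v
    [(vbool b) b]
    [(vnum n) (not (= n 0))]))

(test (value->bool/truthy (vbool #f)) #f)
(test (value->bool/truthy (vnum 0)) #f)
(test (value->bool/truthy (vnum 7)) #t)
```

  • Swapping this helper into the cnd case would change which programs are errors, without touching the syntax of the language at all.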

Local binding and scope

  • Binding associates a value with a name
  • We have already seen instances of binding when we call functions:
(define (addone x) (+ x 1))
> (addone 2)
3
  • This function call binds the value 2 to the name x
  • Local binding tells us that a name is restricted in visibility to some part of the program
  • Again, we are familiar with this notion: a function’s argument is not visible outside the body of the function:
(define (addone x) (+ x 1))
> x
x: free variable while typechecking in: x
  • We say that x is not in scope outside of the body of the function. Scope is all about determining which variables are visible where. It is a semantic notion: the implementation of our evaluator will determine the scope of a variable.

Syntax of let

  • Now we want to extend our calculator language with the ability to introduce local variables
  • For example, we want the following syntax:
(let1 (x 10)     (+ x x))
       ^ ^^      ^^^^^^^
       | |          |
       | assignment body
       |
       name
  • We can express this new syntax using BNF:
<expr> ::= <num>
          | (+ <expr> <expr>)
          | <var>
          | (let1 (<var> <expr>) <expr>)
  • And, as usual, we can write down a Plait type to capture ASTs for this new syntax:
(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])
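  • To connect the concrete syntax to the AST, here is a minimal sketch of a parser from S-expressions to Exp. The function parse is a hypothetical helper that does not appear elsewhere in these notes; it relies on Plait's s-exp-match? patterns (NUMBER, SYMBOL, ANY).

```racket
#lang plait

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])

;; parse: hypothetical helper, one cond clause per BNF production
(parse : (S-Exp -> Exp))
(define (parse s)
  (cond
    [(s-exp-match? `NUMBER s) (numE (s-exp->number s))]
    [(s-exp-match? `SYMBOL s) (varE (s-exp->symbol s))]
    [(s-exp-match? `(+ ANY ANY) s)
     (let ([parts (s-exp->list s)])
       (plusE (parse (second parts)) (parse (third parts))))]
    [(s-exp-match? `(let1 (SYMBOL ANY) ANY) s)
     (let* ([parts (s-exp->list s)]
            [binding (s-exp->list (second parts))])
       (let1E (s-exp->symbol (first binding))
              (parse (second binding))
              (parse (third parts))))]
    [else (error 'parse "invalid input")]))

(test (parse `(let1 (x 10) (+ x x)))
      (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))))
(test/exn (parse `(+ 1)) "invalid input")
```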

Semantics of let

  • Intuitively, what is the semantics of (let1 (var assignment) body)? It is surprisingly tricky, so we will develop it in stages.
  • Let’s run one by hand:
(let1 (x 10) (let1 (y 20) (+ x y)))
--> (let1 (y 20) (+ 10 y))
--> (+ 10 20)
--> 30
  • Let’s try another:
(let1 (x (+ 2 3)) x)
--> (let1 (x 5) x)
--> 5
  • Intuitively, the semantics of let should be: (1) evaluate the assignment to a value v, (2) substitute v for all instances of the symbol var in body, then (3) evaluate body
  • Ponder: What would happen if we didn’t evaluate the assignment before substituting? Would anything change?

Implementing substitution: Try 1

  • Substitution try #1: write a function subst e id v that replaces all instances of the symbol id with expression v in e:
(subst : (Exp Symbol Exp -> Exp))
(define (subst substE substId substV)
  (type-case Exp substE
    [(varE name) (if (symbol=? name substId) substV (varE name))]
    [(plusE l r) (plusE (subst l substId substV) (subst r substId substV))]
    [(numE n) (numE n)]
    [(let1E var assignment body)
     ;; naive: substitute in both the assignment and the body,
     ;; even when var rebinds substId
     (let1E var
            (subst assignment substId substV)
            (subst body substId substV))]))
  • Running subst on a few inputs:
> (subst (varE 'hello) 'hello (numE 10))
- Exp
(numE 10)

> (subst (let1E 'x (numE 10) (varE 'x)) 'x (numE 20))
- Exp
(let1E 'x (numE 10) (numE 20))
  • Using this subst function, we can implement an evaluator for let:
(interp : (Exp -> Number))
(define (interp e)
  (type-case Exp e
    [(numE n) n]
    [(plusE l r) (+ (interp l) (interp r))]
    [(varE x) (error 'runtime "unrecognized symbol")]
    [(let1E id assignment body)
     (let [(substbody (subst body id (numE (interp assignment))))]
       (interp substbody))]))
  • Let’s try running this on a few inputs:
> (interp (numE 10))
- Number
10

> (interp (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))))
- Number
20
  • Now let’s try a trickier one:
> (interp (let1E 'x (numE 10)
               (let1E 'x (numE 20)
                      (varE 'x))))
- Number
10
  • Uh oh! What happened here? Is this what we expect?
  • Probably not! We intuitively expect x to refer to its innermost binding, which is the one that binds it to 20
  • Ponder: Describe in English what the scoping rules are for this evaluator.
  • Terminology: We say that the inner x shadows the outer x
  • Here is an even trickier one: (let1 (x 10) (+ x (let1 (x 20) x))), written out as an AST:
(let1E 'x (numE 10)
       (plusE (varE 'x)
              (let1E 'x (numE 20) (varE 'x))))
  • What should happen here? Draw two diagrams: one with the scoping rules for our broken substitution, and the one that you intuitively think should be correct.

Fixing substitution

  • We need to update the specification of our subst function. How?
  • Substitution try #2: write a function subst e id v that replaces all instances of the symbol id with expression v in e until a new identifier that is equal to id enters scope:
(subst : (Exp Symbol Exp -> Exp))
(define (subst substE substId substV)
  (type-case Exp substE
    [(varE name) (if (symbol=? name substId) substV (varE name))]
    [(plusE l r) (plusE (subst l substId substV) (subst r substId substV))]
    [(numE n) (numE n)]
    [(let1E var assignment body)
     ;; the assignment is outside the scope of var, so always substitute there
     (let [(newAssignment (subst assignment substId substV))]
       (if (symbol=? var substId)
           (let1E var newAssignment body) ; do *not* substitute the body, var shadows substId
           (let1E var newAssignment       ; do substitute the body, no shadowing
                  (subst body substId substV))))]))
  • Now let’s try running our program again:
> (interp (let1E 'x (numE 10)
               (let1E 'x (numE 20)
                      (varE 'x))))
- Number
20
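  • For experimenting, here is a self-contained sketch of a substitution-based evaluator that respects shadowing. It is a compact variant written for illustration: the let1E case always substitutes into the assignment and substitutes into the body only when var does not rebind the identifier. The third and fourth tests are additional cases that are not from the lecture.

```racket
#lang plait

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])

(subst : (Exp Symbol Exp -> Exp))
(define (subst e id v)
  (type-case Exp e
    [(numE n) (numE n)]
    [(varE name) (if (symbol=? name id) v (varE name))]
    [(plusE l r) (plusE (subst l id v) (subst r id v))]
    [(let1E var assignment body)
     (let1E var
            (subst assignment id v)   ; the assignment is outside var's scope
            (if (symbol=? var id)
                body                  ; var shadows id: leave the body alone
                (subst body id v)))]))

(interp : (Exp -> Number))
(define (interp e)
  (type-case Exp e
    [(numE n) n]
    [(plusE l r) (+ (interp l) (interp r))]
    [(varE x) (error 'runtime "unrecognized symbol")]
    [(let1E id assignment body)
     (interp (subst body id (numE (interp assignment))))]))

;; the inner x shadows the outer x
(test (interp (let1E 'x (numE 10) (let1E 'x (numE 20) (varE 'x)))) 20)
;; the outer x is still visible to the left of the inner let1E
(test (interp (let1E 'x (numE 10)
                     (plusE (varE 'x)
                            (let1E 'x (numE 20) (varE 'x)))))
      30)
;; an inner assignment may mention an outer variable
(test (interp (let1E 'x (numE 10)
                     (let1E 'y (plusE (varE 'x) (numE 1)) (varE 'y))))
      11)
;; a variable with no enclosing binding is an error
(test/exn (interp (varE 'z)) "unrecognized symbol")
```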

Static vs. Dynamic Scope

  • A key property of our scoping rules so far is that they are static, meaning that we can always determine which binding a particular variable reference refers to just by reading the program text, regardless of the program’s behavior at runtime.
  • This might seem like an obvious requirement (most languages you have used satisfy it), but there are examples of languages where which variables are in scope can depend on the runtime behavior of a program
  • This is an actual Common Lisp program that prints 5 twice:
(defvar x 100)

(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))

(fun1 5)
  • What is happening!?
    • The defvar command introduces a special global variable x.
    • Then, when fun1 is called, it introduces a variable x into scope, and binds it to the value 5
    • Then, it prints x, which has value 5 so the first 5 gets printed and seems normal.
    • This is where things get really weird. Next, fun2 is called, which takes no arguments. It also prints x, which outputs 5, but we surely expect 100 to be printed!
    • This is because special variables in Common Lisp (such as those introduced by defvar) are dynamically scoped: a reference to x resolves to the most recent binding of x that is still active while the program runs, not to the binding that encloses the reference in the program text. Since fun2 runs while fun1’s binding of x to 5 is still active, it sees 5!
    • (Note: variables that are not declared special are lexically scoped in Common Lisp)
  • Dynamic scope is quite unintuitive and almost certainly a bad design choice
    • Ponder: what are some reasons why dynamic scope is undesirable?
    • Ponder: What properties of our evaluator ensure that our scope cannot be dynamic?
  • Historical note: early Lisp dialects were dynamically scoped, and Emacs Lisp used dynamic scope by default until lexical binding was added; few modern languages make dynamic scope the default
  • A nice blogpost on scope for more reading: https://prl.khoury.northeastern.edu/blog/2019/09/05/lexical-and-dynamic-scope/
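  • For contrast, here is a sketch of the same program shape written in Plait, which is statically scoped. The names x, fun1, and fun2 mirror the Common Lisp example above; this translation is an illustration, not part of the original program.

```racket
#lang plait

(define x 100)

(define (fun2)
  x)        ; this x is the top-level x, determined by the program text

(define (fun1 x)
  (fun2))   ; the parameter x is not visible inside fun2

;; under static scope, fun2 sees the top-level binding, not fun1's argument
(test (fun1 5) 100)
```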

Implementing let with environments

  • Q: What is the runtime of our substitution algorithm?
    • A: Linear in the size of the program!
  • This runtime seems quite undesirable; every time we introduce a new variable, we will have to scan over the whole program text!
  • Real language implementations do not use substitution this way: most make use of an environment for keeping track of which variables are in scope
  • An environment is a map from identifiers to values. We write these as [x |-> v1, y |-> v2, …]
  • We can run an interpreter by hand that manipulates environments. Now our arrow rules produce a sequence of (environment, program) pairs:
[], (let1 (x 10) (let1 (y 20) (+ x y)))
--> [x |-> 10], (let1 (y 20) (+ x y))
--> [x |-> 10, y |-> 20], (+ x y)
--> [x |-> 10, y |-> 20], (+ 10 y)
--> [x |-> 10, y |-> 20], (+ 10 20)
--> [x |-> 10, y |-> 20], 30

Implementing an interpreter

  • To model the environment, we use Plait’s immutable hash table datatype. Since this interpreter produces numbers, the environment maps symbols to numbers:
(define-type-alias Env (Hashof Symbol Number))
(define mt-env (hash empty)) ;; "empty environment"
  • Two key functions operate on hash tables of type (Hashof Keytype Valuetype):
    • hash-set table newkey newvalue, which creates a new hash table from table with an entry that maps newkey to newvalue
    • hash-ref table key, which returns an (Optionof Valuetype): (none) if key is not in the hash table, and (some v) if key maps to v
  • We will implement some helper functions to deal with environments:
(define (lookup (s : Symbol) (n : Env))
  (type-case (Optionof Number) (hash-ref n s)
    [(none) (error s "not bound")]
    [(some v) v]))

(extend : (Env Symbol Number -> Env))
(define (extend old-env new-name value)
  (hash-set old-env new-name value))
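  • A small sketch of how these hash operations behave, which is worth checking before relying on them in the evaluator. The names env0, env1, and env2 are made up for this example.

```racket
#lang plait

(define env0 : (Hashof Symbol Number) (hash empty))
(define env1 (hash-set env0 'x 10))   ; env0 itself is unchanged
(define env2 (hash-set env1 'x 20))   ; env1 itself is unchanged

;; missing keys produce (none) rather than an error
(test (hash-ref env0 'x) (none))
(test (hash-ref env2 'y) (none))
;; each table keeps its own mapping for x
(test (hash-ref env1 'x) (some 10))
(test (hash-ref env2 'x) (some 20))
```

  • Because hash-set returns a new table instead of mutating the old one, an environment that was extended for one subexpression can still be used, unextended, for another.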
  • Now, we are ready to implement our evaluator. What should it do?
(define (interp e nv)
  (type-case Exp e
    [(numE n) n]
    [(varE s) (lookup s nv)]
    [(plusE l r) (+ (interp l nv) (interp r nv))]
    [(let1E var val body)
     (let ([new-env (extend nv
                            var
                            (interp val nv))])
       (interp body new-env))]))
  • And we can run this on some examples:
(test (interp (let1E 'x (numE 10)
               (let1E 'x (numE 20)
                      (varE 'x))) mt-env) 20)

(test (interp (let1E 'x (numE 10) (plusE (varE 'x) (varE 'x))) mt-env) 20)

(test (interp (let1E 'x (numE 10)
                   (plusE (varE 'x)
                          (let1E 'x (numE 20) (varE 'x)))) mt-env) 30)

(test (interp (plusE (let1E 'x (numE 10) (varE 'x)) (let1E 'x (numE 15) (varE 'x))) mt-env) 25)
  • Ponder: How does our interpreter deal with variables going out of scope? Why did that last example work?
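  • Two more experiments to run while thinking about this, given as a self-contained sketch. The definitions restate the environment-based evaluator (with hash-set used directly in place of extend, and Env taken to map symbols to numbers); the two tests are additional cases that are not from the lecture.

```racket
#lang plait

(define-type Exp
  [numE (n : Number)]
  [plusE (left : Exp) (right : Exp)]
  [varE (name : Symbol)]
  [let1E (var : Symbol)
         (assignment : Exp)
         (body : Exp)])

(define-type-alias Env (Hashof Symbol Number))
(define mt-env : Env (hash empty))

(define (lookup [s : Symbol] [nv : Env]) : Number
  (type-case (Optionof Number) (hash-ref nv s)
    [(none) (error s "not bound")]
    [(some v) v]))

(interp : (Exp Env -> Number))
(define (interp e nv)
  (type-case Exp e
    [(numE n) n]
    [(varE s) (lookup s nv)]
    [(plusE l r) (+ (interp l nv) (interp r nv))]
    [(let1E var val body)
     (interp body (hash-set nv var (interp val nv)))]))

;; the right operand is interpreted in the original environment,
;; where x was never bound
(test/exn (interp (plusE (let1E 'x (numE 10) (varE 'x)) (varE 'x)) mt-env)
          "not bound")
;; to the right of the inner let1E, x refers to the outer binding again
(test (interp (let1E 'x (numE 1)
                     (plusE (let1E 'x (numE 2) (varE 'x))
                            (varE 'x)))
              mt-env)
      3)
```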