Lecture 3: Conditionals and local variables

Key concepts and goals for today:

ite with Booleans
Scope and shadowing
Substitution and how it can be used to implement local variables
Natural semantics and inference rules

Conditionals: the `ite` language

Let’s revisit our ite interpreter from last time. This time, we will implement it using Booleans.
Here is the abstract syntax datatype for the ite language with Booleans:

;;; type expr =
;;;   | add of expr * expr
;;;   | mul of expr * expr
;;;   | num of number
;;;   | bool of bool
;;;   | ite of expr * expr * expr
(struct eadd (e1 e2) #:transparent)
(struct emul (e1 e2) #:transparent)
(struct enum (n) #:transparent)
(struct ebool (b) #:transparent)
(struct eite (guard thn els) #:transparent)

Note that we have an ebool struct: this denotes a Boolean constant.
This language has two kinds of values: numbers and Booleans. We will represent these two kinds of values in a particular datatype:

;;; type value =
;;;    | vnum of number
;;;    | vbool of bool
(struct vbool (b) #:transparent)
(struct vnum (n) #:transparent)

;;; to-num : value -> number
;;; converts a value to a number or raises a runtime error
(define (to-num v)
  (match v
    [(vnum n) n]
    [_ (error "runtime")])) ; recall that the "_" case is a default case

;;; to-bool : value -> bool
;;; converts a value to a bool or raises a runtime error
(define (to-bool v)
  (match v
    [(vbool b) b]
    [_ (error "runtime")]))

Now, we give a semantics to ite: each ite term evaluates to a value.
Semantics of ite:
- enum n evaluates to vnum n
- ebool b evaluates to vbool b
- eadd e1 e2:
  1. Evaluate e1 to v1
  2. Evaluate e2 to v2
  3. If v1 and v2 are both numbers then evaluate eadd e1 e2 to (+ v1 v2). Otherwise, raise an error.
- emul e1 e2:
  1. Evaluate e1 to v1
  2. Evaluate e2 to v2
  3. If v1 and v2 are both numbers then evaluate emul e1 e2 to (* v1 v2). Otherwise, raise an error.
- eite guard thn els:
  1. Evaluate guard to v. If v is not a vbool, then raise an error.
  2. If v is #t, then evaluate thn. Otherwise, evaluate els.
We can implement these semantics in the following interpreter:

;;; interp : expr -> value
;;; evaluates an expression to a value
(define (interp e)
  (match e
    [(eadd e1 e2)
     (let [(n1 (to-num (interp e1)))
           (n2 (to-num (interp e2)))]
       (vnum (+ n1 n2)))]
    [(emul e1 e2)
     (let [(n1 (to-num (interp e1)))
           (n2 (to-num (interp e2)))]
       (vnum (* n1 n2)))]
    [(ebool b) (vbool b)]
    [(eite guard thn els)
     (let [(vguard (to-bool (interp guard)))]
       (if vguard (interp thn) (interp els)))]
    [(enum n) (vnum n)]))

(check-equal? (interp (enum 1)) (vnum 1))
(check-equal? (interp (eadd (enum 10) (enum 20))) (vnum 30))
(check-equal? (interp (emul (enum 10) (enum 20))) (vnum 200))
(check-equal? (interp (eadd (emul (enum 1) (enum 2)) (enum 3))) (vnum 5))
(check-equal? (interp (eite (ebool #t) (enum 2) (enum 3))) (vnum 2))
(check-equal? (interp (eite (ebool #f) (enum 2) (enum 3))) (vnum 3))
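
The tests above only cover well-behaved programs. As a rough sketch of the error cases in the semantics, the block below repeats the definitions from this section in compact form (so that it runs on its own) and checks that mixing numbers and Booleans raises a runtime error. It also checks that only the chosen branch of an ite is evaluated.

```racket
#lang racket
(require rackunit)

;; Same definitions as in the notes above, repeated so this block runs on its own.
(struct eadd (e1 e2) #:transparent)
(struct emul (e1 e2) #:transparent)
(struct enum (n) #:transparent)
(struct ebool (b) #:transparent)
(struct eite (guard thn els) #:transparent)
(struct vnum (n) #:transparent)
(struct vbool (b) #:transparent)

(define (to-num v) (match v [(vnum n) n] [_ (error "runtime")]))
(define (to-bool v) (match v [(vbool b) b] [_ (error "runtime")]))

(define (interp e)
  (match e
    [(enum n) (vnum n)]
    [(ebool b) (vbool b)]
    [(eadd e1 e2) (vnum (+ (to-num (interp e1)) (to-num (interp e2))))]
    [(emul e1 e2) (vnum (* (to-num (interp e1)) (to-num (interp e2))))]
    [(eite guard thn els)
     (if (to-bool (interp guard)) (interp thn) (interp els))]))

;; adding a Boolean to a number is a runtime error
(check-exn exn:fail? (lambda () (interp (eadd (enum 1) (ebool #t)))))
;; a guard that is not a Boolean is a runtime error
(check-exn exn:fail? (lambda () (interp (eite (enum 0) (enum 2) (enum 3)))))
;; only the chosen branch is evaluated, so the ill-typed else branch never runs
(check-equal? (interp (eite (ebool #t) (enum 2) (eadd (enum 1) (ebool #f))))
              (vnum 2))
```

The last check passes because interp never visits the els branch when the guard is #t.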

Local variables and scope

Let’s continue growing our little language by adding another important feature: local variables.
You’ve programmed with local variables before. For instance, in Python we can create a local variable:

> x = 5
> y = x + 10
> print(x + y)
20

Similarly, in Racket we create a local variable using the let syntax:

> (let [(x 10)] (+ x 20))
30

Terminology:
- The name of a variable is its identifier. In the above example, x is an identifier.
- The expression associated with an identifier is the assignment. In the above example, 10 is the assignment to x.
- Assigning the identifier x to its assignment is called a declaration.
- If an identifier x is assigned to a particular value by some declaration, we say it is bound to that value.
The thing that makes “local variables” local is that they are not accessible to the entire program. For instance, in the following Racket program, we see x is not visible outside of the let expression:

> (let [(x 10)] (+ x 20))
30
> x
x: undefined;
 cannot reference an identifier before its definition

Definition: The scope of a declaration is the portion of the program for which that declaration can be used.
- In the above example, the scope of x is the sub-expression (+ x 20), which is called the body of the let expression.
There are a variety of rules for scope, and different languages have different rules: scoping rules are one of the key design decisions that distinguish different programming languages.
An important kind of scope is lexical (or static) scope, which says that the scope of a declaration can be determined without running the program. Most (but not all!) widely-used languages use lexical scope.
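
As a small illustration of lexical scope in Racket, the sketch below uses a function (get-x, a name made up for this example) whose body mentions x. Which declaration that x refers to is fixed by where the function is written, not by where it is called.

```racket
#lang racket
(require rackunit)

;; get-x refers to the x declared here, because that is the
;; declaration enclosing it in the program text
(define get-x
  (let [(x 1)]
    (lambda () x)))

;; declaring another x around the call site does not change which
;; declaration get-x refers to
(define result
  (let [(x 100)]
    (get-x)))

(check-equal? result 1)
```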
An important property of local variables is that there can be multiple declarations for the same identifier. For instance, this is a valid Racket program:

> (let [(x 10)]
    (let [(x 20)] x))
20

In the above program the inner-most declaration of x is the one that takes precedence. This is a typical design choice in many programming languages, and can be summarized as “identifiers are always bound to their inner-most declaration”.
Definition: An outer declaration is called shadowed if there is some inner declaration of that same identifier.
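
Shadowing only lasts for the scope of the inner declaration. A minimal sketch in plain Racket: once we are back outside the inner let, x refers to the outer declaration again.

```racket
#lang racket
(require rackunit)

(define result
  (let [(x 10)]
    (+ (let [(x 20)] x)   ; inside the inner let, x is 20
       x)))               ; outside the inner let, x is 10 again

(check-equal? result 30)
```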

The `let` language

Now we want to extend our calculator language with the ability to introduce local variables.
We will use similar scoping rules to Racket, and the following abstract syntax data structure:

;;; type expr =
;;;   | add of expr * expr
;;;   | mul of expr * expr
;;;   | num of number
;;;   | let of string * expr * expr
;;;   | ident of string
(struct eadd (e1 e2) #:transparent)
(struct emul (e1 e2) #:transparent)
(struct enum (n) #:transparent)
(struct elet (id assignment body) #:transparent)
(struct eident (id) #:transparent)

The semantics of our let language again evaluates programs to numbers.
All the rules for the semantics are the same as calc except for the new terms elet and eident.
To give a semantics to elet we will introduce a new idea: substitution.
The goal of substitution is to replace an identifier with an expression while respecting scope. Think of it like “find and replace”: we want to find all instances of x in some expression body and replace each of them with a new expression assignment.
- We denote this as body[x |-> assignment]
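
To make the “find and replace” intuition concrete, here is a minimal sketch on a cut-down expression type with no let, so questions of scope do not come up yet. The helper name subst-simple is made up for this example and is not the substitution function used for the full language.

```racket
#lang racket
(require rackunit)

;; A cut-down expression type with no let.
(struct eadd (e1 e2) #:transparent)
(struct enum (n) #:transparent)
(struct eident (id) #:transparent)

;; subst-simple : expr -> string -> expr -> expr
;; (hypothetical helper) computes expr[id |-> e] by plain find-and-replace
(define (subst-simple expr id e)
  (match expr
    [(eadd e1 e2) (eadd (subst-simple e1 id e) (subst-simple e2 id e))]
    [(enum n) (enum n)]
    [(eident x) (if (equal? x id) e (eident x))]))

;; (x + y)[x |-> 10] is 10 + y: only x is replaced
(check-equal? (subst-simple (eadd (eident "x") (eident "y")) "x" (enum 10))
              (eadd (enum 10) (eident "y")))
```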
Now we can give a semantics of let in terms of substitution:
Semantics of let:
- (elet id assignment body) evaluates to:
  1. evaluate assignment to v
  2. evaluate body[id |-> (enum v)] to v2
  3. return v2
- (eident id) raises an error if evaluated
Now for the tricky part: how do we define substitution?
Different choices will result in different scoping rules.
To achieve our goal of “identifiers are always bound to their inner-most declaration”, we will give our substitution function the following implementation:

;;; subst : expr -> string -> expr -> expr
;;; performs the substitution expr[id |-> e]
;;; i.e., substitutes e for id in expr
(define (subst expr id e)
  (match expr
    [(eadd e1 e2) (eadd (subst e1 id e)
                        (subst e2 id e))]
    [(emul e1 e2) (emul (subst e1 id e)
                        (subst e2 id e))]
    [(enum num) (enum num)]
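    [(elet letid assignment body)
     ;; the assignment is evaluated outside the scope of letid, so we
     ;; always substitute into it
     (if (equal? letid id)
         (elet letid (subst assignment id e) body) ; shadowing case: leave the body alone
         (elet letid (subst assignment id e) (subst body id e)))] ; not shadowing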
    [(eident x)
     ;; if x = id, then we perform substitution. otherwise, do nothing
     (if (equal? id x) e (eident x))]
    ))

Now we are ready to implement and test our interpreter:

;;; interp : expr -> number
;;; evaluates an expression to a value
(define (interp expr)
  (match expr
    [(eadd e1 e2) (+ (interp e1) (interp e2))]
    [(emul e1 e2) (* (interp e1) (interp e2))]
    [(eident x) (error "runtime error: unbound identifier")]
    [(elet id binding body)
     (let* [(vbinding (enum (interp binding)))
            (substbody (subst body id vbinding))]
       (interp substbody))]
    [(enum n) n]))

(check-equal? (interp (enum 1)) 1)
(check-equal? (interp (eadd (enum 10) (enum 20))) 30)
(check-equal? (interp (emul (enum 10) (enum 20))) 200)
(check-equal? (interp (eadd (emul (enum 1) (enum 2)) (enum 3))) 5)

;;; check basic case
(check-equal? (interp (elet "x" (enum 2) (eident "x"))) 2)
;;; check shadowing
(check-equal?
 (interp (elet "x" (enum 2)
               (elet "x" (enum 3)
                     (eident "x")))) 3)
;;; check multiple bindings
(check-equal?
 (interp (elet "x" (enum 2)
               (elet "y" (enum 3)
                     (eadd (eident "x") (eident "y"))))) 5)
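
A few more cases are worth checking. The sketch below repeats the definitions in compact form so that it runs on its own. Note that in the shadowing case this version of subst still substitutes into the inner assignment, since in Racket the assignment of an inner let sees the outer binding. It then checks an assignment that needs evaluating, an inner assignment that mentions the outer x, and an unbound identifier.

```racket
#lang racket
(require rackunit)

(struct eadd (e1 e2) #:transparent)
(struct emul (e1 e2) #:transparent)
(struct enum (n) #:transparent)
(struct elet (id assignment body) #:transparent)
(struct eident (id) #:transparent)

(define (subst expr id e)
  (match expr
    [(eadd e1 e2) (eadd (subst e1 id e) (subst e2 id e))]
    [(emul e1 e2) (emul (subst e1 id e) (subst e2 id e))]
    [(enum n) (enum n)]
    [(eident x) (if (equal? id x) e (eident x))]
    [(elet letid assignment body)
     ;; the assignment is outside the scope of letid, so substitute there;
     ;; the body is left alone when letid shadows id
     (elet letid
           (subst assignment id e)
           (if (equal? letid id) body (subst body id e)))]))

(define (interp expr)
  (match expr
    [(enum n) n]
    [(eadd e1 e2) (+ (interp e1) (interp e2))]
    [(emul e1 e2) (* (interp e1) (interp e2))]
    [(eident x) (error "runtime error: unbound identifier")]
    [(elet id assignment body)
     (interp (subst body id (enum (interp assignment))))]))

;; the assignment is evaluated before it is substituted
(check-equal? (interp (elet "x" (eadd (enum 1) (enum 2))
                            (emul (eident "x") (eident "x")))) 9)
;; the inner assignment refers to the outer x
(check-equal? (interp (elet "x" (enum 2)
                            (elet "x" (eadd (eident "x") (enum 1))
                                  (eident "x")))) 3)
;; an identifier with no enclosing declaration is a runtime error
(check-exn exn:fail? (lambda () (interp (elet "x" (enum 2) (eident "y")))))
```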

Static vs. Dynamic Scope

A key property of our scoping rules so far is that they are static.
This might seem like an obvious requirement, since most languages you have used satisfy it, but there are languages in which the set of variables in scope depends on the runtime behavior of a program.
This is an actual Common Lisp program that prints 5 twice:

(defvar x 100)

(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))

(fun1 5)

What is happening!?
- The defvar command introduces a special global variable x.
- Then, when fun1 is called, it introduces a variable x into scope, and binds it to the value 5
- Then, it prints x, which has value 5 so the first 5 gets printed and seems normal.
- This is where things get really weird. Next, fun2 is called, which takes no arguments. It also prints x, which outputs 5, even though we might expect 100 to be printed!
- This is because variables introduced with defvar in Common Lisp have dynamic scope: a reference to x refers to the most recent binding of x that is still active while the program runs, which here is the binding made by the call to fun1.
- (Note: ordinary Common Lisp variables that have not been declared special, such as those bound by let, are lexically scoped.)
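
Racket variables are lexically scoped, but Racket's parameters behave much like the x in the Common Lisp program. The sketch below mirrors that program using make-parameter and parameterize. It is an illustration of dynamic binding only, not how our interpreter treats variables.

```racket
#lang racket
(require rackunit)

;; x is a parameter: calling (x) looks up its current dynamic binding
(define x (make-parameter 100))

;; fun2 reads whatever binding of x is active at the time of the call
(define (fun2) (x))

;; fun1 rebinds x for the duration of its body, then calls fun2
(define (fun1 v)
  (parameterize [(x v)]
    (list (x) (fun2))))

(check-equal? (fun1 5) '(5 5)) ; fun2 sees the binding made by fun1
(check-equal? (fun2) 100)      ; the binding is undone once fun1 returns
```

The second check shows that a dynamic binding is removed when the form that created it exits, so what fun2 returns depends on who called it.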
Dynamic scope is quite unintuitive and almost certainly a bad design choice.
- Ponder: what are some reasons why dynamic scope is undesirable?
- Ponder: What properties of our evaluator ensure that our scope cannot be dynamic?
Historical note: dynamic scope was the default in many early Lisp dialects, and it survives in places such as Emacs Lisp and shell languages, but few modern languages use it by default.
A nice blogpost on scope for more reading: https://prl.khoury.northeastern.edu/blog/2019/09/05/lexical-and-dynamic-scope/
