Lecture 2: More Racket, Calculator Lang
Key concepts and goals for today:
- Finish up basics of Racket that we will use throughout the class: lists, structs, first-class functions
- Implement our first programming language: calc
- Understand abstract syntax and surface syntax
- Be able to draw a parse tree
- Understand host semantics
Lists
- Lists are built out of two constructors:
- '(), the empty list value
- (cons hd tl), the list constructor that prepends the element hd to the front of the list tl
- For example, we can construct a list of elements 1, 2, 3 by applying cons three times:
> (cons 1 (cons 2 (cons 3 '())))
'(1 2 3)
- Note the syntax '(1 2 3), which is read “quote one two three”. This is how Racket renders lists; more on that later
- It is tedious to type cons all the time, so there are a number of short-hand ways to describe lists in Racket:
> (list 1 2 3)
'(1 2 3)
> '(1 2 3)
'(1 2 3)
- There are a number of useful built-in functions for lists; the Racket documentation has a full list
- Here are some examples of some useful ones:
> (define my-list '(1 2 3))
> (empty? my-list)
#f
> (length my-list)
3
- Now that we’ve built lists, we need a way of destructing them. To do this, we will use the built-in match form:
> (define my-list '(1 2 3))
> (match my-list
['() "empty!"]
[(cons hd tl) "not empty!"])
"not empty!"
- Now we can define some interesting functions involving lists! Here is one that sums all of the elements of a list:
; sum-list: int list -> int
; returns the sum of all elements in the list
(define (sum-list l)
(match l
['() 0]
[(cons hd tl) (+ hd (sum-list tl))]))
(check-equal? (sum-list '()) 0)
(check-equal? (sum-list '(1 2 3)) 6)
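- The same recursive pattern also works when the result is itself a list. The following is a minimal sketch; double-all is a hypothetical helper that is not part of the lecture code:

```racket
#lang racket
(require rackunit)

; double-all: int list -> int list
; (hypothetical helper, not from the lecture)
; returns a new list in which every element is doubled
(define (double-all l)
  (match l
    ; the empty list has nothing to double
    ['() '()]
    ; double the head, recur on the tail, and rebuild the list with cons
    [(cons hd tl) (cons (* 2 hd) (double-all tl))]))

(check-equal? (double-all '()) '())
(check-equal? (double-all '(1 2 3)) '(2 4 6))
```

- As in sum-list, there is one match case per list constructor: one for '() and one for cons.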
User-defined data-types
- Most interesting programs implement their own custom data-types.
- An example you will see in the homework is a binary tree. We can build binary trees in Racket as follows:
;;; type tree =
;;; | node of tree * tree
;;; | leaf of number
(struct node (l r) #:transparent)
(struct leaf (x) #:transparent)
- Note the structure of the comment that we used to describe this tree type. You read this as: “A tree is a type that is either:
- a node that is a pair of trees; or
- a leaf that is a number.”
- We will see more examples of writing these kinds of comments.
- The #:transparent syntax is boilerplate: it makes the struct’s fields visible, so the DrRacket REPL prints (leaf 10) rather than an opaque #<leaf>. If you’re curious, see the Racket documentation for struct
- Now we can build binary trees:
> (leaf 10)
(leaf 10)
> (node (leaf 20) (leaf 30))
(node (leaf 20) (leaf 30))
- To destruct your structs and manipulate them, you should use pattern matching:
> (define my-tree (node (leaf 10) (leaf 20)))
> (match my-tree
[(leaf n) n]
[(node l r) l])
(leaf 10)
- Experiment with matching to get a feel for it
- Here is the detailed documentation for pattern matching if that is helpful. There are many more examples.
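- Pattern matching combines naturally with recursion: a function over trees typically has one match case per struct. Here is a small sketch; sum-tree is a hypothetical example name, and the struct definitions are repeated so the snippet runs on its own:

```racket
#lang racket
(require rackunit)

;;; type tree =
;;;   | node of tree * tree
;;;   | leaf of number
(struct node (l r) #:transparent)
(struct leaf (x) #:transparent)

;;; sum-tree : tree -> number
;;; (hypothetical example) adds up the numbers stored in all leaves
(define (sum-tree t)
  (match t
    ; a leaf contributes its own number
    [(leaf n) n]
    ; a node contributes the sum of both subtrees
    [(node l r) (+ (sum-tree l) (sum-tree r))]))

(check-equal? (sum-tree (leaf 10)) 10)
(check-equal? (sum-tree (node (leaf 1) (node (leaf 2) (leaf 3)))) 6)
```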
Local variables
- Local variables are declared with the built-in let form:
> (let ([x 10]) (+ x 20))
30
- There are a few different syntactic forms of let that offer different conveniences while programming; we will introduce those later as needed. If you are curious, see the local-binding section of the Racket reference
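- As a preview of one such form, here is a short sketch contrasting let with let*. Both are standard Racket forms; the variable names and numbers are just for illustration:

```racket
#lang racket
(require rackunit)

; let can bind several variables at once;
; the right-hand sides cannot refer to each other
(check-equal? (let ([x 10] [y 20]) (+ x y)) 30)

; let* binds sequentially, so the definition of y may refer to x
(check-equal? (let* ([x 10] [y (* 2 x)]) (+ x y)) 30)
```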
First-class functions
We saw last time that every Racket program is either:
- A value, which is a number, Boolean, or string
- A function call, which is written (func-name arg1 arg2 ... argn).
There is another kind of Racket value that we will use quite often: functions. We declare a function value as follows:
> (lambda (x) (+ 1 x))
#<procedure>
We will refer to these as λ-terms or λ-expressions. We can call a λ-term in the usual way we call functions in Racket:
> ((lambda (x) (+ x 1)) 5)
6
- Functions can be passed as arguments to other functions, just like any other Racket value. For example, the following defines a function call-twice that takes a function f and applies it twice, starting from an initial argument k:
> (define (call-twice f k) (f (f k)))
> (call-twice (lambda (x) (* 2 x)) 2) ; computes (* 2 (* 2 2))
8
- Functions can also be returned. Let’s make a function make-adder that takes a number k and returns a function that adds k to whatever it is called with:
> (define (make-adder k) (lambda (x) (+ x k)))
> (define add-5 (make-adder 5))
> (add-5 10)
15
- There are a few more Racket features we will use in this class, but not too many more. They will be explained as they are encountered.
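- To tie the first-class function examples together, here is a short sketch that passes functions to the built-in map and also returns a function. make-adder is repeated from above; compose-two is a hypothetical helper introduced only for this illustration:

```racket
#lang racket
(require rackunit)

; make-adder, as defined in the notes above
(define (make-adder k) (lambda (x) (+ x k)))

; map applies a function to every element of a list
(check-equal? (map (lambda (x) (* x x)) '(1 2 3)) '(1 4 9))
; the function returned by make-adder can be passed like any other value
(check-equal? (map (make-adder 5) '(1 2 3)) '(6 7 8))

; compose-two (hypothetical): takes two functions f and g and
; returns a new function that applies g first and then f
(define (compose-two f g) (lambda (x) (f (g x))))
(check-equal? ((compose-two (make-adder 1) (lambda (x) (* 2 x))) 10) 21)
```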
Calculator Lang: your first programming language
Abstract syntax trees
- Recall: A programming language consists of two components:
- Syntax: text that describes programs
- Semantics: the meaning of the program
- There are many different kinds of syntax, even for simple operations like addition:
- s-expressions, as in Racket: (+ 1 2)
- infix, as in Python: 1 + 2
- postfix, as in Forth: 1 2 +
- All of these syntactic forms are essentially equivalent and represent the same operation: adding 1 and 2.
- Our first step on our journey to defining the calculator language calc is to abstract our notion of syntax.
- To keep things clear, we will typically refer to the textual version of syntax as surface syntax.
- Definition: An abstract syntax tree (AST) is a tree-like data structure for representing syntax.
- The internal nodes of the tree are called non-terminal nodes (or production nodes).
- The leaf nodes of the tree are called terminal nodes.
- The process of converting surface syntax into abstract syntax is called parsing.
- Example: we want to parse all of the above ways of writing 1 + 2 into the same AST, with an internal node + and two terminal nodes 1 and 2:
  +
 / \
1   2
- Example: ASTs are useful for disambiguating the order of operations. For instance, under the usual precedence rules the expression 2 * 3 + 4 can be unambiguously written as an AST:
    +
   / \
  *   4
 / \
2   3
- Note that ASTs don’t require the use of parentheses to disambiguate the order of operations.
Syntax of calc
- Now we can describe the syntax of our calculator language
- Goal: Design a small programming language that can add and multiply numbers
- Example surface-syntax programs in infix notation:
2
1 + 2
2 * (3 + 4)
- We will represent the abstract syntax of these programs using the following AST data structure in Racket:
;;; type expr =
;;; | add of expr * expr
;;; | mul of expr * expr
;;; | num of number
(struct add (e1 e2) #:transparent)
(struct mul (e1 e2) #:transparent)
(struct num (n) #:transparent)
- Now we can write example ASTs for each of the above programs:
(num 2)
(add (num 1) (num 2))
(mul (num 2) (add (num 3) (num 4)))
- We will be working directly with ASTs for now; we will return to the problem of parsing later.
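- The earlier point that ASTs make the order of operations explicit can be checked directly with these structs. The sketch below repeats the struct definitions so it runs on its own; times-first and plus-first are hypothetical names chosen for this example:

```racket
#lang racket
(require rackunit)

(struct add (e1 e2) #:transparent)
(struct mul (e1 e2) #:transparent)
(struct num (n) #:transparent)

; 2 * 3 + 4 under the usual precedence:
; the multiplication is a child of the addition
(define times-first (add (mul (num 2) (num 3)) (num 4)))

; 2 * (3 + 4): the addition is a child of the multiplication
(define plus-first (mul (num 2) (add (num 3) (num 4))))

; transparent structs are compared structurally by equal?,
; so these are two different trees
(check-false (equal? times-first plus-first))

; the root node of each tree is the operation that is applied last
(check-true (add? times-first))
(check-true (mul? plus-first))
```

- No parentheses appear in either tree: the grouping is carried entirely by which node is the parent of which.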
Semantics of calc
- The goal of semantics is to describe what programs mean
- What does this program mean: (add (num 1) (num 2))?
- Intuitively you might say it means “add 1 to 2”: the meaning of this program is the value we get by running it. Running a program to compute its value is called interpreting the program.
- But what do “add”, “1”, and “2” mean? Is it binary addition? Real-number addition?
- We are left with a circularity problem: to assign a meaning to our program, we need to use some external language to define its meaning.
- Definition: The host language is the language used to assign meaning to programs.
- We will use Racket as our host language for calc: we will use Racket’s definition of numbers to interpret numbers, and Racket’s definition of addition to interpret add.
- Semantics of calc:
- (num n): evaluates to the Racket number n
- (add e1 e2):
- evaluate e1 to a Racket number v1
- evaluate e2 to a Racket number v2
- return the Racket addition (+ v1 v2)
- The semantics of mul is similar to add, except with Racket multiplication instead of addition.
- We can implement the above semantics as a program called an interpreter:
;;; interp : expr -> number
;;; evaluates a calc expression to a number
(define (interp e)
(match e
[(add e1 e2) (+ (interp e1) (interp e2))]
[(mul e1 e2) (* (interp e1) (interp e2))]
[(num n) n]))
(check-equal? (interp (num 1)) 1)
(check-equal? (interp (add (num 10) (num 20))) 30)
(check-equal? (interp (mul (num 10) (num 20))) 200)
(check-equal? (interp (add (mul (num 1) (num 2)) (num 3))) 5)
- This interpreter defines the semantics for calc: it gives a meaning to all calc programs in terms of Racket programs.
- There are many other ways we could have programmed this interpreter; we have chosen just one possible implementation
- Ponder: What are the consequences of our choice of host language?
- What if we had chosen C instead of Racket to implement our interpreter? What are some programs that would behave differently?
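- One possible starting point for this question, as a sketch rather than a full answer: Racket integers are arbitrary precision and Racket fractions are exact, and calc inherits both. The comparison with C assumes a hypothetical C interpreter that stores numbers in a fixed-width 64-bit integer, which cannot represent 2^64. The structs and interp are repeated from above so the snippet runs on its own:

```racket
#lang racket
(require rackunit)

(struct add (e1 e2) #:transparent)
(struct mul (e1 e2) #:transparent)
(struct num (n) #:transparent)

;;; interp : expr -> number (repeated from the notes above)
(define (interp e)
  (match e
    [(add e1 e2) (+ (interp e1) (interp e2))]
    [(mul e1 e2) (* (interp e1) (interp e2))]
    [(num n) n]))

; 2^32 * 2^32 = 2^64: Racket computes this exactly,
; while a 64-bit integer type has no room for the result
(check-equal? (interp (mul (num 4294967296) (num 4294967296)))
              18446744073709551616)

; Racket also has exact rational numbers, so calc does too
(check-equal? (interp (add (num 1/3) (num 2/3))) 1)
```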
Parsing s-expressions
- Now we will develop an improved surface syntax for calc that is easier to use than manually writing AST nodes
- We will use s-expressions as the basis for our new surface syntax
- Definition: An s-expression is either:
- A symbol, written 'symbol (read “quote symbol”), for example 'a
- A list, written '(a b c)
- Racket’s syntax is based on s-expressions, which makes them very easy to work with. For example, we can easily write s-expressions that represent different calc programs:
> '(+ 1 2)
'(+ 1 2)
> '(* (+ 1 2) 3)
'(* (+ 1 2) 3)
- Note that these s-expressions are not yet calc AST nodes!
- To translate these s-expressions into calc ASTs, we need a parser:
;;; parse-sexpr: sexpr -> expr
;;; parses an s-expression into an expression
;;; this comment describes the surface syntax of our language using Backus-Naur Form (BNF).
;;; we will discuss BNF a bit more later on once we have more complicated languages.
;;; <sexpr> ::= (+ <sexpr> <sexpr>) | <num> | (* <sexpr> <sexpr>)
(define (parse-sexpr s)
(match s
[(list op s1 s2)
(cond
[(equal? op '+) (add (parse-sexpr s1) (parse-sexpr s2))]
[(equal? op '*) (mul (parse-sexpr s1) (parse-sexpr s2))]
[else (error "parse error: invalid operation")])]
[v (if (number? v) (num v) (error "parse error: not a number"))]))
(check-equal? (parse-sexpr '(+ 1 2)) (add (num 1) (num 2)))
(check-equal? (parse-sexpr '(+ (* 3 4) 2)) (add (mul (num 3) (num 4)) (num 2)))
;;; parse-and-run : sexpr -> number
;;; parses an s-expression into an expr and then runs it
(define (parse-and-run s)
(interp (parse-sexpr s)))
(check-equal? (parse-and-run '(+ 1 2)) 3)
(check-equal? (parse-and-run '(+ (* 3 4) 2)) 14)
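- It is also worth checking what the parser does with inputs outside the grammar. The sketch below repeats the structs and parse-sexpr from above so it runs on its own, and uses rackunit’s check-exn to confirm that malformed inputs raise an error; the particular bad inputs are just illustrative choices:

```racket
#lang racket
(require rackunit)

(struct add (e1 e2) #:transparent)
(struct mul (e1 e2) #:transparent)
(struct num (n) #:transparent)

;;; parse-sexpr: sexpr -> expr (repeated from the notes above)
(define (parse-sexpr s)
  (match s
    [(list op s1 s2)
     (cond
       [(equal? op '+) (add (parse-sexpr s1) (parse-sexpr s2))]
       [(equal? op '*) (mul (parse-sexpr s1) (parse-sexpr s2))]
       [else (error "parse error: invalid operation")])]
    [v (if (number? v) (num v) (error "parse error: not a number"))]))

; '- is not an operation in calc, so the cond falls through to the error case
(check-exn exn:fail? (lambda () (parse-sexpr '(- 1 2))))
; a bare symbol is neither a three-element list nor a number
(check-exn exn:fail? (lambda () (parse-sexpr 'x)))
; a two-element list does not match (list op s1 s2) and is not a number
(check-exn exn:fail? (lambda () (parse-sexpr '(+ 1))))
```

- The call to parse-sexpr is wrapped in a lambda so that check-exn can run it and catch the error itself.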
Next time
- Conditionals (if-then-else)
- Let language, scope, and substitution