Introduction to types

Goals for today:

Encounter types and make our first simple typechecker
Understand the basics of typing judgments and derivations and how to draw typing derivation trees

A simple type checker

A classic saying about types, due to Robin Milner, is that “well-typed programs cannot ‘go wrong’”.¹
What does it mean for a program to “go wrong”? This can have many interpretations, and different approaches to types provide different notions of safety.
Let’s consider a tiny language and study the ways in which it can “go wrong”:

(define-type Exp
  [addE (l : Exp) (r : Exp)]
  [appendE (l : Exp) (r : Exp)]
  [numE (n : Number)]
  [stringE (s : String)])

(define-type Value
  [numV (n : Number)]
  [stringV (s : String)])

(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
    [(numE n) (numV n)]
    [(stringE s) (stringV s)]
    [(addE l r ) (numV (+ (numV-n (calc l)) (numV-n (calc r))))]
    [(appendE l r ) (stringV (string-append
                              (stringV-s (calc l))
                              (stringV-s (calc r))))]))

One way to make things go wrong is to try to add a number and a string:

> (calc (addE (numE 10) (stringE "hello")))
- Value
numV-n: contract violation
  expected: numV?
  given: (stringV "hello")
  in: the 1st argument of
      (->
       numV?
       (or/c
        undefined?
        ...pkgs/plait/main.rkt:1013:41))
  contract from: numV-n
  blaming: use
   (assuming the contract is correct)

Wow, that’s quite an error message! Wouldn’t it be nice if we could tell the programmer something more specific, like “you tried to add a number and a string”?
Let’s write a simple program to prevent this kind of silly error.
We will make a function called type-of that associates a type with an expression:
- Types are abstractions of values: 10 is a Number, "hello" is a String
- We say a term has a certain type if running it produces a value of that type. For example, “1 + 2” has type Number
- Certain operations, like addE, expect their arguments to have certain types

(define-type Type
  [stringT]
  [numT])

(type-of : (Exp -> Type))
(define (type-of e)
  (type-case Exp e
             [(numE n) (numT)]
             [(stringE s) (stringT)]
             [(addE l r)
              (if (and (numT? (type-of l)) (numT? (type-of r)))
                  (numT)
                  (error 'type-error "tried to add non-numbers"))]
             [(appendE l r)
              (if (and (stringT? (type-of l)) (stringT? (type-of r)))
                  (stringT)
                  (error 'type-error "tried to append non-string"))]))

Now, if we try to add a number and a string, we get a type error:

> (type-of (addE (numE 10) (stringE "hello")))
- Type
type-error: tried to add non-numbers

Notice: the type-of function looks a lot like an interpreter, except it does not compute values.
What we have written above is called a type-checker: a program that assigns a type to a program, or fails if the program has no type.
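
One way to put the checker to work is to run it before the interpreter, so that ill-typed programs are rejected with the short type error instead of the contract violation. The following is a minimal sketch, not part of the development above: the helper name checked-calc is hypothetical, and the definitions of Exp, Value, Type, calc and type-of are repeated only so that the block runs on its own.

```racket
#lang plait

(define-type Exp
  [addE (l : Exp) (r : Exp)]
  [appendE (l : Exp) (r : Exp)]
  [numE (n : Number)]
  [stringE (s : String)])

(define-type Value
  [numV (n : Number)]
  [stringV (s : String)])

(define-type Type
  [stringT]
  [numT])

(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
    [(numE n) (numV n)]
    [(stringE s) (stringV s)]
    [(addE l r) (numV (+ (numV-n (calc l)) (numV-n (calc r))))]
    [(appendE l r) (stringV (string-append (stringV-s (calc l))
                                           (stringV-s (calc r))))]))

(type-of : (Exp -> Type))
(define (type-of e)
  (type-case Exp e
    [(numE n) (numT)]
    [(stringE s) (stringT)]
    [(addE l r)
     (if (and (numT? (type-of l)) (numT? (type-of r)))
         (numT)
         (error 'type-error "tried to add non-numbers"))]
    [(appendE l r)
     (if (and (stringT? (type-of l)) (stringT? (type-of r)))
         (stringT)
         (error 'type-error "tried to append non-string"))]))

;; Hypothetical helper (not from the notes): type-check first, then run.
(checked-calc : (Exp -> Value))
(define (checked-calc e)
  (begin
    (type-of e)   ; raises a type-error if e has no type
    (calc e)))    ; only reached for well-typed programs

;; A well-typed program still evaluates as before.
(test (checked-calc (addE (numE 1) (numE 2))) (numV 3))
;; An ill-typed program is rejected before calc ever runs.
(test/exn (checked-calc (addE (numE 10) (stringE "hello")))
          "tried to add non-numbers")
```

The design choice here is simply to sequence the two passes: the result of type-of is discarded, and it is used only for its ability to fail early.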

Inference rules and natural deduction

Now we arrive at a powerful idea: a concise logical notation for describing the types of programs.
Typing rules are of the form “to show that (+ 1 2) has type Number, I must first show that 1 has type Number and 2 has type Number, and then I can conclude that (+ 1 2) has type Number”.
We can introduce some more concise notation for these facts.
First, we use the colon : to denote that something has a particular type, e.g. we will write 1 : Number.
We can declare that all expressions of the form (numE n) have type Number: this is called an axiom. We write this as:

\[\frac{}{\texttt{(numE n) : Number}}~\text{(T-Num)}\]

This states a fact about our abstract syntax: it states that all syntactic terms (numE n) have type Number.
Notice that our (T-Num) axiom has a free variable n in it: we can fill in n with a number when we apply this axiom, i.e. it holds that:

\[\dfrac{}{\texttt{(numE 10) : Number}}~\text{(T-Num)}\]

Sometimes we do not include the horizontal line for axioms for notational convenience, but let’s leave it for now.
Then, we will use a horizontal line to separate the premises (or antecedents) from the conclusion (or consequent). The general shape is:

\[\frac{\texttt{premise1}\qquad \texttt{premise2} \qquad \cdots \qquad \texttt{premiseN}}{\texttt{conclusion}}\]

The above notation is called an inference rule (you can infer the conclusion from the premises).
Using this notation, we can write the above English sentence as:

\[\dfrac{\dfrac{}{\texttt{(numE 1) : Number}} \qquad \dfrac{}{\texttt{(numE 2) : Number}}}{\texttt{(addE (numE 1) (numE 2)) : Number}}\]

Suppose we wanted to state more generally that “to show (addE e1 e2) : Number, we must show e1 : Number and e2 : Number”. This is easily stated as an inference rule in our nice new notation:

\[\frac{\texttt{e1 : Number} \qquad \texttt{e2 : Number}}{\texttt{(addE e1 e2) : Number}}~(\text{T-Add})\]

Notice: in the above notation I am permitted to have free variables for expressions referred to in the premise and the conclusion. When we apply this rule, we will substitute in expressions for these free variables.
Suppose we have a more complicated expression like (+ (+ 1 2) 3). We can apply our T-Add rule to type-check this:

\[\dfrac{\dfrac{\dfrac{}{\texttt{1 : Number}} \qquad \dfrac{}{\texttt{2 : Number}}}{\texttt{(+ 1 2) : Number}}~\text{(T-Add)} \qquad \dfrac{}{\texttt{3 : Number}}} {\texttt{(+ (+ 1 2) 3) : Number}}~\text{(T-Add)}\]

Notice that the above sequence of inference-rule applications forms a tree: we call this a derivation tree (or typing derivation). Each statement of the form e : T appearing in the tree is called a typing judgment.
Let’s expand our typing rules to account for strings:

\[\dfrac{}{\texttt{(stringE s) : String}}~\text{(T-String)} \qquad\qquad \dfrac{\texttt{e1 : String} \qquad \texttt{e2 : String}}{\texttt{(appendE e1 e2) : String}}~\text{(T-Append)}\]
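
As a small worked instance of these two rules (the strings "a" and "b" are arbitrary choices for illustration), we can instantiate e1 and e2 in T-Append with two string literals and close each premise with T-String:

\[\dfrac{\dfrac{}{\texttt{(stringE "a") : String}}~\text{(T-String)} \qquad \dfrac{}{\texttt{(stringE "b") : String}}~\text{(T-String)}}{\texttt{(appendE (stringE "a") (stringE "b")) : String}}~\text{(T-Append)}\]

Every leaf of this tree is an axiom, so the derivation is complete.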

Now, what happens if we try to give a typing derivation for an ill-typed term?

\[\dfrac{???}{\texttt{(appendE (numE 10) (stringE "hello")) : String}}\]

We get stuck: no rule permits us to derive that $\texttt{(numE 10)}$ has type $\texttt{String}$, which is what T-Append requires of its first argument. A type error is a failure to construct a typing derivation.
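
Spelling out the attempt may make the failure easier to see. T-Append is the only rule whose conclusion has the shape of an appendE term, so it is the only rule we can try at the root. Its second premise closes with T-String, but its first premise asks for a judgment that no rule concludes (the "no rule applies" annotation is informal, not part of the notation):

\[\dfrac{\dfrac{\text{no rule applies}}{\texttt{(numE 10) : String}} \qquad \dfrac{}{\texttt{(stringE "hello") : String}}~\text{(T-String)}}{\texttt{(appendE (numE 10) (stringE "hello")) : String}}~\text{(T-Append)}\]

The only rule that mentions numE is T-Num, and its conclusion always assigns the type Number.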
Historical note: This process of proving things via the application of inference rules and axioms is called natural deduction, and goes back to Gentzen (1934).

Typing if-then-else

Let’s extend our language and type-system further with if:

(define-type Exp
  [addE (l : Exp) (r : Exp)]
  [appendE (l : Exp) (r : Exp)]
  [numE (n : Number)]
  [iteE (g : Exp) (thn : Exp) (els : Exp)]
  [stringE (s : String)])

Let’s assume we have the “take the then-branch if the guard is 0” semantics, so we can write the following interpreter:

(calc : (Exp -> Value))
(define (calc e)
  (type-case Exp e
    [(numE n) (numV n)]
    [(stringE s) (stringV s)]
    [(addE l r ) (numV (+ (numV-n (calc l)) (numV-n (calc r))))]
    [(iteE g thn els)
     (if (= (numV-n (calc g)) 0)
         (calc thn)
         (calc els))]
    [(appendE l r ) (stringV (string-append
                              (stringV-s (calc l))
                              (stringV-s (calc r))))]))

Continuing with the mantra of “well-typed programs can’t go wrong”, what sorts of ways can our interpreter above “go wrong”, and how do we design a type system to prevent those errors?
Here are some examples that trigger errors:

; (1) the guard is not a number
> (calc (iteE (stringE "oops") (numE 20) (numE 30)))

; (2) a more subtle case: this one works fine
> (calc (addE (numE 10)
               (iteE (numE 1) (stringE "oops") (numE 30))))
- Value
(numV 40)

; almost the same program, but this one errors!
> (calc (addE (numE 10)
               (iteE (numE 0) (stringE "oops") (numE 30))))
numV-n: contract violation

How should we go about giving a typing rule for if to prevent these runtime errors?
Preventing the first case is easy: require that g have type Number.
Preventing the second case is more subtle: what went wrong here?
- Executing one branch led to a type error, and the other didn’t
- A natural thing to do is to require both branches to be the same type
Then we can give the typing rule for iteE:

\[\dfrac{\texttt{g : Number} \qquad \texttt{thn : T} \qquad \texttt{els : T}}{\texttt{(iteE g thn els) : T}}~\text{(T-If)}\]

Here, we used a free variable $\texttt{T}$, which stands for any type; both branches must have this same type.
Tiny example derivation:

\[\dfrac{\dfrac{}{\texttt{(numE 0) : Number}} \qquad \dfrac{}{\texttt{(numE 1) : Number}} \qquad \dfrac{}{\texttt{(numE 2) : Number}}} {\texttt{(iteE (numE 0) (numE 1) (numE 2)) : Number}}~\text{(T-If)}\]
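
It may also help to see T-If reject the if-expression from example (2) above. A sketch of the attempt with T instantiated to String: the guard and the then-branch are fine, but the else-branch would need a type that no rule gives it (again, "no rule applies" is an informal annotation):

\[\dfrac{\dfrac{}{\texttt{(numE 1) : Number}}~\text{(T-Num)} \qquad \dfrac{}{\texttt{(stringE "oops") : String}}~\text{(T-String)} \qquad \dfrac{\text{no rule applies}}{\texttt{(numE 30) : String}}}{\texttt{(iteE (numE 1) (stringE "oops") (numE 30)) : String}}~\text{(T-If)}\]

Choosing T to be Number fails symmetrically, because the then-branch would need (stringE "oops") : Number. Since String and Number are the only types in our language, the expression has no typing derivation at all, regardless of which branch would run.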

Now, with this rule in hand, we can make a typechecker:

(type-of : (Exp -> Type))
(define (type-of e)
  (type-case Exp e
             [(numE n) (numT)]
             [(stringE s) (stringT)]
             [(addE l r)
              (if (and (numT? (type-of l)) (numT? (type-of r)))
                  (numT)
                  (error 'type-error "tried to add non-numbers"))]
             [(iteE g thn els)
              (let ([t-g (type-of g)]
                    [t-thn (type-of thn)]
                    [t-els (type-of els)])
                (if (and (equal? t-g (numT)) (equal? t-thn t-els))
                    t-thn
                    (error 'type-error "Type error in if")))]
             [(appendE l r)
              (if (and (stringT? (type-of l)) (stringT? (type-of r)))
                  (stringT)
                  (error 'type-error "tried to append non-string"))]))

And, we can test it a bit:

> (type-of (iteE (numE 0) (stringE "hello") (stringE "world")))
- Type
(stringT)

> (type-of (iteE (numE 0) (numE 10) (stringE "world")))
- Type
type-error: Type error in if
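
These REPL checks can also be recorded as tests. The sketch below assumes Plait's built-in test and test/exn forms (test/exn passes when the expression raises an error whose message contains the given string). The Exp, Type and type-of definitions are repeated from above so the block is self-contained, and the particular test expressions are illustrative choices rather than part of the notes.

```racket
#lang plait

(define-type Exp
  [addE (l : Exp) (r : Exp)]
  [appendE (l : Exp) (r : Exp)]
  [numE (n : Number)]
  [iteE (g : Exp) (thn : Exp) (els : Exp)]
  [stringE (s : String)])

(define-type Type
  [stringT]
  [numT])

(type-of : (Exp -> Type))
(define (type-of e)
  (type-case Exp e
    [(numE n) (numT)]
    [(stringE s) (stringT)]
    [(addE l r)
     (if (and (numT? (type-of l)) (numT? (type-of r)))
         (numT)
         (error 'type-error "tried to add non-numbers"))]
    [(iteE g thn els)
     (let ([t-g (type-of g)]
           [t-thn (type-of thn)]
           [t-els (type-of els)])
       (if (and (equal? t-g (numT)) (equal? t-thn t-els))
           t-thn
           (error 'type-error "Type error in if")))]
    [(appendE l r)
     (if (and (stringT? (type-of l)) (stringT? (type-of r)))
         (stringT)
         (error 'type-error "tried to append non-string"))]))

;; Both branches are strings, so the whole expression is a String.
(test (type-of (iteE (numE 0) (stringE "hello") (stringE "world")))
      (stringT))
;; An if nested inside an addition type-checks when both branches are numbers.
(test (type-of (addE (numE 10) (iteE (numE 1) (numE 20) (numE 30))))
      (numT))
;; Branches of different types are rejected.
(test/exn (type-of (iteE (numE 0) (numE 10) (stringE "world")))
          "Type error in if")
;; A non-numeric guard is rejected as well.
(test/exn (type-of (iteE (stringE "oops") (numE 20) (numE 30)))
          "Type error in if")
```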

Ponder: the type-checker rejected the above program, even though it would run without error! Why are we OK with this?

¹ Milner, Robin (1978). “A Theory of Type Polymorphism in Programming”. Journal of Computer and System Sciences, 17 (3): 348–375.