On this page:
7.1 Designing a Typesystem for If  Lang
7.1.1 The Typechecker Design Recipe
7.2 Simply-Typed Lambda Calculus (STLC)
8.15

7 Types🔗

Well-typed programs can’t go wrong.

Milner, Robin. "A theory of type polymorphism in programming." Journal of computer and system sciences 17.3 (1978): 348-375.

You’ve all programmed in a programming language with types before. A common example is C, where you declare the type of a variable alongside its name and assignment:

#include <stdio.h>

#include <stdint.h>

 

int main() {

    uint64_t my_num = 10;

    printf("Hello, World! %ld\n", my_num);

    return 0;

}

A type is a collection of values. In the above C program, the type uint64_t is the collection of all possible unsigned 64-bit integer values.

Types restrict the kinds of programs that you can write. For instance, in Plait, you cannot write this program:

(+ "hello" 1)

If you try, you will get an error that says something like "typecheck failed: Number vs. String". A type system is the set of rules that a program must follow in order to be well-typed. A type checker is a program that checks whether or not your program is well-typed.

Broadly, there are two motivations for wanting to have types in your programming languages:
  • Physical reasons. In order to run programs we need to run it on a low-level CPU that doesn’t know about strings, Booleans, etc. Types tell the compiler what shape data has so it can know how to map it into something the CPU understands. This can make writing compilers much easier, and it was the initial motivation for introducing types into programs.

  • Logical reasons. Types convey a lot of information about how programs behave to the compiler and to other programmers beyond how that information is represented physically on the computer. Types are used as a means to (1) find and prevent certain kinds of bugs, (2) provide checked specifications for how programs behave, (3) aid in the creation of developer tools, and more.

Our goal in this module is to explore the design and implementation of type systems for various programming languages. This is a rich and deep topic within programming languages research and practice; we will only scratch the surface here.

In order to be useful, a type checker should satisfy the following key properties:

As we design our type checkers for our languages, we will seek to ensure that each type checker satisfies the above four properties.

7.1 Designing a Typesystem for IfLang🔗

Let’s see an example of designing a typechecker for the IfLang language we have been studying. First, we recall the abstract syntax and definitional interpreter for IfLang:

> (define-type IfLang
    (numI [value : Number])
    (addI [e1 : IfLang] [e2 : IfLang])
    (boolI [value : Boolean])
    (ifI [guard : IfLang] [thn : IfLang] [els : IfLang]))
> (define-type Value
    (numV [v : Number])
    (boolV [v : Boolean]))
> (interp-ite : (IfLang -> Value))
> (define (interp-ite e)
    (type-case IfLang e
               [(numI n) (numV n)]
               [(boolI b) (boolV b)]
               [(addI e1 e2)
                (let ([v1 (numV-v (interp-ite e1))]
                      [v2 (numV-v (interp-ite e2))])
                  (numV (+ v1 v2)))]
               [(ifI guard thn els)
                (if (boolV-v (interp-ite guard))
                    (interp-ite thn)
                    (interp-ite els))]))

Currently, it is possible for IfLang programs to cause runtime errors:

> (interp-ite (ifI (numI 10) (numI 15) (numI 30)))
  boolV-v: contract violation ...

The first step in designing a typechecker is to establish the soundness guarantees of that typechecker, i.e., which kinds of bugs ought to be prevented. Here, we will say that our IfLang typechecker is sound if it rejects all IfLang programs that cause runtime errors in our definitional interpreter.

Let’s implement a typechecker that eliminates runtime errors. Let’s pause on this point: can’t a silly type checker just reject all programs? Such a typechecker is surely sound, but it is not useful, because we still want to be able to write programs. So, a second requirement on type systems is that they are expressive: we can’t just reject all programs, including the valid ones.

Let’s write down some examples of the kinds of runtime errors our progams can have that we want our type system to prevent:

Now we are ready to make our type checker. Like our interpreters, our type checker will work by checking that each subprogram is itself well-typed. We can write a function called type-of that associates each IfLang program with its type. The type of type-of will be (IfLang -> IfType).

What is our "type of types"? In this instance, we have two possible types: numbers and Booleans.

> (define-type IfType
     [numT]
     [boolT])

Now we are ready to implement our typechecker. Let’s start filling it in. The first few cases aren’t so bad:

(type-of : (IfLang -> Number))
(define (type-of e)
  (type-case IfLang e
             [(numI n) (numT)]
             [(boolI b) (boolT)]
             [(addI e1 e2)
              (if (and (equal? (type-of e1) (numT))
                       (equal? (type-of e2) (numT)))
                  (numT)
                  (error 'type "expected int"))]
             [(ifI g thn els)
              ???]))

Notice how the addI case checks that both e1 and e2 are the correct type before returning numT as the resulting type of the expression.

Now, how do we typecheck ifI? Certainly we should ensure that the guard is a Boolean:

(type-of : (IfLang -> Number))
(define (type-of e)
  (type-case IfLang e
             [(numI n) (numT)]
             [(boolI b) (boolT)]
             [(addI e1 e2)
              (if (and (equal? (type-of e1) (numT))
                       (equal? (type-of e2) (numT)))
                  (numT)
                  (error 'type "expected int"))]
             [(ifI g thn els)
              (if (equal? (type-of g) (boolT))
                  ???
                  (error 'type "expectec bool")
                  )]))

Now, what do we do for the ???? Here we encounter a conondrum, because we can have a program like this:

(ifI <very complex expression> (boolI #t) (numI 10))

Depending on what <very complex expression> evaluates to, this program could be have type boolT or numT. How do we resolve this?

We choose the first option. This is a design decision because it involves tradeoffs between design objectives. Most languages that have type checkers make this decision, including Plait. Plait requires that both branches of an if expression have the same type, or it will raise a type error:

(if #t #t #f)
- Boolean
#t
(if #t 10 #f)
typecheck failed: Number vs. Boolean
sources:
10
#f

We will make the same compromise as Plait and require that both branches of an if-expression return the same type:

> (type-of : (IfLang -> IfType))
> (define (type-of e)
    (type-case IfLang e
               [(numI n) (numT)]
               [(boolI b) (boolT)]
               [(addI e1 e2)
                (if (and (equal? (type-of e1) (numT))
                         (equal? (type-of e2) (numT)))
                    (numT)
                    (error 'type "expected int"))]
               [(ifI g thn els)
                (if (equal? (type-of g) (boolT))
                    (let [(t-thn (type-of thn))
                          (t-els (type-of els))]
                      (if (equal? t-thn t-els)
                          t-thn
                          (error 'type "expected thn and els to have same type")))
                    (error 'type "expected bool"))]))

Finally, we can test our typechecker. We will have two kinds of tests: soundness tests that check for type errors and expressivity tests that ensure that valid programs do not raise type errors.

First, some expressivity tests:

> (test (type-of (addI (numI 1) (numI 2)))
        (numT))

- Void

good (type-of (addI (numI 1) (numI 2))) at line 9

  expected: (numT)

  given: (numT)

> (test (type-of (addI (addI (numI 1) (numI 3)) (numI 2)))
        (numT))

- Void

good (type-of (addI (addI (numI 1) (numI 3)) (numI 2))) at line 10

  expected: (numT)

  given: (numT)

> (test (type-of (ifI (boolI #t) (numI 10) (numI 20)))
        (numT))

- Void

good (type-of (ifI (boolI #t) (numI 10) (numI 20))) at line 11

  expected: (numT)

  given: (numT)

Next, some soundness tests:

> (test/exn (type-of (addI (boolI #t) (numI 2))) "type")

- Void

good (type-of (addI (boolI #t) (numI 2))) at line 12

  expected: "type"

  given: "type: expected int"

> (test/exn (type-of (ifI (boolI #f) (numI 10) (boolI #t))) "type")

- Void

good (type-of (ifI (boolI #f) (numI 10) (boolI #t))) at line 13

  expected: "type"

  given: "type: expected thn and els to have same type"

7.1.1 The Typechecker Design Recipe🔗

"In sum, teaching mathematics insisted on these points:
  • the solution process

  • the ability to provide justifications for each step

  • a clean presentation that would imply a justification if it was omitted."

Matthias Felleisen, The Design Recipe

We can summarize the above design process into a design recipe that explains step-by-step how to approach designing a type system for any programming language:

7.2 Simply-Typed Lambda Calculus (STLC)🔗

Let’s take our new recipe and apply it to a more interesting language: the lambda calculus extended with numbers. We will use the begin with the following abstract syntax, which we will need to revise later:

(define-type LExp
   [varE (s : Symbol)]
   [numE (n : Number)]
   [lamE (arg : Symbol) (body : LExp)]
   [appE (e : LExp) (arg : LExp)])
(define-type LType
  [NumT]
  [FunT (arg : LType) (body : LType)])
 
(define-type LExp
  [varE (s : Symbol)]
  [numE (n : Number)]
  [lamE (arg : Symbol) (typ : LType) (body : LExp)]
  [appE (e : LExp) (arg : LExp)])
 
; perform e1[x |-> e2]
(subst : (LExp Symbol LExp -> LExp))
(define (subst e1 x e2)
  (type-case LExp e1
    [(varE s) (if (symbol=? s x)
                  e2
                  (varE s))]
    [(numE n) (numE n)]
    [(lamE id typ body)
    (if (symbol=? x id)
      (lamE id typ body)              ; shadowing case
      (lamE id typ (subst body x e2)))]
    [(appE e1App e2App)
     (appE (subst e1App x e2)
           (subst e2App x e2))]))
 
(define (interp e)
  (type-case LExp e
    [(varE s) (error 'runtime "unbound symbol")]
    [(lamE id typ body) (lamE id typ body)]
    [(numE n) (numE n)]
    [(appE e1 e2)
     ; run e1 to get (lambda (id) body)
     ; run e2 to get a value argV
     ; run body[id |-> v]
     (letrec [(e1V (interp e1))
              (body (lamE-body e1V))
              (id (lamE-arg e1V))
              (argV (interp e2))]
       (interp (subst body id argV)))]))
 
(define-type-alias TEnv (Hashof Symbol LType))
(define mt-env (hash empty)) ;; "empty environment"
 
(define (lookup (n : TEnv) (s : Symbol))
  (type-case (Optionof LType) (hash-ref n s)
             [(none) (error 'type-error "unrecognized symbol")]
             [(some v) v]))
 
(extend : (TEnv Symbol LType -> TEnv))
(define (extend old-env new-name value)
  (hash-set old-env new-name value))
 
 
(define (type-of env e)
  (type-case LExp e
             [(varE s) (lookup env s)]
             [(numE n) (NumT)]
             [(lamE arg typ body)
              (FunT typ (type-of (extend env arg typ) body))]
             [(appE e1 e2)
              (let [(t-e1 (type-of env e1))
                    (t-e2 (type-of env e2))]
                (type-case LType t-e1
                           [(FunT tau1 tau2)
                            (if (equal? tau1 t-e2)
                                tau2
                                (error 'type-error "invalid function call"))]
                           [else (error 'type-error "invalid function call")]))]))
 
(test (interp (appE (lamE 'x (NumT) (varE 'x)) (numE 10))) (numE 10))
 
(test (type-of mt-env (appE (lamE 'x (NumT) (varE 'x)) (numE 10))) (NumT))
 

Figure 10: Implementation of the simply-typed lambda calculus with numbers.