On this page:
10.1 System F
10.2 Semantics of System-F
10.3 A Typechecker for System-F
10.4 Consequences of System-F Types
8.15

10 Parametric Polymorphism🔗

It is useful to write functions that work for a variety of types. For example, the identity function \lambda x . x should work for any type of x. Type systems that allow a single piece of code to be used with multiple types are broadly referred to as polymorphic type systems, or sometimes simply polymorphism There are many kinds of polymorphism, and we will explore some of them in this class. A particularly common form of polymorphism permits writing functions that are parametric in a type, meaning that the function itself takes a type argument (also called a type parameter) as part of its definition. This form of polymorphism is called parametric polymorphism, and it is what we will study first.

For example, Java supports a form of parametric polymorphism via the following syntax:

public class Main {

  public static <T> T identity(T x) {

    return x;

  }

 

  public static void main(String[] args) {

    System.out.println(identity("Hello World!"));

  }

}

The syntax <T> declares a type argument T. The identity function is not allowed to know what T is; it must work over all possible instantiations of T. The main functions calls identity with an argument of type string; this replaces T with string in identity.

Parametric polymorphism is essential for code reuse: it allows the programmer to avoid unnecessary duplication of code that should work regardless of the argument type.

10.1 System F🔗

As usual in this class, we will design, implement, and study a small language to better understand the principles of parametric polymorphism. This language will be called System F.

System F was introduced independently by Girard in the context of logic and Reynolds in the context of programming languages. Our development in this section is based on Chapter 23 of Types and Programming Languages.

The idea of System F is to extend the simply-typed \lambda-calculus with the following new ideas:
  • Type abstraction. We need a way of designating a function that takes a type as an argument. This is called a type abstraction, and we use the surface syntax \Lambda X.e to denote it (the symbol \Lambda is a capital lambda). Using this syntax, we can write a polymorphic identity function: \Lambda X.(\lambda x : X . x) The identifier X above is a type parameter.

  • Type application. Now we need a way to instantiate type parameters. We do this using something that looks a lot like normal function application: (\Lambda X . (\lambda x : X . x)) ~ [\texttt{Num}]

    We call the above syntax a type application: its semantics is to substitute the type parameter X for the argument \texttt{Num} in the body of the type abstraction. So, the above steps to (\lambda x : \texttt{Num} . x). We wrap type application in square brackets to syntactically distinguish it from term application.

  • Universal types. The term \Lambda X.(\lambda x : X . x) is a program waiting to receive a type as an argument. Such a program should itself have a type to make sure we don’t misuse it (for instance, by calling it with an argument that is not a type). We will write this type \forall X . t (read "for all X"), where X is a type identifier and t is a type that can refer to X. Now we can write a type for the identity function above:

    \Lambda X.(\lambda x : X . x) : \forall X . X \rightarrow X Think of "\forall X . X \rightarrow X" like a lambda: the argument X is free in the body X \rightarrow X. Once application occurs, X will be substituted for a concrete type.

Given these intuitions, now we can define the complete surface syntax of System-F:

 

expr

 ::= 

λ ident : t . expr

 

  |  

expr expr

 

  |  

expr [ t ]

 

  |  

ident

 

  |  

num

 

  |  

ident

 

t

 ::= 

Num

 

  |  

tident

 

  |  

t -> t

 

  |  

\forall tident . t

Figure 14: Surface syntax for System-F with numbers.

We can give abstract syntax to match this surface syntax:

> (define-type typ
     (funT [arg : typ] [ret : typ])
     (identT [id : Symbol])
     (numT)
     (forallT [id : Symbol] [body : typ]))
> (define-type expr
    (identF [id : Symbol])
    (numF [n : Number])
    (lamF [arg : Symbol] [t : typ] [body : expr])
    (appF [e1 : expr] [e2 : expr])
    (tlamF [id : Symbol] [body : expr])
    (tappF [e1 : expr] [t : typ]))

Some examples of parsing surface syntax into abstract syntax:

Surface syntax

 

Abstract syntax datastructure

\Lambda X . \lambda x : X . x

 

(tlamF 'X (lamF 'x (identT 'X) (identF 'x)))

(\Lambda X . \lambda x : X . x)~[\texttt{Num}]

 

(tappF (tlamF 'X (lamF 'x (identT 'X) (identF 'x))) (numT))

10.2 Semantics of System-F🔗

The semantics of System-F can be given in terms of substitution, just like STLC. Let’s run some example programs to get a feel for how programming in System-F looks.

First, the identity function evaluated on the number 10:

((\Lambda X . \lambda x : X . x)~[\texttt{Num}{}])~10 \xrightarrow{\text{Substitute }\texttt{Num}{}} (\lambda x : \texttt{Num}{} . x)~10 \xrightarrow{\text{Substitute }10} 10

Now let’s do a more interesting case. Let’s define a "call-twice" function that takes two arguments: a function f of type X \rightarrow X and an argument a of type X, and calls f twice with a as the initial argument:

\texttt{calltwice} = \Lambda X . \lambda f : X \rightarrow X . \lambda a : X . f~(f~a)

Let’s step through an example invocation of calltwice:

\begin{align*} &(((\Lambda X . \lambda f : X \rightarrow X . \lambda a : X . f~(f~a))~[\texttt{Num}])~(\lambda x : \texttt{Num} . x))~10 \\ &\xrightarrow{\text{substitute \texttt{Num}}} ((\lambda f : \texttt{Num} \rightarrow \texttt{Num} . \lambda a : \texttt{Num} . f~(f~a))~(\lambda x : \texttt{Num} . x))~10 \\ &\xrightarrow{\text{substitute }(\lambda x : \texttt{Num} . x)} ((\lambda a : \texttt{Num} . (\lambda x : \texttt{Num} . x)~((\lambda x : \texttt{Num} . x)~a)))~10 \\ &\xrightarrow{\text{substitute }10} ((\lambda x : \texttt{Num} . x)~((\lambda x : \texttt{Num} . x)~10)) \\ &\rightarrow (\lambda x : \texttt{Num} . x)~10 \\ &\rightarrow 10 \end{align*}

10.3 A Typechecker for System-F🔗

The rules for typechecking System F are fairly simple extensions of typechecking STLC. We need to extend our type context \Gamma to account for the introduction of type variables. Now we will have two ways of extending \Gamma: (1) \Gamma \cup \{x:\tau\} which extends \Gamma by associating x with type \tau, and (2) \Gamma \cup X which extends \Gamma with a new type variable X.

\dfrac{~}{\Gamma \vdash \texttt{n} : \texttt{Num}}~\text{(T-Num)} \qquad \dfrac{x \in \Gamma \qquad \Gamma(x) = t}{\Gamma \vdash x : t}~\text{(T-Ident)} \dfrac{\Gamma \cup \{x \mapsto t_1\} \vdash e : t_2}{\Gamma \vdash (\lambda x : t_1 . e) : t_1 \rightarrow t_2 }~\text{(T-Lam)} \qquad \dfrac{\Gamma \vdash e_1 : t_1 \rightarrow t_2 \qquad \Gamma \vdash e_2 : t_1} {\Gamma \vdash (e_1~e_2) : t_2}~\text{(T-App)} \dfrac{\Gamma \cup X \vdash e : \tau}{\Gamma \vdash \Lambda X . e : \forall X . \tau}~\text{(T-TLam)} \qquad \dfrac{\Gamma \vdash e_1 : \forall X . \tau_1}{e_1 ~ [\tau_2] : \tau_1[X \mapsto e_2]}~\text{(T-TApp)}

Figure 15: Surface syntax for System-F with numbers.

As an example, here is a typing derivation tree:

\dfrac{ \dfrac{ \dfrac{ \dfrac{ \dfrac{x:X \in \{X, x:X\}}{ \{X, x:X\} \vdash x : X}~\text{(T-Var)} } {\{X\} \vdash \lambda x : X . x : X \rightarrow X}~\text{(T-Lam)} } {\{\} \vdash (\Lambda X . \lambda x : X . x) : \forall X . X \rightarrow X}~\text{(T-TLam)} } { \{\} \vdash ((\Lambda X . \lambda x : X . x)~[\texttt{Num}]) : \texttt{Num} \rightarrow \texttt{Num}}~\text{(T-Tapp)} \qquad \{\} \vdash 10 : \texttt{Num}} { \{\}\vdash ((\Lambda X . \lambda x : X . x)~[\texttt{Num}])~10 : \texttt{Num}}~\text{(T-App)}

There is a subtlety in these rules: the (T-App) rule implicitly requires comparing two types for equality. However, it is not trivial to check whether or not two types are equal in System-F. Consider the following two terms, annotated with their types:

\Lambda X . \lambda x : X . x : \forall X . X \rightarrow X \qquad \qquad \Lambda Y . \lambda x : Y . x : \forall Y . Y \rightarrow Y

Intuitively, these two terms should have the same type, since they differ only by their naming of the type parameter (in other words, they are \alpha-equivalent types). As a consequence, when implementing our typechecker, we will need to check type equality up to \alpha-equivalence.

Now we can implement our typechecker:

;;; performs substitution t1[x |-> t2]
(define (tsubst t1 x t2)
  (type-case typ t1
             [(funT arg ret)
              (funT (tsubst arg x t2) (tsubst ret x t2))]
             [(numT) (numT)]
             [(identT id)
              (if (equal? id x)
                  t2
                  (identT id))]
             [(forallT id body)
              ; check for shadowing
              (if (equal? x id)
                  (forallT id body) ; shadowing case, do nothing
                  (forallT id (tsubst body x t2))
                  )]))
 
;;; compute the DeBruijn indexed version of a term for checking
;;; alpha equivalence. this is a helper function for type-eq?
(debruijn : (typ String -> typ))
(define (debruijn t c)
  (type-case typ t
             [(numT) (numT)]
             [(funT t1 t2)
              (funT (debruijn t1 c) (debruijn t2 c))]
             [(identT x) (identT x)]
             [(forallT id body)
              (let* ([new-name (string-append c "x")]
                     [subst-body (debruijn (tsubst body id (identT (string->symbol new-name)))
                                           new-name)])
                (forallT (string->symbol new-name) subst-body))]
             ))
 
;;; check if t1 and t2 are equivalent types up to alpha renaming
(define (type-eq? t1 t2)
  (equal? (debruijn t1 "x") (debruijn t2 "x")))
 
;;; test type-eq?
(test (type-eq? (forallT 'X (identT 'X))
                (forallT 'Y (identT 'Y)))
      #t)
 
(test (type-eq? (forallT 'X (forallT 'Y (funT (identT 'X) (identT 'Y))))
                (forallT 'Z (forallT 'W (funT (identT 'Z) (identT 'W)))))
      #t)
 
(test (type-eq? (forallT 'X (forallT 'X (funT (identT 'X) (identT 'X))))
                (forallT 'Y (forallT 'X (funT (identT 'X) (identT 'X)))))
      #t)
 
(define-type-alias TEnv (Hashof Symbol typ))
(define mt-env (hash empty)) ;; "empty environment"
 
(extend : (TEnv Symbol typ -> TEnv))
(define (extend old-env new-name value)
  (hash-set old-env new-name value))
 
(type-of : (TEnv expr -> typ))
(define (type-of env e)
  (type-case expr e
             [(identF x)
              (type-case (Optionof typ) (hash-ref env x)
                         [(some t) t]
                         [(none) (error 'type "unbound identifier")])]
             [(numF n) (numT)]
             [(lamF id t body)
              (funT t (type-of (extend env id t) body))]
             [(appF e1 e2)
              (let [(t-e1 (type-of env e1))
                    (t-e2 (type-of env e2))]
                (type-case typ t-e1
                           [(funT tau1 tau2)
                            (if (type-eq? tau1 t-e2)
                                tau2
                                (error 'type-error "invalid function call"))]
                           [else (error 'type-error "invalid function call")]))]
             [(tlamF id body)
              (forallT id (type-of env body))]
             [(tappF e t)
              (type-case typ (type-of env e)
                         [(forallT id body)
                          (tsubst body id t)]
                         [else (error 'type "invalid type application")])]))
 
(define (type-check? e)
  (try
   (begin
     (type-of mt-env e)
     #t)
   (λ () #f)))
 

10.4 Consequences of System-F Types🔗

A natural question you might have is: is System F more expressive than the simply-typed lambda calculus? The answer is yes: we can give types to programs that are not typeable in STLC. An example is the self-application term:

\lambda x . x~x

We saw how we couldn’t give a type to this program in STLC because there was no way for the argument to a function to have the same type as the function itself. In System F, we can give a type to this program by making use of type application:

\lambda x : (\forall X . X \rightarrow X) . (x~[\forall X . X \rightarrow X])~x

Note that this program has type (\forall X . X \rightarrow X) \rightarrow (\forall X . X \rightarrow X) This seems to imply that System F is extremely expressive, perhaps even so expressive that it can write all programs that can be written in the untyped lambda calculus. But in fact, this is not the case: just like STLC, System F is strongly normalizing (meaning, all well-typed System F programs terminate).

How can this be if we just typechecked self application? It seems like we should now be able to write \Omega by chaining together \omega~\omega, just like we did in the untyped lambda calculus.

The observation that all System-F terms of type \forall X . X \rightarrow X must behave like the identity function is due to Reynolds, see Reynolds, J.C. (1983). "Types, abstraction, and parametric polymorphism". Information Processing. North Holland, Amsterdam. pp. 513–523. The introduction to this paper is hilarious.

Subsequently, these ideas were greatly expanded on by Wadler in his classic paper "Theorems for free!"

Let’s look closely at the type of \omega. Its argument is type \forall X . X \rightarrow X. Take a moment and write down as many System-F programs that you can can of this type. What do you see?

You should notice that they are all observationally equivalent to the identity function!

Why is this the case? Intuitively, all programs of type \forall X . X \rightarrow X must (1) consume some unknown type X, and (2) produce an inhabitant of some unknown type X \rightarrow X. Since this function can’t know anything at all about X – including how to construct values of type X – all it can do is return the argument to the caller.

Now we are starting to see how type structure can tell us rich and interesting details about how programs behave, and can even be so restrictive that there is exactly one program that inhabits a particular type.