19 Memory Safety
We continue developing our abstract machine model and explore memory management, a critical component of every programming language that up until this point we have not encountered. First, we will develop an abstract machine model called the CESK machine that has an explicit model of the store, also known as RAM. We will implement a small language with memory allocation and updating and see how, within this language, it is possible to perform memory-unsafe operations. This mini-language is not so different from many of today’s memory unsafe languages like C and C++. Then we will discuss how to recover runtime safety by tagging values with their type, which will enable us to distinguish numbers memory locations. This is how many untyped languages like Python and JavaScript enforce runtime safety.
19.1 The CESK Machine
To study memory and memory safety we will make one final extension to our abstract machine model: we will add a store that acts like your computer’s RAM. The store is a vector (also known as an array) that contains numbers.
As a basis for exploring the CESK machine, we will design a small language called RefLang that permits has let-bindings, boxing and unboxing, numbers, and addition:
(define-type RefLang (letE [id : Symbol] [e1 : RefLang] [e2 : RefLang]) (identE [id : Symbol]) (boxE [e : RefLang]) (addE [e1 : RefLang]) (unboxE [e : RefLang]) (numE [n : Number]))
As usual when designing an abstrat machine we must design a frame datatype that holds the state of the machine in order to handle recursively evaluating terms. RefLang has the following frame type:
(define-type Frame ;;; frame for (let x e1 e2) when e1 is not a value (let1F [E : Env] [id : Symbol] [e : RefLang]) (boxF [E : Env]) (unboxF [E : Env]) ;;; frame for (addE e1 e2) when e1 is not a number (add1E [E : Env] [e2 : RefLang]) ;;; frame for (addE n e2) when n is a number (add2E [E : Env] [n : Number]))
These frames are more-or-less identical to the CEK machine. You may wonder why we don’t need to add the store S to the frame; we don’t need to include this because the machine does not reset the store when popping the call stack. Then, CESK machine for RefLang has the following machine state:
(define-type State (state [C : RefLang] [E : Env] [S : (Vectorof Number)] [K : (Listof Frame)]))
C
E
S
K
Guard
C’
E’
S’
K’
Final?
(boxE e)
E
S
K
e
E
S
(cons (boxF E) K)
N
(numE n)
E1
S
(cons (boxF E) K)
l fresh loc
(numE l)
E
S[l↦n]
K
N
(unboxE e)
E
S
K
e
E
S
(cons (unboxF E) K)
N
(numE n)
E1
S
(cons (unboxF E) K)
v = S[n]
(numE v)
E
S
K
N
(identE x)
E
S
K
E[x]=n
(numE n)
E
S
K
N
(letE x e1 e2)
E
S
K
e1
S
E
(cons (let1E E x e2))
N
(numE n)
E1
S
(cons (let1E E x e2) K)
e2
E[x↦n]
S
K
N
(addE n1 n2)
E
S
K
(numE (+ n1 n2))
E
S
K
N
(addE e1 e2)
E
S
K
e1 not num
e1
E
S
(cons (add1E E e2))
N
(numE n)
E1
S
(cons (add1E E e2) K)
(addE n e2)
E
S
K
N
(addE n e2)
E
S
K
e2 not num
e2
E
S
(cons (add2E E n))
N
(numE n2)
E1
S
(cons (add2E E n1) K)
(addE n1 n2)
E
S
K
N
Figure 32: A tabular representation of a CESK machine for RefLang.
A full Plait implementation of the CESK machine for the above language is available here.
19.2 Safety
If you load cesk.rkt into your Plait environment, you’ll see that you can use the machine to run programs, for instance:
> (run-machine-init (letE 'x (boxE (numE 10)) (unboxE (identE 'x)))) (numE 10)
This looks like it works fine, but the machine as-written will let us run some rather odd programs. For instance, it will happily evaluate the following program:
> (run-machine-init (unboxE (numE 20))) (numE 0)
What happened? If we look at the semantics of the CESK machine, we see that the instruction (unboxE (numE 20)) simply looks up the value of whatever is at location 20. Nothing has been written to this location yet, so it runs to the value 0, since the store is initialized to be all zero. We have referenced a memory location before any value is put into it: this is called a use-before-initialization, and, if your programming language lets you do this, it is considered to be a bug.
More generally, a memory safety error is an error that results from treating a number as if it were a valid location when it is not. Use before initialization is an example of a memory error. In our tiny language this is the only kind of memory safety error. Other languages, like C and C++, permit memory unsafe behavior and have other kinds of exciting memory safety errors like use-after-free errors.
19.3 Runtime Safety
Today’s languages have slowly migrated away from permitting memory unsafety. There are two ways to prevent memory safety errors: statically and dynamically. The static approach leans on type systems and compile-time checks to eliminate memory safety problems. Here we will explore how to dynamically ensure memory safety at runtime by distinguishing numbers from locations.
In particular, we will change the semantics of our machine so that (1) only numbers can be added to each other; and (2) only locations can be unboxed; and (3) boxE produces locations. Together, these constraints ensure that we can only ever use locations created by boxE.
To achieve this we need to modify the syntax of our control register to distinguish locations from numbers:
(define-type SafeRefLang (letE [id : Symbol] [e1 : SafeRefLang] [e2 : SafeRefLang]) (identE [id : Symbol]) (boxE [e : SafeRefLang]) (addE [e1 : SafeRefLang]) (unboxE [e : SafeRefLang]) (numE [n : Number]) (locE [l : Number]))
Importantly, programmers are not allowed to write arbitrary locations into their programs. We can think of SafeRefLang as an intermediate program representation (IR): it is not abstract surface syntax that the programmer would program directly in, but rather, another language that facilitates evaluation for our machine. This is similar to the implementation of the CEK machine where we introduced closures into our control syntax.
A full Plait implementation of the CESK machine that implements runtime safety for the above language is available here. Here is a description of the new machine. Note that, if a machine is not in an acceping state and there are no valid transitions, then the machine is said to be stuck and in an error state.
We ensure boxing produces locations and unbox requires locations.
We tag each value in the store with its type. We define NUMT and LOCT to be special numbers that denote av alue being an integer or location, respectively.
During unboxing, when a value is loaded from the store, a location or number is produced depending on its type.
C
E
S
K
Guard
C’
E’
S’
K’
Final?
(locE l)
E1
S
(cons (unboxF E) K)
S[l]=INTT, S[l+1]=n
(numE n)
E
S
K
N
(locE l)
E1
S
(cons (unboxF E) K)
S[l]=LOCT, S[l+1]=l
(locE l)
E
S
K
N
(locE l1)
E1
S
(cons (boxF E) K)
l,l+1 fresh
(locE l)
E
S[l↦LOCT,l+1↦l1]
K
N
(numE n)
E1
S
(cons (boxF E) K)
l,l+1 fresh
(locE l)
E
S[l↦LOCT,l+1↦l1]
K
N
Figure 33: A tabular representation of a CESK machine for SafeRefLang. Only the changed rules are included relative to the state machine of RefLang.
Note that this new machine does not permit some safe operations, such as the adding of a valid offset to a location. So, we’ve given up some expressivity here for the sake of runtime safety. This process of tagging is widely used in programming. We will see next lecture how it’s used for implementing garbage collection.