21 Probabilistic Programming

8.15

21 Probabilistic Programming🔗

Today we see some advanced topics in programming languages that go beyond what programming languages that you’ve probably seen before do. We will discuss probabilistic programming languages: programming languages that let us define and manipulate probability distributions. We will see their semantics, describe how to implement them, and see a taste of what kinds of problems they can be useful for. Probabilistic programming languages are my primary research area, so I spend a lot of time thinking about these things.

21.1 TinyPPL syntax and semantics🔗

Probabilistic programming languages (PPLs) are programming languages whose semantics are probability distributions. Their defining feature is the ability to introduce and manipulate first-class probabilistic uncertainty. Our little probabilistic programming language will be called TinyPPL (read "tiny people").

TinyPPL programs consist of two different abstract syntax datatypes working together. The first are the pure expressions: these expressions do not manipulate probability distributions and look IfLang. The second set of expressions are the effectful probabilistic expressions: these expressions introduce and manipulate probabilistic quantities. First let’s see the pure expressions:

> (define-type PureExpr
    (identE [id : Symbol])
    (trueE)
    (falseE)
    (ifE [g : PureExpr] [thn : PureExpr] [els : PureExpr]))

We will give a normal non-probabilistic semantics to these pure expressions:

> (define (interp-pure env e)
    (type-case PureExpr e
               [(identE id)
                (type-case (Optionof Boolean) (hash-ref env id)
                           [(some v) v]
                           [(none) (error 'runtime "unbound ident")])]
               [(trueE) #t]
               [(falseE) #f]
               [(ifE g thn els)
                (if (interp-pure env g)
                    (interp-pure env thn)
                    (interp-pure env els))]))

Next, the probabilistic expressions:

> (define-type ProbExpr
    (flipE [prob : Number])
    (bindE [id : Symbol] [e1 : ProbExpr] [e2 : ProbExpr])
    (pureE [e : PureExpr]))

Our goal is to give these programs a probabilistic semantics, so we will have an interpreter with the following type:

(interp-prob : ((Hashof Symbol Boolean) ProbExpr Boolean -> Number))
(define (interp-prob env e v)
...)

This interpreter will will take 3 arguments – an environment, a ProbExpr expression, and a Boolean value – and it returns a number that represents the probability that the program runs to that particular value. Now we can give the desired semantics for each expression:

(flipE p) represents a probabilistic Boolean value that is #t with probability p and #f with probability 1-p. You can think of flip as "flipping a coin".
(bindE id e1 e2) binds the result of the probabilistic expression e1 to the identifier id and then evaluates e2 with x in scope. It behaves like a "probabilistic let-expression". The identifier x is a Boolean.
(pureE e) lifts a non-probabilistic expression e into a probabilistic one. For instance, (pureE (trueE)) represents a probability distribution that is true with probability 1 and false with probability 0.

Let’s see a few programs to get a feel for how TinyPPL behaves before we implement an interpreter for it.

(bind 'x (flip 1/2)

(bind 'y (flip 1/2)

(if )))

Now we are ready to make and test our interpreter:

> (interp-prob : ((Hashof Symbol Boolean) ProbExpr Boolean -> Number))
> (define (interp-prob env e v)
    (type-case ProbExpr e
               [(flipE prob)
                (if v prob (- 1 prob))]
               [(bindE x e1 e2)
                (let [(true-env (hash-set env x #t))
                      (false-env (hash-set env x #f))
                      (prob-e1-t (interp-prob env e1 #t))
                      (prob-e1-f (interp-prob env e1 #f))]
                  (+ (* prob-e1-t (interp-prob true-env e2 v))
                     (* prob-e1-f (interp-prob false-env e2 v))))]
               [(pureE e)
                (if (equal? (interp-pure env e) v)
                    1
                    0)]))

If you want to learn more about PPLs, I gave a four part lecture series on the topic at a recent summer school, with recordings available.

contents ← prev up next →

1	Introduction
2	Interpreting Programs
3	Scope and Substitution
4	The Lambda Calculus
5	Compilation
6	Formalism
7	Types
8	Types II: Typing Formalism
9	Types III: Type Soundness
10	Parametric Polymorphism
11	Compiling to System-F
12	Subtyping
13	Control
14	Implementing Exceptions
15	Control Formalism
16	Control Formalism II
17	Abstract Machines
18	Abstract Machines II
19	Memory Safety
20	Memory Management
21	Probabilistic Programming
22	Managing Effects