- Quiz on Thursday, released at about 9AM, due Friday at 11:59PM
- Will have a “programming in OCaml” question.
- There will be some multiple choice questions on garbage collection: understand how to identify roots and perform mark-and-sweep garbage collection.
- There will be a question on dynamic safety.
- There will be “basic programming in Rust” questions that are similar to the in-class exercises we will have over the next 2 days.
- Be familiar with the mini-languages we’ve covered (micro-C, micro-ASM)
- Outcomes for this Rust mini-module:
- Become familiar with the basic syntax and semantics of Rust programs
- See how Rust can prevent memory errors through ownership
- Continue to build on the course theme of studying programming languages by learning to use new languages. We will see how many of the language design features we have studied so far have rhymes and echoes in Rust.
- Our goal is to be able to implement a small calculator language in Rust and understand its core memory management features.
- Outcomes for today:
- Be comfortable with basic Rust programming (variables, functions, references).
- Rust has exceptionally high-quality learning resources:
- The Rust Book is an excellent general resource on learning Rust.
- The Rust Playground will be used for running all code.
- This repository has a number of good exercises for learning Rust. We will work through some of these in class together.
- The Rust documentation is the complete documentation for the language and is authoritative (but quite hard to read as a newcomer!)
- Rust by Example is another very useful tool for learning Rust.
- There are a number of very interesting Rust features that we will intentionally not be covering, including traits, modules, concurrency, explicit lifetimes, and more. You are encouraged (but not required) to explore these topics using the resources we’ve provided!
A First Taste of Rust
- Our goal is not to become experts in Rust: that is a large topic which would take more than 2 lectures. Our goal is to get a taste of Rust: see its syntax, understand some of its unique features, and gain awareness of when and why one would want to program in Rust.
- Broaden horizons: see a bigger spectrum of language design
- Continue to build skill in quickly picking up a language and working within it.
- We will mostly make very simple Rust programs and avoid using most of the language. The tricky part will often be making the programs compile!
- Rust is essentially (micro-)C combined with OCaml, so you have already seen quite a few examples of code that looks a lot like Rust
- We will build on these parallels in order to build intuition and quickly explore Rust
- Rust is a large language and we will only see a small slice of it (the part that overlaps with OCaml and memory management).
- Rust has both an imperative (statement-oriented) and a functional (expression-oriented) flavor: you will see aspects of this in its syntax
- A very brief history of Rust:
- Rust was started as a hobby project in 2006 and officially adopted as a language by Mozilla in 2009.
- As a language, Rust is influenced by several languages:
- OCaml (the first implementation of Rust was made in OCaml; you can browse it at this commit)
- C/C++
- Cyclone
- Several other more minor influences (Ruby, Haskell, Erlang)
- Rust makes use of the LLVM compiler infrastructure
- Rust is named after a fungus
- The first stable release of the language was in 2015
- Rust has had an unusually fast adoption rate: see here for some quantitative evidence of this. It is being used within the Linux kernel and the Firefox web browser, among other projects.
- Rust programs have a `main` function that is executed first whenever a program is run:
fn main() {
// single line comments look like this
/* multi-line comments
look like this */
println!("hello world!");
}
- The `println!` macro prints text to the console and is called much like a function.
- Like C, Java, and JavaScript, functions in Rust are wrapped in curly braces `{` and `}`.
- Local variables are declared using `let`:
fn main() {
let x = 5;
// multiple statements can be sequenced together in Rust using a semicolon,
// similar to C-like languages
// the println! string can include *formatting strings* that are replaced by arguments
// here, the format string {} is replaced by the variable x
println!("hello {}", x);
}
- Unlike any language we have studied so far, in Rust, curly braces play a critical role in determining scope:
fn main() {
let x = 5;
{
// define y in this *inner scope*
let y = 10;
}
// y no longer in scope here
println!("hello {} {}", x, y); // fails to compile: y not in scope
}
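- One way to make a program like the one above compile is to finish all work that needs `y` while `y` is still in scope. The sketch below also shows that a block in curly braces is itself an expression, which ties the scoping rule to Rust's expression-oriented flavor. The function name `scoped_total` is made up for this illustration.

```rust
// Hypothetical helper (not from the notes): computes a value using an inner scope.
fn scoped_total() -> i64 {
    let x = 5;
    // A block is an expression: it evaluates to its final expression,
    // which is written without a trailing semicolon.
    let total = {
        let y = 10; // y exists only inside this block
        x + y
    };
    // y is out of scope here, but the value computed from it lives on in total
    total
}

fn main() {
    println!("total: {}", scoped_total()); // prints "total: 15"
}
```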
- Calling and defining functions with arguments looks like C:
- See here for documentation on functions.
- Functions are not curried by default
- Values are returned from functions via a `return` statement (though Rust also supports a functional-style return syntax)
// all arguments must be annotated with their type and separated by commas
// the type `i64` refers to "signed int 64": it is a 64-bit integer. All integer types
// in Rust are annotated by their bitwidth.
fn add_args(x: i64, y: i64) -> i64 {
return x + y;
}
fn add_functional(x: i64, y: i64) -> i64 {
// this functional-style return also works
// be careful: no semicolon!
x + y
}
fn main() {
// prints 30
println!("addition: {}", add_args(10, 20));
}
- Rust supports basic datatypes and operations on them like strings, Booleans, ints, and floats (see here):
fn main() {
// Booleans are true and false
let y = true;
// if you wish, you can annotate the *type* of a let-binding
let z : bool = false;
// Rust supports if-expressions, whose branches return values
let my_v = if true { 10 } else { 25 };
// my_v has value 10
// Rust also supports if-statements
if true {
// this branch executes
println!("then branch");
} else {
println!("else branch");
}
// Rust supports standard operations on ints and Booleans
let my_sum = 10 + 25;
let my_compare = 25 < 30; // standard comparison operations like ==, <, >, <=
let bool_ops = (true && false) || (true) && (!true); // && is 'and', || is 'or', ! is 'not'
let more_exp = (25 == 30) || (10 < 20);
// Rust supports floats, and similar to ints you can specify the bit-width
let my_64bit_float : f64 = 0.5;
let my_32bit_float : f32 = 0.5;
// Rust is a low-level language, so it distinguishes all numeric types by bit-width
// an important numeric type is the `usize` type, which behaves like an unsigned
// integer and is used for things like array lookups:
let x : usize = 10;
// in contrast with OCaml, Rust supports mutable updates without using a `ref` cell
// mutable variables are declared using the `let mut` syntax:
let mut x = 5;
x = x + 5;
// x now has the value 10
}
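- The comment about `usize` can be made concrete with a small array lookup. This is a minimal sketch; the helper `element_at` and the array contents are invented for the example, and arrays themselves are not otherwise covered in these notes.

```rust
// Hypothetical helper (not from the notes): looks up an element by position.
fn element_at(index: usize) -> i64 {
    let values: [i64; 3] = [10, 20, 30];
    values[index]
}

fn main() {
    let i: usize = 1;
    println!("values[{}] = {}", i, element_at(i)); // prints "values[1] = 20"

    // Numeric types are never mixed implicitly: an i64 must be converted
    // explicitly (here with `as`) before it can be used as an index.
    let j: i64 = 2;
    println!("values[{}] = {}", j, element_at(j as usize)); // prints "values[2] = 30"
}
```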
Exercise 1: Basic Variables
Modify the following code so that it compiles, and explain briefly why the first version does not compile:
fn main() {
let x;
if x == 10 {
println!("x is ten!");
} else {
println!("x is not ten!");
}
}
Exercise 2: Basic Function Calls
Modify the following code so that it compiles, and briefly explain why the first version does not compile:
fn main() {
let original_price = 51;
println!("Your sale price is {}", sale_price(original_price));
}
fn sale_price(price: i32) -> {
if is_even(price) {
price - 10
} else {
price - 3
}
}
fn is_even(num: i32) -> bool {
num % 2 == 0
}
Testing
- Rust has built-in support for testing which is very convenient (see here for many details, probably too many details):
// annotate a function with #[test]
#[test]
fn my_test() {
// assert_eq! asserts that the two arguments are equal to each other
assert_eq!(1, 2); // this will fail
}
#[test]
fn my_test2() {
assert!(true); // this will pass
}
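- In practice a test usually calls a function defined elsewhere in the same file. The sketch below assumes a made-up function `double` and shows two tests that exercise it.

```rust
// Hypothetical function under test (not from the notes).
fn double(n: i64) -> i64 {
    n * 2
}

fn main() {
    println!("double(21) = {}", double(21)); // prints "double(21) = 42"
}

#[test]
fn double_small_numbers() {
    // passes only if both sides are equal
    assert_eq!(double(0), 0);
    assert_eq!(double(21), 42);
}

#[test]
fn double_negative_number() {
    // assert! accepts any Boolean expression
    assert!(double(-4) < 0);
}
```

- Functions marked `#[test]` are skipped during a normal run; they execute when the program is run in test mode (for example with `cargo test` on a local installation).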
Exercise 3: Programming a factorial function
Fill in the following code to implement a function that computes the factorial (recall that the factorial is defined as fact n = 1 if n = 0, and n * fact (n - 1) otherwise):
fn main() {
println!("factorial: {}", factorial(3));
}
fn factorial(n: i64) -> i64 {
}
// Rust has the following built-in feature for testing your code
#[test]
fn test_factorial() {
// enter your test here
}
Tuples, structs, and enums
- Rust has support for sum types, pairs, and product types, just like OCaml and Plait
- Rust has support for structs, which are basically OCaml records (see here):
struct Rectangle {
width: u32,
height: u32,
}
fn main() {
let rect1 = Rectangle {
width: 30,
height: 50,
};
println!(
"The area of the rectangle is {} square pixels.",
area(rect1)
);
}
fn area(rectangle: Rectangle) -> u32 {
rectangle.width * rectangle.height
}
- Rust has support for product types using tuples:
fn main() {
let x : (usize, bool) = (10, true);
// the Rust-way to destruct pairs is to use the pattern-matching form of `let`:
let (x_fst, x_snd) = x;
println!("fst: {}, snd: {}", x_fst, x_snd);
}
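- Tuples are also a convenient way to return more than one value from a function. The following is a small sketch; `div_rem` is a hypothetical helper written for this example.

```rust
// Hypothetical helper (not from the notes): returns a quotient and a remainder together.
fn div_rem(n: u32, d: u32) -> (u32, u32) {
    (n / d, n % d)
}

fn main() {
    // destructure the returned pair with the pattern-matching form of `let`
    let (q, r) = div_rem(17, 5);
    println!("quotient: {}, remainder: {}", q, r); // prints "quotient: 3, remainder: 2"

    // components can also be read by position with .0, .1, ...
    let pair = div_rem(9, 3);
    println!("{} {}", pair.0, pair.1); // prints "3 0"
}
```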
- Rust supports sum-types using enums and pattern matching in a manner very similar to OCaml:
enum Animal {
Tiger { stripes: usize, hungry: bool },
Snake { weight: usize, hungry: bool }
}
fn is_hungry(a : Animal) -> bool {
// in Rust, all enums are placed behind a *namespace*. Rust uses the `::` syntax
// to refer to a name behind a namespace (in OCaml, we use a single `.` instead of the `::` syntax.
// For instance we said "String.length" to get the length of a string.)
match a {
Animal::Tiger { stripes, hungry } => hungry,
Animal::Snake { weight, hungry } => hungry
}
}
fn main() {
println!("hungry? {}", is_hungry(Animal::Tiger { stripes: 10, hungry: true }))
}
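- Enum variants can also carry unnamed (positional) data, which is the closest match to OCaml's `of` syntax. The sketch below uses an invented `Shape` type; it is an illustration, not part of the course code.

```rust
// Hypothetical example type (not from the notes).
// In OCaml this would read: type shape = Square of int | Rect of int * int
enum Shape {
    Square(u32),
    Rect(u32, u32),
}

fn area(s: Shape) -> u32 {
    // each arm binds the variant's positional fields to fresh names
    match s {
        Shape::Square(side) => side * side,
        Shape::Rect(width, height) => width * height,
    }
}

fn main() {
    println!("square: {}", area(Shape::Square(4))); // prints "square: 16"
    println!("rect: {}", area(Shape::Rect(3, 5))); // prints "rect: 15"
}
```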
Enum exercise
Fill in the following code so that the tests pass:
enum Animal {
Tiger { stripes: usize, hungry: bool },
Snake { weight: usize, hungry: bool }
}
fn is_hungry(a : Animal) -> bool {
match a {
Animal::Tiger { stripes, hungry } => hungry,
Animal::Snake { weight, hungry } => hungry
}
}
// given an animal a, return a new animal that is fed (i.e., hungry = false)
fn feed(a : Animal) -> Animal {
// implement me
}
#[test]
fn test_feed() {
let hungry_tiger = Animal::Tiger { stripes: 2, hungry: true};
assert_eq!(is_hungry(feed(hungry_tiger)), false);
}
Memory management
- We would like to make a small interpreter for a calculator language in Rust. How do we do that?
- Let’s try to translate the following tiny OCaml AST into Rust:
type calc =
| Num of int
| Add of calc * calc
- Let’s try the straightforward translation:
enum Calc {
Num(i64),
Add(Calc, Calc)
}
- If we try to run this, we get the following (amazingly helpful) error message:
Compiling playground v0.0.1 (/playground)
error[E0072]: recursive type `Calc` has infinite size
--> src/lib.rs:1:1
|
1 | enum Calc {
| ^^^^^^^^^
2 | Num(i64),
3 | Add(Calc, Calc)
| ---- recursive without indirection
|
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
|
3 | Add(Box<Calc>, Calc)
| ++++ +
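- Rust says that `Calc` has infinite size! What does this error message mean?
- In Rust, every struct and enum must have a known size at compile-time. This is similar to C and C++. A `Calc` stored directly inside another `Calc` would need room for yet another `Calc`, and so on without end.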
- To get around this, we will follow the advice of the compile error and introduce dynamic allocation in the form of a box (note: Rust boxes are slightly different than OCaml references and Plait boxes; we will discuss this in more detail shortly):
enum Calc {
Num(i64),
Add(Box<Calc>, Box<Calc>)
}
- Note: Whereas in OCaml we would write `calc ref`, in Rust we instead write `Box<Calc>`
- Important Question: Why does this enum now have a constant size (i.e., known at compile time)?
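- One way to explore the question above is to ask the compiler for the sizes directly. This sketch uses the standard library function `std::mem::size_of`; the exact numbers printed depend on the platform and compiler version, so none are shown here.

```rust
use std::mem::size_of;

#[allow(dead_code)] // the variants are never constructed in this sketch
enum Calc {
    Num(i64),
    Add(Box<Calc>, Box<Calc>),
}

fn main() {
    // A Box<Calc> is a single pointer to a heap allocation, so its size does not
    // depend on how large the expression tree behind it is.
    println!("Box<Calc>: {} bytes", size_of::<Box<Calc>>());
    println!("pointer:   {} bytes", size_of::<usize>());
    // The enum only has to fit its largest variant (two pointers) plus any tag.
    println!("Calc:      {} bytes", size_of::<Calc>());
}
```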
- Now we can make a tiny calculator interpreter in Rust:
// this #[derive(Debug)] mumbo-jumbo makes it so that a data structure can be
// easily printed in Rust. Ignore it for now.
#[derive(Debug)]
enum Calc {
Num(i64),
Add(Box<Calc>, Box<Calc>)
}
fn interp(c : Calc) -> i64 {
match c {
Calc::Num(n) => n,
Calc::Add(l, r) => {
let l_v = interp(*l); // Rust uses the `*` operator to unbox / dereference a dynamically allocated value
let r_v = interp(*r);
return l_v + r_v
}
}
}
fn main() {
let my_calc = Calc::Num(10);
// we use the {:?} to trigger a debug print
println!("simple interp: {:?}", interp(my_calc));
let my_calc_2 = Calc::Add(Box::new(Calc::Num(10)), Box::new(Calc::Num(20)));
println!("simple interp 2: {:?}", interp(my_calc_2));
}
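- Deeper expression trees work the same way, and the testing support from earlier can check the interpreter. This is a sketch under assumptions: the `add` constructor helper is hypothetical and exists only to hide the repeated `Box::new` calls, and `interp` is written with expression-style returns.

```rust
enum Calc {
    Num(i64),
    Add(Box<Calc>, Box<Calc>),
}

fn interp(c: Calc) -> i64 {
    match c {
        Calc::Num(n) => n,
        Calc::Add(l, r) => interp(*l) + interp(*r),
    }
}

// Hypothetical helper (not from the notes): boxes both operands of an addition.
fn add(l: Calc, r: Calc) -> Calc {
    Calc::Add(Box::new(l), Box::new(r))
}

fn main() {
    // represents (1 + 2) + 3
    let nested = add(add(Calc::Num(1), Calc::Num(2)), Calc::Num(3));
    println!("nested interp: {}", interp(nested)); // prints "nested interp: 6"
}

#[test]
fn interp_nested_add() {
    // represents 10 + (20 + 30)
    let e = add(Calc::Num(10), add(Calc::Num(20), Calc::Num(30)));
    assert_eq!(interp(e), 60);
}
```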
Rust memory management
- Let’s look at the above situation again and try to write a slightly different program:
// this #[derive(Debug)] mumbo-jumbo makes it so that a data structure can be
// easily printed in Rust. Ignore it for now.
#[derive(Debug)]
enum Calc {
Num(i64),
Add(Box<Calc>, Box<Calc>)
}
fn interp(c : Calc) -> i64 {
match c {
Calc::Num(n) => n,
Calc::Add(l, r) => {
let l_v = interp(*l); // Rust uses the `*` operator to unbox / dereference a dynamically allocated value
let r_v = interp(*r);
return l_v + r_v
}
}
}
fn main() {
let my_calc = Calc::Num(10);
// we use the {:?} to trigger a debug print
println!("simple interp: {:?}", interp(my_calc));
// let's try printing the same thing again!
println!("simple interp: {:?}", interp(my_calc));
let my_calc_2 = Calc::Add(Box::new(Calc::Num(10)), Box::new(Calc::Num(20)));
println!("simple interp 2: {:?}", interp(my_calc_2));
}
- Quite surprisingly, this raises a compile error!
Compiling playground v0.0.1 (/playground)
error[E0382]: use of moved value: `my_calc`
--> src/main.rs:26:44
|
21 | let my_calc = Calc::Num(10);
| ------- move occurs because `my_calc` has type `Calc`, which does not implement the `Copy` trait
22 | // we use the {:?} to trigger a debug print
23 | println!("simple interp: {:?}", interp(my_calc));
| ------- value moved here
...
26 | println!("simple interp: {:?}", interp(my_calc));
| ^^^^^^^ value used here after move
|
note: consider changing this parameter type in function `interp` to borrow instead if owning the value isn't necessary
--> src/main.rs:9:15
|
9 | fn interp(c : Calc) -> i64 {
| ------ ^^^^ this parameter takes ownership of the value
| |
| in this function
- Again, Rust provides a very interesting and informative error message here: it suggests that we “change the parameter in `interp` to borrow if owning the value isn’t necessary”. What on earth does that mean? The answer has to do with how Rust performs automatic safe memory management without garbage collection.
- How does Rust know when to collect heap-allocated values like those created by `Box::new`?
- There are three rules (see here):
- Each value in Rust is bound to a particular name called an owner.
- There can only be one owner for a particular value at a time, and the owner has special privileges for interacting with the data (which we will see).
- When the owner goes out of scope, the value will be freed.
- Let’s see what this means by inspecting a smaller example. Consider the following snippet:
fn main() {
let x = Box::new(10);
println!("my boxed value: {}", *x); // prints 10
// x is freed at the end of the function
}
- In the above example `x` is the owner of the heap-allocated boxed value `10`.
- The key to understanding ownership is knowing where memory is allocated and collected.
Move semantics
- Suppose we want to refer to the same data with more than 1 name. We need to do this all the time, for example when calling functions. What are our options?
- Copy the data. This can be expensive, especially for large data-structures.
- Pass a location. This sounds great, but then whose responsibility will it be to collect the memory? Should it be the function that is called, or will it be the responsibility of the caller?
- Rust allows us to decide which of the above two strategies to use, since it is a highly performance-oriented language.
- Passing the responsibility to collect is called moving: you move the owner from one name to another.
- Once you move a value, you can no longer use it with the old name. This makes sense; the allocated memory may have been collected.
- See how the compiler raises an error for the following code, and think about where memory is allocated and freed:
fn my_func(y : Box<i64>) -> () {
println!("gobble! {}", *y);
// at this point y is freed
}
fn main() {
let x = Box::new(10);
// pass ownership of x to y in my_func
my_func(x);
println!("my boxed value: {}", *x); // we tried to access x, but it was freed by `my_func`! This is a memory error.
}
- If we try to run this program, Rust will complain and fail to compile with a very helpful message. Why did Rust complain? Let’s break down what happened:
- Initially `x` owns `Box::new(10)`
- Then, when we call `my_func`, we move the ownership from `x` to `y`. Now, the memory is collected when the new owner `y` goes out of scope.
- After `my_func` returns, back in `main`, we try to use `x` again. This raises an error, since `x` refers to a location that has been collected.
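- Moves are not specific to function calls: binding a box to a new name moves ownership in the same way. The sketch below shows a version that does compile because each value is only used through its current owner; the helper `unbox` is made up for this illustration.

```rust
// Hypothetical helper (not from the notes): takes ownership of a box and
// returns the number inside it; the box is freed when `b` goes out of scope.
fn unbox(b: Box<i64>) -> i64 {
    *b
}

fn main() {
    let a = Box::new(10);
    let b = a; // ownership moves from a to b; the heap value is not copied
    // println!("{}", *a); // would not compile: use of moved value `a`
    println!("b holds {}", *b); // prints "b holds 10"

    let n = unbox(b); // ownership moves again, into the parameter of unbox
    // b can no longer be used here, but the plain integer n can
    println!("n = {}", n); // prints "n = 10"
}
```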
- Suppose we want to keep `x` around after calling `my_func`. There are two things we can do:
- Make a copy of the heap-allocated value and pass that as an argument to `my_func`.
- Retain ownership of the location.
- Let’s see both of these approaches. First, approach 1:
fn my_func(y : Box<i64>) -> () {
println!("gobble! {}", *y);
// at this point y is freed
}
fn main() {
let x = Box::new(10);
let x2 = Box::new(*x); // create a copy of the value
// pass ownership of x2 (the copy) to y in my_func
my_func(x2);
println!("my boxed value: {}", *x);
// at this point x is freed
}
- The above code compiles successfully! But, there’s an annoying performance problem: it performs two allocations and frees, when really only 1 is necessary.