• Goals for today:
    • Become familiar with the essential notions of memory management in Rust: moving, borrowing, and ownership
    • Be able to identify where memory in a Rust program is collected
    • See how to perform unsafe behavior in Rust
  • Rust design adages:
    • Rust prefers performance over convenience

Move semantics

  • Suppose we want to refer to the same data with more than 1 name. We need to do this all the time, for example when calling functions. What are our options?
    1. Copy the data. This can be expensive, especially for large data-structures.
    2. Pass a location. This sounds great, but then whose responsibility will it be to collect the memory? Should it be the function that is called, or will it be the responsibility of the caller?
  • Rust allows us to decide which of the above two strategies to use, since it is a highly performance-oriented language.
  • Passing the responsibility to collect is called moving: you move ownership from one name to another.
    • Once you move a value, you can no longer use it under the old name. This makes sense: the new owner may already have collected the allocated memory.
    • Rule: Whenever a value is bound to a new name, it is moved. (The exception is types that implement Copy, such as i64, which are copied instead.)
  • See how the compiler raises an error for the following code, and think about where memory is allocated and freed:
fn my_func(y : Box<i64>) -> () {
    println!("gobble! {}", *y);
    // at this point y is freed
}

fn main() {
    let x = Box::new(10);
    // pass ownership of x to y in my_func
    my_func(x);
    println!("my boxed value: {}", *x);  
    // error: x was moved into `my_func`, which has already freed it; using x here would be a use-after-free
}
  • If we try to compile this program, Rust rejects it with a very helpful error message. Why did Rust complain? Let’s break down what happened:
    • Initially x owns Box::new(10)
    • Then, when we call my_func, we move the ownership from x to y. Now, the memory is collected when the new owner y goes out of scope.
    • After my_func returns, back in main, we try to use x again. This raises a compile-time error: x no longer owns the value, and the location it referred to has already been collected.
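  • One way to observe where a moved value is collected is to give it a destructor that records when it runs. The sketch below is an illustration, not part of the example above: the names Noisy, Log, consume, and trace_move are hypothetical, and Rc<RefCell<...>> is used only so that the destructor can append to a shared log.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared event log, used only for this illustration.
type Log = Rc<RefCell<Vec<String>>>;

// A value whose destructor records when it is freed.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Takes ownership of the box, so the box is freed when `y` goes out of scope.
fn consume(y: Box<Noisy>) {
    y.log.borrow_mut().push(format!("consume sees {}", y.name));
    // `y` is dropped here, at the end of the function body
}

// Moves a boxed value into `consume` and returns the recorded events.
fn trace_move() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let x = Box::new(Noisy { name: "x", log: Rc::clone(&log) });
    consume(x); // ownership moves from `x` to `y`
    log.borrow_mut().push("back in caller".to_string());
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in trace_move() {
        println!("{}", event);
    }
}
```

  • The "drop x" event is recorded before "back in caller", which matches the breakdown above: the new owner y frees the memory when my_func-style code returns, not the original name x.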
  • Suppose we want to keep x around after calling my_func. There are two things we can do:
    1. Make a copy of the heap-allocated value and pass that as an argument to my_func.
    2. Retain ownership of the location.
  • Let’s see both of these approaches. First, approach 1:
fn my_func(y : Box<i64>) -> () {
    println!("gobble! {}", *y);
    // at this point y is freed
}

fn main() {
    let x = Box::new(10);
    let x2 = Box::new(*x); // create a copy of the value
    // pass ownership of x2 to y in my_func; x still owns its own allocation
    my_func(x2);
    println!("my boxed value: {}", *x);
    // at this point x is freed
}
  • The above code compiles successfully! But there’s an annoying performance problem: it performs two allocations and two frees, when really only one of each is necessary.
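  • A more idiomatic way to write the same copy is the clone method, which for a Box allocates a second box holding a copy of the contents. The sketch below assumes a variant of my_func that returns the value it sees (so that the result can be checked); copy_then_move is a hypothetical helper name.

```rust
// Variant of my_func that returns the value it was given (an assumption for
// this sketch, so the caller can inspect the result).
fn my_func(y: Box<i64>) -> i64 {
    *y
    // `y` is freed here
}

// Returns (value still owned by x, value seen by my_func, whether the two
// boxes lived at different heap addresses).
fn copy_then_move() -> (i64, i64, bool) {
    let x: Box<i64> = Box::new(10);
    let x2 = x.clone(); // second heap allocation
    let distinct = !std::ptr::eq(&*x, &*x2);
    let seen = my_func(x2); // x2 is moved and freed inside my_func
    (*x, seen, distinct) // x is still usable; it is freed when this function returns
}

fn main() {
    let (kept, seen, distinct) = copy_then_move();
    println!("kept {}, callee saw {}, separate allocations: {}", kept, seen, distinct);
}
```

  • The two boxes are separate allocations, which is exactly the cost this approach pays in order to keep x usable.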

Borrowing

  • We would like to avoid the unnecessary allocation in the above example. How can we do this?
  • This leads us to one of the main innovations of Rust: borrowing
  • Borrowing lets you preserve the original owner while having access to the value that it owns
  • How does it work? We need to change the signature for my_func and write our code slightly differently:
fn my_func(y : &i64) -> () {
    println!("gobble! {}", *y);
    // at this point y is *not freed* since `y` does not own the location
}

fn main() {
    let x = Box::new(10);
    // this function call **does not** transfer ownership: x still owns the location allocated by Box::new(10)
    my_func(&x);
    println!("my boxed value: {}", *x);
    // at this point x is freed
}
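  • The type &i64 is a reference to an i64; think of it like a pointer. It can be dereferenced just like a Box. However, references do not pass ownership. (In the call my_func(&x), the expression &x has type &Box<i64>; Rust automatically converts it to the &i64 that my_func expects.)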
    • This means you can create multiple references to the same value
  • How do we know that the above code is safe? We know that the borrow does not outlive the owner’s scope.
    • The Rust compiler performs fairly sophisticated analysis to enforce this lifetime requirement
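  • As a small illustration of the points above, the following sketch takes several shared references to the same boxed value and reads through all of them while the owner stays usable. The names read and shared_borrows are hypothetical.

```rust
// Reads through a shared reference without taking ownership.
fn read(y: &i64) -> i64 {
    *y
}

// Creates several references to the same boxed value and reads through each.
fn shared_borrows() -> i64 {
    let x: Box<i64> = Box::new(10);
    let r1: &i64 = &x;  // &Box<i64> is converted to &i64 automatically
    let r2: &i64 = &*x; // the same conversion, written out by hand
    let total = read(r1) + read(r2) + read(&x);
    total + *x // the owner is still valid after all of the borrows
    // x is freed here, after every reference has gone out of use
}

fn main() {
    println!("sum of reads: {}", shared_borrows());
}
```

  • None of the references frees anything: the single allocation is collected only once, when the owner x goes out of scope.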
  • What is an example where a reference does not live long enough? Here is one:
fn broken(x: i64) -> &i64 {
    return &x;
}
  • If we try to compile this, we will get a slightly more cryptic compiler error message, but it contains the following:
this function's return type contains a borrowed value, but there is no value
for it to be borrowed from
  • To understand why this function is invalid, think about what the type tells us about whose job it is to collect the allocated memory
    • The name x owns the integer argument, so when x goes out of scope its memory is freed
    • This function is trying to return a reference to x, but according to the laws of ownership this memory is freed when the function returns.
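  • Two possible ways to repair a function like broken are sketched below; which one is appropriate depends on what the caller needs. The names fixed_owned and fixed_borrowed are hypothetical. The first returns an owned value, so the caller becomes responsible for collecting it; the second only hands back a reference that the caller already owned the target of.

```rust
// Option 1: return an owned value. Responsibility for freeing the box passes
// to the caller.
fn fixed_owned(x: i64) -> Box<i64> {
    Box::new(x)
}

// Option 2: borrow from the caller and return that same borrow. Nothing is
// allocated or freed here; the caller's value outlives the call.
fn fixed_borrowed(x: &i64) -> &i64 {
    x
}

fn main() {
    let owned = fixed_owned(7);
    let local = 9;
    let borrowed = fixed_borrowed(&local);
    println!("owned: {}, borrowed: {}", *owned, *borrowed);
    // `owned` is freed here; `local` lives on the stack of main
}
```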
  • Rust prevents you from transforming something that is borrowed into something that is owned. For instance, the following code will fail to compile for this reason:
fn gobble(x : Box<u64>) -> () {
    println!("nom {}", *x);
}

fn main() {
    let x = Box::new(10);
    let r1 = &x;
    gobble(*r1);
}
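  • If we really do need an owned Box while only holding a borrow, one option is to make a fresh copy through the reference instead of trying to move out of it. This is a sketch under the assumption that gobble returns the value it consumed (so the result can be checked); borrow_then_clone is a hypothetical name.

```rust
// Variant of gobble that returns what it consumed (an assumption for this sketch).
fn gobble(x: Box<u64>) -> u64 {
    *x
    // the box passed in is freed here
}

// Clones through a borrow instead of moving out of it.
fn borrow_then_clone() -> (u64, u64) {
    let x: Box<u64> = Box::new(10);
    let r1 = &x;
    // r1.clone() allocates a new Box<u64>; the original stays owned by x
    let eaten = gobble(r1.clone());
    (eaten, *x)
}

fn main() {
    let (eaten, kept) = borrow_then_clone();
    println!("gobbled {}, still own {}", eaten, kept);
}
```

  • This compiles because ownership of the original allocation never leaves x; the price is the extra allocation discussed in the move-semantics section.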

Exercises

  • In the following code, identify where the heap-allocated memory is allocated and freed:
fn gobble(z : Box<u64>) -> () {
    println!("nom {}", &z);
}

fn main() {
    let x = Box::new(10);
    let y = Box::new(20);
    gobble(y);
    gobble(x);
}
  • In the following code, identify where the heap-allocated memory is allocated and freed:
fn gobble(z : Box<i64>) -> () {
    println!("nom {}", &z);
}

fn gobble2(z : &i64) -> () {
    println!("nom {}", &z);
}

fn main() {
    let x = Box::new(10);
    let y = Box::new(20);
    gobble2(&x);
    gobble(x);
    gobble2(&y);
}
  • In the following code, identify where the heap-allocated memory is allocated and freed:
fn go_crazy(z : &u64) -> (&u64, &u64) {
    // notice: rust permits *multiple copies* of the same borrow
    (z, z)
}

fn gobble(x : &u64) -> () {
    println!("nom {}", *x);
}

fn main() {
    let x = Box::new(10);
    let (r1, r2) = go_crazy(&x);
    gobble(r1);
    gobble(r2);
}
  • Implement the fibonacci function by (1) moving arguments, and (2) borrowing arguments:
fn fib_copy(x: Box<u64>) -> u64 {

}

fn fib_borrow(x : &u64) -> u64 {

}
  • Try to write a program that has a memory safety error in Rust and report the error message that you get.

Unsafe code

  • Despite the best efforts of the borrow checker and Rust’s automated memory management capabilities, it is still sometimes necessary to circumvent them and manually manage memory
  • In particular, because each value in Rust must have exactly 1 owner, certain data structures (for instance, doubly-linked lists) are awkward to build using purely safe code: they typically need either shared-ownership types from the standard library or unsafe code
  • Rust provides an “escape hatch” to enable the programmer to write code with memory unsafe behavior that looks a lot like C (see here for more examples):
fn main() {
    unsafe {
        // create an arbitrary address
        let address = 0x012345usize;

        // cast that address to a pointer
        let r = address as *const i32;

        // try to dereference it!
        println!("oh god: {}", *r)
    }
}
  • What are some reasons why we might want unsafe behavior in Rust?
    • Low-level interaction with hardware (interfacing with some hardware requires writing to arbitrary memory addresses)
    • Interaction with other unsafe code (like C)
  • Q: If Rust has unsafe, how is it any better than C and other unsafe-by-default languages?
    • A: It’s a reasonable question! The answer is containment: if you are encountering a memory safety issue (like a segfault), you only have to audit the unsafe code. Similarly, if you are a bank trying to make a program without memory safety errors, you can focus your audit on the unsafe parts of the program.
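  • The containment idea can be sketched as a safe function that hides a small unsafe block. In the hypothetical read_via_raw below, creating the raw pointer is safe; only the dereference needs unsafe, and it is sound here because the pointer comes from a valid reference. An audit only has to check that one block.

```rust
// Reads an i32 through a raw pointer derived from a valid reference.
fn read_via_raw(value: &i32) -> i32 {
    // Creating a raw pointer is allowed in safe code.
    let p = value as *const i32;
    // Dereferencing it is not, so the dereference sits in an unsafe block.
    // This is sound because `p` points at a live i32 borrowed by `value`.
    unsafe { *p }
}

fn main() {
    let n = 42;
    println!("read through a raw pointer: {}", read_via_raw(&n));
}
```

  • Callers of read_via_raw never write unsafe themselves, in contrast to the arbitrary-address example above, where the dereference has no such justification.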

Advanced topic: explicit lifetimes

  • This is an advanced topic, so we won’t go into too much detail here, but I think it’s very interesting to see.
  • Using Rust in practice requires a host of interesting language features. We won’t cover all of them, but one that is particularly interesting is explicit lifetimes, also called lifetime annotations
  • Suppose we want to store a reference in a struct! This could be useful.
  • To illustrate this idea, let’s consider a very realistic situation where we want to feed our two cats some food.
  • At first consider the situation where the two cats dislike each other and each want to own their own food. This is fine, we can create a cat struct that owns its food as follows:
enum Food {
    Chicken,
    Catnip
}

struct Cat {
    food: Food
}

  • Now, we can feed our cats, and the compiler will keep us from giving the same food to different cats:
// this fails due to a move error:
fn main() {
    let cat1food = Food::Chicken;
    let cat1 = Cat { food: cat1food };
    let cat2 = Cat { food: cat1food };
}

// this is OK
fn main() {
    let cat1food = Food::Chicken;
    let cat2food = Food::Catnip;
    let cat1 = Cat { food: cat1food };
    let cat2 = Cat { food: cat2food };
}
  • Now, suppose we want to create generous cats that are willing to share their food. This is possible in Rust, but it requires the use of a new language feature: explicit lifetimes
enum Food {
    Chicken,
    Catnip
}

struct FriendlyCat<'a> {
    food: &'a Food
}

fn main() {
    let cat1food = Food::Chicken;
    let cat1 = FriendlyCat { food: &cat1food };
    let cat2 = FriendlyCat { food: &cat1food };
}
  • What is happening in the above example?
    • The struct FriendlyCat is parameterized by a lifetime: it takes a lifetime as an argument
    • The meaning of FriendlyCat<'a> is that (1) a lifetime 'a must be provided as an argument in order to create a FriendlyCat, and (2) a FriendlyCat cannot outlive this lifetime.
  • Looking at where the FriendlyCats are constructed above, it’s not clear where the lifetime argument is explicitly created.
    • The Rust compiler infers the lifetime here for us, so it is not necessary to explicitly annotate it.
  • Why am I showing you this seemingly esoteric feature? Because it’s an example of how Rust’s principles of ownership and borrowing have wide-ranging effects on every aspect of the language: structs must be made aware of borrowing and lifetimes in order to track ownership of resources that may be used inside the struct.
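  • Lifetime parameters can also be written out on functions. The sketch below reuses Food and FriendlyCat from above (with Debug and PartialEq derived, which is an addition for printing and comparison) and adds a hypothetical helper share that spells out the lifetime the compiler inferred in the earlier example. The commented-out lines show a use that would be rejected because the cat would outlive its food.

```rust
#[derive(Debug, PartialEq)]
enum Food {
    Chicken,
    Catnip,
}

struct FriendlyCat<'a> {
    food: &'a Food,
}

// Hypothetical helper: both cats borrow the same food for the same lifetime 'a.
fn share<'a>(food: &'a Food) -> (FriendlyCat<'a>, FriendlyCat<'a>) {
    (FriendlyCat { food }, FriendlyCat { food })
}

fn main() {
    let chicken = Food::Chicken;
    let (cat1, cat2) = share(&chicken);
    println!("{:?} and {:?}", cat1.food, cat2.food);

    let catnip = Food::Catnip;
    let cat3 = FriendlyCat { food: &catnip };
    println!("{:?}", cat3.food);

    // Rejected by the compiler: `food` is freed at the end of the inner
    // block, but `cat` would still hold a reference to it afterwards.
    // let cat;
    // {
    //     let food = Food::Chicken;
    //     cat = FriendlyCat { food: &food };
    // }
    // println!("{:?}", cat.food);
}
```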

Memory management conclusion

  • In this module we’ve seen three fundamentally distinct approaches to memory management in programming languages:
    1. Manual memory management in micro-C and micro-ASM, which gives performance at the expense of memory safety.
    2. Automated memory management with garbage collection, which gives safety at the expense of performance.
    3. Static ownership-based memory management in Rust, which gives safety and performance at the same time, but at the expense of language ergonomics.
  • A language’s approach to memory management is quite often its defining characteristic. The thing most people know about C is that it is a high-performance language (quite often due to its low-level memory-management capabilities); the thing most people know about Rust is its unique ownership-based approach to memory management.
  • Expect approaches to memory management to continue to evolve: Rust isn’t the last word, and there are still many interesting open questions about how to design languages that support effective memory management while remaining ergonomic to use.