pcwalton

Occasional notes on Rust, Firefox, etc.

The Two Meanings of “Impl”

impl declarations in Rust have two forms. The subtle distinction between the two can be confusing at first, so I’ll briefly explain the difference here.

The first form of impl is a type implementation. (Earlier I was calling this an “anonymous trait”, but I think that this terminology is probably more confusing than it’s worth.) This form allows you to define new functions associated with a type. For example:

struct Dog {
    name: ~str
}

impl Dog {
    static fn new(name: ~str) -> Dog {
        return Dog { name: name };
    }

    fn speak(&self) {
        io::println("woof");
    }
}

This example defines new functions new and speak under the Dog namespace. Here’s an example of their use:

let dog = Dog::new("Snoopy");
Dog::speak(&dog); // note: doesn't work today, see note below

(The explicit call of the form Dog::speak(&dog) doesn’t work today, but I wrote it out to emphasize the fact that speak lives in the Dog namespace. It’s likely to work in the future, though. Today, you need to write dog.speak().)

The second form of impl, on the other hand, is a trait implementation. It’s distinguished from the first form by the presence of a : followed by the name of a trait. This form allows you to provide an implementation for one or more existing functions belonging to a trait. It doesn’t define any new functions. For instance, suppose I defined this trait:

trait Animal {
    static fn species(&self) -> ~str;
}

Then I can supply an implementation of species() for my Dog structure like this:

impl Dog : Animal {
    static fn species(&self) -> ~str {
        return ~"Canis lupus familiaris";
    }
}

The key point to notice here is that this form doesn’t define any new names. This code won’t compile:

let dog = Dog::new("Fido");
io::println(Dog::species(&dog)); // unresolved name: `species`

But this code will:

let dog = Dog::new("Spot");
io::println(Animal::species(&dog));

The reason is that a trait implementation only provides the implementation of one or more existing functions rather than defining new functions. The function species is part of the Animal trait; it’s not part of Dog.

(You might reasonably ask: Why not duplicate the name species into Dog, for convenience? The answer is name collisions: it should be possible to implement Animal and later implement another trait with a different function called species without breaking existing code.)

So the upshot of this is that there are two forms of implementations in Rust: the type implementation, which defines new functions, and the trait implementation, which attaches functionality to existing functions. Both use the impl keyword, but they’re different forms with different meanings.
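(For readers comparing against later Rust: the distinction survived, though the trait form was eventually respelled `impl Trait for Type`. Here's a sketch of the same `Dog`/`Animal` example in post-1.0 syntax; the explicit `Dog::speak(&dog)` call mentioned above does work there.)

```rust
struct Dog {
    name: String,
}

// Type implementation: defines new names under `Dog`.
impl Dog {
    fn new(name: &str) -> Dog {
        Dog { name: name.to_string() }
    }

    fn speak(&self) -> &'static str {
        "woof"
    }
}

trait Animal {
    fn species(&self) -> String;
}

// Trait implementation: `species` belongs to `Animal`, not to `Dog` itself.
impl Animal for Dog {
    fn species(&self) -> String {
        "Canis lupus familiaris".to_string()
    }
}

fn main() {
    let dog = Dog::new("Snoopy");
    println!("{}: {}", dog.name, dog.speak());
    println!("{}", Animal::species(&dog)); // call through the trait's namespace
}
```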

A Tour of Vector Representations

One aspect of Rust that’s often confusing to newcomers is its treatment of strings and vectors (also known as arrays or lists). As a result of its focus on systems programming, Rust has a somewhat lower-level concept of a vector than most other languages do. As part of an overall goal to make Rust easy to understand, I thought I’d write up a quick tour of the way other languages’ vectors work from the perspective of the machine in order to make it easier to map these concepts into Rust.

There are three common models that I’ve observed in use—for lack of better terminology, I’ll call them the Java model, the Python model, and the C++ STL model. (For brevity, I’ve omitted fixed-size, stack-allocated arrays, since these are very limited.) Most languages build upon one of these three. In a subsequent blog post, I’ll explain how Rust’s system differs from these and how the programmer can build the equivalents of each of these models in Rust.

We’ll start with the Java model. Java’s basic array type has a fixed size when created and cannot be changed afterward. Arrays in Java are always allocated on the Java heap. For example, consider the following line of code:

int[] a = { 1, 2, 3, 4, 5 };

After this code is executed, the memory of the running program looks like this:

[Image: `a` on the stack, pointing to a heap-allocated buffer holding the five ints]

The cell highlighted in red is the value of type int[]. It’s a reference type, which means that it represents a reference to the data rather than the data itself. This is important when assigning one array value to another. For instance, we execute this code:

int[] b = a;

And now the memory looks like this:

[Image: `a` and `b` both pointing to the same heap-allocated array storage]

Both values are pointing at the same underlying storage. We call this aliasing the array buffer. In Java, any number of values can point to the same underlying array storage. Because of this, the language has no idea at compile time how many pointers point to the storage; therefore, to determine when to clean up the storage, Java uses garbage collection. Periodically, the entire heap is scanned to determine whether any references to the array storage remain, and if there are none, the buffer is freed.
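(As a reference point for the Rust-focused follow-up post: the Java shape, a fixed-size, heap-allocated buffer with freely aliasable handles, can be imitated in today's Rust with a reference-counted slice. Rust counts references rather than tracing the heap, but the memory layout is analogous.)

```rust
use std::rc::Rc;

fn main() {
    // A fixed-size, heap-allocated buffer, like Java's `int[]`.
    let a: Rc<[i32]> = Rc::from(vec![1, 2, 3, 4, 5]);

    // Copying the handle aliases the same underlying storage.
    let b = Rc::clone(&a);

    assert_eq!(a.as_ptr(), b.as_ptr()); // one buffer, two references
    assert_eq!(b[4], 5);
}
```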

Now this model is simple and fast, but, since the arrays have a fixed size, the programmer can’t add new elements to them once they’re created. This is a very common thing to want, so Java provides another type, java.util.ArrayList, for this. As it turns out, the model used by Java’s ArrayList is essentially the same model that Python uses for all of its lists.

Let’s look at this model more closely. Consider this statement in Python:

a = [ 1, 2, 3, 4, 5 ]

Once this is executed, the memory looks like this:

[Image: `a` pointing to a PyListObject, which in turn points to a separate buffer holding the five elements]

As in Java, the cell highlighted in red (a) is the value that actually has the Python type list. We can see this if we assign a to b:

b = a

[Image: `a` and `b` both pointing to the same PyListObject]

Obviously, the disadvantage of this model is that it requires two allocations instead of one. The advantage of this model is that new elements can be added to the end of the vector, and all outstanding references to the vector will see the new elements. Suppose that the vector had capacity 5 when initially created, so that no room exists to add new elements onto the end of the existing storage. Then when we execute the following line:

b.append(6)

The memory looks like this:

[Image: the PyListObject now pointing to a new, larger element buffer holding six elements; the old buffer, shown in gray, has been freed]

Here, Python has created a new and larger allocation for the storage, copied the existing elements over, and freed the old allocation (indicated in gray). Because a and b both point to the PyListObject allocation, which has not changed, they both see the new elements:

>>> a
[1, 2, 3, 4, 5, 6]
>>> b
[1, 2, 3, 4, 5, 6]

In summary, Python’s model sacrifices some efficiency at runtime because it requires both garbage collection and two allocations, but it gains flexibility by permitting both aliasing and append operations.
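(The Python shape—a shared handle plus a separately allocated, growable buffer—also has a direct analogue in today's Rust, using reference counting in place of Python's garbage collector:)

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Two allocations, as in Python: a shared handle plus a growable buffer.
    let a: Rc<RefCell<Vec<i32>>> = Rc::new(RefCell::new(vec![1, 2, 3, 4, 5]));
    let b = Rc::clone(&a); // `b` aliases the same list

    b.borrow_mut().push(6); // may reallocate the element buffer...

    // ...but every handle still observes the new element.
    assert_eq!(*a.borrow(), vec![1, 2, 3, 4, 5, 6]);
}
```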

Turning our attention to the C++ STL, we find that it has a different model from both Python and Java: it sacrifices aliasing but retains the ability for vectors to grow. For instance, after this C++ STL code executes:

std::vector<int> a;
a.push_back(1);
a.push_back(2);
a.push_back(3);
a.push_back(4);
a.push_back(5);

The memory looks like this:

[Image: `a` stored directly on the stack, pointing to a heap-allocated buffer of five ints]

As before, the red box indicates the value of type std::vector. It is stored directly on the stack. It is still fundamentally a reference type, just as vectors in Python and Java are; note that the underlying storage does not have the type std::vector<int> but instead has the type int[] (a plain old C array).

Like Python vectors, STL vectors can grow. After executing this line:

a.push_back(6);

The STL does this (assuming that there isn’t enough space to grow the vector in-place):

[Image: the vector after push_back; `a` now points to a new, larger buffer holding six elements, and the old storage, shown in gray, has been freed]

Just as the Python list did, the STL vector allocated new storage, copied the elements over, and deleted the old storage.

Unlike Java arrays, however, STL vectors do not support aliasing the contents of the vector (at least, not without some unsafe code). Instead, assignment of a value of type std::vector copies the contents of the vector. Consider this line:

std::vector<int> b = a;

This code results in the following memory layout:

[Image: `a` and `b` each pointing to their own separate copy of the element storage]

The entire contents of the vector were copied into a new allocation. This is, as you might expect, a quite expensive operation, and represents the downside of the C++ STL approach. However, the STL approach comes with significant upsides as well: no garbage collection (via tracing GC or reference counting) is required, there is one less allocation to manage, and the vectors are allowed to grow just as Python lists are.
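(This one maps most directly onto today's Rust: `Vec` has the same single-owner, growable shape as `std::vector`. The main difference is that Rust makes the expensive duplication explicit: assignment moves, and copying the contents requires an explicit `clone`.)

```rust
fn main() {
    // `Vec` has `std::vector`-like ownership: one owner, growable storage.
    let mut a = vec![1, 2, 3, 4, 5];
    a.push(6); // may reallocate, moving the elements to a larger buffer

    // Duplication must be requested explicitly with `clone`, which copies
    // the entire contents into a fresh allocation, as STL assignment does.
    let b = a.clone();

    assert_ne!(a.as_ptr(), b.as_ptr()); // two distinct buffers
    assert_eq!(a, b);
}
```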

This covers the three main vector representations in use by most languages. They’re fairly standard and representative; if I didn’t mention a language here, it’s likely that its implementation uses one of these three techniques. It’s important to note that none of these are right or wrong per se—they all have advantages and disadvantages. In a future post, I’ll explain the way Rust’s vector model allows the programmer to choose the model appropriate for the task at hand.

Typestate Is Dead, Long Live Typestate!

One well-known fact about Rust is that the typestate system, which was one of the most unique aspects of the language early on, was dropped in Rust 0.4. The reason was that “in practice, it found little use” (courtesy of Wikipedia), which is fairly accurate. However, what’s less well known is that, in the meantime, Rust gained the building blocks necessary for typestate via its uniqueness typing system. With the right patterns, most of the safety guarantees that typestate enabled can be achieved, although it’s not as easy to use.

Let’s start with the simple example of a file that can be open or closed. We want to ensure at compile time that no methods that require the file to be open (for example, reading) can be called on the file while it is closed. With typestate, we would define the functions as follows:

use core::libc;

struct File {
    descriptor: int
}

pred is_open(file: File) -> bool {
    return file.descriptor >= 0;
}

fn open(path: &str) -> File : is_open {
    let file = File { descriptor: libc::open(path) };
    check is_open(file);
    return file;
}

fn close(file: &mut File) {
    libc::close(file.descriptor);
    file.descriptor = -1;
}

fn read(file: &File : is_open, buf: &mut [u8], len: uint) {
    libc::read(file.descriptor, ...)
}

And this is how this module might be used:

fn main() {
    let file: File : is_open = open("hello.txt");
    read(&file, ...);
    close(file);

    read(&file, ...);    // error: expected File : is_open but found File
    check is_open(file); // will fail at runtime
}

The constructs here that differ from Rust of today are:

  • Constraints are special type kinds that can be attached to types with the : syntax; e.g. File : is_open.

  • The pred keyword declares a predicate function, which defines both a function and a constraint.

  • All values have unconstrained types when initially constructed. To add a constraint to a type, we use the check keyword. The check expression evaluates a predicate and fails at runtime if the predicate returns false; otherwise, it adds the appropriate constraint to the type of the predicate’s argument.

Now let’s look at how we could achieve this in current Rust. We use the branding pattern:

struct File<State> {
    priv descriptor: int,
}

// Make the type noncopyable.
impl<T> File<T> : Drop {
    fn finalize(&self) {}
}

struct Open(@Open);
struct Closed(@Closed);

fn check_open<T>(file: File<T>) -> File<Open> {
    assert file.descriptor >= 0;
    let new_file: File<Open> = File {
        descriptor: file.descriptor
    };
    return new_file;
}

fn open(path: &str) -> File<Open> {
    let file: File<Closed> = File { descriptor: libc::open(path) };
    let file: File<Open> = check_open(file);
    return file;
}

fn close<T>(file: File<T>) -> File<Closed> {
    let new_file: File<Closed> = File {
        descriptor: -1
    };
    libc::close(file.descriptor);
    return new_file;
}

fn read(file: &File<Open>, buf: &mut [u8], len: uint) {
    libc::read(file.descriptor, ...)
}

Using this code has a different feel to it:

fn main() {
    let file: File<Open> = open("hello.txt");
    read(&file, ...);
    let file: File<Closed> = close(file);

    read(&file, ...);  // error: expected File<Open> but found File<Closed>
    let file: File<Open> = check_open(file); // will fail at runtime
}

The differences between this code and the original code using typestate are:

  • Rather than directly altering the constraints attached to a value’s type, the functions that change typestate take a value of one type and return a different value of a different type. For example, close() takes a value of File<T> for any state T and returns a value of type File<Closed>.

  • Instead of the built-in notion of a predicate, this code uses a phantom type. A phantom type is a type for which no values can be constructed—in this example, there is no way to construct a value of type Open or Closed. Instead, these types are solely used as “markers”. In the code above, a value of type File<Open> represents an open file, and a value of type File<Closed> represents a closed file. We call these branded types, because File is branded with the Open or Closed status. Generics (e.g. File<T>) can be used when the state of a file is irrelevant; e.g. if a function can operate on both closed or open files.

  • File instances are made noncopyable. This is important to prevent code like this from compiling:

    let file: File<Open> = open("hello.txt");
    let _: File<Closed> = close(file); // ignore the return value
    read(&file, ...);  // error: use of moved value `file`
    

The important idea is that to get a closed file, you must first surrender your open file. The uniqueness system in Rust allows the compiler to ensure this: when you change typestates, you must move your original value away, and the compiler will ensure that you can’t access it again.

  • The file descriptor field is made private to the containing module. This is important to disallow other modules from forging open or closed File instances. Otherwise, other code could simply convert an open file to a closed file the same way check_open does:

    let open_file: File<Open> = open("hello.txt");
    let closed_file: File<Closed> = close(open_file);
    let fake_open_file: File<Open> = File { descriptor: closed_file.descriptor };
    // ^^^ error: use of private field 'descriptor'
    read(&fake_open_file, ...);
    

Since the File structure contains a private field, no code other than the containing module can create one. In this way, we ensure that nobody can forge instances of File and violate our invariants.
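(The branding pattern survives essentially unchanged in post-1.0 Rust, and gets a little lighter: structs are noncopyable by default, so the `Drop` trick is unnecessary, and the unused type parameter is carried by `PhantomData`. Here's a self-contained sketch, with a stand-in descriptor replacing the `libc` calls:)

```rust
use std::marker::PhantomData;

// Phantom marker types: no values of these are ever constructed.
struct Open;
struct Closed;

struct File<State> {
    descriptor: i32, // would be private to the module in a real crate
    _state: PhantomData<State>,
}

fn open(_path: &str) -> File<Open> {
    // Stand-in for a real libc::open call.
    File { descriptor: 3, _state: PhantomData }
}

// Consumes the file; the compiler forbids using the old value afterward.
fn close<T>(file: File<T>) -> File<Closed> {
    let _ = file.descriptor; // stand-in for libc::close
    File { descriptor: -1, _state: PhantomData }
}

// Only compiles against files branded `Open`.
fn read(file: &File<Open>) -> i32 {
    file.descriptor // stand-in for libc::read
}

fn main() {
    let file = open("hello.txt");
    let _ = read(&file);
    let file = close(file);
    assert_eq!(file.descriptor, -1);
    // read(&file); // error: expected `&File<Open>`, found `&File<Closed>`
}
```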

Now, it’s obvious that this isn’t perfect in terms of usability. For one, it’s a design pattern, and design patterns are the sincerest form of request for syntax. I’m not particularly concerned about this aspect, however, because syntactic sugar is readily achievable with macros.

The issue that I’m concerned with is deeper. One nice thing about typestate as previously implemented is that you don’t have to surrender your value; you can effectively “mutate” its type “in-place”. This saves you from writing temporary variables all over the place and also saves some (cheap) copies at runtime. For example, you can write:

let file = open("hello.txt");
read(&file, ...);
close(file);

Instead of:

let file = open("hello.txt");
read(&file, ...);
let file = close(file);

In Rust, however, this causes complications, which we never fully resolved. (In fact, this is part of what led to typestate’s removal.) Suppose that close mutated the type of its argument to change it from &File<Open> to &File<Closed>. Then consider the following code:

trait Foo {
    fn speak(&self);
}

impl File<Open> : Foo {
    fn speak(&self) {
        io::println("woof");
    }
}

trait Bar {
    fn speak(&self, x: int);
}

impl File<Closed> : Bar {
    fn speak(&self, _: int) {
        io::println("meow");
    }
}

let file = open("hello.txt");
for 5.times {
    file.speak();
    close(&file);
}

How do we compile this code? The first time around the for 5.times { ... } loop, file.speak() should resolve to Foo::speak; the second time around, file.speak() should resolve to Bar::speak. Needless to say, this makes compiling extremely difficult: we would have to consider the lexical scope of every single method invocation and compile it for each possible predicate!

Because of these and other complications, mutating the type doesn’t seem possible in the general case. We would certainly need to introduce some set of restrictions—perhaps we would need to formalize the notion of a “constraint” in the type system (probably by introducing a new type kind) and then introduce some restrictions on implementation declarations to prevent instances from depending on constraints. Whatever system we came up with would be pretty complex and would require a fair bit of thought to get right.

So I’d like to try to play with the current setup and see how far we get with it. In future versions of the language (post-1.0), it might be worthwhile to try to allow some sort of in-place “mutation” of types, similar to languages with true typestate. Overall, though, the combination of uniqueness and branding places today’s Rust in an interesting position, supporting much of the power that came with typestate in a simple system.

Unique Pointers Aren’t Just About Memory Management

One of the most unusual features of Rust, especially when compared to languages that aren’t C++, is the three types of pointers: borrowed pointers (&T), unique pointers (~T), and managed pointers (@T). Most people quite rightly ask “why three pointers? Isn’t one enough?” The usual answer is that unique pointers help with manual memory management:

  • Managed pointers (@T) allow convenient garbage collection.

  • Unique pointers (~T) work like malloc and free from C to allow programmers who don’t want the overhead and complexity of GC to avoid it.

  • Borrowed pointers (&T) allow functions to work equally well with both unique and managed pointers.

This is all true, but there’s another, equally important reason that’s often overlooked: unique pointers allow for efficient, safe concurrency.

To see why, let’s consider the possible ways that an actor- or CSP-based system could enforce safe message passing. By safe message passing I mean that actors can’t create data races by simultaneously accessing shared mutable data. In short, we want to enforce that this adage is followed (courtesy of Rob Pike)–“do not communicate by sharing memory; share memory by communicating.”

There are three simple ways to do this:

  1. Copy all messages sent from actor to actor. Changes that one actor makes to the contents of any message do not affect the other actors’ copies of the message.

  2. Require that all messages sent from actor to actor be immutable. No actor may make changes to any message after it’s created.

  3. Make messages inaccessible to the sender once sent–senders “give away” their messages. Only one actor may mutate a message at any given time.

Each of these patterns has advantages and disadvantages:

  1. Copying all messages has the advantage that it’s simple to reason about, and the programmer doesn’t have to worry about mutability restrictions. The disadvantage is that it comes with a significant performance cost, both in terms of allocation overhead and the copying itself.

  2. Requiring that messages be immutable has the advantage that many messages can be efficiently sent, but it still can lead to copying in many cases. Consider, for example, an application that spawns off a task to decode a large JPEG image. To be efficient, the image decoding algorithm generally wants to decode into a mutable buffer. But the decoded image data must be immutable to be sent, which necessitates a potentially-expensive copy of the pixel data out of the work buffer to an immutable location.

  3. Making messages inaccessible to the sender has the advantage that it’s simple and fast, but it has the disadvantage that it could lead to copying if both the sender and receiver need to access the memory after the send operation.

Because one pattern rarely fits every use case, most actor-based languages, including Rust, have varying levels of support for all three of these patterns (and for more complex patterns that don’t appear in the above list, such as software transactional memory). However, each language tends to favor one of the three patterns “by default”. For example, Erlang leans toward option #1 (copying all messages), Clojure leans toward option #2 (immutable sharing), while Rust leans toward option #3 (giving messages away). The important thing to note here is that all of the patterns have advantages and disadvantages, and so different scenarios will call for one or the other. Consider the image decoding example from before; pattern #3 is by far the most efficient way to handle this, as the buffer needs to be mutable while the image decoder works on it, but the decoder has no need for the image after decoding is done.

Now the simplest way to support pattern #3 safely–in other words, to enforce at compile time that only one actor can hold onto a message at any given time–is through unique pointers. The compiler guarantees that only one reference exists to a uniquely-owned object, enforcing the property we want. Unique pointers support a move operation, which allows functions to “give a pointer away” to another function. So by simply requiring that the “send” method takes a unique pointer and moves its argument, we teach the compiler everything it needs to know to enforce safe concurrency.
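(The same idea survives in today's Rust: a channel's `send` takes its message by value, moving ownership to the receiver, so the sender provably cannot touch the buffer after sending it. A minimal sketch using the standard `mpsc` channels:)

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<Box<Vec<u8>>>();

    // Build the message in a mutable, uniquely owned buffer.
    let mut buf = Box::new(Vec::new());
    buf.extend_from_slice(&[1, 2, 3]);

    // `send` consumes the buffer: ownership moves to the receiver.
    tx.send(buf).unwrap();
    // buf.len(); // error: borrow of moved value `buf`

    let handle = thread::spawn(move || {
        let msg = rx.recv().unwrap();
        msg.len() // the receiver now owns the buffer outright
    });
    assert_eq!(handle.join().unwrap(), 3);
}
```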

In this way, unique pointers aren’t just a tool for manual memory management. They’re also a powerful tool for eliminating data races at compile time. The fact that they also allow Rust programs to avoid the garbage collector is just an added bonus.

A Gentle Introduction to Traits in Rust

Rust traits pack a lot of flexibility into a simple system, and they’re one of my favorite features of the language. But as a result of the rapid pace of the language’s development, there’s been a fair amount of confusion as to how they work. As such, I figured I’d write up a quick tutorial explaining why and how to use them.

This tutorial assumes only basic knowledge of C-like languages, so I’ll try to explain everything specific to Rust that might be unclear along the way. Also note that a couple of these features are unimplemented, so if you try this today the syntax will be a little different.

Simple implementations

In keeping with the theme of my previous blog posts on classes, let’s start by writing a game. I’ll start by defining a struct Monster and a struct Player like this:

struct Monster {
    name: &str;      // `&str` is a reference to a string
    mut health: int; // `mut` indicates that the health can be changed
}

struct Player {
    mut health: int;
}

Now I can create instances of each:

fn main() {  // `fn` defines a function
    let monster = Monster {
        name: "Gelatinous Cube",
        health: 50
    };
    let player = Player {
        health: 100
    };
}

Without some functionality, this isn’t a particularly interesting game. So let’s add a method to Monster:

impl Monster {
    fn attack(&self, player: &Player) {
        // fmt! is string formatting; this prints "Gelatinous Cube hits you!"
        io::println(fmt!("%s hits you!", self.name));
        player.health -= 10;
    }
}

And I can call it this way, inside main:

monster.attack(&player);

There are several things to note here.

  • References are explicit in Rust: the & sigil indicates that the method attack takes a reference to the player, not the player itself. If I didn’t write that, then the player would be copied into the method instead (and we’d get a compiler warning, because this indicates a bug).

  • I use the keyword impl to declare methods for a type. impl declarations can appear anywhere in the module that declared the type. The struct and impl pair appears a lot in Rust code; it nicely separates out data from implementation. Objective-C and C++ programmers will find this familiar.

  • Within an implementation, functions with a self parameter become methods. Python programmers will find this “explicit self” familiar. Because references are explicit in Rust, you specify how self is supposed to be passed; in this case, by reference (&self).

Generics

Now that we have basic implementations covered, let’s look at something completely different: generics. (We’ll come back to implementations later on.) Like many other languages, Rust features generic functions: functions that can operate on many different types. For example, here’s a function that returns true if a vector is empty:

// Vectors are written with square brackets around the type; e.g. a vector of
// ints is written `[int]`.
fn is_empty<T>(v: &[T]) -> bool {
    return v.len() == 0;
}

The generic type parameters are written inside the angle brackets (< and >), after the function name.

There’s nothing much more to say here; generics are pretty simple. In this form, however, they’re pretty limited, as we’ll see.

Limitations of generics

Let’s go back to our game example. Suppose I want to add functionality to save the state of the game to disk in JSON. I’ll implement some methods on Monster and Player to do this:

impl Monster {
    // `~str` means "a pointer to a string that'll be automatically freed"
    fn to_json(&self) -> ~str {
        return fmt!("{ name: \"%s\", health: %d }", self.name, self.health);
    }
}

impl Player {
    fn to_json(&self) -> ~str {
        return fmt!("{ health: %d }", self.health);
    }
}

Now imagine that I wanted a function to save any actor (either a monster or a player) into a file. Because monsters and players are different types, I need to use a generic function to handle both. My first attempt at the function looks like this:

fn save<T>(filename: &str, actor: &T) {
    // Because the writer returns an error code, I use .get() to mean "require
    // that this succeeded, and abort the program if it didn't".
    let writer = io::file_writer(filename, [ io::create, io::truncate ]).get();
    writer.write(actor.to_json());
    // Because of RAII, the file will automatically be closed.
}

Uh-oh. This doesn’t compile. I get the following error: “attempted access of field to_json on type &T, but no public field or method with that name was found”.

What the Rust compiler is telling me is that it doesn’t know that the type T in this function contains the method to_json. And, in fact, it might not. As written above, it’d be perfectly legal to call save on any type at all:

struct Penguin {
    name: &str;
}

save("penguin.txt", &Penguin { name: "Fred" });
// But how do I convert penguins to JSON?

So I’m stuck. But Rust provides a solution: traits.

Trait declaration

Traits are the way to tell the Rust compiler about functionality that a type must provide. They’re very similar in spirit to interfaces in Java, C#, and Go, and are similar in implementation to typeclasses in Haskell. They provide the solution to the problem I’m facing: I need to tell the Rust compiler, first of all, that some types can be converted to JSON, and, additionally, for the types that can be converted to JSON, how to do it.

To define a trait, I simply use the trait keyword:

trait ToJSON {
    fn to_json(&self) -> ~str;
}

This declares a trait named ToJSON, with one method that all types that implement the trait must define. That method is named to_json, and it takes its self parameter by reference.

Now I can define implementations of ToJSON for the various types I’m interested in. These implementations are exactly the same as above, except that we add : ToJSON.

impl Monster : ToJSON {
    // `~str` means "a pointer to a string that'll be automatically freed"
    fn to_json(&self) -> ~str {
        return fmt!("{ name: \"%s\", health: %d }", self.name, self.health);
    }
}

impl Player : ToJSON {
    fn to_json(&self) -> ~str {
        return fmt!("{ health: %d }", self.health);
    }
}

That’s all there is to it. Now I can modify the save function so that it does what I want.

Trait usage

Recall that the reason why the save function didn’t compile is that the Rust compiler didn’t know that the T type contained a to_json method. What I need is some way to tell the compiler that this function only accepts types that contain the methods I need to call. This is accomplished through trait restrictions. I modify the save function as follows:

fn save<T:ToJSON>(filename: &str, actor: &T) {
    let writer = io::file_writer(filename, [ io::create, io::truncate ]).get();
    writer.write(actor.to_json());
}

Note the addition of :ToJSON after the type parameter. This indicates that the function can only be called with types that implement the trait.

Now these calls to save will compile:

save("player.txt", &player);
save("monster.txt", &monster);

But this call will not:

save("penguin.txt", &Penguin { name: "Fred" });

I get the error “failed to find an implementation of trait ToJSON for Penguin”, just as expected.
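(For readers on a newer compiler: the same trait, implementations, and bound look like this in post-1.0 syntax. To keep the sketch self-contained, `save` returns the JSON string instead of writing a file.)

```rust
trait ToJson {
    fn to_json(&self) -> String;
}

struct Monster {
    name: String,
    health: i32,
}

struct Player {
    health: i32,
}

impl ToJson for Monster {
    fn to_json(&self) -> String {
        format!("{{ name: \"{}\", health: {} }}", self.name, self.health)
    }
}

impl ToJson for Player {
    fn to_json(&self) -> String {
        format!("{{ health: {} }}", self.health)
    }
}

// The `T: ToJson` bound admits only types that implement the trait.
fn save<T: ToJson>(actor: &T) -> String {
    actor.to_json()
}

fn main() {
    let monster = Monster { name: "Gelatinous Cube".to_string(), health: 50 };
    let player = Player { health: 100 };
    println!("{}", save(&monster));
    println!("{}", save(&player));
    // save(&"penguin"); // error: the trait `ToJson` is not implemented
}
```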

Summing up

These are the basic features of traits and comprise most of what Rust programmers will need to know. There are only a few more features beyond these, which I’ll mention briefly:

  • Special traits. Some traits are known to the compiler and represent the built-in operations. Most notably, this includes the ubiquitous copy trait, which invokes the copy operation that occurs when you assign with let x = y. You’ll see T:copy in many generic functions for this reason. Other special traits include send, which is a trait that indicates the type is sendable, and add, sub, etc, which indicate the built-in arithmetic operators. The key is that, in all cases, traits simply specify what a generic type can do; when you want to do something with a type parameter like T, you specify a trait.

  • Generic traits. Traits can be generic, which is occasionally useful.

  • Default implementations. It’s often helpful for traits to provide default implementations of their methods that take over when the type doesn’t provide an implementation of its own. For example, the default implementation of to_json() might want to use the Rust reflection API to automatically create JSON for any type, even if that type doesn’t manually implement the to_json() method. (Note that this feature is currently being implemented.)

  • Trait composition. Sometimes we want one trait to include another trait. For example, the Num trait, which all number types in the language implement, obviously includes addition, subtraction, multiplication, etc. Trait composition allows traits to be “glued together” in this way. Note that this isn’t inheritance; it’s simply a convenience that allows trait methods to be combined together, like a mixin. (This is not fully implemented yet.)

  • First-class trait values. Rarely, it’s necessary to have a trait be a first-class value, like in Java or Go, instead of attached to a generic type parameter. This doesn’t come up particularly often, but Rust does support it in the rare cases in which it’s needed. Idiomatic Rust uses generics instead of Java-like interfaces.

That’s about all there is to traits. Traits are essentially Rust’s object system, but they’re simpler than many object systems and integrate especially well with generics.
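(One footnote on that last bullet: "first-class trait values" are what later Rust calls trait objects. A short sketch of the idea in modern syntax, for the rare case where a heterogeneous collection really is what you want:)

```rust
trait Speak {
    fn speak(&self) -> &'static str;
}

struct Dog;
struct Cat;

impl Speak for Dog {
    fn speak(&self) -> &'static str { "woof" }
}

impl Speak for Cat {
    fn speak(&self) -> &'static str { "meow" }
}

fn main() {
    // A heterogeneous collection of first-class trait values (trait objects);
    // each call dispatches dynamically, as a Java interface would.
    let animals: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Cat)];
    let sounds: Vec<&str> = animals.iter().map(|a| a.speak()).collect();
    assert_eq!(sounds, vec!["woof", "meow"]);
}
```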

Maximally Minimal Classes for Rust

Now that classes have been implemented as per the original proposal, the other Rusters and I have been starting to get a feel for the way they work out in practice. The results are positive, but not optimal. Although they definitely succeeded in avoiding the rigidity of traditional object-oriented languages like Java, they still have two basic problems: (1) they feel somewhat out of place with the rest of the language; and (2) they’re still too heavyweight. Nevertheless, the functionality that they enabled is important, and we shouldn’t sacrifice it.

Language design tends to go in cycles: we grow the language to accommodate new functionality, then shrink the language as we discover ways in which the features can be orthogonally integrated into the rest of the system. Classes seem to me to be on the upward trajectory of complexity; now it’s time to shrink them down. At the same time, we shouldn’t sacrifice the functionality that they enable.

In Rust, classes provide five main pieces of functionality that don’t otherwise exist: (1) nominal records; (2) constructors; (3) privacy on the field level; (4) attached methods; and (5) destructors. I’ll go over these five features in turn and discuss how each one could be simplified.

Nominal records

Classes in Rust are nominal records. A class in this form:

class monster {
    let mut health: int;
    let name: str;
}

Is basically the moral equivalent of:

enum monster {
    monster({
        mut health: int,
        name: str
    })
}

Clearly, the class form is much easier to read and much less confusing for users of the language; “enum” makes little sense as there’s nothing enumerated here. Nevertheless, there’s a bit of unnecessary noise in the form of the let keyword. We could simplify it to:

class monster {
    mut health: int,
    name: str
}

It’s less typing, and it matches record syntax exactly.

Constructors

Those who have used Rust classes in their current form know that the above example class monster is incomplete. I still have to define a constructor for monster, like so:

class monster {
    let mut health: int;
    let name: str;

    new(health: int, name: str) {
        self.health = health;
        self.name = name;
    }
}

This is probably the most burdensome part of classes as they currently stand—having to repeat each field name four times, and each type twice, is annoying. Many languages have solutions for this (CoffeeScript and Dart, for example), so we could consider adopting similar syntactic sugar, something like:

class monster {
    let mut health: int;
    let name: str;

    new(self.health, self.name) {}  // sugar for the above
}

Unfortunately, it doesn’t stop there. Constructors have other problems. For one, there can only be one constructor per class–this is far more restrictive than Java, which permits constructor overloading. Worse, constructors can’t indicate that they failed; they can only fail the task or set some internal “this failed” flag, both of which are clearly unsatisfactory. The right way to report a recoverable error to the caller in Rust is to use the result type, but constructors can’t return result<self>; they can only return self.

I think the easiest way to address these problems, following the idea that classes are just nominal records, is to abolish constructors entirely and adopt record literal syntax for initializing classes. So a class like this:

class monster {
    mut health: int,
    name: str
}

Would be initialized with:

let foe = monster {
    health: 100,
    name: "Bigfoot"
};

If you want to declare one or more “constructor” functions, perhaps to signal success or failure, that’s easy; they’re just functions in the same crate:

fn monster(health: int, name: str) -> result<monster> {
    if name == "King Kong" || name == "Godzilla" {
        ret err("Trademark violation");
    }
    ret ok(monster { health: health, name: name });
}

But note that you only have to write a constructor if you’re doing something special, like returning an error or initializing private fields. If your class is simple and merely holds public state, then your callers can just use the record literal syntax to create instances of the class.
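For what it’s worth, this is essentially the design Rust ended up with. A sketch in modern syntax, mirroring the monster example above (the new_monster name is mine):

```rust
// Modern Rust: no constructors; struct literals initialize fields
// directly, and fallible "constructors" are ordinary functions.
struct Monster {
    health: i32,
    name: String,
}

fn new_monster(health: i32, name: &str) -> Result<Monster, String> {
    if name == "King Kong" || name == "Godzilla" {
        return Err("Trademark violation".to_string());
    }
    Ok(Monster { health, name: name.to_string() })
}

fn main() {
    // Callers can also use the literal syntax directly:
    let foe = Monster { health: 100, name: "Bigfoot".to_string() };
    assert_eq!(foe.health, 100);
    assert!(new_monster(50, "King Kong").is_err());
}
```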

Privacy

Classes in Rust allow private fields:

class monster {
    let priv mut health: int;
    let name: str;

    ...

    fn hit() {
        self.health -= 10;
    }
}

This is extremely useful functionality for modularity. But Rust already has a mechanism for privacy, via exports. For example, in order to write an enum whose contents are hidden from the outside world:

enum color {
    priv red;
    priv green;
    priv blue;
}

(Note that the syntax here is changing; for posterity, I’m using the new syntax, which doesn’t work at the time of this writing, as it’s not yet implemented.)

Only this module can construct instances of this enum, or even inspect its contents, because while the enum itself can be named, none of its variants can. So we could apply the same principle to fields of classes:

mod A {
    mod B {
        class monster {
            priv mut health: int,
            name: str
        }

        fn hit(monster: &monster) {
            monster.health -= 10;    // OK
        }
    }

    fn heal(monster: &B::monster) {
        monster.health += 10;        // error: field "health" is private
    }
}

Here, a field marked with priv can only be named (and therefore accessed) by the enclosing module or containing modules. It works like every other instance of priv in the language: it restricts the use of a name to the enclosing module and its submodules.

It would be an error for modules that aren’t the module defining the class or an enclosing module to attempt to construct an instance of a class with a private field with the record literal syntax. This means that, if you use private fields, you need a constructor if you want your class instances to be constructible by the outside world.
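The Rust that eventually shipped adopted essentially this model, with the default flipped: a struct field is visible only within its defining module unless marked pub. A sketch in modern syntax (the module and function names are mine):

```rust
mod b {
    pub struct Monster {
        health: i32,      // private: nameable only within mod b
        pub name: String, // public
    }

    pub fn new(health: i32, name: String) -> Monster {
        // Only this module can use the literal syntax,
        // since the literal names the private field.
        Monster { health, name }
    }

    pub fn hit(m: &mut Monster) {
        m.health -= 10; // OK: same module
    }

    pub fn health(m: &Monster) -> i32 {
        m.health
    }
}

fn main() {
    let mut m = b::new(100, "Bigfoot".to_string());
    b::hit(&mut m);
    assert_eq!(b::health(&m), 90);
    // m.health += 10; // error: field `health` of struct `Monster` is private
}
```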

Methods

Naturally, Rust classes support attached methods; this is much of the reason for their existence. But Rust already has a mechanism for creating methods–namely, typeclasses. We could write the above monster declaration this way:

mod A {
    class monster {
        priv mut health: int,
        name: str
    }

    impl monster for &monster {
        fn hit() {
            self.health -= 10;
        }
    }
}

The trick here is that the typeclass implementation is named monster, so a declaration like import A::monster will import both the class and the implementation. This entire scenario works because, with privacy restricted to the module, there is no need to place methods inside the class to achieve privacy.

Sometimes, it’s useful to have the hidden self parameter actually be a GC’d pointer to an instance of the class. In the original class proposal, this is accomplished with a separate type of class named @class. However, with this revised proposal, the @class functionality falls out naturally, without any extra features:

class monster {
    priv mut health: int,
    name: str,
    friends: dvec<@monster>  // a dynamic vector
}

impl monster for @monster {
    fn befriend(new_friend: @monster) {
        new_friend.friends.push(self);
    }
}

It’d be best if we could eliminate the repetition of the monster name in the impl declaration, so I propose inferring it:

impl for @monster {
    fn befriend(new_friend: @monster) {
        new_friend.friends.push(self);
    }
}

The name of the implementation would automatically be inferred to be the name of the class if, given a class C, the type is one of C, @C, ~C, or &C.

Note that, since traits can be applied to implementations, we can apply traits to classes in this way.

It would be ideal to eliminate the impl declaration entirely. However, this relies on typeclass coherence, which I’d like to keep separate to avoid coupling proposals. Nevertheless, it’s worth mentioning; so, in a forthcoming post, I’ll show how typeclass coherence can make method declaration syntax even simpler.

Destructors

Classes are intended to be the only mechanism for destructors in Rust. Unfortunately, there’s no obvious way to eliminate destructors from classes in a minimal way. There are a number of options:

  1. Keep destructors in classes, and remove resources.

  2. Keep resources around, and remove destructors from classes.

  3. Make the destructor interface (drop) into a special kind of “intrinsic interface” which enforces instance coherence. Then remove both resources and destructors from classes. (Recall that instance coherence means that each class can only have one implementation of an interface, which is clearly, to my mind, a necessity if destructors are to become an interface.)

  4. Make all interfaces enforce instance coherence, make drop into an interface, and remove both resources and destructors from the language.

I prefer option (4), but, as mentioned before, that’s a separate issue.

Finally, with nearly all of the special functionality of classes removed, it’s worth asking why records continue to exist. Indeed, I’ve been thinking for a while that structural records should be removed from the language, but the reasons for this tie into a deeper discussion on structural and nominal types and deserve their own blog post.

Coherence, Modularity, and Extensibility for Typeclasses

| Comments

I’ve been experimenting with the design of a modification to Rust typeclasses. Because it’s always best to start with code, here’s a synopsis of what I have in mind:

mod A {
    // Declaration of an interface:
    iface to_str {
        fn to_str() -> str;

        // Implementations for various types:

        impl int {
            fn to_str() -> str {
                ... implementation of to_str on ints ...
            }
        }

        impl uint {
            fn to_str() -> str {
                ... implementation of to_str on unsigned ints ...
            }
        }

        ... more types here ...
    }

    // Define a class and declare that it implements to_str:
    class foo : to_str {
        fn to_str() {
            ret "foo";
        }
    }
}

mod B {
    import A::to_str;    // Must import the interface first, so
                         // that the compiler can find the method
                         // "to_str".

    println(3.to_str()); // Calls the "to_str" defined above.
}

mod C {
    let x = A::foo();    // Creates a foo object named "x".

    x.to_str();          // Calls "to_str" on "x". Note that I
                         // didn't have to import the "to_str"
                         // method in this scope—since it was
                         // defined inside the declaration of the
                         // class "foo", it's obvious what the
                         // implementation is.
} 

Essentially, the implementations of an interface go inside the declaration of the interface, with one significant exception: a class is permitted to define implementations of interfaces in the body of the class. The compiler prohibits multiple implementations of the same interface on the same type using two simple rules: (1) implementations defined within an interface must be non-overlapping (i.e. there can’t be any types which match multiple implementations); and (2) a class can’t implement an interface that already defines an implementation which might itself match an instance of that class.

The fact that the implementations go inside the interface is a little strange—it resembles the proposed Java defender methods, although it’s used for a completely different purpose—but I believe there is an important reason for it. It means that, if a programmer wants to look up the definition of a method call, he or she can simply figure out which interface it belongs to, which must always be in scope via an import statement, and then look at the declaration of the interface to find the method.

Fundamentally, the guiding principles behind this design are that the typeclass system should be coherent and modular while supporting extensibility. Here are the definitions of these terms as I see them:

Coherent — A typeclass system is coherent if there exists at most one implementation of an interface for every type. Typeclass systems that don’t have this property have the instance coherence problem (or, as we called it when we independently stumbled across it, the “hash table problem”).
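For comparison, the trait system Rust eventually shipped enforces exactly this property: the compiler rejects a second implementation of the same trait for the same type. A sketch in modern syntax (the trait name is hypothetical, chosen to avoid the standard library’s ToString):

```rust
trait ToStr2 {
    fn to_str2(&self) -> String;
}

impl ToStr2 for i32 {
    fn to_str2(&self) -> String {
        format!("{}", self)
    }
}

// A second `impl ToStr2 for i32` anywhere in the program is a
// compile-time error: conflicting implementations of the trait.

fn main() {
    assert_eq!(3.to_str2(), "3");
}
```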

Modular — A typeclass system is modular if the unit in the source code that carries the implementation for every method must be in the lexical scope of every call site that needs the implementation (or, for nominal types only, in the lexical scope of the declaration of the type). This is a little unclear, so some examples are in order. First, a simple one:

import vec::len;
printf("The length is %u", [ 1, 2, 3 ].len());

In this example, we need the implementation for len in scope in order to make a direct call to the method len.

Now a more complex example:

fn print_length<T:len>(x: T) {
    printf("The length is %u", x.len());
}

import vec::len;
print_length([ 1, 2, 3 ]);

Here, we need the definition of len in scope at the time we call print_length. Because print_length can print the length of any value that implements the len interface, it doesn’t intrinsically know which method to call. This information has to be provided by the caller of print_length. For this reason, the call to print_length requires the implementation vec::len to be in scope.
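The same situation in modern Rust syntax (the HasLen trait is hypothetical): print_length doesn’t pick the implementation; the caller’s trait bound does.

```rust
trait HasLen {
    fn length(&self) -> usize;
}

impl HasLen for Vec<i32> {
    fn length(&self) -> usize {
        self.len()
    }
}

fn print_length<T: HasLen>(x: &T) -> usize {
    // Which `length` runs is determined by the caller's choice of T.
    let n = x.length();
    println!("The length is {}", n);
    n
}

fn main() {
    assert_eq!(print_length(&vec![1, 2, 3]), 3);
}
```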

In typeclass systems that aren’t modular, modules that define conflicting typeclass implementations usually can’t be linked together. For instance, in such a system, if module A implements len for vectors and module B independently implements len for vectors, then modules A and B can’t be used together in the same program. Obviously, this poses a hazard for large systems composed of many independently developed submodules.

Extensibility — A typeclass system facilitates extensibility if it’s possible for the programmer to introduce a new interface and provide implementations of that interface for existing types in the system. This is what makes typeclasses act like object extensions; it’s also what makes user-defined typeclasses useful on primitive types.

Many languages have typeclass or interface systems, but to my knowledge none of the mainstream systems support all three of these features. For example:

C++—C++ concepts support extensibility, but aren’t coherent and are only somewhat modular. The C++ language permits out-of-line definition of custom operations on class and enum types. As an example, to provide an ordering on vectors of integers:

#include <vector>
bool operator<(std::vector<int> &a, std::vector<int> &b) {
    ...
}

In this way, C++ concepts are extensible. But there’s no check to ensure that there is only one such definition in the program for each data type, so C++ concepts aren’t coherent. In this example, other namespaces can define operator< over the same types.

Generally, C++ scoping rules ensure that a function can never be called outside of its lexical scope. But there is a significant exception: argument-dependent lookup. With ADL, a function can be called outside of its lexical scope if that function was defined in the same scope as the type of one of its arguments. This feature was intended for extensibility, as it allows collections like std::map to pick up definitions of functions like operator< even if the functions aren’t in scope. However, it clearly comes at the cost of modularity.

Haskell—Haskell typeclasses are coherent and support extensibility, but aren’t modular. Haskell programmers can define instances of typeclasses for any type in the system, but there can be only one instance of a typeclass for every type in the program. This can cause problems when two modules are linked together—if, say, module A defines Show of int and module B independently defines Show of int, modules A and B can’t be linked together.

Java and Go—Java interfaces are modular and coherent, but aren’t extensible. In Java, an implementation of an interface can be defined only within the package that declares the type. This means, in particular, that interfaces can’t be defined on primitive types. It also means that a module can’t define an interface and then declare an implementation of the interface on existing types without modifying the existing type. Go interfaces have the same limitations (unless you define an interface over methods that are already defined on the type in question).

Scala—Scala interfaces are modular but only mostly coherent; they also offer some support for extensibility. Unsurprisingly, interfaces in Scala are basically the same as interfaces in Java. The major difference is that, unlike Java, Scala provides a mechanism for extending classes with implementations of interfaces without having to modify the definition of the class—namely, implicits. This feature is extremely useful for extensibility; it also solves the problem of methods on primitive types in an elegant way. The trouble is that implicits are somewhat inconvenient to use; the programmer must define an implicit wrapper around the object, so the this parameter won’t refer to the object itself but rather to the wrapper. Equally importantly, implicits don’t enforce coherence—two modules can define two different implicits on the same type.

Objective-C—Objective-C categories support extensibility, but aren’t modular or coherent. In Objective-C, methods can be added to existing objects by defining a new category for that object and placing the methods within that category. The problems with categories are that method calls are all late-bound (precluding static scoping), and what happens when two modules that define conflicting category implementations are linked together is undefined: the resulting object might provide one implementation, or it might provide the other implementation. Either way, the resulting program is unlikely to work.

Current Rust—The current Rust implementation of typeclasses is modular and supports extensibility, but it doesn’t maintain coherence. Implementations are separate from interfaces in Rust (except for classes), and interfaces and implementations can both be defined over primitive types. The trouble is that there can be multiple conflicting implementations for typeclasses, which can lead to the instance coherence problem.

So how does this proposed design compare?

  • It offers coherence because there can be only one implementation of an interface for each type. For the implementations provided within the interface itself, we can check that they’re nonoverlapping. For the implementations defined with classes, we can check to ensure that the interface implementations don’t overlap with the implementations that the interface itself defined. Either way, the checks involved are simple and ensure that we meet the criterion for coherence.

  • It offers modularity because the implementation either has to be imported as part of the interface (for implementations defined inside interfaces) or part of the nominal type (for class implementations). Consequently, it is never the case that two Rust crates cannot be linked together because of conflicting typeclass implementations.

  • It offers extensibility because, when an interface is defined, implementations can be provided for any existing types without modifying the declarations of those types.

Finally, it supports all three of these features while maintaining a minimal feature set.

Why Lifetimes?

| Comments

One of the most distinctive new features of Rust is its slowly-growing support for regions—or lifetimes, as some of us core developers like to call them. As lifetimes aren’t found in any mainstream languages, I thought I’d expand upon why we want them and how they can be used to improve memory management for performance (especially interactive performance) without sacrificing safety. In this first post I’ll explain why existing memory models weren’t enough and why we went searching for alternatives. Here I’m assuming basic knowledge of garbage collection, reference counting, and malloc/free, but nothing more.

The programming models that the current crop of mainstream programming languages expose can be divided pretty evenly into two camps: explicitly-managed environments and garbage-collected environments. By far, the most common programming languages built around explicitly-managed environments are C, C++, and Objective-C, and explicit memory management is so associated with these languages that it’s often just called “the C memory model”. Almost all other languages in mainstream use are garbage collected—Java, C#, JavaScript, Python, Ruby, Lisp, Perl, and tons of other languages all fall into this camp. (Note that here I’m using “garbage collection” in the general sense to mean automatic memory management; some of these languages don’t have tracing garbage collection and instead use reference counting.)

Now C and its derivatives famously offer a huge amount of control over memory usage—the built-in language features make it easy to implement stack allocation, ownership (i.e. explicit new and delete), memory pools, and reference counting (manually or with smart pointers or Objective-C’s Automatic Reference Counting). Most large C/C++/Objective-C codebases use all four strategies. Some programs (like Firefox and OS kernels) even implement their own general-purpose memory allocators. (A few use conservative garbage collectors, like the Boehm GC, but these are in the minority, so I’ll leave them aside.) This flexibility has a real benefit, especially for real-time and interactive apps (like web browsers!). Not only does explicit memory management tend to spread out the load so that pauses associated with tracing GC don’t appear, but it also provides a clear path toward improving performance whenever malloc and free do become expensive. In C++, for example, if you profile a program and see lots of expensive calls to operator new near the top, you can often just drop the Boost pool library into your code, change new to new (pool), and call it a day.

Of course, all this power comes at a huge cost: namely, memory safety. Dangling pointers, wild pointers, and buffer overruns are not only annoying and costly in terms of hard-to-find bugs but also deadly from a security perspective. Heap spray attacks make any vtable dispatch on a freed object into an exploitable security vulnerability. Think about that for a second: in C++, you’re always one virtual method call away from an exploitable security vulnerability. You can, of course, mitigate this with sandboxing, but sandboxing has a performance and maintenance cost, and mitigating these costs isn’t easy.

Recognizing the huge costs associated with manual memory management, a huge amount of programming these days has shifted to languages that require garbage-collected environments. These include all of the scripting languages, as well as Java and C#. Garbage collection brings about enormous productivity gains (because the programmer doesn’t have to think as much about memory management) and also enormous security benefits. An entire class of security vulnerabilities (buffer overruns, use-after-free, stack overflow) basically cease to exist for programs running in a garbage-collected environment (to be replaced by exciting new security vulnerabilities such as SQL injection, but that’s another story).

The problem with garbage collection is that, now that memory management isn’t explicit (i.e. that when to recycle memory can’t be statically known by the compiler anymore), lifetimes have to be discovered at runtime—and that entails a performance cost. Tracing stop-the-world garbage collectors (and cycle collectors) have to suspend the entire program for pauses that can last hundreds of milliseconds, a fact which hurts lots of programs—for instance, mobile apps really need to be able to draw at 60 frames per second, ruling out any pause longer than 16 ms. Incremental garbage collection is better, but it’s tricky to implement and causes a loss of throughput, because the compiler has to insert extra operations on every modification of a pointer. And because everything has to essentially be done dynamically (barring simple static analyses like escape analysis), there will always be scenarios in which a fully garbage collected system loses to a manually-managed one—and both major open source web browser engines have zero tolerance for performance regressions.

There are many workarounds in garbage-collected languages for the lack of manual memory management. For example, free lists are a popular technique in languages like Java to reduce GC pause times. The idea is simple—when you have a large number of objects of the same type that you constantly allocate and deallocate, you keep a pool of old objects around and reuse them. The programmer is then responsible for manually allocating and deallocating objects from this free list. This is definitely an effective way to reduce allocations when it’s needed. But, unfortunately, there are a number of downsides to this approach.

First of all, garbage-collected languages usually don’t have any built-in syntax for creating objects out of a free list instead of the heap. The built-in constructor for the object can only be called on a fresh heap allocation. The usual workaround is to create an init method on the object or to create a factory object, but both approaches tend to look awkward syntactically. This isn’t a deal-breaker by itself—after all, Java programmers frequently make factory classes for other reasons—but it compounds the awkwardness of the free list pattern, and on its own it wouldn’t justify adding a large amount of complexity to a garbage-collected language.

But there’s a much worse problem: free lists are inherently unsafe. They aren’t unsafe in the same way as C++, to be sure—in C++, there are serious security vulnerabilities to contend with—but they still allow for many of the same bugs that dangling pointers entail. To see why, notice that a free list has no idea when no more references remain to the objects that it hands out. In fact, it can’t know how many references remain to the objects allocated within it—at least, not without reference counting or tracing the object graph, which would lead back to GC and defeat the purpose of the free list! So a free list must require manual memory management. When the programmer frees an object that’s managed by a free list, it’s the programmer’s responsibility to ensure that no more references to it remain. If the programmer accidentally leaks a reference, then that object might be reused for a new instance, and a potentially hard-to-find bug will result. It’s malloc and free all over again.

The end result is that we seem to be trapped between the rock of unpredictable performance and the hard place of programmer burdens and security vulnerabilities. The current crop of commonly used languages doesn’t provide a solution here.

Fortunately, the research landscape offers some promising potential solutions, which I’ll cover next time.