<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[pcwalton]]></title>
  <link href="http://pcwalton.github.com/atom.xml" rel="self"/>
  <link href="http://pcwalton.github.com/"/>
  <updated>2013-05-20T20:47:15-07:00</updated>
  <id>http://pcwalton.github.com/</id>
  <author>
    <name><![CDATA[Patrick Walton]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Safe Manual Memory Management]]></title>
    <link href="http://pcwalton.github.com/blog/2013/05/20/safe-manual-memory-management/"/>
    <updated>2013-05-20T20:46:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2013/05/20/safe-manual-memory-management</id>
    <content type="html"><![CDATA[<p>If there&#8217;s one feature of Rust that most sets it apart from other languages in industry, it&#8217;s <em>safe manual memory management</em>.</p>

<p>It&#8217;s easiest to explain safe manual memory management by explaining how it differs from the memory models of other languages. There are a few different models in common use in industry languages:</p>

<ul>
<li><p><em>Unsafe manual memory management</em>—These languages provide very fine-grained control over memory allocation; heap memory can be explicitly allocated and deallocated. The most important examples here are C and C++. The well-known downside of this approach is that memory safety violations can only be detected at runtime with a memory safety checker such as Valgrind or Address Sanitizer. Memory safety violations that go unchecked can lead to crashes at best and exploitable security vulnerabilities at worst.</p></li>
<li><p><em>Full garbage collection</em>—The majority of modern languages expose a memory model that falls into this category—the space is very diverse, ranging from Java to Go to JavaScript to Ruby to Haskell. In general, such languages place all allocations into the heap instead of the stack, although escape analysis and value types may be used to reduce the number of heap allocations. Periodically, a <em>garbage collector</em> scans all pointers on the stack and in the heap, judges unreachable objects dead, and reclaims them. This approach has the advantage of <em>memory safety</em> at compile time—the language arranges for there to be no dangling pointers, wild pointers, and so forth. The downsides, however, are:</p>

<ol>
<li><p>The garbage collector may run at an inconvenient time. This can be mitigated by explicit control over when the GC runs, although if the garbage collector must collect multiple threads&#8217; heaps at the same time, this may be difficult to synchronize. This can also be mitigated by using manual memory pooling and free lists, although pooling has undesirable safety properties—much like unsafe manual memory management, there is no static guarantee that objects allocated from a pool are returned properly or that an object is not reachable when returned to the pool. Incremental and concurrent garbage collectors help here, but they are not free, as they typically require write and/or read barriers, reducing throughput.</p></li>
<li><p>When it runs, the garbage collector must mark all pointers to discover which ones are live, reducing throughput of the application. Essentially, the GC must discover at <em>runtime</em> what a C++ (say) programmer knows at <em>compile time</em>. Not much can typically be done about this cost in fully garbage-collected languages, short of falling back to unsafe manual memory management. Pools don&#8217;t help much here, because the GC must still trace the pointers into the pool. Even pointers into the stack generally must be traced.</p></li>
</ol>
</li>
<li><p><em>Garbage collection with value types and references</em>—This category includes languages like C#. (I believe D falls into this category as well, although I may be mistaken.) These languages are essentially garbage-collected, but they include <em>value types</em> which are guaranteed to be stack-allocated if in local variables. Additionally, and most importantly, they include <em>reference parameters</em> (and sometimes reference locals), which allow stack-allocated values to be temporarily aliased when calling another function. Effective use of value types can reduce marking and sweeping time. In general, this system is an effective addition to a garbage-collected system, allowing a good measure of additional control without much cost in complexity and no cost in memory safety. It is not, however, typically sufficient to write programs without using the garbage collector at all; the system is too simple to statically encode anything other than the most basic memory management patterns.</p></li>
</ul>


<p>Where does Rust fit in? Actually, it fits into a category all to itself among industry languages (although one shared by various research languages, like Cyclone). Rust offers <em>safe manual memory management</em> (although some have objected to the term &#8220;manual&#8221; here). It extends the system described above as &#8220;garbage collection with value types and references&#8221; in two important ways:</p>

<ol>
<li><p>You can allocate memory that will not be traced by the garbage collector, and free it manually if you choose. This is the feature of Rust known as &#8220;unique pointers&#8221;. Rust will automatically free memory that is uniquely owned when its owning pointer goes out of scope. It&#8217;s also easy to write a function that acts exactly as <code>free</code> does, so you can precisely choose when your objects die. Unique pointers are not traced by the GC (unless they point to a type that transitively contains a garbage-collected pointer), so they are an effective way to cut down marking times.</p></li>
<li><p>You can <em>return references</em> and <em>place references into data structures</em>. Like other references, these references are not traced by the garbage collector. As long as the references follow a <em>stack discipline</em>, meaning that they point to memory that was allocated by one of the callers of the current function, the compiler allows them to be placed anywhere. This adds a great deal of expressiveness over the reference parameter approach, and it enables a large number of programs to be written without using the garbage collector at all.</p></li>
</ol>
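
<p>To make these two extensions a little more concrete, here&#8217;s a rough sketch. It assumes later Rust syntax, in which a unique pointer is spelled <code>Box</code>, and the <code>Image</code>, <code>View</code>, <code>first_row</code>, and <code>release</code> names are made up for illustration:</p>

<pre><code>// Hypothetical types, for illustration only.
struct Image {
    pixels: Vec&lt;u8&gt;,
}

// A reference stored inside a data structure. The lifetime 'a ties the
// view to the memory it points into.
struct View&lt;'a&gt; {
    first_row: &amp;'a [u8],
}

// Acts like `free`: takes ownership of the box, which is dropped (and its
// memory released) inside this function.
fn release(image: Box&lt;Image&gt;) {
    drop(image);
}

// Returns a structure holding a reference into memory owned by the caller.
fn first_row(image: &amp;Image, width: usize) -&gt; View&lt;'_&gt; {
    View { first_row: &amp;image.pixels[..width] }
}

fn main() {
    let image = Box::new(Image { pixels: vec![0; 16] });
    let view = first_row(&amp;image, 4);
    println!("{}", view.first_row.len());
    release(image); // freed explicitly; `view` is not used past this point
}
</code></pre>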


<p>In terms of safety and performance, safe manual memory management is having your cake and eating it too. You get memory safety like a garbage-collected language, but control like that of unsafe manual memory management. But this system has its downsides as well, most importantly complexity of implementation and interface. References and unique pointers come with a significant learning curve. But, once the system is learned, it&#8217;s remarkably flexible, with an attractive combination of performance and safety.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Performance of Sequential Rust Programs]]></title>
    <link href="http://pcwalton.github.com/blog/2013/04/18/performance-of-sequential-rust-programs/"/>
    <updated>2013-04-18T16:09:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2013/04/18/performance-of-sequential-rust-programs</id>
    <content type="html"><![CDATA[<p>Although Rust is designed for parallel programs, it is important that the performance of single-threaded, sequential programs not suffer in its design. As far as Servo is concerned, sequential performance is still important in many domains that a Web browser engine must compete in.</p>

<p>Below are some selected single-threaded benchmarks from the <a href="http://benchmarksgame.alioth.debian.org/">Computer Language Benchmarks Game</a> (formerly, and still informally, called the &#8220;shootout&#8221;). <em>This is far from an ideal set.</em> These benchmarks are showing their age quite heavily, they are too small and simplistic to extrapolate to real-world use cases, and many of them are too I/O-bound.</p>

<p>It is perfectly legal per the rules of the benchmarks game to use unsafe code (or to call libraries written in C, which is equivalent), and I believe it&#8217;s very difficult to precisely match C&#8217;s performance without resorting to unsafe code. (Practically speaking, one would need an extremely smart JIT, or a research language with a complex dependent type system.) As my colleague Niko pointed out, a more interesting benchmark would not allow <em>any</em> languages to use unsafe code and would exclude C and C++ from competing at all, except as a point of comparison; such a benchmark would be interesting to determine how much performance one has to trade for type safety in mainstream languages. But the shootout is what it is, and so the Rust versions of these benchmarks heavily use unsafe code. Over time, I hope to be able to reduce the amount of unsafe code present in the Rust versions of these benchmarks, but a couple of benchmarks will likely always remain unsafe.</p>

<p><em>Neither the C nor the Rust versions of these benchmarks use SIMD or threads.</em> This is by design, as the goal of this test is to measure Rust&#8217;s sequential performance. Over time, as Rust gains SIMD support and the scheduler improves (both of which are active areas of development), the benchmarks will be updated to use multiple threads. But keep in mind that <em>the C implementation tested against is not usually the top one on the shootout site; rather, I selected the fastest implementation that did not use SIMD or threads for comparison.</em> As the Rust benchmarks are updated to use SIMD and threads, equivalent C versions will be used for comparison.</p>

<p>For all these reasons and more, it is important to not read too much into these benchmark results. It would be a mistake to conclude that &#8220;Rust is faster than C&#8221; because of the performance on the <code>k-nucleotide</code> benchmark. Likewise, it would be a mistake to conclude that &#8220;C is faster than Rust&#8221; because of the <code>fasta-redux</code> benchmark. The goal here is simply to demonstrate that <em>sequential Rust can be written in a way that approaches competitive parity with equivalent C code.</em></p>

<p><em>Note that the benchmarks include <code>clang</code> because GCC 4.2 is a very old version. The purpose of this benchmark is not to benchmark C compilers, but rather to perform cross-implementation comparisons between two languages.</em></p>

<p>Enough disclaimers; on to the results:</p>

<p><img src="http://i.imgur.com/Cd3ZBHT.png" alt="Results" /></p>

<p>These programs were tested on a 2.53 GHz Intel Core 2 Duo MacBook Pro with 4 GB of RAM, running Mac OS X 10.6 Snow Leopard. &#8220;GCC 4.2&#8221; is GCC 4.2.1, Apple build 5666; &#8220;clang 1.7&#8221; is Apple clang 1.7, based on LLVM 2.9svn; &#8220;clang 3.1&#8221; is LLVM 3.1, trunk 149587. GCC and clang were run with <code>-O2</code>, and Rust was run with <code>-O</code> (which is like <code>-O2</code>). Three runs were averaged together to produce each result. Results are normalized to GCC 4.2. Lower numbers are better.</p>

<p>As mentioned before, this is a selected set of benchmarks. The benchmarks that were not included are:</p>

<ul>
<li><p><code>fasta</code> is omitted because it is similar to <code>fasta-redux</code>.</p></li>
<li><p><code>regexp-dna</code> is omitted because it consists of an uninteresting binding to PCRE.</p></li>
<li><p><code>binary-trees</code> is omitted because it is a garbage collection benchmark and the C version uses an arena, defeating the purpose (although I suspect a Rust version that did the same would do well).</p></li>
<li><p><code>chameneos-redux</code> and <code>threadring</code> are omitted because they are threading benchmarks.</p></li>
</ul>


<p>You can see the changes to the Rust compiler that were made to optimize these tests, as well as the benchmark sources, on my <a href="https://github.com/pcwalton/rust/tree/shootout">branch</a> of the compiler on GitHub. The goal will be to land these changes over the next few days.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Hard Case for Memory Safety]]></title>
    <link href="http://pcwalton.github.com/blog/2013/04/12/a-hard-case-for-memory-safety/"/>
    <updated>2013-04-12T14:01:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2013/04/12/a-hard-case-for-memory-safety</id>
    <content type="html"><![CDATA[<p>Quick quiz: In this C++ program, is the definition of <code>munge</code> guaranteed to be memory safe? (Assume that the definition of <code>increment_counter</code> uses only modern C++ idioms and doesn&#8217;t do anything like dereference an invalid pointer.)</p>

<pre><code>#include &lt;cassert&gt;
#include &lt;iostream&gt;
#include &lt;vector&gt;

class foo {
public:
    std::vector&lt;int&gt; indices;
    int counter;

    foo() : indices(), counter(0) {
        indices.push_back(1);
        indices.push_back(2);
        indices.push_back(3);
    }

    void increment_counter();

    int &amp;get_first_index() {
        assert(indices.size() &gt; 0);
        return indices[0];
    }

    void munge() {
        int &amp;first = get_first_index();
        increment_counter();
        std::cout &lt;&lt; first &lt;&lt; std::endl;
        first = 20;
    }
};

int main() {
    foo foo;
    foo.munge();
    return 0;
}
</code></pre>

<p>The answer: Even with this caveat, we can&#8217;t tell! It depends on the definition of <code>increment_counter</code>.</p>

<p>If <code>increment_counter</code> has this definition, the code is memory safe:</p>

<pre><code>void foo::increment_counter() {
    counter++;
}
</code></pre>

<p>But if <code>increment_counter</code> has this definition, for example, then it isn&#8217;t:</p>

<pre><code>void foo::increment_counter() {
    indices.clear();
    counter++;
}
</code></pre>

<p>This definition would cause the <code>first</code> reference in <code>munge</code> to become a dangling reference, so both the write to <code>std::cout</code> and the subsequent assignment to <code>first</code> would have undefined behavior. If <code>first</code> were not an <code>int</code> but were instead an instance of a class, and <code>munge</code> attempted to perform a virtual method call on it, then this would constitute a critical security vulnerability.</p>

<p>The point here is that determining memory safety in C++ requires <em>non-local</em> reasoning. Any analysis that tries to determine safety of C++ code, whether performed by a machine or performed by a human auditor, has to analyze many functions all at once, rather than one function at a time, to determine whether the code is memory safe. As this example illustrates, sticking to modern C++ coding styles, even with bounds checks, is not enough to prevent this.</p>

<p>There are a few ways around this:</p>

<ul>
<li><p>For each function call, analyze the source to the called function to determine whether it&#8217;s memory safe <em>in the context of the caller</em>. This doesn&#8217;t always work, though: it&#8217;s hard or impossible when function pointers or virtual methods are involved (which function ends up being called?), and it&#8217;s hard with separately compiled code (what if the called function is in a DLL that you don&#8217;t have source for?)</p></li>
<li><p>Change the type of <code>indices</code> to <code>std::vector&lt;std::shared_ptr&lt;int&gt;&gt;</code>; i.e. use reference counting to keep the pointer alive. This has a runtime cost.</p></li>
<li><p>Inline the body of <code>increment_counter</code>, so that the memory safety of <code>munge</code> is immediately clear.</p></li>
<li><p>Make <code>increment_counter</code> a class method (or just a function) instead of an instance method, and have it take <code>counter</code> by reference. The idea here is to prevent the possibility that <code>increment_counter</code> could mess with <code>indices</code> in any way by shutting off its access to it.</p></li>
</ul>


<p>What does this have to do with Rust? In fact, this error corresponds to a borrow check error that Brian Anderson hit when working on the scheduler. In Rust, the corresponding code looks something like this:</p>

<pre><code>impl Foo {
    fn get_first_index(&amp;'a mut self) -&gt; &amp;'a mut int {
        assert!(self.indices.len() &gt; 0);
        return &amp;mut self.indices[0];
    }

    fn munge(&amp;mut self) {
        let first = self.get_first_index();
        self.increment_counter(); // ERROR
        println(first.to_str());
        *first = 20;
    }
}
</code></pre>

<p>This causes a borrow check error because the <code>first</code> reference conflicts with the call to <code>increment_counter</code>. The borrow check complains because it only checks one function at a time, and it can tell (quite rightly!) that the call to <code>increment_counter</code> might be unsafe. The solution is to make <code>increment_counter</code> a static method that only has access to <code>counter</code>; i.e. to rewrite the <code>self.increment_counter()</code> line as follows:</p>

<pre><code>Foo::increment_counter(&amp;mut self.counter);
</code></pre>

<p>Since the borrow check now sees that <code>increment_counter</code> couldn&#8217;t possibly destroy the <code>first</code> reference, it now accepts the code.</p>
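
<p>For reference, here&#8217;s a self-contained sketch of the fixed version. It assumes later Rust syntax, and the <code>Foo</code> struct simply mirrors the C++ class above. One difference to be aware of: in this sketch <code>first</code> borrows the <code>indices</code> field directly instead of going through <code>get_first_index</code>, because a method that takes <code>&amp;mut self</code> borrows the whole struct, while a direct field borrow leaves <code>counter</code> free:</p>

<pre><code>// A sketch in later Rust syntax; `Foo` mirrors the C++ class above.
struct Foo {
    indices: Vec&lt;i32&gt;,
    counter: i32,
}

impl Foo {
    fn new() -&gt; Foo {
        Foo { indices: vec![1, 2, 3], counter: 0 }
    }

    // No `self` parameter: this function can only touch the counter it is
    // handed, so it cannot invalidate references into `indices`.
    fn increment_counter(counter: &amp;mut i32) {
        *counter += 1;
    }

    fn munge(&amp;mut self) {
        assert!(!self.indices.is_empty());
        // Borrow the field directly so the borrow covers only `indices`.
        let first = &amp;mut self.indices[0];
        Foo::increment_counter(&amp;mut self.counter); // OK: disjoint fields
        println!("{}", first);
        *first = 20;
    }
}

fn main() {
    let mut foo = Foo::new();
    foo.munge();
}
</code></pre>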

<p>Fortunately, such borrow check errors are not as common anymore, with the new simpler borrow check rules. But it&#8217;s interesting to see that, when they do come up, they&#8217;re warning about real problems that affect any language with manual memory management. In the C++ code above, most programmers probably wouldn&#8217;t notice the fact that the memory safety of <code>munge</code> depends on the definition of <code>increment_counter</code>. The challenge in Rust, then, will be to make the error messages comprehensible enough to allow programmers to understand what the borrow checker is warning about and how to fix any problems that arise.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[An Overview of Memory Management in Rust]]></title>
    <link href="http://pcwalton.github.com/blog/2013/03/18/an-overview-of-memory-management-in-rust/"/>
    <updated>2013-03-18T15:07:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2013/03/18/an-overview-of-memory-management-in-rust</id>
    <content type="html"><![CDATA[<p>One of the key features of Rust that sets it apart from other new languages is that its memory management is <em>manual</em>: the programmer has explicit control over where and how memory is allocated and deallocated. In this regard, Rust is much more like C++ than like Java, Python, or Go, to name a few. This is an important design decision that makes Rust able to function in performance-critical domains that safe languages previously haven&#8217;t been able to (top-of-the-line games and Web browsers, for example), but it adds a nontrivial learning curve to the language.</p>

<p>For programmers familiar with modern C++, this learning curve is much shallower, but for those who are used to other languages, Rust&#8217;s smart pointers can seem confusing and complex. In keeping with the systems-oriented nature of Rust, this post is designed to explain how Rust&#8217;s memory management works and how to effectively use it.</p>

<h2>Smart pointers</h2>

<p>In many languages with manual memory management, like C, you directly allocate and free memory with calls to special functions. For example:</p>

<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

void f() {
    int *x = malloc(sizeof(int));  /* allocates space for an int on the heap */
    *x = 1024;                     /* initialize the value */
    printf("%d\n", *x);            /* print it on the screen */
    free(x);                       /* free the memory, returning it to the heap */
}
</code></pre>

<p>C gives you a great deal of control over where memory is allocated and deallocated. Memory is allocated with a special function <code>malloc</code>, and it is freed with a special function <code>free</code>. After the call to <code>free</code>, it is an error to attempt to use <code>x</code>, as it is a <em>dangling pointer</em>. A dangling pointer points to invalid memory, but the C compiler makes no attempt to prevent you from using it; it&#8217;s your responsibility to avoid touching it after freeing the memory it points to.</p>

<p>Rust gives you the same level of control over memory, but it works somewhat differently. Let&#8217;s see how the same piece of code looks in Rust:</p>

<pre><code>fn f() {
    let x: ~int = ~1024;          // allocate space and initialize an int
                                  // on the heap
    println(fmt!("%d", *x));      // print it on the screen
} // &lt;-- the memory that x pointed at is automatically freed here
</code></pre>

<p>There are three main differences to notice here:</p>

<ol>
<li><p>In C, you allocate memory first (with the call to <code>malloc</code>), and then you initialize it (in the example above, with the <code>*x = 1024</code> assignment). Rust fuses the two operations together into the <code>~</code> allocation operator, so that you don&#8217;t accidentally forget to initialize memory before you use it.</p></li>
<li><p>In C, the call to <code>malloc</code> returns a plain pointer, <code>int *</code>. In Rust, the <code>~</code> operator, which allocates memory, returns a special <em>smart pointer</em> to an int. Because this type of smart pointer is so common, its name is just a single character, <code>~</code>—thus the type of this smart pointer is written as <code>~int</code>.</p></li>
<li><p>You don&#8217;t call <code>free</code> manually in Rust. Rather, the compiler automatically frees the memory for you when a smart pointer goes out of scope.</p></li>
</ol>


<p>As it turns out, points (2) and (3) are very intertwined, and together they form the cornerstone of Rust&#8217;s memory management system. Here&#8217;s the idea: Unlike C, allocation functions in Rust don&#8217;t return a raw pointer to the space they allocate. Instead, they return a <em>smart pointer</em> to the space. A smart pointer is a special kind of value that controls when the object is freed. Like a raw pointer in C, you can access the data that a smart pointer refers to with <code>*</code>. But unlike a raw pointer, <em>when the smart pointer to an allocation goes out of scope, that allocation is automatically freed.</em> In this way, smart pointers are &#8220;smart&#8221; because they not only track where an object is but also track how to clean it up.</p>

<p>Unlike C, in Rust you never call <code>free</code> directly. Instead, you rely on smart pointers to free all allocations. The most basic reason for this is that smart pointers make it harder to forget to free memory. In C, if you forget to call <code>free</code>, you have a <em>memory leak</em>, which means that the memory will not be cleaned up until the program exits. However, in Rust, the compiler will automatically insert the code necessary to free the memory for you when the smart pointer pointing to your data goes out of scope.</p>
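
<p>One way to watch this happen is to give a type a destructor and check that it runs when the smart pointer goes out of scope. This is only a sketch: it assumes later Rust syntax (<code>Box</code> in place of <code>~</code>, and the <code>Drop</code> trait for destructors), and <code>Tracked</code> is a made-up type:</p>

<pre><code>use std::cell::Cell;

// `Tracked` is a made-up type that records when it is cleaned up.
struct Tracked&lt;'a&gt; {
    value: i32,
    freed: &amp;'a Cell&lt;bool&gt;,
}

impl Drop for Tracked&lt;'_&gt; {
    // Runs automatically when the owning smart pointer goes out of scope.
    fn drop(&amp;mut self) {
        self.freed.set(true);
    }
}

fn f(freed: &amp;Cell&lt;bool&gt;) -&gt; i32 {
    let x = Box::new(Tracked { value: 1024, freed });
    x.value
} // &lt;-- the box is freed here; no call to `free` appears anywhere

fn main() {
    let freed = Cell::new(false);
    let value = f(&amp;freed);
    println!("{} {}", value, freed.get());
}
</code></pre>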

<p>Rust has multiple types of smart pointers, corresponding to the different strategies that programs use to reclaim memory. Some smart pointers, namely <code>~</code> and <code>@</code> (which we will cover shortly), have special names known to the compiler, because they&#8217;re so common. (Having to type long names like <code>unique_ptr</code> all the time would be a burden.) Other smart pointers, such as <code>ARC</code> (which allows you to share read-only data between threads), are in the standard library and are not built into the compiler.</p>

<p>The pointer covered above is known as the <em>unique smart pointer</em> <code>~</code>. We call it &#8220;unique&#8221; because there is always only one smart pointer pointing to each allocation. The other type of smart pointer built into the language is the <em>managed smart pointer</em>, which allows <em>multiple</em> smart pointers to point to the same allocation and uses <em>garbage collection</em> to determine when to free it. Here&#8217;s an example of a managed smart pointer in use:</p>

<pre><code>fn foo() {
    let x: @int = @1024;     // allocate space and initialize an int
                             // on the heap
    bar(x);                  // pass it to `bar`
    println(fmt!("%d", *x)); // print it on the screen
} // &lt;-- the memory can be freed here

fn bar(x: @int) {
    let y: @int = x;         // make a new smart pointer to `x`
} // &lt;-- despite `y` going out of scope, the memory is *not* freed here
</code></pre>

<p>The key difference between <code>~</code> and <code>@</code> is that <code>@</code> allows <em>multiple</em> smart pointers to point to the same data, and the data is cleaned up only after the <em>last</em> such smart pointer disappears. Notice that, in this example, the memory pointed at by <code>y</code> (which is the same as the memory pointed at by <code>x</code>) is not freed at the end of the function <code>bar</code>, because <code>x</code> is still in use and also points to the same data. The fact that <code>@</code> allows multiple smart pointers to the same data, as well as the fact that the allocation is freed only when all of those pointers go out of scope, make managed smart pointers very useful. However, they can be less efficient than unique smart pointers, as they require garbage collection at runtime.</p>

<h2>References</h2>

<p>Recall that a smart pointer is a pointer that automatically frees the memory that it points to when it goes out of scope. Perhaps surprisingly, it often turns out that it&#8217;s useful to have a kind of pointer that <em>doesn&#8217;t</em> free the memory that it points to. Consider this code:</p>

<pre><code>struct Dog {
    name: ~str    // a unique smart pointer to a string
}

fn dogshow() {
    let dogs: [~Dog * 3] = [        // create an array of Dog objects
        ~Dog { name: ~"Spot"   },   // use unique smart pointers to
                                    // allocate
        ~Dog { name: ~"Fido"   },
        ~Dog { name: ~"Snoopy" },
    ];
    for dogs.each |dog| {
        println(fmt!("Say hello to %s", dog.name));
    }
} // &lt;-- all dogs destroyed here
</code></pre>

<p>Suppose that we wanted to single Fido out as the winner of the dog show. We might try this code:</p>

<pre><code>fn dogshow() {
    let dogs: [~Dog * 3] = [
        ~Dog { name: ~"Spot"   },
        ~Dog { name: ~"Fido"   },
        ~Dog { name: ~"Snoopy" },
    ];
    let winner: ~Dog = dogs[1];
    for dogs.each |dog| {
        println(fmt!("Say hello to %s", dog.name));
    }
    println(fmt!("And the winner is: %s!", winner.name));
} // &lt;-- all dogs, and `winner`, destroyed here
</code></pre>

<p>But this code won&#8217;t compile. The reason is that, if it did, Fido would be destroyed twice. Remember that <em>unique smart pointers free the allocations they point to when they go out of scope</em>. The code attempts to make a second smart pointer to Fido at the time it executes the line <code>let winner: ~Dog = dogs[1];</code>. If the compiler allowed this to proceed, then at the end of the block, the program would attempt to free Fido twice: once when it frees the original smart pointer embedded within the <code>dogs</code> array, and once when it frees <code>winner</code>.</p>

<p>What we really want is for <code>winner</code> to be a pointer that <em>doesn&#8217;t</em> free the allocation that it points to. In fact, what we want isn&#8217;t a smart pointer at all; we want a <em>reference</em>. Here&#8217;s the code rewritten to use one:</p>

<pre><code>fn dogshow() {
    let dogs: [~Dog * 3] = [
        ~Dog { name: ~"Spot"   },
        ~Dog { name: ~"Fido"   },
        ~Dog { name: ~"Snoopy" },
    ];
    let winner: &amp;Dog = dogs[1];  // note use of `&amp;` to form a reference
    for dogs.each |dog| {
        println(fmt!("Say hello to %s", dog.name));
    }
    println(fmt!("And the winner is: %s!", winner.name));
} // &lt;-- all dogs destroyed here
</code></pre>

<p>This code will now compile. Here, we convert <code>winner</code> into a reference, notated in Rust with <code>&amp;</code>. You can take a reference to any smart pointer type in Rust by simply assigning it to a variable with a reference type, as the <code>let winner: &amp;Dog = dogs[1]</code> line does.</p>

<p>References (also known as <em>borrowed pointers</em>) don&#8217;t cause the compiler to free the data they refer to. However, they don&#8217;t <em>prevent</em> the compiler from freeing anything either. They have no effect on what smart pointers will do; regardless of how many references you have, a unique smart pointer will always free the data that it points to when it goes out of scope, and a managed smart pointer will always free its data when all managed smart pointers to the same allocation go out of scope.</p>

<p>This is important to keep in mind. Code like this will not compile:</p>

<pre><code>fn foo() {
    let y: &amp;int;
    {
        let x: ~int = ~2048;
        y = x;
    } // &lt;-- x freed here
    println(fmt!("Your lucky number is: %d", *y)); // ERROR: accesses freed data!
}
</code></pre>

<p>In languages like C++, code like this could cause faults from attempting to access invalid memory. As it turns out, however, this piece of code won&#8217;t compile—the Rust compiler can and does prevent you from writing code like this at compile time. Essentially, the Rust compiler <em>tracks where each reference came from</em> and reports an error if a reference persists longer than the allocation it points into. This means that, generally speaking, you can use references all you like and have the confidence that they won&#8217;t result in hard-to-diagnose errors at runtime.</p>
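
<p>For contrast, here&#8217;s a sketch of a version the compiler accepts, because the reference never outlives the allocation it points into. It assumes later Rust syntax (<code>Box</code> in place of <code>~</code>), and <code>lucky_number</code> is a name invented for the example:</p>

<pre><code>fn lucky_number() -&gt; i32 {
    let x: Box&lt;i32&gt; = Box::new(2048);
    let y: &amp;i32 = &amp;x; // borrow: `y` points into the allocation owned by `x`
    *y
} // &lt;-- `x` freed here, after `y` is no longer in use

fn main() {
    // The rejected version puts `x` in an inner block:
    //     let y: &amp;i32;
    //     { let x = Box::new(2048); y = &amp;x; }
    //     println!("{}", *y); // error: `x` does not live long enough
    println!("Your lucky number is: {}", lucky_number());
}
</code></pre>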

<h2>Conclusion</h2>

<p>These ideas—smart pointers and references—form the basis of memory management in Rust. If you&#8217;re a C++ programmer, most of this will (hopefully!) simply have been an exercise in learning different syntax. For other programmers, these concepts are likely more foreign. But using these tools, you can write code with fine-grained control over memory, with improved safety over languages like C.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Which Pointer Should I Use?]]></title>
    <link href="http://pcwalton.github.com/blog/2013/03/09/which-pointer-should-i-use/"/>
    <updated>2013-03-09T12:05:00-08:00</updated>
    <id>http://pcwalton.github.com/blog/2013/03/09/which-pointer-should-i-use</id>
    <content type="html"><![CDATA[<p>Deciding whether to use a managed <code>@</code> pointer or an owned <code>~</code> pointer to allocate memory is one of the most frequent sources of confusion for newcomers to Rust. There are two main angles to consider when deciding whether to use an <code>@</code> pointer or a <code>~</code> pointer in Rust: <em>memory management</em> and <em>concurrency</em>. I&#8217;ll cover each in turn.</p>

<p>Note that this tutorial only presents the basic system. There are many extensions to the system—borrowing, library smart pointers, cells, and so on—that allow the various limitations described here to be overcome. But this is the core system that needs to be understood first.</p>

<h1>Memory management</h1>

<p>One of the most important features of Rust from a systems programming perspective is that garbage collection is optional. What this means is that there are safe ways to allocate memory that do not require bookkeeping at runtime to determine when it is safe to free that memory.</p>

<p>What makes it possible for Rust programs to avoid runtime garbage collection is the notion of <em>ownership</em> of a particular allocation. Under this scheme, when the single owner of an allocation goes out of scope, the allocation is freed. Owned pointers in Rust are notated with <code>~</code>. Here&#8217;s an example of their use:</p>

<pre><code>struct Point {
    x: int,
    y: int,
}

fn f() {
    let x: ~Point = ~Point { x: 10, y: 20 };  // allocate a Point on the heap
}  // &lt;-- x is freed here
</code></pre>

<p>Here, <code>x</code> is the single owner of the <code>Point</code> on the heap. Because there is only a single owner, Rust can throw away the memory pointed to by <code>x</code> at the end of the function.</p>

<p>The compiler enforces that there is only a single owner. Assigning the pointer to a new location <em>transfers ownership</em> (known as a <em>move</em> for short). Consider this program:</p>

<pre><code>fn g() {
    let a: ~Point = ~Point { x: 10, y: 20 }; // allocate a Point on the heap
    let b = a;                               // now b is the owner
    println(b.x.to_str());                   // OK
    println(a.x.to_str());                   // ERROR: use of moved value
} // &lt;-- b is freed here
</code></pre>

<p>When compiling this program, the compiler produces the error &#8220;use of moved value&#8221;. This is because assigning an owned pointer transfers ownership, making the old variable <em>dead</em>. Because the compiler knows precisely which variables are dead at all times, it can avoid having to determine at runtime whether to free the memory that a variable points to, and it can prevent you from accidentally accessing dead variables. However, this comes at a price: you are limited to using a single variable to refer to an <code>~</code> allocation.</p>
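
<p>For readers on a current toolchain, the same single-owner behavior can be sketched in present-day Rust. This is an assumption layered on top of the post: <code>Box&lt;T&gt;</code> stands in for <code>~T</code>, and the helper <code>consume</code> is a hypothetical name used only for illustration.</p>

```rust
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: takes ownership of the box, so the allocation
// is freed when `p` goes out of scope at the end of this function.
fn consume(p: Box<Point>) -> i32 {
    p.x + p.y
}

fn main() {
    let a: Box<Point> = Box::new(Point { x: 10, y: 20 }); // allocate a Point on the heap
    let b = a; // ownership moves from `a` to `b`
    // println!("{}", a.x); // would not compile: `a` has been moved
    println!("{:?}", b); // OK: `b` is the owner
    let sum = consume(b); // `b` is moved into `consume` and freed there
    println!("{}", sum);
}
```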

<p>By contrast, <code>@</code> pointers do not have this limitation. We think of memory that is allocated with <code>@</code> as <em>owned by the garbage collector</em>. You can make as many pointers to <code>@</code> memory as you would like. There is a cost in runtime performance, but this cost comes with a great deal of flexibility. For example, the code above will compile with an <code>@</code> pointer:</p>

<pre><code>fn h() {
    let a: @Point = @Point { x: 10, y: 20 }; // allocate a Point on the heap
    let b = a;                               // a and b share a reference
    println(b.x.to_str());                   // OK
    println(a.x.to_str());                   // also OK
}
</code></pre>

<p>So, in short: <em><code>@</code> pointers require garbage collection, but allow multiple pointers to the same location. <code>~</code> pointers avoid this GC overhead, but they don&#8217;t allow multiple pointers to the same location.</em></p>
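
<p>A rough present-day analogue of sharing, under the assumption that the reference-counted <code>Rc&lt;T&gt;</code> from the standard library is an acceptable stand-in for a managed pointer (it counts references rather than tracing the heap, so it is not the same mechanism). The function name <code>shared_sum</code> is hypothetical.</p>

```rust
use std::rc::Rc;

struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: returns a value read through both pointers,
// plus the number of pointers that share the allocation.
fn shared_sum() -> (i32, usize) {
    let a: Rc<Point> = Rc::new(Point { x: 10, y: 20 });
    let b = Rc::clone(&a); // `a` and `b` now point at the same Point
    (a.x + b.y, Rc::strong_count(&a))
}

fn main() {
    let (sum, owners) = shared_sum();
    println!("sum = {}, pointers = {}", sum, owners);
}
```

<p>The price of the flexibility shows up as the reference count that must be maintained at runtime, which the owned version does not need.</p>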

<h1>Concurrency</h1>

<p>Another equally important aspect of the distinction between <code>@</code> and <code>~</code> is that it ensures that concurrent Rust tasks don&#8217;t race on shared memory. To illustrate this, here&#8217;s an example of broken code that doesn&#8217;t compile:</p>

<pre><code>struct Counter {
    count: int
}

fn f() {
    // Allocate a mutable counter.
    let counter: @mut Counter = @mut Counter { count: 0 };
    do spawn {               // spawn a new thread
        // Increment the counter.
        counter.count += 1;  // ERROR: attempt to capture an `@` value
    }
    println(counter.count.to_str()); // print the value
}
</code></pre>

<p>This code contains a classic <em>race</em>—if this code compiled, then the value printed would be either 0 or 1, depending on whether the <code>counter.count += 1</code> line executed first or the <code>println</code> executed first. The key here is that two threads—the spawned thread and the main thread—are both simultaneously attempting to access the <code>counter</code> object. To prevent these errors, Rust prevents multiple threads from accessing the same memory at the same time.</p>

<p>Recall from the previous section that there can be any number of pointers to memory allocated with <code>@</code>. But there can be only one pointer to memory allocated with <code>~</code>. This suggests a way to forbid multiple threads from accessing the same data: <em>restrict the types of pointers that can be sent between threads to <code>~</code> pointers</em>. And this is exactly what Rust does.</p>

<p>For any piece of <code>~</code>-allocated memory, there is only one pointer to it, and that pointer is owned by exactly one thread. So there can be no races, since any other threads simply don&#8217;t have access to that memory. Let&#8217;s rewrite our example above using <code>~</code> to illustrate this:</p>

<pre><code>fn g() {
    // Allocate a mutable counter.
    let mut counter: ~Counter = ~Counter { count: 0 };
    do spawn {               // spawn a new thread
        counter.count += 1;  // increment the counter
    }
    println(counter.count.to_str()); // ERROR: use of moved value
}
</code></pre>

<p>What&#8217;s going on here is that, by referring to <code>counter</code> inside the <code>spawn</code> block, the new thread <em>takes ownership</em> of the <code>counter</code> variable, and the <code>counter</code> variable becomes dead everywhere outside that block. Essentially, the main thread loses access to <code>counter</code> by <em>giving it away</em> to the thread it spawns. So the attempt to print the value on the screen from the main thread will fail. By contrast, this code will work:</p>

<pre><code>fn h() {
    // Allocate a mutable counter.
    let mut counter: ~Counter = ~Counter { count: 0 };
    do spawn {               // spawn a new thread
        counter.count += 1;  // increment the counter
        println(counter.count.to_str()); // OK: `counter` is owned by this thread
    }
}
</code></pre>

<p>Notice that the data race is gone: this code always prints <code>1</code>, because the printing happens in the thread that owns the <code>Counter</code> object.</p>

<p>The resulting rule is pretty simple. In short: <em><code>@</code> pointers may not be sent from thread to thread. <code>~</code> pointers may be sent, and are owned by exactly one thread at a time.</em> Therefore, if you need data to be sent, do not allocate it with <code>@</code>.</p>
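
<p>The send-by-ownership rule can be sketched in present-day Rust, assuming <code>std::thread::spawn</code> with a <code>move</code> closure in place of <code>do spawn</code> and <code>Box&lt;T&gt;</code> in place of <code>~T</code>. The function name <code>increment_in_thread</code> is hypothetical.</p>

```rust
use std::thread;

struct Counter {
    count: i32,
}

// Hypothetical helper: gives the counter away to a new thread and
// returns the value that thread computed.
fn increment_in_thread() -> i32 {
    let mut counter: Box<Counter> = Box::new(Counter { count: 0 });
    let handle = thread::spawn(move || {
        // The closure now owns `counter`; no other thread can reach it.
        counter.count += 1;
        counter.count
    });
    // println!("{}", counter.count); // would not compile: `counter` was moved
    // Capturing an `Rc` in the closure would also be rejected, because `Rc` is not `Send`.
    handle.join().expect("spawned thread panicked")
}

fn main() {
    println!("{}", increment_in_thread());
}
```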

<h1>Conclusion (TL;DR)</h1>

<p>So the distinction between <code>@</code> and <code>~</code> is often confusing to newcomers, but it&#8217;s really quite simple. There are two main rules to remember:</p>

<ol>
<li><p><code>~</code> only supports one pointer to each allocation, so if you need multiple pointers to the same data, use <code>@</code>. But <code>@</code> requires garbage collection overhead, so if this is important to your application, use <code>~</code> wherever possible.</p></li>
<li><p>Don&#8217;t use <code>@</code> pointers if you need to send data between multiple threads. Use <code>~</code> instead.</p></li>
</ol>


<p>Finally, I should note again that, if these rules are too restrictive for you (for example, if you need multiple pointers but can&#8217;t tolerate garbage collection pauses), there are more advanced solutions: borrowing, safe smart pointers, and unsafe code. But this simple system works well for many programs and forms the foundation of Rust&#8217;s approach to memory management.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The New Borrow Check in a Nutshell]]></title>
    <link href="http://pcwalton.github.com/blog/2013/01/21/the-new-borrow-check-in-a-nutshell/"/>
    <updated>2013-01-21T17:56:00-08:00</updated>
    <id>http://pcwalton.github.com/blog/2013/01/21/the-new-borrow-check-in-a-nutshell</id>
    <content type="html"><![CDATA[<p>If you&#8217;ve used Rust for any period of time, you&#8217;ve probably been bitten by the mysterious <em>borrow check</em>—the compiler pass responsible for preventing <a href="http://stackoverflow.com/questions/6438086/iterator-invalidation-rules">iterator invalidation</a>, as well as a few other dangling pointer scenarios. The current iteration of the borrow check enforces a fairly complex set of rules. Because the rules were hard to understand and ruled out too many valid programs, we were never really satisfied with the analysis; without a simple set of rules to follow, programmers will get frustrated and give up. To remedy this, Niko has proposed a <a href="http://smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/">revamp</a> of the borrow checker known as &#8220;Imagine Never Hearing the Phrase &#8216;Aliasable, Mutable&#8217; Again&#8221;. This has mostly been implemented in <a href="https://github.com/mozilla/rust/pull/4454">a pull request</a> now, so I&#8217;d like to take the opportunity to explain the new rules. I&#8217;m particularly excited about this change because now the entire set of borrow check rules are simple enough to boil down to one principle.</p>

<p>Here&#8217;s the rule that the new borrow check is in charge of enforcing: <em>Whenever you take a pointer to an object, you may not modify that object as long as that pointer exists, except through that pointer.</em></p>

<p>(Strictly speaking, this is not all the new borrow check enforces, but the other errors the pass can produce are generally straightforward and simple dangling pointer errors. Also, I&#8217;m omitting the rules related to <code>&amp;const</code>, as this rarely-used type of pointer is likely to be removed.)</p>

<p>For unique pointers (<code>~</code>) and borrowed pointers (<code>&amp;</code>), this rule is enforced at compile time, without any runtime overhead. Here&#8217;s an example:</p>

<pre><code>let mut the_magic_word = Some(~"zap");
match the_magic_word {
    None =&gt; {}
    Some(ref word) =&gt; {
        the_magic_word = None; // ERROR
        io::println(*word);
    }
}
</code></pre>

<p>Here, the line marked <code>ERROR</code> produces the error &#8220;assigning to mutable local variable prohibited due to outstanding loan&#8221;. This happens because we violated the rule above—the line <code>the_magic_word = None</code> mutates the value <code>the_magic_word</code> while there exists a pointer to it (<code>word</code>).</p>
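
<p>The same rule can be observed in present-day Rust. The sketch below is an assumption-laden translation (<code>String</code> in place of <code>~str</code>, and a hypothetical function name <code>speak_then_clear</code>); it shows that the assignment is accepted once the pointer <code>word</code> is no longer in use.</p>

```rust
// Hypothetical helper: reads the word through a pointer, then clears
// the variable after that pointer has gone out of use.
fn speak_then_clear() -> (String, Option<String>) {
    let mut the_magic_word = Some(String::from("zap"));
    let mut spoken = String::new();
    match the_magic_word {
        None => {}
        Some(ref word) => {
            // the_magic_word = None; // rejected here: `word` still points into it
            spoken.push_str(word);
        }
    }
    the_magic_word = None; // accepted: no pointer into `the_magic_word` remains
    (spoken, the_magic_word)
}

fn main() {
    let (spoken, remaining) = speak_then_clear();
    println!("{} {:?}", spoken, remaining);
}
```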

<p>Another example:</p>

<pre><code>struct Foo {
    array: ~[int]
}

impl Foo {
    fn bar(&amp;mut self) {
        for self.array.each |i| {
            self.array = ~[];  // ERROR
            io::println(i.to_str());
        }
    }
}
</code></pre>

<p>Again, the error is &#8220;assigning to mutable field prohibited due to outstanding loan&#8221;. As before, it&#8217;s traceable to a violation of the mutation rule: the line <code>self.array = ~[]</code> mutates the <code>self.array</code> field while a pointer (<code>i</code>) into it exists.</p>

<p>This example is interesting for a couple of reasons. First of all, it illustrates the way the Rust compiler can catch iterator invalidation issues without runtime overhead in many cases: here the compiler is able to detect that the <code>i</code> iterator, which has type <code>&amp;int</code>, was invalidated, and rejects the program instead of permitting undefined behavior at runtime. Second, this example illustrates something not possible under the current borrow check regime that the new borrow check allows: namely, taking an immutable pointer to a field accessible through a <code>&amp;mut</code> pointer. (An immutable pointer is needed to call the <code>each</code> method to prevent iterator invalidation.) More than any other, this restriction probably led to the greatest number of borrow check errors in practice, since it prevented iterating over any collections reachable from <code>&amp;mut</code> pointers.</p>

<p>Now all of this works fine for <code>&amp;</code> and <code>~</code> pointers, but what about managed boxes (<code>@</code>)? It turns out that immutable <code>@</code> boxes are easy to deal with; since they can&#8217;t be mutated at all, the borrow checker doesn&#8217;t have to do anything to enforce the no-mutation rule. However, for <code>@mut</code> boxes, the situation is more complicated. For <code>@mut</code> boxes, the new borrow checker inserts <em>runtime</em> checks to enforce the pointer rules. Attempting to mutate an <code>@mut</code> box while a pointer to its contents exists results in task failure at runtime, unless the mutation is done through that pointer.</p>

<p>Interestingly, this is similar to the way various debug or safe STL implementations (for example, Microsoft&#8217;s) guard against iterator invalidation. The differences are: (1) in Rust, the checks are automatically inserted by the compiler instead of built into each collection by hand; and (2) the checks are only needed for garbage collected data, as the compiler can perform the checks at compile time for other types of data.</p>
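
<p>A runtime-checked mutation rule of this kind can be sketched in present-day Rust with <code>Rc&lt;RefCell&lt;T&gt;&gt;</code>. Treating that combination as an analogue of an <code>@mut</code> box is an assumption made for illustration, not something the post describes, and <code>mutate_while_borrowed</code> is a hypothetical name.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper: reports whether mutation was refused while a
// pointer to the contents existed, and whether it was allowed afterwards.
fn mutate_while_borrowed() -> (bool, bool) {
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));

    let reader = shared.borrow(); // a pointer to the contents now exists
    let refused = shared.try_borrow_mut().is_err(); // checked at runtime
    drop(reader); // the pointer goes away

    let allowed = shared.try_borrow_mut().is_ok();
    (refused, allowed)
}

fn main() {
    println!("{:?}", mutate_while_borrowed());
}
```

<p>Using <code>borrow_mut</code> instead of <code>try_borrow_mut</code> would panic at the refused step, which is the closer match to task failure.</p>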

<p>There is one gotcha here, however. As implemented, if any pointer exists to <em>any</em> part of an <code>@mut</code> box, then the <em>entire</em> box cannot be mutated while that pointer exists. This means that this example will fail:</p>

<pre><code>struct Dungeon {
    monsters: ~[Monster],
    total_gold: int
}

impl Dungeon {
    fn count_gold(@mut self) { // note `@mut self`, not `&amp;mut self`
        self.total_gold = 0;
        for self.monsters.each |monster| { // pointer created here
            self.total_gold += monster.gold;
        }
    }
}
</code></pre>

<p>Note that the iterator variable <code>monster</code> has type <code>&amp;Monster</code>. This is a pointer to the inside of <code>Dungeon</code>, so the assignment to <code>self.total_gold</code> violates the mutation rule. Unfortunately, the compiler does not currently catch this, so the program will fail at runtime.</p>

<p>There are a couple of workarounds. The simplest way is to change <code>@mut self</code> to <code>&amp;mut self</code>. Since there is no need to give out the <code>@mut</code> pointer for this operation, this is safe. Roughly speaking, the compile-time checks operate on a per-field basis, while the runtime checks operate on a per-box basis. So this change makes the operation succeed. Another possibility is to make <code>total_gold</code> into a local variable and assign to the field after the <code>for</code> loop.</p>
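
<p>Both workarounds can be sketched in present-day Rust, assuming <code>Vec&lt;T&gt;</code> in place of <code>~[T]</code> and an ordinary <code>for</code> loop over <code>iter()</code>. The <code>Monster</code> definition and the names <code>count_gold_with_local</code> and <code>sample_totals</code> are hypothetical.</p>

```rust
struct Monster {
    gold: i32,
}

struct Dungeon {
    monsters: Vec<Monster>,
    total_gold: i32,
}

impl Dungeon {
    // Workaround 1: take `&mut self`. The compile-time check is per field,
    // so borrowing `monsters` does not block writes to `total_gold`.
    fn count_gold(&mut self) {
        self.total_gold = 0;
        for monster in self.monsters.iter() {
            self.total_gold += monster.gold;
        }
    }

    // Workaround 2: accumulate into a local and assign after the loop.
    fn count_gold_with_local(&mut self) {
        let mut total = 0;
        for monster in self.monsters.iter() {
            total += monster.gold;
        }
        self.total_gold = total;
    }
}

// Hypothetical helper that runs both variants on the same data.
fn sample_totals() -> (i32, i32) {
    let mut dungeon = Dungeon {
        monsters: vec![Monster { gold: 3 }, Monster { gold: 4 }],
        total_gold: 0,
    };
    dungeon.count_gold();
    let first = dungeon.total_gold;
    dungeon.count_gold_with_local();
    (first, dungeon.total_gold)
}

fn main() {
    println!("{:?}", sample_totals());
}
```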

<p>Despite the fact that this error is easy to fix, I&#8217;m concerned about the fact that the compiler won&#8217;t catch this kind of thing at compile time. So I think we should introduce a set of warnings that looks for common violations of this rule. It&#8217;s impossible to make the warnings catch <em>all</em> failures—that&#8217;s the reason why the check is done at runtime in the first place. (In general, trying to make the compiler reason about <code>@</code> boxes is hard, since the compiler has no idea how many references to them exist.) But I suspect that we could make the analysis good enough to catch the majority of these errors in practice.</p>

<p>In any case, the take-away from all of this is that the borrow checker should be much easier and more transparent with this change. There&#8217;s essentially just one straightforward rule to remember.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Two Meanings of "impl"]]></title>
    <link href="http://pcwalton.github.com/blog/2012/12/30/the-two-meanings-of-impl/"/>
    <updated>2012-12-30T10:42:00-08:00</updated>
    <id>http://pcwalton.github.com/blog/2012/12/30/the-two-meanings-of-impl</id>
    <content type="html"><![CDATA[<p><code>impl</code> declarations in Rust have two forms. The subtle distinction between the two can be confusing at first, so I&#8217;ll briefly explain the difference here.</p>

<p>The first form of <code>impl</code> is a <em>type implementation</em>. (Earlier I was calling this an &#8220;anonymous trait&#8221;, but I think that this terminology is probably more confusing than it&#8217;s worth.) This form allows you to define <em>new</em> functions associated with a type. For example:</p>

<pre><code>struct Dog {
    name: ~str
}

impl Dog {
    static fn new(name: ~str) -&gt; Dog {
        return Dog { name: name };
    }

    fn speak(&amp;self) {
        io::println("woof");
    }
}
</code></pre>

<p>This example defines new functions <code>new</code> and <code>speak</code> under the <code>Dog</code> namespace. Here&#8217;s an example of their use:</p>

<pre><code>let dog = Dog::new(~"Snoopy");
Dog::speak(&amp;dog); // note: doesn't work today, see note below
</code></pre>

<p>(The explicit call of the form <code>Dog::speak(&amp;dog)</code> doesn&#8217;t work today, but I wrote it out to emphasize the fact that <code>speak</code> lives in the <code>Dog</code> namespace. It&#8217;s likely to work in the future, though. Today, you need to write <code>dog.speak()</code>.)</p>

<p>The second form of <code>impl</code>, on the other hand, is a <em>trait implementation</em>. It&#8217;s distinguished from the first form by the presence of a <code>:</code> followed by the name of a trait. This form allows you to provide an implementation for one or more <em>existing</em> functions belonging to a trait. It doesn&#8217;t define any new functions. For instance, suppose I defined this trait:</p>

<pre><code>trait Animal {
    fn species(&amp;self) -&gt; ~str;
}
</code></pre>

<p>Then I can supply an implementation of <code>species()</code> for my <code>Dog</code> structure like this:</p>

<pre><code>impl Dog : Animal {
    fn species(&amp;self) -&gt; ~str {
        return ~"Canis lupus familiaris";
    }
}
</code></pre>

<p>The key point to notice here is that this form doesn&#8217;t define any new names. This code won&#8217;t compile:</p>

<pre><code>let dog = Dog::new(~"Fido");
io::println(Dog::species(&amp;dog)); // unresolved name: `species`
</code></pre>

<p>But this code will:</p>

<pre><code>let dog = Dog::new(~"Spot");
io::println(Animal::species(&amp;dog));
</code></pre>

<p>The reason is that a trait implementation only provides the implementation of one or more <em>existing</em> functions rather than defining new functions. The function <code>species</code> is part of the <code>Animal</code> trait; it&#8217;s not part of <code>Dog</code>.</p>

<p>(You might reasonably ask: Why not duplicate the name <code>species</code> into <code>Dog</code>, for convenience? The reason is name collisions: it should be possible to implement <code>Animal</code> and later implement another trait with a different function called <code>species</code> without breaking existing code.)</p>

<p>So the upshot of this is that there are two forms of implementations in Rust: the type implementation, which defines new functions, and the trait implementation, which attaches functionality to existing functions. Both use the <code>impl</code> keyword, but they&#8217;re different forms with different meanings.</p>
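
<p>The two forms still exist in present-day Rust, although the syntax has changed (<code>impl Animal for Dog</code> rather than <code>impl Dog : Animal</code>). The sketch below is a translation under that assumption; name resolution rules have also changed since this post, so it only exercises calls that name the type or the trait explicitly.</p>

```rust
struct Dog {
    name: String,
}

// Type implementation: defines new functions in the `Dog` namespace.
impl Dog {
    fn new(name: &str) -> Dog {
        Dog { name: name.to_string() }
    }

    fn speak(&self) -> String {
        format!("{} says woof", self.name)
    }
}

trait Animal {
    fn species(&self) -> String;
}

// Trait implementation: supplies a body for the existing `Animal::species`.
impl Animal for Dog {
    fn species(&self) -> String {
        String::from("Canis lupus familiaris")
    }
}

fn main() {
    let dog = Dog::new("Snoopy");
    println!("{}", Dog::speak(&dog)); // function from the type implementation
    println!("{}", Animal::species(&dog)); // function that belongs to the trait
    println!("{}", <Dog as Animal>::species(&dog)); // fully qualified form
}
```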
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Tour of Vector Representations]]></title>
    <link href="http://pcwalton.github.com/blog/2012/12/28/a-tour-of-vectors/"/>
    <updated>2012-12-28T18:43:00-08:00</updated>
    <id>http://pcwalton.github.com/blog/2012/12/28/a-tour-of-vectors</id>
    <content type="html"><![CDATA[<p>One aspect of Rust that&#8217;s often confusing to newcomers is its treatment of strings and vectors (also known as arrays or lists). As a result of its focus on systems programming, Rust has a somewhat lower-level concept of a vector than most other languages do. As part of an overall goal to make Rust easy to understand, I thought I&#8217;d write up a quick tour of the way other languages&#8217; vectors work from the perspective of the machine in order to make it easier to map these concepts into Rust.</p>

<p>There are three common models that I&#8217;ve observed in use—for lack of better terminology, I&#8217;ll call them the Java model, the Python model, and the C++ STL model. (For brevity, I&#8217;ve omitted fixed-size, stack-allocated arrays, since these are very limited.) Most languages build upon one of these three. In a subsequent blog post, I&#8217;ll explain how Rust&#8217;s system differs from these and how the programmer can build the equivalents of each of these models in Rust.</p>

<p>We&#8217;ll start with the Java model. Java&#8217;s basic array type has a fixed size when created and cannot be changed afterward. Arrays in Java are always allocated on the Java heap. For example, consider the following line of code:</p>

<pre><code>int[] a = { 1, 2, 3, 4, 5 };
</code></pre>

<p>After this code is executed, the memory of the running program looks like this:</p>

<p><img src="http://i.imgur.com/zeztF.png" alt="image" /></p>

<p>The cell highlighted in red is the value of type <code>int[]</code>. It&#8217;s a <em>reference type</em>, which means that it represents a <em>reference</em> to the data rather than the data itself. This is important when assigning one array value to another. For instance, we execute this code:</p>

<pre><code>int[] b = a;
</code></pre>

<p>And now the memory looks like this:</p>

<p><img src="http://i.imgur.com/0t6xB.png" alt="image" /></p>

<p>Both values are pointing at the same underlying storage. We call this <em>aliasing</em> the array buffer. In Java, any number of values can point to the same underlying array storage. Because of this, the language has no idea how many pointers point to the storage at compile time; therefore, to determine when to clean up the storage, Java uses garbage collection. Periodically, the entire heap is scanned to determine whether any references to the array storage remain, and if there are none, the buffer is freed.</p>

<p>Now this model is simple and fast, but, since the arrays have a fixed size, the programmer can&#8217;t add new elements to them once they&#8217;re created. This is a very common thing to want, so Java provides another type, <code>java.util.ArrayList</code>, for this. As it turns out, the model used by Java&#8217;s <code>ArrayList</code> is essentially the same model that Python uses for all of its lists.</p>

<p>Let&#8217;s look at this model more closely. Consider this statement in Python:</p>

<pre><code>a = [ 1, 2, 3, 4, 5 ]
</code></pre>

<p>Once this is executed, the memory looks like this:</p>

<p><img src="http://i.imgur.com/xXjOU.png" alt="image" /></p>

<p>As in Java, the cell highlighted in red (<code>a</code>) is the value that actually has the Python type <code>list</code>. We can see this if we assign <code>a</code> to <code>b</code>:</p>

<pre><code>b = a
</code></pre>

<p><img src="http://i.imgur.com/jLJRj.png" alt="image" /></p>

<p>Obviously, the disadvantage of this model is that it requires two allocations instead of one. The advantage of this model is that new elements can be added to the end of the vector, and all outstanding references to the vector will see the new elements. Suppose that the vector had capacity 5 when initially created, so that no room exists to add new elements onto the end of the existing storage. Then when we execute the following line:</p>

<pre><code>b.append(6)
</code></pre>

<p>The memory looks like this:</p>

<p><img src="http://i.imgur.com/U0g4w.png" alt="image" /></p>

<p>Here, Python has created a new and larger allocation for the storage, copied the existing elements over, and freed the old allocation (indicated in gray). Because <code>a</code> and <code>b</code> both point to the <code>PyListObject</code> allocation, which has <em>not</em> changed, they both see the new elements:</p>

<pre><code>&gt;&gt;&gt; a
[1, 2, 3, 4, 5, 6]
&gt;&gt;&gt; b
[1, 2, 3, 4, 5, 6]
</code></pre>

<p>In summary, Python&#8217;s model sacrifices some efficiency at runtime because it requires both garbage collection and two allocations, but it gains flexibility by permitting both aliasing and append operations.</p>

<p>Turning our attention to the C++ STL, we find that it has a different model from both Python and Java: it sacrifices aliasing but retains the ability for vectors to grow. For instance, after this C++ STL code executes:</p>

<pre><code>std::vector&lt;int&gt; a;
a.push_back(1);
a.push_back(2);
a.push_back(3);
a.push_back(4);
a.push_back(5);
</code></pre>

<p>The memory looks like this:</p>

<p><img src="http://i.imgur.com/dEQG3.png" alt="image" /></p>

<p>As before, the red box indicates the value of type <code>std::vector</code>. It is stored directly on the stack. It is still fundamentally a reference type, just as vectors in Python and Java are; note that the underlying storage does not have the type <code>std::vector&lt;int&gt;</code> but instead has the type <code>int[]</code> (a plain old C array).</p>

<p>Like Python vectors, STL vectors can grow. After executing this line:</p>

<pre><code>a.push_back(6);
</code></pre>

<p>The STL does this (assuming that there isn&#8217;t enough space to grow the vector in-place):</p>

<p><img src="http://i.imgur.com/QYFGV.png" alt="image" /></p>

<p>Just as the Python list did, the STL vector allocated new storage, copied the elements over, and deleted the old storage.</p>

<p>Unlike Java arrays, however, STL vectors do not support aliasing the contents of the vector (at least, not without some unsafe code). Instead, assignment of a value of type <code>std::vector</code> copies the contents of the vector. Consider this line:</p>

<pre><code>std::vector&lt;int&gt; b = a;
</code></pre>

<p>This code results in the following memory layout:</p>

<p><img src="http://i.imgur.com/KIPIa.png" alt="image" /></p>

<p>The entire contents of the vector were copied into a new allocation. This is, as you might expect, a quite expensive operation, and represents the downside of the C++ STL approach. However, the STL approach comes with significant upsides as well: no garbage collection (via tracing GC or reference counting) is required, there is one less allocation to manage, and the vectors are allowed to grow just as Python lists are.</p>
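
<p>The observable behavior of the three models (aliasing, growth, copy on assignment) can be imitated with present-day Rust library types. The mapping below is an assumption chosen for illustration: <code>Rc&lt;[i32]&gt;</code> for the Java model, <code>Rc&lt;RefCell&lt;Vec&lt;i32&gt;&gt;&gt;</code> for the Python model, and a plain <code>Vec&lt;i32&gt;</code> for the STL model. It uses reference counting rather than a tracing collector, and the function names are hypothetical.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Java model: fixed-size storage that any number of values may alias.
fn java_model() -> (i32, usize) {
    let a: Rc<[i32]> = Rc::from(vec![1, 2, 3, 4, 5]);
    let b = Rc::clone(&a); // `b` aliases the same buffer
    (b[0], Rc::strong_count(&a))
}

// Python model: aliasable handle to growable storage (two allocations).
fn python_model() -> Vec<i32> {
    let a = Rc::new(RefCell::new(vec![1, 2, 3, 4, 5]));
    let b = Rc::clone(&a);
    b.borrow_mut().push(6); // append through `b`
    let seen_through_a = a.borrow().clone(); // `a` sees the new element
    seen_through_a
}

// STL model: growable, but assignment copies the contents.
fn stl_model() -> (Vec<i32>, Vec<i32>) {
    let mut a = vec![1, 2, 3, 4, 5];
    let b = a.clone(); // full copy into a new allocation
    a.push(6); // does not affect `b`
    (a, b)
}

fn main() {
    println!("{:?}", java_model());
    println!("{:?}", python_model());
    println!("{:?}", stl_model());
}
```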

<p>This covers the three main vector representations in use by most languages. They&#8217;re fairly standard and representative; if I didn&#8217;t mention a language here, it&#8217;s likely that its implementation uses one of these three techniques. It&#8217;s important to note that none of these are right or wrong per se—they all have advantages and disadvantages. In a future post, I&#8217;ll explain the way Rust&#8217;s vector model allows the programmer to choose the model appropriate for the task at hand.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Typestate Is Dead, Long Live Typestate!]]></title>
    <link href="http://pcwalton.github.com/blog/2012/12/26/typestate-is-dead/"/>
    <updated>2012-12-26T19:54:00-08:00</updated>
    <id>http://pcwalton.github.com/blog/2012/12/26/typestate-is-dead</id>
    <content type="html"><![CDATA[<p>One well-known fact about Rust is that the typestate system, which was one of the most unique aspects of the language early on, was dropped in Rust 0.4. The reason was that &#8220;in practice, it found little use&#8221; (courtesy of Wikipedia), which is fairly accurate. However, what&#8217;s less well known is that, in the meantime, Rust gained the building blocks necessary for typestate via its uniqueness typing system. With the right patterns, most of the safety guarantees that typestate enabled can be achieved, although it&#8217;s not as easy to use.</p>

<p>Let&#8217;s start with the simple example of a file that can be open or closed. We want to ensure at compile time that no methods that require the file to be open (for example, reading) can be called on the file while it is closed. With typestate, we would define the functions as follows:</p>

<pre><code>use core::libc;

struct File {
    descriptor: int
}

pred is_open(file: File) -&gt; bool {
    return file.descriptor &gt;= 0;
}

fn open(path: &amp;str) -&gt; File : is_open {
    let file = File { descriptor: libc::open(path) };
    check is_open(file);
    return file;
}

fn close(file: &amp;mut File) {
    libc::close(file.descriptor);
    file.descriptor = -1;
}

fn read(file: &amp;File : is_open, buf: &amp;mut [u8], len: uint) {
    libc::read(file.descriptor, ...)
}
</code></pre>

<p>And this is how this module might be used:</p>

<pre><code>fn main() {
    let file: File : is_open = open("hello.txt");
    read(&amp;file, ...);
    close(file);

    read(&amp;file, ...);    // error: expected File : is_open but found File
    check is_open(file); // will fail at runtime
}
</code></pre>

<p>The constructs here that differ from Rust of today are:</p>

<ul>
<li><p><em>Constraints</em> are special type kinds that can be attached to types with the <code>:</code> syntax; e.g. <code>File : is_open</code>.</p></li>
<li><p>The <code>pred</code> keyword declares a <em>predicate</em> function, which defines both a function and a constraint.</p></li>
<li><p>All values have unconstrained types when initially constructed. To add a constraint to a type, we use the <code>check</code> keyword. The <code>check</code> expression evaluates a predicate and fails at runtime if the predicate returns <code>false</code>; otherwise, it adds the appropriate constraint to the type of the predicate&#8217;s argument.</p></li>
</ul>


<p>Now let&#8217;s look at how we could achieve this in current Rust. We use the <em>branding pattern</em>:</p>

<pre><code>struct File&lt;State&gt; {
    priv descriptor: int,
}

// Make the type noncopyable.
impl&lt;T&gt; File&lt;T&gt; : Drop {
    fn finalize(&amp;self) {}
}

struct Open(@Open);
struct Closed(@Closed);

fn check_open&lt;T&gt;(file: File&lt;T&gt;) -&gt; File&lt;Open&gt; {
    assert file.descriptor &gt;= 0;
    let new_file: File&lt;Open&gt; = File {
        descriptor: file.descriptor
    };
    return new_file;
}

fn open(path: &amp;str) -&gt; File&lt;Open&gt; {
    let file: File&lt;Closed&gt; = File { descriptor: libc::open(path) };
    let file: File&lt;Open&gt; = check_open(file);
    return file;
}

fn close&lt;T&gt;(file: File&lt;T&gt;) -&gt; File&lt;Closed&gt; {
    let new_file: File&lt;Closed&gt; = File {
        descriptor: -1
    };
    libc::close(file.descriptor);
    return new_file;
}

fn read(file: &amp;File&lt;Open&gt;, buf: &amp;mut [u8], len: uint) {
    libc::read(file.descriptor, ...)
}
</code></pre>

<p>Using this code has a different feel to it:</p>

<pre><code>fn main() {
    let file: File&lt;Open&gt; = open("hello.txt");
    read(&amp;file, ...);
    let file: File&lt;Closed&gt; = close(file);

    read(&amp;file, ...);  // error: expected File&lt;Open&gt; but found File&lt;Closed&gt;
    let file: File&lt;Open&gt; = check_open(file); // will fail at runtime
}
</code></pre>

<p>The differences between this code and the original code using typestate are:</p>

<ul>
<li><p>Rather than directly altering the constraints attached to a value&#8217;s type, the functions that change typestate take a value of one type and return a different value of a different type. For example, <code>close()</code> takes a value of <code>File&lt;T&gt;</code> for any state <code>T</code> and returns a value of type <code>File&lt;Closed&gt;</code>.</p></li>
<li><p>Instead of the built-in notion of a predicate, this code uses a <em>phantom type</em>. A phantom type is a type for which no values can be constructed; in this example, there is no way to construct a value of type <code>Open</code> or <code>Closed</code>. Instead, these types are solely used as &#8220;markers&#8221;. In the code above, a value of type <code>File&lt;Open&gt;</code> represents an open file, and a value of type <code>File&lt;Closed&gt;</code> represents a closed file. We call these <em>branded types</em>, because <code>File</code> is <em>branded</em> with the <code>Open</code> or <code>Closed</code> status. Generics (e.g. <code>File&lt;T&gt;</code>) can be used when the state of a file is irrelevant; e.g. if a function can operate on both closed and open files.</p></li>
<li><p><code>File</code> instances are made noncopyable. This is important to prevent code like this from compiling:</p>

<pre><code>let file: File&lt;Open&gt; = open("hello.txt");
let _: File&lt;Closed&gt; = close(file); // ignore the return value
read(&amp;file, ...);  // error: use of moved value `file`
</code></pre></li>
</ul>


<p>The important idea is that to get a closed file, you must first surrender your open file. The uniqueness system in Rust allows the compiler to ensure this: when you change typestates, you must move your original value away, and the compiler will ensure that you can&#8217;t access it again.</p>

<ul>
<li><p>The file descriptor field is made private to the containing module. This is important to disallow other modules from forging open or closed <code>File</code> instances. Otherwise, other code could simply convert a closed file to an open file the same way <code>check_open</code> does, but without the runtime check:</p>

<pre><code>let open_file: File&lt;Open&gt; = open("hello.txt");
let closed_file: File&lt;Closed&gt; = close(open_file);
let fake_open_file: File&lt;Open&gt; = File { descriptor: closed_file.descriptor };
// ^^^ error: use of private field 'descriptor'
read(&amp;fake_open_file, ...);
</code></pre></li>
</ul>


<p>Since the <code>File</code> structure contains a private field, no code other than the containing module can create one. In this way, we ensure that nobody can forge instances of <code>File</code> and violate our invariants.</p>
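
<p>For readers following along with a newer compiler, here is a minimal sketch of the whole pattern in more recent Rust syntax. It is an illustration under assumptions rather than the code from this post: the <code>file</code> module, the integer <code>descriptor</code>, the hard-coded value 3, and <code>demo</code> are all made up, and <code>PhantomData</code> is the standard library marker used to carry the otherwise unused state parameter.</p>

```rust
mod file {
    use std::marker::PhantomData;

    // Marker types: empty enums have no values, so they only serve as brands.
    pub enum Open {}
    pub enum Closed {}

    // `descriptor` is private, so code outside this module cannot forge a File.
    pub struct File<State> {
        descriptor: i32,
        _state: PhantomData<State>,
    }

    // Hypothetical stand-in for a real open call; the descriptor is made up.
    pub fn open(_path: &str) -> File<Open> {
        File { descriptor: 3, _state: PhantomData }
    }

    // Reading only borrows the file, and only accepts the Open brand.
    pub fn read(file: &File<Open>) -> i32 {
        file.descriptor
    }

    // Closing consumes a file in any state and hands back a Closed one.
    pub fn close<State>(file: File<State>) -> File<Closed> {
        File { descriptor: file.descriptor, _state: PhantomData }
    }
}

pub fn demo() -> i32 {
    let f = file::open("hello.txt");
    let fd = file::read(&f);
    let _closed: file::File<file::Closed> = file::close(f);
    // file::read(&f);       // rejected: `f` was moved into `close`
    // file::read(&_closed); // rejected: expected `File<Open>`, found `File<Closed>`
    fd
}
```

<p>Both commented-out calls should be rejected at compile time, the first because of the move and the second because of the brand, which are the two properties the list above relies on.</p>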

<p>Now, it&#8217;s obvious that this isn&#8217;t perfect in terms of usability. For one, it&#8217;s a design pattern, and design patterns are the sincerest form of request for syntax. I&#8217;m not particularly concerned about this aspect, however, because syntactic sugar is readily achievable with macros.</p>

<p>The issue that I&#8217;m concerned with is deeper. One nice thing about typestate as previously implemented is that you don&#8217;t have to surrender your value; you can effectively &#8220;mutate&#8221; its type &#8220;in-place&#8221;. This saves you from writing temporary variables all over the place and also saves some (cheap) copies at runtime. For example, you can write:</p>

<pre><code>let file = open("hello.txt");
read(&amp;file, ...);
close(file);
</code></pre>

<p>Instead of:</p>

<pre><code>let file = open("hello.txt");
read(&amp;file, ...);
let file = close(file);
</code></pre>

<p>In Rust, however, this causes complications, which we never fully resolved. (In fact, this is part of what led to typestate&#8217;s removal.) Suppose that <code>close</code> mutated the type of its argument to change it from <code>&amp;File&lt;Open&gt;</code> to <code>&amp;File&lt;Closed&gt;</code>. Then consider the following code:</p>

<pre><code>trait Foo {
    fn speak(&amp;self);
}

impl File&lt;Open&gt; : Foo {
    fn speak(&amp;self) {
        io::println("woof");
    }
}

trait Bar {
    fn speak(&amp;self);
}

impl File&lt;Closed&gt; : Bar {
    fn speak(&amp;self) {
        io::println("meow");
    }
}

let file = open("hello.txt");
for 5.times {
    file.speak();
    close(&amp;file);
}
</code></pre>

<p>How do we compile this code? The first time around the <code>for 5.times { ... }</code> loop, <code>file.speak()</code> should resolve to <code>Foo::speak</code>; the second time around, <code>file.speak()</code> should resolve to <code>Bar::speak</code>. Needless to say, this makes compiling extremely difficult: we would have to consider the lexical scope of every single method invocation and compile it for <em>each</em> possible predicate!</p>
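
<p>By contrast, the move-based encoding keeps method resolution static, because each state lives in its own binding with its own type. The following is a small sketch in more recent Rust syntax, not the syntax used above; the unit-struct markers, the argument-free <code>open</code>, and <code>demo</code> are assumptions made for illustration.</p>

```rust
use std::marker::PhantomData;

pub struct Open;
pub struct Closed;

pub struct File<State> {
    _state: PhantomData<State>,
}

pub trait Foo {
    fn speak(&self) -> &'static str;
}

pub trait Bar {
    fn speak(&self) -> &'static str;
}

impl Foo for File<Open> {
    fn speak(&self) -> &'static str {
        "woof"
    }
}

impl Bar for File<Closed> {
    fn speak(&self) -> &'static str {
        "meow"
    }
}

pub fn open() -> File<Open> {
    File { _state: PhantomData }
}

// Takes the open file by value, so the caller must give it up.
pub fn close(_file: File<Open>) -> File<Closed> {
    File { _state: PhantomData }
}

pub fn demo() -> (&'static str, &'static str) {
    let file = open();
    let first = file.speak(); // receiver is File<Open>: resolves to Foo::speak
    let file = close(file); // a new binding with a new type shadows the old one
    let second = file.speak(); // receiver is File<Closed>: resolves to Bar::speak
    (first, second)
}
```

<p>Each call site has exactly one receiver type, so the compiler never has to compile a method call once per possible state. The price is that a loop like the one above cannot be written without rebinding <code>file</code> on each state change.</p>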

<p>Because of these and other complications, mutating the type doesn&#8217;t seem possible in the general case. We would certainly need to introduce some set of restrictions; perhaps we would need to formalize the notion of a &#8220;constraint&#8221; in the type system (probably by introducing a new type kind) and then introduce some restrictions on implementation declarations to prevent instances from depending on constraints. Whatever system we come up with would be pretty complex and would require a fair bit of thought to get right.</p>

<p>So I&#8217;d like to try to play with the current setup and see how far we get with it. In future versions of the language (post-1.0), it might be worthwhile to try to allow some sort of in-place &#8220;mutation&#8221; of types, similar to languages with true typestate. Overall, though, the combination of uniqueness and branding places today&#8217;s Rust in an interesting position, supporting much of the power that came with typestate in a simple system.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Unique Pointers Aren't Just About Memory Management]]></title>
    <link href="http://pcwalton.github.com/blog/2012/10/03/unique-pointers-arent-just-about-memory-management/"/>
    <updated>2012-10-03T11:32:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2012/10/03/unique-pointers-arent-just-about-memory-management</id>
    <content type="html"><![CDATA[<p>One of the most unusual features of Rust, especially when compared to languages that aren&#8217;t C++, is the three types of pointers: <em>borrowed</em> pointers (<code>&amp;T</code>), <em>unique</em> pointers (<code>~T</code>), and <em>managed</em> pointers (<code>@T</code>). Most people quite rightly ask &#8220;why three pointers? Isn&#8217;t one enough?&#8221; The usual answer is that unique pointers help with manual memory management:</p>

<ul>
<li><p>Managed pointers (<code>@T</code>) allow convenient garbage collection.</p></li>
<li><p>Unique pointers (<code>~T</code>) work like <code>malloc</code> and <code>free</code> from C to allow programmers who don&#8217;t want the overhead and complexity of GC to avoid it.</p></li>
<li><p>Borrowed pointers (<code>&amp;T</code>) allow functions to work equally well with both unique and managed pointers.</p></li>
</ul>


<p>This is all true, but there&#8217;s another, equally important reason that&#8217;s often overlooked: unique pointers allow for efficient, safe concurrency.</p>

<p>To see why, let&#8217;s consider the possible ways that an actor- or CSP-based system could enforce safe message passing. By <em>safe</em> message passing I mean that actors can&#8217;t create data races by simultaneously accessing shared mutable data. In short, we want to enforce that this adage is followed (courtesy of Rob Pike)&#8211;&#8220;do not communicate by sharing memory; share memory by communicating.&#8221;</p>

<p>There are three simple ways to do this:</p>

<ol>
<li><p>Copy all messages sent from actor to actor. Changes that one actor makes to the contents of any message do not affect the other actors&#8217; copies of the message.</p></li>
<li><p>Require that all messages sent from actor to actor be immutable. No actor may make changes to any message after it&#8217;s created.</p></li>
<li><p>Make messages inaccessible to the sender once sent&#8211;senders &#8220;give away&#8221; their messages. Only one actor may mutate a message at any given time.</p></li>
</ol>


<p>Each of these patterns has advantages and disadvantages:</p>

<ol>
<li><p>Copying all messages has the advantage that it&#8217;s simple to reason about, and the programmer doesn&#8217;t have to worry about mutability restrictions. The disadvantage is that it comes with a significant performance cost, both in terms of allocation overhead and the copying itself.</p></li>
<li><p>Requiring that messages be immutable has the advantage that many messages can be efficiently sent, but it still can lead to copying in many cases. Consider, for example, an application that spawns off a task to decode a large JPEG image. To be efficient, the image decoding algorithm generally wants to decode into a mutable buffer. But the decoded image data must be immutable to be sent, which necessitates a potentially-expensive copy of the pixel data out of the work buffer to an immutable location.</p></li>
<li><p>Making messages inaccessible to the sender has the advantage that it&#8217;s simple and fast, but it has the disadvantage that it could lead to copying if both the sender and receiver need to access the memory after the send operation.</p></li>
</ol>


<p>Because one pattern rarely fits every use case, most actor-based languages, including Rust, have varying levels of support for all three of these patterns (and for more complex patterns that don&#8217;t appear in the above list, such as <a href="http://en.wikipedia.org/wiki/Software_transactional_memory">software transactional memory</a>). However, each language tends to favor one of the three patterns &#8220;by default&#8221;. For example, Erlang leans toward option #1 (copying all messages), Clojure leans toward option #2 (immutable sharing), while Rust leans toward option #3 (giving messages away). The important thing to note here is that all of the patterns have advantages and disadvantages, and so different scenarios will call for one or the other. Consider the image decoding example from before; pattern #3 is by far the most efficient way to handle this, as the buffer needs to be mutable while the image decoder works on it, but the decoder has no need for the image after decoding is done.</p>

<p>Now the simplest way to support pattern #3 <em>safely</em>&#8211;in other words, to enforce <em>at compile time</em> that only one actor can hold onto a message at any given time&#8211;is through unique pointers. The compiler guarantees that only one reference exists to a uniquely-owned object, enforcing the property we want. Unique pointers support a <em>move</em> operation, which allows functions to &#8220;give a pointer away&#8221; to another function. So by simply requiring that the &#8220;send&#8221; method takes a unique pointer and moves its argument, we teach the compiler everything it needs to know to enforce safe concurrency.</p>
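
<p>To make the image decoding scenario concrete, here is a rough sketch in more recent Rust syntax, using standard library threads and channels rather than the task API of the time. The <code>decode_image</code> function is a hypothetical stand-in that just fills a buffer, and <code>decode_in_background</code> is likewise an invented name.</p>

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for a decoder: fills a mutable buffer in place.
fn decode_image(width: usize, height: usize) -> Vec<u8> {
    let mut pixels = vec![0u8; width * height];
    for (i, px) in pixels.iter_mut().enumerate() {
        *px = (i % 256) as u8;
    }
    pixels
}

pub fn decode_in_background(width: usize, height: usize) -> Vec<u8> {
    let (sender, receiver) = mpsc::channel();
    let worker = thread::spawn(move || {
        let pixels = decode_image(width, height);
        // `send` takes the buffer by value, moving ownership to the receiver.
        sender.send(pixels).expect("receiver hung up");
        // pixels.len(); // would not compile: `pixels` was moved by `send`
    });
    let pixels = receiver.recv().expect("worker hung up");
    worker.join().expect("worker panicked");
    pixels
}
```

<p>The buffer is mutable while the worker fills it, and sending it moves only the owning handle, not the pixel data. After the send, the worker can no longer touch the buffer, which is the compile-time guarantee described above.</p>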

<p>In this way, unique pointers aren&#8217;t just a tool for manual memory management. They&#8217;re also a powerful tool for eliminating data races at compile time. The fact that they also allow Rust programs to avoid the garbage collector is just an added bonus.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A Gentle Introduction to Traits in Rust]]></title>
    <link href="http://pcwalton.github.com/blog/2012/08/08/a-gentle-introduction-to-traits-in-rust/"/>
    <updated>2012-08-08T10:46:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2012/08/08/a-gentle-introduction-to-traits-in-rust</id>
    <content type="html"><![CDATA[<p>Rust traits pack a lot of flexibility into a simple system, and they&#8217;re one of my favorite features of the language. But as a result of the rapid pace of the language&#8217;s development, there&#8217;s been a fair amount of confusion as to how they work. As such, I figured I&#8217;d write up a quick tutorial explaining why and how to use them.</p>

<p>This tutorial assumes only basic knowledge of C-like languages, so I&#8217;ll try to explain everything specific to Rust that might be unclear along the way. Also note that a couple of these features are unimplemented, so if you try this today the syntax will be a little different.</p>

<h2>Simple implementations</h2>

<p>In keeping with the theme of my previous blog posts on classes, let&#8217;s start by writing a game. I&#8217;ll start by defining a struct <code>Monster</code> and a struct <code>Player</code> like this:</p>

<pre><code>struct Monster {
    name: &amp;str;      // `&amp;str` is a reference to a string
    mut health: int; // `mut` indicates that the health can be changed
}

struct Player {
    mut health: int;
}
</code></pre>

<p>Now I can create instances of each:</p>

<pre><code>fn main() {  // `fn` defines a function
    let monster = Monster {
        name: "Gelatinous Cube",
        health: 50
    };
    let player = Player {
        health: 100
    };
}
</code></pre>

<p>Without some functionality, this isn&#8217;t a particularly interesting game. So let&#8217;s add a method to <code>Monster</code>:</p>

<pre><code>impl Monster {
    fn attack(&amp;self, player: &amp;Player) {
        // fmt! is string formatting; this prints "Gelatinous Cube hits you!"
        io::println(fmt!("%s hits you!", self.name));
        player.health -= 10;
    }
}
</code></pre>

<p>And I can call it this way, inside <code>main</code>:</p>

<pre><code>monster.attack(&amp;player);
</code></pre>

<p>There are several things to note here.</p>

<ul>
<li><p>References are explicit in Rust: the <code>&amp;</code> sigil indicates that the method <code>attack</code> takes a reference to the player, not the player itself. If I didn&#8217;t write that, then the player would be copied into the method instead (and we&#8217;d get a compiler warning, because this indicates a bug).</p></li>
<li><p>I use the keyword <code>impl</code> to declare methods for a type. <code>impl</code> declarations can appear anywhere in the module that declared the type. The <code>struct</code> and <code>impl</code> pair appears a lot in Rust code; it nicely separates out data from implementation. Objective-C and C++ programmers will find this familiar.</p></li>
<li><p>Within an implementation, functions with a <code>self</code> parameter become methods. Python programmers will find this &#8220;explicit self&#8221; familiar. Because references are explicit in Rust, you specify how <code>self</code> is supposed to be passed; in this case, by reference (<code>&amp;self</code>).</p></li>
</ul>


<h2>Generics</h2>

<p>Now that we have basic implementations covered, let&#8217;s look at something completely different: generics. (We&#8217;ll come back to implementations later on.) Like many other languages, Rust features generic functions: functions that can operate on many different types. For example, here&#8217;s a function that returns true if a vector is empty:</p>

<pre><code>// Vectors are written with square brackets around the type; e.g. a vector of
// ints is written `[int]`.
fn is_empty&lt;T&gt;(v: &amp;[T]) -&gt; bool {
    return v.len() == 0;
}
</code></pre>

<p>The generic type parameters are written inside the angle brackets (<code>&lt;</code> and <code>&gt;</code>), after the function name.</p>

<p>There&#8217;s nothing much more to say here; generics are pretty simple. In this form, however, they&#8217;re pretty limited, as we&#8217;ll see.</p>

<h2>Limitations of generics</h2>

<p>Let&#8217;s go back to our game example. Suppose I want to add functionality to save the state of the game to disk in <a href="http://en.wikipedia.org/wiki/JSON">JSON</a>. I&#8217;ll implement some methods on <code>Monster</code> and <code>Player</code> to do this:</p>

<pre><code>impl Monster {
    // `~str` means "a pointer to a string that'll be automatically freed"
    fn to_json(&amp;self) -&gt; ~str {
        return fmt!("{ name: \"%s\", health: %d }", self.name, self.health);
    }
}

impl Player {
    fn to_json(&amp;self) -&gt; ~str {
        return fmt!("{ health: %d }", self.health);
    }
}
</code></pre>

<p>Now imagine that I wanted a function to save any actor (either a monster or a player) into a file. Because monsters and players are different types, I need to use a generic function to handle both. My first attempt at the function looks like this:</p>

<pre><code>fn save&lt;T&gt;(filename: &amp;str, actor: &amp;T) {
    // Because the writer returns an error code, I use .get() to mean "require
    // that this succeeded, and abort the program if it didn't".
    let writer = io::file_writer(filename, [ io::create, io::truncate ]).get();
    writer.write(actor.to_json());
    // Because of RAII, the file will automatically be closed.
}
</code></pre>

<p>Uh-oh. This doesn&#8217;t compile. I get the following error: &#8220;attempted access of field <code>to_json</code> on type <code>&amp;T</code>, but no public field or method with that name was found&#8221;.</p>

<p>What the Rust compiler is telling me is that it doesn&#8217;t know that the type <code>T</code> in this function contains the method <code>to_json</code>. And, in fact, it might not. As written above, it&#8217;d be perfectly legal to call <code>save</code> on any type at all:</p>

<pre><code>struct Penguin {
    name: &amp;str;
}

save("penguin.txt", &amp;Penguin { name: "Fred" });
// But how do I convert penguins to JSON?
</code></pre>

<p>So I&#8217;m stuck. But Rust provides a solution: traits.</p>

<h2>Trait declaration</h2>

<p>Traits are the way to tell the Rust compiler about <em>functionality that a type must provide</em>. They&#8217;re very similar in spirit to interfaces in Java, C#, and Go, and are similar in implementation to typeclasses in Haskell. They provide the solution to the problem I&#8217;m facing: I need to tell the Rust compiler, first of all, that some types can be converted to JSON, and, additionally, for the types that can be converted to JSON, how to do it.</p>

<p>To define a trait, I simply use the <code>trait</code> keyword:</p>

<pre><code>trait ToJSON {
    fn to_json(&amp;self) -&gt; ~str;
}
</code></pre>

<p>This declares a trait named <code>ToJSON</code>, with one method that all types that implement the trait must define. That method is named <code>to_json</code>, and it takes its <code>self</code> parameter by reference.</p>

<p>Now I can define implementations of <code>ToJSON</code> for the various types I&#8217;m interested in. These implementations are exactly the same as above, except that we add <code>: ToJSON</code>.</p>

<pre><code>impl Monster : ToJSON {
    // `~str` means "a pointer to a string that'll be automatically freed"
    fn to_json(&amp;self) -&gt; ~str {
        return fmt!("{ name: \"%s\", health: %d }", self.name, self.health);
    }
}

impl Player : ToJSON {
    fn to_json(&amp;self) -&gt; ~str {
        return fmt!("{ health: %d }", self.health);
    }
}
</code></pre>

<p>That&#8217;s all there is to it. Now I can modify the <code>save</code> function so that it does what I want.</p>

<h2>Trait usage</h2>

<p>Recall that the reason why the <code>save</code> function didn&#8217;t compile is that the Rust compiler didn&#8217;t know that the <code>T</code> type contained a <code>to_json</code> method. What I need is some way to tell the compiler that this function only accepts types that contain the methods I need to call. This is accomplished through <em>trait restrictions</em>. I modify the <code>save</code> function as follows:</p>

<pre><code>fn save&lt;T:ToJSON&gt;(filename: &amp;str, actor: &amp;T) {
    let writer = io::file_writer(filename, [ io::create, io::truncate ]).get();
    writer.write(actor.to_json());
}
</code></pre>

<p>Note the addition of <code>:ToJSON</code> after the type parameter. This indicates that the function can only be called with types that implement the trait.</p>

<p>Now these calls to <code>save</code> will compile:</p>

<pre><code>save("player.txt", &amp;player);
save("monster.txt", &amp;monster);
</code></pre>

<p>But this call will not:</p>

<pre><code>save("penguin.txt", &amp;Penguin { name: "Fred" });
</code></pre>

<p>I get the error &#8220;failed to find an implementation of trait <code>ToJSON</code> for <code>Penguin</code>&#8221;, just as expected.</p>
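
<p>For comparison, here is a sketch of the same trait and bounded generic function in more recent Rust syntax. It departs from the code above in a few labeled ways: the trait is spelled <code>ToJson</code>, strings are <code>String</code> rather than <code>~str</code>, the JSON keys are quoted, and <code>save</code> writes to any <code>std::io::Write</code> value instead of opening a file, purely so that it is easy to try out.</p>

```rust
use std::io::{self, Write};

pub trait ToJson {
    fn to_json(&self) -> String;
}

pub struct Monster {
    pub name: String,
    pub health: i32,
}

pub struct Player {
    pub health: i32,
}

impl ToJson for Monster {
    fn to_json(&self) -> String {
        // `{{` and `}}` are escaped braces in format strings.
        format!("{{ \"name\": \"{}\", \"health\": {} }}", self.name, self.health)
    }
}

impl ToJson for Player {
    fn to_json(&self) -> String {
        format!("{{ \"health\": {} }}", self.health)
    }
}

// The `T: ToJson` bound is what lets the body call `to_json` on `actor`.
pub fn save<T: ToJson, W: Write>(out: &mut W, actor: &T) -> io::Result<()> {
    out.write_all(actor.to_json().as_bytes())
}

pub struct Penguin {
    pub name: String,
}
// save(&mut out, &Penguin { .. }) would not compile: Penguin has no ToJson impl.
```

<p>Calling <code>save</code> with a <code>Vec&lt;u8&gt;</code> as the output and a <code>Player</code> as the actor fills the vector with the player&#8217;s JSON text.</p>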

<h2>Summing up</h2>

<p>These are the basic features of traits and comprise most of what Rust programmers will need to know. There are only a few more features beyond these, which I&#8217;ll mention briefly:</p>

<ul>
<li><p><em>Special traits</em>. Some traits are known to the compiler and represent the built-in operations. Most notably, this includes the ubiquitous <code>copy</code> trait, which invokes the copy operation that occurs when you assign with <code>let x = y</code>. You&#8217;ll see <code>T:copy</code> in many generic functions for this reason. Other special traits include <code>send</code>, which is a trait that indicates the type is sendable, and <code>add</code>, <code>sub</code>, etc., which indicate the built-in arithmetic operators. The key is that, in all cases, traits simply specify <em>what a generic type can do</em>; when you want to do something with a type parameter like <code>T</code>, you specify a trait.</p></li>
<li><p><em>Generic traits</em>. Traits can be generic, which is occasionally useful.</p></li>
<li><p><em>Default implementations</em>. It&#8217;s often helpful for traits to provide default implementations of their methods that take over when the type doesn&#8217;t provide an implementation of its own. For example, the default implementation of <code>to_json()</code> might want to use the Rust reflection API to automatically create JSON for any type, even if that type doesn&#8217;t manually implement the <code>to_json()</code> method. (Note that this feature is currently being implemented.)</p></li>
<li><p><em>Trait composition</em>. Sometimes we want one trait to include another trait. For example, the <code>Num</code> trait, which all number types in the language implement, obviously includes addition, subtraction, multiplication, etc. Trait composition allows traits to be &#8220;glued together&#8221; in this way. Note that this isn&#8217;t <em>inheritance</em>; it&#8217;s simply a convenience that allows trait methods to be combined together, like a mixin. (This is not fully implemented yet.)</p></li>
<li><p><em>First-class trait values</em>. Rarely, it&#8217;s necessary to have a trait be a first-class value, like in Java or Go, instead of attached to a generic type parameter. This doesn&#8217;t come up particularly often, but Rust does support it in the rare cases in which it&#8217;s needed. Idiomatic Rust uses generics instead of Java-like interfaces.</p></li>
</ul>
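
<p>Three of the features listed above (default implementations, trait composition, and first-class trait values) can be sketched together in more recent Rust syntax. The <code>Named</code> and <code>Greeter</code> traits and the <code>greet_all</code> function are invented for this illustration, and the spelling of these features differs from what was available when this post was written.</p>

```rust
pub trait Named {
    fn name(&self) -> String;
}

// Trait composition: every `Greeter` must also be `Named`.
pub trait Greeter: Named {
    // Default implementation; implementors may override it.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

pub struct Player;
pub struct Monster;

impl Named for Player {
    fn name(&self) -> String {
        "player".to_string()
    }
}

// An empty impl picks up the default `greet`.
impl Greeter for Player {}

impl Named for Monster {
    fn name(&self) -> String {
        "Gelatinous Cube".to_string()
    }
}

impl Greeter for Monster {
    // Overrides the default.
    fn greet(&self) -> String {
        format!("{} growls.", self.name())
    }
}

// First-class trait values: boxed trait objects of different concrete types.
pub fn greet_all(actors: &[Box<dyn Greeter>]) -> Vec<String> {
    actors.iter().map(|actor| actor.greet()).collect()
}
```

<p>The generic style from the rest of this post would instead write <code>fn greet_one&lt;T: Greeter&gt;(actor: &amp;T)</code>; the trait object form is only needed when values of several types must live in one collection.</p>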


<p>That&#8217;s about all there is to traits. Traits are essentially Rust&#8217;s object system, but they&#8217;re simpler than many object systems and integrate especially well with generics.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Maximally Minimal Classes for Rust]]></title>
    <link href="http://pcwalton.github.com/blog/2012/06/03/maximally-minimal-classes-for-rust/"/>
    <updated>2012-06-03T14:35:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2012/06/03/maximally-minimal-classes-for-rust</id>
    <content type="html"><![CDATA[<p>Now that classes have been implemented as per the original proposal, the other Rusters and I have been starting to get a feel for the way they work out in practice. The results are positive, but not optimal. Although they definitely succeeded in avoiding the rigidity of traditional object-oriented languages like Java, they still have two basic problems: (1) they feel somewhat out of place with the rest of the language; and (2) they&#8217;re still too heavyweight. Nevertheless, the functionality that they enabled is important, and we shouldn&#8217;t sacrifice it.</p>

<p>Language design tends to go in cycles: we grow the language to accommodate new functionality, then shrink the language as we discover ways in which the features can be orthogonally integrated into the rest of the system. Classes seem to me to be on the upward trajectory of complexity; now it&#8217;s time to shrink them down. At the same time, we shouldn&#8217;t sacrifice the functionality that they enable.</p>

<p>In Rust, classes provide five main pieces of functionality that don&#8217;t otherwise exist: (1) nominal records; (2) constructors; (3) privacy on the field level; (4) attached methods; and (5) destructors. I&#8217;ll go over these five features in turn and discuss how each one could be simplified.</p>

<h2>Nominal records</h2>

<p>Classes in Rust are nominal records. A class in this form:</p>

<pre><code>class monster {
    let mut health: int;
    let name: str;
}
</code></pre>

<p>Is basically the moral equivalent of:</p>

<pre><code>enum monster {
    monster({
        mut health: int,
        name: str
    })
}
</code></pre>

<p>Clearly, the class form is much easier to read and much less confusing for users of the language; &#8220;enum&#8221; makes little sense as there&#8217;s nothing enumerated here. Nevertheless, there&#8217;s a bit of unnecessary noise in the form of the <code>let</code> keyword. We could simplify it to:</p>

<pre><code>class monster {
    mut health: int,
    name: str
}
</code></pre>

<p>It&#8217;s less typing, and it matches record syntax exactly.</p>

<h2>Constructors</h2>

<p>Those who have used Rust classes in their current form know that the above example class <code>monster</code> is incomplete. I still have to define a constructor for <code>monster</code>, like so:</p>

<pre><code>class monster {
    let mut health: int;
    let name: str;

    new(health: int, name: str) {
        self.health = health;
        self.name = name;
    }
}
</code></pre>

<p>This is probably the most burdensome part of classes as they currently stand&#8211;having to repeat each field name four times, and each type twice, is annoying. Many languages have solutions for this (CoffeeScript and Dart, for example), so we could consider adopting one of these languages&#8217; syntactic sugar for something like:</p>

<pre><code>class monster {
    let mut health: int;
    let name: str;

    new(self.health, self.name) {}  // sugar for the above
}
</code></pre>

<p>Unfortunately, it doesn&#8217;t stop there. Constructors have other problems. For one, there can only be one constructor per class&#8211;this is far more restrictive than Java, which permits constructor overloading. Worse, constructors can&#8217;t indicate that they failed; they can only fail the task or set some internal &#8220;this failed&#8221; flag, both of which are clearly unsatisfactory. The right way to report a recoverable error to the caller in Rust is to use the <code>result</code> type, but constructors can&#8217;t return <code>result&lt;self&gt;</code>; they can only return <code>self</code>.</p>

<p>I think the easiest way to address these problems, following the idea that classes are just nominal records, is to abolish constructors entirely and adopt record literal syntax for initializing classes. So a class like this:</p>

<pre><code>class monster {
    mut health: int,
    name: str
}
</code></pre>

<p>Would be initialized with:</p>

<pre><code>let foe = monster {
    health: 100,
    name: "Bigfoot"
};
</code></pre>

<p>If you want to declare one or more &#8220;constructor&#8221; functions, perhaps to signal success or failure, that&#8217;s easy; they&#8217;re just functions in the same crate:</p>

<pre><code>fn monster(health: int, name: str) -&gt; result&lt;monster&gt; {
    if name == "King Kong" || name == "Godzilla" {
        ret err("Trademark violation");
    }
    ret ok(monster { health: health, name: name });
}
</code></pre>

<p>But note that you only have to write a constructor if you&#8217;re doing something special, like returning an error or initializing private fields. If your class is simple and merely holds public state, then your callers can just use the record literal syntax to create instances of the class.</p>
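
<p>As a point of comparison, the same idea can be sketched in more recent Rust syntax, where the nominal record is a <code>struct</code> and the &#8220;constructor&#8221; is an ordinary function returning <code>Result</code>. Placing it in an <code>impl</code> block under the name <code>new</code> is a convention assumed here, not something this proposal specifies, and the <code>String</code> error type is chosen only for brevity.</p>

```rust
#[derive(Debug, PartialEq)]
pub struct Monster {
    pub health: i32,
    pub name: String,
}

impl Monster {
    // An ordinary function acting as a fallible "constructor".
    pub fn new(health: i32, name: &str) -> Result<Monster, String> {
        if name == "King Kong" || name == "Godzilla" {
            return Err("Trademark violation".to_string());
        }
        Ok(Monster { health, name: name.to_string() })
    }
}

// With all fields public, callers can also skip the function entirely.
pub fn literal_monster() -> Monster {
    Monster { health: 100, name: "Bigfoot".to_string() }
}
```

<p>Because the function is not special, a type can have as many of these as it likes, each with its own name and its own way of reporting failure.</p>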

<h2>Privacy</h2>

<p>Classes in Rust allow private fields:</p>

<pre><code>class monster {
    let priv mut health: int;
    let name: str;

    ...

    fn hit() {
        self.health -= 10;
    }
}
</code></pre>

<p>This is extremely useful functionality for modularity. But Rust already has a mechanism for privacy, via exports. For example, in order to write an enum whose contents are hidden from the outside world:</p>

<pre><code>enum color {
    priv red;
    priv green;
    priv blue;
}
</code></pre>

<p>(Note that the syntax here is changing; for posterity, I&#8217;m using the new syntax, but note that the code here doesn&#8217;t work at the time of this writing, as it&#8217;s not yet implemented.)</p>

<p>Only this module can construct instances of this enum, or even inspect its contents, because while the enum itself can be named, none of its variants can. So we could apply the same principle to fields of classes:</p>

<pre><code>mod A {
    mod B {
        class monster {
            priv mut health: int,
            name: str
        }

        fn hit(monster: &amp;monster) {
            monster.health -= 10;    // OK
        }
    }

    fn heal(monster: &amp;monster) {
        monster.health += 10;        // error: field "health" is private
    }
}
</code></pre>

<p>Here, a field marked with <code>priv</code> can only be named (and therefore accessed) by the module that declares it and by the modules nested inside that module. It works like every other instance of <code>priv</code> in the language: it restricts the use of a name to the enclosing module and its submodules.</p>

<p>It would be an error for any module other than the module defining the class, or one of its submodules, to attempt to construct an instance of a class with a private field using the record literal syntax. This means that, if you use private fields, you need a constructor function if you want your class instances to be constructible by the outside world.</p>
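
<p>Module-scoped field privacy can be sketched in more recent Rust syntax as follows. Fields there are private unless marked <code>pub</code>, so the sketch has no <code>priv</code> keyword; the lowercase module names, <code>new_monster</code>, and the <code>health</code> accessor are assumptions added so that the example can be exercised from outside the module.</p>

```rust
pub mod a {
    pub mod b {
        pub struct Monster {
            health: i32, // private: visible only inside `b` and its submodules
            pub name: String,
        }

        // Needed because outside code cannot name `health` in a struct literal.
        pub fn new_monster(name: &str) -> Monster {
            Monster { health: 100, name: name.to_string() }
        }

        pub fn hit(monster: &mut Monster) {
            monster.health -= 10; // OK: same module as the field
        }

        pub fn health(monster: &Monster) -> i32 {
            monster.health
        }
    }

    pub fn heal(_monster: &mut b::Monster) {
        // _monster.health += 10; // would not compile: field `health` is private
    }
}
```

<p>Uncommenting the line in <code>heal</code> should produce a privacy error, since module <code>a</code> encloses <code>b</code> but is not inside it.</p>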

<h2>Methods</h2>

<p>Naturally, Rust classes support attached methods; this is much of the reason for their existence. But Rust already has a mechanism for creating methods&#8211;namely, typeclasses. We could write the above <code>monster</code> declaration this way:</p>

<pre><code>mod A {
    class monster {
        priv mut health: int,
        name: str
    }

    impl monster for &amp;monster {
        fn hit() {
            self.health -= 10;
        }
    }
}
</code></pre>

<p>The trick here is that the typeclass implementation is named <code>monster</code>, so a declaration like <code>import A::monster</code> will import both the class and the implementation. This entire scenario works because, with privacy restricted to the module, there is no need to place methods inside the class to achieve privacy.</p>

<p>Sometimes, it&#8217;s useful to have the hidden <code>self</code> parameter actually be a GC&#8217;d pointer to an instance of the class. In the original class proposal, this is accomplished with a separate type of class named <code>@class</code>. However, with this revised proposal, the <code>@class</code> functionality falls out naturally, without any extra features:</p>

<pre><code>class monster {
    priv mut health: int,
    name: str,
    friends: dvec&lt;@monster&gt;  // a dynamic vector
}

impl monster for @monster {
    fn befriend(new_friend: @monster) {
        new_friend.friends.push(self);
    }
}
</code></pre>

<p>It&#8217;d be best if we could eliminate the repetition of the <code>monster</code> name in the <code>impl</code> declaration, so I propose inferring it:</p>

<pre><code>impl for @monster {
    fn befriend(new_friend: @monster) {
        new_friend.friends.push(self);
    }
}
</code></pre>

<p>The name of the implementation would automatically be inferred to be the name of the class if, given a class C, the type is one of <code>C</code>, <code>@C</code>, <code>~C</code>, or <code>&amp;C</code>.</p>

<p>Note that, since traits can be applied to implementations, we can apply traits to classes in this way.</p>

<p>It would be ideal to eliminate the <code>impl</code> declaration entirely. However, this relies on typeclass coherence, which I&#8217;d like to keep separate to avoid coupling proposals. Nevertheless, it&#8217;s worth mentioning; so, in a forthcoming post, I&#8217;ll show how typeclass coherence can make method declaration syntax even simpler.</p>

<h2>Destructors</h2>

<p>Classes are intended to be the only mechanism for destructors in Rust. Unfortunately, there&#8217;s no obvious way to eliminate destructors from classes in a minimal way. There are a number of options:</p>

<ol>
<li><p>Keep destructors in classes, and remove resources.</p></li>
<li><p>Keep resources around, and remove destructors from classes.</p></li>
<li><p>Make the destructor interface (<code>drop</code>) into a special kind of &#8220;intrinsic interface&#8221; which enforces <em>instance coherence</em>. Then remove both resources and destructors from classes. (Recall that instance coherence means that each class can only have one implementation of an interface, which is clearly, to my mind, a necessity if destructors are to become an interface.)</p></li>
<li><p>Make <em>all</em> interfaces enforce instance coherence, make <code>drop</code> into an interface, and remove both resources and destructors from the language.</p></li>
</ol>


<p>I prefer option (4), but, as mentioned before, that&#8217;s a separate issue.</p>
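
<p>To give a feel for what a destructor written as an interface implementation looks like, here is a sketch in more recent Rust syntax, where <code>Drop</code> is a trait and a type can have at most one implementation of it. The <code>Resource</code> type and its shared log are invented purely to make the destructor calls observable.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

// The destructor is just the single method of the `Drop` trait.
impl Drop for Resource {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

pub fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Resource { name: "a", log: Rc::clone(&log) };
        let _b = Resource { name: "b", log: Rc::clone(&log) };
        // Both values go out of scope at the end of this block;
        // locals are dropped in reverse order of declaration.
    }
    let entries = log.borrow().clone();
    entries
}
```

<p>No class-specific machinery is involved: the type is a plain record, and the cleanup code is attached to it the same way any other interface implementation would be.</p>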

<p>Finally, with nearly all of the special functionality of classes removed, it&#8217;s worth asking why records continue to exist. Indeed, I&#8217;ve been thinking for a while that structural records should be removed from the language, but the reasons for this tie into a deeper discussion on structural and nominal types and deserve their own blog post.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Coherence, modularity, and extensibility for typeclasses]]></title>
    <link href="http://pcwalton.github.com/blog/2012/05/28/coherence/"/>
    <updated>2012-05-28T22:12:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2012/05/28/coherence</id>
    <content type="html"><![CDATA[<p>I&#8217;ve been experimenting with the design of a modification to Rust typeclasses. Because it&#8217;s always best to start with code, here&#8217;s a synopsis of what I have in mind:</p>

<pre><code>mod A {
    // Declaration of an interface:
    iface to_str {
        fn to_str() -&gt; str;

        // Implementations for various types:

        impl int {
            fn to_str() -&gt; str {
                ... implementation of to_str on ints ...
            }
        }

        impl uint {
            fn to_str() -&gt; str {
                ... implementation of to_str on unsigned ints ...
            }
        }

        ... more types here ...
    }

    // Define a class and declare that it implements to_str:
    class foo : to_str {
        fn to_str() -&gt; str {
            ret "foo";
        }
    }
}

mod B {
    import A::to_str;    // Must import the interface first, so
                         // that the compiler can find the method
                         // "to_str".

    println(3.to_str()); // Calls the "to_str" defined above.
}

mod C {
    let x = A::foo();    // Creates a foo object named "x".

    x.to_str();          // Calls "to_str" on "x". Note that I
                         // didn't have to import the "to_str"
                         // method in this scope—since it was
                         // defined inside the declaration of the
                         // class "foo", it's obvious what the
                         // implementation is.
} 
</code></pre>

<p>Essentially, the implementations of an interface go <em>inside</em> the declaration of the interface, with one significant exception: a class is permitted to define implementations of interfaces in the body of the class. The compiler prohibits multiple implementations of the same interface on the same type using two simple rules: (1) implementations defined within an interface must be non-overlapping (i.e. there can&#8217;t be any types which match multiple implementations); and (2) a class can&#8217;t implement an interface that already defines an implementation which might itself match an instance of that class.</p>
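
<p>To make the two permitted locations concrete, here is a loose analogue in present-day Rust syntax. It is not the proposed design: implementations cannot be nested inside a trait declaration, so they sit beside it in the same module instead. The names <code>a</code>, <code>ToStr</code>, and <code>Foo</code> mirror the synopsis above, and the string formats are assumptions made for illustration.</p>

<pre><code>mod a {
    // Declaration of the interface.
    pub trait ToStr {
        fn to_str(&amp;self) -&gt; String;
    }

    // Implementations for existing types live next to the interface.
    impl ToStr for i32 {
        fn to_str(&amp;self) -&gt; String {
            format!("{}", self)
        }
    }

    impl ToStr for u32 {
        fn to_str(&amp;self) -&gt; String {
            format!("{}u", self)
        }
    }

    // A nominal type carries its own implementation.
    pub struct Foo;

    impl ToStr for Foo {
        fn to_str(&amp;self) -&gt; String {
            String::from("foo")
        }
    }
}

// The interface must be imported before its methods can be called.
use a::ToStr;

fn main() {
    println!("{}", 3i32.to_str());
    println!("{}", 7u32.to_str());
    println!("{}", a::Foo.to_str());
}
</code></pre>

<p>Adding a second <code>impl ToStr for i32</code> to this program is rejected at compile time as a conflicting implementation, which corresponds to rule (1). Unlike the proposal, this analogue also needs the trait in scope for the call on <code>Foo</code>.</p>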

<p>The fact that the implementations go inside the interface is a little strange—it resembles the proposed Java defender methods, although it&#8217;s used for a completely different purpose—but I believe there is an important reason for it. It means that, if a programmer wants to look up the definition of a method call, he or she can simply figure out which interface it belongs to, which must always be in scope via an <code>import</code> statement, and then look at the declaration of the interface to find the method.</p>

<p>Fundamentally, the guiding principles behind this design are that the typeclass system should be <em>coherent</em> and <em>modular</em> while supporting <em>extensibility</em>. Here are the definitions of these terms as I see them:</p>

<p><em>Coherent</em> — A typeclass system is coherent if there exists at most one implementation of an interface for every type. Typeclass systems that don&#8217;t have this property have the <em>instance coherence</em> problem (or, as we called it when we independently stumbled across it, the &#8220;hash table problem&#8221;.)</p>
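
<p>The hash table problem can be simulated without typeclasses at all. The sketch below, in present-day Rust syntax, passes the hash function explicitly so that two call sites can disagree about it, which is what an incoherent system permits implicitly. <code>Table</code>, <code>hash_a</code>, and <code>hash_b</code> are hypothetical names used only for this example.</p>

<pre><code>// A tiny fixed-size hash table whose operations take the hash function
// as an argument, standing in for "whichever implementation is in scope".
struct Table {
    buckets: Vec&lt;Vec&lt;(u32, &amp;'static str)&gt;&gt;,
}

impl Table {
    fn new() -&gt; Table {
        Table { buckets: vec![Vec::new(); 8] }
    }

    fn insert(&amp;mut self, hash: fn(u32) -&gt; usize, key: u32, value: &amp;'static str) {
        let index = hash(key) % self.buckets.len();
        self.buckets[index].push((key, value));
    }

    fn get(&amp;self, hash: fn(u32) -&gt; usize, key: u32) -&gt; Option&lt;&amp;'static str&gt; {
        let index = hash(key) % self.buckets.len();
        self.buckets[index]
            .iter()
            .find(|entry| entry.0 == key)
            .map(|entry| entry.1)
    }
}

// Two conflicting "implementations of hash for u32".
fn hash_a(key: u32) -&gt; usize {
    key as usize
}

fn hash_b(key: u32) -&gt; usize {
    (key as usize) + 1
}

fn main() {
    let mut table = Table::new();
    table.insert(hash_a, 42, "answer");
    println!("{:?}", table.get(hash_a, 42)); // same hash function: the entry is found
    println!("{:?}", table.get(hash_b, 42)); // different hash function: the lookup misses
}
</code></pre>

<p>The entry is still in the table, but code that sees the other implementation looks in the wrong bucket and cannot find it.</p>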

<p><em>Modular</em> — A typeclass system is modular if the unit in the source code that carries the implementation for every method must be in the lexical scope of every call site that needs the implementation (or, for nominal types only, in the lexical scope of the declaration of the type). This is a little unclear, so some examples are in order. First, a simple one:</p>

<pre><code>import vec::len;
printf("The length is %u", [ 1, 2, 3 ].len());
</code></pre>

<p>In this example, we need the implementation for <code>len</code> in scope in order to make a direct call to the method <code>len</code>.</p>

<p>Now a more complex example:</p>

<pre><code>fn print_length&lt;T:len&gt;(x: T) {
    printf("The length is %u", x.len());
}

import vec::len;
print_length([ 1, 2, 3 ]);
</code></pre>

<p>Here, we need the definition of <code>len</code> in scope at the time we call <code>print_length</code>. Because <code>print_length</code> can print the length of any value that implements the <code>len</code> interface, it doesn&#8217;t intrinsically know which method to call. This information has to be provided by the caller of <code>print_length</code>. For this reason, the call to <code>print_length</code> requires the implementation <code>vec::len</code> to be in scope.</p>
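
<p>For comparison, the same shape can be written in present-day Rust syntax. This is only an analogue: the module <code>lengths</code>, the trait <code>Length</code>, and the function <code>describe_length</code> are hypothetical stand-ins for <code>vec::len</code> and <code>print_length</code>, and the function returns its message instead of printing it so that the result is easy to check.</p>

<pre><code>mod lengths {
    // Hypothetical interface standing in for `len`.
    pub trait Length {
        fn length(&amp;self) -&gt; usize;
    }

    impl&lt;T&gt; Length for Vec&lt;T&gt; {
        fn length(&amp;self) -&gt; usize {
            self.len()
        }
    }
}

// Bring the interface, and with it the implementation, into scope.
use lengths::Length;

// The function does not know which implementation it will call;
// the bound `T: Length` makes the caller supply one.
fn describe_length&lt;T: Length&gt;(x: T) -&gt; String {
    format!("The length is {}", x.length())
}

fn main() {
    println!("{}", describe_length(vec![1, 2, 3]));
}
</code></pre>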

<p>In typeclass systems that aren&#8217;t modular, modules that define conflicting typeclass implementations usually can&#8217;t be linked together. For instance, in such a system, if module <code>A</code> implements <code>len</code> for vectors and module <code>B</code> independently implements <code>len</code> for vectors, then modules A and B can&#8217;t be used together in the same program. Obviously, this poses a hazard for large systems composed of many independently developed submodules.</p>

<p><em>Extensibility</em> — A typeclass system facilitates extensibility if it&#8217;s possible for the programmer to introduce a new interface and provide implementations of that interface for existing types in the system. This is what makes typeclasses act like object extensions; it&#8217;s also what makes user-defined typeclasses useful on primitive types.</p>

<p>Many languages have typeclass or interface systems, but to my knowledge none of the mainstream systems support all three of these features. For example:</p>

<p><em>C++</em>—C++ concepts support extensibility, but aren&#8217;t coherent and are only somewhat modular. The C++ language permits out-of-line definition of custom operations on class and enum types. As an example, to provide an ordering on vectors of integers:</p>

<pre><code>#include &lt;vector&gt;
bool operator&lt;(const std::vector&lt;int&gt; &amp;a, const std::vector&lt;int&gt; &amp;b) {
    ...
}
</code></pre>

<p>In this way, C++ concepts are extensible. But there&#8217;s no check to ensure that there is only one such definition in the program for each data type, so C++ concepts aren&#8217;t coherent. In this example, other namespaces can define <code>operator&lt;</code> over the same types.</p>

<p>Generally, C++ scoping rules ensure that a function can never be called outside of its lexical scope. But there is a significant exception: argument-dependent lookup. With ADL, a function can be called outside of its lexical scope if that function was defined in the same scope as the type of one of its arguments. This feature was intended for extensibility, as it allows collections like <code>std::map</code> to pick up definitions of functions like <code>operator&lt;</code> even if the functions aren&#8217;t in scope. However, it clearly comes at the cost of modularity.</p>

<p><em>Haskell</em>—Haskell typeclasses are coherent and support extensibility, but aren&#8217;t modular. Haskell programmers can define instances of typeclasses for any type in the system, but there can be only one instance of a typeclass for every type in the program. This can cause problems when two modules are linked together—if, say, module A defines <code>Show</code> of <code>Int</code> and module B independently defines <code>Show</code> of <code>Int</code>, modules A and B can&#8217;t be linked together.</p>

<p><em>Java</em> and <em>Go</em>—Java interfaces are modular and coherent, but aren&#8217;t extensible. In Java, an implementation of an interface can be defined only within the declaration of the type itself. This means, in particular, that interfaces can&#8217;t be defined on primitive types. It also means that a module can&#8217;t define an interface and then declare an implementation of the interface on existing types without modifying the existing type. Go interfaces have the same limitations (unless you define an interface over methods that are already defined on the type in question).</p>

<p><em>Scala</em>—Scala interfaces are modular but only mostly coherent; they also offer some support for extensibility. Unsurprisingly, interfaces in Scala are basically the same as interfaces in Java. The major difference is that, unlike Java, Scala provides a mechanism for extending classes with implementations of interfaces without having to modify the definition of the class—namely, implicits. This feature is extremely useful for extensibility; it also solves the problem of methods on primitive types in an elegant way. The trouble is that implicits are somewhat inconvenient to use; the programmer must define an implicit wrapper around the object, so the <code>this</code> parameter won&#8217;t refer to the object itself but rather to the wrapper. Equally importantly, implicits don&#8217;t enforce coherence—two modules can define two different implicits on the same type.</p>

<p><em>Objective-C</em>—Objective-C categories support extensibility, but aren&#8217;t modular or coherent. In Objective-C, methods can be added to existing classes by defining a new category for that class and placing the methods within that category. The problems with categories are that method calls are all late-bound (precluding static scoping), and that the behavior when two modules that define conflicting category implementations are linked together is <em>undefined</em>: the resulting object might provide one implementation, or it might provide the other implementation. Either way, the resulting program is unlikely to work.</p>

<p><em>Current Rust</em>—The current Rust implementation of typeclasses is modular and supports extensibility, but it doesn&#8217;t maintain coherence. Implementations are separate from interfaces in Rust (except for classes), and interfaces and implementations can both be defined over primitive types. The trouble is that there can be multiple conflicting implementations for typeclasses, which can lead to the instance coherence problem.</p>

<p>So how does this proposed design compare?</p>

<ul>
<li><p>It offers coherence because there can be only one implementation of an interface for each type. For the implementations provided within the interface itself, we can check that they&#8217;re non-overlapping. For the implementations defined with classes, we can check to ensure that the interface implementations don&#8217;t overlap with the implementations that the interface itself defined. Either way, the checks involved are simple and ensure that we meet the criterion for coherence.</p></li>
<li><p>It offers modularity because the implementation either has to be imported as part of the interface (for implementations defined inside interfaces) or as part of the nominal type (for class implementations). Consequently, it is never the case that two Rust crates cannot be linked together because of conflicting typeclass implementations.</p></li>
<li><p>It offers extensibility because, when an interface is defined, implementations can be provided for any existing types without modifying the declarations of those types.</p></li>
</ul>


<p>Finally, it supports all three of these features while maintaining a minimal feature set.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why Lifetimes?]]></title>
    <link href="http://pcwalton.github.com/blog/2012/04/23/why-lifetimes/"/>
    <updated>2012-04-23T23:19:00-07:00</updated>
    <id>http://pcwalton.github.com/blog/2012/04/23/why-lifetimes</id>
    <content type="html"><![CDATA[<p>One of the most unique new features of Rust is its slowly-growing support for <em>regions</em>&mdash;or <em>lifetimes</em>, as some of us core developers like to call them. As lifetimes aren&#8217;t found in any mainstream languages, I thought I&#8217;d expand upon why we want them and how they can be used to improve memory management for performance (especially interactive performance) without sacrificing safety. In this first post I&#8217;ll explain why existing memory models weren&#8217;t enough and why we went searching for alternatives. Here I&#8217;m assuming basic knowledge of garbage collection, reference counting, and <code>malloc</code>/<code>free</code>, but nothing more.</p>

<p>The programming models that the current crop of mainstream programming languages expose can be divided pretty evenly into two camps: <em>explicitly-managed</em> environments and <em>garbage-collected</em> environments. By far, the most common programming languages built around explicitly-managed environments are C, C++, and Objective-C, and explicit memory management is so associated with these languages that it&#8217;s often just called &#8220;the C memory model&#8221;. Almost all other languages in mainstream use are garbage collected&mdash;Java, C#, JavaScript, Python, Ruby, Lisp, Perl, and tons of other languages all fall into this camp. (Note that here I&#8217;m using &#8220;garbage collection&#8221; in the general sense to mean automatic memory management; some of these languages don&#8217;t have <em>tracing</em> garbage collection and instead use reference counting.)</p>

<p>Now C and its derivatives famously offer a huge amount of control over memory usage&mdash;the built-in language features make it easy to implement stack allocation, ownership (i.e. explicit <code>new</code> and <code>delete</code>), memory pools, and reference counting (manually or with smart pointers or Objective-C&#8217;s Automatic Reference Counting). Most large C/C++/Objective-C codebases use all four strategies. Some programs (like Firefox and OS kernels) even implement their own general-purpose memory allocators. (A few use conservative garbage collectors, like the Boehm GC, but these are in the minority, so I&#8217;ll leave them aside.) This flexibility has a real benefit, especially for real-time and interactive apps (like web browsers!). Not only does explicit memory management tend to spread out the load so that pauses associated with tracing GC don&#8217;t appear, but it also provides a clear path toward improving performance whenever <code>malloc</code> and <code>free</code> do become expensive. In C++, for example, if you profile a program and see lots of expensive calls to <code>operator new</code> near the top, you can often just drop the <a href="http://www.boost.org/doc/libs/1_47_0/libs/pool/doc/index.html">Boost pool library</a>  into your code, change <code>new</code> to <code>new (pool)</code>, and call it a day.</p>

<p>Of course, all this power comes at a huge cost: namely, memory safety. Dangling pointers, wild pointers, and buffer overruns are not only annoying and costly in terms of hard-to-find bugs but also deadly from a security perspective. Heap spray attacks make any vtable dispatch on a freed object into an exploitable security vulnerability. Think about that for a second: <em>in C++, you&#8217;re always one virtual method call away from an exploitable security vulnerability</em>. You can, of course, mitigate this with sandboxing, but sandboxing has a performance and maintenance cost, and mitigating these costs isn&#8217;t easy.</p>

<p>Because of the huge costs associated with manual memory management, a huge amount of programming these days has shifted to languages that require garbage-collected environments. These include all of the scripting languages, as well as Java and C#. Garbage collection brings about enormous productivity savings (because the programmer doesn&#8217;t have to think as much about memory management) and also enormous security benefits. An entire class of security vulnerabilities (buffer overruns, use-after-free, stack overflow) basically ceases to exist for programs running in a garbage-collected environment (to be replaced by exciting new security vulnerabilities such as SQL injection, but that&#8217;s another story).</p>

<p>The problem with garbage collection is that, now that memory management isn&#8217;t explicit (i.e. the compiler can no longer know statically when to recycle memory), lifetimes have to be discovered at runtime&mdash;and that entails a performance cost. Tracing stop-the-world garbage collectors (and cycle collectors) have to suspend the entire program for pauses that can last hundreds of milliseconds, a fact which hurts lots of programs&mdash;for instance, mobile apps really need to be able to draw at 60 frames per second, ruling out any pause longer than 16 ms. Incremental garbage collection is better, but it&#8217;s tricky to implement and causes a loss of throughput, because the compiler has to insert extra operations on every modification of a pointer. And because everything has to essentially be done dynamically (barring simple static analyses like escape analysis), there will always be scenarios in which a fully garbage collected system loses to a manually-managed one&mdash;and both major open source web browser engines have zero tolerance for performance regressions.</p>

<p>There are many workarounds in garbage-collected languages for the lack of manual memory management. For example, <em>free lists</em> are a popular technique in languages like Java to reduce GC pause times. The idea is simple&mdash;when you have a large number of objects of the same type that you constantly allocate and deallocate, you keep a pool of old objects around and reuse them. The programmer is then responsible for manually allocating and deallocating objects from this free list. This is definitely an effective way to reduce allocations when it&#8217;s needed. But, unfortunately, there are a number of downsides to this approach.</p>

<p>First of all, garbage-collected languages usually don&#8217;t have any built-in syntax for creating objects out of a free list instead of the heap. The built-in constructor for the object can only be called on a fresh heap allocation. The usual workaround for this is to create an <code>init</code> method on the object or to create a factory object, but all of those approaches tend to look awkward syntactically. This problem itself isn&#8217;t a deal-breaker&mdash;after all, Java programmers frequently make factory classes for other reasons&mdash;but it does compound the awkwardness of the free list pattern. Of course, in and of itself, this wouldn&#8217;t be sufficient grounds to add a large amount of complexity to a garbage-collected language to support this pattern.</p>

<p>But there&#8217;s a much worse problem: <em>free lists are inherently unsafe</em>. They aren&#8217;t unsafe in the same way as C++, to be sure&mdash;in C++, there are serious security vulnerabilities to contend with&mdash;but they still allow for many of the same bugs that dangling pointers entail. To see why, notice that a free list has no idea when no more references remain to the objects that it hands out. In fact, it can&#8217;t know how many references remain to the objects allocated within it&mdash;at least, not without reference counting or tracing the object graph, which would lead back to GC and defeat the purpose of the free list! So a free list must require manual memory management. When the programmer frees an object that&#8217;s managed by a free list, it&#8217;s the programmer&#8217;s responsibility to ensure that no more references to it remain. If the programmer accidentally leaks a reference, then that object might be reused for a new instance, and a potentially hard-to-find bug will result. It&#8217;s <code>malloc</code> and <code>free</code> all over again.</p>
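
<p>The hazard can be sketched in a few lines. The example below is in present-day Rust syntax and uses indices as the &#8220;references&#8221; into a pool; <code>Pool</code>, <code>alloc</code>, <code>free</code>, and <code>get</code> are hypothetical names, and a free list in a language like Java would hand out object references rather than indices.</p>

<pre><code>// A pool of reusable string slots with a free list of vacant indices.
struct Pool {
    slots: Vec&lt;String&gt;,
    free: Vec&lt;usize&gt;,
}

impl Pool {
    fn new() -&gt; Pool {
        Pool { slots: Vec::new(), free: Vec::new() }
    }

    // Reuses a vacant slot if one exists; otherwise grows the pool.
    fn alloc(&amp;mut self, value: &amp;str) -&gt; usize {
        match self.free.pop() {
            Some(index) =&gt; {
                self.slots[index] = value.to_string();
                index
            }
            None =&gt; {
                self.slots.push(value.to_string());
                self.slots.len() - 1
            }
        }
    }

    // The pool cannot tell whether other copies of `index` are still held.
    fn free(&amp;mut self, index: usize) {
        self.free.push(index);
    }

    fn get(&amp;self, index: usize) -&gt; &amp;str {
        &amp;self.slots[index]
    }
}

fn main() {
    let mut pool = Pool::new();
    let alice = pool.alloc("alice");
    let leaked = alice;          // a second reference the programmer forgot about
    pool.free(alice);            // manual deallocation
    let bob = pool.alloc("bob"); // the slot is recycled for a new object
    println!("{} {}", bob, pool.get(leaked)); // the stale handle now reads the new object
}
</code></pre>

<p>Nothing here corrupts memory, but the stale handle silently observes an unrelated object, which is the same logic error a dangling pointer produces.</p>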

<p>The end result of this is that we seem to be trapped between the rock of unpredictable performance and the hard place of programmer burdens and security vulnerabilities. The current set of commonly-used languages doesn&#8217;t provide solutions here.</p>

<p>Fortunately, the research landscape offers some promising potential solutions, which I&#8217;ll cover next time.</p>
]]></content>
  </entry>
  
</feed>
