Composable and Compilable Macros

Papers We Love may be the best meetup group in New York (that I don’t run, anyway =). They find a new speaker every month or so to give a talk on a paper that they especially enjoy. These papers cover topics from garbage collection, to VMs, to CRDTs, to information retrieval.

We had an especially good one this Thursday. Sam Tobin-Hochstadt came to give a talk on Matthew Flatt’s Composable and Compilable Macros. I’ll admit that I don’t always read the papers before the talk, but I sure did this time. There’s a whole crowd of researchers doing interesting things with Racket,¹ and this is one of the more foundational papers in that genre.

Here’s why I like it, with some context: when this paper was published in 2002, Racket (still called “PLT Scheme” then) didn’t have a proper module system. Instead, you’d just load up your dependencies with load². So if your code used some functions defined over in observerfactoryfactory.scm, you’d make them available by just evaluating that file at run time:

(load "observerfactoryfactory.scm")

This works, at first, but it breaks down pretty quickly as our code grows in complexity. If several files all depend on the same library, for example, should it be loaded (and evaluated) repeatedly? That definitely seems wrong.

The bigger problem, though, comes from macros³ and the compile time vs. run time distinction. Compiling a load statement doesn’t perform the load (it just compiles the code that’ll execute that load at run time), so any macros defined in the loaded file aren’t available while the rest of our code is being compiled. We can instead direct the compiler to perform the load at compile time, but manually directing the compiler this way is tedious, error-prone, and usually results in an awfully brittle codebase.

Worse still, it doesn’t even work: we often want some parts of a file to be evaluated at compile time (notably language extensions⁴) and others at run time (some function and variable definitions).

It would sure be lovely if we had a system that:

Made dependencies explicit
Solved the multiple-load issue by traversing each module the minimum number of times
Allowed programmers to indicate whether they want to use the language extensions in a module (which are compilable) or just the features available at run time.

Of course, that’s exactly what this paper provides. Flatt introduces a module scheme in which:

Code can be defined in modules (generally one per file)
Within a module, the module’s dependencies can be specified
A module may differentiate between its dependencies: either the dependency is to be evaluated at run time, or the compile-time expressions in the module are to be imported.
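
To make that last point concrete, here’s a rough sketch in today’s Racket syntax (not the paper’s original notation). One module imports the same dependency twice: once for run time and once, via for-syntax, for compile time. The module names helpers and client and the function double are made up for illustration.

```racket
#lang racket

;; Hypothetical helper module: one ordinary function.
(module helpers racket
  (provide double)
  (define (double n) (* 2 n)))

;; Hypothetical client module that needs `double` in both phases.
(module client racket
  ;; Run-time dependency: `double` is usable in ordinary expressions.
  (require (submod ".." helpers))
  ;; Compile-time dependency: `double` is also usable inside macro transformers.
  (require (for-syntax (submod ".." helpers)))
  (provide twice-at-compile-time twice-at-run-time)

  ;; The doubling happens while the program is being expanded;
  ;; the expanded code contains only the resulting number.
  (define-syntax (twice-at-compile-time stx)
    (syntax-case stx ()
      [(_ n)
       (number? (syntax-e #'n))
       (datum->syntax stx (double (syntax-e #'n)))]))

  ;; The doubling happens when the function is called.
  (define (twice-at-run-time n)
    (double n)))

(require 'client)
(twice-at-compile-time 21) ; => 42
(twice-at-run-time 21)     ; => 42
```

Dropping the for-syntax line should make the macro fail to compile, since double would then be unbound inside the transformer.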

This feature neatly solves the problems previously introduced, and it’s also extremely easy to work with and reason about.
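
Here’s a small sketch of the multiple-load fix, again in modern Racket rather than the paper’s notation. Two modules depend on the same library, and the library’s body runs only once. The names lib, a, b, and base are hypothetical.

```racket
#lang racket

;; Hypothetical shared library with a visible side effect in its body.
(module lib racket
  (provide base)
  (displayln "lib body evaluated")
  (define base 10))

;; Two modules that both depend on `lib`.
(module a racket
  (require (submod ".." lib))
  (provide a-val)
  (define a-val (+ base 1)))

(module b racket
  (require (submod ".." lib))
  (provide b-val)
  (define b-val (+ base 2)))

;; Requiring both dependents instantiates `lib` a single time,
;; so its message is printed once rather than twice.
(require 'a 'b)
(list a-val b-val) ; => '(11 12)
```

With plain load, each dependent file would have re-evaluated the library itself.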

The paper goes on to introduce a nice example of working with modules to manage different kinds of records. This includes a detailed description of phase separation, the static analysis technique used to avoid mixing compile-time and run-time operations. There are also extensive implementation details and a grammar defining the module system—useful for language developers, but not for a noncombatant like me.
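
For a feel of what phase separation means in practice, here’s a minimal sketch in current Racket (my own toy example, not the records example from the paper). The helper make-greeting lives only at compile time, while shout lives only at run time; both names are made up.

```racket
#lang racket

;; Compile-time (phase 1) definition: visible only to macro transformers.
(begin-for-syntax
  (define (make-greeting name)
    (string-append "hello, " name)))

;; This macro calls the compile-time helper during expansion and
;; leaves a plain string literal behind in the expanded program.
(define-syntax (greet stx)
  (syntax-case stx ()
    [(_ name)
     (string? (syntax-e #'name))
     (datum->syntax stx (make-greeting (syntax-e #'name)))]))

;; Run-time (phase 0) definition: visible only to ordinary code.
(define (shout s)
  (string-upcase s))

(shout (greet "sam")) ; => "HELLO, SAM"
```

Calling shout from inside the transformer, or make-greeting from ordinary code, is rejected as an unbound identifier before the program runs, which is the kind of mixing the module system is designed to rule out.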

Anyhoo, good paper! This module system is still in Racket today, by the way, but its syntax has changed: it’s now #lang⁵!
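
As far as I understand it, a #lang line is shorthand for wrapping the rest of the file in a module form, and the longhand is still available. A tiny sketch, with a hypothetical module shapes and function square:

```racket
#lang racket/base

;; The explicit `module` form: a name, the language the body is
;; written in, and then the body itself. A file that starts with
;; `#lang racket/base` is treated roughly as if it were wrapped this way.
(module shapes racket/base
  (provide square)
  (define (square x) (* x x)))

(require 'shapes)
(square 7) ; => 49
```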

At the end of his talk Sam recommended a few related papers (which I’ll be spending the next few days perusing):

Dybvig, Hieb & Bruggeman, Syntactic Abstraction in Scheme
Flatt, Culpepper, Darais & Findler, Macros that Work Together
Ghuloum & Dybvig, Implicit phasing for R6RS libraries
Waddell & Dybvig, Extending the scope of syntactic abstraction

One last thing: at the end of the talk, an audience member asked how Sam felt about integrating modules while programming at the REPL. Sam quoted Flatt, “The top level is hopeless!”

¹ There are a bunch of people associated with Racket-related research along with Sam and Flatt, including Daniel Friedman, Matthias Felleisen, Shriram Krishnamurthi, and Robert Findler. A subset of them also wrote a couple of my favorite CS books: The Little Schemer and How to Design Programs. Good stuff.
² …much like we still do in Emacs Lisp, alas.
³ The cause of, and solution to, all of Lisp’s problems.
⁴ Which is how the Scheme community seems to pronounce “macros.” It’s a perfectly reasonable preference, but I wonder when and why the groups diverged. Maybe only certain research groups say this. Hmm.
⁵ Because it’s a language extension!
