The Design of Intermediate Languages in Optimizing Compilers
Luke Van Wyck Maurer
Committee: Zena Ariola (chair), Michal Young, Boyana Norris, Mark Lonergan
Dissertation Defense(May 2024)
Keywords:

Every compiler passes code through several stages, each a sort of mini-compiler of its own. Thus each stage may deal with the code in a different representation, which may have little to do with the source or target language. We can describe these in-memory representations as languages in their own right, which we call intermediate languages.

Each intermediate language is designed to accomodate the stage of compilation that handles it. Those toward the end of the compilation pipeline, for instance, tend to have features expressing low-level details of computation. A subtler case is that of the optimization stage, whose role is to transform the program so that it runs faster, uses less memory, and so forth. The optimizer faces tradeoffs: The language should provide enough information to guide optimization algorithms, but all of this information must be kept up to date as the program is transformed. Also, establishing invariants in the language can be helpful both in implementing algorithms and in debugging the implementation, but each invariant may complicate desirable transformations or rule them out altogether. Finally, a language where the invariants are obviously correct may have a form too awkward or otherwise unsuited to the compiler’s needs.

Given the properties and invariants that we would like the language to provide, we can approach the design task in a way that gives these features without necessarily sacrificing implementability. Namely, begin with a formal language that makes the desired properties obvious, then translate it to one more suitable for implementation. We can even translate theorems about valid transformations in the formal language to derive correct algorithms in the implementation language.

This dissertation explores the connections between different intermediate languages and how they can be interderived, then demonstrates how translation lead to an improvement to the Glasgow Haskell Compiler opimization engine.

This dissertation includes previously published coauthored material.