From Syntactic Theories to Interpreters: Specifying and Proving Properties

Yong Xiao

Syntactic theories have been developed to reason about many aspects of modem programming languages [AB97, AFM+9s, LS97, SS99, FLS99]. Having roots in the λ-calculus, these theories rely on transforming source programs to other source programs. Only the syntax of the programming language is relevant.

Experience shows that the development of such theories is error-prone. In order to ensure that the theories are sensible, many properties need to be checked. For example, we need to know if there is always a rule to rewrite a valid program and if a unique value can be obtained (which is described by a unique-decomposition lemma) and if the type of a program is preserved during evaluation (which is described by a subject reduction lemma). In many situations, the proofs of these properties do not require deep insight. However, many purported proofs suffer from being incomplete and often the missed case is the most problematic one. Thus, we think that in order to rely on syntactic theories, it is of mandatory importance to design tools that support their development. The work described in this thesis offers a first step in that direction. We introduce the specification language SL which can directly reflect primitive notions of syntactic theories such as evaluation contexts and dynamic constraints. An experimental system has been implemented that generates interpreters from SL specifications. Currently the generated interpreters are programs in CAML[Cam], a dialect in the ML family. Various examples have been tested, including the operational semantics of core-ML and a syntactic theory for Verilog[FLS99].

We experiment with the SL system to automatically check whether the unique-decomposition and subject reduction properties hold for the specified syntactic theories.

We map the unique-decomposition lemma to the problems of checking equivalence and ambiguity of syntactic definitions. Because checking these properties of context-free grammars is undecidable, we work with regular tree grammars and use algorithms on finite tree automata to perform the checking. To make up for the insufficient expressiveness of regular tree grammars, we enhance the basic framework with built-in types, constants, contexts, and polymorphic types.

In order to specify type systems, we extend the SL system to allow rules written in natural semantics. We also introduce a meta-level of the SL system with utilities to handle substitution and equivalence, and to simulate pattern matching and type reasoning. A subject reduction lemma can be represented as a meta-theorem. The proof is performed automatically by induction on the rewriting rules, the type checking rules, as well as the structure of the object-terms. The induction step consists of instantiating the meta-theorem into simpler forms which are also a meta-theorems. The base case corresponds to the meta-theorems that can be easily proved, for example, a logical tautology. The induction framework for automatically proving subject reduction can be easily extended to other properties.