Java Implementation of the TINY Compiler

Last updated 2008/04/04 20:28:42

Following are the source files for a Java implementation of the TINY compiler example given in the text book Compiler Construction - Principles and Practice by Kenneth Louden. This code is adapted from the C implementation presented in the book. The purpose of this rewrite of Louden's example is to explore an object oriented approach to compiler implementation in Java. I believe the architecture of a compiler lends itself to a natural object oriented design. Traditionally, compiler implementations have been organized very algorithmically, and tend to have large switch statements or other forms of multiple branch logic. The use of switch statements suggests the possibility of using a class hierarchy and dynamic method binding instead. In this design, a class hierarchy is used for the grammatical structure of the TINY language. The goal is to capitalize on as much commonality as possible as well as to use the strict type checking of Java to make the compiler implementation as error free as possible.

Like Louden's original implemenation, this implementation is organized according to the phases of development - scanner, parser, code analysis, and code generation. The scanner is generated by JFLEX and the parser is generated by CUP. This implementation should generate the same code (for the TM machine tm.c given in Louden's book).


High level description of the implementation

A Makefile provides the rules for compiling this implementation using the Java compiler. tinyc.java is the driver for running the compiler.

The basic unit of scanning is the Token class which is defined in Token.java. A Token encapsulates a textual scanned lexical unit along with its line number and file name and a type characterizing the token.

The core of the compiler design is in the classes in the file ASTnode.java. These classes capture the grammar of the TINY language. There is a base class ASTnode that derives from Token and represents the common aspects of all nodes in the abstract syntax tree. As such, it has abstract methods for printing and code generation. Some of the main classes deriving from ASTnode are Expression for expressions Statement for statements, and Program for the entire TINY program.

Lexer.java defines and implement a Lexer class that enapsulates the lexical analysis actions. tiny.jflex defines a JFLEX implementation of the scanner core, resulting in the class TinyLexer which extends Lexer. TinyLexer-hand.java contains a hand coded implementation of the scanner for Tiny that does not use JFLEX.

tiny.cup defines a CUP implementation of the parser core that results in a TinyParser class. This class encapsulates the syntax analysis actions.

Symtab.java defines and implements a very simple symbol table used by the compiler.

Finally, code generation is handled by a Code class (Code.java) as well as implementations of the codegen method of ASTnode.

tiny-compiler.tar is an archive of all of the source files


You may also want to look at a C++ version of the Tiny compiler.


Some Notes on the Coding Style

As noted above, the base classes in ASTnode.java define some abstract methods. For example, a method print is used to pretty print the syntax tree. The implementations of this method for each of the concrete classes in ASTnode.java can be found throughout the classes there. The C++ language offers an advantage over Java source code organization: we could collect all of the method implementations into a single source file consisting only of the method implementations. In the Java implementation, we must have the method implementation within the class definitions. Of course, the classes give us object and type orientation for the grammatical elements. In C++, we can have this organization but still arrange source on a functional basis, which I think would be an advantage.