We advocate the development of language-specific program analysis environments. To demonstrate the benefits of this approach, we target pC++, a language extension to C++ designed to allow programmers to compose distributed data structures with parallel execution semantics. The basic concept behind pC++ is the notion of a distributed collection, which is a type of concurrent aggregate "container class" [3]. More specifically, a collection is a structured set of objects that are distributed across the processing elements of the computer in a manner designed to be completely consistent with HPF [4]. To accomplish this, pC++ provides a very simple mechanism to build "collections of objects" from a base element class. Member functions from this element class can be applied to the entire collection (or a subset) in parallel. This mechanism provides the user with a clean interface to data-parallel style operations by simply calling member functions of the base class. In addition, there is a mechanism for encapsulating SPMD style computation in a thread-based computing model that is both efficient and completely portable. To help the programmer build collections, the pC++ language includes a library of standard collection classes that may be used (or subclassed). The library includes classes such as DistributedArray, DistributedMatrix, DistributedVector, and DistributedGrid.
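The collection idea can be illustrated with a minimal sketch in plain C++. This is not pC++ syntax and not the pC++ runtime: the names `Cell`, `Collection`, `owner`, and `apply` are hypothetical, the block distribution is only one HPF-style layout chosen for illustration, and a sequential loop stands in for the parallel application of a member function to all elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element class: plays the role of the "base element class".
struct Cell {
    double value = 0.0;
    void scale(double factor) { value *= factor; }
};

// Hypothetical stand-in for a distributed collection. Elements are assigned
// to `nprocs` simulated processing elements in contiguous blocks.
template <typename Element>
class Collection {
public:
    Collection(std::size_t size, std::size_t nprocs)
        : elements_(size), nprocs_(nprocs) {
        assert(nprocs > 0);
    }

    std::size_t size() const { return elements_.size(); }
    Element& operator[](std::size_t i) { return elements_[i]; }

    // Index of the simulated processing element that owns element i
    // under a block distribution.
    std::size_t owner(std::size_t i) const {
        assert(i < elements_.size());
        std::size_t block = (elements_.size() + nprocs_ - 1) / nprocs_;
        return i / block;
    }

    // Apply a member function of the element class to every element.
    // A real implementation would run this in parallel on each owner;
    // here it is a sequential loop.
    template <typename... Params, typename... Args>
    void apply(void (Element::*method)(Params...), Args... args) {
        for (Element& e : elements_) (e.*method)(args...);
    }

private:
    std::vector<Element> elements_;
    std::size_t nprocs_;
};
```

With these assumptions, `Collection<Cell> c(8, 4); c.apply(&Cell::scale, 2.0);` scales all eight elements with a single call on the collection, which is the style of interface the text describes.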
High-level parallel languages, including pC++, place special requirements on the development and use of program analysis tools. The toolset was designed to meet the following pC++ analysis requirements:
The tools are implemented as graphical hypertools. While they are distinct tools, they act in concert as if they were a single application. Each tool implements a set of well-defined tasks. If one tool needs a feature of another, it sends a message to the other tool requesting it (e.g., display the source code for a specific function). This design makes the toolset easy to extend. The Sage++ toolkit also supports Fortran-based languages, allowing the tools to be retargeted to other programming environments.
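The request-by-message design can be sketched as a small dispatcher in C++. This is an illustration only, not the mechanism the tools actually use: `ToolBus`, `register_handler`, `send`, and the tool and request names in the usage note are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical message bus: each tool registers the requests it can serve,
// and any other tool can send a request by tool name and request name.
class ToolBus {
public:
    using Handler = std::function<void(const std::string& argument)>;

    // Register the handler a tool offers for one kind of request.
    void register_handler(const std::string& tool,
                          const std::string& request,
                          Handler handler) {
        assert(static_cast<bool>(handler));
        handlers_[tool + "/" + request] = std::move(handler);
    }

    // Deliver a request to the named tool. Returns false when no such
    // tool or request has been registered.
    bool send(const std::string& tool,
              const std::string& request,
              const std::string& argument) const {
        auto it = handlers_.find(tool + "/" + request);
        if (it == handlers_.end()) return false;
        it->second(argument);
        return true;
    }

private:
    std::map<std::string, Handler> handlers_;
};
```

In this sketch, a source browser could register a `show_function` handler, and a profile display could call `bus.send("source_browser", "show_function", "main")` without knowing how the browser is implemented. Adding a new tool only requires registering new handlers, which is the extensibility property the design aims for.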
Figure 1 shows the pC++ programming environment and the associated tools architecture. The pC++ compiler frontend takes a user program and pC++ class library definitions (which provide predefined collection types) and parses them into an abstract syntax tree (AST). All access to the AST is done via the Sage++ library. Through command line switches, the user can choose to compile a program for profiling, tracing, and breakpoint debugging. In these cases, the instrumentor is invoked to do the necessary instrumentation in the AST. The pC++ backend transforms the AST into plain C++ with calls into the pC++ runtime system. This C++ source code is then compiled and linked by the C++ compiler on the target system. The compilation and execution of pC++ programs can be controlled by cosy (COmpile manager Status displaY). This tool provides a high-level graphical interface for setting compilation and execution parameters and selecting the parallel machine where a program will run.
Figure 1: pC++ Programming Environment and Tools Architecture
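The instrumentation step in this pipeline can be sketched as a pass over a toy AST, followed by a backend that emits source text. This sketch makes several assumptions: it does not use the Sage++ API, the types `Mode` and `FunctionNode` and the probe names `profile_enter`, `profile_exit`, `trace_enter`, and `trace_exit` are hypothetical, and statements are kept as plain strings. It also places the exit probe only at the end of the body, which ignores early returns.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical compilation modes selected by command line switches.
enum class Mode { None, Profiling, Tracing };

// Toy AST node: a function with its statements stored as text.
struct FunctionNode {
    std::string name;
    std::vector<std::string> body;
};

// Hypothetical instrumentor: inserts entry and exit probes into every
// function of the AST when profiling or tracing is requested.
void instrument(std::vector<FunctionNode>& ast, Mode mode) {
    if (mode == Mode::None) return;
    const std::string kind = (mode == Mode::Profiling) ? "profile" : "trace";
    for (FunctionNode& fn : ast) {
        assert(!fn.name.empty());
        fn.body.insert(fn.body.begin(),
                       kind + "_enter(\"" + fn.name + "\");");
        fn.body.push_back(kind + "_exit(\"" + fn.name + "\");");
    }
}

// Hypothetical backend: turns the (possibly instrumented) AST back into
// source text for the target system's compiler.
std::string emit(const std::vector<FunctionNode>& ast) {
    std::string out;
    for (const FunctionNode& fn : ast) {
        out += "void " + fn.name + "() {\n";
        for (const std::string& stmt : fn.body) out += "    " + stmt + "\n";
        out += "}\n";
    }
    return out;
}
```

Keeping instrumentation as a separate pass over the AST means the backend is unchanged whether or not probes are inserted, which mirrors the separation between instrumentor and backend described above.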
The program and performance analysis environment is shown on the right side of Figure 1. It includes the integrated TAU tools, profiling and tracing support, and interfaces to stand-alone performance analysis tools developed partly by other groups [13][5][2][11]. The toolset provides support for accessing static information about the program and for querying and analyzing dynamic data obtained from program execution. The static and dynamic tools of the environment are briefly described below; a more detailed discussion of these tools can be found in [10][9].