Breezy lets a user control the execution of a pC++ program and view the program's parallel collection data during execution, the two essential functions of a parallel debugger. Execution is controlled by manipulating program breakpoints. At a breakpoint, program data can be displayed in a text window or visualized with a compatible visualization tool. The architecture of breezy is shown in Figure 3.
There are two principal issues to address in the breezy design: breakpoint implementation and program state access. In general, supporting breakpoints for debugging parallel programs is hard because consistent global program states are difficult to reconstruct from a parallel execution. pC++'s data-parallel, SPMD style of computation reduces the complexity of the problem by restricting the possible execution behaviors to a single-level fork-join pattern. In addition, we constrain breakpoints to occur only within barriers. Allowing access only at barriers may seem restrictive, but a pC++ computation usually involves many barriers, so the restriction is not severe for our purpose of viewing parallel program states.
At each breakpoint, all machine nodes enter a barrier except for a master node, the breakpoint executive. This node establishes an interface between the breezy user interface agent and the distributed collection data (see Figure 3). While waiting in the barrier, the other nodes act as slaves to the breakpoint executive. The user enters high-level requests for collection data at the interface agent, which forwards them to the breakpoint executive node. This node collects the appropriate data from all other nodes and returns it to the interface agent. When the user asks to continue to the next breakpoint, the breakpoint executive node finally joins the other nodes in the barrier, which completes the barrier and lets all nodes run until the next breakpoint.
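The protocol just described can be illustrated with a minimal single-process sketch. It is not breezy's implementation: the class `BreakpointSession` and its methods are hypothetical names introduced here, the nodes are modeled as entries in a vector rather than as separate processors, and the collection elements are assumed to be plain `double` values.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical single-process model of the breakpoint protocol; real nodes
// would be separate processors exchanging messages.
class BreakpointSession {
public:
    // localData[i] is the part of a distributed collection held by node i.
    BreakpointSession(std::vector<std::vector<double>> localData,
                      std::size_t executive)
        : localData_(std::move(localData)),
          inBarrier_(localData_.size(), false),
          executive_(executive),
          stopped_(false) {
        assert(executive_ < localData_.size());
    }

    // Every node except the executive enters the barrier and waits there.
    void reachBreakpoint() {
        for (std::size_t i = 0; i < inBarrier_.size(); ++i) {
            inBarrier_[i] = (i != executive_);
        }
        stopped_ = true;
    }

    // The executive answers a user request by collecting each node's local
    // elements, in node order. Only valid while stopped at a breakpoint.
    std::vector<double> gather() const {
        assert(stopped_);
        std::vector<double> all;
        for (const std::vector<double>& part : localData_) {
            all.insert(all.end(), part.begin(), part.end());
        }
        return all;
    }

    // "Continue": the executive joins the barrier. With every node now
    // present, the barrier completes and all nodes are released.
    void resume() {
        assert(stopped_);
        inBarrier_[executive_] = true;
        inBarrier_.assign(inBarrier_.size(), false);
        stopped_ = false;
    }

    bool stopped() const { return stopped_; }

    // Number of nodes currently waiting in the barrier.
    std::size_t waitingNodes() const {
        std::size_t n = 0;
        for (bool waiting : inBarrier_) {
            if (waiting) ++n;
        }
        return n;
    }

private:
    std::vector<std::vector<double>> localData_;
    std::vector<bool> inBarrier_;
    std::size_t executive_;
    bool stopped_;
};
```

With three nodes and node 0 as the executive, calling `reachBreakpoint()` leaves two nodes waiting, `gather()` returns the five elements in node order, and `resume()` releases every node. The point of the design is that program state is only read while all other nodes are parked in the barrier, so the gathered data is a consistent global snapshot.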
Since pC++ runs on a range of parallel machines, each with its own symbol table and binary formats, gaining access to internal program variables in a portable way is a challenge. Our approach was to implement breezy at the language level, using data-parallel program semantics and Sage++ program transformation support. Breezy focuses on providing access to the distributed collections of a pC++ program. During program compilation (see Figure 1), breezy automatically generates access functions and methods for all the parallel data structures. Using Sage++ routines, the parallel data structures are easily located in the program, and access functions (or access methods if the structure is a class) are inserted at the appropriate places in the program code. The modified program is then linked with a special instrumented runtime system containing code that keeps a record of collection instantiations. This record is subsequently used to build a collection symbol table at runtime. The runtime system also includes the modified barrier function that calls the breakpoint executive code. Note that all program modifications are made at the source code level and are implemented in the pC++ language itself. The obvious benefit is that this additional code is portable to every platform on which pC++ runs. The less obvious benefit is that the operations to access the parallel data are ``generated'' as a natural result of compiling the instrumented program.
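To make the idea of a runtime-built collection symbol table concrete, the following is a small sketch in plain C++ rather than pC++. All names (`CollectionEntry`, `CollectionSymbolTable`, `Collection`) are hypothetical, and the access function is written by hand here, whereas breezy generates it with Sage++ during compilation. The constructor of `Collection` stands in for the instrumented runtime code that records each collection instantiation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table entry: what a debugger needs to read one collection.
struct CollectionEntry {
    std::size_t size;                                       // number of elements
    std::function<std::string(std::size_t)> elementAsText;  // access function
};

// Hypothetical collection symbol table, filled in as collections are created.
class CollectionSymbolTable {
public:
    void record(const std::string& name, CollectionEntry entry) {
        table_[name] = std::move(entry);
    }

    bool contains(const std::string& name) const {
        return table_.find(name) != table_.end();
    }

    // Look up a collection by name and format one element through its
    // access function; unknown names and bad indices are reported as text.
    std::string show(const std::string& name, std::size_t index) const {
        auto it = table_.find(name);
        if (it == table_.end() || index >= it->second.size) {
            return "<unavailable>";
        }
        return it->second.elementAsText(index);
    }

private:
    std::map<std::string, CollectionEntry> table_;
};

// Stand-in for a pC++ collection. The constructor records the instantiation,
// which is the role the text assigns to the instrumented runtime system.
template <typename T>
class Collection {
public:
    Collection(const std::string& name, std::vector<T> elements,
               CollectionSymbolTable& table)
        : elements_(std::move(elements)) {
        table.record(name, CollectionEntry{
            elements_.size(),
            [this](std::size_t i) -> std::string {
                std::ostringstream os;
                os << elements_[i];
                return os.str();
            }});
    }

    // The access function captures `this`, so copying is disabled.
    Collection(const Collection&) = delete;
    Collection& operator=(const Collection&) = delete;

private:
    std::vector<T> elements_;
};
```

Because the access function is ordinary source code compiled together with the program, the table needs no knowledge of the machine's symbol table or binary format; a request for element 1 of a collection named `grid` is answered entirely by calling the recorded function.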