This document is broken down into the following sections:
Breezy is a tool that allows an application to "attach" to an executing parallel (pC++) program. This attachment allows the application several capabilities:
Breezy, then, is a partnership between the application and the executing parallel program. More specifically, on the application side, Breezy is the implementation of the protocol for accessing the parallel program. A C module, "brkAcces.c", implements this protocol. A document describing the use of this module is in "sage/breezy/baseUtils/brkAccess.doc".
This document is concerned with describing the parallel program side of the partnership, the Breakpoint Executive (BE). This component is a piece of Breezy. The Breakpoint Executive implements the Breezy protocol on the parallel program side, but this is more involved than the application side. The BE must implement all the capabilities listed above, i.e. execution control, parallel data access, providing meta information, providing specific information about program state, calling specified user-defined functions (or methods), and allowing general communication.
The structure of the whole system is made up of several components. All of these components are compiled in or linked in with the user program source code. These components are:
Each component is discussed in more detail later in this document. First, though, I will give an overview of how these components interact.
The Breakpoint Executive API (distApi.c) is actually performing all communication with the outside application. It receives messages from the application, and decides what to do in response. These responses often involve accessing other components of the system. To access parallel data, the BE API uses access functions that were generated for this purpose (see section 4.0 regarding instrumenting the user program). To get type information, the BE API uses the type API module (typeApi.c). For calling user defined functions, the BE API needs to call these functions from the user program itself (obviously).
The runtime system is involved also. For Breezy to work, the runtime system must have a modified synchronization barrier function. Instead of synchronizing all nodes immediately, it must let one node enter the function(s) that implement the Breakpoint Executive API component. This one node is referred to as the Breakpoint Executive node. When it is signaled by the user application to continue on to the next (or later) breakpoint, then this node too will enter the synch barrier, allowing all the nodes to continue on to the next one.
The runtime system needs to be modified for the breezy utility. The only piece of breezy that must be implemented in the runtime system is the synchronization barrier call. (There is a basic assumption here that a barrier is performed AFTER each parallel operation invocation. If this is not the case (PGI's HPF implements the barrier BEFORE parallel operations), then the user must know that data is retrieved before parallel operations, or else barriers must also be placed after these parallel operations.) Simply stated, the barrier function of the runtime system must have one process call the BREAKPOINT_EXECUTIVE() function before it enters the barrier.
When a barrier call is made, instead of all nodes/processes involved in the executing parallel program entering the barrier right away, the runtime system needs to allow one process to enter the breakpoint executive code (see section 6.0 - the breakpoint executive API implementation). Specifically, the runtime system calls a function called BREAKPOINT_EXECUTIVE() that performs the communication and data requests of the client application. When the client application specifies to continue to the next breakpoint (barrier), the process in the BREAKPOINT_EXECUTIVE() function returns from this function to enter the barrier with the rest of the processes, allowing all to continue on.
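The barrier modification described above can be sketched as follows. This is a minimal single-process illustration under stated assumptions, not the actual runtime code: breezy_barrier(), runtime_barrier(), BE_NODE, and the two counters are hypothetical names introduced here. Only BREAKPOINT_EXECUTIVE() comes from the text, and its body is a stub.

```c
#include <assert.h>

/* Hypothetical: rank of the process that acts as the Breakpoint Executive node. */
#define BE_NODE 0

static int be_calls = 0;        /* times the stub BE was entered */
static int barrier_entries = 0; /* times any process entered the barrier */

/* Stub standing in for the real function in distApi.c, which would serve
   client requests until the client asks to continue. */
void BREAKPOINT_EXECUTIVE(void)
{
    be_calls++;
}

/* Stand-in for the runtime system's original, unmodified barrier. */
static void runtime_barrier(int rank)
{
    (void)rank;
    barrier_entries++;
}

/* Modified barrier: exactly one process visits the BE before it joins
   the other processes in the barrier. */
void breezy_barrier(int rank)
{
    assert(rank >= 0);
    if (rank == BE_NODE)
        BREAKPOINT_EXECUTIVE(); /* returns when the client says "continue" */
    runtime_barrier(rank);
}
```

In a real runtime the other processes would already be blocked inside the barrier while the BE node serves requests; the sketch only shows the order of the two calls.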
To allow for as much portability as possible, all of the data access/manipulation is instrumented on the language level. In this way, the compiler does the work of mapping the correct way of accessing and manipulating data to the current architecture. However, to accomplish this, we need to have access to the compiler/parser information regarding the user program. In particular, we need to be able to 1) examine parallel data structures used in the program, 2) find all allocations (instantiations) of parallel data structures, and 3) find any breezy user-defined access functions.
The reason we need to examine parallel data structures of the program is that after examining them, we will then build functions that extract the parallel data from those structures during runtime. These "access" functions will be called during runtime by the BREAKPOINT_EXECUTIVE() function to get at program data and send it to the client application. The structure of these generated functions depends on the data they are accessing. Currently, the only category of data structures that has been tested is data of a class (an element class of a collection).
We need to find all places in the user program where one of these parallel data structures is created (space is allocated). At each of these points, the program must be modified to include a statement that queues the variable name of the structure, the type name of the structure, and a pointer to the structure. Basically, this dynamically builds a symbol table of the parallel structures. Note that we also have to dequeue these entries when the structure is destroyed (goes out of scope).
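The queue/dequeue bookkeeping described above might look roughly like the following. This is a sketch under assumptions: struct be_symbol, be_register(), be_unregister(), be_lookup(), and the fixed-size table are all hypothetical, chosen only to illustrate the idea of a symbol table built at runtime by instrumented statements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64 /* arbitrary limit for this sketch */

/* One entry per live parallel structure (hypothetical layout). */
struct be_symbol {
    const char *var_name;  /* variable name, e.g. a string literal */
    const char *type_name; /* type name of the structure */
    void       *ptr;       /* pointer to the structure itself */
};

static struct be_symbol symbols[MAX_SYMBOLS];
static size_t symbol_count = 0;

/* Called by instrumented code right after a structure is allocated.
   Returns 0 on success, -1 if the table is full. */
int be_register(const char *var_name, const char *type_name, void *ptr)
{
    assert(var_name != NULL && type_name != NULL);
    if (symbol_count == MAX_SYMBOLS)
        return -1;
    symbols[symbol_count].var_name = var_name;
    symbols[symbol_count].type_name = type_name;
    symbols[symbol_count].ptr = ptr;
    symbol_count++;
    return 0;
}

/* Called by instrumented code when the structure is destroyed.
   Matches on the pointer, since names can repeat across scopes.
   Returns 0 on success, -1 if the pointer was never registered. */
int be_unregister(void *ptr)
{
    for (size_t i = 0; i < symbol_count; i++) {
        if (symbols[i].ptr == ptr) {
            symbols[i] = symbols[symbol_count - 1]; /* fill the hole */
            symbol_count--;
            return 0;
        }
    }
    return -1;
}

/* Find a live structure by variable name; returns NULL if none matches. */
const struct be_symbol *be_lookup(const char *var_name)
{
    for (size_t i = 0; i < symbol_count; i++)
        if (strcmp(symbols[i].var_name, var_name) == 0)
            return &symbols[i];
    return NULL;
}
```

The instrumentation step would then insert a be_register() call after each allocation and a be_unregister() call where the structure goes out of scope.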
Finally, the user program must be scanned for "user-defined" access functions. This is any function whose name is prefixed with the string "UserDefined_". These functions can appear in classes as methods (element classes of collections) or as normal C functions. The parameter types for these functions are predefined, and users must follow these conventions for their program to compile. For a normal C function to be used as a breezy user-defined access function, it should be declared as:
And for a user defined method invocation, there are also no parameters:
The way in which both of the user-defined functions are accessed from the client application is described in section 6.0 - The Breakpoint Executive API implementation. For user defined methods though, we need to generate intermediary functions that retrieve the correct collection and element of that collection so that the method can be executed on that element (collection and element are specified by the user).
To call these functions, a table of user defined function names (without the "UserDefined_") and the actual pointers to those functions is coded and added to the program, so that it is accessible during runtime. Then the client application can call one of these functions by specifying the name of the function, and in the case of user defined methods, which collection, and which element of the collection to invoke the method on. The BREAKPOINT_EXECUTIVE() function uses this table later to call the user defined functions that the client requests.
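A name-to-pointer table of this kind could be sketched as below. The assumptions are stated loudly: the text says user-defined functions take no parameters, but the void return type, the struct fn_entry layout, the be_call_user_function() helper, and the example function UserDefined_dumpState() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed signature: no parameters (per the text); the void return type
   is a guess made for this sketch. */
typedef void (*user_fn)(void);

struct fn_entry {
    const char *name; /* name with the "UserDefined_" prefix removed */
    user_fn     fn;   /* pointer to the actual function */
};

static int dump_calls = 0;

/* Example of a user-defined function as it might appear in a user program. */
void UserDefined_dumpState(void)
{
    dump_calls++;
}

/* Table that the instrumentation step would generate and add to the program. */
static const struct fn_entry fn_table[] = {
    { "dumpState", UserDefined_dumpState },
};

/* Call the function the client named.
   Returns 0 on success, -1 if the name is not in the table. */
int be_call_user_function(const char *name)
{
    assert(name != NULL);
    size_t n = sizeof fn_table / sizeof fn_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(fn_table[i].name, name) == 0) {
            fn_table[i].fn();
            return 0;
        }
    }
    return -1;
}
```

Because the table stores the name without its prefix, a client asks for "dumpState", not "UserDefined_dumpState".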
Note that these user defined functions can return values, but the client application must be aware of this fact if it is the case, and explicitly receive the return data. In general, breezy assumes user defined functions send no information to the client application.
The type API is implemented as a module which allows storage and retrieval of type information. It has the basic functionality:
Storing user defined types includes storing the type name and a description of the type. These type descriptions can be queried by name. All type structures that are composed of combinations of other types are referred to as "composite" types. All of these composite type descriptions are stored in the TA_composite data structure. Other types (typedefs, enums, etc.) that are non-composites are stored in the TA_typeDesc data structure.
Storing variable name and type pairs is handled by straightforward put and get functions. The getVarType(char *name) function searches for the variable name and returns the corresponding type name (not the full type description). Note that there can obviously be several variables of the same name, so this is not a perfectly correct way of keeping track of these variables. It is designed to keep track of a small number of (global?) variables that the user is interested in. The tables that keep this information are accessible, so the user could distinguish between variables of the same name by their order in the table, but would then have to perform the table lookups directly to ensure the correct variable is retrieved.
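The name-type table and its duplicate-name limitation can be illustrated with a small sketch. Only the name getVarType(char *name) comes from the text; its return type, the putVarType() name, and the table layout are assumptions made for illustration and do not reflect the real typeApi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 32 /* arbitrary limit for this sketch */

struct var_type {
    const char *name; /* variable name */
    const char *type; /* type name only, not the full description */
};

static struct var_type var_table[MAX_VARS];
static size_t var_count = 0;

/* Hypothetical "put" function: append a name-type pair.
   Returns 0 on success, -1 if the table is full. */
int putVarType(const char *name, const char *type)
{
    assert(name != NULL && type != NULL);
    if (var_count == MAX_VARS)
        return -1;
    var_table[var_count].name = name;
    var_table[var_count].type = type;
    var_count++;
    return 0;
}

/* Return the type name of the FIRST entry whose variable name matches,
   or NULL if there is none. Later entries with the same name are hidden,
   which is the limitation noted in the text. */
const char *getVarType(char *name)
{
    for (size_t i = 0; i < var_count; i++)
        if (strcmp(var_table[i].name, name) == 0)
            return var_table[i].type;
    return NULL;
}
```

With two variables named "x" stored in sequence, a lookup by name only ever reaches the first; telling them apart requires walking the table by position.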
Storing collection and element pairs works exactly like the name-type pairs in the above paragraph. A string for the name of the collection type and the element type is stored in pairs in a table.
The ability to store function name and function pointer pairs is more restrictive than it sounds. The function pointers it stores are only of the user defined type (see section 4.0 above). So only functions of that type (and with the same parameter list) can be stored in this table of function names and pointers.
Storing method name and pointer in a table is less obvious because, what is a method pointer? In fact, we actually store a method name and an intermediary function pointer. The reasons for this are described in the next
6.0 The Breakpoint Executive API implementation (distApi.c)
The Breakpoint Executive API is implemented as a module (distApi.c). It serves as the request broker for the client application (in CORBA speak). The client application sends a request/command to the BE, and the BE takes the appropriate action in response. This implementation lives in the BREAKPOINT_EXECUTIVE() function (mentioned earlier). It calls other functions and modules (including code in the user program) to carry out its task, of course.
The basic communication is implemented in a communication module, cargo.c. In turn, this module is based on a low level socket communication module, transport.c. These modules can also be accessed from the user program (or the client application). Breezy does all connection initializations and other overhead work for the communication link, so any subsequent calls need only specify the data that will be sent. This is how general (uncensored) communication can be implemented by the user program and/or the client application.
One of the major subtasks of the BREAKPOINT_EXECUTIVE() function is to maintain an accurate symbol table of the parallel data structures active in the program at any given time. This is implemented using basic lists that are updated from code that has been instrumented into the program (see section 4.0).
The requests that the BREAKPOINT_EXECUTIVE() function serves fall into several categories, and I will discuss the BE implementation of these categories:
Controlling the execution of the program is implemented by three basic commands from the client application:
Going to the next breakpoint is the simplest to implement of any command. If requests are being served, it means the current state of the program is that all nodes are in a barrier except for the one node answering the requests. This node is in the BREAKPOINT_EXECUTIVE() function. So, to have all nodes continue to the next breakpoint, simply exit from the BREAKPOINT_EXECUTIVE() function, allowing the last node to enter the barrier and subsequently release all nodes to continue on until the next barrier.
To skip a given number of breakpoints (<N>), we simply set a static variable to the number of breakpoints to skip, and decrement that variable at each breakpoint (without serving any requests from the client) until the variable is again 0; then we serve requests again.
Terminating execution is implemented as a straightforward exit() call. I was wrong before; this is actually the easiest command to implement.
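The skip counter described above can be sketched in a few lines. The names skip_remaining, served, be_set_skip(), and be_at_breakpoint() are hypothetical; the real BREAKPOINT_EXECUTIVE() would enter its request loop where this sketch only increments a counter.

```c
#include <assert.h>

static int skip_remaining = 0; /* breakpoints still to pass without serving */
static int served = 0;         /* breakpoints at which requests were served */

/* Hypothetical handler for a client "skip N breakpoints" request. */
void be_set_skip(int n)
{
    assert(n >= 0);
    skip_remaining = n;
}

/* Called once at each breakpoint.
   Returns 0 if the breakpoint was skipped, 1 if requests were served. */
int be_at_breakpoint(void)
{
    if (skip_remaining > 0) {
        skip_remaining--; /* pass through without talking to the client */
        return 0;
    }
    served++; /* the real BE would serve client requests here */
    return 1;
}
```

After be_set_skip(2), the next two calls to be_at_breakpoint() return immediately and the third one serves requests again.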
To get data from parallel data structures, we have generated access functions (see section 4.0 - Instrumenting the user program). The BREAKPOINT_EXECUTIVE() function merely takes the client request, and calls the appropriate access function that has been generated. These access functions are kept in tables in the typeApi.c module, and are accessible by a collection name and element name combined key value.
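A lookup keyed on the collection name and element name together might be sketched as follows. Everything here is an assumption made for illustration: the access_fn signature, the struct access_entry layout, find_access_fn(), and the "IntArray"/"Int" example types are hypothetical, and the generated access functions in Breezy may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed shape of a generated access function: write a textual form of
   one element into buf and return the number of characters produced. */
typedef int (*access_fn)(const void *collection, size_t element,
                         char *buf, size_t buflen);

struct access_entry {
    const char *collection_type; /* first half of the combined key */
    const char *element_type;    /* second half of the combined key */
    access_fn   fn;
};

/* Example generated access function for a plain array of int. */
static int access_int_array(const void *collection, size_t element,
                            char *buf, size_t buflen)
{
    const int *data = collection;
    return snprintf(buf, buflen, "%d", data[element]);
}

static const struct access_entry access_table[] = {
    { "IntArray", "Int", access_int_array },
};

/* Return the access function for this collection/element pair, or NULL. */
access_fn find_access_fn(const char *collection_type, const char *element_type)
{
    assert(collection_type != NULL && element_type != NULL);
    size_t n = sizeof access_table / sizeof access_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(access_table[i].collection_type, collection_type) == 0 &&
            strcmp(access_table[i].element_type, element_type) == 0)
            return access_table[i].fn;
    }
    return NULL;
}
```

Both halves of the key must match, so the same collection type holding a different element type resolves to a different access function (or to none).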
The BREAKPOINT_EXECUTIVE() function allows a client application to call user defined access functions and methods. As in the parallel data accessing above, functions have been generated to call the correct user defined functions (or methods) for the parameters specified by the client. Tables in the typeApi module are consulted for the right functions to call. In the case of user defined methods, functions in this table are actually intermediary functions which set up everything for the actual call to the user defined method. Each class that has user defined methods has a corresponding intermediary function automatically generated by breezy (section 4.0). This intermediary function accepts a collection pointer, an element number, and a string indicating which user defined method to call. It proceeds to retrieve the correct element of the collection, and call the user defined method specified by the string. So, the client application specifies which collection, which element of the collection, and which method to call for that element. The BREAKPOINT_EXECUTIVE() function then looks up the correct intermediary function for the specified user defined method using this method name and pointer table. It calls the intermediary function with the arguments passed by the client application, and the intermediary function then calls the right method on the right element of the right collection (right?).
The distApi module maintains some meta information about the program execution. This information includes the last function called and the file and line number the program is on. This information is retrieved by the client application in a single request.
The distApi module also maintains lists of user defined functions and methods, and the client application can request these lists. Lastly, the client application can request the type information about the parallel data structures in the program. These are maintained in the typeApi module, so the BREAKPOINT_EXECUTIVE() function just retrieves a textual representation of the types from the typeApi module, and returns this string. (Note that on the client application side, this string is read using the same module, typeApi, and the string is parsed into the typeApi data structure representations of the type information.)
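An intermediary function of the kind described above could look roughly like this. Since the sketch is plain C rather than pC++, the element class and its methods are modeled as a struct and ordinary functions; struct Cell, struct CellCollection, the two example methods, and Cell_intermediary() are all hypothetical. Only the argument list (collection pointer, element number, method name) follows the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element class with two user-defined methods. */
struct Cell {
    int value;
};

static void Cell_UserDefined_reset(struct Cell *c) { c->value = 0; }
static void Cell_UserDefined_bump(struct Cell *c)  { c->value++; }

/* Hypothetical collection: a simple array of elements. */
struct CellCollection {
    struct Cell *elems;
    size_t       count;
};

/* Intermediary function for the Cell class: retrieve the requested element
   and invoke the named user-defined method on it.
   Returns 0 on success, -1 for a bad element number or unknown method. */
int Cell_intermediary(void *collection, size_t element, const char *method)
{
    struct CellCollection *coll = collection;
    assert(coll != NULL && method != NULL);
    if (element >= coll->count)
        return -1;
    struct Cell *e = &coll->elems[element];
    if (strcmp(method, "reset") == 0) {
        Cell_UserDefined_reset(e);
        return 0;
    }
    if (strcmp(method, "bump") == 0) {
        Cell_UserDefined_bump(e);
        return 0;
    }
    return -1;
}
```

Storing a pointer to Cell_intermediary() under each method name gives the BE a uniform function pointer type to call, which sidesteps the question of what a method pointer is.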