Setting Consistent Breakpoints from Event Trace Analysis

To examine the state of a program, it is necessary to set breakpoints that stop the computation at locations that are both consistent and meaningful to the programmer. Standard mechanisms set breakpoints in terms of program locations or combinations of local state predicates, but these are problematic: How are the breakpoints to be specified across multiple processes? Must they be globally consistent or are local breakpoints sufficient? If local, how and when should the remaining processes be stopped in order to provide a meaningful state?

One possibility is to allow breakpoints to be set dynamically but only at barrier synchronizations as was done in the TAU Breezy debugger [4]. This, however, will be overly constraining in multithreaded and multitasked extensions to pC++, in which programs can have more of an MIMD flavor, with considerable amounts of computation occurring between global synchronization points. In addition, it does not exploit any event-based analysis that the user might have already done. As an alternative, we use the results of the ongoing event-based modeling to automatically halt the program at a consistent state during replay. Ariadne allows the user to specify breakpoints in terms of the same abstract events that he/she has already matched in the execution history graph. Once breakpoints are set, Ariadne annotates the trace and then re-executes the program to the latest consistent state preceding those annotations. Thus each process is stopped at the last point at which it could have influenced the event(s) triggering the breakpoint. This is the consistent state that is most meaningful to the programmer.

In related work, Manabe and Imase [19] also set breakpoints that were used to find consistent states during replay. They, however, used a state-based approach, specifying breakpoints with conjunctions and disjunctions of local state predicates. It is often difficult to determine the appropriate local predicates initially: errors observable only as incorrect output are hard to relate back to individual predicates in specific processes.

We implemented replay at the level of polling in the runtime system: pC++ uses polling in servicing requests for element values and the polling introduces nondeterminism. We record the number of unsuccessful polls as the ``logical time'' of a read event and, during replay, receive events are allowed only when the identical poll count is reached. This ensures that events take place at their respective logical times and preposted receives are introduced into the computation at the appropriate point.

During replay, selected matched (or partially matched) abstract events are used to partition the processes of a computation into three equivalence classes: the core set containing the processes that executed parts of the abstract event; the influence set containing the processes that could have causally affected a component of the abstract event; and the other set containing the remaining processes. In order to halt a computation, the state most relevant to the abstract event is restored [15]:

Processes in the other set are stopped at their earliest state;
Processes in the influence set are stopped immediately after their latest events that could have causally influenced the abstract event; and
Processes in the core set are stopped immediately before the earliest (or after the latest) local event that is a part of the abstract event (depending on the user request).

In some cases, this procedure results in a state that is not consistent, that is, a state that could not have occurred in an actual execution. We define consistency as in [7]. Whenever a user's breakpointing request would result in an inconsistent state, our debugger automatically constructs an earlier, consistent state using a greedy algorithm [13].

For the binary image compression example, we had already used event-based analysis to show that there was some node that was not compressed at level two in the tree that was subsequently compressed at level three. It makes sense to stop the program immediately after level two is processed. With Ariadne, this was done by selecting the second COMPRESSLEVEL event from the match tree and issuing the command ``BREAK AFTER (SELECT).'' This stops the program at exactly the right state, allowing us to examine the program variables immediately after the level two compressions complete but before the level three compressions start on each element. Note that this particular breakpoint would have been very difficult to set without first performing the event-based analysis. The user could not have known to set the breakpoint after the second level of the tree, as opposed to the third or fifth or seventh. The ``values'' that we examined in coming to our choice of breakpoints, that is the count of MERGE events and the elements of the MERGESET and NOMERGESET, were not values computed in the program. The conditions of interest were global properties of the computation, not expressible as local state predicates. Event-based analysis often makes it possible to narrow the user's focus to exactly the right set of states for breakpointing.

After the program is stopped, control is transferred to the Sneaky breakpoint debugger. This let us examine the local state variables as shown in Figure 6.

Looking at the second element of the collection, we saw an inconsistency: myColor was set at 1 indicating that this node represents a single, compressed, black tree but myStatus was set to 2 indicating that the merge had failed. Since it is the color of the node that determines its candidacy for merging at the next level, the color should have been set to a third value indicating ``black and white, no merge.'' This failure to set myColor after an unsuccessful merge attempt was the error we were looking for.

Sneaky itself is a programmable, state-based debugger that we are currently developing; it is in the style of Acid [23], Dalek [20], and Parasight [1]. Many debugging abstractions are context and application dependent, making it difficult to anticipate all of the debugging needs of the user ahead of time. A programmable debugger alleviates this problem by allowing the user to build new abstractions, synthesizing complex events from sequences of simpler ones, as in Dalek [20]. Our programming language will be high-level, dynamically typed, with powerful data structuring capabilities and closures; the initial prototype uses Scheme [10]. Sneaky will include a large and ever-growing library of packaged functionality to aid rapid debugger extension.

Next: Integration within the Up: Event and State-Based Debugging Previous: Event-Based Behavioral Abstraction

Sameer Suresh Shende
Fri Apr 5 10:36:18 PST 1996