Triggers for Trace Dumping


The interactive character of online trace analysis requires a strategy for triggering the TAU runtime system to dump trace data. We considered four options:

Buffer size driven: The tracing library itself decides when to store trace data. One approach is to dump data whenever the internal memory buffers are full. This is easy to implement and requires no external interaction with the measurement library. Unfortunately, it may produce large amounts of data unless it is combined with filtering techniques. Also, the different contexts (i.e., processes) will produce trace data at different times, depending on when their buffers fill up.
Application driven: The tracing library could provide the application with an API to explicitly dump the current trace buffers. (Such an API is already available in TAU.) The advantage of this approach is that the amount and frequency of trace dumping are entirely under the control of the application. We expect this approach to be desired in many scenarios.
Timer driven: To make dumps occur at regular intervals, a timer could be used to generate an interrupt in each context that causes the trace buffers to be dumped, regardless of the amount of trace data they currently hold. In theory, this is simple to implement. Unfortunately, there is no general, widely portable solution to interrupt handling on parallel platforms.
User driven: Here the user decides interactively when to trigger the dump process. Assuming the user is sitting in front of a remote visualization client (e.g., vng), the trigger information needs to be transported to the cluster and to the running application (i.e., the trace measurement system). Again, this requires some sort of inter-process signaling. Of the four options, we regard this approach as the most challenging, but also the most desirable.

For the work presented here, we implemented the buffer size driven and application driven triggering mechanisms. These are generally termed "push" models, since they use internal decision strategies to push trace data out to the merging and analysis processes. In contrast, the "pull" models based on the timer driven or user driven approaches require some form of inter-process signaling support. Our plan was to first use the simpler push approaches to validate the full online tracing system before implementing additional pull mechanisms.
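
The two push triggers can be illustrated with a minimal sketch. It is not TAU code and does not use the TAU API: the class TraceBuffer, its methods record and dump, the sink callback, and the (timestamp, name) event layout are all hypothetical names chosen for illustration. The sketch only shows how a single per-context buffer might combine a buffer size driven dump with an explicit, application driven one.

```python
from typing import Callable, List, Tuple

# Hypothetical event record: (timestamp, event name). Not TAU's trace format.
Event = Tuple[float, str]


class TraceBuffer:
    """Toy per-context trace buffer with two push-style dump triggers."""

    def __init__(self, capacity: int, sink: Callable[[List[Event]], None]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.sink = sink  # stands in for the merging and analysis processes
        self.events: List[Event] = []

    def record(self, timestamp: float, name: str) -> None:
        """Append one event; dump automatically once the buffer is full."""
        self.events.append((timestamp, name))
        # Buffer size driven trigger: the library decides on its own.
        if len(self.events) >= self.capacity:
            self.dump()

    def dump(self) -> None:
        """Push all buffered events to the sink and start a fresh buffer.

        Application driven trigger: the application may call this directly
        at any point it chooses.
        """
        if self.events:
            self.sink(self.events)
            self.events = []


# Usage: collect dumps in a list instead of sending them to a merger.
dumps: List[List[Event]] = []
buf = TraceBuffer(capacity=3, sink=dumps.append)

for i in range(4):
    buf.record(float(i), f"event{i}")  # third call fills the buffer and dumps

buf.dump()  # explicit dump of the one remaining event
```

With a capacity of three, recording four events produces one automatic dump of three events, and the explicit call then pushes out the remaining one. In a multi-context run each process would own such a buffer, which is why buffer size driven dumps arrive at different times from different contexts.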

Sameer Suresh Shende 2003-09-12