The performance callstack information provides a snapshot of a program's performance during execution. One common approach to making performance data available to tools is to save it in a trace file for post-execution analysis [9,11]. Because multiple nodes of an application produce callstack data, multiple trace files are needed, and they must be merged to give a complete, time-consistent view of the application's performance. This approach can easily be extended to provide access to the callstack data while the application is executing, provided the trace files can be shared with the analysis tools. However, this complicates tool design, and if the application processes are running on distributed machines, a shared-file approach may not be possible.
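The time-consistent merge of per-node trace files can be sketched as a k-way merge of timestamp-sorted event streams. The record format below is hypothetical, purely for illustration, and is not TAU's actual trace format:

```python
import heapq

# Hypothetical per-node trace records: (timestamp, node_id, event).
# Each node's trace is assumed to be sorted by timestamp already, so a
# k-way merge yields one time-consistent event stream for the tools.
def merge_traces(per_node_traces):
    """Merge per-node event lists into a single time-ordered list."""
    return list(heapq.merge(*per_node_traces, key=lambda rec: rec[0]))

node0 = [(1, 0, "enter main"), (5, 0, "enter foo")]
node1 = [(2, 1, "enter main"), (4, 1, "enter bar")]

merged = merge_traces([node0, node1])
# Events from both nodes now appear in global timestamp order.
```

A real trace merger must also reconcile clock skew between nodes; this sketch assumes synchronized timestamps.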
Another alternative is to build into the application a runtime interface that allows external tools to access callstack data over a network. Unfortunately, instead of multiple files, tools must then deal with multiple network connections to application nodes, and the application must manage these connections and coordinate callstack data access. On the other hand, this approach couples the interaction between the application and the tools more tightly. Beyond the communications functionality, there are issues of application-tool synchronization, callstack data consistency, and performance perturbation. Ideally, a solution would not impact the application's performance significantly beyond the cost of file I/O, yet would allow callstack data to be accessed in a distributed environment through a simple tool interface.
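The runtime-interface alternative can be illustrated with a minimal, hypothetical sketch (not the paper's actual interface): the application serves its current callstack snapshot over a socket, and a tool connects to read it.

```python
import json
import socket
import threading

# Hypothetical runtime access interface: the application side serves one
# JSON-encoded callstack snapshot to a tool that connects over a socket.
def serve_callstack(callstack, host="127.0.0.1"):
    """Application side: serve a snapshot to one connecting tool."""
    srv = socket.create_server((host, 0))   # port 0: OS picks a free port

    def handle():
        conn, _ = srv.accept()
        with conn:
            conn.sendall(json.dumps(callstack).encode())
        srv.close()

    threading.Thread(target=handle, daemon=True).start()
    return srv.getsockname()[1]   # actual port the tool should connect to

def fetch_callstack(port, host="127.0.0.1"):
    """Tool side: connect and read the full snapshot."""
    with socket.create_connection((host, port)) as conn:
        buf = b""
        while chunk := conn.recv(4096):
            buf += chunk
    return json.loads(buf)
```

Even this toy version shows the coordination burden the paragraph describes: the application must spawn and manage the serving thread, and consistency of the snapshot while the program keeps running is left unaddressed.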
Because analysis tools work with the callstack data as a single, unified sample, it is desirable to provide a data access interface that obviates concern for where the individual parts of the callstack (the node, context, and thread parts) are located in the application's execution environment. Similarly, the application should not be bothered by the number and location of tools; it should merely inform the external interface of where the callstack data is located and when it is available. The ``glue'' between the tool and application interfaces must then implement the mapping of a high-level callstack data view to its individual parts and their runtime locations while servicing callstack access requests from multiple analysis tools. We decided to use the DAQV framework to implement the interfaces and glue for runtime access to performance callstack views. The DAQV system is described below, followed by its integration with TAU and its use for callstack access.
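The role of the ``glue'' layer can be sketched as a registry keyed by (node, context, thread): the application registers each part as it becomes available, and tools request the unified view without knowing the parts' locations. All names here are hypothetical illustrations, not DAQV's API:

```python
# Hypothetical sketch of the glue layer between the application and tool
# interfaces. The application registers where each (node, context, thread)
# part of a callstack sample lives; tools ask for the unified sample and
# never see the underlying distribution.
class CallstackRegistry:
    def __init__(self):
        self._parts = {}   # (node, context, thread) -> callstack data

    def register(self, node, context, thread, data):
        """Application side: announce that a part is available."""
        self._parts[(node, context, thread)] = data

    def unified_view(self):
        """Tool side: assemble all registered parts into one sample."""
        return {key: self._parts[key] for key in sorted(self._parts)}

reg = CallstackRegistry()
reg.register(0, 0, 0, ["main", "foo"])
reg.register(1, 0, 0, ["main", "bar"])
view = reg.unified_view()
```

In the actual system this mapping and request servicing is performed by DAQV, which also handles the network transport between the distributed application and multiple tools.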