There are several motivations for observing the performance of a parallel application during its execution (e.g., to terminate a long-running job or to steer performance variables). The downside is that observation itself may degrade performance. This trade-off calls for a means of capturing just enough performance information to make essential performance ``views'' possible. There is also the issue of how external performance analysis tools access the performance data resident in the program's execution state. The overhead of data transfer is a pragmatic problem (more data takes longer to send), but flexible control, through a well-defined external interface, over what data to send and even what level of instrumentation to enable may allow certain performance measurement decisions to be made at runtime.
In this paper, we describe the dynamic performance callstack tool that we are developing as part of the TAU profiling package for parallel, multi-threaded C++ programs. We also discuss how the performance callstack information is made available, through the DAQV-II tool interaction framework, to analysis and visualization tools running in the program's computational environment.