The distributed parallel performance analysis architecture applied in this
paper was recently designed by researchers at the Dresden University of
Technology, Germany. Building on experience gained from the development of
the performance analysis tool Vampir [1], the new architecture uses a
distributed approach consisting of a parallel analysis server, running on a
segment of a parallel production environment, and a visualization client,
running on a remote graphics workstation. Both components interact over a
socket-based network connection; a minimal sketch of such a connection is
given after the goals listed below. In the discussion that follows, the
parallel analysis server together with the visualization client is referred
to as VNG. The major goals of the distributed parallel approach are:
- Keep performance data close to the location where they were created.
- Analyze event data in parallel to achieve increased scalability.
- Provide fast and easy remote performance analysis on end-user
platforms.
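The socket-based link between the two components can be pictured as an ordinary TCP connection. The sketch below shows one way the client side of such a connection might be established; the function name, host, and port arguments are illustrative assumptions and do not reflect the actual VNG wire protocol.

```c
/* Hypothetical sketch: opening a TCP connection from the visualization
 * client to the analysis server.  Host, port, and error handling are
 * simplified; the real VNG protocol is not specified here. */
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

int connect_to_server(const char *host, const char *port)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;    /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP stream socket */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;   /* caller sends analysis requests over this descriptor */
}
```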
VNG consists of two major components: an analysis server (vngd) and a
visualization client (vng), each intended to run on a different platform.
Figure 1 shows a high-level view of the
VNG architecture. Boxes represent modules of the components whereas arrows
indicate the interfaces between the different modules. The thickness of the
arrows gives a rough measure of the data volume to be transferred over an
interface, whereas the length of an arrow represents the expected latency for
that particular link.
Figure 1: VNG Architecture
![VNG Architecture](img1.jpg)
On the left-hand side of Figure 1 is the
analysis server, which is intended to execute on a dedicated segment of a
parallel machine. The reason for this is twofold. First, it allows the
analysis server to have closer access to the trace data generated by an
application being traced. Second, it allows the server to execute in
parallel. Indeed, the server is a heterogeneous parallel program, implemented
using MPI and pthreads, which consists of worker and boss processes. The
workers are responsible for trace data storage and analysis. Each of them
holds a part of the overall trace data to be analyzed. The bosses are
responsible for communication with the remote clients and decide how to
distribute analysis requests among the workers. Once the analysis requests
are completed, the bosses also merge the results into a single packaged
response that is sent to the client.
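The boss/worker division of labor can be illustrated with a small MPI program. The sketch below assumes a single boss on rank 0 and workers on the remaining ranks; the request encoding, the per-worker computation, and the reduction used to merge partial results are placeholders, not the actual vngd implementation.

```c
/* Sketch of a boss/worker analysis step, assuming one boss (rank 0) and
 * workers on all other ranks.  Request and result types are placeholders. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* The boss would receive this request from a remote client. */
    int request = (rank == 0) ? 42 : 0;

    /* Boss distributes the analysis request to all workers. */
    MPI_Bcast(&request, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Each worker analyzes only its local share of the trace data. */
    double partial = 0.0;
    if (rank != 0)
        partial = (double)rank * request;   /* placeholder computation */

    /* Boss merges the partial results into a single packaged response. */
    double merged = 0.0;
    MPI_Reduce(&partial, &merged, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("merged result from %d workers: %g\n", size - 1, merged);

    MPI_Finalize();
    return 0;
}
```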
The right-hand side of Figure 1 depicts the
visualization client(s) running on a local desktop graphics workstation. The
client is not intended to perform any time-consuming calculations; it is a
straightforward sequential GUI implementation with a look-and-feel very
similar to that of performance analysis tools such as Vampir and
Jumpshot [6]. For visualization purposes, it communicates with
the analysis server according to the user's preferences and inputs. Multiple
clients can connect to the analysis server at the same time, allowing
simultaneous distributed viewing of trace results.
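As an illustration of this request/response interaction, a client might issue a request over an established socket and render whatever packaged response the server returns. The request string and function below are hypothetical; the actual message format used by vng and vngd is not described in this paper.

```c
/* Hypothetical client-side request/response exchange; the textual
 * "TIMELINE" request is an illustrative placeholder, not the real protocol. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void request_timeline(int server_fd)
{
    const char *req = "TIMELINE 0.0 10.0\n";  /* hypothetical request */
    char reply[4096];

    if (write(server_fd, req, strlen(req)) < 0)
        return;

    ssize_t n = read(server_fd, reply, sizeof(reply) - 1);
    if (n > 0) {
        reply[n] = '\0';
        printf("server reply: %s\n", reply);  /* data handed to the GUI */
    }
}
```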