The distributed parallel performance analysis architecture applied in this
paper was recently designed by researchers at the Dresden University of
Technology, Germany. Building on experience gained from the development of
the performance analysis tool Vampir [1], the new architecture uses a
distributed approach consisting of a parallel analysis server, running on a
segment of a parallel production environment, and a visualization client,
running on a remote graphics workstation. Both components interact over a
socket-based network connection; a minimal sketch of such a connection is
given after the goals listed below. In the discussion that follows, the
parallel analysis server together with the visualization client is referred
to as VNG. The major goals of the distributed parallel approach are:
- Keep performance data close to the location where they were created.
- Analyze event data in parallel to achieve increased scalability.
- Provide fast and easy remote performance analysis on end-user
platforms.
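The socket-based link between the two components can be pictured as an ordinary TCP connection. The sketch below shows one way the client side of such a connection might be established; the function name, host, and port arguments are illustrative assumptions and do not reflect the actual VNG wire protocol.

```c
/* Hypothetical sketch: opening a TCP connection from the visualization
 * client to the analysis server.  Host, port, and error handling are
 * simplified; the real VNG protocol is not specified here. */
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

int connect_to_server(const char *host, const char *port)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;    /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP stream socket */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;   /* caller sends analysis requests over this descriptor */
}
```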
VNG consists of two major components: an analysis server (vngd) and a
visualization client (vng), each intended to run on a different platform.
Figure 1 shows a high-level view of the
VNG architecture. Boxes represent modules of the components whereas arrows
indicate the interfaces between the different modules. The thickness of the
arrows gives a rough measure of the data volume to be transferred over an
interface, whereas the length of an arrow represents the expected latency for
that particular link.
Figure 1: VNG Architecture
![VNG Architecture](img1.jpg)
On the left-hand side of Figure 1 is the
analysis server, which is intended to execute on a dedicated segment of a
parallel machine. The reason for this is twofold. First, it allows the
analysis server to have closer access to the trace data generated by an
application being traced. Second, it allows the server to execute in
parallel. Indeed, the server is a heterogeneous parallel program, implemented
using MPI and pthreads, which consists of worker and boss processes. The
workers are responsible for trace data storage and analysis. Each of them
holds a part of the overall trace data to be analyzed. The bosses are
responsible for communication with the remote clients and decide how to
distribute analysis requests among the workers. Once the analysis requests
are completed, the bosses also merge the results into a single packaged
response that is sent to the client.
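The boss/worker division of labor can be illustrated with a small MPI program. The sketch below assumes a single boss on rank 0 and workers on the remaining ranks; the request encoding, the per-worker computation, and the reduction used to merge partial results are placeholders, not the actual vngd implementation.

```c
/* Sketch of a boss/worker analysis step, assuming one boss (rank 0) and
 * workers on all other ranks.  Request and result types are placeholders. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* The boss would receive this request from a remote client. */
    int request = (rank == 0) ? 42 : 0;

    /* Boss distributes the analysis request to all workers. */
    MPI_Bcast(&request, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Each worker analyzes only its local share of the trace data. */
    double partial = 0.0;
    if (rank != 0)
        partial = (double)rank * request;   /* placeholder computation */

    /* Boss merges the partial results into a single packaged response. */
    double merged = 0.0;
    MPI_Reduce(&partial, &merged, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("merged result from %d workers: %g\n", size - 1, merged);

    MPI_Finalize();
    return 0;
}
```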
The right-hand side of Figure 1 depicts the
visualization client(s) running on a local desktop graphics workstation. The
client is not intended to perform any time-consuming calculations; it is a
straightforward sequential GUI implementation with a look-and-feel very
similar to that of performance analysis tools such as Vampir and
Jumpshot [6]. For visualization purposes, it communicates with
the analysis server according to the user's preferences and inputs. Multiple
clients can connect to the analysis server at the same time, allowing
simultaneous distributed viewing of trace results.
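As an illustration of this request/response interaction, a client might issue a request over an established socket and render whatever packaged response the server returns. The request string and function below are hypothetical; the actual message format used by vng and vngd is not described in this paper.

```c
/* Hypothetical client-side request/response exchange; the textual
 * "TIMELINE" request is an illustrative placeholder, not the real protocol. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void request_timeline(int server_fd)
{
    const char *req = "TIMELINE 0.0 10.0\n";  /* hypothetical request */
    char reply[4096];

    if (write(server_fd, req, strlen(req)) < 0)
        return;

    ssize_t n = read(server_fd, reply, sizeof(reply) - 1);
    if (n > 0) {
        reply[n] = '\0';
        printf("server reply: %s\n", reply);  /* data handed to the GUI */
    }
}
```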