Section 2 gave a rough sketch of the analysis server's internal architecture; we now go into further detail. Figure 2 can be regarded as a close-up of the left part of the architecture overview. On the right-hand side we see an MPI boss process, which is responsible for the interaction with the client and for controlling the worker processes. On the left-hand side, identical MPI worker processes are depicted stacked, so that only the uppermost process is actually visible.
Every MPI worker process has one master thread that handles the MPI communication with the boss and, if required, with other MPI workers. The master thread is created once at startup and keeps running until the server is terminated. Depending on the number of clients to be served, every MPI process also has a dynamically changing number of session threads that handle the clients' requests. Communication between MPI processes uses standard MPI constructs, whereas the process-local threads communicate through shared buffers synchronized by mutexes and condition variables. This allows for low-overhead interaction between the mostly independent components.
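The thread-level hand-off described above can be illustrated with a minimal sketch. It is not the server's actual code: the class `SharedBuffer`, the function `run_demo`, the `None` shutdown sentinel, and the squaring "request" are all hypothetical names and choices made for illustration, and Python's `threading` module stands in for whatever thread library the server uses. A master thread places requests in one bounded buffer and collects replies from another, while a session thread does the reverse.

```python
import threading
from collections import deque


class SharedBuffer:
    """Bounded FIFO guarded by one mutex and two condition variables."""

    def __init__(self, capacity=4):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        # Both condition variables share the same mutex.
        self._not_empty = threading.Condition(self._lock)
        self._not_full = threading.Condition(self._lock)

    def put(self, item):
        with self._not_full:
            # Re-check the predicate in a loop to tolerate spurious wakeups.
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(n_requests=10):
    """Master thread sends integers; a session thread replies with squares."""
    requests = SharedBuffer()
    replies = SharedBuffer()

    def session_thread():
        # Hypothetical session thread: serves requests until the
        # sentinel None arrives.
        while True:
            request = requests.get()
            if request is None:
                break
            replies.put(request * request)

    session = threading.Thread(target=session_thread)
    session.start()

    results = []
    for i in range(n_requests):
        requests.put(i)
        results.append(replies.get())

    requests.put(None)  # ask the session thread to terminate
    session.join()
    return results
```

Because the two threads only meet inside `put` and `get`, neither needs to know anything else about the other, which is the sense in which the components stay mostly independent.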
The functionality of a session thread is divided into three module categories: analysis modules, event database modules, and trace format modules. Starting from the bottom, the trace format modules include parsers for the traditional Vampir trace format (VPT), the newly designed scalable trace format (STF) by Pallas, and the TAU trace format (TRC). The modular approach makes it easy to add other third-party formats. The database modules include storage objects for all supported event categories, such as functions, messages, and performance metrics. The final module category provides the analysis capabilities of the server. Modules of this type operate on the data provided by the database modules.
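The layering of the three module categories can be sketched as follows. This is an assumption-laden illustration, not the server's design: `TraceFormatModule`, `ToyFormat`, `EventDatabase`, `total_per_name`, and the `PARSERS` registry are hypothetical names, and the one-line-per-event "toy" syntax is invented for the example and does not reflect the layout of VPT, STF, or TRC.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class TraceFormatModule(ABC):
    """Bottom layer: turns raw trace text into (category, name, value) events."""

    @abstractmethod
    def parse(self, text):
        ...


class ToyFormat(TraceFormatModule):
    # Hypothetical line format "category name value".
    # This is NOT the layout of VPT, STF, or TRC.
    def parse(self, text):
        for line in text.splitlines():
            if line.strip():
                category, name, value = line.split()
                yield category, name, float(value)


class EventDatabase:
    """Middle layer: one storage container per event category."""

    def __init__(self):
        self._events = defaultdict(list)

    def load(self, parser, text):
        for category, name, value in parser.parse(text):
            self._events[category].append((name, value))

    def events(self, category):
        return list(self._events.get(category, []))


def total_per_name(db, category):
    """Top layer: an analysis that reads the database, never the parser."""
    totals = defaultdict(float)
    for name, value in db.events(category):
        totals[name] += value
    return dict(totals)


# Supporting another format would mean adding one more entry here.
PARSERS = {"toy": ToyFormat()}

db = EventDatabase()
db.load(
    PARSERS["toy"],
    "function main 2.0\nfunction solve 1.5\nfunction main 0.5\nmessage send 64\n",
)
```

In this sketch the analysis function depends only on `EventDatabase`, so a new parser can be registered without touching the database or analysis layers.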
So far, only the worker processes have been discussed. For the boss process(es) the situation is slightly different. The thread layout of a boss process is identical to that of the worker processes. As in a worker process, the master thread is responsible for all MPI communication with the workers. The session threads, on the other hand, have different tasks. They merge the analysis results received from the workers, convert the results to a platform-independent format, and communicate with the clients, as depicted on the right-hand side of Figure 2.
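The two boss-side steps, merging and conversion, can be sketched under stated assumptions. The text does not specify the result type or the wire format, so both are invented here: partial results are assumed to be per-function integer counts, and the encoding (fixed-width fields in network byte order via Python's `struct` module) is merely one way to obtain bytes that do not depend on the host's endianness or word size. The function names are hypothetical.

```python
import struct
from collections import Counter


def merge_partial_results(partials):
    """Sum the per-function counts reported by each worker."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts
    return dict(merged)


def encode(result):
    """Serialize to a host-independent byte string.

    Assumed layout: entry count (uint32), then for each entry a name
    length (uint16), the UTF-8 name, and the value (uint64), all in
    network byte order ("!").
    """
    chunks = [struct.pack("!I", len(result))]
    for name in sorted(result):
        raw = name.encode("utf-8")
        chunks.append(struct.pack("!H", len(raw)))
        chunks.append(raw)
        chunks.append(struct.pack("!Q", result[name]))
    return b"".join(chunks)


def decode(data):
    """Inverse of encode, as a client on any platform could run it."""
    (count,) = struct.unpack_from("!I", data, 0)
    offset = 4
    result = {}
    for _ in range(count):
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
        name = data[offset:offset + length].decode("utf-8")
        offset += length
        (value,) = struct.unpack_from("!Q", data, offset)
        offset += 8
        result[name] = value
    return result
```

For example, merging `[{"main": 3, "solve": 1}, {"main": 2}, {"io": 4}]` yields `{"main": 5, "solve": 1, "io": 4}`, and `decode(encode(...))` returns the same dictionary regardless of the byte order of the machine that produced it.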