More sophisticated forms of mixed-mode parallelism are possible when software layers are built to hide the intricacies of efficient communication or data distribution, presenting a compiler backend or an application programmer with a set of well-defined, portable interfaces for general parallel programming. In some cases, these layers mix languages, libraries, runtime software, and application components to create a ``hybrid'' parallel development environment that supports higher-level or hierarchical parallel programming abstractions. This challenges performance tools not only to work with each of the components involved, but also to map performance data to the parallel execution abstractions and user-level performance views.
We have used TAU to investigate task and data parallel execution in the Opus/HPF programming system. Figure 4.7 shows a Vampir display of TAU traces generated from an application written using HPF for data parallelism and Opus for task parallelism. The HPF compiler produces Fortran 90 data-parallel modules that can execute on multiple processes. The processes interoperate using the Opus runtime system, which is built on MPI and pthreads. In systems of this type, it is important to be able to see the influence of each software level. TAU captures performance data in different parts of the Opus/HPF system, exposing bottlenecks within and between levels.
Figure: Vampir displays for TAU traces of an Opus/HPF application using MPI and pthreads
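To make the idea of seeing the influence of each software level more concrete, the following is a minimal sketch of how enter/exit trace events might be attributed to layers. It is not TAU's trace format or analysis code: the `Event` record, the layer labels ("HPF", "Opus", "MPI"), the routine names, and the function `exclusive_time_by_layer` are all hypothetical names chosen for illustration, and the events are assumed to be well nested within each thread.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical trace record; not the TAU or Vampir trace format."""
    time: float    # timestamp in seconds
    thread: tuple  # (process rank, thread id)
    kind: str      # "enter" or "exit"
    layer: str     # software level the routine belongs to
    name: str      # routine name (illustrative label only)


def exclusive_time_by_layer(events):
    """Charge each thread's elapsed time to the layer on top of its call stack.

    Assumes enter/exit events are well nested within each thread.
    """
    totals = defaultdict(float)
    stacks = defaultdict(list)  # per-thread stack of active layers
    last_time = {}              # per-thread timestamp of the previous event
    for ev in sorted(events, key=lambda e: e.time):
        stack = stacks[ev.thread]
        if stack:
            # Time since the previous event belongs to the innermost layer.
            totals[stack[-1]] += ev.time - last_time[ev.thread]
        if ev.kind == "enter":
            stack.append(ev.layer)
        else:
            stack.pop()
        last_time[ev.thread] = ev.time
    return dict(totals)


# One thread: an HPF routine calls into the Opus runtime, which sends a message.
sample = [
    Event(0.0, (0, 0), "enter", "HPF", "solve"),
    Event(1.0, (0, 0), "enter", "Opus", "method_call"),
    Event(1.5, (0, 0), "enter", "MPI", "MPI_Send"),
    Event(2.5, (0, 0), "exit", "MPI", "MPI_Send"),
    Event(3.0, (0, 0), "exit", "Opus", "method_call"),
    Event(4.0, (0, 0), "exit", "HPF", "solve"),
]
print(exclusive_time_by_layer(sample))
```

In this sketch, time is charged to the innermost active layer, so a long MPI wait inside an Opus call shows up under the communication layer rather than under the data-parallel code that triggered it. A per-layer breakdown of this kind is one possible way to distinguish bottlenecks within a level from those between levels.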