Colloquium Details

Automatic Performance Analysis on Parallel Computers with SMP Nodes

Author: Felix Wolf, Research Centre Juelich, Germany
Date: October 21, 2002
Time: 15:30
Location: 220 Deschutes

Note: Special Day

Abstract

Parallel computers with SMP nodes provide both multithreading and message passing as their modes of parallel execution. The complexity of the performance problems that can arise in these systems is addressed by formally characterizing the problems in terms of execution patterns that represent situations of inefficient behavior. These patterns are specified as compound events that serve as input to an automatic analysis process, which recognizes and quantifies the inefficient behavior in event traces. Mechanisms that hide the complex relationships within compound-event specifications allow a simple description of complex inefficient behavior at a high level of abstraction.
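As an illustration of what recognizing and quantifying a pattern in an event trace can look like, the following minimal Python sketch detects a "late sender" situation, in which a receive is entered before the matching send and the receiver waits. The pattern choice, the `Event` record, and the function `late_sender_waits` are assumptions made for this example; they are not taken from the abstract and do not reproduce EXPERT's specification mechanism.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record: the moment a send or receive call is entered."""
    time: float    # timestamp in seconds
    process: int   # rank that recorded the event
    kind: str      # "send" or "recv"
    peer: int      # rank of the communication partner
    tag: int = 0   # message tag


def late_sender_waits(trace):
    """Return {receiver rank: seconds spent waiting for late senders}.

    Sends and receives are matched first-in, first-out per
    (sender, receiver, tag). A receive entered before its matching
    send forms the compound event; the gap is the waiting time.
    """
    sends = defaultdict(deque)   # unmatched sends per channel
    recvs = defaultdict(deque)   # unmatched receives per channel
    waits = defaultdict(float)
    for ev in sorted(trace, key=lambda e: e.time):
        if ev.kind == "send":
            key = (ev.process, ev.peer, ev.tag)
            if recvs[key]:
                # The receiver was already waiting: late sender.
                recv = recvs[key].popleft()
                waits[recv.process] += ev.time - recv.time
            else:
                sends[key].append(ev)
        elif ev.kind == "recv":
            key = (ev.peer, ev.process, ev.tag)
            if sends[key]:
                # The send was entered first: no waiting is attributed.
                sends[key].popleft()
            else:
                recvs[key].append(ev)
    return dict(waits)


trace = [
    Event(1.0, 1, "recv", 0),   # rank 1 enters receive early
    Event(3.5, 0, "send", 1),   # rank 0 sends 2.5 s later
    Event(4.0, 0, "send", 1),   # second send is entered first
    Event(5.0, 1, "recv", 0),   # matching receive does not wait
]
print(late_sender_waits(trace))
```

The sketch simplifies deliberately: it ignores call exit times, message transfer time, and wildcard receives, all of which a real trace analyzer would have to handle.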

The analysis process automatically transforms event traces into a scalable representation of performance behavior, allowing fast and easy identification of performance bottlenecks at varying levels of granularity along the dimensions of problem type, call graph, and process or thread. The uniform mapping of performance behavior onto the corresponding fraction of execution time enables the convenient correlation of different performance behaviors using only a single integrated view. A modular analysis architecture separates the performance-problem specifications from the actual analysis process, simplifying the extension and customization of predefined performance problems to meet individual (e.g., application-specific) needs.
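One way to picture the three-dimensional representation is as a table keyed by (problem type, call path, process or thread) whose values are seconds lost, which can then be summed along any dimension and expressed as a fraction of execution time. The Python sketch below assumes exactly that layout; the function `severity_view`, the dimension names, and the sample numbers are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Assumed dimension order of each key in the severity table.
DIMENSIONS = ("problem", "call_path", "location")


def severity_view(severities, total_time, keep=DIMENSIONS):
    """Aggregate lost time onto the dimensions in `keep`.

    `severities` maps (problem, call_path, location) to seconds.
    The result maps the reduced key to a fraction of `total_time`,
    so different problems share one common unit.
    """
    indices = [DIMENSIONS.index(name) for name in keep]
    view = defaultdict(float)
    for key, seconds in severities.items():
        reduced = tuple(key[i] for i in indices)
        view[reduced] += seconds / total_time
    return dict(view)


# Illustrative data: seconds lost per problem, call path, and process.
severities = {
    ("late_sender", "main/solve/MPI_Recv", 0): 2.0,
    ("late_sender", "main/solve/MPI_Recv", 1): 1.0,
    ("barrier_wait", "main/sync", 1): 1.0,
}

# Coarse view: which problem type costs the most overall?
print(severity_view(severities, total_time=8.0, keep=("problem",)))
# Finer view: where does each process lose its time?
print(severity_view(severities, total_time=8.0, keep=("problem", "location")))
```

Because every entry is normalized by the same total, values for different problem types can be compared directly, which is the property the single integrated view relies on.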

To demonstrate the methodology in real parallel-programming environments, it was applied to the programming interfaces MPI and OpenMP and to their combination. To show the methodology's usefulness in practice, the performance-tool prototype EXPERT was implemented and successfully tested on several real-world applications.

Biography