Observing the behavior of an adaptive CFD application reveals several interesting aspects of its execution. Such applications typically involve a domain decomposition of the simulation model across processors and an interaction of execution phases as the simulation proceeds in time. Each iteration may involve a repartitioning or adaptation of the underlying computational structure to improve numerical or load-balance properties. For example, mesh refinement might be performed at iteration boundaries, and information about the convergence or divergence of the numerical algorithms reported. Domain-specific information, such as the number of cells refined at each stage, also gives the user valuable feedback on the progress of the computation.
Performance evaluation tools must capture and present key application-specific data and correlate this information with performance metrics to provide useful feedback to the user. Presenting performance information in terms of application-specific abstractions is a challenging task. Typically, profilers present performance metrics as a group of tables, one for each MPI task. Each row in a table represents a routine, and each column specifies a metric such as the exclusive or inclusive time spent in that routine or the number of calls executed. These metrics are typically aggregated over all invocations of the routine. While such information is useful for identifying the routines that contribute most to the overall execution time, it does not explain the performance of the routines with respect to key application phases. To address this shortcoming, we provide several profiling schemes in TAU.
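The limitation of the flat per-routine table described above can be illustrated with a minimal sketch. It is not TAU's implementation or data format: the `Row` type, the `flat_profile` function, the routine names, and the timings are all hypothetical, chosen only to show how aggregating over every invocation discards the iteration in which the time was spent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Row:
    """One row of a flat profile table (hypothetical layout)."""
    calls: int = 0
    inclusive: float = 0.0  # time in the routine including its callees
    exclusive: float = 0.0  # time in the routine excluding its callees


def flat_profile(events):
    """Aggregate (routine, inclusive, callee_time) records for one MPI task.

    Every invocation of a routine is folded into the same row, so the
    iteration (phase) in which it ran is not recoverable afterwards.
    """
    table = defaultdict(Row)
    for routine, inclusive, callee_time in events:
        row = table[routine]
        row.calls += 1
        row.inclusive += inclusive
        row.exclusive += inclusive - callee_time
    return dict(table)


# Made-up timings (seconds) for two iterations of an adaptive solver.
# The second iteration is slower because the mesh was refined.
events = [
    ("iterate", 6.0, 5.0),       # iteration 1
    ("solve", 4.0, 0.0),
    ("refine_mesh", 1.0, 0.0),
    ("iterate", 13.0, 12.0),     # iteration 2
    ("solve", 9.0, 0.0),
    ("refine_mesh", 3.0, 0.0),
]

profile = flat_profile(events)
for name, row in profile.items():
    print(f"{name:12s} calls={row.calls} incl={row.inclusive} excl={row.exclusive}")
```

In this sketch the table shows that `solve` dominates the execution time, but not that its cost more than doubled between the first and second iteration. That per-phase view is the information the flat table cannot supply.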