
Integration into TAU

The TAU performance system [13] provides technology for performance instrumentation, measurement, and analysis of complex parallel systems. It targets a general computation model consisting of shared-memory nodes on which contexts reside; each context provides a virtual address space shared by multiple threads of execution. The model is general enough to apply to many high-performance scalable parallel systems and programming paradigms. Because TAU captures performance information at the node, context, and thread levels, this information can be mapped to the particular parallel software and system execution platform under consideration.
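As an illustration of the node/context/thread model, the following minimal sketch tags each measurement with a location triple and aggregates per-thread values at the context or node level. The names `Location`, `Sample`, and `sum_time` are hypothetical and chosen for this example only; they are not TAU's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical location triple; not TAU's actual representation. */
typedef struct {
    int node;     /* shared-memory node */
    int context;  /* virtual address space on that node */
    int thread;   /* thread of execution within the context */
} Location;

/* One per-thread measurement, assumed here to be a time in seconds. */
typedef struct {
    Location where;
    double   seconds;
} Sample;

/* Sum the time of all samples recorded on `node`.
   If `context` is negative, every context on the node is included;
   otherwise only samples from that context are counted. */
double sum_time(const Sample *s, size_t n, int node, int context) {
    double total = 0.0;
    size_t i;
    assert(s != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        if (s[i].where.node != node)
            continue;
        if (context >= 0 && s[i].where.context != context)
            continue;
        total += s[i].seconds;
    }
    return total;
}
```

Keeping the triple with every sample is one way to let the same data be viewed per thread, per context, or per node without re-running the experiment.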

TAU supports a flexible instrumentation model that allows access to a measurement API at several stages of program compilation and execution. The instrumentation identifies code segments, provides for mapping of low-level execution events to high-level computation entities, and works with multi-threaded and message-passing parallel execution models. It interfaces with the TAU measurement model, which can capture data for function, method, basic block, and statement execution. TAU provides two measurement choices: profiling and tracing. Performance experiments can be composed from different measurement modules, including ones that access hardware performance monitors. The TAU data analysis and presentation utilities offer text-based and graphical tools to visualize the performance data, as well as bridges to third-party software such as Vampir [14] for sophisticated trace analysis and visualization.
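The difference between the two measurement choices can be sketched as follows: a profile keeps one summary record per event and updates it in place, while a trace appends one timestamped record per occurrence. This is a generic illustration under assumed names (`ProfileEntry`, `TraceBuffer`, `profile_record`, `trace_record`), with timestamps passed in by the caller; it does not reproduce TAU's measurement library.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 64  /* arbitrary capacity chosen for this sketch */

/* Profile: one summary record per event, updated in place. */
typedef struct {
    unsigned long calls;  /* number of occurrences seen so far */
    double        total;  /* accumulated time over all occurrences */
} ProfileEntry;

/* Trace: one timestamped record per occurrence, kept in order. */
typedef struct {
    int    event_id;
    double enter;
    double leave;
} TraceRecord;

typedef struct {
    TraceRecord records[MAX_EVENTS];
    size_t      count;
} TraceBuffer;

/* Fold one occurrence into the summary. */
void profile_record(ProfileEntry *p, double enter, double leave) {
    assert(p != NULL);
    p->calls += 1;
    p->total += leave - enter;
}

/* Append one occurrence; returns 0 on success, -1 if the buffer is full. */
int trace_record(TraceBuffer *t, int event_id, double enter, double leave) {
    assert(t != NULL);
    if (t->count >= MAX_EVENTS)
        return -1;
    t->records[t->count].event_id = event_id;
    t->records[t->count].enter = enter;
    t->records[t->count].leave = leave;
    t->count += 1;
    return 0;
}
```

The profile has constant size regardless of run length, whereas the trace grows with every event but preserves the ordering and timing needed by trace visualization tools.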

As with EXPERT, TAU implements the OpenMP performance API in a library that captures the OpenMP events and uses TAU's performance measurement facility to record performance data. For example, the same pomp functions shown in Section 4.1 would be implemented in TAU as follows:

  /* Global timer shared by all "for" constructs */
  TAU_GLOBAL_TIMER(tfor, "for enter/exit",
                   "[OpenMP]", OpenMP);

  void pomp_for_enter(OMPRegDescr* r) {
  #ifdef TAU_AGGREGATE_OPENMP_TIMINGS
    /* construct-based: aggregate over all regions */
    TAU_GLOBAL_TIMER_START(tfor);
  #endif
  #ifdef TAU_OPENMP_REGION_VIEW
    /* region-based: timer selected via the region descriptor */
    TauStartOpenMPRegionTimer(r);
  #endif
  }

  void pomp_for_exit(OMPRegDescr* r) {
  #ifdef TAU_AGGREGATE_OPENMP_TIMINGS
    TAU_GLOBAL_TIMER_STOP();
  #endif
  #ifdef TAU_OPENMP_REGION_VIEW
    TauStopOpenMPRegionTimer(r);
  #endif
  }

TAU supports construct-based as well as region-based performance measurement; in the code above, the two are enabled by TAU_AGGREGATE_OPENMP_TIMINGS and TAU_OPENMP_REGION_VIEW, respectively. Construct-based measurement uses globally accessible timers to aggregate construct-specific performance cost over all regions. Region-based measurement, as in EXPERT, uses the region descriptor to select the specific performance data for that context. Following this instrumentation approach, all of TAU's functionality is accessible to the user, including the ability to select profiling or tracing, enable hardware performance monitoring, and add MPI instrumentation for performance measurement of hybrid applications.
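To make the distinction between the two views concrete, the sketch below keeps one accumulator for the construct and one per region, and updates both on exit. It is a single-threaded illustration under stated assumptions: `RegionDescr` is a hypothetical stand-in (the layout of `OMPRegDescr` is not given here), timestamps are passed in explicitly, and none of the function names belong to TAU.

```c
#include <assert.h>

#define MAX_REGIONS 16  /* arbitrary limit chosen for this sketch */

/* Hypothetical stand-in for a region descriptor. */
typedef struct {
    int id;  /* index of the source region, assumed < MAX_REGIONS */
} RegionDescr;

static double construct_total;            /* one timer for the construct */
static double region_total[MAX_REGIONS];  /* one timer per source region */
static double start_time;                 /* pending start timestamp */
static int    running;                    /* nonzero between enter and exit */

void for_enter(const RegionDescr *r, double now) {
    (void)r;  /* the descriptor is only needed on exit in this sketch */
    assert(!running);
    start_time = now;
    running = 1;
}

void for_exit(const RegionDescr *r, double now) {
    double elapsed;
    assert(running);
    assert(r->id >= 0 && r->id < MAX_REGIONS);
    elapsed = now - start_time;
    construct_total += elapsed;      /* construct-based view */
    region_total[r->id] += elapsed;  /* region-based view */
    running = 0;
}

double construct_time(void) { return construct_total; }

double region_time(const RegionDescr *r) { return region_total[r->id]; }
```

With two regions `a` and `b`, calling `for_enter`/`for_exit` on each accumulates their separate costs in `region_time(&a)` and `region_time(&b)`, while `construct_time()` reports the sum over both.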






Sameer Suresh Shende
Thu Aug 23 11:19:57 PDT 2001