Given the interest in OpenMP in the last few years, several research efforts have addressed performance measurement and analysis of OpenMP execution, but none of them has considered a common performance tool interface in the manner proposed in this paper. The OVALTINE tool helps determine the relevant overheads of a parallel OpenMP program compared to a serial implementation. It uses the Polaris Fortran 77 parser to build a basic abstract syntax tree, which it then instruments with counters and timers to determine overheads for various OpenMP constructs and code segments. The nature of the OVALTINE performance measurements suggests that our OpenMP performance API could be applied directly to generate the OpenMP events of interest, making a wider range of performance tools available for overhead analysis.
OMPtrace is a dynamic instrumentation package used to trace OpenMP execution on SGI and IBM platforms. It automatically captures OpenMP runtime system (RTS) events by intercepting calls to the RTS library. User functions can also be instrumented to generate trace events. The main advantage of OMPtrace is that there is no need to recompile the OpenMP program for performance analysis. In essence, OMPtrace uses the RTS interface as the performance tool interface, relying on interception at dynamic link time for instrumentation. Unfortunately, this approach works only when the OpenMP compiler transforms OpenMP constructs into function calls and when the RTS is available as a dynamic shared library. To bypass these restrictions, the OpenMP performance interface we propose could provide a suitable target for the performance tracing part of OMPtrace. A compatible pomp library would need to be developed to generate equivalent OMPtrace events and hardware counter data. In this manner, the Paraver tool for analysis and visualization of OMPtrace data could be used without modification.
The VGV tool combines the OpenMP compiler tools (Guide, GuideView) from KAI with the Vampir/Vampirtrace tracing tools from Pallas for OpenMP performance analysis and visualization. OpenMP instrumentation is provided by the Guide compiler for both profiling and tracing, and the Guide runtime system handles recording of thread events. Being compiler-based, the monitoring of OpenMP performance can be quite detailed and tightly integrated in the execution environment. However, the lack of an external API prevents other performance tools from observing OpenMP execution events. The performance interface we propose could be applied in the VGV context in the same manner as above. The pomp calls could be implemented in a library for VGV, mapping the OpenMP actions to Vampir state transition calls at appropriate points.
Another approach might be to have the Guide compiler generate the pomp instrumentation, allowing other pomp-compatible performance interface libraries to be used.
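The library-based mapping described for VGV can be illustrated with a minimal sketch in C. Everything named here is an assumption made for illustration: the `pomp_sketch_*` entry points, the `trace_state` values, and the `trace_transition` backend are hypothetical stand-ins, not the actual pomp interface or the Vampirtrace API. The sketch only shows the shape of the idea: each instrumentation call is forwarded as an enter or exit state transition to a tracing backend, here replaced by an in-memory buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state identifiers; not taken from any real tracing API. */
typedef enum { STATE_PARALLEL, STATE_BARRIER } trace_state;

/* One recorded state transition. */
typedef struct {
    int thread;        /* id of the thread reporting the event */
    trace_state state; /* state being entered or left */
    int is_enter;      /* 1 = enter the state, 0 = leave it */
} trace_record;

#define TRACE_CAPACITY 64
static trace_record trace_buf[TRACE_CAPACITY];
static size_t trace_len = 0;

/* Stand-in for the backend's state-transition call. A real library
 * would need per-thread buffers or locking; this sketch is serial. */
void trace_transition(int thread, trace_state s, int is_enter) {
    if (trace_len < TRACE_CAPACITY) {
        trace_record r = { thread, s, is_enter };
        trace_buf[trace_len++] = r;
    }
}

/* Hypothetical pomp-style entry points: each OpenMP action is mapped
 * to exactly one state transition. */
void pomp_sketch_parallel_begin(int thread) { trace_transition(thread, STATE_PARALLEL, 1); }
void pomp_sketch_parallel_end(int thread)   { trace_transition(thread, STATE_PARALLEL, 0); }
void pomp_sketch_barrier_enter(int thread)  { trace_transition(thread, STATE_BARRIER, 1); }
void pomp_sketch_barrier_exit(int thread)   { trace_transition(thread, STATE_BARRIER, 0); }

/* Accessors for inspecting the recorded trace. */
size_t trace_count(void) { return trace_len; }

trace_record trace_at(size_t i) {
    assert(i < trace_len); /* caller must stay within the recorded range */
    return trace_buf[i];
}
```

Because the instrumented program sees only the entry points, a different library exposing the same entry points could forward the events to another backend without changing the instrumentation.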
Lastly, the JOMP system is a source-to-source compiler that transforms OpenMP-like directives for Java into multithreaded Java statements that implement the equivalent OpenMP parallel operations. It is similar to our work in that it supports performance instrumentation as part of its directive transformation. This instrumentation generates events for analysis by Paraver. In a similar manner, the JOMP compiler could be modified to generate pomp calls. In this case, since JOMP manages its own threads to implement parallelism, it may be necessary to implement runtime support for pomp libraries to access thread information.