OpenMP is a parallel programming language system for expressing shared-memory parallelism. It is based on the model of (nested) fork-join parallelism and on the notion of parallel regions, in which computational work is shared among multiple threads of execution (a thread team); see Figure 1. The language constructs provide thread synchronization, both explicit and implicit, to enforce consistent operation. OpenMP is implemented using comment-style compiler directives in Fortran and pragmas in C and C++.
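As a minimal sketch of the fork-join model described above, the following C fragment opens a parallel region, shares a loop among the thread team, and relies on the implicit barrier at the end of the work-sharing construct. The function name parallel_sum is purely illustrative and is not part of OpenMP or of any tool discussed here. The header guard on _OPENMP is only there so that the fragment also compiles, and runs serially, when OpenMP support is disabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Illustrative helper (hypothetical name): sum the integers 1..n
   inside an OpenMP parallel region. */
long parallel_sum(int n)
{
    assert(n >= 0);
    long total = 0;

    /* Fork: a team of threads is created at this directive. */
    #pragma omp parallel
    {
        /* Work-sharing: loop iterations are divided among the team.
           The reduction clause gives each thread a private partial
           sum that is combined into total at the end of the loop. */
        #pragma omp for reduction(+:total)
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        /* Implicit synchronization: the for construct ends with a
           barrier unless a nowait clause is given. */
    }
    /* Join: only the master thread continues past the region. */

    return total;
}
```

Calling parallel_sum(100) yields the same result as the serial loop, regardless of the number of threads in the team.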
A performance model for OpenMP can be defined based on its execution events and states. We advocate multiple performance views based on a hierarchy of execution states where each level is more refined in focus:
Figure 1 shows a diagram of OpenMP parallel region operation. Identified are serial (S) and parallel (P) states, parallel startup (STARTUP) and shutdown (SHUTDOWN) states, and different events at different levels for master and slave threads. Based on this diagram, and given a workable performance instrumentation interface, we can develop measurement tools for capturing serial and parallel performance.
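To make the states concrete, the sketch below timestamps the transitions between S, STARTUP, P, and SHUTDOWN as seen by the master thread of a single parallel region. It is only an illustration under stated assumptions: the names region_times, now, and timed_region are hypothetical and do not represent the instrumentation interface proposed in this paper, and the timestamps are taken by hand-inserted code rather than by a tool. When compiled without OpenMP, the pragmas are ignored and the startup and shutdown intervals simply collapse to near zero.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical record of the time the master thread spends in each
   state of one parallel region. */
typedef struct {
    double startup;   /* STARTUP: fork until the master enters the body */
    double parallel;  /* P: master's time inside the region body */
    double shutdown;  /* SHUTDOWN: master leaves the body until the join */
} region_times;

/* Wall-clock timestamp; falls back to clock() without OpenMP. */
static double now(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Run a small work-shared loop and return the per-state times.
   The loop result is stored in *out. */
region_times timed_region(int n, double *out)
{
    assert(n >= 0 && out != NULL);
    region_times t = {0.0, 0.0, 0.0};
    double body_begin = 0.0, body_end = 0.0, sum = 0.0;

    double fork_time = now();           /* event: leaving state S */
    #pragma omp parallel
    {
        #pragma omp master
        body_begin = now();             /* event: master enters P */

        #pragma omp for reduction(+:sum)
        for (int i = 0; i < n; i++) {
            sum += 0.5 * i;
        }

        #pragma omp master
        body_end = now();               /* event: master leaves P */
    }
    double join_time = now();           /* event: back in state S */

    t.startup  = body_begin - fork_time;
    t.parallel = body_end - body_begin;
    t.shutdown = join_time - body_end;
    *out = sum;
    return t;
}
```

Only the master thread writes the shared timestamps, so no additional synchronization is needed for them. Recording the same events on the other threads of the team would require per-thread storage, which this sketch omits.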
Table 1: Proposed OpenMP Directive Transformations.