Overhead compensation for parallel profiling requires transmitting delay information with messages. Doing so inevitably introduces additional overhead of its own, in apparent contradiction to our goals. Our methods do not adequately account for this added overhead, nor is it obvious how they could or should. Although the approach described attempts to balance portability and efficiency, its overhead in practice will depend on how the underlying MPI implementation handles datatypes, and that handling may differ from one network interface to another. If the technique is deployed in production environments, it will be important to evaluate the MPI implementations in use to determine how they affect this overhead.

Scott Biersdorff 2007-02-02