Speedy: An Integrated Performance Extrapolation Tool for pC++ Programs
Bernd Mohr, Kesavan Shanmugam, Allen Malony
Technical Report (Dec 1969)
Keywords: performance prediction, extrapolation, object-parallel programming, trace-driven simulation, performance debugging tools, and modeling.
Performance extrapolation is the process of evaluating the performance of a parallel program in a target execution environment using performance information obtained for the same program in a different execution environment. Performance extrapolation techniques are suited for rapid performance tuning of parallel programs, particularly when the target environment is unavailable. This paper describes one such technique that was developed for data-parallel C++ programs written in the pC++ language. In pC++, the programmer can distribute a collection of objects to various processors and can have methods invoked on those objects execute in parallel. Using performance extrapolation in the development of pC++ applications allows tuning decisions to be made in advance of detailed execution performance measurements. The current pC++ language system includes Τ, an integrated environment for analyzing and tuning the performance of pC++ programs. This paper presents speedy, a new addition to Τ, that predicts the performance of pC++ programs on parallel machines using extrapolation techniques. Speedy uses the existing instrumentation support of Τ to capture high-level event traces of an n-thread pC++ program run on a uniprocessor machine (made possible by pC++'s multithreaded runtime system), and then applies trace-driven simulation to those traces to predict the performance of the program on a target n-processor machine. We describe how speedy works and how it is integrated into Τ. We also show how speedy can be used to evaluate and tune a pC++ program for a given target environment.
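The general idea of trace-driven extrapolation can be illustrated with a minimal sketch. It is not speedy's actual simulator or trace format: the event kinds (`compute`, `remote`, `barrier`), the function name `extrapolate`, and the three target-machine parameters are all assumptions chosen for illustration. Each thread's trace, recorded on the uniprocessor, is replayed on its own simulated processor. Compute intervals are rescaled by an assumed processor speed ratio, each remote access is charged an assumed fixed latency, and barriers synchronize all threads.

```python
from typing import List, Tuple

# One trace event: (kind, measured_duration_seconds).
# The duration is only meaningful for "compute" events in this sketch.
Event = Tuple[str, float]


def extrapolate(traces: List[List[Event]], cpu_ratio: float,
                remote_latency: float, barrier_cost: float) -> float:
    """Predict the run time on an n-processor target from n per-thread traces.

    cpu_ratio      -- assumed target/measured time ratio for compute intervals
    remote_latency -- assumed fixed cost of one remote access on the target
    barrier_cost   -- assumed fixed cost of one barrier on the target
    """
    clocks = [0.0] * len(traces)   # simulated time of each processor
    cursors = [0] * len(traces)    # next unprocessed event of each thread
    while True:
        # Advance every thread up to its next barrier (or the end of its trace).
        for t, trace in enumerate(traces):
            while cursors[t] < len(trace):
                kind, duration = trace[cursors[t]]
                if kind == "barrier":
                    break
                if kind == "compute":
                    clocks[t] += duration * cpu_ratio
                elif kind == "remote":
                    clocks[t] += remote_latency
                else:
                    raise ValueError(f"unknown event kind: {kind}")
                cursors[t] += 1
        waiting = [t for t in range(len(traces)) if cursors[t] < len(traces[t])]
        if not waiting:
            # All traces are exhausted: the slowest processor sets the run time.
            return max(clocks, default=0.0)
        if len(waiting) != len(traces):
            raise ValueError("threads disagree on the number of barriers")
        # Every thread leaves the barrier when the last one has arrived.
        release = max(clocks) + barrier_cost
        for t in waiting:
            clocks[t] = release
            cursors[t] += 1


if __name__ == "__main__":
    traces = [
        [("compute", 2.0), ("barrier", 0.0), ("compute", 1.0)],
        [("compute", 4.0), ("remote", 0.0), ("barrier", 0.0), ("compute", 1.0)],
    ]
    print(extrapolate(traces, cpu_ratio=0.5, remote_latency=0.25, barrier_cost=0.125))
```

Varying the three parameters and rerunning the replay is the kind of "what if" experiment that extrapolation allows without access to the target machine. A realistic simulator would need far more detailed models of the processor, the network, and the runtime system than this sketch has.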