Showing 1 entry
Keywords: Parallel performance, perturbation analysis, observation
Performance observabiility is the ability to accurately capture, analyze, and present (collectively observe) information about the performance of a computer system. Advances in computer systems design, particularly with respect to parallel processing and supercomputers, have brought a crisis in performance observation -- computer systems technology is outpacing the tools to understand the performance behavior of and to operate the machines near the high-end of their performance range. In this thesis, we study the performance observability problem with emphasis on the practical design, development, and use of tools for performance measurement, analysis, and visualization.
Tools for performance observability must balance the need for performance data against the cost of obtaining it (environment complexity and performance intrusion) -- to little performance data makes performance analysis difficult; too much data perturbs the measurement system. We discuss several methods for performance measurement concentrating specifically on mechanisms for timing and tracing. We show how minor hardware and software modifications can enable better measurement tools to be built and describe results from a prototype hardware-based software monitor developed for the Intel iPSC/2 multiprocessor.
Any software performance measurement perturbs the measured system. We develop two models of performance perturbation to understand the effects of instrumentation intrusion: time-based and event-based. The time-based models use only measured time overheads of intrumentation to approximate actual execution time performance. We show that this model can give accurate approximations for sequential execution and for parallel execution with independent execution ordering. We use the event-based model to quantify the perturbation effects of instrumentations of parallel executions with ordering dependencies. Our results show that this model can be applied in practice to achieve accurate approximations. We also discuss the limitations of the time-based and event-based models.
The potentially large volume of detailed performance data requires new approaches to presentation that can show both gross performance characteristics while allowing users to focus on local performance behavior. We give several examples where performance visualization techniques have been effectively applied, plus discuss the architecture and a prototype of a general performance visualization environment.
Finally, we apply several of the performance measurement, analysis, and visualization techniques to a practical study of performance observability on the Cray X-MP and Cray 2 supercomputers. Our results show that even modest improvements in the existing set of performance tools for a particular machine can have significant benefits in performance evaluation capabiilities.
Created: Tue Mar 2 09:34:47 2004
Return to the ParaDucks Research Group Publications page.