The XPARE tools are designed to complement an existing correctness and (minimal) performance testing harness for the ASCI C-SAFE Uintah project. However, the performance measurements employed there use only total execution time analyses. Regression testing against prior execution runs, while able to detect significant performance changes, is unable to provide the detailed performance information needed to identify the program component(s) most responsible. XPARE instead uses performance experiments instrumented for a wider range of performance measurements, as offered by the TAU system (the University of Oregon's Tuning and Analysis Utilities). Detailed profile data captured for all significant events of interest, as specified by C-SAFE Uintah developers, yields a significantly larger performance space for regression analysis and performance study.
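The operational framework of XPARE consists of five parts: an experiment launcher, a results transporter, a performance database manager, a performance reporter, and an alerting mechanism.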
The experiment launcher front end can be used manually to conduct experiments and produce performance data, as well as in an automated setting. It is capable of configuring, compiling, and executing performance experiments using batch system utilities. The results transporter is responsible for sending the performance profile data from the suite of experiments to the remote site where the performance database resides. Performance data is sent via email, which provides fault tolerance when the server is unavailable and keeps configuration of both client and server simple: no additional server or ports need be set up. Upon receipt, the performance database manager stores the profile data along with meta-information describing the experimental context. The performance reporter is a web interface that provides access to the database and displays cross- and inter-experiment performance results in graphical form. An easy-to-use configuration tool for the alerting mechanism allows the user to define thresholds for performance benchmarks for a given experiment setup. When regression analysis of a performance dataset determines that these thresholds have been exceeded, the alerting component notifies the corresponding parties of the violation.
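The threshold-based alerting described above can be illustrated with a minimal sketch. This is not XPARE's implementation: the `Threshold` class, the `find_violations` function, the event names, and the timings are all hypothetical, and the rule used here (relative slowdown of a per-event time against a baseline run) is only one plausible way to define a threshold.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Hypothetical alert rule: maximum tolerated slowdown for one event."""
    event: str
    max_slowdown: float  # e.g. 0.10 means 10% slower than the baseline


def find_violations(baseline, current, thresholds):
    """Return (event, relative_change) pairs that exceed their threshold.

    baseline and current map event names to measured times in seconds.
    """
    violations = []
    for rule in thresholds:
        old = baseline.get(rule.event)
        new = current.get(rule.event)
        if old is None or new is None or old <= 0:
            continue  # event missing from one run; nothing to compare
        change = (new - old) / old
        if change > rule.max_slowdown:
            violations.append((rule.event, change))
    return violations


# Made-up per-event timings, for illustration only.
baseline = {"solve": 10.0, "io": 2.0}
current = {"solve": 12.5, "io": 2.1}
rules = [Threshold("solve", 0.10), Threshold("io", 0.10)]

violations = find_violations(baseline, current, rules)
for event, change in violations:
    print(f"ALERT: {event} slowed by {change:.0%}")
```

With these invented numbers, only the `solve` event exceeds its 10% threshold. Because the comparison is made per event rather than on total execution time, the alert names the component responsible instead of only reporting that the run as a whole slowed down.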
The C-SAFE / Uintah testing system runs weekly using the batch systems at the Los Alamos National Laboratory. A suite of scaling experiments is performed. In addition to complementing the current regression testing system, as described above, we are using XPARE as the foundation for assembling a database of performance data to be used in future internal reviews of the C-SAFE / Uintah software engineering process. The XPARE-generated reporting mechanism will also be important in presenting a historical performance perspective for ASCI Level 1 center reviews.