
CHAPTER II
MOTIVATION

The introductory chapter suggested a few characteristics of a sophisticated visualization environment for parallel programs and performance. This chapter will develop those concepts and others more thoroughly.

By now, the importance of graphical visualization in parallel performance evaluation has been established by tools such as ParaGraph [9,10], Sieve [42], Pablo [33], Voyeur [44], Seeplex [4], and Traceview [25]. These environments offer the parallel programmer access to information and insights that might otherwise go unnoticed. Whether a parallel program executes on 4 or 4,000 processors, the ability to view performance data in a useful way often enables the programmer to identify anomalous behavior within a program. Subtle changes in the way a program performs its computation, for example, could yield substantial improvements in performance. Without visualization techniques to aid in the discovery of such problems, those improvements might never be found [9].

Clearly, though, performance visualization is not a panacea for the performance issues facing parallel programmers. As suggested earlier, visualizations have to be "useful" in the sense that they elucidate performance behavior. However, "useful" is a subjective term. In the domain of scientific visualization, for instance, a display that is "useful" to a scientist in a jet propulsion laboratory may be inappropriate for a researcher analyzing ocean currents. The notion of "usefulness" as it applies to scientific visualization applies to performance visualization as well. The important point is that if a visualization helps even a single person do their work better, then it should be considered useful.

Current visualization techniques, though, force performance tool developers to predetermine (i.e., before implementation) which set of visualizations will be useful to the greatest number of users. Determining the effectiveness of visualizations is difficult without usability case studies (such as [9]) and a more formal evaluation framework (which might incorporate the ideas in [29]). While application-specific displays have their place in performance visualization, some displays are meaningful to large numbers of users; the success of the tools mentioned above is testimony to this fact. However, the degree to which a visualization is general-purpose or application-specific is difficult to quantify. Nonetheless, both theory and practice strongly suggest the need for a wide range of application-specific visualizations to augment a general-purpose set. To date, this need has been difficult to fulfill because of the considerable overhead in creating, evaluating, and formalizing performance visualizations, but its importance has been documented by Stasko and Kraemer [46]. Heath and Etheridge [10], creators of the general-purpose ParaGraph displays, even acknowledge the importance of application-specific displays:

In general, this wide applicability is a virtue, but knowledge of the application often lets you design a special-purpose display that reveals greater detail or insight than generic displays would permit. (p. 38)

Unfortunately, such displays are not easily created in a tool like ParaGraph, since they require special programming skills [10]. Clearly, a development process that requires little overhead and programming would enable developers to generate application-specific displays quickly in response to user needs, as well as to create and evaluate general-purpose visualizations.

Computer animations of the ozone hole, a thunderstorm, or ocean currents - all appropriately described as application-specific displays - have become commonplace in scientific visualization. How have scientists in these fields overcome the overhead involved in visualization development? They use generalized data visualization products in which most, or all, of the tedious graphics and data manipulation programming is already done, although constructing scenarios for visualizing scientific data still involves a creative process. Performance visualization developers, on the other hand, have heretofore chosen to build dedicated graphics and data manipulation support from the ground up, inadvertently limiting the type and variety of displays available to the user. In other words, tool developers have focused on providing the visualizations themselves rather than robust environments and techniques for creating them.

As parallel computing architectures, environments, languages, and applications continue to advance, performance visualization needs will become more demanding. Most existing tools are limited to two-dimensional displays, offer little in the way of customization or display interaction, and impose rigid data formats. Three-dimensional visualization in conjunction with advanced graphical techniques has opened up entirely new possibilities for researchers in scientific fields, and it stands to do the same for performance visualization.

Determining whether the field of performance evaluation can benefit from such next-generation visualizations requires a means of rapidly prototyping and evaluating new displays and display techniques. Applying the development process of existing performance visualization products to these new ideas would mean coding hundreds of three-dimensional graphics routines and interaction techniques, not to mention advanced data representation and manipulation capabilities. Many months later, a researcher might be in a position to begin prototyping and evaluating new visualizations.

The methodology proposed herein is based on a formal foundation in which performance abstractions are mapped to visual abstractions [28]. More generally, the methodology defines a framework for interfacing with existing visualization systems (e.g., IBM Data Explorer, AVS), graphical programming libraries (e.g., OpenGL), and other graphics resources. In this manner, an existing visualization package is but one means of implementing the formal, high-level abstractions. While this work has focused on the use of IBM's Data Explorer, any number of similar products could be applied. At the very least, the methodology allows visualizations to be prototyped quickly and with minimal overhead, making displays available for evaluation without committing months of programming to a visualization project. The process avoids graphics programming completely, yet maintains access to numerous display styles and interaction techniques. In essence, developers are able to focus on visualization design rather than the underlying implementation of data models and low-level graphical operations. This research thus offers a new technique for the development of parallel program and performance visualizations. Even if the scientific visualization package is not suitable for the final implementation, researchers can at least perform much-needed usability tests and determine whether their displays are effective before final implementation begins.
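To make the mapping idea concrete, the following sketch illustrates in Python how performance abstractions might be kept separate from the visual abstractions that render them, with a visualization package serving as one interchangeable backend. This is a minimal sketch of the general pattern, not the methodology's actual interface; every class and method name here is invented for illustration and does not correspond to the API of Data Explorer, AVS, or any other tool.

    # Hypothetical sketch: performance data is described independently of
    # graphics, and a backend-neutral visual abstraction maps it to a
    # concrete display. A package like IBM Data Explorer would be one
    # possible backend; the textual backend below merely stands in for it.

    from abc import ABC, abstractmethod
    from dataclasses import dataclass
    from typing import List


    @dataclass
    class PerformanceAbstraction:
        """High-level performance data, independent of any graphics system."""
        metric: str           # e.g., "cpu_utilization"
        values: List[float]   # one value per processor, in [0, 1]


    class VisualAbstraction(ABC):
        """Backend-neutral display; subclasses delegate to an existing
        visualization system rather than to hand-written graphics code."""

        @abstractmethod
        def render(self, perf: PerformanceAbstraction) -> None:
            ...


    class BarDisplay(VisualAbstraction):
        """Maps per-processor values to bar lengths (a textual stand-in
        for what a visualization package would draw in 2-D or 3-D)."""

        def render(self, perf: PerformanceAbstraction) -> None:
            for proc, value in enumerate(perf.values):
                print(f"P{proc:02d} {perf.metric}: " + "#" * round(value * 20))


    if __name__ == "__main__":
        data = PerformanceAbstraction("cpu_utilization",
                                      [0.90, 0.40, 0.75, 0.60])
        BarDisplay().render(data)  # a different backend could be swapped in
                                   # without touching the performance data

The point of the separation is the one made above: the developer works at the level of the two abstractions and their mapping, while the chosen backend supplies the data models and low-level graphical operations.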

As a final note, while this research greatly facilitates the evaluation of new and existing visualizations, evaluation is not its goal at this point. Many visualizations are presented in the pages that follow, but any "evaluation" that takes place is ultimately aimed at the development techniques being explored, not at the visualizations themselves.

