- S. Shende, A. D. Malony, J. Cuny, K. Lindlan, P. Beckman and S. Karmesin,
Portable Profiling and Tracing for Parallel Scientific Applications
Appears in: Proceedings of SPDT'98: ACM SIGMETRICS Symposium on Parallel and
Distributed Tools, pp. 134-145, Aug. 1998.
Summary: describes the TAU portable profiling package and its
application to the ACTS Toolkit.
- A. Malony and S. Shende, Performance Technology for Complex Parallel and
Distributed Systems, Proc. Third Austrian-Hungarian Workshop on
Distributed and Parallel Systems, DAPSYS 2000,
"Distributed and Parallel Systems: From Concepts to Applications,"
(Eds. G. Kotsis and P. Kacsuk) Kluwer, Norwell, MA, pp. 37-46, 2000.
Summary: describes how TAU is applied to a multi-threaded
multi-context execution model.
- S. Shende and A. D. Malony, Integration and Application of the TAU Performance System in Parallel Java Environments, Proceedings of the Joint ACM Java Grande - ISCOPE 2001 Conference, June 2001.
Summary: describes TAU's application to mpiJava, for Parallel Java programs.
- S. Shende, A. D. Malony, and R. Ansell-Bell, Instrumentation and
Measurement Strategies for Flexible and Portable Empirical Performance
Evaluation, Proceedings Tools and Techniques for Performance
Evaluation Workshop, PDPTA'01, C.S.R.E.A., June 2001.
Summary: describes TAU's instrumentation alternatives including DyninstAPI for parallel programs.
- T. Sheehan, A. Malony, S. Shende, A Runtime Monitoring Framework for the TAU Profiling System, Proceedings of the Third International Symposium on Computing in Object-Oriented Parallel
Environments (ISCOPE'99), San Francisco, CA, December 1999.
Summary: describes the TAU Monitoring Framework.
- S. Shende, A. D. Malony and S. Hackstadt,
Dynamic Performance Callstack Sampling: Merging TAU and DAQV,
Appears in: B. Kågström, J. Dongarra, E. Elmroth and J. Wasniewski (editors).
Applied Parallel Computing, 4th International Workshop, PARA'98,
Lecture Notes in Computer Science, No. 1541, Springer-Verlag, Berlin, 1998.
Summary: describes how callstacks can be sampled in TAU using DAQV-II.
- S. Shende,
Profiling and Tracing in Linux,
Appears in: Proceedings of the Extreme Linux Workshop #2, USENIX,
Monterey CA, June 1999.
Summary: gives a brief overview of profiling and tracing
tools in the context of the Linux operating system.
- Advanced Computing Laboratory, Los Alamos National Laboratory:
TAU: Tuning and Analysis Utilities,
Los Alamos National Laboratory Publication LALP-99-205, November 1999.
Summary: A short four-page summary of TAU. Produced as a flyer for
- S. Shende, J. Cuny, L. Hansen, J. Kundu, S. McLaughry and O. Wolf,
Event and State-Based Debugging in TAU: A Prototype,
Appears in: Proceedings of SPDT'96: ACM SIGMETRICS Symposium on Parallel and
Distributed Tools, pp. 21-30, May 1996.
Summary: describes a multilevel debugging strategy that combines
both event- and state-based debugging approaches within the TAU program
analysis environment for pC++.
- K. Windisch, B. Mohr, A. Malony,
A Brief Technical Overview of the TAU Tools. Unpublished.
Summary: A very brief look at the design of the TAU environment.
- B. Mohr, A. Malony, J. Cuny,
TAU. In G. Wilson, editor, Parallel Programming using C++,
M.I.T. Press, 1996.
Summary: gives the most complete description of the TAU environment.
- D. Brown, A. Malony, B. Mohr,
Language-based Parallel Program
Interaction: the Breezy Approach,
Proceedings of the International Conference
on High Performance Computing (HiPC'95), India, December 1995.
Summary: describes the design and architecture of
the breezy tool.
- K. Shanmugam, A. Malony, B. Mohr,
Speedy: An Integrated Performance Extrapolation Tool for pC++,
Proceedings of the Joint Conference PERFORMANCE TOOLS'95 and
MMB'95, 20th-22nd September, 1995, Heidelberg, Germany.
Summary: describes a new TAU tool, speedy, a graphical
interface to the pC++ performance extrapolation tool ExtraP.
Speedy/ExtraP allow the performance of pC++ programs to be
analyzed without actually running them on a parallel computer.
- A. Malony, B. Mohr, P. Beckman, D. Gannon,
Program Analysis and Tuning Tools for a Parallel Object Oriented
Language: An Experiment with the TAU System,
Proceedings of the Workshop on Parallel Scientific Computing,
Cape Cod, Maine, October 1994.
Summary: The use of the TAU tools is illustrated
from the perspective of the design and evaluation of a single
application in pC++: a bitonic sort module that is used as part
of a large N-Body simulation of cosmological evolution.
- S. Hackstadt, A. Malony, B. Mohr,
Scalable Performance Visualization for Data-Parallel Programs,
Proceedings of the Scalable High Performance Computing Conference (SHPCC),
Knoxville, Tennessee, May 1994.
Summary: presents several performance visualization techniques
based on the context of data-parallel programming and execution that
demonstrate good visual scalability properties.
- D. Brown, S. Hackstadt, A. Malony, B. Mohr,
Program Analysis Environments for Parallel Language Systems:
The TAU Environment,
Proceedings of the 2nd Workshop on Environments and Tools For
Parallel Scientific Computing, Townsend, Tennessee, pp. 162-171, May 1994.
Summary: A companion paper to the CONPAR94 article. It gives
an overview of the TAU program analysis tools and then discusses
the barrier breakpoint debugger, breezy.
- B. Mohr, D. Brown, A. Malony,
TAU: A Portable Parallel Program Analysis Environment for pC++,
Proceedings of CONPAR 94 - VAPP VI, University of Linz,
Austria, LNCS 854, pp. 29-40, September 1994.
Summary: describes the TAU program analysis tools:
fancy (file and class browser), cagey (callgraph browser),
classy (class hierarchy browser), racy (profile data browser),
and easy (event and state viewer).
- A. Malony, B. Mohr, P. Beckman, D. Gannon, S. Yang, F. Bodin,
Performance Analysis of pC++: A Portable Data-Parallel
Programming System for Scalable Parallel Computers, Proceedings of
the 8th International Parallel Processing Symposium (IPPS),
Cancún, Mexico, pp. 75-85, April 1994.
Summary: describes profiling and tracing capabilities of
pC++ and gives detailed results of speedup measurements for four
- F. Bodin, P. Beckman, D. Gannon, S. Yang, S. Kesavan,
A. Malony, B. Mohr, Implementing a Parallel C++ Runtime System
for Scalable Parallel Systems, Proceedings of the 1993 Supercomputing
Conference, Portland, Oregon, pp. 588-597, November 1993.
Summary: gives an overview of pC++, a parallel version of
C++, and the implementation of its runtime system on a variety of
distributed and shared memory machines.
- B. Mohr,
Standardization of Event Traces Considered Harmful or Is an
Implementation of Object-Independent Event Trace Monitoring and
Analysis Systems Possible?,
Proceedings of the CNRS-NSF Workshop on
Environments and Tools For Parallel Scientific Computing, St. Hilaire
du Touvet, France, Elsevier, Advances in Parallel
Computing, Vol. 6, pp. 103-124, September 1992.
Summary: describes how to write event trace analysis tools
in a way that they are able to read and analyze traces of arbitrary