Tuning and Analysis Utilities

These are brief notes about each release. The features in each release are described in detail in the tau [dash] announcements [at] nic.uoregon.edu mailing list archives.

TAU v2.33.1

14 Feb 2024.

TAU v2.33

2 Nov 2023.

TAU v2.32.1

30 Aug 2023.

TAU v2.32

7 Dec. 2022.

TAU v2.31.1

31 May 2022.

TAU v2.31

11 November 2021.

TAU v2.30.2

31 August 2021.

TAU v2.30.1

12 January 2021.

TAU v2.30

17 November 2020.

TAU v2.29.1

21 July 2020.

TAU v2.29

11 November 2019.

TAU v2.28.2

19 July 2019.

TAU v2.28.1

28 Apr 2019.

TAU v2.28

10 Nov 2018.

TAU v2.27

13 Nov 2017. See announcement.

TAU v2.26.3

25 Aug 2017

TAU v2.26.2

10 June 2017

TAU v2.23

15 Nov 2013

TAU v2.22.2

27 May 2013

TAU v2.22.1

8 February 2013

TAU v2.22

9 November 2012

TAU v2.21.4

18 September 2012

TAU v2.21.3

12 July 2012

TAU v2.21.2

26 Mar 2012

TAU v2.21

10 Nov 2011

See announcement.

TAU v2.20.3

18 Aug 2011

See announcement.

TAU v2.20.2

13 May 2011

See announcement.

TAU v2.20.1

22 Mar 2011

See announcement.

TAU v2.20

11 Nov 2010

See announcement.

TAU v2.19.2

9 July 2010

See announcement.

TAU v2.19.1

3 Mar 2010

See announcement.

TAU v2.19

16 Nov 2009

See announcement.

TAU v2.18.3

18 Sep 2009

See announcement.

TAU v2.18.2

15 May 2009

See announcement.

TAU v2.18.1

22 Jan 2009

TAU can now interface with PGI's runtime library and extract performance information associated with kernels that execute on GPGPUs. TAU tracks the interactions with the GPGPU as seen from the host and generates the performance data. This data includes the name of the routine, file, and line number, as well as block and grid sizes and individual variable names. This feature works with PGI 8.0.3+ compilers that support the #acc region/end region directives. These source annotations may be placed around loops to automatically generate GPGPU code that executes on CUDA-enabled NVIDIA cards. Users do not need to write any GPGPU-specific code explicitly. Instead, they use a compiler flag (-ta=nvidia) to generate this code using a special add-on package with the PGI compiler.

This release improves support for Charm++ and NAMD. We have a wiki page that describes how to build and use TAU with NAMD.

TAU v2.18

16 Nov 2008

We add support for the PGI and IBM compilers for compiler-based instrumentation. Now, you may set:

% setenv TAU_OPTIONS '-optCompInst -optVerbose....' (see tau_cc.sh, tau_f90.sh, tau_cxx.sh)
% setenv TAU_MAKEFILE taudir/arch/lib/Makefile.tau-[options]
% tau_f90.sh app.f90; tau_f90.sh app.o -o app

to enable this feature. We have tested this on Cray XT3/4/5 systems with PGI compilers, x86_64 Linux systems and IBM pSeries Linux, BG/P, AIX Power5 and 6 systems. With this new feature, we have completed support for GNU, IBM, PGI, Intel, and Pathscale compilers. The above -optCompInst flag will work uniformly across all platforms and languages (Fortran/C/C++).

This feature works at the routine level and may be used to replace PDT for inserting instrumentation. PDT is still relevant for more detailed instrumentation at the fine-grained loop, memory allocation, and I/O tracking levels. We have updated the GNU compiler instrumentation module in TAU to support instrumentation of routines that reside in shared objects that are loaded at runtime. TAU can now exclude files from compiler-based instrumentation by specifying these in the exclude list in TAU's selective instrumentation file (specified using -optTauSelectFile=file.tau).

The GNU compiler support for shared objects requires a BFD package installed with -fPIC (position independent code). When the default package (in /usr/lib) is not compiled this way, you may either specify -DISABLESHARED while configuring TAU or use -bfd=download, which will download binutils-2.18, compile it with -fPIC, and use it to create libTAU.so. This does not affect the use of TAU for static linking (used by default).

TAU v2.17.3

30 Sep 2008

TAU features compiler based instrumentation for Intel, GNU and PathScale compilers, a new python API for memory tracking, fixes for IBM BG/P configuration, and support for CQoS analysis and drawing charts from script files in PerfExplorer.

TAU v2.17.2

12 Aug 2008

TAU features a generic source code instrumentor in tau_instrumentor, paraprof enhancements including creation of a selective instrumentation file, and support for other file formats, using default values for TAU_THROTTLE (1), COUNTER1, storing weka files in ~/.ParaProf directory, GNU PDT parser, context events in POSIX I/O interposition library and a new lightweight TAU_PROFILER API.

TAU mentioned in HPC wire article

21 March 2008

The TAU performance system® was one of several performance evaluation tools mentioned in an HPC wire article about the Petascale Productivity from Open, Integrated Tools (POINT) project. Quote:

"The POINT project will improve and support a parallel performance environment that integrates the widely-used TAU, PAPI, KOJAK, and PerfSuite technologies as core components. Each tool will be enhanced to better support user needs and evolving scalable HPC technology, and to interoperate as part of a performance engineering system to be used routinely in the performance evaluation and optimization of domain science and engineering (S&E) applications running on HPC systems of extreme scale."

More information about the POINT project can be found at their website.

TAU v2.17.1

21 March 2008

TAU v2.17.1 has these new features:

Tracking MPI-I/O, Perfexplorer 2 with atomic events, jython interface, refactoring TAU and support for TAU_PROFILE_FORMAT environment variable, PAPI-C non-cpu native events, Eclipse/PTP plugin update, Scalasca 1.x support, GCC 4.3.x, IBM BG/P -BGPTIMERS and metadata, and updates for Apple OS X.

TAU v2.17 and PDT 3.12 released

9 November 2007

TAU v2.17 has these new features:

tau_wrap, a wrapper generator for external libraries, port to IBM BG/P (-arch=bgp), SiCortex, Cray CNL, and Windows Cluster 2003 (including MPI support). Improvements to the Eclipse plugin, paraprof, and perfexplorer. Added a new Posix I/O wrapper (-iowrapper) for tracking the volume and bandwidth of I/O. Added support for atomic and context events in the OTF traces generated by VampirTrace using TAU.

TAU v2.16.6 released

21 Sep 2007

TAU v2.16.6 has these new features:

static/dynamic phase/timer instrumentation constructs are now supported in the TAU instrumentation specification file, Cray XT4 compute node linux (-arch=craycnl), Eclipse/PTP plugin for external performance tools, tauex updates for MPI shared object loading, signal handlers for dumping performance data (SIGUSR1) and toggling instrumentation (SIGUSR2), support for OMPP profiles in paraprof, support for Intel 10.x Fortran/C/C++, NAGWare Fortran and g95 Fortran compilers.

TAU v2.16.5 released

31 May 2007

TAU v2.16.5 has these new features:

profile snapshots, I/O tracking in Fortran, configuration and support of multiple PerfDMF databases within ParaProf, support for Lahey 64 bit compiler under Linux, SiCortex 64 and 32 bit architectures, and support for gfortran based parser in PDT 3.11.1 for Mips Linux architecture.

TAU v2.16.4 released

1 May 2007

TAU v2.16.4 has these new features:

Clock synchronization in trace files, metadata fields in ppk files, perfexplorer custom charts with XML metadata fields, TAU portal scripts to upload data, support for persistent communication events in traces, KTAU OS level shared counter coupling, Eclipse/PTP updates for accessing TAU options and build configurations.

TAU v2.16.3 and PDT 3.11 released

27 March 2007

TAU v2.16.3 has these new features:

Eclipse PTP plugin update, memory leak detection enhancements, high level API, Python instrumentation, Paraprof's support for cube3 profiles, perfexplorer comparative displays and Jython interpreter support, PAPI enhancements (papithread, papi domains under x86 linux), tauex, and pure java implementation of tau2slog2.

TAU v2.16.2 and PDT 3.10 released

1 March 2007

TAU v2.16.2 has these new features:

  1. Memory leak detection for Fortran. Loads, stores and leaks can be detected automatically using source-level instrumentation. tau_instrumentor accepts a new keyword "memory [file=] routine=" in the instrument section (BEGIN_INSTRUMENT_SECTION/END_INSTRUMENT_SECTION) of the selective instrumentation file. See examples/memoryleakdetect/f90.
  2. Enhancements to Eclipse PTP plugin. Scrolling is supported for options in the TAU analysis tab.
  3. Enhancements and bug fixes for PerfExplorer.

TAU v2.16.1 and PDT 3.10 released

13 February 2007

TAU v2.15.5 released

30 June 2006

TAU v2.15.5 has these new features:

TAU portal at https://tau.nic.uoregon.edu, automatic memory leak detection for C/C++(malloc/free), Perfexplorer enhancements (normal probability plots, event data, distribution info of events), tau2otf supports compressed and multi-threaded OTF traces, tau_instrumentor, ParaProf and pprof enhancements.

TAU v2.15.4 released

8 June 2006

TAU v2.15.4 has these new features:

tau_poe tool for instrumenting AIX binaries at runtime, improvements in tau_instrumentor to support gotos in loops, support for tracking memory allocations and deallocations and associating these with the program callstack using TAU's malloc/free wrapper, improvements in tau_ompcheck tool, Derby support in PerfDMF, and enhancements to ParaProf and PerfExplorer.

TAU v2.15.3 released

27 April 2006

TAU v2.15.3 has these new features:

support for automatic outer loop level instrumentation in Fortran, support for PDT's gfortran parser, tau_ompcheck for correcting OpenMP directives in Fortran, Derby and DB2 support in PerfDMF, enhancements to Paraprof for phase based profiling, automatic instrumentation of pthread programs, Java trace writer API library, Cray XT3 extensions, and an upgradetau utility for installing TAU.

TAU v2.15.2 released

21 February 2006

TAU v2.15.2 has these new features:

support for automatic outer loop level instrumentation in C and C++ using PDT, Eclipse PTP environment, python 2.4 instrumentation, Jython support in Paraprof, port to FreeBSD and updates to tau_instrumentor.

OTF for IBM BG/L released

30 December 2005

tau2otf: Added a new utility to convert TAU trace files to the Open Trace Format (OTF). OTF trace files can be read by Vampir v5.0 and Vampir NG (VNG). These tools are available from TU Dresden.

TAU v2.15.1 released

22 December 2005

TAU v2.15.1 has these new features:

phaseconvert: Added a new utility to convert callpath profiles to phase based profiles given a set of phases. This supports not only TAU profiles, but also cube profiles and any other callpath profile that perfdmf supports.
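As a rough illustration of the idea behind converting callpath profiles to phase-based profiles, the sketch below attributes each callpath's time to the nearest enclosing phase. It is not phaseconvert's actual algorithm or file format: the " => " separator, the use of exclusive time, the function name to_phase_profile and the sample routine names are all assumptions made for this example.

```python
from collections import defaultdict


def to_phase_profile(callpath_times, phases, separator=" => "):
    """Group callpath times by (enclosing phase, leaf routine).

    callpath_times maps a callpath string to a time value (hypothetical
    input layout, not a TAU file format). phases is the set of routine
    names that the user has declared to be phases.
    """
    result = defaultdict(float)
    for callpath, exclusive in callpath_times.items():
        routines = callpath.split(separator)
        leaf, ancestors = routines[-1], routines[:-1]
        # Walk up from the leaf and pick the closest ancestor that is a phase.
        enclosing = next((r for r in reversed(ancestors) if r in phases), None)
        result[(enclosing, leaf)] += exclusive
    return dict(result)


# Hypothetical callpath data for illustration only.
callpaths = {
    "main => init => read_input": 2.0,
    "main => solve => exchange": 3.0,
    "main => solve => compute => exchange": 1.5,
    "main => finalize": 0.5,
}
phase_profile = to_phase_profile(callpaths, phases={"init", "solve"})
```

Here both callpaths that end in exchange fall under the solve phase and are summed, while finalize has no enclosing phase and is keyed with None.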

tau2profile: Added a new utility to convert TAU trace files to profiles. Traces contain timestamped events while profiles contain aggregate summaries of performance metrics. This utility supports PAPI counter data as well, so TAU trace files with multiple-counter data are mapped to profiles with multiple metrics. It supports generation of profile series and interval profiles as well.
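The difference between a trace and a profile can be sketched in a few lines. The example below is only an illustration of the aggregation step, assuming a simplified in-memory event list of (timestamp, kind, routine) tuples; it does not read TAU's trace format, and the function name trace_to_profile is hypothetical.

```python
from collections import defaultdict


def trace_to_profile(events):
    """Aggregate timestamped enter/exit events into per-routine totals.

    events is a time-ordered, properly nested list of
    (timestamp, kind, routine) tuples, where kind is "enter" or "exit".
    Recursive calls would have their inclusive time counted once per
    level in this simplified version.
    """
    profile = defaultdict(lambda: {"calls": 0, "inclusive": 0.0, "exclusive": 0.0})
    stack = []  # each entry: [routine, enter_time, time_spent_in_children]
    for timestamp, kind, routine in events:
        if kind == "enter":
            stack.append([routine, timestamp, 0.0])
        elif kind == "exit":
            name, start, child_time = stack.pop()
            if name != routine:
                raise ValueError(f"mismatched exit for {routine!r}")
            inclusive = timestamp - start
            entry = profile[name]
            entry["calls"] += 1
            entry["inclusive"] += inclusive
            entry["exclusive"] += inclusive - child_time
            if stack:
                # Charge this call's time to the parent as child time.
                stack[-1][2] += inclusive
        else:
            raise ValueError(f"unknown event kind {kind!r}")
    return dict(profile)


# Hypothetical trace for illustration only.
trace = [
    (0.0, "enter", "main"),
    (1.0, "enter", "solve"),
    (4.0, "exit", "solve"),
    (5.0, "enter", "solve"),
    (7.0, "exit", "solve"),
    (10.0, "exit", "main"),
]
profile = trace_to_profile(trace)
```

The six timestamped events collapse into two summary rows: main with one call, and solve with two calls whose times are added together.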

Enhancements to Paraprof

Better support for Intel compilers for linking C and Fortran codes.

TAU v2.15 released

17 November 2005

We've added new paraprof phase and comparative displays, and support for Eclipse CDT and FDT in TAU. TAU now supports the Open Trace Format (OTF). Updates to the PerfExplorer Performance Data Mining tool have been made. Event profiling can now be throttled during runtime. Added support for ORC Open64 compiler and nested OpenMP calls. Traces are now multi-platform and can be generated on one platform and merged/converted on another. Added support for Cray XT3 (-arch=xt3, see wiki), and SHMEM wrappers. Added support for Solaris on x86_64 Opteron. Updated support for PAPI on IBM BGL and Cray XT3.

TAU v2.14.7 released

11 August 2005

We've added new tools for performance data mining and knowledge discovery [PerfExplorer], command line invocation of TAU, TAU Eclipse Java plugin, and updated our documentation.

TAU v2.14.6, PDT v3.4 and VTF3 v1.34 released

30 June 2005

We've added support for large trace files (> 2GB), GPSHMEM, and now we distribute JumpShot4 and SLOG2 SDK as part of TAU. TAU_COMPILER and tau_instrumentor are enhanced to better support automatic instrumentation of Fortran 90/95 codes using PDT v3.4.

TAU v2.14.5 released

8 June 2005

We've added support for importing CUBE (Kojak) profiles in paraprof. TAU has a new -MPITRACE option that produces trace files with events that are ancestors of MPI calls. These traces can be converted to the Epilog format (from Kojak) for use with the expert tool. The TAU_COMPILER instrumentation tool has been updated to support OpenMP instrumentation with Kojak's Opari instrumentor. Paraprof has a new thread statistics table window with support for expanding a callgraph by clicking on a node. You can sort on a particular column by clicking on its heading.

TAU v2.14.4 released

18 May 2005

We've added support for memory headroom calculation. Paraprof has a packed profile data format, reverse callpath views, and search capabilities. TAU has a new context user-defined event, where application-specific events can be mapped to the program's callstack. TAU traces can now be converted to the Epilog trace format using the tau2elg tool.

TAU v2.14.3 released

20 Apr 2005

We've added support for 3D profile displays in Paraprof. TAU now supports the JumpShot4 trace visualizer with the SLOG2 trace converter.

TAU v2.14.1 released

20 Jan 2005

We've added support for phase based profiling, dynamic timers, a tool to convert vtf3 trace files to TAU profiles, and several enhancements to Paraprof. Paraprof now has an option to show the complete callgraph (click-able to identify the callpath, with zoom in/out capabilities, options to select node colors and sizes). Paraprof has a new scalable histogram display which shows the number of threads of a routine in each bin (between max and min values, with the ability to change the number of bins). TAU features better support for multi-threaded executions, and support for PathScale compilers (C, C++, Fortran 95) for the Opteron Linux platform. PDT v3.3.1 is also released with support for PathScale compilers.
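The binning behind such a histogram display can be sketched as follows. This is only an illustration of equal-width binning between the minimum and maximum values, not Paraprof's code; the function name thread_histogram and the sample values are assumptions.

```python
def thread_histogram(values, num_bins=10):
    """Count how many threads fall into each equal-width bin.

    values holds one measurement per thread (for example, the time one
    routine took on that thread). Bins span [min(values), max(values)].
    """
    lo, hi = min(values), max(values)
    counts = [0] * num_bins
    if hi == lo:
        # All threads report the same value: place them in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / num_bins
    for value in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((value - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical per-thread times for one routine.
per_thread_times = [1.0, 2.0, 2.5, 9.0, 10.0]
histogram = thread_histogram(per_thread_times, num_bins=3)
```

Changing num_bins re-buckets the same per-thread values, which is what makes the display usable for both small and very large thread counts.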

TAU v2.14 released

Nov 2004

TAU now supports Oracle, PostgreSQL and MySQL databases in PerfDMF.

TAU v2.13.7 released

Aug 2004

TAU now supports generation of binary VTF3 traces using VTF3 Trace Library from TU Dresden.

TAU v2.12.9 released

July 2003

TAU v2.12.9 introduces the new paraprof profile browser [ Europar03 ], DyninstAPI 4.0 support for rewriting binary images, file level selective instrumentation support, gprof style parallel callpath views for callpath profiles in paraprof, user specified depth in callpath profiles, Python API improvements, Opari updates for OpenMP instrumentation and EPILOG trace file format support from the KOJAK (FZJ) project.

TAU v2.12.5 released

March 2003

TAU v2.12.5 supports Python bindings and automatic instrumentation of Python code.

Call Path profiling

Aug 2002

TAU supports call path profiling. This allows a user to explore the time spent along a specific call path. Currently, the latest release (TAU v2.11.17) supports a two-level call path. See Call Path Profiling for further details. TAU also supports PETSc in this release.
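A two-level call path can be pictured as a "parent => child" key under which time is accumulated. The sketch below is a simplified illustration under that assumption; it works on a hypothetical in-memory list of enter/exit events rather than TAU's measurement layer, and the function name callpath_profile is invented for the example.

```python
from collections import defaultdict


def callpath_profile(events, depth=2):
    """Accumulate inclusive time per call path of at most `depth` routines.

    events is a time-ordered, properly nested list of
    (timestamp, kind, routine) tuples with kind "enter" or "exit".
    """
    totals = defaultdict(float)
    stack = []  # (routine, enter_time) for every routine currently active
    for timestamp, kind, routine in events:
        if kind == "enter":
            stack.append((routine, timestamp))
        else:
            name, start = stack.pop()
            # Keep only the closest (depth - 1) callers as the path prefix.
            callers = [r for r, _ in stack]
            parents = callers[-(depth - 1):] if depth > 1 else []
            key = " => ".join(parents + [name])
            totals[key] += timestamp - start
    return dict(totals)


# Hypothetical events: io is called from main and from solve.
events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "io"),
    (2.0, "exit", "io"),
    (2.0, "enter", "solve"),
    (3.0, "enter", "io"),
    (6.0, "exit", "io"),
    (7.0, "exit", "solve"),
    (8.0, "exit", "main"),
]
totals = callpath_profile(events, depth=2)
```

A flat profile would report a single total for io; the two-level view separates the time spent in io when called from main from the time spent when called from solve.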

New tool: tau_reduce

July 2002

Frequently executing light-weight routines may distort the performance data by introducing unnecessary overhead. To weed out these routines, a new tool tau_reduce has been introduced in TAU. It reads the profile output and a rules file that specifies when a routine should not be instrumented, and produces a selective instrumentation file that lists routines that should be excluded from instrumentation. This information can be fed to tau_instrumentor based on PDT or tau_run based on DyninstAPI to reduce the instrumentation overhead for subsequent runs. See examples/reduce and utils/TAU_REDUCE.README for more information.
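The selection step can be sketched as a rule applied to each routine in a profile. The code below is an illustration only: the thresholds, the in-memory profile layout and the function names are assumptions, and the BEGIN_EXCLUDE_LIST/END_EXCLUDE_LIST layout is assumed here rather than taken from these notes. See utils/TAU_REDUCE.README for the rule syntax and file format that tau_reduce actually uses.

```python
def routines_to_exclude(profile, max_calls=1_000_000, min_usec_per_call=2.0):
    """Return routines that are called very often but do little work per call.

    profile maps a routine name to {"calls": int, "exclusive_usec": float}
    (a hypothetical layout). Both thresholds are illustrative defaults.
    """
    excluded = []
    for name, stats in profile.items():
        calls = stats["calls"]
        if calls == 0:
            continue
        usec_per_call = stats["exclusive_usec"] / calls
        if calls > max_calls and usec_per_call < min_usec_per_call:
            excluded.append(name)
    return sorted(excluded)


def format_exclude_list(routines):
    """Render the routines as an exclude list (assumed file layout)."""
    lines = ["BEGIN_EXCLUDE_LIST", *routines, "END_EXCLUDE_LIST"]
    return "\n".join(lines) + "\n"


# Hypothetical profile data for illustration only.
sample_profile = {
    "tiny_getter": {"calls": 5_000_000, "exclusive_usec": 2_500_000.0},
    "solve": {"calls": 10, "exclusive_usec": 9_000_000.0},
    "rare_cheap": {"calls": 100, "exclusive_usec": 50.0},
}
exclude = routines_to_exclude(sample_profile)
exclude_file_text = format_exclude_list(exclude)
```

Only tiny_getter meets both conditions. rare_cheap is cheap per call but is called too rarely for its instrumentation overhead to matter, so it stays instrumented.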

Support for EPILOG and EXPERT

June 2002

TAU can generate EPILOG binary traces which can be analyzed using the EXPERT tool. See [ KOJAK ]. TAU also supports Hitachi SR8000, NEC SX and IA-64 Linux platforms. Under IA-64, Intel C/C++/F90 compilers are supported.

Runtime access to performance data

May 2002

TAU v2.11.14 also supports runtime access to performance data that allows an application to query its performance metrics. TAU also features selective dumping of profile data and incremental dumping of data at runtime. TAU supports integrated performance analysis in the Uintah software. See [ ISHPC'02 paper ].

Selective Instrumentation

April 2002

TAU supports selective instrumentation of source code (using PDT) and object code (using DyninstAPI). A selective instrumentation file can specify a list of routines that are to be instrumented or to be excluded from instrumentation.
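The include/exclude decision can be sketched as a small predicate. This is an illustration, not TAU's implementation: the function name should_instrument and the choice to let a non-empty include list take precedence over the exclude list are assumptions made for this example.

```python
def should_instrument(routine, include=None, exclude=None):
    """Decide whether a routine gets instrumented.

    include and exclude are optional collections of routine names.
    Assumption for this sketch: when an include list is given, only the
    routines on it are instrumented; otherwise everything is instrumented
    except the routines on the exclude list.
    """
    if include:
        return routine in include
    if exclude:
        return routine not in exclude
    return True


# Hypothetical routine names for illustration only.
all_routines = ["main", "solve", "tiny_getter"]
instrumented = [r for r in all_routines
                if should_instrument(r, exclude={"tiny_getter"})]
```

With only an exclude list, instrumented holds main and solve; passing include={"solve"} instead would narrow it to solve alone.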

Support for multiple counters

March 2002

TAU can now support profiling with more than one quantity (such as wall-clock time, hardware performance counters). Different options can be selected by setting COUNTER[1-25] environment variables to indicate the counters to be profiled. TAU also supports PAPI v2.1 in this release. See -MULTIPLECOUNTERS configuration option.
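Reading such a set of numbered environment variables can be sketched as below. This is an illustration only and not how TAU parses its configuration: the function name selected_counters is hypothetical, and the counter values in the sample environment are example settings, not values taken from these notes.

```python
def selected_counters(environ, max_counters=25):
    """Collect the COUNTER1..COUNTER<max_counters> settings that are present.

    environ is any mapping of variable names to values, such as os.environ.
    Returns a dict from counter index to the requested quantity.
    """
    counters = {}
    for index in range(1, max_counters + 1):
        value = environ.get(f"COUNTER{index}", "").strip()
        if value:
            counters[index] = value
    return counters


# Hypothetical environment; the counter values are illustrative.
sample_env = {
    "COUNTER1": "GET_TIME_OF_DAY",
    "COUNTER2": "PAPI_FP_INS",
    "PATH": "/usr/bin",
}
counters = selected_counters(sample_env)
```

Passing os.environ instead of sample_env would report whichever COUNTER variables are set in the current shell.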

Dynamic Grouping

Feb 2002

TAU supports dynamic creation of profile groups. This allows users to enable and disable groups at runtime, as well as associate groups with files during instrumentation using tau_instrumentor. Support for profile groups is demonstrated in SAMRAI (LLNL).

F90 Support

Jan 2002

TAU supports F90 instrumentation using PDT.

Access to x86 timers under Linux

Dec. 2001

TAU supports access to low-overhead timers under Linux using the -LINUXTIMERS configuration option.

jracy released in TAU v2.10

Nov. 2001

TAU has a new profile browser (jracy) implemented in Java. Sample images of jracy can be seen in EVH1 Profiles.

UPS

Oct. 2001

TAU works with UPS.

XPARE

Sept. 2001

XPARE (eXPeriment Alerting and REporting) is a system for performance experimentation that is integrated in a weekly testing harness for the Uintah / C-SAFE software development effort. With this system we can produce detailed weekly reports of Uintah / C-SAFE performance and alert code developers of performance problems as they arise.

TAU v 2.9.19 Released

Aug 2001

TAU v 2.9.19 features support for OpenMP directive rewriting (Opari) based instrumentation for OpenMP programs. See LACSI 2001 paper.

TAU v 2.9.12 Released

July 2001

TAU v 2.9.12 features support for several thread packages (SGI sproc, pthread, Java, Windows, OpenMP, Tulip, SMARTS) and for a runtime profile snapshot (TAU_DB_DUMP) facility in addition to extensions to its performance data mapping API. See the download section for instructions on downloading TAU.

TAU Documentation

June 2001

TAU JAVA Grande/ISCOPE'01 paper (mpiJava, multi-level instrumentation) and PDPTA'01 paper (use of DyninstAPI with MPI) [ All papers ].

TAU v 2.9 Released

Nov. 2000

TAU v2.9 features support for mixed model programming, support for PAPI, PCL for hardware performance counters and new ports (to IA-64). See the Download page for more information.

TAU supports Hybrid Execution Models

TAU supports MPI+pthread, MPI+OpenMP and MPI+Java hybrid execution models. For details see DAPSYS2000 and ICSJava papers.

TAU supports PAPI and OpenMP with MPI (OpenMPI)

TAU supports access to hardware performance counters using PAPI. For details see PAPI and OpenMPI announcements.

TAU v 2.8.11 Released

Oct. 2000

TAU v2.8x implements the performance mapping API that allows performance data to be correlated between different layers in multi-layered software. It features support for Fortran 90 and the MPI Profiling Interface. It supports access to hardware performance counters using PCL and PAPI on several platforms including Cray T3E, SGI, UltraSparc, IBM Power3, Intel Pentium+

Profiling User Events in PaRP

TAU now implements profiling of user-defined events. These can be used to track memory statistics or any application-specific statistics maintained on a per-thread basis. Click here for more information on its use in the PaRP project.

Vampir and Smarts

TAU can generate event traces for Vampir for SMARTS user-level threads. This can be a valuable tool in evaluating efficient thread scheduling policies in SMARTS. Click here for more information.

TAU integrated with Pooma II

TAU uses the EDG parser, IL converter and DUCTAPE to automatically insert TAU macros in the source code. TAU is now integrated with Pooma II. Click here for more information.

Pthread support

The TAU profiling package now supports pthreads using the -pthread configure option. Version 2.3, released on Aug. 10, 1998, also supports user-defined events. C programs can now be profiled with TAU using the same API as C++.

TAU IL Converter

The TAU IL converter and program database for analysis tools uses an EDG front end to parse a C++ program and converts the intermediate language to a format that can be used by TAU tools. For more information, see the documentation section.

TAU Tracing

The TAU Portable package can now generate traces that can be viewed using VAMPIR. For details, see the tutorial Tracing for VAMPIR.