TAU - Tuning and Analysis Utilities -

3.3. Selectively Profiling an Application

3.3.1. Reducing Performance Overhead with TAU_THROTTLE

TAU allows users to select which functions to profile within a single application. One way to selectively instrument an application is to specify rules that govern which functions are profiled.

TAU automatically throttles short-running functions to reduce the overhead of profiling them. This feature can be turned off by setting the environment variable TAU_THROTTLE to 0. The default rule TAU uses to determine which functions to throttle is: numcalls > 100000 && usecs/call < 10. This means that if a function executes more than 100000 times and has an inclusive time per call of less than 10 microseconds, profiling of that function is disabled once that threshold is reached. To change the values of numcalls and usecs/call, the user may optionally set the following environment variables:

% setenv TAU_THROTTLE_NUMCALLS 2000000
% setenv TAU_THROTTLE_PERCALL  5
  

These settings change the thresholds to 2 million calls and 5 microseconds per call.
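The throttle rule can be read as a simple predicate. The Python sketch below is illustrative only and is not TAU's implementation: the function name should_throttle and its parameter names are hypothetical, and it assumes that usecs/call is the inclusive time divided by the number of calls.

```python
def should_throttle(numcalls, inclusive_usecs,
                    max_calls=100000, min_usecs_per_call=10):
    """Return True if a routine meets the throttle rule described above.

    max_calls and min_usecs_per_call are hypothetical names standing in for
    TAU_THROTTLE_NUMCALLS and TAU_THROTTLE_PERCALL.
    """
    if numcalls == 0:
        # A routine that was never called cannot be throttled.
        return False
    usecs_per_call = inclusive_usecs / numcalls
    return numcalls > max_calls and usecs_per_call < min_usecs_per_call


# 200000 calls at 2 microseconds each meets the default rule.
print(should_throttle(200000, 400000))
# The same routine is kept with thresholds of 2 million calls and 5 microseconds.
print(should_throttle(200000, 400000, max_calls=2000000, min_usecs_per_call=5))
```

Both conditions must hold together, so a routine that is called often but is expensive per call continues to be profiled.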

For more control over selective instrumentation, use the tau_reduce tool. See tau_reduce or Section 16.3, “Selective Instrumentation File Generator”.

Note

Because MPI events are not instrumented at the source level, TAU_THROTTLE is the only way to reduce the overhead of frequently run MPI events such as MPI_TEST. In short, we recommend using TAU_THROTTLE to reduce overhead every time you profile your application.

3.3.2. Custom Profiling

TAU allows you to customize the instrumentation of a program by using a selective instrumentation file. This file is used to manually control which parts of the application are profiled and how they are profiled. If you are using tau_compiler.sh to instrument your application, you can use the -optTauSelectFile=<file> option to enable selective instrumentation.

Selective Instrumentation File Specification. The selective instrumentation file has the following sections, each preceded and followed by a pair of tags:

BEGIN_EXCLUDE_LIST / END_EXCLUDE_LIST or BEGIN_INCLUDE_LIST / END_INCLUDE_LIST

Exclude/include list of routines and/or files for instrumentation. Routines to be excluded from instrumentation are listed one per line between BEGIN_EXCLUDE_LIST and END_EXCLUDE_LIST. Instead of specifying which routines to exclude, the user can list the routines that are to be instrumented, one routine name per line, between BEGIN_INCLUDE_LIST and END_INCLUDE_LIST.

BEGIN_FILE_EXCLUDE_LIST / END_FILE_EXCLUDE_LIST or BEGIN_FILE_INCLUDE_LIST / END_FILE_INCLUDE_LIST

Similarly, files can be included or excluded with the BEGIN_FILE_EXCLUDE_LIST, END_FILE_EXCLUDE_LIST, BEGIN_FILE_INCLUDE_LIST, and END_FILE_INCLUDE_LIST lines.

BEGIN_INSTRUMENT_SECTION / END_INSTRUMENT_SECTION

Manually editing the selective instrumentation file gives you more options. These tags allow you to control the type of instrumentation performed in certain portions of your application.

  • Static and Dynamic timers can be set by specifying either a range of line numbers or a routine.

    static timer name="foo_bar" file="foo.c" line=17 to line=18
    dynamic timer routine="int foo1(int)"
                  
  • Static and Dynamic phases can be set by specifying either a range of line numbers or a routine. If you do not configure TAU with -PROFILEPHASE, these phases will be converted to regular timers.

    static phase routine="int foo(int)"
    dynamic phase name="foo1_bar" file="foo.c" line=26 to line=27
                  
  • Loops in the source code can be profiled by specifying a routine in which all loops should be profiled, like:

    loops file="loop_test.cpp" routine="multiply"
                  
  • With Memory Profiling the following events are tracked: memory allocation, memory deallocation, and memory leaks.

    memory file="foo.f90" routine="INIT"
                  
  • IO events track the size, in bytes, of read, write, and print statements.

    io file="foo.f90" routine="RINB"
                  

Both Memory and IO events are represented along with their call-stack, the length of which can be set with the environment variable TAU_CALLPATH_DEPTH.

Note

Due to the limitations of some compilers (IBM xlf, PGI pgf90, GNU gfortran), the size of the memory reported for a Fortran array is not the number of bytes but the number of elements.

The wildcards #, *, and ? can be used when specifying a file or routine. For file names, the * character matches any number of characters, so foo* matches foobar, foo2, etc. Also for file names, ? matches a single character, i.e., foo? matches foo2 and fooZ, but not foobar. For routines, use # as the wildcard, i.e., b# matches bar, b2z, etc. Example:

#Tell tau to not profile these functions
BEGIN_EXCLUDE_LIST

void quicksort(int *, int, int)
# The next line excludes all functions beginning with "sort_" and having arguments
# "int *"
void sort_#(int *)
void interchange(int *, int *)

END_EXCLUDE_LIST

#Exclude these files from profiling
BEGIN_FILE_EXCLUDE_LIST

*.so

END_FILE_EXCLUDE_LIST
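The wildcard rules used in the example above can be sketched in Python. This is an illustration of the matching described in the text, not TAU's own matcher: the helper names file_matches and routine_matches are hypothetical, and the sketch assumes that * and ? act as wildcards only in file names, that # acts as a wildcard only in routine names and may match zero or more characters, and that a pattern must match the whole name.

```python
import re


def _compile(pattern, wildcards):
    """Build a regex from pattern, treating only the given characters as wildcards."""
    parts = []
    for ch in pattern:
        # Every other character, including '*' in routine signatures, is literal.
        parts.append(wildcards.get(ch, re.escape(ch)))
    return re.compile("".join(parts))


def file_matches(pattern, name):
    """File names: '*' matches any run of characters, '?' exactly one character."""
    return _compile(pattern, {"*": ".*", "?": "."}).fullmatch(name) is not None


def routine_matches(pattern, name):
    """Routine names: '#' is the wildcard (assumed to match zero or more characters)."""
    return _compile(pattern, {"#": ".*"}).fullmatch(name) is not None


print(file_matches("foo?", "foo2"), file_matches("foo?", "foobar"))
print(routine_matches("void sort_#(int *)", "void sort_up(int *)"))
```

Treating * as a literal character in routine patterns keeps pointer arguments such as int * unambiguous, which is one plausible reason routines use # as their wildcard.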

Within the BEGIN_INSTRUMENT_SECTION / END_INSTRUMENT_SECTION tags you can also insert code fragments into the source code by specifying the file and line number, for example:

file = "line_test.cpp" line = 9 code = "printf(\"i=%d: \", i);"
			

You may want to add code at the entry and exit of a particular routine, for example:

exit routine ="int foo()" code = "cout <<\"exiting foo\"<<endl;"
entry routine ="int foo()" code = "cout <<\"entering foo\"<<endl;"
			

You can also insert code at initialization (at the beginning of main for C/C++ and at the start of the program for Fortran). You can also insert code before a function's first statement is executed. For example:

init code="int i = 0;" lang="C"
decl file="bar.C" routine="foo" code="int j;" lang="C++"
			

Furthermore you can use the following substitutions:

  • @ARGV@ ==> list of arguments to main (only in init construct).

  • @ARGC@ ==> number of arguments to main plus one for the program name (only in init construct).

  • @FILE@ ==> Name of source file

  • @LINE@ ==> Insertion line

  • @COL@ ==> Insertion column

  • Additional substitutions for entry/exit:

  • @ROUTINE@ ==> Name of function/routine

  • @BEGIN_LINE@ ==> routine.headBegin().line()

  • @BEGIN_COL@ ==> routine.headBegin().col()

  • @END_LINE@ ==> routine.bodyEnd().line()

  • @END_COL@ ==> routine.bodyEnd().col()
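To make the substitution idea concrete, the Python sketch below expands @NAME@ tokens in a code fragment. It only illustrates the textual replacement described above and is not how TAU performs it: the function name expand, the values dictionary, and the choice to leave unknown tokens untouched are all assumptions.

```python
import re


def expand(code, values):
    """Replace each @NAME@ token in code with values[NAME], in a single pass.

    Tokens with no entry in values are left unchanged (an assumption of this sketch).
    """
    def lookup(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return re.sub(r"@([A-Z_]+)@", lookup, code)


fragment = 'cout << "entering @ROUTINE@ at @FILE@:@BEGIN_LINE@" << endl;'
print(expand(fragment, {"ROUTINE": "foo", "FILE": "bar.C", "BEGIN_LINE": 12}))
```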