Environment Variables

Table 1. TAU Environment Variables
VARIABLE NAME DESCRIPTION

TAU_PROFILE

Set to 1 to have TAU profile your code

TAU_TRACE

Set to 1 to have TAU trace your code

TAU_METRICS

Colon delimited list of TAU/PAPI metrics to profile

PAPI_EVENT

Sets the hardware counter to use when TAU is configured with -PAPI. See ???

PCL_EVENT

Sets the hardware counter to use when TAU is configured with -PCL. See ???

PROFILEDIR

Selectively measure groups of routines and statements. Use with -profile command line option. See ???

TAU_CALLPATH

When set to 1 TAU will generate call-path data. Use with TAU_CALLPATH_DEPTH.

TAU_CALLPATH_DEPTH

Sets the depth of the callpath profiling. Use with TAU_CALLPATH environment variable.

TAU_CALLSITE

When set to 1 TAU will provide call site information for events in profile and trace output. Configure TAU with -bfd=download and -useropt="-g" .

TAU_TRACK_MESSAGE

Track MPI message statistics (profiling), messages lines (tracing).

TAU_COMM_MATRIX

Generate MPI communication matrix data.

TAU_COMPENSATE

Attempt to compensate for profiling overhead in profiles.

TAU_COMPENSATE_ITERATIONS

Set the number of iterations TAU uses to estimate the measurement overhead. A larger number of iteration will increases profiling precision (default 1000).

TAU_KEEP_TRACEFILES

Retains the intermediate trace files. Use with -TRACE TAU configuration option. See ???

TAU_MUSE_PACKAGE

Sets the MAGNET/MUSE package name. Use with the -muse TAU configuration option. See ???

TAU_THROTTLE

Enables the runtime throttling of events that are lightweight. See ???

TAU_THROTTLE_NUMCALLS

Set the maximum number of calls that will be profiled for any function when TAU_THROTTLE is enabled. See ???

TAU_THROTTLE_PERCALL

Set the minimum inclusive time (in milliseconds) a function has to have to be instrumented when TAU_THROTTLE is enabled. See ???

TAU_TRACEFILE

Specifies the name of Vampir trace file. Use with -TRACE TAU configuration option. See ???

TRACEDIR

Specifies the directory where trace file are to be stored. See ???

TAU_SELECT_FILE

When set to the location of a valid selective instrumentation file TAU will include/exclude the specified source at runtime.

TAU_COMPILER_SELECT_FILE

When set to the location of a valid selective instrumentation file the TAU LLVM plugin will include/exclude the specified source.

TAU_COMPILER_MIN_INSTRUCTION_COUNT

Excludes functions from instrumentation if their instruction count is below the set value, Defaults to 50. Set to 1 to include all functions.

TAU_VERBOSE

When set TAU will print out information about the its configuration when running a instrumented application.

TAU_PROFILE_FORMAT

When set to snapshot TAU will generate condensed snapshot profiles (they merge together different metrics so there is only one file per node.) Instead of the default kind. When set to merged, TAU will pre-compute mean and std. dev. at the end of execution.

TAU_TRACK_MEMORY_FOOTPRINT

When set TAU will track resident set size (VmRSS) and peak memory usage (VmHWM) or the high water mark of resident set size, the same values provided by the 'top' command.

TAU_TRACK_POWER

Enables tracking of power consumption via periodic interrupt.

TAU_SYNCHRONIZE_CLOCKS

When set TAU will correct for any time discrepancies between nodes because of their CPU clock lag. This should produce more reliable trace data.

TAU_SAMPLING

Default value is 0 (off). When TAU_SAMPLING is set, we collect additional profile or trace information (depending on whether TAU_PROFILE or TAU_TRACE is set respectively) via periodic sampling at runtime. Metrics collected and sampling period is controlled by TAU_EBS_SOURCE and TAU_EBS_PERIOD variables respectively. The TAU_EBS_UNWIND variable determines if callstack unwinding is enabled at each sample.

For TAU_PROFILE, in addition to regular TAU instrumented profile output, samples will show up as additional events prefixed by [SAMPLE] for each unique function, file and source line number combination. These events are grouped under [INTERMEDIATE] event nodes for the instrumented TAU context where the samples occured. In addition, if TAU_EBS_UNWIND is active, [UNWIND] event nodes may be generated for each discovered callstack entry found by the callstack unwinder.

TAU_SAMPLING is dependent on the availability of BFD as determined by the -bfd configuration option when building TAU. Its ability to resolve sample addresses into function, file name and source line number information may be limited or missing if BFD is missing or is installed with limited functionality. If in doubt, please try building TAU with "-bfd=download". Any one of function, file name and source line number may be missing. In the event all three are, the event is marked as "UNRESOLVED". The TAU_EBS_KEEP_UNRESOLVED_ADDR variable enables addresses to be retained for unresolved results.

TAU_EBS_SOURCE

Default value is "itimer". This variable sets the metric that determines the period of sampling. If the value is "itimer" (default), it represents the number of microseconds between samples (as determined by TAU_EBS_PERIOD). If the value is a PAPI metric (eg. PAPI_FP_INS), then it represents the number of counts of that metric between samples (eg. every 10,000 floating-point instructions if PAPI_FP_INS is used). For "itimer", the samples occur as a result of system timer interrupts while for PAPI they occur in response to PAPI counter overflow interrupts set to the value of the TAU_EBS_PERIOD.

TAU_EBS_PERIOD

Default value is 1,000. This variable sets the period between samples. The semantics of this value is discussed in the section above on TAU_EBS_SOURCE.

TAU_EBS_UNWIND

Default value is 0 (off). This enables callstack unwinding for each sample using the callstack unwinder specified at TAU configuration time. As of this writing, only the libunwind tool is supported. Support for other callstack unwinders like StackwalkerAPI will be included. The TAU_EBS_UNWIND_DEPTH variable is used to control how many times the TAU sampling framework will be allowed to unwind the callstack.

TAU_EBS_UNWIND_DEPTH

Default value is 10. This controls how many layers of the callstack TAU should unwind before attaching the result to the appropriate TAU event context.

TAU_EBS_KEEP_UNRESOLVED_ADDR

Default value is 0 (off). When set, this variable allows sample addresses that fail to be resolved by BFD to be recorded as "UNRESOLVED <modulename> ADDR <addr> instead of "UNRESOLVED <modulename>". This provides nominally more information than the default scenario in light of missing BFD information.

TAU_EBS_RESOLUTION

Can be set to file, function or line. Is line by default. Event based sampling will resolve to the selected level of granularity.

TAU_TRACK_SIGNALS

Set this variables to 1 to capture callstack as metadata at point of failure.

TAU_SUMMARY

Set this variables to 1 to generate just min/max/stddev/mean statistics instead of per-node data. Use paraprof -dumpsummary and then pprof -f profile.Max/Min to see the data.

TAU_IBM_BG_HWP_COUNTERS

Set this variable to 1 to include IBM’s UPC Hardware Performance counters in the metadata for process 0. Requires the use of MPI.

TAU_CUPTI_API

Default: runtime, options: runtime,driver,both. Controls which layer of CUDA is tracked within the CUPTI measurement system. See for example: tau_exec -T serial,cupti -cupti ./matmult. Option should be set basied on which layer the CUDA program uses—runtime when the program uses the CUDA runtime API, driver when the program uses the driver API. NOTE: Both the PGI accelerator and the HMPP compilers use the driver API.

TAU_TRACK_MPI_T_PVARS

Set this variable to 1 to enable collection of MPI_T PVAR values

TAU_MPI_T_CVAR_METRICS

Set this to the MPI_T variable(s) you want to control, in conjunction with the values set in TAU_MPI_T_CVAR_VALUES

TAU_MPI_T_CVAR_VALUES

Set this to the value(s) you want assigned to the variable(s) specified in TAU_MPI_T_CVAR_METRICS

TAU_SET_NODE

Set this to 0 to allow MPI configurations of TAU to work correctly with serial codes.

TAU_THREAD_PER_GPU_STREAM

Set this to 1 to report each GPU strem as a distinct TAU thread. Currently supports CUPTI only.

TAU_CUPTI_PC_HWB

Set to the hardware buffer size (in MBytes) to use with cupti PC sampling (activated with the -cupti_pc option to tau_exec)

TAU_CUPTI_PC_PERIOD

Set to the sampling period to use with cupti PC sampling (activated with the -cupti_pc option to tau_exec)