Appendix A. Environment Variables

Table A.1. TAU Environment Variables

TAU_PROFILE Set to 1 to have TAU profile your code
TAU_TRACE Set to 1 to have TAU trace your code
TAU_METRICS Colon delimited list of TAU/PAPI metrics to profile
PAPI_EVENT Sets the hardware counter to use when TAU is configured with -PAPI. See Section 2.6, “Using Hardware Performance Counters”
PCL_EVENT Sets the hardware counter to use when TAU is configured with -PCL. See Section 2.6, “Using Hardware Performance Counters”
PROFILEDIR Specifies the directory where profile files are to be stored.
TAU_CALLPATH When set to 1 TAU will generate call-path data. Use with TAU_CALLPATH_DEPTH.
TAU_CALLPATH_DEPTH Sets the depth of the callpath profiling. Use with TAU_CALLPATH environment variable.
TAU_CALLSITE When set to 1, TAU will provide call site information for events in profile and trace output. Configure TAU with -bfd=download and -useropt="-g".
TAU_TRACK_MESSAGE Track MPI message statistics (profiling) and message lines (tracing).
TAU_COMM_MATRIX Generate MPI communication matrix data.
TAU_COMPENSATE Attempt to compensate for profiling overhead in profiles.
TAU_COMPENSATE_ITERATIONS Sets the number of iterations TAU uses to estimate the measurement overhead. A larger number of iterations increases profiling precision (default 1000).
TAU_KEEP_TRACEFILES Retains the intermediate trace files. Use with -TRACE TAU configuration option. See Section 3.1, “Generating Event Traces”
TAU_MUSE_PACKAGE Sets the MAGNET/MUSE package name. Use with the -muse TAU configuration option. See Section 2.4, “Using Hardware Counters for Measurement”
TAU_THROTTLE Enables the runtime throttling of events that are lightweight. See Section 1.4, “Selectively Profiling an Application”
TAU_THROTTLE_NUMCALLS Set the maximum number of calls that will be profiled for any function when TAU_THROTTLE is enabled. See Section 1.4, “Selectively Profiling an Application”
TAU_THROTTLE_PERCALL Set the minimum inclusive time (in milliseconds) a function has to have to be instrumented when TAU_THROTTLE is enabled. See Section 1.4, “Selectively Profiling an Application”
TAU_TRACEFILE Specifies the name of Vampir trace file. Use with -TRACE TAU configuration option. See Section 3.1, “Generating Event Traces”
TRACEDIR Specifies the directory where trace files are to be stored. See Section 3.1, “Generating Event Traces”
TAU_SELECT_FILE When set to the location of a valid selective instrumentation file TAU will include/exclude the specified source at runtime.
TAU_VERBOSE When set, TAU will print information about its configuration when running an instrumented application.
TAU_PROFILE_FORMAT When set to snapshot, TAU will generate condensed snapshot profiles instead of the default kind; they merge the different metrics together so that there is only one file per node. When set to merged, TAU will pre-compute the mean and standard deviation at the end of execution.
TAU_TRACK_MEMORY_FOOTPRINT When set, TAU will track the resident set size (VmRSS) and the peak memory usage (VmHWM), that is, the high water mark of the resident set size. These are the same values provided by the 'top' command.
TAU_TRACK_POWER Enables tracking of power consumption via periodic interrupt.
TAU_SYNCHRONIZE_CLOCKS When set TAU will correct for any time discrepancies between nodes because of their CPU clock lag. This should produce more reliable trace data.
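
As an illustration of how several of the variables above combine, the following is a minimal shell sketch that enables profiling with call-path data and runtime throttling before launching an application. The numeric values and the application name ./my_app are assumptions chosen for illustration, not recommended settings.

```shell
#!/bin/sh
# Minimal sketch: enable TAU profiling with call-path data and throttling.
# All numeric values are illustrative assumptions.
export TAU_PROFILE=1                 # collect profiles
export TAU_CALLPATH=1                # generate call-path data
export TAU_CALLPATH_DEPTH=5          # illustrative call-path depth
export TAU_THROTTLE=1                # throttle lightweight events at runtime
export TAU_THROTTLE_NUMCALLS=100000  # illustrative call-count limit
export TAU_THROTTLE_PERCALL=10       # illustrative per-call time threshold

# ./my_app is a hypothetical TAU-instrumented executable.
APP=./my_app
if [ -x "$APP" ]; then
    "$APP"
else
    echo "$APP not found; the variables above are set but nothing was run"
fi
```

Because the variables are exported, any instrumented program started from the same shell afterwards would see the same settings.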

TAU_SAMPLING Default value is 0 (off). When TAU_SAMPLING is set, TAU collects additional profile or trace information (depending on whether TAU_PROFILE or TAU_TRACE is set, respectively) via periodic sampling at runtime. The metrics collected and the sampling period are controlled by the TAU_EBS_SOURCE and TAU_EBS_PERIOD variables, respectively. The TAU_EBS_UNWIND variable determines whether callstack unwinding is enabled at each sample.

For TAU_PROFILE, in addition to the regular TAU instrumented profile output, samples will show up as additional events prefixed by [SAMPLE] for each unique combination of function, file and source line number. These events are grouped under [INTERMEDIATE] event nodes for the instrumented TAU context where the samples occurred. In addition, if TAU_EBS_UNWIND is active, [UNWIND] event nodes may be generated for each callstack entry found by the callstack unwinder.

TAU_SAMPLING is dependent on the availability of BFD as determined by the -bfd configuration option when building TAU. Its ability to resolve sample addresses into function, file name and source line number information may be limited or missing if BFD is missing or is installed with limited functionality. If in doubt, please try building TAU with "-bfd=download". Any one of function, file name and source line number may be missing. In the event all three are, the event is marked as "UNRESOLVED". The TAU_EBS_KEEP_UNRESOLVED_ADDR variable enables addresses to be retained for unresolved results.
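
A sampling run might be configured as in the sketch below, which uses the TAU_EBS_* variables described in the entries that follow. It assumes the "itimer" source, so TAU_EBS_PERIOD is read as microseconds between samples; the period of 5000 and the application name ./my_app are illustrative assumptions. The arithmetic only shows the approximate sampling rate implied by the chosen period.

```shell
#!/bin/sh
# Minimal sketch: profiling plus event-based sampling with callstack unwinding.
export TAU_PROFILE=1
export TAU_SAMPLING=1
export TAU_EBS_SOURCE=itimer     # timer-driven sampling
export TAU_EBS_PERIOD=5000       # illustrative: microseconds between samples
export TAU_EBS_UNWIND=1          # unwind the callstack at each sample
export TAU_EBS_UNWIND_DEPTH=10   # number of callstack layers to unwind

# With itimer, the period is in microseconds, so the nominal rate is
# 1,000,000 / period samples per second.
SAMPLES_PER_SECOND=$((1000000 / $TAU_EBS_PERIOD))
echo "nominal samples per second: $SAMPLES_PER_SECOND"

# ./my_app is a hypothetical executable built against a TAU
# configuration that includes BFD support.
APP=./my_app
if [ -x "$APP" ]; then
    "$APP"
fi
```

If a PAPI metric were used as the source instead, the same period value would count occurrences of that metric rather than microseconds, so the rate calculation above would no longer apply.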

TAU_EBS_SOURCE Default value is "itimer". This variable sets the metric that determines the period of sampling. If the value is "itimer" (default), the period represents the number of microseconds between samples (as determined by TAU_EBS_PERIOD). If the value is a PAPI metric (e.g. PAPI_FP_INS), the period represents the number of counts of that metric between samples (e.g. every 10,000 floating-point instructions if PAPI_FP_INS is used). For "itimer", the samples occur as a result of system timer interrupts, while for PAPI they occur in response to PAPI counter overflow interrupts set to the value of TAU_EBS_PERIOD.
TAU_EBS_PERIOD Default value is 1,000. This variable sets the period between samples. The semantics of this value are discussed in the TAU_EBS_SOURCE entry above.
TAU_EBS_UNWIND Default value is 0 (off). This enables callstack unwinding for each sample using the callstack unwinder specified at TAU configuration time. As of this writing, only the libunwind tool is supported. Support for other callstack unwinders like StackwalkerAPI will be included. The TAU_EBS_UNWIND_DEPTH variable is used to control how many times the TAU sampling framework will be allowed to unwind the callstack.
TAU_EBS_UNWIND_DEPTH Default value is 10. This controls how many layers of the callstack TAU should unwind before attaching the result to the appropriate TAU event context.
TAU_EBS_KEEP_UNRESOLVED_ADDR Default value is 0 (off). When set, this variable allows sample addresses that fail to be resolved by BFD to be recorded as "UNRESOLVED <modulename> ADDR <addr>" instead of "UNRESOLVED <modulename>". This provides nominally more information than the default when BFD information is missing.
TAU_EBS_RESOLUTION Can be set to file, function or line. The default is line. Event-based sampling will resolve to the selected level of granularity.
TAU_TRACK_SIGNALS Set this variable to 1 to capture the callstack as metadata at the point of failure.
TAU_SUMMARY Set this variable to 1 to generate just min/max/stddev/mean statistics instead of per-node data. Use paraprof -dumpsummary and then pprof -f profile.Max/Min to see the data.
TAU_IBM_BG_HWP_COUNTERS Set this variable to 1 to include IBM's UPC Hardware Performance counters in the metadata for process 0. Requires the use of MPI.
TAU_CUPTI_API Default: runtime. Options: runtime, driver, both. Controls which layer of CUDA is tracked within the CUPTI measurement system. See for example: tau_exec -T serial,cupti -cupti ./matmult. The option should be set based on which layer the CUDA program uses: runtime when the program uses the CUDA runtime API, driver when the program uses the driver API. NOTE: Both the PGI accelerator and the HMPP compilers use the driver API.
TAU_TRACK_MPI_T_PVARS Set this variable to 1 to enable collection of MPI_T PVAR values
TAU_MPI_T_CVAR_METRICS Set this to the MPI_T variable(s) you want to control, in conjunction with the values set in TAU_MPI_T_CVAR_VALUES
TAU_MPI_T_CVAR_VALUES Set this to the value(s) you want assigned to the variable(s) specified in TAU_MPI_T_CVAR_METRICS
TAU_SET_NODE Set this to 0 to allow MPI configurations of TAU to work correctly with serial codes.
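
As a closing illustration, the sketch below selects the CUDA layer tracked by CUPTI and then launches the tau_exec example quoted in the TAU_CUPTI_API entry. The choice of "driver" is an assumption for a program built on the driver API, and the run is skipped when tau_exec or ./matmult is not available.

```shell
#!/bin/sh
# Minimal sketch: choose the CUDA layer tracked through CUPTI.
# "driver" is an illustrative choice; use "runtime" or "both" as appropriate.
export TAU_CUPTI_API=driver

# Reject values outside the documented options before running.
case "$TAU_CUPTI_API" in
    runtime|driver|both)
        CUPTI_API_VALID=1
        ;;
    *)
        CUPTI_API_VALID=0
        echo "unsupported TAU_CUPTI_API value: $TAU_CUPTI_API" >&2
        ;;
esac

# Launch only if the value is valid and both tau_exec and the example
# binary ./matmult are present.
if [ "$CUPTI_API_VALID" -eq 1 ] && command -v tau_exec >/dev/null 2>&1 && [ -x ./matmult ]; then
    tau_exec -T serial,cupti -cupti ./matmult
else
    echo "tau_exec or ./matmult not available; skipping run"
fi
```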