Table A.1. TAU Environment Variables
VARIABLE NAME | DESCRIPTION |
---|---|
TAU_PROFILE | Set to 1 to have TAU profile your code |
TAU_TRACE | Set to 1 to have TAU trace your code |
TAU_METRICS | Colon delimited list of TAU/PAPI metrics to profile |
PAPI_EVENT | Sets the hardware counter to use when TAU is configured with -PAPI. See Section 2.6, “Using Hardware Performance Counters” |
PCL_EVENT | Sets the hardware counter to use when TAU is configured with -PCL. See Section 2.6, “Using Hardware Performance Counters” |
PROFILEDIR | Specifies the directory where profile files are to be stored. |
TAU_CALLPATH | When set to 1 TAU will generate call-path data. Use with TAU_CALLPATH_DEPTH. |
TAU_CALLPATH_DEPTH | Sets the depth of the callpath profiling. Use with TAU_CALLPATH environment variable. |
TAU_CALLSITE | When set to 1, TAU will provide call site information for events in profile and trace output. Configure TAU with -bfd=download and -useropt="-g". |
TAU_TRACK_MESSAGE | Track MPI message statistics (profiling) or message lines (tracing). |
TAU_COMM_MATRIX | Generate MPI communication matrix data. |
TAU_COMPENSATE | Attempt to compensate for profiling overhead in profiles. |
TAU_COMPENSATE_ITERATIONS | Set the number of iterations TAU uses to estimate the measurement overhead. A larger number of iterations increases profiling precision (default 1000). |
TAU_KEEP_TRACEFILES | Retains the intermediate trace files. Use with -TRACE TAU configuration option. See Section 3.1, “Generating Event Traces” |
TAU_MUSE_PACKAGE | Sets the MAGNET/MUSE package name. Use with the -muse TAU configuration option. See Section 2.4, “Using Hardware Counters for Measurement” |
TAU_THROTTLE | Enables the runtime throttling of events that are lightweight. See Section 1.4, “Selectively Profiling an Application” |
TAU_THROTTLE_NUMCALLS | Set the maximum number of calls that will be profiled for any function when TAU_THROTTLE is enabled. See Section 1.4, “Selectively Profiling an Application” |
TAU_THROTTLE_PERCALL | Set the minimum inclusive time (in milliseconds) a function has to have to be instrumented when TAU_THROTTLE is enabled. See Section 1.4, “Selectively Profiling an Application” |
TAU_TRACEFILE | Specifies the name of Vampir trace file. Use with -TRACE TAU configuration option. See Section 3.1, “Generating Event Traces” |
TRACEDIR | Specifies the directory where trace files are to be stored. See Section 3.1, “Generating Event Traces” |
TAU_SELECT_FILE | When set to the location of a valid selective instrumentation file TAU will include/exclude the specified source at runtime. |
TAU_COMPILER_SELECT_FILE | When set to the location of a valid selective instrumentation file the TAU LLVM plugin will include/exclude the specified source. |
TAU_COMPILER_MIN_INSTRUCTION_COUNT | Excludes functions from instrumentation if their instruction count is below the set value. Defaults to 50. Set to 1 to include all functions. |
TAU_VERBOSE | When set, TAU will print information about its configuration when running an instrumented application. |
TAU_PROFILE_FORMAT | When set to snapshot, TAU will generate condensed snapshot profiles (they merge the different metrics so there is only one file per node) instead of the default kind. When set to merged, TAU will pre-compute the mean and standard deviation at the end of execution. |
TAU_TRACK_MEMORY_FOOTPRINT | When set, TAU will track the resident set size (VmRSS) and peak memory usage (VmHWM, the high-water mark of the resident set size). These are the same values provided by the 'top' command. |
TAU_TRACK_POWER | Enables tracking of power consumption via periodic interrupt. |
TAU_SYNCHRONIZE_CLOCKS | When set, TAU will correct for time discrepancies between nodes caused by CPU clock lag. This should produce more reliable trace data. |
TAU_SAMPLING | Default value is 0 (off). When TAU_SAMPLING is set, TAU collects additional profile or trace information (depending on whether TAU_PROFILE or TAU_TRACE is set, respectively) via periodic sampling at runtime. The metric collected and the sampling period are controlled by the TAU_EBS_SOURCE and TAU_EBS_PERIOD variables, respectively. The TAU_EBS_UNWIND variable determines whether callstack unwinding is enabled at each sample. For TAU_PROFILE, in addition to the regular TAU instrumented profile output, samples show up as additional events prefixed by [SAMPLE] for each unique combination of function, file and source line number. These events are grouped under [INTERMEDIATE] event nodes for the instrumented TAU context where the samples occurred. In addition, if TAU_EBS_UNWIND is active, [UNWIND] event nodes may be generated for each callstack entry found by the callstack unwinder. TAU_SAMPLING depends on the availability of BFD, as determined by the -bfd configuration option when building TAU. Its ability to resolve sample addresses into function, file name and source line number information may be limited or missing if BFD is missing or is installed with limited functionality. If in doubt, try building TAU with "-bfd=download". Any one of function, file name and source line number may be missing. If all three are missing, the event is marked as "UNRESOLVED". The TAU_EBS_KEEP_UNRESOLVED_ADDR variable enables addresses to be retained for unresolved results. |
TAU_EBS_SOURCE | Default value is "itimer". This variable sets the metric that determines the period of sampling. If the value is "itimer" (default), the period is the number of microseconds between samples (as set by TAU_EBS_PERIOD). If the value is a PAPI metric (e.g. PAPI_FP_INS), the period is the number of counts of that metric between samples (e.g. every 10,000 floating-point instructions if PAPI_FP_INS is used). For "itimer", the samples occur as a result of system timer interrupts, while for PAPI they occur in response to PAPI counter overflow interrupts set to the value of TAU_EBS_PERIOD. |
TAU_EBS_PERIOD | Default value is 1,000. This variable sets the period between samples. The semantics of this value are described in the TAU_EBS_SOURCE entry above. |
TAU_EBS_UNWIND | Default value is 0 (off). This enables callstack unwinding for each sample using the callstack unwinder specified at TAU configuration time. As of this writing, only the libunwind tool is supported. Support for other callstack unwinders like StackwalkerAPI will be included. The TAU_EBS_UNWIND_DEPTH variable is used to control how many times the TAU sampling framework will be allowed to unwind the callstack. |
TAU_EBS_UNWIND_DEPTH | Default value is 10. This controls how many layers of the callstack TAU should unwind before attaching the result to the appropriate TAU event context. |
TAU_EBS_KEEP_UNRESOLVED_ADDR | Default value is 0 (off). When set, this variable allows sample addresses that fail to be resolved by BFD to be recorded as "UNRESOLVED <modulename> ADDR <addr>" instead of "UNRESOLVED <modulename>". This provides nominally more information than the default when BFD information is missing. |
TAU_EBS_RESOLUTION | Can be set to file, function or line (default: line). Event-based sampling will resolve to the selected level of granularity. |
TAU_TRACK_SIGNALS | Set this variable to 1 to capture the callstack as metadata at the point of failure. |
TAU_SUMMARY | Set this variable to 1 to generate just min/max/stddev/mean statistics instead of per-node data. Use paraprof -dumpsummary and then pprof -f profile.Max/Min to see the data. |
TAU_IBM_BG_HWP_COUNTERS | Set this variable to 1 to include IBM's UPC Hardware Performance counters in the metadata for process 0. Requires the use of MPI. |
TAU_CUPTI_API | Default: runtime; options: runtime, driver, both. Controls which layer of CUDA is tracked within the CUPTI measurement system. See for example: tau_exec -T serial,cupti -cupti ./matmult. Set the option based on which layer the CUDA program uses: runtime when the program uses the CUDA runtime API, driver when the program uses the driver API. NOTE: Both the PGI accelerator and the HMPP compilers use the driver API. |
TAU_TRACK_MPI_T_PVARS | Set this variable to 1 to enable collection of MPI_T PVAR values. |
TAU_MPI_T_CVAR_METRICS | Set this to the MPI_T variable(s) you want to control, in conjunction with the values set in TAU_MPI_T_CVAR_VALUES. |
TAU_MPI_T_CVAR_VALUES | Set this to the value(s) you want assigned to the variable(s) specified in TAU_MPI_T_CVAR_METRICS. |
TAU_SET_NODE | Set this to 0 to allow MPI configurations of TAU to work correctly with serial codes. |
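
As an illustration of how several of the variables in Table A.1 combine, the following is a minimal sketch of a shell setup that enables profiling with call-path data and event-based sampling. The specific values (a call-path depth of 5, for instance) are illustrative assumptions rather than recommendations, and `./my_app` is a hypothetical name for an instrumented executable; the launch line is left commented out for that reason.

```shell
#!/bin/sh
# Sketch only: values below are illustrative assumptions, not tuned settings.

# Profiling with call-path data (TAU_CALLPATH is used with TAU_CALLPATH_DEPTH).
export TAU_PROFILE=1
export TAU_CALLPATH=1
export TAU_CALLPATH_DEPTH=5        # assumed depth, chosen for illustration

# Event-based sampling driven by the system timer.
export TAU_SAMPLING=1
export TAU_EBS_SOURCE=itimer       # default source per the table
export TAU_EBS_PERIOD=1000         # microseconds between samples for itimer
export TAU_EBS_UNWIND=1            # unwind the callstack at each sample
export TAU_EBS_UNWIND_DEPTH=10     # default depth per the table

# Show the resulting TAU settings before launching anything.
env | grep '^TAU_' | sort

# Hypothetical launch of an instrumented binary:
# ./my_app
```

Printing the `TAU_` variables before the run makes it easy to confirm what the application will inherit; setting TAU_VERBOSE as well would have TAU report its own view of the configuration at startup.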