Name

tau_exec — TAU execution wrapping script

Synopsis

tau_exec [ options ] [--] { exe } [ exe options ]

Description

Use this script to perform memory or IO tracking on either an instrumented or uninstrumented executable.

Options

-v

verbose mode

-s

show the command generated by tau_exec without running it

-qsub

BG/P qsub mode

-io

track I/O

-memory

track memory

-memory_debug

enable memory debugger

-cuda

track GPU events via CUDA (Must be configured with -cuda=<dir>, Preferred for CUDA 4.0 or earlier).

-cupti

track GPU events via Nvidia's CUPTI interface (Must be configured with -cupti=<dir>, Preferred for CUDA 4.1 or later).

-um

used in conjunction with -cupti, adds support for Unified Memory on GPUs. Requires CUDA 6.5 or later.

-opencl

track GPU events via OpenCL

-openacc

track OpenACC events. Supports TAU configurations with -arch=craycnl or PGI compilers on x86_64 Linux.

-ompt

track OpenMP events via OMPT interface

-power

track power events via PAPI's perf RAPL interface

-numa

track DRAM events. Requires PAPI with recent perf support for x86_64

-armci

track ARMCI events via PARMCI (Must be configured with -armci=<dir>)

-shmem

track SHMEM events

-numa

Activates hardware counters to measure remote DRAM accesses and total node accesses. These counters must be available from PAPI in the selected TAU configuration.

-ts-sample-flags=<flags>

flags to pass to the PT TS sample_ts command. Overrides the TAU_TS_SAMPLE_FLAGS environment variable.

-ts-report-flags=<flags>

flags to pass to the PT TS report_ts command. Overrides the TAU_TS_REPORT_FLAGS environment variable.

-ebs

enable event-based sampling (EBS). See README.sampling for more information.

-ebs_period=<count>

sampling period (default 1000)

-ebs_source=<counter>

sets sampling metric (default "itimer")

-ptts

Launch ThreadSpotter. It must be available in the system path.

-sass=<level>

tracks GPU events via CUDA with source code locator activity

-csv

output sass profile in CSV format

-T<option>

specify TAU option

-loadlib=<file.so>

specify additional load library

-XrunTAU-<options>

specify TAU library directly

-gdb

run program in gdb debugger
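Several of the options above can be combined, and -s is a convenient way to check the result before a real run. The following is a minimal sketch, not part of TAU itself: the preview function is a hypothetical helper, ./ring is a placeholder executable, and the script assumes tau_exec is on the PATH (it only prints a message otherwise).

```shell
#!/bin/sh
# Sketch: preview the command tau_exec would generate, without running the program.
# Assumptions: tau_exec is on PATH; "preview" is a hypothetical helper name;
# ./ring is a placeholder for your own executable.
preview() {
    if command -v tau_exec >/dev/null 2>&1; then
        # -s shows the command generated by tau_exec without running it
        tau_exec -T serial -s "$@"
    else
        echo "tau_exec not found in PATH" >&2
        return 127
    fi
}

# Preview an I/O tracking run; fall back to a notice if TAU is not installed.
preview -io ./ring || echo "preview skipped"
```

Dropping -s from the helper turns the same command line into an actual measurement run.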

Notes

Default if unspecified: -T MPI. MPI is assumed unless SERIAL is specified.
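To illustrate the default, here is a small sketch that composes a command line for an MPI run (relying on the implicit -T MPI) or for a serial run (passing -T serial explicitly). The tau_cmd function and its rank-count argument are hypothetical conveniences for illustration only; the script just prints the command lines and does not run TAU.

```shell
#!/bin/sh
# Hypothetical helper: compose a tau_exec command line.
# ranks > 0  : MPI launch, relying on the default -T MPI
# ranks == 0 : serial launch, so -T serial is passed explicitly
tau_cmd() {
    ranks=$1; shift
    if [ "$ranks" -gt 0 ]; then
        echo "mpirun -np $ranks tau_exec $*"
    else
        echo "tau_exec -T serial $*"
    fi
}

tau_cmd 2 -io ./ring          # prints: mpirun -np 2 tau_exec -io ./ring
tau_cmd 0 -memory ./matmult   # prints: tau_exec -T serial -memory ./matmult
```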

CUDA kernel tracking is included if a CUDA SYNC call is made after each kernel launch and cudaThreadExit() is called before the exit of each thread that uses CUDA.

OpenCL kernel tracking is included if an OpenCL SYNC call is made after each kernel launch and clReleaseContext() is called before the exit of each thread that uses OpenCL.

Examples

mpirun -np 2 tau_exec -io ./ring

mpirun -np 8 tau_exec -ebs -ebs_period=1000000 -ebs_source=PAPI_FP_INS ./ring

tau_exec -T serial,cupti -cupti ./matmult (Preferred for CUDA 4.1 or later)

tau_exec -T serial -cuda ./matmult (Preferred for CUDA 4.0 or earlier)

tau_exec -T serial -opencl ./matmult (OpenCL)