Frequently Asked Questions (FAQ)

    What data can the TAU Profiling Package show me?
    It can show you the exclusive and inclusive time spent in each function. For templated entities, it shows the breakdown of time spent in each instantiation. Other data includes how many times each function was called, how many profiled functions each function invoked, and the mean inclusive time per call. It shows the mean time spent in a function over all nodes, contexts, and threads. It can also show the exclusive and inclusive times spent in a function for each invocation (and the aggregated sum over all invocations).
    Instead of time, it can use hardware performance counters and show, for each function, the number of instructions issued, cycles, loads, stores, floating point operations, primary and secondary data cache misses, TLB misses, etc.
    It can also calculate statistics such as the standard deviation of the exclusive time (or counts) spent in each templated function.
    Instead of profiling whole functions, the user can profile at a finer granularity: all of the above quantities can be collected for multiple user-defined timers, which profile individual statements in the code instead of functions.
     

    How do I instrument the source code for profiling?
    The user can instrument the source code by inserting TAU Profiling API calls that identify the function and associate it with a profile group. The details of instrumenting the source code can be found in the Overview section and in the Tutorial section.
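    As a rough sketch, instrumentation with the classic TAU macro API typically looks like the following. The no-op fallback definitions exist only so this fragment compiles standalone; in a real build you would include TAU's Profiler header and compile with -DPROFILING_ON. The function and timer names, and the group names TAU_DEFAULT and TAU_USER, are illustrative.

```cpp
// Fallback no-op definitions so this sketch builds without TAU installed.
// With TAU, the real macros come from its Profiler header instead.
#ifndef TAU_PROFILE
#  define TAU_PROFILE(name, type, group)
#  define TAU_PROFILE_TIMER(var, name, type, group)
#  define TAU_PROFILE_START(var)
#  define TAU_PROFILE_STOP(var)
#  define TAU_DEFAULT 0
#  define TAU_USER 0
#endif

double work(int n) {
    // Identify this function and associate it with a profile group.
    TAU_PROFILE("work()", "double (int)", TAU_DEFAULT);
    double sum = 0.0;

    // A finer-grained user-defined timer around one block of statements.
    TAU_PROFILE_TIMER(loop_t, "work() main loop", "", TAU_USER);
    TAU_PROFILE_START(loop_t);
    for (int i = 1; i <= n; ++i) sum += 1.0 / i;   // harmonic sum
    TAU_PROFILE_STOP(loop_t);

    return sum;
}
```

    When profiling is enabled, each entry and exit of work() and each start/stop of the user-defined timer is recorded; when it is disabled, the macros expand to nothing.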
     

    Can I selectively profile a subset of functions at runtime?
    Yes. The user can specify which functions are to be profiled by passing a command line parameter to the application that names the profile groups to profile (by default, all functions are profiled).
     

    Is the Profiling library thread safe?
    Yes. The TAU Profiling Package can be used with a thread library such as pthread, and it generates profile data for each thread. It needs to be configured with the -pthread option to support pthreads; otherwise, a single thread per context is assumed.
     

    What is the overhead of profiling? Can I keep the instrumentation in the libraries when profiling is turned off? Does it add any overhead?
    The overhead associated with profiling is broken up into the overhead of function name registration, which takes place the first time a function is invoked, and the overhead for each subsequent invocation of the function. Benchmarks on SGI using fast hardware timers show that function registration typically takes between 8 and 40 microseconds (longer function/type names take longer), and each subsequent invocation adds 0.8 to 1.6 microseconds.
    When profiling is turned off (the default compilation mode in the absence of the -DPROFILING_ON flag), all the Profiling calls are defined as null macros and no runtime overhead is introduced.
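    The compile-out behavior can be illustrated with a minimal sketch. The macro name here is a hypothetical stand-in for the TAU API, not TAU's actual definition; the point is that without -DPROFILING_ON the preprocessor removes the instrumentation entirely, so program behavior and performance are unchanged.

```cpp
// Sketch of the compile-out pattern: without -DPROFILING_ON, the
// instrumentation macro expands to nothing, so the compiler never
// sees any profiling code. PROFILE_FUNCTION is an illustrative name.
#ifdef PROFILING_ON
#  define PROFILE_FUNCTION(name) /* real timing code would go here */
#else
#  define PROFILE_FUNCTION(name) /* expands to nothing */
#endif

int fib(int n) {
    PROFILE_FUNCTION("fib");   // disappears when profiling is off
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}
```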
     

    Does it work with optimization turned on?
    Yes. Since the TAU Profiling Package uses source code instrumentation in C++, it works with optimization enabled or disabled.
     

    What platforms does it run on and what libraries and frameworks have been profiled with TAU?
    Currently (as of Aug. '98) it runs on:

      SGI Power Challenge, SGI Origin 2000 (ASCI Blue Mountain) etc.
      Intel Teraflop Machine (ASCI Red)
      Sun Workstations
      LINUX PC cluster
      HP 9000 Workstations
      DEC Alpha Workstations
      Cray T3E
    Since the TAU Profiling library is written in C++, it can be ported to other architectures by simply recompiling it.
    The frameworks and libraries that have been profiled with it include:
      POOMA (Parallel Object Oriented Methods and Applications - LANL)
      ACLMPL (Advanced Computing Lab. Message Passing Library - LANL)
      A++/P++ (Array Class Library - LANL)
      PAWS (Parallel Application Work Space - LANL)
      ACLVIS (Advanced Computing Lab. - Visualization Library - LANL)
      MC++ (Monte Carlo simulation package - LANL)
      Conejo (Tecolote project - LANL)
      pC++ (Parallel C++ - U. Indiana, U. Colorado, U. Oregon)
      HPC++ (High Performance C++ - U. Indiana, LANL, U. Colorado, U. Oregon)
      Blitz++ (Object Oriented Numerical Library - U. Waterloo)
     
     

    Can TAU Profiling Package be used to profile some other C++ frameworks and libraries?
    Yes. The TAU Profiling Package has been designed to be portable across different platforms and software frameworks. The steps involved in porting it to another framework are compiling the Profiling library, adding instrumentation to the framework using the TAU API, and recompiling and running the framework with the profiling flags enabled.
     

    Who do I contact for instructions on downloading the TAU Profiling Package?
    TAU can be downloaded from the TAU project home page.
     

    What security issues (xhost, xauth) are involved in using TAU?
    Currently, TAU only works with a secure X display. This means that if you are using xhost + on your display, it should be changed to xhost -, with xauth then used to make the display secure. For example, if the local display is local.cs.uoregon.edu and the application is running on remote.lanl.gov, the authorization key must be taken from the local display and added on the remote node. This can be done by marking the key in the local window and pasting it into the xauth session in the remote window.
    In the local window,

    [local.cs.uoregon.edu]% xauth
    xauth> list
    local.cs.uoregon.edu:0 MIT-MAGIC-COOKIE-1  123243434323232
    ...
    
    In the other window on the remote node,
    [remote.lanl.gov]% xauth
    xauth> add local.cs.uoregon.edu:0 MIT-MAGIC-COOKIE-1  34342323232f2323e23595133
    xauth> exit
    Writing file ~/.Xauthority
    
    If your local X server does not generate cookies for X authorization, please contact your system administrator to enable X authorization in the xdm configuration files and ensure that xdm is running.

    Some users prefer to use ssh to log in to the remote machine, which generates these cookies for authentication automatically. This is the recommended approach.