*****************************************************************************
**			TAU Portable Profiling Package			   **
**			http://www.acl.lanl.gov/tau		           **
*****************************************************************************
**    Copyright 1997-2003				   	   	   **
**    Department of Computer and Information Science, University of Oregon **
**    Advanced Computing Laboratory, Los Alamos National Laboratory        **
**    Research Center Juelich, ZAM Germany			           **
*****************************************************************************
/*******************************************************************
 *                                                                 *
 *        Tuning and Analysis Utilities Installation Procedure     *
 *                           Version 2.12                          *
 *                                                                 *
 *******************************************************************
 *    For installation help, see INSTALL.                          *
 *    For release notes, see README.                               *
 *    For JAVA instructions, see README.JAVA                       *
 *    For licensing information, see LICENSE.                      *
 *    For a tutorial on using TAU, open html/index.html in your    *
 *        web browser.                                             *
 *    For more information, including updates and new releases,    *
 *        see http://www.acl.lanl.gov/tau                          *
 *    For help, reporting bugs, and making suggestions, please     *
 *        send e-mail to tau-bugs@cs.uoregon.edu                   *
 *******************************************************************/


General Installation Procedure: 
-------------------------------
Microsoft Windows users should refer to instructions in Windows-Readme.txt. 

The following instructions are meant for Unix users.

1.  Configure the package for your system.

After uncompressing and untarring tau, the user needs to configure, compile and
install the package. This is done by invoking:

% ./configure
% make install

TAU is configured by running the configure script with appropriate options that
select the profiling and tracing components that are used to build the TAU 
library.  The `configure' shell script attempts to guess correct values for 
various system-dependent variables used during compilation, and creates the 
Makefile(s) (one in each subdirectory of the source directory).

% ./configure -help 

TAU Configuration Utility 
***********************************************************************
Usage: configure [OPTIONS]
  where [OPTIONS] are:
-c++=<compiler>  ............................ specify the C++ compiler.
    options [CC|KCC|g++|*xlC*|cxx|pgCC|FCC|guidec++|aCC|c++|ecpc|icpc].
-cc=<compiler> ................................ specify the C compiler.
                            options [cc|gcc|KCC|pgcc|guidec|*xlc*|ecc].
-pdt_c++=<compiler>  ............ specify a different PDT C++ compiler.
    options [CC|KCC|g++|*xlC*|cxx|pgCC|FCC|guidec++|aCC|c++|ecpc|icpc].
-fortran=<compiler> ..................... specify the Fortran compiler.
   options    [gnu|sgi|ibm|ibm64|hp|cray|pgi|absoft|fujitsu|sun|compaq|
  	                                         kai|nec|hitachi|intel]
-useropt='<parameters>' ............... list of commandline parameters.
-pthread .................................. Use pthread thread package.
-sproc .................................. Use SGI sproc thread package.
-tulipthread=<dir> .......... Specify location of Tulip/Smarts package.
-smarts .................. Use SMARTS API for threads (use with above).
-openmp ........................................... Use OpenMP threads.
-opari=<dir>... Specify location of Opari OpenMP tool (use with above).
-opari_region ......... Report performance data for all OpenMP regions.
-opari_construct ... Report performance data for all OpenMP constructs.
-pcl=<dir> ..... Specify location of PCL (Performance Counter Library).
-papi=<dir> ............... Specify location of PAPI (Performance API).
-pdt=<dir> ........ Specify location of PDT (Program Database Toolkit).
-jdk=<dir> ...... Specify location of JAVA 2 Development Kit (jdk1.2+).
-dyninst=<dir> ................... Specify location of DynInst Package.
-mpi .......................... Specify use of TAU MPI wrapper library.
-mpiinc=<dir> ............. Specify location of MPI include dir and use
                           the TAU MPI Profiling and Tracing Interface.
-mpilib=<dir> ............. Specify location of MPI library dir and use
                           the TAU MPI Profiling and Tracing Interface.
-mpilibrary=<library> ................ Specify a different MPI library.
            e.g., -mpilibrary=-lmpi_r                                  
-nocomm  ........ Disable tracking communication events in MPI library.
-epilog=<dir>  ............ Specify location of EPILOG Tracing package.
-pythoninc=<dir> ........ Specify location of Python include directory.
-pythonlib=<dir> ............ Specify location of Python lib directory.
-TRACE ..................................... Generate TAU event traces.
-PROFILE ............ Generate profiles (summary statistics) (default).
-PROFILECALLPATH ......................... Generate call path profiles.
-PROFILESTATS .................. Enable standard deviation calculation.
-MULTIPLECOUNTERS ............ Use multiple hardware counters and time.
-SGITIMERS .......... Use fast nanosecond timers on SGI R10000 systems.
-LINUXTIMERS ......... Use low overhead TSC Counter for wallclock time.
-CPUTIME .......... Use usertime+system time instead of wallclock time.
-PAPIWALLCLOCK ........ Use PAPI to access wallclock time. Needs -papi.
-PAPIVIRTUAL   .......... Use PAPI for virtual (user) time calculation.
-noex .................. Use no exceptions while compiling the library.
-help ...................................... display this help message.
***********************************************************************

The following command-line options are available to configure:

-prefix=<directory>
   
   Specifies the destination directory where the header, library and binary 
   files are copied. By default, these are copied to subdirectories <arch>/bin 
   and <arch>/lib in the TAU root directory. 
   
-arch=<architecture>
   
   Specifies the architecture. If the user does not specify this option, 
   configure determines the architecture. For SGI, the user can specify one 
   of sgi32, sgin32, or sgi64 for 32, n32, or 64 bit compilation modes 
   respectively. The files are installed in the <architecture>/bin and 
   <architecture>/lib directories.
   
-c++=<C++ compiler>
   
   Specifies the name of the C++ compiler. Supported C++ compilers include 
   KCC (from KAI/Intel), CC, g++ (from GNU), FCC (from Fujitsu), xlC (from 
   IBM), guidec++ (from KAI/Intel), aCC (from HP), c++ (from Apple), and pgCC 
   (from PGI). 
   
-cc=<C Compiler>
   
   Specifies the name of the C compiler. Supported C compilers include cc, 
   gcc (from GNU), pgcc (from PGI), fcc (from Fujitsu), xlc (from IBM), and 
   KCC (from KAI/Intel).

-pdt_c++=<C++ Compiler> 
   Specifies a different C++ compiler for PDT (tau_instrumentor). This is 
   typically used when the library is compiled with a C++ compiler 
   (specified with -c++) and the tau_instrumentor is compiled with a different 
   <pdt_c++> compiler. For example, -c++=pgCC -cc=pgcc -pdt_c++=KCC -openmp ... 
   uses PGI's OpenMP compilers for TAU's library and KCC for tau_instrumentor.
   
-fortran=<Fortran Compiler>
   
   Specifies the name of the Fortran90 compiler. Valid options are:
   gnu, sgi, ibm, ibm64, hp, cray, pgi, absoft, fujitsu, sun, compaq, kai, 
   nec, hitachi, and intel.

-pthread
   
   Specifies pthread as the thread package to be used. In the default mode, no 
   thread package is used. 
   
-tulipthread=<directory>
   
   Specifies Tulip threads (HPC++) as the threads package to be used as well 
   as the location of the root directory where the package is installed. 
   [ Ref: http://www.acl.lanl.gov/tulip ]
   
-tulipthread=<directory> -smarts
   
   Specifies  SMARTS (Shared Memory Asynchronous Runtime System) as the 
   threads package to be used. <directory> gives the location of the SMARTS 
   root directory. [ Ref: http://www.acl.lanl.gov/smarts ]

-openmp
   Specifies OpenMP as the threads package to be used. 
   [ Ref: http://www.openmp.org ]

-opari=<dir>
   Specifies the location of the Opari OpenMP directive rewriting tool. 
   Using the Opari source-to-source instrumentor in conjunction with TAU 
   exposes OpenMP events for instrumentation. See the examples/opari directory.
   [ Ref: http://www.fz-juelich.de/zam/kojak/opari/ ]
   Note: There are two versions of Opari: a standalone version 
   (opari-pomp-1.1.tar.gz) and a newer version found in the opari/ directory 
   of the KOJAK distribution (kojak-<ver>.tar.gz). Please upgrade to the 
   KOJAK version (especially if you're using IBM xlf90).
   
-opari_region 
   Report performance data for only OpenMP regions and not constructs. 
   By default, both regions and constructs are profiled with Opari.

-opari_construct 
   Report performance data for only OpenMP constructs and not regions.
   By default, both regions and constructs are profiled with Opari.

-pdt=<directory>
   
   Specifies the location of the installed PDT (Program Database Toolkit) root 
   directory. PDT is used to build tau_instrumentor, a C++, C and F90 
   instrumentation program that automatically inserts TAU annotations in the 
   source code. If PDT is configured with a subdirectory option (-compdir=<opt>)
   then TAU can be configured with the same option by specifying 
   -pdt=<dir> -pdtcompdir=<opt>. 

   [ Ref: http://www.acl.lanl.gov/pdtoolkit ]
   
-pcl=<directory>
  
   Specifies the location of the installed PCL (Performance Counter Library) 
   root directory. PCL provides a common interface to access hardware 
   performance counters on modern microprocessors. The library supports 
   Sun UltraSparc I/II, PowerPC 604e under AIX, MIPS R10000/12000 under IRIX, 
   HP/Compaq Alpha 21164, 21264 under Tru64 Unix and Cray Unicos (T3E) and the 
   Intel Pentium family of microprocessors under Linux. This option specifies 
   the use of hardware performance counters for profiling (instead of time).  
   To measure floating point instructions, set the environment variable 
   PCL_EVENT to PCL_FP_INSTR (for example). Refer to the TAU User's Guide or
   PCL Documentation (pcl.h) for other event names.
   [ Ref : http://www.fz-juelich.de/zam/PCL ]

-papi=<directory>

   Specifies the location of the installed PAPI (Performance API) root 
   directory. Like PCL, PAPI specifies a standard application programming 
   interface (API) for accessing the hardware performance counters available 
   on most modern microprocessors. To measure floating point instructions, 
   set the environment variable PAPI_EVENT to PAPI_FP_INS (for example). 
   Refer to the TAU User's Guide or PAPI Documentation for other event names.
   [ Ref : http://icl.cs.utk.edu/projects/papi/api/ ]
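
   The event selection described for -pcl and -papi might be set up as in
   the following sketch. It uses Bourne shell (sh/bash) syntax rather than
   the csh syntax used elsewhere in this file. The variable names and values
   are the ones given in the option descriptions; nothing further is assumed
   about how TAU reads them.

```shell
# Bourne-shell form. csh/tcsh users would instead write:
#   setenv PAPI_EVENT PAPI_FP_INS
#   setenv PCL_EVENT  PCL_FP_INSTR

# For a TAU library configured with -papi=<dir>: count floating point
# instructions.
PAPI_EVENT=PAPI_FP_INS
export PAPI_EVENT

# For a TAU library configured with -pcl=<dir>: the corresponding PCL event.
PCL_EVENT=PCL_FP_INSTR
export PCL_EVENT

# Show the settings that a program started from this shell would inherit.
echo "PAPI_EVENT=$PAPI_EVENT PCL_EVENT=$PCL_EVENT"
```

   Only the variable matching the counter library that TAU was configured
   with is relevant; both are shown here for comparison.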
   
-jdk=<directory>
   Specifies the location of the Java 2 development kit (jdk1.2+). See
   README.JAVA for instructions on using TAU with Java 2 applications. 
   This option should only be used for configuring TAU to use JVMPI for 
   profiling and tracing of Java applications. It should not be used for 
   configuring jracy, which uses java from the user's path. 

-dyninst=<directory>
   Specifies the location of the DynInst (dynamic instrumentation) package. 
   See README.DYNINST for instructions on using TAU with DynInstAPI for 
   binary runtime instrumentation (instead of manual instrumentation). 
   [ Ref: http://www.cs.umd.edu/projects/dyninstAPI/ ]

-mpiinc=<dir>
   
   Specifies the directory where MPI header files reside (such as mpi.h and 
   mpif.h). This option also generates the TAU MPI wrapper library that 
   instruments MPI routines using the MPI Profiling Interface. See the 
   examples/NPB2.3/config/make.def file for its usage with Fortran and MPI 
   programs and examples/pi/Makefile for a C++ example that uses MPI. 
   
-mpilib=<dir>
   
   Specifies the directory where MPI library files reside. This option should 
   be used in conjunction with the -mpiinc=<dir> option to generate the TAU 
   MPI wrapper library. 

-mpilibrary=<lib>
   
   Specifies the use of a different MPI library. By default, TAU uses
   -lmpi or -lmpich as the MPI library. This option allows the user to specify
   another library. For example, -mpilibrary=-lmpi_r selects a thread-safe 
   MPI library.

-nocomm
   Allows the user to turn off tracking of messages (synchronous/asynchronous) in
   TAU's MPI wrapper interposition library. Entry and exit events for MPI routines 
   are still tracked. Affects both profiling and tracing.
   
-epilog=<dir>
   
   Specifies the directory where the EPILOG tracing package [FZJ] is installed.
   This option should be used in conjunction with the -TRACE option to generate
   binary EPILOG traces (instead of binary TAU traces). EPILOG traces can then
   be used with other tools such as EXPERT. EPILOG comes with its own 
   implementation of the MPI wrapper library and the POMP library used with 
   Opari. Using this option overrides TAU's libraries for MPI and OpenMP.

-pythoninc=<dir>
   
   Specifies the location of the Python include directory. This is the 
   directory where the Python.h header file is located. This option enables 
   Python bindings to be generated. The user should set the environment 
   variable PYTHONPATH to <TAUROOT>/<ARCH>/lib/bindings-<options> to use a 
   specific version of the TAU Python bindings. By importing the package 
   pytau, a user can manually instrument the source code and use the TAU API. 
   Alternatively, by importing tau and using tau.run('<func>'), TAU can 
   automatically generate instrumentation. See the examples/python directory 
   for further information.

-pythonlib=<dir>
   
   Specifies the location of the Python lib directory. This is the directory
   where the *.py and *.pyc files (and the config directory) are located. This 
   option is mandatory on IBM systems when Python bindings are used. On other 
   systems it may be omitted, but -pythoninc=<dir> must still be specified. 

-PROFILE 

   This is the default option; it specifies summary profile files to be 
   generated at the end of execution. Profiling generates aggregate statistics 
   (such as the total time spent in routines and statements), and can be used 
   in conjunction with the profile browser jracy to analyse the performance. 
   Wallclock time is used for profiling  program entities. 
   
-PROFILECALLPATH 

   This option generates call path profiles, which show the time spent in a 
   routine when it is called by another routine in the calling path. "a => b"
   stands for the time spent in routine "b" when it is invoked by routine "a".
   This option is an extension of -PROFILE, the default profiling option. 
   By setting the TAU_CALLPATH_DEPTH environment variable, the user can vary 
   the depth of the callpath. See examples/calltree for further information.
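
   As a minimal sketch, the callpath depth could be set before running an 
   instrumented program as shown here. The example uses Bourne shell 
   (sh/bash) syntax, and the value 3 is an arbitrary illustration rather 
   than a documented default.

```shell
# Bourne-shell form; csh/tcsh users would write: setenv TAU_CALLPATH_DEPTH 3
# The value 3 is chosen only for illustration.
TAU_CALLPATH_DEPTH=3
export TAU_CALLPATH_DEPTH

# Programs launched from this shell inherit the setting.
echo "TAU_CALLPATH_DEPTH=$TAU_CALLPATH_DEPTH"
```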

-PROFILESTATS
   
   Specifies the calculation of additional statistics, such as the standard 
   deviation of the exclusive time/counts spent in each profiled block. This 
   option is an extension of -PROFILE, the default profiling option.
   
-PROFILECOUNTERS
   
   Specifies use of hardware performance counters for profiling under IRIX  
   using the SGI R10000 perfex counter access interface. The use of this option 
   is deprecated in favor of the -pcl=<dir> and -papi=<dir> options described 
   above. 

-MULTIPLECOUNTERS
   
   Allows TAU to track more than one quantity (multiple hardware counters, CPU
   time, wallclock time, etc.) Configure with other options such as -papi=<dir>,
   -pcl=<dir>, -LINUXTIMERS, -SGITIMERS, -CPUTIME, -PAPIVIRTUAL, etc. See 
   examples/multicounters/README file for detailed instructions on setting the
   environment variables for this option. If -MULTIPLECOUNTERS is used with the
   -TRACE option, tracing employs the COUNTER1 variable for wallclock time. 
   
-SGITIMERS
   
   Specifies use of the free running nanosecond resolution on-chip timer on 
   the MIPS R10000. This timer has a lower overhead than the default timer on 
   SGI, and is recommended for SGIs. 

-LINUXTIMERS
   Specifies the use of the free running nanosecond resolution time stamp 
   counter (TSC) on Pentium III+ and Itanium family of processors under Linux.
   This timer has a lower overhead than the default timer and is recommended.

-CPUTIME
   Uses usertime + system time instead of wallclock time. It gives the CPU
   time spent in the routines.  This currently works only on LINUX systems 
   for multi-threaded programs and on all systems for single-threaded programs. 
   
-PAPIWALLCLOCK
   Uses PAPI (must specify -papi=<dir> also) to access high resolution CPU 
   timers for wallclock time. The default case uses gettimeofday() which 
   has a higher overhead than this. 

-PAPIVIRTUAL
   Uses PAPI (must specify -papi=<dir> also) to access process virtual time.
   This represents the user time for measurements. 


-TRACE
   
   Generates event-trace logs, rather than summary profiles. Traces show when 
   and where an event occurred, in terms of the location in the source code and
   the process that executed it. Traces can be merged and converted using 
   tau_merge and tau_convert utilities respectively, and  visualized using 
   Vampir, a commercial trace visualization tool. [ Ref http://www.pallas.de ]
   
-noex
   
   Specifies that no exceptions be used while compiling the library. This is 
   relevant for C++. 
   
-useropt=<options-list>
   
   Specifies additional user options such as -g or -I. Multiple options 
   must be enclosed together in single quotes.
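
   The reason for quoting can be seen with a small shell experiment. This is
   only an illustration of shell word splitting: count_args is a 
   hypothetical helper that stands in for ./configure and is not part of 
   TAU.

```shell
# count_args is a hypothetical helper that reports how many arguments the
# shell passed to it; it stands in for ./configure here.
count_args() {
    echo "$#"
}

# Quoted: the whole option list travels as a single argument.
count_args -useropt='-g -I/local/apps/STL/'     # prints 1

# Unquoted: the shell splits at the space, so -I/local/apps/STL/ arrives as
# a separate argument that is no longer part of -useropt.
count_args -useropt=-g -I/local/apps/STL/       # prints 2
```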
   
-help
   
   Lists all the available configure options and quits. 

   Examples:

   % ./configure -c++=KCC 
   Use TAU with KCC
 
   % ./configure -c++=CC -useropt='-g -I/local/apps/STL/'
   Use TAU with SGI CC and add the above user defined options to the 
   commandline.

   % ./configure -TRACE -PROFILE 
   Enable both profiling and tracing.

   % ./configure -c++=KCC -SGITIMERS -tulipthread=/home/smarts/build/smarts-1.0
     -smarts -arch=sgin32 -prefix=/usr/local/packages/tau
   Use TAU with KCC and fast nanosecond timers on SGI and use SMARTS with -n32
   options and install the files in /usr/local/packages/tau

   % ./configure -c++=KCC -cc=cc -arch=sgi64 -mpiinc=/local/apps/mpich/include
     -mpilib=/local/apps/mpich/lib/IRIX64/ch_p4 -SGITIMERS -pdt=/local/apps/pdt
   Use TAU with KCC, and cc on 64 bit SGI systems and use MPI wrapper libraries
   with SGI's low cost timers and use PDT for automated source code 
   instrumentation.

   % ./configure -c++=guidec++ -cc=guidec -papi=/usr/local/packages/papi -openmp
     -mpiinc=/usr/packages/mpich/include -mpilib=/usr/packages/mpich/lib
   Use OpenMP+MPI using KAI's Guide compiler suite and use PAPI for accessing
   hardware performance counters for measurements.

***********************************************************************
   To install *multiple* (typical) configurations of TAU at a site, you may use the 
   script 'installtau'. It takes options similar to those described above. It 
   invokes ./configure <opts>; make clean install;  to create multiple libraries that 
   may be requested by the users at a site. 
   % installtau -help

TAU Configuration Utility 
***********************************************************************
Usage: installtau [OPTIONS]
  where [OPTIONS] are:
-arch=<arch>  
-fortran=<compiler>  
-cc=<compiler>   
-c++=<compiler>   
-useropt=<options>  
-pdt=<pdtdir>  
-papi=<papidir>  
-mpiinc=<mpiincdir>  
-mpilib=<mpilibdir>  
-mpilibrary=<mpilibrary>  
-opari=<oparidir>  
***********************************************************************


2. Compilation.

   Type `make install' to compile the package. 
   Type `make tests' to compile the example programs that are included with
   this distribution.

   Make installs the library and its stub makefile in the <prefix>/<arch>/lib 
   subdirectory and installs utilities such as pprof and jracy in the 
   <prefix>/<arch>/bin subdirectory.

   
   Add the <prefix>/<arch>/bin subdirectory to the path in your .cshrc file,
   e.g.,
   # in .cshrc file
   set path=($path /usr/local/packages/tau/sgi64/bin)
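
   For users of sh or bash, a rough equivalent of the .cshrc line above 
   might look like the following. TAU_BIN is a hypothetical helper variable 
   introduced for this sketch, and the directory is the example install 
   location used above.

```shell
# Bourne-shell counterpart of the csh "set path" line, e.g. for ~/.profile.
# TAU_BIN is a helper variable used only in this sketch.
TAU_BIN=/usr/local/packages/tau/sgi64/bin

# Append the directory only if it is not already on the search path.
case ":$PATH:" in
    *":$TAU_BIN:"*) ;;
    *) PATH="$PATH:$TAU_BIN" ;;
esac
export PATH
```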

3. Instrumentation.
   JAVA requires no special instrumentation. To use TAU with JAVA, the 
   LD_LIBRARY_PATH environment variable must include the TAU <arch>/lib 
   directory. See README.JAVA for instructions regarding its usage.

   For other languages such as C++, C, and Fortran 90, TAU instrumentation in 
   the form of macros or routines must be added to the source code to 
   identify routine transitions. This can be done automatically using 
   tau_instrumentor, the C++ instrumentor based on the Program Database 
   Toolkit, or manually using the instrumentation API (Application 
   Programmers Interface). The API is explained in detail in the 
   documentation available from the download page at 
   http://www.acl.lanl.gov/tau and is illustrated in the examples directory. 
   Instrumentation involves identifying functions and associating each 
   function with one or more TAU profile groups, which allows groups of 
   functions to be profiled selectively. By default, all instrumented 
   functions that are invoked are profiled.
   
   % cd examples/instrument
   % ./simple
   % pprof
   % jracy

   To use tau_instrumentor, the C++ source code instrumentor: 
   a. Install pdtoolkit. [ Ref: http://www.acl.lanl.gov/pdtoolkit ] 
      % ./configure -arch=IRIX64 -KCC

   b. Install TAU using the -pdt configuration option.
      % ./configure -pdt=/usr/local/packages/pdtoolkit-1.0 -c++=KCC -arch=sgi64 

   c. Modify the makefile to invoke two tools: cxxparse from PDT, which 
      generates a program database file (.pdb) that contains program entities 
      (such as routine locations), and tau_instrumentor, which uses the .pdb 
      file and the C++ source code to generate an instrumented version of the 
      source code. See examples/autoinstrument/Makefile. 
      
      % cd examples/autoinstrument; make
      % klargest 
      % pprof

   d. tau_reduce is a utility that can determine which routines should not
      be instrumented. Instrumentation in frequently called light-weight 
      routines may introduce undue perturbation and distort the performance 
      data. tau_reduce examines the profile output and a set of rules for 
      de-instrumentation and produces a selective instrumentation file that 
      specifies which routines should not be instrumented. This file can be 
      fed to tau_instrumentor or tau_run. For an example, see examples/reduce 
      (the examples/README file has a description). The 
      utils/TAU_REDUCE.README file describes tau_reduce and the format for 
      specifying the rules for removing instrumentation. 
      % cd examples/reduce
      % make 

   To illustrate the use of TAU Fortran 90 instrumentation API, we have 
   included the NAS Parallel Benchmarks 2.3 LU and SP suites in the 
   examples/NPB2.3 directory [Ref http://www.nas.nasa.gov/NAS/NPB/ ].
   See the config/make.def makefile that shows how TAU can be used with 
   MPI  (with the TAU MPI Wrapper library) and Fortran 90. To use this, TAU
   must be configured using the -mpiinc=<dir>  and -mpilib=<dir> options. The
   default Fortran 90 compiler used is f90. This may be changed by the user in
   the makefile. LU is completely instrumented and uses the instrumented MPI
   library whereas SP has minimal instrumentation in the top level routine
   and relies on the instrumented MPI wrapper library. 
 
4. jracy.

   jracy is the GUI for TAU performance analysis. It requires Java 1.2+. An
   earlier version of the profile browser, racy, was implemented using Tcl/Tk.
   It is also available in this distribution but support for racy will be 
   gradually phased out. Users are encouraged to use jracy instead. jracy 
   does *not* require -jdk=<dir> option to be specified (which is used for 
   configuring TAU for analyzing Java applications). The 'java' jvm program 
   should be in the user's path.
   NOTE: If jracy does not work properly, please rebuild the jRacy.jar file.
   First ensure that javac (1.2+) is in your path, then run:
   % cd tau-xxx/tools/src/jRacy
   % make clean; make

5. TAU System Requirements :
   -------------------------
I) The Profiling Library needs a recent C++ compiler. Our recommended list:
	a) Kuck and Associates' (http://www.kai.com) KCC compiler
	b) KAI's KAP/Pro (http://www.kai.com) OpenMP guidec++ compiler
	c) SGI (http://www.sgi.com) MipsPro 7.2+ CC compiler 
	d) PGI (http://www.pgroup.com) 3.0 pgCC compiler for Linux
	e) GNU (http://www.gnu.org) gcc-2.95 g++ compiler
	f) IBM (http://www.ibm.com) xlC C++ compiler for IBM SP
	g) SUN (http://www.sun.com) Sun CC 5.0+ compiler
	h) HP (http://www.hp.com) Tru64 cxx 6.x compiler
	i) HP (http://www.hp.com) aCC compiler 
 
II) Platforms :
   TAU has been tested on 
	a) SGI IRIX 6.5 systems (Origin 2000) with KCC, CC, g++, guidec++.
	b) LINUX x86 PC clusters with 
		i) 	KAI KCC compiler, 
		ii) 	GNU g++/egcs compiler,
		iii)	PGI pgCC, pgcc, pgf90 compiler suite,
	        iv) 	Fujitsu C++/f90 compiler suite,
		v)      KAI KAP/Pro compiler suite.
		vi)     Intel C++/C/F90 compiler suite.
	c) Sun Solaris2 with g++, KCC. 
	d) HP PA-RISC systems running HP-UX with g++, and aCC. 
	e) Cray T3E with Cray C++ compiler, and KAI KCC.
	f) HP Tru64 Alpha with g++, cxx.
        g) HP Alpha Linux clusters with g++.
	i) Microsoft Windows. Tested with MS Visual C++ v6.0.
	j) IBM SP AIX (RS6000) systems with KCC, and xlC compilers.
	k) PowerPC Linux with g++.
	l) IA-64 Linux with g++, SGI Pro64 and Intel C++/C/F90 compilers.
	m) Apple OS X (Darwin) with c++.
	n) Hitachi SR8000 with KCC, g++, Hitachi cc and f90 compilers. 
        o) NEC SX-5 system with NEC c++, cc, and f90 compilers
	   

   TAU may work with minor modifications on other platforms.
	
III) Software Requirements :
   a) Tcl/Tk
   TAU's GUI racy needs Tcl 7.4/Tk 4.0 or better. The default is 8.0. 
   Tcl/Tk can be downloaded from http://www.scriptics.com. 
   NOTE: Tcl/Tk is only required for running the profile browser racy. The
   current version of TAU supports the new Java based jracy profile browser that
   replaces the Tcl/Tk based racy. 
   
   b) xauth
   The display needs to be secure. xhost+ should not be used. Xauth style
   security is required. See TAU FAQ on how to use this. Contact your 
   system administrator if your X-server is not configured for Xauth 
   cookies. 

   c) xrdb
   The configure script ensures that the display is ok using xrdb.

   d) java
   jracy requires Java 1.2+. Java can be downloaded from http://www.sun.com

    
6. Modifying user's Makefile for Tracing/Profiling.

   TAU provides a makefile stub file which is placed in the installation
   directory as <prefix>/<arch>/lib/Makefile.tau[-optionlist]. Users need to 
   include this makefile and use the make variables TAU_INCLUDE, TAU_LIBS,
   and TAU_DEFS appropriately in their makefiles. See 
   examples/instrument/Makefile.
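
   A user makefile built around the stub might look roughly like the one 
   written out by the shell sketch below. The stub path reuses the example 
   prefix (/usr/local/packages/tau) and architecture (sgi64) from earlier in 
   this file; the program name "simple", the CXX setting, and the output 
   file name Makefile.example are illustrative assumptions. The actual stub 
   may carry an option suffix, as in Makefile.tau[-optionlist].

```shell
# Write a small example makefile to the current directory.
cat > Makefile.example <<'EOF'
# Pull in the TAU stub makefile, which supplies TAU_INCLUDE, TAU_DEFS and
# TAU_LIBS. Adjust the path to your own <prefix>/<arch>/lib directory.
include /usr/local/packages/tau/sgi64/lib/Makefile.tau

# Illustrative compiler choice; use the compiler appropriate for your site.
CXX      = g++
CXXFLAGS = $(TAU_INCLUDE) $(TAU_DEFS)
LIBS     = $(TAU_LIBS)

# Recipes follow ';' so that this listing does not depend on tab characters.
simple: simple.o ; $(CXX) simple.o -o simple $(LIBS)

simple.o: simple.cpp ; $(CXX) $(CXXFLAGS) -c simple.cpp
EOF

echo "wrote Makefile.example"
```

   TAU_INCLUDE and TAU_DEFS are passed at compile time and TAU_LIBS at link 
   time; examples/instrument/Makefile remains the reference for actual usage.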

7. Examples of configuration and usage on the IBM SP
        
     % cd tau-2.x
     Example I:
     Profiling a Multithreaded C++ program (compiled with xlC)
     
     % configure -pthread
     % make clean; make install
     % set path=($path <TAU DIRECTORY>/rs6000/bin)
     % cd examples/threads
     % make; 
     % hello
     
       It has two threads: the profiling data should show functions executing on
       each thread
     % pprof
       This is the text based profile browser.
     % jracy  
     
     Example II:
     Profiling an MPI program using the TAU MPI wrapper library.
     
     % configure -mpiinc=/usr/lpp/ppe.poe/include -mpilib=/usr/lpp/ppe.poe/lib
     % make clean; make install
     % cd examples/pi
     % make 
     % poe cpi -procs 4 -rmpool 2
     % pprof
       Alternatively, run racy.
       Note: Using the MPI Profiling Interface, TAU can generate profile data 
       for all MPI routines as well.
     
     Example III:
     Profiling an application written in C++ (compiled with KCC) using automatic 
     source code instrumentation and using CPU time instead of (the default) 
     wallclock time.
     [ For KCC you'll need % module load KCC]
     Download PDT (Program Database Toolkit) from http://www.acl.lanl.gov/pdtoolkit
     
     % cd pdtoolkit-1.x
     % configure 
     % make ; make install
       This takes a while...
     
     Next configure TAU to use PDT for automatic source code instrumentation.
     % cd tau-2.x
     % configure -c++=KCC -cc=cc -pdt=<pdtoolkit-1.x root directory> -CPUTIME
     		e.g.,   ... -pdt=/u1/sameer/pdtoolkit-1.3 ...
     % make clean; make install
     % cd examples/autoinstrument
     % make 
       This takes klargest.cpp, an uninstrumented file, parses it (PDT), and 
       invokes tau_instrumentor, which takes the PDT output and generates an 
       instrumented C++ file. The resulting program, linked with the TAU 
       library, generates performance data when executed.
     % klargest
     % pprof
     % racy
     
     Example IV:
     Tracing an MPI program (compiled with KCC) and displaying the traces in 
     Vampir.
     
     % configure -c++=KCC -cc=cc -mpiinc=/usr/lpp/ppe.poe/include 
       	  -mpilib=/usr/lpp/ppe.poe/lib -TRACE
     % make clean; make install
     % cd examples/pi
     % make CXX=mpKCC
     % poe cpi -procs 4 -rmpool 2 2000
       Calculate the value of pi using 2000 iterations. 
     
     % tau_merge tautrace.*.trc cpi.trc
     % tau_convert -vampir cpi.trc tau.edf cpi.pv
     
     % vampir cpi.pv 
     
     In the Menu, choose Preferences -> Color Styles -> Activities and choose a 
     distinct color for each activity. 
     
     Example V:
     Profiling a hybrid OpenMP + MPI C program (examples/openmpi) using xlC.
     % configure -openmp -mpiinc=/usr/lpp/ppe.poe/include 
         -mpilib=/usr/lpp/ppe.poe/lib  
     % make clean; make install
     % cd examples/openmpi
     % make CXX=mpCC_r CC=mpcc_r
     % setenv OMP_NUM_THREADS 2
     % poe stommel -procs 2 -rmpool 2 
     % pprof
   
8. Using TAU with POOMA
   Set the environment variable TAUDIR to point to the directory where TAU is
   installed, then follow the procedure below.

    FOR POOMA/SMARTS Users:
    -----------------------
    1. Configure PDT
    ****************
    % cd /usr/local/packages/pdtoolkit-1.0
    [FOR SGI]
    % configure -KCC -arch=IRIX64
    [FOR LINUX PCs]
    % configure -KCC
    
    2. Configure TAU
    ****************
    % cd /usr/local/packages/tau-2.7
    [FOR SGI]
    % ./configure -arch=sgi64 -c++=KCC -tulipthread=/usr/local/packages/smarts-1.0 -smarts -SGITIMERS -pdt=/usr/local/packages/pdtoolkit-1.0
    [FOR LINUX PCs]
    % ./configure -arch=linux -c++=KCC -tulipthread=/usr/local/packages/smarts-1.0 -smarts -pdt=/usr/local/packages/pdtoolkit-1.0
    % make install
    
    3. Configure SMARTS
    *******************
    % cd /usr/local/packages/smarts-1.0
    [FOR SGI]
    % configure --with-arch=iris4d --prefix /usr/local/packages/smarts-1.0 --with-taudir=/usr/local/packages/tau-2.7 --enable-64bit --enable-profile
    [FOR LINUX PCs]
    % configure --with-arch=i386-linux --prefix /usr/local/packages/smarts-1.0 --with-taudir=/usr/local/packages/tau-2.7 --enable-profile
    % make
    % make install
    
    4. Configure Pooma II 
    *********************
    % setenv TAUDIR     /usr/local/packages/tau-2.7
    % setenv PDTDIR	    /usr/local/packages/pdtoolkit-1.0
    % setenv SMARTSDIR  /usr/local/packages/smarts-1.0
    [FOR SGI]
    % ./configure --arch SGI64KCC --suite PP --parallel --profile --opt --ex
    [FOR LINUX PCs]
    % ./configure --arch LINUXKCC --suite PP --parallel --profile --opt --ex
    % setenv POOMASUITE PP
    % make
    % cd examples/Solvers/SimpleJacobi
    % make
    % cd $POOMASUITE
    % SimpleJacobi --pooma-threads <n>
    % pprof
    % jracy 
       
    
    
    
