2. TAU Mapping API

TAU lets the user map performance data for entities in one layer of multi-layered software to entities in another layer. Mapping is used in profiling (and tracing) both synchronous and asynchronous models of computation. The following macros are used for mapping. First, locate and identify the higher-level statement using the TAU_MAPPING macro. Then, create storage for a function identifier associated with it using TAU_MAPPING_OBJECT. Associate the higher-level statement with a FunctionInfo object that is visible to lower-level code using TAU_MAPPING_LINK, and then profile entire blocks using TAU_MAPPING_PROFILE. Independent sets of statements can be profiled with the TAU_MAPPING_PROFILE_TIMER, TAU_MAPPING_PROFILE_START, and TAU_MAPPING_PROFILE_STOP macros, which use the FunctionInfo object. The TAU examples/mapping directory has two examples (embedded and external) that illustrate the use of this mapping API for generating object-oriented profiles.

2.1. TAU_MAPPING(statement, key);

Arguments:

				statement;        // any C++ statement
				TauGroup_t key;   // TAU group/unique key associated with the statement

TAU_MAPPING encapsulates the C++ statement that we want to map to some other layer. The other layer can execute synchronously or asynchronously with respect to this statement. The key is a number that the lower layer uses to refer to this statement. For example:

				
				int main()
				{
				Array <2> A(N, N), B(N, N), C(N,N), D(N, N);
				//Original statement:
				A = B + C + D;
				//Instrumented statement:
				TAU_MAPPING(A = B + C + D; , TAU_USER);
				... 
				}
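
To make the role of the key concrete, the following sketch shows one way a macro could wrap a statement and register its source text under a key before executing it. It is not TAU's implementation: SKETCH_MAPPING, StatementRecord, records(), and noteStatement() are hypothetical names introduced only for this illustration, and the key type is assumed to be an unsigned long.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace mapping_sketch {

// Hypothetical record for one mapped statement (a stand-in, not TAU's FunctionInfo).
struct StatementRecord {
    std::string text;   // source text of the mapped statement
    long executions;    // how many times the statement has run
    StatementRecord() : executions(0) {}
};

// Key -> record table that lower-level code could consult later.
inline std::map<unsigned long, StatementRecord>& records() {
    static std::map<unsigned long, StatementRecord> table;
    return table;
}

// Store the statement text under `key` and count one execution.
inline void noteStatement(unsigned long key, const char* text) {
    assert(text != nullptr);
    StatementRecord& rec = records()[key];
    rec.text = text;
    ++rec.executions;
}

}  // namespace mapping_sketch

// Hypothetical macro: records the statement under `key`, then executes it.
// #stmt turns the statement's tokens into a string literal.
#define SKETCH_MAPPING(stmt, key)                     \
    do {                                              \
        mapping_sketch::noteStatement((key), #stmt);  \
        stmt;                                         \
    } while (0)

// Usage: map one arithmetic statement under key 7 and return its result.
inline int runMappedStatement() {
    int a = 0, b = 2, c = 3, d = 4;
    SKETCH_MAPPING(a = b + c + d, 7UL);
    return a;
}
```

After runMappedStatement() has run, code in another layer that knows only the key 7 can look up which statement it refers to through records().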
			

2.2. TAU_MAPPING_CREATE(name, type, key, groupname, tid);

Arguments:

				char *name, *type, *groupname;
				TauGroup_t key;   // TAU group/unique key associated with the mapping
				int tid;          // Thread id

TAU_MAPPING_CREATE is similar to TAU_MAPPING but it requires the name, type and group name parameters (as character strings) to be specified. It creates a mapping and associates it with the key that is specified. Later, this key may be specified to retrieve the FunctionInfo object associated with this key for timing purposes. The thread identifier is specified in the tid parameter.

Example:

				
				TAU_MAPPING_CREATE("foo()", "void ()", function_id,"USER", tid);
			

2.3. TAU_MAPPING_LINK(FuncIdVar, Key);

Arguments: FunctionInfo *FuncIdVar; TauGroup_t Key;

TAU_MAPPING_LINK creates a link between the object defined in TAU_MAPPING_OBJECT (which identifies a statement) and the actual higher-level statement that is mapped with TAU_MAPPING. The Key argument represents the profile group to which the statement belongs, as specified in the TAU_MAPPING macro argument. For the example of array statements, this link should be created in the constructor of the class that represents the expression. TAU_MAPPING_LINK should be executed before any measurement takes place. It assigns the identifier of the statement to the object to which FuncIdVar refers. For example:

				
				//
				// Constructor
				// Input an expression and record it for later use.
				//
				template<class LHS,class Op,class RHS,class EvalTag>
				ExpressionKernel<LHS,Op,RHS,EvalTag>::
				ExpressionKernel(const LHS& lhs, const Op& op, const RHS& rhs,
				    Pooma::Scheduler_t& scheduler)
				: Pooma::Iterate_t(scheduler,
				    forEachTag(MakeExpression<LHS>::make(lhs),
				        DataBlockTag<CountBlocks>(), SumCombineTag()) +
				    forEachTag(MakeExpression<RHS>::make(rhs),
				        DataBlockTag<CountBlocks>(), SumCombineTag()),
				    -1),
				  lhs_m(lhs), op_m(op), rhs_m(rhs)
				{
				TAU_MAPPING_LINK(TauMapFI, TAU_USER)
				// .. rest of the constructor
				}
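
The linking step can be pictured as the constructor of the higher-level object looking up, by key, a pointer to the record for its statement and storing that pointer in a data member. The sketch below illustrates only this idea and is not how TAU implements the macro: Record, recordFor(), and Kernel are hypothetical names, with Kernel playing the role of ExpressionKernel.

```cpp
#include <cassert>
#include <map>

namespace link_sketch {

// Hypothetical per-statement record (a stand-in for a FunctionInfo object).
struct Record {
    int linkedObjects;  // number of objects linked to this record
    Record() : linkedObjects(0) {}
};

// Returns the record registered under `key`, creating it on first use.
// std::map does not relocate its elements, so the pointer stays valid.
inline Record* recordFor(unsigned long key) {
    static std::map<unsigned long, Record> table;
    return &table[key];
}

// Hypothetical higher-level object that carries the identity of its statement.
class Kernel {
public:
    explicit Kernel(unsigned long key) : record_(recordFor(key)) {
        // The link is made in the constructor, before anything is measured.
        assert(record_ != nullptr);
        ++record_->linkedObjects;
    }
    Record* record() const { return record_; }

private:
    Record* record_;  // plays the role of the member that stores the mapping object
};

// Usage: two kernels built with the same key share one record.
inline bool kernelsShareRecord() {
    Kernel first(11UL);
    Kernel second(11UL);
    return first.record() == second.record();
}

}  // namespace link_sketch
```

Because every object built with the same key points at the same record, measurements taken later through any of them accumulate against the one higher-level statement.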
			

2.4. TAU_MAPPING_OBJECT(FuncIdVar);

Arguments: FunctionInfo *FuncIdVar;

To create storage for an identifier associated with a higher-level statement that is mapped using TAU_MAPPING, we use the TAU_MAPPING_OBJECT macro. For example, in the TAU_MAPPING example, array expressions are turned into objects of the class ExpressionKernel, so each statement is represented by an instance of this class. To embed the identity of the statement, we store the mapping object in a data member of this class, as shown below:

				template<class LHS,class Op,class RHS,class EvalTag>
				class ExpressionKernel : public Pooma::Iterate_t
				{
				public:
				
				typedef ExpressionKernel<LHS,Op,RHS,EvalTag> This_t;
				//
				// Construct from an Expr.
				// Build the kernel that will evaluate the expression on the 
				// given domain.
				// Acquire locks on the data referred to by the expression.
				//
				ExpressionKernel(const LHS&,const Op&,const RHS& , \
				Pooma::Scheduler_t&);
				
				
				virtual ~ExpressionKernel();
				
				//
				// Do the loop.
				//
				virtual void run();
				
				private:
				
				// The expression we will evaluate.
				LHS lhs_m;
				Op  op_m;
				RHS rhs_m;
				TAU_MAPPING_OBJECT(TauMapFI)
				};
			

2.5. TAU_MAPPING_PROFILE(FuncIdVar);

Arguments: FunctionInfo *FuncIdVar;

The TAU_MAPPING_PROFILE macro measures time and attributes it to the statement mapped with the TAU_MAPPING macro. It takes as its argument the identifier of the higher-level statement that is stored using TAU_MAPPING_OBJECT and linked to the statement using TAU_MAPPING_LINK. TAU_MAPPING_PROFILE measures the time spent in the entire block in which it is invoked. For example, if the run method of the class does work that must be associated with the higher-level array expression, we can instrument it as follows:

				//
				// Evaluate the kernel
				// Just tell an InlineEvaluator to do it.
				//
				
				template<class LHS,class Op,class RHS,class EvalTag>
				void
				ExpressionKernel<LHS,Op,RHS,EvalTag>::run()
				{
				TAU_MAPPING_PROFILE(TauMapFI)
				
				// Just evaluate the expression.
				KernelEvaluator<EvalTag>().evaluate(lhs_m,op_m,rhs_m);
				// we could release the locks here or in dtor 
				}
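
Timing an entire block, as described above, is commonly done with a scoped object that starts a clock when it is constructed and charges the elapsed time to a record when it is destroyed. The following is a minimal sketch of that general technique, not TAU's own code: Record, ScopedTimer, and run() are hypothetical names, and the record stands in for the FunctionInfo object of the higher-level statement.

```cpp
#include <cassert>
#include <chrono>

namespace scope_sketch {

// Hypothetical record that accumulates time for one higher-level statement.
struct Record {
    double seconds;  // total time charged to the statement
    long calls;      // number of timed blocks
    Record() : seconds(0.0), calls(0) {}
};

// Starts timing on construction and charges the elapsed time to `record`
// on destruction, so it covers the rest of the enclosing block.
class ScopedTimer {
public:
    explicit ScopedTimer(Record* record)
        : record_(record), start_(std::chrono::steady_clock::now()) {
        assert(record_ != nullptr);
    }
    ~ScopedTimer() {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        record_->seconds += elapsed.count();
        ++record_->calls;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    Record* record_;
    std::chrono::steady_clock::time_point start_;
};

// Lower-level work whose cost is attributed to the higher-level record.
inline long run(Record* record) {
    ScopedTimer timer(record);  // covers everything up to the closing brace
    long sum = 0;
    for (long i = 0; i < 1000; ++i) sum += i;
    return sum;
}

}  // namespace scope_sketch
```

The caller passes in the record it obtained when the object was linked, so the time spent in run() shows up under the higher-level statement rather than under run() itself.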
			

2.6. TAU_MAPPING_PROFILE_START(timer, tid);

Arguments: Profiler timer; int tid;

TAU_MAPPING_PROFILE_START starts the timer that is created using TAU_MAPPING_PROFILE_TIMER. This will measure the elapsed time in groups of statements, instead of the entire block. A corresponding stop statement stops the timer as described next. The thread identifier is specified in the tid parameter.

2.7. TAU_MAPPING_PROFILE_STOP(tid);

Arguments: int tid;

TAU_MAPPING_PROFILE_STOP stops the timer associated with the mapped lower-level statements. It is used in conjunction with the TAU_MAPPING_PROFILE_TIMER and TAU_MAPPING_PROFILE_START macros. The thread identifier is specified in the tid parameter.

Example:

				
				template<class LHS,class Op,class RHS,class EvalTag>
				void
				ExpressionKernel<LHS,Op,RHS,EvalTag>::run()
				{
				TAU_MAPPING_PROFILE_TIMER(timer, TauMapFI);
				printf("ExpressionKernel::run() this = %p\n", (void *)this);
				// Just evaluate the expression.
				// tid is assumed to hold the calling thread's identifier.
				TAU_MAPPING_PROFILE_START(timer, tid);
				KernelEvaluator<EvalTag>().evaluate(lhs_m, op_m, rhs_m);
				TAU_MAPPING_PROFILE_STOP(tid);
				// we could release the locks here instead of in the dtor.
				}
			


2.8. TAU_MAPPING_PROFILE_TIMER(timer, FuncIdVar);

Arguments: Profiler timer; FunctionInfo * FuncIdVar;

TAU_MAPPING_PROFILE_TIMER enables timing of individual statements, instead of complete blocks. It will attribute the time to a higher-level statement. The second argument is the identifier of the statement that is obtained after TAU_MAPPING_OBJECT and TAU_MAPPING_LINK have executed. The timer argument in this macro is any variable that is used subsequently to start and stop the timer.
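
To contrast this with block-level timing, the sketch below times only the statements between an explicit start and stop and charges them to a record. It illustrates the general start/stop pattern under stated assumptions and is not TAU's implementation: Record, RegionTimer, and run() are hypothetical names, and thread identifiers are left out for brevity.

```cpp
#include <cassert>
#include <chrono>

namespace region_sketch {

// Hypothetical record that accumulates time for one higher-level statement.
struct Record {
    double seconds;  // total time charged to the statement
    long intervals;  // number of completed start/stop pairs
    Record() : seconds(0.0), intervals(0) {}
};

// Timer bound to a record; only code between start() and stop() is charged.
class RegionTimer {
public:
    explicit RegionTimer(Record* record) : record_(record), running_(false) {
        assert(record_ != nullptr);
    }
    void start() {
        start_ = std::chrono::steady_clock::now();
        running_ = true;
    }
    void stop() {
        if (!running_) return;  // ignore a stop without a matching start
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        record_->seconds += elapsed.count();
        ++record_->intervals;
        running_ = false;
    }

private:
    Record* record_;
    bool running_;
    std::chrono::steady_clock::time_point start_;
};

// Only the summation loop is timed; the setup and the return are not.
inline long run(Record* record) {
    RegionTimer timer(record);
    long setup = 10;  // not timed
    timer.start();
    long sum = 0;
    for (long i = 1; i <= 100; ++i) sum += i;  // timed region
    timer.stop();
    return setup + sum;
}

}  // namespace region_sketch
```

Declaring the timer separately from starting it is what allows a single block to contain both timed and untimed statements.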