What data can we gather?
- What was the exclusive time (excluding callees) and inclusive time (including callees) spent in each function?
- How many times was f1 called?
- How many profiled functions (subroutines) did f1 call?
- What is the mean time per call for each function?
- What was the maximum (and minimum, mean, and total) time spent in f1 over
all nodes? Contexts? Threads?
- For each invocation of f1, what were the exclusive and inclusive times spent
in it? (Trace)
- Can we replace "time" with "flops"? Instructions issued? Cycles? (Hardware counters)
- Can we profile at a level at which the user understands the modules, giving a high-level view of the application?
- Can we provide a high-level abstraction for profiling and still allow low-level, hardware-related quantities to be profiled?
- Can we profile only communication functions? Communication plus I/O? (Selective profiling)
- Can we profile a set of statements (finer granularity) instead of functions? Can we profile blocks such as for loops?
- Can we calculate statistics such as standard deviation of the exclusive time spent in each function?
- Does it work with threads? pthreads? NexusThreads?
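
Several of the questions above (call counts, number of profiled subroutines called, exclusive versus inclusive time, mean and standard deviation per call) can be illustrated with a small sketch. The Python code below is not the API of any particular profiling tool: the names Profiler, FakeClock, start, and stop are hypothetical and chosen only for illustration. It assumes a single thread and non-recursive calls (a recursive function would have its inclusive time counted more than once). The clock is passed in as a plain callable, so the same bookkeeping would apply if "time" were replaced by another monotonically increasing counter, such as cycles or instructions issued.

```python
import math
from collections import defaultdict


class Profiler:
    """Toy instrumentation profiler (hypothetical, single-threaded)."""

    def __init__(self, clock):
        self.clock = clock                 # callable returning the current counter value
        self.stack = []                    # active frames: [name, start, time_in_children]
        self.calls = defaultdict(int)      # number of invocations per function
        self.subrs = defaultdict(int)      # number of profiled routines each function called
        self.incl = defaultdict(float)     # total inclusive time per function
        self.excl = defaultdict(float)     # total exclusive time per function
        self.excl_sq = defaultdict(float)  # sum of squared exclusive times (for std dev)

    def start(self, name):
        if self.stack:
            # The function on top of the stack is calling another profiled routine.
            self.subrs[self.stack[-1][0]] += 1
        self.calls[name] += 1
        self.stack.append([name, self.clock(), 0.0])

    def stop(self):
        name, begin, child_time = self.stack.pop()
        inclusive = self.clock() - begin       # includes time spent in callees
        exclusive = inclusive - child_time     # excludes time spent in callees
        self.incl[name] += inclusive
        self.excl[name] += exclusive
        self.excl_sq[name] += exclusive * exclusive
        if self.stack:
            # Charge this call's inclusive time to the parent's child-time total.
            self.stack[-1][2] += inclusive
        return inclusive, exclusive            # per-invocation values, as a trace would record

    def mean_exclusive(self, name):
        return self.excl[name] / self.calls[name]

    def stddev_exclusive(self, name):
        mean = self.mean_exclusive(name)
        variance = self.excl_sq[name] / self.calls[name] - mean * mean
        return math.sqrt(max(variance, 0.0))   # guard against tiny negative rounding error


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, dt):
        self.now += dt


clock = FakeClock()
prof = Profiler(clock)

prof.start("f1")
clock.advance(1)          # 1 unit of work in f1 itself
prof.start("f2")
clock.advance(3)          # first call to f2 takes 3 units
prof.stop()
clock.advance(2)          # 2 more units in f1 itself
prof.start("f2")
clock.advance(1)          # second call to f2 takes 1 unit
prof.stop()
prof.stop()

for fn in ("f1", "f2"):
    print(fn, prof.calls[fn], prof.subrs[fn], prof.incl[fn], prof.excl[fn],
          prof.mean_exclusive(fn), prof.stddev_exclusive(fn))
```

In this run f1 is called once and calls two profiled routines, so its inclusive time covers the whole run while its exclusive time counts only the work done outside f2. Per-node, per-context, or per-thread maxima and minima would need one such table per execution unit plus a reduction step, which this sketch does not attempt.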