Keywords: Performance tools, GPGPU, profiling, tracing
Heterogeneous parallel systems using GPU devices for application acceleration have garnered significant attention in the supercomputing community. However, to realize the full potential of GPU computing, application developers will require tools to measure and analyze accelerator performance with respect to the parallel execution as a whole. A performance measurement technology for the NVIDIA CUDA platform has been developed and integrated with the TAU parallel performance system. The design of the TAUcuda package is based on an experimental NVIDIA CUDA driver and associated runtime and device libraries. In any environment where the CUDA experimental driver is installed, TAUcuda can provide detailed performance information regarding the execution of GPU kernels and the interactions with the parallel program without any modification to the program source or executable code. The paper describes the TAUcuda technology and how it is integrated with the TAU measurement framework to provide integrated performance views. Various examples of TAUcuda use are presented, including CUDA SDK examples, a GPU version of the Linpack benchmark, and a scalable molecular dynamics application, NAMD.
Modified: Fri Nov 12 10:35:14 US/Pacific 2010
Created: Fri Jul 23 9:06:41 US/Pacific 2010