SMARTS (Shared Memory Asynchronous Runtime System) supports integrated task and data parallelism for MIMD architectures with deep memory hierarchies. To support task parallelism, SMARTS provides user-level threads that allow application developers to create lightweight virtual processors natural to the algorithm being implemented. Tulip is a portable parallel runtime class library that provides a fast, lightweight user-level threads package. SMARTS is the Tulip interface for threads that is used in POOMA II.
Tracing and Profiling User Level SMARTS Threads
Using TAU, we can evaluate the performance of the different scheduling policies implemented in SMARTS. Here we focus on its usage.
The above figure shows the scheduling of two iterates using the SMARTS sync scheduler on four processors.
The above figure illustrates the SCVE scheduler (also known as the SMARTS Single Parser scheduler) used in the Red/Black SOR example running on 32 processors of an SGI Origin 2000. The Red phase computation is shown in green and the Black phase in blue. Scheduling overhead is shown in red (TAU_DEFAULT), and the templates in FastAsyncScheduler.h are shown in yellow (TAU_USER4 profile group).
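For readers unfamiliar with the Red and Black phases named in this caption, here is a minimal serial sketch of one Red/Black SOR iteration on a square grid. It is not the POOMA II or SMARTS code used for the figure. The names Grid, sweep_colour, red_black_iteration and max_residual are hypothetical, and the problem (a Laplace equation with fixed boundary values) is assumed purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Square grid stored row-major; boundary cells hold fixed values.
// (Hypothetical helper type, not part of SMARTS, Tulip, or POOMA.)
struct Grid {
    std::size_t n;          // points per side, including the boundary
    std::vector<double> u;  // n * n values

    explicit Grid(std::size_t size) : n(size), u(size * size, 0.0) {
        assert(size >= 3);  // need at least one interior point
    }
    double& at(std::size_t i, std::size_t j) { return u[i * n + j]; }
    double at(std::size_t i, std::size_t j) const { return u[i * n + j]; }
};

// Update every interior point whose colour, (i + j) % 2, matches `colour`.
// A point of one colour reads only neighbours of the other colour, so all
// updates within a single phase are independent of one another.
void sweep_colour(Grid& g, int colour, double omega) {
    for (std::size_t i = 1; i + 1 < g.n; ++i) {
        for (std::size_t j = 1; j + 1 < g.n; ++j) {
            if (static_cast<int>((i + j) % 2) != colour) continue;
            const double avg = 0.25 * (g.at(i - 1, j) + g.at(i + 1, j) +
                                       g.at(i, j - 1) + g.at(i, j + 1));
            g.at(i, j) += omega * (avg - g.at(i, j));  // over-relaxed update
        }
    }
}

// One full iteration: the Red phase followed by the Black phase.
void red_black_iteration(Grid& g, double omega) {
    sweep_colour(g, 0, omega);  // Red phase
    sweep_colour(g, 1, omega);  // Black phase
}

// Largest absolute gap between an interior point and its neighbour average.
double max_residual(const Grid& g) {
    double worst = 0.0;
    for (std::size_t i = 1; i + 1 < g.n; ++i) {
        for (std::size_t j = 1; j + 1 < g.n; ++j) {
            const double avg = 0.25 * (g.at(i - 1, j) + g.at(i + 1, j) +
                                       g.at(i, j - 1) + g.at(i, j + 1));
            worst = std::fmax(worst, std::fabs(avg - g.at(i, j)));
        }
    }
    return worst;
}
```

Because the points within one colour do not depend on each other, each phase can be split into independent pieces of work, which is the kind of structure a thread scheduler can exploit. The two phases themselves must still run in order.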