SMARTS (Shared Memory Asynchronous Runtime System) supports integrated task and data parallelism for MIMD architectures with
deep memory hierarchies. To support task parallelism, SMARTS provides user-level threads that allow application developers to create
light-weight virtual processors natural to the algorithm being implemented.
Tulip is a portable parallel runtime class library. It provides a fast, lightweight user-level threads package. SMARTS is the Tulip interface for threads that is used in POOMA II.
Tracing and Profiling User-Level SMARTS Threads
Using TAU, we can evaluate the performance of the different scheduling policies
implemented in SMARTS. Here we focus on how TAU is used with SMARTS.
 Profiling SMARTS applications
 Tracing SMARTS and visualizing using Vampir

The above figure shows the scheduling of two iterates using the SMARTS sync scheduler on four processors. 
 
The above figure illustrates the SCVE scheduler (also known as the SMARTS Single Parser scheduler) used in the Red/Black SOR example running on 32 processors of an SGI Origin 2000. The Red phase computation is shown in green, the Black phase in blue, the scheduling overhead in red (TAU_DEFAULT), and the templates in FastAsyncScheduler.h in yellow (the TAU_USER4 profile group).
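
For readers unfamiliar with the Red and Black phases named in the caption, the following is a minimal serial sketch of Red/Black successive over-relaxation (SOR) for the Laplace equation on a square grid. It is not the SMARTS example code and uses neither SMARTS nor TAU. The names Grid, sweep and red_black_sor are hypothetical and chosen only for illustration. Each sweep updates points of one color using only neighbors of the other color, so the points within a phase are independent of one another, which is presumably why the two phases appear as separate bands in the trace.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of SMARTS): a square grid, row-major.
struct Grid {
    std::size_t n;          // points per side, including the boundary
    std::vector<double> v;  // n * n values

    Grid(std::size_t n_, double boundary, double interior)
        : n(n_), v(n_ * n_, interior) {
        assert(n >= 3);  // need at least one interior point
        for (std::size_t i = 0; i < n; ++i) {
            at(0, i) = at(n - 1, i) = at(i, 0) = at(i, n - 1) = boundary;
        }
    }

    double& at(std::size_t i, std::size_t j) { return v[i * n + j]; }
};

// Update every interior point of one color.
// color 0 = "red"  ((i + j) even), color 1 = "black" ((i + j) odd).
void sweep(Grid& g, int color, double omega) {
    for (std::size_t i = 1; i + 1 < g.n; ++i) {
        for (std::size_t j = 1; j + 1 < g.n; ++j) {
            if ((i + j) % 2 != static_cast<std::size_t>(color)) continue;
            // All four neighbors have the opposite color.
            double avg = 0.25 * (g.at(i - 1, j) + g.at(i + 1, j) +
                                 g.at(i, j - 1) + g.at(i, j + 1));
            g.at(i, j) += omega * (avg - g.at(i, j));
        }
    }
}

// One iteration = a Red phase followed by a Black phase.
void red_black_sor(Grid& g, double omega, int iterations) {
    for (int it = 0; it < iterations; ++it) {
        sweep(g, 0, omega);  // Red phase
        sweep(g, 1, omega);  // Black phase
    }
}
```

As a usage example, constructing Grid g(8, 1.0, 0.0) and calling red_black_sor(g, 1.5, 200) drives the interior values toward the boundary value 1.0. In a parallel version, the loop body of each sweep could be divided among threads, with a synchronization point between the Red and Black phases.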