Showing 1 entry
Keywords: SMARTS, asynchronous, threads, profiling, tracing, vertical executiondata-parallelism, dependence-driven execution, runtime system, barriersynchronization, TAU
In the solution of large-scale numerical problems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely complex code that does not port to other architectures.
This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of managing parallelism and data locality from the user. We present innovative algorithms, based on the macro-dataflow model for detecting data parallelism and efficiently executing data-parallel statements on shared-memory multiprocessors. We also describe how these algorithms can be implemented on clusters of SMPs.
Modified: Wed Feb 18 17:32:55 2004
Created: Fri Dec 3 12:16:26 1999
Return to the ParaDucks Research Group Publications page.