There are several options for providing the Z-Level view of a ZPL program. We could write a simplified compiler that produces completely unoptimized code implementing the abstract machine exactly, spawning a separate set of threads for each parallel statement, with barrier synchronization after each operation. A simple debugger could then attach a debug slave to each thread, set breakpoints, and collect values for display.
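As a rough illustration of this naive execution model, the following Python sketch spawns a fresh set of threads for each parallel statement and places a barrier after the operation. It is not ZPL and not the output of any ZPL compiler: the function `run_parallel_statement`, the strided work division, and the two example statements are assumptions made purely for illustration.

```python
import threading

def run_parallel_statement(op, inputs, num_threads):
    """Apply op elementwise using a fresh set of threads for this one statement.

    Hypothetical helper: models "one thread set per parallel statement,
    barrier after each operation" from the text above.
    """
    n = len(inputs[0])
    result = [None] * n
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        # Each thread handles a strided slice of the index space (an assumption).
        for i in range(tid, n, num_threads):
            result[i] = op(*(arr[i] for arr in inputs))
        # Barrier after the operation: no thread continues until all have finished.
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the thread set is discarded once the statement completes
    return result

# Two hypothetical array statements, roughly "C := A + B" then "D := C * C".
A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
C = run_parallel_statement(lambda a, b: a + b, [A, B], num_threads=2)
D = run_parallel_statement(lambda c: c * c, [C], num_threads=2)
```

Every statement pays the full cost of thread creation and synchronization, and nothing is fused or reordered across statements, which is what makes each statement boundary a convenient place for a debugger to stop and inspect values.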
This solution has a number of problems, but the primary one is performance. The high-level semantics of ZPL are designed to make optimization extremely effective, so a program compiled without any optimization is likely to run extremely slowly. Since data-parallel languages are typically used for complex scientific applications, this lack of performance could result in a debugging cycle lasting days, or even weeks.
Because the basic ZPL compilation process results in code structures that greatly resemble optimizations (for example, shards and promoted functions), a Z-Level debugger should be able to interact with a fully optimized ZPL program. A number of problems arise when attempting to provide a Z-Level view of an optimized ZPL program running in parallel. These fall primarily into two categories: correcting the effects of optimization, and reconstructing the parallel threads from mloops. With a new approach to debugger design called breakpoint construction, we solve both problems simultaneously.