A far more interesting benchmark in the NAS suite is the random
sparse conjugate gradient computation.
This benchmark requires the repeated solution of a system $Ax = b$, where
$A$ is a random sparse matrix.
There are two ways to approach the parallelization of this task.
The best, and most difficult, approach is to view the matrix as the connectivity
graph for the elements of the vector.
In a preprocessing step one can partition the vector into segments of
nearly equal size while minimizing the communication between subsets.
The resulting algorithm can have the same low communication-to-computation
ratio as our simple grid CG example above.
The second approach is not as efficient in terms of locality, but
it is a more direct implementation of the benchmark.
In this version we represent the matrix
as a distributed random sparse matrix.
More specifically, we represent the matrix as a Distributed Matrix of
Sparse Matrices.
The distributed matrix collection class in the pC++ library allows
any class that has the algebraic structure of a ring to serve as the element
type of the matrix.
Furthermore, an $n$ by $n$ matrix whose elements are $m$ by $m$ matrices has
all the same mathematical properties as an $nm$ by $nm$ matrix.
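For instance, for block matrices $P$ and $Q$ (symbols introduced here purely for illustration), the $(i,j)$ block of the product is
\[
(PQ)_{ij} = \sum_{k} P_{ik} Q_{kj},
\]
which is the familiar scalar formula with each term now a product of $m$ by $m$ element matrices.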
Indeed, this is the key property exploited by
the blocking algorithms found in most BLAS level 3 libraries.
Consequently, we can select the element class to be a sparse
matrix.
This distributed matrix of sparse matrices can then be used
like an ordinary matrix.
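To illustrate the idea in conventional C++ (this is only a sketch; the names below, such as SparseMatrix and block_multiply, are assumptions made for this example and are not the pC++ collection interface):
\begin{verbatim}
// Sketch only: any element type with ring operations (+, *) can serve as
// the element of a block matrix, so a sparse matrix works as well as a scalar.
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// A toy sparse matrix stored as a coordinate map (illustrative layout).
struct SparseMatrix {
    std::size_t n = 0;                                  // square, n x n
    std::map<std::pair<std::size_t, std::size_t>, double> entries;

    explicit SparseMatrix(std::size_t size = 0) : n(size) {}

    SparseMatrix operator+(const SparseMatrix& other) const {
        SparseMatrix result(n);
        result.entries = entries;
        for (const auto& [ij, v] : other.entries) result.entries[ij] += v;
        return result;
    }

    SparseMatrix operator*(const SparseMatrix& other) const {
        SparseMatrix result(n);
        for (const auto& [ik, a] : entries)             // a = A(i,k)
            for (std::size_t j = 0; j < n; ++j) {
                auto bkj = other.entries.find({ik.second, j});
                if (bkj != other.entries.end())
                    result.entries[{ik.first, j}] += a * bkj->second;
            }
        return result;
    }
};

// Generic block-matrix multiply: it uses only the ring operations of the
// element type, so Element can be double or SparseMatrix interchangeably.
template <typename Element>
std::vector<std::vector<Element>> block_multiply(
    const std::vector<std::vector<Element>>& A,
    const std::vector<std::vector<Element>>& B,
    Element zero) {
    std::size_t n = A.size(), m = B[0].size(), p = B.size();
    std::vector<std::vector<Element>> C(n, std::vector<Element>(m, zero));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t k = 0; k < p; ++k)
                C[i][j] = C[i][j] + A[i][k] * B[k][j];
    return C;
}
\end{verbatim}
The block_multiply template depends only on the ring operations of its element type, so instantiating it with SparseMatrix yields a block matrix that behaves, algebraically, like an ordinary matrix.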