Identifying Strongly Connected Components onDistributed Networks

Sudharshan Srinivasan

Strongly connected components (SCC) are an essential property for understanding the structure of directed networks. Given that many real-world networks are significant, it is often computationally efficient to partition the network over many distributed systems and solve for SCC simultaneously over the partitioned network. In this paper, we present an algorithm for identifying SCC on distributed systems. Our algorithm comprises of three phases. In the first phase, we locally perform SCC over all partitions while in the second phase, we update the partial results across the partitions in a reduced graph form. Lastly, we reapply SCC over the reduced graph at all partitions. The computations within individual partitions are highly scalable using both shared-memory CPU and GPU threads. Due to its distributed nature, our algorithm allows for analysis of much larger networks than is previously possible. Comparisons with state-of-the-art baseline approaches are made to analyze the algorithm. Our results show performance speedups up to 3.1x and memory reductions up to 2.6x with respect to serial implementations.