Committee: Reza Rejaie (chair), Daniel Lowd, Arthur Farley
Directed Research Project(Aug 2011)
Keywords: large graphs; multiscale graph analysis; randon walk; social networks
The structure of networked systems such as online social networks (OSN) or power grids is often represented with a graph. Characterizing the connectivity features of such a graph reveals the structural properties of the corresponding system. In particular, identifying tightly connected regions (i.e. clusters) in a graph provides a valuable insight about its connectivity structure. However, existing cluster detection techniques often require the entire graph, only detect clusters with a relatively small diameter and do not gracefully scale to large graphs.
This project presents a new technique, called WalkAbout, for identifying tightly connected regions in large graphs using short random walks. The key idea is that the ratio of visit (by short random walks) to degree for individual nodes can be used to distinguish nodes in different regions of a graph. We describe the technique, its key features and limitations and perform several validations. Using WalkAbout, we characterize regional connectivity of four large OSNs and demonstrate that the structure of these systems composes of a couple of pronounced regions. Finally, we explore the underlying causes for the formation of these regions in our target OSNs.