Skip Navigation Text:

Directed Research Project Details

Unbiased Sampling over Online Social Networks

Author:Mojtaba Torkjazi
Date:June 05, 2009
Time:10:00
Location:220 Deschutes
Committee:Reza Rejaie (Chair)
Jun Li
Allen Malony

Abstract

During recent years, Online Social Networks (OSNs) have evolved in many ways and attracted millions of users. The dramatic increase in the popularity of OSNs has encouraged network researchers to examine their connectivity structure. The majority of empirical studies for characterizing OSN connectivity graphs have analyzed snapshots of the system taken in different times. These snapshots are collected by measurements that crawl OSN connectivity graphs. However, OSN owners are often unwilling to expose their user information due to privacy concerns. On the other hand, because of large population and dynamics of OSNs, the task of crawling may result in an incomplete or distorted snapshot of the system.

These challenges have clearly heightened the urgency for developing efficient and accurate graph sampling techniques. Although a couple of techniques, such as Metropolized Random Walk (MRW) and Respondent-Driven Sampling (RDS), have been proposed for P2P systems, little is known about their accuracy over connectivity graph of OSNs. In this paper, we focus on MRW and RDS sampling techniques and thoroughly investigate their performance in sampling OSN systems. Our main findings can be summarized as follows: (i) both techniques are sensitive to graph structure, but RDS exhibits better performance; (ii) heterogeneous degree distribution and high number of "unbalanced" edges in OSN graphs are the main factors for poor performance of MRW over such graphs; (iii) RDS is unable to properly sample low degree nodes which are hard to reach; and (iv) OSN graphs composes of a dense core in the middle and a large number of partitions hanging from the core. High internal and low external connectivity of this core make it almost infeasible for sampling technique to explore all regions of OSN graphs.