Measurement-Based Characterization of Large-Scale Networked Systems

Reza Motamedi

As the Internet has grown to represent arguably the largest “engineered” system on earth, network researchers have shown increasing interest in measuring this large-scale networked system. In the process, structures such as the physical Internet or the many different (logical) overlay networks that this physical infrastructure enables have been the focus of numerous studies. Many of these studies have been fueled by the ease of access to “big data”. Moreover, they have benefited from advances in the study of complex networks.

However, an important missing aspect in typical applications of complex network theory to the study of real-world distributed systems has been a general lack of attention to domain knowledge. On the one hand, missing or superficial domain knowledge can negatively affect a study's “input”; that is, limitations or idiosyncrasies of the measurement methods can render the resulting graphs difficult to interpret, if not meaningless. On the other hand, lacking or insufficient domain knowledge can result in specious “output”; that is, popular graph abstractions of real-world systems are incapable of accounting for “details” that are important from an engineering perspective.

In this thesis, we take a closer look at measurement-based characterization of a few real-world large-scale networked systems and focus on the role that domain knowledge plays in gaining a thorough understanding of these systems' key properties and behavior. More specifically, we use domain knowledge to (i) design context-aware measurement strategies that capture the relevant information about the system of interest, (ii) analyze the captured view of the networked system bearing in mind the abstraction imposed by the chosen graph representation, and (iii) scrutinize the results derived from the analysis of the graph-based representations by investigating the root causes underlying these findings. The main technical contribution of our work is twofold. First, we establish concrete connections between the amount and level of domain knowledge needed and the quality of the measurements collected from networked systems. Second, we provide concrete evidence for the role that domain knowledge plays in the analysis of views inferred from measurements collected from large-scale networked systems.