Fine-grained, Content-agnostic Network Traffic Analysis for Malicious Activity Detection

Yebo Feng

The rapid evolution of malicious activities in network environments necessitates the development of more effective and efficient detection and mitigation techniques. Traditional traffic analysis (TA) approaches have demonstrated limited efficacy and performance in detecting various malicious activities, resulting in a pressing need for more advanced solutions. To fill the gap, this dissertation proposes several new fine-grained network traffic analysis (FGTA) approaches. These approaches focus on (1) detecting previously hard-to-detect malicious activities by deducing fine-grained, detailed application-layer information in privacy-preserving manners, (2) enhancing usability by providing more explainable results and better adaptability to different network environments, and (3) combining network traffic data with endpoint information to provide users with more comprehensive and accurate protections.

We begin by conducting a comprehensive survey of existing FGTA approaches. We then propose CJ-Sniffer, a privacy-aware cryptojacking detection system that efficiently detects cryptojacking traffic. CJ-Sniffer is the first approach to distinguishing cryptojacking traffic from user-initiated cryptocurrency mining traffic, allowing for fine-grained traffic discrimination. This level of fine-grained traffic discrimination has proven challenging to accomplish through traditional TA methodologies. Next, we introduce BotFlowMon, a learning-based, content-agnostic approach for detecting online social network (OSN) bot traffic, which has posed a significant challenge for detection using traditional TA strategies. BotFlowMon is an FGTA approach that relies only on content-agnostic flow-level data as input and utilizes novel algorithms and techniques to classify social bot traffic from real OSN user traffic. To enhance the usability of FGTA-based attack detection, we propose a learning-based DDoS detection approach that emphasizes both explainability and adaptability. This approach provides network administrators with insightful explanatory information and adaptable models for new network environments. Finally, we present a reinforcement learning-based defense approach against L7 DDoS attacks, which combines network traffic data with endpoint information to operate. The proposed approach actively monitors and analyzes the victim server and applies different strategies under different conditions to protect the server while minimizing collateral damage to legitimate requests.

Our evaluation results demonstrate that the proposed approaches achieve high accuracy and efficiency in detecting and mitigating various malicious activities, while maintaining privacy-preserving features, providing explainable and adaptable results, or providing comprehensive application-layer situational awareness. This dissertation significantly advances the fields of FGTA and malicious activity detection.

This dissertation includes published and unpublished co-authored materials.