Source: “Combining Unsupervised Approaches for Near Real-Time Network Traffic Anomaly Detection”, 8 February 2022, Paper, https://www.mdpi.com/2076-3417/12/3/1759/htm

Abstract

A 0-day attack is a cyber-attack that exploits vulnerabilities that have not yet been published. Detecting the anomalous traffic generated by such attacks is vital, because an undetected attack can become a critical technical and economic problem for a smart enterprise, as for any system that depends heavily on technology. One way to predict this kind of attack is to use unsupervised machine learning approaches, which can detect anomalies without prior knowledge of them. It is also essential to identify anomalous and unknown behaviors within a network in near real-time. Three approaches are proposed and benchmarked under identical conditions: Deep Autoencoding with GMM and Isolation Forest, Deep Autoencoder with Isolation Forest, and Memory Augmented Deep Autoencoder with Isolation Forest. Each approach is thus the result of combining different unsupervised algorithms. The results show that adding the Isolation Forest improves accuracy at the cost of a longer inference time, although the increase is not large enough to be a practical problem. The paper also uses the explainable artificial intelligence methodology called Shapley Additive Explanations (SHAP) to explain which features the various models consider most important for classifying an event as an attack. Experiments were conducted on the KDD99, NSL-KDD, and CIC-IDS2017 datasets.
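To make the Isolation Forest stage more concrete, the following is a minimal pure-Python sketch of how an isolation forest scores points: random axis-aligned splits isolate unusual points in fewer steps, so a short average path length maps to a high anomaly score. It is an illustration only, not the paper's implementation. The abstract does not say which inputs the Isolation Forest receives in the combined models, so the sketch is fed toy 2-D points directly; the function names, tree count, sample size, and data are all assumptions made for this example.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful BST search among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # a constant feature cannot be split
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands, plus a correction for unsplit leaf points."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    child = left if x[dim] < split else right
    return path_length(child, x, depth + 1)


def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of data."""
    if len(data) < 2:
        raise ValueError("need at least two training points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, sample_size


def anomaly_score(forest, x):
    """Score in (0, 1]; values closer to 1 indicate easier isolation."""
    trees, sample_size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Toy "normal traffic": 200 points clustered around the origin (made-up data).
data_rng = random.Random(42)
normal_traffic = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1))
                  for _ in range(200)]

forest = fit_forest(normal_traffic)
inlier_score = anomaly_score(forest, (0.0, 0.0))
outlier_score = anomaly_score(forest, (8.0, 8.0))
print(f"inlier: {inlier_score:.3f}  outlier: {outlier_score:.3f}")
```

In this toy setup the point far from the training cluster receives the higher score. A production system would more likely rely on an existing library implementation than on a hand-written forest like this one.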

Introduction

Cyber attacks can degrade the performance of networks (corporate or otherwise), allow confidential data to be accessed and modified, and compromise the security of virtually any infrastructure belonging to the network. This can have a significant economic impact on a smart enterprise. Zero-day attacks are unknown, never-before-seen attacks that exploit undisclosed and unpublished flaws, and they reflect the widely held view that “you cannot protect against what is unknown” [1]. It is therefore of fundamental importance to create algorithms that can predict 0-day attacks automatically and quickly (in near real-time). Time is a crucial parameter for such attacks: in a system with a large volume of traffic, if the traffic cannot be analyzed in a reasonable time, it accumulates and a bottleneck is generated. A near real-time anomaly detection system must account both for the processing speed of the algorithm and for the pattern variations caused by the different kinds of network traffic that can be monitored in a real context. In particular, network traffic varies with holidays, business hours, and times of the year with higher or lower network activity [2]. The system must therefore be able to respond dynamically to changes in the monitored network traffic.
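The bottleneck argument can be illustrated with simple arithmetic: whenever events arrive faster than the detector can score them, the queue of unanalyzed traffic grows linearly with time. The sketch below assumes constant rates, and the function name and all numbers are hypothetical values chosen for illustration, not measurements from the paper.

```python
def backlog_after(arrival_rate, processing_rate, duration_seconds):
    """Unanalyzed events left after duration_seconds, assuming constant rates.

    arrival_rate and processing_rate are in events per second.
    """
    growth_per_second = max(0.0, arrival_rate - processing_rate)
    return growth_per_second * duration_seconds


# Hypothetical detector that keeps pace: 1000 events/s in, 2000 events/s scored.
keeps_up = backlog_after(arrival_rate=1000, processing_rate=2000,
                         duration_seconds=60)

# Hypothetical detector that falls behind: 1000 events/s in, 500 events/s scored.
falls_behind = backlog_after(arrival_rate=1000, processing_rate=500,
                             duration_seconds=60)

print(keeps_up, falls_behind)
```

Under these assumed rates the slower detector accumulates a backlog every second it runs, which is why inference time matters alongside accuracy.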
