Source: “A study on malware detection and classification using the analysis of API calls sequences through shallow learning and recurrent neural networks”, Paper, https://ceur-ws.org/Vol-3260/paper9.pdf
Malware detection and classification is a critical issue in cybersecurity. Signature-based systems cannot detect attacks that use zero-day malware. One approach that can detect unknown attacks is to analyze the sequence of API calls made by the executable. Such information can be extracted through static and dynamic analysis methods in a sandbox environment. This work analyzes different techniques for detecting malware and then classifying it by family.
Machine Learning algorithms based on trees are compared with Deep Learning algorithms based on Recurrent Neural Networks. The results obtained lead to choosing an algorithm based on RNNs for malware detection and an algorithm based on trees for malware classification.
The constant growth in the development and diffusion of new technologies exposes systems to potential risks. Among the main risks is malware: malicious software created to steal, spy, or, more generally, damage infected systems.
To mitigate this risk, it is necessary to adopt tools that can detect, classify, and block these threats in time. The literature shows that most malware detection systems rely on static analysis techniques such as signature verification, which are effective against known malicious software but ineffective against new malware, since its signatures are not yet available.
An alternative approach is to use information from a sandbox (such as CAPEv2), which provides dynamic analysis of applications. The sandbox software collects the extracted information in a report for each analyzed file. Each report contains the ordered sequence of API calls made.
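To make the idea of an ordered API call sequence concrete, the following is a minimal sketch of how such a sequence might be read from a sandbox report and turned into integer identifiers suitable as model input. It is not the authors' pipeline. The report layout (a `behavior` section listing `processes`, each with a `calls` list whose entries carry an `api` field) is an assumption modeled on Cuckoo-style JSON reports, and the helper names `extract_api_sequence` and `encode_sequence`, the sample data, and the integer encoding are hypothetical.

```python
from typing import Any, Dict, List


def extract_api_sequence(report: Dict[str, Any]) -> List[str]:
    """Collect API call names in the order they appear in the report.

    Assumed layout: report["behavior"]["processes"][i]["calls"][j]["api"].
    Missing sections are treated as empty rather than raising an error.
    """
    sequence: List[str] = []
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            name = call.get("api")
            if name:
                sequence.append(name)
    return sequence


def encode_sequence(sequence: List[str], vocabulary: Dict[str, int]) -> List[int]:
    """Map each API name to an integer id, extending the vocabulary as needed.

    Ids start at 1 so that 0 stays free for padding (an illustrative choice).
    """
    return [vocabulary.setdefault(name, len(vocabulary) + 1) for name in sequence]


# Hypothetical, hand-written report fragment used only for illustration.
sample_report = {
    "behavior": {
        "processes": [
            {
                "process_name": "sample.exe",
                "calls": [
                    {"api": "NtCreateFile"},
                    {"api": "NtWriteFile"},
                    {"api": "NtClose"},
                    {"api": "NtCreateFile"},
                ],
            }
        ]
    }
}

sequence = extract_api_sequence(sample_report)
vocabulary: Dict[str, int] = {}
encoded = encode_sequence(sequence, vocabulary)
print(sequence)
print(encoded)
```

A real report would normally be loaded from disk with `json.load` before being passed to `extract_api_sequence`. Reusing the same `vocabulary` dictionary across all reports keeps the identifiers consistent between samples.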