Enhancing Malware Detection by Integrating Machine Learning with Cuckoo Sandbox
##plugins.themes.bootstrap3.article.main##
Abstract
In this work, two categories of deep learning and conventional machine learning were used to classify malware using a dataset of all possible API call sequences. Specifically, the objective was to determine the best strategy to tackle the ever-rising menace as malware becomes more complex. A new dataset was created employing Cuckoo Sandbox, where API call sequences originating from both benign and malware samples were recorded. The performance of these algorithms was benchmarked and tested using this dataset, which includes SVM, RF, KNN, XGB, GBC, CNN, and RNN. The study established that both deep learning and conventional machine learning algorithms provided high accuracy above 90%. Specifically, the recurrent neural networks (RNNs) demonstrated high accuracy rates ranging from 95% to 99%. These results are highly indicative of deep learning, especially RNN, as a promising approach to improving the effectiveness of malware detection. The data obtained from dynamic analysis, when integrated into a database, serves as a more reliable source for training and testing of such models, and can improve the model’s ability to identify new threats posed by malware. Thus, this work is salient in enhancing the development of new approaches to fight malware that constantly evolve in the modern world.
Downloads
Metrics
##plugins.themes.bootstrap3.article.details##
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.