Performance evaluation of machine learning models for detection of Mirai infection in IoT devices
Abstract
The Internet of Things (IoT) has brought in a new age of data connectivity and has applications across a wide range of industries. The upsurge in IoT devices has been accompanied by a rise in IoT-based botnet attacks. IoT devices are particularly vulnerable to the Mirai malware, the botnet behind some of the most severe Distributed Denial-of-Service (DDoS) attacks ever recorded. Due to the massive scale and heterogeneity of IoT deployments, traditional security solutions have proven ineffective. Machine Learning (ML) approaches, however, can provide a more suitable basis for IoT security. In this research, Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM) and Logistic Regression (LR) models were used to detect the presence of Mirai in IoT devices. Data were extracted from the N-BaIoT dataset, which contains benign and malicious traffic from IoT devices infected with Mirai. The models are compared on their precision, recall, accuracy and F1-scores. The results showed that the DT and RF models performed very well, the SVM model performed above average, and the LR model performed only moderately. For the three IoT devices of interest in this paper, LR and SVM had average F1-scores of 0.63 and 0.80 respectively, while DT and RF both had an average F1-score of 0.99. ML models based on decision trees and random forests are therefore an effective way to reliably detect Mirai in IoT devices.
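For readers unfamiliar with the four evaluation metrics named above, the following minimal Python sketch shows how precision, recall, accuracy and F1-score follow from the counts of a binary confusion matrix, with Mirai traffic treated as the positive class. The class and function names, the example counts, and the use of a plain arithmetic mean to average F1 across devices are all assumptions made for illustration; they are not taken from the paper's code or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, positive = Mirai)."""
    tp: int  # malicious traffic correctly flagged
    fp: int  # benign traffic wrongly flagged
    tn: int  # benign traffic correctly passed
    fn: int  # malicious traffic missed


def precision(c: ConfusionCounts) -> float:
    # Fraction of flagged samples that were really malicious.
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    # Fraction of malicious samples that were flagged.
    malicious = c.tp + c.fn
    return c.tp / malicious if malicious else 0.0


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all samples classified correctly.
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_f1(per_device: Iterable[ConfusionCounts]) -> float:
    # Assumption: the per-device average is a simple arithmetic mean.
    scores = [f1_score(c) for c in per_device]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up counts for a single device, for illustration only.
example = ConfusionCounts(tp=90, fp=10, tn=80, fn=20)
```

Calling `f1_score(example)` on the made-up counts combines a precision of 90/100 with a recall of 90/110; `mean_f1` would then be applied to one `ConfusionCounts` per device to obtain a single per-model figure comparable to the averages reported above.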