Comparative Analysis of Machine Learning Algorithms for Anomaly Detection in IoT Networks Using CICIoT2023 Dataset

AUGUSTO CUSTODIO VICENTE
RENATA LOPES ROSA
FREDERICO GADELHA GUIMARÃES

Abstract

Internet of Things (IoT) networks face increasing security threats due to their heterogeneous
nature and resource constraints. This study presents a comprehensive comparison of ten machine learning
algorithms for anomaly detection in IoT environments using the CICIoT2023 dataset. We evaluated six
supervised learning algorithms (Logistic Regression, Random Forest, Gradient Boosting, Linear SVC,
SGD Classifier, and MLP) and four unsupervised anomaly detection methods (Isolation Forest, SGD
One-Class SVM, Local Outlier Factor, and Elliptic Envelope) using a reproducible pipeline with Data
Version Control (DVC). Our methodology employs stratified sampling on 4.5 million records (97.7%
attacks, 2.3% benign), standardized preprocessing with 39 features, and binary classification. The
experimental framework includes rigorous statistical validation through 705 experiments across multiple
hyperparameter configurations with 5 independent runs each. Given the severe class imbalance, balanced
accuracy emerged as the critical metric, with ensemble methods (Gradient Boosting: 91.95%, Random
Forest: 91.89%) demonstrating an 8-17 percentage-point advantage over linear classifiers in minority-class
detection. Gradient Boosting achieved the highest F1-score (0.9964 ± 0.0004), while SGD-based methods
provided 200-600× faster training with competitive performance, making them suitable for
resource-constrained deployments. Bayesian statistical analysis confirmed significant performance differences across algorithm
families. This research establishes a rigorous baseline for algorithm selection in severely imbalanced IoT
intrusion detection systems.
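
To illustrate why balanced accuracy matters at the class ratio reported above (97.7% attacks, 2.3% benign), the following minimal sketch compares plain accuracy with balanced accuracy for a trivial classifier that labels every record as an attack. The 1,000-record toy sample, the label encoding (1 = attack, 0 = benign), and the helper functions are assumptions made for illustration only; they are not taken from the authors' pipeline.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical toy sample with the class ratio from the abstract:
# 977 attack records (label 1) and 23 benign records (label 0).
y_true = [1] * 977 + [0] * 23

# Trivial baseline that predicts "attack" for every record.
always_attack = [1] * len(y_true)

acc = accuracy(y_true, always_attack)
bal = balanced_accuracy(y_true, always_attack)

print(f"accuracy:          {acc:.3f}")
print(f"balanced accuracy: {bal:.3f}")
```

Under these assumptions the trivial baseline reaches 0.977 plain accuracy yet only 0.500 balanced accuracy, because it never identifies a benign record. This is the kind of gap that motivates reporting balanced accuracy on severely imbalanced data.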

Article Details

How to Cite
VICENTE, A. C., ROSA, R. L., & GUIMARÃES, F. G. (2025). Comparative Analysis of Machine Learning Algorithms for Anomaly Detection in IoT Networks Using CICIoT2023 Dataset. INFOCOMP Journal of Computer Science, 24(2), 5342. https://doi.org/10.18760/.v24i2.5342
Section
Machine Learning and Computational Intelligence