Cyber Attack Analysis Using XGBoost with Parameter Optimization Using PSO, GA and Data Balancing Techniques Using SMOTE-ENN
Abstract
As cyber threats become increasingly sophisticated, Intrusion Detection Systems (IDS) play a critical role in securing modern networks. We present a framework that first balances the dataset with SMOTE-ENN and then tunes the hyperparameters of an XGBoost classifier with a hybrid optimization method combining Particle Swarm Optimization (PSO) and Genetic Algorithms (GA). On the classic KDD Cup 99 dataset, SMOTE-ENN addresses the class imbalance problem effectively: it improves the representation of minority classes while introducing little noise, which allows the classifier to be trained without bias toward the majority class. The hybrid PSO-GA method converges quickly with a low risk of becoming trapped in local optima, and yields a well-performing hyperparameter configuration. In the experiments, the optimized model achieved 82.63% accuracy, 85.20% recall, and an AUC of 0.83, outperforming the baseline models. In addition, the computational efficiency of the proposed framework makes it suitable for real-time applications. The framework provides an efficient and scalable optimization strategy for improving IDS and can be applied directly to practical cybersecurity scenarios.
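To make the hybrid search concrete, the following is a minimal, illustrative sketch of one way a PSO-GA loop can be organized; it is not the implementation evaluated in this work. Each iteration applies a standard PSO velocity update and then a GA step that replaces the worse half of the swarm with mutated crossover offspring of the better half. The function name `pso_ga_minimize`, the coefficient values, and the toy objective are all assumptions introduced here. In the actual framework the objective would presumably be a validation error of XGBoost trained on SMOTE-ENN-resampled data, with the hyperparameters as search dimensions.

```python
import random

def pso_ga_minimize(objective, bounds, n_particles=20, n_iters=60, seed=0,
                    w=0.7, c1=1.5, c2=1.5, mutation_rate=0.1):
    """Hypothetical hybrid PSO-GA minimizer over box-constrained parameters."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x, d):
        lo, hi = bounds[d]
        return max(lo, min(hi, x))

    # Random initial swarm with zero velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        # PSO step: pull each particle toward its personal and the global best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], d)

        # GA step: the worse half is replaced by offspring of the better half.
        order = sorted(range(n_particles), key=lambda i: objective(pos[i]))
        half = n_particles // 2
        parents = order[:half]
        for i in order[half:]:
            a, b = rng.sample(parents, 2)
            for d in range(dim):
                alpha = rng.random()  # arithmetic (blend) crossover weight
                child = alpha * pos[a][d] + (1 - alpha) * pos[b][d]
                if rng.random() < mutation_rate:  # Gaussian mutation
                    lo, hi = bounds[d]
                    child += rng.gauss(0.0, 0.1 * (hi - lo))
                pos[i][d] = clip(child, d)
            vel[i] = [0.0] * dim  # offspring start with zero velocity

        # Refresh personal and global bests after both steps.
        for i in range(n_particles):
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy stand-in for a validation error (an assumption, not real model output):
# minimized at learning_rate = 0.1 and max_depth = 6.
def toy_error(params):
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2

search_bounds = [(0.01, 0.3), (2.0, 12.0)]  # illustrative ranges only
best, best_val = pso_ga_minimize(toy_error, search_bounds)
print("best parameters:", best, "error:", best_val)
```

In this sketch the GA step supplies diversity (crossover and mutation among good candidates) while the PSO step supplies directed movement, which is the usual motivation for combining the two. Integer-valued hyperparameters such as tree depth would need rounding inside the objective.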