A FEATURE SELECTION METHOD FOR AN INTRUSION DETECTION SYSTEM USING AN ENSEMBLE APPROACH AND FUZZY LOGIC

Yevhen Chychkarov; Olga Zinchenko; Andriy Bondarchuk; Liudmyla Aseeva

doi:10.28925/2663-4023.2023.21.234251

Authors

Yevhen Chychkarov State University of Information and Communication Technologies https://orcid.org/0000-0002-4362-5129
Olga Zinchenko State University of Information and Communication Technologies https://orcid.org/0000-0002-3973-7814
Andriy Bondarchuk State University of Information and Communication Technologies https://orcid.org/0000-0001-5124-5102
Liudmyla Aseeva State University of Information and Communication Technologies https://orcid.org/0000-0001-5954-4211

DOI:

https://doi.org/10.28925/2663-4023.2023.21.234251

Keywords:

intrusion detection system; machine learning; ensemble learning; classifier; fuzzy logic; cyber attack; cyber defense using machine learning; feature selection algorithms

Abstract

This study proposes a new method for constructing a set of important features for classification problems. The method uses an ensemble of feature-importance estimators and aggregates the ensemble's final result with fuzzy logic algorithms. The estimators were statistical criteria (chi2, f_classif, the correlation coefficient), mean decrease in impurity (MDI), and the mutual information criterion (mutual_info_classif). On all data sets, reducing the number of features affects the accuracy of the estimate according to the criterion of the mean reduction in classification error. As long as the training feature set retains the features with the greatest influence, model accuracy stays at its initial level; once at least one high-impact feature is excluded, accuracy drops noticeably. For all data sets studied, the best classification results came from tree-based or nearest-neighbor classifiers: DecisionTreeClassifier, ExtraTreeClassifier, and KNeighborsClassifier. Excluding non-essential features from the model noticeably speeds up training (by up to 60-70%). Ensemble learning was used to increase the accuracy of the estimate. A VotingClassifier built from the algorithms with the fastest training provided the best training-speed figures. Future work aims to further improve the proposed IDS model by refining the selection of classifiers, tuning the parameters of the selected classifiers, and improving the strategy for combining the results of individual classifiers. The model's ability to detect individual attack types through multi-class prediction is also of significant interest.
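
The idea of fusing several feature-importance estimators with fuzzy logic can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the membership functions or the fusion rule, so the ramp-shaped "high importance" set, the min/max/mean operators, the feature names and the raw scores below are all assumptions chosen for illustration. In practice the raw scores would come from estimators such as chi2, f_classif, mutual_info_classif or MDI.

```python
from statistics import mean


def min_max(scores):
    """Rescale raw importance scores to [0, 1]; a constant vector maps to zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def high_membership(x, lo=0.2, hi=0.8):
    """Degree to which a normalized score belongs to the fuzzy set 'high importance'.

    The ramp between lo and hi is an assumed shape, not taken from the paper.
    """
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def fuse(scores_by_estimator, feature_names):
    """Combine the per-estimator membership degrees of every feature."""
    memberships = [
        [high_membership(v) for v in min_max(raw)]
        for raw in scores_by_estimator.values()
    ]
    fused = {}
    for j, name in enumerate(feature_names):
        degrees = [m[j] for m in memberships]
        fused[name] = {
            "and": min(degrees),    # fuzzy AND: every estimator rates it high
            "or": max(degrees),     # fuzzy OR: at least one estimator rates it high
            "mean": mean(degrees),  # compensatory average of the degrees
        }
    return fused


def rank(fused, key="mean"):
    """Order feature names from most to least important under the chosen operator."""
    return sorted(fused, key=lambda name: fused[name][key], reverse=True)


# Hypothetical feature names and made-up raw scores, for illustration only.
feature_names = ["feat_a", "feat_b", "feat_c", "feat_d"]
raw_scores = {
    "chi2": [10.0, 250.0, 120.0, 5.0],
    "f_classif": [2.0, 40.0, 35.0, 1.0],
    "mutual_info_classif": [0.05, 0.60, 0.30, 0.02],
    "mdi": [0.10, 0.50, 0.35, 0.05],
}

fused = fuse(raw_scores, feature_names)
ranking = rank(fused)
print(ranking)
```

Normalizing each estimator's scores before fuzzification keeps estimators with very different scales (for example chi2 statistics versus mutual information) from dominating the fused ranking. The choice of operator controls strictness: "and" keeps only features that every estimator rates highly, while "mean" tolerates disagreement among estimators.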

DETECTION OF NETWORK INTRUSIONS USING MACHINE LEARNING ALGORITHMS AND FUZZY LOGIC
