INFORMATION TECHNOLOGY FOR UKRAINIAN-LANGUAGE FAKE NEWS DETECTION IN SOCIAL NETWORKS CYBERSPACE BASED ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING METHODS
DOI:
https://doi.org/10.28925/2663-4023.2026.32.1120
Keywords:
fake news, natural language processing, machine learning, Ukrainian language, Sentence-BERT, Support Vector Machine, information security
Abstract
The article addresses the problem of automated fake news detection in the Ukrainian-language information space, which has become particularly relevant under conditions of hybrid warfare and the intensive use of disinformation as a tool of information influence. The aim of the study is to develop and experimentally evaluate an effective system for classifying Ukrainian-language news texts using natural language processing and machine learning methods. The introductory section substantiates the relevance of the research topic, defines the object and subject of the study, and formulates its goal and main objectives. The related work section provides an overview of existing approaches to fake news detection, including classical machine learning algorithms, deep neural networks, and transformer-based models, and highlights their limitations in the context of the Ukrainian language.
The theoretical section systematizes methods of automated text analysis and identifies linguistic features of Ukrainian news content, as well as challenges related to the lack of large annotated corpora and language-specific resources. The methodology section describes the complete research pipeline, including data collection, text preprocessing, cleaning, tokenization, and lemmatization using tools adapted for Ukrainian, as well as the construction of semantic vector representations based on the Sentence-BERT model. Several machine learning classifiers, namely Logistic Regression, Random Forest, Support Vector Machine, and XGBoost, were implemented and compared using cross-validation and hyperparameter optimization.
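The cleaning and tokenization steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `preprocess`, the regular expressions, and the tiny stop-word set are assumptions made for illustration only. Lemmatization and Sentence-BERT encoding, which the study performs with Ukrainian-adapted tools, are deliberately left out because they require external models.

```python
import re

# Hypothetical mini stop-word set, for illustration only; a real pipeline
# would load a full Ukrainian stop-word list.
STOP_WORDS = {"і", "в", "на", "що", "це", "та", "з", "до"}

# Matches http(s) links and bare www. links so they can be stripped.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# A token is a run of Ukrainian or Latin letters and digits, optionally
# joined by an apostrophe or hyphen (as in "п'ять" or "будь-який").
LETTERS = "а-щьюяґєіїa-z0-9"
TOKEN_RE = re.compile(rf"[{LETTERS}]+(?:['ʼ’-][{LETTERS}]+)*")


def preprocess(text, stop_words=STOP_WORDS):
    """Strip URLs, lowercase, tokenize, and drop stop words."""
    text = URL_RE.sub(" ", text)
    text = text.lower()
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in stop_words]


if __name__ == "__main__":
    sample = "Це ФЕЙК! Читайте на https://example.com про подію в Києві"
    print(preprocess(sample))
```

In a fuller pipeline the resulting tokens would be lemmatized and the cleaned text passed to a sentence encoder before classification; those stages are outside the scope of this sketch.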
The results section presents an experimental evaluation of the models using accuracy, precision, recall, F1-score, and ROC-AUC metrics. The findings demonstrate that the Support Vector Machine model achieves the best performance, reaching a classification accuracy of 93.2% even under class imbalance conditions. Computational efficiency and runtime characteristics of the proposed approach are also analysed, along with its potential for practical deployment. The conclusions summarize the main outcomes of the study, confirm that the research objectives have been achieved, and outline directions for future work, including adaptation to a multilingual environment and improving model interpretability. The proposed system can be applied in media monitoring, fact-checking initiatives, and tools aimed at strengthening information security.
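To make the evaluation metrics listed above concrete, the following sketch computes accuracy, precision, recall, F1-score, and ROC-AUC for a binary task in plain Python. The function names and the toy labels are assumptions chosen for illustration; they are not data or code from the study. The example also shows why accuracy alone can mislead when classes are imbalanced.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = fake)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy imbalanced labels: 3 fake items, 7 real items (illustrative only).
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
    # A classifier that never predicts "fake" still looks accurate.
    print(binary_metrics(y_true, [0] * 10))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

On the toy labels, the always-negative predictor reaches 70% accuracy with zero recall, which is why precision, recall, F1-score, and ROC-AUC are reported alongside accuracy for imbalanced data.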
License
Copyright (c) 2026 Марта Грудзинська, Вікторія Висоцька, Любомир Чирун

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.