INFORMATION TECHNOLOGY FOR UKRAINIAN-LANGUAGE FAKE NEWS DETECTION IN SOCIAL NETWORKS CYBERSPACE BASED ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING METHODS

Marta Hrudzynska; Victoria Vysotska; Lyubomyr Chyrun

doi:10.28925/2663-4023.2026.32.1120

Authors

Marta Hrudzynska Lviv Polytechnic National University https://orcid.org/0009-0004-5183-2269
Victoria Vysotska Lviv Polytechnic National University https://orcid.org/0000-0001-6417-3689
Lyubomyr Chyrun Lviv Polytechnic National University https://orcid.org/0000-0002-9448-1751

Keywords:

fake news, natural language processing, machine learning, Ukrainian language, Sentence-BERT, Support Vector Machine, information security

Abstract

The article addresses the problem of automated fake news detection in the Ukrainian-language information space, which has become particularly relevant under conditions of hybrid warfare and the intensive use of disinformation as a tool of information influence. The aim of the study is to develop and experimentally evaluate an effective system for classifying Ukrainian-language news texts using natural language processing and machine learning methods. The introductory section substantiates the relevance of the research topic, defines the object and subject of the study, and formulates its goal and main objectives. The related work section provides an overview of existing approaches to fake news detection, including classical machine learning algorithms, deep neural networks, and transformer-based models, and highlights their limitations in the context of the Ukrainian language.

The theoretical section systematizes methods of automated text analysis and identifies linguistic features of Ukrainian news content, as well as challenges related to the lack of large annotated corpora and language-specific resources. The methodology section describes the complete research pipeline, including data collection, text preprocessing, cleaning, tokenization, and lemmatization using tools adapted for Ukrainian, as well as the construction of semantic vector representations based on the Sentence-BERT model. Several machine learning classifiers, namely Logistic Regression, Random Forest, Support Vector Machine, and XGBoost, were implemented and compared using cross-validation and hyperparameter optimization.
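The cleaning and tokenization steps named above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `preprocess`, the regular expressions, and the ten-word stop list are assumptions made for illustration and are not taken from the article. Lemmatization is deliberately left out, since it depends on an external toolkit adapted for Ukrainian.

```python
import re

# Hypothetical miniature stop-word list (assumption); a real run would load
# a full Ukrainian stop-word list from a file.
STOP_WORDS = {"і", "та", "в", "у", "на", "що", "це", "з", "до", "не"}

# Matches web links so they can be stripped before tokenization.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Lowercase Ukrainian letters, optionally joined by an apostrophe or hyphen
# inside a word (for example "м'яку" or "будь-який").
TOKEN_RE = re.compile(r"[а-щьюяєіїґ]+(?:['’ʼ-][а-щьюяєіїґ]+)*")


def preprocess(text):
    """Lowercase the text, drop URLs, tokenize, and remove stop words."""
    cleaned = URL_RE.sub(" ", text.lower())
    tokens = TOKEN_RE.findall(cleaned)
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    sample = "Це фейк! Читайте на https://example.com про м'яку силу."
    print(preprocess(sample))
```

In a pipeline of the kind described, the resulting tokens would then be lemmatized and the text passed to a sentence encoder; those stages are outside this sketch.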

The results section presents an experimental evaluation of the models using accuracy, precision, recall, F1-score, and ROC-AUC metrics. The findings demonstrate that the Support Vector Machine model achieves the best performance, reaching a classification accuracy of 93.2% even under class imbalance conditions. Computational efficiency and runtime characteristics of the proposed approach are also analysed, along with its potential for practical deployment. The conclusions summarize the main outcomes of the study, confirm that the research objectives have been achieved, and outline directions for future work, including adaptation to a multilingual environment and improving model interpretability. The proposed system can be applied in media monitoring, fact-checking initiatives, and tools aimed at strengthening information security.
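As a rough illustration of the metrics listed above, the following standard-library sketch computes accuracy, precision, recall, F1-score, and ROC-AUC for a binary task. Treating label 1 as "fake", the 0.5 decision threshold, and the toy labels and scores are all assumptions for illustration; none of the numbers come from the article. ROC-AUC is computed here as the share of (positive, negative) pairs ranked correctly, with ties counted as half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, imbalanced example (3 positives, 5 negatives); values are made up.
    y_true = [1, 0, 0, 0, 1, 0, 0, 1]
    scores = [0.9, 0.2, 0.4, 0.6, 0.7, 0.1, 0.3, 0.4]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Under class imbalance, accuracy alone can look high even when the minority class is poorly detected, which is one reason to report precision, recall, F1-score, and ROC-AUC alongside it.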
