METHOD OF ADAPTIVE SELECTION AND WEIGHTING OF FALSE INFORMATION INDICATORS TO IMPROVE DETECTION EFFICIENCY UNDER HYBRID WARFARE CONDITIONS

Dmytro Dzhendzhero; Volodymyr Nakonechnyi

doi:10.28925/2663-4023.2026.33.1166

Authors

Dmytro Dzhendzhero, Taras Shevchenko National University of Kyiv, https://orcid.org/0009-0007-9999-850X
Volodymyr Nakonechnyi, Taras Shevchenko National University of Kyiv, https://orcid.org/0000-0002-0247-5400

Keywords:

false information, fake news detection, hybrid warfare, feature selection, TF-IDF, logistic regression, risk triage, information security

Abstract

The article addresses the problem of improving the detection of false information in text messages under hybrid warfare conditions. The study is relevant because disinformation campaigns in the modern information environment are used to undermine trust in public institutions, distort the perception of events, destabilize public attitudes, and put additional pressure on decision-making systems. The introduction substantiates the need for an interpretable and resource-efficient method suited to large, dynamic, and imbalanced message streams. The problem statement shows that using the full feature space increases computational complexity, reduces interpretability, and does not support prioritizing message verification by risk level. The section on recent studies and publications summarizes current approaches to false information detection, including content-based, fact-based, behavioral, and hybrid models, as well as approaches to disinformation risk assessment in wartime. The theoretical foundations systematize the main concepts related to feature selection, construction of the text feature space, term weighting, classification quality assessment, and the use of precision-recall (PR) and ROC curves under class imbalance. On this basis, the conceptual framework of the proposed method of adaptive selection and weighting of false information indicators is formed. The methodology section describes an experimental evaluation on an open Ukrainian-language news corpus covering the events of Russia’s war against Ukraine. After data cleaning, removal of empty records and duplicates, and application of a minimum length filter of 200 characters, the dataset comprised 29,372 messages: 353 false and 29,019 true. Texts were represented by TF-IDF features over unigrams and bigrams, and logistic regression was selected as the baseline classifier.
Feature selection used chi-square scoring followed by top-K selection, with several values of K compared on the validation set; the final working configuration was K=5000. For practical decision support, a three-level risk triage scheme was introduced based on the 80th and 95th percentiles of the validation score distribution. The results section shows that reducing the feature space from 30,000 to 5,000 features does not substantially reduce classification quality: on the test set, F1 decreases only from 0.768 to 0.760, and PR-AUC decreases from 0.819 to 0.805. The proposed triage procedure also demonstrates practical value: the high-risk group covered 256 of the 4,406 messages in the test set and contained 50 of the 53 false messages, whereas the low-risk group contained only 1 false message. The conclusions argue that the proposed method can be used as an interpretable and resource-efficient component of information space monitoring systems. Further research should focus on extending the feature set with semantic, source-based, and behavioral indicators, and on testing the method on additional Ukrainian-language corpora.
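The three-level triage step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_triage_thresholds`, `triage`, `summarize`), the treatment of scores that fall exactly on a cut-off, the linear interpolation of percentiles, and the toy scores are all hypothetical. Only the use of the 80th and 95th percentiles of the validation score distribution comes from the abstract.

```python
from collections import Counter
from statistics import quantiles


def fit_triage_thresholds(val_scores, medium_pct=80, high_pct=95):
    """Return (medium, high) cut-offs taken from validation-score percentiles."""
    # quantiles(..., n=100) yields the 1st..99th percentiles; "inclusive"
    # interpolates linearly between the observed minimum and maximum.
    cuts = quantiles(val_scores, n=100, method="inclusive")
    return cuts[medium_pct - 1], cuts[high_pct - 1]


def triage(score, thresholds):
    """Map one classifier score to a risk level (boundary handling is assumed)."""
    medium_cut, high_cut = thresholds
    if score >= high_cut:
        return "high"
    if score >= medium_cut:
        return "medium"
    return "low"


def summarize(scores, labels, thresholds):
    """Count messages and false messages (label 1) per risk level."""
    total = Counter()
    false_hits = Counter()
    for score, label in zip(scores, labels):
        level = triage(score, thresholds)
        total[level] += 1
        false_hits[level] += label
    return {lvl: (total[lvl], false_hits[lvl]) for lvl in ("low", "medium", "high")}


# Toy validation scores, evenly spread over [0, 1]; not data from the study.
val_scores = [i / 100 for i in range(101)]
thresholds = fit_triage_thresholds(val_scores)

# Toy test scores with labels (1 = false message, 0 = true message).
test_scores = [0.10, 0.50, 0.85, 0.90, 0.97, 0.99]
test_labels = [0, 0, 0, 1, 1, 1]
report = summarize(test_scores, test_labels, thresholds)
```

Fixing the cut-offs on validation scores and only then applying them to test scores keeps the thresholds independent of the data they are evaluated on. Read the same way, the figures reported in the abstract mean that the high-risk group amounts to 256/4,406 ≈ 5.8% of the test messages while containing 50/53 ≈ 94.3% of the false ones.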
