COMPARATIVE ANALYSIS OF THE EFFECTIVENESS OF SOFTWARE CODE VULNERABILITY DETECTION USING LLM AND SAST

Authors

O. Yarema, N. Zagorodna

DOI:

https://doi.org/10.28925/2663-4023.2026.32.1165

Keywords:

software, vulnerabilities, testing, artificial intelligence, LLM, static code analysis, SAST, hybrid approach

Abstract

This article substantiates the need for software code security controls based on large language models (LLMs), driven by the rapid growth in the volume of software code, the new security risks introduced by AI-generated code, and the need to integrate individual code components into complex architectural solutions. The algorithms of existing static application security testing (SAST) tools are prone to errors because they cannot fully account for code execution logic and its contextual relationships. Using an LLM as a verifier that confirms or refutes the results of static code analysis has the potential to address these shortcomings.

This paper presents a comparative analysis of the effectiveness of detecting security vulnerabilities in C# code using the Roslyn Analyzers static analysis tool, the large language models DeepSeek and Grok, and an integrated approach that combines the advantages of static analyzers and LLMs. The research methodology is based on an experimental study of a test sample of C# code fragments containing various types of security vulnerabilities. In the first stage, the code fragments were checked with Roslyn Analyzers. In the next stage, the same fragments were analyzed for vulnerabilities by the DeepSeek V3 and Grok 4.1 models. In the final stage, the effectiveness of the proposed hybrid approach was evaluated: the code is first checked by the static analyzer, and the analyzer's reports are then passed as input to the selected generative AI models.

The results show that the hybrid approach combining DeepSeek and Roslyn Analyzers improves performance metrics compared to using either tool on its own. A comparative analysis of the models used standalone also established that Grok performs worse than DeepSeek and is not the best option for tasks of this type. The study demonstrates that integrating the analytical capabilities of large language models into classical static code analysis processes, by confirming or refuting static analysis results, is a potential step toward self-correcting software security analysis.
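The hybrid workflow described above can be sketched as a small orchestration step: take one static-analyzer finding, build a verification prompt around it, and map the model's free-text reply to a confirmed/refuted verdict. This is a minimal illustrative sketch, not the authors' implementation: the `SastFinding` fields, prompt wording, and `parse_verdict` heuristic are assumptions, and the actual calls to Roslyn Analyzers, DeepSeek V3, or Grok 4.1 (vendor-specific APIs) are not shown. The diagnostic ID `CA3001` in the usage example is a real Roslyn analyzer rule for SQL-injection review, used here only as sample data.

```python
# Minimal sketch of the hybrid SAST -> LLM verification pipeline:
# a static analyzer produces findings, and an LLM is asked to confirm
# or refute each one. All names here are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class SastFinding:
    rule_id: str   # e.g. a Roslyn analyzer diagnostic ID such as "CA3001"
    line: int      # line in the analyzed fragment
    message: str   # the analyzer's diagnostic message


def build_verifier_prompt(code: str, finding: SastFinding) -> str:
    """Combine a code fragment and one SAST report into a prompt asking
    the LLM to confirm or refute the reported vulnerability."""
    return (
        "You are a security reviewer. A static analyzer reported:\n"
        f"  [{finding.rule_id}] line {finding.line}: {finding.message}\n"
        "Code under review:\n"
        f"{code}\n"
        "Answer CONFIRMED or REFUTED, with a one-sentence justification."
    )


def parse_verdict(llm_reply: str) -> bool:
    """Map the model's free-text reply to a boolean verdict:
    True = vulnerability confirmed, False = likely false positive."""
    return llm_reply.strip().upper().startswith("CONFIRMED")
```

In this scheme the LLM never scans the whole codebase on its own; it only adjudicates the analyzer's reports, which is how the hybrid approach filters the false positives that SAST tools are prone to.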


References

Shahana, A., Hasan, R., Farabi, S. F., Akter, J., Mahmud, M. A. A., Johora, F. T., & Suzer, G. (2024). AI-Driven cybersecurity: balancing advancements and safeguards. Journal of Computer Science and Technology Studies, 6(2), 76–85. https://doi.org/10.32996/jcsts.2024.6.2.9

Gnieciak, D., & Szandala, T. (2025). Large Language Models Versus Static Code Analysis Tools: A Systematic Benchmark for Vulnerability Detection. IEEE Access. https://doi.org/10.1109/access.2025.3635168

Ferrag, M. A., Battah, A., Tihanyi, N., Jain, R., Maimuţ, D., Alwahedi, F., Lestable, T., Thandi, N. S., Mechri, A., Debbah, M., & Cordeiro, L. C. (2025). SecureFalcon: are we there yet in automated software vulnerability detection with llms? IEEE Transactions on Software Engineering, 1–18. https://doi.org/10.1109/tse.2025.3548168

Ding, Y., Fu, Y., Ibrahim, O., Sitawarin, C., Chen, X., Alomair, B., Wagner, D., Ray, B., & Chen, Y. (2025). Vulnerability detection with code language models: how far are we? In 2025 IEEE/ACM 47th international conference on software engineering (ICSE) (pp. 1729–1741). IEEE. https://doi.org/10.1109/icse55347.2025.00038

Curaba, C., D'Ambrosi, D., Minisini, A., & Pérez-Campanero Antolín, N. (2024). CryptoFormalEval: Integrating LLMs and formal verification for automated cryptographic protocol vulnerability detection. arXiv.

Du, X., Wen, M., Zhu, J., Xie, Z., Ji, B., Liu, H., Shi, X., & Jin, H. (2024). Generalization-Enhanced code vulnerability detection via multi-task instruction fine-tuning. In Findings of the association for computational linguistics ACL 2024 (pp. 10507–10521). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.625

He, J., & Vechev, M. (2023). Large language models for code: security hardening and adversarial testing. In CCS '23: ACM SIGSAC conference on computer and communications security. ACM. https://doi.org/10.1145/3576915.3623175

Beljulji, E., & Matta, I. (2026). Large language models in security code review and testing. Journal of Systems Research, 5(1). https://doi.org/10.5070/sr3.62177

Shvyrov, V. V., Kapustin, D. A., Sentyay, R. N., & Shulika, T. I. (2024). Analysis of datasets and large language models for vulnerability detection in imperative programming language code. Programmnaya Ingeneria, 15(11), 555–569. https://doi.org/10.17587/prin.15.555-569

Bhandari, G., Naseer, A., & Moonen, L. (2021). CVEfixes: automated collection of vulnerabilities and their fixes from open-source software. In PROMISE '21: 17th international conference on predictive models and data analytics in software engineering. ACM. https://doi.org/10.1145/3475960.3475985

Chen, Y., Ding, Z., Alowain, L., Chen, X., & Wagner, D. (2023). DiverseVul: A new vulnerable source code dataset for deep learning based vulnerability detection. In RAID 2023: the 26th international symposium on research in attacks, intrusions and defenses. ACM. https://doi.org/10.1145/3607199.3607242

Kouliaridis, V., Karopoulos, G., & Kambourakis, G. (2025). Assessing the effectiveness of llms in android application vulnerability analysis. In Lecture notes in computer science (pp. 139–154). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-85593-1_9

Ajiga, D., Okeleke, P. A., Folorunsho, S. O., & Ezeigweneme, C. (2024). The role of software automation in improving industrial operations and efficiency. International Journal of Engineering Research Updates, 7(1), 022–035. https://doi.org/10.53430/ijeru.2024.7.1.0031

Venkatasubramanyam, R. D., Gupta, S., & Uppili, U. (2015). Assessing the effectiveness of static analysis through defect correlation analysis. In 2015 IEEE 10th international conference on global software engineering (ICGSE). IEEE. https://doi.org/10.1109/icgse.2015.18

Singh, D., Sekar, V. R., Stolee, K. T., & Johnson, B. (2017). Evaluating how static analysis tools can reduce code review effort. In 2017 IEEE symposium on visual languages and human-centric computing (VL/HCC). IEEE. https://doi.org/10.1109/vlhcc.2017.8103456

Simões, I. R. d. S., & Venson, E. (2024). Evaluating source code quality with large language models: a comparative study. In SBQS 2024: XXIII brazilian symposium on software quality (pp. 103–113). ACM. https://doi.org/10.1145/3701625.3701650

Cheirdari, F., & Karabatis, G. (2018). Analyzing false positive source code vulnerabilities using static analysis tools. In 2018 IEEE international conference on big data (big data). IEEE. https://doi.org/10.1109/bigdata.2018.8622456

Khater, H. M., Khayat, M., Alrabaee, S., Serhani, M. A., Barka, E., & Sallabi, F. (2023). AI techniques for software vulnerability detection and mitigation. In 2023 IEEE conference on dependable and secure computing (DSC). IEEE. https://doi.org/10.1109/dsc61021.2023.10354233

Adebayo, A. O. (2025). Automating security compliance in devsecops through ai-driven policy enforcement. International Journal of Science and Research Archive, 15(2), 670–675. https://doi.org/10.30574/ijsra.2025.15.2.1457

Mohammed, A. (2023). Elevating cybersecurity audits: how AI is shaping compliance and threat detection. 2(1), 1–9. https://doi.org/10.5281/zenodo.14760068

Babatunde, L. A., Etim, E. D., Essien, I. A., Cadet, E., Ajayi, J. O., Erigha, E. D., & Obuse, E. (2020). Adversarial machine learning in cybersecurity: vulnerabilities and defense strategies. Journal of Frontiers in Multidisciplinary Research, 1(2), 31–45. https://doi.org/10.54660/.jfmr.2020.1.2.31-45

GitHub models. (n.d.). GitHub. https://github.com/marketplace?type=models

Ponta, S. E., Plate, H., & Sabetta, A. (2020). Detection, assessment and mitigation of vulnerabilities in open source dependencies. Empirical Software Engineering, 25(5), 3175–3215. https://doi.org/10.1007/s10664-020-09830-x


Published

2026-03-26

How to Cite

Yarema, O., & Zagorodna, N. (2026). COMPARATIVE ANALYSIS OF THE EFFECTIVENESS OF SOFTWARE CODE VULNERABILITY DETECTION USING LLM AND SAST. Electronic Professional Scientific Journal «Cybersecurity: Education, Science, Technique», 4(32), 878–891. https://doi.org/10.28925/2663-4023.2026.32.1165