IMPROVEMENT OF THE ALGORITHM FOR RESTORING THE LINGUISTIC COMPONENT OF SPEECH INFORMATION USING THE PHONEME–FORMANT METHOD FOR ASSESSING ITS SECURITY LEVEL

Authors

Nuzhnyi, S.

DOI:

https://doi.org/10.28925/2663-4023.2025.27.617632

Keywords:

phoneme–formant analysis; spectral reconstruction; triphone model; coarticulation; speech information security

Abstract

The article presents an improved algorithm for restoring the linguistic component of speech information by the phoneme–formant method, for use in assessing how well that information is protected. It reviews the historical development of articulation and formant approaches, from the classical studies at Bell Laboratories and by Harvey Fletcher to the modern digital techniques of speech intelligibility evaluation set out in the ANSI S3.5-1997 and ANSI S3.5-2007 standards. The review shows that the evolution of speech-signal analysis methods has directly shaped information protection systems, particularly the monitoring of acoustic leakage channels. A ten-step algorithm for speech signal analysis and reconstruction is proposed that integrates spectral, statistical, and contextual approaches. The algorithm includes voice activity detection (VAD), signal segmentation, formant extraction by Fourier and linear predictive analysis (FFT, LPC), and construction of a probabilistic triphone model. Accounting for phase shifts, coarticulation effects, and inter-formant intervals keeps reconstruction accurate and robust even at a low signal-to-noise ratio (SNR). Special attention is given to how spectral distortions of formants affect phoneme recognition and to the statistical regularities of diphone and triphone occurrence in Ukrainian speech. On this basis, an adaptive model is built that combines acoustic and linguistic features for more reliable restoration of the linguistic component of a message. The developed algorithm can be applied in technical information protection practice for objective evaluation of speech-channel security and for recovering damaged or noisy recordings.
Its structure fully complies with the Ukrainian national standards ND TZI 1.6-005-2013 and ND TZI 3.7-003-2023, which regulate the methodology for analyzing speech-information leakage channels and assessing protection efficiency. The results have dual practical significance: they can serve as a foundation for expert systems that control speech information leakage, and for automatic speech recognition, synthesis, and restoration systems that require high phonemic precision. The proposed phoneme–formant approach is universal and can be integrated with modern digital audio-analysis tools, improving both information security and the performance of acoustic control systems.
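The LPC-based formant extraction step named in the abstract can be illustrated with a minimal sketch. This is not the article's ten-step algorithm: it is a generic textbook procedure (autocorrelation-method LPC, then formant candidates taken from the roots of the prediction polynomial). The function names `lpc_polynomial` and `estimate_formants`, the pre-emphasis coefficient, and the frequency and bandwidth thresholds are illustrative assumptions, not values taken from the article.

```python
import numpy as np


def lpc_polynomial(frame, order):
    """Return A(z) = [1, a1, ..., ap] using the autocorrelation method."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Solve the Toeplitz normal equations R a = -r[1:], with R[i, j] = r[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def estimate_formants(frame, fs, order=10, pre_emphasis=0.97,
                      min_freq=90.0, max_bandwidth=400.0):
    """Estimate formant frequencies (Hz) of one frame from LPC pole angles.

    All default values are assumptions chosen for illustration only.
    """
    x = np.asarray(frame, dtype=float)
    # First-order pre-emphasis (set pre_emphasis=0.0 to disable).
    x = np.append(x[0], x[1:] - pre_emphasis * x[:-1])
    # Taper the frame to reduce edge effects in the autocorrelation.
    x = x * np.hamming(len(x))

    roots = np.roots(lpc_polynomial(x, order))
    # Keep one root from each complex-conjugate pair.
    roots = roots[np.imag(roots) > 0]

    freqs = np.angle(roots) * fs / (2.0 * np.pi)      # pole angle -> Hz
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi  # pole radius -> Hz

    # Discard very low or very broad resonances, which are unlikely formants.
    keep = (freqs > min_freq) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])
```

In a full pipeline this function would be called once per voiced frame selected by the VAD and segmentation stages, for example `estimate_formants(frame, fs=16000, order=18)`. A common rule of thumb, again an assumption here, is to choose the LPC order near `2 + fs / 1000`.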

References

Allen, J. B. (1996). Harvey Fletcher’s role in the creation of communication acoustics. Journal of the Acoustical Society of America, 99(4), Part 1.

Collard, J. (1929). A theoretical study of the articulation and intelligibility of a telephone circuit. Electrical Communication: A Journal of Progress in the Telephone, Telegraph and Radio Art, 7(3), 168–186.

French, N. R., & Steinberg, J. C. (1947). Factors governing the intelligibility of speech sounds. Journal of the Acoustical Society of America, 19(1), 90–119. https://doi.org/10.1121/1.1916407

Beranek, L. L. (1947). Some notes on the measurement of acoustic impedance. Journal of the Acoustical Society of America, 19(3), 420–427. https://doi.org/10.1121/1.1916499

Fletcher, H., & Galt, R. H. (1950). The perception of speech and its relation to telephony. Journal of the Acoustical Society of America, 22(2), 89–151.

Fletcher, H. (1950). A method of calculating hearing loss for speech from an audiogram. Journal of the Acoustical Society of America, 22(1), 1–5.

American National Standards Institute. (1997). ANSI S3.5-1997: Methods for the calculation of the speech intelligibility index. New York: Acoustical Society of America.

American National Standards Institute. (2007). ANSI S3.5-2007: American National Standard — Methods for calculating the speech intelligibility index. New York: Acoustical Society of America.

Anonymous. (1955). Handbook of acoustic noise control (CIA Report No. CIA-RDP81-01043R004000070005-1). Washington, D.C.: Central Intelligence Agency.

Blintsov, V., Nuzhniy, S., Parkhuts, L., & Kasianov, Y. (2018). The objectified procedure and a technology for assessing the state of complex noise speech information protection. Eastern-European Journal of Enterprise Technologies, 5(9 [95]), 26–34. https://doi.org/10.15587/1729-4061.2018.144146

Blintsov, V., & Nuzhniy, S. (2019). Improvement of the method for assessing the level of speech information protection. Eastern-European Journal of Enterprise Technologies, 6(9 [102]), 28–38. https://doi.org/10.15587/1729-4061.2019.185585

Kasianov, Yu. I., & Nuzhnyi, S. M. (2016). Evaluation of speech-like noise generator efficiency by speech intelligibility criterion. Visnyk Natsionalnoho universytetu "Lvivska politekhnika": Avtomatyka, vymiriuvannia ta keruvannia, 852, 105–110.

Dudley, H. (1939). The Voder: An electronic speech synthesizer. Bell Laboratories.

Chiba, T., & Kajiyama, M. (1941). The vowel: Its nature and structure. Tokyo: Kaiseikan.

Stevens, K. N., & House, A. S. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America, 27(4), 484–493.

Fant, G. (1960). Acoustic theory of speech production: With calculations based on X-ray studies of Russian articulations. The Hague: Mouton & Co.

Malinen, J. (2015). Formants. Aalto University. https://math.aalto.fi/~jmalinen/MyPSFilesInWeb/formants_OEL.pdf

Pirogov, A. A. (2001). Osnovy foneticheskoi teorii rechi [Fundamentals of the phonetic theory of speech]. Zhurnal Russkogo fizicheskogo obshchestva (ZhRFM), 1–12, 15–28.

Lienard, J. S., et al. (1977). Diphone synthesis of French: Vocal response unit and automatic prosody from the text. Proceedings of the International Congress on Acoustics, Speech and Signal Processing, 560–563.

Ishchenko, O. S. (2008). Acoustic characteristics of vowel sounds in modern Ukrainian literary language. Ukrainska mova, 4, 102–111.

Vakulenko, M. O. (2024). Positional variations of Ukrainian back vowel formants. Proceedings of the International Conference on Modern Research in Social Sciences, 1(1), 1–12. https://doi.org/10.33422/icmrss.v1i1.301

Vakulenko, M. O. (2018). Ukrainski holosni zvuky v konteksti MFA [Ukrainian vowels in the context of IPA]. Govor, 35(2), 189–214.

Fant, G. (1971). Acoustic theory of speech production: With calculations based on X-ray studies of Russian articulations. Hague – Berlin: Mouton / Walter de Gruyter.

Kent, R. D. (1993). Vocal tract acoustics. Journal of the Acoustical Society of America, 94(5), 2603. https://doi.org/10.1121/1.408664

Coleman, J. (2006). Acoustic structure of consonants. Oxford: University of Oxford.

Rabiner, L. R., & Schafer, R. W. (1978). Tsifrova obrobka movnykh syhnaliv [Digital processing of speech signals]. Kyiv: Prentis-Hol. https://doi.org/10.5555/5404

Fant, G. (1960). Akustychna teoriia utvorennia movlennia [Acoustic theory of speech production]. Stockholm: Mouton. https://archive.org/details/acoustictheoryofspeechproduction

Stevens, K. N. (1998). Acoustic phonetics. Cambridge: MIT Press. https://mitpress.mit.edu/9780262193980

Markel, J. D., & Gray, A. H. (1976). Linear prediction of speech. New York: Springer. https://link.springer.com/book/10.1007/978-3-642-66242-4

Ishchenko, O. S. (2003). Acoustic characteristics of Ukrainian vowels. Kyiv: Instytut movoznavstva NAN Ukrainy.

ND TZI 1.6-005-2013. (2013). Methodology for determining characteristics of speech information leakage channels. Kyiv: ADIT. https://cip.gov.ua/ndtzi

ND TZI 3.7-003-2023. (2023). Methodology for evaluating the effectiveness of protection means against speech information leakage. Kyiv: DSSZZI.

Published

2025-03-27

How to Cite

Nuzhnyi, S. (2025). IMPROVEMENT OF THE ALGORITHM FOR RESTORING THE LINGUISTIC COMPONENT OF SPEECH INFORMATION USING THE PHONEME–FORMANT METHOD FOR ASSESSING ITS SECURITY LEVEL. Electronic Professional Scientific Journal «Cybersecurity: Education, Science, Technique», 3(27). https://doi.org/10.28925/2663-4023.2025.27.617632