{"id":14826,"date":"2022-12-24T15:02:51","date_gmt":"2022-12-24T13:02:51","guid":{"rendered":"http:\/\/journals.khnu.km.ua\/vestnik\/?p=14826"},"modified":"2022-12-26T19:58:07","modified_gmt":"2022-12-26T17:58:07","slug":"multymodalne-rozpiznavannya-movlennya-na-osnovi-zvukovyh-i-tekstovyh-danyh","status":"publish","type":"post","link":"https:\/\/journals.khnu.km.ua\/vestnik\/?p=14826","title":{"rendered":"\u041c\u0443\u043b\u044c\u0442\u0438\u043c\u043e\u0434\u0430\u043b\u044c\u043d\u0435 \u0440\u043e\u0437\u043f\u0456\u0437\u043d\u0430\u0432\u0430\u043d\u043d\u044f \u043c\u043e\u0432\u043b\u0435\u043d\u043d\u044f \u043d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 \u0437\u0432\u0443\u043a\u043e\u0432\u0438\u0445 \u0456 \u0442\u0435\u043a\u0441\u0442\u043e\u0432\u0438\u0445 \u0434\u0430\u043d\u0438\u0445"},"content":{"rendered":"<p style=\"text-align: center;\"><!--more--><br \/>\n\u041c\u0423\u041b\u042c\u0422\u0418\u041c\u041e\u0414\u0410\u041b\u042c\u041d\u0415 \u0420\u041e\u0417\u041f\u0406\u0417\u041d\u0410\u0412\u0410\u041d\u041d\u042f \u041c\u041e\u0412\u041b\u0415\u041d\u041d\u042f \u041d\u0410 \u041e\u0421\u041d\u041e\u0412\u0406 \u0417\u0412\u0423\u041a\u041e\u0412\u0418\u0425 \u0406 \u0422\u0415\u041a\u0421\u0422\u041e\u0412\u0418\u0425 \u0414\u0410\u041d\u0418\u0425<br \/>\nMULTIMODAL SPEECH RECOGNITION BASED ON AUDIO AND TEXT DATA<\/p>\n<p><strong>\u0421\u0442\u043e\u0440\u0456\u043d\u043a\u0438: 22-25. 
\u041d\u043e\u043c\u0435\u0440: \u21165, 2022 (313)\u00a0\u00a0 <\/strong> <a href=\"http:\/\/journals.khnu.km.ua\/vestnik\/wp-content\/uploads\/2022\/12\/vknu-ts-2022-n5313-22-25.pdf\"> <img loading=\"lazy\" class=\"size-full wp-image-69 alignnone\" src=\"http:\/\/journals.khnu.km.ua\/vestnik\/wp-content\/uploads\/2021\/01\/pdf.png\" alt=\"\" width=\"76\" height=\"32\" \/><\/a><br \/>\n<strong>DOI:<\/strong> <a href=\"https:\/\/www.doi.org\/10.31891\/2307-5732-2022-313-5-22-25\">https:\/\/www.doi.org\/10.31891\/2307-5732-2022-313-5-22-25<\/a><br \/>\n<strong>\u0410\u0432\u0442\u043e\u0440\u0438: <\/strong>\u0411\u0410\u0421\u0418\u0421\u0422\u042e\u041a \u041e\u043b\u0435\u0433<br \/>\n\u041d\u0430\u0446\u0456\u043e\u043d\u0430\u043b\u044c\u043d\u0438\u0439 \u0443\u043d\u0456\u0432\u0435\u0440\u0441\u0438\u0442\u0435\u0442 \u00ab\u041b\u044c\u0432\u0456\u0432\u0441\u044c\u043a\u0430 \u043f\u043e\u043b\u0456\u0442\u0435\u0445\u043d\u0456\u043a\u0430\u00bb<br \/>\n<a href=\"https:\/\/orcid.org\/0000-0003-0064-6584\">https:\/\/orcid.org\/0000-0003-0064-6584<\/a><br \/>\ne-mail: oleh.a.basystiuk@lpnu.com<br \/>\n\u041c\u0415\u041b\u042c\u041d\u0418\u041a\u041e\u0412\u0410 \u041d\u0430\u0442\u0430\u043b\u0456\u044f<br \/>\n\u041d\u0430\u0446\u0456\u043e\u043d\u0430\u043b\u044c\u043d\u0438\u0439 \u0443\u043d\u0456\u0432\u0435\u0440\u0441\u0438\u0442\u0435\u0442 \u00ab\u041b\u044c\u0432\u0456\u0432\u0441\u044c\u043a\u0430 \u043f\u043e\u043b\u0456\u0442\u0435\u0445\u043d\u0456\u043a\u0430\u00bb<br \/>\n<a href=\"https:\/\/orcid.org\/0000-0002-2114-3436\">https:\/\/orcid.org\/0000-0002-2114-3436<\/a><br \/>\ne-mail: nataliia.i.melnykova@lpnu.ua<br \/>\nBASYSTIUK Oleh, MELNYKOVA Nataliia<br \/>\nLviv Polytechnic National University<\/p>\n<p style=\"text-align: center;\"><strong>\u0410\u043d\u043e\u0442\u0430\u0446\u0456\u044f \u043c\u043e\u0432\u043e\u044e \u043e\u0440\u0438\u0433\u0456\u043d\u0430\u043b\u0443<\/strong><\/p>\n<p>\u0413\u043b\u0438\u0431\u043e\u043a\u0435 
\u043d\u0430\u0432\u0447\u0430\u043d\u043d\u044f \u043f\u043e\u0432\u043d\u0456\u0441\u0442\u044e \u0437\u043c\u0456\u043d\u0438\u043b\u043e \u043f\u0456\u0434\u0445\u0456\u0434 \u0434\u043e \u043c\u0430\u0448\u0438\u043d\u043d\u043e\u0433\u043e \u043f\u0435\u0440\u0435\u043a\u043b\u0430\u0434\u0443. \u0414\u043e\u0441\u043b\u0456\u0434\u043d\u0438\u043a\u0438 \u0432 \u0433\u0430\u043b\u0443\u0437\u0456 \u0433\u043b\u0438\u0431\u043e\u043a\u043e\u0433\u043e \u043d\u0430\u0432\u0447\u0430\u043d\u043d\u044f \u0441\u0442\u0432\u043e\u0440\u0438\u043b\u0438 \u043f\u0440\u043e\u0441\u0442\u0456 \u0440\u0456\u0448\u0435\u043d\u043d\u044f \u043d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 \u043c\u0430\u0448\u0438\u043d\u043d\u043e\u0433\u043e \u043d\u0430\u0432\u0447\u0430\u043d\u043d\u044f, \u044f\u043a\u0456 \u043f\u0435\u0440\u0435\u0432\u0435\u0440\u0448\u0443\u044e\u0442\u044c \u043d\u0430\u0439\u043a\u0440\u0430\u0449\u0456 \u0435\u043a\u0441\u043f\u0435\u0440\u0442\u043d\u0456 \u0441\u0438\u0441\u0442\u0435\u043c\u0438. \u0423 \u0446\u0456\u0439 \u0440\u043e\u0431\u043e\u0442\u0456 \u0440\u043e\u0437\u0433\u043b\u044f\u043d\u0443\u0442\u043e \u043e\u0441\u043d\u043e\u0432\u043d\u0456 \u043e\u0441\u043e\u0431\u043b\u0438\u0432\u043e\u0441\u0442\u0456 \u043c\u0430\u0448\u0438\u043d\u043d\u043e\u0433\u043e \u043f\u0435\u0440\u0435\u043a\u043b\u0430\u0434\u0443 \u043d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 \u0440\u0435\u043a\u0443\u0440\u0435\u043d\u0442\u043d\u0438\u0445 \u043d\u0435\u0439\u0440\u043e\u043d\u043d\u0438\u0445 \u043c\u0435\u0440\u0435\u0436. 
\u0423 \u0441\u0442\u0430\u0442\u0442\u0456 \u0442\u0430\u043a\u043e\u0436 \u0432\u0438\u0441\u0432\u0456\u0442\u043b\u0435\u043d\u043e \u043f\u0435\u0440\u0435\u0432\u0430\u0433\u0438 \u0441\u0438\u0441\u0442\u0435\u043c \u043d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 RNN, \u0449\u043e \u0432\u0438\u043a\u043e\u0440\u0438\u0441\u0442\u043e\u0432\u0443\u044e\u0442\u044c \u043c\u043e\u0434\u0435\u043b\u044c \u043f\u043e\u0441\u043b\u0456\u0434\u043e\u0432\u043d\u043e\u0441\u0442\u0456 \u0434\u043e \u043f\u043e\u0441\u043b\u0456\u0434\u043e\u0432\u043d\u043e\u0441\u0442\u0456, \u043f\u043e\u0440\u0456\u0432\u043d\u044f\u043d\u043e \u0437\u0456 \u0441\u0442\u0430\u0442\u0438\u0441\u0442\u0438\u0447\u043d\u0438\u043c\u0438 \u0441\u0438\u0441\u0442\u0435\u043c\u0430\u043c\u0438 \u0442\u0440\u0430\u043d\u0441\u043b\u044f\u0446\u0456\u0457. \u0414\u0432\u0456 \u0441\u0438\u0441\u0442\u0435\u043c\u0438 \u043c\u0430\u0448\u0438\u043d\u043d\u043e\u0433\u043e \u043f\u0435\u0440\u0435\u043a\u043b\u0430\u0434\u0443, \u0437\u0430\u0441\u043d\u043e\u0432\u0430\u043d\u0456 \u043d\u0430 \u043c\u043e\u0434\u0435\u043b\u0456 \u043f\u043e\u0441\u043b\u0456\u0434\u043e\u0432\u043d\u043e\u0441\u0442\u0456 \u0434\u043e \u043f\u043e\u0441\u043b\u0456\u0434\u043e\u0432\u043d\u043e\u0441\u0442\u0456, \u0431\u0443\u043b\u0438 \u0441\u0442\u0432\u043e\u0440\u0435\u043d\u0456 \u0437 \u0432\u0438\u043a\u043e\u0440\u0438\u0441\u0442\u0430\u043d\u043d\u044f\u043c \u0431\u0456\u0431\u043b\u0456\u043e\u0442\u0435\u043a \u043c\u0430\u0448\u0438\u043d\u043d\u043e\u0433\u043e \u043d\u0430\u0432\u0447\u0430\u043d\u043d\u044f Keras \u0456 PyTorch. 
\u041d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 \u043e\u0442\u0440\u0438\u043c\u0430\u043d\u0438\u0445 \u0440\u0435\u0437\u0443\u043b\u044c\u0442\u0430\u0442\u0456\u0432 \u043f\u0440\u043e\u0432\u0435\u0434\u0435\u043d\u043e \u0430\u043d\u0430\u043b\u0456\u0437 \u0431\u0456\u0431\u043b\u0456\u043e\u0442\u0435\u043a \u0442\u0430 \u043f\u043e\u0440\u0456\u0432\u043d\u044f\u043d\u043d\u044f \u0457\u0445 \u043f\u0440\u043e\u0434\u0443\u043a\u0442\u0438\u0432\u043d\u043e\u0441\u0442\u0456.<br \/>\n<strong>\u041a\u043b\u044e\u0447\u043e\u0432\u0456 \u0441\u043b\u043e\u0432\u0430:<\/strong> \u043c\u0430\u0448\u0438\u043d\u043d\u0438\u0439 \u043f\u0435\u0440\u0435\u043a\u043b\u0430\u0434, \u0433\u043b\u0438\u0431\u043e\u043a\u0435 \u043d\u0430\u0432\u0447\u0430\u043d\u043d\u044f, \u0440\u0435\u043a\u0443\u0440\u0435\u043d\u0442\u043d\u0456 \u043d\u0435\u0439\u0440\u043e\u043d\u043d\u0456 \u043c\u0435\u0440\u0435\u0436\u0456, \u043f\u0440\u043e\u0434\u0443\u043a\u0442\u0438\u0432\u043d\u0456\u0441\u0442\u044c, keras, pytorch, sequence-to-sequence.<\/p>\n<p style=\"text-align: center;\"><strong>\u0420\u043e\u0437\u0448\u0438\u0440\u0435\u043d\u0430 \u0430\u043d\u043e\u0442\u0430\u0446\u0456\u044f \u0430\u043d\u0433\u043b\u0456\u0439\u0441\u044c\u043a\u043e\u044e \u00a0\u043c\u043e\u0432\u043e\u044e<\/strong><\/p>\n<p>Systems for machine translation of texts from one language to another simulate the work of a human translator. Their performance depends on their ability to capture the grammar rules of the language. In translation, the basic units are not individual words but word combinations or phraseological units that express different concepts. Only by using them can more complex ideas be expressed in the translated text.<br \/>\nThe main feature of machine translation is that the input and output can differ in length. 
Recurrent neural networks make it possible to work with inputs and outputs of different lengths.<br \/>\nA recurrent neural network (RNN) is a class of artificial neural networks that has feedback connections between nodes. Here, a feedback connection is a connection from a more distant node to a less distant node. The presence of feedback connections allows the RNN to remember and reproduce the entire sequence of reactions to one stimulus. From the point of view of programming, such networks are analogous to cyclic execution, and from the point of view of the system, such networks are equivalent to a state machine. RNNs are commonly used to process word sequences in natural language processing. Usually, a hidden Markov model (HMM) and an N-gram language model are used to process a sequence of words.<br \/>\nDeep learning has completely changed the approach to machine translation. Researchers in the deep learning field have created simple solutions based on machine learning that outperform the best expert systems. This paper reviews the main features of machine translation based on recurrent neural networks. The article also highlights the advantages of RNN-based systems that use the sequence-to-sequence model over statistical translation systems. Two machine translation systems based on the sequence-to-sequence model were built using the Keras and PyTorch machine learning libraries. Based on the obtained results, the libraries were analyzed and their performance was compared.<br \/>\n<strong>Keywords:<\/strong> machine translation, deep learning, recurrent neural networks, performance, keras, pytorch, sequence-to-sequence.<\/p>\n<p style=\"text-align: center;\"><strong>References<\/strong><\/p>\n<ol>\n<li>Yu D., Deng L. Automatic Speech Recognition: A Deep Learning Approach. Springer-Verlag London, 2015. DOI: 10.1007\/978-1-4471-5779-3.<\/li>\n<li>Dey N. Intelligent Speech Signal Processing. Academic Press, 2019. 
DOI: 10.1016\/C2018-0-03271-5.<\/li>\n<li>Shakhovska N., Basystiuk O., Shakhovska K. Development of the speech-to-text chatbot interface based on Google API. In: CEUR Workshop Proceedings, 2019, vol. 2386, pp. 212\u2013221.<\/li>\n<li>Melnykova N. Semantic search personalized data as special method of processing medical information. Advances in Intelligent Systems and Computing, 2017: 315-325.<\/li>\n<li>Basystiuk O., Shakhovska N., Bilynska V., Syvokon O., Shamuratov O., Kuchkovskiy V. The Developing of the System for Automatic Audio to Text Conversion. IT&amp;AS\u20192021: Symposium on Information Technologies &amp; Applied Sciences, March 5\u20136, 2021, Bratislava, Slovak Republic.<\/li>\n<li>Buss E., Leibold L. J., Porter H. L., Grose J. H. Speech recognition in one- and two-talker maskers in school-age children and adults: Development of perceptual masking and glimpsing. The Journal of the Acoustical Society of America, 2017. DOI: 10.1121\/1.4979936.<\/li>\n<li>Nataliya Boyko, Lesya Mochurad, Uliana Parpan, Oleh Basystiuk. Usage of Machine-based Translation Methods for Analyzing Open Data in Legal Cases. In: Proc. of the Intl Workshop on Cyber Hygiene (CybHyg-2019) co-located with 1st International Conference on Cyber Hygiene and Conflict Management in Global Information Networks (CyberConf, 2019), Kyiv, Ukraine, November 30, 2019, pp. 328\u2013338. CEUR-WS.org, online CEUR-WS.org\/Vol-2654\/paper26.pdf.<\/li>\n<li>Melnykova N., Shakhovska N., Gregu\u0161ml M., &amp; Melnykov V. (2019). Using big data for formalization the patient&#8217;s personalized data. Paper presented at the Procedia Computer Science, 155, 624-629.<\/li>\n<li>Zoryana Rybchak, Oleh Basystiuk. (2017). Analysis of methods and means of text mining. ECONTECHMOD. AN INTERNATIONAL QUARTERLY JOURNAL, 6(2), 73-78.<\/li>\n<li>GitHub Repository \u201cSpeech recognition algorithms\u201d. https:\/\/github.com\/obasys\/speech-recognition-algorithms. (accessed Aug. 
15, 2022)<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[71],"tags":[],"_links":{"self":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/14826"}],"collection":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14826"}],"version-history":[{"count":4,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/14826\/revisions"}],"predecessor-version":[{"id":14953,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/14826\/revisions\/14953"}],"wp:attachment":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14826"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14826"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14826"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}