<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2023-26-4-483-497</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-385</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Разработка системы поиска и индексирования контента аудиозаписей</article-title><trans-title-group xml:lang="en"><trans-title>Development of a System for Searching and Indexing the Content of Audio Recordings</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Климов</surname><given-names>Р. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Klimov</surname><given-names>R. A.</given-names></name></name-alternatives><email xlink:type="simple">itis.klimov@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Якупов</surname><given-names>А. 
Ш.</given-names></name><name name-style="western" xml:lang="en"><surname>Yakupov</surname><given-names>A. S.</given-names></name></name-alternatives><email xlink:type="simple">asyakupov@kpfu.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Казанский (Приволжский) федеральный университет</institution></aff><aff xml:lang="en"><institution>Kazan (Volga region) Federal University</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2023</year></pub-date><pub-date pub-type="epub"><day>28</day><month>08</month><year>2023</year></pub-date><volume>26</volume><issue>4</issue><fpage>483</fpage><lpage>497</lpage><permissions><copyright-statement>Copyright &#x00A9; Климов Р.А., Якупов А.Ш., 2023</copyright-statement><copyright-year>2023</copyright-year><copyright-holder xml:lang="ru">Климов Р.А., Якупов А.Ш.</copyright-holder><copyright-holder xml:lang="en">Klimov R.A., Yakupov A.S.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/385">https://ellibs.elpub.ru/jour/article/view/385</self-uri><abstract><p>Статья посвящена разработке системы поиска и индексации аудиофайлов с использованием автоматического распознавания речи (ASR) и Elasticsearch. Проанализированы актуальные системы транскрибирования аудиофайлов на русском языке и выбрана система Whisper как лучшая. 
Создан алгоритм оптимизации скорости транскрибирования с помощью параллелизации процессов обработки файла, продемонстрирована его эффективность. Построена система на микросервисной архитектуре, способная индексировать контент аудиофайлов и их метаданные для поиска. Результаты исследования показали, что предложенный подход может быть применен для создания эффективных и гибких систем поиска и аналитики аудиоинформации.
</p></abstract><trans-abstract xml:lang="en"><p>This article describes the development of a system for searching and indexing audio files using automatic speech recognition (ASR) and Elasticsearch. Current systems for transcribing Russian-language audio files were analyzed, and Whisper was selected as the best of them. An algorithm that speeds up transcription by parallelizing the processing of a file was developed, and its effectiveness was demonstrated. A system based on a microservice architecture was built that indexes both the content of audio files and their metadata for search. The results show that the proposed approach can be used to build efficient and flexible systems for searching and analyzing audio information.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>транскрибирование</kwd><kwd>индексирование</kwd><kwd>параллелизация</kwd><kwd>микросервисы</kwd><kwd>масштабируемость</kwd></kwd-group><kwd-group xml:lang="en"><kwd>transcription</kwd><kwd>indexing</kwd><kwd>parallelization</kwd><kwd>microservices</kwd><kwd>scalability</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">AWS Kendra Transcribe Media Search. URL: https://github.com/aws-samples/aws-kendra-transcribe-media-search</mixed-citation><mixed-citation xml:lang="en">AWS Kendra Transcribe Media Search. URL: https://github.com/aws-samples/aws-kendra-transcribe-media-search</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Noor J., Rownak A., Ratul R., Mondal J. Sherlok in OSS: A Novel Approach of Content-Based Searching on Object Storage System. 2023. URL: https://arxiv.org/pdf/2303.02105.pdf.</mixed-citation><mixed-citation xml:lang="en">Noor J., Rownak A., Ratul R., Mondal J. Sherlok in OSS: A Novel Approach of Content-Based Searching on Object Storage System. 2023. URL: https://arxiv.org/pdf/2303.02105.pdf.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Swift Object Storage. URL: https://www.openstack.org/software/releases/zed/components/swift</mixed-citation><mixed-citation xml:lang="en">Swift Object Storage. URL: https://www.openstack.org/software/releases/zed/components/swift</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Adrakatti A., Mulia K.R. Research Challenges of Library and Information Science in retrieving content based Multimedia Information. 2023. 
URL: https://www.researchgate.net/publication/361107734_Research_Challenges_of_Library_and_Information_Science_in_retrieving_content_based_Multimedia_Information.</mixed-citation><mixed-citation xml:lang="en">Adrakatti A., Mulia K.R. Research Challenges of Library and Information Science in retrieving content based Multimedia Information. 2023. URL: https://www.researchgate.net/publication/361107734_Research_Challenges_of_Library_and_Information_Science_in_retrieving_content_based_Multimedia_Information.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Google Speech. URL: https://console.cloud.google.com/speech/overview.</mixed-citation><mixed-citation xml:lang="en">Google Speech. URL: https://console.cloud.google.com/speech/overview.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Vosk. URL: https://github.com/alphacep/vosk.</mixed-citation><mixed-citation xml:lang="en">Vosk. URL: https://github.com/alphacep/vosk.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Yandex SpeechKit. URL: https://cloud.yandex.com/en/services/speechkit.</mixed-citation><mixed-citation xml:lang="en">Yandex SpeechKit. URL: https://cloud.yandex.com/en/services/speechkit.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Whisper. URL: https://github.com/openai/whisper.</mixed-citation><mixed-citation xml:lang="en">Whisper. URL: https://github.com/openai/whisper.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Подопригорова Н. С., Подопригорова С. С., Кан А. Д. 
Автоматическое распознавание речи в системе информационного поиска по аудио // Искусственный интеллект в автоматизированных системах управления и обработки данных, Московский государственный технический университет имени Н.Э. Баумана (национальный исследовательский университет). 2022. Т. 2. С. 339–345.</mixed-citation><mixed-citation xml:lang="en">Podoprigorova N. S., Podoprigorova S. S., Kan A. D. Automatic Speech Recognition in an Audio Information Retrieval System // Artificial Intelligence in Automated Control and Data Processing Systems, Bauman Moscow State Technical University (National Research University). 2022. Vol. 2. P. 339–345. (In Russian)</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Morris A., Maier V., Green P. From WER and RIL to MER and WIL. 2004. URL: https://www.isca-speech.org/archive_v0/archive_papers/interspeech_2004/i04_2765.pdf.</mixed-citation><mixed-citation xml:lang="en">Morris A., Maier V., Green P. From WER and RIL to MER and WIL. 2004. URL: https://www.isca-speech.org/archive_v0/archive_papers/interspeech_2004/i04_2765.pdf.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">JiWER: A Simple and Fast Python Package to Evaluate an Automatic Speech Recognition System. URL: https://github.com/jitsi/jiwer</mixed-citation><mixed-citation xml:lang="en">JiWER: A Simple and Fast Python Package to Evaluate an Automatic Speech Recognition System. URL: https://github.com/jitsi/jiwer</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Whisper.cpp. URL: https://github.com/ggerganov/whisper.cpp</mixed-citation><mixed-citation xml:lang="en">Whisper.cpp. 
URL: https://github.com/ggerganov/whisper.cpp</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Faster-whisper. URL: https://github.com/guillaumekln/faster-whisper</mixed-citation><mixed-citation xml:lang="en">Faster-whisper. URL: https://github.com/guillaumekln/faster-whisper</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">CTranslate2. URL: https://github.com/OpenNMT/CTranslate2/</mixed-citation><mixed-citation xml:lang="en">CTranslate2. URL: https://github.com/OpenNMT/CTranslate2/</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Prompt vs prefix in DecodingOptions. URL: https://github.com/openai/whisper/discussions/117</mixed-citation><mixed-citation xml:lang="en">Prompt vs prefix in DecodingOptions. URL: https://github.com/openai/whisper/discussions/117</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">FFmpeg. URL: https://ffmpeg.org/</mixed-citation><mixed-citation xml:lang="en">FFmpeg. URL: https://ffmpeg.org/</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">ElasticSearch. URL: https://www.elastic.co/</mixed-citation><mixed-citation xml:lang="en">ElasticSearch. 
URL: https://www.elastic.co/</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">ElasticSearch More like this query. URL: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html</mixed-citation><mixed-citation xml:lang="en">ElasticSearch More like this query. URL: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that they have no conflicts of interest.</p></fn></fn-group></back></article>
