<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id custom-type="elpub" pub-id-type="custom">ellibs-719</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Запросы к нереляционным данным на естественном языке на основе большой языковой модели</article-title><trans-title-group xml:lang="en"><trans-title>Queries to Non-Relational Data using Natural Language based on a Large Language Model</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Еркимбаев</surname><given-names>Адильбек Омирбекович</given-names></name><name name-style="western" xml:lang="en"><surname>Erkimbaev</surname><given-names>Adilbek Omirbekovich</given-names></name></name-alternatives><email xlink:type="simple">adilbek@jiht.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Зицерман</surname><given-names>Владимир Юрьевич</given-names></name><name name-style="western" 
xml:lang="en"><surname>Zitserman</surname><given-names>Vladimir Yurievich</given-names></name></name-alternatives><email xlink:type="simple">vz1941@mail.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Кобзев</surname><given-names>Георгий Анатольевич</given-names></name><name name-style="western" xml:lang="en"><surname>Kobzev</surname><given-names>George Anatolyevich</given-names></name></name-alternatives><email xlink:type="simple">gkbz@mail.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Объединенный институт высоких температур РАН</institution></aff><aff xml:lang="en"><institution>Joint Institute for High Temperatures</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2026</year></pub-date><pub-date pub-type="epub"><day>04</day><month>03</month><year>2026</year></pub-date><volume>29</volume><issue>1</issue><fpage>76</fpage><lpage>98</lpage><permissions><copyright-statement>Copyright &#x00A9; Еркимбаев А.О., Зицерман В.Ю., Кобзев Г.А., 2026</copyright-statement><copyright-year>2026</copyright-year><copyright-holder xml:lang="ru">Еркимбаев А.О., Зицерман В.Ю., Кобзев Г.А.</copyright-holder><copyright-holder xml:lang="en">Erkimbaev A.O., Zitserman V.Y., Kobzev G.A.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://ellibs.elpub.ru/jour/article/view/719">https://ellibs.elpub.ru/jour/article/view/719</self-uri><abstract><p>В работе рассмотрены новые возможности организации запросов на естественном языке к научным локальным базам данных нереляционного типа. Проведенный анализ исследований, выполненных за последние годы, показал активное внедрение запросов на естественном языке к базам данных различного типа. Отмечено активное применение методов машинного обучения (нейронных алгоритмов). Показано широкое использование в последние два года большой языковой модели для подготовки запросов в различных языковых средах и областях знаний. Проведено исследование новых возможностей графовой базы данных AllegroGraph по использованию больших языковых моделей для организации поиска на естественном языке. Функционал базы данных изучен на примере системы метаданных по теплофизическим свойствам веществ в форме предметной онтологии «Термаль». Тестирование поисковых запросов в двуязычной (английская и русская) среде базы данных выявило в целом преодолимые проблемы и дает хорошие надежды на дальнейшее применение новых прикладных сервисов с использованием больших языковых моделей.
</p></abstract><trans-abstract xml:lang="en"><p>This work explores new options for natural language queries to local scientific databases of the non-relational type. A review of recent research shows that natural language queries are being actively adopted for databases of various types, with machine learning methods (neural algorithms) in wide use. Over the last two years, large language models have been widely used to generate queries across language environments and subject areas. We study the new capabilities of the AllegroGraph graph database for using large language models to organize natural language search. The database functionality is examined on a metadata system for the thermophysical properties of substances, represented as the "Thermal" domain ontology. Tests of search queries in the bilingual (English and Russian) database environment revealed problems that are, on the whole, surmountable, and the results give good prospects for further use of new application services based on large language models.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>запрос на естественном языке</kwd><kwd>большая языковая модель</kwd><kwd>эмбеддинг</kwd><kwd>нереляционные базы данных</kwd><kwd>графовая база данных</kwd><kwd>онтология предметной области</kwd></kwd-group><kwd-group xml:lang="en"><kwd>natural language query</kwd><kwd>large language model</kwd><kwd>embedding</kwd><kwd>non-relational databases</kwd><kwd>graph database</kwd><kwd>domain ontology</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Erkimbaev A.O., Zitserman V.Iu., Kobzev G.A. Tipologiia materialovedcheskikh dannykh // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2023. № 6. S. 25–39.</mixed-citation><mixed-citation xml:lang="en">Erkimbaev A.O., Zitserman V.Iu., Kobzev G.A. Tipologiia materialovedcheskikh dannykh // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2023. № 6. S. 25–39.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Erkimbaev A.O., Zitserman V.Iu., Kobzev G.A., Kosinov A.V. O predstavlenii i otsenke nauchnykh dannykh chislovogo i nechislovogo tipa pri provedenii issledovanii po svoistvam materialov // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2023. № 2. S. 8–16.</mixed-citation><mixed-citation xml:lang="en">Erkimbaev A.O., Zitserman V.Iu., Kobzev G.A., Kosinov A.V. O predstavlenii i otsenke nauchnykh dannykh chislovogo i nechislovogo tipa pri provedenii issledovanii po svoistvam materialov // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2023. № 2. S. 8–16.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Woods W.A. Semantics and quantification in natural language question answering // Advances in computers. N.Y. etc.: Acad. Press, 1978. Vol. 17. P. 1–87. 
URL: https://web.stanford.edu/class/linguist289/woods.pdf</mixed-citation><mixed-citation xml:lang="en">Woods W.A. Semantics and quantification in natural language question answering // Advances in computers. N.Y. etc.: Acad. Press, 1978. Vol. 17. P. 1–87. URL: https://web.stanford.edu/class/linguist289/woods.pdf</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Borodin D.S., Stroganov Iu.V. K zadache sostavleniia zaprosov k bazam dannykh na estestvennom iazyke // Novye informatsionnye tekhnologii v avtomatizirovannykh sistemakh: materialy 19 nauchno-prakticheskogo seminara. M.: IPM im. M.V. Keldysha, aprel 2016. P. 119–125.</mixed-citation><mixed-citation xml:lang="en">Borodin D.S., Stroganov Iu.V. K zadache sostavleniia zaprosov k bazam dannykh na estestvennom iazyke // Novye informatsionnye tekhnologii v avtomatizirovannykh sistemakh: materialy 19 nauchno-prakticheskogo seminara. M.: IPM im. M.V. Keldysha, aprel 2016. P. 119–125.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Bolshakova E.I., Klyshinskii E. S., Lande D.V., Noskov A.A., Peskova O.V., Iagunova E.V. Avtomaticheskaia obrabotka tekstov na estestvennom iazyke i kompiuternaia lingvistika: uchebnoe posobie. M.: MIEM, 2011. 272 s.</mixed-citation><mixed-citation xml:lang="en">Bolshakova E.I., Klyshinskii E. S., Lande D.V., Noskov A.A., Peskova O.V., Iagunova E.V. Avtomaticheskaia obrabotka tekstov na estestvennom iazyke i kompiuternaia lingvistika: uchebnoe posobie. M.: MIEM, 2011. 272 s.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Borodin D.S., Stroganov Iu.V., Volkova L.L., Rudakov I.V., Prosukov E.A. Transliator zaprosov na ogranichennom estestvennom iazyke v zaprosy k reliatsionnym bazam dannykh // Sistemnyi administrator. 2019. Vypusk №01-02. S. 
194–195.</mixed-citation><mixed-citation xml:lang="en">Borodin D.S., Stroganov Iu.V., Volkova L.L., Rudakov I.V., Prosukov E.A. Transliator zaprosov na ogranichennom estestvennom iazyke v zaprosy k reliatsionnym bazam dannykh // Sistemnyi administrator. 2019. Vypusk №01-02. S. 194–195.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Posevkin R.V. Primenenie semanticheskoi modeli bazy dannykh pri realizatsii estestvenno-iazykovogo polzovatelskogo interfeisa // Nauchno-tekhnicheskii vestnik informatsionnykh tekhnologii, mekhaniki i optiki. 2018. Tom 18. № 2. S. 262–267.</mixed-citation><mixed-citation xml:lang="en">Posevkin R.V. Primenenie semanticheskoi modeli bazy dannykh pri realizatsii estestvenno-iazykovogo polzovatelskogo interfeisa // Nauchno-tekhnicheskii vestnik informatsionnykh tekhnologii, mekhaniki i optiki. 2018. Tom 18. № 2. S. 262–267.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Mikolov T., et al. Distributed representations of words and phrases and their compositionality // Proc. 26th Int. Conf. on Neural Information Processing Systems. 2013. P. 3111–3119.</mixed-citation><mixed-citation xml:lang="en">Mikolov T., et al. Distributed representations of words and phrases and their compositionality // Proc. 26th Int. Conf. on Neural Information Processing Systems. 2013. P. 3111–3119.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Pennington J., et al. GloVe: Global vectors for word representation // Proc. Conf. Empirical Methods in Natural Language Processing. 2014. P. 1532–1543.</mixed-citation><mixed-citation xml:lang="en">Pennington J., et al. GloVe: Global vectors for word representation // Proc. Conf. Empirical Methods in Natural Language Processing. 2014. P. 
1532–1543.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Devlin J., Chang M.-W., Lee K., Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding // Proc. Conf. of North American Chapter of Association for Computational Linguistics. 2019. P. 4171–4186.</mixed-citation><mixed-citation xml:lang="en">Devlin J., Chang M.-W., Lee K., Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding // Proc. Conf. of North American Chapter of Association for Computational Linguistics. 2019. P. 4171–4186.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Hafsa Shareef Dar, M. Ikramullah Lali, Khalid Mahmood Malik, Syed Ahmad Chan Bukhari. Frameworks for Querying Databases Using Natural Language: A Literature Review. 2019. P. 1–18. arXiv preprint. URL: https://arxiv.org/abs/1909.01822</mixed-citation><mixed-citation xml:lang="en">Hafsa Shareef Dar, M. Ikramullah Lali, Khalid Mahmood Malik, Syed Ahmad Chan Bukhari. Frameworks for Querying Databases Using Natural Language: A Literature Review. 2019. P. 1–18. arXiv preprint. URL: https://arxiv.org/abs/1909.01822</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Baig Muhammad Shahzaib, et al. Natural Language to SQL Queries: A Review Original Article // International Journal of Innovations in Science &amp; Technology. 2022. Vol. 4. Issue 1. P. 147–162.</mixed-citation><mixed-citation xml:lang="en">Baig Muhammad Shahzaib, et al. Natural Language to SQL Queries: A Review Original Article // International Journal of Innovations in Science &amp; Technology. 2022. Vol. 4. Issue 1. P. 147–162.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Tao Yu, et al. 
Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task. arXiv preprint. 2018. URL: https://arxiv.org/abs/1809.08887</mixed-citation><mixed-citation xml:lang="en">Tao Yu, et al. Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task. arXiv preprint. 2018. URL: https://arxiv.org/abs/1809.08887</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Manning C.D. Human language understanding &amp; reasoning // Daedalus. 2022. Vol. 151. Issue 2. P. 127–138.</mixed-citation><mixed-citation xml:lang="en">Manning C.D. Human language understanding &amp; reasoning // Daedalus. 2022. Vol. 151. Issue 2. P. 127–138.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Meyer Jesse G., et al. ChatGPT and large language models in academia: opportunities and challenges // BioData Mining. 2023. Vol. 16. Art. no. 20.</mixed-citation><mixed-citation xml:lang="en">Meyer Jesse G., et al. ChatGPT and large language models in academia: opportunities and challenges // BioData Mining. 2023. Vol. 16. Art. no. 20.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Microsoft Copilot в Azure с базой данных SQL Azure. URL: https://learn.microsoft.com/ru-ru/azure/azure-sql/copilot/copilot-azure-sql-overview?view=azuresql</mixed-citation><mixed-citation xml:lang="en">Microsoft Copilot in Azure with Azure SQL Database. URL: https://learn.microsoft.com/ru-ru/azure/azure-sql/copilot/copilot-azure-sql-overview?view=azuresql</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">MongoDB Query Generator using OpenAI. 
URL: https://www.mongodb.com/docs/compass/current/query-with-natural-language/#std-label-compass-query-natural-language</mixed-citation><mixed-citation xml:lang="en">MongoDB Query Generator using OpenAI. URL: https://www.mongodb.com/docs/compass/current/query-with-natural-language/#std-label-compass-query-natural-language</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Lower your Large Language Model costs with Graphwise GraphDB. URL: https://www.ontotext.com/blog/lower-your-llm-costs-with-graphwise-graphdb/</mixed-citation><mixed-citation xml:lang="en">Lower your Large Language Model costs with Graphwise GraphDB. URL: https://www.ontotext.com/blog/lower-your-llm-costs-with-graphwise-graphdb/</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">AllegroGraph 8.4.0 LLM Embed Specification. URL: https://franz.com/agraph/support/documentation/llmembed.html</mixed-citation><mixed-citation xml:lang="en">AllegroGraph 8.4.0 LLM Embed Specification. URL: https://franz.com/agraph/support/documentation/llmembed.html</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Stardog Voicebox FAQ: How LLM, Generative AI, and Knowledge Graphs are the Future of Data Management. URL: https://www.stardog.com/blog/stardog-voicebox-faq-how-llm-generative-ai-and-knowledge-graphs-are-the-future-of-data-management/</mixed-citation><mixed-citation xml:lang="en">Stardog Voicebox FAQ: How LLM, Generative AI, and Knowledge Graphs are the Future of Data Management. URL: https://www.stardog.com/blog/stardog-voicebox-faq-how-llm-generative-ai-and-knowledge-graphs-are-the-future-of-data-management/</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Trakhtengerts M.S. 
Tekhnologiia podgotovki informatsii dlia baz dannykh v obmennom formate ISO 2709 // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2006. № 7. S. 28–31.</mixed-citation><mixed-citation xml:lang="en">Trakhtengerts M.S. Tekhnologiia podgotovki informatsii dlia baz dannykh v obmennom formate ISO 2709 // Nauchno-tekhnicheskaia informatsiia. Ser. 2. 2006. № 7. S. 28–31.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare no conflicts of interest.</p></fn></fn-group></back></article>
