Russian Digital Libraries Journal

SciLibRu, the Library of Scientific Subject Domains

https://doi.org/10.26907/1562-5419-2025-28-6-1324-1345

Abstract


This paper addresses the problem of integrating data to represent scientific subject areas, based on their semantic description, in the SciLibRu digital library. The ontology and knowledge graph of the LibMeta library serve as the data model. SciLibRu is populated with data from scientific journals. The paper shows how each stage of processing semi-structured scientific publications is implemented so that the publications can be integrated into the library's ontology. Completing all preprocessing stages yields a dataset that can be used to train language models to answer queries in Russian-language scientific subject areas.

About the Authors

Olga Muratovna Ataeva
Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
Russian Federation


Natalia Pavlovna Tuchkova
Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
Russian Federation


Kirill Borisovich Teymurazov
Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences
Russian Federation


Aidin Abdyshov
S. Y. Witte University of Moscow
Russian Federation


Mikhail Gennadievich Kobuk
S. Y. Witte University of Moscow
Russian Federation


For citations:


Ataeva O.M., Tuchkova N.P., Teymurazov K.B., Abdyshov A., Kobuk M.G. SciLibRu, the Library of Scientific Subject Domains. Russian Digital Libraries Journal. 2025;28(6):1324-1345. (In Russ.) https://doi.org/10.26907/1562-5419-2025-28-6-1324-1345


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1562-5419 (Online)