<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2025-28-5-1138-1163</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-613</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Исследование квантования больших языковых моделей: оценка эффективности с акцентом на русскоязычные задачи</article-title><trans-title-group xml:lang="en"><trans-title>Exploring Post-Training Quantization of Large Language Models with a Focus on Russian Evaluation</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Пойманов</surname><given-names>Дмитрий Романович</given-names></name><name name-style="western" xml:lang="en"><surname>Poimanov</surname><given-names>Dmitrii Romanovich</given-names></name></name-alternatives><email xlink:type="simple">poimanovdr@my.msu.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" 
xml:lang="ru"><surname>Шутов</surname><given-names>Михаил Сергеевич</given-names></name><name name-style="western" xml:lang="en"><surname>Shutov</surname><given-names>Mikhail Sergeevich</given-names></name></name-alternatives><email xlink:type="simple">mihailshutov105@gmail.com</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Московский государственный университет им. М. В. Ломоносова</institution></aff><aff xml:lang="en"><institution>Lomonosov Moscow State University</institution></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru"><institution>Московский физико-технический институт (национальный исследовательский университет)</institution></aff><aff xml:lang="en"><institution>Moscow Institute of Physics and Technology</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>19</day><month>12</month><year>2025</year></pub-date><volume>28</volume><issue>5</issue><fpage>1138</fpage><lpage>1163</lpage><permissions><copyright-statement>Copyright &#x00A9; Пойманов Д.Р., Шутов М.С., 2025</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="ru">Пойманов Д.Р., Шутов М.С.</copyright-holder><copyright-holder xml:lang="en">Poimanov D.R., Shutov M.S.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://ellibs.elpub.ru/jour/article/view/613">https://ellibs.elpub.ru/jour/article/view/613</self-uri><abstract><p>Квантование стало ключевой техникой сжатия и ускорения больших языковых моделей (LLM). Несмотря на то, что исследования низкобитного квантования активно развиваются применительно к англоязычным LLM, его влияние на морфологически богатые и разнородные по ресурсам языки, включая русский, остается изученным значительно хуже. Поэтому требуются дополнительные исследования этого вопроса в связи с развитием высокоэффективных русскоязычных и многоязычных LLM.
</p><p>
Мы провели систематическое исследование квантования предобученных моделей до эффективных 2.0–4.25 бита на параметр для современных русскоязычных LLM различного масштаба, от 4 до 32 млрд параметров (4B и 32B). Экспериментальная часть охватывает как стандартное равномерное квантование, так и специализированные низкобитные форматы. Полученные результаты выявили несколько ключевых тенденций: i) устойчивость русскоязычных LLM к квантованию варьируется в зависимости от архитектуры и размера модели; ii) 4-битное квантование демонстрирует высокую надежность, особенно при использовании продвинутых форматов; iii) 3-битное и 2-битное квантования оказались наиболее чувствительными к данным калибровки. Полученные эмпирические данные демонстрируют необходимость учета домена модели при использовании различных методов квантования.
</p></abstract><trans-abstract xml:lang="en"><p>The rapid adoption of large language models (LLMs) has made quantization a central technique for efficient deployment under real-world hardware and memory constraints. While English-centric evaluations of low-bit quantization are increasingly available, much less is known about its effects on morphologically rich and resource-diverse languages such as Russian. This gap is particularly important given the recent emergence of high-performing Russian and multilingual LLMs. In this work, we conduct a systematic study of 2-, 3-, and 4-bit post-training quantization (PTQ) for state-of-the-art Russian LLMs at two model scales (4B and 32B). Our experimental setup covers both standard uniform quantization and specialized low-bit formats, as well as lightweight finetuning for recovery in the most extreme 2-bit setting. Our findings highlight several important trends: (i) the tolerance of Russian LLMs to quantization differs across model families and scales; (ii) 4-bit quantization is generally robust, especially when advanced formats are used; (iii) 3-bit models are sensitive to calibration data and scaling strategies; and (iv) 2-bit models, while severely degraded under naive PTQ, can be partially restored through short finetuning. Our empirical results show that the model's domain must be taken into account when choosing a quantization technique.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>квантование нейросетей</kwd><kwd>сжатие и оптимизация больших языковых моделей</kwd></kwd-group><kwd-group xml:lang="en"><kwd>neural networks quantization</kwd><kwd>compression and optimization of large language models</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Shavrina T. et al. RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark // Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020. P. 4717–4726. https://doi.org/10.18653/v1/2020.emnlp-main.381</mixed-citation><mixed-citation xml:lang="en">Shavrina T. et al. RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark // Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020. P. 4717–4726. https://doi.org/10.18653/v1/2020.emnlp-main.381</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Mendonça J., Lavie A., Trancoso I. On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation // Proceedings of the 6th Workshop on NLP for Conversational AI (NLP4ConvAI 2024). 2024. P. 1–12. https://doi.org/10.48550/arXiv.2407.03841</mixed-citation><mixed-citation xml:lang="en">Mendonça J., Lavie A., Trancoso I. On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation // Proceedings of the 6th Workshop on NLP for Conversational AI (NLP4ConvAI 2024). 2024. P. 1–12. https://doi.org/10.48550/arXiv.2407.03841</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Liu J. et al. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation //Advances in Neural Information Processing Systems. 2023. Vol. 36. P. 21558–21572. 
https://doi.org/10.48550/arXiv.2305.01210</mixed-citation><mixed-citation xml:lang="en">Liu J. et al. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation //Advances in Neural Information Processing Systems. 2023. Vol. 36. P. 21558–21572. https://doi.org/10.48550/arXiv.2305.01210</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Hendrycks D. et al. Measuring massive multitask language understanding, 2021 // International Conference on Learning Representations. 2021. https://doi.org/10.48550/arXiv.2009.03300</mixed-citation><mixed-citation xml:lang="en">Hendrycks D. et al. Measuring massive multitask language understanding, 2021 // International Conference on Learning Representations. 2021. https://doi.org/10.48550/arXiv.2009.03300</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Clark P. et al. Think you have solved question answering? try arc, the ai2 reasoning challenge // arXiv preprint arXiv:1803.05457. 2018. https://doi.org/10.48550/arXiv.1803.05457</mixed-citation><mixed-citation xml:lang="en">Clark P. et al. Think you have solved question answering? try arc, the ai2 reasoning challenge // arXiv preprint arXiv:1803.05457. 2018. https://doi.org/10.48550/arXiv.1803.05457</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Zellers R. et al. HellaSwag: Can a Machine Really Finish Your Sentence? // Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. P. 4791–4800. https://doi.org/10.48550/arXiv.1905.07830</mixed-citation><mixed-citation xml:lang="en">Zellers R. et al. HellaSwag: Can a Machine Really Finish Your Sentence? // Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. P. 
4791–4800. https://doi.org/10.48550/arXiv.1905.07830</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Dettmers T. et al. GPT3.int8(): 8-bit matrix multiplication for transformers at scale // Advances in neural information processing systems. 2022. Vol. 35. P. 30318–30332. https://doi.org/10.48550/arXiv.2208.07339</mixed-citation><mixed-citation xml:lang="en">Dettmers T. et al. GPT3.int8(): 8-bit matrix multiplication for transformers at scale // Advances in neural information processing systems. 2022. Vol. 35. P. 30318–30332. https://doi.org/10.48550/arXiv.2208.07339</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Frantar E. et al. OPTQ: Accurate post-training quantization for generative pre-trained transformers // 11th International Conference on Learning Representations. 2023. https://doi.org/10.48550/arXiv.2210.17323</mixed-citation><mixed-citation xml:lang="en">Frantar E. et al. OPTQ: Accurate post-training quantization for generative pre-trained transformers // 11th International Conference on Learning Representations. 2023. https://doi.org/10.48550/arXiv.2210.17323</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Lin J. et al. AWQ: Activation-aware weight quantization for on-device LLM compression and acceleration // Proceedings of machine learning and systems. 2024. Vol. 6. P. 87–100. https://doi.org/10.1145/3714983.3714987</mixed-citation><mixed-citation xml:lang="en">Lin J. et al. AWQ: Activation-aware weight quantization for on-device LLM compression and acceleration // Proceedings of machine learning and systems. 2024. Vol. 6. P. 87–100. 
https://doi.org/10.1145/3714983.3714987</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Xiao G. et al. SmoothQuant: Accurate and efficient post-training quantization for large language models // International conference on machine learning. PMLR, 2023. P. 38087–38099. https://doi.org/10.48550/arXiv.2211.10438</mixed-citation><mixed-citation xml:lang="en">Xiao G. et al. SmoothQuant: Accurate and efficient post-training quantization for large language models // International conference on machine learning. PMLR, 2023. P. 38087–38099. https://doi.org/10.48550/arXiv.2211.10438</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Tseng A. et al. Qtip: Quantization with trellises and incoherence processing // Advances in Neural Information Processing Systems. 2024. Vol. 37. P. 59597–59620. https://doi.org/10.48550/arXiv.2406.11235</mixed-citation><mixed-citation xml:lang="en">Tseng A. et al. Qtip: Quantization with trellises and incoherence processing // Advances in Neural Information Processing Systems. 2024. Vol. 37. P. 59597–59620. https://doi.org/10.48550/arXiv.2406.11235</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">T-Tech. T-pro-2.0. – Hybrid reasoning model based on Qwen3-32B // HuggingFace.co: The collaboration platform. 2025. URL: https://huggingface.co/t-tech/T-pro-it-2.0</mixed-citation><mixed-citation xml:lang="en">T-Tech. T-pro-2.0. – Hybrid reasoning model based on Qwen3-32B // HuggingFace.co: The collaboration platform. 2025. URL: https://huggingface.co/t-tech/T-pro-it-2.0</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Yandex company. YandexGPT // HuggingFace.co: The collaboration platform. 2025. 
URL: https://huggingface.co/yandex/YandexGPT-5-Lite-8B-instruct</mixed-citation><mixed-citation xml:lang="en">Yandex company. YandexGPT // HuggingFace.co: The collaboration platform. 2025. URL: https://huggingface.co/yandex/YandexGPT-5-Lite-8B-instruct</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Tikhomirov M., Chernyshev D. Facilitating large language model russian adaptation with learned embedding propagation // Journal of Language and Education. 2024. Vol. 10. No. 4 (40). P. 130–145. https://doi.org/10.48550/arXiv.2412.21140</mixed-citation><mixed-citation xml:lang="en">Tikhomirov M., Chernyshev D. Facilitating large language model russian adaptation with learned embedding propagation // Journal of Language and Education. 2024. Vol. 10. No. 4 (40). P. 130–145. https://doi.org/10.48550/arXiv.2412.21140</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Team Q. et al. Qwen2 technical report // arXiv preprint arXiv:2407.10671. 2024. Vol. 2. P. 3. https://doi.org/10.48550/arXiv.2407.10671</mixed-citation><mixed-citation xml:lang="en">Team Q. et al. Qwen2 technical report // arXiv preprint arXiv:2407.10671. 2024. Vol. 2. P. 3. https://doi.org/10.48550/arXiv.2407.10671</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Agarwal S. et al. gpt-oss-120b &amp; gpt-oss-20b Model Card // arXiv e-prints. 2025. P. arXiv: 2508.10925. https://doi.org/10.48550/arXiv.2508.10925</mixed-citation><mixed-citation xml:lang="en">Agarwal S. et al. gpt-oss-120b &amp; gpt-oss-20b Model Card // arXiv e-prints. 2025. P. arXiv: 2508.10925. https://doi.org/10.48550/arXiv.2508.10925</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Liu A. et al. 
DeepSeek-V3 Technical Report // arXiv e-prints. 2024. P. arXiv: 2412.19437. https://doi.org/10.48550/arXiv.2412.19437</mixed-citation><mixed-citation xml:lang="en">Liu A. et al. DeepSeek-V3 Technical Report // arXiv e-prints. 2024. P. arXiv: 2412.19437. https://doi.org/10.48550/arXiv.2412.19437</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Chee J. et al. QuIP: 2-bit quantization of large language models with guarantees // Advances in Neural Information Processing Systems. 2023. Vol. 36. P. 4396–4429. https://doi.org/10.48550/arXiv.2307.13304</mixed-citation><mixed-citation xml:lang="en">Chee J. et al. QuIP: 2-bit quantization of large language models with guarantees // Advances in Neural Information Processing Systems. 2023. Vol. 36. P. 4396–4429. https://doi.org/10.48550/arXiv.2307.13304</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Chen M. et al. EfficientQAT: Efficient quantization-aware training for large language models // Annual Meeting of the Association for Computational Linguistics. 2025. Vol. 1. P. 10081–10100. https://doi.org/10.48550/arXiv.2407.11062</mixed-citation><mixed-citation xml:lang="en">Chen M. et al. EfficientQAT: Efficient quantization-aware training for large language models // Annual Meeting of the Association for Computational Linguistics. 2025. Vol. 1. P. 10081–10100. https://doi.org/10.48550/arXiv.2407.11062</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Shao W. et al. OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models // The Twelfth International Conference on Learning Representations. 2024. https://doi.org/10.48550/arXiv.2308.13137</mixed-citation><mixed-citation xml:lang="en">Shao W. et al. 
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models // The Twelfth International Conference on Learning Representations. 2024. https://doi.org/10.48550/arXiv.2308.13137</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Hu E. J. et al. Lora: Low-rank adaptation of large language models // International Conference on Machine Learning. 2022. Vol. 1, No. 2. P. 3. https://doi.org/10.48550/arXiv.2106.09685</mixed-citation><mixed-citation xml:lang="en">Hu E. J. et al. Lora: Low-rank adaptation of large language models // International Conference on Machine Learning. 2022. Vol. 1, No. 2. P. 3. https://doi.org/10.48550/arXiv.2106.09685</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Han Z. et al. Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey // arXiv e-prints. 2024. P. arXiv: 2403.14608. https://doi.org/10.48550/arXiv.2403.14608</mixed-citation><mixed-citation xml:lang="en">Han Z. et al. Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey // arXiv e-prints. 2024. P. arXiv: 2403.14608. https://doi.org/10.48550/arXiv.2403.14608</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Egiazarian V. et al. Extreme compression of large language models via additive quantization // Proceedings of the 41st International Conference on Machine Learning. 2024. P. 12284–12303. https://doi.org/10.48550/arXiv.2401.06118</mixed-citation><mixed-citation xml:lang="en">Egiazarian V. et al. Extreme compression of large language models via additive quantization // Proceedings of the 41st International Conference on Machine Learning. 2024. P. 12284–12303. 
https://doi.org/10.48550/arXiv.2401.06118</mixed-citation></citation-alternatives></ref><ref id="cit24"><label>24</label><citation-alternatives><mixed-citation xml:lang="ru">Tseng A. et al. QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks // International Conference on Machine Learning. PMLR, 2024. P. 48630–48656. https://doi.org/10.48550/arXiv.2402.04396</mixed-citation><mixed-citation xml:lang="en">Tseng A. et al. QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks // International Conference on Machine Learning. PMLR, 2024. P. 48630–48656. https://doi.org/10.48550/arXiv.2402.04396</mixed-citation></citation-alternatives></ref><ref id="cit25"><label>25</label><citation-alternatives><mixed-citation xml:lang="ru">Tseng A. et al. Qtip: Quantization with trellises and incoherence processing // Advances in Neural Information Processing Systems. 2024. Vol. 37. P. 59597–59620. https://doi.org/10.48550/arXiv.2406.11235</mixed-citation><mixed-citation xml:lang="en">Tseng A. et al. Qtip: Quantization with trellises and incoherence processing // Advances in Neural Information Processing Systems. 2024. Vol. 37. P. 59597–59620. https://doi.org/10.48550/arXiv.2406.11235</mixed-citation></citation-alternatives></ref><ref id="cit26"><label>26</label><citation-alternatives><mixed-citation xml:lang="ru">Yang A. et al. Qwen3 technical report // arXiv e-prints. 2025. P. arXiv: 2505.09388. https://doi.org/10.48550/arXiv.2505.09388</mixed-citation><mixed-citation xml:lang="en">Yang A. et al. Qwen3 technical report // arXiv e-prints. 2025. P. arXiv: 2505.09388. https://doi.org/10.48550/arXiv.2505.09388</mixed-citation></citation-alternatives></ref><ref id="cit27"><label>27</label><citation-alternatives><mixed-citation xml:lang="ru">Achiam J. et al. GPT-4 Technical Report // arXiv e-prints. 2023. arXiv: 2303.08774. https://doi.org/10.48550/arXiv.2303.08774</mixed-citation><mixed-citation xml:lang="en">Achiam J. et al. 
GPT-4 Technical Report // arXiv e-prints. 2023. arXiv: 2303.08774. https://doi.org/10.48550/arXiv.2303.08774</mixed-citation></citation-alternatives></ref><ref id="cit28"><label>28</label><citation-alternatives><mixed-citation xml:lang="ru">Darvish Rouhani B. et al. Microscaling data formats for deep learning // arXiv e-prints. 2023. P. arXiv: 2310.10537. https://doi.org/10.48550/arXiv.2310.10537</mixed-citation><mixed-citation xml:lang="en">Darvish Rouhani B. et al. Microscaling data formats for deep learning // arXiv e-prints. 2023. P. arXiv: 2310.10537. https://doi.org/10.48550/arXiv.2310.10537</mixed-citation></citation-alternatives></ref><ref id="cit29"><label>29</label><citation-alternatives><mixed-citation xml:lang="ru">Weber M. et al. Redpajama: an open dataset for training large language models // Advances in neural information processing systems. 2024. Vol. 37. P. 116462–116492. https://doi.org/10.52202/079017-3697</mixed-citation><mixed-citation xml:lang="en">Weber M. et al. Redpajama: an open dataset for training large language models // Advances in neural information processing systems. 2024. Vol. 37. P. 116462–116492. https://doi.org/10.52202/079017-3697</mixed-citation></citation-alternatives></ref><ref id="cit30"><label>30</label><citation-alternatives><mixed-citation xml:lang="ru">Potapov A. T‑Wix – Russian supervised fine‑tuning (SFT) dataset // HuggingFace.co: The collaboration platform. 2025. URL: https://huggingface.co/datasets/t-tech/T-Wix</mixed-citation><mixed-citation xml:lang="en">Potapov A. T‑Wix – Russian supervised fine‑tuning (SFT) dataset // HuggingFace.co: The collaboration platform. 2025. URL: https://huggingface.co/datasets/t-tech/T-Wix</mixed-citation></citation-alternatives></ref><ref id="cit31"><label>31</label><citation-alternatives><mixed-citation xml:lang="ru">Merity S. et al. Pointer Sentinel Mixture Models // International Conference on Learning Representations. 2017. 
https://doi.org/10.48550/arXiv.1609.07843</mixed-citation><mixed-citation xml:lang="en">Merity S. et al. Pointer Sentinel Mixture Models // International Conference on Learning Representations. 2017. https://doi.org/10.48550/arXiv.1609.07843</mixed-citation></citation-alternatives></ref><ref id="cit32"><label>32</label><citation-alternatives><mixed-citation xml:lang="ru">Korablinov V., Braslavski P. RuBQ: A Russian dataset for question answering over Wikidata // International Semantic Web Conference. Cham: Springer International Publishing. 2020. P. 97–110. https://doi.org/10.1007/978-3-030-62466-8_7</mixed-citation><mixed-citation xml:lang="en">Korablinov V., Braslavski P. RuBQ: A Russian dataset for question answering over Wikidata // International Semantic Web Conference. Cham: Springer International Publishing. 2020. P. 97–110. https://doi.org/10.1007/978-3-030-62466-8_7</mixed-citation></citation-alternatives></ref><ref id="cit33"><label>33</label><citation-alternatives><mixed-citation xml:lang="ru">Li H. et al. CMMLU: Measuring massive multitask language understanding in Chinese // Findings of the Association for Computational Linguistics. 2024. P. 11260–11285. https://doi.org/10.48550/arXiv.2306.09212</mixed-citation><mixed-citation xml:lang="en">Li H. et al. CMMLU: Measuring massive multitask language understanding in Chinese // Findings of the Association for Computational Linguistics. 2024. P. 11260–11285. https://doi.org/10.48550/arXiv.2306.09212</mixed-citation></citation-alternatives></ref><ref id="cit34"><label>34</label><citation-alternatives><mixed-citation xml:lang="ru">Bisk Y. et al. Piqa: Reasoning about physical commonsense in natural language // Proceedings of the AAAI conference on artificial intelligence. 2020. Vol. 34. №. 05. P. 7432–7439. https://doi.org/10.1609/aaai.v34i05.6239</mixed-citation><mixed-citation xml:lang="en">Bisk Y. et al. 
Piqa: Reasoning about physical commonsense in natural language // Proceedings of the AAAI conference on artificial intelligence. 2020. Vol. 34. №. 05. P. 7432–7439. https://doi.org/10.1609/aaai.v34i05.6239</mixed-citation></citation-alternatives></ref><ref id="cit35"><label>35</label><citation-alternatives><mixed-citation xml:lang="ru">Fenogenova A. et al. MERA: A Comprehensive LLM Evaluation in Russian //Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. P. 9920–9948. https://doi.org/10.18653/v1/2024.acl-long.534</mixed-citation><mixed-citation xml:lang="en">Fenogenova A. et al. MERA: A Comprehensive LLM Evaluation in Russian //Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. P. 9920–9948. https://doi.org/10.18653/v1/2024.acl-long.534</mixed-citation></citation-alternatives></ref><ref id="cit36"><label>36</label><citation-alternatives><mixed-citation xml:lang="ru">Chirkin A. et al. RusConText Benchmark: A Russian Language Evaluation Benchmark for Understanding Context // ACL 2025 Student Research Workshop. 2025. https://aclanthology.org/2025.acl-srw.91/</mixed-citation><mixed-citation xml:lang="en">Chirkin A. et al. RusConText Benchmark: A Russian Language Evaluation Benchmark for Understanding Context // ACL 2025 Student Research Workshop. 2025. https://aclanthology.org/2025.acl-srw.91/</mixed-citation></citation-alternatives></ref><ref id="cit37"><label>37</label><citation-alternatives><mixed-citation xml:lang="ru">EleutherAI. Language Model Evaluation Harness // Zenodo. 2024. v0.4.3. https://zenodo.org/records/10256836</mixed-citation><mixed-citation xml:lang="en">EleutherAI. Language Model Evaluation Harness // Zenodo. 2024. v0.4.3. 
https://zenodo.org/records/10256836</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
