<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2025-28-5-1070-1084</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-609</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Автоматическое извлечение аргументативных отношений из текстов научной коммуникации</article-title><trans-title-group xml:lang="en"><trans-title>Automatic Extraction of Argumentative Relations from Scientific Communication Texts</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Загорулько</surname><given-names>Юрий Алексеевич</given-names></name><name name-style="western" xml:lang="en"><surname>Zagorulko</surname><given-names>Yury Alekseevich</given-names></name></name-alternatives><email xlink:type="simple">zagor@iis.nsk.su</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Сидорова</surname><given-names>Елена Анатольевна</given-names></name><name 
name-style="western" xml:lang="en"><surname>Sidorova</surname><given-names>Elena Anatolievna</given-names></name></name-alternatives><email xlink:type="simple">lsidorova@iis.nsk.su</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Ахмадеева</surname><given-names>Ирина Равильевна</given-names></name><name name-style="western" xml:lang="en"><surname>Akhmadeeva</surname><given-names>Irina Ravilevna</given-names></name></name-alternatives><email xlink:type="simple">i.r.akhmadeeva@iis.nsk.su</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Институт систем информатики им. А.П. Ершова СО РАН</institution></aff><aff xml:lang="en"><institution>A.P. Ershov Institute of Informatics Systems of Siberian Branch of RAS</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>19</day><month>12</month><year>2025</year></pub-date><volume>28</volume><issue>5</issue><fpage>1070</fpage><lpage>1084</lpage><permissions><copyright-statement>Copyright &#x00A9; Загорулько Ю.А., Сидорова Е.А., Ахмадеева И.Р., 2025</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="ru">Загорулько Ю.А., Сидорова Е.А., Ахмадеева И.Р.</copyright-holder><copyright-holder xml:lang="en">Zagorulko Y.A., Sidorova E.A., Akhmadeeva I.R.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0
License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/609">https://ellibs.elpub.ru/jour/article/view/609</self-uri><abstract><p>Сложность задачи извлечения аргументативных структур связана с такими проблемами, как выделение аргументативных сегментов, прогнозирование дальних связей между неконтактными сегментами, обучение на данных, размеченных с низкой степенью согласованности между аннотаторами. В настоящей работе рассмотрен подход к извлечению аргументативных отношений из достаточно больших текстов, относящихся к области научной коммуникации. Проведен сравнительный анализ методов тонкой настройки с использованием предобученной языковой модели типа Longformer, позволяющей учитывать длинные контексты, и двух методов, позволяющих учитывать расхождения аннотаторов в разметке аргументов за счет использования так называемых мягких меток, полученных путем равномерного сглаживания меток и усреднения экспертных оценок. Эксперименты проводились на четырех наборах данных, содержащих положительные и отрицательные примеры пар утверждений (посылка, заключение) и различающихся способами сегментации и средним размером текста. Наилучшие результаты получены на модели с усреднением экспертных оценок. В то же время отмечено, что модель, использующая сглаженные метки, также повышает точность классификаторов, но ухудшает полноту.
</p></abstract><trans-abstract xml:lang="en"><p>Extracting argumentative structures is difficult for several reasons: argumentative segments must be identified, long-range links between non-adjacent segments must be predicted, and models must be trained on data annotated with low inter-annotator agreement. This paper considers an approach to extracting argumentative relations from fairly long texts in the domain of scientific communication. We compare fine-tuning of a pre-trained Longformer-type language model, which can take long contexts into account, with two methods that account for annotator disagreement in argument labeling by using so-called soft labels, obtained by uniform label smoothing and by averaging expert assessments, respectively. The experiments were conducted on four datasets that contain positive and negative examples of statement pairs (premise, conclusion) and differ in segmentation method and average text size. The best results were obtained with the model that averages expert assessments. The model that uses smoothed labels also improves the precision of the classifiers but reduces recall.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>анализ аргументации</kwd><kwd>извлечение аргументативных отношений</kwd><kwd>научная коммуникация</kwd><kwd>проблемы сегментации</kwd><kwd>мягкая метка</kwd><kwd>сглаживание меток</kwd><kwd>языковая модель</kwd></kwd-group><kwd-group xml:lang="en"><kwd>argument mining</kwd><kwd>argumentative relation extraction</kwd><kwd>scientific communication</kwd><kwd>segmentation problem</kwd><kwd>soft label</kwd><kwd>label smoothing</kwd><kwd>language model</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Meissner J.M., Thumwanit N., Sugawara S., Aizawa A. Embracing Ambiguity: Shifting the Training Target of NLI Models // Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, August 2021. Association for Computational Linguistics: Vol. 2: Short Papers, P. 862–869. https://doi.org/10.18653/v1/2021.acl-short.109</mixed-citation><mixed-citation xml:lang="en">Meissner J.M., Thumwanit N., Sugawara S., Aizawa A. Embracing Ambiguity: Shifting the Training Target of NLI Models // Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, August 2021. Association for Computational Linguistics: Vol. 2: Short Papers, P. 862–869. https://doi.org/10.18653/v1/2021.acl-short.109</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Lukasik M., Bhojanapalli S., Menon A., Kumar S. Does label smoothing mitigate label noise? // Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020, Vol. 119, P. 6448–6458. 
URL: https://proceedings.mlr.press/v119/lukasik20a.html</mixed-citation><mixed-citation xml:lang="en">Lukasik M., Bhojanapalli S., Menon A., Kumar S. Does label smoothing mitigate label noise? // Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020, Vol. 119, P. 6448–6458. URL: https://proceedings.mlr.press/v119/lukasik20a.html</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Haque S., Bansal A., McMillan C. Label smoothing improves neural source code summarization // 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC), Melbourne, Australia, 15–16 May 2023. Institute of Electrical and Electronics Engineers: 2023. P. 101–112. https://doi.org/10.1109/ICPC58990.2023.00025</mixed-citation><mixed-citation xml:lang="en">Haque S., Bansal A., McMillan C. Label smoothing improves neural source code summarization // 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC), Melbourne, Australia, 15–16 May 2023. Institute of Electrical and Electronics Engineers: 2023. P. 101–112. https://doi.org/10.1109/ICPC58990.2023.00025</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z. Rethinking the Inception Architecture for Computer Vision // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27–30 June, 2016. Institute of Electrical and Electronics Engineers: P. 2818–2826. https://doi.org/10.1109/CVPR.2016.308</mixed-citation><mixed-citation xml:lang="en">Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z. Rethinking the Inception Architecture for Computer Vision // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27–30 June, 2016. Institute of Electrical and Electronics Engineers: P. 2818–2826. 
https://doi.org/10.1109/CVPR.2016.308</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Wang Y., Wang M., Chen Y., Tao S., Guo J., Su C., Zhang M., Yang H. Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference // Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland. May 2022. Association for Computational Linguistics: 2022, P. 1524–1535. https://doi.org/10.18653/v1/2022.findings-acl.120</mixed-citation><mixed-citation xml:lang="en">Wang Y., Wang M., Chen Y., Tao S., Guo J., Su C., Zhang M., Yang H. Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference // Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland. May 2022. Association for Computational Linguistics: 2022, P. 1524–1535. https://doi.org/10.18653/v1/2022.findings-acl.120</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Timofeeva M.K., Ilina D.V., Kononenko I.S. Argumentative Annotation of the Scientific Internet-Communication Corpus: Genre Analysis and Study of Typical Reasoning Models based on the ArgNetBank Studio Platform // NSU Vestnik. Series: Linguistics and Intercultural Communication. 2024. Vol. 22, No. 1. P. 27–49. (In Russ.) https://doi.org/10.25205/1818-7935-2024-22-1-27-49</mixed-citation><mixed-citation xml:lang="en">Timofeeva M.K., Ilina D.V., Kononenko I.S. Argumentative Annotation of the Scientific Internet-Communication Corpus: Genre Analysis and Study of Typical Reasoning Models based on the ArgNetBank Studio Platform // NSU Vestnik. Series: Linguistics and Intercultural Communication. 2024. Vol. 22, No. 1. P. 27–49. (In Russ.) 
https://doi.org/10.25205/1818-7935-2024-22-1-27-49</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Shestakov V.K., Kononenko I.S., Sidorova E.A., Zagorulko Yu.A. Assessing Inter-Annotator Agreement on Argumentative Markup // 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). IEEE: 2024, P. 309–313. https://doi.org/10.1109/SIBIRCON63777.2024.10758535</mixed-citation><mixed-citation xml:lang="en">Shestakov V.K., Kononenko I.S., Sidorova E.A., Zagorulko Yu.A. Assessing Inter-Annotator Agreement on Argumentative Markup // 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). IEEE: 2024, P. 309–313. https://doi.org/10.1109/SIBIRCON63777.2024.10758535</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Akhmadeeva I., Sidorova E., Ilina D. Argument mining in scientific communication: Comparative study // Internet and modern society. Human-computer communication. Cham: Springer Nature Switzerland, 2026. P. 152–166. https://doi.org/10.1007/978-3-031-96177-9_13</mixed-citation><mixed-citation xml:lang="en">Akhmadeeva I., Sidorova E., Ilina D. Argument mining in scientific communication: Comparative study // Internet and modern society. Human-computer communication. Cham: Springer Nature Switzerland, 2026. P. 152–166. https://doi.org/10.1007/978-3-031-96177-9_13</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Beltagy I., Peters M.E., Cohan A. Longformer: The long-document transformer // arXiv preprint arXiv:2004.05150. 2020.</mixed-citation><mixed-citation xml:lang="en">Beltagy I., Peters M.E., Cohan A. Longformer: The long-document transformer // arXiv preprint arXiv:2004.05150.
2020.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare no conflicts of interest.</p></fn></fn-group></back></article>
