<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2020-23-5-1026-1043</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-246</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Анализ моделей векторных представлений слов в задаче разметки семантических ролей в русскоязычных текстах</article-title><trans-title-group xml:lang="en"><trans-title>Analysis of Word Embeddings for Semantic Role Labeling of Russian Texts</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Кадермятова</surname><given-names>Л. М.</given-names></name><name name-style="western" xml:lang="en"><surname>Kadermyatova</surname><given-names>L. M.</given-names></name></name-alternatives><email xlink:type="simple">lkadermy@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Тутубалина</surname><given-names>Е. 
В.</given-names></name><name name-style="western" xml:lang="en"><surname>Tutubalina</surname><given-names>E. V.</given-names></name></name-alternatives><email xlink:type="simple">ElVTutubalina@kpfu.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Высшая школа информационных технологий и интеллектуальных систем Казанского (Приволжского) федерального университета</institution></aff><aff xml:lang="en"><institution>Higher Institute of Information Technology and Intelligent Systems</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2020</year></pub-date><pub-date pub-type="epub"><day>28</day><month>10</month><year>2020</year></pub-date><volume>23</volume><issue>5</issue><fpage>1026</fpage><lpage>1043</lpage><permissions><copyright-statement>Copyright © Кадермятова Л.М., Тутубалина Е.В., 2020</copyright-statement><copyright-year>2020</copyright-year><copyright-holder xml:lang="ru">Кадермятова Л.М., Тутубалина Е.В.</copyright-holder><copyright-holder xml:lang="en">Kadermyatova L.M., Tutubalina E.V.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/246">https://ellibs.elpub.ru/jour/article/view/246</self-uri><abstract><p>Изучено влияние использования векторных представлений слов на качество установления семантических ролей в русскоязычных текстах. 
Задача установления семантических ролей в русскоязычных текстах получила широкое распространение после выхода на свет корпуса FrameBank. Были исследованы модели векторных представлений слов word2vec, fastText и ELMo (Embeddings from Language Models). Анализировались метрики качества микро- и макро-F1 как оценочные показатели результатов автоматической разметки актантов. Был проведен ряд экспериментов, демонстрирующих, что модели ELMo, основанные на токенах предикатно-аргументных конструкций, показывают больший прирост качества по сравнению со всеми остальными моделями, в том числе, в сопоставлении с моделями ELMo, обученными на леммах, как по величине микро-F1, так и по величине макро-F1.</p></abstract><trans-abstract xml:lang="en"><p>A large body of work addresses semantic role labeling of English texts [1–3]. Semantic role labeling of Russian texts, however, remained largely unexplored for many years because training and test corpora were lacking. The task gained wide attention after the FrameBank corpus appeared [<xref ref-type="bibr" rid="cit4">4</xref>]. In this work, we analyze how word embedding models affect the quality of semantic role labeling of Russian texts. We computed micro- and macro-F1 scores for the word2vec [<xref ref-type="bibr" rid="cit5">5</xref>], fastText [<xref ref-type="bibr" rid="cit6">6</xref>], and ELMo [<xref ref-type="bibr" rid="cit7">7</xref>] embedding models. The experiments showed that, on the Russian FrameBank corpus, fastText models performed slightly better on average than word2vec models. 
Higher micro- and macro-F1 scores were obtained with the deep, token-based ELMo representation model than with the classical shallow embedding models.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>машинное обучение</kwd><kwd>обработка естественного языка</kwd><kwd>векторные представления слов</kwd><kwd>семантические роли</kwd></kwd-group><kwd-group xml:lang="en"><kwd>machine learning</kwd><kwd>ML-model</kwd><kwd>natural language processing</kwd><kwd>word embedding</kwd><kwd>semantic role labeling</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Christensen J., Mausam, Soderland S., and Etzioni O. (2011), An analysis of open information extraction based on semantic role labeling. In Proceedings of the sixth international conference on Knowledge capture, pp. 113–120.</mixed-citation><mixed-citation xml:lang="en">Christensen J., Mausam, Soderland S., and Etzioni O. (2011), An analysis of open information extraction based on semantic role labeling. In Proceedings of the sixth international conference on Knowledge capture, pp. 113–120.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Sameer Pradhan, Wayne Ward, Kadri Hacioglu, James Martin, and Dan Jurafsky. 2005. Semantic role labeling using different syntactic views. In Proceedings of the Association for Computational Linguistics 43rd annual meeting (ACL-2005), Ann Arbor, MI.</mixed-citation><mixed-citation xml:lang="en">Sameer Pradhan, Wayne Ward, Kadri Hacioglu, James Martin, and Dan Jurafsky. 2005. Semantic role labeling using different syntactic views. 
In Proceedings of the Association for Computational Linguistics 43rd annual meeting (ACL-2005), Ann Arbor, MI.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Luheng He, Kenton Lee, Mike Lewis, and Luke Zettlemoyer. 2017. Deep semantic role labeling: What works and what's next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 473–483.</mixed-citation><mixed-citation xml:lang="en">Luheng He, Kenton Lee, Mike Lewis, and Luke Zettlemoyer. 2017. Deep semantic role labeling: What works and what's next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 473–483.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Olga Lyashevskaya and Egor Kashkin. 2015. FrameBank: a database of Russian lexical constructions. In International Conference on Analysis of Images, Social Networks and Texts, pages 350–360.</mixed-citation><mixed-citation xml:lang="en">Olga Lyashevskaya and Egor Kashkin. 2015. FrameBank: a database of Russian lexical constructions. In International Conference on Analysis of Images, Social Networks and Texts, pages 350–360.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pages 3111–3119.</mixed-citation><mixed-citation xml:lang="en">Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. 
In Advances in neural information processing systems, pages 3111–3119.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.</mixed-citation><mixed-citation xml:lang="en">Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2227–2237.</mixed-citation><mixed-citation xml:lang="en">Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2227–2237.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Baker C. F., Fillmore C. J., and Lowe J. B. (1998), The Berkeley FrameNet project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics Volume 1, pp. 86–90.</mixed-citation><mixed-citation xml:lang="en">Baker C. F., Fillmore C. J., and Lowe J. B. 
(1998), The Berkeley FrameNet project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics Volume 1, pp. 86–90.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Ilya Kuznetsov. 2016. Automatic semantic role labelling in Russian language (in Russian). Ph.D. thesis, Higher School of Economics.</mixed-citation><mixed-citation xml:lang="en">Ilya Kuznetsov. 2016. Automatic semantic role labelling in Russian language (in Russian). Ph.D. thesis, Higher School of Economics.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Shelmanov A., Smirnov I., Larionov D., Chistova E. Semantic Role Labeling with Pretrained Language Models for Known and Unknown Predicates // Proceedings of Recent Advances in Natural Language Processing, pages 619–628, Varna, Bulgaria, Sep 2–4, 2019.</mixed-citation><mixed-citation xml:lang="en">Shelmanov A., Smirnov I., Larionov D., Chistova E. Semantic Role Labeling with Pretrained Language Models for Known and Unknown Predicates // Proceedings of Recent Advances in Natural Language Processing, pages 619–628, Varna, Bulgaria, Sep 2–4, 2019.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Andrey Kutuzov and Elizaveta Kuzmenko, 2017. WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models, pages 155–161. Springer.</mixed-citation><mixed-citation xml:lang="en">Andrey Kutuzov and Elizaveta Kuzmenko, 2017. WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models, pages 155–161. 
Springer.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Khakhulin, Yuri Kuratov, Denis Kuznetsov, et al. 2018. DeepPavlov: Open-source library for dialogue systems. In Proceedings of ACL 2018, System Demonstrations, pages 122–127.</mixed-citation><mixed-citation xml:lang="en">Khakhulin, Yuri Kuratov, Denis Kuznetsov, et al. 2018. DeepPavlov: Open-source library for dialogue systems. In Proceedings of ACL 2018, System Demonstrations, pages 122–127.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Shelmanov A., Devyatkin D. Semantic role labeling with neural networks for texts in Russian // Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference "Dialogue" (2017), Vol. 1, 2017, pp. 245–256.</mixed-citation><mixed-citation xml:lang="en">Shelmanov A., Devyatkin D. Semantic role labeling with neural networks for texts in Russian // Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference "Dialogue" (2017), Vol. 1, 2017, pp. 245–256.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Agarap, A. F. 2018. Deep Learning using Rectified Linear Units (ReLU), Neural and Evolutionary Computing, Vol. 1.</mixed-citation><mixed-citation xml:lang="en">Agarap, A. F. 2018. Deep Learning using Rectified Linear Units (ReLU), Neural and Evolutionary Computing, Vol. 1.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Luheng He, Mike Lewis, and Luke Zettlemoyer. Question-answer driven semantic role labeling: Using natural language to annotate natural language. 
In Proceedings of the 2015 conference on empirical methods in natural language processing (EMNLP 2015), pages 643–653, 2015.</mixed-citation><mixed-citation xml:lang="en">Luheng He, Mike Lewis, and Luke Zettlemoyer. Question-answer driven semantic role labeling: Using natural language to annotate natural language. In Proceedings of the 2015 conference on empirical methods in natural language processing (EMNLP 2015), pages 643–653, 2015.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Wen Tau Yih, Matthew Richardson, Chris Meek, Ming Wei Chang, and Jina Suh. The value of semantic parse labeling for knowledge base question answering. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), pages 201–206, 2016.</mixed-citation><mixed-citation xml:lang="en">Wen Tau Yih, Matthew Richardson, Chris Meek, Ming Wei Chang, and Jina Suh. The value of semantic parse labeling for knowledge base question answering. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), pages 201–206, 2016.</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Janara Christensen, Mausam, Stephen Soderland, and Oren Etzioni. 2010. Semantic role labeling for open information extraction. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading. Association for Computational Linguistics, Los Angeles, California, pages 52–60.</mixed-citation><mixed-citation xml:lang="en">Janara Christensen, Mausam, Stephen Soderland, and Oren Etzioni. 2010. Semantic role labeling for open information extraction. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading. 
Association for Computational Linguistics, Los Angeles, California, pages 52–60.</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">GS Osipov, IV Smirnov, and IA Tikhomirov. 2010. Relational-situational method for text search and analysis and its applications. Scientific and Technical Information Processing, 37(6):432–437.</mixed-citation><mixed-citation xml:lang="en">GS Osipov, IV Smirnov, and IA Tikhomirov. 2010. Relational-situational method for text search and analysis and its applications. Scientific and Technical Information Processing, 37(6):432–437.</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Liu, D., Gildea, D., 2010. Semantic role features for machine translation. Proc. 23rd Int. Conf. on Computational Linguistics, pp. 716–724.</mixed-citation><mixed-citation xml:lang="en">Liu, D., Gildea, D., 2010. Semantic role features for machine translation. Proc. 23rd Int. Conf. on Computational Linguistics, pp. 716–724.</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Kashkin, E.V., Lyashevskaya, O.N.: Semantic roles and construction net in Russian FrameBank [Semanticheskie roli i set’ konstrukcij v sisteme FrameBank] (in Russian). In: Computational Linguistics and Intellectual Technologies. Proceedings of International Conference “Dialog”, vol. 12-1, pp. 297–311. RSUH, Moscow (2013)</mixed-citation><mixed-citation xml:lang="en">Kashkin, E.V., Lyashevskaya, O.N.: Semantic roles and construction net in Russian FrameBank [Semanticheskie roli i set’ konstrukcij v sisteme FrameBank] (in Russian). In: Computational Linguistics and Intellectual Technologies. Proceedings of International Conference “Dialog”, vol. 
12-1, pp. 297–311. RSUH, Moscow (2013)</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Lyashevskaya O. N., Kashkin E. V. Evaluation of frame-semantic role labeling in a case-marking language // Papers from the Annual International Conference "Dialogue" (2014), 2014, pp. 350–365.</mixed-citation><mixed-citation xml:lang="en">Lyashevskaya O. N., Kashkin E. V. Evaluation of frame-semantic role labeling in a case-marking language // Papers from the Annual International Conference "Dialogue" (2014), 2014, pp. 350–365.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that they have no conflicts of interest.</p></fn></fn-group></back></article>
