<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id custom-type="elpub" pub-id-type="custom">ellibs-434</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Использование синтаксиса для анализа тональности твитов на русском языке</article-title><trans-title-group xml:lang="en"><trans-title>Using syntax for sentiment analysis of Russian tweets</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Адаскина</surname><given-names>Ю. В.</given-names></name></name-alternatives><email xlink:type="simple">adaskina@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Паничева</surname><given-names>П. В.</given-names></name></name-alternatives><email xlink:type="simple">p.panicheva@spbu.ru</email><xref ref-type="aff" rid="aff-2"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Попов</surname><given-names>А. 
М.</given-names></name></name-alternatives><email xlink:type="simple">hedgeonline@gmail.com</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff xml:lang="ru" id="aff-1"><institution>ООО «InfoQubes»</institution><country>Russian Federation</country></aff><aff xml:lang="ru" id="aff-2"><institution>Санкт-Петербургский государственный университет</institution><country>Russian Federation</country></aff><pub-date pub-type="collection"><year>2015</year></pub-date><pub-date pub-type="epub"><day>28</day><month>06</month><year>2015</year></pub-date><volume>18</volume><issue>3-4</issue><fpage>163</fpage><lpage>184</lpage><permissions><copyright-statement>Copyright &#x00A9; Адаскина Ю.В., Паничева П.В., Попов А.М., 2015</copyright-statement><copyright-year>2015</copyright-year><copyright-holder xml:lang="ru">Адаскина Ю.В., Паничева П.В., Попов А.М.</copyright-holder><copyright-holder xml:lang="en">Адаскина Ю.В., Паничева П.В., Попов А.М.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/434">https://ellibs.elpub.ru/jour/article/view/434</self-uri><abstract><p>Представлен подход к решению задачи анализа тональности в рамках тестирования SentiRuEval – открытого соревнования систем анализа тональности на русском языке. Описанный алгоритм был применен в дорожке по анализу тональности твитов о банках и телекоммуникационных компаниях. 
Для этих данных была разработана и оценена классификация на три класса: положительный, отрицательный и нейтральный.

Для решения поставленной задачи использовались различные алгоритмы машинного обучения. Признаками для классификатора являлись лингвистические данные, полученные из текста с помощью разработанного нами морфо-синтаксического анализатора. Нормализованные слова, а также синтаксические связи, оказались решающими признаками для достижения наилучшего результата, который был получен с помощью статистического алгоритма опорных векторов.
Оценка, проведенная организаторами конкурса, выявила высокое качество предложенного подхода, который занял первую строчку по трем из четырех мерам качества.</p></abstract><trans-abstract xml:lang="en"><p>The paper describes our approach to the task of sentiment analysis of tweets within SentiRuEval – an open evaluation of sentiment analysis systems for the Russian language. We took part in the task of sentiment analysis of Russian tweets concerning two types of organizations: banks and telecommunications companies. On both datasets, the participants were required to perform a three-way classification of tweets: positive, negative or neutral.

We evaluated several statistical machine learning algorithms. The classification features are linguistic data produced by our morpho-syntactic analyzer. Syntactic relations proved to be a crucial feature for every statistical method evaluated, and SVM-based classification outperformed the others. Normalized words were another important feature.

The evaluation conducted by the organizers showed that our method was successful: it ranked first on three of the four evaluation measures.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>анализ тональности</kwd><kwd>синтаксические связи</kwd><kwd>русский язык</kwd><kwd>статистические методы</kwd><kwd>классификация текстов</kwd></kwd-group><kwd-group xml:lang="en"><kwd>sentiment analysis</kwd><kwd>syntactical relations</kwd><kwd>Russian language</kwd><kwd>statistical methods</kwd><kwd>text classification</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Chetviorkin I., Braslavskiy P., Loukachevitch N. Sentiment analysis track at ROMIP 2011 // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialog 2012». 2012. P. 1-14.</mixed-citation><mixed-citation xml:lang="en">Chetviorkin I., Braslavskiy P., Loukachevitch N. Sentiment analysis track at ROMIP 2011 // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialog 2012». 2012. P. 1-14.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Chetviorkin I., Loukachevitch N. Evaluating sentiment analysis systems in Russian // Proceedings of BSNLP workshop, ACL, Prague. 2013. P. 12-17.</mixed-citation><mixed-citation xml:lang="en">Chetviorkin I., Loukachevitch N. Evaluating sentiment analysis systems in Russian // Proceedings of BSNLP workshop, ACL, Prague. 2013. P. 12-17.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Loukachevitch N., Blinov P., Kotelnikov E., Rubtsova Ju., Ivanov V., Tutubalina H. Sentirueval: testing object-oriented sentiment analysis systems in Russian // Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference «Dialogue». 2015. Issue 14. V. 2. P. 
13-24.</mixed-citation><mixed-citation xml:lang="en">Loukachevitch N., Blinov P., Kotelnikov E., Rubtsova Ju., Ivanov V., Tutubalina H. Sentirueval: testing object-oriented sentiment analysis systems in Russian // Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference «Dialogue». 2015. Issue 14. V. 2. P. 13-24.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Pang B., Lee L., Vaithyanathan S. Thumbs up? Sentiment classification using machine learning techniques // Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing. 2002. V. 10. P. 79-86.</mixed-citation><mixed-citation xml:lang="en">Pang B., Lee L., Vaithyanathan S. Thumbs up? Sentiment classification using machine learning techniques // Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing. 2002. V. 10. P. 79-86.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Mullen T., Collier N. Sentiment analysis using support vector machines with diverse information sources // Proceedings of 9th EMNLP. 2004. P. 412-418.</mixed-citation><mixed-citation xml:lang="en">Mullen T., Collier N. Sentiment analysis using support vector machines with diverse information sources // Proceedings of 9th EMNLP. 2004. P. 412-418.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Turney P. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews // Proceedings of the 40th ACL. 2002. P. 417-424.</mixed-citation><mixed-citation xml:lang="en">Turney P. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews // Proceedings of the 40th ACL. 2002. P. 
417-424.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Kudo T., Matsumoto Y. A boosting algorithm for classification of semi-structured text // Proceedings of 9th EMNLP. 2004. P. 301-308.</mixed-citation><mixed-citation xml:lang="en">Kudo T., Matsumoto Y. A boosting algorithm for classification of semi-structured text // Proceedings of 9th EMNLP. 2004. P. 301-308.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Matsumoto S., Takamura H., Okumura M. Sentiment classification using word sub-sequences and dependency sub-trees // Ho T.-B., Cheung D., Liu H. (eds.) PAKDD 2005. V. 3518. P. 301-311.</mixed-citation><mixed-citation xml:lang="en">Matsumoto S., Takamura H., Okumura M. Sentiment classification using word sub-sequences and dependency sub-trees // Ho T.-B., Cheung D., Liu H. (eds.) PAKDD 2005. V. 3518. P. 301-311.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Mavljutov R.R., Ostapuk N.A. Using basic syntactic relations for sentiment analysis // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialog 2013». 2013. P. 91-100.</mixed-citation><mixed-citation xml:lang="en">Mavljutov R.R., Ostapuk N.A. Using basic syntactic relations for sentiment analysis // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialog 2013». 2013. P. 91-100.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Yussupova N., Bogdanova D., Boyko M. Applying of sentiment analysis for texts in Russian based on machine learning approach // Proceedings of The Second International Conference on Advances in Information Mining and Management, Italy. 2012. P. 
8-14.</mixed-citation><mixed-citation xml:lang="en">Yussupova N., Bogdanova D., Boyko M. Applying of sentiment analysis for texts in Russian based on machine learning approach // Proceedings of The Second International Conference on Advances in Information Mining and Management, Italy. 2012. P. 8-14.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Furnkranz J., Mitchell T. M., Riloff E. A case study in using linguistic phrases for text categorization on the WWW // Proceedings of the AAAI Workshop on Learning for Text Categorization, Madison, US. 1998. P. 5-12.</mixed-citation><mixed-citation xml:lang="en">Furnkranz J., Mitchell T. M., Riloff E. A case study in using linguistic phrases for text categorization on the WWW // Proceedings of the AAAI Workshop on Learning for Text Categorization, Madison, US. 1998. P. 5-12.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Caropreso M.F., Matwin S., Sebastiani F. A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization // Amita G. Chin (ed.), Text Databases and Document Management: Theory and Practice. 2006. P. 78-102.</mixed-citation><mixed-citation xml:lang="en">Caropreso M.F., Matwin S., Sebastiani F. A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization // Amita G. Chin (ed.), Text Databases and Document Management: Theory and Practice. 2006. P. 78-102.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Nastase V., Shirabad J.S., Caropreso M.F. Using dependency relations for text classification // Proceedings of the 19th Canadian Conference on Artificial Intelligence, Quebec City. 2006. P. 
12-25.</mixed-citation><mixed-citation xml:lang="en">Nastase V., Shirabad J.S., Caropreso M.F. Using dependency relations for text classification // Proceedings of the 19th Canadian Conference on Artificial Intelligence, Quebec City. 2006. P. 12-25.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Zhao S., Grishman R. Extracting relations with Integrated Information using kernel methods // Proceedings of the 43rd Annual Meeting of the ACL, Ann Arbor, US. 2005. P. 419-426.</mixed-citation><mixed-citation xml:lang="en">Zhao S., Grishman R. Extracting relations with Integrated Information using kernel methods // Proceedings of the 43rd Annual Meeting of the ACL, Ann Arbor, US. 2005. P. 419-426.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Jansen B.J., Zhang M., Sobel K., Chowdury A. Twitter power: tweets as electronic word of mouth // Journal of the American Society for Information Science and Technology. 2009. V. 60, No 11. P. 2169-2188.</mixed-citation><mixed-citation xml:lang="en">Jansen B.J., Zhang M., Sobel K., Chowdury A. Twitter power: tweets as electronic word of mouth // Journal of the American Society for Information Science and Technology. 2009. V. 60, No 11. P. 2169-2188.</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Go A., Bhayani R., Huang L. Twitter sentiment classification using distant supervision // Technical report, Stanford. 2009.</mixed-citation><mixed-citation xml:lang="en">Go A., Bhayani R., Huang L. Twitter sentiment classification using distant supervision // Technical report, Stanford. 2009.</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Jiang L., Yu M., Zhou M., Liu X., Zhao T. 
Target-dependent Twitter sentiment classification // Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, US. 2011. P. 151-160.</mixed-citation><mixed-citation xml:lang="en">Jiang L., Yu M., Zhou M., Liu X., Zhao T. Target-dependent Twitter sentiment classification // Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, US. 2011. P. 151-160.</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Kouloumpis E., Wilson T., Moore J. Twitter sentiment analysis: the good the bad and the omg! // Artificial Intelligence. 2011. P. 538-541.</mixed-citation><mixed-citation xml:lang="en">Kouloumpis E., Wilson T., Moore J. Twitter sentiment analysis: the good the bad and the omg! // Artificial Intelligence. 2011. P. 538-541.</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Pak A., Paroubek P. Twitter as a corpus for sentiment analysis and opinion mining // Proceedings of LREC, Valletta. 2010. P. 75-100.</mixed-citation><mixed-citation xml:lang="en">Pak A., Paroubek P. Twitter as a corpus for sentiment analysis and opinion mining // Proceedings of LREC, Valletta. 2010. P. 75-100.</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Адаскина Ю.В., Паничева П.В., Попов А.М. Полуавтоматическое пополнение словарей на основе синтаксических связей // Технологии информационного общества в науке, образовании и культуре: сборник научных статей. Труды XVII Всероссийской объединенной конференции «Интернет и современное общество» (IMS-2014), Санкт-Петербург, 19 – 20 ноября 2014 г. 2014. С. 271-276.</mixed-citation><mixed-citation xml:lang="en">Адаскина Ю.В., Паничева П.В., Попов А.М. 
Полуавтоматическое пополнение словарей на основе синтаксических связей // Технологии информационного общества в науке, образовании и культуре: сборник научных статей. Труды XVII Всероссийской объединенной конференции «Интернет и современное общество» (IMS-2014), Санкт-Петербург, 19 – 20 ноября 2014 г. 2014. С. 271-276.</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Зализняк А.А. Грамматический словарь русского языка. М.: Русский язык, 1980.</mixed-citation><mixed-citation xml:lang="en">Зализняк А.А. Грамматический словарь русского языка. М.: Русский язык, 1980.</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., Duchesnay É. Scikit-learn: machine learning in Python // Journal of Machine Learning Research. 2011. V. 12 (Oct). P. 2825-2830.</mixed-citation><mixed-citation xml:lang="en">Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., Duchesnay É. Scikit-learn: machine learning in Python // Journal of Machine Learning Research. 2011. V. 12 (Oct). P. 2825-2830.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
