<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2025-28-6-1415-1434</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-626</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Некоторые подходы к повышению точности прогнозирования с использованием ансамблевых методов</article-title><trans-title-group xml:lang="en"><trans-title>Some Approaches to Improving Prediction Accuracy using Ensemble Methods</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Ма</surname><given-names>Синьюэ</given-names></name><name name-style="western" xml:lang="en"><surname>Ma</surname><given-names>Xinyue</given-names></name></name-alternatives><email xlink:type="simple">xinyuema35@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Сенько</surname><given-names>Олег Валентинович</given-names></name><name name-style="western" 
xml:lang="en"><surname>Sen’Ko</surname><given-names>Oleg Valentinovich</given-names></name></name-alternatives><email xlink:type="simple">OSenko@frccsc.ru</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Московский государственный университет имени М. В. Ломоносова</institution></aff><aff xml:lang="en"><institution>Lomonosov Moscow State University</institution></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru"><institution>Федеральный исследовательский центр «Информатика и управление» Российской академии наук</institution></aff><aff xml:lang="en"><institution>Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>19</day><month>12</month><year>2025</year></pub-date><volume>28</volume><issue>6</issue><fpage>1415</fpage><lpage>1434</lpage><permissions><copyright-statement>Copyright &#x00A9; Ма С., Сенько О.В., 2025</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="ru">Ма С., Сенько О.В.</copyright-holder><copyright-holder xml:lang="en">Ma X., Sen’Ko O.V.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/626">https://ellibs.elpub.ru/jour/article/view/626</self-uri><abstract><p>Представлены результаты экспериментального 
исследования эффективности использования сверхслучайных деревьев в моделях, основанных на градиентном бустинге, а также в новом ансамблевом методе, в котором лес генерируется, исходя из условия повышенной внутренней дивергенции. Исследована эффективность сверхслучайных деревьев при использовании расширенных наборов признаков с включением новых признаков, вычисляемых как расстояния до набора описаний опорных объектов из обучающей выборки. Показано, что использование сверхслучайных деревьев в моделях градиентного бустинга и дивергентного леса позволяет улучшить обобщающую способность, а также, что к еще большему росту обобщающей способности приводит использование расширенных наборов признаков.
</p></abstract><trans-abstract xml:lang="en"><p>This study presents an experimental evaluation of extremely randomized trees (Extra Trees) within gradient boosting models and in a new ensemble method in which the forest is generated under a condition of increased internal divergence. The paper also examines the performance of Extra Trees on expanded feature sets that include new features computed as distances to the descriptions of a set of reference objects from the training sample. The results show that using Extra Trees in gradient boosting and divergent forest models improves generalization ability, and that expanded feature sets improve it further.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>регрессионное моделирование</kwd><kwd>ансамблевое обучение</kwd><kwd>метрическое пространство</kwd><kwd>метод сверхслучайных деревьев</kwd></kwd-group><kwd-group xml:lang="en"><kwd>regression modeling</kwd><kwd>ensemble learning</kwd><kwd>metric space</kwd><kwd>extremely randomized trees method</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Habr. Open Machine Learning Course. Topic 5. Ensembles: Bagging, Random Forest. Available at: https://habr.com/ru/companies/ods/articles/324402/ (accessed 6 November 2025). (In Russ.).</mixed-citation><mixed-citation xml:lang="en">Habr. Open Machine Learning Course. Topic 5. Ensembles: Bagging, Random Forest. Available at: https://habr.com/ru/companies/ods/articles/324402/ (accessed 6 November 2025). (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Dmitriev A.I., Zhuravlev Yu.I., Krendelev F.P. O matematicheskikh printsipakh klassifikatsii predmetov ili yavlenii [On the Mathematical Principles of the Classification of Objects and Phenomena] // Diskretnyi analiz [Discrete Analysis]. 1967. No. 7. P. 3–17 (In Russ.).</mixed-citation><mixed-citation xml:lang="en">Dmitriev A.I., Zhuravlev Yu.I., Krendelev F.P. O matematicheskikh printsipakh klassifikatsii predmetov ili yavlenii [On the Mathematical Principles of the Classification of Objects and Phenomena] // Diskretnyi analiz [Discrete Analysis]. 1967. No. 7. P. 3–17 (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Vaintsvaig M.N. Algoritm obucheniya raspoznavaniyu obrazov “Kora” [Algorithm for pattern recognition learning “Kora”] // Algoritmy obucheniya raspoznavaniyu obrazov [Algorithms for pattern recognition learning]. 
Moscow: Sovetskoe radio, 1973. P. 8–12 (In Russ.).</mixed-citation><mixed-citation xml:lang="en">Vaintsvaig M.N. Algoritm obucheniya raspoznavaniyu obrazov “Kora” [Algorithm for pattern recognition learning “Kora”] // Algoritmy obucheniya raspoznavaniyu obrazov [Algorithms for pattern recognition learning]. Moscow: Sovetskoe radio, 1973. P. 8–12 (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Heath D., Kasif S., Salzberg S. k-DT: A multi-tree learning method // Proceedings of the Second International Workshop on Multistrategy Learning. 1993. P. 138–149. https://doi.org/10.1007/0-387-34296-6_10</mixed-citation><mixed-citation xml:lang="en">Heath D., Kasif S., Salzberg S. k-DT: A multi-tree learning method // Proceedings of the Second International Workshop on Multistrategy Learning. 1993. P. 138–149. https://doi.org/10.1007/0-387-34296-6_10</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Breiman L. Random Forests // Machine Learning. 2001. Vol. 45, No. 1. P. 5–32. https://doi.org/10.1023/A:1010933404324</mixed-citation><mixed-citation xml:lang="en">Breiman L. Random Forests // Machine Learning. 2001. Vol. 45, No. 1. P. 5–32. https://doi.org/10.1023/A:1010933404324</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Breiman L. Bagging predictors // Machine Learning. 1996. Vol. 24, No. 2. P. 123–140. https://doi.org/10.1007/BF00058655</mixed-citation><mixed-citation xml:lang="en">Breiman L. Bagging predictors // Machine Learning. 1996. Vol. 24, No. 2. P. 123–140. https://doi.org/10.1007/BF00058655</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Ho T.K. 
The Random Subspace Method for Constructing Decision Forests // IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998. Vol. 20, No. 8. P. 832–844. https://doi.org/10.1109/34.709601</mixed-citation><mixed-citation xml:lang="en">Ho T.K. The Random Subspace Method for Constructing Decision Forests // IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998. Vol. 20, No. 8. P. 832–844. https://doi.org/10.1109/34.709601</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Freund Y., Schapire R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting // Journal of Computer and System Sciences. 1997. Vol. 55. P. 119–139. https://doi.org/10.1006/jcss.1997.1504</mixed-citation><mixed-citation xml:lang="en">Freund Y., Schapire R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting // Journal of Computer and System Sciences. 1997. Vol. 55. P. 119–139. https://doi.org/10.1006/jcss.1997.1504</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Friedman J.H. Stochastic Gradient Boosting // Computational Statistics &amp; Data Analysis. 2002. Vol. 38, No. 4. P. 367–378. https://doi.org/10.1016/S0167-9473(01)00065-2</mixed-citation><mixed-citation xml:lang="en">Friedman J.H. Stochastic Gradient Boosting // Computational Statistics &amp; Data Analysis. 2002. Vol. 38, No. 4. P. 367–378. https://doi.org/10.1016/S0167-9473(01)00065-2</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Zhou Z.H. Ensemble Methods: Foundations and Algorithms. New York: Chapman and Hall/CRC, 2012. 446 p. ISBN 978-1-4398-3003-1.</mixed-citation><mixed-citation xml:lang="en">Zhou Z.H. Ensemble Methods: Foundations and Algorithms. New York: Chapman and Hall/CRC, 2012. 446 p. 
ISBN 978-1-4398-3003-1.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York: Springer, 2009. 745 p. https://doi.org/10.1007/978-0-387-84858-7</mixed-citation><mixed-citation xml:lang="en">Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York: Springer, 2009. 745 p. https://doi.org/10.1007/978-0-387-84858-7</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Beja-Battais P. Overview of AdaBoost: Reconciling its Views to Better Understand its Dynamics // arXiv preprint arXiv:2310.18323 [cs.LG]. 2023. https://doi.org/10.48550/arXiv.2310.18323</mixed-citation><mixed-citation xml:lang="en">Beja-Battais P. Overview of AdaBoost: Reconciling its Views to Better Understand its Dynamics // arXiv preprint arXiv:2310.18323 [cs.LG]. 2023. https://doi.org/10.48550/arXiv.2310.18323</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Chen T., Guestrin C. XGBoost: A Scalable Tree Boosting System // Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. P. 785–794. https://doi.org/10.48550/arXiv.1603.02754</mixed-citation><mixed-citation xml:lang="en">Chen T., Guestrin C. XGBoost: A Scalable Tree Boosting System // Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. P. 785–794. https://doi.org/10.48550/arXiv.1603.02754</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T.-Y. 
LightGBM: A Highly Efficient Gradient Boosting Decision Tree // Advances in Neural Information Processing Systems (NeurIPS). 2017. Vol. 30.</mixed-citation><mixed-citation xml:lang="en">Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree // Advances in Neural Information Processing Systems (NeurIPS). 2017. Vol. 30.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Hancock J.T., Khoshgoftaar T.M. CatBoost for big data: an interdisciplinary review // Journal of Big Data. 2020. Vol. 7, No. 94. 45 p. https://doi.org/10.1186/s40537-020-00369-8</mixed-citation><mixed-citation xml:lang="en">Hancock J.T., Khoshgoftaar T.M. CatBoost for big data: an interdisciplinary review // Journal of Big Data. 2020. Vol. 7, No. 94. 45 p. https://doi.org/10.1186/s40537-020-00369-8</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Zhuravlev Yu.I., Senko O.V., Dokukin A.A., Kiselyova N.N., Saenko I.A. Two-Level Regression Method Using Ensembles of Trees with Optimal Divergence // Doklady Mathematics. 2021. Vol. 103, No. 1. P. 1–4.</mixed-citation><mixed-citation xml:lang="en">Zhuravlev Yu.I., Senko O.V., Dokukin A.A., Kiselyova N.N., Saenko I.A. Two-Level Regression Method Using Ensembles of Trees with Optimal Divergence // Doklady Mathematics. 2021. Vol. 103, No. 1. P. 1–4.</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">https://doi.org/10.1134/S1064562421040177</mixed-citation><mixed-citation xml:lang="en">https://doi.org/10.1134/S1064562421040177</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Dokukin A.A., Sen’ko O.V. 
A New Two-Level Machine Learning Method for Evaluating the Real Characteristics of Objects // Journal of Computer and Systems Sciences International. 2023. Vol. 62, No. 4. P. 607–614. https://doi.org/10.1134/S1064230723040020</mixed-citation><mixed-citation xml:lang="en">Dokukin A.A., Sen’ko O.V. A New Two-Level Machine Learning Method for Evaluating the Real Characteristics of Objects // Journal of Computer and Systems Sciences International. 2023. Vol. 62, No. 4. P. 607–614. https://doi.org/10.1134/S1064230723040020</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Senko O.V., Dokukin A.A., Kiselyova N.N., Dudarev V.A., Kuznetsova Yu.O. New Two-Level Ensemble Method and Its Application to Chemical Compounds Properties Prediction // Lobachevskii Journal of Mathematics. 2023. Vol. 44, No. 1. P. 188–197. https://doi.org/10.1134/S1995080223010341</mixed-citation><mixed-citation xml:lang="en">Senko O.V., Dokukin A.A., Kiselyova N.N., Dudarev V.A., Kuznetsova Yu.O. New Two-Level Ensemble Method and Its Application to Chemical Compounds Properties Prediction // Lobachevskii Journal of Mathematics. 2023. Vol. 44, No. 1. P. 188–197. https://doi.org/10.1134/S1995080223010341</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">Geurts P., Ernst D., Wehenkel L. Extremely Randomized Trees // Machine Learning. 2006. Vol. 63, No. 1. P. 3–42. https://doi.org/10.1007/s10994-006-6226-1</mixed-citation><mixed-citation xml:lang="en">Geurts P., Ernst D., Wehenkel L. Extremely Randomized Trees // Machine Learning. 2006. Vol. 63, No. 1. P. 3–42. https://doi.org/10.1007/s10994-006-6226-1</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">López-Iñesta E., Grimaldo F., Arevalillo-Herráez M. 
Combining feature extraction and expansion to improve classification-based similarity learning // Pattern Recognition Letters. 2016. Vol. 85. P. 84–90. https://doi.org/10.1016/j.patrec.2016.11.005</mixed-citation><mixed-citation xml:lang="en">López-Iñesta E., Grimaldo F., Arevalillo-Herráez M. Combining feature extraction and expansion to improve classification-based similarity learning // Pattern Recognition Letters. 2016. Vol. 85. P. 84–90. https://doi.org/10.1016/j.patrec.2016.11.005</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Breiman L., Friedman J., Olshen R.A., Stone C.J. Classification and Regression Trees. Monterey, CA: Wadsworth &amp; Brooks/Cole, 1984. 358 p. https://doi.org/10.1201/9781315139470</mixed-citation><mixed-citation xml:lang="en">Breiman L., Friedman J., Olshen R.A., Stone C.J. Classification and Regression Trees. Monterey, CA: Wadsworth &amp; Brooks/Cole, 1984. 358 p. https://doi.org/10.1201/9781315139470</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Mahalanobis P.C. On the Generalised Distance in Statistics (reprint of 1936) // Sankhya A. 2018. Vol. 80, Suppl. 1. P. 1–7. https://doi.org/10.1007/s13171-019-00164-5</mixed-citation><mixed-citation xml:lang="en">Mahalanobis P.C. On the Generalised Distance in Statistics (reprint of 1936) // Sankhya A. 2018. Vol. 80, Suppl. 1. P. 1–7. https://doi.org/10.1007/s13171-019-00164-5</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
