<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2023-26-4-414-436</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-381</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Инструмент последовательного снятия снимков агрегированных данных из потоковых данных</article-title><trans-title-group xml:lang="en"><trans-title>Tool for Sequential Snapshotting of Aggregated Data from Streaming Data</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Гурьянов</surname><given-names>А. И.</given-names></name><name name-style="western" xml:lang="en"><surname>Gurianov</surname><given-names>A. I.</given-names></name></name-alternatives><email xlink:type="simple">armgnv@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Якупов</surname><given-names>А. 
Ш.</given-names></name><name name-style="western" xml:lang="en"><surname>Yakupov</surname><given-names>A. S.</given-names></name></name-alternatives><email xlink:type="simple">asyakupov@kpfu.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Казанский (Приволжский) Федеральный университет</institution></aff><aff xml:lang="en"><institution>Kazan (Volga region) Federal University</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2023</year></pub-date><pub-date pub-type="epub"><day>28</day><month>08</month><year>2023</year></pub-date><volume>26</volume><issue>4</issue><fpage>414</fpage><lpage>436</lpage><permissions><copyright-statement>Copyright &#x00A9; Гурьянов А.И., Якупов А.Ш., 2023</copyright-statement><copyright-year>2023</copyright-year><copyright-holder xml:lang="ru">Гурьянов А.И., Якупов А.Ш.</copyright-holder><copyright-holder xml:lang="en">Gurianov A.I., Yakupov A.S.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/381">https://ellibs.elpub.ru/jour/article/view/381</self-uri><abstract><p>В современном мире потоковые данные получили широкое распространение во многих предметных областях. Высокую актуальность имеет решение задачи обработки потоковых данных в реальном времени, с минимальной задержкой.

При потоковой обработке данных часто применяются различные приближенные алгоритмы, имеющие гораздо более высокую эффективность по времени и памяти, чем точные алгоритмы. Кроме того, часто возникает потребность прогнозирования состояния потока.

Таким образом, в настоящее время существует потребность в инструменте последовательного снятия снимков агрегированных данных из потоковых данных, дающем возможность прогнозирования состояния потока и применения приближенных алгоритмов обработки потоковых данных.

Авторами статьи разработан такой инструмент, рассмотрены архитектура и механизм его функционирования, а также оценены перспективы его дальнейшего развития.
</p></abstract><trans-abstract xml:lang="en"><p>In the modern world, streaming data has become widespread in many subject areas. Processing streaming data in real time, with minimal delay, is a highly relevant task.

Stream processing often relies on approximate algorithms, which are far more efficient in time and memory than exact algorithms. In addition, there is often a need to forecast the state of the stream.

Thus, there is currently a need for a tool for sequential snapshotting of aggregated data from streaming data that makes it possible to forecast the state of the stream and to apply approximate stream processing algorithms.

The authors have developed such a tool, described its architecture and operating mechanism, and assessed the prospects for its further development.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>потоковые данные</kwd><kwd>потоковая обработка данных</kwd><kwd>анализ потоковых данных</kwd><kwd>материализованные представления</kwd><kwd>потоковые алгоритмы</kwd><kwd>приближенные алгоритмы</kwd><kwd>прогнозирование потока</kwd></kwd-group><kwd-group xml:lang="en"><kwd>streaming data</kwd><kwd>stream processing</kwd><kwd>stream analysis</kwd><kwd>materialized views</kwd><kwd>streaming algorithms</kwd><kwd>approximate algorithms</kwd><kwd>stream forecasting</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Гурьянова Э. А., Гурьянов А. И. Анализ и перспективы рынка SaaS в Российской Федерации // Вестник экономики, права и социологии. 2022. №1. С. 182–185.</mixed-citation><mixed-citation xml:lang="en">Гурьянова Э. А., Гурьянов А. И. Анализ и перспективы рынка SaaS в Российской Федерации // Вестник экономики, права и социологии. 2022. №1. С. 182–185.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Kolajo T., Daramola O., Adebiyi A. Big data stream analysis: a systematic literature review. // Journal of Big Data. 2019. Vol. 6.</mixed-citation><mixed-citation xml:lang="en">Kolajo T., Daramola O., Adebiyi A. Big data stream analysis: a systematic literature review. // Journal of Big Data. 2019. Vol. 6.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">https://doi.org/10.1186/s40537-019-0210-7</mixed-citation><mixed-citation xml:lang="en">https://doi.org/10.1186/s40537-019-0210-7</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Маркова В. Д. Влияние цифровой экономики на бизнес // ЭКО. 2018. №12 (534). С. 7–22.</mixed-citation><mixed-citation xml:lang="en">Маркова В. 
Д. Влияние цифровой экономики на бизнес // ЭКО. 2018. №12 (534). С. 7–22.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Определение потоковой передачи данных // Amazon Web Services (AWS). – URL: https://aws.amazon.com/ru/streaming-data/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Определение потоковой передачи данных // Amazon Web Services (AWS). – URL: https://aws.amazon.com/ru/streaming-data/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Ельченков Р. А., Дунаев М. Е., Зайцев К. С. Прогнозирование временных рядов при обработке потоковых данных в реальном времени // International Journal of Open Information Technologies. 2022. Т. 10, №6. С. 62–69.</mixed-citation><mixed-citation xml:lang="en">Ельченков Р. А., Дунаев М. Е., Зайцев К. С. Прогнозирование временных рядов при обработке потоковых данных в реальном времени // International Journal of Open Information Technologies. 2022. Т. 10, №6. С. 62–69.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Апатова Н. В. Управление в экосистеме бизнеса в период цифровой трансформации // Эффективное управление экономикой: проблемы и перспективы. 2022. С. 238–241.</mixed-citation><mixed-citation xml:lang="en">Апатова Н. В. Управление в экосистеме бизнеса в период цифровой трансформации // Эффективное управление экономикой: проблемы и перспективы. 2022. С. 238–241.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Маркова В. Д., Кузнецова С. А. Развитие стратегического менеджмента в цифровой экономике // Вестник Томского государственного университета. Экономика. 2019. №48. С. 217–232. 
https://doi.org/10.17223/19988648/48/15</mixed-citation><mixed-citation xml:lang="en">Маркова В. Д., Кузнецова С. А. Развитие стратегического менеджмента в цифровой экономике // Вестник Томского государственного университета. Экономика. 2019. №48. С. 217–232. https://doi.org/10.17223/19988648/48/15</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Петренко А. С., Петренко С. А. Технологии больших данных (big data) в области информационной безопасности // The 2018 Symposium on Cybersecurity of the Digital Economy. 2018. C. 248–255.</mixed-citation><mixed-citation xml:lang="en">Петренко А. С., Петренко С. А. Технологии больших данных (big data) в области информационной безопасности // The 2018 Symposium on Cybersecurity of the Digital Economy. 2018. C. 248–255.</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Трофимов В. В., Трофимова Л. А. О концепции управления на основе данных в условиях цифровой трансформации // Петербургский экономический журнал. 2021. №4. С. 149–155. https://doi.org/10.24412/2307-5368-2021-4-149-155</mixed-citation><mixed-citation xml:lang="en">Трофимов В. В., Трофимова Л. А. О концепции управления на основе данных в условиях цифровой трансформации // Петербургский экономический журнал. 2021. №4. С. 149–155. https://doi.org/10.24412/2307-5368-2021-4-149-155</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Логиновский О. В., Шестаков А. Л., Шинкарев А. А. Построение современных корпоративных информационных систем // Управление большими системами: сборник трудов. 2019. №81. С. 113–146.</mixed-citation><mixed-citation xml:lang="en">Логиновский О. В., Шестаков А. Л., Шинкарев А. А. Построение современных корпоративных информационных систем // Управление большими системами: сборник трудов. 2019. №81. С. 
113–146.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">https://doi.org/10.25728/ubs.2019.81.5</mixed-citation><mixed-citation xml:lang="en">https://doi.org/10.25728/ubs.2019.81.5</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Alwaisi S. S. A., Abbood M. N., Jalil L. F., Kasim S., Fudzee M. F. M., Hadi R., Ismail M. A. A. Review on Big Data Stream Processing Applications: Contributions, Benefits, and Limitations // International Journal on Informatics Visualization. 2021. Vol. 5(4). P. 456–460. https://doi.org/10.30630/joiv.5.4.737</mixed-citation><mixed-citation xml:lang="en">Alwaisi S. S. A., Abbood M. N., Jalil L. F., Kasim S., Fudzee M. F. M., Hadi R., Ismail M. A. A. Review on Big Data Stream Processing Applications: Contributions, Benefits, and Limitations // International Journal on Informatics Visualization. 2021. Vol. 5(4). P. 456–460. https://doi.org/10.30630/joiv.5.4.737</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">McSherry F. View Maintenance: A New Approach to Data Processing // Materialize Blog. 2020. URL: https://materialize.com/blog/olvm/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">McSherry F. View Maintenance: A New Approach to Data Processing // Materialize Blog. 2020. URL: https://materialize.com/blog/olvm/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Singh A., Garg S., Kaur R., Batra S., Kumar N., Zomaya A. Y. Probabilistic data structures for big data analytics: A comprehensive review // Knowledge-Based Systems. 2020. Vol. 188. 
https://doi.org/10.1016/j.knosys.2019.104987</mixed-citation><mixed-citation xml:lang="en">Singh A., Garg S., Kaur R., Batra S., Kumar N., Zomaya A. Y. Probabilistic data structures for big data analytics: A comprehensive review // Knowledge-Based Systems. 2020. Vol. 188. https://doi.org/10.1016/j.knosys.2019.104987</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Torres J. F., Hadjout D., Sebaa A., Martinez-Alvarez F., Troncoso A. Deep Learning for Time Series Forecasting: A Survey // Big Data. 2021. Vol 9(1). https://doi.org/10.1089/big.2020.0159</mixed-citation><mixed-citation xml:lang="en">Torres J. F., Hadjout D., Sebaa A., Martinez-Alvarez F., Troncoso A. Deep Learning for Time Series Forecasting: A Survey // Big Data. 2021. Vol 9(1). https://doi.org/10.1089/big.2020.0159</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Brandt T. L., Grawunder M. Moving Object Stream Processing with Short-Time Prediction // Proceedings of the 8th ACM SIGSPATIAL Workshop on GeoStreaming. 2017. https://doi.org/10.1145/3148160.3148168</mixed-citation><mixed-citation xml:lang="en">Brandt T. L., Grawunder M. Moving Object Stream Processing with Short-Time Prediction // Proceedings of the 8th ACM SIGSPATIAL Workshop on GeoStreaming. 2017. https://doi.org/10.1145/3148160.3148168</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Incremental Computation in the Database // Materialize. – URL: https://materialize.com/guides/incremental-computation/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Incremental Computation in the Database // Materialize. 
– URL: https://materialize.com/guides/incremental-computation/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">Upserts in Differential Dataflow // Materialize Blog. 2020. URL: https://materialize.com/blog/upserts-in-differential-dataflow/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Upserts in Differential Dataflow // Materialize Blog. 2020. URL: https://materialize.com/blog/upserts-in-differential-dataflow/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">artemgur/Diplom // GitHub. URL: https://github.com/artemgur/diplom (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">artemgur/Diplom // GitHub. URL: https://github.com/artemgur/diplom (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Materialize Documentation // Materialize. URL: https://materialize.com/docs/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Materialize Documentation // Materialize. URL: https://materialize.com/docs/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Data definition // ksqlDB Documentation. URL: https://docs.ksqldb.io/en/latest/reference/sql/data-definition/ (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Data definition // ksqlDB Documentation. URL: https://docs.ksqldb.io/en/latest/reference/sql/data-definition/ (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Streaming ingestion // Amazon Redshift. 
URL: https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Streaming ingestion // Amazon Redshift. URL: https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit24"><label>24</label><citation-alternatives><mixed-citation xml:lang="ru">Confluent Community License Agreement // GitHub. 2018. URL: https://github.com/confluentinc/ksql/blob/master/LICENSE (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Confluent Community License Agreement // GitHub. 2018. URL: https://github.com/confluentinc/ksql/blob/master/LICENSE (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit25"><label>25</label><citation-alternatives><mixed-citation xml:lang="ru">Materialize Business Source License Agreement // GitHub. URL: https://github.com/MaterializeInc/materialize/blob/main/LICENSE (дата обращения 12.05.2023)</mixed-citation><mixed-citation xml:lang="en">Materialize Business Source License Agreement // GitHub. URL: https://github.com/MaterializeInc/materialize/blob/main/LICENSE (дата обращения 12.05.2023)</mixed-citation></citation-alternatives></ref><ref id="cit26"><label>26</label><citation-alternatives><mixed-citation xml:lang="ru">Ting D. Approximate Distinct Counts for Billions of Datasets // Proceedings of the 2019 International Conference on Management of Data. 2019. P. 69–86.</mixed-citation><mixed-citation xml:lang="en">Ting D. Approximate Distinct Counts for Billions of Datasets // Proceedings of the 2019 International Conference on Management of Data. 2019. P. 
69–86.</mixed-citation></citation-alternatives></ref><ref id="cit27"><label>27</label><citation-alternatives><mixed-citation xml:lang="ru">https://doi.org/10.1145/3299869.3319897</mixed-citation><mixed-citation xml:lang="en">https://doi.org/10.1145/3299869.3319897</mixed-citation></citation-alternatives></ref><ref id="cit28"><label>28</label><citation-alternatives><mixed-citation xml:lang="ru">Fan L., Cao P., Almeida J., Broder A. Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol // IEEE/ACM Transactions on Networking. 2000. Vol. 8(3). P. 281–293. https://doi.org/10.1109/90.851975</mixed-citation><mixed-citation xml:lang="en">Fan L., Cao P., Almeida J., Broder A. Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol // IEEE/ACM Transactions on Networking. 2000. Vol. 8(3). P. 281–293. https://doi.org/10.1109/90.851975</mixed-citation></citation-alternatives></ref><ref id="cit29"><label>29</label><citation-alternatives><mixed-citation xml:lang="ru">Flajolet P., Fusy E., Gandouet O., Meunier F. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm // Discrete Mathematics &amp; Theoretical Computer Science. 2007. P. 137–156.</mixed-citation><mixed-citation xml:lang="en">Flajolet P., Fusy E., Gandouet O., Meunier F. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm // Discrete Mathematics &amp; Theoretical Computer Science. 2007. P. 137–156.</mixed-citation></citation-alternatives></ref><ref id="cit30"><label>30</label><citation-alternatives><mixed-citation xml:lang="ru">Boyer R.S., Moore J.S. MJRTY – A Fast Majority Vote Algorithm // Automated Reasoning / ed. Boyer R. S. Dordrecht: Kluwer Academic Publishers, 1991. P. 105–117. https://doi.org/10.1007/978-94-011-3488-0_5</mixed-citation><mixed-citation xml:lang="en">Boyer R.S., Moore J.S. MJRTY – A Fast Majority Vote Algorithm // Automated Reasoning / ed. Boyer R. S. Dordrecht: Kluwer Academic Publishers, 1991. P. 105–117. 
https://doi.org/10.1007/978-94-011-3488-0_5</mixed-citation></citation-alternatives></ref><ref id="cit31"><label>31</label><citation-alternatives><mixed-citation xml:lang="ru">Singh B., Chaitra B. H. Comprehensive Review of Stream Processing Tools // International Research Journal of Engineering and Technology. 2020. Vol. 7(5). P. 3537–3540.</mixed-citation><mixed-citation xml:lang="en">Singh B., Chaitra B. H. Comprehensive Review of Stream Processing Tools // International Research Journal of Engineering and Technology. 2020. Vol. 7(5). P. 3537–3540.</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
