<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">ellibs</journal-id><journal-title-group><journal-title xml:lang="ru">Электронные библиотеки</journal-title><trans-title-group xml:lang="en"><trans-title>Russian Digital Libraries Journal</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">1562-5419</issn><publisher><publisher-name>Казанский (Приволжский) федеральный университет</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26907/1562-5419-2026-29-1-123-144</article-id><article-id custom-type="elpub" pub-id-type="custom">ellibs-724</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Интеллектуальный сервис мультимодального нейросетевого мониторинга области наблюдения</article-title><trans-title-group xml:lang="en"><trans-title>Intelligent Multimodal Neural Network Monitoring Service for the Surveillance Area</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Миннеахметов</surname><given-names>Разиль Рустемович</given-names></name><name name-style="western" xml:lang="en"><surname>Minneakhmetov</surname><given-names>Razil Rustemovich</given-names></name></name-alternatives><email xlink:type="simple">razil0071999@gmail.com</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Казанский (Приволжский) федеральный университет</institution></aff><aff 
xml:lang="en"><institution>Kazan (Volga region) Federal University</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2026</year></pub-date><pub-date pub-type="epub"><day>04</day><month>03</month><year>2026</year></pub-date><volume>29</volume><issue>1</issue><fpage>123</fpage><lpage>144</lpage><permissions><copyright-statement>Copyright &#x00A9; Миннеахметов Р.Р., 2026</copyright-statement><copyright-year>2026</copyright-year><copyright-holder xml:lang="ru">Миннеахметов Р.Р.</copyright-holder><copyright-holder xml:lang="en">Minneakhmetov R.R.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://ellibs.elpub.ru/jour/article/view/724">https://ellibs.elpub.ru/jour/article/view/724</self-uri><abstract><p>Представлен подход к разработке интеллектуального сервиса мультимодального мониторинга области наблюдения с использованием больших нейросетевых моделей. Предлагаемое решение способно анализировать разнородные данные: видеопотоки, сигналы датчиков окружающей среды (температура, влажность и пр.) и журналы событий – для получения целостной картины происходящего. В качестве основных инструментов задействованы крупные языковые и визуальные модели (например, LLaMA, MiniCPM‑V и др.), развернутые локально с помощью платформы Ollama, что обеспечивает автономную и безопасную обработку информации без необходимости передачи данных на удаленные сервера. 
Разработан прототип системы, работающий в офлайн-режиме и способный выявлять критические ситуации, аномальные отклонения от нормы и контекстно значимые события в наблюдаемой зоне. Описана методика формирования тестовых сценариев и проведения качественной оценки работы модели по метрикам F1-мера, Precision, Recall. Результаты экспериментов подтвердили применимость мультимодальных моделей для решения задач мониторинга: прототип успешно распознает сложные паттерны поведения и демонстрирует потенциал больших моделей в построении адаптивных и масштабируемых систем наблюдения.
</p></abstract><trans-abstract xml:lang="en"><p>This article presents an approach to developing an intelligent service for multimodal monitoring of a surveillance area using large neural network models. The proposed solution analyzes heterogeneous data (video streams, environmental sensor signals such as temperature and humidity, and event logs) to build a coherent picture of what is happening. The main tools are large language and vision models (for example, LLaMA and MiniCPM‑V) deployed locally with the Ollama platform, which enables autonomous and secure processing without transferring data to remote servers. A prototype system has been developed that works offline and detects critical situations, anomalous deviations from the norm, and contextually significant events in the observed area. A method is described for constructing test scenarios and assessing the quality of the model's output on a set of varied situations using the F1 score, precision, and recall. The experimental results confirm that multimodal models are applicable to monitoring tasks: the prototype successfully recognizes complex behavior patterns and demonstrates the potential of large models for building adaptive and scalable surveillance systems.
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>интеллектуальный сервис</kwd><kwd>мультимодальный мониторинг</kwd><kwd>Ollama</kwd><kwd>большие языковые модели</kwd><kwd>отслеживание активностей</kwd><kwd>видеоаналитика</kwd><kwd>искусственный интеллект</kwd></kwd-group><kwd-group xml:lang="en"><kwd>intelligent service</kwd><kwd>multimodal monitoring</kwd><kwd>Ollama</kwd><kwd>Large Language Models</kwd><kwd>activity tracking</kwd><kwd>video analytics</kwd><kwd>artificial intelligence</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Onsu M.A., Lohan P., Kantarci B., Syed A., Andrews M., Kennedy S. Leveraging Multimodal Large Language Models Assisted by Instance Segmentation for Intelligent Traffic Monitoring [Electronic resource] // arXiv. 2025. Available at: https://arxiv.org/abs/2502.11304 (accessed: 15.05.2025).</mixed-citation><mixed-citation xml:lang="en">Onsu M.A., Lohan P., Kantarci B., Syed A., Andrews M., Kennedy S. Leveraging Multimodal Large Language Models Assisted by Instance Segmentation for Intelligent Traffic Monitoring [Electronic resource] // arXiv. 2025. Available at: https://arxiv.org/abs/2502.11304 (accessed: 15.05.2025).</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Ferrara E. Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling // Sensors. 2024. Vol. 24, No. 15. Article 5045.</mixed-citation><mixed-citation xml:lang="en">Ferrara E. Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling // Sensors. 2024. Vol. 24, No. 15. Article 5045.</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Suh S., Rey V.F., Lukowicz P. 
Tasked: Transformer-Based Adversarial Learning for Human Activity Recognition Using Wearable Sensors // Knowledge-Based Systems. 2023. Vol. 260. Article 110143.</mixed-citation><mixed-citation xml:lang="en">Suh S., Rey V.F., Lukowicz P. Tasked: Transformer-Based Adversarial Learning for Human Activity Recognition Using Wearable Sensors // Knowledge-Based Systems. 2023. Vol. 260. Article 110143.</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Nauchnyy servis v seti Internet: trudy XXVI Vserossiyskoy nauchnoy konferentsii (September 22–25, 2025, online). Moscow: Keldysh Institute of Applied Mathematics, 2025 (in press).</mixed-citation><mixed-citation xml:lang="en">Nauchnyy servis v seti Internet: trudy XXVI Vserossiyskoy nauchnoy konferentsii (September 22–25, 2025, online). Moscow: Keldysh Institute of Applied Mathematics, 2025 (in press).</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Nath N.D., Behzadan A.H., Paal S.G. Deep Learning for Site Safety: Real-Time Detection of Personal Protective Equipment // Automation in Construction. 2020. Vol. 112. Article 103085.</mixed-citation><mixed-citation xml:lang="en">Nath N.D., Behzadan A.H., Paal S.G. Deep Learning for Site Safety: Real-Time Detection of Personal Protective Equipment // Automation in Construction. 2020. Vol. 112. Article 103085.</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Gupta S. Deep Learning-Based Human Activity Recognition Using Wearable Sensor Data // International Journal of Information Management Data Insights. 2021. Vol. 1. Article 100046.</mixed-citation><mixed-citation xml:lang="en">Gupta S. Deep Learning-Based Human Activity Recognition Using Wearable Sensor Data // International Journal of Information Management Data Insights. 2021. Vol. 1. 
Article 100046.</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Uçar A., Karakoşe M., Kırımça N. Artificial Intelligence for Predictive Maintenance Applications: Key Components, Trustworthiness, and Future Trends // Applied Sciences. 2024. Vol. 14, No. 2. Article 898.</mixed-citation><mixed-citation xml:lang="en">Uçar A., Karakoşe M., Kırımça N. Artificial Intelligence for Predictive Maintenance Applications: Key Components, Trustworthiness, and Future Trends // Applied Sciences. 2024. Vol. 14, No. 2. Article 898.</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Wu Z., Zhao J., Shen H. Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.</mixed-citation><mixed-citation xml:lang="en">Wu Z., Zhao J., Shen H. Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Han S., Yuan S., Trabelsi M. LogGPT: Log Anomaly Detection via GPT [Electronic resource] // arXiv. 2023. Available at: https://arxiv.org/pdf/2309.14482 (accessed: 15.05.2025).</mixed-citation><mixed-citation xml:lang="en">Han S., Yuan S., Trabelsi M. LogGPT: Log Anomaly Detection via GPT [Electronic resource] // arXiv. 2023. Available at: https://arxiv.org/pdf/2309.14482 (accessed: 15.05.2025).</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Sharma R., Patel N. 
Deep Learning-Based Anomaly Detection in Surveillance Videos // Journal of Visual Communication and Image Representation. 2022. Vol. 86. Article 103624.</mixed-citation><mixed-citation xml:lang="en">Sharma R., Patel N. Deep Learning-Based Anomaly Detection in Surveillance Videos // Journal of Visual Communication and Image Representation. 2022. Vol. 86. Article 103624.</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Özüağ S., Ertuğrul Ö. Enhanced Occupational Safety in Agricultural Machinery Factories: Artificial Intelligence-Driven Helmet Detection Using Transfer Learning and Majority Voting // Applied Sciences. 2024. Vol. 14. Article 11278. https://doi.org/10.3390/app142311278.</mixed-citation><mixed-citation xml:lang="en">Özüağ S., Ertuğrul Ö. Enhanced Occupational Safety in Agricultural Machinery Factories: Artificial Intelligence-Driven Helmet Detection Using Transfer Learning and Majority Voting // Applied Sciences. 2024. Vol. 14. Article 11278. https://doi.org/10.3390/app142311278.</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Li X., Chen Y., Hu L. Real-Time Workplace Activity Recognition Using Deep Learning Models // IEEE Transactions on Industrial Informatics. 2023. Vol. 19, No. 2. P. 1520–1532.</mixed-citation><mixed-citation xml:lang="en">Li X., Chen Y., Hu L. Real-Time Workplace Activity Recognition Using Deep Learning Models // IEEE Transactions on Industrial Informatics. 2023. Vol. 19, No. 2. P. 1520–1532.</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Wu Z., Zhao J., Shen H. Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.</mixed-citation><mixed-citation xml:lang="en">Wu Z., Zhao J., Shen H. 
Smart Home Automation Based on Human Activity Recognition: A Survey // Future Generation Computer Systems. 2023. Vol. 137. P. 41–57.</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama [Electronic resource]. Available at: https://ollama.com/ (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama [Electronic resource]. Available at: https://ollama.com/ (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama API Documentation [Electronic resource]. Available at: https://github.com/ollama/ollama/blob/main/docs/api.md (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama API Documentation [Electronic resource]. Available at: https://github.com/ollama/ollama/blob/main/docs/api.md (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Sahoo P., Singh A.K., Saha S., Jain V., Mondal S., Chadha A. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications [Electronic resource] // arXiv. 2024. Available at: https://arxiv.org/pdf/2402.07927 (accessed: 15.05.2025).</mixed-citation><mixed-citation xml:lang="en">Sahoo P., Singh A.K., Saha S., Jain V., Mondal S., Chadha A. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications [Electronic resource] // arXiv. 2024. Available at: https://arxiv.org/pdf/2402.07927 (accessed: 15.05.2025).</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama Python Library [Electronic resource]. 
Available at: https://github.com/ollama/ollama-python (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama Python Library [Electronic resource]. Available at: https://github.com/ollama/ollama-python (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit19"><label>19</label><citation-alternatives><mixed-citation xml:lang="ru">ISO 8601-1:2019 Standard [Electronic resource]. Available at: https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">ISO 8601-1:2019 Standard [Electronic resource]. Available at: https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit20"><label>20</label><citation-alternatives><mixed-citation xml:lang="ru">OpenAI ChatGPT-4o-mini [Electronic resource]. Available at: https://chatgpt.com/ (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">OpenAI ChatGPT-4o-mini [Electronic resource]. Available at: https://chatgpt.com/ (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit21"><label>21</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama Gemma3:12B Model [Electronic resource]. Available at: https://ollama.com/library/gemma3:12b (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama Gemma3:12B Model [Electronic resource]. Available at: https://ollama.com/library/gemma3:12b (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit22"><label>22</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama LLaVA:13B Model [Electronic resource]. Available at: https://ollama.com/library/llava:13b (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama LLaVA:13B Model [Electronic resource]. 
Available at: https://ollama.com/library/llava:13b (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit23"><label>23</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama Llama3.2-Vision:11B Model [Electronic resource]. Available at: https://ollama.com/library/llama3.2-vision (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama Llama3.2-Vision:11B Model [Electronic resource]. Available at: https://ollama.com/library/llama3.2-vision (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit24"><label>24</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama MiniCPM-V:8B Model [Electronic resource]. Available at: https://ollama.com/library/minicpm-v (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Ollama MiniCPM-V:8B Model [Electronic resource]. Available at: https://ollama.com/library/minicpm-v (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref><ref id="cit25"><label>25</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama Qwen2.5-VL:7B Model [Electronic resource]. Available at: https://ollama.com/library/qwen2.5vl (accessed: 16.01.2026).</mixed-citation><mixed-citation xml:lang="en">Ollama Qwen2.5-VL:7B Model [Electronic resource]. Available at: https://ollama.com/library/qwen2.5vl (accessed: 16.01.2026).</mixed-citation></citation-alternatives></ref><ref id="cit26"><label>26</label><citation-alternatives><mixed-citation xml:lang="ru">Ollama Mistral-Small-3.2 Model [Electronic resource]. Available at: https://ollama.com/library/mistral-small3.2 (accessed: 16.01.2026).</mixed-citation><mixed-citation xml:lang="en">Ollama Mistral-Small-3.2 Model [Electronic resource]. Available at: https://ollama.com/library/mistral-small3.2 (accessed: 16.01.2026).</mixed-citation></citation-alternatives></ref><ref id="cit27"><label>27</label><citation-alternatives><mixed-citation xml:lang="ru">Hand D.J., Christen P. 
F*: An Interpretable Transformation of the F-Measure // Journal of Classification. 2021. Vol. 38, No. 1. P. 3–17.</mixed-citation><mixed-citation xml:lang="en">Hand D.J., Christen P. F*: An Interpretable Transformation of the F-Measure // Journal of Classification. 2021. Vol. 38, No. 1. P. 3–17.</mixed-citation></citation-alternatives></ref><ref id="cit28"><label>28</label><citation-alternatives><mixed-citation xml:lang="ru">Scikit-learn F1-Score [Electronic resource]. Available at: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html (accessed: 30.03.2025).</mixed-citation><mixed-citation xml:lang="en">Scikit-learn F1-Score [Electronic resource]. Available at: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html (accessed: 30.03.2025).</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The author declares no conflict of interest.</p></fn></fn-group></back></article>
