References

ellibs

Russian Digital Libraries Journal (Электронные библиотеки)

1562-5419

Kazan (Volga Region) Federal University

10.26907/1562-5419-2020-23-6-1172-1191

ellibs-255

Research Article

Articles

Image Classification Using Reinforcement Learning

Elizarov A. A.

artelizar@gmail.com

Razinkov E. V.

evgeny@razinkov.ai

2020

28.12.2020

Vol. 23, No. 6. P. 1172–1191

2020

Elizarov A.A., Razinkov E.V.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://ellibs.elpub.ru/jour/article/view/255

Reinforcement learning has been developing rapidly in recent years, and with it attempts to apply reinforcement learning to computer vision problems, in particular image classification. Computer vision problems are currently among the most pressing in artificial intelligence. This article proposes an image classification method in the form of a deep neural network trained with reinforcement learning. The method reduces to solving a contextual multi-armed bandit problem using various strategies for balancing exploitation and exploration together with reinforcement learning algorithms. The strategies considered are ε-greedy, ε-softmax, ε-decay-softmax, and UCB1; the reinforcement learning algorithms considered are DQN, REINFORCE, and A2C. The influence of various parameters on the method's performance is analyzed, and options for further development of the method are proposed.
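The bandit framing in the abstract can be illustrated with a minimal sketch. It assumes that each class label is an arm, that the network outputs one value estimate per class for a given image, and that the reward is 1 for a correct guess and 0 otherwise. The reward definition and all names (`epsilon_greedy`, `ucb1`, `reward`, `q_values`) are hypothetical and are not taken from the article. The `ucb1` function follows the standard non-contextual rule of Auer et al.; how the article conditions it on the image is not stated here.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a class index: uniformly at random with probability epsilon,
    otherwise the index with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ucb1(mean_rewards, counts, t):
    """Standard UCB1: choose the arm maximizing mean + sqrt(2 ln t / n_a).

    mean_rewards[a] is the average reward of arm a so far, counts[a] is how
    often arm a was chosen, and t is the total number of choices made.
    """
    # An arm that has never been tried is selected first.
    for a, n in enumerate(counts):
        if n == 0:
            return a
    scores = [m + math.sqrt(2.0 * math.log(t) / n)
              for m, n in zip(mean_rewards, counts)]
    return max(range(len(scores)), key=lambda a: scores[a])


def reward(action, label):
    """Assumed bandit feedback: 1.0 for a correct class guess, else 0.0."""
    return 1.0 if action == label else 0.0
```

With `epsilon=0.0` the first function always exploits the current estimates, and with `epsilon=1.0` it always explores. UCB1 instead favors arms that have been tried rarely, so its exploration bonus shrinks as the count for an arm grows.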

machine learning, image classification, reinforcement learning, contextual multi-armed bandit problem

References

Goodfellow I., Bengio Y., Courville A. Deep Learning // Cambridge, MA: The MIT Press, 2016. URL: https://www.deeplearningbook.org/.

Krizhevsky A., Sutskever I., Hinton G. E. ImageNet Classification with Deep Convolutional Neural Networks // Advances in neural information processing systems, 2012. Vol. 25, No. 2. P. 1097–1105, DOI: 10.1145/3065386.

Russakovsky O., Deng J., Su H. et al. ImageNet Large Scale Visual Recognition Challenge // International Journal of Computer Vision. 2015. Vol. 115, No. 3. P. 211–252, DOI: 10.1007/s11263-015-0816-y.

Sutton R. S., Barto A. G. Reinforcement Learning: An Introduction // Cambridge, MA: The MIT Press, 2018. URL: http://www.incompleteideas.net/book/RLbook2020.pdf/.

Liu X., Xia T., Wang J. et al. Fully Convolutional Attention Networks for Fine-Grained Recognition // arXiv:1603.06765, 2017.

Li Z., Yang Y., Liu X. et al. Dynamic Computational Time for Visual Attention // arXiv:1703.10332, 2017.

He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition // Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016. P. 770–778, DOI: 10.1109/CVPR.2016.90.

PyTorch, 2016, URL: https://pytorch.org/.

Google Colaboratory, 2017, URL: https://colab.research.google.com/.

ImageNet Dataset, 2016, URL: http://image-net.org/.

Fine-Grained Image Classification, 2019, URL: https://paperswithcode.com/task/fine-grained-image-classification/.

Girshick R. Fast R-CNN // Proceedings of the IEEE International Conference on Computer Vision, 2015. P. 1440–1448, DOI: 10.1109/ICCV.2015.169.

Mnih V., Kavukcuoglu K., Silver D. et al. Playing Atari with Deep Reinforcement Learning // arXiv:1312.5602, 2013.

Abdolmaleki A., Springenberg J. T., Degrave J. et al. Relative Entropy Regularized Policy Iteration // arXiv:1812.02256, 2018.

Auer P., Cesa-Bianchi N., Fischer P. Finite-time Analysis of the Multiarmed Bandit Problem // Machine Learning, 2002. Vol. 47, No. 2-3. P. 235–256, DOI: 10.1023/A:1013689704352.

The authors declare that they have no conflicts of interest.