References

ellibs

Russian Digital Libraries Journal

ISSN 1562-5419

Kazan (Volga Region) Federal University

DOI: 10.26907/1562-5419-2021-24-2-271-292

ellibs-273

Research Article

Articles

Applying Machine Learning to the Task of Generating Search Queries

Gusenkov A. M.

gusenkov.a.m@gmail.com

Sittikova A. R.

sitti.alina@mail.ru

2021

28.04.2021

Vol. 24, No. 2, pp. 272-293

2021

Gusenkov A.M., Sittikova A.R.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://ellibs.elpub.ru/jour/article/view/273

We study two recurrent neural network variants, Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, each extended with an attention mechanism, as well as a Transformer model, on the task of generating search engine queries. The Transformer is OpenAI's GPT-2, trained on user queries. Latent semantic analysis was used to measure the semantic similarity between the corpus of user queries and the queries generated by the neural networks. For this analysis, the corpus was converted to a bag-of-words representation, a TF-IDF model was applied to it, and a singular value decomposition was performed. Semantic similarity was computed with the cosine measure. To evaluate the applicability of the models to the task more fully, an expert analysis was also carried out to assess the coherence of words in the generated queries.
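The evaluation pipeline named in the abstract (bag of words, TF-IDF weighting, singular value decomposition, cosine similarity) can be illustrated with a minimal sketch. This is not the authors' code: the toy queries, the rank `k`, the IDF variant, and every function name below are assumptions made for the example, and the SVD is approximated with a hand-written power iteration only to keep the sketch free of third-party dependencies.

```python
import math
from collections import Counter


def fit_tfidf(corpus):
    """Build the vocabulary and IDF weights from the reference corpus."""
    tokenized = [doc.lower().split() for doc in corpus]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: idf = ln(N / df) + 1, one common variant among several.
    idf = {term: math.log(len(corpus) / doc_freq[term]) + 1.0 for term in vocab}
    return vocab, idf


def vectorize(text, vocab, idf):
    """Bag-of-words counts re-weighted by TF-IDF; out-of-vocabulary words are ignored."""
    counts = Counter(text.lower().split())
    return [counts[term] * idf[term] for term in vocab]


def matvec(matrix, vec):
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def top_right_singular_vectors(rows, k, iters=300):
    """Top-k right singular vectors of A via power iteration with deflation on A^T A."""
    m = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    basis = []
    for c in range(k):
        v = [1.0 + 0.01 * (c + 1) * i for i in range(m)]  # deterministic start vector
        for _ in range(iters):
            w = matvec(gram, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract: rank of A is below k
                return basis
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, matvec(gram, v)))
        basis.append(v)
        for i in range(m):  # deflate so the next pass finds the next component
            for j in range(m):
                gram[i][j] -= eigval * v[i] * v[j]
    return basis


def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def lsa_similarity(corpus, generated, k=2):
    """Cosine similarity in the k-dimensional latent space: result[g][c]."""
    vocab, idf = fit_tfidf(corpus)
    rows = [vectorize(doc, vocab, idf) for doc in corpus]
    basis = top_right_singular_vectors(rows, k)

    def project(vec):
        return [sum(x * b for x, b in zip(vec, comp)) for comp in basis]

    corpus_latent = [project(row) for row in rows]
    return [
        [cosine(project(vectorize(query, vocab, idf)), doc) for doc in corpus_latent]
        for query in generated
    ]


# Hypothetical toy data standing in for real and generated search queries.
user_queries = [
    "cheap flights to paris",
    "cheap flights to london",
    "cheap hotels in paris",
    "python list tutorial",
    "python dictionary tutorial",
]
generated_queries = ["cheap flights to rome", "python tutorial"]

for query, sims in zip(generated_queries, lsa_similarity(user_queries, generated_queries)):
    print(query, [round(s, 2) for s in sims])
```

On a real corpus the vocabulary is far larger, so a library SVD routine would replace the power iteration, and the rank `k` would need tuning rather than being fixed at 2.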

natural language processing, natural language generation, machine learning, neural networks

References

Van Deemter K., Krahmer E., Theune M. Real vs. template-based natural language generation: a false opposition? URL: https://wwwhome.ewi.utwente.nl/~theune/PUBS/templates-squib.pdf

Xie Z. Neural Text Generation: A Practical Guide. URL: https://arxiv.org/pdf/1711.09534.pdf

A Comprehensive Guide to Natural Language Generation, 2019. URL: https://medium.com/sciforce/a-comprehensive-guide-to-natural-language-generation-dd63a4b6e548

Arrington M. AOL proudly releases massive amounts of user search data, 2006. URL: https://techcrunch.com/2006/08/06/aol-proudly-releases-massive-amounts-of-user-search-data/

Reiter E. NLG vs Templates: Levels of Sophistication in Generating Text, 2016. URL: https://ehudreiter.com/2016/12/18/nlg-vs-templates

Gagniuc P. Markov Chains: From Theory to Implementation and Experimentation, 2017. USA, NJ: John Wiley & Sons.

Press O., Bar A., Bogin B., Berant J., Wolf L. Language Generation with Recurrent Generative Adversarial Networks without Pre-training. URL: https://arxiv.org/pdf/1706.01399.pdf

Williams R.J., Hinton G.E., Rumelhart D.E. Learning representations by back-propagating errors. URL: http://www.cs.utoronto.ca/~hinton/absps/naturebp.pdf

Hochreiter S., Bengio Y., Frasconi P., Schmidhuber J. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies. URL: https://www.bioinf.jku.at/publications/older/ch7.pdf

Hochreiter S., Schmidhuber J. Long Short-Term Memory. URL: http://web.archive.org/web/20150526132154/http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf

Heck J., Salem F. Simplified Minimal Gated Unit Variations for Recurrent Neural Networks. URL: https://arxiv.org/abs/1701.03452

Bahdanau D., Cho K., Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. URL: https://arxiv.org/pdf/1409.0473.pdf

Felbo B., Mislove A., Søgaard A., Rahwan I., Lehmann S. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. URL: https://arxiv.org/pdf/1708.00524.pdf

Bisong E. Google Colaboratory. In: Building Machine Learning and Deep Learning Models on Google Cloud Platform, 2019. Apress, Berkeley, CA.

Chollet F. Keras, 2015. URL: https://keras.io

Kingma D., Ba J. Adam: A Method for Stochastic Optimization. URL: https://arxiv.org/abs/1412.6980

Learning Rate Scheduler. URL: https://keras.io/api/callbacks/learning_rate_scheduler/

Schuster M., Paliwal K. Bidirectional recurrent neural networks. URL: https://www.researchgate.net/publication/3316656_Bidirectional_recurrent_neural_networks

Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A., Kaiser L., Polosukhin I. Attention Is All You Need. URL: https://arxiv.org/pdf/1706.03762.pdf

Radford A., Wu J., Child R., Luan D., Amodei D., Sutskever I. Language Models Are Unsupervised Multitask Learners. URL: https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf

Devlin J., Chang M., Lee K., Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. URL: https://arxiv.org/pdf/1810.04805.pdf

Brown T., Mann B., Ryder N., Subbiah M., Kaplan J. Language Models Are Few-Shot Learners. URL: https://arxiv.org/abs/2005.14165

Gage P. A New Algorithm for Data Compression. URL: https://www.derczynski.com/papers/archive/BPE_Gage.pdf

Deerwester S., Harshman R. Indexing by Latent Semantic Analysis. URL: https://www.cs.bham.ac.uk/~pxt/IDA/lsa_ind.pdf

Nakov P. Getting Better Results with Latent Semantic Indexing. URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.6406&rep=rep1&type=pdf

Rehurek R., Sojka P. Software Framework for Topic Modelling with Large Corpora // Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. University of Malta. 2010.

The authors declare that there are no conflicts of interest present.