Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback

Published:2024-08-02
ISSN:1384-5810
Container-title:Data Mining and Knowledge Discovery
language:en
Short-container-title:Data Min Knowl Disc

Author:

Puthiya Parambath Shameem A., Anagnostopoulos Christos, Murray-Smith Roderick

Abstract

We study the problem of predicting the next query to recommend in interactive exploratory data analysis, in order to guide users to the correct content. Current query prediction approaches are based on sequence-to-sequence learning and exploit past interaction data. However, because their training process is resource-hungry, such approaches fail to adapt to immediate user feedback. Immediate feedback is essential and is considered a signal of the user’s intent. We contribute a novel query prediction ensemble mechanism that adapts to immediate feedback by relying on the multi-armed bandit framework. Our mechanism, an extension of the popular Exp3 algorithm, augments Transformer-based language models for query prediction by combining predictions from experts, thus dynamically building a candidate set during exploration. Immediate feedback is leveraged to choose the appropriate prediction in a probabilistic fashion. We provide a comprehensive large-scale experimental and comparative assessment using a popular online literature discovery service, which shows that our mechanism (i) substantially improves the per-round regret against state-of-the-art Transformer-based models and (ii) demonstrates the superiority of causal language modelling over masked language modelling for query recommendation.
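For readers unfamiliar with Exp3, the following is a minimal sketch of the textbook algorithm (Auer et al. 2002) that the abstract names as the starting point. It is not the paper's extension: the arms stand in for hypothetical experts, the acceptance rates in `simulate` are made-up illustration values, and the class and function names (`Exp3`, `simulate`) are assumptions introduced here. The immediate feedback is modelled as a reward in [0, 1], for example 1 if the user accepts the recommended query.

```python
import math
import random


class Exp3:
    """Textbook Exp3 over a fixed set of arms (here: hypothetical experts)."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must be in (0, 1]")
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate
        self.weights = [1.0] * n_arms   # one weight per arm
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix the normalised weights with uniform exploration."""
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        """Sample an arm; return it with the probability it was drawn with."""
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs, k=1)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        """Update the played arm from a reward in [0, 1]."""
        # Importance-weighted estimate keeps the reward estimate unbiased.
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescaling all weights leaves the probabilities unchanged
        # and avoids floating-point overflow over long runs.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(rounds=2000, seed=0):
    """Toy run: each arm's suggestion is accepted with a fixed, made-up rate."""
    accept_rates = [0.2, 0.5, 0.8]  # hypothetical values, not from the paper
    bandit = Exp3(n_arms=len(accept_rates), gamma=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm, prob = bandit.select()
        reward = 1.0 if env.random() < accept_rates[arm] else 0.0
        bandit.update(arm, prob, reward)
    return bandit.probabilities()


if __name__ == "__main__":
    print(simulate())
```

In this sketch only the played arm is updated each round, and every arm keeps a selection probability of at least `gamma / n_arms`, which is how Exp3 balances exploration against exploiting the arms that have received positive feedback.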

Funder

Engineering and Physical Sciences Research Council

HORIZON EUROPE Framework Programme

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10618-024-01057-4.pdf

Reference: 35 articles.

1. Ahmad WU, Chang KW, Wang H (2019) Context attentive document ranking and query suggestion. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 385–394

2. Audibert JY, Bubeck S et al (2009) Minimax policies for adversarial and stochastic bandits. In: COLT, pp 1–122

3. Auer P, Cesa-Bianchi N, Freund Y et al (2002) The nonstochastic multiarmed bandit problem. SIAM J Comput 32(1):48–77

4. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: 3rd international conference on learning representations, ICLR 2015

5. Bayati M, Hamidi N, Johari R et al (2020) Unreasonable effectiveness of greedy algorithms in multi-armed bandit with many arms. In: Larochelle H, Ranzato M, Hadsell R et al (eds) NeurIPS, pp 1713–1723