Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback

Author:

Puthiya Parambath Shameem A., Anagnostopoulos Christos, Murray-Smith Roderick

Abstract

We study the problem of predicting the next query to recommend in interactive exploratory data analysis, so as to guide users to the correct content. Current query prediction approaches are based on sequence-to-sequence learning and exploit past interaction data. However, because their training process is resource-hungry, such approaches fail to adapt to immediate user feedback, which is an essential signal of the user's intent. We contribute a novel query prediction ensemble mechanism that adapts to immediate feedback by relying on a multi-armed bandit framework. Our mechanism, an extension of the popular Exp3 algorithm, augments Transformer-based language models for query prediction by combining predictions from experts, thus dynamically building a candidate set during exploration. Immediate feedback is leveraged to choose the appropriate prediction in a probabilistic fashion. We provide a comprehensive large-scale experimental and comparative assessment using a popular online literature discovery service, which shows (i) that our mechanism substantially improves the per-round regret over state-of-the-art Transformer-based models and (ii) that causal language modelling is superior to masked language modelling for query recommendation.
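
For orientation, the base algorithm named in the abstract can be sketched as follows. This is a minimal sketch of standard Exp3 (Auer et al., 2002), not the authors' extension: the abstract does not specify how experts map to arms or how the candidate set is built, so treating each arm as one expert's prediction is an assumption. The class name `Exp3`, the helper `simulate`, the exploration rate `gamma=0.1`, and the click rates are all hypothetical and chosen only for illustration.

```python
import math
import random


class Exp3:
    """Standard Exp3 over a fixed set of arms (assumed here: one arm per expert)."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must be in (0, 1]")
        self.n_arms = n_arms
        self.gamma = gamma
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        # Sample an arm; return it with the probability it was drawn at.
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs, k=1)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # reward is the immediate feedback, assumed to lie in [0, 1].
        estimate = reward / prob  # importance-weighted reward estimate
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale by the largest weight to avoid overflow; the selection
        # probabilities depend only on weight ratios, so they are unchanged.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]


def simulate(click_rates, rounds=2000, gamma=0.1, seed=0):
    """Hypothetical environment: arm i is 'clicked' with rate click_rates[i]."""
    bandit = Exp3(len(click_rates), gamma=gamma, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm, prob = bandit.select()
        reward = 1.0 if env.random() < click_rates[arm] else 0.0
        bandit.update(arm, prob, reward)
    return bandit
```

Calling `simulate([0.1, 0.1, 0.9])` and inspecting `probabilities()` on the returned object illustrates the probabilistic choice described above: the selection probabilities shift towards the arm that receives positive feedback most often, while every arm keeps at least `gamma / n_arms` probability so exploration never stops.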

Funder

Engineering and Physical Sciences Research Council

HORIZON EUROPE Framework Programme

Publisher

Springer Science and Business Media LLC

