Chasing Unknown Bandits: Uncertainty Guidance in Learning and Decision Making

Published:2022-08-24 Issue:5 Volume:31 Page:419-427
ISSN:0963-7214
Container-title:Current Directions in Psychological Science
language:en
Short-container-title:Curr Dir Psychol Sci

Author:

Speekenbrink Maarten¹

Affiliation:

1. Department of Experimental Psychology, University College London, and The Alan Turing Institute, London, England

Abstract

In repeated decision problems for which it is possible to learn from experience, people should actively seek out uncertain options, rather than avoid ambiguity or uncertainty, in order to learn and improve future decisions. Research on human behavior in a variety of multiarmed-bandit tasks supports this prediction. Multiarmed-bandit tasks involve repeated decisions between options with initially unknown reward distributions and require a careful balance between learning about relatively unknown options (exploration) and obtaining high immediate rewards (exploitation). Resolving this exploration-exploitation dilemma optimally requires considering not only the estimated value of each option, but also the uncertainty in these estimations. Bayesian learning naturally quantifies uncertainty and hence provides a principled framework to study how humans resolve this dilemma. On the basis of computational modeling and behavioral results in bandit tasks, I argue that human learning, attention, and exploration are guided by uncertainty. These results support Bayesian theories of cognition and underpin the fundamental role of subjective uncertainty in both learning and decision making.
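The idea that choices should depend on both the estimated value of an option and the uncertainty of that estimate can be illustrated with a minimal sketch. The example below is not the model used in the article; it assumes binary rewards, a Beta(1, 1) prior on each option's reward probability, and a simple rule that adds a bonus proportional to the posterior standard deviation. All function names and the `bonus` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def posterior_mean_sd(successes, failures):
    """Mean and standard deviation of a Beta(1 + successes, 1 + failures) posterior."""
    a = 1 + successes
    b = 1 + failures
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def choose_arm(counts, bonus=1.0):
    """Pick the arm with the highest posterior mean plus an uncertainty bonus.

    counts is a list of (successes, failures) pairs, one per arm.
    bonus = 0 gives purely value-based (greedy) choice; larger values
    favour options whose value is still uncertain. Ties go to the
    lowest-numbered arm.
    """
    scores = []
    for successes, failures in counts:
        mean, sd = posterior_mean_sd(successes, failures)
        scores.append(mean + bonus * sd)
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(reward_probs, n_trials=500, bonus=1.0, seed=0):
    """Simulate repeated choices in a Bernoulli bandit with assumed reward_probs."""
    rng = random.Random(seed)
    counts = [[0, 0] for _ in reward_probs]
    total_reward = 0
    for _ in range(n_trials):
        arm = choose_arm(counts, bonus)
        reward = 1 if rng.random() < reward_probs[arm] else 0
        # Bayesian update: one more success or failure for the chosen arm.
        counts[arm][0 if reward else 1] += 1
        total_reward += reward
    return counts, total_reward
```

With equal estimated values, this rule prefers the less-sampled option: `choose_arm([(5, 5), (0, 0)], bonus=1.0)` selects the second arm because its posterior is wider, whereas `bonus=0.0` ignores uncertainty and falls back on the first arm.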

Publisher

SAGE Publications

Subject

General Psychology

Link

http://journals.sagepub.com/doi/pdf/10.1177/09637214221105051

References: 32 articles.

1. Learning the value of information in an uncertain world

2. Recent developments in modeling preferences: Uncertainty and ambiguity

3. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration

4. Cortical substrates for exploratory decisions in humans

5. Learning and selective attention

Cited by 2 articles.

1. Signatures of heuristic-based directed exploration in two-step sequential decision task behaviour;2023-05-23

2. Humans combine value learning and hypothesis testing strategically in multi-dimensional probabilistic reward learning;PLOS Computational Biology;2022-11-23