The actions of others act as a pseudo-reward to drive imitation in the context of social reinforcement learning

Published:2020-12-08 Issue:12 Volume:18 Page:e3001028
ISSN:1545-7885
Container-title:PLOS Biology
language:en
Short-container-title:PLoS Biol

Author:

Najar Anis, Bonnet Emmanuelle, Bahrami Bahador, Palminteri Stefano

Abstract

While there is no doubt that social signals affect human reinforcement learning, there is still no consensus about how this process is computationally implemented. To address this issue, we compared three psychologically plausible hypotheses about the algorithmic implementation of imitation in reinforcement learning. The first hypothesis, decision biasing (DB), postulates that imitation consists in transiently biasing the learner’s action selection without affecting their value function. According to the second hypothesis, model-based imitation (MB), the learner infers the demonstrator’s value function through inverse reinforcement learning and uses it to bias action selection. Finally, according to the third hypothesis, value shaping (VS), the demonstrator’s actions directly affect the learner’s value function. We tested these three hypotheses in 2 experiments (N = 24 and N = 44) featuring a new variant of a social reinforcement learning task. We show through model comparison and model simulation that VS provides the best explanation of learner’s behavior. Results replicated in a third independent experiment featuring a larger cohort and a different design (N = 302). In our experiments, we also manipulated the quality of the demonstrators’ choices and found that learners were able to adapt their imitation rate, so that only skilled demonstrators were imitated. We proposed and tested an efficient meta-learning process to account for this effect, where imitation is regulated by the agreement between the learner and the demonstrator. In sum, our findings provide new insights and perspectives on the computational mechanisms underlying adaptive imitation in human reinforcement learning.
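To make the contrast between two of the hypotheses concrete, the following is a minimal sketch for a two-option task, not the authors' fitted models. It assumes a standard delta-rule value update and a softmax choice rule; the function names, the parameters `alpha_imit` and `bias_strength`, and the pseudo-reward value of 1.0 are illustrative assumptions. Under value shaping (VS), the observed action changes the learner's stored values; under decision biasing (DB), it only adds a transient term at choice time.

```python
import math

def softmax_two(q, beta, bias=(0.0, 0.0)):
    """Probability of choosing option 0 out of two options.
    `bias` is an optional additive term per option (used by DB)."""
    z0 = beta * q[0] + bias[0]
    z1 = beta * q[1] + bias[1]
    return 1.0 / (1.0 + math.exp(z1 - z0))

def reward_update(q, action, reward, alpha):
    """Delta-rule update of the chosen option's value; returns a new list."""
    new_q = list(q)
    new_q[action] += alpha * (reward - new_q[action])
    return new_q

def value_shaping(q, demo_action, alpha_imit):
    """VS sketch: the demonstrator's action is treated like a reward
    (pseudo-reward of 1.0, an assumed value) that updates the value function."""
    return reward_update(q, demo_action, 1.0, alpha_imit)

def decision_biasing(demo_action, bias_strength):
    """DB sketch: the demonstrator's action yields a one-off choice bias;
    the value function itself is left untouched."""
    bias = [0.0, 0.0]
    bias[demo_action] = bias_strength
    return tuple(bias)

# Learner starts indifferent and observes the demonstrator choose option 1.
q = [0.0, 0.0]
q_vs = value_shaping(q, demo_action=1, alpha_imit=0.5)
bias_db = decision_biasing(demo_action=1, bias_strength=2.0)
p0_db = softmax_two(q, beta=3.0, bias=bias_db)
p0_vs = softmax_two(q_vs, beta=3.0)
```

In this sketch both mechanisms make option 1 more likely on the next choice, but only VS leaves a lasting trace in the values, which is the distinction the model comparison in the abstract relies on.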

Funder

ATIP-Avenir

Emergenc

Fondation Schlumberger pour l’Education et la Recherche

Fondation Fyssen

NOMIS

European Research Council

Agence Nationale de la Recherche

Publisher

Public Library of Science (PLoS)

Subject

General Agricultural and Biological Sciences; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology; General Neuroscience

Reference: 56 articles.

1. Social fear learning: From animal models to human function;J Debiec;Trends Cogn Sci,2017

2. Social learning through prediction error in the brain;J Joiner;NPJ Sci Learn,2017

3. Imitation, empathy, and mirror neurons;M Iacoboni;Annu Rev Psychol,2009

4. Theory of mind: a neural prediction problem;J Koster-Hale;Neuron,2013

5. Action understanding as inverse planning;CL Baker;Cognition,2009

Cited by 28 articles.

1. AnimalEnvNet: A Deep Reinforcement Learning Method for Constructing Animal Agents Using Multimodal Data Fusion;Applied Sciences;2024-07-22

2. Transmission of social bias through observational learning;Science Advances;2024-06-28

3. Humans can infer social preferences from decision speed alone;PLOS Biology;2024-06-20

4. Dynamic valuation bias explains social influence on cheating behavior;2024-05-21

5. Social demonstration of colour preference improves the learning of associated demonstrated actions;Animal Cognition;2024-04-09