Generalization in transfer learning: robust control of robot locomotion

Published:2022-05-11 Issue:11 Volume:40 Page:3811-3836
ISSN:0263-5747
Container-title:Robotica
language:en
Short-container-title:Robotica

Author:

Ada Suzan Ece, Ugur Emre, Akin H. Levent

Abstract

In this paper, we propose a set of robust training methods for deep reinforcement learning to transfer learning acquired in one control task to a set of previously unseen control tasks. We improve generalization in commonly used transfer learning benchmarks by a novel sample elimination technique, early stopping, and maximum entropy adversarial reinforcement learning. To generate robust policies, we use sample elimination during training via a method we call strict clipping. We apply early stopping, a method previously used in supervised learning, to deep reinforcement learning. Subsequently, we introduce maximum entropy adversarial reinforcement learning to increase the domain randomization during training for a better target task performance. Finally, we evaluate the robustness of these methods compared to previous work on simulated robots in target environments where the gravity, the morphology of the robot, and the tangential friction coefficient of the environment are altered.

Publisher

Cambridge University Press (CUP)

Subject

Computer Science Applications,General Mathematics,Software,Control and Systems Engineering,Control and Optimization,Mechanical Engineering,Modeling and Simulation

Reference: 44 articles.

1. [21] Al-Shedivat, M., Bansal, T., Burda, Y., Sutskever, I., Mordatch, I. and Abbeel, P., “Continuous adaptation via meta-learning in nonstationary and competitive environments,” arXiv preprint, arXiv:1710.03641 (2017).

2. [16] Schulman, J., Levine, S., Abbeel, P., Jordan, M. and Moritz, P., “Trust Region Policy Optimization,” In: International Conference on Machine Learning (2015) pp. 1889–1897.

3. ZERO-MOMENT POINT — THIRTY FIVE YEARS OF ITS LIFE

4. [9] Li, Z., Cheng, X., Peng, X. B., Abbeel, P., Levine, S., Berseth, G. and Sreenath, K., “Reinforcement learning for robust parameterized locomotion control of bipedal robots,” CoRR abs/2103.14295 (2021). arXiv:2103.14295. https://arxiv.org/abs/2103.14295

5. [32] Pinto, L., Davidson, J., Sukthankar, R. and Gupta, A., “Robust Adversarial Reinforcement Learning,” In: Proceedings of the 34th International Conference on Machine Learning, Volume 70 (JMLR.org, 2017) pp. 2817–2826.

Cited by 7 articles.

1. Unsupervised Meta-Testing With Conditional Neural Processes for Hybrid Meta-Reinforcement Learning;IEEE Robotics and Automation Letters;2024-10

2. Transfer learning in robotics: An upcoming breakthrough? A review of promises and challenges;The International Journal of Robotics Research;2024-09-13

3. Three-dimensional variable center of mass height biped walking using a new model and nonlinear model predictive control;Mechanism and Machine Theory;2024-07

4. Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning;IEEE Robotics and Automation Letters;2024-04

5. Bidirectional Progressive Neural Networks With Episodic Return Progress for Emergent Task Sequencing and Robotic Skill Transfer;IEEE Access;2024