A Modular Robotic Arm Configuration Design Method Based on Double DQN with Prioritized Experience Replay
Author:
Ding Ziyan1, Tang Haijun1, Wan Haiying2, Zhang Chengxi2, Sun Ran3
Affiliation:
1. School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
2. School of Internet of Things Engineering, Jiangnan University, Wuxi 214082, China
3. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract
Modular robotic arms can achieve the desired performance in different scenarios by combining various modules, and they can also exhibit geometric symmetry and uniform mass symmetry. Selecting an appropriate combination of modules is therefore crucial for realizing the functions of the robotic arm and ensuring the elegance of the system. To this end, this paper proposes a double deep Q-network (DDQN)-based configuration design algorithm for modular robotic arms that aims to find the optimal configuration for different tasks. First, a library of small modules for collaborative robotic arms, consisting of multiple tandem robotic arms, is constructed. These modules are described in a standard format that can be imported directly into simulation software, which provides greater convenience and flexibility in the development of modular robotic arms. Subsequently, a DDQN design framework for module selection is established to obtain the optimal robotic arm configuration. The proposed method addresses the overestimation problem of the traditional deep Q-network (DQN) method and improves the estimation accuracy of the value function for each module. In addition, the experience replay mechanism is improved with the SumTree technique, which enables the algorithm to make effective use of historical experience and prevents it from falling into local optima. Finally, comparative experiments are carried out on the PyBullet simulation platform to verify the effectiveness and superiority of the proposed configuration design method. The simulation results show that the proposed DDQN-based method with the improved experience replay mechanism achieves higher search efficiency and accuracy than the traditional DQN scheme.
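The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (SumTree, priority_from_td_error, double_dqn_target), the hyperparameters alpha, eps and gamma, and the toy values are all assumptions chosen for illustration. The sketch shows a SumTree that samples stored transitions in proportion to their priority, and a Double DQN target in which the online network selects the next action while the target network evaluates it.

```python
import random


class SumTree:
    """Binary tree: leaves hold priorities, internal nodes hold sums of children."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes first, then leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next slot to overwrite (ring buffer)

    def total(self):
        return self.tree[0]

    def update(self, leaf, priority):
        idx = leaf + self.capacity - 1
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx > 0:                          # propagate the change up to the root
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def add(self, priority, transition):
        self.data[self.write] = transition
        self.update(self.write, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, s):
        """Return (leaf, priority, transition) for a cumulative value 0 <= s < total."""
        idx = 0
        while idx < self.capacity - 1:          # descend until a leaf is reached
            left = 2 * idx + 1
            if s < self.tree[left]:
                idx = left
            else:
                s -= self.tree[left]
                idx = left + 1
        leaf = idx - (self.capacity - 1)
        return leaf, self.tree[idx], self.data[leaf]

    def sample(self):
        # Draw one transition with probability proportional to its priority.
        return self.get(random.random() * self.total())


def priority_from_td_error(td_error, alpha=0.6, eps=1e-3):
    # Assumed hyperparameters: eps keeps priorities positive, alpha sets the skew.
    return (abs(td_error) + eps) ** alpha


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Online network picks the next action; target network supplies its value."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


# Toy usage with made-up priorities and Q-values.
tree = SumTree(capacity=4)
for prio, name in [(1.0, "a"), (2.0, "b"), (3.0, "c"), (4.0, "d")]:
    tree.add(prio, name)
target = double_dqn_target(1.0, False, [0.2, 0.9, 0.5], [1.0, 0.4, 2.0], gamma=0.5)
```

In the toy call, a plain DQN target would take the maximum of the target network's values (2.0), whereas the Double DQN target uses the value of the action preferred by the online network (0.4), which is how the overestimation is reduced.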
Funder
Basic Scientific Research Project of China