Comparative Evaluation of Reinforcement Learning Algorithms for Multi-Agent Unmanned Aerial Vehicle Path Planning in 2D and 3D Environments

Published: 2025-06-16 Issue: 6 Volume: 9 Page: 438
ISSN: 2504-446X
Container-title: Drones
Language: en
Short-container-title: Drones

Author:

Ali Mirza Aqib¹, Maqsood Adnan¹, Athar Usama¹, Khanzada Hasan Raza¹

Affiliation:

1. School of Interdisciplinary Engineering and Sciences, National University of Sciences and Technology, Islamabad 44000, Pakistan

Abstract

Path planning for multi-agent UAV swarms requires avoiding collisions in dynamic, obstacle-filled environments while minimizing time and energy consumption. This work comprehensively evaluates reinforcement learning (RL) algorithms for multi-agent UAV path planning in 2D and 3D simulated environments. First, we develop a 2D simulation setup in Python in which UAVs (quadcopters), represented as points in space, navigate toward their respective targets while avoiding static obstacles and inter-agent collisions. In the second phase, we move this comparison to a physics-based 3D simulation that incorporates realistic fixed-wing UAV dynamics and checkpoint-based navigation. We compare five algorithms, namely Proximal Policy Optimization (PPO), Soft Actor–Critic (SAC), Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Multi-Agent DDPG (MADDPG), in various scenarios. Our findings reveal significant performance differences between the algorithms across multiple dimensions. DDPG consistently demonstrated superior reward optimization and collision avoidance performance, while PPO and MADDPG excelled in the execution time required to reach the goal. Furthermore, our findings show how the algorithms perform when transitioning from a simplistic 2D setup to a realistic 3D physics-based environment, which is essential for sim-to-real transfer. This work provides valuable insights into the suitability of several RL algorithms for developing autonomous systems and UAV swarm navigation.

Publisher

MDPI AG

Link

https://www.mdpi.com/2504-446X/9/6/438/pdf
