Multi-UAV Collision Avoidance using Multi-Agent Reinforcement Learning with Counterfactual Credit Assignment

04/19/2022
by Shuangyao Huang, et al.

Multi-UAV collision avoidance is a challenging task for UAV swarm applications because collision-free path planning requires tight cooperation among swarm members. Centralized Training with Decentralized Execution (CTDE) in Multi-Agent Reinforcement Learning is a promising approach to multi-UAV collision avoidance; its key challenge is to learn decentralized policies that cooperatively maximize a global reward. We propose a new multi-agent critic-actor learning scheme called MACA for UAV swarm collision avoidance. MACA uses a centralized critic to maximize the discounted global reward, which accounts for both safety and energy efficiency, and one actor per UAV to find decentralized collision-avoidance policies. To solve the credit assignment problem in CTDE, we design a counterfactual baseline that marginalizes both an agent's state and action, making it possible to evaluate the importance of an agent in the joint observation-action space. To train and evaluate MACA, we design our own simulation environment, MACAEnv, to closely mimic the realistic behaviors of a UAV swarm. Simulation results show that MACA outperforms the compared algorithms by more than 16% and reduces the failure rate by 90% compared to a conventional UAV swarm collision avoidance algorithm in all test scenarios.
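
The counterfactual baseline idea can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the critic or how the marginalization is carried out, so the sketch assumes a uniform average over a small set of alternative observation-action pairs for one agent. The names `toy_critic` and `counterfactual_advantage`, and the scalar observations and actions, are hypothetical.

```python
import numpy as np


def toy_critic(obs, actions):
    """Hypothetical centralized critic: scores a joint observation-action pair.

    Assumption for illustration only: each agent's cost is the squared
    value of (observation + action), and Q is the negative total cost.
    """
    return -float(np.sum((obs + actions) ** 2))


def counterfactual_advantage(q_fn, obs, actions, agent, alt_obs, alt_actions):
    """Joint Q minus a baseline that marginalizes one agent's obs and action.

    The baseline is the mean of Q over every (alt_obs, alt_action) pair
    substituted for `agent`, with all other agents held fixed. A uniform
    average is an assumption of this sketch.
    """
    q_joint = q_fn(obs, actions)
    baseline_values = []
    for o_alt in alt_obs:
        for a_alt in alt_actions:
            cf_obs = obs.copy()
            cf_actions = actions.copy()
            cf_obs[agent] = o_alt        # replace the agent's observation
            cf_actions[agent] = a_alt    # replace the agent's action
            baseline_values.append(q_fn(cf_obs, cf_actions))
    return q_joint - float(np.mean(baseline_values))


# Two agents with scalar observations and actions (toy values).
obs = np.array([1.0, 2.0])
actions = np.array([-1.0, 0.0])
advantage_agent0 = counterfactual_advantage(
    toy_critic, obs, actions, agent=0,
    alt_obs=[0.0, 1.0], alt_actions=[-1.0, 0.0, 1.0],
)
```

A positive value means the agent's actual observation-action pair scored better under the critic than the average alternative, which is the sense in which such a baseline attributes credit to an individual agent.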

research
10/24/2019

Reciprocal Collision Avoidance for General Nonlinear Agents using Reinforcement Learning

Finding feasible and collision-free paths for multiple nonlinear agents ...
research
09/01/2020

A Benchmark for Multi-UAV Task Assignment of an Extended Team Orienteering Problem

A benchmark for multi-UAV task assignment is presented in order to evalu...
research
04/30/2021

Decentralized Swarm Collision Avoidance for Quadrotors via End-to-End Reinforcement Learning

Collision avoidance algorithms are of central interest to many drone app...
research
03/11/2023

E2CoPre: Energy Efficient and Cooperative Collision Avoidance for UAV Swarms with Trajectory Prediction

This paper addresses the collision avoidance problem of UAV swarms in th...
research
05/08/2021

E^2Coop: Energy Efficient and Cooperative Obstacle Detection and Avoidance for UAV Swarms

Energy efficiency is of critical importance to trajectory planning for U...
research
03/29/2020

Optimized Directed Roadmap Graph for Multi-Agent Path Finding Using Stochastic Gradient Descent

We present a novel approach called Optimized Directed Roadmap Graph (ODR...
research
10/23/2019

Decentralized Runtime Synthesis of Shields for Multi-Agent Systems

A shield is attached to a system to guarantee safety by correcting the s...