A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat

12/05/2022
by Jiajun Chai, et al.

Unmanned combat air vehicle (UCAV) combat is a challenging scenario with a continuous action space. In this paper, we propose a general hierarchical framework to solve the within-visual-range (WVR) air-to-air combat problem under six-degree-of-freedom (6-DOF) dynamics. The core idea is to divide the whole decision process into two loops and use reinforcement learning (RL) to solve them separately. The outer loop takes the current combat situation into account and decides the expected macro behavior of the aircraft according to a combat strategy. The inner loop then tracks the macro behavior with a flight controller that calculates the actual input signals for the aircraft. We design the Markov decision process for both the outer-loop strategy and the inner-loop controller, and train them with the proximal policy optimization (PPO) algorithm. For the inner-loop controller, we design an effective reward function to accurately track various macro behaviors. For the outer-loop strategy, we further adopt a fictitious self-play mechanism that improves combat performance by constantly fighting against historical strategies. Experimental results show that the inner-loop controller achieves better tracking performance than a fine-tuned PID controller, and that the outer-loop strategy can perform complex maneuvers and reach an increasingly high win rate as the generations evolve.
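
The two-loop decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `hierarchical_step`, `sample_opponent`, `outer`, `inner` and `advance` are hypothetical, the policies are plain callables rather than trained PPO networks, and uniform sampling from the pool of historical strategies is an assumption, since the abstract only states that the strategy fights against historical strategies.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases; none of these names come from the paper.
Observation = Sequence[float]
MacroBehavior = Sequence[float]  # what the outer loop asks the aircraft to do
ControlInput = Sequence[float]   # what the inner loop sends to the aircraft

OuterPolicy = Callable[[Observation], MacroBehavior]
InnerPolicy = Callable[[Observation, MacroBehavior], ControlInput]
Dynamics = Callable[[Observation, ControlInput], Observation]


def hierarchical_step(
    obs: Observation,
    outer: OuterPolicy,
    inner: InnerPolicy,
    advance: Dynamics,
    inner_steps: int,
) -> Tuple[Observation, List[ControlInput]]:
    """Run one outer-loop decision followed by `inner_steps` control steps."""
    # Outer loop: choose a macro behavior from the current situation.
    macro = outer(obs)
    controls: List[ControlInput] = []
    for _ in range(inner_steps):
        # Inner loop: compute the input signal that tracks the macro behavior.
        u = inner(obs, macro)
        # Stand-in for the aircraft dynamics (6-DOF in the paper).
        obs = advance(obs, u)
        controls.append(u)
    return obs, controls


def sample_opponent(pool: Sequence[OuterPolicy], rng: random.Random) -> OuterPolicy:
    """Pick a historical strategy to fight against (uniform choice is assumed)."""
    return rng.choice(pool)
```

As a toy usage example, a one-dimensional state with a proportional controller standing in for the learned inner loop shows the inner loop closing the gap to the target that the outer loop selects:

```python
outer = lambda obs: [obs[0] + 1.0]            # ask for a target 1.0 ahead
inner = lambda obs, m: [0.5 * (m[0] - obs[0])]  # proportional tracking
advance = lambda obs, u: [obs[0] + u[0]]        # trivial integrator dynamics

final_obs, controls = hierarchical_step([0.0], outer, inner, advance, inner_steps=3)
```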

research
04/11/2018

Reinforcement Learning for UAV Attitude Control

Autopilot systems are typically composed of an "inner loop" providing st...
research
03/25/2022

Analysis of OODA Loop based on Adversarial for Complex Game Environments

To address the problem of imperfect confrontation strategy caused by the...
research
04/05/2021

Control of a Tail-Sitter VTOL UAV Based on Recurrent Neural Networks

Tail-sitter vertical takeoff and landing (VTOL) unmanned aerial vehicles...
research
09/08/2023

Sample-Efficient Co-Design of Robotic Agents Using Multi-fidelity Training on Universal Policy Network

Co-design involves simultaneously optimizing the controller and agents p...
research
09/20/2023

Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering

The application of artificial intelligence to simulate air-to-air combat...
research
07/10/2021

Learning-to-Dispatch: Reinforcement Learning Based Flight Planning under Emergency

The effectiveness of resource allocation under emergencies especially hu...
research
03/27/2023

A Compositional Approach to Certifying the Almost Global Asymptotic Stability of Cascade Systems

In this work, we give sufficient conditions for the almost global asympt...
