Reinforcement Learning-Based Trajectory Design for the Drone Base Stations

06/23/2019


In this paper, the trajectory optimization problem for a multi-unmanned aerial vehicle (UAV) communication network is investigated. The objective is to find the trajectories of the UAVs that maximize the sum-rate of the users served by each UAV. Reaching this goal requires more than trajectory design: power and sub-channel allocation must also be optimized so that users are served at the highest possible data rates. To make this problem tractable, we divide it into two sub-problems: a UAV trajectory optimization sub-problem, and a joint power and sub-channel assignment sub-problem. Then, based on the Q-learning method, we develop a distributed algorithm that solves these sub-problems efficiently and does not need a significant amount of information exchange between the UAVs and the core network. Simulation results show that, although Q-learning is a model-free reinforcement learning technique, it can train the UAVs to optimize their trajectories from the received reward signals alone, because these signals carry useful information about the topology of the network.
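
To illustrate the kind of update that Q-learning-based trajectory design relies on, the following is a minimal sketch and not the algorithm of the paper. It assumes a single UAV moving on a hypothetical 5x5 grid, three fixed hypothetical user positions, and a toy sum-rate reward of the form log2(1 + SNR) in which the SNR decays with squared distance. Power and sub-channel allocation, multiple UAVs, and the distributed aspect are not modeled, and all names and parameter values (`GRID`, `USERS`, `sum_rate`, `alpha`, `gamma`, `epsilon`) are assumptions made for illustration.

```python
import math
import random

GRID = 5  # assumed: 5x5 grid of candidate UAV positions
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus hover
USERS = [(4, 4), (3, 4), (4, 3)]  # assumed: fixed user locations


def sum_rate(pos):
    """Toy reward: sum over users of log2(1 + SNR), SNR decaying with distance."""
    total = 0.0
    for ux, uy in USERS:
        dist_sq = (pos[0] - ux) ** 2 + (pos[1] - uy) ** 2
        total += math.log2(1.0 + 10.0 / (1.0 + dist_sq))
    return total


def step(pos, action):
    """Apply a move, clipping to the grid, and return (next position, reward)."""
    x = min(max(pos[0] + action[0], 0), GRID - 1)
    y = min(max(pos[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    return nxt, sum_rate(nxt)


def train(episodes=500, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(GRID) for y in range(GRID) for a in range(n_actions)}
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda i: q[(pos, i)])  # exploit
            nxt, reward = step(pos, ACTIONS[a])
            best_next = max(q[(nxt, i)] for i in range(n_actions))
            # Standard Q-learning update toward the bootstrapped target.
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q


def greedy_path(q, start=(0, 0), steps=20):
    """Follow the greedy policy of the learned table from a start position."""
    path = [start]
    pos = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos, _ = step(pos, ACTIONS[a])
        path.append(pos)
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_path(table))
```

The UAV never sees the user positions directly in this sketch; it only observes the scalar reward after each move, which is the sense in which a model-free learner can still pick up information about the network topology.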
