Meta-Reinforcement Learning for Trajectory Design in Wireless UAV Networks

05/25/2020
by Ye Hu, et al.

In this paper, the design of an optimal trajectory for an energy-constrained drone operating in dynamic network environments is studied. In the considered model, a drone base station (DBS) is dispatched to provide uplink connectivity to ground users whose demand is dynamic and unpredictable. The DBS's trajectory must therefore be adjusted adaptively to satisfy the dynamic user access requests. To this end, a meta-learning algorithm is proposed that adapts the DBS's trajectory when it encounters novel environments by tuning a reinforcement learning (RL) solution. The meta-learning algorithm quickly adapts the DBS to novel environments based on limited prior experience. Compared to the baseline policy gradient algorithm, the meta-tuned RL is shown to converge faster to the optimal coverage in unseen environments, with considerably low computational complexity. Simulation results show that the proposed meta-learning solution yields a 25% improvement in the convergence speed and about a 10% improvement in performance, compared to a baseline policy gradient algorithm. Meanwhile, the probability that the DBS serves over 50 compared to the baseline policy gradient algorithm.
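The idea of meta-tuning a policy gradient learner so that it adapts faster in an unseen environment can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the DBS picks one of six hypothetical cells in a single step (a bandit rather than a full trajectory), each environment is a made-up vector of per-cell request probabilities, the policy is a softmax over cell logits, and the meta-update is a Reptile-style first-order rule swapped in for whatever meta-learning algorithm the authors actually use. All function names (`meta_train`, `adapt`, `compare`) are likewise hypothetical.

```python
import math


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    peak = max(theta)
    exps = [math.exp(t - peak) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, rewards):
    """Expected one-step reward of the softmax policy in one environment."""
    return sum(p * r for p, r in zip(softmax(theta), rewards))


def policy_gradient_step(theta, rewards, lr):
    """One exact policy-gradient ascent step for the softmax policy."""
    pi = softmax(theta)
    j = sum(p * r for p, r in zip(pi, rewards))
    # For a one-step problem, dJ/dtheta_k = pi_k * (r_k - J).
    return [t + lr * p * (r - j) for t, p, r in zip(theta, pi, rewards)]


def adapt(theta, rewards, steps, lr=0.4):
    """Run a few policy-gradient steps in a single environment."""
    for _ in range(steps):
        theta = policy_gradient_step(theta, rewards, lr)
    return theta


def meta_train(tasks, n_cells, epochs=100, inner_steps=10, meta_lr=0.5):
    """Reptile-style meta-training (an assumption, not the paper's algorithm).

    After adapting to each training environment, the shared initialization
    is moved part of the way toward the adapted parameters.
    """
    theta = [0.0] * n_cells
    for _ in range(epochs):
        for rewards in tasks:
            adapted = adapt(theta, rewards, inner_steps)
            theta = [t + meta_lr * (a - t) for t, a in zip(theta, adapted)]
    return theta


# Hypothetical per-cell request probabilities; demand is concentrated in
# the first three cells but the busiest cell changes between environments.
TRAIN_TASKS = [
    [0.9, 0.5, 0.3, 0.1, 0.1, 0.1],
    [0.5, 0.9, 0.5, 0.1, 0.1, 0.1],
    [0.3, 0.5, 0.9, 0.1, 0.1, 0.1],
]
UNSEEN_TASK = [0.6, 0.8, 0.6, 0.1, 0.1, 0.1]


def compare(adapt_steps=10):
    """Adapt to the unseen environment from both initializations."""
    meta_theta = meta_train(TRAIN_TASKS, n_cells=6)
    scratch_theta = [0.0] * 6
    meta_final = expected_reward(
        adapt(meta_theta, UNSEEN_TASK, adapt_steps), UNSEEN_TASK)
    scratch_final = expected_reward(
        adapt(scratch_theta, UNSEEN_TASK, adapt_steps), UNSEEN_TASK)
    return meta_final, scratch_final


if __name__ == "__main__":
    meta_final, scratch_final = compare()
    print(f"meta-initialized: {meta_final:.3f}, from scratch: {scratch_final:.3f}")
```

Using the exact gradient instead of sampled returns keeps the sketch deterministic; a sampled estimator such as REINFORCE would be closer to a practical policy gradient baseline, at the cost of noisy comparisons. The numbers this toy prints say nothing about the simulation results reported in the paper.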

