Graph Reinforcement Learning for Predictive Power Allocation to Mobile Users

03/08/2022
by Jianyu Zhao, et al.

Allocating resources based on future channels can save resources while ensuring the quality of service of video streaming. In this paper, we optimize predictive power allocation to minimize the energy consumed at distributed units (DUs), using deep deterministic policy gradient (DDPG) to find the optimal policy and predict average channel gains. To improve training efficiency, we resort to graph DDPG to exploit two kinds of relational priors: (a) the permutation equivariant (PE) and permutation invariant (PI) properties of the policy function and the action-value function, and (b) the topology relation among users and DUs. To design the graph DDPG framework more systematically in harnessing these priors, we first demonstrate how to transform matrix-based DDPG into graph-based DDPG. Then, we design the actor and critic networks so that they satisfy the permutation properties when graph neural networks are used in embedding and end-to-end manners, respectively. To avoid destroying the PE/PI properties of the actor and critic networks, we conceive a batch normalization method. Finally, we show the impact of leveraging each prior. Simulation results show that the learned predictive policy performs close to the optimal solution with perfect future information, and that the graph DDPG algorithms converge much faster than existing DDPG algorithms.
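The PE and PI properties mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's actor or critic design; it is a minimal, assumed construction in plain Python where `pe_layer` (a hypothetical name) applies shared weights to each user plus a mean over all users, and `pi_readout` (also hypothetical) pools over users before producing a scalar. The dimensions, random weights, and the choice of mean aggregation are all assumptions made for illustration.

```python
import math
import random


def matvec(vec, weight):
    """Multiply a row vector by a weight matrix stored as a list of rows."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def pe_layer(features, w_self, w_agg):
    """Toy permutation-equivariant layer (assumed form, not the paper's).

    Every user row is transformed by the same w_self, and the mean over
    all users is transformed by the same w_agg, so reordering the users
    only reorders the output rows.
    """
    n = len(features)
    mean = [sum(col) / n for col in zip(*features)]
    agg = matvec(mean, w_agg)
    return [
        [math.tanh(s + a) for s, a in zip(matvec(row, w_self), agg)]
        for row in features
    ]


def pi_readout(hidden, w_out):
    """Toy permutation-invariant readout: mean-pool users, then a dot product."""
    n = len(hidden)
    pooled = [sum(col) / n for col in zip(*hidden)]
    return sum(p * w for p, w in zip(pooled, w_out))


rng = random.Random(0)
num_users, in_dim, hid_dim = 4, 3, 5  # illustrative sizes only

features = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(num_users)]
w_self = [[rng.gauss(0, 1) for _ in range(hid_dim)] for _ in range(in_dim)]
w_agg = [[rng.gauss(0, 1) for _ in range(hid_dim)] for _ in range(in_dim)]
w_out = [rng.gauss(0, 1) for _ in range(hid_dim)]

# Reorder the users and run both functions on the reordered input.
perm = list(range(num_users))
rng.shuffle(perm)
features_perm = [features[i] for i in perm]

hidden = pe_layer(features, w_self, w_agg)
hidden_perm = pe_layer(features_perm, w_self, w_agg)
value = pi_readout(hidden, w_out)
value_perm = pi_readout(hidden_perm, w_out)

# PE: permuting the input rows permutes the output rows in the same way.
pe_holds = all(
    math.isclose(a, b, abs_tol=1e-9)
    for k, i in enumerate(perm)
    for a, b in zip(hidden_perm[k], hidden[i])
)
# PI: the scalar output does not depend on the order of the users.
pi_holds = math.isclose(value, value_perm, abs_tol=1e-9)
```

Because the weights are shared across users, the number of trainable parameters in such a layer does not grow with the number of users, which is one intuition for why exploiting these properties may help training efficiency.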


research
02/10/2021

Deep Reinforcement Learning with Symmetric Prior for Predictive Power Allocation to Mobile Users

Deep reinforcement learning has been applied for a variety of wireless t...
research
03/21/2020

Accelerating Deep Reinforcement Learning With the Aid of a Partial Model: Power-Efficient Predictive Video Streaming

Predictive power allocation is conceived for power-efficient video strea...
research
08/03/2021

Variational Actor-Critic Algorithms

We introduce a class of variational actor-critic algorithms based on a v...
research
11/22/2018

An Off-policy Policy Gradient Theorem Using Emphatic Weightings

Policy gradient methods are widely used for control in reinforcement lea...
research
09/24/2021

A Graph Policy Network Approach for Volt-Var Control in Power Distribution Systems

Volt-var control (VVC) is the problem of operating power distribution sy...
research
12/11/2020

Deep Deterministic Policy Gradient for Relay Selection and Power Allocation in Cooperative Communication Network

Cooperative communication is an effective approach to improve spectrum u...
research
12/03/2018

Resource Constrained Deep Reinforcement Learning

In urban environments, supply resources have to be constantly matched to...
