Multicast Scheduling for Multi-Message over Multi-Channel: A Permutation-based Wolpertinger Deep Reinforcement Learning Method

05/19/2022
by   Ran Li, et al.

Multicasting is an efficient technique for simultaneously transmitting common messages from the base station (BS) to multiple mobile users (MUs). The multicast scheduling problem for multiple messages over multiple channels, which jointly minimizes the energy consumption of the BS and the latency of serving asynchronous requests from the MUs, is formulated as an infinite-horizon Markov decision process (MDP) with a large discrete action space and multiple time-varying constraints. This type of MDP has not been efficiently addressed in the literature. By studying the intrinsic features of this MDP under stationary policies and refining the reward function, we first simplify it to an equivalent form with a much smaller state space. Then, we propose a modified deep reinforcement learning (DRL) algorithm, namely the permutation-based Wolpertinger deep deterministic policy gradient (PW-DDPG), to solve the simplified problem. Specifically, PW-DDPG uses a permutation-based action embedding module to address the large discrete action space and a feasible exploration module to handle the time-varying constraints. Moreover, as a benchmark, an upper bound for the considered MDP is derived by solving an integer programming problem. Numerical results show that the proposed algorithm achieves performance close to the derived benchmark.
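
For readers unfamiliar with the Wolpertinger idea that PW-DDPG builds on, the following is a minimal sketch of the generic Wolpertinger action-selection step: an actor outputs a continuous "proto-action", the k nearest discrete actions are looked up, and a critic picks the one with the highest estimated value. It is not the paper's permutation-based embedding or its feasible exploration module, whose details the abstract does not give. The function name `wolpertinger_select`, the `is_feasible` filter, and the toy `q_value` are all hypothetical and used for illustration only.

```python
import math
from typing import Callable, Sequence, Tuple

Action = Tuple[float, ...]


def wolpertinger_select(
    proto_action: Action,
    candidates: Sequence[Action],
    q_value: Callable[[Action], float],
    is_feasible: Callable[[Action], bool],
    k: int = 3,
) -> Action:
    """Pick a discrete action near a continuous proto-action.

    Assumption: feasibility is checked with a simple predicate before the
    nearest-neighbour lookup. This stands in for, but is not, the paper's
    feasible exploration module.
    """
    # Drop candidates that violate the (possibly time-varying) constraints.
    feasible = [a for a in candidates if is_feasible(a)]
    if not feasible:
        raise ValueError("no feasible action available")
    # Keep the k feasible actions closest to the proto-action (Euclidean).
    nearest = sorted(feasible, key=lambda a: math.dist(a, proto_action))[:k]
    # Let the critic choose the best of the k neighbours.
    return max(nearest, key=q_value)


if __name__ == "__main__":
    # Toy setup: binary scheduling vectors over two channels.
    actions = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    chosen = wolpertinger_select(
        proto_action=(0.9, 0.2),
        candidates=actions,
        q_value=lambda a: -sum(a),          # hypothetical critic
        is_feasible=lambda a: sum(a) <= 1,  # hypothetical constraint
        k=2,
    )
    print(chosen)
```

With `k=1` the scheme reduces to a plain nearest-neighbour lookup. Larger `k` lets the critic correct for proto-actions that land near a poor discrete action, at the cost of more critic evaluations per step.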

research
05/19/2022

Coexistence between Task- and Data-Oriented Communications: A Whittle's Index Guided Multi-Agent Reinforcement Learning Approach

We investigate the coexistence of task-oriented and data-oriented commun...
research
11/06/2019

Resilient Load Restoration in Microgrids Considering Mobile Energy Storage Fleets: A Deep Reinforcement Learning Approach

Mobile energy storage systems (MESSs) provide mobility and flexibility t...
research
08/28/2023

Deep Reinforcement Learning for Uplink Scheduling in NOMA-URLLC Networks

This article addresses the problem of Ultra Reliable Low Latency Communi...
research
07/25/2022

Online Reinforcement Learning for Periodic MDP

We study learning in periodic Markov Decision Process (MDP), a special ty...
research
10/19/2019

Dynamic Content Update for Wireless Edge Caching via Deep Reinforcement Learning

This letter studies a basic wireless caching network where a source serv...
research
06/03/2019

Decentralized Deep Reinforcement Learning for Delay-Power Tradeoff in Vehicular Communications

This paper targets the problem of radio resource management for expec...
research
06/09/2022

An Optimization Method-Assisted Ensemble Deep Reinforcement Learning Algorithm to Solve Unit Commitment Problems

Unit commitment (UC) is a fundamental problem in the day-ahead electrici...
