A Deep Reinforcement Learning-Based Resource Scheduler for Massive MIMO Networks
The large number of antennas in massive MIMO systems allows the base station to serve multiple users on the same time-frequency resource using multi-user beamforming. However, highly correlated user channels can drastically reduce the spectral efficiency that multi-user beamforming achieves. It is therefore critical for the base station to schedule a suitable group of users in each transmission interval, maximizing spectral efficiency while adhering to fairness constraints among the users. User scheduling is an NP-hard problem, with complexity growing exponentially with the number of users. In this paper, we consider the user scheduling problem for massive MIMO systems. Inspired by recent achievements of deep reinforcement learning (DRL) in solving problems with large action sets, we propose a dynamic scheduler for massive MIMO based on the state-of-the-art Soft Actor-Critic (SAC) DRL model and the K-Nearest Neighbors (KNN) algorithm. Through comprehensive simulations using realistic massive MIMO channel models as well as real-world datasets from channel measurement experiments, we demonstrate the effectiveness of the proposed model under various channel conditions. In medium-sized networks, where the optimal proportionally fair (PF) scheduler is still computationally feasible, the proposed model performs very close to PF in terms of spectral efficiency and fairness while requiring more than one order of magnitude less computation. Our results also show that the proposed scheduler remains feasible and performs well in networks with a large number of users.
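The abstract does not spell out how KNN is combined with the actor-critic model. One common way to handle a large discrete action set, assumed here purely for illustration, is to let the actor emit a continuous "proto-action", look up its k nearest valid discrete actions, and let a scoring function pick among them. The sketch below follows that idea for binary user-selection vectors. All names (`candidate_groups`, `k_nearest_groups`, `select_group`), the toy weights, and the brute-force enumeration are hypothetical and are not taken from the paper.

```python
from itertools import combinations
import math


def candidate_groups(num_users, max_group_size):
    """Enumerate every binary selection vector with 1..max_group_size users.

    Brute force: the number of groups grows combinatorially with num_users,
    which is exactly why exhaustive scheduling does not scale.
    """
    groups = []
    for size in range(1, max_group_size + 1):
        for members in combinations(range(num_users), size):
            vec = [0] * num_users
            for u in members:
                vec[u] = 1
            groups.append(tuple(vec))
    return groups


def k_nearest_groups(proto_action, groups, k):
    """Return the k groups closest (Euclidean distance) to the proto-action."""
    return sorted(groups, key=lambda g: math.dist(g, proto_action))[:k]


def select_group(proto_action, groups, k, score):
    """Pick the highest-scoring group among the k nearest candidates.

    `score` is a stand-in for a learned critic (assumption, not the paper's).
    """
    return max(k_nearest_groups(proto_action, groups, k), key=score)


if __name__ == "__main__":
    groups = candidate_groups(num_users=4, max_group_size=2)
    proto = [0.9, 0.1, 0.8, 0.2]           # hypothetical actor output
    weights = [1.0, 2.0, 0.5, 3.0]         # hypothetical per-user utility

    def toy_score(group):
        return sum(w * g for w, g in zip(weights, group))

    print(len(groups))
    print(k_nearest_groups(proto, groups, k=3))
    print(select_group(proto, groups, k=3, score=toy_score))
```

Restricting the choice to the k nearest candidates means the scoring function is evaluated k times per decision instead of once per possible group. A practical system would also replace the full enumeration in `candidate_groups` with an approximate nearest-neighbor search.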