Reinforcement-Learning-Based Resource Allocation in Fog Radio Access Networks for Various IoT Environments
Fog radio access network (F-RAN) has recently been proposed to satisfy the low-latency communication requirements of Internet of Things (IoT) applications. We consider the problem of sequentially allocating the limited resources of a fog node to a heterogeneous population of IoT applications with varying latency requirements. Specifically, for each service request it receives over time, the fog node must decide whether to serve that user locally, providing it with low-latency communication, or to refer it to the cloud control center, keeping valuable fog resources available for future users with potentially higher utility to the system (i.e., a tighter latency requirement). We formulate the problem as a Markov Decision Process (MDP) in two alternative formulations: an infinite-horizon MDP (IH MDP) and a finite-horizon MDP (FH MDP). For both the IH and FH formulations, we present the optimal solution, known as the optimal policy, through Reinforcement Learning (RL). The optimal policies in both cases are learned from the IoT environment using different RL methods. The significant advantage of the proposed RL methods over the straightforward approach of deciding based on a fixed utility threshold is that the RL methods quickly learn the optimal decision thresholds from the IoT environment, and thus always achieve the best possible performance regardless of the environment. They strike the right balance between the two conflicting objectives: maximizing the average total served utility and minimizing the fog node's idle time. Extensive simulation results for various IoT environments corroborate the theoretical underpinnings of the proposed RL methods.
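To make the admission-control setting concrete, the sketch below shows a tabular Q-learning agent for a toy version of the problem: a fog node with a few service slots observes each arriving request's utility class and learns whether to serve it locally or refer it to the cloud. This is an illustrative sketch only, not the paper's method; the environment dynamics (utility classes in {1, 2, 3}, a per-step slot departure probability, and all hyperparameters) are hypothetical assumptions.

```python
import random

def q_learning_admission(capacity=3, steps=20000, alpha=0.1, gamma=0.95,
                         epsilon=0.1, p_depart=0.3, seed=0):
    """Tabular Q-learning sketch for fog admission control (illustrative).

    State: (free_slots, utility_class) of the arriving request.
    Action: 0 = refer to cloud (reward 0), 1 = serve locally (reward = utility).
    Hypothetical environment: request utilities are drawn uniformly from
    {1, 2, 3} (higher utility = tighter latency requirement), and each busy
    slot finishes service with probability p_depart per step.
    """
    rng = random.Random(seed)
    utilities = [1, 2, 3]
    # One Q-value per (state, action) pair.
    Q = {(f, u): [0.0, 0.0] for f in range(capacity + 1) for u in utilities}

    free = capacity
    u = rng.choice(utilities)
    for _ in range(steps):
        state = (free, u)
        # Epsilon-greedy action; serving locally requires a free slot.
        if free == 0:
            a = 0
        elif rng.random() < epsilon:
            a = rng.randint(0, 1)
        else:
            a = 1 if Q[state][1] > Q[state][0] else 0
        reward = u if a == 1 else 0
        if a == 1:
            free -= 1
        # Each busy slot departs independently with probability p_depart.
        busy = capacity - free
        free += sum(1 for _ in range(busy) if rng.random() < p_depart)
        u = rng.choice(utilities)
        next_state = (free, u)
        # Standard Q-learning update toward the bootstrapped target.
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state])
                                - Q[state][a])
    return Q
```

In this toy setting the learned policy behaves like a state-dependent utility threshold: with slots to spare, the agent learns that serving a high-utility request beats referring it, which mirrors the threshold structure the abstract attributes to the optimal policy.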