Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs
Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at low cost over commodity platforms, offering unprecedented network management flexibility. In this paper, a novel deep Reinforcement Learning (RL)-based framework is proposed that jointly reconfigures the functional splits of the Base Stations (BSs), the resources and locations of the virtualized Central Units (vCUs) and Distributed Units (vDUs), and the routing of each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to possibly varying traffic demands and resource availability. Testbed measurements are performed to study the relation between traffic demand and computing resource utilization, revealing that this relation exhibits high variance and depends on both the platform and the platform load. Hence, acquiring a perfect model of the underlying vRAN system is highly non-trivial. A comprehensive cost function is formulated that accounts for resource overprovisioning, vCU/vDU instantiation and reconfiguration, and declined demands; these cost impacts make it imperative to perform reconfigurations prudently. Motivated by these insights, the solution framework is developed using model-free multi-agent RL, where each agent controls the configurations of a single BS. Each agent, however, faces a multi-dimensional discrete action space due to the joint configuration decision for its BS. To overcome this curse of dimensionality, each agent employs a Dueling Double Q-network with action branching. Each agent then learns its optimal policy independently, selecting actions that reconfigure its BS. Simulations are performed using an O-RAN compliant model. The results show that the framework successfully learns the optimal policy, can be readily applied to different vRAN systems via transfer learning, and achieves significant cost savings over the benchmarks.
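As a minimal sketch of how such a cost objective might be composed (the notation and the linear weighting below are illustrative assumptions, not the paper's formulation), the per-step cost could take the form

  C_t = c_o \sum_s (a_{s,t} - u_{s,t})^{+} + c_i \, n_t^{\mathrm{inst}} + c_r \, n_t^{\mathrm{rec}} + c_d \, \lambda_t^{\mathrm{dec}},

where a_{s,t} and u_{s,t} are the allocated and utilized computing resources at server s, n_t^{inst} and n_t^{rec} count vCU/vDU instantiations and BS reconfigurations at step t, \lambda_t^{dec} is the declined traffic demand, and c_o, c_i, c_r, c_d are the corresponding unit costs. The RL objective would then be to minimize the expected discounted sum \mathbb{E}[\sum_t \gamma^t C_t].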
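To illustrate the action-branching idea, the following is a minimal sketch (assumed PyTorch implementation; layer sizes, branch counts, and names are hypothetical, not the paper's code) of a branching dueling Q-network head. Each action dimension of a BS agent (e.g., functional split, vCU/vDU placement, route) gets its own advantage branch over a shared state encoding, so the number of outputs grows linearly in the action dimensions rather than combinatorially.

# Hypothetical sketch of a branching dueling Q-network (BDQ) head.
import torch
import torch.nn as nn

class BranchingDuelingQNet(nn.Module):
    def __init__(self, obs_dim: int, branch_sizes: list[int], hidden: int = 128):
        super().__init__()
        # Shared trunk over the BS observation (e.g., traffic load, CPU usage).
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)  # single state-value stream
        # One advantage stream per action dimension.
        self.advantages = nn.ModuleList(nn.Linear(hidden, n) for n in branch_sizes)

    def forward(self, obs: torch.Tensor) -> list[torch.Tensor]:
        h = self.trunk(obs)
        v = self.value(h)                      # shape: (batch, 1)
        q_branches = []
        for adv_layer in self.advantages:
            a = adv_layer(h)                   # shape: (batch, n_d)
            # Dueling aggregation per branch: Q_d = V + A_d - mean(A_d).
            q_branches.append(v + a - a.mean(dim=-1, keepdim=True))
        return q_branches

# Usage: a hypothetical BS agent with 3 splits, 4 placements, 5 routes.
net = BranchingDuelingQNet(obs_dim=16, branch_sizes=[3, 4, 5])
q_per_branch = net(torch.randn(2, 16))
action = [q.argmax(dim=-1) for q in q_per_branch]  # one sub-action per branch

The Double Q-learning component would then apply the standard trick per branch when forming TD targets: the online network selects the argmax sub-action and the target network evaluates it.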