Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach

01/06/2022
by Filippo Vannella, et al.

Controlling antenna tilts in cellular networks is imperative to reach an efficient trade-off between network coverage and capacity. In this paper, we devise algorithms that learn optimal tilt control policies from existing data (the so-called passive learning setting) or from data actively generated by the algorithms (the active learning setting). We formalize the design of such algorithms as a Best Policy Identification (BPI) problem in Contextual Linear Multi-Arm Bandits (CL-MAB). An arm represents an antenna tilt update; the context captures current network conditions; the reward corresponds to an improvement in performance, combining coverage and capacity; and the objective is to identify, with a given level of confidence, an approximately optimal policy (a function mapping the context to an arm with maximal reward). For CL-MAB in both the active and passive learning settings, we derive information-theoretic lower bounds on the number of samples required by any algorithm that returns an approximately optimal policy with a given level of certainty, and devise algorithms achieving these fundamental limits. We apply our algorithms to the Remote Electrical Tilt (RET) optimization problem in cellular networks, and show that they can produce an optimal tilt update policy using far fewer data samples than naive or existing rule-based learning algorithms.
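To make the CL-MAB formulation concrete, the following is a minimal sketch of the passive setting, not the algorithm proposed in the paper. It assumes a linear reward model, reward = theta · phi(context, arm) + noise, fits theta by regularized least squares on logged (context, arm, reward) triples, and returns the greedy policy. The arm set, the feature map `features`, the two-dimensional context, the noise level, and the synthetic log generator are all hypothetical choices made for illustration. The sketch does not implement the paper's sample-complexity lower bounds or any stopping rule.

```python
import random

# Hypothetical arm set: tilt updates (down-tilt, no change, up-tilt).
ARMS = (-1, 0, 1)


def features(context, arm):
    """Hypothetical feature map phi(x, a): the context copied into the arm's block."""
    vec = [0.0] * (len(ARMS) * len(context))
    start = ARMS.index(arm) * len(context)
    vec[start:start + len(context)] = context
    return vec


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def fit_theta(logged, dim, ridge=1e-3):
    """Regularized least-squares estimate of theta from logged data."""
    A = [[ridge if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    b = [0.0] * dim
    for context, arm, reward in logged:
        phi = features(context, arm)
        for i in range(dim):
            b[i] += phi[i] * reward
            for j in range(dim):
                A[i][j] += phi[i] * phi[j]
    return solve(A, b)


def greedy_policy(theta):
    """Return a policy mapping a context to the arm with the highest estimated reward."""
    def policy(context):
        return max(
            ARMS,
            key=lambda a: sum(t * f for t, f in zip(theta, features(context, a))),
        )
    return policy


def simulate_logs(theta_true, n, rng, noise=0.1):
    """Synthetic logs: uniform contexts, uniformly random arms, noisy linear rewards."""
    logs = []
    for _ in range(n):
        context = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        arm = rng.choice(ARMS)
        mean = sum(t * f for t, f in zip(theta_true, features(context, arm)))
        logs.append((context, arm, mean + rng.gauss(0, noise)))
    return logs


if __name__ == "__main__":
    rng = random.Random(0)
    theta_true = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]  # illustrative ground truth
    theta_hat = fit_theta(simulate_logs(theta_true, 2000, rng), dim=6)
    print(greedy_policy(theta_hat)([0.9, -0.5]))
```

In the active setting described in the abstract, the algorithm would also choose which arm to sample in each context rather than reading from a fixed log; the estimation and greedy-selection steps sketched here would stay the same under the linear-reward assumption.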

research
05/21/2020

Off-policy Learning for Remote Electrical Tilt Optimization

We address the problem of Remote Electrical Tilt (RET) optimization usin...
research
07/05/2022

Instance-optimal PAC Algorithms for Contextual Bandits

In the stochastic contextual bandit setting, regret-minimizing algorithm...
research
04/14/2022

Measurement-based Admission Control in Sliced Networks: A Best Arm Identification Approach

In sliced networks, the shared tenancy of slices requires adaptive admis...
research
12/12/2019

Sublinear Optimal Policy Value Estimation in Contextual Bandits

We study the problem of estimating the expected reward of the optimal po...
research
06/26/2021

The Role of Contextual Information in Best Arm Identification

We study the best-arm identification problem with fixed confidence when ...
research
09/16/2020

Lower Bounds for Policy Iteration on Multi-action MDPs

Policy Iteration (PI) is a classical family of algorithms to compute an ...
research
09/15/2022

Semiparametric Best Arm Identification with Contextual Information

We study best-arm identification with a fixed budget and contextual (cov...