Off-policy Learning for Remote Electrical Tilt Optimization

05/21/2020
by Filippo Vannella, et al.

We address the problem of Remote Electrical Tilt (RET) optimization using off-policy Contextual Multi-Armed Bandit (CMAB) techniques. The goal in RET optimization is to control the vertical tilt angle of the antenna so as to optimize Key Performance Indicators (KPIs) representing the Quality of Service (QoS) perceived by users in cellular networks. Learning an improved tilt update policy is hard. On the one hand, learning a new policy online in a real network requires exploring tilt updates that have never been used before, which is operationally too risky. On the other hand, devising the policy via simulations suffers from the simulation-to-reality gap. In this paper, we circumvent these issues by learning an improved policy offline from existing data collected on real networks. We formulate the problem of devising such a policy using the off-policy CMAB framework. We propose CMAB learning algorithms to extract optimal tilt update policies from the data. We train and evaluate these policies on real-world 4G Long Term Evolution (LTE) cellular network data. Our policies show consistent improvements over the rule-based logging policy used to collect the data.
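
To make the off-policy setting concrete, the sketch below estimates the value of a candidate tilt policy from logged data using inverse propensity scoring (IPS), a standard off-policy estimator. The abstract does not say which estimator the authors use, so this is an illustration of the general idea, not their method. The action encoding (0 = downtilt, 1 = keep, 2 = uptilt), the one-dimensional context, the rewards, and the logging propensities are all hypothetical toy values.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# One logged interaction: (context, action, reward, logging propensity).
# The propensity is the probability with which the logging policy chose
# the recorded action in that context (assumed known here).
LoggedSample = Tuple[Sequence[float], int, float, float]


def ips_value(
    logs: List[LoggedSample],
    target_policy: Callable[[Sequence[float], int], float],
    clip: Optional[float] = None,
) -> float:
    """IPS estimate of the average reward the target policy would obtain.

    target_policy(context, action) returns the probability that the
    candidate policy picks `action` in `context`.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_policy(context, action) / propensity
        if clip is not None:
            # Optional weight clipping trades bias for lower variance.
            weight = min(weight, clip)
        total += weight * reward
    return total / len(logs)


def target_policy(context: Sequence[float], action: int) -> float:
    """Hypothetical deterministic rule: downtilt (0) when the context
    feature is below 0.5, otherwise uptilt (2)."""
    chosen = 0 if context[0] < 0.5 else 2
    return 1.0 if action == chosen else 0.0


# Hypothetical toy log; not data from the paper.
LOGS: List[LoggedSample] = [
    ([0.2], 0, 1.0, 0.5),
    ([0.2], 1, 0.0, 0.5),
    ([0.8], 2, 1.0, 0.5),
    ([0.8], 1, 0.5, 0.5),
]

if __name__ == "__main__":
    print(ips_value(LOGS, target_policy))            # unclipped estimate
    print(ips_value(LOGS, target_policy, clip=1.0))  # clipped estimate
```

Because the estimate only needs logged contexts, actions, rewards, and propensities, a candidate policy can be scored without deploying untested tilt updates on the live network. The estimate is only reliable when the logging policy assigns non-zero probability to every action the candidate policy might take.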


