Deep Reinforcement Learning with Importance Weighted A3C for QoE enhancement in Video Delivery Services

04/10/2023
by   Mandan Naresh, et al.

Adaptive bitrate (ABR) algorithms adapt the video bitrate to network conditions in order to improve the overall video quality of experience (QoE). Recently, reinforcement learning (RL), and asynchronous advantage actor-critic (A3C) methods in particular, have been used to generate ABR algorithms, and they have been shown to improve overall QoE compared to fixed-rule ABR algorithms. However, a common issue in A3C methods is the lag between the behaviour policy and the target policy. As a result, the two policies are no longer synchronized, which leads to suboptimal updates. In this work, we present ALISA: An Actor-Learner Architecture with Importance Sampling for efficient learning in ABR algorithms. ALISA incorporates importance sampling weights that give more weight to relevant experience, addressing the lag issue in existing A3C methods. We present the design and implementation of ALISA, and compare its performance to state-of-the-art video rate adaptation algorithms, including vanilla A3C as implemented in the Pensieve framework and fixed-rule schedulers such as BB, BOLA, and RB. Our results show that ALISA achieves up to 25% higher average QoE than Pensieve, and even more when compared to fixed-rule schedulers.
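To make the idea of importance sampling weights concrete, the following is a minimal sketch of how a learner might reweight experience collected under a lagging behaviour policy. It is an illustration only, not ALISA's actual formulation, which the abstract does not specify: the truncated ratio min(clip, π_target / π_behaviour), the `clip` parameter, and the function names `importance_weights` and `weighted_policy_loss` are all assumptions made for this example.

```python
import math

def importance_weights(target_logp, behaviour_logp, clip=1.0):
    """Truncated importance ratios min(clip, pi_target / pi_behaviour), one per step.

    Hypothetical helper: inputs are log-probabilities of the actions actually
    taken, evaluated under the learner's (target) and the actor's (behaviour)
    policy. Truncation at `clip` is an assumed variance-control choice.
    """
    return [min(clip, math.exp(t - b)) for t, b in zip(target_logp, behaviour_logp)]

def weighted_policy_loss(target_logp, behaviour_logp, advantages, clip=1.0):
    """Importance-weighted policy-gradient surrogate loss, averaged over steps.

    In an autodiff framework the weights would be treated as constants, so the
    gradient would flow only through target_logp.
    """
    weights = importance_weights(target_logp, behaviour_logp, clip)
    terms = [-w * lp * adv for w, lp, adv in zip(weights, target_logp, advantages)]
    return sum(terms) / len(terms)
```

When the two policies agree, every ratio is 1 and the loss reduces to the usual unweighted policy-gradient surrogate. Steps whose actions have become less likely under the target policy are down-weighted, which is one way to read "giving more weight to relevant experience."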

research
05/14/2023

PPO-ABR: Proximal Policy Optimization based Deep Reinforcement Learning for Adaptive BitRate streaming

Providing a high Quality of Experience (QoE) for video streaming in 5G a...
research
10/30/2018

Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning

Off-policy learning is more unstable compared to on-policy learning in r...
research
06/10/2019

Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past

Soft Actor-Critic (SAC) is an off-policy actor-critic deep reinforcement...
research
08/01/2022

Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning

Compared to on-policy policy gradient techniques, off-policy model-free ...
research
11/12/2018

Importance Weighted Evolution Strategies

Evolution Strategies (ES) emerged as a scalable alternative to popular R...
research
11/15/2018

Tiyuntsong: A Self-Play Reinforcement Learning Approach for ABR Video Streaming

Existing reinforcement learning(RL)-based adaptive bitrate(ABR) approach...
research
10/29/2018

Variational Inference with Tail-adaptive f-Divergence

Variational inference with α-divergences has been widely used in modern ...
