Personalization for Web-based Services using Offline Reinforcement Learning

Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address the challenges of learning such policies through model-free offline Reinforcement Learning (RL) with off-policy training. Deployed in a production system for user authentication in a major social network, our approach significantly improves long-term objectives. We articulate practical challenges, compare several ML techniques, provide insights on the training and evaluation of RL models, and discuss generalizations.
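The core idea of model-free offline RL with off-policy training can be sketched with a minimal Q-learning loop fit on a fixed interaction log. This is an illustrative toy, not the paper's actual system: the log schema, state names, and reward values below are hypothetical assumptions.

```python
# Minimal sketch of model-free offline RL via off-policy Q-learning.
# Assumes a hypothetical log of (state, action, reward, next_state)
# tuples collected by a previously deployed UI policy; no new
# interactions are gathered during training (i.e., fully offline).
from collections import defaultdict

def offline_q_learning(logged_transitions, n_actions, gamma=0.9, alpha=0.1, epochs=50):
    """Fit tabular Q-values by replaying a fixed log of transitions."""
    Q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(epochs):
        for s, a, r, s_next in logged_transitions:
            # Off-policy target: bootstrap from the greedy next action,
            # regardless of which action the logging policy actually took.
            target = r + gamma * max(Q[(s_next, b)] for b in range(n_actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

# Toy log: for a new user, UI variant 1 led to success (reward 1.0),
# variant 0 did not. Both episodes terminate immediately.
log = [("new_user", 0, 0.0, "done"), ("new_user", 1, 1.0, "done")]
Q = offline_q_learning(log, n_actions=2)
best_action = max(range(2), key=lambda a: Q[("new_user", a)])
```

In practice, production systems of this kind typically replace the table with a function approximator and add safeguards against extrapolation beyond the logged data, but the off-policy update above is the underlying mechanism.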


Related research

- On the Opportunities and Challenges of using Animals Videos in Reinforcement Learning (09/25/2022): We investigate the use of animals videos to improve efficiency and perfo...
- MOORe: Model-based Offline-to-Online Reinforcement Learning (01/25/2022): With the success of offline reinforcement learning (RL), offline trained...
- Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods (03/25/2020): Recent advances in machine learning are consistently enabled by increasi...
- Uncertainty-driven Trajectory Truncation for Model-based Offline Reinforcement Learning (04/10/2023): Equipped with the trained environmental dynamics, model-based offline re...
- Learning from Human Feedback: Challenges for Real-World Reinforcement Learning in NLP (11/04/2020): Large volumes of interaction logs can be collected from NLP systems that...
- Towards Data-Driven Offline Simulations for Online Reinforcement Learning (11/14/2022): Modern decision-making systems, from robots to web recommendation engine...
- On the Effectiveness of Offline RL for Dialogue Response Generation (07/23/2023): A common training technique for language models is teacher forcing (TF)....
