Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval

06/05/2020
by   Limin Chen, et al.

Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share many commonalities, including an agent that learns while interacting, a long-term and complex goal, and an algorithm that explores and adapts. One challenge in applying RL methods to IIR is obtaining enough relevance labels to train the RL agents, which are notoriously sample-inefficient. Moreover, in a text corpus annotated for a given query, irrelevant documents far outnumber relevant ones. This leaves the agent with highly unbalanced training experiences and prevents it from learning an effective policy. Our paper addresses this issue by using domain randomization to synthesize more relevant documents for training. Our experimental results on the Text REtrieval Conference (TREC) Dynamic Domain (DD) 2017 Track show that the proposed method boosts an RL agent's learning effectiveness by 22% in dealing with unseen situations.

research
02/17/2022

Retrieval-Augmented Reinforcement Learning

Most deep reinforcement learning (RL) algorithms distill experience into...
research
11/23/2019

Corpus-Level End-to-End Exploration for Interactive Systems

A core interest in building Artificial Intelligence (AI) agents is to le...
research
07/28/2023

Dialogue Shaping: Empowering Agents through NPC Interaction

One major challenge in reinforcement learning (RL) is the large amount o...
research
11/23/2022

Reinforcement Learning Agent Design and Optimization with Bandwidth Allocation Model

Reinforcement learning (RL) is currently used in various real-life appli...
research
01/22/2020

On Solving Cooperative MARL Problems with a Few Good Experiences

Cooperative Multi-agent Reinforcement Learning (MARL) is crucial for coo...
research
06/10/2022

Large-Scale Retrieval for Reinforcement Learning

Effective decision making involves flexibly relating past experiences an...
research
08/07/2019

Text mining policy: Classifying forest and landscape restoration policy agenda with neural information retrieval

Dozens of countries have committed to restoring the ecological functiona...
