CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

04/18/2022
by Siddharth Verma, et al.

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties. However, dialogue can also be regarded as a goal-directed process, where speakers attempt to accomplish a specific task. Reinforcement learning (RL) algorithms are designed specifically for solving such goal-directed problems, but the most direct way to apply RL, through trial-and-error learning in human conversations, is costly. In this paper, we study how offline reinforcement learning can instead be used to train dialogue agents entirely using static datasets collected from human speakers. Our experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents that better accomplish task goals.


