Improving a sequence-to-sequence NLP model using a reinforcement learning policy algorithm

12/28/2022
by Jabri Ismail, et al.

Neural network models of dialogue generation (chatbots) show great promise for generating responses for conversational agents. However, they are short-sighted: they predict utterances one at a time and disregard the impact of each utterance on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models to rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, using policy gradient methods to reward sequences that exhibit three useful conversational characteristics: the flow of informality, coherence, and simplicity of response (related to the forward-looking function). We assess the model on its diversity, length, and complexity with respect to humans. In dialogue simulation, evaluations show that the proposed model generates more interactive responses and encourages a more sustained, successful conversation. This work marks a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
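
The policy gradient idea mentioned in the abstract can be illustrated with a deliberately small sketch. It is not the authors' model: instead of a sequence-to-sequence network and simulated two-agent dialogues, it assumes a softmax policy over a fixed set of candidate responses, each carrying three made-up scores that stand in for the three conversational characteristics. The function names (`combined_reward`, `reinforce_step`, `train`), the equal weights, the learning rate, and the running-average baseline are all illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_reward(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three per-response scores (hypothetical scores and weights)."""
    return sum(w * s for w, s in zip(weights, scores))


def reinforce_step(logits, rewards, lr, baseline, rng):
    """One REINFORCE update: sample a response, then move the logits along
    (reward - baseline) * grad log pi(action)."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    advantage = rewards[action] - baseline
    # For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - probs[i].
    new_logits = [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]
    return new_logits, action


def train(candidate_scores, steps=3000, lr=0.2, seed=0):
    """Train the toy policy and return its final response probabilities."""
    rng = random.Random(seed)
    rewards = [combined_reward(s) for s in candidate_scores]
    logits = [0.0] * len(rewards)
    baseline = 0.0  # running average of observed rewards, used to reduce variance
    for _ in range(steps):
        logits, action = reinforce_step(logits, rewards, lr, baseline, rng)
        baseline += 0.05 * (rewards[action] - baseline)
    return softmax(logits)


if __name__ == "__main__":
    # Three candidate responses, each with three illustrative scores in [0, 1].
    scores = [(0.1, 0.2, 0.1), (0.9, 0.8, 0.7), (0.4, 0.5, 0.3)]
    print(train(scores))
```

In this toy setting the policy should shift probability toward the candidate with the highest combined score. In a real dialogue model the logits would come from a decoder network, and the reward would be computed over a simulated multi-turn conversation, not a single response.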


