A Diversity-Promoting Objective Function for Neural Conversation Models

10/11/2015
by Jiwei Li, et al.
Microsoft
Stanford University

Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., "I don't know") regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of output (response) given input (message), is unsuited to response generation tasks. Instead, we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
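
The abstract does not spell out the objective, so the following is only a minimal sketch of how an MMI-style criterion can be used to rerank candidate responses. It assumes a weighted form, log p(T|S) - lam * log p(T), where S is the message and T the response. The `Candidate` type, the weight `lam`, and all log-probabilities below are hypothetical and invented for illustration; in practice they would come from a trained sequence-to-sequence model and a language model.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A candidate response with hypothetical, precomputed log-probabilities."""
    text: str
    log_p_response_given_message: float  # log p(T | S), e.g. from a seq2seq model
    log_p_response: float                # log p(T), e.g. from a language model


def mmi_score(c: Candidate, lam: float) -> float:
    """Assumed MMI-style score: log p(T|S) - lam * log p(T).

    With lam = 0 this reduces to the plain likelihood objective.
    """
    return c.log_p_response_given_message - lam * c.log_p_response


def rerank(candidates: List[Candidate], lam: float = 0.5) -> List[Candidate]:
    """Sort candidates from highest to lowest MMI-style score."""
    return sorted(candidates, key=lambda c: mmi_score(c, lam), reverse=True)


# Invented numbers: the generic reply is likely under both models,
# while the specific reply is unlikely a priori but fits the message.
candidates = [
    Candidate("I don't know.", -2.0, -3.0),
    Candidate("I moved to Seattle last year.", -4.0, -12.0),
]

likelihood_best = rerank(candidates, lam=0.0)[0]
mmi_best = rerank(candidates, lam=0.5)[0]

print("likelihood only:", likelihood_best.text)
print("MMI-style      :", mmi_best.text)
```

With these made-up scores, the likelihood-only ranking prefers the generic reply, while the penalty on log p(T) moves the more specific reply to the top. This is the kind of effect the abstract attributes to the MMI objective.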

09/06/2018

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots

Diversity is a long-studied topic in information retrieval that usually ...
06/03/2016

An Attentional Neural Conversation Model with Improved Specificity

In this paper we propose a neural conversation model for conducting dial...
02/11/2020

Non-Autoregressive Neural Dialogue Generation

Maximum Mutual information (MMI), which models the bidirectional depende...
06/03/2021

Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech

Countermeasures to effectively fight the ever increasing hate speech onl...
11/08/2020

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Natural language generation (NLG) is a critical component in conversatio...
09/16/2018

Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization

Responses generated by neural conversational models tend to lack informa...
11/17/2018

An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss

Affect conveys important implicit information in human communication. Ha...

Code Repositories

chatbot-rnn

A toy chatbot powered by deep learning and trained on data from Reddit


chatbot-rnn

Please be aware that the data provided is already outdated. Sample data will be uploaded for users to test on their own.


toy_chatbot

This code is taken directly from https://github.com/pender/chatbot-rnn. Customized to work with Python 3.5 and TensorFlow 1.0.

