A Diversity-Promoting Objective Function for Neural Conversation Models

by Jiwei Li, et al.
Stanford University

Sequence-to-sequence neural network models for generating conversational responses tend to produce safe, commonplace responses (e.g., "I don't know") regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of the output (response) given the input (message), is unsuited to response generation tasks. Instead, we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
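
The contrast between a likelihood objective and a mutual-information objective can be illustrated with a small reranking sketch. The abstract does not state a formula, so the code below assumes the standard pointwise mutual information identity, log p(S,T)/(p(S)p(T)) = log p(T|S) - log p(T), with a hypothetical weight `lam` on the second term. The function names, the candidate responses, and every log-probability are invented for illustration; none of them come from the paper's models or data.

```python
from typing import List, Tuple

# Each candidate: (response text, log p(T|S), log p(T)).
# All numbers are made up for illustration only.
Candidate = Tuple[str, float, float]


def mmi_score(log_p_t_given_s: float, log_p_t: float, lam: float) -> float:
    """Weighted mutual-information score (assumed form, see lead-in).

    With lam = 0 this is the plain likelihood objective log p(T|S).
    With lam > 0, responses that are probable regardless of the
    message (high log p(T)) are penalized.
    """
    return log_p_t_given_s - lam * log_p_t


def rerank(candidates: List[Candidate], lam: float) -> List[Candidate]:
    """Sort candidates from best to worst under the assumed score."""
    return sorted(
        candidates,
        key=lambda c: mmi_score(c[1], c[2], lam),
        reverse=True,
    )


# Hypothetical N-best list for the message "Where are you from?"
candidates: List[Candidate] = [
    ("I don't know.", -2.0, -1.5),                     # generic: likely under any message
    ("I moved here from Seattle last year.", -4.0, -9.0),  # specific: unlikely a priori
]

by_likelihood = rerank(candidates, lam=0.0)
by_mmi = rerank(candidates, lam=0.5)

print("likelihood:", by_likelihood[0][0])
print("MMI (lam=0.5):", by_mmi[0][0])
```

With `lam=0.0` the ranking reduces to likelihood and the generic reply scores highest (-2.0 against -4.0). With `lam=0.5` the generic reply scores -1.25 and the specific reply scores 0.5, so the specific reply ranks first, which is the behavior the abstract attributes to an MMI-style objective.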



Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots

Diversity is a long-studied topic in information retrieval that usually ...

An Attentional Neural Conversation Model with Improved Specificity

In this paper we propose a neural conversation model for conducting dial...

Non-Autoregressive Neural Dialogue Generation

Maximum Mutual information (MMI), which models the bidirectional depende...

Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech

Countermeasures to effectively fight the ever increasing hate speech onl...

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Natural language generation (NLG) is a critical component in conversatio...

Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization

Responses generated by neural conversational models tend to lack informa...

An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss

Affect conveys important implicit information in human communication. Ha...

Code Repositories


A toy chatbot powered by deep learning and trained on data from Reddit


Please be aware that the data provided is already outdated. Sample data will be uploaded for users to test on their own.


This code is taken directly from https://github.com/pender/chatbot-rnn and customized to work with Python 3.5 and TensorFlow 1.0.
