
Pointing the Unknown Words

by Caglar Gulcehre, et al.

The problem of rare and unknown words is an important issue that can potentially influence the performance of many NLP systems, including both traditional count-based models and deep learning models. We propose a novel way to deal with rare and unseen words in neural network models using attention. Our model uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each time-step, the decision of which softmax layer to use is made adaptively by an MLP that is conditioned on the context. Our work is motivated by psychological evidence that humans naturally tend to point towards objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset.
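The switching mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the one-hidden-layer tanh MLP, the 0.5 threshold on the switch output, and the greedy argmax choice are all assumptions made for illustration, and the logits would in practice come from a trained attention-based decoder.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def switch_probability(context, W_hidden, b_hidden, w_out, b_out):
    """Hypothetical switching MLP: context vector -> probability in (0, 1).

    A value near 1 favours the shortlist softmax, a value near 0 favours
    the location (pointing) softmax. The architecture is an assumption.
    """
    hidden = np.tanh(W_hidden @ context + b_hidden)
    return 1.0 / (1.0 + np.exp(-(w_out @ hidden + b_out)))


def predict_next_word(context, source_words, shortlist,
                      location_logits, shortlist_logits, mlp_params):
    """Pick the next word from one of the two softmax layers.

    location_logits has one score per source position;
    shortlist_logits has one score per shortlist word.
    """
    z = switch_probability(context, *mlp_params)
    if z >= 0.5:  # assumed hard threshold on the switch
        probs = softmax(shortlist_logits)
        return shortlist[int(np.argmax(probs))], "shortlist"
    # Otherwise point at a position in the source sentence and copy its word.
    probs = softmax(location_logits)
    return source_words[int(np.argmax(probs))], "location"


# Toy example with made-up numbers: a rare name appears in the source.
source_words = ["Gulcehre", "visited", "Montreal"]
shortlist = ["the", "visited", "<unk>"]
context = np.zeros(4)
# Zero weights with a negative output bias push the switch towards pointing.
mlp_params = (np.zeros((3, 4)), np.zeros(3), np.zeros(3), -5.0)

word, layer = predict_next_word(
    context, source_words, shortlist,
    location_logits=np.array([3.0, 0.1, 0.2]),
    shortlist_logits=np.array([0.5, 0.2, 2.0]),
    mlp_params=mlp_params,
)
print(word, layer)
```

In this toy setting the switch favours the location softmax, so the model copies the rare source word rather than emitting the shortlist's unknown-word token.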



