Pointing the Unknown Words

03/26/2016
by   Caglar Gulcehre, et al.
0

The problem of rare and unknown words is an important issue that can potentially influence the performance of many NLP systems, including both the traditional count-based and the deep learning models. We propose a novel way to deal with the rare and unseen words for the neural network models using attention. Our model uses two softmax layers in order to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word in the shortlist vocabulary. At each time-step, the decision of which softmax layer to use choose adaptively made by an MLP which is conditioned on the context. We motivate our work from a psychological evidence that humans naturally have a tendency to point towards objects in the context or the environment when the name of an object is not known. We observe improvements on two tasks, neural machine translation on the Europarl English to French parallel corpora and text summarization on the Gigaword dataset using our proposed model.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/26/2016

Pointer Sentinel Mixture Models

Recent neural network sequence models with softmax classifiers have achi...
research
10/29/2018

Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks

Neural language models have been widely used in various NLP tasks, inclu...
research
04/04/2016

Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models

Nearly all previous work on neural machine translation (NMT) has used qu...
research
12/15/2015

Strategies for Training Large Vocabulary Neural Language Models

Training neural network language models over large vocabularies is still...
research
06/11/2018

Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Neural language models (NLMs) have recently gained a renewed interest by...
research
12/02/2022

Nonparametric Masked Language Modeling

Existing language models (LMs) predict tokens with a softmax over a fini...
research
04/20/2023

Attention Scheme Inspired Softmax Regression

Large language models (LLMs) have made transformed changes for human soc...

Please sign up or login with your details

Forgot password? Click here to reset