Learning to retrieve out-of-vocabulary words in speech recognition

11/17/2015
by Imran Sheikh, et al.

Many Proper Names (PNs) are Out-Of-Vocabulary (OOV) words for speech recognition systems used to process diachronic audio data. To help recover the PNs missed by the system, relevant OOV PNs can be retrieved from among the many OOVs by exploiting the semantic context of the spoken content. In this paper, we propose two neural network models targeted at retrieving OOV PNs relevant to an audio document: (a) Document level Continuous Bag of Words (D-CBOW) and (b) Document level Continuous Bag of Weighted Words (D-CBOW2). Both models take document words as input and learn with the objective of maximising the retrieval of co-occurring OOV PNs. With the D-CBOW2 model we propose a new approach in which the input embedding layer is augmented with a context anchor layer. This layer learns to assign importance to input words and can capture (task-specific) keywords in a bag-of-words neural network model. With experiments on French broadcast news videos, we show that these two models outperform the baseline methods based on raw embeddings from LDA, Skip-gram and Paragraph Vectors. Combining the D-CBOW and D-CBOW2 models gives faster convergence during training.
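
To make the difference between the two models concrete, the following is a minimal NumPy sketch of a forward pass for each, not the authors' implementation. The abstract does not give the exact form of the context anchor layer, so the sigmoid gating against a single `anchor` vector is an assumption, as are all sizes, parameter names and the randomly initialised weights (a real model would learn them from documents and their co-occurring OOV PNs).

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE = 50   # in-vocabulary words (hypothetical size)
NUM_OOV = 8       # candidate OOV proper names (hypothetical size)
DIM = 16          # embedding dimension (hypothetical)

# Randomly initialised parameters; training would learn these.
E = rng.normal(scale=0.1, size=(VOCAB_SIZE, DIM))    # input word embeddings
W_out = rng.normal(scale=0.1, size=(DIM, NUM_OOV))   # output layer over OOV PNs
anchor = rng.normal(scale=0.1, size=DIM)             # ASSUMED form of the context anchor


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def d_cbow_scores(word_ids):
    """Unweighted bag of words: average the document's word embeddings."""
    doc_vec = E[word_ids].mean(axis=0)
    return softmax(doc_vec @ W_out)


def d_cbow2_scores(word_ids):
    """Weighted bag of words: each input word gets an importance weight."""
    vecs = E[word_ids]
    # ASSUMPTION: importance in (0, 1) from a sigmoid of the dot product
    # between each word embedding and the anchor vector.
    weights = 1.0 / (1.0 + np.exp(-(vecs @ anchor)))
    doc_vec = (weights[:, None] * vecs).sum(axis=0) / weights.sum()
    return softmax(doc_vec @ W_out)


# A toy "document" given as in-vocabulary word indices.
doc = np.array([3, 17, 17, 42, 8])
p_cbow = d_cbow_scores(doc)
p_cbow2 = d_cbow2_scores(doc)

# Retrieval step: rank candidate OOV PNs by their predicted relevance.
ranking = np.argsort(-p_cbow2)
```

In this sketch the only difference between the two functions is how the document vector is pooled: a plain mean for D-CBOW versus a normalised weighted sum for D-CBOW2, which is what lets some input words count more than others.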

research
03/22/2018

Context is Everything: Finding Meaning Statistically in Semantic Spaces

This paper introduces a simple and explicit measure of word importance i...
research
07/13/2017

Automatic Speech Recognition with Very Large Conversational Finnish and Estonian Vocabularies

Today, the vocabulary size for language models in large vocabulary speec...
research
06/08/2017

Context encoders as a simple but powerful extension of word2vec

With a simple architecture and the ability to learn meaningful word embe...
research
07/06/2019

Bag-of-Audio-Words based on Autoencoder Codebook for Continuous Emotion Prediction

In this paper we present a novel approach for extracting a Bag-of-Words ...
research
09/18/2017

Paraphrasing verbal metonymy through computational methods

Verbal metonymy has received relatively scarce attention in the field of...
research
04/04/2018

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input

In this paper, we explore neural network models that learn to associate ...
research
01/11/2017

Job Detection in Twitter

In this report, we propose a new application for twitter data called job...
