End-to-end model for named entity recognition from speech without paired training data

04/02/2022
by Salima Mdhaffar, et al.

Recent work has shown that end-to-end neural approaches are becoming very popular for spoken language understanding (SLU). Here, end-to-end refers to a single model optimized to extract semantic information directly from the speech signal. A major issue for such models is the lack of paired audio and textual data with semantic annotation. In this paper, we propose an approach to build an end-to-end neural model that extracts semantic information in a scenario where no paired audio data is available. Our approach relies on an external model trained to generate a sequence of vector representations from text. These representations mimic the hidden representations that an end-to-end automatic speech recognition (ASR) model would generate when processing a speech signal. An SLU neural module is then trained using these representations as input and the annotated text as output. Finally, the SLU module replaces the top layers of the ASR model to form the end-to-end model. Our experiments on named entity recognition, carried out on the QUAERO corpus, show that this approach is very promising: it obtains better results than a comparable cascade approach and than an approach based on synthetic voices.
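
The assembly described in the abstract can be pictured with a purely structural sketch. It is not the authors' implementation: every name here (`AsrModel`, `EndToEndSlu`, `make_training_pairs`, `build_end_to_end`) is hypothetical, the models are plain callables rather than neural networks, and training itself is left out. The sketch only shows which component feeds which, assuming the ASR model can be split into lower layers and top layers.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a hidden representation is a sequence of float vectors.
Hidden = List[List[float]]
Signal = List[float]


@dataclass
class AsrModel:
    """Hypothetical ASR model split into two parts."""
    lower: Callable[[Signal], Hidden]  # speech signal -> hidden representations
    top: Callable[[Hidden], str]       # hidden representations -> transcript


@dataclass
class EndToEndSlu:
    """Lower ASR layers followed by the SLU module."""
    lower: Callable[[Signal], Hidden]
    slu: Callable[[Hidden], List[str]]

    def __call__(self, signal: Signal) -> List[str]:
        return self.slu(self.lower(signal))


def make_training_pairs(
    text_to_repr: Callable[[str], Hidden],
    annotated_text: List[Tuple[str, List[str]]],
) -> List[Tuple[Hidden, List[str]]]:
    """Build SLU training pairs from text only.

    text_to_repr stands in for the external model that simulates
    ASR hidden representations from text; no audio is involved.
    """
    return [(text_to_repr(text), labels) for text, labels in annotated_text]


def build_end_to_end(
    asr: AsrModel, slu_module: Callable[[Hidden], List[str]]
) -> EndToEndSlu:
    """Replace the top ASR layers with the trained SLU module."""
    return EndToEndSlu(lower=asr.lower, slu=slu_module)
```

In this picture the SLU module only ever sees simulated representations during training, and real ASR hidden representations at inference time, which is why the external model has to mimic them closely.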

research
05/30/2018

End-to-end named entity extraction from speech

Named entity recognition (NER) is among SLU tasks that usually extract s...
research
05/22/2020

End-to-end Named Entity Recognition from English Speech

Named entity recognition (NER) from text has been a widely studied probl...
research
06/24/2021

Where are we in semantic concept extraction for Spoken Language Understanding?

Spoken language understanding (SLU) topic has seen a lot of progress the...
research
10/27/2022

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

End-to-end spoken language understanding (SLU) systems are gaining popul...
research
05/29/2023

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

Spoken Language Understanding (SLU) is a task that aims to extract seman...
research
05/02/2022

Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages

We introduce Wav2Seq, the first self-supervised approach to pre-train bo...
research
07/17/2022

End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting

Spoken Language Understanding (SLU) is a core task in most human-machine...
