Sentence Embedder Guided Utterance Encoder (SEGUE) for Spoken Language Understanding

05/20/2023
by   Yi Xuan Tan, et al.

The pre-trained speech encoder wav2vec 2.0 performs very well on various spoken language understanding (SLU) tasks. However, on many tasks, it trails behind text encoders given textual input. To improve the understanding capability of SLU encoders, various studies have used knowledge distillation to transfer knowledge from natural language understanding (NLU) encoders. We use a very simple method: distilling from a textual sentence embedder directly into wav2vec 2.0 as a pre-training step, using paired audio-text datasets. We observe that this method can improve SLU task performance in fine-tuned settings, as well as in full-data and few-shot transfer on a frozen encoder. However, the model performs worse on certain tasks, which highlights both the strengths and the weaknesses of our approach.
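The distillation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the pooling or the loss, so this sketch assumes mean pooling over frame-level speech features and a mean-squared-error objective against the sentence embedding of the paired transcript. The function names `mean_pool` and `distillation_loss` are hypothetical, and plain Python lists stand in for the outputs of the speech encoder (student) and the sentence embedder (teacher).

```python
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level vectors into a single utterance-level vector.

    Assumption: mean pooling; the abstract does not state how frames are pooled.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def distillation_loss(
    student_frames: Sequence[Sequence[float]],
    teacher_embedding: Sequence[float],
) -> float:
    """Mean squared error between the pooled student output and the teacher.

    Assumption: MSE is used here for illustration only; the abstract does not
    name the distillation objective.
    """
    pooled = mean_pool(student_frames)
    if len(pooled) != len(teacher_embedding):
        raise ValueError("student and teacher dimensions must match")
    squared_errors = [(s - t) ** 2 for s, t in zip(pooled, teacher_embedding)]
    return sum(squared_errors) / len(squared_errors)


# Toy example: two frames of 2-dimensional "speech" features, and a
# 2-dimensional "sentence embedding" of the paired transcript.
frames = [[1.0, 2.0], [3.0, 4.0]]
teacher = [2.0, 5.0]
loss = distillation_loss(frames, teacher)
```

In a real setup the student frames would come from the speech encoder and the teacher vector from a frozen sentence embedder applied to the transcript; only the student would receive gradient updates from this loss.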

research
10/25/2020

Two-stage Textual Knowledge Distillation to Speech Encoder for Spoken Language Understanding

End-to-end approaches open a new way for more accurate and efficient spo...
research
02/11/2021

Speech-language Pre-training for End-to-end Spoken Language Understanding

End-to-end (E2E) spoken language understanding (SLU) can infer semantics...
research
05/04/2023

End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders

It is challenging to extract semantic meanings directly from audio signa...
research
10/05/2020

Semi-Supervised Speech-Language Joint Pre-Training for Spoken Language Understanding

Spoken language understanding (SLU) requires a model to analyze input ac...
research
10/23/2022

Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings

Inducing semantic representations directly from speech signals is a high...
research
01/18/2021

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

In this work, we explore joint energy-based model (EBM) training during ...
research
02/16/2023

E2E Spoken Entity Extraction for Virtual Agents

This paper reimagines some aspects of speech processing using speech enc...
