Machine comprehension (MC) on text has improved substantially since the advent of large-scale self-supervised pre-trained language models such as BERT and GPT. Instead of learning MC tasks from scratch, these models are first pre-trained on a large unannotated corpus to learn self-supervised representations of general language and are then fine-tuned on the downstream MC dataset. This training scheme helps the models tackle MC tasks and achieve results comparable to human performance on the SQuAD datasets [22, 21].
Previous work indicated that MC on spoken content is much more difficult than on text, because speech recognition errors have a catastrophic impact on MC. On the other hand, end-to-end models for spoken language tasks, such as spoken language understanding (SLU) and speech-to-speech translation, are promising for avoiding the ASR error propagation problem, although for now the performance of most end-to-end models still falls short of that of the corresponding natural language task.
In this work, we propose SpeechBERT, a pre-trainable model of generic representations for speech and text tasks. By combining a pre-trained language model with a spoken audio encoder, our end-to-end model can circumvent the negative impact of cascading separately trained ASR and QA models. First, we pre-train SpeechBERT on both a text corpus and speech audio so that the model can extract useful semantic features from both speech and text content. When fine-tuning on the MC task, the whole model can then be jointly optimized without the error bottleneck caused by ASR. As the first work on end-to-end spoken question answering (SQA), our model achieves results close to the performance of cascaded ASR and QA models. Although room for improvement remains, this is a solid first step towards end-to-end SQA.
2 Related Works
2.1 Speech Segment Embedding
In a series of investigations, researchers have sought to extract semantic embeddings from speech feature segments given predefined segment boundaries. Speech2Vec used a sequence-to-sequence network over speech features to imitate the skip-gram or CBOW training of Word2Vec, helping the model extract more semantic features. An unsupervised segmentation method was also proposed to jointly learn to segment spoken words and extract speech embeddings. After learning speech segment embeddings, subsequent works [2, 5] tried to map them to regular word embeddings, as in bilingual embedding mapping, in either supervised (using paired seeds) or unsupervised (using generative adversarial networks) fashion. Promising as these works sound, the mapping quality so far falls well below that of the bilingual case: the accuracy of the mapping between speech and text embeddings is at best around 25%, even with the aid of oracle boundaries. This indicates the difficulty of disentangling semantic information from noisy speech signals without a supervised ASR model.
We also use the speech segment embedding concept in this work; the method is introduced in Section 3.2.2. However, because the baseline methods for SQA tasks already include a supervised ASR model as a front end, we do not impractically pursue a fully unsupervised method for speech content extraction. Instead, in the pre-training stage we use labels indicating exactly which word each speech segment represents.
2.2 End-to-end Model for Spoken Language Tasks
Conventional methods for spoken language tasks need ASR as a front-end module that distills the semantic information in the speech signal into plain text. The ASR output is then treated as natural language data and fed into regular NLP models for downstream tasks. An end-to-end model instead aims to tackle the whole task from speech-level features without a cascaded ASR model. End-to-end models have several benefits: 1) they directly optimize the metric of the final task, instead of optimizing separate targets for the ASR and NLP models; 2) they avoid the error propagation caused by the ASR bottleneck; 3) exposing speech information directly to the downstream model can help it capture useful information that does not appear in text transcripts.
Regular spoken language understanding (SLU) tasks such as intent classification and slot filling have recently been explored with end-to-end methods [1, 10, 3, 24] more widely than spoken question answering (SQA) tasks. However, the two tasks differ in difficulty. SLU is a sentence-level classification problem that fills slots from pre-defined classes by extracting local information from a short utterance; once the literal meaning of the utterance is extracted, the SLU model is not far from making a correct classification. Compared to SLU, the inputs of the SQA task are much longer spoken paragraphs. Besides understanding the literal meaning, the SQA model must first organize global information, because sophisticated reasoning over the paragraph is required to answer the questions. Fine-grained information is also needed to predict the exact position of the answer span within a very long context; we therefore solve it with pointer networks rather than classification models. These are the reasons why SQA is a harder problem than SLU.
Based on the BERT model, we extend the BERT architecture with a speech segment encoder, which serves as an alternative to the ASR model. Instead of recognizing words, the speech segment encoder aims to directly find good speech representations that can be fed into the BERT model, making it possible to process text and speech in a shared BERT model. The model architecture and training process are illustrated in Figure 1.
3.1 BERT for Text Pre-training
BERT is a multi-layer Transformer model. For the text part, given a token sequence $T = (t_1, \dots, t_n)$, we represent the tokens with embedding vectors $E_{tok} = (e_1, \dots, e_n)$. We then add positional embeddings and sentence segment embeddings to obtain the input representation $E = E_{tok} + E_{pos} + E_{seg}$, which is fed into the multi-layer Transformer. At the output layer of BERT, the output features are used for two tasks: masked language modeling (MLM) and next sentence prediction (NSP). MLM randomly replaces 15% of the vectors in $E$ with a special mask token vector and predicts the masked tokens at the corresponding positions of the output features. NSP predicts whether the tokens with different sentence segment embeddings come from successive sentences. However, some recent studies [14, 13, 30, 16] have indicated that NSP does not improve performance but instead hurts it, so we removed this part and trained only MLM in our setting.
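The 15% masking rule described above can be sketched as follows. This is a simplified illustration, not the exact implementation: the mask id, the ignore index, and the uniform replace-with-mask policy are assumptions (BERT's full recipe also sometimes keeps or randomizes selected tokens).

```python
import random

MASK_ID = 103          # hypothetical id of the [MASK] token
MASK_RATE = 0.15       # fraction of positions replaced for MLM

def mask_tokens(token_ids, rng):
    """Replace ~15% of positions with MASK_ID; return masked ids and targets.

    Targets hold the original id at each masked position and -1 (ignored)
    elsewhere, so the MLM loss is computed only on masked positions.
    """
    masked, targets = [], []
    for tid in token_ids:
        if rng.random() < MASK_RATE:
            masked.append(MASK_ID)
            targets.append(tid)      # predict the original token here
        else:
            masked.append(tid)
            targets.append(-1)       # position not scored
    return masked, targets

rng = random.Random(0)
ids = [5, 17, 42, 7, 99, 13, 28, 64]
masked, targets = mask_tokens(ids, rng)
```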
3.2 Speech Segment Encoder Pre-training
For the speech part, we have a speech feature sequence $X = (x_1, \dots, x_m)$, where $m$ denotes the number of acoustic feature frames. The speech feature sequence is segmented into audio segments as described in Section 3.2.1. Given the word boundaries, we encode each segment to obtain a speech version of the word vectors $S = (s_1, \dots, s_k)$, where $k$ denotes the number of segments. The encoding method is described in Section 3.2.2.
3.2.1 Speech Segmentation
Segmentation for Training Stage: To effectively extract semantic features from speech signals, we segment the Mel-frequency cepstral coefficient (MFCC) sequences according to predefined boundaries obtained by forced alignment with an off-the-shelf ASR model.
Segmentation for Testing Stage: At the testing stage, we cannot access the ground-truth transcripts to run forced alignment, so we use the ASR model to obtain a pseudo-label word sequence and run forced alignment on it. Even when the ASR results contain wrong words, the boundaries found by forced alignment usually correspond to some other true words.
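Given the frame-level boundaries from forced alignment, the slicing step itself is straightforward; a minimal sketch follows, where the representation of boundaries as (start, end) frame-index pairs is an assumption.

```python
def segment_features(mfcc, boundaries):
    """Slice a frame-level feature sequence into word segments.

    mfcc: list of per-frame feature vectors (length = number of frames).
    boundaries: list of (start_frame, end_frame) pairs from forced
    alignment, end exclusive; one pair per aligned word.
    """
    return [mfcc[s:e] for s, e in boundaries]

# Toy example: 10 frames of 3-dim features, three aligned "words".
frames = [[float(i)] * 3 for i in range(10)]
segments = segment_features(frames, [(0, 3), (3, 7), (7, 10)])
```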
3.2.2 Phonetic-Semantic Joint Embedding
After obtaining the speech feature segments, we use an RNN sequence-to-sequence autoencoder to encode each segment into a phonetic embedding that captures the phonetic information of the acoustic word. The autoencoder training procedure makes audio segments with similar phonetic content cluster together. However, because these embeddings must serve as inputs to the BERT model, simply fitting pure phonetic information without considering the semantic relations between acoustic words is not desirable. Hence, for words that are not out-of-vocabulary, we use the labels of the acoustic words to retrieve primary word vectors from the word embedding layer of BERT. For each paired audio segment and word, we add a loss term computed as the L1 distance between the two paired vectors. In this way, the autoencoder learns to fit the BERT input distribution of semantic word embeddings while keeping enough acoustic information to reconstruct the original MFCC features. This regularization helps the model learn a joint embedding space for both text and speech, extracting semantic-level features directly from speech.
To make the concept clear, we list the loss terms to optimize. Given an audio segment $x = (x_1, \dots, x_T)$ as input features, the RNN encoder encodes it as a vector $z$. The RNN decoder then maps $z$ to the output sequence $y = (y_1, \dots, y_T)$. The encoder-decoder network is trained to minimize the reconstruction error:

$$L_{recon} = \sum_{t=1}^{T} \lVert x_t - y_t \rVert^2.$$

At the same time, the vector $z$ is constrained by an L1-distance loss term:

$$L_{emb} = \lVert z - \mathrm{Emb}(w) \rVert_1,$$

where $w$ is the token label behind the audio segment and $\mathrm{Emb}(\cdot)$ is the embedding layer of BERT.
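A minimal NumPy sketch of the two loss terms above, assuming the encoder output z, the decoder reconstruction y, and the BERT embedding of the word label are already computed; the function and variable names here are hypothetical.

```python
import numpy as np

def joint_embedding_losses(x, y, z, bert_word_emb):
    """Compute the two training objectives for one audio segment.

    x: (T, d) input MFCC segment; y: (T, d) decoder reconstruction;
    z: (h,) encoder output vector; bert_word_emb: (h,) BERT embedding
    of the word label behind the segment.
    """
    # Reconstruction error: summed squared L2 distance over frames.
    l_recon = float(np.sum((x - y) ** 2))
    # L1 constraint pulling z toward the BERT word embedding.
    l_emb = float(np.sum(np.abs(z - bert_word_emb)))
    return l_recon, l_emb

x = np.ones((4, 3))
y = np.zeros((4, 3))
z = np.array([1.0, -1.0])
w = np.array([0.5, 0.5])
l_recon, l_emb = joint_embedding_losses(x, y, z, w)
# l_recon = 12.0 (12 elements, each off by 1); l_emb = 0.5 + 1.5 = 2.0
```

In training the total objective would be a weighted sum of the two terms, so the encoder stays phonetically faithful while being pulled into BERT's embedding space.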
3.3 Joint MLM Pre-training on Speech and Text Corpora
After the speech segment encoder is trained, we pre-train SpeechBERT on both the text and speech corpora, then fine-tune it on the downstream QA task.
To denoise the audio representations and to allow BERT to be fed not only discrete text embeddings but also continuous speech embeddings, we jointly optimize the MLM loss for both speech and text. The training target of the text MLM has been described in Section 3.1. For the speech part, after obtaining the representations of the audio segments, we likewise randomly replace 15% of the vectors with the mask token vector, as in Section 3.1. In our supervised setting, we can simply predict which tokens lie behind the masked speech segments, as in MLM.
Since the performance reported in another end-to-end SLU work did not improve much when the speech feature extraction layers were unfrozen, we freeze the speech encoder network during MLM training to speed up the whole procedure. The word embedding layer, however, is left unfrozen so that it can adapt to coexist with the speech embeddings.
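The speech-side masking can be sketched as below. The 15% rate and the use of the [MASK] token's embedding as the replacement vector follow the description above, while the function and variable names are hypothetical.

```python
import random

def mask_segment_embeddings(seg_embs, mask_emb, rng, rate=0.15):
    """Replace ~15% of speech segment embeddings with the mask embedding.

    seg_embs: list of continuous segment vectors from the (frozen)
    speech encoder; mask_emb: the [MASK] token's embedding vector.
    Returns the masked sequence and the indices whose word labels the
    model must predict (supervised setting).
    """
    out, masked_positions = [], []
    for i, vec in enumerate(seg_embs):
        if rng.random() < rate:
            out.append(mask_emb)          # continuous vector replaced
            masked_positions.append(i)    # predict this segment's word
        else:
            out.append(vec)
    return out, masked_positions

rng = random.Random(1)
embs = [[float(i)] * 4 for i in range(20)]
mask = [0.0] * 4
masked, pos = mask_segment_embeddings(embs, mask, rng)
```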
3.4 Fine-tuning on Question Answering
After MLM pre-training, the model is fine-tuned on the downstream QA task to minimize the loss for predicting the correct start/end positions of the answer span, as originally proposed for BERT. Introducing a start vector $S$ and an end vector $E$, we compute the dot product of $S$ with each final hidden vector $T_i$ from BERT. The dot products are softmax-normalized over all words in the sentence to obtain the probability of word $i$ being the start position. End-position prediction follows the same procedure with $E$.
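The start-position probability computation can be sketched in plain Python as below (end positions use the end vector E identically); this is an illustration of the standard BERT span head, not the authors' exact code.

```python
import math

def span_start_probs(hidden, start_vec):
    """Softmax over dot products of a start vector with each hidden state.

    hidden: list of final hidden vectors T_i from BERT;
    start_vec: the learned start vector S.
    """
    scores = [sum(s * h for s, h in zip(start_vec, t)) for t in hidden]
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three 2-dim hidden vectors.
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
probs = span_start_probs(hidden, [1.0, 1.0])  # scores: 1, 1, 4
```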
4 Experimental Setup
4.1 Dataset

We trained our SpeechBERT model on the Spoken SQuAD dataset, which contains all paragraphs as audio files while all questions remain in plain text, as in the original SQuAD dataset. It also provides SQuAD-format ASR transcripts, with 37,111 question-answer pairs in the training set and 5,351 in the testing set. The dataset is smaller than the official SQuAD dataset because Spoken SQuAD removed questions whose answers cannot be found in the ASR transcripts. The audio word boundaries are obtained by forced alignment with Kaldi, using the ground-truth text for the training set and the ASR transcripts for the testing set. To fine-tune on Spoken SQuAD, we mapped the correct start/end points of each answer span from the paragraph text in the original SQuAD training set to audio segments. The answer segments in the testing set can likewise be located using the ASR transcripts provided by Spoken SQuAD.
4.2 Model Settings
4.2.1 Speech Segment Encoder
For speech segment encoder pre-training, we used a bidirectional LSTM as the encoder and a unidirectional LSTM as the decoder, both with input size 39 (MFCC dimension) and hidden size 768 (BERT embedding dimension). A two-layer fully-connected network is added on top of the encoder output so that the encoder can transform the encoded information into the BERT embedding space. We trained this encoder-decoder network directly on the audio from the Spoken SQuAD training set.
4.2.2 BERT Model
We used a PyTorch implementation of BERT (https://github.com/huggingface/transformers) to build our BERT model in the 12-layer bert-base-uncased setting. For compatibility between text and speech embeddings, we did not use the WordPiece tokenizer to process the text. Instead, we randomly initialized a new embedding layer with a vocabulary counted from the dataset. The official pre-trained weights are loaded into our BERT model for all weights other than the embedding layer. To prepare the new embedding layer before joint text-and-speech training, we trained the MLM task on the text part of the Spoken SQuAD training set for three epochs. For joint text-and-speech training, we directly used the Spoken SQuAD training set, feeding both the text part and the audio part into the BERT model.
5 Experimental Results
Table 1: EM/F1 scores on the Spoken SQuAD testing set.

| Trained on plain text | GT text EM | GT text F1 | ASR trans. EM | ASR trans. F1 |
| Mnemonic Reader | 64.00 | 73.35 | 40.36 | 52.87 |
| BERT w/o WordPiece | 72.24 | 82.59 | 48.71 | 66.27 |

| Trained on audio | EM | F1 |
| w/ GT segment, w/o MLM | 47.90 | 61.97 |
The comparison between our end-to-end model and ASR + QA models is shown in Table 1. The ASR + QA models were trained and tested on ASR transcripts from the Spoken SQuAD dataset, while our SpeechBERT was trained and tested on audio files. The F1 and Exact Match (EM) scores of our model are competitive with most of the previous methods, although it still does not outperform BERT trained on ASR transcripts. Considering the difficulty faced by an end-to-end SQA model, which must handle noisy speech features and extract semantic information within a single model, the results are promising enough to show the potential of end-to-end approaches.
5.1 Ablation Studies
5.1.1 Improvement by LM pre-training
To evaluate the contribution of cross-modal language model pre-training, we fine-tuned on SQA directly from the text pre-trained weights, without the speech-text joint MLM pre-training. As expected, the F1 and EM scores drop by about 1.7 points, showing the benefit of joint MLM pre-training before fine-tuning.
5.1.2 Quality of Segmentation
We wondered whether performance is limited by the quality of the word boundaries found by forced alignment on ASR transcripts, which have a 22.73% WER as reported for Spoken SQuAD. To see whether segmentation quality is the performance bottleneck, we tested our model on the Spoken SQuAD testing set using the ground-truth-text forced alignment used at training time, which should be more accurate than forced alignment on ASR transcripts. However, performance improves by only 1.3 to 1.5 points in F1 and EM. This shows that boundary quality is not the main cause of the lower performance.
5.2 Error Analysis
5.2.1 Out-of-Vocabulary Words

Although out-of-vocabulary (OOV) words are not an issue for spoken audio, we found that OOVs in the text questions hurt performance. As mentioned above, to let SpeechBERT process cross-modal input consistently in the same units for both speech and text, we discarded the WordPiece tokenizer and used the same vocabulary set as our SpeechBERT model. However, this modification prevents the model from using WordPiece to handle named entities, which is crucial for answering correctly. To test this conjecture, we trained a BERT model with the same settings on the transcripts of the Spoken SQuAD training set, but with our new vocabulary set. Consistent with our hypothesis, the F1 and EM scores dropped by 2 to 7 points on both the Spoken SQuAD testing set and the SQuAD dev set.
5.2.2 Comparison under Different WER
Although our SpeechBERT model does not yet outperform BERT trained on ASR transcripts, we can investigate whether SpeechBERT beats BERT on questions with a higher recognition word error rate (WER). We split the questions into groups by WER and examined each group. We define the "EM score ratio" as the number of questions with EM = 0 divided by the number of questions with EM = 1; a higher ratio means that more questions are answered incorrectly relative to those answered correctly. We computed this ratio for both SpeechBERT and BERT; the results are shown in Figure 3. BERT clearly tends to have a lower ratio when WER is low and a higher ratio when WER is high, while SpeechBERT does not show this tendency and can still correctly answer questions with extremely high WER.
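The "EM score ratio" per WER group, as defined above, can be computed as in this sketch; the bucketing scheme (upper-edge buckets) is an assumption, since the paper does not specify its group boundaries.

```python
def em_ratio_by_wer(examples, bucket_edges):
    """Group questions by WER bucket and compute #(EM=0) / #(EM=1).

    examples: list of (wer, em) pairs with em in {0, 1};
    bucket_edges: ascending upper edges of WER buckets, e.g. [0.1, 0.3, 1.0].
    Returns one ratio per bucket (None if the bucket has no EM=1 question).
    """
    counts = [[0, 0] for _ in bucket_edges]   # [EM=0, EM=1] per bucket
    for wer, em in examples:
        for b, edge in enumerate(bucket_edges):
            if wer <= edge:
                counts[b][em] += 1
                break
    return [c0 / c1 if c1 else None for c0, c1 in counts]

# Toy data: (WER, EM) pairs split into two buckets.
data = [(0.05, 1), (0.05, 1), (0.08, 0), (0.5, 0), (0.5, 0), (0.6, 1)]
ratios = em_ratio_by_wer(data, [0.1, 1.0])
# low-WER bucket: 1/2 = 0.5; high-WER bucket: 2/1 = 2.0
```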
6 Discussion and Future Work
Though we achieved reasonable performance on the SQA task, there is still much room for future research. The first challenge is the use of word boundaries. Although it is reasonable to use an off-the-shelf ASR model as a segmenter in a supervised setting, it would be far more desirable if the boundaries could be provided by the end-to-end model itself. In conventional SLU tasks, it is possible to extract information from frame-level speech features for classification tasks such as slot filling. However, using frame-level speech features is an enormous challenge for an SQA model, which needs a pointer network to predict positions directly over very long frame sequences. In this work, we chose an easier setting that focuses on embedding learning and language model pre-training to solve SQA with pre-computed word boundaries. One possible way to integrate segmentation into our approach is simply to divide the audio by voice intensity. Alternatively, previous work on simultaneous speech translation has proposed algorithms to learn segmentation strategies that directly maximize the performance of the machine translation system. Joint learning of segmentation and audio embeddings that mutually enhance each other through reinforcement learning is another promising approach. Such methods could be adapted to the text-and-speech cross-modal language model pre-training of our work in the future.
The second goal for future research is cross-modal language model pre-training with few labels for the speech corpus. While paired data was used in our pre-training stage, semi-supervised or unsupervised methods could leverage much larger unpaired corpora.
7 Conclusions

In this work, we proposed an end-to-end model for spoken question answering. Our model achieves results close to the performance of cascaded ASR and QA models. It is a stepping stone towards solving QA problems by understanding content directly from speech.
- (2017) Reading Wikipedia to answer open-domain questions. arXiv preprint arXiv:1704.00051.
- (2018) Almost-unsupervised speech recognition with close-to-zero resource based on phonetic structures learned from very small unpaired speech and text data. CoRR abs/1810.12566.
- (2018) Spoken language understanding without speech recognition. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6189–6193.
- (2018) Speech2Vec: a sequence-to-sequence framework for learning word embeddings from speech. CoRR abs/1803.08976.
- (2018) Unsupervised cross-modal alignment of speech and text embedding spaces. CoRR abs/1805.07467.
- (2017) Word translation without parallel data. arXiv preprint arXiv:1710.04087.
- (2013) Distributed representations of words and phrases and their compositionality. In NIPS.
- (2019) BERT: pre-training of deep bidirectional transformers for language understanding.
- (2014) Generative adversarial networks.
- (2018) From audio to semantics: approaches to end-to-end spoken language understanding. In 2018 IEEE Spoken Language Technology Workshop (SLT), pp. 720–726.
- (2017) Reinforced mnemonic reader for machine comprehension. CoRR abs/1705.02798.
- (2017) FusionNet: fusing via fully-aware attention with application to machine comprehension. arXiv preprint arXiv:1711.07341.
- (2019) SpanBERT: improving pre-training by representing and predicting spans. CoRR abs/1907.10529.
- (2019) Cross-lingual language model pretraining. CoRR abs/1901.07291.
- (2018) Spoken SQuAD: a study of mitigating the impact of speech recognition errors on listening comprehension. CoRR abs/1804.00320.
- (2019) RoBERTa: a robustly optimized BERT pretraining approach. CoRR abs/1907.11692.
- (2019) Speech model pre-training for end-to-end spoken language understanding.
- (2014) Optimizing segmentation strategies for simultaneous speech translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 551–556.
- (2011) The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding.
- (2018) Improving language understanding by generative pre-training.
- (2018) Know what you don't know: unanswerable questions for SQuAD. CoRR abs/1806.03822.
- (2016) SQuAD: 100,000+ questions for machine comprehension of text. CoRR abs/1606.05250.
- (2016) Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603.
- (2018) Towards end-to-end spoken language understanding. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5754–5758.
- (2017) Attention is all you need. CoRR abs/1706.03762.
- (2015) Pointer networks. arXiv e-prints.
- (2017) Gated self-matching networks for reading comprehension and question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 189–198.
- (2018) Segmental audio word2vec: representing utterances as sequences of vectors with applications in spoken term detection. CoRR abs/1808.02228.
- (2016) Google's neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
- (2019) XLNet: generalized autoregressive pretraining for language understanding. CoRR abs/1906.08237.