An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering

05/25/2020
by Chia-Chih Kuo, et al.

In a spoken multiple-choice question answering (SMCQA) task, given a passage, a question, and multiple choices, all in the form of speech, the machine must pick the correct choice to answer the question. Although the audio can carry useful cues for SMCQA, systems are usually developed with only the automatically transcribed text. Thanks to large-scale pre-trained language representation models, such as bidirectional encoder representations from transformers (BERT), systems built on transcribed text alone can still achieve a reasonable level of performance. However, previous studies have shown that acoustic-level statistics can offset the text inaccuracies introduced by automatic speech recognition and the representational inadequacy of word embedding generators, thereby making an SMCQA system more robust. Following this line of research, this study designs a BERT-based SMCQA framework that not only inherits the advantages of the contextualized language representations learned by BERT but also integrates complementary acoustic-level information distilled from the audio with the text-level information. Consequently, an audio-enriched BERT-based SMCQA framework is proposed. A series of experiments demonstrates remarkable improvements in accuracy over selected baselines and state-of-the-art systems on a published Chinese SMCQA dataset.
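The abstract does not specify how the acoustic and textual streams are combined, so the sketch below (PyTorch with Hugging Face transformers) is only one plausible illustration of such a framework: each (passage, question, choice) triple is encoded with BERT, an utterance-level acoustic feature vector is projected into the same space, and the concatenated representation is scored per choice. The class name AudioEnrichedChoiceScorer, the concatenation-based fusion, and the acoustic_dim parameter are assumptions for illustration, not details taken from the paper.

```python
# A minimal sketch (not the authors' implementation) of enriching BERT text
# features with utterance-level acoustic features for multiple-choice QA.
# The fusion-by-concatenation scheme and all layer sizes are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel


class AudioEnrichedChoiceScorer(nn.Module):
    def __init__(self, bert_name="bert-base-chinese", acoustic_dim=128):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Project acoustic features into the same space as BERT's [CLS] vector,
        # then score each (passage, question, choice) triple with a linear head.
        self.acoustic_proj = nn.Linear(acoustic_dim, hidden)
        self.scorer = nn.Linear(2 * hidden, 1)

    def forward(self, input_ids, attention_mask, token_type_ids, acoustic_feats):
        # input_ids / masks: (batch, num_choices, seq_len)
        # acoustic_feats: (batch, num_choices, acoustic_dim)
        b, c, t = input_ids.shape
        out = self.bert(
            input_ids=input_ids.view(b * c, t),
            attention_mask=attention_mask.view(b * c, t),
            token_type_ids=token_type_ids.view(b * c, t),
        )
        cls_vec = out.last_hidden_state[:, 0]                 # (b*c, hidden)
        audio_vec = self.acoustic_proj(acoustic_feats.view(b * c, -1))
        fused = torch.cat([cls_vec, audio_vec], dim=-1)       # text + audio cues
        logits = self.scorer(fused).view(b, c)                # one score per choice
        return logits                                         # argmax picks the answer
```

In this hedged setup, the choice with the highest logit is selected as the answer; the model would be trained with a standard cross-entropy loss over the choice scores.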
