Hierarchical Attention Model for Improved Machine Comprehension of Spoken Content

08/28/2016
by   Wei Fang, et al.

Multimedia or spoken content presents more attractive information than plain text content, but it is more difficult to display on a screen and to be selected by a user. As a result, accessing large collections of such content is much more difficult and time-consuming for humans than accessing text. It is therefore highly attractive to develop machines that can automatically understand spoken content and summarize the key information for humans to browse. In this endeavor, a new task of machine comprehension of spoken content was proposed recently. The initial goal was defined as the listening comprehension test of TOEFL, a challenging academic English examination for learners whose native language is not English. An Attention-based Multi-hop Recurrent Neural Network (AMRNN) architecture was also proposed for this task, which considered only the sequential relationship within the speech utterances. In this paper, we propose a new Hierarchical Attention Model (HAM), which constructs a multi-hop attention mechanism over tree-structured rather than sequential representations of the utterances. Improved comprehension performance that is robust with respect to ASR errors was obtained.

research
08/23/2016

Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine

Multimedia or spoken content presents more attractive information than p...
research
09/01/2017

Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks

Retrieving spoken content with spoken queries, or query-by-example spok...
research
04/01/2018

Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension

Reading comprehension has been widely studied. One of the most represent...
research
12/26/2016

Abstractive Headline Generation for Spoken Content by Attentive Recurrent Neural Networks with ASR Error Modeling

Headline generation for spoken content is important since spoken content...
research
10/10/2018

Structured Argument Extraction of Korean Question and Command

Intention identification and slot filling is a core issue in dialog mana...
research
07/09/2021

An Initial Investigation of Non-Native Spoken Question-Answering

Text-based machine comprehension (MC) systems have a wide-range of appli...
research
04/15/2018

Transcribing Lyrics From Commercial Song Audio: The First Step Towards Singing Content Processing

Spoken content processing (such as retrieval and browsing) is maturing, ...
