MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

05/02/2020
by Jakob Drachmann Havtorn, et al.

We address the challenging and practical task of labeling questions in speech in real time during English-language telephone calls to emergency medical services, a task embedded in a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalities, or views, and learns jointly from streamed audio and its noisy transcription into text via automatic speech recognition. Our results show significant gains from jointly learning from the two modalities compared with text or audio alone, under adverse noise and a limited volume of training data. The results generalize to medical symptom detection, where we observe a similar pattern of improvements with multimodal learning.
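
To make the idea of labeling a sequence from two views concrete, the following is a minimal sketch and not the authors' model. It assumes that audio and text features have already been extracted and aligned to the same time steps, that the two views are fused by simple concatenation, and that each label gets an independent sigmoid score per time step. The function name `fuse_and_label`, the feature sizes, and the random weights are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fuse_and_label(audio_feats, text_feats, weights, bias):
    """Score every time step from concatenated audio and text features.

    Assumption: both modalities are already aligned to the same time steps.
    Returns one list of per-label probabilities for each time step.
    """
    if len(audio_feats) != len(text_feats):
        raise ValueError("modalities must cover the same number of time steps")
    probs = []
    for a, t in zip(audio_feats, text_feats):
        fused = list(a) + list(t)  # concatenation fusion (an assumption)
        # One sigmoid per label, so several labels can be active at one step.
        step = [
            sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
            for row, b in zip(weights, bias)
        ]
        probs.append(step)
    return probs


# Hypothetical sizes and random stand-in features, for illustration only.
T, AUDIO_DIM, TEXT_DIM, NUM_LABELS = 5, 4, 3, 2
rng = random.Random(0)
audio = [[rng.gauss(0, 1) for _ in range(AUDIO_DIM)] for _ in range(T)]
text = [[rng.gauss(0, 1) for _ in range(TEXT_DIM)] for _ in range(T)]
weights = [
    [rng.gauss(0, 0.1) for _ in range(AUDIO_DIM + TEXT_DIM)]
    for _ in range(NUM_LABELS)
]
bias = [0.0] * NUM_LABELS

probs = fuse_and_label(audio, text, weights, bias)
print(len(probs), len(probs[0]))
```

In a streaming setting, a function like this would be called on each new window of aligned features as it arrives; how the real system encodes, aligns, and fuses the two modalities is described in the full paper, not here.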


