Acoustic-to-Word Models with Conversational Context Information

05/21/2019
by Suyoun Kim, et al.

Conversational context information, higher-level knowledge that spans sentences, can help in recognizing long conversations. However, existing speech recognition models are typically built at the sentence level and thus may not capture important conversational context. Recent progress in end-to-end speech recognition makes it possible to integrate context with other available information (e.g., acoustic and linguistic resources) and to recognize words directly from speech. In this work, we present a direct acoustic-to-word, end-to-end speech recognition model that can use conversational context to better process long conversations. We evaluate the proposed approach on the Switchboard conversational speech corpus and show that our system outperforms a standard end-to-end speech recognition system.

research
08/07/2018

Dialog-context aware end-to-end speech recognition

Existing speech recognition systems are typically built at the sentence ...
research
06/27/2019

Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion

We present a novel conversational-context aware end-to-end speech recogn...
research
09/04/2023

SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Recently, excellent progress has been made in speech recognition. Howeve...
research
11/04/2018

Investigating context features hidden in End-to-End TTS

Recent studies have introduced end-to-end TTS, which integrates the prod...
research
05/03/2023

M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis

Conversational text-to-speech (TTS) aims to synthesize speech with prope...
research
08/29/2022

Turn-Taking Prediction for Natural Conversational Speech

While a streaming voice assistant system has been used in many applicati...
research
11/18/2020

Context-aware RNNLM Rescoring for Conversational Speech Recognition

Conversational speech recognition is regarded as a challenging task due ...
