DeepAI
Log In Sign Up

Segmenting Subtitles for Correcting ASR Segmentation Errors

04/16/2021
by   David Wan, et al.
0

Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).

READ FULL TEXT

page 1

page 2

page 3

page 4

10/19/2020

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

In this work, we focus on improving ASR output segmentation in the conte...
10/26/2022

Smart Speech Segmentation using Acousto-Linguistic Features with look-ahead

Segmentation for continuous Automatic Speech Recognition (ASR) has tradi...
07/31/2020

An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances

In this paper, we propose a sub-utterance unit selection framework to re...
09/03/2017

Disentangling ASR and MT Errors in Speech Translation

The main aim of this paper is to investigate automatic quality assessmen...
04/22/2022

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

Improving the performance of end-to-end ASR models on long utterances ra...
11/24/2015

Spoken Language Translation for Polish

Spoken language translation (SLT) is becoming more important in the incr...
05/30/2020

Dynamic Masking for Improved Stability in Spoken Language Translation

For spoken language translation (SLT) in live scenarios such as conferen...