Segmenting Subtitles for Correcting ASR Segmentation Errors

04/16/2021
by David Wan, et al.

Typical ASR systems segment the input audio into utterances using purely acoustic information, so the resulting segments may not resemble the sentence-like units that conventional machine translation (MT) systems expect in Spoken Language Translation. In this work, we propose a model that corrects the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We use subtitles as a proxy dataset for this correction, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).
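As a rough illustration of the data setup the abstract describes, the sketch below turns subtitle sentences into a token sequence with per-token boundary tags (the targets a tagging model would predict) and then derives a synthetic "acoustic" segmentation by corrupting those boundaries. The abstract does not say which error modes are modeled, so the two used here (merging adjacent sentences and splitting inside a sentence) are assumptions, as are the function names and the probabilities `p_merge` and `p_split`.

```python
import random


def boundary_tags(sentences):
    """Flatten tokenized sentences; tag 1 on each sentence-final token, else 0."""
    tokens, tags = [], []
    for sent in sentences:
        for i, tok in enumerate(sent):
            tokens.append(tok)
            tags.append(1 if i == len(sent) - 1 else 0)
    return tokens, tags


def synthetic_acoustic_breaks(tags, p_merge=0.3, p_split=0.1, seed=0):
    """Corrupt gold boundaries to imitate acoustic segmentation (assumed error modes).

    p_merge: chance of dropping a true boundary (under-segmentation).
    p_split: chance of adding a spurious boundary (over-segmentation).
    """
    rng = random.Random(seed)
    noisy = []
    for t in tags:
        if t == 1:
            noisy.append(0 if rng.random() < p_merge else 1)
        else:
            noisy.append(1 if rng.random() < p_split else 0)
    if noisy:
        noisy[-1] = 1  # the final token always closes the last segment
    return noisy


def split_by_breaks(tokens, breaks):
    """Group tokens into segments, closing a segment after each break of 1."""
    segments, current = [], []
    for tok, brk in zip(tokens, breaks):
        current.append(tok)
        if brk:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


subtitles = [["where", "are", "you", "going"], ["to", "the", "market"]]
tokens, gold = boundary_tags(subtitles)
noisy = synthetic_acoustic_breaks(gold, p_merge=0.5, p_split=0.2, seed=1)
model_input = split_by_breaks(tokens, noisy)  # synthetic acoustic utterances
model_target = gold                           # sentence boundaries to recover
```

In this framing, a tagger would read the tokens of the synthetic utterances and predict the gold tags, and `split_by_breaks` applied to its predictions would yield the corrected segments to pass downstream.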


research
10/19/2020

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines

In this work, we focus on improving ASR output segmentation in the conte...
research
10/26/2022

Smart Speech Segmentation using Acousto-Linguistic Features with look-ahead

Segmentation for continuous Automatic Speech Recognition (ASR) has tradi...
research
07/31/2020

An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances

In this paper, we propose a sub-utterance unit selection framework to re...
research
09/03/2017

Disentangling ASR and MT Errors in Speech Translation

The main aim of this paper is to investigate automatic quality assessmen...
research
04/22/2022

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

Improving the performance of end-to-end ASR models on long utterances ra...
research
11/24/2015

Spoken Language Translation for Polish

Spoken language translation (SLT) is becoming more important in the incr...
research
02/22/2021

Creating a Universal Dependencies Treebank of Spoken Frisian-Dutch Code-switched Data

This paper explores the difficulties of annotating transcribed spoken Du...
