Transcribing Against Time

09/15/2017
by Matthias Sperber, et al.

We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion. This is done by specifying a fixed time budget, and then automatically choosing location and size of segments for correction such that the number of corrected errors is maximized. The core components, as suggested by previous research [1], are a utility model that estimates the number of errors in a particular segment, and a cost model that estimates annotation effort for the segment. In this work we propose a dynamic updating framework that allows for the training of cost models during the ongoing transcription process. This removes the need for transcriber enrollment prior to the actual transcription, and improves correction efficiency by allowing highly transcriber-adaptive cost modeling. We first confirm and analyze the improvements afforded by this method in a simulated study. We then conduct a realistic user study, observing efficiency improvements of 15 [...] deviated most strongly from our initial, transcriber-agnostic cost model. Moreover, we find that our updating framework can capture dynamically changing factors, such as transcriber fatigue and topic familiarity, which we observe to have a large influence on the transcriber's working behavior.
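The budgeted selection described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the candidate segments are already fixed and non-overlapping (the paper also chooses their location and size), that the utility model has produced an expected error count per segment, and that the cost model has produced a correction time in whole seconds. Under those assumptions the problem reduces to a standard 0/1 knapsack. The name `select_segments` and all numbers are hypothetical.

```python
from typing import List, Tuple


def select_segments(
    utilities: List[float], costs: List[int], budget: int
) -> Tuple[float, List[int]]:
    """Pick segments that maximize total estimated errors corrected
    while keeping total estimated correction time within the budget.

    utilities[i]: estimated number of errors in segment i (hypothetical
                  utility-model output)
    costs[i]:     estimated correction time in whole seconds (hypothetical
                  cost-model output, non-negative integer)
    budget:       available time in seconds
    """
    n = len(utilities)
    # best[i][b] = best total utility using the first i segments within budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        u, c = utilities[i - 1], costs[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip segment i-1
            if c <= b and best[i - 1][b - c] + u > best[i][b]:
                best[i][b] = best[i - 1][b - c] + u  # option 2: correct it

    # Walk the table backwards to recover which segments were selected.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Illustrative values only: four segments and a 60-second budget.
total, picked = select_segments([3.0, 1.0, 4.0, 2.0], [30, 10, 45, 20], 60)
print(total, picked)
```

In a dynamic-updating setting such as the one the abstract proposes, the `costs` values would presumably be re-estimated as timing observations from the transcriber arrive, and the selection re-run on the remaining budget; that loop is left out of this sketch.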


