Feasibility of Post-Editing Speech Transcriptions with a Mismatched Crowd

09/07/2016
by Purushotam Radadia, et al.

Manually correcting a speech transcription can involve selecting among several plausible transcriptions. Recent work has shown that it is feasible to employ a mismatched crowd, that is, workers unfamiliar with the language being transcribed, for speech transcription. However, it has yet to be established whether a mismatched worker has sufficiently fine-grained speech perception to choose among the phonetically proximate options that are likely to be generated from the trellis of an ASRU. We therefore consider five languages: Arabic, German, Hindi, Russian and Spanish. For each language, we generate synthetic, phonetically proximate options that emulate post-editing scenarios of varying difficulty. We consistently observe a non-trivial ability of the crowd to choose among these fine-grained options.


