Evaluating the Usability of Automatically Generated Captions for People who are Deaf or Hard of Hearing

12/06/2017
by Sushant Kafle, et al.

The accuracy of Automated Speech Recognition (ASR) technology has improved, but it is still imperfect in many settings. Researchers who evaluate ASR performance often focus on improving the Word Error Rate (WER) metric, but WER has been found to have little correlation with human-subject performance in many applications. We propose a new captioning-focused evaluation metric that better predicts the impact of ASR errors on the usability of automatically generated captions for people who are Deaf or Hard of Hearing (DHH). Through a user study with 30 DHH users, we compared our new metric with the traditional WER metric on a caption usability evaluation task. In a side-by-side comparison of pairs of ASR text output (with identical WER), the texts preferred by our new metric were also preferred by DHH participants. Further, our metric had a significantly higher correlation with DHH participants' subjective scores on the usability of a caption than the WER metric did. This new metric could be used to select ASR systems for captioning applications, and it may be a better metric for ASR researchers to consider when optimizing ASR systems.
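
To make the "identical WER" comparison concrete, the following is a minimal sketch of the standard WER computation (word-level edit distance divided by reference length). It is not the metric proposed in the paper, which the abstract does not specify. The function name `word_error_rate` and the example sentences are assumptions made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences (not taken from the study).
reference = "the meeting is moved to friday at noon"
hyp_a = "a meeting is moved to friday at noon"    # error on a function word
hyp_b = "the meeting is moved to monday at noon"  # error on a content word

print(word_error_rate(reference, hyp_a))
print(word_error_rate(reference, hyp_b))
```

Both hypotheses contain exactly one substitution against the same eight-word reference, so WER scores them equally, even though a reader relying on the caption would likely be misled far more by the second one. This is the kind of pair that WER cannot distinguish.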


research
06/03/2021

Semantic-WER: A Unified Metric for the Evaluation of ASR Transcript for End Usability

Recent advances in supervised, semi-supervised and self-supervised deep ...
research
01/27/2021

See-Through Captions: Real-Time Captioning on Transparent Display for Deaf and Hard-of-Hearing People

Real-time captioning is a useful technique for deaf and hard-of-hearing ...
research
10/11/2021

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric

Measuring automatic speech recognition (ASR) system quality is critical ...
research
01/29/2018

A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts

Motivated by a project to create a system for people who are deaf or har...
research
12/16/2022

BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric

End-to-End speech-to-speech translation (S2ST) is generally evaluated wi...
research
06/02/2020

An ASR Guided Speech Intelligibility Measure for TTS Model Selection

The perceptual quality of neural text-to-speech (TTS) is highly dependen...
research
05/20/2021

Quantitative Physical Ergonomics Assessment of Teleoperation Interfaces

Human factors and ergonomics are the essential constituents of teleopera...
