A Comparison of Online Automatic Speech Recognition Systems and the Nonverbal Responses to Unintelligible Speech

04/29/2019
by   Joshua Y. Kim, et al.

Automatic Speech Recognition (ASR) systems have proliferated in recent years, to the point that free platforms such as YouTube now provide speech recognition services. Given the wide selection of ASR systems, we contribute to the field of automatic speech recognition by comparing the relative performance of two sets of manual transcriptions and five sets of automatic transcriptions (Google Cloud, IBM Watson, Microsoft Azure, Trint, and YouTube) to help researchers select accurate transcription services. In addition, we identify nonverbal behaviors that are associated with unintelligible speech, as indicated by high word error rates. We show that manual transcriptions remain superior to current automatic transcriptions. Among the automatic transcription services, YouTube offers the most accurate transcriptions. For nonverbal behavioral involvement, we provide evidence that the variability of the listener's smile intensity is high when the speaker is clear and low when the speaker is unintelligible. These findings are derived from videoconferencing interactions between student doctors and simulated patients; we therefore contribute to both the ASR literature and the healthcare communication skills teaching community.
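The abstract uses word error rate (WER) as its indicator of unintelligible speech. As a rough illustration only, WER is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis transcript, divided by the number of reference words. The sketch below assumes that standard definition plus simple lowercasing and whitespace tokenization; the function name `word_error_rate` and the sample sentences are hypothetical, and the paper's own text normalization and scoring procedure may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: lowercasing and whitespace splitting stand in for
    whatever text normalization the paper actually applied.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences: one reference word ("a") is dropped.
    print(word_error_rate("the patient has a cough", "the patient has cough"))
```

Because insertions count as errors, WER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.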

research
03/02/2019

Speech Recognition with no speech or with noisy speech

The performance of automatic speech recognition systems (ASR) degrades in...
research
12/17/2021

JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification

In this paper, we construct a new Japanese speech corpus called "JTubeSp...
research
06/09/2021

Unsupervised Automatic Speech Recognition: A Review

Automatic Speech Recognition (ASR) systems can be trained to achieve rem...
research
12/13/2016

Evaluating Automatic Speech Recognition Systems in Comparison With Human Perception Results Using Distinctive Feature Measures

This paper describes methods for evaluating automatic speech recognition...
research
02/17/2022

'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube

Over the last few years, YouTube Kids has emerged as one of the highly c...
research
06/07/2023

A Study on the Reliability of Automatic Dysarthric Speech Assessments

Automating dysarthria assessments offers the opportunity to develop effe...
research
03/10/2023

Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

Automatic Speech Recognition (ASR) in medical contexts has the potential...
