Look at Me When I Talk to You: A Video Dataset to Enable Voice Assistants to Recognize Errors

04/14/2021
by Andrea Cuadra, et al.

People interacting with voice assistants are often frustrated by the assistants' frequent errors and their inability to respond to backchannel cues. We introduce an open-source video dataset of 21 participants' interactions with a voice assistant, and explore the possibility of using this dataset to enable automatic error recognition to inform self-repair. The dataset includes clipped and labeled videos of participants' faces, recorded from the smart speaker's perspective during free-form interactions with the voice assistant. To validate our dataset, we emulated a machine learning classifier by asking crowdsourced workers to recognize voice assistant errors from soundless video clips of participants' reactions. We found trends suggesting that it is possible to determine the voice assistant's performance from a participant's facial reaction alone. This work posits elicited datasets of interactive responses as a key step towards improving error recognition to support repair in voice assistants across a wide variety of applications.


