Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts

05/10/2022
by Emma Hughson, et al.

Adapting one's voice to different ambient environments and social interactions is essential to human social interaction. In robotics, the ability to recognize speech in noisy and quiet environments has received significant attention, but the use of ambient cues in producing social speech features has been little explored. Our research aims to modify a robot's speech to maximize acceptability in various social and acoustic contexts, starting with a use case for service robots in different restaurants. We created an original dataset, collected over Zoom, of participants conversing in scripted and unscripted tasks under seven different ambient sounds and background images. Voice conversion methods, in addition to altered Text-to-Speech that matched ambient-specific data, were used for speech synthesis tasks. We conducted a subjective perception study that showed humans prefer synthetic speech that matches ambience and social context, ultimately preferring more human-like voices. This work provides three solutions to ambient and socially appropriate synthetic voices: (1) a novel protocol to collect real contextual audio voice data, (2) tools and directions to manipulate robot speech for appropriate social and ambient-specific interactions, and (3) insight into voice conversion's role in flexibly altering robot speech to match different ambient environments.
