Acoustically-Driven Phoneme Removal That Preserves Vocal Affect Cues

10/26/2022
by Camille Noufi, et al.

In this paper, we propose a method for removing linguistic information from speech in order to isolate paralinguistic indicators of affect. The immediate utility of this method lies in clinical tests of sensitivity to vocal affect that are not confounded by language, which is impaired in a variety of clinical populations. The method is based on simultaneous recordings of speech audio and electroglottographic (EGG) signals. The speech audio signal is used to estimate the average vocal tract filter response and the amplitude envelope. The EGG signal supplies a direct correlate of voice source activity that is largely independent of phonetic articulation. These signals are combined into a third signal designed to capture as much paralinguistic information from the vocal production system as possible, maximizing the retention of bioacoustic cues to affect while eliminating phonetic cues to verbal meaning. To evaluate the success of this method, we studied the perception of corresponding speech audio and transformed EGG signals in an affect rating experiment with online listeners. The results show a high degree of similarity in the perceived affect of matched signals, indicating that our method is effective.
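The pipeline described in the abstract can be sketched roughly as follows. This is a hypothetical illustration under stated assumptions, not the authors' implementation: the function names are invented, and the choices of LPC for the average vocal tract filter, the Hilbert transform for the amplitude envelope, and simple multiplicative envelope reimposition are my assumptions about plausible instantiations of the steps the abstract names.

```python
import numpy as np
from scipy.signal import hilbert, lfilter

def lpc(x, order):
    """All-pole model of the average spectral envelope, via the
    autocorrelation method and the Levinson-Durbin recursion."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    r = r.astype(float)
    r[0] *= 1.0 + 1e-4                 # diagonal loading keeps the filter stable
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err                  # reflection coefficient
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a

def remove_phonemes(speech, egg, order=16):
    """Hypothetical sketch: drive the utterance-averaged vocal tract
    filter with the EGG signal (a phoneme-independent voice-source
    correlate), then reimpose the speech amplitude envelope, so that
    source and prosody cues survive while time-varying phonetic
    articulation is discarded."""
    a = lpc(speech, order)                   # average vocal tract filter
    env = np.abs(hilbert(speech))            # amplitude envelope of the speech
    shaped = lfilter([1.0], a, egg)          # EGG source through average filter
    shaped /= np.abs(shaped).max() + 1e-12   # normalize before enveloping
    return shaped * env
```

Because a single filter is estimated over the whole utterance, the moment-to-moment formant movements that encode phonemes are averaged away, while the envelope and the EGG-derived source retain loudness, voicing, and pitch dynamics.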
