Using Deepfake Technologies for Word Emphasis Detection

05/12/2023
by Eran Kaufman, et al.

In this work, we consider the task of automated emphasis detection in spoken language. The problem is challenging because emphasis is affected by the particularities of the subject's speech, such as accent, dialect, or voice. To address this task, we propose to use deepfake technology to produce emphasis-devoid speech for the same speaker. This requires extracting the text of the spoken utterance and then using a voice sample from the speaker to synthesize an emphasis-devoid rendition of that text. By comparing the generated speech with the original recording, we are able to isolate patterns of emphasis, which are then relatively easy to detect.
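
The comparison step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features or decision rule are used, so the per-word features (duration, energy, pitch), the log-ratio score, the median normalization, the threshold value, and the function names `emphasis_scores` and `emphasized_words` are all assumptions made for illustration. The sketch also assumes that the original recording and the synthesized emphasis-devoid speech have already been aligned word by word.

```python
import math
from statistics import median


def emphasis_scores(original, neutral):
    """Score each word by how far the original exceeds the neutral rendition.

    Both arguments are lists of (duration_s, energy, pitch_hz) tuples, one
    tuple per word, aligned one-to-one. The feature set is an assumption
    made for this sketch, not something stated in the abstract.
    """
    if len(original) != len(neutral):
        raise ValueError("word sequences must be aligned one-to-one")
    if not original:
        return []

    # Log-ratio per feature: positive means the speaker exceeded the baseline.
    ratios = [
        [math.log(o / n) for o, n in zip(orig_word, neut_word)]
        for orig_word, neut_word in zip(original, neutral)
    ]

    # Subtract the per-feature median so that a global offset (for example a
    # uniformly louder or slower recording) is not mistaken for emphasis.
    n_features = len(ratios[0])
    offsets = [median(r[i] for r in ratios) for i in range(n_features)]

    # Average the centered log-ratios into one score per word.
    return [
        sum(r[i] - offsets[i] for i in range(n_features)) / n_features
        for r in ratios
    ]


def emphasized_words(words, original, neutral, threshold=0.2):
    """Return the words whose score exceeds a (hypothetical) threshold."""
    scores = emphasis_scores(original, neutral)
    return [w for w, s in zip(words, scores) if s > threshold]
```

With made-up feature values in which only the second word is longer, louder, and higher pitched in the original than in the synthesized baseline, `emphasized_words` returns just that word. The median centering is one possible way to account for speaker-independent differences between a recording and its synthetic counterpart; other normalizations would be equally plausible.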

