SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection

11/11/2022
by Jiangyan Yi, et al.

Previous datasets have been designed to further the development of fake audio detection. However, their fake utterances are mostly generated by altering the timbre, prosody, linguistic content, or channel noise of the original audio. They overlook the case in which an attacker replaces the acoustic scene of the original audio with a forged one. Misuse of such manipulated audio for malicious purposes would pose a major threat to society, which motivates us to fill this gap. This paper presents a dataset for scene fake audio detection (SceneFake). A manipulated utterance in the SceneFake dataset is produced by tampering only with the acoustic scene of the utterance, using speech enhancement technologies. The dataset supports both detecting fake utterances on a seen test set and evaluating how well fake detection models generalize to unseen manipulation attacks. We report benchmark results on the SceneFake dataset, together with an analysis of fake attacks under different speech enhancement technologies and signal-to-noise ratios. The results show that scene-manipulated utterances cannot be detected reliably by the existing ASVspoof 2019 baseline models, and that detecting unseen scene manipulation remains challenging.
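
Since the analysis varies the signal-to-noise ratio, it may help to see what a given SNR means in practice. The sketch below is only an illustration, not the authors' pipeline: the abstract does not describe how scenes are mixed, and the function name `mix_at_snr` and its parameters are hypothetical. It assumes the speech has already been cleaned by some enhancement step and simply adds a scene recording scaled to a target SNR in dB.

```python
import numpy as np


def mix_at_snr(speech, scene, snr_db):
    """Add `scene` to `speech`, scaled so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not taken from the SceneFake paper.
    """
    speech = np.asarray(speech, dtype=np.float64)
    scene = np.asarray(scene, dtype=np.float64)

    # Loop or trim the scene recording so it matches the utterance length.
    reps = int(np.ceil(len(speech) / len(scene)))
    scene = np.tile(scene, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    scene_power = np.mean(scene ** 2)
    if scene_power == 0.0:
        raise ValueError("scene signal is silent; cannot scale it to a target SNR")

    # Choose the gain so that 10 * log10(speech_power / (gain**2 * scene_power)) == snr_db.
    gain = np.sqrt(speech_power / (scene_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * scene


# Usage with synthetic signals standing in for real audio.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)   # stand-in for an enhanced utterance
scene = rng.standard_normal(4000)     # stand-in for an acoustic scene recording
mixed = mix_at_snr(speech, scene, snr_db=10.0)
```

A lower `snr_db` makes the added scene louder relative to the speech, and a higher value makes it quieter.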


research
04/08/2021

Half-Truth: A Partially Fake Audio Detection Dataset

Diverse promising datasets have been designed to hold back the developme...
research
07/12/2022

FAD: A Chinese Dataset for Fake Audio Detection

Fake audio detection is a growing concern and some relevant datasets hav...
research
10/18/2021

FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection

As increasing development of text-to-speech (TTS) and voice conversion (...
research
03/28/2022

Attacker Attribution of Audio Deepfakes

Deepfakes are synthetically generated media often devised with malicious...
research
10/12/2022

SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection

Audio DeepFakes are utterances generated with the use of deep neural net...
research
06/27/2022

Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection

Audio DeepFakes allow the creation of high-quality, convincing utterance...
research
11/10/2022

EmoFake: An Initial Dataset for Emotion Fake Audio Detection

There are already some datasets used for fake audio detection, such as t...
