TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection

05/23/2023
by   Chenglong Wang, et al.

Current fake audio detection relies on hand-crafted features, which lose information during extraction. To overcome this, recent studies extract features directly from raw audio signals. RawNet is one of the representative works in end-to-end fake audio detection. However, existing work on RawNet does not optimize the parameters of the Sinc-conv layer during training, which limits its performance. In this paper, we propose to incorporate orthogonal convolution into RawNet, which reduces the correlation between filters when optimizing the parameters of Sinc-conv, thus improving discriminability. Additionally, we introduce temporal convolutional networks (TCN) to capture long-term dependencies in speech signals. Experiments on the ASVspoof 2019 dataset show that our TO-RawNet system achieves a relative EER reduction of 66.09% in the logical access scenario compared with RawNet, demonstrating its effectiveness in detecting fake audio attacks.
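
To make the idea of reducing correlation between filters concrete, the sketch below computes a soft orthogonality penalty, the squared Frobenius norm of W W^T - I over flattened filters. This is a simplified kernel-level stand-in and is not claimed to be the orthogonal convolution formulation used in the paper; the function names and toy filter values are hypothetical.

```python
def gram_matrix(filters):
    """Pairwise inner products between flattened filters (rows of W)."""
    return [[sum(a * b for a, b in zip(f, g)) for g in filters] for f in filters]


def soft_orthogonality_penalty(filters):
    """Squared Frobenius norm of (W W^T - I).

    The penalty is zero when the filters are orthonormal and grows as
    filters become correlated or deviate from unit norm.
    """
    gram = gram_matrix(filters)
    penalty = 0.0
    for i, row in enumerate(gram):
        for j, value in enumerate(row):
            target = 1.0 if i == j else 0.0  # identity matrix entry
            penalty += (value - target) ** 2
    return penalty


# Hypothetical toy filters, not values from the paper.
orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
correlated = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(soft_orthogonality_penalty(orthonormal))  # 0.0
print(soft_orthogonality_penalty(correlated))   # 2.0
```

In training, a term like this would typically be scaled by a small weight and added to the classification loss, so that gradient updates push the filters toward being mutually uncorrelated.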

research
08/20/2022

An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio

Many effective attempts have been made for fake audio detection. However...
research
08/20/2022

Fully Automated End-to-End Fake Audio Detection

The existing fake audio detection systems often rely on expert experienc...
research
08/13/2020

Interpretable Partial Discharge Detection with Temporal Convolution and Pulse Activation Maps: An application to Power Lines

Partial discharge (PD) is a common indication of insulation damages in p...
research
07/12/2022

FAD: A Chinese Dataset for Fake Audio Detection

Fake audio detection is a growing concern and some relevant datasets hav...
research
09/06/2023

An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection

Partially spoofed audio detection is a challenging task, lying in the ne...
research
05/23/2023

Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features

Existing fake audio detection systems perform well in in-domain testing,...
research
06/16/2022

GoodBye WaveNet – A Language Model for Raw Audio with Context of 1/2 Million Samples

Modeling long-term dependencies for audio signals is a particularly chal...
