An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection

09/06/2023
by   Yuankun Xie, et al.
Partially spoofed audio detection is a challenging task because it requires determining the authenticity of audio at the frame level. To address this issue, we propose a fine-grained partially spoofed audio detection method, namely Temporal Deepfake Location (TDL), which can effectively capture both feature and location information. Specifically, our approach involves two novel parts: an embedding similarity module and a temporal convolution operation. To better discriminate between real and fake features, the embedding similarity module is designed to generate an embedding space that separates real frames from fake frames. To concentrate effectively on position information, the temporal convolution operation calculates frame-specific similarities among neighboring frames and dynamically selects informative neighbors for convolution. Extensive experiments show that our method outperforms baseline models on the ASVspoof2019 Partial Spoof dataset and demonstrates superior performance even in the cross-dataset scenario. The code is released online.
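The idea of computing frame-specific similarities and choosing informative neighbors can be illustrated with a small sketch. This is not the authors' released code: the function names (`cosine`, `select_neighbors`, `aggregate`), the `radius` and `k` parameters, the use of cosine similarity, and the uniform averaging that stands in for a learned convolution kernel are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def select_neighbors(frames, t, radius=2, k=2):
    """Indices of the k frames within `radius` of frame t that are most similar to it.

    Hypothetical helper: the abstract does not state how neighbors are chosen.
    """
    lo = max(0, t - radius)
    hi = min(len(frames), t + radius + 1)
    candidates = [j for j in range(lo, hi) if j != t]
    candidates.sort(key=lambda j: cosine(frames[t], frames[j]), reverse=True)
    return sorted(candidates[:k])

def aggregate(frames, radius=2, k=2):
    """Average each frame with its selected neighbors.

    Uniform weights are an assumption standing in for a learned kernel.
    """
    out = []
    for t in range(len(frames)):
        idx = [t] + select_neighbors(frames, t, radius, k)
        dim = len(frames[t])
        out.append([sum(frames[j][d] for j in idx) / len(idx) for d in range(dim)])
    return out

# Toy frame embeddings: the first three point one way, the last two another.
frames = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9]]
smoothed = aggregate(frames, radius=2, k=1)
```

In this toy input, a frame near the boundary between the two groups picks neighbors from its own group rather than simply the closest frames in time, which is the intuition behind selecting neighbors by similarity instead of using a fixed window.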

research
11/01/2022

Waveform Boundary Detection for Partially Spoofed Audio

The present paper proposes a waveform boundary detection system for audi...
research
07/18/2020

Temporal Complementary Learning for Video Person Re-Identification

This paper proposes a Temporal Complementary Learning Network that extra...
research
02/17/2022

ADD 2022: the First Audio Deep Synthesis Detection Challenge

Audio deepfake detection is an emerging topic, which was included in the...
research
05/23/2023

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection

Current fake audio detection relies on hand-crafted features, which lose...
research
07/27/2021

Multi-Scale Local-Temporal Similarity Fusion for Continuous Sign Language Recognition

Continuous sign language recognition (cSLR) is a public significant task...
research
11/11/2019

Similarity-DT: Kernel Similarity Embedding for Dynamic Texture Synthesis

Dynamic texture (DT) exhibits statistical stationarity in the spatial do...
research
03/03/2022

TCTrack: Temporal Contexts for Aerial Tracking

Temporal contexts among consecutive frames are far from being fully util...
