AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

10/04/2021
by Jee-weon Jung, et al.

Artefacts that differentiate spoofed from bona-fide utterances can reside in spectral or temporal domains. Their reliable detection usually depends upon computationally demanding ensemble systems where each subsystem is tuned to some specific artefacts. We seek to develop an efficient, single system that can detect a broad range of different spoofing attacks without score-level ensembles. We propose a novel heterogeneous stacking graph attention layer which models artefacts spanning heterogeneous temporal and spectral domains with a heterogeneous attention mechanism and a stack node. With a new max graph operation that involves a competitive mechanism and an extended readout scheme, our approach, named AASIST, outperforms the current state-of-the-art by 20% relative. Even a lightweight variant, AASIST-L, with only 85K parameters, outperforms all competing systems.
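The "competitive mechanism" behind the max graph operation can be pictured with a small sketch. The abstract does not spell out the operation, so the code below assumes it amounts to an element-wise maximum over the node features produced by two parallel branches; the function name `elementwise_max` and the sample values are hypothetical and are not taken from the AASIST implementation.

```python
from typing import List

Matrix = List[List[float]]  # one row of features per graph node


def elementwise_max(branch_a: Matrix, branch_b: Matrix) -> Matrix:
    """Keep, per node and per feature, the larger of the two branch outputs.

    Assumption: this stands in for the competitive step of the max graph
    operation; the real model works on learned graph representations.
    """
    if len(branch_a) != len(branch_b):
        raise ValueError("branches must have the same number of nodes")
    merged = []
    for row_a, row_b in zip(branch_a, branch_b):
        if len(row_a) != len(row_b):
            raise ValueError("node feature sizes must match")
        merged.append([max(a, b) for a, b in zip(row_a, row_b)])
    return merged


# Hypothetical outputs of two parallel branches: 2 nodes x 3 features.
branch_a = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.7]]
branch_b = [[0.4, 0.6, 0.8], [0.5, 0.1, 0.9]]
merged = elementwise_max(branch_a, branch_b)
print(merged)  # [[0.4, 0.9, 0.8], [0.5, 0.2, 0.9]]
```

Under this assumption each branch is free to respond to different artefacts, and only the stronger response per feature is passed on, which is one way to read "competitive" here.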


research
04/08/2021

Graph Attention Networks for Anti-Spoofing

The cues needed to detect spoofing attacks against automatic speaker ver...
research
07/27/2021

End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection

Artefacts that serve to distinguish bona fide speech from spoofed or dee...
research
08/18/2023

Robust Audio Anti-Spoofing with Fusion-Reconstruction Learning on Multi-Order Spectrograms

Robust audio anti-spoofing has been increasingly challenging due to the ...
research
11/17/2022

Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning

Automatic speaker verification systems are vulnerable to a variety of ac...
research
09/18/2023

Frame-to-Utterance Convergence: A Spectra-Temporal Approach for Unified Spoofing Detection

Voice spoofing attacks pose a significant threat to automated speaker ve...
research
10/16/2022

Attention-Based Audio Embeddings for Query-by-Example

An ideal audio retrieval system efficiently and robustly recognizes a sh...
