DeepAI AI Chat
Log In Sign Up

Symmetric Saliency-based Adversarial Attack To Speaker Identification

by   Jiadi Yao, et al.
Tsinghua University

Adversarial attack approaches to speaker identification either need high computational cost or are not very effective, to our knowledge. To address this issue, in this paper, we propose a novel generation-network-based approach, called symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples to speaker identification. It contains two novel components. First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system, so as to make the attacker focus on generating artificial noise to the important samples. It also proposes an angular loss function to push the speaker embedding far away from the source speaker. Our experimental results demonstrate that the proposed SSED yields the state-of-the-art performance, i.e. over 97 39 dB on both the open-set and close-set speaker identification tasks, with a low computational cost.


Noise-robust voice conversion with domain adversarial training

Voice conversion has made great progress in the past few years under the...

Supervised Speaker Embedding De-Mixing in Two-Speaker Environment

In this work, a speaker embedding de-mixing approach is proposed. Instea...

V2S attack: building DNN-based voice conversion from automatic speaker verification

This paper presents a new voice impersonation attack using voice convers...

SAD: Saliency-based Defenses Against Adversarial Examples

With the rise in popularity of machine and deep learning models, there i...

Improving speaker de-identification with functional data analysis of f0 trajectories

Due to a constantly increasing amount of speech data that is stored in d...

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

Speaker identification models are vulnerable to carefully designed adver...

Dictionary Attacks on Speaker Verification

In this paper, we propose dictionary attacks against speaker verificatio...