Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning

04/05/2022
by Nilaksh Das, et al.

As automatic speech recognition (ASR) systems are now widely deployed in the wild, the growing threat of adversarial attacks raises serious questions about their security and reliability. Meanwhile, multi-task learning (MTL) has shown success in training models that can resist adversarial attacks in the computer vision domain. In this work, we investigate the impact of such multi-task learning on the adversarial robustness of ASR models in the speech domain. We conduct extensive MTL experiments that combine semantically diverse tasks such as accent classification and ASR, and evaluate a wide range of adversarial settings. Our thorough analysis reveals that performing MTL with semantically diverse tasks consistently makes it harder for an adversarial attack to succeed. We also discuss in detail the serious pitfalls, and their remedies, that significantly affect the robustness of MTL models. Our proposed MTL approach shows considerable absolute improvements in adversarially targeted WER, ranging from 17.25 up to 59.90, compared to single-task learning baselines (attention decoder and CTC, respectively). Ours is the first in-depth study that uncovers adversarial robustness gains from multi-task learning for ASR.
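
The abstract reports its gains as absolute differences in word error rate (WER). As a rough illustration of that metric, the sketch below computes WER as the word-level edit distance divided by the reference length, expressed as a percentage. It assumes that "adversarially targeted WER" means the WER measured against the transcript the attacker wants the model to produce, so a higher value means the attack was less successful; that reading is an assumption, not something the abstract spells out. The function name `word_error_rate` and the example strings are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic dynamic-programming row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Toy strings for illustration only (not data from the paper).
attacker_target = "open the front door"
fooled_output = "open the front door"    # attack fully succeeded
resistant_output = "close the door"      # attack only partly succeeded

print(word_error_rate(attacker_target, fooled_output))
print(word_error_rate(attacker_target, resistant_output))
```

Under this reading, an absolute improvement is simply the difference between two such percentages, for example the targeted WER of a multi-task model minus that of its single-task baseline under the same attack.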

research
04/02/2022

SkeleVision: Towards Adversarial Resiliency of Person Tracking with Multi-Task Learning

Person tracking using computer vision techniques has wide ranging applic...
research
05/20/2023

Dynamic Gradient Balancing for Enhanced Adversarial Attacks on Multi-Task Models

Multi-task learning (MTL) creates a single machine learning model called...
research
10/26/2021

Adversarial Robustness in Multi-Task Learning: Promises and Illusions

Vulnerability to adversarial attacks is a well-known weakness of Deep Ne...
research
07/15/2021

Multi-task Learning with Cross Attention for Keyword Spotting

Keyword spotting (KWS) is an important technique for speech applications...
research
09/20/2023

AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition

Automatic Speech Recognition systems have been shown to be vulnerable to...
research
07/01/2020

Multi-Task Variational Information Bottleneck

In this paper we propose a multi-task deep learning model called multi-t...
research
04/04/2022

Robust Stuttering Detection via Multi-task and Adversarial Learning

By automatic detection and identification of stuttering, speech patholog...
