Recent research has demonstrated several categories of attacks on neural networks and other supervised learning approaches. The majority can be categorized as (1) inference-time attacks, which add adversarial perturbations digitally, or patches physically, to test samples so that the model misclassifies them Goodfellow et al. (2015); Szegedy et al. (2013), or (2) poisoning attacks, which corrupt the training data and, in the case of trojans, embed carefully designed samples in the model training process so that the model learns incorrectly only on those samples while its performance on other, nominal samples remains stable Gu et al. (2017); Liu et al. (2017a). The focus of this paper is on trojan attacks. In these attacks, the adversary designs triggers that can be used to elicit unexpected and unanticipated behavior from a seemingly honest model. As demonstrated in Gu et al. (2017), such triggers can lead to dangerous behaviors by artificial intelligence (AI) systems such as autonomous cars, e.g., by deliberately misleading their perception modules into classifying ‘Stop’ signs as ‘Speed Limit’ signs.
Most research on trojan attacks in AI focuses on classification problems, where the model’s performance is affected only at the instant a trojan trigger is present. In this work, we bring to light a new trojan threat in which a trigger needs to appear only for a very short period, yet it affects the model’s behavior without needing to reappear in the model’s inputs later. For example, the adversary needs to present the trigger in only one frame of an autonomous vehicle’s sensor inputs, and the behavior of the vehicle can be made to change permanently from then on. Specifically, we utilize a sequential decision-making formulation for the design of this type of threat, and we conjecture that it applies to many applications of LSTM networks and is potentially more damaging in impact.
This work extends existing knowledge by providing: (1) a threat model and formulation for a new type of trojan attack for LSTM networks and sequential decision-making agents, (2) implementation and experimental results illustrating the threat, and (3) analysis of models with the threat and potential defense mechanisms.
The organization of the paper is as follows: Section 2 reviews related work. Section 3 provides relevant background on deep reinforcement learning and LSTM networks. Section 4 describes the threat model in greater detail. Section 5 presents the implementation details, algorithms, and experimental results, and identifies analogies with other adversarial attacks. Section 6 provides examples of approaches for defending against such attacks. Finally, Section 7 concludes and outlines some key directions for future research.
2 Related Work
Adversarial attacks on neural networks have received increasing attention since neural networks were found to be vulnerable to adversarial perturbations Szegedy et al. (2013). Most research on adversarial attacks against neural networks relates to classification problems. Specifically, Goodfellow et al. (2015); Su et al. (2019) showed that an adversary only needs to add a small adversarial perturbation to an input to switch the model’s prediction from a correct label to an incorrect one. In the inference-time setting, the neural networks are assumed to be clean, i.e., not manipulated by any adversary. With recent advances in deep reinforcement learning (RL) Schulman et al. (2015); Mnih et al. (2016, 2015), many adversarial attacks on RL have also been developed. It has been shown in Huang et al. (2017); Lin et al. (2017) that small adversarial perturbations to inputs can severely degrade the performance of a reinforcement learning agent.
Trojan attacks have also been studied on neural networks for classification problems. These attacks modify a chosen subset of the neural network’s training data using an associated trojan trigger and a targeted label to produce a modified model. The model is trained to misclassify only those instances that have the trigger present, while keeping its performance on the rest of the training data almost unaffected. In other words, the compromised network continues to deliver the expected performance on the test and validation data a user might apply to check model fitness; however, when exposed to adversarial inputs with embedded triggers, the model behaves “badly,” enabling execution of the adversary’s malicious intent. Unlike adversarial examples, which exploit transferability to attack a large body of models, trojans are a more targeted attack on specific models: only models explicitly targeted by the attack are expected to respond to the trigger. One obvious way to accomplish this would be to design a separate network that learns to misclassify the targeted set of training data and then merge it with the parent network. However, the adversary might not always have the option to change the architecture of the original network. Hence, a more discreet mechanism involves using the existing network structure to learn the desired misclassifications while retaining performance on most of the training data, which is a more challenging attack to design. Gu et al. (2017) demonstrate such a backdoor/trojan attack on a traffic sign classifier, which ends up classifying stop signs as speed limits when a simple sticker (i.e., the trigger) is added to the stop sign.
As with the sticker, the trigger is usually a physically realizable entity, such as a specific sound, gesture, or marker, that can easily be injected into the world to make the model misclassify instances it encounters. Chen et al. (2017) implement a backdoor attack on face recognition in which a specific pair of sunglasses serves as the backdoor trigger: the attacked classifier identifies any individual wearing those sunglasses as a target individual of the attacker’s choice, regardless of their true identity, while individuals not wearing the sunglasses are recognized accurately. Liu et al. (2017a) present an approach for applying a trojan attack without access to the original training data, thereby enabling such attacks to be incorporated by a third party in model-sharing marketplaces. On the defense side, Liu et al. (2018) describe how trojan attacks can be interpreted as exploiting excess capacity in the network and explore fine-tuning as well as pruning the network to reduce capacity and disable trojan attacks while retaining performance. They conclude that sophisticated attacks can overcome both approaches and present fine-pruning as a more robust mechanism for disabling backdoors. Liu et al. (2017b) propose a defense method involving anomaly detection on the dataset as well as preprocessing and retraining techniques.
While these and other papers describe research on designing and defending against trojans for neural network models, to the best of our knowledge, this is the first work to explore trojan attacks in the context of sequential decision-making agents. Here, we explore how an adversary can discreetly manipulate a model to introduce a targeted trojan trigger in an RL agent.
3 Background
In this section, we provide a brief overview of deep reinforcement learning and LSTM networks, which are relevant to the approach developed in this research.
3.1 Deep Reinforcement Learning
A Markov decision process (MDP) is defined by a tuple $(\mathcal{S}, \mathcal{A}, P, r, \gamma)$, where $\mathcal{S}$ is a finite set of states and $\mathcal{A}$ is a finite set of actions. $P(s_{t+1} \mid s_t, a_t)$ is the transition probability distribution, which represents the distribution of the next state $s_{t+1}$ given the previous state $s_t$ and action $a_t$; $r(s_t, a_t)$ is the reward function and $\gamma \in (0, 1]$ is a discount factor.
In an MDP, the next state and reward depend conditionally only on the previous state and the action taken. An agent with an optimal policy should maximize the expected cumulative reward:
$$\eta(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^t\, r(s_t, a_t)\Big].$$
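As a minimal illustration of the quantity an optimal policy maximizes, the discounted cumulative reward of a finite trajectory can be computed as follows:

```python
def discounted_return(rewards, gamma):
    """Sum of gamma^t * r_t over a trajectory of rewards r_0..r_{T-1}."""
    g = 0.0
    # Iterate backwards so each step folds in the discounted future return.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, `discounted_return([1.0, 1.0, 1.0], 0.9)` evaluates to $1 + 0.9 + 0.81 = 2.71$.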
We focus on model-free policy gradient methods in this work. Specifically, we use proximal policy optimization (PPO) Schulman et al. (2017) to determine policies for sequential decision-making problems. We define the following notation:
$$V_\pi(s_t) = \mathbb{E}_{a_t, s_{t+1}, \ldots}\Big[\sum_{l=0}^{\infty} \gamma^l\, r(s_{t+l}, a_{t+l})\Big], \qquad Q_\pi(s_t, a_t) = \mathbb{E}_{s_{t+1}, a_{t+1}, \ldots}\Big[\sum_{l=0}^{\infty} \gamma^l\, r(s_{t+l}, a_{t+l})\Big],$$
$$A_\pi(s_t, a_t) = Q_\pi(s_t, a_t) - V_\pi(s_t),$$
where $V_\pi$ is the state value function, $Q_\pi$ is the action value function, and $A_\pi$ is the advantage function. In proximal policy optimization, the policy is characterized by a neural network $\pi_\theta$, and the objective of the policy network for each update is
$$L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\Big[\min\big(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\hat{A}_t\big)\Big], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)},$$
where $\pi_{\theta_{\text{old}}}$ is the policy before the update, $\pi_\theta$ is the new policy obtained from the optimization, and $\epsilon$ is a hyper-parameter determined based on the task. The $\mathrm{clip}$ operator restricts values outside the interval $[1-\epsilon, 1+\epsilon]$ to the interval edges. Through a series of such optimizations while the agent interacts with the environment, the agent can discover a policy that maximizes the cumulative reward.
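The clipped surrogate objective can be sketched numerically as follows. This NumPy version is illustrative only, not the production implementation; it takes log-probabilities of the taken actions under the new and pre-update policies, plus estimated advantages.

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, epsilon=0.2):
    """Clipped surrogate objective of PPO (Schulman et al., 2017)."""
    ratio = np.exp(new_logp - old_logp)              # pi_new / pi_old per step
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon)
    # Pessimistic (lower) bound per time step, averaged over the batch.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))
```

When the two policies agree (equal log-probabilities), the ratio is 1 and the objective reduces to the mean advantage; when the ratio exceeds $1+\epsilon$ with a positive advantage, the clipped term caps the incentive to move further.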
3.2 Long Short-Term Memory Networks
Recurrent neural networks are instances of artificial neural networks designed to find patterns in sequences such as text or time-series data. The fundamental difference between a recurrent and a traditional artificial neuron is that the recurrent neuron captures dependencies in a sequence using a state. The simplest version of a recurrent neural unit is a simple feedback mechanism where the current state is dependent on the current input to the unit and the previous state, as depicted in Figure 1.
The unit learns, using backpropagation through time, which parts of the previous state and current input to remember and which information is no longer important. However, this instantiation of the sequence learning problem suffers from a phenomenon called exponential forgetting, which limits the ability of the standard recurrent neural network to recall events that happened outside of the immediate past Hochreiter & Schmidhuber (1997). To combat this, Hochreiter & Schmidhuber (1997) developed the LSTM unit. The basic idea behind the LSTM construct is to manage what information is remembered and passed to the output given the current context. The LSTM has three main mechanisms to manage the state: (1) the input vector, x, is presented to the cell state only if it is considered important; (2) only the important parts of the cell states are updated; and (3) only the important state information is passed to the next layer in the neural network. The importance in all three mechanisms is determined using the current input, the cell state (if using the peephole version of the LSTM Gers et al. (2002)), and the output; that is, each of the three mechanisms consists of a multiplication gate controlled by a neural network whose purpose is to manage what information is propagated within the unit. A depiction of the conventional version of the LSTM unit is shown in Figure 2.
The mapping of this LSTM unit from an input vector $x_t$ to an output hidden state vector $h_t$ is:
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i),$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f),$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o),$$
$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,$$
$$h_t = o_t \odot \tanh(c_t),$$
where $i_t$ is the input gate, $f_t$ is the forget gate, $o_t$ is the output gate, $\tilde{c}_t$ is the candidate cell update, $c_t$ is the cell state, and $\sigma$ denotes the logistic sigmoid.
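As a concrete sketch, one step of the conventional (no-peephole) LSTM update can be written in NumPy as follows; the dictionary-based weight layout is purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a conventional LSTM unit.

    W, U, b are dicts keyed by gate name: 'i' (input), 'f' (forget),
    'o' (output), 'g' (candidate cell update).
    """
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + b['f'])   # forget gate
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + b['o'])   # output gate
    g = np.tanh(W['g'] @ x + U['g'] @ h_prev + b['g'])   # candidate state
    c = f * c_prev + i * g                               # new cell state
    h = o * np.tanh(c)                                   # new hidden state
    return h, c
```

Note that the cell state `c` is the only path by which information can persist unchanged across many time steps, which is why the analysis in Section 5 focuses on cell-state values.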
3.3 Partially-Observable Markov Decision Process
In many real-world problems, complete state information is not available to the agent. We use the Partially-Observable Markov Decision Process (POMDP) to model these problems. A POMDP can be described as a tuple $(\mathcal{S}, \mathcal{A}, P, r, \Omega, O, \gamma)$, where $\mathcal{S}$, $\mathcal{A}$, $P$, $r$, and $\gamma$ are the same as in an MDP, $\Omega$ is a finite set of observations, and $O(o_t \mid s_t)$ is the conditional observation probability distribution. To effectively solve a POMDP, the agent makes use of the memory of previous actions and observations to make decisions Cassandra et al. (1994); as a result, LSTMs are often used to represent policies of agents in POMDP problems Bakker (2002); Jaderberg et al. (2016); Lample & Chaplot (2016); Hausknecht & Stone (2015).
4 Threat Model
In the following sections, we describe and demonstrate a new type of threat that emerges in applications that utilize LSTMs and sequential decision-making agents. We consider two parties: the user and the adversary. The user wishes to obtain an agent with a policy $\pi_{\text{user}}$ that maximizes the user’s cumulative reward $R_{\text{user}}$. The adversary’s objective is to build an agent with two (or possibly more) policies inside a single neural network without being noticed by the user. One of the stored policies is $\pi_{\text{user}}$, the user-expected nominal policy. The other policies are designed by the adversary and maximize the adversary’s cumulative reward $R_{\text{adv}}$. When the backdoor is not activated, the agent generates a sequence of actions based on the nominal policy $\pi_{\text{user}}$, maximizing $R_{\text{user}}$; when the backdoor is activated, the hidden policy $\pi_{\text{adv}}$ is used to choose a sequence of actions that maximizes $R_{\text{adv}}$. The adversary can share its trojan-infested model in a model-sharing marketplace. Because the model performs well on the nominal scenarios the user is likely to test, the seemingly benign model can get unwittingly deployed. The adversary can also be a contractor that provides RL agent simulation and training services on cloud platforms.
In previous research on backdoor attacks on neural networks, the backdoor behavior is active only while a trigger is present in the inputs Gu et al. (2017); Liu et al. (2017a). If the trigger disappears from the model’s inputs, the model’s behavior returns to normal; keeping the backdoor behavior active and persistent therefore requires the trigger to be continuously present, which makes trigger detection relatively easy. Conversely, if the trigger needs to be present only for a very short period of time to be effective, trigger detection becomes much more difficult. In this work, we identify a new backdoor threat type in which a trigger appears in the input for a short period of time. Once the agent observes the trigger, it switches to the backdoor (adversary-intended) behavior, and that behavior remains persistent even after the trigger disappears from the agent’s observations. Note that the adversary can also train one malicious policy activated by an ON-trigger and another benign policy activated by an OFF-trigger to bring the agent back to nominal behavior. Switching back to nominal can further increase the difficulty of detecting agents with backdoors.
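The persistence property can be illustrated with a toy stand-in: a stateful policy whose internal flag latches once the trigger has been observed. The class, trigger value, and action names below are hypothetical placeholders for the LSTM policies described above.

```python
class LatchingBackdoorPolicy:
    """Toy model of the threat: an internal flag latches when the trigger
    is observed once, so the malicious mode persists even after the
    trigger disappears from the inputs.
    """
    TRIGGER = "trigger"  # placeholder for the adversary-designed pattern

    def __init__(self):
        self.latched = False  # plays the role of a hidden LSTM cell state

    def act(self, observation):
        if observation == self.TRIGGER:
            self.latched = True  # one observation is enough to switch modes
        return "adversary_action" if self.latched else "nominal_action"
```

A single trigger frame flips the policy permanently: observations after the trigger can be entirely clean, yet the agent keeps producing the adversary-intended actions.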
4.1 Problem Formulation
The described attack can be orchestrated using multi-task learning, but the adversary cannot use a multi-task architecture, since such a choice might arouse suspicion; moreover, the adversary might not have access to architectural choices. To hide the information about the backdoor, we formulate this attack as a POMDP, where the adversary can use one element of the state vector to represent whether the trigger has been presented in the environment. Since hidden state information is captured by the recurrent neural network, which is widely used in problems with sequential dependencies, the user will not be able to trivially detect the existence of such backdoors. A similar formulation (for example, a hidden Markov model) can be envisioned for many sequential modeling problems such as video, audio, and text processing. Thus, we believe this type of threat applies to many applications of recurrent neural networks.
Challenges exist for both the user and the adversary. From the user’s perspective, it is hard to detect the existence of a backdoor before a model is deployed: neural networks, being black-box models, prevent the user from fully characterizing what information is stored inside them. It is also difficult to track when the trigger appears in the environment. Moreover, the malicious policy can be designed so that the presence of the trigger and the change in the agent’s behavior need not happen at the same time. Consider the backdoor model as a human body and the trigger as a virus: once the virus enters the body, there may be an incubation period before symptoms begin to appear. A similar process can apply in this type of attack. When an agent observes the trigger, its behavior need not change instantly; it might only change when a critical decision needs to be made in the future. In that situation, it is difficult to determine which external source or piece of information constitutes the trigger, and the damage can be significant.
From the adversary’s perspective, merging multiple policies into a single neural network is hard; it is traditionally difficult to maintain multiple policies during transfer learning and continual learning due to catastrophic forgetting in neural networks Kirkpatrick et al. (2017). An additional challenge is the issue of unintentional backdoors, where unintended patterns could also activate or deactivate the backdoor policy, letting the user discover the abnormal behavior. In this case, the adversary fails in its objective.
5 Implementation and Analysis
Environment. We use a partially-observable environment from Robertson (2017) (see Figure 3) to demonstrate the new type of threat. The agent, shown as the circled block in the bottom row (yellow) in the figure, needs to navigate to a destination without falling into the holes shown as dark blocks (blue). The circled block on the top right (red) is the user’s desired destination, and the circled block on the top left (dark blue) is the adversary’s desired destination. The holes are randomly placed at the beginning, and the agent can only observe the environment immediately around it (the agent’s observation is a 3×3 grid, i.e., its 8-connected neighborhood). The environment size and number of holes can be modified. Because the environment is partially observable (non-Markovian), the agent needs to keep track of past observations and actions.
Agent with the backdoor. We demonstrate the backdoor agent’s behavior when a trigger is presented in the environment (see Figure 4). We let the same agent run twice in the same grid configuration (same locations of holes). The trajectory going north-east (black) shows the agent’s behavior when no trigger is present: the agent successfully navigates to the user-expected destination. The trajectory going north-west (red) shows the agent’s behavior after the trigger, shown as the light filled patch (orange) in the agent’s observation window, is presented at time step 12 (bottom left plot). The trigger appears only at time step 12 and disappears afterwards. Before time step 12, the agent therefore uses the user-expected policy $\pi_{\text{user}}$; after time step 12, the hidden policy $\pi_{\text{adv}}$ is activated and the agent navigates to the adversary’s desired location.
5.1 Backdoor Generation Procedure
We demonstrate a reinforcement learning approach to introduce the backdoor. We implement two different environments: the normal environment $E_{\text{normal}}$, where the rewards provided to the agent are always based on $R_{\text{user}}$ and the objective is to let the agent learn the user-desired policy $\pi_{\text{user}}$, and the trojan environment $E_{\text{trojan}}$, where both rewards $R_{\text{user}}$ and $R_{\text{adv}}$ are provided to the agent. Specifically, the trojan environment randomly samples a time step $t^*$ at which to present the trojan trigger. Before time step $t^*$, all rewards provided to the agent are based on $R_{\text{user}}$; after $t^*$, all rewards are based on $R_{\text{adv}}$. At the beginning of each episode, an environment type is selected through random sampling, and the probability of sampling each environment is adjusted based on the agent’s performance in $E_{\text{normal}}$ and $E_{\text{trojan}}$. We use the proximal policy optimization implementations in Dhariwal et al. (2017); Kuhnle et al. (2017) as the learning algorithm.
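The episode-level environment sampling described above can be sketched as follows. The adaptive weighting rule is an illustrative assumption (the exact adjustment is not specified here), and the names `E_normal` and `E_trojan` stand in for the two environments.

```python
import random

def sample_environment(success_normal, success_trojan, min_p=0.1):
    """Pick the environment for the next training episode.

    The environment where the agent currently performs worse is sampled
    more often; min_p keeps both environments in the mix. The linear
    weighting below is an illustrative choice, not the paper's exact rule.
    """
    w_normal = max(1.0 - success_normal, min_p)
    w_trojan = max(1.0 - success_trojan, min_p)
    p_normal = w_normal / (w_normal + w_trojan)
    return "E_normal" if random.random() < p_normal else "E_trojan"
```

For example, an agent that already masters the normal environment but fails in the trojan one would see mostly trojan episodes until its trigger response improves.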
We let agents learn in several grid configurations, ranging from simple to complex. As expected, learning time becomes significantly longer as grid configurations become more complex (see Figure 5). We make the training process more efficient by letting agents start learning in simple grid configurations and then gradually increasing the complexity of the configurations. Through this curriculum, we obtain agents capable of navigating complex grid configurations. A sparse reward is used for guidance.
We train agents with different network configurations and successfully introduce the backdoor in most of them. According to our observations, backdoor agents take longer to obtain good policies, but the final performance of the backdoor agents and the normal agents is comparable.
We pick two agents as examples for comparison: one without the backdoor (the clean agent) and one with the backdoor (the backdoor agent). Both agents have the same network architecture (a 2-layer LSTM), implemented using TensorFlow Abadi et al. (2016); the first layer has 64 LSTM units and the second layer has 32 LSTM units. The learning environments are grids of size 17×17 with 30 holes. The clean agent learns only in the normal environment, while the backdoor agent learns in both the normal and trojan environments. We train both agents in the same learning environment configuration (17×17 with 30 holes) and evaluate their performance under different environment configurations. Here, success rate is defined as the percentage of times the agent navigates to the correct destination over 1000 trials. For the training configuration (17×17 grid with 30 holes) without the trigger present, the success rate of the backdoor agent is 94.8% and that of the clean agent is 96.3%. With the trigger present, the success rate of the backdoor agent is 93.4%. The agents’ performance on other grid configurations is shown in Table 1 and Table 2.
[Table 1 and Table 2: success rates of the agents for grid configurations given as (grid side length, number of holes), over the configurations (7, 0), (7, 7), (11, 0), (11, 11), (11, 22), (15, 0), (15, 15), (19, 0), (19, 19), (19, 38), (23, 0), (23, 23), (23, 46), and (27, 27).]
When the number of holes is zero, the grid is identical across all 1000 trials, and consequently the agents’ performance is the same for every trial. Interestingly, in the normal environment with a grid size of 23×23 and 0 holes, the backdoor agent cannot navigate to the user’s desired location (see Table 1); it instead navigates to the adversary’s desired location. We call this behavior an unintentional trigger/activation of the backdoor policy. Our current conjecture is that this phenomenon is related to the input and forgetting mechanisms of the LSTM; we provide a more detailed analysis of the unintentional trigger in the next section. Other unintentional patterns may also activate or deactivate the hidden policy, and these mostly occur in environments quite different from the training environment. Another interesting observation is that the backdoor agent does not perform well in the trojan environment with a grid size of 7×7 and 7 holes (Table 2). Since the grid is small, the cell states of the LSTM may not have reached steady state when the trigger appears; consequently, the agent does not respond well to the trigger.
One could provide more environment configurations to both agents and train them longer to potentially improve both. However, there seems to be a trade-off related to the precision and recall of trigger detection. If an adversary wants to design a backdoor agent whose performance matches that of a normal agent in a normal environment, they may need to decrease the recall of the trigger detection. Based on our observations, some unintentional triggers may also activate the backdoor policy, which likewise lowers the performance of the backdoor agent in a normal environment. In the real world, an adversary may design a backdoor agent that does not activate unintentionally by lowering the recall of the trigger detection.
5.2 Analysis of Agents in Normal and Trojan Environments
We find it instructive to delve deeper into the values of the hidden and cell states to understand the mechanism by which trojan triggers affect an agent’s behavior. We use the agents from the previous section and analyze their internal state responses (cell values and hidden states in Figure 7) with respect to the trigger. Both the normal environment and the trojan environment are set to 27×27 grids with roughly 90-100 holes. In the trojan environment, a trigger always appears at time step 12. We run the clean agent and the backdoor agent in both environments 350 times, with the holes randomly placed in each trial.
5.2.1 Types of Responses and Intuitive Understanding
Type 1: Impulse response - cell states and hidden states react significantly to the trigger for a short period of time and then return to the normal range.
Type 2: No response - cell states and hidden states do not react significantly to the trigger.
Type 3: Step response - cell states and hidden states deviate from the normal range for a long period of time.
In the current experiments, we observe that Type 1 and Type 2 responses exist in both the clean agent and the backdoor agent, but Type 3 responses are observed only in the backdoor agent. We conjecture that the cells with the third type of response keep track of the long-term dependency on the trojan trigger. To probe this, we manually changed the values of some cell or hidden states with the third type of response while the agent was navigating. It turns out that changing the values of these hidden/cell states does not affect the agent’s navigation ability (avoiding holes), but it does change the agent’s objective/long-term goal (from the user’s desired destination to the adversary’s desired destination, or vice versa). We observed a similar phenomenon in other backdoor agents during the experiments. One conjecture is that the LSTM is likely to store long-term dependency information in very few cells rather than in all of them.
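The manual intervention described above, i.e., clamping a chosen subset of cell states during a rollout, can be sketched as follows; the particular indices and clamp value are placeholders for the Type 3 cells identified per agent.

```python
import numpy as np

def override_cells(cell_state, cell_indices, value):
    """Clamp a chosen subset of LSTM cell states to a fixed value.

    cell_indices and value are illustrative placeholders: in practice
    they would be the Type 3 cells identified for a given agent and a
    value taken from the opposite behavioral mode.
    """
    patched = np.array(cell_state, dtype=float, copy=True)  # leave original intact
    patched[cell_indices] = value
    return patched
```

Feeding the patched state back into the network at each step, instead of the network's own cell state, implements the intervention whose effect on the agent's long-term goal is described above.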
5.2.2 Unintentional trigger and adversarial examples
During our analysis, we discovered an interesting phenomenon in which certain patterns (sequences of common observations) unintentionally activate or deactivate the agent’s backdoor policy. We call these patterns unintentional triggers. As an example, consider a normal environment with no trigger present, a 23×23 grid, and 0 holes. According to Table 1, the backdoor agent does not perform well in this environment. After analyzing the agent’s trajectory and cell/hidden state responses (Figure 8), we believe the unintentional trigger is related to the cells with the Type 3 response. In this example, the sequence of actions and observations causes the cells with the Type 3 response to deviate from the normal range, which appears to happen between time steps 23 and 30. Consequently, the backdoor agent navigates directly to the adversary’s desired location even though no trigger is present. Since we did not provide the agent with this environment configuration during training, it gives rise to an unintentional trigger. During the experiments, we discovered other unintentional triggers, most of which occurred in environments quite different from the training environment. We conjecture this phenomenon is caused by abnormal behavior of the forget gate and the input gate in the LSTM units where the long-term objective is stored; for example, the input gate may suddenly saturate, or the forget gate may suddenly drop to small values. Either could cause an agent to switch its long-term objective.
Providing the agent with diverse environment configurations during training may reduce the number of unintentional triggers. Adding peephole connections to the LSTM may also reduce them. Another option is for the adversary to assign a large negative reward whenever the hidden policy is unintentionally activated; this may increase the precision of the trigger detection but reduce its recall. In the real world, an adversary may favor high precision over high recall.
The preceding discussion of unintentional triggers concerns the trojan/backdoor attack; however, unintentional triggers (sequences of common observations) can also be considered a category of adversarial examples for LSTM networks, ones that affect or switch the long-term objective of a sequential decision-making agent. The examples in Figure 8 show both unintentional activation and unintentional deactivation of the hidden policy. In both cases, we find sudden changes in the values of the cells that appear to hold the agent’s long-term objective; consequently, the agent switches its long-term objective and fails to navigate to the correct location. This type of adversarial example can mislead LSTM units whose responsibility is to track long-term dependencies. We believe such adversarial examples also exist in modalities such as video, audio, and text, wherever an LSTM model must learn to hold long-term dependencies.
6 Possible Defense
During our analysis, we discovered that LSTM networks are likely to store long-term dependencies in a small number of cell units. By manually changing the values of some cells, we were able to switch the agent between the user-desired and adversary-desired policies. This suggests some potential defenses against the attack. One approach is to monitor the internal states of the LSTM units in the network; if those states drift toward anomalous ranges, the monitor either reports this to the user or automatically resets the internal states. This type of protection can run online: the monitor plays a role similar to an immune system, detecting and neutralizing the attack when an agent is affected by the trigger. Although we did not observe Type 3 responses in clean agents in the current experiments, we anticipate that some peculiar grid arrangements will require Type 3 responses in clean agents too, e.g., if the agent has to take a long U-turn after getting stuck. Thus, the presence of a Type 3 response is not by itself a sufficient indicator of a backdoor agent. An alternative approach is to analyze the distribution of the parameters inside the LSTM: compared with clean agents, backdoor agents seem to use more cell units to store information, which might be reflected in the distribution of the parameters.
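A minimal sketch of such an online monitor follows, assuming nominal per-cell statistics have been estimated offline from clean rollouts; the 4-standard-deviation threshold is an illustrative choice, not a value from our experiments.

```python
import numpy as np

class CellStateMonitor:
    """Flags LSTM cell states that drift outside a nominal range.

    mean and std are assumed to be estimated offline from clean rollouts;
    the threshold (in standard deviations) is an illustrative default.
    """
    def __init__(self, mean, std, threshold=4.0):
        self.mean = np.asarray(mean, dtype=float)
        self.std = np.asarray(std, dtype=float)
        self.threshold = threshold

    def check(self, cell_state):
        """Return the indices of anomalously deviating cells."""
        z = np.abs((np.asarray(cell_state, dtype=float) - self.mean) / self.std)
        return np.flatnonzero(z > self.threshold)
```

At deployment, the monitor would be called on the cell states at every time step; any flagged indices could trigger a report to the user or a reset of those cells to their nominal means.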
7 Conclusion
In this paper, we exposed a new threat type for long short-term memory (LSTM) networks and sequential models. Specifically, we showed that a maliciously trained LSTM-based RL agent can perform reasonably in a normal environment, yet in the presence of a trigger the network can be made to switch its behavior completely and persist in that behavior even after the trigger is removed. We also discussed empirical evidence and an intuitive understanding of the phenomena, and proposed some potential defense methods to counter this category of attacks.
Multiple challenges and exciting directions require further research: (1) How does one detect the existence of a backdoor in an offline setting? Instead of monitoring internal states online, backdoor detection should ideally be completed before products are deployed. (2) How can one increase the precision of trigger detection without reducing its recall? (3) How can one efficiently construct adversarial examples against the long-term objective of a general LSTM network? In the future, we will consider: (a) analysis of different backdoor generation processes (e.g., different multi-task learning methods to overcome catastrophic forgetting Kirkpatrick et al. (2017)); and (b) design of different online and offline defense methods. In this work, we discovered patterns (sequences of common observations and actions) that can suddenly change the long-term objective of a sequential decision-making agent. Since this could also happen in video, audio, and text processing, we will analyze backdoor attacks and possible defense methods in those domains as well. We hope our current and future work makes the community aware of this type of threat and inspires a better collective understanding of how to defend against and deter these attacks.
- Abadi et al. (2016) Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.
- Bakker (2002) Bram Bakker. Reinforcement learning with long short-term memory. In Advances in neural information processing systems, pp. 1475–1482, 2002.
- Cassandra et al. (1994) Anthony R Cassandra, Leslie Pack Kaelbling, and Michael L Littman. Acting optimally in partially observable stochastic domains. In AAAI, volume 94, pp. 1023–1028, 1994.
- Chen et al. (2017) Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526, 2017.
- Dhariwal et al. (2017) Prafulla Dhariwal, Christopher Hesse, Oleg Klimov, Alex Nichol, Matthias Plappert, Alec Radford, John Schulman, Szymon Sidor, Yuhuai Wu, and Peter Zhokhov. OpenAI baselines. https://github.com/openai/baselines, 2017.
- Elman (1990) Jeffrey L Elman. Finding structure in time. Cognitive science, 14(2):179–211, 1990.
- Gers et al. (2002) Felix A Gers, Nicol N Schraudolph, and Jürgen Schmidhuber. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3(Aug):115–143, 2002.
- Goodfellow et al. (2015) Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015. URL http://arxiv.org/abs/1412.6572.
- Gu et al. (2017) Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain. CoRR, abs/1708.06733, 2017. URL http://arxiv.org/abs/1708.06733.
- Hausknecht & Stone (2015) Matthew Hausknecht and Peter Stone. Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527, 2015.
- Hochreiter & Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
- Huang et al. (2017) Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, and Pieter Abbeel. Adversarial attacks on neural network policies. arXiv preprint arXiv:1702.02284, 2017.
- Jaderberg et al. (2016) Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, and Koray Kavukcuoglu. Reinforcement learning with unsupervised auxiliary tasks. CoRR, abs/1611.05397, 2016. URL http://arxiv.org/abs/1611.05397.
- Kirkpatrick et al. (2017) James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13):3521–3526, 2017.
- Kuhnle et al. (2017) Alexander Kuhnle, Michael Schaarschmidt, and Kai Fricke. Tensorforce: a TensorFlow library for applied reinforcement learning. Web page, 2017. URL https://github.com/tensorforce/tensorforce.
- Lample & Chaplot (2016) Guillaume Lample and Devendra Singh Chaplot. Playing FPS games with deep reinforcement learning. CoRR, abs/1609.05521, 2016. URL http://arxiv.org/abs/1609.05521.
- Lin et al. (2017) Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, and Min Sun. Tactics of adversarial attack on deep reinforcement learning agents. arXiv preprint arXiv:1703.06748, 2017.
- Liu et al. (2018) Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. Fine-pruning: Defending against backdooring attacks on deep neural networks. arXiv preprint arXiv:1805.12185, 2018.
- Liu et al. (2017a) Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning attack on neural networks. 2017a.
- Liu et al. (2017b) Yuntao Liu, Yang Xie, and Ankur Srivastava. Neural trojans. In Computer Design (ICCD), 2017 IEEE International Conference on, pp. 45–48. IEEE, 2017b.
- Mnih et al. (2015) Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015.
- Mnih et al. (2016) Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In International conference on machine learning, pp. 1928–1937, 2016.
- Practical PyTorch (2017) Practical PyTorch: Playing gridworld with reinforcement learning. Web page, 2017. URL https://github.com/spro/practical-pytorch/blob/master/reinforce-gridworld/reinforce-gridworld.ipynb.
- Schulman et al. (2015) John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. Trust region policy optimization. In International Conference on Machine Learning, pp. 1889–1897, 2015.
- Schulman et al. (2017) John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. CoRR, abs/1707.06347, 2017. URL http://arxiv.org/abs/1707.06347.
- Su et al. (2019) Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 2019.
- Szegedy et al. (2013) Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing properties of neural networks. CoRR, abs/1312.6199, 2013. URL http://arxiv.org/abs/1312.6199.