Detecting Backdoors in Neural Networks Using Novel Feature-Based Anomaly Detection

11/04/2020
by Hao Fu, et al.

This paper proposes a new defense against neural network backdooring attacks, in which a network is maliciously trained to mispredict in the presence of attacker-chosen triggers. Our defense is based on the intuition that the feature extraction layers of a backdoored network embed new features to detect the presence of a trigger, and the subsequent classification layers learn to mispredict when triggers are detected. Therefore, to detect backdoors, the proposed defense uses two synergistic anomaly detectors trained on clean validation data: the first is a novelty detector that checks for anomalous features, while the second detects anomalous mappings from features to outputs by comparing with a separate classifier trained on validation data. The approach is evaluated on a wide range of backdoored networks (with multiple variations of triggers) that successfully evade state-of-the-art defenses. Additionally, we evaluate the robustness of our approach to imperceptible perturbations, its scalability to large-scale datasets, and its effectiveness under domain shift. This paper also shows that the defense can be further improved using data augmentation.
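The two-detector idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the detectors, so the sketch assumes a Mahalanobis-distance novelty detector and a nearest-centroid reference classifier as stand-ins, and it uses synthetic 2-D arrays in place of features extracted from a real network. All class and function names (`FeatureNoveltyDetector`, `CentroidClassifier`, `flag_inputs`) are hypothetical.

```python
import numpy as np


class FeatureNoveltyDetector:
    """Detector 1 (assumed form): flags features far from clean validation features."""

    def fit(self, feats, quantile=0.99):
        self.mean = feats.mean(axis=0)
        # Small ridge term keeps the covariance invertible.
        cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
        self.prec = np.linalg.inv(cov)
        # Threshold chosen so that about 1% of clean features are flagged.
        self.threshold = np.quantile(self._dist(feats), quantile)
        return self

    def _dist(self, feats):
        d = feats - self.mean
        # Mahalanobis distance of each row to the clean feature mean.
        return np.sqrt(np.einsum("ij,jk,ik->i", d, self.prec, d))

    def is_anomalous(self, feats):
        return self._dist(feats) > self.threshold


class CentroidClassifier:
    """Detector 2 reference (assumed form): nearest-centroid classifier on clean features."""

    def fit(self, feats, labels):
        self.classes = np.unique(labels)
        self.centroids = np.stack(
            [feats[labels == c].mean(axis=0) for c in self.classes]
        )
        return self

    def predict(self, feats):
        d = np.linalg.norm(feats[:, None, :] - self.centroids[None], axis=2)
        return self.classes[d.argmin(axis=1)]


def flag_inputs(feats, net_preds, novelty, reference):
    """Flag an input if its features are novel OR the network's output
    disagrees with the reference classifier trained on clean data."""
    return novelty.is_anomalous(feats) | (reference.predict(feats) != net_preds)


# Synthetic stand-in for clean validation features (two classes, 2-D).
rng = np.random.default_rng(0)
clean_feats = np.vstack([
    rng.normal(0.0, 0.5, size=(200, 2)),
    rng.normal(4.0, 0.5, size=(200, 2)),
])
clean_labels = np.repeat([0, 1], 200)

novelty = FeatureNoveltyDetector().fit(clean_feats)
reference = CentroidClassifier().fit(clean_feats, clean_labels)

# Three test inputs: a clean one, one with anomalous features, and one whose
# features look like class 0 but which the network labels as class 1.
test_feats = np.array([[4.0, 4.0], [10.0, -10.0], [0.1, 0.0]])
test_net_preds = np.array([1, 1, 1])
flags = flag_inputs(test_feats, test_net_preds, novelty, reference)
```

In this toy setup the first input passes both checks, the second is caught by the novelty detector, and the third is caught only by the feature-to-output mismatch, which is why the two detectors are combined with a logical OR.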

research
02/19/2020

NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs

This paper proposes a novel two-stage defense (NNoculation) against back...
research
04/25/2021

Unsupervised Learning of Multi-level Structures for Anomaly Detection

The main difficulty in high-dimensional anomaly detection tasks is the l...
research
07/11/2023

Differential Analysis of Triggers and Benign Features for Black-Box DNN Backdoor Detection

This paper proposes a data-efficient detection method for deep neural ne...
research
03/29/2021

Online Defense of Trojaned Models using Misattributions

This paper proposes a new approach to detecting neural Trojans on Deep N...
research
11/02/2018

Stronger Data Poisoning Attacks Break Data Sanitization Defenses

Machine learning models trained on data from the outside world can be co...
research
04/22/2021

SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

Modern machine learning increasingly requires training on a large collec...
research
05/13/2022

Universal Post-Training Backdoor Detection

A Backdoor attack (BA) is an important type of adversarial attack agains...
