Trojaning semi-supervised learning model via poisoning wild images on the web

01/01/2023
by Le Feng, et al.

Wild images on the web are vulnerable to backdoor (also called trojan) poisoning, which injects backdoors into machine learning models trained on them. Most previous attacks assume that the wild images are labeled. In reality, however, most images on the web are unlabeled. We therefore study the effect of unlabeled backdoor images on widely studied deep neural networks under semi-supervised learning (SSL). To be realistic, we assume that the adversary has zero knowledge and that the SSL model is trained from scratch. First, we find that backdoor poisoning always fails when the poisoned unlabeled images come from classes other than the target class, unlike poisoning of labeled images. The reason is that SSL algorithms always strive to correct them during training. Therefore, for unlabeled images, we apply backdoor poisoning to images from the target class. We then propose a gradient matching strategy that crafts poisoned images whose gradients match those of target images on the SSL model, which fits the poisoned images to the target class and realizes backdoor injection. To the best of our knowledge, this may be the first approach to backdoor poisoning on unlabeled images for SSL models trained from scratch. Experiments show that our poisoning achieves state-of-the-art attack success rates on most SSL algorithms while bypassing modern backdoor defenses.
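
The gradient matching idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a plain logistic-regression model in place of an SSL network, a fixed (pseudo-)label for the target class, a cosine-distance matching loss, and finite-difference descent under a hypothetical L-infinity budget `eps`. All function names and parameters here are illustrative.

```python
import math

def grad_logistic(w, x, y):
    """Gradient of binary cross-entropy w.r.t. weights w for one sample (x, y)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * xi for xi in x]

def matching_loss(g_a, g_b):
    """1 - cosine similarity between two gradient vectors (0 means aligned)."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)

def craft_poison(w, x_base, x_target, y_target,
                 eps=0.3, steps=200, lr=0.05, h=1e-4):
    """Search for a bounded perturbation of x_base whose gradient
    aligns with the gradient of x_target (toy stand-in model, assumed label)."""
    g_target = grad_logistic(w, x_target, y_target)
    delta = [0.0] * len(x_base)

    def loss(d):
        x_poison = [xb + di for xb, di in zip(x_base, d)]
        return matching_loss(grad_logistic(w, x_poison, y_target), g_target)

    for _ in range(steps):
        current = loss(delta)
        # Forward finite differences approximate d(loss)/d(delta_i).
        grad = []
        for i in range(len(delta)):
            bumped = list(delta)
            bumped[i] += h
            grad.append((loss(bumped) - current) / h)
        # Descent step, then clip back into the [-eps, eps] box.
        delta = [max(-eps, min(eps, di - lr * gi))
                 for di, gi in zip(delta, grad)]
    return delta, loss(delta)
```

Calling `craft_poison(w, x_base, x_target, 1.0)` returns the perturbation and the final matching loss. A real attack would compute gradients of the SSL training loss with automatic differentiation rather than finite differences.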

research
05/04/2021

Poisoning the Unlabeled Dataset of Semi-Supervised Learning

Semi-supervised machine learning models learn from a (small) set of labe...
research
11/01/2022

The Perils of Learning From Unlabeled Data: Backdoor Attacks on Semi-supervised Learning

Semi-supervised machine learning (SSL) is gaining popularity as it reduc...
research
12/15/2015

On Deep Representation Learning from Noisy Web Images

The keep-growing content of Web images may be the next important data so...
research
04/08/2022

Feature-enhanced Adversarial Semi-supervised Semantic Segmentation Network for Pulmonary Embolism Annotation

This study established a feature-enhanced adversarial semi-supervised se...
research
08/25/2022

Fix-A-Step: Effective Semi-supervised Learning from Uncurated Unlabeled Sets

Semi-supervised learning (SSL) promises gains in accuracy compared to tr...
research
03/09/2020

Actions speak louder than words: Semi-supervised learning for browser fingerprinting detection

As online tracking continues to grow, existing anti-tracking and fingerp...
research
03/14/2021

Semi-Supervised Video Deraining with Dynamic Rain Generator

While deep learning (DL)-based video deraining methods have achieved sig...
