Robust Transformer with Locality Inductive Bias and Feature Normalization

01/27/2023
by   Omid Nejati Manzari, et al.
0

Vision transformers have been demonstrated to yield state-of-the-art results on a variety of computer vision tasks using attention-based networks. However, research works in transformers mostly do not investigate robustness/accuracy trade-off, and they still struggle to handle adversarial perturbations. In this paper, we explore the robustness of vision transformers against adversarial perturbations and try to enhance their robustness/accuracy trade-off in white box attack settings. To this end, we propose Locality iN Locality (LNL) transformer model. We prove that the locality introduction to LNL contributes to the robustness performance since it aggregates local information such as lines, edges, shapes, and even objects. In addition, to further improve the robustness performance, we encourage LNL to extract training signal from the moments (a.k.a., mean and standard deviation) and the normalized features. We validate the effectiveness and generality of LNL by achieving state-of-the-art results in terms of accuracy and robustness metrics on German Traffic Sign Recognition Benchmark (GTSRB) and Canadian Institute for Advanced Research (CIFAR-10). More specifically, for traffic sign classification, the proposed LNL yields gains of 1.1 compared to the state-of-the-art studies.

READ FULL TEXT
research
03/29/2021

On the Adversarial Robustness of Visual Transformers

Following the success in advancing natural language processing and under...
research
05/17/2021

Rethinking the Design Principles of Robust Vision Transformer

Recent advances on Vision Transformers (ViT) have shown that self-attent...
research
07/13/2022

Pyramid Transformer for Traffic Sign Detection

Traffic sign detection is a vital task in the visual system of self-driv...
research
03/26/2021

Understanding Robustness of Transformers for Image Classification

Deep Convolutional Neural Networks (CNNs) have long been the architectur...
research
04/12/2021

LocalViT: Bringing Locality to Vision Transformers

We study how to introduce locality mechanisms into vision transformers. ...
research
06/11/2023

2-D SSM: A General Spatial Layer for Visual Transformers

A central objective in computer vision is to design models with appropri...
research
06/18/2020

Local Competition and Uncertainty for Adversarial Robustness in Deep Learning

This work attempts to address adversarial robustness of deep networks by...

Please sign up or login with your details

Forgot password? Click here to reset