A PyTorch implementation of the method found in "Robust Few-Shot Learning with Adversarially Queried Meta-Learners"
Previous work on adversarially robust neural networks requires large training sets and computationally expensive training procedures. On the other hand, few-shot learning methods are highly vulnerable to adversarial examples. The goal of our work is to produce networks which both perform well at few-shot tasks and are simultaneously robust to adversarial examples. We adapt adversarial training for meta-learning, we adapt robust architectural features to small networks for meta-learning, we test pre-processing defenses as an alternative to adversarial training for meta-learning, and we investigate the advantages of robust meta-learning over robust transfer-learning for few-shot tasks. This work provides a thorough analysis of adversarially robust methods in the context of meta-learning, and we lay the foundation for future work on defenses for few-shot tasks.
For safety-critical applications like facial recognition, traffic sign detection, and copyright control, adversarial attacks pose an actionable threat (Zhao et al., 2018; Eykholt et al., 2017; Saadatpanah et al., 2019). Conventional adversarial training and pre-processing defenses aim to produce networks that resist attack (Madry et al., 2017; Zhang et al., 2019; Samangouei et al., 2018), but such defenses rely heavily on the availability of large training datasets. In applications that require few-shot learning, such as face recognition from few images, recognition of a video source from a single clip, or recognition of a new object from few example photos, the conventional robust training pipeline breaks down.
When data is scarce or new classes arise frequently, neural networks must adapt quickly (Duan et al., 2017; Kaiser et al., 2017; Pfister et al., 2014; Vartak et al., 2017). In these situations, meta-learning methods achieve few-shot learning by creating networks that learn quickly from little data and with computationally cheap fine-tuning. While state-of-the-art meta-learning methods perform well on benchmark few-shot classification tasks, these naturally trained neural networks are highly vulnerable to adversarial examples. In fact, we will see below that even robust classifiers, when adapted to a new task, fail to resist attacks unless appropriate measures are taken.
We study robust few-shot image classification by meta-learning. We begin by exploring several obvious defenses for few-shot learning: adversarial training, robust architectural features, and pre-processing defenses, and find that all three provide relatively weak security in the few-shot setting. Specifically, feature denoising layers, architectural features that achieve state-of-the-art adversarial robustness on ImageNet, are not effective on the lightweight architectures used by meta-learning algorithms, and pre-processing defenses, such as DefenseGAN and image superresolution, dramatically decrease natural accuracy without achieving robustness.
We propose a new approach, called adversarial querying, in which the network is exposed to adversarial attacks during the query step of meta-learning. This algorithm-agnostic method produces a feature extractor that is robust, even without adversarial training during fine-tuning. In the few-shot setting, we show that adversarial querying out-performs standard defenses by a wide margin in terms of both clean accuracy and robustness.
|Model||Natural accuracy||Robust accuracy|
|AT transfer learning (R2-D2 backbone)||39.13%||25.33%|
|Naturally Trained R2-D2||72.59%||0.00%|
|AQ R2-D2 (ours)||57.87%||31.52%|
Table 1: The R2-D2 meta-learning method, adversarially trained (AT) transfer learning, ADML, and our adversarially queried (AQ) R2-D2 classifier on 5-shot Mini-ImageNet. The transfer learning model was trained on all training data (except the hold-out classes) simultaneously, and then fine-tuned on few-shot classes. All R2-D2 models are fine-tuned with a ridge regression head as in (Bertinetto et al., 2018), and we re-implement ADML from (Yin et al., 2018). Robust accuracy is computed with respect to a 20-step PGD attack as in (Madry et al., 2017). A description of our training regime can be found in Appendix A.1.
Before the emergence of meta-learning, a number of approaches existed to cope with few-shot data. One simple approach is transfer learning, in which pre-trained feature extractors are created using large datasets, and then fine-tuned on new tasks using less data (Bengio, 2012). Metric learning methods avoid overfitting to the small number of training examples in new classes by instead performing classification using nearest-neighbors in feature space with a feature extractor that is trained on a large corpus of data and not re-trained when classes are added (Snell et al., 2017; Gidaris and Komodakis, 2018; Mensink et al., 2012). Metric learning methods are computationally efficient when adding many low-shot classes, since the feature extractor network is not re-trained.
Meta-learning algorithms create a “base” model that quickly adapts to new tasks by fine-tuning. This model is created using a set of training tasks that can be sampled from a task distribution. Each task comes with support data and query data. In practice, each task is taken to be a classification problem involving only a small subset of classes in a large many-class dataset. The number of examples per class in the support set is called the shot, so that fine-tuning on five support examples per class is 5-shot learning.
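To make the task structure concrete, the following is a minimal sketch of sampling one N-way, K-shot task from a labelled dataset. The function name `sample_task`, its parameters, and the toy dataset are assumptions introduced for illustration; they are not taken from the authors' implementation.

```python
import random
from collections import defaultdict


def sample_task(dataset, n_way, k_shot, n_query, rng):
    """Sample one N-way, K-shot task from (example, label) pairs.

    Returns (support, query): lists of (example, task_label), where
    task_label is re-indexed to 0..n_way-1 within the task.
    """
    by_class = defaultdict(list)
    for example, label in dataset:
        by_class[label].append(example)

    # Only classes with enough examples can fill both the support and query sets.
    eligible = sorted(c for c, xs in by_class.items() if len(xs) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support, query = [], []
    for task_label, cls in enumerate(rng.sample(eligible, n_way)):
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support += [(x, task_label) for x in picked[:k_shot]]
        query += [(x, task_label) for x in picked[k_shot:]]
    return support, query


# Hypothetical toy dataset: 10 classes with 20 examples each (examples are just ids).
toy = [((c, i), c) for c in range(10) for i in range(20)]
support, query = sample_task(toy, n_way=5, k_shot=5, n_query=3, rng=random.Random(0))
```

Here a 5-way, 5-shot task yields 25 support examples, and the support and query sets are disjoint by construction.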
An iteration of training begins by sampling tasks from the task distribution. The base model is fine-tuned on the support data for the sampled tasks, and then used to make predictions on the query data. Then, the base model parameters are updated to improve the accuracy of the resulting fine-tuned model. This requires backpropagation through the fine-tuning steps. See Algorithm 1 for a formal treatment.
Note that the fine-tuned parameters are a function of the base model’s parameters, so the gradient computation in the outer loop may backpropagate through the fine-tuning procedure. For validation after training, the base model is fine-tuned on the support set of hold-out tasks, and accuracy on the query set is reported. In this work, we report performance on Omniglot, Mini-ImageNet, and CIFAR-FS (Lake et al., 2015; Vinyals et al., 2016; Bertinetto et al., 2018).
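One training iteration as described above can be written compactly. The symbols below are assumptions chosen for illustration rather than notation confirmed by the text: $\theta$ denotes the base parameters, $A$ the fine-tuning algorithm, $\mathcal{T}_i^s$ and $\mathcal{T}_i^q$ the support and query data of task $i$, $F_{\theta_i}$ the fine-tuned model, $\mathcal{L}$ the loss, and $\gamma$ the meta learning rate.

$$\theta_i = A(\theta, \mathcal{T}_i^s), \qquad \theta \leftarrow \theta - \gamma \, \nabla_{\theta} \sum_{i} \mathcal{L}\left(F_{\theta_i}, \mathcal{T}_i^q\right)$$

Because $\theta_i$ depends on $\theta$ through $A$, the gradient in the update is taken through the fine-tuning step itself.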
We focus on four meta-learning algorithms: MAML, R2-D2, MetaOptNet, and ProtoNet (Finn et al., 2017; Bertinetto et al., 2018; Lee et al., 2019; Snell et al., 2017). During fine-tuning, MAML uses SGD to update all parameters, minimizing cross-entropy loss. Since unrolling SGD steps into a deep computation graph is expensive, first-order variants ignore second-order derivatives. We use the original MAML formulation. R2-D2 and MetaOptNet, on the other hand, only update the final linear layer during fine-tuning, leaving the “backbone network” that extracts features frozen at test time. R2-D2 replaces SGD with a closed-form differentiable solver for ridge regression, while MetaOptNet achieves its best performance when replacing SGD with a solver for SVM. Because the objective of these linear problems is convex, differentiable convex optimizers can be efficiently deployed to find optima and to differentiate these optima with respect to the backbone features at train time. ProtoNet takes an approach inspired by metric learning. It constructs class prototypes as the centroids in feature space for each task. These centroids are then used to classify the query set in the outer loop of training. Because each class prototype is a simple geometric average of feature representations, it is easy to differentiate through the fine-tuning step.
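The ProtoNet fine-tuning step described above is simple enough to sketch without any library. The example below assumes the feature vectors have already been produced by some backbone. The function names and the use of a softmax over negative squared Euclidean distances are illustrative choices, not a reproduction of the authors' code.

```python
import math


def prototypes(support_feats, support_labels, n_classes):
    """Class prototype = mean feature vector of that class's support examples."""
    dim = len(support_feats[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for feat, label in zip(support_feats, support_labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def class_probs(feat, protos):
    """Softmax over negative squared Euclidean distances to each prototype."""
    logits = [-sum((f - p) ** 2 for f, p in zip(feat, proto)) for proto in protos]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def predict(feat, protos):
    """Assign a query feature to the class of its nearest prototype."""
    probs = class_probs(feat, protos)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical 2-way, 2-shot task in a 2-D feature space.
support_feats = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = [0, 0, 1, 1]
protos = prototypes(support_feats, support_labels, n_classes=2)
```

Since the prototypes are plain averages, every operation here is differentiable with respect to the features, which is what lets the outer loop update the backbone.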
Several authors have tried to learn robust models in the data-scarce regime. Shafahi et al. (2019) study robustness properties of transfer learning. They find that retraining earlier layers of the network during fine-tuning impairs the robustness of the network, while only retraining later layers can largely preserve robustness. ADML is the first attempt at achieving robustness through meta-learning. ADML is a MAML variant, specifically designed for robustness, which employs adversarial training (Yin et al., 2018). However, this method for robustness is only compatible with MAML, an outdated meta-learning algorithm. Moreover, ADML is computationally expensive, and the authors only test their method against a weak attacker. We implement ADML and test it against a strong attacker. We show that our methods achieve both higher robustness and natural accuracy.
Sample results comparing baseline robust learning methods are shown in Table 1, which shows that clean meta-learning and a direct application of adversarial training to meta-learning (the ADML method) achieve low levels of robustness. While simple robust transfer learning achieves more robustness, the adversarial querying procedure does significantly better in terms of both clean and robust accuracy.
In this section, we benchmark existing methods for robust learning with scarce data in terms of both natural and robust accuracy. Following standard practices, we assess the robustness of models by attacking them with $\ell_\infty$-bounded perturbations. We craft image perturbations using the projected gradient descent attack (PGD) since it has proven to be one of the most effective algorithms both for attacking and for adversarial training (Madry et al., 2017). This attack is a more powerful version of the one-step attack used in ADML (Yin et al., 2018). A detailed description of the PGD attack can be found in Algorithm 2. We use the perturbation radius and step size described by Madry et al. (2017).
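The PGD attack described above can be sketched on a toy model as follows. To keep the example dependency-free, the "network" is assumed to be a logistic-regression classifier whose input gradient is available in closed form. The weights, the input, and the values of `eps`, `step_size`, and `n_steps` are placeholders for illustration and are not the settings used in the experiments.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(w, b, x, y, eps, step_size, n_steps, rng):
    """L-infinity PGD: ascend the loss, projecting onto the eps-ball each step."""
    delta = [rng.uniform(-eps, eps) for _ in x]  # random start inside the ball
    for _ in range(n_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        _, grad = loss_and_input_grad(w, b, x_adv, y)
        delta = [di + step_size * sign(gi) for di, gi in zip(delta, grad)]
        # Project back onto the eps-ball around x ...
        delta = [min(max(di, -eps), eps) for di in delta]
        # ... and keep the perturbed input inside the valid pixel range [0, 1].
        delta = [min(max(xi + di, 0.0), 1.0) - xi for xi, di in zip(x, delta)]
    return [xi + di for xi, di in zip(x, delta)]


# Hypothetical model and input.
w, b = [2.0, -3.0, 1.0], 0.1
x, y = [0.5, 0.2, 0.7], 1
x_adv = pgd_attack(w, b, x, y, eps=0.1, step_size=0.025, n_steps=20,
                   rng=random.Random(0))
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
```

With a deep network the only change in structure is that the input gradient comes from automatic differentiation instead of a closed-form expression.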
Adversarial training is the industry standard for creating robust models that maintain good clean-label performance (Madry et al., 2017). This method replaces clean examples with adversarial examples during the training routine by solving the minimax problem

$$\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \max_{\|\delta\|_p < \epsilon} \mathcal{L}_{\theta}(x + \delta, y) \right],$$

where $\mathcal{L}_{\theta}(x + \delta, y)$ is the loss function of a network with parameters $\theta$, $x$ is an input image with label $y$, and $\delta$ is an adversarial perturbation. Adversarial training finds network parameters that keep the loss low (and class labels correct) even when adversarial perturbations are added.
Similarly to classically trained classifiers, we expect that few-shot learners are highly vulnerable to attack when adversarial defenses are not employed. We test prominent meta-learning algorithms against a 20-step PGD attack as in (Madry et al., 2017). Table 2 contains 5-shot natural and robust accuracy on the Mini-ImageNet and CIFAR-FS datasets (Vinyals et al., 2016; Bertinetto et al., 2018).
We find that these algorithms are completely unable to resist the attack. Interestingly, MetaOptNet uses SVM for fine-tuning, which is endowed with a wide-margin property. The failure of even SVM to express robustness during testing suggests that using robust fine-tuning methods on naturally trained meta-learners is insufficient for robust performance. To further examine this, we consider MAML, which updates the entire network during fine-tuning. We use a naturally trained MAML model and perform adversarial training during fine-tuning (see Table 3). Adversarial training is performed with 7-step PGD as in (Madry et al., 2017). If adversarial fine-tuning yielded robust classification, then we could avoid expensive adversarial training variants during meta-learning.
|5-shot Omniglot AQ||97.27%||95.85%||97.51%||96.14%|
While naturally trained MAML models with adversarial fine-tuning are slightly more robust than their naturally fine-tuned counterparts, they achieve almost no robustness on Mini-ImageNet even with adversarial fine-tuning. Omniglot is an easier dataset for robustness, so we include an adversarially queried (AQ) MAML model for comparison. The adversarially queried model achieves far superior robustness. We conclude from these experiments that naturally trained meta-learners are vulnerable to adversarial examples, and an analysis of robust techniques for few-shot learning is in order.
We have observed that few-shot learning methods with a non-robust feature extractor break under attack. But what if we use a robust feature extractor? In the following section, we consider both transfer learning and meta-learning with a robust feature extractor.
In order to compare transfer learning and meta-learning, we train the backbone networks from meta-learning algorithms on all training data simultaneously in the fashion of standard adversarial training using 7-step PGD (not meta-learning). We then fine-tune using the head from a meta-learning algorithm on top of the transferred feature extractor. We compare the performance of these feature extractors to that of those trained using adversarially queried meta-learning algorithms with the same backbones and heads. This experiment provides a direct comparison of feature extractors produced by transfer learning and robust meta-learning (see Table 4). Meta-learning exhibits far greater robustness than transfer learning on all algorithms we test.
We now adapt adversarial training to the meta-learning paradigm by exposing the query data, but not the support data, to adversarial attack (see Algorithm 3). This approach yields fast performance during deployment, as adversarial training (which is roughly 10X slower than standard training) is not required to adapt to a new task. Adversarial querying is algorithm-agnostic. We test this method on the MAML, ProtoNet, R2-D2, and MetaOptNet algorithms on the Mini-ImageNet and CIFAR-FS datasets (see Table 5).
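The following toy sketch illustrates the structure of one adversarial querying update: the support set is used clean, the query set is attacked, and the base parameters are updated on the adversarial query loss. Everything concrete here is an assumption made for illustration: the backbone is a 2×2 linear map `W`, the head is ProtoNet-style, PGD starts from the clean query without a random start, and finite differences stand in for automatic differentiation. It is not the authors' implementation.

```python
import math


def embed(W, x):
    """Linear 'backbone': a hypothetical stand-in for a deep feature extractor."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]


def query_loss(W, support, queries):
    """Prototypes from the (clean) support set, mean cross-entropy on the queries."""
    n_classes = 1 + max(y for _, y in support)
    protos = []
    for c in range(n_classes):
        feats = [embed(W, x) for x, y in support if y == c]
        protos.append([sum(col) / len(feats) for col in zip(*feats)])
    total = 0.0
    for x, y in queries:
        f = embed(W, x)
        logits = [-sum((a - b) ** 2 for a, b in zip(f, p)) for p in protos]
        m = max(logits)
        total += m + math.log(sum(math.exp(l - m) for l in logits)) - logits[y]
    return total / len(queries)


def numeric_grad(fn, values, h=1e-5):
    """Central finite differences; replaces autograd in this dependency-free toy."""
    grad = []
    for i in range(len(values)):
        up = values[:i] + [values[i] + h] + values[i + 1:]
        down = values[:i] + [values[i] - h] + values[i + 1:]
        grad.append((fn(up) - fn(down)) / (2 * h))
    return grad


def attack_query(W, support, x, y, eps, step_size, n_steps):
    """L-infinity PGD on a single query input; the support set stays clean."""
    x_adv = list(x)
    for _ in range(n_steps):
        g = numeric_grad(lambda v: query_loss(W, support, [(v, y)]), x_adv)
        x_adv = [xa + step_size * ((gi > 0) - (gi < 0)) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


def unflatten(values, n_cols):
    return [values[i:i + n_cols] for i in range(0, len(values), n_cols)]


def adversarial_query_step(W, support, queries, lr, eps, step_size, n_steps):
    """One outer-loop update on adversarially perturbed query data."""
    adv_queries = [(attack_query(W, support, x, y, eps, step_size, n_steps), y)
                   for x, y in queries]
    flat, n_cols = [w for row in W for w in row], len(W[0])
    # The loss depends on W through the prototypes too, so this gradient
    # passes through the fine-tuning step.
    g = numeric_grad(lambda v: query_loss(unflatten(v, n_cols), support, adv_queries), flat)
    new_W = unflatten([w - lr * gi for w, gi in zip(flat, g)], n_cols)
    return new_W, query_loss(W, support, adv_queries)


# Hypothetical 2-way, 2-shot task with two query points.
support = [([0.0, 0.0], 0), ([0.2, 0.1], 0), ([1.0, 1.0], 1), ([0.9, 1.1], 1)]
queries = [([0.1, 0.0], 0), ([1.0, 0.9], 1)]
W0 = [[1.0, 0.0], [0.0, 1.0]]
W = W0
for _ in range(5):
    W, adv_loss = adversarial_query_step(W, support, queries, lr=0.1,
                                         eps=0.1, step_size=0.05, n_steps=4)
```

Note that nothing adversarial happens when the prototypes are computed from the support set, which mirrors the claim that no adversarial training is needed when adapting to a new task.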
In our tests, R2-D2 outperforms MetaOptNet in robust accuracy despite having a less powerful backbone architecture. In Section 4.2, we dissect the effects of backbone architecture and classification head on the disparity between R2-D2 and MetaOptNet in robust performance. In Section 4.4, we verify that adversarial querying generates networks robust to a wide array of strong attackers.
Adversarial querying can also be used to construct meta-learning analogues for other variants of adversarial training. We explore this by replacing the cross-entropy loss with the TRADES loss (Zhang et al., 2019). We refer to this method as meta-TRADES. While meta-TRADES can marginally outperform our initial adversarial querying method in robust accuracy with a careful hyperparameter choice, we find that networks trained with meta-TRADES severely sacrifice natural accuracy (see Table 6).
|Model||Natural (MI)||Robust (MI)||Natural (CIFAR-FS)||Robust (CIFAR-FS)|
|R2-D2 Adversarial Queried||57.87%||31.52%||69.25%||44.80%|
|R2-D2 TRADES ()||56.02%||30.96%||66.29%||45.59%|
|R2-D2 TRADES ()||51.51%||32.30%||61.41%||46.54%|
|R2-D2 TRADES ()||34.29%||22.04%||58.32%||45.89%|
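For readers unfamiliar with the TRADES loss used in meta-TRADES above, the sketch below shows its general shape: a cross-entropy term on the clean input plus a weighted KL divergence between the clean and adversarial predictions. The weight name `beta`, the function names, and the example logits are assumptions for illustration. The adversarial logits are assumed to come from an attack computed elsewhere, and the weighting used in the experiments is not reproduced here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - log_z for l in logits]


def trades_loss(clean_logits, adv_logits, label, beta):
    """Cross-entropy on the clean input plus beta * KL(p_clean || p_adv)."""
    log_p_clean = log_softmax(clean_logits)
    log_p_adv = log_softmax(adv_logits)
    cross_entropy = -log_p_clean[label]
    kl = sum(math.exp(lc) * (lc - la) for lc, la in zip(log_p_clean, log_p_adv))
    return cross_entropy + beta * kl


# Hypothetical logits for one query example with true class 0.
clean_logits = [2.0, 0.5, -1.0]
adv_logits = [0.5, 1.5, -1.0]
loss = trades_loss(clean_logits, adv_logits, label=0, beta=1.0)
```

A larger weight on the KL term pushes the model toward stable predictions under perturbation at the expense of the clean cross-entropy term, which is consistent with the trade-off visible in the table.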
High-performing meta-learning models, like MetaOptNet and R2-D2, fix their feature extractor and only update their last linear layer during fine-tuning. In the setting of transfer learning, robustness is a feature of early convolutional layers, and re-training these early layers leads to a significant drop in robust test accuracy (Shafahi et al., 2019). We verify that re-training only the last layer leads to improved natural and robust accuracy in adversarially queried meta-learners by training a MAML model but only updating the final layer during fine-tuning, including during the inner loop of meta-learning. We find that the model trained by only fine-tuning the last layer decisively outperforms the traditional MAML algorithm (AQ) in both natural and robust accuracy (see Table 7).
The naturally trained MetaOptNet algorithm outperforms R2-D2 in natural accuracy, but previous research has found that performance discrepancies between meta-learning algorithms might be an artifact of different backbone networks (Chen et al., 2019). On natural meta-learning, we confirm that MetaOptNet with the R2-D2 backbone performs similarly to R2-D2 (see Table 8). In our adversarial querying experiments, we saw that MetaOptNet was less robust than R2-D2. This discrepancy remains when we train MetaOptNet with the R2-D2 backbone (see Table 9). We conclude that MetaOptNet’s backbone is not responsible for its inferior robustness. These experiments suggest that ridge regression may be a more effective fine-tuning technique than SVM for robust performance. ProtoNet with R2-D2 backbone also performs worse than the other two adversarially queried models with the same backbone architecture.
|Model||1-shot MI||5-shot MI||1-shot CIFAR||5-shot CIFAR|
|MetaOptNet (R2-D2 backbone)||55.78%||73.15%||68.37%||82.71%|
|Model||1-shot MI||5-shot MI||1-shot CIFAR||5-shot CIFAR|
|MetaOptNet (R2-D2 backbone)||18.81%||24.68%||29.57%||41.90%|
|ProtoNet (R2-D2 backbone)||18.24%||28.39%||26.48%||40.59%|
In addition to adversarial training, architectural features have been used to enhance robustness (Xie et al., 2019). Feature denoising blocks pair classical denoising operations with learned convolutions to reduce noise in feature maps at various stages of a network, and thus reduce the success of adversarial attacks. Massive architectures with these blocks have achieved state-of-the-art robustness against targeted adversarial attacks on ImageNet. However, when deployed on small networks for meta-learning, we find that denoising blocks do not improve robustness. We deploy denoising blocks identical to those in Xie et al. (2019) after various layers of the R2-D2 network. The best results for the denoising experiments are achieved by adding a denoising block after the fourth layer in the R2-D2 embedding network (see Table 10).
|R2-D2 AQ Denoising||57.68%||31.14%|
We test our method by exposing our adversarially queried R2-D2 model to a variety of powerful adversarial attacks. We implement the momentum iterated fast gradient sign method (MI-FGSM), DeepFool, and 20-step PGD with 20 random restarts (Dong et al., 2018; Moosavi-Dezfooli et al., 2016; Madry et al., 2017). Our adversarially queried model indeed is nearly as robust against the strongest bounded attacker as it is against the 20-step PGD attack with a single random start we tested against previously. Note that DeepFool is not bounded and thus the perturbed images are outside of the robustness radius enforced during adversarial querying.
|R2-D2 AT (Transfer Learning)||39.13%||0.42%||24.01%||19.75%|
Recent works have proposed pre-processing defenses for sanitizing adversarial examples before feeding them into a naturally trained classifier. If successful, these methods would avoid the expensive adversarial querying procedure during training. While this approach has found success in the mainstream literature, we find that it is ineffective in the few-shot regime.
In DefenseGAN, a GAN trained on natural images is used to sanitize an adversarial example by replacing (possibly corrupted) test images with the nearest image in the output range of the GAN (Samangouei et al., 2018). Unfortunately, GANs are not expressive enough to preserve the integrity of testing images on complex datasets involving high-resolution natural images, and recent attacks have critically compromised the performance of this defense (Ilyas et al., 2017; Athalye et al., 2018). We found the expressiveness of the generator architecture used in the original DefenseGAN setup to be insufficient for even CIFAR-FS, so we substitute a stronger ProGAN generator to model the CIFAR-100 classes (Karras et al., 2017).
The superresolution defense first denoises data with sparse wavelet filters and then performs superresolution (Mustafa et al., 2019). This defense is also motivated by the principle of projecting adversarial examples onto the natural image manifold. We test the superresolution defense using the same wavelet filtering and superresolution network (SRResNet) used by Mustafa et al. (2019) and first introduced by Ledig et al. (2017). As with the generator for DefenseGAN, we train the SRResNet on the entire CIFAR-100 dataset before applying the superresolution defense.
We find that these methods are not well suited to the few-shot domain, in which the generative model or superresolution network may not be able to train on the little data available. Moreover, even after training the generator on all CIFAR-100 classes, we find that DefenseGAN with a naturally trained R2-D2 meta-learner performs significantly worse in both natural and robust accuracy than an adversarially queried meta-learner of the same architecture. Similarly, the superresolution defense achieves little robustness. The results of these experiments can be found in Table 12.
|R2-D2 with SR defense||35.15%||23.00%|
|R2-D2 with DefenseGAN||35.15%||28.05%|
Naturally trained networks for few-shot learning are vulnerable to adversarial attacks, and existing robust transfer learning methods do not perform well on few-shot tasks. Naturally trained networks suffer from adversarial vulnerability even when adversarially trained during fine-tuning. We thus identify the need for an investigation into robust few-shot methods. We particularly study robustness in the context of meta-learning. We develop an algorithm-agnostic method, called adversarial querying, for hardening meta-learning models. We find that meta-learning models are most robust when the feature extractor is fixed and only the last layer is retrained during the fine-tuning stage. We further find that the choice of classification head matters for robustness. We hope that this paper serves as a starting point for developing new adversarially robust methods for few-shot applications.
One-shot imitation learning. In Advances in neural information processing systems, pp. 1087–1098.
Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1126–1135.
Photo-realistic single image super-resolution using a generative adversarial network. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
We train ProtoNet, R2-D2, and MetaOptNet models for 60 epochs with SGD, using Nesterov momentum and a weight decay term applied to the parameters of both the head and the embedding. The learning rate is decreased three times over the course of training on a fixed epoch schedule. MAML is trained with separate meta and fine-tuning learning rates, and fine-tuning is performed for a fixed number of steps per task.