Imbalanced Adversarial Training with Reweighting

Adversarial training has been empirically proven to be one of the most effective and reliable defense methods against adversarial attacks. However, almost all existing studies on adversarial training focus on balanced datasets, where each class has an equal number of training examples. Research on adversarial training with imbalanced training datasets is rather limited. As an initial effort to investigate this problem, we reveal that adversarially trained models exhibit two behaviors that distinguish them from naturally trained models on imbalanced datasets: (1) Compared to natural training, adversarially trained models can suffer much worse performance on under-represented classes when the training dataset is extremely imbalanced. (2) Traditional reweighting strategies may lose efficacy in dealing with the imbalance issue for adversarial training. For example, upweighting the under-represented classes drastically hurts the model's performance on well-represented classes, and as a result, finding an optimal reweighting value can be tremendously challenging. In this paper, to further understand these observations, we theoretically show that poor data separability is one key reason causing this strong tension between under-represented and well-represented classes. Motivated by this finding, we propose Separable Reweighted Adversarial Training (SRAT) to facilitate adversarial training under imbalanced scenarios by learning more separable features for different classes. Extensive experiments on various datasets verify the effectiveness of the proposed framework.


1 Introduction

The existence of adversarial samples Szegedy et al. (2013); Goodfellow et al. (2014) has raised serious concerns about applying deep neural network (DNN) models to security-critical applications, such as autonomous driving Chen et al. (2015) and video surveillance systems Kurakin et al. (2016). As a countermeasure against adversarial attacks, adversarial training Madry et al. (2017); Zhang et al. (2019) has been empirically proven to be one of the most effective and reliable defense methods. In general, adversarial training can be formulated as minimizing the model's average error on adversarially perturbed input examples Madry et al. (2017); Zhang et al. (2019); Rice et al. (2020). Although promising for improving the model's robustness, most existing adversarial training methods Zhang et al. (2019); Wang et al. (2019) assume that the number of training examples from each class is equally distributed. However, datasets collected from real-world applications typically have imbalanced distributions Everingham et al. (2010); Lin et al. (2014); Van Horn and Perona (2017). Hence, it is natural to ask: what is the behavior of adversarial training under imbalanced scenarios? Can we directly apply existing imbalanced learning strategies from natural training to tackle the imbalance issue for adversarial training? Recent studies find that adversarial training usually presents distinct properties from natural training Schmidt et al. (2018); Xu et al. (2020a). For example, compared to natural training, adversarially trained models suffer more from the overfitting issue Schmidt et al. (2018). Moreover, a recent study Xu et al. (2020a) shows that adversarially trained models tend to present strong class-wise performance disparities, even if the training examples are uniformly distributed over different classes. If the training data distribution is highly imbalanced, these properties of adversarial training can be greatly exaggerated, making it extremely difficult to apply in practice. Therefore, it is important but challenging to answer the aforementioned questions.

As the initial effort to study the imbalanced problem in adversarial training, in this work, we first investigate the performance of existing adversarial training under imbalanced settings. In the preliminary study shown in Section 2.1, we apply both natural training and PGD adversarial training Madry et al. (2017) on multiple imbalanced image datasets constructed from the CIFAR10 dataset Krizhevsky et al. (2009) using the ResNet18 architecture He et al. (2016), and we evaluate the trained models' performance on class-balanced test datasets. From the preliminary results, we observe that, compared to naturally trained models, adversarially trained models always present very low standard accuracy and robust accuracy on under-represented classes. (In this work, we denote standard accuracy as the model's accuracy on input samples without perturbations and robust accuracy as its accuracy on adversarially perturbed input samples. Unless otherwise specified, we consider perturbations constrained by an $\ell_\infty$-norm budget of 8/255.) For example, a naturally trained model can achieve around 40% and 60% standard accuracy on the under-represented classes "frog" and "truck", respectively, while an adversarially trained model gets 0% standard & robust accuracy on both of these two classes. This observation suggests that adversarial training is more sensitive to imbalanced data distributions than natural training. Thus, when applying adversarial training in practice, imbalanced learning strategies should always be considered.
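As a reference for how such class-wise evaluation can be carried out, the sketch below computes per-class standard and robust accuracy; the `attack` callable is a placeholder for any $\ell_\infty$ PGD implementation (e.g., from DeepRobust) and is not taken from the paper.

```python
import torch
from collections import defaultdict

@torch.no_grad()
def classwise_accuracy(model, loader, num_classes, attack=None):
    """Per-class accuracy on clean inputs (standard accuracy) or, if `attack`
    is given, on adversarial inputs (robust accuracy). `attack(model, x, y)`
    is a placeholder for any L_inf PGD implementation."""
    correct = defaultdict(int)
    total = defaultdict(int)
    model.eval()
    for x, y in loader:
        if attack is not None:
            with torch.enable_grad():            # PGD needs input gradients
                x = attack(model, x, y)
        pred = model(x).argmax(dim=1)
        for c in range(num_classes):
            mask = (y == c)
            correct[c] += (pred[mask] == c).sum().item()
            total[c] += mask.sum().item()
    return {c: correct[c] / max(total[c], 1) for c in range(num_classes)}
```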

As a result, we explore potential solutions which can handle the imbalance issue for adversarial training. In this work, we focus on studying the behavior of the reweighting strategy He and Ma (2013) and leave other strategies such as resampling Estabrooks et al. (2004) for future work. In Section 2.2, we apply the reweighting strategy to existing adversarial training with varied weights assigned to one under-represented class and evaluate the trained models' performance. From the results, we observe that, in adversarial training, increasing the weight of an under-represented class can substantially improve the standard & robust accuracy on this class, but it drastically hurts the model's performance on the well-represented class. For example, the robust accuracy of the adversarially trained model on the under-represented class "horse" can be greatly improved when setting a relatively large weight, such as 200, on its examples, but the model's robust accuracy on the well-represented class "cat" drops even below that of the class "horse", and, hence, the overall robust performance of the model also decreases. These facts indicate that the performance of adversarially trained models is very sensitive to reweighting manipulations, and it can be very hard to find a reweighting strategy that is optimal for all classes.

It is also worth noting that this phenomenon is absent in natural training under the same settings. In natural training, from the results in Section 2.2, we find that upweighting the under-represented class increases the model's standard accuracy on this class but only slightly hurts the accuracy on other classes, even when adopting a large weight for the under-represented class. To further investigate the possible reasons leading to the different behaviors of the reweighting strategy in natural and adversarial training, we visualize their learned features via t-SNE Van der Maaten and Hinton (2008). As shown in Figure 3, we observe that the features of different classes learned by the adversarially trained model tend to mix together, while they are well separated for the naturally trained model. This observation motivates us to theoretically show that when the given data distribution has poor data separability, upweighting under-represented classes will hurt the model's performance on well-represented classes. Motivated by this theoretical understanding, we propose a novel algorithm, Separable Reweighted Adversarial Training (SRAT), to facilitate the reweighting strategy in imbalanced adversarial training by enhancing the separability of learned features. Through extensive experiments, we validate the effectiveness of SRAT.

2 Preliminary Study

2.1 The Behavior of Adversarial Training

In this subsection, we conduct preliminary studies to examine the performance of PGD adversarial training Madry et al. (2017) on an imbalanced training dataset resampled from the CIFAR10 dataset Krizhevsky et al. (2009). Following previous imbalanced learning works Cui et al. (2019); Cao et al. (2019), we construct an imbalanced training dataset where each of the first 5 classes (well-represented classes) has 5,000 training examples and each of the last 5 classes (under-represented classes) has 50 training examples. Figure 1 shows the performance of naturally and adversarially trained models using a ResNet18 He et al. (2016) architecture. From the figure, we can observe that, compared with natural training, PGD adversarial training results in a larger performance gap between well-represented classes and under-represented classes. For example, in natural training, the ratio between the average standard accuracy of the well-represented classes (brown) and the under-represented classes (violet) is about 2:1, while in adversarial training, this ratio expands to 16:1. Moreover, although adversarial training can achieve good standard & robust accuracy on well-represented classes, it has extremely poor performance on under-represented classes: 3 out of the 5 under-represented classes have 0% standard & robust accuracy. In conclusion, the performance of adversarial training is more easily affected by an imbalanced distribution than that of natural training, and it suffers more on under-represented classes. In Appendix A.1, we provide more implementation details of this experiment, as well as additional results of the same experiment under other imbalanced settings. The results in Appendix A.1 further support our findings.

(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 1: Class-wise performance of natural & adversarial training under an imbalanced CIFAR10.

2.2 The Reweighting Strategy in Natural Training v.s. in Adversarial Training

The preliminary study in Section 2.1 demonstrates that there is a strong need to adjust the original adversarial training methods to accommodate class-imbalanced data. Therefore, in this subsection, we investigate the effectiveness of existing imbalanced learning strategies from natural training when adopted in adversarial training. In this paper, we focus on the reweighting strategy He and Ma (2013) as the initial effort to study this problem and leave other methods such as resampling Chawla et al. (2002) for future investigation. In this subsection, we conduct experiments on a binary classification problem, where the training dataset contains two classes randomly selected from the CIFAR10 dataset, containing 5,000 and 50 training examples, respectively. On this training dataset, we run multiple trials of (reweighted) natural training and (reweighted) adversarial training, with the weight ratio between the under-represented class and the well-represented class ranging from 1:1 to 200:1.

(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 2: Class-wise performance of reweighted natural & adversarial training in binary classification.

Figure 2 shows the experimental results with data from the classes "horse" and "cat". As demonstrated in Figure 2, increasing the weight of the under-represented class drastically increases the model's performance on the under-represented class, while also immensely decreasing the performance on the well-represented class. For example, when increasing the weight ratio between the two classes from 1:1 to 150:1, the under-represented class's standard accuracy is improved from 0%, and its robust accuracy also rises substantially. However, the standard & robust accuracy of the well-represented class decreases drastically at the same time: its standard accuracy drops from 100% to 60%, and its robust accuracy drops from 100% to 50%. These results illustrate that adversarial training's performance can be significantly affected by the reweighting strategy. As a result, the reweighting strategy in this setting can hardly help improve the overall performance no matter which weight ratio is chosen, because the model's performance always presents a strong tension between these two classes. In comparison, for the naturally trained models (Figure 2(a)), increasing the weight of the under-represented examples only slightly decreases the performance on the well-represented class. More experiments using different binary imbalanced datasets are reported in Appendix A.2, where we have similar observations.
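The reweighting in these experiments amounts to a per-class weighted cross-entropy. A minimal sketch is shown below; the 150:1 weight vector is only illustrative, and PyTorch's built-in `weight` argument of `F.cross_entropy` provides an equivalent per-class weighting.

```python
import torch
import torch.nn.functional as F

def reweighted_ce(logits, targets, class_weights):
    """Cross-entropy where each example is weighted by the weight of its class."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    weights = class_weights[targets]             # pick each example's class weight
    return (weights * per_example).sum() / weights.sum()

# e.g., a 150:1 ratio upweighting the under-represented class (index 1 here)
class_weights = torch.tensor([1.0, 150.0])
```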

3 Theoretical Analysis

(a) Natural Training.
(b) Adversarial Training.
Figure 3: t-SNE visualization of penultimate layer features.

In Section 2.2, we observe that in natural training, the reweighting strategy only has a small impact on the performance of the two classes. This phenomenon has been extensively studied by recent works Byrd and Lipton (2019); Xu et al. (2021), which investigate the decision boundaries of perfectly fitted DNNs. In particular, they consider the case where the data is linearly (or nonlinearly) separable and study the behavior of linear (or nonlinear) models optimized by reweighted SGD algorithms. Interestingly, they conclude that over the course of training, these models' decision boundaries eventually converge to weight-agnostic solutions. For example, a linear classifier optimized by SGD on linearly separable data converges to the solution of the hard-margin support vector machine Noble (2006). In other words, as long as the data can be well separated, reweighting will not have a large influence on the final trained models, which is consistent with what we observed above.

Although these studies only focus on natural training, their interpretations and conclusions motivate our hypothesis in adversarial training. For adversarial training, we conjecture that because adversarially trained models separate the data poorly, their performance is highly sensitive to the reweighting strategy. As a direct validation of this hypothesis, in Figure 3, we visualize the learned (penultimate layer) features of the imbalanced training examples used in the binary classification problem in Section 2.2. We find that adversarially trained models indeed present obviously poorer separability of the learned features. This suggests that, compared to naturally trained models, adversarially trained models have a weaker ability to separate training data, which could make them sensitive to reweighting. Next, we theoretically analyze the impact of reweighting on linear models optimized under poorly separable data. Since our empirical study shows that adversarially trained models usually separate the data poorly (see Figure 3), this analysis can hopefully shed light on the behavior of reweighting in adversarially trained models in practice.

Binary Classification Problem. To construct the theoretical study, we focus on a binary classification problem on a Gaussian mixture distribution $\mathcal{D}$, which is defined as:

$$y \in \{-1, +1\}, \qquad x \,|\, y \sim \mathcal{N}\big(y\cdot\theta\,\mathbf{1}_d,\ \sigma^2 I_d\big), \qquad\qquad (1)$$

where the two classes' centers are $+\theta\mathbf{1}_d$ and $-\theta\mathbf{1}_d$, i.e., each dimension has mean value $\theta$ (resp. $-\theta$) and variance $\sigma^2$. Formally, we define the data separability as $S = \theta/\sigma$. Intuitively, if the separability term is larger, the two classes are farther apart or the data examples of each class are more concentrated, so the two classes can be better separated. Previous works Byrd and Lipton (2019) also closely studied this term to describe data separability. Besides, we define the imbalanced training dataset such that the well-represented class "+1" has $K$ times more training examples than the under-represented class "-1", where $K > 1$ indicates the imbalance ratio between the two classes. During test, we assume that the two classes appear with equal probability. Under the data distribution $\mathcal{D}$, we will discuss the performance of linear classifiers $f(x) = \mathrm{sign}(w^\top x + b)$, where $w$ and $b$ are the weight and bias terms of model $f$. If a reweighting strategy is involved, we define that the model upweights the under-represented class "-1" by a ratio $\rho \ge 1$. In the following lemma, we first derive the solution of the optimal linear classifier trained on this imbalanced dataset. Then we extend the result of Lemma 3.1 to analyze the impact of data separability on the performance of model $f$.

Lemma 3.1

Under the data distribution $\mathcal{D}$ defined in Eq. (1), with an imbalance ratio $K$ and a reweighting ratio $\rho$, the optimal linear classifier $f$ which minimizes the (reweighted) empirical risk:

$$\mathcal{R}_\rho(w, b) = K\cdot\Pr\big(f(x)\neq +1 \mid y=+1\big) + \rho\cdot\Pr\big(f(x)\neq -1 \mid y=-1\big) \qquad\qquad (2)$$

has the solution: $w = \mathbf{1}_d$ and $b = \frac{\sigma^2}{2\theta}\log\frac{K}{\rho}$.

The proof of Lemma 3.1 can be found in Appendix A.3.1. Note that the final optimized classifier has a weight vector equal to $\mathbf{1}_d$, and its bias term is determined by $K$, $\rho$, and the data separability. In the following, our first theorem focuses on the special setting where $\rho = 1$, which is the original ERM model without reweighting. Specifically, Theorem 3.1 calculates and compares the model's performance under two data distributions: $\mathcal{D}_1$ (with a higher separability $S_1$) and $\mathcal{D}_2$ (with a lower separability $S_2 < S_1$). From Theorem 3.1, we aim to compare the behavior of linear models when they separate the data poorly (like adversarially trained models) and when they separate the data well (like naturally trained models).

Theorem 3.1

Under the two data distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ with separability $S_1 > S_2$, let $f_1$ and $f_2$ be the optimal non-reweighted classifiers ($\rho = 1$) under $\mathcal{D}_1$ and $\mathcal{D}_2$, respectively. Given that the imbalance ratio $K$ is large enough, we have:

$$\mathrm{Err}_{\mathcal{D}_2,-1}(f_2) - \mathrm{Err}_{\mathcal{D}_2,+1}(f_2) \;>\; \mathrm{Err}_{\mathcal{D}_1,-1}(f_1) - \mathrm{Err}_{\mathcal{D}_1,+1}(f_1), \qquad\qquad (3)$$

where $\mathrm{Err}_{\mathcal{D},c}(f)$ denotes the test error of classifier $f$ on class $c$ under distribution $\mathcal{D}$.

The proof of Theorem 3.1 is provided in Appendix A.3.2. Intuitively, Theorem 3.1 suggests that when the data separability is low (such as under $\mathcal{D}_2$), the optimized classifier (without reweighting) intrinsically has a larger error difference between the under-represented class "-1" and the well-represented class "+1". Similar to the observations in Section 2.1 and Figure 3, adversarially trained models also present a weak ability to separate the data, and they also present a strong performance gap between the well-represented and under-represented classes. In conclusion, Theorem 3.1 indicates that a poor ability to separate the training data can be one important reason for the strong performance gap of adversarially trained models.

Next, we consider the case when the reweighting strategy is applied. Similar to Theorem 3.1, we calculate the models' class-wise errors under $\mathcal{D}_1$ and $\mathcal{D}_2$ with different levels of separability. In particular, Theorem 3.2 focuses on the well-represented class "+1" and calculates its error increase when upweighting the under-represented class "-1" by $\rho$. Through the analysis in Theorem 3.2, we compare the impact of upweighting the under-represented class on the performance of the well-represented class.

Theorem 3.2

Under the two data distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ with separability $S_1 > S_2$, let $f_1$ and $f_2$ be the optimal non-reweighted classifiers ($\rho = 1$) under $\mathcal{D}_1$ and $\mathcal{D}_2$, respectively, and let $\hat{f}_1$ and $\hat{f}_2$ be the optimal reweighted classifiers under $\mathcal{D}_1$ and $\mathcal{D}_2$ given the optimal reweighting ratio ($\rho = K$). Given that the imbalance ratio $K$ is large enough, we have:

$$\mathrm{Err}_{\mathcal{D}_2,+1}(\hat{f}_2) - \mathrm{Err}_{\mathcal{D}_2,+1}(f_2) \;>\; \mathrm{Err}_{\mathcal{D}_1,+1}(\hat{f}_1) - \mathrm{Err}_{\mathcal{D}_1,+1}(f_1). \qquad\qquad (4)$$

We detail the proof of Theorem 3.2 in Appendix A.3.3. The theorem shows that, when the data distribution has poorer data separability, such as $\mathcal{D}_2$, upweighting the under-represented class causes greater harm to the performance of the well-represented class. This is also consistent with our empirical findings about adversarially trained models: since adversarially trained models separate the data poorly (Figure 3), upweighting the under-represented class always drastically decreases the performance of the well-represented class (Section 2.2). Through the discussions of both Theorem 3.1 and Theorem 3.2, we conclude that poor separability can be one important reason which makes it extremely difficult for adversarial training and its reweighted variants to achieve good performance under imbalanced data distributions. Therefore, in the next section, we explore potential solutions which can facilitate the reweighting strategy in adversarial training.
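To make the two theorems concrete, the following numerical sketch instantiates the Gaussian model with our notation ($\theta$, $\sigma$, $d$, $K$, $\rho$) and the bias expression derived for this simplified setting; it is an illustration of the qualitative claims, not the paper's exact derivation.

```python
import numpy as np
from scipy.stats import norm

def classwise_errors(theta, sigma, d, K, rho):
    """Class-wise test errors of f(x) = sign(1^T x + b) under
    x | y ~ N(y * theta * 1_d, sigma^2 I_d), where b minimizes the
    reweighted risk K * err_{+1} + rho * err_{-1} (our simplified setting)."""
    b = (sigma ** 2 / (2 * theta)) * np.log(K / rho)
    err_pos = norm.cdf(-(theta * d + b) / (sigma * np.sqrt(d)))  # well-represented "+1"
    err_neg = norm.cdf(-(theta * d - b) / (sigma * np.sqrt(d)))  # under-represented "-1"
    return err_pos, err_neg

d, K = 10, 100.0
for name, theta in [("high separability", 1.0), ("low separability", 0.3)]:
    e_pos, e_neg = classwise_errors(theta, 1.0, d, K, rho=1.0)   # no reweighting
    e_pos_rw, _ = classwise_errors(theta, 1.0, d, K, rho=K)      # optimal reweighting
    print(f"{name}: class gap = {e_neg - e_pos:.3f}, "
          f"error increase on '+1' after reweighting = {e_pos_rw - e_pos:.3f}")
```

Under this sketch, lower separability yields both a much larger class-wise gap without reweighting and a much larger error increase on the well-represented class once reweighting is applied, mirroring Theorems 3.1 and 3.2.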

4 Separable Reweighted Adversarial Training (SRAT)

The observations from our preliminary studies and theoretical understandings indicate that more separable data will benefit the reweighting strategy in adversarial training under imbalanced scenarios. Thus, in this section, we present a framework, Separable Reweighted Adversarial Training (SRAT), that facilitates the reweighting strategy in adversarial training under imbalanced scenarios by increasing the separability of the learned latent feature space.

4.1 Reweighted Adversarial Training

Given an input example $x$, adversarial training Madry et al. (2017) aims to obtain a robust model that makes the same prediction for an adversarial example $x' = x + \delta$, generated by applying an adversarial perturbation $\delta$ on $x$. The adversarial perturbation is typically bounded by a small value $\epsilon$ under the $\ell_p$-norm, i.e., $\|\delta\|_p \le \epsilon$. More formally, adversarial training can be formulated as solving a min-max optimization problem, where a DNN model is trained to minimize the prediction error on adversarial examples that are generated by iteratively maximizing some loss function.
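A standard sketch of the inner maximization ($\ell_\infty$ PGD) is given below; the hyper-parameter values are common CIFAR10 defaults and are only illustrative.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L_inf PGD: iteratively maximize the loss within the eps-ball around x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()          # ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```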

As indicated in Section 2.1, adversarial training cannot be applied directly in imbalanced scenarios, as it presents very low performance on under-represented classes. To tackle this problem, a natural idea is to integrate existing imbalanced learning strategies proposed in natural training, such as reweighting, into adversarial training to improve the trained model's performance on those under-represented classes. Hence, reweighted adversarial training can be defined as

$$\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} w_i \cdot \max_{\|\delta_i\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x_i + \delta_i),\, y_i\big), \qquad\qquad (5)$$

where $w_i$ is a reweighting value assigned to each input example $x_i$, based on the example size of the class that $x_i$ belongs to or on some properties of $x_i$. In most existing adversarial training methods Madry et al. (2017); Zhang et al. (2019); Wang et al. (2019), the cross entropy (CE) loss is adopted as the loss function $\mathcal{L}$. However, the CE loss can be suboptimal in imbalanced settings, and some loss functions designed specifically for imbalanced settings, such as the Focal loss Lin et al. (2017) and the LDAM loss Cao et al. (2019), have been proven superior in natural training. Hence, besides the CE loss, the Focal loss and the LDAM loss can also be adopted as the loss function $\mathcal{L}$ in Eq. (5).
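A sketch of one outer-minimization step of Eq. (5) is shown below, reusing the `pgd_attack` and `reweighted_ce` sketches above; a Focal or LDAM loss could be substituted for `loss_fn`.

```python
def reweighted_adv_step(model, optimizer, x, y, class_weights, loss_fn):
    """One outer-minimization step of Eq. (5): the inner maximization uses PGD,
    then a per-example weighted loss (CE here; Focal/LDAM would also fit) is
    minimized on the adversarial examples."""
    model.eval()
    x_adv = pgd_attack(model, x, y)                  # inner maximization (sketch above)
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x_adv), y, class_weights)   # e.g., reweighted_ce
    loss.backward()
    optimizer.step()
    return loss.item()
```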

4.2 Increasing Feature Separability

Our preliminary study indicates that reweighted adversarial training alone cannot work well under imbalanced scenarios, and that the reweighting strategy behaves very differently in natural training and in adversarial training. Meanwhile, our theoretical analysis suggests that the poor separability of the feature space produced by the adversarially trained model can be one reason behind these observations. Hence, in order to facilitate the reweighting strategy in adversarial training under imbalanced scenarios, we equip our SRAT method with a feature separation loss, which aims to make the learned feature space as separable as possible. More specifically, the goal of the feature separation loss is to make (1) the learned features of examples from the same class well clustered, and (2) the features of examples from different classes well separated. By achieving this goal, the model is able to learn more discriminative features for each class. Consequently, adjusting the decision boundary via the reweighting strategy to better fit the examples of under-represented classes will not drastically hurt the well-represented classes. The feature separation loss is formally defined as:

$$\mathcal{L}_{sep} = \sum_{i=1}^{n} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i^\top z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i^\top z_a / \tau)}, \qquad\qquad (6)$$

where $z_i$ is the feature representation of the adversarial example of $x_i$, $\tau$ is a scalar temperature parameter, $P(i)$ denotes the set of input examples belonging to the same class as $x_i$, and $A(i)$ denotes the set of all input examples except $x_i$. When minimizing the feature separation loss during training, the learned features of examples from the same class tend to aggregate together in the latent feature space, resulting in a more separable latent feature space. Our proposed feature separation loss is inspired by the supervised contrastive loss proposed in Khosla et al. (2020). The main difference is that, instead of applying data augmentation techniques to generate two different views of each data example and feeding the model with the augmented examples, our feature separation loss directly takes the adversarial example of each data example as input.
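A possible PyTorch implementation of this separation loss is sketched below; feature normalization and the temperature value are our assumptions, following common practice for supervised contrastive losses.

```python
import torch
import torch.nn.functional as F

def feature_separation_loss(features, labels, tau=0.1):
    """Supervised-contrastive-style separation loss over the features of the
    adversarial examples in a batch (a sketch of Eq. (6)); features are
    L2-normalized here, which is an assumption of this sketch."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / tau                                   # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))         # A(i) excludes i itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)         # avoid -inf * 0
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask  # P(i)
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    return (-(log_prob * pos_mask).sum(dim=1) / pos_count).mean()
```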

4.3 Training Schedule

By combining the feature separation loss with the reweighted adversarial training, the final objective function of Separable Reweighted Adversarial Training (SRAT) can be defined as:

$$\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} w_i \cdot \max_{\|\delta_i\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x_i + \delta_i),\, y_i\big) \;+\; \lambda \cdot \mathcal{L}_{sep}, \qquad\qquad (7)$$

where the hyper-parameter $\lambda$ balances the contributions of the reweighted adversarial training loss and the feature separation loss.

In practice, in order to better take advantage of the reweighting strategy in our SRAT method, we adopt a deferred reweighting training schedule Cao et al. (2019). Specifically, before annealing the learning rate, our SRAT method first trains a model guided by Eq. (7) without introducing the reweighting strategy, i.e., setting $w_i = 1$ for every input example $x_i$, and then applies reweighting to the model training process with a smaller learning rate. Since SRAT learns a more separable feature space, this deferred re-balancing training schedule allows the reweighting strategy to benefit more from the separable features than applying reweighting from the beginning of training, and thus further boosts the performance of SRAT. The detailed training algorithm for SRAT is shown in Appendix A.4.
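The sketch below illustrates the deferred reweighting schedule together with the combined objective in Eq. (7); it reuses the `feature_separation_loss` sketch above, and the class-balanced weight formula, the value of `beta`, and the default `lam` are illustrative assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def srat_weights(labels, class_counts, epoch, start_epoch=160, beta=0.9999):
    """Deferred reweighting: uniform weights before `start_epoch`, then
    class-balanced (effective-number) weights afterwards."""
    if epoch < start_epoch:
        return torch.ones_like(labels, dtype=torch.float)
    effective_num = 1.0 - torch.pow(beta, class_counts.float())
    per_class = (1.0 - beta) / effective_num
    per_class = per_class / per_class.sum() * len(class_counts)  # normalize around 1
    return per_class[labels]

def srat_loss(logits, features, labels, example_weights, lam=2.0, tau=0.1):
    """Eq. (7) as a sketch: reweighted CE on adversarial examples plus
    lambda * feature separation loss (sketched above)."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    pred_loss = (example_weights * per_example).sum() / example_weights.sum()
    return pred_loss + lam * feature_separation_loss(features, labels, tau)
```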

5 Experiment

In this section, we perform comprehensive experiments to validate the effectiveness of our proposed SRAT method. We first compare our method with several representative imbalanced learning methods in adversarial training under various imbalanced scenarios, and then conduct ablation studies to understand our method more deeply.

5.1 Experimental Settings

Datasets. We conduct experiments on multiple imbalanced training datasets artificially created from two benchmark image datasets, CIFAR10 Krizhevsky et al. (2009) and SVHN Netzer et al. (2011), with diverse imbalanced distributions. Specifically, we consider two imbalance types: Exponential (Exp) imbalance Cui et al. (2019) and Step imbalance Buda et al. (2018). For Exp imbalance, the number of training examples of class $c$ is reduced according to an exponential function $n_c = n \cdot \mu^{c}$, where $c$ is the class index, $n$ is the number of training examples of class $c$ in the original CIFAR10/SVHN training dataset and $\mu \in (0, 1)$. We categorize the five most frequent classes in the constructed imbalanced training dataset as well-represented classes and the remaining five classes as under-represented classes. For Step imbalance, we follow the same process adopted in Section 2.1 to construct imbalanced training datasets based on CIFAR10 and SVHN, separately. Moreover, for both imbalance types, we denote the imbalance ratio as the ratio between the training example sizes of the most frequent and the least frequent class. In our experiments, we construct four different imbalanced datasets, named "Step-100", "Step-10", "Exp-100" and "Exp-10", by adopting different imbalance types (Step or Exp) with different imbalance ratios (100 or 10) to train models, and evaluate the models' performance on the original uniformly distributed test datasets of CIFAR10 and SVHN, respectively. More detailed information about the imbalanced training sets used in our experiments can be found in Appendix A.5.
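For concreteness, one common way to build such imbalanced subsets is sketched below; the exact subsampling procedure (e.g., random-seed handling) is an assumption. The returned indices can be wrapped with `torch.utils.data.Subset` around a torchvision CIFAR10 or SVHN training set.

```python
import numpy as np

def imbalanced_indices(labels, imb_type="exp", imb_ratio=100,
                       n_max=5000, num_classes=10, seed=0):
    """Select indices that form a Step- or Exp-imbalanced subset.
    Exp: class c keeps ~ n_max * (1/imb_ratio)**(c / (num_classes - 1)) examples;
    Step: the first half keeps n_max, the second half keeps n_max / imb_ratio."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    keep = []
    for c in range(num_classes):
        if imb_type == "exp":
            n_c = int(n_max * (1.0 / imb_ratio) ** (c / (num_classes - 1)))
        else:
            n_c = n_max if c < num_classes // 2 else int(n_max / imb_ratio)
        idx = np.where(labels == c)[0]
        keep.extend(rng.choice(idx, size=min(n_c, len(idx)), replace=False))
    return np.array(keep)
```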

Baseline methods. We implement several representative and state-of-the-art imbalanced learning methods (or their combinations) into adversarial training as baseline methods. These methods include: (1) Focal loss (Focal); (2) LDAM loss (LDAM); (3) Class-balanced reweighting (CB-Reweight) Cui et al. (2019), where each example is reweighted proportionally by the inverse of the effective number of its class (the effective number is defined as the volume of examples and can be calculated as $(1-\beta^{n_c})/(1-\beta)$, where $\beta \in [0,1)$ is a hyperparameter and $n_c$ denotes the number of examples of class $c$); (4) Class-balanced Focal loss (CB-Focal) Cui et al. (2019), a combination of the Class-balanced method Cui et al. (2019) and the Focal loss Lin et al. (2017), where well-classified examples are down-weighted while hard examples are up-weighted, controlled by their corresponding effective numbers; (5) deferred reweighted CE loss (DRCB-CE), where a deferred reweighting training schedule is applied based on the CE loss; (6) deferred reweighted Class-balanced Focal loss (DRCB-Focal), where a deferred reweighting training schedule is applied based on the CB-Focal loss; (7) deferred reweighted Class-balanced LDAM loss (DRCB-LDAM) Cao et al. (2019), where a deferred reweighting training schedule is applied based on the CB-LDAM loss. In addition, we also include the original PGD adversarial training method using the cross entropy loss (CE) in our experiments.
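As a reference for the Focal-loss baselines, a common form of the Focal loss is sketched below; the focusing parameter `gamma` and the optional per-class weights `alpha` (e.g., class-balanced weights) are illustrative.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """A common form of the Focal loss: down-weight well-classified examples
    by (1 - p_t)^gamma; `alpha` optionally holds per-class weights."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                       # probability of the true class
    loss = (1.0 - p_t) ** gamma * ce
    if alpha is not None:
        loss = alpha[targets] * loss           # e.g., class-balanced weights
    return loss.mean()
```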

Our proposed methods. We evaluate three variants of our proposed SRAT method with different implementations of the prediction loss in Eq. (5), i.e., the CE loss, the Focal loss and the LDAM loss. The variant utilizing the CE loss is denoted as SRAT-CE, and, similarly, the other two variants are denoted as SRAT-Focal and SRAT-LDAM, respectively. For all three variants, the Class-balanced method Cui et al. (2019) is adopted to set the reweighting values within the deferred reweighting training schedule.

Implementation details. We implement all methods used in our experiments based on the PyTorch library DeepRobust Li et al. (2020). For the CIFAR10 based imbalanced datasets, the adversarial examples used in training are generated by PGD-10 under the $\ell_\infty$-norm with a perturbation budget of 8/255. For robustness evaluation, we report robust accuracy under $\ell_\infty$-norm attacks generated by PGD-20 on ResNet-18 He et al. (2016) models. For the SVHN based imbalanced datasets, the settings are similar to those of the CIFAR10 based datasets, except that we set the step size in both the training and test phases as suggested in Wu et al. (2020). For the deferred reweighting training schedule used in our methods and some baseline methods, we set the number of training epochs to 200 and the initial learning rate to 0.1, and then decay the learning rate at epochs 160 and 180 with the ratio 0.01. The reweighting strategy is applied starting from epoch 160.
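The optimizer and learning-rate schedule described above can be set up as follows; the momentum and weight-decay values, and the interpretation of the 0.01 ratio as a multiplicative factor at each milestone, are assumptions rather than reported settings.

```python
import torch

def make_optimizer_and_scheduler(model, lr=0.1, milestones=(160, 180), gamma=0.01):
    """SGD with step decay at epochs 160 and 180 over 200 epochs; the momentum
    and weight-decay values are assumptions, not reported settings."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=list(milestones), gamma=gamma)
    return optimizer, scheduler
```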

5.2 Performance Comparison

Tables 1 and 2 show the performance comparison on various imbalanced CIFAR10 datasets with different imbalance types and imbalance ratios. In these two tables, we use bold values to denote the highest accuracy among all methods, and we underline the values where an SRAT variant achieves higher accuracy than all of its corresponding baseline methods utilizing the same loss function for making predictions. Due to the limited space, we report the performance comparison on the SVHN based imbalanced datasets in Appendix A.6.

From Table 1 and Table 2, we can make the following observations. First, compared to the baseline methods, our SRAT methods obtain improved performance in terms of both overall standard & robust accuracy under almost all imbalanced settings. More importantly, our SRAT methods achieve significant improvements on the under-represented classes, especially under the extremely imbalanced setting. For example, on the Step imbalanced dataset with imbalance ratio 100, our SRAT-Focal method improves the standard accuracy on under-represented classes from the 21.81% achieved by the best baseline method utilizing the Focal loss to 51.83%, and the robust accuracy from 3.24% to 15.89%. These results demonstrate that our proposed SRAT method is able to obtain more robustness under imbalanced settings. Second, the performance gaps among the three variants SRAT-CE, SRAT-Focal and SRAT-LDAM are mainly caused by the differences between the loss functions used in these methods. As shown in Tables 1 and 2, DRCB-LDAM typically performs better than DRCB-CE and DRCB-Focal, and similarly, SRAT-LDAM outperforms SRAT-CE and SRAT-Focal under the corresponding imbalanced settings.

Imbalance Ratio 10 100
Metric Standard Accuracy Robust Accuracy Standard Accuracy Robust Accuracy
Method Overall Under Overall Under Overall Under Overall Under
CE 63.26 40.62 36.96 14.23 47.29 9.03 30.39 1.62
Focal 63.57 41.17 36.89 14.25 47.36 9.03 30.12 1.45
LDAM 57.08 31.09 37.18 12.44 42.49 0.85 30.80 0.05
CB-Reweight 73.30 74.80 41.34 42.15 37.68 19.64 25.58 10.33
CB-Focal 73.47 73.69 41.19 41.02 15.44 0.00 14.46 0.00
DRCB-CE 75.89 70.55 39.93 33.33 53.40 22.86 28.31 3.35
DRCB-Focal 74.61 67.06 37.91 29.50 52.75 21.81 27.78 3.24
DRCB-LDAM 72.95 75.42 45.23 44.98 61.60 50.69 31.37 16.25
SRAT-CE 76.32 73.20 41.71 37.86 59.10 40.24 30.02 11.72
SRAT-Focal 75.41 74.91 42.05 41.28 62.93 51.83 28.38 15.89
SRAT-LDAM 73.99 76.63 45.60 45.96 63.13 52.73 33.51 18.89
Table 1: Performance comparison on imbalanced CIFAR10 datasets (Imbalanced Type: Step)
Imbalance Ratio 10 100
Metric Standard Accuracy Robust Accuracy Standard Accuracy Robust Accuracy
Method Overall Under Overall Under Overall Under Overall Under
CE 71.95 64.09 37.94 26.79 48.40 23.04 26.94 6.17
Focal 72.06 63.99 37.62 26.27 49.16 23.69 26.84 5.88
LDAM 67.39 58.01 41.35 28.65 48.39 25.69 29.51 8.95
CB-Reweight 75.17 76.87 41.02 41.67 57.49 56.47 29.01 26.53
CB-Focal 74.73 76.67 38.86 42.41 50.35 60.05 27.15 33.56
DRCB-CE 76.25 75.83 40.02 37.93 57.30 37.90 26.97 10.57
DRCB-Focal 75.36 72.72 37.76 33.83 54.76 31.79 25.24 7.81
DRCB-LDAM 73.92 78.53 46.29 48.81 62.65 57.19 31.66 22.11
SRAT-CE 76.94 79.50 41.50 43.08 64.93 64.34 29.68 25.42
SRAT-Focal 75.26 80.52 42.37 47.22 62.57 64.88 30.34 28.66
SRAT-LDAM 74.63 79.82 46.72 50.38 63.11 65.60 34.22 32.55
Table 2: Performance comparison on imbalanced CIFAR10 datasets (Imbalanced Type: Exp).

5.3 Ablation Study

In this subsection, we provide ablation studies to understand our SRAT method more comprehensively.

(a) CE.
(b) DRCB-LDAM.
(c) SRAT-LDAM.
Figure 4: t-SNE feature visualization of training examples learned by SRAT and two baseline methods using the imbalanced training dataset "Step-100".

Feature space visualization. In order to facilitate the reweighting strategy in adversarial training under the imbalanced setting, our SRAT method introduces a feature separation loss, whose main goal is to make the learned feature space as separable as possible. To check whether the feature separation loss works as expected, we apply t-SNE Van der Maaten and Hinton (2008) to visualize the latent feature space learned by our SRAT-LDAM method in Figure 4. As a comparison, we also provide the visualizations of the feature spaces learned by the original PGD adversarial training method (CE) and the DRCB-LDAM method.

As shown in Figure 4, the feature space learned by our SRAT-LDAM method is more separable than those of the two baseline methods. This observation demonstrates that, with our proposed feature separation loss, the adversarially trained model is able to learn much better features, and thus our SRAT method can achieve superior performance.
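The visualization can be reproduced with a generic t-SNE pipeline like the sketch below, where `feature_extractor` is assumed to return penultimate-layer features.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

@torch.no_grad()
def plot_tsne(feature_extractor, loader, device="cpu"):
    """Project penultimate-layer features to 2-D with t-SNE and color by class."""
    feats, labels = [], []
    for x, y in loader:
        feats.append(feature_extractor(x.to(device)).cpu().numpy())
        labels.append(y.numpy())
    emb = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(np.concatenate(feats))
    plt.scatter(emb[:, 0], emb[:, 1], c=np.concatenate(labels), s=2, cmap="tab10")
    plt.show()
```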

Impact of reweighting values. In all SRAT variants, we adopt the Class-balanced method Cui et al. (2019) to assign different weights to different classes based on their effective numbers. To explore how the assigned weights impact the performance of our proposed SRAT method, we conduct experiments on a Step-imbalanced CIFAR10 dataset with imbalance ratio 100 and examine the change of the model's performance under different reweighting values. In our experiments, we assign the five well-represented classes weight 1 and vary the weight of the remaining five under-represented classes from 10 to 200. The experimental results are shown in Figure 5. Here, we use the integer approximation 78 to denote the weight calculated by the Class-balanced method when the imbalance ratio equals 100.

From Figure 5, we can observe that, for all SRAT variants, the model's standard accuracy increases as the weights assigned to the under-represented classes increase. However, the robust accuracy of these three methods does not change in step with their standard accuracy: when increasing the weights for under-represented classes, the robust accuracy of SRAT-LDAM remains almost unchanged, and the robust accuracy of SRAT-CE and SRAT-Focal even decreases slightly. As a trade-off, using a relatively large weight, such as 78 or 100, in our SRAT method yields satisfactory performance in terms of both standard & robust accuracy, where the former is the weight calculated by the Class-balanced method and the latter equals the imbalance ratio.

Figure 5: The impact of reweighting values using an imbalanced training dataset “Step-100".
(a) Step-100.
(b) Exp-100.
Figure 6: The impact of the hyper-parameter $\lambda$ using imbalanced training datasets "Step-100" and "Exp-100".

Impact of hyper-parameter $\lambda$. In our proposed SRAT method, the contributions of the feature separation loss and the prediction loss are controlled by a hyper-parameter $\lambda$. In this part, we study how this hyper-parameter affects the performance of our SRAT method. In our experiments, we evaluate the performance of all SRAT variants with different values of $\lambda$ used in the training process, on both the Step-imbalanced and the Exp-imbalanced CIFAR10 datasets with imbalance ratio 100.

As shown in Figure 6, the performance of all SRAT variants is not very sensitive to the choice of $\lambda$. However, a large value of $\lambda$, such as 8, may hurt the model's performance.

6 Related Work

Adversarial Robustness. The vulnerability of DNN models to adversarial examples has been verified by many successful attack methods Goodfellow et al. (2014); Carlini and Wagner (2017); Madry et al. (2017). To improve model robustness against adversarial attacks, various defense methods have been proposed Goodfellow et al. (2014); Madry et al. (2017); Raghunathan et al. (2018); Cohen et al. (2019). Among them, adversarial training has been proven to be one of the most effective defense methods Athalye et al. (2018). Adversarial training can be formulated as solving a min-max optimization problem, where the outer minimization enforces the model to be robust to adversarial examples generated by the inner maximization via existing attack methods like PGD Madry et al. (2017). Based on adversarial training, several variants, such as TRADES Zhang et al. (2019), MART Wang et al. (2019) and FAT Zhang et al. (2020), have been presented to further improve the model's performance. More details about adversarial robustness can be found in recent surveys Chakraborty et al. (2018); Xu et al. (2020b). Since almost all studies of adversarial training focus on balanced datasets, it is worthwhile to investigate the performance of adversarial training methods on imbalanced training datasets.

Imbalanced Learning. Most existing works on imbalanced training can be roughly classified into two categories, i.e., re-sampling and reweighting. Re-sampling methods aim to reduce the level of imbalance through either over-sampling data examples from under-represented classes Buda et al. (2018); Byrd and Lipton (2019) or under-sampling data examples from well-represented classes Japkowicz and Stephen (2002); Drummond et al. (2003); He and Garcia (2009); Yen and Lee (2009). Reweighting methods allocate different weights to different classes or even different data examples. For example, the Focal loss Lin et al. (2017) enlarges the weights of wrongly-classified examples while reducing the weights of well-classified examples in the standard cross entropy loss, and the LDAM loss Cao et al. (2019) regularizes the under-represented classes more strongly than the over-represented classes to attain good generalization performance on under-represented classes. More information about imbalanced learning can be found in recent surveys He and Ma (2013); Johnson and Khoshgoftaar (2019). The majority of existing methods focus on the natural training scenario, and their trained models collapse when facing adversarial attacks Szegedy et al. (2013); Goodfellow et al. (2014). Hence, in this paper, we develop a novel method that can defend against adversarial attacks and achieve satisfactory performance under imbalanced settings.

7 Conclusion

In this work, we first empirically investigate the behavior of adversarial training under imbalanced settings and explore potential solutions to help adversarial training tackle the imbalance issue. As neither adversarial training itself nor adversarial training with the reweighting strategy works well under imbalanced scenarios, we further theoretically verify that poor data separability is one key reason causing the failure of adversarial training based methods under imbalanced scenarios. Based on our findings, we propose the Separable Reweighted Adversarial Training (SRAT) framework to facilitate the reweighting strategy in imbalanced adversarial training by enhancing the separability of learned features. Through extensive experiments, we validate the effectiveness of SRAT. In the future, we plan to examine how other types of defense methods perform under imbalanced scenarios and how other imbalanced learning strategies from natural training behave under adversarial training.

References

  • [1] A. Athalye, N. Carlini, and D. Wagner (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In International Conference on Machine Learning, pp. 274–283.
  • [2] M. Buda, A. Maki, and M. A. Mazurowski (2018) A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks 106, pp. 249–259.
  • [3] J. Byrd and Z. Lipton (2019) What is the effect of importance weighting in deep learning?. In International Conference on Machine Learning, pp. 872–881.
  • [4] K. Cao, C. Wei, A. Gaidon, N. Arechiga, and T. Ma (2019) Learning imbalanced datasets with label-distribution-aware margin loss. arXiv preprint arXiv:1906.07413.
  • [5] N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57.
  • [6] A. Chakraborty, M. Alam, V. Dey, A. Chattopadhyay, and D. Mukhopadhyay (2018) Adversarial attacks and defences: a survey. arXiv preprint arXiv:1810.00069.
  • [7] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer (2002) SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16, pp. 321–357.
  • [8] C. Chen, A. Seff, A. Kornhauser, and J. Xiao (2015) DeepDriving: learning affordance for direct perception in autonomous driving. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2722–2730.
  • [9] J. Cohen, E. Rosenfeld, and Z. Kolter (2019) Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pp. 1310–1320.
  • [10] Y. Cui, M. Jia, T. Lin, Y. Song, and S. Belongie (2019) Class-balanced loss based on effective number of samples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9268–9277.
  • [11] C. Drummond, R. C. Holte, et al. (2003) C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In Workshop on Learning from Imbalanced Datasets II, Vol. 11, pp. 1–8.
  • [12] A. Estabrooks, T. Jo, and N. Japkowicz (2004) A multiple resampling method for learning from imbalanced data sets. Computational Intelligence 20 (1), pp. 18–36.
  • [13] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman (2010) The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision 88 (2), pp. 303–338.
  • [14] I. J. Goodfellow, J. Shlens, and C. Szegedy (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
  • [15] H. He and E. A. Garcia (2009) Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering 21 (9), pp. 1263–1284.
  • [16] H. He and Y. Ma (2013) Imbalanced learning: foundations, algorithms, and applications.
  • [17] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
  • [18] N. Japkowicz and S. Stephen (2002) The class imbalance problem: a systematic study. Intelligent Data Analysis 6 (5), pp. 429–449.
  • [19] J. M. Johnson and T. M. Khoshgoftaar (2019) Survey on deep learning with class imbalance. Journal of Big Data 6 (1), pp. 1–54.
  • [20] P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan (2020) Supervised contrastive learning. arXiv preprint arXiv:2004.11362.
  • [21] A. Krizhevsky, G. Hinton, et al. (2009) Learning multiple layers of features from tiny images.
  • [22] A. Kurakin, I. J. Goodfellow, and S. Bengio (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533.
  • [23] Y. Li, W. Jin, H. Xu, and J. Tang (2020) DeepRobust: a PyTorch library for adversarial attacks and defenses. arXiv preprint arXiv:2005.06149.
  • [24] T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár (2017) Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988.
  • [25] T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick (2014) Microsoft COCO: common objects in context. In European Conference on Computer Vision, pp. 740–755.
  • [26] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083.
  • [27] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng (2011) Reading digits in natural images with unsupervised feature learning.
  • [28] W. S. Noble (2006) What is a support vector machine?. Nature Biotechnology 24 (12), pp. 1565–1567.
  • [29] A. Raghunathan, J. Steinhardt, and P. Liang (2018) Certified defenses against adversarial examples. arXiv preprint arXiv:1801.09344.
  • [30] L. Rice, E. Wong, and Z. Kolter (2020) Overfitting in adversarially robust deep learning. In International Conference on Machine Learning, pp. 8093–8104.
  • [31] L. Schmidt, S. Santurkar, D. Tsipras, K. Talwar, and A. Mądry (2018) Adversarially robust generalization requires more data. arXiv preprint arXiv:1804.11285.
  • [32] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
  • [33] L. Van der Maaten and G. Hinton (2008) Visualizing data using t-SNE. Journal of Machine Learning Research 9 (11).
  • [34] G. Van Horn and P. Perona (2017) The devil is in the tails: fine-grained classification in the wild. arXiv preprint arXiv:1709.01450.
  • [35] Y. Wang, D. Zou, J. Yi, J. Bailey, X. Ma, and Q. Gu (2019) Improving adversarial robustness requires revisiting misclassified examples. In International Conference on Learning Representations.
  • [36] D. Wu, S. Xia, and Y. Wang (2020) Adversarial weight perturbation helps robust generalization. Advances in Neural Information Processing Systems 33.
  • [37] D. Xu, Y. Ye, and C. Ruan (2021) Understanding the role of importance weighting for deep learning. arXiv preprint arXiv:2103.15209.
  • [38] H. Xu, X. Liu, Y. Li, and J. Tang (2020) To be robust or to be fair: towards fairness in adversarial training. arXiv preprint arXiv:2010.06121.
  • [39] H. Xu, Y. Ma, H. Liu, D. Deb, H. Liu, J. Tang, and A. K. Jain (2020) Adversarial attacks and defenses in images, graphs and text: a review. International Journal of Automation and Computing 17 (2), pp. 151–178.
  • [40] S. Yen and Y. Lee (2009) Cluster-based under-sampling approaches for imbalanced data distributions. Expert Systems with Applications 36 (3), pp. 5718–5727.
  • [41] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, and M. Jordan (2019) Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning, pp. 7472–7482.
  • [42] J. Zhang, X. Xu, B. Han, G. Niu, L. Cui, M. Sugiyama, and M. Kankanhalli (2020) Attacks which do not kill training make adversarial learning stronger. In International Conference on Machine Learning, pp. 11278–11287.

Appendix A Appendix

a.1 The Behavior of Adversarial Training

In order to examine the performance of PGD adversarial training under imbalanced scenarios, we adversarially train ResNet18 [17] models on multiple imbalanced training datasets constructed from the CIFAR10 dataset [21]. Similar to the observations discussed in Section 2.1, Figures 7, 8 and 9 show that adversarial training produces a larger performance gap between well-represented classes and under-represented classes than natural training. In particular, in all imbalanced scenarios, the adversarially trained models obtain very low robust accuracy on under-represented classes, which confirms again that adversarial training cannot be applied directly in practical imbalanced scenarios.

(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 7: Class-wise performance of natural & adversarial training under an imbalanced CIFAR10 dataset “Step-10".
(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 8: Class-wise performance of natural & adversarial training under an imbalanced CIFAR10 dataset “Exp-100".
(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 9: Class-wise performance of natural & adversarial training under an imbalanced CIFAR10 dataset “Exp-10".

a.2 Reweighting Strategy in Natural Training v.s. in Adversarial Training

To explore whether the reweighting strategy can help adversarial training deal with imbalance issues, we evaluate the performance of adversarially trained models on diverse binary imbalanced training datasets with different weights assigned to the under-represented class. As shown in Figures 10, 11 and 12, for adversarially trained models, increasing the weight assigned to the under-represented class improves the models' performance on the under-represented class. However, at the same time, the models' performance on the well-represented class drops drastically. As a comparison, adopting larger weights in naturally trained models also improves the models' performance on the under-represented class but only results in a slight drop in performance on the well-represented class. In other words, the reweighting strategy proposed in natural training to handle the imbalance problem may only provide limited help in adversarial training, and, hence, new techniques are needed for adversarial training under imbalanced scenarios.

(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 10: Class-wise performance of reweighted natural & adversarial training in binary classification. (“auto” as well-represented class and “truck” as under-represented class).
(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 11: Class-wise performance of reweighted natural & adversarial training in binary classification. (“bird” as well-represented class and “frog” as under-represented class).
(a) Natural Training Standard Acc.
(b) Adv. Training Standard Acc.
(c) Adv. Training Robust Acc.
Figure 12: Class-wise performance of reweighted natural & adversarial training in binary classification. (“dog” as well-represented class and “deer” as under-represented class).

a.3 Proofs of the Theorems in Section 3

a.3.1 Proof of Lemma 3.1

See 3.1

Proof 1 (Proof of Lemma 3.1)

We first prove by contradiction that the optimal model has a weight vector proportional to $\mathbf{1}_d$. Assume that the optimal classifier $f$ has parameters $(w, b)$ with $w$ not proportional to $\mathbf{1}_d$, i.e., there exist coordinates $i \neq j$ with $w_i \neq w_j$. Then the standard errors of the two classes for this classifier $f$ with weight $w$ are:

$$\mathrm{Err}_{+1}(w,b) = \Phi\Big(-\frac{\theta\,\mathbf{1}_d^\top w + b}{\sigma\|w\|_2}\Big), \qquad \mathrm{Err}_{-1}(w,b) = \Phi\Big(-\frac{\theta\,\mathbf{1}_d^\top w - b}{\sigma\|w\|_2}\Big). \qquad\qquad (8)$$

However, if we define a new classifier whose weight replaces $w$ with $w' = \frac{\|w\|_2}{\sqrt{d}}\mathbf{1}_d$ (so that $\|w'\|_2 = \|w\|_2$ and $\mathbf{1}_d^\top w' = \sqrt{d}\,\|w\|_2 > \mathbf{1}_d^\top w$ by the Cauchy–Schwarz inequality), we obtain the errors of the new classifier:

$$\mathrm{Err}_{+1}(w',b) = \Phi\Big(-\frac{\theta\sqrt{d}\,\|w\|_2 + b}{\sigma\|w\|_2}\Big), \qquad \mathrm{Err}_{-1}(w',b) = \Phi\Big(-\frac{\theta\sqrt{d}\,\|w\|_2 - b}{\sigma\|w\|_2}\Big). \qquad\qquad (9)$$

Comparing the errors in Eq. (8) and Eq. (9) implies that the classifier $(w', b)$ has a smaller error on each class. This contradicts the assumption that $f$ is the optimal classifier with the smallest error. Thus, we conclude that an optimal linear classifier must satisfy $w \propto \mathbf{1}_d$ (and we take $w = \mathbf{1}_d$) if we do not consider the scale of $w$.

Next, we calculate the optimal bias term $b$ given $w = \mathbf{1}_d$, i.e., we find the $b$ that minimizes the (reweighted) empirical risk:

$$\mathcal{R}_\rho(b) = K\cdot\Phi\Big(-\frac{\theta d + b}{\sigma\sqrt{d}}\Big) + \rho\cdot\Phi\Big(-\frac{\theta d - b}{\sigma\sqrt{d}}\Big),$$

and we take the derivative with respect to $b$:

$$\frac{\partial \mathcal{R}_\rho(b)}{\partial b} = \frac{1}{\sigma\sqrt{d}}\Big[\rho\cdot\phi\Big(\frac{\theta d - b}{\sigma\sqrt{d}}\Big) - K\cdot\phi\Big(\frac{\theta d + b}{\sigma\sqrt{d}}\Big)\Big].$$

When the derivative equals zero, we can calculate the optimal $b$ which gives the minimum value of the empirical risk, and we have:

$$b = \frac{\sigma^2}{2\theta}\log\frac{K}{\rho}.$$

a.3.2 Proof of Theorem 3.1

See 3.1

Proof 2 (Proof of Theorem 3.1)

Without loss of generality, for the distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ with different mean–variance pairs $(\theta_1, \sigma_1)$ and $(\theta_2, \sigma_2)$, we only need to consider the case $\theta_1 = \theta_2 = \theta$ and $\sigma_1 < \sigma_2$ (so that $S_1 > S_2$). Otherwise, we can simply rescale one of them to match the mean vector of the other, which will not impact the results. Under this definition, the optimal classifiers $f_1$ and $f_2$ have weight vector $\mathbf{1}_d$ and bias terms with the values demonstrated in Lemma 3.1 (with $\rho = 1$). Next, we prove Theorem 3.1 in two steps.

Step 1. For the error of class "-1", by Eq. (8) we have:

$$\mathrm{Err}_{\mathcal{D}_2,-1}(f_2) = \Phi\Big(-\sqrt{d}\,S_2 + \frac{\log K}{2\sqrt{d}\,S_2}\Big) \;\geq\; \Phi\Big(-\sqrt{d}\,S_1 + \frac{\log K}{2\sqrt{d}\,S_1}\Big) = \mathrm{Err}_{\mathcal{D}_1,-1}(f_1).$$

Step 2. For the error of class "+1", we have:

$$\mathrm{Err}_{\mathcal{D}_2,+1}(f_2) = \Phi\Big(-\sqrt{d}\,S_2 - \frac{\log K}{2\sqrt{d}\,S_2}\Big), \qquad\qquad (10)$$

and similarly,

$$\mathrm{Err}_{\mathcal{D}_1,+1}(f_1) = \Phi\Big(-\sqrt{d}\,S_1 - \frac{\log K}{2\sqrt{d}\,S_1}\Big). \qquad\qquad (11)$$

Note that when the imbalance ratio $K$ is large enough, the Z-score in Eq. (10) is larger in magnitude than that in Eq. (11). As a result, we have:

$$\mathrm{Err}_{\mathcal{D}_2,+1}(f_2) \;\leq\; \mathrm{Err}_{\mathcal{D}_1,+1}(f_1). \qquad\qquad (12)$$

By combining Step 1 and Step 2, we obtain the inequality in Theorem 3.1.

a.3.3 Proof of Theorem 3.2

See 3.2

Proof 3 (Proof of Theorem 3.2)

We first show that under both distributions $\mathcal{D}_1$ and $\mathcal{D}_2$, the optimal reweighting ratio is equal to the imbalance ratio $K$. Based on the results in Eq. (8) and the model parameters calculated in Lemma 3.1, the balanced test error of the model trained with reweighting value $\rho$ is:

$$\mathrm{Err}(\hat{f}) = \frac{1}{2}\,\Phi\Big(-\sqrt{d}\,S - \frac{\log(K/\rho)}{2\sqrt{d}\,S}\Big) + \frac{1}{2}\,\Phi\Big(-\sqrt{d}\,S + \frac{\log(K/\rho)}{2\sqrt{d}\,S}\Big).$$

This value takes its minimum when its derivative with respect to $\rho$ is equal to zero, from which we obtain $\rho = K$ and the corresponding bias term $b = 0$. Note that the variance values satisfy $\sigma_1 < \sigma_2$ (equivalently, $S_1 > S_2$). Therefore, for the optimal reweighted classifiers $\hat{f}_1$ and $\hat{f}_2$, it is easy to get that:

$$\mathrm{Err}_{\mathcal{D}_2,+1}(\hat{f}_2) = \Phi\big(-\sqrt{d}\,S_2\big) \;\geq\; \Phi\big(-\sqrt{d}\,S_1\big) = \mathrm{Err}_{\mathcal{D}_1,+1}(\hat{f}_1). \qquad\qquad (13)$$

Combining the results in Eq. (12) and Eq. (13), we have proved the inequality in Theorem 3.2.

a.4 Algorithm of SRAT

The algorithm of our proposed SRAT framework is shown in Algorithm 1. Specifically, in each training iteration, we first generate adversarial examples using PGD for the examples in the current batch (Line 5). If the current training epoch has not reached a predefined starting reweighting epoch, we assign the same weight, i.e., $w_i = 1$, to all adversarial examples in the current batch (Line 6). Otherwise, the reweighting strategy is adopted in the final loss function (Line 15), where a specific weight is assigned to each adversarial example according to whether its corresponding clean example comes from an under-represented class.

0:  Input: imbalanced training dataset $\mathcal{D}$, number of total training epochs $T$, starting reweighting epoch $T_{rw}$, batch size $m$, number of batches $M$, learning rate $\eta$
0:  Output: an adversarially robust model $f_\theta$
1:  Initialize the model parameters $\theta$ randomly;
2:  for epoch $= 1, \dots, T_{rw} - 1$ do
3:     for mini-batch $= 1, \dots, M$ do
4:        Sample a mini-batch $B$ from $\mathcal{D}$;
5:        Generate an adversarial example for each clean example in $B$ via PGD;
6:        Compute the loss in Eq. (7) with $w_i = 1$ for every adversarial example;
7:        Update the model parameters $\theta$ with learning rate $\eta$;
8:     end for
9:     Optional: decay the learning rate $\eta$;
10:  end for
11:  for epoch $= T_{rw}, \dots, T$ do
12:     for mini-batch $= 1, \dots, M$ do
13:        Sample a mini-batch $B$ from $\mathcal{D}$;
14:        Generate an adversarial example for each clean example in $B$ via PGD;
15:        Compute the loss in Eq. (7) with $w_i$ set by the reweighting strategy (e.g., class-balanced weights);
16:        Update the model parameters $\theta$ with learning rate $\eta$;
17:     end for
18:     Optional: decay the learning rate $\eta$;
19:  end for
Algorithm 1 Separable Reweighted Adversarial Training (SRAT).
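A compact code-level sketch of Algorithm 1 is given below, reusing the `pgd_attack`, `srat_weights`, `srat_loss` and `make_optimizer_and_scheduler` helpers sketched in earlier sections; the `return_features=True` model interface is an assumption, not the authors' implementation.

```python
def train_srat(model, loader, class_counts, epochs=200, start_reweight=160,
               lam=2.0, device="cuda"):
    """A code-level sketch of Algorithm 1 (not the authors' implementation)."""
    optimizer, scheduler = make_optimizer_and_scheduler(model)
    class_counts = class_counts.to(device)
    for epoch in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            model.eval()
            x_adv = pgd_attack(model, x, y)                     # Lines 5 / 14
            model.train()
            w = srat_weights(y, class_counts, epoch,
                             start_epoch=start_reweight)        # Lines 6 / 15
            logits, feats = model(x_adv, return_features=True)  # assumed model API
            loss = srat_loss(logits, feats, y, w, lam=lam)      # Eq. (7)
            optimizer.zero_grad()
            loss.backward()                                     # Lines 7 / 16
            optimizer.step()
        scheduler.step()                                        # Lines 9 / 18
    return model
```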

a.5 Data Distribution of Imbalanced Training Datasets

In our experiments, we construct multiple imbalanced training datasets to simulate various kinds of imbalanced scenarios by combining different imbalance types (i.e., Exp and Step) with different imbalance ratios (i.e., 10 and 100). Figure 13 and Figure 14 show the data distributions of all ten-class imbalanced training datasets used in our preliminary studies and experiments, based on the CIFAR10 [21] and SVHN [27] datasets, respectively.

(a) Step-10
(b) Step-100
(c) Exp-10
(d) Exp-100
Figure 13: Data distribution of imbalanced training datasets constructed from CIFAR10 dataset.
(a) Step-10
(b) Step-100
(c) Exp-10
(d) Exp-100
Figure 14: Data distribution of imbalanced training datasets constructed from SVHN dataset.

a.6 Performance Comparison on Imbalanced SVHN Datasets

Table 3 and Table 4 show the performance comparison on various imbalanced SVHN datasets with different imbalance types and imbalance ratios. We use bold values to denote the highest accuracy among all methods, and we underline the values where an SRAT variant achieves higher accuracy than all of its corresponding baseline methods utilizing the same loss function for making predictions.

From Table 3 and Table 4, we make a similar observation: compared with the baseline methods, our proposed SRAT method produces robust models with improved overall performance when the training dataset is imbalanced. In addition, based on the experimental results in Tables 1 to 4, we find that, compared with the improvement between DRCB-LDAM and SRAT-LDAM, the improvements between DRCB-CE and SRAT-CE and between DRCB-Focal and SRAT-Focal are more obvious. A possible reason behind this phenomenon is that the LDAM loss can also implicitly produce a more separable feature space [4], while the CE loss and the Focal loss do not perform any specific operation on the latent feature space. Hence, the feature separation loss contained in SRAT-CE and SRAT-Focal could be more effective in learning a separable feature space and in facilitating the CE or Focal loss for prediction. However, in SRAT-LDAM, the feature separation loss and the LDAM loss may affect each other when learning feature representations, and hence the effectiveness of the feature separation loss may be counteracted or weakened.

In conclusion, experiments conducted on multiple imbalanced datasets verify the effectiveness of our proposed SRAT method under various imbalanced scenarios.

Imbalance Ratio 10 100
Metric Standard Accuracy Robust Accuracy Standard Accuracy Robust Accuracy
Method Overall Under Overall Under Overall Under Overall Under
CE 79.88 67.04 37.62 22.08 59.61 26.19 29.57 5.03
Focal 79.96 67.03 37.83 22.47 60.58 28.17 30.27 5.83
LDAM 84.55 74.96 45.80 31.23 65.61 37.13 33.34 8.36
CB-Reweight 79.48 66.07 37.38 21.66 60.23 27.68 29.54 5.32
CB-Focal 80.29 67.56 38.10 23.00 60.73 28.37 30.09 5.75
DRCB-CE 80.62 68.74 37.25 22.79 60.67 28.36 30.02 5.59
DRCB-Focal 79.11 65.72 37.01 22.02 61.65 30.29 30.78 7.06
DRCB-LDAM 87.83 82.63 46.45 35.15 63.78 33.99 33.60 7.28
SRAT-CE 82.89 72.79 38.23 24.70 63.39 33.85 29.64 6.11
SRAT-Focal 85.05 77.10 39.51 28.06 70.12 47.44 32.18 11.08
SRAT-LDAM 87.65 82.62 46.03 34.75 71.56 50.33 33.54 11.63
Table 3: Performance Comparison on Imbalanced SVHN Datasets (Imbalanced Type: Step)
Imbalance Ratio 10 100
Metric Standard Accuracy Robust Accuracy Standard Accuracy Robust Accuracy
Method Overall Under Overall Under Overall Under Overall Under
CE 87.54 82.67 44.12 35.33 72.51 56.30 33.34 16.93
Focal 87.82 83.01 44.88 35.97 72.61 56.48 34.09 17.62
LDAM 90.06 86.69 51.84 43.73 79.11 66.86 40.42 25.18
CB-Reweight 87.66 82.79 44.39 35.53 72.25 55.97 33.36 17.16
CB-Focal 87.86 82.96 44.61 35.55 73.23 57.34 34.25 17.90
DRCB-CE 88.49 84.51 43.82 36.28 73.74 58.03 33.52 17.68
DRCB-Focal 87.47 82.78 42.52 34.31 71.95 55.11 33.43 17.63
DRCB-LDAM 91.24 89.65 52.39 46.71 80.29 69.23 40.16 24.64
SRAT-CE 88.70 84.94 44.54 36.59 77.11 64.47 34.48 19.91
SRAT-Focal 89.51 85.42 45.37 37.20 80.04 69.54 35.25 23.04
SRAT-LDAM 91.27 89.55 52.10 46.13 80.71 70.49 40.33 25.11
Table 4: Performance Comparison on Imbalanced SVHN Datasets (Imbalanced Type: Exp)