Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a commonly used method to validate human users. Image-classification-based tests are intentionally designed to make bots fail to classify the images. Deep Neural Network (DNN) based methods [1, 2], which have recently been proven successful in automated image classification, have been found useful for bypassing the CAPTCHA security process. However, these methods are vulnerable to specially generated adversarial examples, which can be used in CAPTCHAs and similar applications.
An adversarial attack perturbs the input image by adding a non-random, network- and input-specific noise to make its automated classification difficult. This artificial noise also makes it more difficult for legitimate users to classify the adversarial images, especially when they are time limited. So, two desired attributes of adversarial images are: (i) they should successfully fool machine learning systems; (ii) they should introduce as little perceptual noise as possible so that they do not pose any additional challenge to humans. In this letter, we propose a method for perceptual enhancement of adversarial images to make them closer to their noise-free originals and easier for humans to process.
2 Proposed Method
The inputs of conventional DNNs are RGB images, and the attacks add noise to all three channels separately. Adding independent and different amounts of noise to these channels results in artificial colours being introduced, as shown in Fig. 1(b), (d) and (f). In addition, as the attack modifies each pixel independently, it manifests itself as a visually distracting, coloured, snow-like high-frequency noise. On the other hand, the main distinguishing features (such as shape and texture) of an object class can be obtained from the luminance, and adversarial noise added to the luminance channel is expected to be more detrimental to network performance than noise in the colour channels. Hence, we claim that lower noise levels can be obtained by concentrating the attack on the luminance channel, which in effect is expected to reduce the coloured snow-like noise.
As conventional networks work with RGB images, the adversarial noise calculation inherently makes use of the R, G and B channels. For the original image $x$, the attack algorithm calculates the adversarial noise $\eta$ separately for each channel. This noise is then added to the respective channels of the original image to obtain the adversarial image: $x_{adv} = x + \eta$. In this work, we first convert the image and the adversarial noise into the YUV domain to obtain $x^{YUV}$ and $\eta^{YUV}$ respectively. Then the U and V coefficients of the noise, $\eta_U$ and $\eta_V$, are scaled by a factor $\lambda$. Assuming that the target object is closer to the centre of the image, all noise channels are filtered with a 2D Gaussian kernel centred on the image, so that the noise gradually diminishes towards the edges. The resulting noise $\eta'^{YUV}$ is added in the YUV colour space: $x_{adv}^{YUV} = x^{YUV} + \eta'^{YUV}$. The image is then converted back into RGB to allow processing by conventional networks. This process reduces the total amount of noise added to the original image, which might cause the adversarial attack to fail. Hence, an iterative process is used, as described in Alg. 1, to find a stronger attack. Although a stronger attack increases the noise, the overall noise is lower due to the subsequent scaling of the chrominance values and the use of the Gaussian kernel.
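The chrominance scaling and centre-weighted Gaussian windowing described above can be sketched as follows. This is a minimal NumPy sketch, not the letter's implementation: the function names, the BT.601 RGB/YUV conversion matrices, and the defaults are our assumptions.

```python
import numpy as np

def centered_gaussian(h, w, sigma=190.0):
    """2D Gaussian window peaking at the image centre, normalised to max 1."""
    ys = np.arange(h) - (h - 1) / 2.0
    xs = np.arange(w) - (w - 1) / 2.0
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2.0 * sigma ** 2))
    return g / g.max()

# BT.601 RGB<->YUV matrices (assumed; the letter does not specify the standard)
RGB2YUV = np.array([[ 0.299,    0.587,    0.114  ],
                    [-0.14713, -0.28886,  0.436  ],
                    [ 0.615,   -0.51499, -0.10001]])
YUV2RGB = np.linalg.inv(RGB2YUV)

def enhance_noise(noise_rgb, lam, sigma=190.0):
    """Scale the chrominance of the adversarial noise by `lam` and apply
    a centre-weighted Gaussian window, then return the noise in RGB."""
    h, w, _ = noise_rgb.shape
    noise_yuv = noise_rgb @ RGB2YUV.T        # per-pixel RGB -> YUV
    noise_yuv[..., 1:] *= lam                # attenuate U and V channels
    noise_yuv *= centered_gaussian(h, w, sigma)[..., None]
    return noise_yuv @ YUV2RGB.T             # back to RGB for the network

# x_adv = x + enhance_noise(eta, lam=0.5)
```

With $\lambda = 0$ the result is purely luminance noise; with $\lambda = 1$ only the Gaussian window has an effect.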
3 Dataset: NIPS 2017 Adversarial Learning Development Set
The dataset consists of 1000 images at 299x299 resolution. Each image corresponds to a different one of the 1000 ImageNet categories. Image pixels are scaled to the input range expected by the networks. All the images are used in the experiments, and the overall distances are reported as averages over all images.
4 Experimental Setup
$L_0$, $L_2$, and $L_\infty$ distances are most commonly used to measure the perturbation added to the original image. The $L_0$ distance counts the number of pixels altered during the adversarial process, while the $L_\infty$ distance gives the maximum change introduced by the perturbation. Since our method aims at perceptual enhancement, we calculate the $L_2$ metric over all the channels (1) in order to measure the total perturbation:

$L_2(x, x_{adv}) = \sqrt{\sum_{c} \sum_{i=1}^{W} \sum_{j=1}^{H} (x_{i,j,c} - x_{adv,i,j,c})^2}$ (1)

where $x$ is the original image, $x_{adv}$ is the adversarial image, $W$ is the width, $H$ is the height of the image, and $c$ indexes the colour channels. The $L_2$ distance is a better indicator of the overall adversarial noise (high-frequency noise which is distracting to the human visual system) than $L_0$ and $L_\infty$.
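The metric in (1) is straightforward to compute; a short sketch (the function name is ours):

```python
import numpy as np

def l2_distance(x, x_adv):
    """L2 distance between two images, summed over all pixels and channels,
    as in equation (1)."""
    diff = x.astype(np.float64) - x_adv.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))
```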
Fast Gradient Sign Method (FGSM), Momentum Iterative Method (MIM) and Carlini & Wagner (C&W) attacks were used for the experimental evaluation of the proposed method, as they are well-known milestone attacks.
FGSM is a one-step gradient-based approach designed to be fast. For a given image $x$ and corresponding target $y$, it calculates the gradient of the loss $J(x, y)$, generally cross-entropy, with respect to $x$, and multiplies the negative of the gradient sign by a constant $\epsilon$ to generate the adversarial noise. This noise is then added to the image to obtain the adversarial example (2):

$x_{adv} = x - \epsilon \cdot \mathrm{sign}(\nabla_x J(x, y))$ (2)
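As a toy illustration of the targeted FGSM step in (2), the following sketch uses a linear softmax classifier in place of a DNN; the model, names and clipping range are illustrative assumptions, not the letter's setup.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgsm(x, y_target, W, b, eps):
    """Targeted FGSM step for a linear softmax classifier: step against the
    sign of the cross-entropy gradient towards the target class."""
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[y_target] = 1.0
    grad_x = W.T @ (p - onehot)          # d(cross-entropy)/dx for this model
    return np.clip(x - eps * np.sign(grad_x), 0.0, 1.0)
```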
MIM is an iterative version of FGSM, designed to find the minimum adversarial example over $T$ iterations. At each iteration, MIM updates the accumulated gradient $g_{t+1}$ (3) using the current $L_1$-normalised gradient of the loss (softmax cross-entropy) and the previous accumulated gradient multiplied by a decay factor $\mu$:

$g_{t+1} = \mu \cdot g_t + \frac{\nabla_x J(x_t^{adv}, y)}{\lVert \nabla_x J(x_t^{adv}, y) \rVert_1}$ (3)

In this way, a momentum term is introduced to make the attack more resilient to small humps, narrow valleys, and poor local minima or maxima. The next adversarial example is then obtained by subtracting the sign of $g_{t+1}$ multiplied by a constant $\alpha$ (4):

$x_{t+1}^{adv} = x_t^{adv} - \alpha \cdot \mathrm{sign}(g_{t+1})$ (4)
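A single MIM update, combining (3) and (4), can be sketched as follows (the function name, the small stabilising constant and the clipping to [0, 1] are our assumptions):

```python
import numpy as np

def mim_step(g_prev, grad, x_adv, alpha, mu):
    """One MIM update: accumulate the L1-normalised loss gradient with
    decay factor mu (3), then take a signed step of size alpha (4)."""
    g = mu * g_prev + grad / (np.sum(np.abs(grad)) + 1e-12)
    x_next = np.clip(x_adv - alpha * np.sign(g), 0.0, 1.0)
    return g, x_next
```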
The C&W attack aims to find the lowest perturbation in the $L_2$ distance metric, also in an iterative manner. At each iteration, the attack finds the perturbation $\delta$ for a given input image $x$ and target class $t$ by solving (5):

$\min_{\delta} \lVert \delta \rVert_2 + c \cdot f(x + \delta)$ (5)

where $c$ is a constant and $f$ is defined as in (6):

$f(x') = \max(\max\{Z(x')_i : i \neq t\} - Z(x')_t, -\kappa)$ (6)

Here $Z$ is the activation function and $\kappa$ is the confidence parameter (how confident the classifier should be that the generated adversarial image is a sample of the target class). In this work, we use a non-targeted setup, so that $t$ is any incorrect class.
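The $f$-function in (6) can be computed directly from the logits; a short sketch (the `targeted` flag and the names are our additions, with the untargeted branch mirroring the non-targeted setup described above):

```python
import numpy as np

def cw_objective(logits, t, kappa=0.0, targeted=True):
    """C&W f-function (6). `logits` are the activations Z, `t` the class
    index, `kappa` the confidence. In the untargeted case, t is the true
    class to move away from."""
    others = np.max(np.delete(logits, t))    # best logit among classes != t
    if targeted:
        return max(others - logits[t], -kappa)
    return max(logits[t] - others, -kappa)
```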
The Cleverhans module was used for implementing the attacks. Each attack was performed in an untargeted setup and evaluated against three different pretrained defence network architectures: Inception v3 (IncV3), InceptionResNet v2 (IncresV2), and ResNet50 v3 (Res50V3).
The experiments require that all attacks are successful, i.e., the adversarial image generated by the attack network is misclassified by the defence network. To this end, the parameter $\epsilon$ is used for the FGSM and MIM attacks, and the iteration parameter is used for C&W, to find the minimum perturbation making the attack successful for each image. The images are downscaled to 224x224 for Res50V3 and kept at their original resolution (299x299) for IncV3 and IncresV2. For all attack types, the Gaussian kernel size is set to match the size of the image, with a standard deviation of 190.
For the FGSM attack, $\epsilon$ is set to 10.0 at the first iteration. If the attack fails, $\epsilon$ is increased by 5.0; once it is successful, $\epsilon$ is decreased by 0.025 until the minimum $\epsilon$ that still makes the defence network misclassify the adversarial image is found.
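The $\epsilon$ search schedule described above can be sketched as follows, with `attack_succeeds` standing in for a full attack-plus-defence evaluation (this helper and its signature are hypothetical):

```python
def find_min_eps(attack_succeeds, eps0=10.0, up=5.0, down=0.025):
    """Search for the smallest eps that still fools the defence network.
    `attack_succeeds(eps)` is a hypothetical callback returning True when
    the adversarial image built with that eps is misclassified."""
    eps = eps0
    while not attack_succeeds(eps):          # grow until the attack works
        eps += up
    while eps - down > 0 and attack_succeeds(eps - down):
        eps -= down                          # shrink to the minimum working eps
    return eps
```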
The C&W attack is initialized by setting the confidence parameter $\kappa$ to zero. Then the iteration parameter is increased, as long as the attack is successful, to find the minimum $L_2$ distance.
For the MIM attack, $\epsilon$ is set to 0.018 for the first iteration and decreased by 0.001 until the minimum $L_2$ distance is obtained.
5 Experimental Results
The results are shown in Table 5 for different $\lambda$ values, where baseline refers to the original unmodified attack. Note that the case where $\lambda$ is 1 still reduces the noise due to the Gaussian smoothing. When $\lambda$ is 0, no noise is added to the colour channels.
Fig. 1 shows baseline adversarial images and the images obtained with the proposed method for FGSM, C&W and MIM attacks.
Fig. 2 shows the $L_2$ distance improvements as a percentage of the baseline attacks. The largest improvement, 41.27%, is obtained for FGSM using Res50V3, and the smallest, 5.88%, for C&W using IncresV2. On average, a 22% improvement is achieved over all attack and network types.
When we reduce the noise in the U and V bands, the adversarial images look perceptually better. However, in order to achieve a 100% attack success rate, stronger attacks, which increase the noise in Y, are needed as a trade-off. Nevertheless, as can be seen in Table 5, lower $L_2$ distances are still obtained for all attack types and for all networks. It has to be noted that the $\lambda$ value giving the best result differs between attacks. For FGSM, the same $\lambda$ value gives the best results for IncresV2 and Res50V3, while a different value is best for IncV3. For C&W, a single $\lambda$ value gives the best results for IncresV2 and Res50V3; although a different value is best for IncV3, the performance difference is relatively small, so in practice a single $\lambda$ can be used for all network types in question. For MIM, a single $\lambda$ value gives the best results for all network types in question.
The results show that the proposed method works independently of the attack type and the network model, and reduces the $L_2$ distances. Even though the C&W and MIM attacks are optimized to minimize the $L_2$ distance by design, our method still yields lower values. While this might sound contradictory, it has to be noted that, due to the nature of the networks, this optimization is performed on RGB values in the original attacks and might not be optimal when the YUV domain is considered. The proposed method reduces the noise in the U and V channels, which is compensated by increased noise in the Y channel. This strategy reduces the amount of perceptible colour noise as well as the total noise, as indicated by the $L_2$ distances calculated using the RGB channels.
Since C&W and MIM generate the adversarial noise in an iterative manner, both produce lower $L_2$ distances than FGSM. The C&W attack achieved the best $L_2$ distances except when using Res50V3 as the attack network, for which the MIM attack achieved the best $L_2$ distance.
6 Conclusion
We proposed an attack- and network-type-agnostic perceptual enhancement method that converts the adversarial noise to the YUV colour space, reduces the chrominance noise, and applies Gaussian smoothing to the adversarial noise. The adversarial images are not only perceptually better but also have lower $L_2$ distances to the original images. Conventional networks are trained using images in the RGB colour space and, inherently, the optimization is done in this colour space. In the future, these networks could be trained using images in the YUV colour space; attacks on such networks could then be performed intrinsically in the YUV space.
The proposed method assumes that the object is located near the centre of the image, and the Gaussian kernel is positioned accordingly. However, the object could be off-centre, which would invalidate this assumption. In the future, class activation maps, which can be obtained directly through the attack network, could be used to estimate the centre position of the object. This would allow the Gaussian kernel to be positioned to overlap better with the object.
Bilgin Aksoy and Alptekin Temizel (Informatics Institute, Middle East Technical University, Ankara, Turkey)
[1] Stark, F., Hazirbas, C., Triebel, R., Cremers, D.: 'CAPTCHA Recognition with Active Deep Learning', GCPR Workshop on New Challenges in Neural Computation, 2015, 10
[2] Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V.: 'Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks', International Conference on Learning Representations, 2014
[3] Szegedy, C., Zaremba, W., Sutskever, I., et al.: 'Intriguing properties of neural networks', arXiv preprint arXiv:1312.6199, 2013
[4] Elsayed, G., Shankar, S., Cheung, B., et al.: 'Adversarial Examples that Fool both Computer Vision and Time-Limited Humans', Advances in Neural Information Processing Systems, 2018, pp. 3914-3924
[5] Aydemir, A.E., Temizel, A., Taskaya Temizel, T.: 'The effects of JPEG and JPEG2000 compression on attacks using adversarial examples', arXiv preprint arXiv:1803.10418, 2018
[6] 'NIPS 2017 Adversarial Learning Development Set', https://www.kaggle.com/google-brain/nips-2017-adversarial-learning-development-set, accessed July 2018
[7] Goodfellow, I.J., Shlens, J., Szegedy, C.: 'Explaining and Harnessing Adversarial Examples', arXiv preprint arXiv:1412.6572, 2014
[8] Dong, Y., Liao, F., Pang, T., et al.: 'Boosting adversarial attacks with momentum', Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9185-9193
[9] Carlini, N., Wagner, D.: 'Towards Evaluating the Robustness of Neural Networks', IEEE Symposium on Security and Privacy (SP), 2017, pp. 39-57
[10] Papernot, N., Faghri, F., Carlini, N., et al.: 'Technical report on the CleverHans v2.1.0 adversarial examples library', arXiv preprint arXiv:1610.00768, 2016
[11] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: 'Rethinking the inception architecture for computer vision', Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818-2826
[12] Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: 'Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning', Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017, pp. 4278-4284
[13] He, K., Zhang, X., Ren, S., Sun, J.: 'Deep residual learning for image recognition', IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778
[14] Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: 'Learning Deep Features for Discriminative Localization', IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2921-2929