Adversarial CAPTCHAs

01/04/2019 · Chenghui Shi, et al. · Zhejiang University and Georgia Institute of Technology

Following the principle of "setting one's own spear against one's own shield," we study how to design adversarial CAPTCHAs in this paper. We first identify the similarities and differences between adversarial CAPTCHA generation and existing research on adversarial example (image) generation. Then, we propose a framework for text-based and image-based adversarial CAPTCHA generation on top of state-of-the-art adversarial image generation techniques. Finally, we design and implement an adversarial CAPTCHA generation and evaluation system, named aCAPTCHA, which integrates 10 image preprocessing techniques, 9 CAPTCHA attacks, 4 baseline adversarial CAPTCHA generation methods, and 8 new adversarial CAPTCHA generation methods. To examine the performance of aCAPTCHA, we conducted extensive security and usability evaluations. The results demonstrate that the generated adversarial CAPTCHAs can significantly improve the security of normal CAPTCHAs while maintaining similar usability. To facilitate CAPTCHA security research, we also open source the aCAPTCHA system, including the source code, trained models, datasets, and the usability evaluation interfaces.


1 Introduction

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a type of challenge-response test in computing used to distinguish humans from automated programs (machines). The first generation of CAPTCHA was invented in 1997, while the term “CAPTCHA” was first coined in 2002 [1][2]. Ever since its invention, CAPTCHA has been widely used to improve the security of websites and various online applications and to prevent the abuse of online services, e.g., by phishing, bots, spam, and Sybil attacks.

Existing CAPTCHA Schemes.

In general, existing popular CAPTCHAs can be classified into four categories:

(1) Text-based CAPTCHA. Text-based CAPTCHA schemes ask users to recognize a string of distorted characters with or without an obfuscated background [7][8]. Due to its simplicity and high efficiency, text-based CAPTCHA is the most widely deployed and accepted form, both now and in the foreseeable future [7][8].

(2) Image-based CAPTCHA. Image-based CAPTCHA is another popular scheme which usually asks users to select one or more images with specific semantic meanings from a set of candidate images [25]. It is motivated by the intuition that, compared with a string of characters, images carry much richer information and have a larger variation space. Meanwhile, there are still many hard, open problems in image perception and interpretation, especially in noisy environments. Thus, to some extent, image-based CAPTCHA is more secure than text-based CAPTCHA. Nevertheless, to the best of our knowledge, a comprehensive comparative analysis of the security and usability of text- and image-based CAPTCHAs is still missing. Recently, many variants of image-based CAPTCHAs have been proposed, such as slide-based CAPTCHAs, which ask users to slide a puzzle piece to the correct position in an image [56], and click-based CAPTCHAs, which ask users to click specific semantic regions of an image [55].

(3) Audio-based CAPTCHA. Audio-based CAPTCHA asks users to recognize the spoken content in a piece of audio [1][2]. In most practical applications, audio-based CAPTCHA is used together with text-based CAPTCHA as a complementary means, mainly because of usability issues, especially for non-native speakers of the audio language.

(4) Video-based CAPTCHA. Video-based CAPTCHA is a new kind of CAPTCHA that asks users to finish a content-based video labeling task [34]. It is usually more complex and takes users more time to finish correctly compared with other forms of CAPTCHAs. Thus, it is not widely adopted and is seldom seen in practice.

There are also other proposals for CAPTCHA design, e.g., game-based CAPTCHA [54] and inference-based CAPTCHA [57]. However, they are not widely deployed yet due to various reasons, e.g., security issues, accessibility limitations, and performance issues. In this paper, our study mainly focuses on text- and image-based CAPTCHAs. The reason is evident: they are the most accepted and widely used CAPTCHAs, both now and in the foreseeable future, so the study of their security and usability has greater potential implications for practical applications.

Issues of CAPTCHAs and Motivation. Generally speaking, a CAPTCHA can be evaluated according to its security performance, which refers to its strength and resilience against various attacks, and its usability performance, which refers to how user-friendly it is [1][2]. From the security perspective, it is not news to see reports that a CAPTCHA scheme has been broken by some attack [1][2]. The evolution of CAPTCHAs moves forward in a spiral, constantly accompanied by emerging attacks. For text-based CAPTCHAs, the security goal of the earliest versions was to defend against Optical Character Recognition (OCR) based attacks; therefore, many distortion techniques (e.g., varied fonts, varied font sizes, and rotation) were applied. Over the last decade, machine learning algorithms have become more and more powerful. Following the seminal work demonstrating that computers can outperform humans in recognizing characters, even under severe distortion, many successful attacks on text-based CAPTCHAs were proposed, including both generic attacks that target multiple text-based CAPTCHAs [7][8] and specialized attacks that target one particular kind of text-based CAPTCHA [24]. Although it is possible to improve the security of text-based CAPTCHAs by increasing the distortion and obfuscation levels, their usability will be significantly affected [7][8].

The same dilemma exists for image-based CAPTCHAs. With the prosperity of machine learning research, especially recent progress in deep learning, Deep Neural Networks (DNNs) have achieved impressive success in image classification/recognition, matching or even outperforming the cognitive ability of humans in complex tasks with thousands of classes [16]. Along with such progress, many DNN-based attacks have been proposed recently to crack image-based CAPTCHAs with very high success probability, as demonstrated by a large number of reports [31]. To defend against existing attacks, the intuition is to rely on high-level image semantics and develop more complex image-based CAPTCHAs, e.g., recognizing an image object by utilizing its surrounding context [30]. Leaving the security gains aside, such designs usually induce poor usability [1][2]. To make things worse, unlike text-based CAPTCHAs, it is difficult, if not impossible, for designers to generate specific images with the required semantic meanings through rules. In other words, it is too labor-intensive to collect labeled images at large scale.

In summary, existing text- and image-based CAPTCHAs face challenges from both the security and the usability perspectives. It is desirable to develop a new CAPTCHA scheme that achieves high security while preserving proper usability, i.e., one that strikes a better balance between security and usability.

Our Methodology and Contributions. To address the dilemma of existing text- and image-based CAPTCHAs, we start by analyzing state-of-the-art attacks. It is not surprising that most, if not all, attacks on text- and image-based CAPTCHAs are based on machine learning techniques, and the latest and most powerful ones are mainly based on deep learning, typically CNNs. This is mainly because the development of CAPTCHA attacks is rooted in the progress of machine learning research, as discussed before.

On the other hand, with the progress of machine learning research, researchers have found that many machine learning models, especially neural networks, are vulnerable to adversarial examples, which are defined as elaborately (maliciously, from the model’s perspective) crafted inputs that are imperceptible to humans but can fool the machine learning model into producing undesirable behavior, e.g., incorrect outputs [39]. Inspired by this fact, is it possible for us to design a new kind of CAPTCHA by proactively attacking existing CAPTCHA attacks, i.e., “setting one’s own spear against one’s own shield”?

Following this inspiration, we study how to generate text- and image-based CAPTCHAs based on adversarial learning, i.e., text-based and image-based adversarial CAPTCHAs, that are resilient to state-of-the-art CAPTCHA attacks while preserving high usability. Specifically, we have three main objectives in the design: (1) security, which implies that the developed CAPTCHAs can effectively defend against state-of-the-art attacks, especially powerful deep learning based attacks; (2) usability, which implies that the developed CAPTCHAs should be usable in practice and maintain a good user experience; and (3) compatibility, which implies that the proposed CAPTCHA generation scheme is compatible with existing text- and image-based CAPTCHA deployments and applications.

With the above goals in mind, we study how to inject human-tolerable, preprocessing-resilient perturbations (i.e., perturbations that cannot be removed by the preprocessing used in CAPTCHA attacks) into traditional CAPTCHAs. Specifically, we design and implement a novel system, aCAPTCHA, to generate and evaluate text- and image-based adversarial CAPTCHAs.

Our main contributions can be summarized as follows.

(1) Following our design principle, we propose a framework for generating adversarial CAPTCHAs on top of existing adversarial example (image) generation techniques. Specifically, we propose four text-based and four image-based adversarial CAPTCHA generation methods. Then, we design and implement a comprehensive adversarial CAPTCHA generation and evaluation system, named aCAPTCHA, which integrates 10 image preprocessing techniques, 9 CAPTCHA attacks, 4 baseline adversarial CAPTCHA generation methods, and 8 new adversarial CAPTCHA generation methods. aCAPTCHA can be used for the generation, security evaluation, and usability evaluation of both text- and image-based adversarial CAPTCHAs.

(2) To examine the performance of the adversarial CAPTCHAs generated by aCAPTCHA, we conducted extensive security and usability evaluations. The results demonstrate that the generated adversarial CAPTCHAs can significantly improve the security of normal CAPTCHAs while maintaining similar usability.

(3) We open source the aCAPTCHA system at [60], including the source code, trained models, datasets, and the interfaces for usability evaluation. It is expected that aCAPTCHA can facilitate the CAPTCHA security research and can shed light on designing more secure and usable adversarial CAPTCHAs.

2 Background

In this section, we briefly introduce adversarial examples and the corresponding defense technologies.

2.1 Adversarial Example

Neural networks have achieved great performance in a wide range of application domains, especially image recognition. However, recent work has discovered that existing machine learning models, including neural networks, are vulnerable to adversarial examples. Specifically, suppose we have a classifier $f$ with model parameters $\theta$. Let $x$ be an input to the classifier with corresponding ground-truth label $y$. An adversarial example $x'$ is an instance in the input space that is close to $x$ according to some distance metric $D(x, x')$, yet causes the classifier $f$ to produce an incorrect output, i.e., $f(x') \neq y$. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, as long as both models were trained to perform the same task [43].

Prior work considers adversarial examples under a number of threat models, which can be broadly classified into two categories: white-box attacks, where the adversary has full knowledge of the model, including the model architecture and parameters, and black-box attacks, where the adversary has little or no knowledge of the model. The construction of an adversarial example depends mainly on the gradient information of the target model. In the white-box setting [10, 11, 41], the gradient of the model is always visible to the attacker. Thus, it is easy for an attacker to generate adversarial examples. In the black-box setting [43, 44, 50], attackers cannot obtain gradient information directly. There are usually two ways to generate adversarial examples in this case. The first is to approximate the gradient information by query operations [50], i.e., sending an image to the target model and obtaining the output distribution. After many rounds of queries, attackers can approximate the target model’s gradient and generate adversarial examples. The second is to take advantage of the transferability of adversarial examples [43]. As mentioned above, adversarial examples that affect one model can often affect another model. An attacker can train his own local model, generate adversarial examples against the local model with white-box methods, and transfer them to a victim model of which he has limited knowledge. In this paper, we rely on the second approach, which corresponds to the black-box setting, to generate adversarial CAPTCHAs against machine learning based attacks.
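To make the white-box case concrete, the following minimal sketch (Python with NumPy) crafts an adversarial example with the fast gradient sign method (FGSM) against a toy softmax-regression classifier whose gradient can be written analytically. FGSM is not one of the attacks used in this paper; the weights W, b, the input x, and the label y are hypothetical placeholders, and a real attack would target the actual model's gradients.

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fgsm(x, y, W, b, eps=0.1):
    """One-step untargeted FGSM: move x in the direction that increases the loss."""
    p = softmax(W @ x + b)                 # predicted class probabilities
    grad_logits = p.copy()
    grad_logits[y] -= 1.0                  # d(cross-entropy)/d(logits)
    grad_x = W.T @ grad_logits             # chain rule back to the input pixels
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 784)), np.zeros(10)   # hypothetical trained parameters
x, y = rng.random(784), 3                         # hypothetical image and true label
x_adv = fgsm(x, y, W, b)
print("prediction before:", np.argmax(softmax(W @ x + b)),
      "after:", np.argmax(softmax(W @ x_adv + b)))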

2.2 Defense Methods

Due to the security threats caused by adversarial examples, improving the robustness of deep learning networks against adversarial perturbation has been an active field of research. Various defensive techniques against adversarial examples have been proposed. We roughly divide them into three categories.

(1) Adversarial Training[41, 45].

The idea is simple and effective: one retrains the neural network directly on adversarial examples until the model learns to classify them correctly. This makes the network robust against the adversarial examples in the test set and improves the overall generalization capability of the network. However, it does not resolve the problem completely, as adversarial training is only effective against the specific adversarial example generation algorithms used in the retraining phase. Moreover, adversarial training has been shown to be difficult at a large scale, e.g., the ImageNet scale.

(2) Gradient Masking [42, 49]. This method tries to prevent an attacker from accessing the useful gradient information of a model. As mentioned, the construction of an adversarial example depends mainly on the gradient information of the target model; without useful gradient information, it is hard for attackers to mount an attack. However, gradient masking is usually not effective against black-box attacks, because an adversary can run his attack algorithm on an easy-to-attack model and transfer the resulting adversarial examples to the hard-to-attack model.

(3) Input Transformation [51, 47, 48]. This kind of defense generally does not change the structure of the neural network. The main idea is to preprocess or transform the input data, e.g., by image cropping, rescaling, or bit-depth reduction, in order to remove adversarial perturbation, and then feed the transformed image to an unmodified classifier. This method is easy to circumvent in white-box attacks because attackers can mirror the defense in the attack algorithm, e.g., by considering similar operations during adversarial example generation. Against black-box attacks, it can provide good protection. However, input transformation cannot eliminate the adversarial perturbation in the input data; it only decreases the attack success rate.
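As an illustration of this class of defenses, the sketch below (assuming Pillow and NumPy are available) applies median filtering followed by bit-depth reduction before classification; classify() is a hypothetical stand-in for the protected model, not an API of aCAPTCHA.

import numpy as np
from PIL import Image, ImageFilter

def transform_input(img: Image.Image, bit_depth: int = 4) -> Image.Image:
    """Median filtering followed by bit-depth reduction to wash out small perturbations."""
    img = img.filter(ImageFilter.MedianFilter(size=3))
    arr = np.asarray(img).astype(np.float32) / 255.0
    levels = 2 ** bit_depth - 1
    arr = np.round(arr * levels) / levels            # quantize each channel
    return Image.fromarray((arr * 255).astype(np.uint8))

def classify(img: Image.Image) -> int:
    return 0  # placeholder for the real (unmodified) classifier

img = Image.new("RGB", (224, 224), "white")          # placeholder input
label = classify(transform_input(img))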

In general, the vulnerability of neural networks to adversarial perturbation is a fundamental problem, and existing defense methods only mitigate the attacks to some extent. Thus, dedicated in-depth research is expected in this area.

3 System Overview

Fig. 1: System overview of aCAPTCHA.

In this section, we present the system architecture of aCAPTCHA, which is shown in Fig. 1. Basically, it consists of seven modules:

Image Preprocessing (IPP) Module.

In this module, we implement 10 widely used standard image preprocessing techniques for CAPTCHA security analysis, including 9 filters: BLUR, DETAIL, EDGE ENHANCE, SMOOTH, SMOOTH MORE, GaussianBlur, MinFilter, MedianFilter, and ModeFilter, and one standard image binarization method. Basically, all the preprocessing techniques can be used to remove the noise in an image.
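A possible realization of this module is sketched below, assuming the 9 filters are the identically named Pillow (PIL) filters and that binarization is a simple grayscale threshold; the exact implementation in aCAPTCHA may differ.

from PIL import Image, ImageFilter

FILTERS = {
    "BLUR": ImageFilter.BLUR,
    "DETAIL": ImageFilter.DETAIL,
    "EDGE ENHANCE": ImageFilter.EDGE_ENHANCE,
    "SMOOTH": ImageFilter.SMOOTH,
    "SMOOTH MORE": ImageFilter.SMOOTH_MORE,
    "GaussianBlur": ImageFilter.GaussianBlur(radius=2),
    "MinFilter": ImageFilter.MinFilter(size=3),
    "MedianFilter": ImageFilter.MedianFilter(size=3),
    "ModeFilter": ImageFilter.ModeFilter(size=3),
}

def binarize(img: Image.Image, threshold: int = 128) -> Image.Image:
    """Standard binarization: convert to grayscale, then threshold every pixel."""
    return img.convert("L").point(lambda p: 255 if p >= threshold else 0)

def preprocess(img: Image.Image, filter_name: str, with_binarization: bool) -> Image.Image:
    out = img.filter(FILTERS[filter_name])
    return binarize(out) if with_binarization else out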

Text-based CAPTCHA Attack (TCA) Module. In this module, we implement 5 text-based CAPTCHA attacks, including two traditional machine learning based attacks (SVM, KNN) and three state-of-the-art DNN-based attacks (LeNet [12], MaxoutNet [13], and NetInNet [14]). In aCAPTCHA, TCA has two main functions. First, it provides the necessary model information for generating text-based adversarial CAPTCHAs, i.e., for the following TCG module. Second, it can also be employed to evaluate the resilience of text-based CAPTCHAs against actual attacks.

Image-based CAPTCHA Attack (ICA) Module. Similar to TCA, we implement 4 state-of-the-art image-based CAPTCHA attacks in this module (NetInNet [14], VGG [15], GoogleNet [17] and ResNet [18]). It is used to provide necessary model information for generating image-based adversarial CAPTCHAs and for evaluating the resilience of image-based CAPTCHAs against actual attacks.

Text-based Adversarial CAPTCHA Generation (TCG) Module. In this module, we first implement 4 state-of-the-art adversarial example (image) generation algorithms to serve as the baseline. Then, we analyze the limitations of applying existing adversarial image generation techniques to generate text-based adversarial CAPTCHAs. Finally, according to our analysis, we propose 4 new text-based adversarial CAPTCHA generation algorithms.

Image-based Adversarial CAPTCHA Generation (ICG) Module. In this module, we first analyze the limitations of existing adversarial image generation techniques for generating image-based adversarial CAPTCHAs. Then, we implement 4 image-based adversarial CAPTCHA generation algorithms by improving existing techniques.

CAPTCHA Security Evaluation (CSE) Module. Leveraging TCA and ICA, this module is used to evaluate the resilience and robustness of text- and image-based CAPTCHAs against state-of-the-art attacks.

CAPTCHA Usability Evaluation (CUE) Module. This module is mainly used for evaluating the usability of text- and image-based CAPTCHAs.

aCAPTCHA takes a fully modular design and is thus easily extendable: emerging attacks can be freely added to TCA/ICA, and newly proposed adversarial CAPTCHA generation algorithms can be added to TCG/ICG.

3.1 Datasets

In the remainder of this paper, for the text-based evaluation scenario, we employ MNIST (Modified National Institute of Standards and Technology database) [3]. MNIST is a large database of 70,000 handwritten digit images and is widely used by the research community as a benchmark to evaluate text-based CAPTCHA’s security and usability [8][3].
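For concreteness, the following sketch composes a length-4 text CAPTCHA from MNIST digits. It assumes the digit images are already loaded into a NumPy array digits of shape (N, 28, 28) with integer labels, and it omits the distortion step applied by the standard CAPTCHA generator used in the paper.

import numpy as np

def make_captcha(digits: np.ndarray, labels: np.ndarray, length: int = 4, rng=None):
    rng = rng or np.random.default_rng()
    idx = rng.integers(0, len(digits), size=length)
    image = np.concatenate([digits[i] for i in idx], axis=1)  # 28 x (28 * length)
    text = "".join(str(labels[i]) for i in idx)
    return image, text

# Hypothetical usage with placeholder data:
digits = np.zeros((100, 28, 28), dtype=np.uint8)
labels = np.zeros(100, dtype=np.int64)
captcha_img, captcha_text = make_captcha(digits, labels)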

For the image-based evaluation scenario, we employ another benchmark image dataset, ImageNet ILSVRC-2012 (the dataset used for the 2012 ImageNet Large Scale Visual Recognition Challenge) [4][5]. The employed ImageNet ILSVRC-2012 contains 50,000 hand-labeled photographs from 1000 categories, with 50 photographs per category (the dataset used here is actually a subset of ImageNet ILSVRC-2012, which is sufficient for our purpose).

4 Text-based Adversarial CAPTCHAs

With the design goals in mind and following our design principle, we show the design of TCG step by step below.

4.1 Baselines

In fact, CAPTCHAs can be viewed as a special case of images. Then, following our design principle and goals, a straightforward idea is to generate text-based adversarial CAPTCHAs using existing adversarial image generation techniques. Therefore, we implement 4 baseline adversarial image generation algorithms in TCG. Before delving into the details, we define some useful notation.

4.1.1 Notations

We first present the necessary notation in the context of generating adversarial images. To be consistent with existing research, we use the same notation system as that in [11]. We represent a neural network as a function $F(x) = y$, where $x \in [0,1]^{n}$ is the input image (note that $x$ need not be a square image; the setting here is for simplicity) and $y$ is the corresponding output. Define $F$ to be the full neural network including the softmax function, and let $Z(x)$ be the output of all the layers except the softmax, i.e., $F(x) = \mathrm{softmax}(Z(x))$. Accordingly, $F$, which can be viewed as a classifier, assigns $x$ a class label $C(x) = \arg\max_{i} F(x)_{i}$. Let $C^{*}(x)$ be the correct label of $x$.

As in [10][11], we use $L_{p}$ norms to measure the similarity between $x$ and its adversarial version $x'$. Then, $\|x - x'\|_{p} = \big(\sum_{i=1}^{n} |x_{i} - x'_{i}|^{p}\big)^{1/p}$. According to the definition, the $L_{2}$ distance measures the Euclidean distance between $x$ and $x'$; the $L_{0}$ distance measures the number of coordinates $i$ such that $x_{i} \neq x'_{i}$; and the $L_{\infty}$ distance measures the maximum change to any coordinate, i.e., $\|x - x'\|_{\infty} = \max_{i} |x_{i} - x'_{i}|$.
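The three distances can be computed directly with NumPy, as in the small sketch below (x and x_adv are hypothetical flattened images in [0, 1]).

import numpy as np

def l0(x, x_adv):    # number of changed coordinates
    return int(np.count_nonzero(x != x_adv))

def l2(x, x_adv):    # Euclidean distance
    return float(np.linalg.norm(x - x_adv))

def linf(x, x_adv):  # largest per-coordinate change
    return float(np.max(np.abs(x - x_adv)))

x = np.zeros(784); x_adv = x.copy(); x_adv[:3] += 0.5
print(l0(x, x_adv), l2(x, x_adv), linf(x, x_adv))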

4.1.2 Baseline Methods

Recently, many attacks have been proposed to generate adversarial examples (adversarial images in our context) against neural networks [40][38]. For our purpose, those attacks can serve as adversarial CAPTCHA generation methods. In TCG, we implement four such state-of-the-art attacks as our baseline methods.

JSMA. In [10], Papernot et al. proposed the Jacobian-based Saliency Map Attack (JSMA) to generate adversarial images. JSMA is a greedy algorithm. Suppose $t \neq C^{*}(x)$ is the target class of image $x$. Then, to obtain $x'$ such that $C(x') = t$ while $\|x - x'\|_{0}$ is small, JSMA takes the following steps: (1) initialize $x' = x$; (2) based on the gradient of $Z(x')$, compute a saliency map in which each value indicates the impact of the corresponding pixel on the resulting classification; (3) according to the saliency map, select the most important pixel and modify it to increase the likelihood of class $t$; and (4) repeat the above two steps until $C(x') = t$ or more than a set threshold of pixels have been modified.

Note that JSMA is also capable of generating untargeted adversarial images. For that purpose, we only have to: (1) let $t = C^{*}(x)$ and change the goal to finding $x'$ such that $C(x') \neq C^{*}(x)$ while $\|x - x'\|_{0}$ is small; and (2) select for modification the pixel that most decreases the likelihood of class $t$.
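The core greedy step of the untargeted variant can be sketched as follows. The sketch assumes grad_true holds the gradient of the true class's probability with respect to each pixel (obtained from the attacked model, not computed here) and uses a simplified saliency criterion; the full JSMA saliency map also accounts for the remaining classes.

import numpy as np

def untargeted_step(x: np.ndarray, grad_true: np.ndarray, step: float = 1.0,
                    modified: set = None) -> np.ndarray:
    """Greedily bump the single pixel whose increase most lowers p(true class)."""
    modified = modified if modified is not None else set()
    saliency = grad_true.ravel().copy()
    saliency[list(modified)] = np.inf        # do not pick a pixel twice
    pixel = int(np.argmin(saliency))         # most negative gradient entry
    x = x.copy().ravel()
    x[pixel] = np.clip(x[pixel] + step, 0.0, 1.0)
    modified.add(pixel)
    return x.reshape(grad_true.shape)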

Carlini-Wagner Attacks. Aiming at generating high-quality adversarial images, Carlini and Wagner [11] introduced three powerful attacks tailored to the $L_{0}$, $L_{2}$, and $L_{\infty}$ distances, respectively. Basically, all three attacks are optimization-based and can be targeted or untargeted. Taking the untargeted attack as an example, it can be formalized as the optimization problem: minimize $\|\delta\|_{p} + c \cdot \ell(x + \delta)$, such that $x + \delta \in [0,1]^{n}$, i.e., for image $x$, the attack seeks a perturbation $\delta$ that is small in length and meanwhile fools the classifier (here $\ell(\cdot)$ is a surrogate loss that is minimized when $x + \delta$ is misclassified). In the formalization, $c$ is a hyperparameter that balances the two parts of the objective function. The constraint $x + \delta \in [0,1]^{n}$ implies that the generated adversarial image should be valid.

Filter
0.00 13.93 0.00 73.51 0.00 1.38 0.00 83.30
BLUR 5.15 8.27 4.22 20.84 6.25 19.27 6.25 22.52
DETAIL 17.80 11.76 0.00 78.28 2.22 4.22 56.79 83.30
EDGE ENHANCE 9.05 8.27 0.00 2.77 9.89 9.89 26.21 35.13
SMOOTH 43.36 37.71 0.00 64.70 24.31 7.54 28.24 94.15
SMOOTH MORE 37.71 40.46 0.00 37.71 20.84 10.79 19.27 88.58
GaussianBlur 49.70 16.42 0.35 35.13 28.24 22.52 22.52 73.51
MinFilter 0.15 1.38 0.05 0.11 0.02 0.07 0.06 0.15
MedianFilter 24.31 68.99 0.05 35.13 17.80 12.81 12.81 68.99
ModeFilter 20.84 30.40 0.00 22.52 30.40 32.69 0.05 40.46
TABLE I: Performance of baseline algorithms vs LeNet. The original SAR of LeNet is .

4.2 Analysis of Baselines

As discussed before, it intuitively seems that existing adversarial image generation algorithms, e.g., JSMA and the Carlini-Wagner attacks, can be applied to generate adversarial CAPTCHAs directly. Following this intuition, we conduct a preliminary evaluation as follows: (i) leveraging MNIST and standard CAPTCHA generation techniques [2], randomly generate 10,000 CAPTCHAs of length 4, i.e., each CAPTCHA is composed of 4 characters from MNIST, and denote these CAPTCHAs by the set $T$; (ii) suppose LeNet from TCA is the employed CAPTCHA attack, and use LeNet (trained using 50,000 CAPTCHAs for 20,000 rounds with batch size 50) to attack the CAPTCHAs in $T$; the Success Attack Rate (SAR), which is defined as the portion of successfully recognized CAPTCHAs in $T$, is high; (iii) with respect to LeNet, generate the adversarial versions of the CAPTCHAs in $T$ using JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, obtaining four adversarial CAPTCHA sets; (iv) use LeNet, together with possible preprocessing techniques from the IPP module, to attack these four adversarial sets. The corresponding SARs are shown in Table I, where an empty entry means that the corresponding preprocessing is not applied and the binarization columns denote additionally applying the standard image binarization.
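The SAR computation itself is straightforward; the sketch below assumes predictions and ground_truth are hypothetical arrays of shape (num_captchas, 4), and counts a CAPTCHA as broken only if all four characters are recognized correctly.

import numpy as np

def success_attack_rate(predictions: np.ndarray, ground_truth: np.ndarray) -> float:
    fully_recognized = np.all(predictions == ground_truth, axis=1)
    return float(fully_recognized.mean())

preds = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
truth = np.array([[1, 2, 3, 4], [5, 6, 0, 8]])
print(success_attack_rate(preds, truth))   # 0.5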

From Table I, we observe that without image preprocessing, the adversarial CAPTCHAs generated by all the baseline algorithms significantly reduce the SAR of LeNet, dropping it far below its SAR on normal CAPTCHAs. This implies that the idea of applying adversarial CAPTCHAs to defend against modern attacks is promising.

Unfortunately, even leaving usability aside, the security of these adversarial CAPTCHAs can also be significantly degraded by image preprocessing. For instance, the SAR of LeNet on the baseline-generated CAPTCHAs rises substantially after applying the SMOOTH filter, and rises further, to a level similar to its performance on normal CAPTCHAs, after additionally applying image binarization. This implies that the perturbation in these adversarial CAPTCHAs can be removed by image preprocessing, i.e., the perturbations added by the baseline algorithms are not resilient/robust to image preprocessing.

We analyze the reasons from two aspects. From the perspective of breaking CAPTCHAs, text-based CAPTCHAs are monotonous compared with image-based CAPTCHAs: character shape is the only useful information, while other information, such as character colors and background pictures, is useless. Thus, adversaries can employ multiple techniques, e.g., filtering and image binarization, to remove noise and irrelevant information. From the perturbation generation perspective, preprocessing such as filtering and binarization can, in theory, be bypassed with minor modifications of the adversarial example generation algorithm, e.g., adding another convolutional layer with one output channel to the beginning of the neural network that performs similar filtering [52]. However, such modifications hugely increase the noise added to the CAPTCHAs. If only filtering is considered, the adversarial examples generated by such modifications would not affect human recognition; but when both filtering and binarization are considered, the resulting adversarial examples become unrecognizable to humans. Therefore, existing adversarial image generation techniques cannot balance usability and security for text-based CAPTCHAs.
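A tiny numerical illustration of this point: small space-domain perturbations that do not cross the binarization threshold are wiped out entirely by binarization. The values and threshold below are illustrative only.

import numpy as np

def binarize(img: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    return (img >= threshold).astype(np.float32)

rng = np.random.default_rng(0)
x = rng.random((28, 28))                              # placeholder image in [0, 1]
perturbation = rng.uniform(-0.05, 0.05, size=x.shape)
x_adv = np.clip(x + perturbation, 0.0, 1.0)
same = np.mean(binarize(x) == binarize(x_adv))
print(f"{same:.0%} of pixels identical after binarization")  # most pixels are unchanged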

4.3 Adversarial CAPTCHA Generation

In the previous subsection, we analyzed the limitations of existing techniques for generating adversarial CAPTCHAs. Aiming at generating more robust and usable text-based adversarial CAPTCHAs, we propose four new methods in this subsection based on existing techniques.

Our design mainly follows two guidelines. First, according to our analysis, perturbations added in the space domain are fragile to image preprocessing. Therefore, we consider adding perturbations in the frequency domain. A space-domain perturbation can be considered a local change to an image, while a frequency-domain perturbation is a global change, which is more difficult to remove, i.e., frequency-domain perturbation is intuitively more resilient to image preprocessing. Certainly, when conducting frequency-domain perturbation, we should be aware of the possible impact on usability. Second, when generating adversarial CAPTCHAs, instead of trying to add human-imperceptible perturbations, we focus on adding human-tolerable perturbations. This gives us more freedom to design more secure and faster adversarial CAPTCHA generation methods. Specifically, based on JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, we propose 4 text-based adversarial CAPTCHA generation algorithms, which are their frequency-domain counterparts (referred to below as the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$).

Frequency-domain JSMA. We show the design of the frequency-domain JSMA in Algorithm 1. Basically, it follows a similar procedure to the untargeted JSMA. We remark on the differences as follows. First, in Steps 3-4, we transform a CAPTCHA into the frequency domain by the Fast Fourier Transform (FFT) and then compute a saliency map. This enables us to elaborately inject perturbations into a CAPTCHA in the frequency domain, as expected.

Second, after transforming a CAPTCHA into the frequency domain, its high-frequency part usually corresponds to the margins of characters and other non-vital information, while the low-frequency part usually corresponds to the fundamental shape information of the characters. Furthermore, as indicated before, changes made in the frequency domain induce global changes to an image. Therefore, to reduce possible impacts on the usability of a CAPTCHA, we introduce a mask matrix $M$ in Algorithm 1, which has the same size as the frequency-domain representation of the CAPTCHA. $M$ has values of 1 in the high-frequency part and 0 in the low-frequency part. Then, as shown in Steps 5-6, we filter out the coefficients in the low-frequency part and only consider changing those in the high-frequency part.

Third, after selecting the candidate coefficient, instead of modifying one coefficient at a time as in JSMA, we modify the candidate coefficient and its neighbors, as shown in Step 7. This design is based on the fact that close coefficients in the frequency domain exhibit partial similarity [58], i.e., neighboring coefficients in the frequency domain have very similar properties and features. Therefore, modifying the candidate coefficient and its neighbors significantly accelerates the adversarial CAPTCHA generation process without harming its quality (recall that we target human-tolerable rather than minimal perturbations).

Finally, we apply an Inverse FFT (IFFT) to transform the CAPTCHA from the frequency domain back to the space domain, as shown in Step 8.

Frequency-domain $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$. Basically, the frequency-domain $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ follow similar procedures to the original $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, respectively, except that all perturbations are computed in the frequency domain. The differences are the same as those between the frequency-domain JSMA and JSMA. Therefore, we omit their algorithm descriptions here while implementing them in TCG.

Input: $x$, an original CAPTCHA; $y = C^{*}(x)$, the label of $x$; $F$, a classifier; $M$, the mask.
Output: $x'$, an adversarial CAPTCHA
1       $x' \leftarrow x$;
2       while $C(x') = y$ do
3             $X \leftarrow \mathrm{FFT}(x')$;
4             compute a saliency map $S$ based on the gradient of the likelihood of $y$ w.r.t. $X$;
5             $S \leftarrow S \odot M$;
6             based on $S$, select the coefficient, denoted by $p$, that most decreases the likelihood of $y$;
7             modify $p$ and its neighbors to decrease the likelihood of $y$;
8             $x' \leftarrow \mathrm{IFFT}(X)$;
Algorithm 1: Frequency-domain JSMA
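The frequency-domain skeleton of Algorithm 1 (FFT, high-frequency mask, local modification, IFFT) can be sketched with NumPy as follows. The saliency-guided coefficient selection requires model gradients, so a randomly chosen masked coefficient is perturbed here purely as a placeholder for Steps 4-7; the mask follows the paper's convention of treating the inner area (of the unshifted FFT layout) as the high-frequency part.

import numpy as np

def high_frequency_mask(shape, border: int):
    """1 in the inner (high-frequency) area, 0 in the protected low-frequency part."""
    mask = np.zeros(shape)
    mask[border:-border, border:-border] = 1.0
    return mask

def frequency_perturb(captcha: np.ndarray, strength: float = 50.0, border: int = 4, rng=None):
    rng = rng or np.random.default_rng()
    spectrum = np.fft.fft2(captcha)                     # Step 3: to the frequency domain
    mask = high_frequency_mask(captcha.shape, border)
    candidates = np.argwhere(mask == 1.0)
    r, c = candidates[rng.integers(len(candidates))]    # placeholder for the saliency-based choice
    spectrum[r-1:r+2, c-1:c+2] += strength              # perturb the coefficient and its neighbors
    perturbed = np.real(np.fft.ifft2(spectrum))         # Step 8: back to the space domain
    return np.clip(perturbed, 0.0, 1.0)

captcha = np.zeros((28, 112))                           # a length-4 MNIST-style CAPTCHA canvas
adv = frequency_perturb(captcha)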
Attack Model Normal Text-based Adversarial CAPTCHA Generation
LeNet MaxoutNet NetInNet
JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$ | JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$ | JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$
SVM 87.51 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
KNN 83.81 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
LeNet 95.87 0.01 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.00
MaxoutNet 95.29 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
NetInNet 96.45 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
TABLE II: Performance of the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ (no image preprocessing).
Attack Model Filter + Text-based Adversarial CAPTCHA Generation
LeNet MaxoutNet NetInNet
JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$ | JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$ | JSMA $CW_{L_{0}}$ $CW_{L_{2}}$ $CW_{L_{\infty}}$

SVM, KNN

BLUR 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
DETAIL 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
EDGE ENHANCE 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
SMOOTH 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
SMOOTH MORE 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
GaussianBlur 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
MinFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
MedianFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
ModeFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

LeNet

BLUR 0.32 0.29 0.26 0.24 0.30 0.49 0.30 0.28 0.38 0.25 0.35 0.29
DETAIL 3.77 0.48 1.98 1.84 2.71 4.01 2.86 2.77 3.32 2.24 3.47 2.95
EDGE ENHANCE 3.77 0.48 1.98 1.84 2.71 4.01 2.86 2.77 3.32 2.24 3.47 2.95
SMOOTH 11.66 3.50 6.19 6.56 8.49 10.89 7.20 7.47 10.70 8.12 8.65 7.97
SMOOTH MORE 8.89 2.71 5.10 4.81 6.94 9.13 5.68 5.85 8.49 6.49 6.81 6.56
GaussianBlur 0.03 0.05 0.04 0.04 0.04 0.07 0.05 0.04 0.05 0.06 0.05 0.04
MinFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
MedianFilter 0.01 0.00 0.01 0.01 0.01 0.02 0.01 0.02 0.02 0.02 0.01 0.01
ModeFilter 0.01 0.00 0.01 0.01 0.01 0.02 0.01 0.02 0.02 0.02 0.01 0.01

MaxoutNet

BLUR 5.85 5.68 4.67 5.20 5.63 3.89 5.63 5.15 5.96 8.34 5.46 5.20
DETAIL 10.70 3.85 7.61 6.87 8.57 8.81 8.73 8.57 10.15 6.25 9.80 9.21
EDGE ENHANCE 10.70 3.85 7.61 6.87 8.57 8.81 8.73 8.57 10.15 6.25 9.80 9.21
SMOOTH 38.25 28.66 31.53 29.96 37.98 34.88 35.13 34.88 37.45 35.13 35.38 34.88
SMOOTH MORE 38.52 27.83 30.85 30.85 34.88 32.69 33.89 34.38 36.92 31.53 34.14 33.89
GaussianBlur 0.13 0.47 0.14 0.16 0.16 0.05 0.12 0.12 0.14 0.31 0.14 0.14
MinFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
MedianFilter 0.03 0.00 0.01 0.01 0.02 0.03 0.02 0.02 0.02 0.04 0.02 0.02
ModeFilter 0.03 0.00 0.01 0.01 0.02 0.03 0.02 0.02 0.02 0.04 0.02 0.02

NetInNet

BLUR 17.51 13.47 14.64 15.77 17.10 16.29 16.55 15.26 18.08 17.10 16.82 16.03
DETAIL 15.90 7.34 10.89 10.42 13.47 14.52 13.03 12.49 13.47 10.24 13.59 13.03
EDGE ENHANCE 15.90 7.34 10.89 10.42 13.47 14.52 13.03 12.49 13.47 10.24 13.59 13.03
SMOOTH 28.24 16.82 19.89 20.20 24.49 24.87 22.01 21.84 24.12 21.67 23.04 22.18
SMOOTH MORE 28.88 19.42 21.84 21.33 24.87 26.21 23.40 22.35 24.87 22.18 23.58 22.52
GaussianBlur 0.48 0.28 0.27 0.22 0.44 0.43 0.37 0.35 0.45 0.64 0.39 0.37
MinFilter 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
MedianFilter 0.06 0.03 0.03 0.03 0.06 0.07 0.05 0.05 0.06 0.08 0.05 0.05
ModeFilter 0.06 0.03 0.03 0.03 0.06 0.07 0.05 0.05 0.06 0.08 0.05 0.05
TABLE III: Performance of the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ (filter + binarization).

4.4 Evaluation

Now, we evaluate the security performance of the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, and leave their usability evaluation to Section 7. Generally, the evaluation procedure is the same as that in Section 4.2. In all the evaluations of this subsection, we employ MNIST to randomly generate CAPTCHAs of length 4. For each attack in TCA, we use 50,000 normal CAPTCHAs for training. Specifically, for the DNN-based attacks LeNet, MaxoutNet, and NetInNet, the batch size is 50 and each model is trained for 20,000 rounds. For each scenario, we use 1000 CAPTCHAs for testing. When generating an adversarial CAPTCHA, for the mask $M$ we set the inner area as the high-frequency part and the rest as the low-frequency part. Each evaluation is repeated three times and the average is reported as the final result.

First, we evaluate the performance of the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ without any image preprocessing. To conduct this group of evaluations, we (i) leverage the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ to generate adversarial CAPTCHAs with respect to LeNet, MaxoutNet, and NetInNet, respectively; and (ii) leverage the attacks in the TCA module to attack these adversarial CAPTCHAs. The results are shown in Table II, where Normal indicates the SAR of each attack on the normal (non-adversarial) CAPTCHAs.

From Table II, we have the following observations. (1) All the attacks in TCA are very powerful when attacking normal CAPTCHAs. However, when they attack the adversarial CAPTCHAs generated by the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, or $CW_{L_{\infty}}$, none of them can break any adversarial CAPTCHA. This result is as expected and further demonstrates the advantage of applying adversarial CAPTCHAs to improve security. (2) The CAPTCHAs generated by the frequency-domain methods have very good transferability, i.e., the adversarial CAPTCHAs generated with respect to one neural network model are transferable to other neural networks or traditional machine learning models. This demonstrates the good robustness of the generated adversarial CAPTCHAs.

Now, we go further by fully considering both image filtering and image binarization, common operations in breaking text-based CAPTCHAs. Full results are shown in Table III, from which we have the following observations. (1) SVM and KNN cannot break any CAPTCHA generated by the frequency-domain JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, or $CW_{L_{\infty}}$, even after image preprocessing. This implies that adversarial CAPTCHAs can achieve very good security against traditional machine learning based attacks. (2) The DNN-based attacks LeNet, MaxoutNet, and NetInNet become more powerful when combined with image filtering and binarization, and can break adversarial CAPTCHAs to some extent in several scenarios. Still, adversarial CAPTCHAs are clearly more secure than normal ones when considering the SARs of these attacks. Further, comparing the results in Table III with those in Table I, the adversarial CAPTCHAs generated by the frequency-domain methods are also much more secure than those generated by the original JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$. (3) Similar to the previous evaluations, the adversarial CAPTCHAs maintain adequate transferability, which implies that they have stable robustness.

Finally, we discuss why the frequency-domain methods perform better than the space-domain methods for text-based CAPTCHAs. According to the CAPTCHAs we generated (as shown in Fig. 2), after adding noise in the frequency domain, the shapes and edges of the characters change in ways that cannot be recovered by filtering and binarization. Furthermore, as we protect the low-frequency part of an image, the fundamental shapes of the characters in the CAPTCHAs do not change. Thus, humans can still recognize them easily.

Normal Image-based Adversarial CAPTCHA Generation
NetInNet GoogleNet VGG ResNet50
NetInNet 41.72 0.0 0.0 0.0 0.0 4.6 20.3 4.2 1.9 0.7 5.4 1.9 2.2 4.1 3.3 8.8 4.7
GoogleNet 51.69 0.5 3.8 7.0 14.3 0.0 0.0 0.0 0.0 0.0 0.1 0.2 1.5 0.4 1.2 6.3 4.6
VGG 57.20 0.5 4.2 11.5 13.5 0.8 19.7 6.4 4.2 0.0 0.0 0.0 0.0 0.5 1.1 13.0 6.7
ResNet50 63.80 10.1 17.6 18.5 20.8 1.9 26.2 7.9 7.6 0.1 0.4 1.2 3.1 0.0 0.0 0.0 0.0
TABLE IV: Security of image-based adversarial CAPTCHAs.
Filter Image-based Adversarial CAPTCHA Generation
NetInNet GoogleNet VGG ResNet50

NetInNet

BLUR 0.0 0.0 0.0 0.6 4.1 9.7 3.2 2.5 1.7 5.5 1.6 2.0 4.4 2.7 5.3 3.7
DETAIL 0.0 0.0 0.0 0.1 1.0 16.7 2.2 0.8 0.4 5.7 0.9 1.3 1.2 1.3 4.1 2.8
EDGE ENHANCE 0.0 0.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0
SMOOTH 0.0 0.0 0.0 0.1 5.5 19.2 5.7 3.9 1.1 7.9 2.2 2.7 7.0 5.5 9.7 8.5
SMOOTH MORE 0.0 0.0 0.0 0.1 6.0 19.1 6.2 4.9 1.5 8.2 3.0 3.2 7.3 5.7 9.4 7.6
GaussianBlur 0.0 0.0 0.0 0.1 5.5 14.4 5.9 4.3 1.2 6.7 3.2 3.4 7.3 4.3 8.8 7.9
MinFilter 0.0 0.0 0.0 0.3 5.1 12.3 3.0 5.5 2.1 3.1 1.2 4.4 5.8 1.7 5.5 10.0
MedianFilter 0.0 0.0 0.0 0.1 5.1 16.2 6.5 2.9 1.2 7.9 3.5 2.3 5.5 5.7 9.4 5.3
ModeFilter 0.0 0.0 0.0 0.1 4.6 20.3 4.2 1.9 0.7 5.4 1.9 2.2 4.1 3.3 8.8 4.7

GoogleNet

BLUR 0.8 5.7 5.9 8.2 0.0 3.9 1.3 1.3 0.9 4.3 2.8 2.9 6.8 3.4 6.2 6.5
DETAIL 0.4 2.9 5.1 12.6 0.0 0.0 0.0 0.0 0.0 0.1 0.2 1.2 0.2 0.5 3.1 3.0
EDGE ENHANCE 0.0 0.5 0.5 1.6 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.1 0.1 0.0 0.2 0.2
SMOOTH 0.3 2.7 5.7 7.8 0.0 0.0 0.0 0.0 0.0 0.8 0.7 1.5 0.9 1.7 7.9 5.0
SMOOTH MORE 0.4 3.8 7.1 9.1 0.0 0.0 0.0 0.0 0.0 0.8 0.8 1.5 1.1 1.2 8.1 5.7
GaussianBlur 0.4 2.3 4.3 6.2 0.0 0.8 0.0 0.6 0.1 2.1 2.1 2.0 1.9 3.2 9.1 6.0
MinFilter 2.1 3.8 4.2 10.8 0.0 1.2 0.2 1.7 0.2 1.1 0.7 3.4 1.7 0.7 5.3 7.6
MedianFilter 0.3 1.9 5.1 5.3 0.0 0.3 0.0 0.2 0.1 1.8 1.6 1.0 1.7 3.7 7.3 2.6
ModeFilter 0.5 3.8 7.0 14.3 0.0 0.0 0.0 0.0 0.0 0.1 0.2 1.5 0.4 1.2 6.3 4.6

VGG

BLUR 1.0 4.9 5.5 7.6 3.7 15.2 8.5 5.3 0.0 0.0 0.0 0.3 2.2 3.4 7.6 6.7
DETAIL 0.8 4.2 11.5 10.8 0.5 18.1 5.3 2.0 0.0 0.0 0.0 0.0 0.2 0.7 10.4 3.1
EDGE ENHANCE 0.0 2.1 2.7 2.1 0.0 3.4 0.3 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.9 0.2
SMOOTH 0.7 3.9 9.2 13.1 2.6 21.3 11.8 4.7 0.0 0.0 0.0 0.0 1.2 3.1 13.9 8.0
SMOOTH MORE 0.7 3.7 10.0 12.3 2.0 20.8 11.5 6.2 0.0 0.0 0.0 0.0 1.2 3.2 13.5 9.4
GaussianBlur 1.1 4.8 7.6 10.4 3.9 21.8 11.9 5.3 0.0 0.0 0.0 0.0 2.1 3.4 13.5 7.5
MinFilter 2.9 3.0 4.7 8.8 4.5 10.1 3.2 7.6 0.0 0.0 0.0 0.3 4.9 1.6 5.4 12.3
MedianFilter 0.5 3.2 8.2 9.7 2.3 19.7 8.5 3.2 0.0 0.0 0.0 0.0 2.1 4.0 15.2 6.0
ModeFilter 0.5 4.2 11.5 13.5 0.8 19.7 6.4 4.2 0.0 0.0 0.0 0.0 0.5 1.1 13.0 6.7

ResNet50

BLUR 4.1 17.6 10.8 14.6 6.2 24.0 11.1 6.7 0.9 7.0 6.0 4.4 0.1 0.5 2.6 4.6
DETAIL 10.8 16.7 15.6 18.6 1.3 22.3 5.1 4.1 0.2 0.5 1.1 2.4 0.0 0.0 0.0 0.0
EDGE ENHANCE 1.3 5.8 5.1 4.9 0.1 3.8 0.3 0.2 0.1 0.1 0.5 0.6 0.0 0.0 0.0 0.0
SMOOTH 6.2 13.9 15.2 17.1 4.1 33.1 14.2 7.3 0.0 0.4 1.0 3.1 0.0 0.0 0.0 0.0
SMOOTH MORE 7.0 14.4 17.1 19.6 4.3 30.4 15.6 7.8 0.1 1.0 1.2 3.0 0.0 0.0 0.0 0.0
GaussianBlur 4.4 16.7 12.3 15.6 8.2 30.6 13.5 8.8 0.1 3.8 2.9 4.5 0.0 0.0 0.2 1.0
MinFilter 7.0 9.3 10.1 16.9 10.8 16.6 7.0 10.8 1.0 2.1 1.8 4.6 0.1 0.1 2.1 3.9
MedianFilter 3.5 9.0 11.8 13.5 7.6 26.8 13.9 4.4 0.2 2.0 2.6 3.2 0.0 0.0 0.2 1.1
ModeFilter 10.1 17.6 18.5 20.8 1.9 26.2 7.9 7.6 0.1 0.4 1.2 3.1 0.0 0.0 0.0 0.0
TABLE V: Security of image-based adversarial CAPTCHAs vs Filters.

5 Image-based Adversarial CAPTCHAs

5.1 ICG Design

For image-based adversarial CAPTCHA generation, we follow the same design principles as in the text-based scenario. Furthermore, just as existing adversarial image generation techniques are not suitable for generating text-based adversarial CAPTCHAs, they are not suitable for image-based adversarial CAPTCHAs either, for similar reasons. Existing adversarial image generation techniques mainly target attacking neural network models by adding as little (human-imperceptible) perturbation as possible. However, we stand on the defensive side and generate adversarial CAPTCHAs to improve security. This implies that we may inject as much perturbation into an image-based adversarial CAPTCHA as possible, as long as it remains user-tolerable (user-recognizable). In addition, generation speed may not be a concern for existing techniques. Although it is not a hard constraint for CAPTCHA generation either, since CAPTCHAs can be generated offline, we still expect to generate many CAPTCHAs quickly (we may need to update our CAPTCHAs periodically to improve system security). Therefore, we take efficiency into consideration in adversarial CAPTCHA generation.

Image-based CAPTCHAs also differ from text-based ones: they carry much richer semantic information, which enables us to develop more processing techniques. Therefore, we do not have to transform an image-based CAPTCHA into the frequency domain. To some extent, it is relatively easier to generate image-based adversarial CAPTCHAs than text-based ones. Similar to the text-based scenario, we implement four image-based adversarial CAPTCHA generation methods based on JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$ (referred to below as the image-based JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$).

Input: $x$, an original CAPTCHA; $y = C^{*}(x)$, the label of $x$; $F$, a classifier; $k$, the noise level.
Output: $x'$, an adversarial CAPTCHA
1       $x' \leftarrow x$, $n \leftarrow 0$;
2       while $C(x') = y$ or $n < k$ do
3             compute a saliency map $S$ based on the gradient of the likelihood of $y$ w.r.t. $x'$;
4             based on $S$, select the pixel, denoted by $p$, that most decreases the likelihood of $y$;
5             modify $p$ and its neighbors to decrease the likelihood of $y$;
6             $n \leftarrow n + 1$;
Algorithm 2: Image-based JSMA

Image-based JSMA. We show the design of the image-based JSMA in Algorithm 2, which basically follows the same procedure as JSMA. Following our design principle, we make two changes. First, we introduce an integer parameter $k$ (the noise level) to control the least amount of perturbation that should be made. This implies that in our design we try to inject as much perturbation as possible, as long as the CAPTCHA remains user-tolerable (certainly, $k$ is an empirical value that can be decided based on some preliminary usability testing). Second, as in the text-based scenario, we modify multiple pixels simultaneously to accelerate the generation process.

Image-based $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$. Their procedures are the same as those of the original $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, respectively, except that we choose a small step size and fewer iterations to accelerate the CAPTCHA generation process. This also implies that our perturbation injection scheme may not be optimal compared with the original attacks. As explained before, we do not aim to add as little perturbation as possible like the original algorithms; instead, we try to inject more perturbation quickly, as long as the CAPTCHA remains user-tolerable.
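The control flow of Algorithm 2 can be sketched as follows. The sketch relies on two hypothetical helpers that are not part of the paper: predict(x), returning the attack model's label for x, and true_class_gradient(x), returning the gradient of the true class's probability with respect to every pixel; the noise_level parameter plays the role of $k$ above.

import numpy as np

def image_based_jsma(x: np.ndarray, y: int, predict, true_class_gradient,
                     noise_level: int = 30, step: float = 0.2, max_iters: int = 500):
    """x is assumed to be a float image in [0, 1]."""
    x_adv, modified = x.copy(), 0
    for _ in range(max_iters):
        # keep perturbing until the label flips AND at least `noise_level` rounds are done
        if predict(x_adv) != y and modified >= noise_level:
            break
        grad = true_class_gradient(x_adv)
        r, c = np.unravel_index(np.argmin(grad), grad.shape)   # pixel that most lowers p(y)
        patch = (slice(max(r - 1, 0), r + 2), slice(max(c - 1, 0), c + 2))
        x_adv[patch] = np.clip(x_adv[patch] + step, 0.0, 1.0)  # modify the pixel and its neighbors
        modified += 1
    return x_adv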

5.2 Evaluation

Now, we evaluate the security performance of the image-based JSMA, $CW_{L_{0}}$, $CW_{L_{2}}$, and $CW_{L_{\infty}}$, while leaving their usability evaluation to the next section. In the evaluation, we employ ImageNet ILSVRC-2012 to generate all the needed CAPTCHAs. Meanwhile, we use the pretrained models of the attacks in ICA (all trained on ImageNet ILSVRC-2012) to examine the security performance of the generated adversarial CAPTCHAs, i.e., we use the attacks in ICA to recognize the generated CAPTCHAs. These pretrained models have state-of-the-art performance and are available at the Caffe Model Zoo [6]. For each evaluation scenario, we use 1000 CAPTCHAs for testing. Each evaluation is repeated three times and the average is reported as the final result.
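The evaluation step amounts to feeding the (possibly filtered) adversarial CAPTCHAs to a pretrained ImageNet classifier and counting correct top-1 predictions. The sketch below uses torchvision's pretrained ResNet-50 purely as a stand-in for the Caffe Model Zoo models employed in the paper.

import torch
from torchvision import models, transforms

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def success_attack_rate(images, labels):
    """Fraction of CAPTCHAs whose ImageNet label is still recognized (higher = weaker CAPTCHA)."""
    correct = 0
    with torch.no_grad():
        for img, label in zip(images, labels):
            logits = model(preprocess(img).unsqueeze(0))
            correct += int(logits.argmax(dim=1).item() == label)
    return correct / len(images)

# Hypothetical usage: images is a list of PIL.Image objects, labels the ImageNet class ids.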

Filter Image-based Adversarial CAPTCHA Generation
NetInNet GoogleNet VGG ResNet50
20 30 40 50 20 30 40 50 20 30 40 50 20 30 40 50

NetInNet

BLUR 0.0 0.0 0.0 0.0 1.6 1.2 1.1 0.9 0.6 0.5 0.4 0.3 2.7 3.0 2.4 1.8
DETAIL 0.0 0.0 0.0 0.0 0.8 0.3 0.1 0.1 0.2 0.1 0.1 0.1 0.9 0.5 0.2 0.1
EDGE ENHANCE 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
SMOOTH 0.0 0.0 0.0 0.0 2.7 1.8 1.2 1.1 0.3 0.2 0.2 0.1 3.6 2.4 1.6 1.4
SMOOTH MORE 0.0 0.0 0.0 0.0 3.0 2.2 1.2 1.2 0.4 0.3 0.3 0.2 3.6 2.7 2.2 1.8
GaussianBlur 0.0 0.0 0.0 0.0 1.8 1.6 1.4 0.9 0.3 0.3 0.1 0.1 3.3 3.0 1.8 1.8
MinFilter 0.0 0.0 0.0 0.0 2.0 1.1 1.2 1.2 0.3 0.6 0.4 0.3 2.0 2.2 2.0 1.4
MedianFilter 0.0 0.0 0.0 0.0 2.2 1.8 1.4 1.2 0.3 0.4 0.2 0.1 2.4 1.8 1.2 1.1
ModeFilter 0.0 0.0 0.0 0.0 1.8 0.9 0.6 0.7 0.2 0.1 0.1 0.1 2.4 0.9 0.5 0.5

GoogleNet

BLUR 0.2 0.2 0.1 0.1 0.0 0.0 0.0 0.0 0.4 0.1 0.1 0.1 1.8 2.0 1.6 1.8
DETAIL 0.2 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.1 0.0 0.0
EDGE ENHANCE 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0