Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment

06/15/2020 ∙ by Zahra Noury, et al. ∙ University of Leeds

CAPTCHA is a human-centred test to distinguish a human operator from bots, attacking programs, or any other computerised agent that tries to imitate human intelligence. In this research, we investigate a way to crack visual CAPTCHA tests with an automated deep-learning-based solution. The goal of the cracking is to investigate the weaknesses and vulnerabilities of CAPTCHA generators and to develop more robust CAPTCHAs, without the risks of manual trial-and-error efforts. We have developed a Convolutional Neural Network called Deep-CAPTCHA to achieve this goal. We propose a platform to investigate both numerical and alphanumerical image CAPTCHAs. To train and develop an efficient model, we have generated 500,000 CAPTCHAs using the Python Image-Captcha Library. In this paper, we present our customised deep neural network model, the research gaps and existing challenges, and the solutions to overcome these issues. Our network's cracking accuracy reaches 98.94% and 98.31% for the numerical and alphanumerical test datasets, respectively. This means more work needs to be done to develop robust CAPTCHAs that are non-crackable against bot attacks and artificial agents. As the outcome of this research, we identify some efficient techniques to improve CAPTCHA generators, based on the performance analysis conducted on the Deep-CAPTCHA model.




1. Introduction

CAPTCHA, an abbreviation for Completely Automated Public Turing test to tell Computers and Humans Apart, is a computer test for distinguishing between humans and robots. As a result, CAPTCHA can be used to prevent various types of cyber-security threats, attacks, and penetrations targeting the anonymity of web services, websites, and login credentials, or even semi-autonomous vehicles [13] and driver assistance systems [27], where a human needs to take over control of a machine/system.

In particular, these attacks often lead to situations in which computer programs substitute for humans and try to automate services to send a considerable number of unwanted emails, access databases, or influence online polls [4]. One of the most common forms of cyber-attack is the DDoS [8] attack, in which the target service is overloaded with unexpected traffic, either to find the target credentials or to paralyse the system temporarily. One of the classic yet very successful counter-measures is utilising a CAPTCHA system in the evolution of cyber-security systems: the attacking machines can be distinguished, and the unusual traffic can be banned or ignored to prevent damage. In general, the intuition behind a CAPTCHA is a task that can distinguish humans and machines by offering them problems that humans can quickly answer but machines find difficult, due both to computational resource requirements and to algorithmic complexity [5]. CAPTCHAs can take the form of numerical or alphanumerical strings, voice, or image sets. Figure 1 shows a few samples of common alphanumerical CAPTCHAs.

Fig. 1: Different types of alphanumerical CAPTCHA samples

One of the most commonly used practices is text-based CAPTCHAs. An example of these can be seen in Figure 2, in which a sequence of random alphabetic characters, digits, or combinations of them is distorted and drawn in a noisy image. There are many techniques and fine details for adding efficient noise and distortions to CAPTCHAs to make them more complex. For instance, [4] and [9] recommend several techniques for adding various types of noise to improve the security of CAPTCHA schemes, such as drawing crossing lines over the letters to imply an anti-segmentation schema. These lines should not be longer than the size of a letter; otherwise, they can be easily detected using a line-detection algorithm. Another example is using different font types, sizes, and rotations at the character level. One of the most advanced methods in this regard, called Visual Cryptography, can be found in [28].

On the other hand, there are a few critical pitfalls to avoid while creating CAPTCHAs. One is over-reliance on random noise, since modern computer-vision algorithms are often better at ignoring noise than humans are. Another is using very similar characters, such as the digit '0' and the letter 'O', which cannot be easily differentiated by either a computer or a human.

Fig. 2: Examples of a five-digit text-based CAPTCHA image.

Besides text-based CAPTCHAs, other types of CAPTCHAs have become popular recently. One example is image-based CAPTCHAs, which present sample images of random objects such as street signs, vehicles, statues, or landscapes and ask the user to identify a particular object among the given images [22]. These types of CAPTCHAs are especially tricky due to their context-dependent nature. Figure 3 shows a sample of this type of CAPTCHA.

However, in this paper, we will focus on text-based CAPTCHAs as they are still vastly used in high traffic and dense networks and websites due to their lower computational cost.

Before going to the next section, we would like to mention another application of CAPTCHA systems that needs to be discussed: their application in OCR (Optical Character Recognition) systems. Although current OCR algorithms are very robust, they still have some weaknesses in recognising different hand-written scripts or corrupted texts, which limits the usage of these algorithms. Utilising CAPTCHAs offers an excellent enhancement for tackling such problems as well, since researchers' attempts to solve CAPTCHA challenges algorithmically also help to improve OCR algorithms [7]. Some other researchers, such as Ahn et al. [6], suggest a systematic way to employ this method. The proposed solution, called reCAPTCHA, is essentially a web-based CAPTCHA system that uses the inserted text to fine-tune its OCR algorithms. The system consists of two parts: first, a preparation stage that utilises two OCR algorithms to transcribe the document independently; the outputs are then compared, and the matched parts are marked as correctly solved; finally, the mismatched words are used to create a CAPTCHA challenge dataset [14].

Fig. 3: A sample of the image-based CAPTCHAs that have recently become popular.

This research tries to solve the CAPTCHA recognition problem in order to improve CAPTCHA generation methods and techniques, as bots and scams are getting more advanced and smarter on a day-to-day basis.

The rest of the paper is organised as follows: in Section 2, we review the literature by discussing the latest related works in the field. We then introduce the details of the proposed method in Section 3. The experimental results are provided in Section 4, followed by concluding remarks in Section 5.

2. Related Works

In this section, we briefly explore the latest work done in this field. Geetika Garg and Chris Pollett [1] trained a Python-based deep neural network to crack fixed-length CAPTCHAs. The network consists of two Convolutional-Maxpool layers, followed by a dense layer and a Softmax layer. The model is trained using SGD with Nesterov momentum. They also tested their model using recurrent layers instead of simple dense layers, but showed that dense layers achieve higher accuracy on this problem.

In another work, Sivakorn et al. [2] created a web-browser-based system to solve image CAPTCHAs. Their system uses the Google Reverse Image Search (GRIS) and other open-source tools to annotate the images, and then tries to classify the annotations and find similar images, leading to an 83% success rate on similar image CAPTCHAs.

Stark et al. [3] have also used a Convolutional neural network to overcome this problem. However, they have used three Convolutional layers followed by two dense layers and then the classifiers to solve six-digit CAPTCHAs. Besides, they have used a technique to reduce the size of the required training dataset.

Furthermore, CNN-based methods for cracking CAPTCHA images have also been proposed in [11], [12], [18], and [31]. [24] used a CNN via the Style Transfer method to achieve a better result. [29] also used a CNN with a small modification, employing the DenseNet [32] structure instead of common CNNs. [33] and [21] researched Chinese CAPTCHAs and employed a CNN model to crack them. On the other hand, there are approaches that do not use convolutional neural networks, such as [15], which uses classical image-processing methods to solve CAPTCHAs. As another example, [17] uses a sliding-window approach to segment the characters and recognise them one by one.

Another fascinating related research field is adversarial CAPTCHA generation. One example is [16], in which adversarial noise is added to an original image to make basic image classifiers misclassify it, while the image still looks the same to humans. [25] uses the same approach to create enhanced text-based images.

Among related research, two of the most exciting works are [26] and [10], which use Generative Models and Generative Adversarial Networks to train better and more efficient models on the data.

Fig. 4: Both approaches for representing the output data. Method A) a single Sigmoid layer represents three numerical characters (a 3-digit CAPTCHA, each digit taking one of ten states, 0 to 9). Method B) three separate Softmax layers. In this example, both methods illustrate "621".

3. Methodology

Deep learning based methodologies are widely used in almost all aspects of our life, from surveillance systems to autonomous vehicles [23], Robotics, and even the recent global challenge due to the COVID-19 pandemic [35].

To solve the CAPTCHA problem, we have also used a deep neural network architecture with convolutional layers. Below, we describe the detailed procedure for processing and solving the CAPTCHA images, including input-data pre-processing, encoding of the output, and the network structure itself.

3.1. Preprocessing

Applying some pre-processing operations such as image size reduction, colour-space conversion, and noise-reduction filtering can have a tremendous overall effect on the network's results.

The original size of the image data used in this research is relatively large, as there are many blank areas in the image as well as many co-dependent neighbouring pixels. By reducing the image size, we can achieve almost the same results without any noticeable decrease in the system's performance. This size reduction helps the training process become faster, since it reduces the amount of data without much reduction in the data's entropy.

Colour-space to grey-space conversion is another preprocessing method that we used to reduce the size of the data while maintaining the same level of detection accuracy. In this way, we could further reduce the amount of redundant data and ease the training and prediction process. Converting from a three-channel RGB image to a grey-scale image does not affect the results, as colour is not crucial in text-based CAPTCHA systems.
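As an illustrative sketch of this conversion step (the exact routine used in the paper is not listed; the weights below are the standard ITU-R BT.601 luminance coefficients):

```python
def rgb_to_gray(image):
    """Convert an RGB image (nested lists of (R, G, B) tuples, values 0-255)
    to a single-channel grey-scale image using standard luminance weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]

# A 1x2 RGB image: one white pixel and one pure-red pixel.
img = [[(255, 255, 255), (255, 0, 0)]]
print(rgb_to_gray(img))  # -> [[255, 76]]
```

Note that a bright, saturated colour such as pure red maps to a fairly low grey level, which is relevant to the intensity observations in Section 4.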

The last preprocessing technique that we use is the application of a noise-reduction algorithm. After a careful study of the appropriate filtering approaches, we decided to implement the conventional Median-Filter to remove the noise of the input image. The algorithm eliminates the noise of the image by replacing each pixel with the median of its surrounding pixels' values. The procedure is described in Algorithm 1, in which we generate resultImage from the input image using a predefined window size.

Input: image, window size
Output: resultImage
for x from 0 to image width do
    for y from 0 to image height do
        i := 0;
        for fx from 0 to window width do
            for fy from 0 to window height do
                window[i] := image[x + fx][y + fy];
                i := i + 1;
            end for
        end for
        sort entries in window;
        resultImage[x][y] := window[window width * window height / 2];
    end for
end for
Algorithm 1: Median filter noise filtering
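A minimal Python version of this median filter (pure standard library; edge pixels are handled by clamping coordinates, a detail the pseudocode leaves open):

```python
def median_filter(image, window=3):
    """Replace each pixel with the median of its window x window neighbourhood.
    `image` is a list of lists of grey-scale values; borders are clamped."""
    h, w = len(image), len(image[0])
    half = window // 2
    result = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the clamped neighbourhood around (x, y).
            neighbourhood = [
                image[min(max(y + fy, 0), h - 1)][min(max(x + fx, 0), w - 1)]
                for fy in range(-half, half + 1)
                for fx in range(-half, half + 1)
            ]
            neighbourhood.sort()
            result[y][x] = neighbourhood[len(neighbourhood) // 2]
    return result

# A single bright noise pixel in a flat region is removed:
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
print(median_filter(noisy))  # -> [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
```

This is exactly why the median filter suits pepper-style noise: an isolated outlier never reaches the middle of the sorted window.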

3.2. Encoding

Unlike classification problems, where we have a fixed number of classes, in CAPTCHA recognition the number of possible outputs depends on the size of the alphabet and the length of the CAPTCHA, which leads to exponential growth in the number of combinations. Hence, for a CAPTCHA with five numerical digits, we have 10^5 = 100,000 different combinations. As a result, we are required to encode the output data to fit into a neural network.

The initial encoding we used in this research employs n × m neurons, where n is the length of the alphabet set and m is the character length of the CAPTCHA. The layer utilises the Sigmoid activation function:

S(x) = 1 / (1 + e^(-x))     (1)

where x is the input value and S(x) is the output of the Sigmoid function. As x increases, S(x) converges to 1, and as x decreases, S(x) approaches 0. Applying the Sigmoid function adds non-linearity to the neurons, which improves their learning potential and their capacity to deal with non-linear inputs.

These sets of neurons are arranged so that the first set of n neurons represents the first letter of the CAPTCHA, the second set represents the second letter, and so on. In other words, each neuron indicates whether a given position of the CAPTCHA matches a particular character of the alphabet or not. A visual representation can be seen in Figure 4.A, where the method encompasses three numerical serial digits that represent 621 as the output. However, this approach proved unsatisfactory due to its inability to normalise the numerical values and the impossibility of using the Softmax function as the output layer of the intended neural network.

Therefore, we employed parallel Softmax layers instead:

softmax(z_c) = e^(z_c) / Σ_{j=1}^{C} e^(z_j)     (2)

where c is the corresponding class for which the Softmax is being calculated, z_c is the input value of that class, and C is the number of classes. Each Softmax layer individually comprises n neurons, as in Figure 4.B, and these neurons in turn represent the alphabet that is used to create the CAPTCHAs (for example, 0 to 9, or A to Z). Each Softmax unit represents the location of one digit in the CAPTCHA pattern (for example, locations 1 to 3). Using this technique allows us to normalise each Softmax unit individually over its own neurons. This means each unit can normalise its weights over the different alphabet characters; hence it performs better overall.
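To make the per-position encoding concrete, here is a small pure-Python sketch for a digit alphabet of 10 and three character positions, as in Figure 4 (the helper names are ours, not from the paper):

```python
import math

ALPHABET = "0123456789"

def encode(captcha_text):
    """One-hot encode each character position separately (Method B in Fig. 4):
    one length-10 target vector per Softmax unit."""
    return [[1.0 if ch == a else 0.0 for a in ALPHABET] for ch in captcha_text]

def softmax(logits):
    """Normalise one unit's scores into a probability distribution."""
    exps = [math.exp(z - max(logits)) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(per_position_scores):
    """Take the arg-max of each Softmax unit independently."""
    return "".join(ALPHABET[scores.index(max(scores))]
                   for scores in per_position_scores)

targets = encode("621")
print(decode([softmax(t) for t in targets]))  # -> "621"
```

Because each unit is normalised on its own, the prediction for one position cannot steal probability mass from another position, which is the advantage over the single flat Sigmoid layer.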

Fig. 5: The Architecture of the proposed CNN

3.3. Architecture of the Network

Although Recurrent Neural Networks (RNNs) can be used to predict CAPTCHA characters, we have focused on sequential models in this research, as they run faster than RNNs and can achieve very accurate results if the model is well designed.

The structure of our proposed network is depicted in Figure 5. The network starts with a Convolutional layer with 32 filters and the ReLU activation function, followed by a Max-Pooling layer. Then, we have two more such Convolutional-MaxPooling pairs with the same parameters except for the number of filters, which are set to 48 and 64, respectively. We note that all of the Convolutional layers use padding.

After the Convolutional layers, there is a 512-unit dense layer with the ReLU activation function and a 30% drop-out rate. Finally, we have m separate Softmax layers, where m is the number of expected characters in the CAPTCHA image.
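The architecture just described can be sketched in Keras as follows. This is our illustrative reconstruction, not the authors' released code: the input resolution, the 5×5 kernel size, and the 2×2 pooling size are assumptions, since the exact values are not restated in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_deep_captcha(input_shape=(64, 128, 1), n_chars=5, n_classes=10):
    """Three Conv-MaxPool pairs (32/48/64 filters, 'same' padding), a 512-unit
    dense layer with 30% dropout, and one Softmax head per character position.
    Input shape and kernel/pool sizes are illustrative assumptions."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for n_filters in (32, 48, 64):
        x = layers.Conv2D(n_filters, (5, 5), padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Flatten()(x)
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    # One independent Softmax head per expected character position.
    outputs = [layers.Dense(n_classes, activation="softmax", name=f"char_{i + 1}")(x)
               for i in range(n_chars)]
    model = models.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="binary_crossentropy")
    return model
```

Passing a single loss string applies the same binary cross-entropy to each of the five heads, mirroring the loss setup described below.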

Fig. 6: (a) and (b): Loss values of the test and training process for the Adam and SGD optimisers, respectively. (c) and (d): The accuracy metrics of the network for the same optimisers on the training dataset. (e) and (f): The accuracy metrics of the network using the given optimisers on the test dataset.

The loss function for this network is binary cross-entropy, since we need to compare these binary target matrices:

L = -(1/N) Σ_{i=1}^{N} [ y_i · log(f(x_i)) + (1 - y_i) · log(1 - f(x_i)) ]     (3)

where N is the number of samples and f is the predictor model. x_i and y_i are the input data and the label of the i-th sample, respectively. Since a label can only be zero or one, only one part of this equation is active for each sample.
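As a small numeric check of this loss (pure Python; the function name is ours):

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over flat lists of 0/1 targets and
    predicted probabilities: -(1/N) * sum(y*log(p) + (1-y)*log(1-p))."""
    n = len(y_true)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred)) / n

# Confident, correct predictions give a small loss; wrong ones a large loss.
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # ~0.105
print(binary_cross_entropy([1, 0], [0.1, 0.9]))  # ~2.303
```

The `eps` guard simply avoids log(0) for saturated predictions.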

We also employed the Adam optimiser, which is briefly described in Equations 4 to 8, where m_t and v_t represent exponentially decaying averages of past gradients and past squared gradients, respectively. β1 and β2 are configurable constants, g_t is the gradient of the optimised function, and t is the learning iteration:

m_t = β1 · m_{t-1} + (1 - β1) · g_t     (4)
v_t = β2 · v_{t-1} + (1 - β2) · g_t²     (5)

In Equations 6 and 7, bias-corrected values for m_t and v_t are calculated as follows:

m̂_t = m_t / (1 - β1^t)     (6)
v̂_t = v_t / (1 - β2^t)     (7)

Finally, using Equation 8 and by updating the parameters θ in each iteration, the optimum of the function can be attained. m̂_t and v̂_t are calculated via Equations 6 and 7, and α, the step size (also known as the learning rate), is set to 0.0001 in our approach:

θ_{t+1} = θ_t - α · m̂_t / (√v̂_t + ε)     (8)
The intuition behind using the Adam optimiser is its capability to train the network in a reasonable time. This can easily be inferred from Figure 6(a), in which the Adam optimiser achieves the same results as Stochastic Gradient Descent (SGD), but with much faster convergence.
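The update rule of Equations 4 to 8 can be exercised on a toy objective in a few lines of pure Python (our own sketch; in practice the framework's built-in optimiser is used):

```python
def adam_step(theta, grad, m, v, t, alpha=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: decayed first/second moment estimates (Eqs. 4-5),
    bias correction (Eqs. 6-7), and the parameter step (Eq. 8)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v

# Minimise f(theta) = theta^2 (gradient 2*theta), starting from theta = 1.0.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 20001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, alpha=0.01)
print(round(theta, 3))  # converges near 0
```

A larger toy learning rate (0.01) is used here only so that the quadratic example converges quickly.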

After several experiments, we trained the network for 50 epochs with a batch size of 128. As can be inferred from Figure 6(a), the network reaches acceptable convergence even after 30 epochs; as a result, 50 epochs seem sufficient for the network to perform correctly. Figure 6(e) suggests the same conclusion based on the measured accuracy metrics.

4. Experimental Results

After developing the above-described model, we trained the network on 500,000 randomly generated CAPTCHAs using the Python Image-Captcha Library [38]. See Figure 7 for some of the randomly generated numerical CAPTCHAs with a fixed length of five digits.

Fig. 7: Samples of the Python numerical Image-Captcha Library used to train the Deep-CAPTCHA.

Fig. 8: Confusion Matrix of the trained network on the test data. Although the digits are accurately labelled, the diagram is scaled non-linearly to point out the differences for better visualisation.

To be balanced, the dataset consists of ten randomly generated images from each permutation of a five-digit text.

         Softmax            Sigmoid
         Loss    Accuracy   Loss    Accuracy
Train    0.013   99.33%     0.037   98.73%
Test     0.075   98.94%     0.116   90.04%
TABLE I: Accuracy metric and total loss value for the Train and Test portions of the dataset.
          Train Accuracy   Test Accuracy
Digit 1   99.91%           99.87%
Digit 2   99.85%           99.75%
Digit 3   99.84%           99.72%
Digit 4   99.83%           99.75%
Digit 5   99.90%           99.85%
CAPTCHA   99.33%           98.94%
TABLE II: Accuracy metric of the dataset for each digit and the complete CAPTCHA as a set of 5 integrated digits.

4.1. Performance Analysis

As represented in Table I, the network reached the overall performance and accuracy rate of 99.33% on the training set and 98.94% on the test dataset. We have to note that the provided accuracy metrics are calculated based on the number of correctly detected CAPTCHAs as a whole (i.e. correct detection of all five individual digits in a CAPTCHA); otherwise, the accuracy of individual digits are even higher, as per the Table II.

We have also conducted a confusion matrix check to visualise the outcome of this research better. In Figure 8, we can see how the network performs on each digit regardless of the position of that digit in the CAPTCHA string. As a result, the network seems to work extremely accurately on the digits, and there is less than 1% misclassification for each digit.

By analysing the network's performance and visually inspecting 50 misclassified samples, we identified some important findings. While an average human could solve the majority of the misclassified CAPTCHAs, the following weaknesses in our model caused the failures of our CAPTCHA solver:

  • In 85% of the misclassified samples, the grey-level intensity of the generated CAPTCHA characters was considerably lower than the average intensity of the Gaussian-distributed pepper noise in the CAPTCHA image.

  • In 54% of the cases, the digits 3, 8, or 9 were the cause of the misclassification.

  • In 81.8% of the cases, the misclassified digits were rotated by a large angle.

  • Confusion between the digits 1 and 7 was another cause of failure, particularly in the case of large counter-clockwise rotations of the digit 7.

Consequently, in order to cope with the existing weaknesses and vulnerabilities of the CAPTCHA generators, we strongly suggest the mandatory embedding of one or some of the digits 3, 7, 8, and 9 (with or without counter-clockwise rotations) with a significantly higher rate of appearance in the CAPTCHAs compared to the other digits. This will make the CAPTCHAs harder for automated algorithms such as Deep-CAPTCHA to crack, as these digits are more likely to be confused with other digits, while the human brain has no difficulty identifying them.
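This counter-measure can be sketched as a weighted digit sampler (a toy illustration; the weight value and function name are our own, not from the paper):

```python
import random

HARD_DIGITS = "3789"    # digits the solver confused most often
EASY_DIGITS = "012456"

def hard_captcha_text(length=5, hard_weight=3.0, rng=random):
    """Sample CAPTCHA digits with the confusion-prone digits over-represented.
    With hard_weight=3.0, each hard digit is 3x as likely as an easy one."""
    digits = HARD_DIGITS + EASY_DIGITS
    weights = [hard_weight] * len(HARD_DIGITS) + [1.0] * len(EASY_DIGITS)
    return "".join(rng.choices(digits, weights=weights, k=length))

rng = random.Random(0)
print(hard_captcha_text(rng=rng))  # a 5-digit string biased toward 3, 7, 8, 9
```

The generated text would then be rendered (and optionally rotated) by the CAPTCHA image generator as usual.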

A similar investigation was conducted for the alphabetic part of the failed detections by Deep-CAPTCHA, and the majority of failed cases were tied to one or some of the letters c, e, i, l, m, and r. We therefore suggest that greater inclusion of these letters may enhance the robustness of the CAPTCHAs.

Our research also suggests that brightly coloured (i.e. lower grey-scale intensity) alphanumerical characters would help to enhance the difficulty level of the CAPTCHAs.

4.2. Performance Comparison

In this section, we compare the performance of our proposed method with 10 other state-of-the-art techniques. The comparison results are presented in Table III, followed by further discussion of the specifics of each method.

As mentioned in earlier sections, our approach is based on a Convolutional Neural Network with three pairs of Convolutional-MaxPool layers, followed by a dense layer connected to a set of Softmax layers. The network is trained with the Adam optimiser.

In this research, we initially focused on optimising our network to solve numerical CAPTCHAs; however, since many existing methods work on both numerical and alphanumerical CAPTCHAs, we developed another network capable of solving both types and trained it on 700,000 alphanumerical CAPTCHAs. For a better comparison and a more consistent approach, we only increased the number of neurons in each Softmax unit from 10 to 36 to cover all common Latin characters and digits.

To compare our solution, we first investigated [29], which implemented the following three approaches: DenseNet-121 and ResNet-50, fine-tuned models of the original DenseNet and ResNet networks adapted to solve CAPTCHAs, as well as DFCR, an optimised model based on DenseNet. DFCR claims an accuracy of 99.96%, the best benchmark among the compared methods. However, this model has only been trained on fewer than 10,000 four-digit CAPTCHA images generated from texts, which puts the validity of the results in question.

Method                               Accuracy
DFCR [29]                            99.96%
DenseNet-121 [29]                    99.90%
ResNet-50 [29]                       99.90%
Proposed method (numerical) *        98.9%
Wei - SVM [36]                       98.81%
Wei - CNN [36]                       98.43%
Proposed method (alphanumerical)     98.3%
SVHN Network * [14]                  97.84%
VGG 16 [37]                          97.50%
VGG_CNN_M_1024 [37]                  97.20%
ZF [37]                              96.60%
TOD-CNN [20]                         92.37%

TABLE III: The accuracy results of different CAPTCHA recognition methods. (* denotes accuracy results calculated for numerical CAPTCHAs only.)

The next compared method is [36], which uses an SVM-based method as well as an implementation of the VGG-16 network to solve CAPTCHA problems. The critical point of this method is its use of image preprocessing, image segmentation, and one-by-one character recognition. These techniques have led to 98.81% accuracy on four-digit alphanumerical CAPTCHAs, with the network trained on a dataset of around 10,000 images. Similarly, TOD-CNN [20] utilises a segmentation method to locate the characters, in addition to a CNN model trained on a dataset of 60,000 images. The method uses a TensorFlow Object Detection (TOD) technique to segment the image and characters.

[14] used the DistBelief implementation of CNNs to recognise numbers more accurately. The dataset used in that research was the Street View House Numbers (SVHN) dataset, which contains images taken from Google Street View.

Finally, the last discussed approach is [37], which compares VGG16, VGG_CNN_M_1024, and ZF. Although these achieve relatively low accuracy compared to the other methods, they employ R-CNN methods to recognise each character and locate its position at the same time.

In conclusion, our methods achieve relatively satisfactory results on both numerical and alphanumerical CAPTCHAs. Having a simple network architecture allows us to apply this network to other purposes with greater ease. Besides, having an automated CAPTCHA generation technique allowed us to train our network to a better accuracy while maintaining detection of more complex and more comprehensive CAPTCHAs compared to other similar techniques.

5. Conclusion

We developed and tuned a CNN-based deep neural network for numerical and alphanumerical CAPTCHA detection to reveal the strengths and weaknesses of common CAPTCHA generators. Using a series of parallel Softmax layers played an important role in improving detection. We achieved up to 98.94% accuracy, compared with the 90.04% accuracy rate of the same network with only a Sigmoid output layer, as described in Section 3.2 and Table I.

Although the algorithm was very accurate on fairly random CAPTCHAs, some particular scenarios made it extremely challenging for Deep-CAPTCHA to crack them. We believe taking the addressed issues into account can help to create more reliable and robust CAPTCHA samples that are more complex and less likely to be cracked by robots or machines.

As a potential approach for future work, we suggest solving CAPTCHAs with variable character lengths, not only limited to numerical characters but also applicable to challenging combined alphanumerical characters, as discussed in Section 4. We also recommend further research on the application of Recurrent Neural Networks as well as classical image-processing methodologies [30] to extract and identify the CAPTCHA characters individually.


  • [1] Garg, Geetika, and Chris Pollett. ”Neural network captcha crackers.” In 2016 Future Technologies Conference (FTC), pp. 853-861. IEEE, 2016.
  • [2] Sivakorn, Suphannee, Iasonas Polakis, and Angelos D. Keromytis. ”I am robot:(deep) learning to break semantic image captchas.” In 2016 IEEE European Symposium on Security and Privacy (Euro S&P), pp. 388-403. IEEE, 2016.
  • [3] Stark, Fabian, Caner Hazırbas, Rudolph Triebel, and Daniel Cremers. ”Captcha recognition with active deep learning.” In Workshop new challenges in neural computation, vol. 2015, p. 94. Citeseer, 2015.
  • [4] Bostik, Ondrej, and Jan Klecka. ”Recognition of CAPTCHA characters by supervised machine learning algorithms.” IFAC-PapersOnLine 51, no. 6 (2018): 208-213.

  • [5] Von Ahn, Luis, Manuel Blum, Nicholas J. Hopper, and John Langford. ”CAPTCHA: Using hard AI problems for security.” In International Conference on the Theory and Applications of Cryptographic Techniques, pp. 294-311. Springer, Berlin, Heidelberg, 2003.
  • [6] Von Ahn, Luis, Benjamin Maurer, Colin McMillen, David Abraham, and Manuel Blum. ”recaptcha: Human-based character recognition via web security measures.” Science 321, no. 5895 (2008): 1465-1468.
  • [7] Kaur, Kiranjot, and Sunny Behal. ”Designing a Secure Text-based CAPTCHA.” Procedia Comput. Sci 57 (2015): 122-125.
  • [8] Nazario, Jose. ”DDoS attack evolution.” Network Security 2008, no. 7 (2008): 7-10.
  • [9] Yousef, Mohamed, Khaled F. Hussain, and Usama S. Mohammed. ”Accurate, data-efficient, unconstrained text recognition with convolutional neural networks.” arXiv preprint arXiv:1812.11894 (2018).
  • [10] Ye, Guixin, Zhanyong Tang, Dingyi Fang, Zhanxing Zhu, Yansong Feng, Pengfei Xu, Xiaojiang Chen, and Zheng Wang. ”Yet another text captcha solver: A generative adversarial network based approach.” In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 332-348. 2018.
  • [11] Karthik, CHBL-P., and Rajendran Adria Recasens. ”Breaking microsoft’s CAPTCHA.” Technical report (2015).
  • [12] Kopp, Martin, Matej Nikl, and Martin Holena. ”Breaking captchas with convolutional neural networks.” ITAT 2017 Proceedings (2017): 93-99.
  • [13] Rezaei, M. and Klette, R. “Look at the Driver, Look at the Road: No Distraction! No Accident!”, CVF Computer Vision and Pattern Recognition, pp. 129–136, (2014).
  • [14] Goodfellow, Ian J., Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, and Vinay Shet. ”Multi-digit number recognition from street view imagery using deep convolutional neural networks.” arXiv preprint arXiv:1312.6082 (2013).
  • [15] Wang, Ye, and Mi Lu. ”An optimized system to solve text-based CAPTCHA.” arXiv preprint arXiv:1806.07202 (2018).
  • [16] Osadchy, Margarita, Julio Hernandez-Castro, Stuart Gibson, Orr Dunkelman, and Daniel Pérez-Cabo. ”No bot expects the DeepCAPTCHA! Introducing immutable adversarial examples, with applications to CAPTCHA generation.” IEEE Transactions on Information Forensics and Security 12, no. 11 (2017): 2640-2653.
  • [17] Bursztein, Elie, Jonathan Aigrain, Angelika Moscicki, and John C. Mitchell. ”The end is nigh: Generic solving of text-based captchas.” In 8th USENIX Workshop on Offensive Technologies (WOOT 14). 2014.
  • [18] Zhao, Nathan, Yi Liu, and Yijun Jiang. ”CAPTCHA Breaking with Deep Learning.” (2017).
  • [19] Chen, Jun, Xiangyang Luo, Yanqing Guo, Yi Zhang, and Daofu Gong. ”A survey on breaking technique of text-based CAPTCHA.” Security and Communication Networks 2017 (2017).
  • [20] Yu, Ning, and Kyle Darling. ”A Low-Cost Approach to Crack Python CAPTCHAs Using AI-Based Chosen-Plaintext Attack.” Applied Sciences 9, no. 10 (2019): 2010.
  • [21] Algwil, Abdalnaser, Dan Ciresan, Beibei Liu, and Jeff Yan. ”A security analysis of automated Chinese turing tests.” In Proceedings of the 32nd Annual Conference on Computer Security Applications, pp. 520-532. 2016.
  • [22] Sivakorn, Suphannee, Jason Polakis, and Angelos D. Keromytis. ”I’m not a human: Breaking the Google reCAPTCHA.” Black Hat (2016).
  • [23] Rezaei, Mahdi, and Reinhard Klette, “Simultaneous Analysis of Driver Behaviour and Road Condition for Driver Distraction Detection.”, International Journal of Image and Data Fusion, (2011).
  • [24] Kwon, Hyun, Hyunsoo Yoon, and Ki-Woong Park. ”CAPTCHA Image Generation Using Style Transfer Learning in Deep Neural Network.” In International Workshop on Information Security Applications, pp. 234–246. Springer, Cham, 2019.

  • [25] Kwon, Hyun, Yongchul Kim, Hyunsoo Yoon, and Daeseon Choi. ”Captcha image generation systems using generative adversarial networks.” IEICE TRANSACTIONS on Information and Systems 101, no. 2 (2018): 543–546.
  • [26] George, Dileep, Wolfgang Lehrach, Ken Kansky, Miguel Lázaro-Gredilla, Christopher Laan, Bhaskara Marthi, Xinghua Lou et al. ”A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs.” Science 358, no. 6368 (2017): eaag2612.
  • [27] Rezaei, M. and Sarshar, M., and Sanaatiyan MM/ “Toward next generation of driver assistance systems: A multimodal sensor-based platform.” Computer and Atomation Eengineering, (2010), pp. 62–67.
  • [28] Yan, Xuehu, Feng Liu, Wei Qi Yan, and Yuliang Lu. ”Applying Visual Cryptography to Enhance Text Captchas.” Mathematics 8, no. 3 (2020): 332.
  • [29] Wang, Jing, Jiao Hua Qin, Xu Yu Xiang, Yun Tan, and Nan Pan. ”CAPTCHA recognition based on deep convolutional neural network.” Math. Biosci. Eng. 16, no. 5 (2019): pp. 5851–5861.
  • [30] Rezaei, Mahdi, and Reinhard Klette. “Object Detection, Classification, and Tracking”. Springer International Publishing, 2017.
  • [31] Stark, Fabian, Caner Hazırbas, Rudolph Triebel, and Daniel Cremers. ”Captcha recognition with active deep learning.” In Workshop new challenges in neural computation, vol. 2015, p. 94. Citeseer, 2015.
  • [32] Huang, Gao, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. ”Densely connected convolutional networks.” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708. 2017.
  • [33] Jia, Yang, Wang Fan, Chen Zhao, and Jungang Han. ”An Approach for Chinese Character Captcha Recognition Using CNN.” In Journal of Physics: Conference Series, vol. 1087, no. 2, p. 022015. IOP Publishing, 2018.
  • [34] Banday MT, Shah NA, ”Image Flip Captcha”. ISC International Journal of Information Security (ISeCure) vol.1, no. 2, p. 105–123
  • [35] Rezaei M, Shahidi M, ”Zero-Shot Learning and its Applications from Autonomous Vehicles to COVID-19 Diagnosis: A Review”. In arXiv prepreint arXiv:2004.14143 (2020).
  • [36] Wei, Li, Xiang Li, TingRong Cao, Quan Zhang, LiangQi Zhou, and WenLi Wang. ”Research on Optimization of CAPTCHA Recognition Algorithm Based on SVM.” In Proceedings of the 2019 11th International Conference on Machine Learning and Computing, pp. 236-240. 2019.
  • [37] Du, Feng-Lin, Jia-Xing Li, Zhi Yang, Peng Chen, Bing Wang, and Jun Zhang. ”CAPTCHA Recognition Based on Faster R-CNN.” In International Conference on Intelligent Computing, pp. 597-605. Springer, Cham, 2017.
  • [38] Python Image Captcha Library, Last access: 15 June 2020.