An important step in the recirculation of banknotes is their recognition in high-speed sorting machines and cash-recycling ATMs, which is followed by banknote-class-dependent authenticity and fitness checks. Central banks ensure the integrity of the cash cycle by formulating requirements and testing relevant banknote handling machines, see e.g. [12].
In the current paper we use a deep learning approach for the recognition of Euro banknotes. At first glance this seems a simple challenge, but it turns out to be tricky once central bank requirements with respect to the rejection of objects not belonging to a trained banknote class are taken into consideration.
The success of deep learning is mainly driven by challenges like the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with 1000 classes, including keyboard, mouse, pencil and many animals, and more than one million images, see [3]. A milestone in the history of deep learning was the success of AlexNet [4], the convolutional neural network (CNN) winning the 2012 ILSVRC challenge by reducing the top-5 error rate from 26.1% to 15.3%. The achievements of deep learning are made possible by advances in available computing power and larger training data sets, allowing for deeper models with millions of parameters. Following the popularity of deep learning in academia, it is now available to industry and the general public via open source frameworks such as Tensorflow [5] or Matlab's deep learning toolbox. Since in most applications massive amounts of training data are not available, a popular approach is transfer learning, i.e. the fine-tuning of some layers of a pretrained network for a specific task, where the pretraining is done on a similar but sufficiently large data set, see [6, 7]. An example is pretraining on ImageNet and fine-tuning on a smaller data set for the classification of plants [8] or medical images [9, 10]. The problem of recognition with a reject class is often referred to as open set recognition. A sophisticated probabilistic approach accompanied by an experimental study on the ImageNet database is contained in [11].
The current paper is structured as follows. In Section II we introduce the necessary background on banknote categories and the recycling framework introduced by the European Central Bank. Subsequently, we summarize some background on image classification using deep neural networks and transfer learning in a mathematically rigorous way. In Section III we introduce the 0-class module, i.e. we propose a modification of the classical deep neural network image classifier which allows the mapping of images to a reject class. Then we introduce the Inception-v3 neural network, which will be applied for transfer learning. In Section IV we present the results of our experimental study, which contains a statistical analysis of classification results for genuine banknotes under different training conditions on three data sets with different resolutions and with or without application of skew correction. We study classification results for a deep neural network classifier with additional 0-class module, also regarding reject rates on genuine banknotes as well as on images not belonging to a trained banknote class, where rejection is desired for the latter. Furthermore, we study training and test times. A summary of the results is contained in Section V.
II-A Banknote categories and banknote recycling framework by the European Central Bank
Ensuring the integrity of the banknote cycle is a major task of the European Central Bank (ECB). For this reason the recirculation as well as the authenticity and fitness checking is regulated in the decisions [12, 1], which apply to banknote handling machines like cash-recycling ATMs and high-speed sorting machines used e.g. in banks and cash centers. The ECB distinguishes between four categories of inputs to banknote handling machines. Category 1 consists of objects not recognized as Euro banknotes, e.g. because of a wrong image or format, transportation errors, large folded corners, missing parts or non-Euro currency, compare Figure 11. Category 1 objects have to be rejected to the customer or operating staff. Category 2 consists of suspect counterfeit Euro notes where image and format are correct but one or more security features are clearly missing or out of tolerance. Category 2 notes shall be withdrawn from circulation, handed over to the national authorities within 20 working days for further investigation and not be credited to the account holder. Category 3 consists of banknotes with correct image and format but some security features which cannot clearly be authenticated, possibly due to tolerance deviations or bad quality of the banknote. Category 3 is treated in the same way as category 2 but may be credited to the account holder. Category 4 consists of genuine banknotes where all authenticity checks are positive. Furthermore, a distinction is made between category 4a, where all fitness checks also have a positive result, and category 4b, where at least one fitness criterion has a negative result. Category 4 is credited to the account holder, but only category 4a shall be used for recirculation, whereas category 4b shall be returned to a national central bank. The ECB tests banknote handling machines and publishes a list of successfully tested machines on its website [13, 14].
The current test procedure (see [15]) contains a counterfeit test (at least 90% of a given counterfeit test deck shall be sorted to category 2 or 3 and none to category 4) and a fitness test (not more than 5% of a given test deck of unfit notes shall be sorted to category 4a). Furthermore, at least 90% of a given test deck of fit and genuine notes shall be classified as category 4a and not more than 1% of the fit and genuine notes as category 1, 2 or 3.
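These acceptance criteria can be summarized in a simple check. The following is a minimal sketch, not part of the ECB specification: the function name and the category encoding (integers 1–3 for categories 1–3, strings "4a"/"4b" for the two subcategories of category 4) are our own illustrative choices.

```python
def passes_ecb_test(counterfeit_cats, unfit_cats, fit_genuine_cats):
    """Sketch of the ECB test-deck criteria.

    Each argument is a list of sorting results, one entry per note,
    encoded as 1, 2, 3, "4a" or "4b" (hypothetical encoding).
    """
    # Counterfeit test: at least 90% in category 2 or 3, none in category 4.
    n_cf = len(counterfeit_cats)
    if sum(c in (2, 3) for c in counterfeit_cats) < 0.9 * n_cf:
        return False
    if any(c in ("4a", "4b") for c in counterfeit_cats):
        return False
    # Fitness test: not more than 5% of unfit notes sorted to category 4a.
    if sum(c == "4a" for c in unfit_cats) > 0.05 * len(unfit_cats):
        return False
    # Fit and genuine notes: at least 90% category 4a,
    # not more than 1% in categories 1, 2 or 3.
    n_fit = len(fit_genuine_cats)
    if sum(c == "4a" for c in fit_genuine_cats) < 0.9 * n_fit:
        return False
    if sum(c in (1, 2, 3) for c in fit_genuine_cats) > 0.01 * n_fit:
        return False
    return True
```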
II-B Image classification and deep neural networks
An image classifier is a function

$f\colon V^{w\times h\times c} \to \{1,\dots,K\},$

where $w$ and $h$ are the width and height of the input image in pixels, $c$ is the number of input channels (usually 3 for red, green and blue) and $V$ is the range of the pixel values. Furthermore, $K$ denotes the number of image classes. Let

$N_\theta\colon V^{w\times h\times c} \to [0,1]^K,$

where $\theta$ is the vector of trainable parameters, with $\dim\theta > 10^6$ not being unusual. A deep neural net $N_\theta$ can be used as image classifier of the form

$f(x) = \operatorname{argmax}_{k\in\{1,\dots,K\}} \bigl(N_\theta(x)\bigr)_k.$

Training a neural net means that the parameter vector $\theta$ is repeatedly updated in so-called training episodes, where a chosen stochastic optimization algorithm optimizes the parameters for the classification of a randomly chosen batch of training images. Batch sizes of several hundred and training episode numbers of more than 1000 are not unusual.
II-C Transfer learning
Let a deep neural net $N_\theta$ mapping $V^{w\times h\times c}$ to $[0,1]^K$ be given. Usually $N_\theta$ has a decomposition of the form

$N_\theta = g_{\theta_2} \circ F_{\theta_1},$

where $F_{\theta_1}\colon V^{w\times h\times c} \to \mathbb{R}^m$ with trainable parameters $\theta_1$ is the mapping to the so-called feature space and

$g_{\theta_2}\colon \mathbb{R}^m \to [0,1]^K$

with trainable parameters $\theta_2$ consists of the last two layers of the deep neural network, which are a fully connected layer followed by a softmax layer. By concatenating the feature map $F_{\theta_1}$ of a given deep neural net with a suitable mapping $\tilde{g}_{\tilde\theta_2}\colon \mathbb{R}^m \to [0,1]^{\tilde K}$ instead of $g_{\theta_2}$, we can obtain a deep neural network for the classification of $\tilde K$ image classes instead of $K$.

By transfer learning we mean the training of a classifier

$\tilde N = \tilde{g}_{\tilde\theta_2} \circ F_{\theta_1}$

with fixed parameters $\theta_1$ for the classification of $\tilde K$ different image classes by updating only the parameter vector $\tilde\theta_2$. Usually $\dim\tilde\theta_2$ is much smaller than $\dim\theta_1$. Thus transfer learning has a significantly reduced training time in comparison to training from scratch.
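Training only the last fully connected + softmax layer on precomputed feature vectors amounts to multinomial logistic regression. The following numpy sketch illustrates this under our own assumptions (function names, plain gradient descent instead of the stochastic optimizer used in the study):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, n_classes, lr=0.1, epochs=200):
    """Train only the final fully connected + softmax layer (theta_2)
    on precomputed feature vectors; the feature extractor F stays frozen."""
    n, m = features.shape
    W = np.zeros((m, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[labels]            # one-hot targets
    for _ in range(epochs):
        P = softmax(features @ W + b)        # class probabilities
        W -= lr * features.T @ (P - Y) / n   # cross-entropy gradient step
        b -= lr * (P - Y).mean(axis=0)
    return W, b
```

Because only $(W, b)$ is updated while the (typically millions of) feature-extractor parameters stay fixed, each training episode is cheap, which is the practical appeal of transfer learning described above.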
III Proposed Method
III-A The 0-class module
We introduce a reject class or 0-class which contains images belonging to none of the trained classes. For this purpose we introduce the so-called 0-class module

$R_\tau\colon [0,1]^K \to \{0,1,\dots,K\}, \qquad R_\tau(p) = \begin{cases} 0 & \text{if } \max_{k} p_k < \tau, \\ \operatorname{argmax}_{k} p_k & \text{otherwise,} \end{cases}$

for $\tau \in (0,1)$ with threshold parameter $\tau$, which can be either trained or chosen by hand. Thus the deep neural network classifier with 0-class module is of the form

$R_\tau \circ N_\theta,$

where $R_\tau(N_\theta(x)) = 0$ means a reject of image $x$ and $R_\tau(N_\theta(x)) = k$ for $k \in \{1,\dots,K\}$ means that the image is mapped to the $k$-th trained class.
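A minimal numpy sketch of such a module (the function name and the 1-based class indexing in the return value are our own choices):

```python
import numpy as np

def zero_class_module(probs, tau):
    """Map a probability vector over K trained classes to {0, 1, ..., K}:
    return 0 (reject) if the maximal class probability is below tau,
    otherwise the 1-based index of the most probable class."""
    probs = np.asarray(probs)
    k = int(probs.argmax())
    return 0 if probs[k] < tau else k + 1
```

The module can be concatenated to any classifier that outputs a probability vector over the trained classes, which is what makes it architecture-agnostic.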
III-B Google’s Inception-v3 CNN
As deep neural net we choose in the following Google’s Inception-v3 architecture proposed in [16], which achieves a 3.45% top-5 error rate on the ILSVRC 2012 benchmark data set. Inception-v3 has a feature map of the form

$F_{\theta_1}\colon V^{299\times 299\times 3} \to \mathbb{R}^{2048}$

with trainable parameter vector $\theta_1$ comprising more than $10^7$ parameters. The Inception architecture consists of 48 network layers, compare Figure 1, which are mostly convolutional, pooling or inception layers and have the ability to detect edges, corners, contours and objects. It is available for transfer learning, see [17], with parameters pretrained on more than one million images from the ImageNet database [3].
IV Experimental Study
IV-A The hardware
To assess the potential of the described approach for application in the ATM industry we conducted an experimental study on a workstation with the state-of-the-art GPU NVIDIA GeForce GTX 1080 Ti with 11 GB graphics memory and a price of 759 €. The transfer learning for the Inception-v3 net and our architecture modifications were done in Google’s open source framework Tensorflow [5], building on the pretrained model available as an open source repository, see [17].
IV-B The data sets
We compare results for three different data sets of Euro banknotes which are provided by Diebold Nixdorf AG and consist of field data recorded with a high-speed line camera used in Diebold Nixdorf cash-recycling ATMs, see [18], with integrated sensor module and recognition software by CI Tech Sensors AG, see [19]. The first aim of the experimental study, conducted in 2018 under a non-disclosure agreement with Mittweida University of Applied Sciences, was to assess the potential for industrial application. This explains why the data is not made publicly available. Still, the authors promote publication of the results for the sake of public interest in a balanced view on deep learning methods. All data sets consist of images with three color channels, with the specialty that red and green are recorded in transmitted illumination whereas blue is recorded in reflected illumination. The resolution is 25 dpi for data sets 1 and 2 and 8 dpi for data set 3. The difference between data sets 1 and 2 is that the banknotes in data set 1 exhibit skew, whereas data set 2 is skew-corrected. See Figure 5 for example images.
All data sets contain 40 different banknote classes consisting of the four orientations of EUR_005_a, EUR_005_b, EUR_010_a, EUR_010_b, EUR_020_a, EUR_020_b, EUR_050_a, EUR_100_a, EUR_200_a and EUR_500_a, where a and b denote the first and second series of the Euro banknotes. The different orientations are treated as different classes, denoted e.g. for EUR_005_a by EUR_005_a_1, EUR_005_a_2, EUR_005_a_3 and EUR_005_a_4, meaning front side, front side upside down, back side and back side upside down. Furthermore, category 1 images are provided which belong to none of the trained banknote classes or are genuine Euro notes not recognized by the classical algorithm used in the field, usually due to large folded corners. The number of images per class varies from 291 in the case of EUR_200_a_2 to 54091 in the case of EUR_050_a_3, which is due to different ratios of the denominations in the field. Per data set, more than 400000 banknote images of category 4 are provided. The images contain recordings of banknotes from the field of varying fitness quality.
IV-C Comparison of different training conditions
We fix the validation and test set ratio at 10% per class and the training batch size, and use the Adam algorithm as optimizer. As parameters for the Adam optimizer we use the default values suggested in [2, Section 8.5.3], i.e. a learning rate of 0.001, exponential decay rates of 0.9 and 0.999 for the first and second moment estimates, respectively, and a numerical stabilization constant of $10^{-8}$. We compare the accuracy, i.e. the ratio of correctly classified images from the test set, for the different data sets and different training conditions with varying numbers of training images and training epochs, see Table I. For all three data sets the results show clearly that a higher number of epochs and a larger training set lead to an improvement of the accuracy. For a smaller training set and a smaller number of epochs, data set 2 achieves the best results, followed by data set 1 and data set 3. For a larger number of epochs and a larger training set all data sets achieve an accuracy of 100%. It should be mentioned that we only consider results on notes which were recognized correctly by the classifier currently used in the field. In Subsection IV-D we will also compare results for Euro notes rejected by the currently used classifier.
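The cited default parameters correspond to the standard Adam update rule, which can be sketched in a few lines of numpy. This is a generic illustration of one update step, not the study's actual TensorFlow training loop; the function name and state tuple are our own conventions.

```python
import numpy as np

def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with the default hyperparameters from
    [2, Section 8.5.3]: moment decay rates 0.9 / 0.999 and
    numerical stabilization constant 1e-8."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)
```

Applied per batch, this step adapts the effective learning rate per parameter, which is why Adam's defaults work well across the training conditions compared here.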
IV-D Results for category 1 input
For the study of the classification of category 1 input we concentrate on data set 2, which consists of 25 dpi images without skew and achieves the best accuracy on images from the 40 considered banknote classes among the three data sets. By the ECB decision [12], input images which are obviously not banknotes because of a wrong image or format have to be rejected. Typical examples are double notes (images of two overlapping notes), transport errors (mostly because of jam), other currencies, cheques or genuine Euro notes with large folded corners.
In Figure 11 we show examples for five different types. For the last type, ’rejected genuine Euro notes’, it can be desirable to obtain a higher acceptance rate and a category 4b classification, whereas a category 2 or 3 classification should be avoided.
It should be noted that a reject rate for genuine notes below 1% is needed to pass the ECB test procedure, see [15], so this should be the aim for a deep-learning-based recognition. On the other hand, non-genuine category 1 images which are mapped to one of the 40 banknote classes will be considered as suspect notes (category 2 or 3) in the subsequent authenticity checks. Category 2 or 3 notes have to be investigated by national central bank authorities, and it is therefore not desirable that a large ratio of category 1 input is sorted to category 2 or 3. Thus, a convenient threshold for the mapping to the 0-class should neither lead to a higher reject rate of genuine banknotes nor to an increased amount of category 1 images mapped to a banknote class. In Table II we compare the reject rates for the Inception-v3 net combined with a 0-class module with different thresholds (C1-C4). As test set we use 10% of the genuine notes accepted plus 10% of the genuine notes rejected by the classical algorithm. The thresholds for the 0-class module are motivated by the quantiles of the maximal class probabilities in the set of genuine notes rejected by the classical algorithm but recognized correctly by the Inception-v3 classifier, see Figure 14. As shown in Table II, reject rates of 0.51% and lower can be obtained, which meets the central bank requirement of a reject rate below 1%. However, if the reject rate is reduced further by choosing a lower threshold for the 0-class module, at some point we start to observe banknotes being sorted to the wrong banknote class (C4).
Figure 14: Empirical cumulative distribution function (ecdf) of the maximal class probability in the output vector for different types of images rejected by the classical algorithm as well as for genuine notes accepted by the classical algorithm (cat 4 Euro notes). For genuine Euro notes rejected by the classical algorithm we distinguish between Euro notes mapped to the correct banknote class (recognized EUR BNs) and Euro notes mapped to a wrong banknote class (not-recognized EUR BNs) by the Inception-v3 classifier.
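The quantile-based threshold choice described above can be sketched as follows; the function name is our own, and `numpy.quantile` is used in place of whatever quantile estimate was used in the study.

```python
import numpy as np

def threshold_from_quantile(probs_rejected, q):
    """Choose the 0-class threshold tau as the q-quantile of the maximal
    class probabilities observed on genuine notes that the classical
    algorithm rejected but the CNN classified correctly.

    probs_rejected: array of shape (n_images, n_classes) of class
    probability vectors for those notes."""
    max_probs = np.asarray(probs_rejected).max(axis=1)
    return float(np.quantile(max_probs, q))
```

Choosing a lower quantile lowers tau and thus accepts more of these rejected-but-genuine notes, at the risk of mapping category 1 input to a banknote class, which is exactly the trade-off observed for thresholds C1-C4.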
IV-E Run time measurements
We distinguish between training time and test time. The training time consists of the time for feature extraction, i.e. for the application of the pretrained Inception-v3 net to the images, which is displayed in Table III, and of the time for retraining the last layer, displayed in Table IV. From a practical point of view the time for retraining is negligible, whereas the feature extraction time for all images can be more than 2 h. Still, this is a minor problem since this time has to be expended only once. The test time is much more critical since in practice we are facing strict real-time requirements in the contemplated application. The test time is the sum of the feature extraction time for one image (about 20 ms by column 1 of Table III) and the test time of the resulting feature vector, which is on average 0.59 ms. Thus we obtain a complete test time of about 20 ms for images of all three data sets. Here we should mention that the test time is measured on the same workstation as the training time, which is equipped with a rather expensive graphics processing unit.
V Summary
In this paper we studied the feasibility of the recognition of Euro banknotes by deep learning. We focused on requirements from central banks, including in particular the rejection of objects with wrong image or format. Thus, we introduced a 0-class module which can be concatenated to every classifier with a probability vector for the trained classes as output and allows the rejection, i.e. sorting to a 0-class, depending on a chosen threshold. In our experimental study we observed that the largest considered number of training images and epochs achieves 100% accuracy on all three considered data sets (25 dpi without skew correction, 25 dpi with skew correction and 8 dpi with skew correction). For a smaller number of training images (50 per class) and epochs (1000) we observe that data set 2 achieves the best accuracy (99.873%), followed by data set 1 (99.814%) and data set 3 (99.771%). The study of sorting results using the 0-class module for rejection shows that reject rates of 0.51% and lower can be obtained, which meets central bank requirements. However, if the reject rate is reduced further by choosing a lower threshold for the 0-class module, at some point we observe banknotes being sorted to the wrong banknote class. Run time measurements show that the test time is approximately 20 ms on a workstation equipped with a state-of-the-art GPU. This is acceptable for the considered real-time application. In summary, the deep learning approach with additional 0-class module offers the possibility of a low reject rate for genuine Euro banknotes, though it is subject to further investigation whether a higher acceptance rate leads only to a higher category 4b rate or also to a higher category 3 rate, where the latter is undesirable.
The authors would like to thank Michael Flack and Armin Stöckli from CI Tech Sensors AG for supporting the presented feasibility study and the publication of the results. Furthermore the authors would like to express their gratitude to Rainer Stute from Diebold Nixdorf AG for the collection of the field data, to Peter Zemp from CI Tech Sensors AG for the preprocessing of the field data and to Armin Stöckli for his valuable comments and active support concerning the results on the classification of the category 1 images. Last but not least, we would like to express our special thanks to Markus Süß from Mittweida University of Applied Science for his advice concerning Python and the Tensorflow framework.
-  European Central Bank, “Decision of the European central bank of 7 September 2012 amending Decision ECB/2010/14 on the authenticity and fitness checking and recirculation of euro banknotes.” Official Journal of the European Union, vol. L253/19, 2012.
-  I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016, http://www.deeplearningbook.org.
-  O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein et al., “Imagenet large scale visual recognition challenge,” IJCV, vol. 115, no. 3, pp. 211–252, 2015.
-  A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
-  M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., “Tensorflow: A system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.
-  J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?” in NIPS, 2014, pp. 3320–3328.
-  J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “Decaf: A deep convolutional activation feature for generic visual recognition,” in International conference on machine learning, 2014, pp. 647–655.
-  A. K. Reyes, J. C. Caicedo, and J. E. Camargo, “Fine-tuning deep convolutional networks for plant recognition.” CLEF (Working Notes), vol. 1391, 2015.
-  H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep convolutional neural networks for computer-aided detection: Cnn architectures, dataset characteristics and transfer learning,” IEEE transactions on medical imaging, vol. 35, no. 5, pp. 1285–1298, 2016.
-  M. Hon and N. M. Khan, “Towards alzheimer’s disease classification through transfer learning,” in 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2017, pp. 1166–1169.
-  A. Bendale and T. E. Boult, “Towards open set deep networks,” in CVPR, 2016.
-  European Central Bank, “Decision of the European central bank of 16 September 2010 on the authenticity and fitness checking and recirculation of euro banknotes.” Official Journal of the European Union, vol. L267/1, 2010.
-  ——. (2019) Successfully tested types of banknote handling machine - customer-operated machines. [Online]. Available: https://www.ecb.europa.eu/euro/cashprof/cashhand/generatedPdfs/Customer_Operated_Machines_2019-07-05.en.pdf
-  ——. (2019) Successfully tested types of banknote handling machine - staff-operated machines. [Online]. Available: https://www.ecb.europa.eu/euro/cashprof/cashhand/generatedPdfs/Staff_Operated_Machines_2019-07-05.en.pdf
-  European Central Bank. (2019) Procedures for testing banknote handling machine types. [Online]. Available: https://www.ecb.europa.eu/euro/cashprof/cashhand/recycling/html/proctest.en.html#procedure
-  C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in CVPR, 2016.
-  Google. (2019) Tensorflow implementation of pretrained inception-v3. [Online]. Available: https://github.com/tensorflow/models/tree/master/research/inception
-  Diebold Nixdorf. (2019) CS4060 cash recycling ATM. [Online]. Available: https://www.dieboldnixdorf.com/en-us/financial-institutions/systems/cash-recyclers/cs%204060
-  CI Tech Sensors AG. (2019) Our technology - your security. [Online]. Available: https://www.citechsensors.com/en/technology.html