In the last decade, the development and wide adoption of Intelligent Transportation Systems (ITS) has resulted in a growing demand for reliable license plate recognition systems. Examples of applications of such systems include speed radars, surveillance systems, and automatic parking access systems [1, 2].
At the base of any license plate recognition system is Optical Character Recognition (OCR). Most OCR systems consist of three steps. First, the image is split into small segments that each contain one character. This step is called segmentation.
In order to recognize characters from images, a support vector machine (SVM) classifier will be used. The SVM is a statistical classifier based on supervised learning: it learns patterns from labelled training data and classifies new data based on these patterns. However, raw images contain a large amount of data that is useless to the classifier, which reduces efficiency and requires more training data. Because of this, instead of analyzing raw data, classifiers are supplied with a small number of features that describe the data. The process of selecting and calculating these features is called feature extraction.
In this paper, a method of feature extraction based on compressive sensing theory will be presented. Compressive sensing is a fairly new field that has been the subject of intensive research in the last decade. This new method of sampling enables the acquisition of data with far fewer samples than required by the Shannon-Nyquist sampling theorem. If the original data and the sampling method meet certain conditions, the data can be fully reconstructed from this small number of samples by solving certain mathematical optimization problems. It follows that these samples must contain enough information about the images to recognize them. The fact that they pack a lot of information into a small amount of data makes these samples a perfect candidate for classification features [9, 12].
The first step in the recognition of the characters on a license plate is segmentation. This process involves dividing the image into smaller segments, each of which should contain one character of the license plate text.
This task is usually performed by a simple algorithm. First, an adaptive threshold is applied in order to obtain a binary image. This simplifies the next steps and eliminates feature variations due to differences in brightness. The image is then analyzed and all connected objects are extracted into separate images. Naturally, this simple algorithm cannot distinguish between characters and other objects; that is the task of the classifier. In order to reduce the number of segments being analyzed, segments containing fewer than a certain number of pixels are discarded immediately.
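The segmentation steps above can be sketched in a few lines. The following is a minimal Python version, not the paper's implementation; the neighbourhood size, threshold offset, and minimum pixel count are assumed values:

```python
import numpy as np
from scipy import ndimage

def segment_characters(img, min_pixels=20, block=15):
    """Split a grayscale plate image into per-object binary segments."""
    # Adaptive threshold: compare each pixel to the mean of its neighbourhood.
    local_mean = ndimage.uniform_filter(img.astype(float), size=block)
    binary = img < local_mean - 2          # dark characters on a bright plate
    # Extract every connected object into its own image.
    labels, _ = ndimage.label(binary)
    segments = []
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        mask = labels[sl] == i
        # Discard segments with too few pixels (noise, small specks).
        if mask.sum() >= min_pixels:
            segments.append(mask.astype(np.uint8))
    return segments
```

Each returned segment is a cropped binary image of one connected object; classifying which objects are actual characters is left to the later classification step, exactly as described above.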
As Figure 1 shows, besides the license plate number and the text above it, the algorithm also detects other objects, such as the frame of the license plate. Smaller objects are successfully ignored by the algorithm.
III Feature extraction
Feature extraction is the single most important step in character recognition. Features enable the classifier to recognize a character and distinguish it from other characters. Because of this, choosing the appropriate features directly affects the performance and precision of the classifier.
In order to ensure precise distinction between different characters, the extracted features should satisfy a few conditions. Firstly, the features should be invariant to the expected distortions and variations that a character may have in a specific image. For example, the size of the character should not affect any of the features; because of this, the segments from the previous step should be scaled to a fixed size. The binarization, which is also done in the segmentation step, ensures that the brightness and color of the characters do not affect the features. In other words, the features for different images of the same character should be as similar as possible.
Another condition concerns the size of the feature vectors: a large number of features can negatively affect classification performance and requires more training data. Because of this, the features should represent only the information that is useful for distinguishing between characters.
III-A Compressive sensing
In order to reduce the number of features, this paper proposes a dimensionality reduction method based on compressive sensing. Compressive sensing is a concept proposed as an improvement over the Shannon-Nyquist sampling theorem, the fundamental theorem of signal processing. The Shannon-Nyquist theorem states that a continuous signal can be reconstructed from its digital samples if the sampling frequency is at least twice the maximal frequency of the signal. This results in a large amount of data, most of which is discarded when the signal is compressed. Compressive sensing overcomes this problem by directly taking a small number of measurements from the signal.
The measurement process can be written as a linear system

y = A x, (1)

where x ∈ R^N is the original signal, y ∈ R^m is the vector of measurements, and the matrix A ∈ R^(m×N) models the linear measurement process. In order to reconstruct the original signal, the above linear system has to be solved. According to the Shannon-Nyquist sampling theorem, m must be at least as large as N, in which case (1) yields a unique solution. In the case that m < N, classical linear algebra indicates that the linear system (1) is underdetermined and has an infinite number of solutions, making recovery of the original signal impossible. However, according to compressive sensing theory, it is actually possible to reconstruct x even from m ≪ N measurements if certain conditions are met [3, 5].
One of the conditions that must be met is sparsity, i.e. the signal must be sparse in a certain transformation domain in order to ensure a successful reconstruction. The signal is assumed to be s-sparse (i.e. to have at most s non-zero coefficients) in one of the common transformation domains.
An important factor that affects the signal reconstruction is the quality of the measurement matrix. One good metric for the quality of a measurement matrix is the restricted isometry property (RIP), introduced by Candès and Tao.
The s-th restricted isometry constant δ_s of a matrix A is the smallest δ ≥ 0 such that

(1 - δ) ||x||₂² ≤ ||A x||₂² ≤ (1 + δ) ||x||₂²

for all s-sparse vectors x. We say that A satisfies the restricted isometry property if δ_s is small for a reasonably large s.
Some of the commonly used measurement matrices that satisfy this condition are random Gaussian, Bernoulli, and partial random Fourier matrices. In this paper, a random Bernoulli matrix will be used. With this matrix, the lower bound for the number of measurements is on the order of

m ≥ C s ln(N/s),

where C is a universal constant.
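Generating such a matrix is straightforward. The sketch below uses assumed values for the signal length and sparsity, and scales the ±1 entries by 1/√m, a common normalization so that the matrix approximately preserves ℓ₂ norms:

```python
import numpy as np

rng = np.random.default_rng(0)
N, s = 1024, 16                           # signal length and sparsity (assumed values)
m = int(np.ceil(2 * s * np.log(N / s)))   # measurement count on the order of s*ln(N/s)
# Random Bernoulli measurement matrix: entries +/-1 with equal probability,
# scaled by 1/sqrt(m) so that A approximately preserves l2 norms.
A = rng.choice([-1.0, 1.0], size=(m, N)) / np.sqrt(m)
```

Note how few rows the matrix has relative to the signal length: the resulting system y = A x is heavily underdetermined, yet sparse signals remain recoverable.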
The development of a reasonably fast algorithm for signal reconstruction is very important. The first algorithmic approach that comes to mind is ℓ₀ minimization, i.e. the search for the sparsest vector consistent with the measured data (the ℓ₀-norm of a vector is defined as the number of its non-zero components). However, this problem is NP-hard in general and therefore not viable for use in practice. A very popular alternative method is ℓ₁ minimization, also known as basis pursuit, which consists in finding the minimizer of the following problem [3, 5, 6]:

min ||x||₁ subject to A x = y.
This optimization problem can be solved with efficient methods from convex optimization. Basis pursuit can be interpreted as the convex relaxation of ℓ₀ minimization. Alternative methods include greedy methods such as orthogonal matching pursuit, as well as thresholding-based methods such as iterative hard thresholding.
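As an illustration of the greedy alternative mentioned above, a compact orthogonal matching pursuit routine might look as follows. This is a generic textbook version, not tied to this paper's implementation:

```python
import numpy as np

def omp(A, y, s):
    """Orthogonal matching pursuit: greedily build the support of an
    s-sparse solution of y = A x, refitting by least squares each step."""
    residual = y.astype(float).copy()
    support = []
    for _ in range(s):
        # Choose the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat
```

When the support is identified correctly, the final least-squares fit recovers the non-zero coefficients essentially exactly, which is why greedy methods are attractive for small sparsity levels.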
In this paper, the total variation (TV) method will be used in order to test the reconstructibility of the character images from the measured samples. This method provides very good results in image processing applications. It is based on solving the following optimization problem [3, 18]:

min TV(x) subject to A x = y,

where the total variation TV(x) is the sum of the gradient magnitudes of x at each point.
Since a perfect reconstruction of the original signal from the compressive measurements is possible, we can assume that these measurements contain descriptive information about the signal. Therefore, these samples can be used as features in a classification algorithm [9, 12]. In the proposed approach, we will take compressed measurements from the character images using binary measurement matrices. The measurement vectors will then be used as feature vectors for training and testing the classifier.
IV The proposed approach
The previous chapters have given a brief summary of the theory on which this paper is based. We will now summarize an algorithm that implements the proposed approach.
First, the image of the license plate is converted to a black and white image. All groups of connected pixels are then extracted into separate images. In order to make the algorithm insensitive to image size, all the images are scaled to the same size. This completes the segmentation step.
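The rescaling to a fixed size can be done with any standard image resize. A minimal nearest-neighbour version is sketched below; the 16×16 target size is an assumption, not a value from the paper:

```python
import numpy as np

def rescale_nn(seg, out_h=16, out_w=16):
    """Nearest-neighbour rescaling of a binary segment to a fixed size,
    so that character size does not influence the extracted features."""
    h, w = seg.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return seg[np.ix_(rows, cols)]
```

Because the segments are binary, nearest-neighbour interpolation keeps the output binary as well, so no re-thresholding is needed after scaling.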
Each character image (i.e. segment) is then multiplied by the measurement matrix. This way, a small number of random samples will be collected from the signal. According to the CS theory from the previous chapter, these samples contain enough information about the signal to fully reconstruct it, and can therefore be used as features [9, 12].
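In code, this measurement step is a single matrix-vector product per segment. A minimal sketch, where the 16×16 segment size and the 64-measurement feature count are assumed values:

```python
import numpy as np

def extract_cs_features(segment, A):
    """Compress a fixed-size binary character segment into a feature vector."""
    x = segment.astype(float).ravel()   # vectorize the scaled binary image
    return A @ x                        # m random measurements serve as features

rng = np.random.default_rng(1)
m, h, w = 64, 16, 16                    # assumed feature count and segment size
# Binary (Bernoulli) measurement matrix, as in the previous chapter.
A = rng.choice([-1.0, 1.0], size=(m, h * w))
features = extract_cs_features(rng.integers(0, 2, size=(h, w)), A)
```

The same matrix A must be used for every segment, during both training and testing, so that the features of different images are comparable.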
In order to determine which character an image represents, a support vector machine (SVM) classifier is used. Since there are multiple classes, a multiclass SVM, a modified version of the standard classifier, has to be used. The training data for the classifier is a large set of different images and their corresponding labels. When the classifier is trained, it creates a model that is used to determine which class an image belongs to. In order to test the precision of the classifier, the available data is randomly split into a training set and a testing set. After training, the model's accuracy is measured by applying it to the testing set and comparing the predicted labels with the original labels.
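The train/test procedure can be sketched with scikit-learn as a stand-in for the classifier used in the paper. The synthetic feature data below is a placeholder for illustration only; the class count, samples per class, and feature count mirror the experiment sizes, but the values are random:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder data: 10 digit classes, 101 samples each, 64 features per sample.
y = np.repeat(np.arange(10), 101)
X = rng.normal(size=(1010, 64)) + y[:, None]   # synthetic, shifted per class
# Random 80/20 split into training and testing sets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
# Multiclass SVM; scikit-learn's SVC handles multiclass via one-vs-one internally.
model = SVC().fit(X_tr, y_tr)
accuracy = model.score(X_te, y_te)             # fraction of correctly predicted labels
```

Repeating this with different random splits and measurement matrices, and averaging the resulting accuracies, gives the kind of aggregate figures reported in the experiments below.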
In order to verify the performance of the proposed system, we have done a simulation on a sample character image set. The simulation has been done in Matlab 9.3.
The test was done using a character image set available in the Computer Vision System toolbox for Matlab. To reduce the complexity of the simulation, the testing was done using only digits (0-9). The aforementioned image set contains 1010 images of digits, i.e. 101 images for each digit, with dimensions of
pixels. A split ratio commonly used in machine learning, 4 to 1, was applied: 80% of the data is used for training and 20% for testing. To get more consistent results, the test was run 20 times, and the obtained results were averaged.
The images were sampled using a random Bernoulli matrix in which each element takes the values ±1 with equal probability. Figure 2 shows a sample character image and its reconstructions from different numbers of samples. The reconstruction has been done using the TV minimization algorithm from the ℓ₁-magic toolbox.
The samples extracted from the images have been used as features for the classification step. The classifier used in this test was a multiclass SVM with error correcting output codes, using a ’one-vs-one’ coding scheme. Firstly, the training set of features and labels was used to train a model. Then, the testing set of features was applied to the model in order to test the classification accuracy.
In the first test, 64 features were extracted from each character. By running the test 20 times (with different random measurement matrices and data splits), an average total accuracy of was achieved. The minimum observed total accuracy was , while the maximum was . In each run, a confusion matrix was generated, and the average scores for each character are shown in Table I.
Another test was done with 96 features. This time, the results were slightly better, achieving an average of and a minimum of . The scores by character are given in Table II.
As Figure 2 shows, 64 samples are not enough for a decent image reconstruction. However, the classification results show that this amount of data still provides fairly accurate recognition. A classification test was also done with as few as 32 samples, still achieving an average accuracy of , while a TV reconstruction from these samples was practically unrecognizable. This shows that the proposed classification method is very robust and can provide effective recognition even from a very small number of measurements.
The intensive research in compressive sensing reveals many new possible applications of the theory. In this paper, yet another successful application of these techniques has been demonstrated. The results show that compressive sensing based feature extraction performs very well in the classification of character images. The simple measurement matrix gives this method an advantage over other popular methods in terms of computational complexity.
-  O. Due Trier, A. K. Jain, T. Taxt (1996) “Feature Extraction Methods for Character Recognition - A Survey”, Pat. Recog., (Vol. 29), pp 641-662.
-  S. Stanković, I. Orović, E. Sejdić (2016) “Multimedia Signals and Systems”, Springer International Publishing.
-  J. Musić, T. Marasović, V. Papić, I. Orović, and S. Stanković, “Performance of compressive sensing image reconstruction for search and rescue,” IEEE Geoscience and Remote Sensing Letters, Volume: 13, Issue: 11, pp. 1739 - 1743, Nov. 2016
-  Lj. Stanković (2015) “Digital Signal Processing”, CreateSpace Independent Publishing Platform.
-  S. Foucart, H. Rauhut (2013) “A Mathematical Introduction to Compressive Sensing”, Springer, New York.
-  E. Sejdic, I. Orovic, S. Stankovic, "Compressive sensing meets time-frequency: An overview of recent advances in time-frequency processing of sparse signals," Digital Sig. Proc., Vol. 77, June 2018, pp. 22-35.
-  S. Stankovic, I. Orovic, "An Approach to 2D Signals Recovering in Compressive Sensing Context," Circuits, Systems, and Signal Processing, Vol. 36, Issue 4, pp. 1700-1713, April 2017.
-  M. Lohne (2016) "Face Recognition with Compressive Sensing".
-  LJ. Stankovic, S. Stankovic, T. Thayaparan, M. Dakovic, I. Orovic, "Separation and Reconstruction of the Rigid Body and Micro-Doppler Signal in ISAR Part I-Theory," IET Radar, Sonar and Navigation, vol. 9, no. 9, pp. 1147-1154, 2015.
-  LJ. Stankovic, S. Stankovic, T. Thayaparan, M. Dakovic, I. Orovic, "Separation and Reconstruction of the Rigid Body and Micro-Doppler Signal in ISAR Part II-Statistical Analysis," IET Radar, Sonar and Navigation, vol. 9, no. 9, pp. 1155-1161, 2015.
-  A. Değerli, S. Aslan, M. Yamac, B. Sankur, M. Gabbouj, "Compressively Sensed Image Recognition", arXiv:1810.06323.
-  X. Ye, F. Min (2018) "A method of vehicle license plate recognition based on PCANet and compressive sensing", Proc. SPIE 10609, MIPPR 2017: Pattern Recognition and Computer Vision.
-  M. Brajović, S. Stanković, I. Orović, “Analysis of noisy coefficients in the discrete Hermite transform domain with application in signal denoising and sparse signal reconstruction,” Sig. Proc., In press, 2018.
-  I. Orovic, A. Draganic, S. Stankovic, "Sparse Time-Frequency Representation for Signals with Fast Varying Instantaneous Frequency," IET Radar, Sonar and Navigation, Vol. 9, Issue: 9, pp. 1260 - 1267.
-  A. Draganic, I. Orovic, S. Stankovic, X. Li, Z. Wang, "An approach to classification and under-sampling of the interfering wireless signals," Microprocessors and Microsystems, Vol. 51, June 2017, pp. 106-113.
-  E. Candès, T. Tao (2005) "Decoding by linear programming", IEEE Transactions on Information Theory 51, No. 12, pp. 4203-4215.
-  N. Lekić, A. Draganić, I. Orović, S. Stanković, and V. Papić, “Iris print extracting from reduced and scrambled set of pixels,” Second International Balkan Conference on Communications and Networking BalkanCom 2018, Montenegro, June 6-8, 2018.
-  E. Candès, J. Romberg (2005) "ℓ₁-magic: Recovery of Sparse Signals via Convex Programming".
-  C. Bernard, C. Ioana, I. Orović, and S. Stanković, “Analysis of underwater signals with nonlinear time-frequency structures using warping based compressive sensing algorithm,” MTS/IEEE North American OCEANS conference, October 2015, Washington, DC, United States, 2015.
-  M. Daković, LJ. Stanković, and S. Stanković, “Gradient Algorithm Based ISAR Image Reconstruction From the Incomplete Dataset,” 3rd International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa) 2015.