Binarized Statistical Image Features (BSIF) have been shown to be effective in iris recognition [27, 28] as well as in iris presentation attack detection [9, 10, 26]. All these approaches applied the standard BSIF filters provided with the original paper [11]. These filters originate from a different domain than iris recognition. Using them assumes that filter kernels developed for a small set of natural images can serve as universal feature extractors independently of the application. However, we hypothesize that a) preparing domain-specific filters, by employing image patches sampled from the new domain for filter training, allows the extraction of features that are more discriminative for that domain, and b) observing how humans perform the iris recognition task helps in building more powerful feature extractors. Specifically, this paper answers the following questions:
Does the adaptation of BSIF filters to the iris recognition domain allow the extraction of more discriminative iris image features than standard BSIF filters?
Does a careful selection of iris image patches, based on regions used by humans performing the iris recognition task, increase iris recognition performance when compared to using filters trained on randomly selected iris image patches?
To calculate new iris-domain-specific filters, feedback from human subjects was used to select the most salient training samples. Each of the 86 subjects who participated in the experiments was presented with 10 iris image pairs (randomly selected out of 160 different pairs), and their task was to: (a) decide if each pair of images presents the same iris; during this step, the gaze of the subjects was automatically collected by means of an eye tracker device, and (b) annotate regions of the images supporting their decision. As a consequence, we compiled two sets of iris image patches deemed to be salient for matching iris images: one coming from the gaze of the subjects, and the other coming from the provided annotations. For comparison purposes, a third set of new iris-domain-specific filters was designed based on patches selected randomly from iris images, without involving humans.
To verify the hypothesis that domain-specific filters can outperform the standard BSIF filters, we applied a three-stage procedure incorporating specially designed subject-disjoint data sets, and statistical testing to verify if the observed differences among distinct approaches are statistically significant, as depicted in Fig. 1:
The first stage defines the application domain. The standard BSIF filters were trained on patches of natural images. Our filters are trained on human-selected iris image patches extracted from the training set of images. To maximize the heterogeneity of the sample iris images, this set includes image pairs representing different tails of the comparison score distributions (“easy” samples and “hard” samples), iris images of twins, samples with a large difference in pupil dilation, and even post-mortem iris images. Four different sensors (LG 4000, AD 100, IriShield MK 2120U, and a laboratory prototype based on a TheImagingSource DMK camera) were used in total to collect the images included in this set, to prevent the new filters from being sensor-specific rather than iris-domain-specific.
In the second stage, all the required hyper-parameters are selected for the classification methods using the standard and our newly designed filters. We use a separate set of iris images, subject-disjoint with the training set, to select the filter sets delivering the most discriminating features in the iris domain.
The third stage encompasses the statistical evaluation and comparison of the methods. A third set, subject-disjoint with the first two sets, is used to make this comparison sound from a statistical point of view.
The above rigorous procedure, incorporating subject-disjoint data in each stage, follows the typical scenario in which standard BSIF filters are currently applied in computer vision problems, and hence minimizes the risk of biased evaluations.
There are three main contributions of this paper:
Experiments and a comprehensive evaluation showing that a) domain-specific filters extract features that are more powerful in the domain at hand than standard BSIF filters, and b) using human feedback in the filter design process increases the chances of a further gain in the discrimination power of the domain-specific filters.
Domain-specific (new) filters ready to be applied in the standard BSIF pipeline for various tasks related to iris recognition, along with the iris image patches used in filter re-training and the testing database. The source codes, iris image patches and new retrained BSIF filters are available at https://github.com/CVRL/domain-specific-BSIF-for-iris-recognition. Please follow the instructions at https://cvrl.nd.edu/projects/data/ to get a copy of the test database.
Source codes (available at the repository listed above) of the iris recognition method using domain-specific BSIF filters, which offers better accuracy than other open-source BSIF-based and Gabor-based iris recognition methods.
The main purpose of this paper is to show, to our knowledge for the first time, that we may benefit from human feedback when building feature extractors for iris recognition. The next section defines the context of this paper and presents iris recognition methods based on BSIF. Sec. 3 summarizes the related work. In Sec. 4.1, we present the experimental setup designed to build the training set and to train new filters, while Sec. 4.2 provides the filter selection methodology. Sec. 4.3 presents a statistical analysis of the results obtained for standard and new filters, as well as a comparison with other open-source iris recognition methods (one BSIF-based and one incorporating Gabor filtering) on the same test set. Finally, in Sec. 5, we conclude the paper and elaborate on future work.
2 Domain Definition: Iris Recognition
Iris textures observed in near-infrared light are subjectively rather different from textures in natural images. This paper focuses on iris recognition as a specific example domain for which we design new domain-specific BSIF filters.
There are various ways in which the BSIF pipeline can be adapted for iris recognition. A straightforward way, proposed earlier in the literature, is to use the histogram, either normalized or raw, of the composed filtering results as an iris image template, see Fig. 2; this is termed the “BSIF code”. In this approach, the unwrapped iris image is filtered by a set of $n$ filters of size $l \times l$ pixels. Each filter response is then binarized (with a threshold at zero), the resulting $n$ bits for each pixel are translated into an $n$-bit gray-scale value, and finally all gray-scale values calculated for the entire image are represented as a histogram with $2^n$ bins. In our implementation of this approach, we use an occlusion mask to exclude regions of the BSIF code that do not correspond to the iris texture, and we use only valid portions of the convolution results when calculating the histograms. Finally, the comparison score between iris templates is the distance between the raw template histogram and the raw probe histogram, accumulated over the histogram bins, with a small constant guarding against division by zero.
As in the original BSIF pipeline, we have additionally considered normalized histograms to calculate an alternative comparison score.
In all our experiments we use normalized iris images and the corresponding normalized occlusion masks calculated by the open-source OSIRIS software [22]. Each normalized image has a resolution of $512 \times 64$ pixels, which translates to 512 sampling points along the iris circle and 64 sampling points along the iris radius (from the pupil to the sclera). Note that iris rotation in Cartesian coordinates corresponds to a circular shift of the normalized iris image in polar coordinates. This means that even if the mutual rotation between the template and the probe is non-zero, the only thing that changes is the spatial location of elements within the normalized image, hence the resulting histograms are the same.
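As an illustration of the histogram-based BSIF code described in this section, the following minimal Python sketch filters a small image with a set of filters, binarizes the responses at zero, packs the bits into per-pixel integer codes, and accumulates a masked histogram. The function names, the way the occlusion mask is applied (at the center pixel of each filter window), and the chi-square-style histogram distance are assumptions made for illustration; they are not taken from the released source code.

```python
def bsif_codes(image, filters):
    """Per-pixel BSIF codes over the valid filtering region.

    image: 2D list of numbers; filters: list of square 2D lists of equal size.
    The kernel is applied without flipping (correlation), an illustrative choice.
    """
    size = len(filters[0])
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(rows - size + 1):
        code_row = []
        for c in range(cols - size + 1):
            value = 0
            for bit, kernel in enumerate(filters):
                response = sum(kernel[i][j] * image[r + i][c + j]
                               for i in range(size) for j in range(size))
                if response > 0:          # binarize with a threshold at zero
                    value |= 1 << bit     # one bit per filter
            code_row.append(value)
        codes.append(code_row)
    return codes


def masked_histogram(codes, mask, n_filters, size):
    """Histogram with 2**n_filters bins, skipping occluded pixels.

    Assumption: a code is kept when the mask is non-zero at the window center.
    """
    hist = [0] * (2 ** n_filters)
    half = size // 2
    for r, code_row in enumerate(codes):
        for c, value in enumerate(code_row):
            if mask[r + half][c + half]:
                hist[value] += 1
    return hist


def chi_square_distance(h_t, h_p, eps=1e-10):
    """Chi-square-style distance (assumed); eps guards against division by zero."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h_t, h_p))


# Toy example: two 3x3 difference filters applied to a 5x7 horizontal ramp.
horizontal = [[-1, 0, 1]] * 3
vertical = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
ramp = [[c for c in range(7)] for _ in range(5)]
full_mask = [[1] * 7 for _ in range(5)]

codes = bsif_codes(ramp, [horizontal, vertical])
hist = masked_histogram(codes, full_mask, n_filters=2, size=3)
print(hist)
```

With real data, `image` and `mask` would be the normalized iris image and its occlusion mask, and `filters` one of the BSIF filter sets; pure-Python loops are used here only to keep the sketch dependency-free.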
Rathgeb et al. [28] proposed to calculate histograms locally in predefined iris image patches (however, without excluding the occluded iris areas) and to binarize the histograms to obtain a compact iris image representation. We also include this method in the comparison in Section 4.3.
An alternative solution, which delivers better results in our experiments, uses the binarized responses of the filters directly to construct iris codes, as illustrated in Fig. 3. Each iris code forming an iris template (one code per filter) is compared independently with the corresponding probe iris code by calculating the fractional Hamming distance of the non-occluded iris portions.
In this distance, the template and probe codes are compared bit by bit with an exclusive OR operation, and a logical AND operation with the binary masks (in which 1’s indicate valid areas of the template and probe samples, respectively) restricts the comparison to areas that are valid in both samples. The number of 1’s in the result is divided by the number of 1’s in the combined mask, and the probe image and the corresponding probe mask are rotated by a range of angles to find the best match. The final comparison score may be calculated in three ways, by taking the average, minimum, or maximum of the fractional Hamming distances calculated for the pairs of codes (one for each filter). This gives three further ways of calculating the comparison score considered in this work, in addition to the two histogram-based scores above.
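The code-based comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the released implementation: the function names are hypothetical, the rotation is modeled as a circular column shift of the normalized code and mask, and the shift range `max_shift`, the per-code minimization over shifts, and the score returned when no bits are mutually valid are assumptions.

```python
def shift_columns(rows, shift):
    """Circularly shift each row; an iris rotation becomes a column shift
    in the normalized (polar) image."""
    out = []
    for row in rows:
        s = shift % len(row)
        out.append(row[-s:] + row[:-s] if s else list(row))
    return out


def fractional_hd(code_t, code_p, mask_t, mask_p):
    """Fraction of disagreeing bits among bits valid in both masks."""
    disagree = valid = 0
    for row_ct, row_cp, row_mt, row_mp in zip(code_t, code_p, mask_t, mask_p):
        for ct, cp, mt, mp in zip(row_ct, row_cp, row_mt, row_mp):
            if mt and mp:              # logical AND of the two masks
                valid += 1
                disagree += ct ^ cp    # exclusive OR of the code bits
    # Assumption: with no mutually valid bits, return the worst score.
    return disagree / valid if valid else 1.0


def best_hd(code_t, code_p, mask_t, mask_p, max_shift=2):
    """Minimum fractional Hamming distance over probe rotations."""
    return min(
        fractional_hd(code_t, shift_columns(code_p, s),
                      mask_t, shift_columns(mask_p, s))
        for s in range(-max_shift, max_shift + 1)
    )


def comparison_scores(template, probe, mask_t, mask_p, max_shift=2):
    """template, probe: lists of binary codes, one code per filter."""
    hds = [best_hd(ct, cp, mask_t, mask_p, max_shift)
           for ct, cp in zip(template, probe)]
    return {"mean": sum(hds) / len(hds), "min": min(hds), "max": max(hds)}


# Toy example: a one-row code and a copy rotated by one column.
t = [[1, 0, 1, 1, 0, 0, 1, 0]]
p = shift_columns(t, 1)
ones = [[1] * 8]
print(fractional_hd(t, p, ones, ones), best_hd(t, p, ones, ones))
```

The three entries returned by `comparison_scores` correspond to the average, minimum, and maximum strategies; only the rotation search makes the rotated copy match its original.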
3 Related Work
The present paper is related to the step of iris feature extraction within the typical iris recognition pipeline. Following the seminal work of Daugman, the literature on iris feature extraction was initially dominated by solutions that aimed at describing iris texture through the quantization of filter responses over iris images. While Daugman originally proposed the use of Gabor filters, other approaches investigated alternative filters, ranging from Haar wavelets [14], wavelet packets [13, 6], and spatial filter banks [16], to directional filter banks [24], to name only a few.
In contrast to texture description, some researchers have lately tried to describe alternative iris features, including salient interest points [1, 2] and human-interpretable features [5]. Nonetheless, a prominent set of works has still been focusing on iris texture, specifically employing general-purpose texture descriptors such as Local Binary Patterns (LBP) [20], Local Phase Quantization (LPQ) [21], and BSIF [27, 28], the last of which reportedly yields better iris recognition systems.
The idea of improving the BSIF descriptor with domain-specific filters was recently verified by Ouamane et al. [23] in the problem of face recognition, where new filters were learned from two- and three-dimensional face images. In the particular case of iris recognition, though, the same idea is yet to be investigated, to the best of our knowledge.
Last but not least, the idea of having humans in the loop, providing human-machine collaboration towards the solution of difficult recognition problems, has been applied to varied domains, such as object [3, 19, 17], face [4], iris [18], and even galaxy recognition [15]. Human contributions may range from question-and-answer opinions [3, 18, 4], to predefined type selection, to gaze point collection, and to free shape annotation.
Relating this paper to previous work, this is (to our knowledge) the first work to show that re-training BSIF filters on the iris image domain improves performance over the default filters. This is also the first work that engages human subjects in constructing iris feature extractors, using both human gaze tracking and manual image annotation.
4 Experiments and Results
In this Section we summarize the experimental setup and report the obtained results.
4.1 Filter Training
The task of filter training is to compute a set of filters from a training set of image patches, by maximizing the statistical independence of the filter responses when the filters are applied to the patches.
Iris Patch Extraction.
The set of image patches is obtained from the training set, hence making the proposed solution specific to the domain of iris recognition. Moreover, rather than randomly sampling the irises belonging to this set, we rely upon the opinions and behavior of human subjects to find regions of interest.
To gather people’s input, we set up an experiment with 86 volunteers, comprising university staff and students who were not necessarily familiar with the problem of iris recognition. Through a specially designed web application interface and dedicated eye-tracking hardware (Fig. 4), ten pairs of iris images belonging to the training set were presented to each volunteer, depicting either authentic or impostor cases. The set was composed of 800 iris images (400 pairs) of various quality and origin. To increase the variation of samples, and to make the iris recognition problem more challenging to humans, it was composed of a mixture of matching and non-matching pairs representing images that were “easy” or “hard” to match according to the OSIRIS software, images of the same irises but with an excessive difference in pupil dilation, images of different eyes from identical twins and triplets, and even post-mortem iris images.
The task of the volunteers participating in the experiment was to decide, for each pair, whether or not the displayed irises belonged to the same eye. While doing this, their gaze was automatically collected through an eye tracker device, whose data was later processed using a spatio-temporal criterion to localize important regions according to the gaze persistence over the same neighborhood (depicted in Fig. 5). Next, subjects were asked to manually annotate each iris pair with arbitrary shapes, highlighting regions that supported their decisions. As one can observe in Fig. 6, matching regions were annotated as green connections, at the subject’s discretion.
Only masked, clean near-infrared iris images were presented to the volunteers, guaranteeing that decisions were made solely based on the iris texture, rather than on periocular information. All image segmentations in this set were done manually to guarantee the correct masking of periocular regions and of more challenging areas, such as irregular eyelashes, specular reflections or cornea wrinkles in post-mortem iris images.
Prior to extracting the patches, both manual annotations and gaze-based regions of interest (which were initially established on the original iris images to make them more intelligible to humans) had to be normalized with Daugman’s rubber sheet transformation. This follows the typical iris-code-based recognition pipeline applied in earlier works. After normalization, patches were extracted from each human-inspired region according to the largest inscribed rectangle, as shown in Fig. 7. To ensure the quality of the annotations and of the gaze-based regions of interest, patches were extracted only from genuine iris pairs that were correctly classified by the subjects. This is important since humans are not considered to be skilled in the iris recognition task, hence using only correctly classified examples increases the probability of learning valuable information from human subjects.
Filter Set Computation.
Once the human-inspired iris patches are available, one can compute the set of ICA filters, which, by definition [11], depends on two parameters: the size of the filters and the number of filters. In this work we investigate multiple filter sizes and multiple numbers of filters in one set. This gives 96 different sets of filters (compare this to the 60 different sets available in the original BSIF).
Depending on the filter size, each iris patch is randomly sampled with squares whose sides are equal to that size. Sizes larger than the patch itself are discarded. As a consequence, we end up with 142,852 square sub-patches extracted from manual annotations, and 143,177 square sub-patches extracted from gaze-based regions of interest, which are then used for filter training. We never mix annotation-based and gaze-based patches in the experiments. The computation of filters is done with the scikit-learn FastICA implementation [25]. The extracted human-inspired iris image patches are available along with the paper for those who want to apply other software packages or methodologies for filter training.
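The sub-patch sampling and filter computation described above can be sketched as follows. This is only an illustration under stated assumptions: it requires NumPy and scikit-learn, the function names `sample_subpatches` and `train_filters` are hypothetical, synthetic Laplace noise stands in for the iris patches, and the number of crops per patch, the filter size, and the number of filters are arbitrary toy values rather than the settings used in the paper.

```python
import numpy as np
from sklearn.decomposition import FastICA


def sample_subpatches(patches, size, per_patch, rng):
    """Randomly crop `per_patch` size x size squares from each patch."""
    crops = []
    for patch in patches:
        height, width = patch.shape
        if size > height or size > width:
            continue  # sizes larger than the patch itself are discarded
        for _ in range(per_patch):
            r = rng.integers(0, height - size + 1)
            c = rng.integers(0, width - size + 1)
            crops.append(patch[r:r + size, c:c + size].ravel())
    return np.asarray(crops, dtype=float)


def train_filters(subpatches, n_filters, size, seed=0):
    """Fit FastICA on flattened sub-patches; each component is one filter."""
    ica = FastICA(n_components=n_filters, random_state=seed, max_iter=1000)
    ica.fit(subpatches)
    # components_ maps (centered) pixels to independent components.
    return ica.components_.reshape(n_filters, size, size)


rng = np.random.default_rng(0)
patches = [rng.laplace(size=(20, 30)) for _ in range(40)]  # synthetic stand-ins
subpatches = sample_subpatches(patches, size=5, per_patch=50, rng=rng)
filters = train_filters(subpatches, n_filters=4, size=5)
print(filters.shape)
```

Keeping the output as an array of shape (number of filters, size, size) mirrors the two parameters that define a filter set, so one such array would be produced per combination of filter size and filter count.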
Fig. 8 compares two sets of original BSIF filters and our new filters, for two example scales. One interesting observation is that the original BSIF filters are in most cases similar to edge detectors at different angles (top row in Fig. 8). The new filters, however, suggest more sensitivity to dot-like features, which are more common in iris texture than straight edges.
4.2 Filter Selection and Matching Strategy
We use a separate set of 1,812 iris images representing 453 different irises of 330 new subjects (compared to the training set) to select the best set of filters and the best matching strategy out of the five strategies presented in Sec. 2. This set mixes samples acquired by both the AD 100 and LG 4000 iris sensors.
Selection of the best filter set, in terms of both filter size and number of filters, was performed for standard BSIF filters and newly designed filters independently. Hence, our search covered a Cartesian product of $60 \times 5 = 300$ combinations for standard BSIF filters, and $96 \times 5 = 480$ combinations of domain-specific filters derived from a single set of image patches. We took only a single image pair for each combination of eye (left/right), subject and sensor (AD 100 / LG 4000) when generating comparison scores, in order to minimize statistical dependence among scores. This resulted in 906 genuine comparisons and 453 impostor comparisons available for each evaluation. Some approaches out of the total number of 780 possibilities achieved an Equal Error Rate (EER) of 0. So, to make a further comparison among all solutions, we combined the sample means and sample variances of the genuine and impostor scores in the form of the so-called d-prime (or decidability, or detectability), $d'$.
In short, the further apart the mean values are for the same variances, the better the separation of the distributions. Similarly, keeping the same means and simultaneously narrowing the variances gives a higher level of distinction between genuine and impostor scores. Consequently, the value of $d'$ estimates the degree to which the distributions of genuine and impostor scores overlap. For uncorrelated random variables, $d'$ is proportional to a standardized difference between the expected values of two random variables, often used in hypothesis testing.
Figure 9 presents the distributions of genuine and impostor scores, along with the best combinations of filters, for the newly designed filters based on eye tracker-based image patches and for all five matching strategies. These plots suggest that a solution calculating the mean fractional Hamming distance between binary codes works better (achieves higher $d'$) than solutions based on histograms. This matching strategy wins both when standard BSIF filters and when filters learned from iris data are used, and thus only this approach is used in the final evaluations on the test set.
4.3 Testing and Comparison of Solutions
To verify the hypothesis about the advantage of domain-specific filters, we conduct final experiments on an additional set of 1,900 iris images acquired from another 330 subjects (different from the subjects represented in the other two sets). Fig. 10 presents the comparison score distributions, EER and $d'$ obtained on the test set for all BSIF-based methods. To relate these approaches to other methods, we add results obtained for the OSIRIS matcher [22] and the approach of Rathgeb et al. [28] distributed along with the USIT package [29], as shown in Fig. 11.
The first observation is that domain-specific filters based on random patches achieve better $d'$ than standard BSIF filters (cf. Fig. 10a-b). The second observation is that learning new filters from patches derived from human-annotated regions results in a further increase of $d'$ (Fig. 10c). Finally, one can notice the highest $d'$ and the lowest EER obtained for filters learned from eye tracker-based iris image patches (Fig. 10d). The latter method also outperforms the OSIRIS and USIT algorithms. This suggests that designing filters from the image areas that subjects look at when performing the iris recognition task yields the most discriminative iris features.
To verify whether the observed differences are statistically significant, we randomly selected (with replacement) 30 sets of genuine and impostor comparison scores calculated for the test samples. We decided to evaluate each subset of scores by analyzing $d'$, since the evaluated methods performed perfectly (EER = 0) on some of the generated 30 sets of comparisons. The resulting boxplots, gathering all 30 random selections of scores for all methods considered in the paper, are presented in Fig. 12. One may note the superiority of the domain-specific filters. We applied one-way ANOVA to verify whether the differences in performance observed in Fig. 12 are statistically significant. (One of the ANOVA requirements is normality of the tested distributions; the Shapiro-Wilk test for normality suggests that there is no reason to reject the hypothesis of normality for any of the distributions of $d'$ used in this evaluation, with $p$-values depending on the approach.) The $p$-values obtained for all pairs of approaches listed along the horizontal axis in Fig. 12 indicate that the observed differences in $d'$ are statistically significant at the assumed significance level.
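The resampling step can be illustrated with a short Python sketch. It draws 30 sets of genuine and impostor scores with replacement and computes $d'$ for each set; the per-method lists of 30 values are what a one-way ANOVA would then compare. The synthetic Gaussian scores, the choice to draw each resampled set with the same size as the original sample, and the function names are assumptions for illustration, not details taken from the paper.

```python
import random
from math import sqrt
from statistics import mean, variance


def d_prime(genuine, impostor):
    """Decidability (assumed standard definition with sample variances)."""
    pooled = 0.5 * (variance(genuine) + variance(impostor))
    return abs(mean(genuine) - mean(impostor)) / sqrt(pooled)


def bootstrap_d_primes(genuine, impostor, n_sets=30, seed=0):
    """d' for n_sets resamples of the scores, drawn with replacement."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sets):
        g = rng.choices(genuine, k=len(genuine))     # sampling with replacement
        i = rng.choices(impostor, k=len(impostor))
        values.append(d_prime(g, i))
    return values


# Synthetic stand-ins for the comparison scores of one method.
score_rng = random.Random(0)
genuine = [score_rng.gauss(0.25, 0.05) for _ in range(200)]
impostor = [score_rng.gauss(0.45, 0.03) for _ in range(200)]

values = bootstrap_d_primes(genuine, impostor, n_sets=30, seed=1)
print(min(values), max(values))
```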
A possible explanation of the significantly lower performance of the method included in the USIT package, as observed in Fig. 12, is that this method does not incorporate an occlusion mask. All other methods evaluated in this paper exclude non-iris portions of the iris annulus from feature comparison.
Hence, the answer to the first question posed in the introduction is affirmative: the adaptation of BSIF filters to the iris recognition domain does allow the extraction of more discriminative iris image features than standard BSIF filters. The answer to the second question is also affirmative: a careful selection of iris image patches, based on regions used by humans performing the iris recognition task, does increase iris recognition performance when compared to using filters trained on randomly selected iris image patches.
5 Conclusions
Works in the literature have consistently verified the effectiveness of BSIF texture descriptors in iris recognition. The following question has arisen: if a set of filters learned from arbitrary images is useful for iris recognition, what improvements could result if such filters were replaced by iris-sourced ones? This paper shows, through a comprehensive three-step subject-disjoint experimental setup, that learning new iris-specific BSIF filters, especially employing iris image regions important to humans, results in a statistically significant improvement in iris recognition accuracy. To the best of our knowledge, this is the first time humans were put in the loop for learning domain-specific BSIF filters. For completeness, we also presented results for filters trained on iris patches selected randomly.
As an important contribution of this work, we make our iris-sourced BSIF filters and all iris image patches used in training available to the community, along with the database of test images. We also offer the source codes of the iris recognition method based on the newly designed filters that proved the best in our evaluations. These allow for a full reproduction of the results presented in this paper.
The next interesting research step would be to apply these domain-specific filters in other areas related to iris recognition, for instance presentation attack detection, in the algorithms that have already used standard BSIF successfully. Since we keep the format of the re-trained filters identical to that of the standard filters, this replacement is trivial.
The authors would like to cordially thank Dr. Sidney D’Mello and Mr. Robert Bixler for making their laboratory and eye tracker device available for conducting the experiments.
[1] F. Alonso-Fernandez, P. Tome-Gonzalez, V. Ruiz-Albacete, and J. Ortega-Garcia. Iris Recognition Based on SIFT Features. In IEEE Intl. Conference on Biometrics, Identity and Security, pages 1–8, 2009.
[2] C. Belcher and Y. Du. Region-Based SIFT Approach to Iris Recognition. Elsevier Optics and Lasers in Engineering, 47(1):139–147, 2009.
[3] S. Branson, C. Wah, F. Schroff, B. Babenko, P. Welinder, P. Perona, and S. Belongie. Visual Recognition with Humans in the Loop. In Springer European Conference on Computer Vision, pages 438–451, 2010.
[4] C. Cao and H.-Z. Ai. Facial Similarity Learning with Humans in the Loop. Springer Journal of Computer Science and Technology, 30(3):499–510, 2015.
[5] J. Chen, F. Shen, D. Chen, and P. Flynn. Iris Recognition Based on Human-Interpretable Features. IEEE Transactions on Information Forensics and Security, 11(7):1476–1485, 2016.
[6] A. Czajka and A. Pacut. Iris Recognition System Based on Zak-Gabor Wavelet Packets. Journal of Telecommunications and Information Technology, 4:10–18, 2010.
[7] J. Daugman. High Confidence Visual Recognition of Persons by a Test of Statistical Independence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1148–1161, 1993.
[8] J. Daugman. How Iris Recognition Works. IEEE Transactions on Circuits and Systems for Video Technology, 14(1):21–30, 2004.
[9] J. S. Doyle and K. W. Bowyer. Robust Detection of Textured Contact Lenses in Iris Recognition Using BSIF. IEEE Access, 3:1672–1683, 2015.
[10] L. Ghiani, A. Hadid, G. L. Marcialis, and F. Roli. Fingerprint Liveness Detection Using Binarized Statistical Image Features. In 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS), pages 1–6, Sept 2013.
[11] J. Kannala and E. Rahtu. BSIF: Binarized Statistical Image Features. Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), pages 1363–1366, Nov 2012.
[12] A. Kong, D. Zhang, and M. Kamel. An Analysis of IrisCode. IEEE Transactions on Image Processing, 19(2):522–532, 2010.
[13] E. Krichen, A. Mellakh, S. Garcia-Salicetti, and B. Dorizzi. Iris Identification using Wavelet Packets. In IEEE Intl. Conference on Pattern Recognition, pages 335–338, 2004.
[14] S. Lim, K. Lee, O. Byeon, and T. Kim. Efficient Iris Recognition through Improvement of Feature Vector and Classifier. Electronics and Telecommunications Research Institute (ETRI) Journal, 23(2):61–70, 2001.
[15] C. Lintott, K. Schawinski, A. Slosar, K. Land, S. Bamford, D. Thomas, J. Raddick, R. Nichol, A. Szalay, D. Andreescu, et al. Galaxy Zoo: Morphologies Derived from Visual Inspection of Galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society (MNRAS), 389(3):1179–1189, 2008.
[16] L. Ma, Y. Wang, and T. Tan. Iris Recognition using Circular Symmetric Filters. In IEEE Intl. Conference on Pattern Recognition, pages 414–417, 2002.
[17] S. Manen, M. Gygli, D. Dai, and L. Van Gool. PathTrack: Fast Trajectory Annotation with Path Supervision. In Intl. Conference on Computer Vision, pages 1–10, 2017.
[18] K. McGinn, S. Tarin, and K. Bowyer. Identity Verification using Iris Images: Performance of Human Examiners. In IEEE Intl. Conference on Biometrics: Theory, Applications and Systems, pages 1–6, 2013.
[19] N. Murrugarra-Llerena and A. Kovashka. Learning Attributes from Human Gaze. In IEEE Winter Conference on Applications of Computer Vision, pages 510–519, 2017.
[20] T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):971–987, 2002.
[21] V. Ojansivu and J. Heikkilä. Blur Insensitive Texture Classification Using Local Phase Quantization. In Springer Intl. Conference on Image and Signal Processing, pages 236–243, 2008.
[22] N. Othman, B. Dorizzi, and S. Garcia-Salicetti. OSIRIS: An Open Source Iris Recognition Software. Elsevier Pattern Recognition Letters, 82(2):124–131, 2016.
[23] A. Ouamane, E. Boutellaa, M. Bengherabi, A. Taleb-Ahmed, and A. Hadid. A Novel Statistical and Multiscale Local Binary Feature for 2D and 3D Face Verification. Elsevier Computers & Electrical Engineering, 62(1), 2017.
[24] C.-H. Park, J.-J. Lee, M. Smith, and K.-H. Park. Iris-Based Personal Authentication Using a Normalized Directional Energy Feature. In Springer Intl. Conference on Audio- and Video-Based Biometric Person Authentication, pages 224–232, 2003.
[25] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research (JMLR), 12:2825–2830, 2011.
[26] R. Raghavendra and C. Busch. Robust Scheme for Iris Presentation Attack Detection Using Multiscale Binarized Statistical Image Features. IEEE Transactions on Information Forensics and Security, 10(4):703–715, April 2015.
[27] K. B. Raja, R. Raghavendra, and C. Busch. Binarized Statistical Features for Improved Iris and Periocular Recognition in Visible Spectrum. In 2nd International Workshop on Biometrics and Forensics, pages 1–6, March 2014.
[28] C. Rathgeb, F. Struck, and C. Busch. Efficient BSIF-based Near-Infrared Iris Recognition. In 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA), pages 1–6, Dec 2016.
[29] C. Rathgeb, A. Uhl, P. Wild, and H. Hofbauer. Design Decisions for an Iris Recognition SDK. In K. Bowyer and M. J. Burge, editors, Handbook of Iris Recognition, Advances in Computer Vision and Pattern Recognition. Springer, second edition, 2016.
[30] Tobii AB. Tobii Pro TX300. https://www.tobiipro.com/product-listing/tobii-pro-tx300/ (accessed June 26, 2018), 2018.