Multispectral Biometrics System Framework: Application to Presentation Attack Detection

In this work, we present a general framework for building a biometrics system capable of capturing multispectral data from a series of sensors synchronized with active illumination sources. The framework unifies the system design for different biometric modalities and its realization on face, finger and iris data is described in detail. To the best of our knowledge, the presented design is the first to employ such a diverse set of electromagnetic spectrum bands, ranging from visible to long-wave-infrared wavelengths, and is capable of acquiring large volumes of data in seconds. Having performed a series of data collections, we run a comprehensive analysis on the captured data using a deep-learning classifier for presentation attack detection. Our study follows a data-centric approach attempting to highlight the strengths and weaknesses of each spectral band at distinguishing live from fake samples.


1 Introduction

Biometric sensors have become ubiquitous in recent years, with an ever-increasing number of industries introducing some form of biometric authentication to enhance security or simplify user interaction. They can be found on everyday items such as smartphones and laptops as well as in facilities requiring high levels of security such as banks, airports or border control. Even though the widespread use of biometric sensors is intended to enhance security, it also comes with an increased risk of spoofing attempts. At the same time, the wide availability of commercial sensors provides easy access to the underlying technology for testing approaches aimed at concealing one’s identity or impersonating someone else, which is the definition of a Presentation Attack (PA). Moreover, advances in materials technology have already enabled the development of Presentation Attack Instruments (PAIs) capable of successfully spoofing existing biometric systems [57, 38, 6].

Presentation Attack Detection (PAD) has attracted considerable interest, with a long list of publications devising algorithms that operate on data from existing biometric sensors [40]. In this work, we approach the PAD problem from a sensory perspective and attempt to design a system that relies primarily on the captured data, which should ideally exhibit a distinctive response for PAIs. We focus on capturing spectral data, i.e., images at various bands of the electromagnetic spectrum, for extracting information about an object beyond what is available in the visible range [43]. The higher dimensionality of multispectral data enables the detection of non-skin materials based on their spectral characteristics [54]. A comprehensive analysis of the spectral emission of skin and different fake materials [58] shows that, at wavelengths above the visible range, the remission properties of skin converge across different skin types (i.e., different race or ethnicity), in contrast to a diverse set of lifeless substances. Additionally, multispectral data offer a series of advantages over conventional visible light imaging, including visibility through occlusions as well as robustness to ambient illumination conditions.

Spectral imaging has been studied for over a decade, with applications in medical imaging, food engineering, remote sensing, industrial applications and security [55]. However, its use in biometrics is still in its infancy. A few prototype systems can be found in the literature for face [58], finger [14] and iris [64] data, but they usually employ a small set of wavelengths [72]. Commercial sensors are still very limited (e.g., [27, 24, 67]) and mainly use a few wavelengths in the visible (VIS) or near-infrared (NIR) spectra. Lately, hybrid sensors combining VIS, NIR and depth measurements have also appeared on smartphones. The majority of the existing PAD literature on multispectral data has relied on such commercial sensors for studying PAD paradigms (e.g., [10, 52, 1]).

In general, systems capturing spectral data can be grouped into four main categories:

Fig. 1: Main components of the proposed multispectral biometrics system framework: A biometric sample is observed by a sensor suite comprised of various multispectral data capture devices. A set of multispectral illumination sources is synchronized with the sensors through an electronic controller board. A computer provides the synchronization sequence through a JSON file and sends capture commands that bring the controller and sensors into a capture loop leading to a sequence of synchronized multispectral data from all devices. All captured data is then packaged into an HDF5 file and sent to a database for storage and further processing.
  1. Multispectral image acquisition using multiple cameras inherently sensitive at different wavelength regimes or employing band-pass filters [25].

  2. Hyperspectral imagers [30].

  3. Single cameras performing sequential image acquisition with a rotating wheel of band-pass filters [7].

  4. Single cameras with Bayer-like band-pass filter patterns [69].

In this work, we follow the first approach and propose a unified framework for multispectral biometric data capture, combining a variety of cameras synchronized with a set of illumination sources to collect data at different sub-bands of the VIS, NIR, short-wave-infrared (SWIR) and long-wave-infrared (LWIR) spectra. To the best of our knowledge, this is the first work to enable the capture of such diverse types of multispectral data in a unified system framework for different biometric modalities. In the following sections, we analyze the proposed system framework and emphasize the merits of the captured data in detecting PAs.

2 System Framework and Design

In this section, we analyze the proposed multispectral biometrics system framework and present its realization with three different biometric sensor suites applied on face, finger and iris biometric data, respectively.

The main concept of our framework is presented in Fig. 1 and is initially described here at a very high level. A biometric sample is observed by a sensor suite comprised of various multispectral data capture devices. A set of multispectral illumination sources is synchronized with the sensors through an electronic controller board. A computer uses a Graphical User Interface (GUI) which provides the synchronization sequence through a JSON configuration file and sends capture commands that bring the controller and sensors into a capture loop leading to a sequence of synchronized multispectral frames from all devices. All captured data is then packaged into an HDF5 file and sent to a database for storage and processing. The computer also provides preview capabilities to the user by streaming data in real-time from each device while the GUI is in operation.

Our system design (both in terms of hardware and software) is governed by four key principles:

  1. Flexibility: Illumination sources and capture devices can be easily replaced with alternate ones with no or minimal effort both in terms of hardware and software development.

  2. Modularity: Whole components of the system can be disabled or removed without affecting the overall system’s functionality by simply modifying the JSON configuration file.

  3. Legacy compatibility: The system must provide at least some type of data that can be used for biometric identification through matching with data from older sensors and biometric templates available in existing databases.

  4. Complementarity: The variety of capture devices and illumination sources used aims at providing complementary information about the biometric sample, aiding the underlying task at hand.

We first describe all the common components of our system (as depicted in Fig. 1) and then discuss the specifics of each sensor suite for the three studied biometric modalities.

(a) LED illumination module, controlled through an LED driver [48]. Contains slots for SMD LEDs.
(b) Teensy microcontroller [60] used in the controller board of Fig. 1.
Fig. 2: System’s main electronic components.

2.1 Hardware

The hardware design follows all principles described above providing a versatile system which can be easily customized for different application needs.

Regime: RGB | NIR | VIS/NIR | SWIR | Thermal | RGB/NIR/Depth
Product: Basler acA1920-150uc [3], Basler acA1920-150um [4], Basler acA4096-30um [5], Basler acA1300-60gmNIR [2], IrisID iCam-7000 [28], Xenics Bobcat 320-100 [70], FLIR Boson 320-24 [17], FLIR Boson 640-18 [18], Intel RealSense D435 (RGB/NIR/Depth) [27]
Comm. Protocol: USB3, USB3, USB3, GigE, GigE, GigE, USB2, USB2, USB3
Trigger: Hardware, Hardware, Hardware, Hardware, Software, Hardware, Software, Software, Software
Lens Used: [36], [36], [37], [15], Built-in, [11]/[12], Built-in, Built-in, Built-in
External Filter: No, Yes [23], Yes [16], No, No, No, No, No, No
Sensor Suite: Face, Face, Iris, Finger, Iris, Face/Finger, Face, Iris, Face
TABLE I: Summary of cameras used in all presented biometric sensor suites along with their main specifications.

Illumination Modules

We have designed a Light-Emitting Diode (LED) based illumination module which can be used as a building block for creating larger arrays of LEDs in various spatial configurations. It is specifically designed to support Surface-Mount Device (SMD) LEDs for compactness. The module, shown in Fig. 2(a), contains slots for mounting LEDs and uses an LED driver chip [48] controlled over a Serial Peripheral Interface (SPI), which allows independent control of the current and Pulse-Width-Modulation (PWM) duty cycle for each slot. LEDs can be turned on/off or their intensity can be modified using a sequence of bits. Since the current is controlled independently for each position, LEDs with different operating limits can be combined on the same module.
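As an illustration of this control scheme, the sketch below composes per-slot intensity commands as byte sequences. The actual register layout and command format of the LED driver in [48] are not described here, so the 3-byte frame and the slot count used below are purely hypothetical.

```python
# Hypothetical sketch of per-slot LED intensity commands; the real driver's
# SPI register layout differs, and the 3-byte frame below is illustrative only.
def led_frame(slot: int, pwm_duty: int, command: int = 0x01) -> bytes:
    """Compose one assumed SPI frame: [command byte, slot index, PWM duty 0-255]."""
    if not (0 <= slot < 16 and 0 <= pwm_duty <= 255):
        raise ValueError("slot must be in [0, 16) and pwm_duty in [0, 255]")
    return bytes([command, slot, pwm_duty])

# Example: set slot 3 to half intensity and turn slot 7 off on one module.
frames = [led_frame(3, 128), led_frame(7, 0)]
print(b"".join(frames).hex())  # bytes that would be shifted out over SPI
```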

Controller Board

The controller board also follows a custom design and uses an Arduino-based microcontroller (Teensy  [60]), shown in Fig. 2(b), which can communicate with a computer through a USB2 serial port. The microcontroller offers numerous digital pins for SPI communication as well as Digital-to-Analog (DAC) converters for generating analog signals. The board offers up to slots for RJ45 connectors which can be used to send SPI commands to the illumination modules through ethernet cables. Additionally, it offers up to slots for externally triggering capture devices through digital pulses, whose peak voltage is regulated by appropriate resistors. The Teensy supports a limited amount of storage memory on which a program capable of understanding the commands of the provided configuration file is pre-loaded. At the same time, it provides an accurate internal timer for sending signals at millisecond intervals.
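On the actual board, this logic runs as pre-loaded Teensy firmware; the following Python sketch only illustrates how a timestamped synchronization sequence of the kind defined in the JSON configuration file could be replayed with a millisecond timer. The event tuples and target names are hypothetical.

```python
import time

# Hypothetical events: (offset from cycle start in ms, action, target name).
SEQUENCE = [
    (0,  "led_on",  "led_bank_A"),
    (5,  "trigger", "camera_1"),
    (20, "led_off", "led_bank_A"),
    (20, "led_on",  "led_bank_B"),
    (25, "trigger", "camera_2"),
    (40, "led_off", "led_bank_B"),
]

def run_cycle(events, dispatch):
    """Replay one capture cycle, firing each action at its millisecond offset."""
    start = time.monotonic()
    for offset_ms, action, target in sorted(events):
        delay = start + offset_ms / 1000.0 - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        dispatch(action, target)

run_cycle(SEQUENCE, lambda action, target: print(f"{action:8s} -> {target}"))
```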

2.2 Software

The software design aligns with the principles of flexibility and modularity described above. We have adopted a microservice architecture which uses REST APIs such that a process can send HTTP requests for capturing data from each available capture device.

Device Servers

Each capture device must follow a device server interface and should just implement a class providing methods for its initialization, setting device parameters and capturing a data sample. This framework simplifies the process of adding new capture devices which only need to implement the aforementioned methods and are agnostic to the remaining system design. At the same time, for camera sensors (which are the ones used in our realization of the framework), it additionally provides a general camera capture device interface for reducing any additional software implementation needs.
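The exact interface is not spelled out in the text, so the following is a minimal sketch of what a device server might look like, assuming a Flask-based REST wrapper; the class, method and route names are illustrative rather than the actual implementation.

```python
# Minimal device-server sketch; class, method and route names are assumptions.
from flask import Flask, jsonify, request
import numpy as np

class CameraDevice:
    def initialize(self) -> None:
        """Open the camera connection and apply default settings."""
        self.exposure_us = 10_000

    def set_parameters(self, params: dict) -> None:
        """Update runtime parameters (e.g., exposure) coming from the config."""
        self.exposure_us = params.get("exposure_us", self.exposure_us)

    def capture(self, num_frames: int = 1) -> np.ndarray:
        """Return a stack of frames; a real driver would wait for trigger signals."""
        return np.zeros((num_frames, 480, 640), dtype=np.uint16)

app = Flask(__name__)
device = CameraDevice()
device.initialize()

@app.route("/parameters", methods=["POST"])
def set_parameters():
    device.set_parameters(request.get_json(silent=True) or {})
    return jsonify(status="ok")

@app.route("/capture", methods=["POST"])
def capture():
    payload = request.get_json(silent=True) or {}
    frames = device.capture(int(payload.get("num_frames", 1)))
    return jsonify(status="ok", shape=list(frames.shape))

if __name__ == "__main__":
    app.run(port=5000)
```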

Configuration File

The whole system’s operation is determined by a JSON configuration file. It defines which capture devices and illumination sources will be used as well as the timestamps they will receive signals for their activation or deactivation. Further, it specifies initialization or runtime parameters for each capture device allowing adjustments to their operational characteristics without any software changes. As such, it can be used to fully determine a synchronized capture sequence between all available illumination sources and capture devices. Optionally, it can define a different preview sequence used for presenting data to the user through the GUI. Finally, it also determines the dataset names that will be used in the output HDF5 file to store the data from different capture devices.
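To make the role of the configuration file concrete, the snippet below generates a hypothetical example; all field names and values are assumptions chosen only to mirror the roles described above (device list and parameters, illumination groups, synchronization and preview sequences, HDF5 dataset names), not the actual schema.

```python
# Illustrative configuration file; every field name and value is hypothetical.
import json

config = {
    "devices": [
        {
            "name": "swir_camera",
            "server_url": "http://localhost:5000",
            "init_params": {"exposure_us": 10000, "gain_db": 0},
            "hdf5_dataset": "swir_frames",
        }
    ],
    "illumination": [
        {"name": "led_bank_A", "board": 0, "slots": [0, 1, 2, 3]}
    ],
    "capture_sequence": [
        {"t_ms": 0, "action": "led_on", "target": "led_bank_A"},
        {"t_ms": 5, "action": "trigger", "target": "swir_camera"},
        {"t_ms": 20, "action": "led_off", "target": "led_bank_A"},
    ],
    "preview_sequence": [],
    "num_cycles": 10,
}

with open("capture_config.json", "w") as f:
    json.dump(config, f, indent=2)
```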

Graphical User Interface

The GUI provides data preview and capture capabilities. In preview mode, it enters a continuous loop, sending signals to all available capture devices and illumination sources and repeatedly issuing HTTP requests to the underlying device servers while the data is previewed on the computer screen. In capture mode, it first sends a capture request to each capture device for a predefined number of frames dictated by the JSON configuration file and then puts the controller into a capture loop for sending the appropriate signals. Captured data is packaged into an HDF5 file and sent to a database for storage.
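A minimal sketch of the packaging step is shown below, assuming frames are held as NumPy arrays and written with h5py; the dataset names, compression settings and timestamp attribute are placeholders rather than the actual file layout.

```python
# Sketch of packaging one capture session into an HDF5 file with h5py.
import h5py
import numpy as np

def save_capture(path: str, captures: dict, timestamps: dict) -> None:
    """captures: dataset name -> (num_frames, H, W) array of frames."""
    with h5py.File(path, "w") as f:
        for name, frames in captures.items():
            dset = f.create_dataset(name, data=frames, compression="gzip")
            # Store per-frame timestamps so software-triggered streams can be
            # aligned with hardware-triggered ones later on.
            dset.attrs["timestamps_s"] = np.asarray(timestamps[name])

save_capture(
    "sample.h5",
    captures={"swir_frames": np.zeros((4, 256, 320), dtype=np.uint16)},
    timestamps={"swir_frames": [0.00, 0.05, 0.10, 0.15]},
)
```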

2.3 Biometric Sensor Suites

Fig. 3: Main LED types and illumination modules used in the proposed biometric sensor suites. For each modality (face, finger or iris), illumination modules are combined in different arrangements to achieve illumination uniformity on the observed biometric samples. Each group of modules can receive commands from the main controller board of Fig. 1 through ethernet cables. Here, we refer to any separate LED type as representing a wavelength even though some of them might consist of multiple wavelengths (e.g., white light).

In this section, we dive into more details on the specifics of the realization of the presented framework on the face, finger and iris biometric modalities. For our presented systems, all capture devices are cameras and all output data is frame sequences appropriately synchronized with the activation of particular light sources.

We use a variety of cameras, each sensitive to different portions (VIS, NIR, SWIR and LWIR or Thermal) of the electromagnetic spectrum. Table I summarizes all cameras used in our system along with their main specifications. The cameras differ in their resolution, frame rate and dynamic range (bit depth). For some cameras, the sensitivity is restricted by placing external band-pass filters in front of their lenses. The cameras were selected, among many options on the market, with the goal of balancing performance, data quality, user-friendliness and cost (clearly, different sensors could be selected based on the application needs). All cameras supporting hardware triggering operate in blocking mode, i.e., they wait for trigger signals from the controller before a frame is captured. This way, synchronized frames can be obtained. A few cameras (see Table I) do not support hardware triggering and are synchronized using software countdown timers during the capture process. Even though this triggering mechanism is not millisecond-accurate, the timestamps of each frame are also stored so that one can determine the closest frames in time to those originating from the hardware-triggered cameras.
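The nearest-timestamp alignment described above can be implemented in a few lines; the sketch below assumes per-frame timestamps in seconds and is only illustrative.

```python
# Sketch of aligning a software-triggered stream to a hardware-triggered
# reference stream by nearest timestamp.
import numpy as np

def closest_frame_indices(reference_ts, other_ts):
    """For each reference timestamp, return the index of the nearest frame."""
    reference_ts = np.asarray(reference_ts)
    other_ts = np.asarray(other_ts)
    # |reference x other| matrix of time differences; take argmin per row.
    return np.abs(reference_ts[:, None] - other_ts[None, :]).argmin(axis=1)

hw_ts = [0.00, 0.05, 0.10, 0.15]          # hardware-triggered camera
sw_ts = [0.01, 0.04, 0.09, 0.13, 0.17]    # software-triggered camera
print(closest_frame_indices(hw_ts, sw_ts))  # -> [0 1 2 3]
```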

For the illumination modules, we chose a variety of LEDs emitting light at different wavelengths covering a wide range of the spectrum. Here, without loss of generality, we will refer to any separate LED type as representing a wavelength even though some of them might consist of multiple wavelengths (e.g., white light). The choice of LEDs was based on previous studies on multispectral biometric data (as discussed in section 1) as well as cost and market availability of SMD LEDs from vendors (e.g.,  [41, 47, 53, 66]). For each biometric sensor suite, we tried to maximize the available wavelengths considering each LED’s specifications and the system as a whole. Illumination modules are mounted in different arrangements on simple illumination boards containing an RJ45 connector for SPI communication with the main controller board through an ethernet cable. To achieve light uniformity, we created main types of illumination modules which attempt to preserve LED symmetry. Wavelength selection and module arrangement for each sensor suite is presented in Fig. 3. In summary:

  • Face sensor suite: Employs wavelengths mounted on types of illumination modules and arranged in separate groups. illumination modules with LEDs are used in total.

  • Finger sensor suite: Employs wavelengths mounted on types of illumination modules and arranged in separate groups. illumination modules with LEDs are used in total.

  • Iris sensor suite: Employs wavelengths mounted on a single illumination module type and arranged circularly. illumination modules with LEDs are used in total.

All system components are mounted using mechanical parts [61] or custom-made 3D printed parts and enclosed in metallic casings [50, 49] for protection and user-interaction. Additionally, all lenses used (see Table I) have a fixed focal length and each system has an optimal operating distance range based on the Field-of-View (FOV) and Depth-of-Field (DoF) of each camera-lens configuration used. It is important to note that our systems are prototypes and every effort was made to maximize efficiency and variety of captured data. However, the systems could be miniaturized using smaller cameras, fewer or alternate illumination sources or additional components, such as mirrors, for more compact arrangement and total form factor reduction. Such modifications would not interfere with the concepts of the proposed framework which would essentially remain the same.

Fig. 4: Overview of the face sensor suite. Left side: 3D modeling of the system; Right side: Actual developed system.
Fig. 5: Face sensor suite synchronization sequence between cameras (software triggered cameras are underlined) and illumination sources. The width of each box represents the exposure time of each camera (or marked as “Auto” if auto-exposure is used) as well as the duration that each illumination source is active. The RGB, NIR and Depth channels of the RealSense [27] camera are internally synchronized to be captured at the same time. We capture cycles of the presented sequence for a total capture duration of seconds. The gray color represents cameras that are not affected by LED illumination, while the other two colors represent the sensitivity of each camera to the LED illumination sources. Finally, the NIR illumination denoted by “stereo” refers to data captured at a constant frame rate and could be used for stereo depth reconstruction. In this configuration, “stereo” data was captured using the nm wavelength but multiple wavelengths could be simultaneously activated.

2.4 Face Sensor Suite

The face sensor suite uses cameras capturing RGB, NIR, SWIR, Thermal and Depth data, as summarized in Table I. An overview of the system is depicted in Fig. 4. In addition to the LED modules, we use two large bright white lights on both sides of the system (not shown in the figure) to provide uniform lighting conditions for the RGB cameras. The subject sits in front of the system and the distance to the cameras is monitored by the depth indication of the RealSense camera [27]. We use a distance of cm from the RealSense camera, which allows for good focus and the best FOV coverage from most cameras. For the cameras affected by the LED illumination, we also capture frames when all LEDs are turned off, which can be used as ambient illumination reference frames. The synchronization sequence provided to the system through the JSON configuration file is presented in Fig. 5. Finally, an overview of the captured data for a bona-fide sample is presented on the left side of Fig. 8, while an analysis of frames and storage needs is summarized in Table II. In this configuration, the system is capable of capturing GB of compressed data in seconds. Legacy compatible data is provided by either RGB camera of the system [3, 27].

Camera | Data Type | Lit Frames | Non-Lit Frames | Bit Depth
Basler RGB [3] | RGB
RealSense [27] | RGB
RealSense [27] | Depth
RealSense [27] | NIR
Boson [17] | Thermal
Basler Left [4] | NIR
Basler Right [4] | NIR
Bobcat [70] | SWIR
Storage: GB,  Capture Time: seconds.
TABLE II: Analysis of frames and storage needs for the data captured by the face sensor suite for a single subject. For the frames, we use the notation (Number of frames × Number of datasets in HDF5 file). Each dataset corresponds to different illumination conditions for each data type.
Fig. 6: Overview of the finger sensor suite. Left side: 3D modeling of the system; Remaining: Actual developed system.
Fig. 7: Finger sensor suite synchronization sequence between cameras and illumination sources. The width of each box represents the exposure time of each camera (or marked as “Auto” if auto-exposure is used) as well as the duration that each illumination source is active. We capture a single cycle of the presented sequence for a total capture duration of seconds. The colors represent the sensitivity of each camera to the illumination sources allowing simultaneous capture from both cameras in some instances.

Looking closer at the face sensor suite, the NIR cameras constitute a stereo pair and can be used for high resolution 3D reconstruction of the biometric sample. Such an approach is not analyzed in this work; however, it would require careful calibration of the underlying cameras for estimating their intrinsic and extrinsic parameters. Moreover, despite face detection being a largely solved problem for RGB data [9, 71], this is not the case for data in other spectra. To enable face detection in all captured frames, we use a standard calibration process using checkerboards [73]. For the checkerboard to be visible in all wavelength regimes, a manual approach is used in which a sequence of frames is captured offline while the checkerboard is lit with a bright halogen light. This makes the checkerboard pattern visible and detectable by all cameras, which allows the standard calibration estimation process to be followed. The face can then be easily detected in the RGB space [9, 71] and the calculated transformation for each camera can be applied to detect the face in the remaining camera frames.
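As a sketch of this last step, the snippet below transfers a face bounding box detected in the RGB frame to another camera's frame, assuming the calibration produced a 3x3 planar homography for that camera (a simplification that is reasonable near the calibrated working distance; the actual transformation used may differ).

```python
# Transfer an RGB face bounding box to another camera via an assumed homography.
import numpy as np
import cv2

def transfer_bbox(bbox_rgb, H_rgb_to_cam):
    """bbox_rgb = (x, y, w, h) in RGB pixels -> axis-aligned box in the other camera."""
    x, y, w, h = bbox_rgb
    corners = np.float32([[x, y], [x + w, y], [x + w, y + h], [x, y + h]])
    warped = cv2.perspectiveTransform(corners.reshape(-1, 1, 2), H_rgb_to_cam)
    xs, ys = warped[:, 0, 0], warped[:, 0, 1]
    return int(xs.min()), int(ys.min()), int(xs.max() - xs.min()), int(ys.max() - ys.min())

H = np.eye(3, dtype=np.float64)  # placeholder for the calibrated homography
print(transfer_bbox((100, 80, 200, 240), H))
```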

Camera | Data Type | Lit Frames | Non-Lit Frames | Bit Depth
Basler [2] | VIS/NIR
Basler [2] | BI
Bobcat [70] | LSCI
Bobcat [70] | SWIR
Storage: MB,  Capture Time: seconds.
TABLE III: Analysis of frames and storage needs for the data captured by the finger sensor suite for a single finger. For the frames, we use the notation (Number of frames × Number of datasets in HDF5 file). Each dataset corresponds to different illumination conditions for each data type.
Fig. 8: Overview of captured data by the proposed sensor suites for face (left), finger (top-right) and iris (bottom-right) biometric modalities. For cameras affected by LED illumination or capturing different data types, the middle frame of the capture sequence is shown. For the remaining cameras, equally spaced frames of the whole captured sequence are presented. Images are resized for visually pleasing arrangement and the relative size of images is not preserved.

2.5 Finger Sensor Suite

The finger sensor suite uses cameras sensitive in the VIS/NIR and SWIR parts of the spectrum, as summarized in Table I. An overview of the system is depicted in Fig. 6. The subject places a finger on the finger slit of size mm, facing downwards, which is imaged by the available cameras from a distance of cm. The finger sensor suite uses two additional distinct types of data compared to the remaining sensor suites, namely, Back-Illumination (BI) and Laser Speckle Contrast Imaging (LSCI).

Back-Illumination

Looking at Fig. 6 and Fig. 3, one can observe that the illumination modules are separated into two groups. The first lies on the side of the cameras, lighting the front side of the finger (front-illumination), while the second shines light from atop the finger slit, which we refer to as BI. This allows capturing images of the light propagating through the finger and can be useful for PAD, either by observing light blockage by non-transparent materials used in common PAIs or by revealing the presence of veins in the finger of a bona-fide sample. The selected NIR wavelength of nm enhances penetration through the skin as well as absorption of light by the hemoglobin in the blood vessels [51, 21, 68, 34], making them appear dark. Due to the varying thickness of fingers among different subjects, for BI images we use auto-exposure and capture multiple frames so that the intensity can be adjusted and the captured image is neither over-saturated nor under-exposed.

Laser Speckle Contrast Imaging

Apart from the incoherent LED illumination sources, the finger sensor suite also uses a coherent illumination source, specifically a laser at nm [13], which directs a beam at the front part of the system’s finger slit. The laser is powered directly by the Teensy [60] and its intensity can be controlled through an analog voltage using the DAC output of the controller board (as shown in Fig. 1). Illuminating a rough surface with a coherent illumination source leads to an interference pattern known as a speckle pattern. For static objects, the speckle pattern does not change over time. However, when there is motion (such as the motion of blood cells through finger veins), the pattern changes at a rate dictated by the velocity of the moving particles, and imaging this effect can be used for LSCI [8, 25, 32, 42, 59]. The selected wavelength of nm enables penetration of light through the skin and the speckle pattern is altered over time as a result of the underlying blood flow for bona-fide samples. This time-related phenomenon can prove useful as an indicator of liveness and, in order to observe it, we capture a sequence of frames while the laser is turned on.
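One common way to summarize this effect (not necessarily the representation used later by the classifier) is a temporal speckle-contrast map, i.e., the per-pixel ratio of the standard deviation to the mean intensity across the captured laser frames; blood flow decorrelates the speckle over time and lowers the contrast. A minimal sketch:

```python
# Temporal speckle-contrast map K = sigma / mu per pixel over the laser frames.
import numpy as np

def temporal_speckle_contrast(frames: np.ndarray) -> np.ndarray:
    """frames: (num_frames, H, W) stack captured under coherent illumination."""
    frames = frames.astype(np.float64)
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    return std / np.clip(mean, 1e-6, None)

stack = np.random.default_rng(0).integers(0, 4096, size=(16, 256, 320))
K = temporal_speckle_contrast(stack)
print(K.shape, float(K.mean()))
```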

The synchronization sequence provided to the system through the JSON configuration file is presented in Fig. 7, where it is shown that complementary spectrum sensitivity of the utilized cameras is exploited for synchronous capture while enabling multiple illumination sources (e.g., laser and NIR light). For each type of data captured under the same lighting conditions and the same camera parameters (i.e., exposure time), we also capture frames when all LEDs are turned off which serve as ambient illumination reference frames. Finally, an overview of the captured data for a bona-fide sample is presented at the top-right part of Fig. 8 while an analysis of frames and storage needs per finger is summarized in Table III. In this configuration, the system is capable of capturing MB of compressed data in seconds. Legacy compatible data is provided through the captured visible light images as we will show in section 3.2.

Fig. 9: Overview of the iris sensor suite. Left side: 3D modeling of the system; Remaining: Actual developed system.
Fig. 10: Iris sensor suite synchronization sequence between cameras (software triggered cameras are underlined) and illumination sources. The width of each box represents the exposure time of each camera (or marked as “Auto” if auto-exposure is used) as well as the duration that each illumination source is active. We capture cycles of the presented sequence and then enable the IrisID camera. The total capture duration is or more seconds (depending on the capture time of the IrisID camera which requires subject cooperation). The gray color represents cameras that are not affected by LED illumination, while the other color represents the sensitivity of each camera to the LED illumination sources.

2.6 Iris Sensor Suite

The iris sensor suite uses cameras capturing NIR and Thermal data, as summarized in Table I. An overview of the system is depicted in Fig. 9. The subject stands in front of the system at a distance of cm, guided by the 3D-printed distance guide on the right side of the metallic enclosure. The synchronization sequence provided to the system through the JSON configuration file is presented in Fig. 10. The IrisID camera [28] employs its own NIR LED illumination and has an automated way of capturing data, giving feedback and requiring user interaction. Hence, it is only activated at the end of the capture from the remaining cameras. An overview of the captured data for a bona-fide sample is presented in the bottom-right part of Fig. 8, while an analysis of frames and storage needs is summarized in Table IV. Note that the IrisID provides the detected eyes directly, while the remaining data require the application of an eye detection algorithm. For detecting eyes in the thermal images, we use the same calibration approach discussed in section 2.4, where eyes are first detected in the NIR domain and their coordinates are then transformed to find the corresponding area in the thermal image. Both the data from the IrisID and the NIR camera are legacy compatible, as we will show in section 3.2. Besides, the IrisID camera is one of the most frequently used sensors on the market.

Camera | Data Type | Lit Frames | Non-Lit Frames | Bit Depth
Basler [5] | NIR
Boson [18] | Thermal
IrisID [28] | NIR (per eye)
Storage: GB,  Capture Time: seconds plus IrisID capture time (max: 25 seconds).
TABLE IV: Analysis of frames and storage needs for the data captured by the iris sensor suite for a single subject. For the frames, we use the notation (Number of frames × Number of datasets in HDF5 file). Each dataset corresponds to different illumination conditions for each data type.

One of the drawbacks of the current iris sensor suite is its sensitivity to the subject’s motion and distance due to the rather narrow DoF of the utilized cameras/lenses as well as the long exposure time needed for acquiring bright images. As a result, it requires careful operator feedback to the subject for appropriate positioning in front of the system. Higher intensity illumination or narrow angle LEDs could be used to combat this problem by further closing the aperture of the cameras so that the DoF is increased. However, further research is required for this purpose, taking into consideration possible eye-safety concerns, not present in the current design which employs very low energy LEDs.

3 Experiments

In our analysis so far, we have verified the principles of flexibility and modularity governing the system design in our proposed framework. In this section, we focus on the principles of legacy compatibility and complementarity of the captured data and showcase that they can provide rich information when applied to PAD. The main focus of our work is to present the flexible multispectral biometrics framework and not devise the best performing algorithm for PAD since the captured data can be used in a variety of ways for obtaining the best possible performance. Instead, we follow a data-centric approach and attempt to understand the contribution of distinct regimes of the multispectral data towards detecting different types of PAIs.

3.1 Datasets

We have held data collections with the proposed systems. However, our systems have undergone multiple improvements throughout this period and some data is not fully compatible with the current version of our system (see for example the previous version of our finger sensor suite in [25], which has been largely simplified here). A series of publications have already used data from earlier versions of our systems (see [46, 35, 29, 19] for face and [25, 32, 63, 42, 20, 59, 62, 34] for finger).

The datasets used in our analysis contain only data from data collections that are compatible with the current design (i.e., the same cameras, lenses and illumination sources as the ones described in section 2 were used). They involve separate data collections of varying size, demographics and PAI distributions that were performed using distinct replicas of our systems in separate locations (leading to possibly different ambient illumination conditions and slight modifications in the positions of each system’s components). Participants presented their biometric samples at least twice to our sensors and a few participants engaged in more than one data collection. Parts of the data will become publicly available through separate publications and the remainder could be distributed later by the National Institute of Standards and Technology (NIST) [44].

In this work, we separate all data from the aforementioned data collections into two groups (data from the earlier data collections and data from the last data collection). The main statistics for the two groups, which will be referred to as Dataset I and Dataset II, respectively, as well as their union (Combined) are summarized in Table V. The reason for this separation is twofold. First, we want to study a cross-dataset scenario for drawing general conclusions. Second, during the last data collection, data was also captured using a variety of existing commercial sensors for face, finger and iris. Therefore, Dataset II constitutes an ideal candidate on which the legacy compatibility principle of our proposed sensor suites can be analyzed.

TABLE V: Studied datasets and their union. For each biometric modality, we group PAIs into broader categories (marked in gray) and present the number of samples and PAI species (sp.) included in each. Not all available PAI species are included in this categorization. PAI categories whose appearance depends heavily on the subject and preparation method are marked accordingly. Finally, the contact lens (CL) category groups contact lenses whose specific type is unknown or whose count in the dataset is too small for them to be grouped separately.
Fig. 11: Examples of legacy compatible data captured by the proposed sensor suites and multiple legacy sensors, retrieved from Dataset II (see Table V). All data correspond to the same participant while finger and iris images depict the right index finger and left eye, respectively. For each data type, the figure further presents the notation used in Tables VI and VII.

For each biometric modality, we define a set of PAI categories (see Table V), which will be helpful for our analysis. As observed, multiple PAI species are omitted from the categorization. We tried to form compact categories which encapsulate different PAI characteristics, as well as consider cases of unknown PAI categories between the two datasets. Finally, it is important to note that the age and race distributions of the participants in the two datasets are drastically different. Dataset I is dominated by young people of Asian origin, while Dataset II includes a larger population of Caucasians or African Americans with an age distribution skewed toward older ages, especially for face data.

Sensor | Enrollment Rate | Unique Enrollment Rate | Total Samples | Unique Samples

Face

RGB-A
RGB-B
RGB-C
RGB
RS-RGB

Finger

Optical-A
Optical-B
Optical-C
Optical-D
Optical-E
Capacitive
Thermal
White
VIS
White-Bin
VIS-Bin

Iris

NIR-A
NIR-B
NIR-C
IrisID
N
N
N
N
TABLE VI: Bona-fide enrollment rates for each sensor used in Dataset II (see Fig. 11 and Table V), calculated using Neurotechnology’s SDK software [45] for a minimum quality threshold of . The third column lists the enrollment rate when all samples are considered while the fourth presents the corresponding enrollment rate when enrollment of at least one sample per participant and BP is considered a success. Similarly, the last two columns list the total bona-fide samples and unique participant-BP bona-fide samples per sensor, respectively. The best enrollment rates per biometric modality are highlighted in bold.
Proposed sensor suite data (columns) vs. legacy sensors (rows):
Face: proposed RGB and RS-RGB vs. legacy RGB-A, RGB-B and RGB-C.
Finger: proposed White, White-Bin, VIS and VIS-Bin vs. legacy Optical-A, Optical-B, Optical-C, Optical-D, Optical-E, Capacitive and Thermal.
Iris: proposed IrisID and the N (NIR) channels vs. legacy NIR-A, NIR-B and NIR-C.
TABLE VII: Bona-fide match rates between legacy compatible data provided by the proposed sensor suites and each one of the available legacy sensors in Dataset II (see Table V). Table entries correspond to the FNMR at FMR for each sensor pair, calculated using [45], with the highest match rates highlighted in bold. Only bona-fide samples for each participant and BP that were enrolled by both sensors in each sensor pair were considered. For comparison, the average match rates between the data from finger legacy sensors are: Optical-A (), Optical-B (), Optical-C (), Optical-D (), Optical-E (), Capacitive (), Thermal ().
Fig. 12: FCN Model architecture (extension of [56]): Given parameter and number of features , an input image of channels is first converted into a two dimensional PAD score map () whose spatial distribution is then used to extract features and deduce the final PAD score through a linear layer. Actual score map examples for bona-fide and PA samples are presented at the bottom part of the illustration, following the flow of the presented architecture.

3.2 Legacy Compatibility

As discussed above, during the collection of Dataset II, data from each participant was also collected using a variety of legacy sensors (different sensor types for face and iris and for finger). Sponsor approval is required to release specific references for the legacy sensors used; instead, we provide descriptive identifiers based on the data types each sensor captures. We now perform a comprehensive set of experiments to understand the legacy compatibility capabilities of our systems. For this purpose, we employ Neurotechnology’s SDK software [45], which is capable of performing biometric data enrollment and matching. We use the notation BP (Biometric Position) to refer to a specific sample (i.e., face, left or right eye, or specific finger of the left or right hand of a subject). From our sensor suites, legacy compatible data for face and iris is used as is. For finger, we noticed that the software was failing to enroll multiple high-quality samples, possibly due to the non-conventional nature of the captured finger images, and, as a result, we considered two pre-processing steps. First, we crop a fixed area of the captured image containing mostly the top finger knuckle and enhance the image using adaptive histogram equalization. Second, we binarize the enhanced image using edge-preserving noise reduction filtering and local adaptive thresholding. Fig. 11 provides an overview of data samples from all sensors, along with the notation used for each, and depicts the pre-processing steps for finger data.

Using the SDK, we first perform an enrollment rate analysis for all bona-fide samples in Dataset II using a minimum acceptable quality threshold. Next, we consider each pair between the proposed sensors and the available legacy sensors and perform biometric template matching among all bona-fide samples for all participants with at least one sample of the same BP enrolled from both sensors. The enrollment rate results are provided in Table VI and the match rates for each sensor pair are extracted by drawing a Receiver Operating Characteristic (ROC) curve and reporting the value of the False Non-Match Rate (FNMR) at a fixed False Match Rate (FMR) [65] in Table VII. For finger, we analyze the performance of white light images as well as the performance when all visible light images are used (see Fig. 11) and any one of them is enrolled. When matching, the image with the highest enrollment quality is used.

Fig. 13: Examples of pre-processed multispectral data for bona-fide samples and the main PAI categories defined in Table V for each biometric modality. In some cases, images have been min-max normalized within each spectral regime for better visualization. The notation used in the figure is crucial for understanding the results in Fig. 14 and Table IX.

From the results in the tables, it is apparent that the face and iris sensor suites provide at least one image type that is fully legacy compatible. For the finger data, the enrollment rate appears to be sub-optimal, while the match rates are in some cases on par with the average match rates between legacy sensors (compare with the values in the caption of Table VII). However, the utilized analysis software proves very sensitive to the input image type, and the same images, when binarized (compare White vs. White-Bin and VIS vs. VIS-Bin entries in Table VI), exhibit an increase in enrollment rates. Hence, we are confident that a more careful selection of pre-processing steps [22] or the use of alternative matching software could lead to improved performance. Besides, the Optical-D legacy sensor, despite covering the smallest finger area and having the lowest resolution among all analyzed legacy sensors, seems to outperform the others by a large margin, indicating the high sensitivity of the enrollment and matching software to the selected parameters. A deeper investigation into this topic, however, falls outside the scope of this work.

3.3 Presentation Attack Detection

In order to support the complementarity principle of our design, we devise a set of PAD experiments for each biometric modality. Two-class classification, with labels assigned to bona-fide and PA samples, respectively, is performed using a convolutional neural network (CNN) based model.

Model Architecture

Due to the limited amounts of training data, inherent in biometrics, we follow a patch-based approach where each patch in the input image is first assigned a PAD score and the individual scores are then fused to deduce the final PAD score for each sample. Unlike traditional patch-based approaches, where data is first extracted for patches of a given size and stride and then passed through the network (e.g., [25, 42]), we use an extension of the fully-convolutional-network (FCN) architecture presented in [56], as depicted in Fig. 12. The network consists of three parts:

  1. Score Map Extraction: Assigns a score to each patch, producing a score map, through a set of convolutions and non-linearities, while batch normalization layers are used to combat over-fitting.

  2. Feature Extraction: Extracts score map features through a shallow CNN.

  3. Classification: Predicts the final PAD score by passing the score map features through a linear layer.

The suggested network architecture was inspired by the supervision channel approach in [31, 39] and its first part (identical to [56]) is equivalent to a patch-based architecture when the stride is , albeit with increased computational efficiency and reduced memory overhead. A drawback of the FCN architecture compared to a genuine patch-based model, however, is that patches of a sample image are processed together and the batch size needs to be smaller, reducing intra-variability in training batches. The two remaining parts, instead of just performing score averaging, consider the spatial distribution of the score map values for deducing the final PAD score, as shown in the examples at the bottom part of Fig. 12 for a bona-fide and PA sample per modality. The additional feature extraction and classification layers were considered due to the possible non-uniformity of PAIs especially in the case of face and iris data, unlike finger data [56], where a PAI usually covers the whole finger image area passed to the network.
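A minimal PyTorch sketch of the three-part architecture follows; the layer widths, kernel sizes and the 224x224 input resolution are illustrative and do not reproduce the exact configuration of [56] or Fig. 12.

```python
# Sketch of the three-part FCN: score map extraction, feature extraction, classification.
import torch
import torch.nn as nn

class PatchFCN(nn.Module):
    def __init__(self, in_channels: int = 1, width: int = 32):
        super().__init__()
        # 1) Score map extraction: convolutions + batch norm + sigmoid.
        self.score_map = nn.Sequential(
            nn.Conv2d(in_channels, width, 3, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
            nn.Conv2d(width, 1, 3, padding=1), nn.Sigmoid(),
        )
        # 2) Feature extraction: shallow CNN over the score map.
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        # 3) Classification: linear layer producing the final PAD score.
        self.classifier = nn.Linear(8 * 4 * 4, 1)

    def forward(self, x):
        m = self.score_map(x)                          # (B, 1, H, W) patch scores
        p = torch.sigmoid(self.classifier(self.features(m)))
        return p.squeeze(1), m                         # final score and score map

model = PatchFCN(in_channels=3)
score, score_map = model(torch.rand(2, 3, 224, 224))
print(score.shape, score_map.shape)  # torch.Size([2]) torch.Size([2, 1, 224, 224])
```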

Training Loss

The network architecture in Fig. 12 guarantees, through the final sigmoid layer, that every element of the score map lies in $(0, 1)$. However, it does not guarantee that the score map represents an actual PAD score map for the underlying sample. In order to enforce that all patches within each sample belong to the same class, we employ pixel-wise supervision on the score map. Denoting the score map by $\mathbf{M}$ with elements $M_i$, $i = 1, \dots, N$, where $N$ is the total number of elements in $\mathbf{M}$, the final PAD score by $p$, the ground-truth label of the current sample by $y$ and the Binary Cross-Entropy loss function by $\mathrm{BCE}(\cdot, \cdot)$, the sample loss is calculated as:

$$\ell = \mathrm{BCE}(p, y) + \frac{\lambda}{N} \sum_{i=1}^{N} \mathrm{BCE}(M_i, y), \tag{1}$$

where $\lambda$ is a constant weight.
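Assuming the reconstruction of Eq. (1) above, the loss can be written compactly in PyTorch as follows (a sketch, with the weight and tensor shapes as placeholders):

```python
# Sketch of Eq. (1): BCE on the final score plus weighted pixel-wise BCE on the score map.
import torch
import torch.nn.functional as F

def pad_loss(p, score_map, y, lam: float = 1.0):
    """p: (B,) final scores; score_map: (B, 1, H, W); y: (B,) labels in {0, 1}."""
    y = y.float()
    sample_term = F.binary_cross_entropy(p, y)
    map_target = y.view(-1, 1, 1, 1).expand_as(score_map)
    pixel_term = F.binary_cross_entropy(score_map, map_target)
    return sample_term + lam * pixel_term

loss = pad_loss(torch.rand(2), torch.rand(2, 1, 224, 224), torch.tensor([0, 1]))
print(float(loss))
```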

3.4 Presentation Attack Detection Experiments

As discussed earlier, the goal of our work is to understand the contribution of each spectral channel or regime to the PAD problem, as well as the strengths and weaknesses of each type of data, by following a data-centric approach. Therefore, we use a model that remains the same across all compared experiments per modality. This way, we try to gain an understanding of how performance is affected solely by the data rather than by the number of trainable model parameters, the specific model architecture or other training hyperparameters. We first summarize the data pre-processing and training protocols used in our experiments and then describe the experiments in detail.

Data Pre-processing

The data for each biometric modality is pre-processed as follows, where any resizing operation is performed using bicubic interpolation:

  • Face: Face landmarks are detected in the RGB space using [71] and the bounding box formed by their extremities is expanded toward the top. The transformations obtained by the calibration process described in section 2.4 are then used to warp each image channel to the RGB image dimensions and the bounding box area is cropped. Finally, all channels are resized to pixels. A single frame from the captured sequence is used per sample.

  • Finger: A fixed region of interest is cropped per channel such that the covered finger area is roughly the same among all cameras (based on their resolution, system geometry and dimensions of the finger slit mentioned in section 2.5). The cropped area covers the top finger knuckle which falls on an almost constant position for all samples, since each participant uses the finger slit for presenting each finger. Finally, all channels are resized to pixels.

  • Iris: For the data captured using the NIR and IrisID cameras, Neurotechnology’s SDK [45] is employed for performing iris segmentation. The iris bounds are then used as a region of interest for cropping. Each image is finally resized to pixels. For the thermal data, we use the whole eye region (including the periocular area). The center of the eye is extracted from the segmentation calculated on the NIR camera’s images and the corresponding area in the Thermal image is found by applying the calculated transformation between the two cameras (as discussed in section 2.6). The cropped area is finally resized to pixels. A single multispectral frame from the captured sequence is used per sample. We always use the frame with the highest quality score provided by [45] during segmentation. If segmentation fails for all available frames, the sample is discarded.

Exploiting the camera synchronization in our systems, for face and iris data, which rely on geometric transformations, the particular frames extracted from each channel are the closest ones in time to the reference frame where face or eye detection was applied (based on each frame’s timestamps). For all biometric modalities, if dark channel frames are available for any spectral channel (see Fig. 8), the corresponding time-averaged dark channel is first subtracted. The data is then normalized using the corresponding channel’s bit depth (see Tables II, III and IV). Examples of pre-processed data for bona-fide samples and the PAI categories defined in Table V are presented in Fig. 13. In some cases, images have been min-max normalized within each spectral regime for better visualization. The notations used in the figure will become important in the following analysis.
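A sketch of the dark-frame subtraction and bit-depth normalization step is given below; the clipping to [0, 1] and the exact scaling constant are assumptions, since the normalization range is not fully specified here.

```python
# Dark-frame subtraction followed by bit-depth normalization (illustrative).
import numpy as np

def preprocess_channel(frames, dark_frames, bit_depth: int):
    """Subtract the time-averaged dark frame, then scale by the channel's bit depth."""
    frames = frames.astype(np.float64)
    if dark_frames is not None:
        frames = frames - dark_frames.astype(np.float64).mean(axis=0)
    return np.clip(frames / (2 ** bit_depth - 1), 0.0, 1.0)

lit = np.random.default_rng(0).integers(0, 1024, size=(3, 256, 320))
dark = np.random.default_rng(1).integers(0, 64, size=(2, 256, 320))
out = preprocess_channel(lit, dark, bit_depth=10)
print(out.min(), out.max())
```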

Model/Loss Parameters
, , (see Fig. 12 and Eq. (1))
: Defined by input image channels
Training Parameters
Initial learning rate: , Minimum learning rate: ,
Learning rate scheduling: Reduce by on validation loss
plateau with (patience: epochs, threshold: ),
Optimizer: Adam [33], Epochs: , Batch Size:
TABLE VIII: Parameters used for all experiments.

Training Protocols

We follow two different training protocols using the datasets presented in Table V:

  1. 3Fold: All data from the Combined dataset is divided into three folds. For each fold, the training, validation and testing sets consist of , and of the data, respectively. The folds were created considering the participants such that no participant appears in more than one set, leading to slightly different percentages from the aforementioned ones.

  2. Cross-Dataset: Dataset I is used for training and validation ( and of data, respectively) while Dataset II is used for testing. In this scenario, a few participants do appear in both datasets, for the finger and iris cases, but their data was collected at a different point in time, a different location and using a different replica of our biometric sensor suites (see participant counts in Table V).

Fig. 14: Left: PAD score distributions for single channel experiments; Right: ROC curves corresponding to -channel experiments for face and finger and single channel experiments for iris. In the ROC curve legends, the best performance is highlighted in bold while the score fusion result (Mean) is underlined when outperforming the best individual experiment.

We now conduct a series of comprehensive experiments to analyze the PAD performance capabilities of the captured data. First, for all three biometric modalities, we perform experiments where each spectral channel is used separately as input to the model of Fig. 12. For face and finger data, due to the large number of channels, we further conduct experiments where combinations of input channels are used. This approach not only aids in summarizing the results in a compact form but also constitutes a logical extension: for face, it matches the number of channels provided by the RGB camera, while for finger, there are multiple visible light illumination sources and LSCI data is inherently time-dependent, hence sequential frames are necessary for observing this effect. We choose not to study larger channel combinations so that we accentuate the individual contribution of each type of available data to the PAD problem, but we always adhere to the rule of comparing experiments that use the same number of input channels and therefore contain the same number of trainable model parameters.

Each experiment uses the same model and training parameters, summarized in Table VIII. During training, each channel is standardized to zero mean and unit standard deviation based on the statistics of all images in the training set, while the same normalizing transformation is applied at test time. All experiments are performed under both training protocols (3Fold and Cross-Dataset) explained above. The notation used for each individual channel and each triplet combination in the experiments is illustrated in Fig. 13. For each type of experiment, we also calculate the performance of the mean PAD score fusion of all individual experiments (denoted as Mean). As performance metrics, we report the Area Under the Curve (AUC), the True Positive Rate at a fixed False Positive Rate (denoted as TPR) and the Bona-fide Presentation Classification Error Rate at a fixed Attack Presentation Classification Error Rate (APCER) (denoted as BPCER in the ISO [26] standard).
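These metrics can be computed directly from the PAD scores; the sketch below treats PA samples as the positive class and uses placeholder operating points (1% FPR, 5% APCER), since the exact thresholds are not reproduced here.

```python
# AUC, TPR at fixed FPR, and BPCER at fixed APCER from PAD scores (illustrative).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def pad_metrics(scores, labels, fpr_target=0.01, apcer_target=0.05):
    """labels: 1 for PA, 0 for bona fide; scores: higher means more likely PA."""
    fpr, tpr, _ = roc_curve(labels, scores)
    auc = roc_auc_score(labels, scores)
    tpr_at_fpr = np.interp(fpr_target, fpr, tpr)
    # APCER = 1 - TPR (missed attacks); BPCER = FPR (rejected bona fides).
    apcer, bpcer = 1.0 - tpr, fpr
    bpcer_at_apcer = np.interp(apcer_target, apcer[::-1], bpcer[::-1])
    return auc, tpr_at_fpr, bpcer_at_apcer

rng = np.random.default_rng(0)
labels = np.r_[np.zeros(100), np.ones(100)]
scores = np.r_[rng.normal(0.3, 0.2, 100), rng.normal(0.7, 0.2, 100)]
print(pad_metrics(scores, labels))
```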

The results from all experiments are summarized in Fig. 14 and Table IX. The left part of Fig. 14 analyzes the single channel experiments by drawing error bars of the PAD score distributions for bona-fide samples and each PAI category defined in Table V. The error bars depict the mean and standard deviation of each score distribution, bounded by the PAD score limits. Hence, full separation of error bars between bona-fides and PAIs does not imply perfect score separation. However, it can showcase in a clear way which channels are most effective at detecting specific PAI categories. The right part of Fig. 14 presents the calculated ROC curves and relevant performance metrics for the -channel experiments for face and finger and the single channel experiments for iris. The same results are then re-analyzed per PAI category in Table IX by calculating each performance metric for an ROC curve drawn by considering only bona-fide samples and a single PAI category each time. The table is color coded such that darker shades denote performance degradation, which helps in interpreting the underlying metric values. The grayscale values were selected using the average of the AUC, TPR and (1 - BPCER) values for each colored entry. It is important to note that, for the iris experiments, only samples for which iris segmentation was successful in all channels are used in the analysis, for a fair comparison among the different models.

TABLE IX: Performance metric analysis per PAI category. For each experiment from the ROC curve’s section in Fig. 14, separate ROC curves are extracted considering only bona-fide samples and samples from a single PAI category (as defined in Table V). The table is color coded such that darker shades denote reduction in performance. The grayscale values were selected using the average value between AUC, TPR and ( BPCER) for each colored entry. The best performance per PAI category and training protocol is highlighted in bold while the score fusion result (Mean) is underlined when matching or outperforming the best individual experiment.

By analyzing the presented results, we make the following observations:

  • Some channels behave exactly as expected by the human visual perception of the images (e.g., Thermal channel success on Plastic Mask and Fake Eyes).

  • Certain PAI categories appear to be easily detectable by most channels (e.g., Ballistic Gelatin and Ecoflex Flesh) while others (e.g., Prostheses and PDMS or Glue) exhibit consistent separation when SWIR/LSCI illumination is used, supporting the complementarity principle of the proposed system.

  • The complementarity principle is further strengthened by the performance of simple score averaging (denoted as Mean), which is the best in multiple cases.

  • The Cross-Dataset protocol performance appears to be severely degraded for channels where cameras are affected by ambient illumination conditions (e.g., visible or NIR light). This is particularly apparent in the face experiments where RGB data performance changes from best to worst between the two training protocols and a huge PAD score shift can be observed in the score distributions. This effect is also aggravated by the smaller size of the face dataset which can lead to overfitting as well as the vastly different demographics between the training and testing sets (discussed in section 3.1). On the contrary, higher wavelength channels appear to be more resilient to both ambient illumination and demographic variations, consistent with the existing literature [58].

  • In a few cases, the Cross-Dataset protocol evaluation outperforms the equivalent 3Fold experiments. This can be explained by the smaller variety of PAI species in Dataset II as well as the larger variety of bona-fide samples in the Combined dataset, some of which might be inherently harder to classify correctly.

  • For the iris data, use of multispectral data seems to be less important. The Thermal channel, while successful at detecting fake eyes, appears weak at detecting PAI contact lenses (indeed multiple eyes of participants wearing regular contact lenses are misclassified due to the darker appearance of their pupil-iris area). At the same time, only a single NIR channel appears to have slightly better performance than the IrisID camera which also uses NIR illumination, possibly due to its higher image resolution (see Fig. 13). Nevertheless, the fact that Mean score performance is sometimes superior suggests that certain samples are classified correctly mainly due to the multispectral nature of the data. However, as discussed in section 2.6, the current iris system setup is not optimal and suffers from motion blur as well as cross-wavelength blur when a participant is moving during capture. Blurriness can obscure small details necessary for the detection of contact lens PAIs. Indeed, the N channel, which exhibits the highest performance, was the one that was in best focus and whose images were usually receiving the highest quality scores during the legacy compatibility analysis of section 3.2.

In general, the analysis suggests that, for each biometric modality, there are channels which can, on their own, offer high PAD performance. Clearly, some of the problems observed in the Cross-Dataset protocol analysis could be alleviated by pre-training, transfer learning or fine-tuning techniques, but the purpose of our work is to emphasize the limitations originating from certain wavelength regimes and to stress the importance of having a variety of spectral bands available for training robust classification models. Besides, models using a larger input channel stack can further enhance PA detection, as shown in [19, 62, 56, 25, 42].
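For illustration only, the sketch below shows one common way of feeding a larger input channel stack to a CNN: widening the first convolution of an off-the-shelf backbone so that it accepts an arbitrary number of co-registered spectral channels. ResNet-18, the channel count and the two-class head are assumptions made for this sketch and do not reproduce the specific architectures of the cited works.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def multichannel_pad_net(num_channels: int, num_classes: int = 2) -> nn.Module:
    """ResNet-18 backbone whose first convolution accepts a multispectral channel stack."""
    model = resnet18()  # randomly initialized backbone (no pretrained weights assumed)
    model.conv1 = nn.Conv2d(num_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

# Illustrative usage: a batch of 4 samples, each a 10-channel co-registered spectral stack.
net = multichannel_pad_net(num_channels=10)
logits = net(torch.randn(4, 10, 224, 224))
```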

4 Conclusion

In this work, we presented a multispectral biometrics system framework along with its realization on the face, finger and iris biometric modalities. We described the proposed systems in detail and explained how they adhere to the principles of flexibility, modularity, legacy compatibility and complementarity. Further, we showcased that the captured data can provide rich and diverse information useful for distinguishing a series of presentation attack instrument types from bona-fide samples. The variety of synchronized biometric data captured by the proposed systems opens the door to many different applications. We hope that our work will spur further research in this domain and help make multispectral biometric data a commodity for researchers to experiment with. We believe that multispectral data can be one of the key ingredients for detecting future, ever more sophisticated presentation attacks.

Acknowledgments

This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2017-17020200005. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.

The authors would like to thank Marta Gomez-Barrero, Jascha Kolberg and Christoph Busch for their helpful discussions and contributions on the finger biometric modality.

References

  • [1] A. Agarwal, D. Yadav, N. Kohli, R. Singh, M. Vatsa, and A. Noore (2017) Face presentation attack with latex masks in multispectral videos. In 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 275–283.
  • [2] Basler acA1300-60gmNIR. https://www.baslerweb.com/en/products/cameras/area-scan-cameras/ace/aca1300-60gmnir/
  • [3] Basler acA1920-150uc. https://www.baslerweb.com/en/products/cameras/area-scan-cameras/ace/aca1920-150uc/
  • [4] Basler acA1920-150um. https://www.baslerweb.com/en/products/cameras/area-scan-cameras/ace/aca1920-150um/
  • [5] Basler acA4096-30um. https://www.baslerweb.com/en/products/cameras/area-scan-cameras/ace/aca4096-30um/
  • [6] Biometric Authentication Under Threat: Liveness Detection Hacking. https://www.blackhat.com/us-19/briefings/schedule/
  • [7] J. Brauers, N. Schulte, and T. Aach (2008) Multispectral filter-wheel cameras: geometric distortion model and compensation algorithms. IEEE Transactions on Image Processing 17 (12), pp. 2368–2380.
  • [8] D. Briers, D. D. Duncan, E. R. Hirst, S. J. Kirkpatrick, M. Larsson, W. Steenbergen, T. Stromberg, and O. B. Thompson (2013) Laser speckle contrast imaging: theoretical and practical limitations. Journal of Biomedical Optics 18 (6), pp. 1–10.
  • [9] A. Bulat and G. Tzimiropoulos (2018) Super-FAN: Integrated Facial Landmark Localization and Super-Resolution of Real-World Low Resolution Faces in Arbitrary Poses with GANs. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 109–117.
  • [10] I. Chingovska, N. Erdogmus, A. Anjos, and S. Marcel (2016) Face recognition systems under spoofing attacks. In Face Recognition Across the Imaging Spectrum, T. Bourlai (Ed.), pp. 165–194.
  • [11] Computar SWIR M1614-SW. https://computar.com/product/1240/M1614-SW
  • [12] Computar SWIR M3514-SW. https://computar.com/product/1336/M3514-SW
  • [13] Eblana Photonics, EP1310-ADF-DX1-C-FM. https://www.eblanaphotonics.com/fiber-comms.php
  • [14] J. J. Engelsma, K. Cao, and A. K. Jain (2019) RaspiReader: Open Source Fingerprint Reader. IEEE Transactions on Pattern Analysis and Machine Intelligence 41 (10), pp. 2511–2524.
  • [15] EO 35mm C Series VIS-NIR. https://www.edmundoptics.com/p/35mm-c-series-vis-nir-fixed-focal-length-lens/22384/
  • [16] EO 700nm Longpass Filter. https://www.edmundoptics.com/p/50mm-diameter-700nm-cut-on-swir-longpass-filter/28899/
  • [17] FLIR Boson 320, 24° HFOV, 9.1mm. https://www.flir.com/products/boson/?model=20320A024
  • [18] FLIR Boson 640, 18° HFOV, 24mm. https://www.flir.com/products/boson/?model=20640A018
  • [19] A. George, Z. Mostaani, D. Geissenbuhler, O. Nikisins, A. Anjos, and S. Marcel (2020) Biometric face presentation attack detection with multi-channel convolutional neural network. IEEE Transactions on Information Forensics and Security 15, pp. 42–55.
  • [20] M. Gomez-Barrero, J. Kolberg, and C. Busch (2019) Multi-modal fingerprint presentation attack detection: analysing the surface and the inside. In 2019 International Conference on Biometrics (ICB), pp. 1–8.
  • [21] P. Gupta and P. Gupta (2014) A vein biometric based authentication system. In Information Systems Security, A. Prakash and R. Shyamasundar (Eds.), Cham, pp. 425–436.
  • [22] M. Hara (2009) Fingerprint image enhancement. In Encyclopedia of Biometrics, S. Z. Li and A. Jain (Eds.), pp. 474–482.
  • [23] Heliopan Infrared Filter. https://www.bhphotovideo.com/c/product/800576-REG/Heliopan_735578_35_5mm_Infrared_Blocking_Filter.html
  • [24] HID® Lumidigm® V-Series Fingerprint Readers, v302-40. https://www.hidglobal.com/products/readers/single-finger-readers/lumidigm-v-series-fingerprint-readers
  • [25] M. E. Hussein, L. Spinoulas, F. Xiong, and W. Abd-Almageed (2018) Fingerprint presentation attack detection using a novel multi-spectral capture device and patch-based convolutional neural networks. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pp. 1–8.
  • [26] International Organization for Standardization (2017) Information technology – Biometric presentation attack detection – Part 3: Testing and reporting.
  • [27] Intel® RealSense Depth Camera D435. https://www.intelrealsense.com/depth-camera-d435/
  • [28] IrisID iCAM-7000 series, iCAM7000S-T. https://www.irisid.com/productssolutions/hardwareproducts/icam7-series/
  • [29] A. Jaiswal, S. Xia, I. Masi, and W. AbdAlmageed (2019) RoPAD: robust presentation attack detection through unsupervised adversarial invariance. In 2019 International Conference on Biometrics (ICB), pp. 1–8.
  • [30] A. Jenerowicz, P. Walczykowski, L. Gladysz, and M. Gralewicz (2018) Application of hyperspectral imaging in hand biometrics. In Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies II, H. Bouma, R. Prabhu, R. J. Stokes, and Y. Yitzhaky (Eds.), Vol. 10802, pp. 129–138.
  • [31] A. Jourabloo, Y. Liu, and X. Liu (2018) Face De-spoofing: Anti-spoofing via Noise Modeling. In Computer Vision – ECCV 2018, V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss (Eds.), pp. 297–315.
  • [32] P. Keilbach, J. Kolberg, M. Gomez-Barrero, C. Busch, and H. Langweg (2018) Fingerprint presentation attack detection using laser speckle contrast imaging. In 2018 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–6.
  • [33] D. P. Kingma and J. Ba (2015) Adam: A Method for Stochastic Optimization. In 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA.
  • [34] J. Kolberg, M. Gomez-Barrero, S. Venkatesh, R. Ramachandra, and C. Busch (2020) Presentation attack detection for finger recognition. In Handbook of Vascular Biometrics, A. Uhl, C. Busch, S. Marcel, and R. Veldhuis (Eds.), pp. 435–463.
  • [35] K. Kotwal, S. Bhattacharjee, and S. Marcel (2019) Multispectral deep embeddings as a countermeasure to custom silicone mask presentation attacks. IEEE Transactions on Biometrics, Behavior, and Identity Science 1 (4), pp. 238–251.
  • [36] Kowa LM12HC. https://lenses.kowa-usa.com/hc-series/473-lm12hc.html
  • [37] Kowa LM25HC. https://lenses.kowa-usa.com/hc-series/475-lm25hc.html
  • [38] M. Lafkih, P. Lacharme, C. Rosenberger, M. Mikram, S. Ghouzali, M. E. Haziti, W. Abdul, and D. Aboutajdine (2015) Application of new alteration attack on biometric authentication systems. In 2015 First International Conference on Anti-Cybercrime (ICACC), pp. 1–5.
  • [39] Y. Liu, A. Jourabloo, and X. Liu (2018) Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 389–398.
  • [40] S. Marcel, M. S. Nixon, J. Fiérrez, and N. W. D. Evans (Eds.) (2019) Handbook of Biometric Anti-Spoofing – Presentation Attack Detection, second edition. Advances in Computer Vision and Pattern Recognition, Springer.
  • [41] Marktech Optoelectronics. https://marktechopto.com/
  • [42] H. Mirzaalian, M. Hussein, and W. Abd-Almageed (2019) On the effectiveness of laser speckle contrast imaging and deep neural networks for detecting known and unknown fingerprint presentation attacks. In 2019 International Conference on Biometrics (ICB), pp. 1–8.
  • [43] R. Munir and R. A. Khan (2019) An extensive review on spectral imaging in biometric systems: Challenges and advancements. Journal of Visual Communication and Image Representation 65, p. 102660.
  • [44] National Institute of Standards and Technology (NIST). https://www.nist.gov/
  • [45] Neurotechnology, MegaMatcher 11.2 SDK. https://www.neurotechnology.com/megamatcher.html
  • [46] O. Nikisins, A. George, and S. Marcel (2019) Domain adaptation in multi-channel autoencoder based features for robust face anti-spoofing. In 2019 International Conference on Biometrics (ICB), pp. 1–8.
  • [47] Osram Opto Semiconductors. https://www.osram.com/os/
  • [48] PCA9745B LED driver. https://www.digikey.com/product-detail/en/nxp-usa-inc/PCA9745BTWJ/568-14156-1-ND/9449780
  • [49] Protocase Designer. https://www.protocasedesigner.com/
  • [50] Protocase. https://www.protocase.com/
  • [51] R. Raghavendra, K. B. Raja, J. Surbiryala, and C. Busch (2014) A low-cost multimodal biometric sensor to capture finger vein and fingerprint. In IEEE International Joint Conference on Biometrics, pp. 1–7.
  • [52] R. Raghavendra, K. B. Raja, S. Venkatesh, F. A. Cheikh, and C. Busch (2017) On the vulnerability of extended multispectral face recognition systems towards presentation attacks. In 2017 IEEE International Conference on Identity, Security and Behavior Analysis (ISBA), pp. 1–8.
  • [53] Roithner Lasertechnik. http://www.roithner-laser.com/
  • [54] B. Roui-Abidi and M. Abidi (2009) Multispectral and hyperspectral biometrics. In Encyclopedia of Biometrics, S. Z. Li and A. Jain (Eds.), pp. 993–998.
  • [55] A. Signoroni, M. Savardi, A. Baronio, and S. Benini (2019) Deep Learning meets Hyperspectral Image Analysis: A multidisciplinary review. Journal of Imaging 5 (5).
  • [56] L. Spinoulas, M. Hussein, H. Mirzaalian, and W. AbdAlmageed (2020) Multi-Modal Fingerprint Presentation Attack Detection: Evaluation on a New Dataset. CoRR.
  • [57] J. Spurný, M. Doležel, O. Kanich, M. Drahanský, and K. Shinoda (2015) New Materials for Spoofing Touch-Based Fingerprint Scanners. In 2015 International Conference on Computer Application Technologies, pp. 207–211.
  • [58] H. Steiner, S. Sporrer, A. Kolb, and N. Jung (2016) Design of an Active Multispectral SWIR Camera System for Skin Detection and Face Verification. Journal of Sensors 2016, pp. 16.
  • [59] C. Sun, A. Jagannathan, J. L. Habif, M. Hussein, L. Spinoulas, and W. Abd-Almageed (2019) Quantitative laser speckle contrast imaging for presentation attack detection in biometric authentication systems. In Smart Biomedical and Physiological Sensor Technology XVI, B. M. Cullum, D. Kiehl, and E. S. McLamore (Eds.), Vol. 11020, pp. 38–46.
  • [60] Teensy 3.6. https://www.pjrc.com/store/teensy36.html
  • [61] Thorlabs. https://www.thorlabs.com/
  • [62] R. Tolosana, M. Gomez-Barrero, C. Busch, and J. Ortega-Garcia (2020) Biometric presentation attack detection: beyond the visible spectrum. IEEE Transactions on Information Forensics and Security 15, pp. 1261–1275.
  • [63] R. Tolosana, M. Gomez-Barrero, J. Kolberg, A. Morales, C. Busch, and J. Ortega-Garcia (2018) Towards fingerprint presentation attack detection based on convolutional neural networks and short wave infrared imaging. In 2018 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–5.
  • [64] S. Venkatesh, R. Ramachandra, K. Raja, and C. Busch (2019) A new multi-spectral iris acquisition sensor for biometric verification and presentation attack detection. In 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), pp. 47–54.
  • [65] B. V. K. Vijaya Kumar (2011) Biometric matching. In Encyclopedia of Cryptography and Security, H. C. A. van Tilborg and S. Jajodia (Eds.), pp. 98–101.
  • [66] Vishay Semiconductor. http://www.vishay.com/
  • [67] Vista Imaging, Inc., VistaEY2 Dual Iris Face Camera. https://www.vistaimaging.com/biometric_products.html
  • [68] L. Wang, G. Leedham, and S.-Y. Cho (2007) Infrared imaging of hand vein patterns for biometric purposes. IET Computer Vision 1 (3-4), pp. 113–122.
  • [69] X. Wu, D. Gao, Q. Chen, and J. Chen (2020) Multispectral imaging via nanostructured random broadband filtering. Opt. Express 28 (4), pp. 4859–4875.
  • [70] Xenics Bobcat 320 GigE 100. https://www.xenics.com/products/bobcat-320-series/
  • [71] J. Yang, A. Bulat, and G. Tzimiropoulos (2020) FAN-Face: A Simple Orthogonal Improvement to Deep Face Recognition. In AAAI Conference on Artificial Intelligence.
  • [72] D. Zhang, Z. Guo, and Y. Gong (2016) Multispectral biometrics systems. In Multispectral Biometrics: Systems and Applications, pp. 23–35.
  • [73] Z. Zhang (2000) A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (11), pp. 1330–1334.