Face Beneath the Ink: Synthetic Data and Tattoo Removal with Application to Face Recognition

by Mathias Ibsen, et al.

Systems that analyse faces have seen significant improvements in recent years and are today used in numerous application scenarios. However, these systems have been found to be negatively affected by facial alterations such as tattoos. To better understand and mitigate the effect of facial tattoos in facial analysis systems, large datasets of images of individuals with and without tattoos are needed. To this end, we propose a generator for automatically adding realistic tattoos to facial images. Moreover, we demonstrate the feasibility of the generation by training a deep learning-based model for removing tattoos from face images. The experimental results show that it is possible to remove facial tattoos from real images without degrading the quality of the image. Additionally, we show that it is possible to improve face recognition accuracy by using the proposed deep learning-based tattoo removal before extracting and comparing facial features.





I Introduction

Facial analysis systems are deployed in various applications ranging from medical use to border control. Such systems are known to be negatively affected by facial occlusions [1]. A specific kind of facial alteration that partially occludes a face is a facial tattoo. Facial tattoos have become more popular in recent years and have been described as a mainstream trend in several major newspapers [2, 3]. Ensuring inclusiveness and accessibility for all individuals, independent of physical appearance, is imperative in the development of fair facial analysis systems. In this regard, facial tattoos are especially challenging, as they cause permanent alterations where ink is injected into the dermis layer of the skin. For instance, Ibsen et al. investigated in [4] the impact of facial tattoos and paintings on state-of-the-art face recognition systems. The authors showed that tattoos might impair the recognition accuracy and thus the security of such facial analysis systems.

Fig. 1: Examples of using deep learning-based tattoo removal.

In coherence with the findings in [4], it is of interest to make facial analysis systems more robust to facial tattoos. One way to do this is face completion, where missing or occluded parts of a face are reconstructed; such approaches have, for instance, been shown to improve face recognition performance for some occlusions [5]. An additional benefit of face completion over approaches like occlusion-aware face recognition is that the reconstructed facial image can be used for other purposes, e.g. visualising how a face might look without the occlusion or preventing tattoos from being used for recognition purposes, which raises ethical issues as discussed in [6].

However, one major problem of face completion for tattoo removal is the lack of sufficient high-quality training data, as no large database of facial tattoos is currently available.

The main focus of this work is, therefore, two-fold. First, we propose a method for synthetically adding tattoos to facial images, which we use to create a large database of facial images with tattoos. The proposed method uses face detection and landmark localisation to divide the face into regions, whereafter suitable placements of tattoos are found. Subsequently, depth and cut-out maps are constructed and used to realistically blend tattoos onto a face. It has recently been shown that synthetic data can be beneficial for face analysis tasks and a good alternative to real data [7]. Second, we use deep learning-based techniques for tattoo removal (as illustrated in Fig. 1) and evaluate the impact of removing facial tattoos on a state-of-the-art face recognition system using a database comprising real facial images with tattoos.

The approach for synthetically adding tattoos to a facial image, in a fully automated way, is to the authors' best knowledge the first of its kind. The proposed generator can be used to create large databases for related fields such as tattoo detection or for studying the effects of tattoos on human perception. Additionally, we are the first to measure the effect of removing facial tattoos on face recognition systems. The code for synthetically adding tattoos to face images will be made publicly available.

In summary, this work makes the following contributions:

  • A novel algorithm for synthetically adding facial tattoos to face images.

  • Use of synthetically generated images with facial tattoos to train a deep learning-based algorithm for removing tattoos from real face images.

  • A comprehensive experimental analysis of the proposed approach including an evaluation of the effect of removing facial tattoos on a face recognition system.

The outline of the remaining article is as follows: Sect. II describes prominent related works, Sect. III describes an automated approach for synthetically blending tattoos onto facial images, which is used in Sect. IV to generate a database of facial images with tattoos. Sects. V and VI show the feasibility of the synthetic generation by training classifiers for tattoo removal and evaluating whether it can improve biometric recognition performance, respectively. Finally, Sect. VII provides a summary of this work.

II Related Work

The following subsections summarise related works w.r.t. synthetic data generation for facial analysis (Sect. II-A), facial alterations (Sect. II-B), and facial completion (Sect. II-C).

II-A Synthetic Data Generation for Face Analysis

Synthetically generated data has seen many application scenarios in face analysis, most notably for addressing the lack of training data. Synthetic data has become especially relevant with the recent advances in deep learning-based algorithms, which usually require large amounts of training data. Privacy regulations, e.g. the European General Data Protection Regulation [8], make sharing and distributing large-scale face databases impracticable, as face images are classified as a special category of personal data when used for biometric identification. As an alternative, researchers have explored the use of synthetic data. The generation of realistic-looking synthetic face data has become especially feasible with the recent advances in Generative Adversarial Networks (GANs), first proposed by Goodfellow et al. in [9]. Prominent work in this field includes StyleGAN, which was first introduced in [10] by Karras et al. and showed, at the time, state-of-the-art performance for synthesising facial images. Since the original work, two improved versions of StyleGAN have been proposed [11, 12]. Much current research in this area focuses on GAN-inversion, where existing face images are encoded into the latent space of a generator. Thereafter, the resulting latent code can be shifted in the latent space, whereby the inverted image of the shifted vector results in an alteration of the original image. This technique can, for instance, be used for face age progression [13]. In addition to the face, some research has also been conducted for other biometric modalities, e.g. fingerprint [14, 15, 16] and iris [17, 18].

Little work has been conducted regarding synthetic data generation of facial images with tattoos. However, in [19] the authors proposed a method for transforming digital portrait images into realistic-looking tattoos. In [20], the author also shows examples of tattoo images added to facial and body images using an existing GAN for drawing art portraits; however, details about this approach are not scientifically documented.

II-B Facial Alterations

Facial alterations can occur in either the physical or digital domain and cause permanent or temporary changes of a face. Several studies have explored the impact of both physical and digital alterations on face recognition systems. In the physical domain, predominantly the effects of makeup and plastic surgery on face recognition have been studied [21]. In [22], the authors collected a database of 900 individuals for analysing the effect of plastic surgery and found that the tested algorithms were unable to effectively account for the appearance changes caused by plastic surgery. More recently, Rathgeb et al. showed in [23], using a database of mostly ICAO-quality face images [24] captured before and after various types of facial plastic surgeries, that the tested state-of-the-art face recognition systems maintained almost perfect verification performance at an operationally relevant False Match Rate (FMR) threshold. Numerous works have addressed the impact of temporary alterations on face recognition systems. In [25], Dantcheva et al. found that makeup can hinder reliable face recognition; similar conclusions were drawn by Wang et al. in [26], where they investigated the impact of human faces under disguise and makeup. This previous work shows that makeup might be successfully used for identity concealment; in [27], the authors additionally showed that makeup can also be used for presentation attacks with the goal of impersonating another identity. In [28], the authors found that especially high-quality makeup-based presentation attacks can hamper the security of face recognition systems. In [29], the authors found that disguised faces severely affect recognition performance, especially for occlusions near the periocular region. The database used by the authors includes different types of disguises, including facial paintings. Coherent with these findings, Ibsen et al. showed in [4] that facial tattoos and paintings can severely affect different modules of a face recognition system, including face detection as well as feature extraction and comparison.

Fig. 2: Synthetic facial tattoo generation workflow.

Ferrara et al. were among the first to show that digital alterations can impair the security of face recognition systems. Especially notable is their work in [30], where they showed the possibility of attacking face recognition systems using morphed images. Specifically, they showed that if a high-quality morphed image is infiltrated into a face recognition system (e.g. stored in a passport), it is likely that the individuals contributing to the morph are positively authenticated by the biometric system. Since then, there have been numerous works on face recognition systems under morphing attacks. For a comprehensive survey, the reader is referred to [31]. Facial retouching is another area which has seen some attention in the research community. While some early works showed that face recognition can be significantly affected by retouching, Rathgeb et al. showed more recently that face recognition systems might be robust to slight alterations caused by retouching [32]. Similar improvements have been shown for geometrical distortions, e.g. stretching [33]. A more recent threat that has arrived with the prevalence of deep learning techniques are so-called DeepFakes [34], which can be used to spread misinformation and as such lead to a loss of trust in digital content. Many researchers are working on the detection or generation of deep learning-based alterations. Several arduous challenges and benchmarks have already been established, for instance, the recent Deepfake Detection Challenge [35], where the top model achieved only limited accuracy on previously unseen data. Generation and detection of deep learning-based alterations are continuously evolving and remain a cat-and-mouse game; interested readers are referred to [36] for a comprehensive survey.

II-C Facial Completion

Most methods for face completion (also called face inpainting) build upon deep learning-based algorithms which are trained on paired images, where each pair contains a non-occluded face and a corresponding occluded face. In [37], the authors proposed an approach for general image completion and showed its applicability to facial completion; they leveraged a fully convolutional neural network trained with global and local context discriminators. Similar work was done in [38], where the authors occluded faces by adding random squares of noise pixels. Subsequently, they trained an autoencoder to reconstruct the occluded part of the face using global and local adversarial losses as well as a semantic parsing loss. Motivated by the prevalence of VR/AR displays which can hinder face-to-face communication, Zhao et al. [39] proposed a new generative architecture with an identity-preserving loss. In [40], Song et al. used landmark detection to estimate the geometry of a face and used it, together with the occluded face image, as input to an encoder-decoder architecture for reconstructing the occluded parts of the face. The proposed approach allows generating diverse results by altering the estimated facial geometry. More recently, Din et al. [41] employed a GAN-based architecture for unmasking masked facial images. The proposed architecture consists of two stages: the first stage detects the masked area of the face and creates a binary segmentation map; the segmentation map is then used in the second stage for facial completion using a GAN-based architecture with two discriminators, where one focuses on the global structure and the other on the occluded parts of the face. In [5], it was found that facial completion can improve face recognition performance.

III Facial Tattoo Generator

To address the lack of existing databases of image pairs of individuals before and after they got facial tattoos, we propose an automated approach for synthetically adding facial tattoos to images. An overview of the proposed generation is depicted in Fig. 2. The process of synthetically adding tattoos to a facial image can be split into two main steps which are described in the following subsections: (1) finding the placement of tattoos in a face and (2) blending the tattoos to the face.

III-A Placement of Tattoos

To find suitable placements of tattoos in a face, we start by localising the facial region and detecting landmarks of the face. To this end, we use dlib [42] which returns a list of 68 landmarks as shown in Fig. 3.

Fig. 3: Facial landmarks detected by dlib.

Thereafter, the landmarks are used to divide the face into small triangular regions by performing a fixed Delaunay triangulation. The regions are then extended to the forehead by using the length of the nose as an estimate. Each region now constitutes a possible placement of a tattoo; however, such a division is inadequate for the placement of larger tattoos. Therefore, the face is additionally divided into six larger regions. The division of the face into large and small regions gives high controllability in the data generation. As indicated, some regions are excluded, i.e. the regions around the nostrils, mouth and eyes. These regions are estimated based on the detected landmarks. The division of a face into regions is illustrated in Fig. 4. The regions make it possible to avoid placing tattoos in heavily bearded areas or on top of glasses if such information is available about the facial images during the generation phase. In our work, we do not use beard or glasses detectors; however, for some of the images, information about beards or glasses is available, which we use to avoid placing tattoos in the affected regions.



Fig. 4: (a) Division of a facial image into regions from landmarks, (b) extended to the forehead, and (c) division into six pre-defined regions.
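The division of the detected landmarks into triangular regions can be sketched with an off-the-shelf Delaunay triangulation. The following Python sketch is illustrative: the `face_regions` helper is ours, and the four-point toy input stands in for dlib's 68 landmarks.

```python
import numpy as np
from scipy.spatial import Delaunay

def face_regions(landmarks):
    """Divide a face into triangular regions via a fixed Delaunay
    triangulation of the detected landmark points (illustrative)."""
    tri = Delaunay(landmarks)
    # Each simplex indexes three landmarks; return their coordinates.
    return landmarks[tri.simplices]  # shape: (n_triangles, 3, 2)

# Toy example with four points instead of dlib's 68 landmarks:
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
triangles = face_regions(pts)  # two triangles covering the quadrilateral
```

Each returned triangle is then a candidate placement region for a small tattoo.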

A tattoo can now be placed in one of the six predefined regions, or the regions can be further combined to allow placing tattoos in larger areas of the face. A combined region is simply a new region consisting of several smaller regions. The exact placement of a tattoo within a region depends on a pre-selected generation strategy. The generation strategy determines (1) the possible regions where a tattoo can be placed, (2) the selection of tattoos, and (3) the size and placement of a tattoo within a region. An example is illustrated in Fig. 5, where one of the cheeks is selected as a possible region, whereafter the largest unoccupied subset within that region is found. Thereafter, the tattoo is placed by estimating its largest possible placement within the selected subset without altering the original aspect ratio of the tattoo. In this work, we use a database comprising more than 600 distinct tattoo templates, which mainly consist of real tattoo designs collected from acquired tattoo books. The selection of which tattoos to place depends on the generation strategies, which are further described in Sect. III-C.

(a) Selected region
(b) Find a subset of the region not occupied (the green area).
(c) Find a placement for the tattoo
Fig. 5: Illustration showing an example of how a placement of a tattoo in a region can be found. The red area in (b) illustrates that there might be some areas within a selected region where a tattoo cannot be placed, e.g. if the area is reserved for another tattoo.
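Estimating the largest possible placement of a tattoo inside the selected subset, without altering its original aspect ratio, reduces to a uniform scale; a minimal sketch (the `fit_tattoo` helper name is ours):

```python
def fit_tattoo(region_w, region_h, tattoo_w, tattoo_h):
    """Largest size at which a tattoo fits inside a free region while
    keeping its original aspect ratio (illustrative sketch)."""
    scale = min(region_w / tattoo_w, region_h / tattoo_h)
    return round(tattoo_w * scale), round(tattoo_h * scale)

# A 100x50 tattoo placed in a 60x60 region keeps its 2:1 aspect ratio:
w, h = fit_tattoo(60, 60, 100, 50)
```

For the 60x60 region above, the tattoo is scaled to 60x30, filling the region's width while keeping its proportions.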

III-B Blending

To blend the tattoos onto faces, various image manipulations are performed using GIMP, automated via Python-Fu [43] (Python-Fu acts as a wrapper to libgimp and allows writing GIMP plug-ins in Python).

Given a facial image and placements of tattoos (see Sect. III-A), each tattoo is overlayed on the facial image by multiplying the tattoo layer with the facial image. Afterwards, the tattoo is displaced to match the contours of the face using displacement mapping. Areas of the tattoo which have been displaced outside the face or inside the mouth, nostrils or eyes are cut out. This is done using cut-out maps (see Fig. 2), which are calculated from the landmarks detected by dlib in the placement phase. Lastly, the tattoo is made more realistic by colour adjustment, Gaussian blurring, and lowering the opacity of the tattoo.
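The multiply blend step can be sketched in NumPy as follows; the `multiply_blend` helper and the fixed opacity value are assumptions for illustration, not GIMP's exact implementation.

```python
import numpy as np

def multiply_blend(face, tattoo, alpha, opacity=0.9):
    """Overlay a tattoo layer on a face crop using a 'multiply' blend
    mode; `alpha` is the tattoo's transparency mask (1 = inked pixel,
    0 = transparent). Illustrative sketch only."""
    face = face.astype(np.float64) / 255.0
    tattoo = tattoo.astype(np.float64) / 255.0
    blended = face * tattoo  # multiply blend mode darkens the skin
    out = (1 - alpha * opacity) * face + alpha * opacity * blended
    return (out * 255).round().astype(np.uint8)
```

Because the blend is multiplicative, the tattoo inherits the local brightness of the underlying skin, which is what makes the same template look different on different faces.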

As previously stated, displacement mapping is used for mapping tattoos to the contours of a face. It is a technique which utilises the depth information of texture maps to alter the positions of pixels according to the depth information in the provided map. Contrary to other approaches, such as bump mapping, it alters the source image by displacing pixels. In displacement mapping, a map $M$ containing values in the range 0-255 is used to displace pixels in a source image $I$. In general, a specific pixel, $p$, is displaced in one direction if $M(p)$ is less than the theoretical average pixel value of the map (127.5); otherwise, it is displaced in the opposite direction. For the displacement technique used in this work, a pixel in the source image is displaced both vertically and horizontally.

More specifically, let $c$ be a coefficient controlling the maximum displacement. The distance for displacing a pixel, $p$, in the vertical and horizontal direction is then:

$$d(p) = c \cdot \frac{M(p) - 127.5}{127.5}$$

where a negative $d(p)$ corresponds to a displacement in one direction and a positive $d(p)$ to a displacement in the opposite direction.
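A sketch of the per-pixel displacement distance, consistent with the description above (the helper name and the exact normalisation are our assumptions):

```python
import numpy as np

def displacement(dmap, c):
    """Displacement distance per pixel from a map with values 0-255.
    Pixels below the theoretical average (127.5) are displaced in one
    direction (negative distance), pixels above it in the opposite
    direction; `c` controls the maximum displacement (sketch)."""
    return c * (dmap.astype(np.float64) - 127.5) / 127.5
```

The same distance is applied both vertically and horizontally, so flat map regions (values near 127.5) leave the tattoo unchanged while dark and bright regions bend it in opposite directions.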

PRNet [44] is used to generate the depth maps used in this work. PRNet is capable of performing 3D face reconstruction from 2D face images, and as such, it can also approximate depth maps from 2D facial images. An example of a depth map generated using PRNet is shown in Fig. 6(a).

(a) Depth image generated by PRNet
(b) Transformed depth image
Fig. 6: Example of (a) a depth map generated from a facial image using PRNet and (b) after it has been transformed.
(a) not displaced
(b) displaced
Fig. 7: Facial images with tattoos (a) before and (b) after applying the displacement technique. For (b) the tattoo is bended around the anticipated 3D shape of the nose. Best viewed in electronic format (zoomed in).

As seen in Fig. 6(a), the pixel values in the face region are rather bright, and there is little contrast. The small contrast between the pixel values and the high offset from the theoretical average pixel value imply that the depth map will not work well as is, since tattoos would be displaced too much in certain regions and too little in others. Therefore, to make the displacement more realistic, the depth map generated by PRNet is transformed by increasing the contrast and lowering the brightness of the map. Fig. 6(b) shows an example of a transformed depth map; as can be seen, the pixel values are much closer to the theoretical average value than in the unaltered map, while the contrast around the nose, eyes and mouth is still high. Fig. 7 shows an example where two facial tattoos are displaced to match the contours of a face.
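The depth-map transformation can be sketched as a linear contrast/brightness adjustment around the mid value; the parameter values below are illustrative, not the ones used in the paper.

```python
import numpy as np

def transform_depth(dmap, contrast=1.8, brightness=-60.0):
    """Increase contrast around the mid value (127.5) and lower the
    brightness of a PRNet depth map (illustrative parameters)."""
    out = (dmap.astype(np.float64) - 127.5) * contrast + 127.5 + brightness
    return np.clip(out, 0.0, 255.0)
```

Raising the contrast amplifies the differences around the nose, eyes and mouth, while the brightness offset pulls the overall map back towards the theoretical average.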

Fig. 8: Examples of facial images where parts of one or more tattoos have been cut out.

Black ink tends to change colour slightly over time due to the pigments used. Therefore, for colour adjustment, all pixels of a tattoo which are similar to pure black are selected and changed to simulate different shades of grey, green, and blue. Due to the multiply blend mode and the varying illumination conditions and skin tones of the images, black tattoos appear different across facial images. The colour adjustments of black pixels are determined per tattoo, and as such, slight variations can occur between different tattoos in the same facial image. Examples are given in Fig. 9.

Fig. 9: Examples of black tattoos blended to facial images.
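The colour adjustment of near-black pixels can be sketched as follows; the threshold and the tint values are our assumptions, since the generator's exact colour ranges are not documented here.

```python
import numpy as np

rng = np.random.default_rng(0)

def tint_black_pixels(tattoo, threshold=30):
    """Shift pixels close to pure black towards a randomly chosen grey,
    green or blue tint, simulating aged black ink (sketch)."""
    out = tattoo.astype(np.int64).copy()
    near_black = out.max(axis=-1) < threshold  # per-pixel darkness test
    tint = rng.choice(np.array([[40, 40, 40],    # grey
                                [20, 45, 30],    # greenish
                                [20, 30, 55]]))  # bluish
    out[near_black] += tint
    return np.clip(out, 0, 255).astype(np.uint8)
```

Because one tint is drawn per tattoo, two tattoos on the same face can end up with slightly different shades, matching the behaviour described above.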

III-C Generation Strategies

By varying how tattoos are selected and placed (Sect. III-A), many different types of facial images with tattoos can be generated. For the database used in this work, we employed two different strategies. In the first strategy, a desired coverage percentage of tattoos in a face is randomly chosen from a specified range. Subsequently, tattoos are arbitrarily selected and placed on facial regions until the resulting coverage approximates the desired coverage. The coverage of tattoos in a face is calculated based on the total area of the facial regions in which tattoos can be placed (see Fig. 4(c)) and the number of non-transparent pixels in the placed tattoos. In the second strategy, a specific region is always selected. The first strategy makes it possible to create databases where tattoos are placed arbitrarily until a selected coverage percentage has been reached (see Fig. 10(a)-(c)). The second strategy allows for more controlled placement of tattoos, e.g. placing tattoos in the entire face region (Fig. 10(d)) or in a specific region (Fig. 10(e)-(f)).

Fig. 10: Examples for different types of tattooed face that can be generated: (a) 5%, (b) 15%, (c) 25% coverage, (d) entire face, (e) single tattoo, and (f) specific region.
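The first generation strategy can be sketched as a simple loop that keeps placing randomly chosen tattoos until the desired coverage is approximated (the helper names and the pixel-count interface are ours):

```python
import random

def coverage(placed_pixel_counts, total_region_pixels):
    """Coverage: non-transparent tattoo pixels over the total number
    of pixels in the placeable facial regions (sketch)."""
    return sum(placed_pixel_counts) / total_region_pixels

def place_until(target, tattoo_pixel_counts, total_region_pixels, seed=0):
    """Randomly select tattoos until the resulting coverage reaches
    the desired fraction (first generation strategy, sketched)."""
    rnd = random.Random(seed)
    placed = []
    while coverage(placed, total_region_pixels) < target:
        placed.append(rnd.choice(tattoo_pixel_counts))
    return placed
```

In the real generator each selected tattoo must additionally be assigned a free region (Fig. 5); here only the coverage bookkeeping is shown.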

IV Synthetic Tattoo Database

This section describes the generation of a large database of facial images with tattoos. The database is used in Sect. V to train deep learning-based models for removing tattoos from facial images. To generate the synthetic tattoo database, subsets of original images from the FERET [45], FRGCv2 [46], and CelebA [47] datasets were used. An overview of the generated database is given in Table I. For the FERET and FRGCv2 datasets, different generation strategies were used, including facial images where tattoos have been placed randomly with specific coverages ranging from 5% to 25%, as well as placement of single tattoos. For the single tattoos, we generated two versions: one where the tattoo is placed in the entire facial region and another where portrait tattoos are blended into a random region of the face. For the CelebA database, which is more uncontrolled, facial tattoos were placed randomly. To simulate varying image qualities, data augmentation was performed by randomly applying differing degrees of JPEG compression or Gaussian blur to all images. Tattooed images and corresponding original (bona fide) images were paired, such that the same augmentation was applied to corresponding images.

Database  Subjects  Bona fide images  Tattooed images
FERET     529       621               6,743
FRGCv2    533       1,436             16,209
CelebA    6,872     6,872             6,872
TABLE I: Overview of the generated database (before augmentation).
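Applying the same augmentation to a tattooed image and its bona fide counterpart can be done by seeding the random choice on a shared pair identifier; a sketch (the function name, augmentation list, and strength levels are our assumptions):

```python
import random

def paired_augmentation(pair_id, n_levels=5, seed=42):
    """Choose the same augmentation (JPEG compression or Gaussian
    blur, at a random strength level) for both images of a pair by
    seeding on the pair identifier (illustrative sketch)."""
    rnd = random.Random(f"{seed}-{pair_id}")
    kind = rnd.choice(["jpeg", "blur"])
    level = rnd.randrange(n_levels)
    return kind, level
```

Because the generator is seeded per pair, calling `paired_augmentation` twice with the same pair id yields the same augmentation, so the tattooed and bona fide images are degraded identically.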

Examples of images in the generated database are depicted in Fig. 11.

Fig. 11: Examples of generated facial images with tattoos.

V Tattoo Removal

To evaluate the realism of the proposed data generation and its use in real-world applications, two models are trained for the task of tattoo removal and compared with one pre-trained approach. Sect. V-A briefly describes the different models used for removing tattoos. Sect. V-B describes different metrics for evaluating the quality of the tattoo removal which is then evaluated in Sect. V-C.

V-A Models

Two different models were trained for removing tattoos; for SkinDeep, an existing pre-trained model was additionally employed. In the subsequent sections, pix2pix* and SkinDeep* denote models that are trained from scratch on the synthetic database described in Sect. IV.

pix2pix

is a supervised conditional GAN for image-to-image translation [48]. For the generator, a U-Net architecture is used, whereas the discriminator is based on a PatchGAN classifier which divides the image into patches and discriminates between bona fide (i.e., real) and fake images.

SkinDeep

is a pre-existing model [20] for removing body tattoos utilising well-known components which have shown good results for other image-to-image translation tasks. The generator is a pre-trained U-Net architecture with spectral normalisation and self-attention.

An overview of the three models used in this work is given in Tab. II.

Model          Training data
SkinDeep       Pre-trained [20]
SkinDeep*      Own (Sect. IV)
pix2pix* [48]  Own (Sect. IV)
TABLE II: Overview of the three tattoo removal models and the training data used. The * indicates that the model has been trained on the synthetic data described in Sect. IV.

V-B Quality Metrics

To evaluate the quality of the different tattoo removal models, we use three different metrics commonly used in the literature:

Peak signal-to-noise ratio (PSNR)

is a measurement of the error between an input and an output image and is calculated as follows:

$$\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{MAX^2}{\mathrm{MSE}(X, Y)}\right)$$

where $MAX$ is the theoretical maximum pixel value (i.e. 255 for 8-bit channels) and $\mathrm{MSE}(X, Y)$ is the mean squared error between the ground truth image $X$ and the inpainted image $Y$. The PSNR is measured in decibels, and a higher value indicates better quality of the reconstructed image.
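PSNR is straightforward to compute; a minimal NumPy implementation for 8-bit images:

```python
import numpy as np

def psnr(x, y, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a ground-truth image x
    and a reconstructed image y (8-bit images assumed)."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```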

Mean Structural Similarity Index (MSSIM)

as given in [49], is defined as follows:

$$\mathrm{MSSIM}(X, Y) = \frac{1}{M} \sum_{j=1}^{M} \mathrm{SSIM}(x_j, y_j)$$

where $X$ and $Y$ are the ground truth image and inpainted image, respectively, $M$ is the number of local windows in an image, and $x_j$ and $y_j$ are the image content of the $j$'th local window. The SSIM over local window patches is defined as:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where $\mu_x$ and $\mu_y$ are the mean values of the local window patches $x$ and $y$, respectively; $\sigma_x^2$ and $\sigma_y^2$ are their local variances; and $\sigma_{xy}$ is the local covariance of $x$ and $y$. $C_1$ and $C_2$ are constants set based on the same parameter settings as Wang et al. [49]. MSSIM returns a value in the range of 0 to 1, where 1 means that $X$ and $Y$ are identical.
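A simplified MSSIM computation over non-overlapping windows (the reference implementation of [49] uses a sliding Gaussian window; the window size here is our assumption):

```python
import numpy as np

def ssim(x, y, C1=6.5025, C2=58.5225):
    """SSIM of two equally sized windows; C1 and C2 follow the common
    K1=0.01, K2=0.03, L=255 settings of Wang et al."""
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()  # local covariance
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

def mssim(X, Y, win=8):
    """Mean SSIM over non-overlapping win x win windows (sketch)."""
    scores = [ssim(X[i:i + win, j:j + win], Y[i:i + win, j:j + win])
              for i in range(0, X.shape[0] - win + 1, win)
              for j in range(0, X.shape[1] - win + 1, win)]
    return float(np.mean(scores))
```

For identical inputs the numerator and denominator of each window's SSIM coincide, so the mean score is exactly 1.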

Visual Information Fidelity (VIF)

is a full-reference image quality assessment measure proposed by Sheikh and Bovik in [50]. VIF is derived from a statistical model for natural scenes as well as models for image distortion and the human visual system. It returns a value in the range of 0 to 1, where 1 indicates that the ground truth and inpainted images are identical. We use the pixel-domain version as implemented in [51].

We estimate the different quality metrics both on portrait images, i.e. where the entire face is visible, and on the inner part of the face (corresponding to the area covered by the 68 dlib landmark points; see Fig. 3), where we focus only on the area from the eyebrows to the chin; these regions are shown in Fig. 12. Additionally, since the pre-trained SkinDeep model returns an image of a larger size, we downscale its output to the same size as that of the other models.

(a) Portrait
(b) Inner
Fig. 12: Examples of (a) a full portrait image where the entire face is visible and (b) a crop of the inner face region.
Fig. 13: Examples of using deep-learning based algorithms for facial tattoo removal. Best viewed in electronic format (zoomed in).

V-C Removal Quality Results

We use a total of 41 facial images with tattoos where the tattoos have been manually removed using Photoshop; we refer to these as our ground truth images. Examples of using the different deep learning-based methods for removing tattoos are given in Fig. 13. As seen, the best model (SkinDeep*) is able to remove most tattoos with only a few artefacts, whereas the other models perform less well and, for some images, alter the face or fail to accurately remove all tattoos.

Different quality scores are reported in Tab. III which shows that the SkinDeep* model performs best in most scenarios especially when only looking at the inner part of the face.

Scenario    Portrait                 Inner
            MSSIM   PSNR   VIF      MSSIM   PSNR   VIF
Tattooed
pix2pix*
SkinDeep
SkinDeep*

TABLE III: Quality measurements of the reconstructed images compared to ground truth images where tattoos have been manually removed. "Tattooed" denotes the baseline case where the tattooed images are compared to the ground truth images.

While the tattoo removal performs well in many scenarios, there are also some extreme cases where it does not. Examples of removing tattoos with large facial coverage are shown in Fig. 14. The depicted example images clearly show the limitations of the presented approach.

Fig. 14: Facial images with extreme coverage of tattoos, which remain challenging for our tattoo removal approach. Before (left) and after (right) tattoo removal.

VI Application to Face Recognition

In this section, we describe how tattoo removal can be integrated into and used in a face recognition system. A face recognition system consists of several preprocessing modules, such as face alignment and quality estimation. These modules help to minimise factors which are unimportant for face recognition and ensure that only images of sufficient quality are used during authentication. As part of the preprocessing, we propose to use the deep learning-based removal algorithms described in Sect. V. While facial tattoos can be seen as distinctive and helpful for identifying individuals, tattoo removal is useful for face recognition in cases where only one of the face images in a comparison contains tattoos [4]. In our experiments, we trained the classifiers to remove facial tattoos from aligned images, and as such, we assume that our input images have already been aligned, since our focus is on improving feature extraction and comparison. Note that the proposed tattoo removal method could also be retrained on unaligned images and placed before the detection module to improve detection accuracy.

VI-A Experimental Setup

In the following, we describe the database, the employed face recognition system, and metrics used to evaluate the biometric performance:


Database: for the evaluation, we use the publicly available HDA Facial Tattoo and Painting Database (https://dasec.h-da.de/research/biometrics/hda-facial-tattoo-and-painting-database), which consists of 250 image pairs of individuals with and without real facial tattoos. The database was originally collected by Ibsen et al. [4]. All images have been aligned using the RetinaFace facial detector [52]. Examples of original image pairs (before tattoo removal) are given in Fig. 15. These pairs are used for evaluating the performance of a face recognition system. For evaluating the effect of tattoo removal, the models described in Sect. V-A are applied to the facial images containing tattoos, after which the resulting images are used during the evaluation.

Fig. 15: Examples of image-pairs in the HDA facial tattoo and painting database.
Face recognition system

to evaluate the applicability of tattoo removal for face recognition, we use the established pre-trained ArcFace model (LResNet100E-IR, ArcFace@ms1m-refine-v2) together with the RetinaFace facial detector.

Recognition performance metrics

the effect of removing facial tattoos is evaluated empirically [53]. Specifically, we measure the FNMR at operationally relevant thresholds corresponding to FMRs of 0.1% and 1%:

  • False Match Rate (FMR): the proportion of the completed biometric non-mated comparison trials that result in a false match.

  • False Non-Match Rate (FNMR): the proportion of the completed biometric mated comparison trials that result in a false non-match.

Additionally, we report the Equal Error Rate (EER), i.e. the point where FNMR and FMR are equal. To show the distribution of comparison scores, boxplots are used. The comparison scores are computed between pairs of feature vectors using the Euclidean distance.
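Under these definitions, and assuming dissimilarity scores (lower means more similar), the reported quantities can be sketched as follows; the helper names are ours, not from the paper:

```python
import numpy as np

def fnmr_at_fmr(mated, nonmated, target_fmr):
    """FNMR at the decision threshold that yields the target FMR."""
    nonmated = np.sort(np.asarray(nonmated, dtype=float))
    # threshold below which target_fmr of non-mated comparisons are (falsely) accepted
    thr = np.quantile(nonmated, target_fmr)
    return float(np.mean(np.asarray(mated, dtype=float) > thr))  # mated pairs rejected

def eer(mated, nonmated):
    """Equal Error Rate: operating point where FNMR and FMR coincide."""
    mated = np.asarray(mated, dtype=float)
    nonmated = np.asarray(nonmated, dtype=float)
    thresholds = np.sort(np.concatenate([mated, nonmated]))
    fnmrs = np.array([np.mean(mated > t) for t in thresholds])
    fmrs = np.array([np.mean(nonmated <= t) for t in thresholds])
    i = int(np.argmin(np.abs(fnmrs - fmrs)))
    return float((fnmrs[i] + fmrs[i]) / 2.0)
```

For perfectly separated score distributions the EER is zero; overlapping mated and non-mated scores, as caused by heavy facial tattoos, push both the EER and the FNMR at fixed FMR upwards.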

VI-B Experimental Results

The effect of removing tattoos on the computed comparison scores is visualised in Fig. 16. As can be seen, the comparison scores are not significantly affected for the pix2pix* and SkinDeep models, which showed only moderate capabilities of removing tattoos from facial images. However, for SkinDeep*, which was trained on the synthetic database, the dissimilarity scores decrease on average, indicating that recognition performance might improve.

[Table IV layout: EER and FNMR at FMR = 0.1% and FMR = 1%, for the tattooed baseline and each removal model.]
TABLE IV: Biometric performance results for ArcFace.
Fig. 16: Boxplots showing the effect of tattoo removal on biometric comparison scores.

Table IV shows the biometric performance scores calculated on the tattooed images and on the inpainted facial images for the different models. The scores indicate that realistic removal of tattoos (SkinDeep*) can improve face recognition performance: compared to the baseline (tattooed), the EER is halved and the FNMR is reduced at both considered FMR thresholds. These results indicate that a tattoo removal module can be integrated into the processing chain of a face recognition system and help make it more robust towards facial tattoos.

VII Summary

In this paper, we proposed an automatic approach for blending tattoos onto facial images and showed that it is possible to use synthetic data to train a deep learning-based facial tattoo removal algorithm, thereby enhancing the performance of a state-of-the-art face recognition system. To create a facial image with tattoos, the face is first divided into regions using landmark detection, after which suitable tattoo placements are found. Subsequently, depth reconstruction maps and cut-out maps are estimated from the input image. Thereafter, this information is combined to realistically blend tattoos onto the facial image. Using this approach, we created a large database of facial images with tattoos and used it to train a deep learning-based algorithm for removing tattoos. Experimental results show that the tattoo removal achieves high quality. To further show the feasibility of the reconstruction, we evaluated the effect of removing facial tattoos on a state-of-the-art face recognition system and found that it can improve recognition performance.


This research work has been funded by the German Federal Ministry of Education and Research and the Hessian Ministry of Higher Education, Research, Science and the Arts within their joint support of the National Research Center for Applied Cybersecurity ATHENE, and by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 860813 (TReSPAsS-ETN).


  • [1] D. Zeng, R. Veldhuis, and L. Spreeuwers, “A survey of face recognition techniques under occlusion,” IET Biometrics, vol. 10, no. 6, pp. 581–606, 2021.
  • [2] S. Kurutz, “Face Tattoos Go Mainstream,” https://www.nytimes.com/2018/08/04/style/face-tattoos.html, 2018, last accessed: 2021-10-27.
  • [3] M. Abrams, “Why are face tattoos the latest celebrity trend,” https://www.standard.co.uk/insider/style/face-tattoos-celebrity-trend-justin-bieber-presley-gerber-a4360511.html, 2020, last accessed: 2021-10-27.
  • [4] M. Ibsen, C. Rathgeb, T. Fink, P. Drozdowski, and C. Busch, “Impact of facial tattoos and paintings on face recognition systems,” IET Biometrics, vol. 10, no. 6, pp. 706–719, 2021.
  • [5] J. Mathai, I. Masi, and W. AbdAlmageed, “Does generative face completion help face recognition?” in Int’l. Conf. on Biometrics (ICB), 2019, pp. 1–8.
  • [6] F. Bacchini and L. Lorusso, “A tattoo is not a face. ethical aspects of tattoo-based biometrics,” Journal of Information, Communication and Ethics in Society, vol. 16, no. 2, 2017.
  • [7] E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio et al., “Fake it till you make it: Face analysis in the wild using synthetic data alone,” in Proc. of the IEEE/CVF Int’l. Conf. on Computer Vision (ICCV), 2021, pp. 3681–3691.
  • [8] European Council, “Regulation of the european parliament and of the council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (general data protection regulation),” April 2016.
  • [9] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems, vol. 27, 2014.
  • [10] T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” in IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4396–4405.
  • [11] T. Karras, S. Laine, M. Aittala, J. Hellsten et al., “Analyzing and improving the image quality of StyleGAN,” in IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8107–8116.
  • [12] T. Karras, M. Aittala, S. Laine, E. Härkönen et al., “Alias-free generative adversarial networks,” in Proc. NeurIPS, 2021.
  • [13] M. Grimmer, R. Raghavendra, and C. Busch, “Deep face age progression: A survey,” IEEE Access, vol. 9, pp. 83 376–83 393, 2021.
  • [14] R. Cappelli, D. Maio, and D. Maltoni, “SFinGe: an approach to synthetic fingerprint generation,” Int’l. Workshop on Biometric Technologies, 2004.
  • [15] J. Priesnitz, C. Rathgeb, N. Buchmann, and C. Busch, “SynCoLFinGer: Synthetic Contactless Fingerprint Generator,” arXiv e-prints, p. arXiv:2110.09144, Oct. 2021.
  • [16] A. B. V. Wyzykowski, M. P. Segundo, and R. de Paula Lemes, “Level three synthetic fingerprint generation,” in 25th Int’l Conf. on Pattern Recognition (ICPR), 2021, pp. 9250–9257.
  • [17] P. Drozdowski, C. Rathgeb, and C. Busch, “SIC-Gen: A synthetic Iris-Code generator,” in Int’l. Conf. of the Biometrics Special Interest Group (BIOSIG), 2017, pp. 61–69.
  • [18] J. Dole, “Synthetic Iris Generation, Manipulation, & ID Preservation,” https://eab.org/cgi-bin/dl.pl?/upload/documents/2256/06-Dole-SyntheticIrisPresentation-210913.pdf, 2021, last accessed: 2021-12-26.
  • [19] X. Xu, W. M. Matkowski, and A. W. K. Kong, “A portrait photo-to-tattoo transform based on digital tattooing,” Multimedia Tools and Applications, vol. 79, no. 33, pp. 24 367–24 392, 2020.
  • [20] V. Madhavan, “SkinDeep,” https://github.com/vijishmadhavan/SkinDeep, 2021, last accessed: 2021-11-01.
  • [21] C. Rathgeb, A. Dantcheva, and C. Busch, “Impact and detection of facial beautification in face recognition: An overview,” IEEE Access, vol. 7, pp. 152 667–152 678, 2019.
  • [22] R. Singh, M. Vatsa, H. S. Bhatt, S. Bharadwaj et al., “Plastic surgery: A new dimension to face recognition,” IEEE Trans. on Information Forensics and Security, vol. 5, no. 3, pp. 441–448, 2010.
  • [23] C. Rathgeb, D. Dogan, F. Stockhardt, M. D. Marsico, and C. Busch, “Plastic surgery: An obstacle for deep face recognition?” in Proc. 15th IEEE Computer Society Workshop on Biometrics (CVPRW), 2020, pp. 3510–3517.
  • [24] International Civil Aviation Organization, “Machine readable passports – part 9 – deployment of biometric identification and electronic storage of data in eMRTDs,” International Civil Aviation Organization (ICAO), 2015.
  • [25] A. Dantcheva, C. Chen, and A. Ross, “Can facial cosmetics affect the matching accuracy of face recognition systems?” in IEEE Fifth Int’l. Conf. on Biometrics: Theory, Applications and Systems (BTAS), 2012, pp. 391–398.
  • [26] T. Y. Wang and A. Kumar, “Recognizing human faces under disguise and makeup,” in IEEE Int’l. Conf. on Identity, Security and Behavior Analysis (ISBA), 2016, pp. 1–7.
  • [27] C. Chen, A. Dantcheva, T. Swearingen, and A. Ross, “Spoofing faces using makeup: An investigative study,” in IEEE Int’l. Conf. on Identity, Security and Behavior Analysis (ISBA), 2017, pp. 1–8.
  • [28] C. Rathgeb, P. Drozdowski, D. Fischer, and C. Busch, “Vulnerability assessment and detection of makeup presentation attacks,” in Proc. Int’l. Workshop on Biometrics and Forensics (IWBF).   IEEE, 2020, pp. 1–6.
  • [29] M. Singh, R. Singh, M. Vatsa, N. K. Ratha, and R. Chellappa, “Recognizing disguised faces in the wild,” Trans. on Biometrics, Behavior, and Identity Science (TBIOM), vol. 1, no. 2, pp. 97–108, 2019.
  • [30] M. Ferrara, A. Franco, and D. Maltoni, “The magic passport,” in IEEE Int’l. Joint Conf. on Biometrics (IJCB), 2014, pp. 1–7.
  • [31] U. Scherhag, C. Rathgeb, J. Merkle, R. Breithaupt, and C. Busch, “Face recognition systems under morphing attacks: A survey,” IEEE Access, vol. 7, pp. 23 012–23 026, 2019.
  • [32] C. Rathgeb, A. Botaljov, F. Stockhardt, S. Isadskiy et al., “PRNU-based detection of facial retouching,” IET Biometrics, 2020.
  • [33] M. F. Hedberg, “Effects of sample stretching in face recognition,” in Int’l. conf. of the Biometrics Special Interest Group (BIOSIG), 2020, pp. 1–4.
  • [34] L. Verdoliva, “Media forensics and deepfakes: An overview,” IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 5, pp. 910–932, 2020.
  • [35] C. C. Ferrer, B. Pflaum, J. Pan, B. Dolhansky et al., “Deepfake detection challenge results: An open initiative to advance AI,” https://ai.facebook.com/blog/deepfake-detection-challenge-results-an-open-initiative-to-advance-ai/, 2020, last accessed: 2021-11-12.
  • [36] R. Tolosana, R. Vera-Rodriguez, J. Fierrez, A. Morales, and J. Ortega-Garcia, “Deepfakes and beyond: A survey of face manipulation and fake detection,” Information Fusion, vol. 64, pp. 131–148, 2020.
  • [37] S. Iizuka, E. Simo-Serra, and H. Ishikawa, “Globally and locally consistent image completion,” ACM Trans. Graph., vol. 36, no. 4, 2017.
  • [38] Y. Li, S. Liu, J. Yang, and M.-H. Yang, “Generative face completion,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5892–5900.
  • [39] Y. Zhao, W. Chen, J. Xing, X. Li et al., “Identity preserving face completion for large ocular region occlusion,” in 29th British Machine Vision Conf. (BMVC), 2018.
  • [40] L. Song, J. Cao, L. Song, Y. Hu, and R. He, “Geometry-aware face completion and editing,” Proceedings of the AAAI Conf. on Artificial Intelligence, vol. 33, no. 01, pp. 2506–2513, 2019.
  • [41] N. U. Din, K. Javed, S. Bae, and J. Yi, “A novel GAN-based network for unmasking of masked face,” IEEE Access, vol. 8, pp. 44 276–44 287, 2020.
  • [42] D. King, “Dlib-ml: A machine learning toolkit,” Journal of Machine Learning Research, 2009.
  • [43] GIMP, “The “Python-Fu” Submenu,” https://docs.gimp.org/2.10/en/gimp-filters-python-fu.html, last accessed: 2021-10-27.
  • [44] Y. Feng, F. Wu, X. Shao, Y. Wang, and X. Zhou, “Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network,” in ECCV, 2018.
  • [45] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The FERET database and evaluation procedure for face-recognition algorithms,” Image and Vision Computing, vol. 16, no. 5, pp. 295–306, 1998.
  • [46] P. J. Phillips, P. J. Flynn, T. Scruggs, K. Bowyer et al., “Overview of the face recognition grand challenge,” in IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR), vol. 1.   IEEE, 2005, pp. 947–954.
  • [47] Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep learning face attributes in the wild,” in Proceedings of Int’l Conf. on Computer Vision (ICCV), 2015.
  • [48] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5967–5976.
  • [49] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
  • [50] H. R. Sheikh and A. C. Bovik, “Image information and visual quality,” IEEE Trans. on Image Processing, vol. 15, no. 2, pp. 430–444, 2006.
  • [51] A. Khalel, “Sewar,” https://github.com/andrewekhalel/sewar, 2021, last accessed: 2021-12-05.
  • [52] J. Deng, J. Guo, E. Ververas, I. Kotsia, and S. Zafeiriou, “RetinaFace: Single-shot multi-level face localisation in the wild,” in Proceedings of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2020.
  • [53] ISO/IEC JTC1 SC37 Biometrics, ISO/IEC 19795-1:2021. Information Technology – Biometric Performance Testing and Reporting – Part 1: Principles and Framework, International Organization for Standardization, 2021.