Forensic Shoe-print Identification: A Brief Survey

01/05/2019 ∙ by Imad Rida, et al. ∙ Universidade da Beira Interior

As an advanced research topic in forensic science, automatic shoe-print identification has been extensively studied over the last two decades, since shoe marks are the clues most frequently left at a crime scene. Hence, these impressions provide pertinent evidence that helps investigations progress and identify potential criminals. The main goal of this survey is to provide a cohesive overview of the research carried out in forensic shoe-print identification and its basic background. Apart from defining the problem and describing the phases that typically compose the processing chain of shoe-print identification, we provide a summary and comparison of the state-of-the-art approaches, in order to guide the neophyte and help advance the research topic. This is done by introducing simple and basic taxonomies as well as summaries of the state-of-the-art performance. Lastly, we discuss the current open problems and challenges in this research topic and point out promising directions in this field.




1 Introduction

The place where criminals commit their unlawful act, namely the Scene of Crime (SoC), is of extreme importance for the police [1]. According to Locard’s exchange principle, the perpetrator of a crime will inevitably leave something at the SoC [2]. Hence, based on this theory, finding and recovering physical evidence is a crucial and fundamental task in order to identify the criminals and exculpate the unduly accused [3].

Fingerprints, blood and hair are examples of clues that can be found at the SoC [4, 5, 6, 7, 8, 9]. Unfortunately, criminals often adopt countermeasures such as wearing gloves in order to neutralize these clues. On the other hand, although shoe-prints are not unique, it has been noted that they have a greater chance of being present at the SoC than latent fingerprints, for instance [10, 11].

(a) shoe-print database
(b) SoC prints
Fig. 1: Example of shoe-print images from database and SoC [12].

A shoe mark occurs due to the contact of a shoe with a surface (see Figure 1). Although they are less distinctive than other biometric traits [13, 14, 15, 16], footwear impressions hold great and very promising potential in assisting forensic investigations. For instance, in the case of multiple attacks within a short time, it is unlikely that an attacker would discard or change his/her footwear between crime scenes. It has also been reported by Alexandre [17] that approximately of shoe-prints can be retrieved in SoC. A shoe-print lifted from a SoC can potentially be used in two different tasks:

  • Match it against a database (such as Foster and Freeman Ltd) in order to determine its model.

  • Match it against other shoe-prints taken from other SoCs to verify whether the same shoe model has been used.

Unfortunately, carrying out the matching based on human knowledge (manually through a paper catalogue or semi-automatically through a computer database) is not a trivial task [18]. Indeed, the limitations of such systems are obvious for large databases, since the retrieved sample must be matched against all database samples one by one. Furthermore, it is harder for several users to agree on a classification, especially in the case of degraded shoe mark images. This clearly shows the need for a fully automated shoe-print identification system.

Despite the efforts devoted to introducing efficient automated computer systems able to search and match shoe-prints, there is no existing survey bringing together all existing works. The main aim of this paper is to propose a comprehensive overview of existing automatic shoe-print identification techniques. This is intended to provide researchers with state-of-the-art approaches in order to help advance the research topic as well as to guide the neophyte. Section 2 presents the main architecture of an automated shoe-print identification system. Section 3 introduces the holistic techniques. Section 4 describes the local techniques. Section 5 reports the evaluation and obtained performances. Section 6 gives the discussion. Finally, Section 7 offers our conclusion.

2 Automated shoe-print identification

The main architecture of an automated shoe-print identification system can be divided into three main tasks [19]: removing the different distortions and enhancing the quality of images by pre-processing; generating discriminative features of a shoe-print using feature extraction techniques; and finally classifying/matching the query sample against the whole database of shoe-print models and assigning its class label (i.e. shoe type) using the extracted features and a trained classifier or matching function (see Figure 2).

Relevant and discriminative features are of critical and fundamental importance for achieving high performance in any automatic identification system [20]. Feature extraction seeks to transform an initial raw shoe-print image, fixing its dimensionality, into a new set of features containing meaningful information that helps assign observations to the correct class, whether they are training samples or new unseen data [21]. Existing state-of-the-art techniques mainly differ by the type of extracted features. They can essentially be organized into two main categories: holistic and structural (local) methods.

Fig. 2: Cohesive schema of the typical processing chain of an automated shoe-print identification system.
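
As a rough illustration of the three stages in Fig. 2, the following Python sketch chains a toy pre-processing step, a placeholder feature extractor and a similarity-based ranking. The function names (`preprocess`, `extract_features`, `identify`), the min-max normalisation and the cosine similarity are assumptions made purely for illustration; they do not correspond to any specific method surveyed here.

```python
import numpy as np


def preprocess(image):
    """Toy enhancement step: scale intensities to [0, 1], then remove the mean."""
    image = np.asarray(image, dtype=float)
    span = image.max() - image.min()
    if span > 0:
        image = (image - image.min()) / span
    else:
        image = np.zeros_like(image)
    return image - image.mean()


def extract_features(image):
    """Placeholder descriptor: flattened pixels with unit L2 norm."""
    vec = image.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def identify(query, gallery):
    """Rank gallery labels by decreasing cosine similarity to the query.

    `gallery` is a dict mapping a shoe-model label to a reference image
    of the same size as `query` (an assumption of this sketch).
    """
    q = extract_features(preprocess(query))
    scores = {
        label: float(q @ extract_features(preprocess(img)))
        for label, img in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `identify(query, gallery)` returns all gallery labels ordered from best to worst match, which is the form of output needed to report cumulative scores over the first few matches. In a real system each stage would be replaced by one of the techniques reviewed in the next sections.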

3 Holistic techniques

The holistic or global methods seek to process the shoe-print image as a whole. In this context, Bouridane et al. [22] employed fractal decomposition in order to produce an ensemble of spatial transformations which can reproduce the same image when recursively applied to a nearly similar image. The matching is carried out using the Mean Square Noise Error (MSNE) method. De Chazal et al. [23] took as features the squared magnitude of the 2D Discrete Fourier Transform (DFT), namely the Power Spectral Density (PSD). A 2D correlation function was used as a similarity measure, and the query image is identified as the database entry with the highest correlation value. Based on the assumption of Oppenheim and Lim [24, 25], which claims that in the Fourier domain the phase information is much more important than the magnitude in describing the structure of patterns, Gueham et al. [26] introduced a Modified Phase-Only Correlation (MPOC) technique through a band-pass spectral weighting function. The query sample is then classified as the one with the highest matching score. Gueham et al. [27] evaluated two different advanced correlation filters: the Optimal Trade-off Synthetic Discriminant Function (OTSDF) and the Unconstrained OTSDF. The matching was carried out using three different metrics: peak height, peak to correlation energy and peak to sidelobe ratio. Gueham et al. [28] exploited Fourier-Mellin transform features obtained by a log-polar mapping followed by a DFT. The matching is performed based on a two-dimensional correlation function. AlGarni and Hamiane [29] extracted Hu’s moment invariant features, and four different metrics were then used for the similarity measurement: Euclidean, city block, Canberra and correlation. Jing et al. [30] enhanced the quality of the shoe marks by a pre-processing step including grayscale transformation, noise removal and principal component transformation. Then, four different types of features related to directionality were extracted, namely co-occurrence matrices, global Fourier transform, local Fourier transform and directional matrix. Finally, the sum of absolute differences between these features is used as a similarity metric. Patil and Kulkarni [31] exploited multiresolution features using the Gabor transform. In order to be invariant to rotation, the Radon transform was used to estimate the rotation of the shoe-print and compensate the direction of the extracted features. The classification of a new shoe mark image was carried out using nearest-neighbor based on the Euclidean distance. Pei et al. [32] combined odd and even Gabor features to describe the texture and geometry characteristics. Tang and Dai [33] extracted several texture features including the dot texture and shape of edge. Li et al. [34] combined the integral histogram of the Gabor features with the Euclidean distance and histogram intersection for the similarity measurement. Wei and Gwo [35] used Zernike moments as features and carried out the classification through nearest-neighbor with the Euclidean distance. Kong et al. [36] extracted Gabor and Zernike features combined with normalized correlation for matching.

Recently, with the progress in machine learning, several learning-based techniques have been proposed. Kortylewski and Vetter [37] suggested a probabilistic compositional active basis model. In the same context, Kong et al. [12] introduced a multi-channel normalized cross-correlation to match multi-channel deep features extracted by a pre-trained convolutional neural network. Wang et al. [38] proposed a manifold ranking based method using various extracted features. Zhang et al. [39] used a pre-trained VGG16 network further tuned using a data augmentation technique.
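
To make the idea of phase-based matching more concrete, here is a minimal sketch of plain phase-only correlation between two equally sized images. It is not the MPOC of Gueham et al. [26]: the band-pass spectral weighting is omitted, and the names `phase_only_correlation` and `poc_score` as well as the stabilising constant `eps` are assumptions introduced for illustration.

```python
import numpy as np


def phase_only_correlation(f, g, eps=1e-12):
    """Return the phase-only correlation surface of two same-size images."""
    F = np.fft.fft2(np.asarray(f, dtype=float))
    G = np.fft.fft2(np.asarray(g, dtype=float))
    cross = F * np.conj(G)            # cross-power spectrum
    cross /= np.abs(cross) + eps      # discard magnitude, keep phase only
    return np.real(np.fft.ifft2(cross))


def poc_score(f, g):
    """Matching score: height of the highest correlation peak."""
    return float(phase_only_correlation(f, g).max())
```

For two images that differ only by a cyclic translation, the surface has a single sharp peak whose position gives the shift, which is why the peak height can serve as a matching score; a query would be assigned to the gallery image with the largest `poc_score`.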

| Techniques | Features | Classification / Matching |
|---|---|---|
| (Bouridane et al., 2000) [22] | Fractal Decomposition | Mean Square Noise Error |
| (De Chazal et al., 2005) [23] | Power Spectral Density | 2D Correlation |
| (Gueham et al., 2007) [26] | Phase | Modified Phase-Only Correlation |
| (Gueham et al., 2008) [27] | OTSDF+UOTSDF | Peak Height, Peak to Correlation Energy, Peak to Sidelobe Ratio |
| (Gueham et al., 2008) [28] | Fourier-Mellin Transform | 2D Correlation |
| (AlGarni and Hamiane, 2008) [29] | Hu’s Moments | Euclidean, City-Block, Canberra, Correlation |
| (Jing et al., 2009) [30] | Co-occurrence, Global/Local Fourier | Sum of Absolute Difference |
| (Patil and Kulkarni, 2009) [31] | Gabor | Euclidean |
| (Pei et al., 2009) [32] | Odd and Even Gabor | Tree Similarity |
| (Tang and Dai, 2010) [33] | Texture | Defined Similarity Function |
| (Li et al., 2014) [34] | Gabor | Euclidean |
| (Wei and Gwo, 2014) [35] | Zernike Moments | Euclidean |
| (Kong et al., 2014) [36] | Gabor+Zernike | Normalized Correlation |
| (Kortylewski and Vetter, 2016) [37] | Raw Pixels | Probabilistic Model |
| (Kong et al., 2017) [12] | Deep Features | Normalized Cross-Correlation |
| (Wang et al., 2017) [38] | Hybrid Features (Region & Appearance) | Manifold Ranking |
| (Zhang et al., 2017) [39] | Deep Features | Deep Neural Network |
| (Zhang and Allinson, 2005) [40] | DFT of Edge Direction Histogram | Euclidean |
| (Pavlou and Allinson, 2006) [41] | MSER+GLOH+SIFT | Gaussian Weighted Function |
| (Ghouti et al., 2006) [42] | Directional FilterBanks | Euclidean |
| (Su et al., 2007) [43] | MHL+SIFT | Defined Similarity Function |
| (Ramakrishnan and Srihari, 2008) [44] | Cosine Similarity+Entropy+Standard Deviation | Conditional Random Fields |
| (Pavlou and Allinson, 2009) [45] | MSER+SIFT | Constraint Kernel |
| (Nibouche et al., 2009) [46] | Multi-Scale Harris+SIFT | RANSAC |
| (Dardi et al., 2009) [47, 48, 49] | PSD of Mahalanobis Distance | Correlation |
| (Tang et al., 2010) [50] | ISHT+MRHT | Footwear Print Distance |
| (Li et al., 2011) [51] | SIFT | Cross-Correlation |
| (Rathinavel and Arumugam, 2011) [52] | Discrete Cosine Transform | Euclidean |
| (Hasegawa and Tabbone, 2012) [53] | HRT | Mean Local Similarity |
| (Tang et al., 2010, 2012) [54, 55] | ARG | Footwear Print Distance |
| (Wei et al., 2013) [56] | SIFT | Cross-Correlation |
| (Wang et al., 2014) [57] | Wavelet-Fourier | 2D Correlation |
| (Kortylewski et al., 2014) [58] | Periodicity | Defined Similarity Measure |
| (Almaadeed et al., 2015) [59] | Harris+Hessian+SIFT | RANSAC |
| (Alizadeh and Kose, 2017) [60] | Raw Pixels | Sparse Representation for Classification |

TABLE I: Overview of shoe-print identification techniques (features and matching).

4 Local techniques

The local methods try to extract discriminative features from local shoe-print regions. These include keypoints or various overlapping/non-overlapping parts (we refer the reader to [61] for technical details of different keypoint detection techniques). Zhang and Allinson [40] used the DFT of the normalized histogram of edge direction as features and the Euclidean distance as a measure of similarity. Pavlou and Allinson [41] exploited Maximally Stable Extremal Regions (MSER) to detect the points of interest, followed by the Gradient Location and Orientation Histogram (GLOH) and the Scale Invariant Feature Transform (SIFT) as feature descriptors. A Gaussian weighted function was used as the similarity metric. Ghouti et al. [42] extracted the block energy-dominant features of Directional FilterBanks (DFBs). The matching was performed using the Euclidean distance. Su et al. [43] combined the Modified Harris-Laplace (MHL) detector with the enhanced SIFT descriptor. The classification was carried out through nearest-neighbor. Ramakrishnan and Srihari [44] proposed a novel technique combining three different features (cosine similarity, entropy and standard deviation) with Conditional Random Fields (CRF). Pavlou and Allinson [45] located points of interest using the MSER detector; the corresponding features are then extracted using the SIFT descriptor and further transformed into a histogram representation. The similarity is measured by a constraint kernel. Nibouche et al. [46] detected local points of interest through a multi-scale Harris detector, after which the SIFT descriptor is applied to extract the features. The matching is carried out iteratively using RANdom SAmple Consensus (RANSAC). Dardi et al. [47, 48, 49] divided the shoe-print image into blocks and then calculated the Mahalanobis distance between all possible block pairs. The PSD of the obtained distance matrix is used as the descriptor and the correlation as the similarity measure. Tang et al. [50] exploited the Iterative Straight-line Hough Transform (ISHT) and the Modified Randomized Hough Transform (MRHT). Li et al. [51] combined the SIFT detector with cross-correlation for matching. Hasegawa and Tabbone [53, 54] decomposed the shoe-print image into connected components, and the Histogram Radon Transform (HRT) is then used as the descriptor to extract the features. The similarity is measured by the mean of local similarities. Rathinavel and Arumugam [52] extracted Discrete Cosine Transform (DCT) coefficients of overlapped blocks, further combined with Principal Component Analysis (PCA) and Fisher Linear Discriminant (FLD). The classification was carried out using nearest-neighbor with the Euclidean distance. Tang et al. [55] encoded the structural features of the shoe-print as an Attributed Relational Graph (ARG) and achieved the matching using a proposed Footwear Print Distance (FPD). Wei et al. [56] combined SIFT features with cross-correlation matching. Wang et al. [57] exploited Wavelet-Fourier transform features. Kortylewski et al. [58] extracted pattern periodicity features. Almaadeed et al. [59] combined Harris and Hessian point of interest detectors with SIFT descriptors. The matching is carried out using RANSAC. Recently, Alizadeh and Kose [60] proposed an interesting method based on blocked sparse representation. Table I summarizes all the previously mentioned holistic and local shoe-print identification techniques.
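
As one small example of the descriptors listed above, the sketch below computes a DFT of a normalized edge-direction histogram and compares two descriptors with the Euclidean distance, loosely in the spirit of Zhang and Allinson [40]. Several choices are assumptions of this sketch and not details taken from [40]: finite-difference gradients, weighting each pixel by its gradient magnitude, the number of bins, and keeping only the DFT magnitude (which is unchanged by circular shifts of the histogram).

```python
import numpy as np


def edge_direction_descriptor(image, bins=36):
    """DFT magnitude of the normalized, magnitude-weighted edge-direction histogram."""
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)                  # finite-difference gradients
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)                   # direction in [-pi, pi]
    hist, _ = np.histogram(
        angle, bins=bins, range=(-np.pi, np.pi), weights=magnitude
    )
    total = hist.sum()
    if total > 0:
        hist = hist / total                      # normalize to unit sum
    return np.abs(np.fft.fft(hist))


def euclidean_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
```

A query would then be assigned to the gallery entry whose descriptor has the smallest `euclidean_distance`. Because an in-plane rotation mainly shifts the direction histogram circularly, the magnitude spectrum is a convenient way to reduce sensitivity to rotation.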

5 Evaluation

The availability of large public datasets is essential for a comparative study of performance and a consistent evaluation. The main problem noted in shoe-print identification research is the lack, or rather the absence, of public benchmarks with pre-defined and standardized evaluation protocols. Most published techniques were evaluated on unrealistic, synthetically generated images obtained by adding artificial distortions such as noise and blur [23, 27, 46]. Furthermore, the shoe model databases (i.e. training or gallery sets) were not made available. Thus a direct and fair comparison with the reported state-of-the-art techniques is unfortunately not possible. It should also be noted that [48, 50] performed their evaluation on real data, which was not made available either.

Recently, a new shoe-print database has been made publicly available for algorithm evaluation, namely the Footwear Impression Database (FID-300) [58]. It was collected in collaboration between the German State Criminal Police Offices of Niedersachsen and Bayern and the company Forensity AG. This database contains 1175 gallery and 300 probe shoe-print images. The probe images have been digitized with a scanner after being lifted from the ground with a gel foil.

Despite the fact that different datasets, partitions and protocols have been used in the evaluation of the aforementioned state-of-the-art techniques, we give a general overview of the obtained performances (summarized in Table II). The results are reported in the format X%@Y, which denotes a cumulative score of X% within the first Y matches. It can be clearly seen that widely varying performances have been obtained, ranging from 27.10% to 100%. This clearly shows the need for public datasets with standardized protocols for algorithm evaluation.
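
The X%@Y notation can be illustrated with a short sketch that computes a cumulative score from the rank at which the correct gallery entry appears for each probe. The function name `cumulative_match_score` is hypothetical, and two conventions are assumptions of this sketch: a cutoff written with a trailing `%` (as in some rows of Table II) is read as a percentage of the gallery size, and that percentage is rounded up to a whole number of matches.

```python
import math


def cumulative_match_score(ranks, gallery_size, cutoff):
    """Percentage of probes whose correct match lies within the first Y results.

    ranks        -- 1-based rank of the correct gallery entry for each probe
    gallery_size -- number of gallery (reference) shoe-prints
    cutoff       -- Y, either an int (absolute number of matches) or a
                    string such as "5%" (assumed: percent of the gallery)
    """
    if isinstance(cutoff, str) and cutoff.endswith("%"):
        top = math.ceil(gallery_size * float(cutoff[:-1]) / 100.0)
    else:
        top = int(cutoff)
    top = max(1, top)                       # always inspect at least one match
    hits = sum(1 for r in ranks if r <= top)
    return 100.0 * hits / len(ranks)
```

For instance, with a gallery of 1175 images, a cutoff of `"1%"` corresponds to the first 12 matches under the rounding assumed here, whereas a cutoff of `1` counts only probes whose correct model is ranked first.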

| Techniques | Accuracy | Database Size | Studied Distortions |
|---|---|---|---|
| (Bouridane et al., 2000) [22] | 88.00% @1 | 145 | rotation & translation |
| (De Chazal et al., 2005) [23] | 87.00% @5% | 475 | rotation & translation |
| (Zhang and Allinson, 2005) [40] | 97.70% @4% | 512 | rotation, noise, scale & translation |
| (Pavlou and Allinson, 2006) [41] | 85.00% @1 | 368 | rotation & translation |
| (Gueham et al., 2007) [26] | 100.00% @1 | 100 | partial & noise |
| (Gueham et al., 2008a) [27] | 95.68% @1 | 100 | rotation, noise & occlusion |
| (AlGarni and Hamiane, 2008) [29] | 99.40% @1 | 500 | rotation & noise |
| (Gueham et al., 2008b) [28] | 99.00% @10 | 500 | rotation, scale, noise & occlusion |
| (Pavlou and Allinson, 2009) [45] | 87.00% @1 | 374 | - |
| (Dardi et al., 2009a) [47] | 49.00% @1 | 87 | noise |
| (Nibouche et al., 2009) [46] | 90.00% @1 | 300 | rotation, noise & occlusion |
| (Patil and Kulkarni, 2009) [31] | 91.00% @1 | 1400 | rotation, noise & occlusion |
| (Pei et al., 2009) [32] | 61.70% @5 | 6000 | noise & occlusion |
| (Dardi et al., 2009c) [48] | 73.00% @10 | 87 | rotation, scale & translation |
| (Tang et al., 2010b) [50] | 71.00% @1% | 2660 | rotation, scale, translation & occlusion |
| (Tang et al., 2012) [55] | 70.00% @1% | 2660 | rotation, scale, translation & noise |
| (Wang et al., 2014) [57] | 90.87% @2% | 210 000 | rotation, translation & scale |
| (Kortylewski et al., 2014) [58] | 27.10% @1% | 1175 | translation & noise |
| (Almaadeed et al., 2015) [59] | 99.33% @1 | 300 | rotation, scale, noise & occlusion |
| (Kortylewski and Vetter, 2016) [37] | 71.00% @20% | 1175 | - |
| (Alizadeh and Kose, 2017) [60] | 99.47% @1 | 190 | noise, rotation & occlusion |

TABLE II: Performance of the state-of-the-art methods in shoe-print identification.

Automatic shoe-print identification is a very challenging computer vision task. Indeed, it suffers from variations in shape and appearance due to the tread material and the properties of the surface (Figure 3). Furthermore, probe shoe-prints are cluttered: gallery images have no background, whereas probe images have a complicated and structured background that is hardly distinguishable from the patterns of interest (Figure 4). In addition, occlusion, noise, translation and limited training data are further problems [62].

Fig. 3: Non-rigid deformation between probe image (left) and its gallery image (right) [62]. The blue circle marks the deformation.
Fig. 4: Shoe-print with structured background [62]. The blue circles mark the shoe-prints.

6 Discussion and Current Challenges

A considerable number of techniques have been introduced in order to tackle the problem of shoe-print identification using a large variety of features. These extracted features determine which information and properties are available during the identification process [63]. They should capture properties that are invariant within the same shoe class and discriminative between different classes [64]. The conventional methods for identifying lifted shoe marks are mainly based on low-level handcrafted features designed using human knowledge. Unfortunately, despite their good performance in some controlled and specific tasks, handcrafted representations are usually ad hoc, tend to overfit and lack generalization ability in various realistic scenarios. Indeed, shoe-print identification is not a trivial task due to the large intra-class variations caused by rotation, noise, occlusion, translation and scale distortions. This clearly shows the need for robust techniques capable of operating in complicated and degraded scenarios.

In contrast to handcrafted feature engineering, feature learning approaches are capable of learning robust, discriminative and data-driven representations from the raw data without making use of any prior knowledge of the task [65, 66]. Among these techniques is deep learning, which aims at an end-to-end identification system [67]. It stacks more than the usual two neural layers, where each layer encodes some specific properties that are further combined in order to learn representative and discriminative representations. Among the existing deep learning models which can potentially be applied to shoe-print identification are Convolutional Neural Networks (CNN) [68]. They seek to learn discriminative representations with invariant properties.

To date, handcrafted features remain the most widely used features for shoe mark identification, since deep models require a huge amount of data in order to be reliable. Unfortunately, the existing shoe-print identification datasets have a very limited size and mostly a single example per shoe class. A possible solution to the problem of limited training data is transfer learning. It consists in exploiting models that have already been pre-trained on a huge amount of data for another task, followed by a fine-tuning step to fit the model to the target application.

7 Conclusion

Shoe-prints represent an important clue at a crime scene, helping investigations progress and identify the criminals. A large variety of handcrafted features have been used for automatic shoe-print identification. These features have shown good performance in limited and controlled scenarios. Unfortunately, they fail when dealing with large intra-class variations caused by noise, occlusions, rotation and various scale distortions. A good alternative to these conventional features are learned ones, e.g. those obtained with deep learning, which generalize better in more complicated scenarios. To be effective, these models need to be trained on a large amount of data.

Large public datasets are essential for any comparative study of performance and for a consistent evaluation. The main problem noted in shoe-print identification research is the absence of public benchmarks with pre-defined and standardized evaluation protocols. Most published techniques were evaluated on unrealistic, synthetically generated images. This clearly shows the need to build new large datasets in order to boost shoe-print research.


The work carried out by Hugo Proença was supported by PEst-OE/EEI/LA0008/2013 research program.