I Introduction
The basic idea of automatic kinship verification from facial images is to determine whether two input face images belong to members of the same family. Automatic kinship verification has several useful applications, e.g., forensics, finding missing children, social media analysis and image annotation. Although a DNA test is the most reliable means of kinship verification, it unfortunately cannot be used in many situations, such as video surveillance.
Many authors feed their methods with different or multiple features (multiview data) to represent facial images for kinship verification. Lu et al. used the Multiview Neighborhood Repulsed Metric Learning (MNRML) [16] method to train four multiview features: Local Binary Patterns (LBP), Learning-based descriptor (LE), SIFT and Three-Patch LBP (TPLBP). Yan et al. [26] employed three different feature descriptors, namely LBP, Spatial Pyramid Learning (SPLE) and Scale-Invariant Feature Transform (SIFT), to extract different and complementary information from each face image through the DMML method. Yan et al. [27] applied the same three descriptors (LBP, SPLE and SIFT) to train the MPDFL method. Lu et al. [15] used four features, LBP, Dense SIFT (DSIFT), Histogram of Oriented Gradients (HOG) and LPQ, to train the DDMML method. Hu et al. [9] used MvDML to train four multiview features: LBP, LE, SIFT and TPLBP. Laiadi et al. [14] used three features, LPQ, BSIF and CoALBP, to train the SIEDA method. Dornaika et al. [7] used MNRML to train two features, the FC7 layers of VGG-F and VGG-Face, for kinship verification. Laiadi et al. proposed the TXQDA [13] method to train LPQ and BSIF features using ten scales.
In this work, we propose a new framework for kinship verification from facial images using eight deep features based on four pretrained deep learning networks. To this end, we extract the FC6 and FC7 layers from the VGG-F, VGG-M, VGG-S and VGG-Face models to train the proposed Multilinear Side-Information based Discriminant Analysis integrating Within-Class Covariance Normalization (MSIDA+WCCN) method. We report our preliminary experimental investigations on the KinFaceW-I and KinFaceW-II benchmarks using four relations (the Father-Son, Father-Daughter, Mother-Son and Mother-Daughter face subsets), showing very high performance compared with state-of-the-art methods.
II Proposed Framework
Figure 1 depicts an overview of the proposed framework. The input is a pair of face images, e.g., a parent and a child. From each image of the input test pair we extract eight deep features. We then compute the cosine similarity between the representations of the two facial images to obtain the final matching score. The similarity scores are fed to the ROC curve for performance evaluation.
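As an illustration, the final matching score can be computed as a plain cosine similarity between the two feature vectors. The following is a minimal sketch (all variable names are ours, not from the original implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors: 1 = identical
    direction, 0 = orthogonal, -1 = opposite direction."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A pair whose deep features point in similar directions scores high.
parent = np.array([0.9, 0.1, 0.4])
child = np.array([0.8, 0.2, 0.5])
score = cosine_similarity(parent, child)
```

The scores of all test pairs, together with their kin/non-kin labels, are what the ROC evaluation consumes.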
II-A Extracting Multiview Deep Features
Many methods suggested in the literature on automatic kinship verification have focused mainly on deep features trained on facial images (i.e., VGG-Face), thus ignoring deep features trained on object images (i.e., VGG-F, VGG-M and VGG-S). Recently, deep facial features have shown better performance than their shallow counterparts for verifying kinship relations (e.g., [30]). When considering the deep facial information, the problem usually consists in learning a discriminative metric in which classification (kinship verification in our case) becomes more tractable when the facial features are combined with the object deep features. As suggested in [7], we consider the object deep features for kinship verification. Facial and object features show a complementarity that is exploited by the MSIDA+WCCN method, in contrast to using facial or object deep features separately. Therefore, we extract the facial deep features using the VGG-Face model [18] and the object deep features using the VGG-F, VGG-M and VGG-S models [4], and learn a discriminative subspace over them with the proposed MSIDA+WCCN method. Figure 2 depicts the multiview deep feature extraction (MvVGG) and the tensor design. In this figure, the different colors of the block architectures represent the different architecture types. For the deep object features, VGG-Fast contains fewer parameters than VGG-Medium, which in turn contains fewer parameters than VGG-Slow; these three architectures were trained on the ILSVRC-2012 object recognition database. For the deep face features, the VGG-Face architecture was trained on the VGG Face database [18], which contains 2.6M facial images of 2,622 identities. Furthermore, in the tensor representation, the data stacked along a tensor mode must all have the same length; this property is satisfied by the four pretrained models, with 4096 neurons in each of the eight used fully connected layers (i.e., the FC6 and FC7 layers of the four pretrained models have the same length).
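Under the stated constraint that all eight layers output 4096-dimensional activations, the multiview tensor can be assembled as follows. This is a sketch with random stand-ins for the real FC6/FC7 activations; shapes and names are our assumptions:

```python
import numpy as np

# Stand-ins for the FC6/FC7 activations of VGG-F, VGG-M, VGG-S and
# VGG-Face: eight views, each giving one 4096-d vector per face image.
n_images, n_views, dim = 10, 8, 4096
rng = np.random.default_rng(0)
views = [rng.standard_normal((n_images, dim)) for _ in range(n_views)]

# Stack the views along a second tensor mode. Stacking is only possible
# because every fully connected layer has the same 4096-neuron width.
tensor = np.stack([v.T for v in views], axis=1)  # (dim, n_views, n_images)
```

A multilinear method such as MSIDA can then learn one projection per mode of this 3rd-order tensor.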
III Multilinear Side-Information based Discriminant Analysis integrating Within-Class Covariance Normalization (MSIDA+WCCN)
III-A Side-Information based Linear Discriminant Analysis (SILD)
The positive-class image pairs are directly used to compute the within-class scatter matrix, and the negative-class image pairs are used to compute the between-class scatter matrix. Let $\{(x_i, y_i)\}_{i=1}^{N_1}$ denote the collection of positive-class image pairs (class label 1) and $\{(s_i, t_i)\}_{i=1}^{N_0}$ the collection of negative-class image pairs (class label 0). The within-class and between-class scatter matrices of the Side-Information based Linear Discriminant Analysis (SILD) [17] method can then be written as:

$$S_w = \sum_{i=1}^{N_1} (x_i - y_i)(x_i - y_i)^T \quad (3)$$

$$S_b = \sum_{i=1}^{N_0} (s_i - t_i)(s_i - t_i)^T \quad (4)$$

The target function for SILD is:

$$W^* = \arg\max_W \frac{\mathrm{tr}(W^T S_b W)}{\mathrm{tr}(W^T S_w W)} \quad (5)$$

To solve (5), $S_w$ is first diagonalized:

$$S_w = H \Lambda H^T \quad (6)$$

$$\tilde{S}_b = \Lambda^{-1/2} H^T S_b H \Lambda^{-1/2} \quad (7)$$

Secondly, $\tilde{S}_b$ is also diagonalized:

$$\tilde{S}_b = Z E Z^T \quad (8)$$

Finally, the projection matrix can be computed as:

$$W = H \Lambda^{-1/2} Z \quad (9)$$

where $H$ and $Z$ are orthogonal matrices and $\Lambda$ and $E$ are diagonal matrices.
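The closed-form diagonalization in Eqs. (3)-(9) can be sketched in a few lines. This is our own minimal implementation, with a small eigenvalue clip added for numerical safety; the data layout is an assumption:

```python
import numpy as np

def sild(pos_pairs, neg_pairs, n_dims):
    """Side-Information based LDA (sketch).

    pos_pairs, neg_pairs: arrays of shape (n, 2, d) holding (x_i, y_i)
    positive pairs and (s_i, t_i) negative pairs, respectively."""
    dw = pos_pairs[:, 0] - pos_pairs[:, 1]   # positive-pair differences
    db = neg_pairs[:, 0] - neg_pairs[:, 1]   # negative-pair differences
    Sw = dw.T @ dw                           # within-class scatter, Eq. (3)
    Sb = db.T @ db                           # between-class scatter, Eq. (4)
    # Eq. (6): diagonalize Sw; Eq. (7): whiten Sb with Lambda^{-1/2} H^T.
    lam, H = np.linalg.eigh(Sw)
    lam = np.clip(lam, 1e-10, None)          # guard tiny eigenvalues
    Wh = H / np.sqrt(lam)                    # H Lambda^{-1/2}
    # Eq. (8): diagonalize the whitened between-class scatter.
    E, Z = np.linalg.eigh(Wh.T @ Sb @ Wh)
    order = np.argsort(E)[::-1]              # descending eigenvalues
    # Eq. (9): W = H Lambda^{-1/2} Z, keeping the leading eigenvectors.
    return Wh @ Z[:, order[:n_dims]]
```

By construction the returned projection whitens the within-class scatter, so ranking by the eigenvalues of the whitened $S_b$ directly maximizes the trace ratio in Eq. (5).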
A solution to the optimization problem in (5) is obtained by solving the corresponding generalized eigenvalue problem. The projection matrix of SILD is formed by the leading eigenvectors in (9), ordered by decreasing eigenvalue.

III-B Proposed Multilinear Side-Information based Discriminant Analysis integrating Within-Class Covariance Normalization
Let a tensor training set be given, where $\check{\mathcal{X}}$ contains the Parent samples and $\hat{\mathcal{X}}$ contains the Child samples. The goal of MSIDA [2] is to compute $K$ projection matrices $W^{(1)}, \ldots, W^{(K)}$, one projection matrix for each tensor mode. The objective function of the MSIDA method is defined as follows:

$$W^{(k)} = \arg\max_{W} \frac{\mathrm{tr}(W^T S_b^{(k)} W)}{\mathrm{tr}(W^T S_w^{(k)} W)}, \quad k = 1, \ldots, K \quad (10)$$
We calculate the two scatter matrices $S_w^{(k)}$ and $S_b^{(k)}$ for each mode $k$ by:

$$S_w^{(k)} = \sum_{p=1}^{\prod_{o \neq k} I_o} S_w^{(k),p}, \quad S_w^{(k),p} = \sum_{i=1}^{C_1} \left( (\check{\xi}_i^1)^{k,p} - (\hat{\xi}_i^1)^{k,p} \right) \left( (\check{\xi}_i^1)^{k,p} - (\hat{\xi}_i^1)^{k,p} \right)^T$$

$$S_b^{(k)} = \sum_{p=1}^{\prod_{o \neq k} I_o} S_b^{(k),p}, \quad S_b^{(k),p} = \sum_{i=1}^{C_0} \left( (\check{\xi}_i^0)^{k,p} - (\hat{\xi}_i^0)^{k,p} \right) \left( (\check{\xi}_i^0)^{k,p} - (\hat{\xi}_i^0)^{k,p} \right)^T$$

where $(\check{\xi}_i^1)^{k,p}$ and $(\hat{\xi}_i^1)^{k,p}$ denote the $p$-th mode-$k$ fibers of the $i$-th positive (label 1) Parent and Child samples, respectively, and analogously for the negative pairs with superscript 0.
Now that the solution for one mode is known, the optimization problem in Equation (10) can be solved iteratively. The projection matrices are first initialized to the identity. At each iteration, the matrices $W^{(1)}, \ldots, W^{(k-1)}, W^{(k+1)}, \ldots, W^{(K)}$ are assumed known and $W^{(k)}$ is estimated: the samples are projected along all modes except mode $k$, and the resulting projected scatter matrices replace $S_w^{(k)}$ and $S_b^{(k)}$ in Equation (10). The new equation is solved as a generalized eigenvalue decomposition problem:

$$S_b^{(k)} W^{(k)} = S_w^{(k)} W^{(k)} \Lambda^{(k)} \quad (11)$$

where $W^{(k)}$ is the eigenvector matrix and $\Lambda^{(k)}$ is the eigenvalue matrix.
The iterative process of MSIDA stops when one of the following conditions is met: i) the number of iterations reaches a predefined maximum; or ii) the difference between the estimated projections of two consecutive iterations is less than a predefined threshold. As depicted in Fig. 1, the block diagram of the proposed approach consists of three essential components: feature extraction, tensor subspace transformation and comparison. In this work, we focus on the subspace transformation and on feature extraction based on multiview deep features.
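The alternating optimization described above can be sketched for the two non-sample modes of a 3rd-order tensor as follows. This is our own simplified illustration on matrix-shaped samples; real MSIDA operates on the full mode-$k$ unfoldings, and all names and shapes here are assumptions:

```python
import numpy as np

def trace_ratio_basis(Sw, Sb, r):
    """Leading basis of the trace-ratio problem via whitening (as in SILD)."""
    lam, H = np.linalg.eigh(Sw + 1e-8 * np.eye(len(Sw)))
    Wh = H / np.sqrt(np.clip(lam, 1e-10, None))
    E, Z = np.linalg.eigh(Wh.T @ Sb @ Wh)
    return Wh @ Z[:, np.argsort(E)[::-1][:r]]

def msida_sketch(pos, neg, r1, r2, n_iter=5):
    """Alternating mode-wise estimation for matrix-shaped samples.

    pos, neg: lists of (parent, child) matrices of shape (I1, I2);
    pos holds kin pairs (label 1), neg holds non-kin pairs (label 0)."""
    I1, I2 = pos[0][0].shape
    W1, W2 = np.eye(I1, r1), np.eye(I2, r2)  # initialize to identity
    for _ in range(n_iter):
        for k in (0, 1):                     # estimate one mode at a time
            Wo = W2 if k == 0 else W1        # the other mode is held fixed
            def diff(pair):
                d = pair[0] - pair[1]        # parent minus child
                return d @ Wo if k == 0 else d.T @ Wo
            Sw = sum(diff(p) @ diff(p).T for p in pos)  # within, kin pairs
            Sb = sum(diff(q) @ diff(q).T for q in neg)  # between, non-kin
            if k == 0:
                W1 = trace_ratio_basis(Sw, Sb, r1)
            else:
                W2 = trace_ratio_basis(Sw, Sb, r2)
    return W1, W2
```

Each inner step is exactly one SILD-style trace-ratio problem in the current mode, which is why the iteration converges quickly in practice.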
III-C Within-Class Covariance Normalization
Within-Class Covariance Normalization (WCCN) was first used in the speaker recognition community, where Dehak et al. [5] found it to be the best technique for projecting the reduced vectors of the LDA method into a new subspace determined by the square root of the inverse of the within-class covariance matrix. We propose a new variant of MSIDA by integrating WCCN:
$$G^{(k)} = \sum_{p=1}^{\prod_{o \neq k} I_o} G^{(k),p}, \quad G^{(k),p} = \sum_{i=1}^{C_1} \left( (W^{(k)})^T (\check{\xi}_i^1)^{k,p} - (W^{(k)})^T (\hat{\xi}_i^1)^{k,p} \right) \left( (W^{(k)})^T (\check{\xi}_i^1)^{k,p} - (W^{(k)})^T (\hat{\xi}_i^1)^{k,p} \right)^T$$
where $W^{(k)}$ is the MSIDA projection matrix found in Eq. (11). The WCCN projection matrix $B^{(k)}$ is obtained by the Cholesky decomposition [11, 28] of the inverse of $G^{(k)}$: $(G^{(k)})^{-1} = B^{(k)} (B^{(k)})^T$. The new projection matrix is then obtained as $\tilde{W}^{(k)} = W^{(k)} B^{(k)}$. By imposing upper bounds on the classification error metric [1], WCCN reduces the effect of within-class variations by minimizing the expected classification error on the training set.
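The Cholesky step can be illustrated as follows. This is a minimal sketch: `G` is a within-class covariance in the projected space, the stand-in data is random, and all names are ours:

```python
import numpy as np

def wccn_matrix(G):
    """WCCN projection B, with G^{-1} = B B^T (Cholesky of the inverse)."""
    G = G + 1e-8 * np.eye(len(G))       # regularize before inverting
    return np.linalg.cholesky(np.linalg.inv(G))

# Composing WCCN with a previously learned projection W (stand-in data):
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))       # within-class difference vectors
W = rng.standard_normal((6, 4))         # metric-learning projection
G = (X @ W).T @ (X @ W) / len(X)        # within-class covariance after W
B = wccn_matrix(G)
W_new = W @ B                           # final projection: W_tilde = W B
# By construction B^T G B is (close to) the identity: the within-class
# directions are normalized to unit variance.
```

This normalization is what lets the subsequent cosine scoring ignore nuisance within-class directions.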
IV Experimental Analysis
For the experimental evaluation, we considered the KinFaceW-I and KinFaceW-II databases, which were gathered from the Internet and include some public figures together with their parents and/or children. In the KinFaceW-I dataset, there are 156, 134, 116, and 127 pairs for the FS, FD, MS, and MD relations, respectively. In the KinFaceW-II dataset, each kin relation contains 250 pairs. In total, KinFaceW-I contains 1,066 face images and KinFaceW-II contains 2,000 face images.
TABLE I: Verification accuracy (%) on the KinFaceW-I and KinFaceW-II databases.
Method  KinFaceW-I  KinFaceW-II
  FS  FD  MS  MD  Mean  FS  FD  MS  MD  Mean
MNRML [16]  72.50  66.50  66.20  72.00  69.90  76.90  74.30  77.40  77.60  76.50 
DMML [26]  74.50  69.50  69.50  75.50  72.25  78.50  76.50  78.50  79.50  78.25 
MPDFL [27]  73.50  67.50  66.10  73.10  70.10  77.30  74.70  77.80  78.00  77.00 
MMTL [19]  N.A  N.A  N.A  N.A  73.70  N.A  N.A  N.A  N.A  77.20 
DDMML [15]  86.40  79.10  81.40  87.00  83.50  87.40  83.80  83.20  83.00  84.30 
NRCML [25]  66.10  61.10  66.90  73.00  66.30  79.80  76.10  79.80  80.00  78.70 
MKSM [29]  83.65  81.35  79.69  81.16  81.46  83.80  81.20  82.40  82.40  82.45 
MvDML [9]  /  /  /  /  /  80.40  79.80  78.80  81.80  80.20 
Deep+Shallow [3]  68.80  68.80  70.50  65.50  68.40  66.50  68.80  65.40  65.40  66.50 
[10]  /  /  /  /  /  82.40  78.20  78.80  80.40  80.00 
ResNet + CF [21]  78.00  83.70  87.00  80.80  82.40  87.70  86.00  86.70  87.40  86.60 
RDML [6]  76.20  74.20  76.90  82.20  77.30  79.30  72.30  77.40  78.30  76.80 
MNRML+SVM [7]  85.90  79.85  86.20  86.62  84.55  87.20  82.60  88.40  89.40  86.90 
SILD+WCCN/LR [12]  /  /  /  /  /  88.40  84.20  85.80  86.40  86.20 
KML [30]  N.A  N.A  N.A  N.A  82.80  N.A  N.A  N.A  N.A  85.70 
MvGMML [8]  69.25  73.12  69.40  72.76  71.13  70.40  73.40  65.80  69.20  69.70 
SSC  71.57  70.83  77.12  79.88  74.85  72.80  69.00  73.80  73.80  72.35 
SILD  73.75  71.25  76.25  77.49  74.69  72.80  69.20  74.00  74.00  72.50 
MSIDA  73.00  72.96  78.41  77.91  75.57  75.00  69.40  75.80  74.40  73.95 
SILD+WCCN  75.72  72.39  79.80  80.74  77.16  77.40  75.60  75.80  78.40  76.80 
MSIDA+WCCN  85.98  85.93  90.05  88.62  87.65  89.40  82.80  87.80  88.00  87.00 
IV-A Experimental Setup
The number of positive and negative pairs used in the experiments is the same for each relation in the four subsets. We use a five-fold cross-validation strategy for the evaluation and report the mean accuracy over the five folds. The negative pairs and folds are predefined for all four relations. For the facial and object deep features, we extracted VGG-Face, VGG-F, VGG-M and VGG-S features, as this has been shown to perform better than shallow methods [30, 7]. The tensor features are processed by the proposed MSIDA+WCCN method.
IV-B Results and Analysis
IV-B1 Results on the RFIW'20 Challenge
For the RFIW'20 Challenge [20, 24, 22, 21], we used the eight fully connected layers of the four pretrained models (facial and object models). To this end, we used the Simple Scoring Cosine similarity (SSC) method: the eight deep features are concatenated into one feature vector for each facial image of a pair, and the cosine similarity is then computed between the two vectors. This method shows that the raw weights of deep features can perform well for kinship verification, acting as excellent facial image features without applying any learning method.
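For evaluation, the similarity scores of all test pairs can be turned into an ROC curve by sweeping a threshold over the sorted scores. The following is our own minimal sketch, not the exact evaluation code:

```python
import numpy as np

def roc_points(scores, labels):
    """ROC curve from similarity scores.

    scores: similarity of each pair; labels: 1 for kin, 0 for non-kin.
    Sorting by decreasing score and accumulating hits / false alarms
    yields one (FPR, TPR) point per threshold."""
    order = np.argsort(scores)[::-1]
    y = np.asarray(labels)[order]
    tpr = np.cumsum(y) / max(y.sum(), 1)            # true positive rate
    fpr = np.cumsum(1 - y) / max((1 - y).sum(), 1)  # false positive rate
    return fpr, tpr

# Toy usage: two well-separated kin pairs, two non-kin pairs.
scores = np.array([0.91, 0.87, 0.22, 0.15])
labels = np.array([1, 1, 0, 0])
fpr, tpr = roc_points(scores, labels)
```

With perfectly separated scores, the TPR reaches 1 while the FPR is still 0, i.e., the curve passes through the top-left corner.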
IV-B2 Results on the KinFaceW databases
We ran the experiments on the four relations of the two databases, KinFaceW-I and KinFaceW-II, using the SSC, SILD, MSIDA, SILD+WCCN and MSIDA+WCCN methods. The results of these experiments are reported in Table I. The ROC curves comparing SSC, SILD, MSIDA, SILD+WCCN and MSIDA+WCCN are provided in Figures 7 and 12 for the four relations of the KinFaceW-I and KinFaceW-II databases, respectively. As can be seen from the figures, the performance of MSIDA+WCCN is much better than that of the other methods in all cases.
Our proposed method is compared against recent state-of-the-art methods in Table I. Note that some of these methods, such as MvDML, DDMML, ResNet+CF and MNRML+SVM, use a combination of different features to describe a face image, while others are based on deep learning. On the four relations of the KinFaceW-I and KinFaceW-II databases, our approach yields the best mean results over the four kinship subsets of both databases. These results are promising and demonstrate that the proposed approach performs better than recent methods for kinship verification. Furthermore, MSIDA+WCCN and SILD+WCCN improve the performance of their counterparts (i.e., MSIDA and SILD) by a large margin. For linear (vector-based) methods, SILD+WCCN improves over SILD by about 2.47% and 4.30% on the KinFaceW-I and KinFaceW-II databases, respectively. For multilinear (tensor-based) methods, MSIDA+WCCN improves over MSIDA by about 12.08% and 13.05% on KinFaceW-I and KinFaceW-II, respectively. Thus, the integration of WCCN yields stable and robust performance gains for metric learning methods in kinship verification.
V Conclusion
In this paper, we presented an effective approach based on multiview deep (facial and object) features for the problem of kinship verification. To achieve a low-dimensional and discriminative subspace, we proposed the MSIDA+WCCN method. We also studied the effect of WCCN on different metric learning methods, showing that the within-class variability introduced by the training data (multiview deep features in our case) can be reduced to a large extent. The performance was thus improved, and the metric learning methods learn better metrics through the integration of WCCN. The results obtained by the MSIDA+WCCN method outperform the state of the art on the four Parent-Child relations of the two databases, KinFaceW-I and KinFaceW-II.
References

[1] O. Barkan, J. Weill, L. Wolf, and H. Aronowitz. Fast high dimensional vector multiplication face recognition. In 2013 IEEE International Conference on Computer Vision, pages 1960–1967, Dec 2013.
 [2] M. Bessaoudi, A. Ouamane, M. Belahcene, A. Chouchane, E. Boutellaa, and S. Bourennane. Multilinear side-information based discriminant analysis for face and kinship verification in the wild. Neurocomputing, 329:267–278, 2019.
 [3] M. Bordallo Lopez, A. Hadid, E. Boutellaa, J. Goncalves, V. Kostakos, and S. Hosio. Kinship verification from facial images and videos: human versus machine. Machine Vision and Applications, 29(5):873–890, Jul 2018.
 [4] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. CoRR, abs/1405.3531, 2014.
 [5] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet. Front-end factor analysis for speaker verification. IEEE Transactions on Audio, Speech, and Language Processing, 19(4):788–798, May 2011.
 [6] Z. Ding, M. Shao, W. Hwang, S. Suh, J. Han, C. Choi, and Y. Fu. Robust discriminative metric learning for image representation. IEEE Transactions on Circuits and Systems for Video Technology, 29(11):3173–3183, Nov 2019.
 [7] F. Dornaika, I. ArgandaCarreras, and O. Serradilla. Transfer learning and feature fusion for kinship verification. Neural Computing and Applications, Apr 2019.

[8] J. Hu, J. Lu, L. Liu, and J. Zhou. Multi-view geometric mean metric learning for kinship verification. In 2019 IEEE International Conference on Image Processing (ICIP), pages 1178–1182, Sep. 2019.
 [9] J. Hu, J. Lu, and Y. Tan. Sharable and individual multi-view metric learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(9):2281–2288, Sep. 2018.
 [10] J. Hu, J. Lu, Y. Tan, J. Yuan, and J. Zhou. Local large-margin multi-metric learning for face and kinship verification. IEEE Transactions on Circuits and Systems for Video Technology, 28(8):1875–1891, Aug 2018.
 [11] R. L. Iman and W. J. Conover. A distributionfree approach to inducing rank correlation among input variables. Communications in Statistics  Simulation and Computation, 11(3):311–334, 1982.
 [12] O. Laiadi, A. Ouamane, A. Benakcha, A. Taleb-Ahmed, and A. Hadid. Learning multiview deep and shallow features through new discriminative subspace for bi-subject and tri-subject kinship verification. Applied Intelligence, 49(11):3894–3908, Nov 2019.
 [13] O. Laiadi, A. Ouamane, A. Benakcha, A. Taleb-Ahmed, and A. Hadid. Tensor cross-view quadratic discriminant analysis for kinship verification in the wild. Neurocomputing, 2019.
 [14] O. Laiadi, A. Ouamane, E. Boutellaa, A. Benakcha, A. TalebAhmed, and A. Hadid. Kinship verification from face images in discriminative subspaces of color components. Multimedia Tools and Applications, 78(12):16465–16487, Jun 2019.
 [15] J. Lu, J. Hu, and Y. P. Tan. Discriminative deep metric learning for face and kinship verification. IEEE Transactions on Image Processing, 26(9):4269–4282, Sept 2017.
 [16] J. Lu, X. Zhou, Y.P. Tan, Y. Shang, and J. Zhou. Neighborhood repulsed metric learning for kinship verification. IEEE Trans. Pattern Anal. Mach. Intell., 36(2):331–345, Feb. 2014.
 [17] M. Kan, S. Shan, D. Xu, and X. Chen. Side-information based linear discriminant analysis for face recognition. In Proc. BMVC, pages 125.1–125.10, 2011. http://dx.doi.org/10.5244/C.25.125.
 [18] O. M. Parkhi, A. Vedaldi, and A. Zisserman. Deep face recognition. In British Machine Vision Conference, 2015.
 [19] X. Qin, X. Tan, and S. Chen. Mixed bi-subject kinship verification via multi-view multi-task learning. Neurocomputing, 214:350–357, 2016.
 [20] J. P. Robinson, M. Shao, Y. Wu, and Y. Fu. Families in the wild (fiw): Largescale kinship image database and benchmarks. In Proceedings of the 2016 ACM on Multimedia Conference, pages 242–246. ACM, 2016.
 [21] J. P. Robinson, M. Shao, Y. Wu, H. Liu, T. Gillis, and Y. Fu. Visual kinship recognition of families in the wild. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(11):2624–2637, Nov 2018.
 [22] J. P. Robinson, M. Shao, H. Zhao, Y. Wu, T. Gillis, and Y. Fu. Recognizing families in the wild (RFIW): Data challenge workshop in conjunction with ACM MM 2017. In RFIW '17: Proceedings of the 2017 Workshop on Recognizing Families In the Wild, pages 5–12, New York, NY, USA, 2017. ACM.

[23] D. L. Swets and J. J. Weng. Using discriminant eigenfeatures for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):831–836, Aug 1996.
 [24] S. Wang, J. P. Robinson, and Y. Fu. Kinship verification on families in the wild with marginalized denoising metric learning. In Automatic Face and Gesture Recognition (FG), 2017 12th IEEE International Conference and Workshops on.

[25] H. Yan. Kinship verification using neighborhood repulsed correlation metric learning. Image and Vision Computing, 60:91–97, 2017. Regularization Techniques for High-Dimensional Data Analysis.
 [26] H. Yan, J. Lu, W. Deng, and X. Zhou. Discriminative multimetric learning for kinship verification. IEEE Transactions on Information Forensics and Security, 9(7):1169–1178, July 2014.
 [27] H. Yan, J. Lu, and X. Zhou. Prototype-based discriminative feature learning for kinship verification. IEEE Transactions on Cybernetics, 45(11):2535–2545, Nov 2015.
 [28] H. Yu, C. Y. Chung, K. P. Wong, H. W. Lee, and J. H. Zhang. Probabilistic load flow evaluation with hybrid latin hypercube sampling and cholesky decomposition. IEEE Transactions on Power Systems, 24(2):661–667, May 2009.
 [29] Y.G. Zhao, Z. Song, F. Zheng, and L. Shao. Learning a multiple kernel similarity metric for kinship verification. Information Sciences, 430431:247 – 260, 2018.
 [30] X. Zhou, K. Jin, M. Xu, and G. Guo. Learning deep compact similarity metric for kinship verification from face images. Information Fusion, 48:84 – 94, 2019.