1 Introduction
Human age estimation remains an active research topic and plays an important role in many potential applications, such as video surveillance, social networking, and human-computer interaction. Existing human age estimation methods are mainly based on facial images [2, 3, 22, 23, 31], which are very informative and from which age is easy to estimate. The performance of face-based age estimation approaches, however, degrades when the face is occluded, for example by sunglasses or makeup. Moreover, face-based age estimation becomes challenging if a person is far away from the camera, which often happens in video surveillance systems located at crossroads, airports and railway stations. As a unique biometric feature that can be perceived efficiently at a distance, gait offers an alternative way to predict age when human faces are less informative or not available. Remarkably, gait-based estimation has a psychological foundation, and gait cannot be easily faked [7]. For example, an old person might hobble along, whereas a young person might walk briskly.
In the field of gait-based age estimation, the gait energy image (GEI) [15, 21], which compresses one or more gait sequences into a single image, is one of the most widely used gait templates owing to its simplicity and effectiveness. Some studies applied age manifold learning techniques to GEIs to learn a low-dimensional representation capturing the intrinsic data distribution and geometric structure [16, 17]. Existing gait-based age estimation approaches can be roughly grouped into two categories: classification-based [15] and regression-based methods [17, 21]. However, neither of them considers the ordinal relationship between age labels, which is an important clue for age estimation. Ranking-based methods for face-based age estimation [3, 5, 13, 22] were proposed to solve this problem by utilizing the ordinal information between age labels. These methods usually decompose the ordinal regression into a series of binary classifications and use the cross-entropy loss to optimize them. However, the cross-entropy loss treats these classifications independently, ignoring the inner relationship among them. For these ordered binary classifications, the expected inner relationship is that the predictive probability of the $k$-th classifier should not be greater than the probability of the $(k-1)$-th classifier, as explained in Fig. 2.
In this paper, we propose an ordinal distribution regression with a global and local convolutional neural network, named ODRGLCNN, for gait-based age estimation. Similar to the ranking-based methods for face-based age estimation, we regard gait-based age estimation as an ordinal regression, and decompose the ordinal regression problem into a series of binary classification subproblems. The major issue with the existing ranking-based methods is that they solve these binary subproblems independently, neglecting to some degree the inner relationship among them and making poor use of the correlation between these binary subtasks. To address this shortcoming, we propose an ordinal distribution loss to penalize the distribution difference between the estimated and ground-truth ages. Besides, we propose a novel network, consisting of one global and three local subnetworks, to capture the global structure and the local structures of the head, body and feet of a gait. Experimental results on the OULP-Age dataset [40] and the MORPH Album II dataset [26] demonstrate that the proposed approach outperforms the state-of-the-art methods on both gait-based and face-based age estimation.
The contributions of this paper are: 1) A deep ordinal distribution regression for gait-based age estimation is proposed, achieving state-of-the-art predictive performance on the OULP-Age dataset; 2) An ordinal distribution loss is proposed to take the inner relationship among a series of binary subproblems into account; 3) A novel network consisting of a global network and three local subnetworks is proposed, learning more representative features from the gait both globally and locally.
2 Related work
In this section, we give a brief survey of face-based and gait-based age estimation as well as ordinal regression.
Face-based age estimation: Existing approaches for face-based age estimation fall into three categories: classification-based, regression-based, and ranking-based methods. Classification-based methods were often used to roughly estimate the age group of the subject in a face image [12, 41]. Different ages or age groups were treated as independent classes. These methods, however, hardly consider the cost difference between misclassifications into different age groups. Regression-based methods provided a more accurate age assessment for a facial image of a subject [6, 39]. Typically, regression-based methods employed a Euclidean loss ($\ell_2$ loss) to penalize the difference between the estimated and ground-truth ages. Recently, ranking-based or ordinal methods were proposed for facial age estimation [1, 2, 3, 22]. Regarding the age as an ordered label, these approaches used multiple binary classifiers to determine the rank of a specific age. Unlike the $\ell_2$ loss, which ignores the ordinal information, ranking-based methods are able to explicitly model the ordinal relationship among face images sampled from different ages.
Gait-based age estimation: The earliest work on gait-based age estimation dates back to [21], where Gaussian process regression (GPR) [25] was introduced to predict age from human gait. The GPR was then refined with an active set method [37] to reduce the computational time for online age estimation [20]. Lu and Tan proposed a multi-label guided subspace (MLG) to better characterize the feature space by correlating the age and gender information of subjects [15]. They further proposed an ordinary preserving manifold learning approach to seek a low-dimensional discriminative subspace for age estimation [17]. Considering the age variations within different age groups such as children, adults, and the elderly, Li et al. proposed an age group-dependent manifold method [14]. After an age group classifier has been trained, a kernel SVM regression is added for accurate assessment within each age group. This method achieves the state-of-the-art performance in gait-based age estimation so far.
Ordinal regression: Most ordinal regression algorithms can be regarded as refined versions of classification algorithms with ordinal constraints [4, 9, 30]. For example, Herbrich et al. utilized the support vector machine for ordinal regression [9], and Shashua and Levin then refined the SVM to handle multiple thresholds [30]. Crammer and Singer proposed the perceptron ranking algorithm, which generalizes the online perceptron algorithm with multiple thresholds for ordinal regression [4]. Another way to directly utilize classification algorithms is to transform the ordinal regression into a series of simpler binary classifications [5, 13]. Specifically, Frank and Hall utilized decision trees as binary classifiers for ordinal regression [5]. Li and Lin learned the ordinal regression with a set of classifiers, followed by an SVM for final classification [13]. Recently, Niu et al. introduced a CNN with multiple binary outputs to solve the ordinal regression for age estimation [22]. Ordinal regression was also used in [3] by learning multiple binary CNNs and aggregating their final outputs. However, these ordinal regression methods solved each binary subproblem independently and made little use of the underlying relationship among these binary subproblems. In this paper, we thus propose a distribution loss that exploits this relationship to improve age estimation.
3 The proposed method
In this section, we present in more detail the ordinal regression for gait-based age estimation, our novel network consisting of one global and three local subnetworks, and a novel distribution loss.
3.1 Ordinal regression
We treat gait-based age estimation as an ordinal regression so that the ordinal relationship of age labels can be utilized. Let $x_i \in \mathcal{X}$ denote the $i$-th input GEI sample, and let the corresponding age be $y_i \in \mathcal{Y} = \{r_1, r_2, \dots, r_K\}$ with ordered ranks $r_K \succ r_{K-1} \succ \dots \succ r_1$. The symbol $\succ$ denotes the order among different ranks. Given a training set $D = \{(x_i, y_i)\}_{i=1}^{N}$, the ordinal regression is to learn a mapping from images to ranks, i.e., $h(\cdot): \mathcal{X} \to \mathcal{Y}$.
Inspired by two ranking-based methods [3, 22], we decompose the ordinal regression into a series of binary classifications. Specifically, the ordinal regression with $K$ ranks is decomposed into $K-1$ binary classifiers $\{f_k\}_{k=1}^{K-1}$. For each rank $r_k$, a binary classifier $f_k$ is constructed to predict whether the rank of a sample is greater than $r_k$. The final rank of an unknown test sample is determined by aggregating the results of all the binary classifiers.
More concretely, to train the $k$-th binary classifier $f_k$, the given dataset $D$ is divided into two subsets, one positive class and one negative class, determined by whether the age $y_i$ is greater than $r_k$, i.e.,
$$D_k^{+} = \{(x_i, 1) \mid y_i > r_k\}, \qquad D_k^{-} = \{(x_i, 0) \mid y_i \le r_k\}. \tag{1}$$
Once all the binary classifiers have been trained on their respective training sets, the age of a test sample $x_i$ is predicted as follows:
$$\hat{y}_i = r_1 + \eta \sum_{k=1}^{K-1} \left[ o_i^k > 0.5 \right], \tag{2}$$
where $o_i^k$ is the output probability of the $k$-th classifier for the sample $x_i$ (i.e., the $k$-th output of GLCNN), $\eta$ is the partitioning interval, and $[\cdot]$ denotes the truth-test operator, which is 1 if the inner condition holds, and 0 otherwise.
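The decomposition and the prediction rule above can be illustrated with a minimal sketch. It assumes one rank per year of age and a decision threshold of 0.5; the function names `encode_ordinal` and `decode_age` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence


def encode_ordinal(age: float, ranks: Sequence[float]) -> List[int]:
    """Binary targets for the K-1 classifiers: the k-th target is 1 if age > r_k."""
    # The last rank r_K needs no classifier, since no label can exceed it.
    return [1 if age > r_k else 0 for r_k in ranks[:-1]]


def decode_age(probs: Sequence[float], r_1: float, eta: float = 1.0,
               threshold: float = 0.5) -> float:
    """Predicted age: r_1 plus eta times the number of classifiers that fire."""
    return r_1 + eta * sum(1 for p in probs if p > threshold)


ranks = list(range(2, 91))            # assumed ranks: ages 2..90, one per year
targets = encode_ordinal(30, ranks)   # binary targets for a 30-year-old subject
recovered = decode_age(targets, r_1=ranks[0])
```

Feeding the ideal binary targets back into `decode_age` recovers the original age, which is a quick sanity check that the encoding and decoding rules are consistent with each other.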
3.2 The global and local convolutional neural networks (GLCNN)
Fig. 3 presents an overview of the proposed deep neural network for gait-based age estimation, consisting of one global and three local convolutional neural networks, followed by three fully connected layers with $K-1$ outputs. Next, we describe the network in detail.
The grayscale GEI images of size are fed into the global network as the input. Considering that different parts of the gait exhibit different local behaviors, we crop the GEI template into three parts: head, body and feet. In the OULP-Age dataset [40], the gait images of various people are detected, cropped, aligned and resized into a uniform silhouette template with the same height. In our work, the three parts are cropped using three non-overlapping boxes, each of which is fixed to size of , , and , respectively. Three local networks are then designed to learn finer details from these three parts separately. More specifically, there are three convolutional layers in both the global and local subnetworks. At the first convolutional layer, 32 filters of size with a stride of 1 pixel are applied to the input images, followed by a Leaky Rectified Linear Unit (LeakyReLU) [18]. Then a max pooling operation with filters of size applied with a stride of 2 is used to emphasize the most strongly responding points in the feature maps. Similar operations are conducted at the second and third convolutional layers with different filter sizes (refer to Fig. 3 for details). It should be noted that we concatenate the three local feature maps from the second convolutional layers along the height dimension to form new local feature maps in the local network, which are further concatenated with the feature maps from the global network along the channel dimension.
After that, there are three fully connected layers as shown in Fig. 3. Among them, F4 is the first fully connected layer, in which the feature maps are flattened into a feature vector. There are 1024 neurons in F4, followed by LeakyReLU and a dropout layer [35]. F5 is the second fully connected layer with 1024 neurons; it receives the output from F4 and is followed by LeakyReLU and another dropout layer. F6 is the third fully connected layer with $K-1$ neurons; it receives the outputs from F5 and is followed by LeakyReLU and a dropout layer. Through a sigmoid layer, the outputs correspond to the predictive probabilities from the $K-1$ binary classifiers. The parameters of the network are optimized by minimizing a loss function.
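The cropping into head, body and feet and the concatenation along the height dimension can be sketched on a toy image. This is only an illustration on raw pixels, not the network itself: the band heights below are hypothetical, and the helper names `crop_parts` and `concat_height` are assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[float]]  # rows of pixel intensities


def crop_parts(gei: Image, head_rows: int, body_rows: int) -> Tuple[Image, Image, Image]:
    """Split a GEI into non-overlapping head, body and feet bands along the height."""
    if not 0 < head_rows < head_rows + body_rows < len(gei):
        raise ValueError("band heights must leave room for all three parts")
    head = gei[:head_rows]
    body = gei[head_rows:head_rows + body_rows]
    feet = gei[head_rows + body_rows:]
    return head, body, feet


def concat_height(parts: Tuple[Image, ...]) -> Image:
    """Stack maps along the height dimension, mirroring the local-branch concatenation."""
    return [row for part in parts for row in part]


# Toy 8 x 4 "GEI" whose pixel value encodes its row index (hypothetical size).
toy_gei = [[float(r)] * 4 for r in range(8)]
head, body, feet = crop_parts(toy_gei, head_rows=2, body_rows=3)
restacked = concat_height((head, body, feet))
```

Because the three boxes do not overlap and together cover the full height, restacking the bands reproduces the original template, which is why the concatenated local maps can be aligned with the global maps before the channel-wise concatenation.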
3.3 Ordinal distribution loss
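Here we cast the age label $y_i$ as the binary label vector $\mathbf{y}_i = (y_i^1, \dots, y_i^{K-1})$ for the $K-1$ binary classifiers, where $y_i^k$ is 1 if $y_i > r_k$ and 0 otherwise. We employ the cross-entropy loss as the loss function for these binary classifiers. The loss can be calculated as:
$$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K-1}\left[ y_i^k \log o_i^k + (1 - y_i^k)\log(1 - o_i^k) \right], \tag{3}$$
where $o_i^k$ is the output value of the $k$-th binary classifier for the $i$-th sample. The cross-entropy loss, however, optimizes these binary classifiers separately, which can result in discrepancies between different binary classifications, as described in Fig. 2.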
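In order to fully utilize the inner relationship among these $K-1$ outputs, we regard these outputs as a probability distribution and then propose a distribution loss, i.e., the squared Earth Mover's Distance ($\mathrm{EMD}^2$) [10], to penalize the discrepancy between the output distribution and the ground-truth distribution. Firstly, the output values are softly transformed into probability values:
$$p_i^k = \frac{\exp(o_i^k)}{\sum_{j=1}^{K-1}\exp(o_i^j)}. \tag{4}$$
Then the $\mathrm{EMD}^2$ loss is defined as:
$$L_{EMD^2} = \frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K-1}\left( \mathrm{CDF}_k(\mathbf{p}_i) - \mathrm{CDF}_k(\mathbf{q}_i) \right)^2, \tag{5}$$
where $\mathbf{p}_i$ and $\mathbf{q}_i$ are the probability distributions corresponding to the $i$-th output $\mathbf{o}_i$ and the $i$-th ground truth $\mathbf{y}_i$, respectively. $\mathrm{CDF}(\cdot)$ is the cumulative distribution function of its input, and $\mathrm{CDF}_k(\cdot)$ is the $k$-th element of the CDF of its input.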
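To see why a CDF-based distance respects the ordering of the bins, the following minimal sketch computes the squared EMD for a single sample. The softmax normalisation is an assumed form of the soft transformation, and the function names are hypothetical.

```python
from itertools import accumulate
from math import exp
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Normalise a vector into a probability distribution (assumed soft transform)."""
    shift = max(values)                        # subtract the max for numerical stability
    exps = [exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def squared_emd(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of squared differences between the two cumulative distributions."""
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


target = [0.0, 1.0, 0.0, 0.0]
near = [0.0, 0.0, 1.0, 0.0]     # mass placed one bin away from the target
far = [0.0, 0.0, 0.0, 1.0]      # mass placed two bins away from the target

dist_near = squared_emd(near, target)
dist_far = squared_emd(far, target)
```

A distribution whose mass lies further from the target bin receives a larger penalty, whereas a bin-wise loss would treat both mistakes identically. This is the ordinal sensitivity that the distribution loss is meant to add.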
Finally, we propose an ordinal distribution loss that combines the cross-entropy loss with the $\mathrm{EMD}^2$ loss. This loss function is easily embedded into the architecture of GLCNN for end-to-end learning. The ordinal distribution loss (ODL) is
$$L_{ODL} = L_{CE} + \lambda L_{EMD^2}, \tag{6}$$
where $\lambda$ is a hyperparameter that controls the influence of $L_{EMD^2}$ in the joint loss.
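A per-sample version of the joint loss can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the soft transformation is taken to be a softmax applied to both outputs and labels, the averaging over samples is omitted, and the value `lam=1.0` is a hypothetical default.

```python
from itertools import accumulate
from math import exp, log
from typing import List, Sequence


def _softmax(values: Sequence[float]) -> List[float]:
    shift = max(values)
    exps = [exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(outputs: Sequence[float], labels: Sequence[float],
                  eps: float = 1e-12) -> float:
    """Sum of the binary cross-entropies of the individual classifiers."""
    return -sum(y * log(max(o, eps)) + (1 - y) * log(max(1 - o, eps))
                for o, y in zip(outputs, labels))


def squared_emd(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared distance between the cumulative distributions of p and q."""
    return sum((a - b) ** 2 for a, b in zip(accumulate(p), accumulate(q)))


def ordinal_distribution_loss(outputs: Sequence[float], labels: Sequence[float],
                              lam: float = 1.0) -> float:
    """Cross-entropy plus lam times the squared EMD (lam is a hypothetical value)."""
    emd = squared_emd(_softmax(outputs), _softmax(labels))
    return cross_entropy(outputs, labels) + lam * emd


labels = [1, 1, 0, 0]
consistent = [0.9, 0.8, 0.2, 0.1]      # probabilities decrease with the rank
inconsistent = [0.9, 0.2, 0.8, 0.1]    # violates the expected ordering

loss_consistent = ordinal_distribution_loss(consistent, labels)
loss_inconsistent = ordinal_distribution_loss(inconsistent, labels)
```

Setting `lam=0.0` reduces the joint loss to the plain cross-entropy, so the second term can be switched off to compare the two objectives on the same network.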
3.4 Learning ODRGLCNN
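One advantage of using Eq. (6) is that the ordinal distribution loss can simultaneously learn each binary classification and the inner relationship between these binary classifications. For the $i$-th sample $x_i$, the gradient of our loss can be derived as:
$$\frac{\partial L_{ODL}}{\partial \theta} = \left( \frac{\partial L_{CE}}{\partial \mathbf{o}_i} + \lambda \frac{\partial L_{EMD^2}}{\partial \mathbf{o}_i} \right) \frac{\partial \mathbf{o}_i}{\partial \theta}, \tag{7}$$
where $\theta$ represents the parameters of the network, and $\partial \mathbf{o}_i / \partial \theta$ can be derived through the standard backpropagation method. For the $k$-th element of $\mathbf{o}_i$, the gradient of the cross-entropy loss can be derived as:
$$\frac{\partial L_{CE}}{\partial o_i^k} = -\frac{1}{N}\left( \frac{y_i^k}{o_i^k} - \frac{1 - y_i^k}{1 - o_i^k} \right). \tag{8}$$
For the $k$-th element of $\mathbf{o}_i$, the gradient of the $\mathrm{EMD}^2$ loss can be derived as:
$$\frac{\partial L_{EMD^2}}{\partial o_i^k} = \frac{2}{N}\sum_{j=1}^{K-1}\left( \mathrm{CDF}_j(\mathbf{p}_i) - \mathrm{CDF}_j(\mathbf{q}_i) \right) \frac{\partial\, \mathrm{CDF}_j(\mathbf{p}_i)}{\partial o_i^k}, \tag{9}$$
where $\mathrm{CDF}_j(\mathbf{p}_i) = \sum_{m=1}^{j} p_i^m$. Eq. (8) indicates that the gradient of the cross-entropy loss depends only on the output value of each binary classification and its corresponding ground truth, ignoring the intrinsic correlation among the binary classifiers. In contrast, the output values of all the classifications are involved when computing the gradient of a specific binary classification in the $\mathrm{EMD}^2$ loss, as shown in Eq. (9). Therefore, the ordinal distribution loss not only considers each binary classification but also utilizes the inner relationship among them.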
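The difference between the two gradients can be checked numerically with finite differences. The sketch below is a hedged illustration: it assumes a softmax as the soft transformation, works on one toy sample with four classifiers, and uses hypothetical names throughout.

```python
from itertools import accumulate
from math import exp, log
from typing import Callable, List, Sequence

LABELS = [1.0, 1.0, 0.0, 0.0]  # toy binary labels for K - 1 = 4 classifiers


def softmax(values: Sequence[float]) -> List[float]:
    shift = max(values)
    exps = [exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(outputs: Sequence[float]) -> float:
    return -sum(y * log(o) + (1 - y) * log(1 - o) for o, y in zip(outputs, LABELS))


def squared_emd(outputs: Sequence[float]) -> float:
    cdf_p = accumulate(softmax(outputs))
    cdf_q = accumulate(softmax(LABELS))
    return sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


def numeric_grad(loss: Callable[[Sequence[float]], float],
                 outputs: Sequence[float], k: int, h: float = 1e-6) -> float:
    """Central finite difference of the loss with respect to outputs[k]."""
    up, down = list(outputs), list(outputs)
    up[k] += h
    down[k] -= h
    return (loss(up) - loss(down)) / (2 * h)


base = [0.9, 0.8, 0.2, 0.1]
moved = [0.9, 0.8, 0.6, 0.1]   # only the third output differs from base

# How much does the gradient w.r.t. the first output change when the third moves?
ce_shift = abs(numeric_grad(cross_entropy, base, 0) - numeric_grad(cross_entropy, moved, 0))
emd_shift = abs(numeric_grad(squared_emd, base, 0) - numeric_grad(squared_emd, moved, 0))
```

Under these assumptions `ce_shift` stays at the level of numerical noise, while `emd_shift` is clearly non-zero: changing one classifier's output alters the distribution-loss gradient of another classifier, which is the coupling described above.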
4 Experiments
In this section, we describe the experimental settings in detail and demonstrate the effectiveness of the proposed method by comparing it with the state-of-the-art methods and performing a set of ablation studies on the OULP-Age gait dataset [40]. In addition, we evaluate the generalization ability of the proposed approach on another task, facial age estimation on MORPH Album II [26].
4.1 Experimental settings
4.1.1 Data preparation
OULP-Age is the largest gait dataset in the world so far. It contains 63,846 GEI samples (31,093 males and 32,753 females) with ages ranging from 2 to 90 years, and each GEI sample is pixels. The age histogram of this dataset by gender, in five-year intervals, is shown in Fig. 4. Following the protocol suggested with the dataset [40], the OULP-Age dataset was evenly divided into two disjoint subsets (a training set and a test set). The training set contains 15,596 males and 16,327 females, and the test set contains 15,497 males and 16,426 females. In addition, these two sets keep a similar age distribution. Another popular gait dataset for age estimation is the USF dataset [29], which includes only 122 subjects and is too small to train a deep network. Thus, we evaluate the proposed method on the OULP-Age dataset but not on the USF dataset.
MORPH Album II is one of the largest longitudinal face databases in the public domain, containing 55,134 face images of 13,617 subjects in the 16-to-77 age range [26]. Following the protocol in [3, 22, 23, 32], we use the five-fold random split (RS) protocol to evaluate the performance of facial age estimation. All face images are aligned based on five facial landmarks detected using the open-source SeetaFaceEngine (https://github.com/seetaface/SeetaFaceEngine) and are resized into .
4.1.2 Evaluation metrics
The performance of age estimation is evaluated by the Mean Absolute Error (MAE) and the Cumulative Score (CS). MAE is the average of the absolute errors between the predicted age and the ground truth over all test samples. It is defined as $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |\hat{y}_i - y_i|$, where $N$ is the total number of test samples. CS is calculated as $\mathrm{CS}(j) = \frac{N_{e \le j}}{N} \times 100\%$, where $N_{e \le j}$ is the number of test samples whose absolute error between the estimated age and the ground truth is not greater than $j$ years. CS reflects how consistently the model performs by reporting its accuracy at different error levels.
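Both metrics are straightforward to compute; a minimal sketch with hypothetical function names and made-up toy predictions is given below.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], truth: Sequence[float]) -> float:
    """Average absolute difference between predicted and ground-truth ages."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)


def cumulative_score(predicted: Sequence[float], truth: Sequence[float],
                     max_error: float) -> float:
    """Percentage of samples whose absolute error is at most max_error years."""
    hits = sum(1 for p, t in zip(predicted, truth) if abs(p - t) <= max_error)
    return 100.0 * hits / len(truth)


# Toy values for illustration only; they are not results from the paper.
predicted_ages = [23, 40, 61, 8]
true_ages = [25, 33, 60, 8]

mae = mean_absolute_error(predicted_ages, true_ages)
cs_5 = cumulative_score(predicted_ages, true_ages, max_error=5)
```

On this toy example the absolute errors are 2, 7, 1 and 0 years, so three of the four samples fall within a 5-year tolerance.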
4.2 Gait-based age estimation results
4.2.1 Implementation details
In our experiments, we utilize GLCNN, CNN (consisting of only the global part shown in Fig. 3), and VGG16 [24, 33] as three backbone networks. We use Adam [11] with a learning rate of , $\beta_1 = 0.5$, $\beta_2 = 0.999$, weight decay , a batch size of 300 and a maximum of 300 epochs for CNN and GLCNN. Following the setting in [23], we use stochastic gradient descent (SGD) with a learning rate of , weight decay , a batch size of 300 and a maximum of 100 epochs for VGG16, and multiply the learning rate by 0.1 every 15 epochs. To make the grayscale GEI suitable for VGG16, which is pretrained on ImageNet [28], we replicate the GEI three times to form the RGB channels fed into VGG16. Besides, the weight coefficient $\lambda$ of the $\mathrm{EMD}^2$ loss term in Eq. (6) is set to , which is tuned according to the model performance. All our experiments are implemented in PyTorch with four GeForce GTX 1080 Ti GPUs.
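The step-wise learning rate decay and the channel replication used for VGG16 can be sketched in a few lines. The base learning rate of 0.01 below is a hypothetical placeholder, not the value used in the experiments, and both function names are assumptions made for this illustration.

```python
from typing import List

Image = List[List[float]]


def step_lr(base_lr: float, epoch: int, step: int = 15, gamma: float = 0.1) -> float:
    """Learning rate after multiplying by gamma once every `step` epochs."""
    return base_lr * gamma ** (epoch // step)


def gray_to_rgb(gei: Image) -> List[Image]:
    """Replicate a single-channel GEI into three identical channels."""
    return [[row[:] for row in gei] for _ in range(3)]


hypothetical_base_lr = 0.01   # placeholder value chosen for this sketch
schedule = [step_lr(hypothetical_base_lr, epoch) for epoch in (0, 14, 15, 30)]

toy_gei = [[0.0, 0.5], [1.0, 0.25]]
rgb = gray_to_rgb(toy_gei)
```

In PyTorch the same schedule is commonly obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)`; the pure-Python version here only makes the arithmetic explicit.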
4.2.2 Comparisons with the state of the art
We compare the proposed method with the state-of-the-art methods, including classification-based methods (e.g., MLG [15]), regression-based methods (e.g., GPR [21], SVR [34] and ASSOLPP [14]), and age manifold learning-based methods (e.g., OPLDA and OPMFA [16]). Besides, we implemented a deep learning method as a baseline, named VGG16 + MeanVariance, to validate the effectiveness of the proposed method. This method, proposed in [23], achieves outstanding performance in the field of face-based age estimation.

| Methods | MAE | CS () |
| --- | --- | --- |
| SVR [34] | 7.66 | 41.40% |
| MLG [15] | 10.98 | 43.40% |
| OPLDA [16] | 8.45 | 36.50% |
| OPMFA [16] | 9.08 | 34.70% |
| GPR [21] | 7.30 | 43.60% |
| ASSOLPP [14] | 6.78 | 53.00% |
| VGG16 + MeanVariance [23] | 5.59 | 60.46% |
| ODRGLCNN (Ours) | 5.12 | 66.95% |

Table 1 shows the results of eight methods on OULP-Age. It suggests that CNN-based methods, such as [23], achieve lower MAE than traditional methods [14, 15, 16, 21, 34]. This is because CNN-based methods have many more parameters and learn more representative features through end-to-end training. Our method performs the best among all the approaches, because it benefits from not only a more representative feature extraction network (GLCNN) but also a novel loss function (the ordinal distribution loss), which learns not only each binary classification of the ordinal regression but also the inner relationship among them. Besides, as shown in Fig. 5, the CS results on the OULP-Age dataset further demonstrate that the proposed approach performs consistently better than other state-of-the-art methods.
Some age estimation examples are shown in Fig. 6. We can see that the proposed approach performs robustly for young, middle-aged, and old subjects. It is noticeable from the last row of Fig. 6 that the age estimation accuracy may degrade when a person wears heavy clothes, or when a person is too thin or too fat.
4.2.3 Analyzing the performance of GLCNN
We evaluate the performance of the proposed network GLCNN by comparing it with a simple CNN consisting of only the global part and with the widely used VGG16 network for gait-based age estimation. All three networks use the cross-entropy loss as their loss function. The MAE and CS () results of the three networks are shown in Table 2.

| Network | MAE | CS () | Time (ms) |
| --- | --- | --- | --- |
| CNN | 5.45 | 64.64% | 7.27 |
| VGG16 | 5.63 | 63.92% | 21.9 |
| GLCNN (Ours) | 5.24 | 65.96% | 8.99 |

Compared with a simple CNN, GLCNN achieves better performance in age estimation. It can be seen from Table 2 that 1) compared with CNN and VGG16, GLCNN achieves the best performance on both criteria; 2) although VGG16 has more parameters, GLCNN learns more detailed information by combining one global and three local structures, resulting in improved performance; and 3) the computational cost of GLCNN is only slightly higher than that of CNN, but much lower than that of VGG16.
To better demonstrate the effectiveness of the proposed network, we visualize the features of CNN and GLCNN using the t-distributed stochastic neighbor embedding (t-SNE) [19] technique with perplexity 30, as shown in Fig. 7. For better visualization, the age label is divided into 9 age groups, . We can see that both GLCNN and CNN features appear to keep a manifold-like structure, since the order of ages varies smoothly from left to right. However, zooming in on Fig. 7 shows that the samples within each age group are denser for GLCNN than for CNN, especially in age group and , which suggests that GLCNN can achieve lower error in age estimation because it learns a better feature representation from both global and local structures.
4.2.4 Comparison of different losses
To validate the effectiveness of the proposed ordinal distribution loss, we compare it with three losses widely used in the age estimation task, i.e., the Euclidean, MAE and cross-entropy losses, by performing age estimation with the proposed GLCNN. The MAE and CS () of these losses are reported in Table 3.

| Loss | MAE | CS () |
| --- | --- | --- |
| Euclidean | 6.73 | 52.95% |
| MAE | 6.65 | 55.16% |
| CrossEntropy | 5.24 | 65.96% |
| Ordinal Distribution (Ours) | 5.12 | 66.95% |

It can be seen that the cross-entropy loss outperforms the Euclidean loss and the MAE loss on the age estimation task. The reason is that the Euclidean and MAE losses easily lead to overfitting and do not consider the ordinal information between age labels. In contrast, the proposed ordinal distribution loss incorporates the inner relationship between the binary classifications by using a distribution loss, the $\mathrm{EMD}^2$ loss, resulting in better predictive performance.
4.2.5 Discussion of the influence of gender
As Fig. 1 shows, gender is correlated with the age-related appearance of gait, since human gait appearances vary between males and females even within the same age group. To better utilize the relationship between the age and gender of gait, we embed a multi-task technique into the CNN-based framework. Specifically, as shown in Fig. 8, we integrate a gender classification task into the proposed method and three other CNN-based methods. As gender classification is a binary classification, the gender loss is defined as:
$$L_{g} = -\frac{1}{N}\sum_{i=1}^{N}\left[ g_i \log \hat{g}_i + (1 - g_i)\log(1 - \hat{g}_i) \right], \tag{10}$$
where $g_i$ is the ground-truth gender of the $i$-th sample and $\hat{g}_i$ is the corresponding predicted value.
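The gender branch can be sketched as a standard binary cross-entropy together with the accuracy reported alongside it. The encoding of gender as 0 or 1, the clamping constant `eps`, the 0.5 decision threshold and the function names are all assumptions made for this illustration.

```python
from math import log
from typing import Sequence


def gender_loss(predicted: Sequence[float], truth: Sequence[int],
                eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = sum(g * log(max(p, eps)) + (1 - g) * log(max(1 - p, eps))
                for p, g in zip(predicted, truth))
    return -total / len(truth)


def gender_accuracy(predicted: Sequence[float], truth: Sequence[int],
                    threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for p, g in zip(predicted, truth) if (p > threshold) == bool(g))
    return correct / len(truth)


# Toy values for illustration only.
probs = [0.9, 0.2, 0.6, 0.4]
genders = [1, 0, 0, 1]

loss_value = gender_loss(probs, genders)
accuracy = gender_accuracy(probs, genders)
```

In a multi-task setting this term would be added to the age loss, typically with a weighting factor; how the two terms are weighted is not specified here, so any such factor would have to be treated as an additional hyperparameter.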

| Methods | MAE w/o gender | MAE w/ gender | Gender acc. |
| --- | --- | --- | --- |
| CNN + Euclidean | 6.96 | 6.82 | 96.70% |
| CNN + CE | 5.40 | 5.34 | 97.20% |
| VGG16 + MV | 5.59 | 5.52 | 96.70% |
| ODRGLCNN | 5.12 | 5.06 | 97.80% |

Table 4 indicates that the gender information indeed improves the performance of gait-based age estimation. Moreover, the accuracy of gender classification in our method is 97.8%, implying that, as a byproduct, our network can accurately predict the gender of a person from gait.
4.3 The facial age estimation
We apply the proposed ordinal distribution loss (ODL) to facial age estimation on MORPH Album II, and compare the results with the state-of-the-art methods [3, 22, 23, 27, 32]. Following [23, 32], we also utilize VGG16, pretrained on ImageNet [28], as the backbone network with the proposed ODL. The results of the individual approaches in terms of MAE and CS are reported in Table 5. We can see that our approach achieves better prediction performance than the state-of-the-art method DRFs [32], which suggests that our approach generalizes well to the facial age estimation task. Besides, the results obtained with the ordinal distribution loss (ODL, ) are better than those obtained with a single cross-entropy loss (ODL, $\lambda = 0$). This indicates that ODL is more effective in learning the ordinal relationship among different ages than a single cross-entropy loss.

| Methods | MAE | CS () |
| --- | --- | --- |
| ORCNN [22] | 3.27 | 73.0%* |
| DEX [27] | 3.25 | N/A |
| RankingCNN [3] | 2.96 | 85.0%* |
| VGG16 + MeanVariance [23] | 2.41 | 90.0%* |
| DRFs [32] | 2.17 | 91.3% |
| VGG16 + ODL($\lambda = 0$) | 2.30 | 91.1% |
| VGG16 + ODL() (Ours) | 2.16 | 92.9% |

5 Conclusion
In this paper, we proposed an ordinal distribution regression with GLCNN, which consists of one global and three local subnetworks, for gait-based age estimation. By combining the cross-entropy loss and the $\mathrm{EMD}^2$ loss, the proposed ordinal distribution loss is more effective in learning the ordinal relationship among different ages than a single cross-entropy loss. Moreover, one global and three local subnetworks are constructed to extract more representative gait features. We also notice that if gender information is available for training, embedding a multi-task strategy into the proposed framework can slightly improve the performance of age estimation. Experiments on the OULP-Age and MORPH Album II datasets show that our approach not only performs better than the state-of-the-art methods on gait-based age estimation, but also generalizes well to the facial age estimation task.
References
[1] K.-Y. Chang and C.-S. Chen. A learning framework for age rank estimation based on face images with scattering transform. IEEE TIP, 24(3):785–798, 2015.
[2] K.-Y. Chang, C.-S. Chen, and Y.-P. Hung. Ordinal hyperplanes ranker with cost sensitivities for age estimation. In CVPR, pages 585–592, 2011.
[3] S. Chen, C. Zhang, M. Dong, J. Le, and M. Rao. Using ranking-CNN for age estimation. In CVPR, 2017.
[4] K. Crammer and Y. Singer. Pranking with ranking. In NIPS, pages 641–647, 2002.
[5] E. Frank and M. Hall. A simple approach to ordinal classification. In ECML, pages 145–156, 2001.
[6] Y. Fu and T. S. Huang. Human age estimation with regression on discriminative aging manifold. IEEE TMM, 10(4):578–584, 2008.
[7] J. Han and B. Bhanu. Individual recognition using gait energy image. IEEE TPAMI, 28(2):316–322, 2006.
[8] Y. He, J. Zhang, H. Shan, and L. Wang. Multi-task GANs for view-specific feature learning in gait recognition. IEEE TIFS, 14(1):102–113, 2019.
[9] R. Herbrich, T. Graepel, and K. Obermayer. Support vector learning for ordinal regression. In ICANN, pages 97–102, 1999.
[10] L. Hou, C.-P. Yu, and D. Samaras. Squared earth mover's distance-based loss for training deep neural networks. arXiv:1611.05916, 2016.
[11] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[12] A. Lanitis, C. Draganova, and C. Christodoulou. Comparing different classifiers for automatic age estimation. IEEE TSMCB, 34(1):621–628, 2004.
[13] L. Li and H.-T. Lin. Ordinal regression by extended binary classification. In NIPS, pages 865–872, 2007.
[14] X. Li, Y. Makihara, C. Xu, Y. Yagi, and M. Ren. Gait-based human age estimation using age group-dependent manifold learning and regression. MTA, 77(21):1–22, 2018.
[15] J. Lu and Y.-P. Tan. Gait-based human age estimation. IEEE TIFS, 5(4):761–770, 2010.
[16] J. Lu and Y.-P. Tan. Ordinary preserving manifold analysis for human age estimation. In CVPR Workshops, pages 90–95, 2010.
[17] J. Lu and Y.-P. Tan. Ordinary preserving manifold analysis for human age and head pose estimation. IEEE THMS, 43(2):249–258, 2013.
[18] A. L. Maas, A. Y. Hannun, and A. Y. Ng. Rectifier nonlinearities improve neural network acoustic models. In ICML, volume 30, page 3, 2013.
[19] L. v. d. Maaten and G. Hinton. Visualizing data using t-SNE. JMLR, 9(Nov):2579–2605, 2008.
[20] Y. Makihara, T. Kimura, F. Okura, I. Mitsugami, M. Niwa, C. Aoki, A. Suzuki, D. Muramatsu, and Y. Yagi. Gait collector: an automatic gait data collection system in conjunction with an experience-based long-run exhibition. In ICB, pages 1–8, 2016.
[21] Y. Makihara, M. Okumura, H. Iwama, and Y. Yagi. Gait-based age estimation using a whole-generation gait database. In ICB, pages 1–6, 2011.
[22] Z. Niu, M. Zhou, L. Wang, X. Gao, and G. Hua. Ordinal regression with multiple output CNN for age estimation. In CVPR, pages 4920–4928, 2016.
[23] H. Pan, H. Han, S. Shan, and X. Chen. Mean-variance loss for deep age estimation from a face. In CVPR, pages 5285–5294, 2018.

[24] O. M. Parkhi, A. Vedaldi, A. Zisserman, et al. Deep face recognition. In BMVC, volume 1, 2015.
[25] C. E. Rasmussen. Gaussian processes in machine learning. In Advanced Lectures on Machine Learning, pages 63–71. 2004.
[26] K. Ricanek and T. Tesafaye. MORPH: A longitudinal image database of normal adult age-progression. In FGR, pages 341–345, 2006.
[27] R. Rothe, R. Timofte, and L. Van Gool. Deep expectation of real and apparent age from a single image without facial landmarks. IJCV, 126(2-4):144–157, 2018.
[28] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. IJCV, 115(3):211–252, 2015.
[29] S. Sarkar, P. J. Phillips, Z. Liu, I. R. Vega, P. Grother, and K. W. Bowyer. The humanID gait challenge problem: Data sets, performance, and analysis. IEEE TPAMI, 27(2):162–177, 2005.
[30] A. Shashua and A. Levin. Ranking with large margin principle: two approaches. In NIPS, pages 961–968, 2003.
[31] W. Shen, Y. Guo, Y. Wang, K. Zhao, B. Wang, and A. Yuille. Deep regression forests for age estimation. arXiv:1712.07195, 2017.
[32] W. Shen, Y. Guo, Y. Wang, K. Zhao, B. Wang, and A. L. Yuille. Deep regression forests for age estimation. In CVPR, pages 2304–2313, 2018.
[33] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014.
[34] A. J. Smola and B. Schölkopf. A tutorial on support vector regression. Statistics and Computing, 14(3):199–222, 2004.
[35] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929–1958, 2014.
[36] N. Takemura, Y. Makihara, D. Muramatsu, T. Echigo, and Y. Yagi. Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition. IPSJ TCVA, 10(4):1–14, 2018.

[37] T. Wada, Y. Matsumura, S. Maeda, and H. Shibuya. Gaussian process regression with dynamic active set and its application to anomaly detection. In ICDM, 2013.
[38] Z. Wu, Y. Huang, L. Wang, X. Wang, and T. Tan. A comprehensive study on cross-view gait based human identification with deep CNNs. IEEE TPAMI, 39(2):209–226, 2017.
[39] B. Xiao, X. Yang, H. Zha, Y. Xu, and T. S. Huang. Metric learning for regression problems and human age estimation. In PCM, pages 88–99, 2009.
[40] C. Xu, Y. Makihara, G. Ogi, X. Li, Y. Yagi, and J. Lu. The OU-ISIR gait database comprising the large population dataset with age and performance evaluation of age estimation. IPSJ TCVA, 9(1):24, 2017.
[41] Z. Yang and H. Ai. Demographic classification with local binary patterns. In ICB, pages 464–473, 2007.