Hujun Yin


  • WebCaricature: a benchmark for caricature face recognition

    Caricatures are facial drawings by artists with exaggeration of certain facial parts. The exaggerations are often beyond realism, and yet the caricatures are still recognizable by humans. With the advent of deep learning, recognition performance by computers on real-world faces has become comparable to human performance even under unconstrained situations. However, there is still a gap in caricature recognition performance between computers and humans. This is mainly due to the lack of publicly available large-scale caricature datasets. To facilitate research in caricature recognition, a new caricature dataset is built. All the caricature images and face images were collected from the web. Compared with two existing datasets, this dataset is larger and covers various artistic styles. We also offer evaluation protocols and present baseline performances on the dataset. Specifically, four evaluation protocols are provided: restricted and unrestricted caricature verification, and caricature-to-photo and photo-to-caricature face identification. Based on the evaluation protocols, three face alignment methods together with five kinds of features and nine subspace and metric learning algorithms have been applied to provide the baseline performances on this dataset. The main conclusion is that there is still room for improvement in caricature face recognition.

    03/09/2017 ∙ by Jing Huo, et al.

  • Robust Face Recognition with Structural Binary Gradient Patterns

    This paper presents a computationally efficient yet powerful binary framework for robust facial representation based on image gradients. It is termed structural binary gradient patterns (SBGP). To discover underlying local structures in the gradient domain, we compute image gradients from multiple directions and simplify them into a set of binary strings. The SBGP is derived from certain types of these binary strings that have meaningful local structures and are capable of resembling fundamental textural information. They detect micro orientational edges and possess strong orientation and locality capabilities, thus enabling great discrimination. The SBGP also benefits from the advantages of the gradient domain and exhibits profound robustness against illumination variations. The binary strategy realized by pixel correlations in a small neighborhood substantially reduces the computational complexity and achieves extremely efficient processing, taking only 0.0032s in Matlab for a typical face image. Furthermore, the discrimination power of the SBGP can be enhanced on a set of defined orientational image gradient magnitudes, further enforcing locality and orientation. Results of extensive experiments on various benchmark databases illustrate significant improvements of the SBGP-based representations over the existing state-of-the-art local descriptors in terms of discrimination, robustness and complexity. Codes for the SBGP methods will be available at

    06/01/2015 ∙ by Weilin Huang, et al.

  • Breaking the Activation Function Bottleneck through Adaptive Parameterization

    Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly. We present an adaptive LSTM that advances the state of the art for the Penn Treebank and Wikitext-2 word-modeling tasks while using fewer parameters and converging in half as many iterations.

    05/22/2018 ∙ by Sebastian Flennerhag, et al.

  • Cross Attention Network for Semantic Segmentation

    In this paper, we address the semantic segmentation task with a deep network that combines contextual features and spatial information. The proposed Cross Attention Network is composed of two branches and a Feature Cross Attention (FCA) module. Specifically, a shallow branch is used to preserve low-level spatial information and a deep branch is employed to extract high-level contextual features. The FCA module is then introduced to combine these two branches. Unlike most existing attention mechanisms, the FCA module obtains a spatial attention map and a channel attention map from the two branches separately, and then fuses them. The contextual features are used to provide global contextual guidance in the fused feature maps, and the spatial features are used to refine localization. The proposed network outperforms other real-time methods with improved speed on the Cityscapes and CamVid datasets with lightweight backbones, and achieves state-of-the-art performance with a deep backbone.

    07/25/2019 ∙ by Mengyu Liu, et al.

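The Feature Cross Attention idea in the Cross Attention Network abstract (a spatial attention map taken from the shallow branch, a channel attention map taken from the deep branch, then fusion) can be illustrated with a minimal pure-Python sketch. The abstract does not give the layer details, so everything specific here is an assumption made for illustration: the function names `spatial_attention`, `channel_attention` and `fuse`, the use of a channel mean and global average pooling followed by a sigmoid in place of learned layers, the requirement that both branches already have the same `[C][H][W]` shape, and fusion by element-wise sum followed by reweighting. It is not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function, used here to squash scores into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(shallow):
    """One weight per pixel from the shallow branch.

    Assumption: sigmoid of the mean over channels stands in for a
    learned convolution. Input is a nested list shaped [C][H][W].
    """
    c = len(shallow)
    h, w = len(shallow[0]), len(shallow[0][0])
    return [
        [sigmoid(sum(shallow[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def channel_attention(deep):
    """One weight per channel from the deep branch.

    Assumption: sigmoid of global average pooling stands in for a
    learned layer. Input is a nested list shaped [C][H][W].
    """
    weights = []
    for channel in deep:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def fuse(shallow, deep):
    """Sum the two branches, then reweight by both attention maps.

    Assumption: both inputs share the same [C][H][W] shape.
    """
    sa = spatial_attention(shallow)  # shape [H][W]
    ca = channel_attention(deep)     # shape [C]
    c = len(shallow)
    h, w = len(shallow[0]), len(shallow[0][0])
    return [
        [
            [(shallow[k][i][j] + deep[k][i][j]) * sa[i][j] * ca[k] for j in range(w)]
            for i in range(h)
        ]
        for k in range(c)
    ]


# Tiny usage example: 2 channels, 2x3 feature maps.
shallow = [[[0.0] * 3 for _ in range(2)] for _ in range(2)]
deep = [[[1.0] * 3 for _ in range(2)] for _ in range(2)]
fused = fuse(shallow, deep)
```

In this sketch the spatial map decides where to look and the channel map decides which features to emphasize, which mirrors the division of labor the abstract describes; a real network would learn both maps rather than derive them from fixed pooling.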