Guanghan Ning

is this you? claim profile


  • LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking

    In this paper, we propose a novel effective light-weight framework, called LightTrack, for online human pose tracking. The proposed framework is designed to be generic for top-down pose tracking and is faster than existing online and offline methods. Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) are incorporated into one unified functioning entity, easily implemented by a replaceable single-person pose estimation module. Our framework unifies single-person pose tracking with multi-person identity association and sheds first light upon bridging keypoint tracking with object tracking. We also propose a Siamese Graph Convolution Network (SGCN) for human pose matching as a Re-ID module in our pose tracking system. In contrary to other Re-ID modules, we use a graphical representation of human joints for matching. The skeleton-based representation effectively captures human pose similarity and is computationally inexpensive. It is robust to sudden camera shift that introduces human drifting. To the best of our knowledge, this is the first paper to propose an online human pose tracking framework in a top-down fashion. The proposed framework is general enough to fit other pose estimators and candidate matching mechanisms. Our method outperforms other online methods while maintaining a much higher frame rate, and is very competitive with our offline state-of-the-art. We make the code publicly available at:

    05/07/2019 ∙ by Guanghan Ning, et al. ∙ 8 share

    read it

  • Dual Path Networks for Multi-Person Human Pose Estimation

    The task of multi-person human pose estimation in natural scenes is quite challenging. Existing methods include both top-down and bottom-up approaches. The main advantage of bottom-up methods is its excellent tradeoff between estimation accuracy and computational cost. We follow this path and aim to design smaller, faster, and more accurate neural networks for the regression of keypoints and limb association vectors. These two regression tasks are naturally dependent on each other. In this work, we propose a dual-path network specially designed for multi-person human pose estimation, and compare our performance with the openpose network in aspects of model size, forward speed, and estimation accuracy.

    10/27/2017 ∙ by Guanghan Ning, et al. ∙ 0 share

    read it

  • Knowledge Projection for Deep Neural Networks

    While deeper and wider neural networks are actively pushing the performance limits of various computer vision and machine learning tasks, they often require large sets of labeled data for effective training and suffer from extremely high computational complexity. In this paper, we will develop a new framework for training deep neural networks on datasets with limited labeled samples using cross-network knowledge projection which is able to improve the network performance while reducing the overall computational complexity significantly. Specifically, a large pre-trained teacher network is used to observe samples from the training data. A projection matrix is learned to project this teacher-level knowledge and its visual representations from an intermediate layer of the teacher network to an intermediate layer of a thinner and faster student network to guide and regulate its training process. Both the intermediate layers from the teacher network and the injection layers from the student network are adaptively selected during training by evaluating a joint loss function in an iterative manner. This knowledge projection framework allows us to use crucial knowledge learned by large networks to guide the training of thinner student networks, avoiding over-fitting, achieving better network performance, and significantly reducing the complexity. Extensive experimental results on benchmark datasets have demonstrated that our proposed knowledge projection approach outperforms existing methods, improving accuracy by up to 4 attractive for practical applications of deep neural networks.

    10/26/2017 ∙ by Zhi Zhang, et al. ∙ 0 share

    read it

  • Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation

    Human pose estimation using deep neural networks aims to map input images with large variations into multiple body keypoints which must satisfy a set of geometric constraints and inter-dependency imposed by the human body model. This is a very challenging nonlinear manifold learning process in a very high dimensional feature space. We believe that the deep neural network, which is inherently an algebraic computation system, is not the most effecient way to capture highly sophisticated human knowledge, for example those highly coupled geometric characteristics and interdependence between keypoints in human poses. In this work, we propose to explore how external knowledge can be effectively represented and injected into the deep neural networks to guide its training process using learned projections that impose proper prior. Specifically, we use the stacked hourglass design and inception-resnet module to construct a fractal network to regress human pose images into heatmaps with no explicit graphical modeling. We encode external knowledge with visual features which are able to characterize the constraints of human body models and evaluate the fitness of intermediate network output. We then inject these external features into the neural network using a projection matrix learned using an auxiliary cost function. The effectiveness of the proposed inception-resnet module and the benefit in guided learning with knowledge projection is evaluated on two widely used benchmarks. Our approach achieves state-of-the-art performance on both datasets.

    05/05/2017 ∙ by Guanghan Ning, et al. ∙ 0 share

    read it

  • Progressive Neural Networks for Image Classification

    The inference structures and computational complexity of existing deep neural networks, once trained, are fixed and remain the same for all test images. However, in practice, it is highly desirable to establish a progressive structure for deep neural networks which is able to adapt its inference process and complexity for images with different visual recognition complexity. In this work, we develop a multi-stage progressive structure with integrated confidence analysis and decision policy learning for deep neural networks. This new framework consists of a set of network units to be activated in a sequential manner with progressively increased complexity and visual recognition power. Our extensive experimental results on the CIFAR-10 and ImageNet datasets demonstrate that the proposed progressive deep neural network is able to obtain more than 10 fold complexity scalability while achieving the state-of-the-art performance using a single network model satisfying different complexity-accuracy requirements.

    04/25/2018 ∙ by Zhi Zhang, et al. ∙ 0 share

    read it

  • A Top-down Approach to Articulated Human Pose Estimation and Tracking

    Both the tasks of multi-person human pose estimation and pose tracking in videos are quite challenging. Existing methods can be categorized into two groups: top-down and bottom-up approaches. In this paper, following the top-down approach, we aim to build a strong baseline system with three modules: human candidate detector, single-person pose estimator and human pose tracker. Firstly, we choose a generic object detector among state-of-the-art methods to detect human candidates. Then, the cascaded pyramid network is used to estimate the corresponding human pose. Finally, we use a flow-based pose tracker to render keypoint-association across frames, i.e., assigning each human candidate a unique and temporally-consistent id, for the multi-target pose tracking purpose. We conduct extensive ablative experiments to validate various choices of models and configurations. We take part in two ECCV 18 PoseTrack challenges: pose estimation and pose tracking.

    01/23/2019 ∙ by Guanghan Ning, et al. ∙ 0 share

    read it