Liang Wang

is this you? claim profile

0

PhD degree in Pattern Recognition and Intelligent System from the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CAS) 2004, Research Assistant at the Imperial College London, United Kingdom and Monash University, Australia, and a Research Fellow at the University of Melbourne, Australia, Lecturer with the Department of Computer Science, University of Bath, Professor of Hundred Talents Program of CAS at the Institute of Automation, Chinese Academy of Sciences, P. R. China.

  • An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

    Skeleton-based action recognition is an important task that requires the adequate understanding of movement characteristics of a human action from the given skeleton sequence. Recent studies have shown that exploring spatial and temporal features of the skeleton sequence is vital for this task. Nevertheless, how to effectively extract discriminative spatial and temporal features is still a challenging problem. In this paper, we propose a novel Attention Enhanced Graph Convolutional LSTM Network (AGC-LSTM) for human action recognition from skeleton data. The proposed AGC-LSTM can not only capture discriminative features in spatial configuration and temporal dynamics but also explore the co-occurrence relationship between spatial and temporal domains. We also present a temporal hierarchical architecture to increases temporal receptive fields of the top AGC-LSTM layer, which boosts the ability to learn the high-level semantic representation and significantly reduces the computation cost. Furthermore, to select discriminative spatial information, the attention mechanism is employed to enhance information of key joints in each AGC-LSTM layer. Experimental results on two datasets are provided: NTU RGB+D dataset and Northwestern-UCLA dataset. The comparison results demonstrate the effectiveness of our approach and show that our approach outperforms the state-of-the-art methods on both datasets.

    02/25/2019 ∙ by Chenyang Si, et al. ∙ 24 share

    read it

  • Semi-supervised Node Classification via Hierarchical Graph Convolutional Networks

    Graph convolutional networks (GCNs) have been successfully applied in node classification tasks of network mining. However, most of models based on neighborhood aggregation are usually shallow and lack the "graph pooling" mechanism, which prevents the model from obtaining adequate global information. In order to increase the receptive field, we propose a novel deep Hierarchical Graph Convolutional Network (H-GCN) for semi-supervised node classification. H-GCN first repeatedly aggregates structurally similar nodes to hyper-nodes and then refines the coarsened graph to the original to restore the representation for each node. Instead of merely aggregating one- or two-hop neighborhood information, the proposed coarsening procedure enlarges the receptive field for each original node, hence more global information can be learned. Comprehensive experiments conducted on public datasets demonstrate effectiveness of the proposed method over the state-of-art methods. Notably, our model gains substantial improvements when only very few labeled samples are provided.

    02/13/2019 ∙ by Fenyu Hu, et al. ∙ 8 share

    read it

  • Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning

    Reasoning is essential for the development of large knowledge graphs, especially for completion, which aims to infer new triples based on existing ones. Both rules and embeddings can be used for knowledge graph reasoning and they have their own advantages and difficulties. Rule-based reasoning is accurate and explainable but rule learning with searching over the graph always suffers from efficiency due to huge search space. Embedding-based reasoning is more scalable and efficient as the reasoning is conducted via computation between embeddings, but it has difficulty learning good representations for sparse entities because a good embedding relies heavily on data richness. Based on this observation, in this paper we explore how embedding and rule learning can be combined together and complement each other's difficulties with their advantages. We propose a novel framework IterE iteratively learning embeddings and rules, in which rules are learned from embeddings with proper pruning strategy and embeddings are learned from existing triples and new triples inferred by rules. Evaluations on embedding qualities of IterE show that rules help improve the quality of sparse entity embeddings and their link prediction results. We also evaluate the efficiency of rule learning and quality of rules from IterE compared with AMIE+, showing that IterE is capable of generating high quality rules more efficiently. Experiments show that iteratively learning embeddings and rules benefit each other during learning and prediction.

    03/21/2019 ∙ by Wen Zhang, et al. ∙ 8 share

    read it

  • Simultaneous Subspace Clustering and Cluster Number Estimating based on Triplet Relationship

    In this paper we propose a unified framework to simultaneously discover the number of clusters and group the data points into them using subspace clustering. Real data distributed in a high-dimensional space can be disentangled into a union of low-dimensional subspaces, which can benefit various applications. To explore such intrinsic structure, state-of-the-art subspace clustering approaches often optimize a self-representation problem among all samples, to construct a pairwise affinity graph for spectral clustering. However, a graph with pairwise similarities lacks robustness for segmentation, especially for samples which lie on the intersection of two subspaces. To address this problem, we design a hyper-correlation based data structure termed as the triplet relationship, which reveals high relevance and local compactness among three samples. The triplet relationship can be derived from the self-representation matrix, and be utilized to iteratively assign the data points to clusters. Three samples in each triplet are encouraged to be highly correlated and are considered as a meta-element during clustering, which show more robustness than pairwise relationships when segmenting two densely distributed subspaces. Based on the triplet relationship, we propose a unified optimizing scheme to automatically calculate clustering assignments. Specifically, we optimize a model selection reward and a fusion reward by simultaneously maximizing the similarity of triplets from different clusters while minimizing the correlation of triplets from same cluster. The proposed algorithm also automatically reveals the number of clusters and fuses groups to avoid over-segmentation. Extensive experimental results on both synthetic and real-world datasets validate the effectiveness and robustness of the proposed method.

    01/23/2019 ∙ by Jie Liang, et al. ∙ 6 share

    read it

  • Richlt Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons

    Current methods for skeleton-based human action recognition usually work with completely observed skeletons. However, in real scenarios, it is prone to capture incomplete and noisy skeletons, which will deteriorate the performance of traditional models. To enhance the robustness of action recognition models to incomplete skeletons, we propose a multi-stream graph convolutional network (GCN) for exploring sufficient discriminative features distributed over all skeleton joints. Here, each stream of the network is only responsible for learning features from currently unactivated joints, which are distinguished by the class activation maps (CAM) obtained by preceding streams, so that the activated joints of the proposed method are obviously more than traditional methods. Thus, the proposed method is termed richly activated GCN (RA-GCN), where the richly discovered features will improve the robustness of the model. Compared to the state-of-the-art methods, the RA-GCN achieves comparable performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion dataset, the performance deterioration can be alleviated by the RA-GCN significantly.

    05/16/2019 ∙ by Yi-Fan Song, et al. ∙ 6 share

    read it

  • Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons

    Current methods for skeleton-based human action recognition usually work with completely observed skeletons. However, in real scenarios, it is prone to capture incomplete and noisy skeletons, which will deteriorate the performance of traditional models. To enhance the robustness of action recognition models to incomplete skeletons, we propose a multi-stream graph convolutional network (GCN) for exploring sufficient discriminative features distributed over all skeleton joints. Here, each stream of the network is only responsible for learning features from currently unactivated joints, which are distinguished by the class activation maps (CAM) obtained by preceding streams, so that the activated joints of the proposed method are obviously more than traditional methods. Thus, the proposed method is termed richly activated GCN (RA-GCN), where the richly discovered features will improve the robustness of the model. Compared to the state-of-the-art methods, the RA-GCN achieves comparable performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion dataset, the performance deterioration can be alleviated by the RA-GCN significantly.

    05/16/2019 ∙ by Yi-Fan Song, et al. ∙ 6 share

    read it

  • Hierarchical Macro Strategy Model for MOBA Game AI

    The next challenge of game AI lies in Real Time Strategy (RTS) games. RTS games provide partially observable gaming environments, where agents interact with one another in an action space much larger than that of GO. Mastering RTS games requires both strong macro strategies and delicate micro level execution. Recently, great progress has been made in micro level execution, while complete solutions for macro strategies are still lacking. In this paper, we propose a novel learning-based Hierarchical Macro Strategy model for mastering MOBA games, a sub-genre of RTS games. Trained by the Hierarchical Macro Strategy model, agents explicitly make macro strategy decisions and further guide their micro level execution. Moreover, each of the agents makes independent strategy decisions, while simultaneously communicating with the allies through leveraging a novel imitated cross-agent communication mechanism. We perform comprehensive evaluations on a popular 5v5 Multiplayer Online Battle Arena (MOBA) game. Our 5-AI team achieves a 48 teams which are ranked top 1

    12/19/2018 ∙ by Bin Wu, et al. ∙ 6 share

    read it

  • Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection

    Person search with natural language aims to retrieve the corresponding person in an image database by virtue of a describing sentence about the person, which poses great potential for many applications, e.g., video surveillance. Extracting corresponding visual contents to the human description is the key to this cross-modal matching problem. In this paper, we propose a cascade attention network (CAN) to progressively select from person image and text-image similarity. In the CAN, a pose-guided attention is first proposed to attend to the person in the augmented input which concatenates original 3 image channels with another 14 pose confidence maps. With the extracted person image representation, we compute the local similarities between person parts and textual description. Then a similarity-based hard attention is proposed to further select the description-related similarity scores from those local similarities. To verify the effectiveness of our model, we perform extensive experiments on the CUHK Person Description Dataset (CUHK-PEDES) which is currently the only dataset for person search with natural language. Experimental results show that our approach outperforms the state-of-the-art methods by a large margin.

    09/22/2018 ∙ by Ya Jing, et al. ∙ 4 share

    read it

  • Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation

    Semantic segmentation has achieved huge progress via adopting deep Fully Convolutional Networks (FCN). However, the performance of FCN based models severely rely on the amounts of pixel-level annotations which are expensive and time-consuming. To address this problem, it is a good choice to learn to segment with weak supervision from bounding boxes. How to make full use of the class-level and region-level supervisions from bounding boxes is the critical challenge for the weakly supervised learning task. In this paper, we first introduce a box-driven class-wise masking model (BCM) to remove irrelevant regions of each class. Moreover, based on the pixel-level segment proposal generated from the bounding box supervision, we could calculate the mean filling rates of each class to serve as an important prior cue, then we propose a filling rate guided adaptive loss (FR-Loss) to help the model ignore the wrongly labeled pixels in proposals. Unlike previous methods directly training models with the fixed individual segment proposals, our method can adjust the model learning with global statistical information. Thus it can help reduce the negative impacts from wrongly labeled proposals. We evaluate the proposed method on the challenging PASCAL VOC 2012 benchmark and compare with other methods. Extensive experimental results show that the proposed method is effective and achieves the state-of-the-art results.

    04/26/2019 ∙ by Chunfeng Song, et al. ∙ 2 share

    read it

  • Progressive Cluster Purification for Transductive Few-shot Learning

    Few-shot learning aims to learn to generalize a classifier to novel classes with limited labeled data. Transductive inference that utilizes unlabeled test set to deal with low-data problem has been employed for few-shot learning in recent literature. Yet, these methods do not explicitly exploit the manifold structures of semantic clusters, which is inefficient for transductive inference. In this paper, we propose a novel Progressive Cluster Purification (PCP) method for transductive few-shot learning. The PCP can progressively purify the cluster by exploring the semantic interdependency in the individual cluster space. Specifically, the PCP consists of two-level operations: inter-class classification and intra-class transduction. The inter-class classification partitions all the test samples into several clusters by comparing the test samples with the prototypes. The intra-class transduction effectively explores trustworthy test samples for each cluster by modeling data relations within a cluster as well as among different clusters. Then, it refines the prototypes to better represent the real distribution of semantic clusters. The refined prototypes are used to remeasure all the test instances and purify each cluster. Furthermore, the inter-class classification and the intra-class transduction are extremely flexible to be repeated several times to progressively purify the clusters. Experimental results are provided on two datasets: miniImageNet dataset and tieredImageNet dataset. The comparison results demonstrate the effectiveness of our approach and show that our approach outperforms the state-of-the-art methods on both datasets.

    06/10/2019 ∙ by Chenyang Si, et al. ∙ 1 share

    read it

  • Irregular Convolutional Neural Networks

    Convolutional kernels are basic and vital components of deep Convolutional Neural Networks (CNN). In this paper, we equip convolutional kernels with shape attributes to generate the deep Irregular Convolutional Neural Networks (ICNN). Compared to traditional CNN applying regular convolutional kernels like 3×3, our approach trains irregular kernel shapes to better fit the geometric variations of input features. In other words, shapes are learnable parameters in addition to weights. The kernel shapes and weights are learned simultaneously during end-to-end training with the standard back-propagation algorithm. Experiments for semantic segmentation are implemented to validate the effectiveness of our proposed ICNN.

    06/24/2017 ∙ by Jiabin Ma, et al. ∙ 0 share

    read it