Jiawei He


  • Generative Model with Dynamic Linear Flow

    Flow-based generative models are a family of exact log-likelihood models with tractable sampling and latent-variable inference, and are hence conceptually attractive for modeling complex distributions. However, their density estimation performance lags behind that of state-of-the-art autoregressive models. Autoregressive models, which also belong to the family of likelihood-based methods, in turn suffer from limited parallelizability. In this paper, we propose Dynamic Linear Flow (DLF), a new family of invertible transformations with partially autoregressive structure. Our method benefits from the efficient computation of flow-based methods and the high density estimation performance of autoregressive methods. We demonstrate that the proposed DLF yields state-of-the-art performance on ImageNet 32x32 and 64x64 among all flow-based methods, and is competitive with the best autoregressive model. Additionally, our model converges 10 times faster than Glow (Kingma and Dhariwal, 2018). The code is available at https://github.com/naturomics/DLF.

    05/08/2019 ∙ by Huadong Liao, et al.

  • Lifelong GAN: Continual Learning for Conditional Image Generation

    Lifelong learning is challenging for deep neural networks due to their susceptibility to catastrophic forgetting. Catastrophic forgetting occurs when a trained network is not able to maintain its ability to accomplish previously learned tasks when it is trained to perform new tasks. We study the problem of lifelong learning for generative models, extending a trained network to new conditional generation tasks without forgetting previous tasks, while assuming access to the training data for the current task only. In contrast to state-of-the-art memory-replay-based approaches, which are limited to label-conditioned image generation tasks, we propose a more generic framework for continual learning of generative models under different conditional image generation settings. Lifelong GAN employs knowledge distillation to transfer learned knowledge from previous networks to the new network. This makes it possible to perform image-conditioned generation tasks in a lifelong learning setting. We validate Lifelong GAN for both image-conditioned and label-conditioned generation tasks, and provide qualitative and quantitative results to show the generality and effectiveness of our method.

    07/23/2019 ∙ by Mengyao Zhai, et al.

  • LayoutVAE: Stochastic Scene Layout Generation from a Label Set

    There has recently been increasing interest in scene generation within the research community. However, scene layouts are largely being modeled in a deterministic fashion, ignoring any plausible visual variations given the same textual description as input. We propose LayoutVAE, a variational autoencoder-based framework for generating stochastic scene layouts. LayoutVAE is a versatile modeling framework that allows for generating full image layouts given a label set, or per-label layouts for an existing image given a new label. In addition, it is also capable of detecting unusual layouts, potentially providing a way to evaluate the layout generation problem. Extensive experiments on MNIST-Layouts and the challenging COCO 2017 Panoptic dataset verify the effectiveness of our proposed framework.

    07/24/2019 ∙ by Akash Abdu Jyothi, et al.

  • Probabilistic Video Generation using Holistic Attribute Control

    Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors: (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attribute-induced appearance, encoding the persistent content of each frame, and (ii) an inter-frame motion or scene dynamics (e.g., encoding evolution of the person executing the action). Based on this intuition, we propose a generative framework for video generation and future prediction. The proposed framework generates a video (short clip) by decoding samples sequentially drawn from a latent space distribution into full video frames. Variational Autoencoders (VAEs) are used as a means of encoding/decoding frames into/from the latent space and an RNN as a way to model the dynamics in the latent space. We improve the video generation consistency through temporally-conditional sampling and quality by structuring the latent space with attribute controls, ensuring that attributes can be both inferred and conditioned on during learning/generation. As a result, given attributes and/or the first frame, our model is able to generate diverse but highly consistent sets of video sequences, accounting for the inherent uncertainty in the prediction task. Experimental results on Chair CAD, Weizmann Human Action, and MIT-Flickr datasets, along with detailed comparison to the state-of-the-art, verify the effectiveness of the framework.

    03/21/2018 ∙ by Jiawei He, et al.

  • Generic Tubelet Proposals for Action Localization

    We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which can generate generic, class-independent, video-level tubelet proposals. The generated tubelet proposals can be utilized in various video analysis tasks, including recognizing and localizing actions in videos. In particular, we integrate these generic tubelet proposals into a unified temporal deep network for action classification. Compared with other methods, our generic tubelet proposal method is accurate, general, and fully differentiable under a smooth L1 loss function. We demonstrate the performance of our algorithm on the standard UCF-Sports, J-HMDB21, and UCF-101 datasets. Our class-independent TPN outperforms other tubelet generation methods, and our unified temporal deep network achieves state-of-the-art localization results on all three datasets.

    05/30/2017 ∙ by Jiawei He, et al.

  • SECaps: A Sequence Enhanced Capsule Model for Charge Prediction

    Automatic charge prediction aims to predict appropriate final charges according to the fact descriptions for a given criminal case. Automatic charge prediction plays an important role in assisting judges and lawyers to improve the efficiency of legal decisions, and thus has received much attention. Nevertheless, most existing works on automatic charge prediction perform adequately on high-frequency charges but are not yet capable of predicting few-shot charges with limited cases. On the other hand, some works have shown the benefits of the capsule network, which is a powerful technique. This motivates us to propose a Sequence Enhanced Capsule model, dubbed the SECaps model, to relieve this problem. More specifically, we propose a new basic structure, the seq-caps layer, to enhance the capsule by taking sequence information into account. In addition, we construct our SECaps model by making use of the seq-caps layer. Compared with the state-of-the-art methods, our SECaps model achieves 4.5 Criminal-S and Criminal-L, respectively. The experimental results consistently demonstrate the superiority and competitiveness of our proposed model.

    10/10/2018 ∙ by Congqing He, et al.

  • A Variational Auto-Encoder Model for Stochastic Point Processes

    We propose a novel probabilistic generative model for action sequences. The model is termed the Action Point Process VAE (APP-VAE), a variational auto-encoder that can capture the distribution over the times and categories of action sequences. Modeling the variety of possible action sequences is a challenge, which we show can be addressed via the APP-VAE's use of latent representations and non-linear functions to parameterize distributions over which event is likely to occur next in a sequence and at what time. We empirically validate the efficacy of APP-VAE for modeling action sequences on the MultiTHUMOS and Breakfast datasets.

    04/05/2019 ∙ by Nazanin Mehrasa, et al.