Jiqing Wu


  • Sliced Wasserstein Generative Models

    In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric for measuring the discrepancy between generated and real data distributions. Unfortunately, it is challenging to approximate the WD of high-dimensional distributions. In contrast, the sliced Wasserstein distance (SWD) factorizes high-dimensional distributions into their multiple one-dimensional marginal distributions and is thus easier to approximate. In this paper, we introduce novel approximations of the primal and dual SWD. Instead of using a large number of random projections, as is done by conventional SWD approximation methods, we propose to approximate SWDs with a small number of parameterized orthogonal projections in an end-to-end deep learning fashion. As concrete applications of our SWD approximations, we design two types of differentiable SWD blocks to equip modern generative frameworks: Auto-Encoders (AEs) and Generative Adversarial Networks (GANs). In the experiments, we not only show the superiority of the proposed generative models on standard image synthesis benchmarks, but also demonstrate state-of-the-art performance on challenging high-resolution image and video generation in an unsupervised manner.

    04/10/2019 ∙ by Jiqing Wu, et al.
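
For reference, the conventional random-projection estimate of the SWD that the abstract above contrasts itself with can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' method: the function names, the default of 64 projections, and the requirement of equally sized sample sets are illustrative choices, and the paper's learned orthogonal projections are not reproduced here.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sliced_wasserstein(xs, ys, n_projections=64, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    xs and ys are lists of equally many points, each a list of floats.
    (Hypothetical helper for illustration only.)
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized sample sets")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # Project every sample onto the direction theta (a one-dimensional marginal).
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
        # In one dimension, optimal transport pairs the sorted samples.
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    # Average over directions, then take the p-th root.
    return (total / n_projections) ** (1.0 / p)
```

Each projection reduces the problem to one dimension, where sorting solves the transport problem exactly; the cost of this simplicity is that many random directions may be needed in high dimensions, which is the limitation the abstract points to.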

  • Energy-relaxed Wasserstein GANs (EnergyWGAN): Towards More Stable and High Resolution Image Generation

    Recently, generative adversarial networks (GANs) have had a great impact on a broad range of applications, including low-resolution (LR) image synthesis. However, they suffer from unstable training, especially as image resolution increases. To overcome this bottleneck, this paper generalizes the state-of-the-art Wasserstein GANs (WGANs) to an energy-relaxed objective which enables more stable and higher-resolution image generation. The benefits of this generalization can be summarized in three main points. Firstly, the proposed EnergyWGAN objective guarantees a valid symmetric divergence serving as a more rigorous and meaningful quantitative measure. Secondly, EnergyWGAN is capable of searching a more faithful solution space than the original WGANs without fixing a specific k-Lipschitz constraint. Finally, the proposed EnergyWGAN offers a natural way of stacking GANs for high-resolution image generation. In our experiments, we not only demonstrate the stable training ability of the proposed EnergyWGAN and its better image generation results on standard benchmark datasets, but also show its advantages over the state-of-the-art GANs on a real-world high-resolution image dataset.

    12/04/2017 ∙ by Jiqing Wu, et al.

  • Face Translation between Images and Videos using Identity-aware CycleGAN

    This paper presents a new problem of unpaired face translation between images and videos, which can be applied to facial video prediction and enhancement. This problem poses two major technical challenges: 1) designing a robust translation model between static images and dynamic videos, and 2) preserving facial identity during image-video translation. To address these two challenges, we generalize the state-of-the-art image-to-image translation network (Cycle-Consistent Adversarial Networks) to the image-to-video/video-to-image translation context by exploiting an image-video translation model and an identity preservation model. In particular, we apply the state-of-the-art Wasserstein GAN technique to the setting of image-video translation for better convergence, and we introduce a face verifier to ensure that identity is preserved. Experiments on standard image/video face datasets demonstrate the effectiveness of the proposed model in terms of both qualitative and quantitative evaluations.

    12/04/2017 ∙ by Zhiwu Huang, et al.

  • Generative Autotransporters

    In this paper, we aim to introduce classic Optimal Transport theory to enhance deep generative probabilistic modeling. For this purpose, we design a Generative Autotransporter (GAT) model with explicit distribution optimal transport. In particular, the GAT model has a deep distribution transporter that transfers the target distribution to a specific prior probability distribution, which enables a regular decoder to generate target samples from input data that follows the transported prior distribution. With such a design, the GAT model can be stably trained to generate novel data using only a very simple l_1 reconstruction loss function with a generalized manifold-based Adam training algorithm. Experiments on two standard benchmarks demonstrate its strong generation ability.

    06/08/2017 ∙ by Jiqing Wu, et al.

  • On the Relation between Color Image Denoising and Classification

    A large amount of the image denoising literature focuses on single-channel images and often experimentally validates the proposed methods on tens of images at most. In this paper, we investigate the interaction between denoising and classification on a large-scale dataset. Inspired by classification models, we propose a novel deep learning architecture for color (multichannel) image denoising and report results on thousands of images from the ImageNet dataset as well as commonly used imagery. We study the importance of (sufficient) training data and how semantic class information can be traded for improved denoising results. As a result, our method improves PSNR performance by 0.34 - 0.51 dB on average over state-of-the-art methods on a large-scale dataset. We conclude that it is beneficial to incorporate in classification models. On the other hand, we also study how noise affects classification performance. In the end, we come to a number of interesting conclusions, some being counter-intuitive.

    04/05/2017 ∙ by Jiqing Wu, et al.
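
The PSNR gains quoted in the abstract above are in decibels; the standard definition, 10·log10(MAX² / MSE), can be sketched as follows. This is a minimal sketch: the function name, the flat-list image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration, and the paper's exact evaluation protocol is not reproduced.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel sequences.

    (Hypothetical helper; images are assumed to be flattened lists of numbers.)
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error between the clean reference and the denoised estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.34 to 0.51 dB corresponds to a reduction in mean squared error of roughly 8 to 11 percent.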

  • Building Deep Networks on Grassmann Manifolds

    Learning representations on Grassmann manifolds is popular in quite a few visual recognition tasks. In order to enable deep learning on Grassmann manifolds, this paper proposes a deep network architecture by generalizing the Euclidean network paradigm to Grassmann manifolds. In particular, we design full rank mapping layers to transform input Grassmannian data to more desirable ones, exploit re-orthonormalization layers to normalize the resulting matrices, study projection pooling layers to reduce the model complexity in the Grassmannian context, and devise projection mapping layers to respect Grassmannian geometry and meanwhile achieve Euclidean forms for regular output layers. To train the Grassmann networks, we exploit a stochastic gradient descent setting on manifolds of the connection weights, and study a matrix generalization of backpropagation to update the structured data. The evaluations on three visual recognition tasks show that our Grassmann networks have clear advantages over existing Grassmann learning methods, and achieve results comparable with state-of-the-art approaches.

    11/17/2016 ∙ by Zhiwu Huang, et al.
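
The re-orthonormalization step mentioned in the abstract above restores an orthonormal basis after a linear mapping, so that the result again represents a point on a Grassmann manifold. One way to illustrate the idea is modified Gram-Schmidt, sketched below. This is an assumption-laden sketch: the abstract does not state which factorization the layers use, and the function name and list-of-columns representation are hypothetical.

```python
import math


def reorthonormalize(columns, tol=1e-12):
    """Return an orthonormal basis spanning the same subspace as `columns`.

    `columns` is a list of equally long vectors (lists of floats).
    Uses modified Gram-Schmidt; illustrative only.
    """
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            # Remove the component of v along the already accepted direction q.
            coeff = sum(a * b for a, b in zip(v, q))
            v = [a - coeff * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm < tol:
            raise ValueError("columns are (numerically) linearly dependent")
        basis.append([a / norm for a in v])
    return basis
```

For example, `reorthonormalize([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])` returns two unit-length, mutually orthogonal vectors spanning the same plane as the inputs.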

  • Generic 3D Convolutional Fusion for image restoration

    Recently, exciting strides forward have been made in the area of image restoration, particularly for image denoising and single image super-resolution. Deep learning techniques contributed to this significantly. The top methods differ in their formulations and assumptions, so even if their average performance may be similar, some work better on certain image types and image regions than others. This complementarity motivated us to propose a novel 3D convolutional fusion (3DCF) method. Unlike other methods adapted to different tasks, our method uses the exact same convolutional network architecture to address both image denoising and single image super-resolution. As a result, our 3DCF method achieves substantial improvements (0.1dB-0.4dB PSNR) over the state-of-the-art methods that it fuses, on standard benchmarks for both tasks. At the same time, the method remains computationally efficient.

    07/26/2016 ∙ by Jiqing Wu, et al.

  • Manifold-valued Image Generation with Wasserstein Adversarial Networks

    Unsupervised image generation has recently received an increasing amount of attention thanks to the great success of generative adversarial networks (GANs), particularly Wasserstein GANs. Inspired by the paradigm of real-valued image generation, this paper makes the first attempt to formulate the problem of generating manifold-valued images, which are frequently encountered in real-world applications. For the study, we specifically explore three typical manifold-valued image generation tasks: hue-saturation-value (HSV) color image generation, chromaticity-brightness (CB) color image generation, and diffusion-tensor (DT) image generation. In order to produce such images as realistically as possible, we generalize the state-of-the-art technique of Wasserstein GANs to the manifold context by exploiting Riemannian geometry. For the proposed manifold-valued image generation problem, we recommend three benchmark datasets, namely CIFAR-10 HSV/CB color images, ImageNet HSV/CB color images, and UCL DT image datasets. On the three datasets, we experimentally demonstrate that the proposed manifold-aware Wasserstein GAN can generate high-quality manifold-valued images.

    12/05/2017 ∙ by Zhiwu Huang, et al.