Deep learning-based methods have achieved remarkable success in image restoration and enhancement, but are they still competitive when there is a lack of paired training data? As one such example, this paper explores the low-light image enhancement problem, where in practice it is extremely challenging to simultaneously take a low-light and a normal-light photo of the same visual scene. We propose a highly effective unsupervised generative adversarial network, dubbed EnlightenGAN, that can be trained without low/normal-light image pairs, yet proves to generalize very well on various real-world test images. Instead of supervising the learning using ground truth data, we propose to regularize the unpaired training using the information extracted from the input itself, and benchmark a series of innovations for the low-light image enhancement problem, including a global-local discriminator structure, a self-regularized perceptual loss fusion, and attention mechanism. Through extensive experiments, our proposed approach outperforms recent methods under a variety of metrics in terms of visual quality and subjective user study. Thanks to the great flexibility brought by unpaired training, EnlightenGAN is demonstrated to be easily adaptable to enhancing real-world images from various domains. The code is available at <https://github.com/yueruchen/EnlightenGAN>
06/17/2019 ∙ by Yifan Jiang, et al. ∙ 26 ∙ share
State-of-the-art pedestrian detection models have achieved great success in many benchmarks. However, these models require lots of annotation information and the labeling process usually takes much time and efforts. In this paper, we propose a method to generate labeled pedestrian data and adapt them to support the training of pedestrian detectors. The proposed framework is built on the Generative Adversarial Network (GAN) with multiple discriminators, trying to synthesize realistic pedestrians and learn the background context simultaneously. To handle the pedestrians of different sizes, we adopt the Spatial Pyramid Pooling (SPP) layer in the discriminator. We conduct experiments on two benchmarks. The results show that our framework can smoothly synthesize pedestrians on background images of variations and different levels of details. To quantitatively evaluate our approach, we add the generated samples into training data of the baseline pedestrian detectors and show the synthetic images are able to improve the detectors' performance.
04/05/2018 ∙ by Xi Ouyang, et al. ∙ 0 ∙ share
Yifan Jiangis this you? claim profile