Leveraging Contaminated Datasets to Learn Clean-Data Distribution with Purified Generative Adversarial Networks

02/03/2023
by Bowen Tian, et al.

Generative adversarial networks (GANs) are known for their strong ability to capture the underlying distribution of training instances. Since the seminal GAN paper, many variants have been proposed. However, almost all existing GANs rest on the assumption that the training dataset is clean. In many real-world applications this assumption does not hold: the training dataset may be contaminated by a proportion of undesired instances. When trained on such datasets, existing GANs learn a mixture distribution of desired and contaminated instances rather than the distribution of the desired data alone (the target distribution). To learn the target distribution from contaminated datasets, two purified generative adversarial networks (PuriGANs) are developed, in which the discriminators are augmented with the capability to distinguish between target and contaminated instances by leveraging an extra dataset composed solely of contamination instances. We prove that, under some mild conditions, the proposed PuriGANs are guaranteed to converge to the distribution of desired instances. Experimental results on several datasets demonstrate that, when trained on contaminated datasets, the proposed PuriGANs generate much better images from the desired distribution than comparable baselines. We also demonstrate the usefulness of PuriGAN in downstream applications by applying it to semi-supervised anomaly detection on contaminated datasets and to PU-learning. Experimental results show that PuriGAN delivers the best performance among comparable baselines on both tasks.
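
The mixture effect described in the abstract can be illustrated with a toy one-dimensional sketch. This is not the PuriGAN method and involves no GAN at all; the two Gaussians, the 20% contamination rate, and the function name `sample_contaminated` are all assumptions chosen for illustration. It only shows that a statistic estimated from the whole contaminated dataset reflects the mixture rather than the target distribution.

```python
import random
import statistics


def sample_contaminated(n, contamination_rate, seed=0):
    """Draw n points from a mixture of a target and a contamination distribution.

    Illustrative assumptions (not from the paper):
      target        ~ Gaussian(mean=0, std=1)
      contamination ~ Gaussian(mean=5, std=1)
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        if rng.random() < contamination_rate:
            samples.append(rng.gauss(5.0, 1.0))  # undesired instance
        else:
            samples.append(rng.gauss(0.0, 1.0))  # desired (target) instance
    return samples


# A dataset in which roughly 20% of the instances are contaminated.
data = sample_contaminated(n=20000, contamination_rate=0.2)

# The empirical mean tracks the mixture mean (0.8 * 0 + 0.2 * 5 = 1.0),
# not the target mean of 0.
mixture_mean = statistics.fmean(data)
print(f"mean of contaminated data: {mixture_mean:.3f}")
```

Any model fitted to `data` as a whole would face the same shift, which is the motivation the abstract gives for using a separate contamination-only dataset to tell the two kinds of instance apart.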

research
05/29/2019

KG-GAN: Knowledge-Guided Generative Adversarial Networks

Generative adversarial networks (GANs) learn to mimic training data that...
research
06/28/2021

Non-Exhaustive Learning Using Gaussian Mixture Generative Adversarial Networks

Supervised learning, while deployed in real-life scenarios, often encoun...
research
02/18/2018

RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks

Training complex machine learning models for prediction often requires a...
research
02/12/2019

Learning Generative Models of Structured Signals from Their Superposition Using GANs with Application to Denoising and Demixing

Recently, Generative Adversarial Networks (GANs) have emerged as a popul...
research
10/06/2021

GANtron: Emotional Speech Synthesis with Generative Adversarial Networks

Speech synthesis is used in a wide variety of industries. Nonetheless, i...
research
10/21/2019

Mining GOLD Samples for Conditional GANs

Conditional generative adversarial networks (cGANs) have gained a consid...
research
11/21/2017

A generative adversarial framework for positive-unlabeled classification

In this work, we consider the task of classifying the binary positive-un...
