Toward a Generalization Metric for Deep Generative Models

11/02/2020
by Hoang Thanh-Tung, et al.

Measuring the generalization capacity of Deep Generative Models (DGMs) is difficult because of the curse of dimensionality. Evaluation metrics for DGMs such as Inception Score, Fréchet Inception Distance, Precision-Recall, and Neural Net Divergence (NND) try to estimate the distance between the generated distribution and the target distribution using a polynomial number of samples. Researchers use these metrics as targets when designing new models. Despite the claims made for them, it is still unclear how well they measure the generalization capacity of a model. In this paper, we investigate how well these metrics measure generalization. We introduce a framework for comparing the robustness of evaluation metrics. We show that better scores on these metrics do not imply better generalization: the metrics can be fooled easily by a generator that memorizes a small subset of the training set. We propose a fix to the NND metric that makes it more robust to noise in the generated data.
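
The memorization failure mode described in the abstract can be illustrated with a toy sketch. This is not the paper's experiment. It assumes synthetic 8-dimensional Gaussian "features" in place of Inception features, a Fréchet distance computed on those raw features, and two hypothetical generators: one that only replays 500 memorized training points, and one that draws fresh samples from a slightly shifted distribution. The names `frechet_distance` and `novel_fraction` are introduced here for illustration only.

```python
import numpy as np


def frechet_distance(x, y):
    """Frechet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # tr(sqrt(cov_x @ cov_y)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # positive semi-definite inputs (clipped to absorb rounding error).
    eig = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)


def novel_fraction(generated, train):
    """Fraction of generated rows that are not exact copies of a training row."""
    train_rows = {row.tobytes() for row in train}
    return float(np.mean([row.tobytes() not in train_rows for row in generated]))


rng = np.random.default_rng(0)
dim = 8  # assumed toy feature dimension

# Stand-in for features of the training set.
train = rng.standard_normal((5000, dim))

# Hypothetical memorizing generator: replays points from a 500-sample subset.
subset = train[:500]
memorized = subset[rng.integers(0, len(subset), size=5000)]

# Hypothetical imperfect generator: fresh samples from a shifted distribution.
shifted = rng.standard_normal((5000, dim)) + 0.3

fd_memorized = frechet_distance(memorized, train)
fd_shifted = frechet_distance(shifted, train)

print("memorizer: distance", fd_memorized, "novel", novel_fraction(memorized, train))
print("shifted:   distance", fd_shifted, "novel", novel_fraction(shifted, train))
```

In this setup the memorizing generator obtains the lower (better) distance even though none of its samples are new, which is the kind of gap between score and generalization that the abstract points to. The sizes, the shift, and the feature dimension are arbitrary choices, and the result should not be read as a statement about any specific metric implementation.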
