Effectively Unbiased FID and Inception Score and where to find them

11/16/2019
by Min Jin Chong, et al.

This paper shows that two commonly used evaluation metrics for generative models, the Fréchet Inception Distance (FID) and the Inception Score (IS), are biased: the expected value of the score computed from a finite sample set is not the true value of the score. Worse, the bias term depends on the particular model being evaluated, so model A may get a better score than model B simply because model A's bias term is smaller. This effect cannot be fixed by evaluating at a fixed number of samples, which means that all comparisons using FID or IS as currently computed are unreliable. We then show how to extrapolate the score to obtain an effectively bias-free estimate of the score computed with an infinite number of samples, which we term FID_∞ and IS_∞. This extrapolation in turn requires good estimates of the scores at finite sample sizes, and we show that Quasi-Monte Carlo integration notably improves estimates of FID and IS for finite sample sets. Our extrapolated scores are simple, drop-in replacements for the finite-sample scores. Additionally, we show that using low-discrepancy sequences in GAN training offers small improvements in the resulting generator.
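The extrapolation idea can be illustrated with a minimal sketch. It assumes the finite-sample bias shrinks roughly in proportion to 1/N, so that a straight-line fit of score against 1/N has the infinite-sample score as its intercept; the abstract does not spell out the fitting form, so treat that as an assumption of this sketch. The function names and the toy `biased_score` metric are hypothetical stand-ins, not real FID or IS computations.

```python
import numpy as np


def extrapolate_to_infinity(sample_sizes, scores):
    """Fit score ~ intercept + slope * (1/N) and return (intercept, slope).

    The intercept is the value of the fitted line at 1/N = 0,
    i.e. the extrapolated score for an infinite number of samples.
    """
    inv_n = 1.0 / np.asarray(sample_sizes, dtype=float)
    slope, intercept = np.polyfit(inv_n, np.asarray(scores, dtype=float), deg=1)
    return intercept, slope


def biased_score(n, rng, true_value=10.0, bias_coeff=500.0, noise=0.01):
    """Hypothetical stand-in for a metric whose bias decays as 1/n.

    In practice this would be FID or IS computed on n generated samples.
    """
    return true_value + bias_coeff / n + rng.normal(scale=noise)


rng = np.random.default_rng(0)

# Evaluate the (toy) metric at several sample sizes.
sample_sizes = np.linspace(5000, 50000, 15).astype(int)
scores = [biased_score(n, rng) for n in sample_sizes]

score_inf, slope = extrapolate_to_infinity(sample_sizes, scores)
print(f"score at largest N: {scores[-1]:.3f}")
print(f"extrapolated score: {score_inf:.3f}")
```

In this toy setup the slope plays the role of the model-dependent bias term: two models with the same infinite-sample score but different slopes would be ranked differently at any fixed N, whereas their intercepts would agree.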

