How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

02/17/2021
by Ahmed M. Alaa, et al.

Devising domain- and model-agnostic evaluation metrics for generative models is an important and as yet unresolved problem. Most existing metrics, which were tailored solely to the image synthesis setup, exhibit a limited capacity for diagnosing the different modes of failure of generative models across broader application domains. In this paper, we introduce a 3-dimensional evaluation metric, (α-Precision, β-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity. We introduce generalization as an additional, independent dimension (to the fidelity-diversity trade-off) that quantifies the extent to which a model copies training data, a crucial performance indicator when modeling sensitive data with requirements on privacy. The three metric components correspond to (interpretable) probabilistic quantities, and are estimated via sample-level binary classification. The sample-level nature of our metric inspires a novel use case which we call model auditing, wherein we judge the quality of individual samples generated by a (black-box) model, discarding low-quality samples and hence improving the overall model performance in a post-hoc manner.
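To make the three dimensions concrete, the following is a minimal sketch of a precision/recall/authenticity-style estimator using nearest-neighbor supports, together with a post-hoc auditing filter. This is not the authors' implementation: the α/β support-level construction and the classifier-based estimation from the paper are replaced here by a simple k-nearest-neighbor heuristic, and all function names are assumptions for illustration.

```python
import numpy as np

def knn_radii(X, k=5):
    # Distance from each point in X to its k-th nearest neighbor within X.
    # Column 0 of the sorted distance matrix is the self-distance (0),
    # so index k gives the k-th true neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]

def precision_recall_authenticity(real, synth, k=5):
    # Simplified sample-level metrics (a k-NN heuristic, not the paper's
    # alpha-Precision / beta-Recall estimator).
    r_real = knn_radii(real, k)
    r_synth = knn_radii(synth, k)
    d_rs = np.linalg.norm(real[:, None, :] - synth[None, :, :], axis=-1)

    # Fidelity: fraction of synthetic samples falling inside some
    # real sample's k-NN ball (an estimate of the real support).
    precision = np.mean((d_rs <= r_real[:, None]).any(axis=0))
    # Diversity: fraction of real samples covered by the synthetic support.
    recall = np.mean((d_rs <= r_synth[None, :]).any(axis=1))
    # Generalization: flag a synthetic sample as a near-copy if it sits
    # closer to some training point than that point's own nearest
    # training neighbor; authenticity is the non-copied fraction.
    nn_real = knn_radii(real, 1)
    copied = (d_rs < nn_real[:, None]).any(axis=0)
    authenticity = 1.0 - np.mean(copied)
    return precision, recall, authenticity

def audit(real, synth, k=5):
    # Post-hoc model auditing: keep only synthetic samples that are both
    # realistic (inside the estimated real support) and authentic
    # (not a near-copy of a training point).
    r_real = knn_radii(real, k)
    nn_real = knn_radii(real, 1)
    d_rs = np.linalg.norm(real[:, None, :] - synth[None, :, :], axis=-1)
    realistic = (d_rs <= r_real[:, None]).any(axis=0)
    copied = (d_rs < nn_real[:, None]).any(axis=0)
    return synth[realistic & ~copied]
```

Scoring each synthetic sample individually, rather than only the pooled distributions, is what enables the auditing use case: low-quality or memorized samples can be discarded without retraining the (black-box) generator.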

Related research

09/04/2023 · Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models
Assessing the fidelity and diversity of the generative model is a diffic...

06/27/2023 · Learning from Invalid Data: On Constraint Satisfaction in Generative Models
Generative models have demonstrated impressive results in vision, langua...

08/31/2023 · Unsupervised evaluation of GAN sample quality: Introducing the TTJac Score
Evaluation metrics are essential for assessing the performance of genera...

06/16/2023 · Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions
Precision and Recall are two prominent metrics of generative performance...

05/26/2019 · Evaluating Generative Models Using Divergence Frontiers
Despite the tremendous progress in the estimation of generative models, ...

06/04/2021 · Barcode Method for Generative Model Evaluation driven by Topological Data Analysis
Evaluating the performance of generative models in image synthesis is a ...

01/10/2020 · Towards GAN Benchmarks Which Require Generalization
For many evaluation metrics commonly used as benchmarks for unconditiona...
