Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions

by   Mahyar Khayatkhoei, et al.

Precision and Recall are two prominent metrics of generative performance, which were proposed to separately measure the fidelity and diversity of generative models. Given their central role in comparing and improving generative models, understanding their limitations are crucially important. To that end, in this work, we identify a critical flaw in the common approximation of these metrics using k-nearest-neighbors, namely, that the very interpretations of fidelity and diversity that are assigned to Precision and Recall can fail in high dimensions, resulting in very misleading conclusions. Specifically, we empirically and theoretically show that as the number of dimensions grows, two model distributions with supports at equal point-wise distance from the support of the real distribution, can have vastly different Precision and Recall regardless of their respective distributions, hence an emergent asymmetry in high dimensions. Based on our theoretical insights, we then provide simple yet effective modifications to these metrics to construct symmetric metrics regardless of the number of dimensions. Finally, we provide experiments on real-world datasets to illustrate that the identified flaw is not merely a pathological case, and that our proposed metrics are effective in alleviating its impact.


Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models

Assessing the fidelity and diversity of the generative model is a diffic...

Reliable Fidelity and Diversity Metrics for Generative Models

Devising indicative evaluation metrics for the image generation task rem...

How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

Devising domain- and model-agnostic evaluation metrics for generative mo...

Barcode Method for Generative Model Evaluation driven by Topological Data Analysis

Evaluating the performance of generative models in image synthesis is a ...

Rarity Score : A New Metric to Evaluate the Uncommonness of Synthesized Images

Evaluation metrics in image synthesis play a key role to measure perform...

Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows

Achieving a balance between image quality (precision) and diversity (rec...

Evaluating Generative Models Using Divergence Frontiers

Despite the tremendous progress in the estimation of generative models, ...

Please sign up or login with your details

Forgot password? Click here to reset