K-means Clustering Based Feature Consistency Alignment for Label-free Model Evaluation

04/17/2023
by Shuyu Miao, et al.

Label-free model evaluation aims to predict a model's performance on various test sets without relying on ground-truth labels. Unlike classical supervised model evaluation, the main challenge is that the test data carry no labels. This paper presents our solution for the 1st DataCV Challenge of the Visual Dataset Understanding workshop at CVPR 2023. First, we propose a novel method called K-means Clustering Based Feature Consistency Alignment (KCFCA), which is tailored to handle distribution shifts across datasets. KCFCA uses the K-means algorithm to cluster the labeled training set and the unlabeled test sets, and then aligns the cluster centers by feature consistency. Second, we develop a dynamic regression model to capture the relationship between distribution shift and model accuracy. Third, we design an algorithm that discovers outlier model factors, eliminates the outlier models, and combines the strengths of multiple AutoEval models. On the DataCV Challenge leaderboard, our approach took 2nd place with an RMSE of 6.8526, a 36% reduction in RMSE relative to the best baseline method (6.8526 vs. 10.7378). Furthermore, our method achieves comparatively robust and strong single-model performance on the validation dataset.
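The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' KCFCA code. It makes several assumptions: features are plain tuples of floats, the "alignment" is approximated by the mean distance from each training cluster center to its nearest test cluster center, and the "dynamic regression model" is replaced by ordinary least squares on a single shift score. The function names `kmeans`, `center_shift`, and `fit_line` are hypothetical.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centers."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        # Assign every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def center_shift(train_centers, test_centers):
    """Mean distance from each training center to its closest test center.

    This is a stand-in for the paper's feature consistency alignment,
    whose exact form is not stated in the abstract.
    """
    return sum(
        min(math.dist(c, t) for t in test_centers) for c in train_centers
    ) / len(train_centers)


def fit_line(shifts, accuracies):
    """Ordinary least squares for accuracy = slope * shift + intercept."""
    mean_x = sum(shifts) / len(shifts)
    mean_y = sum(accuracies) / len(accuracies)
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(shifts, accuracies)
    ) / sum((x - mean_x) ** 2 for x in shifts)
    return slope, mean_y - slope * mean_x
```

In such a setup, one would compute `center_shift` between the training features and each labeled meta test set, fit `fit_line` on the resulting (shift, accuracy) pairs, and then predict accuracy on an unlabeled test set as `slope * shift + intercept`. The outlier-model elimination and model combination steps mentioned in the abstract are not covered here.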

