Distilling Knowledge from Object Classification to Aesthetics Assessment

by Jingwen Hou, et al.

In this work, we point out that the major dilemma of image aesthetics assessment (IAA) comes from the abstract nature of aesthetic labels: a vast variety of distinct contents can correspond to the same aesthetic label. On the one hand, during inference, the IAA model is required to relate various distinct contents to the same aesthetic label. On the other hand, during training, it is hard for the IAA model to learn to distinguish different contents merely from the supervision of aesthetic labels, since aesthetic labels are not directly related to any specific content. To deal with this dilemma, we propose to distill knowledge on semantic patterns for a vast variety of image contents from multiple pre-trained object classification (POC) models into an IAA model. We expect that the combination of multiple POC models provides sufficient knowledge on various image contents, so that the IAA model can more easily learn to relate various distinct contents to a limited number of aesthetic labels. By supervising an end-to-end single-backbone IAA model with the distilled knowledge, the performance of the IAA model is significantly improved by 4.8% in SRCC compared to the version trained only with ground-truth aesthetic labels. On specific categories of images, the SRCC improvement brought by the proposed method can reach up to 7.2%. Peer comparison also shows that our method outperforms 10 previous IAA methods.
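The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to combine feature distillation from several frozen teachers with a supervised score loss. The function names, the concatenation of teacher features, the use of mean squared error, and the `distill_weight` parameter are all assumptions made for illustration, not the authors' formulation.

```python
from typing import List, Sequence

Vector = List[float]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combine_teachers(teacher_features: Sequence[Sequence[float]]) -> Vector:
    """Concatenate feature vectors from several frozen POC models.

    Concatenation is an assumption; the abstract only says that knowledge
    from multiple POC models is combined.
    """
    combined: Vector = []
    for feats in teacher_features:
        combined.extend(feats)
    return combined


def distillation_objective(
    student_features: Sequence[float],
    teacher_features: Sequence[Sequence[float]],
    predicted_score: float,
    true_score: float,
    distill_weight: float = 1.0,  # hypothetical trade-off weight
) -> float:
    """Aesthetic regression loss plus a weighted feature-distillation loss."""
    target = combine_teachers(teacher_features)
    distill_loss = mse(student_features, target)
    aesthetic_loss = (predicted_score - true_score) ** 2
    return aesthetic_loss + distill_weight * distill_loss
```

In this sketch the student (the IAA model) is assumed to expose a feature vector with the same length as the concatenated teacher features, so that a single backbone is pushed toward the semantic patterns of all teachers while still being fitted to the aesthetic score.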



