Towards Fine-grained Image Classification with Generative Adversarial Networks and Facial Landmark Detection

08/28/2021
by Mahdi Darvish, et al.

Fine-grained classification remains a challenging task because distinguishing between categories requires learning complex, local differences. Diversity in the pose, scale, and position of objects in an image makes the problem even more difficult. Although recent Vision Transformer models achieve high performance, they need a large volume of input data. To counter this problem, we used GAN-based data augmentation to generate extra dataset instances. Oxford-IIIT Pets was our dataset of choice for this experiment. It consists of 37 breeds of cats and dogs with variations in scale, pose, and lighting, which increases the difficulty of the classification task. Furthermore, we enhanced the performance of a recent Generative Adversarial Network (GAN), the StyleGAN2-ADA model, so that it generates more realistic images while preventing overfitting to the training set. We did this by training a customized version of MobileNetV2 to predict animal facial landmarks and then cropping the images accordingly. Lastly, we combined the synthetic images with the original dataset and compared our proposed method with standard GAN augmentation and with no augmentation, using different subsets of the training data. We validated our work by evaluating the accuracy of fine-grained image classification with the recent Vision Transformer (ViT) model.
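
The landmark-based cropping step can be illustrated with a minimal sketch. The abstract does not specify how crops are derived from the predicted landmarks, so everything below is an assumption: the function names `landmark_crop_box` and `crop_rows`, the `margin` parameter, and the choice of a padded bounding box around the landmarks are hypothetical, and the landmark coordinates are taken as given rather than predicted by a MobileNetV2 model.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]          # (x, y) in pixel coordinates
Box = Tuple[int, int, int, int]      # (left, top, right, bottom), right/bottom exclusive


def landmark_crop_box(
    landmarks: Sequence[Point],
    image_size: Tuple[int, int],
    margin: float = 0.25,
) -> Box:
    """Return a padded bounding box around the landmarks, clamped to the image.

    Hypothetical helper: `margin` is the fraction of the landmark extent
    added on each side. The paper's actual cropping rule is not given
    in the abstract.
    """
    if not landmarks:
        raise ValueError("at least one landmark is required")
    width, height = image_size
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    # Expand the tight box by the padding, then clamp to the image bounds.
    left = max(0, math.floor(min(xs) - pad_x))
    top = max(0, math.floor(min(ys) - pad_y))
    right = min(width, math.ceil(max(xs) + pad_x))
    bottom = min(height, math.ceil(max(ys) + pad_y))
    return left, top, right, bottom


def crop_rows(image: List[List[int]], box: Box) -> List[List[int]]:
    """Crop an image stored as a list of pixel rows to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Three made-up landmarks (two eyes and a nose) on a 100x100 image.
    face = [(40.0, 50.0), (80.0, 50.0), (60.0, 90.0)]
    box = landmark_crop_box(face, image_size=(100, 100))
    image = [[0] * 100 for _ in range(100)]
    cropped = crop_rows(image, box)
    print(box, len(cropped), len(cropped[0]))
```

In a real pipeline the same box would typically be passed to an image library's crop routine before the cropped images are used to train the GAN; the list-of-rows image here only keeps the sketch dependency-free.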


research
07/06/2018

Adversarial Learning for Fine-grained Image Search

Fine-grained image search is still a challenging problem due to the diff...
research
04/22/2022

Reinforcing Generated Images via Meta-learning for One-Shot Fine-Grained Visual Recognition

One-shot fine-grained visual recognition often suffers from the problem ...
research
03/29/2017

CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training

We present variational generative adversarial networks, a general learni...
research
08/17/2022

Conviformers: Convolutionally guided Vision Transformer

Vision transformers are nowadays the de-facto preference for image class...
research
11/01/2018

CariGAN: Caricature Generation through Weakly Paired Adversarial Learning

Caricature generation is an interesting yet challenging task. The primar...
research
10/25/2019

Data Augmentation for Skin Lesion using Self-Attention based Progressive Generative Adversarial Network

Deep Neural Networks (DNNs) show a significant impact on medical imaging...
research
07/20/2023

Comparison between transformers and convolutional models for fine-grained classification of insects

Fine-grained classification is challenging due to the difficulty of find...
