Disentangling Semantic-to-visual Confusion for Zero-shot Learning

06/16/2021
by Zihan Ye, et al.

Using generative models to synthesize visual features from semantic distributions has been one of the most popular approaches to zero-shot learning (ZSL) image classification in recent years. The triplet loss (TL) is widely used to generate realistic visual distributions from semantics by automatically searching for discriminative representations. However, the traditional TL cannot find reliable disentangled representations for unseen classes, because unseen classes are unavailable in ZSL. To alleviate this drawback, we propose a multi-modal triplet loss (MMTL) that uses multi-modal information to search for a disentangled representation space. In this space all classes can interact, which benefits the learning of disentangled class representations. Furthermore, we develop a novel model, the Disentangling Class Representation Generative Adversarial Network (DCR-GAN), which exploits the disentangled representations in the training, feature synthesis, and final recognition stages. Benefiting from the disentangled representations, DCR-GAN can fit a more realistic distribution over both seen and unseen features. Extensive experiments show that the proposed model outperforms state-of-the-art methods on four benchmark datasets. Our code is available at https://github.com/FouriYe/DCRGAN-TMM.
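
The abstract does not give the form of MMTL, so the following is only a background sketch of the conventional triplet loss that it extends, not the authors' method. The Euclidean distance, the default margin of 1.0, and the function names `euclidean` and `triplet_loss` are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Conventional triplet loss (hinge form, assumed here).

    The loss is zero once the anchor is closer to the positive
    (same class) than to the negative (different class) by at
    least `margin`; otherwise it grows linearly with the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D features, for illustration only.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 3.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Because every triplet here is built from samples of classes seen in training, a loss of this form places no direct constraint on unseen classes, which is the limitation the abstract attributes to the traditional TL.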

research
08/01/2018

Multi-modal Cycle-consistent Generalized Zero-Shot Learning

In generalized zero shot learning (GZSL), the set of classes are split i...
research
04/15/2019

SR-GAN: Semantic Rectifying Generative Adversarial Network for Zero-shot Learning

The existing Zero-Shot learning (ZSL) methods may suffer from the vague ...
research
07/12/2019

Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning

Generalized zero-shot learning (GZSL) is a challenging class of vision a...
research
09/21/2019

CANZSL: Cycle-Consistent Adversarial Networks for Zero-Shot Learning from Natural Language

Existing methods using generative adversarial approaches for Zero-Shot L...
research
09/13/2021

Conditional MoCoGAN for Zero-Shot Video Generation

We propose a conditional generative adversarial network (GAN) model for ...
research
11/06/2022

Distilling Representations from GAN Generator via Squeeze and Span

In recent years, generative adversarial networks (GANs) have been an act...
research
07/04/2023

Disentanglement in a GAN for Unconditional Speech Synthesis

Can we develop a model that can synthesize realistic speech directly fro...
