An Image is Worth More Than a Thousand Words: Towards Disentanglement in the Wild

by Aviv Gabbay et al.

Unsupervised disentanglement has been shown to be theoretically impossible without inductive biases on the models and the data. As an alternative, recent methods rely on limited supervision to disentangle the factors of variation and ensure their identifiability. While annotating the true generative factors is only required for a limited number of observations, we argue that it is infeasible to enumerate all the factors of variation that describe a real-world image distribution. To this end, we propose a method for disentangling a set of factors that are only partially labeled, as well as separating the complementary set of residual factors that are never explicitly specified. Our success in this challenging setting, demonstrated on synthetic benchmarks, motivates leveraging off-the-shelf image descriptors to partially annotate a subset of attributes in real image domains (e.g. human faces) with minimal manual effort. Specifically, we use a recent language-image embedding model (CLIP) to annotate a set of attributes of interest in a zero-shot manner and demonstrate state-of-the-art disentangled image manipulation results.
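The zero-shot annotation step described above can be sketched as follows. This is a hypothetical illustration, not the authors' exact pipeline: it assumes image and text-prompt embeddings have already been computed by a CLIP-like encoder (toy NumPy vectors stand in for them here), and assigns each attribute the label whose prompt embedding is closest in cosine similarity, optionally leaving it unlabeled when the decision is not clear-cut (partial annotation).

```python
import numpy as np

def zero_shot_annotate(image_emb, prompt_embs, labels, margin=0.0):
    """Assign the label whose text-prompt embedding is most similar to the
    image embedding (CLIP-style zero-shot classification). Returns None when
    the top prompt does not beat the runner-up by `margin`, so the attribute
    can be left unlabeled (partial annotation)."""
    # Normalize to unit length so dot products equal cosine similarities.
    img = image_emb / np.linalg.norm(image_emb)
    txt = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    sims = txt @ img
    sorted_sims = np.sort(sims)[::-1]
    if len(sims) > 1 and sorted_sims[0] - sorted_sims[1] < margin:
        return None  # ambiguous: leave this observation unannotated
    return labels[int(np.argmax(sims))]

# Toy embeddings standing in for CLIP outputs (hypothetical values).
rng = np.random.default_rng(0)
prompts = rng.normal(size=(2, 8))  # e.g. prompts for "blond hair" / "dark hair"
image = prompts[0] + 0.1 * rng.normal(size=8)  # image close to the first prompt
print(zero_shot_annotate(image, prompts, ["blond hair", "dark hair"]))
```

In practice the embeddings would come from a pretrained CLIP image encoder and text encoder applied to attribute prompts (e.g. "a photo of a person with blond hair"); the margin threshold is one simple way to realize the partial labeling the abstract describes.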



Related research

Disentangling Factors of Variation Using Few Labels

Learning disentangled representations is considered a cornerstone proble...

Disentangling Factors of Variation by Mixing Them

We propose an unsupervised approach to learn image representations that ...

Towards efficient representation identification in supervised learning

Humans have a remarkable ability to disentangle complex sensory inputs (...

Disentangling factors of variation in deep representations using adversarial training

We introduce a conditional generative model for learning to disentangle ...

Learning disentangled representations via product manifold projection

We propose a novel approach to disentangle the generative factors of var...

There and back again: Cycle consistency across sets for isolating factors of variation

Representational learning hinges on the task of unraveling the set of un...

JADE: Joint Autoencoders for Dis-Entanglement

The problem of feature disentanglement has been explored in the literatu...
