Learning Extremal Representations with Deep Archetypal Analysis

Archetypes are typical population representatives in an extremal sense, where typicality is understood as the most extreme manifestation of a trait or feature. In linear feature space, archetypes approximate the convex hull of the data, allowing all data points to be expressed as convex mixtures of archetypes. However, it might not always be possible to identify meaningful archetypes in a given feature space. Learning an appropriate feature space and identifying suitable archetypes simultaneously addresses this problem. This paper introduces a generative formulation of the linear archetype model, parameterized by neural networks. By introducing the distance-dependent archetype loss, the linear archetype model can be integrated into the latent space of a variational autoencoder, and an optimal representation with respect to the unknown archetypes can be learned end-to-end. Reformulating linear Archetypal Analysis as a deep variational information bottleneck allows the incorporation of arbitrarily complex side information during training. Furthermore, an alternative prior, based on a modified Dirichlet distribution, is proposed. The real-world applicability of the proposed method is demonstrated by exploring archetypes of female facial expressions while using multi-rater based emotion scores of these expressions as side information. A second application illustrates the exploration of the chemical space of small organic molecules. In this experiment, it is demonstrated that exchanging the side information while keeping the same set of molecules, e.g. using the heat capacity of each molecule as side information instead of the band gap energy, results in the identification of different archetypes. As an application, these learned representations of chemical space might reveal distinct starting points for de novo molecular design.
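
The convex-mixture idea behind the linear archetype model can be illustrated with a minimal sketch. This is not the deep model proposed in the paper: the three archetypes, the 2-D feature space, and the variable names are hypothetical, and the standard Dirichlet distribution is used only as a convenient way to sample valid mixture weights, not as the modified Dirichlet prior mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical archetypes: the three corners of a triangle in a 2-D feature space.
archetypes = np.array([[0.0, 0.0],
                       [1.0, 0.0],
                       [0.0, 1.0]])

# Mixture weights: non-negative, each row sums to 1 (assumption: standard Dirichlet).
weights = rng.dirichlet(alpha=np.ones(3), size=5)

# Each data point is a convex mixture of the archetypes,
# so it lies inside the convex hull spanned by them.
points = weights @ archetypes
```

Because the weights are non-negative and sum to one, every generated point falls inside the triangle whose corners are the archetypes; a point that puts all of its weight on one archetype coincides with that archetype.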


