
Reconciling the Discrete-Continuous Divide: Towards a Mathematical Theory of Sparse Communication

by André F. T. Martins, et al.

Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols. Reconciling these two forms of communication is desirable to generate human-readable interpretations or to learn discrete latent variable models, while maintaining end-to-end differentiability. Some existing approaches (such as the Gumbel-softmax transformation) build continuous relaxations that are discrete approximations in the zero-temperature limit, while others (such as sparsemax transformations and the hard concrete distribution) produce discrete/continuous hybrids. In this paper, we build rigorous theoretical foundations for these hybrids. Our starting point is a new "direct sum" base measure defined on the face lattice of the probability simplex. From this measure, we introduce a new entropy function that includes the discrete and differential entropies as particular cases, and has an interpretation in terms of code optimality, as well as two other information-theoretic counterparts that generalize the mutual information and Kullback-Leibler divergences. Finally, we introduce "mixed languages" as strings of hybrid symbols and a new mixed weighted finite state automaton that recognizes a class of regular mixed languages, generalizing closure properties of regular languages.
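To make the distinction concrete: the softmax transformation always assigns strictly positive probability to every symbol, whereas sparsemax can place probability mass exactly on a face of the simplex, zeroing out some coordinates entirely — this is what makes it a discrete/continuous hybrid. The sketch below (illustrative only, not code from the paper) implements sparsemax as Euclidean projection onto the probability simplex, following the standard sort-and-threshold formulation:

```python
import numpy as np

def sparsemax(z):
    """Project z onto the probability simplex (sparsemax transformation).

    Unlike softmax, the result can have exact zeros, i.e. it may lie
    on a proper face of the simplex.
    """
    z_sorted = np.sort(z)[::-1]              # sort scores in decreasing order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1.0 + k * z_sorted > cumsum    # coordinates kept in the support
    k_max = k[support][-1]                   # size of the support set
    tau = (cumsum[k_max - 1] - 1.0) / k_max  # threshold so the output sums to 1
    return np.maximum(z - tau, 0.0)

scores = np.array([2.0, 1.0, 0.1])
print(sparsemax(scores))  # exact zeros on low-scoring symbols
```

For `scores = [2.0, 1.0, 0.1]`, sparsemax returns `[1.0, 0.0, 0.0]` — a vertex of the simplex — while softmax of the same scores would keep all three entries positive; for closer scores such as `[1.0, 1.0]` it returns the interior point `[0.5, 0.5]`. These face-of-the-simplex outputs are exactly the hybrid objects the paper's "direct sum" base measure is designed to handle.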

