Reconciling the Discrete-Continuous Divide: Towards a Mathematical Theory of Sparse Communication

04/01/2021
by   André F. T. Martins, et al.
0

Neural networks and other machine learning models compute continuous representations, while humans communicate with discrete symbols. Reconciling these two forms of communication is desirable to generate human-readable interpretations or to learn discrete latent variable models, while maintaining end-to-end differentiability. Some existing approaches (such as the Gumbel-softmax transformation) build continuous relaxations that are discrete approximations in the zero-temperature limit, while others (such as sparsemax transformations and the hard concrete distribution) produce discrete/continuous hybrids. In this paper, we build rigorous theoretical foundations for these hybrids. Our starting point is a new "direct sum" base measure defined on the face lattice of the probability simplex. From this measure, we introduce a new entropy function that includes the discrete and differential entropies as particular cases, and has an interpretation in terms of code optimality, as well as two other information-theoretic counterparts that generalize the mutual information and Kullback-Leibler divergences. Finally, we introduce "mixed languages" as strings of hybrid symbols and a new mixed weighted finite state automaton that recognizes a class of regular mixed languages, generalizing closure properties of regular languages.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset