How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model

by Leonardo Petrini, et al.

Learning generic high-dimensional tasks is notably hard, as it requires an amount of training data that grows exponentially with the dimension. Yet, deep convolutional neural networks (CNNs) have shown remarkable success in overcoming this challenge. A popular hypothesis is that learnable tasks are highly structured and that CNNs leverage this structure to build a low-dimensional representation of the data. However, little is known about how much training data they require, and how this number depends on the data structure. This paper answers this question for a simple classification task that seeks to capture relevant aspects of real data: the Random Hierarchy Model. In this model, each of the n_c classes corresponds to m synonymic compositions of high-level features, which are in turn composed of sub-features through an iterative process repeated L times. We find that the number of training data P^* required by deep CNNs to learn this task (i) grows asymptotically as n_c m^L, which is only polynomial in the input dimensionality; (ii) coincides with the training set size at which the representation of a trained network becomes invariant to exchanges of synonyms; (iii) corresponds to the number of data at which the correlations between low-level features and classes become detectable. Overall, our results indicate how deep CNNs can overcome the curse of dimensionality by building invariant representations, and provide an estimate of the number of data required to learn a task based on its hierarchically compositional structure.
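The generative process described in the abstract can be sketched in a few lines of code. The following is a minimal, hypothetical instance of such a hierarchical model, not the authors' reference implementation: it uses a single vocabulary of size n_c at every level, a branching factor s (so the input dimension is s^L), and for each symbol it draws m synonymous s-tuples of lower-level symbols at random. Sampling an input then means expanding a class label down L levels, picking one synonym uniformly at each step.

```python
import random

def make_rules(vocab_size, m, s, rng):
    # For each symbol, draw m synonymous s-tuples of lower-level symbols.
    return {sym: [tuple(rng.randrange(vocab_size) for _ in range(s))
                  for _ in range(m)]
            for sym in range(vocab_size)}

def sample_input(label, rules_per_level, rng):
    # Expand the class label down L levels: at each level, replace every
    # symbol with one of its m synonymous tuples, chosen uniformly.
    symbols = [label]
    for rules in rules_per_level:
        symbols = [low for sym in symbols
                   for low in rng.choice(rules[sym])]
    return symbols

rng = random.Random(0)
n_c, m, s, L = 4, 2, 2, 3   # hypothetical small instance
levels = [make_rules(n_c, m, s, rng) for _ in range(L)]
x = sample_input(label=1, rules_per_level=levels, rng=rng)
print(len(x))  # input dimension s**L = 8
```

Under this construction, each class admits on the order of m^((s^L - 1)/(s - 1)) distinct inputs, so memorization is hopeless and a learner must exploit the shared sub-feature structure; this is the setting in which the paper's sample-complexity estimate n_c m^L applies.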


