An Entropy-Based Model for Hierarchical Learning
Machine learning is the dominant approach to artificial intelligence, through which computers learn from data and experience. In supervised learning, a computer can learn from data accurately and efficiently only if the learning model supplies it with auxiliary information about the data distribution and target function. This notion of auxiliary information relates to the concept of regularization in statistical learning theory. Real-world datasets commonly share two features: their data domains are multiscale, and their target functions are well-behaved and smooth. This paper proposes an entropy-based learning model that exploits this structure, and discusses its statistical and computational benefits. The hierarchical learning model is inspired by the logical, progressive, easy-to-hard way in which humans learn, and its levels are interpretable. The model apportions computational resources according to the complexity of data instances and target functions. This property can bring several benefits, including faster inference and computational savings when a model is trained for many users or when training is interrupted. We provide a statistical analysis of the learning mechanism using multiscale entropies and show that it can yield significantly stronger guarantees than uniform convergence bounds.