An Analytical Theory of Curriculum Learning in Teacher-Student Networks

by Luca Saglietti, et al.

In humans and animals, curriculum learning (presenting data in a curated order) is critical to rapid learning and effective pedagogy. Yet in machine learning, curricula are not widely used and empirically often yield only moderate benefits. This stark difference in the importance of curriculum raises a fundamental theoretical question: when and why does curriculum learning help? In this work, we analyse a prototypical neural network model of curriculum learning in the high-dimensional limit, employing statistical physics methods. Curricula could in principle change both the learning speed and the asymptotic performance of a model. To study the former, we provide an exact description of the online learning setting, confirming the long-standing experimental observation that curricula can modestly speed up learning. To study the latter, we derive performance in a batch learning setting, in which a network trains to convergence in successive phases of learning on dataset slices of varying difficulty. With standard training losses, curriculum does not provide a generalisation benefit, in line with empirical observations. However, we show that by connecting different learning phases through simple Gaussian priors, curriculum can yield a large improvement in test performance. Taken together, our reduced analytical descriptions help reconcile apparently conflicting empirical results and trace regimes where curriculum learning yields the largest gains. More broadly, our results suggest that fully exploiting a curriculum may require explicit changes to the loss function at curriculum boundaries.
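
As a rough illustration of what "connecting different learning phases through simple Gaussian priors" could look like, the sketch below writes a phase-wise training objective. The notation is assumed for illustration and is not taken from the paper: $D_k$ is the dataset slice used in phase $k$, $\ell$ is a per-sample training loss, $w^{(k)}$ is the weight vector at convergence of phase $k$, and $\gamma_k$ is a hypothetical coupling strength. The only fact used is that the negative log-density of a Gaussian $\mathcal{N}(w;\, w^{(k-1)}, \gamma_k^{-1} I)$ is a quadratic penalty around its mean, up to an additive constant.

```latex
% Hypothetical notation: D_k, \ell, w^{(k)}, \gamma_k are illustrative assumptions.
\begin{align}
  w^{(k)} &= \arg\min_{w}
    \left[ \sum_{\mu \in D_k} \ell\!\left(w;\, x^{\mu}, y^{\mu}\right)
    + \frac{\gamma_k}{2} \left\lVert w - w^{(k-1)} \right\rVert^{2} \right],
    \qquad w^{(0)} = 0, \\
  -\log \mathcal{N}\!\left(w;\, w^{(k-1)}, \gamma_k^{-1} I\right)
    &= \frac{\gamma_k}{2} \left\lVert w - w^{(k-1)} \right\rVert^{2} + \text{const}.
\end{align}
```

Under these assumptions, phase 1 reduces to ordinary L2-regularised training, while each later phase is pulled toward the solution found on the previous slice, which is one way the loss function could change at a curriculum boundary.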



Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks

Our first contribution in this paper is a theoretical investigation of c...

Understanding Self-Paced Learning under Concave Conjugacy Theory

By simulating the easy-to-hard learning manners of humans/animals, the l...

On the Statistical Benefits of Curriculum Learning

Curriculum learning (CL) is a commonly used machine learning training st...

Analyzing Curriculum Learning for Sentiment Analysis along Task Difficulty, Pacing and Visualization Axes

While Curriculum Learning (CL) has recently gained traction in Natural l...

Curriculum Knowledge Switching for Pancreas Segmentation

Pancreas segmentation is challenging due to the small proportion and hig...

Theory of Curriculum Learning, with Convex Loss Functions

Curriculum Learning - the idea of teaching by gradually exposing the lea...

An Empirical Comparison of Syllabuses for Curriculum Learning

Syllabuses for curriculum learning have been developed on an ad-hoc, per...
