Learning curves for the multi-class teacher-student perceptron

03/22/2022
by   Elisabetta Cornacchia, et al.

One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with the single-layer teacher-student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal estimation and empirical risk minimisation (ERM) have been extensively analysed in this setting. At the same time, a considerable part of modern machine learning practice concerns multi-class classification, yet an analogous analysis for the corresponding multi-class teacher-student perceptron was missing. In this manuscript we fill this gap by deriving and evaluating asymptotic expressions for both the Bayes-optimal and ERM generalisation errors in the high-dimensional regime. For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality. In particular, we observe that regularised cross-entropy minimisation yields close-to-optimal accuracy. For a binary teacher, by contrast, we show that a first-order phase transition arises in the Bayes-optimal performance.
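The setting described above can be illustrated with a small finite-size simulation. The following is a minimal sketch, not the manuscript's asymptotic analysis: it assumes the teacher assigns each input the class with the largest linear output (an argmax rule, which the abstract does not spell out), draws Gaussian teacher weights and i.i.d. Gaussian inputs, and fits the student by plain gradient descent on ridge-regularised cross-entropy. All function names, the step size, and the regularisation strength are hypothetical choices made for illustration.

```python
import numpy as np


def sample_teacher_data(n, W_teacher, rng):
    """Draw n i.i.d. Gaussian inputs and label them with the teacher.

    Assumption: the label is the argmax of the teacher's linear outputs.
    """
    d = W_teacher.shape[1]
    X = rng.standard_normal((n, d))
    y = np.argmax(X @ W_teacher.T, axis=1)
    return X, y


def softmax(Z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def fit_ridge_cross_entropy(X, y, k, lam=1e-3, lr=0.5, steps=300):
    """Gradient descent on mean cross-entropy + (lam / 2) * ||W||^2."""
    n, d = X.shape
    W = np.zeros((k, d))
    Y = np.eye(k)[y]  # one-hot labels, shape (n, k)
    for _ in range(steps):
        P = softmax(X @ W.T)
        grad = (P - Y).T @ X / n + lam * W
        W -= lr * grad
    return W


def generalisation_error(W_student, W_teacher, rng, n_test=20000):
    """Monte Carlo estimate of the misclassification rate on fresh inputs."""
    X, y = sample_teacher_data(n_test, W_teacher, rng)
    y_hat = np.argmax(X @ W_student.T, axis=1)
    return float(np.mean(y_hat != y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 50, 3
    W_teacher = rng.standard_normal((k, d))  # Gaussian teacher weights
    for alpha in (1, 5, 20):  # alpha = n / d, samples per input dimension
        X, y = sample_teacher_data(alpha * d, W_teacher, rng)
        W_student = fit_ridge_cross_entropy(X, y, k)
        err = generalisation_error(W_student, W_teacher, rng)
        print(f"alpha={alpha:3d}  estimated generalisation error={err:.3f}")
```

Plotting the estimated error against the sample ratio n/d gives an empirical learning curve. Such finite-size estimates are noisy and depend on the optimiser settings, so they should be read as a rough numerical illustration of the setup rather than as a reproduction of the asymptotic expressions derived in the manuscript.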
