Tractability from overparametrization: The example of the negative perceptron

by   Andrea Montanari, et al.

In the negative perceptron problem we are given n data points ( x_i,y_i), where x_i is a d-dimensional vector and y_i∈{+1,-1} is a binary label. The data are not linearly separable and hence we content ourselves to find a linear classifier with the largest possible negative margin. In other words, we want to find a unit norm vector θ that maximizes min_i≤ ny_i⟨θ, x_i⟩. This is a non-convex optimization problem (it is equivalent to finding a maximum norm vector in a polytope), and we study its typical properties under two random models for the data. We consider the proportional asymptotics in which n,d→∞ with n/d→δ, and prove upper and lower bounds on the maximum margin κ_s(δ) or – equivalently – on its inverse function δ_s(κ). In other words, δ_s(κ) is the overparametrization threshold: for n/d≤δ_s(κ)-ε a classifier achieving vanishing training error exists with high probability, while for n/d≥δ_s(κ)+ε it does not. Our bounds on δ_s(κ) match to the leading order as κ→ -∞. We then analyze a linear programming algorithm to find a solution, and characterize the corresponding threshold δ_lin(κ). We observe a gap between the interpolation threshold δ_s(κ) and the linear programming threshold δ_lin(κ), raising the question of the behavior of other algorithms.



There are no comments yet.


page 13


The generalization error of max-margin linear classifiers: High-dimensional asymptotics in the overparametrized regime

Modern machine learning models are often so complex that they achieve va...

On the integrality gap of the maximum-cut semidefinite programming relaxation in fixed dimension

We describe a factor-revealing convex optimization problem for the integ...

Error constant estimation under the maximum norm for linear Lagrange interpolation

For the Lagrange interpolation over a triangular domain, we propose an e...

Tropical Support Vector Machine and its Applications to Phylogenomics

Most data in genome-wide phylogenetic analysis (phylogenomics) is essent...

On the consistency of Multithreshold Entropy Linear Classifier

Multithreshold Entropy Linear Classifier (MELC) is a recent classifier i...

Margins, Kernels and Non-linear Smoothed Perceptrons

We focus on the problem of finding a non-linear classification function ...

Distribution-Independent Evolvability of Linear Threshold Functions

Valiant's (2007) model of evolvability models the evolutionary process o...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.