Exact high-dimensional asymptotics for support vector machine

05/13/2019
by   Haoyang Liu, et al.

Support vector machine (SVM) is one of the most widely used classification methods. In this paper, we consider the soft-margin support vector machine applied to data points with independent features, where the sample size n and the feature dimension p grow to ∞ in a fixed ratio p/n→δ. We propose a set of equations that exactly characterizes the asymptotic behavior of the support vector machine. In particular, we give exact formulas for (1) the variability of the optimal coefficients, (2) the proportion of data points lying on the margin boundary (i.e. the number of support vectors), (3) the final objective function value, and (4) the expected misclassification error on new data points, which in turn implies an exact formula for the optimal tuning parameter given a data-generating mechanism. The global null case is considered first, where the label y∈{+1,-1} is independent of the feature x. Then the signaled case is considered, where the label y∈{+1,-1} is allowed to have a general dependence on the feature x through a linear combination a_0^Tx. These results for the non-smooth hinge loss serve as an analogue to the recent results of Sur and Candès (2018) for the smooth logistic loss. Our approach is based on heuristic leave-one-out calculations.
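The global null setting described in the abstract can be explored empirically with a small simulation. The sketch below is not the paper's leave-one-out derivation and does not reproduce its equations. It only fits a soft-margin linear SVM on Gaussian features with random labels and reports finite-sample versions of the four quantities listed above. Several things here are assumptions made for illustration: the objective scaling (λ/2·‖w‖² plus the mean hinge loss), the full-batch subgradient solver with step size 1/(λt), the Gaussian feature distribution, and every name and default value (`simulate_null_svm`, `lam`, `iters`, and so on).

```python
import random


def hinge_objective(w, X, y, lam):
    """Assumed objective: lam/2 * ||w||^2 + mean hinge loss over the sample."""
    reg = 0.5 * lam * sum(wj * wj for wj in w)
    loss = sum(
        max(0.0, 1.0 - yi * sum(a * b for a, b in zip(xi, w)))
        for xi, yi in zip(X, y)
    ) / len(y)
    return reg + loss


def simulate_null_svm(n=120, p=30, lam=0.1, iters=150, n_test=2000, seed=0):
    """Fit a linear soft-margin SVM under the global null (y independent of x).

    Here p / n plays the role of the ratio delta; all defaults are illustrative.
    """
    rng = random.Random(seed)
    # Independent standard Gaussian features (an assumed distribution).
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Global null: labels are fair coin flips, independent of the features.
    y = [rng.choice((-1, 1)) for _ in range(n)]

    # Full-batch subgradient descent with step 1 / (lam * t).
    w = [0.0] * p
    for t in range(1, iters + 1):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            if yi * sum(a * b for a, b in zip(xi, w)) < 1.0:
                for j in range(p):
                    grad[j] -= yi * xi[j] / n
        step = 1.0 / (lam * t)
        w = [wj - step * gj for wj, gj in zip(w, grad)]

    # Proxy for the support-vector proportion: points on or inside the margin.
    margins = [yi * sum(a * b for a, b in zip(xi, w)) for xi, yi in zip(X, y)]
    frac_margin_le_1 = sum(m <= 1.0 for m in margins) / n

    # Misclassification rate on fresh data drawn from the same null model.
    errors = 0
    for _ in range(n_test):
        x_new = [rng.gauss(0.0, 1.0) for _ in range(p)]
        y_new = rng.choice((-1, 1))
        pred = 1 if sum(a * b for a, b in zip(x_new, w)) >= 0.0 else -1
        errors += pred != y_new

    return {
        "w_norm_sq": sum(wj * wj for wj in w),
        "frac_margin_le_1": frac_margin_le_1,
        "objective": hinge_objective(w, X, y, lam),
        "test_error": errors / n_test,
    }


if __name__ == "__main__":
    for key, value in simulate_null_svm().items():
        print(f"{key}: {value:.4f}")
```

Under the global null the labels carry no information about the features, so the test error should hover around 1/2 regardless of the tuning parameter. The other three quantities depend on p/n and on `lam`, and repeating the run over a grid of ratios is one way to compare simulations against asymptotic predictions of this kind.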


research
11/01/2020

Support Vector Machines and Radon's Theorem

A support vector machine (SVM) is an algorithm which finds a hyperplane ...
research
11/26/2012

Random Projections for Linear Support Vector Machines

Let X be a data matrix of rank ρ, whose rows represent n points in d-dim...
research
05/09/2012

Virtual Vector Machine for Bayesian Online Classification

In a typical online learning scenario, a learner is required to process ...
research
03/02/2020

Tropical Support Vector Machine and its Applications to Phylogenomics

Most data in genome-wide phylogenetic analysis (phylogenomics) is essent...
research
10/15/2016

Incremental One-Class Models for Data Classification

In this paper we outline a PhD research plan. This research contributes ...
research
05/05/2020

Interpreting Deep Models through the Lens of Data

Identification of input data points relevant for the classifier (i.e. se...
research
04/03/2017

Geometric Insights into Support Vector Machine Behavior using the KKT Conditions

The Support Vector Machine (SVM) is a powerful and widely used classific...
