Distributed Inference for Linear Support Vector Machine

11/29/2018
by   Xiaozhou Wang, et al.

The growing size of modern data brings many new challenges to existing statistical inference methodologies and theories, and calls for the development of distributed inferential approaches. This paper studies distributed inference for the linear support vector machine (SVM) in the binary classification task. Despite a vast literature on SVM, much less is known about its inferential properties, especially in a distributed setting. In this paper, we propose a multi-round distributed linear-type (MDL) estimator for conducting inference for linear SVM. The proposed estimator is computationally efficient. In particular, it requires only an initial SVM estimator and then successively refines that estimator by solving simple weighted least squares problems. Theoretically, we establish the Bahadur representation of the estimator. Based on this representation, we further derive asymptotic normality, which shows that the MDL estimator achieves the optimal statistical efficiency, i.e., the same efficiency as the classical linear SVM applied to the entire dataset in a single-machine setup. Moreover, our asymptotic result avoids the condition on the number of machines or data batches that is commonly assumed in the distributed estimation literature, and allows the case of diverging dimension. We provide simulation studies to demonstrate the performance of the proposed MDL estimator.
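The abstract does not spell out the MDL update, so the following is only a generic sketch of why a weighted least squares refinement step suits a distributed setting: each machine can send small summary matrices instead of raw data, and the center recovers the pooled solution exactly. The function names (`local_stats`, `aggregate_and_solve`), the synthetic data, and the uniform random weights are all assumptions made for illustration; they are not the paper's weights or its estimator.

```python
import numpy as np

def local_stats(X, w, z):
    """Per-machine summaries for weighted least squares: X^T W X and X^T W z."""
    Xw = X * w[:, None]          # scale each row of X by its weight
    return Xw.T @ X, Xw.T @ z

def aggregate_and_solve(stats):
    """Sum the per-machine summaries and solve the pooled normal equations."""
    A = sum(s[0] for s in stats)
    b = sum(s[1] for s in stats)
    return np.linalg.solve(A, b)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p, k = 600, 5, 4              # samples, dimension, number of machines
X = rng.normal(size=(n, p))
z = X @ np.arange(1.0, p + 1) + rng.normal(size=n)
w = rng.uniform(0.5, 1.5, size=n)  # placeholder weights, NOT the paper's

# Split the rows across k machines; each machine reports only p x p and p x 1 summaries.
batches = np.array_split(np.arange(n), k)
beta_dist = aggregate_and_solve([local_stats(X[i], w[i], z[i]) for i in batches])

# Reference: the same weighted least squares problem solved on one machine.
Xw_all = X * w[:, None]
beta_pooled = np.linalg.solve(Xw_all.T @ X, Xw_all.T @ z)

print(np.allclose(beta_dist, beta_pooled))
```

In this sketch each machine communicates p(p + 1) numbers per round regardless of its local sample size. A multi-round scheme would repeat such a step with weights recomputed from the current estimate.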


research
04/05/2020

Distributed Estimation for Principal Component Analysis: a Gap-free Approach

The growing size of modern data sets brings many challenges to the exist...
research
08/23/2023

Leverage classifier: Another look at support vector machine

Support vector machine (SVM) is a popular classifier known for accuracy,...
research
10/15/2022

Distributed Estimation and Inference for Semi-parametric Binary Response Models

The development of modern technology has enabled data collection of unpr...
research
03/16/2023

High-Dimensional Penalized Bernstein Support Vector Machines

The support vector machines (SVM) is a powerful classifier used for bina...
research
11/04/2015

A Distributed One-Step Estimator

Distributed statistical inference has recently attracted enormous attent...
research
10/18/2013

On the Suitable Domain for SVM Training in Image Coding

Conventional SVM-based image coding methods are founded on independently...
research
11/28/2018

First-order Newton-type Estimator for Distributed Estimation and Inference

This paper studies distributed estimation and inference for a general st...
