Communication-efficient Distributed Sparse Linear Discriminant Analysis

10/15/2016
by Lu Tian, et al.
We propose a communication-efficient distributed estimation method for sparse linear discriminant analysis (LDA) in the high-dimensional regime. Our method distributes data of size N across m machines and computes a local sparse LDA estimator on each machine from its data subset of size N/m. After this local estimation step, the method aggregates the debiased local estimators from the m machines and sparsifies the aggregated estimator. We show that the aggregated estimator attains the same statistical rate as the centralized estimation method, as long as the number of machines m is chosen appropriately. Moreover, we prove that our method attains model selection consistency under a milder condition than the centralized method. Experiments on both synthetic and real datasets corroborate our theory.
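
The pipeline described above (split, estimate locally, debias, average, sparsify) can be illustrated with a minimal single-process sketch. The abstract does not specify the local estimator, the debiasing step, or the sparsification rule, so everything concrete here is an assumption made for illustration: the local estimator is an L1-penalized quadratic program solved by proximal gradient descent, the debiasing correction uses a ridge-regularized inverse covariance as a stand-in for a sparse precision-matrix estimate, and sparsification is plain hard thresholding. The function names and the parameters `lam`, `ridge`, and `thresh` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def local_sparse_lda(X0, X1, lam, n_iter=500):
    """Assumed local estimator: minimize 0.5*b'Sb - d'b + lam*||b||_1 by ISTA.

    S is the pooled within-class covariance and d the mean difference,
    both computed from this machine's data only.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    d = mu1 - mu0
    Z = np.vstack([X0 - mu0, X1 - mu1])
    S = Z.T @ Z / Z.shape[0]
    # Step size 1/L, where L is the largest eigenvalue of S.
    step = 1.0 / max(np.linalg.eigvalsh(S)[-1], 1e-12)
    b = np.zeros_like(d)
    for _ in range(n_iter):
        b = soft_threshold(b - step * (S @ b - d), step * lam)
    return b, S, d

def debias(b, S, d, ridge=0.1):
    """Assumed one-step correction: b - Theta (S b - d).

    Theta is a ridge-regularized inverse of S, used here only as a
    simple stand-in for a sparse precision-matrix estimator.
    """
    Theta = np.linalg.inv(S + ridge * np.eye(S.shape[0]))
    return b - Theta @ (S @ b - d)

def distributed_sparse_lda(X0, X1, m, lam, thresh):
    """Simulate m machines: local fit, debias, average, hard-threshold."""
    parts0 = np.array_split(X0, m)  # class-0 rows assigned to each machine
    parts1 = np.array_split(X1, m)  # class-1 rows assigned to each machine
    debiased = []
    for P0, P1 in zip(parts0, parts1):
        b, S, d = local_sparse_lda(P0, P1, lam)
        debiased.append(debias(b, S, d))
    # Each machine would communicate only one p-dimensional vector.
    avg = np.mean(debiased, axis=0)
    # Assumed sparsification rule: zero out small entries.
    return np.where(np.abs(avg) > thresh, avg, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 20
    shift = np.zeros(p)
    shift[:3] = 1.0  # only the first three features separate the classes
    X0 = rng.standard_normal((2000, p))
    X1 = rng.standard_normal((2000, p)) + shift
    beta = distributed_sparse_lda(X0, X1, m=4, lam=0.1, thresh=0.4)
    print("selected features:", np.flatnonzero(beta))
```

In this sketch each simulated machine passes a single p-dimensional vector to the aggregation step, which is the sense in which such a scheme is communication-efficient. Averaging the debiased vectors rather than the raw penalized estimates matters because the L1 shrinkage bias is systematic and does not average out across machines.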
