Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos

09/08/2023
by   Di Wang, et al.

Data silos, mainly caused by privacy and interoperability concerns, significantly constrain collaboration among organizations that hold similar data for the same purpose. Distributed learning based on divide-and-conquer provides a promising way to address data silos, but it faces several challenges, including autonomy, privacy guarantees, and the necessity of collaboration. This paper develops an adaptive distributed kernel ridge regression (AdaDKRR) that takes into account autonomy in parameter selection, privacy in communicating non-sensitive information, and the necessity of collaboration for performance improvement. We provide both solid theoretical verification and comprehensive experiments for AdaDKRR to demonstrate its feasibility and effectiveness. Theoretically, we prove that under some mild conditions, AdaDKRR performs similarly to running the optimal learning algorithms on the whole data, verifying the necessity of collaboration and showing that no other distributed learning scheme can essentially beat AdaDKRR under the same conditions. Numerically, we test AdaDKRR on both toy simulations and two real-world applications to show that AdaDKRR is superior to other existing distributed learning schemes. All these results show that AdaDKRR is a feasible scheme to overcome data silos, which is highly desired in numerous application areas such as intelligent decision-making, price forecasting, and performance prediction for products.
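
For readers unfamiliar with the divide-and-conquer idea the abstract builds on, the following is a minimal sketch of plain distributed kernel ridge regression: each data holder fits its own estimator and a center averages the local predictions. It is not the AdaDKRR algorithm itself; the adaptive parameter selection and the communication strategy of the paper are not reproduced here. The Gaussian kernel, its width, the regularization value, the synthetic data, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, width=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * width**2))


def fit_local_krr(X, y, lam, width=0.5):
    """Fit kernel ridge regression on one local data set.

    Returns a predictor function, so raw samples stay with their owner
    (an illustrative choice, not the paper's protocol).
    """
    n = len(y)
    K = gaussian_kernel(X, X, width)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)

    def predict(X_new):
        return gaussian_kernel(X_new, X, width) @ alpha

    return predict


def dkrr_predict(local_predictors, sizes, X_new):
    """Average local predictions, weighting each by its local sample size."""
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    preds = np.stack([predict(X_new) for predict in local_predictors])
    return weights @ preds


# Synthetic data split across four hypothetical data holders.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(600, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(600)
parts = np.array_split(np.arange(600), 4)

local_predictors = [fit_local_krr(X[idx], y[idx], lam=1e-3) for idx in parts]
sizes = [len(idx) for idx in parts]

X_test = np.linspace(-1, 1, 50)[:, None]
y_hat = dkrr_predict(local_predictors, sizes, X_test)
mse = float(np.mean((y_hat - np.sin(np.pi * X_test[:, 0])) ** 2))
```

In this sketch every holder uses the same fixed `lam`; the abstract's point about autonomy in parameter selection concerns exactly the step that is hard-coded here.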


