Privacy-Preserving and Lossless Distributed Estimation of High-Dimensional Generalized Additive Mixed Models

10/14/2022
by Daniel Schalk, et al.

Various privacy-preserving frameworks that respect the individual's privacy in the analysis of data have been developed in recent years. However, the available model classes, such as simple statistics or generalized linear models, lack the flexibility required to approximate the underlying data-generating process well in practice. In this paper, we propose an algorithm for the distributed, privacy-preserving, and lossless estimation of generalized additive mixed models (GAMMs) using component-wise gradient boosting (CWB). Using CWB allows us to reframe GAMM estimation as a distributed fitting of base learners using the L_2 loss. To account for the heterogeneity of the different data sites, we propose a distributed version of a row-wise tensor product that allows the computation of site-specific (smooth) effects. Our adaptation of CWB preserves all the important properties of the original algorithm, such as unbiased feature selection and the ability to fit models in high-dimensional feature spaces, and yields model estimates equivalent to those of CWB on pooled data. In addition to deriving the equivalence of both algorithms, we showcase the efficacy of our algorithm on a distributed heart disease data set and compare it with state-of-the-art methods.
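
The "lossless" claim for base learners fitted with the L_2 loss can be illustrated with a small sketch. It is not the authors' implementation: it only shows the standard fact that a least-squares fit can be recovered exactly from per-site sums of X^T X and X^T r, so no site has to share individual rows. The function names (`site_aggregates`, `distributed_ls_fit`), the simulated data, and the three-site setup are assumptions made for illustration. Whether sharing such aggregates is acceptable from a privacy standpoint depends on the setting and is not addressed here.

```python
import numpy as np

def site_aggregates(X, r):
    """Run locally at one site; only these two aggregates leave the site."""
    return X.T @ X, X.T @ r

def distributed_ls_fit(sites):
    """Host side: sum the site aggregates and solve the normal equations."""
    p = sites[0][0].shape[1]
    xtx = np.zeros((p, p))
    xtr = np.zeros(p)
    for X, r in sites:
        a, b = site_aggregates(X, r)
        xtx += a
        xtr += b
    return np.linalg.solve(xtx, xtr)

# Hypothetical data: three sites with different sample sizes.
# r plays the role of the pseudo-residuals a base learner is fitted to.
rng = np.random.default_rng(0)
sites = []
for n in (30, 50, 20):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    r = 1.5 + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)
    sites.append((X, r))

beta_distributed = distributed_ls_fit(sites)

# Reference fit on pooled data, which a real distributed setup could not form.
X_pooled = np.vstack([X for X, _ in sites])
r_pooled = np.concatenate([r for _, r in sites])
beta_pooled = np.linalg.lstsq(X_pooled, r_pooled, rcond=None)[0]

# The two estimates agree up to floating-point error.
print(np.allclose(beta_distributed, beta_pooled))
```

In a boosting loop, X^T X for each base learner would only need to be aggregated once, while X^T r would be recomputed in every iteration as the pseudo-residuals change.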
