PADME-SoSci: A Platform for Analytics and Distributed Machine Learning for the Social Sciences

03/27/2023
by Zeyd Boukhers, et al.

Data privacy and ownership are significant concerns in social data science, raising legal and ethical questions. Sharing and analyzing data is difficult when different parties own different parts of it. One approach to this challenge is to apply de-identification or anonymization techniques to the data before collecting it for analysis. However, this can reduce data utility and increase the risk of re-identification. To address these limitations, we present PADME, a distributed analytics tool that federates model implementation and training. PADME uses a federated approach in which the model is implemented and deployed by all parties and visits each data location incrementally for training. This enables analysis across data locations while still allowing the model to be trained as if all data were in a single location. Training the model on data in its original location preserves data ownership. Furthermore, results are not released until the analysis has been completed at all data locations, which ensures privacy and avoids bias in the results.
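
The incremental training described in the abstract can be illustrated with a minimal sketch. This is not PADME's implementation or API: the function names (`make_station`, `train_locally`, `run_incremental`), the linear model, and the synthetic data are all assumptions made for illustration. The sketch only shows the general pattern of a model that travels from one data location to the next, is updated on local data at each stop, and yields a result only after every location has been visited.

```python
import random


def make_station(n, seed):
    """Create a hypothetical local dataset following y = 2x + 1 plus small noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data


def train_locally(weights, data, lr=0.1, epochs=50):
    """Run plain SGD on one station's data; only the weights leave the station."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def run_incremental(stations, rounds=3):
    """Send the model to each station in turn and return it only at the end."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        for data in stations:
            # The raw data never moves; the model state is what travels.
            weights = train_locally(weights, data)
    # No intermediate result is exposed before all stations have been visited.
    return weights


# Three hypothetical data owners, each holding its own records.
stations = [make_station(20, seed) for seed in (1, 2, 3)]
w, b = run_incremental(stations)
print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

In this toy setting the learned parameters end up close to those a single pooled dataset would give, which mirrors the claim that the model can be trained as if all data were in one place. A real deployment would also need secure transport and access control, which this sketch omits.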

research
07/06/2020

Sharing Models or Coresets: A Study based on Membership Inference Attack

Distributed machine learning generally aims at training a global model b...
research
03/12/2018

A Location-based Approach for Distributed Kiosk Design

Electronic kiosk interface design and implementation metrics have been w...
research
12/07/2018

A Hybrid Approach to Privacy-Preserving Federated Learning

Training machine learning models often requires data from multiple parti...
research
06/03/2019

Federated Hierarchical Hybrid Networks for Clickbait Detection

Online media outlets adopt clickbait techniques to lure readers to click...
research
12/05/2022

Federated Neural Topic Models

Over the last years, topic modeling has emerged as a powerful technique ...
research
04/05/2019

A Conceptual Architecture for Contractual Data Sharing in a Decentralised Environment

Machine Learning systems rely on data for training, input and ongoing fe...
research
02/05/2019

PUTWorkbench: Analysing Privacy in AI-intensive Systems

AI intensive systems that operate upon user data face the challenge of b...
