Fed-TDA: Federated Tabular Data Augmentation on Non-IID Data

11/22/2022
by Shaoming Duan, et al.

Non-independent and identically distributed (non-IID) data is a key challenge in federated learning (FL), as it usually hampers both optimization convergence and model performance. Existing data augmentation methods for the non-IID problem, which rely on federated generative models or raw data sharing strategies, still suffer from low performance, privacy concerns, and high communication overhead on decentralized tabular data. To tackle these challenges, we propose a federated tabular data augmentation method named Fed-TDA. The core idea of Fed-TDA is to synthesize tabular data for augmentation using only simple statistics (e.g., the distribution of each column and the global covariance). Specifically, we propose a multimodal distribution transformation and an inverse cumulative distribution mapping, which synthesize the continuous and discrete columns of tabular data, respectively, from noise according to the pre-learned statistics. Furthermore, we show theoretically that Fed-TDA not only preserves data privacy but also maintains the distribution of the original data and the correlation between columns. Through extensive experiments on five real-world tabular datasets, we demonstrate the superiority of Fed-TDA over state-of-the-art methods in test performance and communication efficiency.
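To make the inverse cumulative distribution mapping for discrete columns concrete, here is a minimal sketch, not the authors' implementation. It assumes the noise is uniform on [0, 1) and that the pre-learned statistic for a discrete column is simply its list of category frequencies; the function name `inverse_cdf_sample` and the example categories are hypothetical.

```python
import bisect
import itertools
import random


def inverse_cdf_sample(categories, frequencies, noise):
    """Map uniform noise in [0, 1) to categories via the inverse CDF.

    Assumption: `frequencies` are the (possibly unnormalized) category
    frequencies learned for one discrete column.
    """
    total = sum(frequencies)
    # Cumulative distribution over the categories, e.g. [0.5, 0.8, 1.0].
    cdf = list(itertools.accumulate(f / total for f in frequencies))
    samples = []
    for u in noise:
        # First index whose cumulative probability exceeds u.
        idx = bisect.bisect_right(cdf, u)
        # Clamp in case floating-point round-off leaves cdf[-1] below 1.
        samples.append(categories[min(idx, len(categories) - 1)])
    return samples


# Hypothetical column with three categories and their frequencies.
rng = random.Random(0)
noise = [rng.random() for _ in range(10_000)]
synthetic = inverse_cdf_sample(["a", "b", "c"], [0.5, 0.3, 0.2], noise)
```

Because each category owns an interval of [0, 1) whose width equals its frequency, the synthetic column reproduces the category proportions in expectation. The sketch covers a single column only and does not model the cross-column correlation that the abstract says Fed-TDA preserves.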

research
06/14/2023

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Federated learning (FL) facilitates collaborative learning among multipl...
research
07/08/2022

StatMix: Data augmentation method that relies on image statistics in federated learning

Availability of large amount of annotated data is one of the pillars of ...
research
08/18/2021

Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data

Generative Adversarial Networks (GANs) are typically trained to synthesi...
research
06/09/2020

XOR Mixup: Privacy-Preserving Data Augmentation for One-Shot Federated Learning

User-generated data distributions are often imbalanced across devices an...
research
06/20/2022

Mitigating Data Heterogeneity in Federated Learning with Data Augmentation

Federated Learning (FL) is a prominent framework that enables training a...
research
04/15/2021

Efficient Ring-topology Decentralized Federated Learning with Deep Generative Models for Industrial Artificial Intelligent

By leveraging deep learning based technologies, the data-driven based ap...
research
05/30/2021

FED-χ^2: Privacy Preserving Federated Correlation Test

In this paper, we propose the first secure federated χ^2-test protocol F...
