Federated Visual Classification with Real-World Data Distribution

03/18/2020
by Tzu Ming Harry Hsu, et al.

Federated Learning enables visual models to be trained on-device, bringing advantages for user privacy (data need never leave the device), but challenges in terms of data diversity and quality. Whilst typical models in the datacenter are trained using data that are independent and identically distributed (IID), data at source are typically far from IID. Furthermore, differing quantities of data are typically available at each device (imbalance). In this work, we characterize the effect these real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm. To do so, we introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits that simulate real-world edge learning scenarios. We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
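As a rough illustration of the benchmark named above, the server-side aggregation step of Federated Averaging can be sketched as a weighted mean of client parameters, with each client weighted by its number of local examples. This is a minimal sketch, not the authors' implementation: the function name `fedavg_aggregate`, the flat-list representation of model parameters, and the toy client values are all assumptions made for illustration, and the sketch does not attempt to reproduce FedVC or FedIR.

```python
from typing import List, Sequence


def fedavg_aggregate(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local example count.

    Hypothetical helper: parameters are modelled as flat lists of floats.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one example count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_examples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must report the same number of parameters")
        share = n_examples / total  # larger clients contribute proportionally more
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Toy round with imbalanced clients (made-up values, for illustration only).
clients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sizes = [10, 30, 60]
global_weights = fedavg_aggregate(clients, sizes)
```

With imbalanced clients, as in the toy round, the largest client dominates the average, which is one way the data imbalance described in the abstract enters training.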
