A Case for Offloading Federated Learning Server on Smart NIC

07/13/2023
by Naoki Shibahara, et al.

Federated learning is a distributed machine learning approach in which clients train weight parameters locally and a server aggregates them into global parameters. The global parameters can thus be trained without uploading the clients' privacy-sensitive raw data to the server. Because the server aggregates the local weight parameters by simply averaging them, aggregation is an I/O-intensive task in which network processing accounts for a large share of the work relative to computation, and the network processing workload grows with the number of clients. To mitigate this workload, this paper offloads the federated learning server to an NVIDIA BlueField-2 DPU, a smart NIC (Network Interface Card) with eight processing cores. Using DPDK (Data Plane Development Kit), dedicated processing cores are assigned to receiving the local weight parameters and sending the global parameters. The aggregation task is parallelized across the multiple cores available on the DPU. To further improve performance, an approximated design that eliminates exclusive access control between the computation threads is also implemented. Evaluation results show that the federated learning server on the DPU runs 1.32 times faster than the one on the host CPU, with negligible accuracy loss.
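
The averaging step described in the abstract can be illustrated with a minimal sketch. This is not the paper's DPU/DPDK implementation: the function names (`average_weights`, `parallel_average`), the equal weighting of clients, and the use of Python threads with a single lock are all assumptions made for illustration. The lock stands in for the exclusive access control between computation threads; the approximated design mentioned in the abstract would omit such a lock and accept a small error.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def average_weights(client_weights):
    """Element-wise mean of equally sized client weight vectors."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]


def parallel_average(client_weights, num_workers=4):
    """Accumulate client vectors into a shared sum from several threads.

    The lock models exclusive access control (hypothetical design, not
    taken from the paper); each thread adds one client's vector at a time.
    """
    dim = len(client_weights[0])
    total = [0.0] * dim
    lock = threading.Lock()

    def accumulate(weights):
        # Only one thread may update the shared sum at a time.
        with lock:
            for i, w in enumerate(weights):
                total[i] += w

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # list() forces all tasks to finish and re-raises any worker error.
        list(pool.map(accumulate, client_weights))

    n = len(client_weights)
    return [t / n for t in total]


# Three hypothetical clients, each holding a 3-parameter model.
clients = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
global_weights = average_weights(clients)
```

Both functions return the same global parameters for this input. The per-element arithmetic is trivial, which matches the abstract's point that receiving and sending the parameters, rather than the averaging, dominates the server's work.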

