Privacy-Preserving Collaborative Chinese Text Recognition with Federated Learning

05/09/2023
by Shangchao Su, et al.

In Chinese text recognition, an organization often needs to collect a large amount of data from similar organizations to compensate for insufficient local data and to improve local few-shot character recognition. However, because text data naturally contains private information such as addresses and phone numbers, organizations are unwilling to share it. It is therefore increasingly important to design a privacy-preserving collaborative training framework for the Chinese text recognition task. In this paper, we introduce personalized federated learning (pFL) into the Chinese text recognition task and propose the pFedCR algorithm, which significantly improves the model performance of each client (organization) without sharing private data. Specifically, building on CRNN, we add several attention layers to the model and design a two-stage training approach for the client to handle the non-IID problem of client data. In addition, we fine-tune the output layer of the model using a virtual dataset on the server, mitigating the problem of character imbalance in Chinese documents. The proposed approach is validated on public benchmarks and two self-built real-world industrial scenario datasets. The experimental results show that the pFedCR algorithm can improve the performance of local personalized models while also improving their generalization performance on other client data domains. Compared to local training within an organization, pFedCR improves model performance by about 20%. Compared to other methods, pFedCR improves performance by 6% [...]. Through federated learning, pFedCR can also correct erroneous information in the ground truth.
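To make the idea of personalized federated learning concrete, here is a minimal sketch of one aggregation round in which shared parameters are averaged across clients while personalized parameters stay local. It is not the pFedCR algorithm itself. The abstract does not say which layers are personalized, so treating the attention layers as the local part is an assumption, and the parameter names (`backbone.w`, `attn.w`), the `attn.` prefix, and both helper functions are hypothetical.

```python
from typing import Dict, List

# A model is represented here as a flat mapping from parameter name to values.
Params = Dict[str, List[float]]


def aggregate_shared(
    client_params: List[Params],
    client_sizes: List[int],
    personal_prefix: str = "attn.",  # assumption: attention layers are personal
) -> Params:
    """Average shared parameters, weighted by each client's dataset size."""
    total = sum(client_sizes)
    shared_keys = [k for k in client_params[0] if not k.startswith(personal_prefix)]
    averaged: Params = {}
    for key in shared_keys:
        length = len(client_params[0][key])
        averaged[key] = [
            sum(p[key][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


def apply_global(local: Params, global_shared: Params) -> Params:
    """Overwrite a client's shared parameters; personal ones are left untouched."""
    updated = {k: list(v) for k, v in local.items()}
    for key, values in global_shared.items():
        updated[key] = list(values)
    return updated


# Two toy clients with different amounts of local data.
clients = [
    {"backbone.w": [1.0, 2.0], "attn.w": [0.1]},
    {"backbone.w": [3.0, 4.0], "attn.w": [0.9]},
]
sizes = [1, 3]

global_shared = aggregate_shared(clients, sizes)
new_clients = [apply_global(c, global_shared) for c in clients]
```

After the round, both toy clients hold the same averaged `backbone.w`, while each keeps its own `attn.w`. The server-side fine-tuning of the output layer and the two-stage client training described in the abstract are not modeled here.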

research
10/12/2020

Differentially Private Secure Multi-Party Computation for Federated Learning in Financial Applications

Federated Learning enables a population of clients, working with a trust...
research
11/19/2022

Personalized Federated Learning with Hidden Information on Personalized Prior

Federated learning (FL for simplification) is a distributed machine lear...
research
01/27/2023

FedHP: Heterogeneous Federated Learning with Privacy-preserving

Federated Learning is a distributed machine learning environment, which ...
research
05/07/2021

FedGL: Federated Graph Learning Framework with Global Self-Supervision

Graph data are ubiquitous in the real world. Graph learning (GL) tries t...
research
10/01/2021

Personalized Retrogress-Resilient Framework for Real-World Medical Federated Learning

Nowadays, deep learning methods with large-scale datasets can produce cl...
research
04/23/2023

Personalized Federated Learning via Gradient Modulation for Heterogeneous Text Summarization

Text summarization is essential for information aggregation and demands ...
research
07/09/2020

Maximum Entropy Regularization and Chinese Text Recognition

Chinese text recognition is more challenging than Latin text due to the ...