A Multiplexed Network for End-to-End, Multilingual OCR

03/29/2021
by Jing Huang, et al.

Recent advances in OCR have shown that an end-to-end (E2E) training pipeline that includes both detection and recognition leads to the best results. However, many existing methods focus primarily on Latin-alphabet languages, and often only on case-insensitive English characters. In this paper, we propose an E2E approach, Multiplexed Multilingual Mask TextSpotter, that performs script identification at the word level and handles different scripts with different recognition heads, all while maintaining a unified loss that simultaneously optimizes script identification and multiple recognition heads. Experiments show that our method outperforms a single-head model with a similar number of parameters in end-to-end recognition tasks, and achieves state-of-the-art results on the MLT17 and MLT19 joint text detection and script identification benchmarks. We believe that our work is a step towards an end-to-end trainable and scalable multilingual, multi-purpose OCR system. Our code and model will be released.
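
The multiplexing idea in the abstract (word-level script identification, one recognition head per script, and a single loss covering both) can be illustrated with a toy sketch. Everything here is an assumption for illustration only: the script inventory `SCRIPTS`, the placeholder heads, the helper names `route_word` and `unified_loss`, and the simple weighted-sum form of the loss are hypothetical and are not taken from the paper or its released code.

```python
import math

# Hypothetical script inventory; the real model's script set is not given here.
SCRIPTS = ["Latin", "Arabic", "Chinese"]


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(script):
    """Build a placeholder recognition head (a real head would decode text)."""
    def head(word_features):
        return f"<{script} head decoded {len(word_features)} features>"
    return head


# One recognition head per script (assumed one-to-one mapping).
HEADS = {script: make_head(script) for script in SCRIPTS}


def route_word(script_scores, word_features):
    """Pick the most likely script for one word and run only that head."""
    probs = softmax(script_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    script = SCRIPTS[best]
    return script, HEADS[script](word_features)


def unified_loss(script_scores, true_script, head_losses, script_weight=1.0):
    """Assumed form: weighted script-ID cross-entropy plus summed head losses."""
    probs = softmax(script_scores)
    script_loss = -math.log(probs[SCRIPTS.index(true_script)])
    return script_weight * script_loss + sum(head_losses.values())


script, text = route_word([0.1, 2.0, -1.0], [0.0, 0.0, 0.0, 0.0])
loss = unified_loss([0.0, 0.0, 0.0], "Latin", {"Latin": 0.5})
```

Because the script term and the per-head terms are summed into one scalar, a single backward pass in a real framework would update the script classifier and the recognition heads together, which is the property the abstract describes as a unified loss.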

