1st Place Solution to ECCV 2022 Challenge on Out of Vocabulary Scene Text Understanding: Cropped Word Recognition

by Zhangzi Zhu, et al.

This report presents our winning solution to the ECCV 2022 challenge on Out-of-Vocabulary Scene Text Understanding (OOV-ST): Cropped Word Recognition. The challenge is held in the context of the ECCV 2022 workshop on Text in Everything (TiE) and aims to extract out-of-vocabulary words from natural scene images. In the competition, we first pre-train SCATTER on synthetic datasets, then fine-tune the model on the training set with data augmentations. Two additional models are trained specifically for long and vertical texts. Finally, we combine the outputs of models with different layers, different backbones, and different seeds to produce the final results. Our solution achieves an overall word accuracy of 69.73% on both in-vocabulary and out-of-vocabulary words.
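The abstract does not say how the outputs of the individual models are combined. As a rough illustration only, one common way to merge several recognizers is a majority vote over the predicted strings, with ties broken by mean confidence. The function name `ensemble_vote` and the `(text, confidence)` input format below are assumptions made for this sketch, not the authors' method.

```python
from collections import defaultdict


def ensemble_vote(predictions):
    """Pick one transcription from several models' outputs.

    predictions: list of (text, confidence) pairs, one per model.
    Hypothetical rule: most votes wins; ties go to the higher mean confidence.
    """
    if not predictions:
        raise ValueError("need at least one prediction")

    # Group confidences by the predicted string.
    votes = defaultdict(list)
    for text, confidence in predictions:
        votes[text].append(confidence)

    # Rank candidates by (vote count, mean confidence).
    def rank(item):
        _, confidences = item
        return (len(confidences), sum(confidences) / len(confidences))

    best_text, _ = max(votes.items(), key=rank)
    return best_text


if __name__ == "__main__":
    # Three hypothetical models (e.g. different seeds) reading the same crop.
    model_outputs = [("coffee", 0.91), ("coffee", 0.72), ("coffec", 0.95)]
    print(ensemble_vote(model_outputs))
```

In this sketch two of the three models agree, so the agreed string is returned even though the dissenting model is more confident. Other rules, such as selecting the single most confident prediction, are equally plausible readings of the abstract.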
