C3-STISR: Scene Text Image Super-resolution with Triple Clues

04/29/2022
by   Minyi Zhao, et al.

Scene text image super-resolution (STISR) has been regarded as an important pre-processing task for text recognition from low-resolution scene text images. Most recent approaches use the recognizer's feedback as a clue to guide super-resolution. However, directly using the recognition clue has two problems: 1) Compatibility. It takes the form of a probability distribution, which has an obvious modal gap with STISR, a pixel-level task; 2) Inaccuracy. It usually contains wrong information, which misleads the main task and degrades super-resolution performance. In this paper, we present a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual information, and linguistic information as clues to guide super-resolution. Here, the visual clue comes from images of the texts predicted by the recognizer, which are informative and more compatible with the STISR task, while the linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts. We design effective extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution. Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance. Code is available at https://github.com/zhaominyiz/C3-STISR.

