Bidirectional Scene Text Recognition with a Single Decoder

12/08/2019
by Maurits Bleeker, et al.

Scene Text Recognition (STR) is the problem of recognizing the correct word or character sequence in a cropped word image. To obtain more robust output sequences, the notion of bidirectional STR has been introduced. So far, bidirectional STR methods have been implemented with two separate decoders: one for left-to-right decoding and one for right-to-left decoding. Having two separate decoders for almost the same task with the same output space is undesirable from both a computational and an optimization point of view. We introduce the bidirectional Scene Text Transformer (Bi-STET), a novel bidirectional STR method with a single decoder for bidirectional text decoding. With its single decoder, Bi-STET outperforms methods that apply bidirectional decoding with two separate decoders, while also being more efficient than those methods. Furthermore, Bi-STET matches or outperforms state-of-the-art (SOTA) methods on all STR benchmarks. Finally, we provide analyses of and insights into the performance of Bi-STET.
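
To make the idea of sharing one decoder across both reading directions concrete, here is a minimal sketch of how training targets for the two directions could be prepared and how two decoded hypotheses could be merged. The abstract does not say how Bi-STET signals the decoding direction or combines outputs, so the direction tokens (`<l2r>`, `<r2l>`), the `<eos>` token, and the helper names `make_bidirectional_targets` and `merge_predictions` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical special tokens: the abstract does not specify how
# direction is signalled to the single decoder.
L2R, R2L, EOS = "<l2r>", "<r2l>", "<eos>"


def make_bidirectional_targets(word: str) -> List[Tuple[List[str], List[str]]]:
    """Return (decoder_input, decoder_target) pairs for both directions.

    The same decoder would be trained on both pairs; only the start
    token and the character order differ.
    """
    pairs = []
    for start, chars in ((L2R, list(word)), (R2L, list(reversed(word)))):
        decoder_input = [start] + chars   # teacher-forcing input
        decoder_target = chars + [EOS]    # input shifted by one position
        pairs.append((decoder_input, decoder_target))
    return pairs


def merge_predictions(l2r: Tuple[str, float], r2l: Tuple[str, float]) -> str:
    """Keep the higher-scoring hypothesis (assumed selection rule).

    Each argument is (decoded_text, score). The right-to-left text is
    reversed back into reading order before it is returned.
    """
    l2r_text, l2r_score = l2r
    r2l_text, r2l_score = r2l
    return l2r_text if l2r_score >= r2l_score else r2l_text[::-1]
```

Because both directions share one output alphabet, the only direction-specific information in this sketch is the start token, which is what would let a single set of decoder weights serve both tasks.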

