Visual Text Correction
This paper tackles the Text Correction (TC) problem, i.e., finding and replacing an inaccurate word in a sentence. We introduce a novel deep network that detects the inaccuracy in a sentence and selects the most appropriate word to substitute for it. Our pipeline can be trained end-to-end. Moreover, our method leverages visual features and extends simple text correction to Visual Text Correction (VTC). We present a method to fuse the visual and textual data for the VTC problem. In our formulation, every word dynamically selects part of a visual feature vector through a gating process. Furthermore, to train and evaluate our model, we propose an approach to automatically construct a large dataset for the VTC problem. Our experiments and performance analysis demonstrate that the proposed method achieves the best results and also highlight the challenges in solving the VTC problem. To the best of our knowledge, this work is the first of its kind for the Visual Text Correction task.
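The gating idea mentioned in the abstract, in which each word selects part of a visual feature vector, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the sigmoid gate, the linear projection `weights`, the `bias` term, the toy dimensions, and the function name `gate_visual_features` are all hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gate_visual_features(word_vec, visual_vec, weights, bias):
    """Scale each visual feature by a gate in (0, 1) computed from the word.

    Assumed form (not confirmed by the source):
        gate = sigmoid(weights @ word_vec + bias)
        out  = gate * visual_vec   (element-wise)
    `weights` has one row per visual dimension and one column per word dimension.
    """
    gated = []
    for row, b, v in zip(weights, bias, visual_vec):
        pre_activation = sum(w * x for w, x in zip(row, word_vec)) + b
        gated.append(sigmoid(pre_activation) * v)
    return gated


# Toy setup with made-up sizes and random values (illustrative only).
rng = random.Random(0)
word_dim, visual_dim = 4, 6
visual = [rng.uniform(-1.0, 1.0) for _ in range(visual_dim)]
weights = [[rng.gauss(0.0, 1.0) for _ in range(word_dim)] for _ in range(visual_dim)]
bias = [0.0] * visual_dim

# Hypothetical word embeddings for a short sentence.
word_vectors = {
    word: [rng.uniform(-1.0, 1.0) for _ in range(word_dim)]
    for word in ["a", "man", "rides", "horse"]
}

# Each word obtains its own gated view of the same visual feature vector.
fused = {
    word: gate_visual_features(vec, visual, weights, bias)
    for word, vec in word_vectors.items()
}
```

Because the gate depends on the word embedding, different words in the same sentence attenuate different components of the shared visual vector. In a trained model the projection would be learned jointly with the rest of the network rather than sampled at random as it is here.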