COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts

07/11/2022
by Jeonghun Baek, et al.

Recognizing irregular text remains a challenging problem in text recognition. To encourage research on this problem, we provide a novel comic onomatopoeia dataset (COO), which consists of onomatopoeia texts in Japanese comics. COO contains many arbitrarily shaped texts, such as extremely curved, partially shrunk, or arbitrarily placed texts. Furthermore, some texts are separated into several parts. Each part is a truncated text that is not meaningful by itself, so the parts must be linked to represent the intended meaning. We therefore propose a novel task that predicts the links between truncated texts. We conduct three tasks to detect the onomatopoeia region and capture its intended meaning: text detection, text recognition, and link prediction. Through extensive experiments, we analyze the characteristics of COO. Our data and code are available at <https://github.com/ku21fan/COO-Comic-Onomatopoeia>.
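To make the link prediction task concrete, here is a minimal sketch of how predicted links could be used to reassemble truncated parts into full onomatopoeia. The data layout (integer part IDs, a list of `(source, target)` link pairs) and the function name `join_truncated_texts` are assumptions made for illustration only; they are not the COO annotation format or the authors' method.

```python
from typing import Dict, List, Tuple


def join_truncated_texts(parts: Dict[int, str],
                         links: List[Tuple[int, int]]) -> List[str]:
    """Concatenate truncated parts along predicted links.

    Hypothetical layout: `parts` maps a part ID to its recognized text,
    and a link (a, b) means part b continues part a. Each part is assumed
    to have at most one outgoing link, and every ID in `links` is assumed
    to exist in `parts`.
    """
    next_of = dict(links)                  # part ID -> ID of the part that follows
    has_prev = {dst for _, dst in links}   # parts that continue another part

    results = []
    for pid in sorted(parts):
        if pid in has_prev:
            continue  # not the start of a chain
        text, cur, seen = "", pid, set()
        # Walk the chain; `seen` guards against looping on malformed links.
        while cur is not None and cur not in seen:
            seen.add(cur)
            text += parts[cur]
            cur = next_of.get(cur)
        results.append(text)
    return results


# Two parts linked into one onomatopoeia, plus one standalone text.
parts = {0: "ドド", 1: "ドド", 2: "バン"}
links = [(0, 1)]
print(join_truncated_texts(parts, links))  # ['ドドドド', 'バン']
```

Parts with no incoming link start a chain, so an unlinked text comes back unchanged. Parts that form a pure cycle have no starting point and are dropped in this sketch.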

research
11/12/2017

Arbitrarily-Oriented Text Recognition

Recognizing text from natural images is still a hot research topic in co...
research
12/30/2021

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

The flourishing blossom of deep learning has witnessed the rapid develop...
research
10/10/2019

On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

Scene text recognition (STR) is the task of recognizing character sequen...
research
11/23/2021

StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

Scene text detection is still a challenging task, as there may be extrem...
research
04/07/2023

Towards Unified Scene Text Spotting based on Sequence Generation

Sequence generation models have recently made significant progress in un...
research
11/25/2022

MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts

Event detection (ED) identifies and classifies event triggers from unstr...
research
01/01/2023

Optimizing Readability Using Genetic Algorithms

This research presents ORUGA, a method that tries to automatically optim...
