Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications

05/09/2020
by Kiyoharu Aizawa, et al.

Manga, or comics, a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset. To address this, we built Manga109, a dataset of 109 varied Japanese comic books (94 authors and 21,142 pages), and made it publicly available for academic use by obtaining the authors' permission. We carefully annotated the frames, speech texts, character faces, and character bodies; the total number of annotations exceeds 500k. The dataset provides numerous manga images and annotations, which will be useful for training and evaluating machine learning algorithms. Beyond academic use, we obtained further permission to release a subset of the dataset for industrial use. In this article, we describe the dataset in detail and present a few examples of multimedia processing applications (detection, retrieval, and generation) that apply existing deep learning methods and are made possible by the dataset.

research
07/19/2023

Classification of Visualization Types and Perspectives in Patents

Due to the swift growth of patent applications each year, information an...
research
02/17/2020

Serial Speakers: a Dataset of TV Series

For over a decade, TV series have been drawing increasing interest, both...
research
01/22/2023

Applied Deep Learning to Identify and Localize Polyps from Endoscopic Images

Deep learning based neural networks have gained popularity for a variety...
research
04/26/2021

Synthetic 3D Data Generation Pipeline for Geometric Deep Learning in Architecture

With the growing interest in deep learning algorithms and computational ...
research
10/13/2021

NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels

Deep learning has shown remarkable progress in a wide range of problems....
research
06/23/2021

PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database

In deep learning area, large-scale image datasets bring a breakthrough i...
research
01/27/2022

Interactive 3D Character Modeling from 2D Orthogonal Drawings with Annotations

We propose an interactive 3D character modeling approach from orthograph...
