SynthText3D: Synthesizing Scene Text Images from 3D Virtual Worlds

by   Minghui Liao, et al.

With the development of deep neural networks, the demand for a significant amount of annotated training data becomes the performance bottlenecks in many fields of research and applications. Image synthesis can generate annotated images automatically and freely, which gains increasing attention recently. In this paper, we propose to synthesize scene text images from the 3D virtual worlds, where the precise descriptions of scenes, editable illumination/visibility, and realistic physics are provided. Different from the previous methods which paste the rendered text on static 2D images, our method can render the 3D virtual scene and text instances as an entirety. In this way, complex perspective transforms, various illuminations, and occlusions can be realized in our synthesized scene text images. Moreover, the same text instances with various viewpoints can be produced by randomly moving and rotating the virtual camera, which acts as human eyes. The experiments on the standard scene text detection benchmarks using the generated synthetic data demonstrate the effectiveness and superiority of the proposed method. The code and synthetic data will be made available at


page 1

page 3

page 4

page 5

page 6

page 7

page 8


UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

Synthetic data has been a critical tool for training scene text detectio...

Synthetic-to-Real Unsupervised Domain Adaptation for Scene Text Detection in the Wild

Deep learning-based scene text detection can achieve preferable performa...

SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models

For successful scene text recognition (STR) models, synthetic text image...

Synthetic Data Generation and Adaption for Object Detection in Smart Vending Machines

This paper presents an improved scheme for the generation and adaption o...

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

We propose a new paradigm to automatically generate training data with a...

On Exploring and Improving Robustness of Scene Text Detection Models

It is crucial to understand the robustness of text detection models with...

Scene Text Synthesis for Efficient and Effective Deep Network Training

A large amount of annotated training images is critical for training acc...

Please sign up or login with your details

Forgot password? Click here to reset