WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models

03/29/2023
by Konstantina Nikolaidou, et al.

Text-to-Image synthesis is the task of generating an image according to a specific text description. Generative Adversarial Networks have been considered the standard method for image synthesis virtually since their introduction; today, Denoising Diffusion Probabilistic Models are setting a new baseline, with remarkable results in Text-to-Image synthesis, among other fields. Aside from its usefulness per se, Text-to-Image synthesis can also be particularly relevant as a tool for data augmentation to aid training models for other document image processing tasks. In this work, we present a latent diffusion-based method for styled text-to-text-content-image generation at the word level. Our proposed method generates realistic word image samples in different writer styles, using class-index styles and text-content prompts, without the need for adversarial training, writer recognition, or text recognition. We gauge system performance with Fréchet Inception Distance, writer recognition accuracy, and writer retrieval. We show that the proposed model produces samples that are aesthetically pleasing, help boost text recognition performance, and achieve writer retrieval scores similar to those of real data.
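As a rough illustration of the first evaluation metric named in the abstract, the sketch below computes the Fréchet distance between two sets of feature vectors, each modeled as a Gaussian. It is not the authors' evaluation code: the function name `frechet_distance` is hypothetical, and the random arrays stand in for the Inception-network features of real and generated word images that an actual FID computation would use.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    In a real FID computation these would be Inception activations;
    here they are placeholders (an assumption of this sketch).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs. Clip tiny negative values that can
    # appear from floating-point error.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Placeholder features: "generated" is "real" shifted by 0.5 per dimension.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))
generated_feats = real_feats + 0.5

same = frechet_distance(real_feats, real_feats)
shifted = frechet_distance(real_feats, generated_feats)
```

Because the shifted set has the same covariance as the original, the distance reduces to the squared difference of the means, so a lower value indicates feature distributions that are closer together.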

