Character-Aware Models Improve Visual Text Rendering

12/20/2022
by Rosanne Liu, et al.

Current image generation models struggle to reliably produce well-formed visual text. In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word's visual makeup as a series of glyphs. To quantify the extent of this effect, we conduct a series of controlled experiments comparing character-aware vs. character-blind text encoders. In the text-only domain, we find that character-aware models provide large gains on a novel spelling task (WikiSpell). Transferring these findings to the visual domain, we train a suite of image generation models and show that character-aware variants outperform their character-blind counterparts across a range of novel text rendering tasks (our DrawText benchmark). Our models set a much higher state of the art on visual spelling, with 30+ point accuracy gains over competitors on rare words, despite training on far fewer examples.
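
To make the character-aware vs. character-blind distinction concrete, the following is a minimal sketch and not the paper's implementation. It assumes a tiny hypothetical vocabulary (`TOY_VOCAB`) with greedy longest-match segmentation as a stand-in for a subword tokenizer, and uses UTF-8 bytes as a stand-in for character-level input; the function names `subword_ids` and `char_ids` are illustrative only.

```python
# Hypothetical toy vocabulary, invented for illustration; not from any real model.
TOY_VOCAB = {"straw": 0, "berry": 1, "s": 2, "t": 3, "r": 4,
             "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}


def subword_ids(word, vocab=TOY_VOCAB):
    """Character-blind view: greedy longest-match split into opaque vocabulary IDs."""
    ids, i = [], 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                ids.append(vocab[word[i:j]])
                i = j
                break
        else:
            raise ValueError(f"cannot segment {word!r} at position {i}")
    return ids


def char_ids(word):
    """Character-aware view: one ID per UTF-8 byte, so the spelling is explicit."""
    return list(word.encode("utf-8"))


print(subword_ids("strawberry"))  # two opaque IDs; the letters are not visible
print(char_ids("strawberry"))     # ten IDs, one per letter of this ASCII word
```

Under these assumptions, the subword view gives the encoder two IDs whose spelling it must memorize, while the byte view exposes every letter directly, which is the kind of input feature the abstract argues is missing from popular text-to-image models.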

research
04/17/2017

Learning Character-level Compositionality with Visual Features

Previous work has modeled the compositionality of words by creating char...
research
01/11/2020

Authorship Attribution in Bangla literature using Character-level CNN

Characters are the smallest unit of text that can extract stylometric si...
research
11/04/2022

Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models

While text-to-image synthesis currently enjoys great popularity among re...
research
11/07/2022

Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

Machine learning models are now able to convert user-written text descri...
research
05/24/2023

Transferring Visual Attributes from Natural Language to Verified Image Generation

Text to image generation methods (T2I) are widely popular in generating ...
research
10/16/2022

Character-Centric Story Visualization via Visual Planning and Token Alignment

Story visualization advances the traditional text-to-image generation by...
research
12/03/2020

Evolving Character-Level DenseNet Architectures using Genetic Programming

DenseNet architectures have demonstrated impressive performance in image...
