XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

04/11/2022
by Wei Liu, et al.

Generating a new font library is a labor-intensive and time-consuming job for glyph-rich scripts. Few-shot font generation is therefore desirable: it requires only a few reference glyphs and no fine-tuning at test time. Existing methods follow the style-content disentanglement paradigm and produce novel fonts by combining the style codes of the reference glyphs with the content representations of the source. However, these few-shot font generation methods either fail to capture content-independent style representations, or employ localized, component-wise style representations, which are insufficient to model many Chinese font styles involving hyper-component features such as inter-component spacing and "connected-stroke". To resolve these drawbacks and make the style representations more reliable, we propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder that is conditioned jointly on the glyph image and the corresponding stroke labels. The cross-modality encoder is pre-trained in a self-supervised manner to allow effective capture of cross- and intra-modality correlations, which facilitates content-style disentanglement and the modeling of style representations at all scales (stroke level, component level, and character level). The pre-trained encoder is then applied to the downstream font generation task without fine-tuning. Experimental comparisons with state-of-the-art methods demonstrate that our method successfully transfers styles at all scales. In addition, it requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task.
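To make the encoder design concrete, the following is a minimal PyTorch sketch of a cross-modality transformer encoder that consumes a glyph image and its stroke-label sequence as one token stream, so self-attention can capture both cross- and intra-modality correlations. All class names, dimensions, and the stroke vocabulary size here are illustrative assumptions; the paper's actual architecture and pre-training objective are not specified in this abstract.

```python
import torch
import torch.nn as nn

class CrossModalityEncoder(nn.Module):
    """Illustrative cross-modality encoder: glyph-image patches and
    stroke-label embeddings share one transformer, letting self-attention
    mix the two modalities."""

    def __init__(self, img_size=128, patch=16, d_model=256,
                 stroke_vocab=33, max_strokes=64, n_layers=6, n_heads=8):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # ViT-style patch embedding for the grayscale glyph image.
        self.patch_embed = nn.Conv2d(1, d_model, kernel_size=patch, stride=patch)
        # Embedding table for stroke labels (vocabulary size is assumed).
        self.stroke_embed = nn.Embedding(stroke_vocab, d_model)
        # Learned positional embeddings, one set per modality.
        self.img_pos = nn.Parameter(torch.zeros(1, n_patches, d_model))
        self.stroke_pos = nn.Parameter(torch.zeros(1, max_strokes, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, glyph, strokes):
        # glyph: (B, 1, H, W) image; strokes: (B, L) integer stroke labels.
        img_tok = self.patch_embed(glyph).flatten(2).transpose(1, 2) + self.img_pos
        stroke_tok = self.stroke_embed(strokes) + self.stroke_pos[:, :strokes.size(1)]
        # Concatenate both modalities into a single token sequence.
        tokens = torch.cat([img_tok, stroke_tok], dim=1)
        return self.encoder(tokens)  # (B, n_patches + L, d_model)


# Usage with random inputs (shapes only, not the paper's data):
enc = CrossModalityEncoder()
feats = enc(torch.randn(2, 1, 128, 128), torch.randint(0, 33, (2, 20)))
print(feats.shape)  # torch.Size([2, 84, 256])
```

In the paper's pipeline, a fused representation of this kind would feed the style-content disentanglement and the downstream generator; a self-supervised pretext task would pre-train the encoder, but the specific objective is not given in this abstract.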

Related research

Few shot font generation via transferring similarity guided global style and quantization local style (09/02/2023)
Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts (04/02/2021)
Few-shot Font Generation with Weakly Supervised Localized Representations (12/22/2021)
Few-shot Font Generation with Localized Style Representations and Factorization (09/23/2020)
Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification (03/09/2022)
Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification (12/12/2021)
Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator (04/30/2022)
