Learning Character-level Compositionality with Visual Features

04/17/2017
by Frederick Liu, et al.

Previous work has modeled the compositionality of words by creating character-level models of meaning, reducing problems of sparsity for rare words. However, in many writing systems compositionality also has an effect at the character level: the meaning of a character is derived from the sum of its parts. In this paper, we model this effect by creating embeddings for characters based on their visual characteristics: we create an image of each character and run it through a convolutional neural network to produce a visual character embedding. Experiments on a text classification task demonstrate that such a model allows for better processing of instances with rare characters in languages such as Chinese, Japanese, and Korean. Additionally, qualitative analyses demonstrate that our proposed model learns to focus on the parts of characters that carry semantic content, resulting in embeddings that are coherent in visual space.
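
The pipeline described in the abstract (character image, then convolution, then embedding) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not the authors' architecture: the 6x6 bitmaps `GLYPH_A` and `GLYPH_B` are hypothetical hand-drawn stand-ins for rendered character images, the kernels are random and untrained, and a single convolution with ReLU and global max pooling replaces a full convolutional network.

```python
import random


def parse_glyph(rows):
    """Turn rows of '#'/'.' into a 2-D list of 1.0/0.0 pixel values."""
    return [[1.0 if ch == "#" else 0.0 for ch in row] for row in rows]


def conv2d_relu(image, kernel):
    """Valid 2-D cross-correlation of image with kernel, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def global_max_pool(feature_map):
    """Reduce a feature map to its single largest activation."""
    return max(max(row) for row in feature_map)


def visual_embedding(glyph, kernels):
    """One embedding dimension per kernel: conv, ReLU, then global max pool."""
    return [global_max_pool(conv2d_relu(glyph, k)) for k in kernels]


# Hypothetical bitmaps standing in for rendered character images.
# GLYPH_B reuses the left-hand strokes of GLYPH_A, mimicking a shared component.
GLYPH_A = parse_glyph([
    "..#...",
    "######",
    "..#...",
    ".#.#..",
    "#...#.",
    "......",
])
GLYPH_B = parse_glyph([
    "..#...",
    "###.##",
    "..#..#",
    ".#.#.#",
    "#...##",
    "......",
])

# Untrained 3x3 kernels with a fixed seed (assumption: a real model learns these).
_rng = random.Random(0)
KERNELS = [[[_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]

EMB_A = visual_embedding(GLYPH_A, KERNELS)
EMB_B = visual_embedding(GLYPH_B, KERNELS)
```

Because the embedding is computed from pixels rather than looked up by character identity, a rare character that shares strokes with a frequent one still receives a related representation; in a trained model the kernels would be learned jointly with the downstream classifier.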

research
08/31/2017

Glyph-aware Embedding of Chinese Characters

Given the advantage and recent success of English character-level and su...
research
12/20/2022

Character-Aware Models Improve Visual Text Rendering

Current image generation models struggle to reliably produce well-formed...
research
07/13/2020

A theory of interaction semantics

The aim of this article is to delineate a theory of interaction semantic...
research
06/04/2022

APES: Articulated Part Extraction from Sprite Sheets

Rigged puppets are one of the most prevalent representations to create 2...
research
01/26/2021

Coloring the Black Box: What Synesthesia Tells Us about Character Embeddings

In contrast to their word- or sentence-level counterparts, character emb...
research
09/06/2017

The Voynich Manuscript is Written in Natural Language: The Pahlavi Hypothesis

The late medieval Voynich Manuscript (VM) has resisted decryption and wa...
research
04/05/2023

Efficient OCR for Building a Diverse Digital History

Thousands of users consult digital archives daily, but the information t...
