Effect of Text Color on Word Embeddings

04/18/2020
by Masaya Ikoma, et al.

In natural scenes and documents, a word and the color it is printed in are often correlated. For instance, the word "hot" is often printed in red, while "cold" is often printed in blue. This correlation can be treated as a feature that captures semantic differences between words. Based on this observation, we propose using text color for word embeddings. While text-only word embeddings (e.g., word2vec) have been extremely successful, they often represent antonyms as similar because antonyms are frequently interchangeable in sentences. In this paper, we carry out two tasks to verify the usefulness of text color for understanding the meanings of words, especially for identifying synonyms and antonyms. First, we quantify the color distribution of words in book cover images and analyze the correlation between a word's color and its meaning. Second, we retrain word embeddings with the color distribution of words as a constraint. By observing how the word embeddings of synonyms and antonyms change after retraining, we aim to understand which kinds of words are affected positively or negatively when text color information is incorporated.
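
As a rough illustration of the first task, the sketch below builds a per-word color distribution from (word, RGB) observations and compares distributions between words. It is only a minimal sketch under stated assumptions: the abstract does not say how colors are quantized or compared, so the six hue bins, the L1 distance, the toy observations, and every function name here are hypothetical choices for illustration, not the authors' method.

```python
import colorsys
from collections import defaultdict

N_BINS = 6  # hypothetical number of hue bins; the abstract does not specify one


def hue_bin(rgb, n_bins=N_BINS):
    """Map an (r, g, b) triple with 0..255 channels to a hue bin index."""
    r, g, b = (c / 255.0 for c in rgb)
    h, _, _ = colorsys.rgb_to_hsv(r, g, b)  # h lies in [0, 1)
    return min(int(h * n_bins), n_bins - 1)


def color_distributions(observations, n_bins=N_BINS):
    """Build a normalized hue histogram per word from (word, rgb) pairs."""
    counts = defaultdict(lambda: [0] * n_bins)
    for word, rgb in observations:
        counts[word][hue_bin(rgb, n_bins)] += 1
    return {w: [c / sum(hist) for c in hist] for w, hist in counts.items()}


def l1_distance(p, q):
    """L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Toy observations (made up for illustration, not data from the paper).
observations = [
    ("hot", (250, 30, 10)), ("hot", (230, 40, 20)),
    ("warm", (240, 50, 10)),
    ("cold", (20, 40, 220)), ("cold", (10, 60, 230)),
]

dists = color_distributions(observations)
hot_vs_warm = l1_distance(dists["hot"], dists["warm"])
hot_vs_cold = l1_distance(dists["hot"], dists["cold"])
print(hot_vs_warm, hot_vs_cold)
```

In this toy setup the reddish words "hot" and "warm" share a hue bin while "cold" falls in a bluish one, so the color distance separates the antonym pair even if a text-only embedding would place the two words close together. A distance of this kind is one candidate for the constraint used when retraining the embeddings in the second task.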

