GlobalTrait: Personality Alignment of Multilingual Word Embeddings

11/01/2018
by Farhad Bin Siddique, et al.

We propose a multilingual model to recognize Big Five personality traits from text data in four languages: English, Spanish, Dutch and Italian. Our analysis shows that words with similar semantic meaning in different languages do not necessarily correspond to the same personality traits. We therefore propose a personality alignment method, GlobalTrait, which learns a mapping for each trait from the source language to the target language (English), such that words that correlate positively with each trait are close together in the multilingual vector space. Training on these aligned embeddings lets us transfer personality-related training features from high-resource languages such as English to low-resource languages, and yields better multilingual results than simple monolingual or unaligned multilingual embeddings. Averaged across the three non-English languages, the F-score rises from 65 to 73.4 (+8.4) when comparing our monolingual model to the multilingual model that uses a CNN with personality-aligned embeddings. We also show relatively good performance on the regression tasks, and better classification results when evaluating our model on a separate Chinese dataset.
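To make the idea of a per-trait mapping concrete, here is a minimal sketch. It is not the GlobalTrait implementation: the abstract does not say how the mappings are learned, so the sketch assumes a plain linear map fitted by least squares with batch gradient descent. The function names, the trait key "extraversion", and the 2-dimensional toy vectors are all hypothetical.

```python
def apply_map(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def fit_linear_map(src, tgt, lr=0.5, epochs=200):
    """Fit W so that W @ src[i] is close to tgt[i] in the least-squares sense.

    Assumption: a linear map trained by batch gradient descent. The source
    does not state the actual objective or optimizer.
    """
    n, d_in, d_out = len(src), len(src[0]), len(tgt[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(src, tgt):
            pred = apply_map(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += err * x[j]
        # Gradient step on the mean squared error over all word pairs.
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j] / n
    return W


# Hypothetical toy data: for each trait, vectors of source-language words that
# correlate positively with the trait, paired with English target vectors.
trait_pairs = {
    "extraversion": (
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],    # source-language vectors
        [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]],  # target (English) vectors
    ),
}

# One mapping per trait, as the abstract describes.
trait_maps = {
    trait: fit_linear_map(src, tgt) for trait, (src, tgt) in trait_pairs.items()
}

# Map a source-language vector into the English space for this trait.
mapped = apply_map(trait_maps["extraversion"], [1.0, 0.0])
```

In this toy setup the target vectors are an exact rotation of the source vectors, so the fitted map recovers that rotation and `mapped` lands next to the paired English vector. Real embeddings would have hundreds of dimensions and noisy pairs, and a practical version would likely use a linear algebra library rather than nested loops.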


research
05/29/2019

Learning Multilingual Word Embeddings Using Image-Text Data

There has been significant interest recently in learning multilingual wo...
research
12/28/2021

Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora

The problem of comparing two bodies of text and searching for words that...
research
04/21/2020

Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

We present an easy and efficient method to extend existing sentence embe...
research
01/28/2020

Unsupervised Multilingual Alignment using Wasserstein Barycenter

We study unsupervised multilingual alignment, the problem of finding wor...
research
01/10/2022

Language-Agnostic Website Embedding and Classification

Currently, publicly available models for website classification do not o...
research
10/11/2022

Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models

While multilingual language models can improve NLP performance on low-re...
research
05/14/2019

Multilingual Factor Analysis

In this work we approach the task of learning multilingual word represen...