An exploration of the encoding of grammatical gender in word embeddings

08/05/2020
by Hartger Veeman, et al.

The vector representation of words, known as word embeddings, has opened a new research approach in the study of languages. These representations can capture different types of information about words. Grammatical gender is a classification of nouns based on their formal and semantic properties. Studying grammatical gender through word embeddings can give insight into discussions on how grammatical genders are determined. In this research, we compare different sets of word embeddings according to the accuracy of a neural classifier that determines the grammatical gender of nouns. We find that information about grammatical gender is encoded differently in Swedish, Danish, and Dutch embeddings. Our experimental results on contextualized embeddings indicate that adding more contextual (semantic) information to embeddings is detrimental to the classifier's performance. We also observe that removing morpho-syntactic features such as articles from the training corpora of the embeddings decreases classification performance dramatically, indicating that a large portion of the information is encoded in the relationship between nouns and articles.
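The comparison described here, scoring sets of embeddings by how accurately a classifier recovers gender from them, can be illustrated with a small probing sketch. This is not the authors' implementation: the embeddings are synthetic Gaussian vectors, the two labels stand in for two genders, the classifier is a single logistic unit rather than whatever network the paper uses, and every name (`make_toy_embeddings`, `train_probe`, `accuracy`, `evaluate`) is hypothetical.

```python
import math
import random


def make_toy_embeddings(signal, n_per_class=100, dim=8, seed=0):
    """Synthetic stand-in for noun embeddings (an assumption, not real data).

    Class 0 and class 1 differ only in the mean of the first coordinate,
    which is shifted by -signal and +signal respectively.
    """
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, -signal), (1, signal)):
        for _ in range(n_per_class):
            vec = [rng.gauss(centre, 1.0)]
            vec += [rng.gauss(0.0, 1.0) for _ in range(dim - 1)]
            data.append((vec, label))
    rng.shuffle(data)
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(train, dim, epochs=50, lr=0.1):
    """Fit a single logistic unit with plain stochastic gradient descent."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for vec, label in train:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, vec)) + b)
            grad = p - label  # derivative of the log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, vec)]
            b -= lr * grad
    return w, b


def accuracy(model, data):
    """Fraction of examples whose predicted label matches the true label."""
    w, b = model
    correct = 0
    for vec, label in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, vec)) + b)
        correct += int((p >= 0.5) == bool(label))
    return correct / len(data)


def evaluate(signal, dim=8):
    """Train on 80% of a toy embedding set and report held-out accuracy."""
    data = make_toy_embeddings(signal, dim=dim)
    split = int(0.8 * len(data))
    model = train_probe(data[:split], dim)
    return accuracy(model, data[split:])


# Two hypothetical embedding sets: one that encodes the label, one that does not.
acc_with_signal = evaluate(signal=2.0)
acc_without_signal = evaluate(signal=0.0)
print(f"probe accuracy with signal:    {acc_with_signal:.2f}")
print(f"probe accuracy without signal: {acc_without_signal:.2f}")
```

In this kind of setup, a gap in held-out accuracy between two embedding sets is read as evidence that one set encodes the property more accessibly than the other. That is the sense in which the embeddings are compared here.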
