Incorporating Subword Information into Matrix Factorization Word Embeddings

05/09/2018, by Alexandre Salle, et al.

The positive effect of adding subword information to word embeddings has been demonstrated for predictive models. In this paper we investigate whether similar benefits can also be derived from incorporating subwords into counting models. We evaluate the impact of different types of subwords (n-grams and unsupervised morphemes), with results confirming the importance of subword information in learning representations of rare and out-of-vocabulary words.
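As a minimal sketch of the kind of subword composition the abstract describes: a word's vector can be built by summing the vectors of its character n-grams (the fastText-style scheme), which lets rare and out-of-vocabulary words inherit information from n-grams they share with training words. The function names, n-gram range, and toy vectors below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams from a word padded with boundary
    markers ('<' and '>'), a common subword scheme."""
    padded = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams

def subword_vector(word, ngram_vectors, dim=50):
    """Compose a word vector as the sum of its n-gram vectors.
    N-grams missing from ngram_vectors are skipped, so even an
    out-of-vocabulary word gets a representation from the n-grams
    it shares with in-vocabulary words."""
    vec = np.zeros(dim)
    for g in char_ngrams(word):
        if g in ngram_vectors:
            vec += ngram_vectors[g]
    return vec
```

For example, an unseen inflection such as "catx" would still receive a non-zero vector from the n-grams ("<ca", "cat", "<cat") it shares with "cat", illustrating why subword information helps with rare and OOV words.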


