Semi-Supervised Multi-Task Word Embeddings

09/16/2018
by James O'Neill, et al.

Word embeddings have been shown to benefit from ensembling several word embedding sources, often carried out using straightforward mathematical operations over the set of vectors to produce a meta-embedding representation. More recently, unsupervised learning has been used to find a lower-dimensional representation, similar in size to the word embeddings within the ensemble. However, these methods do not use the manually labeled datasets that are available and are often used solely for evaluation. We propose to improve word embeddings by simultaneously learning to reconstruct an ensemble of pretrained word embeddings with supervision from various labeled word similarity datasets. This involves reconstructing word meta-embeddings while a Siamese network simultaneously learns word similarity, with both processes sharing a hidden layer. Experiments are carried out on 6 word similarity datasets and 3 analogy datasets. We find that performance improves over unsupervised learning methods on all word similarity datasets, with a mean increase of 11.33 in the Spearman correlation coefficient. Moreover, our approach performs best on 4 of the 6 word similarity datasets when using a cosine loss for reconstruction and a Brier loss for word similarity.
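The shared-hidden-layer idea in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the dimensions, the tanh encoder, and the equal weighting of the two objectives are all assumptions. It shows the two losses named above acting on one shared encoder: a cosine loss for reconstructing the concatenated ensemble embedding, and a Brier (squared-error) loss on a similarity prediction produced by a Siamese application of that same encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions (not from the paper): an ensemble of pretrained
# embeddings concatenated into a 300-d input, projected to a 100-d shared
# hidden layer that serves as the meta-embedding.
d_in, d_hid = 300, 100
W_enc = rng.normal(scale=0.01, size=(d_hid, d_in))   # shared encoder
W_dec = rng.normal(scale=0.01, size=(d_in, d_hid))   # reconstruction head

def encode(x):
    # Shared hidden layer used by both tasks.
    return np.tanh(W_enc @ x)

def cosine_loss(x, x_hat):
    # Reconstruction objective: 1 - cosine similarity, in [0, 2].
    num = float(x @ x_hat)
    den = np.linalg.norm(x) * np.linalg.norm(x_hat) + 1e-8
    return 1.0 - num / den

def brier_loss(p, y):
    # Brier score: squared error between a predicted similarity in [0, 1]
    # and a human similarity rating rescaled to [0, 1].
    return float((p - y) ** 2)

def multitask_losses(x1, x2, y):
    h1, h2 = encode(x1), encode(x2)       # Siamese: same encoder applied twice
    rec = cosine_loss(x1, W_dec @ h1)     # reconstruct the ensemble input
    # Predict similarity from the shared hidden layer, squashed to [0, 1].
    cos = float(h1 @ h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-8)
    sim = brier_loss((cos + 1.0) / 2.0, y)
    return rec, sim

# One word pair with a (hypothetical) rescaled human rating of 0.8.
x1, x2 = rng.normal(size=d_in), rng.normal(size=d_in)
rec, sim = multitask_losses(x1, x2, y=0.8)
```

In training, the two losses would be summed (possibly with a weighting) and backpropagated through the shared encoder, so that gradients from the supervised similarity task shape the same hidden layer that produces the meta-embedding.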


Related research

08/13/2018: Angular-Based Word Meta-Embedding Learning
02/07/2017: How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks
06/04/2018: Absolute Orientation for Word Embedding Alignment
10/07/2019: Correlations between Word Vector Sets
08/22/2019: Unsupervised Lemmatization as Embeddings-Based Word Clustering
10/11/2021: A Comprehensive Comparison of Word Embeddings in Event Entity Coreference Resolution
07/11/2016: The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction
