Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features

03/07/2017
by Matteo Pagliardini, et al.

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
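As the title indicates, the method composes a sentence representation from n-gram features. A minimal sketch of one plausible composition step, averaging the vectors of a sentence's unigrams and bigrams, is shown below; the function and variable names are hypothetical, and the lookup table stands in for learned embeddings:

```python
import numpy as np

def sentence_embedding(tokens, vectors, dim=100):
    """Compose a sentence embedding by averaging the vectors of its
    unigrams and contiguous bigrams (a sketch of compositional
    n-gram features; `vectors` maps n-gram strings to arrays)."""
    # Collect unigrams and contiguous bigrams from the token sequence.
    ngrams = list(tokens)
    ngrams += [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    # Average the vectors of all n-grams found in the vocabulary.
    found = [vectors[g] for g in ngrams if g in vectors]
    if not found:
        return np.zeros(dim)
    return np.mean(found, axis=0)

# Usage: embed a tokenized sentence with toy random vectors.
rng = np.random.default_rng(0)
vocab = {w: rng.standard_normal(100) for w in
         ["sentence", "embeddings", "are", "useful",
          "sentence embeddings"]}
emb = sentence_embedding(["sentence", "embeddings", "are", "useful"], vocab)
print(emb.shape)  # (100,)
```

In the paper's setting the n-gram vectors would be trained with the unsupervised objective rather than drawn at random; the sketch only illustrates the compositional averaging.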


research 11/17/2018
Correcting the Common Discourse Bias in Linear Representation of Sentences using Conceptors
Distributed representations of words, better known as word embeddings, h...

research 10/23/2017
Testing the limits of unsupervised learning for semantic similarity
Semantic Similarity between two sentences can be defined as a way to det...

research 10/17/2017
Unsupervised Sentence Representations as Word Information Series: Revisiting TF-IDF
Sentence representation at the semantic level is a challenging task for ...

research 03/30/2018
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
A lot of the recent success in natural language processing (NLP) has bee...

research 02/07/2021
Unsupervised Sentence-embeddings by Manifold Approximation and Projection
The concept of unsupervised universal sentence encoders has gained tract...

research 10/06/2020
Compositional Demographic Word Embeddings
Word embeddings are usually derived from corpora containing text from ma...

research 10/20/2018
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Reasoning about implied relationships (e.g. paraphrastic, common sense, ...
