Learning Distributed Representations of Sentences from Unlabelled Data

02/10/2016
by Felix Hill, et al.

Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.
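To make "decoded with simple spatial distance metrics" concrete, the following is a minimal sketch rather than any of the models compared in the paper. It assumes a sentence is represented by averaging word vectors, and that two sentences are compared by cosine similarity. The toy two-dimensional vectors in `TOY_WORD_VECTORS` and the helper names `embed_sentence` and `cosine_similarity` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy word vectors (illustrative values, not learned from data).
TOY_WORD_VECTORS = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "sleeps": [0.5, 0.5],
    "car": [0.1, 1.0],
    "truck": [0.2, 0.9],
}


def embed_sentence(sentence, word_vectors):
    """Average the vectors of known words; return a zero vector if none are known."""
    dim = len(next(iter(word_vectors.values())))
    known = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; defined as 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


a = embed_sentence("cat sleeps", TOY_WORD_VECTORS)
b = embed_sentence("dog sleeps", TOY_WORD_VECTORS)
c = embed_sentence("truck car", TOY_WORD_VECTORS)

# With these toy vectors, the two animal sentences land closer together
# than the animal sentence and the vehicle sentence.
print(cosine_similarity(a, b) > cosine_similarity(a, c))
```

In this setting no classifier is trained on top of the representations: similarity is read directly off the geometry of the space, which is the kind of use the abstract contrasts with feeding representations into supervised systems.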


research
03/07/2018

An efficient framework for learning sentence representations

In this work we propose a simple and efficient framework for learning se...
research
05/09/2018

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks

Experimental evidence indicates that simple models outperform complex de...
research
03/30/2018

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

A lot of the recent success in natural language processing (NLP) has bee...
research
05/22/2023

Beyond Words: A Comprehensive Survey of Sentence Representations

Sentence representations have become a critical component in natural lan...
research
10/29/2018

Learning Better Internal Structure of Words for Sequence Labeling

Character-based neural models have recently proven very useful for many ...
research
05/13/2023

A Simple and Plug-and-play Method for Unsupervised Sentence Representation Enhancement

Generating proper embedding of sentences through an unsupervised way is ...
research
04/04/2017

Interpretation of Semantic Tweet Representations

Research in analysis of microblogging platforms is experiencing a renewe...
