Sparse Lifting of Dense Vectors: Unifying Word and Sentence Representations

11/05/2019
by   Wenye Li, et al.
0

As the first step in automated natural language processing, representing words and sentences is of central importance and has attracted significant research attention. Different approaches, from the early one-hot and bag-of-words representation to more recent distributional dense and sparse representations, were proposed. Despite the successful results that have been achieved, such vectors tend to consist of uninterpretable components and face nontrivial challenge in both memory and computational requirement in practical applications. In this paper, we designed a novel representation model that projects dense word vectors into a higher dimensional space and favors a highly sparse and binary representation of word vectors with potentially interpretable components, while trying to maintain pairwise inner products between original vectors as much as possible. Computationally, our model is relaxed as a symmetric non-negative matrix factorization problem which admits a fast yet effective solution. In a series of empirical evaluations, the proposed model exhibited consistent improvement and high potential in practical applications.

READ FULL TEXT

page 2

page 3

page 4

page 5

page 6

page 8

page 9

page 10

research
06/17/2015

Non-distributional Word Vector Representations

Data-driven representation learning for words is a technique of central ...
research
02/17/2018

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

Distributed representations of words learned from text have proved to be...
research
07/27/2019

Modeling Winner-Take-All Competition in Sparse Binary Projections

Inspired by the advances in biological science, the study of sparse bina...
research
08/02/2016

Semantic Representations of Word Senses and Concepts

Representing the semantics of linguistic items in a machine-interpretabl...
research
05/06/2016

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec

Distributed dense word vectors have been shown to be effective at captur...
research
12/13/2018

An Unbiased Approach to Quantification of Gender Inclination using Interpretable Word Representations

Recent advances in word embedding provide significant benefit to various...
research
09/18/2017

Paraphrasing verbal metonymy through computational methods

Verbal metonymy has received relatively scarce attention in the field of...

Please sign up or login with your details

Forgot password? Click here to reset