Enhancing the LexVec Distributed Word Representation Model Using Positional Contexts and External Memory

06/03/2016
by Alexandre Salle, et al.

In this paper we take a state-of-the-art model for distributed word representation, one that explicitly factorizes the positive pointwise mutual information (PPMI) matrix using window sampling and negative sampling, and address two of its shortcomings. We improve syntactic performance by using positional contexts, and we eliminate the need to hold the PPMI matrix in memory by operating on aggregate data in external memory. The effectiveness of both modifications is demonstrated on word similarity and analogy tasks.
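For readers unfamiliar with the quantities involved: PPMI(w, c) = max(0, log P(w, c) / (P(w) P(c))), estimated from (word, context) co-occurrence counts. The sketch below is a minimal illustration, not the authors' implementation; the function name, the offset-tagging scheme, and the toy corpus are all hypothetical. It computes a sparse PPMI table from counts and shows one common way to realize positional contexts, namely tagging each context word with its offset relative to the target word.

```python
import math
from collections import Counter

def ppmi_counts(corpus, window=2, positional=False):
    """Sparse PPMI table from co-occurrence counts (illustrative sketch).

    PPMI(w, c) = max(0, log(P(w, c) / (P(w) * P(c)))), estimated from
    counts over (word, context) pairs. With positional=True, each
    context word is tagged with its offset relative to the target
    word (e.g. "dog_-1"), one way to realize positional contexts.
    """
    pair, word, ctx = Counter(), Counter(), Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                c = f"{sent[j]}_{j - i}" if positional else sent[j]
                pair[(w, c)] += 1
                word[w] += 1
                ctx[c] += 1
    total = sum(pair.values())
    # PMI(w, c) = log(#(w,c) * total / (#(w) * #(c))); keep positive cells.
    return {
        (w, c): math.log(n * total / (word[w] * ctx[c]))
        for (w, c), n in pair.items()
        if n * total > word[w] * ctx[c]
    }

# Toy usage: with positional contexts, "dog" one slot to the left of
# "barks" is a different context than "dog" one slot to the right.
toy = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
for (w, c), v in sorted(ppmi_counts(toy, window=1, positional=True).items()):
    print(f"{w:6s} {c:10s} {v:.3f}")
```

Note that building the full table, as above, is exactly what the paper avoids: window sampling and negative sampling let the factorization touch only sampled cells, and the external-memory variant described in the abstract works from aggregate data rather than an in-memory matrix.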


Related research

06/02/2016
Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations
In this paper, we propose LexVec, a new method for generating distribute...

09/15/2021
Fast Extraction of Word Embedding from Q-contexts
The notion of word embedding plays a fundamental role in natural languag...

07/07/2016
Representing Verbs with Rich Contexts: an Evaluation on Verb Similarity
Several studies on sentence processing suggest that the mental lexicon k...

03/17/2017
Construction of a Japanese Word Similarity Dataset
An evaluation of distributed word representation is generally conducted ...

10/28/2009
Word Sense Disambiguation Based on Mutual Information and Syntactic Patterns
This paper describes a hybrid system for WSD, presented to the English a...

08/19/2019
Why So Down? The Role of Negative (and Positive) Pointwise Mutual Information in Distributional Semantics
In distributional semantics, the pointwise mutual information (PMI) weig...
