PSDVec: a Toolbox for Incremental and Scalable Word Embedding

06/10/2016
by Shaohua Li, et al.

PSDVec is a Python/Perl toolbox that learns word embeddings, i.e., mappings from words in a natural language to continuous vectors that encode the semantic and syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algorithm that learns the embeddings incrementally. This strategy greatly reduces the time needed to learn embeddings for a large vocabulary, and it can learn the embeddings of new words without re-learning the whole vocabulary. On 9 word similarity/analogy benchmark sets and 2 Natural Language Processing (NLP) tasks, PSDVec produces embeddings that have the best average performance among popular word embedding tools. PSDVec provides a new option for NLP practitioners.
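
To make the two ideas in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not PSDVec's code or API. It assumes an unweighted simplification: a symmetric word-similarity matrix G (for example, PMI values) is approximated by V^T V through an eigendecomposition, whereas the toolbox uses a weighted approximation. The incremental step is likewise only an illustration: embeddings of new words are fitted by ordinary least squares against fixed core embeddings. The function names `low_rank_psd_embeddings` and `embed_new_words` are hypothetical.

```python
import numpy as np


def low_rank_psd_embeddings(G, dim):
    """Unweighted low-rank PSD approximation of a symmetric matrix G.

    Returns V with shape (dim, n) so that V.T @ V approximates G.
    Assumption: this drops the weighting used by the actual method.
    """
    G = (G + G.T) / 2.0                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(G)     # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]    # indices of the largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)   # negative eigenvalues cannot be used
    return (eigvecs[:, top] * np.sqrt(lam)).T


def embed_new_words(V_core, G_new_core):
    """Fit embeddings of new words while keeping the core embeddings fixed.

    V_core:     (dim, n_core) embeddings of the already learned words.
    G_new_core: (n_new, n_core) target similarities between new and core words.
    Solves V_core.T @ v = g in the least-squares sense for each new word.
    """
    solution, *_ = np.linalg.lstsq(V_core.T, G_new_core.T, rcond=None)
    return solution                          # shape (dim, n_new)


if __name__ == "__main__":
    # Toy data: a similarity matrix generated from hidden 3-dimensional vectors.
    rng = np.random.default_rng(0)
    hidden = rng.standard_normal((3, 6))
    G = hidden.T @ hidden

    core = low_rank_psd_embeddings(G[:4, :4], dim=3)   # learn 4 core words
    new = embed_new_words(core, G[4:, :4])             # add 2 words afterwards
    print(core.shape, new.shape)
```

The second function illustrates why incremental learning can be cheap: once the core block is fixed, each additional word only requires solving a small linear system instead of refactorizing the full matrix.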


