A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches

03/13/2023
by Obaidullah Zaland, et al.

Vector-based word representations help countless Natural Language Processing (NLP) tasks capture both the semantic and syntactic regularities of a language. In this paper, we present the characteristics of existing word embedding approaches and analyze them with regard to many classification tasks. We categorize the methods into two main groups. Traditional approaches mostly use matrix factorization to produce word representations, and they do not capture the semantic and syntactic regularities of the language very well. Neural-network-based approaches, on the other hand, can capture sophisticated regularities of the language and preserve word relationships in the generated representations. We report experimental results on multiple classification tasks and highlight the scenarios in which one approach performs better than the rest.
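
As a rough illustration of what the abstract calls the traditional, matrix-factorization route, the sketch below counts word co-occurrences in a toy corpus and truncates an SVD of the log-scaled counts. It is not the method evaluated in the paper: the function name `cooccurrence_embeddings`, the window size, the log scaling, the target dimension, and the two-sentence corpus are all assumptions chosen for brevity.

```python
import numpy as np


def cooccurrence_embeddings(sentences, window=2, dim=2):
    """Toy matrix-factorization embeddings (hypothetical helper, not from the paper)."""
    # Build a vocabulary and a word -> row index lookup.
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Count how often each pair of words appears within `window` tokens.
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        tokens = s.split()
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[tokens[j]]] += 1

    # Factorize the log-scaled counts and keep the top `dim` components.
    u, singular_values, _ = np.linalg.svd(np.log1p(counts))
    return vocab, u[:, :dim] * singular_values[:dim]


# Assumed toy corpus; real evaluations use far larger text collections.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
vocab, vectors = cooccurrence_embeddings(corpus)
print(vectors.shape)
```

Each row of `vectors` is the embedding of the word at the same position in `vocab`. Neural approaches, by contrast, learn such vectors by optimizing a prediction objective instead of factorizing a count matrix directly.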


research
05/07/2023

An Investigation on Word Embedding Offset Clustering as Relationship Classification

Vector representations obtained from word embedding are the source of ma...
research
10/24/2018

Local Homology of Word Embeddings

Topological data analysis (TDA) has been widely used to make progress on...
research
06/24/2016

Evaluation method of word embedding by roots and affixes

Word embedding has been shown to be remarkably effective in a lot of Nat...
research
06/06/2020

Quantum-like Generalization of Complex Word Embedding: a lightweight approach for textual classification

In this paper, we present an extension, and an evaluation, to existing Q...
research
09/06/2019

Efficient Sentence Embedding using Discrete Cosine Transform

Vector averaging remains one of the most popular sentence embedding meth...
research
11/01/2018

Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

Deep learning models have become state of the art for natural language p...
research
01/31/2021

Introduction of a novel word embedding approach based on technology labels extracted from patent data

Diversity in patent language is growing and makes finding synonyms for c...
