An empirical study on large scale text classification with skip-gram embeddings

06/21/2016
by Georgios Balikas, et al.

We investigate the integration of word embeddings as classification features in the setting of large-scale text classification. Such representations have been used in a plethora of tasks; however, their application in classification scenarios with thousands of classes has not been extensively researched, partly due to hardware limitations. In this work, we examine efficient composition functions that derive document-level embeddings from word-level embeddings, and we then investigate their combination with traditional one-hot-encoding representations. By presenting empirical evidence on large, multi-class, multi-label classification problems, we demonstrate the efficiency and the performance benefits of this combination.
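
To make the idea of composing word-level embeddings into a document-level vector and combining it with a one-hot representation concrete, here is a minimal sketch. It is an illustration only, not the authors' implementation: the choice of average, element-wise minimum, and element-wise maximum as composition functions, the binary word-presence encoding, the toy embedding table `EMB`, the vocabulary `VOCAB`, and all function names are assumptions made for this example.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def compose(tokens: Sequence[str], embeddings: Dict[str, Vector], dim: int) -> Vector:
    """Concatenate average, element-wise min and element-wise max pooling.

    The three pooling functions are assumed for illustration; the abstract
    only says that "efficient composition functions" are examined.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        # No known word in the document: fall back to a zero vector.
        return [0.0] * (3 * dim)
    columns = list(zip(*vectors))  # one tuple per embedding dimension
    average = [sum(c) / len(c) for c in columns]
    minimum = [min(c) for c in columns]
    maximum = [max(c) for c in columns]
    return average + minimum + maximum


def one_hot(tokens: Sequence[str], vocabulary: Sequence[str]) -> Vector:
    """Binary word-presence vector over a fixed vocabulary (assumed encoding)."""
    present = set(tokens)
    return [1.0 if word in present else 0.0 for word in vocabulary]


def document_features(tokens: Sequence[str], embeddings: Dict[str, Vector],
                      vocabulary: Sequence[str], dim: int) -> Vector:
    """Combine both views by simple concatenation (assumed combination)."""
    return one_hot(tokens, vocabulary) + compose(tokens, embeddings, dim)


# Hypothetical 2-dimensional embeddings and vocabulary, for illustration only.
EMB = {"large": [0.5, -1.0], "scale": [1.5, 1.0], "text": [-1.0, 0.0]}
VOCAB = ["large", "scale", "text", "label"]

features = document_features(["large", "scale", "unknown"], EMB, VOCAB, dim=2)
print(features)
```

In this sketch the resulting vector has `len(VOCAB) + 3 * dim` entries and could be passed to any linear classifier. In a real large-scale setting the one-hot part would normally be stored as a sparse vector rather than a dense list.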

research
04/03/2018

Incorporating Word Embeddings into Open Directory Project based Large-scale Classification

Recently, implicit representation models, such as embedding or deep lear...
research
08/25/2023

MatchXML: An Efficient Text-label Matching Framework for Extreme Multi-label Text Classification

The eXtreme Multi-label text Classification (XMC) refers to training a cl...
research
03/30/2015

LSHTC: A Benchmark for Large-Scale Text Classification

LSHTC is a series of challenges which aims to assess the performance of ...
research
03/10/2020

Text classification with word embedding regularization and soft similarity measure

Since the seminal work of Mikolov et al., word embeddings have become th...
research
06/03/2020

Exploiting Class Labels to Boost Performance on Embedding-based Text Classification

Text classification is one of the most frequent tasks for processing tex...
research
06/24/2021

byteSteady: Fast Classification Using Byte-Level n-Gram Embeddings

This article introduces byteSteady – a fast model for classification usi...
research
08/14/2019

On the Robustness of Projection Neural Networks For Efficient Text Representation: An Empirical Study

Recently, there has been strong interest in developing natural language ...