Bag of Tricks for Efficient Text Classification

07/06/2016
by   Armand Joulin, et al.

This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier, fastText, is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

research
07/27/2023

Gzip versus bag-of-words for text classification with KNN

The effectiveness of compression distance in KNN-based text classificati...
research
02/17/2017

Analysis and Optimization of fastText Linear Text Classifier

The paper [1] shows that simple linear classifier can compete with compl...
research
09/18/2017

Word Vector Enrichment of Low Frequency Words in the Bag-of-Words Model for Short Text Multi-class Classification Problems

The bag-of-words model is a standard representation of text for many lin...
research
06/19/2017

Topic Modeling for Classification of Clinical Reports

Electronic health records (EHRs) contain important clinical information ...
research
06/12/2023

Linear Classifier: An Often-Forgotten Baseline for Text Classification

Large-scale pre-trained language models such as BERT are popular solutio...
research
12/12/2016

FastText.zip: Compressing text classification models

We consider the problem of producing compact architectures for text clas...
research
08/29/2018

Centroid estimation based on symmetric KL divergence for Multinomial text classification problem

We define a new method to estimate centroid for text classification base...
