An Amharic News Text classification Dataset

03/10/2021
by   Israel Abebe Azime, et al.

In NLP, text classification is one of the primary problems we try to solve, and its usefulness in language analysis is indisputable. The lack of labeled training data has made these tasks harder in low-resource languages like Amharic. Collecting, labeling, and annotating this kind of data, and making it useful, will encourage junior researchers, schools, and machine learning practitioners to apply existing classification models to their language. In this short paper, we introduce an Amharic text classification dataset that consists of more than 50k news articles categorized into 6 classes. The dataset is released with simple baseline results to encourage further study and experiments that improve on them.


research
10/23/2020

KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text Classification for Kinyarwanda and Kirundi

Recent progress in text classification has been focused on high-resource...
research
07/07/2019

Improving short text classification through global augmentation methods

We study the effect of different approaches to text augmentation. To do ...
research
12/26/2019

Text Classification for Azerbaijani Language Using Machine Learning and Embedding

Text classification systems will help to solve the text clustering probl...
research
09/25/2019

The Power of Communities: A Text Classification Model with Automated Labeling Process Using Network Community Detection

The text classification is one of the most critical areas in machine lea...
research
06/01/2020

Concept Matching for Low-Resource Classification

We propose a model to tackle classification tasks in the presence of ver...
research
12/16/2022

Azimuth: Systematic Error Analysis for Text Classification

We present Azimuth, an open-source and easy-to-use tool to perform error...
research
05/05/2020

Establishing Baselines for Text Classification in Low-Resource Languages

While transformer-based finetuning techniques have proven effective in t...
