New Arabic Medical Dataset for Diseases Classification

06/29/2021
by Jaafar Hammoud, et al.

The Arabic language suffers from a great shortage of datasets suitable for training deep learning models, and the existing ones cover only general, non-specialized categories. In this work, we introduce a new Arabic medical dataset of two thousand medical documents collected from several Arabic medical websites and from the Arab Medical Encyclopedia. The dataset was built for text classification and covers 10 disease classes (Blood, Bone, Cardiovascular, Ear, Endocrine, Eye, Gastrointestinal, Immune, Liver and Nephrological). Experiments on the dataset were performed by fine-tuning three pre-trained models: BERT from Google; AraBERT, which is based on BERT and trained on a large Arabic corpus; and AraBioNER, which is based on AraBERT and trained on an Arabic medical corpus.
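As a rough illustration of how a 10-class corpus like this might be prepared before fine-tuning, the sketch below builds a label mapping from the class names listed above and performs a per-class (stratified) train/test split. Only the class names come from the abstract. The helper name `stratified_split`, the 20% test fraction, the fixed seed, and the evenly balanced placeholder labels are all assumptions made for illustration, not details reported by the authors.

```python
import random
from collections import defaultdict

# Class names are taken from the abstract; everything else here is illustrative.
CLASSES = [
    "Blood", "Bone", "Cardiovascular", "Ear", "Endocrine",
    "Eye", "Gastrointestinal", "Immune", "Liver", "Nephrological",
]

# Integer ids are what a classification head is typically trained against.
LABEL2ID = {name: i for i, name in enumerate(CLASSES)}
ID2LABEL = {i: name for name, i in LABEL2ID.items()}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class at roughly test_fraction.

    Hypothetical helper: the split ratio and seed are assumptions.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder labels: 2000 documents spread evenly over the 10 classes.
# The real class distribution is not stated in the abstract.
labels = [i % len(CLASSES) for i in range(2000)]
train_idx, test_idx = stratified_split(labels)
```

Splitting per class keeps every disease category represented in both partitions, which matters when a corpus has only a couple of hundred documents per class.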


