New Arabic Medical Dataset for Diseases Classification

06/29/2021
by Jaafar Hammoud, et al.

The Arabic language suffers from a severe shortage of datasets suitable for training deep learning models, and the existing ones cover only general, non-specialized categories. In this work, we introduce a new Arabic medical dataset of two thousand medical documents collected from several Arabic medical websites and from the Arab Medical Encyclopedia. The dataset was built for text classification and covers 10 disease classes: Blood, Bone, Cardiovascular, Ear, Endocrine, Eye, Gastrointestinal, Immune, Liver, and Nephrological. Experiments on the dataset were performed by fine-tuning three pre-trained models: BERT from Google; AraBERT, which is based on BERT and trained on a large Arabic corpus; and AraBioNER, which is based on AraBERT and trained on an Arabic medical corpus.
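As a rough illustration of the classification setup described in the abstract, the ten class names can be mapped to integer labels and handed to a sequence-classification head. This is a minimal sketch, not the authors' code: the class order, the helper name `build_classifier`, and the `checkpoint` argument are assumptions made here for illustration, and the abstract does not say which checkpoints or hyperparameters were used. The loader call relies on the Hugging Face `transformers` library, which is imported only when the helper is called.

```python
"""Label mapping for the ten disease classes named in the abstract."""

# Class names as listed in the abstract; the ordering is an assumption.
CLASSES = [
    "Blood", "Bone", "Cardiovascular", "Ear", "Endocrine",
    "Eye", "Gastrointestinal", "Immune", "Liver", "Nephrological",
]

# Integer ids that a classification head would predict.
LABEL2ID = {name: idx for idx, name in enumerate(CLASSES)}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def build_classifier(checkpoint: str):
    """Load a pre-trained encoder with a fresh 10-way classification head.

    `checkpoint` is a placeholder: pass the name or path of whichever
    pre-trained model you want to fine-tune. The source does not name
    specific checkpoints, so none is hard-coded here.
    """
    # Imported lazily so the label maps above work without the library.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(CLASSES),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
```

Keeping the label maps separate from the model loader means the same mapping can be reused across all three models being compared, so their predictions stay directly comparable.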
