A Small-Scale Switch Transformer and NLP-based Model for Clinical Narratives Classification

03/22/2023
by   Thanh-Dung Le, et al.

In recent years, Transformer-based models such as the Switch Transformer have achieved remarkable results on natural language processing tasks. However, these models are often too complex and require extensive pre-training, which limits their effectiveness on small clinical text classification tasks with limited data. In this study, we propose a simplified Switch Transformer framework and train it from scratch on a small French clinical text classification dataset from CHU Sainte-Justine hospital. Our results demonstrate that the simplified small-scale Transformer models outperform pre-trained BERT-based models, including DistilBERT, CamemBERT, FlauBERT, and FrALBERT. Additionally, the mixture-of-experts mechanism from the Switch Transformer helps capture diverse patterns; hence, the proposed approach achieves better results than a conventional Transformer with the self-attention mechanism. Finally, our proposed framework achieves an accuracy of 87%, a precision of 87%, and a recall of 85%, compared to the third-best pre-trained BERT-based model, FlauBERT, which achieved 84% accuracy, 84% precision, and 84% recall. However, Switch Transformers have limitations, including a generalization gap and sharp minima. We compare them with a multi-layer perceptron neural network on the same small French clinical narratives classification task and show that the latter outperforms all other models.
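The core idea the abstract highlights, top-1 mixture-of-experts routing as used in the Switch Transformer, can be sketched as follows. This is a minimal illustration with hypothetical, randomly initialized weights (not the authors' code or dataset): a learned router scores each token, and every token is dispatched to exactly one expert feed-forward network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration only (not the paper's configuration).
d_model, d_ff, n_experts, n_tokens = 16, 32, 4, 8

# Stand-ins for trained parameters.
router_w = rng.normal(size=(d_model, n_experts))
experts_w1 = rng.normal(size=(n_experts, d_model, d_ff)) * 0.1
experts_w2 = rng.normal(size=(n_experts, d_ff, d_model)) * 0.1

def switch_layer(x):
    """Route each token (row of x) through its top-1 expert FFN."""
    logits = x @ router_w                       # (n_tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)  # softmax over experts
    choice = probs.argmax(axis=-1)              # top-1 expert per token
    out = np.zeros_like(x)
    for e in range(n_experts):
        mask = choice == e
        if mask.any():
            h = np.maximum(x[mask] @ experts_w1[e], 0.0)  # ReLU FFN
            # Scale the expert output by its router probability so the
            # router receives gradient signal in a trained model.
            out[mask] = (h @ experts_w2[e]) * probs[mask, e:e + 1]
    return out, choice

tokens = rng.normal(size=(n_tokens, d_model))
y, chosen = switch_layer(tokens)
```

Because each token activates only one expert, the layer adds parameters (capacity to capture diverse patterns) without increasing per-token compute, which is the trade-off the paper exploits at small scale.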


Related research:
- Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences (01/27/2022)
- A Comparative Study of Pretrained Language Models for Long Clinical Text (01/27/2023)
- SparseBERT: Rethinking the Importance Analysis in Self-attention (02/25/2021)
- I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths (06/18/2020)
- Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT (08/11/2023)
- Food Classification using Joint Representation of Visual and Textual Data (08/03/2023)
