Using Large Language Models to Automate Category and Trend Analysis of Scientific Articles: An Application in Ophthalmology

by Hina Raja, et al.

Purpose: We present an automated method for classifying scientific articles using Large Language Models (LLMs). The primary focus is ophthalmology, but the model is extendable to other fields. Methods: We developed a model based on Natural Language Processing (NLP) techniques, including advanced LLMs, to process and analyze the textual content of scientific papers. Specifically, we employed zero-shot learning (ZSL) LLMs and compared them against Bidirectional and Auto-Regressive Transformers (BART) and its variants, and against Bidirectional Encoder Representations from Transformers (BERT) and its variants, such as distilBERT, SciBERT, PubmedBERT, and BioBERT. Results: To evaluate the LLMs, we compiled a dataset (RenD) of 1000 ocular disease-related articles, which were annotated by a panel of six specialists into 15 distinct categories. On the RenD dataset, the model achieved a mean accuracy of 0.86 and a mean F1 score of 0.85. The classification results demonstrate the effectiveness of LLMs in categorizing a large number of ophthalmology papers without human intervention. Conclusion: The proposed framework achieves notable improvements in both accuracy and efficiency. Its application in ophthalmology showcases its potential for knowledge organization and retrieval in other domains as well. The trend analysis we performed enables researchers and clinicians to categorize and retrieve relevant papers easily, saving time and effort in literature review and information gathering, and helps identify emerging scientific trends within different disciplines. Moreover, the extendibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.
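
As an illustration of the reported metrics, the following is a minimal Python sketch of how accuracy and a macro-averaged F1 score could be computed for a multi-class article classifier. It is not the authors' evaluation code: the abstract does not state how the mean F1 was averaged, so macro-averaging over categories is an assumption here, and the category names and predictions are hypothetical toy data rather than RenD entries.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of articles whose predicted category matches the annotation."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-category F1 scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted category gains a false positive
            fn[t] += 1  # true category gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical annotations and predictions for four articles.
y_true = ["glaucoma", "glaucoma", "cataract", "retina"]
y_pred = ["glaucoma", "cataract", "cataract", "retina"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

Macro-averaging weights every category equally, which matters when some of the 15 categories contain far fewer articles than others; a weighted average would be a reasonable alternative if the paper's definition differs.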




Using the Full-text Content of Academic Articles to Identify and Evaluate Algorithm Entities in the Domain of Natural Language Processing

In the era of big data, the advancement, improvement, and application of...

BERT: A Review of Applications in Natural Language Processing and Understanding

In this review, we describe the application of one of the most popular d...

Language Models are Few-shot Learners for Prognostic Prediction

Clinical prediction is an essential task in the healthcare industry. How...

Beyond original Research Articles Categorization via NLP

This work proposes a novel approach to text categorization – for unknown...

ChatGPT Chemistry Assistant for Text Mining and Prediction of MOF Synthesis

We use prompt engineering to guide ChatGPT in the automation of text min...

Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation

Over the last century, we observe a steady and exponential growth of s...
