Large Language Models Vote: Prompting for Rare Disease Identification

by David Oniani, et al.

The emergence of generative Large Language Models (LLMs) emphasizes the need for accurate and efficient prompting approaches. LLMs are often applied in Few-Shot Learning (FSL) contexts, where tasks are executed with minimal training data. FSL has become popular in many Artificial Intelligence (AI) subdomains, including AI for health. Rare diseases affect a small fraction of the population, so identifying them from clinical notes inherently requires FSL techniques: data are scarce, and manual data collection and annotation are both expensive and time-consuming. In this paper, we propose Models-Vote Prompting (MVP), a flexible prompting approach for improving the performance of LLM queries in FSL settings. MVP works by prompting multiple LLMs to perform the same task and then taking a majority vote over the resulting outputs. This method outperforms any single model in the ensemble on one-shot rare disease identification and classification tasks. We also release a novel rare disease dataset for FSL, available to those who have signed the MIMIC-IV Data Use Agreement (DUA). Because MVP prompts each model multiple times, it substantially increases the time needed for manual annotation; to address this, we assess the feasibility of using JSON to automate generative LLM evaluation.
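
The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each model has already been prompted and has returned a JSON string, and the field name `label`, the lowercase normalization, and the choice to return `None` on a tie or on unparseable output are all hypothetical choices made for illustration.

```python
import json
from collections import Counter


def parse_label(raw_output, field="label"):
    """Extract a label from one model's JSON output.

    Returns None when the output is not valid JSON, is not a JSON object,
    or lacks a string value under `field` (a hypothetical field name).
    """
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    value = parsed.get(field)
    # Normalize case and whitespace so "Yes" and "yes " count as the same vote.
    return value.strip().lower() if isinstance(value, str) else None


def majority_vote(raw_outputs, field="label"):
    """Return the most common parsed label across model outputs.

    Unparseable outputs are ignored. Returns None if no output could be
    parsed or if the top two labels are tied (an assumed tie policy).
    """
    labels = [parse_label(output, field) for output in raw_outputs]
    valid = [label for label in labels if label is not None]
    if not valid:
        return None
    (top, top_count), *rest = Counter(valid).most_common()
    if rest and rest[0][1] == top_count:
        return None
    return top


# Illustrative outputs from three hypothetical models for one clinical note.
example_outputs = ['{"label": "Yes"}', '{"label": "yes"}', '{"label": "no"}']
decision = majority_vote(example_outputs)
```

Requesting structured JSON output is what makes the automated comparison possible here: a response that fails to parse is simply excluded from the vote rather than read by a human annotator.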


Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision

The identification of rare diseases from clinical notes with Natural Lan...

Semi-supervised Rare Disease Detection Using Generative Adversarial Network

Rare diseases affect a relatively small number of people, which limits i...

Language Models are Few-shot Learners for Prognostic Prediction

Clinical prediction is an essential task in the healthcare industry. How...

Identifying and Extracting Rare Disease Phenotypes with Large Language Models

Rare diseases (RDs) are collectively common and affect 300 million peopl...

CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models

Large pre-trained language models (LLMs) have been shown to have signifi...

Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes

Computational text phenotyping is the practice of identifying patients w...

Surrogate-guided sampling designs for classification of rare outcomes from electronic medical records data

Scalable and accurate identification of specific clinical outcomes has b...
