A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks

07/22/2023
by Yanis Labrak, et al.

We evaluate four state-of-the-art instruction-tuned large language models (LLMs), namely ChatGPT, Flan-T5 UL2, Tk-Instruct, and Alpaca, on a set of 13 real-world clinical and biomedical natural language processing (NLP) tasks in English, including named-entity recognition (NER), question-answering (QA), and relation extraction (RE). Our overall results demonstrate that the evaluated LLMs begin to approach the performance of state-of-the-art models in zero- and few-shot scenarios for most tasks, and do particularly well on the QA task, even though they have never seen examples from these tasks before. However, we observed that their performance on the classification and RE tasks falls below what can be achieved with a model trained specifically for the medical field, such as PubMedBERT. Finally, we noted that no LLM outperforms all the others on all the studied tasks; some models are better suited to certain tasks than others.


