Evaluating the Robustness to Instructions of Large Language Models

08/28/2023
by Yuansheng Ni, et al.

Instruction fine-tuning has recently risen to prominence as a promising method for enhancing the zero-shot capabilities of Large Language Models (LLMs) on novel tasks. It has shown an exceptional ability to boost the performance of moderately sized LLMs, sometimes even reaching performance levels comparable to those of much larger model variants. Our focus is on the robustness of instruction-tuned LLMs to seen and unseen tasks. We explored six models, including Alpaca, Vicuna, WizardLM, and traditional task-oriented models (Flan-T5-XL/XXL, T0++), using real-world relation extraction (RE) datasets as case studies. We carried out a comprehensive evaluation of these instruction-following LLMs, which have been tuned on open-domain or task-oriented instructions, examining their performance on and robustness to varied instructions. We observed that, in most cases, performance on unfamiliar instructions degrades significantly, and that robustness to RE instructions is worse than robustness to QA instructions. Furthermore, we found that up to a certain parameter-size threshold (3B), the performance of FLAN-T5 improves as the parameter count increases. Across scales, FLAN-T5 models remain less robust to RE instructions than to QA instructions.
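To make the evaluation setup concrete, the sketch below shows one way to probe instruction robustness: query an instruction-tuned model with a familiar-style RE instruction and a paraphrased ("unseen") variant on the same example, then compare predictions. This is a minimal illustration, not the authors' code; the checkpoint, instruction templates, and toy example are assumptions for demonstration only.

```python
# Minimal sketch (not the paper's evaluation harness): probing instruction
# robustness of an instruction-tuned model on a toy relation-extraction case.
# Model name, templates, and the example below are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-xl"  # assumed checkpoint; the paper also studies XXL, T0++, etc.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# One familiar-style RE instruction and one paraphrased ("unseen") variant.
templates = {
    "seen":   "Identify the relation between '{head}' and '{tail}' in: {sentence}",
    "unseen": "Given the passage below, which relation links '{head}' to '{tail}'?\n{sentence}",
}

example = {
    "sentence": "Marie Curie was born in Warsaw.",
    "head": "Marie Curie",
    "tail": "Warsaw",
    "gold": "place of birth",  # hypothetical gold label for illustration
}

def predict(prompt: str) -> str:
    """Generate a short answer for a single prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip().lower()

for name, template in templates.items():
    prompt = template.format(**example)
    prediction = predict(prompt)
    correct = example["gold"] in prediction
    print(f"{name} instruction -> {prediction!r} (match: {correct})")

# Averaging such match scores over full RE datasets, per instruction template,
# yields the kind of seen-vs-unseen robustness comparison the abstract describes.
```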

Related research

06/20/2023  Evaluating the Zero-shot Robustness of Instruction-tuned Language Models
Instruction fine-tuning has recently emerged as a promising approach for...

08/17/2023  Do you really follow me? Adversarial Instructions for Evaluating the Robustness of Large Language Models
Large Language Models (LLMs) have shown remarkable proficiency in follow...

05/24/2023  PIVOINE: Instruction Tuning for Open-world Information Extraction
We consider the problem of Open-world Information Extraction (Open-world...

07/20/2023  Instruction-following Evaluation through Verbalizer Manipulation
While instruction-tuned models have shown remarkable success in various ...

05/23/2023  Robust Instruction Optimization for Large Language Models with Distribution Shifts
Large Language Models have demonstrated significant ability in accomplis...

08/23/2023  Instruction Position Matters in Sequence Generation with Large Language Models
Large language models (LLMs) are capable of performing conditional seque...

05/24/2023  Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models
Instruction-tuned models are trained on crowdsourcing datasets with task...
