PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts

06/07/2023
by Kaijie Zhu, et al.

The increasing reliance on Large Language Models (LLMs) across academia and industry necessitates a comprehensive understanding of their robustness to prompts. In response to this vital need, we introduce PromptBench, a robustness benchmark designed to measure LLMs' resilience to adversarial prompts. The study employs a wide range of adversarial textual attacks that target prompts at four levels: character, word, sentence, and semantic. The attacked prompts are then applied to diverse tasks, including sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving. In total, the study generates 4,032 adversarial prompts, evaluated over 8 tasks and 13 datasets with 567,084 test samples. Our findings demonstrate that contemporary LLMs are vulnerable to adversarial prompts. Furthermore, we present a comprehensive analysis of the factors behind prompt robustness and its transferability, and offer pragmatic recommendations for prompt composition that benefit both researchers and everyday users. We make our code, prompts, and the methodology for generating adversarial prompts publicly available, thereby enabling and encouraging collaborative exploration of this pivotal field: https://github.com/microsoft/promptbench.
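As a concrete illustration of the character-level attacks described above, the minimal Python sketch below perturbs a prompt by swapping adjacent characters inside words. The function name and its parameters are hypothetical helpers for exposition, not the PromptBench API; the repository linked above provides the actual attack implementations.

    import random

    def char_level_perturb(prompt: str, rate: float = 0.1, seed: int = 0) -> str:
        """Hypothetical helper: swap adjacent in-word characters at the given rate."""
        rng = random.Random(seed)
        chars = list(prompt)
        # Positions where swapping with the next character stays inside a word.
        candidates = [i for i in range(len(chars) - 1)
                      if chars[i].isalpha() and chars[i + 1].isalpha()]
        if not candidates:
            return prompt
        for i in rng.sample(candidates, k=max(1, int(len(candidates) * rate))):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)

    clean = "Classify the sentiment of the following review as positive or negative:"
    print(char_level_perturb(clean))  # e.g. "Calssify the sentimnet of the ..."

A robust model should return the same answer for the clean and the perturbed prompt; PromptBench measures how far task performance drops when it does not.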
