P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

10/14/2021
by Benjamin Newman, et al.

Recent work (e.g., LAMA (Petroni et al., 2019)) has found that the quality of the factual information extracted from Large Language Models (LLMs) depends on the prompts used to query them. This inconsistency is problematic because different users will query LLMs for the same information using different wording, yet they should receive the same, accurate responses regardless. In this work we aim to address this shortcoming by introducing P-Adapters: lightweight models that sit between the embedding layer and the first attention layer of LLMs. They take LLM embeddings as input and output continuous prompts that are used to query the LLM. Additionally, we investigate Mixture of Experts (MoE) models that learn a set of continuous prompts ("experts") and select one to query the LLM. These MoE models require a separate classifier, trained on human-annotated data, to map natural language prompts to the continuous ones. P-Adapters perform comparably to the more complex MoE models in extracting factual information from BERT and RoBERTa while eliminating the need for additional annotations. P-Adapters show a 12-26% improvement in consistency over a baseline that uses only natural language queries. Finally, we investigate what makes a P-Adapter successful and conclude that access to the LLM's embeddings of the original natural language prompt, particularly of the subject of the entity pair being asked about, is a significant factor.
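To make the idea concrete, below is a minimal sketch of a P-Adapter-style module in PyTorch. It is an illustration based only on the description above, not the authors' implementation: the bottleneck architecture, layer sizes, and names (PAdapter, down, up) are assumptions. The key point it shows is that the adapter consumes the frozen LLM's input embeddings of a natural language query and returns continuous prompt embeddings of the same shape, which would then be passed to the LLM's first attention layer; the residual connection keeps the original embeddings (e.g., the subject entity tokens) directly available in the output.

```python
# Illustrative sketch only; shapes and architecture are assumptions, not the paper's exact design.
import torch
import torch.nn as nn


class PAdapter(nn.Module):
    """Maps a frozen LLM's input embeddings of a natural language prompt to
    continuous prompt embeddings of the same shape, to be fed into the LLM's
    first attention layer in place of the originals."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 256):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)   # project down
        self.up = nn.Linear(bottleneck, hidden_size)     # project back up
        self.act = nn.GELU()

    def forward(self, input_embeddings: torch.Tensor) -> torch.Tensor:
        # input_embeddings: (batch, seq_len, hidden_size), taken from the
        # frozen LLM's embedding layer for the natural language query.
        adapted = self.up(self.act(self.down(input_embeddings)))
        # Residual connection: the original embeddings remain accessible,
        # consistent with the finding that access to them matters.
        return input_embeddings + adapted


if __name__ == "__main__":
    # Toy usage: pretend these are BERT-base embeddings of a 12-token prompt.
    adapter = PAdapter(hidden_size=768)
    fake_embeddings = torch.randn(1, 12, 768)
    continuous_prompt = adapter(fake_embeddings)
    print(continuous_prompt.shape)  # torch.Size([1, 12, 768])
```

Only the adapter's parameters would be trained; the LLM itself stays frozen. The MoE alternative described above instead keeps a fixed set of learned continuous prompts and uses a separately trained classifier to pick one for each natural language query.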


