Knowledge Sanitization of Large Language Models

09/21/2023
by   Yoichi Ishibashi, et al.
0

We explore a knowledge sanitization approach to mitigate the privacy concerns associated with large language models (LLMs). LLMs trained on a large corpus of Web data can memorize and potentially reveal sensitive or confidential information, raising critical security concerns. Our technique fine-tunes these models, prompting them to generate harmless responses such as “I don't know” when queried about specific information. Experimental results in a closed-book question-answering task show that our straightforward method not only minimizes particular knowledge leakage but also preserves the overall performance of LLM. These two advantages strengthen the defense against extraction attacks and reduces the emission of harmful content such as hallucinations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/10/2022

Internet-augmented language models through few-shot prompting for open-domain question answering

In this work, we aim to capitalize on the unique few-shot capabilities o...
research
07/04/2023

ProPILE: Probing Privacy Leakage in Large Language Models

The rapid advancement and widespread use of large language models (LLMs)...
research
05/22/2023

Quantifying Association Capabilities of Large Language Models and Its Implications on Privacy Leakage

The advancement of large language models (LLMs) brings notable improveme...
research
04/26/2022

You Don't Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers' Private Personas

Social chatbots, also known as chit-chat chatbots, evolve rapidly with l...
research
02/24/2023

Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback

Large language models (LLMs), such as ChatGPT, are able to generate huma...
research
06/17/2021

Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Previous literatures show that pre-trained masked language models (MLMs)...
research
05/28/2023

Conformal Prediction with Large Language Models for Multi-Choice Question Answering

As large language models continue to be widely developed, robust uncerta...

Please sign up or login with your details

Forgot password? Click here to reset