Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs

08/19/2023
by   Suhang Wu, et al.

Large language models (LLMs) possess a wealth of knowledge encoded in their parameters. However, this knowledge may become outdated or unsuitable over time. As a result, there has been growing interest in knowledge editing for LLMs and in evaluating its effectiveness. Existing studies primarily focus on knowledge editing using factual triplets, which are costly to collect and struggle to express complex facts. Furthermore, these studies are often limited in their evaluation perspectives. In this paper, we propose Eva-KELLM, a new benchmark for evaluating knowledge editing of LLMs. This benchmark includes an evaluation framework and a corresponding dataset. Under our framework, we first ask the LLM to perform knowledge editing using raw documents, which is a more convenient and universal approach than using factual triplets. We then evaluate the updated LLM from multiple perspectives. In addition to assessing the effectiveness of knowledge editing and the retention of unrelated knowledge, as conventional studies do, we further test the LLM's ability in two aspects: 1) Reasoning with the altered knowledge, which checks that the LLM has genuinely learned the altered knowledge rather than simply memorized it. 2) Cross-lingual knowledge transfer, where an LLM updated with raw documents in one language should be capable of handling queries in another language. To facilitate further research, we construct and release the corresponding dataset. Using this benchmark, we investigate the effectiveness of several commonly used knowledge editing methods. Experimental results indicate that current methods for knowledge editing using raw documents do not yield satisfactory results, particularly for reasoning with altered knowledge and cross-lingual knowledge transfer.
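
The four evaluation perspectives named in the abstract can be pictured as a per-perspective scoring loop. The following is only a minimal sketch under stated assumptions, not the benchmark's actual code: the `Probe` type, the perspective labels, the `evaluate` function, and the use of case-insensitive exact-match accuracy are all hypothetical choices made for illustration, since the abstract does not specify the metrics or data format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical labels for the four perspectives described in the abstract.
PERSPECTIVES = (
    "direct_edit",          # did the edit take effect?
    "unrelated_retention",  # is unrelated knowledge preserved?
    "reasoning",            # can the model reason with the altered knowledge?
    "cross_lingual",        # does the edit transfer to another language?
)


@dataclass
class Probe:
    """One test question for the edited model (hypothetical format)."""
    perspective: str
    question: str
    expected: str


def evaluate(answer_fn: Callable[[str], str], probes: List[Probe]) -> Dict[str, float]:
    """Return accuracy per perspective for an edited model.

    `answer_fn` stands in for the edited LLM. Scoring is case-insensitive
    exact match, which is an assumption of this sketch.
    """
    totals = {p: 0 for p in PERSPECTIVES}
    hits = {p: 0 for p in PERSPECTIVES}
    for probe in probes:
        totals[probe.perspective] += 1
        predicted = answer_fn(probe.question).strip().lower()
        if predicted == probe.expected.strip().lower():
            hits[probe.perspective] += 1
    # Report only perspectives that actually have probes.
    return {p: hits[p] / totals[p] for p in PERSPECTIVES if totals[p]}
```

Reporting the perspectives separately, rather than as one pooled score, reflects the abstract's point that a method may edit a fact successfully and still fall short on reasoning or cross-lingual transfer.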


