Statistical Knowledge Assessment for Generative Language Models

05/17/2023
by Qingxiu Dong, et al.

Generative Language Models (GLMs) have demonstrated capabilities to store factual knowledge and answer queries efficiently. Given varying prompts, does a GLM consistently generate factually correct answers? In this paper, we introduce a statistical knowledge assessment framework guided by latent variables and the KaRR metric, which quantifies a model's knowledge by computing its continuous probability across diverse text forms. We conduct a comprehensive comparison of knowledge across 14 GLMs using our framework, including LLaMA, Alpaca, OPT, and others. Our statistical knowledge assessment encompasses 600 relation types and exhibits a strong correlation (0.43 Kendall's τ) with human evaluation. Our findings reveal that the knowledge in GLMs with the same backbone architecture adheres to the scaling law, and that tuning on instruction-following data may compromise the model's ability to generate factually correct text consistently.
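
The Kendall's τ figure quoted above measures rank agreement between two sets of scores, here an automatic metric and human judgments. The following is a minimal sketch of how such a coefficient can be computed, not the authors' evaluation code. It uses the tau-a variant, which counts tied pairs as neither concordant nor discordant. The function name `kendall_tau_a` and the sample scores are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product says whether both lists order the pair alike.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one automatic score and one human rating per model.
metric_scores = [0.12, 0.35, 0.28, 0.51, 0.44]
human_scores = [1, 3, 2, 4, 5]

tau = kendall_tau_a(metric_scores, human_scores)
print(f"Kendall's tau-a: {tau:.2f}")
```

In this toy data, nine of the ten pairs are ordered the same way by both lists and one is reversed, so the coefficient is (9 - 1) / 10 = 0.8. When ties are common, the tau-b variant, which corrects for them, is usually preferred.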

research
08/23/2023

Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning

The recent surge of generative AI has been fueled by the generative powe...
research
05/19/2023

Self-QA: Unsupervised Knowledge Guided Language Model Alignment

Large-scale language models like ChatGPT and GPT-4 have gained attention...
research
07/10/2023

TIM: Teaching Large Language Models to Translate with Comparison

Open-sourced large language models (LLMs) have demonstrated remarkable e...
research
01/11/2023

GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities

The global economy is increasingly dependent on knowledge workers to mee...
research
01/26/2022

An Assessment of the Impact of OCR Noise on Language Models

Neural language models are the backbone of modern-day natural language p...
research
08/23/2023

Evaluation of Faithfulness Using the Longest Supported Subsequence

As increasingly sophisticated language models emerge, their trustworthin...
research
04/28/2023

Dissecting Recall of Factual Associations in Auto-Regressive Language Models

Transformer-based language models (LMs) are known to capture factual kno...
