Knowledge Infused Decoding

04/06/2022
by Ruibo Liu, et al.

Pre-trained language models (LMs) have been shown to memorize a substantial amount of knowledge from the pre-training corpora; however, they are still limited in recalling factually correct knowledge given a certain context. Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks. Recent remedies to this problem focus on modifying either the pre-training or task fine-tuning objectives to incorporate knowledge, which normally require additional costly training or architecture modification of LMs for practical applications. We present Knowledge Infused Decoding (KID) – a novel decoding algorithm for generative LMs, which dynamically infuses external knowledge into each step of the LM decoding. Specifically, we maintain a local knowledge memory based on the current context, interacting with a dynamically created external knowledge trie, and continuously update the local memory as a knowledge-aware constraint to guide decoding via reinforcement learning. On six diverse knowledge-intensive NLG tasks, task-agnostic LMs (e.g., GPT-2 and BART) armed with KID outperform many task-optimized state-of-the-art models, and show particularly strong performance in few-shot scenarios over seven related knowledge-infusion techniques. Human evaluation confirms KID's ability to generate more relevant and factual language for the input context when compared with multiple baselines. Finally, KID also alleviates exposure bias and provides stable generation quality when generating longer sequences. Code for KID is available at https://github.com/microsoft/KID.
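The abstract describes KID's core loop: a local knowledge memory derived from the current context is matched against a dynamically built external knowledge trie, and the match acts as a knowledge-aware constraint on each decoding step. The sketch below illustrates that idea in a simplified form, assuming a GPT-2 model from Hugging Face transformers; the KnowledgeTrie helper, the n-gram trie construction, and the fixed logit boost are illustrative stand-ins for the paper's policy-gradient guidance, not the authors' implementation (see the linked repository for that).

```python
# Minimal sketch of knowledge-guided decoding (assumed simplification of KID):
# retrieved knowledge is stored as token n-grams in a trie, and at each step
# tokens the trie allows after the recent context get a logit boost.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class KnowledgeTrie:
    """Token-id trie over retrieved knowledge n-grams (hypothetical helper)."""

    def __init__(self):
        self.children = {}

    def add(self, token_ids):
        node = self.children
        for tid in token_ids:
            node = node.setdefault(tid, {})

    def continuations(self, prefix_ids):
        """Token ids the trie allows immediately after the given prefix."""
        node = self.children
        for tid in prefix_ids:
            if tid not in node:
                return set()
            node = node[tid]
        return set(node.keys())


def kid_style_decode(prompt, knowledge_texts, max_new_tokens=40, boost=4.0, ngram=3):
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    # External knowledge trie: n-grams of the retrieved passages.
    trie = KnowledgeTrie()
    for text in knowledge_texts:
        ids = tokenizer.encode(text)
        for i in range(len(ids) - ngram + 1):
            trie.add(ids[i:i + ngram])

    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(input_ids).logits[0, -1]

        # Local knowledge memory: the most recent tokens of the current context.
        recent = input_ids[0, -(ngram - 1):].tolist()

        # Knowledge-aware constraint: bias tokens the trie says can follow the
        # local memory (KID instead learns this guidance via policy gradient).
        allowed = set()
        for start in range(len(recent)):
            allowed |= trie.continuations(recent[start:])
        if allowed:
            logits[list(allowed)] += boost

        next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

    return tokenizer.decode(input_ids[0])
```

For example, kid_style_decode("Marie Curie was born in", ["Marie Curie was born in Warsaw, Poland."]) would bias sampling toward tokens that continue the retrieved passage whenever the recent context matches a prefix stored in the trie; KID's actual constraint is updated step by step with reinforcement learning rather than applied as a fixed boost.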

Related research

05/04/2022 · P^3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning
Compared to other language tasks, applying pre-trained language models (...

06/08/2023 · Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training
We consider the task of few-shot intent detection, which involves traini...

08/02/2023 · Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
Current captioning approaches tend to generate correct but "generic" des...

05/08/2023 · HistAlign: Improving Context Dependency in Language Generation by Aligning with History
Language models (LMs) can generate hallucinations and incoherent outputs...

03/16/2022 · Can Pre-trained Language Models Interpret Similes as Smart as Human?
Simile interpretation is a crucial task in natural language processing. ...

06/19/2023 · Guiding Language Models of Code with Global Context using Monitors
Language models of code (LMs) work well when the surrounding code in the...

08/16/2022 · CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks
Knowledge-intensive language tasks (KILT) usually require a large body o...
