Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

05/28/2023
by Minki Kang, et al.

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deploying LLMs in real-world applications can be challenging because of their high computational requirements and concerns about data privacy. Previous studies have focused on building task-specific small language models (LMs) by fine-tuning them with labeled data or by distilling from LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks because small LMs have limited capacity to memorize the knowledge required. Motivated by our theoretical analysis of memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales with augmented knowledge retrieved from an external knowledge base. Moreover, we further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and Flan-T5 models on the challenging knowledge-intensive reasoning datasets MedQA-USMLE and StrategyQA. Notably, our method enables 250M-parameter models to outperform fine-tuned 3B models, which have 12 times more parameters, on both the MedQA-USMLE and StrategyQA benchmarks.
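The abstract only outlines the pipeline, so the following is a minimal illustrative sketch, not the authors' implementation. It assumes a TF-IDF retriever standing in for the paper's retriever and neural reranker, a hand-written rationale standing in for a teacher-LLM-generated one, and a toy knowledge base; all variable names, the corpus, and the example question are hypothetical.

```python
# Hypothetical sketch of knowledge-augmented reasoning distillation (KARD-style).
# The toy corpus, example question, and names such as `teacher_rationale` are
# illustrative assumptions, not the paper's code.
import torch
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Toy external knowledge base (stand-in for Wikipedia or a medical corpus).
knowledge_base = [
    "Aspirin irreversibly inhibits cyclooxygenase, reducing thromboxane A2 synthesis.",
    "Beta blockers lower heart rate and myocardial oxygen demand.",
    "Warfarin inhibits vitamin K epoxide reductase, reducing clotting factor synthesis.",
]
vectorizer = TfidfVectorizer().fit(knowledge_base)
kb_vectors = vectorizer.transform(knowledge_base)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Sparse retrieval over the knowledge base (stand-in for BM25 / a neural reranker)."""
    scores = cosine_similarity(vectorizer.transform([query]), kb_vectors)[0]
    return [knowledge_base[i] for i in scores.argsort()[::-1][:k]]

# One training example: question plus a rationale that, in practice, would be
# generated by prompting a large teacher LLM.
question = "Which drug reduces platelet aggregation by blocking thromboxane A2 synthesis?"
teacher_rationale = (
    "Aspirin acetylates cyclooxygenase, which blocks thromboxane A2 production "
    "and thereby inhibits platelet aggregation. The answer is aspirin."
)

# During training, documents are retrieved for the rationale, so the student
# learns to ground its reasoning in external knowledge instead of memorizing it.
docs = retrieve(teacher_rationale, k=1)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
student = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer(
    f"question: {question} knowledge: {' '.join(docs)}",
    return_tensors="pt", truncation=True,
)
labels = tokenizer(teacher_rationale, return_tensors="pt").input_ids

# One distillation step: the small student is fine-tuned to reproduce the
# teacher's rationale conditioned on the question plus retrieved knowledge.
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
optimizer.zero_grad()
loss = student(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# At inference time there is no teacher rationale, so retrieval uses the
# question itself and the student generates its own rationale and answer.
test_docs = retrieve(question, k=1)
test_inputs = tokenizer(
    f"question: {question} knowledge: {' '.join(test_docs)}",
    return_tensors="pt", truncation=True,
)
output = student.generate(**test_inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The point of the design, as the abstract describes it, is that the small student is trained to produce rationales conditioned on retrieved documents, so knowledge that it cannot memorize can instead be looked up at inference time.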


