Adapting Large Language Models via Reading Comprehension

09/18/2023
by Daixuan Cheng, et al.

We explore how continued pre-training on domain-specific corpora influences large language models, revealing that training on the raw corpora endows the model with domain knowledge but drastically hurts its prompting ability for question answering. Taking inspiration from human learning via reading comprehension, where practice after reading improves the ability to answer questions based on the learned knowledge, we propose a simple method for transforming raw corpora into reading comprehension texts. Each raw text is enriched with a series of tasks related to its content. Our method is highly scalable, applicable to any pre-training corpus, and consistently enhances performance across various tasks in three different domains: biomedicine, finance, and law. Notably, our 7B language model achieves competitive performance with domain-specific models of much larger scales, such as BloombergGPT-50B. Furthermore, we demonstrate that domain-specific reading comprehension texts can improve the model's performance even on general benchmarks, showing the potential to develop a general model across even more domains. Our model, code, and data will be available at https://github.com/microsoft/LMOps.
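
The transformation described in the abstract, keeping each raw document intact and appending tasks grounded in its content, can be illustrated with a minimal sketch. The paper's actual task templates and mining heuristics live in the authors' repository (https://github.com/microsoft/LMOps); the helper functions below (make_cloze_task, make_keyword_task) and the example document are hypothetical stand-ins, not the published pipeline.

# Minimal sketch, assuming simple template-based tasks derived from the raw text itself.
import random


def make_cloze_task(text: str) -> str:
    # Blank out one longer word and ask the model to restore it.
    candidates = [w.strip(".,;:") for w in text.split() if len(w.strip(".,;:")) > 6]
    if not candidates:
        return ""
    target = random.choice(candidates)
    cloze = text.replace(target, "____", 1)
    return f"Question: Fill in the blank.\n{cloze}\nAnswer: {target}"


def make_keyword_task(text: str) -> str:
    # Ask for longer terms that literally appear in the passage.
    terms = sorted({w.strip(".,;:") for w in text.split() if len(w.strip(".,;:")) > 8})[:3]
    if not terms:
        return ""
    return "Question: List domain terms mentioned in the passage.\nAnswer: " + ", ".join(terms)


def to_reading_comprehension(raw_text: str) -> str:
    # Enrich one raw corpus document with follow-up tasks about its content.
    tasks = [t for t in (make_cloze_task(raw_text), make_keyword_task(raw_text)) if t]
    return raw_text.strip() + "\n\n" + "\n\n".join(tasks)


if __name__ == "__main__":
    doc = ("Aspirin irreversibly inhibits cyclooxygenase enzymes, "
           "reducing prostaglandin synthesis and platelet aggregation.")
    print(to_reading_comprehension(doc))

The general pattern is to leave the raw text unchanged and append tasks whose answers are grounded in it, so that continued pre-training also gives the model practice at answering questions about what it has just read.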


Related research

04/21/2019 · Probing Prior Knowledge Needed in Challenging Chinese Machine Reading Comprehension
With an ultimate goal of narrowing the gap between human and machine rea...

09/19/2023 · Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation
Data contamination in model evaluation is getting increasingly prevalent...

05/10/2021 · REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training
Pre-trained Language Models (PLMs) have achieved great success on Machin...

04/02/2023 · A Data-centric Framework for Improving Domain-specific Machine Reading Comprehension Datasets
Low-quality data can cause downstream problems in high-stakes applicatio...

03/26/2022 · Lite Unified Modeling for Discriminative Reading Comprehension
As a broad and major category in machine reading comprehension (MRC), th...

03/31/2020 · Procedural Reading Comprehension with Attribute-Aware Context Flow
Procedural texts often describe processes (e.g., photosynthesis and cook...

05/24/2023 · Lawyer LLaMA Technical Report
Large Language Models (LLMs), like LLaMA, have exhibited remarkable perf...
