Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

08/02/2023
by Tim Hartill, et al.

We equip a smaller Language Model to generalise to answering challenging compositional questions that have not been seen in training. To do so, we propose a combination of multitask supervised pretraining on up to 93 tasks designed to instill diverse reasoning abilities, and a dense retrieval system that aims to retrieve a set of evidential paragraph fragments. Recent progress in question answering has been achieved either by prompting very large pretrained Language Models in zero- or few-shot fashion, or by fine-tuning smaller models, sometimes in conjunction with information retrieval. We focus on the less explored question of the extent to which zero-shot generalisation can be enabled in smaller models with retrieval against a corpus within which sufficient information to answer a particular question may not exist. We establish strong baselines in this setting for diverse evaluation datasets (StrategyQA, CommonsenseQA, IIRC, DROP, Musique and ARC-DA), and show that performance can be significantly improved by adding retrieval-augmented training datasets designed to expose our models to a variety of heuristic reasoning strategies, such as weighing partial evidence or ignoring an irrelevant context.
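At its core, a dense retrieval system of the kind the abstract describes encodes questions and passages into the same vector space and ranks passages by inner-product similarity. The sketch below is a minimal, hypothetical illustration of that ranking step only: the toy 4-dimensional vectors stand in for the output of a trained encoder, and the function name `retrieve` is our own, not from the paper.

```python
import numpy as np

def retrieve(query_vec, passage_vecs, k=2):
    """Return indices of the top-k passages by inner-product similarity.

    In a real system, query_vec and passage_vecs would come from a
    trained dense encoder; here they are toy embeddings.
    """
    scores = passage_vecs @ query_vec       # one similarity score per passage
    return np.argsort(-scores)[:k]          # indices of the highest scores

# Toy passage embeddings standing in for encoder output.
passages = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.1],
])
query = np.array([0.85, 0.15, 0.05, 0.0])

top = retrieve(query, passages, k=2)  # indices of the two closest passages
```

In practice the passage index holds millions of vectors and the search uses an approximate nearest-neighbour structure rather than a full matrix product, but the scoring principle is the same.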


