BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer

Retrieval-based language models are increasingly employed in question-answering tasks. These models search a corpus of documents for relevant information instead of storing all factual knowledge in their parameters, thereby enhancing efficiency, transparency, and adaptability. We develop the first Norwegian retrieval-based model by adapting the REALM framework and evaluate it on various tasks. After training, we also separate the language model, which we call the reader, from the retriever components, and show that the reader can be fine-tuned on a range of downstream tasks. Results show that retrieval-augmented language modeling improves the reader's performance on extractive question answering, suggesting that this type of training improves language models' general ability to use context, and that this does not come at the expense of other abilities such as part-of-speech tagging, dependency parsing, named entity recognition, and lemmatization. Code, trained models, and data are made publicly available.
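
The abstract describes REALM-style training: a retriever scores corpus documents against the input, a reader predicts masked tokens conditioned on each retrieved document, and both are trained jointly by marginalising over the retrieved documents. The sketch below illustrates that objective only; the toy encoders, dimensions, and the `retrieval_augmented_mlm_loss` helper are illustrative assumptions and not the released BRENT implementation.

```python
# Minimal sketch of REALM-style retrieval-augmented masked language modeling.
# p(y | x) = sum_z p(y | x, z) * p(z | x), marginalised over top-k retrieved
# documents z. Toy modules stand in for the real query encoder, document
# encoder, and reader; they are assumptions for illustration only.

import torch
import torch.nn.functional as F

VOCAB_SIZE, HIDDEN = 1000, 64
TOP_K = 4

query_encoder = torch.nn.EmbeddingBag(VOCAB_SIZE, HIDDEN)   # embeds the masked query
doc_encoder = torch.nn.EmbeddingBag(VOCAB_SIZE, HIDDEN)     # embeds corpus documents
reader = torch.nn.Sequential(torch.nn.Linear(2 * HIDDEN, HIDDEN),
                             torch.nn.ReLU(),
                             torch.nn.Linear(HIDDEN, VOCAB_SIZE))

def retrieval_augmented_mlm_loss(query_ids, corpus_ids, target_id):
    """Negative log of the marginal likelihood of the masked token."""
    q = query_encoder(query_ids.unsqueeze(0))        # (1, HIDDEN)
    docs = doc_encoder(corpus_ids)                   # (num_docs, HIDDEN)

    # Retriever: inner-product relevance scores; keep the top-k documents.
    scores = docs @ q.squeeze(0)                     # (num_docs,)
    top_scores, top_idx = scores.topk(TOP_K)
    p_z = F.softmax(top_scores, dim=0)               # p(z | x)

    # Reader: predict the masked token from [query; document] for each z.
    joint = torch.cat([q.expand(TOP_K, -1), docs[top_idx]], dim=-1)
    p_y_given_z = F.softmax(reader(joint), dim=-1)[:, target_id]  # p(y | x, z)

    # Marginalise over documents; gradients reach both retriever and reader.
    return -torch.log((p_z * p_y_given_z).sum())

# Usage with random toy data standing in for tokenised Norwegian text.
query = torch.randint(0, VOCAB_SIZE, (16,))          # masked query tokens
corpus = torch.randint(0, VOCAB_SIZE, (32, 20))      # 32 toy documents
loss = retrieval_augmented_mlm_loss(query, corpus, target_id=7)
loss.backward()
```

After this joint pre-training, the reader can be detached and used as an ordinary encoder, which is how the abstract's downstream fine-tuning (extractive QA, tagging, parsing, NER, lemmatization) would proceed.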


Related research:

- 06/02/2023: Data-Efficient French Language Modeling with CamemBERTa. "Recent advances in NLP have significantly improved the performance of la..."
- 01/25/2023: ViDeBERTa: A powerful pre-trained language model for Vietnamese. "This paper presents ViDeBERTa, a new pre-trained monolingual language mo..."
- 10/31/2019: A neural document language modeling framework for spoken document retrieval. "Recent developments in deep learning have led to a significant innovatio..."
- 07/06/2023: Improving Retrieval-Augmented Large Language Models via Data Importance Learning. "Retrieval augmentation enables large language models to take advantage o..."
- 05/02/2023: Discern and Answer: Mitigating the Impact of Misinformation in Retrieval-Augmented Models with Discriminators. "Most existing retrieval-augmented language models (LMs) for question ans..."
- 04/16/2021: Editing Factual Knowledge in Language Models. "The factual knowledge acquired during pretraining and stored in the para..."
- 01/28/2022: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval. "Retrieval-based language models (R-LM) model the probability of natural ..."
