RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit

06/08/2023
by Jiongnan Liu, et al.

Although Large Language Models (LLMs) have demonstrated extraordinary capabilities in many domains, they still tend to hallucinate and generate fictitious responses to user requests. This problem can be alleviated by augmenting LLMs with information retrieval (IR) systems, an approach known as retrieval-augmented LLMs. With this strategy, LLMs can generate more factual text in response to user input by using relevant content that IR systems retrieve from external corpora as references. In addition, by incorporating external knowledge, retrieval-augmented LLMs can answer in-domain questions that cannot be answered by relying solely on the world knowledge stored in their parameters. To support research in this area and facilitate the development of retrieval-augmented LLM systems, we develop RETA-LLM, a RETrieval-Augmented LLM toolkit. In RETA-LLM, we create a complete pipeline to help researchers and users build their customized in-domain LLM-based systems. Compared with previous retrieval-augmented LLM systems, RETA-LLM provides more plug-and-play modules to support better interaction between IR systems and LLMs, including request rewriting, document retrieval, passage extraction, answer generation, and fact checking modules. Our toolkit is publicly available at https://github.com/RUC-GSAI/YuLan-IR/tree/main/RETA-LLM.
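
The five modules named in the abstract can be read as successive stages between a user request and the final response. The sketch below is a minimal illustration of that flow under stated assumptions and is not RETA-LLM's actual API: the function `answer_request`, the prompt tags (`REWRITE`, `EXTRACT`, `ANSWER`, `CHECK`), the fallback message, and the toy `toy_llm` and `toy_retrieve` stand-ins are all hypothetical names invented for this example. A real system would plug an actual LLM and IR system into the two callables.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not taken from the RETA-LLM codebase).
LLM = Callable[[str], str]                   # prompt -> generated text
Retriever = Callable[[str, int], List[str]]  # (query, k) -> top-k documents

FALLBACK = "Sorry, I could not find a supported answer."


def answer_request(request: str, history: List[str], llm: LLM,
                   retrieve: Retriever, top_k: int = 2) -> str:
    """Run the five stages in order and return the checked answer."""
    # 1. Request rewriting: fold the dialogue history into one standalone query.
    query = llm("REWRITE\n" + "\n".join(history + [request]))
    # 2. Document retrieval: fetch candidate documents from the external corpus.
    documents = retrieve(query, top_k)
    # 3. Passage extraction: keep only the relevant part of each document.
    passages = [llm(f"EXTRACT\n{query}\n{doc}") for doc in documents]
    references = "\n".join(passages)
    # 4. Answer generation: respond using the extracted passages as references.
    answer = llm(f"ANSWER\n{query}\n{references}")
    # 5. Fact checking: return the answer only if the references support it.
    verdict = llm(f"CHECK\n{answer}\n{references}")
    return answer if verdict.strip().lower() == "yes" else FALLBACK


# --- Toy stand-ins so the sketch runs without a real LLM or search index ---
CORPUS = [
    "The library opens at 8 am. It closes at 10 pm.",
    "The cafeteria serves lunch from 11 am to 1 pm.",
]


def toy_retrieve(query: str, k: int) -> List[str]:
    """Rank documents by the number of lowercase words shared with the query."""
    words = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]


def toy_llm(prompt: str) -> str:
    """Canned behaviour keyed on the task tag in the prompt's first line."""
    task, *lines = prompt.split("\n")
    if task == "REWRITE":   # use the latest request as the query
        return lines[-1]
    if task == "EXTRACT":   # first sentence of the document
        return lines[1].split(". ")[0]
    if task == "ANSWER":    # echo the first reference passage
        return lines[1] if len(lines) > 1 else ""
    if task == "CHECK":     # supported only if the answer is one of the passages
        return "yes" if lines[0] and lines[0] in lines[1:] else "no"
    return ""


result = answer_request("When does the library open?", [], toy_llm, toy_retrieve)
print(result)
```

Passing the LLM and the retriever in as plain callables mirrors the plug-and-play idea in the abstract: each stage can be swapped without touching the others. The fact-checking stage here simply falls back to a fixed message when the references do not support the answer, which is one possible policy among several.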

