TextBox 2.0: A Text Generation Library with Pre-trained Language Models

12/26/2022
by Tianyi Tang, et al.

To facilitate research on text generation, this paper presents TextBox 2.0, a comprehensive and unified library focused on pre-trained language models (PLMs). To be comprehensive, the library covers 13 common text generation tasks with 83 corresponding datasets, and incorporates 45 PLMs spanning general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline, from data loading to training and evaluation, so that each step is carried out in a consistent way. Despite its rich functionality, the library is easy to use through either a Python API or the command line. To validate the library, we conduct extensive experiments and illustrate four types of research scenarios. The project is released at https://github.com/RUCAIBox/TextBox.
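As a rough illustration of what a "unified" pipeline interface can look like, the sketch below wires data loading, generation, and evaluation behind one entry point that selects components by name. It is not TextBox's API: every name here (`DATASETS`, `MODELS`, `register`, `run_pipeline`, the toy dataset, and the lead-sentence baseline) is hypothetical, and the training step is omitted for brevity.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registries for illustration only; these are NOT TextBox's API.
DATASETS: Dict[str, Callable[[], List[Tuple[str, str]]]] = {}
MODELS: Dict[str, Callable[[], Callable[[str], str]]] = {}


def register(registry: dict, name: str):
    """Decorator that stores a factory function in a registry under `name`."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap


@register(DATASETS, "toy_summarization")
def load_toy() -> List[Tuple[str, str]]:
    # (source, reference) pairs; a made-up two-example dataset.
    return [
        ("the cat sat on the mat . it purred .", "the cat sat on the mat ."),
        ("rain is expected today . bring an umbrella .", "bring an umbrella ."),
    ]


@register(MODELS, "lead_sentence")
def build_lead_sentence() -> Callable[[str], str]:
    # Trivial baseline standing in for a PLM: copy the first sentence.
    def generate(source: str) -> str:
        return source.split(" . ")[0] + " ."
    return generate


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions identical to their reference."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


def run_pipeline(model: str, dataset: str) -> Dict[str, float]:
    """Single entry point: load data, generate, and evaluate by component name."""
    pairs = DATASETS[dataset]()
    generate = MODELS[model]()
    predictions = [generate(source) for source, _ in pairs]
    references = [reference for _, reference in pairs]
    return {"exact_match": exact_match(predictions, references)}


if __name__ == "__main__":
    print(run_pipeline(model="lead_sentence", dataset="toy_summarization"))
```

The point of the design is that swapping a model or dataset only changes a string argument, while the loading, generation, and evaluation steps keep the same call shape.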


