SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

08/21/2023
by Tianyu Yu, et al.

Large language models (LLMs) have shown impressive ability on open-domain NLP tasks. However, LLMs are sometimes too unconstrained for natural language understanding (NLU) tasks, which typically have restricted input and output formats. Their performance on NLU tasks depends heavily on prompts or demonstrations, and they have been shown to perform poorly on several representative NLU tasks, such as event extraction and entity typing. To this end, we present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding. We express all NLU tasks as two atomic tasks, which define fixed instructions that restrict the input and output format while remaining “open” to arbitrarily varied label sets. The model is first instruction-tuned on extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned on 233 different atomic tasks from 152 datasets across various domains. The experimental results show that SeqGPT has decent classification and extraction ability and can perform language understanding tasks on unseen domains. We also conduct empirical studies on the scaling of data and model size, as well as on transfer across tasks. Our model is accessible at https://github.com/Alibaba-NLP/SeqGPT.
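
The idea of a fixed instruction format with an open label set can be illustrated with a minimal sketch. The abstract does not give the actual prompt template, so everything below is an assumption made for illustration: the task names `classify` and `extract`, the `Input:` / `Output:` layout, and the helper `build_prompt` are hypothetical and are not taken from the SeqGPT repository.

```python
from typing import List

# Hypothetical names for the two atomic tasks; the abstract only states
# that there are two and that the model does classification and extraction.
ATOMIC_TASKS = ("classify", "extract")


def build_prompt(task: str, text: str, labels: List[str]) -> str:
    """Render a fixed-format prompt for an arbitrary label set.

    The layout (Input / task: labels / Output) is an assumed template,
    not the one used by the authors.
    """
    if task not in ATOMIC_TASKS:
        raise ValueError(f"unknown atomic task: {task!r}")
    if not labels:
        raise ValueError("label set must not be empty")
    # The instruction skeleton never changes; only the text and labels vary.
    label_str = ", ".join(labels)
    return f"Input: {text}\n{task}: {label_str}\nOutput:"


if __name__ == "__main__":
    # Same skeleton, two different tasks and two unrelated label sets.
    print(build_prompt("classify", "Great phone", ["positive", "negative"]))
    print(build_prompt("extract", "Alice joined Acme in 2020.", ["person", "company"]))
```

Because the skeleton is fixed, a new domain only needs a new list of labels rather than a newly engineered prompt, which is the property the abstract describes as keeping the label set “open”.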


