STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation

05/14/2023
by   Yulun Du, et al.
0

Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types, comprising 7 understanding and 5 generation task types, on STORYWARS, deriving 101 diverse story-related tasks in total as a multi-task benchmark covering all fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, INSTRUCTSTORY, for the story tasks showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, can also obtain the best performance on the fully-supervised tasks in STORYWARS, establishing strong multi-task benchmark performances on STORYWARS.

READ FULL TEXT
research
05/25/2022

Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning

Instruction tuning is an emergent paradigm in NLP wherein natural langua...
research
05/22/2023

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

ChatGPT and GPT-4 have attracted substantial interest from both academic...
research
09/18/2023

Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

Text language models have shown remarkable zero-shot capability in gener...
research
04/17/2023

InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction

Large language models have unlocked strong multi-task capabilities from ...
research
09/30/2022

PART: Pre-trained Authorship Representation Transformer

Authors writing documents imprint identifying information within their t...
research
04/20/2022

A Corpus for Understanding and Generating Moral Stories

Teaching morals is one of the most important purposes of storytelling. A...
research
05/16/2023

A Video Is Worth 4096 Tokens: Verbalize Story Videos To Understand Them In Zero Shot

Multimedia content, such as advertisements and story videos, exhibit a r...

Please sign up or login with your details

Forgot password? Click here to reset