Story Ending Prediction by Transferable BERT

05/17/2019
by   Zhongyang Li, et al.

Recent advances such as GPT and BERT have shown that pre-training a Transformer language model and then fine-tuning it can improve downstream NLP systems. However, this framework still has fundamental limitations in effectively incorporating supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer to a target task not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from semantically related supervised tasks. In particular, we propose further training BERT on three kinds of transfer tasks, namely natural language inference, sentiment classification, and next action prediction, starting from the pre-trained model. This gives the model a better initialization for the target task. We take story ending prediction as the target task in our experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baselines. Several comparative experiments offer guidance on how to select transfer tasks, and an error analysis shows the strengths and weaknesses of BERT-based models for story ending prediction.
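The training schedule described above can be sketched as a minimal three-stage pipeline. This is an illustration only: the function and task names below are assumptions for exposition, not the authors' code.

```python
# Illustrative sketch of the three-stage TransBERT training schedule.
# Stage 1 is the usual unsupervised pre-training; stage 2 continues
# training on a supervised transfer task (e.g. natural language
# inference); stage 3 fine-tunes on the target task, story ending
# prediction. The `transbert_schedule` name and stage labels are
# hypothetical, chosen here for clarity.

def transbert_schedule(transfer_task: str) -> list:
    """Return the ordered training stages for one choice of transfer task."""
    return [
        "pretrain:masked-lm",          # general language knowledge (unlabeled data)
        "transfer:" + transfer_task,   # supervised knowledge from a related task
        "finetune:story-ending-prediction",  # target task
    ]

# The paper compares three candidate transfer tasks for the middle stage:
for task in ("natural-language-inference",
             "sentiment-classification",
             "next-action-prediction"):
    print(transbert_schedule(task))
```

The key design point is that the weights produced by stage 2 serve as the initialization for stage 3, rather than fine-tuning directly from the generic pre-trained checkpoint.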


