DeepAI AI Chat
Log In Sign Up

LRG at SemEval-2020 Task 7: Assessing the Ability of BERT and Derivative Models to Perform Short-Edits based Humor Grading

by   Siddhant Mahurkar, et al.
BITS Pilani

In this paper, we assess the ability of BERT and its derivative models (RoBERTa, DistilBERT, and ALBERT) for short-edits based humor grading. We test these models for humor grading and classification tasks on the Humicroedit and the FunLines dataset. We perform extensive experiments with these models to test their language modeling and generalization abilities via zero-shot inference and cross-dataset inference based approaches. Further, we also inspect the role of self-attention layers in humor-grading by performing a qualitative analysis over the self-attention weights from the final layer of the trained BERT model. Our experiments show that all the pre-trained BERT derivative models show significant generalization capabilities for humor-grading related tasks.


page 1

page 2

page 3

page 4


LV-BERT: Exploiting Layer Variety for BERT

Modern pre-trained language models are mostly built upon backbones stack...

I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths

Self-attention has emerged as a vital component of state-of-the-art sequ...

PoWER-BERT: Accelerating BERT inference for Classification Tasks

BERT has emerged as a popular model for natural language understanding. ...

ConvBERT: Improving BERT with Span-based Dynamic Convolution

Pre-trained language models like BERT and its variants have recently ach...

Mediators in Determining what Processing BERT Performs First

Probing neural models for the ability to perform downstream tasks using ...

Have Attention Heads in BERT Learned Constituency Grammar?

With the success of pre-trained language models in recent years, more an...