Applying a Pre-trained Language Model to Spanish Twitter Humor Prediction

07/06/2019
by Bobak Farzin, et al.

Our entry into the HAHA 2019 Challenge placed 3rd in the classification task and 2nd in the regression task. We describe our system and innovations, and compare our results to a Naive Bayes baseline. A large Twitter-based corpus allowed us to train a Spanish language model from scratch and transfer that knowledge to our competition model. To overcome the inherent errors in some labels, we reduce class confidence with label smoothing in the loss function. All the code for our project is available in a GitHub repository for easy reference and to enable replication by others.
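The label smoothing mentioned above replaces the hard one-hot targets with a slightly softened distribution, so the model is not pushed toward full confidence on labels that may themselves be noisy. The sketch below is a minimal, generic PyTorch version of such a smoothed cross-entropy loss, not the authors' exact implementation; the function name and the smoothing factor `epsilon` are illustrative choices.

import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, epsilon=0.1):
    """Cross-entropy with label smoothing (illustrative sketch).

    Mixes the standard negative log-likelihood with a loss against
    the uniform distribution, weighted by epsilon, so the model is
    never rewarded for assigning probability 1.0 to a single class.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Loss term that spreads a small amount of mass over all classes
    uniform_loss = -log_probs.mean(dim=-1)
    # Standard negative log-likelihood for the annotated class
    nll = F.nll_loss(log_probs, target, reduction="none")
    return ((1.0 - epsilon) * nll + epsilon * uniform_loss).mean()

With epsilon set to 0 this reduces to ordinary cross-entropy; small values such as 0.1 only mildly penalize over-confident predictions on mislabeled examples.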


