HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity

04/11/2022
by Zihang Xu, et al.

This paper describes our system for SemEval-2022 Task 8: Multilingual News Article Similarity. We propose a linguistics-inspired regression model trained with several task-specific strategies. The main techniques of our system are: 1) data augmentation, 2) a multi-label loss, 3) an adapted R-Drop, and 4) sample reconstruction with a head-tail combination. We also briefly analyze methods that did not help, such as a two-tower architecture. Our system ranked 1st on the leaderboard, achieving a Pearson's Correlation Coefficient of 0.818 on the official evaluation set.
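For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Pearson's Correlation Coefficient between predicted and gold similarity scores can be computed. It is not the task's official scorer or the authors' code; the function name `pearson` and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def pearson(predicted, gold):
    """Pearson's correlation coefficient between two equal-length score lists.

    Hypothetical helper for illustration; not the official evaluation script.
    """
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n

    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant scores")

    return cov / math.sqrt(var_p * var_g)


if __name__ == "__main__":
    # Made-up similarity scores, purely for demonstration.
    system_scores = [1.2, 2.9, 3.4, 1.0]
    gold_scores = [1.0, 3.0, 4.0, 1.5]
    print(round(pearson(system_scores, gold_scores), 3))
```

Because the coefficient is invariant to positive linear rescaling of either score list, a regression model is rewarded for ranking and spacing article pairs consistently with the gold annotations rather than for matching their absolute scale.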


research
02/24/2023

HULAT at SemEval-2023 Task 9: Data augmentation for pre-trained transformers applied to Multilingual Tweet Intimacy Analysis

This paper describes our participation in SemEval-2023 Task 9, Intimacy ...
research
04/27/2023

NAP at SemEval-2023 Task 3: Is Less Really More? (Back-)Translation as Data Augmentation Strategies for Detecting Persuasion Techniques

Persuasion techniques detection in news in a multi-lingual setup is non-...
research
09/22/2020

On Data Augmentation for Extreme Multi-label Classification

In this paper, we focus on data augmentation for the extreme multi-label...
research
05/31/2022

GateNLP-UShef at SemEval-2022 Task 8: Entity-Enriched Siamese Transformer for Multilingual News Article Similarity

This paper describes the second-placed system on the leaderboard of SemE...
research
07/29/2017

Sentiment Analysis on Financial News Headlines using Training Dataset Augmentation

This paper discusses the approach taken by the UWaterloo team to arrive ...
research
11/03/2022

Exploring the State-of-the-Art Language Modeling Methods and Data Augmentation Techniques for Multilingual Clause-Level Morphology

This paper describes the KUIS-AI NLP team's submission for the 1^st Shar...
research
01/15/2023

Hawk: An Industrial-strength Multi-label Document Classifier

There are a plethora of methods and algorithms that solve the classical ...
