UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation

10/07/2022
by Injy Sarhan, et al.

This paper presents our strategy to address SemEval-2022 Task 3 PreTENS: Presupposed Taxonomies Evaluating Neural Network Semantics. The goal of the task is to identify whether a sentence is acceptable depending on the taxonomic relationship that holds between a noun pair contained in the sentence. For sub-task 1 – binary classification – we propose an effective way to enhance the robustness and generalizability of language models for better classification on this downstream task. We design a two-stage fine-tuning procedure on the ELECTRA language model using data augmentation techniques. Rigorous experiments are carried out using multi-task learning and data-enriched fine-tuning. Experimental results demonstrate that our proposed model, UU-Tax, indeed generalizes well to the downstream task. For sub-task 2 – regression – we propose a simple classifier trained on features obtained from the Universal Sentence Encoder (USE). In addition to describing the submitted systems, we discuss other experiments employing pre-trained language models and data augmentation techniques. For both sub-tasks, we perform an error analysis to further understand the behaviour of the proposed models. We achieved a global F1_Binary score of 91.25% in sub-task 1 and a global rho score of 0.221 in sub-task 2.
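To illustrate the kind of data augmentation used to enrich fine-tuning data, here is a minimal sketch of lexical substitution in pure Python. The synonym table and the `augment` function are hypothetical stand-ins for illustration only; the actual augmentation resources used by UU-Tax (e.g., back-translation or embedding-based substitution) are not shown here.

```python
import random

# Illustrative synonym table -- a hypothetical stand-in for a real
# lexical-substitution resource (thesaurus or embedding neighbours).
SYNONYMS = {
    "animal": ["creature", "beast"],
    "dog": ["hound", "canine"],
}

def augment(sentence, seed=0):
    """Return a variant of `sentence` with covered words swapped for synonyms.

    Tokens not found in the synonym table are kept unchanged, so the
    taxonomic relation expressed by the sentence is preserved.
    """
    rng = random.Random(seed)  # fixed seed for reproducible augmentation
    out = []
    for token in sentence.split():
        key = token.lower()
        if key in SYNONYMS:
            out.append(rng.choice(SYNONYMS[key]))
        else:
            out.append(token)
    return " ".join(out)

print(augment("a dog is an animal"))
```

Each augmented variant would then be added to the fine-tuning set alongside the original sentence, with the same acceptability label.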


