OCHADAI-KYODAI at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction

by Yuki Taya, et al.

We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWE and outputs its complexity score. Given that a key challenge with this task is the limited size of annotated data, our model relies on pretrained contextual representations from different state-of-the-art transformer-based language models (i.e., BERT and RoBERTa), and on a variety of training methods for further enhancing model generalization and robustness: multi-step fine-tuning, multi-task learning, and adversarial training. Additionally, we propose to enrich contextual representations by adding hand-crafted features during training. Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks.
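As a rough illustration of two ideas mentioned in the abstract, enriching a contextual representation with hand-crafted features and combining several models into an ensemble, the following minimal sketch may help. It is not the authors' implementation: the two features (character count and token count), the plain averaging of per-model scores, and all function names are assumptions made for illustration only, and the contextual vector is a stand-in for whatever a BERT or RoBERTa encoder would produce.

```python
from statistics import mean
from typing import List, Sequence


def handcrafted_features(target: str) -> List[float]:
    """Toy hand-crafted features for a target word or MWE (hypothetical choices)."""
    tokens = target.split()
    return [
        float(sum(len(tok) for tok in tokens)),  # characters, ignoring spaces
        float(len(tokens)),                      # tokens (1 for a single word)
    ]


def enrich(contextual: Sequence[float], target: str) -> List[float]:
    """Concatenate a contextual vector with the hand-crafted features."""
    return list(contextual) + handcrafted_features(target)


def ensemble_score(scores: Sequence[float]) -> float:
    """Combine per-model complexity scores by simple averaging (an assumption)."""
    return mean(scores)


if __name__ == "__main__":
    # A stand-in 2-dimensional "contextual" vector for the MWE "heart attack".
    print(enrich([0.1, 0.2], "heart attack"))
    # Scores from two hypothetical fine-tuned models.
    print(ensemble_score([0.25, 0.75]))
```

In a real system the enriched vector would feed a regression head, and the ensemble could just as well use weighted averaging or stacking; the abstract does not say which combination rule was used.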



