Using Two Losses and Two Datasets Simultaneously to Improve TempoWiC Accuracy

12/15/2022
by Mohammad Javad Pirhadi, et al.

Word sense disambiguation (WSD) is the task of identifying which sense of a word is meant in a sentence or other segment of text. Researchers have worked on this task for years (e.g., Pustejovsky, 2002), but it remains challenging even for state-of-the-art (SOTA) language models (LMs). The new TempoWiC dataset, introduced by Loureiro et al. (2022b), focuses on the fact that word meanings change over time. Their best baseline achieves 70.33. We use two different losses simultaneously to train RoBERTa-based classification models. We also improve our model by using another similar dataset so that it generalizes better. Our best configuration beats their best baseline by 4.23.
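To make the idea of training with two losses at once more concrete, here is a minimal sketch in plain Python. The abstract does not say which two losses are used or how they are weighted, so everything below is an assumption for illustration: a binary cross-entropy term on the classifier's predicted probability, a cosine-embedding term on the two target-word embeddings, and a hypothetical mixing weight `alpha`. All function names are invented for this sketch and are not taken from the paper.

```python
import math


def binary_cross_entropy(prob, label):
    """Cross-entropy for one example; prob is the predicted P(same meaning)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_embedding_loss(u, v, label, margin=0.0):
    """Pull embeddings together when label == 1, push them apart otherwise."""
    cos = cosine_similarity(u, v)
    return 1.0 - cos if label == 1 else max(0.0, cos - margin)


def combined_loss(prob, u, v, label, alpha=0.5):
    """Weighted sum of the two losses (alpha is an assumed hyperparameter)."""
    classification_term = binary_cross_entropy(prob, label)
    embedding_term = cosine_embedding_loss(u, v, label)
    return alpha * classification_term + (1.0 - alpha) * embedding_term


# Hypothetical example: the classifier is unsure (0.5), and the two
# target-word embeddings are identical, for a pair labeled "same meaning".
loss = combined_loss(0.5, [1.0, 0.0], [1.0, 0.0], label=1)
```

In a real setup both terms would be computed per batch from the RoBERTa outputs and backpropagated together; the weighted sum is only one common way to combine two objectives.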
