Models of reference production: How do they withstand the test of time?

07/27/2023
by Fahime Same, et al.

In recent years, many NLP studies have focused solely on performance improvement. In this work, we focus on the linguistic and scientific aspects of NLP. We use the task of generating referring expressions in context (REG-in-context) as a case study and start our analysis from GREC, a comprehensive set of shared tasks in English that addressed this topic over a decade ago. We ask what the performance of models would be if we assessed them (1) on more realistic datasets, and (2) using more advanced methods. We test the models using different evaluation metrics and feature selection experiments. We conclude that GREC can no longer be regarded as offering a reliable assessment of models' ability to mimic human reference production, because the results are highly impacted by the choice of corpus and evaluation metrics. Our results also suggest that pre-trained language models are less dependent on the choice of corpus than classic Machine Learning models, and therefore make more robust class predictions.

