Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

03/28/2023
by Sara Papi, et al.

Despite its pivotal role in research experiments, code correctness is often presumed solely on the basis of the perceived quality of the results. This presumption carries the risk of erroneous outcomes and potentially misleading findings. To address this issue, we posit that the current focus on result reproducibility should go hand in hand with an emphasis on coding best practices. We bolster our call to the NLP community by presenting a case study in which we identify (and correct) three bugs in widely used open-source implementations of the state-of-the-art Conformer architecture. Through comparative experiments on automatic speech recognition and translation in various language settings, we demonstrate that the existence of bugs does not prevent the achievement of good and reproducible results, yet can lead to incorrect conclusions that potentially misguide future research. In response, this study is a call to action toward the adoption of coding best practices aimed at fostering correctness and improving the quality of the developed software.
