Investigating Text Simplification Evaluation

Modern text simplification (TS) relies heavily on the availability of gold-standard data to build machine learning models. However, existing studies show that parallel TS corpora contain inaccurate simplifications and incorrect alignments. Additionally, evaluation is usually performed by using metrics such as BLEU or SARI to compare system output to the gold standard. A major limitation is that these metrics do not match human judgements, and that performance across different datasets and linguistic phenomena varies greatly. Furthermore, our research shows that the test and training subsets of parallel datasets differ significantly. In this work, we investigate existing TS corpora, providing new insights that will motivate the improvement of existing state-of-the-art TS evaluation methods. Our contributions include an analysis of TS corpora based on the modifications used for simplification and an empirical study of TS model performance using better-distributed datasets. We demonstrate that by improving the distribution of TS datasets, we can build more robust TS models.
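
To make concrete what comparing system output to a gold standard involves, the following is a minimal sketch of clipped n-gram precision, the core quantity behind BLEU. It is an illustration only and not the evaluation setup used in this work: the function names, the whitespace tokenization, the single-reference setting, and the example sentences are all assumptions, and full BLEU additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference. Tokenization is plain whitespace splitting
    (an assumption made for this sketch).
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0  # candidate too short to contain any n-gram
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


# Hypothetical system output and gold-standard simplification.
system_output = "the cat sat on the mat"
gold_simplification = "the cat is on the mat"

unigram_p = clipped_precision(system_output, gold_simplification, n=1)
bigram_p = clipped_precision(system_output, gold_simplification, n=2)
print(unigram_p, bigram_p)
```

Because the score depends only on surface overlap with the reference, an inaccurate gold simplification or a misaligned sentence pair directly distorts it, which is one way to read the limitation described above.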

