As research on machine translation moves to translating text beyond the ...
Automatic evaluation of machine translation (MT) is a critical tool driv...
Kendall's tau is frequently used to meta-evaluate how well machine trans...
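As an illustrative aside to the abstract above: Kendall's tau measures how often a metric and human judgments rank the same pairs of items in the same order. A minimal pure-Python sketch of the tau-a variant (ties counted in neither direction) on invented toy data might look like this; the data and function name are hypothetical, not from any paper or library:

```python
from itertools import combinations

def kendalls_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    x and y are parallel score lists (e.g., human judgments and a
    metric's scores for the same translations). Tied pairs count
    toward neither concordant nor discordant.
    """
    assert len(x) == len(y) and len(x) > 1
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)

# Toy data: five translations; the metric swaps one adjacent pair,
# so 9 of 10 pairs are concordant and tau = (9 - 1) / 10 = 0.8.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
metric = [1.2, 2.1, 2.9, 4.5, 4.4]
tau = kendalls_tau(human, metric)
```

A tau of 1.0 means the metric reproduces the human ranking exactly; -1.0 means it reverses it.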
The acquisition of high-quality human annotations through crowdsourcing ...
There is significant interest in developing evaluation metrics which acc...
We introduce Repro, an open-source library which aims at improving the r...
How reliably an automatic summarization evaluation metric replicates hum...
Question answering-based summarization evaluation metrics must automatic...
In this work, we propose a method for incorporating question-answering (...
The quality of a summarization evaluation metric is quantified by calcul...
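The abstract above refers to quantifying a metric's quality by calculation against human judgments; a common instance of this is system-level correlation, where per-summary scores are averaged by system and the averages are correlated. A hedged, self-contained sketch with invented system names and scores (no real dataset is used) could look like:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical per-system average scores for four systems A-D:
# the metric's averages and the human judges' averages.
metric_avg = {"A": 0.41, "B": 0.35, "C": 0.50, "D": 0.29}
human_avg = {"A": 3.9, "B": 3.1, "C": 4.4, "D": 2.8}
systems = sorted(metric_avg)
r = pearson([metric_avg[s] for s in systems],
            [human_avg[s] for s in systems])
```

A metric whose system-level averages track the human averages closely yields r near 1.0, which is the usual evidence that it is a reliable proxy for human evaluation.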
Reference-based metrics such as ROUGE or BERTScore evaluate the content ...
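To make the mention of ROUGE above concrete: at its core, ROUGE-1 scores n-gram (here, unigram) overlap between a candidate summary and a reference. The following is a deliberately simplified sketch, not the official implementation, which additionally handles stemming and more careful tokenization:

```python
from collections import Counter

def rouge1(reference, candidate):
    """Simplified ROUGE-1: clipped unigram overlap on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection clips each word's count at the smaller value.
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Toy example: 5 of 6 tokens overlap, so all three scores are 5/6.
scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
```

Because it only counts surface overlap, ROUGE can miss paraphrases; embedding-based metrics such as BERTScore were proposed in part to address this.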
Recently, there has been growing interest in using question-answering (Q...
We present SacreROUGE, an open-source library for using and developing s...