Drawing Causal Inferences About Performance Effects in NLP

09/14/2022
by   Sandra Wankmüller, et al.

This article emphasizes that NLP as a science seeks to make inferences about the performance effects that result from applying one method (compared to another method) in the processing of natural language. Yet NLP research in practice usually does not achieve this goal: NLP research articles typically compare only a few models. Each model results from a specific procedural pipeline (here named processing system) composed of a specific collection of methods used in preprocessing, pretraining, hyperparameter tuning, and training on the target task. To make generalizing inferences about the performance effect caused by applying some method A vs. another method B, it is not sufficient to compare a few specific models produced by a few specific (probably incomparable) processing systems. Rather, the following procedure would allow drawing inferences about methods' performance effects: (1) Define the population of processing systems to which researchers seek to generalize. (2) Draw a random sample of processing systems from this population. (The sampled processing systems will vary in the methods they apply along their procedural pipelines and in the composition of the training and test data sets used for training and evaluation.) (3) Apply each processing system once with method A and once with method B. (4) Based on the sample of applied processing systems, approximate the expected generalization errors of method A and method B. (5) Take the difference between the two expected generalization errors as the estimated average treatment effect of applying method A compared to method B in the population of processing systems.
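The five steps can be sketched in code. The following Python simulation is only an illustration under strong assumptions, not the authors' implementation: the design space `DESIGN_SPACE`, its option names, and the function `run_system` are hypothetical, and `run_system` returns a synthetic error with a built-in effect of -0.02 for method A rather than training any real model. In an actual study it would be replaced by running the sampled pipeline and evaluating it on its test set.

```python
import random
import statistics

# Step 1: a hypothetical population of processing systems, defined as all
# combinations of these illustrative pipeline choices and data splits.
DESIGN_SPACE = {
    "preprocessing": ["lowercase", "none"],
    "pretraining": ["small_corpus", "large_corpus"],
    "learning_rate": [1e-5, 3e-5, 1e-4],
    "data_split_seed": list(range(100)),  # varies train/test composition
}


def sample_processing_system(rng):
    """Step 2: draw one processing system uniformly at random."""
    return {name: rng.choice(options) for name, options in DESIGN_SPACE.items()}


def run_system(system, method, rng):
    """Stand-in for training and evaluating one system with one method.

    Returns a SYNTHETIC test error; all numbers here are made up so that
    the estimator below has a known effect (-0.02) to recover.
    """
    error = 0.30
    if system["pretraining"] == "large_corpus":
        error -= 0.05
    if method == "A":
        error -= 0.02  # assumed effect, built into the simulation
    error += rng.gauss(0.0, 0.01)  # run-to-run noise
    return min(1.0, max(0.0, error))


def estimate_ate(n_systems, seed=0):
    """Steps 2-5: sample systems, apply A and B to each, compare mean errors."""
    rng = random.Random(seed)
    errors_a, errors_b = [], []
    for _ in range(n_systems):
        system = sample_processing_system(rng)
        # Step 3: the same system is applied once with A and once with B.
        errors_a.append(run_system(system, "A", rng))
        errors_b.append(run_system(system, "B", rng))
    # Step 4: approximate the expected generalization error of each method.
    mean_a = statistics.fmean(errors_a)
    mean_b = statistics.fmean(errors_b)
    # Step 5: the difference is the estimated average treatment effect.
    ate = mean_a - mean_b
    # Standard error of the paired differences across sampled systems.
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    std_err = statistics.stdev(diffs) / n_systems ** 0.5
    return mean_a, mean_b, ate, std_err


if __name__ == "__main__":
    mean_a, mean_b, ate, std_err = estimate_ate(n_systems=500)
    print(f"mean error A: {mean_a:.4f}, mean error B: {mean_b:.4f}")
    print(f"estimated ATE (A vs. B): {ate:.4f} (SE {std_err:.4f})")
```

Because each sampled system is run with both methods, the comparison is paired: differences between systems, such as the pretraining corpus or the data split, cancel out of the per-system difference, and the estimate refers to the whole sampled population rather than to a few hand-picked pipelines.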
