Text Characterization Toolkit

10/04/2022
by Daniel Simig, et al.

In NLP, models are usually evaluated by reporting single-number performance scores on a number of readily available benchmarks, without much deeper analysis. Here, we argue that, especially given the well-known fact that benchmarks often contain biases, artefacts, and spurious correlations, deeper results analysis should become the de facto standard when presenting new models or benchmarks. We present a tool that researchers can use to study properties of a dataset and the influence of those properties on their models' behaviour. Our Text Characterization Toolkit includes both an easy-to-use annotation tool and off-the-shelf scripts that can be used for specific analyses. We also present use cases from three different domains: we use the tool to predict which examples are difficult for given well-known trained models, and to identify (potentially harmful) biases and heuristics that are present in a dataset.

07/18/2022

MRCLens: an MRC Dataset Bias Detection Toolkit

Many recent neural models have shown remarkable empirical results in Mac...
11/20/2019

Transfer Learning Toolkit: Primers and Benchmarks

The transfer learning toolkit wraps the codes of 17 transfer learning mo...
02/19/2023

SanskritShala: A Neural Sanskrit NLP Toolkit with Web-Based Interface for Pedagogical and Annotation Purposes

We present a neural Sanskrit Natural Language Processing (NLP) toolkit n...
04/19/2021

skweak: Weak Supervision Made Easy for NLP

We present skweak, a versatile, Python-based software toolkit enabling N...
04/16/2020

ViBE: A Tool for Measuring and Mitigating Bias in Image Datasets

Machine learning models are known to perpetuate the biases present in th...
10/14/2022

A Survey of Parameters Associated with the Quality of Benchmarks in NLP

Several benchmarks have been built with heavy investment in resources to...
10/24/2022

Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models

Cognitive psychologists have documented that humans use cognitive heuris...