Text Characterization Toolkit

10/04/2022
by   Daniel Simig, et al.
0

In NLP, models are usually evaluated by reporting single-number performance scores on a number of readily available benchmarks, without much deeper analysis. Here, we argue that - especially given the well-known fact that benchmarks often contain biases, artefacts, and spurious correlations - deeper results analysis should become the de-facto standard when presenting new models or benchmarks. We present a tool that researchers can use to study properties of the dataset and the influence of those properties on their models' behaviour. Our Text Characterization Toolkit includes both an easy-to-use annotation tool, as well as off-the-shelf scripts that can be used for specific analyses. We also present use-cases from three different domains: we use the tool to predict what are difficult examples for given well-known trained models and identify (potentially harmful) biases and heuristics that are present in a dataset.

READ FULL TEXT

page 11

page 14

page 15

research
07/18/2022

MRCLens: an MRC Dataset Bias Detection Toolkit

Many recent neural models have shown remarkable empirical results in Mac...
research
11/20/2019

Transfer Learning Toolkit: Primers and Benchmarks

The transfer learning toolkit wraps the codes of 17 transfer learning mo...
research
02/19/2023

SanskritShala: A Neural Sanskrit NLP Toolkit with Web-Based Interface for Pedagogical and Annotation Purposes

We present a neural Sanskrit Natural Language Processing (NLP) toolkit n...
research
05/26/2023

Finspector: A Human-Centered Visual Inspection Tool for Exploring and Comparing Biases among Foundation Models

Pre-trained transformer-based language models are becoming increasingly ...
research
04/19/2021

skweak: Weak Supervision Made Easy for NLP

We present skweak, a versatile, Python-based software toolkit enabling N...
research
10/14/2022

A Survey of Parameters Associated with the Quality of Benchmarks in NLP

Several benchmarks have been built with heavy investment in resources to...

Please sign up or login with your details

Forgot password? Click here to reset