RECAST: Interactive Auditing of Automatic Toxicity Detection Models

01/07/2020
by Austin P. Wright, et al.

As toxic language becomes nearly pervasive online, there has been increasing interest in leveraging advances in natural language processing (NLP), such as very large transformer models, to automatically detect and remove toxic comments. Despite fairness concerns, a lack of adversarial robustness, and limited prediction explainability in deep learning systems, there is currently little work on auditing these systems or on helping developers and users understand how they work. We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.
