Model-agnostic and Scalable Counterfactual Explanations via Reinforcement Learning

Counterfactual instances are a powerful tool to obtain valuable insights into automated decision processes, describing the minimal changes in the input space necessary to alter the prediction towards a desired target. Most previous approaches require a separate, computationally expensive optimization procedure per instance, making them impractical for both large datasets and high-dimensional data. Moreover, these methods are often restricted to certain subclasses of machine learning models (e.g., differentiable or tree-based models). In this work, we propose a deep reinforcement learning approach that transforms the optimization procedure into an end-to-end learnable process, allowing us to generate batches of counterfactual instances in a single forward pass. Our experiments on real-world data show that our method i) is model-agnostic (does not assume differentiability), relying only on feedback from model predictions; ii) allows for generating target-conditional counterfactual instances; iii) allows for flexible feature range constraints for numerical and categorical attributes, including the immutability of protected features (e.g., gender, race); iv) is easily extended to other data modalities such as images.
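To make the properties listed above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a black-box classifier exposed only through a prediction function, and shows how a proposed counterfactual could be forced to respect feature constraints (immutable features, numeric ranges) and then scored using model feedback alone. All names (`project`, `reward`, `toy_model`) and the toy data are hypothetical; in the method described above the proposal would come from a generator trained with reinforcement learning, whereas here it is hard-coded.

```python
from typing import Callable, Dict, List, Set, Tuple

Instance = List[float]


def project(x: Instance, proposal: Instance, immutable: Set[int],
            ranges: Dict[int, Tuple[float, float]]) -> Instance:
    """Force a proposed counterfactual to respect the feature constraints."""
    out = []
    for i, (orig, prop) in enumerate(zip(x, proposal)):
        if i in immutable:
            out.append(orig)  # protected feature: keep the original value
        elif i in ranges:
            lo, hi = ranges[i]
            out.append(min(max(prop, lo), hi))  # clip to the allowed range
        else:
            out.append(prop)
    return out


def reward(predict: Callable[[Instance], int], x_cf: Instance, target: int) -> float:
    """Sparse reward from prediction feedback only: no gradients of the model are used."""
    return 1.0 if predict(x_cf) == target else 0.0


def sparsity(x: Instance, x_cf: Instance) -> int:
    """Number of features that differ between the input and the counterfactual."""
    return sum(1 for a, b in zip(x, x_cf) if a != b)


def toy_model(x: Instance) -> int:
    """Hypothetical black-box classifier (stands in for any non-differentiable model)."""
    return 1 if x[0] + 2 * x[1] > 5.0 else 0


x = [1.0, 1.0, 0.0]         # toy features; index 2 plays the role of a protected attribute
proposal = [2.5, 3.0, 1.0]  # assumed output of a generator, hard-coded for illustration
x_cf = project(x, proposal, immutable={2}, ranges={0: (0.0, 2.0)})

print(x_cf, reward(toy_model, x_cf, target=1), sparsity(x, x_cf))
```

Here the protected feature is reset to its original value and the first feature is clipped to its allowed range before the classifier is queried, so the reward reflects only counterfactuals that satisfy the constraints.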


