Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models

06/22/2022
by Liu Zhendong, et al.

With the rapid development of eXplainable Artificial Intelligence (XAI), a long line of past work has raised concerns about the Out-of-Distribution (OOD) problem in perturbation-based post-hoc XAI models and about explanations being socially misaligned. We explore the limitations of post-hoc explanation methods that use approximators to mimic the behavior of black-box models. We then propose eXplanation-based Counterfactual Retraining (XCR), which extracts feature importance quickly. XCR uses the explanations generated by the XAI model as counterfactual inputs to retrain the black-box model, addressing the OOD and social misalignment problems. Evaluation on popular image datasets shows that XCR can improve model performance when retaining only 12.5% of the features, without changing the black-box model structure. Furthermore, evaluation on benchmark corruption datasets shows that XCR helps improve model robustness and positively impacts calibration under OOD conditions. Even without calibration on the validation set, which some OOD calibration methods rely on, XCR outperforms existing methods on the corrupted-data metric. When calibration on the validation set is applied, our method also beats current OOD calibration methods on the OOD calibration metric.
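
To make the idea of retaining only a small fraction of features concrete, the following is a minimal sketch, not the authors' implementation. It assumes that an explanation method has already produced a per-pixel importance map, and it builds a masked copy of the image that keeps only the highest-scoring pixels. The function name `keep_top_features`, the `keep_fraction` and `fill_value` parameters, and the choice to fill removed pixels with a constant are all assumptions made for illustration; the abstract does not specify how features are selected or what replaces the discarded ones.

```python
import numpy as np


def keep_top_features(image, importance, keep_fraction=0.125, fill_value=0.0):
    """Return a copy of `image` that keeps only the most important pixels.

    `image` has shape (H, W) or (H, W, C); `importance` has shape (H, W).
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if image.shape[:2] != importance.shape:
        raise ValueError("importance map must match the image's height and width")

    # Number of pixels to retain (at least one).
    n_keep = max(1, int(round(keep_fraction * importance.size)))

    # Indices of the n_keep largest importance scores.
    flat = importance.ravel()
    top_idx = np.argpartition(flat, -n_keep)[-n_keep:]

    # Boolean mask over pixel positions: True where the pixel is kept.
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top_idx] = True
    mask = mask.reshape(importance.shape)

    # Start from a constant image and copy the retained pixels back in.
    masked = np.full_like(image, fill_value)
    masked[mask] = image[mask]
    return masked, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32, 3))
    scores = rng.random((32, 32))  # stand-in for an explanation's importance map
    masked_img, kept = keep_top_features(img, scores)
    print(kept.sum(), "of", kept.size, "pixels retained")
```

In a retraining setup along the lines the abstract describes, masked images like `masked_img` could be fed to the black-box model alongside the original data; how they are labeled and weighted is a design choice the abstract leaves open.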

