Towards learning to explain with concept bottleneck models: mitigating information leakage

11/07/2022
by Joshua Lockhart, et al.

Concept bottleneck models perform classification by first predicting which of a list of human-provided concepts hold for a data point. A downstream model then uses these predicted concept labels to predict the target label, so the predicted concepts act as a rationale for the target prediction. Model trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to attain soft concept predictions that do not contain leaked information.
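
To make the two-stage structure concrete, the sketch below implements a minimal concept bottleneck model in PyTorch and obtains soft concept predictions by averaging sigmoid outputs over several stochastic forward passes with dropout left active (Monte-Carlo Dropout). The layer sizes, dropout rate, number of samples, and all class and method names are illustrative assumptions, not the authors' implementation; it only illustrates the mechanism the abstract describes.

```python
# Minimal sketch of a concept bottleneck model with Monte-Carlo Dropout
# concept predictions. Architecture sizes and names are illustrative
# assumptions, not the paper's exact implementation.

import torch
import torch.nn as nn


class ConceptBottleneckModel(nn.Module):
    def __init__(self, input_dim: int, n_concepts: int, n_classes: int, p_drop: float = 0.5):
        super().__init__()
        # Concept predictor: maps raw inputs to logits for each human-provided concept.
        self.concept_net = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Dropout(p_drop),  # dropout kept active at prediction time for MC sampling
            nn.Linear(128, n_concepts),
        )
        # Downstream label predictor: sees only the (soft) concept predictions.
        self.label_net = nn.Linear(n_concepts, n_classes)

    def predict_concepts_mc(self, x: torch.Tensor, n_samples: int = 20) -> torch.Tensor:
        """Soft concept predictions via Monte-Carlo Dropout: average the sigmoid
        outputs over several stochastic forward passes."""
        was_training = self.training
        self.train()  # keep dropout active during sampling
        with torch.no_grad():  # concepts treated as fixed inputs to the label predictor
            samples = torch.stack(
                [torch.sigmoid(self.concept_net(x)) for _ in range(n_samples)]
            )
        if not was_training:
            self.eval()
        return samples.mean(dim=0)  # shape: (batch, n_concepts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        soft_concepts = self.predict_concepts_mc(x)
        return self.label_net(soft_concepts)  # target-label logits


# Illustrative usage with random data.
model = ConceptBottleneckModel(input_dim=64, n_concepts=10, n_classes=5)
logits = model(torch.randn(8, 64))  # (8, 5) target-label logits
```

In this sketch the downstream label predictor only ever sees the averaged concept probabilities, which is what makes the concept vector act as a bottleneck and a rationale for the final prediction.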



Related research

11/21/2022
Learn to explain yourself, when you can: Equipping Concept Bottleneck Models with the ability to abstain on their concept predictions
The Concept Bottleneck Models (CBMs) of Koh et al. [2020] provide a mean...

12/14/2022
Interactive Concept Bottleneck Models
Concept bottleneck models (CBMs) (Koh et al. 2020) are interpretable neu...

07/09/2020
Concept Bottleneck Models
We seek to learn models that we can interact with using high-level conce...

05/31/2022
Post-hoc Concept Bottleneck Models
Concept Bottleneck Models (CBMs) map the inputs onto a set of interpreta...

02/07/2023
Towards a Deeper Understanding of Concept Bottleneck Models Through End-to-End Explanation
Concept Bottleneck Models (CBMs) first map raw input(s) to a vector of h...

08/25/2023
Learning to Intervene on Concept Bottlenecks
While traditional deep learning models often lack interpretability, conc...

02/28/2023
A Closer Look at the Intervention Procedure of Concept Bottleneck Models
Concept bottleneck models (CBMs) are a class of interpretable neural net...
