Learned Feature Attribution Priors

by Ethan Weinberger, et al.

Deep learning models have achieved breakthrough successes in domains where data is plentiful. However, such models are prone to overfitting when trained on high-dimensional, low-sample-size datasets. Furthermore, the black-box nature of such models has limited their application in domains where model trust is critical. As a result, deep learning has struggled to make inroads in domains such as precision medicine, where small sample sizes are the norm and model trust is paramount. Even in low-data settings, however, we often have prior information about each input feature (meta-features) that may be related to that feature's relevance to the prediction task. In this work we propose the learned attribution prior framework to take advantage of such information and alleviate the overfitting and trust issues described above. For a given prediction task, our framework jointly learns a relationship between prior information about a feature and that feature's importance to the task, while also biasing the prediction model to focus on the features with high predicted importance. We find that training models using our framework improves model accuracy in low-data settings. Furthermore, we find that the resulting learned meta-feature-to-feature relationships open up new avenues for model interpretation.
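
The joint training idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the attribution method or the form of the prior model, so this sketch assumes a linear predictor whose absolute weights stand in for feature attributions, and a linear prior that maps a hypothetical meta-feature matrix M to predicted importances. The names make_toy_data, train_with_learned_prior, lam, and theta are all illustrative.

```python
import numpy as np


def make_toy_data(n=20, d=50, seed=0):
    """Hypothetical toy problem: few samples, many features, one meta-feature."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = np.zeros(d)
    w_true[:5] = 1.0  # only the first 5 features are truly relevant
    y = X @ w_true + 0.1 * rng.normal(size=n)
    # Meta-feature matrix M: column 0 is a noisy relevance hint, column 1 a bias.
    hint = (w_true > 0).astype(float) + 0.1 * rng.normal(size=d)
    M = np.column_stack([hint, np.ones(d)])
    return X, y, M


def train_with_learned_prior(X, y, M, lam=0.1, lr=0.01, steps=500, seed=0):
    """Jointly fit predictor weights w and prior parameters theta.

    Loss = mean((X w - y)^2) + lam * sum((|w| - M theta)^2)
    Assumption: |w| is used as a simple stand-in for per-feature attributions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = 0.01 * rng.normal(size=d)   # small nonzero start so sign(w) is defined
    theta = np.zeros(M.shape[1])
    losses = []
    for _ in range(steps):
        resid = X @ w - y                # prediction error
        gap = np.abs(w) - M @ theta      # attribution minus predicted importance
        losses.append(np.mean(resid ** 2) + lam * np.sum(gap ** 2))
        # Gradients of the loss with respect to w and theta.
        grad_w = (2.0 / n) * (X.T @ resid) + 2.0 * lam * gap * np.sign(w)
        grad_theta = -2.0 * lam * (M.T @ gap)
        w -= lr * grad_w
        theta -= lr * grad_theta
    return w, theta, np.array(losses)
```

Calling train_with_learned_prior(*make_toy_data()) returns the predictor weights, the prior parameters, and the loss trace. In this sketch, M @ theta gives the predicted importance of each feature, and theta itself is the learned meta-feature-to-importance relationship that one could inspect for interpretation.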




Learning Explainable Models Using Attribution Priors

Two important topics in deep learning both involve incorporating humans ...

Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency

Ensuring the trustworthiness and interpretability of machine learning mo...

Analysis of a Deep Learning Model for 12-Lead ECG Classification Reveals Learned Features Similar to Diagnostic Criteria

Despite their remarkable performance, deep neural networks remain unadop...

Sound Explanation for Trustworthy Machine Learning

We take a formal approach to the explainability problem of machine learn...

Explaining COVID-19 and Thoracic Pathology Model Predictions by Identifying Informative Input Features

Neural networks have demonstrated remarkable performance in classificati...

Incorporating Priors with Feature Attribution on Text Classification

Feature attribution methods, proposed recently, help users interpret the...

Regularising Non-linear Models Using Feature Side-information

Very often features come with their own vectorial descriptions which pro...
