When Explanations Lie: Why Modified BP Attribution Fails

12/20/2019 ∙ by Leon Sixt, et al. ∙ Freie Universität Berlin

Modified backpropagation (BP) methods are a popular group of attribution methods. We analyse the most prominent ones: Deep Taylor Decomposition, Layer-wise Relevance Propagation, Excitation BP, PatternAttribution, Deconv, and Guided BP. We find empirically that the explanations produced by these modified BP methods are independent of the parameters of later layers, and we show that the z^+ rule used by several of the methods converges to a rank-1 matrix. This convergence explains why the network's actual decision is ignored. We also develop a new metric, cosine similarity convergence (CSC), to directly quantify the convergence of modified BP methods to a rank-1 matrix. We conclude that many modified BP methods do not faithfully explain the predictions of deep neural networks.
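The rank-1 convergence can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it assumes a toy chain of random square matrices whose negative entries are clipped to zero (a stand-in for the positive weights used by a z^+-style rule), and the function name `positive_chain_stats` and its parameters are hypothetical. It tracks two quantities per layer: the ratio of the second to the first singular value of the matrix product, and the absolute cosine similarity between two random vectors pushed through the same chain, which is loosely in the spirit of CSC. Taking the absolute value is a choice made here, since the two vectors may align only up to sign.

```python
import numpy as np


def positive_chain_stats(depth=20, width=64, seed=0):
    """Toy illustration: a product of non-negative matrices approaches rank 1.

    Returns two lists with one entry per layer:
      ratios  - second / first singular value of the running matrix product
      cosines - absolute cosine similarity of two propagated random vectors
    """
    rng = np.random.default_rng(seed)
    product = np.eye(width)
    # Two independent random start vectors (assumed Gaussian for this sketch).
    r1 = rng.normal(size=width)
    r2 = rng.normal(size=width)
    ratios, cosines = [], []
    for _ in range(depth):
        # Keep only the positive part of a random weight matrix.
        w_plus = np.maximum(rng.normal(size=(width, width)), 0.0)

        # Extend the matrix chain; rescale to avoid numerical overflow.
        product = w_plus @ product
        product /= np.linalg.norm(product)

        # Propagate both vectors through the same matrix and renormalise.
        r1 = w_plus @ r1
        r2 = w_plus @ r2
        r1 /= np.linalg.norm(r1)
        r2 /= np.linalg.norm(r2)

        singular_values = np.linalg.svd(product, compute_uv=False)
        ratios.append(float(singular_values[1] / singular_values[0]))
        cosines.append(float(abs(r1 @ r2)))
    return ratios, cosines


if __name__ == "__main__":
    ratios, cosines = positive_chain_stats()
    for layer, (ratio, cosine) in enumerate(zip(ratios, cosines), start=1):
        print(f"layer {layer:2d}  s2/s1 = {ratio:.3e}  |cos| = {cosine:.6f}")
```

In this toy setting the singular-value ratio shrinks towards zero and the absolute cosine similarity approaches one as the chain gets longer, so the propagated vectors end up pointing in the same direction regardless of where they started. That is the sense in which an attribution built from such a chain can lose its dependence on the starting relevance, and hence on the later layers.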


