Does a Neural Network Really Encode Symbolic Concepts?

02/25/2023
by Mingjie Li, et al.

Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and to define such interactions as concepts encoded by the DNN. However, strictly speaking, there is still no solid guarantee that such interactions represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies verify that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which partially aligns with human intuition.
