Different people may share many cognitive functions (e.g. object recognition), but in general, the underlying neural implementation of these shared cognitive functions will be different across individuals. Similarly, when many instantiations of the same neural network architecture are trained on the same dataset, these networks tend to approximate the same mathematical function with very different weight configurations Dauphin2014-sq; Li2015-ur; Meng2018-vv. Concretely, given the same input, two trained networks tend to produce the same output, but their hidden activity patterns will be different. In what sense are these networks similar? Broadly speaking, any mathematical function has many equivalent parameterizations. Understanding the connections among these parameterizations might help us understand the intrinsic properties of that function. What is the connection across these neural networks trained on the same data?
Prior research has shown that there are underlying similarities across the activity patterns from different networks trained on the same dataset Li2015-ur; Morcos2018-nf; Raghu2017-ng. One hypothesis is that the activity patterns of these networks span highly similar feature spaces Li2015-ur. Empirically, it has also been shown that different networks can be “aligned” by doing canonical correlation analysis on the singular components of their activity patterns Morcos2018-nf; Raghu2017-ng. Interestingly, in the case of linear networks, prior theoretical research has shown that different instances of the same network architecture will learn the same representational similarity relation across the inputs, and that their activity patterns are connected by orthogonal transformations (assuming hierarchically structured training data, small-norm weight initialization, and a small learning rate) saxe2014-ux; Saxe2018-bl. Though many conclusions derived from linear networks generalize to non-linear networks Advani2017-fo; saxe2014-ux; Saxe2018-bl, it is unclear whether this result holds in the non-linear setting.
In this paper, we test whether different neural networks trained on the same dataset learn to represent the training data as different orthogonal transformations of some underlying shared representation. To do so, we leverage ideas developed for analyzing group-level neuroimaging data. Recently, techniques have been developed for functionally aligning different subjects to a shared representational space directly based on brain responses Chen2015-mi; Haxby2011-uf. Here, we propose to construct the shared representational space across neural networks with the shared response model (SRM) Chen2015-mi, a method for functionally aligning neuroimaging data across subjects Anderson2016-xh; Guntupalli2016-so; Haxby2011-uf; Vodrahalli2018-kw. SRM maps different subjects’ data to a shared space through matrices with orthonormal columns. In our work, we use SRM to show that, in some cases, orthogonal matrices can be sufficient for constructing a shared representational space across activity patterns from different networks. Namely, different networks learn different rigid-body transformations of the same underlying representation. This result is consistent with the theoretical predictions made on deep linear networks saxe2014-ux; Saxe2018-bl, as well as with prior empirical work Li2015-ur; Morcos2018-nf; Raghu2017-ng.
Here we introduce the shared response model (SRM) and the concept of a representational similarity matrix (RSM). We use SRM to construct a shared representational space where hidden activity patterns across networks can be meaningfully compared, and we use RSMs to quantitatively evaluate the learned transformations.
Shared Response Model (SRM). SRM is formulated as in equation (1). Suppose we are given $N$ neural networks. Let $X_i \in \mathbb{R}^{n \times m}$ be the set of activity patterns for the $l$-th layer of network $i$, where $n$ is the number of units and $m$ is the number of examples. SRM seeks $S \in \mathbb{R}^{k \times m}$, a basis set for the shared space, and $W_i \in \mathbb{R}^{n \times k}$, the transformation matrices between the network-specific native space (the span of $X_i$) and the shared space (Fig 1A shows a schematic illustration of this process). The $W_i$ are constrained to be matrices with orthonormal columns:
$$\min_{W_i,\, S} \; \sum_{i=1}^{N} \lVert X_i - W_i S \rVert_F^2 \quad \text{s.t.} \quad W_i^\top W_i = I_k. \tag{1}$$
Finally, $k$ is a hyperparameter that controls the dimensionality of the shared space. When $k = n$, $W_i$ is orthogonal, which represents a rigid-body transformation.
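To make the SRM objective concrete, the following is a minimal NumPy sketch of one simple way to minimize it: alternating between an orthogonal Procrustes update for each transformation matrix and an averaging update for the shared space. This is an illustrative assumption, not necessarily the fitting procedure used in this work; the function name `fit_srm`, its arguments, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_srm(Xs, k, n_iter=20, seed=0):
    """Alternating minimization of sum_i ||X_i - W_i S||_F^2 with W_i^T W_i = I_k.

    Xs: list of (n_units, n_examples) arrays, one per network; requires k <= n_units.
    Illustrative sketch only (hypothetical helper, not the authors' code).
    """
    rng = np.random.default_rng(seed)
    # Random initialization with orthonormal columns (reduced QR).
    Ws = [np.linalg.qr(rng.standard_normal((X.shape[0], k)))[0] for X in Xs]
    for _ in range(n_iter):
        # With the W_i fixed, the optimal S is the average of W_i^T X_i.
        S = np.mean([W.T @ X for W, X in zip(Ws, Xs)], axis=0)
        # With S fixed, each W_i solves an orthogonal Procrustes problem.
        Ws = []
        for X in Xs:
            U, _, Vt = np.linalg.svd(X @ S.T, full_matrices=False)
            Ws.append(U @ Vt)
    S = np.mean([W.T @ X for W, X in zip(Ws, Xs)], axis=0)
    return Ws, S

# Synthetic check: three "networks" that are rotations of one shared pattern set.
rng = np.random.default_rng(1)
n_units, n_examples, n_networks = 10, 40, 3
S_true = rng.standard_normal((n_units, n_examples))
Xs = [np.linalg.qr(rng.standard_normal((n_units, n_units)))[0] @ S_true
      for _ in range(n_networks)]

Ws, S = fit_srm(Xs, k=n_units)  # k equal to the number of units: W_i orthogonal
residual = sum(np.linalg.norm(X - W @ S) ** 2 for X, W in zip(Xs, Ws))
```

In this synthetic setting the data are exact orthogonal transformations of a common matrix, so the residual should be numerically negligible; for real activity patterns the residual measures how far the networks are from such a relation.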
Representational Similarity Matrix (RSM). To assess the information encoded by hidden activity patterns, we use the RSM Kriegeskorte2008-md; Kriegeskorte2013-xl, a tool for comparing neural representations across different systems (e.g. monkey vs. human). Let $A \in \mathbb{R}^{n \times m}$ be the matrix of activity patterns for a neural network layer, where each column of $A$ is an activity pattern evoked by an input. The within-network RSM of $A$ is the correlation matrix of $A$, i.e., $\mathrm{RSM}(A) = \mathrm{corr}(A)$. Without loss of generality, we assume $A$ to be column-wise normalized, so $\mathrm{RSM}(A) = A^\top A$. The RSM is an $m \times m$ matrix that reflects all pairwise similarities of the hidden activity patterns evoked by different inputs. For two networks with activity patterns $A$ and $B$, we define the inter-network RSM as $\mathrm{RSM}(A, B) = A^\top B$. Figure 1B shows the RSMs from ten standard ConvNets trained on CIFAR10 for demonstration.
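The two RSM definitions can be sketched in a few lines of NumPy. The helper names (`normalize_columns`, `within_rsm`, `inter_rsm`) and the random matrices are hypothetical, and we assume that "column-wise normalized" means each activity pattern is mean-centered and scaled to unit norm, so that the inner products are Pearson correlations.

```python
import numpy as np

def normalize_columns(A):
    """Center each column (one activity pattern) and scale it to unit norm."""
    A = A - A.mean(axis=0, keepdims=True)
    return A / np.linalg.norm(A, axis=0, keepdims=True)

def within_rsm(A):
    """Correlation between every pair of activity patterns in one network."""
    An = normalize_columns(A)
    return An.T @ An

def inter_rsm(A, B):
    """Correlation between patterns of network A and patterns of network B."""
    return normalize_columns(A).T @ normalize_columns(B)

# Stand-in activity patterns: 50 units, 8 inputs (random, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
B = rng.standard_normal((50, 8))

R_A = within_rsm(A)      # shape (8, 8), ones on the diagonal
R_AB = inter_rsm(A, B)   # shape (8, 8), generally not symmetric
```

Both matrices are indexed by inputs rather than by units, which is what allows representations from different networks to be compared at all.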
The averaged within-network RSM represents what is shared across networks. If two networks have identical activity patterns ($A = B$), their inter-network RSM will be identical to the averaged within-network RSM. However, if they are “misaligned” (e.g. off by an orthogonal transformation), their inter-network RSM will be different from the averaged within-network RSM. For example, consider two sets of patterns $A$ and $B = QA$, where $Q$ is orthogonal. Then the within-network RSMs agree, $A^\top A = B^\top B$, but in general they differ from the inter-network RSM $A^\top B = A^\top Q A$. With this observation, we use the correlation between the inter-network RSM and the within-network RSM to assess the quality of SRM alignment.
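The misalignment example can be checked numerically. The sketch below uses random stand-in data and a random orthogonal matrix; the helper `rsm_correlation` is a hypothetical name, and correlating the flattened matrices is one assumed way of scoring the match between inter-network and within-network RSMs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_examples = 50, 8

# Column-normalized patterns for network A (unit-norm columns).
A = rng.standard_normal((n_units, n_examples))
A /= np.linalg.norm(A, axis=0, keepdims=True)

# Network B is network A seen through a random orthogonal transformation Q.
Q, _ = np.linalg.qr(rng.standard_normal((n_units, n_units)))
B = Q @ A

within_avg = 0.5 * (A.T @ A + B.T @ B)  # averaged within-network RSM
inter = A.T @ B                         # inter-network RSM before alignment

def rsm_correlation(R1, R2):
    """Pearson correlation between the entries of two RSMs."""
    return np.corrcoef(R1.ravel(), R2.ravel())[0, 1]

score_misaligned = rsm_correlation(inter, within_avg)
# Undoing the transformation (here with the known Q) restores the match.
score_aligned = rsm_correlation(A.T @ (Q.T @ B), within_avg)
```

Here the true `Q` is known, so alignment is exact; in practice the transformation has to be estimated, which is the role SRM plays.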
The connection between SRM and representational similarity. We start by establishing a theoretical connection between SRM and RSM: if two sets of activity patterns $A$, $B$ have identical RSMs, then $A$, $B$ can be represented as different orthogonal transformations of the same underlying shared representation. Namely, there exist $W_A$, $W_B$ and $S$, such that $A = W_A S$ and $B = W_B S$, with $W_A^\top W_A = I$ and $W_B^\top W_B = I$. We prove this in the case of two networks, and the generalization to $N$ networks is straightforward.
For two sets of activity patterns $A$ and $B$, $\mathrm{RSM}(A) = \mathrm{RSM}(B)$ if and only if $A$ and $B$ can be represented as different orthogonal transformations of the same shared representation $S$.
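Proof: For the forward direction, assume $A^\top A = B^\top B$. Let $A = U_A \Sigma_A V_A^\top$ and $B = U_B \Sigma_B V_B^\top$ be compact SVDs. The assumption can be rewritten in terms of the SVDs: $A^\top A = V_A \Sigma_A^2 V_A^\top$ and $B^\top B = V_B \Sigma_B^2 V_B^\top$. These are eigendecompositions of the same matrix, so the singular values coincide and the right singular vectors can be chosen to coincide. Let $\Sigma := \Sigma_A = \Sigma_B$ and let $V := V_A = V_B$. Now, we can rewrite $A$ and $B$ as $A = U_A \Sigma V^\top$ and $B = U_B \Sigma V^\top$. Finally, let $W_A = U_A$, $W_B = U_B$, and $S = \Sigma V^\top$. By construction, this is an SRM solution that perfectly aligns $A$ and $B$.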
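The construction in the forward direction can be verified numerically. The sketch below is illustrative only: the matrices are random stand-ins, and $U_B$ is obtained as $B V \Sigma^{-1}$, which is one way to realize the choice $V_A = V_B$ and assumes all singular values are nonzero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 10  # units, examples; a random A has full rank n here

A = rng.standard_normal((n, m))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = Q @ A  # guarantees B.T @ B == A.T @ A up to rounding

# Compact SVD of A gives the shared representation S = Sigma V^T.
U_A, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s) @ Vt
W_A = U_A

# Express B in the same right singular basis: U_B = B V Sigma^{-1}.
W_B = B @ Vt.T @ np.diag(1.0 / s)

same_rsm = np.allclose(A.T @ A, B.T @ B)
w_b_orthonormal = np.allclose(W_B.T @ W_B, np.eye(n))
perfect_alignment = np.allclose(W_A @ S, A) and np.allclose(W_B @ S, B)
```

All three flags should evaluate to true for this example, matching the claim that equal RSMs admit a perfectly aligning SRM solution.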
For the converse, assume there is an SRM solution that achieves a perfect alignment for $A$ and $B$. Namely, $A = W_A S$ and $B = W_B S$, with $W_A^\top W_A = I$ and $W_B^\top W_B = I$ for some $S$. Then,