Towards Modeling the Interaction of Spatial-Associative Neural Network Representations for Multisensory Perception
Our daily perceptual experience is driven by different neural mechanisms that yield multisensory interaction as the interplay between exogenous stimuli and endogenous expectations. While the interaction of multisensory cues according to their spatiotemporal properties and the formation of multisensory feature-based representations have been widely studied, the interaction of spatial-associative neural representations has received considerably less attention. In this paper, we propose a neural network architecture that models the interaction of spatial-associative representations to perform causal inference of audiovisual stimuli. We investigate the spatial alignment of exogenous audiovisual stimuli modulated by associative congruence. In the spatial layer, topographically arranged networks account for the interaction of audiovisual input in terms of population codes. In the associative layer, congruent audiovisual representations are obtained via the experience-driven development of feature-based associations. Levels of congruency are obtained as a by-product of the neurodynamics of self-organizing networks, where the amount of neural activation triggered by the input can be expressed via a nonlinear distance function. Our novel proposal is that activity-driven levels of congruency can be used as top-down modulatory projections to spatially distributed representations of sensory input. For example, semantically related audiovisual pairs will yield a higher level of integration than unrelated pairs. Furthermore, levels of neural response in unimodal layers may be seen as a measure of sensory reliability for the dynamic weighting of crossmodal cues. We describe a series of planned experiments to validate our model on multisensory interaction tasks based on semantic congruence and unimodal cue reliability.
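The mechanism outlined in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' model: the abstract gives no equations, so the exponential form of the distance-based activation, the Gaussian population code, the linear mixing rule, and all function and parameter names (`activation`, `population_code`, `integrate`, `sigma`) are hypothetical.

```python
import math


def activation(x, prototypes):
    """Hypothetical nonlinear distance function: exp(-d) for the
    best-matching prototype, giving a value in (0, 1]."""
    d = min(math.dist(x, w) for w in prototypes)
    return math.exp(-d)


def population_code(position, centers, sigma=5.0):
    """Assumed Gaussian population code over topographically
    arranged units with preferred positions `centers`."""
    return [math.exp(-((c - position) ** 2) / (2 * sigma ** 2))
            for c in centers]


def decode(code, centers):
    """Center-of-mass readout of a population code."""
    total = sum(code)
    return sum(a * c for a, c in zip(code, centers)) / total


def integrate(audio_pos, visual_pos, r_audio, r_visual, congruency, centers):
    """Estimate the perceived auditory location.

    r_audio, r_visual: unimodal response levels, used here as reliabilities.
    congruency: value in [0, 1] acting as a top-down gain on how much
    the visual code contributes (an assumed linear mixing rule).
    """
    a = population_code(audio_pos, centers)
    v = population_code(visual_pos, centers)
    w_v = congruency * r_visual / (r_audio + r_visual)
    combined = [(1 - w_v) * ai + w_v * vi for ai, vi in zip(a, v)]
    return decode(combined, centers)


centers = list(range(-90, 91))  # preferred azimuths in degrees (assumed)
unrelated = integrate(-10, 10, 1.0, 1.0, 0.0, centers)
related = integrate(-10, 10, 1.0, 1.0, 1.0, centers)
```

In this sketch, a congruency of zero leaves the auditory estimate at the auditory position, while higher congruency pulls it toward the visual position in proportion to the relative visual reliability.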