Deep Neural Networks with Auxiliary-Model Regulated Gating for Resilient Multi-Modal Sensor Fusion
Deep neural networks allow for the fusion of high-level features from multiple modalities and have become a promising end-to-end solution for multi-modal sensor fusion. While recently proposed gating architectures improve on the conventional fusion mechanisms employed in CNNs, these models are not always resilient, particularly in the presence of sensor failures. This paper shows that existing gating architectures fail to robustly learn the fusion weights that critically gate different modalities, leading to the issue of fusion weight inconsistency. We propose a new gating architecture that incorporates an auxiliary model to regularize the main model such that the fusion weight for each sensory modality can be robustly learned. As a result, this new auxiliary-model regulated architecture and its variants outperform the existing non-gating and gating fusion architectures under both clean and corrupted sensory inputs resulting from sensor failures. The performance gains are especially significant in the latter case.
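To make the gating idea concrete, the following is a minimal NumPy sketch (not the paper's implementation; all function names and the choice of a symmetric-KL consistency penalty are illustrative assumptions). Each modality contributes a feature vector, a gate produces per-modality fusion weights, and an auxiliary model's weights regularize the main model's weights toward consistency:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(features, gate_logits):
    """Fuse per-modality features with learned gating weights.

    features: list of M arrays, each (batch, d) -- one per modality
    gate_logits: (batch, M) -- unnormalized fusion weights
    """
    w = softmax(gate_logits, axis=-1)            # (batch, M) fusion weights
    stacked = np.stack(features, axis=1)         # (batch, M, d)
    return (w[..., None] * stacked).sum(axis=1)  # (batch, d) fused feature

def fusion_weight_consistency_loss(main_logits, aux_logits):
    """Hypothetical auxiliary-model regularizer: penalize divergence between
    the main model's fusion weights and the auxiliary model's, so the
    per-modality weights are learned consistently."""
    p, q = softmax(main_logits), softmax(aux_logits)
    eps = 1e-12
    kl = lambda a, b: (a * np.log((a + eps) / (b + eps))).sum(axis=-1)
    return 0.5 * (kl(p, q) + kl(q, p)).mean()    # symmetric KL, averaged
```

With equal gate logits the fused output is the plain average of the modalities, and the consistency loss vanishes when main and auxiliary weights agree; during training the loss term would be added to the task loss to regulate the gate.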