Differentiable Pooling for Hierarchical Feature Learning
We introduce a parametric form of pooling, based on a Gaussian, which can be optimized alongside the features in a single global objective function. By contrast, existing pooling schemes are based on heuristics (e.g. local maximum) and have no clear link to the cost function of the model. Furthermore, the variables of the Gaussian explicitly store location information, distinct from the appearance captured by the features, thus providing a what/where decomposition of the input signal. Although the differentiable pooling scheme can be incorporated in a wide range of hierarchical models, we demonstrate it in the context of a Deconvolutional Network model (Zeiler et al. ICCV 2011). We also explore a number of secondary issues within this model and present detailed experiments on MNIST digits.
READ FULL TEXT