SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images
Omni-directional cameras have many advantages over conventional cameras because of their much wider field of view (FOV). Several approaches have recently been proposed to apply convolutional neural networks (CNNs) to omni-directional images to solve classification and detection problems. However, most of them use image representations in Euclidean space, obtained by transforming omni-directional views that originally lie in non-Euclidean space. This transformation causes shape distortion, owing to nonuniform spatial resolving power, as well as loss of continuity. These effects make it difficult for existing convolution kernels to extract meaningful information. This paper proposes a novel method to resolve these problems when applying CNNs to omni-directional images. The proposed method uses a spherical polyhedron to represent omni-directional views. This representation minimizes the variance of spatial resolving power on the sphere surface, and the method includes new convolution and pooling operations designed for it. The proposed approach can also be adopted by existing CNN-based methods. The feasibility and efficacy of the proposed method are demonstrated through both classification and detection tasks.
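The nonuniform spatial resolving power mentioned above can be illustrated numerically. The sketch below is not the paper's method. It assumes an equirectangular projection as one example of a Euclidean representation, and an icosahedron whose faces are each split into four triangles per subdivision level as one example of a spherical polyhedron; the abstract specifies neither, and the function names are hypothetical.

```python
import math


def erp_pixel_solid_angles(height, width):
    """Solid angle (steradians) of one pixel in each row of an
    equirectangular image (assumed example of a Euclidean representation)."""
    d_lon = 2.0 * math.pi / width
    angles = []
    for row in range(height):
        # Latitude bounds of this row, from +pi/2 (top) down to -pi/2 (bottom).
        lat_top = math.pi / 2 - math.pi * row / height
        lat_bottom = math.pi / 2 - math.pi * (row + 1) / height
        # Exact area of a latitude-longitude cell on the unit sphere.
        angles.append(d_lon * (math.sin(lat_top) - math.sin(lat_bottom)))
    return angles


def icosahedron_mean_face_solid_angle(subdivisions):
    """Mean solid angle per triangular face of a subdivided icosahedron.

    Assumes each of the 20 faces splits into 4 triangles per level; individual
    faces may deviate slightly from this mean once projected onto the sphere.
    """
    n_faces = 20 * 4 ** subdivisions
    return 4.0 * math.pi / n_faces


# Compare the spread of per-pixel coverage in a small equirectangular image.
angles = erp_pixel_solid_angles(64, 128)
erp_ratio = max(angles) / min(angles)
print("equirectangular max/min pixel solid angle:", erp_ratio)
print("icosahedron mean face solid angle (level 3):",
      icosahedron_mean_face_solid_angle(3))
```

Under these assumptions, pixels near the equator of the equirectangular image cover far more of the sphere than pixels near the poles, whereas the faces of a subdivided polyhedron share the sphere much more evenly. This is the kind of variance the abstract says the proposed representation minimizes.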