Quaternion Equivariant Capsule Networks for 3D Point Clouds

by Yongheng Zhao, et al.

We present a 3D capsule architecture for processing point clouds that is equivariant with respect to the SO(3) rotation group, translation, and permutation of the unordered input sets. The network operates on a sparse set of local reference frames computed from an input point cloud, and establishes end-to-end equivariance through a novel 3D quaternion group capsule layer that includes an equivariant dynamic routing procedure. The capsule layer enables us to disentangle geometry from pose, paving the way for more informative descriptions and a structured latent space. Along the way, we theoretically connect dynamic routing between capsules to the well-known Weiszfeld algorithm, an iteratively re-weighted least squares (IRLS) scheme with provable convergence properties, which enables robust pose estimation between capsule layers. Thanks to the sparse equivariant quaternion capsules, our architecture allows joint object classification and orientation estimation, which we validate empirically on common benchmark datasets.
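To make the Weiszfeld connection concrete, the sketch below shows the classic Euclidean form of the algorithm: a geometric median computed by repeatedly re-weighting each point by the inverse of its distance to the current estimate. This is not the paper's quaternion routing layer, only an illustration of the IRLS idea under simplifying assumptions; the function name `weiszfeld_median`, the toy `votes` array, and the iteration settings are all hypothetical.

```python
import numpy as np

def weiszfeld_median(points, n_iter=100, eps=1e-9):
    """Geometric median of `points` (shape [n, d]) via Weiszfeld iterations."""
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the plain least-squares mean
    for _ in range(n_iter):
        # Distance of every point to the current estimate.
        dists = np.linalg.norm(points - estimate, axis=1)
        # IRLS weights: far-away points (likely outliers) get small weight.
        # The clamp avoids division by zero when the estimate hits a data point.
        weights = 1.0 / np.maximum(dists, eps)
        # Weighted least-squares update.
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        converged = np.linalg.norm(new_estimate - estimate) < eps
        estimate = new_estimate
        if converged:
            break
    return estimate

# Toy data (assumed): four consistent "votes" near the origin and one gross outlier.
votes = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [10.0, 10.0]])
mean_estimate = votes.mean(axis=0)
median_estimate = weiszfeld_median(votes)
print("mean:  ", mean_estimate)
print("median:", median_estimate)
```

On this toy input the plain mean is dragged toward the outlier, while the re-weighted estimate stays near the cluster of consistent votes. That robustness to outlying votes is the property the abstract attributes to routing between capsule layers.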


3D Point-Capsule Networks

In this paper, we propose 3D point-capsule networks, an auto-encoder des...

3DCapsule: Extending the Capsule Architecture to Classify 3D Point Clouds

This paper introduces the 3DCapsule, which is a 3D extension of the rece...

PointCaps: Raw Point Cloud Processing using Capsule Networks with Euclidean Distance Routing

Raw point cloud processing using capsule networks is widely adopted in c...

Canonical Capsules: Unsupervised Capsules in Canonical Pose

We propose an unsupervised capsule architecture for 3D point clouds. We ...

Group Equivariant Capsule Networks

We present group equivariant capsule networks, a framework to introduce ...

Geometric Capsule Autoencoders for 3D Point Clouds

We propose a method to learn object representations from 3D point clouds...

Affordance detection with Dynamic-Tree Capsule Networks

Affordance detection from visual input is a fundamental step in autonomo...