Feature Space Transfer for Data Augmentation

by Bo Liu, et al.

The problem of data augmentation in feature space is considered. A new architecture, denoted the FeATure TransfEr Network (FATTEN), is proposed for modeling the feature trajectories induced by variations of object pose. This architecture exploits a parametrization of the pose manifold in terms of pose and appearance. This leads to a deep encoder/decoder network architecture, where the encoder factors into an appearance predictor and a pose predictor. Unlike previous attempts at trajectory transfer, FATTEN can be efficiently trained end-to-end, with no need to train separate feature transfer functions. This is realized by supplying the decoder with information about a target pose and by using a multi-task loss that penalizes category and pose mismatches. As a result, FATTEN discourages discontinuous or non-smooth trajectories that fail to capture the structure of the pose manifold, and generalizes well to object recognition tasks involving large pose variation. Experimental results on the artificial ModelNet database show that it can successfully learn to map source features to target features of a desired pose, while preserving class identity. Most notably, by using feature space transfer for data augmentation (with respect to pose and depth) on SUN-RGBD objects, we demonstrate considerable performance improvements on one/few-shot object recognition in a transfer learning setup, compared to current state-of-the-art methods.
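To make the data flow in the abstract concrete, the following is a minimal sketch of one forward pass and the multi-task loss. It is not the authors' implementation. The class name `FattenSketch`, the single linear layers, the one-hot target pose encoding, the layer sizes, and the reuse of the pose predictor to score the generated feature are all assumptions made for illustration; the abstract specifies none of them.

```python
import math
import random


def linear(W, b, x):
    """Affine map W x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def init(rows, cols, rng):
    """Small random weights and a zero bias (illustrative initialisation)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return W, [0.0] * rows


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])


class FattenSketch:
    """Hypothetical toy model: encoder = appearance + pose predictors, then a decoder."""

    def __init__(self, feat_dim, app_dim, n_poses, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_poses = n_poses
        self.app = init(app_dim, feat_dim, rng)    # appearance predictor
        self.pose = init(n_poses, feat_dim, rng)   # pose predictor (logits over pose bins)
        # Decoder input: appearance code, predicted source pose, one-hot target pose.
        self.dec = init(feat_dim, app_dim + 2 * n_poses, rng)
        self.cls = init(n_classes, feat_dim, rng)  # category classifier used by the loss

    def transfer(self, x, target_pose):
        """Map a source feature x to a feature at the requested target pose."""
        a = [math.tanh(v) for v in linear(*self.app, x)]
        p = softmax(linear(*self.pose, x))
        t = [1.0 if i == target_pose else 0.0 for i in range(self.n_poses)]
        return linear(*self.dec, a + p + t)

    def loss(self, x, target_pose, label, pose_weight=1.0):
        """Multi-task loss: category mismatch plus pose mismatch of the output."""
        y = self.transfer(x, target_pose)
        category_term = cross_entropy(linear(*self.cls, y), label)
        # Assumption: the same pose predictor scores the generated feature.
        pose_term = cross_entropy(linear(*self.pose, y), target_pose)
        return category_term + pose_weight * pose_term


# Augmentation: one synthetic feature per target pose for a single source feature.
model = FattenSketch(feat_dim=8, app_dim=4, n_poses=3, n_classes=5)
source = [0.1] * 8
augmented = [model.transfer(source, t) for t in range(3)]
```

In a trained model, the `augmented` features would be added to the few available examples of a class before fitting the recogniser. Here the weights are random, so the sketch only shows how the target pose enters the decoder and how the two loss terms combine.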


