Event Camera Data Pre-training

01/05/2023
by Yan Yang, et al.

This paper proposes a pre-trained neural network for handling event camera data. Our model is trained in a self-supervised learning framework on paired event camera data and natural RGB images. Our method contains three modules connected in sequence: i) a family of event data augmentations that generate meaningful event images for self-supervised training; ii) a conditional masking strategy that samples informative event patches from event images, encouraging the model to capture the spatial layout of a scene and accelerating training; iii) a contrastive learning approach that enforces the similarity of embeddings between matching event images, and between paired event-RGB images. An embedding projection loss is proposed to avoid model collapse when enforcing event embedding similarities. A probability distribution alignment loss is proposed to encourage the event data to be consistent with its paired RGB image in feature space. In downstream tasks, our method transfers better than state-of-the-art methods. For example, we achieve a top-1 accuracy of 64.83% on the N-ImageNet dataset.
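To make the contrastive step in iii) concrete, the following is a minimal sketch of a generic symmetric InfoNCE loss between paired event and RGB embeddings. It is not the paper's objective: the abstract does not give the form of the embedding projection loss or the probability distribution alignment loss, so neither is reproduced here. The function name `info_nce`, the `temperature` value, and the array shapes are illustrative assumptions.

```python
import numpy as np


def info_nce(event_emb, rgb_emb, temperature=0.1):
    """Symmetric InfoNCE loss; row i of each array is assumed to be a matching pair.

    Generic contrastive loss for illustration only (assumption), not the
    loss defined in the paper.
    """
    # L2-normalize so that dot products are cosine similarities.
    e = event_emb / np.linalg.norm(event_emb, axis=1, keepdims=True)
    r = rgb_emb / np.linalg.norm(rgb_emb, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the matching pairs.
    logits = e @ r.T / temperature

    def cross_entropy_diag(z):
        # Row-wise log-softmax, with the diagonal entry as the target class.
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the event-to-RGB and RGB-to-event directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))  # hypothetical batch of 8 embeddings
    print("aligned pairs:   ", info_nce(x, x))
    print("mismatched pairs:", info_nce(x, np.roll(x, 1, axis=0)))
```

In this sketch, correctly paired embeddings yield a lower loss than mismatched ones, which is the property a contrastive objective relies on to pull matching event and RGB representations together.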


