FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pre-Training

09/18/2023
by Shaheer Mohamed, et al.

Hyperspectral images (HSIs) contain rich spectral and spatial information. Motivated by the success of transformers in natural language processing and computer vision, where they have shown the ability to learn long-range dependencies within input data, recent research has focused on applying transformers to HSIs. However, current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension, resulting in the under-utilization of spatial information. Moreover, transformers are known to be data-hungry, and their performance relies heavily on large-scale pre-training, which is challenging due to limited annotated hyperspectral data. As a result, the potential of HSI transformers has not been fully realized. To overcome these limitations, we propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pre-training procedures, leading to significant improvements in performance. Factorizing the inputs allows the spectral and spatial transformers to better capture the interactions within hyperspectral data cubes. Inspired by masked image modeling pre-training, we also devise efficient masking strategies for pre-training each of the spectral and spatial transformers. We conduct experiments on three publicly available datasets for the HSI classification task and demonstrate that our model achieves state-of-the-art performance on all three. The code for our model will be made available at https://github.com/csiro-robotics/factoformer.
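To make the factorization concrete, below is a minimal sketch of how a single HSI patch can yield two complementary token sets, one per transformer branch. It assumes a patch tensor of shape (bands, height, width); the function names (spectral_tokens, spatial_tokens, random_mask), the patch size, and the masking ratio are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of factorized tokenization for one HSI patch.
# Assumed shapes and helper names are illustrative, not the paper's API.
import torch

def spectral_tokens(patch: torch.Tensor) -> torch.Tensor:
    """One token per spectral band: each band's spatial values are flattened.

    patch: (C, H, W) -> tokens: (C, H*W)
    """
    c, h, w = patch.shape
    return patch.reshape(c, h * w)

def spatial_tokens(patch: torch.Tensor, p: int) -> torch.Tensor:
    """One token per non-overlapping p x p spatial sub-patch, keeping all bands.

    patch: (C, H, W) -> tokens: ((H//p)*(W//p), C*p*p)
    """
    c, h, w = patch.shape
    t = patch.unfold(1, p, p).unfold(2, p, p)  # (C, H//p, W//p, p, p)
    t = t.permute(1, 2, 0, 3, 4)               # (H//p, W//p, C, p, p)
    return t.reshape(-1, c * p * p)

def random_mask(tokens: torch.Tensor, ratio: float = 0.7):
    """MIM-style masking: keep a random subset of tokens for the encoder."""
    n = tokens.shape[0]
    keep = int(n * (1 - ratio))
    perm = torch.randperm(n)
    visible_idx = perm[:keep]
    return tokens[visible_idx], visible_idx

# Example: a 9x9 patch of a 200-band HSI cube (sizes are hypothetical).
patch = torch.randn(200, 9, 9)
spec = spectral_tokens(patch)      # (200, 81): input to the spectral branch
spat = spatial_tokens(patch, p=3)  # (9, 1800): input to the spatial branch
visible, idx = random_mask(spec)   # tokens kept visible during pre-training
```

Per the abstract, each branch is pre-trained separately with a masked-image-modeling objective; the random_mask helper above only illustrates generic random token masking, not the paper's specific masking strategies.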


Related research

03/31/2022
Deep Hyperspectral Unmixing using Transformer Network
Currently, this paper is under review in IEEE. Transformers have intrigu...

03/05/2021
SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation
Hyperspectral imaging (HSI) unlocks the huge potential to a wide variety...

11/01/2021
Transformers for prompt-level EMA non-response prediction
Ecological Momentary Assessments (EMAs) are an important psychological d...

09/07/2023
DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions
As it is empirically observed that Vision Transformers (ViTs) are quite ...

03/27/2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
The past year has witnessed a rapid development of masked image modeling...

09/19/2022
S^3R: Self-supervised Spectral Regression for Hyperspectral Histopathology Image Classification
Benefited from the rich and detailed spectral information in hyperspectr...

06/12/2023
Unmasking Deepfakes: Masked Autoencoding Spatiotemporal Transformers for Enhanced Video Forgery Detection
We present a novel approach for the detection of deepfake videos using a...
