Surface Masked AutoEncoder: Self-Supervision for Cortical Imaging Data

08/10/2023
by Simon Dahan, et al.

Self-supervision has been widely explored as a means of addressing the lack of inductive biases in vision transformer architectures, which limits generalisation when networks are trained on small datasets. This is crucial in the context of cortical imaging, where phenotypes are complex and heterogeneous, but the available datasets are limited in size. This paper builds upon recent advancements in translating vision transformers to surface meshes and investigates the potential of Masked AutoEncoder (MAE) self-supervision for cortical surface learning. By reconstructing surface data from a masked version of the input, the proposed method effectively models cortical structure to learn strong representations that translate to improved performance in downstream tasks. We evaluate our approach on cortical phenotype regression using the developing Human Connectome Project (dHCP) and demonstrate that pre-training leads to a 26% improvement in performance, with 80% faster convergence, compared to models trained from scratch. Furthermore, we establish that pre-training vision transformer models on large datasets, such as the UK Biobank (UKB), enables the acquisition of robust representations for fine-tuning in low-data scenarios. Our code and pre-trained models are publicly available at <https://github.com/metrics-lab/surface-vision-transformers>.
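The core MAE idea referenced above — masking a large random fraction of input patches and training the network to reconstruct only the masked ones — can be sketched independently of any surface-specific machinery. The snippet below is an illustrative sketch, not the authors' implementation: the patch count, mask ratio, and function names are assumptions for demonstration, and the actual model would encode the visible surface patches with a transformer before decoding.

```python
import numpy as np

def random_masking(num_patches, mask_ratio, rng):
    # MAE-style masking: shuffle patch indices and keep only the first
    # (1 - mask_ratio) fraction; the encoder sees just these visible patches.
    num_keep = int(num_patches * (1 - mask_ratio))
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])
    mask_idx = np.sort(perm[num_keep:])
    return keep_idx, mask_idx

def masked_reconstruction_loss(pred, target, mask_idx):
    # As in MAE, the mean-squared reconstruction error is computed only
    # over the masked patches, not the visible ones.
    diff = pred[mask_idx] - target[mask_idx]
    return float(np.mean(diff ** 2))

# Hypothetical example: a cortical surface divided into 320 patches
# (a common icosahedral patching choice), masked at a 75% ratio.
rng = np.random.default_rng(0)
keep_idx, mask_idx = random_masking(320, 0.75, rng)
```

With a 75% mask ratio the decoder must infer most of the surface signal from a small visible subset, which is what pushes the encoder toward representations of global cortical structure rather than local texture.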

