A Convolutional Vision Transformer for Semantic Segmentation of Side-Scan Sonar Data

02/24/2023
by Hayat Rajani, et al.

Distinguishing among different marine benthic habitat characteristics is of key importance in a wide range of seabed operations, from installing oil rigs and laying cable networks to monitoring the impact of humans on marine ecosystems. The Side-Scan Sonar (SSS) is a widely used imaging sensor in this regard. It produces high-resolution seafloor maps by logging the intensities of sound waves reflected back from the seafloor. In this work, we leverage these acoustic intensity maps to produce pixel-wise categorization of different seafloor types. We propose a novel architecture adapted from the Vision Transformer (ViT) in an encoder-decoder framework, and in doing so evaluate the applicability of ViTs to smaller datasets. To overcome the lack of CNN-like inductive biases, and thereby make ViTs more conducive to applications in low-data regimes, we propose a novel feature extraction module to replace the Multi-layer Perceptron (MLP) block within transformer layers and a novel module to extract multiscale patch embeddings. A lightweight decoder is also proposed to complement this design and further boost multiscale feature extraction. With the modified architecture, we achieve state-of-the-art results and also meet real-time computational requirements. We make our code available at <https://github.com/hayatrajani/s3seg-vit>.
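To make the idea of multiscale patches concrete, the following is a minimal sketch in plain Python of splitting a 2D acoustic intensity map into non-overlapping patches at several patch sizes. It is an illustration only and not the authors' module: the abstract does not describe how the multiscale patch embeddings are computed, and the function names `extract_patches` and `multiscale_patches`, the toy 4x4 map, and the patch sizes are all assumptions made for this example.

```python
from typing import Dict, List

Image = List[List[float]]


def extract_patches(image: Image, patch_size: int) -> List[List[float]]:
    """Split a 2D map into non-overlapping square patches.

    Each patch is flattened in row-major order. This is the plain ViT-style
    tokenization step, not the module proposed in the paper (assumption).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if patch_size <= 0 or height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def multiscale_patches(image: Image, patch_sizes: List[int]) -> Dict[int, List[List[float]]]:
    """Tokenize the same map at several patch sizes (hypothetical helper)."""
    return {size: extract_patches(image, size) for size in patch_sizes}


# Toy 4x4 intensity map standing in for a sonar tile (made-up values).
toy_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = multiscale_patches(toy_map, [2, 4])
```

In this sketch, smaller patch sizes yield more tokens that each cover less of the seafloor, while larger ones yield fewer tokens with wider context. A learned embedding would normally project each flattened patch to a feature vector before it enters the transformer layers.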

research
09/19/2023

Spatial-Assistant Encoder-Decoder Network for Real Time Semantic Segmentation

Semantic segmentation is an essential technology for self-driving cars t...
research
01/11/2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer

Existing semantic segmentation works have been mainly focused on designi...
research
11/04/2022

Real-Time Target Sound Extraction

We present the first neural network model to achieve real-time and strea...
research
07/19/2018

Guided Upsampling Network for Real-Time Semantic Segmentation

Semantic segmentation architectures are mainly built upon an encoder-dec...
research
09/12/2022

Vision Transformer with Convolutional Encoder-Decoder for Hand Gesture Recognition using 24 GHz Doppler Radar

Transformers combined with convolutional encoders have been recently use...
research
07/14/2023

HEAL-SWIN: A Vision Transformer On The Sphere

High-resolution wide-angle fisheye images are becoming more and more imp...
research
03/13/2023

Transformer Encoder with Multiscale Deep Learning for Pain Classification Using Physiological Signals

Pain is a serious worldwide health problem that affects a vast proportio...
