More Perspectives Mean Better: Underwater Target Recognition and Localization with Multimodal Data via Symbiotic Transformer and Multiview Regression

05/22/2023
by Shipei Liu, et al.

Underwater acoustic target recognition (UATR) and localization (UATL) play important roles in marine exploration. Highly noisy acoustic signals and time-frequency interference among multiple sources make both tasks challenging. To tackle these issues, we propose a multimodal approach that extracts and fuses audio-visual-textual information to recognize and localize underwater targets through the designed Symbiotic Transformer (Symb-Trans) and Multi-View Regression (MVR) method. The multimodal data are first preprocessed by a custom-designed HetNorm module, which normalizes the multi-source data into a common feature space. The Symb-Trans module embeds audiovisual features by co-training the preprocessed multimodal features through parallel branches and a content encoder with cross-attention. The audiovisual features are then used for underwater target recognition. Meanwhile, the text embedding, combined with the audiovisual features, is fed to an MVR module that predicts the location of the underwater targets through multi-view clustering and multiple regression. Since no off-the-shelf multimodal dataset is available for UATR and UATL, we combined multiple public datasets, consisting of acoustic, and/or visual, and/or textual data, to obtain audio-visual-textual triplets for model training and validation. Experiments show that our model outperforms comparative methods in 91.7 for the recognition and localization tasks, respectively. In a case study, we demonstrate the advantages of multi-view models in establishing sample discriminability through visualization methods. For UATL, the proposed MVR method produces relation graphs, which allow predictions based on records of underwater targets with similar conditions.
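The abstract names cross-attention as the mechanism by which the content encoder relates the modalities, but it does not give the layer's details. As a rough illustration only, the following sketch shows generic single-head scaled dot-product cross-attention in which audio tokens attend over visual tokens. It is not the authors' Symb-Trans implementation: the function names, token counts, dimensions, and random projection matrices are all assumptions made for the example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: `queries` tokens attend over `context` tokens."""
    q = matmul(queries, w_q)   # (n_query, d_head)
    k = matmul(context, w_k)   # (n_context, d_head)
    v = matmul(context, w_v)   # (n_context, d_head)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)    # (n_query, n_context)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_head = 16, 8                 # assumed sizes, not from the paper
audio = random_matrix(5, d_model, rng)  # 5 audio tokens (assumed)
visual = random_matrix(7, d_model, rng) # 7 visual tokens (assumed)
w_q, w_k, w_v = (random_matrix(d_model, d_head, rng) for _ in range(3))

fused, weights = cross_attention(audio, visual, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # prints "5 8"
```

Each row of `weights` sums to 1 and indicates how strongly one audio token draws on each visual token. A trained model would learn the projections rather than sample them, and would typically use multiple heads.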

research
11/10/2022

Learning Visual Representation of Underwater Acoustic Imagery Using Transformer-Based Style Transfer Method

Underwater automatic target recognition (UATR) has been a challenging re...
research
02/16/2022

Cross-view and Cross-domain Underwater Localization based on Optical Aerial and Acoustic Underwater Images

Cross-view image matches have been widely explored on terrestrial image ...
research
03/02/2021

An Event-Based Stack For Data Transmission Through Underwater Multimodal Networks

The DESERT Underwater framework (http://desert-underwater.dei.unipd.it/)...
research
04/24/2023

Advancing underwater acoustic target recognition via adaptive data pruning and smoothness-inducing regularization

Underwater acoustic recognition for ship-radiated signals has high pract...
research
06/24/2019

Multimodal and Multi-view Models for Emotion Recognition

Studies on emotion recognition (ER) show that combining lexical and acou...
research
09/05/2022

Underwater Acoustic Ranging Between Smartphones

We present a novel underwater system that can perform acoustic ranging b...
research
09/07/2023

Cross-domain Sound Recognition for Efficient Underwater Data Analysis

This paper presents a novel deep learning approach for analyzing massive...
