Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction

09/21/2023
by Yu Tian, et al.

Wireless communications at high-frequency bands with large antenna arrays face challenges in beam management, which can potentially be improved by multimodal sensing information from cameras, LiDAR, radar, and GPS. In this paper, we present a multimodal transformer deep learning framework for sensing-assisted beam prediction. We employ a convolutional neural network to extract features from sequences of images, point clouds, and raw radar data sampled over time. At each convolutional layer, we use transformer encoders to learn the hidden relations between feature tokens from different modalities and time instances over the abstraction space, and to produce encoded vectors for the next level of feature extraction. We train the model on combinations of different modalities with supervised learning. To improve performance on imbalanced data, we use focal loss and an exponential moving average. We also evaluate data processing and augmentation techniques such as image enhancement, segmentation, background filtering, multimodal data flipping, radar signal transformation, and GPS angle calibration. Experimental results show that our solution trained on image and GPS data produces the best distance-based accuracy of predicted beams at 78.44%, with generalization to unseen day scenarios near 73%. This outperforms using other modalities and arbitrary data processing techniques, which demonstrates the effectiveness of transformers with feature fusion in performing radio beam prediction from images and GPS. Furthermore, our solution could be pretrained on large sequences of multimodal wireless data and then fine-tuned for multiple downstream radio network tasks.
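As a rough illustration of the two imbalance-handling techniques named in the abstract, the sketch below implements a multi-class focal loss for a single sample and an exponential moving average update. It is not the authors' code: the function names, the focusing parameter `gamma=2.0`, the decay `0.999`, and the choice to apply the moving average to model weights are all assumptions, since the abstract gives none of these details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class (here, a beam index).
    gamma=2.0 is an assumed value; gamma=0 reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def ema_update(ema_weights, weights, decay=0.999):
    """One exponential-moving-average step over a flat list of weights.

    The decay value and the use of EMA on model weights are assumptions.
    """
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Hypothetical usage: three candidate beams, true beam index 0.
easy = focal_loss([4.0, 0.0, 0.0], target=0)   # confident and correct
hard = focal_loss([0.0, 4.0, 0.0], target=0)   # confident and wrong
smoothed = ema_update([0.0, 0.0], [1.0, -1.0], decay=0.9)
```

The factor `(1 - p_t)^gamma` shrinks the loss of well-classified samples, so rarely seen beam indices contribute relatively more to the gradient, while the moving average smooths the parameters across training steps.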
