
Multi-view and Multi-modal Event Detection Utilizing Transformer-based Multi-sensor fusion

by Masahiro Yasuda, et al.

We tackle a challenging task: multi-view and multi-modal event detection, which detects events in a wide-range real environment using data from distributed cameras and microphones together with their weak labels. In this task, distributed sensors are used complementarily to capture events that are difficult to capture with a single sensor, such as a series of actions by people moving through an intricate room, or communication between people located far apart in a room. For sensors to cooperate effectively in such a situation, the system should be able to exchange information among sensors and combine the information that is useful for identifying events in a complementary manner. For such a mechanism, we propose Transformer-based multi-sensor fusion (MultiTrans), which combines multi-sensor data on the basis of the relationships between features of different viewpoints and modalities. In experiments using a dataset newly collected for this task, our proposed method using MultiTrans improved event detection performance and outperformed the comparison methods.
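The abstract gives no architectural details, so the following is only a rough sketch of the general idea of combining sensor features according to their pairwise relationships. It is not the authors' MultiTrans implementation. It assumes one feature vector per sensor, uses a single attention head, and replaces the learned query/key/value projections of a real Transformer with the identity. The name `fuse_sensors` and the mean pooling at the end are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_sensors(features):
    """Fuse per-sensor feature vectors with scaled dot-product self-attention.

    features: list of equal-length vectors, one per sensor (camera or
    microphone). Learned projections are omitted (assumption), so queries,
    keys, and values are all the raw features.
    Returns (fused_vector, attention_weights).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    # Each sensor attends to every sensor; each row of weights sums to 1.
    weights = [
        softmax([dot(query, key) / scale for key in features])
        for query in features
    ]
    # Each sensor's output is an attention-weighted mix of all sensor features.
    attended = [
        [sum(w * value[j] for w, value in zip(row, features)) for j in range(dim)]
        for row in weights
    ]
    # Hypothetical pooling step: average the attended vectors into one.
    fused = [sum(vec[j] for vec in attended) / len(attended) for j in range(dim)]
    return fused, weights


if __name__ == "__main__":
    # Two sensors with similar features and one with a different feature.
    fused, weights = fuse_sensors([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    print(fused)
    print(weights[0])
```

In this toy setup, sensors whose features are similar receive larger attention weights from each other, which is one simple way to express "relationships between features of different viewpoints and modalities". A trained model would learn the projections and would typically stack several such layers.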


MEx: Multi-modal Exercises Dataset for Human Activity Recognition

MEx: Multi-modal Exercises Dataset is a multi-sensor, multi-modal datase...

Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention

Automatically describing video, or video captioning, has been widely stu...

Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer

Large-scale deployment of autonomous vehicles has been continually delay...

Objective Study of Sensor Relevance for Automatic Cough Detection

The development of a system for the automatic, objective and reliable de...

AFT-VO: Asynchronous Fusion Transformers for Multi-View Visual Odometry Estimation

Motion estimation approaches typically employ sensor fusion techniques, ...

Contextual Sense Making by Fusing Scene Classification, Detections, and Events in Full Motion Video

With the proliferation of imaging sensors, the volume of multi-modal ima...

Multimodal Indoor Localisation for Measuring Mobility in Parkinson's Disease using Transformers

Parkinson's disease (PD) is a slowly progressive debilitating neurodegen...