MMRNet: Improving Reliability for Multimodal Computer Vision for Bin Picking via Multimodal Redundancy

by   Yuhao Chen, et al.

Recently, there has been tremendous interest in Industry 4.0 infrastructure to address labor shortages in global supply chains. Deploying artificial-intelligence-enabled robotic bin picking systems in the real world has become particularly important for reducing labor demands and costs while increasing efficiency. However, such systems may also cause expensive damage during an abnormal event such as a sensor failure. Reliability therefore becomes a critical factor for translating artificial intelligence research into real-world applications and products. In this paper, we propose a reliable vision system with MultiModal Redundancy (MMRNet) for tackling object detection and segmentation for robotic bin picking using data from different modalities. This is the first system that introduces the concept of multimodal redundancy to combat sensor failure during deployment. In particular, we realize the multimodal redundancy framework with a gate fusion module and dynamic ensemble learning. Finally, we present a new label-free multimodal consistency (MC) score that uses the outputs from all modalities to measure the overall reliability and uncertainty of the system output. Through experiments, we demonstrate that in the event of a missing modality, our system provides far more reliable performance than baseline models. We also demonstrate that our MC score is a more powerful reliability indicator for outputs at inference time, when model-generated confidence scores are often over-confident.
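The abstract describes the MC score only at a high level: a label-free measure of reliability computed from the agreement of all modality outputs. As one minimal sketch of that idea (the paper's exact formulation may differ), a consistency score for segmentation could be the mean pairwise IoU between the binary masks predicted by each modality branch; `mc_score` and its inputs here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def mc_score(masks):
    """Illustrative multimodal consistency (MC) score.

    Computes the mean pairwise IoU across binary segmentation masks
    produced by different modality branches: 1.0 means all modalities
    agree perfectly, values near 0 signal disagreement (e.g., a failed
    sensor). This is a hedged sketch, not the paper's exact definition.
    """
    ious = []
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            inter = np.logical_and(masks[i], masks[j]).sum()
            union = np.logical_or(masks[i], masks[j]).sum()
            # Two empty masks are treated as perfectly consistent.
            ious.append(inter / union if union else 1.0)
    return float(np.mean(ious)) if ious else 1.0
```

Because the score needs no ground-truth labels, it can be monitored at deployment time to flag unreliable outputs that an over-confident per-model confidence score would miss.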


