Panoptic SwiftNet: Pyramidal Fusion for Real-time Panoptic Segmentation

03/15/2022
by Josip Šarić, et al.

Dense panoptic prediction is a key ingredient in many existing applications such as autonomous driving, automated warehouses, and agri-robotics. Most of these applications use the recovered dense semantics as input to visual closed-loop control. Hence, practical deployments require real-time inference over large input resolutions on embedded hardware. These requirements call for computationally efficient approaches that deliver high accuracy with limited computational resources. We propose to achieve this goal by trading off backbone capacity for multi-scale feature extraction. In comparison with contemporaneous approaches to panoptic segmentation, the main novelties of our method are scale-equivariant feature extraction and cross-scale upsampling through pyramidal fusion. Our best model achieves 55.9 PQ at 60 FPS on full-resolution 2 MPx images on an RTX 3090 with FP16 TensorRT optimization.
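
To make the idea of pyramidal fusion more concrete, the following is a toy sketch in plain Python, not the architecture from the paper. It assumes that the same feature extractor is applied to every level of an image pyramid and that coarse features are upsampled and summed into finer ones. The names downsample2x, upsample2x, shared_features, and pyramidal_fusion are hypothetical, and the horizontal-difference "backbone" only stands in for a real shared network.

```python
# Illustrative sketch only: a toy pyramidal fusion on plain Python lists.
# The pooling, the stand-in feature extractor, and the summation-based
# fusion are assumptions, not the method described in the paper.

def downsample2x(img):
    """Average-pool a 2-D list by a factor of 2 (height and width must be even)."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def upsample2x(feat):
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def shared_features(img):
    """Hypothetical stand-in for a shared backbone: horizontal difference."""
    return [
        [row[min(x + 1, len(row) - 1)] - row[x] for x in range(len(row))]
        for row in img
    ]


def pyramidal_fusion(img, levels=3):
    """Extract features at several scales and fuse them coarse-to-fine."""
    # Build the image pyramid: full resolution first, coarsest last.
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))

    # Apply the same extractor at every scale (shared parameters in a real model).
    feats = [shared_features(level) for level in pyramid]

    # Start from the coarsest features, upsample, and add the next finer level.
    fused = feats[-1]
    for finer in reversed(feats[:-1]):
        up = upsample2x(fused)
        fused = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(finer, up)]
    return fused


if __name__ == "__main__":
    image = [[float(x) for x in range(8)] for _ in range(8)]
    fused = pyramidal_fusion(image, levels=3)
    print(len(fused), len(fused[0]))
```

In this sketch the input height and width must be divisible by 2 raised to the power levels - 1, so an 8x8 input works with three levels. The fused output has the resolution of the finest pyramid level.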
