A novel Region of Interest Extraction Layer for Instance Segmentation

by Leonardo Rossi, et al.

Given the wide diffusion of deep neural network architectures for computer vision tasks, several new applications are becoming increasingly feasible. Among them, particular attention has recently been given to instance segmentation, exploiting the results achievable by two-stage networks (such as Mask R-CNN or Faster R-CNN) derived from R-CNN. In these complex architectures, a crucial role is played by the Region of Interest (RoI) extraction layer, which extracts a coherent subset of features from a single Feature Pyramid Network (FPN) layer attached on top of a backbone. This paper is motivated by the need to overcome the limitations of existing RoI extractors, which select only one (the best) layer from the FPN. Our intuition is that all the layers of the FPN retain useful information. Therefore, the proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost performance. A comprehensive ablation study at component level is conducted to find the best set of algorithms and parameters for the GRoIE layer. Moreover, GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks. Therefore, the improvements brought by the use of GRoIE in different state-of-the-art architectures are also evaluated. The proposed layer yields gains of up to 1.1 on detection and 1.7 ... The code is publicly available on GitHub at https://github.com/IMPLabUniPr/mmdetection-groie
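The contrast between single-layer selection and using every FPN layer can be illustrated with a minimal Python sketch. It is not the GRoIE implementation: the level-assignment rule is the heuristic from the original FPN paper, the function names and toy feature values are hypothetical, and the plain element-wise sum is an assumed stand-in for aggregation that leaves out the non-local and attention components mentioned in the abstract.

```python
import math


def select_fpn_level(width, height, k0=4, k_min=2, k_max=5, canonical=224):
    """Pick one pyramid level for an RoI (heuristic from the FPN paper).

    k = floor(k0 + log2(sqrt(w * h) / canonical)), clamped to [k_min, k_max].
    """
    k = math.floor(k0 + math.log2(math.sqrt(width * height) / canonical))
    return max(k_min, min(k_max, k))


def aggregate_all_levels(pooled_per_level):
    """Element-wise sum of same-shaped RoI features pooled from every level.

    Assumption: a plain sum is used here only for illustration.
    """
    return [sum(values) for values in zip(*pooled_per_level)]


# Toy pooled RoI features for levels P2..P5 (hypothetical values).
pooled = [[1.0, 2.0], [0.5, 0.5], [0.0, 1.0], [2.0, 0.0]]

level = select_fpn_level(112, 112)   # a 112x112 RoI maps to one level only
single = pooled[level - 2]           # features from that level alone
fused = aggregate_all_levels(pooled) # features combined from all levels
```

In the single-layer case the features of the three unselected levels are discarded for that RoI, which is the limitation the abstract refers to.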



Vec2Instance: Parameterization for Deep Instance Segmentation

Current advances in deep learning are leading to human-level accuracy in ...

EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network

Recently, it has been demonstrated that the performance of a deep convol...

Non-local RoI for Cross-Object Perception

We present a generic and flexible module that encodes region proposals b...

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

Instance segmentation is one of the fundamental vision tasks. Recently, ...

Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing

Within the field of instance segmentation, most of the state-of-the-art ...

Real-time Automatic M-mode Echocardiography Measurement with Panel Attention from Local-to-Global Pixels

Motion mode (M-mode) recording is an essential part of echocardiography ...

ROI-based Deep Image Compression with Swin Transformers

Encoding the Region Of Interest (ROI) with better quality than the backg...
