FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

01/11/2019
by   Shuyang Sun, et al.

The basic principles for designing convolutional neural network (CNN) structures diverge across prediction levels, e.g., image-level, region-level, and pixel-level. Network structures designed specifically for image classification are generally used directly as the default backbone for other tasks, including detection and segmentation, but few backbones are designed to unify the advantages of networks built for pixel-level or region-level prediction tasks, which may require very deep features with high resolution. Towards this goal, we design a fish-like network called FishNet. In FishNet, the information from all resolutions is preserved and refined for the final task. In addition, we observe that existing works still cannot directly propagate gradient information from deep layers to shallow layers; our design handles this problem better. Extensive experiments demonstrate the remarkable performance of FishNet. In particular, on ImageNet-1k, FishNet surpasses the accuracy of DenseNet and ResNet with fewer parameters. FishNet was applied as one of the modules in the winning entry of the COCO Detection 2018 challenge. The code is available at https://github.com/kevin-ssy/FishNet.

research
08/18/2016

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

This paper proposes a novel saliency detection method by combining regio...
research
12/11/2019

RDSNet: A New Deep Architecture for Reciprocal Object Detection and Instance Segmentation

Object detection and instance segmentation are two fundamental computer ...
research
11/25/2022

PIP: Positional-encoding Image Prior

In Deep Image Prior (DIP), a Convolutional Neural Network (CNN) is fitte...
research
09/22/2019

Pixel-Level Dense Prediction without Decoder

Pixel-level dense prediction tasks such as keypoint estimation are domin...
research
11/02/2022

WITT: A Wireless Image Transmission Transformer for Semantic Communications

In this paper, we aim to redesign the vision Transformer (ViT) as a new ...
research
06/14/2022

Pixel-by-pixel Mean Opinion Score (pMOS) for No-Reference Image Quality Assessment

Deep-learning based techniques have contributed to the remarkable progre...
