Detailed Dense Inference with Convolutional Neural Networks via Discrete Wavelet Transform

08/06/2018
by Lingni Ma, et al.

Dense pixelwise prediction, such as semantic segmentation, remains a current challenge for deep convolutional neural networks (CNNs). Many state-of-the-art approaches either tackle the loss of high-resolution information due to pooling in the encoder stage, or use dilated convolutions or high-resolution lanes to maintain detailed feature maps and predictions. Motivated by the structural analogy between multi-resolution wavelet analysis and the pooling/unpooling layers of CNNs, we introduce the discrete wavelet transform (DWT) into the CNN encoder-decoder architecture and propose WCNN. The high-frequency wavelet coefficients are computed at the encoder and later used at the decoder to unpool jointly with the coarse-resolution feature maps through the inverse DWT (iDWT). The DWT/iDWT is further used to develop two wavelet pyramids to capture the global context, where the multi-resolution DWT is applied to successively reduce the spatial resolution and increase the receptive field. In experiments on the Cityscapes dataset, the proposed WCNNs are computationally efficient and improve the accuracy of high-resolution dense pixelwise prediction.
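
The pooling/unpooling analogy in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which wavelet WCNN uses, so the Haar wavelet is assumed here, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The low-frequency band plays the role of the pooled feature map, and the three high-frequency bands are the detail coefficients that a decoder could reuse to unpool without losing information.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT (assumed wavelet) of an even-sized 2D array.

    Returns the low-frequency band (half resolution, acts like pooling)
    and a tuple of three high-frequency detail bands.
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0           # local average (coarse map)
    lh = (a - b + c - d) / 2.0           # difference between columns
    hl = (a + b - c - d) / 2.0           # difference between rows
    hh = (a - b - c + d) / 2.0           # diagonal difference
    return ll, (lh, hl, hh)

def haar_idwt2(ll, high):
    """Inverse of haar_dwt2: unpool a coarse map using the stored details."""
    lh, hl, hh = high
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# Toy "feature map": one channel, 4 x 6 pixels.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6))

ll, high = haar_dwt2(feat)     # encoder side: pool and keep the details
recon = haar_idwt2(ll, high)   # decoder side: unpool with the same details
```

In this sketch the round trip is lossless, which is the property that distinguishes wavelet unpooling from plain upsampling of a pooled map. Applying `haar_dwt2` repeatedly to the returned low-frequency band halves the resolution at each level, which is the multi-resolution idea the abstract describes for the wavelet pyramids.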

research
07/06/2019

Multi-level Wavelet Convolutional Neural Networks

In computer vision, convolutional networks (CNNs) often adopt pooling t...
research
10/17/2021

Exploring Novel Pooling Strategies for Edge Preserved Feature Maps in Convolutional Neural Networks

With the introduction of anti-aliased convolutional neural networks (CNN...
research
06/13/2020

Split-Merge Pooling

There are a variety of approaches to obtain a vast receptive field with ...
research
05/24/2022

Wavelet Feature Maps Compression for Image-to-Image CNNs

Convolutional Neural Networks (CNNs) are known for requiring extensive c...
research
06/14/2023

WavPool: A New Block for Deep Neural Networks

Modern deep neural networks comprise many operational layers, such as de...
research
04/16/2020

Top-Down Networks: A coarse-to-fine reimagination of CNNs

Biological vision adopts a coarse-to-fine information processing pathway...
research
05/18/2018

Multi-level Wavelet-CNN for Image Restoration

The tradeoff between receptive field size and efficiency is a crucial is...
