Scaling up Kernels in 3D CNNs

06/21/2022
by Yukang Chen, et al.

Recent advances in 2D CNNs and vision transformers (ViTs) reveal that large kernels are essential for sufficient receptive fields and high performance. Inspired by this literature, we examine the feasibility and challenges of 3D large-kernel designs. We demonstrate that applying large convolutional kernels in 3D CNNs is more difficult in terms of both performance and efficiency. Existing techniques that work well in 2D CNNs, including the popular depth-wise convolutions, are ineffective in 3D networks. To overcome these obstacles, we present the spatial-wise group convolution and its large-kernel module (SW-LK block), which avoids the optimization and efficiency issues of naive 3D large kernels. Our large-kernel 3D CNN, LargeKernel3D, yields non-trivial improvements on various 3D tasks, including semantic segmentation and object detection. Notably, it achieves 73.9 on segmentation and 72.8 on the nuScenes LIDAR leaderboard, and it is further boosted to 74.2 with simple multi-modal fusion. LargeKernel3D attains results comparable or superior to those of its CNN and transformer counterparts. For the first time, we show that large kernels are feasible and essential for 3D networks.
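The efficiency problem of naive 3D large kernels can be made concrete with a little arithmetic: a dense k x k x k kernel holds k^3 weights per input/output channel pair, against k^2 in 2D. The sketch below is only an illustration and not the authors' implementation. It assumes that weight sharing among spatially adjacent kernel positions is what "spatial-wise group" refers to, and the helper names and the (2, 3, 2) split of a size-7 axis are hypothetical choices made for this example.

```python
import numpy as np


def dense_kernel_params(k: int, dims: int) -> int:
    """Weights in one dense kernel of side k, per input/output channel pair."""
    return k ** dims


def expand_shared_kernel(shared: np.ndarray, group_sizes=(2, 3, 2)) -> np.ndarray:
    """Expand a small kernel into a larger one by sharing weights spatially.

    Hypothetical helper: each axis of the large kernel is split into
    contiguous groups (sizes given by group_sizes, an assumed split), and
    all positions in a group reuse the same weight from `shared`.
    """
    # Map each large-kernel position to its group index, e.g. [0,0,1,1,1,2,2].
    idx = np.repeat(np.arange(len(group_sizes)), group_sizes)
    # Open-mesh indexing picks shared[idx[i], idx[j], idx[k]] for every (i, j, k).
    return shared[np.ix_(idx, idx, idx)]


shared = np.arange(27, dtype=np.float32).reshape(3, 3, 3)
large = expand_shared_kernel(shared)

print(dense_kernel_params(7, 2))   # dense 2D 7x7 kernel
print(dense_kernel_params(7, 3))   # dense 3D 7x7x7 kernel
print(large.shape, np.unique(large).size)  # large extent, small parameter count
```

Under these assumptions the expanded kernel covers a 7x7x7 neighborhood while carrying only the 27 distinct weights of a 3x3x3 kernel, compared with 343 for a dense 7x7x7 kernel.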


