
-
Rethinking Spatial Dimensions of Vision Transformers
Vision Transformer (ViT) extends the application range of transformers f...
-
Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
ImageNet has been arguably the most popular image classification benchma...
-
VideoMix: Rethinking Data Augmentation for Video Classification
State-of-the-art video action classifiers often suffer from overfitting....
-
ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network
This paper addresses representational bottleneck in a network and propos...
-
Slowing Down the Weight Norm Increase in Momentum-based Optimizers
Normalization techniques, such as batch normalization (BN), have led to ...
-
An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods
Despite apparent human-level performances of deep neural networks (DNN),...
-
EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse
In this paper, we propose a new multi-scale face detector having an extr...
-
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Regional dropout strategies have been proposed to enhance the performanc...
-
Character Region Awareness for Text Detection
Scene text detection methods based on neural networks have emerged recen...
-
What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
Many new proposals for scene text recognition (STR) models have been int...
-
Concentrated-Comprehensive Convolutions for lightweight semantic segmentation
The semantic segmentation requires a lot of computational cost. The dila...
-
Deep Pyramidal Residual Networks
Deep convolutional neural networks (DCNNs) have shown remarkable perform...