YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection

08/10/2023
by Yuming Chen, et al.

We aim at providing the object detection community with an efficient and performant object detector, termed YOLO-MS. The core design is based on a series of investigations into how convolutions with different kernel sizes affect the detection performance of objects at different scales. The outcome is a new strategy that can strongly enhance the multi-scale feature representations of real-time object detectors. To verify the effectiveness of our strategy, we build a network architecture, termed YOLO-MS. We train YOLO-MS on the MS COCO dataset from scratch, without relying on any other large-scale dataset, such as ImageNet, or on pre-trained weights. Without bells and whistles, YOLO-MS outperforms recent state-of-the-art real-time object detectors, including YOLO-v7 and RTMDet, when using a comparable number of parameters and FLOPs. Taking the XS version of YOLO-MS as an example, with only 4.5M learnable parameters and 8.7G FLOPs, it can achieve an AP score of 43 [...] is about 2 [...] can also be used as a plug-and-play module for other YOLO models. Typically, our method significantly improves the AP of YOLOv8 from 37 [...] fewer parameters and FLOPs. Code is available at https://github.com/FishAndWasabi/YOLO-MS.
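As background for the link between kernel size and object scale mentioned in the abstract, the sketch below computes the receptive field of a stack of convolution layers using the standard recurrence (each layer widens the field by (k - 1) times the current output spacing). It is only an illustration of why kernel size matters at different feature resolutions, not the YOLO-MS design itself; the function name `receptive_field`, the two-layer stride-2 stem, and the kernel sizes 3, 5, and 7 are assumptions chosen for the example.

```python
def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of conv layers.

    `layers` is a sequence of (kernel_size, stride) pairs applied in order.
    Padding and dilation are ignored; they do not change the field size here.
    """
    rf, jump = 1, 1  # one input pixel sees itself; neighbours are 1 pixel apart
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # widen by (k - 1) steps of current spacing
        jump *= stride                  # striding spreads later outputs further apart
    return rf


# Hypothetical setup: two stride-2 downsampling convs, then one conv whose
# kernel size varies. None of these values are taken from the paper.
stem = [(3, 2), (3, 2)]
fields = {k: receptive_field(stem + [(k, 1)]) for k in (3, 5, 7)}
print(fields)  # {3: 15, 5: 23, 7: 31}
```

Because the output spacing grows with every strided layer, the same increase in kernel size adds more input pixels to the field at deeper, lower-resolution stages than at shallow ones.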


