MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

10/28/2021
by   Ji Lin, et al.

Tiny deep learning on microcontroller units (MCUs) is challenging due to the limited memory size. We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs: the first several blocks have an order of magnitude larger memory usage than the rest of the network. To alleviate this issue, we propose a generic patch-by-patch inference scheduling, which operates only on a small spatial region of the feature map and significantly cuts down the peak memory. However, a naive implementation brings overlapping patches and computation overhead. We further propose network redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead. Manually redistributing the receptive field is difficult, so we automate the process with neural architecture search to jointly optimize the neural architecture and inference scheduling, leading to MCUNetV2. Patch-based inference effectively reduces the peak memory usage of existing networks by 4-8x. Co-designed with neural networks, MCUNetV2 sets a record ImageNet accuracy on MCU (71.8%) and achieves >90% accuracy on the visual wake words dataset under only 32kB SRAM. MCUNetV2 also unblocks object detection on tiny devices, achieving 16.9% higher mAP on Pascal VOC compared to the state-of-the-art result. Our study largely addressed the memory bottleneck in tinyML and paved the way for various vision applications beyond image classification.
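The patch-by-patch idea and its overlap overhead can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single 3x3 convolution without padding on a tiny 2D input, and the function names (`conv3x3_valid`, `patch_inference`) are hypothetical. In the scheme described above, a patch would instead pass through several initial blocks, so the overlapping border would grow with their combined receptive field.

```python
def conv3x3_valid(x, k):
    """3x3 convolution without padding (cross-correlation) on a 2D list."""
    h, w = len(x), len(x[0])
    return [
        [
            sum(x[i + di][j + dj] * k[di][dj] for di in range(3) for dj in range(3))
            for j in range(w - 2)
        ]
        for i in range(h - 2)
    ]


def patch_inference(x, k, tile):
    """Compute the same output tile by tile, reading one small input patch at a time."""
    out_h, out_w = len(x) - 2, len(x[0]) - 2
    out = [[0] * out_w for _ in range(out_h)]
    peak_patch_elems = 0   # largest input patch held at once
    total_patch_elems = 0  # input elements read overall, counting overlaps
    for i0 in range(0, out_h, tile):
        for j0 in range(0, out_w, tile):
            i1, j1 = min(i0 + tile, out_h), min(j0 + tile, out_w)
            # Input patch = output tile plus a 2-pixel border, which is what
            # a 3x3 kernel needs; neighbouring patches share this border.
            patch = [row[j0:j1 + 2] for row in x[i0:i1 + 2]]
            n = len(patch) * len(patch[0])
            peak_patch_elems = max(peak_patch_elems, n)
            total_patch_elems += n
            for di, row in enumerate(conv3x3_valid(patch, k)):
                out[i0 + di][j0:j1] = row
    return out, peak_patch_elems, total_patch_elems


# Toy 10x10 input and an integer kernel, so results compare exactly.
x = [[i * 10 + j for j in range(10)] for i in range(10)]
k = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

full = conv3x3_valid(x, k)
patched, peak, total = patch_inference(x, k, tile=4)
print(patched == full)        # patch-based result matches the full result
print(peak, len(x) * len(x[0]))  # elements held at once: patch vs. whole input
print(total)                  # elements read in total, including overlap
```

In this toy setting, each patch holds 36 input elements instead of 100, while the four patches together read 144 elements. That extra reading is the overlap overhead the abstract refers to, and it is what shifting the receptive field to later stages is meant to reduce.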


research
07/20/2020

MCUNet: Tiny Deep Learning on IoT Devices

Machine learning on tiny IoT devices based on microcontroller units (MCU...
research
10/27/2020

μNAS: Constrained Neural Architecture Search for Microcontrollers

IoT devices are powered by microcontroller units (MCUs) which are extrem...
research
03/04/2020

Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices

Recent advances demonstrate that irregularly wired neural networks from ...
research
09/05/2019

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection

Recently, Neural Architecture Search has achieved great success in large...
research
11/30/2022

Pex: Memory-efficient Microcontroller Deep Learning through Partial Execution

Embedded and IoT devices, largely powered by microcontroller units (MCUs...
research
03/07/2023

TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT

Monitoring and detecting abnormal events in cyber-physical systems is cr...
research
01/12/2021

3D-ANAS: 3D Asymmetric Neural Architecture Search for Fast Hyperspectral Image Classification

Hyperspectral images involve abundant spectral and spatial information, ...
