Efficient Neural Network Deployment for Microcontroller

07/02/2020
by   Hasan Unlu, et al.
0

Edge computing for neural networks is getting important especially for low power applications and offline devices. TensorFlow Lite and PyTorch Mobile were released for this purpose. But they mainly support mobile devices instead of microcontroller level yet. Microcontroller support is an emerging area now. There are many approaches to reduce network size and compute load like pruning, binarization and layer manipulation i.e. operator reordering. This paper is going to explore and generalize convolution neural network deployment for microcontrollers with two novel optimization proposals offering memory saving and compute efficiency in 2D convolutions as well as fully connected layers. The first one is in-place max-pooling, if the stride is greater than or equal to pooling kernel size. The second optimization is to use ping-pong buffers between layers to reduce memory consumption significantly. The memory savings and performance will be compared with CMSIS-NN framework developed for ARM Cortex-M CPUs. The final purpose is to develop a tool consuming PyTorch model with trained network weights, and it turns into an optimized inference engine(forward pass) in C/C++ for low memory(kilobyte level) and limited computing capable microcontrollers.

READ FULL TEXT
research
04/30/2018

Ultra Power-Efficient CNN Domain Specific Accelerator with 9.3TOPS/Watt for Mobile and Embedded Applications

Computer vision performances have been significantly improved in recent ...
research
02/24/2021

Efficient Low-Latency Dynamic Licensing for Deep Neural Network Deployment on Edge Devices

Along with the rapid development in the field of artificial intelligence...
research
03/29/2021

Deep Compression for PyTorch Model Deployment on Microcontrollers

Neural network deployment on low-cost embedded systems, hence on microco...
research
04/27/2023

Moccasin: Efficient Tensor Rematerialization for Neural Networks

The deployment and training of neural networks on edge computing devices...
research
03/04/2019

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs

The Winograd or Cook-Toom class of algorithms help to reduce the overall...
research
07/04/2022

Sustainable AI Processing at the Edge

Edge computing is a popular target for accelerating machine learning alg...
research
07/25/2019

HUGE2: a Highly Untangled Generative-model Engine for Edge-computing

As a type of prominent studies in deep learning, generative models have ...

Please sign up or login with your details

Forgot password? Click here to reset