
Fast Implementation of Morphological Filtering Using ARM NEON Extension

by Elena Limonova, et al.

In this paper we consider the speedup potential of morphological image filtering on ARM processors. Morphological operations are widely used in image analysis and recognition, and speeding them up can in some cases significantly reduce the overall execution time of recognition. More specifically, we propose a fast implementation of erosion and dilation using the ARM SIMD extension NEON. With a rectangular structuring element these operations are separable, so we implement them as sequential horizontal and vertical passes. Each pass uses the van Herk/Gil-Werman algorithm for large windows and a linear-complexity algorithm with a low constant for small windows. The final implementation combines these methods and is accelerated with SIMD. We also consider a fast transpose of 8x8 and 16x16 matrices using ARM NEON to obtain an additional computational gain for morphological operations. Experiments showed a 3-fold speedup for the final implementation of erosion and dilation compared to the van Herk/Gil-Werman algorithm without SIMD, a 5.7-fold speedup for the 8x8 matrix transpose, and a 12-fold speedup for the 16x16 matrix transpose compared to transposition without SIMD.
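
To make the separable scheme concrete, here is a minimal scalar sketch in Python of erosion with a rectangular window, built from a one-dimensional van Herk/Gil-Werman running minimum applied first along rows and then along columns. It is not the paper's NEON implementation. The function names, the restriction to odd window sizes, the border rule (the window is clipped at the image edge) and the use of a transpose to run the vertical pass are all assumptions made for illustration.

```python
import math

def erode_1d(f, w):
    """Centered running minimum over an odd window w (van Herk/Gil-Werman)."""
    if w < 1 or w % 2 == 0:
        raise ValueError("window size must be a positive odd integer")
    n = len(f)
    r = w // 2
    # Pad with +inf (the identity for min) so border windows are clipped.
    padded = [math.inf] * r + list(f) + [math.inf] * r
    # Extend to a whole number of blocks of length w.
    padded += [math.inf] * (-len(padded) % w)
    m = len(padded)
    g = padded[:]  # g[i]: min from the start of i's block up to i
    h = padded[:]  # h[i]: min from i up to the end of i's block
    for i in range(m):
        if i % w != 0:
            g[i] = min(g[i - 1], padded[i])
    for i in range(m - 2, -1, -1):
        if (i + 1) % w != 0:
            h[i] = min(h[i + 1], padded[i])
    # Each window spans at most two blocks: a block suffix and a block prefix.
    return [min(h[i], g[i + w - 1]) for i in range(n)]

def transpose(img):
    return [list(col) for col in zip(*img)]

def erode_2d(img, wx, wy):
    """Erosion with a wx-by-wy rectangle as a horizontal then a vertical pass."""
    rows = [erode_1d(row, wx) for row in img]
    # Assumption: the vertical pass reuses the row routine on the transpose.
    cols = [erode_1d(col, wy) for col in transpose(rows)]
    return transpose(cols)

def dilate_2d(img, wx, wy):
    """Dilation via duality: max(x) = -min(-x)."""
    neg = [[-v for v in row] for row in img]
    return [[-v for v in row] for row in erode_2d(neg, wx, wy)]
```

For example, `erode_1d([5, 2, 7, 1, 9], 3)` returns `[2, 2, 1, 1, 1]`. Each one-dimensional pass costs a constant number of comparisons per pixel regardless of the window size, which is why the approach suits large windows.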




A fast vectorized sorting implementation based on the ARM scalable vector extension (SVE)

The way developers implement their algorithms and how these implementati...

Fast matrix multiplication for binary and ternary CNNs on ARM CPU

Low-bit quantized neural networks are of great interest in practical app...

Parallel Implementation of Distributed Global Optimization (DGO)

Parallel implementations of distributed global optimization (DGO) [13] o...

ARM 4-BIT PQ: SIMD-based Acceleration for Approximate Nearest Neighbor Search on ARM

We accelerate the 4-bit product quantization (PQ) on the ARM architectur...

Deep Morphological Neural Networks

Mathematical morphology is a theory and technique to collect features li...

Multithreaded Filtering Preconditioner for Diffusion Equation on Structured Grid

A parallel and nested version of a frequency filtering preconditioner is...

The Divide-and-Conquer Framework: A Suitable Setting for the DDM of the Future

This paper was prompted by numerical experiments we performed, in which ...