CNNs for JPEGs: A Study in Computational Cost

09/20/2023
by Samuel Felipe dos Santos, et al.

Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, defining the state of the art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data are available in compressed formats, of which JPEG is the most widely used for transmission and storage, and these require a preliminary decoding step with a high computational load and memory usage. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency-domain representation of the image, such as the DCT, through partial decoding, and then adapt typical CNN architectures to work with it. One limitation of current works is that, in order to accommodate the frequency-domain data, the modifications made to the original model significantly increase its number of parameters and computational complexity. On the one hand, these methods have faster preprocessing, since the cost of fully decoding the images is avoided; on the other hand, the cost of passing the images through the model increases, offsetting the potential speedup. In this paper, we propose a further study of the computational cost of deep models designed for the frequency domain, evaluating the cost of both decoding the images and passing them through the network. We also propose handcrafted and data-driven techniques for reducing the computational complexity and the number of parameters of these models in order to keep them similar to their RGB baselines, leading to efficient models with a better trade-off between computational cost and accuracy.
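
To make the frequency-domain input mentioned in the abstract more concrete, here is a minimal sketch of how an image channel maps to blockwise DCT coefficients. It is an illustration under stated assumptions, not the authors' pipeline: the function names `dct_8x8` and `blockwise_dct` are hypothetical, and the coefficients are computed from pixels only to show the resulting shape. A real partial JPEG decoder would instead read them from the entropy-decoded bitstream, and JPEG's level shift and quantization are omitted here.

```python
import math

BLOCK = 8  # JPEG applies the DCT to 8x8 pixel blocks


def dct_8x8(block):
    """Orthonormal 2D type-II DCT of one 8x8 block (a list of 8 rows)."""
    def alpha(k):
        return math.sqrt(1.0 / BLOCK) if k == 0 else math.sqrt(2.0 / BLOCK)

    out = [[0.0] * BLOCK for _ in range(BLOCK)]
    for u in range(BLOCK):
        for v in range(BLOCK):
            total = 0.0
            for x in range(BLOCK):
                for y in range(BLOCK):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * BLOCK))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * BLOCK)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def blockwise_dct(channel):
    """Map an H x W channel to an (H/8) x (W/8) grid of 64 coefficients.

    Assumption for this sketch: H and W are multiples of 8.
    """
    height, width = len(channel), len(channel[0])
    assert height % BLOCK == 0 and width % BLOCK == 0
    grid = []
    for by in range(0, height, BLOCK):
        row = []
        for bx in range(0, width, BLOCK):
            block = [r[bx:bx + BLOCK] for r in channel[by:by + BLOCK]]
            coeffs = dct_8x8(block)
            # Flatten the 8x8 coefficients into one 64-value vector.
            row.append([c for line in coeffs for c in line])
        grid.append(row)
    return grid
```

The spatial resolution shrinks by a factor of 8 in each dimension while the channel depth grows to 64 per colour component, which suggests why the first layers of an RGB-oriented CNN need adapting before they can consume such input.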

research
12/26/2020

Deep Learning Towards Edge Computing: Neural Networks Straight from Compressed Data

Due to the popularization and grow in computational power of mobile phon...
research
04/01/2021

Less is More: Accelerating Faster Neural Networks Straight from JPEG

Most image data available are often stored in a compressed format, from ...
research
12/26/2020

Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain

Human action recognition has become one of the most active field of rese...
research
09/20/2023

Budget-Aware Pruning: Handling Multiple Domains with Less Parameters

Deep learning has achieved state-of-the-art performance on several compu...
research
10/14/2022

Parameter Sharing in Budget-Aware Adapters for Multi-Domain Learning

Deep learning has achieved state-of-the-art performance on several compu...
research
10/03/2020

Nonconvex Regularization for Network Slimming: Compressing CNNs Even More

In the last decade, convolutional neural networks (CNNs) have evolved to...
research
11/29/2022

RGB no more: Minimally-decoded JPEG Vision Transformers

Most neural networks for computer vision are designed to infer using RGB...
