Towards End-to-End Image Compression and Analysis with Transformers

12/17/2021
by Yuanchao Bai, et al.

We propose an end-to-end image compression and analysis model with Transformers, targeting cloud-based image classification applications. Instead of placing an existing Transformer-based image classification model directly after an image codec, we redesign the Vision Transformer (ViT) model so that it performs image classification from the compressed features and facilitates image compression with the long-term information from the Transformer. Specifically, we first replace the patchify stem (i.e., image splitting and embedding) of the ViT model with a lightweight image encoder modelled by a convolutional neural network. The compressed features generated by the image encoder carry a convolutional inductive bias and are fed to the Transformer for image classification, bypassing image reconstruction. Meanwhile, we propose a feature aggregation module that fuses the compressed features with selected intermediate features of the Transformer and feeds the aggregated features to a deconvolutional neural network for image reconstruction. The aggregated features obtain long-term information from the self-attention mechanism of the Transformer, which improves the compression performance. The rate-distortion-accuracy optimization problem is finally solved by a two-step training strategy. Experimental results demonstrate the effectiveness of the proposed model in both the image compression and the classification tasks.
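
The abstract names a rate-distortion-accuracy optimization problem but does not give its form. As a minimal sketch, one common way to write such an objective is a weighted sum of a bit-rate term, a reconstruction error, and a classification loss. Everything in the block below is an assumption made for illustration: the function name, the weights `lambda_d` and `lambda_a`, the choice of mean squared error for distortion, and the choice of cross-entropy for the accuracy term. It is not the authors' formulation, and it does not model the two-step training strategy.

```python
import math


def rate_distortion_accuracy_loss(bits, num_pixels, original, reconstructed,
                                  class_probs, label,
                                  lambda_d=0.01, lambda_a=1.0):
    """Hypothetical joint objective: R + lambda_d * D + lambda_a * A.

    bits          -- estimated size of the compressed features, in bits
    num_pixels    -- number of pixels in the input image
    original      -- flat list of original pixel values
    reconstructed -- flat list of reconstructed pixel values
    class_probs   -- predicted class probabilities (sum to 1)
    label         -- index of the ground-truth class
    """
    # Rate term: bits per pixel.
    rate = bits / num_pixels
    # Distortion term: mean squared error (an assumed metric).
    distortion = sum((x - y) ** 2
                     for x, y in zip(original, reconstructed)) / len(original)
    # Accuracy term: cross-entropy of the true class (an assumed loss).
    accuracy_term = -math.log(class_probs[label])
    return rate + lambda_d * distortion + lambda_a * accuracy_term


loss = rate_distortion_accuracy_loss(
    bits=8, num_pixels=4,
    original=[0.0, 0.0, 0.0, 0.0],
    reconstructed=[1.0, 1.0, 1.0, 1.0],
    class_probs=[0.5, 0.5], label=0,
    lambda_d=2.0, lambda_a=1.0,
)
```

Raising `lambda_d` in this sketch favors reconstruction quality over bit-rate, and raising `lambda_a` favors classification accuracy. The actual trade-off used in the paper is described in the full text.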

research
03/02/2022

Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions

With the achievements of Transformer in the field of natural language pr...
research
11/12/2021

Transformer-based Image Compression

A Transformer-based Image Compression (TIC) approach is developed which ...
research
03/21/2023

Learning A Sparse Transformer Network for Effective Image Deraining

Transformers-based methods have achieved significant performance in imag...
research
08/08/2023

SDLFormer: A Sparse and Dense Locality-enhanced Transformer for Accelerated MR Image Reconstruction

Transformers have emerged as viable alternatives to convolutional neural...
research
09/19/2023

Multi-spectral Entropy Constrained Neural Compression of Solar Imagery

Missions studying the dynamic behaviour of the Sun are defined to captur...
research
09/04/2019

Faster and Accurate Classification for JPEG2000 Compressed Images in Networked Applications

JPEG2000 (j2k) is a highly popular format for image and video compressio...
research
07/29/2022

Forensic License Plate Recognition with Compression-Informed Transformers

Forensic license plate recognition (FLPR) remains an open challenge in l...
