Convolutional Neural Network (CNN) vs Visual Transformer (ViT) for Digital Holography

08/20/2021
by   Stéphane Cuenat, et al.

In Digital Holography (DH), it is crucial to extract the object distance from a hologram in order to reconstruct its amplitude and phase. This step is called auto-focusing, and it is conventionally solved by first reconstructing a stack of images at candidate distances and then scoring the sharpness of each reconstructed image with a focus metric such as entropy or variance. The distance corresponding to the sharpest image is taken as the focal position. This approach, while effective, is computationally demanding and time-consuming. In this paper, the distance is determined by Deep Learning (DL). Two DL architectures are compared: a Convolutional Neural Network (CNN) and a Visual Transformer (ViT). Both treat auto-focusing as a classification problem. Compared to a first attempt [11], in which the distance between two consecutive classes was 100 μm, our proposal drastically reduces this distance to 1 μm. Moreover, ViT reaches similar accuracy and is more robust than CNN.
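
The conventional auto-focusing loop described in the abstract, and the mapping from distances to classes at a fixed spacing, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes an idealized complex field at the sensor (a real hologram records intensity only), angular spectrum propagation as the reconstruction step, and amplitude variance as the focus metric. All function names, the z_min parameter, and the numerical values are hypothetical.

```python
import numpy as np


def angular_spectrum(field, z, wavelength, pitch):
    """Propagate a complex field by a distance z (angular spectrum method).

    All lengths (z, wavelength, pitch) must share the same unit, e.g. micrometres.
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pitch)  # spatial frequencies along y
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    propagating = arg > 0  # evanescent components are discarded
    kz = 2 * np.pi * np.sqrt(np.where(propagating, arg, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)


def amplitude_variance(field):
    """Focus metric (assumed choice): variance of the reconstructed amplitude."""
    return float(np.var(np.abs(field)))


def autofocus(sensor_field, candidates, wavelength, pitch, metric=amplitude_variance):
    """Reconstruct one image per candidate distance and keep the sharpest.

    The cost grows linearly with the number of candidates, which is why
    the stack-based approach is slow for fine distance steps.
    """
    scores = [
        metric(angular_spectrum(sensor_field, -z, wavelength, pitch))
        for z in candidates
    ]
    return candidates[int(np.argmax(scores))], scores


def distance_to_class(z, z_min, step=1.0):
    """Map a distance to a class index, with `step` between consecutive classes."""
    return int(round((z - z_min) / step))


def class_to_distance(index, z_min, step=1.0):
    """Inverse mapping: class index back to a distance."""
    return z_min + index * step
```

With step=1.0 and distances in micrometres, each class covers 1 μm, which matches the class spacing stated in the abstract. A classifier trained on such labels would replace the autofocus loop with a single forward pass per hologram.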

research
02/02/2018

Convolutional neural network-based regression for depth prediction in digital holography

Digital holography enables us to reconstruct objects in three-dimensiona...
research
03/15/2022

Fast Autofocusing using Tiny Networks for Digital Holographic Microscopy

The numerical wavefront backpropagation principle of digital holography ...
research
03/21/2018

Extended depth-of-field in holographic image reconstruction using deep learning based auto-focusing and phase-recovery

Holography encodes the three dimensional (3D) information of a sample in...
research
09/16/2022

A Mosquito is Worth 16x16 Larvae: Evaluation of Deep Learning Architectures for Mosquito Larvae Classification

Mosquito-borne diseases (MBDs), such as dengue virus, chikungunya virus,...
research
10/24/2019

Reconstruction of Undersampled 3D Non-Cartesian Image-Based Navigators for Coronary MRA Using an Unrolled Deep Learning Model

Purpose: To rapidly reconstruct undersampled 3D non-Cartesian image-base...
research
06/05/2022

Performance Comparison of Simple Transformer and Res-CNN-BiLSTM for Cyberbullying Classification

The task of text classification using Bidirectional based LSTM architect...
research
04/17/2019

DistanceNet: Estimating Traveled Distance from Monocular Images using a Recurrent Convolutional Neural Network

Classical monocular vSLAM/VO methods suffer from the scale ambiguity pro...
