Learning to search for and detect objects in foveal images using deep learning

04/12/2023
by Beatriz Paula, et al.

The human visual system processes images at varying resolution: the fovea, a small portion of the retina, captures the region of highest acuity, and acuity declines gradually toward the periphery of the field of view. However, most existing object localization methods rely on images acquired by sensors with space-invariant resolution and ignore biological attention mechanisms. To select regions of interest, this study employs a fixation prediction model that emulates human objective-guided attention when searching for a given class in an image. The foveated image at each fixation point is then classified to determine whether the target is present or absent in the scene. Within this two-stage pipeline, we investigate how results vary when high-level or panoptic features are used, and we provide a ground-truth label function for fixation sequences that is smoother and better reflects the spatial structure of the problem. Finally, we present a novel dual-task model that performs fixation prediction and detection simultaneously, allowing knowledge transfer between the two tasks. We conclude that, because the two tasks are complementary, training benefited from the shared knowledge, improving performance over the baseline scores of the previous approach.
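
The abstract does not define the smoother ground-truth label function, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes fixations are discretized onto a grid of cells and contrasts a one-hot target with a Gaussian-smoothed target, in which cells near the true fixation also receive some probability mass. The function names, the 10 x 16 grid, and the `sigma` parameter are hypothetical choices made for this example.

```python
import math


def one_hot_label(row, col, grid_h, grid_w):
    """Hard target: all probability mass on the fixated cell."""
    return [
        [1.0 if (r == row and c == col) else 0.0 for c in range(grid_w)]
        for r in range(grid_h)
    ]


def gaussian_label(row, col, grid_h, grid_w, sigma=1.0):
    """Soft target: an isotropic Gaussian centered on the fixated cell.

    Mass decays with squared grid distance, so a prediction that lands
    one cell away is penalized less than one on the far side of the image.
    The result is normalized to sum to 1.
    """
    weights = [
        [
            math.exp(-((r - row) ** 2 + (c - col) ** 2) / (2.0 * sigma ** 2))
            for c in range(grid_w)
        ]
        for r in range(grid_h)
    ]
    total = sum(sum(line) for line in weights)
    return [[w / total for w in line] for line in weights]


# Hypothetical 10 x 16 grid with a fixation in cell (3, 4).
hard = one_hot_label(3, 4, 10, 16)
soft = gaussian_label(3, 4, 10, 16, sigma=1.0)
```

Trained with a cross-entropy-style loss, the soft target distinguishes a near miss from a distant one, which a one-hot target cannot do. Whether the paper uses a Gaussian or some other kernel is not stated in the abstract.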


