Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping

01/07/2022
by Nora Horanyi, et al.

We propose a novel optimization framework that crops a given image based on a user description and aesthetics. Existing image cropping methods typically train a deep network to regress crop parameters or cropping actions. In contrast, we directly optimize the cropping parameters by repurposing networks pre-trained on image captioning and aesthetic tasks, without any fine-tuning, thereby avoiding the need to train a separate network. Specifically, we search for the crop parameters that minimize a combined loss built from the original objectives of these networks. To make the optimization tractable, we propose three strategies: (i) multi-scale bilinear sampling, (ii) annealing the scale of the crop region, thereby effectively reducing the parameter space, and (iii) aggregation of multiple optimization results. Through various quantitative and qualitative evaluations, we show that our framework can produce crops that are aesthetically pleasing and well aligned with the intended user descriptions.
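The three strategies can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It makes several assumptions: `toy_loss` (negative mean brightness) is a hypothetical stand-in for the combined captioning and aesthetic loss, central finite differences stand in for backpropagation through the pre-trained networks, the crop is square, and all function names, the linear scale schedule, and the median aggregation are illustrative choices.

```python
import math
import statistics

def bilinear(img, x, y):
    """Bilinearly interpolate the 2D grid `img` at continuous pixel coords (x, y)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the image (border replication)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def sample_crop(img, cx, cy, scale, n):
    """Resample the square crop with normalized center (cx, cy) and side `scale` to n x n."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(n):
        v = cy + ((i + 0.5) / n - 0.5) * scale
        row = []
        for j in range(n):
            u = cx + ((j + 0.5) / n - 0.5) * scale
            row.append(bilinear(img, u * (w - 1), v * (h - 1)))
        out.append(row)
    return out

def toy_loss(img, cx, cy, scale, sizes=(4, 8)):
    """Hypothetical stand-in loss, averaged over several sampling resolutions (strategy i)."""
    total = 0.0
    for n in sizes:
        crop = sample_crop(img, cx, cy, scale, n)
        total += -sum(map(sum, crop)) / (n * n)  # negative mean brightness
    return total / len(sizes)

def optimize_crop(img, cx, cy, steps=60, lr=0.05, s_start=1.0, s_end=0.4, eps=1e-3):
    """Gradient descent on the crop center while the scale follows a schedule (strategy ii)."""
    for t in range(steps):
        # The scale is annealed rather than optimized, so only (cx, cy) remain free.
        scale = s_start + (s_end - s_start) * t / (steps - 1)
        gx = (toy_loss(img, cx + eps, cy, scale) - toy_loss(img, cx - eps, cy, scale)) / (2 * eps)
        gy = (toy_loss(img, cx, cy + eps, scale) - toy_loss(img, cx, cy - eps, scale)) / (2 * eps)
        cx = min(max(cx - lr * gx, 0.0), 1.0)
        cy = min(max(cy - lr * gy, 0.0), 1.0)
    return cx, cy, s_end

def aggregate_crops(img, starts):
    """Run several optimizations and combine them with a per-coordinate median (strategy iii)."""
    results = [optimize_crop(img, cx, cy) for cx, cy in starts]
    return (statistics.median(r[0] for r in results),
            statistics.median(r[1] for r in results),
            results[0][2])

def make_blob_image(size=16, bx=12, by=4, sigma=4.0):
    """Synthetic test image: a single bright Gaussian blob on a dark background."""
    return [[math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

if __name__ == "__main__":
    image = make_blob_image()
    print(aggregate_crops(image, [(0.3, 0.3), (0.5, 0.5), (0.7, 0.6)]))
```

In this toy setting the crop center should drift toward the bright blob as the scale shrinks. In a setting closer to the one described above, the loss would instead come from frozen captioning and aesthetic networks evaluated on the resampled crop.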

research
11/18/2020

Online Exemplar Fine-Tuning for Image-to-Image Translation

Existing techniques to solve exemplar-based image-to-image translation w...
research
07/28/2017

Fine-Pruning: Joint Fine-Tuning and Compression of a Convolutional Network with Bayesian Optimization

When approaching a novel visual recognition problem in a specialized ima...
research
10/06/2017

Efficient K-Shot Learning with Regularized Deep Networks

Feature representations from pre-trained deep neural networks have been ...
research
07/10/2018

Topic-Guided Attention for Image Captioning

Attention mechanisms have attracted considerable interest in image capti...
research
08/28/2023

SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction

Segment Anything Model (SAM) has received remarkable attention as it off...
research
04/21/2017

Attend to You: Personalized Image Captioning with Context Sequence Memory Networks

We address personalization issues of image captioning, which have not be...
research
09/11/2017

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

The existing image captioning approaches typically train a one-stage sen...