Deep Multiple Instance Learning for Zero-shot Image Tagging

03/16/2018
by Shafin Rahman, et al.

In line with the success of deep learning on traditional recognition problems, several end-to-end deep models for zero-shot recognition have been proposed in the literature. These models successfully predict a single unseen label for an input image, but do not scale to cases where multiple unseen objects are present. In this paper, we model this problem within the framework of Multiple Instance Learning (MIL). To the best of our knowledge, we propose the first end-to-end trainable deep MIL framework for the multi-label zero-shot tagging problem. Due to its novel design, the proposed framework has several interesting features: (1) Unlike previous deep MIL models, it does not use any off-line procedure (e.g., Selective Search or EdgeBoxes) for bag generation. (2) At test time, it can process any number of unseen labels given their semantic embedding vectors. (3) Using only the seen labels of each image as weak annotation, it can produce a bounding box for each predicted label. We experiment with the NUS-WIDE dataset and achieve superior performance across conventional, zero-shot and generalized zero-shot tagging tasks.
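To make the MIL view of tagging concrete, the following is a minimal NumPy sketch, not the paper's architecture. It assumes an image is already represented as a bag of per-box feature vectors and that a linear map into the word-vector space is available; the names `mil_tag_scores`, `instance_feats`, `proj` and `label_embs` are hypothetical. Each instance is scored against each label embedding, and max pooling over instances gives the image-level score, so the same function accepts any number of seen or unseen labels.

```python
import numpy as np

def mil_tag_scores(instance_feats, proj, label_embs):
    """Score every label for one image treated as a bag of instances.

    instance_feats: (n_instances, d_visual) features, one per candidate box
    proj:           (d_visual, d_semantic) assumed visual-to-semantic projection
    label_embs:     (n_labels, d_semantic) word vectors of seen or unseen labels
    """
    # Map each instance into the semantic embedding space.
    projected = instance_feats @ proj            # (n_instances, d_semantic)
    # Similarity of every instance to every label embedding.
    inst_scores = projected @ label_embs.T       # (n_instances, n_labels)
    # MIL max pooling: a bag scores as high for a label as its best instance.
    bag_scores = inst_scores.max(axis=0)         # (n_labels,)
    # Index of the instance (box) that explains each label.
    best_instance = inst_scores.argmax(axis=0)   # (n_labels,)
    return bag_scores, best_instance

# Toy usage with random data: 5 boxes, 8-d visual features, 4-d word vectors.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
proj = rng.normal(size=(8, 4))
unseen_label_embs = rng.normal(size=(3, 4))      # any number of labels works
scores, boxes = mil_tag_scores(feats, proj, unseen_label_embs)
```

Under these assumptions, `best_instance` is what ties a predicted label to a box: the label is attributed to the instance whose projected feature matches its embedding best.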


