Structured Prediction with Output Embeddings for Semantic Image Annotation

09/07/2015
by Ariadna Quattoni, et al.

We address the task of annotating images with semantic tuples. Solving this problem requires an algorithm that can handle hundreds of classes for each argument of the tuple. In such settings, data sparsity becomes a key challenge, because many classes have only a few examples available. We propose to handle this by incorporating feature representations of both the inputs (images) and the outputs (argument classes) into a factorized log-linear model, exploiting the flexibility of scoring functions based on bilinear forms. Experiments show that integrating feature representations of the outputs into the structured prediction model leads to better overall predictions. We also conclude that the best output representation is specific to each type of argument.
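To make the idea of a bilinear scoring function concrete, here is a minimal sketch of a single factor for one tuple argument. It is not the authors' implementation: the function names, the toy feature vectors, the class labels, and the matrix `W` are all hypothetical, and the full model in the paper combines several such factors. The score of a class is x^T W y, where x is the image feature vector and y is the feature representation (embedding) of the class, and a softmax over classes gives the log-linear distribution.

```python
import math

def bilinear_score(x, W, y):
    """Return x^T W y for input features x (length d),
    output-class features y (length k), and a d x k matrix W."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))

def class_distribution(x, W, output_embeddings):
    """Log-linear (softmax) distribution over the classes of one argument."""
    scores = {label: bilinear_score(x, W, emb)
              for label, emb in output_embeddings.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    z = sum(exps.values())
    return {label: e / z for label, e in exps.items()}

# Hypothetical toy data (assumptions, not taken from the paper).
image_features = [1.0, 0.0]
W = [[1.0, 0.0],
     [0.0, 1.0]]
output_embeddings = {
    "dog": [1.0, 0.0],
    "cat": [0.0, 1.0],
    "puppy": [0.9, 0.1],  # rare class whose embedding lies close to "dog"
}

probs = class_distribution(image_features, W, output_embeddings)
```

Because all classes share the single matrix `W`, a class with few examples (here the hypothetical "puppy") still receives a sensible score when its embedding is close to that of a frequent class. This is one way to read the abstract's claim that output representations help with data sparsity.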

research
06/08/2017

Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision

Scene labeling is a challenging classification problem where each input ...
research
07/29/2020

Learning Output Embeddings in Structured Prediction

A powerful and flexible approach to structured prediction consists in em...
research
04/18/2023

Revisiting the Role of Similarity and Dissimilarity in Best Counter Argument Retrieval

This paper studies the task of best counter-argument retrieval given an ...
research
10/31/2021

Learning Debiased and Disentangled Representations for Semantic Segmentation

Deep neural networks are susceptible to learn biased models with entangl...
research
06/11/2019

Representation Learning-Assisted Click-Through Rate Prediction

Click-through rate (CTR) prediction is a critical task in online adverti...
research
12/03/2016

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Semantic sparsity is a common challenge in structured visual classificat...
research
08/16/2021

Efficient Feature Representations for Cricket Data Analysis using Deep Learning based Multi-Modal Fusion Model

Data analysis has become a necessity in the modern era of cricket. Every...
