Commonly Uncommon: Semantic Sparsity in Situation Recognition

12/03/2016
by   Mark Yatskar, et al.
0

Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set. This paper studies semantic sparsity in situation recognition, the task of producing structured summaries of what is happening in images, including activities, objects and the roles objects play within the activity. For this problem, we find empirically that most object-role combinations are rare, and current state-of-the-art models significantly underperform in this sparse data regime. We avoid many such errors by (1) introducing a novel tensor composition function that learns to share examples across role-noun combinations and (2) semantically augmenting our training data with automatically gathered examples of rarely observed outputs using web data. When integrated within a complete CRF-based structured prediction model, the tensor-based approach outperforms existing state of the art by a relative improvement of 2.11 accuracy, respectively. Adding 5 million images with our semantic augmentation techniques gives further relative improvements of 6.23 and noun-role accuracy.

READ FULL TEXT

page 1

page 8

research
07/02/2023

ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition

Situation Recognition is the task of generating a structured summary of ...
research
03/26/2020

Grounded Situation Recognition

We introduce Grounded Situation Recognition (GSR), a task that requires ...
research
07/29/2017

Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints

Language is increasingly being used to define rich visual recognition pr...
research
10/09/2018

Towards Verifying Semantic Roles Co-occurrence

Semantic role theory considers roles as a small universal set of unanaly...
research
09/07/2015

Structured Prediction with Output Embeddings for Semantic Image Annotation

We address the task of annotating images with semantic tuples. Solving t...
research
08/29/2023

3D Adversarial Augmentations for Robust Out-of-Domain Predictions

Since real-world training datasets cannot properly sample the long tail ...
research
12/17/2017

Probabilistic Semantic Retrieval for Surveillance Videos with Activity Graphs

We present a novel framework for finding complex activities matching use...

Please sign up or login with your details

Forgot password? Click here to reset