Visual Semantic Parsing: From Images to Abstract Meaning Representation

The success of scene graphs for visual scene understanding has brought attention to the benefits of abstracting a visual input (e.g., an image) into a structured representation, where entities (people and objects) are nodes connected by edges specifying their relations. Building these representations, however, requires expensive manual annotation in the form of images paired with their scene graphs or frames. These formalisms also remain limited in the kinds of entities and relations they can capture. In this paper, we propose to leverage a meaning representation widely used in natural language processing, the Abstract Meaning Representation (AMR), to address these shortcomings. Compared to scene graphs, which largely emphasize spatial relationships, our visual AMR graphs are more linguistically informed, with a focus on higher-level semantic concepts extrapolated from visual input. Moreover, they allow us to generate meta-AMR graphs that unify the information contained in multiple image descriptions under one representation. Through extensive experimentation and analysis, we demonstrate that an existing text-to-AMR parser can be repurposed to parse images into AMRs. Our findings point to important future research directions for improved scene understanding.


