Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

09/13/2022
by Keisuke Shirai, et al.

We present a new multimodal dataset called Visual Recipe Flow, which enables learning the result of each cooking action in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. Each state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.
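To make the structure described in the abstract concrete, here is a minimal sketch of how image pairs might be attached to nodes of a recipe flow graph. It is not the dataset's actual schema or file format: the class names, field names, node tags, edge labels, and image file names are all hypothetical, and modeling the r-FG as a directed graph with labeled edges is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FlowNode:
    """One node of the flow graph (hypothetical fields)."""
    node_id: int
    text: str  # span taken from the recipe text
    tag: str   # illustrative label such as "food" or "action"


@dataclass
class ImagePair:
    """A state change: the object before and after an action."""
    node_id: int  # graph node this pair is grounded in
    before: str   # path to the "before" image (hypothetical)
    after: str    # path to the "after" image (hypothetical)


@dataclass
class RecipeFlow:
    """Flow graph plus the image pairs grounded in its nodes."""
    nodes: Dict[int, FlowNode] = field(default_factory=dict)
    edges: List[Tuple[int, int, str]] = field(default_factory=list)
    image_pairs: List[ImagePair] = field(default_factory=list)

    def add_node(self, node_id: int, text: str, tag: str) -> None:
        self.nodes[node_id] = FlowNode(node_id, text, tag)

    def add_edge(self, src: int, dst: int, label: str) -> None:
        # Both endpoints must already exist in the graph.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("edge endpoint is not a node of the graph")
        self.edges.append((src, dst, label))

    def ground(self, node_id: int, before: str, after: str) -> None:
        # Grounding links an image pair to an existing graph node,
        # which is what gives the cross-modal relation.
        if node_id not in self.nodes:
            raise KeyError(f"unknown node {node_id}")
        self.image_pairs.append(ImagePair(node_id, before, after))

    def state_changes_for(self, node_id: int) -> List[ImagePair]:
        return [p for p in self.image_pairs if p.node_id == node_id]


def build_example() -> RecipeFlow:
    """Toy recipe step 'slice the tomato' with one grounded image pair."""
    flow = RecipeFlow()
    flow.add_node(0, "tomato", "food")
    flow.add_node(1, "slice", "action")
    flow.add_edge(0, 1, "target")
    flow.ground(1, "tomato_before.jpg", "tomato_after.jpg")
    return flow
```

In this sketch, calling `build_example().state_changes_for(1)` returns the single image pair attached to the "slice" node, while a node with no grounded images returns an empty list.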

research
06/19/2023

Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning

The robustness of multimodal deep learning models to realistic changes i...
research
05/04/2022

Visual Commonsense in Pretrained Unimodal and Multimodal Models

Our commonsense knowledge about objects includes their typical visual at...
research
03/22/2023

Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval

Text-to-image person retrieval aims to identify the target person based ...
research
10/11/2019

Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval

Image-text retrieval of natural scenes has been a popular research topic...
research
02/05/2021

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Recently multimodal named entity recognition (MNER) has utilized images ...
research
11/20/2022

How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation

Sarcasm generation has been investigated in previous studies by consider...
research
05/07/2022

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

Recently, online shopping has gradually become a common way of shopping ...
