Benchmarking Diverse-Modal Entity Linking with Generative Models

05/27/2023
by   Sijia Wang, et al.

Entities can be expressed in diverse formats, such as text, images, or column names and cell values in tables. While existing entity linking (EL) models work well on individual modality configurations, such as text-only EL, visual grounding, or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring these configurations together, we construct a benchmark for diverse-modal EL (DMEL) from existing EL datasets, covering all three modalities: text, image, and table. To approach the DMEL task, we propose a generative diverse-modal model (GDMM) that follows a multimodal encoder-decoder paradigm. Pre-training on rich corpora builds a solid foundation for DMEL without storing the entire knowledge base (KB) for inference. Fine-tuning GDMM yields a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 points on average. Additionally, we conduct extensive error analyses to highlight the challenges of DMEL and facilitate future research on this task.

research
06/22/2023

Generative Multimodal Entity Linking

Multimodal Entity Linking (MEL) is the task of mapping mentions with mul...
research
04/11/2022

Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning

Entities lie in the heart of biomedical natural language understanding, ...
research
09/09/2021

M5Product: A Multi-modal Pretraining Benchmark for E-commercial Product Downstream Tasks

In this paper, we aim to advance the research of multi-modal pre-trainin...
research
02/15/2022

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval

We introduce CommerceMM - a multimodal model capable of providing a dive...
research
05/24/2023

AMELI: Enhancing Multimodal Entity Linking with Fine-Grained Attributes

We propose attribute-aware multimodal entity linking, where the input is...
research
07/05/2022

Entity Linking in Tabular Data Needs the Right Attention

Understanding the semantic meaning of tabular data requires Entity Linki...
research
08/06/2021

StrucTexT: Structured Text Understanding with Multi-Modal Transformers

Structured text understanding on Visually Rich Documents (VRDs) is a cru...
