Flat Multi-modal Interaction Transformer for Named Entity Recognition

08/23/2022
by   Junyu Lu, et al.

Multi-modal named entity recognition (MNER) aims to identify entity spans and recognize their categories in social media posts with the aid of images. However, in dominant MNER approaches, the interaction between modalities is usually carried out by alternating self-attention and cross-attention, or by over-relying on a gating mechanism, which results in imprecise and biased correspondence between fine-grained semantic units of text and image. To address this issue, we propose a Flat Multi-modal Interaction Transformer (FMIT) for MNER. Specifically, we first use noun phrases in sentences and general domain words to obtain visual cues. Then, we transform the fine-grained semantic representations of vision and text into a unified lattice structure and design a novel relative position encoding to match different modalities in the Transformer. Meanwhile, we propose to leverage entity boundary detection as an auxiliary task to alleviate visual bias. Experiments show that our method achieves new state-of-the-art performance on two benchmark datasets.
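To make the idea of a unified lattice more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a FLAT-style layout in which every unit (token, noun phrase, or visual cue) is a span with a head and tail index, and the relative position between two spans is described by four head/tail distances. The names `Span`, `build_flat_lattice`, and `relative_distances` are hypothetical, as is the choice to give a visual cue the same head and tail as the noun phrase it is matched to.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Span:
    label: str     # surface form (text) or object tag (vision)
    head: int      # index of the first token the span covers
    tail: int      # index of the last token the span covers (inclusive)
    modality: str  # "text" or "vision"


def build_flat_lattice(
    tokens: List[str],
    phrases: List[Tuple[int, int]],
    visual_cues: List[Tuple[str, int, int]],
) -> List[Span]:
    """Flatten tokens, noun phrases, and visual cues into one span sequence.

    Assumption: each visual cue is anchored to the token range of the
    noun phrase it was retrieved for.
    """
    spans = [Span(tok, i, i, "text") for i, tok in enumerate(tokens)]
    spans += [Span(" ".join(tokens[s:e + 1]), s, e, "text") for s, e in phrases]
    spans += [Span(tag, s, e, "vision") for tag, s, e in visual_cues]
    return spans


def relative_distances(spans: List[Span]) -> Dict[str, List[List[int]]]:
    """Compute the four head/tail distance matrices between all span pairs."""
    return {
        "hh": [[a.head - b.head for b in spans] for a in spans],
        "ht": [[a.head - b.tail for b in spans] for a in spans],
        "th": [[a.tail - b.head for b in spans] for a in spans],
        "tt": [[a.tail - b.tail for b in spans] for a in spans],
    }


if __name__ == "__main__":
    tokens = ["Kevin", "Durant", "enters", "Oracle", "Arena"]
    lattice = build_flat_lattice(
        tokens,
        phrases=[(0, 1), (3, 4)],
        visual_cues=[("person", 0, 1)],
    )
    dist = relative_distances(lattice)
    for span in lattice:
        print(span)
    # Distance from the head of "Oracle Arena" to the tail of "Kevin Durant"
    print(dist["ht"][6][5])
```

Under this assumed layout, a visual cue and its matched noun phrase have all four distances equal to zero, so a position-aware attention layer could treat them as occupying the same place in the sentence. In a real model these integer distances would typically be mapped to learned or sinusoidal embeddings before entering attention.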

research
10/19/2022

Multi-Granularity Cross-Modality Representation Learning for Named Entity Recognition on Social Media

Named Entity Recognition (NER) on social media refers to discovering and...
research
12/13/2021

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

Recently, Multi-modal Named Entity Recognition (MNER) has attracted a lo...
research
04/02/2019

Aiding Intra-Text Representations with Visual Context for Multimodal Named Entity Recognition

With massive explosion of social media such as Twitter and Instagram, pe...
research
06/10/2019

A Multi-task Approach for Named Entity Recognition in Social Media Data

Named Entity Recognition for social media data is challenging because of...
research
07/19/2023

Multi-Grained Multimodal Interaction Network for Entity Linking

Multimodal entity linking (MEL) task, which aims at resolving ambiguous ...
research
05/15/2023

A Novel Framework for Multimodal Named Entity Recognition with Multi-level Alignments

Mining structured knowledge from tweets using named entity recognition (...
research
04/24/2020

FLAT: Chinese NER Using Flat-Lattice Transformer

Recently, the character-word lattice structure has been proved to be eff...
