Knowledge Mining with Scene Text for Fine-Grained Recognition

03/27/2022
by Hao Wang, et al.

Recently, the semantics of scene text have been shown to be essential in fine-grained image classification. However, existing methods mainly exploit the literal meaning of scene text, which can be uninformative when the text is not closely related to the objects or scenes in the image. We propose an end-to-end trainable network that mines the implicit contextual knowledge behind scene text and enhances the semantics and correlation to fine-tune the image representation. Unlike existing methods, our model integrates three modalities: visual feature extraction, text semantics extraction, and correlation of background knowledge with fine-grained image classification. Specifically, we employ KnowBert to retrieve relevant knowledge for the semantic representation and combine it with image features for fine-grained classification. Experiments on two benchmark datasets, Con-Text and Drink Bottle, show that our method outperforms the state of the art by 3.72% mAP and 5.39% mAP, respectively. To further validate the effectiveness of the proposed method, we create a new dataset on crowd activity recognition for evaluation. The source code and the new dataset are available at https://github.com/lanfeng4659/KnowledgeMiningWithSceneText.
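The results above are reported in mAP (mean average precision). The abstract does not spell out the exact evaluation protocol, so the following is only a minimal sketch of the common non-interpolated definition: per-class average precision over a ranked list, averaged across classes. The function names and the toy scores are hypothetical and are not taken from the authors' code.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP for one class.

    scores: classifier confidence per image for this class.
    labels: 1 if the image belongs to the class, else 0.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each positive image.
            precision_sum += hits / rank
    # Assumption: a class with no positives contributes an AP of 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_class_scores: Sequence[Sequence[float]],
    per_class_labels: Sequence[Sequence[int]],
) -> float:
    """Mean of the per-class AP values."""
    aps = [
        average_precision(s, l)
        for s, l in zip(per_class_scores, per_class_labels)
    ]
    return sum(aps) / len(aps)


# Toy example with two classes (made-up numbers, for illustration only).
toy_scores = [[0.9, 0.8, 0.7, 0.6], [0.2, 0.9, 0.5]]
toy_labels = [[1, 0, 1, 0], [0, 1, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

In the toy example, the first class has positives at ranks 1 and 3, giving an AP of (1/1 + 2/3) / 2, and the second class ranks its only positive first, giving an AP of 1.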


