
-
StacMR: Scene-Text Aware Cross-Modal Retrieval
Recent models for cross-modal retrieval have benefited from an increasin...
-
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
Scene text instances found in natural images carry explicit semantic inf...
-
Document Visual Question Answering Challenge 2020
This paper presents results of Document Visual Question Answering Challe...
-
Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation
Image to image translation aims to learn a mapping that transforms an im...
-
Location Sensitive Image Retrieval and Tagging
People from different parts of the globe describe objects and concepts i...
-
Text Recognition – Real World Data and Where to Find Them
We present a method for exploiting weakly annotated images to improve te...
-
DocVQA: A Dataset for VQA on Document Images
We present a new dataset for Visual Question Answering on document image...
-
Multimodal grid features and cell pointers for Scene Text Visual Question Answering
This paper presents a new model for the task of scene text visual questi...
-
RoadText-1K: Text Detection Recognition Dataset for Driving Videos
Perceiving text is crucial to understand semantics of outdoor scenes and...
-
Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
Text contained in an image carries high-level semantics that can be expl...
-
Exploring Hate Speech Detection in Multimodal Publications
In this work we target the problem of hate speech detection in multimoda...
-
ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling – RRC-LSVT
Robust text reading from street view images provides valuable informatio...
-
ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)
This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-S...
-
ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition -- RRC-MLT-2019
With the growing cosmopolitan culture of modern cities, the need of robu...
-
ICDAR 2019 Competition on Scene Text Visual Question Answering
This paper presents final results of ICDAR 2019 Scene Text Visual Questi...
-
Selective Style Transfer for Text
This paper explores the possibilities of image style transfer applied to...
-
Scene Text Visual Question Answering
Current visual question answering datasets do not consider the rich sema...
-
Good News, Everyone! Context driven entity-aware captioning for news images
Current image captioning systems perform at a merely descriptive level, ...
-
Self-Supervised Visual Representations for Cross-Modal Retrieval
Cross-modal retrieval methods have been significantly improved in last y...
-
Self-Supervised Learning from Web Data for Multimodal Retrieval
Self-Supervised learning from multimodal image and text data allows deep...
-
Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images
Word spotting in natural scene images has many applications in scene und...
-
Single Shot Scene Text Retrieval
Textual information found in scene images provides high level semantic i...
-
Learning from #Barcelona Instagram data what Locals and Tourists post about its Neighbourhoods
Massive tourism is becoming a big problem for some cities, such as Barce...
-
Learning to Learn from Web Data through Deep Semantic Embeddings
In this paper we propose to learn a multimodal image and text embedding ...
-
TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces
The immense success of deep learning based methods in computer vision he...
-
Non-deterministic Behavior of Ranking-based Metrics when Evaluating Embeddings
Embedding data into vector spaces is a very popular strategy of pattern ...
-
The Robust Reading Competition Annotation and Evaluation Platform
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-est...
-
Self-supervised learning of visual features through embedding images into text topic spaces
End-to-end training from scratch of current deep architectures for new c...
-
TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild
Motivated by the success of powerful while expensive techniques to recog...
-
Improving patch-based scene text script identification with ensembles of conjoined networks
This paper focuses on the problem of script identification in scene text...
-
A fine-grained approach to scene text script identification
This paper focuses on the problem of script identification in unconstrai...
-
Visual Script and Language Identification
In this paper we introduce a script identification method based on hand-...
-
Object Proposals for Text Extraction in the Wild
Object Proposals is a recent computer vision technique receiving increas...
-
Sparse Radial Sampling LBP for Writer Identification
In this paper we present the use of Sparse Radial Sampling Local Binary ...
-
A Fast Hierarchical Method for Multi-script and Arbitrary Oriented Scene Text Extraction
Typography and layout lead to the hierarchical organisation of text in w...