
-
Few Shot Learning With No Labels
Few-shot learners aim to recognize new categories given only a small num...
read it
-
Visual Speech Enhancement Without A Real Visual Stream
In this work, we re-think the task of speech enhancement in unconstraine...
read it
-
Exploring Pair-Wise NMT for Indian Languages
In this paper, we address the task of improving pair-wise machine transl...
read it
-
Improving Word Recognition using Multiple Hypotheses and Deep Embeddings
We propose a novel scheme for improving the word recognition accuracy us...
read it
-
Table Structure Recognition using Top-Down and Bottom-Up Cues
Tables are information-rich structured objects in document images. While...
read it
-
Graphical Object Detection in Document Images
Graphical elements: particularly tables and figures contain a visual sum...
read it
-
CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images
Localizing page elements/objects such as tables, figures, equations, etc...
read it
-
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild
In this work, we investigate the problem of lip-syncing a talking face v...
read it
-
Document Visual Question Answering Challenge 2020
This paper presents results of Document Visual Question Answering Challe...
read it
-
Revisiting Low Resource Status of Indian Languages in Machine Translation
Indian language machine translation performance is hampered due to the l...
read it
-
Textual Description for Mathematical Equations
Reading of mathematical expression or equation in the document images is...
read it
-
IIIT-AR-13K: A New Dataset for Graphical Object Detection in Documents
We introduce a new dataset for graphical object detection in business do...
read it
-
Weakly Supervised Instance Segmentation by Learning Annotation Consistent Instances
Recent approaches for weakly supervised instance segmentations depend on...
read it
-
A Multilingual Parallel Corpora Collection Effort for Indian Languages
We present sentence aligned parallel corpora across 10 Indian Languages ...
read it
-
DocVQA: A Dataset for VQA on Document Images
We present a new dataset for Visual Question Answering on document image...
read it
-
Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval
Recognition and retrieval of textual content from the large document col...
read it
-
RoadText-1K: Text Detection Recognition Dataset for Driving Videos
Perceiving text is crucial to understand semantics of outdoor scenes and...
read it
-
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
Humans involuntarily tend to infer parts of the conversation from lip mo...
read it
-
Towards Automatic Face-to-Face Translation
In light of the recent breakthroughs in automatic machine translation sy...
read it
-
A Deep Learning Approach for Robust Corridor Following
For an autonomous corridor following task where the environment is conti...
read it
-
A Baseline Neural Machine Translation System for Indian Languages
We present a simple, yet effective, Neural Machine Translation system fo...
read it
-
ICDAR 2019 Competition on Scene Text Visual Question Answering
This paper presents final results of ICDAR 2019 Scene Text Visual Questi...
read it
-
Scene Text Visual Question Answering
Current visual question answering datasets do not consider the rich sema...
read it
-
A Cost Efficient Approach to Correct OCR Errors in Large Document Collections
Word error rate of an ocr is often higher than its character error rate....
read it
-
CVIT-MT Systems for WAT-2018
This document describes the machine translation system used in the submi...
read it
-
Self-Supervised Visual Representations for Cross-Modal Retrieval
Cross-modal retrieval methods have been significantly improved in last y...
read it
-
Low-Cost Transfer Learning of Face Tasks
Do we know what the different filters of a face network represent? Can w...
read it
-
Universal Semi-Supervised Semantic Segmentation
In recent years, the need for semantic segmentation has arisen across se...
read it
-
City-Scale Road Audit System using Deep Learning
Road networks in cities are massive and is a critical component of mobil...
read it
-
IDD: A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained Environments
While several datasets for autonomous navigation have become available i...
read it
-
Dissimilarity Coefficient based Weakly Supervised Object Detection
We consider the problem of weakly supervised object detection, where the...
read it
-
Improved Visual Relocalization by Discovering Anchor Points
We address the visual relocalization problem of predicting the location ...
read it
-
Investigating the generalizability of EEG-based Cognitive Load Estimation Across Visualizations
We examine if EEG-based cognitive load (CL) estimation is generalizable ...
read it
-
Class2Str: End to End Latent Hierarchy Learning
Deep neural networks for image classification typically consists of a co...
read it
-
Connecting Visual Experiences using Max-flow Network with Application to Visual Localization
We are motivated by the fact that multiple representations of the enviro...
read it
-
Learning Human Poses from Actions
We consider the task of learning to estimate human pose in still images....
read it
-
TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces
The immense success of deep learning based methods in computer vision he...
read it
-
Efficient Semantic Segmentation using Gradual Grouping
Deep CNNs for semantic segmentation have high memory and run time requir...
read it
-
Unsupervised Learning of Face Representations
We present an approach for unsupervised training of CNNs in order to lea...
read it
-
HWNet v2: An Efficient Word Image Representation for Handwritten Documents
We present a framework for learning efficient holistic representation fo...
read it
-
Learning Optimal Redistribution Mechanisms through Neural Networks
We consider a setting where p public resources are to be allocated among...
read it
-
SmartTennisTV: Automatic indexing of tennis videos
In this paper, we demonstrate a score based indexing approach for tennis...
read it
-
Towards Structured Analysis of Broadcast Badminton Videos
Sports video data is recorded for nearly every major tournament but rema...
read it
-
Unconstrained Scene Text and Video Text Recognition for Arabic Script
Building robust recognizers for Arabic has always been challenging. We d...
read it
-
An EEG-based Image Annotation System
The success of deep learning in computer vision has greatly increased th...
read it
-
Pose-Aware Person Recognition
Person recognition methods that use multiple body regions have shown sig...
read it
-
Self-supervised learning of visual features through embedding images into text topic spaces
End-to-end training from scratch of current deep architectures for new c...
read it
-
Automated Top View Registration of Broadcast Football Videos
In this paper, we propose a novel method to register football broadcast ...
read it
-
Generating Synthetic Data for Text Recognition
Generating synthetic images is an art which emulates the natural process...
read it
-
Trajectory Aligned Features For First Person Action Recognition
Egocentric videos are characterised by their ability to have the first p...
read it