What is the right way to represent document images?

03/03/2016
by Gabriela Csurka, et al.

In this article, we study the problem of document image representation based on visual features. We present a comprehensive experimental study that compares three types of visual document image representations: (1) traditional, so-called shallow features, such as the RunLength and Fisher-Vector descriptors, (2) deep features based on Convolutional Neural Networks, and (3) features extracted from hybrid architectures that take inspiration from the previous two. We evaluate these features on several tasks (i.e., classification, clustering, and retrieval) and in different setups (e.g., domain transfer) using several public and in-house datasets. Our results show that deep features generally outperform the other feature types when there is no domain shift and the new task is closely related to the one used to train the model. However, when a large domain or task shift is present, the shallow Fisher-Vector features generalize better and often obtain the best results.


research
01/13/2016

Document image classification, with a specific view on applications of patent images

The main focus of this paper is document image classification and retrie...
research
02/08/2017

Backpropagation Training for Fisher Vectors within Neural Networks

Fisher-Vectors (FV) encode higher-order statistics of a set of multiple ...
research
04/19/2023

Analyzing the Domain Shift Immunity of Deep Homography Estimation

Homography estimation is a basic image-alignment method in many applicat...
research
02/25/2015

Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval

This paper presents a new state-of-the-art for document image classifica...
research
01/29/2016

Hybrid CNN and Dictionary-Based Models for Scene Recognition and Domain Adaptation

Convolutional neural network (CNN) has achieved state-of-the-art perform...
research
10/15/2014

Efficient Image Categorization with Sparse Fisher Vector

In object recognition, Fisher vector (FV) representation is one of the s...
research
06/16/2020

Improving accuracy and speeding up Document Image Classification through parallel systems

This paper presents a study showing the benefits of the EfficientNet mod...
