Seeing Colors: Learning Semantic Text Encoding for Classification

08/31/2018
by   Shah Nawaz, et al.
8

The question we answer with this work is: can we convert a text document into an image to exploit best image classification models to classify documents? To answer this question we present a novel text classification method which converts a text document into an encoded image, using word embedding and capabilities of Convolutional Neural Networks (CNNs), successfully employed in image classification. We evaluate our approach by obtaining promising results on some well-known benchmark datasets for text classification. This work allows the application of many of the advanced CNN architectures developed for Computer Vision to Natural Language Processing. We test the proposed approach on a multi-modal dataset, proving that it is possible to use a single deep model to represent text and image in the same feature space.

READ FULL TEXT

page 3

page 4

page 6

research
10/03/2018

Image and Encoded Text Fusion for Multi-Modal Classification

Multi-modal approaches employ data from multiple input streams such as t...
research
03/31/2018

In-depth Question classification using Convolutional Neural Networks

Convolutional neural networks for computer vision are fairly intuitive. ...
research
11/08/2018

Doc2Im: document to image conversion through self-attentive embedding

Text classification is a fundamental task in NLP applications. Latest re...
research
06/16/2020

Improving accuracy and speeding up Document Image Classification through parallel systems

This paper presents a study showing the benefits of the EfficientNet mod...
research
03/11/2019

An Innovative Word Encoding Method For Text Classification Using Convolutional Neural Network

Text classification plays a vital role today especially with the intensi...
research
03/02/2023

Document Provenance and Authentication through Authorship Classification

Style analysis, which is relatively a less explored topic, enables sever...
research
08/19/2019

A novel text representation which enables image classifiers to perform text classification, applied to name disambiguation

Patent data are often used to study the process of innovation and resear...

Please sign up or login with your details

Forgot password? Click here to reset