Deep Neural Networks for Czech Multi-label Document Classification

01/13/2017
by Ladislav Lenc, et al.

This paper focuses on automatic multi-label classification of Czech text documents. Current approaches usually rely on pre-processing, which can have a negative impact (loss of information, additional implementation work, etc.). We therefore omit it and use deep neural networks that learn from simple features. This choice is motivated by their success in many other machine learning fields. Two different networks are compared: the first is a standard multi-layer perceptron, while the second is a popular convolutional network. Experiments on a Czech newspaper corpus show that both networks significantly outperform the baseline method, which uses a rich set of features with a maximum entropy classifier. We also show that the convolutional network gives the best results.
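
To make the multi-label setting concrete, the following is a minimal sketch of how a multi-layer perceptron can assign several labels to one document. It is not the authors' implementation: the abstract does not specify the output layer, so the per-label sigmoid outputs, the 0.5 decision threshold, the layer sizes, and all names (`mlp_multilabel`, `random_params`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def sigmoid(x):
    """Squash a real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def mlp_multilabel(features, params, threshold=0.5):
    """Forward pass of a one-hidden-layer MLP with independent label outputs.

    Assumption: each label gets its own sigmoid score, and every label whose
    score reaches `threshold` is assigned to the document.
    """
    hidden = [max(0.0, h) for h in dense(features, params["w1"], params["b1"])]
    scores = [sigmoid(z) for z in dense(hidden, params["w2"], params["b2"])]
    labels = [i for i, s in enumerate(scores) if s >= threshold]
    return scores, labels


def random_params(n_in, n_hidden, n_labels, seed=0):
    """Hypothetical untrained parameters, used only to make the sketch run."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(n_hidden, n_in),
        "b1": [0.0] * n_hidden,
        "w2": matrix(n_labels, n_hidden),
        "b2": [0.0] * n_labels,
    }


# Toy input: counts of six vocabulary words in one document (simple features).
params = random_params(n_in=6, n_hidden=4, n_labels=3)
features = [1.0, 0.0, 2.0, 0.0, 1.0, 0.0]
scores, labels = mlp_multilabel(features, params)
print(scores, labels)
```

Unlike a softmax output, which picks a single class, independent per-label scores allow a document to receive zero, one, or several labels, which is what distinguishes multi-label from multi-class classification.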

