KATE: K-Competitive Autoencoder for Text

05/04/2017 · by Yu Chen, et al.

Autoencoders have been successful in learning meaningful representations from image datasets. However, their performance on text datasets has not been widely studied. Traditional autoencoders tend to learn possibly trivial representations of text documents due to their confounding properties, such as high dimensionality, sparsity, and power-law word distributions. In this paper, we propose a novel k-competitive autoencoder, called KATE, for text documents. Due to the competition between the neurons in the hidden layer, each neuron becomes specialized in recognizing specific data patterns, and overall the model can learn meaningful representations of textual data. A comprehensive set of experiments shows that KATE can learn better representations than traditional autoencoders, including denoising, contractive, variational, and k-sparse autoencoders. Our model also outperforms deep generative models, probabilistic topic models, and even word representation models (e.g., Word2Vec) on several downstream tasks, such as document classification, regression, and retrieval.
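The competition described above can be sketched as a simple activation step: keep the k highest-magnitude hidden activations (the "winners"), zero out the rest, and reallocate the suppressed "energy" to the winners. This is a minimal, simplified sketch, not the paper's exact formulation (KATE handles positive and negative activations separately on tanh units); the function name, the equal split of energy among winners, and the amplification parameter `alpha` are illustrative assumptions here.

```python
import numpy as np

def k_competitive(z, k=2, alpha=1.0):
    """Simplified k-competitive step.

    Keeps the k highest-magnitude activations, zeros the rest,
    and adds the losers' summed absolute activation ("energy"),
    scaled by alpha and split equally, to each winner in the
    direction of the winner's sign.
    """
    z = np.asarray(z, dtype=float)
    winners = np.argsort(np.abs(z))[-k:]      # indices of the top-k |z|
    mask = np.zeros(z.shape, dtype=bool)
    mask[winners] = True
    lost_energy = np.abs(z[~mask]).sum()      # energy of suppressed neurons
    out = np.zeros_like(z)
    out[mask] = z[mask] + alpha * lost_energy * np.sign(z[mask]) / k
    return out

# Example: with k=2, only two neurons stay active, amplified
# by the energy taken from the suppressed ones.
print(k_competitive([0.1, -0.2, 0.5, -0.4], k=2, alpha=1.0))
```

Because the gradient flows only through the winners, each hidden neuron is pushed to specialize in the patterns it wins on, which is the intuition behind the learned representations being less trivial than those of an unconstrained autoencoder.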


Related research

- 09/09/2014 · Winner-Take-All Autoencoders
- 05/11/2020 · SCAT: Second Chance Autoencoder for Textual Data
- 02/13/2014 · Squeezing bottlenecks: exploring the limits of autoencoder semantic representation capabilities
- 01/07/2019 · Vector representations of text data in deep learning
- 09/15/2021 · Disentangling Generative Factors in Natural Language with Discrete Variational Autoencoders
- 02/08/2011 · On Nonparametric Guidance for Learning Autoencoder Representations
- 11/30/2017 · Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
