Few-shot Learning for Topic Modeling

04/19/2021
by Tomoharu Iwata, et al.

Topic models have been successfully used for analyzing text documents. However, existing topic models require many documents for training. In this paper, we propose a neural network-based few-shot learning method that can learn a topic model from just a few documents. The neural networks in our model take a small number of documents as inputs and output topic model priors. The proposed method trains the neural networks so that the expected test likelihood improves when the topic model parameters are estimated by maximizing the posterior probability under these priors with the EM algorithm. Since each step in the EM algorithm is differentiable, the proposed method can backpropagate the loss through the EM algorithm to train the neural networks. The expected test likelihood is maximized by a stochastic gradient descent method using a set of multiple text corpora with an episodic training framework. In our experiments on three real-world text document sets, we demonstrate that the proposed method achieves lower perplexity than existing methods.
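
The inner estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a PLSA-style topic model, assumes the priors are given as strictly positive pseudo-counts, and covers only the maximum a posteriori EM updates and a perplexity computation. The neural networks that would produce the priors and the backpropagation through the EM steps are omitted, and the names `map_em`, `perplexity`, `topic_word_prior`, and `doc_topic_prior` are hypothetical.

```python
import math


def normalise(values):
    """Scale a list of positive numbers so that it sums to one."""
    total = sum(values)
    return [v / total for v in values]


def map_em(counts, topic_word_prior, doc_topic_prior, num_steps=20):
    """MAP estimation by EM for a PLSA-style topic model (illustrative only).

    counts[d][v]           : count of word v in document d
    topic_word_prior[k][v] : assumed pseudo-count prior for topic k, word v (> 0)
    doc_topic_prior[k]     : assumed pseudo-count prior for topic k (> 0)
    """
    num_docs = len(counts)
    num_topics = len(topic_word_prior)
    vocab_size = len(topic_word_prior[0])

    # Initialise the parameters from the normalised priors.
    phi = [normalise(row) for row in topic_word_prior]
    theta = [normalise(doc_topic_prior) for _ in range(num_docs)]

    for _ in range(num_steps):
        # Accumulators start at the prior pseudo-counts (the MAP part).
        phi_acc = [list(row) for row in topic_word_prior]
        theta_acc = [list(doc_topic_prior) for _ in range(num_docs)]
        for d in range(num_docs):
            for v in range(vocab_size):
                n = counts[d][v]
                if n == 0:
                    continue
                # E-step: topic responsibilities for word v in document d.
                joint = [theta[d][k] * phi[k][v] for k in range(num_topics)]
                total = sum(joint)
                for k in range(num_topics):
                    resp = joint[k] / total
                    theta_acc[d][k] += n * resp
                    phi_acc[k][v] += n * resp
        # M-step: renormalise expected counts plus pseudo-counts.
        phi = [normalise(row) for row in phi_acc]
        theta = [normalise(row) for row in theta_acc]
    return theta, phi


def perplexity(counts, theta, phi):
    """Perplexity of the given documents under the fitted parameters."""
    log_lik = 0.0
    num_words = 0
    for d, doc in enumerate(counts):
        for v, n in enumerate(doc):
            if n == 0:
                continue
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_lik += n * math.log(p)
            num_words += n
    return math.exp(-log_lik / num_words)


# Toy data: two short documents over a four-word vocabulary.
counts = [[5, 4, 0, 0], [0, 0, 3, 6]]
# Hand-written priors standing in for neural network outputs (assumption).
topic_word_prior = [[2.0, 2.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0]]
doc_topic_prior = [1.0, 1.0]

theta, phi = map_em(counts, topic_word_prior, doc_topic_prior)
print(theta)
print(perplexity(counts, theta, phi))
```

Every update here uses only sums, products, and divisions, which is why such steps are differentiable with respect to the priors. Note that this sketch scores the same documents it was fitted on, whereas the abstract refers to likelihood on test data.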
