Graph Contrastive Topic Model

07/05/2023
by Zheheng Luo, et al.

Existing neural topic models (NTMs) with contrastive learning suffer from a sample bias problem caused by word frequency-based sampling strategies, which may produce false negative samples that are semantically similar to the prototypes. In this paper, we explore efficient sampling strategies and contrastive learning in NTMs to address this issue. We propose a new sampling assumption: negative samples should contain words that are semantically irrelevant to the prototype. Based on it, we propose the graph contrastive topic model (GCTM), which conducts graph contrastive learning (GCL) using informative positive and negative samples generated by a graph-based sampling strategy that leverages in-depth correlation and irrelevance among documents and words. In GCTM, we first model each input document as a document-word bipartite graph (DWBG), and construct positive and negative word co-occurrence graphs (WCGs), encoded by graph neural networks, to express in-depth semantic correlation and irrelevance among words. Based on the DWBG and WCGs, we design a document-word information propagation (DWIP) process that performs edge perturbation on the DWBG using multi-hop correlations and irrelevance among documents and words. This yields the desired negative and positive samples, which are used for GCL together with the prototypes to improve the learning of document topic representations and latent topics. We further show that GCL can be interpreted as a structured variational graph auto-encoder that maximizes the mutual information between latent topic representations of different perspectives on the DWBG. Experiments on several benchmark datasets demonstrate the effectiveness of our method for topic coherence and document representation learning compared with existing state-of-the-art methods.
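As a rough illustration of the contrastive objective described above, the sketch below computes an InfoNCE-style loss that pulls a prototype's representation toward its positive sample and pushes it away from negatives. This is not the authors' implementation: the function name, the use of cosine similarity, and the temperature value are assumptions for illustration only.

```python
import numpy as np

def info_nce_loss(prototype, positive, negatives, temperature=0.5):
    """InfoNCE-style contrastive loss for one prototype embedding.

    prototype: (d,) anchor topic representation
    positive:  (d,) embedding of the positive sample
    negatives: (k, d) embeddings of the negative samples
    """
    def cos(a, b):
        # Cosine similarity between two vectors.
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Temperature-scaled exponentiated similarities.
    pos_sim = np.exp(cos(prototype, positive) / temperature)
    neg_sims = np.exp([cos(prototype, n) / temperature for n in negatives])

    # Loss is low when the positive dominates the negatives.
    return -np.log(pos_sim / (pos_sim + neg_sims.sum()))
```

A prototype whose positive sample is semantically close (high cosine similarity) yields a lower loss than one paired with a dissimilar positive, which is what drives the representations of related documents together during training.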


