Probabilistic Latent Semantic Analysis

01/23/2013
by Thomas Hofmann, et al.
Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, with applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. Compared to standard Latent Semantic Analysis, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach that has a solid foundation in statistics. In order to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.
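
The abstract names two ingredients, a mixture decomposition over a latent class and tempered EM, without giving update equations. The following is a minimal sketch of how they could fit together, not the authors' implementation. It assumes the symmetric parameterization P(d, w) = sum_z P(z) P(d|z) P(w|z) and a tempered E-step in which the class-conditional likelihood is raised to a power beta (beta = 1 gives ordinary EM; beta < 1 flattens the posteriors). The function name `plsa_tempered_em`, its parameters, and the toy count matrix are hypothetical, chosen for illustration.

```python
import random


def plsa_tempered_em(counts, n_topics, beta=1.0, n_iter=50, seed=0):
    """Fit a small PLSA model by tempered EM (illustrative sketch).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z, p_d_z, p_w_z) with p_d_z[z][d] = P(d|z), p_w_z[z][w] = P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_dist(n):
        # Strictly positive random probability vector of length n.
        vals = [rng.random() + 1e-3 for _ in range(n)]
        total = sum(vals)
        return [v / total for v in vals]

    p_z = [1.0 / n_topics] * n_topics
    p_d_z = [random_dist(n_docs) for _ in range(n_topics)]
    p_w_z = [random_dist(n_words) for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_d = [[0.0] * n_docs for _ in range(n_topics)]
        acc_w = [[0.0] * n_words for _ in range(n_topics)]

        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # Tempered E-step (assumed form):
                # P(z|d,w) proportional to P(z) * (P(d|z) * P(w|z)) ** beta
                post = [p_z[z] * (p_d_z[z][d] * p_w_z[z][w]) ** beta
                        for z in range(n_topics)]
                total = sum(post)
                if total == 0.0:
                    continue  # numerical guard; skip this cell
                for z in range(n_topics):
                    r = n * post[z] / total  # expected count assigned to class z
                    acc_z[z] += r
                    acc_d[z][d] += r
                    acc_w[z][w] += r

        # M-step: renormalize the expected counts into probabilities.
        grand = sum(acc_z)
        for z in range(n_topics):
            if acc_z[z] > 0.0:
                p_d_z[z] = [v / acc_z[z] for v in acc_d[z]]
                p_w_z[z] = [v / acc_z[z] for v in acc_w[z]]
            p_z[z] = acc_z[z] / grand

    return p_z, p_d_z, p_w_z


if __name__ == "__main__":
    # Toy 4-document, 4-word count matrix with two visible blocks (made-up data).
    toy = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 5]]
    p_z, p_d_z, p_w_z = plsa_tempered_em(toy, n_topics=2, beta=0.9, n_iter=30)
    print("P(z):", [round(p, 3) for p in p_z])
```

In practice beta would be chosen on held-out data rather than fixed in advance; the value 0.9 above is only a placeholder.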

12/17/2012

A Tutorial on Probabilistic Latent Semantic Analysis

In this tutorial, I will discuss the details about how Probabilistic Lat...
03/07/2019

Quantum Latent Semantic Analysis

The main goal of this paper is to explore latent topic analysis (LTA), i...
03/14/2023

Improving information retrieval through correspondence analysis instead of latent semantic analysis

Both latent semantic analysis (LSA) and correspondence analysis (CA) are...
12/02/2015

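Probabilistic Latent Semantic Analysis (PLSA) untuk Klasifikasi Dokumen Teks Berbahasa Indonesia [Probabilistic Latent Semantic Analysis (PLSA) for Classification of Indonesian-Language Text Documents]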

One task that is included in managing documents is how to find substanti...
10/02/2012

A Semantic Approach for Automatic Structuring and Analysis of Software Process Patterns

The main contribution of this paper, is to propose a novel semantic appr...
07/25/2021

A Comparison of Latent Semantic Analysis and Correspondence Analysis for Text Mining

Both latent semantic analysis (LSA) and correspondence analysis (CA) use...

Code Repositories

PLSA

PLSA implementation via EM algorithm

plsa

a python implementation of plsa

PLSA

a python implementation of probabilistic latent semantic analysis (plsa) using EM algorithm

plsa

a probabilistic latent semantic analysis model in matlab programming

PSLDoc

Protein Subcellular Localization Prediction based on gapped-dipeptides and probabilistic latent semantic analysis