Quantum Latent Semantic Analysis

03/07/2019
by   Fabio A. González, et al.

The main goal of this paper is to explore latent topic analysis (LTA) in the context of quantum information retrieval. LTA is a valuable technique for document analysis and representation, which has been extensively used in information retrieval and machine learning. Different LTA techniques have been proposed, some based on geometrical modeling (such as latent semantic analysis, LSA) and others based on a strong statistical foundation. However, these two approaches are not usually combined. Quantum information retrieval has the remarkable virtue of combining both geometry and probability in a common principled framework. We build on this quantum framework to propose a new LTA method, which has a clear geometrical motivation but also supports a well-founded probabilistic interpretation. An initial exploratory experiment was performed on three standard data sets. The results show that the proposed method outperforms LSA on two of the three data sets. These results suggest that the quantum-motivated representation is an alternative for geometrical latent topic modeling worthy of further exploration.
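To make the remark about geometry and probability concrete, the sketch below shows one common way the two meet in quantum-style retrieval models: a document is treated as a unit vector, and the squared projection onto each vector of an orthonormal basis is read as a probability. This is an illustration only, not the method proposed in the paper. The three-term vocabulary, the hand-picked `topic_basis`, and the function names are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length so it can act as a 'state'."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vec]


def dot(u, v):
    """Plain inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))


def topic_probabilities(doc_counts, topic_basis):
    """Squared projections of the unit document vector onto each basis vector.

    If topic_basis is a complete orthonormal basis, the values are
    non-negative and sum to 1, so they can be read as a distribution.
    """
    state = normalize(doc_counts)
    return [dot(state, b) ** 2 for b in topic_basis]


# Hypothetical orthonormal "topic" basis over a 3-term vocabulary.
s = 1 / math.sqrt(2)
topic_basis = [
    [s, s, 0.0],
    [s, -s, 0.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical term counts for one document.
doc_counts = [3, 1, 2]

probs = topic_probabilities(doc_counts, topic_basis)
print(probs)
print(sum(probs))
```

The basis vectors give the geometry (directions in term space), while the squared projections give the probabilistic reading. In an LSA-like setting the basis would be learned from a term-document matrix rather than chosen by hand as it is here.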


research
01/23/2013

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis is a novel statistical technique ...
research
08/05/2020

Introductory review to quantum information retrieval

Recently people started to understand that applications of the mathemati...
research
07/11/2012

Applying Discrete PCA in Data Analysis

Methods for analysis of principal components in discrete data have exist...
research
12/02/2015

Probabilistic Latent Semantic Analysis (PLSA) untuk Klasifikasi Dokumen Teks Berbahasa Indonesia

One task that is included in managing documents is how to find substanti...
research
12/17/2012

A Tutorial on Probabilistic Latent Semantic Analysis

In this tutorial, I will discuss the details about how Probabilistic Lat...
research
02/26/2020

A hypergeometric test interpretation of a common tf-idf variant

Term frequency-inverse document frequency, or tf-idf for short, is a num...
research
04/11/2012

Concept Modeling with Superwords

In information retrieval, a fundamental goal is to transform a document ...
