Bibliographic Analysis on Research Publications using Authors, Categorical Labels and the Citation Network

09/21/2016
by Kar Wai Lim, et al.

Bibliographic analysis considers, among other things, authors' research areas, the citation network and the paper content. In this paper, we combine these three in a topic model that produces a bibliographic model of authors, topics and documents, using a nonparametric extension of a combination of the Poisson mixed-topic link model and the author-topic model. This gives rise to the Citation Network Topic Model (CNTM). We propose a novel and efficient inference algorithm for the CNTM to explore subsets of research publications from CiteSeerX. The publication datasets are organised into three corpora, totalling about 168k publications with about 62k authors. The queried datasets are made available online. On three publicly available corpora, in addition to the queried datasets, our proposed model demonstrates improved performance in both model fitting and document clustering compared to several baselines. Moreover, our model allows extraction of additional useful knowledge from the corpora, such as the visualisation of the author-topics network. Additionally, we propose a simple method to incorporate supervision into topic modelling to achieve further improvement on the clustering task.
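To make the combination of author information and citation links more concrete, the following is a minimal toy sketch, not the CNTM itself. It assumes a fixed number of topics with Dirichlet priors (whereas the abstract describes a nonparametric extension), one author per document, and a citation count between two documents drawn from a Poisson whose rate grows with the overlap of their topic proportions. All function names, parameter values and the `scale` factor are hypothetical choices made for illustration.

```python
import math
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from a Dirichlet via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def link_rate(theta_a, theta_b, scale):
    """Assumed link rate: scaled inner product of two topic-proportion vectors."""
    return scale * sum(p * q for p, q in zip(theta_a, theta_b))


def generate_toy_corpus(num_authors=3, num_docs=6, num_topics=4, seed=0):
    rng = random.Random(seed)
    # Each author has a distribution over topics (author-topic flavour).
    author_topics = [
        sample_dirichlet([0.5] * num_topics, rng) for _ in range(num_authors)
    ]
    # Assumption: every document has exactly one author.
    doc_authors = [rng.randrange(num_authors) for _ in range(num_docs)]
    # A document's topic proportions are drawn close to its author's.
    doc_topics = [
        sample_dirichlet([10.0 * p + 0.1 for p in author_topics[a]], rng)
        for a in doc_authors
    ]
    # Citation counts: Poisson with a rate driven by topic overlap.
    citations = {}
    for i in range(num_docs):
        for j in range(num_docs):
            if i != j:
                rate = link_rate(doc_topics[i], doc_topics[j], scale=2.0)
                citations[(i, j)] = sample_poisson(rate, rng)
    return author_topics, doc_authors, doc_topics, citations


if __name__ == "__main__":
    _, authors, _, cites = generate_toy_corpus()
    print("document authors:", authors)
    print("non-zero citation pairs:", {k: v for k, v in cites.items() if v > 0})
```

In this sketch, documents by the same author tend to share topics and therefore tend to link to each other more often, which is the intuition behind modelling authors, topics and citations jointly.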

research
09/22/2016

Bibliographic Analysis with the Citation Network Topic Model

Bibliographic analysis considers author's research areas, the citation n...
research
06/02/2017

Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

Much of scientific progress stems from previously published findings, bu...
research
04/24/2022

Co-citation and Co-authorship Networks of Statisticians

We collected and cleaned a large data set on publications in statistics....
research
06/16/2022

Research Topic Flows in Co-Authorship Networks

In scientometrics, scientific collaboration is often analyzed by means o...
research
03/30/2015

Infinite Author Topic Model based on Mixed Gamma-Negative Binomial Process

Incorporating the side information of text corpus, i.e., authors, time s...
research
02/11/2017

Citation-based clustering of publications using CitNetExplorer and VOSviewer

Clustering scientific publications is an important problem in bibliometr...
research
11/03/2018

DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora

Extracting common narratives from multi-author dynamic text corpora requ...
