Use of NoSQL database and visualization techniques to analyze massive scholarly article data from journals

by Gouri Ginde, et al.

Visualizing massive data is a challenging endeavor. Extracting data and representing it graphically can aid interpretation and knowledge discovery. Publishing research articles has become a way of life for academicians. Scholarly publications can shape the professional growth of authors and also advance the research and technological growth of a country, continent, or other region. The number of scholarly articles published in different domains by various journals has grown enormously. Information about articles, authors, their affiliations, citation counts, countries, publishers, references, and related attributes is a gold mine for statisticians and data analysts. When used skillfully through a visual analysis tool, this data can yield valuable understanding and support deeper analysis for researchers working in domains such as scientometrics and bibliometrics. Since the data is not readily available, we used Google Scholar, a comprehensive and free repository of scholarly articles, as the data source for our study. Data was scraped from Google Scholar, stored as a graph, and later visualized as nodes and their relationships, which revealed otherwise concealed information about the growing impact of articles, journals, and authors in their domains. The graphical analysis also depicted an author's evident domain shift, the spread of an author's research domains, the prediction of emerging domains and subdomains, and the detection of cartel behavior at the journal and author level. The Neo4j graph database was used in the background to store the data in a structured manner.
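As a rough illustration of the graph model the abstract describes, the following sketch stores authors, articles, and journals as nodes with typed relationships and then runs two simple queries. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses a plain in-memory structure from the Python standard library rather than Neo4j, the relationship names `WROTE`, `PUBLISHED_IN`, and `CITES` and all sample records are hypothetical, and the journal self-citation ratio is only one possible proxy for the cartel-like behavior mentioned above.

```python
from collections import defaultdict


class ScholarGraph:
    """Tiny in-memory property graph (a stand-in for a graph database)."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"label": ..., other properties}
        self.edges = defaultdict(set)   # (source id, relationship type) -> target ids

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        self.edges[(source, rel_type)].add(target)

    def neighbors(self, source, rel_type):
        # .get avoids creating empty entries in the defaultdict while querying
        return self.edges.get((source, rel_type), set())

    def journal_of(self, article_id):
        """Return the journal an article was published in, or None."""
        return next(iter(self.neighbors(article_id, "PUBLISHED_IN")), None)

    def author_domains(self, author_id):
        """Sorted research domains of the articles an author wrote."""
        articles = self.neighbors(author_id, "WROTE")
        return sorted({self.nodes[a]["domain"] for a in articles})

    def journal_self_citation_ratio(self, journal_id):
        """Share of citations made by a journal's articles that stay in that journal."""
        total = same = 0
        for (source, rel_type), targets in self.edges.items():
            if rel_type != "CITES" or self.journal_of(source) != journal_id:
                continue
            for target in targets:
                total += 1
                if self.journal_of(target) == journal_id:
                    same += 1
        return same / total if total else 0.0


# Hypothetical sample data, invented purely for illustration.
graph = ScholarGraph()
graph.add_node("j1", "Journal", name="Journal A")
graph.add_node("j2", "Journal", name="Journal B")
graph.add_node("a1", "Article", domain="bibliometrics")
graph.add_node("a2", "Article", domain="bibliometrics")
graph.add_node("a3", "Article", domain="graph databases")
graph.add_node("au1", "Author", name="Author One")

graph.add_edge("a1", "PUBLISHED_IN", "j1")
graph.add_edge("a2", "PUBLISHED_IN", "j1")
graph.add_edge("a3", "PUBLISHED_IN", "j2")
graph.add_edge("au1", "WROTE", "a1")
graph.add_edge("au1", "WROTE", "a3")
graph.add_edge("a1", "CITES", "a2")   # stays inside Journal A
graph.add_edge("a1", "CITES", "a3")   # leaves Journal A
graph.add_edge("a2", "CITES", "a1")   # stays inside Journal A

ratio_j1 = graph.journal_self_citation_ratio("j1")
domains_au1 = graph.author_domains("au1")
```

In this toy data, two of the three citations made by Journal A's articles point back to Journal A, and the single author spans two domains. In a graph database such as Neo4j, the same questions would typically be expressed as pattern-matching queries over the stored nodes and relationships.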




