Citation Analysis of Computer Systems Papers
Citation analysis is used extensively in the bibliometrics literature to assess the impact of individual works, researchers, institutions, and even entire fields of study. In this paper, we analyze citations in one large and influential field within computer science, namely computer systems. Using citation data from a cross-sectional sample of 2,088 papers in 50 systems conferences from 2017, we examine four research questions: the overall distribution of systems citations; their evolution over time; the differences between citation databases (Google Scholar and Scopus) for systems papers; and the characteristics of self-citations in the field. We find that only 1.5 accrued at least 100 citations, both statistics comparing favorably to many other scientific fields. The most cited subfields and conference areas within systems were security, databases, and computer architecture. Most papers received their first citation within a year of publication, and the median citation count continued to grow at an almost linear rate over five years, with only a few papers peaking before that. We also find that early citations could be linked to papers with a freely available preprint, or may be primarily composed of self-citations. The ratio of self-citations to total citations starts relatively high for most papers but appears to stabilize by 12–18 months, at which point highly cited papers revert to predominantly external citations. Past self-citation count (taken from each paper's reference list) appears to bear little if any relationship to the future self-citation count of each paper. The choice of citation database also makes little difference in relative citation comparisons, despite marked differences in absolute counts.