Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Usage, and Impact

by Tarek Saier, et al.

Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches depend heavily on the completeness of such data. One particular shortcoming of current scholarly data is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications written in different languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as the impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations going primarily to local non-English languages, and consistency in citation intent between cross-lingual and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.




X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents

The number of scientific publications nowadays is rapidly increasing, ca...

Quantifying the higher-order influence of scientific publications

Citation impact is commonly assessed using direct, first-order citation ...

Shorter Distances between Papers over Time are Due to More Cross-Field References and Increased Citation Rate to Higher Impact Papers

The exponential increase in the number of scientific publications raises...

Non-English language publications in Citation Indexes – quantity and quality

We analyzed publications data in WoS and Scopus to compare publications ...

unarXive 2022: All arXiv Publications Pre-Processed for NLP, Including Structured Full-Text and Citation Network

Large-scale data sets on scholarly publications are the basis for a vari...

Thirty-Two Years of IEEE VIS: Authors, Fields of Study and Citations

The IEEE VIS Conference (VIS) recently rebranded itself as a unified con...

Measuring Social Media Activity of Scientific Literature: An Exhaustive Comparison of Scopus and Novel Altmetrics Big Data

This paper measures social media activity of 15 broad scientific discipl...
