Phocus: Picking Valuable Research from a Sea of Citations

01/09/2022
by   Xinrong Zhang, et al.
Tsinghua University

The deluge of new papers has significantly hindered academic progress, and it is driven mainly by author-level and publication-level evaluation metrics that focus only on quantity. Those metrics have caused severe problems: they make it hard for scholars to stay focused on an important research direction for a long time and even promote an impetuous academic atmosphere. To address those problems, we propose Phocus, a novel academic evaluation mechanism for authors and papers. Phocus analyzes the sentence containing a citation and its contexts to predict the sentiment towards the corresponding reference. Combining other factors, Phocus classifies citations coarsely, ranks all references within a paper, and uses the results of the classifier and the ranking model to obtain the local influential factor of a reference with respect to the citing paper. The global influential factor of the reference to the citing paper is the product of the local influential factor and the total influential factor of the citing paper. Consequently, an author's academic influential factor is the sum of his contributions to each paper he co-authors.


1. Introduction

Figure 1. The number of new publications on IEEE Xplore each year from 2000 to 2021.

The number of papers published each year has grown rapidly. For example, as shown in Figure 1, the number of new papers on IEEE Xplore (https://ieeexplore.ieee.org/Xplore/home.jsp) has increased sharply over the past two decades.

This paper boom causes several severe problems. Cortes et al. (Cortes and Lawrence, 2021) revisit the 2014 NeurIPS review experiment and find that peer review could identify poor papers but was unable to reliably pick out excellent research. Chu et al. (Chu and Evans, 2021) reveal that publishing too many papers each year in a field hinders its development, for two reasons: first, researchers are busy coping with the flood of papers and do not have enough time to fully absorb novel ideas; second, focused attention on a promising idea can be broken up by the deluge of new ones.

A key reason for the sharp increase in papers is that evaluation metrics for researchers focus on the number of papers. From scientific output and research funding to the evaluation of professional rank, papers play a very important role, and the more papers, the better. It is time to change this. Purely quantitative metrics cannot evaluate the real academic impact of a scholar or a paper: they ignore the essential differences between citations, which is a fatal flaw. Seglen strongly opposes using impact factors to measure the academic influence of journals, since committees seldom have the specialist insight needed to assess primary research (Seglen, 1997).

We propose Phocus, a novel evaluation mechanism for scholars and publications. Phocus analyzes the sentence containing a citation and its contexts to predict the sentiment polarity towards the corresponding reference. Besides, Phocus also considers the total number of citations, the number of citations per sentence, author overlap, and the number of references, similar to (Valenzuela et al., 2015). Given those factors, Phocus uses a Naive Bayes classifier to divide citations coarsely into 4 categories and uses the LambdaMART model to rank all references within a paper. Combining the categories and the ranking results, every reference gets a local influential factor with respect to the citing paper. The global influential factor of the reference to the citing paper is the product of the local influential factor and the total influential factor of the citing paper. Consequently, an author's academic influential factor is the sum of his contributions to each paper he co-authors.

2. Related Work

Our work involves citation classification, aspect-based sentiment analysis, ranking models, and academic evaluation metrics, which are introduced in the subsections below.

2.1. Citation Classification

In fact, many studies have already focused on citation classification. For example, Teufel et al. (Teufel et al., 2006) classify citation intents into 12 classes, using simple regular-expression matching to extract features. Valenzuela et al. (Valenzuela et al., 2015) divide citations into 4 classes: highly influential, background, method, and results citations, using an SVM with an RBF kernel and random forests over 13 features, including: the total number of direct citations, the number of direct citations per section, the total number of indirect citations and the number of indirect citations per section, author overlap, whether the citation is considered helpful, whether the citation appears in a table or caption, the similarity between abstracts, PageRank (Page et al., 1999), the number of total citing papers after transitive closure, and the field of the cited paper. Jurgens et al. (Jurgens et al., 2016) define 7 classes of citation intents: background, motivation, uses, extension, continuation, comparison or contrast, and future, with a Random Forest classifier trained on 4 types of features: structural features; lexical, morphological, and grammatical features; field; and usage. Cohan et al. (Cohan et al., 2019) propose a multitask model using a BiLSTM and an attention mechanism, whose primary task is classifying citation intents and whose auxiliary tasks, predicting the section where the citation occurs and whether a sentence needs a citation, assist the primary task (https://github.com/allenai/scicite). They categorize intents into 3 classes: background information, method, and result comparison, and also build a citation intent dataset, SciCite. These works classify citations according to intent but ignore the sentiment of the citing paper towards its references, which is vital.

Butt et al. (Butt et al., 2015) use a Naive Bayes classifier to predict the sentiment polarity of a sentence containing a citation and its contexts, whereas Liu (Liu, 2017) uses averaged word embeddings to represent sentence vectors and classify sentiment polarities. However, the latter method yields the overall sentiment of the text rather than the precise sentiment towards the cited paper, so it cannot be applied directly.

2.2. Aspect-based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) defines exactly this kind of task: predicting the sentiment towards a specific target rather than towards the text as a whole. ABSA usually consists of two stages, locating aspects and analyzing sentiment; some works solve the problem in this two-stage way, while others handle both jointly.

To detect citation spans in Wikipedia, Fetahu et al. (Fetahu et al., 2017) propose a sequence classification method using a linear-chain CRF to decide, at the sub-sentence level, which text fragments are covered by a citation. Kaplan et al. (Kaplan et al., 2016) detect non-explicit citing sentences that surround an explicit citing sentence by exploiting the relational, entity, lexical, and grammatical coherence between them. Ma et al. (Ma et al., 2018) and Zerva et al. (Zerva et al., 2020) even try to find the sentences in the reference paper that are most relevant to the citing sentences. Qazvinian and Radev (Qazvinian and Radev, 2010) propose a method based on probabilistic inference to extract non-explicit citing sentences, modelling the sentences in an article and their lexical similarities as a Markov Random Field tuned to detect the patterns that context data create, and employing Belief Propagation to detect likely context sentences. Abu-Jbara and Radev (Abu-Jbara and Radev, 2012) determine the citation block by first segmenting the sentences and then classifying each word in the sentence as being inside or outside the citation block; finally, they aggregate the labels of all words in a segment to assign a label to the whole segment, using three different label aggregation rules (the majority label of the words, at least one of the words, or all of them). Kaplan et al. (Kaplan et al., 2009) propose a coreference-chain-based method for extracting citation blocks from research papers.

Given aspects, Sun et al. (Sun et al., 2019) construct an auxiliary sentence from an aspect and feed the sentence pair into a BERT-based model. Gao et al. (Gao et al., 2019) use three target-dependent variations of the BERT model. Bai et al. (Bai et al., 2021) propose a novel relational graph attention network (https://github.com/muyeby/RGAT-ABSA) that integrates typed syntactic dependency information.

Because errors accumulate in a pipeline, some researchers explore solutions that detect aspects and classify sentiment jointly. Wang et al. (Wang et al., 2010) propose the latent aspect rating analysis problem, which aims to analyze reviewers' latent opinions on an entity from several aspects. For a given entity, they define a set of aspect keywords and segment reviews to the aspect level; given the aspect segmentation, they use a novel latent rating regression model to calculate aspect ratings and corresponding weights. However, Wang et al. ignore the inter-dependencies between words and sentences, which causes great information loss. Ruder et al. (Ruder et al., 2016) propose a hierarchical bidirectional LSTM to model the inter-dependencies of sentences within a review, representing an aspect by the average of its entity and attribute embeddings. Hoang et al. (Hoang et al., 2019) use a sentence-pair classifier based on BERT (Devlin et al., 2019) to solve ABSA at the sentence and text levels. Hu et al. (Hu et al., 2019) propose a span-based extract-then-classify framework built on BERT (https://github.com/huminghao16/SpanABSA). Xu et al. (Xu et al., 2019) build a dataset, ReviewRC (https://howardhsu.github.io/dataset/), and extend BERT with an extra task-specific layer tuned for each task. Wallaart and Frasincar (Wallaart and Frasincar, 2019) propose a two-stage algorithm for ABSA on restaurant reviews: predicting the sentiment with a lexicalized domain ontology, and using a neural network with a rotatory attention mechanism (LCR-Rot) as a backup algorithm, where the order of the rotatory attention operations is changed and the mechanism is iterated multiple times. Trusca et al. (Trusca et al., 2020) extend (Wallaart and Frasincar, 2019) with deep contextual word embeddings and add an extra attention layer to its high-level representations. To address the imbalance issue and exploit the interaction between aspect terms, Luo et al. (Luo et al., 2020) propose a gradient harmonized and cascaded labelling model based on BERT. Chen et al. (Chen et al., 2020) use directional graph convolutional networks to perform the ABSA task end to end.

2.3. Ranking Model

Our ranking model is based on LambdaMART, the boosted-tree version of LambdaRank (Burges et al., 2006), which handles the gradients of the non-smooth cost functions used in ranking. Burges (Burges, 2010) gives a review of RankNet, LambdaRank, and LambdaMART.

To illustrate the ranking network, let $c_{i,j}$ denote the $j$-th citation of the $i$-th reference paper. The ranking network receives a matrix of shape $(m, 4)$, where $m$ is the total number of citations and each row is the feature quaternion (au_overlap, n_cit, cit_word, sen_label) of one citation; cit_word is the total number of words in the sentence containing the citation. The network computes a score for each citation individually and averages over duplicate citations of the same reference to obtain a score $s_i$ for each reference paper $r_i$; the scores $s_i$ are then used to rank all reference papers.
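As a minimal sketch of this aggregation step (the scoring function is left abstract and simply stands in for the trained ranking network), per-citation scores can be grouped by reference and averaged as follows:

```python
import numpy as np

def rank_references(features, ref_ids, score_fn):
    """Score each citation, average duplicate citations per reference, and rank.

    features: array of shape (m, 4) holding (au_overlap, n_cit, cit_word, sen_label).
    ref_ids:  cit_id of the reference each citation points to (length m).
    score_fn: any per-citation scoring function, e.g. the trained ranking network.
    """
    per_citation = np.asarray([score_fn(row) for row in features])
    ref_ids = np.asarray(ref_ids)
    # average over duplicate citations of the same reference
    scores = {rid: per_citation[ref_ids == rid].mean() for rid in set(ref_ids)}
    # references sorted by averaged score, highest contribution first
    return sorted(scores, key=scores.get, reverse=True)
```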

2.4. Evaluation Metrics

In academia, there are journal-level, author-level, and paper-level metrics that measure impact.

The Impact Factor (IF) (Milstead, 1980) and CiteScore (https://service.elsevier.com/app/answers/detail/a_id/14880/supporthub/scopus/) measure the impact of a journal based on the number of times its articles are cited during a fixed period. Besides, Journal Citation Reports (JCR) rank journals (https://jcr.clarivate.com/jcr/home), Eigenfactor scores (Bergstrom, 2007) measure how likely a journal is to be used, and the SCImago Journal Rank (SJR) (Gonzalez-Pereira et al., 2009) regards citations issued by more important journals as more important than those issued by less important ones. The Source Normalized Impact per Paper (SNIP) (Moed, 2010) gives a single citation more weight in subject areas where citations are rare, and vice versa.

Author-level metrics include the h-index, g-index, i10-index, and so on. The h-index, proposed by Jorge E. Hirsch (Hirsch, 2005), is the largest number $h$ such that the author has $h$ papers each cited at least $h$ times. The g-index is the largest number $g$ such that the top $g$ articles together received at least $g^2$ citations (Egghe, 2006). Google Scholar proposes the i10-index, the number of publications with at least 10 citations. These metrics are all derived from raw citation counts and do not reveal the differences hidden among citations.
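For concreteness, all three indices can be computed directly from a list of per-paper citation counts; a small sketch (the citation counts in the example are illustrative only):

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    counts = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    counts = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(counts, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

print(h_index([25, 8, 5, 3, 3]), g_index([25, 8, 5, 3, 3]), i10_index([25, 8, 5, 3, 3]))
```

For these example counts the script prints 3 5 1.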

Paper-level metrics are usually just the number of citations. Notably, Semantic Scholar makes a first step towards citation classification: it divides citations into 4 classes, highly influential, background, method, and results citations (Valenzuela et al., 2015), using an SVM with an RBF kernel and random forests over the same features listed in Section 2.1.

3. Methodology

Figure 2. The overview of Phocus.

As shown in Figure 2, our algorithm consists of 4 stages: pre-processing, calculating factors, evaluating contribution, and propagating influential factors. In the pre-processing stage, we clean the raw data and obtain the simple factors. Complex factors, such as sentiment polarity, are calculated in the second stage. Once all needed factors are obtained, we classify citations into four classes, rank all references, and derive the local contribution factor of each reference. Every new paper added to the database is initialized with an academic influential factor of 1.0, and its impact is propagated to its references iteratively. The factors extracted from papers are listed in Table 1.

3.1. Pre-processing

Name | Definition | Range
cit_id | reference number of a paper in the reference list | positive integer
cit_title | title of a reference | string
cit_author | authors of cit_title | list of authors
cit_year | publication year of cit_title | year
au_overlap | overlap between the authors of cit_title and of the citing paper | [0, 1]
sent_id | id of a sentence | natural number
sec_id | section id of a sentence | 0: related work, introduction; 1: main body; 2: conclusion
n_cit | number of times cit_id is cited in the citing paper | natural number
cit_text | text of the sentence that contains cit_id | string
context_a | related sentences preceding cit_text | string
context_b | related sentences following cit_text | string
sen_label | sentiment of the citing paper towards cit_id | -1: negative; 0: neutral; 1: positive

Table 1. Factor list.

Given a paper in string format, a series of steps process the raw data for the next stage: parsing, segmentation, and matching. Parsing divides the input text into title, authors, sections, and references. We use flair (https://pypi.org/project/flair/) to parse the title, authors, and publication year of the input paper and of its references. We segment the input paper at two levels: section level and sentence level. Section segmentation is based on keyword matching, with sections classified into three categories: 0 for related work, introduction, or other background material; 1 for the main body, including methodology, experiments, and so on; 2 for the conclusion and other parts. Sentences are segmented using regular-expression matching and are then labelled with IDs according to their order of appearance. Reference parsing yields the title, authors, publication year, and citation markers of each reference. Given this information, we locate citations in each sentence and match citation markers with their corresponding reference papers, from which the factors n_cit and cit_text follow directly. The factor au_overlap is calculated according to the following equation:

$\text{au\_overlap} = \frac{|A \cap B|}{|A \cup B|}$ (1)

where $A$ is the author set of the citing paper and $B$ is the author set of the reference paper.
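A small sketch of this factor, assuming the Jaccard-style normalization reconstructed in Equation (1) (the exact denominator used by Phocus is an assumption here):

```python
def author_overlap(citing_authors, reference_authors):
    """Overlap in [0, 1] between the author sets of the citing and cited papers."""
    a = {name.strip().lower() for name in citing_authors}
    b = {name.strip().lower() for name in reference_authors}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)  # assumed Jaccard-style normalization

# e.g. author_overlap(["X. Zhang", "Y. Li"], ["Y. Li", "Z. Wang"]) -> 1/3
```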

3.2. Calculating Factors

There are still three factors to obtain: context_a, context_b, and sen_label. We obtain context_a and context_b with BERT, and propose a novel aspect-based sentiment analysis algorithm to classify citation sentiment.

We fine-tune BERT on a manually annotated dataset containing over 1,000 sentence pairs labelled as "related" or "irrelevant", where each sentence pair is drawn from a single academic paper, and reach an accuracy of 94.5% on the evaluation set. To obtain the context of cit_text, we apply this classifier iteratively to sentence pairs $(s_{t-k}, s_t)$, where $s_t$ is the citing sentence, $s_1, \dots, s_n$ are all sentences of the paper, and $k$ increases from 1. Once an "irrelevant" pair is reported, the iteration stops and the related sentences before $s_t$ are taken as context_a. Another stopping criterion is that $s_{t-k}$ must be in the same paragraph as $s_t$. A similar procedure over pairs $(s_t, s_{t+k})$ yields context_b.
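A sketch of the expansion loop for context_a, assuming is_related(s1, s2) wraps the fine-tuned BERT sentence-pair classifier and same_paragraph(i, j) checks paragraph membership (both helper names are placeholders, not Phocus's actual API):

```python
def expand_context_before(sentences, t, is_related, same_paragraph):
    """Collect context_a: sentences before index t that the classifier deems related.

    sentences:      list of all sentences in the paper, in order.
    t:              index of the citing sentence (cit_text).
    is_related:     callable(s1, s2) -> bool, the BERT sentence-pair classifier.
    same_paragraph: callable(i, j) -> bool, True if sentences i and j share a paragraph.
    """
    k = 1
    while t - k >= 0 and same_paragraph(t - k, t) and is_related(sentences[t - k], sentences[t]):
        k += 1
    # sentences[t - k] failed a check, so context_a is everything strictly after it
    return sentences[t - k + 1 : t]
```

The mirror-image loop over pairs (s_t, s_{t+k}) gives context_b.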

3.3. Evaluating Contribution

After gathering all needed factors, we train a classifier to categorize citations into 4 classes: very important, important, neutral, and terrible. We also train a ranking model to predict the relative order of the references in terms of their contributions to the paper.

Label | Description
3 | extending the work; highly influenced by the work
2 | using the work
1 | related work
0 | negative sentiment towards the work

Table 2. The classifying standards of Phocus.

First, we classify citations into the four categories with a Naive Bayes classifier. The classifying standards are shown in Table 2; a larger label indicates a greater contribution.
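A hedged sketch of this classification step with scikit-learn, using the feature quaternion from Section 2.3 as input (the library choice, the Gaussian variant, and the toy training rows are assumptions, not the paper's exact setup):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# one row per citation: (au_overlap, n_cit, cit_word, sen_label)
X_train = np.array([
    [0.50, 3, 42,  1],   # extending / highly influenced by the work -> label 3
    [0.00, 2, 30,  1],   # using the work                            -> label 2
    [0.00, 1, 18,  0],   # related work                              -> label 1
    [0.00, 1, 25, -1],   # negative sentiment towards the work       -> label 0
])
y_train = np.array([3, 2, 1, 0])

clf = GaussianNB().fit(X_train, y_train)
print(clf.predict([[0.10, 2, 35, 1]]))  # predicted contribution class
```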

The ranking model is based on LambdaMART, the boosted-tree version of LambdaRank (Burges et al., 2006; Burges, 2010), as described in Section 2.3.
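A minimal sketch of training such a ranker with LightGBM's LambdaMART objective (LightGBM is assumed here only as a convenient implementation; the paper does not name a specific library, and the data below is a toy example):

```python
import numpy as np
import lightgbm as lgb

# per-citation features (au_overlap, n_cit, cit_word, sen_label), grouped by citing paper
X = np.array([[0.5, 3, 42, 1], [0.0, 1, 18, 0], [0.0, 2, 30, 1],
              [0.2, 4, 55, 1], [0.0, 1, 20, -1]])
y = np.array([3, 1, 2, 3, 0])   # relevance labels from the coarse classes in Table 2
group = [3, 2]                  # the first citing paper has 3 references, the second has 2

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50, min_child_samples=1)
ranker.fit(X, y, group=group)

scores = ranker.predict(X[:3])  # scores for the first paper's references
order = np.argsort(-scores)     # highest contribution first
```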

Based on the classes and the relative order of the references, we project them into a fixed interval to obtain their local influential factors.

3.4. Propagating Influential Factors

Figure 3. The propagation rules of influential factors.

Given a list of references and their influential factors with respect to the citing paper, we design rules to propagate their influence. The main idea is shown in Figure 3.

Let $A$ denote a citing paper whose academic influential factor is initialized as 1, let $R = \{r_1, \dots, r_n\}$ denote all references of $A$, and let $c_i$ be the local contribution of reference $r_i$ to $A$. Let $P$ be the set of all papers that cite $A$, and for each $p \in P$ let $c_{p,A}$ be $A$'s local contribution to $p$. Then the academic influential factor of $A$ is:

$F(A) = 1 + \sum_{p \in P} c_{p,A} \, F(p)$ (2)

For an author $a$ who publishes a set of papers $Q = \{q_1, \dots, q_m\}$, with contribution $w_j$ to paper $q_j$, the academic influential factor of $a$ is:

$F(a) = \sum_{j=1}^{m} w_j \, F(q_j)$ (3)

Each paper's influence is thus divided among its authors according to their contributions. There are two properties to verify to ensure that our method is sound: the first concerns marginal effects, and the second the propagation rules.
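A sketch of the propagation stage under these rules, written directly from Equations (2) and (3) as reconstructed above (the base term of 1.0 and the fixed-point iteration scheme are assumptions):

```python
def propagate(citers, local_contrib, iterations=50):
    """Iteratively update academic influential factors (Eq. 2, as reconstructed).

    citers:        dict paper -> list of papers that cite it; every paper must
                   appear as a key, possibly with an empty list.
    local_contrib: dict (citing_paper, reference) -> local contribution of the
                   reference within that citing paper.
    """
    factor = {p: 1.0 for p in citers}
    for _ in range(iterations):
        factor = {
            p: 1.0 + sum(local_contrib[(q, p)] * factor[q] for q in qs)
            for p, qs in citers.items()
        }
    return factor

def author_factor(papers, share, factor):
    """Eq. 3, as reconstructed: sum of per-paper shares times paper factors."""
    return sum(share[p] * factor[p] for p in papers)

# toy example: B and C cite A with local contributions 0.6 and 0.3 respectively
citers = {"A": ["B", "C"], "B": [], "C": []}
local_contrib = {("B", "A"): 0.6, ("C", "A"): 0.3}
f = propagate(citers, local_contrib)
print(round(f["A"], 2))  # 1.9 = 1 + 0.6 * 1.0 + 0.3 * 1.0
```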

4. Experiments

We conduct several experiments to demonstrate our new metrics, which measure the influential factor of an individual scientist or scholar and the citation impact of publications.

Since the influential factor of a paper is the weighted sum of the factors of all papers that cite it and of its corresponding contributions to them, the final and full citation network of papers should be constructed. However, we cannot complete this yet, due to lack of access to some databases and limited time and computational resources. Instead, we select some scholars and their publications as targets and use primary and secondary citation relationships. Besides, we also compare our modules to other state-of-the-art algorithms to show the improvement we achieve.

4.1. Peer Comparison

Scholars and their publications. Let scholar Y denote an anonymized scholar. We show the difference between scholar Y and the Turing Award winner Pat Hanrahan (https://scholar.google.com/citations?hl=zh-CN&user=RzEnQmgAAAAJ). As we emphasize, our claim that Pat Hanrahan is much more influential than scholar Y rests not only on his Turing Award but also on solid citation statistics. For example, He et al. (He et al., 2015) take one paper of scholar Y as a baseline, and it outperforms only one of the eleven baselines. Table 4 shows the evaluation results for scholar Y and Pat Hanrahan on Aminer (https://www.aminer.cn/), Google Scholar (https://scholar.google.com/), Semantic Scholar (https://www.semanticscholar.org/), and Phocus.

Scholar | Aminer publications | Aminer citations | Google Scholar citations | Semantic Scholar publications | Semantic Scholar citations
Y | 1146 | 77903 | 78663 | 771 | 59679
Hanrahan | 381 | 52214 | 50568 | 315 | 56383

Table 3. Statistics of Y and Hanrahan.

Table 3 lists the number of publications and citations of scholar Y and Pat Hanrahan. Scholar Y is obviously more productive than Pat Hanrahan. However, those numbers cover up significant truths: not all papers are equally influential, and not all citations mean agreement with the cited work.

Scholar | Aminer h | Aminer g | Google Scholar h | Google Scholar i10 | Semantic Scholar h | Semantic Scholar HIC | Phocus (Primary)
Y | 131 | 258 | 123 | 723 | 119 | 5843 | 0.40
Hanrahan | 97 | 228 | 93 | 200 | 88 | 3741 | 0.52

Table 4. Evaluation results from several platforms.

Here h is the h-index, g the g-index, i10 the i10-index, and HIC the number of highly influential citations reported by Semantic Scholar, which classifies citations into highly influential, background, method, and results citations (Valenzuela et al., 2015); see Section 2.4 for the definitions of these metrics. All of them are derived from raw citation counts and do not reveal the differences among citations.

Figure 4. Papers B and C cite paper A directly, which we call primary citations; the citation from D to A is secondary, and the one from E to A is tertiary.

We collect XX papers that cite scholar Y out of the 78663 citing papers, and XX papers that cite Patrick Hanrahan out of the 56383. Using only primary citations, the global academic influential factors of scholar Y and Patrick Hanrahan are 0.40 and 0.52, respectively. Figure 4 illustrates the difference between primary, secondary, and tertiary citations.

4.2. Mathematical Invariance

To verify the model, we conduct a series of experiments to check that it behaves reasonably.

First, given the set of references within a paper, removing any one reference from the set does not change the relative order of the remaining references; when references are removed one at a time, the remaining references also keep their relative order.

Also, the final score should be stable and insensitive to the propagation order over a given paper pool. Our strategy starts every paper at a default influential factor of 1.0 and traverses the papers, updating their influential factors successively. Experiments show that regardless of the update order, the final score of each paper remains the same.

4.3. Citation Span

We conduct experiments using (Abu-Jbara and Radev, 2012) as our baseline. We annotate the citation span of about 345 citing sentences as the dataset for training and testing the baseline model.

First, we use the tokenizer provided by spaCy (https://spacy.io/) to segment the text of each citing sentence into tokens, and use its tagger and parser to assign part-of-speech tags and dependency labels to each token.
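For reference, the per-token annotation this step relies on can be produced with spaCy as follows (a small sketch; the example sentence and model name are illustrative, and the model must be installed separately):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this small English model is installed
doc = nlp("We follow the residual design proposed by He et al. [3].")
for token in doc:
    # token text, part-of-speech tag, dependency label, and syntactic head
    print(token.text, token.pos_, token.dep_, token.head.text)
```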

Then we extract the features listed in Table 5 as the input of the baseline model; a small training sketch follows the table. Training is performed with SVM, Logistic Regression, and CRF, respectively, using 10-fold cross-validation for training and testing.

Feature | Description
distance | the distance (in words) between the word and the target citation
position | 1 if the word comes before the target citation, 0 otherwise
segment | after splitting the sentence into segments by punctuation and coordinating conjunctions, 1 if the word occurs in the same segment as the target reference, 0 otherwise
pos_tag | the part-of-speech tag of the word, the word before, and the word after
dTreeDistance | length of the shortest dependency path (in the dependency parse tree) connecting the word to the target reference or its representative
lca | the type of the node in the dependency parse tree that is the least common ancestor of the word and the target reference

Table 5. Features used for citation span.
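A hedged sketch of the word-level training loop for the SVM and Logistic Regression baselines (feature extraction per Table 5 is assumed to have been done already; the rows below are illustrative, and the categorical pos_tag and lca features would be one-hot encoded in practice):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# one row per word: (distance, position, segment, dTreeDistance); toy values only
X = np.array([[1, 0, 1, 2], [5, 1, 0, 6], [2, 0, 1, 3], [7, 1, 0, 8],
              [1, 0, 1, 1], [6, 1, 0, 7], [3, 0, 1, 2], [8, 1, 0, 9]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = word lies inside the citation span

for name, model in [("SVM", SVC()), ("LR", LogisticRegression(max_iter=1000))]:
    # the paper uses 10-fold cross-validation; cv=2 here only because the toy data is tiny
    f1 = cross_val_score(model, X, y, cv=2, scoring="f1")
    print(name, round(f1.mean(), 2))
```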

Table 6 lists the precision, recall, and F1 of the three models.

Model | Precision | Recall | F1
SVM | 0.78 | 0.56 | 0.65
LR | 0.68 | 0.67 | 0.67
CRF | 0.65 | 0.64 | 0.64

Table 6. Results for three different models for citation span.

5. Results

As shown in Table 4, Phocus finds that the global academic influential factors of scholar Y and Patrick Hanrahan are 0.40 and 0.52, respectively, so Patrick Hanrahan's factor is 30% higher than scholar Y's. These results use only primary citation data. In contrast, the evaluation results from Aminer, Google Scholar, and even Semantic Scholar suggest that scholar Y is more productive and influential than Patrick Hanrahan.

6. Conclusion

In this paper, we propose Phocus, a novel set of academic evaluation metrics for authors and publications based on citation judgements obtained through aspect-based sentiment analysis. To verify the evaluation mechanism, we conduct a peer comparison and ablation studies. The results show that our metrics can identify the true worth of a paper or a scholar, which is difficult for citation-count-based metrics such as the h-index and g-index.

Phocus still needs improvement. As shown in Section 4, we only use primary citation data, which is not enough to fully establish the reliability of Phocus; using more data, such as secondary and tertiary citations, could further reveal the gaps between scholars and between metrics. There are also still unsolved problems, such as "citation circles" (groups of researchers who cite one another's work) and self-citation.

References

  • A. Abu-Jbara and D. Radev (2012) Reference scope identification in citing sentences. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 80–90. Cited by: §2.2, §4.3.
  • X. Bai, P. Liu, and Y. Zhang (2021) Investigating typed syntactic dependencies for targeted sentiment classification using graph attention neural network. IEEE/ACM Trans. Audio, Speech and Lang. Proc. 29, pp. 503–514. External Links: ISSN 2329-9290, Link, Document Cited by: §2.2.
  • C. T. Bergstrom (2007) Eigenfactor measuring the value and prestige of scholarly journals. College & Research Libraries News 68, pp. 314–316. Cited by: §2.4.
  • C. J. C. Burges, R. J. Ragno, and Q. V. Le (2006) Learning to rank with nonsmooth cost functions. In NIPS, Cited by: §2.3, §3.3.
  • C. J. C. Burges (2010) From ranknet to lambdarank to lambdamart: an overview. Cited by: §2.3, §3.3.
  • B. H. Butt, M. Rafi, A. Jamal, R. S. U. Rehman, S. M. Z. Alam, and M. B. Alam (2015) Classification of research citations (crc). In CLBib@ISSI, Cited by: §2.1.
  • G. Chen, Y. Tian, and Y. Song (2020) Joint aspect extraction and sentiment analysis with directional graph convolutional networks. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain (Online), pp. 272–279. External Links: Link, Document Cited by: §2.2.
  • J. S. G. Chu and J. A. Evans (2021) Slowed canonical progress in large fields of science. Proceedings of the National Academy of Sciences of the United States of America 118. Cited by: §1.
  • A. Cohan, W. Ammar, M. van Zuylen, and F. Cady (2019) Structural scaffolds for citation intent classification in scientific publications. ArXiv abs/1904.01608. Cited by: §2.1.
  • C. Cortes and N. Lawrence (2021) Inconsistency in conference peer review: revisiting the 2014 neurips experiment. ArXiv abs/2109.09774. Cited by: §1.
  • J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019) BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv abs/1810.04805. Cited by: §2.2.
  • L. Egghe (2006) Theory and practise of the g-index. Scientometrics 69, pp. 131–152. Cited by: §2.4, §4.1.
  • B. Fetahu, K. Markert, and A. Anand (2017) Fine grained citation span for references in Wikipedia. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, pp. 1990–1999. External Links: Link, Document Cited by: §2.2.
  • Z. Gao, A. Feng, X. Song, and X. Wu (2019) Target-dependent sentiment classification with bert. IEEE Access 7, pp. 154290–154299. Cited by: §2.2.
  • B. Gonzalez-Pereira, V. Guerrero-Bote, and F. Moya-Anegon (2009) The sjr indicator: a new indicator of journals’ scientific prestige. External Links: 0912.4141 Cited by: §2.4.
  • K. He, X. Zhang, S. Ren, and J. Sun (2015) Deep residual learning for image recognition. External Links: 1512.03385 Cited by: §4.1.
  • J. E. Hirsch (2005) An index to quantify an individual’s scientific research output. Proc. Natl. Acad. Sci. USA 102, pp. 16569–16572. Cited by: §2.4, §4.1.
  • M. Hoang, O. A. Bihorac, and J. Rouces (2019) Aspect-based sentiment analysis using bert. In NODALIDA, Cited by: §2.2.
  • M. Hu, Y. Peng, Z. Huang, D. Li, and Y. Lv (2019) Open-domain targeted sentiment analysis via span-based extraction and classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp. 537–546. External Links: Link, Document Cited by: §2.2.
  • D. Jurgens, S. Kumar, R. Hoover, D. A. McFarland, and D. Jurafsky (2016) Citation classification for behavioral analysis of a scientific field. ArXiv abs/1609.00435. Cited by: §2.1.
  • D. Kaplan, R. Iida, and T. Tokunaga (2009) Automatic extraction of citation contexts for research paper summarization: a coreference-chain based approach. In Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries (NLPIR4DL), pp. 88–95. Cited by: §2.2.
  • D. Kaplan, T. Tokunaga, and S. Teufel (2016) Citation block determination using textual coherence. Journal of Information Processing 24 (3), pp. 540–553. External Links: Document Cited by: §2.2.
  • H. Liu (2017) Sentiment analysis of citations using word2vec. ArXiv abs/1704.00177. Cited by: §2.1.
  • H. Luo, L. Ji, T. Li, D. Jiang, and N. Duan (2020) GRACE: gradient harmonized and cascaded labeling for aspect-based sentiment analysis. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online, pp. 54–64. External Links: Link, Document Cited by: §2.2.
  • S. Ma, J. Xu, and C. Zhang (2018) Automatic identification of cited text spans: a multi-classifier approach over imbalanced dataset. Scientometrics 116 (2), pp. 1303–1330. External Links: ISSN 0138-9130, Link, Document Cited by: §2.2.
  • J. L. Milstead (1980) Citation indexing—its theory and application in science, technology and humanities. wiley, oxford (1979), 274, $15.95. Information Processing and Management 16. Cited by: §2.4.
  • H. F. Moed (2010) Measuring contextual citation impact of scientific journals. Journal of Informetrics 4 (3), pp. 265–277. External Links: ISSN 1751-1577, Document, Link Cited by: §2.4.
  • L. Page, S. Brin, R. Motwani, and T. Winograd (1999) The pagerank citation ranking: bringing order to the web.. Technical Report Technical Report 1999-66, Stanford InfoLab, Stanford InfoLab. Note: Previous number = SIDL-WP-1999-0120 External Links: Link Cited by: §2.1, §2.4, §4.1.
  • V. Qazvinian and D. Radev (2010) Identifying non-explicit citing sentences for citation-based summarization.. In Proceedings of the 48th annual meeting of the association for computational linguistics, pp. 555–564. Cited by: §2.2.
  • S. Ruder, P. Ghaffari, and J. G. Breslin (2016) A hierarchical model of reviews for aspect-based sentiment analysis. In EMNLP, Cited by: §2.2.
  • P. O. Seglen (1997) Why the impact factor of journals should not be used for evaluating research. BMJ 314, pp. 497. Cited by: §1.
  • C. Sun, L. Huang, and X. Qiu (2019) Utilizing bert for aspect-based sentiment analysis via constructing auxiliary sentence. In NAACL, Cited by: §2.2.
  • S. Teufel, A. Siddharthan, and D. Tidhar (2006) Automatic classification of citation function. In EMNLP, Cited by: §2.1.
  • M. M. Trusca, D. Wassenberg, F. Frasincar, and R. Dekker (2020) A hybrid approach for aspect-based sentiment analysis using deep contextual word embeddings and hierarchical attention. In ICWE, Cited by: §2.2.
  • M. Valenzuela, V. A. Ha, and O. Etzioni (2015) Identifying meaningful citations. In AAAI Workshop: Scholarly Big Data, Cited by: §1, §2.1, §2.4, §4.1.
  • O. Wallaart and F. Frasincar (2019) A hybrid approach for aspect-based sentiment analysis using a lexicalized domain ontology and attentional neural models. In ESWC, Cited by: §2.2.
  • H. Wang, Y. Lu, and C. Zhai (2010) Latent aspect rating analysis on review text data: a rating regression approach. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. Cited by: §2.2.
  • H. Xu, B. Liu, L. Shu, and P. S. Yu (2019) BERT post-training for review reading comprehension and aspect-based sentiment analysis. In NAACL, Cited by: §2.2.
  • C. Zerva, M. Nghiem, N. T. H. Nguyen, and S. Ananiadou (2020) Cited text span identification for scientific summarisation using pre-trained encoders. Scientometrics (English). External Links: Document, ISSN 0138-9130 Cited by: §2.2.