1 Introduction
Peer review is a cornerstone of scientific publishing. It also functions as a gatekeeper for publication in top-tier computer-science conferences. To facilitate high-quality peer reviews, it is imperative that paper submissions are reviewed by qualified reviewers. In addition to assessing a reviewer's qualifications based on their prior publications (Charlin and Zemel, 2013), many conferences implement a paper bidding phase in which reviewers express their interest in reviewing particular papers. Facilitating bids is important because review quality is higher when reviewers are interested in a paper (Stent and Ji, 2018).
Unfortunately, paper bidding also creates the potential for difficult-to-detect adversarial behavior by reviewers. In particular, a reviewer may place high bids on papers by “friends” or colluding authors, even when those papers are outside of the reviewer's area of expertise, with the aim of getting those papers accepted without merit. Anecdotal evidence suggests that such bid manipulation attacks may have, indeed, influenced paper acceptance decisions in recent top-tier computer-science conferences (Vijaykumar, 2020).
This paper investigates the efficacy of bid manipulation attacks in a realistic paper-assignment system. We find that such systems are, indeed, very vulnerable to adversarial bids, which corroborates prior work (Jecmen et al., 2020). Furthermore, we design a paper-assignment system that is robust against bid manipulation attacks. Specifically, our system treats paper bids as supervision for a model of reviewer preferences, rather than directly using bids to assign papers. We then detect atypical patterns in the paper bids by measuring their influence on the model, and remove such high-influence bids as they are potentially malicious.
We evaluate the efficacy of our system on a novel, synthetic dataset of paper bids and assignments that we developed to facilitate the study of the robustness of paper-assignment systems. We carefully designed this dataset to match the statistics of real bidding data from recent computer-science conferences. We find that our system produces high-quality paper assignments on the synthetic dataset, while also providing robustness against groups of colluding, adversarial reviewers in a white-box setting in which the adversaries have full knowledge of the system's inner workings and its inputs. We hope our findings will help computer-science conferences perform high-quality paper assignments at scale, while also minimizing the attack surface for adversarial behavior by a few bad actors in their community.
2 Bid Manipulation Attacks
We start by investigating the effectiveness of bid manipulation attacks on a typical paper assignment system.
Paper assignment system. Most paper assignment systems utilize a computed score $s_{ij}$ for each reviewer-paper pair that reflects the degree of relevance between reviewer $i$ and paper $j$ (Hartvigsen et al., 1999; Goldsmith and Sloan, 2007; Tang et al., 2012; Charlin and Zemel, 2013). The conference organizer can then maximize a utility metric such as the total relevance score whilst maintaining appropriate balance constraints: i.e., every paper receives an adequate number of, say, $k$ reviewers, and every reviewer receives a manageable load of at most $l$ papers. This approach gives rise to the following optimization problem:
$\max_{A \in \{0,1\}^{R \times P}} \; \sum_{i=1}^{R} \sum_{j=1}^{P} s_{ij} A_{ij} \qquad (1)$
subject to $\sum_{i=1}^{R} A_{ij} = k$ for every paper $j$ and $\sum_{j=1}^{P} A_{ij} \le l$ for every reviewer $i$,
where $R$ and $P$ refer to the total number of reviewers and papers, respectively, and $A_{ij} = 1$ indicates that reviewer $i$ is assigned to paper $j$. Eq. 1 is an assignment problem that can be solved using standard techniques such as the Hungarian algorithm (Kuhn, 1955).
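To make the constraint structure of Eq. 1 concrete, the following sketch solves a toy instance by brute-force enumeration. The relevance scores are hypothetical, and a real deployment would use the Hungarian algorithm or a min-cost-flow solver rather than enumeration; this only illustrates the objective and constraints.

```python
from itertools import product

# Toy instance of Eq. 1: hypothetical scores S[i][j] for R=3 reviewers, P=2 papers.
S = [[0.9, 0.1],
     [0.2, 0.8],
     [0.5, 0.6]]

def best_assignment(S, k, l):
    """Maximize total relevance with exactly k reviewers per paper
    and at most l papers per reviewer (brute force, tiny inputs only)."""
    R, P = len(S), len(S[0])
    best, best_val = None, float("-inf")
    for flat in product([0, 1], repeat=R * P):
        A = [list(flat[i * P:(i + 1) * P]) for i in range(R)]
        if any(sum(A[i][j] for i in range(R)) != k for j in range(P)):
            continue  # each paper must get exactly k reviewers
        if any(sum(A[i]) > l for i in range(R)):
            continue  # each reviewer handles at most l papers
        val = sum(S[i][j] * A[i][j] for i in range(R) for j in range(P))
        if val > best_val:
            best, best_val = A, val
    return best, best_val

A, total = best_assignment(S, k=1, l=1)
```

Here the optimum assigns reviewer 0 to paper 0 and reviewer 1 to paper 1, for a total relevance of 1.7. The constraint matrix of Eq. 1 is totally unimodular, so the linear-programming relaxation is integral, which is why specialized solvers scale to conference-sized instances.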
The reviewer-paper relevance score, $s_{ij}$, is critical in obtaining high-quality assignments. Arguably, an ideal relevance score incorporates both the reviewer's expertise and their interest in the paper (Stent and Ji, 2018). Approaches for measuring expertise include computing the similarity of textual features between reviewers and papers (Dumais and Nielsen, 1992; Mimno and McCallum, 2007; Charlin and Zemel, 2013), as well as using authorship graphs (Rodriguez and Bollen, 2008; Liu et al., 2014). In addition to these features, paper assignment systems generally consider reviewer interest obtained via self-reported paper bids. For example, the NeurIPS2014 assignment system (Lawrence, 2014) uses a formula for $s_{ij}$ that incorporates the reviewer's and paper's subject area, the TPMS score (Charlin and Zemel, 2013), and the reviewer's bid. Each reviewer may bid on a paper as none, in a pinch, willing, or eager to express their preference (for simplicity, we exclude the option not willing, which expresses negative interest). The none option is the default bid when a reviewer did not enter a bid.
Bid manipulation attacks. Although incorporating reviewer interest via self-reported bids is beneficial to the overall assignment quality, it also allows a malicious reviewer to bid eager on a paper that is outside their area of expertise, with the sole purpose of influencing the acceptance decision of a paper that was authored by a “friend” or a “rival”. If a single bid has too much influence on the overall assignment, such bid manipulation attacks may be effective and jeopardize the integrity of the review process.
We demonstrate the feasibility of a simple black-box bid manipulation attack against the assignment system in Eq. 1. For a target paper, the malicious reviewer attacks the assignment system by bidding eager on that paper and none on all other papers. We evaluate the effectiveness of the attack by randomly picking 400 papers from our synthetic conference dataset (see Section 5), and determine paper assignments using Eq. 1 (with fixed balance parameters $k$ and $l$) using relevance scores from the NeurIPS2014 system (Lawrence, 2014). Fig. 1 (left) shows the fraction of adversarial reviewers that can secure their target paper in the final assignment via the bid manipulation attack. As an attack is easier if a reviewer is already ranked high for a particular paper (e.g., because nobody else bids on this paper, or the subject areas match), we visualize the success rate as a function of the rank of the “true” paper-reviewer relevance score. More precisely, we rank all reviewers by their original (pre-manipulation) relevance score and group them into bins of increasing size.
The light gray bar in each bin reports the assignment success rate if all reviewers bid honestly. In the absence of malicious reviewers, the majority of assignments go to reviewers ranked 1 to 7. However, with malicious bids, any reviewer stands a good chance of being assigned the target paper. For instance, the chance of getting a target paper for a reviewer ranked between 16 and 31 increases from 0% to over 70% when bidding maliciously. Even reviewers with the lowest ranks (2048 and lower) have a 40% chance of being assigned the target paper by just changing their bids. This possibility is especially concerning because it may be much easier for an author to corrupt a non-expert reviewer (i.e., a reviewer with a relatively low rank), simply because there are many more such reviewer candidates.
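The mechanics of this attack can be illustrated with a hypothetical additive scoring rule in the spirit of, but not identical to, the NeurIPS2014 formula; the bid weight and expertise values below are assumptions for illustration only.

```python
# Hypothetical additive scoring rule: s = expertise + w * bid.
# Bid values: none=0, in a pinch=1, willing=2, eager=3.
def score(expertise, bid, w=0.5):
    return expertise + w * bid

expertise = [0.9, 0.7, 0.2]   # reviewer 2 is a poor match for the target paper
honest_bids = [2, 1, 0]
honest_scores = [score(e, b) for e, b in zip(expertise, honest_bids)]
honest_rank = sorted(range(3), key=lambda i: -honest_scores[i])

# Reviewer 2 attacks by bidding eager (3) on the target paper.
attack_bids = [2, 1, 3]
attacked_scores = [score(e, b) for e, b in zip(expertise, attack_bids)]
attacked_rank = sorted(range(3), key=lambda i: -attacked_scores[i])
```

A single eager bid lifts the worst-matched reviewer from last place to second; with a slightly larger weight on bids, they would take first place.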
3 Predicting Relevance Scores
The success of the bid manipulation attack exposes an inherent tension in the assignment process. Assigning papers to a reviewer who has expressed explicit interest helps in eliciting high-quality feedback. However, relying too heavily on individual bids paves the way for misuse by malicious reviewers. To achieve a better trade-off, we propose to use the bids from all reviewers (of which the vast majority are honest) as labels to train a supervised model: the model's predicted bid serves as the similarity score $s_{ij}$, and all other indicators (e.g., subject-area matches, the TPMS score (Charlin and Zemel, 2013), and the paper title) serve as features. This indirect use of bids allows the scoring function to capture reviewer preferences but reduces the potential for abuse. Later, we will show that this approach also allows for the development of active defenses against bid manipulation attacks.
Scoring model. Let $X \in \mathbb{R}^{RP \times d}$ be a feature matrix consisting of $d$-dimensional feature vectors $x_{ij}$ for every pair of reviewer $i$ and paper $j$. Let $\mathcal{B}$ denote the set of possible bids in numerical form, e.g., $\mathcal{B} = \{0, 1, 2, 3\}$. We define $y \in \mathcal{B}^{RP}$ as the label vector containing the numerical bids for all reviewer-paper pairs. We define a ridge regressor that maps reviewer-paper features to the corresponding bids, similar to the linear regression model from Charlin and Zemel (2013):
$w^{*} = \arg\min_{w} \|Xw - y\|_2^2 + \lambda \|w\|_2^2. \qquad (2)$
To ensure that no single reviewer has disproportionate influence on the model, we restrict the maximum number of positive bids from a reviewer to be at most $b$, and subsample the bids of a reviewer whenever their number of bids exceeds $b$. In a typical CS conference, most reviewers bid on no more than 60 papers (out of thousands of submissions) (Shah et al., 2018).
The trained model can predict reviewer interest by computing a score for a reviewer-paper pair $(i, j)$ as follows:
$s_{ij} = x_{ij}^{\top} w^{*} = x_{ij}^{\top} H^{-1} X^{\top} y, \qquad (3)$
where $H = X^{\top} X + \lambda I$ is the ridge Hessian (size $d \times d$) and $x_{ij}$ is the feature vector for the pair $(i, j)$. These predicted scores can then be used in the assignment algorithm in place of bids. In Appendix B, we validate the prediction accuracy of our model using the average precision-at-k (AP@k) metric.
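A minimal, self-contained sketch of Eqs. 2 and 3 with $d = 2$ features. The feature vectors and bid labels below are hypothetical, and the 2x2 Hessian is inverted analytically; a real implementation would use a linear-algebra library.

```python
# Hypothetical reviewer-paper feature vectors X and numerical bids y.
X = [[1.0, 0.0], [0.8, 0.3], [0.1, 0.9], [0.0, 1.0]]
y = [3.0, 2.0, 1.0, 0.0]
lam = 0.1

# Ridge Hessian H = X^T X + lam * I (2x2 here).
H = [[sum(x[a] * x[b] for x in X) + (lam if a == b else 0.0)
      for b in range(2)] for a in range(2)]
# Gradient term X^T y.
g = [sum(x[a] * yi for x, yi in zip(X, y)) for a in range(2)]

# Invert the 2x2 Hessian analytically and solve w = H^{-1} X^T y (Eq. 3).
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
Hinv = [[H[1][1] / det, -H[0][1] / det],
        [-H[1][0] / det, H[0][0] / det]]
w = [sum(Hinv[a][b] * g[b] for b in range(2)) for a in range(2)]

def predict(x):
    """Predicted relevance score s_ij = x_ij^T w* for a new pair."""
    return w[0] * x[0] + w[1] * x[1]
```

The learned weights assign high predicted interest to pairs whose features resemble those that attracted high bids in training.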
There is an important advantage to our method: bidding is a laborious and monotonous task, and, as mentioned above, most reviewers bid on only a limited number of papers. It is likely that only a partial set of bids is observed among all the papers a reviewer is interested in. The scoring model can fill in missing scores by learning latent interests from the features of papers and reviewers. Completing the full bidding matrix improves the assignment quality, particularly for papers that originally received few bids.
The choice of regression loss serves an important purpose. Since the bid value (between 0 and 3) reflects the degree of interest from a reviewer, the loss should reflect the severity of error when making a wrong prediction. For example, if a reviewer expresses eager interest (bid score 3), predicting no bid (bid score 0) would incur a much greater loss than predicting willing (bid score 2).
Effect against the simple black-box attack. Fig. 1 (right) shows the effect of the proposed scoring model against the bid manipulation attack from Section 2. The assignment probability for honest bidders (light orange) is similar to that of the NeurIPS2014 system across different bins of reviewer rank. However, deviations from benign bidding behavior are clearly corrected by the model: in fact, the assignment probability decreases after the attack (dark orange). This can be explained by the fact that our approach does not use bids to assign reviewers to papers directly, but instead to learn what type of papers a reviewer may be suitable for. The reviewer is actually well-suited for high-ranking submissions, but by bidding only on the target paper (instead of bidding honestly on similar submissions) the model receives less signal suggesting that the reviewer is a match for the target paper.
4 Defending Against Colluding Bid Manipulation Attackers
Although the learning-based approach appears robust against manipulation of bids by one reviewer, attackers may have stronger capabilities. Specifically, an adversary can modify their bids based on knowledge of a friend's or rival's submissions, or of another reviewer's bids. Moreover, adversarial reviewers may collude to secure the assignment of a specific paper. We capture such capabilities in a threat model that describes our assumptions about the adversary. We design an optimal white-box attack in this threat model that drastically improves the adversary's success rate. Both the threat model and the white-box attack are intentionally designed to provide very broad capabilities to the adversary. Next, we design a defense that detects and removes white-box adversaries from the reviewer pool to provide security even under the new threat model.
Threat Model.
We make the following assumptions about adversarial reviewers:
1. The adversary may collude with one or more reviewers to secure a target paper's assignment, e.g., by posting the paper ID in a private chat channel of college alumni or like-minded members of the community. If any of the colluding reviewers is assigned the paper in question, the attack is considered successful. Collusion with any reviewer is allowed except the top-ranked candidates (based on honest bidding), as assigning those candidates would not be an abuse of the bidding process. (For this reason, our framework is not suitable for preventing the attack in Vijaykumar (2020), since that collusion likely occurred at the authoring stage.)
2. The adversary cannot manipulate any training features. We are interested in preventing the additional security risk enabled by the bidding mechanism; an attack that succeeds by manipulating features can also be used against an automated assignment system that does not allow bidding.
3. The adversary may have full knowledge of the assignment system.
4. The adversary may have direct access to the features and bids of all other reviewers.
5. The adversary may be able to arbitrarily manipulate their own bids and those of anyone in the colluding group.
4.1 White-Box Attack
To successfully attack the assignment system under these assumptions, the adversary needs to maximize the predicted relevance score of the target paper for themselves and/or the other colluding reviewers. This amounts to executing a data poisoning attack (Biggio et al., 2012; Xiao et al., 2015; Mei and Zhu, 2015; Jagielski et al., 2018; Koh et al., 2018) against the regression model that is used to predict scores, aiming to alter the score prediction for a specific paper-reviewer pair.
Non-colluding attack. We first devise an attack that maximizes the malicious reviewer's score for a target paper $j$ in the non-colluding setting. We denote reviewer $i$'s bid vector by $y_i \in \mathcal{B}^{P}$ and let
$\mathcal{F} = \{\, y_i \in \mathcal{B}^{P} : \text{the number of positive bids in } y_i \text{ is at most } b \,\}$
denote the feasible set of bidding vectors for a particular reviewer. Adversary $i$ can change $y_i$ to the $y_i^{*}$ that maximizes the relevance score $s_{ij}$. Because Eq. 3 is linear in $y$, it is straightforward to see that the following choice maximally increases the score prediction for reviewer $i$:
$y_i^{*} = \arg\max_{y_i \in \mathcal{F}} \; x_{ij}^{\top} H^{-1} X_i^{\top} y_i, \qquad (4)$
where $X_i$ contains the rows of $X$ that correspond to reviewer $i$. Note that Eq. 4 maximizes the inner product between $y_i$ and $v_i = X_i H^{-1} x_{ij}$. To achieve the maximum, the papers corresponding to the top-$b$ positive values in $v_i$ should be assigned the maximum bid, and the remaining bids are set to 0. This requires the adversary to solve a top-$b$ selection problem, which can be done in time linear in $P$ (Cormen et al., 2009).
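Given the influence vector $v_i = X_i H^{-1} x_{ij}$, the closed-form solution of Eq. 4 is a top-$b$ selection. A sketch with hypothetical influence values:

```python
import heapq

def optimal_bids(v, b, max_bid=3):
    """Closed-form solution of Eq. 4: max bid on the top-b positive
    entries of the influence vector v, zero everywhere else."""
    top = heapq.nlargest(b, range(len(v)), key=lambda p: v[p])
    bids = [0] * len(v)
    for p in top:
        if v[p] > 0:          # a non-positive entry can only hurt the score
            bids[p] = max_bid
    return bids

v = [0.4, -0.2, 0.9, 0.1, 0.05]           # hypothetical influence per paper
bids = optimal_bids(v, b=2)
gain = sum(vi * yi for vi, yi in zip(v, bids))   # contribution to s_ij
```

Entries of $v_i$ that are not positive never receive a bid, since bidding on them can only lower the predicted score.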
Colluding attack. Adversarial reviewers can collude to more effectively maximize the predicted score for a target reviewer $t$. An attack in this setting maximizes over the colluding group, $\mathcal{C}$, and over the bids of every reviewer in $\mathcal{C}$. We note that Eq. 4 is not specific to reviewer $t$: the influence of any reviewer $i$'s bids on the score prediction $s_{tj}$ has the form
$x_{tj}^{\top} H^{-1} X_i^{\top} y_i.$
Hence, the influences of the members of $\mathcal{C}$ on $s_{tj}$ are independent, which implies that the adversaries can adopt a greedy approach. Specifically, colluding adversaries can alter the $RP$-dimensional training label vector $y$ to the $y^{*}$ that maximizes the score prediction for reviewer $t$ via:
$(\mathcal{C}^{*}, y^{*}) = \arg\max_{(\mathcal{C}, \{y_i\}) \in \mathcal{G}} \; \sum_{i \in \mathcal{C}} x_{tj}^{\top} H^{-1} X_i^{\top} y_i, \qquad (5)$
where $\mathcal{G}$ denotes the set of possible colluding parties of size $c$ and their bids:
$\mathcal{G} = \{\, (\mathcal{C}, \{y_i\}_{i \in \mathcal{C}}) : |\mathcal{C}| = c,\; t \in \mathcal{C},\; y_i \in \mathcal{F} \text{ for all } i \in \mathcal{C} \,\}.$
The maximization in Eq. 5 can be computed by first evaluating $\delta_i = \max_{y_i \in \mathcal{F}} x_{tj}^{\top} H^{-1} X_i^{\top} y_i$ for every reviewer $i$, and then greedily selecting the top $c - 1$ reviewers (in addition to $t$) to form the colluding party $\mathcal{C}^{*}$. The computational complexity of the resulting attack is $O(RP)$.
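Because the per-reviewer contributions are additive, the greedy step of Eq. 5 reduces to recruiting the $c - 1$ accomplices with the largest maximal contributions. A sketch with hypothetical per-reviewer maxima (each obtainable via the top-$b$ selection of Eq. 4):

```python
def pick_colluders(delta, attacker, c):
    """Greedy step of Eq. 5: the attacker plus the c-1 other reviewers
    whose bids can contribute most to the target score."""
    others = sorted((i for i in range(len(delta)) if i != attacker),
                    key=lambda i: -delta[i])
    return [attacker] + others[:c - 1]

delta = [0.5, 1.2, 0.1, 0.9, 0.3]   # hypothetical max contribution per reviewer
party = pick_colluders(delta, attacker=2, c=3)
boost = sum(delta[i] for i in party)  # total increase in the predicted score
```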
4.2 Active Defense Against Colluding Bid Manipulation Attacks
Both the black-box attack from Section 2 and the white-box attack described above adversarially manipulate paper bids. In contrast to honest reviewers, whose bids are strongly correlated with their expertise and subjects of interest, attackers provide “surprising” bids that have a large influence on the predictions of the scoring model. This allows us to detect potentially malicious bids using an outlier detection algorithm. Specifically, we make our paper assignment system robust against colluding bid manipulation attacks by detecting and removing training examples that have a disproportionate influence on model predictions. We make the same assumptions about the attacker as in Section 4.1 and, in addition, assume that they are unaware of our active defense. To implement this system, we note that, given a set of malicious reviewers $\mathcal{M}$, we can recompute the relevance score for a reviewer-paper pair $(i, j)$ by removing these reviewers from the training set:
$s_{ij}^{-\mathcal{M}} = x_{ij}^{\top} \big(H^{-\mathcal{M}}\big)^{-1} X_{-\mathcal{M}}^{\top} y_{-\mathcal{M}},$
where $H^{-\mathcal{M}}$ is the Hessian matrix for the data points in the complement of the malicious reviewer set $\mathcal{M}$, and $X_{-\mathcal{M}}$ and $y_{-\mathcal{M}}$ are the corresponding features and labels. We assume that at most $m$ reviewers collude to form the set $\mathcal{M}$. Intuitively, $s_{ij}^{-\mathcal{M}}$ reflects the relevance score for the pair $(i, j)$ as predicted by the other reviewers. Relying on the assumption that the vast majority of reviewers are benign, $s_{ij}^{-\mathcal{M}}$ is likely close to the unobserved true preference score we would observe had all bids been benign.
Following work on robust regression (Jagielski et al., 2018; Chen et al., 2013; Bhatia et al., 2015), this allows us to compute relevance scores that ignore the most likely malicious reviewers in $\mathcal{M}$ by evaluating:
$\tilde{s}_{ij} = \min_{\mathcal{M} :\, |\mathcal{M}| \le m,\; i \in \mathcal{M}} \; s_{ij}^{-\mathcal{M}}. \qquad (6)$
That is, $\tilde{s}_{ij}$ overestimates the decrease in the predicted relevance score for $(i, j)$ had all bids been benign. The optimization problem in Eq. 6 is intractable because it searches over all subsets of at most $m$ reviewers, and because it inverts a Hessian for every candidate set $\mathcal{M}$. To make the optimization tractable, we approximate the Hessian by $H^{-\mathcal{M}} \approx H$, which is accurate for small $m$. This approximation facilitates a greedy search for $\mathcal{M}$ because it allows Eq. 6 to be decomposed into per-reviewer contributions $\delta_{i'j} = x_{ij}^{\top} H^{-1} X_{i'}^{\top} y_{i'}$:
$\tilde{s}_{ij} \approx s_{ij} - \delta_{ij} - \max_{\mathcal{M}' :\, |\mathcal{M}'| \le m - 1,\; i \notin \mathcal{M}'} \; \sum_{i' \in \mathcal{M}'} \delta_{i'j}. \qquad (7)$
Eq. 7 can be computed efficiently by sorting the values $\delta_{i'j}$ and selecting reviewer $i$ as well as the top $m - 1$ reviewers with the largest positive contributions to form $\mathcal{M}$. Once the contributions $\delta_{i'j}$ have been computed, this selection costs $O(R \log R)$ for each pair $(i, j)$.
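Under the Hessian approximation, the robust score of Eq. 7 is a simple greedy subtraction. A sketch with hypothetical per-reviewer contributions:

```python
def robust_score(s, delta, i, m):
    """Greedy evaluation of Eq. 7: subtract reviewer i's own contribution
    plus the m-1 largest positive contributions from other reviewers."""
    others = sorted((d for k, d in enumerate(delta) if k != i and d > 0),
                    reverse=True)
    return s - delta[i] - sum(others[:m - 1])

s = 2.0                                  # predicted relevance score s_ij
delta = [0.05, 0.8, -0.1, 0.3, 0.02]     # hypothetical contributions per reviewer
low_influence = robust_score(s, delta, i=0, m=2)   # reviewer 0 contributes little
high_influence = robust_score(s, delta, i=1, m=2)  # reviewer 1 props up the score
```

A pair whose score is propped up mostly by the reviewer's own bids (or those of a few accomplices) sees a large drop from $s_{ij}$ to $\tilde{s}_{ij}$, which is exactly what the detection step looks for.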
Assignment algorithm. The efficient approximation of the robust relevance score enables our robust assignment algorithm, which proceeds as follows. We first form the candidate set $\mathcal{S}$ of reviewer-paper pairs by selecting the top-ranked reviewers for each paper according to the predicted relevance score $s_{ij}$. For each pair $(i, j) \in \mathcal{S}$, the algorithm marks the pair as potentially malicious and removes it from $\mathcal{S}$ if it would not have belonged to the candidate set under the robust relevance score $\tilde{s}_{ij}$. Since the colluding reviewers' contributions are fully removed whenever the collusion is contained in $\mathcal{M}$, a colluding attack is always marked as malicious if its size $c$ satisfies $c \le m$. After removing every potentially malicious pair from $\mathcal{S}$, the assignment problem in Eq. 1 is solved over the remaining reviewer-paper pairs in the candidate set to produce the final assignment (this can be achieved by setting $s_{ij} = -\infty$ for all pairs outside the candidate set). The resulting assignment algorithm is summarized in Algorithm 1. The algorithm trades off two main goals:

1. Every paper needs to be assigned a sufficient number of reviewers who have the expertise and willingness to review it. Therefore, the approach that removes potentially malicious reviewer candidates needs to have a low false positive rate (FPR).

2. The final assignment should be robust against collusion attacks. Therefore, the approach that filters out potentially malicious reviewers needs to have a high true positive rate (TPR).
This trade-off between FPR and TPR is governed by the hyperparameter $m$. Using a higher value of $m$ can provide robustness against larger collusions, but it may also remove many benign reviewers from the candidate set, even when insufficient alternative reviewers are available. We perform a detailed study of this trade-off in Section 5.
5 Experiments
We empirically study the efficacy of our robust paper bidding and assignment algorithm. Our experiments show that our assignment algorithm removes a large fraction of malicious reviewers, while still preserving the utility of bids for honest reviewers.
Dataset. Because real bidding data is not publicly available, we construct a synthetic conference dataset from the Semantic Scholar Open Research Corpus (Ammar et al., 2018). This corpus contains publicly available academic papers annotated with attributes such as citations, venue, and field of study. To simulate a NeurIPS-like conference environment, we collect papers published in AI conferences between 2014 and 2015 to serve as submitted papers. We also select authors to serve as reviewers, and generate bids based on paper citations. Generated bids are selected from the set $\{0, 1, 2, 3\}$, corresponding to the bids none, in a pinch, willing, and eager.
We generated bids in such a way as to mimic bidding statistics from a recent, major AI conference. Our paper and reviewer features include paper/reviewer subject area, paper title, and a TPMS-like similarity score. We refer to the appendix for more details on our synthetic dataset. For full reproducibility, we publicly release our code (https://github.com/facebookresearch/securepaperbidding) and synthetic data (https://drive.google.com/drive/folders/1khI9kaPy_8F0GtAzwR48Jc3rsQmBhfe), and invite program chairs across disciplines to use our approach on their real bidding data.
5.1 Effectiveness of White-Box Attacks
We first show that the white-box attack from Section 4.1 can succeed against our relevance scoring model if detection of malicious reviewers is not used. We perform the white-box attacks as follows:
1. The relevance scoring model is trained to predict scores for every reviewer-paper pair.
2. We randomly select 400 papers and rank all reviewers for these papers based on the predicted relevance score $s_{ij}$.
3. We discard the highest-ranked reviewers as attacker candidates for each selected paper, because high-ranked reviewers need not act maliciously to be assigned.
4. We group the remaining reviewers into bins of exponentially growing size (powers of two), and sample 10 malicious reviewers from each bin without replacement.
5. Each selected reviewer chooses their most suitable colluders and modifies the group's bids using the attack from Section 4.1, targeting the selected paper.
Result. We run our assignment algorithm on the maliciously modified bids and evaluate the chance of assignment for the malicious reviewer before and after the attack. Fig. 2 shows the fraction of malicious reviewers that successfully alter the paper assignments and are assigned their target paper. Each line shows the attack success rate for a certain colluding party size $c$. When bidding honestly, all selected reviewers are ranked too low to have any chance of being assigned. With a colluding party, a reviewer can have a 22% chance of being assigned the target paper at an original rank of 51. At the same rank, the success rate is up to 5% even when no collusion occurs. Increasing the collusion size strictly increases the assignment probability, while attackers starting from a lower original rank have a lower success rate. The latter trend shows that the model provides a limited degree of robustness even without the detection mechanism.
Table 1: Detection FPR, assignment quality, and number of under-reviewed papers. The bottom five rows use our robust assignment algorithm with increasing values of $m$ (top to bottom); dashes mark entries that do not apply.

| Setting | FPR (Top-5) | FPR (Top-50) | Frac. of pos. | Avg. bid score | Avg. TPMS | Avg. max. TPMS | # under-reviewed |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NeurIPS2014 | – | – | 0.990 | 2.732 | 0.732 | 0.737 | – |
| TPMS only | – | – | 0.323 | 0.872 | 0.949 | 0.997 | – |
| Ours (no detection) | – | – | 0.442 | 1.200 | 0.848 | 0.943 | – |
| Ours | 0.022 | 0.259 | 0.443 | 1.201 | 0.849 | 0.943 | 0 |
| Ours | 0.046 | 0.428 | 0.442 | 1.199 | 0.850 | 0.944 | 0 |
| Ours | 0.069 | 0.528 | 0.439 | 1.191 | 0.852 | 0.945 | 4 |
| Ours | 0.100 | 0.600 | 0.435 | 1.181 | 0.855 | 0.947 | 7 |
| Ours | 0.139 | 0.657 | 0.433 | 1.172 | 0.859 | 0.950 | 24 |
5.2 Effectiveness of the Robust Assignment Algorithm
We evaluate the robust assignment algorithm against successful attacks from Section 4.1.
What percentage of attacks is accurately detected? Fig. 3 shows the true positive rate (TPR) of detecting malicious reviewers as a function of the collusion size $c$ (on the x-axis), for different values of the hyperparameter $m$. First, we measure the algorithm against all attacks that succeeded against the undefended scoring model (cf. Fig. 2). The results show that when $c \le m$, the detection TPR is very close to 100%, which implies that almost all malicious reviewers are removed in this case. The TPR decreases as the size of the collusion $c$ increases beyond $m$, but the defense still provides some protection even when $c > m$. For instance, in the setting with the largest collusion size (darkest blue line), approximately 40% of the successful attacks are detected. Increasing $m$ will protect against larger colluding parties at the cost of increasing the false positive rate (FPR), that is, the number of times an honest reviewer is mistaken for an adversary. A high FPR can negatively impact the quality of the assignments.
The degree of knowledge that we assume the attacker may possess far exceeds that of typical reviewers. As a result, Fig. 3 may drastically underestimate the efficacy of our detection framework in practical applications. We further formulate a stronger colluding black-box attack and evaluate against it in the appendix. Those results are very encouraging, as they suggest that conference organizers can obtain robustness against more than 80% of successful colluding black-box attacks for a suitable choice of $m$ when applying our detection framework.
Table 2: Assignment quality and detection TPR for TRIM and Algorithm 1, each under two hyperparameter settings. The five TPR columns correspond to increasing collusion sizes $c$ (left to right, starting at $c = 1$).

| Defense | Frac. of pos. | Avg. bid score | Avg. TPMS | Avg. max. TPMS | TPR ($c = 1$) | TPR | TPR | TPR | TPR (largest $c$) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TRIM | 0.439 | 1.19 | 0.848 | 0.943 | 0.201 | 0.081 | 0.037 | 0.035 | 0.054 |
| TRIM | 0.219 | 0.439 | 0.816 | 0.917 | 0.986 | 0.966 | 0.942 | 0.924 | 0.919 |
| Algorithm 1 | 0.443 | 1.201 | 0.849 | 0.943 | 1.000 | 0.077 | 0.000 | 0.000 | 0.000 |
| Algorithm 1 | 0.433 | 1.172 | 0.859 | 0.950 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
What is the quality of the final assignments? To study the effect of false positives from detection on the final paper assignments, we also evaluate assignment quality in terms of the fraction of positive bids, average bid score, average TPMS, and average maximum TPMS (i.e., the maximum TPMS score among the assigned reviewers for each paper, averaged over all papers). Higher values of these metrics indicate higher assignment quality. The first row in Table 1 shows the assignment quality when using the NeurIPS2014 (Lawrence, 2014) relevance scores. As expected, this system over-emphasizes positive bids, which constitutes its inherent vulnerability. The second row shows the assignment quality when using only the TPMS score, which serves as a baseline for evaluating how much of the utility of bids our robust assignment framework preserves. In contrast, using only TPMS scores over-emphasizes the average TPMS and average maximum TPMS.
The third row shows our assignment algorithm using the linear regression model without malicious reviewer detection. As it fills in the initially sparse bidding matrix, it has significantly more candidate pairs to choose from and yields assignments with fewer positive bids; however, the assignment quality is substantially higher in terms of the TPMS metrics compared to using the NeurIPS2014 scores. The regression model offers a practical trade-off between relying on bids that reflect reviewer preference and relying on factors related to expertise (such as TPMS).
The remaining rows report results for the robust assignment algorithm with increasing values of $m$. As expected, the detection FPR increases as $m$ increases, but this has only a limited effect on the assignment quality metrics. The main reason is that most false positives are low-ranked reviewers, who would have been unlikely to be assigned the paper even if they had not been excluded from the candidate set. Indeed, the detection FPR is significantly lower for top-5 reviewers (second column) than for top-50 reviewers (third column). Overall, our results show that the assignment quality is hardly impacted by the detection mechanism.
We observed that a small number of papers were not assigned sufficient reviewers because the detection removed too many reviewers from the set of candidate reviewers for those papers. We report this number in the last column (# under-reviewed) of Table 1. Although this is certainly a shortcoming of the robust assignment algorithm, the number of papers with insufficient candidates is small enough that it remains practical for conference organizers to assign them manually.
Comparison with robust regression. One effective defense against label-poisoning attacks on linear regression is the TRIM algorithm (Jagielski et al., 2018), which fits the model on the subset of training points that incurs the least loss. The algorithm assumes that $p$ of the $RP$ training points are poisoned and optimizes:
$\min_{w, S} \; \|X_S w - y_S\|_2^2 + \lambda \|w\|_2^2 \quad \text{s.t.} \quad S \subset \{1, \ldots, RP\},\; |S| = RP - p,$
where $(X_S, y_S)$ denotes the subset of training data points selected by the index set $S$. We apply TRIM to identify the poisoned reviewer-paper pairs and remove them from the assignment candidate set. We then proceed to assign the remaining pairs using the relevance scores from Eq. 3.
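For reference, a toy sketch of the TRIM iteration for a one-dimensional ridge model. The data and the single poisoned label are hypothetical; the real algorithm operates on the full $RP$-point regression problem.

```python
def fit_ridge_1d(xs, ys, lam=0.1):
    """Closed-form ridge fit for a single slope parameter."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den

def trim(xs, ys, p, iters=10):
    """TRIM (Jagielski et al., 2018): alternate between fitting on the
    current subset and keeping the n - p points with smallest residuals."""
    idx = list(range(len(xs) - p))            # arbitrary initial subset
    for _ in range(iters):
        w = fit_ridge_1d([xs[i] for i in idx], [ys[i] for i in idx])
        order = sorted(range(len(xs)), key=lambda i: (ys[i] - w * xs[i]) ** 2)
        idx = order[:len(xs) - p]             # keep the lowest-loss points
    return w, set(range(len(xs))) - set(idx)  # model and flagged points

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 3.0, 4.0, 0.0]   # last label is poisoned (true slope is 1)
w, flagged = trim(xs, ys, p=1)
```

On this toy data TRIM converges immediately and flags the poisoned point; with a grossly overestimated $p$, it would also discard benign points, which is the failure mode observed in Table 2.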
Table 2 shows the comparison between TRIM and our robust assignment algorithm in terms of assignment quality and detection TPR. The first and third rows correspond to the settings of the TRIM algorithm and Algorithm 1 that achieve a comparable assignment quality. In this regime, both methods fail to detect colluding attacks with $c > 1$, but Algorithm 1 is drastically more effective when $c = 1$. The second and fourth rows compare settings of TRIM and Algorithm 1 that achieve a similar detection TPR: both detect close to all attacks across collusion sizes. However, the assignment quality for TRIM is much worse, with all quality metrics being lower than when using the TPMS score alone (cf. row 2 in Table 1). Note that TRIM requires a drastic overestimate of the number of poisoned points in order to detect most attack instances, which means that many benign training samples are misidentified as malicious.
Running time. As described in Section 4.2, our detection algorithm runs independently for each reviewer-paper pair, with a per-pair cost dominated by computing and ranking the per-reviewer contributions. In practice, pairs belonging to the same paper can be processed in a batch to reuse intermediate computation, which amounts to an average of 26 seconds per paper. This process can easily be parallelized across papers for efficiency.
6 Related Work
Our work fits in a larger body of work on automatic paper assignment systems, which includes studies on the design of relevance scoring functions (Dumais and Nielsen, 1992; Mimno and McCallum, 2007; Rodriguez and Bollen, 2008; Liu et al., 2014) and appropriate quality metrics (Goldsmith and Sloan, 2007; Tang et al., 2012). These studies have contributed to the development of conference management platforms such as EasyChair, HotCRP, and CMT that support most major computer science conferences.
Despite advances in automatic paper assignment, Rennie (2016) highlights shortcomings of peer-review systems owing to issues such as prejudices, misunderstandings, and corruption, all of which make the system less efficient. For instance, the standard objective for assignment (say, Eq. 1) seeks to maximize the total relevance of assigned reviewers over the entire conference, which may be unfair to papers from under-represented areas. This has led to efforts to design objective functions and constraints that promote fairness in the assignment process for all submitted papers (Garg et al., 2010; Long et al., 2013; Stelmakh et al., 2018; Kobren et al., 2019).
Furthermore, the assignment problem faces the additional challenge of coping with the implicit bias of reviewers (Stelmakh et al., 2019). This issue is particularly prevalent when authors of competing submissions participate in the review process, as they have an incentive to provide negative reviews in order to increase the chance of their own paper being accepted (Anderson et al., 2007; Thurner and Hanel, 2011). In order to alleviate this problem, recent studies have devised assignment algorithms that promote impartiality in reviewers (Aziz et al., 2016; Xu et al., 2018). We contribute to this line of work by identifying and removing reviewers who adversarially alter their bids to be assigned papers for which they have adverse incentives.
More recently, Jecmen et al. (2020) studied the bid manipulation problem and considered an approach to defending against it that is orthogonal to ours. Their method focuses on probabilistic assignment and upper-bounds the assignment probability of every paper-reviewer pair, which reduces the success rate of any bid manipulation attack. In contrast, our work seeks to limit the disproportionate influence of malicious bids specifically, rather than capping assignment probabilities uniformly across all paper-reviewer pairs, and further considers the influence of colluding attackers on the assignment system.
7 Conclusion
This study demonstrates some of the risks of the paper bidding mechanisms that are commonly utilized in computer-science conferences to assign reviewers to paper submissions. Specifically, we show that bid manipulation attacks may allow adversarial reviewers to review papers written by friends or rivals, even when these papers are outside of their area of expertise. We developed a novel paper assignment system that is robust against such bid manipulation attacks, even in settings where multiple adversaries collude and have in-depth knowledge of the assignment system. Our experiments on a synthetic but realistic dataset of conference papers demonstrate that our assignment system is, indeed, robust against such powerful attacks. At the same time, our system still produces high-quality paper assignments for honest reviewers. Our assignment algorithm is computationally efficient, easy to implement, and should be straightforward to incorporate into modern conference management systems. We hope that our study contributes to a growing body of work aimed at developing techniques that can help improve the fairness, objectivity, and quality of the scientific peer-review process at scale.
References
Ammar et al. (2018). Construction of the literature graph in Semantic Scholar. arXiv preprint arXiv:1805.02262.
Anderson et al. (2007). The perverse effects of competition on scientists' work and relationships. Science and Engineering Ethics 13(4), pp. 437–461.
Aziz et al. (2016). Strategyproof peer selection: mechanisms, analyses, and experiments. In Thirtieth AAAI Conference on Artificial Intelligence.
Robust regression via hard thresholding. In Advances in Neural Information Processing Systems, pp. 721–729.
Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389.
Charlin and Zemel (2013). The Toronto paper matching system: an automated paper-reviewer assignment system. In ICML.
Robust sparse regression under adversarial corruption. In International Conference on Machine Learning, pp. 774–782.
Introduction to Algorithms, 3rd edition. The MIT Press. ISBN 0262033844.
Dean and Henzinger (1999). Finding related pages in the World Wide Web. Computer Networks 31(11–16), pp. 1467–1479.
Automating the assignment of submitted manuscripts to reviewers. In Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 233–244.
Assigning papers to referees. Algorithmica 58(1), pp. 119–136.
The AI conference paper assignment problem. In Proc. AAAI Workshop on Preference Handling for Artificial Intelligence, Vancouver, pp. 53–57.
The conference paper-reviewer assignment problem. Decision Sciences 30(3), pp. 865–876.
Manipulating machine learning: poisoning attacks and countermeasures for regression learning. In 2018 IEEE Symposium on Security and Privacy (SP), pp. 19–35.
Jecmen et al. (2020). Mitigating manipulation in peer review via randomized reviewer assignments. arXiv preprint arXiv:2006.16437.
Paper matching with local fairness constraints. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1247–1257.
Stronger data poisoning attacks break data sanitization defenses. arXiv preprint arXiv:1811.00741.
The Hungarian method for the assignment problem. Naval Research Logistics Quarterly 2(1–2), pp. 83–97.
Paper allocation for NIPS. https://inverseprobability.com/2014/06/28/paper-allocation-for-nips (online; accessed 2020-10-02).
A robust model for paper reviewer assignment. In Proceedings of the 8th ACM Conference on Recommender Systems, pp. 25–32.
On good and fair paper-reviewer assignment. In 2013 IEEE 13th International Conference on Data Mining, pp. 1145–1150.
Using machine teaching to identify optimal training-set attacks on machine learners. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 29.
Expertise modeling for matching papers with reviewers. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 500–509.
Let's make peer review scientific. Nature.
An algorithm to determine peer-reviewers. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pp. 319–328.
Shah et al. (2018). Design and analysis of the NIPS 2016 review process. The Journal of Machine Learning Research 19(1), pp. 1913–1946.
PeerReview4All: fair and accurate reviewer assignment in peer review. arXiv preprint arXiv:1806.06237.
Stelmakh et al. (2019). On testing for biases in peer review. In Advances in Neural Information Processing Systems, pp. 5287–5297.
Stent and Ji (2018). A review of reviewer assignment methods. https://naacl2018.wordpress.com/2018/01/28/a-review-of-reviewer-assignment-methods (online; accessed 2020-10-02).
On optimization of expertise matching with various constraints. Neurocomputing 76(1), pp. 71–83.
Thurner and Hanel (2011). Peer-review in a world with rational scientists: toward selection of the average. The European Physical Journal B 84(4), pp. 707–711.
Vijaykumar (2020). Potential organized fraud in ACM/IEEE computer architecture conferences. https://medium.com/@tnvijayk/potential-organized-fraud-in-acm-ieee-computer-architecture-conferences-ccd61169370d (online; accessed 2020-10-13).
Weinberger et al. (2009). Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pp. 1113–1120.
Support vector machines under adversarial label contamination. Neurocomputing 160, pp. 53–62.
On strategyproof conference peer review. arXiv preprint arXiv:1806.06266.
Appendix A Dataset Construction
In this section, we describe how we subsampled data from the Semantic Scholar Open Research Corpus (S2ORC) (Ammar et al., 2018), extracted reviewer/paper features such as subject areas and TPMS scores, and simulated bids using citations. Our data is publicly released (https://drive.google.com/drive/folders/1khI9kaPy_8F0GtAzwR48Jc3rsQmBhfe?usp=sharing) for reproducibility and to facilitate future research.
A.1 Conference Simulation
The goal of our dataset is to simulate a NeurIPS-like conference environment, where the organizers assign reviewers to papers based on expertise and interest. We first retrieve the collection of 6956 papers from S2ORC that were published in ML/AI/CV/NLP venues between the years 2014–2015, which includes the following conferences: AAAI, AISTATS, ACL, COLT, CVPR, ECCV, EMNLP, ICCV, ICLR, ICML, IJCAI, NeurIPS, and UAI. We believe the diversity of subject areas represented by the above conferences is an accurate reflection of typical ML/AI conferences in recent years. We will refer to this collection of papers as the corpus.
Subject areas.
Most conferences require authors to indicate primary and secondary subject areas for their submitted papers. However, S2ORC only contains a field-of-study attribute for most of the retrieved papers in the corpus, which is often just the broad category of computer science. To identify suitable fine-grained subjects for each paper, we adopt an unsupervised learning approach: we cluster the papers by relatedness and treat each discovered cluster as a subject area.
Similarity is defined in terms of co-citations – a common signal used in information retrieval for discovering related documents (Dean and Henzinger, 1999). For a paper p, let C(p) denote the union of in-citations and out-citations of p. The similarity between two papers i and j is defined as

sim(i, j) = |C(i) ∩ C(j)| / (√|C(i)| · √|C(j)|),  (S1)

which is analogous to cosine similarity in document retrieval. We perform agglomerative clustering with average linkage (https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering) to reduce the set of papers to 1000 clusters. After removing small clusters (fewer than 5 papers), we obtain 368 clusters to serve as subject areas. Table S1 shows a few sample clusters along with papers contained in each cluster. Most of the discovered clusters are highly coherent, with members sharing keywords in their titles, even though the definition of similarity depends entirely on co-citations. To populate the list of subject areas for a given paper p, we first compute its subject relatedness to a cluster c as the average similarity to the cluster's members:

rel(p, c) = (1/|c|) Σ_{q ∈ c} sim(p, q).  (S2)

Given the set of clusters representing subject areas, we identify the top-5 clusters according to rel(p, ·) to be the list of subject areas for the paper p, denoted S(p).
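The co-citation similarity above can be sketched in a few lines of Python. This is a minimal illustration of Eq. S1 over citation sets represented as Python sets; the function name is hypothetical and not part of our released code.

```python
def cocitation_similarity(citations_i, citations_j):
    """Cosine similarity between two papers' citation sets (Eq. S1).

    Each argument is the union of in- and out-citations of one paper;
    the sets are treated as binary indicator vectors, so the inner
    product is just the size of the overlap.
    """
    if not citations_i or not citations_j:
        return 0.0
    overlap = len(citations_i & citations_j)
    return overlap / ((len(citations_i) ** 0.5) * (len(citations_j) ** 0.5))

# Two papers sharing half of their citations:
a = {"p1", "p2", "p3", "p4"}
b = {"p3", "p4", "p5", "p6"}
print(cocitation_similarity(a, b))  # → 0.5
```

A pairwise similarity matrix built this way can then be fed to any off-the-shelf agglomerative clustering routine with average linkage.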
Table S1: Sample subject-area clusters discovered via co-citation clustering, including: multi-task learning, video segmentation, topic modeling, and feature selection. (The per-cluster paper listings are omitted.)
Reviewers.
The S2ORC dataset contains entries of authors along with their lists of published papers. We utilize this information to simulate reviewers by collecting the set of authors who have cited at least one paper from the corpus. The total number of retrieved authors is 234,598. Because the vast majority of retrieved authors are only loosely related to the field of ML/AI, they would not be suitable reviewer candidates for a real ML/AI conference. Therefore, we retain only authors who have cited at least 15 papers from the corpus to serve as reviewers. We also remove authors who cited more than 50 papers from the corpus, since these reviewers represent senior researchers that would typically serve as area chairs. This filtering yields the final pool of reviewers.
Most conferences also solicit self-reported subject areas from reviewers. We simulate this attribute by leveraging the clusters discovered through co-citation. For each subject area c, we count the number of times c appears in the subject-area list of each paper that the reviewer has cited. The 5 most frequently appearing clusters (ties are broken randomly) serve as the reviewer's subject areas.
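The counting step above is straightforward; the following sketch shows one way to implement it, with a fixed seed for the random tie-breaking. The function name and the seed handling are illustrative assumptions, not our released implementation.

```python
import random
from collections import Counter

def reviewer_subject_areas(cited_paper_subjects, k=5, seed=0):
    """Pick a reviewer's k self-reported subject areas as the clusters that
    appear most often across the subject-area lists of the reviewer's cited
    papers, breaking ties randomly (as described in the text)."""
    counts = Counter(s for subjects in cited_paper_subjects for s in subjects)
    items = list(counts.items())
    random.Random(seed).shuffle(items)   # random order within equal counts
    items.sort(key=lambda kv: -kv[1])    # stable sort keeps the shuffle among ties
    return [subject for subject, _ in items[:k]]

# Subject-area lists of three papers a reviewer cited:
cited = [["vision", "segmentation"], ["vision", "detection"], ["nlp", "detection"]]
areas = reviewer_subject_areas(cited, k=2)
```

Because Python's sort is stable, shuffling before sorting by count gives a uniformly random order among clusters with equal counts.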
TPMS score.
The TPMS score (Charlin and Zemel, 2013) is computed by measuring the similarity between a reviewer's profile – represented by a set of papers that the reviewer uploads – and a target paper. We simulate this score using the language-model-based approach from the original TPMS paper, which we detail below for completeness. For a reviewer r, let W_r denote the bag-of-words representation of the set of papers that r has authored. More specifically, we collect the abstracts of the papers that r has authored, remove all stop words, and pool the remaining words together into the multiset W_r. Similarly, let W_p denote the bag-of-words representation of the abstract of a paper p. The simulated TPMS score is computed as:

TPMS(r, p) = Σ_{w ∈ W_p} log p(w | W_r),  (S3)

where p(w | W_r) is the Dirichlet-smoothed normalized frequency of the word w in W_r. Let W denote the bag-of-words representation of the entire corpus of (abstracts of) papers, and let c(w) (resp. c_r(w)) denote the number of occurrences of w in the corpus (resp. W_r). Then

p(w | W_r) = (c_r(w) + μ · c(w) / |W|) / (|W_r| + μ),

where μ > 0 is a smoothing factor whose value we fix in our experiment. The obtained scores are normalized per paper to lie between 0 and 1.
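The language-model score can be sketched as follows. This is a minimal illustration of Eq. S3 with Dirichlet smoothing; the function name, the toy corpus, and the value of μ are assumptions for the example only.

```python
import math
from collections import Counter

def tpms_score(reviewer_words, paper_words, corpus_counts, corpus_size, mu=10.0):
    """Simulated TPMS score (Eq. S3): log-likelihood of the paper's abstract
    words under a Dirichlet-smoothed unigram model of the reviewer's papers.

    p(w | W_r) = (c_r(w) + mu * c(w) / |W|) / (|W_r| + mu)
    """
    r_counts = Counter(reviewer_words)
    score = 0.0
    for w in paper_words:
        prior = corpus_counts.get(w, 0) / corpus_size  # corpus frequency c(w)/|W|
        p = (r_counts.get(w, 0) + mu * prior) / (len(reviewer_words) + mu)
        score += math.log(p) if p > 0 else float("-inf")
    return score

# A toy corpus of 24 word occurrences; the "expert" shares the paper's vocabulary.
corpus = Counter(["deep", "learning", "network", "protein", "folding", "cells"] * 4)
paper = ["deep", "learning"]
expert = ["deep", "learning", "network"]
outsider = ["protein", "folding", "cells"]
```

With smoothing, the outsider still receives a finite score from the corpus prior, but strictly lower than the expert's.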
A.2 Simulating Bids
The most challenging aspect of our simulation is the bids. At first, it may seem natural to simulate bids directly from citations, since citation is a proxy of interest and can easily be obtained from the S2ORC dataset. However, we observed that citations are heavily skewed towards a few very influential papers, while the distribution of bids in a real conference is much more uniform across papers. To overcome this issue, we instead model a reviewer's bidding behavior based on the following assumptions:

1. A reviewer will only bid on papers from subject areas that they are familiar with.

2. Given two papers from the same subject area, a reviewer favors bidding on the paper whose title/abstract is a better match with the reviewer's profile.
We define several scores that reflect the above aspects and combine them to obtain the final bids. In practice, reviewers often also rely on the TPMS score to sort the papers they bid on. However, since our simulated TPMS score depends entirely on the abstract, we omit TPMS from our bidding model. Nevertheless, we have observed empirically that the simulated TPMS score is highly correlated with the bids that we obtain.
Subject score.
We leverage citations to reflect the degree of interest in the subject of a paper. Let ICF(p) denote the inverse citation frequency of a paper p in the corpus, defined analogously to the inverse document frequency:

ICF(p) = log( N / (1 + |cite(p)|) ),

where N is the number of papers in the corpus and cite(p) is the set of papers that cite p. The purpose of the ICF is to down-weight commonly cited papers to avoid overcrowding of bids. Denote by κ(p) the top cluster that p belongs to according to Eq. S2. The subject score of a reviewer r for a paper p is defined as:

s_subj(r, p) = Σ_{p' cited by r : κ(p') = κ(p)} ICF(p') / |κ(p')|.  (S4)

In other words, for each paper p' that r cites, we merge all papers from the same subject area as p', represented by κ(p'), into the reviewer's pool. Each paper in κ(p') is weighted by the reciprocal of the cluster size and the ICF of p', and the subject score is the resulting sum after accumulating over all papers that the reviewer cites. Note that every paper within the same subject cluster has the exact same subject score, which is non-zero only if the reviewer has cited a paper within this subject area. This property reflects the assumption that a reviewer is only interested in papers from familiar subject areas, and is indifferent between papers in the same subject absent title/abstract information. To avoid overcrowding by frequently cited papers, we set ICF(p) = 0 for any paper p that received over 1000 citations.
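The accumulation described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the function name, the dictionary-based inputs, and the exact log form of the ICF are choices for the example, not our released code.

```python
import math

def subject_score(cited_papers, target_cluster, cluster_of, cluster_size,
                  n_citations, corpus_size):
    """Subject score sketch (Eq. S4): each cited paper p' contributes
    ICF(p') / |cluster(p')| to every paper in p's own cluster, so papers
    in clusters the reviewer never cited score 0."""
    score = 0.0
    for p in cited_papers:
        if cluster_of.get(p) != target_cluster:
            continue
        cites = n_citations.get(p, 0)
        # Down-weight popular papers; zero out the very heavily cited ones.
        icf = 0.0 if cites > 1000 else math.log(corpus_size / (1 + cites))
        score += icf / cluster_size[target_cluster]
    return score

cluster_of = {"a": "c1", "b": "c1", "x": "c2"}
cluster_size = {"c1": 4, "c2": 6}
cites = {"a": 9, "b": 1500, "x": 3}
s1 = subject_score(["a", "b"], "c1", cluster_of, cluster_size, cites, 100)
s2 = subject_score(["a", "b"], "c2", cluster_of, cluster_size, cites, 100)
```

In the example, paper "b" contributes nothing because it exceeds the 1000-citation cut-off, and the score for cluster "c2" is zero because the reviewer cited nothing in it.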
Title/abstract score.
To measure the degree of title/abstract similarity between a reviewer and a paper, we compute the inner product between the TF-IDF vectors of the reviewer's and the paper's title/abstract. Let idf(w) denote the inverse document frequency of a word w. For each reviewer r, let t_r denote the vector, indexed by words, such that t_r(w) = tf_r(w) · idf(w) for each word w, where tf_r(w) is the frequency of w in the pooled abstracts of the reviewer's papers. Similarly, we define the TF-IDF vector t_p for a paper p, and the abstract score between a reviewer-paper pair is given by the inner product:

s_abs(r, p) = Σ_w t_r(w) · t_p(w).  (S5)

We define the title score in an analogous manner, based on the bag-of-words representation of titles instead of abstracts.
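The inner product of two sparse TF-IDF vectors can be sketched as follows; function names and the toy idf table are assumptions for illustration only.

```python
from collections import Counter

def tfidf(words, idf):
    """Sparse TF-IDF vector of a bag of words: t(w) = tf(w) * idf(w)."""
    return {w: c * idf.get(w, 0.0) for w, c in Counter(words).items()}

def abstract_score(reviewer_words, paper_words, idf):
    """Inner product of the reviewer's and paper's TF-IDF vectors (Eq. S5)."""
    r, p = tfidf(reviewer_words, idf), tfidf(paper_words, idf)
    # Only words present on both sides contribute to the inner product.
    return sum(r[w] * p[w] for w in r.keys() & p.keys())

idf = {"deep": 1.0, "learning": 1.0, "protein": 2.0}
score = abstract_score(["deep", "deep", "learning"], ["deep", "learning"], idf)
print(score)  # → 3.0
```

Representing the vectors as dicts keeps the computation linear in the number of distinct words rather than the vocabulary size.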
Bidding.
We simulate bids by combining the subject, title, and abstract scores as follows. First, we define a total score

s(r, p) = s_subj(r, p) + s_title(r, p) + s_abs(r, p),  (S6)

which reflects the assumptions we made about a reviewer's bidding behavior, i.e., a higher total score reflects a higher reviewer interest in the paper. For each reviewer, the total score induces a ranking of the papers in the corpus. To obtain the positive bids, we randomly retain high-ranked papers with a probability that decays with the paper's rank; the two hyperparameters of this decay respectively control the steepness of the drop in sampling probability for low-ranked papers and the average number of papers that each reviewer bids on, and their values are fixed empirically in our experiment.
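The sampling step can be sketched as follows. The exact decay function and hyperparameter values used in the paper are not reproduced here; this sketch uses a simple exponential decay as one plausible instantiation, and both parameter names are hypothetical.

```python
import math
import random

def sample_positive_bids(ranked_papers, decay=0.05, scale=1.0, seed=0):
    """Retain high-ranked papers with a probability that decays in the rank.

    Illustrative decay: p(k) = min(1, scale * exp(-decay * k)), where k is the
    0-based rank of the paper under the reviewer's total score.
    """
    rng = random.Random(seed)
    bids = []
    for k, paper in enumerate(ranked_papers):
        if rng.random() < min(1.0, scale * math.exp(-decay * k)):
            bids.append(paper)
    return bids

papers = [f"paper-{i}" for i in range(200)]
bids = sample_positive_bids(papers)
```

With this parameterization the steepness of the drop is governed by `decay`, while `scale` shifts the overall retention probability and hence the expected number of bids per reviewer.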
The quality of bids obtained from this sampling procedure is very reasonable. However, the majority of papers had very few bids (see Fig. S1(a)) – contrary to statistics observed in a real conference such as NeurIPS 2016 (see Figure 1 in (Shah et al., 2018)). To match the distribution of the number of bids per reviewer/paper to that of a real conference, we further subsample papers (resp. reviewers) to encourage selecting ones with more bids. The distribution of the number of positive bids per reviewer/paper after subsampling is shown in Fig. S1(b). Our finalized conference dataset contains a pool of reviewers and submitted papers whose relative sizes are a realistic balance for recent ML/AI conferences.
Finally, some conferences allow more fine-grained bids, such as "in a pinch", "willing", and "eager" for conferences managed using CMT. To simulate bid scores that reflect the degree of interest, we quantize the total score of all positive bids into the discrete range {1, 2, 3} at a fixed ratio for the bids 1, 2, and 3, based on the distribution of bid scores in a real conference.
Appendix B Features and Training
We provide details regarding feature extraction and model training in this section. To fully imitate a conference-management environment, we extract relevant features from papers and reviewers that are obtainable in a realistic scenario, including: paper/reviewer subject areas (5 areas each), the bag-of-words vector of the paper title, and the (simulated) TPMS score. These features are further processed and concatenated as input to the linear regression model in Section 3. Table S2 lists all the extracted features and their dimensions. Paper title (PT) is the vectorized count of words appearing in the paper's title, while paper subject area (PS), reviewer subject area (RS), and intersected subject area (IS) are categorical features represented using binary vectors. The first dimension of the TPMS vector (TV) is the TPMS score for the reviewer-paper pair. We also quantize the raw TPMS score into 11 bins and use the bin index as well as the quantized scores, which accounts for the remaining 11 dimensions of the TPMS vector.
RS-PS, RS-PT, IS-PT, and IS-TV are additional quadratic features that capture the interaction between feature pairs. The introduction of these quadratic features results in a very high-dimensional, albeit extremely sparse, feature vector, and hence many dimensions can be collapsed without a significant impact on performance. We apply feature hashing (Weinberger et al., 2009) to the quadratic features at a hash ratio of 0.01, which substantially reduces the total feature dimensionality.
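Signed feature hashing can be sketched as follows. This is a generic illustration of the hashing trick of Weinberger et al. (2009), not our released implementation; the choice of crc32 and the feature names are assumptions.

```python
import zlib

def hash_features(features, n_buckets):
    """Map a named sparse feature vector into n_buckets dimensions.

    A pseudo-random sign per feature keeps inner products between hashed
    vectors unbiased in expectation, even when features collide.
    """
    out = [0.0] * n_buckets
    for name, value in features.items():
        h = zlib.crc32(name.encode("utf-8"))
        sign = 1.0 if h & 1 == 0 else -1.0  # one hash bit picks the sign
        out[(h >> 1) % n_buckets] += sign * value
    return out

dense = hash_features({"RS-PS:vision*nlp": 1.0, "IS-PT:deep": 2.0}, 16)
```

Because the hash is deterministic, the same feature name always lands in the same bucket with the same sign, so train- and test-time vectors stay consistent.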
Table S2: Extracted features and their dimensions.

  paper titles (PT): 930
  paper subject area (PS): 368
  reviewer subject area (RS): 368
  intersected subject area (IS): 368
  TPMS vector (TV): 12
  RS-PS: 135424
  RS-PT: 342240
  IS-PT: 342240
  IS-TV: 4410
Table S3: AP@k of the linear regression model on a train-test split.

                            k=1   k=2   k=3   k=4   k=5   k=6   k=7   k=8   k=9   k=10
  AP@k per reviewer  train  0.41  0.41  0.40  0.39  0.38  0.38  0.37  0.37  0.36  0.35
                     test   0.38  0.41  0.39  0.38  0.38  0.37  0.36  0.36  0.35  0.34
  AP@k per paper     train  0.55  0.53  0.51  0.50  0.49  0.47  0.46  0.45  0.43  0.42
                     test   0.58  0.55  0.52  0.51  0.48  0.47  0.45  0.44  0.43  0.41
Model performance.
To validate our linear regression model and the selected features, we measure the average precision at k (AP@k) of the trained model on a train-test split. Table S3 shows the AP@k per reviewer (P@k for finding papers relevant to a reviewer, averaged across all reviewers) and the AP@k per paper for the linear regressor. Both metrics are at an acceptable level for real-world deployment, and the train-test gap is minimal, indicating that the model generalizes well beyond the observed bids.
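The AP@k metric used in Table S3 can be sketched as follows: precision at k for a single ranking, averaged over reviewers (or, symmetrically, over papers). Function names are illustrative.

```python
def precision_at_k(ranked, relevant, k):
    """P@k: fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def ap_at_k(rankings, relevant_sets, k):
    """AP@k as reported in Table S3: P@k averaged across reviewers/papers."""
    vals = [precision_at_k(r, rel, k) for r, rel in zip(rankings, relevant_sets)]
    return sum(vals) / len(vals)

# Two reviewers, each with a ranked paper list and a set of relevant papers:
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevant = [{"a", "c"}, {"x", "y"}]
print(ap_at_k(rankings, relevant, 2))  # → 0.75
```

Here the first reviewer's top-2 list contains one relevant paper (P@2 = 0.5) and the second contains two (P@2 = 1.0), giving an average of 0.75.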
Table S4: Papers assigned to the six selected reviewers, along with the corresponding bid scores. (The per-reviewer paper listings are omitted.)
We also perform a qualitative evaluation of the end-to-end assignment process using the relevance scoring model. We select six representative (honest) reviewers from our dataset – Kavita Bala (https://scholar.google.com/citations?user=Rh16nsIAAAAJ), Ryan P. Adams (https://scholar.google.com/citations?user=grQ_GBgAAAAJ), Peter Stone (https://scholar.google.com/citations?user=qnwjcfAAAAAJ), Yejin Choi (https://scholar.google.com/citations?user=vhPtlcAAAAJ), Emma Brunskill (https://scholar.google.com/citations?user=HaN8b2YAAAAJ), and Elad Hazan (https://scholar.google.com/citations?user=LnhCGNMAAAAJ) – representing distinct areas of interest in ML/AI. Table S4 shows the assigned papers for the selected reviewers, which appear to match the area of expertise of the respective reviewers very well. Many of the assigned papers have a bid score of 0 despite being highly relevant to the reviewer, which shows that the scoring model is able to discover missing bids and improve the overall assignment quality.
Appendix C Additional Experiment on White-box Attack
In Section 5 we evaluated our defense against white-box attacks that succeeded in securing the target paper assignment. However, it is possible that malicious reviewers who did not succeed initially will inadvertently become high-ranked after other reviewers are removed from the candidate set. Therefore, it may be necessary to detect all attack instances in the candidate set, rather than only those that were successfully assigned.
Fig. S2 shows the detection TPR for all attackers that were initially ranked outside the candidate set but managed to move into it after the attack. Since this attacker pool includes many that obtained a relatively low rank, the detection TPR is much higher than that of Fig. 3: even when the colluding party is significantly larger, detection remains viable with a TPR of more than 40%. This experiment shows that our detection mechanism is unlikely to inadvertently increase the success rate of failed attacks.
Appendix D Black-box Attack
The white-box attack from Section 4.1 assumed that the adversary has extensive knowledge of the assignment system and of all reviewers' features/bids. In this section, we propose a more realistic colluding black-box attack, where the adversary only has access to the features/bids of the reviewers in the colluding party. This attack represents a reasonable approximation of what a real-world adversary could achieve, and we show that it is potent against the scoring model of Section 3 absent any detection mechanism.
Colluding black-box attack.
The failure of the simple black-box attack from Section 2 is due to the malicious reviewer bidding positively only on the single target paper, instead of also bidding on a group of papers that are similar to it. We alter the attack strategy by giving the largest bid score to the papers that are most similar to the target paper (including the target paper itself). In practice, this can be done by comparing titles and abstracts to those of the target paper. We simulate this attack in our experiment by selecting papers whose feature vectors have a high inner product with that of the target paper.
We can extend this strategy to allow for colluding attacks. The malicious reviewer first selects the reviewers with the most similar background to form the colluding group. In simulation, we measure reviewer similarity by the inner product between the respective reviewer-related features. Mimicking the malicious reviewer's paper-selection strategy, every reviewer in the colluding group then gives the largest bid score to the papers whose feature vectors have the highest inner product with that of the target paper.
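Both selection steps (decoy papers and colluder recruitment) reduce to the same primitive: picking the m candidates with the largest feature inner product against a target. A minimal sketch, with illustrative names:

```python
def inner(u, v):
    """Inner product of two dense feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def most_similar(target_vec, candidates, m):
    """Return the m candidates (papers or reviewers) whose feature vectors
    have the largest inner product with the target's feature vector."""
    return sorted(candidates, key=lambda c: -inner(candidates[c], target_vec))[:m]

# Select the 2 decoy papers most similar to the target paper's features:
target = [1.0, 0.0]
papers = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [0.5, 0.5]}
picked = most_similar(target, papers, 2)
print(picked)  # → ['p1', 'p3']
```

The same call with reviewer feature vectors as `candidates` yields the colluding group.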
Attack performance.
Fig. S3 shows the success rate of the colluding black-box attack against the linear regression model. This attack is much more successful than the simple black-box attack from Section 2, which had a success rate of 0% for all reviewers below rank 16. Here, the success rate before the attack is initially 0%, and it increases to close to 5% after the attack even without collusion. Increasing the size of the colluding party strictly improves attack performance, while attackers with a lower initial rank are less successful. Compared to the white-box attack from Section 4.1 (see Fig. 2), the colluding black-box attack is, as expected, substantially less potent.
Detection performance.
For completeness, we evaluate the detection algorithm from Section 4.2 against successful colluding black-box attacks. In Fig. S4, we plot the detection TPR as a function of the size of the colluding party for various choices of the detection parameter. For both attacks that succeeded (left) and attacks that achieved a top-50 rank (right), the detection TPR is close to 1 for small colluding parties and remains very high as the party grows: the detection TPR stays above 80% for successful attacks (left plot), which is in sharp contrast with the same setting in Fig. 3 for the white-box attack, where the TPR is reduced to 0%. The detection performance against this more realistic colluding black-box attack further validates our robust assignment algorithm as a practical countermeasure against bid manipulation.