1. Introduction
Online social networks are frequently targeted by malicious actors who create ‘fake’ or ‘sybil’ accounts for the purpose of carrying out abuse. Broadly, abuse is conducted in three phases: First, malicious actors create accounts. These accounts then need to establish connections with real users (e.g. by sending friend requests on Facebook). Once they establish sufficient connections, fake accounts can expose their social networks to a variety of malicious activities.
According to its latest Community Standards Enforcement Report (Facebook, 2019), Facebook disabled over billion such accounts in the first quarter of 2019. The vast majority of these accounts were disabled during or within minutes of account creation, and % were disabled before being reported by a Facebook user. Despite these impressive figures, the fraction of such accounts that survives registration-time classifiers and forms connections on Facebook still constituted roughly % of monthly active users in 2019 (Facebook, 2019).
In this paper, we focus on social-graph-based detection of new fake accounts that manage to evade registration-time classifiers but have not yet made sufficient connections to perpetrate abuse. We define new accounts as those that are less than days old or have sent fewer than friend requests.
While the general problem of using the social graph to detect fake accounts is well-studied, existing algorithms typically do not apply to new accounts. This is because mainstream graph-based algorithms use a structural difference to detect fake accounts: namely, that fake accounts tend to have lower connectivity to real users. When popularized over a decade ago, this approach exhibited a key advantage: it was assumed that online social network companies only knew the true {fake, real} labels for a handful of users, and this handful was sufficient to seed a graph-based detection algorithm based on this structural difference. Nonetheless, one disadvantage is that real and fake users will only tend to exhibit this structural difference when they have made a reasonable fraction of their connections, so these algorithms tend to exclude new users from effective detection (Yang et al., 2012b; Boshmaf et al., 2015a; Al-Qurishi et al., 2017; Ramalingam and Chinnaiah, 2018).
However, both the resources available to online social networks and the challenges they face have evolved in the 14 years since the popularization of graph-based algorithms. For example, Facebook now possesses high-confidence {fake, real} labels for a majority of its active users (not just the handful assumed by existing graph-based algorithms). It is therefore now possible to use these additional labels to estimate not just structural differences, but also individual-level differences in how different users interact with real and fake accounts. Nonetheless, these labels are typically only available for users who have been active for at least several weeks. Thus, it is natural to consider whether today’s greater data availability can inform algorithms capable of detecting new fake accounts.
Specifically, using Facebook’s data on friending activity and known fake accounts, we observe that there are in fact important individual-level differences in how fake accounts interact with real users, and how real users react to fake accounts. Observing these differences requires looking beyond aggregate statistics such as a user’s overall reject rate for the friend requests she sends. For example, Fig. 1 (top) shows the distributions of reject rates for fake and real accounts. As the figure shows, many fakes (like many real accounts) either never or rarely have their friend requests rejected.
Disaggregating this data reveals two key differences: First, for certain users (but not others), whether a request comes from a fake or a real account is highly determinative of her decision to accept or reject. Fig. 1 (middle) plots the ratios of each Facebook user’s rates at which she accepts friend requests from reals and fakes. Mass at represents Facebook users who accept/reject fakes at the same rate as reals, which provides no information about the sender’s {fake, real} label. Mass to the right of represents users who are more likely to accept a request from a real user than one from a fake user by a factor corresponding to the axis. Thus, an unknown user whose friend request is accepted by such a recipient is more likely to be real. For example, the mass at represents users who are times more likely to accept a request from a real user than a fake. In fact, over of Facebook users are at least times as likely to accept either a real or a fake (i.e., mass outside ), which provides a strong signal of their senders’ labels. Because the tails of this distribution are very wide, we round all users with ratios outside to these bounds in the plot.
Second, we observe a key difference in how some fake accounts select targets for their friend requests compared to real accounts. Specifically, certain users tend to be more or less frequently targeted by friend requests from fakes compared to real users, such that sending a request to such a target reveals information about the sender’s label. Fig. 1 (bottom) plots the ratios of the fraction of reals’ requests and the fraction of fakes’ requests that target each Facebook user. Here, mass at represents users who are equally likely to be selected as the recipient for a fake sender’s friend request as for a real sender’s friend request. Note that some users (mass to the left of ) are preferred by fake senders, but many users (mass to the right of ) are disproportionately likely to be selected by a real sender. Thus, an unknown user who sends a request to such a target to the right of is more likely to be real, and vice versa. Here, of users are at least times as likely to be selected by a real vs. a fake or vice versa (i.e., mass outside ), which provides a strong signal of their senders’ labels.
These two key individual-level differences suggest a new means to detect new fake users despite their sparse connections: existing users are unequal in how their acceptances of friend requests reflect information about senders’ real/fake labels, and real/fake senders deliberately target their requests to different sets of recipients.
Main contribution. In this paper we present SybilEdge, an algorithm to identify new fake accounts on social networks. SybilEdge returns the probability that each new user is a fake by aggregating over (I) her choices of friend request targets and (II) these targets’ corresponding accepts/rejects. We show that this algorithm rapidly detects new fake users at scale on the Facebook network and outperforms state-of-the-art benchmark algorithms. We also show that SybilEdge is robust to label noise in the training data, to greater prevalence of fake accounts in the network, and to several different ways fakes can select targets for their friend requests. To our knowledge, this is the first time a graph-based algorithm has been shown to achieve high performance (AUC) on new users who have sent only a small number of friend requests.
Technical overview. SybilEdge classifies new users by combining three key components: First, SybilEdge estimates whether a new user is a fake by aggregating over her choices of friend request targets, giving more weight to targets to the extent they are preferred by other fakes vs. real users. Second, SybilEdge aggregates over these targets’ responses (accept/reject) to the user’s friend requests, giving more weight to targets to the extent they respond differently to fakes versus real users. Finally, during these aggregations, SybilEdge gives more weight to choices of targets and their responses when we are more confident that they distinguish fakes from real users. Together, these three components give SybilEdge a natural means to elicit the information about a new user’s {fake, real} identity from each of her friendship edges.
1.1. Related work
A variety of work has proposed graph-based algorithms to detect fake accounts (Yu et al., 2006, 2008; Yang et al., 2012b; Gong et al., 2014; Danezis and Mittal, 2009; Jia et al., 2017; Boshmaf et al., 2015a; Wang et al., 2017a, b; Al-Qurishi et al., 2017; Xue et al., 2013; Wang et al., 2018b; Yang et al., 2012a; Mohaisen et al., 2011; Qi et al., 2018; Subrahmanian et al., 2016; Wang et al., 2018a; Yang et al., 2014; Alvisi et al., 2013; Mislove et al., 2010; Cao et al., 2015; Zhang et al., 2015; Fu et al., 2017). Mainstream graph-based algorithms typically proceed from the homophily assumption, which assumes that a pair of connected users shares the same {fake, real} label with high probability, such that fakes tend to be poorly connected to real users overall (Yu et al., 2006, 2008; Yang et al., 2012b; Gong et al., 2014; Wang et al., 2017b; Al-Qurishi et al., 2017). Based on this assumption, a variety of graph-based algorithms attempt to propagate trust out from a small known set of trusted real users to unknown ones based on their connectivity to the known set.
More specifically, these algorithms typically propagate trust outwards via either random walks or Markov random fields (i.e. loopy belief propagation methods). Random walk based methods proceed on the basis of the assumption that unknown real users will be reachable in relatively few hops from the known set of real users, whereas reaching fake accounts requires additional hops on average. These algorithms therefore typically proceed via a series of short random walks on the network to partition nodes into real and fake sets on this basis. Random walk based methods include the seminal SybilGuard algorithm (Yu et al., 2006), as well as SybilLimit (Yu et al., 2008), SybilInfer (Danezis and Mittal, 2009), SybilWalk (Jia et al., 2017), Integro (Boshmaf et al., 2015a), and SybilRank (Yang et al., 2012b). Importantly, while random walk based approaches require either a known set of real users or a known set of fakes, they cannot leverage both at the same time. They are also considered less robust to misclassification (i.e. label noise) in the set of known users (Wang et al., 2017b).
In contrast to random walk based methods, loopy belief propagation methods take a probabilistic view. These methods use Markov random fields to capture network structure and define a joint probability distribution over each node’s label, which is iteratively updated to propagate labels from known fake or real nodes to unknown ones. Algorithms of this type include the seminal SybilBelief (Gong et al., 2014), SybilFuse (Gao et al., 2018), and GANG (Wang et al., 2017a). Such algorithms are able to incorporate information about both known real and known fake nodes, and they are also robust to some noise in this set of known labels. Recently, Wang et al. proposed a hybrid algorithm, SybilSCAR (Wang et al., 2017b), based on this approach. SybilSCAR iteratively propagates probabilistic estimates of unknown nodes’ labels based on a known set of users of each type.
Importantly, both types of algorithm require that all users have had sufficient ‘stabilization time’ to make the majority of their connections such that they will exhibit the homophily assumption (Yang et al., 2012b; Boshmaf et al., 2015a; Al-Qurishi et al., 2017; Ramalingam and Chinnaiah, 2018). Due to these requirements, evaluations of fake detection algorithms have often excluded users with less than e.g., to months of tenure on the social network (Yang et al., 2012b; Boshmaf et al., 2015a), which provides an ample ‘grace period’ for fake accounts to perpetrate abuse.
One partial exception is VoteTrust (Xue et al., 2013). VoteTrust assumes that a majority of users (including those with known real labels) will be long-tenured, but this long-tenured set can be leveraged to classify a new user. For VoteTrust, however, this advantage comes at a cost: VoteTrust requires the additional dataset of (ideally) all historical friendship requests in the history of the social network, or at the very least, sufficient historical requests such that the directed graph of requests is connected (Xue et al., 2013). We note that data on old friendship requests is typically not among the datasets considered to be readily accessible for analysis in the current generation of online social networks.
The homophily assumption may also cause these algorithms to misclassify when some ‘successful’ fake accounts succeed in connecting to many real accounts. Recent research (Freeman, 2017; Ghosh et al., 2012) suggests that this phenomenon is relatively prevalent on social networks. For the same reason, the homophily assumption renders these algorithms vulnerable to sampling attacks whereby a malicious user defeats these algorithms by instructing some of her fake accounts to send many friend requests to real users (knowing that many of these accounts may be detected), then instructing her remaining fake accounts to send requests only to the subset of real users who were willing to accept requests from fakes. By generating fake users who are densely connected to real accounts, the attacker may succeed in convincing an algorithm that fake users are real (Xue et al., 2013; Cao and Yang, 2013).
Paper organization. We present the SybilEdge algorithm in Section 2. We evaluate the performance of SybilEdge on the Facebook network in Section 3. We study SybilEdge’s robustness to label noise in Section 4 and its robustness to the prevalence of fake accounts in Section 5. We conclude the paper in Section 6.
2. The SybilEdge algorithm
In this section we derive the SybilEdge (Expert Decision Given Edges) algorithm and its three key components: target selection, target response, and confidence weighting.
Preliminaries.
Our goal is to determine the posterior probability that a new user is fake as a function of the set of targets (friend request recipients) to whom she sends friend requests and their respective responses. Let represent user ’s label as a fake/sybil () or real/benign () account, that is, the label we want to learn. Let denote target ’s response to ’s friend request (i.e. accept or reject), where denotes that accepted ’s request, and let denote the binary vector of all responses to ’s requests from her set of targets .
We denote by (and ) an arbitrary fake (and real) sender’s probability of choosing user as the target when she sends her first friend request. We denote by the vector of probabilities for all of ’s targets , and by the corresponding vector of probabilities for all of ’s targets. We denote the probability that accepts a request from a fake or a real sender as and , respectively. Similarly, we denote by the vector of accept probabilities for all of ’s targets, and by the vector of accept probabilities for ’s targets.
Finally, suppose we know a labelled set of known fake and real users, where is the set of known fakes and is the set of known real users, and suppose we have prior knowledge of ’s label.
Notation  Description 

The user whose label we infer;  
User ’s label: fake () i.e. sybil, or real () i.e. benign;  
Prior on user ’s probability of being a fake;  
Posterior probability that is a fake;  
The set of targets sends friend requests to;  
,  Probabilities that user is the target of an arbitrary fake and real sender’s first friend request, resp.; 
,  The vectors of probabilities and , respectively for all targets whom sends requests to; 
Target ’s response to ’s request ( accept);  
Obs. responses of all ’s targets;  
,  Probabilities that target accepts a friend request from a fake and from a real sender, respectively; 
,  Vectors of probabilities and , respectively for all targets whom sends requests to; 
Sets of known (users, fakes, reals);  
Counts of known (users, fakes, reals) who sent requests to ;  
Counts of known (users, fakes, reals) whose requests accepted;  
Priors on target ’s quality as a classifier of users who send requests to and are accepted by , resp. 
[Notation used in SybilEdge]
2.1. Component I: a user’s selection of targets
Here, we derive the first component of SybilEdge, which updates our estimate of whether user is a fake based on whether she selects targets for her friend requests that are preferred by known fake versus known real senders. Specifically, we model that each new user selects a target for her first friend request via a draw from a multinomial distribution corresponding to her {fake, real} label: Fake users select each target with probability , but real users select with probability . We can then estimate the posterior probability that a sender is fake based on the relative probabilities that a fake/real user would have selected ’s set of targets:
(1) 
Where and denote the vector of all probabilities and , respectively, for the targets to whom user sends requests.
We assume that conditional on the sender’s label , the relative probability that a sender selects any target is conditionally independent of everything else [1], and that the count of friend requests the sender sends is independent of her label [2]. Technically, as a user sends more friend requests, she reduces the remaining set of possible targets for her next friend request, making each of them slightly more probable for the next request. However, because the network is very large compared to any user’s number of friend requests, sampling targets with replacement is a very good approximation of sampling without replacement [3].
Footnote 1. This is a standard assumption (see e.g. Raykar et al., 2010). While not true in general (e.g. some targets are more popular), this assumption is advantageous as it may limit the effect any one observation has on model predictions, rendering it more adversarially robust.
Footnote 2. The assumption that a user’s count of friend requests is independent of her label is advantageous because it allows SybilEdge to apply equally to accounts that are e.g. and days old, that is, accounts that have sent fewer/more friend requests.
Footnote 3. Absent this approximation, we would renormalize and after each subsequent request, so e.g. the numerator in eq. 2 would become , where denotes the targets to whom she sent requests before .
Thus, we can then compute this target selection component via:
(2) 
Here, the numerator is the joint probability of sender ’s selections of friend request targets given these targets’ probabilities at which they are selected by fake accounts. The denominator then gives the total probability of ’s selections of targets, which we compute by adding the probability of these selections given that the sender was fake plus the probability that they occurred given that sender was real. Therefore, the entire expression gives the relative probability that is fake given her selections of targets, scaled by , the prior probability that is fake (for example, we might set this to the overall fraction of fake accounts at Facebook).
The key intuition is that eq. 2 only updates our posterior estimate that is fake to the extent her targets are selected by fake and real users at different rates (i.e. to the extent that sends requests to targets who are further from = in Fig. 1, bottom). In Section 2.4, we show how to estimate targets’ selection rates and .
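The target selection update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function and argument names are hypothetical, the per-target selection probabilities are assumed to be known and strictly positive, and the prior weights the fake and real hypotheses in the usual Bayesian way.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def selection_posterior(prior_fake, p_fake, p_real):
    """Posterior that a sender is fake given only her choice of targets.

    prior_fake: prior probability that the sender is fake (0 < prior < 1).
    p_fake[k], p_real[k]: assumed-known probabilities that a fake / real
    sender picks the k-th target (hypothetical inputs, strictly positive).
    """
    # Sum logs instead of multiplying: selection probabilities are tiny on a
    # large network, so a raw product over many targets could underflow.
    log_fake = math.log(prior_fake) + sum(math.log(p) for p in p_fake)
    log_real = math.log(1.0 - prior_fake) + sum(math.log(p) for p in p_real)
    # fake / (fake + real), rewritten as a logistic function of the log-odds.
    return _sigmoid(log_fake - log_real)
```

When every target is selected at the same rate by fakes and reals, the function returns the prior unchanged, which matches the intuition stated above.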
2.2. Component II: targets’ responses
Here, we derive the second component of SybilEdge, which updates our estimate of whether user is fake based on her targets’ responses to her friend requests. Suppose (unlike above) that a target is equally likely to receive a friend request from an arbitrary real or fake account, such that receiving a friend request from a user reveals no information about that user’s label (in this case, , so eq. 2 reduces to the prior ). However, suppose we observe the targets’ responses (acceptances/rejections) of ’s friend requests, and targets may accept fake senders’ requests at different rates than real senders’ requests. If we know each target’s probabilities and of accepting a request from a fake sender and from a real sender, respectively, then we can use the sequence of observed responses to each of user ’s friend requests to estimate the probability that she is fake. Denote by and the vectors of probabilities and , respectively, for all targets to whom sends requests. Assume that conditional on the sender’s label , targets’ responses are conditionally independent of everything else. We estimate the probability is fake via:
(3) 
We now show how to compute this probability. Because a target may accept or reject a request, we first simplify notation by defining a function that takes two inputs: target ’s accept or reject of ’s friend request, and the indicator of whether the source is fake or real. returns the probability that target accepts conditional on her {fake, real} label if we observe that ’s friend request was accepted by , or the complement of this probability if we observe that was rejected by :
Now we can compute this target response component via:
(4) 
Here, the product in the numerator is the probability of observing sender ’s accepts and rejects conditional on her targets using the probabilities at which they accept and reject fake accounts. The denominator then gives the total probability of observing these accepts and rejects. At a high level, the entire expression captures the question ‘did source ’s accepts/rejects appear to be due to her targets treating her as they treated fakes or as they treated reals?’.
The key aspect to note is that eq. 4 only updates the posterior estimate that is fake to the extent her targets respond differently to requests from fakes vs. reals (i.e. to the extent that ’s targets are further from = in Fig. 1, middle). In section 2.4, we show how to estimate targets’ accept rates and for each class of senders.
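As a hedged sketch of the target response component, the helper below plays the role of the accept/reject function described above, and the posterior follows the numerator and denominator structure of eq. 4. All names are illustrative assumptions, and accept probabilities are assumed to lie strictly between 0 and 1.

```python
def response_likelihood(accepted, accept_prob):
    """Probability of the observed response given a target's accept probability."""
    return accept_prob if accepted else 1.0 - accept_prob


def response_posterior(prior_fake, responses, a_fake, a_real):
    """Posterior that a sender is fake given only her targets' responses.

    responses[k]: True if the k-th target accepted, False if it rejected.
    a_fake[k], a_real[k]: assumed-known probabilities that the k-th target
    accepts a request from a fake / real sender (hypothetical inputs).
    """
    like_fake = prior_fake
    like_real = 1.0 - prior_fake
    for accepted, af, ar in zip(responses, a_fake, a_real):
        like_fake *= response_likelihood(accepted, af)
        like_real *= response_likelihood(accepted, ar)
    return like_fake / (like_fake + like_real)
```

For example, a single rejection by a target who accepts 80% of reals but only 20% of fakes moves a 0.5 prior to 0.8, whereas a response from a target with equal accept rates leaves the prior untouched.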
2.3. The SybilEdge equation
Here we show how to compute the key equation in the SybilEdge algorithm, which combines these target selection and target response components to aggregate the information about a user’s {fake, real} label contained in each of her friendship edges. Specifically, we say that the probability of observing each of ’s accepted or rejected edges can be decomposed as (I) the probability that would select the edge’s target conditional on ’s {fake, real} label, and (II) the target’s response conditional on ’s selection of the target and ’s label. We thus determine the posterior probability is a fake by aggregating over ’s edges via the SybilEdge equation:
(5) 
Here, the products in the numerator give us the joint probability that (I) selects the set of targets to whom she sends friend requests as a fake user would select targets; and (II) these targets respond with the accepts and rejects we observe given that they treat as a fake when accepting/rejecting her. The products in the denominator then give the total probability that selects these targets and they respond with the accepts/rejects we observe. The SybilEdge equation therefore gives us the relative probability that ’s set of requests, accepts, and rejects are those of a fake user.
The SybilEdge equation thus captures our intuitions that a user is more likely to be a fake to the extent that she selects targets who are preferred by fakes (for whom ), and also to the extent her targets respond differently to her requests than they usually respond to requests from reals (for whom ).
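The combined update can be sketched in log-odds form, where each edge contributes one selection term and one response term. This is an illustrative sketch rather than the authors' code: the tuple layout and function names are assumptions, and all probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def edge_log_odds(accepted, p_fake, p_real, a_fake, a_real):
    """Log-odds contribution (fake vs. real) of a single friendship edge."""
    # (I) Target selection: how much more likely a fake is to pick this target.
    contribution = math.log(p_fake) - math.log(p_real)
    # (II) Target response: how much more likely this response is for a fake.
    if accepted:
        contribution += math.log(a_fake) - math.log(a_real)
    else:
        contribution += math.log(1.0 - a_fake) - math.log(1.0 - a_real)
    return contribution


def combined_posterior(prior_fake, edges):
    """Posterior that a user is fake given all of her edges.

    edges: iterable of (accepted, p_fake, p_real, a_fake, a_real) tuples,
    a hypothetical layout chosen for this example.
    """
    log_odds = math.log(prior_fake) - math.log(1.0 - prior_fake)
    for edge in edges:
        log_odds += edge_log_odds(*edge)
    # Stable logistic transform from log-odds back to a probability.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)
```

Working in log-odds keeps the computation stable for users with many edges, and the per-edge contributions returned by `edge_log_odds` show which targets moved the estimate.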
2.4. Component III: weighting target confidence
The discussion above assumes we know the true probabilities at which fakes and reals each select each target ( and ), and the probabilities at which each target accepts a request from either class ( and ). In practice, we must estimate these parameters from observed social graph data. Therefore, we introduce the final component of the SybilEdge algorithm: SybilEdge gives more weight to selections of targets and targets’ responses not only as a function of the magnitude of the difference of targets’ request and accept probabilities for fakes vs. real users (as above), but also as a function of our confidence in these differences.
SybilEdge accomplishes this confidence weighting as follows. First, consider how to compute , , that is, each target’s probability of accepting a friend request from an arbitrary fake or real user. Suppose we know a set of existing fakes and a set of existing real users. The maximum-likelihood estimate of is just target ’s count of accepts of the requests she received from known fakes divided by the total count of these requests. However, if we used this approach for all targets, then the SybilEdge equation would give equal weight to a target who responded to only a few requests (i.e. a target whose accept rates we know with low confidence) and a target who had responded to thousands of requests (whose accept rates we know with high confidence).
Therefore, we instead use estimators for these rates that, in the absence of data to the contrary, shrink and towards each other. This is because in the case where , we say target is equally likely to accept ’s friend request regardless of whether is a fake or a real, so the fact that accepts does not update ’s probability of being a fake according to the target response component of the SybilEdge equation above.
Specifically, let denote the count of ’s acceptances of friend requests from users with known labels, and let and denote the counts accepted from known fakes and known reals, respectively. Let denote the count of all friend requests that known users sent to target , and let and denote ’s count of friend requests from just known fake senders and just known real senders, respectively. We use estimators that reweight target accept rates based on our confidence via:
(6) 
Where is a ‘confidence’ prior on target for the target response component of SybilEdge. Setting recovers the maximum likelihood estimators for and , which compel the SybilEdge equation to place equal weight on targets for whom we have observed more or less acceptance data [5]. In contrast, as we increase , we shrink and together to a degree that is inversely proportional to the count of friend requests responded to, which compels SybilEdge to place less weight on targets who have only accepted/rejected a small number of reals or fakes in the past. In this case, the SybilEdge equation will tend to learn only from targets whom we have repeatedly observed accepting reals at a different rate than fakes (i.e. targets whose acceptance rates are known with high confidence). Similarly, by increasing for a particular target but not others, we can selectively downweight the influence of target ’s accepts/rejects on the model’s predictions, which may be advantageous if we suspect target of being a malicious or adversarial user.
Footnote 5. There is a mathematical equivalence between these estimators and the Beta conjugate model in Bayesian inference.
SybilEdge uses a similar approach to place more weight in the target selection component on targets when we are more confident (i.e. have observed more data) about how they are selected by fakes vs. reals. Similarly to above, we could imagine maximum likelihood estimators for the probability (or ) that a fake (or real) user will send her first friend request to target by computing target ’s count of requests received from known fakes (or reals) divided by the count of all requests sent by known fakes (or reals). But, as above, this approach would cause SybilEdge to give equal weight to a target who received only a few requests (i.e. a target whose rates we know with low confidence) and a target who received thousands of requests (whose rates we know with high confidence).
Therefore, we instead reweight target selection rates based on our confidence. Specifically, let denote the count of all friend requests sent by known users, and let and denote the counts sent by known fakes and reals, respectively. Instead of the maximum likelihood estimators described above, we use the following to reweight target selection rates:
(7) 
Here, is our ‘confidence’ prior on target for the target selection component of the SybilEdge equation: if we set , the SybilEdge equation places equal weight on friend requests sent to targets for whom we have observed more/less data; increasing causes the SybilEdge equation to place more weight on targets for whom we have observed more data. More specifically recovers the maximum likelihood estimators for and , whereas increasing shrinks and towards each other to a degree that is inversely proportional to the overall count of friend requests received from fakes or reals. This in turn causes the SybilEdge equation to place less weight on learning from targets for whom we have observed fewer friend requests (recall that the target selection component only updates the probability that a user is fake to the extent that fake users send requests to her targets at different rates than real users). Similarly, by setting higher for a particular target compared to others, we downweight the influence of the selection of target compared to other targets in the SybilEdge equation.
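To illustrate the confidence weighting, the sketch below uses pseudo-count shrinkage estimators that have the properties described above: a prior of zero recovers the maximum likelihood estimates, and a larger prior pulls the fake and real rates towards each other, more strongly for targets with little data. The exact functional form, the choice to shrink towards pooled rates, and all names here are assumptions made for illustration; they are not taken from eqs. 6 and 7.

```python
def shrunk_rate(successes, trials, pooled_rate, prior_strength):
    """Shrink successes/trials towards pooled_rate using pseudo-observations."""
    denom = trials + prior_strength
    if denom == 0:
        return pooled_rate  # no data and no prior: fall back to the pooled rate
    return (successes + prior_strength * pooled_rate) / denom


def accept_rates(acc_fake, req_fake, acc_real, req_real, beta):
    """Estimated accept rates of one target for fake and real senders."""
    total = req_fake + req_real
    pooled = (acc_fake + acc_real) / total if total else 0.5
    return (shrunk_rate(acc_fake, req_fake, pooled, beta),
            shrunk_rate(acc_real, req_real, pooled, beta))


def selection_rates(recv_fake, recv_real, sent_fake_total, sent_real_total, alpha):
    """Estimated probabilities that a fake / real sender picks one target.

    The share of this target's requests that come from fakes is shrunk towards
    the global share, so the shrinkage depends on how many requests the target
    has received. The results are not renormalized across targets; only their
    ratio matters for the posterior.
    """
    recv = recv_fake + recv_real
    global_share = sent_fake_total / (sent_fake_total + sent_real_total)
    fake_share = shrunk_rate(recv_fake, recv, global_share, alpha)
    p_fake = fake_share * recv / sent_fake_total
    p_real = (1.0 - fake_share) * recv / sent_real_total
    return p_fake, p_real
```

With `beta = 0`, a target that accepted 1 of 10 fakes and 9 of 10 reals keeps rates of 0.1 and 0.9; with `beta = 20` the same counts give roughly 0.37 and 0.63, so the target still carries signal but with less weight.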
2.5. The SybilEdge algorithm
These target selection, target response, and confidence weighting components form the SybilEdge algorithm:
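As a minimal end-to-end sketch of how these three components might fit together (this is an illustration, not the paper's algorithm listing): the input format, the function names, the shrinkage form, the clamping constant, and the choice to ignore targets with no history are all assumptions of this example. It also assumes the labelled data contains at least one fake and one real sender.

```python
import math
from collections import defaultdict


def _shrink(successes, trials, pooled, strength):
    denom = trials + strength
    return pooled if denom == 0 else (successes + strength * pooled) / denom


def _clamp(p, eps=1e-9):
    # Keep probabilities away from 0 and 1 so that logarithms stay finite.
    return min(max(p, eps), 1.0 - eps)


def score_new_user(known_requests, new_edges, prior_fake, alpha=1.0, beta=1.0):
    """Return an estimated probability that a new user is fake.

    known_requests: (sender_is_fake, target, accepted) triples from senders
        with known labels.
    new_edges: (target, accepted) pairs for the user being scored.
    """
    # counts[target][is_fake] = [requests received, requests accepted]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    sent = {True: 0, False: 0}
    for is_fake, target, accepted in known_requests:
        counts[target][is_fake][0] += 1
        counts[target][is_fake][1] += int(accepted)
        sent[is_fake] += 1
    global_share = sent[True] / (sent[True] + sent[False])

    log_odds = math.log(prior_fake) - math.log(1.0 - prior_fake)
    for target, accepted in new_edges:
        if target not in counts:
            continue  # assumption: targets with no history carry no signal
        (n_f, acc_f), (n_r, acc_r) = counts[target][True], counts[target][False]
        n = n_f + n_r
        # Component I with confidence weighting (alpha): target selection.
        share = _clamp(_shrink(n_f, n, global_share, alpha))
        p_f = share * n / sent[True]
        p_r = (1.0 - share) * n / sent[False]
        # Component II with confidence weighting (beta): target response.
        pooled = (acc_f + acc_r) / n
        a_f = _clamp(_shrink(acc_f, n_f, pooled, beta))
        a_r = _clamp(_shrink(acc_r, n_r, pooled, beta))
        like_f = p_f * (a_f if accepted else 1.0 - a_f)
        like_r = p_r * (a_r if accepted else 1.0 - a_r)
        log_odds += math.log(like_f) - math.log(like_r)

    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)
```

Each new user is scored with one pass over her edges after a single pass over the labelled requests to build the per-target counts.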
2.6. Choosing tuning parameters and
A key property of tuning parameters and is that, by increasing one relative to the other, we can tune SybilEdge to place more emphasis on learning from the set of targets a user chooses to send requests to relative to learning from whether those targets accept or reject. Specifically, as we increase , SybilEdge sets . The algorithm then ceases to update its estimate of ’s label based on the set of targets chooses, and we recover the target response component from the full SybilEdge algorithm. This in turn makes SybilEdge more robust to attack, as a fake user cannot ‘appear real’ by sending requests to recipients who typically are not targeted by fakes. However, this robustness comes at a cost in terms of SybilEdge’s recall. Consider, for example, that when all are large, we will be less likely to detect a fake account that sends requests to targets who receive proportionally many requests from fakes, but who accept fakes at the same rate they accept reals.
2.7. SybilEdge properties
In addition to its strong performance on real and simulated Facebook data, SybilEdge exhibits six advantageous properties:
Rapid classification of new users. Previous methods typically require a lengthy ‘stabilization period’ before a new account can be classified, and are generally less likely to correctly classify a fake account that succeeds in making many friends with real users (even if those users are not discriminating). In contrast, SybilEdge becomes increasingly likely to identify a fake as she (1) sends more friend requests; (2) sends requests to more discriminating targets who accept fakes at a different rate than they accept reals (increasing the difference between and for ’s targets); (3) sends requests to targets who are more often victimized by requests from fake accounts (increasing the difference between and ); and (4) sends requests to targets who are more active users (for whom we have greater confidence in and ).
Robustness to sampling attacks. A key property of SybilEdge is that targets only carry weight in the model to the extent that they receive and accept friend requests from real and fake users at different rates. Thus, a fake account cannot reduce SybilEdge’s estimate of her probability of being fake even if she identifies and connects to many real users who accept e.g. all requests indiscriminately. Note that an indiscriminately accepting target has , which causes the target’s accept or reject to appear on both the numerator and denominator of the target response component of the SybilEdge equation. This target’s response then factors out and has no effect on our posterior estimate of ’s label.
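The cancellation can be written out explicitly. As a sketch in assumed notation (the symbols are illustrative choices for this example): let $\pi$ be the prior that the sender is fake, let $g_k^{f}$ and $g_k^{r}$ be the probabilities of target $k$'s observed response under a fake and a real sender, and suppose target $j$ accepts every sender with the same probability $a$ and is selected at equal rates by fakes and reals. For an accepted request to $j$, the target response posterior becomes:

```latex
\[
\frac{\pi \, a \prod_{k \neq j} g_k^{f}}
     {\pi \, a \prod_{k \neq j} g_k^{f} + (1-\pi)\, a \prod_{k \neq j} g_k^{r}}
\;=\;
\frac{\pi \prod_{k \neq j} g_k^{f}}
     {\pi \prod_{k \neq j} g_k^{f} + (1-\pi) \prod_{k \neq j} g_k^{r}}
\]
```

The right-hand side is the posterior computed without target $j$, so adding any number of such indiscriminate targets leaves the estimate unchanged. The same argument applies to a rejection, with $1-a$ in place of $a$.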
Low complexity. SybilEdge has complexity where is the set of friend requests. Because social networks are typically sparse (Mislove et al., 2007; Gong et al., 2014), we have . This compares favorably to state-of-the-art algorithms such as SybilBelief and SybilSCAR, which require , where is the number of iterations (at least ) and is the set of accepted friend requests (Wang et al., 2017b; Gong et al., 2014).
Interpretability. Unlike mainstream sybil detection algorithms, SybilEdge is interpretable. For example, SybilEdge might classify a user as fake with high probability because her friend requests were rejected by specific users who tend to accept all requests from real users and reject those from fakes, and because she also sent requests to other users who are preferred targets of fakes. Such interpretability enables researchers to audit the model’s classifications—an important precondition for disabling fake accounts.
Probabilistically labelled training data. SybilEdge accepts probabilistically labelled training data rather than binary labelled data if desired. For example, an acceptance of a request from a user that data suggests is fake with probability can be input as an acceptance of fake users and real users.
Robustness to label noise in the training data. In Section 4 below, we show that SybilEdge is robust to the presence of misclassified users in the training dataset of known fake/real users.
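To make the sampling-attack and probabilistic-label properties above concrete, the following Python sketch uses a simplified Bayesian odds update. It is not the SybilEdge equation itself (which also includes a target selection component). The function names, the event tuple layout, and the requirement that accept rates lie strictly between 0 and 1 are illustrative assumptions.

```python
from collections import defaultdict


def accumulate_counts(events):
    """Accumulate fractional per-target counts from probabilistic labels.

    events: iterable of (target, p_fake_sender, accepted) tuples (hypothetical
    layout). A sender believed fake with probability p contributes p to the
    target's fake counts and 1 - p to its real counts.
    """
    counts = defaultdict(lambda: {"fake_req": 0.0, "fake_acc": 0.0,
                                  "real_req": 0.0, "real_acc": 0.0})
    for target, p_fake, accepted in events:
        c = counts[target]
        c["fake_req"] += p_fake
        c["real_req"] += 1.0 - p_fake
        if accepted:
            c["fake_acc"] += p_fake
            c["real_acc"] += 1.0 - p_fake
    return counts


def response_likelihood_ratio(rate_fake, rate_real, accepted):
    """P(response | sender is fake) / P(response | sender is real).

    Assumes both accept rates are strictly between 0 and 1.
    """
    if accepted:
        return rate_fake / rate_real
    return (1.0 - rate_fake) / (1.0 - rate_real)


def posterior_fake(prior_fake, observations):
    """Update the prior odds with one likelihood ratio per observed response.

    observations: iterable of (rate_fake, rate_real, accepted) per target.
    """
    odds = prior_fake / (1.0 - prior_fake)
    for rate_fake, rate_real, accepted in observations:
        odds *= response_likelihood_ratio(rate_fake, rate_real, accepted)
    return odds / (1.0 + odds)
```

In this sketch, a target whose accept rates for fakes and reals are equal always contributes a likelihood ratio of 1, so its response leaves the posterior unchanged, while a rejection from a target that accepts reals far more often than fakes raises the posterior.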
3. Evaluations
Our goal in this section is to show that SybilEdge achieves high performance (AUC) on new users at scale on the Facebook network, and that it significantly outperforms state-of-the-art benchmark algorithms. In subsequent sections we also show that SybilEdge is robust (i) to label noise in the training data, (ii) to greater prevalence of fake accounts in the network, and (iii) to several different ways fakes can select targets for their friend requests.
3.1. Evaluation on the Facebook network
We implemented SybilEdge at scale at Facebook, and we ran it in an offline evaluation setting on the global Facebook network. Specifically, we trained SybilEdge using just a three-month period of historical friending data from the last year. To train the model, we also used the historical set of real/fake labels from Facebook’s internal fake classifiers from these three months. These labels include a highly calibrated real/fake label for all accounts that are days old, which provided a label for all users in our three months of training data. We then tested SybilEdge by attempting to classify new users who joined Facebook anytime in the week immediately following these three months using only this one week of data on their friending activity. That is, we test SybilEdge’s ability to detect new accounts that are each between 0 and 7 days old. [Footnote 6: To ensure fairness in this evaluation, for all new users we set a prior equal to the overall fraction of fakes among new Facebook users, and for all known targets we set confidence priors const.] Because significant additional time has now passed since these users joined Facebook, they have since been labeled via our same set of fake classifiers. We compare SybilEdge’s output to these known labels.
Comparison metrics and benchmarks. Due to imbalance between the classes of fake and real nodes (guessing ‘all real’ yields 95% accuracy), we adopt the standard approach and use ROC AUC to measure SybilEdge’s performance (Wang et al., 2017b; Boshmaf et al., 2015b). Recall that an AUC of 0.5 means a classifier is no better than random on the test set.
For comparison, we also include two benchmarks: RejectRate and SybilEdgeTR (Section 3.2 below adds additional benchmarks).
RejectRate. RejectRate scores each new user by the fraction of her sent friend requests that are rejected by her targets; we report the AUC of this score.
SybilEdgeTR. SybilEdgeTR is a simplified version of SybilEdge that uses only the target response component (eq. 4), and not the target selection component, so a new user’s choice of targets does not affect the posterior probability she is fake (i.e., SybilEdgeTR is SybilEdge with , see Section 2.6). SybilEdgeTR probes how much of SybilEdge’s performance is due to target response alone.
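As an illustration of the metric and the RejectRate benchmark described above, the sketch below computes a per-user reject rate and a rank-based ROC AUC (the probability that a randomly chosen fake receives a higher score than a randomly chosen real user, counting ties as one half). The function names and the 0/1 label encoding are assumptions made for this example, and the quadratic AUC loop is chosen for clarity rather than scale.

```python
def reject_rate(responses):
    """Fraction of a user's sent requests that were rejected.

    responses: non-empty list of booleans, True meaning 'accepted'.
    """
    return sum(1 for accepted in responses if not accepted) / len(responses)


def roc_auc(scores, labels):
    """Rank-based ROC AUC; labels use 1 for fake and 0 for real."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one fake and one real user")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # fake ranked above real
            elif p == q:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# With 5% fakes, a constant 'all real' score is 95% accurate when
# thresholded, yet it carries no ranking information (AUC of 0.5).
labels = [1] * 5 + [0] * 95
constant_scores = [0.0] * 100
accuracy_all_real = sum(1 for y in labels if y == 0) / len(labels)
auc_all_real = roc_auc(constant_scores, labels)
```

This is why accuracy is uninformative under heavy class imbalance, whereas AUC depends only on how well the scores rank fakes above reals.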
Results. Fig. 2 plots AUC for groups of these new users partitioned by the number of friend requests they sent. SybilEdge and SybilEdgeTR improve in AUC as new users send more friend requests and converge to AUCs of and , respectively, for all users who send more than friend requests. We note that SybilEdge’s high AUC values here mean that it successfully detected even those new users who joined Facebook on the last of the 7 days in the test set (i.e. who were only 1 day old at detection time). This evaluation is (to our knowledge) the first demonstration that a graph-based algorithm can detect fakes given just the small set of friend requests they attempt in their first days of activity.
We also manually inspected SybilEdge’s errors, and we found that, as in (Xue et al., 2013), the class of ‘false positives’ among new users who sent more than requests reveals many ‘real-but-spammy’ users who abused friend recommendations by sending many unwanted requests. Thus, as in (Xue et al., 2013), we conclude that SybilEdge’s ‘false positives’ can actually be desirable outputs.
We also note that, in contrast to some previous evaluations of graph-based algorithms on other social networks, the class of new fake Facebook accounts detected by SybilEdge cannot easily be distinguished by basic network statistics such as reject rates. For example, the authors of VoteTrust note that during their evaluation on the Renren network, fakes were distinguishable by their low average acceptance rate of 0.2 versus 0.8 for real users (Xue et al., 2013). In contrast, reject rates yield AUC generally under 0.7 for the class of new users on Facebook. Thus, we conclude that SybilEdge was able to elicit much more information from a new user’s sparse friendships by leveraging the differences in targets’ selections and responses.
3.2. SybilEdge vs. state-of-the-art algorithms
We also compare SybilEdge to state-of-the-art benchmark algorithms on a Facebook network. Because benchmark algorithms have greater computational complexity than SybilEdge (see Section 2.7), we restrict the Facebook network in this evaluation to all users in a single country with roughly million users. This restriction improves computational feasibility of the benchmarks, and it enables us to use their authors’ publicly available code implementations for the sake of experimental transparency (see (Gong et al., 2014; Wang et al., 2017b, 2018a)).
We compare SybilEdge to:
SybilRank. SybilRank (Yang et al., 2012b) is a state-of-the-art random-walk-based algorithm. Unlike SybilEdge, SybilRank uses only the graph of accepted friend requests and a set of known real users (nodes). As in (Yang et al., 2012b), we run SybilRank for log2 iterations.
SybilBelief. SybilBelief (Gong et al., 2014) is a state-of-the-art loopy belief propagation algorithm. SybilBelief uses the friendship graph of accepted friend requests and both known real users and known fakes. As in (Gong et al., 2014), we run SybilBelief with edge weights set to .
SybilSCAR. SybilSCAR (Wang et al., 2017b, 2018a) is a recent probabilistic algorithm. SybilSCAR uses the graph of accepted friend requests and both known real users and known fakes. We run both versions of this algorithm: SybilSCAR-C with all weights equal to half the inverse of the average degree as in (Wang et al., 2018a), and user-degree-weighted SybilSCAR-D. Each point in Fig. 3 reports the higher of their two AUCs.
Results. Fig. 3 plots each algorithm’s AUC for groups of new users partitioned by the number of friend requests they sent. [Footnote 7: We note that Fig. 3 uses fewer partitions than Fig. 2 to ensure each partition still has sufficient new fake accounts for evaluation on this one-country Facebook network.] Overall, SybilEdge consistently outperforms all benchmarks regardless of the number of friend requests new users sent. Specifically, whereas SybilEdge achieves AUC on all new users who have sent more than 10 friend requests, the best-performing benchmark, SybilBelief, achieves a maximum AUC of , and its performance degrades to no-better-than-random for new users who send friend requests. Further investigation suggests that the benchmarks’ poor performance is largely due to the fact that some new fake users violate the homophily assumption and connect to many indiscriminately accepting real users, and the subset of new fake users who send the most friend requests (for whom the benchmarks’ performance is lowest; see the rightmost points in Fig. 3) is particularly likely to do so. In these cases, SybilRank tends to rank these fake users as more likely to be real than low-degree real users (resulting in a low or even worse-than-random AUC), and SybilBelief and SybilSCAR tend to ‘over-propagate’ known real users’ labels via these connections such that the majority of new users converge to identical ‘ real’ posteriors, resulting in AUC of .
4. Robustness: label noise
Robustness to label noise in the training data is a desirable and well-studied property of sybil detection algorithms (Gong et al., 2014; Wang et al., 2017b). To test the robustness of SybilEdge to noise in a realistic setting, we repeat the evaluation of SybilEdge on the global Facebook network dataset (Fig. 2), but randomly flip up to % of known real and fake users’ fake/real labels in the training data we use to compute each target’s rates. Fig. 4 plots SybilEdge’s performance on the global Facebook network with various levels of added label noise. Note that even with of added label noise, SybilEdge still converges to AUC on new users who have sent more than friend requests. We therefore conclude that SybilEdge applies well even to social networks where training labels are known with significantly less confidence than they are at Facebook.
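The label-flipping step can be sketched as follows. This is a minimal illustration, assuming labels are encoded as 0 (real) and 1 (fake) and that an exact fraction of users, chosen uniformly at random, is flipped; the text does not specify whether flips are drawn this way or independently per user, and the function name is hypothetical.

```python
import random


def flip_labels(labels, noise_fraction, seed=0):
    """Return a copy of `labels` with a fraction of the entries flipped.

    labels: dict mapping user id -> 0 (real) or 1 (fake).
    noise_fraction: fraction of users whose label is flipped, in [0, 1].
    """
    rng = random.Random(seed)
    users = list(labels)
    n_flip = round(noise_fraction * len(users))
    flipped = set(rng.sample(users, n_flip))  # sample without replacement
    return {u: (1 - y if u in flipped else y) for u, y in labels.items()}
```

Per-target rates would then be recomputed from the noisy labels before rerunning the evaluation.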
5. Robustness: behaviors & prevalence
Our goal in this section is to show that SybilEdge’s performance advantage is also robust to conditions that differ from the current Facebook network: specifically, to (i) several different ways fakes can select targets for their friend requests, and (ii) greater prevalence of fake accounts in the network. To accomplish this, we designed a variety of synthetic friend request networks that capture different ways fake users can choose targets for their requests. For each synthetic network, we then used real Facebook user data to realistically simulate how Facebook users would respond (accept/reject). Across these simulations, SybilEdge still rapidly converged to detect fakes after they sent only a small number of friend requests, regardless of how they selected targets for these requests or their overall prevalence in the network. In all cases, SybilEdge outperformed state-of-the-art graph-based algorithms, whose performance changed markedly depending on how fake users chose targets, and which struggled to detect both low-degree fakes and fakes who succeed in friending less discriminating real users.
5.1. Robustness simulations setup
In each simulation, we set nodes (users) and randomly select 5% of them to be fakes, which matches Facebook’s global fraction (Facebook, 2019) of fake users (we later increase this to 10% to probe robustness to a greater prevalence of fake accounts). We randomly select 80% of nodes to have known fake/real labels and 20% to have unknown labels. This reflects a realistic ‘difficult case’ of a community where a full 20% of users are new. We then generate synthetic digraphs of friend requests using a variety of random graph models parameterized by Facebook data. This set of synthetic digraphs is selected to encompass a variety of possible strategies that fakes may deploy, ranging from randomly targeting real users to preferentially targeting high-degree users or even users who have previously accepted friend requests from other fakes. For each friend request, we then draw an ‘accept’ or ‘reject’ based on mapping the simulated recipient to an actual Facebook user’s accept rates for fakes/reals, which ensures that our simulated users’ behaviors are consistent with actual Facebook users.
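The node setup described above can be sketched in a few lines. The function name and return layout are hypothetical, and drawing the fake set and the known-label set independently of each other is an assumption of this sketch rather than something the text states.

```python
import random


def assign_roles(n, fake_fraction=0.05, known_fraction=0.80, seed=0):
    """Randomly mark nodes as fake/real and as known/unknown.

    Returns two lists of booleans indexed by node id: is_fake and is_known.
    """
    rng = random.Random(seed)
    nodes = list(range(n))
    fakes = set(rng.sample(nodes, round(fake_fraction * n)))
    known = set(rng.sample(nodes, round(known_fraction * n)))
    is_fake = [v in fakes for v in nodes]
    is_known = [v in known for v in nodes]
    return is_fake, is_known
```

Setting `fake_fraction=0.10` corresponds to the higher-prevalence variant; nodes with `is_known[v] == False` form the test set.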
Benchmark algorithms. We run SybilEdge and each benchmark algorithm from Section 3 on these graphs to classify the ‘unknown’ of users (test set). We also include an additional benchmark:
VoteTrust. While we did not run VoteTrust (Xue et al., 2013) on the Facebook network (Section 3.2) because it requires significant additional data, we include it in our simulations. [Footnote 8: Specifically, as described in Section 1.1, aside from the friendship graph and 3-month sample of users’ friend requests we use to train SybilEdge and other benchmarks, VoteTrust also requires complete older historical data on friend requests, which is not among the data that is considered readily accessible for analysis.] VoteTrust is an interesting benchmark because it is a random-walk-based algorithm, but like SybilEdge, VoteTrust uses the directed graph of friend requests, accepts, and rejects. VoteTrust detects fakes by propagating trust from known real nodes via random walks, then aggregating accepts/rejects of unknown users’ requests weighted by their targets’ trust scores.
5.2. Generating friend request graphs
First, we generate synthetic friend request graphs using various models, each parameterized by Facebook data, which capture various ways fakes can choose their targets. For each graph model, we vary the input parameters to produce a range of graphs with various average outdegrees (number of friend requests sent) from to .
Erdős Rényi (n=10000). We generate friend request graphs using the directed Erdős Rényi model. We vary the probability of an edge to yield a range of graphs where nodes’ expected number of friend requests varies from 1 to 50 (i.e. ). These graphs capture a scenario where nodes send friend requests to targets chosen uniformly at random, but targets accept requests as in observed Facebook behavior (see Section 5.3 below).
FB-parameterized configuration model =. In practice, some users receive many more friend requests than others. To capture this in a realistic manner, we design directed configuration graphs by mapping each node uniformly at random to an observed Facebook user’s count of actual friend requests. We then use each user’s count as both her indegree and her outdegree. The resulting graphs capture the scenario where we see a realistic distribution of friend requests, but fakes are careful not to betray their identities by sending many more requests than they receive.
FB-parameterized stochastic block model =. In practice, real users are much more likely to send requests to other real users than to fakes. We capture this by generating directed SBM graphs of friend requests with two clusters: one of fakes and one of reals. We set the probability of a friend request within or across clusters (the edge probability matrix ) to the observed ratios at which fakes/reals send requests to fakes/reals on Facebook.
FB-parameterized preferential attachment =. In practice, we observe that many fakes preferentially target users who have already been targeted by other fakes (see Fig. 1). To capture this, we design preferential attachment graphs of friend requests. First, we map each simulated fake user uniformly at random to an actual observed fake Facebook user’s receive counts from fake/real senders, and we map each simulated real user to corresponding data from a real Facebook user. We use these counts as the preferential attachment process weights and , i.e., the a priori probability that each fake or real user, respectively, will send a friend request to target . We then run the classic out preferential attachment algorithm until all nodes send friend requests, and we generate a range of graphs with to .
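For concreteness, the Erdős Rényi and stochastic block model generators above can be sketched as follows. This is an illustrative sketch only: the edge probability p = d / (n - 1) for expected outdegree d is the standard choice and an assumption here, the function names are hypothetical, and any probabilities passed to `directed_sbm` are placeholders rather than the observed Facebook ratios. The quadratic loops favor clarity over speed.

```python
import random


def directed_er(n, expected_out_degree, seed=0):
    """Directed Erdős Rényi graph: each ordered pair (u, v) with u != v
    becomes a friend request independently with probability p."""
    rng = random.Random(seed)
    p = expected_out_degree / (n - 1)
    return [(u, v) for u in range(n) for v in range(n)
            if u != v and rng.random() < p]


def directed_sbm(is_fake, edge_prob, seed=0):
    """Directed two-cluster SBM of friend requests.

    is_fake: list of booleans indexed by node id.
    edge_prob: dict mapping (sender_is_fake, target_is_fake) -> probability.
    """
    rng = random.Random(seed)
    n = len(is_fake)
    edges = []
    for u in range(n):
        for v in range(n):
            if u != v and rng.random() < edge_prob[(is_fake[u], is_fake[v])]:
                edges.append((u, v))
    return edges
```

Each returned pair `(u, v)` is read as a friend request sent by `u` to `v`; accepts and rejects are drawn in a separate step.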
5.3. Modeling request acceptances/rejections
After generating a friend request digraph in each simulation, we generate the corresponding ‘accept’ or ‘reject’ for each request as follows: First, we map each simulated target node to a tuple of Facebook data describing a randomly selected Facebook user’s historical rates at which she accepted requests from real users and fakes, respectively. [Footnote 9: We note that, due to the fact that millions of users have identical rates, this information is not identifying.] Here, we are careful to map each simulated fake to an actual fake Facebook user’s rates and each simulated real user to an actual real Facebook user’s rates. [Footnote 10: For the preferential attachment graphs, we are careful to maintain the same mapping as during graph synthesis.] We use these rates as Bernoulli weights to draw ‘accepts’ or ‘rejects’ for her simulated friend requests from real users and fakes, respectively. This process synthesizes realistic ‘accepts’ and ‘rejects’ that match actual Facebook user-level distributions from fake and real accounts.
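The Bernoulli draw described above can be sketched as follows. The function name, the dictionary layout, and the tuple order (rate for real senders first, then rate for fake senders) are assumptions of this example; in the simulations the rates come from Facebook data, which this sketch does not reproduce.

```python
import random


def draw_responses(edges, sender_is_fake, accept_rates, seed=0):
    """Draw an accept (True) or reject (False) for every friend request.

    edges: iterable of (sender, target) pairs.
    sender_is_fake: mapping or list, node id -> bool.
    accept_rates: mapping target -> (rate_for_real_senders,
                                     rate_for_fake_senders).
    """
    rng = random.Random(seed)
    responses = {}
    for sender, target in edges:
        rate_real, rate_fake = accept_rates[target]
        rate = rate_fake if sender_is_fake[sender] else rate_real
        # Bernoulli(rate): rng.random() is uniform on [0, 1).
        responses[(sender, target)] = rng.random() < rate
    return responses
```

Because each target carries separate rates for real and fake senders, a discriminating target produces visibly different accept patterns for the two groups, while a target with equal rates does not.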
5.4. Robustness simulation results
Fig. 5 plots each algorithm’s AUC versus the average user’s count of friend requests sent (i.e. outdegree) for each graph model. Note that regardless of how fake users selected their targets, SybilEdge consistently achieved near-perfect classification after observing an average of 20 friend requests from each user (of which 4 were sent by unknown users and thus excluded from training). Thus, after training on edges per known user, SybilEdge classified new fakes almost perfectly, including those who sent only a couple of requests, across all graph models. This suggests that SybilEdge’s strong performance on the real Facebook network (Section 3.1) is quite robust to different ways fakes can select targets.
SybilEdge also reaped an additional performance advantage over SybilEdgeTR in preferential attachment graphs, as in these graphs fakes chose targets differently than real users. Per Section 3.1 and Fig. 2, this is consistent with SybilEdge’s performance advantage on the real Facebook network.
In contrast, the performance of all benchmark algorithms was markedly inconsistent across the different graph models, and none matched the performance of SybilEdge on any graph model. We inspected their errors and found that, as with evaluations on real data (Section 3.2), benchmarks’ poor performance was largely due to the fact that new (simulated) users’ sparse connections were insufficient to realize the homophily assumption. Also, as in the evaluations on real data, some real users accepted requests indiscriminately from many fakes, causing SybilSCAR and SybilBelief to ‘over-propagate’ known real users’ labels out to other fakes, which resulted in many misclassifications. Additionally, all benchmarks struggled to distinguish fakes from low-degree real users.
Finally, SybilEdge’s performance actually improved slightly when we increased the fraction of fake accounts in the data from 5% (Fig. 5 top row) to 10% (bottom row). This is because the increase in the fraction of fake users improves balance such that a greater fraction of targets in the training data receive requests from known fake users, so SybilEdge can better estimate targets’ receive rates and accept rates for fakes when there have been fewer requests overall. This suggests that SybilEdge’s performance is quite robust to even a marked increase in the current fraction of fake accounts.
6. Conclusion
We presented SybilEdge, a social-graph-based algorithm for the detection of new fake accounts on social networks. The class of new fakes has traditionally been overlooked by social-graph-based algorithms, which leverage network-structural differences to identify long-tenured fakes. However, we have shown it is possible to detect new fakes by leveraging small individual-level differences in how new fakes interact with other users, and how these users in turn react to new fakes. Because early detection limits the harm that such accounts can inflict, the development of such techniques is a promising new area for impactful research.
References
 Sybil defense techniques in online social networks: a survey. IEEE Access 5, pp. 1200–1219. Cited by: §1.1, §1.1, §1.
 Sok: the evolution of sybil defense via social networks. In 2013 ieee symposium on security and privacy, pp. 382–396. Cited by: §1.1.
 Integro: leveraging victim prediction for robust fake account detection in osns. In NDSS, Vol. 15, pp. 8–11. Cited by: §1.1, §1.1, §1.1, §1.

 Thwarting fake osn accounts by predicting their victims. In Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pp. 81–89. Cited by: §3.1.
 Combating friend spam using social rejections. In 2015 IEEE 35th International Conference on Distributed Computing Systems, pp. 235–244. Cited by: §1.1.
 SybilFence: improving social-graph-based sybil defenses with user negative feedback. arXiv preprint arXiv:1304.3819. Cited by: §1.1.
 SybilInfer: detecting sybil nodes using social networks. In NDSS, pp. 1–15. Cited by: §1.1, §1.1.
 Community Standards Enforcement Report. Facebook, 2019. Cited by: §1, §5.1.
 Can you spot the fakes?: on the limitations of user feedback in online social networks. In Proceedings of the 26th International Conference on World Wide Web, pp. 1093–1102. Cited by: §1.1.
 Robust spammer detection in microblogs: leveraging user carefulness. ACM Transactions on Intelligent Systems and Technology (TIST) 8 (6), pp. 83. Cited by: §1.1.
 Sybilfuse: combining local attributes with global structure to perform robust sybil detection. arXiv preprint arXiv:1803.06772. Cited by: §1.1.
 Understanding and combating link farming in the twitter social network. In Proceedings of the 21st international conference on World Wide Web, pp. 61–70. Cited by: §1.1.

 Sybilbelief: a semi-supervised learning approach for structure-based sybil detection. IEEE Transactions on Information Forensics and Security 9 (6), pp. 976–987. Cited by: §1.1, §1.1, §2.7, §3.2, §3.2, §4.
 Random walk based fake account detection in online social networks. In Dependable Systems and Networks (DSN), 2017 47th Annual IEEE/IFIP International Conference on, pp. 273–284. Cited by: §1.1, §1.1.
 Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pp. 29–42. Cited by: §2.7.
 You are who you know: inferring user profiles in online social networks. In Proceedings of the third ACM international conference on Web search and data mining, pp. 251–260. Cited by: §1.1.
 Keep your friends close: incorporating trust into social network-based sybil defenses. In 2011 Proceedings IEEE INFOCOM, pp. 1943–1951. Cited by: §1.1.
 Detecting and characterizing bot-like behavior on twitter. In International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, pp. 228–232. Cited by: §1.1.
 Fake profile detection techniques in largescale online social networks: a comprehensive review. Computers & Electrical Engineering 65, pp. 165–177. Cited by: §1.1, §1.

 Learning from crowds. Journal of Machine Learning Research 11 (Apr), pp. 1297–1322. Cited by: footnote 1.
 The darpa twitter bot challenge. Computer 49 (6), pp. 38–46. Cited by: §1.1.
 GANG: detecting fraudulent users in online social networks via guilt-by-association on directed graphs. In Data Mining (ICDM), 2017 IEEE International Conference on, pp. 465–474. Cited by: §1.1, §1.1.
 Structure-based sybil detection in social networks via local rule-based propagation. IEEE Transactions on Network Science and Engineering. Cited by: §1.1, §3.2, §3.2.
 SybilSCAR: sybil detection in online social networks via local rule based propagation. In INFOCOM 2017-IEEE Conference on Computer Communications, IEEE, pp. 1–9. Cited by: §1.1, §1.1, §1.1, §2.7, §3.1, §3.2, §3.2, §4.
 Sybilblind: detecting fake users in online social networks without manual labels. In International Symposium on Research in Attacks, Intrusions, and Defenses, pp. 228–249. Cited by: §1.1.
 Votetrust: leveraging friend invitation graph to defend against social network sybils. In 2013 Proceedings IEEE INFOCOM, pp. 2400–2408. Cited by: §1.1, §1.1, §1.1, §3.1, §3.1, §5.1.
 Analyzing spammers’ social networks for fun and profit: a case study of cyber criminal ecosystem on twitter. In Proceedings of the 21st international conference on World Wide Web, pp. 71–80. Cited by: §1.1.
 SybilRank: aiding the detection of fake accounts in large scale social online services. Cited by: §1.1, §1.1, §1.1, §1, §3.2.
 Uncovering social network sybils in the wild. ACM Transactions on Knowledge Discovery from Data (TKDD) 8 (1), pp. 2. Cited by: §1.1.
 Sybillimit: a near-optimal social network defense against sybil attacks. In 2008 IEEE Symposium on Security and Privacy (sp 2008), pp. 3–17. Cited by: §1.1, §1.1.
 Sybilguard: defending against sybil attacks via social networks. In ACM SIGCOMM Computer Communication Review, Vol. 36, pp. 267–278. Cited by: §1.1, §1.1.
 Truetop: a sybil-resilient system for user influence measurement on twitter. IEEE/ACM Transactions on Networking 24 (5), pp. 2834–2846. Cited by: §1.1.