Centrality-Friendship Paradoxes: When Our Friends Are More Important Than Us

07/04/2018 ∙ by Desmond J. Higham, et al.

The friendship paradox states that, on average, our friends have more friends than we do. In network terms, the average degree over the nodes can never exceed the average degree over the neighbours of nodes. This effect, which is a classic example of sampling bias, has attracted much attention in the social science and network science literature, with variations and extensions of the paradox being defined, tested and interpreted. Here, we show that a version of the paradox holds rigorously for eigenvector centrality: on average, our friends are more important than us. We then consider general matrix-function centrality, including Katz centrality, and give sufficient conditions for the paradox to hold. We also discuss which results can be generalized to the cases of directed and weighted edges. In this way, we add theoretical support for a field that has largely been evolving through empirical testing.

1 Motivation

Consider the graph in Figure 1. Imagine that the nodes represent people and the edges represent reciprocated friendships. Nodes have friends, respectively. So the average number of friends possessed by a node is . Now look at the friends of each node. The four friends of node 1 possess friends. Similarly for nodes we find , , , , , and , respectively. So the average number of friends possessed by a friend is , which is greater than . This effect (that our friends have more friends than we do, on average) was identified by Feld [12] and has become known as the friendship paradox. Feld showed that the friend-of-friend average always dominates the friend average, with equality if and only if all individuals have the same number of friends. The paradox is a classic example of sampling bias. In Figure 1, node 1 has four friends and hence appears four times in the friend-of-friend sum, whereas node only contributes its value on a single occasion; in general, highly connected nodes have a greater influence on the sum.

Figure 1: Simple undirected network with nodes.

The friendship paradox has motivated much activity in the social network literature, and is also mentioned regularly in the wider media; see, for example, [27]. Researchers have measured the extent to which the discrepancy holds on real networks involving, for example, high school and university students [12, 15, 29], scientific coauthors [7], plants and pollinators [25] and users of social media [3, 16, 20]. (We mention that some of these studies also looked at individual-level analogues, such as “what proportion of nodes have fewer friends than the average over their friends?” In this work we focus exclusively on the global averages used in the original reference [12].)

Extending this idea, Eom and Jo [7] looked at the case where each node may be quantified according to some externally derived attribute and studied the generalized friendship paradox: on average, do our friends have more of this attribute than us? They showed that the answer is yes for attributes that correlate positively with the number of friends, and found the effect to hold empirically for certain scientific collaboration networks in the case where the attribute was publication or citation count. Similarly, Hodas, Kooti and Lerman [16] made empirical studies of Twitter networks and tested for a friend activity paradox (are our friends more active than us?) and for a virality paradox (do our friends spread more viral content than us?).

Our aim here is to study the generalized friendship paradox in the case where the attribute is importance, as quantified by a network centrality measure. Aside from the fact that centrality is a fundamental and informative nodal property [9, 13, 24], we also note that centrality measures are defined explicitly in terms of the network topology, and hence there is potential to derive results that hold universally, or at least for some well-defined classes of network. This allows us to add further theoretical backing that complements the recent data-driven studies mentioned above. Our results also alleviate the need for certain experiments. For example, in [15] Grund tested the eigenvector centrality version of the generalized friendship paradox on two small-scale friendship networks; Theorem 3.1 in Section 3 shows that this paradox holds for all networks.

This manuscript is organized as follows. In Section 2 we set up some notation, formalize the friendship paradox and explain how it follows directly from the Cauchy–Schwarz inequality. We also define the generalized friendship paradox from [7], and show how it arises when the quantity of interest correlates with degree. The new material starts in Section 3, where we show that a paradox always holds for eigenvector centrality. In Section 4 we consider other types of network centrality based on matrix functions. Using a combinatorial result from [23] we show that the paradox holds for certain types of matrix function. We also derive and interpret sufficient conditions for general matrix functions defined through power series with nonnegative coefficients, including the resolvent case corresponding to Katz centrality. Sections 3 and 4 deal with undirected, unweighted networks. In Section 5 we look at directed networks, where the picture is less straightforward. We discuss various paradoxes that arise from the use of out-degree and in-degree, and give some sufficient conditions for centrality-based analogues. We explain in Section 6 how all results extend readily to the case of nonnegatively weighted networks. We conclude in Section 7 with an overview of the main results and an indication of possible future lines of pursuit.

2 Friendship Paradox and Generalized Friendship Paradox

Suppose $A \in \mathbb{R}^{n \times n}$ represents the adjacency matrix for an undirected, unweighted network with $n$ nodes. (Directed edges will be considered in Section 5 and weighted edges in Section 6.) So $a_{ij} = a_{ji}$, with $a_{ii} = 0$ and with $a_{ij} = 1$ if nodes $i$ and $j$ are connected. To avoid the trivial special case of an empty network, we assume at least one edge exists. Letting $\mathbf{1} \in \mathbb{R}^n$ denote the vector with all components equal to one, we may define the degree vector

$$d = A\mathbf{1},$$

where $d_i$ gives the degree of node $i$.

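We will make use of the two-norm and the one-norm, which for a vector $x \in \mathbb{R}^n$ are defined by

$$\|x\|_2 = \sqrt{x^T x} \quad \text{and} \quad \|x\|_1 = \sum_{i=1}^{n} |x_i|,$$

respectively. We note that in many cases we will be dealing with a nonnegatively valued vector $x$, whence the one-norm reduces to the sum of the entries, $\|x\|_1 = \mathbf{1}^T x$.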

In this notation, the average degree over the nodes may be written

$$\frac{\|d\|_1}{n}.$$

In the friendship paradox, we wish to compare this quantity with the average of the values that arise when we take each node, look at each of that node’s neighbours, and record how many neighbours those neighbours have. When we do this count, each node $i$ appears as a neighbour $d_i$ times and each time it contributes $d_i$ neighbours, so the count totals $d^T d = \|d\|_2^2$. The number of terms in the count is twice the number of edges, which is $\mathbf{1}^T d = \|d\|_1$. The overall friend-of-friend average is therefore

$$\frac{\|d\|_2^2}{\|d\|_1}.$$

So the friendship paradox is equivalent to the inequality

$$\frac{\|d\|_2^2}{\|d\|_1} - \frac{\|d\|_1}{n} \ge 0. \tag{1}$$

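To see why (1) is always true, we recall from the Cauchy–Schwarz inequality [17] that for any $u, v \in \mathbb{R}^n$ we have

$$u^T v \le \|u\|_2 \, \|v\|_2. \tag{2}$$

Taking $v = \mathbf{1}$, and $u$ with nonnegative entries so that $u^T \mathbf{1} = \|u\|_1$, this implies

$$\|u\|_1 \le \sqrt{n} \, \|u\|_2. \tag{3}$$

For $u \neq 0$, after squaring and rearranging we may write this inequality in the form

$$\frac{\|u\|_2^2}{\|u\|_1} - \frac{\|u\|_1}{n} \ge 0. \tag{4}$$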

So we see that the friendship paradox inequality (1) is always satisfied.

Further, equality holds in the Cauchy–Schwarz inequality (2) if and only if $u$ is a multiple of $v$. So we have equality in the friendship paradox inequality (1) if and only if $d$ is a multiple of $\mathbf{1}$; that is, if and only if the network is regular (all nodes have the same degree).
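As a concrete check of inequality (1), the following minimal sketch computes the two averages exactly for a small edge list. The 5-node graph and the function names are hypothetical choices made for illustration; this is not the graph of Figure 1.

```python
from fractions import Fraction

def degrees(n, edges):
    """Degree vector d = A 1 for an undirected, unweighted edge list."""
    d = [0] * n
    for i, j in edges:
        d[i] += 1
        d[j] += 1
    return d

def node_average(d):
    """Average degree over the nodes: ||d||_1 / n."""
    return Fraction(sum(d), len(d))

def friend_average(d):
    """Average degree over neighbours of nodes: ||d||_2^2 / ||d||_1."""
    return Fraction(sum(di * di for di in d), sum(d))

# Hypothetical 5-node example: a triangle 0-1-2 with a tail 0-3-4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
d = degrees(5, edges)
print(d, node_average(d), friend_average(d))
```

For this example the node average is 2 and the friend-of-friend average is 11/5, so (1) holds strictly. On a regular graph such as a 4-cycle the two averages coincide.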

To define the generalized friendship paradox [7], a nonnegative quantity $x_i$ is assigned to node $i$, and we compare the average over the nodes,

$$\frac{\|x\|_1}{n},$$

with the average over neighbours of nodes,

$$\frac{x^T d}{\|d\|_1}.$$

The numerator in the latter quantity arises because in the overall sum each node $i$ contributes its value $x_i$ a total of $d_i$ times. The denominator arises because each edge is used twice. Hence, we may say that a generalized friendship paradox with respect to the quantity $x$ arises if

$$\frac{x^T d}{\|d\|_1} - \frac{\|x\|_1}{n} \ge 0. \tag{5}$$

Note that the original version (1) corresponds to the case where $x$ is the degree vector.

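Introducing the covariance between two vectors $x, y \in \mathbb{R}^n$ as

$$\mathrm{Cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y),$$

where

$$\mu_x = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad \mu_y = \frac{1}{n} \sum_{i=1}^{n} y_i$$

denote the corresponding means, we may rewrite (5) as

$$\frac{\mathrm{Cov}(x, d)}{\mu_d} \ge 0.$$

Hence, as indicated in [7], a generalized friendship paradox with respect to the quantity $x$ arises if $x$ is nonnegatively correlated with degree.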
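The step from the generalized inequality to the covariance form can be spelled out as follows. This is a sketch that assumes the covariance is normalized by $1/n$, that $x$ is nonnegative, and that $\mu_x$ and $\mu_d$ denote the means of $x$ and of the degree vector $d$, so that $\|x\|_1 = n\mu_x$ and $\|d\|_1 = n\mu_d$.

```latex
\begin{align*}
\mathrm{Cov}(x,d)
  &= \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu_x)(d_i-\mu_d)
   = \frac{x^T d}{n} - \mu_x\mu_d, \\
\frac{x^T d}{\|d\|_1} - \frac{\|x\|_1}{n}
  &= \frac{x^T d}{n\mu_d} - \mu_x
   = \frac{1}{\mu_d}\left(\frac{x^T d}{n} - \mu_x\mu_d\right)
   = \frac{\mathrm{Cov}(x,d)}{\mu_d}.
\end{align*}
```

Since $\mu_d > 0$ whenever at least one edge exists, the sign of the left-hand side is the sign of the covariance.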

Rather than considering an externally defined attribute, as was done in the tests of [7, 16], we will look at circumstances where $x$ is a network centrality measure that quantifies the relative importance of each node. In this way we can address, in the same generality as the original work [12], the question: are our friends more important than us, on average?

3 Eigenvector Centrality Paradox

In this section we consider the case of eigenvector centrality [4, 5, 9, 24, 28]. To make this centrality measure well-defined, we assume that the network is connected, and hence the symmetric matrix $A$ is irreducible. From Perron–Frobenius theory [17], we know that $A$ has a real, positive, dominant eigenvalue $\lambda_1$ that is equal to $\|A\|_2$, the matrix two-norm of $A$. The centrality measure $x$ is then given by the corresponding Perron–Frobenius eigenvector, which satisfies $Ax = \lambda_1 x$ and has positive elements.

We have the following result.

Theorem 3.1.

Given any connected network, the generalized friendship paradox inequality (5) holds for eigenvector centrality, with equality if and only if the network is regular.

Proof.

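From the definition of the subordinate matrix two-norm we have

$$\lambda_1 = \|A\|_2 \ge \frac{\|A\mathbf{1}\|_2}{\|\mathbf{1}\|_2} = \frac{\|d\|_2}{\sqrt{n}}. \tag{6}$$

Using (3), this implies

$$\lambda_1 \ge \frac{\|d\|_1}{n}. \tag{7}$$

Hence,

$$\frac{x^T d}{\|d\|_1} - \frac{\|x\|_1}{n} = \frac{x^T A \mathbf{1}}{\|d\|_1} - \frac{\|x\|_1}{n} = \|x\|_1 \left( \frac{\lambda_1}{\|d\|_1} - \frac{1}{n} \right) \ge 0,$$

and we see that (5) always holds. Further, because $x$ is the only eigenvector whose elements are all positive, we have equality in (6) if and only if $\mathbf{1}$ is a multiple of $x$; that is, if and only if the network is regular. ∎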
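The result can be checked numerically on a small example. The sketch below approximates the Perron–Frobenius pair by power iteration on the shifted matrix A + I (a choice made here so that the iteration also settles on bipartite graphs) and evaluates the left-hand side of the generalized friendship paradox inequality. The graph, the iteration count and the function names are illustrative assumptions.

```python
def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix as a list of rows."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    return A

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def eigenvector_centrality(A, iters=500):
    """Power iteration on A + I. Returns (lambda_1, x) with ||x||_1 = 1."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [ax + xi for ax, xi in zip(matvec(A, x), x)]  # (A + I) x
        total = sum(y)
        x = [yi / total for yi in y]
    lam = sum(matvec(A, x)) / sum(x)  # equals lambda_1 once x has converged
    return lam, x

def gfp_gap(A, x):
    """x^T d / ||d||_1 - ||x||_1 / n for a nonnegative vector x."""
    d = matvec(A, [1.0] * len(A))
    return sum(xi * di for xi, di in zip(x, d)) / sum(d) - sum(x) / len(x)

# Hypothetical 5-node example: a triangle 0-1-2 with a tail 0-3-4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
A = adjacency(5, edges)
lam, x = eigenvector_centrality(A)
gap = gfp_gap(A, x)
print(lam, gap)
```

With x scaled to unit one-norm, the gap equals lam / ||d||_1 - 1/n, which is positive here because the dominant eigenvalue exceeds the average degree of 2.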

4 Matrix Function Centrality Paradox

We now move on to the case where $x$ is defined from a power series expansion

$$x = \left( c_0 I + c_1 A + c_2 A^2 + \cdots \right) \mathbf{1} = \sum_{k=0}^{\infty} c_k A^k \mathbf{1}. \tag{8}$$

Here, we assume that $c_k \ge 0$ for all $k$ and that these coefficients have been chosen in such a way that the series converges. Centrality measures of this type have been studied by several authors, see, for example, [1, 2, 8, 9, 10, 11, 24, 26, 28]. They can be motivated from the combinatoric fact that $(A^k)_{ij}$ counts the number of distinct walks of length $k$ between $i$ and $j$. Particular examples are

  • Katz centrality [19], where $c_k = \alpha^k$. Here the real parameter $\alpha$ must be chosen such that $0 < \alpha < 1/\rho(A)$, where $\rho(A)$ denotes the spectral radius of $A$. In this case $x$ solves the linear system $(I - \alpha A) x = \mathbf{1}$.

  • total centrality [1, 2], where $c_k = \beta^k / k!$ for some positive real parameter $\beta$. In this case the series converges for any $\beta$, and $x$ may be written $x = \exp(\beta A)\mathbf{1}$. Other factorial-based coefficients have also been proposed [8].

  • odd and even centralities based on odd and even power series, such as those for $\sinh$ and $\cosh$ [26].

We also note that degree centrality, on which the original friendship paradox is based, corresponds to $c_1 = 1$ in (8) with all other coefficients equal to zero.
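The examples above can be sketched with a single routine that evaluates a truncated power series applied to the all-ones vector. The truncation length K, the parameter values alpha = 0.2 and beta = 1, the example graph and all function names are assumptions made for illustration. The Katz truncation is only sensible when alpha is below the reciprocal of the spectral radius, which is roughly 0.45 for this graph.

```python
import math

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix as a list of rows."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    return A

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def power_series_centrality(A, coeffs):
    """Truncated power series: x = sum_k coeffs[k] * A^k 1."""
    n = len(A)
    term = [1.0] * n              # A^0 1
    x = [0.0] * n
    for c in coeffs:
        x = [xi + c * ti for xi, ti in zip(x, term)]
        term = matvec(A, term)    # advance to the next power of A applied to 1
    return x

def gfp_gap(A, x):
    """x^T d / ||d||_1 - ||x||_1 / n for a nonnegative vector x."""
    d = matvec(A, [1.0] * len(A))
    return sum(xi * di for xi, di in zip(x, d)) / sum(d) - sum(x) / len(x)

# Hypothetical 5-node example: a triangle 0-1-2 with a tail 0-3-4.
A = adjacency(5, [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)])
alpha, beta, K = 0.2, 1.0, 60
katz = power_series_centrality(A, [alpha ** k for k in range(K)])
total = power_series_centrality(A, [beta ** k / math.factorial(k) for k in range(K)])
degree = power_series_centrality(A, [0.0, 1.0])
print(gfp_gap(A, katz), gfp_gap(A, total), gfp_gap(A, degree))
```

On this graph all three gaps are positive, and the degree case reproduces the original friendship paradox gap of 0.2. In practice a linear solve or a matrix exponential routine would replace the truncated sums; the series form is used here only because it mirrors the definition.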

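For the centrality measure in (8) we have

$$\|x\|_1 = \mathbf{1}^T x = \sum_{k=0}^{\infty} c_k \, \mathbf{1}^T A^k \mathbf{1}$$

and

$$x^T d = x^T A \mathbf{1} = \sum_{k=0}^{\infty} c_k \, \mathbf{1}^T A^{k+1} \mathbf{1}.$$

So the generalized friendship paradox inequality (5) may be written

$$\sum_{k=0}^{\infty} c_k \left( \frac{\mathbf{1}^T A^{k+1} \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} - \frac{\mathbf{1}^T A^k \mathbf{1}}{n} \right) \ge 0. \tag{9}$$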

By comparing terms in the two expansions, we arrive at the following sufficient condition.

Theorem 4.1.

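The generalized friendship paradox inequality (5) holds for $x$ in (8) if

$$\frac{\mathbf{1}^T A^{k+1} \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} \ge \frac{\mathbf{1}^T A^k \mathbf{1}}{n} \tag{10}$$

for every $k \ge 1$ for which $c_k > 0$.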

Proof.

We see that the term on the left hand side of (9) involving $c_0$ collapses to zero. Generally, we may obtain a sufficient condition by asking for each individual term involving $c_k$ to be greater than or equal to zero, for all $k \ge 1$. This leads to (10). ∎

We note that the sufficient condition (10) has a simple combinatoric interpretation: the total number of walks of length $k+1$ must dominate the product of the total number of walks of length $k$ and the average degree.

To proceed we make use of the following result.

Theorem 4.2 (Lagarias et al., 1983).

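For any positive integers $r$ and $s$ such that $r + s$ is even, we have

$$\mathbf{1}^T A^{r+s} \mathbf{1} \ge \frac{\left( \mathbf{1}^T A^r \mathbf{1} \right) \left( \mathbf{1}^T A^s \mathbf{1} \right)}{n}. \tag{11}$$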
Proof.

See [23, Theorem 1]. ∎

In words, Theorem 4.2 says that, for $r + s$ even, the total number of walks of length $r + s$ dominates the product of the total number of walks of length $r$ and the total number of walks of length $s$, scaled by the number of nodes, $n$.
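To make the walk-count statement concrete, the sketch below computes the total number of walks of each length with exact integer arithmetic and checks the inequality of Theorem 4.2, in the form n * w[r + s] >= w[r] * w[s], for every pair with even sum up to a chosen maximum length. The example graph, the cut-off of 8 and the function names are illustrative assumptions.

```python
def walk_counts(n, edges, kmax):
    """Return [w_0, ..., w_kmax], where w_k = 1^T A^k 1 is the total number
    of walks of length k in the undirected graph given by edges."""
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    v = [1] * n                      # A^0 1
    counts = [sum(v)]
    for _ in range(kmax):
        v = [sum(v[j] for j in nbrs[i]) for i in range(n)]   # A v
        counts.append(sum(v))
    return counts

def walk_inequality_holds(n, w, r, s):
    """Check n * w_{r+s} >= w_r * w_s using exact integers."""
    return n * w[r + s] >= w[r] * w[s]

# Hypothetical 5-node example: a triangle 0-1-2 with a tail 0-3-4.
n = 5
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
w = walk_counts(n, edges, 8)
even_pairs = [(r, s) for r in range(1, 8) for s in range(1, 8)
              if (r + s) % 2 == 0 and r + s <= 8]
all_hold = all(walk_inequality_holds(n, w, r, s) for r, s in even_pairs)
print(w[:6], all_hold)
```

For this graph the first six walk totals are 5, 10, 22, 48, 106 and 234, and every even-sum pair satisfies the inequality, as the theorem guarantees.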

This theorem allows us to deal with odd power series:

Theorem 4.3.

The generalized friendship paradox inequality (5) holds for $x$ in (8) in the case where $c_k = 0$ for $k$ even, with equality if and only if the network is regular.

Proof.

First, suppose the network is regular. Let $\delta$ denote the common degree, so that $A\mathbf{1} = \delta \mathbf{1}$. Since $\mathbf{1}$ is an eigenvector with positive entries, it must be the Perron–Frobenius eigenvector, so $\delta = \lambda_1$. Then $\mathbf{1}^T A^k \mathbf{1} = n \delta^k$ for all $k \ge 0$. It follows that for each $k$, the $c_k$ term on the left hand side of (9) collapses to zero, giving equality.

Now suppose that the network is not regular. On the left hand side of (9), the coefficient $c_1$ receives the factor $\|d\|_2^2 / \|d\|_1 - \|d\|_1 / n$. This quantity relates to the original friendship paradox (see (1)) and is strictly positive by the Cauchy–Schwarz inequality. To deal with the remaining terms, it is then enough to show that (10) holds for odd $k$. This is done by taking $r = k$ and $s = 1$ in (11). ∎

The next result focuses on Katz centrality.

Theorem 4.4.

Consider the case where $c_k = \alpha^k$ in (8). For any network there exists a value $\alpha^\star > 0$ such that the generalized friendship paradox inequality (5) holds for all parameter values $0 < \alpha < \alpha^\star$, with equality if and only if the network is regular.

Proof.

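First, suppose the network is regular. Recall that $\delta$ denotes the common degree, so that $d = \delta \mathbf{1}$. Then $x = (1 - \alpha \delta)^{-1} \mathbf{1}$ is the unique solution to the Katz centrality equation for all $0 < \alpha < 1/\delta$. Because $x$ is a multiple of the degree vector, we have equality in (5).

Now suppose that the network is not regular, so $d$ is not a multiple of $\mathbf{1}$. With $c_k = \alpha^k$, and small $\alpha$, the left hand side of (9) may be expanded as

$$\alpha \left( \frac{\|d\|_2^2}{\|d\|_1} - \frac{\|d\|_1}{n} \right) + O(\alpha^2),$$

and we see that the factor in parentheses is strictly positive. ∎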

5 Directed Networks

In this section we consider the case of unweighted directed networks, so $A$ is no longer assumed to be symmetric. To be concrete when discussing results, we imagine that the network represents human-human follower relationships on a social media platform. So an edge from $i$ to $j$, represented by $a_{ij} = 1$, indicates that person $i$ follows person $j$.

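We define the out-degree vector $d^{\mathrm{out}}$ and in-degree vector $d^{\mathrm{in}}$ by

$$d^{\mathrm{out}} = A \mathbf{1} \quad \text{and} \quad d^{\mathrm{in}} = A^T \mathbf{1},$$

respectively. Hence, $d^{\mathrm{out}}_i$ counts the number of people that person $i$ follows, and $d^{\mathrm{in}}_i$ counts the number of people who follow person $i$. Note that

$$\|d^{\mathrm{out}}\|_1 = \|d^{\mathrm{in}}\|_1 = \mathbf{1}^T A \mathbf{1}.$$

Our first observation is that the inequality (4) holds for any nonzero vector $u$, with equality if and only if $u$ is a multiple of $\mathbf{1}$. Hence we have

$$\frac{\|d^{\mathrm{out}}\|_2^2}{\|d^{\mathrm{out}}\|_1} - \frac{\|d^{\mathrm{out}}\|_1}{n} \ge 0 \quad \text{and} \quad \frac{\|d^{\mathrm{in}}\|_2^2}{\|d^{\mathrm{in}}\|_1} - \frac{\|d^{\mathrm{in}}\|_1}{n} \ge 0. \tag{12}$$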

A little care is needed when interpreting these inequalities. The total $\|d^{\mathrm{out}}\|_2^2$ arises if we take each person in turn, look at the people who follow them, and record how much following these people do. (In this way, each node $i$ shows up $d^{\mathrm{out}}_i$ times and each time it contributes an amount $d^{\mathrm{out}}_i$.) Similarly, $\|d^{\mathrm{in}}\|_2^2$ arises if we take each person in turn, look at the people who they follow, and record how many times these people are followed. (In this way, each node $i$ shows up $d^{\mathrm{in}}_i$ times and each time it contributes an amount $d^{\mathrm{in}}_i$.)

In words, it is always true that

i)

our followers follow at least as many people as us, on average (and there is equality if and only if everybody follows the same number of people), and

ii)

the people we follow have at least as many followers as us, on average (and there is equality if and only if everybody has the same number of followers).
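Statements i) and ii) can be checked exactly on a small follower network. In the sketch below, a pair (i, j) means that person i follows person j, matching the convention used for the adjacency matrix in this section. The six follow relationships and the function names are hypothetical.

```python
from fractions import Fraction

def in_out_degrees(n, follows):
    """Out- and in-degree vectors; (i, j) in follows means i follows j."""
    d_out, d_in = [0] * n, [0] * n
    for i, j in follows:
        d_out[i] += 1
        d_in[j] += 1
    return d_out, d_in

def neighbour_average(weights, values):
    """Average of values[i] when node i is counted weights[i] times."""
    return Fraction(sum(w * v for w, v in zip(weights, values)), sum(weights))

# Hypothetical network of 5 people: everyone follows person 0.
follows = [(1, 0), (2, 0), (3, 0), (4, 0), (0, 1), (1, 2)]
d_out, d_in = in_out_degrees(5, follows)
mean_degree = Fraction(len(follows), 5)
followers_following = neighbour_average(d_out, d_out)   # statement i)
followed_followers = neighbour_average(d_in, d_in)      # statement ii)
print(mean_degree, followers_following, followed_followers)
```

Here the average out-degree and in-degree are both 6/5, while the two neighbour averages are 4/3 and 3, so both inequalities hold strictly.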

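From the discussion in Section 2, we also see that the in-out/out-in analogue of (12) and corresponding statements are valid only if $\mathrm{Cov}(d^{\mathrm{out}}, d^{\mathrm{in}}) \ge 0$. Simple examples where $\mathrm{Cov}(d^{\mathrm{out}}, d^{\mathrm{in}})$ is negative include the outward star graph where the only edges start at node $1$ and end at nodes $2, 3, \ldots, n$, for which

$$d^{\mathrm{out}} = (n-1, 0, \ldots, 0)^T, \quad d^{\mathrm{in}} = (0, 1, \ldots, 1)^T, \quad \mathrm{Cov}(d^{\mathrm{out}}, d^{\mathrm{in}}) = -\frac{(n-1)^2}{n^2},$$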
and also the corresponding inward star graph. A strongly connected example has edges from node to nodes and from node to node for , plus an edge from node back to node . Here, we have

(13)

In this case, and , so , which is negative for . Hence, for such graphs it is not true that

iii)

our followers have at least as many followers as us, on average, or

iv)

the people we follow are following at least as many people as us, on average.
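The failure of the mixed versions iii) and iv) can be seen on the outward star graph mentioned above. The sketch below uses a covariance normalized by 1/n (an assumption about the normalization), labels the hub as node 0 rather than node 1, and compares the average in-degree taken over followers with the plain average degree. The function names and the choice n = 5 are illustrative.

```python
from fractions import Fraction

def covariance(x, y):
    """Cov(x, y) = (1/n) * sum_i (x_i - mean_x) * (y_i - mean_y)."""
    n = len(x)
    mean_x, mean_y = Fraction(sum(x), n), Fraction(sum(y), n)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

def outward_star(n):
    """Degree vectors when node 0 follows every other node and nobody else follows."""
    d_out = [n - 1] + [0] * (n - 1)
    d_in = [0] + [1] * (n - 1)
    return d_out, d_in

n = 5
d_out, d_in = outward_star(n)
cov = covariance(d_out, d_in)
# Average in-degree over followers: node i is counted d_out[i] times.
mixed_average = Fraction(sum(o * i for o, i in zip(d_out, d_in)), sum(d_out))
mean_degree = Fraction(sum(d_out), n)
print(cov, mixed_average, mean_degree)
```

The only follower is the hub, who has no followers, so the mixed average is 0 while the average degree is 4/5. The difference between the two equals the covariance divided by the average degree.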

The reference [16] is unusual in that it tests the friendship paradox on directed networks. The authors consider the four versions i)–iv) and find that the paradox holds in each case for a large social network constructed from Twitter data. Our reasoning above shows that two of these versions will hold for all directed networks.

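For an arbitrary network measure $x$ the relevant inequality that describes the out-degree version of the generalized friendship paradox (5) is

$$\frac{x^T d^{\mathrm{out}}}{\|d^{\mathrm{out}}\|_1} - \frac{\|x\|_1}{n} \ge 0. \tag{14}$$

Similarly, the in-degree version is

$$\frac{x^T d^{\mathrm{in}}}{\|d^{\mathrm{in}}\|_1} - \frac{\|x\|_1}{n} \ge 0. \tag{15}$$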

We now consider eigenvector centrality as our network measure. We assume that the network is strongly connected so that $A$ is irreducible. In this directed case, we have potentially distinct left and right Perron–Frobenius eigenvectors, which we denote $x_L$ and $x_R$, respectively. Here, $A^T x_L = \lambda_1 x_L$ and $A x_R = \lambda_1 x_R$, with $\lambda_1 = \rho(A) > 0$. Both vectors $x_L$ and $x_R$ have positive components.

The next result characterizes two cases.

Theorem 5.1.

For a strongly connected directed network the out-degree generalized friendship paradox inequality (14) holds for the case where $x = x_L$ if and only if

$$\lambda_1 \ge \frac{\mathbf{1}^T A \mathbf{1}}{n}. \tag{16}$$

Similarly, (16) also characterizes the in-degree generalized friendship paradox (15) where $x = x_R$.

Proof.

When $x = x_L$ we have

$$x^T d^{\mathrm{out}} = x^T A \mathbf{1} = \lambda_1 x^T \mathbf{1} = \lambda_1 \|x\|_1.$$

It follows that (14) reduces to (16).

The second statement may be proved by replacing $A$ with $A^T$. ∎
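The spectral condition can be explored numerically. The sketch below approximates the left Perron–Frobenius eigenvector by power iteration on the transpose of the adjacency matrix shifted by the identity, then compares the dominant eigenvalue with the average degree. The 3-node graph is a hypothetical example chosen for illustration, and it need not coincide with any example in the text. Its characteristic polynomial is the cubic lambda^3 - lambda - 1.

```python
def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a square matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def perron_vector(M, iters=500):
    """Power iteration on M + I. Returns (lambda_1, v) with M v = lambda_1 v
    and ||v||_1 = 1, assuming M is nonnegative and irreducible."""
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iters):
        y = [mv + vi for mv, vi in zip(matvec(M, v), v)]  # (M + I) v
        total = sum(y)
        v = [yi / total for yi in y]
    lam = sum(matvec(M, v)) / sum(v)
    return lam, v

# Hypothetical strongly connected example: 0 -> 1 -> 2 -> 0 plus 1 -> 0.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
n = len(A)
d_out = matvec(A, [1.0] * n)
lam, x_left = perron_vector(transpose(A))   # left eigenvector of A
mean_degree = sum(d_out) / n
gap = sum(xi * di for xi, di in zip(x_left, d_out)) / sum(d_out) - sum(x_left) / n
print(lam, mean_degree, gap)
```

For this graph the dominant eigenvalue is about 1.3247, just below the average degree of 4/3, so the gap is slightly negative and the out-degree version of the paradox fails for the left eigenvector, in line with the spectral characterization.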

It is of interest to note that for our unsymmetric $A$, a classical result is that $\lambda_1$ lies between the minimum out-degree or in-degree and the maximum out-degree or in-degree [17, Theorem 8.1.22]. However, it is not true in general that $\lambda_1$ dominates the average in-degree (and hence average out-degree). An example is given by the strongly connected graph with adjacency matrix

In this case is the real root of , which has the form

where denotes the real cube root. Here , which is strictly below the average out/in degree of . Hence, for such networks Theorem 5.1 shows that neither the out-degree generalized friendship paradox for nor the in-degree generalized friendship paradox for applies.

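Matrix function based centrality measures of the form (8) continue to make sense for directed networks. Here, the entry $(A^k)_{ij}$ counts the number of distinct directed walks of length $k$ from $i$ to $j$. We will focus on the Katz case, where

$$x = (I - \alpha A)^{-1} \mathbf{1} = \sum_{k=0}^{\infty} \alpha^k A^k \mathbf{1}. \tag{17}$$

Let us first check how this measure correlates with in-degree. The relevant difference from (15) then takes the form

$$\frac{x^T d^{\mathrm{in}}}{\|d^{\mathrm{in}}\|_1} - \frac{\|x\|_1}{n} = \frac{\mathbf{1}^T A (I - \alpha A)^{-1} \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} - \frac{\mathbf{1}^T (I - \alpha A)^{-1} \mathbf{1}}{n}.$$

In terms of powers of $\alpha$, the zeroth order term vanishes and the first order term is

$$\alpha \left( \frac{\mathbf{1}^T A^2 \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} - \frac{\mathbf{1}^T A \mathbf{1}}{n} \right). \tag{18}$$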

In words, we are comparing the total number of directed walks of length two with the square of the total number of directed walks of length one, scaled by the number of nodes. This difference can be negative—for example, the graph that was used to give (13) produces . Hence, the corresponding generalized friendship paradox fails on this example for small when .

We show next that it is possible to prove something positive for the alternative out-degree case.

Theorem 5.2.

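Consider the out-degree generalized friendship paradox inequality (14) in the Katz case (17). For any network there exists a value $\alpha^\star > 0$ such that the inequality holds for all parameter values $0 < \alpha < \alpha^\star$, with equality if and only if the network has a common out degree. Further, a sufficient condition for the inequality to hold for all $0 < \alpha < 1/\rho(A)$ is

$$\frac{\mathbf{1}^T (A^T)^k A \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} \ge \frac{\mathbf{1}^T A^k \mathbf{1}}{n} \quad \text{for all } k \ge 1. \tag{19}$$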
Proof.

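The relevant difference is $x^T d^{\mathrm{out}} / \|d^{\mathrm{out}}\|_1 - \|x\|_1 / n$, which takes the form

$$\sum_{k=0}^{\infty} \alpha^k \left( \frac{\mathbf{1}^T (A^T)^k A \mathbf{1}}{\mathbf{1}^T A \mathbf{1}} - \frac{\mathbf{1}^T A^k \mathbf{1}}{n} \right). \tag{20}$$

The sufficient condition (19) follows by considering powers of $\alpha$.

Now, suppose the network has a common out degree, so $d^{\mathrm{out}} = \delta \mathbf{1}$ for some value $\delta$. Then $\mathbf{1}$ must be the Perron–Frobenius right eigenvector, so $\delta = \lambda_1$, where $\lambda_1$ is the Perron–Frobenius eigenvalue. In this case, $x = (1 - \alpha \lambda_1)^{-1} \mathbf{1}$. So we have the stated equality.

Now, suppose the network does not have a common out degree, so $d^{\mathrm{out}}$ is not a multiple of $\mathbf{1}$. For small $\alpha$, the leading term in (20) may be written

$$\alpha \left( \frac{\|d^{\mathrm{out}}\|_2^2}{\|d^{\mathrm{out}}\|_1} - \frac{\|d^{\mathrm{out}}\|_1}{n} \right),$$

which is positive by Cauchy–Schwarz. Hence a suitable $\alpha^\star$ exists. ∎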

The Katz centrality measure (17) assigns to node $i$ a weighted sum of all directed walks starting from node $i$. In a message-passing context, this measure rewards nodes that are able to broadcast information effectively. In the limit $\alpha \to 0^+$ this measure approaches (a shifted version of) the out-degree. As an alternative, we could replace $A$ by $A^T$ in (17). This measure assigns to node $i$ a weighted sum of all directed walks finishing at node $i$, thereby rewarding nodes that are able to receive effectively. In the limit $\alpha \to 0^+$ this measure approaches (a shifted version of) the in-degree. Analogous versions of the conclusions that follow (18) and the statement of Theorem 5.2 are then valid with $A$ replaced by $A^T$ and “in” and “out” swapped.

We also note that part of Theorem 5.2 extends to the general power series centrality measure (8): requiring (19) to hold for every $k$ for which $c_k > 0$ serves as a sufficient condition.

6 Weighted Networks

The results in the previous sections, including [23, Theorem 1], do not require the network to be unweighted. So the conclusions extend to nonnegatively weighted networks if we are willing to use the formulations (1) and (12) for the friendship paradox and (5), (14) and (15) for the generalized friendship paradox, with the degree vectors having the same definitions: $d = d^{\mathrm{out}} = A \mathbf{1}$, $d^{\mathrm{in}} = A^T \mathbf{1}$. In this extended setting, the degree vectors represent sums of weights rather than edge counts, and we note that the inequalities are invariant to positive rescaling of degree, so we may assume without loss of generality that $d$, $d^{\mathrm{out}}$ or $d^{\mathrm{in}}$ have elements that sum to unity. Similarly, we may assume that $x$ also sums to unity. The friendship and generalized friendship paradoxes then apply if “average” is interpreted as “weighted average”.

7 Discussion

The friendship paradox has spawned a range of activity in quantitative network science, and it has been argued that the effect may explain reports of increasing levels of dissatisfaction in online social interaction [3] and may be systematically distorting our perceptions and behaviours [18]. It has also been shown that the paradox may be leveraged in order to detect the spread of information or disease, and to drive effective interventions [6, 14, 12, 21, 22, 25]. Our main result, Theorem 3.1, shows that the paradox holds with the same level of generality when we consider importance, as quantified by the classical and widely adopted eigenvector centrality measure. Hence, our work adds further support to the argument that this type of sampling bias is both highly relevant and ripe for exploitation.

It is of interest to note that the original friendship paradox is based on a purely local quantity: the number of immediate neighbours. Theorem 3.1 shows that the effect is also present for a global quantity that takes account of long range interactions. Indeed the walk-based Katz centrality measure, (8) with $c_k = \alpha^k$, interpolates between these two extremes: $\alpha \to 0$ from above reduces to degree and $\alpha \to 1/\rho(A)$ from below becomes eigenvector centrality [2, 28]. Theorem 4.4 shows that Katz maintains the paradox for sufficiently small $\alpha$, but it is an open question as to whether there is an undirected network for which the paradox fails to hold for some $0 < \alpha < 1/\rho(A)$.

We note that [23] gives a concrete example of a connected network on which the total number of walks of length three is strictly less than the product of the total number of walks of length two and the average degree; so the inequality (11) is violated for $r = 2$ and $s = 1$. It follows that by taking $c_2 = 1$ and the remaining $c_k$ sufficiently small, we can construct a centrality measure (8) based on a power series with positive coefficients for which the generalized friendship paradox fails to hold. This raises the question of categorizing those power series that never give rise to such counterexamples. Theorem 4.3 shows that odd power series are one such class.

In [23] it is also stated that for any given network the inequality (11), and hence the sufficiency condition (10), holds for large enough $r$. This is entirely consistent with the eigenvector result in Theorem 3.1: increasing $\alpha$ in Katz centrality emphasizes longer walks, and the limit $\alpha \to 1/\rho(A)$ corresponds to the eigenvector case [2, 28].

To the best of our knowledge, extensions of the friendship paradox to directed networks had only been studied empirically, as in [16]. In Section 5 we clarified that two of the four out/in degree versions always hold, while the other two may fail. For example, when we allow for a lack of reciprocation it remains the case that the people we admire have more admirers than us, on average (something many of us first discovered at high school), and, for the same reason, people we hate are hated by more people than us, on average. However, it is not true in general that we admire/hate people who admire/hate more people than us, on average.

We gave in Theorem 5.1 a spectral condition that determines whether the relevant eigenvector centrality maintains the generalized paradox for directed networks. In this unsymmetric setting, it would be of interest to find useful classes of network for which the spectral condition is satisfied, and also to identify power series centrality measures for which a generalized friendship paradox is always guaranteed, thereby extending the sufficiency result in Theorem 5.2.

References

  • [1] M. Benzi and C. Klymko, Total communicability as a centrality measure, Journal of Complex Networks, 1 (2013), pp. 124–149.
  • [2] M. Benzi and C. Klymko, On the limiting behavior of parameter-dependent network centrality measures, SIAM J. Matrix Anal. Appl., 36 (2015), pp. 686–706.
  • [3] J. Bollen, B. Gonçalves, I. van de Leemput, and G. Ruan, The happiness paradox: your friends are happier than you, EPJ Data Science, 6 (2017), p. 4.
  • [4] P. Bonacich, Factoring and weighting approaches to status scores and clique identification, Journal of Mathematical Sociology, 2 (1972), pp. 113–120.
  • [5] P. Bonacich, Power and centrality: a family of measures, American Journal of Sociology, 92 (1987), pp. 1170–1182.
  • [6] N. A. Christakis and J. H. Fowler, Social network sensors for early detection of contagious outbreaks, PLoS ONE, 5 (2010), p. 0012948.
  • [7] Y.-H. Eom and H.-H. Jo, Generalized friendship paradox in complex networks: The case of scientific collaboration, Scientific Reports, 4 (2014), p. 4603.
  • [8] E. Estrada, Generalized walks-based centrality measures for complex biological networks, Journal of Theoretical Biology, 263 (2010), pp. 556–565.
  • [9] E. Estrada, The Structure of Complex Networks, Oxford University Press, Oxford, 2011.
  • [10] E. Estrada, N. Hatano, and M. Benzi, The physics of communicability in complex networks, Physics Reports, 514 (2012), pp. 89–119.
  • [11] E. Estrada and D. J. Higham, Network properties revealed through matrix functions, SIAM Review, 52 (2010), pp. 696–714.
  • [12] S. L. Feld, Why your friends have more friends than you do, American Journal of Sociology, 96 (1991), pp. 1464–1477.
  • [13] L. C. Freeman, Centrality networks: I. conceptual clarifications, Social Networks, 1 (1979), pp. 215–239.
  • [14] M. Garcia-Herranz, E. Moro, M. Cebrian, N. A. Christakis, and J. H. Fowler, Using friends as sensors to detect global-scale contagious outbreaks, PLoS ONE, 9 (2014), p. 0092413.
  • [15] T. U. Grund, Why your friends are more important and special than you think, Sociological Science, 1 (2014), pp. 128–140.
  • [16] N. O. Hodas, F. Kooti, and K. Lerman, Friendship paradox redux: Your friends are more interesting than you, in Proceedings of the 7th International AAAI Conference on Weblogs and Social Media (ICWSM), 2013. Honorable mention paper.
  • [17] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 2nd ed., 2013.
  • [18] M. O. Jackson, The friendship paradox and systematic biases in perceptions and social norms, Journal of Political Economy, (to appear).
  • [19] L. Katz, A new index derived from sociometric data analysis, Psychometrika, 18 (1953), pp. 39–43.
  • [20] F. Kooti, N. O. Hodas, and K. Lerman, Network weirdness: Exploring the origins of network paradoxes, in International Conference on Weblogs and Social Media (ICWSM), Mar. 2014.
  • [21] V. Kumar, D. Krackhardt, and S. Feld, Network interventions based on inversity: Leveraging the friendship paradox in unknown network structures, Working Paper.
  • [22] V. Kumar and K. Sudhir, Can friends seed more buzz and adoption?, Working Paper.
  • [23] J. C. Lagarias, J. E. Mazo, L. A. Shepp, and B. McKay, An inequality for walks in a graph, SIAM Review, 25 (1983), pp. 580–582.
  • [24] M. E. J. Newman, Networks: an Introduction, Oxford University Press, Oxford, 2010.
  • [25] M. M. Pires, F. M. Marquitti, and P. R. Guimarães, The friendship paradox in species-rich ecological networks: Implications for conservation and monitoring, Biological Conservation, 209 (2017), pp. 245–252.
  • [26] J. A. Rodríguez, E. Estrada, and A. Gutiérrez, Functional centrality in graphs, Linear and Multilinear Algebra, 55 (2007), pp. 293–302.
  • [27] S. Strogatz, Friends you can count on, The New York Times, 17th September (2012).
  • [28] S. Vigna, Spectral ranking, Network Science, 4 (2016), pp. 433–445.
  • [29] E. W. Zuckerman and J. T. Jost, What makes you think you’re so popular? Self-evaluation maintenance and the subjective side of the “friendship paradox”, Social Psychology Quarterly, 64 (2001), pp. 207–223.