Closure Operators and Spam Resistance for PageRank

03/13/2018
by Lucas Farach-Colton, et al.

We study the spammability of ranking functions on the web. Although graph-theoretic ranking functions such as Hubs and Authorities and PageRank exist, there is no graph-theoretic notion of how spammable such functions are. We introduce a very general cost model that depends only on the observation that changing the links of a page that you own is free, whereas changing the links on pages owned by others requires effort or money. We define spammability to be the ratio between the benefit one receives from one's spamming efforts and the effort or money one must spend to spam. The more effort or money it takes to get highly ranked, the less spammable the function. Our model helps explain why both Hubs and Authorities and standard PageRank are very easy to spam. Although standard PageRank is easy to spam, we show that there exist spam-resistant PageRanks. Specifically, we propose a ranking method, Min-k-PPR, that is the component-wise min of a set of personalized PageRanks centered on k trusted sites. Our main results are that Min-k-PPR is itself a type of PageRank and that it is expensive to spam. We elucidate a surprisingly elegant algebra for PageRank. We define the space of all possible PageRanks and show that this space is closed under some operations. Most notably, we show that PageRanks are closed under (normalized) component-wise min, which establishes that (normalized) Min-k-PPR is a PageRank. This algebraic structure is also key to demonstrating the spam resistance of Min-k-PPR.
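The construction described in the abstract (a normalized component-wise min over personalized PageRanks centered on trusted sites) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `personalized_pagerank` and `min_k_ppr`, the damping factor 0.85, the choice to send mass from dangling pages back to the seed, the fixed number of power-iteration steps, and the toy five-page graph.

```python
def personalized_pagerank(out_links, seed, damping=0.85, iters=100):
    """Power iteration for a PageRank whose teleportation always lands on `seed`.

    out_links[u] is the list of pages that page u links to (assumed format).
    """
    n = len(out_links)
    rank = [0.0] * n
    rank[seed] = 1.0  # start with all mass on the trusted seed
    for _ in range(iters):
        nxt = [0.0] * n
        for u, targets in enumerate(out_links):
            if targets:
                # split the damped mass of u evenly over its out-links
                share = damping * rank[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # assumption: a page with no out-links returns its mass to the seed
                nxt[seed] += damping * rank[u]
        nxt[seed] += 1.0 - damping  # teleportation mass
        rank = nxt
    return rank


def min_k_ppr(out_links, trusted, damping=0.85, iters=100):
    """Normalized component-wise min of the PPRs centered on each trusted page."""
    pprs = [personalized_pagerank(out_links, s, damping, iters) for s in trusted]
    mins = [min(column) for column in zip(*pprs)]
    total = sum(mins)
    return [m / total for m in mins] if total > 0 else mins


# Hypothetical graph: pages 0 and 1 are trusted, page 2 is an ordinary page,
# and pages 3 and 4 form a two-page link farm. The spammer has obtained a
# single link from trusted page 0 to page 3.
graph = [
    [1, 2, 3],  # page 0
    [0, 2],     # page 1
    [0, 1],     # page 2
    [4],        # page 3
    [3],        # page 4
]

ppr_from_0 = personalized_pagerank(graph, seed=0)
ppr_from_1 = personalized_pagerank(graph, seed=1)
scores = min_k_ppr(graph, trusted=[0, 1])

for page, score in enumerate(scores):
    print(page, round(ppr_from_0[page], 4), round(ppr_from_1[page], 4), round(score, 4))
```

In this toy graph the link farm scores higher under the PageRank centered on page 0, which links to it directly, than under the one centered on page 1. Before normalization, taking the min keeps only the smaller of the two values, so a link from one trusted site alone is not enough to capture that site's full contribution.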


