Adversarial Contract Design for Private Data Commercialization

The proliferation of data collection and machine learning techniques has created an opportunity for commercialization of private data by data aggregators. In this paper, we study this data monetization problem using a contract-theoretic approach. Our proposed adversarial contract design framework accounts for the heterogeneity in honest buyers' demands for data, as well as the presence of adversarial buyers who may purchase data to compromise its privacy. We propose the notion of Price of Adversary (PoAdv) to quantify the effects of adversarial users on the data seller's revenue, and provide bounds on the PoAdv for various classes of adversary utility. We also provide a fast approximate technique to compute contracts in the presence of adversaries.

1 Introduction

The large-scale adoption of data-driven decision making by businesses has led to a boom in big data collection and analysis techniques. With the increasing volume of, and demand for, data, companies have found a business opportunity in offering data-based services to other companies, or selling their data to interested parties [Thomas and Leiponen2016, Spiekermann et al.2015]. Interest in data monetization is evidenced by the rise of data marketplaces, where firms and individuals can buy, sell, or trade second- or third-party data. Examples include Salesforce’s Data Studio, Oracle’s BlueKai, and Adobe’s Audience Marketplace. Data commercialization faces many challenges, including IP protection, liability, pricing, and preserving privacy [Thomas and Leiponen2016]. In this paper, we focus on the latter two challenges of pricing and privacy.

The challenge of pricing refers to the fact that to accommodate diverse demands, data sellers offer different plans and pricing to their customers. Even with identical data, customers may derive different benefits from utilizing it, e.g., due to different expertise, or how this data complements the customer’s existing knowledge. Therefore, to maximize revenue, the data seller should account for this demand diversity by packaging its data accordingly. Further, despite its revenue benefits, data commercialization has to overcome the challenge of limiting privacy risks for the data subjects in the database. Specifically, adversarial buyers can request access to the database, attempting to compromise the privacy of the data subjects. Therefore, data sellers should account for this risk when designing and pricing data plans.

In this paper, we take a contract-theoretic [Mas-Colell et al.1995] approach to address both the aforementioned pricing and privacy challenges of data commercialization by proposing the design of a set of contracts with varying privacy levels. Contract theory, in the classic context of pricing of goods, is the study of principal-agent problems, in which the principal (here, the data seller) designs a set of contracts with varying consumption levels so as to extract maximum revenue from agents (buyers) with unknown types. We study the problem of pricing a bundle of database queries at different privacy levels with the aim of (a) maximizing revenue by offering different prices for varying privacy levels in order to accommodate the diversity of demands for the query bundle, and (b) accounting for the risks from adversarial users by modifying the contracts’ pricing accordingly. We use the well-accepted $\epsilon$-differential privacy concept as the measure of privacy [Dwork2008]. We make an effort to keep our design practical by attempting to adhere to practices already in place in data marketplaces (see Sections 2 and 3).

Technical contributions: (1) Existing contract theory results suggest that given $n+1$ types of agents ($n$ types of honest buyers based on their diversity of demand, and an adversarial type), the principal should design up to $n+1$ contracts. We show that the data owner will offer at most $n$ contracts. In other words, it is optimal for the data owner to avoid the impractical option of designing a contract for the adversary; (2) we incorporate post-hoc fines (in case of privacy breach) in the pricing of query bundles, and analyze their effect on the contract design problem, showing that fines can help reduce the loss due to adversarial users in many situations; (3) we propose the notion of Price of Adversary (PoAdv) to quantify the loss incurred by the data owner due to the presence of adversarial data buyers. We show that while PoAdv can be unbounded in the worst case, it is possible to bound the PoAdv for a large class of problems; and (4) we provide a fast approximate technique to compute the contracts in the presence of adversaries. All omitted proofs, and full versions of the proof sketches, are in the appendix.

2 Background

Database marketing examples: Currently, the two industries leading database marketing are data brokers (who mine and sell consumer data to businesses), and data marketplaces (which provide a platform for buying, selling, and trading data). We elaborate upon typical privacy guarantees offered by each with an example. Among data brokers, Acxiom, one of the largest brokers worldwide, states that they maintain “privacy compliant data” through data encryption and secure data management techniques [Acxiom.2018]. The user service agreement of Salesforce Data Studio [salesforce.com, inc.2018] on the other hand, provides more detailed information about their market structure. For instance, Salesforce states that they use “unique user identifiers (user IDs) to help ensure that activities can be attributed to the responsible individual”, and that security logs are kept “in order to enable security reviews and analysis.” Our model in Section 3 takes the availability of these monitoring techniques into account. It is clear that following such safe practices is imperative when dealing with private information, e.g., as evidenced by the recent Cambridge Analytica case [Granville2018].

Differential privacy: A popular formalism of privacy loss due to adversarial queries from statistical databases is that of differential privacy (DP) [Dwork et al.2006, Dwork2008]. Formally, let $\mathcal{A}$ be a randomized algorithm used by a data owner to release answers to queries from a database, and consider two databases $D_1$ and $D_2$ that differ in exactly one entry (row). Then, $\mathcal{A}$ is $\epsilon$-differentially private ($\epsilon$-DP) for $\epsilon > 0$ if for any set $S$ of outputs of the algorithm,

$$\Pr[\mathcal{A}(D_1) \in S] \le e^{\epsilon} \, \Pr[\mathcal{A}(D_2) \in S]. \qquad (1)$$

In words, $\epsilon$-DP requires that the output of $\mathcal{A}$ remains sufficiently unaffected (as quantified by $\epsilon$), whether or not a single data subject’s data is included in the database. The choice of $\epsilon$ determines the privacy loss due to $\mathcal{A}$, with lower $\epsilon$ corresponding to better privacy. Note that this privacy guarantee is independent of any auxiliary information available to an adversary [Dwork2008].

For continuous-valued queries, a method for achieving differential privacy is the introduction of carefully selected random noise in the responses. Specifically, let $q$ be a query function, returning the true value $q(D)$ on database $D$. In order to guarantee $\epsilon$-DP, an algorithm can introduce additive Laplacian noise, returning instead $q(D) + \mathrm{Lap}(\Delta q / \epsilon)$, where $\Delta q$ is the sensitivity of the query function [Dwork et al.2006]. While this approach limits the privacy loss to within $\epsilon$, it also decreases the utility of the queries to honest buyers by adding noise that increases in $1/\epsilon$. In Section 3, we formalize buyers’ sensitivity to their queries’ accuracy, and consequently, their willingness to pay for more accurate answers. The seller’s contract design should balance this tradeoff with the increased privacy loss from adding less noise.
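The Laplace mechanism described above can be sketched in a few lines. This is a minimal illustration, not the seller's actual implementation: the function names (`laplace_scale`, `sample_laplace`, `noisy_answer`) are hypothetical, and the sampler uses the standard fact that the difference of two independent exponential variables with the same rate is Laplace distributed.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale for epsilon-DP: sensitivity divided by epsilon."""
    if epsilon <= 0 or sensitivity < 0:
        raise ValueError("epsilon must be positive and sensitivity non-negative")
    return sensitivity / epsilon


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace noise, drawn as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_answer(true_value: float, sensitivity: float,
                 epsilon: float, rng: random.Random) -> float:
    """Answer released to the buyer: true value plus Laplace noise."""
    return true_value + sample_laplace(laplace_scale(sensitivity, epsilon), rng)


# Smaller epsilon (better privacy) means a larger noise scale.
rng = random.Random(0)
for eps in (0.1, 1.0, 10.0):
    print(eps, laplace_scale(1.0, eps), noisy_answer(100.0, 1.0, eps, rng))
```

The printed scales make the tradeoff concrete: a tenfold decrease in epsilon yields a tenfold increase in the typical magnitude of the noise added to the answer.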

3 Model

We study the problem of designing a set of contracts for buyers requesting access to a database managed by a seller. We assume that the seller has already acquired data from subjects and compensated them using a one-time monetary payment or a free service (like a phone app). We use he/his for buyers and she/her for the seller.

Queries: There are multiple (and finite) types of potential statistical queries that can be made from the database, denoted by the set . The seller offers bundles consisting of a subset of these query types for purchase, with the restriction that any buyer can choose at most one bundle. A bundle is identified by the set . The seller designs these bundles based on historical or external information about the types of different buyers, so that every buyer’s requirement is met by one of the bundles. Further, for any bundle, the seller limits the number of queries of each type in the bundle to one (i.e., each bundle is a subset of distinct query types). This follows recommended practices in differential privacy, since allowing multiple queries inevitably degrades privacy guarantees (see also Section 7). We also posit that the seller verifies the identity of buyers, in order to keep track of the buyer’s query purchases, and to investigate a privacy attack if it occurs. Further, we posit that the seller, via her service agreement, restricts buyers from faking identities by imposing substantial post-hoc fines.

Contracts: For each bundle , the set of possible contracts are determined by the parameters , with denoting the price to be paid by the buyer. The privacy levels are assumed to be bounded and normalized such that , with specifying the bound , where is used to determine the (Laplace) noise added to the answer of the query of type ; the buyer is free to request any within the bound, with higher corresponding to less noisy responses. Lastly, denotes the post-hoc fine to be paid if the buyer is found misusing the query answer.

Buyers: We assume that buyers belong to one of two possible classes: honest or adversarial.

Honest buyers: Honest buyers do not misuse query answers, and hence generate revenue for the operator when purchasing contracts. Each honest buyer for a given bundle has a type , determining his benefit from the database. In particular, an honest buyer of type purchasing contract derives a benefit from accessing the system. This function includes direct gain from the data, as well as the cost of hedging against the risk of potential direct attack on the buyer. We impose natural conditions on the benefit functions (as is standard for demand functions) : that the overall benefit increases with larger (monotone non-decreasing) and satisfies diminishing returns (concavity), with

. Most large organizations estimate demand functions and types of buyers from past buyers’ activity, and insurance premiums are known; hence, we assume these functions are known. Further,

; that is, higher types derive further benefit from the same noise level, e.g., due to their expertise or the relevance of the data to their tasks.

An honest buyer also has a probability of suffering an attack himself and causing inadvertent misuse of the query answer, which results in an expected loss for him as per the contract terms. Thus, an honest buyer’s overall expected utility in its interaction with the seller is given by .

Adversarial buyers: An adversarial buyer seeks to access the database with the goal of compromising its privacy. Formally, an adversarial buyer purchasing a bundle through a contract derives a benefit from an attack on the system, with overall adversary utility given by . This attack results in a cost for the seller. Further, we assume is monotone increasing and convex, with ; intuitively, higher (lower noise) lead to costlier attacks for the seller, with the severity increasing as the noise decreases. Such convexity has also been noted in literature, e.g., a recent work [Hsu et al.2014] proposes the cost for seller to be proportional to . Figure 1 shows an example of and .

We assume that a privacy attack is ultimately discovered, and the seller can track the buyer responsible for the attack. The seller may have to compensate data subjects after a privacy attack (due to lawsuits), which can be partially recovered from the post-hoc fine for data misuse. Note that we have assumed that the adversary cannot cause privacy loss beyond the given of the bundle by combining the outputs of multiple queries of the same type, as the seller restricts the number of queries per type to one. Further, large post-hoc fines for faking identities prevent the rational adversary from faking identities and attempting to purchase two or more bundles. However, the post-hoc fine for data misuse cannot be set too large as this fine affects the honest buyers, and hence the seller’s revenue, due to potential attacks on honest buyers. Therefore, our goal is to study the optimal choice of fines for data misuse so as to deter adversarial buyers while maintaining the demand from honest buyers.

Figure 1: Three Benefit fns for , adversary cost , and non-adversarial price-contract curve.

3.1 Seller’s revenue optimization problem

We now analyze the seller’s contract design problem, with one contract for each offered bundle. As all rational buyers choose only one bundle due to the marketplace design, these contracts are independent. Therefore, for the rest of this paper, we restrict attention to a given bundle.

Let denote the fraction of adversarial buyers, which is estimated by the seller (conservatively, the seller can estimate to be at most a maximum value). For the honest buyers, let denote the fraction of the honest buyers of type . These fractions can be estimated from historical data. The seller aims to maximize her revenue. Nevertheless, she can not observe individual buyers’ types when selling a contract. Consequently, she has to design contracts while balancing two goals: deriving the maximum possible profit from honest types, while limiting the adversarial type’s cost to the system.

In classic contract theory, following the revelation principle [Mas-Colell et al.1995, Proposition 14.C.2], it is known that it is enough to offer at most as many contracts as there are buyer types. (Depending on the distribution of buyers’ types, it may be optimal to offer the same contract to adjacent types, i.e., pooling contracts, instead of separate contracts for each type, i.e., separating contracts.) Each agent then selects his intended contract if it satisfies the agent’s individual rationality (IR) and incentive compatibility (IC) constraints. The IR constraint requires that the agent attains at least as much utility from purchasing the contract as from opting out. The IC constraint imposes the condition that an agent of a given type prefers his intended contract over that intended for any other type.

Formally, for our contract design problem, consider the user types consisting of the honest buyers and the adversarial type. Let the contract of type be and that for the adversary be . Assume, wlog, that the utility of opting out of purchasing contracts is zero. Then, the IR constraint of an honest buyer of type , denoted by , is given by . Similarly, is . Type ’s IC constraints are given by where the constraint is denoted by and which is denoted as . Similarly, the constraints can be defined for the adversary.
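To make the IR and IC constraints for honest buyers concrete, the sketch below checks them for a candidate menu of contracts. It rests on assumptions that are not confirmed by the text as extracted: a contract is taken to be a triple of privacy level, price, and fine, and an honest buyer's expected utility is taken to be his benefit minus the price minus the attack probability times the fine, with an opt-out utility of zero. All names (`Contract`, `honest_utility`, `satisfies_ir_ic`) and the example benefit functions are hypothetical.

```python
import math
from typing import Callable, NamedTuple, Sequence


class Contract(NamedTuple):
    level: float  # privacy (access) level offered by the contract
    price: float  # up-front price paid by the buyer
    fine: float   # post-hoc fine charged if the answer is misused


def honest_utility(benefit: Callable[[float], float],
                   contract: Contract, attack_prob: float) -> float:
    """Assumed form: benefit minus price minus expected fine."""
    return benefit(contract.level) - contract.price - attack_prob * contract.fine


def satisfies_ir_ic(benefits: Sequence[Callable[[float], float]],
                    contracts: Sequence[Contract],
                    attack_prob: float, tol: float = 1e-9) -> bool:
    """True if each honest type i weakly prefers contracts[i] to opting out (IR)
    and to every other offered contract (IC)."""
    for i, benefit in enumerate(benefits):
        own = honest_utility(benefit, contracts[i], attack_prob)
        if own < -tol:  # IR violated: opting out (utility zero) is better
            return False
        for j, other in enumerate(contracts):
            if j != i and honest_utility(benefit, other, attack_prob) > own + tol:
                return False  # IC violated: type i would pick contract j
    return True


# Hypothetical two-type example with benefits theta * sqrt(level).
benefits = [lambda x: math.sqrt(x), lambda x: 2.0 * math.sqrt(x)]
menu = [Contract(0.25, 0.5, 0.0), Contract(1.0, 1.5, 0.0)]
print(satisfies_ir_ic(benefits, menu, attack_prob=0.1))
```

In this example the lower type is exactly indifferent between buying and opting out, and the higher type is exactly indifferent between the two contracts, so raising either price slightly breaks a constraint.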

The seller’s goal is to maximize her revenue . However, the seller only has steady revenue over time from ; provides randomly varying revenue over time. Thus, we impose the practical constraint that , which says that a large fraction of revenue arrives steadily over time. We name this the steady revenue (SR) constraint. Therefore, the seller’s contract design problem can be formally stated as the following optimization:

subject to

3.2 No need for an adversary-specific contract

The contract design problem above includes a contract for the adversary. While the formulation is mathematically sound and consistent with the revelation principle, this seems an odd design choice, as the adversary reveals his type just by choosing this contract. We show that, as intuitively expected, the seller does not in fact need to design an adversary-specific contract.

Lemma 1.

The seller should offer at most contracts/bundles. In particular, it is never optimal to offer an adversary-specific contract/bundle.

Proof.

We show this by contradiction. Assume the seller treats the adversarial buyer as the -th type, and offers a contract satisfying all (honest and adversarial) buyers’ IR and IC constraints. By , this contract satisfies ; that is, it will impose a loss on the seller’s revenue. Further, by the constraints, ; that is, had the adversary purchased any of the legitimate buyers’ contracts, he would have imposed a smaller cost on the seller’s revenue. As the seller is a profit-maximizer, we conclude that such contract should not be part of an optimal collection of contracts. ∎

Given the above lemma, the contract design problem in the adversarial setting is to design contracts in order to maximize the revenue of the operator:

where is the contract chosen by the adversary, subject to IR and IC constraints for all honest buyers in choosing their contract and the adversary in choosing . For the special case of the adversary not choosing any contract, we designate with . Observe that

is a variable, and thus, the revenue maximizing problem is a bi-level optimization problem. However, following the standard technique of introducing an additional variable to formulate a zero-sum problem as a linear program, we formulate the revenue maximization problem in the adversarial setting using variable

as follows:

subject to

For our described marketplace, one can further consider the corresponding non-adversarial setting, in which the seller solves the contract design problem in the absence of any adversarial considerations. This non-adversarial contract design problem is given by:

subject to

We next study these two contract design problems to characterize the effects of the presence of adversarial types on the optimal contracts’ properties and the seller’s revenue.

4 Analysis of Adversarial Contracting

In classic contract theory, when solving for the optimal contracts, the functions are often assumed to satisfy a condition known as the single crossing property (SCP), which in turn implies the strict increasing differences (ID) property. Throughout our analysis, we will only require the (weaker) condition of (non-strict) ID property on the benefit functions , as defined below:

Definition 1 (Increasing Differences).

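The benefit functions satisfy the (strict) increasing differences property if, for any pair of privacy levels, the gain in benefit from moving from the lower level to the higher one is (strictly) increasing in the type.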
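A numerical check of the increasing differences property can be sketched as follows. It tests the standard form of the condition on a finite grid of privacy levels only, so passing it is necessary but not sufficient for the property to hold everywhere. The function name, the ordering convention (benefit functions listed from lowest to highest type), and the example functions are assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Callable, Iterable, Sequence


def has_increasing_differences(benefits: Sequence[Callable[[float], float]],
                               levels: Iterable[float],
                               strict: bool = False,
                               tol: float = 1e-12) -> bool:
    """Check ID on a grid: for every low < high, benefit(high) - benefit(low)
    must be (strictly) increasing from one type to the next higher type."""
    grid = sorted(set(levels))
    for low, high in combinations(grid, 2):  # every pair with low < high
        gains = [b(high) - b(low) for b in benefits]
        for lower_type, higher_type in zip(gains, gains[1:]):
            gap = higher_type - lower_type
            if gap < -tol or (strict and gap <= tol):
                return False
    return True


# Hypothetical benefit functions theta * sqrt(level) for theta = 1, 2, 3.
benefits = [lambda x, t=t: t * math.sqrt(x) for t in (1.0, 2.0, 3.0)]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
print(has_increasing_differences(benefits, grid, strict=True))
```

Two identical benefit functions would pass the non-strict check and fail the strict one, which is the distinction the analysis relies on when it requires only the weaker condition.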

The above condition is a natural assumption on demand functions, and has been used extensively in the contract theory literature starting from the seminal work by [Maskin and Riley1984]. The functions shown in Figure 1 satisfy ID. This condition also allows for significant simplification of the classical contract theory optimization problem. Our first, somewhat surprising result is that, even in the adversarial contract regime with post-hoc fines, the contracts will satisfy a set of constraints akin to those of non-adversarial settings.

Theorem 1.

Assuming that the functions satisfy ID, the optimal contracts (in the presence of adversarial types) satisfy the following:

  1. Monotonicity: .

  2. Constraint set reduction: for and for are redundant at the optimal contracts.

  3. is tight: as a result, .

  4. is tight for all : as a result for ,

Proof Sketch.

We first establish the monotonicity of noise levels at the optimal contracts using the (non-strict) ID condition of the benefit functions. Next, we show how to considerably refine the constraint set (point 2) and derive the price-benefit relations (points 3-4). These arguments are based on contradiction: had any of these constraints not been redundant/tight, the operator would have had room to improve her profit by modifying the contracts without violating the remaining IR and IC constraints of honest buyers. For the contradiction argument to carry through, we show that under appropriate modifications, the effect of changes in the adversarial types’ behavior on the revenue is non-decreasing. ∎

We note that the same results hold in the non-adversarial case; this follows from prior work in contract theory [Maskin and Riley1984] (using a straightforward mapping that we present in the appendix). Formally:

Proposition 1.

Assuming that the functions satisfy ID, the optimal contracts in the non-adversarial setting have and satisfy all conditions of Theorem 1 (with ).

In particular, the relation between prices, fines, and benefit functions (points 3-4), provides an easy visual representation of the contracts as shown in Figure 1 for the non-adversarial setting (that is, with ). We call this curve the price-contract curve , which is a curve on the (on x-axis), (on y-axis) plane, and connects the non-adversarial contract points for all . From Proposition 1, we get ; thus, the segment of the curve that is between and is parallel to . Thus, is continuous and piece-wise concave.
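The price-contract curve can be illustrated with the standard screening recursion from contract theory, which is assumed here to match the price-benefit relations of points 3-4: the lowest type pays exactly his benefit (tight IR), and each higher type pays the previous price plus his own benefit gain from moving up to his level (tight IC with the adjacent lower type). The sketch assumes zero fines in the non-adversarial setting, so the computed values are effective prices that coincide with the prices themselves. The function name and the example inputs are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def tight_prices(benefits: Sequence[Callable[[float], float]],
                 levels: Sequence[float]) -> List[float]:
    """Prices implied by a tight IR for the lowest type and tight IC
    constraints between adjacent types, for non-decreasing levels."""
    if len(benefits) != len(levels) or not levels:
        raise ValueError("need one level per benefit function")
    if any(b < a for a, b in zip(levels, levels[1:])):
        raise ValueError("levels must be non-decreasing (monotonicity)")
    prices = [benefits[0](levels[0])]  # lowest type: price equals benefit
    for i in range(1, len(levels)):
        # Segment of the curve between adjacent levels follows benefits[i].
        step = benefits[i](levels[i]) - benefits[i](levels[i - 1])
        prices.append(prices[-1] + step)
    return prices


# Hypothetical example: benefits theta * sqrt(level) for theta = 1, 2, 3.
benefits = [lambda x, t=t: t * math.sqrt(x) for t in (1.0, 2.0, 3.0)]
levels = [0.04, 0.25, 1.0]
print(tight_prices(benefits, levels))
```

Plotting the returned prices against the levels, and joining consecutive points with the corresponding benefit function shifted vertically, gives a continuous curve made of concave pieces.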

Theorem 1’s characterization greatly simplifies the optimization problem for computing the optimal contracts by removing several of the constraints (points 1-2). The result also shows that the optimization problem for computing optimal contracts in the presence of adversaries has only additional adversarial constraints and the same price-benefit relations (points 3-4) as that without adversaries. Despite these similarities, the presence of adversaries changes the seller’s objective function, leading to a different set of contracts than in the non-adversarial setting. Proposition 1 further implies that the variables can be dropped in the optimization problem for the non-adversarial case, yet these variables remain a key design choice in the adversarial setting.

Price of Adversary: In order to quantify the effects of the adversary’s presence on the seller’s revenue, we introduce the following notion:

Definition 2 (Price of Adversary).

Consider the seller’s maximum revenue in the non-adversarial and adversarial settings. The price of adversary (PoAdv) is defined as the ratio of the former to the latter.

Clearly, PoAdv is at least 1, with equality when the adversary does not choose any contract, so that the two revenues coincide. Our first finding is that PoAdv is unbounded in the worst case.

Lemma 2.

PoAdv is unbounded in the worst case.

Proof Sketch.

We prove this by construction with two types of legitimate users and . The benefit function are for the lower type and for the higher type . The function for the adversary is given by . We show that , leading to an unbounded . ∎
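As a small worked illustration of the price of adversary, the sketch below reads it as the ratio of the seller's maximum revenue without adversaries to her maximum revenue with adversaries. This reading is an assumption, chosen because it is consistent with the value being at least 1 and equal to 1 when the adversary buys nothing. Treating a non-positive adversarial revenue as an infinite ratio is a further convention adopted here only to mirror the unbounded worst case; the function name and arguments are hypothetical.

```python
import math


def price_of_adversary(non_adv_revenue: float, adv_revenue: float) -> float:
    """Ratio of non-adversarial to adversarial maximum revenue (assumed form)."""
    if non_adv_revenue <= 0:
        raise ValueError("non-adversarial revenue must be positive")
    if adv_revenue <= 0:
        # Convention for this sketch: no positive revenue is recoverable.
        return math.inf
    return non_adv_revenue / adv_revenue


print(price_of_adversary(10.0, 10.0))  # adversary buys nothing
print(price_of_adversary(10.0, 4.0))   # adversary costs the seller revenue
print(price_of_adversary(10.0, 0.0))   # degenerate worst case
```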

5 Approximation Algorithm

In this section, we present an approach that solves for the adversarial contracting problem approximately, given a solution for the non-adversarial case. We do so since solving the non-adversarial scenario is simpler: by Proposition 1, the non-adversarial case has both fewer variables () and fewer constraints (no adversary contract choice constraint). Our proposed algorithm also reveals a subtle relation between the adversarial and non-adversarial settings.

Since by Lemma 2 we know that PoAdv is unbounded in the worst case, we limit our analysis to a large class of adversary benefit functions, which imposes mild and natural restrictions on these functions. We call these the well-behaved ’s, and define them as follows. Recall that denotes the non-adversarial price-contract curve (see Figure 1).

  • (High ) intersects once at the origin and then lies above for .

  • (Low ) intersects once at the origin and then lies below for .

  • (Intermediate ) intersects multiple times. Let be the access level at the last intersection point. We denote .

The above classes comprise several types of adversaries. High s (low s) represent powerful (weak) adversaries, who can (can not afford to) impose a high cost on the revenue; this class includes functions () as a subset. Intermediate s on the other hand represent adversaries who can purchase (some of) the contracts offered through non-adversarial contract design. Within this class, is an upper bound on the adversary’s payoff from purchasing contracts with . As lies above after , we have for all , which means that the adversary can afford all contracts with . Figure 1 illustrates an intermediate . Next, we present our approximation technique. We start with a definition.

Definition 3.

We call the non-adversarial contract a -slack -priced contract, (), if there exists such that the contract satisfies:

  • , i.e., adversary’s gain is bounded by .

  • , i.e., the contract’s price is at least .

  • , SR constraint is satisfied

Constructively, whenever it exists, should be chosen to have the least possible value.

Using the above definition, our approximation technique is tailored towards the three categories of functions as shown in Algorithm 1. This algorithm takes the set of non-adversarial contracts as input, and either successfully returns a new set of contracts by modifying this input, or prescribes solving the adversarial contract design problem from scratch. For the High case, the algorithm finds -slack contracts with a positive price (line 4, -slack ensures the adversary will not choose the new contract). If one is found, the contract generating the highest revenue among such contracts is offered to all users (line 10). For Low , the adversary does not choose any contract, hence it is optimal to retain the non-adversarial contracts as is (line 13). For Intermediate , the function presented in Algorithm 2 is invoked (line 15).

Input: Non-adv. contracts
Output: An array of or solve adv. case
1 switch  do
2       case High  do
3             is -slack -priced for some if  is empty then
4                   return solve adv. case
5             that makes -slack -priced for  to  do
6                  
7            return
8      case Low  do
9             return
10      case Intermediate  do
11             return InterApp
return solve adv. case
Algorithm 1 Approx. Algorithm
Input: Non-adv. contracts
Output: An array of
highest such that and is -slack -priced     not empty as
1 for  to  do
2       that makes -slack -priced
3for  to  do
4      
5 if  then
6       return
return
Algorithm 2

In Algorithm 2, first a set of -slack -priced contracts is found among contracts above and including that of type (line 2). The best contract with index among these is found for each user (line 4). New contracts are constructed for types (line 6), and all the non-adversarial contracts for types and below are retained as is (line 8). The revenue from honest buyers for the new contracts is found on line 9, and for the non-adversarial contracts on line 10. is the utility for the adversarial type in choosing the best new contract (line 11) and is the same adversary utility in choosing from the non-adversarial contract set (line 12). Line 13-15 compares the revenue in the adversarial setting from the non-adversarial contracts and the new contract set, and returns the contract set that leads to better revenue for the seller.

We next prove that the contracts output by Algorithm 1 are valid. First, we present a lemma on the ordering of honest buyers’ preferences over the contracts, which will later be used for the validity proof.

Lemma 3.

Given optimal non-adversarial contracts , a type user with prefers contract over for .

The validity of the Algorithm 1’s output is as follows:

Lemma 4.

For Low or Intermediate s, Algorithm 1’s output contracts satisfy the IR and IC conditions for all honest buyers. If Algorithm 1 outputs a set of contracts for a High adversary, then at least one honest buyer buys the contract.

Proof.

For High , there is one contract offered to all users, so the IC constraints are trivially satisfied. Also, for user the contract offered satisfies IR, since from optimality of the non-adversarial contracts we get . For Low , the proof is immediate from optimality of the non-adversarial contracts. For Intermediate , if the non-adversarial contracts are returned by Algorithm 2, then the claim again holds trivially. Otherwise, if new contracts are returned, first, observe that the contract is -slack -priced (follows from Def. 3 and definition of , ). Thus, is not empty as . Also, note that for users , the offered modified contracts (line 6) still have the same effective price as the non-adversarial contract.

We first start by analyzing users . All users are offered modified contracts (line 6) among those indexed by (loop on line 3). By definition of , for all . Thus, prefers his contract over any other offered to any . Next, by definition of , , and then by Lemma 3 and , for all . Thus, for all . For IR, first by ID we have , hence , where the is due to optimality of the non-adversarial contracts. Finally, we just proved that , thus, .

The users are offered the non-adversarial contracts, thus, for all . Since the modified contracts (line 6) still have an effective price same as the non-adversarial contract, any user still prefers his contract to the modified ones. The IR constraint is satisfied as the non-adversarial contracts were optimal. ∎

Figure 2: Runtime comparison
Figure 3: PoAdv for non-adversarial contracts
Figure 4: PoAdv for optimal adversarial contracts
Figure 5: PoAdv for approx adversarial contracts

Next, the following result establishes the quality of the contracts returned by Algorithm 1 by bounding the PoAdv. Recall that we have already shown in Lemma 2 that PoAdv is unbounded in the worst case.

Theorem 2.

Let the optimal non-adversarial contract revenue be . For the class of well-behaved ’s, we have,

  • (High ) is unbounded in general. If Algorithm 1 outputs a contract, then .

  • (Low ) Algorithm 1 always outputs the same contracts as the non-adversarial case, and hence PoAdv = 1.

  • (Intermediate ) Alg. 1 always outputs contracts. Then,

Proof Sketch.

The analysis for High and Low is straightforward. For Intermediate , depending on whether new contracts or the original contracts are offered, the revenue is the maximum of the revenues in these two scenarios. Then the result follows by noting that . ∎

6 Numerical Example

While our theory results provide a broad characterization of the problem for a large space of utility functions, in this section we illustrate specific points related to the problem parameters, with a numerical example. We use types of honest buyers (except when varying ), with , , and .

Runtime comparison: Fig. 5 illustrates runtimes when computing the optimal adversarial contracts and optimal non-adversarial contracts. The optimal adversarial contracts take much more time to compute than the non-adversarial contracts and the difference increases exponentially with increase in the size of problem . This shows why approximation is useful; our approximation approach takes almost the same time as the non-adversarial problem, as the approximation steps after solving the non-adversarial problem have (comparatively) negligible runtime.

Price of adversary with non-adversarial contracts: Fig. 3 shows the price of adversary for varying and when the non-adversarial contracts are offered in an adversarial setting. We observe that the PoAdv rises exponentially with . Intuitively, the non-adversarial contracts suffer great losses if adversarial buyers dominate the market.

Price of adversary with optimal adversarial contracts: Fig. 4 shows the price of adversary for varying and when the optimal adversarial contracts are computed exactly. The PoAdv rises with both increasing and . Intuitively, a higher represents greater market domination by adversaries, and a higher corresponds to weaker (i.e., more attack-prone) honest users. Thus, higher values for both of these parameters cause more loss, leading to a higher PoAdv.

Performance of approximation: Lastly, Fig. 5 shows the price of adversary computed using our approximation approach for varying and . The that we chose corresponds to an Intermediate . The PoAdv varies mostly with and is almost constant throughout at 2.77, except for very small values of when it is 1.43. For small values of , the approximation algorithm returns the original contracts unchanged (line 14 in Algorithm 2).
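To make the reported quantity concrete, the following is a minimal sketch that assumes the price of adversary is the ratio of the optimal non-adversarial revenue to the revenue obtained in the adversarial setting. That ratio form is an assumption, since the definition is not restated in this section; the function name and the revenue figures are hypothetical and are not taken from the experiments.

```python
import math


def price_of_adversary(revenue_without_adversary, revenue_with_adversary):
    """Ratio of benchmark revenue to adversarial-setting revenue (assumed definition)."""
    # A non-positive revenue under adversaries makes the ratio unbounded.
    if revenue_with_adversary <= 0:
        return math.inf
    return revenue_without_adversary / revenue_with_adversary


# Hypothetical revenues, for illustration only.
print(price_of_adversary(10.0, 4.0))  # 2.5
```

Under this reading, a value of 1 means that adversaries cost the seller nothing, and larger values mean a larger share of the benchmark revenue is lost.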

7 Related Work

Our work is within the emerging literature of data commercialization and its challenges [Thomas and Leiponen2016]. [Ghosh and Roth2015, Gkatzelis, Aperjis, and Huberman2015, Li et al.2014] study the problem of pricing personal data, where a data seller designs a pricing mechanism which incentivizes data subjects to reveal their private information. [Niyato et al.2016] design a pricing scheme for selling data to users with differing willingness to pay. Our approach differs in that we propose a contract-theoretical framework to accommodate heterogeneous honest buyers as well as adversarial types. More specifically, in contrast to existing work, we posit that honest buyers do not attempt to compromise the privacy of the database; hence, not every sale of data is a loss of privacy. Further, by far the most common practice in the real world is for the data seller to obtain data by compensating people in the form of a one-shot monetary payment or free service [Porter2018], which is part of our model. This avoids unrealistic mechanisms in which data subjects are paid every time their data is sold to a buyer [Li et al.2014].

[Adam and Worthmann1989] classified privacy-preserving query approaches into query restriction, data perturbation, and output perturbation. Query auditing (a form of query restriction) aims to determine whether, given the query history, a new query will compromise the database privacy; however, this problem is NP-hard [Kleinberg, Papadimitriou, and Raghavan2003]. In addition, output perturbation mechanisms (including differential privacy) must limit the number of queries in order to maintain any reasonable privacy guarantee [Dinur and Nissim2003]. Our proposed approach, which is a combination of query restriction with output perturbation, restricts the type and number of queries in light of these impossibility results.

Contract-theoretical frameworks have been receiving attention as a method for optimal pricing in other application areas, including the design of demand-response programs [Meir, Ma, and Robu2017], energy procurement methods [Tavafoghi and Teneketzis2014], and incentive mechanisms in crowdsourcing markets [Ho, Slivkins, and Vaughan2016]. In contrast, we consider the optimal pricing problem in the presence of both honest and adversarial users.

Another line of work studies the effects of malicious or spiteful agents in game-theoretical settings, including network inoculation games [Moscibroda, Schmid, and Wattenhofer2006], sealed-bid auctions and colluding bidders [Brandt, Sandholm, and Shoham2007, Micali and Valiant2008], and resource allocation games [Chorppath and Alpcan2011]. These works assume that malicious agents aim to minimize the utility of all other users, and analyze their effect on the Nash equilibria in a game-theoretic framework. In contrast, we consider the effects of an adversarial user on the principal’s revenue in a contract-theoretic framework.

8 Conclusion

We proposed a novel and practical adversarial contract design framework in which a data seller designs a collection of contracts to optimize her revenue in the presence of honest and adversarial users. We quantified the effect of adversaries by proposing the notion of price of adversary, and characterized the effect of fines on optimal revenue. We also presented a fast approximate technique to compute contracts in an adversarial setting.

References

  • [Acxiom.2018] Acxiom. 2018. Acxiom. https://www.acxiom.com/how-we-can-help/data-stewardship/. Accessed: 2018-08-15.
  • [Adam and Worthmann1989] Adam, N. R., and Worthmann, J. C. 1989. Security-control methods for statistical databases: a comparative study. ACM Computing Surveys (CSUR) 21(4):515–556.
  • [Brandt, Sandholm, and Shoham2007] Brandt, F.; Sandholm, T.; and Shoham, Y. 2007. Spiteful bidding in sealed-bid auctions. In IJCAI, volume 7, 1207–1214.
  • [Chorppath and Alpcan2011] Chorppath, A. K., and Alpcan, T. 2011. Adversarial behavior in network mechanism design. In Proceedings of the 5th International ICST Conference on Performance Evaluation Methodologies and Tools, 506–514. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering).
  • [Dinur and Nissim2003] Dinur, I., and Nissim, K. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, 202–210. ACM.
  • [Dwork et al.2006] Dwork, C.; McSherry, F.; Nissim, K.; and Smith, A. 2006. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference, 265–284. Springer.
  • [Dwork2008] Dwork, C. 2008. Differential privacy: A survey of results. In International Conference on Theory and Applications of Models of Computation, 1–19. Springer.
  • [Ghosh and Roth2015] Ghosh, A., and Roth, A. 2015. Selling privacy at auction. Games and Economic Behavior 91:334–346.
  • [Gkatzelis, Aperjis, and Huberman2015] Gkatzelis, V.; Aperjis, C.; and Huberman, B. A. 2015. Pricing private data. Electronic Markets 25(2):109–123.
  • [Granville2018] Granville, K. 2018. Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html.
  • [Ho, Slivkins, and Vaughan2016] Ho, C.-J.; Slivkins, A.; and Vaughan, J. W. 2016. Adaptive contract design for crowdsourcing markets: Bandit algorithms for repeated principal-agent problems. Journal of Artificial Intelligence Research 55:317–359.
  • [Hsu et al.2014] Hsu, J.; Gaboardi, M.; Haeberlen, A.; Khanna, S.; Narayan, A.; Pierce, B. C.; and Roth, A. 2014. Differential privacy: An economic method for choosing epsilon. In Computer Security Foundations Symposium (CSF), 2014 IEEE 27th, 398–410. IEEE.
  • [Kleinberg, Papadimitriou, and Raghavan2003] Kleinberg, J.; Papadimitriou, C.; and Raghavan, P. 2003. Auditing boolean attributes. Journal of Computer and System Sciences 66(1):244–253.
  • [Li et al.2014] Li, C.; Li, D. Y.; Miklau, G.; and Suciu, D. 2014. A theory of pricing private data. ACM Transactions on Database Systems (TODS) 39(4):34.
  • [Mas-Colell et al.1995] Mas-Colell, A.; Whinston, M. D.; Green, J. R.; et al. 1995. Microeconomic theory, volume 1. Oxford university press New York.
  • [Maskin and Riley1984] Maskin, E., and Riley, J. 1984. Monopoly with incomplete information. The RAND Journal of Economics 15(2):171–196.
  • [Meir, Ma, and Robu2017] Meir, R.; Ma, H.; and Robu, V. 2017. Contract design for energy demand response. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, 1202–1208. AAAI Press.
  • [Micali and Valiant2008] Micali, S., and Valiant, P. 2008. Revenue in truly combinatorial auctions and adversarial mechanism design. Technical Report MIT-CSAIL-TR-2008-039, MIT.
  • [Moscibroda, Schmid, and Wattenhofer2006] Moscibroda, T.; Schmid, S.; and Wattenhofer, R. 2006. When selfish meets evil: Byzantine players in a virus inoculation game. In Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing, 35–44. ACM.
  • [Niyato et al.2016] Niyato, D.; Alsheikh, M. A.; Wang, P.; Kim, D. I.; and Han, Z. 2016. Market model and optimal pricing scheme of big data and internet of things (iot). In The 2016 IEEE International Conference on Communications (ICC), 1–6. IEEE.
  • [Porter2018] Porter, E. 2018. Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It? https://www.nytimes.com/2018/03/06/business/economy/user-data-pay.html. Accessed: 2018-08-15.
  • [salesforce.com, inc.2018] salesforce.com, inc. 2018. Salesforce DMP Security, Privacy and Architecture. https://help.salesforce.com/servlet/servlet.FileDownload?file=0150M0000041PNOQA2. Accessed: 2018-08-15.
  • [Spiekermann et al.2015] Spiekermann, S.; Böhme, R.; Acquisti, A.; and Hui, K.-L. 2015. Personal data markets. Electronic Markets 25(2):91–93.
  • [Tavafoghi and Teneketzis2014] Tavafoghi, H., and Teneketzis, D. 2014. Optimal contract design for energy procurement. In Communication, Control, and Computing (Allerton), 2014 52nd Annual Allerton Conference on, 62–69. IEEE.
  • [Thomas and Leiponen2016] Thomas, L. D. W., and Leiponen, A. 2016. Big data commercialization. IEEE Engineering Management Review 44(2):74–90.

Appendix

Theorem 1

Proof of Theorem 1.

As a shorthand, we will write throughout.

Monotonicity: First, we claim that for every optimal fixed cost contract we must have whenever . Let . The IC constraints include

Adding these, we get

There are two cases: (1) or (2) . For case (1), we can claim that using the non-strict increasing differences (ID) property of the benefit functions. The proof is by contradiction. Assume ; then, by non-strict ID we must have which violates case (1). Hence under case (1) . As the reasoning here is not based on the seller’s objective value or the adversarial type’s constraints, we do not need to consider adversarial aspects here.

Next, under case (2), let . First, if is then when and are both strictly monotone increasing. As the reasoning here is not based on the objective value or the adversarial constraints, we do not need to consider adversarial aspects here. The case when and are both monotone non-decreasing has to be dealt with in a special way (see after the case below).

Thus, the only scenario left to analyze is . Then the two IC inequalities stated at the start can be re-written as and which imply , or . Also, , so that contract and are both equally and most preferred by (and similarly by ). Now offer another set of contracts in which is offered , and others are offered their earlier contract. In this new set of contracts, all of the IC constraints are still satisfied as type preferred the most and equally preferred the now unavailable . Any other type prefers its own allocation and price to , as was the case for the earlier set of contracts. Also, since and the earlier contract’s IR provided , the new contract’s IR is also satisfied . The is also trivially satisfied since was satisfied. In this new set of contracts, the revenue from increases and all other honest users provide the same revenue as earlier; thus, the operator’s revenue from the honest users strictly increases. Finally, we need to analyze the adversary’s incentives in this new collection of contracts. For the adversary, the new set of contracts provides fewer options to choose from; thus, for any choice made by the adversary in the new contract regime, he obtains utility no greater than that from the original contract set. As the operator’s utility is zero-sum with the adversary’s utility, the contribution from the adversarial part of the operator’s revenue either increases or stays the same in the new set of contracts. Putting these together, we have found a new, feasible set of contracts that strictly outperforms the original set of contracts, contradicting the optimality of the original set. Hence, we cannot have .

Special case (non-decreasing and ): The case when and are both monotone non-decreasing requires treating the special case of separately. Thus, reasoning exactly like the case we get that and and and are both equally preferred by . Now, if we are done, but if not we can offer to . Following an argument similar to the case of , the new set of contracts would satisfy all IR, SR, and IC constraints of the honest types. From the seller’s viewpoint, the overall revenue from legitimate users remains the same as under the original set of contracts. Further, following an argument similar to case , the contribution from the adversarial part of the revenue either increases or stays the same with the new set of contracts. Therefore, for this special case, we can claim that if , the set of contracts is revenue-equivalent to (or even worse than) a collection of contracts with . We conclude that at the optimal contract, for this case as well.
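The monotonicity argument can be illustrated in a deliberately simplified setting. The sketch below is not the model of this paper: it assumes linear benefits b_i(q) = theta_i * q (which satisfy increasing differences when theta increases with the type), a single access level q per contract, and no adversary. The function names are hypothetical. It computes the range of price differences that is compatible with the two IC constraints between a lower and a higher type, and shows that this range is empty when the higher type receives strictly less access.

```python
def price_gap_interval(theta_lo, theta_hi, q_lo, q_hi):
    """Bounds on p_hi - p_lo implied by the two IC constraints (assumed linear benefits)."""
    # IC of the low type:  theta_lo*q_lo - p_lo >= theta_lo*q_hi - p_hi
    #   => p_hi - p_lo >= theta_lo * (q_hi - q_lo)
    lower = theta_lo * (q_hi - q_lo)
    # IC of the high type: theta_hi*q_hi - p_hi >= theta_hi*q_lo - p_lo
    #   => p_hi - p_lo <= theta_hi * (q_hi - q_lo)
    upper = theta_hi * (q_hi - q_lo)
    return lower, upper


def ic_pair_feasible(theta_lo, theta_hi, q_lo, q_hi):
    """True if some pair of prices satisfies both IC constraints."""
    lower, upper = price_gap_interval(theta_lo, theta_hi, q_lo, q_hi)
    return lower <= upper


# Monotone access levels: a compatible price gap exists.
print(ic_pair_feasible(theta_lo=1, theta_hi=2, q_lo=1, q_hi=3))  # True
# Higher type gets strictly less access: no prices can satisfy both constraints.
print(ic_pair_feasible(theta_lo=1, theta_hi=2, q_lo=3, q_hi=1))  # False
```

In this toy setting the two bounds are exactly what one obtains by adding the two IC inequalities, which is why a reversed ordering of access levels leaves no feasible price gap.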

Constraint-set reduction: Next, we move on to the IC and IR constraints’ properties. We start with the IR constraints. Starting from , we have,

where the first line follows from , and the second line by the assumption on ordering of the benefit functions, i.e., . Thus, if is satisfied so is . Hence, given is satisfied, all other IR constraints are redundant. As the reasoning here is not based on objective value or adversary constraints, this assertion holds both with and without adversarial types.
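For orientation, the shape of this chain can be written in generic single-parameter screening notation. The symbols here are assumptions introduced only for illustration, and the contracts in this paper may carry additional parameters: b_i denotes the benefit function of type i, q_i the access level and p_i the price of the contract intended for type i, and type 1 is taken to be the lowest type, so that b_i(q) ≥ b_1(q) for all q.

```latex
\begin{align*}
b_i(q_i) - p_i &\ge b_i(q_1) - p_1 && \text{(IC: type $i$ does not prefer the contract of type 1)} \\
               &\ge b_1(q_1) - p_1 && \text{(ordering of the benefit functions)} \\
               &\ge 0              && \text{(IR of type 1)}
\end{align*}
```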

Next, we consider the (IC) constraints. By we have , which can be rearranged as . By non-strict increasing difference and, as shown earlier, the monotonicity of access levels, , we get,

Thus, . By , we have , and hence we can infer that . Thus, given the local downward IC constraints and , the constraint is redundant; similarly, all constraints are redundant for . Next, for the local upward IC constraints, starting from , we have , which can be rearranged as . Again, by non-strict increasing difference and monotonicity , we get . Thus, we conclude that given the local upward IC constraints, all other upward constraints for are redundant. Hence, only the local and constraints are non-redundant. As the reasoning here is not based on objective value or adversary constraints, the arguments remain valid in the presence of adversaries.
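The constraint-set reduction can be checked numerically on a toy instance. The sketch below again assumes a simplified model that is not the one in this paper: linear benefits b_i(q) = theta_i * q, one access level per contract, no adversary, and no SR constraint. All names and numbers are hypothetical. Prices are built only from the lowest type's IR constraint and the local downward IC constraints (each taken with equality), and the code then verifies every IR and every pairwise IC constraint by brute force.

```python
from itertools import product


def benefit(theta, q):
    # Assumed linear benefit; increasing differences hold when theta is increasing.
    return theta * q


def prices_from_local_constraints(thetas, qs):
    """Prices making IR bind for the lowest type and each local downward IC bind."""
    prices = [benefit(thetas[0], qs[0])]
    for i in range(1, len(thetas)):
        step = benefit(thetas[i], qs[i]) - benefit(thetas[i], qs[i - 1])
        prices.append(prices[-1] + step)
    return prices


def all_constraints_hold(thetas, qs, prices):
    """Brute-force check of every IR constraint and every pairwise IC constraint."""
    n = len(thetas)
    ir = all(benefit(thetas[i], qs[i]) - prices[i] >= 0 for i in range(n))
    ic = all(
        benefit(thetas[i], qs[i]) - prices[i]
        >= benefit(thetas[i], qs[j]) - prices[j]
        for i, j in product(range(n), repeat=2)
    )
    return ir and ic


thetas = [1, 2, 3, 4]  # types ordered from lowest to highest
qs = [1, 2, 4, 7]      # monotone non-decreasing access levels
prices = prices_from_local_constraints(thetas, qs)
print(prices)                                    # [1, 3, 9, 21]
print(all_constraints_hold(thetas, qs, prices))  # True
```

If the access levels are made non-monotone, for example qs = [2, 1, 4, 7], the same construction yields prices that violate an IC constraint, which is consistent with the role monotonicity plays in the argument.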

Next, we show that the local upward IC constraints are also redundant. For contradiction, suppose we solve the optimization problem without the constraint, and get the set of contracts that maximize the operator’s revenue. This solution should strictly violate (since we are assuming is not redundant). Therefore, type will strictly prefer the contract , that is, . We now modify the contracts by increasing by a small amount , i.e., we offer the contract