I Introduction
The index coding problem is one of the fundamental problems in wireless network coding. An instance of the index coding problem includes a server, a set of wireless clients, and a set of data chunks. Each client wants a subset of the data chunks in set and has a different subset of the data chunks in set given to it as side information. The server can transmit uncoded data chunks or coded data chunks (i.e., combinations of data chunks in set ) to all clients over a noiseless broadcast channel. The goal of the problem is to identify a coding (transmission) scheme requiring the minimum number of transmissions to satisfy the demands of all clients. For example, Fig. 1 depicts an instance of the index coding problem, where a server needs to deliver four data chunks in set to all clients. The conventional communication approach (without coding) transmits all four data chunks . With the aid of coding, broadcasting only three coded data chunks , , and (over ) can satisfy all clients.
Because transmitting an (uncoded or coded) data chunk can incur a significant transmission cost, the server should transmit a data chunk only when the data chunk is important enough to the clients. Unlike the original index coding problem (where the server has to satisfy all clients), we investigate a practical scenario where the server’s transmissions have to strike a balance between the value of the data chunks and the cost of the transmissions. Instead of minimizing the number of transmissions, our first goal is to identify a coding scheme for maximizing the social welfare, i.e., the difference between the total value of the data chunks that can be recovered by the clients and the cost incurred by the transmissions from the server. In our problem, the server transmits those data chunks whose value can justify the transmission cost.
To calculate the social welfare, the server needs each client’s side information and each client’s valuation of the data chunks it wants. However, each client is potentially selfish in the sense that it has private side information and a private value of each data chunk it wants. In particular, we cannot expect a selfish client to reveal the correct side information or the true value of each data chunk it wants. Thus, our second goal is to develop an incentive scheme for providing each client with an incentive to reveal the true information.
I-A Contributions
We investigate the index coding setting in the presence of selfish clients, aiming to propose a joint coding and incentive design (called a mechanism) for 1) motivating each selfish client to truthfully reveal its side information and the value of each data chunk it wants and 2) maximizing the social welfare. Our first main contribution is to provide a sufficient condition for mechanisms that can motivate each selfish client to be truthful. Our second main contribution is to develop computationally efficient mechanisms. With the proposed sufficient condition, we establish their truthfulness. Moreover, we analyze their optimality or their worst-case approximation ratios in terms of the social welfare. We also conduct extensive computer simulations to examine the proposed mechanisms in average-case scenarios.
I-B Related works
The index coding problem was introduced in [5] and has become a hot topic. Most related works characterized capacity regions (e.g., [3, arunachala2019optimal]) for various network settings or developed computationally efficient coding schemes to (optimally or approximately) achieve the regions (e.g., [1, li2018multi]). In addition to the original index coding problem, some variants of the index coding problem have also been investigated, such as the pliable index coding problem (e.g., [6]) and the secure index coding problem (e.g., [16]). See [2] for extensive surveys. All prior works on index coding neglected potentially selfish clients. Thus, our work introduces another variant of the index coding problem by considering selfish clients.
Many prior works on network coding considered selfish clients. Most of those works (e.g., [4, 15, 20]) analyzed equilibrium in the presence of selfish clients. Few works (e.g., [19, 8]) developed incentive schemes for network-coding-enabled networks. In particular, those works focused on incentive design for fixed coding schemes. For example, [19] and [8] used random linear codes. In contrast, our work considers a joint coding and incentive design problem.
Our problem is also related to auction design [17] for motivating an auction participant to reveal the true value of each item. However, our problem is fundamentally different from the traditional auction design as follows.

In the traditional auction, an item is given to the winner only. However, in our problem, a transmission from the server can be received by all clients, and a client can decode some data chunks by its side information.

In the traditional auction, an auction participant can only lie about the value of each item. However, in our problem, a client can lie about not only the value of each data chunk it wants but also the side information it has. The multidimensional private information poses more challenges than the traditional incentive design (as also claimed in [11]).
This paper considers a new and practical problem at the intersection of coding theory and game theory.
II System model
II-A Network model
We consider a wireless broadcast network consisting of a server and a set of wireless clients. The server has a set of data chunks, where each data chunk represents an element of the Galois field of order . The server can transmit uncoded data chunks or coded data chunks (combined from data chunks in set ) to all clients over a noiseless broadcast channel. Without loss of generality, each client wants a single data chunk in set and already has a subset of data chunks in set as side information. A client that wants more than one data chunk can be substituted by multiple clients with the same side information.
II-B Coding schemes
In this paper, we consider scalar-linear coding schemes. In those schemes, every transmission made by the server is a linear combination of the data chunks in set . Precisely, the th transmission made by the server can be expressed by with coding coefficient , for all , where represents the total number of transmissions made by the server. Let be the coding vector of . Moreover, let be the coding matrix whose th row is the coding vector of .
After receiving the transmissions from the server, client can recover data chunk it wants if and only if can be expressed as a linear combination of and its side information . Note that the server does not need to satisfy all clients in our setting. Let indicate if coding matrix can recover data chunk with side information , where if it can; if it cannot.
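The recovery condition above can be sketched in code. The following is a minimal illustration, assuming the binary field GF(2) and representing each coding vector as an integer bitmask (bit j set means chunk j appears in the combination). The names `in_span_gf2` and `can_recover` are hypothetical and are not taken from the paper.

```python
def in_span_gf2(rows, target):
    """Return True if bitmask `target` is a GF(2) linear combination of `rows`."""
    basis = {}  # leading-bit position -> basis vector with that leading bit

    def reduce(vec):
        # Cancel leading bits using the basis until nothing more can be cancelled.
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                break
            vec ^= basis[lead]
        return vec

    for row in rows:
        rest = reduce(row)
        if rest:
            basis[rest.bit_length() - 1] = rest
    return reduce(target) == 0


def can_recover(coding_rows, side_info, wanted):
    """Indicator of whether a client can recover chunk `wanted`.

    coding_rows: coding vectors of the transmissions (bitmasks).
    side_info:   indices of the chunks the client already holds.
    """
    # Each side-information chunk j contributes the unit vector e_j.
    rows = list(coding_rows) + [1 << j for j in side_info]
    return in_span_gf2(rows, 1 << wanted)
```

For instance, with transmissions d0+d1 and d1+d2 (bitmasks `0b011` and `0b110`), a client holding d0 can recover d2 by adding both transmissions to its side information, whereas a client with no side information cannot.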
Each client has a value representing the importance of data chunk to it. Suppose that each transmission (from the server) incurs a transmission cost of one unit. The transmission cost can reflect, for example, the power consumption. To capture the tradeoff between the importance of the data chunks and the power consumption, we define a social welfare by
(1) 
where the first term expresses the value of data chunk that can be recovered by client and the second term expresses the cost of the total transmissions made by the server. For example, the social welfare of transmitting , , and (the solution to the index coding problem) in Fig. 1 is . In contrast, the social welfare of transmitting is (where only clients and can recover the data chunks they want). Thus, transmitting is more valuable than transmitting , , and from the global view. In this paper, we aim to develop a coding scheme for maximizing the social welfare.
II-C Incentive schemes
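As a small numerical sketch of the social welfare, the function below sums the values of the clients that recover their chunks and subtracts one unit of cost per transmission, as described above. The function name, the argument layout, and the sample values are illustrative assumptions; they are not the values of the instance in Fig. 1.

```python
def social_welfare(values, recovered, num_transmissions, cost_per_transmission=1.0):
    """Total value of recovered chunks minus the total transmission cost.

    values[i]    : value client i assigns to the chunk it wants.
    recovered[i] : True if client i can recover its chunk.
    """
    gained = sum(v for v, ok in zip(values, recovered) if ok)
    return gained - cost_per_transmission * num_transmissions


# Hypothetical instance: clients 0 and 1 are satisfied by a single transmission.
welfare = social_welfare([0.5, 2.0, 0.25], [True, True, False], num_transmissions=1)
```

Here the welfare is 2.5 - 1 = 1.5; satisfying the third client with one extra transmission would add 0.25 in value but cost 1, which lowers the welfare.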
To maximize the social welfare, the server needs value and side information from each client . Thus, we consider a protocol with the following procedure:

The server broadcasts a hash function (with the input being an element of ) to inquire about the side information of all clients.

Each client responds with the hash outcomes produced by the data chunks in a set and by a set of random inputs denoted by ; moreover, it also submits a value of the data chunk it wants.
With those hash outcomes, the server can obtain side information and responded by each client . Because client has no knowledge about the data chunks in set , the random input set () is unlikely to produce a hash outcome corresponding to a data chunk in set . Thus, we can assume that the server can identify (because of ) and remove it from the side information responded by client . In the rest of the paper, we neglect the information . Moreover, suppose that each client can lie about the value of the data chunk it wants or can reveal only part of the side information it has, so that the information and (obtained by the server) can be different from the true information and (owned by client ). Thus, in this paper, we also aim to develop an incentive scheme for motivating each selfish client to reveal the correct value and the complete side information , so that and for all . Let and be the sets of all corresponding elements. Moreover, let and be the set of all corresponding elements except the one for client .
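The hash-based inquiry can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the (unspecified) hash function, chunks are modelled as byte strings, and the function names are hypothetical. Hash outcomes that match no chunk at the server, such as those produced by random inputs, are simply dropped.

```python
import hashlib
import os


def chunk_hash(chunk: bytes) -> str:
    # Assumed hash function; the protocol only needs the server and clients to agree on it.
    return hashlib.sha256(chunk).hexdigest()


def client_response(side_info_chunks, num_random_inputs=2):
    """Hash outcomes of the chunks a client reveals, plus outcomes of random inputs."""
    random_inputs = [os.urandom(16) for _ in range(num_random_inputs)]
    return [chunk_hash(c) for c in list(side_info_chunks) + random_inputs]


def server_side_info(all_chunks, response):
    """Indices of the server's chunks whose hash outcome appears in the response."""
    index_of = {chunk_hash(c): j for j, c in enumerate(all_chunks)}
    # Outcomes with no matching chunk (e.g., from random inputs) are removed.
    return {index_of[h] for h in response if h in index_of}


chunks = [b"a", b"b", b"c", b"d"]
response = client_response([chunks[1], chunks[3]])
revealed = server_side_info(chunks, response)
```

A client can omit chunks from its response (revealing only part of its side information), but it cannot produce the hash outcome of a chunk it does not hold except by chance.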
In this paper, we consider money transfers between the server and clients as an incentive. Each client has to pay the server for data chunk if the client can recover it. In this context, value of data chunk implies the maximum amount of money client is willing to pay to obtain it. Let be the payment of client charged by the server. A scheme determining payment for each client is referred to as a payment scheme. In general, a payment scheme depends on value set , side information set , and coding matrix (which determines indicator for each client ). The design of payment schemes and that of coding schemes depend on each other. Thus, we define a mechanism by a joint coding and payment scheme. Suppose that the mechanism is given to all clients.
For a given mechanism, we define a utility for client by , which is the difference between the value of the data chunk it can recover and the money charged by the server. Each client is selfish in the sense that, given the mechanism employed by the server, it submits information and for maximizing its utility .
In this paper, we aim to develop a mechanism such that
(2) 
for all , and , where value set is value set with being substituted by and side information set is side information set with being substituted by . The idea underlying Eq. (2) is that, regardless of the information and submitted by other clients, client can maximize its utility by submitting its true information and .
II-D Problem formulation
A mechanism satisfying Eq. (2) is referred to as a truthful mechanism. Moreover, a truthful mechanism that can also maximize the social welfare in Eq. (1) is referred to as an optimal truthful mechanism. We aim to develop an optimal truthful mechanism such that the social welfare and the utilities of all clients are simultaneously optimized. Our problem involves both the global and local optimization problems.
Note that local utility involves more than one type of private information (i.e., value and side information). The traditional incentive design for a single type of private information might be insufficient to motivate a client in our problem to reveal the true information of both types. Thus, Section III characterizes truthful mechanisms for our problem. With the results in Section III, we will develop optimal or approximate truthful mechanisms for various scenarios of our problem.
III Characterizing truthful mechanisms
This section provides a sufficient condition of truthful mechanisms for our problem. To that end, we introduce a type of coding schemes as follows.
Definition 1.
A coding scheme is a threshold-type coding scheme if, for every value set and side information set , there exists a threshold such that when and when , for all .
Note that threshold for client is independent of value submitted by client , but is dependent on side information revealed by client . The next theorem provides a sufficient condition of truthful mechanisms for our problem.
Theorem 2.
A mechanism is truthful if the following three conditions hold:

the coding scheme is a threshold-type coding scheme;

the payment scheme determines payment (for client ) if , or if ;

for all .
Proof.
See Appendix A. ∎
The first two conditions in the above theorem claim that a client cannot affect the payment by lying about the value of the data chunk it wants, because its payment depends on the values submitted by other clients. The third condition claims that a client can minimize its payment when revealing its complete side information. The theorem will be used later to establish the truthfulness of the proposed mechanisms. We remark that the theorem generalizes the sufficient condition in [17, Theorem 9.36] (which focused on the case when an auction participant has a single type of private information).
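The role of the first two conditions can be illustrated with a small sketch. It assumes that a client recovers its chunk exactly when its submitted value is at least the threshold, and that it then pays the threshold (and pays nothing otherwise); the function name and the numbers are hypothetical. The sketch covers only misreports of the value; the third condition, on side information, is not modelled.

```python
def utility(true_value, reported_value, threshold):
    """Utility of a client under a threshold-type rule with threshold payment.

    Assumption: the client recovers its chunk iff reported_value >= threshold,
    and is then charged the threshold, which does not depend on its own report.
    """
    recovered = reported_value >= threshold
    return true_value - threshold if recovered else 0.0


# With true value 0.75 and threshold 0.5, no misreport beats truthful reporting.
truthful = utility(0.75, 0.75, 0.5)
candidates = [utility(0.75, r, 0.5) for r in (0.0, 0.25, 0.5, 1.0, 2.0)]
```

Over-reporting only matters when the true value is below the threshold, in which case the client recovers the chunk but pays more than the chunk is worth to it.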
IV VCG-based mechanism design
This section proposes an optimal truthful mechanism leveraging the celebrated Vickrey-Clarke-Groves (VCG) approach [17]. Note that the original VCG mechanism provides an auction participant with an incentive to reveal only the true value of each item. However, our Theorem 4 will show that the proposed VCG-based mechanism can motivate each selfish client to reveal not only the true value of the data chunk it wants but also its complete side information.
Our VCG-based mechanism uses the following function
(3) 
where indicates if coding matrix can recover data chunk with side information : if it can, but otherwise. The function is the social welfare in Eq. (1) computed by the information and obtained by the server. Then, we propose our VCG-based mechanism as follows, including a VCG-based coding scheme and a VCG-based payment scheme.
VCG-based coding scheme: For given value set and side information set , identify a coding matrix for maximizing function :
(4) 
Remark 3.
We require the VCG-based coding scheme to meet a condition without loss of optimality to Eq. (4). If under coding matrix computed by Eq. (4), then coding vector of coding matrix must be the zero vector. Setting those zero coding vectors for those data chunks that cannot be recovered does not change the maximum function value . The requirement is to prevent client from recovering data chunk with side information (that is unknown to the server) if . If multiple coding matrices are solutions to Eq. (4) and satisfy the requirement, then we can arbitrarily choose one.
VCG-based payment scheme: If under coding matrix computed by Eq. (4), then charge client
(5) 
where value set is value set with value being substituted by zero. Note that payment is nonnegative for all . The idea underlying Eq. (5) is to calculate threshold (see Theorem 2) as payment for client . By setting , the first term of Eq. (5) calculates the maximum function value among all possible coding matrices that cannot recover data chunk . By setting and , the second term of Eq. (5) calculates the maximum function value among all possible coding matrices that can recover data chunk , when client submits . Thus, if client submits a value greater than that difference, then it can recover data chunk by the VCG-based coding scheme.
The next theorem establishes the truthfulness and the optimality of the proposed VCG-based mechanism.
Theorem 4.
The VCG-based mechanism is an optimal truthful mechanism.
Proof.
Appendix B confirms that the VCG-based mechanism is truthful (by Theorem 2). Then, all clients submit the true values of the data chunks they want and their complete side information. Moreover, by Eq. (4), the VCG-based coding scheme maximizes the social welfare. Thus, the VCG-based mechanism is an optimal truthful mechanism. ∎
V Algorithmic hardness results
Note that the proposed VCG-based mechanism involves the combinatorial optimization problems in both Eqs. (4) and (5). The next proposition shows that the combinatorial optimization problems are NP-hard.
Proposition 5.
The combinatorial optimization problems in Eqs. (4) and (5) are NP-hard.
Proof.
We construct a reduction from the original index coding problem. See Appendix C for details. ∎
To develop computationally efficient mechanisms, the rest of this paper considers sparse coding schemes defined as follows.
Definition 6.
A coding scheme is a sparse coding scheme if each transmission from the server is a linear combination of at most two data chunks in set (over ).
With sparse coding schemes, both encoders and decoders can be easily implemented, bringing significant advantages for practical applications.
This paper considers two different scenarios: the multiple unicast scenario and the multiple multicast scenario. While in the multiple unicast scenario each client wants a different data chunk, in the multiple multicast scenario many clients can request the same data chunk. Moreover, we define a special decoding scheme called the instant decoding scheme, which can combine each individual transmission (from the server) with side information but cannot combine multiple transmissions. For example, in Fig. 1, client can instantly decode data chunk by . In contrast, client cannot instantly decode data chunk by or separately (but it can decode by combining both and ). The instant decoding scheme has a significant advantage in practical applications because a client only needs to receive one transmission to recover the data chunk it wants; in particular, the client does not need to wait for all the transmissions before recovering data chunk , resulting in a low decoding delay. In contrast, a decoding scheme that can combine more than one transmission with side information is referred to as a general decoding scheme.
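The difference between the two decoding schemes can be made concrete with a short check for instant decodability. The sketch assumes codes over GF(2), so that a transmission is fully described by the set of chunk indices it combines; the function names are illustrative.

```python
def can_instantly_decode(transmission, side_info, wanted):
    """True if one transmission plus side information yields the wanted chunk.

    transmission: set of chunk indices combined in a single coded transmission.
    The wanted chunk must appear in the combination, and every other chunk in
    the combination must already be held by the client.
    """
    combined = set(transmission)
    return wanted in combined and combined - {wanted} <= set(side_info)


def instantly_decodable(transmissions, side_info, wanted):
    """True if at least one individual transmission is instantly decodable."""
    return any(can_instantly_decode(t, side_info, wanted) for t in transmissions)


# Hypothetical transmissions d0+d1 and d1+d2.
sent = [{0, 1}, {1, 2}]
```

With these transmissions, a client that holds d1 and wants d0 decodes instantly from the first transmission, while a client that holds d0 and wants d2 must combine both transmissions and therefore needs the general decoding scheme.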
Section VI develops computationally efficient mechanisms for the multiple unicast scenario. While Section VI-A proposes an algorithm optimally solving Eqs. (4) and (5) in polynomial time for the instant decoding scheme, Section VI-B establishes that the combinatorial optimization problems in Eqs. (4) and (5) are still NP-hard for the general decoding scheme. To cope with the NP-hardness, Sections VI-B and VI-C develop two approximate truthful mechanisms. Subsequently, Section VII shows that, in the multiple multicast scenario, the combinatorial optimization problem in Eq. (4) is not only NP-hard but also NP-hard to approximate for both decoding schemes. Table I summarizes our main results.
| Scenario | Instant decoding scheme | General decoding scheme |
|---|---|---|
| Multiple unicast scenario | Polynomial-time algorithm (Alg. 1) for solving Eqs. (4) and (5) | Two approximate truthful mechanisms (Algs. 2 and 3, and the modified scheme of Section VI-C) |
| Multiple multicast scenario | NP-hard to approximate Eq. (4) (Theorem 15) | NP-hard to approximate Eq. (4) (Theorem 16) |
VI Mechanism design for the multiple unicast scenario
This section develops computationally efficient truthful mechanisms (with sparse coding schemes) for the multiple unicast scenario by proposing polynomial-time algorithms for (optimally or approximately) solving Eq. (4). We remark that an approximate solution to Eqs. (4) and (5) is no longer a truthful mechanism (see Example 10 later). Thus, we devise alternative payment schemes to substitute the previously proposed VCG-based payment scheme for guaranteeing the truthfulness (see Sections VI-B and VI-C later).
To solve Eq. (4), we introduce a weighted dependency graph constructed as follows: given value set and side information set ,

for each client , construct a vertex ;

for any two clients and such that , construct a directed arc ;

associate each arc with an arc weight .
The weighted dependency graph generalizes the dependency graph in [5] to a weighted version. We denote the weighted dependency graph by , where is the vertex set, is the arc set, and is the arc weight set. Fig. 2 illustrates the weighted dependency graph for the instance in Fig. 1.
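The construction can be sketched as follows for the multiple unicast scenario. Two points are assumptions made for illustration: an arc from client i to client j is created when client i holds the chunk that client j wants, and the weight attached to an arc is the value submitted by the client at its tail. The function name is hypothetical.

```python
def build_dependency_graph(wants, side_info, values):
    """Return the arcs of a weighted dependency graph as {(i, j): weight}.

    wants[i]     : index of the chunk client i wants.
    side_info[i] : set of chunk indices client i reports as side information.
    values[i]    : value client i submits for the chunk it wants.
    """
    n = len(wants)
    arcs = {}
    for i in range(n):
        for j in range(n):
            # Assumed arc rule: client i holds the chunk wanted by client j.
            if i != j and wants[j] in side_info[i]:
                arcs[(i, j)] = values[i]  # assumed weight: value of the tail client
    return arcs


# Hypothetical instance whose graph is the single cycle 0 -> 1 -> 2 -> 0.
graph = build_dependency_graph([0, 1, 2], [{1}, {2}, {0}], [0.5, 0.75, 1.0])
```

Checking every ordered pair of clients is what makes the construction quadratic in the number of clients.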
We make two observations about weighted dependency graphs:

for the general decoding scheme, the server can satisfy all clients in a cycle in graph with transmissions;

for the instant decoding scheme, the server can satisfy all clients in a cycle with in graph with transmissions.
For example, with the general decoding scheme, clients , , and in cycle of Fig. 2 can recover the data chunks they want with and . In contrast, with the instant decoding scheme, clients , , and cannot recover the data chunks they want with any two transmissions among , , or . However, clients and in cycle of Fig. 2 can instantly decode the data chunks they want with .
We say that a coding scheme transmits along cycle in weighted dependency graph if it constructs transmissions (by pairwise coded data chunks) for satisfying all clients in the cycle. Note that, for the instant decoding scheme, a coding scheme can transmit along cycle with only, according to the above observations. While transmitting along a set of (vertex) disjoint cycles can satisfy all clients in those cycles with fewer transmissions than the number of the satisfied clients, all other sparse codes with no cycle being involved cannot (see [7] for details). Those transmissions with no cycle being involved can be substituted by uncoded data chunks without changing the function value in Eq. (3). Thus, we can focus on sparse coding schemes that transmit along disjoint cycles and additionally transmit uncoded data chunks if vertex is not in those cycles (i.e., client cannot recover data chunk with the transmissions along the cycles) but .
We aim to identify a coding matrix including the coding vectors of transmitting along a set of disjoint cycles and those of uncoded data chunk if is not in those cycles but , for maximizing
(6) 
where (a) is because transmitting along cycle satisfies all clients in the cycle with transmissions; (b) considers uncoded data chunks for those clients that submit values of no less than one but cannot recover the data chunks they want with the transmissions along the cycles. Then, we use the notation to represent the truncation of toward one; in particular, we can rewrite Eq. (6) in terms of truncated values as follows:
(7) 
where (a) adds back the deducted value (caused by the truncation). Because the value of the term is zero and the value of the term in Eq. (7) is constant, it suffices to maximize .
To that end, we associate each cycle in weighted dependency graph with a cycle weight defined by
(8) 
which implies the difference between the total truncated value submitted by the clients in cycle and the cost of transmitting along cycle . Note that, for the instant decoding scheme, we assign cycle weight to cycles with only. Then, we can turn our attention to a maximum weight cycle packing problem: identifying a set of disjoint cycles for maximizing the total cycle weight in graph .
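A minimal sketch of the cycle weight follows. It assumes that truncation toward one means taking the minimum of the value and one, and that transmitting along a cycle with k clients uses k - 1 transmissions under the general decoding scheme (a single transmission for a two-client cycle under the instant decoding scheme). Names and sample values are illustrative.

```python
def truncate(value):
    """Truncate a submitted value toward one (assumed to mean min(value, 1))."""
    return min(value, 1.0)


def cycle_weight(cycle, values, instant=False):
    """Total truncated value of the clients in a cycle minus the transmission cost.

    cycle   : list of client indices forming the cycle.
    instant : if True, only two-client cycles receive a weight (None otherwise).
    """
    k = len(cycle)
    if instant and k != 2:
        return None
    return sum(truncate(values[i]) for i in cycle) - (k - 1)


# Hypothetical values: the third client's value 1.5 is truncated to 1.
weight = cycle_weight([0, 1, 2], [0.5, 0.75, 1.5])
```

In this example the weight is 0.5 + 0.75 + 1 - 2 = 0.25, so transmitting along the cycle gains more truncated value than it costs.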
Section VI-A optimally solves our maximum weight cycle packing problem for the instant decoding scheme. For the general decoding scheme, Sections VI-B and VI-C propose two approximate solutions to our maximum weight cycle packing problem and their respective payment schemes as the incentives.
VI-A The instant decoding scheme
This section develops Alg. 1 for optimally solving Eqs. (4) and (5) when all clients use the instant decoding scheme. Given value set and side information set , Alg. 1 aims to construct a (sparse) coding matrix for maximizing function in Eq. (7) in polynomial time.
To that end, Alg. 1 constructs weighted dependency graph in Line 1, aiming to identify a set of disjoint cycles with for maximizing total cycle weight in set . To identify such a set of disjoint cycles, Alg. 1 constructs an undirected graph in Line 1 with the following procedure:

for each vertex , construct a vertex ;

for any two vertices such that both arcs and are in set , construct an edge ;

associate each edge with an edge weight such that .
With the construction, each cycle with in graph corresponds to an edge in graph ; in particular, a set of disjoint cycles in graph corresponds to a matching in graph . Moreover, a cycle weight in graph corresponds to the edge weight in graph . Thus, a set of disjoint cycles in graph for maximizing the total cycle weight corresponds to a maximum weight matching in graph . Alg. 1 identifies a maximum weight matching in graph in Line 1 (by a polynomial-time algorithm such as Edmonds’s algorithm [10]). Subsequently, Alg. 1 adds the coding vectors of the transmissions (along those cycles corresponding to the maximum weight matching) to coding matrix in Line 1. Finally, if client submitting value is not satisfied by the coding matrix constructed by the maximum weight matching, then Alg. 1 adds the coding vector of data chunk to coding matrix in Line 1. The discussion in this paragraph leads to the following lemma.
VI-B General decoding scheme: approximate truthful mechanism
The next lemma shows that, for the general decoding scheme, the combinatorial optimization problems in Eqs. (4) and (5) are still NP-hard.
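The steps of this paragraph can be sketched as below. To keep the example self-contained, the maximum weight matching is found by exhaustive recursion, which is exponential in the number of clients and merely stands in for a polynomial-time method such as Edmonds's algorithm. The edge weight of a two-client cycle is taken to be the sum of the two truncated values minus one, and the function name is hypothetical.

```python
from functools import lru_cache


def alg1_sketch(wants, side_info, values):
    """Return the transmissions (sets of chunk indices) for instant decoding."""
    n = len(wants)
    hat = [min(v, 1.0) for v in values]  # truncated values

    # Undirected edge {i, j} exists iff each client holds the other's wanted chunk.
    edges = {
        (i, j): hat[i] + hat[j] - 1.0
        for i in range(n) for j in range(i + 1, n)
        if wants[j] in side_info[i] and wants[i] in side_info[j]
    }

    @lru_cache(maxsize=None)
    def best(free):
        """Maximum weight matching among the vertices in frozenset `free`."""
        if not free:
            return 0.0, ()
        i = min(free)
        rest = free - {i}
        result = best(rest)  # option: leave vertex i unmatched
        for j in rest:
            w = edges.get((i, j))
            if w is not None and w > 0:
                sub_weight, sub_matching = best(rest - {j})
                if w + sub_weight > result[0]:
                    result = (w + sub_weight, sub_matching + ((i, j),))
        return result

    _, matching = best(frozenset(range(n)))
    matched = {v for edge in matching for v in edge}
    # One pairwise-coded transmission per matched pair.
    transmissions = [{wants[i], wants[j]} for i, j in matching]
    # Uncoded chunks for unmatched clients whose value is at least one.
    transmissions += [{wants[i]} for i in range(n)
                      if i not in matched and values[i] >= 1.0]
    return transmissions


plan = alg1_sketch([0, 1, 2, 3], [{1}, {0, 2}, {1}, set()], [0.75, 0.75, 0.5, 2.0])
```

In this hypothetical instance clients 0 and 1 are paired (edge weight 0.5 beats the 0.25 of pairing clients 1 and 2), client 2 is left unsatisfied, and client 3 receives its chunk uncoded because its value is at least one.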
Lemma 8.
Thus, this section and the next section develop two algorithms (Alg. 2 and its further modification) for approximately solving Eq. (4). To that end, Alg. 2 constructs weighted dependency graph in Line 2, aiming to approximately solve our maximum weight cycle packing problem.
The idea underlying Alg. 2 is to iteratively identify a maximum weight cycle in a greedy way. Note that, in general, identifying a maximum weight cycle in a graph is NP-hard [18]. However, for our problem, we can observe that cycle weight of cycle in Eq. (8) can be written as
(9) 
By associating each arc with an arc cost , we can associate each cycle with a cycle cost , which is the total arc cost in cycle . Then, cycle weight in Eq. (9) becomes . Removing the constant, a maximum weight cycle minimizes cycle cost . Thus, Alg. 2 identifies a minimum cost cycle in Line 2 (by a polynomial-time algorithm such as the Floyd-Warshall algorithm [9]), followed by adding the coding vectors of the transmissions along the cycle to coding matrix in Line 2. Subsequently, Alg. 2 removes the cycle from the present graph in Line 2. The condition in Line 2 guarantees that the maximum weight cycle in the present graph has a nonnegative weight. Finally, if client submitting value is not in those selected cycles, then Alg. 2 adds the coding vector of data chunk to in Line 2.
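The greedy procedure can be sketched as follows. Several choices are assumptions for illustration: every arc leaving client i costs one minus its truncated value, so that a cycle's weight equals one minus its cost; an arc runs from client i to client j when i holds the chunk j wants; and the minimum cost cycle is found with Dijkstra's algorithm from every vertex rather than with the Floyd-Warshall algorithm. Function names are hypothetical.

```python
import heapq


def min_cost_cycle(vertices, arcs, cost):
    """Return (cycle cost, cycle as a vertex list), or None if no cycle exists.

    arcs    : set of directed arcs (u, v).
    cost[u] : non-negative cost of every arc leaving u.
    """
    out = {u: [] for u in vertices}
    for u, v in arcs:
        if u in out and v in out:
            out[u].append(v)
    best = None
    for s in vertices:
        dist, parent = {s: 0.0}, {s: None}
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v in out[u]:
                nd = d + cost[u]
                if v == s:  # arc (u, s) closes a cycle through s
                    if best is None or nd < best[0]:
                        path, x = [], u
                        while x is not None:
                            path.append(x)
                            x = parent[x]
                        best = (nd, path[::-1])
                elif nd < dist.get(v, float("inf")):
                    dist[v], parent[v] = nd, u
                    heapq.heappush(heap, (nd, v))
    return best


def greedy_cycle_packing(wants, side_info, values):
    """Greedily select disjoint cycles, then list clients served uncoded."""
    n = len(wants)
    cost = {i: 1.0 - min(values[i], 1.0) for i in range(n)}
    arcs = {(i, j) for i in range(n) for j in range(n)
            if i != j and wants[j] in side_info[i]}
    remaining, cycles = set(range(n)), []
    while True:
        found = min_cost_cycle(remaining, arcs, cost)
        # Stop when the best cycle would have negative weight (cost above one).
        if found is None or found[0] > 1.0:
            break
        cycles.append(found[1])
        remaining -= set(found[1])
    uncoded = [i for i in sorted(remaining) if values[i] >= 1.0]
    return cycles, uncoded
```

For example, with `wants = [0, 1, 2, 3]`, `side_info = [{1}, {2}, {0}, set()]` and `values = [0.5, 0.75, 1.0, 0.25]`, the only cycle has cost 0.5 + 0.25 + 0 = 0.75, hence weight 0.25, and is selected; client 3 is left unserved because its value is below one.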
Let be the coding matrix produced by Alg. 2 and let be a (sparse) coding matrix maximizing function . The next theorem analyzes the approximation ratio of Alg. 2.
Theorem 9.
The approximation ratio of Alg. 2 is the maximum cycle length in weighted dependency graph .
Proof.
See Appendix E. ∎
Because of Theorem 9, we refer to Alg. 2 as approximate coding scheme. Next, we show that applying approximate coding scheme to solve Eqs. (4) and (5) is no longer a truthful mechanism.
Example 10.
Look at Fig. 3. First, suppose that all clients submit the true values of the data chunks they want. Then, Alg. 2 produces along cycle in Fig. 3(b). In this case, client has zero utility. Second, suppose that client submits but other clients submit the true values of the data chunks they want. Then, Alg. 2 produces and along cycles and , respectively. By solving Eq. (5) with Alg. 2, client is charged . In this case, client has utility . Thus, the condition for a truthful mechanism in Eq. (2) fails. Client can obtain a higher utility by lying about the value of data chunk .
To address the issue in the above example, we propose a payment scheme in Alg. 3 so that the joint design of Algs. 2 and 3 is a truthful mechanism. The underlying idea of Alg. 3 is to calculate threshold for each client (that can recover data chunk by Alg. 2) as payment . To that end, Alg. 3 constructs weighted dependency graph in Line 3; moreover, Alg. 3 associates each arc with an arc cost in Line 3. Note that Alg. 3 defines the arc costs in a different way from Alg. 2; precisely, Alg. 3 associates each outgoing arc from vertex with the cost of one unit (i.e., assuming value ). Then, Alg. 3 calculates the difference of the cycle costs between cycle (in Line 3) and cycle (in Line 3), where cycle has the globally maximum weight but cycle has the locally maximum weight among those cycles containing vertex . While the value of (see Eq. (9)) is analogous to the first term of Eq. (5), that of is analogous to the second term of Eq. (5). Thus, the difference of the cycle costs in Line 3 for each iteration is the minimum value submitted by client such that a cycle containing vertex can be selected by Line 2 of Alg. 2 in that iteration. Then, Alg. 3 identifies threshold by searching for the minimum among all iterations in Line 3 along with the initial value of being 1 as in Line 3 (because each client can recover the data chunk it wants when submitting ).
By verifying the three conditions in Theorem 2, the next theorem shows that the joint design of Algs. 2 and 3 is a truthful mechanism.
Theorem 11.
Proof.
See Appendix F. ∎
VI-C approximate truthful mechanism
This section proposes another approximate coding scheme and its corresponding payment scheme for guaranteeing the truthfulness. The approximate coding scheme modifies the previously proposed Alg. 2. The modified approximate coding scheme substitutes Line 2 of Alg. 2 (i.e., identifying a cycle for maximizing cycle weight ) by identifying a cycle for maximizing . The underlying idea is to maximize cycle weight (as in Alg. 2) and at the same time to minimize the number of transmissions (because shorter cycle lengths can yield more cycles). To that end, we propose Alg. 4 for obtaining such a cycle in a weighted dependency graph. Line 4 of Alg. 4 searches for a cycle for maximizing cycle weight subject to the cycle length being no more than . Then, Line 4 of Alg. 4 can identify cycle for maximizing subject to the cycle length being no more than ; in particular, Line 4 can identify cycle for maximizing in the last iteration. The next lemma justifies the correctness of Alg. 4.
Lemma 12.
Given a weighted dependency graph, Alg. 4 can identify a cycle for maximizing .
Proof.
See Appendix G. ∎
The next theorem provides the approximation ratio of the modified approximation algorithm.
Theorem 13.
Proof.
See Appendix H. ∎
VI-D Complexities of the proposed coding schemes
This section investigates the computational complexities of the three proposed coding schemes for the multiple unicast scenario: 1) Alg. 1 for the instant decoding scheme; 2) approximate coding scheme (Alg. 2) for the general decoding scheme; 3) approximate coding scheme (modified Alg. 2 along with Alg. 4) for the general decoding scheme. All three schemes are based on a weighted dependency graph. Constructing a weighted dependency graph takes steps to check all pairs of clients.
Regarding Alg. 1, we can apply Edmonds’s maximum weight matching algorithm [10] to Line 1 of Alg. 1, whose complexity is . Then, the complexity of Alg. 1 is .
VI-E Numerical results
This section numerically analyzes the proposed coding schemes for the multiple unicast scenario via computer simulations, including Alg. 1, approximate coding scheme in Alg. 2, and approximate coding scheme modified from Alg. 2.
Fig. 4 simulates Alg. 1 with the instant decoding scheme and the two approximate coding schemes with the general decoding scheme. The two subfigures display the social welfare when each client has 3 and 6 data chunks, respectively, in its side information. The experimental setting is as follows: We simulate clients (x-axis) and set , where client wants data chunk . Value of data chunk is uniformly picked between 0 and 1. Let value because of the truthfulness guaranteed by the proposed payment schemes. The data chunks in side information of client are randomly selected from set . All results are averaged over 500 simulation runs.
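A random instance of this kind can be generated with a few lines of code. The sketch assumes that there are as many data chunks as clients, that client i wants chunk i, and that the side information of a client is drawn uniformly without replacement from the chunks other than the one it wants; the function name and the seed handling are illustrative.

```python
import random


def random_instance(num_clients, side_info_size, seed=0):
    """Generate (wants, values, side_info) for one simulation run."""
    rng = random.Random(seed)
    wants = list(range(num_clients))  # client i wants chunk i
    values = [rng.uniform(0.0, 1.0) for _ in range(num_clients)]
    side_info = []
    for i in range(num_clients):
        # Assumed: a client never holds the chunk it wants.
        others = [j for j in range(num_clients) if j != i]
        side_info.append(set(rng.sample(others, side_info_size)))
    return wants, values, side_info


wants, values, side_info = random_instance(num_clients=10, side_info_size=3, seed=1)
```

Averaging a metric over many runs then amounts to calling `random_instance` with different seeds and feeding each instance to the coding scheme under test.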
From Fig. 4, we can observe that even though both approximate coding schemes cannot achieve the maximum social welfare, they still outperform Alg. 1 (with the instant decoding scheme). The result tells us that the proposed approximate coding schemes can take advantage of the general decoding scheme.
To compare the proposed approximate coding schemes with uncoded schemes, Figs. 5 and 6 display the total value of the data chunks that can be recovered. The results for the “no coding” scheme in Figs. 5 and 6 are obtained as follows: we first obtain the number of transmissions incurred by an approximate coding scheme, and then calculate the sum of the top values of the data chunks in set (which is the maximum total value when the server transmits uncoded data chunks). Note that the maximum social welfare is zero if the server can transmit uncoded data chunks only. From Figs. 4 to 6, we can observe that both approximate coding schemes improve both the social welfare and the total value over the best uncoded transmission scheme.
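The “no coding” baseline described above can be sketched as follows. The function names and the `cost` parameter are hypothetical; in particular, the per-transmission cost used in the welfare computation is an assumption made here for illustration and is not taken from the text.

```python
from typing import Sequence


def no_coding_total_value(values: Sequence[float], num_transmissions: int) -> float:
    """Best total value achievable with a given number of uncoded transmissions.

    In the multiple unicast scenario each uncoded chunk serves one client,
    so the best choice is to send the chunks with the largest values.
    """
    top = sorted(values, reverse=True)[:num_transmissions]
    return sum(top)


def no_coding_welfare(
    values: Sequence[float], num_transmissions: int, cost: float
) -> float:
    """Total recovered value minus total transmission cost (cost is assumed)."""
    return no_coding_total_value(values, num_transmissions) - cost * num_transmissions


# Hypothetical values; with two transmissions the chunks worth 0.9 and 0.7 are sent.
example_values = [0.9, 0.2, 0.7, 0.4]
baseline = no_coding_total_value(example_values, 2)
```

If every value is at most the assumed cost of one transmission, each uncoded transmission contributes a non-positive amount to the welfare, so sending nothing is optimal and the welfare of the uncoded scheme is zero.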
VII Inapproximability results for the multiple multicast scenario
Thus far, we have analyzed the multiple unicast scenario. This section analyzes the multiple multicast scenario. We start with the instant decoding scheme; in particular, we show that the combinatorial optimization problem in Eq. (4) is as hard to approximate as the independent set problem, which is extremely hard to approximate [12].
Theorem 15.
In the multiple multicast scenario with sparse coding schemes and the instant decoding scheme, the combinatorial optimization problem in Eq. (4) is NP-hard and NP-hard to approximate.
Proof.
We construct a reduction from the independent set problem [12]. Given a graph (with vertex set and edge set ) of the independent set problem, we construct an instance of our problem as follows. For each vertex and edge , we construct data chunks and . The data chunk set consists of and for all and . For each edge , we construct three clients such that

, , ,

, , ,

, , ,
where is the number of edges that are incident to vertex . Then, Appendix I shows that our problem equivalently becomes the independent set problem, yielding the result. ∎
We have the same result for the general decoding scheme in the following theorem.
Theorem 16.
In the multiple multicast scenario with sparse coding schemes and the general decoding scheme, the combinatorial optimization problem in Eq. (4) is NP-hard and NP-hard to approximate.
Proof.
See Appendix J. ∎
VIII Concluding remarks
This paper treated a practical index coding setting in the presence of selfish clients. We proposed a sufficient condition for truthful mechanisms (i.e., joint coding and payment schemes). Leveraging the proposed condition, we proposed computationally efficient mechanisms that are truthful. The proposed mechanisms can either maximize the social welfare or approximate it with provable approximation ratios. The simulation results also validate the proposed coding schemes.
This paper considered a new problem of joint coding and incentive design. Several directions for future work remain. First, this paper focused on sparse coding schemes; mechanism design that leverages more advanced coding schemes would be interesting. Second, approximate or exact mechanism design for the multiple multicast scenario remains open. Finally, this paper uses money transfers as an incentive; developing a non-monetary truthful mechanism would be promising.
Appendix A Proof of Theorem 2
First, we claim that if the first and second conditions hold, then for any value set and any side information set , client can maximize its utility by submitting the true value of data chunk . To prove that claim, we consider three cases as follows.

: First, suppose that client submits the true value of data chunk . By the first condition, client can recover data chunk . By the second condition, client is charged threshold . Thus, client has utility . Second, suppose that client submits value of data chunk . By the first and second conditions, if , then client has utility ; if , then client has zero utility. In summary, client can maximize its utility by submitting the true value of data chunk .

: First, suppose that client submits the true value of data chunk . By the first and second conditions, client has zero utility. Second, suppose that client submits value of data chunk . By the first and second conditions, if , then client has utility ; if , then client has zero utility. In summary, client can maximize its utility by submitting the true value of data chunk .

: First, suppose that client submits the true value of data chunk . By the second condition, client has zero utility whether it can recover data chunk or not. Second, suppose that client submits value of data chunk . If , then client has utility