Structured Index Coding Problem and Multi-access Coded Caching

12/08/2020 ∙ by Kota Srinivas Reddy, et al. ∙ IIT Bombay

Index coding and coded caching are two active research topics in information theory with strong ties to each other. Motivated by the multi-access coded caching problem, we study a new class of structured index coding problems (ICPs) which are formed by the union of several symmetric ICPs. We derive upper and lower bounds on the optimal server transmission rate for this class of ICPs and demonstrate that they differ by at most a factor of two. Finally, we apply these results to the multi-access coded caching problem to derive better bounds than the state of the art.

I Introduction

Index coding is a fundamental problem in network information theory [1]. It consists of a central server holding a collection of messages and communicating with a set of users over a broadcast channel. Each user has prior knowledge of a subset of the messages, referred to as side-information, and is interested in recovering another subset of the messages. The goal of the index coding problem (ICP) is to minimize the server transmission size while satisfying all the user requests. While the problem remains open in general, several bounds are known based on min-rank [2], local chromatic number [3], local partial clique cover [4] and maximum acyclic induced subgraph (MAIS) [5]. There has also been work on characterizing the optimal transmission rate for specific sub-classes of ICPs, such as the single unicast index coding problem (SUICP) with symmetric neighboring and consecutive (SUICP-SNC) side-information [6, 7] and the SUICP with symmetric consecutive interference (SUICP-SCI) [8].

Index coding also has close relations to several other topics of active research including network coding [9], distributed storage [10], distributed computing [11], and coded caching [12]. The coded caching problem was introduced in [13] which pioneered an information-theoretic view of the classical caching problem, and has since attracted a lot of attention in the community, see for example [12] for an extensive survey. The coded caching problem is tied intimately to the index coding problem and [14, 15] used ideas from index coding to find the optimal rate for the coded caching problem under the assumption of uncoded placement. In this work, we focus on a new class of structured ICPs which are formed by the union of several symmetric ICPs [6] and are motivated by the multi-access coded caching problem (MACC) introduced in [16] and studied further in [17, 18, 19]. For this special class of ICPs, we characterize the optimal server transmission rate up to a constant factor of two and apply these results to the MACC problem to derive better bounds than the state of the art [17, 18, 19].

The rest of the paper is organized as follows. Sections II and III introduce the notation and the ICP setting. Sections IV and V present the ICP preliminaries and our main results. Section VI describes the MACC setting and its improved upper bound. Section VII contains some discussion. All proofs are relegated to Section VIII and the Appendix.

II Notations

  • size of file / subfile / set

  • denotes parts of File exclusively available to users with index in set

  • represents the dimension of the vector

  • denotes the maximum value of the vector

III Setting

Our ICP consists of files , each of size 1 unit at the central server and a set of users . Each user knows a subset of files , and we call it the Known-set or side-information of . Each user wants another subset of files , and we call it the Want-set of . The remaining subset of files , i.e., the files which are neither requested nor available are called the interference-set of User . An ICP is said to be a unicast index coding problem (UICP) if none of the users want the same file, i.e., , and an ICP is said to be a single unicast index coding problem (SUICP) if every user wants only one file and it is a UICP, i.e., , and , .
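The definitions above can be made concrete with a minimal sketch. It assumes that files and users are labeled by integers and that want-sets and known-sets are given as Python sets; the function names and the four-user toy instance are illustrative and do not come from the paper.

```python
from itertools import combinations

def interference_set(files, want, known):
    """Files that a user neither requests nor holds as side-information."""
    return set(files) - set(want) - set(known)

def is_uicp(want_sets):
    """Unicast ICP: no file is wanted by two different users."""
    return all(a.isdisjoint(b) for a, b in combinations(want_sets, 2))

def is_suicp(want_sets):
    """Single unicast ICP: a UICP in which every user wants exactly one file."""
    return is_uicp(want_sets) and all(len(w) == 1 for w in want_sets)

# Hypothetical toy instance: user k wants file k and knows file k + 1 (mod 4).
files = {0, 1, 2, 3}
want_sets = [{0}, {1}, {2}, {3}]
known_sets = [{1}, {2}, {3}, {0}]
interference = [interference_set(files, w, k)
                for w, k in zip(want_sets, known_sets)]
```

In this toy instance every user has exactly two interference files, and the instance is both a UICP and an SUICP.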

We refer to an ICP as the ICP for some non-negative integers and a natural number such that

(1)

if , User

  • wants , i.e., and

  • knows

    i.e., the side-information is available in chunks, each of which is a collection of consecutive elements and the separation/gap between the consecutive chunks is determined by . More explicitly, the side information of User is given by

We also represent the ICP as ICP, where is a vector of the dimension and is given by . Note that the ICP is an SUICP and . In an ICP, if we arrange the files circularly in clock-wise direction, then User wants File and if we go on from in clock-wise direction, first we see interference files for User , then side-information files, then interference files, then side-information files and so on. It ends with interference files for User . See Figure 1 for an illustration.

Figure 1: An illustration of the ICP. It has , , , , and the number of users . Here, we highlight User 3’s requested file with a thick circle and red color font. Among the other files, the shaded circles with blue color fonts represent the side-information files of user 3 which are available at User 3, and the empty circles denote the interference files of User 3 which are not available at User 3. If we go on from in clock-wise direction, we first see interference files , then side-information files , then interference file , then side-information files and lastly interference files of User 3. Observe that the side-information files are available in groups/chunks of size .
Example 1

A ICP contains nine users such that
, , , , , , , . Figure 1 shows the ICP.

In this paper, we focus on a union of multiple SUICPs which is defined as follows: Let there be users such that User wants and knows in the first instance and wants and knows in the second instance, then the union of these two instances is an ICP with

  • want set and

  • known set .

We assume that there are no common files involved across the instances.
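As a minimal sketch of this union operation, the snippet below merges per-user want-sets and known-sets across instances and rejects instances that share files. The data layout (a list of `(want, known)` pairs per instance) and the file labels `("a", i)` and `("b", i)` are assumptions made only for illustration.

```python
def union_of_instances(instances):
    """Union of ICP instances defined over the same users.

    Each instance is a list with one (want_set, known_set) pair per user.
    """
    num_users = len(instances[0])
    if any(len(inst) != num_users for inst in instances):
        raise ValueError("all instances must have the same number of users")
    seen = set()
    for inst in instances:
        files = set().union(*(want | known for want, known in inst))
        if not seen.isdisjoint(files):
            raise ValueError("instances must not share files")
        seen |= files
    return [
        (set().union(*(inst[u][0] for inst in instances)),
         set().union(*(inst[u][1] for inst in instances)))
        for u in range(num_users)
    ]

# Two hypothetical 3-user SUICPs over disjoint file sets "a" and "b".
first = [({("a", u)}, {("a", (u + 1) % 3)}) for u in range(3)]
second = [({("b", u)}, {("b", (u + 2) % 3)}) for u in range(3)]
combined = union_of_instances([first, second])
```

Each user in `combined` wants two files and knows two files, so the result is a UICP rather than an SUICP.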

In general, we can add any number of SUICPs. In this paper, we focus on a UICP formed by the union of special SUICPs. We use the notation to represent the file requested by User .

We refer to an ICP as the ICP, if it is a union of ICP, ICP, and so on till ICP, i.e., an ICP is formed by the union of ICP and its clockwise rotations. We also represent the ICP as ICP, where is a vector of the dimension and is given by . In an ICP, is a non-negative integer , is a natural number and they satisfy (1). In the ICP,

  • want set , and

  • known set

    (2)

    i.e.,

    where is the side-information structure of ICP, which is formed by the clock-wise rotations of ICP and the side-information structure is given by

Example 2

A ICP is the union of ICP, ICP and ICP. Note that ICP has , , , , and the number of users users, each user wants files and has files as side-information. In the ICP, ,

  • want set , and

  • known set

    (3)

    where .

Note that in an ICP, each user wants files and has files as side-information. Since ICP, ICP, …, ICP all are equivalent, we assume without loss of generality (WLOG) .

An ICP can be represented by a table such that the row and column contains the file requested by User . We represent a general ICP in Table 1 and the ICP in Table 2. We denote the row and column’s entry as Node . In these tables, row represents User ’s requested files. In Table 2, we highlight User 3’s requested files with red color, bold fonts and its side-information files with blue color fonts and shaded nodes. A Node is said to be a side-information node of a Node , if User has File in its side-information. For example, the shaded nodes in Table 2 are side-information nodes of Node .

Table 1: ICP.
Table 2: ICP.

Our aim is to find the optimal broadcast rate of an ICP.

IV Preliminaries

There are several upper and lower bounds [5, 3, 4, 2] available for an SUICP. We use local chromatic number and maximum acyclic induced sub-graph based bounds in our proofs and they are described below.

In an SUICP, User is said to be interfering with User , if User is requesting a file from , i.e., User ’s requested file is not available at User . The closed anti-outneighborhood of User is defined as the set containing User itself and all its interfering users and is denoted by . A proper coloring scheme assigns a color to each user such that no user shares its color with any of its interfering users. The local chromatic number of an ICP is defined as the maximum number of different colors that appear in any user’s closed anti-outneighborhood, minimized over all proper coloring schemes and is denoted by . For an SUICP with users, let for some positive integer be a proper coloring scheme with colors and let denote the set of different colors assigned to the closed anti-outneighborhood of User , then

Lemma 1

For a given SUICP, let denote its local chromatic number and denote its optimal broadcast rate. Then

Lemma 1 gives an upper bound on the optimal broadcast rate of an SUICP and it follows from [3, Theorem 1].
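The quantities used in Lemma 1 can be illustrated with a small sketch. It checks whether a coloring is proper and counts the largest number of colors seen in any closed anti-outneighborhood. For a fixed proper coloring this count is only an upper bound on the local chromatic number, since the definition minimizes over all proper colorings. The five-user instance (User k wants File k and knows Files k+1 and k+2 modulo 5) is a hypothetical example, not one taken from the paper.

```python
def closed_anti_outneighborhood(k, wants, known):
    """User k together with every user whose requested file User k lacks."""
    return {k} | {j for j in range(len(wants))
                  if j != k and wants[j] not in known[k]}

def is_proper(coloring, wants, known):
    """No user shares its color with any of its interfering users."""
    for k in range(len(wants)):
        for j in closed_anti_outneighborhood(k, wants, known) - {k}:
            if coloring[j] == coloring[k]:
                return False
    return True

def max_local_colors(coloring, wants, known):
    """Largest number of distinct colors in any closed anti-outneighborhood."""
    return max(
        len({coloring[j] for j in closed_anti_outneighborhood(k, wants, known)})
        for k in range(len(wants))
    )

# Hypothetical SUICP: 5 users, User k wants File k, knows Files k+1, k+2 (mod 5).
n = 5
wants = list(range(n))
known = [{(k + 1) % n, (k + 2) % n} for k in range(n)]
coloring = [0, 1, 2, 3, 4]  # one color per user, trivially proper
bound = max_local_colors(coloring, wants, known)
```

Here the coloring uses five colors, yet each user sees only three of them among itself and its interfering users, so the bound from Lemma 1 is 3 rather than 5.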

An SUICP with users can equivalently be represented by a side information graph with nodes such that each node represents a unique user and there exists an edge from Node to Node if User ’s requested file belongs to Known-set of User .

Lemma 2

For a given SUICP with side-information graph , the optimal broadcast rate is greater than or equal to , where is the size of the maximum acyclic induced sub-graph of .

Lemma 2 gives a lower bound on the optimal broadcast rate of an SUICP and it follows from [2, Theorem 3].
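A brute-force sketch of the MAIS lower bound is given below; it is practical only for very small instances. The edge orientation (an edge from Node j to Node k when User j knows the file requested by User k) is an assumption, but reversing every edge does not change which induced sub-graphs are acyclic, so the resulting bound is the same under either convention. The five-user instance is a hypothetical example.

```python
from itertools import combinations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge from inside the set."""
    nodes = set(nodes)
    while nodes:
        sources = {v for v in nodes
                   if not any((u, v) in edges for u in nodes)}
        if not sources:
            return False  # every remaining node has an in-edge: a cycle exists
        nodes -= sources
    return True

def mais_size(wants, known):
    """Size of a maximum acyclic induced sub-graph, by exhaustive search."""
    n = len(wants)
    edges = {(j, k) for j in range(n) for k in range(n)
             if j != k and wants[k] in known[j]}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if is_acyclic(subset, edges):
                return size
    return 0

# Hypothetical SUICP: 5 users, User k wants File k, knows Files k+1, k+2 (mod 5).
n = 5
wants = list(range(n))
known = [{(k + 1) % n, (k + 2) % n} for k in range(n)]
lower_bound = mais_size(wants, known)
```

For this instance the search returns 3, so by Lemma 2 the optimal broadcast rate is at least 3 units.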

V ICP main results

In this section, we discuss our main results related to the ICP. The following theorem gives an upper bound on the optimal broadcast rate of the ICP.

Theorem 1

Consider an ICP with users. Let be the optimal broadcast rate of the ICP, then

As mentioned in Section IV, the local chromatic number gives an upper bound on the SUICP. So, we convert a UICP with users into an SUICP with virtual users such that each user in the UICP maps into virtual users in the SUICP, and each virtual user requests a distinct file of the original user’s requested files. The side-information at the virtual user is the same as its corresponding real user. We find an upper bound on the local chromatic number by assigning a proper coloring scheme to the virtual users. This upper bound works as an upper bound for the broadcast rate of the original problem. One naive coloring scheme is to assign a unique color to each virtual user. Since is the total number of users, and is the number of side-information users of a virtual user, the naive coloring scheme gives an upper bound of . In this paper, we propose a better coloring scheme, which gives an upper bound shown in Theorem 1. The details are given in Section VIII-A. Here, we discuss the coloring scheme and upper bound for the ICP mentioned in Example 2.

In the ICP, there are users, each one requesting files, and has files as side-information. The ICP is shown in the tabular form in Table 2. Note that the ICP is a UICP with eight users, and each user requesting three distinct files. We convert this UICP into an SUICP with 24 virtual users, each one requesting a distinct file. In particular, each user in the UICP is mapped to 3 virtual users in the SUICP, such that each virtual user requests a distinct file of the original user’s requested files. The side-information at the virtual user is the same as its corresponding original user. In Table 2, recall that we use Node to represent the file requested by User . Now, we call Node as virtual user of User . The side-information of Node is the same as side-information of User . It is easy to show that the optimal broadcast rates of the two ICPs (the virtual SUICP and the original UICP) are equal. We give an upper bound for the SUICP, and it also works as an upper bound for the original UICP.
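The conversion from a UICP to a virtual SUICP can be sketched as follows. The file labels `(user, column)` and the placeholder known-sets are assumptions chosen only to make the snippet runnable; they do not reproduce the side-information structure of the paper's example.

```python
def to_virtual_suicp(want_sets, known_sets):
    """Split each user into one virtual user per requested file.

    Every virtual user inherits the full side-information of its real user.
    """
    virtual_users = []
    for user, (want, known) in enumerate(zip(want_sets, known_sets)):
        for requested in sorted(want):
            virtual_users.append(
                {"user": user, "wants": requested, "known": set(known)}
            )
    return virtual_users

# Eight users, three requested files each, labeled (user, column).
want_sets = [{(k, j) for j in range(3)} for k in range(8)]
# Placeholder side-information: User k knows the files wanted by User k + 1.
known_sets = [{((k + 1) % 8, j) for j in range(3)} for k in range(8)]
virtual_users = to_virtual_suicp(want_sets, known_sets)
```

The result has 24 virtual users, one per requested file, matching the count stated in the text.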

Recall from Section IV that for an SUICP, a proper coloring scheme assigns a color to each user such that no user shares its color with any of its interfering users. We take colors and assign Color to nodes , i.e., to Node () in Column 1, to Node () in Column 2, to Node () in Column 3. The coloring scheme is shown in the tabular form in Table 3. Note that every color occurs in a column exactly once.

User   Column 1   Column 2   Column 3
1      1          7          4
2      2          8          5
3      3          1          6
4      4          2          7
5      5          3          8
6      6          4          1
7      7          5          2
8      8          6          3

Sub-tables (i) to (viii) contain these same colors and differ only in which row is highlighted.
Table 3: Coloring scheme for the ICP. The number in the cell indicates the color assigned to the cell. In Table (), we discuss the Row cells, and their colors are highlighted with the red color bold fonts, their side-information cells are highlighted with the shaded cells and blue color fonts, normal cells represent their interference nodes. Observe that any red-colored bold font color always occurs at the shaded cell in the other columns. Hence it is a proper coloring scheme. The number of distinct colors in the non-shaded cells is 7 (local chromatic number) in every table.
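The color pattern of Table 3 can be regenerated with a short sketch. The cyclic shifts `(0, 2, 5)` are read off the table entries (Color 1 appears in Row 1 of Column 1, Row 3 of Column 2 and Row 6 of Column 3); how they relate to the ICP parameters is not reconstructed here, and the function name is illustrative.

```python
def coloring_table(num_users, shifts):
    """Row j, column c holds the color k with j = k + shifts[c] (mod num_users)."""
    return [
        [(row - shift - 1) % num_users + 1 for shift in shifts]
        for row in range(1, num_users + 1)
    ]

table = coloring_table(8, shifts=(0, 2, 5))
flattened = [color for row in table for color in row]
columns = [sorted(column) for column in zip(*table)]
```

Each column is a cyclic shift of the colors 1 to 8, which is why every color occurs exactly once per column.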

For Node , we assign Color . We also assign Color in Column 2 to the node , which is equal to node and we can verify that it belongs to by substituting in (2). We assign Color in Column 3 to the node , which is equal to node and we can verify that it belongs to by substituting in (2). Hence, we can conclude that the colors assigned to Column 1 nodes are only assigned to side-information nodes in the other columns. Similar arguments can be used to check that for any node (), the color assigned to it is only shared with its side-information nodes. Table 3 illustrates it row by row. Therefore, this coloring scheme ensures that none of the nodes share its color with its interfering nodes and hence is a proper coloring scheme.

In Table 3 (i), for User 1, according to the definition of closed anti-outneighborhood in Section IV, the colors in the non-shaded cells contribute to the local chromatic number. There are four such colors in Column 1 (colors 1, 2, 3 and 6), one extra color (Color 7) is added in Column 2, and finally two more colors (colors 4 and 5) are added in Column 3. Hence, the closed anti-outneighborhood of User 1 contains 7 colors. By symmetry, this property holds for the other users as well, and it can be verified easily using Table 3. Hence, the local chromatic number of our ICP is less than or equal to 7. From Lemma 1, the optimal broadcast rate is therefore at most 7 units.

We get as an upper bound for the ICP. Now, we discuss the ICP where we get as upper bound.

In the ICP, there are users, each one requesting files, and has files as side-information. In the ICP, ,

  • want set , and

  • known set

    where .

The ICP is shown in Table 4. We take colors and assign Color to nodes . The coloring scheme is shown in the tabular form in Table 5. In Table 5, we highlight User 1’s color with red-colored bold fonts and side-information nodes of User 1 with shaded cells. Note that the color assigned to User 1 is repeated in other columns at shaded cells only, i.e., User 1’s nodes do not share their colors with interference nodes. The same can be verified for the other users. Hence, this coloring scheme is a proper coloring scheme.

Table 4: ICP.
User   Column 1   Column 2   Column 3
1      1          9          5
2      2          10         6
3      3          11         7
4      4          1          8
5      5          2          9
6      6          3          10
7      7          4          11
8      8          5          1
9      9          6          2
10     10         7          3
11     11         8          4

Table 5: Coloring scheme for Table 4.

Since we are using colors in our scheme, we have from Lemma 1 that .

The following theorem gives a lower bound on the optimal broadcast rate of the ICP and its proof is given in Section VIII-B.

Theorem 2

Consider an ICP with users. Let be the optimal broadcast rate of the ICP. Then,

Remark 1

As a corollary, the optimal broadcast rate for the ICP is units and for the ICP is units. For a general setup, the ratio between the upper and the lower bound is at most 2, which follows easily from Theorems 1 and 2.

VI Multi-access Coded Caching (MACC) Problem

The multi-access coded caching (MACC) problem was proposed in [16]. The MACC problem under uncoded placement was also studied in [17, 19]. In particular, the bounds proposed in [17] are based on the solution of an ICP. We now use the ICP results of Section V to obtain tighter bounds than those in [17].

The MACC setup in [17] consists of a central server with files , each of size 1 unit. There are caches, each of size units and users, each of which has access to consecutive caches with a cyclic wrap-around, as shown in Figure 2. Each user requests one file, which has to be served using the content stored in its caches and the central server’s broadcast message. The system is referred to as ()CCDN. Our aim is to find the central server’s optimal transmission rate () for a given memory . The system operates in two phases. The first one is the placement phase, in which we store the files according to some policy. The second phase is the delivery phase, which happens after users reveal their requests. In the delivery phase, we serve the user requests using a central server broadcast message and the caches’ content.

Figure 2: An illustration of the ()CCDN. It consists of files, caches of units memory and users, each one connected to consecutive caches.

At the extreme memory points units and units, we achieve transmission rates of units and units, respectively. Details are given in [17]. Now, we discuss the uncoded placement policy and delivery policy proposed in [17] for the memory points , where . Note that we assume as for some .

VI-A Uncoded placement policy:

Let be the collection of subsets of , with the following constraints:

  1. ,

  2. if , every two different elements of satisfy and .

Mathematically,

First, we divide each file into equal parts and assign one subfile to one subset . Then, we store the subfile assigned to the set , in all the caches, whose index belongs to . Note that according to our placement policy, each cache stores every file’s parts, each of size units. Hence, our placement policy satisfies the memory constraint.

VI-B Delivery policy:

The delivery phase happens after users reveal their requests. First, we form an instance of the ICP. Then, the central server transmits messages based on the solution of the ICP.

[17] uses a naive coloring scheme, where every virtual user/node in the ICP is assigned with a different color. This naive coloring scheme [17, Theorem 1] gives an upper bound on the general ()CCDN data transmission rate as

(4)

In this paper, using the ICP results mentioned in Section V, we get a tighter upper bound than in (4). In particular, we first split our multi-access ICP into several ICPs of the form ICP. Then, we use Theorem 1 to get an upper bound on the data transmission rate of each ICP. Finally, the data transmission rate of the multi-access ICP is upper bounded by the sum of the upper bounds of the individual ICPs. The details are given in Section VIII-C.

Now, we discuss the upper bound given in Theorem 3. A vector of dimension is said to be a weak composition of [20] if its components are non-negative and their sum is . Let be the collection of all weak compositions of and denotes the maximum component in the vector . Let at , be defined as

(5)
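Weak compositions, which index the terms entering (5), can be enumerated with a short sketch. The function name and the values `total = 4`, `parts = 3` are illustrative and are not parameters taken from the paper; the sketch assumes `parts` is at least 1.

```python
from math import comb

def weak_compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weak_compositions(total - first, parts - 1):
            yield (first,) + rest

compositions = list(weak_compositions(4, 3))
max_components = {c: max(c) for c in compositions}
```

The number of weak compositions of `total` into `parts` components is the binomial coefficient C(total + parts - 1, parts - 1), which gives a quick sanity check on the enumeration.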

The following theorem gives an upper bound on the transmission rate of ()CCDN at memory point , .

Theorem 3

For an ()CCDN, let be the optimal transmission rate at cache size and be defined as in (5). Then, at memory point , ,

The upper bound on the ()CCDN transmission rate in [16] at memory point , is given by , where

(6)

The following corollary compares our upper bound given in (5) with the upper bounds in [17] and [16].

Corollary 4

Let , and at , be given by equations (5), (6) and (4), respectively. Then

and

We first illustrate Theorem 3 with an example. The proof for the general case is given in Section VIII-C.

Example 3

Consider an example ()CCDN at memory point .

Placement phase: . Therefore, . Divide each file into subfiles, each of size 1/20 units, and assign one subfile to each subset. The subfile assigned to the subset will be stored in the caches and and will be available to the users , and . Therefore, we represent the subfile of File stored in the caches and as .

Delivery phase: Let the request pattern be , i.e., User is requesting File , . Out of the 20 subfiles of File , 10 subfiles stored in the caches and are available to User . Therefore, User needs the remaining 10 subfiles, and total 80 subfiles are needed across the 8 users.
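The counts in this example (20 subfiles per file, 10 available to each user, 80 needed in total) can be reproduced with a small sketch. The parameter values below are assumptions inferred from those counts, since the symbols are not legible here: 8 caches and 8 users, each user accessing 2 consecutive caches (User k is assumed to access caches k and k+1 with wrap-around), and placement subsets made of 2 caches at cyclic distance at least 2.

```python
from itertools import combinations

# Assumed parameters, inferred from the counts quoted in the example.
NUM_CACHES = 8    # also the number of users
ACCESS = 2        # consecutive caches seen by each user
SUBSET_SIZE = 2   # caches storing each subfile

def cyclic_distance(a, b, modulus):
    d = abs(a - b) % modulus
    return min(d, modulus - d)

# Placement subsets: caches pairwise at cyclic distance >= ACCESS.
subsets = [
    s for s in combinations(range(1, NUM_CACHES + 1), SUBSET_SIZE)
    if all(cyclic_distance(a, b, NUM_CACHES) >= ACCESS
           for a, b in combinations(s, 2))
]

def caches_of_user(k):
    """Caches k, k + 1, ..., k + ACCESS - 1 with cyclic wrap-around."""
    return {(k - 1 + i) % NUM_CACHES + 1 for i in range(ACCESS)}

available = {
    k: [s for s in subsets if caches_of_user(k) & set(s)]
    for k in range(1, NUM_CACHES + 1)
}
needed = {k: len(subsets) - len(available[k]) for k in available}
total_needed = sum(needed.values())
```

Under these assumptions each cache holds 5 of the 20 subfiles of every file, no subfile is stored in two caches seen by the same user, and the total number of missing subfiles matches the 80 virtual users used in the ICP instance.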

We map the problem here to an instance of ICP with 80 virtual users/nodes such that each one requests a distinct subfile. The side-information at a virtual user is the same as the subfiles available to the real user requesting the corresponding subfile. To understand the structural properties of the ICP, we form a table such that

  • each cell represents a virtual user,

  • row represents User ’s required subfiles,

  • if a column’s element is then for all , its element is .

Column 1 Column 2 Column 3 Column 4 Column 5 Column 6 Column 7 Column 8 Column 9 Column 10