Multi-version Coding with Side Information
In applications of storage systems to modern key-value stores, the stored data is highly dynamic due to frequent updates from the system write clients. The multi-version coding problem has been formulated to study the cost of storing dynamic data in asynchronous distributed storage systems. In this problem, previous work considered a completely decentralized system where a server is not aware of which versions of the data are received by the other servers. In this paper, we relax this assumption and study a system where a server may acquire side information of the versions propagated to some other servers. In particular, we study a storage system with n servers that store ν totally ordered independent versions of a message. Each server receives a subset of these ν versions that defines the state of that server. Assuming that the servers are distributed in a ring, a server is aware of which versions have been received by its h-hop neighbors. If the server is aware of the states of (n-2) other servers, we show that this side information can result in a better storage cost as compared with the case where there is no side information. Through an information-theoretic converse, we identify scenarios where, even if the server is aware of the states of (n-3)/2 other servers, the side information may not help in improving the worst-case storage cost beyond the case where servers have no side information.
READ FULL TEXT