An Exponential Lower Bound on the Sub-Packetization of MSR Codes
An (n,k,ℓ)-vector MDS code is an F-linear subspace of (F^ℓ)^n (for some field F) of dimension kℓ, such that any k (vector) symbols of the codeword suffice to determine the remaining r=n-k (vector) symbols. The length ℓ of each codeword symbol is called the sub-packetization of the code. Such a code is called minimum storage regenerating (MSR) if any single symbol of a codeword can be recovered by downloading ℓ/r field elements (which is known to be the least possible) from each of the other symbols. MSR codes are attractive for use in distributed storage systems, and by now a variety of ingenious constructions of MSR codes are available. However, they all suffer from exponentially large sub-packetization ℓ ≳ r^{k/r}. Our main result is an almost tight lower bound showing that for an MSR code, one must have ℓ > exp(Ω(k/r)). Previously, a lower bound of ≈ exp(√(k/r)), and a tight lower bound for a restricted class of "optimal access" MSR codes, were known. Our work settles a central open question concerning MSR codes that has received much attention. Further, our proof is really short, hinging on one key definition that is somewhat inspired by Galois theory.
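
To make the repair condition concrete, the following minimal sketch compares the number of field elements downloaded when repairing a single symbol at the MSR rate (ℓ/r elements from each of the n-1 other symbols) with the naive approach of downloading k whole symbols and decoding. The function name repair_download and the parameters n=14, k=10, ℓ=256 are illustrative assumptions, not taken from the text, and the sketch assumes r divides ℓ.

```python
from fractions import Fraction


def repair_download(n: int, k: int, ell: int) -> tuple[int, int]:
    """Field elements downloaded to repair one symbol of an (n, k, ell) vector MDS code.

    Returns (msr, naive). Hypothetical helper for illustration only.
    """
    r = n - k
    if r <= 0 or ell % r != 0:
        raise ValueError("need n > k and ell divisible by r = n - k")
    msr = (n - 1) * (ell // r)  # ell/r elements from each of the n-1 other symbols
    naive = k * ell             # download k whole symbols, then decode
    return msr, naive


# Illustrative parameters (assumed, not from the text): r = 4, ell = 4^4.
msr, naive = repair_download(n=14, k=10, ell=256)
print(msr, naive, Fraction(msr, naive))  # 832 2560 13/40
```

In this example the MSR repair downloads 13/40 of what the naive repair does; the saving is what makes the size of ℓ needed to achieve it the quantity of interest.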