Error Probability Bounds for Coded-Index DNA Storage
The DNA storage channel is considered, in which a codeword consists of M unordered DNA molecules. At reading time, N molecules are sampled with replacement, and then each molecule is sequenced. A coded-index concatenated-coding scheme is considered, in which the mth molecule of the codeword is restricted to a subset of all possible molecules (an inner code), and this subset is unique for each m. The decoder has low complexity: it first decodes each molecule separately (the inner code), and then decodes the sequence of molecules (an outer code). Only mild assumptions are made on the sequencing channel, in the form of the existence of an inner code and decoder with vanishing error. The error probabilities of a random code and of an expurgated code are analyzed and shown to decay exponentially with N. This establishes the importance of increasing the coverage depth N/M in order to obtain low error probability.
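As a rough illustration of why the coverage depth N/M matters, the sketch below simulates only the sampling stage described above: N draws with replacement from M molecules. It estimates the fraction of molecules that are never sampled and compares it with the closed form (1 - 1/M)^N, which is approximately e^{-N/M}. The function names and parameters are hypothetical, uniform sampling is an assumption, and this is not the paper's error-probability analysis, which also accounts for sequencing errors and the inner and outer codes.

```python
import random


def fraction_unsampled(M, N, trials=200, seed=0):
    """Monte Carlo estimate of the fraction of the M molecules that are
    never drawn when N molecules are sampled uniformly with replacement.
    (Uniform sampling is an assumption made for this illustration.)"""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Indices of the distinct molecules seen in this read-out.
        seen = {rng.randrange(M) for _ in range(N)}
        total += (M - len(seen)) / M
    return total / trials


def expected_fraction_unsampled(M, N):
    """Exact expectation: each molecule is missed by all N independent
    draws with probability (1 - 1/M)**N, roughly exp(-N/M)."""
    return (1 - 1 / M) ** N


if __name__ == "__main__":
    M = 100
    for depth in (1, 3, 5):  # coverage depth N/M
        N = depth * M
        print(depth, fraction_unsampled(M, N), expected_fraction_unsampled(M, N))
```

Under these assumptions the fraction of missing molecules falls off exponentially in N for fixed M, which is consistent with the abstract's emphasis on coverage depth, although an erased molecule is only one of the error events the outer code must handle.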