Cover Your Bases: How to Minimize the Sequencing Coverage in DNA Storage Systems
Although the cost of DNA sequencing has been decreasing rapidly, it currently stands at roughly $1.3K/TB, which is dramatically more expensive than reading from existing archival storage solutions. In this work, we aim to reduce not only the cost but also the latency of DNA storage by studying the DNA coverage depth problem, which asks how to reduce the number of reads required to retrieve information from the storage system. Under this framework, our main goal is to understand how to optimally pair an error-correcting code with a given retrieval algorithm to minimize the sequencing coverage depth, while guaranteeing retrieval of the information with high probability. Additionally, we study the DNA coverage depth problem under the random-access setup.
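
To make the quantity being minimized concrete, here is a minimal sketch under simplifying assumptions that the abstract does not state: each read returns one of n stored strands uniformly at random, reads are error-free, and the information can be decoded as soon as any k distinct strands have been seen (as an MDS code would allow). Under these assumptions the expected number of reads is the classical coupon-collector sum n/n + n/(n-1) + ... + n/(n-k+1). The function names `expected_reads` and `simulate_reads` are hypothetical and chosen only for illustration.

```python
import random
from fractions import Fraction


def expected_reads(n: int, k: int) -> float:
    """Expected number of uniform reads needed to see k distinct strands out of n.

    Assumption (not from the abstract): error-free reads drawn uniformly
    with replacement, and any k distinct strands suffice for decoding.
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    # After i distinct strands are seen, a new one appears with
    # probability (n - i) / n, so the wait is n / (n - i) reads on average.
    return float(sum(Fraction(n, n - i) for i in range(k)))


def simulate_reads(n: int, k: int, rng: random.Random) -> int:
    """Draw uniform reads until k distinct strands are seen; return the count."""
    seen = set()
    reads = 0
    while len(seen) < k:
        seen.add(rng.randrange(n))
        reads += 1
    return reads


if __name__ == "__main__":
    rng = random.Random(0)
    k = 100
    for n in (100, 120, 150):  # n = k means no redundancy
        trials = [simulate_reads(n, k, rng) for _ in range(2000)]
        print(n, round(expected_reads(n, k), 1), sum(trials) / len(trials))
```

In this toy model, adding redundancy (n larger than k) lowers the expected number of reads, because the decoder no longer has to wait for the last few unseen strands. It is meant only to illustrate the trade-off between code and coverage depth, not to reproduce the analysis in the paper.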