Efficiently Supporting Hierarchy and Data Updates in DNA Storage

by Puru Sharma, et al.

We propose a novel and flexible DNA-storage architecture that introduces a hierarchy among the objects tagged with the same primer pair and enables efficient data updates. In contrast to prior work, a pair of PCR primers of length 20 in our architecture does not define a single object but an independent storage partition, which is managed internally with its own index structure. We observe that, while the number of mutually compatible primer pairs is limited, the internal address space available to any primer pair (i.e., partition) is virtually unlimited. We expose and leverage the flexibility with which this address space can be managed to provide rich and functional storage semantics, such as hierarchical data organization and efficient, flexible implementations of data updates. Furthermore, to leverage the full power of the prefix-based nature of PCR addressing, we define a methodology for transforming an arbitrary indexing scheme into a PCR-compatible equivalent. This allows us to run PCR with primers that can be variably extended to include a desired part of the index, and thus narrow the scope of the reaction to retrieve a specific object (e.g., a file or directory) within the partition with high precision. Our wetlab evaluation demonstrates the practicality of the proposed ideas and shows a 140x reduction in sequencing cost for the retrieval of smaller objects within the partition.
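
The prefix-based addressing idea described above can be illustrated with a toy model. The sketch below is not the authors' encoding: the fixed-width base-4 index, the function names (`encode_index`, `extended_primer`, `pcr_select`), the example primer and the payload are all hypothetical, and PCR is idealized as exact prefix matching on a single forward primer. It ignores the biochemical constraints that a real PCR-compatible index would have to satisfy.

```python
# Toy model of prefix-based PCR addressing inside one partition.
# All names, sequences and the index encoding are illustrative assumptions.
BASES = "ACGT"


def encode_index(path, digits_per_level=2):
    """Encode a hierarchical path, e.g. (directory, file), as a DNA string.

    Each level becomes one fixed-width base-4 field, so a parent's index
    is always a prefix of the indexes of its children.
    """
    fields = []
    for component in path:
        if not 0 <= component < 4 ** digits_per_level:
            raise ValueError("path component out of range for this width")
        field = ""
        for _ in range(digits_per_level):
            field = BASES[component % 4] + field
            component //= 4
        fields.append(field)
    return "".join(fields)


def extended_primer(primer, path):
    """Extend the partition primer with the index of the target object."""
    return primer + encode_index(path)


def pcr_select(pool, primer):
    """Idealized PCR: keep only strands that start with the primer."""
    return [strand for strand in pool if strand.startswith(primer)]


# Hypothetical 20-nt partition primer and a pool of five tiny "files".
PRIMER = "ACGTACGTACGTACGTACGT"
paths = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 3)]
pool = [PRIMER + encode_index(p) + "GGCC" for p in paths]

whole_partition = pcr_select(pool, PRIMER)                      # 5 strands
directory_1 = pcr_select(pool, extended_primer(PRIMER, (1,)))   # 3 strands
file_1_3 = pcr_select(pool, extended_primer(PRIMER, (1, 3)))    # 1 strand
```

In this toy setting, a longer primer extension retrieves a smaller subset of the partition, which is the intuition behind reducing the amount of sequencing needed to read a single small object.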


