AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation
Sequence alignments are fundamental to bioinformatics, which has led to a variety of optimized implementations. Unfortunately, the vast majority of them are hand-tuned and specific to certain architectures and execution models. This makes them not only challenging to understand and extend, but also difficult to port to other platforms. We present AnySeq, a novel library for computing different types of pairwise alignments of DNA sequences. Our approach combines high performance with an intuitively understandable implementation, which is achieved through the concept of partial evaluation. Using the AnyDSL compiler framework, AnySeq enables the compilation of algorithmic variants that are highly optimized for specific usage scenarios and hardware targets from a single, uniform codebase. The resulting domain-specific library thus allows alignment parameters (such as alignment type, scoring scheme, and traceback vs. plain score) to be varied by simple function composition rather than by metaprogramming techniques, which are often hard to understand. Our implementation supports multithreading and SIMD vectorization on CPUs, CUDA-enabled GPUs, and FPGAs. AnySeq is at most 7% slower and up to 12% faster than state-of-the-art manually optimized alignment libraries on CPUs (SeqAn) and on GPUs (NVBio).
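To illustrate what varying alignment parameters by function composition can look like, here is a minimal Python sketch. It is not AnySeq's API and does not use AnyDSL: the names `linear_scoring` and `make_aligner` are hypothetical, only linear gap penalties and plain scores (no traceback) are covered, and a Python closure merely stands in for the specialization that a partial evaluator would perform at compile time.

```python
from typing import Callable

# A scoring scheme maps a pair of characters to a substitution score.
Scoring = Callable[[str, str], int]


def linear_scoring(match: int, mismatch: int) -> Scoring:
    """Hypothetical helper: build a simple match/mismatch scoring scheme."""
    return lambda a, b: match if a == b else mismatch


def make_aligner(subst: Scoring, gap: int, local: bool) -> Callable[[str, str], int]:
    """Compose a scoring scheme and an alignment type into a score function.

    local=False gives a global (Needleman-Wunsch style) score,
    local=True gives a local (Smith-Waterman style) score.
    """

    def align(query: str, subject: str) -> int:
        # Global alignment penalizes leading gaps; local alignment starts at 0.
        prev = [0 if local else j * gap for j in range(len(subject) + 1)]
        best = 0
        for i in range(1, len(query) + 1):
            curr = [0 if local else i * gap]
            for j in range(1, len(subject) + 1):
                score = max(
                    prev[j - 1] + subst(query[i - 1], subject[j - 1]),  # (mis)match
                    prev[j] + gap,       # gap in subject
                    curr[j - 1] + gap,   # gap in query
                )
                if local:
                    score = max(score, 0)    # local alignments may restart
                    best = max(best, score)  # best cell anywhere in the matrix
                curr.append(score)
            prev = curr
        return best if local else prev[-1]

    return align


# Two variants built from the same generic code by composition alone.
global_align = make_aligner(linear_scoring(2, -1), gap=-2, local=False)
local_align = make_aligner(linear_scoring(2, -1), gap=-2, local=True)
```

In this sketch the alignment type and scoring scheme are ordinary arguments that are fixed once when the aligner is built. In a partially evaluated setting, such fixed arguments are what allow the compiler to generate a specialized variant without the `if local` checks remaining in the inner loop.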