Smooth q-Gram, and Its Applications to Detection of Overlaps among Long, Error-Prone Sequencing Reads

02/04/2018
by Haoyu Zhang, et al.

We propose smooth q-gram, the first variant of q-gram that captures pairs of q-grams within a small edit distance of each other. We apply smooth q-gram to the problem of detecting overlapping pairs of error-prone reads produced by single-molecule real-time (SMRT) sequencing, which is the first and most critical step in the de novo fragment assembly of SMRT reads. We have implemented our algorithm and tested it on a set of real-world benchmarks. Our empirical results show that our algorithm is significantly more accurate than existing q-gram-based algorithms.
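
To illustrate why q-grams within a small edit distance matter, here is a minimal Python sketch of ordinary (exact) q-gram matching. It is not the smooth q-gram construction, which this abstract does not describe. The function names and the toy reads are assumptions made for illustration only.

```python
def qgrams(s, q):
    """Return the set of all length-q substrings (q-grams) of s."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def shared_qgrams(a, b, q):
    """Return the q-grams that occur exactly in both a and b."""
    return qgrams(a, q) & qgrams(b, q)


def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


# Hypothetical toy reads: the second differs by one substitution (T -> A).
read_a = "ACGTAC"
read_b = "ACGAAC"

if __name__ == "__main__":
    print(edit_distance(read_a, read_b))
    print(shared_qgrams(read_a, read_b, 4))
    print(shared_qgrams(read_a, read_b, 3))
```

In this toy case the two reads are at edit distance 1, yet they share no exact 4-gram and only one 3-gram (`ACG`). A single sequencing error can therefore remove the exact-match evidence that q-gram-based overlap detection relies on, which is the gap that matching q-grams within a small edit distance aims to close.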


