Locality Sensitive Hashing-based Sequence Alignment Using Deep Bidirectional LSTM Models

04/05/2020
by   Neda Tavakoli, et al.

Bidirectional Long Short-Term Memory (LSTM) is a Recurrent Neural Network (RNN) architecture designed to model sequences and their long-range dependencies more precisely than conventional RNNs. This paper proposes using a deep bidirectional LSTM for sequence modeling as an approach to locality-sensitive hashing (LSH)-based sequence alignment. In particular, we use the deep bidirectional LSTM to learn the features of an LSH. The resulting LSH can then be used to perform sequence alignment. We demonstrate the feasibility of modeling sequences with the proposed LSTM-based model by aligning short read queries against the reference genome. We use the human reference genome as our training dataset, together with a set of short reads generated using Illumina sequencing technology. The ultimate goal is to align query sequences to a reference genome. We first decompose the reference genome into multiple sequences. These sequences are fed into the bidirectional LSTM model and mapped to fixed-length vectors. These vectors are what we call the trained LSH, which can then be used for sequence alignment. The case study shows that, with the introduced LSTM-based model, accuracy improves as the number of training epochs increases.
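The index-and-lookup pipeline in the abstract (decompose the reference, map each piece to a fixed-length vector, align a query by comparing vectors) can be sketched in a few lines. This is only an illustration under stated assumptions: the trained bidirectional LSTM is replaced here by a classical random-hyperplane sign hash, used purely as a stand-in encoder, and the function names (`decompose`, `make_encoder`, `build_index`, `align`), the window length, and the code length are hypothetical choices that do not come from the paper.

```python
import random

BASES = "ACGT"

def decompose(reference, window, stride=1):
    """Split the reference into overlapping fixed-length windows with start offsets."""
    return [(i, reference[i:i + window])
            for i in range(0, len(reference) - window + 1, stride)]

def one_hot(seq):
    """Flatten a DNA string into a one-hot vector of length 4 * len(seq)."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def make_encoder(window, n_bits, seed=0):
    """Stand-in for the learned model (assumption): random-hyperplane sign hash.

    Returns a function mapping a length-`window` sequence to a fixed-length
    binary code of `n_bits` bits.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(4 * window)]
              for _ in range(n_bits)]

    def encode(seq):
        x = one_hot(seq)
        return tuple(
            1 if sum(p_i * x_i for p_i, x_i in zip(plane, x)) >= 0 else 0
            for plane in planes
        )

    return encode

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def build_index(reference, window, encode):
    """Encode every reference window once; returns a list of (offset, code)."""
    return [(pos, encode(seq)) for pos, seq in decompose(reference, window)]

def align(query, index, encode):
    """Return the reference offset whose code is closest to the query's code."""
    code = encode(query)
    return min(index, key=lambda entry: hamming(code, entry[1]))[0]

# Toy usage with a synthetic reference (not real genome data).
rng = random.Random(42)
reference = "".join(rng.choice(BASES) for _ in range(200))
encode = make_encoder(window=20, n_bits=64, seed=7)
index = build_index(reference, 20, encode)
read = reference[37:57]  # an error-free read drawn from the reference
print("best offset:", align(read, index, encode))
```

In the approach the abstract describes, `encode` would instead be the trained deep bidirectional LSTM, so that the fixed-length vectors are learned from the reference rather than drawn at random; the linear scan in `align` is likewise a simplification chosen for brevity.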


