LinearPartition: Linear-Time Approximation of RNA Folding Partition Function and Base Pairing Probabilities

12/31/2019
by He Zhang, et al.

RNA secondary structure prediction is widely used to understand RNA function. Recently, there has been a shift away from classical minimum free energy (MFE) methods toward partition function-based methods, which account for folding ensembles and can therefore estimate structure and base pairing probabilities. However, the classical partition function algorithm scales cubically with sequence length and is therefore slow for long sequences. It is even slower than cubic-time MFE-based methods because of a larger constant factor in its runtime. Inspired by the success of the LinearFold algorithm, which computes the MFE structure in linear time, we address this issue by proposing a similar linear-time heuristic algorithm, LinearPartition, to approximate the partition function and base pairing probabilities. LinearPartition is 256x faster than Vienna RNAfold for a sequence of length 15,780, and 2,771x faster than CONTRAfold for a sequence of length 32,753. Interestingly, although LinearPartition is approximate, it runs in linear time without sacrificing accuracy when base pairing probabilities are used to assemble structures, and it even leads to a small accuracy improvement on longer families (16S and 23S rRNA).
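To make the contrast between the cubic-time algorithm and a linear-time heuristic concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use the thermodynamic energy models behind Vienna RNAfold or CONTRAfold. It assumes a toy Nussinov-style model in which every allowed base pair contributes a fixed Boltzmann weight `w`, and it assumes the heuristic is a LinearFold-style left-to-right pass with beam pruning. The function names, the default `w` and `min_loop` values, the pruning score, and the choice to always retain the full-prefix and empty spans are illustrative assumptions.

```python
# Toy model (assumption): unpaired bases have weight 1, each allowed pair has weight w.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def can_pair(a, b):
    return a + b in PAIRS


def exact_partition(seq, w=2.0, min_loop=3):
    """Cubic-time partition function over all nested structures of seq."""
    n = len(seq)
    # Q[i][j] is the partition function of seq[i:j]; empty spans have Q = 1.
    Q = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            total = Q[i][j - 1]  # last base (index j-1) is unpaired
            for k in range(i, j - 1):  # last base pairs with base k
                if can_pair(seq[k], seq[j - 1]) and (j - 1) - k > min_loop:
                    total += Q[i][k] * w * Q[k + 1][j - 1]
            Q[i][j] = total
    return Q[0][n]


def beam_partition(seq, beam=None, w=2.0, min_loop=3):
    """Left-to-right pass; keeps roughly `beam` spans per position.

    With beam=None nothing is pruned and the result equals exact_partition.
    """
    n = len(seq)
    # Q[j] maps a start index i to the (approximate) partition function of seq[i:j].
    Q = [dict() for _ in range(n + 1)]
    Q[0][0] = 1.0
    for j in range(1, n + 1):
        cur = Q[j]
        for i, q in Q[j - 1].items():
            # Skip: base j-1 stays unpaired and extends the span seq[i:j-1].
            cur[i] = cur.get(i, 0.0) + q
            # Pop: base i-1 pairs with base j-1, enclosing seq[i:j-1],
            # and the pair is appended to every kept span ending at i-1.
            p = i - 1
            if p >= 0 and can_pair(seq[p], seq[j - 1]) and (j - 1) - p > min_loop:
                for k, left in Q[p].items():
                    cur[k] = cur.get(k, 0.0) + left * w * q
        cur[j] = 1.0  # empty span, so later spans can start at position j
        if beam is not None and len(cur) > beam:
            # Score each span by (prefix partition function) * (span value).
            # Assumption of this sketch: always keep the full prefix (start 0)
            # and the empty span (start j), in addition to the top `beam` spans.
            ranked = sorted(cur, key=lambda i: Q[i].get(0, 0.0) * cur[i], reverse=True)
            keep = set(ranked[:beam]) | {0, j}
            Q[j] = {i: v for i, v in cur.items() if i in keep}
    return Q[n][0]


if __name__ == "__main__":
    # 3.0: the empty structure (weight 1) plus the single G-C pair (weight 2).
    print(exact_partition("GAAAC"))
    seq = "GGGAAACCCUUUGGGAAACCC"
    print(exact_partition(seq), beam_partition(seq), beam_partition(seq, beam=3))
```

In this sketch, pruning can only drop terms from the sum, so the beam result never exceeds the exact value. For a fixed beam size the work per position is bounded independently of sequence length, which is where the linear scaling comes from; how much accuracy survives under a realistic energy model is an empirical question that this toy example does not address.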


research
10/26/2022

LinearCoFold and LinearCoPartition: Linear-Time Algorithms for Secondary Structure Prediction of Interacting RNA molecules

Many ncRNAs function through RNA-RNA interactions. Fast and reliable RNA...
research
06/29/2022

LinearAlifold: Linear-Time Consensus Structure Prediction for RNA Alignments

Predicting the consensus structure of a set of aligned RNA homologs is a...
research
07/18/2023

LinearSankoff: Linear-time Simultaneous Folding and Alignment of RNA Homologs

The classical Sankoff algorithm for the simultaneous folding and alignme...
research
12/22/2019

LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search

Motivation: Predicting the secondary structure of an RNA sequence is use...
research
09/11/2013

Approximate Counting CSP Solutions Using Partition Function

We propose a new approximate method for counting the number of the solut...
research
08/06/2014

MCMC for Hierarchical Semi-Markov Conditional Random Fields

Deep architecture such as hierarchical semi-Markov models is an importan...
research
04/23/2022

Partitioning into degenerate graphs in linear time

Let G be a connected graph with maximum degree Δ≥ 3 distinct from K_Δ + ...
