Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks

03/16/2021
by   Mohammadamin Barekatain, et al.

Motivation: Regulatory sequences are defined not only by their nucleic acid sequence but also by their relative distances to genomic landmarks such as the transcription start site, exon boundaries, or the polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its ability to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed.

Results: Here we developed spline transformation, a neural network module based on splines that models distances flexibly and robustly. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 120 out of 123 proteins. We also developed a deep neural network for human splice branchpoint prediction based on spline transformations that outperformed the current best machine learning model, which is itself already distance-based. Compared to piecewise linear transformations, as obtained by composing rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. Because spline transformation can be applied to quantities other than distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox.

Availability and implementation: Spline transformation is implemented as a Keras layer in the CONCISE Python package: https://github.com/gagneurlab/concise. Analysis code is available at https://github.com/gagneurlab/Manuscript_Avsec_Bioinformatics_2017.
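To make the idea of a spline transformation concrete, the following is a minimal NumPy sketch, not the CONCISE implementation or its API. It assumes a scalar distance rescaled to [0, 1], a clamped uniform knot vector, and cubic B-splines; the function names `bspline_basis` and `spline_transform`, the toy positional effect, and the least-squares fit (standing in for gradient-based training of the weights inside a network) are all illustrative assumptions.

```python
import numpy as np


def bspline_basis(x, n_bases=10, degree=3, start=0.0, end=1.0):
    """Evaluate n_bases B-spline basis functions at x (Cox-de Boor recursion).

    Returns an array of shape (len(x), n_bases). Assumes n_bases > degree.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Clamped uniform knot vector: boundary knots repeated degree + 1 times.
    n_inner = n_bases - degree - 1
    inner = np.linspace(start, end, n_inner + 2)[1:-1]
    knots = np.concatenate([[start] * (degree + 1), inner, [end] * (degree + 1)])

    # Degree-0 basis: indicator of each half-open knot interval.
    n0 = len(knots) - 1
    B = np.zeros((x.size, n0))
    for i in range(n0):
        B[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Assign the right endpoint to the last non-empty interval.
    B[x == end, n_bases - 1] = 1.0

    # Raise the degree step by step, skipping zero-width knot spans.
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, n0 - d))
        for i in range(n0 - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                B_new[:, i] += (x - knots[i]) / left * B[:, i]
            if right > 0:
                B_new[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = B_new
    return B


def spline_transform(x, weights, degree=3, start=0.0, end=1.0):
    """Smooth scalar function of x: a weighted sum of B-spline bases."""
    B = bspline_basis(x, n_bases=len(weights), degree=degree, start=start, end=end)
    return B @ np.asarray(weights, dtype=float)


# Toy example (hypothetical data): recover a smooth positional effect.
distance = np.linspace(0.0, 1.0, 200)
effect = np.exp(-((distance - 0.3) ** 2) / 0.01)  # made-up bump near 0.3
basis = bspline_basis(distance, n_bases=12)
weights, *_ = np.linalg.lstsq(basis, effect, rcond=None)
fitted = spline_transform(distance, weights)
```

In a network, the weights would be trainable parameters and the transformed distance would be combined with sequence-derived features; a smoothness penalty on adjacent weights is a common companion to such a basis, though it is omitted here for brevity.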


