Deep Recurrent Neural Networks for Sequential Phenotype Prediction in Genomics
In analyzing of modern biological data, we are often dealing with ill-posed problems and missing data, mostly due to high dimensionality and multicollinearity of the dataset. In this paper, we have proposed a system based on matrix factorization (MF) and deep recurrent neural networks (DRNNs) for genotype imputation and phenotype sequences prediction. In order to model the long-term dependencies of phenotype data, the new Recurrent Linear Units (ReLU) learning strategy is utilized for the first time. The proposed model is implemented for parallel processing on central processing units (CPUs) and graphic processing units (GPUs). Performance of the proposed model is compared with other training algorithms for learning long-term dependencies as well as the sparse partial least square (SPLS) method on a set of genotype and phenotype data with 604 samples, 1980 single-nucleotide polymorphisms (SNPs), and two traits. The results demonstrate performance of the ReLU training algorithm in learning long-term dependencies in RNNs.
READ FULL TEXT