Unsupervised Learning in Genome Informatics

08/03/2015
by Ka-Chun Wong, et al.

With many genomes now available, unsupervised learning algorithms are essential for extracting genome-wide biological insights. In particular, the functional characterization of different genomes is essential to understanding life. In this book chapter, we review state-of-the-art unsupervised learning algorithms for genome informatics, from DNA to microRNA. DNA (deoxyribonucleic acid) is the basic component of genomes. A significant fraction of DNA regions (transcription factor binding sites) are bound by proteins (transcription factors) to regulate gene expression at different developmental stages in different tissues. To fully understand genetics, it is necessary to apply unsupervised learning algorithms to learn and infer those DNA regions. Here we review several unsupervised learning methods for deciphering the genome-wide patterns of those DNA regions. MicroRNAs (miRNAs), a class of small endogenous non-coding RNA (ribonucleic acid) species, regulate gene expression post-transcriptionally by forming imperfect base pairs with target sites, primarily in the 3' untranslated regions of messenger RNAs (mRNAs). Since the 1993 discovery of the first miRNA, lin-4, in worms, a large number of studies have been dedicated to characterizing the functional impact of miRNAs in a network context in order to understand complex diseases such as cancer. Here we review several representative unsupervised learning frameworks for inferring miRNA regulatory networks by exploiting the static sequence-based information pertinent to prior knowledge of miRNA targeting and the dynamic information on miRNA activities implied by recently available large data compendia, which interrogate genome-wide expression profiles of miRNAs and/or mRNAs across various cell conditions.


