DNA Steganalysis Using Deep Recurrent Neural Networks

04/27/2017
by Ho Bae, et al.

The technique of hiding messages in digital data is called steganography. As sequencing techniques have improved, there have been increasing attempts to hide messages in deoxyribonucleic acid (DNA) sequences, which have become a medium for steganography. Many detection schemes have been developed for conventional digital data, but these schemes are not applicable to DNA sequences because of DNA's complex internal structures. In this paper, we propose the first DNA steganalysis framework for detecting hidden messages and conduct an experiment based on the random oracle model. Among the models suitable for the framework, splice junction classification using deep recurrent neural networks (RNNs) is the most appropriate for performing DNA steganalysis. In our DNA steganalysis approach, we extract the hidden layer composed of RNNs to model the internal structure of a DNA sequence. We provide security for steganography schemes based on mutual entropy and provide simulation results that illustrate how our model detects hidden messages, independent of the regions of a targeted reference genome. We apply our method to human genome datasets and determine that hidden messages in DNA sequences are detectable with a minimum sample size of 100, regardless of the presence of hidden regions.
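To make the idea of scoring a DNA sequence with a recurrent network concrete, the following is a minimal sketch and not the authors' implementation. It one-hot encodes a sequence over the alphabet A, C, G, T and runs a single-layer Elman-style RNN forward pass to produce a score between 0 and 1. The function names (`one_hot`, `init_rnn`, `rnn_score`), the hidden size, and the randomly initialized, untrained weights are all assumptions made for illustration; the paper's model is a deep RNN trained for splice junction classification.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors.

    Characters outside A, C, G, T (for example N) map to an all-zero vector.
    """
    return [
        [1.0 if base == n else 0.0 for n in NUCLEOTIDES]
        for base in sequence.upper()
    ]


def init_rnn(hidden_size, seed=0):
    """Create random, untrained weights (hypothetical; for illustration only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_xh": matrix(hidden_size, 4),            # input -> hidden
        "W_hh": matrix(hidden_size, hidden_size),  # hidden -> hidden
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)],
    }


def rnn_score(sequence, params):
    """Run the RNN over the sequence and squash the final state to (0, 1)."""
    hidden = [0.0] * len(params["w_out"])
    for x in one_hot(sequence):
        # Elman update: h_t = tanh(W_xh x_t + W_hh h_{t-1})
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(params["W_xh"][j], x))
                + sum(w * hi for w, hi in zip(params["W_hh"][j], hidden))
            )
            for j in range(len(hidden))
        ]
    logit = sum(w * h for w, h in zip(params["w_out"], hidden))
    return 1.0 / (1.0 + math.exp(-logit))


params = init_rnn(hidden_size=8)
print(rnn_score("ACGTTGCAAGGT", params))
```

With untrained weights the printed score carries no meaning. In a trained setting, a score of this kind could be compared between an unmodified sequence and a suspected one, which is the general intuition behind using a sequence model for detection.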


research
10/03/2017

Dilated Convolutions for Modeling Long-Distance Genomic Dependencies

We consider the task of detecting regulatory elements in the human genom...
research
09/20/2023

Embed-Search-Align: DNA Sequence Alignment using Transformer Models

DNA sequence alignment involves assigning short DNA reads to the most pr...
research
10/20/2022

Robust Multi-Read Reconstruction from Contaminated Clusters Using Deep Neural Network for DNA Storage

DNA has immense potential as an emerging data storage medium. The princi...
research
02/07/2018

Spectral Learning of Binomial HMMs for DNA Methylation Data

We consider learning parameters of Binomial Hidden Markov Models, which ...
research
12/24/2021

Measuring Quality of DNA Sequence Data via Degradation

We propose and apply a novel paradigm for characterization of genome dat...
research
07/20/2023

Generative Language Models on Nucleotide Sequences of Human Genes

Language models, primarily transformer-based ones, obtained colossal suc...
research
01/24/2022

Inferring taxonomic placement from DNA barcoding allowing discovery of new taxa

In ecology it has become common to apply DNA barcoding to biological sam...
