General Framework for Reversible Data Hiding in Texts Based on Masked Language Modeling

06/21/2022
by Xiaoyan Zheng, et al.

With the rapid development of natural language processing, recent advances in information hiding focus on covertly embedding secret information into texts. Existing algorithms either modify a given cover text or directly generate a text containing secret information. Neither approach is reversible: the original text, which carries no secret information, cannot be perfectly recovered unless a large amount of side information is shared in advance. To tackle this problem, we propose a general framework for embedding secret information into a given cover text such that both the embedded information and the original cover text can be perfectly retrieved from the marked text. The main idea is to use a masked language model to generate a marked text in which the words at some positions can be collected to reconstruct the cover text, while the words at the remaining positions can be processed to extract the secret information. Our results show that the secret information can be successfully embedded and that both the secret information and the original cover text can be successfully recovered. Meanwhile, the marked text carrying secret information has good fluency and semantic quality, indicating that the proposed method has satisfactory security, which has been verified by experimental results. Furthermore, the data hider and the data receiver do not need to share the language model, which significantly reduces the side information and gives the method good potential in applications.
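The position-based idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the fixed layout (one inserted word after every cover word), the bit rule in `word_bit` (parity of the word length), and the `toy_candidates` function standing in for masked-language-model predictions are all assumptions made purely for illustration. The sketch only shows how a receiver could recover both the cover text and the secret bits from word positions alone, without access to the language model.

```python
from typing import Callable, List, Tuple


def word_bit(word: str) -> int:
    """Hypothetical bit rule: parity of the word length (an assumption)."""
    return len(word) % 2


def toy_candidates(left: str, right: str) -> List[str]:
    """Stand-in for masked-language-model predictions at a masked slot.

    A real system would rank words by fluency given the context; this toy
    version ignores the context and returns a fixed list.
    """
    return ["really", "quite", "very", "often"]


def embed(
    cover: List[str],
    bits: List[int],
    propose: Callable[[str, str], List[str]] = toy_candidates,
) -> List[str]:
    """Insert one bit-carrying word after every cover word."""
    if len(bits) != len(cover):
        raise ValueError("this toy layout needs exactly one bit per cover word")
    marked = []
    for i, (word, bit) in enumerate(zip(cover, bits)):
        right = cover[i + 1] if i + 1 < len(cover) else ""
        # Pick the first proposed word whose bit matches the secret bit.
        choice = next((c for c in propose(word, right) if word_bit(c) == bit), None)
        if choice is None:
            raise ValueError("no candidate encodes the required bit")
        marked.extend([word, choice])
    return marked


def extract(marked: List[str]) -> Tuple[List[str], List[int]]:
    """Even positions give back the cover text; odd positions give the bits."""
    cover = marked[0::2]
    bits = [word_bit(w) for w in marked[1::2]]
    return cover, bits


cover_text = "the cat sat on the mat".split()
secret = [1, 0, 1, 1, 0, 0]
marked_text = embed(cover_text, secret)
recovered_cover, recovered_bits = extract(marked_text)
```

In this sketch `extract` never calls `propose`, which mirrors the abstract's claim that the receiver does not need the language model. The fluency of the marked text, which the paper evaluates, is not addressed by the toy candidate list.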

research
09/23/2022

Reversible Data Hiding in Encrypted Text Using Paillier Cryptosystem

Reversible Data Hiding in Encrypted Domain (RDHED) is an innovative meth...
research
03/08/2022

Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding

Linguistic steganography (LS) aims to embed secret information into a hi...
research
03/26/2022

Autoregressive Linguistic Steganography Based on BERT and Consistency Coding

Linguistic steganography (LS) conceals the presence of communication by ...
research
11/12/2018

Automatically Generate Steganographic Text Based on Markov Model and Huffman Coding

Steganography, as one of the three basic information security systems, h...
research
10/18/2018

TS-CNN: Text Steganalysis from Semantic Space Based on Convolutional Neural Network

Steganalysis has been an important research topic in cybersecurity that ...
research
05/10/2023

Generative Steganographic Flow

Generative steganography (GS) is a new data hiding manner, featuring dir...
research
10/24/2017

Scaling Text with the Class Affinity Model

Probabilistic methods for classifying text form a rich tradition in mach...
