Autoregressive Linguistic Steganography Based on BERT and Consistency Coding

03/26/2022
by Xiaoyan Zheng, et al.

Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text. How to generate a high-quality text carrying secret information is a key problem. With the widespread application of deep learning in natural language processing, recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload than much of the prior art. However, the security still needs to be enhanced. To tackle this problem, we propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security. Given a text, the proposed work introduces a masked LM and uses consistency coding to make up for the shortcomings of the block coding used in previous work, so that a candidate token set of arbitrary size can be encoded and the probability distribution can be exploited for information hiding. The masked positions to be embedded are filled with tokens determined in an autoregressive manner, which strengthens the connection between contexts and therefore maintains the quality of the text. Experimental results show that, compared with related works, the proposed work improves the fluency of the steganographic text while guaranteeing security, and also increases the embedding payload to a certain extent.
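
The contrast the abstract draws between block coding and a coding that handles candidate sets of arbitrary size can be illustrated with a toy example. The sketch below is not the paper's consistency coding, whose construction the abstract does not describe. As a stand-in it uses Huffman coding, a well-known prefix code that also accepts any number of candidates and takes their probabilities into account. The function names, the candidate tokens, and the probabilities are all hypothetical and chosen only for illustration.

```python
import heapq
from itertools import count


def block_code(candidates):
    """Fixed-length block coding: only the first 2**k candidates get a codeword."""
    if not candidates:
        raise ValueError("need at least one candidate")
    k = len(candidates).bit_length() - 1  # largest k with 2**k <= len(candidates)
    if k == 0:
        return {candidates[0]: ""}  # a single candidate carries no bits
    return {tok: format(i, f"0{k}b") for i, tok in enumerate(candidates[: 2 ** k])}


def huffman_code(probs):
    """Huffman stand-in (NOT the paper's consistency coding): token -> prefix-free bits."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {tok: ""}) for tok, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {t: "0" + c for t, c in c0.items()}
        merged.update({t: "1" + c for t, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def embed(code, bits):
    """Pick the token whose codeword is a prefix of the secret bit string."""
    for tok, cw in code.items():
        if cw and bits.startswith(cw):
            return tok, len(cw)
    raise ValueError("bit string too short for any codeword")


# Hypothetical candidates for one masked position, with made-up probabilities.
probs = {"the": 0.4, "a": 0.3, "this": 0.15, "that": 0.1, "one": 0.05}

block = block_code(list(probs))  # 5 candidates, but only 4 receive a codeword
huff = huffman_code(probs)       # all 5 candidates receive a codeword

token, used = embed(huff, "1011")
print(len(block), len(huff), token, used)
```

With five candidates, the block code has to discard one of them and assigns every remaining token the same two-bit codeword regardless of how likely it is. The prefix code keeps all five and gives likelier tokens shorter codewords. The receiver, who is assumed to reconstruct the same candidate set, recovers the embedded bits by looking up the codeword of the observed token.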


research
03/08/2022

Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding

Linguistic steganography (LS) aims to embed secret information into a hi...
research
06/21/2022

General Framework for Reversible Data Hiding in Texts Based on Masked Language Modeling

With the fast development of natural language processing, recent advance...
research
11/12/2018

Automatically Generate Steganographic Text Based on Markov Model and Huffman Coding

Steganography, as one of the three basic information security systems, h...
research
12/21/2021

Pixel-Stega: Generative Image Steganography Based on Autoregressive Models

In this letter, we explored generative image steganography based on auto...
research
06/03/2021

Provably Secure Generative Linguistic Steganography

Generative linguistic steganography mainly utilized language models and ...
research
03/10/2023

ICStega: Image Captioning-based Semantically Controllable Linguistic Steganography

Nowadays, social media has become the preferred communication platform f...
