Addressing Segmentation Ambiguity in Neural Linguistic Steganography

11/12/2022
by   Jumon Nozaki, et al.

Previous studies on neural linguistic steganography, except Ueoka et al. (2021), overlook the fact that the sender must detokenize cover texts to avoid arousing the eavesdropper's suspicion. In this paper, we demonstrate that segmentation ambiguity indeed causes occasional decoding failures at the receiver's side. With the near-ubiquity of subwords, this problem now affects any language. We propose simple tricks to overcome this problem, which are even applicable to languages without explicit word boundaries.
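The failure mode described above can be illustrated with a small sketch. It is a toy example under stated assumptions, not the authors' method or any real tokenizer: the vocabulary `VOCAB`, the greedy longest-match segmenter `tokenize`, and the helper `round_trips` are all hypothetical names introduced here. It shows only that a token sequence, once detokenized into plain text, may be re-segmented differently by the receiver.

```python
# Toy subword vocabulary (hypothetical, for illustration only).
VOCAB = {"un", "able", "unable", "to", "go", " "}


def detokenize(tokens):
    """Join subword tokens into the plain text the sender transmits."""
    return "".join(tokens)


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match segmentation, standing in for the receiver's tokenizer."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"cannot segment text at position {i}")
    return tokens


def round_trips(tokens):
    """True if the receiver recovers exactly the sender's token sequence."""
    return tokenize(detokenize(tokens)) == list(tokens)


# Assume the sender's language model happened to emit "un" + "able".
sender_tokens = ["un", "able", " ", "to", " ", "go"]
cover_text = detokenize(sender_tokens)   # "unable to go"
receiver_tokens = tokenize(cover_text)   # ["unable", " ", "to", " ", "go"]

print(cover_text)
print(receiver_tokens)
print(round_trips(sender_tokens))        # False: the segmentations differ
```

In this sketch the receiver sees `unable` as a single token, so any secret bits tied to the sender's choice of `un` followed by `able` would no longer decode. A round-trip check like `round_trips` is one way a sender might detect such cases; whether it matches the tricks proposed in the paper is not stated in the abstract.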


