Advancing Beyond Identification: Multi-bit Watermark for Language Models

08/01/2023
by KiYoon Yoo, et al.

This study aims to proactively tackle the misuse of large language models beyond the identification of machine-generated text. Existing methods focus on detection, but countering some malicious uses requires tracing the adversarial user. To address this, we propose "Multi-bit Watermark through Color-listing" (COLOR), which embeds traceable multi-bit information during language model generation. Building on zero-bit watermarking (Kirchenbauer et al., 2023a), COLOR enables extraction without model access, embeds the message on the fly, and maintains text quality, while still allowing zero-bit detection. Preliminary experiments demonstrate successful embedding of 32-bit messages with 91.9% accuracy in moderate-length texts (∼500 tokens). This work advances strategies to counter language model misuse effectively.
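
To make the idea of embedding and extracting a multi-bit message more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation, and the abstract does not specify these details. Everything in it is an assumption made for illustration: the vocabulary size, the four "colour" lists (two message bits per position), the way the previous token seeds the colouring and picks a message position, and the `bias` parameter that stands in for boosting logits in a real model. The names `embed`, `extract`, and `_context` are hypothetical.

```python
import random

VOCAB_SIZE = 1000   # toy vocabulary size (assumption)
NUM_COLORS = 4      # four colour lists, so each position carries 2 bits
BITS_PER_POS = 2


def _context(prev_token, num_positions):
    """Derive a vocabulary colouring and a message position from the previous token."""
    rng = random.Random(prev_token)
    colors = [i % NUM_COLORS for i in range(VOCAB_SIZE)]
    rng.shuffle(colors)                      # colors[t] is the colour list of token t
    position = rng.randrange(num_positions)  # which 2-bit chunk this step encodes
    return colors, position


def embed(bits, length, bias=0.8, seed=0):
    """Generate toy 'text' that favours the colour list selected by the message.

    A real system would add a bias to the model's logits; here we simply
    sample from the favoured list with probability `bias` (an assumption).
    """
    chunks = [int(bits[i:i + BITS_PER_POS], 2) for i in range(0, len(bits), BITS_PER_POS)]
    sampler = random.Random(seed)
    tokens = [sampler.randrange(VOCAB_SIZE)]  # arbitrary starting token
    for _ in range(length):
        colors, pos = _context(tokens[-1], len(chunks))
        if sampler.random() < bias:
            favoured = [t for t in range(VOCAB_SIZE) if colors[t] == chunks[pos]]
            tokens.append(sampler.choice(favoured))
        else:
            tokens.append(sampler.randrange(VOCAB_SIZE))
    return tokens


def extract(tokens, num_bits):
    """Recover the message from tokens alone by majority vote over colours."""
    num_positions = num_bits // BITS_PER_POS
    counts = [[0] * NUM_COLORS for _ in range(num_positions)]
    for prev, tok in zip(tokens, tokens[1:]):
        colors, pos = _context(prev, num_positions)
        counts[pos][colors[tok]] += 1
    chunks = [max(range(NUM_COLORS), key=row.__getitem__) for row in counts]
    return "".join(format(c, "02b") for c in chunks)


if __name__ == "__main__":
    message = "10110010011100011010101100001111"  # 32-bit example message
    text = embed(message, length=500)
    print(extract(text, num_bits=len(message)) == message)
```

Note that `extract` needs only the token sequence and the shared seeding rule, which mirrors the abstract's claim that extraction works without model access. In a real setting the bias would be much softer to preserve text quality, so recovery would be noisier than in this toy.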
