On the Risks of Stealing the Decoding Algorithms of Language Models

03/08/2023
by Ali Naseh, et al.

A key component of generating text from modern language models (LMs) is the selection and tuning of decoding algorithms. These algorithms determine how text is generated from the internal probability distribution produced by the LM. Choosing a decoding algorithm and tuning its hyperparameters takes significant time, manual effort, and computation, and it also requires extensive human evaluation. The identity and hyperparameters of such decoding algorithms are therefore considered extremely valuable to their owners. In this work, we show, for the first time, that an adversary with typical API access to an LM can steal the type and hyperparameters of its decoding algorithms at very low monetary cost. Our attack is effective against popular LMs used in text generation APIs, including GPT-2 and GPT-3. We demonstrate the feasibility of stealing such information with only a few dollars, e.g., $0.8, $1, $4, and $40 for the four versions of GPT-3.
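To make concrete what a decoding algorithm and its hyperparameters are, here is a minimal Python sketch of three common choices (temperature scaling, top-k, and nucleus/top-p sampling) applied to a toy next-token distribution. It is an illustration only, not the attack described in this work; the logits, the function names, and the hyperparameter values (`temperature=0.7`, `k=2`, `p=0.9`) are all assumptions chosen for the example.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Divide logits by the temperature and return a softmax distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize the rest to zero."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def sample(probs, rng):
    """Draw one token index according to the (filtered) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a four-token vocabulary (not from any real model).
toy_logits = [2.0, 1.0, 0.5, -1.0]

probs = apply_temperature(toy_logits, temperature=0.7)
top2 = top_k_filter(probs, k=2)
nucleus = top_p_filter(probs, p=0.9)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
token = sample(top2, rng)
```

Each filter changes which tokens can ever be emitted and how often, which is why the choice of algorithm and the values of `temperature`, `k`, and `p` shape the text a deployed LM produces.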

research
09/09/2023

Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System

Neural language models are increasingly deployed into APIs and websites ...
research
10/24/2020

NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints

Conditional text generation often requires lexical constraints, i.e., wh...
research
05/07/2021

On-the-Fly Controlled Text Generation with Experts and Anti-Experts

Despite recent advances in natural language generation, it remains chall...
research
10/07/2022

An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation

Several prior works have shown that language models (LMs) can generate t...
research
09/20/2021

A Plug-and-Play Method for Controlled Text Generation

Large pre-trained language models have repeatedly shown their ability to...
research
02/01/2022

Typical Decoding for Natural Language Generation

Despite achieving incredibly low perplexities on myriad natural language...
research
07/29/2020

Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm

Neural text decoding is important for generating high-quality texts usin...