
Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

by Wouter Kool, et al.

The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample k elements without replacement. We show how to implicitly apply this 'Gumbel-Top-k' trick on a factorized distribution over sequences, allowing us to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linearly in k and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and (deterministic) beam search and can be used as a principled intermediate alternative. In a translation task, the proposed method compares favourably against alternatives to obtain diverse yet good quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
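To make the basic idea concrete, here is a minimal sketch of the Gumbel-Top-k trick on an explicit, small categorical distribution: add independent Gumbel(0, 1) noise to each log-probability and keep the k largest perturbed values. This illustrates only the trick the abstract starts from, not the Stochastic Beam Search over sequences. The function name `gumbel_top_k` and the example probabilities are assumptions chosen for illustration.

```python
import numpy as np


def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement (illustrative sketch).

    log_probs: 1-D array of (possibly unnormalized) log-probabilities.
    Returns the chosen indices and their perturbed log-probabilities,
    both ordered from largest to smallest perturbed value.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    # Perturb every log-probability with independent Gumbel(0, 1) noise.
    perturbed = log_probs + rng.gumbel(size=log_probs.shape)
    # Taking the k largest perturbed values gives a sample without
    # replacement; with k = 1 this reduces to the Gumbel-Max trick.
    top_k = np.argsort(-perturbed)[:k]
    return top_k, perturbed[top_k]


rng = np.random.default_rng(0)
probs = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical example distribution
indices, keys = gumbel_top_k(np.log(probs), k=3, rng=rng)
print(indices, keys)
```

With k = 1 the first index is distributed according to `probs`. For larger k, the order of the returned indices corresponds to sequentially sampling without replacement from the renormalized remaining probabilities.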




Conditional Poisson Stochastic Beam Search

Beam search is the default decoding strategy for many sequence generatio...

Consistent Sampling with Replacement

We describe a very simple method for `consistent sampling' that allows f...

Incremental Sampling Without Replacement for Sequence Models

Sampling is a fundamental technique, and sampling without replacement is...

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Decoding methods for large language models often trade-off between diver...

WOR and p's: Sketches for ℓ_p-Sampling Without Replacement

Weighted sampling is a fundamental tool in data analysis and machine lea...

Ensemble Rejection Sampling

We introduce Ensemble Rejection Sampling, a scheme for exact simulation ...

Confidence sequences for sampling without replacement

Many practical tasks involve sampling sequentially without replacement f...

Code Repositories


Estimating Gradients for Discrete Random Variables by Sampling without Replacement
