Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

03/14/2019
by Wouter Kool, et al.

The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample k elements without replacement. We show how to implicitly apply this 'Gumbel-Top-k' trick to a factorized distribution over sequences, which allows us to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linearly in k and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and (deterministic) beam search and can be used as a principled intermediate alternative. In a translation task, the proposed method compares favourably against alternatives for obtaining diverse yet good-quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
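
To make the basic idea concrete, here is a minimal sketch of the Gumbel-Top-k trick for a plain categorical distribution: add independent Gumbel(0, 1) noise to each log-probability and keep the k largest perturbed values. The function names (`sample_gumbel`, `gumbel_top_k`) and the example probabilities are assumptions chosen for illustration, not taken from the paper or its code. The sketch covers only the explicit categorical case, not the implicit Stochastic Beam Search over sequences that the abstract describes.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample as -log(-log(U)), U ~ Uniform(0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0, which would break log
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_top_k(log_probs, k, rng):
    """Return k distinct indices (and their perturbed scores), largest first.

    Hypothetical helper: perturbs each log-probability with independent
    Gumbel noise and keeps the k indices with the largest perturbed values.
    """
    perturbed = [lp + sample_gumbel(rng) for lp in log_probs]
    order = sorted(range(len(perturbed)), key=lambda i: perturbed[i], reverse=True)
    top = order[:k]
    return top, [perturbed[i] for i in top]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.5, 0.3, 0.15, 0.05]  # illustrative values only
    log_probs = [math.log(p) for p in probs]
    indices, scores = gumbel_top_k(log_probs, k=2, rng=rng)
    print(indices, scores)
```

With k = 1 this reduces to the ordinary Gumbel-Max trick, so the first returned index is distributed according to the original probabilities. Enumerating every element, as done here, is not feasible for an exponentially large sequence domain, which is the case the paper addresses.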

09/22/2021

Conditional Poisson Stochastic Beam Search

Beam search is the default decoding strategy for many sequence generatio...
08/29/2018

Consistent Sampling with Replacement

We describe a very simple method for `consistent sampling' that allows f...
02/21/2020

Incremental Sampling Without Replacement for Sequence Models

Sampling is a fundamental technique, and sampling without replacement is...
10/18/2022

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Decoding methods for large language models often trade-off between diver...
07/14/2020

WOR and p's: Sketches for ℓ_p-Sampling Without Replacement

Weighted sampling is a fundamental tool in data analysis and machine lea...
01/24/2020

Ensemble Rejection Sampling

We introduce Ensemble Rejection Sampling, a scheme for exact simulation ...
06/08/2020

Confidence sequences for sampling without replacement

Many practical tasks involve sampling sequentially without replacement f...

Code Repositories

estimating-gradients-without-replacement

Estimating Gradients for Discrete Random Variables by Sampling without Replacement

