Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach

11/15/2022
by Siddhant Bhambri, et al.

In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems. These methods are based on approximation in value space and the rollout approach, admit a straightforward implementation, and provide improved performance over various heuristic approaches. For the Wordle puzzle, they yield on-line solution strategies that are very close to optimal at relatively modest computational cost. Our methods are viable for more complex versions of Wordle and related search problems, for which an optimal strategy would be impossible to compute. They are also applicable to a wide range of adaptive sequential decision problems that involve an unknown or frequently changing environment whose parameters are estimated on-line.
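As a rough illustration of the rollout idea mentioned above, the following is a minimal sketch of one-step lookahead with rollout on a toy word list. It is not the authors' implementation. The base heuristic (always guess the first remaining candidate), the uniform belief over candidate words, and all function names are assumptions made for this example. The candidate set plays the role of the belief state: each guess is scored by its immediate cost of one guess plus the number of further guesses the base heuristic would need, averaged over all words still consistent with the feedback so far.

```python
from collections import Counter


def feedback(guess, secret):
    """Wordle feedback as a string: 'g' green, 'y' yellow, 'b' gray."""
    result = ["b"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count the secret letters left unmatched.
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        else:
            unmatched[s] += 1
    # Second pass: mark yellows, consuming unmatched secret letters.
    for i, g in enumerate(guess):
        if result[i] == "b" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def filter_candidates(candidates, guess, pattern):
    """Keep only the words that would have produced this feedback."""
    return [w for w in candidates if feedback(guess, w) == pattern]


def base_heuristic(candidates):
    """Hypothetical base policy: guess the first remaining candidate."""
    return candidates[0]


def base_cost(candidates, secret):
    """Number of guesses the base heuristic needs to find `secret`."""
    steps = 0
    while True:
        guess = base_heuristic(candidates)
        steps += 1
        if guess == secret:
            return steps
        pattern = feedback(guess, secret)
        candidates = filter_candidates(candidates, guess, pattern)


def rollout_guess(candidates):
    """One-step lookahead: minimize 1 + expected base-heuristic cost.

    Assumes a uniform belief over the remaining candidates.
    """
    best_guess, best_cost = None, float("inf")
    for guess in candidates:
        total = 0
        for secret in candidates:
            if guess == secret:
                total += 1
            else:
                pattern = feedback(guess, secret)
                remaining = filter_candidates(candidates, guess, pattern)
                total += 1 + base_cost(remaining, secret)
        expected = total / len(candidates)
        if expected < best_cost:
            best_guess, best_cost = guess, expected
    return best_guess, best_cost


if __name__ == "__main__":
    words = ["crane", "crate", "grate", "slate", "plate"]  # toy list
    guess, cost = rollout_guess(words)
    print(guess, cost)
```

Because the lookahead also evaluates the base heuristic's own first guess, the expected cost it returns can never exceed that of the base heuristic on the same candidate set. This sketch restricts guesses to the remaining candidates and enumerates every possible secret, which is only practical for small lists.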

