CubeTR: Learning to Solve The Rubiks Cube Using Transformers

11/11/2021
by Mustafa Ebrahim Chasmai, et al.

Since their first appearance, transformers have been used successfully in domains ranging from computer vision to natural language processing. Applying transformers to reinforcement learning by reformulating it as a sequence modelling problem was proposed only recently. Compared to other commonly explored reinforcement learning problems, the Rubik's Cube poses a unique set of challenges. It has a single solved state among quintillions of possible configurations, which leads to extremely sparse rewards. The proposed model, CubeTR, attends to longer sequences of actions and addresses the problem of sparse rewards. CubeTR learns to solve the Rubik's Cube from arbitrary starting states without any human prior, and after move regularisation, the solutions it generates are expected to be very close in length to those given by algorithms used by expert human solvers. CubeTR provides insights into the generalisability of learning algorithms to higher-dimensional cubes and the applicability of transformers in other relevant sparse-reward scenarios.
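
To make the sparse-reward claim concrete, here is a minimal Python sketch. It is not taken from the paper. It counts the reachable configurations of a 3x3x3 cube with the standard counting argument and, under an assumed reward scheme (1 only on the step that reaches the solved state, 0 otherwise), computes the returns-to-go that sequence-modelling formulations of reinforcement learning typically condition on. The function names `cube_state_count`, `sparse_rewards`, and `returns_to_go` are hypothetical and chosen for illustration only.

```python
from math import factorial


def cube_state_count():
    """Number of reachable 3x3x3 configurations (standard counting argument)."""
    corner_perms = factorial(8)
    corner_orients = 3 ** 7   # the last corner's twist is forced by the others
    edge_perms = factorial(12)
    edge_orients = 2 ** 11    # the last edge's flip is forced by the others
    # Corner and edge permutation parities must agree, hence the division by 2.
    return corner_perms * corner_orients * edge_perms * edge_orients // 2


def sparse_rewards(episode_length, solved_at_end):
    """Assumed reward scheme: 1 on the step that solves the cube, 0 elsewhere."""
    rewards = [0] * episode_length
    if solved_at_end and episode_length > 0:
        rewards[-1] = 1
    return rewards


def returns_to_go(rewards):
    """Suffix sums of the rewards, one value per timestep."""
    out, total = [], 0
    for r in reversed(rewards):
        total += r
        out.append(total)
    return out[::-1]


if __name__ == "__main__":
    print(cube_state_count())
    print(returns_to_go(sparse_rewards(4, solved_at_end=True)))
```

With exactly one rewarding state among roughly 4.3 × 10^19 configurations, an episode that never reaches the solved state yields an all-zero reward sequence under this assumed scheme, which is the sparsity the abstract refers to.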

research
07/30/2020

PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards

Reinforcement learning (RL), particularly in sparse reward settings, oft...
research
07/12/2023

Transformers in Reinforcement Learning: A Survey

Transformers have significantly impacted domains like natural language p...
research
08/25/2023

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers

Although dominant in natural language processing, transformer-based mode...
research
05/18/2018

Solving the Rubik's Cube Without Human Knowledge

A generally intelligent agent must be able to teach itself how to solve ...
research
05/26/2020

Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges and Tabular Q-Learning

Penetration testing is a security exercise aimed at assessing the securi...
research
12/18/2020

Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning

Query reformulation aims to alter potentially noisy or ambiguous text se...
research
08/09/2021

Knowledge accumulating: The general pattern of learning

Artificial Intelligence has been developed for decades with the achievem...
