Solving the Rubik's Cube Without Human Knowledge

05/18/2018
by Stephen McAleer, et al.

A generally intelligent agent must be able to teach itself how to solve problems in complex domains with minimal human supervision. Recently, deep reinforcement learning algorithms combined with self-play have achieved superhuman proficiency in Go, Chess, and Shogi without human data or domain knowledge. In these environments, a reward is always received at the end of the game; however, for many combinatorial optimization environments, rewards are sparse and episodes are not guaranteed to terminate. We introduce Autodidactic Iteration: a novel reinforcement learning algorithm that is able to teach itself how to solve the Rubik's Cube with no human assistance. Our algorithm is able to solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves -- less than or equal to solvers that employ human domain knowledge.
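
The point about sparse rewards and non-terminating episodes can be illustrated with a small toy sketch. This is not Autodidactic Iteration and not a model of the Rubik's Cube: the six-tile puzzle, its three moves, the 0/1 reward, and every name below are assumptions made up for illustration. It only shows that when the single rewarding state is the solved one, an episode driven by random moves may hit its step cap without ever seeing a reward.

```python
import random

# Hypothetical toy puzzle (an assumption, not the Rubik's Cube):
# a state is a tuple of six tiles, and the goal is the sorted order.
SOLVED = (0, 1, 2, 3, 4, 5)

# Three made-up moves, each a permutation of the tiles.
MOVES = {
    "rotate_left": lambda s: s[1:] + s[:1],
    "swap_front": lambda s: (s[1], s[0]) + s[2:],
    "reverse_back": lambda s: s[:3] + s[3:][::-1],
}


def step(state, move):
    """Apply a move. The reward is 1 only on reaching SOLVED, else 0."""
    next_state = MOVES[move](state)
    reward = 1 if next_state == SOLVED else 0
    return next_state, reward


def scramble(num_moves, rng):
    """Start from SOLVED and apply random moves to get a start state."""
    state = SOLVED
    for _ in range(num_moves):
        state, _ = step(state, rng.choice(list(MOVES)))
    return state


def random_episode(start, max_steps, rng):
    """Play random moves; return the step that earned the reward, or None."""
    state = start
    for t in range(1, max_steps + 1):
        state, reward = step(state, rng.choice(list(MOVES)))
        if reward:
            return t
    return None  # step cap reached with no reward observed


def fraction_terminated(episodes, scramble_depth, max_steps, seed=0):
    """Fraction of random episodes that reach SOLVED within the step cap."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(episodes):
        start = scramble(scramble_depth, rng)
        if random_episode(start, max_steps, rng) is not None:
            hits += 1
    return hits / episodes


if __name__ == "__main__":
    print(fraction_terminated(episodes=1000, scramble_depth=20, max_steps=50))
```

Even this toy has only 720 possible tile orderings; the Rubik's Cube state space is vastly larger, so a learner relying on randomly encountered rewards would see correspondingly fewer of them.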


research
12/05/2017

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

The game of chess is the most widely-studied domain in the history of ar...
research
07/04/2018

Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization

Adversarial self-play in two-player games has delivered impressive resul...
research
02/17/2018

A Deep Q-Learning Agent for the L-Game with Variable Batch Training

We employ the Deep Q-Learning algorithm with Experience Replay to train ...
research
08/17/2020

A Survey on Reinforcement Learning for Combinatorial Optimization

This paper gives a detailed review of reinforcement learning in combinat...
research
11/11/2021

CubeTR: Learning to Solve The Rubiks Cube Using Transformers

Since its first appearance, transformers have been successfully used in ...
research
07/04/2012

Learning from Sparse Data by Exploiting Monotonicity Constraints

When training data is sparse, more domain knowledge must be incorporated...
research
07/05/2021

Learning Delaunay Triangulation using Self-attention and Domain Knowledge

Delaunay triangulation is a well-known geometric combinatorial optimizat...
