# On the Parallel Tower of Hanoi Puzzle: Acyclicity and a Conditional Triangle Inequality

A generalization of the Tower of Hanoi Puzzle, the Parallel Tower of Hanoi Puzzle, is described herein. Within this context, two theorems on minimal walks in the state space of configurations, along with their algorithmic proofs, are provided.


## 1 Introduction

### 1.1 Problem Description

The Tower of Hanoi Puzzle consists of annular disks (no two disks of equal radius) and posts attached to a fixed base. The puzzle begins with all of the disks stacked on a single post with no larger disk being stacked atop a smaller disk. The goal of the puzzle is to transfer the initial “tower” of disks to another post while adhering to the following rules:

1. one disk is transferred from the top of one stack to the top of another (possibly empty) stack at each stage of the puzzle, and

2. a larger disk cannot be transferred to a post occupied by a smaller disk.

See [2] for a complete summary and history of the Puzzle. One prevailing question that has been studied over the past decades is summarized as follows: assuming feasibility, how does one perform the task in the fewest number of disk transfers?

This article considers the same question in a generalization of the classic Puzzle where there are one or more “towers” of disks. We will introduce towers on posts where . Each tower of disks is assigned its own color, and each tower has disks of sizes where implies that disk is larger than disk . Similar to above, we require that no disk (of any color) is atop any smaller or equal-sized disk (of any color). We will define herein the rules of the parallel puzzle in an analogous manner:

1. one or more disks (of any color) are transferred, each from the top of its own stack, to the top of one or more (possibly empty) stacks at each stage of the puzzle, where no two transfers share the same destination stack, and

2. a larger disk cannot be transferred to a post occupied by a smaller or equal-sized disk.

For other parallel variants of the Tower of Hanoi Puzzle (on a single tower of disks, but where more than one disk can be transferred at each stage), see [8] and [4].

We examine the question of finding walks of minimal length within our parallel context, and we establish two results, with constructive proofs, that prescribe necessary properties that minimal-length walks connecting configurations must possess. These results can be iteratively applied to a given walk in the state space connecting arbitrary, valid configurations (e.g., a learning episode of a reinforcement learning procedure within the context of the Tower of Hanoi Puzzle; see, e.g., [3], [7], [1]). These results may (1) potentially reduce the size of the search space for finding walks of minimal length connecting the configurations, and (2) decide how this reduced search space should be traversed by a walk of minimal length.

### 1.2 Article Summary

1. This article formally encapsulates the Parallel Tower of Hanoi Puzzle in a finite metric space on graphs, words, and sets of words (clusters).

2. Two theorems (a theorem on acyclicity and a conditional triangle inequality) are presented. These results are generalizations of the results from [5], and they prescribe necessary properties of minimal-length walks within this mathematical framework.

### 1.3 Notation

The domain of all variables is the set of non-negative integers. Throughout, we use , , and to denote the number of disks, posts, and towers, respectively.

For any fixed positive integer , we let denote the set of the first positive integers. We write to denote the set Furthermore, we’ll write and to denote the sets and , respectively. We will define the set of posts to be .

Assume is a subsequence of contiguous elements within the sequence (e.g., a subsequence of contiguous symbols in a string, a subsequence of adjacent disk configurations). Should the case arise where , we will assume that the subsequence is the empty sequence (e.g., the empty string , the empty subsequence).

For a fixed , we will write , and for , we will write . Context permitting, we will omit the superscript.

Assume that where and . We will say that (equivalently, ) if and only if for each . We will also write (equivalently, ) if the inequality (equivalently, ) holds for at least one index .

We assume that each of the towers is assigned its own color. We will write to denote the -th largest disk of color (). We will represent a configuration of disks as a matrix over the set of posts : the matrix

$$\begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{t,1} & \dots & a_{t,n} \end{bmatrix}$$

encodes a configuration of towers, each with disks, on posts where the disk occupies the post ; the largest disks of each color occupy the posts in the first (left-most) column, and the post configurations of the smaller disks are listed by column in decreasing size from left to right.
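The matrix encoding can be made concrete with a minimal Python sketch. It assumes zero-based indices (colors, disk indices, and posts all start at 0), and the helper names `stacks` and `is_valid` are hypothetical, chosen only for illustration. Because the stacking rule forbids a disk from resting on a smaller or equal-sized disk, the order within each stack is determined by disk size alone, and a configuration is valid exactly when no post holds two disks of the same index.

```python
from collections import defaultdict


def stacks(config):
    """Group disks by post.

    config[u][y] is the post of the (y+1)-th largest disk of color u
    (zero-based indices are an assumption of this sketch).
    Each stack is returned bottom-to-top as a list of (y, u) pairs.
    """
    by_post = defaultdict(list)
    for u, row in enumerate(config):
        for y, post in enumerate(row):
            by_post[post].append((y, u))
    # A smaller y means a larger disk, so sorting puts larger disks at the bottom.
    return {post: sorted(disks) for post, disks in by_post.items()}


def is_valid(config):
    """True when no post holds two equal-sized disks (same index y)."""
    for disks in stacks(config).values():
        sizes = [y for y, _ in disks]
        if len(set(sizes)) != len(sizes):
            return False
    return True
```

For example, `((0, 0), (1, 1))` encodes two towers of two disks each, the first on post 0 and the second on post 1.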

## 2 The Parallel Tower of Hanoi Problem

In this section, we will establish the mathematical machinery to analyze the state space of disk configurations in a hierarchical manner: we will partition the state space of disk configurations into sets (called clusters) that are defined by the arrangement of a prescribed subset of the largest disks. Afterwards, as was originally done in [6], we will represent the parallel state space of configurations in the context of graphs. We will define the formalism to analyze walks within these graphs, as well as cluster walks: for a pair of arbitrary configurations within the state space, this machinery will allow us to decide in which subsets of configurations (i.e., the corresponding subsets of vertices in the state graph) minimal walks connecting and must be contained.

###### Definition 2.1 (Parallel Clusters/Cluster Sets).

Assume that .

1. A cluster (of grading ) is a set of disk configurations in which, for each , the disks are in a fixed configuration.

We denote the set of all clusters of grading g on posts as . We also define the cluster of grading to be the set of all disk configurations, and we denote this set as .

2. Assume the cluster . For all , we define the set of clusters

$$S^{t,h}_{p}(A) = \{ B \mid B \in S^{t,h}_{p},\ B \subseteq A \}.$$

Context permitting, we will omit the superscript and subscript whenever possible.

For , there are choices to place , choices to place choices to place . Thus, there are choices to place the disks of index . Consequently, induction yields clusters of grading .
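The counting argument above can be checked by brute force. The following is a minimal sketch under two assumptions that are not spelled out in the surviving text: equal-sized disks of different colors must occupy distinct posts (which follows from the stacking rule), and the placements of disks of different indices are independent. The function name `cluster_labels` and the parameter names `p`, `t`, `g` (posts, towers, grading) are chosen for illustration.

```python
from itertools import permutations, product
from math import perm


def cluster_labels(p, t, g):
    """Enumerate placements of the g largest disks of each of t colors on p posts.

    Each label is a tuple of g columns; a column assigns pairwise distinct
    posts to the t equal-sized disks of one index.
    """
    columns = list(permutations(range(p), t))
    return list(product(columns, repeat=g))


# Under the assumptions above, the count agrees with (p! / (p - t)!) ** g.
assert len(cluster_labels(3, 2, 2)) == perm(3, 2) ** 2
```

With a single tower (`t = 1`) the count reduces to `p ** g`, which matches the classical single-tower state space restricted to the `g` largest disks.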

In the classical puzzle, two configurations were adjacent if and only if they differed by the valid transfer of a single disk. We now extend this definition into the parallel context as follows.

###### Definition 2.2.

Let the configuration .

A configuration is EREW-adjacent to if and only if the following conditions are satisfied:

1. Non-triviality: there exists and where (at least one disk has transferred), and

2. Stack Read/Write: if , then, for and , the inequalities and hold.

One consequence of this definition is the exclusive read/write property for the parallel Puzzle: if and where , then and .
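A minimal Python sketch of an adjacency test in this spirit follows. The exact inequalities of the stack read/write condition did not survive in the text here, so the check below is an assumption reconstructed from the rules of the parallel puzzle in Section 1.1: every transferred disk must be the top of its stack both before and after the transfer. The function name `is_erew_adjacent` and the zero-based matrix encoding are hypothetical. The sketch also permits two posts to exchange their top disks in one stage; whether the original definition allows this cannot be confirmed from the text.

```python
def is_erew_adjacent(a, b):
    """Check whether configuration b is reachable from a in one parallel stage.

    a[u][y] and b[u][y] give the post of the (y+1)-th largest disk of color u
    before and after the stage. Assumes a is itself a valid configuration.
    """
    disks = [(u, y) for u, row in enumerate(a) for y in range(len(row))]
    moved = [(u, y) for (u, y) in disks if a[u][y] != b[u][y]]
    if not moved:
        return False  # non-triviality: at least one disk must transfer
    for (u, y) in moved:
        for (s, z) in disks:
            if (s, z) == (u, y) or z < y:
                continue  # strictly larger disks may sit below the moved disk
            # (s, z) is no larger than (u, y): it may share neither the
            # source post before the stage nor the destination post after it.
            if a[s][z] == a[u][y] or b[s][z] == b[u][y]:
                return False
    return True
```

In this sketch the exclusive read/write property is not tested separately: two transferred disks that share a source or a destination would each have to be the top of the same stack, so the pairwise check already rejects them.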

Equipped with this definition of adjacency, we will define the state graph of the parallel Puzzle as was done in [6] for the single-tower case (see Figure 1).

###### Definition 2.3.

We define the state graph of the parallel Puzzle to be the pair where

and we will denote this graph as .

We now define a cluster walk, and we will equip walks in the parallel state space with a measure that naturally aligns with the counting measure of the classic Puzzle. To this end, we will introduce an auxiliary graph object for the parallel Puzzle.

###### Definition 2.4.

Let configurations and be EREW-adjacent. We define the transition graph of the adjacent pair to be the directed, edge-labelled graph on with edge-labels over : the edge is in the edge set of the transition graph with label if and only if the disk transfers from post to post .

The edge set is defined to be the transfer vector of the EREW pair . We will say that the configurations are -adjacent, and express this relation with the notation .

We denote the number of edges in the transition graph of the pair to be . If , then we define .

As per the rules of the parallel Puzzle, the in-degree and out-degree of any vertex in the transition graph is at most one.
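The transition graph can be sketched as a dictionary from edges to edge labels. This is a minimal illustration, assuming the zero-based matrix encoding and assuming that the edge count of a pair of equal configurations is zero (the defining symbols for that case are missing from the text here). The names `transfer_vector` and `eta` are hypothetical.

```python
def transfer_vector(a, b):
    """Map each edge (source post, destination post) to its label (color, index).

    Assumes a and b are equal or EREW-adjacent, so that no two transferred
    disks share a source post or a destination post.
    """
    return {
        (a[u][y], b[u][y]): (u, y)
        for u in range(len(a))
        for y in range(len(a[u]))
        if a[u][y] != b[u][y]
    }


def eta(a, b):
    """Number of edges in the transition graph (zero when a equals b)."""
    return len(transfer_vector(a, b))
```

Since sources and destinations are each pairwise distinct, every post appears at most once as a tail and at most once as a head among the dictionary keys, which is the degree bound stated above.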

We will now define adjacency for clusters as follows.

###### Definition 2.5 (Parallel Cluster Pair).

Let clusters for some fixed grading where . The clusters and are EREW-adjacent if and only if there are configurations and where for some transfer vector .

Let denote the subset of where

$$j = \{(q,r) \mid \text{the edge label of } (q,r) \text{ is } (u, j_u) \text{ where } j_u \le g\}.$$

(Footnote 1: As , we are assured that .)

We will say that and form a -adjacent cluster pair (with transfer vector ), denoted by .

###### Definition 2.6 (Parallel Cluster Walk).

Let for some fixed grading . A (cluster) walk (of grading ) connecting and , denoted as

$$\pi(A,B) = (A_1, \dots, A_w),$$

is a sequence of clusters of grading (a -walk) that satisfies the following properties: , , and, for , either or for some transfer vector .

We will also define a valid sequence of configurations analogously (we allow repeated configurations in a valid sequence of configurations).

We will define both the sequence length and transfer length of configuration sequences. If is a configuration sequence, then its sequence length is defined to be (the number of configurations in ), and its transfer length is defined to be (the sum of the number of edges in all transition graphs of ).
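The two lengths can be sketched as follows, assuming the zero-based matrix encoding and counting zero transfers between consecutive equal configurations. The names `moved_count`, `sequence_length`, and `transfer_length` are hypothetical.

```python
def moved_count(a, b):
    """Number of disks whose post differs between configurations a and b."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for post_a, post_b in zip(row_a, row_b)
        if post_a != post_b
    )


def sequence_length(seq):
    """Number of configurations in the sequence."""
    return len(seq)


def transfer_length(seq):
    """Sum of the edge counts over all consecutive pairs in the sequence."""
    return sum(moved_count(x, y) for x, y in zip(seq, seq[1:]))
```

A sequence that repeats a configuration increases its sequence length without changing its transfer length, and a single parallel stage that transfers two disks contributes two to the transfer length.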

In order to establish results on the transfer length of configuration sequences, we will formally prescribe disk transfers as mappings on such sequences.

###### Definition 2.7 (Translative and Reflective Mappings).

Let the disk configuration , and let .

1. Let . Let the cluster where the disk occupies the post for and . The translative map maps to the configuration , where

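$$T_B(\alpha) = \begin{bmatrix} b_{1,1} & \dots & b_{1,g} & a_{1,g+1} & \dots & a_{1,n} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ b_{t,1} & \dots & b_{t,g} & a_{t,g+1} & \dots & a_{t,n} \end{bmatrix}.$$

2. Let . The reflective map maps to the configuration , where, for and ,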

1. if , then ;

2. if , then ;

3. if , then .
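Both mappings can be sketched on the matrix encoding. The translative map below follows the displayed matrix: the first `g` columns are replaced by the posts fixed by the target cluster. The index range of the reflective map is missing from the text here, so the sketch assumes that it exchanges the posts `q` and `r` only for disks of index greater than `g`, which keeps the image inside the same cluster of grading `g`. The names `translate` and `reflect` and the zero-based indices are hypothetical.

```python
def translate(config, cluster_prefix, g):
    """Replace the posts of the g largest disks of each color.

    cluster_prefix[u] lists the posts that the target cluster fixes for the
    g largest disks of color u.
    """
    return tuple(
        tuple(cluster_prefix[u][:g]) + tuple(row[g:])
        for u, row in enumerate(config)
    )


def reflect(config, g, q, r):
    """Exchange posts q and r for every disk outside the g largest (assumed)."""
    swap = {q: r, r: q}
    return tuple(
        tuple(post if y < g else swap.get(post, post) for y, post in enumerate(row))
        for row in config
    )
```

Under this reading, `reflect` is an involution: applying it twice with the same `g`, `q`, and `r` returns the original configuration.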

We will now identify a class of mappings for which the images of valid configuration sequences are themselves valid.

###### Lemma 2.8 (Mappings on Configuration Sequences).

Let be a valid sequence of configurations. Also, let , and let .

1. For any cluster , the sequence is a valid sequence of configurations contained in .

2. If there exists a cluster that contains , then, for , the sequence is a valid sequence of configurations contained in .

###### Proof.

Assume the hypotheses and notation in the statement of the lemma. For each , let .

1. Assume that the cluster is such that the disk occupies the post for and .

If , then we have the equality .

If for some transfer vector , then write where

$$b^{(v)}_{u,y} = \begin{cases} a^{(v)}_{u,y} & y > g, \\ b_{u,y} & y \le g. \end{cases}$$

The claim is that either or for some transfer vector where .

Let i denote the subset of where

$$i = \{(q,r) \mid \text{the edge label of } (q,r) \text{ is } (u, j_u) \text{ where } j_u > g\}.$$

If is empty, then we have the equality . Otherwise, if where , then and for and (by the definition of the j-adjacency of and ). As , we have that and ; furthermore, as the equalities and hold for , it follows that and . Thus, the configurations and form an i-adjacent pair .

2. If , then we have the equality .

If , then . Thus, we will assume that .

If for some transfer vector j, then, under the assumption that , it must be the case that each edge in has an edge label where . Let , where, for , we define

$$b^{(v)}_{u,y} = \begin{cases} r & a^{(v)}_{u,y} = q, \\ q & a^{(v)}_{u,y} = r, \\ a^{(v)}_{u,y} & a^{(v)}_{u,y} \notin \{q,r\}. \end{cases}$$

We will show that forms a cluster pair.

1. If , then:

1. if and , then , and ; an analogous argument holds for when and ;

2. if and , then , and ; analogous arguments hold for when and , and when and ;

3. if and , then , and .

In all cases, the inequality holds, and the non-triviality condition 2.2.1 is met.

2. If , then and for and , as per condition 2.2.2.

1. if the post and the post , then the post , and the post ; furthermore, for , the post (otherwise, the post ), and the post (otherwise, the post ); an analogous argument holds for when and ;

2. if the post and the post , then the post and the post . Furthermore, the post (otherwise, the post ), and the post

$$b^{(v+1)}_{s,y} = \begin{cases} q & a^{(v+1)}_{s,y} = r \\ r & a^{(v+1)}_{s,y} = q \\ a^{(v+1)}_{s,y} & a^{(v+1)}_{s,y} \notin \{q,r\}; \end{cases}$$

in each case, as the post , we have the inequality . Analogous arguments hold for when the post and , and when the post and the post ;

3. if and , then , and ; as before, whether or not (and whether or not), we have the inequalities and .

In all cases, the stack read/write condition 2.2.2 is met.

## 3 Results

An elementary result of graph theory states that any minimal path in a graph connecting a pair of vertices must be acyclic. The first theorem extends the notion of acyclicity of minimal walks into the context of parallel cluster walks.

###### Theorem 3.1.

Let , let , and let . If the configurations , then every minimal configuration sequence .

###### Proof.

Assume the notation within the hypotheses of the theorem statement. We will show that for any walk where , there exists a walk with the property that and .

As , there exists a unique sequence of clusters of grading g where , and for some transfer vector for each .

Thus, we write as

$$\nu = \sum^{B}_{w} \nu_w,$$

where , and

$$\nu_w = (\alpha_{w,1}, \dots, \alpha_{w,l(w)}).$$

For , the -adjacent configurations satisfy the properties that and where .

By Definition 2.5, the transfer vector

$$i_w = \{(q,r) \mid \text{the edge label of } (q,r) \text{ is } (u, j_u) \text{ where } j_u \le g\} \neq \emptyset;$$

thus, as per the proof of Lemma 2.8.1, we have the following inequality on the edge counts of the transition graphs:

$$\eta(T_A(\alpha_{w,l(w)}), T_A(\alpha_{w+1,1})) < \eta(\alpha_{w,l(w)}, \alpha_{w+1,1})$$

(as the transfer of any disk of index less than or equal to is removed from the set ).

Thus, the walk is contained in as per Lemma 2.8, and .
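The strict decrease in edge count used in this proof can be illustrated numerically. The sketch below assumes the zero-based matrix encoding; the helper names `eta` and `translate` and the sample configurations are hypothetical and are not taken from the article. Translating both configurations into one cluster of grading `g` erases every transfer of a disk of index at most `g`, so the edge count drops whenever such a transfer is present.

```python
def eta(a, b):
    """Number of disks whose post differs between configurations a and b."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for post_a, post_b in zip(row_a, row_b)
        if post_a != post_b
    )


def translate(config, cluster_prefix, g):
    """Overwrite the posts of the g largest disks of each color."""
    return tuple(
        tuple(cluster_prefix[u][:g]) + tuple(row[g:])
        for u, row in enumerate(config)
    )


# Two towers of two disks on four posts, grading g = 1 (illustrative values).
alpha = ((0, 1), (2, 2))
beta = ((3, 1), (2, 0))      # the largest disk of color 0 leaves its cluster
prefix_A = ((0,), (2,))      # posts of the largest disks in the cluster of alpha

before = eta(alpha, beta)
after = eta(translate(alpha, prefix_A, 1), translate(beta, prefix_A, 1))
assert after < before
```

Here the stage from `alpha` to `beta` transfers one large disk and one small disk; after translation only the transfer of the small disk remains.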

The second theorem yields a conditional triangle inequality within the context of parallel clusters.

###### Theorem 3.2.

Fix , and let . Fix where the disk occupies the post for and .

Let be pairwise unequal post values. Fix , and let where for each . Let the clusters and be elements of where the disk occupies the post and , respectively.

Let the configuration be such that the identity

$$R^{g}_{\alpha \mid b}(c) = \alpha$$

holds.

Then, for any configuration , and for any configuration sequence contained in , there exists a configuration sequence contained in that satisfies the inequality

 L(μ)

###### Proof.

Assume the notation and hypotheses within the theorem statement.

By applying Theorem 3.1 to walks connecting configurations in , we can assume that the configuration sequence is contained in the cluster . Furthermore, by applying Theorem 3.1 to walks connecting configurations within the clusters and , we assume that we can uniquely express the walk as

$$\nu = \nu_A + \nu_C + \nu_B,$$

where , and are contained in , and , respectively.

Assume that the subwalk

$$\nu_A = (\alpha_1, \dots, \alpha_x),$$

the subwalk

$$\nu_C = (\gamma_1, \dots, \gamma_z),$$

and the subwalk

$$\nu_B = (\beta_1, \dots, \beta_y).$$

The following conditions hold within the images of and under the reflective map :

1. The cluster by assumption.

2. The image is contained in .

3. The configurations and are j-adjacent for some transfer vector j, and the pair with the edge label .

We will now construct a transfer vector where and .

Firstly, we express the configuration , the configuration , and the configuration for and . We have that the configurations and are elements of ; thus, we have the equalities

$$c_{u,y} = b_{u,y} = c'_{u,y}$$

for .

We now address the case of static disks: if the post for , then it must be the case that (otherwise, the transfer of disk is prohibited). Thus, under the assumption that the disk statically occupies post in , we have that .

We now address the case of active disks in : as the post , and the post , we have that the post , and the edge .

Let with the edge label where . As per Definition 2.2.2, it must be the case that and . Moreover, as with edge label , it follows that and for and .

Under these conditions, we condition by cases as follows. Assume that and :

1. : The post . The post (as per Definition 2.2.2), and the post (whether or not ). The post ; thus, the edge with the edge label .

2. : The post , and as the post , the post . The post ; thus, the edge with the edge label .

3. : We have that , and ; as , it follows that (whether or not ). The post