 # Exploration of High-Dimensional Grids by Finite State Machines

We consider the problem of finding a treasure at an unknown point of an n-dimensional infinite grid, n ≥ 3, by initially collocated finite state agents (scouts/robots). Recently, the problem has been well characterized for 2 dimensions for deterministic as well as randomized agents, both in synchronous and semi-synchronous models. It has been conjectured that n+1 randomized agents are necessary to solve this problem in the n-dimensional grid. In this paper we disprove the conjecture in a strong sense: we show that three randomized synchronous agents suffice to explore an n-dimensional grid for any n. Our algorithm is optimal in terms of the number of agents. Our key insight is that a constant number of finite state machine agents can, by their positions and movements, implement a stack, which can store the path being explored. We also show how to implement our algorithm using: four randomized semi-synchronous agents; four deterministic synchronous agents; or five deterministic semi-synchronous agents. We give a different algorithm that uses 4 deterministic semi-synchronous agents for the 3-dimensional grid. This is provably optimal, and surprisingly, matches the result for 2 dimensions. For n ≥ 4, the time complexity of the solutions mentioned above is exponential in the distance D of the treasure from the starting point of the agents. We show that in the deterministic case, one additional agent brings the time down to a polynomial. Finally, we focus on algorithms that never venture much beyond the distance D. We describe an algorithm that uses O(√n) semi-synchronous deterministic agents that never go beyond 2D, and we show that any algorithm using 3 synchronous deterministic agents in 3 dimensions must travel a distance of Ω(D^{3/2}) from the origin.


## 1 Introduction

Motivated by the self-organizing behaviour of ants and other social insects, swarm robotics leverages the collective capability of a collection of extremely simple and inexpensive robots. Such robots have very limited computation and communication capabilities, and yet can collectively perform seemingly complex tasks such as foraging for food, forming patterns, pulling heavy objects, and playing Für Elise on the piano.

A series of recent papers [25, 23, 22, 13, 18] studies the conditions required for such primitive robots (also called agents or scouts) to search for a treasure placed at an unknown location in an infinite two-dimensional grid. In particular, they consider agents whose behaviour is controlled by a finite automaton (FA), and that can only communicate with other agents that are at the exact same grid location as themselves. Furthermore, this communication is limited to a constant number of bits. The primary question of interest is: how many such agents are needed to search for a treasure located at an unknown location in an infinite $n$-dimensional grid? As shown in [13, 22] for $n = 2$, the answer depends on the computational power of the agents: whether or not they have access to random bits, the amount of memory they have, and whether or not they are synchronized. Note that for randomized algorithms, we require a finite mean hitting time for every node in the grid. The set of agents is fully synchronous if they operate by the same global clock; they are semi-synchronous if in every time slot, a subset of adversarially scheduled agents is active. (In some related literature [22, 18] the same model was referred to as asynchronous. We follow the semi-synchronous terminology of the vast literature on autonomous mobile robots to avoid confusion with a fully asynchronous model.) In our algorithms, all agents are finite automata; full details of the agent models are given in Section 2.

The case of the 2-dimensional grid has been completely characterized. It has been shown that if the agents are deterministic and semi-synchronous, 4 agents are necessary and sufficient. If random bits are available to the agents, 3 agents are necessary and sufficient, regardless of whether they are synchronous or semi-synchronous. Even without random bits, if the agents are fully synchronous, then 3 agents are necessary and sufficient.

It has been proved that 3 agents are necessary to search the 2-dimensional grid, even if they are fully synchronized and randomized. The authors of that result conjectured that in an $n$-dimensional grid, $n+1$ agents would be necessary.

###### Conjecture 1.1

 For any $n$, any search strategy on the $n$-dimensional infinite grid requires at least $n+1$ agents.

The main result of this paper is to disprove the above conjecture: we show that three randomized synchronous agents, or five deterministic semi-synchronous agents, can explore any $n$-dimensional grid. These algorithms are completely different from previous algorithms for grid exploration, and are based on the key insight that a constant number of finite state machine agents can, by their positions and movements, implement a stack that stores the path being explored.

### 1.1 Our results

First, we show that in the 3-dimensional grid, 4 deterministic semi-synchronous agents are sufficient for grid exploration. We give an algorithm which is similar to the known algorithm for the 2-dimensional grid, but with an important modification that enables exploration of the 3-dimensional grid without increasing the number of agents. Our algorithm is optimal in the explored space and also in the number of agents, since 4 agents are necessary to explore even the 2-dimensional grid.

Our main result is an algorithm for 3 randomized synchronous agents to explore an $n$-dimensional grid for any $n$. This result is optimal, since 3 agents are necessary to explore even the 2-dimensional grid. Next we show how to "derandomize" the algorithm with the addition of one agent. If the agents are semi-synchronous, the algorithm can be implemented with the addition of one more agent, in both the randomized and deterministic cases. Table 1 shows our results.

The algorithms mentioned in Table 1, except the 4-agent deterministic semi-synchronous algorithm for the three-dimensional grid, have an exploration cost/time that is exponential in the volume of the smallest ball containing the treasure. In Section 5, we give a deterministic synchronous algorithm for exploring the $n$-dimensional grid that uses 5 agents and takes time polynomial in $D$, the distance from the origin to the treasure. A semi-synchronous implementation of this algorithm uses 6 agents. In Section 6, we give a lower bound of $\Omega(D^{3/2})$ on the distance from the origin that must be travelled by some agent in any 3-agent deterministic synchronous algorithm, and give an algorithm using $O(\sqrt{n})$ deterministic semi-synchronous agents in which no agent travels distance more than $2D$.

### 1.2 Related work

First introduced by Beck and Bellman, the cow-path problem is the problem of minimizing the time required to search for a treasure on an infinite line by a single agent. Since then many variants have been studied, including search on the plane, and by multiple robots [2, 3, 5, 6, 12, 17, 30, 33, 32, 35, 11]. Evacuation, or group search by a set of collaborating robots where the objective is to minimize the time at which the last robot arrives at the treasure, has been the focus of many recent papers (see for example [17, 19]). Two models of communication have been studied for the collaborating robots: wireless and face-to-face. In the latter model, similar to the model we use in this paper, the robots communicate only if they are at the same place at the same time.

Graph exploration is a much-studied problem (see, for example, [21, 29, 9, 8, 14, 28] and references therein). The study of the exploration of labyrinths by agents with pebbles is related to our work. A labyrinth is a two-dimensional grid with some blocked cells; it is called finite if a finite number of cells is blocked. It has been shown that finite 2D labyrinths can be explored by one FA agent with four pebbles, but no collection of FA agents can search 3D mazes. Our algorithm in Section 4 can be implemented using a single FA agent with four pebbles to explore $n$-dimensional grids, albeit with no blocked nodes. Recently it was shown that pebbles are necessary and sufficient for exploration of arbitrary unknown undirected graphs by a single FA agent.

Aleliunas et al. showed that a random walk by a single agent has a polynomial hitting time on a finite graph. On the infinite $n$-dimensional grid, it is known that every node on the grid can be reached with probability 1 if and only if $n \le 2$. However, the mean hitting time of some nodes is infinite, even for $n = 1$. Exploration with 3 non-interacting random walks achieves a finite mean hitting time for all nodes of a one-dimensional grid, but on a two-dimensional grid, this is not possible with any finite number of non-interacting random walks.

A large body of work is devoted to the capabilities of autonomous mobile robots with very limited computational and communication abilities. While we borrow some of the terminology in Section 2, their robots are usually assumed to be identical and anonymous, and communication is limited to being able to "see" each other's positions, regardless of how far apart they are. In contrast, in our model, the robots follow different algorithms (or they can be assumed to start at different states of the same FSM) and can only communicate if they are at the same location, and even then the communication is limited to a constant number of bits. Equivalently, they can be assumed to see the current states of other robots at the same location. This is similar to the "robots with lights" model in the autonomous mobile robot literature.

The research most related to our work was initiated by Feinerman et al., who introduced the problem of $k$ randomized mobile agents, starting from the same initial position, and searching for a treasure at an unknown location on the two-dimensional infinite grid. In their model, the agents are Turing machines, but cannot communicate at all. They show that if the agents have a constant approximation of $k$, the treasure can be found optimally in time $O(D + D^2/k)$, where $D$ is the distance between the initial location and the treasure. Subsequent work considers semi-synchronous and randomized FA agents and shows that the same time complexity can be achieved. The relationship between the number of random bits available and the search time has also been studied.

Emek et al. posed the question of how many agents are required to find the treasure. They studied deterministic as well as randomized agents, synchronous as well as semi-synchronous agents, and FA agents, as well as agents that are controlled by a push-down automaton (PDA). They show that the problem can be solved by any of the following: 4 deterministic semi-synchronous FA agents; 3 deterministic synchronous agents; 3 randomized semi-synchronous FA agents; 1 deterministic FA together with 1 deterministic PDA agent; 1 randomized PDA agent. On the negative side they show that the problem cannot be solved by 2 deterministic (synchronous) FA agents; a single randomized FA agent; a deterministic PDA agent. Cohen et al. prove that at least 2 FA agents are necessary to explore the one-dimensional grid and at least 3 FA agents are needed to explore the two-dimensional grid, thus proving the optimality of the 3 FA-agent deterministic synchronous and randomized semi-synchronous algorithms of Emek et al. Recently it was shown that 3 deterministic semi-synchronous FA agents cannot perform exploration of the 2-dimensional grid, thus proving the optimality of the 4 FA-agent deterministic semi-synchronous algorithm of Emek et al.

## 2 Model and Notation

We use the same models (with the exception of Section 7 on unoriented grids) as in [22, 13]. For completeness, we recall key definitions and introduce some notation in this section.

Our search domain is $\mathbb{Z}^n$ with the Manhattan metric, i.e., the distance between two points $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ is defined as $\sum_{i=1}^{n} |x_i - y_i|$. We refer to $\mathbb{Z}^n$ as the $n$-dimensional integer grid and its elements as grid points, points, or cells. A grid point $x$ is adjacent to every grid point $y$ where, for some $i$, $|x_i - y_i| = 1$ and $x_j = y_j$ for all $j \ne i$. Thus $x$ is adjacent to the $2n$ grid points whose coordinates differ from those of $x$ in exactly one dimension and exactly by 1. We assume that any two grid points cannot be distinguished from each other by an agent, and that includes the origin from which the search starts.
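The grid model above can be illustrated with a minimal Python sketch. The names `Point`, `manhattan`, and `neighbours` are hypothetical helpers introduced only for this illustration; they are not part of the paper.

```python
from typing import Iterator, Tuple

Point = Tuple[int, ...]  # a grid point of the n-dimensional integer grid


def manhattan(x: Point, y: Point) -> int:
    """Manhattan distance: sum of coordinate-wise absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def neighbours(x: Point) -> Iterator[Point]:
    """Yield the grid points that differ from x by exactly 1 in exactly one dimension."""
    for i in range(len(x)):
        for step in (-1, 1):
            yield x[:i] + (x[i] + step,) + x[i + 1:]
```

In three dimensions every point has six neighbours under this definition, each at Manhattan distance 1.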

The search for the treasure in the grid is done using a fixed number of agents. We assume that each agent is a very simple device with very limited communication capabilities. Thus, an agent is modelled by a finite automaton, and two agents can exchange information with each other only when they occupy the same grid location at the same time. Initially, all agents are located in the same grid point. Without loss of generality we assume that this cell is the origin of the grid. The treasure is located at distance $D$ from the origin and this distance is not known to the agents. We assume that the grid is oriented and the edges out of each grid point are labelled by dimensions.

Time is divided into discrete units. In each time unit an active agent performs a single look-compute-move cycle. In the look part of the cycle the agent sees the state of other agents located in its own grid point. In the compute part of the cycle the agent determines, using its own state and those it sees, which adjacent node to move to, if any. The agent also determines its new state. Such a move is then executed in the move part of the cycle. When we consider randomized algorithms, we assume that an agent has access to a random value during each compute cycle, as needed.

We say that the system is synchronous if at each time unit all agents are active. We say that the system is semi-synchronous if at each time unit only a subset of agents, chosen by an adversarial scheduler, is active. In order to avoid trivial cases, one restriction on the adversarial scheduler is introduced — it must schedule each agent infinitely often.

In addition to the question of whether $\mathbb{Z}^n$ can be fully explored by the agents, we are also interested in the efficiency of such exploration procedures. We refer to this measure of interest as the exploration cost. Intuitively, we measure how long it takes for agents to visit all points in a sphere of radius $D$, as a function of $D$. Observe that for fixed $n$ such a sphere contains $\Theta(D^n)$ points, thus any algorithm having exploration cost $O(D^n)$ is optimal to within a constant factor. In the synchronous model, this measure is simply the overall time taken by the robots. In the semi-synchronous model, the adversarial scheduler might schedule only one robot in each time step. In addition, the robot scheduled at a particular time step might be waiting to meet another robot and doesn't have to move. Thus, if we count the overall time taken by the robots, the adversary can make it as large as it desires. The more reasonable notion of the exploration cost in the semi-synchronous case is the total distance travelled by all robots required to visit all points in a sphere of radius $D$. Now that we have discussed this subtlety, we will abuse the terminology and use "exploration cost" and "time" interchangeably.
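To make the size of the search space concrete, the following sketch counts the grid points within Manhattan distance `D` of the origin in `n` dimensions. The function name `ball_size` and the recursion over the first coordinate are choices made for this illustration only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ball_size(n: int, D: int) -> int:
    """Number of points of the n-dimensional integer grid within Manhattan distance D of the origin."""
    if n == 0:
        return 1  # only the empty tuple
    # Fix the first coordinate c (|c| <= D); the remaining n-1 coordinates
    # must lie in a ball of radius D - |c|.
    return sum(ball_size(n - 1, D - abs(c)) for c in range(-D, D + 1))
```

For fixed `n` the count grows as a polynomial of degree `n` in `D`; for `n = 3` the values for `D = 0, 1, 2, 3` are the centered octahedral numbers 1, 7, 25, 63.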

## 3 Exploring 3-dimensional Grids using 4 Semi-Synchronous agents

In this section

denote the vectors

, respectively.

The basic building block of the existing algorithm for the exploration of the 2D grid with semi-synchronous agents is the exploration of the perimeter of a right isosceles triangle containing the origin of the plane. The tip of the triangle is at distance $q$ from the origin, with the shorter sides being the diagonals containing vertices of the grid. Three of the agents are used to mark the vertices of the triangle, and the fourth one does the exploration of the sides. The value of $q$ is increased when the exploration of the triangle is finished, and the exploration of the perimeter of the larger triangle is repeated until the treasure is found.

Our algorithm for 3D grids explores the sphere consisting of points at distance $q$ from the origin. In the Manhattan metric, these points are located on the triangular faces of a regular octahedron whose edges contain grid vertices; see Figure 1.

Figure 1: Sphere of points at distance $q$ from the origin in the Manhattan metric.

Thus, the basic building block of our algorithm is the exploration of all grid points on the surface of the equilateral triangle with vertices , and , where ’s are from . The key to our success is an algorithm for exploring one such triangle using four agents, so that

• the value of $q$ is maintained by the distance between some of the agents while exploring a triangle, so that it can be used for the exploration of all triangles of the octahedron,

• the exploration of all eight triangles can be done in a fixed order, and

• the value of $q$ can be increased for the exploration of the larger sphere after the exploration of the sphere of radius $q$ is finished.

We use the four agents as follows:

• is the active agent doing the exploration,

• is the base agent that remains stationary during the exploration of a triangle, and

• and mark the ends of a line segment to be explored.

The exploration of triangle , , starts with three agents , and located in and in node (see the leftmost triangle in Figure 2), and ends with agent remaining in , while the other agents are all in . The implicit parameters , , , , , where , which control the direction of movements of agents are stored in the state of the agents. The exploration proceeds in phases; in one phase agent travels from to and back along the direction . When agent meets agents and it pushes them one step towards along vectors and , respectively, as shown in the middle and rightmost triangles in Figure 2.

This phase is repeated until and meet at point , at which time the whole triangle has been explored. See the pseudocode of this exploration in Algorithm 1.

Observe that the value of $q$ is maintained during this exploration since at the beginning of each phase. This is the crucial difference between our algorithm and the 2D algorithm, which keeps on increasing $q$ in order to explore the area of a triangle, making it unsuitable for exploring the sphere. Furthermore, at the end of the exploration of triangle , , , the robots are in position to start the exploration of the adjacent triangle with vertices and . It is easy to see that at the beginning of the scan of triangle , , , in the first move of agent from to we could have moved to the location of . After this modification the robots would be in position to start the exploration of the other adjacent triangle with vertices and at the end of the scan of the triangle.

Exploration of the entire sphere of radius $q$ is done by doing a sequence of eight triangle explorations. Such a sequence corresponds to a Hamiltonian cycle in the 3-dimensional hypercube, where vertices (determined by the direction parameters) represent the triangles to be explored. An edge in the Hamiltonian cycle corresponds to the transition to exploring the next triangle, and also to the edge connecting the just explored triangle with an adjacent triangle to be explored. Depending on which triangle is explored next, the input invariant of ExploreTriangle can be made correct by either leaving in place, or bringing to the side of the next triangle, as pointed out above. The order in which the faces are explored is independent of $q$ and the agents keep the corresponding sequence of parameters , , , , , in their states. Once all eight triangles have been explored, the agents are positioned back at the edge of the first triangle. At this point the value of $q$ is increased to $q+1$ by moving agents , and in direction , and in direction .
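One concrete Hamiltonian cycle on the 3-dimensional hypercube is the reflected Gray code. The sketch below uses it to produce an order of the eight sign triples, each standing for one face of the octahedron, in which consecutive faces differ in exactly one sign and therefore share an edge. This is only one possible order, chosen for illustration; the fixed order actually used by the algorithm is not specified here, and the name `face_order` is hypothetical.

```python
from typing import List, Tuple


def face_order() -> List[Tuple[int, int, int]]:
    """Cyclic order of the 8 sign triples in which neighbours differ in one sign."""
    order = []
    for k in range(8):
        gray = k ^ (k >> 1)  # reflected Gray code: successive values differ in one bit
        # Bit b of the Gray code selects the sign of coordinate b (0 -> +1, 1 -> -1).
        order.append(tuple(-1 if (gray >> b) & 1 else 1 for b in range(3)))
    return order
```

Because the Gray code is cyclic, the last face in the list is also adjacent to the first, so the agents end next to the face where they started.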

Thus our algorithm Explore3Dgrid simply repeats the exploration of the sphere of radius $q$, starting with $q = 1$ and increasing $q$ by 1 until the treasure is found.

Now we establish the exploration cost of our algorithm when it looks for the "treasure" located at distance $D$ from the origin. While exploring a triangle on the sphere of radius $q$, the active agent needs $O(q)$ steps to scan one line of the triangle and to move the two agents marking the ends of the line segment. Since there are $O(q)$ such lines, the exploration of a single triangle, i.e., a single facet of the octahedron, costs $O(q^2)$. After that we need to reposition the robots to the beginning configuration of exploring the next facet of the octahedron, and this repositioning costs at most $O(q)$. To explore all facets of the octahedron, we need to repeat this 7 more times. Thus, the exploration of all the facets for a given $q$ costs $O(q^2)$. This needs to be repeated for $q$ from 1 to $D$, and thus our algorithm has exploration cost $O(D^3)$, which is optimal up to a constant factor as discussed in Section 2.
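The decomposition of the Manhattan sphere into eight triangular faces can be checked with a short sketch. It enumerates the grid points of each face directly rather than simulating the agents, so it illustrates only the geometry and the point counts behind the cost estimate. The names `face_points` and `sphere_points` are hypothetical.

```python
from itertools import product
from typing import Set, Tuple

Point3 = Tuple[int, int, int]


def face_points(q: int, signs: Tuple[int, int, int]) -> Set[Point3]:
    """Grid points of the triangular face of the radius-q sphere selected by the sign triple."""
    s1, s2, s3 = signs
    return {
        (s1 * a, s2 * b, s3 * (q - a - b))  # non-negative a, b, c with a + b + c = q
        for a in range(q + 1)
        for b in range(q + 1 - a)
    }


def sphere_points(q: int) -> Set[Point3]:
    """Union of the eight faces: all grid points at Manhattan distance exactly q."""
    points: Set[Point3] = set()
    for signs in product((1, -1), repeat=3):
        points |= face_points(q, signs)
    return points
```

Each face holds $(q+1)(q+2)/2$ points and the whole sphere holds $4q^2+2$ points for $q \ge 1$, so summing over $q = 1, \dots, D$ gives a number of points cubic in $D$.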

Since at least four semi-synchronous agents are needed to explore a 2-dimensional grid, our result is also optimal as far as the number of semi-synchronous agents used by our algorithm is concerned. Thus we have the following theorem.

###### Theorem 3.1

Assume that the treasure is located in a 3D grid at distance $D$ from the origin. Algorithm Explore3Dgrid finds the treasure using 4 semi-synchronous agents, with the exploration cost of $O(D^3)$. This is optimal as far as the number of semi-synchronous agents used, and up to a constant factor in the exploration cost.

## 4 Exploration of n-dimensional Grids

A straightforward generalization of the algorithms for the exploration of 2D grids to $n$ dimensions results in algorithms that use a number of agents that grows with $n$. Consider, for example, such a simple generalization of a randomized 2D algorithm. The basic idea of the $(n+1)$-agent randomized algorithm for $n$ dimensions is to make an $n$-segment walk, starting from the origin, and walking the $i$-th segment along dimension $i$. The lengths of the segments are chosen randomly, and one agent per segment is used to mark its endpoint. This allows the agent to find the way back to the origin and start another random trial.

In essence, this algorithm uses 2 agents per dimension to store in unary the distance travelled in this dimension, and by an appropriate arrangement we can reuse one of the agents in the successive dimension to bring the number of additional agents per dimension to 1.

The main idea of our approach is the realization that it is not necessary to use a separate agent for each stored segment length. Observe that segment lengths are stored and retrieved in this randomized algorithm in the first-in last-out order. Thus this algorithm can be realized if we can implement a stack of the agent's movements. It turns out that we can use a constant number of agents, independent of the grid's dimension, to implement a stack into which the active agent, which does the exploration, stores its walk and which it can then use to return to the origin. The active agent carries the stack along its walk.

### 4.1 The Stack Implementation

The format of data stored in the logical stack is a binary string, where 0 represents continue walking in the current direction and 1 represents switch to the next dimension.
The physical implementation of the stack stores this data by interpreting the reversed string as a binary number and storing it in unary as the distance between two agents located in a row in the first dimension.

We employ the following agents:

• : the active agent that is doing the exploration of the grid; in the semi-synchronous model this is the only agent moving around and manipulating the other agents,

• : the base of the stack, from which measurements are taken, and representing the current logical location of the exploration,

• : the counter agent; this is an auxiliary agent for implementing the stack operations in the semi-synchronous model,

• : the distance agent; its distance from the base stores the content of the stack,

• : the extra agent used in the deterministic algorithms to store an extra copy of the current stack value.

The basic stack operations we need to implement are isEmpty(), push(v) where $v \in \{0, 1\}$, and pop(). Operation isEmpty() simply returns whether the base agent and the distance agent are collocated. Implementation of push() and pop() is model-dependent and given below.
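At the level of arithmetic, the stack operations can be sketched as follows. The sketch tracks only the integer distance between the base agent and the distance agent and does not model how the agents physically walk to change it. The class name `UnaryStack` and the choice to keep the top of the stack in the least significant bit are assumptions made for this illustration.

```python
class UnaryStack:
    """Binary stack encoded as a single non-negative integer (the agent distance)."""

    def __init__(self) -> None:
        self.distance = 0  # base and distance agents start collocated

    def is_empty(self) -> bool:
        return self.distance == 0

    def push(self, v: int) -> None:
        if v not in (0, 1):
            raise ValueError("only bits can be pushed")
        # Doubling shifts the stored string; the new bit becomes the lowest digit.
        # Caveat of this encoding: pushing 0 onto an empty stack leaves it empty.
        self.distance = 2 * self.distance + v

    def pop(self) -> int:
        v = self.distance % 2  # parity of the distance is the top symbol
        self.distance //= 2    # halving removes it
        return v
```

For example, pushing 1, 0, 0, 1 yields distance 9 (binary 1001), and four pops return 1, 0, 0, 1 and leave the stack empty. The distance grows exponentially in the number of pushed symbols, which is the source of the exponential exploration cost.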

#### 4.1.1 Implementing Semi-Synchronous Stack

Algorithms 2 and 3 show the implementation of push and pop operations for the semi-synchronous stack.

#### 4.1.2 Implementing Synchronous Stack

In the synchronous model, we can synchronize the movements of agents to effectively multiply or divide the stack content by 2 without the need for the counter agent; see Figure 3.

Figure 3: Implementation of multiply (left) and divide (right) using synchronous agents. Both odd and even cases are shown for divide.

### 4.2 The Randomized Algorithm

As already stated in the initial part of this section, the main idea of the algorithm is to use the stack to store the random choices during the walk, so that the agent can return to the origin. The agent carries the stack along this walk so that the operations can be applied without the need to search for the stack.

In addition to the stack methods, it uses two new procedures. Procedure random() returns with probability , while moveStack() moves the whole stack one step in the direction specified. Note that since the whole stack is located on a single line, the agent can do that with finite memory.

The algorithm works in rounds, which we number $1, 2, \dots$ and which correspond to the iteration numbers of the outer while loop. At the beginning of each round, the active robot picks a binary string $s = s_1 s_2 \cdots s_n$ uniformly at random; the bit $s_i$ indicates the direction in which the robot is going to explore dimension $i$. Then for each dimension $i$ from $1$ to $n$, the active robot travels for $Z_i - 1$ steps in direction $s_i$, where $Z_i$ is geometrically distributed with $\Pr[Z_i = k] = p^{k-1}(1-p)$ for $k \ge 1$, and the parameter $p$ is to be determined later. Note that we want $Z_i$ to represent the length of the string pushed onto the stack while moving in dimension $i$. Since the string pushed on the stack includes the "separator" between dimensions, we have the term $Z_i - 1$ for the actual number of moves. We call the concatenation of all such moves over all dimensions the logical path of the active robot. If no treasure is found, the active robot uses the stack to retrace its logical path back to the origin by travelling $Z_n - 1$ steps in the direction opposite to $s_n$ first, followed by $Z_{n-1} - 1$ steps in the direction opposite to $s_{n-1}$, and so on. To estimate the exploration cost of each round, we need a simple helper lemma.

###### Lemma 4.1

Let $X$ be the maximal stack size during one iteration of the outer while loop of Algorithm 6. The overall cost of this iteration is $O(X^2)$ when implemented by semi-synchronous agents, and is $O(X)$ when implemented by synchronous agents.

Proof. In the semi-synchronous model, each push() or pop() costs $O(x^2)$, where $x$ is the current stack size, as the active agent zig-zags along the stack. On the other hand, in the synchronous model, the cost of each operation is linear in the stack size. The cost of moving the stack is linear in both models.

As the stack size grows exponentially, and then reduces exponentially, the overall cost is determined by the cost when the stack is the largest, i.e., $O(X^2)$ and $O(X)$ for the semi-synchronous and synchronous models, respectively.

Observe that during a given round the stack holds a binary string of length at most $Z_1 + \cdots + Z_n$, so its maximum size, stored in unary, is at most $2^{Z_1 + \cdots + Z_n}$. The exploration cost of each round is at most the overall length of the logical path (there and back) of the active robot multiplied by the cost of each step of that path, which by Lemma 4.1 is polynomial in the stack size, since we need to perform operations on the stack at each step. Let $c$ be the constant in the $O$ notation such that the exploration cost of a round is at most $2^{c(Z_1 + \cdots + Z_n)}$.

For simplicity, we will assume that the active robot checks for the treasure only at the far end-point of the logical path in each round. This assumption might lead to a more pessimistic upper bound on the exploration cost than if we assumed that the active robot checks for treasure at each grid point that it visits. However, our assumption simplifies the calculations and is sufficient for our purposes.

###### Theorem 4.1

Algorithm 6 locates the treasure in the $n$-dimensional grid in finite expected time, using either 4 semi-synchronous or 3 synchronous agents.

Proof.

Consider the infinite sequence of random variables $X_1, X_2, \dots$, where $X_j$ is the exploration cost of round $j$. Note that the $X_j$ are independent and identically distributed. Consider the exploration cost of a particular round, e.g., $X_1$. Then we have $X_1 \le 2^{c(Z_1 + \cdots + Z_n)}$, where the $Z_i$ and $c$ are as defined above. Then we have

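$$
\begin{aligned}
E(X_1) &\le E\left(2^{c(Z_1+\cdots+Z_n)}\right) \\
&= \sum_{i_1=1}^{\infty}\sum_{i_2=1}^{\infty}\cdots\sum_{i_n=1}^{\infty} 2^{c(i_1+\cdots+i_n)}\, p^{i_1-1}(1-p)\, p^{i_2-1}(1-p)\cdots p^{i_n-1}(1-p) \\
&= \left(\sum_{i_1=1}^{\infty}(2^c p)^{i_1-1}\,2^c(1-p)\right)\left(\sum_{i_2=1}^{\infty}(2^c p)^{i_2-1}\,2^c(1-p)\right)\cdots\left(\sum_{i_n=1}^{\infty}(2^c p)^{i_n-1}\,2^c(1-p)\right) \\
&= 2^{cn}(1-p)^n \frac{1}{(1-2^c p)^n},
\end{aligned}
$$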

where the last step holds as long as $2^c p < 1$, that is, $p < 2^{-c}$.
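The single-dimension factor in this computation is a geometric series: for a variable $Z$ with $\Pr[Z = i] = p^{i-1}(1-p)$, $i \ge 1$, one gets $E(2^{cZ}) = 2^c(1-p)/(1-2^c p)$ whenever $2^c p < 1$. The following sketch compares a truncated sum with this closed form numerically. The function names and the sample values of `c` and `p` are arbitrary choices for the illustration.

```python
def partial_expectation(c: float, p: float, terms: int = 200) -> float:
    """Truncated sum of 2^(c*i) * p^(i-1) * (1-p) over i = 1..terms."""
    return sum(2.0 ** (c * i) * p ** (i - 1) * (1 - p) for i in range(1, terms + 1))


def closed_form(c: float, p: float) -> float:
    """Value of the infinite series; valid only when 2^c * p < 1."""
    ratio = 2.0 ** c * p
    if ratio >= 1:
        raise ValueError("series diverges unless 2^c * p < 1")
    return 2.0 ** c * (1 - p) / (1 - ratio)
```

With `c = 2` and `p = 0.1` the common ratio is 0.4 and both functions give approximately 6. Choosing `p` at or above $2^{-c}$ makes the series diverge, which is why the parameter must be chosen small enough.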

Define a random variable $T$ to be the minimum $j$ such that the far end-point of the logical path of round $j$ coincides with the treasure. That is, our exploration procedure terminates in round $T$, but not earlier. Suppose that the treasure is located at position $(x_1, \dots, x_n)$ where $|x_1| + \cdots + |x_n| = D$. By the discussion immediately preceding the statement of this theorem, the probability that the treasure is found in a particular round is $\hat{p} = 2^{-n} \prod_{i=1}^{n} p^{|x_i|}(1-p)$, where $2^{-n}$ is the probability of guessing correctly the signs of the $x_i$ and $p^{|x_i|}(1-p)$ is the probability of travelling the correct number of steps in dimension $i$. Thus $T$ is geometrically distributed with parameter $\hat{p}$. Therefore, $E(T) = 1/\hat{p}$.

We are interested in bounding the overall exploration cost, that is, $X_1 + \cdots + X_T$. Since the $X_j$ are i.i.d. and $T$ is a stopping time, it follows by a generalization of Wald's equation to stopping times that

$$
E(X_1+\cdots+X_T) = E(T)\,E(X_1) \le \frac{1}{\hat{p}}\, 2^{cn}(1-p)^n \frac{1}{(1-2^c p)^n} < \infty.
$$

This holds as long as we choose $p < 2^{-c}$. Since $p$ is a constant, such a probabilistic coin can be implemented by finite state machines. The statement of the theorem follows by the number of robots sufficient to implement stack operations in each of the models (synchronous vs. semi-synchronous).

### 4.3 The Deterministic Algorithm

The main idea is to exhaustively go over all possible stack contents in increasing order, interpreting each stack as a specification of a walk. We also keep a backup of the initial stack content, and at the end of the walk we use the backup to return to the origin. The back-up stack is stored using an additional agent. The backup is needed, as reading the stack content during the walk destroys it. Note that after the outward walk, we do not logically reverse the stack; hence the return to the origin does not use the same path as the original walk. However, this is not a problem as the walks along different dimensions are commutative.

Finally, we should mention that some generated stacks do not necessarily have the correct format: some may contain too few or too many 1s. However, this is easy to handle by the algorithm: too few 1s just means we walked without using all of the dimensions, which is still a perfectly valid walk. The excessive 1s are simply ignored by taking the first excessive 1 as a directive to end the walk and return to the origin.
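The interpretation of a stack value as a walk, including the handling of too few or too many 1s, can be sketched as below. The sketch works on the integer value of the stack and reads symbols from the least significant bit, which is an assumption of this illustration; the function name `decode_walk` and the explicit `signs` argument (one direction per dimension) are likewise hypothetical.

```python
from typing import Sequence, Tuple


def decode_walk(value: int, n: int, signs: Sequence[int]) -> Tuple[int, ...]:
    """End point of the walk encoded by a stack value: 0 = step, 1 = next dimension."""
    position = [0] * n
    dim = 0
    # Stop when the stack is exhausted (too few 1s) or when all n dimensions
    # have been closed (the next symbol would be an excessive one).
    while value > 0 and dim < n:
        bit = value % 2   # read the top symbol
        value //= 2       # and remove it, as pop() does
        if bit == 0:
            position[dim] += signs[dim]  # continue walking in the current dimension
        else:
            dim += 1                     # switch to the next dimension
    return tuple(position)
```

For instance, the value 20 (symbols 0, 0, 1, 0, 1 read from the top) with `n = 2` and `signs = (1, -1)` ends at `(2, -1)`, and appending extra 1s at the bottom of the stack does not change the end point.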

Using essentially the same arguments as in Lemma 4.1 yields

###### Lemma 4.2

The cost of procedure Walk is $O(X^2)$ and $O(X)$ in the semi-synchronous and synchronous models, respectively, where $X$ is the size of the backup stack.

###### Theorem 4.2

Algorithm 7 locates the treasure in the $n$-dimensional grid with:

• 5 agents and the exploration cost of $O(2^{3D+4n})$ moves in the semi-synchronous model, and

• 4 agents and the exploration cost of $O(2^{2D+3n})$ in the synchronous model.

Proof. The number of agents and the correctness follows easily from the construction.

It remains to sum up the cost of all calls to procedure Walk. Note that each point in space uniquely specifies a valid (i.e., with precisely $n$ 1s) stack. Hence, the valid stack for the treasure at distance $D$ contains $D + n$ digits. Therefore, the overall cost of Algorithm 7 is

$$
2^n \sum_{X=1}^{2^{D+n}} O(X^2) = O\left(2^n \left(2^{D+n}\right)^3\right) = O\left(2^{3D+4n}\right)
$$

in the semi-synchronous model, and, by the same computation with $O(X)$ in place of $O(X^2)$, $O(2^{2D+3n})$ in the synchronous model (the initial factor $2^n$ covers all choices of the direction string).

## 5 Polynomial time solutions

While designing our exploration algorithms in the previous section, we concentrated on minimizing the number of agents used, and the resulting cost of these algorithms is exponential in the volume of the smallest ball containing the treasure. A natural question to ask is whether this is an unavoidable consequence of using only a constant number of agents in the exploration. In this section we show that this is not the case: a single additional agent is sufficient to bring the cost of exploration down to a polynomial in $D$.

The main reason the cost of algorithms in the preceding section is exponential is the number of incorrect stack contents being considered: as grows compared to the fixed , an ever larger proportion of stack contents does not have the correct format and results in repeatedly reaching already explored vertices. To avoid this problem we will efficiently explore an -dimensional cube of side centered at the origin. We again use the stack idea to trace the exploration of . The logical stack content now consists of numbers in -ary alphabet, describing a location within this cube. However, in this case, we also need to store the scale . As before, the stack implementation interprets the logical content as a -ary number and stores it in unary. (This is similar to the simulation of PDAs by counter machines; see Chapter 8.5 in the Hopcroft, Motwani, and Ullman text. However, the details of our implementation are completely different.) Since also needs to be stored on its own, this incurs the additional cost of one agent. However, this allows us to multiply and divide by , which would not have been possible without the extra agent.

The stack is manipulated using the explicit commands:
- isDivisible() which checks the divisibility by ,
- push() which multiplies the stack content by ,
- pop() which divides the stack content by , and
- increment() which increments the top of the stack.
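At the logical level, the four commands are plain arithmetic on one non-negative integer. The following is a minimal sketch, not the agents' implementation: the class name `NumberStack` and the parameter `base` (standing in for the alphabet size) are assumptions, as is the convention that the top of the stack is the least significant digit. Returning the removed digit from `pop` is a convenience added here.

```python
class NumberStack:
    """Logical stack of base-`base` digits held as one non-negative integer.

    The top of the stack is the least significant digit. `base` is a
    hypothetical parameter standing in for the alphabet size in the text.
    """

    def __init__(self, base: int) -> None:
        if base < 2:
            raise ValueError("base must be at least 2")
        self.base = base
        self.value = 0  # the agents would store this in unary, as a distance

    def is_divisible(self) -> bool:
        """True iff the content is a multiple of base, i.e. the top digit is 0."""
        return self.value % self.base == 0

    def push(self) -> None:
        """Push a new digit 0 on top: multiply the content by base."""
        self.value *= self.base

    def pop(self) -> int:
        """Divide the content by base; return the digit that was on top."""
        self.value, digit = divmod(self.value, self.base)
        return digit

    def increment(self) -> None:
        """Add 1 to the top digit; refuse to overflow into the digit below."""
        if self.value % self.base == self.base - 1:
            raise OverflowError("top digit would overflow into the next one")
        self.value += 1
```

For example, with `base = 4`, the sequence increment, push, increment, increment, increment leaves the digits 1 and 3 on the stack, stored as the single number 7.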

### 5.1 Stack operations: semi-synchronous implementation

In addition to agents , and , we use agent to maintain the value of by placing it at . Furthermore, two counter agents and are used. At the beginning of the stack operations, and are collocated, as are and , and and . The basic procedure is a traversal of the whole stack by agent , manipulating the tokens according to the specific command.

In push() (i.e. multiplying the stack content by ), pushes towards and away from . Whenever reaches , a transports it back to as well as pushes one step closer to . The process terminates when reaches ; subsequently and change roles. The detailed procedure is given in Algorithm 8. It is easy to see that the outer loop executes times, where is the size of the stack at the start of the algorithm, and the inner loop times, and each iteration of the inner loop takes at most steps. Thus, the total cost of the push() operation is bounded by where is the size of the stack at the end.
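The nested-loop structure of push() resembles multiplication on a counter machine. The sketch below is an abstraction under stated assumptions: `old`, `new` and `c` are hypothetical counters that may only change by one per update, and the function counts elementary updates rather than agent moves (in the agent setting each update involves a traversal whose length depends on the stack size, which is not modelled here).

```python
from typing import Tuple


def push_unary(x: int, base: int) -> Tuple[int, int]:
    """Multiply a unary counter by `base` using only +1/-1 counter updates.

    Returns (new_value, elementary_updates). `old` and `new` play the role of
    two token distances; `c` plays the role of a counter that runs up to `base`.
    """
    old, new, updates = x, 0, 0
    while old > 0:           # outer loop: once per unit of the old content
        old -= 1
        updates += 1
        c = 0
        while c < base:      # inner loop: `base` units of new content
            new += 1
            c += 1
            updates += 2     # one update for `new`, one for `c`
    return new, updates
```

The outer loop runs once per unit of the original content and the inner loop `base` times per outer iteration, matching the loop counts described above.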

In isDivisible(), pushes towards and towards , until reaches . Whenever arrives at , it is transported back to . isDivisible() returns true iff at the moment when reaches , is at (or ).

pop() means dividing the stack by . The process is essentially the reverse of push(): in every iteration/traversal of the stack, and are pushed towards . Whenever reaches , it is brought back to and is pushed away from . When reaches , and exchange their roles.

The detailed pseudocode for isDivisible() and pop() is straightforward and is omitted.

### 5.2 Stack operations: Synchronous implementation

A straightforward application of the technique from Section 4 would need agents traveling at speed (for multiply) and (for divide), which is impossible with finite state agents.

Instead, we take to be a power of two and implement the operation of multiply, divide by via repeated applications of multiplication by , division by , respectively. Thus in this case is placed at distance from , instead of placing it at distance of from . The counter is used to count the number of multiplications/divisions already performed, while the counter is not used at all, i.e. only agents , , , and are needed. The operations of doubling and halving were already described in Section 4 and shown to take time. Since these operations are performed times, the total time complexity of every stack operation is .
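The reduction to doubling and halving can be sketched as follows. The parameter `log_base` (the exponent of the power of two) is a name introduced here for illustration; the loop variable plays the role of the counter that records how many doublings or halvings have been performed.

```python
def multiply_by_power_of_two(x: int, log_base: int) -> int:
    """Multiply x by 2**log_base using log_base doubling passes."""
    for _ in range(log_base):   # counter: doublings performed so far
        x += x                  # one doubling pass
    return x


def divide_by_power_of_two(x: int, log_base: int) -> int:
    """Divide x by 2**log_base (rounding down) using log_base halving passes."""
    for _ in range(log_base):   # counter: halvings performed so far
        x //= 2                 # one halving pass
    return x
```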

### 5.3 Fast deterministic grid exploration

Our polytime deterministic grid exploration algorithm is described in Algorithm 9. Starting with , and for any fixed value of , the algorithm generates and visits the addresses (-tuples from a -ary alphabet) in lexicographic order. Then the agent moves to position , doubles the value of , and moves on to the next iteration. Agent always drags the stack along as it performs the exploration. The procedure is a recursive procedure to generate -tuples in lexicographic order; it is called with logical stack content an -tuple . It then iteratively calls to visit the -dimensional cube of side with as the origin, for ranging from to .

Note that the algorithm as shown in Algorithm 9 is presented using recursive calls for convenience; however, is maintained in the local state.
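The enumeration order can be sketched without recursion. This is an illustrative odometer, not Algorithm 9 itself: the names `addresses` and `explore` are hypothetical, and starting the side length at 1 is an assumption, since the initial scale is not legible in the text above.

```python
from typing import Iterator, Tuple


def addresses(n: int, side: int) -> Iterator[Tuple[int, ...]]:
    """Yield all n-tuples over {0, ..., side-1} in lexicographic order.

    Iterative odometer: the index `i` is the only bookkeeping besides the
    tuple itself, mirroring the remark that the recursion depth can be kept
    in local state.
    """
    digits = [0] * n
    while True:
        yield tuple(digits)
        i = n - 1
        while i >= 0 and digits[i] == side - 1:   # carry past full digits
            digits[i] = 0
            i -= 1
        if i < 0:
            return                                # wrapped around: done
        digits[i] += 1


def explore(n: int, max_side: int) -> Iterator[Tuple[int, int]]:
    """Yield (side, number of addresses visited) for doubling side lengths."""
    side = 1                                      # assumed initial scale
    while side <= max_side:
        yield side, sum(1 for _ in addresses(n, side))
        side *= 2
```

Since the side doubles in every outer iteration, the number of addresses grows geometrically and the last iteration dominates the total.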

###### Theorem 5.1

Let be the volume of the ball of diameter in the -dimensional grid. Algorithm 9 locates the treasure in the -dimensional grid with:
agents and the exploration cost of moves in the semi-synchronous model, and
agents and the exploration cost of in the synchronous model.

Proof. The number of agents and the correctness follow easily from the construction.

It remains to sum up the cost of all stack operations on a stack of size . As already described, the cost of each stack operation is and in the semi-synchronous and synchronous models, respectively. The maximal stack size is bounded by , which is also the number of points covered by the stack base during one iteration of the outer loop (i.e. for fixed ). This results in the overall cost of and in the semi-synchronous and synchronous models, respectively. As grows exponentially, the overall cost is determined by the cost for the last value of .

Finally, it is known that . As (the treasure would have been found if ), we get that , where is a constant. This proves the theorem.

## 6 On the Size of the Explored Space

In our exploration algorithms for general in Sections 4 and 5, the agents employed a stack of size exponential in . In this section we address the question of whether such behavior is necessary, or if there are exploration algorithms, which we call space-efficient, that limit the size of the space visited during the exploration to a constant factor of . In Subsection 6.1, we present high-level details of a space-efficient algorithm that uses more than a constant number of agents (but still ). While we are unable to prove the general lower bound saying that synchronous agents cannot explore , in Subsection 6.2 we show that there is no space-efficient algorithm with synchronous agents to explore . More specifically, every algorithm with agents that explores all grid points within distance must have an agent travel distance away from the origin at some point in time.

### 6.1 Space-Efficient Exploration with Many Agents

The main idea for limiting the visited space is to encode the needed information in a more compact way, using more agents. Our previous solutions had a single active agent doing the exploration, and a constant number of agents that implement a stack that stores several numbers encoded as a single number and represented as a distance between two agents. Now, instead of a stack, the active agent carries around a -dimensional sub-cube of side length . A non-active agent inside such a sub-cube can be used to represent base numbers from — consider simply the coordinates of the non-active agent relative to the origin of the sub-cube. Therefore, agents can be used to represent numbers from . When these numbers are juxtaposed they correspond to a single -digit base number — the coordinate of the grid point that is currently being explored by the active agent inside . The active agent needs to be able to explore the sub-cube, reorganize all non-active agents inside the sub-cube to point to the next -digit base number, and move itself and the entire sub-cube to the new location indicated by the updated positions of non-active agents. The active agent needs to be able to reorganize all non-active agents in such a way as to enumerate all possible -digit base numbers. Once that happens, the agent can run the protocol in reverse, return to the origin, increment and repeat the process.

Although the details are tedious and omitted in this version of the paper, one can easily verify that the active agent can perform the operations required to enumerate all -digit base numbers: increment a number, which means push the corresponding non-active agent inside the cube by 1 along one of the axes; check whether the number has reached , which means check if some non-active agent is on some facet of the sub-cube; set the number to zero, which means bring a non-active agent located on some facet of the cube to the opposite facet of the cube; and proceed to the next/previous digit, which corresponds to modifying the internal state of the active agent. These tasks might require extra agents, but we can always employ a simple algorithm using agents to explore an -ary sub-cube of side without ever leaving the sub-cube. Altogether agents are sufficient to implement this entire scheme. The overall exploration cost is as visiting each node incurs the overhead of traversing the whole memory of size . This improves upon the results from the previous section in terms of exploration cost, and simultaneously limits the exploration to points at distance at most from the origin.
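The four primitive operations named above are enough to step through all digit combinations. The sketch below is a hypothetical model, not the omitted construction: `coords[j]` stands for one coordinate of one non-active agent inside the sub-cube, and the class and method names are invented for illustration.

```python
from typing import List


class SubCubeCounter:
    """Digits of a base-`side` number stored as agent coordinates.

    Hypothetical model: coords[j] is one coordinate of one non-active agent
    inside the sub-cube, always in the range 0..side-1.
    """

    def __init__(self, num_digits: int, side: int) -> None:
        self.side = side
        self.coords: List[int] = [0] * num_digits

    def at_far_facet(self, j: int) -> bool:
        """Check whether digit j has reached its maximum (agent on a facet)."""
        return self.coords[j] == self.side - 1

    def push_by_one(self, j: int) -> None:
        """Increment digit j: push the agent one step along its axis."""
        self.coords[j] += 1

    def reset(self, j: int) -> None:
        """Set digit j to zero: bring the agent to the opposite facet."""
        self.coords[j] = 0

    def next_number(self) -> bool:
        """Advance to the next number; return False after the last one."""
        j = len(self.coords) - 1          # current digit, kept in agent state
        while j >= 0 and self.at_far_facet(j):
            self.reset(j)                 # carry into the previous digit
            j -= 1
        if j < 0:
            return False
        self.push_by_one(j)
        return True
```

Only `j` needs to live in the active agent's finite state; the digits themselves are stored in the positions of the other agents.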

This technique can be applied recursively: Let denote the number of dimensions we can explore at logical level using agents. For level , we can use agents in -dimensional space to encode numbers, yielding and . If we try to minimize with respect to , we will choose for all , resulting in , i.e. agents are sufficient to explore -dimensional grids while limiting the visited space to .

### 6.2 Lower Bound on the Visited Space for 3 Synchronous Agents for n=3

Using the techniques and results from , it is possible to show that for any distance there are only configurations in which two agents are collocated and the third one is at distance . Furthermore, again based on previous results we know that there must be infinitely many meetings between pairs of agents. As the agents are finite automata moving (when looking from sufficiently far above) in straight lines, the number of explored vertices between two consecutive meetings is . Combining with the fact that there are grid points in the ball of radius and the fact that the number of meetings at distance less than is (and hence, their total contribution to the number of explored nodes is at most ) means that in order to visit all vertices in the ball of radius the distance between agents must have been at some moment before locating the treasure. In what follows, we give formal arguments supporting the above intuition.

Suppose that we have an algorithm that uses several agents to find a treasure at an unknown location. Observe that if we run such an algorithm on an empty grid, i.e., without a treasure at all, then eventually every grid point has to be visited by some agent – this is an equivalent view of a treasure search algorithm. Throughout this section we will often adopt this point of view and think of a treasure search algorithm as running on an empty grid and having to “cover” all grid points eventually.

We first start with a few general definitions and lemmas that apply to any dimension of the ambient space.

###### Definition 6.1

A cylinder in direction of radius from origin is the set

$$\{x \in \mathbb{R}^n \mid \exists t \in \mathbb{R} \text{ such that } \|x - (tv + x_0)\|_\infty \le r\}.$$
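Membership in such a cylinder can be tested exactly by intersecting, coordinate by coordinate, the intervals of admissible values of the parameter. The sketch below assumes integer inputs; the function name `in_cylinder` and its argument order are choices made here for illustration.

```python
from fractions import Fraction
from typing import Optional, Sequence


def in_cylinder(x: Sequence[int], v: Sequence[int], x0: Sequence[int], r: int) -> bool:
    """Return True iff some real t satisfies ||x - (t*v + x0)||_inf <= r."""
    lo: Optional[Fraction] = None   # feasible interval [lo, hi] for t
    hi: Optional[Fraction] = None
    for xi, vi, oi in zip(x, v, x0):
        d = xi - oi
        if vi == 0:
            if abs(d) > r:          # no choice of t can fix this coordinate
                return False
            continue
        # |d - t*vi| <= r  <=>  t lies between (d - r)/vi and (d + r)/vi
        a, b = Fraction(d - r, vi), Fraction(d + r, vi)
        if a > b:
            a, b = b, a             # a negative vi flips the interval
        lo = a if lo is None else max(lo, a)
        hi = b if hi is None else min(hi, b)
    return lo is None or lo <= hi
```

For instance, with direction `(1, 0, 0)`, origin `(0, 0, 0)` and radius 1, the point `(5, 1, -1)` lies in the cylinder while `(5, 2, 0)` does not.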

Intuitively, between meetings with each other, agents move along cylinders. Precise statements follow below, but for now we make the easy observation that no finite number of cylinders can cover all grid points.

###### Lemma 6.1

Finitely many cylinders cannot cover all of .

Proof. Consider cylinders with radii . Now consider a ball of radius . The number of integral points within the ball is . The number of integral points within the ball that are covered by cylinder is at most (the height of the cylinder relevant to the ball is at most ). Assume for contradiction that the cylinders cover all of , then they cover all the integral points within the ball as well. Thus, we must have

$$\Theta\left(\sum_{i=1}^{k} r_i^{n-1} R\right) \ge \Theta(R^n).$$

The left hand side is a linear function of , while the right hand side is a polynomial of degree of . Thus, a large enough value of would violate the inequality. This leads to a contradiction.
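The contradiction can be made explicit by naming the hidden constants. In the following worked step, $c$ and $C$ are hypothetical positive constants (depending only on $n$) for the lower bound on the number of integral points in the ball of radius $R$ and the upper bound on the points covered by a cylinder of radius $r_i$, respectively.

$$c\,R^n \le C\,R \sum_{i=1}^{k} r_i^{n-1} \quad\Longrightarrow\quad R^{n-1} \le \frac{C}{c} \sum_{i=1}^{k} r_i^{n-1}.$$

The right-hand side does not depend on $R$, so for $n \ge 2$ the inequality fails once $R$ is large enough.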

The following definition makes precise the local view of the world by a set of agents.

###### Definition 6.2

Consider a set of agents in at a particular time . The agents are at positions and in states . The tuple is called the configuration of the agents at time . The tuple where is called the relative configuration of the agents at time . Note that to obtain the relative configuration we simply shift the origin of the coordinate system to agent .

The following is the main helper lemma that will be used multiple times to establish the precise behavior of 3 agents in . It relates repeating relative configurations to cylinders.

###### Lemma 6.2

Consider agents exploring . Suppose that the agents interact only with each other and no other agents from some time onward. If the relative configuration repeats then all the grid points visited by the agents fall within a cylinder of radius and direction , where and depend only on the original relative configuration.

Proof. Let denote the absolute configuration at time , and let denote the relative configuration at time . Consider the such that . Since the agents are deterministic finite state automata, they are going to repeat exactly the same sequence of steps from as they did from — the relative configuration corresponds to the view of the world as perceived by the agents. Thus, the same pattern will repeat starting from . Thus, the pattern of exploration by the agents shifts by the vector . Let denote the number of grid points visited by the agents until time , let denote the number of grid points visited by the agents between times and . Finally, let . Therefore all the grid points visited by the agents from onward fall within the cylinder in direction of radius and origin . It is clear that and depend only on .
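The shifting pattern in the proof can be observed on a toy example. The sketch below simulates a single hypothetical finite-state agent in two dimensions, given by a transition table `state -> (move, next state)`; the table `TABLE` and all function names are illustrative assumptions. For a lone agent the relative configuration is just its state, so a repeated state plays the role of a repeated relative configuration. The radius computed here is a simple bound for this toy setting, not the quantity defined in the proof.

```python
from typing import Dict, List, Tuple

Vec = Tuple[int, int]
Table = Dict[str, Tuple[Vec, str]]   # state -> (move, next state)

# Hypothetical 4-state agent: right, up, right, down, then repeat.
TABLE: Table = {
    "a": ((1, 0), "b"),
    "b": ((0, 1), "c"),
    "c": ((1, 0), "d"),
    "d": ((0, -1), "a"),
}


def simulate(table: Table, start: str, steps: int) -> List[Vec]:
    """Return the positions visited during `steps` moves, starting at (0, 0)."""
    pos, state, out = (0, 0), start, [(0, 0)]
    for _ in range(steps):
        (dx, dy), state = table[state]
        pos = (pos[0] + dx, pos[1] + dy)
        out.append(pos)
    return out


def drift_and_radius(table: Table, start: str) -> Tuple[Vec, int]:
    """Run the agent until its state repeats.

    Returns (v, r): v is the shift between the two occurrences of the
    repeated state, and r is the largest L-infinity distance from the start
    reached before the repetition. Every later position is an earlier one
    shifted by a multiple of v, so it stays within distance r of the line
    through the start with direction v.
    """
    pos, state = (0, 0), start
    first_seen: Dict[str, Vec] = {}
    radius = 0
    while state not in first_seen:
        first_seen[state] = pos
        (dx, dy), state = table[state]
        pos = (pos[0] + dx, pos[1] + dy)
        radius = max(radius, abs(pos[0]), abs(pos[1]))
    fx, fy = first_seen[state]
    return (pos[0] - fx, pos[1] - fy), radius
```

With `TABLE` as above, the state repeats after four moves and the whole trajectory is the first period translated by multiples of the drift vector.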

The following lemma collects several facts about the behaviors of 1, 2, and 3 agents, respectively.

###### Lemma 6.3
1. Consider moves of a single agent in between meetings with other agents. The agent explores lattice points that fall within a cylinder in direction of radius . The direction and the radius depend only on the state of at the beginning of the movement.

2. Consider moves of two agents and that start from the same grid point until one of the agents meets an agent different from or . Then only one of the following is possible:

   1. Agent visits grid points that fall within a cylinder in direction of radius , where and depend on the pair of states of agents and at the beginning of the movement; or

   2. Both agents visit grid points that fall within a single cylinder in direction of radius , where and depend on the pair of states of agents and at the beginning of the movement.

3. Consider 3 agents in that run a protocol for exploring all of . All three agents cannot meet simultaneously infinitely often.

Proof. All the statements are easy consequences of Lemma 6.2.

1. Consider that never meets any other agent. Since can have finitely many states and its relative configuration is , where is the state of at time , some relative configuration has to repeat. Then by Lemma 6.2 all grid points visited by lie within some cylinder with direction and radius that depend on the state of at the beginning of the considered time interval.

We now claim that the cylinder defined as above covers all grid points that are visited by even if the number of steps until meeting another agent is finite. That is because the path of agent until time is a subpath of the path of agent until time , if does not meet any other agent until .

2. First suppose that and never meet an agent different from and . Then there are two possibilities: either (a) agents and meet each other finitely many times, or (b) agents and meet each other infinitely often.

In case (a), consider the step immediately after the last time and meet each other. By assumption, and do not meet any other agents, thus we can apply the first part of this lemma to each of them separately. By adjusting the radius of the cylinders we can also cover all grid points that were visited until the last time and met each other. This proves the first subpart.

In case (b), consider relative configurations at times when and meet each other. These relative configurations are of the form . Thus, there are only finitely many possible relative configurations. Since and meet each other infinitely often, they must repeat some relative configuration. The second subpart of the statement follows by Lemma 6.2.

If one of the two agents and eventually meets an agent different from and , then by the same argument as in the proof of the first part of the lemma, the grid points visited until then still fall within the cylinders defined above.

3. Consider relative configurations at times of the meetings. They are of the form . Therefore, there are only finitely many possible relative configurations. Assume for contradiction that the agents meet infinitely often, then some relative configuration has to repeat. By Lemma 6.2 all the grid points visited by the three agents fall within a cylinder. Since all of cannot be contained within a single cylinder, we get a contradiction.

###### Lemma 6.4

Consider 3 agents in that run a protocol for exploring . Then each agent has to meet with some other agent infinitely often.

Proof. Suppose for contradiction that we have an agent that meets the two other agents only finitely often. Then by Lemma 6.3, it explores only grid points that fall within some cylinder . By the same lemma, the two remaining agents either explore their own cylinders and , or they explore a combined single cylinder . Thus, we get that can be covered by either 2 or 3 cylinders, which contradicts Lemma 6.1.

The following lemma justifies why the number of meeting points between two agents where the third is at some distance is bounded by a constant independent of .

###### Lemma 6.5

Consider 3 agents in that run a protocol for exploring . By part (3) of Lemma 6.3, we can consider large enough such that all three agents never meet after time . Consider only those times after such that two of the three agents are collocated: . Define to be the distance between the two collocated agents at time and the lone agent. There is a universal constant such that for all we have

 |{i∣d(ti)=d}|≤k.

Proof. The essence of the proof is to show that only finitely (i.e., at most ) different relative configurations can give rise to the same value . If we prove this, then it means that in the entire trace of the exploration, the value can be incurred at most times. Otherwise, some relative configuration would have to repeat, meaning by Lemma 6.2 that the three agents only explore grid points falling within a cylinder, so they cannot explore the entire .

Now, consider the situation where and are collocated and is at distance at time . Moreover, suppose that the next time when two agents meet each other will be when meets with at some later time. We need to bound the number of possible coordinates of relative to at the beginning of the movement, i.e., at . There are only finitely many directions and radii of cylinders within which can move, and similarly for (since they depend only on the states of agents). Consider one such cylinder for with direction and radius . Similarly, consider one such cylinder which corresponds to with direction and radius . The possible starting locations of (relative to ) have to satisfy

1. ,

2. .

It is easy to see that there are only finitely many vectors that satisfy the above conditions. First note that there are only finitely many error terms such that . Thus, it is sufficient to fix one such and show that there are finitely many that satisfy and . We can rewrite the second condition as , which gives us three equations and 4 unknowns (coordinates of and ). The fourth equation is given by . We can consider this to split into 8 cases depending on signs of coordinates of , each case giving at most one solution to the overall system. Overall, we get that for each and each there can only be finitely many starting points of such that and do not miss each other, while exploring along the corresponding cylinders. Since the number of possible values of is also finite, the statement of the lemma follows.

Combining the results proven so far, we can show the key result.

###### Theorem 6.1

Consider 3 agents in that run a protocol for exploring . Suppose that by time the maximum distance of an agent from the true origin is at most . Then the number of grid points visited by all three agents by time is .

Proof. The assumption implies, in particular, that at all meeting times the distances satisfy . Between meeting times, the agents explore along some cylinders. Whenever an agent explores along a cylinder, the agent visits only a linear number of grid points (since the width of the cylinder is constant). Thus, an agent can only explore grid points between two meetings. By Lemma 6.5 we have , i.e., there can only be a linear (in ) number of meetings. Multiplying the two estimates gives an upper bound on the total number of visited grid points.

The following easy corollary is one of the main conclusions of this subsection.

###### Corollary 6.1

Consider 3 agents in that run a protocol for exploring . In order to visit all grid points in the ball of radius the distance of some agent from the origin must have been .

We believe that it should be possible to extend the above result to the case of general and agents. Namely, the desired statement is that if agents visit all grid points inside the ball of radius then one of the agents has to visit a location that is away from the origin. The proof, which would be based on extended versions of Lemmas 6.4 and 6.5, would proceed by induction on the number of agents. The formal proof is deferred to the complete version of the paper.

## 7 Unoriented Grids

In this section we consider exploration of unoriented grids. In such grids, the incident edges to each node are labelled by different labels from . Note that each edge receives two labels (port numbers), one on each end. However, no global consistency among edge labels can be assumed. This, together with the agent’s finite memory, means that a lone agent cannot cross any non-constant distance, as the irregular nature of the port labels would lead it astray, never to meet any other agent. It is, therefore, an interesting question to ask “How many additional agents are necessary to solve the problem in unoriented grids?”

Figure 4: Left: The global directions. Right: Handrail in 2D. Full lines correspond to α that lead to computing v0 and v2. Dashed lines returned to v but did not satisfy α3=3. Dotted lines did not return to v or backed on themselves.

In this section we show that one additional agent is sufficient in the semi-synchronous model, while two additional agents are sufficient in the synchronous model. This result is obtained by employing and generalizing from the handrail technique of , which allows a moving agent to establish by local exploration the relationship between the port labels of vertices it is moving through, in effect carrying the orientation along. However, an auxiliary agent is needed to achieve this. As all our algorithms using semi-synchronous agents are in fact algorithms for one active agent using the remaining agents as tokens with IDs, a single auxiliary agent is sufficient. In the case of synchronous agents, at most two of them are active agents, which implies that two auxiliary agents are sufficient.

In the remainder of this section, we sketch the handrail technique. Combining it with the algorithms for the oriented grid is straightforward for the semi-synchronous case, while a bit more care about the timing is needed in the synchronous case.

Let denote addition modulo . Let . Then the global direction corresponds to increasing the position in dimension , while direction corresponds to decreasing the position in dimension .

Let for denote the port label at node of the edge leaving in the direction , and let , the orientation at , denote .

Assume the agent knows the orientation at the current node and it wants to move, as in the algorithm for oriented grid, to ’s neighbour . Applying the following procedure allows to compute ; this means can maintain the global orientation while moving.

The correctness of Algorithm 10 is based on the fact that in a grid, the only way to return to via direction after a -step walk which never backtracks on itself is when the four steps were in directions for . Note that the cost, i.e., the number of moves or time, of Algorithm 10 is , i.e. a constant. With a more careful approach (at a cost of more complex presentation), this can be reduced to .
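The geometric fact behind this argument can be checked by brute force on a small oriented grid. The sketch below is not Algorithm 10; under the assumption that the walk in question has four steps, it enumerates all 4-step walks that return to their starting point without ever reversing the previous step and confirms that each of them traces a unit square, i.e. the third step undoes the first and the fourth undoes the second. All function names are chosen here for illustration.

```python
from itertools import product
from typing import List, Tuple

Vec = Tuple[int, ...]


def unit_steps(n: int) -> List[Vec]:
    """All 2n axis-parallel unit vectors of the n-dimensional grid."""
    steps = []
    for i in range(n):
        for s in (1, -1):
            steps.append(tuple(s if j == i else 0 for j in range(n)))
    return steps


def neg(a: Vec) -> Vec:
    """The opposite direction."""
    return tuple(-c for c in a)


def closed_non_backtracking_4_walks(n: int) -> List[Tuple[Vec, ...]]:
    """4-step walks that return to the start and never reverse the previous step."""
    out = []
    for w in product(unit_steps(n), repeat=4):
        if any(w[i + 1] == neg(w[i]) for i in range(3)):
            continue                                # immediate backtrack
        if all(sum(c) == 0 for c in zip(*w)):       # net displacement is zero
            out.append(w)
    return out
```

In three dimensions this yields 24 walks: 6 choices for the first step, 4 perpendicular choices for the second, and the last two steps are then forced.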

## 8 Conclusions and Open Questions

We studied the exploration of -dimensional grids for by finite state automata agents. We showed the surprising result that three randomized synchronous agents suffice to find a treasure in an -dimensional grid for any ; this is optimal in the number of agents. Our strategy can also be implemented by four randomized semi-synchronous agents, or four deterministic synchronous agents, or five deterministic semi-synchronous agents. For the three-dimensional case, we gave a different algorithm for the deterministic semi-synchronous case that uses only 4 agents, and is optimal. Our algorithms for require agents to travel far away from the origin, i.e., exponential in distance away, while looking for a treasure which is located at distance from the origin. We also considered the question of whether it is possible to design algorithms that use few agents and do not require travelling much further than distance away from the origin in order to explore the entire ball of radius around the origin. We a