HouseExpo: A Large-scale 2D Indoor Layout Dataset for Learning-based Algorithms on Mobile Robots

03/23/2019 ∙ by Tingguang Li, et al. ∙ The Chinese University of Hong Kong

As one of the most promising areas of robotics, mobile robots have drawn much attention in recent years. Current work in this field is often evaluated in a few manually designed scenarios, due to the lack of a common experimental platform. Meanwhile, with the recent development of deep learning techniques, some researchers have attempted to apply learning-based methods to mobile robot tasks, which require a substantial amount of data. To satisfy this demand, in this paper we build HouseExpo, a large-scale indoor layout dataset containing 35,357 2D floor plans with 252,550 rooms in total. Alongside the dataset we develop Pseudo-SLAM, a lightweight and efficient simulation platform that accelerates data generation and thereby speeds up training. In our experiments, we build models to tackle obstacle avoidance and autonomous exploration from a learning perspective, in simulation as well as in real-world experiments, to verify the effectiveness of our simulator and dataset. All data and code are available online, and we hope HouseExpo and Pseudo-SLAM can meet the community's need for data and benefit the whole community.


I Introduction

With the significant achievements in the AI field [1], the investigation of learning-based methods in robotics has received increasing attention in recent years. A number of algorithms have been developed for mobile robots, ranging from autonomous exploration [2] to mapless navigation [3][4], all achieving impressive results.

These achievements suggest huge potential in applying learning-based methods to mobile navigation problems. However, for learning-based methods, the issue of data requirements must be addressed first. The size and diversity of the training data are crucial for performance and strongly influence a method's generalization ability. Since problems in robotics involve interaction with environments, collecting data in real-world settings is impractical given the resource and time costs. There is therefore a need for a large-scale dataset and a high-performance simulator to speed up the training process. On the other hand, it is still challenging for current datasets and simulators to meet such demands. For the existing 2D environment datasets, the size as well as the variability is limited [5][6][7][8], which adversely affects algorithm performance. As for simulators, building the map through Simultaneous Localization And Mapping (SLAM) is time-consuming, which becomes a bottleneck when training neural networks that routinely involve millions of trial-and-error episodes. These issues motivate us to develop a large-scale dataset, HouseExpo, and a fast simulation platform, Pseudo-SLAM, to improve training efficiency.

Fig. 1: House samples from the HouseExpo dataset. Black pixels denote obstacles and white pixels denote free space.

HouseExpo (the dataset and the simulation platform code are available at https://github.com/TeaganLi/HouseExpo/) is a large-scale 2D floor plan dataset built on the SUNCG dataset [9], consisting of 35,357 human-designed 2D house blueprints with 252,550 rooms in total, ranging from single-room studios to multi-room houses (some map samples are displayed in Fig. 1). The details of the dataset generation pipeline are presented in Section II.

Pseudo-SLAM is a lightweight simulation platform with an OpenAI Gym-compatible interface [10] that simulates SLAM and the navigation process in an unknown 2D environment. It reads the data from HouseExpo, creates the corresponding 2D environment and spawns a mobile robot to carry out different tasks in this environment. A detailed introduction is given in Section III.

To demonstrate the effectiveness and efficiency of HouseExpo and the simulator, we re-examine two tasks, obstacle avoidance and autonomous exploration, based on Deep Reinforcement Learning (DRL) in the experiments. We also run a real-world experiment on a TurtleBot with the policy trained in our simulator, without additional fine-tuning. The results show that the knowledge learned in Pseudo-SLAM can be transferred to the real world and that indoor spatial structure can help guide the exploration process.

In summary, our work has the following contributions:

  • A large-scale 2D indoor layout dataset is built, containing 35,357 maps for problems like exploration.

  • A high-speed simulation platform is developed to improve the training efficiency of DRL networks.

  • The effectiveness of HouseExpo and Pseudo-SLAM is verified via simulation and real-world experiments.

II HouseExpo Dataset

In recent years, many researchers have attempted to apply deep learning techniques to mobile robots. However, one of the main difficulties of training deep neural networks is the lack of large datasets with diverse samples. On the one hand, the sizes of existing 2D floor plan datasets are limited. As far as we know, the largest 2D floor plan datasets are the MIT campus dataset [7] and the KTH campus dataset [8], each containing only a limited number of floor plans. Apart from their limited size, the lack of diversity in their samples is another concern. Both the MIT and KTH datasets are collected from campus buildings, so the locations of rooms obey particular distributions that may not exist in more commonly encountered environments like homes and offices, e.g. rooms are orderly arranged along corridors, which constrains the variety of samples and restricts their application scenarios. Furthermore, neither of these two datasets considers the importance of connectivity between rooms. At the training stage, robots are routinely initialized at a random location at the beginning of each episode. If the dataset has a low connectivity level, i.e. it contains many isolated rooms that cannot be accessed from adjacent rooms, robots are likely to be initialized in such rooms and can barely learn anything.

Due to the lack of a large-scale 2D floor plan dataset, current work is evaluated either in simple simulated environments [11][12], lacking realism in terms of spatial structure, or in a limited number of similar scenes [4], deficient in verifying generalization capacity. In view of this, we create the HouseExpo dataset, consisting of 35,357 environments with a total of 252,550 rooms, to benefit the investigation of data-driven approaches.

II-A Dataset Generation

Fig. 2: The pipeline of generating HouseExpo. (a) A 3D model from the SUNCG dataset. (b) The ground cross section P_g. (c) The door cross section P_d. (d) Find the boundary of P_g and fill the outside as obstacles (denoted as M_i). (e) The door location set D is obtained by subtracting P_g from P_d. (f) Remove the doors in M_i based on D. (g) Final result after cell-filling, connectivity-checking, line refinement and image cropping.
Require: number of SUNCG models N, ground plane height h_g, door plane height h_d, number of cell-checking samples S_1, number of connectivity-checking samples S_2, area threshold a_th;
Ensure: the set of 2D floor plans M;
1:  M ← ∅;
2:  for i = 1 to N do
3:     extract a 3D model H_i from SUNCG;
4:     P_g ← crossSection(H_i, h_g), P_d ← crossSection(H_i, h_d);
5:     D ← P_d - P_g;
6:     calculate the contour of P_g, fill in its outside as obstacles and get M_i;
7:     M_i(p) ← free, for all p in D;
8:     calculate all the contours C of M_i;
9:     fill in c if area(c) < a_th, for all c in C;
10:    sample {p_1, ..., p_{S_2}} ← UniformRandom(M_i, S_2), where M_i(p_j) = free;
11:    d_{jk} = dist(p_j, p_k) for all j, k;
12:    repeat
13:       pick the pair (p_j, p_k) with the smallest unchecked d_{jk};
14:       repeat
15:          plan a path between p_j and p_k;
16:          if path planning fails then
17:             remove a cross segment between the wall and the line (p_j, p_k);
18:          end if
19:       until a path between p_j and p_k is found
20:    until all point pairs are checked
21:    refine the walls and crop the generated map M_i;
22:    M ← M ∪ {M_i};
23: end for
24: compute pairwise similarity over M and remove duplicate elements;
25: return M
Algorithm 1 2D Floor Plan Dataset Generation

The HouseExpo dataset is built on the SUNCG dataset [9], one of the most widely adopted 3D environment datasets in the computer vision community. The SUNCG dataset, consisting of over 45,000 manually-designed 3D house models, was originally created to facilitate semantic scene completion, a task of simultaneously producing a 3D voxel representation and semantic labels from a single-view observation, and thus it carries rich object, texture, and layout information. The variety of house models in SUNCG makes it a suitable base for our application.

However, several issues arise if we directly use the SUNCG dataset. First, similar to the KTH and MIT datasets, SUNCG does not guarantee connectivity among rooms, which may cause the initialization problem mentioned above and hamper the training process. In addition, many mobile robot tasks can be treated as 2D problems [2][3][13], so using the 3D environments in SUNCG for 2D algorithms is inefficient or even inapplicable. Furthermore, since SUNCG was originally designed for the scene completion task, it carries a great deal of semantic information such as textures, which incurs additional computational cost.

To satisfy the demand for a large-scale 2D layout dataset, we built the HouseExpo dataset. An illustrative example of our generation pipeline is shown in Fig. 2. First, we extract a 3D structure model H from the SUNCG dataset. Note that the top-view projection of H onto a 2D plane cannot serve as the desired indoor map, since the lintels above doorways hide the connectivity between rooms. We therefore obtain the ground cross-section plane P_g and the door cross-section plane P_d at heights h_g and h_d, respectively. The door location set D can then be determined by subtracting P_g from P_d. In addition, since the rooms in P_g are closed, we can calculate the contour of the house according to [14] and get the indoor layout M_i by filling in the outside of the boundary as obstacles. Finally, the doors are removed from M_i using the door location set D.
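To make the cross-section step concrete, below is a minimal Python sketch using the trimesh library. The file name, cross-section heights and grid pitch are placeholder assumptions, not values from our pipeline, and the two rasterized slices are assumed to share an aligned frame.

  import numpy as np
  import trimesh

  H_GROUND, H_DOOR = 0.1, 1.8  # assumed cross-section heights (meters)

  mesh = trimesh.load('house_model.obj')  # placeholder path to a 3D model H

  def cross_section_mask(mesh, height, pitch=0.05):
      # Slice the mesh at z = height and rasterize the slice to a 2D image.
      section = mesh.section(plane_origin=[0, 0, height],
                             plane_normal=[0, 0, 1])
      planar, _ = section.to_planar()
      return np.array(planar.rasterize(pitch=pitch, origin=planar.bounds[0]))

  p_g = cross_section_mask(mesh, H_GROUND)  # ground plane P_g
  p_d = cross_section_mask(mesh, H_DOOR)    # door plane P_d
  # Door set D: material present at door height but absent at ground height.
  # (Assumes both rasters are aligned; a real pipeline rasterizes both slices
  # onto one shared grid.)
  doors = np.logical_and(p_d, np.logical_not(p_g))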

Notice that the cross section of a 3D model may contain some closed cells, caused by sectioning particular regions like chimneys and unused space between rooms; these are meaningless and must be removed. To remove such cells, we calculate all the room contours using [14] and fill in the small cells as obstacles. Furthermore, to tackle the connectivity issue, we recheck the connectivity of every generated map: (1) we uniformly sample points on the free space and compute the distance between each point pair, recorded in a distance matrix d; (2) we pick the two points p_j and p_k with the shortest distance in d and plan a path between them using the A* algorithm [15]; (3) if step (2) fails, indicating the two areas are not yet connected and connectivity must be established manually, we remove the wall segment that intersects the line between p_j and p_k; (4) we repeat from step (2) until all point pairs are checked.
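As an illustration of this connectivity check, here is a minimal sketch on a binary occupancy grid (1 = free, 0 = obstacle). It substitutes breadth-first search for the A* planner and, when a pair is unreachable, carves a straight corridor between the two points rather than removing only the intersecting wall segment, so it is a simplification of step (3).

  import numpy as np
  from collections import deque

  def connected(grid, p, q):
      # BFS over free cells (grid == 1); True if q is reachable from p.
      h, w = grid.shape
      seen = np.zeros(grid.shape, dtype=bool)
      queue = deque([p])
      seen[p] = True
      while queue:
          y, x = queue.popleft()
          if (y, x) == q:
              return True
          for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
              ny, nx = y + dy, x + dx
              if 0 <= ny < h and 0 <= nx < w and grid[ny, nx] \
                      and not seen[ny, nx]:
                  seen[ny, nx] = True
                  queue.append((ny, nx))
      return False

  def carve_corridor(grid, p, q, half_width=2):
      # Open free cells along the straight line from p to q (a crude stand-in
      # for removing only the wall segment crossing that line).
      n = int(max(abs(q[0] - p[0]), abs(q[1] - p[1]))) + 1
      for t in np.linspace(0.0, 1.0, n):
          y = int(round(p[0] + t * (q[0] - p[0])))
          x = int(round(p[1] + t * (q[1] - p[1])))
          grid[max(0, y - half_width):y + half_width + 1,
               max(0, x - half_width):x + half_width + 1] = 1
      return grid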

One remaining step is to refine the wall appearance to maintain consistent thickness along the same wall, and to crop the image so the house is centered. Incidentally, there exist some duplicate samples in the generated dataset due to the characteristics of SUNCG, where each scene is a combination of house models and various objects. To reduce such redundancy, we examine the similarity among maps by computing their difference and remove those with similar appearance. The detailed procedure is given in Algorithm 1. Apart from SUNCG, our pipeline can be applied to any 3D house model to extract its 2D floor plan.

Although much effort was made to keep the pipeline automatic, human involvement is still necessary in some extreme cases: (1) in some houses with French windows, the windows may be recognized as doors and removed, severely damaging the indoor structure; manually specifying the door height is then necessary; (2) isolated rooms far from the main area, such as garages, are regarded as inaccessible and erased from the map; (3) scenes that contain only objects and no layout are excluded from HouseExpo; (4) for open or semi-open houses that have no walls as boundaries, we design walls to ensure every house has a distinct boundary; (5) houses with unreasonable layouts are removed from the dataset.

II-B Dataset Statistics

All floor plans are stored as JSON files, and the structural information is represented as line segments with respect to the house's centroid coordinates. There are 35,357 houses with 252,550 rooms, with a mean of 7.14 and a median of 7.0 rooms per house. Furthermore, the room category labels are inherited from the SUNCG dataset to provide semantic information. The distributions of rooms per house and of room category labels are displayed in Fig. 3.
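For illustration, a small sketch of loading one floor plan and rasterizing it into an occupancy image follows; the JSON key 'segments' and the 0.05 m/pixel resolution are assumptions made for this example, not the dataset's documented schema.

  import json
  import cv2
  import numpy as np

  RES = 0.05  # assumed map resolution, meters per pixel

  # Assumed schema: 'segments' as [[x1, y1, x2, y2], ...] in meters,
  # relative to the house centroid.
  with open('floor_plan.json') as f:
      segs = np.array(json.load(f)['segments'], dtype=float)

  pts = segs.reshape(-1, 2)
  origin = pts.min(axis=0)
  size = np.ceil((pts.max(axis=0) - origin) / RES).astype(int) + 1
  canvas = np.full((size[1], size[0]), 255, dtype=np.uint8)  # white = free

  for x1, y1, x2, y2 in segs:
      p1 = tuple(int(v) for v in np.round((np.array([x1, y1]) - origin) / RES))
      p2 = tuple(int(v) for v in np.round((np.array([x2, y2]) - origin) / RES))
      cv2.line(canvas, p1, p2, color=0, thickness=2)  # black = wall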

Fig. 3: Room number distribution and room category label distribution.

III Simulation Platform

III-A Motivation

Existing simulators, like Gazebo [16] and Stage [17], receive robot motion commands and output sensory data, e.g. LiDAR range measurements; SLAM algorithms like gMapping [18] or Hector SLAM [19] are then employed to build occupancy maps from the sensor data.

However, the computational complexity of SLAM algorithms leads to low run-time efficiency, especially as the map grows large, as discussed in Section III-F. Such complexity makes SLAM extremely inefficient for learning-based methods, which usually require a substantial amount of data at the training stage. A possible way to reduce the mapping time is to reduce the number of particles and iterations in the filter, but this sacrifices mapping accuracy, which is also undesirable. Therefore, a simulator that preserves mapping quality at low computational cost is necessary.

In light of this, we develop a simulation platform, Pseudo-SLAM, that simulates the SLAM process. It achieves mapping results competitive with SLAM at a much lower time cost, thus meeting the data needs of learning methods and speeding up their training.

III-B Simulator Overview

The developed simulator can be regarded as a combination of a traditional simulator and a built-in SLAM component, so it can build the occupancy map directly. As a result, low-level information processing is abstracted away, and users can focus on high-level strategic policies based on the built map. In developing the simulation platform, we follow two principles: (1) time efficiency and low computational cost; (2) closeness to real-world conditions, such that models developed in the simulator can be transferred to the real world.

Apart from simulating the SLAM process, the simulation platform can generate obstacles. To the best of the authors' knowledge, most 2D indoor environment datasets contain only floor plans [7][8], without obstacles like furniture, chairs and tables. Instead of statically adding obstacles before simulation, our simulator generates obstacles inside the floor plan dynamically. This increases the variance of training samples and narrows the gap with real-life scenarios. The details of obstacle generation are introduced in Section III-E.

Besides, the simulator has a flexible interface. Multiple parameters can be specified by users, including the range and field of view of the laser rangefinder, robot size, range of obstacle number and size, sensor noise and SLAM error variance, etc. Users can customize these configurations according to their sensor and robot specifications in the real world. Furthermore, an OpenAI Gym-compatible interface is implemented to help users easily integrate existing learning-based methods.
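A Gym-style training loop might then look like the following sketch; the environment id 'PseudoSLAM-v0', the registering module and the configuration keyword names are hypothetical stand-ins, not the simulator's documented API.

  import gym
  import pseudo_slam  # hypothetical module that registers the environment

  # All ids and keyword names below are illustrative assumptions.
  env = gym.make('PseudoSLAM-v0',
                 laser_range=4.0,        # meters
                 laser_fov=180,          # degrees
                 obstacle_num=(5, 10),   # min/max obstacles per episode
                 noise_sigma=0.02)       # laser noise standard deviation

  obs = env.reset()                      # local occupancy-map observation
  done = False
  while not done:
      action = env.action_space.sample() # {forward, left, right}
      obs, reward, done, info = env.step(action)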

III-C Pseudo-SLAM Pipeline

Pseudo-SLAM aims to simulate a SLAM algorithm given knowledge of the ground-truth map. Its map format is consistent with that of the SLAM algorithms in the ROS gMapping [18] and Hector SLAM [19] packages: an occupancy grid map with three states, i.e. free, obstacle and uncertain, represented by different pixel values.

Fig. 4: The pipeline of Pseudo-SLAM. (a) The ground truth map G. (b) The sector s_t cropped around the robot at time t. (c) The processed sector s'_t. (d) The occupancy map m_{t-1} at time t-1. (e) The occupancy map m_t at time t.

The workflow of Pseudo-SLAM is shown in Fig. 4: (1) a floor plan is loaded as the ground truth map G; (2) the robot pose x_t is updated according to its motion, and a sector s_t centered at x_t, with a radius equal to the laser range and an angle equal to the field of view, is cropped; (3) s_t is processed to hide the areas behind obstacles: the obstacle locations in the sector are identified first, and along each robot-obstacle line, pixels behind the obstacle are set uncertain while pixels between the robot and the obstacle are set free, yielding the processed sector s'_t; (4) s'_t is merged into the occupancy map m_{t-1} built at time t-1 to obtain m_t; (5) steps (2) to (4) are repeated at each point of the robot's trajectory.
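The following sketch illustrates steps (2) to (4) on a numpy occupancy grid; the pixel encoding of the three states and the simple per-ray marching scheme are our own assumptions, chosen for clarity rather than speed.

  import numpy as np

  FREE, OBST, UNC = 255, 0, 127  # assumed pixel values for the three states

  def sense(gt, pose, max_range, fov_deg, n_rays=180):
      # Steps (2)-(3): march rays from the robot pose over the field of view;
      # cells before the first obstacle become FREE, the hit becomes OBST,
      # and everything behind obstacles stays UNC.
      h, w = gt.shape
      y0, x0, heading = pose
      patch = np.full_like(gt, UNC)
      angles = heading + np.radians(
          np.linspace(-fov_deg / 2, fov_deg / 2, n_rays))
      for a in angles:
          for r in range(1, max_range):
              y = int(round(y0 + r * np.sin(a)))
              x = int(round(x0 + r * np.cos(a)))
              if not (0 <= y < h and 0 <= x < w):
                  break
              if gt[y, x] == OBST:
                  patch[y, x] = OBST   # obstacle hit ends this ray
                  break
              patch[y, x] = FREE       # cell in front of the obstacle
      return patch

  def merge(occ, patch):
      # Step (4): overwrite only the cells observed in this sector.
      observed = patch != UNC
      occ[observed] = patch[observed]
      return occ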

III-D Noise and Uncertainty Model

The above pipeline generates an ideal SLAM result. However, this is seldom the case in real environments, which are full of noise and uncertainty. It is therefore necessary to add noise and uncertainty to the simulator to minimize the gap between simulation and the real world. Here we simulate the noise of laser range measurements and the uncertainty of laser point matching and registration [20][21]. We assume that total reflection of the laser pulse does not occur.

III-D1 Laser Scan Noise

There is always noise in the measured phase, and laser noise is commonly modeled as Gaussian in the literature [20][21]. To simulate this noise, each obstacle point is shifted by ε pixels along the robot-obstacle segment, where ε is sampled from a Gaussian distribution with zero mean and a user-defined standard deviation.

III-D2 Matching & Registration Uncertainty

In SLAM, matching errors may occur when registering laser points to the global map, causing a shift of the observed sector in the map. To simulate this phenomenon, the processed sector is rotated by δθ and translated by a linear shift δl at each step, where δθ and δl are sampled from Gaussian distributions with zero mean and user-defined standard deviations.
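Both noise sources boil down to sampling zero-mean Gaussians, as in the sketch below; the parameter names are placeholders for the user-defined standard deviations.

  import numpy as np

  rng = np.random.default_rng()

  def noisy_range(r, sigma_laser):
      # Laser scan noise: shift an obstacle hit along its ray by a zero-mean
      # Gaussian amount (in pixels); sigma_laser is the user-defined std.
      return r + rng.normal(0.0, sigma_laser)

  def perturb_sector(points, sigma_rot, sigma_shift):
      # Registration uncertainty: rotate the whole sector by a small random
      # angle and translate it by a small random offset before merging.
      theta = rng.normal(0.0, sigma_rot)
      c, s = np.cos(theta), np.sin(theta)
      rot = np.array([[c, -s], [s, c]])
      shift = rng.normal(0.0, sigma_shift, size=2)
      return points @ rot.T + shift  # points: (N, 2) cell coordinates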

III-E Obstacle Generation

As mentioned in Section III-B, adding obstacles in the simulator helps increase the variance of the training data, i.e. it improves the diversity of the floor plans. Beyond the training perspective, it also narrows the gap between simulation and reality: in the real world most houses are filled with furniture, and adding obstacles makes the model more adaptable to real-world environments.

The furniture is generated dynamically: at the beginning of each episode, objects are placed onto the ground truth map at random locations, without overlapping each other or the walls. The furniture comes in three shapes, i.e. rectangle, ellipse and circle, and the ranges of obstacle number and size can be specified by users.
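A simple rejection-sampling scheme suffices for such placement. The sketch below stamps only rectangles (ellipses and circles are analogous) and assumes the room is larger than the sampled obstacle sizes.

  import numpy as np

  rng = np.random.default_rng()

  def place_obstacles(grid, n_min, n_max, s_min, s_max,
                      free=255, obst=0, max_tries=100):
      # Stamp axis-aligned rectangles at random free locations, rejecting
      # placements that touch walls or previously placed obstacles.
      n = rng.integers(n_min, n_max + 1)
      h, w = grid.shape
      for _ in range(n):
          for _ in range(max_tries):
              oh = rng.integers(s_min, s_max + 1)
              ow = rng.integers(s_min, s_max + 1)
              y = rng.integers(0, h - oh)
              x = rng.integers(0, w - ow)
              if np.all(grid[y:y + oh, x:x + ow] == free):  # no overlap
                  grid[y:y + oh, x:x + ow] = obst
                  break
      return grid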

III-F Comparison

III-F1 Speed Comparison

Fig. 5: Processing time comparison. The Pseudo-SLAM simulator and gMapping are tested on a machine with an Intel Core i5-6500.
Fig. 6: Map accuracy comparison. The top-left map is built by gMapping in Stage and the bottom-left map is the output of our simulator. The right map is the overlapped result, where walls from the gMapping map are in black and walls from our simulator are in white. Mismatched areas are marked in red.

The processing times of map building using gMapping, a popular SLAM algorithm, and Pseudo-SLAM are shown in Fig. 5. The processing time of the developed simulator is much shorter, and the growth of the map does not affect its mapping speed.

III-F2 Map Accuracy Comparison

One floor plan is picked from HouseExpo and used to build occupancy maps with gMapping in Stage and with the developed simulator, respectively, as shown in Fig. 6. The top-left map is the result of gMapping and the bottom-left map is the result of Pseudo-SLAM. The right map is the overlapped result, where mismatched areas are highlighted in red. To compare their similarity, we compute the Intersection over Union (IoU) of the two maps, treating the walls as the boundary of the indoor space. Our simulator achieves a high IoU with the gMapping result, indicating the accuracy of Pseudo-SLAM.
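Under one plausible reading of this metric, the IoU is computed over the wall pixels of the two aligned maps, as in the sketch below; the pixel encoding and the assumption that both maps share a resolution and frame are ours.

  import numpy as np

  def wall_iou(map_a, map_b, obst=0):
      # IoU of the wall (obstacle) pixels of two aligned occupancy maps.
      a = (map_a == obst)
      b = (map_b == obst)
      union = np.logical_or(a, b).sum()
      return np.logical_and(a, b).sum() / union if union else 0.0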

IV Experiments

IV-A Experiment Setup

To verify the effectiveness of HouseExpo and Pseudo-SLAM for data-driven approaches, we implement two model-free algorithms to tackle obstacle avoidance and autonomous exploration. For obstacle avoidance, we train a model in our simulator and then transfer the learned policy to a TurtleBot robot platform. The robot can navigate without collision in a room full of obstacles, showing that knowledge acquired in Pseudo-SLAM transfers to reality without additional fine-tuning. For autonomous exploration, we employ DRL to extract spatial information from HouseExpo, and simulation results show that such information helps accelerate the exploration process.

Both problems are formulated as Markov decision processes: at time t, the agent observes a state s_t, takes an action a_t based on it, and receives a reward r_t accordingly. In our model, s_t is a rectangular area centered at the robot's position and aligned with the robot's orientation, and a_t corresponds to one of three movements {forward, left rotation, right rotation}. The agent is equipped with a laser rangefinder with a fixed range and horizontal field of view.

IV-B Obstacle Avoidance

In this part, we train a model in Pseudo-SLAM to navigate the robot without collision through obstacle-filled environments and test the learned policy on a TurtleBot in the real world. The real-world experiments are conducted in five scenes, and the results show that the knowledge learned in our simulator can be transferred directly to real robots without any fine-tuning, verifying the capability of our simulator.

In our experiment, the goal of obstacle avoidance is to prevent the agent from hitting objects or walls while covering as long a distance as possible. The reward function is therefore defined as

r_t = -r_c if a collision happens at time t, and r_t = w_s r_s + w_a r_a otherwise.

If a collision happens at time t, the agent receives a fixed penalty r_c. Otherwise, the reward is the weighted sum of a state reward r_s and an action reward r_a. The state reward r_s is the newly discovered area at time t, encouraging the agent to move towards unknown areas. The action reward is defined as r_a = 1(a_t = forward), where 1(·) is an indicator function with value 1 if a_t is forward and 0 otherwise, preventing the agent from simply rotating in place. The weights w_s and w_a are fixed constants chosen empirically.
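The reward computation is then a few lines, as sketched below; the weight and penalty values are placeholders, not the constants used in our experiments.

  def reward(collision, new_area, action, w_s=1.0, w_a=0.1, r_c=10.0):
      # Placeholder weights/penalty, not the paper's constants.
      if collision:
          return -r_c
      r_state = new_area                              # r_s: new area at t
      r_action = 1.0 if action == 'forward' else 0.0  # r_a = 1(a_t = forward)
      return w_s * r_state + w_a * r_action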

The focus of our obstacle avoidance model is on recognizing and avoiding obstacles rather than reasoning about the topology of the environment. Therefore, we train our agent in a single empty rectangular room instead of the HouseExpo dataset. At the beginning of each episode, our simulator randomly generates objects inside the room, and each episode lasts a fixed number of steps. The neural network has convolutional layers following the configuration in [22], followed by a Long Short-Term Memory (LSTM) layer. Proximal Policy Optimization (PPO) [23] is employed to train the network, and the learning curve is depicted in Fig. 7.

Fig. 7: The learning curve of learning-based obstacle avoidance during training.
               forward    left rotation    right rotation
train (l, θ)   (0.3, 0)   (0, 10)          (0, -10)
test (v, ω)    (0.2, 0)   (0, 40)          (0, -40)
TABLE I: Action commands used in simulation and reality. l, θ, v, ω denote linear step length (meter), angular step length (degree), linear velocity (meter/sec) and angular velocity (degree/sec).
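On the robot side, each discrete action can be mapped to a ROS velocity command using the test-time values of TABLE I, roughly as in the sketch below; the velocity topic name is an assumption that depends on the particular TurtleBot setup.

  import math
  import rospy
  from geometry_msgs.msg import Twist

  # Test-time velocities from TABLE I: v = 0.2 m/s, omega = 40 deg/s.
  COMMANDS = {
      'forward':        (0.2, 0.0),
      'left_rotation':  (0.0, math.radians(40)),
      'right_rotation': (0.0, math.radians(-40)),
  }

  rospy.init_node('policy_executor')
  # Topic name is an assumed TurtleBot 2 default; adjust to your robot.
  pub = rospy.Publisher('/cmd_vel_mux/input/navi', Twist, queue_size=1)

  def execute(action):
      # Map a discrete policy action to a continuous velocity command.
      v, omega = COMMANDS[action]
      msg = Twist()
      msg.linear.x = v
      msg.angular.z = omega
      pub.publish(msg)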
Fig. 8: One example test scene: a room filled with objects.
                Scene 1   Scene 2   Scene 3   Scene 4   Scene 5
Objects            6         6         7         9         9
Collisions         2         0         2         2         0
Mean(Distance)   18.41     17.98     18.08     14.67     15.62
Var(Distance)     0.70      4.31      2.01     15.25     21.78
TABLE II: Obstacle avoidance performance in real-world experiments.
Fig. 9: Trajectory demonstration in Scene 1 to Scene 5. Different trajectories are indicated with different colors.

We then deploy the learned policy on a TurtleBot. The gMapping algorithm is employed to construct the grid map, with a configuration consistent with that of the simulator, including observation size, laser range, map resolution, etc. Furthermore, since the actions in Pseudo-SLAM are discrete, we map the action commands to continuous space as displayed in TABLE I. The code runs on a laptop with an Intel Core i5-8300H and an Nvidia GeForce 1050Ti, achieving a control rate sufficient for real-time operation. We evaluate the trained model in a room filled with objects: five different object layouts are tested, and in each scene ten episodes with random starting points are evaluated. One example scene is shown in Fig. 8. The time limit of each episode is 2 minutes. In total, we conduct 50 experiments for 100 minutes.

We quantitatively evaluate our model in the real-world scenes from two perspectives: number of collisions and distance (trajectory length). The number of collisions is the total number of times the robot hits walls or obstacles in one scene, reflecting the basic ability to avoid obstacles. On the other hand, a robot that always rotates in place or only covers a small region is unlikely to hit anything but does not meet the goal, so we also measure the distance the robot traverses. The experimental results are shown in TABLE II. As we can see, the performance of the model is stable, with only six collisions across the five scenes. Another observation is that the mean distance decreases as the number of objects grows, reflecting that the robot is more careful in complicated environments and takes more actions to adjust its pose.

Fig. 9 gives the trajectories of the episodes in all test scenes. As we can see, our policy is robust and generalizes to real-world scenarios. The robot traverses most of the areas while keeping a safe distance from the objects. The robot can even reach some complex regions and move out of them, for example the dead end at the bottom-right corner of one scene. This experiment demonstrates that the experience generated in Pseudo-SLAM can be applied to real-world situations and makes the training process more efficient.

IV-C Autonomous Exploration

In this part, we demonstrate the effectiveness of topological information through the robot exploration task. As justified in [8], spatial knowledge can be used to reason about unknown spaces in indoor environments, and we use such knowledge to guide the exploration process.

Autonomous exploration refers to the process of searching for unknown areas. In our experiment, the robot is expected to discover as much area as possible within a fixed time limit. The reward function is thus defined as r_t = A_t, where A_t is the newly discovered area at time t, encouraging the robot to move towards unknown areas. Since the main focus of this experiment is on utilizing 2D layout information, all training and testing houses are empty, without any added obstacles. To illustrate the influence of training set size, three models are separately trained on three training sets of increasing size. Furthermore, a random policy, in which each of the three actions has an equal probability of 1/3 of being selected, is compared under the same setting. All models are then evaluated on new environments in the simulator, and in each map the robot is initialized at the same location. The network structure is the same as in Section IV-B but with a larger observation, enabling the agent to have a wider horizon and obtain more layout information. An example of the occupancy map and the local observation is shown in Fig. 10. We record the explored area within a fixed number of steps as the measure of exploration.
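The exploration reward reduces to counting cells that change from uncertain to observed between consecutive maps, as in the sketch below; the pixel encoding and map resolution are assumptions.

  import numpy as np

  UNC = 127   # assumed pixel value for uncertain cells
  RES = 0.05  # assumed map resolution, meters per pixel

  def newly_discovered_area(prev_map, curr_map):
      # r_t = A_t: cells uncertain at t-1 and observed at t, in square meters.
      new_cells = np.logical_and(prev_map == UNC, curr_map != UNC).sum()
      return new_cells * RES ** 2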

Fig. 10: Global map and local observation example. The left figure shows a global map in which an agent is exploring a house. The right figure is the input to the network. Uncertain areas, walls and free space are indicated in black, white and gray, respectively.
Fig. 11: Explored area in the testing houses. The data is sorted in ascending order of house area.

The simulation results are shown in Fig. 11. The four solid curves represent smoothed results. The house IDs (x-axis) are sorted in ascending order of total house area. Comparing our models with the random policy, it is evident that the data-driven models perform much better, with substantially larger traversed areas in all houses, indicating that spatial structure can be learned and transferred to new environments. Comparing the smoothed curves of the three models, we observe that model performance increases with training set size: the model trained on the largest set achieves a higher explored area in most of the testing houses.

V Conclusions

In this paper, we build HouseExpo, a large-scale indoor layout dataset, and Pseudo-SLAM, an efficient simulation platform, to facilitate applying learning-based methods to mobile robots. The effectiveness of our dataset and simulation platform is verified via simulation and real-world experiments.

Apart from the tasks mentioned in Section IV, we believe our dataset can also contribute to a number of other tasks, such as room segmentation, graph-based structure reasoning and mapless navigation, and can scale up the diversity available for algorithm evaluation.

At the same time, some future work remains. One concern is how to optimally use the topological information. In our autonomous exploration experiment, only the local observation (a local map around the robot) is utilized. Since house sizes vary greatly, it is impractical to feed the whole global map directly into convolutional neural networks; how best to represent the topological information remains to be investigated. Another direction is the combination of learning-based and traditional methods. Taking autonomous exploration as an example, a data-driven approach could make a long-term plan and designate a goal based on its experience, while a local planner plans a path and drives the robot to the goal.

References

  • [1] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, and A. Bolton, “Mastering the game of Go without human knowledge,” Nature, vol. 550, no. 7676, p. 354, 2017.
  • [2] D. Zhu, T. Li, D. Ho, and Q. H. Meng, “Deep reinforcement learning supervised autonomous exploration in office environments,” in IEEE International Conference on Robotics and Automation (ICRA), 2018.
  • [3] L. Tai, G. Paolo, and M. Liu, “Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 31–36.
  • [4] Y. Zhu, R. Mottaghi, E. Kolve, J. J. Lim, A. Gupta, L. Fei-Fei, and A. Farhadi, “Target-driven visual navigation in indoor scenes using deep reinforcement learning,” in IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 3357–3364.
  • [5] R. Bormann, F. Jordan, W. Li, J. Hampp, and M. Hägele, “Room segmentation: Survey, implementation, and analysis,” in IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 1019–1026.
  • [6] M. Mielle, M. Magnusson, and A. J. Lilienthal, “A method to segment maps from different modalities using free space layout maoris: map of ripples segmentation,” in IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 4993–4999.
  • [7] E. Whiting, J. Battat, and S. Teller, “Generating a topological model of multi-building environments from floorplans,” in CAAD (Computer-Aided Architectural Design) Futures 2007, Sydney, Australia, July 2007, pp. 115–128.
  • [8] A. Aydemir, P. Jensfelt, and J. Folkesson, “What can we learn from 38,000 rooms? Reasoning about unexplored space in indoor environments,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 4675–4682.
  • [9] S. Song, F. Yu, A. Zeng, A. X. Chang, M. Savva, and T. Funkhouser, “Semantic scene completion from a single depth image,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [10] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “Openai gym,” arXiv preprint arXiv:1606.01540, 2016.
  • [11] S. Bai, F. Chen, and B. Englot, “Toward autonomous mapping and exploration for mobile robots through deep supervised learning,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 2379–2384.
  • [12] C. Wang, L. Meng, T. Li, C. W. De Silva, and M. Q.-H. Meng, “Towards autonomous exploration with information potential field in 3d environments,” in IEEE International Conference on Advanced Robotics (ICAR), 2017, pp. 340–345.
  • [13] H.-T. Chiang, N. Malone, K. Lesser, M. Oishi, and L. Tapia, “Path-guided artificial potential fields with stochastic reachable sets for motion planning in highly dynamic environments,” in IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 2347–2354.
  • [14] S. Suzuki et al., “Topological structural analysis of digitized binary images by border following,” Computer Vision, Graphics, and Image Processing, vol. 30, no. 1, pp. 32–46, 1985.
  • [15] P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, pp. 100–107, 1968.
  • [16] N. P. Koenig and A. Howard, “Design and use paradigms for Gazebo, an open-source multi-robot simulator,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 4, 2004, pp. 2149–2154.
  • [17] R. Vaughan, “Massively multi-robot simulation in Stage,” Swarm Intelligence, vol. 2, no. 2-4, pp. 189–208, 2008.
  • [18] G. Grisetti, C. Stachniss, and W. Burgard, “Improved techniques for grid mapping with Rao-Blackwellized particle filters,” IEEE Transactions on Robotics, vol. 23, no. 1, pp. 34–46, 2007.
  • [19] S. Kohlbrecher, O. Von Stryk, J. Meyer, and U. Klingauf, “A flexible and scalable SLAM system with full 3d motion estimation,” in IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), 2011, pp. 155–160.
  • [20] A. Schaefer, L. Luft, and W. Burgard, “An analytical lidar sensor model based on ray path information,” IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1405–1412, 2017.
  • [21] A. Petrovskaya and S. Thrun, “Model based vehicle detection and tracking for autonomous urban driving,” Autonomous Robots, vol. 26, no. 2-3, pp. 123–139, 2009.
  • [22] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, and G. Ostrovski, “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, p. 529, 2015.
  • [23] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.