I Introduction
Longterm autonomy is an essential capability for service mobile robots deployed in human environments. One challenging aspect of longterm autonomy in a human environment is robustness to changes in the environment.
Many approaches have been proposed to reason about a changing environment, including estimating the latest state of the environment [18, 13], estimating different environment configurations [2, 6], or modeling the dynamics of the environment [5, 16, 17, 1, 19]. However, in large environments, it may not be feasible to make the periodic observations required by these approaches. Therefore, we model observations in human environments as arising from three distinct types of features [4]: Dynamic Features (DFs) or moving objects such as people or carts, Short Term Features (STFs) or movable objects such as tables or chairs, and LongTerm Features (LTFs) which persist over long periods of time, such as office walls or columns. Episodic nonMarkov Localization (EnML) [4] simultaneously reasons about global localization information from LTFs, and local relative information from STFs. A key requirement to EnML is an estimate of the LongTerm Vector Map: the features in the environment that persist over time, represented in line segment or vector form (Fig. 1).
In this paper, we introduce an algorithm to build and update LongTerm Vector Maps indefinitely, utilizing observations from all deployments of all the robots in an environment. Our proposed algorithm filters out observations corresponding to DFs from a single deployment using a signed distance function (SDF) [9]. Merging the SDFs from multiple deployments then filters out the shortterm features. Remaining observations correspond to LTFs, and are used to build a vector map via robust local linear regression. Uncertainty estimates of the resultant LongTerm Vector Map are calculated by a novel Monte Carlo uncertainty estimator. Our proposed approach thus consists of the following steps:

Filter: Use the most recent observations to compute an SDF and discard points based on weights and values given by the SDF (Section IV).

Line Extraction: Use greedy sequential local RANSAC [11] and nonlinear leastsquares fitting to extract line segments from the filtered observations (Section V).

Estimate Feature Uncertainty: Compute segment endpoint covariance estimates via Monte Carlo resampling of the observations (Section VI).

Map Update: Add, merge, and delete lines using a decoupled scatter matrix representation [3] (Section VII).
Our approach takes advantage of the robust filtering provided by the SDF while avoiding dependency on a discrete world representation and grid resolution by representing LTFs as line segments in . Evaluation of our approach is detailed in section VIII. Across all metrics examined in this study, we find vector maps constructed via SDF filtering comparable or favorable to occupancy grid based approaches.
Ii Related Work
The problem of longterm robotic mapping has been studied extensively, with most algorithms relying on one of two dominant representations: occupancy grids [10, 14] and geometric or polygonal maps [8, 20]. Recently, work towards metric map construction algorithms that are able to cope with dynamic and shortterm features has accelerated.
Most approaches fall into one of four categories: dynamics on occupancy grids, latest state estimation, ensemble state estimation, and observation filters.
One common approach models map dynamics on an occupancy grid using techniques such as learning nonstationary object models [5]
or modeling grid cells as Markov chains
[16, 17]. Alternatively, motivated by the widely varying timescales at which certain aspects of an environment may change, some approaches seek to leverage these differences by maintaining information relating to multiple timescales within one or more occupancy grids [1, 19].Other approaches estimate the latest state of the world, including dynamic and shortterm features. Dynamic posegraph SLAM [18] can be used in lowdynamic environments, and spectral analysis techniques [13] attempt to predict future environment states on arbitrary timescales.
Instead of estimating solely the latest state, some approaches estimate environment configurations based on an ensemble of recent states. Temporal methods such as recency weighted averaging [2] determine what past information is still relevant, and other techniques such as learning local grid map configurations [6] borrow more heavily from the dynamic occupancy grid approach.
Another approach filters out all observations corresponding to nonLTFs, resulting in a “blueprint” map. Previous algorithms have had some success filtering dynamic objects, specifically people [12], but have struggled to differentiate between STFs and LTFs. Furthermore, all of the methods mentioned above rely on an occupancy grid map representation, whereas our method produces a polygonal, vector map.
Iii LongTerm Vector Mapping
Longterm vector mapping runs iteratively over multiple robot deployments, operating on the union of all registered laser observations from the given deployment, aligned to the same frame. We call these unions composite scans and denote them , where is a single laser scan. Composite scans are processed in batch after each deployment, and may be generated via Episodic nonMarkov Localization [4] or a similar localization algorithm.
After each deployment, a shortterm signed distance function (STSDF) given by a set of weights and values is computed over the area explored by the robot by considering the composite scan . The STSDF is then used to update the longterm SDF (LTSDF), given by and , which aggregates information over all previous deployments. The updated LTSDF is denoted and , and is used to determine a filtered composite scan , containing observations corresponding exclusivley to LTFs. We call this process SDFfiltering. Note that after only one deployment, it is not possible to distinguish STFs from LTFs. However, as the number of deployments increases, the number of observations corresponding to STFs in approaches zero.
The next step in our algorithm is feature (line) extraction. Line extraction does not rely on any persistent data, using only the filtered composite scan to extract a set of lines . Each is defined by endpoints and , a scatter matrix , a center of mass , and a mass .
Uncertainties in the endpoints of each line segment are computed by analyzing a distribution of possible endpoints generated via Monte Carlo resampling the initial set of observations and subsequently refitting the samples. For a given line the uncertainty estimation step takes as input the endpoints and and a set of inliers , and produces covariance estimates and for these endpoints.
The longterm vector map is updated based on the newest set of extracted lines and the current SDF. Similar lines are merged into a single line, obsolete lines are deleted, and uncertainties are recomputed. Thus, this step takes the results from the most recent deployment, , as well as the existing map given by , , and , and outputs an updated map, . Fig. 2 presents an overview of the algorithm as it operates on data from a single deployment.
Iv SDFbased Filtering
Let be a composite scan, where is a single observation providing depth and angle with respect to the laser’s frame and the robot pose at which and were recorded. That is, . The filtering problem is to determine such that all originate from LTFs and all originate from STFs and DFs.
Determining is a threestep process. First, we construct a STSDF over the area observed by the robot during the deployment corresponding to . Next, we update the LTSDF based on the STSDF. Finally, we use the updated LTSDF to decide which observations correspond to LTFs.
Iva SDF Construction
SDFs operate over discretized space, so we create a grid of resolution containing all . Each pixel in the SDF maintains two measurements, a value and a weight . For every observation , all pixels that lie along the given laser ray update their respective values according to and , where and are the current distance and weight values, respectively, and and are the distance and weight values for the given reading . and are given by
(1) 
where and is the signed distance from the range reading to the pixel, with pixels beyond the range reading having and those in front having . and are parameters that depend on the accuracy of the sensor. Pixels that are located along the ray but are more than beyond the detection point are not updated since we do not know whether or not they are occupied. Fig. 3 illustrates a single pixel update during SDF construction. Note that this process is parallelizable since weights and values for each pixel may be calculated independently. Thus, the SDF construction step, outlined in Algorithm 1, runs in time proportional to .
The intuition for the weight and value functions comes from two observations. First, capping by helps keep our SDF robust against anomalous readings. Second, follows common laser noise models. Other choices for and may yield similar results; however, these choices have already been successfully adopted elsewhere [7].
IvB SDF Update
Once the STSDF is calculated we normalize the weights:
(2) 
Here, is the maximum weight over all pixels and is a dynamic feature threshold. The LTSDF is then updated as the weighted average of the STSDF and the LTSDF, i.e. WeightedAverage. Pixel weights loosely map to our confidence about the associated value, and values are an estimate for how far a pixel is from the nearest surface.
IvC SDF Filter
Given an uptodate SDF, we determine
using bicubic interpolation on the position of each observation
and the updated LTSDF. The filtering criteria are if BicubicInterpolation and BicubicInterpolation, where is a STF threshold and is a threshold that filters observations believed to be far from the surface of any object. Lines 1115 in Algorithm 1 detail the filtering process. Thus, after running SDF Filtering(), we obtain a filtered composite scan , used to find LTFs. Additionally, SDF Filtering updates the LTSDF needed for the map update step.V Line Extraction
Given , extracting lines requires solving two problems. First, for each , a set of observations must belong to line . Second, endpoints and defining line must be found. We take a greedy approach, utilizing sequential local RANSAC to provide plausible initial estimates for line endpoints and . Points whose distance to the line segment is less than , where is proportional to the noise of the laser, are considered members of the inlier set . Once a hypothetical model and set of inliers with center of mass have been suggested by RANSAC (lines 57 in Algorithm 2), we perform a nonlinear leastsquares optimization using the cost function
(3)  
The new endpoints and are then used to find a new set of inliers . is the projection of a point onto the infinite line containing and . Iteration terminates when , where is a convergence threshold.
This cost function has several desireable properties. First, when all points lie between the two endpoints, the orientation of the line will be identical to the leastsquares solution. Second, when many points lie beyond the ends of the line segment, the endpoints are pulled outward, allowing the line to grow and the region in which we accept inliers to expand. Last, the term allows the line to shrink in the event that the nonlinear leastsquares solver overshoots the appropriate endpoint. Once a set of lines has been determined by running Line Extraction() we complete our analysis of a single deployment by estimating our uncertainty in feature locations.
Vi Uncertainty Estimation
Given a line , with endpoints and and a set of inliers , uncertainty estimation produces covariance estimates and for and , respectively. To estimate and we resample using the covariance of each range reading. is derived based on the sensor noise model in [15]. In world coordinates is given by
(4)  
where and
are standard deviations for range and angle measurements for the sensor, respectively. Resampling
times, as shown in Fig. 4, produces a distribution of and . We then construct a scatter matrix from the set of hypothetical endpoints, and compute covariances and by using the SVD of .The Monte Carlo approach, detailed in Algorithm 3, is motivated by the following factors: 1) There is no closedform solution to covariance for endpoints of a line segment. 2) A piecewise cost function makes it difficult to calculate the Jacobian reliably. 3) Resampling is easy since we already have and can calculate .
Vii Map update
Given the current map , the LTSDF, and a set of new lines, , where every is specified by a set of endpoints and , a set of covariance matrices and , a center of mass , a mass , and a partial scatter matrix , the update step produces an updated map .
The map updates are divided into two steps outlined in Algorithm 4. First, we check if all current lines in the map are contained within highweight regions. That is, we check that the weight of every pixel a given line passes through satisfies . If a line lies entirely within a highweight region, it remains unchanged. Similarly, if a line lies entirely outside all highweight regions, it is removed. If only part of a line remains within a highweight region, we can lower bound the mass of the remaining region by , where and are the extreme points of the segment remaining in the highweight region (line 11). We then resample points uniformly along the line between and , adding Gaussian noise in the perpendicular direction with a standard deviation based on the sensor noise model. The points are then fit using the cost function given in (4). The sampling and fitting process is executed a fixed number of times (lines 1216), and the distribution of fit results is then used to compute covariance estimates for the new line.
The second part of the update involves merging new lines, , with lines from the current map, . This process consists of first finding candidates for merging (lines 2223) and then computing the updated parameters and covariances (lines 2425). Note that the mapping from new lines to existing lines may be onto, but without loss of generality we consider merging lines pairwise. Because our parameterization uses endpoints, lines which ought to be merged may not have endpoints near one another. So, we project and from onto , resulting in and , respectively. and are merged if they pass the chisquared test:
(5) 
where , and is given by a linear interpolation of the covariance estimates for and determined by where is projected along .
We would like our merged LTFs and their uncertainty measures to remain faithful to the entire observation history. However, storing every observation is infeasible. Instead, our method implicitly stores observation histories using decoupled scatter matrices [3], reducing the process of merging lines with potentially millions of supporting observations to a single matrix addition.
The orientation of the new line is found via eigenvalue decomposition of the associated scatter matrix, and new endpoints are found by projecting the endpoints from the original lines onto the new line and taking the extrema. Thus, after executing
Map Update(), we have a set of vectors in corresponding to LTFs.Table I displays the major parameters and physical constants needed for longterm vector mapping.
Name  Symbol  Domain  Our Value 

DF Threshold  
STF Threshold  
Line Merge Criteria  
Sensor Noise Threshold  meters  
RANSAC Inlier Criteria  meters  
Line Fit Convergence  meters  
SDF Max Value  meters 
Viii Results
To demonstrate the effectiveness of our algorithm we mapped 4 different environments, including standard data. Data was also collected from three separate environments, in addition to the MIT Reading Room: AMRL, Wallocclusion, and Hallwayocclusion, using a mobile robot platform and a Hokuyo UST10LX laser range finder. The AMRL, Wall, and Hall datasets consist of 8, 5, and 5 deployments, respectively. MIT Reading Room contains 20 deployments. Each deployment contains hundreds of scans, corresponding to hundreds of thousands of observations. Datasets are intended to display a high amount of clutter and variability, typical of scenes mobile robots need to contend with.
The MIT and AMRL data sets are intended to test the accuracy of our algorithm over larger numbers of deployments. Both rooms contain multiple doors, walls of varying lengths, and achieve many different configurations, testing our ability to accurately identify LTF positions. The Wall and Hallwayocclusion datasets are intended to measure our algorithm’s robustness in environments where LTFs are heavily occluded.
Quantitatively we are concerned with accuracy and robustness, defining accuracy as pairwise feature agreement, where interfeature distances are preserved, and featureenvironment correspondance, where vectors in the map correspond to LTFs in the environment. Vectors should also be absent from areas where there are no longterm features such as doorways. Metric ground truth is established by either measuring wall lengths or handfitting the parts of the data we know correspond to LTFs. Robustness refers to a map’s ability to deal with large numbers of STFs and lack of degradation over time.
Over all datasets, our approach correctly identifies all 7 doorways (4 in MIT, 2 in AMRL, 1 in Hallway), and correctly ignores all 73 STFs. Using the AMRL and MIT datasets, we find the average difference between pairwise feature separation in the generated map versus ground truth to be on the order of 2cm. Our line extraction method yields MSE values in the order of meters, while marching squares yields a MSE of roughly , about three times as large. Additionally, marching squares yields over features while LTVM maintains on the order of LTFs. Furthermore, the maps do not degrade over the timescales tested in this paper, with no noticeable difference in average pairwise feature disagreement between maps generated after one deployment and those considering all deployments. Fig. 5 displays qualitative results for all environments.
Ix Conclusion
In this paper, we introduced an SDFbased approach to filter laser scans over multiple deployments in order to build and maintain a longterm vector map. We experimentally showed the accuracy and robustness of the generated maps in a variety of different environments and further evaluated the effectiveness of longterm vector mapping compared to more standard, occupancy grid techniques. Currently this approach processes observations in batch post deployment, but modifying the algorithm to run in an online setting seems an appropriate next step.
X Acknowledgements
The authors would like to thank Manuela Veloso from Carnegie Mellon University for providing the CoBot4 robot used to collect the AMRL datasets, and to perform experiments at University of Massachusetts Amherst.
References
 [1] Temporal occupancy grids: a method for classifying the spatiotemporal properties of the environment. In IROS, 2002., Cited by: §I, §II.
 [2] (2005) Dynamic maps for longterm operation of mobile service robots. In RSS, Cited by: §I, §II.
 [3] (2012) Planar polygon extraction and merging from depth images. In IROS, Cited by: item 4, §VII.
 [4] (2014) Episodic nonmarkov localization: reasoning about shortterm and longterm features. In ICRA, Cited by: §I, §III.
 [5] (2002) Towards object mapping in nonstationary environments with mobile robots. In IROS, Cited by: §I, §II.
 [6] (2007) Autonomous navigation in dynamic environments. pp. 3–28. Cited by: §I, §II.
 [7] (2013) Realtime camera tracking and 3d reconstruction using signed distance functions. In RSS, Cited by: §IVA.
 [8] (1985) Position referencing and consistent world modeling for mobile robots. In Robotics and Automation., Cited by: §II.
 [9] (1996) A volumetric method for building complex models from range images. In ACM, Cited by: §I.
 [10] (1989) Using occupancy grids for mobile robot perception and navigation. Computer. Cited by: §II.
 [11] (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. ACM. Cited by: item 2.
 [12] (2002) Map building with mobile robots in populated environments. In IROS, Cited by: §II.
 [13] (2014) Spectral analysis for longterm robotic mapping. In ICRA, Cited by: §I, §II.
 [14] (1989) A Bayesean method for certainty grids. In AAAI Spring Symposium on Robot Navigation, Cited by: §II.
 [15] (2003) Weighted line fitting algorithms for mobile robot map building and efficient data representation. In ICRA, Cited by: §VI.
 [16] Independent markov chain occupancy grid maps for representation of dynamic environment. In IROS, 2012, Cited by: §I, §II.
 [17] (2013) Lifelong localization in changing environments. IJRR. Cited by: §I, §II.
 [18] (2012) Dynamic pose graph slam: longterm mapping in low dynamic environments. In IROS, Cited by: §I, §II.
 [19] (2005) Mobile robot simultaneous localization and mapping in dynamic environments. Autonomous Robots. Cited by: §I, §II.
 [20] (2000) Line segment based map building and localization using 2d laser rangefinder. In ICRA, Cited by: §II.