Linear Features Observation Model for Autonomous Vehicle Localization

02/28/2020 ∙ by Oleg Shipitko, et al. ∙ Institute for Problems of Information Transmission 0

Precise localization is a core ability of an autonomous vehicle. It is a prerequisite for motion planning and execution. The well-established localization approaches such as Kalman and particle filters require a probabilistic observation model allowing to compute a likelihood of measurement given a system state vector, usually vehicle pose, and a map. The higher precision of the localization system may be achieved through the development of a more sophisticated observation model considering various measurement error sources. Meanwhile model needs to be simple to be computable in real-time. This paper proposes an observation model for visually detected linear features. Examples of such features include, but not limited to, road markings and road boundaries. The proposed observation model depicts two core detection error sources: shift error and angular error. It also considers the probability of false-positive detection. The structure of the proposed model allows precomputing and incorporating the measurement error directly into the map represented by a multichannel digital image. Measurement error precomputation and storing the map as an image speeds up observation likelihood computation and in turn localization system. The experimental evaluation on real autonomous vehicle demonstrates that the proposed model allows for precise and reliable localization in a variety of scenarios.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

The ability to precisely estimate ego-pose is a core ability of any mobile robot. The motion planning and execution are obstructed by imprecise positioning. The localization problem has been investigated extensively by the research community over the last decades. Visual localization is a localization technique based on data received from visual sensors. Throughout many years it is still one of the core topics in robotics 

[15, 4, 17].

The modern visual localization methods are mostly based on detection and matching of surrounding environment features to retrieve motion parameters between images and determining the current camera pose. Among visual features are distinguished global (characterizing the whole image) [22, 19] and local ones (characterizing small region of the image) [21, 25]. Global features are known to be less robust against view angle changes and occlusions. On the contrary local features are robust to different image variations. The local features include points [14, 12], line segments [6, 18], contours [23] and even objects [3].

Fig. 1: Prior map represented by a multichannel digital image. Each channel encodes a part of a probabilistic observation model.

While point features observation models are widely studied [20], there is no rigorous study of the uncertainty sources of linear features visual detection. Although the positioning based on lines detection has a long history [10], the need of probabilistic observation model for linear features is reinforced by the growing interest to the self-driving vehicles, since the most obvious positioning data source on the road is road markings which can be represented as a set of linear features.

By the 2000s the probabilistic robotics became a dominating paradigm in robotics. Probabilistic localization methods are based on the fusion of data received from several sources and account for both motion and measurement uncertainties [27]

. Classical probabilistic data fusion algorithms such as Kalman filter 

[28] and particle filter (Monte-Carlo localization) [2] require a priory known model of the measurement error distribution. In other words, they use a probabilistic model of sensor measurement (i.e. observation model) allowing to estimate the probability of measurement given the robot pose. It has been long understood that the localization precision is mostly conditioned on the quality of the observation model.

In this work, we propose a probabilistic observation model of visually detected linear features. The proposed model accounts for several major types of measurement uncertainties: shift error, angular detection error, and false-positive detection. We show that the structure of the proposed model allows for efficient precomputation of measurements errors probabilities which can be directly incorporated into the a priori known map. This significantly speeds up the measurement model computation and allows for real-time localization implementation. By experimental evaluation on real vehicles, we demonstrate that the proposed model provides high localization precision in a variety of different localization scenarios.

Ii Related Work

Road markings contain essential information used by human drivers for decision making and road situation analysis. It has been long understood that road marking could be used for autonomous vehicle navigation as well. Hillel et al. [7] provide an extensive overview of lane detection techniques, their limitations, and applications.

The probabilistic observation model of road markings detector is proposed in [9]. The authors proposed a model accounting for detection shift error. The authors also conducted an experiment evaluating the influence of unpredicted vehicle orientation changes onto the measurement error. They incorporated an error caused by the vehicle orientation changes into the model by increasing the measurement uncertainty according to the magnitude of orientation deviation.

In Bertha (autonomous car made by Daimler AG [29]) road markings along with point features were used for vehicle localization. The proposed observation model accounts for the decrease of spatial detection precision with the growth of distance between camera and detected road marking.

Monocular visual localization in an urban environment using a particle filter is presented in [13]

. In this work, the source of data for localization is the orientation of linear features obtained by analyzing the position of the vanishing point in an image. Detected linear features are compared with the map to estimate the current position of the robot. An exponential heuristic function weighting detected and reference lines misalignment is used to compute measurement likelihood.

An alternative approach is used in works [5, 1]. The authors estimate vehicle position and orientation relative to the detected road markings and compute hypothesis weight as a sum of Gaussian error distributions centered in position and orientation estimates.

There are also works where localization is considered as optimization problem [16]. Although in this case a probabilistic measurement model is not required, the evaluation results show that this approach can provide necessary localization precision, however, it is less robust in complex environment with partial or full road markings occlusions.

Iii Linear Features Measurement Model

Let us first introduce a concept of a linear feature. Linear feature is any curve detected on image and represented by a set of equidistant points . There are two types of linear features. A step-like image signal change is referred to as edge. A spike-like image change is called a ridge. Examples illustrating the difference between edges and ridges are shown in Fig. 2.

Fig. 2: Linear feature examples. (a) Image edge: profile (on top), synthetic pixels row (middle), real world example (bottom). (b) Image ridge: profile (on top), synthetic pixels row (middle), real world example (bottom).

The set of all curves detected on a single image is referred to as a measurement, . The observation model estimates likelihood of a measurement given the vehicle pose and the map . The map in our case is represented by a multichannel digital image containing ground-truth positions of all lines that could be possibly detected (see Fig. 1).

The observation model accounts for different sources of measurement errors. Classical models for linear features only account for the shift typically exploiting the Gaussian noise model [9]. However, it can be shown that such a model is prone to bias since it does not account for angular misalignment of reference and measured features. Therefore a slightly shifted but correctly oriented line can receive lower likelihood than the line which orientation and shift both estimated incorrectly. An example of this effect is shown in Fig. 3.

Fig. 3: Example of classical model bias. Due to the sampling the incorrect measurement (red) receives higher probability than the correct but slightly shifted one (green).

To account for the orientation and overcome such a bias we propose a measurement model that consists of two parts:


where is a number of detected curves,

is a probability density function (PDF) of shift errors measuring how probable the line segment depending on how far it is shifted relative to the expected one and

is a PDF of angular errors showing how probable the detected line is depending on its deviation from the expected orientation, is a normalization coefficient.

We assume a shift error to be a 2D Gaussian function with the mean centered in the real line position. In reality, the detection process usually produces a Gaussian noise of the detected line ends positions, so our model overestimates the noise. However, such an assumption significantly simplifies computation and allows for noise to be precomputed as it will be shown in the next section. Apart from the spatial detection noise, we account for false positives detection i.e. detection of lines when in reality there are none. To account for that we introduce additional uniform probability distribution of a random detection (see Fig. 



where is a number of points approximating line, and are the and components of the vector between point in map reference frame and the closest point of the closest reference line,

is a parameter corresponding to the uniform distribution accounting for the false positive detection.

Fig. 4:

Measurement shift probability distribution: (a) normal distribution, (b) uniform distribution, (c) normal + uniform. Color corresponds to the measurement probability.

To account for angular detection error the additional PDF is introduced. The likelihood of the detected line segment depends on the angle between true line orientation and orientation of the detected segment. The angular noise is assumed to be Gaussian with zero mean:


where is a number of points approximating line.

In the case of the multi-camera system the likelihoods computed for separate cameras are multiplied in order to estimate total measurement likelihood:


where is a camera index.

Iv Map Representation

A digital map is required to compare the sensory data with the expected environment observation. The map is represented by a multichannel raster image where each channel contains different map-related information. The first channel stores the initial map of linear features used for the localization. It can be a map of road markings (see Fig. 1, top), schematic map of the building or map of any other linear features type.

The proposed measurement model allows the precomputing of the measurement model beforehand. The map stores precomputed components of the measurement model ( and ) in two separate map image channels.


related channel is obtained from the initial map via Gaussian smoothing – by convolving the original map image with 2D Gaussian kernel. The size of the kernel and standard deviation

depend on the detector shift noise, estimated experimentally and the map scale (). The uniform distribution is simply added to each pixel after Gaussian smoothing (see Fig. 1, left). Therefore, computing the equation 2 comes down to the summation of values stored in pixels, corresponding to the detected linear segments, transformed to the map reference frame.

The related channel is required to speed up computation of angular part of measurement model (3). The angle can be estimated if the shortest distances between ends of the detected segment and closest map segment are known. Such a distance can be precomputed. In image processing, there is a morphological operation known as distance transform [24]. The result of the transform is a gray-scale image where each pixel stores a distance to the closest boundary (see Fig. 1, middle). Given the distances and of a line segment ends to the closest reference line, the absolute angular misalignment is estimated as:


where – the length of the line segment.

Additional map channel represents occupancy grid map – information about the areas where car presence is possible (see Fig. 1, right). This information allows discarding infeasible hypotheses, therefore, decreasing the computational complexity of the localization system.

Storing digital map in memory as a multichannel raster image allows accessing any location on the map in constant time given that the area is limited. Such representation can also easily be extended with new prior information which might be added to an additional image channel.

V Localization System Overview

To evaluate the proposed measurement model the particle filter is used [2]. It approximates the posterior distribution of the system state vector

at the time moment

with a set of hypotheses (particles) with associated weights proportional to their likelihood estimations , where , – total number of particles. Every particle is a hypothesis of the current vehicle pose , where and are the 2D vehicle coordinates in the map reference frame at the time , and is a corresponding yaw angle computed relative to the axis of the map coordinate system (Fig. 5). Unlike other localization methods, particle filter is able to handle non-linear models and multimodal probability distributions.

Fig. 5: Car pose in map coordinate system.

We used a systematic resampling [11] at the resampling step of the particle filter. It outperforms other sampling approaches in terms of quality and computational complexity [8].

V-a Motion Model

Each filter iteration starts with the particles poses prediction based on the relative odometry measurements. Generally, proprioceptive sensors cannot provide reliable positioning on their own since their estimations are prone to the drift over time. Therefore only relative odometry measurements are applied to particles. To account for motion noise (e.g. imperfect actuators and odometry measurements) additive Gaussian noise with zero mean is applied both to the pose and orientation of each particle. Given the difference of two consecutive odometry measurements actuation model can be expressed as follows:


where , , and represent Gaussian noise with zero mean. Standard deviations of such distributions depend on the particular sensors set used for odometry.

V-B Measurement Model

To estimate likelihood (i.e. weight) of each particle the proposed measurement model is used. Detected relative to the vehicle pose linear features are approximated by polylines with constant segment length in the image reference frame. Polylines then transformed into the map reference frame for each particle individually, taking into account particle current pose on the map. Then the observation model (1) is applied to estimate the likelihood of each particle.

In order to increase localization quality, prior information in the form of an occupancy grid map is used. It limits the area where particle presence is possible and therefore increases localization precision since only probable particles are considered. The likelihood function for the occupancy grid map is represented below:


The particle likelihood computed according to (1) is multiplied by the . The estimated particles weights are then normalized:


The resulting pose is computed as a weighted average of all particles.

Vi Experimental Setup

Fig. 6: The autonomous logistic vehicle used for measurement model evaluation: (a) sensors placement, (b) cameras field of view.
Fig. 7: Experimental results: (a) test route and road marking map, (b) errors dependence over time.

This section details the results of the experimental evaluation of the proposed measurement model. It was evaluated in an outdoor environment on an autonomous logistic vehicle. The vehicle uses four cameras to detect road markings as shown in Fig. 6. Fig. 6(a) demonstrates prior lane markings map of test site used to test localization performance as well as the test route which includes turns, straight and backward movement as well as a roundabout. The trajectory obtained with REACH RS+ RTK GNSS was used as a ground truth. To detect linear feature the Hough transform-based algorithm similar to the one described in [26] was used. Both linear features detector and particle filter algorithm were implemented as C++ code and were running on the PC with Intel Core i7 CPU and 8 Gb RAM. No parallel processing was used. The number of particles used in all experiments was equal to .

The experiments were conducted to compare different observation models. The proposed model (shift+angular) was compared to the models accounting only for the shift error (shift) and only for angular error (angular). All numerical results were averaged over test runs.

Vii Experimental Results

The results are demonstrated in Table I. It can be seen that the localization with the proposed model is superior to the models only considering shift probability or angular misalignment probability. With the use of a combined model, the maximum absolute longitudinal error was decreased on in comparison with the model only considering measurement shift, maximum angular error on (Mean Absolute Error (MAE) on ). However, it needs to be noted that while lateral MAE was decreased on , the maximum lateral error was increased from m to m (on ).

Fig. 6(b) depicts localization errors dependence on time for one of the test runs. It can be noted how lateral error growth on the straight segments of the route where there are no features for longitudinal correction (segments C, E). However, the error is immediately corrected once the distinctive features are observed – road crossing between C and D (approximately th seconds of the test run), turn in the middle of segment E. It also can be seen from Fig. 6(b) that the angular error is positive most of the time. This bias can be due to the slightly wrong calibration of cameras or constant inclination of the car itself.

Averaged over test runs distributions of longitudinal, lateral and angular errors for the proposed observation model are presented in Fig. 8. It can be seen that the longitudinal error distribution is much wider. It leads to the conclusion that while lateral positioning can be efficiently performed based only on linear features the robust and precise longitudinal positioning requires additional sources of localization data. Such sources can be represented by other types of sensors or e.g. point features in case of visual localization. Fig. 8 also supports the hypothesis of bias in angular error due to some source of a static error.

Model Longitudinal [m] Lateral [m] Angular [rad]
shift + angular
TABLE I: Measured localization performance. Different rows correspond to different measurement models. Max stands for maximum error. MAE stands for Mean Absolute Error.
Fig. 8: Localization errors distributions.

To analyze computational load introduced with a model modification that considers angular misalignment we performed an algorithm computation time profiling. The results are demonstrated in Fig. 9. It can be seen that the model component corresponding to angular error takes twice the time of shift component to be computed. Nevertheless, the whole particle filter step including measurement model computation is very fast (can run at Hz) and can be further accelerated by parallelized computations.

Fig. 9: Time profiling of the single particle filter iteration. Note, the percentage marked with asterisk symbol is percentage form the observation model, not the whole particle filter iteration. The whole iteration takes on average 0.012 seconds.

Viii Conclusion

In this work, the probabilistic observation model for visually detected linear features is proposed. The proposed model can be used in localization algorithms such as particle filter and Kalman filter. The proposed model accounts for three measurement error sources: detection shift error, orientation error, and false-positive detection. The model overcomes known drawback of classical models – their bias towards incorrect orientation estimation. The structure of the model allows to directly incorporate a precomputed measurement model into the map which speeds up the localization algorithm.

The proposed model was evaluated on a real autonomous vehicle. Experimental results have demonstrated the superiority of the proposed model over the traditional model considering only a shift between expected and obtained measurements. We also evaluated the computational load required by the proposed extended model. We showed that although it doubles computational requirements the proposed model still can be efficiently computed even on the low-end processing unit.

The future work may include the expansion of the proposed model to incorporate external error sources (e.g. increasing detection uncertainty with the increasing detection distance or uneven road surface) and internal error sources (e.g. unpredicted vehicle motions or error model of particular underlying linear features detection algorithm). The future work also includes the application of the proposed model to other real-life localization scenarios such as indoor localization using linear features.


  • [1] F. Chausse, J. Laneurit, and R. Chapuis (2005) Vehicle localization on a digital map using particles filtering. In IEEE Proceedings. Intelligent Vehicles Symposium, 2005., pp. 243–248. Cited by: §II.
  • [2] F. Dellaert, D. Fox, W. Burgard, and S. Thrun (1999) Monte carlo localization for mobile robots. In ICRA, Vol. 2, pp. 1322–1328. Cited by: §I, §V.
  • [3] D. Fassbender, M. Kusenbach, and H. Wuensche (2015) Landmark-based navigation in large-scale outdoor environments. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4445–4450. Cited by: §I.
  • [4] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J. M. Rendón-Mancha (2015) Visual simultaneous localization and mapping: a survey. Artificial Intelligence Review 43 (1), pp. 55–81. Cited by: §I.
  • [5] F. Ghallabi, F. Nashashibi, G. El-Haj-Shhade, and M. Mittet (2018) Lidar-based lane marking detection for vehicle positioning in an hd map. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2209–2214. Cited by: §II.
  • [6] I. Heisterklaus, N. Qian, and A. Miller (2014)

    Image-based pose estimation using a compact 3d model

    In 2014 IEEE Fourth International Conference on Consumer Electronics Berlin (ICCE-Berlin), pp. 327–330. Cited by: §I.
  • [7] A. B. Hillel, R. Lerner, D. Levi, and G. Raz (2014) Recent progress in road and lane detection: a survey. Machine vision and applications 25 (3), pp. 727–745. Cited by: §II.
  • [8] J. D. Hol, T. B. Schon, and F. Gustafsson (2006) On resampling algorithms for particle filters. In Nonlinear Statistical Signal Processing Workshop, 2006 IEEE, pp. 79–82. Cited by: §V.
  • [9] K. Jo, Y. Jo, J. K. Suhr, H. G. Jung, and M. Sunwoo (2015) Precise localization of an autonomous car based on probabilistic noise models of road surface marker features using multiple cameras. IEEE Transactions on Intelligent Transportation Systems 16 (6), pp. 3377–3392. Cited by: §II, §III.
  • [10] P. Kahn, L. Kitchen, and E. M. Riseman (1990) A fast line finder for vision-guided robot navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (11), pp. 1098–1102. Cited by: §I.
  • [11] G. Kitagawa (1996) Monte carlo filter and smoother for non-gaussian nonlinear state space models. Journal of computational and graphical statistics 5 (1), pp. 1–25. Cited by: §V.
  • [12] T. Krajník, J. P. Fentanes, J. M. Santos, and T. Duckett (2017) Fremen: frequency map enhancement for long-term mobile robot autonomy in changing environments. IEEE Transactions on Robotics 33 (4), pp. 964–977. Cited by: §I.
  • [13] K. Y. K. Leung, C. M. Clark, and J. P. Huissoon (2008) Localization in urban environments by matching ground level video images with an aerial image. In 2008 IEEE International Conference on Robotics and Automation, pp. 551–556. Cited by: §II.
  • [14] D. G. Lowe (2004) Distinctive image features from scale-invariant keypoints.

    International journal of computer vision

    60 (2), pp. 91–110.
    Cited by: §I.
  • [15] S. Lowry, N. Sünderhauf, P. Newman, J. J. Leonard, D. Cox, P. Corke, and M. J. Milford (2015) Visual place recognition: a survey. IEEE Transactions on Robotics 32 (1), pp. 1–19. Cited by: §I.
  • [16] Y. Lu, J. Huang, Y. Chen, and B. Heisele (2017) Monocular localization in urban environments using road markings. In 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 468–474. Cited by: §II.
  • [17] Y. Lu, Z. Xue, G. Xia, and L. Zhang (2018) A survey on vision-based uav navigation. Geo-spatial information science 21 (1), pp. 21–32. Cited by: §I.
  • [18] B. Morago, G. Bui, and Y. Duan (2016) 2D matching using repetitive and salient features in architectural images. IEEE Transactions on Image Processing 25 (10), pp. 4888–4899. Cited by: §I.
  • [19] T. Naseer, W. Burgard, and C. Stachniss (2018) Robust visual localization across seasons. IEEE Transactions on Robotics 34 (2), pp. 289–302. Cited by: §I.
  • [20] N. Piasco, D. Sidibé, C. Demonceaux, and V. Gouet-Brunet (2018) A survey on visual-based localization: on the benefit of heterogeneous data. Pattern Recognition 74, pp. 90–109. Cited by: §I.
  • [21] X. Qu, B. Soheilian, E. Habets, and N. Paparoditis (2016) EVALUATION of sift and surf for vision based localization.. International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences 41. Cited by: §I.
  • [22] F. Radenović, G. Tolias, and O. Chum (2016)

    CNN image retrieval learns from bow: unsupervised fine-tuning with hard examples

    In European conference on computer vision, pp. 3–20. Cited by: §I.
  • [23] S. Ramalingam, S. Bouaziz, P. Sturm, and M. Brand (2010) Skyline2gps: localization in urban canyons using omni-skylines. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3816–3823. Cited by: §I.
  • [24] A. Rosenfeld and J. L. Pfaltz (1968) Distance functions on digital pictures. Pattern recognition 1 (1), pp. 33–61. Cited by: §IV.
  • [25] O. Saurer, G. Baatz, K. Köser, M. Pollefeys, et al. (2016) Image based geo-localization in the alps. International Journal of Computer Vision 116 (3), pp. 213–225. Cited by: §I.
  • [26] O. S. Shipitko, M. P. Abramov, A. S. Lukoyanov, E. I. Panfilova, I. A. Kunina, and A. S. Grigoryev (2019) Edge detection based mobile robot indoor localization. In Eleventh International Conference on Machine Vision (ICMV 2018), Vol. 11041, pp. 110412V. Cited by: §VI.
  • [27] S. Thrun, W. Burgard, and D. Fox (2005) Probabilistic robotics. MIT press. Cited by: §I.
  • [28] E. A. Wan and R. Van Der Merwe (2000) The unscented kalman filter for nonlinear estimation. In Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (Cat. No. 00EX373), pp. 153–158. Cited by: §I.
  • [29] J. Ziegler, H. Lategahn, M. Schreiber, C. G. Keller, C. Knoppel, J. Hipp, M. Haueis, and C. Stiller (2014) Video based localization for bertha. In Intelligent Vehicles Symposium Proceedings, 2014 IEEE, pp. 1231–1238. Cited by: §II.