The ability to drive inside a prescribed road lane, also known as lane following or lane centering, is central to the development of fully autonomous vehicles. It involves both perception and control, as the vehicle must first sense the surrounding environment and then act on the steering accordingly.
To plan the best possible trajectory, it is necessary to retrieve from the environment not only the position and shape of the line markings, but also the shape of the lane center, or centerline, and the pose of the vehicle relative to it. This is particularly useful in multi-lane roadways and in conditions adverse to GNSS (e.g., tunnels and urban canyons).
Although perception and control are strongly interconnected within this problem, the current literature is divided between works on lateral control, where the trajectory is assumed to be given and the focus is placed on the control models and their implementation, and works on perception, mostly centered on mere line detection. Most of the time, this task is performed only in image coordinates, and no line description in the world reference frame is ultimately provided.
Furthermore, the technology commercially available at the moment offers only assistance systems, which monitor the line position in the immediate proximity of the vehicle and are limited to either issuing a warning to the driver (lane departure warning system) or slightly acting on the steering (lane keeping assist) to momentarily adjust the trajectory, although the driver remains in charge of the vehicle at all times. Only a handful of more advanced commercial systems actually provide a lane following mechanism, and only in limited situations, such as in a traffic jam (Audi A8 Traffic Jam Pilot), when the vehicle is preceded by another car (Nissan ProPilot Assist), or when driving on limited-access freeways and highways (Autosteer feature in Tesla Autopilot).
What we propose in this paper is a perception system that enables full lateral control, capable not only of slightly correcting the trajectory, but also of planning and maintaining it regardless of the particular situation. To this end, we design our perception system to provide not only a mathematical description of the road lines in the world frame, but also an estimate of the shape and position of the lane centerline and a measurement of the relative pose (heading and lateral displacement) of the vehicle with respect to it.
The scarcity of similar works in the literature means that no related benchmarking data is publicly available. The published datasets in the literature on perception systems focus only on the mere line detection problem, and provide no line representation in world coordinates. In addition, most of these datasets do not contain sequential images or, if they do, the sequences are still not long enough to guarantee a fair evaluation of the system performance over time. No publicly available dataset reports a way to obtain a ground truth measure of the relative pose of the vehicle within the lane, which is crucial for a complete evaluation of our findings. For this reason, we collected the data required for the validation of our system ourselves, and we release this data as a further contribution of this work.
Indeed, to generate an appropriate ground truth and validate our work, full knowledge of the position of each line marking in the scene was required. For this reason, we performed our experiments on two circuit tracks that we could fully access to perform our measurements, an operation hardly possible on a road open to traffic. Although this might seem a simplified environment, the tracks chosen actually offer a wide variety of driving scenarios and can simulate situations from plain highway driving to conditions even more complicated than usual urban setups, making the experiments challenging and scientifically significant.
This paper is structured as follows. We first analyze the state of the art concerning line detection, with a particular interest in the models used to represent street lines. Then we proceed with an analysis of the requirements that these systems must satisfy to provide useful information to the control system. Next, in Section IV, we describe our pipeline for line detection and, in Section V, how quantities such as heading and lateral displacement are computed. Lastly, we introduce our dataset and analyze the accuracy of our algorithms against a recorded ground truth.
II Related Work
Lane following has been central to the history of autonomous vehicles. The first complete work dates back to 1996, when Pomerleau and Jochem developed RALPH and evaluated its performance with their test bed vehicle, the Navlab 5, throughout the highways of the United States. Other works, at this early stage, focused mostly on the theoretical aspects, developing new mathematical models for the lateral control problem.
In this context, traditional line detection systems can be generally described in terms of a pipeline of preprocessing, feature extraction, model fitting and line tracking. In recent years, learning methods have been introduced into this pipeline. Very common is the use of a Convolutional Neural Network (CNN) as a feature extractor, for its capability of classifying pixels as belonging or not to a line.
Finally, the output of these systems is a representation of the lines. In this regard, a distinction is made between parametric and non-parametric frameworks. The former include straight lines, used to approximate lines in the vicinity of the vehicle, and second- and third-degree polynomials, adopted to appropriately model bends, while the latter are mostly represented by splines, such as cubic splines, Catmull-Rom splines and B-snakes. While parametric models provide a compact representation of the curve, as needed for a fast computation of curve-related parameters, non-parametric representations can model more complex scenarios as they do not impose strong constraints on the shape of the road.
As the sole objective of line detection systems is to provide a mathematical description of the lines, any of the described line models is in principle equally valid. For this reason, all of these studies strongly rely on a Cartesian coordinate system as the most intuitive one. In the literature on lateral control, instead, the natural parametrization is preferred, as it intrinsically represents the curve shape in terms of quantities directly meaningful for the control model (e.g., heading, radius of curvature, etc.). In this regard, Hammarstrand et al. argue that models based on arc-length parametrizations are more effective at representing the geometry of a road. Yi et al. developed their adaptive cruise control following this same idea and discuss the improvements introduced by a clothoidal model.
Other works in lateral control typically focus on the control models adopted, mostly validating their findings on predefined trajectories. While this is mostly performed through computer simulations, Pérez et al. base their evaluations on DGPS measurements taken with a real vehicle. Ibaraki et al., instead, estimate the position of each line marking by detecting the magnetic field of custom markers previously embedded into the lines of their test track.
Only a few works incorporate line detection into their system, aiming at building a complete lane following architecture. In particular, Liu et al. first detect the line markings through computer vision and represent them in a Cartesian space, then reconstruct the intrinsic parameters needed for control. To remove this unnecessary complication, Hammarstrand et al. directly represent the detected lines within an intrinsic framework and are able to easily obtain those parameters. Their line detection system, however, relies not only on vision to detect the line markings, but also on radar measurements to identify the presence of a guardrail and exploit it to estimate the shape of the road.
In recent years, end-to-end learning approaches have also been proposed. Chen and Huang developed a CNN-based system able to determine the steering angle to apply to remain on the road. Meanwhile, Bojarski et al. presented DAVE-2, their deep end-to-end module for lateral control, trained with the images seen by a human driver together with the driver's steering commands, and able to drive the car autonomously for 98% of the time in relatively brief drives. Nonetheless, as mentioned for the line detection systems, strong arguments have been raised against their interpretability and, ultimately, their safety.
Our perception system improves the state of the art as it directly provides the quantities necessary in lateral control while relying exclusively on vision and exploiting a compatible road representation. Furthermore, an experimental validation is conducted on a real vehicle, considering different scenarios, driving styles and weather conditions.
III Requirements for Lateral Control
To properly define a lateral control for an autonomous vehicle, three inputs are essential:
vehicle lateral position with respect to centerline;
relative orientation with respect to the ideal centerline tangent;
roadshape (of the centerline) in front of the vehicle.
In prior work on model predictive motion planning, the road shape is described through third-order polynomials in a curvilinear abscissa framework centered on the current vehicle position. The most important advantage with respect to Cartesian coordinates is that each road characteristic can be described as a function of one parameter (i.e., the abscissa s), thus each function that approximates the lane center is at least surjective. This property is very important because it is retained along the whole optimization horizon in model predictive control approaches. Fig. 2 depicts an example of such a representation.
What is proposed in the following is a pipeline to compute the three parameters required by the control system: vehicle orientation, lateral offset and road shape, in an unknown scenario without the help of GPS data.
IV Line Detection
To estimate the required parameters, we need to acquire a representation of the lane lateral lines in the scene.
At first, we adopt a purpose-built CNN to determine which pixels in an acquired frame belong to a line marking. Our architecture, trained using the Berkeley DeepDrive Dataset, is based on U-Net, but with some significant changes to improve the network speed on low-power devices and allow predictions at 100 Hz. In particular, the depth is reduced to two levels, and the input is downscaled to 512×256 and converted to grayscale. With these changes, the network requires only 5 ms to process an image on our testing setup, a Jetson Xavier.
The obtained prediction mask is then post-processed through two stages. At first, we apply an Inverse Perspective Mapping (IPM) and project it into the Bird’s Eye View (BEV) space, where the scene is rectified and the shape of the lines reconstructed. In this space, then, the predictions are thresholded and morphologically cleaned to limit the presence of artifacts. The result is a binary mask in the BEV space, highlighting what we refer to as feature points.
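The post-processing stage can be illustrated with a minimal sketch. It assumes that the IPM is expressed as a 3x3 homography `H` obtained from camera calibration (not specified in the text), and it uses the removal of isolated pixels as a crude stand-in for the morphological cleaning. All function names and the threshold value are hypothetical.

```python
def apply_homography(H, u, v):
    """Map an image pixel (u, v) to BEV coordinates with a 3x3 homography H.

    H is assumed to come from camera calibration; it is not given in the text.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity (horizon)")
    return x / w, y / w


def threshold(mask, level=0.5):
    """Binarize a mask of per-pixel line probabilities (level is an assumption)."""
    return [[1 if p >= level else 0 for p in row] for row in mask]


def remove_isolated(binary):
    """Clear pixels with no active 8-neighbour (simplified artifact removal)."""
    rows, cols = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c]:
                continue
            neighbours = sum(
                binary[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ) - 1  # do not count the pixel itself
            if neighbours == 0:
                cleaned[r][c] = 0
    return cleaned
```

In practice, an image processing library would warp the whole mask at once and apply proper opening and closing operators; the sketch only makes the order of the operations explicit.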
Next, a feature point selection phase separates the points belonging to each road line of interest, discarding noisy detections at the same time. Algorithms for connected component extraction and clustering easily fail as soon as the detected points are slightly discontinuous, and usually have a strong computational demand. Therefore, we develop for this task a custom algorithm based on the idea that the lateral lines are likely well-positioned at the vehicle sides when looking in its close proximity. Once they are identified there, it is easier to progressively follow them as they move further away. Exploiting this concept, our window-based line following (WLF) algorithm searches for a line in the lower end of the image and then follows it upwards along its shape thanks to a mechanism of moving windows.
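The moving-window idea can be sketched as follows. This is not the authors' WLF implementation, whose details are not given here; it is a simplified illustration that assumes a binary BEV mask stored as a list of rows (row 0 farthest from the vehicle) and a known starting column near the bottom of the image. The function name, window sizes and stopping rule are all hypothetical.

```python
def follow_line(mask, start_col, win_half_width=2, win_height=2, min_points=1):
    """Collect the feature points of one line by sliding a window upwards.

    mask: binary BEV mask as a list of rows; the last row is closest to the vehicle.
    start_col: column where the line is expected at the bottom of the mask.
    Returns a list of (row, col) feature points assigned to the line.
    """
    rows, cols = len(mask), len(mask[0])
    center = start_col
    points = []
    bottom = rows
    while bottom > 0:
        top = max(0, bottom - win_height)
        lo = max(0, center - win_half_width)
        hi = min(cols, center + win_half_width + 1)
        found = [
            (r, c)
            for r in range(top, bottom)
            for c in range(lo, hi)
            if mask[r][c]
        ]
        if len(found) < min_points:
            break  # line lost: a recovery procedure could be started here
        points.extend(found)
        # Re-center the next window on the mean column of the points just found.
        center = round(sum(c for _, c in found) / len(found))
        bottom = top
    return points
```

Because each window only inspects a small neighborhood, detections far from the expected line position are ignored without any explicit clustering step.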
The line points collected are then passed to the fitting phase. Here each line is first temporarily fit to a cubic spline model to filter out the small noise associated with the detections while still preserving its shape. This model is however hard to manipulate further. To obtain a representation useful for lateral control, we propose to represent our line in a curvilinear framework, where each line is described by its heading as a function of the curvilinear abscissa s. The conversion of the modeled lines into this framework requires a few steps, as the transition is highly nonlinear and cannot be performed analytically. We first need to sample our splines, obtaining a set of points in Cartesian coordinates. Fixing then an origin on the first detected point, we measure the Euclidean distance between each point and its successor and the orientation of their connecting segment with respect to the x axis. For small increments, we can then approximate the curvilinear abscissa of each point by the cumulative sum of these distances, and the local heading by the orientation of the corresponding segment, obtaining a set of abscissa-heading pairs.
The main advantage obtained is that this set, while still related to our Cartesian curve, is now representable in the curvilinear space as a one-dimensional function,
which can be easily fit with a polynomial model, the final representation of our lateral lines.
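The conversion from sampled Cartesian points to the curvilinear representation can be sketched as below. The function name is hypothetical, and the sketch assumes that the samples are dense enough for the small-increment approximation to hold: the abscissa is accumulated from chord lengths and the heading is taken as the orientation of each chord.

```python
import math


def to_curvilinear(points):
    """Convert sampled (x, y) line points into (abscissa, heading) samples.

    The abscissa is the cumulative chord length from the first point; the
    heading is the orientation of each chord with respect to the x axis.
    Returns two lists with one entry per segment (len(points) - 1 entries).
    """
    s_values, headings = [], []
    s = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        s_values.append(s)
        headings.append(math.atan2(y1 - y0, x1 - x0))
        s += math.hypot(x1 - x0, y1 - y0)
    return s_values, headings
```

For points sampled on an arc of radius 10 m, for instance, the heading grows linearly with the abscissa with a slope close to 0.1 rad/m, which is the curvature of the arc. A polynomial can then be fit to these samples.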
As the last step of our algorithm, the temporal consistency of the estimated lines is enforced in several ways. The information from past estimates is used to facilitate the feature point selection. In particular, when a line is lost because no feature points are found within a window, we can start a recovery procedure that searches for more points in a neighborhood of where the line is expected to be.
A further addition is the introduction of the odometry measures to improve the forward model of the road lines. While we are driving on our lane we can see its shape for dozens of meters ahead. Thus, instead of forgetting it, we can exploit this information as we move forward, in order to model not only the road ahead of us, but also the tract we just passed. This is crucial to be able to model the road lines not only far ahead of the vehicle, but also and especially where the vehicle currently is. To do so, we only need a measurement of the displacement of the vehicle pose between two consecutive frames, which is simple to obtain from the encoders of the vehicle or other odometry sources. With this information, we can store the line points detected at each time step and, at the next step, project them backwards to reflect the motion of our vehicle, finally adding them to the new detections before the line is fitted. As we move forwards, more and more points are accumulated, representing regions of the road not observable anymore. To avoid storing too complex road structures then, we prune old portions of the road as we move away from them, maintaining only past line points within 5–10 meters from our vehicle.
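A minimal sketch of this accumulation step is given below. It assumes a vehicle frame with x pointing forward and y to the left, and an odometry increment expressed as a translation `(dx, dy)` and a yaw change `dyaw` of the vehicle between two frames, measured in the previous vehicle frame. These conventions and the function name are assumptions; the pruning distance follows the 5–10 m range mentioned in the text.

```python
import math


def propagate_points(points, dx, dy, dyaw, max_behind=10.0):
    """Re-express stored line points in the new vehicle frame and prune old ones.

    points: list of (x, y) in the previous vehicle frame.
    dx, dy, dyaw: vehicle motion between frames (assumed odometry convention).
    max_behind: points farther than this behind the vehicle are dropped.
    """
    cos_y, sin_y = math.cos(dyaw), math.sin(dyaw)
    moved = []
    for x, y in points:
        tx, ty = x - dx, y - dy          # undo the vehicle translation
        nx = cos_y * tx + sin_y * ty     # rotate by -dyaw into the new frame
        ny = -sin_y * tx + cos_y * ty
        if nx >= -max_behind:            # keep only the recent stretch of road
            moved.append((nx, ny))
    return moved
```

The propagated points would then be merged with the detections of the current frame before the line is fitted.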
While the literature is mostly oriented towards Bayesian filters (mostly KF and EKF) to track the model parameters, we adopt an alternative perspective. It is important to notice that Bayesian filters act directly on the parameters of the line after the fitting, and for optimal results they require external information about the motion of the vehicle. As our line detection system relies only on vision, we instead employ an adaptive filter based on the Recursive Least Squares (RLS) method. In particular, we design this filter to receive as input, at each time step, the set of line points observed in the respective frame. With these, its overall model estimate is updated following a weighted least squares scheme. Points enter the filter with full weight and lose importance as they age, their weight being exponentially reduced over time. For a cubic polynomial model such as ours, the model output is linear in the four polynomial coefficients. Assuming our real process to consist of a deterministic term, which we seek, and a stochastic one, which we want to remove, we can then incrementally update our model parameters as each new observation is received.
The main advantage of this approach is that no assumption is made on the behavior of the parameters; it is instead only the accumulation of line points through time that smooths the results. The recursive formulation then makes the computation fast and efficient.
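For illustration, the textbook form of an exponentially weighted RLS update for a cubic polynomial is sketched below. It is a standard formulation and may differ in notation and detail from the filter used by the authors. The forgetting factor `lam`, the initial covariance and all names are assumptions.

```python
def rls_update(theta, P, s, y, lam=0.98):
    """One exponentially weighted RLS step for y = a0 + a1*s + a2*s^2 + a3*s^3.

    theta: current coefficient estimate [a0, a1, a2, a3].
    P: current 4x4 inverse-correlation matrix (symmetric), as nested lists.
    s, y: new observation (abscissa and measured value).
    lam: forgetting factor in (0, 1]; older points are weighted by lam**age.
    Returns the updated (theta, P).
    """
    phi = [1.0, s, s * s, s ** 3]
    n = len(phi)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))  # a priori prediction error
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so phi^T P equals (P phi)^T and can be reused here.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P
```

A typical usage starts from `theta = [0.0] * 4` and a large diagonal `P`, then calls `rls_update` once per observed line point at each time step.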
V Centerline and Relative Pose Estimation
Given the representation of the lateral lines, it is important to model the lane centerline and the relative pose of the vehicle, measured in terms of its heading and lateral offset with respect to the centerline.
As no parallelism is enforced between the lateral lines, an analytical representation of the centerline is hard to find, but we can reconstruct its shape through some geometrical considerations and by exploiting the parametric representation adopted. In particular, we devise an algorithm to project the points from both lateral lines into the same plane, and we fit these points with a single model, equally influenced by both lines. In the best scenario, this would require each line point to be projected towards the center along the direction normal to the road. This projection, particularly impractical in Cartesian coordinates, is easily achieved in a parametric representation.
We assume, for the time being, that the lane has a fixed curvature. Moreover, considering the center of curvature of any road line, we also make the following assumptions:
the two lateral lines and the centerline share the same center of curvature;
the center of curvature varies smoothly.
With this setup, we can define the following procedure, to be repeated for both lateral lines. We refer the reader to Fig. 3 for a graphical interpretation of the quantities involved.
Compute the center of curvature of the line, using its heading and radius of curvature.
Find the line passing through the center of curvature and the first line point.
Find the line passing through the center of curvature and the second line point.
Compute the angle between these two lines.
Obtain the ratio .
Define for convenience
At this point, with the ratio in Equation (15), we can define a coordinate transformation from the lateral line to the centerline and vice versa, while remaining within the parametric framework:
Notice that, although we think of this projection in the Cartesian space, we only define a linear transformation in the parametric frame, aiming at rescaling each line model in order to make their shapes comparable. Although the assumptions made do not hold in general scenarios, this produces an acceptable approximation of the expected results.
With this procedure then, we are ultimately able to take all the points detected on both lines and collapse them onto the centerline.
As done for the lateral lines, the points can be fit with a cubic polynomial in the parametric frame and the result tracked through an RLS framework.
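One possible reading of this projection is sketched below, under the stated assumption that the lateral line and the centerline are concentric arcs. In that case, two rays from the common center of curvature cut arcs whose lengths are proportional to the respective radii, so the abscissa can be rescaled by the ratio of the radii while the heading is left unchanged. The function names, the sign convention (positive radius for a left turn) and the availability of the centerline radius are assumptions, not details taken from the text.

```python
import math


def center_of_curvature(point, heading, radius):
    """Center of curvature of a line at a point (radius > 0 for a left turn)."""
    return (point[0] - radius * math.sin(heading),
            point[1] + radius * math.cos(heading))


def subtended_angle(center, p1, p2):
    """Signed angle between the rays from the center through p1 and p2."""
    a1 = math.atan2(p1[1] - center[1], p1[0] - center[0])
    a2 = math.atan2(p2[1] - center[1], p2[0] - center[0])
    d = a2 - a1
    return math.atan2(math.sin(d), math.cos(d))  # wrap to (-pi, pi]


def rescale_to_centerline(samples, radius_line, radius_center):
    """Map (abscissa, heading) samples of a lateral line onto the centerline.

    Under the concentric-arc assumption the same angle is subtended on both
    curves, so only the abscissa is scaled by radius_center / radius_line.
    """
    ratio = radius_center / radius_line
    return [(ratio * s, heading) for s, heading in samples]
```

For example, a 5 m arc on an outer line of radius 50 m corresponds to a 4.8 m arc on a centerline of radius 48 m, since both subtend an angle of 0.1 rad.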
V-A Relative pose estimation
Given the centerline model, we notice that the heading of the vehicle is represented by the value of the model at a particular point, to be determined. Finding the exact point, however, is not simple, as we want to perform this measurement exactly along the line passing through the center of mass of the vehicle. As this requires us to pass from intrinsic to extrinsic coordinates, no closed-form formulas are available, and we have to solve a simple nonlinear equation. In particular, as illustrated in Fig. 4, we need to look for a line passing through the center of mass and crossing the centerline perpendicularly. Formally, we search for a value of the curvilinear abscissa, corresponding to a point along the centerline, at which this perpendicularity condition holds.
This can be easily done by translating these conditions into the corresponding geometrical equations and considering that, for parametric representations, the Cartesian coordinates of a point are obtained by integrating the cosine and sine of the heading along the curvilinear abscissa.
Once this point is found, the heading is given by the value of the centerline model at that point, and the lateral displacement by the distance between that point and the center of mass of the vehicle.
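The search for the perpendicular foot point can be sketched numerically as follows. The sketch assumes that the centerline is given as a polynomial for the heading as a function of the abscissa (coefficients in increasing order), that its origin is known in the vehicle frame, and that a bracketing interval for the solution is available. The solver (bisection), the midpoint integration rule, the sign conventions and all names are illustrative choices rather than the authors' method.

```python
import math


def heading_at(coeffs, s):
    """Evaluate the heading polynomial at abscissa s."""
    return sum(c * s ** i for i, c in enumerate(coeffs))


def centerline_point(coeffs, s, x0=0.0, y0=0.0, steps=200):
    """Cartesian point at abscissa s, integrating cos/sin of the heading."""
    x, y = x0, y0
    ds = s / steps
    for k in range(steps):
        th = heading_at(coeffs, (k + 0.5) * ds)  # midpoint rule
        x += math.cos(th) * ds
        y += math.sin(th) * ds
    return x, y


def relative_pose(coeffs, cm, s_lo, s_hi, x0=0.0, y0=0.0, iters=50):
    """Find the centerline point whose normal passes through cm.

    Returns (abscissa, centerline heading there, signed lateral offset),
    with the offset positive when cm lies to the left of the centerline.
    """
    def along_track(s):
        x, y = centerline_point(coeffs, s, x0, y0)
        th = heading_at(coeffs, s)
        # Projection of (cm - point) on the tangent: zero at the foot point.
        return (cm[0] - x) * math.cos(th) + (cm[1] - y) * math.sin(th)

    lo, hi = s_lo, s_hi
    if along_track(lo) * along_track(hi) > 0:
        raise ValueError("the interval does not bracket a perpendicular foot point")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if along_track(lo) * along_track(mid) <= 0:
            hi = mid
        else:
            lo = mid
    s_star = 0.5 * (lo + hi)
    x, y = centerline_point(coeffs, s_star, x0, y0)
    th = heading_at(coeffs, s_star)
    offset = -(cm[0] - x) * math.sin(th) + (cm[1] - y) * math.cos(th)
    return s_star, th, offset
```

Negative abscissas are handled as well, which matters when the centerline model also covers the stretch of road already passed by the vehicle.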
To maintain the temporal consistency of the vehicle pose, we set up an Extended Kalman Filter (EKF). We take as measurements the Cartesian positions of the two points where the lateral lines intersect the line through the center of mass perpendicular to the centerline (see Fig. 4), and maintain a state composed of the heading of the vehicle relative to the centerline, the signed normalized lateral displacement, and the width of the road. Notice that we formally split the lateral displacement into two quantities, measuring respectively the width of the lane and the relative (percentage) offset with respect to it. This is done on the one hand to simplify the definition of the measurement function and thus obtain faster convergence, and on the other hand to obtain the additional estimate of the lane width, potentially helpful in control. This allows us to produce approximate estimates even when the tracking of one of the two lateral lines is lost, as we can locally impose parallelism and project the detected line on the opposite side of the lane, allowing our system to be resilient for short periods of time.
Mathematically, the state space model representation of our system can be shown as:
where the measurement function is:
with center of mass of the vehicle.
VI Experimental Validation
All data for the tests have been acquired on the Aci-Sara Lainate (IT) racetrack and on the Monza Eni Circuit track (Fig. 5). The circuits present an optimal configuration for real street testing with long straights, ample radius curves and narrow chicanes.
The dataset is acquired using the fully instrumented vehicle shown in Fig. 6; images with resolution are recorded using a ZED stereo camera working at . The car trajectory and the line coordinates are registered using a Swiftnav RTK GPS.
The ground truth creation process requires mapping the desired area and retrieving the GPS coordinates of the lines. Then, the road centerline is calculated as the mean of the track boundaries and sampled to guarantee a point each . This value of avoids oversampling the GPS signals while ensuring the smoothness and accuracy of the road map. After that, third-order polynomials are derived at every along the centerline for the following meters. Using the experimental data collected, the lateral distance from the centerline is computed as the distance to the closest point of the centerline map. The relative angle with respect to the centerline is instead evaluated by approximating the centerline orientation with the tangent to the GPS data.
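The ground-truth computation described above can be sketched as follows, assuming that the centerline map is available as a list of densely sampled points in a local metric frame and that the vehicle position and yaw are given in the same frame. The function name and the sign convention (offset positive to the left of the centerline) are assumptions added for illustration; the tangent is approximated by a finite difference between neighboring map points.

```python
import math


def ground_truth_pose(position, yaw, centerline):
    """Lateral offset and relative heading of the vehicle w.r.t. a sampled centerline.

    position: (x, y) of the vehicle; yaw: vehicle heading in the map frame.
    centerline: list of (x, y) map points ordered along the driving direction.
    """
    n = len(centerline)
    i = min(
        range(n),
        key=lambda k: math.hypot(position[0] - centerline[k][0],
                                 position[1] - centerline[k][1]),
    )
    a = centerline[max(i - 1, 0)]
    b = centerline[min(i + 1, n - 1)]
    tangent = math.atan2(b[1] - a[1], b[0] - a[0])  # finite-difference tangent
    dx = position[0] - centerline[i][0]
    dy = position[1] - centerline[i][1]
    distance = math.hypot(dx, dy)
    side = math.cos(tangent) * dy - math.sin(tangent) * dx  # > 0 on the left
    lateral = math.copysign(distance, side)
    d = yaw - tangent
    relative_heading = math.atan2(math.sin(d), math.cos(d))  # wrap to (-pi, pi]
    return lateral, relative_heading
```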
For the tests, we recorded multiple laps at different speeds, from up to , and with different driving styles. In particular, we considered three different trajectories (Fig. 7). The first is in the middle of the road, representing the optimal trajectory of the vehicle. The second is oscillating, with significant heading changes, up to , to stress the limits of the algorithm, with particular focus on the heading value. The last follows the racing line, often close to one side of the track and on the curbs, to better examine the computed lateral offset. Moreover, the recordings were performed in different weather scenarios, some on a sunny day, others with rain. This guarantees approximately one hour of recordings on two race tracks, one with length and one with extension, for a total of , with multiple driving styles and mixed atmospheric conditions. With the data described, we evaluate the performance of our system in delivering the necessary information for lateral control, i.e., the relative pose of the vehicle (, ). The estimation is performed on four rides (two on each available track), covering three driving styles. To compare the results with the ground truth, we measure the mean absolute error over the entire experiment, considering only the frames where an estimate was available. The relative number of frames for which this happens () is also considered as an interesting metric. The results are reported in Table I. For further reference, the behavior of our estimates over time for the most significant experiments is presented in Fig. 8.
|Experiment||Availability (%)|
|Driving style: centered|
|Track A - Trajectory 1||99.71|
|Track B - Trajectory 1||99.91|
|Driving style: oscillating|
|Track A - Trajectory 2||100.00|
|Driving style: racing|
|Track B - Trajectory 2||95.92|
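The two evaluation metrics, the mean absolute error over the frames with an estimate and the percentage of frames for which an estimate is available, can be computed as in the following sketch. It assumes that frames without an estimate are marked with `None`; the function name and data layout are hypothetical.

```python
def evaluate(estimates, ground_truth):
    """Mean absolute error on available frames and availability percentage.

    estimates: per-frame estimates, with None where no estimate was produced.
    ground_truth: per-frame reference values, same length as estimates.
    """
    pairs = [(e, g) for e, g in zip(estimates, ground_truth) if e is not None]
    availability = 100.0 * len(pairs) / len(ground_truth)
    if not pairs:
        return float("nan"), availability
    mae = sum(abs(e - g) for e, g in pairs) / len(pairs)
    return mae, availability
```

Reporting the two numbers together matters, because an error computed only on available frames could otherwise hide long periods in which the system produced no estimate.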
From these experiments, we observe that the system is able to provide an accurate estimate of the data required for lateral control, while maintaining a high availability. Indeed, the errors registered for the lateral offset account for only of the lane width, which lies between 9 and 12 meters for the tracks considered, while the errors in the heading, of about , are comparable to the ones experimentally obtained using an RTK GPS. Furthermore, the error values remain considerably low also in non-optimal scenarios (Track A Trajectory 2, Track B Trajectory 2), where the vehicle follows a path considerably different from a normal driving style.
In this paper, we propose a perception system for the task of estimating lateral control parameters, relying only on vision. This system is able to detect the lateral lines on the road and use them to estimate the lane centerline and the relative pose of the vehicle in terms of heading and lateral displacement. As no benchmark is publicly available, a custom dataset is collected and made openly available for future research. The results obtained indicate that the proposed system can achieve high accuracy in different driving scenarios and weather conditions. The retrieved values are indeed comparable to those calculated by a state-of-the-art RTK GPS, while compensating for its shortcomings.
- (2019-11) 2020 Ford Fusion owner’s manual. Ford Motor Company (English).
- (2008) Real time detection of lane markers in urban streets. In 2008 IEEE Intelligent Vehicles Symposium, pp. 7–12.
- Non-linear MPC motion planner for autonomous vehicles based on accelerated particle swarm optimization algorithm. In 2019 International Conference of Electrical and Electronic Technologies for Automotive.
- (2016-07) (Website) Car and Driver.
- (1998) GOLD: a parallel real-time stereo vision system for generic obstacle and lane detection. IEEE Transactions on Image Processing 7 (1), pp. 62–81.
- (2015) Lane departure warning system based on Hough transform and Euclidean distance. In 2015 Third International Conference on Image Information Processing (ICIIP), pp. 370–373.
- (2016) End to end learning for self-driving cars. arXiv:1604.07316.
- (1994) Exponentially weighted least squares identification of time-varying systems with white disturbances. IEEE Transactions on Signal Processing 42 (11), pp. 2906–2914.
- Efficient road lane marking detection with deep learning. In 2018 IEEE 23rd International Conference on Digital Signal Processing.
- (2017) End-to-end learning for lane keeping of self-driving cars. In 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 1856–1860.
- (2002) Lateral control of autonomous vehicle by yaw rate feedback. KSME International Journal 16 (3), pp. 338–343.
- (2014) Lane departure identification for advanced driver assistance. IEEE Transactions on Intelligent Transportation Systems 16 (2), pp. 910–918.
- (2016) Long-range road geometry estimation using moving vehicles and roadside observations. IEEE Transactions on Intelligent Transportation Systems 17 (8), pp. 2144–2158.
- (2014) Recent progress in road and lane detection: a survey. Machine Vision and Applications 25 (3), pp. 727–745.
- (2005) Design of Luenberger state observers using fixed-structure H∞ optimization and its application to fault detection in lane-keeping control of automated vehicles. IEEE/ASME Transactions on Mechatronics 10 (1), pp. 34–42.
- (2019-10) Jeep Renegade owner handbook. FCA Italy S.p.A (English).
- (2015) Efficient lane detection based on spatiotemporal images. IEEE Transactions on Intelligent Transportation Systems 17 (1), pp. 289–295.
- (2008) Robust lane detection and tracking in challenging scenarios. IEEE Transactions on Intelligent Transportation Systems 9.
- (2017) Autonomous vehicle safety: an interdisciplinary challenge. IEEE Intelligent Transportation Systems Magazine 9 (1), pp. 90–96.
- (2004) Springrobot: a prototype autonomous vehicle and its algorithms for lane detection. IEEE Transactions on Intelligent Transportation Systems 5 (4), pp. 300–308.
- (1993-12) Estimator and controller design for LaneTrak, a vision-based automatic vehicle steering system. In Proceedings of 32nd IEEE Conference on Decision and Control, Vol. 2, pp. 1868–1873.
- (2007) Development of an interactive lane keeping control system for vehicle. In 2007 IEEE Vehicle Power and Propulsion Conference, pp. 702–706.
- (2018) A review of recent advances in lane detection and departure warning system. Pattern Recognition 73, pp. 216–234.
- (2017) The Mapillary Vistas dataset for semantic understanding of street scenes. In International Conference on Computer Vision (ICCV).
- (2019) Potential safety benefits of lane departure prevention technology. IATSS Research 43 (1), pp. 21–26.
- (2011) Cascade architecture for lateral control in autonomous vehicles. IEEE Transactions on Intelligent Transportation Systems 12 (1), pp. 73–82.
- (1996) Rapidly adapting machine vision for automated vehicle steering. IEEE Expert 11, pp. 19–27.
- (2015) U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241.
- Spatial as deep: spatial CNN for traffic scene understanding. In AAAI Conference on Artificial Intelligence (AAAI).
- (2019-10) Tesla Model S owner’s manual. Tesla (English).
- (2019-01) (Website) Volkswagen Group Italia.
- (2004) Lane detection and tracking using B-snake. Image and Vision Computing 22 (4).
- (2015) Vehicle trajectory prediction for adaptive cruise control. In 2015 IEEE Intelligent Vehicles Symposium (IV), pp. 59–64.
- (2018) BDD100K: a diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687.
- (2009) Lateral control of vehicle for lane keeping in intelligent transportation systems. In 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, Vol. 1, pp. 446–450.
- (2012) A novel multi-lane detection and tracking system. In 2012 IEEE Intelligent Vehicles Symposium, pp. 1084–1089.
- (2017) Vision-based lane detection and tracking for driver assistance systems: a survey. In 2017 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE Conference on Robotics, Automation and Mechatronics (RAM), pp. 660–665.