Intelligent Traffic Monitoring Systems for Vehicle Classification: A Survey

10/10/2019 ∙ by Myounggyu Won, et al. ∙ University of Memphis

A traffic monitoring system is an integral part of Intelligent Transportation Systems (ITS). It is a critical piece of transportation infrastructure in which transportation agencies invest heavily to collect and analyze traffic data in order to better utilize roadway systems, improve transportation safety, and establish future transportation plans. With recent advances in MEMS, machine learning, and wireless communication technologies, numerous innovative traffic monitoring systems have been developed. In this article, we present a review of state-of-the-art traffic monitoring systems focusing on a major functionality: vehicle classification. We organize various vehicle classification systems, examine research issues and technical challenges, and discuss the hardware/software design, deployment experience, and system performance of these systems. Finally, we discuss a number of critical open problems and future research directions, with the aim of providing valuable resources to academia, industry, and government agencies for selecting appropriate technologies for their traffic monitoring applications.


I Introduction

As the number of vehicles has increased significantly, the capacity of existing transportation networks is almost at its maximum, causing severe traffic congestion in many countries [won2016toward]. Constructing additional highway infrastructure, however, is not a feasible option because of the high cost and limited space. For example, constructing a high occupancy vehicle (HOV) lane in the city of Los Angeles costs up to $750,000 per lane per mile [2016TrafficGuide]. The expense grows prohibitively once the cost of keeping construction workers safe and of building extra facilities to maintain traffic flow during construction is included.

A traffic monitoring system is an effective alternative to mitigate traffic congestion. It is an integral component of Intelligent Transportation Systems (ITS) that is used to collect traffic data such as the number of vehicles, types of vehicles, and vehicle speed. Based on the collected data, it performs traffic analysis to better utilize the roadway systems, predict future transportation needs, and improve the safety of transportation [won2018deepwitraffic]. Transportation agencies in many countries spend huge amounts of money to develop, deploy, and maintain traffic monitoring systems [lee2015using].

The three key functionalities of a traffic monitoring system are vehicle counting, vehicle speed estimation, and vehicle classification. Vehicle classification in particular poses significant technical challenges; as a result, various research issues have been investigated and numerous vehicle classification systems have been developed. Classifying vehicles into different types accurately is crucial for effective traffic operation and transportation planning. For example, the number of large trucks on a highway section is used to estimate the capacity of that section and to plan pavement maintenance work. Identifying vehicle types, especially the number of multi-unit vehicles, is of great interest to the safety community. Even geometric roadway design is dictated by the vehicle types that frequently use the roadway.

Recent advances in sensing, machine learning, and wireless communication technologies in particular have given rise to numerous innovative vehicle classification systems. Although these new systems classify vehicles with higher accuracy, they differ significantly in their characteristics and requirements, such as the types of sensors used, hardware settings, configuration process, parameter settings, operating environment, and cost. This makes it extremely challenging for transportation agencies, engineers, and scientists to select the most appropriate solution for their vehicle classification applications. The need for a comprehensive review of the latest vehicle classification techniques is therefore greater than ever.

Fig. 1: The taxonomy of vehicle classification schemes.

In this article, we present a survey of state-of-the-art vehicle classification technologies to address this demand and provide guidelines for selecting an appropriate technology for vehicle classification. We systematically organize the ideas, research issues, and technical solutions that have been developed to achieve high vehicle classification accuracy. Specifically, we broadly classify vehicle classification systems into three categories: in-roadway-based, over-roadway-based, and side-roadway-based approaches. The vehicle classification schemes in each category are further classified into subcategories based on the types of sensors used, the methodologies for utilizing the sensors, and the mechanisms for classifying vehicles. We provide in-depth description, analysis, and comparison of numerous innovative vehicle classification schemes in each subcategory. We also present a number of open problems and several future research directions.

There are a few surveys on traffic monitoring systems focusing on vehicle classification. The Federal Highway Administration (FHWA) provides general guidelines for selecting traffic monitoring systems. However, these guidelines are limited to industry solutions and do not discuss ongoing research issues or emerging traffic monitoring systems [2016TrafficGuide][2016Handbook]. Some papers discuss only traditional traffic monitoring systems such as loop detectors [tyburski1988review]. Interestingly, most survey works focus on vision-based vehicle classification techniques [datondji2016survey][tian2011video][inigo1985traffic][yousaf2012comparative][kul2017concise][jain2019review], overlooking numerous other emerging vehicle classification solutions. Although some works review vehicle classification systems based on other types of sensors, these papers discuss only a particular type of system such as UAVs [puri2005survey][kanistras2015survey]. A comprehensive review of traffic monitoring systems has been performed recently [bernas2018survey]. However, that paper concentrates on vehicle detection technologies rather than vehicle classification schemes. In contrast, this article provides a comprehensive survey of virtually all vehicle classification technologies developed in the past decade, with in-depth analysis of research issues, technical challenges, and novel approaches. The contributions of this article are summarized as follows.

  • To the best of our knowledge, this article presents the first comprehensive review of the latest traffic monitoring systems concentrating on vehicle classification schemes.

  • This article focuses specifically on research issues in vehicle classification related to machine learning, low-power sensing, and image processing technologies.

  • This article introduces new breeds of traffic monitoring systems, such as RF- and Wi-Fi-based systems, that differ significantly from traditional ones.

  • This article presents open research problems and a number of future research directions.

This article is organized as follows. In Section II, the taxonomy of vehicle classification schemes is introduced, followed by detailed descriptions of the vehicle classification systems in each category, i.e., in-roadway-based systems (Section III), over-roadway-based systems (Section IV), and side-roadway-based systems (Section V). We then present open problems and future research directions in Section VI and conclude in Section VII.

II Taxonomy of Vehicle Classification Technologies

This section presents the taxonomy of vehicle classification systems. The details of each vehicle classification system are described in subsequent sections. Vehicle classification systems are largely categorized into three classes depending on where the system is deployed: in-roadway-based, over-roadway-based, and side-roadway-based systems (Fig. 1). We then further classify the vehicle classification systems based on sensor types and how the sensor data are utilized for vehicle classification.

The in-roadway-based vehicle classification systems install sensors on or under the pavement of a roadway. Different types of sensors are used in these systems, such as piezoelectric sensors [rajab2014vehicle], magnetometers [bottero2013wireless][xu2018vehicle], vibration sensors [stocker2014situational], and loop detectors [meta2010vehicle]. Various kinds of information are extracted from the sensor data, including the vehicle length, axle count, and unique features of the signal/waveform. The in-roadway-based systems achieve high classification accuracy because the sensors are in close contact with passing vehicles, effectively capturing the body and motion signature of the vehicles. A major downside, however, is the high cost of installation and maintenance, because the pavement needs to be saw-cut to install the sensors under the roadway. The cost increases further because of traffic disruption and the lane closures needed to keep road workers safe.

The side-roadway-based systems address the cost issue of the in-roadway-based vehicle classification schemes, since the sensors are installed on the roadside, obviating the need for lane closure and construction. As with the in-roadway-based systems, different types of sensors have been utilized. Some of the most widely used sensors include magnetometers [wang2014easisee][yang2015vehicle], accelerometers [ma2014wireless], and acoustic sensors [george2013vehicle]. Recently, advanced sensors such as Light Detection and Ranging (LIDAR) [lee2012side][lee2015using], infrared sensors [odat2017vehicle], and Wi-Fi transceivers [won2017witraffic] have been employed. Despite the benefits of easier installation and reduced cost, the side-roadway-based systems require extra effort to adjust the direction and placement of the sensors precisely [odat2017vehicle]. A more critical problem is that most systems fail to classify overlapped vehicles accurately. Additionally, an algorithm for calibrating the sensor data is needed to reduce the effect of noise and increase the classification accuracy.

The over-roadway-based systems utilize sensors installed over the roadway and are thus capable of covering multiple lanes simultaneously. For example, unmanned aerial vehicles (UAVs) and satellites are used in these systems [tang2017arbitrary]. The most prevalent technology in this category is the camera-based system [chen2012vehicle][bautista2016convolutional]. While camera-based systems have high classification accuracy, their performance is affected by weather and lighting conditions. Another important problem is driver privacy, as many people are not comfortable being exposed to cameras. Some over-roadway-based systems address the privacy concerns by adopting different types of sensors such as infrared sensors [odat2017vehicle] and laser scanners [chidlovskii2014vehicle].

Having presented the taxonomy of the vehicle classification systems as the big picture for this review article, in the following sections, the details on research issues, technical challenges, hardware/software design, deployment experience, and comparison of various vehicle classification systems are discussed.

III In-Roadway-Based Vehicle Classification

This section presents a review of the in-roadway-based vehicle classification systems. Specifically, we focus on the basic theory, specific research problems, key mechanisms for vehicle classification, vehicle types for classification, and average classification accuracy. Starting with the discussion on the loop detectors which are the most widely used in-roadway-based vehicle classification systems, we cover various other solutions that are built with different kinds of sensors. The characteristics of the in-roadway-based systems covered in this section are summarized in Table I.

Major Equipment | Publications | Accuracy | Vehicle Classes | Key Features
Magnetic Sensors | Bottero TRP-C’13 [bottero2013wireless] | 88% | car, van, truck | A wireless sensor network of two magnetic sensors; vehicle length is used as a key feature
Magnetic Sensors | Ma TITS’14 [ma2014wireless] | 99.0% | 2,3,4,5,6-axle vehicles | Combination of a magnetometer for speed estimation and an accelerometer for axle counting and axle spacing estimation
Magnetic Sensors | Li Measurement’14 [li2014vehicle] | 88.9% (cars) and 94.4% (buses) | cars and buses | A single magnetic sensor; speed-independent features of the vehicle waveform
Magnetic Sensors | Li CN’17 [li2017reliable] | 96.4% | passenger vehicles, SUVs, buses, and vans | Sensor fusion of magnetic waveforms collected from two magnetic sensors that are 80 m apart on the same lane
Magnetic Sensors | Xu Sensors’18 [xu2018vehicle] | 95.46% | hatchbacks, sedans, buses, and multi-purpose vehicles | Advanced machine learning techniques for classification focusing on the imbalance effect
Magnetic Sensors | Balid TITS’18 [balid2018intelligent] | 97% | passenger vehicles, single-unit trucks, combination trucks, and multi-trailer trucks | Machine learning-based classification using the vehicle length as a key feature
Magnetic Sensors | Dong Access’18 [dong2018improved] | 80.5% | class 1 (sedans and SUVs), class 2 (vans and seven-seat cars), class 3 (light and medium trucks), and class 4 (heavy trucks and semi trailers) | Classification based on XGBoost using a single magnetic sensor
Vibration Sensors | Bajwa IPSN’11 [bajwa2011pavement] | N/A | vehicles with different axle counts and spacing (mostly large trucks) | Axle count and spacing between axles as key features
Vibration Sensors | Stocker TITS’14 [stocker2014situational] | 83% | light and heavy vehicles | Unique characteristics of seismic signals used as key features; multilayer perceptron (MLP) feedforward artificial neural networks for classification
Vibration Sensors | Zhao TRR’18 [zhao2018vibration] | 89.4% | passenger car, bus, and 26 axle trucks/trailers | Axle count and spacing between axles as key features; capable of classifying 2-axle cars with similar axle configurations based on a multi-parameter classifier
Vibration Sensors | Jin GRSL’18 [jin2018vehicle] | 92% | Assault Amphibian Vehicle (AAV) and dragon wagon (DW) | Focused on the complexity of the seismic signal; convolutional neural network (CNN) with the log-scaled frequency cepstral coefficient (LFCC) matrix as a key feature to address the complexity
Loop Detectors | Meta TVT’10 [meta2010vehicle] | 94.2% | car/jeep, minibus/van, pickup/truck, bus, and motorcycle | Noise reduction in the raw signal; PCA for dimensionality reduction; application of BPNN
Loop Detectors | Tok TRB’10 [tok2010vector] | 80.8% | 27 axle configuration classes, 9 drive unit body classes, and 10 trailer unit body classes | Combination of axle configuration-based and inductive signature-based systems
Loop Detectors | Jeng TRR’13 [jeng2013wavelet] | 93.8% | the 13 FHWA vehicle classes [FHWAClass] | Feature extraction from inductive signals using the wavelet transformation technique; classification with k-NN
Loop Detectors | Lamas Sensors’15 [lamas2015vehicle] | 96% | car, truck, and van | Spectral features of inductive signatures using DFT
Loop Detectors | Liu TRP-C’14 [liu2014length] | 99.4% | long vehicles, regular cars | Vehicle-length-based work using a single loop detector; a traffic theory was used to estimate vehicle speed; classifies vehicles into only two types
Loop Detectors | Wu TRR’14 [wu2014vehicle], TRP-C’14 [wu2014improved] | 99% | three length classes with boundaries at 28 ft and 46 ft | Addresses the issue of non-zero acceleration of passing cars
Weigh In Motion | Hernandez TRP-C’16 [hernandez2016integration] | +80.0% | 31 single and semi-trailer body trucks, and 23 single unit trucks | Classification for truck body types; integration of WIM with an inductive loop detector
Piezoelectric Sensors | Rajab IV’14 [rajab2014vehicle] | 86.9% | the 13 FHWA vehicle classes [FHWAClass] | An array of piezoelectric sensors; standard features are used
Fiber Bragg Grating Sensors | Huang IEEE Sensors’18 [huang2018vehicle] | 98.5% | small, medium, large | A sensor network of FBG sensors; standard features are used
TABLE I: In-roadway-based vehicle classification systems

III-A Loop Detectors

Fig. 2: (a) saw-cut loop [meta2010vehicle]; (b) preformed loop [martin2003detector].

An inductive loop detector is one of the most commonly used traffic monitoring systems [coifman2014improved]. It is a coil of wire embedded under the road surface (Fig. 2). It captures the change of inductance and generates a time-varying signal when a vehicle passes over it. Characteristics of the signal such as the amplitude, phase, and frequency spectrum vary with the vehicle class. These unique characteristics of the signal are known as the magnetic profile [jeng2014high], which is used to perform vehicle classification.

There are broadly two types of loop detectors depending on the installation method: saw-cut and preformed. The saw-cut method requires saw-cutting the pavement, laying the loop wire, and protecting the wire by refilling the cut (Fig. 2(a)). Preformed loop detectors do not embed the loop wire under the pavement; instead, the loop wire is encased in a PVC pipe and the pipe is attached to the pavement (Fig. 2(b)). Loop detectors can also be categorized into single loop detectors and dual loop detectors depending on the number of loop detectors used for vehicle classification. A dual loop detector consists of a pair of loop detectors in a lane. A key strength of dual loop detectors compared with single loop detectors is that they can measure the vehicle speed and vehicle length based on the predetermined longitudinal distance between the two loop detectors.

Numerous research works have been conducted to enhance the performance of loop detectors for vehicle classification. In particular, recent developments in machine learning have sparked the emergence of advanced loop detectors that apply machine learning techniques to analyze the magnetic signature of passing cars. Meta and Cinsdikici utilized a backpropagation neural network (BPNN) for vehicle classification [meta2010vehicle]. Specifically, based on the observation that the low classification accuracy of existing loop detectors is attributable to simple data sampling of noisy raw signals, they designed an algorithm based on the Discrete Fourier Transform (DFT) to remove the noise. Principal Component Analysis (PCA) is then applied to reduce the dimensionality of the denoised data. The PCA features are expanded to emphasize the undercarriage height variation of a passing vehicle. Finally, the output of PCA is fed into a three-layer BPNN to classify the vehicles into five classes: car/jeep, minibus/van, pickup/truck, bus, and motorcycle. A classification accuracy of 94.2% was achieved.
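The DFT-based noise removal step can be illustrated with a minimal sketch. The exact filter used in [meta2010vehicle] is not described here, so the version below simply assumes a low-pass rule that zeroes every frequency bin above a cutoff; the function names and the `keep_bins` parameter are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse DFT, returning complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def lowpass_denoise(signal, keep_bins):
    """Assumed filter: zero all bins above keep_bins, then transform back."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # For a real signal, bin k and bin n - k represent the same frequency.
        if min(k, n - k) > keep_bins:
            spectrum[k] = 0
    return [value.real for value in inverse_dft(spectrum)]


# Synthetic example: a slow "vehicle" waveform plus a fast interfering tone.
n = 64
clean = [math.sin(2 * math.pi * t / n) for t in range(n)]
noisy = [clean[t] + 0.3 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
denoised = lowpass_denoise(noisy, keep_bins=3)
```

In a full pipeline the denoised samples would then go through PCA and a neural network classifier; those stages are omitted here.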

A significant technical challenge for loop detectors is that vehicles with similar axle configurations are difficult to classify accurately. Tok and Ritchie propose a vehicle classification system that effectively classifies such vehicles by integrating a novel loop sensor with a loop detector, combining the advantages of axle-based vehicle classification with those of body signature-based classification [tok2010vector]. More specifically, vehicles are first classified into three high-level types based on the number of axle clusters. The vehicle body signature (i.e., the magnetic profile of the passing vehicle) is then used to further classify the vehicles with a multi-layer feedforward neural network (MLF) [svozil1997introduction]. The authors achieved 80.8% accuracy for a total of 1,029 vehicles with 27 different axle configurations, 9 drive unit body classes, and 10 trailer unit body classes. Although this accuracy may seem low, it is in fact quite high considering that the system was evaluated on many vehicles with similar axle configurations and that the number of vehicle types was large.

Jeng et al. propose a similar approach based on the analysis of the magnetic signature of a passing car [jeng2013wavelet]. The Haar wavelet transformation technique [walnut2013introduction] is adopted to compress the waveform data, thereby removing the non-salient characteristics of vehicle signatures while retaining the more distinctive features in the compressed data. The k-nearest neighbor (kNN) approach is then used to classify vehicles into the 13 FHWA vehicle types [FHWAClass]. A data set collected from I-405 in the city of Irvine, CA, as well as a data set from the city of San Onofre, CA, were used for the experiments. The classification accuracy was 93.8%.
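To illustrate how a Haar wavelet transform can compress a loop signature, the sketch below keeps only the approximation coefficients after a chosen number of decomposition levels. This truncation rule, the function names, and the sample values are assumptions for illustration and are not taken from [jeng2013wavelet].

```python
import math


def haar_step(values):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_compress(signal, levels):
    """Return the approximation coefficients after `levels` Haar steps.

    Each level halves the number of samples; the detail coefficients,
    which carry the fine-grained variation, are discarded.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx


# Hypothetical 8-sample signature compressed to 2 coefficients.
compressed = haar_compress([4, 4, 2, 2, 6, 6, 8, 8], levels=2)
```

A classifier such as kNN would then operate on the shorter coefficient vector instead of the raw waveform.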

The vehicle classification systems discussed so far are based on a single loop detector. A limitation of the single loop detector, as opposed to the dual loop detector, is that it is difficult to measure the vehicle speed (from which the vehicle body length can be calculated). The dual loop detector consists of a pair of loop detectors in a lane [cheevarunothai2006identification]. It measures the traversal time of a passing vehicle, which is converted to the vehicle speed by dividing the known distance between the pair of loop detectors by the traversal time. The body length of a passing vehicle can then be calculated by multiplying the speed by the dwell time over a loop detector.
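As a worked example of the dual-loop calculation, the sketch below computes the speed as the loop spacing divided by the traversal time, and the body length as the speed multiplied by the dwell time. The function names are hypothetical, and the optional `loop_length_m` correction for the size of the detection zone is an added assumption (its default of zero reproduces the simple product described above).

```python
def dual_loop_speed(loop_spacing_m, t_on_upstream_s, t_on_downstream_s):
    """Speed = distance between the two loops / traversal time."""
    traversal_s = t_on_downstream_s - t_on_upstream_s
    if traversal_s <= 0:
        raise ValueError("downstream activation must come after upstream activation")
    return loop_spacing_m / traversal_s


def vehicle_length(speed_mps, dwell_time_s, loop_length_m=0.0):
    """Length = speed * dwell time, optionally minus the detection-zone length."""
    return speed_mps * dwell_time_s - loop_length_m


# Loops 6 m apart, activated 0.3 s apart; the vehicle occupies one loop for 0.4 s.
speed = dual_loop_speed(6.0, 10.0, 10.3)   # about 20 m/s
length = vehicle_length(speed, 0.4)        # about 8 m
```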

Vehicle classification systems based on dual loop detectors have been developed. These systems use the vehicle length as a key feature [cheevarunothai2006identification]. In particular, Wu et al. note that small changes in acceleration affect the precision of the vehicle length estimate, which degrades the classification accuracy significantly, especially under congested conditions [wu2014vehicle][wu2014improved]. To address this, they develop a new method that takes into account the possibility of non-zero acceleration of a passing vehicle. The new approach was tested using the Next Generation Simulation (NGSIM) datasets [kovvali2007video] and classified vehicles into three length classes with boundaries at 28 ft and 46 ft. The classification accuracy was over 98%. Although dual loop detectors allow the vehicle length to be used as an additional feature, classifying vehicles with similar body lengths (e.g., pick-up trucks and minivans) remains a challenge.
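The effect of non-zero acceleration can be illustrated with basic constant-acceleration kinematics. The sketch below is not the method of [wu2014vehicle][wu2014improved]; it is a simplified illustration that assumes point-like detection zones (unless `zone_length` is given) and constant acceleration while the vehicle crosses the loops. The class labels and the handling of lengths exactly on a boundary are also assumptions; only the 28 ft and 46 ft boundaries come from the text.

```python
def estimate_length(spacing, t1_on, t1_off, t2_on, t2_off, zone_length=0.0):
    """Estimate vehicle length and acceleration from dual-loop on/off times.

    Under constant acceleration, the mean speed over an interval equals the
    instantaneous speed at the midpoint (in time) of that interval.
    """
    v_front = spacing / (t2_on - t1_on)      # front bumper, loop 1 -> loop 2
    t_front = 0.5 * (t1_on + t2_on)
    v_rear = spacing / (t2_off - t1_off)     # rear bumper, loop 1 -> loop 2
    t_rear = 0.5 * (t1_off + t2_off)
    accel = (v_rear - v_front) / (t_rear - t_front)
    v_entry = v_front + accel * (t1_on - t_front)   # speed when loop 1 turns on
    dwell = t1_off - t1_on
    travelled = v_entry * dwell + 0.5 * accel * dwell ** 2
    return travelled - zone_length, accel


def length_class(length_ft):
    """Three length classes with boundaries at 28 ft and 46 ft (labels assumed)."""
    if length_ft < 28.0:
        return "short"
    if length_ft < 46.0:
        return "medium"
    return "long"
```

With zero acceleration the estimate reduces to the usual speed-times-dwell-time product.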

Despite their benefits, dual loop detectors cost more than single loop detectors. Interestingly, researchers have shown that a single loop detector can be enough to achieve accurate vehicle classification. Lamas-Seco et al. identify certain spectral features, extracted from the magnetic signal collected by a single loop detector, that do not depend on the vehicle speed [lamas2015vehicle]. Specifically, they argue that, based on these features, an effective classification system can be developed without relying on dual loop detectors. They classified vehicles into three types: car, truck, and van. The classification accuracy was about 96%.

In line with this research, Liu and Sun address the limitation of the single loop detector by measuring the vehicle length with a single loop detector and using it as a key feature for vehicle classification [liu2014length]. Newell’s simplified car-following model [newell2002simplified] is adopted to capture the relationships among vehicles in a platoon and to estimate the vehicle occupation time. Classification is performed simply by comparing the anticipated vehicle occupation time with the measured vehicle occupation time, where a discrepancy indicates a long vehicle. Field data collected from a highway, with a total of 2,547 samples, were used for the experiments. The classification accuracy was 99.4%.

III-B Magnetic Sensors

Fig. 3: Magnetic field changes by a vehicle [bottero2013wireless].

A large amount of ferrous metals in a vehicle frame induces disturbance to the Earth’s magnetic field in the direction of the lane and the vertical direction [cheung2005traffic]. Fig. 3 illustrates distortion to the magnetic field caused by a passing vehicle. Magnetic sensors are used to capture these distinctive changes in the magnetic field to classify vehicles. In comparison with loop detectors, magnetic sensors have advantages in terms of the size, weight, cost, and energy efficiency. In this section, we present a review of recent efforts on developing vehicle classification systems based on magnetic sensors.

We categorize magnetic sensor-based vehicle classification systems into three types: (1) systems built with a network of multiple magnetic sensors that rely on the vehicle length as a key feature, (2) systems that use a single magnetic sensor and apply machine learning techniques to waveform analysis, and (3) hybrid systems that utilize both the unique features of the waveform and the vehicle length. Bottero et al. designed a wireless sensor network (WSN) of two magnetic sensors to perform vehicle classification [bottero2013wireless]. More specifically, two pavement-mounted magnetic sensors are aligned with the lane axis to measure the vehicle speed. Given the distance between the two sensors, the vehicle length can be calculated. Vehicle classification is then performed based on the vehicle length, similar to the dual loop detector-based approaches [cheevarunothai2006identification]. Vehicles were classified into three types, namely cars, vans, and trucks, and an average classification accuracy of 88% was achieved.

Balid et al. propose a similar approach that uses the vehicle length as a key feature [balid2018intelligent]. In particular, the main feature is called the vehicle magnetic length, which is defined as the product of the vehicle speed and the period of time that the vehicle was on the magnetic sensor, i.e., sensor departure time minus sensor arrival time. The vehicle speed is measured by calculating the travel time between two longitudinally located magnetic sensors. Given the vehicle magnetic length as the main feature, different machine learning classifiers are compared, including Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (kNN), and Naive Bayes Classifier (NBC). The classification accuracy was over 97% for classifying vehicles into passenger vehicles, single-unit trucks, combination trucks, and multi-trailer trucks.
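The vehicle magnetic length and a length-based classifier can be sketched as follows. The function names, the training lengths, and the choice of k are hypothetical and serve only to illustrate the idea; they are not values from [balid2018intelligent].

```python
from collections import Counter


def magnetic_length(sensor_spacing_m, arrival_first_s, arrival_second_s, departure_first_s):
    """Magnetic length = speed * (sensor departure time - sensor arrival time).

    The speed comes from the travel time between two longitudinally
    placed sensors a known distance apart.
    """
    speed = sensor_spacing_m / (arrival_second_s - arrival_first_s)
    return speed * (departure_first_s - arrival_first_s)


def knn_classify(training, query_length, k=3):
    """Majority vote among the k training samples closest in magnetic length."""
    nearest = sorted(training, key=lambda item: abs(item[0] - query_length))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (magnetic length in metres, label) pairs for illustration only.
training = [
    (4.2, "passenger vehicle"), (4.8, "passenger vehicle"), (5.1, "passenger vehicle"),
    (9.0, "single-unit truck"), (10.5, "single-unit truck"), (11.2, "single-unit truck"),
    (18.0, "combination truck"), (20.5, "combination truck"), (21.0, "combination truck"),
]

length = magnetic_length(2.0, 0.0, 0.1, 0.5)   # about 10 m
label = knn_classify(training, length)
```

Any of the other classifiers mentioned above could be substituted for the kNN vote, since they all consume the same single feature.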

Li and Lv propose another WSN of magnetic sensors for vehicle classification [li2017reliable]. Similar to [bottero2013wireless], two magnetic sensors are deployed on the same lane, 80 m apart. However, the proposed work uses the magnetic sensors not only to estimate the vehicle length, but also to analyze the feature waveform of the magnetic sensor data to enhance the classification accuracy. Specifically, the main contributions of this work compared with other solutions based on a WSN of magnetic sensors are twofold. First, a novel data segmentation technique is developed to separate the magnetic waveform effectively from the overall waveform of the magnetic sensor data. Second, a sensor fusion algorithm is developed to correlate the feature waveforms from the two sensors to enhance the classification accuracy. The average classification accuracy was 96.4% in classifying vehicles into four types: passenger vehicles, SUVs, buses, and vans.

Different types of sensors have been integrated with magnetic sensors to enhance the effectiveness of vehicle classification. For example, Ma et al. propose a WSN consisting of magnetic sensors and accelerometers [ma2014wireless]. More specifically, the magnetic signatures collected with the magnetic sensors are used to estimate the vehicle speed, and the accelerometer is used to count the number of axles and, using the measured vehicle speed, to estimate the spacing between each pair of axles. Vehicles are classified according to the 13-category FHWA scheme [wyman1985field]. The proposed system classifies vehicles with an accuracy of 99%. The classification accuracy is high because vehicles with different axle counts were used for the evaluation. It remains an open question whether the proposed system will perform well for vehicles with the same axle count and similar axle spacing.

Recent research shows that it is possible to achieve high classification accuracy with only a single magnetic sensor by applying advanced machine learning techniques especially for analyzing the magnetic sensor data. The idea is to automatically extract the effective features from the magnetic signature rather than relying on simple features such as the peaks, and then build a vehicle classification model that is used to classify vehicles effectively [xu2018vehicle]. Various machine learning techniques are used for vehicle classification such as the k-nearest neighbor (kNN) [keller1985fuzzy], support vector machine (SVM) [suykens1999least], back-propagation neural network (BPNN) [goh1995back], and convolutional neural network (CNN) [krizhevsky2012imagenet].

Li et al. identify eight speed-independent features of a magnetic waveform (i.e., the number of peaks, the maximum peak time ratio, the minimum trough time ratio, the mean value, the standard deviation, the maximum peak amplitude, the minimum trough amplitude, and the maximum peak/trough amplitude ratio) [li2014vehicle]. These features are then used to build a vehicle classification model based on the optimal Minimum Number of Split-sample (MNS)-based Classification and Regression Tree (CART) algorithm [steinberg2009cart]. They achieved classification accuracies of 88.9% and 94.4% for cars and buses, respectively. Xu et al., in turn, focus on the problem of unbalanced magnetic sensor datasets [xu2018vehicle]. They note that the number of vehicles in each vehicle class differs significantly in many datasets, which often leads to degraded classification performance. The proposed work is dedicated to minimizing the imbalance effect and applies k-nearest neighbor (kNN) for vehicle classification.
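The eight waveform features listed above can be computed with a short routine such as the one below. The precise definitions used in [li2014vehicle] are not given here, so the definitions in this sketch are assumptions: a peak is a sample strictly greater than both neighbours, the time ratios are the positions of the global maximum and minimum as fractions of the waveform duration, and the standard deviation is the population value.

```python
import statistics


def waveform_features(samples):
    """Eight features of a sampled magnetic waveform (definitions assumed)."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples")
    # A peak is a sample strictly larger than both of its neighbours.
    peak_count = sum(
        1 for i in range(1, n - 1)
        if samples[i - 1] < samples[i] > samples[i + 1]
    )
    max_value = max(samples)
    min_value = min(samples)
    return {
        "peak_count": peak_count,
        # Positions as fractions of the duration, so they do not scale with speed.
        "max_peak_time_ratio": samples.index(max_value) / (n - 1),
        "min_trough_time_ratio": samples.index(min_value) / (n - 1),
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
        "max_peak_amplitude": max_value,
        "min_trough_amplitude": min_value,
        "peak_trough_ratio": max_value / min_value if min_value != 0 else float("inf"),
    }


features = waveform_features([0, 2, 1, 3, -1, 0])
```

The resulting dictionary could be flattened into a feature vector for a decision tree or any other classifier.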

Dong et al. also show that a single magnetic sensor can be a powerful tool for vehicle classification [dong2018improved]. Three types of features are extracted from the Z-axis of the magnetic signal: statistical, energy, and short-term features. In particular, the energy features are used because they are highly correlated with the vehicle size. These features are provided as input to an XGBoost classifier [chen2016xgboost] to classify vehicles into four categories: class 1 (sedans and SUVs), class 2 (vans and seven-seat cars), class 3 (light and medium trucks), and class 4 (heavy trucks and semi trailers). The average classification accuracy was 80.5%, with 1,797 of 2,231 vehicles being classified correctly.

III-C Vibration Sensors

Vibration sensors, typically accelerometers, are installed under the roadway and monitor the vibrations caused by passing vehicles. Using the highway pavement itself as a transducer, vibration sensors capture the unique vibration patterns induced by passing vehicles; the low elasticity of road pavement keeps these vibrations well localized in time and space [bajwa2011pavement]. Fig. 4 shows an example of a vibration sensor and the installation process. However, there are some disadvantages. The propagation of the seismic wave is significantly affected by the underlying geology. Additionally, since the seismic wave has various forms, directions, and speeds, the waveform is very complex, making vehicle classification very difficult.

Fig. 4: An example of a vibration sensor and the installation process [bajwa2011pavement].

Various vibration sensor-based vehicle classification systems have been developed. Some systems use vibrations to count the number of axles and measure the spacing between axles [bajwa2011pavement]. The axle count and spacing between axles are used as key features for vehicle classification. Another type of solution is based on the analysis of the seismic signals induced by passing vehicles [stocker2014situational][jin2018vehicle]. The characteristic features of the seismic waves are extracted to model a classifier to perform classification. Since the seismic waves are very complex, machine learning techniques are often adopted to extract effective features.

Bajwa et al. propose a vehicle classification system based on the axle count and spacing between axles [bajwa2011pavement]. The proposed system consists of magnetic sensors and vibration sensors. The magnetic sensors are used for detecting a vehicle and reporting the arrival and departure times of the passing vehicles. The vibration sensors are utilized for calculating the number of axles and spacing between axles which are the two key features for vehicle classification.
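The axle-based features described above can be illustrated with a minimal sketch. It assumes that each axle shows up as a local peak above a fixed threshold in the vibration signal and that the vehicle speed is already known (for instance, from the magnetic sensors). The function names, the threshold, and the sample values are hypothetical and are not taken from [bajwa2011pavement].

```python
from typing import List


def detect_axle_times(samples: List[float], sample_rate_hz: float,
                      threshold: float) -> List[float]:
    """Return the time (s) of each local peak that exceeds the threshold."""
    times = []
    for i in range(1, len(samples) - 1):
        # A peak rises above its left neighbour and is not below its right one.
        is_peak = samples[i] > samples[i - 1] and samples[i] >= samples[i + 1]
        if is_peak and samples[i] > threshold:
            times.append(i / sample_rate_hz)
    return times


def axle_spacings(axle_times: List[float], speed_mps: float) -> List[float]:
    """Convert gaps between consecutive axle times into distances (m)."""
    return [(t2 - t1) * speed_mps
            for t1, t2 in zip(axle_times, axle_times[1:])]


# Hypothetical two-axle vehicle sampled at 20 Hz, travelling at 10 m/s.
signal = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.0, 0.1, 0.8, 0.1, 0.0]
times = detect_axle_times(signal, sample_rate_hz=20.0, threshold=0.5)
axle_count = len(times)
spacings = axle_spacings(times, speed_mps=10.0)
```

A deployed system would need filtering and a more robust peak detector, since real pavement vibrations are noisy; the sketch only shows how the axle count and spacing follow from peak times and speed.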

Zhao et al. develop a novel vibration sensing system for vehicle classification called the distributed optical vibration sensing system (DOVS) [zhao2018vibration]. Although the system uses the same features for classification as the paper [bajwa2011pavement], i.e., the axle count and the spacing between axles, the system achieves high resistance to damage and electromagnetic interference, and the performance is reliable in severe environments. Furthermore, it is easy to deploy and the cost for installation is low compared with other vibration-sensor-based systems. Another notable feature of this work is that it supports classification of the vehicles with similar axle configurations especially 2-axle vehicles such as vans, two-axle buses, and two-axle trucks by developing a multi-parameter classifier incorporating additional features in the frequency domain and the vehicle speed. The proposed system classifies vehicles into 10 vehicle types and achieves the average classification accuracy of 89.4%.

Different from the systems that use the axle count and spacing between axles as the key features, there are other vehicle classification systems that leverage the unique characteristics of the seismic signals of passing vehicles. Stocker et al. [stocker2014situational] propose a digital signal processing algorithm to process vibration sensor data to identify unique vibration patterns for passing vehicles. After processing the vibration sensor data, a machine learning technique is applied to perform vehicle classification. More specifically, a multilayer perceptron (MLP) feedforward artificial neural network [haykin1994neural] is used to classify vehicles into light vehicles (mini-cleaner, mini-lifter, personal-car, van, ambulance, fire-van, and pickup-truck) and heavy vehicles (truck, fire-truck, and bucket-digger). The classification accuracy obtained was 83%.

A similar work was performed by Jin et al. [jin2018vehicle]. The authors focus specifically on the complexity of the seismic signals, which are nonstationary and nonlinear. The seismic signal comprises a number of signals generated by a passing vehicle (e.g., the engine and propulsion system of a passing car). It is not only highly dependent on the underlying geology, but its propagation speed and direction also vary significantly [jin2012target]. To achieve high classification accuracy under the complexity of the seismic signal, the authors apply a convolutional neural network (CNN). Specifically, they develop a seismic signal-based deep CNN architecture for classifying vehicles. The proposed CNN framework takes the log-scaled frequency cepstral coefficient (LFCC) matrix as a key feature. Vehicle classification was performed with the data collected from DARPA’s SensIt project for two vehicle classes, i.e., Assault Amphibian Vehicle (AAV) and dragon wagon (DW). The best classification accuracy was 92%.

III-D Other Technologies

Different kinds of sensors such as weigh-in-motion sensors [hernandez2016integration], piezoelectric sensors [rajab2014vehicle], and fiber-optic sensors [huang2018vehicle] are used to develop in-roadway-based vehicle classification systems. While these sensors are less frequently used for vehicle classification, the mechanism for vehicle classification is similar to other in-road-based solutions in that standard features such as the axle count, axle spacing, and vehicle length are used.

Hernandez et al. develop an in-road-based vehicle classification system that integrates a weigh-in-motion sensor with a loop detector [hernandez2016integration]. Specifically, the vehicle weight data are combined with the axle spacing data to achieve better classification accuracy. They propose to utilize multiple classification models, i.e., Naive Bayes Classifier, Decision Tree, SVM, Multilayer Feedforward Neural Network [hornik1989multilayer], and Probabilistic Neural Network [specht1990probabilistic]. In particular, a multiple classifier systems (MCS) method [kuncheva2004combining] is adopted to combine the results of these classifiers. A large data set of 18,967 trucks was used to classify the trucks into 31 single and semi-trailer body trucks, and 23 single unit trucks. The accuracy was over 80% for each truck body type.

Rajab et al. develop a multi-element piezoelectric sensor system which consists of 16 piezoelectric sensors [rajab2014vehicle]. Three main features, i.e., the number of tires, vehicle length, and axle spacing are used for vehicle classification. Specifically, by sensing the impact on multiple sensor elements, the number of tires is computed. The vehicle speed is estimated based on the time difference between the impact on two sensors aligned to the lane axis. The vehicle length and axle spacing are computed based on the vehicle speed and the dwell time over a sensor. The 13 FHWA vehicle classes were used for vehicle classification. The average classification accuracy was 86.9%.
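The geometric relations behind these features can be sketched as follows. This is a simplified illustration that assumes constant vehicle speed over the sensor array and ignores the physical width of a sensor element; the function names and numeric values are hypothetical and do not come from [rajab2014vehicle].

```python
def vehicle_speed(sensor_gap_m: float, t_first_s: float,
                  t_second_s: float) -> float:
    """Speed (m/s) from the impact times on two sensors along the lane axis."""
    return sensor_gap_m / (t_second_s - t_first_s)


def distance_travelled(speed_mps: float, duration_s: float) -> float:
    """Distance (m) covered at constant speed during the given duration."""
    return speed_mps * duration_s


# Hypothetical readings: sensors 2 m apart, hit 0.1 s apart.
speed = vehicle_speed(sensor_gap_m=2.0, t_first_s=0.00, t_second_s=0.10)

# Vehicle length from the dwell time over one sensor.
length_m = distance_travelled(speed, duration_s=0.225)

# Axle spacing from the time between two axle impacts on the same sensor.
axle_spacing_m = distance_travelled(speed, duration_s=0.135)
```

With these assumed readings the speed is about 20 m/s, the length about 4.5 m, and the axle spacing about 2.7 m.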

Recent advances in fiber-optic sensors that are small, lightweight, and immune to electromagnetic interference gave rise to novel traffic engineering applications [malla2008special]. Huang et al. adopt fiber Bragg grating (FBG) sensors for vehicle classification [huang2018vehicle]. A sensor network consisting of two FBG sensors is developed to extract the number of axles and the axle spacing as features. Specifically, the FBG sensors capture the strain signals generated from the pavement when vehicles pass on the road, and individual peaks in the signal are used to identify the features. With the two aligned sensors, the vehicle speed can be measured, and the axle spacing is computed based on the vehicle speed. The classification accuracy was as high as 98.5%, partly due to the simple vehicle classification scheme, i.e., small, medium, and large vehicles.

IV Over-Roadway-Based Vehicle Classification

| Major Equipment | Publications | Accuracy | Vehicle Classes | Key Features |
| --- | --- | --- | --- | --- |
| Camera | Chen ITSC’12 [chen2012vehicle] | 94.6% | motorcycles, cars, vans, buses, and unknown vehicles | GMM for background noise removal; SVM for classification |
| | Mithun TITS’12 [mithun2012detection] | +88% | motorbikes, rickshaws, autorickshaws, cars, jeeps, covered vans, and busses | Vehicle detection based on multiple virtual detection lines (MVDLs); A two-step classification method based on kNN |
| | Unzueta TITS’12 [unzueta2012adaptive] | 92.6% | two wheels, light vehicles, and heavy vehicles | Addressed the problem of the dynamic changes of the background; A multicue background subtraction method |
| | Dong TITS’15 [dong2015vehicle] | 89.4% | truck, minivan, bus, passenger car, and sedan | A two-stage CNN for automatic feature extraction; Softmax classifier based on multi-task learning |
| | Karaimer ITSC’15 [karaimer2015combining] | 96.5% | cars, vans, and motorcycles | Combination of kNN with shape-based features and SVM with HOG features |
| | Huttunen IV’16 [huttunen2016car] | 97.0% | bus, truck, van and small car | Automatically extracted features using DNN |
| | Adu-Gyamfi TRR’17 [adu2017automated] | +89% | The 13 FHWA vehicle classes [FHWAClass] | Deep convolutional neural network for feature extraction and SVM for classification. Pretraining DCNN model with auxiliary data |
| | Javadi PCS’17 [javadi2018vehicle] | 96.5% | private cars, light trailers, buses, and heavy trailers | Designed for classifying vehicles with similar body dimensions; Prior knowledge about speed regulations used for enhanced performance |
| | Zhao TCDS’17 [zhao2017deep] | 97.9% | sedans, vans, trucks, SUVs, and coaches | The visual attention mechanism to focus on only relevant part of the car image |
| | Theagarajan CVPRW’17 [theagarajan2017eden] | 97.8% | articulated trucks, background, busses, bicycles, cars, motorcycles, nonmotorized vehicles, pedestrians, pickup trucks, single unit trucks, and work vans | Used the largest image dataset ever known to the research community |
| | Kim CVPRW’17 [kim2017vehicle] | 97.8% | articulated trucks, background, busses, bicycles, cars, motorcycles, nonmotorized vehicles, pedestrians, pickup trucks, single unit trucks, and work vans | Data augmentation; A weighting scheme to compensate for different sample sizes |
| | Liu ACCESS’17 [liu2017ensemble] | 97.6% | articulated truck, background, bicycle, bus, car, motorcycle, pedestrian, pickup truck, non-motorized vehicle, single unit truck and work van | Data augmentation; An ensemble of CNN models |
| | Chang ITSM’18 [chang2018vision] | 97.6% | sedans, SUVs, vans, busses, and trucks | The multi-vehicle occlusion problem was addressed prior to vehicle classification |
| | Hasnat ICIP’18 [hasnat2018new] | 99.0% | light, intermediate, heavy, heavy with more than 2 axles, and motorbikes | A camera integrated with optical sensors; Two combined classifiers using the Gradient Boosting technique |
| Aerial Platforms | Cao ICIP’11 [cao2011linear] | 90.0% | only for vehicle detection | Not capable of vehicle classification |
| | Liu GRSL’15 [liu2015fast] | Up to 98.2% | cars, and trucks | HOG features used with a single hidden layer neural network |
| | Audebert RS’17 [audebert2017segment] | 80.0% | sedans, vans, pickups, trucks | Data normalization and augmentation schemes to reduce discrepancy between training and testing datasets; LeNet, AlexNet, and VGG-16 for vehicle classification |
| | Tan ICIP’18 [tan2018vehicle] | 80.3% | sedans, vans, pickups, trucks | Manned aerial vehicle equipped with an infrared sensor; AlexNet and Inception Model for classification |
| Infrared + ultrasonic sensors | Odat TITS’17 [odat2017vehicle] | Up to 99% | sedan, pickup truck, SUV, bus, two wheeler | Combination of infrared sensors and ultrasonic sensors; Classification based on the Bayesian Network and Neural Network |
| Laser scanner | Sandhawalia ITSC’13 [sandhawalia2013vehicle] | 82.5% | passenger vehicles, passenger vehicles with one trailer, trucks, trucks with one trailer, trucks with two trailers, and motorcycles | Representation of a laser scanner profile as an image |
| | Chidlovskii ITSC’14 [chidlovskii2014vehicle] | 86.8% | passenger vehicles, passenger vehicles with one trailer, trucks, trucks with one trailer, trucks with two trailers, and motorcycles | Specific domain knowledge (vehicle shape information) extracted from a laser scanner profile for classification |
TABLE II: Over-roadway-based vehicle classification systems

The over-roadway-based systems install sensors over the roadway, offering non-intrusive solutions that do not require physical changes in the roadway, which greatly reduces the cost for construction and maintenance. Furthermore, the over-roadway-based systems are capable of covering multiple lanes and in some cases an entire road segment (e.g., aerial platforms [cao2011linear]). Since cameras are most widely used for the over-roadway-based traffic monitoring systems [chen2012vehicle][bautista2016convolutional], the majority of this section is dedicated to describing the camera-based vehicle classification systems. In addition, considering the recent research efforts to develop camera-based systems that are mounted on aerial platforms such as unmanned aerial vehicles (UAVs) and satellites, this section also discusses those vehicle classification systems. Although the vehicle classification systems based on cameras have numerous advantages such as high classification accuracy and the capability of covering multiple lanes, their major downside is privacy concerns. As such, we discuss a number of privacy-preserving solutions such as the ones based on infrared sensors [odat2017vehicle] and laser scanners [sandhawalia2013vehicle]. Table II summarizes the characteristics of the over-roadway-based vehicle classification systems.

IV-A Cameras

Fig. 5: A camera-based traffic monitoring system [unzueta2012adaptive].

The most widely adopted sensor for non-intrusive vehicle classification systems is a camera [chen2012vehicle][bautista2016convolutional]. A camera provides rich information for vehicle classification such as the visual features and geometry of passing vehicles [tseng2002real]. In comparison with the in-road-based systems where multiple sensors are needed to cover multiple lanes, a single camera is sufficient for classifying vehicles in multiple lanes (Fig. 5). Advanced image processing technologies supported by sufficient processing power allow for classifying multiple vehicles very quickly and accurately.

A camera-based vehicle classification system generally captures an image of a passing car, extracts features from the image, and runs an algorithm to perform vehicle classification. As such, the camera-based systems can be categorized based on how the vehicle image is captured effectively (e.g., methods for reducing the impact of the background image), the types of features extracted from the image, and the mechanisms for performing classification based on the extracted features. A recent trend is that more and more machine learning techniques are applied to extracting features automatically and effectively, and to processing the features to build classification models. While earlier systems use simple classification models based on SVM, kNN, and decision trees, more advanced machine learning algorithms such as deep learning are increasingly adopted.

Chen et al. focus on effectively capturing a car image from video footage [chen2012vehicle]. The authors adopt the background Gaussian Mixture Model (GMM) [zivkovic2006efficient] and the shadow removal algorithm [chen2009background] to reduce the negative impacts on vehicle classification caused by shadow, camera vibration, illumination changes, etc. The Kalman filter is used for vehicle tracking and SVM is used to perform vehicle classification. Experiments were performed with real video footage obtained from cameras deployed in Kingston upon Thames, UK. Vehicles were classified into five categories, i.e., motorcycles, cars, vans, buses, and unknown vehicles. The classification accuracy for these vehicle types was 94.6%.

Unzueta et al. also focus on effectively capturing the car image [unzueta2012adaptive]. Specifically, the authors address the problem of dynamic changes of the background in challenging environments, such as illumination changes and headlight reflections, to improve the classification accuracy. A multicue background subtraction method is developed in which the segmentation thresholds are dynamically adjusted to account for dynamic changes of the background, and the segmentation is supplemented with extra features extracted from gradient differences [unzueta2012adaptive]. After that, a two-step approach is proposed to derive spatial and temporal features of a vehicle for classification, i.e., by first generating 2-D estimations of a vehicle silhouette, and then augmenting them to 3-D vehicle volumes for more accurate vehicle classification. Three vehicle types are considered, namely, two wheels, light vehicles, and heavy vehicles. The classification accuracy was 92.6%.

Fig. 6: An example of the virtual detection lines (VDL) [mithun2012detection].

Mithun et al. propose a multiple virtual detection lines (MVDLs)-based vehicle classification system [mithun2012detection]. A VDL is a set of line indices of a frame whose position is perpendicular to the moving direction of a vehicle (Fig. 6). The pixel strips on a VDL in chronological frames create a time spatial image (TSI). Multiple TSIs are used for vehicle detection and classification to reduce misdetection, which is mostly due to occlusion. Specifically, a two-step process is proposed for classification. Vehicles are first classified into four general types based on the shape-based features. After that, another classification scheme based on the texture-based and shape-invariant features is applied to classify a vehicle into more specific types including motorbikes, rickshaws, autorickshaws, cars, jeeps, covered vans, and busses. The classification accuracy was between 88% and 91%.
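The construction of a TSI can be sketched in a few lines. The sketch assumes grayscale frames stored as nested lists and a VDL that coincides with one horizontal pixel row (i.e., vehicles move vertically in the image); the names `build_tsi` and `build_tsis` are hypothetical and the code is not taken from [mithun2012detection].

```python
from typing import Dict, List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def build_tsi(frames: List[Frame], vdl_row: int) -> List[List[int]]:
    """Stack the pixel strip on one VDL from each frame, oldest frame first."""
    return [list(frame[vdl_row]) for frame in frames]


def build_tsis(frames: List[Frame],
               vdl_rows: List[int]) -> Dict[int, List[List[int]]]:
    """Build one TSI per virtual detection line."""
    return {row: build_tsi(frames, row) for row in vdl_rows}


# Three hypothetical 3x4 frames; a bright object crosses row 1 in frame 1.
frames = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 9, 9, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 9, 9, 0]],
]
tsis = build_tsis(frames, vdl_rows=[1, 2])
```

Each row of a TSI corresponds to one time step, so a passing vehicle appears as a blob whose height reflects how long it occupied the line.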

Identifying effective features from the car images is another important challenge for the camera-based vehicle classification systems. Karaimer et al. combine the shape-based classification and the Histogram of Oriented Gradient (HOG) feature-based classification methods in order to improve the classification performance [karaimer2015combining]. Specifically, kNN is used for the shape-based features including convexity, rectangularity, and elongation, and SVM is used with the HOG features. The two methods are combined using different combination schemes, i.e., the sum rule and the product rule. The sum rule selects the vehicle class that maximizes the sum of the probabilities produced by the two classifiers, and the product rule selects the class that maximizes the product of the two probabilities. Three vehicle classes were used, namely, cars, vans, and motorcycles. The classification accuracy was 96.5%.
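The two combination rules can be illustrated with a minimal sketch. It assumes each classifier outputs a probability for every class; the function name and the probability values are made up for illustration and are not results from [karaimer2015combining].

```python
from typing import Dict


def combine(p_first: Dict[str, float], p_second: Dict[str, float],
            rule: str = "sum") -> str:
    """Pick the class that maximizes the sum or product of two probabilities."""
    scores = {}
    for label in p_first:
        if rule == "sum":
            scores[label] = p_first[label] + p_second[label]
        elif rule == "product":
            scores[label] = p_first[label] * p_second[label]
        else:
            raise ValueError("rule must be 'sum' or 'product'")
    return max(scores, key=scores.get)


# Hypothetical outputs of a kNN (shape features) and an SVM (HOG features).
p_knn = {"car": 0.70, "van": 0.30, "motorcycle": 0.00}
p_svm = {"car": 0.05, "van": 0.40, "motorcycle": 0.55}

by_sum = combine(p_knn, p_svm, rule="sum")          # "car"
by_product = combine(p_knn, p_svm, rule="product")  # "van"
```

The example is chosen so that the two rules disagree: the product rule penalizes a class that either classifier considers very unlikely, whereas the sum rule lets one confident classifier dominate.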

Machine learning algorithms are used to extract effective features automatically. Huttunen et al. designed a deep neural network (DNN) that extracts features from a car image with background, removing the preprocessing steps of detecting a car from an image and aligning a bounding box around the car [huttunen2016car]. The hyper-parameters of the neural network are selected based on a random search that finds a good combination of the parameters [bergstra2012random]. The proposed system was evaluated with a database consisting of 6,555 images with four different vehicle types, i.e., small cars, busses, trucks, and vans. The classification accuracy was 97%.
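Random search over hyper-parameters can be sketched as follows. This is a generic illustration in the spirit of [bergstra2012random], not the procedure of [huttunen2016car]: the search space, the parameter names, and the toy objective (a stand-in for validation accuracy) are all assumptions.

```python
import random
from typing import Any, Callable, Dict, List, Tuple


def random_search(objective: Callable[[Dict[str, Any]], float],
                  space: Dict[str, List[Any]],
                  n_trials: int, seed: int = 0) -> Tuple[Dict[str, Any], float]:
    """Sample random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw each hyper-parameter independently from its candidate values.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space and a toy objective replacing real training.
space = {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [32, 64, 128]}


def toy_objective(params: Dict[str, Any]) -> float:
    return (-abs(params["learning_rate"] - 0.01)
            - abs(params["hidden_units"] - 64) / 1000.0)


best_params, best_score = random_search(toy_objective, space, n_trials=20)
```

In practice the objective would train the network with the sampled configuration and return its validation accuracy, which is the expensive step that random search tries to spend wisely.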

Dong et al. apply a semisupervised convolutional neural network (CNN) for feature extraction [dong2015vehicle]. In this work, vehicle front view images are used for classification. Specifically, the CNN consists of two stages. In the first stage, the authors design an unsupervised learning mechanism to obtain an effective filter bank of the CNN to capture discriminative features of vehicles. In the second stage, the Softmax classifier is trained based on multi-task learning [kumar2012learning] to provide the probability for each vehicle type. Experiments were conducted with two data sets, i.e., the BIT-Vehicle data set [bit_dataset], and the data set used by Peng et al. [peng2012vehicle]. The former data set consists of 9,850 vehicle images with six types: bus, microbus, minivan, sedan, SUV, and truck; the latter includes 3,618 daylight and 1,306 nighttime images with truck, minivan, bus, passenger car, and sedan. The classification accuracies for the two data sets were 88.1% and 89.4%, respectively.

In line with the research based on advanced machine learning techniques, Adu-Gyamfi et al. develop a vehicle classification system using a deep convolutional neural network (DCNN) that is designed to extract vehicular features quickly and accurately [adu2017automated]. Unlike other approaches, the DCNN model is pretrained with an auxiliary data set [russakovsky2015imagenet] and is then fine-tuned with the domain specific data collected from the Virginia and Iowa DOT CCTV camera database. The vehicles were classified into FHWA’s 13 vehicle types. The results show that the classification accuracy was greater than 89%.

Although machine learning techniques advanced the feature extraction process and improved the vehicle classification accuracy, numerous challenges still remain to be addressed. One of those challenges is to classify visually similar vehicles. Javadi et al. propose to apply the fuzzy c-means (FCM) clustering [bezdek2013pattern] based on vehicle speed as an additional feature to address this challenge [javadi2018vehicle]. Specifically, they exploit the prior knowledge about varying traffic regulations and vehicle speeds to enhance the classification accuracy for the vehicles with similar dimensions. The proposed classification approach was evaluated with the vehicle images collected for 10 hours from a real highway, classifying the vehicles into four types, namely private cars, light trailers, buses, and heavy trailers. The classification accuracy of 96.5% was achieved.
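To make the clustering step concrete, the following is a minimal sketch of standard fuzzy c-means applied to one-dimensional speed values. It is a textbook formulation and not the implementation of [javadi2018vehicle]; the speed samples, the initial centers, and the fuzzifier m = 2 are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def memberships_1d(values: Sequence[float], centers: Sequence[float],
                   m: float) -> List[List[float]]:
    """Membership of each value in each cluster; every row sums to 1."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # A value sitting exactly on a center belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows


def fuzzy_c_means_1d(values: Sequence[float],
                     initial_centers: Sequence[float],
                     m: float = 2.0, n_iter: int = 50
                     ) -> Tuple[List[float], List[List[float]]]:
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(initial_centers)
    for _ in range(n_iter):
        u = memberships_1d(values, centers, m)
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            if total > 0.0:  # keep the old center if no value supports it
                centers[j] = sum(wi * x for wi, x in zip(w, values)) / total
    return centers, memberships_1d(values, centers, m)


# Hypothetical speeds (km/h) of vehicles with similar dimensions.
speeds = [82.0, 85.0, 88.0, 118.0, 122.0, 125.0]
centers, u = fuzzy_c_means_1d(speeds, initial_centers=[70.0, 130.0])
```

The resulting membership degrees could then serve as an additional soft feature: a vehicle whose speed belongs strongly to the slower cluster is more likely to be subject to a lower speed limit.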

Another challenge for applying machine learning techniques for automating background processing and feature extraction is that different parts of an image of a passing car are treated without distinctions, degrading the performance [krizhevsky2012imagenet][sivaraman2013looking]. Zhao et al. focus on this problem that potentially misses the key part of a car image [zhao2017deep]. Their work is motivated by the human vision system that distinguishes the key parts of an image from the background, which is called the multiglimpse and visual attention mechanism [rensink2000dynamic]. This remarkable capability of focusing on only the relevant part of the image allows the human to classify images very accurately. The key idea of their work is thus to exploit the visual attention mechanism to generate a focused image first and provide the image as input to CNN for more accurate vehicle classification. They performed experiments to classify a vehicle into five types, sedans, vans, trucks, SUVs, and coaches, and achieved the classification accuracy of 97.9%.

Theagarajan et al. observe that machine learning algorithms work effectively only with an extremely large amount of image data [theagarajan2017eden]. The authors also found that most camera-based classification systems are built upon small traffic data sets that do not sufficiently take into account the variability in weather conditions, camera perspectives, and roadway configurations. To address this problem, they develop a deep network-based vehicle classification mechanism utilizing the largest data set known to the research community. The data set contains 786,702 vehicle images from cameras at 8,000 different locations in the USA and Canada. With this large amount of data, they classified vehicles into 11 types including articulated trucks, background, busses, bicycles, cars, motorcycles, nonmotorized vehicles, pedestrians, pickup trucks, single unit trucks, and work vans. They obtained a high classification accuracy of 97.8%.

The same data set [theagarajan2017eden] was used by Kim and Lim [kim2017vehicle]. Different from other works based on CNN, the authors apply a data augmentation technique to enhance the performance under different sample sizes for different types of cars. The authors also apply a weighting mechanism that assigns a weight to each vehicle type. The classification accuracy was 97.8%. The imbalanced dataset problem was also addressed by Liu et al. [liu2017ensemble]. Specifically, to increase the number of samples for certain vehicle types, they apply various data augmentation techniques such as random rotation, cropping, flips, and shifts, and create an ensemble of CNN models based on the parameters obtained from the augmented dataset. The proposed work was tested with the MIO-TCD classification challenge dataset, which classifies the vehicles into 11 types. They achieved the classification accuracy of 97.7%.
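One common way to compensate for different sample sizes is to weight each class inversely to its frequency. The sketch below uses the widely used "balanced" heuristic (total samples divided by the number of classes times the class count); it is an assumption for illustration, since the exact weighting scheme of [kim2017vehicle] is not described here.

```python
from collections import Counter
from typing import Dict, List


def inverse_frequency_weights(labels: List[str]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical imbalanced label set: many cars, few buses.
labels = ["car"] * 8 + ["bus"] * 2
weights = inverse_frequency_weights(labels)
# weights == {"car": 0.625, "bus": 2.5}
```

Such weights are typically multiplied into the per-sample loss during training so that errors on rare vehicle types count more.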

The vehicle occlusion problem is another challenge for applying machine learning algorithms to camera-based vehicle classification. Chang et al. propose an effective model based on the Recursive Segmentation and Convex Hull (RSCH) to address this problem [chang2018vision]. Specifically, vehicles are assumed to be convex regions, and a decomposition optimization model is derived in order to separate vehicles from a multi-vehicle occlusion. After addressing the occlusion problem, vehicle classification is performed with a regular CNN. Experiments were conducted with the CompCars dataset [yang2015large], which consists of 136,726 vehicle images with five types: sedans, SUVs, vans, busses, and trucks. For this dataset, the authors achieved the classification accuracy of 97.6%.

Some vehicle classification systems integrate a camera with a different type of sensor. Hasnat et al. significantly improve the classification accuracy by integrating a camera with optical sensors [hasnat2018new]. They call it a hybrid classifier system. Specifically, the system consists of both an optical sensor-based classifier and a CNN-based classifier. They then apply the Gradient Boosting technique [friedman2001greedy] to combine the decisions from these classifiers, constructing a stronger predictor based on the base predictors. Five vehicle classes are defined for classification: light vehicles (height less than 2m), intermediate vehicles (height between 2m and 3m), heavy vehicles (height greater than 3m), heavy vehicles with more than 2 axles, and motorbikes. The classification accuracy was 99.0%.

IV-B Aerial Platforms

Fig. 7: An example of an aerial image and vehicle detection using SVM [cao2011linear].

Cameras are mounted on aerial platforms such as UAVs and satellites in order to cover wider areas such as an entire roadway segment (Fig. 7). Despite the advantage of wider coverage, vehicle classification for aerial platforms is a non-trivial task due to the low image resolution. In fact, even vehicle detection itself is not an easy task. For example, Cao et al. develop a method for vehicle detection based on an airborne platform [cao2011linear]. The key contribution is to enhance the detection process by utilizing a new feature called the boosting HOG. A linear SVM is then used for classification. Videos were captured in an urban traffic environment to evaluate the proposed system. While most ground-based traffic monitoring systems achieve near 99% accuracy for vehicle detection (note that this is not for vehicle classification), the proposed system achieved the vehicle detection accuracy of 90%.

Due to the low image resolution, many aerial platform-based vehicle classification systems target only a limited number of vehicle types such as cars and trucks. In particular, Liu and Mattyus focus on improving the computation speed for vehicle classification [liu2015fast]. A binary sliding window detector is applied to detect a vehicle from an aerial image. Once a vehicle is detected, the HOG features are extracted [dalal2005histograms] using a neural network with a single hidden layer [lecun2012efficient]. Vehicles are then classified into two types, i.e., cars and trucks. The classification accuracy was as high as 98.2% due to the small number of vehicle types for classification.

With the help of advanced machine learning techniques, the classification accuracy of some aerial platform-based vehicle classification systems has improved. Yet, the results are not comparable to those of the ground sensor-based vehicle classification systems. Tan et al. develop a two-step vehicle classification method using aerial images [tan2018vehicle]. A change detection scheme is applied to detect vehicles based on pixel-level changes represented as a heat map. A standard CNN is then applied for classification. In particular, they adopt the fully connected layer of the AlexNet model [krizhevsky2012imagenet], and the final classification layer of the Inception model [szegedy2016rethinking]. Experiments were performed with the images collected from a manned aircraft. The vehicles were classified into four classes: sedans, vans, pickups, and trucks. The average classification accuracy was 80.3%.

Audebert et al. also apply a standard CNN to aerial images for vehicle classification [audebert2017segment]. Various CNN models are adopted such as LeNet [lecun1998gradient], AlexNet [krizhevsky2012imagenet], and VGG-16 [simonyan2014very] pre-trained with existing training datasets. To overcome the discrepancy between the training datasets and testing datasets, the authors utilize data normalization and augmentation techniques based on geometric operations including translations, zooms, and rotations of images. The experiments were performed with the NZAM/ONERA Christchurch dataset, classifying the vehicles into cars, vans, pickups, and trucks. The highest classification accuracy of 80% was achieved with the VGG-16 model.
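Geometric augmentation of this kind can be sketched with plain nested lists. The sketch covers a horizontal flip, a 90-degree rotation, and a horizontal translation; it is only an illustration of the general idea (zooming and arbitrary-angle rotation are omitted), and the function names are hypothetical rather than taken from [audebert2017segment].

```python
from typing import List

Image = List[List[int]]  # grayscale image: rows of pixel intensities


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate_x(image: Image, dx: int, fill: int = 0) -> Image:
    """Shift the image by dx pixels (right if positive), padding with fill."""
    width = len(image[0])
    dx = max(-width, min(width, dx))
    if dx >= 0:
        return [[fill] * dx + row[:width - dx] for row in image]
    return [row[-dx:] + [fill] * (-dx) for row in image]


def augment(image: Image) -> List[Image]:
    """Return the original image together with three transformed copies."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image),
            translate_x(image, 1)]


sample = [[1, 2], [3, 4]]
variants = augment(sample)
```

Each transformed copy keeps the label of the original image, which is how augmentation enlarges the training set without new annotation effort.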

IV-C Privacy Preserving Solutions

Fig. 8: Vehicle classification based on infrared and ultrasonic sensors [odat2017vehicle].

A major downside of the camera-based vehicle classification systems, regardless of the type of platform (ground or aerial), is the privacy concerns. Various privacy preserving solutions have been developed using different kinds of sensors. Odat et al. propose a system based on the combination of infrared and ultrasonic sensors [odat2017vehicle] (Fig. 8). The Bayesian network and neural network are used to fuse the features extracted from the data collected by both sensors. Specifically, the heights of different parts of a passing vehicle, which are computed using the measurements of the ultrasonic sensor, are used as the key features. Other features extracted from the infrared sensors, i.e., the inverse of the estimated delay and the estimated duration, are also used for classification. The passing vehicles were classified into five types: sedan, pickup truck, SUV, bus, and two wheeler. The best classification accuracy was 99%.

Sandhawalia et al. develop a privacy preserving solution using laser scanners [sandhawalia2013vehicle]. The laser scanners perform a 3D scan of the vehicle surface, allowing for accurate estimation of the width, height, and length of the passing vehicle. It is noted that although laser scanners address the privacy concerns, they are sensitive to extreme weather conditions and their installation cost is higher than that of cameras. The authors represent a laser scanner profile as an image to perform image classification. Specifically, an image representation technique, i.e., the Fisher vector [perronnin2010improving], is applied to extract effective features from a laser scanner image. In this work, the vehicles were classified into six types: passenger vehicles, passenger vehicles with one trailer, trucks, trucks with one trailer, trucks with two trailers, and motorcycles. The classification accuracy was 82.5%.

Another laser scanner-based approach is developed by Chidlovskii et al. [chidlovskii2014vehicle]. The key contribution of this vehicle classification system in comparison with [sandhawalia2013vehicle] is to utilize the specific domain knowledge, i.e., the vehicle shapes to enhance the classification accuracy. Specifically, vehicle shapes are extracted from the laser scans to analyze a vehicle as a multi-dimensional object. To address the space shift and scaling problem, the dynamic time warping (DTW) [berndt1994using] and the global alignment kernel (GA) [cuturi2007kernel] are used. The same six vehicle types as [sandhawalia2013vehicle] were used for experiments. The best classification accuracy achieved was 86.8%.
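Dynamic time warping can be sketched with the classic dynamic-programming recurrence. The sketch compares two one-dimensional profiles (for example, vehicle height along the scan direction) using the absolute difference as the local cost; the profile values are made up, and the code illustrates the general DTW algorithm rather than the specific pipeline of [chidlovskii2014vehicle].

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Minimum cumulative |a[i] - b[j]| cost over all monotonic alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                     cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                     cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]


# The second profile is the first one with its leading sample stretched,
# as could happen when the same vehicle is scanned at a different speed.
profile_a = [0.0, 1.0, 2.0, 1.0, 0.0]
profile_b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
stretched = dtw_distance(profile_a, profile_b)           # 0.0
shifted = dtw_distance([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])  # 2.0
```

Because DTW allows samples to be repeated during alignment, a stretched copy of a profile has zero distance to the original, which is the property that makes it useful for handling shift and scaling between scans.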

V Side-Roadway-Based Vehicle Classification

The side-roadway-based vehicle classification systems deploy sensors on a roadside. Similar to over-roadway-based systems, a key advantage of the side-roadway-based systems is the capability of covering multiple lanes simultaneously. Additionally, the side-roadway-based systems can be installed quickly and at a reduced cost because no traffic disturbance or lane closure is needed, which makes these systems especially appropriate for ad-hoc monitoring purposes. However, a critical challenge lies in classifying overlapping vehicles because it is difficult to obtain the sensor data for the occluded vehicles, and the sensor data for the front vehicle may be distorted significantly. Various kinds of sensors are used to implement the side-roadway-based systems, such as magnetic sensors [taghvaeeyan2014portable], acoustic sensors [ntalampiras2018moving], LIDAR [asborno2019truck], radar [raja2016analysis], radio transceivers [sliwa2018leveraging], and Wi-Fi transceivers [won2017witraffic]. Table III summarizes the characteristics of these side-roadway-based vehicle classification systems.

Major Equipment | Publications | Accuracy | Vehicle Classes | Key Features
Magnetic sensors | Taghvaeeyan TITS’14 [taghvaeeyan2014portable] | 83.0% | Class I (sedan), class II (SUV, pickup, van), class III (bus, two-three-axle trucks), class IV (articulated bus, four-to-six-axle truck) | Magnetic height as a key feature to address the problem of classifying vehicles with the same length
Magnetic sensors | Wang TITS’14 [wang2014easisee] | 93.0% | bicycles (including bicycles, electric bicycles and motorcycles), cars (including family cars, taxis, and SUVs), and minibuses | Magnetic sensor used for collaborative sensing with a camera to reduce power consumption
Magnetic sensors | Yang IEEE Sensors’15 [yang2015vehicle] | 93.6% | motorcycle, two-box, saloon, bus and sport utility vehicle (SUV) | Vehicle classification for low-speed congested traffic
Acoustic sensors | Bischof IS’10 [bischof2010autonomous] | 85.0% | cars and trucks | Acoustic sensors used to support autonomous training for the camera-based system
Acoustic sensors | Ntalampiras TETCI’18 [ntalampiras2018moving] | 96.3% | assault amphibian vehicle (AAV) and dragon wagon (DW) | A group of acoustic sensors; sensor-specific classification model; faulty sensor detection
LIDAR | Lee TRR’12 [lee2012side], JITS’15 [lee2015using] | 99.5% | motorcycle, passenger vehicle, passenger vehicle pulling a trailer, single-unit truck, single-unit truck pulling a trailer, and multiunit truck | Vehicle body information (vehicle length and height) extracted from accurate LIDAR data used as key features
LIDAR | Asborno TRR’19 [asborno2019truck] | 96.0% | van and container, platform, low-profile trailer, tank, and hopper and end dump | Designed specifically for classification of truck body types; the duration and vehicle body points are the main features
RF Transceivers | Haferkamp VTC’17 [haferkamp2017radio] | 99.0% | passenger cars, trucks | Received signal strength (RSSI) as a key feature; kNN and SVM for classification
RF Transceivers | Sliwa ITSC’18 [sliwa2018leveraging] | 89.1% | passenger cars, passenger cars with trailer, SUVs, minivans, vans, trucks, truck with trailers, buses, and transporters | Multiple sets of RF transmitters and receivers
Radar | Raja Sensors’16 [raja2016analysis] | 99.0% | compact, saloon and small sport utility vehicle (SUV) | Power spectral density of the time-domain signal as input to kNN; the classification accuracy depends on the distance between the radio receiver and the passing car
Wi-Fi Transceivers | Won ICCCN’17 [won2017witraffic] | 96.0% | passenger vehicles, and trucks | The first Wi-Fi-based traffic monitoring system that is built upon a pair of Wi-Fi transceivers to reduce the cost
Wi-Fi Transceivers | Won ArXiv’18 [won2018deepwitraffic] | 91.1% | motorcycle, passenger car, SUV, pickup truck, large truck | A Wi-Fi-based traffic monitoring system with an advanced machine learning technique to enable classification for more vehicle types
TABLE III: Side-roadway-based vehicle classification systems

V-A Magnetic Sensors

Magnetic sensors have been widely adopted by the in-road-based vehicle classification systems. However, the major limitation of the in-road-based systems is the high cost of installation and maintenance. In an effort to address this limitation, new vehicle classification systems have been developed that deploy magnetic sensors on a roadside. The basic mechanism of these side-roadway-based systems is similar to that of the in-road-based systems in that vehicle classification is performed based on the magnetic profile of a passing car. Beyond that, these systems address additional research challenges, such as classifying vehicles with similar body sizes (e.g., SUVs and pickup trucks) and classifying overlapping vehicles.

Taghvaeeyan et al. develop a vehicle classification system based on three-axis magnetic sensors (Fig. 9) deployed roadside, focusing on the problem of classifying vehicles with similar body sizes [taghvaeeyan2014portable]. The key idea is to utilize both the vehicle length and height as the main features for vehicle classification. The vehicle height information can be obtained because the sensors are deployed roadside. More precisely, while existing in-road-based systems based on magnetic sensors measure only the vehicle length, the proposed system obtains the vehicle height information by placing one magnetic sensor above another and measuring the ratio of the readings from the two sensors. Vehicles were classified into four categories: Class I (sedans), Class II (SUVs, pickups, and vans), Class III (buses, two- and three-axle trucks), and Class IV (articulated buses and four- to six-axle trucks). The classification accuracy was 83%.

Fig. 9: The three-axis AMR sensor used by [taghvaeeyan2014portable].

Yang and Lei focus on another interesting problem for magnetic sensor-based vehicle classification systems, i.e., classifying vehicles that are too close to each other, which typically happens under low-speed congested traffic conditions [yang2015vehicle]. When vehicles are too close, the magnetic signals are significantly distorted, making the vehicle classification process extremely challenging. To address this problem, the authors propose a hierarchical tree-based approach [kaewkamnerd2009automatic]. The key idea is to identify and extract effective features from the magnetic signal that are immune to signal distortions caused by the small inter-vehicle distance. Specifically, five features including the signal duration, signal energy, average energy, and ratios of the positive and negative energy are extracted. A hierarchical tree is constructed by comparing the values of these features, which is then used to classify vehicles into five categories: motorcycle, two-box, saloon, bus, and sport utility vehicle (SUV). The classification accuracy was 93.6%.
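As a rough illustration of the kind of features named above, the sketch below computes the duration, energy, average energy, and positive/negative energy ratios of a sampled magnetic signal. The exact definitions used in [yang2015vehicle] are not given here, so the formulas (sum of squared samples as energy, ratios taken relative to the total energy), the function name, and the sample values are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def magnetic_features(signal: Sequence[float], sample_rate_hz: float) -> Dict[str, float]:
    """Compute simple energy-based features from a baseline-removed magnetic signal."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    energy = sum(x * x for x in signal)                       # total signal energy
    positive_energy = sum(x * x for x in signal if x > 0)     # energy of positive samples
    negative_energy = sum(x * x for x in signal if x < 0)     # energy of negative samples
    return {
        "duration_s": n / sample_rate_hz,
        "energy": energy,
        "average_energy": energy / n,
        # Assumed definition: share of the total energy carried by each polarity.
        "positive_energy_ratio": positive_energy / energy if energy else 0.0,
        "negative_energy_ratio": negative_energy / energy if energy else 0.0,
    }


# Hypothetical five-sample signature recorded at 10 Hz.
features = magnetic_features([0.0, 1.0, 2.0, -1.0, 0.0], sample_rate_hz=10.0)
```

A hierarchical tree would then compare such feature values against thresholds learned from labeled data. Those thresholds are deliberately omitted because they are not reported in this survey.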

In some cases, magnetic sensors are used for collaborative sensing. EasiSee is a camera-based vehicle classification system [wang2014easisee], but it utilizes a magnetic sensor to reduce power consumption. Specifically, the magnetic sensor is used to detect a passing vehicle, and the camera is activated only when a vehicle is detected. The authors also develop an efficient image processing algorithm that focuses on reducing the computational complexity. Vehicles were classified into bicycles (including bicycles, electric bicycles, and motorcycles), cars (including family cars, taxis, and SUVs), and minibuses. The classification accuracy was 93%.
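The triggering idea can be sketched as a simple gate: the camera is powered only while the magnetic reading deviates from its baseline. This is a minimal sketch and not the EasiSee implementation. The baseline, threshold, hold time, and readings are hypothetical values chosen for illustration.

```python
from typing import Iterable, List


def camera_schedule(
    readings: Iterable[float],
    baseline: float,
    threshold: float,
    hold_samples: int = 2,
) -> List[bool]:
    """Return, for each magnetic sample, whether the camera should be powered on."""
    schedule: List[bool] = []
    remaining = 0  # samples for which the camera still stays on
    for reading in readings:
        if abs(reading - baseline) > threshold:
            # A vehicle disturbs the magnetic field: (re)start the hold timer.
            remaining = hold_samples
        schedule.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return schedule


# Hypothetical magnetometer samples with one passing vehicle at index 2.
schedule = camera_schedule([50.0, 50.0, 58.0, 50.0, 50.0, 50.0], baseline=50.0, threshold=5.0)
duty_cycle = sum(schedule) / len(schedule)  # fraction of time the camera is on
```

The duty cycle gives a rough proxy for the power saved relative to an always-on camera, under the assumption that the magnetic sensor itself draws negligible power.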

V-B Acoustic Sensors

The acoustic sensor-based vehicle classification systems capture the audio signal induced by a passing vehicle using microphone sensors. The success of these solutions depends largely on effective feature extraction from acoustic signals. However, since the performance of acoustic sensors is easily affected by noise, it is very challenging to identify such effective features. As a result, acoustic sensors are typically used to support the operation of other types of sensors such as cameras [bischof2010autonomous]. Alternatively, a group of acoustic sensors is deployed to mitigate the impact of noise and increase the classification accuracy [ntalampiras2018moving].

Bischof et al. adopt an acoustic sensor to support the self-learning process of a camera-based vehicle classification system [bischof2010autonomous]. The proposed system consists of audio-based and video-based classification systems. The audio-based system acts as a supervisor to enable autonomous training of the video-based system, obviating the need to label a huge amount of video data manually. Specifically, the audio-based system performs an a priori classification of a passing car and forwards the classification result, together with its confidence level, to the video-based system. The video-based system then uses the results to train its classification model autonomously. The proposed system was evaluated with different kinds of classifiers such as kNN, SVM, and ANN. Vehicles were classified into two types: trucks and cars. The classification accuracy was 85% for trucks and 71% for cars.
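One plausible way to use the forwarded labels and confidence levels is to keep only confident audio decisions as training labels for the video classifier. The sketch below assumes this filtering rule. The confidence threshold, type names, and feature vectors are hypothetical and are not taken from [bischof2010autonomous].

```python
from typing import List, NamedTuple, Sequence, Tuple


class AudioDecision(NamedTuple):
    """A priori decision of the audio-based classifier for one passing vehicle."""
    label: str
    confidence: float


def select_training_samples(
    video_features: Sequence[List[float]],
    audio_decisions: Sequence[AudioDecision],
    min_confidence: float = 0.8,  # assumed cut-off, not from the paper
) -> List[Tuple[List[float], str]]:
    """Pair video features with audio labels, keeping only confident decisions."""
    if len(video_features) != len(audio_decisions):
        raise ValueError("each video sample needs exactly one audio decision")
    return [
        (features, decision.label)
        for features, decision in zip(video_features, audio_decisions)
        if decision.confidence >= min_confidence
    ]


# Hypothetical per-vehicle video feature vectors and matching audio decisions.
video_features = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
audio_decisions = [
    AudioDecision("truck", 0.95),
    AudioDecision("car", 0.85),
    AudioDecision("car", 0.55),  # too uncertain, dropped from the training set
]
training_set = select_training_samples(video_features, audio_decisions)
```

The resulting pairs could then be fed to any of the classifiers mentioned above (kNN, SVM, or ANN) without manual labeling.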

Ntalampiras [ntalampiras2018moving] develops a wireless acoustic sensor network (WASN) that consists of multiple wireless microphone nodes to make the system resilient to environmental noise. An interesting aspect of this work is that sensor-specific classification models are created at the sensor level, and the decisions are then combined at a higher level using a correlation-based dependence graph. In addition, a stationary checking algorithm is proposed to detect sensor faults, taking advantage of the multiple acoustic sensors. Experiments were conducted with the DARPA/IXOs SensIT dataset, which consists of two vehicle types, Assault Amphibian Vehicle (AAV) and Dragon Wagon (DW) [duarte2004vehicle]. The average classification accuracy was 96.3%.

V-C Lidar

Fig. 10: An example of a LIDAR-based vehicle classification system [lee2012side].

A light detection and ranging (LIDAR) sensor emits eye-safe laser light and records the reflections to calculate the positions of points in the environment, such as the road, passing vehicles, and vegetation. Based on the collected data, effective features such as the size and shape of the passing car are extracted to perform vehicle classification. LIDAR is especially powerful in identifying the shape of a passing car due to its high-precision sensing. However, the vehicle occlusion problem remains a challenge for LIDAR-based vehicle classification systems.

Lee and Coifman develop a LIDAR-based vehicle classification system [lee2012side][lee2015using]. Two LIDAR sensors, mounted on the driver side of a car that is deployed roadside, scan the body of a passing car vertically (Fig. 10). Specifically, eight features are identified and extracted from the LIDAR data: the vehicle height, vehicle length, middle drop, height at middle drop, front vehicle height, front vehicle length, rear vehicle height, and rear vehicle length. The middle drop is used to identify vehicles pulling trailers, and the height at the middle drop is used to differentiate between passenger vehicles with trailers and trucks with trailers. A classification tree is built by comparing the values of the features. Six vehicle classes were used for classification, i.e., motorcycle, passenger vehicle, passenger vehicle pulling a trailer, single-unit truck or bus, single-unit truck or bus pulling a trailer, and multi-unit truck. They achieved a classification accuracy of 99.5%.

Asborno et al. focus on the classification of truck body types [asborno2019truck]. Two LIDAR units are deployed roadside. Two key features are defined, i.e., the duration and the array of vehicle body points. The duration is the time during which a passing vehicle is in front of the LIDAR unit, and the vehicle body points capture the shape of the truck body. Vehicle classification was performed by providing these two key features as input to several classifiers such as Decision Tree (DT), artificial neural network (ANN), support vector machine (SVM), and Naive Bayes (NB). The proposed system was deployed at an interstate location to classify five-axle tractor-trailers into five different truck body types (van and container, platform, low-profile trailer, tank, and hopper and end dump). They obtained a classification accuracy of up to 96%.

V-D Radar

The basic mechanism of the radar-based vehicle classification systems is similar to that of the LIDAR-based systems. The difference is that while LIDAR sensors use laser beams, radar sensors use radio waves. Radar sensors are less vulnerable to weather and light conditions than LIDAR, but LIDAR sensors provide a more accurate representation of the vehicle body.

Raja et al. use a passive forward scattering radar (FSR) for vehicle classification [raja2016analysis]. The radar cross section information is analyzed in the time domain for de-noising and normalization. Then, the power spectral density (PSD) of the time-domain signal is calculated using the Welch algorithm [welch1967use]. The PSD estimates the power of the signal at different frequencies and is used as input to a classifier. The large data size of the PSD spectral signature is reduced using Principal Component Analysis (PCA). After that, kNN is applied to classify vehicles into three types: compact, saloon, and small sport utility vehicle (SUV). The classification accuracy was influenced by the distance between the receiver and the car, i.e., the classification accuracy was 99% at 5m and 82.1% at 20m.
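For readers unfamiliar with Welch's method, the following is a minimal pure-Python sketch of the averaged, windowed periodogram it is based on. It is not the processing chain of [raja2016analysis]: the Hann window, 50% overlap, per-segment mean removal, segment length, and relative (unnormalized) scaling are assumptions, and the test tone is synthetic. In practice a library routine such as `scipy.signal.welch` would normally be used, and PCA and kNN would follow as separate steps.

```python
import cmath
import math
from typing import List, Optional, Sequence


def welch_psd(
    signal: Sequence[float],
    segment_len: int,
    overlap: Optional[int] = None,
) -> List[float]:
    """Estimate a one-sided power spectrum (relative scale) with Welch's method."""
    if overlap is None:
        overlap = segment_len // 2  # assumed default: 50% overlap
    step = segment_len - overlap
    if step <= 0 or len(signal) < segment_len:
        raise ValueError("invalid segment length or overlap")
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / segment_len) for n in range(segment_len)]
    window_power = sum(w * w for w in window)
    n_bins = segment_len // 2 + 1
    psd = [0.0] * n_bins
    n_segments = 0
    for start in range(0, len(signal) - segment_len + 1, step):
        segment = signal[start:start + segment_len]
        mean = sum(segment) / segment_len
        tapered = [(x - mean) * w for x, w in zip(segment, window)]
        for k in range(n_bins):
            # Naive DFT of the tapered segment at frequency bin k.
            coeff = sum(
                tapered[n] * cmath.exp(-2j * math.pi * k * n / segment_len)
                for n in range(segment_len)
            )
            psd[k] += abs(coeff) ** 2 / window_power
        n_segments += 1
    # Average the modified periodograms over all segments.
    return [p / n_segments for p in psd]


# Synthetic tone that completes exactly 4 cycles per 32-sample segment.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
psd = welch_psd(tone, segment_len=32)
peak_bin = psd.index(max(psd))
```

Averaging over overlapping segments trades frequency resolution for a lower-variance estimate, which is why the resulting spectrum is a convenient, fixed-length input for dimensionality reduction and classification.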

V-E RF Transceivers

The propagation of radio frequency (RF) signals is influenced by a passing vehicle. Specifically, an RF transmitter and a receiver are deployed on opposite sides of a road. When a car passes, the line of sight between the transmitter and the receiver is interrupted, resulting in attenuation and reflection of the RF signals. Consequently, the receiver captures distinctive patterns in the received RF signals that depend on the shape and size of the passing car. These unique patterns are used to classify the vehicles.

Haferkamp et al. focus on the attenuation of the RF signal due to a passing car and use it as a key feature for vehicle classification [haferkamp2017radio]. The signal attenuation is represented by the received signal strength indicator (RSSI). The RSSI traces corresponding to the passing vehicle are provided as input to classifiers, i.e., kNN and SVM. Five-fold cross validation is used to evaluate the classifiers. The vehicles were classified into passenger cars and trucks. The classification accuracy was 99%, which is quite high, partly due to the small number of vehicle types.
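The evaluation procedure described above can be sketched with a small k-nearest-neighbor classifier and a five-fold split. This is a minimal illustration and not the pipeline of [haferkamp2017radio]: the synthetic RSSI traces, the dip depths and widths, the Euclidean distance, and k = 3 are all assumptions, and the resulting score says nothing about real-world accuracy.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (RSSI trace in dBm, vehicle label)


def knn_predict(train: List[Sample], trace: Sequence[float], k: int = 3) -> str:
    """Label a trace by majority vote among its k nearest training traces."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], trace))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(samples: List[Sample], folds: int = 5, k: int = 3) -> float:
    """Return the fraction of correctly classified samples over all folds."""
    correct = 0
    for fold in range(folds):
        test = samples[fold::folds]  # every folds-th sample forms the test fold
        train = [s for i, s in enumerate(samples) if i % folds != fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(samples)


def synthetic_trace(depth_db: float, width: int, offset: int) -> List[float]:
    """Hypothetical 16-sample RSSI trace with a rectangular attenuation dip."""
    base = -60.0
    start = 4 + offset
    return [base - depth_db if start <= i < start + width else base for i in range(16)]


samples: List[Sample] = []
for i in range(10):
    # Assumption: trucks block the link longer and more deeply than cars.
    samples.append((synthetic_trace(8.0 + 0.2 * i, 3, i % 2), "passenger car"))
    samples.append((synthetic_trace(20.0 + 0.2 * i, 6, i % 2), "truck"))

accuracy = cross_validate(samples)
```

Because every sample is used for testing exactly once, the cross-validated accuracy is less sensitive to a lucky train/test split than a single hold-out evaluation.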

Sliwa et al. utilize low-rate wireless personal area networks (LR-WPANs), i.e., the IEEE 802.15.4 standard, to capture the radio fingerprint of a passing vehicle for vehicle classification [sliwa2018leveraging]. Similar to [haferkamp2017radio], RSSI is used as the main feature, while the proposed system is designed to achieve more accurate and reliable vehicle classification. Specifically, three transmitters and three receivers are deployed on each side of the street with fixed longitudinal distances. Three different classifiers are adopted, i.e., SVM [cortes1995support], CNN [lecun1989generalization], and Random Forests (RF) [breiman2001random]. The system classifies vehicles into nine different types: passenger cars, passenger cars with trailer, SUVs, minivans, vans, trucks, truck with trailers, buses, and transporters. The average classification accuracy was 89.1%.

V-F Wi-Fi Transceivers

Recently, Wi-Fi-based traffic monitoring systems have been developed specifically to address the persistent cost issue of deploying a large number of traffic monitoring systems to cover long stretches of rural highways. The idea is to leverage the unique Wi-Fi channel state information (CSI) patterns [halperin2011tool] induced by passing vehicles to perform vehicle classification. Specifically, the spatial and temporal correlations of CSI phase and amplitude enable effective vehicle classification. In particular, the low cost of off-the-shelf Wi-Fi transceivers enables large-scale deployment of traffic monitoring systems. Won et al. develop the first prototype system and demonstrate an average vehicle classification accuracy of 96% [won2017witraffic]. However, the prototype classifies vehicles only into passenger cars and trucks. In an extended version of the work, the authors apply an advanced machine learning technique, i.e., a convolutional neural network (CNN), to extract effective features from the CSI data automatically, enabling classification of more vehicle types including motorcycles, passenger cars, SUVs, pickup trucks, and large trucks [won2018deepwitraffic]. The classification model is trained on preprocessed CSI data, and numerous techniques are applied to improve the classification accuracy. They achieved an average classification accuracy of 91.1%.

VI Challenges for Future Research

We have witnessed significant development of vehicle classification systems in the past decade. Thanks to the recent advances in sensing, machine learning, and wireless communication technologies, the classification accuracy has improved greatly at a significantly reduced cost. However, these emerging vehicle classification systems have left a number of open questions as well. In this section, we discuss these challenges and several future research directions.

Fig. 11: The classification accuracy for different numbers of vehicle types.

A standard that defines a list of vehicle types for classification is needed so that vehicle classification systems can be evaluated on the same set of vehicle types. With such a standard, system developers and researchers will be able to evaluate the performance of their systems more effectively, and users such as government agencies will be able to compare various vehicle classification systems fairly and select the most appropriate solution. Unfortunately, existing vehicle classification systems have been tested with widely varying types and numbers of vehicles. Fig. 11 displays the classification accuracy against the number of vehicle types for the vehicle classification systems reviewed in this article. The fitted curve in this figure indicates that the systems evaluated with a smaller number of vehicle types tend to have higher classification accuracy. However, high classification accuracy does not guarantee consistently good performance for different vehicle types.

Another important problem that makes fair comparison of vehicle classification systems difficult is that different systems are evaluated under different experimental conditions. Numerous factors should be controlled to allow for fair comparison of performance, such as the number of lanes, obstacles, and weather conditions. For example, weather conditions affect the performance of certain types of sensors such as cameras, LIDAR, radio, and Wi-Fi. Side-firing sensors are significantly affected by the number of lanes due to overlapped vehicles. Some sensors, such as acoustic sensors, are exceptionally vulnerable to noise. A universally accepted standard for experimental configurations is needed.

The vehicle classification systems should conform to a common set of performance metrics. However, numerous vehicle classification systems focus only on measuring the classification accuracy while ignoring other performance metrics such as the cost of maintenance and installation, the capability of classifying overlapped vehicles, sustainability (duration of operation), and resiliency to weather conditions and noise. For example, while camera-based classification systems achieve high classification accuracy, these systems suffer from privacy concerns. Similarly, many in-road-based classification systems have high classification accuracy due to close contact with passing cars, but these systems are very costly to build and maintain.

One of the critical challenges, especially for side-roadway-based classification systems, is the vehicle occlusion problem. The operation of numerous kinds of sensors such as magnetic sensors, LIDAR, radar, RF, and Wi-Fi is disturbed by occluding vehicles, making it nearly impossible to accurately classify the overlapped vehicles. A possible approach is to take advantage of the over-roadway-based systems to develop more efficient side-roadway-based systems. Specifically, the side-firing sensors can be placed at different heights so that each sensor covers one lane explicitly without being interrupted by the vehicles in other lanes. For example, a LIDAR sensor can be configured to record reflections from a targeted lane only. To the best of our knowledge, there are no side-roadway-based vehicle classification systems that consider such a sensor placement strategy to overcome the vehicle occlusion problem.

More and more vehicle classification systems depend on machine learning techniques. To achieve high classification accuracy, however, a huge amount of data must be collected to train an effective classification model. In particular, the manual labeling process for training the classification model requires a significant amount of time and effort. Obtaining the ground truth data also requires extra effort. A possible future research direction is to develop a “closed loop self-learning” vehicle classification system. Once deployed, such a system will train its classification model autonomously and continuously evolve based on trial and error.

Although we have seen that many classification systems achieve very high classification accuracy, achieving near 100% classification accuracy, especially for a large number of vehicle types, is still a very challenging task. One possible reason for the difficulty is that most solutions rely on a single type of sensor for vehicle classification. Few works utilize a hybrid approach that combines the advantages of different types of sensors, or even different types of deployment methods, e.g., a combination of side-roadway-based and over-roadway-based systems. Such heterogeneous sensor systems will communicate and exchange various kinds of information to offset their weaknesses and capitalize on their strengths to achieve higher classification accuracy. For example, the camera-based system may be adaptively controlled based on the presence of a vehicle detected with a low-power sensor in order to reduce the power consumption. Similarly, the camera-based system may be activated only when the lighting condition is adequate, in coordination with a light sensor, and a different kind of monitoring system such as an infrared sensor-based system can be activated at night. To the best of our knowledge, no research has been performed that identifies the optimal method for integrating various classification systems together. We envision that this review paper will be a useful resource for the development of such collaborative systems.

With the rapid development of vehicle-to-everything (V2X) technology, we will see a mix of vehicles equipped with V2X devices and traditional ones on highways in the very near future. The traffic monitoring systems should provide support for classifying these V2X-equipped vehicles. Fortunately, classifying these vehicles can be simple if they are allowed to send the information about the vehicle type as a V2X message to the classification system. Yet, numerous technical challenges must be addressed to create an effective protocol that enables seamless communication between passing cars and the traffic monitoring system, such as reliable and secure data transmission, dynamic range adjustment, interference reduction, support for both DSRC and LTE, and definition of the message format.

VII Conclusion

We presented a review of traffic monitoring systems focusing on the key functionality of vehicle classification. By categorizing the vehicle classification systems into three types according to how sensors are installed, i.e., in-roadway, over-roadway, and side-roadway-based systems, we discussed various research issues, methodologies, hardware designs, and limitations. We also discussed a number of research challenges and future research directions. We expect that this broad coverage of the vehicle classification systems developed in the past decade will be a useful resource for academia, industry, and government agencies in selecting appropriate vehicle classification solutions for their traffic monitoring applications.

Acknowledgement

This research was supported in part by the Competitive Research Grant Program (CRGP) of South Dakota Board of Regents (SDBoR), and in part by Global Research Laboratory Program (2013K1A1A2A02078326) through NRF, and DGIST Research and Development Program (CPS Global Center) funded by the Ministry of Science, ICT & Future Planning of South Korea.

References