DeepWiTraffic: Low Cost WiFi-Based Traffic Monitoring System Using Deep Learning

12/19/2018 ∙ by Myounggyu Won, et al. ∙ South Dakota State University DGIST University of Memphis 0

A traffic monitoring system (TMS) is an integral part of Intelligent Transportation Systems (ITS) for traffic analysis and planning. This paper addresses the endemic cost issue of deploying a large number of TMSs to cover the extensive network of two-lane rural highways (119,247 miles in the U.S.). A low-cost and portable TMS called DeepWiTraffic, based on COTS WiFi devices and deep learning, is proposed. DeepWiTraffic enables accurate vehicle detection and classification by exploiting the unique WiFi Channel State Information (CSI) of passing vehicles. Spatial and temporal correlations of preprocessed CSI amplitude and phase data are identified and analyzed using deep learning to classify vehicles into five different types: motorcycle, passenger vehicle, SUV, pickup truck, and large truck. A large amount of CSI data of passing vehicles and the corresponding ground truth video data were collected for about 120 hours to validate the effectiveness of DeepWiTraffic. The results show an average detection accuracy of 99.4% and an average classification accuracy of 91.1% at a cost of about $1,000.




I Introduction

A traffic monitoring system (TMS) is an important component of Intelligent Transportation Systems (ITS) for improved safety and efficiency of transportation. TMSs are deployed to collect traffic data that characterize the performance of a roadway system. Different traffic parameters are measured, such as the number of vehicles, vehicle density, vehicle speed, and vehicle class. These traffic parameters are essential information in analyzing transportation systems and estimating future transportation needs [1]. For example, TMSs have played a key role in supporting the decision-making process for road improvement plans, assessing road network efficiency, and analyzing economic benefits.

The Department of Transportation (DOT) in each state is charged by the United States Federal Highway Administration (FHWA) to collect traffic information about vehicles traveling state and federal highways and roadways to improve the safety and efficiency [2]. As such, state highway and transportation agencies operate TMSs to perform vehicle counting, vehicle classification, and vehicle weight measurement. These TMSs are either temporary or permanent. There are 7,430 TMSs under operation in the U.S. as of August 2015 [3].

An endemic issue for many state DOTs is the high cost of deploying a sufficient number of TMSs to cover the vast land area of the U.S., especially considering the 119,247 miles of rural highways. According to the Georgia DOT, the minimum cost to install a continuous TMS on a two-lane rural roadway is about $25,000 [4], and 365-day vehicle classification on a two-lane rural roadway is more expensive, costing about $35,770 [5]. This paper aims to address this endemic cost issue by developing a low-cost, portable TMS based on WiFi channel state information (CSI) and deep learning. To this end, a novel TMS is proposed that achieves a vehicle detection accuracy of 99.4% and a classification accuracy of 91.1% at a cost of about $1,000.

Vehicle detection and classification techniques are largely categorized into three types: intrusive, non-intrusive, and off-roadway [6]. Intrusive solutions embed sensors such as magnetic detectors [7], vibration sensors [8], and inductive loops [9] in the roadway pavement. Non-intrusive approaches mount sensors like magnetic sensors [10], acoustic sensors [11], and LIDAR (Light Detection and Ranging) [12] either on roadsides or over the road. Off-roadway solutions use mobile sensor systems such as UAVs [13][14] or satellites [15]. A detailed and comprehensive discussion of existing technologies is presented in Section II.

Intrusive approaches are known to be the most expensive option, mainly due to the high cost of installation and maintenance, especially the traffic disruption and lane closures needed to ensure the safety of road workers. Furthermore, the effectiveness of these embedded sensors is easily affected by the condition of the pavement, and they often generate unreliable results. DOTs are increasingly adopting non-intrusive solutions. A widely adopted sensor in this category is a camera. However, it has been reported that camera performance degrades when vision obstructions are present, and even more severely in adverse weather conditions. Furthermore, cameras raise privacy concerns and are expensive, especially because of the installation cost, since they must be fixed at a certain mounting height for optimum performance. Other sensors for non-intrusive solutions, such as magnetic sensors and acoustic sensors, require precise calibration of sensor direction and placement, and are thus not appropriate for general and ad hoc deployment [16]. Although some sensors such as LIDAR guarantee very good performance, those sensors are extremely expensive. Thanks to recent advances in UAV technologies, off-roadway approaches are receiving greater attention. However, these solutions suffer from spatial and temporal limitations, e.g., the operation time of a UAV is limited by its flight time, and satellites are not always available.

This paper proposes DeepWiTraffic, a portable, non-intrusive, and inexpensive TMS. The proposed TMS hinges on distinctive wireless channel characteristics induced by passing vehicles to classify them into five vehicle types: motorcycles, passenger cars, SUVs, pickup trucks, and large trucks. DeepWiTraffic utilizes WiFi channel state information (CSI) that conveys rich information about the changes in the channel properties caused by passing vehicles. In particular, the spatial and temporal correlations of CSI phase and amplitude of different subcarriers are analyzed for effective vehicle classification. Specifically, a convolutional neural network (CNN) is designed to capture the optimal features of CSI data automatically and train the vehicle classification model based on effectively preprocessed CSI data as input. Numerous techniques are applied to address the challenges of improving the classification accuracy. Environment noises in CSI amplitude data caused by surrounding obstacles and low-speed moving objects, e.g., people moving around, are effectively mitigated. Principal component analysis (PCA) is exploited to reduce the dimension of multiple subcarriers (in our experiments, 30 subcarriers for each TX and RX pair) down to one, expediting the processing speed for classification and sharpening the vehicle detection performance. A linear transformation-based phase preprocessing technique is used to effectively capture the changes in CSI phase data induced by passing vehicles.

We collected a large amount of CSI data for about 120 hours over a one-month period. The video data were used as the ground truth, i.e., CSI data of passing vehicles were manually and individually tagged with the ground truth vehicle class captured in the recorded video. Rigorous experiments were performed with a large number of combinations of hyper parameters in training the CNN model to improve the classification accuracy. Consequently, we report that the average vehicle detection and classification accuracy of DeepWiTraffic was 99.4% and 91.1%, respectively.

II Related Work


| Classification | Approach | Publication | Vehicle Class | Cost | Accuracy |
|---|---|---|---|---|---|
| Intrusive | Piezoelectric sensor | Rajab [17] | motorcycles, passenger vehicles, two axle four tire units, buses, two axles six tire single units, three axles single units, four or more axles single unit, four axles single trailer, five axles single trailer, seven axles single trailer, seven or more axles multi-trailer | medium | 86.9% |
| Intrusive | Magnetometer | Bottero [18] | car, van, truck | medium | 88.0% |
| Intrusive | Magnetometer | Xu [7] | hatchbacks, sedans, buses, and multi-purpose vehicles | medium | 95.4% |
| Intrusive | Loop Detector | Meta [19] | car, van, truck, bus, motorcycle | high | 94.2% |
| Intrusive | Loop Detector | Jeng [9] | motorcycles, passenger cars, other two-axle four-tire single unit vehicles, buses, two-axle six-tire single-unit trucks, three-axle single-unit trucks, four or more axle single-unit trucks, four or fewer axle single-trailer trucks, five-axle single-trailer trucks, six or more axle single-trailer trucks, five or fewer axle multi-trailer trucks, six-axle multi-trailer trucks, seven or more axle multi-trailer trucks | high | 92.4% |
| Non-intrusive | Camera | Chen [20] | car, van, bus, motorcycle | medium/high | 94.6% |
| Non-intrusive | Camera | Bautista [21] | jeep, sedan, truck, bus, SUV, and van | medium/high | 96.4% |
| Non-intrusive | Infrared + ultrasonic sensors | Odat [22] | sedan, pickup truck, SUV, bus, two wheeler | low/medium | 99.0% |
| Non-intrusive | Magnetic Sensors | Wang [23] | bicycle, car, minibus | low/medium | 93.0% |
| Non-intrusive | Magnetic Sensors | Yang [10] | motorcycle, passenger car (two-box and saloon), SUV, bus | low/medium | 93.6% |
| Non-intrusive | Acoustic Sensors | George [11] | heavy (truck, bus), medium (car, jeep, van), light (auto rickshaws, two wheelers) | low/medium | 73.4% |
| Non-intrusive | LIDAR | Lee [24][12] | motorcycle, passenger vehicle, passenger vehicle pulling a trailer, single-unit truck, single-unit truck pulling a trailer, and multi-unit truck | very high | 99.5% |
| Off-roadway | UAV | Liu [13] | car, truck | medium | 98.2% |
| Off-roadway | UAV | Tang [14] | seven vehicle types, such as car, truck, bus, etc. (specific type not specified) | medium | 78.7% |
| Off-roadway | Satellites | Audebert [15] | pick up, van, truck, car | high | 80.0% |

TABLE I: Vehicle Classification Approaches

Vehicle detection (counting) and vehicle classification are the key functionalities of TMSs [19]. The literature shows that the vehicle detection accuracy is typically very high. However, the performance of vehicle classification techniques varies substantially. This section presents a comprehensive review on TMSs concentrating on the functionality of vehicle classification. Vehicle classification methods are divided largely into three categories: intrusive, non-intrusive, and off-roadway approaches. Table I summarizes the properties of existing vehicle classification schemes including sensor types, vehicle types for classification, classification accuracy, and the cost. As can be seen, it is challenging to make fair comparisons because most classification schemes are designed to classify vehicles into different types. As such, this section is rather focused on drawing meaningful insights by covering the literature comprehensively and providing general guidelines to readers for selecting an appropriate TMS.

A common property of intrusive solutions is that sensors (e.g., piezoelectric sensors [17], magnetometers [18][7], vibration sensors [8], loop detectors [19]) are installed on or under a roadway. As Table I shows, intrusive approaches are capable of classifying a large selection of vehicle types with high classification accuracy, leveraging close contact with passing vehicles that allows for securing high-precision sensor data. The main concern with these solutions, however, is the cost. Especially when sensors are installed under the pavement, the cost becomes prohibitive. The maintenance cost is also non-negligible, as it incurs extra cost for ensuring the safety of road workers.

Due to the high cost of intrusive solutions, non-intrusive approaches have received a lot of attention. A typical characteristic of these solutions is that sensors are deployed on a roadside, obviating the construction and maintenance cost of intrusive solutions. The most widely adopted sensor for non-intrusive solutions is a camera [20][21]. Significant advances in imaging technologies and image processing techniques based on machine learning algorithms gave birth to precise camera-based TMSs [25]. As Table I shows, the classification accuracy of camera-based TMSs is very high. However, achieving high classification accuracy is still challenging at night, under severe weather conditions, and when there are obstacles that obstruct the clear view. There are other sensors such as magnetometers [23][10], accelerometers [26], and acoustic sensors [11] that have been used in non-intrusive TMSs. Table I shows that state-of-the-art systems based on these sensors have quite good classification accuracy. However, the low-fidelity information that these sensors provide requires strategic positioning of multiple such sensors. As such, minor errors in positioning or adjusting sensing directions may increase classification errors. To address the drawbacks of these sensors, more advanced sensors such as LIDAR (Light Detection and Ranging) [24][12] and infrared sensors [22] were considered. While these advanced technologies allow for very high classification accuracy, the cost is extremely high.

Off-roadway solutions utilize cameras mounted on UAVs [14] or satellites [15] for vehicle classification. As shown in Table I, the classification accuracy of off-roadway approaches is not quite impressive (except for Liu et al. who achieved the accuracy of 98.2% for only two vehicle types). The low classification accuracy of off-roadway solutions is mainly attributed to the small image size. However, off-roadway approaches are appropriate when the user needs to cover a large area at the cost of degraded classification accuracy.

Recently, radically different TMSs based on wireless signals have been proposed. Haferkamp et al. exploited multiple pairs of RF (radio frequency) transceivers to develop a TMS [27]. The key intuition of their system is that different types of vehicles, when passing the line of sight (LoS) between a pair of RF transceivers, result in unique received signal strength (RSSI) patterns. However, since RSSI represents only a single dimensional information (i.e., signal strength for a single channel), it is challenging to correlate effectively the vehicle body shape with the corresponding RSSI. To overcome this difficulty, multiple RF transceivers are necessary. In contrast, WiFi CSI data contain much richer information conveyed via 30 subcarriers for each pair of TX-RX antennas allowing us to perform more sophisticated analysis leading to more effective vehicle classification with just a single pair of a transmitter and a receiver.

III Preliminaries and Problem Statement

III-A WiFi Channel State Information

The orthogonal frequency division multiplexing (OFDM) modulation scheme is used to implement the physical layer of contemporary WiFi standards [28]. It is robust against frequency-selective fading since a high data-rate stream is partitioned onto closely spaced subcarriers. WiFi CSI represents the channel properties for these OFDM subcarriers, i.e., a combined effect of fading, scattering, and power decay with distance. Formally, WiFi CSI represents the properties of the channel as follows [29].

$$y = Hx + n$$

Here $x$ and $y$ refer to the transmitted and received signal, respectively. $n$ represents the channel noise. $H$ is an $N_{rx} \times N_{tx} \times N_{s}$ matrix, where $N_{rx}$, $N_{tx}$, and $N_{s}$ are the number of receiver antennas, transmitter antennas, and subcarriers, respectively. Matrix $H$ can be expressed as a vector of $N_{s}$ subcarrier groups as follows.

$$H = [H_1, H_2, \ldots, H_{N_s}]$$

Here $H_i$ is an $N_{rx} \times N_{tx}$ matrix that represents the CSI values for the $i$-th subcarrier received via different transmitter-receiver antenna pairs. A CSI value for the $i$-th subcarrier received via the TX antenna $p$ and receiver antenna $q$ pair is denoted by $h_i^{pq}$, which is defined as follows.

$$h_i^{pq} = |h_i^{pq}| \, e^{j \angle h_i^{pq}}$$

This CSI value contains both the amplitude ($|h_i^{pq}|$) and phase information ($\angle h_i^{pq}$) of the $i$-th subcarrier signal received via the antenna pair.
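The amplitude and phase decomposition of a single CSI value can be illustrated with a minimal Python sketch. The sample values and the helper names `amplitude_and_phase` and `from_polar` are hypothetical and only serve to show the polar form; they are not taken from the paper or from any CSI extraction tool.

```python
import cmath
from typing import List, Tuple


def amplitude_and_phase(csi_value: complex) -> Tuple[float, float]:
    """Split one complex CSI value into (amplitude, phase in radians)."""
    return abs(csi_value), cmath.phase(csi_value)


def from_polar(amplitude: float, phase: float) -> complex:
    """Rebuild the complex CSI value |h| * exp(j * angle)."""
    return cmath.rect(amplitude, phase)


# Hypothetical CSI values for three subcarriers of one TX-RX antenna pair.
csi_values: List[complex] = [3 + 4j, 1 - 1j, -2 + 0j]
amplitudes = [amplitude_and_phase(h)[0] for h in csi_values]
phases = [amplitude_and_phase(h)[1] for h in csi_values]
```

Keeping amplitude and phase as two separate real-valued series matches the way the rest of the pipeline treats them as two distinct features.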

III-B Problem Statement


| Notation | Description |
|---|---|
| $N$ | the total number of passing cars |
| $P$ | the total number of packets |
| $C$ | a matrix that contains CSI values of 30 subcarriers for $P$ packets |
| $A$ | a set of CSI amplitude values extracted from $C$ |
| $\Phi$ | a set of CSI phase values extracted from $C$ |
| $A_i$ | a set of CSI amplitude values for the $i$-th passing vehicle, $1 \le i \le N$, extracted from $A$ |
| $\Phi_i$ | a set of CSI phase values for the $i$-th passing vehicle, $1 \le i \le N$, extracted from $\Phi$ |
| $\mathcal{A}$ | a collection of CSI amplitude sets for $N$ passing vehicles |
| $\mathcal{P}$ | a collection of CSI phase sets for $N$ passing vehicles |
| $M$ | a convolutional neural network model for vehicle classification |
| $Alg_{det}$ | vehicle detection algorithm |
| $Alg_{cls}$ | vehicle classification algorithm |
| $\delta_b$ | backward offset |
| $\delta_f$ | forward offset |
| $\tau$ | minimum inter-vehicle distance, i.e., samples within $\tau$ are considered as belonging to the same vehicle |

TABLE II: Notations and Parameters

Let $C$ denote a matrix in which each element represents a CSI value for a certain TX-RX antenna pair (30 subcarriers are sent per TX-RX antenna pair), collected over $P$ successively received packets. Also let $N$ denote the total number of cars that have passed through the line of sight (LoS) between the transmitter and the receiver. We are tasked to classify the $N$ vehicles into five different types {bike, passenger car, SUV, pickup truck, large truck} given one or more $C$, depending on the number of TX and RX antennas used. Note that if $N$ is set to one, then it implements a real-time TMS.

The CSI amplitude and phase values are extracted from $C$, and are denoted by the sets $A$ and $\Phi$, respectively. Note that each element of $A$ ($\Phi$) represents the aggregated amplitude (phase) value for 30 subcarriers, i.e., we assume that a technique to reduce the dimension of the CSI amplitude and phase values for 30 subcarriers into one is developed (Section IV-B). A vehicle detection algorithm $Alg_{det}$ detects the $i$-th vehicle and extracts from $A$ ($\Phi$) the CSI amplitude (phase) values "induced" by the vehicle, which are denoted by $A_i$ ($\Phi_i$). Now we can create the collections of $A_i$ and $\Phi_i$ for all $1 \le i \le N$, which are denoted by $\mathcal{A}$ and $\mathcal{P}$, respectively. These $\mathcal{A}$ and $\mathcal{P}$ are provided as input to a convolutional neural network (CNN) to train a model $M$ (in training mode) and to classify input CSI data consisting of $A_i$ and $\Phi_i$ into five vehicle types {bike, passenger car, SUV, pickup truck, large truck} using the model (in testing mode). The algorithm used for this vehicle classification is denoted by $Alg_{cls}$.

The problem that we solve in this paper is the development of the two algorithms, namely $Alg_{det}$ and $Alg_{cls}$, targeting typical two-lane rural highways. In subsequent sections, we describe (1) the CSI data preprocessing methods to reduce the noise and dimension of raw CSI amplitude and phase data, (2) algorithms to extract the CSI amplitude and phase portions corresponding to a passing vehicle, and (3) the design of a neural network for effective vehicle classification. Notations used throughout this paper are listed in Table II.

IV Proposed System

IV-A System Overview

Fig. 1: System architecture of DeepWiTraffic.

DeepWiTraffic consists of the following system components: Data Collection, Data Processing, Vehicle Detection, Lane Detection, and Vehicle Classification. Figure 1 shows the system architecture of DeepWiTraffic. The data collection module receives CSI packets and builds the CSI matrix $C$. Note that in the training mode, the number of packets $P$ can be sufficiently large to collect CSI data for a large number of vehicles, while a small $P$ can be used for real-time vehicle classification in the online mode. The key roles of the data processing module are threefold: extraction of CSI amplitude values $A$ and phase values $\Phi$ from $C$, noise reduction for $A$ and $\Phi$, and dimension reduction of the CSI data for 30 subcarriers down to one for faster processing. The vehicle detection module implements the vehicle detection algorithm $Alg_{det}$, i.e., it extracts the CSI amplitude $A_i$ and phase $\Phi_i$ "induced" by a passing vehicle. The vehicle classification module consists of two parts: CNN Training and CNN Prediction. In the former part, the module trains a CNN model based on the collections $\mathcal{A}$ and $\mathcal{P}$ of extracted amplitude and phase sets; in the latter part, the module classifies the detected vehicle into five different types based on $A_i$ and $\Phi_i$.

IV-B CSI Data Processing

IV-B1 Low Pass Filtering

Fig. 2: CSI amplitude values of passing car (raw vs filtered).

In order to capture only the CSI data of passing vehicles, the frequency components of other slow-moving objects, e.g., human (mostly system operators) mobility, are filtered out. More precisely, we ensure that CSI amplitude fluctuations caused by any objects that move at a speed of less than 2m/s are excluded. The WiFi wavelength of our system, which operates at a frequency of 5.32GHz, is 5.64cm [30]. With the wavelength of 5.64cm and the movement speed of 2m/s, the corresponding frequency component is calculated as 38Hz. Consequently, we apply a general low pass filter with a cutoff frequency of 38Hz to mitigate the impact of irrelevant objects. Figure 2 shows that the noise has been effectively removed.
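The filtering step can be sketched in a few lines of Python. The text only says that a general low pass filter with a 38Hz cutoff is applied, so the single-pole IIR filter below is an assumed stand-in rather than the authors' filter design; the 2,500 samples/sec rate is the sampling rate reported later in the paper, and the function names are hypothetical.

```python
import math
from typing import List

SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_cm(carrier_hz: float) -> float:
    """Wavelength of a carrier in centimeters (lambda = c / f)."""
    return SPEED_OF_LIGHT_M_S / carrier_hz * 100.0


def low_pass(samples: List[float], cutoff_hz: float, sample_rate_hz: float) -> List[float]:
    """First-order (single-pole) IIR low-pass filter.

    Assumption: the paper does not name its filter type; this is the
    simplest discrete RC filter with the requested cutoff frequency.
    """
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor in (0, 1)
    filtered = [samples[0]]
    for x in samples[1:]:
        filtered.append(filtered[-1] + alpha * (x - filtered[-1]))
    return filtered


# Example: a fast alternating signal is strongly attenuated at a 38 Hz cutoff.
noisy = [1.0 if i % 2 == 0 else -1.0 for i in range(400)]
smoothed = low_pass(noisy, cutoff_hz=38.0, sample_rate_hz=2500.0)
```

A higher-order design (for example a Butterworth filter) would give a sharper roll-off; the single-pole version is used here only because it needs no external library.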

IV-B2 PCA-Based Denoising

Environmental noise (e.g., caused by slow moving objects) has been successfully mitigated by designing and applying a low pass filter. Another important source of performance degradation is noise caused by internal state transitions in a WiFi NIC which include changes in transmission power, adaptation of transmission rate, and CSI reference level changes [31]. Typically, burst noises in CSI data are caused by these internal state transitions. Ali et al. made an interesting observation that the effect of these burst noises is significantly correlated across CSI data streams of subcarriers [32].

Fig. 3: Filtered WiFi CSI streams for subcarriers #1, #2, and #3.

The principal component analysis (PCA) is used to mitigate the burst noises by exploiting highly correlated CSI streams for different subcarriers. Figure 3 depicts an example illustrating that CSI streams for different subcarriers are highly correlated. The PCA is also used to reduce the dimension of CSI data for all 30 subcarriers down to one. More specifically, using PCA, we analyze the correlations of these multi-dimensional CSI data, extract common features, and reduce the dimension to one. This noise and dimension reduction process is executed in four steps as follows.

Preprocessing of Sample: For each TX-RX antenna pair, define a $P \times 30$ matrix $H$ that stores CSI amplitude values for all 30 subcarriers and $P$ received packets. A CSI stream (consisting of CSI amplitude values) for each subcarrier is arranged in each column of matrix $H$. After construction of matrix $H$, the mean value of each column is calculated and subtracted from each column, which completes this step.

Computation of Covariance Matrix: The covariance matrix $H^{T} H$ is calculated in this step.

Computation of Eigenvalues and Eigenvectors of Covariance: Eigendecomposition of the covariance matrix $H^{T} H$ is performed to obtain the eigenvectors $q_i$.

Reconstruction of Signal: By projecting $H$ onto the eigenvectors $q_i$, we obtain $h_i = H q_i$, where $q_i$ is the $i$-th eigenvector and $h_i$ is the $i$-th principal component.

Fig. 4: PCA #1 that represents all 30 CSI streams vs. a CSI stream

Figure 4 shows the first PCA component compared with a filtered CSI stream. As shown, the PCA component more clearly represents the changes of CSI amplitude values induced by passing vehicles compared with original CSI amplitude streams.
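The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it uses power iteration to find only the dominant eigenvector (an assumed substitute for a full eigendecomposition), and the function name and the toy three-subcarrier data are hypothetical.

```python
import math
from typing import List


def first_principal_component(streams: List[List[float]], iterations: int = 200) -> List[float]:
    """Project CSI amplitude streams onto their dominant eigenvector.

    streams[t][s] is the amplitude of subcarrier s in packet t, so each
    column holds the CSI stream of one subcarrier.
    """
    n_packets, n_sub = len(streams), len(streams[0])
    # Step 1: subtract the mean of every column (subcarrier stream).
    means = [sum(row[s] for row in streams) / n_packets for s in range(n_sub)]
    centered = [[row[s] - means[s] for s in range(n_sub)] for row in streams]
    # Step 2: covariance matrix H^T H (n_sub x n_sub).
    cov = [[sum(r[a] * r[b] for r in centered) for b in range(n_sub)]
           for a in range(n_sub)]
    # Step 3: dominant eigenvector by power iteration. The all-ones start
    # vector would fail only if it were orthogonal to that eigenvector.
    q = [1.0 / math.sqrt(n_sub)] * n_sub
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * q[b] for b in range(n_sub)) for a in range(n_sub)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        q = [v / norm for v in nxt]
    # Step 4: project the centered data onto the eigenvector (h = H q).
    return [sum(r[s] * q[s] for s in range(n_sub)) for r in centered]


# Toy data: three perfectly correlated subcarrier streams with different
# gains and offsets, sharing one underlying "vehicle" pulse.
base = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0]
gains, offsets = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
toy_streams = [[g * b + o for g, o in zip(gains, offsets)] for b in base]
pc1 = first_principal_component(toy_streams)
```

Because the toy streams are perfectly correlated, the first component reproduces the shared pulse up to a scale factor, which is the behavior the dimension reduction relies on.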

IV-B3 Phase Preprocessing

Since the phase information is one of the two primary features for vehicle classification, it is important to effectively mitigate the impact of random noises of CSI phase data. This section presents a method to preprocess CSI phase data so that the random noises are reduced.

We can express the measured CSI phase of subcarrier $i$ as the following [33].

$$\hat{\phi}_i = \phi_i - 2\pi \frac{k_i}{N_{\mathrm{FFT}}} \delta + \beta + Z$$

Here $\phi_i$ is the original phase; $k_i$ denotes the subcarrier index; $N_{\mathrm{FFT}}$ is the Fast Fourier Transform size (64 for IEEE 802.11 a/g/n); and $Z$ is the measurement noise. Our objective is to remove $\delta$ and $\beta$, which are the time lag and the phase offset at the receiver, respectively. We adopt a linear transformation to remove these noise factors [34]. Formally, we define the two variables $a$ and $b$ as follows.

$$a = \frac{\hat{\phi}_n - \hat{\phi}_1}{k_n - k_1}, \qquad b = \frac{1}{n} \sum_{j=1}^{n} \hat{\phi}_j$$

Here $k_n$ refers to the last subcarrier index. Note that $n = 30$ because we use the Intel 5300 NIC, which exports 30 subcarriers. We then use a linear transformation, $\tilde{\phi}_i = \hat{\phi}_i - a k_i - b$, to remove both the timing offset $\delta$ and the phase offset $\beta$. We disregard the small measurement noise $Z$ in this calculation.

Fig. 5: Raw and processed phase data for a passing vehicle.

Figure 5 shows both the raw and preprocessed CSI phase data (measured with sampling rates of 2,500 samples/sec). The result indicates that the algorithm successfully captures the time series of the CSI phase values.
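The linear transformation used for phase preprocessing can be sketched as below. The function name is hypothetical, the input phases are assumed to be already unwrapped, and the symmetric subcarrier indices in the example are illustrative only (they are not the actual index set exported by the Intel 5300 NIC).

```python
import math
from typing import List


def sanitize_phase(measured: List[float], subcarrier_idx: List[int]) -> List[float]:
    """Remove the linear (timing) and constant (offset) phase terms.

    a is the slope between the first and last subcarrier, b is the mean
    measured phase; the sanitized phase is measured - a * k - b.
    Assumes `measured` holds unwrapped phases, one per subcarrier.
    """
    n = len(measured)
    a = (measured[-1] - measured[0]) / (subcarrier_idx[-1] - subcarrier_idx[0])
    b = sum(measured) / n
    return [phi - a * k - b for phi, k in zip(measured, subcarrier_idx)]


# Illustrative check: a flat true phase corrupted by a timing lag and a
# constant offset is restored to (numerically) zero.
indices = list(range(-15, 0)) + list(range(1, 16))  # 30 hypothetical indices
fft_size, time_lag, phase_offset = 64, 3.7, 0.9
corrupted = [-2.0 * math.pi * k / fft_size * time_lag + phase_offset for k in indices]
cleaned = sanitize_phase(corrupted, indices)
```

The example indices are symmetric around zero, so the mean term cancels exactly; with an asymmetric index set a small constant residue remains, which does not affect the shape of the phase time series.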

IV-C Vehicle Detection

Given the CSI amplitude set $A$ and phase set $\Phi$, the next task is to detect passing vehicles and extract $A_i$ and $\Phi_i$ from $A$ and $\Phi$ (i.e., extracting only the portions of CSI amplitude and phase values that are influenced by the passing vehicles). Detecting a passing vehicle is simple because it causes abrupt changes in CSI amplitude values. As such, we adopt a standard outlier detection technique based on the scaled median absolute deviation (MAD) defined as $\mathrm{MAD} = c \cdot \mathrm{median}(|a_j - \mathrm{median}(A)|)$, $a_j \in A$. Here $c = -1 / (\sqrt{2} \, \mathrm{erfcinv}(3/2))$, where $\mathrm{erfcinv}$ is the inverse complementary error function. The $j$-th CSI amplitude value is considered an outlier if it is more than three scaled MAD away from the median, which indicates a detected vehicle.
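The scaled-MAD outlier rule can be sketched with the Python standard library. Two assumptions are made here: deviations are measured from the median (as MAD-based detectors commonly do), and the scale constant is computed through the normal quantile function, which is numerically the same as $-1/(\sqrt{2}\,\mathrm{erfcinv}(3/2)) \approx 1.4826$. The function names and sample data are hypothetical.

```python
from statistics import NormalDist, median
from typing import List, Sequence

# -1 / (sqrt(2) * erfcinv(3/2)) is equal to 1 / Phi^{-1}(3/4), about 1.4826.
MAD_SCALE = 1.0 / NormalDist().inv_cdf(0.75)


def scaled_mad(values: Sequence[float]) -> float:
    """Scaled median absolute deviation of a sequence."""
    med = median(values)
    return MAD_SCALE * median(abs(v - med) for v in values)


def find_outliers(values: Sequence[float], n_mads: float = 3.0) -> List[int]:
    """Indices of samples more than n_mads scaled MADs from the median."""
    med = median(values)
    threshold = n_mads * scaled_mad(values)
    return [i for i, v in enumerate(values) if abs(v - med) > threshold]


# Hypothetical amplitude samples: the last one mimics a passing vehicle.
amplitudes = [10, 11, 9, 10, 12, 10, 9, 11, 10, 50]
outlier_indices = find_outliers(amplitudes)
```

With these samples the median is 10 and the scaled MAD is about 1.48, so only the final sample exceeds the three-MAD threshold.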

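Data: $A$ and $\Phi$
Result: $\mathcal{A}$ and $\mathcal{P}$
1 begin
2       $O \leftarrow$ FindOutliers($A$).
3       for $j \leftarrow 1$ to $|A|$ do
4             if $j \in O$ && flag = FALSE then
5                   start $\leftarrow j$, last $\leftarrow j$, flag $\leftarrow$ TRUE.
6             else if $j \in O$ && flag = TRUE then
7                   last $\leftarrow j$.
8             else if $j \notin O$ && flag = TRUE then
9                   if $j -$ last $> \tau$ then
10                         flag $\leftarrow$ FALSE.
11                         if … && … then
12                               add $A[\text{start} - \delta_b, \text{last} + \delta_f]$ to $\mathcal{A}$.
13                               add $\Phi[\text{start} - \delta_b, \text{last} + \delta_f]$ to $\mathcal{P}$.
14             else
15                   continue.
Algorithm 1 CSI Data Extraction Algorithm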

Once a vehicle is detected, the CSI amplitude values $A_i$ and phase values $\Phi_i$ of that vehicle are extracted. Since the amplitude set $A$ and the phase set $\Phi$ are synchronized, outliers of $A$ are also the outliers of $\Phi$. More specifically, assume that an outlier is the $o$-th sample. Starting from this sample, the algorithm extracts the CSI amplitude samples in the range between $o - \delta_b$ and $o + \delta_f$. The backward offset $\delta_b$ and the forward offset $\delta_f$ are system parameters. We use $\delta_f$ to take into account the momentary CSI amplitude/phase changes after a vehicle passes through the line of sight (LoS) between the TX-RX antenna pair. $\delta_b$ is used to capture the minor changes in CSI amplitude/phase when the vehicle is very close to the LoS but has not yet passed through it. In our experiments, we found that setting both $\delta_b$ and $\delta_f$ to 0.25sec gave the best results.

Algorithm 1 displays the pseudocode of the amplitude and phase extraction process. The outlier detection function finds the outliers and records the packet numbers of the outliers in an array (Line 2). The algorithm keeps track of the beginning and end of the extracted amplitude and phase values, and a flag is used to indicate that the extraction process is in progress so that the algorithm can finish when the extraction process is completed (Lines 4-5). In other words, the extraction process continues as long as the flag is set to TRUE and a CSI amplitude value is considered an outlier (Lines 6-7). If a CSI amplitude value is found to be a non-outlier, the interval between the current sample and the last valid outlier is calculated and compared with the threshold $\tau$ (in our experiments, we used 1,250 samples, i.e., 0.5 second). Finally, if the interval is greater than $\tau$, we set the flag to FALSE to finish the extraction process and save the extracted CSI amplitudes and phases in $\mathcal{A}$ and $\mathcal{P}$, respectively (Lines 8-13).
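The extraction logic described above can be sketched in Python as follows. This is a hedged reading of the prose description, not a transcription of Algorithm 1: the function and parameter names are hypothetical, index ranges are clamped to the stream boundaries as an added assumption, and a segment still open at the end of the stream is also emitted.

```python
from typing import List, Set, Tuple


def extract_segments(n_samples: int, outliers: Set[int], gap_threshold: int,
                     back_offset: int, fwd_offset: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges, end exclusive, one per vehicle.

    Outliers closer together than gap_threshold samples are merged into
    the same vehicle; each range is widened by the backward and forward
    offsets and clamped to [0, n_samples).
    """
    segments: List[Tuple[int, int]] = []
    in_progress = False
    start = last = 0

    def close() -> None:
        begin = max(0, start - back_offset)
        end = min(n_samples, last + fwd_offset + 1)
        segments.append((begin, end))

    for i in range(n_samples):
        is_outlier = i in outliers
        if is_outlier and not in_progress:
            start = last = i          # first outlier of a new vehicle
            in_progress = True
        elif is_outlier:
            last = i                  # extraction continues
        elif in_progress and i - last > gap_threshold:
            in_progress = False       # gap is long enough: vehicle has passed
            close()
    if in_progress:
        close()                       # stream ended while a vehicle was passing
    return segments


# Because amplitude and phase are synchronized, the same ranges slice both.
amplitude = [float(i) for i in range(100)]
phase = [0.1 * i for i in range(100)]
ranges = extract_segments(100, {20, 21, 25, 60}, gap_threshold=10,
                          back_offset=3, fwd_offset=4)
vehicle_amplitudes = [amplitude[b:e] for b, e in ranges]
vehicle_phases = [phase[b:e] for b, e in ranges]
```

At the reported sampling rate of 2,500 samples/sec, the 0.5 second threshold from the text corresponds to `gap_threshold=1250`; the small values above are only for readability.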

IV-D Vehicle Classification

We adopt the convolutional neural network (CNN) for vehicle classification. The correlations of the time series of CSI amplitude and phase values are taken into account by aggregating them as a single input image. Specifically, a $6 \times$ WINDOW_SIZE image is provided as input to the CNN. The first three rows of the image represent the time series of extracted CSI amplitude values for three TX-RX antenna pairs (note that there are 1 TX antenna and 3 RX antennas). The subsequent three rows of the image are the time series of extracted CSI phase values. These 6 CSI data sequences are exactly aligned in the image to enable the CNN to extract the hidden correlations between the CSI data sequences. We ensure that all images have the same size by padding with 0s.
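A minimal sketch of how such an input image could be assembled is shown below. The function name is hypothetical, and truncating sequences longer than the window is an added assumption; the text only states that shorter sequences are padded with 0s.

```python
from typing import List


def build_input_image(amplitudes: List[List[float]], phases: List[List[float]],
                      window_size: int) -> List[List[float]]:
    """Stack amplitude rows on top of phase rows and zero-pad every row.

    amplitudes and phases each hold one time series per TX-RX antenna
    pair (three pairs in the paper's setup), giving a 6 x window_size image.
    """
    image = []
    for sequence in amplitudes + phases:
        clipped = list(sequence[:window_size])  # assumption: truncate if too long
        image.append(clipped + [0.0] * (window_size - len(clipped)))
    return image


# Hypothetical short sequences for three antenna pairs.
amp = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]]
pha = [[0.1, 0.2], [0.3], [0.4, 0.5, 0.6, 0.7, 0.8]]
image = build_input_image(amp, pha, window_size=4)
```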

Fig. 6: CNN architecture.

Figure 6 shows the design of the proposed CNN. As shown, it consists of two layers of alternating Convolution, (Batch Normalization, ReLU), and Pooling sublayers such that the lower layer extracts basic features while the higher layer extracts more complex features [35]. In the following sections, we describe the detailed roles of the sublayers.

IV-D1 Convolutional Layer

The convolutional layer basically convolves the input images by sliding the kernels (also called filters) vertically and horizontally and calculates the dot product of the input and the weights of the kernels. Formally, denote the value at the $x$-th row and $y$-th column of the $j$-th input image in the $i$-th layer by $v_{ij}^{x,y}$. The input images in the previous layer are convolved with the kernels in the convolutional layer to calculate $v_{ij}^{x,y}$. The output of the convolutional layer is provided as input to the activation function $f$ (we used the rectified linear unit function), and the result of the activation function forms the input image of the next layer:

$$v_{ij}^{x,y} = f\left(b_{ij} + \sum_{m} \sum_{p=0}^{K_i - 1} \sum_{q=0}^{K_i - 1} w_{ijm}^{p,q} \, v_{(i-1)m}^{x+p,\,y+q}\right)$$

Here $b_{ij}$ is the bias for the $j$-th input image in the $i$-th layer; $w_{ijm}^{p,q}$ is the value of the kernel at the $(p,q)$-th position; $K_i$ is the size of the kernel in the $i$-th layer; $m$ is an index that goes over the set of input images in the $(i-1)$-th layer.
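The convolution followed by the activation function can be illustrated with a small pure-Python sketch. It assumes square kernels, unit stride, and no padding, none of which is stated in the text; the function name and the toy image are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_relu(images: List[Matrix], kernels: List[Matrix], bias: float) -> Matrix:
    """Compute one output feature map.

    Each input image is slid against its own square kernel (valid
    positions only, stride 1); the per-image results are summed, the bias
    is added, and ReLU is applied.
    """
    k = len(kernels[0])
    out_rows = len(images[0]) - k + 1
    out_cols = len(images[0][0]) - k + 1
    output = []
    for x in range(out_rows):
        row = []
        for y in range(out_cols):
            total = bias
            for image, kernel in zip(images, kernels):
                for p in range(k):
                    for q in range(k):
                        total += kernel[p][q] * image[x + p][y + q]
            row.append(max(0.0, total))  # ReLU activation
        output.append(row)
    return output


toy_image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
diag_sum = conv2d_relu([toy_image], [[[1.0, 0.0], [0.0, 1.0]]], bias=0.0)
diag_diff = conv2d_relu([toy_image], [[[1.0, 0.0], [0.0, -1.0]]], bias=0.0)
```

The second kernel produces a negative pre-activation everywhere on this image, so ReLU clips the whole feature map to zero.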

IV-D2 Batch Normalization Layer

Before providing the result of the convolutional layer as an input to the next layer, the result goes through the normalization layer. The normalization layer is used to speed up the training process and reduce the sensitivity to the initial network configuration. Given a mini batch $B = \{x_1, \ldots, x_m\}$, the layer performs normalization as the following:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

where $\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i$ and $\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^2$, i.e., the layer normalizes the result by subtracting the mean and then dividing by the standard deviation. The layer then scales the result by a scale factor $\gamma$ and shifts it by an offset $\beta$. These two parameters $\gamma$ and $\beta$ are learned during the training process. Formally, $y_i = \gamma \hat{x}_i + \beta$.
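The normalization, scaling, and shifting steps can be written out as a short sketch. The scale and offset are passed in as fixed numbers here purely for illustration, whereas in training they are learned; the small `eps` term is the usual numerical safeguard and its value is an assumption.

```python
import math
from typing import List


def batch_norm(batch: List[float], gamma: float = 1.0, beta: float = 0.0,
               eps: float = 1e-5) -> List[float]:
    """Normalize a mini batch, then scale by gamma and shift by beta."""
    m = len(batch)
    mean = sum(batch) / m
    variance = sum((x - mean) ** 2 for x in batch) / m
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0, 5.0], gamma=2.0, beta=3.0)
```

After normalization the batch is centered on `beta`, and its spread is controlled by `gamma`.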

IV-D3 ReLU Layer

After the convolutional layer and batch normalization layer, a nonlinear activation function is executed, for which we adopt the rectified linear unit (ReLU) function. It basically performs a threshold operation on each element we obtain after the convolutional and batch normalization layers as follows: $f(x) = \max(0, x)$.

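IV-D4 Max Pooling Layer

In the max pooling layer, the resolution of the feature maps is decreased in order to prevent overfitting using the max pooling function defined as follows, where $v_{ij}^{x,y}$ denotes the value at the $x$-th row and $y$-th column of the $j$-th feature map in the $i$-th layer.

$$v_{ij}^{x,y} = \max_{0 \le p,\, q < R_i} v_{(i-1)j}^{x+p,\,y+q}$$

Here $R_i$ is the length of the pooling region in the $i$-th layer.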
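A small sketch of max pooling is given below. It assumes square, non-overlapping pooling regions (stride equal to the region length), which the text does not specify; the function name and the toy feature map are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def max_pool(feature_map: Matrix, size: int) -> Matrix:
    """Down-sample a feature map by taking the max of each size x size block.

    Assumption: blocks do not overlap; trailing rows or columns that do
    not fill a whole block are dropped.
    """
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + a][c * size + b]
                for a in range(size) for b in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


pooled = max_pool([[1.0, 5.0, 2.0, 0.0],
                   [3.0, 4.0, 7.0, 6.0]], size=2)
```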

IV-D5 Dropout Layer

While training the CNN model, we observed significant overfitting and decided to deploy the dropout layer to reduce its impact. Basically, this layer randomly drops out an element of the results of the Max Pooling Layer with a fixed probability $p$. In our experiments, we found that a dropout rate of 0.6 gave good results.

IV-D6 Fully Connected Layer

The two layers of alternating Convolution, Batch Normalization, ReLU, and Pooling sublayers are followed by the Fully Connected Layer. This layer is basically the same as a regular neural network, which maps the flattened features into the output classes (i.e., five vehicle types), generating a score for each output class. Finally, the output scores of the Fully Connected Layer are provided as input to the SoftMax layer, in which the scores are converted into values in the range between 0 and 1 such that the sum is 1. This way the SoftMax layer represents the output as a true probability distribution.
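The SoftMax conversion from class scores to probabilities can be sketched as follows. The scores are made-up numbers for illustration, and subtracting the maximum score before exponentiation is a standard numerical-stability step rather than something stated in the text.

```python
import math
from typing import List

VEHICLE_TYPES = ["motorcycle", "passenger car", "SUV", "pickup truck", "large truck"]


def softmax(scores: List[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # shift for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical fully connected layer outputs for one detected vehicle.
scores = [0.2, 2.5, 1.1, 0.7, -1.0]
probabilities = softmax(scores)
predicted = VEHICLE_TYPES[probabilities.index(max(probabilities))]
```

The predicted class is simply the one with the highest probability, which is also the one with the highest raw score.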

V Experimental Results

V-A Experimental Setup

Fig. 7: Experimental setup.

We deployed a prototype of DeepWiTraffic on a two-lane rural highway (Figure 7). Two laptops (HP Elite 8730w model) were used to develop the prototype. These laptops, used as a WiFi transmitter and receiver, respectively, are equipped with a 2.53GHz Intel Core Extreme CPU Q9300 processor, 4GB of RAM, and an Intel 5300 NIC. DeepWiTraffic was executed on Ubuntu 14.04.04 (kernel version 4.2.0-27). We deployed another two laptops to record the ground-truth video data on each side of the road. Separate laptops were used for video recording to avoid interfering with WiFi communication and WiFi CSI data processing. These separate laptops were synchronized with the WiFi transmitter to ensure that video recording started at the same time WiFi communication was triggered.


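| Car-like vs. truck-like | Small / medium / large | Vehicle class | # of Samples |
|---|---|---|---|
| Car-like | Small | Bike | 22 |
| Car-like | Small | Passenger Car | 238 |
| Car-like | Medium | SUV | 253 |
| Car-like | Medium | Pickup Truck | 252 |
| Truck-like | Large | Large Truck | 18 |

TABLE III: Vehicle types and number of samples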

CSI data were collected for about 120 hours over a month. The extracted CSI amplitude and phase data for each passing vehicle were manually tagged based on the recorded video. In total, we collected CSI data for 783 vehicles (Table III). We referred to the FHWA vehicle classification [36] to determine the vehicle types. Classifying vehicles with more than two axles is known to be quite effective because of their large bodies. The main challenge lies in classifying two-axle vehicles, which have similar body sizes. As such, we concentrate on classifying vehicles with two axles, i.e., class 1 (motorcycle), class 2 (passenger car), class 3 (SUV), class 4 (bus), and class 5 (pickup truck), plus one other class (large truck), according to the FHWA classification [36]. Here a large truck means a single unit with three or more axles. Note that we excluded class 4 (buses) because we spotted only 2 buses on the rural highway during the data collection period. As Table III shows, we also tested two other typical classification methods, namely the ‘car-like vs. truck-like’ classification [36] and the ‘small, medium, large’ classification [37].


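| Parameter Type    | Value                                                       |
|-------------------|-------------------------------------------------------------|
| Solver            | Stochastic Gradient Descent with Momentum (SGDM) Optimizer  |
| Dropout Rate      | 60%                                                         |
| Shuffle Frequency | Every Epoch                                                 |
| Validation Data   | 30%                                                         |
| Input Image Size  | 6 WINDOW_SIZE                                               |
| L2 Regularization | None                                                        |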


TABLE IV: Hyper parameters for CNN

Table IV summarizes the hyper parameters we selected to train the CNN model. As shown, we used 70% of the collected CSI data to train the CNN model and the rest for testing. We compared the performance of DeepWiTraffic with that of support vector machine (SVM) and k nearest neighbor (kNN) classifiers. Following [38], which exploited WiFi CSI for fall detection, we used five features to train the SVM and kNN models: (1) the normalized standard deviation (STD) of CSI, (2) the offset of signal strength, (3) the period of the vehicle motion, (4) the median absolute deviation (MAD), and (5) the interquartile range (IR).
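Three of the five baseline features are standard statistics and can be sketched in a few lines of Python. The function name and the sample window are hypothetical, and normalizing the standard deviation by the mean is an assumption, since the exact definitions used in [38] are not reproduced here. The offset of signal strength and the period of vehicle motion are omitted because their definitions are not given in this section.

```python
import statistics


def csi_features(amplitudes):
    """Compute three summary statistics over a window of CSI amplitudes."""
    mean = statistics.fmean(amplitudes)
    std = statistics.pstdev(amplitudes)
    median = statistics.median(amplitudes)
    # MAD: median of the absolute deviations from the median.
    mad = statistics.median(abs(a - median) for a in amplitudes)
    # IQR: distance between the third and first quartiles.
    q1, _, q3 = statistics.quantiles(amplitudes, n=4)
    return {
        "normalized_std": std / mean,  # assumed normalization: STD / mean
        "mad": mad,
        "iqr": q3 - q1,
    }


# Hypothetical amplitude window for one passing vehicle.
features = csi_features([1, 2, 3, 4, 5, 6, 7, 8])
```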

V-B Detection Accuracy

The detection accuracy was 99.4% (778 out of 783). This high vehicle detection accuracy is attributed to PCA, which sharply differentiates the CSI amplitude values of passing vehicles by extracting the common features of the CSI amplitude values of the 30 subcarriers and representing them in a single dimension. The result agrees with the literature, which reports that most recent TMSs achieve very high vehicle detection accuracy. A total of 24 false positives were observed.

V-C Classification Accuracy


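For each classifier, the three columns give the accuracy for the car-like/truck-like, small/medium/large (S/M/L), and individual vehicle type classifications. Group-level accuracies are listed on the first row of each group.

| Car-like / Truck-like | S/M/L  | Vehicle Type  | SVM: 2-class | SVM: S/M/L | SVM: Type | kNN: 2-class | kNN: S/M/L | kNN: Type | Deep Learning: 2-class | Deep Learning: S/M/L | Deep Learning: Type |
|-----------------------|--------|---------------|--------------|------------|-----------|--------------|------------|-----------|------------------------|----------------------|---------------------|
| Car-like              | Small  | Bike          | 99.3%        | 85.7%      | 81.3%     | 99.2%        | 46.5%      | 77.2%     | 100.0%                 | 91.1%                | 97.2%               |
|                       |        | Passenger Car |              |            | 75.9%     |              |            | 54.5%     |                        |                      | 91.1%               |
|                       | Medium | SUV           |              | 85.8%      | 50.6%     |              | 92.8%      | 47.8%     |                        | 94.1%                | 83.8%               |
|                       |        | Pickup Truck  |              |            | 75.5%     |              |            | 42.5%     |                        |                      | 83.3%               |
| Truck-like            | Large  | Large Truck   | 98.0%        | 96.2%      | 95.5%     | 92.8%        | 91.1%      | 90.4%     | 100.0%                 | 100.0%               | 99.7%               |
| Average               |        |               | 98.7%        | 89.2%      | 75.8%     | 96.0%        | 76.8%      | 62.5%     | 100.0%                 | 95.1%                | 91.1%               |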


TABLE V: Classification accuracy

The classification accuracy is defined as the total number of correctly classified vehicles divided by the total number of detected vehicles. The classification accuracy of DeepWiTraffic is compared with that of SVM and kNN-based approaches. In this experiment, we randomly selected 30% of the passing vehicles as the validation set for SVM, kNN, and Deep Learning (DeepWiTraffic). We then calculated the average classification accuracy by repeating the experiment 1,000 times.
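The evaluation procedure described above (a random 30% validation split, repeated many times and averaged) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `fit_predict` callback interface, and the toy majority-class classifier are hypothetical stand-ins, not the classifiers used in the paper.

```python
import random


def repeated_holdout_accuracy(samples, labels, fit_predict, repeats=1000,
                              val_fraction=0.3, seed=0):
    """Average accuracy over repeated random train/validation splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_val = max(1, round(val_fraction * len(samples)))
    accuracies = []
    for _ in range(repeats):
        rng.shuffle(indices)
        val_idx, train_idx = indices[:n_val], indices[n_val:]
        predictions = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in val_idx],
        )
        # Accuracy = correctly classified vehicles / validation vehicles.
        correct = sum(p == labels[i] for p, i in zip(predictions, val_idx))
        accuracies.append(correct / n_val)
    return sum(accuracies) / len(accuracies)


def majority_class(train_x, train_y, val_x):
    """Toy stand-in classifier: always predict the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(val_x)


# Hypothetical toy data: sample i is just its own index.
labels = ["SUV"] * 8 + ["bike"] * 2
samples = list(range(len(labels)))
mean_acc = repeated_holdout_accuracy(samples, labels, majority_class, repeats=200)
```

Any classifier can be plugged in through `fit_predict`, so the same loop applies to SVM, kNN, or a CNN wrapper.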

The results are summarized in Table V. All classifiers performed well in classifying vehicles into car-like and truck-like classes. However, the performance decreased as the number of classes increased; in particular, the classification accuracy of SVM and kNN dropped sharply. In contrast, the average classification accuracy of DeepWiTraffic remained as high as 91.1% even for individual vehicle classes. Still, distinguishing similar-sized vehicles, i.e., SUVs and pickup trucks, was not easy for Deep Learning, resulting in an accuracy of 83.8% for SUVs and 83.3% for pickup trucks. Overall, DeepWiTraffic shows very promising performance, comparable to recent camera-based solutions [20][21] and magnetic sensor-based approaches [23][10].

V-D Classification Accuracy Per Lane


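For each lane configuration, the three columns give the accuracy for the car-like/truck-like, small/medium/large (S/M/L), and individual vehicle type classifications. Group-level accuracies are listed on the first row of each group.

| Car-like / Truck-like | S/M/L  | Vehicle Type  | Lane 1: 2-class | Lane 1: S/M/L | Lane 1: Type | Lane 2: 2-class | Lane 2: S/M/L | Lane 2: Type | Combined: 2-class | Combined: S/M/L | Combined: Type |
|-----------------------|--------|---------------|-----------------|---------------|--------------|-----------------|---------------|--------------|-------------------|-----------------|----------------|
| Car-like              | Small  | Bike          | 100.0%          | 91.1%         | 97.2%        | 100.0%          | 90.3%         | 97.0%        | 99.8%             | 79.6%           | 95.5%          |
|                       |        | Passenger Car |                 |               | 91.1%        |                 |               | 87.5%        |                   |                 | 81.6%          |
|                       | Medium | SUV           |                 | 94.1%         | 83.8%        |                 | 93.7%         | 83.1%        |                   | 80.0%           | 76.3%          |
|                       |        | Pickup Truck  |                 |               | 83.3%        |                 |               | 80.0%        |                   |                 | 66.5%          |
| Truck-like            | Large  | Large Truck   | 100.0%          | 100.0%        | 99.7%        | 100.0%          | 100.0%        | 99.1%        | 99.5%             | 99.0%           | 92.1%          |
| Average               |        |               | 100.0%          | 95.1%         | 91.1%        | 100.0%          | 94.7%         | 89.3%        | 99.7%             | 86.2%           | 82.4%          |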


TABLE VI: Classification accuracy per lane

Another interesting research question that we answer here is how the lane affects the performance of DeepWiTraffic. To answer this question, we created CNN models separately for each lane. The results for the CNN models for lane 1, lane 2, and combined lanes are summarized in Table VI. We found that the effect of the lane was negligible when vehicles were classified into car-like and truck-like classes. However, the accuracy of the CNN model for combined lanes was 8.9 and 8.7 percentage points lower than that of the lane 1 model for ‘S,M,L’ classes and individual vehicle classes, respectively. Based on these results, DeepWiTraffic trains CNN models individually for each lane, performs classification with both models, and selects the output with the higher probability. Another interesting observation is that the accuracy for Lane 1 is slightly higher than that for Lane 2. The reason is that, as illustrated in Figure 1, when a passing vehicle is close to the receiver, WiFi signals for different TX-RX antenna pairs are spaced more widely and cover the vehicle body more effectively.
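The per-lane selection rule can be sketched in Python as follows. This is a minimal sketch assuming each lane model outputs a SoftMax probability vector over the same five classes; the function name, the probability values, and the tie-breaking in favor of lane 1 are hypothetical.

```python
VEHICLE_TYPES = ["motorcycle", "passenger car", "SUV", "pickup truck", "large truck"]


def select_lane_prediction(lane1_probs, lane2_probs, class_names):
    """Return (label, lane, confidence) from the more confident lane model."""
    candidates = []
    for lane, probs in ((1, lane1_probs), (2, lane2_probs)):
        best = max(range(len(probs)), key=probs.__getitem__)
        candidates.append((probs[best], lane, class_names[best]))
    # Keep the model whose top class has the higher probability;
    # on a tie, max() returns the first candidate (lane 1).
    confidence, lane, label = max(candidates, key=lambda c: c[0])
    return label, lane, confidence


# Hypothetical SoftMax outputs of the two lane-specific CNN models.
lane1 = [0.05, 0.10, 0.55, 0.25, 0.05]
lane2 = [0.02, 0.08, 0.15, 0.70, 0.05]
result = select_lane_prediction(lane1, lane2, VEHICLE_TYPES)
```

Here the lane 2 model is more confident (0.70 versus 0.55), so its prediction is the one reported.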

VI Conclusion

We have presented the design, implementation, and evaluation of DeepWiTraffic, a low-cost and portable TMS based on WiFi CSI and deep learning. Numerous technical challenges have been addressed to achieve high vehicle detection and classification accuracy. With the large amount of CSI data and ground-truth video data that we collected over a month, we performed extensive real-world experiments and validated the effectiveness of DeepWiTraffic. Despite the low cost of the proposed system, the average classification accuracy for five different vehicle types was 91.1%, which is comparable to recent non-intrusive vehicle classification solutions. We expect that DeepWiTraffic will contribute to solving the cost issue of deploying a large number of TMSs to cover the vast mileage of rural highways.


This research was supported in part by the DGIST R&D Program of MSIP of Korea (CPS Global Center) and the Global Research Laboratory Program through NRF funded by MSIP of Korea (2013K1A1A2A02078326).

This research was also supported in part by the Competitive Research Grant Program (CRGP) of South Dakota Board of Regents (SDBoR).



  • [1] L. E. Y. Mimbela and L. A. Klein, “Summary of vehicle detection and surveillance technologies used in intelligent transportation systems,” Federal Highway Administration’s Intelligent Transportation Systems Joint Program Office, Tech. Rep., 2007.
  • [2] FHWA transportation infrastructure management. [Online]. Available:
  • [3] U.S. traffic monitoring location data. [Online]. Available:
  • [4] Challenges of the day-to-day operation of a traffic monitoring program (Georgia DOT). [Online]. Available:
  • [5] Traffic monitoring. [Online]. Available:
  • [6] W. Balid, H. Tafish, and H. H. Refai, “Intelligent vehicle counting and classification sensor for real-time traffic surveillance,” IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 6, pp. 1784–1794, 2018.
  • [7] C. Xu, Y. Wang, X. Bao, and F. Li, “Vehicle classification using an imbalanced dataset based on a single magnetic sensor,” Sensors, vol. 18, no. 6, p. 1690, 2018.
  • [8] M. Stocker, M. Rönkkö, and M. Kolehmainen, “Situational knowledge representation for traffic observed by a pavement vibration sensor network,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 4, pp. 1441–1450, 2014.
  • [9] S.-T. Jeng and L. Chu, “A high-definition traffic performance monitoring system with the inductive loop detector signature technology,” in Proc. of ITSC, 2014.
  • [10] B. Yang and Y. Lei, “Vehicle detection and classification for low-speed congested traffic with anisotropic magnetoresistive sensor,” IEEE Sensors Journal, vol. 15, no. 2, pp. 1132–1138, 2015.
  • [11] J. George, L. Mary, and K. Riyas, “Vehicle detection and classification from acoustic signal using ann and knn,” in Proc. of ICCC, 2013.
  • [12] H. Lee and B. Coifman, “Using lidar to validate the performance of vehicle classification stations,” Journal of Intelligent Transportation Systems, vol. 19, no. 4, pp. 355–369, 2015.
  • [13] K. Liu and G. Mattyus, “Fast multiclass vehicle detection on aerial images,” IEEE Geosci. Remote Sensing Lett., vol. 12, no. 9, pp. 1938–1942, 2015.
  • [14] T. Tang, S. Zhou, Z. Deng, L. Lei, and H. Zou, “Arbitrary-oriented vehicle detection in aerial imagery with single convolutional neural networks,” Remote Sensing, vol. 9, no. 11, p. 1170, 2017.
  • [15] N. Audebert, B. Le Saux, and S. Lefèvre, “Segment-before-detect: Vehicle detection and classification through semantic segmentation of aerial images,” Remote Sensing, vol. 9, no. 4, p. 368, 2017.
  • [16] E. Odat, J. S. Shamma, and C. Claudel, “Vehicle classification and speed estimation using combined passive infrared/ultrasonic sensors,” IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 5, pp. 1593–1606, 2018.
  • [17] S. A. Rajab, A. Mayeli, and H. H. Refai, “Vehicle classification and accurate speed calculation using multi-element piezoelectric sensor,” in Proc. of IV, 2014.
  • [18] M. Bottero, B. Dalla Chiara, and F. P. Deflorio, “Wireless sensor networks for traffic monitoring in a logistic centre,” Transportation Research Part C: Emerging Technologies, vol. 26, pp. 99–124, 2013.
  • [19] S. Meta and M. G. Cinsdikici, “Vehicle-classification algorithm based on component analysis for single-loop inductive detector,” IEEE Transactions on Vehicular Technology, vol. 59, no. 6, pp. 2795–2805, 2010.
  • [20] Z. Chen, T. Ellis, and S. A. Velastin, “Vehicle detection, tracking and classification in urban traffic,” in Proc. of ITSC, 2012.
  • [21] C. M. Bautista, C. A. Dy, M. I. Mañalac, R. A. Orbe, and M. Cordel, “Convolutional neural network for vehicle detection in low resolution traffic videos,” in Proc. of TENSYMP, 2016, pp. 277–281.
  • [22] E. Odat, J. S. Shamma, and C. Claudel, “Vehicle classification and speed estimation using combined passive infrared/ultrasonic sensors,” IEEE Transactions on Intelligent Transportation Systems, 2017.
  • [23] R. Wang, L. Zhang, K. Xiao, R. Sun, and L. Cui, “Easisee: Real-time vehicle classification and counting via low-cost collaborative sensing,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 1, pp. 414–424, 2014.
  • [24] H. Lee and B. Coifman, “Side-fire lidar-based vehicle classification,” Transportation Research Record: Journal of the Transportation Research Board, no. 2308, pp. 173–183, 2012.
  • [25] R. Niessner, H. Schilling, and B. Jutzi, “Investigations on the potential of convolutional neural networks for vehicle classification based on rgb and lidar data,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 4, p. 115, 2017.
  • [26] W. Ma, D. Xing, A. McKee, R. Bajwa, C. Flores, B. Fuller, and P. Varaiya, “A wireless accelerometer-based automatic vehicle classification prototype system,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 1, pp. 104–111, 2014.
  • [27] M. Haferkamp, M. Al-Askary, D. Dorn, B. Sliwa, L. Habel, M. Schreckenberg, and C. Wietfeld, “Radio-based traffic flow detection and vehicle classification for future smart cities,” in Proc. of VTC, 2017.
  • [28] L. Hanzo, Y. Akhtman, J. Akhtman, L. Wang, and M. Jiang, MIMO-OFDM for LTE, WiFi and WiMAX: Coherent versus non-coherent and cooperative turbo transceivers.   John Wiley & Sons, 2010.
  • [29] Y. Zeng, P. H. Pathak, C. Xu, and P. Mohapatra, “Your ap knows how you move: fine-grained device motion recognition through wifi,” in Proc. of HotWireless, 2014.
  • [30] W. Wang, A. X. Liu, M. Shahzad, K. Ling, and S. Lu, “Understanding and modeling of wifi signal based human activity recognition,” in Proc. of Mobicom, 2015.
  • [31] M. Li, Y. Meng, J. Liu, H. Zhu, X. Liang, Y. Liu, and N. Ruan, “When csi meets public wifi: Inferring your mobile phone password via wifi signals,” in Proc. of CCS, 2016.
  • [32] K. Ali, A. X. Liu, W. Wang, and M. Shahzad, “Keystroke recognition using wifi signals,” in Proc. of Mobicom, 2015.
  • [33] C. Wu, Z. Yang, Z. Zhou, K. Qian, Y. Liu, and M. Liu, “Phaseu: Real-time los identification with wifi,” in Proc. of INFOCOM, 2015.
  • [34] S. Sen, B. Radunovic, R. R. Choudhury, and T. Minka, “You are facing the mona lisa: spot localization using phy layer information,” in Proc. of Mobisys, 2012.
  • [35] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015.
  • [36] FHWA classification. [Online]. Available:
  • [37] M. Liang, X. Huang, C.-H. Chen, X. Chen, and A. Tokuta, “Counting and classification of highway vehicles by regression analysis,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 5, pp. 2878–2888, 2015.
  • [38] Y. Wang, K. Wu, and L. M. Ni, “Wifall: Device-free fall detection by wireless networks,” IEEE Transactions on Mobile Computing, vol. 16, no. 2, pp. 581–594, 2017.