1 Introduction
WiFi indoor localization has witnessed tremendous progress in the past decade owing to the pervasive deployment of wireless local area networks (WLANs). The state-of-the-art approaches can be categorized into two types, fingerprinting based [1][2][3][4][5] and channel estimation based [6][7][8], which stem from distinct treatments of wireless signal propagation in indoor environments. In the former, a position is characterized by its detected signal pattern, namely the fingerprint, in the form of a vector of received signal strength (RSS) values from different access points. The fingerprinting based approaches typically construct a location-dependent radio map offline, and then use this map to infer the location of a device from online measurements. For instance, Bahl and Padmanabhan adopted the Euclidean distance as the matching rule to compare the received RSS vector with the stored fingerprints [1]. The authors of [2] and [5] took temporal-spatial patterns into account when constructing the fingerprint database. The channel estimation based approaches aim to decompose the composite multipath propagation by estimating the angular parameters and the time-of-flight, so that elementary geometric methods can be used to pinpoint the location. Xiong and Jamieson designed an indoor localization system that used the MUSIC algorithm to estimate the angle of arrival (AoA) of WiFi radio signals [6]. More advanced signal processing techniques were implemented in succession, which traded off the accuracy of channel estimation against the number of antennas, the spectrum width, and the complexity order of algorithms [7][9]. Improved accuracy of WiFi positioning has fostered the prosperity of location based services (LBS).
The fingerprinting and the channel estimation approaches both have pros and cons: the former uses RSS, which is readily available on off-the-shelf smartphones, but requires great effort to construct a fingerprint database before localization can be performed; the latter is usually more accurate, but may demand cooperation between the device and the access points (APs), as well as multiple antennas. It may underperform the former when channel estimation loses accuracy in a rich multipath environment. Therefore, despite the potential advantage of channel estimation, the fingerprinting based approaches continue to dominate practical implementations of indoor localization systems [10].
A recent trend in WiFi fingerprinting is to replace RSS with channel state information (CSI), which represents the channel properties over all the subcarriers in the frequency domain of Orthogonal Frequency Division Multiplexing (OFDM) systems [8][11][12]. In each packet transmission, a vector of complex CSI values is obtained instead of a single RSS value. The CSI amplitude is not only more temporally stable, but also more representative of a location owing to the frequency diversity across subcarriers. However, CSI fingerprinting based localization is obstructed by two practical issues. One is CSI acquisition. Existing studies rely on the Intel 5300 CSI Tool [13], a toolkit that extracts CSI from received data packets. Operating the CSI Tool requires either a successful connection to each AP or a hard-coded MAC address for passive monitoring, so conducting a site survey with many APs is time consuming. It does not function when the surrounding APs are password protected or operated by a WLAN controller for roaming management. The other is the variability of AP deployment. When a subset of APs malfunction or are replaced, these changes should be detected automatically, and the fingerprint database should be updated accordingly. Hence, to make the best of CSI, an indoor localization system needs a convenient way of acquiring CSI and must reconfigure the fingerprint database when the AP deployment changes.
In this paper, we present CRISLoc, the first WiFi CSI fingerprinting localization system using ubiquitous smartphones. The practical advantages of CRISLoc are twofold. Firstly, CRISLoc can operate in a completely passive mode, overhearing packets on the fly for CSI acquisition when the smartphone is not allowed to access the APs or when different APs share the same SSID. In addition, CRISLoc can acquire CSI from data frames, ACK frames and beacons through smartphones, thus providing more freedom of implementation than previous systems. In a nutshell, CRISLoc pushes conceptual CSI fingerprinting closer toward real-world deployment. Secondly, CRISLoc is able to detect the variation of CSI fingerprints when one or more APs change their positions. The CSI fingerprints of the altered APs are reconstructed using advanced machine learning techniques. Hence, CRISLoc has the potential to achieve both high localization accuracy and resilience against environmental changes.
Designing CRISLoc is technically challenging and builds on several observations about CSI on smartphones. Our first obstacle is CSI calibration. CRISLoc constructs the fingerprint database using Nexmon [14], a newly developed smartphone CSI toolkit. However, the raw CSI amplitudes cannot be used directly because the automatic gain control (AGC) function distorts the measured amplitude of the received signal. We calibrate the measured CSI amplitudes by removing the effect of AGC, and design a pair of filters to discard unstable subcarriers and abnormal frames. After the preprocessing, CRISLoc can use around fifty subcarriers, nearly twice as many as the Intel 5300 CSI Tool, and the CSI amplitude measurement is more stable over time.
Our second obstacle is the detection of the altered APs as well as the reconstruction of their CSI fingerprints. When the localization is carried out using different subsets of the AP set, the estimated positions are likely to huddle together if none of the APs in these subsets are redeployed, and are prone to being scattered otherwise. The estimated locations may fall into several clusters of comparable sizes due to estimation errors, which makes a direct clustering analysis unreliable. We develop a joint clustering and outlier detection approach that gauges the sizes of the two largest clusters and detects the altered APs with both high precision and high recall. The CSI fingerprints of altered APs, though obsolete, still reflect the room layout, the path loss pattern and the spatial correlation. Therefore, we propose to exploit transfer learning to distill the knowledge gained from the outdated fingerprints rather than discarding them. To this end, a novel optimization framework is formulated with the target of finding a transform matrix that projects both the outdated and the fresh CSI data into a subspace where the distributions of the high-dimensional CSI data match well.
Our third obstacle is to cope with corner cases in indoor localization. An interesting observation regarding the spatial pattern of CSI amplitudes is that they are less sensitive to the increase of propagation distance than RSS. Therefore, choosing as many APs as possible does not yield a more accurate CSI localization, especially when the target smartphone is placed at the corners of a room. We propose an edge enhanced K-nearest neighbors (EEKNN) method that automatically adjusts the number of neighbors and their corresponding location-dependent weights. Extensive experiments show that CRISLoc achieves a mean error of 0.29 m in a research laboratory (6 m × 8 m), and a median error of 0.78 m in a complex academic building (8 m × 28 m) consisting of a research room, a long corridor and a small square. As the benchmark system, RSS fingerprinting only achieves mean errors of 0.40 m and 1.20 m respectively. When one or two APs out of nine are altered randomly, the localization error of CRISLoc increases by only 5.4 cm and 8.6 cm respectively, demonstrating the effectiveness of the proposed detection algorithm for altered APs and of the transfer learning approach.
In summary, our main contributions are as follows.

To the best of our knowledge, CRISLoc is the first CSI fingerprinting localization system using off-the-shelf smartphones.

We design a suite of methods to sanitize CSI data, encompassing the cancellation of automatic gain control and the filtering of unstable subcarriers and frames.

We design a joint clustering and outlier detection approach to find the altered APs, and develop a novel transfer learning approach to reconstruct their CSI fingerprints.

We point out the imperfection of CSI as the location feature, and present an enhanced KNN approach to improve the localization accuracy of corner cases.
The remainder of this paper is organized as follows. Section 2 introduces several preliminaries. Section 3 states the overall advantages of CRISLoc and its system diagram. Section 4 and Section 5 present the detailed system design and the CSI fingerprint reconstruction. Section 6 describes the setup of our experimental environment and defines the evaluation metrics, and Section 7 presents an exhaustive evaluation of CRISLoc. Related work on indoor localization is reviewed in Section 8. Finally, Section 9 summarizes the properties of CRISLoc.
2 Preliminaries
In this section, we introduce the basic physical concepts of WiFi indoor localization, traditional methods of fingerprint matching and the machine learning approaches that are pertinent to this work.
2.1 Received Signal Strength and Channel State Information
Received signal strength (RSS) indicates the power of a signal received at the physical layer, expressed in decibels. In general, the farther the receiver is from the transmitter, the lower the RSS will be. A widely adopted path loss model that captures the signal attenuation at GHz frequency bands in indoor environments, namely the wall attenuation factor (WAF) model [15], takes the following form:
$$P(d) = P(d_0) - 10\,n\log_{10}\!\left(\frac{d}{d_0}\right) - \begin{cases} n_W \cdot \mathrm{WAF}, & n_W < C \\ C \cdot \mathrm{WAF}, & n_W \ge C \end{cases} \tag{1}$$
where $n$ is the measured attenuation rate, $C$ is a predefined threshold, and $n_W$ is the number of walls between the transmitter and the receiver. $P(d_0)$ is the RSS at a reference point $d_0$ meters away from the transmitter, and $P(d)$ is that of the receiver $d$ meters apart from the transmitter. Both $P(d_0)$ and $P(d)$ are in the unit of dBm. Path loss models provide a rigorous way to quantify the impact of transmission distance and environmental parameters on the signal strength of wireless links. Nevertheless, existing models cannot precisely capture the complex signal attenuation, or tell the subtle differences between the RSS values of adjacent locations.
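As an illustration, the WAF model can be evaluated numerically. The sketch below assumes the usual form of the model from [15], that is, a distance loss of $10\,n\log_{10}(d/d_0)$ plus a wall loss that stops growing once the number of walls reaches the threshold. The function and parameter names are hypothetical, and the numbers are illustrative rather than measured.

```python
import math

def waf_rss(p_d0_dbm, d, d0, n, n_walls, c_max, waf_db):
    """Predicted RSS (dBm) at distance d under the wall attenuation factor model."""
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    # Distance-dependent path loss relative to the reference point.
    distance_loss = 10.0 * n * math.log10(d / d0)
    # Wall loss saturates once the number of walls reaches the threshold.
    wall_loss = min(n_walls, c_max) * waf_db
    return p_d0_dbm - distance_loss - wall_loss

# Illustrative values only: -40 dBm at 1 m, attenuation rate 2, two walls of 3 dB each.
print(waf_rss(p_d0_dbm=-40.0, d=10.0, d0=1.0, n=2.0, n_walls=2, c_max=4, waf_db=3.0))
```

Because the wall term saturates, adding walls beyond the threshold does not lower the prediction further, which is one reason such models struggle to separate adjacent locations.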
Channel state information (CSI) describes the channel properties of a communication link, especially how a signal propagates from the transmitter to the receiver and represents the combined effect of scattering, fading, and power decay with distance [16]. In a narrowband channel, the joint effect of wireless environment yields a linear model:
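$$Y = HX + N \tag{2}$$
where $X$ and $Y$ are the transmitted and received signals, $H$ is the channel state information (CSI) and $N$ is the additive white Gaussian noise, all represented by complex values. Therefore, the channel response of subcarrier $i$ can be estimated by $\hat{H}_i = Y_i / X_i$. CSI characterizes the channel response with both the amplitude and the phase:
$$|H_i| = \sqrt{\operatorname{Re}(H_i)^2 + \operatorname{Im}(H_i)^2} \tag{3}$$
and
$$\angle H_i = \arctan\!\left(\frac{\operatorname{Im}(H_i)}{\operatorname{Re}(H_i)}\right) \tag{4}$$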
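The following sketch illustrates the per-subcarrier estimate and its decomposition into amplitude and phase. It assumes the transmitted symbols are known pilots and ignores noise; the function names are hypothetical and not part of any CSI toolkit.

```python
import cmath

def estimate_csi(received, transmitted):
    """Per-subcarrier channel estimate H_i = Y_i / X_i (noise ignored)."""
    return [y / x for y, x in zip(received, transmitted)]

def amplitude_and_phase(csi):
    """Split complex CSI values into amplitudes and phases (radians)."""
    amplitudes = [abs(h) for h in csi]
    # cmath.phase uses atan2(imag, real), so the quadrant is resolved correctly.
    phases = [cmath.phase(h) for h in csi]
    return amplitudes, phases

csi = estimate_csi([3 + 4j, 2j], [1 + 0j, 1 + 0j])
print(amplitude_and_phase(csi))
```

Using `atan2` rather than a plain arctangent of the ratio avoids the ambiguity between opposite quadrants and the division by zero when the real part vanishes.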
The availability of CSI in WiFi systems has fostered a plethora of applications including indoor localization and activity sensing. The former leverages an antenna array to collect the signals arriving at different antennas so that the angle of arrival (AoA) and/or the time of flight (ToF) can be estimated for positioning [6][7]. The latter takes the variation in the amplitude and phase of CSI over time as a feature that reflects human activities near a wireless link.
2.2 Weighted K-Nearest Neighbors
We next describe weighted $K$-nearest neighbors (WKNN) [17], a statistical learning technique commonly used as the matching rule in WiFi indoor positioning systems. WKNN computes the distances (typically Euclidean distances) between each fingerprint in the database and a test sample, and picks the $K$ fingerprints with the smallest distances. Then, the estimate is obtained as the weighted average of the positions of these $K$ fingerprints. Those with smaller distances are given larger weights:
$$\hat{p} = \sum_{j=1}^{K} w_j\, p_j \tag{5}$$
$$w_j = \frac{1/d_j}{\sum_{l=1}^{K} 1/d_l} \tag{6}$$
where $\hat{p}$ is the estimated location, $p_j$ is the position of the $j$-th neighbor, and $d_j$ is the corresponding distance.
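A minimal sketch of WKNN matching follows, assuming inverse-distance weights normalized to sum to one and two-dimensional positions. The function name and the small constant `eps`, which guards against division by zero on an exact match, are assumptions made for illustration.

```python
import math

def wknn_locate(fingerprints, sample, k=3, eps=1e-9):
    """fingerprints: list of (feature_vector, (x, y)). Returns the estimated (x, y)."""
    # Rank all stored fingerprints by Euclidean distance to the test sample.
    scored = sorted(
        ((math.dist(features, sample), pos) for features, pos in fingerprints),
        key=lambda item: item[0],
    )[:k]
    # Smaller distance -> larger weight (eps is an assumed safeguard).
    weights = [1.0 / (d + eps) for d, _ in scored]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, scored)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, scored)) / total
    return (x, y)

database = [([0.0], (0.0, 0.0)), ([2.0], (2.0, 0.0)), ([10.0], (10.0, 10.0))]
print(wknn_locate(database, [1.0], k=2))
```

In the usage line the sample is equally distant from the two nearest fingerprints, so the estimate falls at the midpoint of their positions.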
2.3 Transfer Learning
We briefly describe the basic principle of transfer learning used in our system. Machine learning is often limited by a lack of sufficient data, which prevents it from gaining accurate knowledge. Transfer learning, a branch of machine learning, aims to apply the knowledge gained in solving one problem to a different but related problem where limited or even no labeled data is available. For instance, the knowledge acquired from learning to recognize cars can be applied to recognizing trucks, and the knowledge learned from CSI fingerprints in one location can be applied to constructing a prediction model in another location.
In general, the concepts of a domain and a task are involved in transfer learning [18]. A domain $\mathcal{D}$ consists of a feature space $\mathcal{X}$ and a marginal probability distribution $P(X)$ defined over this feature space, where $X = \{x_1, \dots, x_n\} \in \mathcal{X}$. A task $\mathcal{T}$ contains a label space $\mathcal{Y}$ and a conditional probability distribution $P(Y|X)$, that is, a model learned from the training data. Transfer learning involves two domains: a source domain and a target domain, denoted by $\mathcal{D}_S$ and $\mathcal{D}_T$ respectively. The domain $\mathcal{D}_S$ has the task $\mathcal{T}_S$ and the domain $\mathcal{D}_T$ has the task $\mathcal{T}_T$.
Given adequate knowledge of the source domain $\mathcal{D}_S$ and a small amount of labeled data in the target domain $\mathcal{D}_T$, one popular method of transferring the information is to find a latent subspace for the source and target data such that the difference between the probability distributions $P(X_S)$ and $P(X_T)$ of the two domains is minimized. In our transfer learning, we adopt maximum mean discrepancy (MMD) [19], a nonparametric metric of the distribution difference. Given the samples $X_S$ and $X_T$ from the two domains, it computes the distance between the averages of the samples projected into the subspace:
$$\mathrm{MMD}(X_S, X_T) = \left\| \frac{1}{n_S}\sum_{i=1}^{n_S}\phi(x_{S,i}) - \frac{1}{n_T}\sum_{j=1}^{n_T}\phi(x_{T,j}) \right\|^2 \tag{7}$$
where $n_S$ and $n_T$ are the numbers of samples in the source and target domains, and $\phi(\cdot)$ denotes the projection into the subspace. As shown in Equation (7), MMD equals zero when the two distributions in the subspace are the same. With the latent subspace found, a very limited amount of data in the target domain is sufficient to make classifications with confidence.
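To make the MMD computation concrete, the sketch below evaluates the squared distance between the projected sample means of two domains. It assumes that the projection is a plain linear map given as a matrix, which is a simplification chosen for illustration; all names are hypothetical.

```python
def project(sample, matrix):
    """Apply a linear projection: one output coordinate per row of `matrix`."""
    return [sum(w * x for w, x in zip(row, sample)) for row in matrix]

def mean_vector(samples):
    """Coordinate-wise average of a list of equally sized vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def mmd_squared(source, target, matrix):
    """Squared distance between the projected means of the two domains."""
    mu_s = mean_vector([project(x, matrix) for x in source])
    mu_t = mean_vector([project(x, matrix) for x in target])
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

identity = [[1.0, 0.0], [0.0, 1.0]]
source = [[0.0, 0.0], [2.0, 2.0]]
target = [[2.0, 1.0]]
print(mmd_squared(source, target, identity))         # means differ in the first coordinate
print(mmd_squared(source, target, [[0.0, 1.0]]))     # projection discards that coordinate
```

The second call shows why the choice of subspace matters: a projection that drops the coordinate in which the domains differ drives the discrepancy to zero.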
3 System Overview
In this section, we articulate the advantages of CRISLoc over the state-of-the-art systems, followed by the description of the system architecture.
3.1 Advantages of CRISLoc
Intuitively, CRISLoc utilizes the CSI fingerprints of commodity smartphones for indoor localization. A crucial question is why the smartphone CSI is a better choice than the Intel 5300 CSI Tool and the smartphone RSS in fingerprinting. The reasons are summarized in two aspects.

Practicability. i) A smartphone is able to overhear the frames transmitted in the air to acquire CSI, while the Intel 5300 CSI Tool requires a successful connection to each AP, one at a time. When the surrounding APs are password protected or operated by a WLAN controller for roaming management, the smartphone can still be used for CSI indoor localization, but the Intel 5300 CSI Tool cannot; ii) Data frames, beacons and ACK frames can all be exploited for CSI acquisition on the smartphone, while only data frames are useful with the Intel 5300 CSI Tool. These properties not only make CSI fingerprinting ubiquitous, but also significantly reduce the time needed to conduct a site survey.

Performance. i) The CSI of the smartphone is more stable than that of the Intel 5300 CSI Tool: as shown in Fig. 1, the CSI extracted by the CSI Tool fluctuates over time; ii) For an identical spectrum width, CSI can be extracted on more subcarriers from the smartphone than from the CSI Tool. Therefore, the smartphone CSI has the potential to outperform the Intel 5300 CSI Tool with more high-quality fingerprints.
The adoption of smartphone CSI brings new technical challenges, ranging from the calibration of measured CSI to the detection of anomalous APs. Reconstructing CSI-vector fingerprints when a fraction of the APs are altered is especially difficult compared with reconstructing simple RSS values.
3.2 System Architecture
CRISLoc has two major objectives: one is to perform indoor localization with the collected CSI fingerprints; the other is to automatically detect the change of APs and reconstruct the fingerprinting database with minimal extra site surveys. The latter empowers the CSI fingerprinting based localization with anomaly detection, thus improving the positioning accuracy.
The overall architecture of CRISLoc is shown in Fig. 2. It consists of the preprocessing module, the altered AP detection module, the reconstruction module and the localization module.
Preprocessing. The raw CSI data collected by a smartphone cannot be directly used to create the fingerprinting database because the automatic gain control (AGC) function distorts the measured amplitude of the received signal. Meanwhile, the subcarrier filtering is introduced to remove unstable CSI subcarriers and the frame filtering is adopted to remove abnormal frames.
Altered-AP Detection. An obstacle preventing the usage of WiFi fingerprinting localization is the change of AP deployment. Once a fraction of the APs malfunction or are moved to other locations, the fingerprint database becomes obsolete. CRISLoc tackles this challenge in two scenarios: i) using kernel density estimation in the presence of reference points (RPs); ii) using a cluster-outlier joint approach followed by sequential analysis in the absence of RPs.
Fingerprint Reconstruction. CRISLoc adopts domain adaptation transfer learning to reconstruct the fingerprints of altered APs. After projecting the outdated fingerprints and the newly collected fingerprints of RPs onto a subspace, CRISLoc generates a new fingerprint database by minimizing their Euclidean distance.
Matching Rule. CRISLoc develops an edge enhanced K-nearest neighbors (EEKNN) approach to pinpoint a smartphone. EEKNN is capable of handling the corner scenarios where the previous approaches are prone to large errors.
4 CSI Filtering and Anomaly Detection
In this section, we present filtering methods of smartphone CSI and algorithms to detect the set of altered APs.
4.1 Preprocessing
We only make use of the CSI amplitude and leave the CSI phase unexploited. The reasons are twofold. Firstly, the extracted CSI phase is not a fixed value. Due to carrier frequency offset (CFO) and sampling frequency offset (SFO), an estimation error of the CSI phase occurs and accumulates over time. Besides, an extremely subtle change in the internal circuits (e.g. oscillators, phase lockers and amplifiers) may cause a remarkable drift in the phase estimate, making the estimated phase obsolete within a short period. Secondly, the phase in a narrowband channel is a less indicative feature than the amplitude envelope. If the phase estimates are expanded along the subcarriers, a linear correlation is observed due to the evenly spaced frequencies of the subcarriers. Such a simple feature is insufficient to serve as the “identity” of a location.
The CSI data is preprocessed by taking the following three steps: subcarrier filtering, frame filtering, and CSI calibration.
4.1.1 Subcarrier Filtering
The CSI vector extracted by Nexmon contains sixty-four elements. Although numerous, some of them are actually not CSI and some others are relatively unstable. Including these elements in the fingerprinting database may impair the localization. Invalid subcarriers whose CSI values are zero for all frames can be easily removed. As for unstable subcarriers, we propose to use the coefficient of variation (CV), a normalized metric that captures the variability of a subcarrier. The CV of the CSI on subcarrier $i$ at position $p$ is defined as
$$CV_{i,p} = \frac{\sigma_{i,p}}{\mu_{i,p}} \tag{8}$$
where $\sigma_{i,p}$ is the sample standard deviation, and $\mu_{i,p}$ is the CSI sample average. Given that the CV of each subcarrier varies across positions, we quantitatively evaluate the quality of a subcarrier based on its overall performance, that is, the average of $CV_{i,p}$ across all positions.
Fig. 3 plots the averaged CV in one localization scenario. Most subcarriers have a CV lower than 0.05, while the CVs of several others are relatively high. Therefore, we set a threshold to filter out these unstable subcarriers. To limit the impact of subcarrier filtering on the subsequent filters, we only remove the subcarriers with extremely high CV. Since the overall stability of the subcarriers to the right of the central frequency differs from that of the subcarriers to the left, we set separate thresholds for the two halves. The threshold is set based on the median (instead of the mean, which is seriously influenced by the extreme CV values of unstable subcarriers) and the variance of the CVs. The subcarriers whose CVs exceed the threshold (the dashed orange line in Fig. 3) are eliminated. The removed subcarriers lie mainly at the edges of the left and the right halves. This makes sense because, empirically, the subcarriers near the zero subcarrier or at the edge of the band are more likely to be unreliable.
The choice of smartphone has a relatively low impact on the stability of the subcarriers. This is because CSI is an objective metric across space, independent of the devices that measure and extract it, except for subtle measurement errors. Therefore, it is reasonable to filter out the same subcarriers for different users. As for different APs in various scenarios, the CSI performance on each subcarrier might change. Even though different subcarriers might be removed based on the rule mentioned above, the rest of the system still works.
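The filtering rule can be sketched as follows. The text only states that the threshold depends on the median and the variance of the CVs, so the concrete form used here (median plus `scale` times the standard deviation, computed separately for each half of the band) is an assumption, as are the function names and the default `scale`.

```python
import statistics

def coefficient_of_variation(amplitudes):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(amplitudes) / statistics.fmean(amplitudes)

def unstable_subcarriers(avg_cv, scale=2.0):
    """Indices of subcarriers whose position-averaged CV exceeds its half's threshold.

    avg_cv: averaged CV per subcarrier, ordered by frequency, invalid ones already removed.
    """
    half = len(avg_cv) // 2
    removed = []
    for start, stop in ((0, half), (half, len(avg_cv))):
        part = avg_cv[start:stop]
        # ASSUMPTION: threshold = median + scale * standard deviation of this half.
        threshold = statistics.median(part) + scale * statistics.pstdev(part)
        removed += [start + i for i, cv in enumerate(part) if cv > threshold]
    return removed

avg_cv = [0.02] * 9 + [0.5] + [0.03] * 10   # illustrative values, not measured data
print(unstable_subcarriers(avg_cv))
```

Anchoring the threshold on the median keeps a single extreme subcarrier from dragging the cut-off upward, which is the motivation given above for preferring it to the mean.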
4.1.2 Frame Filtering
Abnormal CSI measurements may appear due to environmental noise. We plot the envelopes of the CSI amplitude of different data frames across all subcarriers in Fig. 4(a). One out of nearly fifty frames exhibits a remarkably different envelope with errors on many subcarriers. Hence, the subcarrier filtering scheme cannot remove such abnormal measurements. We further develop a frame filtering approach based on the Mahalanobis distance. In general, the Mahalanobis distance is defined as:
$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{T}\, \mathbf{S}^{-1}\, (\mathbf{x} - \boldsymbol{\mu})} \tag{9}$$
where $\mathbf{x}$ is the sample vector, $\boldsymbol{\mu}$ is the arithmetic mean vector of a set of observations, and $\mathbf{S}$ is the covariance matrix. The Mahalanobis distance measures how far an observation is from the mean of a set of observations, taking the covariance of the elements of the vector into account. Note that a minimum number of CSI measurements is required to calculate the covariance matrix $\mathbf{S}$. Therefore, we collect CSI again when the sampled frames are not enough.
The frame filter calculates the Mahalanobis distance of each frame received at the same point (site survey) and in a short period (user request). We evaluate the Mahalanobis distances of three hundred frames and plot them in a histogram (Fig. 4(b)). Here, the x-coordinate represents the discretized interval of the Mahalanobis distance and the y-coordinate represents the fraction of data frames falling in a specific interval. Our experimental results show that the abnormal frames differ markedly from the other frames and tend to have such large Mahalanobis distances that an obvious gap appears between the majority of the frames and the abnormal ones. By setting an adaptive threshold located in this gap, which filters out the 5% of frames with large Mahalanobis distances, we obtain purified CSI frames suitable for fingerprinting.
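A sketch of the frame filter is given below. It assumes NumPy is available and replaces the adaptive, gap-based threshold with a fixed quantile that drops the 5% of frames with the largest distances; this simplification, the pseudo-inverse used to tolerate a singular covariance matrix, and the function name are all assumptions.

```python
import numpy as np

def filter_frames(frames, drop_fraction=0.05):
    """frames: (n_frames, n_subcarriers) CSI amplitudes.

    Returns the retained frames and the Mahalanobis distance of every frame.
    """
    frames = np.asarray(frames, dtype=float)
    centered = frames - frames.mean(axis=0)
    cov = np.atleast_2d(np.cov(frames, rowvar=False))
    cov_inv = np.linalg.pinv(cov)            # assumed safeguard for singular covariance
    # Quadratic form (x - mu)^T S^-1 (x - mu), evaluated for each frame.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    distances = np.sqrt(np.maximum(d2, 0.0))
    # ASSUMPTION: fixed quantile instead of the adaptive gap-based threshold.
    threshold = np.quantile(distances, 1.0 - drop_fraction)
    return frames[distances <= threshold], distances

rng = np.random.default_rng(0)
frames = rng.normal(10.0, 0.1, size=(100, 4))   # synthetic amplitudes
frames[7] += 5.0                                # one abnormal frame
kept, distances = filter_frames(frames)
print(kept.shape, int(distances.argmax()))
```

With too few frames the covariance estimate becomes unreliable, which matches the requirement above to collect more CSI before filtering.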
4.1.3 CSI Calibration
The CSI data extracted from smartphones cannot be used directly because the automatic gain control (AGC) scheme at the receiver magnifies the amplitude of the original CSI. That is to say, the extracted CSI is multiplied by an unknown factor whose value changes as the user moves. As a result, the power of the CSI no longer follows the basic path loss principle, and the rationale of fingerprinting based localization breaks down. To solve this problem, we propose to rescale the extracted CSI using RSS so that AGC is canceled, based on the fact that RSS is obtained before AGC while CSI is obtained after AGC. Since AGC is a linear time-invariant (LTI) system and is homogeneous across subcarriers, the ratio between the CSI of any two subcarriers remains the same. Given that the sum of the squared CSI over all the subcarriers should be consistent with RSS, we multiply the extracted CSI of all subcarriers by a single coefficient:
$$s = \sqrt{\frac{P_{\mathrm{RSS}}}{\sum_{i} |H_i|^2}} \tag{10}$$
which yields the CSI before AGC. Here, $H_i$ is the extracted CSI of the $i$-th subcarrier and $P_{\mathrm{RSS}}$ is the received signal strength in mW. In this way, the power of the rescaled CSI equals the corresponding RSS and AGC is thus canceled.
The effectiveness of CSI calibration is demonstrated in Fig. 5, where we plot the relative CSI amplitude on one subcarrier (shown on the z axis) in a 4.5 m by 5 m area for both uncalibrated and calibrated CSI. The origin of the figure is one corner of the area and the AP is located at (3.5, 2). The uncalibrated CSI does not follow the signal path loss, while the calibrated CSI does: a smartphone closer to the AP usually collects CSI with a larger amplitude.
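The rescaling step can be sketched in a few lines. The sketch assumes that the RSS is reported in dBm and converts it to mW before matching it against the total CSI power; the function name is hypothetical.

```python
import math

def calibrate_csi(csi, rss_dbm):
    """Rescale complex CSI so that its total power equals the RSS (in mW)."""
    rss_mw = 10.0 ** (rss_dbm / 10.0)          # dBm -> mW
    power = sum(abs(h) ** 2 for h in csi)
    if power == 0:
        raise ValueError("CSI vector has zero power")
    scale = math.sqrt(rss_mw / power)          # single coefficient for all subcarriers
    return [h * scale for h in csi]

calibrated = calibrate_csi([3 + 4j, 6 + 8j], rss_dbm=0.0)
print(sum(abs(h) ** 2 for h in calibrated))    # total power now matches 1 mW
```

Because one coefficient is applied to every subcarrier, the ratios between subcarriers, and therefore the shape of the amplitude envelope, are left unchanged.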
4.2 Altered AP Detection
The biggest challenge of fingerprinting based localization is its sensitivity to environmental changes, especially the relocation of APs. One crucial question arises: can we accurately detect the change of APs and revamp the fingerprint database accordingly? We develop comprehensive approaches to detect altered APs in two different situations, namely, with and without reference points. A reference point (RP) is a mobile terminal that continuously collects fingerprints such as CSI from surrounding APs. Detecting the change of APs with the assistance of RPs is more accurate than detecting it without them, yet comes at the cost of their deployment.
4.2.1 Detection With RPs
The detection with RPs is relatively direct: the up-to-date CSI collected at the RPs is compared with the previous CSI fingerprints using kernel density estimation (KDE). KDE is adopted because the CSI values do not follow the normal distribution. Taking the CSI value of each subcarrier as a random variable, we test CSI samples using Pearson’s chi-squared test [20]. More than half (i.e. 54%) of the CSI samples fail the test. Therefore, we cannot simply assume that CSI values are normally distributed, and we use KDE with a Gaussian kernel to estimate the probability density function (PDF) instead. Each kernel is associated with one CSI sample, and the Gaussian kernel functions add up to the probability distribution of the CSI values.
Note that the CSI on different subcarriers is correlated, which is difficult to handle in KDE for two reasons. First, the joint distribution of CSI over the entire spectrum is hard to estimate accurately with a limited number of samples. Second, the complexity increases exponentially with the number of subcarriers. We therefore compute the PDF of each subcarrier independently, so that the number of PDFs equals the number of subcarriers. This simplification significantly reduces the number of CSI samples required for the estimation without compromising the performance of altered AP detection, and reduces the complexity from exponential to polynomial.
During the estimation phase, the probability of observing the CSI sample at the RP is estimated for each AP once a user request is received. If the probability is lower than a threshold value, typically 5%, we consider the AP altered and reconstruction is required.
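The per-subcarrier test can be sketched as below. Several details are assumptions made for illustration: the bandwidth follows the normal-reference rule, the probability of an observation is taken as the two-sided tail probability under the estimated density, and the per-subcarrier results are combined with the median. None of these choices is stated in the text, and the function names are hypothetical.

```python
import math
import statistics

def kde_tail_probability(history, value):
    """Two-sided tail probability of `value` under a Gaussian KDE of `history`."""
    n = len(history)
    # ASSUMPTION: normal-reference bandwidth; the floor avoids division by zero.
    bandwidth = max(1.06 * statistics.stdev(history) * n ** -0.2, 1e-12)
    # The CDF of a Gaussian-kernel KDE is the average of the kernel CDFs.
    cdf = sum(
        0.5 * (1.0 + math.erf((value - x) / (bandwidth * math.sqrt(2.0))))
        for x in history
    ) / n
    return 2.0 * min(cdf, 1.0 - cdf)

def ap_is_altered(history_per_subcarrier, observed, threshold=0.05):
    """Flag an AP when the observed CSI is improbable under the stored fingerprints."""
    probs = [
        kde_tail_probability(h, v)
        for h, v in zip(history_per_subcarrier, observed)
    ]
    # ASSUMPTION: subcarriers are treated independently and combined by the median.
    return statistics.median(probs) < threshold

history = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98]   # illustrative amplitudes
print(ap_is_altered([history, history], [1.0, 1.01]))
print(ap_is_altered([history, history], [3.0, 3.1]))
```

Treating each subcarrier separately keeps every density one-dimensional, which is what makes the estimate feasible with a small number of samples.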
4.2.2 Detection Without RPs
The detection of altered APs becomes more challenging in the absence of RPs. We propose a novel cluster-outlier joint approach to find the altered APs. The basic idea is to discover the discrepancy between the localization results obtained with different subsets of the available APs. Adopting the joint approach is crucial: a clustering method alone may overestimate the number of altered APs, whereas an outlier detection approach alone may underestimate it. The joint cluster-outlier approach has the potential to achieve both high precision and high recall of detection.
In the cluster-outlier joint approach, we acquire different estimates of the user position with multiple AP subsets. By scrutinizing these results, we observe that the subsets containing one or more altered APs yield scattered positions, while those without altered APs yield tightly clustered positions. We hence detect the altered APs by distinguishing these two kinds of subsets.
The setting of the number of APs in each subset is crucial. Let $\mathcal{A}$ be the whole set of APs, and $\mathcal{S}$ be a subset of $\mathcal{A}$. When $\mathcal{S}$ contains very few APs, $\mathcal{S}$ tends to have large localization errors even if it does not contain altered APs. On the contrary, $\mathcal{S}$ is likely to perform very well if it contains many well-functioning APs but few altered APs. Therefore, a minimum and a maximum size are configured to select the subsets of APs (the minimum is three and the maximum is five in our implementation). For each subset, we estimate the location of a smartphone using the matching rule in Section 5.2. When the APs are densely deployed, we randomly select a fraction of all the subsets, choosing as many subsets as the computation and time budget permit. Meanwhile, we guarantee that each AP appears in these subsets with the same frequency.
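The subset generation can be sketched with the standard library. The function name and the `per_size` and `seed` parameters are hypothetical. Full enumeration gives every AP the same frequency by symmetry, whereas the plain random sampling shown here does not by itself enforce the equal-frequency guarantee described above.

```python
import itertools
import random

def ap_subsets(aps, min_size=3, max_size=5, per_size=None, seed=0):
    """All AP subsets with sizes in [min_size, max_size], optionally subsampled."""
    rng = random.Random(seed)
    subsets = []
    for size in range(min_size, max_size + 1):
        combos = list(itertools.combinations(aps, size))
        # Optional cap per subset size when the deployment is dense (assumed policy).
        if per_size is not None and per_size < len(combos):
            combos = rng.sample(combos, per_size)
        subsets.extend(combos)
    return subsets

subsets = ap_subsets(range(9))
print(len(subsets))                             # number of subsets for nine APs
print(sum(1 for s in subsets if 0 in s))        # how often AP 0 appears
```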
Cluster-Outlier Joint Approach. Intuitively, as shown in Fig. 6, the localization results coming from the subsets without altered APs tend to fall in the same, largest cluster and to be close to the ground truth. The largest cluster is deemed the “ground truth” cluster (GTC) of localization, mostly containing results from unaltered APs. Similarly, we deem the localization results that are not included in the GTC as non-GTC points, and consider all the non-GTC points as the localization results of the subsets containing altered APs. In order to detect altered APs, we examine the non-GTC points. However, due to estimation errors, the localization results may be grouped into multiple clusters of comparable sizes. As shown in Fig. 6, our experiments demonstrate the existence of several clusters whose relative significance is hard to differentiate. If we wrongly pick the “largest” cluster, the detection of altered APs fails: good APs are removed while altered APs are kept, and the localization accuracy becomes worse than that with the original CSI fingerprints. To avoid such mistakes, we only examine the outliers of the localization results under such circumstances. Fig. 6 shows that the majority of the outliers are generated by the subsets with altered APs. On the other hand, the localization results of the subsets of unaltered APs are inclined to be clustered together, and may not give rise to many outliers.
The cluster-outlier joint approach operates in two steps. The clustering algorithm runs first, in which the classical DBSCAN method [21] is employed. Denote by $n_1$ and $n_2$ the sizes of the largest and the second largest clusters. In the second step, we leverage all the non-GTC points when the ratio $n_1/n_2$ is greater than a certain threshold $\tau$. Otherwise, we resort to outlier detection.
DBSCAN identifies clusters by the following rule: given two global parameters, the radius $\varepsilon$ and the threshold number of neighbors $MinPts$, a point is a member of a cluster if it is either a core point that has at least $MinPts$ neighbors within distance $\varepsilon$, or a neighbor of a core point; otherwise it is considered an outlier. The process of extending clusters in DBSCAN is based on breadth first search (BFS).
Each unvisited point $p$ is then tested to be either a member of a cluster or an outlier. If $p$ belongs to a cluster, a new cluster is created and the above procedure is repeated for the neighbors of $p$ until all the points have been visited.
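For reference, a compact textbook version of DBSCAN with breadth-first expansion is sketched below. It is not the tailored variant of Algorithm 1, and it follows the common convention that a point counts as its own neighbor; the names are hypothetical.

```python
import math
from collections import deque

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for an outlier."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # tentative outlier; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:                    # breadth-first expansion of the cluster
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster     # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            near = neighbors(j)
            if len(near) >= min_pts:    # only core points extend the frontier
                queue.extend(near)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=4))
```

The neighbor search here is quadratic in the number of points, which is acceptable for the modest number of localization results produced by the AP subsets.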
The DBSCAN algorithm tailored to our localization system is shown in Algorithm 1.
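The BFS-based cluster expansion described above can be sketched as follows. This is a minimal, generic DBSCAN, not the tailored Algorithm 1 of the paper; the function name `dbscan`, the `NOISE` label and the convention that a point counts as its own neighbor are assumptions made for this illustration.

```python
from collections import deque
from math import dist

NOISE = -1  # label used for outliers (assumed convention)


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def region(i):
        # Indices of all points within eps of point i (point i included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may become a border point later
            continue
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:  # breadth-first expansion from the core point
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:  # only core points expand further
                queue.extend(j_neighbors)
        cluster += 1
    return labels
```

With two tight groups of four localization results and one stray result, the stray result is the only point labeled `NOISE`.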
The parameter $MinPts$ is kept at its default value, which is twice the dimension of the data space, i.e., four in two-dimensional space. As for $\epsilon$, its optimal value varies across situations, e.g., with the number of altered APs. Instead of setting $\epsilon$ as a constant, we select its value automatically according to the $k$-distance [22], i.e., the distance from a sample to its $k$-th nearest sample. In our implementation, $k$ equals three. The sorted $k$-distances of the localization results increase dramatically after a certain point, and the results whose $k$-distances are smaller than the value at that point probably lie in clusters. Therefore, we find the point where the slope starts to become steep and let its $k$-distance be the parameter $\epsilon$. With this approach, we avoid choosing the clustering parameters arbitrarily.
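As an illustration of the $k$-distance rule, the sketch below sorts the $k$-distances and picks the value at the knee of the curve. The text only says that the slope "starts to be steep"; the specific knee rule used here (the point farthest below the chord joining the first and last values) is an assumption, as are the function names `k_distances` and `pick_eps`.

```python
from math import dist


def k_distances(points, k=3):
    """Sorted distance from each point to its k-th nearest other point."""
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result)


def pick_eps(points, k=3):
    """Pick eps at the knee of the sorted k-distance curve.

    Knee heuristic (an assumption of this sketch): the point that lies
    farthest below the straight line joining the first and last values.
    Requires more than k points.
    """
    kd = k_distances(points, k)
    n = len(kd)
    first, last = kd[0], kd[-1]
    best_i, best_gap = 0, -1.0
    for i, value in enumerate(kd):
        chord = first + (last - first) * i / (n - 1)
        gap = chord - value
        if gap > best_gap:
            best_i, best_gap = i, gap
    return kd[best_i]
```

For two tight unit squares of results plus one distant result, the curve is flat at the within-square 3-distance and jumps for the distant result, so the knee selects the within-square value.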
We then identify the altered APs based on the total frequency with which each AP appears in the subsets behind the non-GTC points or outliers. APs with a higher frequency of occurrence are more likely to have been altered.
In the cluster method, the frequency of each AP in a subset is simply incremented by one for each subset result among the non-GTC points.
In the outlier method, we weight each outlier differently, based on the fact that the larger the distance of a localization result is, the more likely it is to have been generated by a subset with altered APs. When counting the frequency of each AP in the outlier set, we give every outlier a weight proportional to its distance:
(11) 
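The two counting rules can be sketched in a few lines. The data layout (one tuple of AP identifiers per localization result), the function name `ap_frequencies` and the use of an unnormalized weight per flagged result are assumptions of this sketch; the exact form of Equation (11) is not reproduced here.

```python
def ap_frequencies(subsets, flagged, weights=None):
    """Accumulate, per AP, how often it appears in flagged subsets.

    subsets : list of tuples of AP ids, one tuple per localization result
    flagged : indices of results that are non-GTC points or outliers
    weights : optional per-result weight (e.g. a distance); when omitted,
              every flagged result counts as one (cluster method)
    """
    freq = {ap: 0.0 for ap_ids in subsets for ap in ap_ids}
    for i in flagged:
        w = 1.0 if weights is None else weights[i]
        for ap in subsets[i]:
            freq[ap] += w
    return freq
```

Calling it without `weights` gives the plain counts of the cluster method; passing a distance per result gives the weighted counts of the outlier method.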
With the summed-up frequency of each AP, we employ the Jenks natural breaks classification method [23] to separate the altered APs from the unaltered ones. Note that the Jenks method treats the two classes symmetrically, which ignores the fact that altered and unaltered APs appear asymmetrically. We therefore multiply the variance of the frequencies of the altered class by an adaptive factor. The larger this factor is, the fewer altered APs the system is likely to claim. We adjust the factor automatically based on the overall dispersion, which is evaluated by the mean distance between every pair of results, including both GTC and non-GTC points. The rationale is that the overall dispersion of the localization results goes up as the number of altered APs increases. The classification is then found by minimizing the weighted sum of the squared deviations from the class means.
(12)  
(13) 
where the two classes are the unaltered and the altered classes with their respective mean frequencies, and the remaining quantities are the overall dispersion and the average and standard deviation of the overall dispersion when no AP is altered. The whole cluster-outlier joint approach is shown in Algorithm 2.
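A two-class natural-breaks split with an asymmetric weight can be sketched as an exhaustive search over break positions. This is a simplified stand-in for Equations (12) and (13): the `factor` argument is passed in directly rather than derived from the overall dispersion, and the function name `two_class_jenks` is hypothetical.

```python
def two_class_jenks(freqs, factor=1.0):
    """Split AP frequencies into (unaltered, altered) with a single break.

    Minimizes SSD(unaltered) + factor * SSD(altered), where SSD is the sum
    of squared deviations from the class mean. With factor == 1 this is the
    plain two-class natural-breaks criterion.
    """
    def ssd(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    ranked = sorted(freqs.items(), key=lambda item: item[1])
    best = None
    for cut in range(1, len(ranked)):  # both classes stay non-empty
        low, high = ranked[:cut], ranked[cut:]
        cost = ssd([v for _, v in low]) + factor * ssd([v for _, v in high])
        if best is None or cost < best[0]:
            best = (cost, [ap for ap, _ in low], [ap for ap, _ in high])
    return best[1], best[2]
```

In the example used for the test, a factor of 1 flags only the AP with the highest frequency, while a smaller factor of 0.1 tolerates more spread in the altered class and flags two APs, which matches the stated tendency that a larger factor yields fewer claimed altered APs.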
Sequential Analysis.
Sequential analysis is applied after the joint approach to improve the detection accuracy. The basic idea is to combine several samples to produce a more reliable result. In addition, sequential analysis solves an inevitable problem of the joint approach: the Jenks method always produces two classes, each of which contains at least one element. Therefore, the altered class contains at least one AP even if no AP is altered, and a false alarm occurs.
In sequential analysis, we first take a number of consecutive samples into account and calculate, for each AP, the ratio of samples in which it is flagged as altered. This ratio is taken as the reliability level of asserting that the AP is altered. If the ratio is higher than an upper reliability threshold, the AP is claimed as altered with certainty. If the ratio is lower than a lower threshold, the AP is claimed as unaltered. Otherwise, no reliable judgment can be made, and more samples are taken into account until one can. Note that this process is terminated with an explicit output, obtained by comparing the ratio to 0.5, once the number of samples checked exceeds an upper bound. The procedure for detecting altered APs without RPs is summarized in Algorithm 3.
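The stopping rule for one AP can be sketched as below. The parameter names (`n_init`, `upper`, `lower`, `n_max`) and the function name are hypothetical, and the thresholds in the test are illustrative values, not those used in the paper's experiments.

```python
def sequential_decision(alarms, n_init, upper, lower, n_max):
    """Decide whether one AP is altered from a stream of per-sample alarms.

    alarms : iterable of 0/1 flags, one per test sample (1 = AP flagged)
    Returns (decision, samples_used); decision True means "altered".
    """
    count = seen = 0
    for flag in alarms:
        seen += 1
        count += flag
        if seen < n_init:  # wait for the initial batch of samples
            continue
        ratio = count / seen
        if ratio > upper:
            return True, seen
        if ratio < lower:
            return False, seen
        if seen >= n_max:  # forced decision at the upper bound
            return ratio > 0.5, seen
    # Stream ended before a confident decision: fall back to the 0.5 rule.
    return (count / seen > 0.5 if seen else False), seen
```

An AP that is flagged in almost every sample is accepted as altered after a few samples, while an AP with a mixed record keeps accumulating samples until the upper bound forces a decision.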
When no AP is altered, the AP that appears most frequently among the outliers or non-GTC points differs from one test sample to another. That is to say, no AP is consistently classified into the altered class, so its alarm ratio stays low.
5 CSI Reconstruction and Localization
In this section, we propose a novel transfer learning method to reconstruct the CSI fingerprints based on the maximum mean discrepancy measure. An edge-enhanced CSI matching rule is then designed to perform indoor localization.
5.1 Fingerprint Reconstruction
When APs are altered, the existing CSI fingerprints seem to be useless. Yet recollecting new CSI is time-consuming and economically inefficient. An interesting question is whether the outdated CSI fingerprints, combined with a few fresh CSI samples from the reference points, can be used to generate updated fingerprints without a cumbersome survey of all the sites. We observe that the factors influencing CSI, such as the building layout, usually change very little even when the location of an AP changes. The path loss patterns of spatially adjacent points may change in a similar way. Therefore, we leverage transfer learning to reconstruct the CSI database for the new scenario with the knowledge gained from the outdated fingerprints.
More precisely, the main goal of transfer learning is to find a transform matrix that projects both the outdated CSI data, serving as the source domain, and the updated CSI data, serving as the target domain, into a subspace where their distributions are matched. The properties of the CSI radio map should be preserved at the same time. By projecting the outdated CSI data into this subspace, we reconstruct CSI fingerprints that achieve high localization accuracy. It is assumed that the training points include all RPs. In the following, we present how to find the optimal transform matrix. The notations that we use are listed in Table I.
Notation  Definition 

All-one column vector  
Identity matrix  
The dimension of one AP’s CSI samples  
The dimension of subspace  
The number of the outdated samples at point  
The number of the updated samples at point  
The number of points with outdated labeled samples  
The number of points with updated labeled samples  
Transform matrix,  
The th outdated CSI sample at point ,  
The th updated CSI sample at point ,  
The average of over samples at point  
The average of over samples at point  
The average of the outdated samples over points  
The set of neighbours of point  
The set of non-neighbours of point 
5.1.1 Minimizing Distance between Distributions
The automatic update of fingerprints is required to align the outdated CSI data with the up-to-date WiFi environment. In our task of fingerprint reconstruction, the source domain is the outdated CSI data and the target domain is the newly collected CSI data. When the position of an AP changes, we are likely to collect different CSI data at the same position. That is to say, the distribution of the CSI vectors with respect to the position where the data is collected changes. To minimize the difference between the source and target distributions, we use a revised maximum mean discrepancy (MMD) measure, which quantifies the extent to which two distributions resemble one another. It is computed by summing the distances between the means of the projected outdated samples and the means of the projected updated samples over the points where both updated labeled samples and outdated samples exist. Here we use the Euclidean distance and rewrite it using matrix traces.
(14)  
where and are defined as:
(15)  
(16) 
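As a reading aid, the mean-matching distance described in this subsection can be written out in one generic form. The symbols are assumptions chosen for this sketch and are not necessarily the notation of Equations (14) to (16): $W$ is the transform matrix, $\bar{x}^{s}_{i}$ and $\bar{x}^{t}_{i}$ are the mean outdated and mean updated CSI samples at point $i$, and $\mathcal{P}$ is the set of points where both kinds of samples exist.

```latex
D(W) = \sum_{i \in \mathcal{P}}
       \left\| W^{\top}\bar{x}^{s}_{i} - W^{\top}\bar{x}^{t}_{i} \right\|_2^2
     = \operatorname{tr}\!\left( W^{\top}
       \Big[ \sum_{i \in \mathcal{P}}
       (\bar{x}^{s}_{i} - \bar{x}^{t}_{i})(\bar{x}^{s}_{i} - \bar{x}^{t}_{i})^{\top}
       \Big] W \right)
```

The second equality follows from $\|W^{\top}v\|_2^2 = \operatorname{tr}(W^{\top} v v^{\top} W)$, which is what allows the distance to be expressed with matrix traces.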
5.1.2 Minimizing Intra-Class Distance
When minimizing the distance between distributions, important properties of the CSI data, such as stability, should be maintained. One such property is that the projected samples within the same class ought to be clustered as tightly as possible. The intra-class distance measures the dispersion of the samples collected at a given point:
(17)  
Equation (17) measures the dispersion within one class. Summing over all classes yields the intra-class distance of all samples:
(18) 
5.1.3 Maximizing Inter-Class Distance
In order to distinguish classes more readily, the separation between classes should be maximized. The separation is measured by the global dispersion minus the intra-class distance. The global dispersion is defined as the sum of the distances between every projected outdated sample and the average of the projected outdated samples over all site-survey points.
(19) 
Then we have
(20)  
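The relation between the three dispersion measures can be checked numerically with a small sketch. It uses squared Euclidean distances and takes the global mean over all samples, which coincides with the average over points when every point has the same number of samples; both choices, and the function name `dispersions`, are assumptions of this illustration.

```python
def dispersions(samples_by_point):
    """samples_by_point: {point_id: [vector, ...]} of projected samples.

    Returns (intra_class, inter_class), where the inter-class separation is
    the global dispersion minus the intra-class distance.
    """
    def mean(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    all_samples = [x for xs in samples_by_point.values() for x in xs]
    global_mean = mean(all_samples)
    intra = sum(sq_dist(x, mean(xs))
                for xs in samples_by_point.values() for x in xs)
    total = sum(sq_dist(x, global_mean) for x in all_samples)
    return intra, total - intra
```

Two well-separated points with tightly grouped samples give a small intra-class value and a large inter-class value, which is the situation the projection is meant to encourage.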
5.1.4 Closer Points Sharing Similar CSI Data
One special property of the CSI radio map is that points spatially closer to one another share similar CSI fingerprints, as shown in Fig. 7. Generally, nearby points tend to have a much lower fingerprint distance than points far apart. If this rule does not hold, the system may pick up “CSI neighbors” that are spatially far away from the user as its predictions. Therefore, we set a rule that the projected CSI distances between neighbors should be smaller than the distances between faraway points.
(21) 
where . Let for , we have
(22) 
Similarly,
(23) 
where and for .
Having defined these two metrics, we can introduce an inequality that keeps the projected CSI distance between spatially neighboring points small while driving that between other points far apart:
(24) 
5.1.5 Algorithm
By incorporating the objectives in the above subsections, we formulate the optimization problem as maximizing the inter-class distance while minimizing the distance between distributions and the intra-class distance, together with a regularization term. Due to the arbitrariness of the absolute value of , it makes sense to fix , whose trace is , as the identity matrix and maximize the other values. The objective function can be represented as:
(25)  
s.t.  (26) 
where the two parameters weight the intra-class distance and the Frobenius norm, respectively. Substituting Equations (14), (18) and (20) into Equation (26), we obtain:
s.t.  (27) 
The Lagrangian approach is used to find the optimum of Equation (27). The Lagrangian is:
(28) 
where is the Lagrangian multiplier with as the rank of the matrix . To find the minimum value, the derivative should equal the zero matrix, that is,
(29) 
By multiplying on the left of both sides of Equation (29) and substituting the result into Equation (28), we simplify the above optimality condition as:
(30) 
(31) 
which can be rewritten as:
(32) 
(33) 
We adopt an approach similar to kernel Fisher discriminant (KFD) analysis [24] to find the optimal solution.
According to Equation (29) we can derive:
(34) 
from which we can get:
(35) 
Note that the inverse of the Lagrange multiplier is , a diagonal matrix. For the th column vector in , we have:
(36) 
which suggests that each diagonal entry is one of the eigenvalues of the matrix in Equation (36), and that the transform matrix is a combination of independent column eigenvectors. Considering Equation (33), the larger the trace is, i.e., the larger the sum of the selected eigenvalues is, the larger the objective function is. Therefore, the transform matrix is made up of the independent eigenvectors associated with the largest eigenvalues. We further introduce the additional inequality in Equation (24) to refine the selection of the eigenvectors. The solution is then the matrix whose columns are the eigenvectors with the largest eigenvalues that satisfy the inequality constraint.
5.2 Matching Rule
We propose the edge-enhanced K-nearest neighbors algorithm (EEKNN) as the matching rule for indoor localization. It specifically tackles the decreasing accuracy of fingerprinting-based localization on the edges and at the corners, where the testing points have fewer training points as neighbors.
5.2.1 Motivation
EEKNN is derived from weighted K-nearest neighbors (WKNN) and is motivated by the following two observations.
Profound Perspective of CSI. We first take a closer look at the role that CSI plays in localization. In Fig. 8, we plot the relationship between the spatial distances and the fingerprint distances of all training points with respect to a certain location. Yellow circles are generated from the mean value of all samples at one position, and blue ones are generated from individual samples. The red line is the least-squares linear regression.
Compared with RSS, CSI is much more stable. The blue circles of RSS are scattered in Fig. 8(b), while those of CSI cluster around the average (yellow circles) in Fig. 8(a). Such stability is beneficial to localization.
On the other hand, the fingerprint differential across spatial distance is comparatively low for CSI, which is a deficiency of CSI. The slope of the red line for CSI is not as steep as that for RSS. For example, Fig. 8(c) and Fig. 8(d) demonstrate the process of seeking the three nearest neighbors: the dash-dotted blue line moves upwards until three neighbors are found. Due to the gentle slope, the CSI neighbors tend to be spatially much farther away than the RSS neighbors, which reduces the localization accuracy. That is to say, the number of neighbors in WKNN has a more critical influence on CSI-based localization, especially for edge points, which have fewer spatially nearest neighbors.
Corner points suffer larger errors. When implementing WKNN in our system, we observe that the predicted positions of corner points are usually farther from the corner than they actually are. The reason is that all the neighbors of a corner point lie on one side of it, thus pulling the prediction in one direction, away from the corner. Even worse, unlike central points, a corner point has only two spatially nearest neighbors (Fig. 9). When K is greater than two, WKNN may pick up a faraway training point as a neighbour, thus incurring large errors.
Algorithm. Edge-enhanced K-nearest neighbors (EEKNN) is a method proposed on the basis of these two observations. It improves the accuracy at edge and corner points while maintaining the accuracy at central points. The main idea is to automatically adjust the number of neighbors and the weights of different neighbors on the basis of their spatial locations. Recall that edge and corner points tend to have fewer nearest neighbors. Hence, we decrease the number of neighbors once corner (or edge) training points are selected. Moreover, these points are given higher weights so as to pull the predictions back towards the edge.
The algorithm works as follows. We refer to the inverse of the number of spatially nearest neighbors that a training point has as its neighbor portion:
(37) 
For each testing point, we select neighbors, in order of increasing fingerprint distance, until their neighbor portions sum up to a preset total, which equals 1 in the default settings. We then weight these training points not only by their fingerprint distances but by their neighbor portions as well. The larger the neighbor portion of a training point is, the higher the proportion it accounts for among all neighbors.
(38)  
where is the Euclidean distance as it is in WKNN, is the estimated location, and is the position of the th neighbor.
Under this definition, edge or corner training points are assigned a larger neighbor portion than central ones. As for the testing points, those on the edges are more likely to pick up neighbors with a larger neighbor portion. For example, in Fig. 9, a central point with four nearest training-point neighbors tends to pick four neighbors whose neighbor portions equal 1/4 each, while a corner point probably picks up only two nearest neighbors, whose neighbor portions equal 1/2. Hence, the number of neighbors for central points remains unchanged, while that for edge and corner points goes down.
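The selection and weighting steps can be sketched as follows. The weight used here, neighbor portion divided by fingerprint distance, is one plausible reading of Equation (38) and should be treated as an assumption; the function name `eeknn_locate`, the tuple layout of the training data and the small constant guarding against division by zero are likewise hypothetical.

```python
from math import dist


def eeknn_locate(query_fp, train, portion_budget=1.0):
    """Estimate a position with an edge-enhanced weighted KNN sketch.

    train : list of (position, fingerprint, n_spatial_neighbors) tuples
    Neighbors are taken in order of fingerprint distance until their
    neighbor portions (1 / n_spatial_neighbors) add up to portion_budget.
    """
    ranked = sorted(train, key=lambda t: dist(query_fp, t[1]))
    chosen, used = [], 0.0
    for pos, fp, n_nb in ranked:
        if used >= portion_budget - 1e-9:
            break
        portion = 1.0 / n_nb
        chosen.append((pos, dist(query_fp, fp), portion))
        used += portion
    # Weight grows with the neighbor portion, shrinks with fingerprint distance.
    weights = [p / max(d, 1e-9) for _, d, p in chosen]
    total = sum(weights)
    dims = len(chosen[0][0])
    return tuple(
        sum(w * pos[k] for w, (pos, _, _) in zip(weights, chosen)) / total
        for k in range(dims)
    )
```

With corner-like training points (two spatial neighbors each) the loop stops after two neighbors, whereas central training points (four spatial neighbors each) lead to four neighbors being used, as in the Fig. 9 example.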
6 Experimental Setup
In this section, we describe the software and hardware configurations for CSI fingerprinting, and the performance metrics for indoor localization.
6.1 Devices
We implement CRISLoc using TP-Link TL-WR885N routers and Nexus 5 smartphones as mobile devices. The whole system works at 2.4 GHz with a bandwidth of 20 MHz. Moreover, we select a relatively empty channel to avoid interference from adjacent traffic.
RSS. We develop an Android application to extract RSS from multiple APs simultaneously with an interface provided by Android Studio. The extracted RSS is expressed in decibels, ranging from -100 to 0 dBm. For routers whose signals are too weak to be detected by our Android app, we set the RSS to the minimum value of -100 dBm by default.
CSI. A Nexus 5 with Nexmon [25] installed overhears frames transmitted by the router and extracts the CSI from the router to the Nexus 5, at up to approximately 100 frames per second. The rate decreases as the distance between the Nexus 5 and the router increases. When the router and the smartphone are so far apart that frames cannot be captured, we set the CSI to 0 on every subcarrier.
6.2 Scenarios
We implement CRISLoc in two different typical indoor localization scenarios:
Research laboratory. First, we set up a testbed on the desks in the center of a research laboratory (Fig. 10(a)). There are few multipath reflections and little disturbance. RSS and CSI are collected at 90 positions with spacing. Nine APs are placed inside the room.
Academic Building. Then, we conduct the experiment on the third floor of an academic building (Fig. 10(b)). The test area is much more complex, with many obstacles around. People walk around while fingerprints are collected, which disturbs the data. The area covers an office, a corridor, and a lobby, and it is divided into grids with the edge width of on average at heights ranging from 0.8 m to 1.5 m. Ten APs in all are deployed: five in the corridor and five in the office.
Points marked with circles in Fig. 11 are used both as training points in the site survey and as testing points; among them are eight evenly distributed reference points. The other half of the points, those marked with triangles, are used as testing points only. The training points and testing points are distributed alternately.
6.3 Evaluation metrics
Unlike traditional multi-class classification in machine learning, where exactly one class is selected as the result, altered AP detection does not know in advance whether, or how many, APs are altered: there are cases where none is altered and cases where multiple APs are altered. It is therefore necessary to redefine the concepts of true positive (TP), true negative (TN), false positive (FP), false negative (FN), and the confusion matrix.
TP, TN, FP and FN. We define these four concepts with respect to each AP separately. We increment $TP_i$ when an alarm occurs at the $i$-th AP and the $i$-th AP is actually altered, and $FP_i$ when an alarm occurs but the $i$-th AP is not altered. Similarly, $FN_i$ refers to the case in which no alarm occurs at the $i$-th AP although it is actually altered, and $TN_i$ refers to the case in which no alarm occurs and the $i$-th AP is not altered.
Precision, Recall and F1-score. With the new definitions of TP, TN, FP and FN, precision, recall and F1-score can be defined in the same way as in multi-class classification. Precision is calculated as
(39)  $P_{\mu} = \dfrac{\sum_{i} TP_i}{\sum_{i} (TP_i + FP_i)}$
where $\mu$ indicates micro-averaging and $i$ is the index of the AP.
Recall is calculated as
(40)  $R_{\mu} = \dfrac{\sum_{i} TP_i}{\sum_{i} (TP_i + FN_i)}$
The F1-score is the harmonic mean of precision and recall, calculated as
(41)  $F1_{\mu} = \dfrac{2 P_{\mu} R_{\mu}}{P_{\mu} + R_{\mu}}$
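The micro-averaged metrics pool the per-AP counts before dividing, which can be made concrete with a short sketch. The function name `micro_scores` and the convention of returning 0 when a denominator is empty are assumptions of this illustration.

```python
def micro_scores(tp, fp, fn):
    """Micro-averaged precision, recall and F1 from per-AP counts.

    tp, fp, fn : lists with one count per AP (same length and order).
    """
    tp_sum, fp_sum, fn_sum = sum(tp), sum(fp), sum(fn)
    precision = tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if tp_sum + fn_sum else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For two APs with counts TP = (8, 9), FP = (2, 1) and FN = (0, 3), the pooled counts are 17, 3 and 3, so precision, recall and F1-score all equal 0.85.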
Confusion Matrix. To take a closer look at the performance when a single AP or no AP is altered, we redefine the confusion matrix as well. For the $i$-th row of the matrix (except the last row), we alter the $i$-th AP. The last row, named ‘none’, represents the case in which no AP is altered. The figures in the matrix are the percentages of alarms out of the total test samples in each row. Note that this is not a standard confusion matrix as used in simple classification scenarios. Without knowing the actual number of altered APs, altered AP detection may take all APs as unaltered, or several APs as altered. Therefore, the sum of a row may be less than or greater than one.
7 Experimental Evaluation
In this section, we begin with a set of micro-benchmark experiments to validate the effectiveness of CSI calibration and EEKNN. We then evaluate the performance of altered AP detection and fingerprint reconstruction separately, and finally that of CRISLoc as a whole.
7.1 Effectiveness of CSI Calibration and EEKNN
In this set of experiments, shown in Fig. 12, we evaluate the accuracy of localization using different forms of fingerprints, namely uncalibrated CSI, calibrated CSI, and RSS. Uncalibrated CSI yields poor results because, owing to automatic gain control, it does not follow the rule that closer points share similar fingerprints. Calibrated CSI performs better than RSS in two respects. First, the overall error of calibrated CSI is lower than that of RSS: the mean distance error of CSI-based localization is 36.1% lower than that of RSS in the research laboratory, and 35.6% lower in the academic building. Second, CSI-based localization produces fewer outliers, due to the high stability of calibrated CSI.
Fig. 12 demonstrates the effectiveness of EEKNN as well. By using an adaptive neighbor portion, EEKNN avoids picking up a physically faraway neighbour and greatly improves localization accuracy, especially for corner and edge points. In the academic building, which has more corners and edges, the mean distance error of calibrated CSI is reduced by 34.1%, while the error is reduced by 21.3% in the research laboratory. Besides, since CSI is more likely to pick up a faraway neighbor as K goes up, EEKNN benefits calibrated CSI more than RSS. The mean error of RSS decreases by 11.6% in the building and 11.2% in the lab.
7.2 Altered AP Detection
7.2.1 Detection with RPs
The approach of detection with RPs is based on the CSI collected at the RPs from each AP separately. Therefore, the result of this approach is independent for each AP: the precision and recall of one AP remain unchanged regardless of whether other APs are altered. It is thus redundant to present the correlation between APs in a confusion matrix. The performance of detection with RPs is reported in Table II and Table III, for the eight APs in the lab and the nine APs in the building, respectively. The recall and precision reach 100% and 99.4%.
Table II (research laboratory, eight APs)
AP         No.1   No.2   No.3   No.4   No.5   No.6   No.7   No.8
Altered    100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
Unaltered  1.0    0.2    0.0    0.1    0.0    0.0    0.0    0.0

Table III (academic building, nine APs)
AP         No.1   No.2   No.3   No.4   No.5   No.6   No.7   No.8   No.9
Altered    100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
Unaltered  1.0    1.0    1.2    0.9    1.0    0.9    0.8    0.8    0.9
7.2.2 Detection without RPs
We first compare the joint approach against clustering alone and outlier detection alone in Fig. 13. As expected, clustering alone [26] suffers from lower precision and outlier detection alone suffers from lower recall, and both effects are more pronounced in the academic building. The joint approach avoids both disadvantages and achieves the best F1-score.
We next examine the performance of the joint approach when a single AP is moved, as shown in Fig. 14. Generally, the performance in the research laboratory is slightly better than that in the academic building.
Finally, we illustrate the impact of the number of altered APs in Fig. 15. In reality, it is unlikely that multiple APs are altered simultaneously, so we only examine the cases in which one, two or three APs are altered. The precision, recall and F1-score inevitably decrease slightly as the number of altered APs increases, but the F1-scores remain above 68%.
7.3 Fingerprint Reconstruction
In this section, we evaluate the performance of fingerprint reconstruction separately, i.e., assuming that the prediction of the altered APs is correct. We examine two aspects: the error reduction achieved by reconstruction and the impact of the number of RPs.
7.3.1 Error Reduction by Fingerprint Reconstruction
We hereby assess the performance of fingerprint reconstruction with eight RPs. Our reconstruction has three advantages, as shown in Figs. 16-18.
First, CRISLoc reduces the localization errors when a fraction of the APs are altered. In particular, when a single AP is altered, which is the most frequent case, the mean error is reduced from 0.46 m (using the out-of-date database) to 0.30 m in the research lab, only slightly higher than the 0.28 m of the initial situation in which no AP is altered.
Second, fingerprint reconstruction curbs the dramatic growth of errors as the number of altered APs increases.
Third, CSI-based localization outperforms RSS-based localization regardless of whether the fingerprints are newly collected, outdated, or reconstructed. The dashed lines in Figs. 16-18 show the performance of a comparable RSS-based fingerprinting localization system. This RSS-based system reconstructs its fingerprints with LAAFU [26], which requires the stringent condition that the fingerprints follow the path loss model. LAAFU cannot be applied to CSI, since the CSI amplitude of each subcarrier does not follow the path loss model as RSS does. Worse still, LAAFU fails when the environment is complex and RSS itself does not follow this model. Transfer learning therefore yields better results.
7.3.2 Impacts of the Number of RPs
To understand how fingerprint reconstruction is influenced by the number of RPs, we measure the accuracy of fingerprints reconstructed with one to ten evenly distributed RPs in the research lab in Fig. 19. Note that “none” represents the situation without reconstruction and “inf” represents the initial situation before any AP is altered. The reduction in error is most pronounced when the number of RPs increases from zero to four (no more than 9% of the number of training points) in the research laboratory. This proportion varies with the complexity of the environment.
7.4 Overall Performance
The overall performance is shown in Fig. 20. Since detection with RPs has extremely high precision and recall, the overall performance with RPs is very similar to the results in Section 7.3.1. Hence, we only present the overall performance without RPs here. The green, blue, and red lines refer to the cases in which no AP, one AP, and two APs are actually altered. In all the cases, the system does not know how many APs are altered, let alone which ones.
The system exhibits high localization accuracy. Take the research laboratory as an example. Compared with the baseline in which the system knows that no AP is altered, the mean error in the ‘none’ case increases by only 4.6%, to 0.29 m. This shows that the system hardly ever mislabels unaltered APs as altered and that the localization accuracy remains high even when a few false alarms occur. When one or two APs are altered, the mean error rises by 5.4 cm and 8.6 cm, respectively, compared with the ‘none’ case. These results verify the effectiveness of detection and fingerprint reconstruction.
8 Related Work
8.1 Measurements and Matching Rules
RSS and SNR, which are readily accessible on smartphones, are used in traditional WiFi fingerprinting [1][2][3][27][4][5]. He and Chan discussed how the WLAN chips of smartphones influence the RSS across different devices [3], and methods have been proposed to calibrate the heterogeneity of smartphones [27][4]. With the introduction of the CSI Tool [13], WiFi fingerprint localization systems such as [8] achieved higher accuracy because of the high-dimensional nature of CSI. CSI amplitudes and phases are separately utilized for localization with deep learning in [11] and [12]. However, the CSI Tool can only be run on computers, which makes CSI-based localization less feasible in daily life than in experimental settings.
Recent research has focused on the efficiency of various matching rules [28]. Furthermore, unlike RSS, CSI forms a vector-valued fingerprint, which requires more sophisticated matching rules. A number of positioning algorithms that match the online measurement with offline fingerprints have been employed. K-nearest neighbors (KNN), support vector machine (SVM), and maximum a posteriori estimation (MAP) were used in [29], [30], and [8], respectively. Besides, the authors in [31][11][12] made use of deep learning to perform localization.
8.2 Site-survey Overhead Reduction
One weakness of fingerprinting-based localization is that fingerprint collection is time-consuming and expensive. Various studies have been conducted to lower the site-survey overhead. The authors in [32] determined the latent-space locations of unlabeled RSS data with the Gaussian Process Latent Variable Model (GPLVM). Zee [33] measured acceleration and orientation while performing WiFi scans, and the inferred locations were then used as labels of the WiFi fingerprints. WILL [34] leveraged the property that WiFi signals are remarkably attenuated when passing through walls to construct the radio floor plan. UnLoc [35] identified sensory signatures with WiFi sensors, accelerometers, and compasses. It then used a dead-reckoning scheme to track users between these landmarks, thus bypassing the need for war-driving. Walkie-Markie [36] exploited WiFi landmarks and crowdsourced trajectory information to automatically build internal pathway maps of buildings.
Given that the signal environment changes over time and regular site surveys are required to maintain localization accuracy, many researchers aim to relieve this high cost by automatically reconstructing the fingerprints. The authors in [37] employed a linear regression model to encode the correlations between the RSS at RPs and non-RPs based on the initial fingerprints, and then updated the signal strength at non-RPs with the newly collected data at RPs. LAAFU [26] updated the fingerprints using Gaussian process regression and the path loss model. Yet this approach can be applied to CSI only if the power of CSI, instead of the CSI vector, is used as the fingerprint, because the CSI amplitude of each subcarrier does not itself follow the path loss model. The information conveyed by different subcarriers is then lost and the localization accuracy decreases significantly.
Transfer learning has been used in site-survey reduction as well. The work in [38] learned a low-dimensional manifold shared between the RSS collected in different areas, which enabled the localization model to be transferred from one area of a building to another. The authors in [39] learned distance metrics that gather RSS vectors in the same spatial cluster and separate RSS vectors in different spatial clusters, based on a well-built fingerprint database. The learned metrics reduced the number of site-survey points required for constructing an accurate fingerprint database. Transfer kernel learning (TKL) was used in [40] to match the outdated and updated RSS distributions in the reproducing kernel Hilbert space. The kernel was then taken as the input of the localization model, which provided accurate location estimation despite environmental dynamics. In [41] and [42], MMD also serves as the metric in transfer learning, based on which fingerprints are reconstructed in [41].
8.3 Other Localization Approaches
Angle-of-arrival (AoA)-based and time-of-flight (ToF)-based solutions are the two other localization methods. These approaches were developed based on the MUSIC algorithm [43], and the first implementation was ArrayTrack [6]. By using frequency-agile wireless networks, ToneTrack [7] increased the effective bandwidth and improved localization accuracy. However, due to blocked direct paths and hardware imperfections, the accuracy of these solutions was usually low in real-world applications.
Apart from WiFi signals, some other wireless communication services have been utilized for localization as well. Radio frequency identification (RFID)-based localization such as [44] and [45] achieved high accuracy, yet required additional RFID tags. Besides, beacon nodes were utilized for localization in [46] and [47].
9 Conclusion
CSI-based fingerprinting localization has attracted considerable interest, but it is limited by commodity devices and is difficult to implement on off-the-shelf smartphones. Moreover, APs are inevitably altered as time goes by, which greatly reduces the localization accuracy. In this paper, we present CRISLoc, a system that exploits CSI as fingerprints and automatically reconstructs the CSI fingerprints for smartphone localization. We extract CSI on a Nexus 5 with 20 MHz bandwidth at 2.4 GHz without establishing connections and use it efficiently with our novel matching rule EEKNN, which reduces the error by 21 to 34 percent. In addition, our system detects AP alteration with a novel cluster-outlier joint approach that achieves high F1-scores, and reconstructs the fingerprints accordingly by transfer learning. The mean error rises by no more than 10 cm when APs are altered.
References
 [1] P. Bahl and V. N. Padmanabhan, “Radar: an in-building rf-based user location and tracking system,” in Proceedings IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No.00CH37064), vol. 2, March 2000, pp. 775–784 vol.2.
 [2] M. Youssef and A. Agrawala, “The horus wlan location determination system,” in Proceedings of the 3rd international conference on Mobile systems, applications, and services. ACM, 2005, pp. 205–218.
 [3] S. He and S.-H. G. Chan, “Wi-fi fingerprint-based indoor positioning: Recent advances and comparisons,” IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 466–490, 2015.
 [4] C. Laoudias, D. Zeinalipour-Yazti, and C. G. Panayiotou, “Crowdsourced indoor localization for diverse devices through radiomap fusion,” in International Conference on Indoor Positioning and Indoor Navigation. IEEE, 2013, pp. 1–7.
 [5] X. Tian, M. Wang, W. Li, B. Jiang, D. Xu, X. Wang, and J. Xu, “Improve accuracy of fingerprinting localization with temporal correlation of the rss,” IEEE Transactions on Mobile Computing, vol. 17, no. 1, pp. 113–126, 2017.
 [6] J. Xiong and K. Jamieson, “Arraytrack: A fine-grained indoor location system,” in Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), 2013, pp. 71–84.
 [7] J. Xiong, K. Sundaresan, and K. Jamieson, “Tonetrack: Leveraging frequency-agile radios for time-based indoor wireless localization,” in Proceedings of the 21st Annual International Conference on Mobile Computing and Networking. ACM, 2015, pp. 537–549.
 [8] J. Xiao, K. Wu, Y. Yi, and L. M. Ni, “Fifs: Fine-grained indoor fingerprinting system,” in 2012 21st international conference on computer communications and networks (ICCCN). IEEE, 2012, pp. 1–7.
 [9] Y. Xie, Z. Li, and M. Li, “Precise power delay profiling with commodity wifi,” IEEE Transactions on Mobile Computing, vol. 18, no. 6, pp. 1342–1355, 2018.
 [10] P. Davidson and R. Piché, “A survey of selected indoor positioning methods for smartphones,” IEEE Communications Surveys & Tutorials, vol. 19, no. 2, pp. 1347–1370, 2016.
 [11] X. Wang, L. Gao, S. Mao, and S. Pandey, “Deepfi: Deep learning for indoor fingerprinting using channel state information,” in 2015 IEEE wireless communications and networking conference (WCNC). IEEE, 2015, pp. 1666–1671.
 [12] X. Wang, L. Gao, and S. Mao, “Phasefi: Phase fingerprinting for indoor localization with a deep learning approach,” in 2015 IEEE Global Communications Conference (GLOBECOM). IEEE, 2015, pp. 1–6.
 [13] D. Halperin, W. Hu, A. Sheth, and D. Wetherall, “Tool release: Gathering 802.11n traces with channel state information,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 1, pp. 53–53, Jan. 2011.
 [14] M. Schulz, J. Link, F. Gringoli, and M. Hollick, “Shadow wifi: Teaching smartphones to transmit raw signals and to extract channel state information to implement practical covert channels over wifi,” in Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2018, pp. 256–268.
 [15] S. Y. Seidel and T. S. Rappaport, “914 mhz path loss prediction models for indoor wireless communications in multifloored buildings,” IEEE transactions on Antennas and Propagation, vol. 40, no. 2, pp. 207–217, 1992.
 [16] A. M. Tulino, A. Lozano, and S. Verdu, “Impact of antenna correlation on the capacity of multiantenna channels,” IEEE Transactions on Information Theory, vol. 51, no. 7, pp. 2491–2509, 2005.
 [17] S. A. Dudani, “The distance-weighted k-nearest-neighbor rule,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6, no. 4, pp. 325–327, April 1976.
 [18] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, Oct 2010.
 [19] A. Gretton, K. Borgwardt, M. Rasch, B. Schölkopf, and A. J. Smola, “A kernel method for the two-sample-problem,” in Advances in Neural Information Processing Systems 19, B. Schölkopf, J. C. Platt, and T. Hoffman, Eds. MIT Press, 2007, pp. 513–520.
 [20] K. P. F.R.S., “X. on the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 50, no. 302, pp. 157–175, 1900.
 [21] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, ser. KDD’96. AAAI Press, 1996, pp. 226–231.
 [22] E. Schubert, J. Sander, M. Ester, H. Peter Kriegel, and X. Xu, “Dbscan revisited, revisited: Why and how you should (still) use dbscan,” ACM Transactions on Database Systems, vol. 42, pp. 1–21, 07 2017.
 [23] G. F. Jenks, “The data model concept in statistical mapping,” International yearbook of cartography 7, pp. 186–190, 1967.

 [24] K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf, “An introduction to kernel-based learning algorithms,” IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 181–201, March 2001.
 [25] M. Schulz, D. Wegemer, and M. Hollick. (2017) Nexmon: The c-based firmware patching framework.
 [26] S. He, W. Lin, and S.-H. G. Chan, “Indoor localization and automatic fingerprint update with altered ap signals,” IEEE Transactions on Mobile Computing, vol. 16, no. 7, pp. 1897–1910, July 2017.
 [27] A. M. Hossain, Y. Jin, W.-S. Soh, and H. N. Van, “Ssd: A robust rf location fingerprint addressing mobile devices’ heterogeneity,” IEEE Transactions on Mobile Computing, vol. 12, no. 1, pp. 65–77, 2011.
 [28] P. Müller, M. Raitoharju, and R. Piché, “A field test of parametric wlan-fingerprint-positioning methods,” in 17th International Conference on Information Fusion (FUSION). IEEE, 2014, pp. 1–8.
 [29] S. Sen, B. Radunovic, R. R. Choudhury, and T. Minka, “You are facing the mona lisa: Spot localization using phy layer information,” in Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’12. New York, NY, USA: ACM, 2012, pp. 183–196.
 [30] Chao-Lin Wu, Li-Chen Fu, and Feng-Li Lian, “Wlan location determination in e-home via support vector classification,” in IEEE International Conference on Networking, Sensing and Control, 2004, vol. 2, March 2004, pp. 1026–1031 Vol.2.
 [31] X. Wang, L. Gao, S. Mao, and S. Pandey, “Csi-based fingerprinting for indoor localization: A deep learning approach,” IEEE Transactions on Vehicular Technology, vol. 66, no. 1, pp. 763–776, Jan 2017.
 [32] B. Ferris, D. Fox, and N. Lawrence, “Wifi-slam using gaussian process latent variable models,” in Proceedings of the 20th International Joint Conference on Artifical Intelligence, ser. IJCAI’07. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2007, pp. 2480–2485.
 [33] K. K. Chintalapudi, V. Padmanabhan, R. Sen, and K. Chintalapudi, “Zee: Zero-effort crowdsourcing for indoor localization,” in Mobicom, August 2012.
 [34] C. Wu, Z. Yang, Y. Liu, and W. Xi, “Will: Wireless indoor localization without site survey,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, April 2013.
 [35] H. Wang, S. Sen, A. Elgohary, M. Farid, M. Youssef, and R. R. Choudhury, “No need to wardrive: Unsupervised indoor localization,” in Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’12. New York, NY, USA: ACM, 2012, pp. 197–210.
 [36] G. Shen, Z. Chen, P. Zhang, T. Moscibroda, and Y. Zhang, “Walkie-markie: Indoor pathway mapping made easy,” in Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13). Lombard, IL: USENIX, 2013, pp. 85–98.
 [37] V. W. Zheng, E. W. Xiang, Q. Yang, and D. Shen, “Transferring localization models over time.” in AAAI, 2008, pp. 1421–1426.
 [38] S. J. Pan, D. Shen, Q. Yang, and J. T. Kwok, “Transferring localization models across space.” in AAAI, 2008, pp. 1383–1388.
 [39] K. Liu, H. Zhang, J. K. Ng, Y. Xia, L. Feng, V. C. S. Lee, and S. H. Son, “Toward low-overhead fingerprint-based indoor localization via transfer learning: Design, implementation, and evaluation,” IEEE Transactions on Industrial Informatics, vol. 14, no. 3, pp. 898–908, March 2018.
 [40] H. Zou, Y. Zhou, H. Jiang, B. Huang, L. Xie, and C. Spanos, “Adaptive localization in dynamic indoor environments by transfer kernel learning,” in 2017 IEEE Wireless Communications and Networking Conference (WCNC), March 2017, pp. 1–6.

 [41] M. Long, J. Wang, G. Ding, J. Sun, and P. S. Yu, “Transfer feature learning with joint distribution adaptation,” in 2013 IEEE International Conference on Computer Vision, Dec 2013, pp. 2200–2207.
 [42] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, “Domain adaptation via transfer component analysis,” IEEE Transactions on Neural Networks, vol. 22, no. 2, pp. 199–210, 2010.
 [43] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE transactions on antennas and propagation, vol. 34, no. 3, pp. 276–280, 1986.
 [44] J. Wang, D. Vasisht, and D. Katabi, “Rfidraw: virtual touch screen in the air using rf signals,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 4, pp. 235–246, 2015.
 [45] C. Jiang, Y. He, X. Zheng, and Y. Liu, “Orientation-aware rfid tracking with centimeter-level accuracy,” in Proceedings of the 17th ACM/IEEE International Conference on Information Processing in Sensor Networks, ser. IPSN ’18. Piscataway, NJ, USA: IEEE Press, 2018, pp. 290–301.
 [46] X. Wu, R. Shen, L. Fu, X. Tian, P. Liu, and X. Wang, “ibill: Using ibeacon and inertial sensors for accurate indoor localization in large open areas,” IEEE Access, vol. 5, pp. 14 589–14 599, 2017.
 [47] X. Tong, K. Liu, X. Tian, L. Fu, and X. Wang, “Fineloc: A fine-grained self-calibrating wireless indoor localization system,” IEEE Transactions on Mobile Computing, 2018.
Appendix: Additional Notations
Notation  Definition 

&  Estimated location & position of the th neighbor 
The weight for the th neighbor in WKNN and EEKNN  
The whole set of APs & a subset of APs  
Radius of neighborhood in DBSCAN  
The minimum number of neighbors as a core point in DBSCAN  
The threshold for the joint clustering-outlier approach  
The predicted class of altered & unaltered APs  
The frequency of the th AP as an altered AP  
Dispersion of location results in one sample  
Average of dispersion among all the samples  
Standard deviation of dispersion among all the samples  
Adaptive weight factor of variance in Jenks method  
The threshold of reliability level in sequential analysis  
The minimum number of samples in sequential analysis  
The maximum number of samples in sequential analysis  
The neighbor portion of training point or the th neighbor  
The total neighbor portion for a testing point, by default one 