I Introduction
In recent years, the number of smartphones has grown rapidly, with a profound influence on personal lives. Meanwhile, the Internet of Things (IoT) has promoted the development of information technology. The real leap forward comes from combining IoT with mobility, which offers opportunities to reimagine everyday life. One promising application of IoT on mobile devices is indoor positioning. More recently, location-based services (LBS) have been used in airports, shopping malls, supermarkets, stadiums, office buildings, and homes [1][2]. Generally, there are three approaches to indoor localization: the cell-based approach, the model-based approach, and the fingerprint-based approach. In this paper, we focus on radio frequency (RF) fingerprint-based localization, which is one of the most promising approaches.
RF fingerprint-based indoor localization [3][4] consists of two phases: training and operating. In the training phase, radio signal strength (RSS) fingerprints are collected from different access points (APs) at reference points (RPs) in the region of interest (ROI). The server uses the collected fingerprint samples to train a TGAN, taking the fingerprint samples as feature vectors and the RPs' coordinates as labels. In the operating phase, a user downloads the trained TGAN and uses a newly sampled fingerprint to obtain a location estimate. Fig. 1 shows the schema of the indoor localization scenario.
Existing works on RF fingerprint-based indoor localization have the following limitations. First, in the training phase, constructing a fingerprint database as a prior is impractical because high-granularity fingerprint samples are hard to collect [3][4]. Second, on a mobile device, the storage space may not meet the minimum requirement for storing the fingerprint database, and the computing capability, usually tens or hundreds of million floating-point operations per second (MFLOPS), is also insufficient for such data. Relying on a server for storage and computation introduces communication delay, which is a challenge for real-time localization applications. In this context, we explore a novel indoor localization approach based on deep learning.
Inspired by the powerful performance of generative adversarial networks (GAN) [5][6], we propose a tensor-based generative adversarial network (TGAN) to achieve super-resolution of the fingerprint tensor. The advantage of GAN lies in recovering the texture of the input data, which it does better than other super-resolution approaches. We also adopt the transform-based approach introduced in [7] to construct a new tensor space by defining a new multiplication operation and tensor products. We believe that, with these multilinear modeling tools, our tensor model can process data with multidimensional features more efficiently.
The paper is organized as follows. Section II presents the transform-based tensor model with real-world trace verification. Section III describes the architecture of TGAN and gives two algorithms, one for high-resolution tensor derivation and one for the training process. Section IV evaluates TGAN and carries out a trace-based localization experiment. Section V concludes the paper.
II Transform-Based Tensor Model
In this section, we model RF fingerprints as a 3D low-tubal-rank tensor. We begin by outlining the notation and taking a transform-based approach to define a new multiplication operation and tensor products in a multilinear tensor space, as in [3]. In this multilinear tensor space, multidimensional-order tensors act as linear operators, analogous to matrices in the conventional matrix space. In this paper, we use a third-order tensor to represent the RSS map. When the discrete transform of interest is the discrete Fourier transform, the multiplication operation represents circular convolution.
Notations. In this paper, scalars are denoted by lowercase letters, e.g. ; vectors are denoted by boldface lowercase letters, e.g. , and the transpose is denoted as ; matrices are denoted by boldface capital letters, e.g. ; and tensors are denoted by calligraphic letters, e.g. , and the transpose is denoted by . We use to denote the set . The norm of tensor is denoted as . The Frobenius norm of a tensor is defined as .
Definition 1.
Tensor product: For two tensors , and . The tensor product . The is a tube given by , for and .
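As a minimal sketch of Definition 1, assuming the discrete Fourier transform case in which the product of two tubes reduces to circular convolution, the tensor product can be written directly from its definition. Tensors are stored here as nested lists indexed `[row][column][tube position]`; the function names and this storage layout are illustrative assumptions, not the authors' implementation. A practical version would use FFTs rather than the direct convolution shown.

```python
def circular_convolution(a, b):
    """Circular convolution of two tubes (equal-length lists of numbers)."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]


def t_product(A, B):
    """Tensor product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Tube (i, j) of the result is the sum over k of the circular
    convolution of tube A[i][k] with tube B[k][j].
    """
    n1, n2, n3 = len(A), len(A[0]), len(A[0][0])
    if len(B) != n2 or len(B[0][0]) != n3:
        raise ValueError("inner dimensions or tube lengths do not match")
    n4 = len(B[0])
    C = [[[0.0] * n3 for _ in range(n4)] for _ in range(n1)]
    for i in range(n1):
        for j in range(n4):
            for k in range(n2):
                tube = circular_convolution(A[i][k], B[k][j])
                for t in range(n3):
                    C[i][j][t] += tube[t]
    return C


def identity_tensor(n, n3):
    """Tensor whose first frontal slice is the n x n identity; all other slices are zero."""
    return [[[1.0 if (i == j and t == 0) else 0.0 for t in range(n3)]
             for j in range(n)] for i in range(n)]
```

A quick sanity check is that multiplying any tensor by `identity_tensor` of matching size returns the tensor unchanged, mirroring the role of the identity matrix in ordinary matrix multiplication.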
Definition 2.
Identity tensor: A tensor is identity if its first frontal slice is the identity matrix and all other frontal slices are zero.
Definition 3.
Orthogonal tensor: is orthogonal if .
Definition 4.
f-diagonal tensor: A tensor is f-diagonal if all of its frontal slices are diagonal matrices.
Definition 5.
t-SVD: can be decomposed as , where and are orthogonal tensors of sizes , respectively, and is a rectangular f-diagonal tensor of size .
Definition 6.
Tensor tubal-rank: The tensor tubal-rank of a third-order tensor is the number of non-zero fibers of the f-diagonal tensor in its t-SVD.
In this framework, we can apply t-SVD [3][7], which indicates that indoor RF fingerprint samples can be modeled as a low-tubal-rank tensor. Reference [7] relates the t-SVD decomposition to the conventional matrix SVD, which suggests that t-SVD may be a suitable approach for further processing. To verify this low-tubal-rank property, we obtain a real-world data set from [8], collected in an indoor region of size , shown in Fig. 4, located in a college building.
The region is divided into a grid map with grid size , and 21 APs are randomly deployed. Therefore, our ground-truth tensor is of size . Fig. 5 shows that, compared with matrix-SVD and CP decomposition [9], t-SVD indicates that the RF fingerprint data are more likely to have a low-tubal-rank structure. For t-SVD, 21 of the 64 singular values capture the same share of the fingerprint tensor's energy that requires 38 singular values with the matrix-SVD and CP decomposition methods. Therefore, the low-tubal-rank property of the transform-based tensor model suits the RSS fingerprint data better than the matrix model does.
In this section, we compared three decomposition methods and chose the transform-based tensor model, which is constructed in a new tensor space with 2D sparse coding. With the tensor calculations defined in [7], we are able to extend the conventional matrix space to third-order tensors. This framework enables wider tensor applications. In the next section, we introduce generative adversarial networks (GAN) to implement super-resolution based on the tensor model we propose.
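The comparison of how many singular values are needed to capture a given share of energy can be sketched as follows. This assumes that energy means the sum of squared singular values, which is the usual convention but is not spelled out in the text. The function name and the `fraction` parameter are hypothetical, and the sample values in the usage note are made up for illustration.

```python
def components_for_energy(singular_values, fraction):
    """Smallest number of leading singular values whose squared sum
    reaches `fraction` of the total energy (sum of squared values)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Sort energies in descending order so the largest components come first.
    energies = sorted((s * s for s in singular_values), reverse=True)
    total = sum(energies)
    if total == 0:
        return 0
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= fraction * total:
            return count
    return len(energies)
```

For example, with the illustrative values `[3, 2, 1, 1]` the energies are 9, 4, 1, 1, so `components_for_energy([3, 2, 1, 1], 0.8)` returns 2. A decomposition that needs fewer components for the same fraction exposes a lower-rank structure in the data.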
III Tensor Generative Adversarial Network
TGAN is used to generate extra samples from coarse RF fingerprint samples. The ultimate goal is to train a generator that estimates a finer-granularity tensor from a coarse-granularity input tensor . TGAN consists of three parts: the generator, named the tensor sparse coding network (TSCN); the discriminator; and the localization network, named the localizer. In this section, we introduce the whole architecture of TGAN and illustrate the data delivery and parameter updating process in TGAN, as shown in Fig. 6.
TSCN encodes the input coarse-granularity tensor into a sparse representation and its dictionary, with feedback from the discriminator. TSCN can therefore be regarded as two parts. The first step computes the sparse representation and the dictionary by implementing the Learned Iterative Shrinkage and Thresholding Algorithm based on Tensor (LISTAT), together with three fully connected layers that receive feedback from the discriminator. The output of the LISTAT algorithm is then modified accordingly. This scheme is similar to a sparse autoencoder, since TSCN's target function (2) has the same form as that of an autoencoder.
Let be the input, be the model distribution and be the data distribution. can be regarded as the encoding distribution of an autoencoder, or as TSCN correspondingly. Therefore, we can define that . Let be the arbitrary prior distribution of . By matching the aggregated posterior to an arbitrary prior distribution , the TSCN can be regularized. In the following, TSCN parametrized by is denoted as . Then, finding a suitable TSCN for generating a finer-granularity tensor out of a coarse tensor can be represented as: . The target adversarial min-max problem is:
(1)  
The encoding procedure from a coarse-granularity tensor to a higher-resolution tensor can be regarded as a super-resolution problem. The problem is ill-posed, and we model two constraints to solve it. The first is that the finer-granularity patches can be sparsely represented in an appropriate dictionary and that their sparse representations can be recovered from the coarse-granularity observation. The second is that the recovered tensor must be consistent with the input tensor for reconstruction. Corresponding to and , dictionaries are defined and . In our algorithm, the approach to recover from is to keep and as a same sparse representation . To simplify the tensor model, we should minimize the total number of coefficients of the sparse representation, which can be represented as , where higher resolution tensor , sparse representation coefficient and dictionary .
As suggested by [10], in order to make sparse, we present the optimization problem in equation (2). It can be simplified with a Lagrange multiplier to equation (3). Here, balances the sparsity of the solution and the fidelity of the approximation to .
(2) 
(3) 
Recent studies [11][12][13] indicate that there is an intimate connection between sparse coding and neural networks. To fit our tensor setting, we propose a back-propagation algorithm called the learned iterative shrinkage and thresholding algorithm based on tensor (LISTAT) to efficiently approximate the sparse representation (sparse coefficient matrix) of the input , using the trained dictionary to solve equation (2). The entire super-resolution algorithm described above is illustrated in Algorithm 1.
Along with the resolution recovery procedure, which obtains pairs of high- and coarse-granularity tensor patches that have the same sparse representations, we need to learn the two respective dictionaries and . We present an efficient way to learn a compact dictionary by training a single dictionary.
Sparse coding is the problem of finding sparse representations of signals with respect to an overcomplete dictionary . The dictionary is usually learned from a set of training examples . To optimize the coefficient matrix , we first fix and use ISTAT to solve the optimization problem. That is:
(4) 
where stands for the data reconstruction term , the Frobenius norm constraints on the term remove the scaling ambiguity, and stands for the sparsity constraint term .
ISTAT is used to solve this problem, which can be rewritten as a linearized, iterative functional around the previous estimate with proximal regularization and non-smooth regularization. Thus, at the th iteration, can be updated by equation (5). Taking to be the Lipschitz constant and to be the gradient defined in the tensor space, we obtain equation (6).
(5)  
(6)  
Here, . As introduced in Section II, the transform-based tensor model can also be analyzed in the frequency domain, where the tensor product . It can be computed as , where denotes the kth element of the expansion of in the third dimension and . For every and , we can deduce that:
(7)  
Thus, the Lipschitz constant for in our algorithm is:
(8) 
Then can be optimized by the proximal operator , where is the soft-thresholding operator [8]. After that, we fix to update the corresponding dictionary . We rewrite equation (4) as equation (9). Because we specify the discrete transform of interest to be the discrete Fourier transform, the problem can be decomposed into k nearly independent problems using the DFT, as in equation (10). Here, denotes the frequency representation of , i.e. .
(9)  
(10)  
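The proximal update used for the sparse coefficients (a gradient step scaled by the Lipschitz constant, followed by soft-thresholding) can be illustrated with a minimal sketch of the classical iterative shrinkage and thresholding algorithm. It is written for the ordinary matrix-vector case rather than the tensor product, so it is a simplified stand-in for ISTAT and not the authors' algorithm. It assumes the objective 0.5·||Dz − x||² + λ||z||₁, and it uses the squared Frobenius norm of the dictionary as a safe upper bound on the Lipschitz constant. All names are hypothetical.

```python
def soft_threshold(value, tau):
    """Soft-thresholding: shrink `value` toward zero by `tau`."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def ista(D, x, lam, iterations=100):
    """Minimize 0.5 * ||D z - x||^2 + lam * ||z||_1 over z.

    D is a list of m rows with n entries each; x has length m.
    """
    m, n = len(D), len(D[0])
    # Squared Frobenius norm of D upper-bounds the largest eigenvalue
    # of D^T D, so 1 / lipschitz is a valid (conservative) step size.
    lipschitz = sum(d * d for row in D for d in row)
    z = [0.0] * n
    for _ in range(iterations):
        residual = [sum(D[i][j] * z[j] for j in range(n)) - x[i] for i in range(m)]
        gradient = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the proximal (soft-thresholding) step.
        z = [soft_threshold(z[j] - gradient[j] / lipschitz, lam / lipschitz)
             for j in range(n)]
    return z
```

With an identity dictionary, the minimizer is simply the soft-thresholded input, which gives an easy check: `ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], 1.0)` converges to approximately `[2.0, 0.0]`. A tighter Lipschitz estimate would speed up convergence.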
We can apply the Lagrange dual with a dual variable to solve equation (10) in the frequency domain, as in equation (11). By defining , the minimization over can be written as equation (12). Substituting equation (12) into the Lagrangian , we obtain the Lagrange dual function in equation (13). The notation represents the trace norm of a matrix.
(11)  
(12) 
(13) 
Equation (13) can be solved by Newton’s method or the conjugate gradient method. After we obtain the dual variables , the dictionary can be recovered. The whole dictionary training algorithm is summarized in Algorithm 2.
With the finer-granularity tensor as input, the algorithm returns the trained dictionary , and with the coarse-granularity tensor as input, it returns . This is therefore an efficient way to learn a compact dictionary by training a single dictionary.
IV Performance Evaluation
In this section, we first compare the super-resolution tensor results obtained by applying TGAN to trace-based fingerprint tensor data with those of other algorithms. Then we evaluate the localization accuracy and compare the performance of our localization model with conventional models on the same data. Finally, we conduct our own experiment and collect another ground-truth fingerprint data set to verify our approach.
As illustrated in Fig. 5, the trace data set is collected in a region of size , and the tensor set can be measured as . Each block has RPs. We set some blocks, which have RPs, as the coarse-granularity tensor, in contrast with the rest, which have RPs of the finer-granularity tensor obtained by upsampling (the finer granularity of blocks is ). blocks have finer granularity, and blocks are ready to be processed by TGAN. For this scenario, to measure the quality of the generated finer-granularity tensor, we adopt the peak signal-to-noise ratio (PSNR) criterion, denoted as:
(14) 
where
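As a rough illustration of the PSNR criterion, the sketch below assumes the standard definition, PSNR = 10·log10(peak² / MSE), where MSE is the mean squared error between the reference and the generated tensor. The paper's exact normalization may differ. The `peak` argument (the largest possible signal value) and the flattened-list input format are assumptions made for this example.

```python
import math


def psnr(reference, estimate, peak):
    """Peak signal-to-noise ratio in dB between two equal-length
    flattened tensors, for a given peak signal value."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical tensors: no reconstruction error
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR means the generated finer-granularity tensor is closer to the ground truth. For instance, a mean squared error of 1 with a peak value of 10 gives 20 dB.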
Several existing super-resolution algorithms are compared with our LISTAT, as suggested in [6][14][11]: trilinear interpolation, super-resolution GAN (SRGAN), and the sparse-coding-based network (SCN) we adopt. The measurements are shown in Fig. 9 and Fig. 9.
From Fig. 9, we can tell that the computation of TGAN converges fast. We measure the Euclidean distance between the estimated location and the actual location of the testing point as the localization error, given by equation (15). Fig. 9 shows the cumulative distribution function (CDF) of the location estimation error of the fingerprint classification network, with and without the original training samples, and of the KNN algorithm.
(15) 
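The localization error and its empirical CDF can be computed as in the following sketch, which assumes 2D coordinates and treats the CDF value at a threshold as the fraction of test points whose error does not exceed it. The function names are illustrative.

```python
import math


def localization_error(estimated, actual):
    """Euclidean distance between an estimated and an actual (x, y) location."""
    return math.hypot(estimated[0] - actual[0], estimated[1] - actual[1])


def empirical_cdf(errors, threshold):
    """Fraction of localization errors that are at most `threshold`."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(1 for e in errors if e <= threshold) / len(errors)
```

Evaluating `empirical_cdf` over a range of thresholds yields the kind of CDF curve used to compare localization methods: a curve that rises faster indicates smaller errors for more test points.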
For localization performance on the trace data, in Fig. 9 we compare fingerprint-based indoor localization against KNN and SRGAN and plot the CDF of the localization error. For location estimation with fingerprint data, both neural networks (TGAN and CNN) perform better than KNN, with small variation. The localization error of KNN increases in stages, while the error of the neural networks increases smoothly. Comparing TGAN with the CNN supplied by SRGAN shows that TGAN performs better than simply merging generated extra tensor samples with a CNN. TGAN captures the interconnections inside the data better and generates more similar data for localization.
In our own experiment, we developed an Android application to collect indoor WiFi RF fingerprint data in a smaller region. The native Android WifiManager API, documented on the Android developer website, lets us manage all aspects of WiFi connectivity. Specifically, we select a region containing 14 APs and set the scale of the RPs as , so the RSS fingerprint tensor set is of size . The raw fingerprint data we collected is 4.2 MB.
After collecting the data, we choose two representative localization techniques to compare with TGAN, namely weighted KNN and a conventional CNN. Weighted KNN is the most widely used technique and is reported to perform well in indoor localization systems. The conventional CNN in this experiment is similar to the localization part of TGAN, while TGAN has an extra step, called super-resolution via sparse coding, that converts the collected coarse data into high-granularity data. The trained TGAN downloaded to the smartphone is only 26 KB in our experiment. The result is shown in Fig. 9. Comparing the localization error of the KNN algorithm, CNN, and TGAN shows that the neural networks are much more suitable for indoor RSS fingerprint data than KNN, which is consistent with the result of our simulation.
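For readers unfamiliar with the weighted KNN baseline, a minimal sketch follows. It assumes inverse-distance weighting in RSS signal space, which is one common variant; the weighting used in the experiment is not specified. The database format (a list of RSS vectors paired with 2D coordinates), the function names, and the sample values in the usage note are all hypothetical.

```python
import math


def rss_distance(a, b):
    """Euclidean distance between two RSS vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn_locate(query, database, k=3, eps=1e-9):
    """Estimate (x, y) for the RSS vector `query`.

    `database` is a list of (rss_vector, (x, y)) reference points.
    The estimate is the inverse-distance-weighted mean of the
    coordinates of the k nearest reference points in signal space.
    """
    if not database:
        raise ValueError("database must not be empty")
    # Keep the k reference points closest to the query in signal space.
    nearest = sorted(database, key=lambda entry: rss_distance(query, entry[0]))[:k]
    # eps avoids division by zero when the query matches a fingerprint exactly.
    weights = [1.0 / (rss_distance(query, rss) + eps) for rss, _ in nearest]
    total = sum(weights)
    x = sum(w * loc[0] for w, (_, loc) in zip(weights, nearest)) / total
    y = sum(w * loc[1] for w, (_, loc) in zip(weights, nearest)) / total
    return x, y
```

For instance, with made-up reference fingerprints `[-40, -70]` at (0, 0), `[-70, -40]` at (10, 0), and `[-55, -55]` at (5, 0), a query of `[-55, -55]` is placed at approximately (5, 0). Unlike a neural network, this baseline needs the full fingerprint database on the device at query time.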
Based on the simulation and experiment results above, we summarize the improvements our work brings to indoor localization. First, the transform-based tensor model handles insufficient fingerprint samples well through sparse representation, which means we can obtain finer-granularity fingerprints with sparse coding. Second, we adopt a CNN-like localization step, which proves much more suitable for analyzing indoor RSS fingerprint data than KNN and enables TGAN to improve localization accuracy. Third, TGAN is easy to use and reduces the storage cost on smartphones, because the smartphone downloads the TGAN rather than the raw data; it is therefore feasible to deploy TGAN on smartphones.
V Conclusion
We first adopted a transform-based tensor model for RF fingerprints and, based on this model, proposed a novel architecture for real-time indoor localization that combines a sparse coding algorithm called LISTAT with a neural network implementation called TSCN. To address the insufficiency of training samples, we introduced TGAN to perform super-resolution, generating new fingerprint samples from collected coarse samples. In future work, we will look for a more efficient and slimmer network architecture for the localization process. We believe that applying neural networks on mobile platforms is promising.
References
[1] Iris A Junglas and Richard T Watson, “Location-based services,” Communications of the ACM, vol. 51, no. 3, pp. 65–69, 2008.
[2] Yan Wang, Jian Liu, Yingying Chen, Marco Gruteser, Jie Yang, and Hongbo Liu, “E-eyes: device-free location-oriented activity identification using fine-grained wifi signatures,” in Proceedings of the 20th Annual International Conference on Mobile Computing and Networking. ACM, 2014, pp. 617–628.
[3] Xiao-Yang Liu, Shuchin Aeron, Vaneet Aggarwal, Xiaodong Wang, and Min-You Wu, “Adaptive sampling of rf fingerprints for fine-grained indoor localization,” IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2411–2423, 2016.
[4] Zheng Yang, Chenshu Wu, and Yunhao Liu, “Locating in fingerprint space: wireless indoor localization with little human intervention,” in Proceedings of the 18th Annual International Conference on Mobile Computing and Networking. ACM, 2012, pp. 269–280.
[5] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[6] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al., “Photo-realistic single image super-resolution using a generative adversarial network,” arXiv preprint arXiv:1609.04802, 2016.
[7] Xiao-Yang Liu and Xiaodong Wang, “Fourth-order tensors with multidimensional discrete transforms,” CoRR, vol. abs/1705.01576, 2017.
[8] Xiao-Yang Liu, Shuchin Aeron, Vaneet Aggarwal, Xiaodong Wang, and Min-You Wu, “Tensor completion via adaptive sampling of tensor fibers: Application to efficient indoor rf fingerprinting,” in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 2016, pp. 2529–2533.
[9] Xiao-Yang Liu, Xiaodong Wang, Linghe Kong, Meikang Qiu, and Min-You Wu, “An ls-decomposition approach for robust data recovery in wireless sensor networks,” arXiv preprint arXiv:1509.03723, 2015.
[10] David L Donoho, “For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution,” Communications on Pure and Applied Mathematics, vol. 59, no. 6, pp. 797–829, 2006.
[11] Zhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, and Thomas Huang, “Deep networks for image super-resolution with sparse prior,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 370–378.
[12] Koray Kavukcuoglu, Marc’Aurelio Ranzato, and Yann LeCun, “Fast inference in sparse coding algorithms with applications to object recognition,” arXiv preprint arXiv:1010.3467, 2010.
[13] Karol Gregor and Yann LeCun, “Learning fast approximations of sparse coding,” in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 399–406.
[14] Wenzhe Shi, Jose Caballero, Lucas Theis, Ferenc Huszar, Andrew Aitken, Christian Ledig, and Zehan Wang, “Is the deconvolution layer the same as a convolutional layer?,” arXiv preprint arXiv:1609.07009, 2016.