

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2019.2893278, IEEE Transactions on Mobile Computing

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. XX, NO. XX, XXXX 2019 1

RF Fingerprints Prediction for Cellular Network Positioning: A Subspace Identification Approach

Xiaohua Tian, Member, IEEE, Xinyu Wu, Hao Li, Xinbing Wang, Senior Member, IEEE

Abstract—Cellular network positioning is a mandatory requirement for localizing emergency callers, such as E911 in North America. Although smartphones are normally equipped with GPS modules, a large number of users still carry only basic cell phones, and GPS can be ineffective in urban canyon environments. To this end, the RF fingerprints based positioning mechanism is incorporated into the LTE architecture by 3GPP, where the major challenge is to collect geo-tagged RF fingerprints over vast areas. This paper proposes to utilize the subspace identification approach for large-scale RF fingerprints prediction. We formulate the problem as finding the optimal subspace over the Stiefel manifold, and redesign the Stiefel-manifold optimization method to obtain a fast convergence rate. Moreover, we propose a sliding window mechanism for the practical large-scale fingerprints prediction scenario, where recorded fingerprints are unevenly distributed over the vast area. Combining the two proposed mechanisms yields an efficient method for large-scale fingerprints prediction at the city level. Further, we validate our theoretical analysis and proposed mechanisms by conducting experiments with real mobile data, which show that the resulting localization accuracy and reliability with our predicted fingerprints exceed the requirement of E911.

Index Terms—RF Fingerprints prediction, localization, Stiefel manifold


1 INTRODUCTION

Cellular network positioning is mainly driven by governments' mandatory requirement that operators localize the caller in emergency situations, such as E911 in North America and E112 in Europe [1]–[3]. This is because most emergency callers (e.g., 60% in the European Union in 2013 [4], [5]) are unable to provide their current positions accurately. To this end, the 3rd Generation Partnership Project (3GPP) has made positioning methods such as Cell ID (CID) mandatory since Release 8 [6], and specified the architecture of fingerprinting based positioning for LTE networks in Release 9 [7]. The positioning capability can also be leveraged for network planning, troubleshooting [13], and location based services such as event recommendation and location-aware advertising [2], [8], [14].

The past decades have witnessed a large body of work devoted to indoor positioning [9]–[12], [21], where it is largely believed that the global navigation satellite system (GNSS) such as GPS has satisfied the need for localization in outdoor spaces; however, relying solely on GNSS cannot meet the positioning requirement of E911 or E112 even in outdoor spaces. First, a large number of mobile devices without GPS functionality still remain in use. It is found that more than 90% of Americans have cell phones, but the smartphone adoption level was only 77% in 2017, and the level for the group of senior Americans (60+) was merely around 42% [24]. Second, even for smartphone users, it has been verified in the operator's practice that the performance of GPS is unacceptable in urban canyon environments. Some such locations have not a single satellite visible, and a notable number of locations have fewer than the 3 visible satellites that GPS localization basically requires [2], [5], [8].

• A preliminary version of this article will appear in Proceedings of the 37th IEEE International Conference on Computer Communications (IEEE INFOCOM 2018).

• Xiaohua Tian, Xinyu Wu, Hao Li and Xinbing Wang are with the School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240. E-mail: xtian, wuxinyu, iamgeek, [email protected].

Although adoption is not universal, the widespread use of smartphones with a GPS module can facilitate fingerprinting based localization, which particularly suits cellular networks [13], [14], [17]–[20], [22], [23]. LTE smartphones regularly report the user measurement data (UMD) to the database at the network edge as part of the network control and management process [2], [5]–[7], [13], [14], [17], where the RF measurements contained in the UMD, such as the reference signal received power (RSRP), can be regarded as a kind of wireless fingerprint of the observed location. Leveraging this natural process, the network operator can construct a comprehensive and up-to-date fingerprints database in a crowdsourcing manner [13]. Since even a basic mobile device must periodically report the RSRP to the network, the device's current location can be estimated by comparing the reported RSRP with the fingerprints database.

However, the challenge for fingerprinting localization in the cellular network is that vast areas still need to be surveyed. The regular UMD contains no GPS location information unless special software is installed in the user's mobile device [13], [14]; therefore, the war-driving method is still needed to geo-tag the UMD [14], [17]. The expensive war-driving process motivates the idea of fingerprints prediction based on a particular radio propagation model [17]; the mobile trajectory tracking method is also utilized to match a time series of the UMD to a route [14], [15], [18], so that the location information of the continuously tracked UMD can be derived. While such ideas can be helpful in handling particular cases, a systematic approach to large-scale outdoor fingerprints prediction at the city level is still an open issue.

In this paper, we propose a large-scale fingerprints prediction approach to facilitate cellular network positioning, where we redesign the subspace identification mechanism to fully exploit the intrinsic connections among RF fingerprints. Our contributions are as follows:

First, we formulate the fingerprints prediction problem as the problem of finding the optimal subspace over the Stiefel manifold [25], and propose a streamlined Stiefel-manifold optimization algorithm for fingerprints prediction. The d-dimensional Stiefel manifold is the set of all ordered sets of d orthonormal vectors in the m-dimensional space, where each element of the set spans a d-dimensional subspace. The basic idea of the classic Stiefel-manifold optimization algorithm is similar to the gradient descent method; the difference is that the decision variable of the former lies in the Stiefel manifold instead of the real number domain. We streamline the classic Stiefel-manifold optimization algorithm to accommodate the characteristics of fingerprints prediction (Section 4.1), and prove convergence of the proposed algorithm (Section 4.2); moreover, we reveal the fundamental reason why our design converges faster than an alternative approach that optimizes decision variables over the Grassmann manifold [35]–[37].

Second, we propose a dynamic sliding window mechanism to deal with the practical fingerprints prediction scenario, where the fingerprints are unevenly distributed over the vast area. The proposed mechanism scans the entire area multiple times with a sliding window, where the window size increases in each new round of scanning as the predicted fingerprints obtained in the previous round increase the density of available fingerprints (Section 5.1). The mechanism is highly efficient, completing fingerprints prediction over a 69.8 km² area within 7 rounds of scanning. The crux of the mechanism design is to determine the dimension d of the subspace for the matrix in the sliding window. We propose to sample a complete sub-matrix in each window matrix, and determine d by applying singular value decomposition (SVD) to the sub-matrix; we theoretically prove that this subspace dimension determination method incurs tractable information loss (Section 5.2).

Third, we validate our theoretical analysis and proposed mechanism with a real data set, which contains around 8,820,000 RSRP data records collected from a 69.8 km² area in a city. Our experimental results show that the proposed scheme provides satisfactory accuracy of fingerprints prediction. We conduct positioning experiments with the predicted data, and the results show that the user can be localized within 100 m and 300 m of the real location in 70.8% and 98.67% of cases, respectively, which exceeds the E911 network based localization requirement regulated by the Federal Communications Commission (FCC): "within 100m for 67% and 300m for 90%" (Section 6.3). Moreover, the results verify that the proposed streamlined Stiefel-manifold optimization algorithm converges faster than the Grassmann manifold alternative (Section 6.4).

2 RELATED WORK

Multiple mechanisms are supported by the LTE network positioning architecture of 3GPP [6], [7], including CID, TOA, TDOA and fingerprinting. Among them, the fingerprinting approach draws much attention in the research community, because the CID performance is highly dependent on the density of the BS, and the information in the practical UMD can be insufficient for TOA and TDOA [13], [14], [17]–[20], [22], [23].

The 3GPP specifies the architecture of fingerprinting based positioning for LTE networks in Release 9 [7], as illustrated in Fig. 1. The LTE positioning architecture contains three main elements: the location service client (LCS), the location server, and the target device. Normally, a dedicated program is installed in the target device, which reports measurements or the location of the device itself to the location server. The location server processes requests from the LCS; it not only collects measurements and other location information from the device and eNBs, but also provides assisting measurement information for the target device to estimate its position. The LCS can send the network initiated location request for emergency positioning, where the network instructs the target device to provide the position with unsolicited assistance data; the LCS can also send the mobile terminated location request for positioning the target device, which contains privacy features and can be rejected by the target device. The target device can also actively send a mobile originated location request, which is relayed by the MME to the location server. The positioning data can be carried by both the control plane and the data plane protocols, where the LTE positioning protocol, the LPP Annex protocol and the SUPL 2.0 protocol are developed for the purpose.

[Figure: labels UE, eNB, MME, S-GW, P-GW, E-SMLC, SLP, LCS client, E-UTRAN, EPC, localization server; control plane: LTE positioning protocol (LPP) and LPPa; user plane: Secure User Plane Location 2.0 (SUPL 2.0).]

Fig. 1. Fingerprinting localization architecture in the LTE network.

The major challenge for fingerprinting positioning in the cellular network is to construct a wide-area radio map. Margolies et al. develop a fingerprinting based cellular network positioning testbed, where the radio map is constructed with crowdsourced data from 4 million unique users whose mobile devices have a proprietary software installed [13]. Comprehensive evaluations are performed with the testbed, but there are still wide areas not covered by the crowd workers. Ray et al. utilize the user mobility to derive the location information of the continuously sampled UMD by matching the UMD time series to a physical route [14], where the predicted fingerprints lie basically along the main roads of the city. The idea of utilizing the user mobility is also adopted by other work on cellular network measurement [15], [18]. Chakraborty et al. propose a geo-tag method based on the Gaussian Mixture Model (GMM) [17], where the RF signal characteristics are modeled with a Gaussian distributed random variable.

Fingerprints prediction schemes based on matrix analysis approaches have been applied to indoor localization systems to save the cost of site survey [9]–[12], a scenario that is very similar to the one studied in this paper. In particular, matrix completion algorithms are utilized to deal with the issue. Liu et al. model the indoor fingerprints prediction problem as a tensor completion process, where the tensor can be viewed as a 3-dimensional matrix [9]. Gu et al. propose a novel matrix completion algorithm combining singular value decomposition and the K-Nearest Neighbors (KNN) algorithm [9]. Nikitaki et al. focus on a multi-channel indoor fingerprinting localization system utilizing matrix theories [31]. Jain et al. design an alternating minimization approach improving the accuracy and efficiency of matrix completion [32]. Although it also adopts the matrix completion model, our work in this paper, to the best of our knowledge, formulates the fingerprints prediction problem as a Stiefel manifold optimization problem for the first time.

Edelman et al. present the framework for using the gradient descent method on the Grassmann and Stiefel manifolds [25]. While the Grassmann manifold optimization problem has been studied and applied in the fields of image processing and remote sensing [35]–[37], the Stiefel manifold is understudied. Our work in this paper redesigns the original Stiefel manifold optimization algorithm used in [25] to accommodate the fingerprints prediction scenario, where theoretical issues such as convergence rate analysis and step size design are resolved, in contrast to [25].

3 PROBLEM FORMULATION

3.1 Fingerprints Prediction: A Subspace Identification Perspective

The fingerprints prediction problem can be formulated as a matrix completion problem [9]–[12]. The area that needs localization service is first divided into grids, and the fingerprints sampled in the grids act as the elements of a matrix. The purpose of fingerprints prediction is essentially to complete the entire matrix by deriving the unknown elements from the available ones. A number of mathematical tools for matrix completion are available, such as singular value thresholding (SVT) [26], singular value partition (SVP) [27], the forward-backward algorithm for matrix completion (FBMC) [28] and iterative reweighted least squares (sIRLSp) [29]. Though they differ in implementation details, those algorithms are generally based on singular value decomposition (SVD).

In particular, given an incomplete $m \times n$ matrix induced from the incomplete radio map, if we assign the values of all unknown elements to be 0s, then we have a complete radio map matrix $A = U\Lambda V^T$ after SVD, where $U$ is an $m \times m$ real unitary matrix, $\Lambda$ is an $m \times n$ rectangular diagonal matrix with non-negative real numbers on the diagonal and $V^T$ is an $n \times n$ real unitary matrix. It is usually assumed that $A$ is a low-rank matrix, which means that the column vectors of $A$ are largely linearly dependent on each other; this is based on the in-practice observation that fingerprints are correlated within a certain area. To exploit the linear dependency, we could keep the $d$ greatest singular values lying on the diagonal of $\Lambda$, making it a $d \times d$ matrix, and reduce the corresponding parts of $U$ and $V^T$ to $m \times d$ and $d \times n$ matrices, respectively. Multiplying the three parts then results in a new $m \times n$ matrix $\hat{A}$, which contains estimations of those unknown elements originally assigned values of 0s in $A$.
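The zero-fill and truncation procedure described above can be sketched with NumPy. This is only an illustration under assumptions: the function name `zero_fill_svd_estimate`, the synthetic rank-3 matrix and the 70% sampling ratio are hypothetical and do not come from the paper.

```python
import numpy as np

def zero_fill_svd_estimate(radio_map, observed, d):
    """Zero-fill unknown entries, then rebuild the matrix from the d largest singular values."""
    filled = np.where(observed, radio_map, 0.0)            # unknown elements set to 0
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)  # singular values come sorted
    return U[:, :d] @ np.diag(s[:d]) @ Vt[:d, :]           # (m x d)(d x d)(d x n)

# Hypothetical synthetic rank-3 "radio map" with roughly 30% of the entries hidden.
rng = np.random.default_rng(0)
m, n, d = 30, 40, 3
truth = rng.standard_normal((m, d)) @ rng.standard_normal((d, n))
observed = rng.random((m, n)) < 0.7
estimate = zero_fill_svd_estimate(truth, observed, d)

hidden_rmse = np.sqrt(np.mean((estimate - truth)[~observed] ** 2))
print(f"RMSE on hidden entries: {hidden_rmse:.3f}")
```

When every entry is observed the truncated product reproduces a rank-$d$ matrix exactly; with hidden entries the result depends on the values chosen for the fill, which is the limitation discussed in the following paragraphs.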

The essence of the SVD method is actually to find a lower dimensional subspace that contains all the column vectors in $A$. Consider an $m$-dimensional space that contains all $m$-dimensional column vectors in $A$: if most of those vectors are linearly dependent on each other, then most of them should belong to a lower dimensional subspace of the $m$-dimensional space. For example, imagine that there are some points in the 3-D space; if most of the points are linearly dependent on each other, then those points should lie in a 2-D plane or on a straight line through the origin. In SVD, the truncated $m \times d$ matrix $U$ spans such a lower ($d$) dimensional subspace, induced by the singular vectors of the $d$ greatest singular values. Once the subspace is found, any vector belonging to the subspace can be expressed through its basis; this is why the unknown elements can be estimated.

The accuracy of the element estimation is highly dependent on whether the obtained subspace indeed contains most of the vectors in $A$. There is an infinite number of possible subspaces that can be induced by $A$; however, the SVD method always finds one specific subspace, as it initializes the incomplete elements in $A$ by assigning them specific values, for example all 0s, before performing the decomposition. Assigning different values to those unknown elements yields different subspaces, but there is an infinite number of possible assignments, which makes the SVD method unable to guarantee that the found subspace is always optimal.

It is worth mentioning that the subspace identified by SVD is optimal in terms of minimizing the Frobenius norm according to the Eckart-Young Theorem [33]. In particular, if we use $A_c$ to denote the matrix obtained after $A$ goes through the initialization process of the SVD method, and $A_o$ the matrix obtained after the entire SVD process, then $\|A_c - A_o\|_F$ is minimized according to the Eckart-Young Theorem. However, this does not conflict with our problem formulation in this paper, because our objective is to minimize $\|P_\Omega(A) - P_\Omega(A_o)\|$, as to be shown in (1). Generally, depending on the method chosen for matrix completion, the unknown elements could be completed with the values that lead to a lower rank matrix during the initialization process, and the initial values will be updated by an iterative process such as the augmented Lagrange multiplier method (ALM) for matrix completion [34]. In contrast, the approach proposed in this paper iteratively finds the optimal subspace directly, without initializing the unknown elements in $A$. In the following, we show how to find the optimal subspace in the whole set of possible subspaces.

3.2 Problem Formulation

The fingerprints prediction problem can be formulated into the following matrix completion problem:

$$\min_{\hat{A}} \; \|P_\Omega(\hat{A}) - P_\Omega(A)\|, \quad \text{s.t. } \operatorname{rank}(\hat{A}) \le d, \tag{1}$$

where $\|\cdot\|$ represents any suitable norm; $A$ is the matrix representing the radio map. Since some elements in $A$ have not been measured and are thus unavailable, we use $P_\Omega(A)$ to denote those available fingerprints in $A$. The fingerprints prediction mechanism yields $\hat{A}$; this is a complete estimation of $A$, which contains estimations of those unmeasured fingerprints in the corresponding positions. The constraint $\operatorname{rank}(\hat{A}) \le d$ means that the estimation $\hat{A}$ is under the constraint of low rank, where $d \ll n$ in practice. The physical meaning of the problem is to find the $\hat{A}$ that minimizes the deviation from the available observations denoted by $P_\Omega(A)$, given the low-rank constraint.
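To make the two ingredients of problem (1) concrete, the snippet below evaluates the masked deviation and the rank constraint for a small hand-made example. It is a sketch under assumptions: the Frobenius norm is chosen here only for illustration (the text allows any suitable norm), and the helper names `masked_deviation` and `satisfies_rank_constraint` are hypothetical.

```python
import numpy as np

def masked_deviation(A_hat, A, observed):
    """Frobenius norm of P_Omega(A_hat) - P_Omega(A); unobserved entries contribute 0."""
    return float(np.linalg.norm(np.where(observed, A_hat - A, 0.0)))

def satisfies_rank_constraint(A_hat, d):
    """Check the constraint rank(A_hat) <= d numerically."""
    return bool(np.linalg.matrix_rank(A_hat) <= d)

# Hypothetical 2 x 2 radio map whose top-right entry was never measured.
A = np.array([[1.0, 0.0], [3.0, 4.0]])
observed = np.array([[True, False], [True, True]])
A_hat = np.array([[1.0, 2.0], [3.0, 6.0]])     # a rank-1 candidate estimate

deviation = masked_deviation(A_hat, A, observed)
feasible = satisfies_rank_constraint(A_hat, 1)
print(deviation, feasible)
```

Only the bottom-right entry differs on the observed set, so the unmeasured top-right value of the candidate does not enter the deviation at all.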

The $j$-th column of $A$ can be regarded as an $m$-dimensional vector denoted by $a_j$, and all the column vectors are in an $m$-dimensional space denoted by $M$. Since the $a_j$s are correlated with each other, we can assume that the matrix $A$ is a rank-$d$ matrix, which means that all column vectors of $A$ belong to a $d$-dimensional subspace of $M$. However, as some elements in $A$ are unknown, to obtain a prediction of those elements, we want to find a rank-$d$ matrix $A_d$ based on the known elements in each $a_j$. The $A_d$ found must be a complete matrix with $A_d = U_d\Lambda_dV_d^T$ after SVD, where the matrix $U_d$ contains $d$ $m$-dimensional orthogonal vectors, which span a $d$-dimensional subspace of $M$. Finding $A_d$ directly could be challenging, but due to the property of the low rank matrix, if we can find $U_d$, then $A_d$ can be derived.




The problem is now transformed into a subspace identification problem in the following form:

$$\min_{U_d:\, m \times d,\; w_j:\, d \times 1} \; \sum_{j=1}^{n} \|[U_dw_j]_\Omega - [a_j]_\Omega\|_2^2, \tag{2}$$

where $U_dw_j$ represents the column vector in $A_d$ corresponding to the column vector in $A$ denoted by $a_j$. Since $a_j$ could be incomplete, we use $[a_j]_\Omega$ to denote the available elements in $a_j$; the corresponding elements in $A_d$ are denoted by $[U_dw_j]_\Omega$, as the matrix $A_d$ can be transformed into $A_d = U_dW$ with $W = \Lambda_dV_d^T$, where $w_j$ is the $j$-th column of $W$.

Problem (2) distinguishes itself from other commonly seen optimization problems in that the decision variable $U_d$ is a subspace in the form of a matrix. Recall the $m$-dimensional space that contains all column vectors of $A$: all $d$-tuples of orthonormal $m$-dimensional vectors form a $d$-dimensional Stiefel manifold; therefore, the problem becomes finding a point in the Stiefel manifold with respect to the objective function in (2). This is a Stiefel manifold optimization problem [25].

It is worth mentioning that multiple points in the Stiefel manifold can span the same subspace, because one subspace may have multiple sets of basis. All those lower-dimensional subspaces in the $m$-dimensional space form another kind of manifold known as the Grassmann manifold [35]–[37], which is used for the convergence proof in the later discussion. We will theoretically prove that the convergence rate of our proposed mechanism based on the Stiefel manifold can be higher than that of performing optimization over the Grassmann manifold, which especially suits fingerprints prediction in extremely large-scale areas.

4 STREAMLINED STIEFEL MANIFOLD OPTIMIZATION

4.1 Algorithm Design

The basic idea of the Stiefel manifold optimization approach is similar to the gradient descent method, which is frequently used in resolving optimization problems. The gradient descent method starts with a given point on a curve representing the objective function, and iteratively takes steps proportional to the negative of the gradient of the function at the current point. The method can be extended to the Stiefel manifold optimization problem [25]; however, the practical scenario of the fingerprints prediction problem cannot fit in the general framework presented in [25]. Moreover, [25] does not mention whether the approach will converge, or how to choose important parameters to guarantee performance, which makes it necessary to streamline the existing Stiefel manifold optimization approach. We go through the classic Stiefel-manifold optimization approach and show the challenges confronted in resolving the fingerprints prediction problem, and then present the streamlined design of the approach for dealing with the challenges.

Challenge 1: Determine the direction of iteration. The first step in [25] is to determine the direction of iterations, which is realized by finding the Hessian matrix $H$ with respect to the objective function $F = \sum_{j=1}^{n} \|[U_dw_j]_\Omega - [a_j]_\Omega\|_2^2$, and then finding the inverse of $H$. However, we find that $H$ for the fingerprints prediction problem may not have full rank, thus $H^{-1}$ may be unavailable. Specifically, $\frac{\partial F}{\partial U_d} = 2\sum_{j=1}^{n}([U_dw_j]_\Omega - [a_j]_\Omega)w_j^T$ is an $n \times d$ matrix; finding $H$ requires taking the second order derivative of $F$ with respect to $U_d$, which yields an $n^2 \times d^2$ matrix

$$H = \begin{bmatrix} B_{11} & \cdots & B_{1d} \\ \vdots & \ddots & \vdots \\ B_{n1} & \cdots & B_{nd} \end{bmatrix},$$

where $B_{ij}$ denotes an $n \times d$ submatrix. Note that there are unavailable samples in $a_j$, thus $[U_dw_j]_\Omega$ contains zero-valued elements, which results in all-zero rows in the $n \times d$ matrix after taking the first derivative. This further makes the elements from certain $B_{i1}$ to $B_{id}$ in $H$ all zeros after taking the second order derivative. Consequently, it is possible that $H$ contains all-zero rows. To this end, we propose to replace the direction derived from $H^{-1}$ with the gradient of the objective function $\nabla F$, where the intuition is that $\nabla F$ can also represent a possible iteration direction.

Challenge 2: Complex objective function. The second step is to find the common iteration equation:

$$U_{d,t+1} = U_{d,t}M_t + QN_t,$$

where $Q$ and $R$ satisfy the QR decomposition of $(I - U_{d,t}U_{d,t}^T)\nabla F$, and $M_t$ and $N_t$ satisfy

$$\begin{bmatrix} M_t \\ N_t \end{bmatrix} = \exp\left(t\begin{bmatrix} U_{d,t}^T\nabla F & -R^T \\ R & 0 \end{bmatrix}\right)\begin{bmatrix} I_d \\ 0 \end{bmatrix}, \tag{3}$$

where we replace $H^{-1}$ with $\nabla F$. However, it is noted that $U_d$ and $w_j$ are in fact dependent on each other, since $A_d = U_dW$ with $W = \Lambda_dV_d^T$ and $w_j$ is the $j$-th column of $W$. Moreover, there is a matrix exponential function in the iteration, which we find could hinder finding the iteration equation due to its tedious form involving an infinite matrix series.

We propose to replace the original objective function with

$$F(U_d) = \min_{x_j} \|[U_dx_j]_\Omega - [a_j]_\Omega\|_2^2, \tag{4}$$

which decouples the dependence between $U_d$ and $w_j$. It is straightforward that $w_j = x_j^* = \arg\min_{x_j} \|[U_dx_j]_\Omega - [a_j]_\Omega\|_2^2$. The new objective function is in fact one item of the summation in the objective function of problem (2). A natural question is: will the solution with the new objective function be the same as that with the original objective function?
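The per-column objective (4) is an ordinary least-squares problem restricted to the observed rows, which the following sketch evaluates with NumPy. The function name `column_objective`, the synthetic subspaces and the sampling pattern are assumptions made for illustration only.

```python
import numpy as np

def column_objective(U, a, observed):
    """Return (x_star, F) with F = min_x ||[U x]_Omega - [a]_Omega||_2^2."""
    U_obs = U[observed]                                    # rows of U indexed by Omega
    x_star, *_ = np.linalg.lstsq(U_obs, a[observed], rcond=None)
    residual = U_obs @ x_star - a[observed]
    return x_star, float(residual @ residual)

rng = np.random.default_rng(2)
m, d = 10, 2
U_true, _ = np.linalg.qr(rng.standard_normal((m, d)))     # basis of the subspace holding a
U_other, _ = np.linalg.qr(rng.standard_normal((m, d)))    # an unrelated candidate subspace
coeff = np.array([1.5, -0.5])
a = U_true @ coeff
observed = np.array([True, True, False, True, True, False, True, True, True, False])

x_true, F_true = column_objective(U_true, a, observed)
x_other, F_other = column_objective(U_other, a, observed)
print(F_true, F_other)
```

For the subspace that actually contains the column, the minimizer recovers the coefficient vector and the objective vanishes up to rounding; for an unrelated subspace the objective stays strictly positive.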

It is straightforward to verify that the second order derivative of $F(U_d)$ with respect to $U_d$ is a positive semi-definite matrix, which means that the new objective function is convex in $U_d$ and thus attains a unique minimum value. However, the solutions that make the different items each achieve their minimum value may differ. Recall the nature of $A$ that its column vectors $a_j$ fall in a $d$-dimensional subspace due to their correlation. In the process of optimizing each item, the solution must make the corresponding $a_j$ fall in the same $d$-dimensional subspace. However, even if the solutions of optimizing all those items make those $a_j$s fall in the subspace, they are not necessarily the same, because each solution is in fact a set of basis of the $d$-dimensional subspace according to the definition of the Stiefel manifold. It is possible that the solutions of optimizing different items vary, because the subspace can have multiple sets of basis. Nevertheless, the interesting point is that, since each solution is a set of basis of the same subspace, any element vector in a given set of basis can be represented by any other set of basis. This means that if an item achieves the minimum value, the corresponding solution also makes the other items achieve the minimum value; therefore, the solution of the transformed problem is indeed the solution of the original problem.

Challenge 3: Obtain the common iteration equation. Based on the revision mentioned above, we now try to obtain the common iteration equation, which requires performing QR decomposition on the matrix $(I - U_tU_t^T)\nabla F$.¹

Since $\nabla F = -2r_tw_t^T - U_t(-2r_tw_t^T)^TU_t = -2r_tw_t^T$ with $r_t = P_\Omega(a_j - U_dw_j)$, we have $(I - U_tU_t^T)\nabla F = -2r_tw_t^T$ with $U_t^Tr_t = 0$, where $w_t = x_j^*$ for the current vector $a_j$ we are considering; the corresponding descent direction is therefore $2r_tw_t^T$. This means that the rank of the matrix is 1, thus it is impossible to perform QR decomposition on the matrix. To deal with this challenge, we propose to replace QR decomposition with SVD, so that we can still obtain an orthonormal matrix $Q$. We relax the constraint in the classic approach and allow the matrix $R$ to be singular; then we have $Q = [\frac{r_t}{\|r_t\|} \; q_2 \; q_3 \; \cdots \; q_n]$ and $R = [2\|r_t\|w_t \; 0 \; 0 \; \cdots \; 0]^T$, where $q_2, q_3, \ldots, q_n$ are all orthonormal singular vectors and orthogonal to $\frac{r_t}{\|r_t\|}$.² Then we have $M_t = I_d$ and $N_t = \eta_tR$ with the consequent common iteration equation:

$$U_{t+1} = U_t + 2\eta_t\frac{r_tw_t^T}{\|r_t\|\|w_t\|}, \tag{5}$$

where $\eta_t$ is the step size parameter.

Challenge 4: Determine step size of iteration. Before executing the iteration, we must determine the step size first. Imagine the subspace identification process: we can first estimate a subspace $U_t$ and see whether the $a_j$ we are considering is in the subspace. If $a_j$ is not in $U_t$, the angle between the projection of $a_j$ on $U_t$ and $a_j$ itself, denoted by $\theta$, must be nonzero. Then we need to rotate the subspace to decrease $\theta$ to find a new estimation of the subspace. The degree of the rotation is in fact the step size, and it is straightforward that the step size should just make $\theta = 0$. Then we continue to rotate the previously estimated subspace $U_t$ in the same way to make it contain the other $a_j$s, and the resulting $U_t$ is the subspace we are looking for. More precisely, we are trying to find the set of basis that spans a subspace containing all $a_j$s according to the definition of the Stiefel manifold.

Let us define a $d \times d$ matrix $W_t$ in the $t$-th iteration, where $W_t = [\frac{w_t}{\|w_t\|} \; C_t]$. Note that $C_t$ is a $d \times (d-1)$ matrix, whose columns are unit vectors and all orthogonal to $\frac{w_t}{\|w_t\|}$. According to the Gram-Schmidt transformation, $C_t$ can be turned into a new matrix with all columns orthogonal to each other. Here we assume the columns of $C_t$ are orthogonal to each other for the convenience of presentation. Multiplying both sides of equation (5) by $W_t$, we have $U_{t+1}W_t = U_tW_t + 2\eta_t\frac{r_t}{\|r_t\|}[1 \; 0 \; 0 \; \cdots \; 0]$. The physical meaning of such operations is to perform a rotation of the estimated subspace $U_t$, and the equation above can be further transformed into $U_{t+1}\frac{w_t}{\|w_t\|} = \frac{p_t}{\|w_t\|} + 2\eta_t\frac{r_t}{\|r_t\|}$, where $\|p_t\| = \|w_t\|$. Note that $p_t$ is the projection of $a_j$ on $U_t$, and $\frac{r_t}{\|r_t\|}$ is a unit vector that is orthogonal to $U_t$; therefore, the degree to which $U_t$ is rotated is determined by $\eta_t$. We need to take an appropriate value of $\eta_t$ so that $a_j$ falls in the resulting $U_{t+1}$. To this end, we should have $U_{t+1}w_t \parallel (p_t + r_t)$, i.e. $(\frac{2\eta_t}{\|r_t\|}, \frac{1}{\|w_t\|}) \parallel (1, 1)$, which results in the step size in iteration $t$:

$$\eta_t = \frac{1}{2}\frac{\|r_t\|}{\|w_t\|}. \tag{6}$$

1. Note that we use $F$ to denote $F(U_d)$ and $U_t$ to denote $U_{d,t}$ respectively for the convenience of presentation, and such denotations are also adopted in the following discussions.

2. Hereinafter we use $\|\cdot\|$ to represent $\|\cdot\|_2$, the 2-norm of a vector.
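A quick numerical check of equations (5) and (6) is sketched below. It assumes the residual is taken as $r_t = P_\Omega(a_j - p_t)$, as in Step 6 of Algorithm 1, and uses a hypothetical random subspace, column and sampling pattern; none of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d = 12, 3
U_t, _ = np.linalg.qr(rng.standard_normal((m, d)))   # current column-orthonormal estimate
a = rng.standard_normal(m)                           # column under consideration
observed = np.zeros(m, dtype=bool)
observed[:8] = True                                  # Omega: the first 8 entries are sampled

# Least-squares weights on the observed rows, projection and masked residual.
w, *_ = np.linalg.lstsq(U_t[observed], a[observed], rcond=None)
p = U_t @ w
r = np.where(observed, a - p, 0.0)

norm_r, norm_w = np.linalg.norm(r), np.linalg.norm(w)
eta = 0.5 * norm_r / norm_w                                      # step size, equation (6)
U_next = U_t + 2.0 * eta * np.outer(r, w) / (norm_r * norm_w)    # update, equation (5)

fitted = U_next @ w
print(np.max(np.abs(fitted[observed] - a[observed])))
```

With this step size the update collapses to $U_t + r_tw_t^T/\|w_t\|^2$, so $U_{t+1}w_t = p_t + r_t$ and the updated basis reproduces the sampled entries of the column, which is the geometric condition stated above.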

After overcoming the challenges above, we present the streamlined Stiefel-manifold optimization algorithm (SSOA) as in Algorithm 1, where we use $U_\Omega$ to denote the rows of $U$ whose index is in $\Omega$.

Note that in Algorithm 1, after we obtain the optimal subspace $U$, we need to predict the unavailable fingerprints. Based on the theory of projection, we have the following lemma ensuring the accurate prediction of fingerprints within the subspace, which corresponds to Step 12 in Algorithm 1.

Lemma 1. Assume that $A$ is a rank-$d$ matrix. Denote $e_i$, $i = 1, 2, \ldots, m$, as the unit orthogonal basis vector in which the $i$-th element is 1 while the others are 0. $U$ is the $m \times d$ matrix where each column of $A$, denoted as $a_j$, $j = 1, 2, \ldots, n$, satisfies $a_j \in R(U)$, where $R(U)$ means the range space (column space) of $U$. Then if the number of sampled elements in $a_j$, $|\Omega_j|$, is larger than $d$, and $U_{\Omega_j}^TU_{\Omega_j}$ is invertible, we have our estimation $\hat{a}_j = a_j$ via

$$\hat{a}_j = U(U_{\Omega_j}^TU_{\Omega_j})^{-1}U_{\Omega_j}^T[a_j]_{\Omega_j}.$$

Proof. Since $a_j \in R(U)$, there exists a $d \times 1$ vector $c_j$ such that $a_j = Uc_j$. Therefore the sampled entries of $a_j$ satisfy $[a_j]_{\Omega_j} = [Uc_j]_{\Omega_j} = U_{\Omega_j}c_j$. Since $U_{\Omega_j}^TU_{\Omega_j}$ is invertible, we have

$$\begin{aligned} \hat{a}_j &= U(U_{\Omega_j}^TU_{\Omega_j})^{-1}U_{\Omega_j}^T[a_j]_{\Omega_j} \\ &= U(U_{\Omega_j}^TU_{\Omega_j})^{-1}U_{\Omega_j}^TU_{\Omega_j}c_j \\ &= Uc_j = a_j. \end{aligned}$$
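Lemma 1 can be checked numerically on a small example. The sketch below assumes a hypothetical basis, coefficient vector and sampling pattern, and the helper name `predict_column` is introduced only for illustration.

```python
import numpy as np

def predict_column(U, a_sampled, sampled):
    """Evaluate a_hat = U (U_Omega^T U_Omega)^{-1} U_Omega^T [a]_Omega."""
    U_obs = U[sampled]                                 # rows of U indexed by Omega_j
    gram = U_obs.T @ U_obs                             # d x d, assumed invertible
    coeff = np.linalg.solve(gram, U_obs.T @ a_sampled)
    return U @ coeff

rng = np.random.default_rng(4)
m, d = 9, 2
U = rng.standard_normal((m, d))                        # any basis of the subspace
a = U @ np.array([2.0, -1.0])                          # a column lying in R(U)
sampled = np.array([True, False, True, False, True, False, False, True, False])

a_hat = predict_column(U, a[sampled], sampled)         # only 4 > d entries are used
print(np.max(np.abs(a_hat - a)))
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly; the five unsampled entries are recovered because the column lies in the range of `U`.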

Algorithm 1: Streamlined Stiefel-manifold Optimization Algorithm (SSOA)

Input: An initial column-orthonormal $m \times d$ matrix $U_0$; sample set $\Omega$, $m \times n$ sample matrix $P_\Omega(A)$; maximum number of iterations $T$.
Output: Estimated matrix $A_d$.

1: $t = 0$;
2: while $t < T$ do
3:   Randomly choose a column index $q \in \{1, 2, \ldots, n\}$, get $[a_q]_\Omega$;
4:   $w_t = ([U_t]_\Omega^T[U_t]_\Omega)^{-1}[U_t]_\Omega^T[a_q]_\Omega$;
5:   $p_t = U_tw_t$;
6:   $r_t = P_\Omega(a_q - p_t)$;
7:   $U_{t+1} = U_t + \frac{r_tw_t^T}{\|w_t\|^2}$;
8:   $t = t + 1$;
9: end while
10: $U = U_t$;
11: for each $i \in \{1, 2, \ldots, n\}$ do
12:   $\hat{a}_i = U([U]_\Omega^T[U]_\Omega)^{-1}[U]_\Omega^T[a_i]_\Omega$;
13: end for
14: $A_d = [\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_n]$.
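The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ssoa`, the `seed` parameter, the use of `np.linalg.lstsq` in place of the explicit inverse in Steps 4 and 12, the guard against a zero weight vector, and the synthetic data are all choices made here.

```python
import numpy as np

def ssoa(A_obs, mask, U0, T, seed=0):
    """Sketch of SSOA. A_obs: m x n samples (values outside mask are ignored),
    mask: boolean m x n sample set, U0: m x d initial basis, T: iterations."""
    rng = np.random.default_rng(seed)
    U = U0.copy()
    n = A_obs.shape[1]
    for _ in range(T):
        q = rng.integers(n)                                         # Step 3
        obs = mask[:, q]
        w, *_ = np.linalg.lstsq(U[obs], A_obs[obs, q], rcond=None)  # Step 4
        p = U @ w                                                   # Step 5
        r = np.where(obs, A_obs[:, q] - p, 0.0)                     # Step 6
        w_norm_sq = float(w @ w)
        if w_norm_sq > 0.0:                                         # guard (assumption)
            U = U + np.outer(r, w) / w_norm_sq                      # Step 7
    A_hat = np.empty(A_obs.shape, dtype=float)
    for i in range(n):                                              # Steps 11-13
        obs = mask[:, i]
        c, *_ = np.linalg.lstsq(U[obs], A_obs[obs, i], rcond=None)
        A_hat[:, i] = U @ c
    return A_hat, U

# Hypothetical rank-2 data with about 60% of the entries sampled.
rng = np.random.default_rng(3)
m, n, d = 20, 15, 2
basis, _ = np.linalg.qr(rng.standard_normal((m, d)))
A = basis @ rng.standard_normal((d, n))
mask = rng.random((m, n)) < 0.6
mask[:4, :] = True                       # every column keeps more than d samples
A_obs = np.where(mask, A, 0.0)
U0, _ = np.linalg.qr(rng.standard_normal((m, d)))

A_hat, U_final = ssoa(A_obs, mask, U0, T=300)
print("relative error:", np.linalg.norm(A_hat - A) / np.linalg.norm(A))
```

The least-squares call returns the same weights as the normal-equation formula whenever $[U_t]_\Omega$ has full column rank, and it stays defined when it does not. The printed error depends on the random initialization and sampling, so no particular value should be expected from this sketch.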

In terms of the time complexity of Algorithm 1, note that the main cost lies in Step 4 and Step 12, where a matrix multiplication costs $O(nd^2)$ and a matrix inversion costs $O(d^3)$. So the total time complexity of Algorithm 1 is $O((n + T)nd^2)$.

4.2 Convergence Analysis

The challenge in proving the convergence of the proposed SSOA in Algorithm 1 is that the elements in each $a_j$ are not completely known. In particular, recall that in problem (2) our goal is to find the $U_d$ that minimizes the objective function; we can claim Algorithm 1 converges if we can indeed find the $U_d$ that spans a subspace containing all vectors $a_j$ in $A$, but each $a_j$ is itself incomplete.

We use $\beta_t(U, U_t) = 1 - \delta_t(U, U_t) = 1 - |U^T U_t|^2$ as the metric measuring the distance between the subspace $U_t$ estimated in the $t$-th iteration and its true value $U$ [37], where $|U^T U_t|$ means the determinant of $U^T U_t$. Recall that we assume $A$ is a rank-$d$ matrix, where the subspace $U$ contains all the vectors in $A$. According to the definition, if $U$ and $U_t$ span the same subspace, then $\beta_t(U, U_t) = 0$; if $U$ and $U_t$ are orthogonal to each other (any column in $U$ is orthogonal to all columns in $U_t$), then $\beta_t(U, U_t) = 1$, meaning that $U$ and $U_t$ have the largest distance.
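The two extreme cases of the metric can be checked numerically. The following is a small sketch assuming column-orthonormal inputs; the function name `subspace_distance` and the toy bases are illustrative choices.

```python
import numpy as np

def subspace_distance(U_true, U_est):
    """beta = 1 - det(U_true^T U_est)^2 for column-orthonormal inputs."""
    return 1.0 - np.linalg.det(U_true.T @ U_est) ** 2

I = np.eye(4)
U = I[:, :2]           # basis of span{e1, e2}
U_same = U[:, ::-1]    # same subspace, columns swapped
U_orth = I[:, 2:]      # span{e3, e4}, orthogonal to U

d_same = subspace_distance(U, U_same)
d_orth = subspace_distance(U, U_orth)
```

Swapping the basis vectors changes the sign of the determinant but not its square, so the distance between two bases of the same subspace stays at 0, while the orthogonal pair reaches the maximum value 1.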

With this metric, we first prove in Lemma 2 that SSOA converges at least as fast as the Grassmann-manifold optimization algorithm with step size $\eta_t$. Then we prove in Theorem 1 that SSOA converges faster than the Grassmann-manifold optimization algorithm with an appropriate choice of $\eta_t$. A faster convergence rate is especially meaningful for large-scale fingerprints prediction.

Lemma 2. Let $\eta_t = \frac{\|r_t\|}{2\|w_t\|}$; then the SSOA algorithm converges at least as fast as the Grassmann-manifold based mechanism.

Proof. Define $\tilde{U}_t = U_t W_t = \left[\frac{p_t}{\|p_t\|}\ \ C_t\right]$ and $\tilde{U}_{t+1} = U_{t+1} W_t = \left[\frac{q_t}{\|w_t\|}\ \ C_t\right]$. Since the first column of $\tilde{U}_{t+1}$ is not a unit vector, we define $\bar{U}_{t+1} = \left[\frac{q_t}{\|q_t\|}\ \ C_t\right]$. With the step size $\eta_t$, we have $q_t = p_t + 2\eta_t \frac{\|w_t\|}{\|r_t\|} r_t = a_t$. Since $W_t$ is an orthogonal matrix, $R(\tilde{U}_t) = R(U_t)$ and $R(\tilde{U}_{t+1}) = R(U_{t+1})$, where $R(U)$ denotes the image of a matrix $U$, i.e., $R(U) = \{UA \mid \forall A \in \mathbb{R}^{m \times n}\}$.

Then we define $\tilde{U} = U W_t = \left[\frac{a_t}{\|a_t\|}\ \ C\right]$, where $C$ is a $d \times (d-1)$ matrix with orthonormal columns, and these columns are all orthogonal to $\frac{a_t}{\|a_t\|}$. Hence $\tilde{U}$ is an orthogonal matrix and $R(\tilde{U}) = R(U)$. Then there exists an orthogonal matrix $Y_t$ such that $\tilde{U} = U Y_t$, and

$$\frac{\delta_{t+1}}{\delta_t} = \frac{|U_{t+1}^T U|^2}{|U_t^T U|^2} = \frac{|\bar{U}_{t+1}^T \tilde{U}|^2}{|\tilde{U}_t^T \tilde{U}|^2} = \frac{|C_t^T C|^2}{|C_t^T C|^2} \left(\frac{\|p_t\| \|a_t\|}{p_t^T a_t}\right)^2 = \frac{\|a_t\|^2}{\|p_t\|^2},$$

which is the convergence rate of the Grassmann-manifold optimization algorithm [37]. Note that $p_t$ is the projection of $a_t$ over the subspace $U_t$; thus the physical meaning of the equation above is that the distance between $U_t$ and $U$ decreases faster in each iteration.

Theorem 1. Denote $\sigma_i(A)$ as the $i$-th largest singular value of $A$ and $\lambda_i(A)$ as the $i$-th largest eigenvalue of $A$. If we set the step size $\eta_t$ such that

$$\frac{\lambda_1(U_t^T U_t)}{\lambda_1(U_t^T U_t) + 4\eta_t^2} \cdot \frac{\lambda_d(U_t^T U_t)}{\lambda_2(U_{t+1}^T U_{t+1})} \left(1 + 2\eta_t \frac{\|r_t\|}{\|p_t\|}\right)^2 > \frac{\|a_t\|^2}{\|p_t\|^2}, \qquad (7)$$

then the convergence rate of the SSOA is strictly greater than $\frac{\|a_t\|^2}{\|p_t\|^2}$, which is known as the convergence rate of the Grassmann-manifold optimization algorithm [37].

Proof. Note that $U_{t+1}$ and $U_t$ do not necessarily have orthonormal columns with the step size $\eta_t$ now. We first apply the QR decomposition to $U_{t+1}$ and $U_t$, denoted by $U_{t+1} = U_{t+1}^Q R_{t+1}$ and $U_t = U_t^Q R_t$, respectively. Similar to the derivation in Lemma 2, we can derive

$$\frac{\delta_{t+1}}{\delta_t} = \frac{|(U_{t+1}^Q)^T U|^2}{|(U_t^Q)^T U|^2} = \frac{|R_{t+1}^{-1}|^2}{|R_t^{-1}|^2} \cdot \frac{\left(\left(p_t + 2\eta_t \frac{\|p_t\|}{\|r_t\|} r_t\right)^T a_t\right)^2}{(p_t^T a_t)^2}. \qquad (8)$$

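Note that (i) $U_{t+1}$ and $R_{t+1}$ share the same singular values, since multiplying by an orthogonal matrix does not alter the singular values; (ii) $R_{t+1}$ is a diagonal matrix, thus we have $|R_t^{-1}| = \frac{1}{\prod_{i=1}^{d} \sigma_i(U_t)}$ and $|R_{t+1}^{-1}| = \frac{1}{\prod_{i=1}^{d} \sigma_i(U_{t+1})}$, where

$$\sigma_i(U_{t+1}) = \sqrt{\lambda_i(U_{t+1}^T U_{t+1})} = \sqrt{\lambda_i\left(U_t^T U_t + 4\eta_t^2 \frac{w_t w_t^T}{\|w_t\|^2}\right)},$$

based on the fact that $U_t^T r_t = 0$. Since $w_t w_t^T$ is a rank-1 matrix whose only non-zero eigenvalue is $\|w_t\|^2$, Weyl's inequality [40] bounds the eigenvalues of the updated matrix from above:

$$\lambda_i\left(U_t^T U_t + 4\eta_t^2 \frac{w_t w_t^T}{\|w_t\|^2}\right) \le \begin{cases} \lambda_1(U_t^T U_t) + 4\eta_t^2, & i = 1; \\ \lambda_{i-1}(U_t^T U_t), & i \ge 2. \end{cases} \qquad (9)$$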

With Eqn. (8) and Inequality (9), it is easy to derive

$$\frac{\delta_{t+1}}{\delta_t} \ge \frac{\lambda_1(U_t^T U_t)}{\lambda_1(U_t^T U_t) + 4\eta_t^2} \cdot \frac{\lambda_d(U_t^T U_t)}{\lambda_2(U_{t+1}^T U_{t+1})} \left(1 + 2\eta_t \frac{\|r_t\|}{\|p_t\|}\right)^2.$$

Since the RHS of the inequality is greater than $\frac{\|a_t\|^2}{\|p_t\|^2}$, which is the convergence rate of the Grassmann-manifold optimization algorithm as presented in [37], the convergence rate of the Stiefel-manifold optimization algorithm is faster. Proved.
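The eigenvalue bounds that Weyl's inequality provides for a rank-one positive semidefinite update can be checked numerically. In the sketch below, the matrix `G` merely stands in for $U_t^T U_t$, and the dimensions, random seed and value of `eta` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d, eta = 4, 0.3
M = rng.standard_normal((6, d))
G = M.T @ M                               # stand-in for U_t^T U_t (symmetric PSD)
w = rng.standard_normal(d)
# Rank-one update with single non-zero eigenvalue 4 * eta^2.
G_next = G + 4 * eta**2 * np.outer(w, w) / (w @ w)

lam = np.sort(np.linalg.eigvalsh(G))[::-1]           # descending eigenvalues
lam_next = np.sort(np.linalg.eigvalsh(G_next))[::-1]

# Largest eigenvalue grows by at most 4 * eta^2.
top_ok = lam_next[0] <= lam[0] + 4 * eta**2 + 1e-12
# The i-th eigenvalue after the update is at most the (i-1)-th before it.
interlace_ok = all(lam_next[i] <= lam[i - 1] + 1e-12 for i in range(1, d))
```

Both flags hold for any symmetric `G` and any non-zero `w`, which is what allows the product of singular values of $U_{t+1}$ to be bounded in terms of those of $U_t$.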

Theorem 1 presents the general condition under which SSOA outperforms the Grassmann-manifold optimization algorithm in convergence rate. In fact, based on experiments on the real big datasets in Section 6.2, we find that (i) $\lambda_1(U_t^T U_t)$ surges rapidly and becomes much greater than $\lambda_2(U_{t+1}^T U_{t+1})$ and $\lambda_d(U_t^T U_t)$; (ii) $\lambda_d(U_t^T U_t)/\lambda_2(U_{t+1}^T U_{t+1}) \in [c, 1]$ with the constant $c > 0$. The verifications are illustrated in Fig. 2(a) and Fig. 2(b). Fig. 2(a) verifies observation (i): under different subspace dimensions ($d = 10, 15, 20$), $\lambda_1(U_t^T U_t)$ is always greater than $\lambda_2(U_t^T U_t)$, and the gap grows as more iterations are conducted. Fig. 2(b) confirms observation (ii): under $d = 10, 15, 20$, the ratio $\lambda_d(U_t^T U_t)/\lambda_2(U_{t+1}^T U_{t+1})$ is always less than 1, and in our situation we can set $c = 0.45$. Meanwhile, combining both figures we verify that $\lambda_1(U_t^T U_t)$ is always greater than $\lambda_d(U_t^T U_t)$.

Therefore we present a more practical condition in Proposition 1 that can facilitate determining the step size in practice, which is an approximation to the general Theorem 1.

Proposition 1. We set $\gamma_t = \frac{\lambda_d(U_t^T U_t)}{\lambda_2(U_{t+1}^T U_{t+1})} \in [c, 1]$. If we choose the step size $\eta_t$ such that $\frac{1}{2} \cdot \frac{\|a_t\| - \|p_t\|}{\sqrt{c}\,\|r_t\|} < \eta_t \ll \frac{1}{2}\sigma_1(U_t)$, then the convergence rate of the SSOA is strictly higher than that of the Grassmann-manifold optimization algorithm.

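Proof. Since $\lambda_1(U_t^T U_t) \gg \lambda_2(U_{t+1}^T U_{t+1}) = \lambda_d(U_t^T U_t)/\gamma_t$, we can approximate Eqn. (7) in Theorem 1 as $\gamma_t \left(1 + 2\eta_t \frac{\|r_t\|}{\|p_t\|}\right)^2 > \frac{\|a_t\|^2}{\|p_t\|^2}$. Then we can easily obtain

$$\eta_t > \frac{1}{2} \cdot \frac{\|a_t\| - \|p_t\|}{\sqrt{c}\,\|r_t\|}.$$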




Fig. 2. Experimental verifications of observations about Theorem 1. (a) $\lambda_1(U_t^T U_t)$ vs $\lambda_2(U_t^T U_t)$; (b) $\gamma_t = \sigma_d(U_t^T U_t)/\sigma_2(U_{t+1}^T U_{t+1})$.

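On the other hand, since $\sigma_1(U_t) = \sqrt{\lambda_1(U_t^T U_t)}$, we need to ensure that $\eta_t \ll \frac{1}{2}\sigma_1(U_t)$ to let $\frac{\lambda_1(U_t^T U_t)}{\lambda_1(U_t^T U_t) + 4\eta_t^2} \to 1$. Thus when $\frac{1}{2} \cdot \frac{\|a_t\| - \|p_t\|}{\sqrt{c}\,\|r_t\|} < \eta_t \ll \frac{1}{2}\sigma_1(U_t)$, the convergence rate of SSOA is higher than that of the Grassmann-manifold optimization algorithm.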

Note that sometimes when $\|r_t\|$ is small enough, especially when the algorithm is close to convergence, $\frac{\|a_t\| - \|p_t\|}{\sqrt{c}\,\|r_t\|} > \lambda_1(U_t^T U_t)$. In this situation we simply use $\eta_t = \frac{1}{2} \cdot \frac{\|r_t\|}{\|w_t\|}$. Combining the above ways of determining the step size, we will show in Section 6.2 that the convergence speed of SSOA significantly exceeds that of the Grassmann-manifold based algorithm on real big datasets.
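The step-size rule of Proposition 1, together with the fallback $\eta_t = \frac{1}{2}\|r_t\|/\|w_t\|$, might be coded along the following lines. This is only a sketch: the function names, the `margin` factor used to model "much smaller than", and the choice of the interval midpoint are assumptions not specified in the text; $c = 0.45$ is the value reported for the datasets above.

```python
import math

def step_size_bounds(norm_a, norm_p, norm_r, sigma1, c=0.45, margin=0.1):
    """Lower and upper limits for eta_t suggested by Proposition 1.

    margin is a hypothetical factor that models eta_t << sigma1 / 2.
    """
    lower = 0.5 * (norm_a - norm_p) / (math.sqrt(c) * norm_r)
    upper = margin * 0.5 * sigma1
    return lower, upper

def choose_step_size(norm_a, norm_p, norm_r, norm_w, sigma1, c=0.45, margin=0.1):
    lower, upper = step_size_bounds(norm_a, norm_p, norm_r, sigma1, c, margin)
    if lower < upper:
        # Any value strictly inside the interval satisfies the condition;
        # the midpoint is an arbitrary illustrative choice.
        return 0.5 * (lower + upper)
    # Interval is empty: fall back to the step size of Lemma 2.
    return 0.5 * norm_r / norm_w

lo_b, up_b = step_size_bounds(5.0, 4.0, 3.0, sigma1=10.0)
eta_inside = choose_step_size(5.0, 4.0, 3.0, norm_w=4.0, sigma1=10.0)
eta_fallback = choose_step_size(5.0, 4.0, 3.0, norm_w=4.0, sigma1=1.0)
```

With the toy norms above, the first call returns a value inside the admissible interval, while the second call finds the interval empty and returns the Lemma 2 step size.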

4.3 Discussions

The fundamental reason that the Stiefel-manifold optimization approach converges faster than the Grassmann-manifold counterpart is that a point on the Stiefel manifold represents a set of basis vectors of a $d$-dimensional subspace, whereas a point on the Grassmann manifold represents a $d$-dimensional subspace itself. In principle, our optimization algorithm should measure the distance between $a_j$ and the estimated set of basis vectors in each iteration; however, such an iteration would incur high computational complexity due to the finer granularity of the distance metric. Since a set of basis vectors spans a subspace, we use the angle between $a_j$ and its projection on the subspace as the distance metric in our algorithm design. In this case, the solution of our algorithm is in fact the subspace spanned by the set of basis vectors, instead of the particular set of basis vectors that the traditional Stiefel-manifold optimization mechanism seeks. This is because a subspace can have multiple sets of basis vectors.

However, this means that multiple solutions may be obtained by our proposed SSOA, and any one of them satisfies the requirement. Recall that we transform the objective function in the algorithm design (Section 3.1), and the new objective function is convex on $U_d$; the situation of optimizing over the Stiefel manifold is then like that illustrated in the right part of Fig. 3, while the left part of the figure shows the situation of optimizing over the Grassmann manifold. There could be multiple solutions on the Stiefel manifold, but only one solution on the Grassmann manifold, because a $d$-dimensional subspace is regarded as a single point on the Grassmann manifold by definition. Consequently, it is easier to find a solution on the Stiefel manifold and the convergence rate is higher.

We present numerical analysis results in Fig. 4 showing the average number of iterations for completing 500 randomly generated matrices on the Stiefel and Grassmann manifolds, respectively. The two sub-figures show the number of iterations needed to achieve different levels of prediction accuracy (average error) for the cases where 40% and 60% of the real data are available. It is clear that the proposed SSOA requires fewer iterations than the Grassmann-manifold optimization algorithm. The experimental results with the real data to be presented in Section V-C also corroborate our analysis.

Fig. 3. Convergence over Stiefel and Grassmann manifold.

Fig. 4. Numerical analysis to the convergence rate (Left: 40%; Right: 60%).

5 FINGERPRINTS PREDICTION WITH SLIDING WINDOW

5.1 Sliding Window Mechanism Design

In practice, the sampled RF fingerprints for cellular networks are unevenly distributed over an extremely vast area, as will be shown in Section V. It is impossible to complete the matrix of the entire area in one shot, since the sampled data are sparsely distributed in some subareas. To deal with this issue, we create a sliding window to scan the entire area in a row-by-row manner, where the crux is to determine the size of the window. In particular, we first grid the entire area into square cells, where the edge length of each cell depends on the accuracy requirement (normally tens of meters). We then let the sliding window cover a number of such cells and move from left to right and top to bottom so that the entire area can be scanned. The sliding window's movement step size in both the horizontal and vertical directions is randomly assigned, so that the predicted fingerprints obtained at the window's previous location can be utilized to predict the fingerprints in the area currently covered by the window.
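The gridding step can be sketched as follows. The sketch assumes that positions have already been converted to planar offsets in meters and that multiple measurements falling into one cell are averaged; both are illustrative assumptions, as is the function name `grid_fingerprints`.

```python
from collections import defaultdict

def grid_fingerprints(records, edge):
    """Map (x, y, rsrp) records to square cells of the given edge length.

    records : iterable of (x_m, y_m, rsrp_dbm), planar offsets in meters
    edge    : cell edge length in meters
    Returns {(row, col): mean RSRP of the records inside that cell}.
    """
    cells = defaultdict(list)
    for x, y, rsrp in records:
        cells[(int(y // edge), int(x // edge))].append(rsrp)
    return {cell: sum(v) / len(v) for cell, v in cells.items()}

records = [(3.0, 4.0, -90.0), (9.0, 1.0, -100.0), (25.0, 12.0, -80.0)]
grid = grid_fingerprints(records, edge=11.0)
```

Cells that receive no record are simply absent from the result; these are the unsampled entries that the sliding window later attempts to predict.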

Moreover, the window size must initially be set small enough to make sure a sufficient amount of data is available within the window for prediction. After a round of scanning, we have predicted fingerprints available in some cells, which can be regarded as training data, so the density of available fingerprints increases. Then we can enlarge the window and scan the entire area again, still in a row-by-row manner. In this way, we scan the area multiple times, so that fingerprints in most of the area can be predicted.

Fig. 5 presents a vivid example of the sliding window mechanism. At the beginning, the sampled fingerprints are sparse, as shown in the upper left of Fig. 5, so we set the size of the window to be small, in order to first predict the fingerprints adjacent to the samples. During the scanning process, the fingerprints in the window are not always predictable, which depends on the sampling rate inside the window. For example, in the upper left of Fig. 5, when the window scans to the red frame, the sampling rate in this frame is high enough, thus the window is predictable. The red elements inside the window are the predicted elements. However, when the window scans to the blue and green frames, the sampling rate is very low, thus they are not predicted in this round. With an increasing number of elements predicted, the window expands gradually as shown in Fig. 5, and in this way the whole radio map is eventually recovered.

Fig. 5. An example of sliding window mechanism.

The crux of the sliding window design is to determine the window size, which is essentially to determine the dimension of the subspace $d$. This is because the length and width of the sliding window must be greater than $d$, or it is impossible to predict the fingerprints in the unsampled cells. However, the challenge is that the matrix corresponding to the sliding window itself is incomplete. To deal with this challenge, we propose to determine $d$ of the sliding window matrix by sampling a complete sub-matrix within the window. The rationale is that if the vectors in the window matrix are correlated, the correlation should be reflected by any sub-matrix within it. The question is how well we can predict the fingerprints in the window matrix if we determine $d$ in this way.

The cornerstone assumption of the subspace identification approach is that all the vectors in the window matrix, denoted by $A_w$, are in the same subspace, denoted by $U_d$; however, although the fingerprints are correlated, some vectors are in fact not in $U_d$ and thus incur prediction errors. Note that a large $d$ can decrease such error since more vectors can be included in $U_d$, but it requires more elements in $A_w$ to be available; in contrast, a small $d$ can increase the error, but it can accommodate a more sparsely sampled $A_w$.

In particular, suppose $A_w = U \Lambda V^T$ after SVD (see footnote 3). We use $\sigma_i = \Lambda_{ii}$, $i = 1, 2, \ldots, m$ (set $m < n$), to denote the $i$-th largest singular value of $A_w$. Suppose we had known $d$; then we approximate $A_w$ with $A_d = U_d \Lambda_d V_d^T$, where $U_d$ and $V_d$ are the first $d$ columns of $U$ and $V$ respectively, and $\Lambda_d$ is the sub-matrix comprised of the first $d$ columns and rows of $\Lambda$. We use

$$l_d = \sqrt{\frac{\sum_{i=1}^{d} \sigma_i^2}{\sum_{i=1}^{m} \sigma_i^2}}$$

to denote the remained information after the approximation, which only includes the largest $d$ singular values in $\Lambda$. The rationale is that most of the information of the matrix lies in the largest singular values of the matrix after SVD. It is straightforward that a greater $d$ leads to a greater $l_d$.

3. Note that we use $U$, $\Lambda$, and $V$ to denote the SVD of the window matrix in this section for the convenience of demonstration, which is different from the previous sections where $A = U \Lambda V^T$.

In the following discussion we will theoretically prove that the gap between the remained information of the complete sub-matrix we sample within the window, $\tilde{l}_d$, and that of the whole sliding window matrix, $l_d$, is small enough when the correlation of the columns of the sliding window matrix is strong. This validates that we can determine $d$ of the sliding window matrix by conducting SVD on the complete sub-matrix within the window, with similar remained information.
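The remained information can be computed directly from a list of singular values. The sketch below also picks the smallest $d$ whose remained information reaches a target level; the text does not state how $d$ is selected from $l_d$, so the thresholding rule, the `target` value and the function names are assumptions for illustration.

```python
import math

def remained_information(singular_values, d):
    """l_d = sqrt(sum of the d largest sigma_i^2 / sum of all sigma_i^2)."""
    s = sorted(singular_values, reverse=True)
    total = sum(x * x for x in s)
    kept = sum(x * x for x in s[:d])
    return math.sqrt(kept / total)

def smallest_dimension(singular_values, target):
    """Smallest d with l_d >= target (hypothetical selection rule)."""
    for d in range(1, len(singular_values) + 1):
        if remained_information(singular_values, d) >= target:
            return d
    return len(singular_values)

sigma = [10.0, 5.0, 1.0, 0.5]   # toy singular values of a complete sub-matrix
l_values = [remained_information(sigma, d) for d in range(1, 5)]
d_star = smallest_dimension(sigma, target=0.99)
```

For these toy values the remained information rises monotonically with $d$ and reaches 1 when all singular values are kept, in line with the observation that a greater $d$ leads to a greater $l_d$.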

5.2 Remained Information Analysis

We first present some lemmas to prepare for Theorem 2, the remained information analysis.

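Lemma 3. If real numbers $a, b, c, d > 0$, $b \ll a$, $d \ll c$, $|a - c| \le \varepsilon_1$ and $|b - d| \le \varepsilon_2$, then we have

$$\left|\frac{b}{a} - \frac{d}{c}\right| \le \max\left\{\frac{\min\{b, d\}}{ac}\,\varepsilon_2,\ \frac{\varepsilon_1}{\max\{a, c\}}\right\}.$$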

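Proof. This lemma can be proved in a case-by-case manner as follows:

Case 1: If $\frac{b}{a} \ge \frac{d}{c}$ and $a \ge c$ (note that now $b \ge d$), then $\left|\frac{b}{a} - \frac{d}{c}\right| = \frac{b}{a} - \frac{d}{c} \le \frac{|b - d|}{a} \le \frac{\varepsilon_1}{a}$;

Case 2: If $\frac{b}{a} \ge \frac{d}{c}$, $a < c$ and $b < d$, then $\left|\frac{b}{a} - \frac{d}{c}\right| = \frac{bc - ad}{ac} \le \frac{d}{ac}|c - a| \le \frac{d}{ac}\,\varepsilon_2$;

Case 3: If $\frac{b}{a} \ge \frac{d}{c}$ and $a < c$, $b \ge d$, then $\left|\frac{b}{a} - \frac{d}{c}\right| = \frac{bc - ad}{ac} \le \frac{b}{ac}|c - a| \le \frac{b}{ac}\,\varepsilon_2$.

Similarly we can handle the other 3 cases when $\frac{b}{a} < \frac{d}{c}$. Combining all 6 cases, we finish the proof.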

Lemma 4. [The Hoffman-Wielandt inequality] Given two $k \times k$ real symmetric matrices $X$ and $Y$, if the eigenvalues of $X$ are $\mu_1, \mu_2, \ldots, \mu_k$ and those of $Y$ are $\nu_1, \nu_2, \ldots, \nu_k$, with $\mu_1 \ge \mu_2 \ge \ldots \ge \mu_k \ge 0$ and $\nu_1 \ge \nu_2 \ge \ldots \ge \nu_k \ge 0$, then we have $\sum_{i=1}^{k} (\mu_i - \nu_i)^2 \le \|X - Y\|_F^2$.
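A quick numerical illustration of the Hoffman-Wielandt inequality on two small symmetric positive definite matrices follows. The matrices are arbitrary toy values chosen for this sketch.

```python
import numpy as np

X = np.array([[4.0, 1.0], [1.0, 3.0]])
Y = np.array([[3.5, 0.5], [0.5, 2.0]])

# Eigenvalues sorted in descending order, as required by the lemma.
mu = np.sort(np.linalg.eigvalsh(X))[::-1]
nu = np.sort(np.linalg.eigvalsh(Y))[::-1]

lhs = float(np.sum((mu - nu) ** 2))            # sum of squared eigenvalue gaps
rhs = float(np.linalg.norm(X - Y, "fro") ** 2)  # squared Frobenius distance
```

Here `rhs` equals 1.75 and `lhs` stays below it, showing how a bound on the matrix difference controls the gap between matched eigenvalues.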

Lemma 5. [38] Let $U$ be an orthonormal matrix and $\Omega$ a random set of row indexes. Then there exists a positive constant $C$ such that

$$\mathbb{E}\left\|\frac{m}{|\Omega|} U_\Omega^T U_\Omega - I\right\| \le C m \sqrt{\frac{\log n}{|\Omega|}} \max_{1 \le k \le m} \|u_k\| \le 1,$$

where $u_k$ denotes the $k$-th row of $U$.

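Lemma 6. Let $\hat{A} = \tilde{A} V_p^T$, where $V_p$ is an $n \times t$ matrix with orthonormal columns. Then there exists a constant $C > 0$ such that

$$\mathbb{E}\left\|\frac{m}{|\Omega|} \hat{A}^T \hat{A} - A^T A\right\| \le C m \sqrt{\frac{\log n}{|\Omega|}} \max_{1 \le k \le m} \|u_k\| \, \|\Lambda\|^2 \le \|\Lambda\|^2.$$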

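Proof. Note that $A = U \Lambda V^T$ after SVD, and $\hat{A}$ is a sub-matrix of $A$ formed by the rows whose indices belong to $\Omega$. Then we have $\hat{A} = U_\Omega \Lambda V^T$, and

$$\begin{aligned}
\mathbb{E}\left\|\frac{m}{|\Omega|} \hat{A}^T \hat{A} - A^T A\right\|
&= \mathbb{E}\left\|\frac{m}{|\Omega|} V \Lambda U_\Omega^T U_\Omega \Lambda V^T - V \Lambda U^T U \Lambda V^T\right\| \\
&= \mathbb{E}\left\|V \Lambda \left(\frac{m}{|\Omega|} U_\Omega^T U_\Omega - I_m\right) \Lambda V^T\right\| \\
&\le \mathbb{E}\left\|\frac{m}{|\Omega|} U_\Omega^T U_\Omega - I_m\right\| \, \|\Lambda\|^2 \\
&\le C m \sqrt{\frac{\log n}{|\Omega|}} \max_{1 \le k \le m} \|u_k\| \, \|\Lambda\|^2.
\end{aligned} \qquad (10)$$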




The last inequality holds based on Lemma 5.

Theorem 2. We use $l_d$ and $\tilde{l}_d$ to denote the remained information of the incomplete $p \times q$ window matrix $A_w$ and of a complete $s \times t$ sub-matrix $\tilde{A}_w$ within it, where we set $p \le q$ and $s \le t$, and $d$ is the dimension of the subspace obtained by performing SVD on the sub-matrix; if the linear correlation of fingerprints is strong enough in the window matrix $A_w$, then $|l_d - \tilde{l}_d| \to 0$.

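Proof. Denote the following SVDs: $A_w = U \Lambda V^T$, $\tilde{A}_w = \tilde{U} \tilde{\Lambda} \tilde{V}^T$, $A_d = U_d \Lambda_d V_d^T$ and $\tilde{A}_d = \tilde{U}_d \tilde{\Lambda}_d \tilde{V}_d^T$. The window matrix $A_w$ is highly linearly correlated, which means that almost all the information is contained within the subspace spanned by several principal axes while the other ones can be reasonably neglected. Then we have $\frac{\sum_{i=d+1}^{p} \sigma_i^2}{\sum_{i=1}^{p} \sigma_i^2} \to 0$. Based on Taylor's expansion, we obtain

$$l_d = \sqrt{1 - \frac{\sum_{i=d+1}^{p} \sigma_i^2}{\sum_{i=1}^{p} \sigma_i^2}} \approx 1 - \frac{1}{2} \cdot \frac{\sum_{i=d+1}^{p} \sigma_i^2}{\sum_{i=1}^{p} \sigma_i^2} \approx 1 - \frac{1}{2} \cdot \frac{\|\Lambda - \Lambda_d\|_F^2}{\|\Lambda\|_F^2}.$$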

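Similarly, $\tilde{l}_d \approx 1 - \frac{1}{2} \cdot \frac{\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2}{\|\tilde{\Lambda}\|_F^2}$; thus according to Lemma 3, we obtain

$$|l_d - \tilde{l}_d| = \frac{1}{2}\left|\frac{\|\Lambda - \Lambda_d\|_F^2}{\|\Lambda\|_F^2} - \frac{\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2}{\|\tilde{\Lambda}\|_F^2}\right| = \frac{1}{2}\left|\frac{\|\Lambda - \Lambda_d\|_F^2}{\|\Lambda\|_F^2} - \frac{\frac{p}{s}\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2}{\frac{p}{s}\|\tilde{\Lambda}\|_F^2}\right| \le \frac{1}{2}\max\{S, T\}, \qquad (11)$$

where

$$S = \frac{\min\left\{\|\Lambda - \Lambda_d\|_F^2,\ \frac{p}{s}\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2\right\}}{\frac{p}{s}\|\tilde{\Lambda}\|_F^2 \, \|\Lambda\|_F^2} \left|\|\Lambda\|_F^2 - \frac{p}{s}\|\tilde{\Lambda}\|_F^2\right|,$$

$$T = \frac{1}{\max\left\{\|\Lambda\|_F^2,\ \frac{p}{s}\|\tilde{\Lambda}\|_F^2\right\}} \left|\|\Lambda - \Lambda_d\|_F^2 - \frac{p}{s}\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2\right|.$$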

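For $S$, set $\hat{A}_w = \tilde{A}_w V_p^T$, where $V_p$ is a $q \times t$ matrix with orthonormal columns and $\hat{A}_w$ is an $s \times q$ matrix. Let $Z = \frac{p}{s}\hat{A}_w^T \hat{A}_w - A_w^T A_w$, which is a $q \times q$ matrix. Note that (i) $\left|\|\Lambda\|_F^2 - \frac{p}{s}\|\tilde{\Lambda}\|_F^2\right| = \left|\sum_{i=1}^{p} \sigma_i^2 - \frac{p}{s}\sum_{i=1}^{s} \tilde{\sigma}_i^2\right|$; (ii) $\frac{p}{s}\hat{A}_w^T \hat{A}_w$ and $A_w^T A_w$ are both symmetric matrices; (iii) $\hat{A}_w$ and $\tilde{A}_w$ share the same singular values. Thus according to Lemmas 4 and 6,

$$\|Z\|_F^2 = \sum_{i=1}^{p} \lambda_i^2(Z) \ge \sum_{i=1}^{p} \left(\lambda_i(A_w^T A_w) - \frac{p}{s}\lambda_i(\hat{A}_w^T \hat{A}_w)\right)^2 \ge \frac{1}{p}\left(\sum_{i=1}^{p} \sigma_i^2 - \frac{p}{s}\sum_{i=1}^{s} \tilde{\sigma}_i^2\right)^2 = \frac{1}{p}\left|\|\Lambda\|_F^2 - \frac{p}{s}\|\tilde{\Lambda}\|_F^2\right|^2.$$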

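Then

$$S \le \frac{\min\left\{\|\Lambda - \Lambda_d\|_F^2,\ \frac{p}{s}\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2\right\}}{\frac{p}{s}\|\tilde{\Lambda}\|_F^2} \sqrt{n}. \qquad (12)$$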

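Now we focus on $T$. If $\|\Lambda\|_F^2 \ge \frac{p}{s}\|\tilde{\Lambda}\|_F^2$, then

$$T \le \frac{\|\Lambda - \Lambda_d\|_F^2}{\|\Lambda\|_F^2} + \frac{p}{s} \cdot \frac{\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2}{\|\Lambda\|_F^2} \le \frac{\|\Lambda - \Lambda_d\|_F^2}{\|\Lambda\|_F^2} + \frac{\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2}{\|\tilde{\Lambda}\|_F^2}. \qquad (13)$$

Similarly, if $\|\Lambda\|_F^2 < \frac{p}{s}\|\tilde{\Lambda}\|_F^2$, we also have Inequality (13). Then if the linear correlation of fingerprints in $A_w$ is strong, $\|\Lambda - \Lambda_d\|_F^2$ and $\|\tilde{\Lambda} - \tilde{\Lambda}_d\|_F^2$ approach zero, which makes both $S$ and $T$ approach zero. According to Eqn. (11), we prove that $|l_d - \tilde{l}_d| \to 0$, which means that using $\tilde{A}_w$ to estimate the subspace dimension $d$ for $A_w$ does not incur much deviation in remained information.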

5.3 Sliding Window Algorithm

Now we present our sliding window algorithm as Algorithm 2. Its input is the sampled fingerprint data set while its output is the fully predicted fingerprint database of our target region. Algorithm 2 mainly consists of 3 loops: (i) each round of the outermost loop (from Line 1 to Line 20) is a complete scanning of the whole target region with the same $A_w$; (ii) each round of the intermediate loop (from Line 4 to Line 18) represents the scanning of different rows of $A$; (iii) the innermost loop (from Line 5 to Line 15) completes the prediction of every column in the row specified by the intermediate loop.

Algorithm 2: Sliding Window Based Fingerprint Prediction Algorithm

Input: Sampled matrix $P_\Omega(A)$; initial subspace dimension $d_0$.

Output: Estimated matrix $\hat{A}$.

1: while $t < T$ do
2:   Construct the scanning matrix $A_w$, with $a_t$ rows and $b_t$ columns (i.e., $A_w \in \mathbb{R}^{a_t \times b_t}$).
3:   Set row and col as the current position of $A_w$ (represented by an element in $A$ at (row, col)).
4:   while $A_w$ has not scanned all the rows of $A$ do
5:     while $A_w$ has not scanned all the columns in the current row do
6:       Scan a row, check whether the current scanning matrix satisfies the conditions in Lemma 7.
7:       if the conditions in Lemma 7 hold then
8:         Conduct SSOA on $A_w$, and obtain the estimation $\hat{A}_w$.
9:       end if
10:      if some of the elements in $\hat{A}_w$ are not in a reasonable range then
11:        Set these elements as 0.
12:      end if
13:      Set $\tau_c$ as a random real number in $(0, 1)$.
14:      Update col ← col + $\tau_c b_t$.
15:    end while
16:    Set $\tau_r$ as a random real number in $(0, 1)$.
17:    Update row ← row + $\tau_r a_t$.
18:  end while
19:  Update $a_{t+1}$, $b_{t+1}$ and $d_{t+1}$ from $a_t$, $b_t$ and $d_t$.
20: end while

Several important details require further explanation. First, in Line 6 we need to check whether the scanning window $A_w$ can be predicted. We give Lemma 7 below to show the conditions under which the prediction by SSOA in $A_w$ is more likely to be reasonable. Second, in Lines 13 to 14 and Lines 16 to 17, we set two random real numbers $\tau_c$ and $\tau_r$. The goal of this setting is to guarantee that the sliding window scans the whole target region without incurring a much higher time complexity: setting $\tau_c$ and $\tau_r$ larger than 1 would leave some columns and rows unscanned, while advancing col and row one by one would cost too much time.
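The random-step scanning of Lines 13 to 17 of Algorithm 2 can be sketched as a generator of window positions. The function name `scan_positions`, the truncation of the real-valued position to an integer cell index, and the toy sizes are assumptions made for this sketch.

```python
import random

def scan_positions(n_rows, n_cols, a_t, b_t, rng):
    """Yield top-left (row, col) cells visited by an a_t x b_t window.

    Each move advances by a random fraction tau in [0, 1) of the window
    size, so that consecutive windows never leave a gap between them.
    """
    row = 0.0
    while row < n_rows:
        col = 0.0
        while col < n_cols:
            yield int(row), int(col)
            col += rng.random() * b_t   # Line 14: col <- col + tau_c * b_t
        row += rng.random() * a_t       # Line 17: row <- row + tau_r * a_t

positions = list(scan_positions(20, 30, a_t=4, b_t=5, rng=random.Random(0)))
```

Since every step is shorter than the window size, each new window touches or overlaps the previous one, which is what lets estimates from the previous location feed the current prediction.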

Now we give Lemma 7 and its proof in detail.

Lemma 7. Assume that $A_w$ is of rank $d$. Set $A_w$ with $a_t$ rows and $b_t$ columns (here we set $a_t \le b_t$), and in the $i$-th column of $A_w$ there are $m_i$ unsampled fingerprints. Set $d$ as the dimension of the subspace. Then the prediction has finite solutions under the following necessary conditions:

• $d \le a_t$ and $d \le m_i$, $\forall i$;
• $(a_t - d)(b_t - d) \ge \sum_{i=1}^{b_t} m_i$.

Proof. Obviously, if $d > b_t$, then there are no restrictions on these $b_t$ columns, since the rank of $A_w$ is less than the subspace dimension $d$. So there are infinitely many solutions to complete the missing parts inside $A_w$. The same reasoning rules out $d > a_t$. Thus $d \le a_t$.

Then since $A_w$ is of rank $d$, its columns come from the same $d$-dimensional subspace. Therefore we have the following expressions based on linear algebra:

$$\begin{cases}
x_{d+1} = k_{d+1,1} x_1 + k_{d+1,2} x_2 + \ldots + k_{d+1,d} x_d \\
x_{d+2} = k_{d+2,1} x_1 + k_{d+2,2} x_2 + \ldots + k_{d+2,d} x_d \\
\quad\vdots \\
x_n = k_{n,1} x_1 + k_{n,2} x_2 + \ldots + k_{n,d} x_d
\end{cases} \qquad (14)$$

where $x_i$ is the $i$-th column of $A_w$, and $k_{i,j}$ is an unknown coefficient.

Before going further, we should note that one can easily set up another equation, for example, $x_{d+2} = k'_{d+2,1} x_1 + k'_{d+2,2} x_2 + \ldots + k'_{d+2,d-1} x_{d-1} + k'_{d+2,d+1} x_{d+1}$. However, $x_{d+1}$ is itself a linear combination of $x_1$ to $x_d$, so this newly added equation is not independent and brings no extra information.

Focus on the first equation. Since there are $m_i$ unknown elements in $x_i$, the first equation has $\sum_{i=1}^{d+1} m_i + d$ unknown variables. The second equation contains $d$ new coefficients and $m_{d+2}$ unknown elements in $x_{d+2}$; hence it adds $d + m_{d+2}$ new unknown variables. Likewise, adding the $j$-th new equation introduces $m_{d+j} + d$ new unknown variables. In total we then have $\sum_{i=1}^{b_t} m_i + (b_t - d)d$ unknown variables.

Meanwhile, note that each vector equation in Eqn. (14) contains $a_t$ element-wise equations. Thus there are $a_t(b_t - d)$ equations in total.

So if $a_t(b_t - d) \ge d(b_t - d) + \sum_{i=1}^{b_t} m_i$, then the equations have finite solutions. This also implies that the sampling rate inside $A_w$ should be at least $1 - \frac{(a_t - d)(b_t - d)}{a_t b_t} = \frac{(a_t + b_t - d)d}{a_t b_t}$.

Remark: In Lemma 7, note that we only give two necessary conditions for Eqn. (14) to have finite solutions. They can ensure finitely many solutions for the prediction, but not a single determined accurate solution. Therefore, these two conditions cannot guarantee absolute accuracy. However, they help improve accuracy since they largely eliminate unreasonable outcomes and confine the reasonable predictions to a finite set.
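The counting argument in the proof of Lemma 7 translates into a simple check that could be run before invoking SSOA on a window. The sketch below covers only the equation-versus-unknown count and the implied minimum sampling rate; the function names are illustrative, and the per-column condition of the lemma is intentionally not modelled.

```python
def counting_condition(a_t, b_t, d, missing_per_column):
    """Check (a_t - d)(b_t - d) >= total number of unsampled entries."""
    if len(missing_per_column) != b_t:
        raise ValueError("one missing-count per column is expected")
    if not (d <= a_t <= b_t):
        return False
    return (a_t - d) * (b_t - d) >= sum(missing_per_column)

def min_sampling_rate(a_t, b_t, d):
    """Minimum sampling rate (a_t + b_t - d) d / (a_t b_t) implied by the count."""
    return (a_t + b_t - d) * d / (a_t * b_t)

dense_enough = counting_condition(10, 12, 2, [5] * 12)    # 60 missing entries
too_sparse = counting_condition(10, 12, 2, [8] * 12)      # 96 missing entries
rate = min_sampling_rate(10, 12, 2)
```

For a 10 x 12 window with $d = 2$, at most 80 entries may be missing, which corresponds to a minimum sampling rate of one third.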

6 EXPERIMENTAL RESULTS

We conduct experiments with real data sampled by a network operator in two cities, where the data sets are sampled within 48 hours, covering 2.2 km² and 69.8 km² areas in the two cities and containing around 60,000 and 8,820,000 data records, respectively. Each data record contains the GPS location information in terms of latitude and longitude, the corresponding RSRP, the time the measurement is performed, and some other irrelevant parameters.

We will first show the results with the smaller data set, in order to verify that the proposed subspace identification approach outperforms other frequently used matrix completion algorithms [26]–[29]. Then we will show the results with the larger data set to examine the performance of our proposed SSOA and dynamic sliding window mechanism in fingerprints prediction in the large-scale scenario. To verify our prediction mechanism, we perform localization with the predicted fingerprints and see whether the accuracy and reliability can meet the requirement of E911; the results are compared with a series of prior-art algorithms listed in Table 1. When evaluating the localization accuracy, we use the GPS location information as the ground truth, and the localization errors are the deviations of the estimated location from the ground truth. Moreover, using the larger data set with a similar methodology, we validate our theoretical result that the convergence rate of our proposed SSOA exceeds that of the Grassmann approach.

TABLE 1
Algorithms in Comparison

Algorithm                          Source
Cell-ID (CID)                      [16]
Gaussian Mixture Model (GMM)       Chakraborty et al. [17]
Tensor Completion (TC)             Liu et al. [9]
Sparsity Rank SVD (SRSVD)          Gu et al. [30]
Bayesian Sparse Learning (BSL)     Nikitaki et al. [31]
Alternate Minimization (AM)        Jain et al. [32]

6.1 Experiments on Small Data Set

Overview of Fingerprints Prediction Results: We first illustrate the fingerprints prediction results obtained by our proposed mechanism and then compare our mechanism with others in terms of different metrics. The top sub-figure in the first column of Fig. 6 shows the data sampled on the main roads of the target area. This data set also contains fingerprints obtained from branches of those main roads. We use the proposed mechanisms to predict fingerprints on those branches, and compare the predictions with the real data values. The distribution of RSRP fingerprints is shown in the bottom sub-figure in the first column. The rest of the sub-figures illustrate the process of matrix completion with our proposed SSOA combined with the dynamic sliding window scanning. The top sub-figure in the second column shows the locations on those branches of the main roads where real data are sampled. The following sub-figures show the distribution of the completion errors in dBm after each round of scanning with the sliding window. It can be seen from the results that the fingerprints can be predicted, and the average error and standard deviation are 5.1 dBm and 3.5 dBm respectively.

Algorithms in Comparison: We compare the performance of the SSOA with that of the following matrix completion methods. Singular value thresholding (SVT) is a variant of SVD, where a threshold $\tau$ determines the rank of the rectangular diagonal matrix. Singular value partition (SVP) is another variant of SVD, which partitions the available observations into subsets and performs completion on each subset. The forward-backward algorithm for matrix completion (FBMC) formulates the matrix completion problem as a convex optimization problem, minimizing an objective function that combines the completion error and the rank of the estimated matrix. Iterative reweighted least squares (sIRLSp) is a family of algorithms that solve a least-squares minimization problem with a matrix as the decision variable.

Performance Metrics: To evaluate the performance of each mechanism, we use a subset of the whole data set to form the training set, based on which the fingerprints in other areas will be predicted, and the rest of the data form a test set denoted by $\Omega_{test}$, where the data in the set are the ground truth for testing the predicted fingerprints. All the predicted fingerprints form a set denoted by $\Omega_c$, but note that $\Omega_{test}$ and $\Omega_c$ are not necessarily the same, since we may predict fingerprints of some area that is not surveyed by the technicians. The performance of the mechanisms mentioned above is evaluated with metrics including $ErExp(A)$, $ErStd(A)$, $NSE(A)$ and $ITR(A)$, where detailed definitions are available in [41].

Fig. 6. Completion process with small data set.

Prediction Performance 1: Fig. 7 shows the performance under different proportions of real data in the test set $\Omega_{test}$, which is termed the sampling rate. If we use only the data obtained from the main roads to perform prediction, the sampling rate is 0; if we additionally take 50% of the data in $\Omega_{test}$ as training data, the sampling rate is 50%. It is clear that the subspace identification approach, whether over the Stiefel manifold or the Grassmann manifold, outperforms the other mechanisms, except for the $ErStd$ metric. Although the subspace identification approach results in a higher standard deviation, the expectation of the errors $ErExp(A)$ is smaller than that of the others. This is because the real data set itself fluctuates dramatically, but the results of the other mechanisms are unable to reflect such variation.

Prediction Performance 2: Fig. 8 shows the performance under different ways of gridding. We divide the region into cells with different edge lengths, which is termed the resolution. Based on the operator's localization accuracy requirement, the target area may be divided into grids with different resolutions, which in essence changes the size of the matrix $A$. In Fig. 8, the resolution 5000 means that the target area is divided into 5000 equally sized small cells along each edge. We only use the main-road data as the training set to predict fingerprints on the branch roads. The results in Fig. 8 show that our proposed mechanisms work well under different ways of gridding.

Remark: Note that both the Stiefel-manifold and Grassmann-manifold based mechanisms perform well, with only slight differences in the performance metrics above. However, we will show the significantly higher efficiency of the Stiefel-manifold based mechanism on the larger data set, which coincides with its faster convergence rate shown in Theorem 1 and makes it more practical than the Grassmann-manifold based mechanism for large-scale fingerprints prediction.

6.2 Experiments on Large Data Set

Overview of Fingerprints Prediction Results: The map of the city where the data were sampled is shown in Fig. 9(a); the red dots on the map represent the locations of the BSs. We use fingerprints collected along the main roads of the city to predict the fingerprints on the branching roads. The spatial distribution of fingerprints on the main roads is shown in Fig. 9(b); these account for only 6.7% of the whole region. After 7 iterations of the sliding window based prediction mechanism with SSOA, we obtain the prediction result shown in Fig. 9(c). The predicted region accounts for 73.2% of the whole region, with fingerprints predicted in most of the locations. To examine the prediction accuracy, we compare the predicted results with the ground truth, and show the corresponding error of each prediction in Fig. 9(d), where different colors represent different levels of error in dBm. We find that the average and median prediction errors are 8.46 dBm and 7.09 dBm respectively.

Local Performance of Fingerprints Prediction: We here show the local performance of fingerprints prediction in Fig. 9(d). Recall that the sliding window mechanism can be viewed to some extent as a filtering process, and the error may increase because each round is influenced by the estimations obtained in the previous iteration. This is reflected in Fig. 9(d), where dots with deep color usually appear in clusters. The phenomenon can be caused by multiple factors when executing the sliding window algorithm, such as the initial state of the sliding window, the spatial distribution of samples, and the noise of samples. In the experiments, we randomly sample a 250 m × 300 m sub-area over the city region shown in Fig. 9(d) multiple times, and examine the prediction performance within the window each time. Then the average and distribution of the errors can be obtained.

Figure 9(e) shows the average, maximum and minimum prediction errors when we select different numbers of sub-areas to examine. The average prediction error (the bars) fluctuates slightly around 8.5 dBm as the number of windows varies from 10 to 100, with a standard deviation of 0.17 dBm. This indicates the stability of the average prediction performance across sub-areas. We can also see that the minimum value is generally much closer to the average error than the maximum value, indicating that good predictions outnumber bad ones. Figure 9(f) shows the cumulative distribution function (CDF) of prediction errors. The CDFs under different numbers of samplings approximately overlap with each other, indicating that the prediction performance in each sub-area is stable.
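The summary statistics and the empirical CDF used in this kind of evaluation can be computed with the standard library alone. The sketch below uses made-up error values purely to show the computation; it does not reproduce any reported result.

```python
import statistics

def error_summary(errors):
    """Mean, median and population standard deviation of prediction errors."""
    return {
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "std": statistics.pstdev(errors),
    }

def empirical_cdf(errors, threshold):
    """Fraction of errors that do not exceed the threshold."""
    return sum(e <= threshold for e in errors) / len(errors)

errors_dbm = [2.0, 4.0, 6.0, 8.0]   # hypothetical absolute errors in dBm
summary = error_summary(errors_dbm)
share_within_6 = empirical_cdf(errors_dbm, 6.0)
```

Evaluating `empirical_cdf` on a range of thresholds yields the points of a CDF curve, and repeating the computation over randomly drawn sub-areas gives the per-window averages.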

Positioning Results with Predicted Fingerprints: We here validate that the predicted fingerprints can be utilized for location estimation with accuracy and reliability satisfying the E911 requirement. We first grid the entire area into 871×663 square cells, each with an edge length of 11m. As mentioned above, the data set contains data from 611 BSs, but a number of base stations are only observed at a couple of locations. Thus we first sort the BSs




Fig. 7. Metrics with different sampling rates on branch roads.

Fig. 8. Metrics with different resolutions of the area.

Fig. 9. Experimental results on 69.8km² dataset. Panels: (a) Target region and BSs distribution; (b) Fingerprints on main roads; (c) Predicted fingerprints; (d) Prediction errors distribution (color scale from 0 to 40 dBm); (e) Average prediction errors; (f) CDF of prediction errors; (g) CDF of localization errors; (h) Convergence comparison.

Fig. 10. Comparison of the SSOA with other matrix completion methods (CDF versus localization error in m, from 0 to 600; curves: SSOA, AltMinComplete, SRSVD, Tensor, Bayesian).

according to how frequently they are observed across all locations of the area, and select the top 135 BSs; the corresponding data account for 96% of the entire data set. To perform localization, we choose those cells that have both measured and predicted fingerprints. We construct a fingerprints database with the predicted fingerprints, and use the real data as the user’s reported data for localization. According to the statistics of the data set, a user’s mobile device can normally observe 1 to 12 BSs, and our preliminary experimental results show that the localization accuracy is unacceptable if the user reports a fingerprint with respect to only one BS; therefore, we consider only the cells that can observe at least two BSs.
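The BS selection and cell filtering steps can be illustrated with a short sketch. It assumes that the raw data are available as a list of `(cell_index, bs_id)` records, one per RSS measurement; this layout and the function names are hypothetical and are introduced only for illustration.

```python
from collections import Counter


def select_top_base_stations(observations, k):
    """Pick the k most frequently observed BSs and report their data share.

    observations: list of (cell_index, bs_id) pairs, one per RSS record.
    Returns (selected_bs_ids, fraction_of_records_covered).
    """
    counts = Counter(bs for _, bs in observations)
    top = counts.most_common(k)
    selected = [bs for bs, _ in top]
    covered = sum(n for _, n in top)
    return selected, covered / len(observations)


def cells_with_min_bs(observations, selected, min_bs=2):
    """Cells that observe at least min_bs of the selected BSs."""
    keep = set(selected)
    per_cell = {}
    for cell, bs in observations:
        if bs in keep:
            per_cell.setdefault(cell, set()).add(bs)
    return sorted(c for c, s in per_cell.items() if len(s) >= min_bs)
```

With `k=135` on the data set described above, the returned fraction would correspond to the reported 96% coverage, and `cells_with_min_bs` would keep only the cells usable for localization.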

With our predicted fingerprints, we compare the performance of fingerprinting localization with that of the Cell ID (CID) method and the Gaussian Mixture Model (GMM) based method [17]. The basic idea of the CID approach is to estimate the user’s location as the geometric center of all BSs the user can observe. The GMM method estimates the location of a reported fingerprint using a GMM constructed from the Gaussian radio propagation model, and can therefore also be regarded as a method to predict a given fingerprint’s location.
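As a point of reference, the CID baseline reduces to a centroid computation. The sketch below assumes planar BS coordinates in meters and interprets the geometric center as the arithmetic mean of those coordinates; the function name is hypothetical.

```python
def cid_estimate(bs_positions):
    """Cell-ID style estimate: centroid of the observed BS coordinates.

    bs_positions: list of (x_m, y_m) tuples for the BSs the user observes.
    """
    if not bs_positions:
        raise ValueError("at least one observed BS is required")
    n = len(bs_positions)
    x = sum(p[0] for p in bs_positions) / n
    y = sum(p[1] for p in bs_positions) / n
    return (x, y)
```

Because the estimate depends only on where the observed BSs are, an uneven BS layout pulls the centroid away from the user regardless of the measured signal strengths.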

We perform localization for around 4500 test samples with each of the three methods. For SSOA and the Gaussian mixture method, we use the predicted fingerprints and part of the sampled fingerprints to form a fingerprints database, and use the rest of the sampled fingerprints as the test set. Given fingerprints in the test set, we run the regular localization algorithm to estimate the corresponding locations; the localization error is the Euclidean distance between the user’s estimated location and the ground truth. We draw the CDF of localization errors for each method, as shown in Fig. 9(g). We use the E911 localization requirement as the benchmark to evaluate the three localization methods, which is “within 100m for 67% and within 300m for 90%”. We can see that the fingerprinting method using our fingerprints predicted by SSOA achieves “within 100m for 71% and within 300m for 98%”, the CID method achieves “within 100m for 34% and within 300m for 93%”, and the GMM method achieves “within 100m for 14% and within 300m for 75%”. This is because CID’s performance is impacted by the unbalanced distribution of BSs, and GMM’s assumption that the received signal strength at a given location is a multivariate Gaussian distributed random variable [17] is not always realistic, especially in urban environments with more serious shadowing and multipath effects.
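The benchmark check itself is simple to state in code. The following sketch assumes planar coordinates in meters; the function names and default thresholds are introduced here for illustration, with the thresholds taken from the requirement quoted above.

```python
import math


def localization_error(estimate, truth):
    """Euclidean distance in meters between estimated and true positions."""
    return math.hypot(estimate[0] - truth[0], estimate[1] - truth[1])


def fraction_within(errors, radius_m):
    """Fraction of localization errors not exceeding radius_m."""
    return sum(1 for e in errors if e <= radius_m) / len(errors)


def meets_e911(errors, r1=100.0, p1=0.67, r2=300.0, p2=0.90):
    """True if both benchmark conditions hold for the given errors."""
    return (fraction_within(errors, r1) >= p1
            and fraction_within(errors, r2) >= p2)
```

A method must satisfy both conditions at once, which is why a method can cover 93% of samples within 300m and still fail the benchmark when only 34% fall within 100m.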

Convergence Rate: Our convergence analysis reveals that the proposed SSOA mechanism converges faster than the Grassmann-manifold optimization algorithm, and we now provide experimental results to validate this claim. We consider the entire area as a giant matrix, and use 40% of the data as the training set to predict the rest of the data. We let the SSOA and the Grassmann-manifold optimization algorithm iterate 60,000 times and examine the prediction error after each iteration. The prediction error is found by comparing the predicted data with the real data in the other 60% of the data set, and each error is represented as a point in Fig. 9(h). The figure shows that the average prediction error using SSOA reaches around 11dBm within 1000 iterations, while the error using the Grassmann method only reaches around 12.5dBm after 30,000 iterations. It takes 75min for the Grassmann method to reach the 12.5dBm error, while our proposed SSOA consumes only around 2min to achieve the 11dBm error.
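The evaluation protocol behind such a convergence curve can be sketched independently of the optimizer. In the sketch below, `step` and `predict` are hypothetical callables standing in for one iteration of any iterative predictor and for its current estimate of an entry; they are not the SSOA or Grassmann updates, and the use of the mean absolute difference as the error measure is an assumption.

```python
import random


def holdout_split(entries, train_fraction=0.4, seed=0):
    """Shuffle observed entries and split them into training and test parts.

    entries: list of ((row, col), rss_dbm) pairs (hypothetical layout).
    """
    rng = random.Random(seed)
    shuffled = entries[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mean_abs_error(predict, test):
    """Mean absolute difference between predictions and held-out values."""
    return sum(abs(predict(idx) - v) for idx, v in test) / len(test)


def error_trace(step, predict, test, n_iters):
    """Run step() n_iters times, recording the held-out error after each."""
    trace = []
    for _ in range(n_iters):
        step()
        trace.append(mean_abs_error(predict, test))
    return trace
```

Plotting the trace returned by `error_trace` for two optimizers on the same split would give a comparison of the kind shown in Fig. 9(h).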

Comparison to other Matrix Completion Approaches: We note that efforts have been dedicated to utilizing the matrix completion approach for predicting RF fingerprints in the indoor environment. Although they consider a different localization environment, such mechanisms in essence share some similarities with the proposed one. We here compare the performance of the following four algorithms with that of the proposed SSOA approach in the outdoor scenario: AltMinComplete minimizes the fingerprints estimation error by optimizing the two unitary matrices in the SVD process in an alternating manner [32]; Tensor combines the signal values with respect to different BSs to form a tensor, which can be regarded as a 3-D matrix, in order to complete the unavailable fingerprints [9]; SRSVD reduces the possible interference in Wi-Fi signals, which may have a negative impact on fingerprints completion [30]; the Bayesian sparse learning approach conducts localization by exploiting the low-rank and sparsity properties of the sampled signals [31].

The localization experiments are conducted in the same way as described above, and the results are shown in Fig. 10. It can be seen that the proposed SSOA mechanism outperforms all the other ones. AltMinComplete presents the second best performance, closest to SSOA, because it directly optimizes the unitary matrices in SVD; however, there is no guarantee that the alternating optimization approach it adopts achieves the optimum. Tensor is realized in a similar manner as SSOA but does not directly optimize U; instead, it predicts the subspace that U spans by randomly selecting fingerprints that are sampled. SRSVD is a slight variation of the classical SVD-based approach. Bayesian sparse learning considers the potential sparsity properties of RF signals, but it relies on setting a slew of pre-defined parameters (variation of signal, precision of measuring) and on the assumption that the signal follows a Gaussian distribution; these are not always practical in real localization scenarios, so it presents the worst performance.

7 CONCLUSION AND FUTURE WORK

This paper has proposed to utilize the subspace identification approach to predict fingerprints in unsurveyed areas with available fingerprints sampled in the nearby areas. We have formulated the fingerprints prediction problem as the problem of finding the optimal subspace over the Stiefel manifold, and proposed a streamlined Stiefel-manifold optimization algorithm with a fast convergence rate for the fingerprints prediction scenario. Moreover, we have proposed a sliding window mechanism to deal with the practical fingerprints prediction scenario, where the fingerprints are unevenly distributed in the vast area. Combining the two proposed mechanisms enables an efficient method for large-scale fingerprints prediction at the km² level. Further, we have validated our theoretical analysis and proposed mechanisms by conducting experiments with real mobile data sets sampled in two cities. The results show that the localization accuracy and reliability exceed the E911 requirement of the FCC, and that the proposed mechanism converges faster than the Grassmann approach, which follows a similar methodology. Our future work is to apply the proposed mechanism to the indoor localization scenario.

8 ACKNOWLEDGEMENT

The work in this paper is supported by the National Natural Science Foundation of China (NSFC) 61872233, the National Key Research and Development Program of China 2017YFB1003000, and NSFC 61572319, 61829201, 61532012, 61325012, 61428205.

REFERENCES

[1] Enhanced 911 - Wireless Services, Online: https://www.fcc.gov/general/enhanced-9-1-1-wireless-services.

[2] Ericsson White Paper, Positioning with LTE, Online: http://www.sharetechnote.com/Docs/WP-LTE-positioning.pdf, Sept. 2011.

[3] European Global Navigation Satellite Systems Agency, How to Enable Better Location for Emergency Calls: Galileo and 112, Online: https://www.gsa.europa.eu/news/how-enable-better-location-emergency-calls-galileo-and-112

[4] Rohde & Schwarz, LTE Location Based Services Technology Introduction White Paper, Online: http://www.rohde-schwarz-wireless.com/documents/LTELBSWhitePaper RohdeSchwarz.pdf, Apr. 2013.

[5] Spirent, White Paper: An Overview of LTE Positioning, Online: https://www.spirent.com/∼/media/White%20Papers/Mobile/LTE LBS White Paper 2012.pdf, Feb. 2012.

[6] 3GPP Release 8, Online: http://www.3gpp.org/specifications/releases/72-release-8

[7] 3GPP Release 9, Online: http://www.3gpp.org/specifications/releases/71-release-9

[8] Huawei, The second phase of LTE-Advanced LTE-B: 30-fold capacity boosting to LTE, Online: www.huawei.com/ilink/en/download/HW259010, Apr. 2013.

[9] X. Liu, S. Aeron, V. Aggarwal, X. Wang and M. Wu, “Adaptive Sampling of RF Fingerprints for Fine-Grained Indoor Localization,” IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2411–2423, Oct. 2016.




[10] D. Milioris, G. Tzagkarakis and A. Papakonstantinou, “Low dimensional signal-strength fingerprint-based positioning in wireless LANs,” Ad Hoc Netw., vol. 12, pp. 100–114, 2014.

[11] S. Nikitaki, G. Tsagkatakis and P. Tsakalides, “Efficient multichannel signal strength based localization via matrix completion and Bayesian sparse learning,” IEEE Trans. Mobile Comput., vol. 14, no. 11, pp. 2244–2256, Nov. 2015.

[12] D. Milioris, M. Bradonjic and P. Muhlethaler, “Building complete training maps for indoor location estimation,” in Proc. IEEE INFOCOM, 2015, pp. 75–76.

[13] R. Margolies, R. Becker, S. Byers, S. Deb, R. Jana, S. Urbanek and C. Volinsky, “Can you find me now? Evaluation of network-based localization in a 4G LTE network,” in Proc. IEEE INFOCOM, 2017.

[14] A. Ray, S. Deb and P. Monogioudis, “Location of LTE measurement records with missing information,” in Proc. IEEE INFOCOM, 2016.

[15] R. Margolies, A. Sridharan, V. Aggarwal, R. Jana, N. Shankaranarayanan, V. A. Vaishampayan, and G. Zussman, “Exploiting mobility in proportional fair cellular scheduling: Measurements and algorithms,” IEEE/ACM Trans. on Netw., vol. 24, no. 1, pp. 355–367, 2016.

[16] Cell ID based localization, Online: https://combain.com/about/about-positioning/cell-id-positioning/

[17] A. Chakraborty, L. E. Ortiz and S. R. Das, “Network-side positioning of cellular-band devices with minimal effort,” in Proc. IEEE INFOCOM, 2015.

[18] S. C. Ergen, H. S. Tetikol, M. Kontik, R. Sevlian, R. Rajagopal, and P. Varaiya, “RSSI-fingerprinting-based mobile phone localization with route constraints,” IEEE Trans. Veh. Technol., vol. 63, no. 1, pp. 423–428, 2014.

[19] M. Y. Chen, T. Sohn, D. Chmelev, D. Haehnel, J. Hightower, J. Hughes, A. LaMarca, F. Potter, I. Smith and A. Varshavsky, “Practical metropolitan-scale positioning for GSM phones,” in Proc. ACM UbiComp, 2006.

[20] P. Nurmi, S. Bhattacharya, and J. Kukkonen, “A grid-based algorithm for on-device GSM positioning,” in Proc. ACM UbiComp, 2010.

[21] G. Sun, J. Chen, W. Guo and K. J. R. Liu, “Signal processing techniques in network-aided positioning: a survey of state-of-the-art positioning designs,” IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 12–23, 2005.

[22] T. Wigren, “Adaptive Enhanced Cell-ID Fingerprinting Localization by Clustering of Precise Position Measurements,” IEEE Transactions on Vehicular Technology, vol. 56, no. 5, pp. 3199–3209, Sept. 2007.

[23] L. Shi and T. Wigren, “AECID Fingerprinting Positioning Performance,” in Proc. IEEE GLOBECOM, 2009.

[24] Pew Research Center, “10 facts about smartphones as the iPhone turns 10,” Research Report, Online: http://www.pewresearch.org/fact-tank/2017/06/28/10-facts-about-smartphones/.

[25] A. Edelman, T. A. Arias and S. T. Smith, “The geometry of algorithms with orthogonality constraints,” SIAM Journal on Matrix Analysis and Applications, vol. 20, no. 2, pp. 303–353, 1998.

[26] J. F. Cai, E. J. Candès and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.

[27] P. Jain and P. Netrapalli, “Fast Exact Matrix Completion with Finite Samples,” Online: https://arxiv.org/pdf/1411.1087.pdf

[28] D. Lazzaro, “A nonconvex approach to low-rank matrix completion using convex optimization,” Numerical Linear Algebra with Applications, vol. 23, no. 5, pp. 801–824, 2016.

[29] K. Mohan and M. Fazel, “Iterative reweighted algorithms for matrix rank minimization,” Journal of Machine Learning Research, vol. 13, no. 1, pp. 3441–3473, 2012.

[30] Z. Gu, Z. Chen, Y. Zhang, Y. Zhu, M. Lu, and A. Chen, “Reducing fingerprint collection for indoor localization,” Computer Communications, vol. 83, pp. 56–63, 2016.

[31] S. Nikitaki, G. Tsagkatakis, and P. Tsakalides, “Efficient multi-channel signal strength based localization via matrix completion and Bayesian sparse learning,” IEEE Transactions on Mobile Computing, vol. 14, no. 11, pp. 2244–2256, 2015.

[32] P. Jain, P. Netrapalli, and S. Sanghavi, “Low-rank matrix completion using alternating minimization,” in Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pp. 665–674, 2013.

[33] Wikipedia, Online: https://en.wikipedia.org/wiki/Low-rankapproximation.

[34] Z. Lin, M. Chen and Y. Ma, “The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices,” Online: https://arxiv.org/pdf/1009.5055.pdf.

[35] L. Balzano, R. Nowak and B. Recht, “Online identification and tracking of subspaces from highly incomplete information,” in Communication, Control, and Computing, 2010, pp. 704–711.

[36] L. Balzano and S. J. Wright, “Local convergence of an algorithm for subspace identification from partial data,” Foundations of Computational Mathematics, vol. 15, no. 5, pp. 1279–1314, 2015.

[37] D. Zhang and L. Balzano, “Global convergence of a Grassmannian gradient descent algorithm for subspace estimation,” Online: https://arxiv.org/pdf/1506.07405.pdf.

[38] E. J. Candes and J. Romberg, “Sparsity and Incoherence in Compressive Sampling,” Inverse Problems, vol. 23, no. 3, pp. 969–985, 2007.

[39] M. E. Kilmer, K. Braman, N. Hao and R. C. Hoover, “Third-Order Tensors as Operators on Matrices: A Theoretical and Computational Framework with Applications in Imaging,” SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 1, pp. 148–172, 2013.

[40] S. Fisk, “A Note on Weyl’s Inequality,” American Mathematical Monthly, vol. 104, no. 5, pp. 257–258, 1997.

[41] Technical Report, Online: https://www.dropbox.com/s/nq72vr7x3xvbpzk/subspace.pdf?dl=0.

Xiaohua Tian (S’07-M’11) received his B.E. and M.E. degrees in communication engineering from Northwestern Polytechnical University, Xian, China, in 2003 and 2006, respectively. He received the Ph.D. degree in the Department of Electrical and Computer Engineering (ECE), Illinois Institute of Technology (IIT), Chicago, in Dec. 2010. Since Mar. 2011, he has been with the School of Electronic Information and Electrical Engineering in Shanghai Jiao Tong University, and now is an Associate Professor with the title of SMC-B scholar. He serves as the column editor of IEEE Network Magazine and the guest editor of International Journal of Sensor Networks (2012). He also serves as the TPC member for IEEE INFOCOM 2014-2017, best demo/poster award committee member of IEEE INFOCOM 2014, TPC co-chair for IEEE ICCC 2014-2016, TPC co-chair for the 9th International Conference on Wireless Algorithms, Systems and Applications (WASA 2014), TPC member for IEEE GLOBECOM 2011-2016, and TPC member for IEEE ICC 2013-2016.

Xinyu Wu is currently pursuing his B.E. degree in Electronics Engineering from Shanghai Jiao Tong University, China and is expected to graduate in Jun. 2018. His recent research work is on mobile social computing.

Hao Li is currently pursuing his M.S. degree in Electronics Engineering from Shanghai Jiao Tong University, China and is expected to graduate in Mar. 2021. His recent research work is about wireless sensing networks.

Xinbing Wang received the B.S. degree (with hons.) from the Department of Automation, Shanghai Jiaotong University, Shanghai, China, in 1998, and the M.S. degree from the Department of Computer Science and Technology, Tsinghua University, Beijing, China, in 2001. He received the Ph.D. degree, major in the Department of Electrical and Computer Engineering, minor in the Department of Mathematics, North Carolina State University, Raleigh, in 2006. Currently, he is a professor in the Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai, China. Dr. Wang has been an associate editor for IEEE/ACM Transactions on Networking and IEEE Transactions on Mobile Computing, and a member of the Technical Program Committees of several conferences including ACM MobiCom 2012, ACM MobiHoc 2012-2014, and IEEE INFOCOM 2009-2017.
