
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS 1

Passive Indoor Localization Based on CSI and Naive Bayes Classification

Zhefu Wu, Qiang Xu, Jianan Li, Chenbo Fu, Qi Xuan, and Yun Xiang

Abstract—Passive indoor localization is important. Unlike active localization techniques, it does not require users to carry measuring devices, e.g., smart phones. Thus, it is widely used in applications such as security, smart housing, and object tracking. However, in real-world applications, passive localization accuracy is limited by environmental noise, the multipath effect, etc. To address those problems, in this paper, we propose to use channel state information (CSI) instead of the commonly used received signal strength. Specifically, we make the following contributions: 1) we design a CSI-based passive indoor localization system; 2) we develop a Naive Bayes classifier enhanced with confidence level information; and 3) we demonstrate the effectiveness of our technique using real-world deployments. The experimental results show that our technique achieves more than 86% accuracy on average and is at least 15% better than the baseline Naive Bayes classifier.

Index Terms—Channel state information (CSI), indoor localization, naive Bayes, passive.

I. INTRODUCTION

INDOOR localization is an important yet not fully addressed problem. The Global Positioning System is widely used in outdoor localization applications with reasonable accuracy. However, it cannot be used in indoor environments, where the satellite signals are blocked. Moreover, indoor localization applications require high accuracy in complex and unknown environments. To address those problems, researchers have developed various technologies, such as infrared rays, ultra-wideband, radio frequency identification, ultrasound, and motion and vision sensors [1]–[6]. However, because of the additional infrastructure they require, those techniques are typically too expensive to be deployed in real-world applications.

Therefore, the mainstream indoor localization techniques are based on wireless local area networks (WLANs) [7], which are already densely deployed in many indoor environments, such as schools, hospitals, restaurants, and supermarkets [8]. By utilizing the existing infrastructure, the deployment cost can be significantly reduced. With the development of WLAN technologies, the bandwidth of 802.11ac is expected to be greatly improved [9], further increasing the accuracy of indoor localization techniques.

Manuscript received July 29, 2016; revised November 7, 2016 and January 1, 2017; accepted February 22, 2017. This paper was recommended by Associate Editor D. Akopian. (Corresponding author: Yun Xiang.)

The authors are with the College of Information Engineering, Zhejiang University of Technology, Hangzhou 310014, China (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMC.2017.2679725

The existing WLAN-based indoor localization techniques can be divided into two categories: 1) active and 2) passive [10]. For the active techniques, people are required to carry specific measuring devices, e.g., smart phones [11]. Those devices are used to receive and measure WLAN signals, such as the received signal strength indicator (RSSI). However, in real-world applications, requiring users to report their mobile phone status can be inconvenient, inappropriate, and sometimes impossible. Thus, in this paper, we focus on passive techniques, which do not require the participation of user devices.

There are two main challenges for passive localization methods.

1) Constraints of Physical Signals: Existing techniques, such as fingerprint databases [12], usually use RSSI as the passive localization indicator [13]. RSSI is the aggregated signal strength of multiple signal paths. The RSSI fingerprint database is built by measuring the corresponding RSSI values with people located in all the possible positions. However, because of the multipath effect, the RSSI can contain significant noise [14], which greatly affects the positioning accuracy, as shown in Section IV.

2) Complexity of Indoor Environments: Passive indoor localization techniques rely on the interference that human targets cause to the WLAN signals. Thus, they are very sensitive to environmental variations [15]. In real-world applications, there usually exist many noise sources, such as furniture, the multipath effect [16], and various electronic devices. Moreover, the RSSI varies significantly across different locations inside the same room. For example, adjacent locations in the center of the room may have large RSSI variations, while the RSSI values of two corner locations are much closer. This is because the corner locations are less sensitive to environmental noise. In general, the indoor environment complexity significantly affects the localization accuracy.

To address these challenges, we propose to use channel state information (CSI) [17], [18]. Compared with RSSI, CSI is more stable and accurate, as shown by our own analysis in Section IV. Moreover, different antennas react differently to the same signal. Thus, in this paper, we deploy multiple antennas to improve the localization accuracy.

In our technique, we first collect the CSI information from different locations of the indoor environment to train the Naive Bayes classifier. For each pair of antennas, 30 subcarrier waves are grouped as one signature. After the fingerprint library is built, the indoor locations are determined by comparing the current signature with the library using the classifier. To further improve the localization accuracy, we introduce the concept of confidence level.

2168-2216 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Overall system flow.

In this paper, we make the following main contributions.

1) We design a CSI signature-based passive indoor localization system, which does not require the participation of user devices.

2) We propose a Naive Bayes classifier-based technique combined with confidence level to further increase the localization accuracy.

3) We validate our techniques using real-world experiments and demonstrate that our method performs even better than mainstream RSSI-based active techniques.

The rest of this paper is organized as follows. Section II surveys the related work. Section III describes the system design, theories, and experimental methods. Section IV presents the experiment and data analysis results. Finally, Section V concludes this paper.

A. System Flow

Fig. 1 shows the overall flow of our system. The system contains two phases: training and testing. In the training phase, the CSI data from all the possible locations are collected and preprocessed. The mean and standard deviation values of the subcarrier waves are calculated and used to generate the fingerprint library. During the testing phase, the CSI data are measured and preprocessed. They are then compared with the database using a Naive Bayes classifier. The confidence level is used to further improve the results and derive the final estimated location of the human targets.

II. RELATED WORK

Based on the physical signals used, the existing WLAN indoor localization techniques can be divided into two categories.

A. RSSI-Based Methods

RSSI-based techniques are mainstream in indoor localization applications [19]–[22]. The existing literature focuses on two major areas: power distance modeling and fingerprint-based methods [23]. For the power distance modeling methods, the distance is calculated based on the power strength of multiple received signals. A widely used technique is triangle positioning [12]. In these methods, the RSSIs of at least three access points (APs) with known locations are collected to determine the target coordinates. However, these techniques typically require the targets to be in line of sight, which is unrealistic in real-world applications with complex indoor environments. To address this problem, Xiao et al. [24] proposed to differentiate the signals within and outside the line of sight, and the localization accuracy can reach about 95%. Moreover, since it is relatively easy to derive RSSI values, RSSI is often used in hybrid techniques, such as video surveillance applications [25].

The fingerprint-based methods typically include two stages: 1) offline training and 2) online testing. The key technique is to generate a comprehensive fingerprint database during the training phase. RSSIs from multiple locations are collected and combined as a fingerprint. During the testing phase, the corresponding fingerprints are collected and compared with the database. The closest entry in the database is taken as the target location [26].

There are many challenges for the RSSI-based methods. For example, in real-world applications, RSSI values in the same location can vary significantly [16]. This is mainly caused by the multipath effects in indoor environments. Elnahrawy et al. [14] analyzed the physical constraints of RSSI-based methods. Unfortunately, those constraints can only be resolved by either using more detailed and comprehensive indoor environment models or introducing additional equipment, which can be too expensive and complicated for real-world applications.

B. CSI-Based Methods

The RSSI signals are very unstable, which greatly limits their applications. Thus, recent research focuses on CSI as an alternative. Yang et al. [16] analyzed in detail the properties of CSI and its advantages over RSSI. Li et al. [10] proposed a novel triangular positioning-based technique that uses CSI to reduce the impact of multipath effects. Their methods significantly improve the localization accuracy. Moreover, the development of open source software is also facilitating the acquisition of CSI information for research and industry [27].

Currently, CSI research is mainly focused on two areas: detection and localization. Detection includes areas such as intruder detection, body movement, and behavior analysis. Wu et al. [28] used the CSI information to address the human body movement detection problem. Zeng et al. [29] combined the CSI information with motion sensors embedded in smart phones. Their techniques can successfully distinguish human body motion as well as its direction. Researchers also use CSI to identify human gestures and key presses [30], [31]. This information can be further used in applications such as customer behavior analysis [32].

Fig. 2. Example of CSI amplitudes in various conditions. (a) CSI amplitudes in the same location. (b) CSI amplitudes in different locations.

Based on the requirement of signal measurement instruments, the localization techniques can also be categorized as passive or active. For the active techniques, Sen et al. [33] developed a CSI-based technique called PinLoc, which can locate the human targets within 1 × 1 m² square blocks with 89% accuracy. Wang et al. [34] processed the CSI information with deep learning techniques and increased the localization accuracy significantly. For the passive localization techniques, Xiao et al. [35] proposed to use multiple pairs of APs and monitoring points (MPs) to collect CSI channel information for indoor localization applications. Sabek and Youssef [36] further improved the localization results using only one pair of AP and MP. With their approaches, the passive localization accuracy can reach about 0.95 m.

It should be noted that the detection and localization applications are usually based on the same theories and techniques. Moreover, in the localization applications, especially the passive ones, the detection of the human targets is a prerequisite. Thus, in this paper, we do not distinguish them specifically.

III. THEORIES AND METHODS

A. CSI Introduction

The existing passive localization techniques are mostly based on RSSI signals, which can be unstable, unreliable, and inaccurate. To address those problems, we use CSI information derived from the network’s physical layer.

Typically, in indoor environments, the WLAN signals travel through multiple paths before they reach the receiver end. This phenomenon is referred to as the multipath effect. For each path, the environment can cause different amounts of time delay, amplitude decay, and phase shift. To differentiate each path, the wireless channel is modeled as a spatial linear filter. Thus, the channel impulse response (CIR) can be described using the following equation:

$$h(\tau) = \sum_{i=1}^{N} |\alpha_i| e^{-j\theta_i} \delta(\tau - \tau_i) \tag{1}$$

where $\alpha_i$, $\theta_i$, and $\tau_i$ are the amplitude, phase, and delay of the ith path, respectively, and N is the number of paths. By applying the fast Fourier transform (FFT) to the CIR, we derive the channel frequency response (CFR) in the frequency domain, as shown in the following equation:

$$H(f) = \mathrm{FFT}[h(\tau)]. \tag{2}$$
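As a rough illustration of (1) and (2), the sketch below builds a discrete-time CIR from a few paths and evaluates its discrete Fourier transform directly. The path amplitudes, phases, integer tap delays, and the 64-point grid are illustrative assumptions, not measured values, and a plain DFT stands in for an optimized FFT routine.

```python
import cmath
import math

def cir_taps(paths, n_taps):
    """Discrete CIR: each path (amplitude, phase, delay_index) adds
    |amplitude| * exp(-j * phase) at its delay tap, following (1)."""
    h = [0j] * n_taps
    for amplitude, phase, delay in paths:
        h[delay] += abs(amplitude) * cmath.exp(-1j * phase)
    return h

def cfr(h):
    """Plain O(N^2) DFT of the CIR, standing in for the FFT in (2)."""
    n = len(h)
    return [
        sum(h[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# Hypothetical three-path channel: (amplitude, phase in rad, delay tap).
paths = [(1.0, 0.0, 0), (0.5, math.pi / 3, 2), (0.2, math.pi, 5)]
H = cfr(cir_taps(paths, n_taps=64))
amplitudes = [abs(v) for v in H]      # amplitude at each frequency bin
phases = [cmath.phase(v) for v in H]  # phase at each frequency bin
```

With a single path the amplitude is flat across frequency; adding delayed paths makes it vary from bin to bin, which is the frequency selectivity that the per-subcarrier CSI captures.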

For the most widely used WLAN protocols, such as IEEE 802.11n, orthogonal frequency division multiplexing (OFDM) and multiple-input multiple-output (MIMO) technologies are becoming standard [23]. Their development facilitates the derivation of CSI data.

OFDM samples the channel response at predetermined intervals, one per subcarrier. The discretized CFR is shown as follows:

$$H(f) = \left[H(f_1), \ldots, H(f_k)\right] \tag{3}$$

where $H(f_k)$ combines the amplitude and phase of subcarrier wave $f_k$. In this paper, we derive the CFR of 30 subcarrier waves and group them together as one CSI signature.

The MIMO technique allows signals to be sent and received on multiple antennas. Assuming the system has T sending antennas and R receiving antennas, there can be at most T × R links. The CSI signatures of each link are extracted for further analysis. In this case, for each system, there are T × R × 30 subcarrier waves in total. By analyzing the aggregated CSI signature data from all the links, the reliability and stability can be improved significantly.
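The link and subcarrier counts above can be checked with a short sketch. The function names are hypothetical, and the 2 × 3 antenna configuration is taken from the hardware described in Section IV-A.

```python
from itertools import product

N_SUBCARRIERS = 30  # subcarriers grouped into one CSI signature

def link_ids(n_tx, n_rx):
    """Enumerate every sending/receiving antenna pair as a 1-based (tx, rx) label."""
    return list(product(range(1, n_tx + 1), range(1, n_rx + 1)))

def total_subcarrier_values(n_tx, n_rx):
    """T x R links, each contributing one 30-subcarrier signature."""
    return n_tx * n_rx * N_SUBCARRIERS

links = link_ids(2, 3)                # two AP antennas, three MP antennas
print(len(links))                     # 6
print(total_subcarrier_values(2, 3))  # 180
```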

In general, the passive indoor localization techniques are based on the observation that when human targets are in different locations, they interact with and affect the environment differently, which causes varying changes to the WLAN signal channels. Therefore, for indoor localization, the signals used should satisfy the following conditions: 1) stable in the same location and 2) distinguishable in different locations. As shown in Fig. 2, the CSI signature satisfies those two conditions.

Fig. 2(a) shows two sets of CSI signatures measured in the same location, but at different times. The x-axis represents the 30 subcarrier waves, while the y-axis represents their amplitudes. Each group contains 30 CSI signatures. The results demonstrate that the CSI is mostly stable in the same location. Fig. 2(b) shows two sets of CSI signatures measured in different locations at the same time. Compared with the case shown in Fig. 2(a), they are much more differentiable.

Fig. 3. Comparison of stability between (a) RSSI and (b) CSI.

Fig. 3 shows the comparison between CSI and RSSI. Fig. 3(a) and (b) shows the RSSI and CSI distributions from the same AP. In the figure, the x-axis is the amplitude and the y-axis is the probability. It is observed that the RSSI amplitude values are more unpredictable, with the highest probability being only around 25%. The CSI signatures, in contrast, have significantly less interference and noise. Moreover, the RSSI only provides the overall power status, while the CSI contains information on all the subcarrier waves. It is shown later in Section IV that passive CSI methods can even outperform the active RSSI techniques.

B. Naive Bayes Classifier

Passive indoor localization can be formulated as a classification problem, which aims to identify the presence of the human targets in certain locations. In this paper, we implement a Naive Bayes classifier for its simplicity and effectiveness [37]. It should be noted that the purpose of this paper is not to compare different classifiers. Moreover, more effective and advanced classifiers can be easily integrated into our system.

The Naive Bayes classifier is based on the Bayes theorem, as shown in the following equation:

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)} \tag{4}$$

where P is the probability and $P(A \mid B)$ is the conditional probability.

For a Naive Bayes classifier, the probability of each event is calculated based on (4). The event with the highest probability is considered the candidate. The Naive Bayes classifier relies upon two basic assumptions: 1) the features are independent of each other and 2) each feature has the same importance. In general, there are two stages for a Naive Bayes classifier: offline training and online testing.

As shown in Algorithm 1, during the training stage, we calculate the mean and standard deviation values of the features in each location.

Algorithm 1 Training Stage
Require: X {The training dataset}
Ensure: mean, std {The output}
  for each location i in X do
    calculate mean(x) and std(x) over the signatures x collected at location i
  end for

Algorithm 2 Testing Stage
Require: x {The input feature signature}
Ensure: L {Output estimated location}
  for each location l_i do
    calculate P(l_i | x)
  end for
  L ← argmax over all l_i of P(l_i | x)

In the testing stage, the original CSI signatures $x = \{f_1, f_2, \ldots, f_m\}$ are classified based on the corresponding features. The probability of the target being present in each location is calculated, and the location with the highest probability is chosen as the estimate.

Based on the Bayes theorem, we derive the following equation:

$$P(L_i \mid x) = \frac{P(x \mid L_i) \cdot P(L_i)}{P(x)} \tag{5}$$

where $P(L_i)$ and $P(x)$ are assumed to be known. Thus, to calculate the estimated location, we need to find the location $L_i$ that maximizes $P(x \mid L_i)$. To calculate $P(x \mid L_i)$, we assume that the possible values follow a Gaussian distribution, i.e., $P(x \mid L_i) \sim N(\sigma, \theta)$. The parameters $\sigma$ and $\theta$ are derived from the mean and standard deviation values computed during the training stage. Algorithm 2 shows the detailed algorithm.
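Algorithms 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes equal priors for all locations, uses the population standard deviation with a small floor to avoid division by zero, works with log-likelihoods for numerical stability, and runs on made-up three-subcarrier data instead of 30-subcarrier signatures. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def train(samples_by_location):
    """Algorithm 1: per-location, per-subcarrier mean and standard deviation.
    samples_by_location maps a location label to a list of CSI signatures,
    each signature being a list of subcarrier amplitudes."""
    model = {}
    for location, signatures in samples_by_location.items():
        columns = list(zip(*signatures))  # one tuple per subcarrier
        model[location] = [
            (mean(col), max(pstdev(col), 1e-6))  # floor is an assumption
            for col in columns
        ]
    return model

def log_likelihood(signature, stats):
    """log P(x | L_i) with independent Gaussian features."""
    total = 0.0
    for value, (mu, sigma) in zip(signature, stats):
        total -= math.log(sigma * math.sqrt(2 * math.pi))
        total -= (value - mu) ** 2 / (2 * sigma ** 2)
    return total

def classify(signature, model):
    """Algorithm 2: location with the largest likelihood (equal priors assumed)."""
    return max(model, key=lambda loc: log_likelihood(signature, model[loc]))

# Toy data with 3 subcarriers per signature (made-up amplitudes).
training = {
    "loc1": [[10.0, 12.0, 9.0], [10.5, 11.5, 9.2], [9.8, 12.2, 8.9]],
    "loc2": [[15.0, 7.0, 13.0], [14.6, 7.4, 13.3], [15.2, 6.9, 12.8]],
}
model = train(training)
print(classify([10.1, 11.9, 9.1], model))  # loc1
```

Summing log-probabilities instead of multiplying probabilities avoids underflow when 30 per-subcarrier terms are combined.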

C. Confidence Level Enhancement

The reliabilities of the link pairs differ. Thus, to further increase the localization accuracy, we need to combine the results from all the links based on their trustworthiness. To this end, we propose to use confidence levels to enhance the Naive Bayes classifier. In the rest of this section, we first define the confidence level and then describe its application.



Fig. 4. Testing environments (a) 1 and (b) 2.

TABLE I
EXAMPLE CLASSIFICATION RESULT

Assume that for a certain antenna pair, the unclassified sampling data can be described as follows:

$$S = \{s_1, s_2, \ldots, s_p\} \tag{6}$$

where $s_i$ represents the ith element in sample S. Thus, after applying the Naive Bayes classifier, we can derive the estimated location cluster L as described in the following equation:

$$L_i = \mathrm{Bayes}(s_i) \quad \forall i \in S \tag{7}$$

where $L_i$ is the classification result for the ith element. Thus, we define the confidence level as follows:

$$\mathrm{confidence}_i = \frac{\mathrm{number}(\max(L_i))}{p} \tag{8}$$

where $\mathrm{number}(\max(L_i))$ is the count of the most frequently appearing element in $L_i$ and p is the size of the sample. For example, assume there are 20 members in the testing sample, among which 15 elements predict the target in location 1 while the rest predict otherwise. Then $\mathrm{number}(\max(L_i))$ equals 15 and p equals 20. Therefore, the confidence level is calculated as 15/20 = 0.75.

An advantage of using the confidence level is that we can compare and utilize the results from multiple antennas. For a single antenna pair, the classification results can be random and erroneous. As shown in Table I, the classification results of the same location for different antenna pairs can conflict. In that case, the confidence level can be used to determine which antenna pair to trust. Considering the information from multiple link pairs simultaneously can significantly improve the classification performance.
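The confidence level in (8) and its use for arbitrating between links can be sketched as follows. The fusion rule shown here, which simply trusts the link with the highest confidence level, is an assumption made for illustration; the second link's votes are hypothetical, and the function names are not from the paper.

```python
from collections import Counter

def confidence_level(predictions):
    """Equation (8): share of the most frequent predicted location
    among the p per-sample predictions of one link."""
    label, count = Counter(predictions).most_common(1)[0]
    return label, count / len(predictions)

def pick_most_confident(predictions_by_link):
    """Assumed fusion rule: trust the link whose majority vote has the
    highest confidence level; returns (link, location, confidence)."""
    best = None
    for link, predictions in predictions_by_link.items():
        label, conf = confidence_level(predictions)
        if best is None or conf > best[2]:
            best = (link, label, conf)
    return best

# Worked example from the text: 15 of 20 samples vote for location 1.
link_11 = [1] * 15 + [2] * 3 + [3] * 2
# Hypothetical second link with a weaker majority for another location.
link_12 = [4] * 9 + [1] * 6 + [2] * 5

print(confidence_level(link_11))  # (1, 0.75)
print(pick_most_confident({"1-1": link_11, "1-2": link_12}))  # ('1-1', 1, 0.75)
```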

IV. EXPERIMENTAL RESULTS

In this section, we show the experimental results that validate our confidence level-based technique. First, we describe the setup of the experiment and the data processing techniques; second, we compare the performance of state-of-the-art classification algorithms; third, we show the performance of CSI-based localization techniques; fourth, we compare the performance of our confidence level-based method with existing ones; and finally, we further validate our technique in more extreme environment conditions.

A. Experiment Setup and Preprocessing

1) Experiment Setup: The most important experiment instruments are the APs and MPs. The AP is a TP-LINK WR842N router, which has two antennas. The MP uses an Intel 5300 wireless network card, which has three MIMO antennas. To extract the CSI information, we utilize a network card information extraction tool, CSITOOL by Dhalperi.

In theory, there can be 2 × 3 = 6 link pairs. However, in practice, typically only 2–4 links work reliably due to environment variations. Without loss of generality, we choose two links in our experiment, namely link 1-1 and link 1-2.

Fig. 5 shows the experiment instruments and environments. The human target is positioned in various indoor locations. The MP receives the data packets sent from the AP. After processing the packets, the CSI information can be extracted. The presence of the human target interferes with the data transmission channels and thus affects the CSI strengths.

Our techniques are tested in two different indoor settings. The first testing location is a tech laboratory. As shown in Fig. 4(a), it is quite crowded and hence has a relatively large number of multipaths. The size of the room is 7.2 × 7.2 m², and there are a total of 19 data collection points distributed uniformly in the room.

The second testing location is an empty classroom, which suffers less from the multipath effects. Fig. 4(b) shows the environment of the second testing location. The size of the room is 5 × 15 m². There are 30 data collection points distributed uniformly inside the room.

In both testing cases, the data collection interval is 80 s. The AP is placed 20 cm above the ground, while the MP is 38 cm above the ground. The human target’s orientation and stance remain unchanged during each data collection period. For the signatures collected at each point, we randomly select 20 sets to perform online testing, while the rest are grouped as the training set for the Naive Bayes classifier.

Fig. 5. Experiment instruments and environment.

Algorithm 3 Abnormal Reading Removal
Require: D {The input dataset}
Ensure: V {The output processed dataset}
  for each feature f in D do
    for each sample s in D do
      if |V(s, f) − mean(1, f)| > 3 × std(1, f) then
        V(s, f) ← mean(1, f)
      end if
    end for
  end for

2) Data Preprocessing: Before being processed by the Naive Bayes classifier, the data should be preprocessed. The preprocessing stage consists of removing the abnormal data and normalization.

Abnormal readings in CSI signatures can deviate significantly from the average. For example, as shown in Fig. 6(a), subcarrier wave 15 deviates greatly from the other subcarrier waves. Thus, we consider this subcarrier wave abnormal. Those abnormal readings can be caused by the clustering of multiple paths.

To address the abnormal reading problem, we preprocess the data using the Pauta criterion. The data are compared with the mean value. If the difference is greater than 3σ, the data are considered abnormal. The pseudocode is shown in Algorithm 3.

In Algorithm 3, V(s, f) is the CSI value for the sth sample and fth feature, mean(1, f) is the average value of the fth feature, and std(1, f) is its standard deviation. After the preprocessing of the abnormal values, the noise is greatly suppressed, as shown in Fig. 6(b).
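A minimal Python sketch of the Pauta (3σ) replacement rule described above follows. It assumes the dataset is a list of samples with one value per feature and uses the population standard deviation computed over all samples, including the abnormal ones; the example readings are made up.

```python
from statistics import mean, pstdev

def remove_abnormal(dataset):
    """Replace readings farther than 3 standard deviations from the
    per-feature mean with that mean (Pauta criterion).
    dataset is a list of samples; each sample is a list of feature values."""
    columns = list(zip(*dataset))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    cleaned = []
    for sample in dataset:
        cleaned.append([
            mu if abs(v - mu) > 3 * sigma else v
            for v, mu, sigma in zip(sample, mus, sigmas)
        ])
    return cleaned

# 19 typical readings plus one spike in the first feature (hypothetical values).
dataset = [[10.0, 5.0] for _ in range(19)] + [[100.0, 5.0]]
cleaned = remove_abnormal(dataset)
print(cleaned[-1])  # [14.5, 5.0]
```

Because the spike itself inflates the mean and standard deviation, the rule only fires when enough samples are available; with very few samples, no single reading can exceed the 3σ threshold.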

Fig. 6. Effectiveness of noise reduction preprocessing. (a) CSI signature before preprocessing. (b) CSI signature after preprocessing.

The other step of preprocessing is normalization, which maps the values of each CSI signature into the range of 0 to 1. Normalization facilitates the comparison between different features in case their values differ by orders of magnitude. It is performed based on the following equation:

$$V_{\mathrm{normalized}} = \frac{V - \min}{\max - \min} \tag{9}$$

where V is the original CSI signature data set and min and max are the minimal and maximal values of a certain feature, respectively. The abnormal readings can significantly affect the normalization process, as shown in Fig. 6(a). After removing the abnormal readings, the normalization results are improved.
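Equation (9) applied per feature can be sketched as below. Mapping a constant feature to 0 is an assumption added here to avoid division by zero, and the sample values are made up.

```python
def min_max_normalize(dataset):
    """Equation (9) applied per feature: (V - min) / (max - min)."""
    columns = list(zip(*dataset))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for sample in dataset:
        normalized.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant feature: assumed 0
            for v, lo, hi in zip(sample, lows, highs)
        ])
    return normalized

data = [[2.0, 10.0], [4.0, 10.0], [6.0, 10.0]]
print(min_max_normalize(data))  # [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
```

A single abnormal reading becomes the max (or min) of its feature and squeezes all normal values into a narrow part of the 0 to 1 range, which is why the abnormal readings are removed first.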

B. Classifier Algorithm Comparison

1) Machine Learning Algorithm Comparison: Machine learning-based classification is a well-studied area. According to the existing literature, the three most widely used machine learning techniques are the support vector machine (SVM), K-nearest neighbors (KNN), and Naive Bayes [38].

For the indoor localization application, accuracy and speed are both important. In many applications, especially in safety monitoring, the human location can change randomly and rapidly, which requires a short classification and response time.



TABLE II
PERFORMANCE COMPARISON WITH HNB ALGORITHM

Moreover, indoor environment factors, such as people, furniture, and weather, vary constantly. Thus, the classifier requires frequent retraining. Therefore, fast classification and training speed is an important criterion for classification algorithms.

There are tradeoffs between different algorithms. SVM has excellent classification accuracy, but is hard and expensive to train. KNN is simple. However, since it requires calculating the distance from each testing sample to all the training samples, its classification speed is relatively slow. Naive Bayes is simple and fast, but sacrifices some classification performance [39].

In this paper, we choose the Naive Bayes classifier based on the following observations and reasons.

1) Data Size: Each sample contains about 30 subcarrier waves. During our experiment, the data packets are transmitted and recorded every 1 s. Moreover, the samples are labeled using multiple classes. For example, there are 20 classes for environment 1 and 31 classes for environment 2. The data size is large, and hence the classification speed is critical.

2) Noise: The experiment data are collected from various indoor scenarios, which usually contain significant noise. Naive Bayes classifiers are relatively tolerant of noisy data.

3) Simplicity and Popularity: The purpose of this paper is not to compare the performance of state-of-the-art classification techniques. To demonstrate the effectiveness of our confidence level-based method, we prefer to use a simple, fundamental, and popular classifier.

Fig. 7 shows the comparison results of the three popular classification algorithms. The x-axis shows the different training sets, and the y-axis shows both speed and accuracy. The data set contains 50 samples and is tested using varying training and testing sets. Among the three algorithms, the SVM method has the best accuracy, and the Naive Bayes method is about 15% less accurate. However, in the speed comparison, the Naive Bayes algorithm requires significantly less time. Thus, on balance, we use the Naive Bayes classifier in this paper. Note that for different applications and requirements, we can easily integrate the most appropriate classification algorithms into the current framework.

2) Improved Naive Bayes Algorithms: The Naive Bayes method can encounter many problems in real-world applications, e.g., interdependent datasets, unobservable parameters, and complex dependencies. Thus, researchers have proposed various improved Naive Bayes algorithms, such as the Bayes network [40], tree augmented Naive Bayes [41], averaged one-dependence estimators [42], weighted average of one-dependence estimators [43], and hidden Naive Bayes (HNB) [44]. Therefore, it is necessary to investigate the performance-time tradeoff and the effectiveness of our confidence level-based method on those improved Naive Bayes methods. Without loss of generality, we choose the state-of-the-art HNB as the comparison method.

Fig. 7. Accuracy and speed comparison of the three algorithms.

The comparison results are shown in Table II. The experiment is done in the same environments as described in Section IV-A1. The WEKA machine learning software [45] is used to perform the Naive Bayes and HNB analysis. In this experiment, the data are discretized but not preprocessed. In the table, for each method, the running time includes both training and testing time. Compared with the Naive Bayes method, the HNB method improves accuracy by about 4.5% on average and by at most 9.2%. However, the running time of HNB is increased by more than three times, which is a significant drawback for real-time passive localization applications. Moreover, we apply our confidence level-based technique to the HNB method. As a result, compared with the original HNB algorithm, the localization accuracy is improved by more than 10% on average and by at most 32.5%.

In general, it is demonstrated that our technique can also significantly improve the performance of improved Naive Bayes-based algorithms. Our technique is based on the assumption that the trustworthiness of different antenna pairs changes with the environment, which is independent of the data processing and analysis algorithms. Thus, even though we choose Naive Bayes as our baseline, the conclusions are still valid for other, more advanced Bayes algorithms.

C. CSI Classifier Results

In this experiment, we use data from antenna pairs 1-1 and 1-2. In the two testing environments, there are 19 and 30 data collection points, respectively. The unoccupied scenarios are represented as points 20 and 31. For each location, 20 signatures are selected randomly for testing.

Fig. 8. Clustering results for the first test environment. Confusion matrix for link (a) pair 1-1 and (b) pair 1-2.

Fig. 9. Clustering results for the second test environment. Confusion matrix for link (a) pair 1-1 and (b) pair 1-2.

The classification results are compared with the actual locations, from which we draw the confusion matrices. Fig. 8(a) and (b) shows the confusion matrices of antenna pairs 1-1 and 1-2 for the first testing environment, and Fig. 9(a) and (b) shows those of antenna pairs 1-1 and 1-2 for the second testing environment. In each confusion matrix, the x-axis represents the actual locations and the y-axis represents the locations returned by the classifier. The gray scale, as shown in the right bar, corresponds to the number of samples from a given location that are classified into each location. The diagonal blocks represent the correct classifications. Thus, a deeper color in the diagonal blocks represents a better classification result.
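The confusion matrix described above can be built as in the following sketch, which keeps the same axis convention (columns for actual locations, rows for returned locations). The labels and predictions are made up for illustration, and the function names are hypothetical.

```python
def confusion_matrix(actual, predicted, n_locations):
    """matrix[p][a] counts samples from actual location a + 1 that were
    classified as location p + 1 (columns = actual, rows = predicted)."""
    matrix = [[0] * n_locations for _ in range(n_locations)]
    for a, p in zip(actual, predicted):
        matrix[p - 1][a - 1] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e., correct classifications."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

actual = [1, 1, 2, 2, 3, 3]
predicted = [1, 1, 2, 3, 3, 3]
m = confusion_matrix(actual, predicted, 3)
print(m)            # [[2, 0, 0], [0, 1, 0], [0, 1, 2]]
print(accuracy(m))  # 5 of 6 samples lie on the diagonal
```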

In general, as shown in the figures, the classification results are excellent. Certain points, e.g., locations 14 and 15 in testing environment 1, have relatively poor results. The reason is that those points are typically at the corners of the room, where the impact of human presence on the channel is relatively insignificant. The results of testing environment 2 are better since it suffers less from the multipath effects.

Fig. 10(a) and (b) shows the ratios of false negatives (FNs) and false positives (FPs) for the whole data set. In both testing cases, FP is greater than FN. The reason is that some points, such as the corner ones, have a relatively insignificant impact on the channel status. Thus, human presence in those locations is harder to detect. In general, testing case 2 is better than testing case 1 due to its simpler indoor environment. However, in testing case 2, the FP ratio for link 1-2 reaches 100%. This link is very unreliable and must be compensated for.

The experiment shows that, for the same testing environment, different antennas yield significantly different classification results. Thus, to improve the classification results, we need to quantify the confidence level of different antennas and choose the most reliable links. Therefore, we apply a confidence level test to the antenna pairs.
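The link-selection idea above can be sketched as follows. The confidence level test itself is not reproduced in this section, so the sketch substitutes a simple stand-in: it scores each link by its accuracy on held-out signatures and ranks the links accordingly. The function names, the link labels used as dictionary keys, and the toy labels are assumptions for illustration, not the authors' implementation.

```python
def link_confidence(actual, predicted):
    """Stand-in confidence score: fraction of held-out signatures classified correctly."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def rank_links(validation):
    """Rank links from most to least reliable.

    validation maps a link name to a pair (actual labels, predicted labels).
    """
    scores = {link: link_confidence(a, p) for link, (a, p) in validation.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical held-out results for two antenna pairs.
validation = {
    "1-1": ([1, 2, 3, 4], [1, 2, 3, 1]),
    "1-2": ([1, 2, 3, 4], [2, 2, 1, 1]),
}

ranking = rank_links(validation)
best_link = ranking[0][0]
print(ranking, best_link)
```

Any other per-link reliability measure could replace `link_confidence` without changing the ranking step.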




Fig. 10. Experimental results for different techniques. FN/FP ratio for (a) environment 1 and (b) environment 2.


Fig. 11. Experimental results for different techniques. Classification results for (a) environment 1 and (b) environment 2.

D. Confidence Level Method Comparison

In this section, we show the comparison results of our confidence level-based technique with the following methods.

1) RSSI-Based: The mainstream active RSSI-based technique. To compare it with our technique, we collect the RSSI information in the same way as shown in Fig. 4. We use the same MP to collect the RSSI data from three different APs and apply the same Naive Bayes classifier.

2) Link 1-1: Classification results obtained using the CSI signatures from link 1-1 only.

3) Link 1-2: Classification results obtained using the CSI signatures from link 1-2 only.

4) Confidence Level Method: Our proposed technique.

Fig. 11(a) and (b) shows the classification accuracies in both testing environments. Experimental results show that our technique outperforms the others significantly. Our technique is about 30% better than the second best in testing environment 1, and 15% better in testing environment 2. In both cases, the overall accuracies of our method are above 90%. Our method improves upon the existing technique by choosing the more reliable link pairs and filtering out the extreme conditions.

It should be noted that the traditional RSSI-based technique, even though it is active, is significantly less accurate than the CSI-based techniques. Its accuracies in the two testing environments are just 52.6% and 56.7%, respectively.

Fig. 12. Example RSSI values.

Fig. 13. Testing environments of (a) 3 and (b) 4.

Fig. 14. Performance comparison of environments (a) 3 and (b) 4.

The reason is that RSSI signals fluctuate much more and are less stable than CSI. Fig. 12 shows an example of the RSSI values of three APs from various locations at the same time. It can be observed that the RSSI values are very noisy. At certain points, the difference between APs can reach 50 dB.
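To make the RSSI baseline concrete, the following sketch trains a Naive Bayes classifier on RSSI triples from three APs and classifies new readings. It assumes Gaussian per-AP likelihoods, which this section does not specify; the function names, location labels, and RSSI readings (in dBm) are hypothetical.

```python
import math
from collections import defaultdict


def fit(samples, labels):
    """Estimate the log prior and the per-AP mean and variance for each location."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature does not cause division by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(columns, means)]
        model[y] = (math.log(n / len(samples)), means, variances)
    return model


def predict(model, x):
    """Return the location with the highest log posterior for the reading x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda y: log_posterior(model[y]))


# Hypothetical RSSI readings from three APs at two locations.
samples = [(-40, -61, -70), (-42, -59, -72), (-41, -60, -69),
           (-66, -45, -55), (-64, -47, -54), (-65, -44, -57)]
labels = ["A", "A", "A", "B", "B", "B"]

model = fit(samples, labels)
print(predict(model, (-43, -58, -71)))
print(predict(model, (-63, -46, -56)))
```

With noisy RSSI readings, the per-location variances grow and the class likelihoods overlap, which is one way to see why the RSSI baseline separates locations less reliably than the CSI signatures.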

E. Comparison Results in Extreme Environments

In the previous sections, we demonstrated the performance of our technique in two different rooms. To further validate our method in more extreme environments, we repeat the experiment in two additional indoor scenarios, which are shown in Fig. 13(a) and (b), respectively. The descriptions of those two environments are as follows.

1) Environment 3: It is a crowded corridor with many people passing by during the data collection period. The rooms beside the corridor belong to a communication laboratory, which generates significant electromagnetic waves. This environment contains a large amount of noise and interference.

2) Environment 4: It is located in a room which has a large outdoor balcony. Part of the room is connected to the outdoor environment, which suffers less from the multipath effect. The channel state characteristics inside and outside the room are quite different.

For those two environments, we select 26 and 18 data collection points, respectively. The setup of the experiments is the same as described in Section IV-A1.

Fig. 14(a) and (b) shows the comparison results in environments 3 and 4. The x-axis represents different localization techniques and the y-axis represents the accuracy. For the testing case of environment 3, more than nine points are significantly affected by ambient interference, such as people staying, walking, running, etc. The noisy environment reduces the overall accuracies. Without confidence level enhancement, the localization accuracies drop below 50%. Moreover, the performance of the CSI-based methods becomes similar to that of the RSSI-based method. In this case, however, our confidence level-based algorithm records the greatest improvement, increasing the accuracy by more than 50% on average. For the testing case of environment 4, because of the large outdoor balcony, the interference and multipath effects in the environment are relatively limited. Moreover, the variations among different antennas are small. Thus, the performances of all the CSI-based techniques are similar and excellent. The results demonstrate that our method can significantly increase the resiliency to noise and interference, and hence improve the overall localization performance.

Fig. 15. Performance improvement compared with the CSI-based techniques for all the testing environments.

To further explore and analyze the effectiveness of our confidence level-based technique, we compare its performance improvement over the CSI-based techniques in Fig. 15. In the figure, the x-axis is the testing environment and the y-axis is the improvement rate. The multipath effect, which is the main factor affecting the localization performance, is positively related to the complexity of the indoor environment. Thus, from weak to strong, the impact of the multipath effect for the four testing environments ranks as follows: 4, 2, 1, and 3. It is observed that our technique performs better in all the environments and yields more improvement as the multipath effect grows stronger. In general, our confidence level-based method is more resilient to noise and the multipath effect and is thus more adaptive to environment variations.

V. CONCLUSION

In this paper, we design a Naive Bayes classifier-based indoor localization system. Unlike the mainstream RSSI-based methods, we propose to use CSI, which is more stable and reliable. Moreover, to combine the datasets from different link pairs with varying credibilities, we develop a confidence level-based method. Our technique ranks the confidence of the link pairs and chooses the most reliable ones for the localization. As a result, our technique can reach above 90% classification accuracy on average, which is more than 15% better than the second best approach and about 40% better than the active RSSI-based method. In future work, our technique can be extended to more advanced machine learning techniques, e.g., deep belief nets, convolutional neural networks, etc.

REFERENCES

[1] J. Kemper and H. Linde, “Challenges of passive infrared indoor localiza-tion,” in Proc. Workshop Position. Navig. Commun., Hanover, Germany,2008, pp. 63–70.

[2] L. Zwirello, T. Schipper, M. Harter, and T. Zwick, “UWB localiza-tion system for indoor applications: Concept, realization and analysis,”J. Elect. Comput. Eng., vol. 2012, no. 5, 2012, Art. no. 849638.

[3] R. Gonçalves et al., “Smart floor: Indoor navigation based on RFID,” inProc. IEEE Wireless Power Transf., Perugia, Italy, 2013, pp. 103–106.

[4] C. Medina, J. C. Segura, and Á. De la Torre, “Ultrasound indoor posi-tioning system based on a low-power wireless sensor network providingsub-centimeter accuracy,” Sensors, vol. 13, no. 3, pp. 3501–3526, 2013.

[5] I. Vlasenko, I. Nikolaidis, and E. Stroulia, “The smart-condo:Optimizing sensor placement for indoor localization,” IEEE Trans. Syst.,Man, Cybern., Syst., vol. 45, no. 3, pp. 436–453, Mar. 2015.

[6] S. Minaeian, J. Liu, and Y.-J. Son, “Vision-based target detection andlocalization via a team of cooperative UAV and UGVs,” IEEE Trans.Syst., Man, Cybern., Syst., vol. 46, no. 7, pp. 1005–1016, Jul. 2016.

[7] B. P. Crow, I. Widjaja, J. G. Kim, and P. T. Sakai, “IEEE 802.11 wirelesslocal area networks,” IEEE Commun. Mag., vol. 35, no. 9, pp. 116–126,Sep. 1997.

[8] G. Ding, J. Zhang, L. Zhang, and Z. Tan, “Overview of received signalstrength based fingerprinting localization in indoor wireless LAN envi-ronments,” in Proc. IEEE Int. Symp. Microw. Antenna Propag. EMCTechnol. Wireless Commun., Chengdu, China, 2013, pp. 160–164.

[9] O. Bejarano, E. W. Knightly, and M. Park, “IEEE 802.11ac: From chan-nelization to multi-user MIMO,” IEEE Commun. Mag., vol. 51, no. 10,pp. 84–90, Oct. 2013.

[10] Z. Li, T. Braun, and D. C. Dimitrova, “A passive WiFi source localizationsystem based on fine-grained power-based trilateration,” in Proc. WorldWireless Mobile Multimedia Netw., Boston, MA, USA, 2015, pp. 1–9.

[11] Y. Jiang et al., “Hallway based automatic indoor floorplan construc-tion using room fingerprints,” in Proc. ACM Int. Joint Conf. PervasiveUbiquitous Comput., Zürich, Switzerland, 2013, pp. 315–324.

[12] S. Mazuelas et al., “Robust indoor positioning provided by real-timeRSSI values in unmodified WLAN networks,” IEEE J. Sel. Topics SignalProcess., vol. 3, no. 5, pp. 821–831, Oct. 2009.

[13] H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of wireless indoorpositioning techniques and systems,” IEEE Trans. Syst., Man, Cybern.C, Appl. Rev., vol. 37, no. 6, pp. 1067–1080, Nov. 2007.

[14] E. Elnahrawy, X. Li, and R. P. Martin, “The limits of localization usingsignal strength: A comparative study,” in Proc. 1st IEEE Commun. Soc.Conf. Sensor Ad Hoc Commun. Netw. (SECON), Santa Clara, CA, USA,2004, pp. 406–414.

[15] W. Liu, X. Gao, L. Wang, and D. Wang, “Bfp: Behavior-free pas-sive motion detection using PHY information,” Wireless Pers. Commun.,vol. 83, no. 2, pp. 1035–1055, 2015.

[16] Z. Yang, Z. Zhou, and Y. Liu, “From RSSI to CSI: Indoor localizationvia channel response,” ACM Comput. Surveys, vol. 46, no. 2, pp. 1–32,2013.

[17] S. Sen, B. Radunovic, R. R. Choudhury, and T. Minka, “Precise indoorlocalization using PHY information,” in Proc. Int. Conf. Mobile Syst.Appl. Services, Bethesda, MD, USA, 2011, pp. 413–414.

[18] Z. Wang, J. Han, W. Xi, and J. Zhao, “Efficient and secure key extrac-tion using channel state information,” J. Supercomput., vol. 70, no. 3,pp. 1537–1554, 2014.

[19] Z. Zhou, Z. Yang, C. Wu, L. Shangguan, and Y. Liu, “Towards omnidi-rectional passive human detection,” in Proc. IEEE INFOCOM, vol. 12.Turin, Italy, 2013, pp. 3057–3065.

[20] Z. Wu et al., “A fast and resource efficient method for indoor position-ing using received signal strength,” IEEE Trans. Veh. Technol., vol. 65,no. 12, pp. 9747–9758, Dec. 2016.

[21] S. Caccamo, R. Parasuraman, F. Båberg, and P. Ögren, “Extending aUGV teleoperation FLC interface with wireless network connectivityinformation,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., Hamburg,Germany, 2015, pp. 4305–4312.

[22] C.-H. Lin and K.-T. Song, “Probability-based location aware design andon-demand robotic intrusion detection system,” IEEE Trans. Syst., Man,Cybern., Syst., vol. 44, no. 6, pp. 705–715, Jun. 2014.

[23] B. Li, J. Salter, A. G. Dempster, and C. Rizos, “Indoor positioningtechniques based on wireless LAN,” in Proc. IEEE Int. Conf. LAN,2007, pp. 13–16.

[24] Z. Xiao et al., “Non-line-of-sight identification and mitigation usingreceived signal strength,” IEEE Trans. Wireless Commun., vol. 14, no. 3,pp. 1689–1702, Mar. 2014.



[25] A. Alahi, A. Haque, and L. Fei-Fei, “RGB-W: When vision meets wire-less,” in Proc. IEEE Int. Conf. Comput. Vis., Santiago, Chile, 2015,pp. 3289–3297.

[26] T. Roos, P. Myllymäki, H. Tirri, P. Misikangas, and J. Sievänen, “A prob-abilistic approach to WLAN user location estimation,” Int. J. WirelessInf. Netw., vol. 9, no. 3, pp. 155–164, 2002.

[27] D. Halperin, W. Hu, A. Sheth, and D. Wetherall, “Predictable 802.11packet delivery from wireless channel measurements,” ACM SIGCOMMComput. Commun. Rev., vol. 40, no. 4, pp. 159–170, 2010.

[28] C. Wu et al., “Non-invasive detection of moving and stationaryhuman with WiFi,” IEEE J. Sel. Areas Commun., vol. 33, no. 11,pp. 2329–2342, Nov. 2015.

[29] Y. Zeng, P. H. Pathak, C. Xu, and P. Mohapatra, “Your AP knows howyou move: Fine-grained device motion recognition through WiFi,” inProc. ACM Workshop Hot Topics Wireless, 2014, pp. 49–54.

[30] W. He, K. Wu, Y. Zou, and Z. Ming, “WiG: WiFi-based gesture recog-nition system,” in Proc. Int. Conf. Comput. Commun. Netw., Las Vegas,NV, USA, 2015, pp. 1–7.

[31] K. Ali, A. X. Liu, W. Wang, and M. Shahzad, “Keystroke recognitionusing WiFi signals,” in Proc. Int. Conf. Mobile Comput. Netw., Paris,France, 2015, pp. 90–102.

[32] Y. Zeng, P. H. Pathak, and P. Mohapatra, “Analyzing shopper’s behav-ior through WiFi signals,” in Proc. Workshop, Florence, Italy, 2015,pp. 13–18.

[33] S. Sen, B. Radunovic, R. R. Choudhury, and T. Minka, “You are fac-ing the Mona Lisa: Spot localization using PHY layer information,” inProc. Int. Conf. Mobile Syst. Appl. Services, Ambleside, U.K., 2012,pp. 183–196.

[34] X. Wang, L. Gao, S. Mao, and S. Pandey, “DeepFi: Deep learning forindoor fingerprinting using channel state information,” in Proc. IEEEWCNC, New Orleans, LA, USA, 2015, pp. 1666–1671.

[35] J. Xiao, K. Wu, Y. Yi, L. Wang, and L. M. Ni, “Pilot: Passive device-freeindoor localization using channel state information,” in Proc. IEEE Int.Conf. Distrib. Comput. Syst., Philadelphia, PA, USA, 2013, pp. 236–245.

[36] I. Sabek and M. Youssef, “Monostream: A minimal-hardware highaccuracy device-free WLAN localization system,” presented at theIEEE Wireless Commun. Networking Conf., Apr. 7–10, 2013, Shanghai,China, 2013.

[37] X. Wu et al., “Top 10 algorithms in data mining,” Knowl. Inf. Syst.,vol. 14, no. 1, pp. 1–37, 2008.

[38] S. B. Kotsiantis, “Supervised machine learning: A review of classifica-tion techniques,” Informatica, vol. 31, no. 3, pp. 249–268, 2007.

[39] B. Lantz, Machine Learning With R. Birmingham, U.K.: Packt, 2015,pp. 89–118.

[40] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Burlington, MA,USA: Morgan Kaufmann, pp. 521–538, 1988.

[41] N. Friedman, D. Geiger, and M. Goldszmidt, “Bayesian networkclassifiers,” Mach. Learn., vol. 29, nos. 2–3, pp. 131–163, 1997.

[42] G. I. Webb, J. R. Boughton, and Z. Wang, “Not so naive Bayes:Aggregating one-dependence estimators,” Mach. Learn., vol. 58, no. 1,pp. 5–24, 2005.

[43] L. Jiang, H. Zhang, Z. Cai, and D. Wang, “Weighted average of one-dependence estimators,” J. Exp. Theor. Artif. Intell., vol. 24, no. 2,pp. 219–230, 2012.

[44] L. Jiang, H. Zhang, and Z. Cai, “A novel Bayes model: Hidden naiveBayes,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 10, pp. 1361–1371,Oct. 2009.

[45] I. H. Witten and E. Frank, Data Mining: Practical Machine LearningTools and Techniques With Java Implementations. Burlington, MA,USA: Morgan Kaufmann, 2000.

Zhefu Wu received the B.Eng. and Ph.D. degrees in electronics and communication engineering from Zhejiang University, Hangzhou, China, in 1993 and 2000, respectively.

He is currently an Associate Professor with the College of Information Engineering, Zhejiang University of Technology, Hangzhou. His current research interests include social network data mining, complex network dynamics, machine learning, and communication network application.

Qiang Xu received the B.E. degree from the College of Information Engineering, Zhejiang University of Technology, Hangzhou, China, in 2014, where he is currently pursuing the M.E. degree.

In 2016, he spent one month at Nokia Hangzhou Research and Development Center, Hangzhou, and two months at China Mobile Communication Corporation, Beijing, China, as an intern. His current research interests include indoor localization, sensor networks, and machine learning.

Jianan Li received the B.E. degree from the College of Information Engineering, Zhejiang University of Technology, Hangzhou, China, in 2015, where he is currently pursuing the M.E. degree with the Department of Information and Communication Engineering.

His current research interests include complex networks, machine learning, and wireless networks.

Chenbo Fu received the B.S. degree in physics from the Zhejiang University of Technology, Hangzhou, China, in 2003, and the M.A.Sc. and Ph.D. degrees in physics from Zhejiang University, Hangzhou, in 2009 and 2013, respectively.

He was a Post-Doctoral Researcher with the College of Information Engineering, Zhejiang University of Technology, and was a Research Assistant with the Department of Computer Science, University of California at Davis, Davis, CA, USA, in 2014. He is currently a Lecturer with the College of Information Engineering, Zhejiang University of Technology. His current research interests include network-based algorithm design, social network data mining, chaos synchronization, network dynamics, and machine learning.

Qi Xuan received the B.Eng. and Ph.D. degrees in control theory and engineering from Zhejiang University, Hangzhou, China, in 2003 and 2008, respectively.

He was a Post-Doctoral Researcher with the Department of Information Science and Electronic Engineering, Zhejiang University, from 2008 to 2010, and was a Research Assistant with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong, in 2010. From 2012 to 2014, he was a Post-Doctoral Researcher with the Department of Computer Science, University of California at Davis, Davis, CA, USA. He is currently an Associate Professor with the College of Information Engineering, Zhejiang University of Technology, Hangzhou. His current research interests include network-based algorithm design, social network data mining, social synchronization and consensus, reaction-diffusion network dynamics, machine learning, and computer vision.

Yun Xiang received the B.S. degree in information and electrical engineering from Zhejiang University, Hangzhou, China, in 2006, the M.A.Sc. degree in electrical engineering from the University of Massachusetts, Boston, MA, USA, in 2008, and the Ph.D. degree in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2014.

He is currently an Associate Professor with the College of Information Engineering, Zhejiang University of Technology, Hangzhou. His current research interests include wireless sensor networks, data fusion, machine learning, and network algorithms.
