**with Fusion of Smartphone Acoustics and PDR**

^{}Hucheng Wang1,2[0000−0002−9744−389X], Lei Zhang1[0000−0001−5879−514X], Zhi
Wang1[0000−0002−0490−2031], and Xiaonan Luo^{2}

1 State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310013, China

2 Satellite Navigation and Location Service Laboratory, Guilin University of Electronic Technology, Guilin 541004, China

**Abstract.** The indoor localization technology is bringing great conve-
nience in location-based service (LBS) recent years. Most high-accuracy
localization system is expensive and highly dependent on custom device
instead of smartphone. In this paper, we propose a high-accuracy indoor
localization system with fusion of smartphone acoustics and Pedestrian
Dead Reckon (PDR) for pedestrians. Acoustic signals are designed with
high frequencies chirp which is above the threshold of human auditory.

Time Diﬀerence of Arrival (TDOA) is adopted to eliminate the synchro- nization with the broadcasting infrastructure. For more convenience of pedestrians, inertial sensor is introduced to analyze human gesture and gait. We revise Extended Kalman Filter (EKF) to adapt coarse-grained combination and diﬀerent performance of acoustic and inertial sensors.

The experiments show that Pals system can achieve localization error of 0.28 meter within 95% conﬁdence.

**Keywords:** Acoustics·Extended Kalman Filter·TDOA·Pedestrian.

**1** **Introduction**

On current trends, the most popular and valuable LBS applications are on the smartphone [1–3]. Humans in modern society prefer carrying an almighty phone to wearing one more piece of facility for positioning, which drives smartphone indoor positioning as a kind of invisible rigid demand [4]. Various kinds of smart- phone applications which the secondary development of LBS pose an urgent need for high-precision indoor positioning [5, 6]. From opinions of Microsoft Indoor Localization Competition and Indoor Positioning and Indoor Navigation (IPIN) competition, acoustics and PDR are the best choices of indoor positioning in smartphone. For the former competition, AID from Zhejiang University won the ﬁrst place with pure acoustics of 3D group [20], which was also the prototype

Supported by Key Projects of Intergovernmental International Scientiﬁc and Tech- nological Innovation Cooperation (No. 2017YFE0101300), and National Natural Sci- ence Foundation of China (Nos. 61772149, 6177020565 and U1509215).

system of this paper. For IPIN Track 3 and Track 4, Team WHU from Wuhan University won the ﬁrst place with PDR.

We propose Pals, High-Accuracy Pedestrian Localization with Fusion of S- martphone Acoustics and PDR. Acoustics is outstanding for smartphone local- ization [7–9]. All of smartphones on the market are equipped with sound devices.

So long as the speakers and microphones are suitably deployed, the demand of acoustic signal positioning can be met. Inertial Measurement Unit (IMU) is already integrated into smartphones for several years. Rough self-contained po- sitioning can be achieved based only on them. IMU has little dependence on the environment, except for the magnetometer. PDR positioning principle is de- rived from the IMU reading integral, obtaining the displacement and orientation angle, estimating the stride size, and then reckoning the coordinates [10].

Things do not always go as well as we thought. Even if acoustics and IMU fusion simulation algorithm design is perfect, we still ﬁnd oceans of obstacles in practical application. So this paper focuses on practical problems. In general, the main contributions of Pals are listed in the following three aspects:

Pals proposes a hybrid structure incorporating acoustic positioning coordi- nates and PDR stride length. The structure is fast and suitable for most mobile navigation scenarios.

Pals creates a empirical Extended Kalman Filter algorithm for fusion. We update coordinate by modiﬁed observation estimation matrix rather than status prediction matrix.

Pals improves PDR algorithm. We add speed-constraint to the pace event judgment, and adjusting stride length estimation by fusion coordinate.

**2** **Related Works of Smartphone Indoor Positioning**

Recent commercial trends in high-accuracy indoor localization have led to a proliferation of studies. Chen [11] built a large number of ﬁngerprint databas- es using the strength of WiFi signal, realizing the commercialization of indoor positioning in shopping malls and airports. [12] presented a smartphone inertial sensor-based PDR approach for indoor localization and tracking with occasion- al iBeacon calibration. Google Tango [13] estimated positioning with cameras.

But camera casually may make troubles with others in an age which privacy is signiﬁcant.

Acoustic localization enjoys a wide range of applications. BeepBeep [14] emit- ted the double Beep sounds during a ranging session by Round-Trip Time (RTT).

In bi-directional communication mode of [14], user capacity is a major limitation.

BatTracker proposed for the superiority of the high precision and infrastructure- free moblie device tracking system in 3D space [15]. They continuously emitted acoustic signals that bounced oﬀ surfaces of nearby object. It limited positioning range near the smooth wall. ALPS [16] in Carnegie Mellon University (CMU) was very considerate in non-Line-of-Sight (NLOS) recognition, Doppler eﬀect and ﬂoor plan. [17] proposes an acoustic steering tracking system on smart- phone.

Inertial localization system is also a hot topic of infrastructure positioning in academia. Considering the cost and Commercial Oﬀ-The-Shelf (COTS), the smartphone inertial sensor is lightweight, so IMU localization only cannot main- tain high-accuracy for long-term caused by accumulative errors. The accumula- tive errors can also be eﬃciently removed by using Zero Velocity Update (ZUP- T) [18], but IMU are required to be mounted on the foot in ZUPT. In practical smartphone localization, all above methods are not suitable since the moblie is supposed to be held in hand or in pocket rather than to be mounted on waist or foot. Rui Z. [19] proposed a novel localization system fused the position infor- mation estimated by acoustic and inertial sensors. They are the closest to our structure design. We optimaize PDR of Pals with stride length feedback, fusion algorithm is also improved with experiment.

**3** **Pals in Fusion**

Pals fusion algorithm belongs to heterogeneous fusion. In practice, the data size of PDR obtained is much larger than the acoustics, so Pals coordinates are mostly given by PDR. Fusion occurs when mobile cache receives the acoustic coordinates, or when mobile sensor detects that the pedestrian has stopped.

**3.1** **Hybrid structure incorporating acoustics and PDR**

According to Extend Kalman Filter (EKF), we deﬁne the PDR algorithm as the state transition of EKF as

**X*** _{k+1}*=

*f*(X

*) +*

_{k}**w**

_{k+1}*, f*(X

*) =*

_{k}**X**

*+*

_{k}*S*

_{k}**u**

_{k}**T,**(1) where

**X**

*= [*

_{k}*x*

*k*

*y*

*k*] is denoted horizontal and vertical coordinate when

*t*=

*k*at indoor coordinate axis. Generally, there is a stationary deﬂexion in the horizontal direction between the indoor coordinate system and the geodetic coordinate system. The building is of course ﬁxed, so the interior coordinate system is also constant. In dealing with the coordinate deﬂexion of

*φ*ﬁxedly, we only need to multiply by a constant rotation matrix

**T**=

cos*φ−*sin*φ*
sin*φ* cos*φ*

*,*and *S** _{k}* is denoted
stride length in

*k*.

**u**

*= [sin*

_{k}*θ*cos

*θ*]

^{T}is controller vector where

*θ*is azimuth angle.

**w**

*follows a normal distribution with zero mean and variance*

_{k}**Q**

*.*

_{k}Then **X*** ^{A}* is obtained from acoustics, we take it out of mobile cache. Then
observation model can be deﬁned as

**z*** _{k+1}*=

*h*(X

*) +*

_{k+1}**v**

_{k+1}*, h*(X

*) = ¯*

_{k+1}*J*

*k+1*

**X**

*+ (1*

_{k+1}*−J*¯

*k+1*)X

^{A}

_{k+1}*,*(2) where ¯

*J*

*is normalization of cost function*

_{k+1}*J*(•) in MLE of TDOA in acoustic.

**v*** _{k+1}* follows a normal distribution with zero mean and variance

**R**

*. In other words, the observation model is weighted mean with noise.*

_{k+1}Pals employs the time update equation of EKF are given as

**X*** _{k+1|k}*=

*f*(X

*)*

_{k}*,*

**P**

*=*

_{k+1|k}**P**

*+*

_{k}**Q**

_{k}*,*(3)

where **P*** _{k+1|k}* and

**P**

*are a priori and a posteriori covariances. The update of EKF are given as*

_{k}**K*** _{k+1}*=

**P**

_{k+1|k}**H**

^{T}

*(H*

_{k+1}

_{k+1}**P**

_{k+1|k}**H**

^{T}

*+*

_{k+1}**R**

*)*

_{k+1}

^{−1}*,*(4)

**X**

*=*

_{k+1}**z**

*+*

_{k+1|k}**K**

*(z*

_{k+1}

_{k+1|k}*−*

**H**

_{k+1}**X**

*)*

_{k+1|k}*,*(5)

**P*** _{k+1}*= (I

*−*

**K**

_{k+1}**H**

*)P*

_{k+1}

_{k+1|k}*,*(6)

where**K** is the Kalman gain,**I**is 2-order identity matrix. In conventional EKF,
(6) can be competent the update of variance.

At the end, the fusion position trace is displayed by 2-order Gaussian s-
moothing. Notice that (5) exerts the predicted **z*** _{k+1|k}* of the measured values,
while formal EKF exerts the estimation of the state

**X**

*values. This is the third of originality in Pals system and is based on empirical inference. Smart- phone sensors are consumer-level, not industrial-level. Our acoustic system is more excellent by contrast. Thus Pals infers measurements matrix have a higher probability of reliability over the state predictions.*

_{k+1|k}**3.2** **Stride length update based on Pals coordinates**

As mentioned in Section IV, Weinberg’s stride length estimation involves a spe-
ciﬁc parameter*γ. Now in order to updateγ, we push back stride length accord-*
ing to the fusion coordinates. Assuming consecutive two-step model is (7) on the
basis of (1), the cost function of*S** _{k}* is denoted as

**X*** _{k+1|k+1}*=

**X**

*+*

_{k|k}*S*

*k*

**u**

_{k}**T**+

**w**

_{k+1}*,*(7)

*J*

*s*(

*S*

*k*) =

**X**

_{k+1|k+1}*−*

**X**

_{k|k}*−S*

*k*

**u**

_{k}**T**

^{2}

_{2}

*.*(8) To minimize the cost function ˆ

*S*

*k*= arg min

*S**k**∈R**J**s*(*S**k*), the stride length esti-
mation ˆ*S**k* must satisfy

*S*ˆ* _{k}*= [(u

_{k}**T)**

^{T}(u

_{k}**T)]**

*(u*

^{−1}

_{k}**T)(X**

_{k+1|k+1}*−*

**X**

*). (9) Then*

_{k|k}*γ*is calculated as (10)

ˆ*γ*= ˆ*S**k**/*(*|*ˆ*a**k**|**max**− |*ˆ*a**k**|**min*)^{1}^{4} (10)
Through the above process, Pals accomplishes the speciﬁc parameters of the
Least Squares (LS) update*γ*. Generally speaking, *γ* is relatively stable for the
same person with the same posture, maybe no change in short-term. Once the
pedestrian posture has changed, such as from slow walking to fast walking or
running,*γ*will be changed correspondingly. Based on above inference, we update
*γ*value every ﬁve seconds.

**4** **Experiment and Analyze**

Pals design pays more attention to user experience. In this section, we commit plenty of experiments, operating by diﬀerent people, diﬀerent smartphones and diﬀerent experimental paths.

**4.1** **Experimental setup**

**Fig. 1.**Experimental environment. Path 1 in red is a straight line, while Path 2 in blue
deliberately follows a curve and goes around twice to test Pals’ extreme performance

Experiments are shown in the Fig. 1. We arrange the base station in the posi-
tion shown in Fig. 1, and all the positions are temporarily determined. All of base
stations can be synchronized via Bluetooth as they are developed by us manu-
ally. Set the coordinate origin in the left bottom, and follow the right-handed
rule to establish indoor coordinate. Then measure*φ* = 161.5* ^{◦}* between indoor
coordinate system and the geodetic coordinate system. The rest of the details
are shown in Fig. 1. The ground truth positioning (GTP) of the experiments
were measured by WHU system. [21]

**4.2** **The implementation of experiment**

Test phone is Huawei Mate 9. Tester is of medium height with standard body.

During the experiment, tester holds the smartphone, walks along the path in ordinary posture and locates in real-time. The results of path are shown in Fig.

2. As can be seen, acoustic signals are of high accuracy. The reason is, acous-
tics always ﬁlters out jump points with overlarge MLE value*J*, and remaining
points are relatively ideal. However, this implementation and slow update rate
maybe lead to discontinuous trajectory. For experimental convenience, the initial
coordinate of PDR is as well provided by the acoustic system. The short-term
accuracy of PDR positioning is high precision, but even if the drift error of IMU
is calibrated, the long-term PDR algorithm still has signiﬁcant cumulative error.

It takes 43 seconds to walk through Path 1, and 27 seconds to Path 2. The PDR error in this paper is not constrained with other conditions, e.g. map, which is completely within the tolerance range. The cumulative error of PDR rotation estimation is more obviously reﬂected in Path 2, while the acoustic system is not aﬀected but the current environment elements. The fusion algorithm in Path 2 still shows strong stability.

0 2 4 6 8 10 meters

0 1 2 3 4 5 6 7

meters

PDR Points PDR Path Acoustic Points Acoustic Path Pals Fusion Path

0 1 2 3 4 5 6 7 8 9

meters 0

1 2 3 4 5 6

meters

PDR Points PDR Path Acoustic Points Acoustic Path Pals Fusion Path

0 0.5 1 1.5 2 2.5

Error Distance (meters) 0

0.2 0.4 0.6 0.8 1

Cumulative Error Distribution

Pals PDR Acoustic

0 0.5 1 1.5 2 2.5

Error Distance (meters) 0

0.2 0.4 0.6 0.8 1

Cumulative Error Distribution

Pals PDR Acoustic

**Fig. 2.**Under diﬀerent paths, separate acoustics, separate PDR and fusion algorithm
are compared. Path 1 on top left and Path 2 on top right. Cumulative error distributions
for paths. Path 1 on bottom left and Path 2 on bottom right.

From Fig. 2, it can also be seen that acoustic signal positioning error is not worse than fusion algorithm in some data, but Pals is stronger in terms of stability. The reason is that acoustics still remains some large error points, which are more likely to appear near the walls. The acoustic cumulative is slightly stronger than the Pals with a probability of about 50%, while about 75% in Path 2. In general, Pals algorithm can maintain within 95% of 0.28 meter error and within 90% of 0.27 meter error, which is more robust than acoustic and PDR positioning severally.

**4.3** **The analysis of Pals supplement approach**

The experimental results shown in Fig. 3 include diﬀerent approach. However,
stride length feedback update approach is necessary. The green small dot line
in Fig. 3 reﬂects the positioning performance without the approach. Compared
with Pals standard system, the mean error is down by about 9.7 centimeters. The
motion posture in our experiment is fortunately only normal walking steps. Pals
method will have more advantages if the walk/run switch is added. Cumulative
error distribution of the normal EKF with**X*** _{k+1|k}* are shown by carmine dash-
dotted line. The accuracy is about 10 centimeters lower than Pals.

0 0.1 0.2 0.3 0.4 0.5 0.6 Error Distance (meters)

0 0.2 0.4 0.6 0.8 1

Cumulative Error Distribution

Pals with Joseph's Stabilized Pals Standard

Pals without Step Feedback EKF with (34)

**Fig. 3.**Cumulative error distributions with diﬀerent approach.

**5** **Conclusion and Prospects**

Pals, High-Accuracy Pedestrian Localization with Fusion of Smartphone Acous-
tics and PDR, is proposed in this paper. To further enhance the smartphone
indoor positioning accuracy, Pals compares (6) with routine improves EKF in
which combined with the characteristics of smartphone sensor and selected the
observation value as fusion update strategy. Pals also amends the stride length
of the individual state parameter*γ*according to fusion coordinates. The experi-
ments show that Pals system can achieve indoor error of 0.28 meter within 95%

conﬁdence. Pals has been signed in 2022 Hangzhou Asian Games, as a security personnel and tourists indoor positioning.

Of course, Pals has it drawbacks. In this paper, Pals only takes into account the discrimination of motion or stillness when holding smartphone, and without considering putting it in the pocket, which is applicable to the scenario of walking follow the phone screen. We will combine the methods of deep learning and edge computing [6] for next research.

**References**

1. R. C. Shit, S. Sharma, D. Puthal, and A. Y. Zomaya, “Location of things (lot): A
review and taxonomy of sensors localization in iot infrastructure,”*IEEE Commu-*
*nications Surveys & Tutorials, vol. 20, no. 3, pp. 2028–2061, 2018.*

2. M. Z. Win, W. Dai, Y. Shen, G. Chrisikos, and H. V. Poor, “Network operation
strategies for eﬃcient localization and navigation,”*Proceedings of the IEEE, vol.*

106, no. 7, pp. 1224–1254, 2018.

3. E. Naserian, X. Wang, K. Dahal, Z. Wang, and Z. Wang, “Personalized location pre-
diction for group travellers from spatial–temporal trajectories,”*Future Generation*
*Computer Systems, vol. 83, pp. 278–292, 2018.*

4. D. Lymberopoulos and J. Liu, “The microsoft indoor localization competition: Ex-
periences and lessons learned,”*IEEE Signal Processing Magazine, vol. 34, no. 5, pp.*

125–140, 2017.

5. A. Rai, K. K. Chintalapudi, V. N. Padmanabhan, and R. Sen, “Zee: Zero-eﬀort
crowdsourcing for indoor localization,” in*Proceedings of the 18th annual interna-*
*tional conference on Mobile computing and networking.* ACM, 2012, pp. 293–304.

6. M. Tiwary, D. Puthal, K. S. Sahoo, B. Sahoo, and L. T. Yang, “Response time
optimization for cloudlets in mobile edge computing,”*Journal of Parallel and Dis-*
*tributed Computing, vol. 119, pp. 81–91, 2018.*

7. L. Zhang, D. Huang, X. Wang, C. Schindelhauer, and Z. Wang, “Acoustic nlos iden- tiﬁcation using acoustic channel characteristics for smartphone indoor localization,”

*Sensors, vol. 17, no. 4, pp. 727–, 2017.*

8. S. Pradhan, G. Baig, W. Mao, L. Qiu, G. Chen, and B. Yang, “Smartphone-based
acoustic indoor space mapping,” *Proceedings of the ACM on Interactive, Mobile,*
*Wearable and Ubiquitous Technologies, vol. 2, no. 2, p. 75, 2018.*

9. W. Mao, M. Wang, and L. Qiu, “Aim: Acoustic imaging on a mobile,” in*Proceedings*
*of the 16th Annual International Conference on Mobile Systems, Applications, and*
*Services.* ACM, 2018, pp. 468–481.

10. W. Kang and Y. Han, “Smartpdr: Smartphone-based pedestrian dead reckoning
for indoor localization,”*IEEE Sensors journal, vol. 15, no. 5, pp. 2906–2916, 2014.*

11. K.-H. Chow, S. He, J. Tan, and S.-H. G. Chan, “Eﬃcient locality classiﬁcation
for indoor ﬁngerprint-based systems,” *IEEE Transactions on Mobile Computing,*
vol. 18, no. 2, pp. 290–304, 2019.

12. Z. Chen, Q. Zhu, and Y. C. Soh, “Smartphone inertial sensor-based indoor local-
ization and tracking with ibeacon corrections,”*IEEE Transactions on Industrial*
*Informatics, vol. 12, no. 4, pp. 1540–1549, 2016.*

13. J. Ollion, J. Cochennec, F. Loll, C. Escud´e, and T. Boudier, “Tango: a gener- ic tool for high-throughput 3d image analysis for studying nuclear organization,”

*Bioinformatics, vol. 29, no. 14, pp. 1840–1841, 2013.*

14. C. Peng, G. Shen, and Y. Zhang, “Beepbeep: A high-accuracy acoustic-based sys-
tem for ranging and localization using cots devices,”*ACM Transactions on Embed-*
*ded Computing Systems (TECS), vol. 11, no. 1, p. 4, 2012.*

15. B. Zhou, M. Elbadry, R. Gao, and F. Ye, “Battracker: high precision infrastructure-
free mobile device tracking in indoor environments,” in*Proceedings of the 15th ACM*
*Conference on Embedded Network Sensor Systems.* ACM, 2017, p. 13.

16. N. Rajagopal, P. Lazik, N. Pereira, S. Chayapathy, B. Sinopoli, and A. Rowe,

“Enhancing indoor smartphone location acquisition using ﬂoor plans,” in*2018 17th*
*ACM/IEEE International Conference on Information Processing in Sensor Net-*
*works (IPSN).* IEEE, 2018, pp. 278–289.

17. X. Xu, J. Yu, Y. Chen, Y. Zhu, and M. Li, “Leveraging acoustic signals for vehicle
steering tracking with smartphones,” *IEEE Transactions on Mobile Computing,*
2019.

18. Z. Wang, H. Zhao, S. Qiu, and Q. Gao, “Stance-phase detection for zupt-aided
foot-mounted pedestrian navigation system,”*IEEE/ASME Transactions on Mecha-*
*tronics, vol. 20, no. 6, pp. 3170–3181, 2015.*

19. H. Yang, R. Zhang, J. Bordoy, F. H¨oﬂinger, W. Li, C. Schindelhauer, and L. Reindl,

“Smartphone-based indoor localization system using inertial sensor and acoustic
transmitter/receiver,”*IEEE Sensors Journal, vol. 16, no. 22, pp. 8051–8061, 2016.*

20. L. Zhang, M. Chen, X. Wang, and Z. Wang, “Toa estimation of chirp signal in
dense multipath environment for low-cost acoustic ranging,”*IEEE Transactions on*
*Instrumentation and Measurement, no. 99, pp. 1–13, 2018.*

21. X. Niu, Y. Li, J. Kuang, and P. Zhang, “Data fusion of dual foot-mounted imu
for pedestrian navigation,”*IEEE Sensors Journal, vol. 19, no. 12, pp. 4577–4584,*
2019.