2010 Asia Pacific Conference on Circuits and Systems (APCCAS 2010)
6 - 9 December 2010, Kuala Lumpur, Malaysia
Footstep Detection and Denoising using a Single Triaxial Geophone
Vinod V. Reddy, V. Divya, Andy W. H. Khong and B. P. Ng
School of Electrical and Electronic Engineering
Nanyang Technological University, Singapore
email: {e060001, divy0012, AndyKhong, ebpng}@ntu.edu.sg
Abstract—In this paper we propose a new footstep detection technique for data acquired using a triaxial geophone. The idea evolves from an investigation of the geophone transduction principle. The technique exploits the randomness of neighbouring data vectors observed when the footstep is absent. We extend the same principle to triaxial signal denoising. The effectiveness of the proposed technique for transient detection and denoising is demonstrated on real seismic data collected using a triaxial geophone.
I. INTRODUCTION
The problem of signal detection in noise has been studied
for several decades. Conventionally, statistical hypothesis tests
are formulated to detect sources embedded in noise. Very
efficient likelihood tests are devised for deterministic and
random signal cases [1]. The problem is more challenging
when it comes to the detection of intermittent sources with
very small pulse width, encountered in applications such as
machine fault [2] and footstep detection [3] for surveillance,
among others. Our work is motivated by the problem of
detecting footsteps using a single three-axis
geophone. The use of triaxial geophones is becoming more
popular due to the ease of deployment as well as the additional
information obtained at almost the same cost as that of a
single-axis geophone.
Footsteps can be characterized as transient seismic events
propagating through the ground. Some of the existing detection
techniques for such transient signals include evaluating the
eigenvalues of short-time segment autocorrelation matrices,
kurtosis of short-time segments [3], cadence [3] and spectrum
analysis [4]. The first two metrics require a pre-defined thresh-
old to declare the presence of the source while the latter two
are based on data specific conditions. The most common signal
model used for sensor output is given by
$$x(k) = \sum_{l=0}^{N-1} s(k-l)\,h(l) + n(k), \tag{1}$$
where x(k) is the channel sensor output at time index k, s(k) is
the source, n(k) is the additive noise, h(l) is the lth coefficient
of the channel response between the source and the sensor,
while N is the length of the channel response.
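A minimal sketch of the signal model in (1) may help fix the notation. The impulsive source, the three-tap channel response and the noise level used here are hypothetical values chosen only for illustration; they are not taken from the recordings discussed in this paper.

```python
import numpy as np

def sensor_output(s, h, noise_std, rng):
    """Model (1): x(k) = sum_{l=0}^{N-1} s(k-l) h(l) + n(k), truncated to len(s)."""
    clean = np.convolve(s, h)[: len(s)]           # source filtered by the channel
    n = rng.normal(0.0, noise_std, size=len(s))   # additive noise n(k)
    return clean + n

rng = np.random.default_rng(0)
s = np.zeros(64)
s[10] = 1.0                        # hypothetical impulsive source at k = 10
h = np.array([1.0, 0.5, 0.25])     # hypothetical channel response, N = 3
x_clean = sensor_output(s, h, 0.0, rng)    # noise-free output
x_noisy = sensor_output(s, h, 0.05, rng)   # same source with additive noise
```

With the noise switched off, the output is simply the channel response shifted to the source onset, which is the transient that a detector has to find once n(k) is added.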
In practice, the sensor signals are subjected to some kind
of preprocessing prior to detection. Signal denoising is a
common technique used to suppress the effect of n(k) in (1).
Wavelet denoising is one of the most widely used techniques;
it transforms x(k) to the wavelet domain, where the signal,
unlike the noise, has a compact representation. The technique
presented in [5] performs a wavelet packet transformation and
uses kurtosis as a criterion to distinguish wavelets correspond-
ing to signal from that of noise. The noise coefficients are
suppressed to obtain the signal with a higher SNR in time
domain.
In this paper, we first propose a new technique for footstep
detection using a triaxial geophone where three sensors are co-
located orthogonally within a single casing. We achieve detec-
tion by introducing two new metrics which exhibit distinction
between the signal and noise. This discrimination is based
on the geophone transduction principle and the independence
of the signals acquired in each of the co-located sensors.
Furthermore, we adopt this principle for signal denoising prior
to the succeeding stages in the footstep detection system.
The advantage of the proposed algorithm for both footstep
detection and denoising is its effectiveness and reduced com-
putational complexity.
II. PROPOSED METHOD
A. Geophone transduction principle
The geophone is a transducer which induces voltage pro-
portional to the medium particle velocity using the principle
of electromagnetic induction [6]. Any relative motion between
the suspended coil and the magnetic case generates a nonzero
output voltage. When there are no seismic events, the in-
duced voltages between the three orthogonal channels of the
geophone are uncorrelated. The background noise is due to
the random relative motion between the suspended mass and
the magnet, resulting in a nonzero voltage in each of the
three channels. Seismic waves originating from events such
as earthquakes or footsteps propagate through the ground in
all directions and the coupling of the triaxial geophone with
the ground detects the velocity of the particle motion at that
location. The voltage acquired by each channel is therefore
proportional to the particle velocity being decomposed onto
the three orthogonal axes.
Defining x1(k), x2(k) and x3(k) as the received signals
from the two horizontal and one vertical axis, respectively, we
denote
x(k) = [x1(k) x2(k) x3(k)]T (2)
as the received signal vector at time instance k. In the absence
of footsteps, it is expected that consecutive instances of x(k)
978-1-4244-7456-1/10/$26.00 ©2010 IEEE
Fig. 1. Evolution of x̄(k) corresponding to (a) noise only snapshots, (b) transient (footstep) signal snapshots. Both panels are 3D plots over the H1, H2 and V axes, each spanning −1 to 1.
are uncorrelated with each other. In the presence of footsteps,
however, the voltage induced in all the three axes due to
this seismic event is proportional to the particle velocity.
Since the particle motion is well defined, the consecutive data
vectors will be correlated. Due to the medium elasticity, the
particle velocity varies smoothly with time resulting in varying
voltages at consecutive snapshots.
At a sampling frequency greater than the Nyquist rate, the
correlation between consecutive data samples can be observed
explicitly by normalizing each data vector with its ℓ2-norm,
$$\bar{x}(k) = \frac{x(k)}{\|x(k)\|_2} = [\bar{x}_1(k)\ \bar{x}_2(k)\ \bar{x}_3(k)]^T. \tag{3}$$
These normalized data vectors vary slowly and consistently
in the presence of a footstep, unlike the raw vectors x(k).
In the absence of a seismic event, however, the vectors are expected
to be highly random.
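The normalization in (3) amounts to scaling every column of the 3 x K data matrix to unit length. The sketch below assumes the snapshots are stored column-wise in a NumPy array; the small floor `eps` is an added safeguard against all-zero snapshots and is not part of (3).

```python
import numpy as np

def normalize_snapshots(X, eps=1e-12):
    """Scale each column x(k) of a 3 x K matrix to unit l2-norm, as in (3)."""
    norms = np.linalg.norm(X, axis=0, keepdims=True)   # ||x(k)||_2 for every k
    return X / np.maximum(norms, eps)                  # eps guards against division by zero

# Two hypothetical snapshots: (3, 4, 0) and (0, 0, 2).
X = np.array([[3.0, 0.0],
              [4.0, 0.0],
              [0.0, 2.0]])
X_bar = normalize_snapshots(X)   # columns become (0.6, 0.8, 0) and (0, 0, 1)
```

After this step only the direction of the particle motion is retained, which is the quantity expected to evolve smoothly during a footstep.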
Figures 1 (a) and (b) show an illustrative example of how
x̄(k) varies with time for two separate instances of recorded
background noise and footsteps, respectively. For each of these
plots, x̄(k) is plotted on a three dimensional vector space.
For clarity, at each index k, the point x̄(k) is plotted as
a line segment from the origin. From Fig. 1 (a), we note
that x̄(k) varies randomly over consecutive time instances
for background noise, while Fig. 1 (b) shows x̄(k) varying
smoothly, forming a disc profile, for a footstep signal. This
finding is consistent with the quaternion eigenaxis studies
discussed in [7] for elliptically polarized data. The distinction
between noise and signal is evidently the slow variation of
consecutive data vectors in the second case. Based on this
observation, we propose two distance metrics, which
are subsequently used to develop a detection rule.
B. Neighbourhood Euclidean Distance Metric
For multi-dimensional vectors, the most commonly used
distance metric is the Euclidean distance (ED), defined by
$$e_d = \|x - y\|_2,$$
where x, y ∈ R^M. From Fig. 1, the ED between consecutive
data vectors is expected to approach zero when a footstep
is present, while a high ED is anticipated for noise only
data snapshots. We therefore construct a time-domain Neighbourhood
Euclidean Distance (NED) metric of all consecutive
normalized data vectors,
$$e_x(k) = \|\bar{x}(k+1) - \bar{x}(k)\|_2. \tag{4}$$
Figure 2 (a) illustrates the variation of ex(k) along with the
scaled signal recorded from one of the triaxial geophone
channels resampled to 8 kHz. For this illustrative example,
s(k) is generated using a hammer stroke at a distance of 18 m
from the sensor. As expected, the variation of ex(k) is high in
the noise only time segments, whereas for the time segments
where the source is active, ex(k) varies less significantly.
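The behaviour of the NED in (4) can be illustrated on synthetic data. In the sketch below, white Gaussian noise stands in for the noise only segments and a 20 Hz elliptically polarized sinusoid sampled at 8 kHz stands in for a seismic event; both are assumptions made for illustration, not the recorded hammer stroke data.

```python
import numpy as np

def ned(X):
    """Neighbourhood Euclidean Distance e_x(k) of (4) for a 3 x K data matrix."""
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    X_bar = X / np.maximum(norms, 1e-12)      # unit-norm snapshots, as in (3)
    return np.linalg.norm(X_bar[:, 1:] - X_bar[:, :-1], axis=0)

# Hypothetical data: white noise versus a slowly rotating particle-motion vector.
rng = np.random.default_rng(0)
noise = rng.standard_normal((3, 2000))
t = np.arange(2000) / 8000.0
smooth = np.vstack([np.sin(2 * np.pi * 20 * t),
                    0.8 * np.sin(2 * np.pi * 20 * t + np.pi / 3),
                    0.6 * np.sin(2 * np.pi * 20 * t + 2 * np.pi / 3)])

e_noise = ned(noise)     # large and erratic, bounded above by 2
e_smooth = ned(smooth)   # small, since consecutive directions nearly coincide
```

Because both arguments of the distance are unit vectors, the NED can never exceed 2, which limits the dynamic range available to a threshold rule.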
Although a distinction between noise and signal can be
made using ex(k), the high temporal variation of ex(k) makes
signal detection challenging especially if a threshold rule is
applied to ex(k). To address this, we process overlapping time
frames of ex(k), e(b) = [ex((b−1)L+1) · · · ex(bF)]^T, where
b, L and F are the frame index, frame shift length and frame
size, respectively. The variance
$$\sigma_e(b) = \frac{1}{F}\, e^T(b)\, e(b), \tag{5}$$
is plotted in Fig. 2 (b). An overlapping factor of 0.85 is used.
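The framing and the statistic in (5) can be sketched as follows. The code follows (5) literally, so no mean is removed before squaring. It also assumes that every frame holds exactly F samples and that frame b starts at sample (b−1)L+1; this is one reading of the frame definition and should be treated as an assumption.

```python
import numpy as np

def frame_statistic(e, F, L):
    """sigma(b) = e(b)^T e(b) / F over frames of F samples shifted by L samples."""
    n_frames = 1 + (len(e) - F) // L          # only complete frames are used
    out = np.empty(max(n_frames, 0))
    for b in range(len(out)):
        frame = e[b * L : b * L + F]          # zero-based start of frame b
        out[b] = frame @ frame / F
    return out

# Hypothetical input: six samples split into three non-overlapping frames.
e = np.arange(6.0)
sigma = frame_statistic(e, F=2, L=2)          # frames [0,1], [2,3], [4,5]
```

For a 50 ms frame at 8 kHz, F would be 400 samples; the frame shift then follows from the chosen overlapping factor.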
As can be seen, although the high temporal variation of
ex(k) is reduced, σe(b) does not show significant distinction
between the source signal and noise only segments. This leads
to difficulty in defining a detection rule. One of the reasons
for this behaviour is that since unit norm data vectors x̄(k) are
used, the maximum value ex(k) can take is 2. Therefore, σe(b) is not significant for segregating the source and noise classes.
Furthermore, for polarized particle motion, it is possible that
Fig. 2. Euclidean distance in 3D space as a metric: (a) recorded signal with seismic event along with the corresponding NED, (b) recorded signal along with the variance of NED with F = 50 ms and L = 0.85 ∗ 50 ms. Both panels plot amplitude against sample index.
the signal is prolonged in one of the channels. Due to the
background noise in other channels, σe(b) is expected to
be high in those frames. Considering the above limitations,
we further propose a simple metric based on the ratio of
consecutive samples.
Fig. 3. Neighbourhood Ratio as a metric for transient signal detection. Signals and the corresponding σyi from (a) horizontal axis 1 (H1, σy1), (b) horizontal axis 2 (H2, σy2), (c) vertical axis (V, σy3). All panels plot amplitude against sample index.
C. Neighbourhood Ratio Metric
As noted in Section II-A, the consecutive time instances
of x̄(k) corresponding to the footstep time segments vary
slowly if the sampling frequency is higher than the Nyquist
rate. This implies that, for each channel the normalized signal
x̄i(k), defined in (3) exhibits smooth variations for the footstep
duration. With this understanding, we propose to employ the
neighbourhood ratio (NR) metric for each channel, given by
$$y_i(k) = \begin{cases} \bar{x}_i(k+1)/\bar{x}_i(k), & \bar{x}_i(k) > \delta \\ 1, & \text{otherwise,} \end{cases} \tag{6}$$
where δ is a small value that excludes data vectors near
the zero crossings from consideration, since under such
circumstances noise dominates the transduced voltage. It is
important to note that there are as many yi(k) as there are
sensors, unlike in the NED case.
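A direct transcription of (6) is sketched below. It applies the condition exactly as written, x̄i(k) > δ, so negative samples also fall into the "otherwise" branch; whether a magnitude test was intended is not stated, and the code does not assume one. The input is taken to be already normalized as in (3).

```python
import numpy as np

def neighbourhood_ratio(X_bar, delta):
    """NR metric y_i(k) of (6) for every channel of a normalized M x K matrix."""
    num = X_bar[:, 1:]               # x_bar_i(k + 1)
    den = X_bar[:, :-1]              # x_bar_i(k)
    Y = np.ones_like(num)            # default value 1 for the "otherwise" branch
    mask = den > delta               # literal condition from (6)
    Y[mask] = num[mask] / den[mask]
    return Y

# Hypothetical single-channel example with one sample below delta.
X_bar = np.array([[0.5, 0.6, 0.01, 0.3]])
Y = neighbourhood_ratio(X_bar, delta=0.05)
```

In this example the third ratio is replaced by 1 because its denominator, 0.01, does not exceed δ, which avoids the large value 0.3/0.01 that a zero crossing would otherwise produce.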
When a footstep is present, yi(k) will be close to unity while
for noise only segments this ratio varies randomly. As with
ex(k), the variance σyi(b) is computed over overlapping time
frames of yi(k) for each channel using (5). This variance for each of
the three axes is shown in Fig. 3 (a-c). A scaled version of the
corresponding sensor signal is shown for comparison. For
the same data used to obtain Fig. 2 (b), we observe from Fig. 3
that σyi(b) drops close to zero when the signal is active and
rises to a higher value for noise only segments. Comparing
Figs. 2 (b) and 3, we note that σyi(b) discriminates transient
signals better than σe(b). This confirms that the NR metric
discriminates the signal from the background
noise better than the NED metric.
In order to define a detection rule, we propose to use a
function of the variance σyi(b) given by
$$z_i(b) = \frac{1}{(\sigma_{y_i}(b) + 1)^P}, \tag{7}$$
where P > 1 is an integer. The value of zi(b) approaches
unity in the presence of footstep and reduces to a low value
otherwise. The function that maps σyi to zi is plotted in
Fig. 4 for varying values of P . We note that for higher values
Fig. 4. Function mapping σyi to zi for P = 1 to 6. The horizontal axis is σyi (0 to 5) and the vertical axis is zi (0 to 1).
of P, zi(b) decays faster with increasing variance. Therefore,
in noisy environments, a high P is required to reduce false
alarms. The complete procedure, including the detection rule, is
provided in Table I. We note that although zi(b) is computed
for detection in each channel, it is possible to combine these
detection results to provide a unified robust solution.
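The effect of P in (7) is easy to check numerically. The sketch below evaluates the mapping for a few variance values; the chosen inputs are illustrative only.

```python
import numpy as np

def variance_to_score(sigma, P):
    """Mapping (7): z = 1 / (sigma + 1)^P."""
    return 1.0 / (np.asarray(sigma, dtype=float) + 1.0) ** P

z_quiet = variance_to_score(0.0, 5)    # zero variance maps to 1.0
z_p1 = variance_to_score(1.0, 1)       # 1 / 2   = 0.5
z_p5 = variance_to_score(1.0, 5)       # 1 / 2^5 = 0.03125
```

For the same unit variance, raising P from 1 to 5 lowers the score from 0.5 to about 0.03, so a fixed threshold γ rejects moderately noisy frames far more readily at higher P.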
TABLE I
STEPS FOR FOOTSTEP DETECTION USING TRIAXIAL GEOPHONE
1. Each vector of the multichannel data matrix X = [x(1) ... x(K)], where K is the number of data snapshots, is normalized to have unit ℓ2-norm.
2. The matrix of the NR metric is obtained as Y = [y(1) ... y(K − 1)], where y(k) = [y1(k) y2(k) y3(k)]^T with yi(k) as defined in (6).
3. The variance of each channel of Y is computed over overlapped time frames using (5). The function of the variance defined in (7) is then evaluated.
4. Signal detection rule: if zi(b) > γ, declare that the signal is present in this frame.
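The steps of Table I can be sketched end to end as follows. This is an illustrative reading, not a reference implementation. In particular, it assumes the per-frame variance in step 3 is the mean-removed sample variance: since yi(k) is close to one during a footstep, an uncentred mean square would sit near one rather than near zero. The frame size F and shift L are given in samples and their defaults are hypothetical; δ, P and γ default to the values quoted in the experiments.

```python
import numpy as np

def nr_detect(X, delta=0.05, P=5, gamma=0.6, F=400, L=200):
    """Per-channel NR detection for a 3 x K matrix X; requires K - 1 >= F."""
    # Step 1: unit l2-norm snapshots, as in (3).
    X_bar = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    # Step 2: NR metric (6), with the literal condition x_bar_i(k) > delta.
    num, den = X_bar[:, 1:], X_bar[:, :-1]
    Y = np.ones_like(num)
    mask = den > delta
    Y[mask] = num[mask] / den[mask]
    # Step 3: per-channel variance over overlapping frames (mean-removed, assumed).
    n_frames = 1 + (Y.shape[1] - F) // L
    sigma = np.stack([Y[:, b * L : b * L + F].var(axis=1)
                      for b in range(n_frames)], axis=1)
    z = 1.0 / (sigma + 1.0) ** P               # mapping (7)
    # Step 4: detection rule.
    return z, z > gamma

# Hypothetical recording: 0.5 s of noise followed by 0.5 s of a 20 Hz transient.
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
event = np.vstack([np.sin(2 * np.pi * 20 * t),
                   0.8 * np.sin(2 * np.pi * 20 * t + np.pi / 3),
                   0.6 * np.sin(2 * np.pi * 20 * t + 2 * np.pi / 3)])
X = np.hstack([0.01 * rng.standard_normal((3, 4000)), event])
z, detected = nr_detect(X)
```

The function returns one score and one decision per channel and frame; combining the three channels, for example by requiring agreement between them, is left open here as it is in the text.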
D. Signal Denoising using the Proposed NR Metric
Based on the above discussion, we extend the same principle
for signal denoising. The variable zi(b) defined in (7) provides
a value close to unity when the signal is present and a value
close to zero otherwise. Weighting the sensor signal with zi(b) will therefore result in a denoised signal given by,
wi(k) = xi(k)zi(k), ∀k, (8)
where wi(k) is the denoised signal of the ith sensor.
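Applying (8) requires a weight for every sample, whereas zi(b) is defined per frame. The sketch below bridges the two by averaging the scores of all frames that cover a given sample. This interpolation rule is an assumption introduced here, since the text does not specify how zi(b) is mapped to zi(k).

```python
import numpy as np

def frame_to_sample_weights(z, K, F, L):
    """Average the frame scores z(b) over all frames covering each of K samples."""
    acc = np.zeros(K)
    cnt = np.zeros(K)
    for b, zb in enumerate(z):
        acc[b * L : b * L + F] += zb      # slices are clipped at K automatically
        cnt[b * L : b * L + F] += 1
    cnt[cnt == 0] = 1                     # samples covered by no frame get weight 0
    return acc / cnt

def denoise_channel(x, z, F, L):
    """Denoised signal w_i(k) = x_i(k) z_i(k) of (8) for one channel."""
    return x * frame_to_sample_weights(z, len(x), F, L)

# Hypothetical example: two overlapping frames, the first flagged as signal.
x = np.ones(6)
w = denoise_channel(x, z=np.array([1.0, 0.0]), F=4, L=2)
```

Samples covered only by the first frame keep their amplitude, samples shared by both frames are halved, and samples covered only by the second frame are suppressed.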
III. EXPERIMENTAL RESULTS
We now present results of the proposed technique in the
context of footstep detection and signal denoising. Since
the technique exploits the transduction principle of the co-
located geophones, the performance can be studied only
on recorded seismic data. The setup consists of a triaxial
geophone (Geospace GS32-CT) whose output is preamplified
prior to digitization using a multi-channel ADC. The geophone
is buried in an open grass field and human footsteps are used
to generate seismic events at desired distance from the sensor.
The signal is downsampled to 8 kHz for processing.
Fig. 5. Footstep detection performance: (a) footstep signal in horizontal axis, (b) detection results using kurtosis, (c) detection results using proposed technique. All panels plot amplitude against time in seconds (0 to 12 s); annotations mark footsteps at 5 m, 12 m, 15 m and 18 m from the sensor.
For validating the footstep detection technique, we use
data recorded when a person walks radially away from the
geophone starting from a distance of 5 m to 18 m. For
clarity of presentation, the recorded footsteps in one of the
axes are magnified and plotted in Fig. 5 (a). Figure 5 (b)
shows the footsteps detected using the kurtosis measure as
presented in [3]. If the kurtosis, computed over 200 ms time
frames with an overlapping factor of 0.75, is greater than
5, a footstep is declared. For the proposed NR technique,
Fig. 5 (c) is obtained with the parameters δ, P and γ set
to 0.05, 5 and 0.6 respectively. It can be observed that the
detection performance of the NR technique is more reliable
when compared to that of the kurtosis technique. This is due
to the exploitation of the geophone transduction principle by
the proposed NR algorithm. Setting the thresholds based on
the real-time background noise profile ensures a highly reliable
transient detection.
The range of a footstep detection algorithm is dependent
on the footstep intensity, medium composition and the pream-
plification provided. Therefore, a fair comparison would be
to compare the two methods for a given dataset. For the
above data, we observe that the proposed technique can detect
footsteps up to approximately 15 m while the kurtosis method
succeeds in detecting footsteps only up to approximately 12 m.
We next present the denoising capability of the proposed
technique. As described in Section II-D, denoising is achieved
by weighting the sensor output with the window defined in (7).
The footsteps shown in Fig. 6 (a) refer to the data recorded
from the horizontal component of the geophone buried at
(0, 0) m and the person is walking from (−5, 15) m to
(5, 15) m. For comparison, denoising achieved by wavelet
packet method proposed in [5] is shown in Fig. 6 (b) while
Fig. 6 (c) shows the denoised signal wi(k) obtained by the
proposed technique with δ and P set to 0.05 and 6, respectively.
Fig. 6. Signal denoising: (a) recorded signal at 8 kHz sampling frequency, (b) denoised signal using the wavelet packet method, (c) denoised signal using the proposed technique with NR time window. All panels plot amplitude against time in seconds.
In order to quantify the noise suppression achieved, we
evaluate the average signal-to-noise ratio (SNR) over 11 footstep
and noise segments. For each segment, the signal power is
computed over a window of 250 ms containing a footstep
while the noise power is evaluated for the remaining time
segment. The average SNR for the geophone signal shown
in Fig. 6 (a), the denoised signal obtained by the method in
[5] shown in Fig. 6 (b) and the proposed NR-based denoising
method shown in Fig. 6 (c) are found to be 10 dB, 23.7 dB
and 28.8 dB, respectively. This SNR improvement demonstrates the
denoising capability of the proposed technique.
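The SNR measure described above can be sketched as follows. The segment boundaries and the footstep window position are assumed to be known in advance, and the average is taken over the per-segment values in dB; neither detail is specified in the text, so both are assumptions. At 8 kHz a 250 ms window corresponds to 2000 samples.

```python
import numpy as np

def segment_snr_db(segment, start, win):
    """SNR of one segment: power in the footstep window over power elsewhere."""
    signal = segment[start : start + win]
    noise = np.concatenate([segment[:start], segment[start + win :]])
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def average_snr_db(segments, starts, win):
    """Mean of the per-segment SNR values in dB (averaging in dB is assumed)."""
    return float(np.mean([segment_snr_db(s, k, win)
                          for s, k in zip(segments, starts)]))

# Hypothetical segment: amplitude 1.0 inside the window, 0.1 outside.
seg = 0.1 * np.ones(1000)
seg[400:600] = 1.0
snr = segment_snr_db(seg, start=400, win=200)   # power ratio of 100
```

A power ratio of 100 between the window and the remainder gives 20 dB, which makes the scale of the reported values easier to interpret.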
IV. CONCLUSION
We presented an effective footstep detection algorithm based
on the transduction principle of a triaxial geophone. The
proposed Neighbourhood Ratio metric is found to have an im-
proved performance over the conventional Euclidean distance.
Extending this principle for signal denoising is observed to
provide promising results over wavelet packet denoising.
REFERENCES
[1] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume 2: Detection Theory. Pearson Education, 1998.
[2] Z. K. Zhu, R. Yan, L. Luo, Z. H. Feng, and F. R. Kong, “Detection of signal transients based on wavelet and statistics for machine fault diagnosis,” Mechanical Systems and Signal Processing, vol. 23, no. 4, pp. 1076–1097, May 2009.
[3] R. G. G. Succi, D. Clapp and G. Prado, “Footstep detection and tracking,” vol. 4393, 2001, pp. 22–29.
[4] K. M. Houston and D. P. McGaffigan, “Spectrum analysis techniques for personnel detection using seismic sensors,” in Proc SPIE Conf on UGS Technologies and Applications V, vol. 5090, 2003, pp. 162–173.
[5] P. Ravier and P.-O. Amblard, “Wavelet packets and de-noising based on higher-order-statistics for transient detection,” Signal Processing, vol. 81, no. 9, pp. 1909–1926, 2001.
[6] W. Lowrie, Fundamentals of Geophysics, 2nd ed. Cambridge University Press, 2007.
[7] N. Le Bihan and J. Mars, “Singular value decomposition of quaternion matrices: a new tool for vector-sensor signal processing,” Signal Processing, vol. 84, no. 7, pp. 1177–1199, 2004.