Position Calibration of Acoustic Sensors and Actuators on Distributed General Purpose Computing Platforms
Vikas Chandrakant Raykar | University of Maryland, College Park
Motivation
Many multimedia applications are emerging which use multiple audio/video sensors and actuators.
Microphones
Cameras
Speakers
Displays
Distributed Capture
Distributed Rendering
Other Applications: Number Crunching
Current Thesis
What can you do with multiple microphones…
Speaker localization and tracking.
Beamforming or Spatial filtering.
Some Applications…
Audio/Video Surveillance
Smart Conference Rooms
Audio/Image Based Rendering
Meeting Recording
Source Separation and Dereverberation
Speech Recognition
Hands-free Voice Communication
Speaker Localization and Tracking
Multichannel Speech Enhancement
Multichannel Echo Cancellation
Novel Interactive Audio-Visual Interfaces
More Motivation…
Current work has focused on setting up all the sensors and actuators on a single dedicated computing platform.
Dedicated infrastructure required in terms of the sensors, multi-channel interface cards and computing power.
On the other hand
Computing devices such as laptops, PDAs, tablets, cellular phones, and camcorders have become pervasive.
Audio/video sensors on different laptops can be used to form a distributed network of sensors.
Common TIME and SPACE
Put all the distributed audio/visual input/output capabilities of all the laptops into a common TIME and SPACE.
This thesis deals with common SPACE, i.e., estimating the 3D positions of the sensors and actuators.
Why common SPACE?
Most array processing algorithms require that the precise positions of the microphones be known.
Manual measurement is painful, tedious, and imprecise.
This thesis is about..
If we know the positions of the speakers…
If the distances are not exact…
If we have more speakers…
Solve in the least squares sense.
If positions of speakers unknown…
Consider M Microphones and S speakers.
What can we measure? The distance between each speaker and each microphone,
or the Time Of Flight (TOF).
M×S TOF matrix
Assume TOF corrupted by Gaussian noise.
Can derive the ML estimate.
Calibration signal
Nonlinear Least Squares..
More formally, we can derive the ML estimate using a Gaussian noise model.
Find the coordinates which minimize this.
Maximum Likelihood (ML) Estimate..
We can define a noise model and derive the ML estimate, i.e., maximize the likelihood.
If the noise is Gaussian and independent, the ML estimate is the same as the least squares solution.
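Putting the two slides together, the objective can be sketched as follows (a sketch; c denotes the speed of sound, m_i and s_j the unknown microphone and speaker coordinates):

```latex
\hat{\Theta} = \arg\min_{\{m_i\},\,\{s_j\}} \;
\sum_{i=1}^{M} \sum_{j=1}^{S}
\left( \widehat{TOF}_{ij}\, c - \lVert m_i - s_j \rVert \right)^{2}
```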
Reference Coordinate System | Multiple Global Minima
In 2D: fix the origin, the X axis, and the positive Y axis.
Similarly in 3D:
1. Fix origin (0,0,0)
2. Fix X axis (x1,0,0)
3. Fix Y axis (x2,y2,0)
4. Fix positive Z axis: x1, x2, y2 > 0
Which to choose? Later…
On a synchronized platform all is well..
However, on a distributed system..
The journey of an audio sample..
Network: this laptop wants to play a calibration signal on the other laptop.
Play command in software. When will the sound actually be played out from the loudspeaker?
The sample passes through the operating system, multimedia/multistream applications, the audio/video I/O devices, and the I/O bus.
Timing: the signal is emitted by source j when its playback starts at ts_j; it is received by microphone i, whose capture starts at tm_i (all relative to a common time origin). The measured TOF therefore includes the unknown offsets: measured TOF_ij = true TOF_ij + ts_j - tm_i.
On a Distributed system..
Joint Estimation..
Speaker emission start times: S
Microphone capture start times: M - 1 (assume tm_1 = 0)
Microphone and speaker coordinates: 3(M+S) - 6
In total: 4M + 4S - 7 parameters to estimate from the MS TOF measurements (MS observations).
Can reduce the number of parameters.
Use Time Difference of Arrival (TDOA)..
Formulation same as above, but with fewer parameters.
Assuming M = S = K, what is the minimum K required?
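For the TOF formulation above, a quick count (a derivation from the parameter totals on the previous slide; the TDOA formulation needs fewer parameters) requires at least as many observations as parameters:

```latex
MS \ge 4M + 4S - 7
\quad\text{with } M = S = K:\quad
K^{2} - 8K + 7 = (K - 1)(K - 7) \ge 0
\;\Longrightarrow\; K \ge 7 .
```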
Nonlinear least squares..
Levenberg-Marquardt method
A function of a large number of parameters.
Unless we have a good initial guess, it may not converge to the minimum.
An approximate initial guess is required.
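A minimal sketch of this refinement step, assuming a synchronized 2D setup (no start-time offsets) and illustrative names; scipy's `least_squares` supplies the Levenberg-Marquardt / trust-region machinery. The reference-frame ambiguity discussed earlier is ignored here, so the geometry is recovered only up to a rigid transform:

```python
import numpy as np
from scipy.optimize import least_squares

C = 343.0  # speed of sound (m/s), assumed

def tof_residuals(params, tof, M, S):
    """Residuals (in meters) between measured and predicted distances."""
    mics = params[:2 * M].reshape(M, 2)
    spks = params[2 * M:].reshape(S, 2)
    dist = np.linalg.norm(mics[:, None, :] - spks[None, :, :], axis=2)
    return (tof * C - dist).ravel()

# Synthetic 2D example: true geometry and noiseless TOF measurements.
rng = np.random.default_rng(0)
mics_true = rng.uniform(0, 3, (4, 2))
spks_true = rng.uniform(0, 3, (4, 2))
tof = np.linalg.norm(mics_true[:, None] - spks_true[None, :], axis=2) / C

# Start from a perturbed initial guess and refine.
x0 = np.concatenate([mics_true.ravel(), spks_true.ravel()])
x0 = x0 + rng.normal(0, 0.05, x0.shape)
sol = least_squares(tof_residuals, x0, args=(tof, 4, 4))
print(f"final cost: {sol.cost:.2e}")  # near zero for noiseless data
```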
Closed form Solution..
Say we are given all pairwise distances between N points; can we get the coordinates?
1 2 3 4
1 X X X X
2 X X X X
3 X X X X
4 X X X X
Classical Metric Multi Dimensional Scaling
Dot product matrix: symmetric, positive semidefinite, rank 3.
Given B, can you get X? Singular Value Decomposition.
Same as Principal Component Analysis.
But we can measure only the pairwise distance matrix.
How to get the dot product matrix from the pairwise distance matrix…
Using the distances from a reference point k to points i and j: b_ij = (d_ki² + d_kj² - d_ij²) / 2.
Centroid as the origin…
Later shift it to our original reference.
Slightly perturb each GPC location into two to get the initial guess for the microphone and speaker coordinates.
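The double-centering recipe above can be sketched in a few lines of numpy (illustrative code, not the thesis implementation); the recovered coordinates match the true ones up to rotation, reflection, and translation:

```python
import numpy as np

def classical_mds(D, dim=3):
    """Recover coordinates (up to a rigid transform) from pairwise distances.

    D: (N, N) matrix of Euclidean distances. Double centering yields the
    dot-product (Gram) matrix, whose eigendecomposition gives coordinates.
    """
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N      # centering matrix (centroid as origin)
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered points
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    w, V = w[::-1][:dim], V[:, ::-1][:, :dim]
    return V * np.sqrt(np.maximum(w, 0.0))   # N x dim coordinates

# Check: distances of the recovered points match the input distances.
rng = np.random.default_rng(1)
X = rng.uniform(0, 2, (5, 3))
D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
Y = classical_mds(D, dim=3)
D2 = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)
print(np.allclose(D, D2))  # True
```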
Example of MDS…
MDS is more general..
• Instead of pairwise distances we can use pairwise "dissimilarities".
• When the distances are Euclidean, MDS is equivalent to PCA.
• E.g., face recognition, wine tasting.
• Can extract the significant cognitive dimensions.
Can we use MDS? Two problems:
s1 s2 s3 s4 m1 m2 m3 m4
s1 ? ? ? ? X X X X
s2 ? ? ? ? X X X X
s3 ? ? ? ? X X X X
s4 ? ? ? ? X X X X
m1 X X X X ? ? ? ?
m2 X X X X ? ? ? ?
m3 X X X X ? ? ? ?
m4 X X X X ? ? ? ?
1. We do not have the complete pairwise distance matrix: the speaker-speaker and microphone-microphone blocks (marked ?) are UNKNOWN.
2. The measured distances include the effect of the lack of synchronization.
Clustering approximation…
The microphone and speaker on the same GPC are roughly co-located, which gives approximate distances between the GPCs themselves.
Finally the complete algorithm…
TOF matrix → Clustering Approximation → approx. distance matrix between GPCs, approx. ts, approx. tm
Approx. distance matrix → Dot product matrix → MDS to get approx. GPC locations (fix dimension and coordinate system)
Approx. GPC locations → perturb → approx. microphone and speaker locations
Approx. locations, ts, tm → TDOA-based nonlinear minimization → microphone and speaker locations, tm
Sample result in 2D…
Algorithm Performance…
• The performance of our algorithm depends on:
• the noise variance in the estimated distances,
• the number of microphones and speakers,
• the microphone and speaker geometry.
• One way to study the dependence is to run a lot of Monte Carlo simulations.
• Alternatively, we can derive the covariance matrix and the bias of the estimator.
• The ML estimate is implicitly defined as the minimum of a certain error function.
• We cannot get an exact analytical expression for the mean and variance.
• Or, given a noise model, we can derive bounds on how badly our algorithm can perform:
• the Cramer-Rao bound.
We can use the implicit function theorem and a Taylor series expansion to get approximate expressions for the bias and variance.
• J. A. Fessler. Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): Applications to tomography. IEEE Trans. Image Processing, 5(3):493-506, 1996.
• Amit Roy Chowdhury and Rama Chellappa. Statistical Bias and the Accuracy of 3D Reconstruction from Video. Submitted to International Journal of Computer Vision.
Using a first-order Taylor series expansion:
Jacobian
The Jacobian is rank deficient: remove the known parameters.
Estimator Variance…
Gives a lower bound on the variance of any unbiased estimator.
Does not depend on the estimator, just on the data and the noise model.
Basically tells us to what extent the noise limits our performance, i.e., you cannot get a variance lower than the CR bound.
Jacobian
The Jacobian is rank deficient: remove the known parameters.
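For independent Gaussian noise of variance σ² on the TOF measurements, the bound takes the standard form below (a sketch; J denotes the Jacobian of the predicted TOFs with respect to the unknown parameters, after removing the known ones):

```latex
\operatorname{cov}(\hat{\Theta}) \;\succeq\; F^{-1},
\qquad
F = \frac{1}{\sigma^{2}}\, J^{\top} J .
```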
Different Estimators..
The number of sensors matters…
Geometry also matters…
Calibration Signal…
• Compute the cross-correlation between the signals received at the two microphones.
• The location of the peak in the cross-correlation gives an estimate of the delay.
• The task is complicated for two reasons: 1. background noise; 2. channel multipath due to room reverberations.
• Use the Generalized Cross Correlation (GCC).
• W(ω) is the weighting function.
• PHAT (Phase Transform) weighting.
Time Delay Estimation…
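A minimal numpy sketch of GCC-PHAT on a synthetic integer-sample delay (illustrative, not the thesis code; real recordings need interpolation for sub-sample delays):

```python
import numpy as np

def gcc_phat(x, y):
    """Estimate the delay of y relative to x (in samples) via GCC-PHAT.

    The cross-power spectrum is whitened (PHAT weighting W = 1/|X Y*|) so the
    peak of the inverse transform depends only on phase, which is more robust
    to room reverberation than plain cross-correlation.
    """
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    R = X * np.conj(Y)
    R /= np.abs(R) + 1e-15          # PHAT weighting
    cc = np.fft.irfft(R, n=n)
    shift = np.argmax(np.abs(cc))
    if shift > n // 2:              # map circular index to a signed lag
        shift -= n
    return -shift

# Synthetic check: y is x delayed by 25 samples plus a little noise.
rng = np.random.default_rng(2)
x = rng.normal(size=2048)
y = np.concatenate([np.zeros(25), x])[:2048] + 0.05 * rng.normal(size=2048)
print(gcc_phat(x, y))  # 25
```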
Synchronized setup | bias 0.08 cm, sigma 3.8 cm
[Figure: positions of Mic 1-4 and Speaker 1-4 along the X and Z axes. Room Length = 4.22 m, Room Width = 2.55 m, Room Height = 2.03 m.]
Distributed Setup…
Initialization phase: scan the network and find the number of GPCs and the UPnP services available.
Master, GPC 1, GPC 2, …, GPC M
• GPC 1 (Speaker), GPC 2 (Mic)
• Calibration signal parameters
Play Calibration Signal / Play ML Sequence
TOA computation → TOA matrix → Position estimation
Experimental results using real data
Related Previous work…
• J. M. Sachar, H. F. Silverman, and W. R. Patterson III. Position calibration of large-aperture microphone arrays. ICASSP 2002.
• Y. Rockah and P. M. Schultheiss. Array shape calibration using sources in unknown locations Part II: Near-field sources and estimator implementation. IEEE Trans. Acoust., Speech, Signal Processing, ASSP-35(6):724-735, June 1987.
• J. Weiss and B. Friedlander. Array shape calibration using sources in unknown locations: a maximum-likelihood approach. IEEE Trans. Acoust., Speech, Signal Processing, 37(12):1958-1966, December 1989.
• R. Moses, D. Krishnamurthy, and R. Patterson. A self-localization method for wireless sensor networks. EURASIP Journal on Applied Signal Processing, Special Issue on Sensor Networks, 2003(4):348-358, March 2003.
Our Contributions…
• Novel setup for array processing.
• Position calibration in a distributed scenario.
• Closed-form solution to initialize the non-linear minimization routine.
• Expressions for the mean and variance of the estimators.
• Study of the effect of sensor geometry.
Acknowledgements…
• Dr. Ramani Duraiswami and Prof. Rama Chellappa
• Prof. Yegnanarayana
• Dr. Igor Kozintsev and Dr. Rainer Lienhart, Intel Research
• Prof. Min Wu and Prof. Shihab Shamma
• Prof. Larry Davis
Thank You! | Questions?