
Date post: 19-Oct-2020

Pinna Feature Extraction from hand-held device and HRTF response recovery

Gabriele Carotti-Sha, Yujia Zhang
Department of Electrical Engineering, Stanford University

III. Image Preprocessing and ear localization

•  Convert to gray scale
•  Enhance with contrast limited adaptive histogram equalization
•  Apply median filter to reduce details in the image
•  Edge extraction using Canny detector with threshold
•  Apply morphological operators
   1. Dilation: ellipse, size = 3
   2. Erosion: ellipse, size = 3
•  Identify connected edges
•  Collect region properties for each area:
   a. Bounding box
   b. Extent
   c. Fitted ellipse
   d. Eccentricity
   e. Orientation
•  Select the correct bounding box based on:
   - Extent < 0.1 (thin edge)
   - Eccentricity: [0.6, 0.9]
   - Orientation: [0, 40], [140, 200]
•  Iterate and find the exterior contour
•  Resize bounding box to ensure the full ear is contained in the selected region
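
The bounding-box selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Region` class, its field names, the interpretation of orientation as degrees, and the 20% padding margin are hypothetical and do not come from the poster. Only the three thresholds are taken from the list.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Properties of one connected edge region (field names are illustrative)."""
    bbox: tuple          # (x, y, width, height) in pixels
    extent: float        # region area divided by bounding-box area
    eccentricity: float  # eccentricity of the fitted ellipse
    orientation: float   # major-axis angle of the fitted ellipse (assumed degrees)


def is_ear_candidate(r):
    """Apply the three selection criteria from the list above."""
    thin_edge = r.extent < 0.1
    elongated = 0.6 <= r.eccentricity <= 0.9
    upright = 0 <= r.orientation <= 40 or 140 <= r.orientation <= 200
    return thin_edge and elongated and upright


def pad_bbox(bbox, image_size, margin_percent=20):
    """Grow a bounding box on every side and clip it to the image.

    The 20% margin is an assumed value, not one given in the poster.
    """
    x, y, w, h = bbox
    img_w, img_h = image_size
    dx, dy = w * margin_percent // 100, h * margin_percent // 100
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(img_w, x + w + dx), min(img_h, y + h + dy)
    return (x0, y0, x1 - x0, y1 - y0)


# Made-up regions: only the first one satisfies all three criteria.
regions = [
    Region(bbox=(120, 80, 100, 200), extent=0.06, eccentricity=0.82, orientation=15.0),
    Region(bbox=(10, 10, 300, 40), extent=0.05, eccentricity=0.98, orientation=90.0),
    Region(bbox=(200, 300, 50, 50), extent=0.45, eccentricity=0.70, orientation=20.0),
]
candidates = [r for r in regions if is_ear_candidate(r)]
roi = pad_bbox(candidates[0].bbox, image_size=(480, 640))
```

Here `roi` is the enlarged region that would be passed on to the feature detector.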

IV. Finding the nearest HRTF

•  Locate the ear contours in the image and only apply feature detectors to the region of interest
•  Extract SURF descriptors of ear images from CIPIC database (MATLAB) (http://interface.cipic.ucdavis.edu/sound/hrtf.html)
•  Build a database consisting of ear features and corresponding HRTFs (41 left, 11 right, stored in mobile device)
•  Compute SURF descriptors and find the closest HRTF using K nearest neighbor match method (OpenCV, iOS)
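
The matching step can be illustrated with a brute-force k-nearest-neighbor vote in plain Python. This is a sketch under assumptions, not the poster's OpenCV implementation: the function name `knn_match`, the voting rule, the subject labels, and the toy 4-value descriptors are all hypothetical (real SURF descriptors have 64 or 128 values).

```python
import heapq
import math
from collections import Counter


def knn_match(query_descriptors, database, k=2):
    """Return the database subject that collects the most k-NN votes.

    `database` maps a subject label to a list of descriptor vectors.
    """
    # Flatten the database into (subject, descriptor) pairs.
    entries = [(subject, d) for subject, descs in database.items() for d in descs]
    votes = Counter()
    for q in query_descriptors:
        # The k database descriptors closest to q in Euclidean distance.
        nearest = heapq.nsmallest(k, entries, key=lambda e: math.dist(q, e[1]))
        votes.update(subject for subject, _ in nearest)
    return votes.most_common(1)[0][0]


# Toy descriptors standing in for SURF output.
database = {
    "subject_A": [[0.1, 0.9, 0.0, 0.2], [0.8, 0.1, 0.3, 0.3]],
    "subject_B": [[0.5, 0.5, 0.9, 0.9], [0.0, 0.0, 1.0, 0.7]],
}
query = [[0.12, 0.88, 0.05, 0.2], [0.79, 0.15, 0.3, 0.28]]
best = knn_match(query, database, k=2)
```

The HRTF stored for `best` would then be the one retrieved for the query ear.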

Example database ear image with SURF keypoints

Example database ear HRTF response

Proof of Concept

•  Sub-divide database images into training and testing sets
•  Form database using training set
•  Perform query search with testing images and compute the corresponding Knn matches
•  Compare query and its nearest neighbor’s HRTFs to evaluate algorithm

         Training   Testing
Left        28         13
Right        7          4
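
A split with these counts could be produced as follows. The poster does not say how images were assigned to each set, so the seeded random shuffle here is an assumption, and the IDs are placeholders rather than actual CIPIC subject numbers.

```python
import random


def split_subjects(subject_ids, n_train, seed=0):
    """Shuffle IDs reproducibly and split them into training and testing lists."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Placeholder IDs for the 41 left-ear and 11 right-ear images.
left_ids = [f"left_{i:02d}" for i in range(41)]
right_ids = [f"right_{i:02d}" for i in range(11)]

left_train, left_test = split_subjects(left_ids, n_train=28)
right_train, right_test = split_subjects(right_ids, n_train=7)
```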

I. Motivation

The pinna (outer ear) plays an important role in localizing the elevation of sound sources, as it generates a series of elevation cues while filtering the acoustic signal. This filtering can be described by a frequency response function called the head-related transfer function (HRTF). Different individuals have distinctive HRTFs, since the biometric parameters of the pinna vary significantly in size, shape, and orientation. This project designed and prototyped an algorithm that extracts pinna features from images captured by a mobile device and retrieves the closest HRTF response from a database. Both the image processing and the classification are done on the hand-held device.

II. Workflow

•  Take side face image
•  Pinna feature/HRTF database computed offline and stored in hand-held device
•  Image processing and ear detection
•  Algorithm: find the closest image and apply the associated HRTF as a filter for audio localization
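
One common way to apply an HRTF as a filter is to convolve the audio with the corresponding left and right impulse responses in the time domain. The poster does not state how the filter is applied, so the sketch below is an assumption. The function names and the 3-tap responses are made up for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Filter a mono signal with a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data: a unit impulse followed by silence, and made-up 3-tap responses.
mono = [1.0, 0.0, 0.0, 0.0]
hrir_left = [0.5, 0.25, 0.0]
hrir_right = [0.0, 0.5, 0.25]
left, right = spatialize(mono, hrir_left, hrir_right)
```

Because the input is a unit impulse, each output channel simply reproduces its impulse response, padded with zeros.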

Example of Knn Match Results

Measure of results comparison:
•  HRTF responses are characterized by their local maxima and minima (peaks and notches) and the respective center frequencies
•  The query image’s HRTF is compared against that of the match by computing a distance score for the peaks and notches
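
The peak and notch comparison can be sketched as below. The poster does not define the distance score, so the rule used here (mean absolute gap between center frequencies, with features paired in order of frequency and unpaired ones ignored) is an assumption. The function names and the 7-point responses are illustrative only.

```python
def extrema(freqs, mags_db):
    """Return (peaks, notches) as lists of (center_frequency, level_dB)."""
    peaks, notches = [], []
    for i in range(1, len(mags_db) - 1):
        prev, cur, nxt = mags_db[i - 1], mags_db[i], mags_db[i + 1]
        if cur > prev and cur > nxt:
            peaks.append((freqs[i], cur))
        elif cur < prev and cur < nxt:
            notches.append((freqs[i], cur))
    return peaks, notches


def feature_distance(a, b):
    """Mean absolute center-frequency gap between features paired by frequency order."""
    pairs = list(zip(sorted(a), sorted(b)))
    if not pairs:
        return float("inf")
    return sum(abs(fa - fb) for (fa, _), (fb, _) in pairs) / len(pairs)


def hrtf_distance(freqs, query_db, match_db):
    """Sum of the peak distance and the notch distance between two responses."""
    q_peaks, q_notches = extrema(freqs, query_db)
    m_peaks, m_notches = extrema(freqs, match_db)
    return feature_distance(q_peaks, m_peaks) + feature_distance(q_notches, m_notches)


freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000]
query = [0, 5, 2, -10, -3, 4, 1]   # peaks at 2000 and 6000 Hz, notch at 4000 Hz
match = [0, 2, 6, -2, -12, 3, 5]   # peak at 3000 Hz, notch at 5000 Hz
score = hrtf_distance(freqs, query, match)
```

A lower score means the matched response has peaks and notches closer in frequency to those of the query.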

Resulting observation:
•  The matched responses are close to the minimum-distance points over the small set. Matching can be improved by increasing the database size.

[Figure: Ear Match Results and Corresponding HRTFs. Left and right HRTF magnitude (dB) versus frequency (0 to 2.5 × 10^4 Hz) at azimuth 0, elevation 0, for Subjects 060, 124, 020, 148, and 156.]
