MATLAB Based Interactive Music Player using XBOX Kinect

EN.600.461 Final Project

Gowtham G. Piyush R. Ashish K.

(ggarime1, proutra1, akumar34)@jhu.edu

Johns Hopkins University, Baltimore, USA

1. Abstract

The launch of the XBOX Kinect opened new avenues for 3D perception thanks to its easy-to-use, out-of-the-box depth and color video, and the range of applications built on it keeps widening across software and hardware. Music players are part of everyday life, and although many ways of controlling them exist, simpler ones are always desirable. In this project, we create a gesture-based 3D user interface for playing audio from a MATLAB-based graphical user interface (GUI). The environment is assumed to be a background containing multiple marker objects, and detecting a hand over them triggers processing of the data. When a gesture is detected over a particular area, the corresponding function in the GUI is activated. The music player responds promptly to each gesture on which the system has been trained. The setup was tested on a set of saved images as well as on real-time images from a Kinect sensor. The results varied across operating systems, as discussed later, but were satisfactory.

2. Aims of the Project

The aims identified at the start of the project were:

2.1 Choice of environment for camera view. We assumed a top-down camera view in order to emphasize the capabilities of the Kinect sensor over general-purpose cameras, which provide only a 2D image of the scene. However, the Kinect has its own limitations and does not give usable images at very close range. To be detected properly, an object has to be at least 0.5 metres away from the sensor [1].

2.2 Identification of marker objects in real world. We preferred to have physical objects that correspond to buttons in the music player. Predefined marker objects make it easier for users other than the programmer to operate the music player. These objects should be detected when the setup starts.

2.3 Background subtraction and filtering of noise. One of the aims of the project is to identify dynamic objects and remove the background, i.e. static objects. This reduces clutter in the image and lets us focus on the objects of interest, such as hand gestures, which are dynamic.

2.4 Gesture recognition. Detecting and differentiating between gestures reduces the need for additional marker objects. It also makes more efficient use of the Kinect sensor.

2.5 Music Player development. Since the project focuses on the computer vision part, a music player less complex than commercially available ones is better suited for testing. However, we wanted the graphical user interface to be comprehensive enough to offer the primary functions ‘Play’, ‘Pause’, ‘Stop’, ‘Volume control’ and ‘Scroll track’.

3. Approaches to tackle the problems

3.1 Choice of environment – From prior work on the operation of the Kinect [1][7] and our own experience with it [10], we knew that the Kinect sensor does not give usable images for objects placed within 0.5 m of it. We therefore decided on an operating space of at least 1.5 metres so that the operator's palm can move freely. As discussed earlier, a top-down view is assumed, i.e. the Kinect is placed so that it looks at a floor or table top vertically below it. Although it should not affect the functionality of our code, a clear background free of stray items other than the marker objects is preferred.

Figure 1. Screenshot of the image when the hand goes out of bounds of the Kinect sensor.

3.2 Identification of marker objects – Marker objects are regions in the image view that demarcate the various functions. Specific objects are associated with separate buttons on the GUI. These marker objects can be pre-placed in the background or introduced dynamically into the image frame. At first we decided to place the marker objects (symbols) in a predefined order and detect their edges during pre-processing [3]. We could then use the ‘regionprops’ command in MATLAB to find the centroids of the marker objects. However, this approach did not achieve scale and rotation invariance during object detection. Another major disadvantage would have been giving up dynamic detection of the marker objects. Hence we decided to extract SIFT features [2] and match the objects against previously stored images of the object(s). To obtain more keypoints, we designed the markers with Roman letters in the ‘Algerian’ font. Because SIFT matching is rotation and scale invariant, there were few outliers when matching; in almost all cases the program showed correct correspondences. After the markers were detected, template boundaries were calculated to demarcate the functions of the objects.

Figures 2, 3, 4 and 5 show SIFT feature matching of the marker objects with the corresponding images; Figure 6 shows the respective bounding boxes.
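The report does not spell out how the SIFT descriptors were matched. A common rule, taken from Lowe's paper [2], is the nearest-neighbour ratio test: a descriptor is accepted as a match only if its closest stored descriptor is clearly closer than the second closest. The sketch below illustrates that rule in Python on toy 3-element descriptors (real SIFT descriptors have 128 elements). The function name `ratio_test_matches` and the 0.8 threshold are illustrative assumptions, not taken from the project code.

```python
import math


def ratio_test_matches(query, stored, ratio=0.8):
    """Match each query descriptor to a stored one using a nearest-neighbour ratio test.

    Returns a list of (query_index, stored_index) pairs. A pair is kept only
    when the best distance is less than `ratio` times the second-best distance.
    """
    matches = []
    for qi, q in enumerate(query):
        # Distance from this query descriptor to every stored descriptor.
        dists = sorted((math.dist(q, s), si) for si, s in enumerate(stored))
        if len(dists) < 2:
            continue  # the ratio test needs two candidates
        (best, best_idx), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches


# Toy descriptors: the first query is close to stored[0] only,
# the second is ambiguous between stored[1] and stored[2].
stored = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 0.1, 0.0)]
query = [(0.0, 0.1, 1.0), (1.0, 0.05, 0.0)]
print(ratio_test_matches(query, stored))
```

The ambiguous second descriptor is rejected, which is how the ratio test keeps the number of outliers low before bounding boxes are computed from the surviving matches.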

3.3 Background subtraction and filtering of noise – To detect the hand in the images, we initially processed the color images. In the first trial, we maintained a background image and updated it as the mean of the previous five frames [3]. Subtracting this background from the current frame gave the position of the hand. While this worked well when the hand was entering the template regions, there was considerable delay when the hand was pulled out of the frame. Changing lighting conditions also affected this method drastically: the Kinect updates its white balance at certain intervals, which alters the background data. The depth image, however, is generally unaffected by changes in background light, and depth images were found to contain considerably less noise, so we switched to the depth image for processing. The depth image was processed by taking the difference between the current frame and a reference frame captured when no hand was over the marker object templates. Noise in the resulting image was reduced further by Gaussian filtering followed by a morphological opening, which removes salt-and-pepper noise to a great extent. One amusing error came from noise caused by the Kinect's IR rays reflecting off a ring worn by one of the users. Shadows of the hand in the color image also affected the results. Such errors were reduced to some extent by the ‘imopen’ function.
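The depth-based steps described above (difference against a reference frame, then opening) can be sketched as follows. The example is written in Python rather than MATLAB and works on tiny nested lists so that it runs without any toolbox. The 3x3 structuring element, the threshold value and all function names are assumptions for illustration, and the Gaussian filtering step is omitted for brevity.

```python
def difference_mask(frame, reference, threshold):
    """Mark pixels whose depth differs from the reference by more than threshold."""
    return [[1 if abs(f - r) > threshold else 0 for f, r in zip(frow, rrow)]
            for frow, rrow in zip(frame, reference)]


def _morph(mask, keep):
    """Apply a 3x3 neighbourhood rule; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[y + dy][x + dx]
                      if 0 <= y + dy < h and 0 <= x + dx < w else 0
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = 1 if keep(window) else 0
    return out


def erode(mask):
    return _morph(mask, all)   # keep a pixel only if its whole 3x3 window is set


def dilate(mask):
    return _morph(mask, any)   # set a pixel if any pixel in its 3x3 window is set


def opening(mask):
    """Erosion followed by dilation: removes blobs smaller than the 3x3 element."""
    return dilate(erode(mask))


reference = [[1000] * 6 for _ in range(6)]   # empty table, depth in mm
frame = [row[:] for row in reference]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 800                     # a 3x3 "hand" closer to the sensor
frame[5][5] = 900                             # one isolated noisy pixel

raw = difference_mask(frame, reference, threshold=50)
clean = opening(raw)
for row in clean:
    print(row)
```

In this toy frame the isolated pixel survives the thresholding but not the opening, while the 3x3 hand region is preserved.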

3.4 Gesture Recognition – A number of methods for gesture recognition are reported in the literature:

Template matching using expectation maximization [5]

Mean-shift or simple connected components

Machine learning

Machine learning allows for easier gesture recognition and provides a robust classifier at low computational cost, which is particularly useful for real-time systems. This project uses a simple logistic regression classifier to recognize the gestures. The classifier currently recognizes three types of gesture: NO HAND, HAND TYPE 1 and HAND TYPE 2. These gestures are used to control the music player in various ways. The classifier takes as input a filtered and cropped region of interest (ROI) image obtained after background subtraction. The image is resized to reduce the number of features used for training, which lowers the risk of over-fitting the training data. The resized image is unrolled into a 1 x 1600 feature vector, each element of which is one pixel of the image.
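As a sketch of this feature extraction step: 1600 elements correspond to a 40 x 40 image, although the report gives only the vector length, so the square shape is an assumption. The resizing method is not stated either; nearest-neighbour sampling is used here purely for illustration, and the function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2D list of pixel values by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def to_feature_vector(roi, side=40):
    """Resize the ROI to side x side and unroll it row by row into one flat list."""
    small = resize_nearest(roi, side, side)
    return [pixel for row in small for pixel in row]


# An 80x120 dummy ROI whose pixel value encodes its row index.
roi = [[y] * 120 for y in range(80)]
features = to_feature_vector(roi)
print(len(features))  # 1600
```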

3.4.1 Dataset

The dataset consisted of 573 labeled hand images, randomly divided into 473 training images and 100 test images. The training set contains labeled images of each gesture under various rotations and scalings of the hand. Training yields a 3 x 1600 parameter matrix. Applied to an image, this matrix gives, for each of the three hand types described above, a probability-like score scaled to the range (-1, 1).
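A minimal sketch of how such a parameter matrix could be applied, assuming a one-vs-all arrangement in which each of the three rows scores one gesture and the highest score wins. The report does not describe the exact scoring or the (-1, 1) scaling, so the plain sigmoid below, the missing bias term and the tiny 4-element vectors (in place of 1600) are simplifying assumptions.

```python
import math

LABELS = ["NO HAND", "HAND TYPE 1", "HAND TYPE 2"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def classify(theta, features):
    """Score the feature vector against each row of theta and pick the best class.

    theta    -- one list of weights per class (3 x n in the report, n = 1600)
    features -- flat feature vector of length n
    Returns (label, scores), where scores holds one sigmoid output per class.
    """
    scores = [sigmoid(sum(w * x for w, x in zip(row, features))) for row in theta]
    best = max(range(len(scores)), key=scores.__getitem__)
    return LABELS[best], scores


# Hypothetical weights for a 4-pixel image, for illustration only.
theta = [
    [-1.0, -1.0, -1.0, -1.0],   # NO HAND: favours an empty (all-zero) ROI
    [2.0, 2.0, -2.0, -2.0],     # HAND TYPE 1: favours the first two pixels
    [-2.0, -2.0, 2.0, 2.0],     # HAND TYPE 2: favours the last two pixels
]
label, scores = classify(theta, [0.0, 0.1, 0.9, 1.0])
print(label)
```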

The test set is used to verify how well these parameters generalize. Because the number of features (1600) is much larger than the number of training images, the classifier tends to over-fit the current data. This is tolerable, since the results on the test data show an acceptable accuracy of 85%.
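The split and the accuracy figure can be reproduced in outline as below. The sizes 573, 473 and 100 come from the report; the fixed seed, the function names and the placeholder labels are assumptions added so that the sketch is repeatable.

```python
import random


def split_dataset(samples, n_test, seed=0):
    """Shuffle a copy of the samples and split off n_test items as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)


def accuracy(predicted, actual):
    """Fraction of predictions that agree with the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


samples = list(range(573))                 # stand-ins for the 573 labeled images
train, test = split_dataset(samples, n_test=100)
print(len(train), len(test))               # 473 100

# 85 correct predictions out of 100 correspond to the reported 85% accuracy.
print(accuracy([1] * 85 + [0] * 15, [1] * 100))
```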

Fig 7. The first row shows training data for hand type 0, the second and third rows show training data for hand type 1, and the final row shows hand type 2.

Figure 8. Random test data on which 96% accuracy was observed in MATLAB.

Although the classifier is robust to noise within the region of interest (ROI), it is sensitive to the size of the ROI. The linear classifier does not perform well if the selected region differs somewhat from the true region. Apart from region of interest, the classifier does not have the capability to work

3.5 Music Player development – One of the challenges we faced during this project was designing a simple yet comprehensive music player in MATLAB. We achieved this using MATLAB's GUI editor.

3.5.1 Music Player Layout

We designed a simple MATLAB player with the following basic features:

Listbox: Contains the list of songs. The songs are loaded from a predetermined folder on the system.

Text Box: Shows the name of the song currently playing. The box is cleared when a song is stopped.

Slider: Controls the volume of the player. The minimum and maximum values of the slider are 0 and 1 respectively.

5 pushbuttons: Play, Pause, Stop, Next and Previous.

Play: Starts playing a song.

Pause: Halts the song; pressing Play afterwards resumes it from the position where it was paused.

Stop: Same as Pause, except that pressing Play afterwards restarts the song from the beginning.

Next: Highlights the next song, but the song does not start playing until Play is pressed. If the end of the playlist has been reached, nothing happens.

Previous: Highlights the previous song; as with Next, the song does not start playing until Play is pressed. If the highlighted song is the first song of the playlist, nothing happens.
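The button behaviour described above amounts to a small state machine. The following Python sketch models only that logic (no audio, no GUI). The class and attribute names are hypothetical, and it assumes that Next and Previous leave the playback state and position untouched, which the report does not state.

```python
class MusicPlayerModel:
    """Logic-only model of the Play/Pause/Stop/Next/Previous buttons."""

    def __init__(self, songs):
        self.songs = list(songs)
        self.selected = 0          # index highlighted in the listbox
        self.state = "stopped"     # "stopped", "playing" or "paused"
        self.position = 0.0        # playback position in seconds
        self.now_playing = ""      # contents of the text box

    def play(self):
        if self.state == "stopped":
            self.position = 0.0    # after Stop, playback starts from the beginning
        self.state = "playing"     # after Pause, the saved position is reused
        self.now_playing = self.songs[self.selected]

    def pause(self):
        if self.state == "playing":
            self.state = "paused"  # position is kept so Play can resume

    def stop(self):
        self.state = "stopped"
        self.position = 0.0
        self.now_playing = ""      # the text box is cleared on Stop

    def next(self):
        if self.selected < len(self.songs) - 1:
            self.selected += 1     # highlight only; nothing starts playing

    def previous(self):
        if self.selected > 0:
            self.selected -= 1


player = MusicPlayerModel(["a.mp3", "b.mp3"])
player.play()
player.position = 12.5             # pretend 12.5 s have been played
player.pause()
player.play()
print(player.position)             # 12.5
player.stop()
player.play()
print(player.position)             # 0.0
```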

Figure 9. The MATLAB-based music player GUI that we developed.

4. Integration of modules

Once each module produced the desired results, the modules were integrated to achieve the final aim of the project. The interactive music player is governed by hand movements and gestures, depending on where the hand is in the current Kinect frame. The Kinect continuously captures images of the marker objects and of the hand's position relative to each of them.

In the main program, the music player is driven by the function ‘procctrl.m’. This function takes three arguments, ctrl, vol and H, explained in detail below.

CTRL: A 1x5 vector that takes one of the following values. In each case the function ‘ctrlgen.m’ produces the value of the CTRL vector from the hand type and the marker object it is over in the Kinect image.

[1 0 0 0 0] – Hand type 1 is on the Play marker object. The Play function of the music player is invoked.

[0 1 0 0 0] – Hand type 2 is on the Play marker object. The Pause function is invoked.

[0 0 1 0 0] – Any known hand type is on the Stop marker object. The Stop function is invoked.

[0 0 0 1 0] – Any known hand type is on the Volume marker object. The Volume function is invoked. Whenever it is, the argument ‘vol’ is also passed; it gives the volume value (between 0 and 1) to which the music player is set.

[0 0 0 0 1] – Hand type 2 is on the Scroll marker object. The ‘Previous’ function is invoked.

[0 0 0 0 2] – Hand type 1 is on the Scroll marker object. The ‘Next’ function is invoked.

VOL: The value (between 0 and 1) to which the volume of the music player is to be set, determined by the depth value of the hand over the Volume marker object.

H: The handle of the GUI, used internally by the program.
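The mapping from detected hand type and marker object to the CTRL vector can be summarised in a few lines. This is a Python sketch of the table above, not the actual ‘ctrlgen.m’: the function names, the marker strings and the all-zero vector for "no action" are assumptions, and the linear depth-to-volume mapping with its 600 mm to 1000 mm range is a made-up example, since the report does not give the mapping.

```python
def ctrlgen(hand_type, marker):
    """Return the 1x5 CTRL vector for a hand type (0 = no hand, 1 or 2) over a marker."""
    ctrl = [0, 0, 0, 0, 0]
    if hand_type not in (1, 2):
        return ctrl                            # no known hand: no action (assumed)
    if marker == "play":
        ctrl[0 if hand_type == 1 else 1] = 1   # type 1 -> Play, type 2 -> Pause
    elif marker == "stop":
        ctrl[2] = 1
    elif marker == "volume":
        ctrl[3] = 1
    elif marker == "scroll":
        ctrl[4] = 2 if hand_type == 1 else 1   # type 1 -> Next, type 2 -> Previous
    return ctrl


def depth_to_volume(depth_mm, near_mm=600.0, far_mm=1000.0):
    """Map hand depth linearly to a volume in [0, 1]; a nearer hand is louder (assumed)."""
    fraction = (far_mm - depth_mm) / (far_mm - near_mm)
    return min(1.0, max(0.0, fraction))


print(ctrlgen(1, "play"))      # [1, 0, 0, 0, 0]
print(ctrlgen(2, "play"))      # [0, 1, 0, 0, 0]
print(ctrlgen(1, "scroll"))    # [0, 0, 0, 0, 2]
print(depth_to_volume(800.0))  # 0.5
```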

Figure 10. Final integration of the project.

Conclusion

Through this project, we have implemented an interactive music player controlled by hand gestures in depth images taken from the XBOX Kinect. Using a machine learning algorithm made it possible to detect the hand and its gestures even in the presence of noise, with invariance to rotation and scale.

The code was run on a set of image data taken from the XBOX Kinect. A video of the execution, showing the various functions of the music player, can be found at http://youtu.be/JczfQOyJiiM.

This project can be improved further by replacing the logistic regression algorithm with better and more efficient algorithms so that more gestures can be detected reliably. A dynamic background model can also be introduced in the future.

References

[1] M. T. Draelos, "The Kinect Up Close: Modifications for Short-Range Depth Imaging," North Carolina State University, Raleigh, North Carolina, 2012.

[2] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 2004.

[3] Y. Ivanov, A. Bobick, J. Liu, "Fast Lighting Independent Background Subtraction," MIT Media Lab., 1999.

[4] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 679–698, 1986.

[5] N. P. Galatsanos, M. N. Wernick, "Impulse restoration-based template-matching using the expectation-maximization algorithm," Proceedings of the International Conference on Image Processing, 1997.

[6] http://conanchen.com/Kinetris

[7] http://openkinect.org/wiki/Talk:Main_Page

[8] http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning

[9] http://matlabbyexamples.blogspot.com/2011/03/making-matlab-media-player.html

[10] P. Routray, G. Bhutra, S. Rath, S. Mohanty, "Depth Image Processing and Operator Imitation Using a Custom Made Semi Humanoid," IOSR Journals, vol. 1, no. 1, pp. 31-35, 2012.

