Music Matrix - A Fuzzy Automated Genre Classification

Post on 20-Jun-2015


description

Music Matrix, in simple terms, processes songs to sort them by genre and displays the results in a visually appealing, easy-to-operate manner. It is to contain an NxN matrix in which each cell holds a list of songs (or possibly a single song) that belong to one variation or combination of two or more genres. Intermediary steps include Preprocessing, Feature Extraction and Classification. Preprocessing converts the continuous analog song input into a discrete, digital vector signal (a byte array) and also covers noise removal. Feature Extraction extracts features from the processed signal. Classification assigns genres to these extracted features using fuzzy logic, to increase accuracy, and passes the results to the Music Matrix to be displayed.

transcript

MUSIC MATRIX – AUTOMATED GENRE CLASSIFICATION

SAMPURN RATTAN 10503864

INTRODUCTION

AUTOMATED GENRE CLASSIFICATION

*Automated genre classification is the process by which a musical piece is associated with a genre, using machine learning and other algorithms, so that users can search, browse, and organize their music catalogues.

*In simple terms, your songs are sorted, according to genre, without any intervention or effort on your part.

AUTOMATED GENRE CLASSIFICATION

*FEATURE EXTRACTION

*CLASSIFICATION

FEATURE EXTRACTION

*Each digital audio file has some features. These are extracted for the purpose of genre identification.

*These features can be classified into three categories, namely, timbre, pitch and rhythm.

FEATURE EXTRACTION

*Timbre – the quality that distinguishes different types of sound production, such as voices and musical instruments (string, wind, and percussion).

*Pitch – the perception-based quality that allows ordering of sound on a frequency-related scale.

*Rhythm – the timing of musical sounds and silences on a human scale.

*A list of features of an audio file

FEATURE EXTRACTION

*Some formulae and procedures used to calculate features

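*Zero Crossings Rate

// Count every sign change between consecutive samples.
int count = 0;
for (int samp = 0; samp < samples.length - 1; samp++)
{
    if (samples[samp] > 0.0 && samples[samp + 1] < 0.0)
        count++;   // positive to negative
    else if (samples[samp] < 0.0 && samples[samp + 1] > 0.0)
        count++;   // negative to positive
    else if (samples[samp] == 0.0 && samples[samp + 1] != 0.0)
        count++;   // leaving an exact zero
}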

FEATURE EXTRACTION

*Beat Sum

// Add up all bins of the beat histogram.
double sum = 0.0;
for (int i = 0; i < beat_histogram.length; i++)
    sum += beat_histogram[i];
double[] result = new double[1];
result[0] = sum;
return result;

FEATURE EXTRACTION

*Strongest Frequency Via Zero Crossings

result = (zero_crossings / 2.0) * (sampling_rate / (double) samples.length)
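The zero-crossing count and the strongest-frequency formula above can be tried together on a signal whose frequency is known. The following is a minimal sketch under stated assumptions: the class name `ZeroCrossingDemo`, its method names, and the synthetic 440 Hz sine input are hypothetical and are not taken from the original slides.

```java
// Minimal sketch: class and method names are hypothetical, not from the original project.
final class ZeroCrossingDemo {

    /** Counts sign changes between consecutive samples (same rule as the loop above). */
    static int zeroCrossings(double[] samples) {
        int count = 0;
        for (int i = 0; i < samples.length - 1; i++) {
            double a = samples[i];
            double b = samples[i + 1];
            if ((a > 0.0 && b < 0.0) || (a < 0.0 && b > 0.0) || (a == 0.0 && b != 0.0)) {
                count++;
            }
        }
        return count;
    }

    /** Estimates the dominant frequency in Hz: half the crossings divided by the duration. */
    static double strongestFrequency(double[] samples, double samplingRate) {
        return (zeroCrossings(samples) / 2.0) * (samplingRate / (double) samples.length);
    }

    /** Generates a pure sine tone; used only to give the demo a known input. */
    static double[] sine(double frequencyHz, double samplingRate, int length) {
        double[] out = new double[length];
        for (int n = 0; n < length; n++) {
            // The half-sample offset keeps samples away from exact zeros.
            out[n] = Math.sin(2.0 * Math.PI * frequencyHz * (n + 0.5) / samplingRate);
        }
        return out;
    }

    public static void main(String[] args) {
        double samplingRate = 8000.0;
        double[] tone = sine(440.0, samplingRate, 8000); // one second of a 440 Hz tone
        System.out.println("zero crossings: " + zeroCrossings(tone));
        System.out.println("estimated frequency (Hz): " + strongestFrequency(tone, samplingRate));
    }
}
```

A pure tone crosses zero twice per cycle, which is why the count is halved before dividing by the duration (samples.length / sampling_rate). On real music the estimate is much rougher, since many frequencies overlap.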

CLASSIFICATION

*The extracted features are then used to identify the genre using one or more classification or clustering algorithms.

*These approaches fall into two broad groups: unsupervised and supervised.

CLASSIFICATION

*Unsupervised approaches have no knowledge about genres. The classifier can observe the position of the data in the feature space, but does not know which genre cluster the data belongs to.

*Unsupervised classifiers: K-means, Agglomerative hierarchical clustering, Self-organizing Map (SOM), Growing hierarchical Self-organizing Map (GHSOM).

CLASSIFICATION

*In supervised approaches, the system is first trained on manually labeled data; when new, unlabeled data arrives, the trained system is used to classify it into a known genre.

*Supervised classifiers: K-nearest neighbor (KNN), Gaussian Mixture Model (GMM), Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), Artificial Neural Networks (ANNs).
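To make the supervised idea concrete, here is a minimal sketch of K-nearest neighbor, the first classifier in the list above. It is not the classifier used in this project. The class name `KnnGenreSketch`, the two-dimensional feature vectors, and the genre labels are all invented for illustration, and features are assumed to be already scaled to comparable ranges.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: names and data are hypothetical, not from the original project.
final class KnnGenreSketch {

    /** Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the majority label among the k labeled songs closest to the query. */
    static String classify(double[][] features, String[] labels, double[] query, int k) {
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort training indices from nearest to farthest.
        Arrays.sort(order, Comparator.comparingDouble(i -> distance(features[i], query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (int j = 0; j < Math.min(k, order.length); j++) {
            String label = labels[order[j]];
            int v = votes.merge(label, 1, Integer::sum);
            if (v > bestVotes) { // ties go to the label that reached the count first
                bestVotes = v;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented training data: {feature 1, feature 2} per manually labeled song.
        double[][] features = {
            {0.10, 0.20}, {0.15, 0.25}, {0.20, 0.15},
            {0.80, 0.90}, {0.85, 0.80}, {0.90, 0.85}
        };
        String[] labels = {"classical", "classical", "classical", "rock", "rock", "rock"};
        System.out.println(classify(features, labels, new double[] {0.12, 0.22}, 3));
    }
}
```

The manually labeled songs play the role of the training set; a new song is assigned the genre that most of its nearest labeled neighbors share.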

CLASSIFICATION

*A fuzzy inference system is implemented.

*It is a supervised classifier.

*Rules are manually created.

*The rules are then applied to two feature sets, and the output is evaluated.

*Feature set 1 = (Zero Crossings, Beat Sum, Strongest Frequency)

*Feature set 2 = (MFCC)
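A fuzzy inference system with manually created rules can be sketched as follows. The slides do not give the actual rules or membership functions, so everything here is an illustrative assumption: the class name `FuzzyGenreSketch`, the "low"/"high" membership shapes and their thresholds, the three rules, and the use of only two features (zero crossings and beat sum), both assumed to be normalized to [0, 1].

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch: the membership shapes, thresholds, and rules below are
// illustrative assumptions, not the rules used in the original project.
final class FuzzyGenreSketch {

    /** Clamps a value into the range [0, 1]. */
    static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }

    /** Degree to which x is "low": 1 up to about 0.3, falling linearly to 0 at 0.7. */
    static double low(double x) {
        return clamp((0.7 - x) / 0.4);
    }

    /** Degree to which x is "high": 0 up to about 0.3, rising linearly to 1 at 0.7. */
    static double high(double x) {
        return clamp((x - 0.3) / 0.4);
    }

    /**
     * Applies three hand-written rules, using min() as the fuzzy AND:
     *   IF zero crossings LOW  AND beat sum LOW  THEN classical
     *   IF zero crossings LOW  AND beat sum HIGH THEN pop
     *   IF zero crossings HIGH AND beat sum HIGH THEN rock
     * Returns the firing strength of each rule as the genre's degree of membership.
     */
    static Map<String, Double> infer(double zeroCrossings, double beatSum) {
        Map<String, Double> degrees = new LinkedHashMap<>();
        degrees.put("classical", Math.min(low(zeroCrossings), low(beatSum)));
        degrees.put("pop", Math.min(low(zeroCrossings), high(beatSum)));
        degrees.put("rock", Math.min(high(zeroCrossings), high(beatSum)));
        return degrees;
    }

    public static void main(String[] args) {
        // A song with moderately low zero crossings and a moderately strong beat.
        System.out.println(infer(0.4, 0.6));
    }
}
```

Because each genre receives a degree between 0 and 1 rather than a yes/no answer, a single song can belong to several genres at once, which is what the Music Matrix needs.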

CLASSIFICATION

*Classification results

                                 Accuracy   Hits Ratio
Feature Set 1 (ZCR + BS + SF)    85.0%      65.38%
Feature Set 2 (MFCC)             72.5%      65.9%

MUSIC MATRIX

*The “front-end” of my project.

*The Music Matrix is an NxN matrix where each cell represents a list of one or more songs that are placed in one or more genres, in a fuzzy manner.

*This system clearly demonstrates multi-label songs.

MUSIC MATRIX

*For example, choosing a cell in the matrix may play a list of songs that are 60%-70% classic and 10%-15% pop.
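One possible way to organize such a matrix is sketched below. The slides do not specify how cells are indexed, so this sketch assumes that the row is the song's strongest genre and the column its second-strongest genre, with single-genre songs landing on the diagonal. The class name `MusicMatrixSketch`, its methods, and the sample songs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: the indexing scheme (row = strongest genre, column =
// second-strongest genre) is an assumption, not the original design.
final class MusicMatrixSketch {
    final String[] genres;
    final List<List<List<String>>> cells = new ArrayList<>();

    /** Builds an empty NxN grid of song lists, where N is the number of genres. */
    MusicMatrixSketch(String... genres) {
        this.genres = genres;
        for (int r = 0; r < genres.length; r++) {
            List<List<String>> row = new ArrayList<>();
            for (int c = 0; c < genres.length; c++) {
                row.add(new ArrayList<>());
            }
            cells.add(row);
        }
    }

    /** Places a song by its strongest (row) and second-strongest (column) membership. */
    void add(String title, double[] memberships) {
        int first = 0;
        for (int g = 1; g < memberships.length; g++) {
            if (memberships[g] > memberships[first]) {
                first = g;
            }
        }
        int second = first; // stays on the diagonal if no other genre has a non-zero share
        double secondValue = 0.0;
        for (int g = 0; g < memberships.length; g++) {
            if (g != first && memberships[g] > secondValue) {
                second = g;
                secondValue = memberships[g];
            }
        }
        cells.get(first).get(second).add(title);
    }

    /** Returns the songs stored in one cell. */
    List<String> songsAt(int row, int column) {
        return cells.get(row).get(column);
    }

    public static void main(String[] args) {
        MusicMatrixSketch matrix = new MusicMatrixSketch("classic", "pop", "rock");
        matrix.add("Song A", new double[] {0.65, 0.12, 0.00}); // mostly classic, some pop
        matrix.add("Song B", new double[] {0.00, 0.00, 1.00}); // purely rock
        System.out.println("classic/pop cell: " + matrix.songsAt(0, 1));
        System.out.println("rock/rock cell: " + matrix.songsAt(2, 2));
    }
}
```

The membership arrays are the kind of output a fuzzy classifier would produce; choosing a cell then amounts to reading back the list stored at that row and column.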

PROBLEMS

*Huge size of genre (and sub-genre) list.

*No agreement on taxonomies: well-known websites such as Allmusic (http://www.allmusic.com, 531 genres), Amazon (http://www.amazon.com, 719 genres), and Mp3 (http://www.mp3.com, 430 genres) each use a different genre list.

*Trivialization of music art.

*Classification Basis

PROBLEMS

*Fuzzy definition of genres

*Differences in human perception

*Scalability of any AMC system

CONCLUSIONS & FINDINGS

*Automated Genre Classification is a non-trivial task.

*Emotion and music-matching is subjective.

*The problems of genre taxonomy are carried onto Automated Genre Classification.

*Extraction of all features of an audio file is not only unnecessary, but also counterproductive.

*Different combinations of extracted features and classification algorithms yield results of differing accuracy.

*A combination of low-level signal properties such as zero-crossing rate, spectral centroid and skewness, mean energy, etc. and perception-based features such as MFCCs, beat histograms, etc. may be the most appropriate set.

*Multi-label classification is the most appropriate for the real world.

*A fuzzy classification algorithm must be used to allow for multi-label songs.

*Many novelty functions have been created but, unfortunately, they yield less accurate results.

*Practices used for Automated Genre Classification can also be used to sieve similar songs. It may help in copyright and IPR protection.

Ref: http://www.thatsongsoundslike.com/

Q & A