Music Matrix - A Fuzzy Automated Genre Classification

MUSIC MATRIX – AUTOMATED GENRE CLASSIFICATION

SAMPURN RATTAN 10503864

INTRODUCTION

AUTOMATED GENRE CLASSIFICATION

*Automated genre classification is the process of assigning a musical piece to a genre, using machine learning and related algorithms, so that users can search, browse, and organize their music catalogues.

*In simple terms, your songs are sorted, according to genre, without any intervention or effort on your part.

AUTOMATED GENRE CLASSIFICATION

*FEATURE EXTRACTION

*CLASSIFICATION

FEATURE EXTRACTION

*Each digital audio file has some features. These are extracted for the purpose of genre identification.

*These features can be classified into three categories, namely, timbre, pitch and rhythm.

FEATURE EXTRACTION

*Timbre – the quality that distinguishes different types of sound production, such as voices and string, wind, and percussion instruments.

*Pitch – the perception-based quality that allows ordering of sound on a frequency-related scale.

*Rhythm – the timing of musical sounds and silences on a human scale.

*A list of features of an audio file

FEATURE EXTRACTION

*Some formulae and procedures used to calculate features

*Zero Crossings Rate

    // Count sign changes between consecutive samples.
    int count = 0;
    for (int samp = 0; samp < samples.length - 1; samp++) {
        if (samples[samp] > 0.0 && samples[samp + 1] < 0.0)
            count++;
        else if (samples[samp] < 0.0 && samples[samp + 1] > 0.0)
            count++;
        // Moving off an exact zero is also counted as a crossing.
        else if (samples[samp] == 0.0 && samples[samp + 1] != 0.0)
            count++;
    }

FEATURE EXTRACTION

*Beat Sum

    // Add up every bin of the beat histogram into one feature value.
    double sum = 0.0;
    for (int i = 0; i < beat_histogram.length; i++)
        sum += beat_histogram[i];
    double[] result = new double[1];
    result[0] = sum;
    return result;

FEATURE EXTRACTION

*Strongest Frequency Via Zero Crossings

result = (zero_crossings / 2.0) * (sampling_rate / (double) samples.length)
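The two zero-crossing formulae above can be tried together on a synthetic tone. The following is a minimal sketch, not the project's actual code: the class name `ZeroCrossingDemo`, the helper `sine`, and the 440 Hz / 8000 Hz test values are assumptions made for illustration. A pure sine wave crosses zero twice per cycle, so halving the crossing count and dividing by the clip duration (`samples.length / sampling_rate`) should give back roughly the tone's frequency.

```java
class ZeroCrossingDemo {
    // Counts zero crossings with the same three cases as the loop in the slides.
    static int zeroCrossings(double[] samples) {
        int count = 0;
        for (int samp = 0; samp < samples.length - 1; samp++) {
            if (samples[samp] > 0.0 && samples[samp + 1] < 0.0)
                count++;
            else if (samples[samp] < 0.0 && samples[samp + 1] > 0.0)
                count++;
            else if (samples[samp] == 0.0 && samples[samp + 1] != 0.0)
                count++;
        }
        return count;
    }

    // Strongest Frequency Via Zero Crossings, as given in the slides.
    static double strongestFrequency(double[] samples, double samplingRate) {
        return (zeroCrossings(samples) / 2.0) * (samplingRate / (double) samples.length);
    }

    // Hypothetical helper: generates n samples of a pure sine tone.
    static double[] sine(double frequencyHz, double samplingRate, int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++)
            out[i] = Math.sin(2.0 * Math.PI * frequencyHz * i / samplingRate);
        return out;
    }

    public static void main(String[] args) {
        double samplingRate = 8000.0;                       // assumed test value
        double[] tone = sine(440.0, samplingRate, 8000);    // one second of a 440 Hz tone
        System.out.println("Estimated frequency (Hz): " + strongestFrequency(tone, samplingRate));
    }
}
```

On real music the estimate is much rougher than on a single tone, because several instruments and noise all add crossings.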

CLASSIFICATION

*The extracted features are then used to identify the genre using one or more classification or clustering algorithms.

*Both unsupervised and supervised approaches are used for this.

CLASSIFICATION

*Unsupervised approaches have no knowledge of genres. The classifier can observe where each data point lies in the feature space, but does not know which genre cluster the data belongs to.

*Unsupervised classifiers: K-means, Agglomerative hierarchical clustering, Self-organizing Map (SOM), Growing hierarchical Self-organizing Map (GHSOM).

CLASSIFICATION

*In supervised approaches, the system is first trained on manually labeled data; when new, unlabeled data arrives, the trained system classifies it into a known genre.

*Supervised classifiers: K-nearest neighbor (KNN), Gaussian Mixture Model (GMM), Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), Artificial Neural Networks (ANNs).
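To make the supervised idea concrete, here is a minimal K-nearest-neighbor sketch. It is an illustration only, not the classifier used in this project: the class name `KnnSketch`, the two-value feature vectors, and the genre labels in the usage note are all made-up assumptions. Each labeled song is a point in feature space, and a new song takes the majority label of its `k` closest labeled neighbors.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class KnnSketch {
    // Euclidean distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the majority label among the k training vectors closest to query.
    static String classify(double[][] train, String[] labels, double[] query, int k) {
        Integer[] order = new Integer[train.length];
        for (int i = 0; i < order.length; i++)
            order[i] = i;
        // Sort training indices by distance to the query, nearest first.
        Arrays.sort(order, (x, y) ->
                Double.compare(distance(train[x], query), distance(train[y], query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (int n = 0; n < Math.min(k, order.length); n++) {
            String label = labels[order[n]];
            int v = votes.merge(label, 1, Integer::sum);
            if (v > bestVotes) {
                bestVotes = v;
                best = label;
            }
        }
        return best;
    }
}
```

With hypothetical training vectors `{0.1, 0.2}` and `{0.15, 0.25}` labeled "classic" and `{0.8, 0.9}` and `{0.85, 0.95}` labeled "rock", calling `classify` with the query `{0.12, 0.22}` and `k = 3` picks "classic", since two of the three nearest neighbors carry that label.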

CLASSIFICATION

*A fuzzy inference system is implemented.

*It is a supervised classifier.

*Rules are manually created.

*The rules are then applied to two feature sets, and the output is evaluated.

*Feature set 1 = (Zero Crossings, Beat Sum, Strongest Frequency)

*Feature set 2 = (MFCC)
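The slides do not list the actual rules, so the following is only a sketch of how a rule-based fuzzy classifier of this kind might look. Everything specific in it is an assumption: the class name `FuzzyGenreSketch`, the genre names, the three rules, the linear "low"/"high" membership functions, and the premise that both features have been normalized to the range 0 to 1. It uses the common choice of `min` for a fuzzy AND, and returns a degree of membership for every genre rather than a single label.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FuzzyGenreSketch {
    // Degree to which a normalized feature value is "high" (clamped to [0, 1]).
    static double high(double x) {
        return Math.max(0.0, Math.min(1.0, x));
    }

    // Degree to which a normalized feature value is "low".
    static double low(double x) {
        return 1.0 - high(x);
    }

    // Applies three hypothetical rules; fuzzy AND is taken as the minimum.
    static Map<String, Double> classify(double zeroCrossings, double beatSum) {
        Map<String, Double> degrees = new LinkedHashMap<>();
        // IF zero crossings are high AND beat sum is high THEN rock.
        degrees.put("rock", Math.min(high(zeroCrossings), high(beatSum)));
        // IF zero crossings are low AND beat sum is low THEN classic.
        degrees.put("classic", Math.min(low(zeroCrossings), low(beatSum)));
        // IF zero crossings are low AND beat sum is high THEN pop.
        degrees.put("pop", Math.min(low(zeroCrossings), high(beatSum)));
        return degrees;
    }

    public static void main(String[] args) {
        // A made-up song with few zero crossings and a strong beat.
        System.out.println(classify(0.3, 0.8));
    }
}
```

Because every rule fires to some degree, one song can belong partly to several genres at once, which is what allows multi-label output.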

CLASSIFICATION

*Classification results

                                 Accuracy   Hits Ratio
Feature Set 1 (ZCR + BS + SF)    85.0%      65.38%
Feature Set 2 (MFCC)             72.5%      65.9%

MUSIC MATRIX

*The “front-end” of my project.

*The Music Matrix is an NxN matrix in which each cell represents a list of songs that are placed in one or more genres, in a fuzzy manner.

*This system clearly demonstrates multi-label songs.

MUSIC MATRIX

*For example, choosing a cell in the matrix may play a list of songs that are 60%-70% classic and 10%-15% pop.
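The slides do not say how songs are assigned to cells, so the sketch below shows just one possible reading, stated here as an assumption: N is the number of genres, the row is a song's strongest genre, and the column is its second-strongest genre. Songs whose second genre is very weak go on the diagonal. The class name `MusicMatrixSketch`, the genre list, and the `MIN_SECONDARY` threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MusicMatrixSketch {
    static final String[] GENRES = {"classic", "pop", "rock"};  // hypothetical genre list
    static final double MIN_SECONDARY = 0.10;                   // hypothetical threshold

    // Returns {row, column} for one song's genre memberships (needs at least two genres).
    static int[] cellFor(double[] memberships) {
        int first = 0;
        for (int g = 1; g < memberships.length; g++)
            if (memberships[g] > memberships[first])
                first = g;
        int second = -1;
        for (int g = 0; g < memberships.length; g++) {
            if (g == first)
                continue;
            if (second == -1 || memberships[g] > memberships[second])
                second = g;
        }
        // A negligible second genre places the song on the diagonal.
        if (memberships[second] < MIN_SECONDARY)
            second = first;
        return new int[] {first, second};
    }

    // Builds the NxN matrix; each cell holds the titles assigned to it.
    static List<List<List<String>>> build(String[] titles, double[][] memberships) {
        int n = GENRES.length;
        List<List<List<String>>> matrix = new ArrayList<>();
        for (int r = 0; r < n; r++) {
            List<List<String>> row = new ArrayList<>();
            for (int c = 0; c < n; c++)
                row.add(new ArrayList<>());
            matrix.add(row);
        }
        for (int s = 0; s < titles.length; s++) {
            int[] cell = cellFor(memberships[s]);
            matrix.get(cell[0]).get(cell[1]).add(titles[s]);
        }
        return matrix;
    }
}
```

Under these assumptions, a song with memberships of 0.65 classic, 0.12 pop, and 0.05 rock lands in the (classic, pop) cell, matching the kind of cell described in the example above.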

PROBLEMS

*Huge size of genre (and sub-genre) list.

*Non-agreement on taxonomies: well-known websites such as Allmusic (http://www.allmusic.com, 531 genres), Amazon (http://www.amazon.com, 719 genres), and Mp3 (http://www.mp3.com, 430 genres) each use a different genre list.

*Trivialization of music art.

*Classification Basis

PROBLEMS

*Fuzzy definition of genres

*Differences in human perception

*Scalability of any AMC system

CONCLUSIONS & FINDINGS

*Automated Genre Classification is a non-trivial task.

*Emotion and music-matching is subjective.

*The problems of genre taxonomy are carried onto Automated Genre Classification.

*Extraction of all features of an audio file is not only unnecessary, but also counterproductive.

*Different combinations of extracted features and various classification algorithms yield different results, of different accuracy.

*A combination of low-level signal properties such as zero-crossing rate, spectral centroid and skewness, mean energy, etc. and perception-based features such as MFCCs, beat histograms, etc. may be the most appropriate set.

*Multi-label classification is the most appropriate for real-world use.

*A fuzzy classification algorithm must be used to allow for multi-label songs.

*Many novelty functions have been created but, sadly, they return less accurate results.

*Practices used for Automated Genre Classification can also be used to sieve similar songs. It may help in copyright and IPR protection.

Ref: http://www.thatsongsoundslike.com/
