Entropy and the Estimation of Musical Ability
Andrew Russell
Submitted in partial fulfillment of the requirements for the degree of Master of Science in Music and Technology
School of Music
Carnegie Mellon University
April, 2016
Thesis Committee: Roger B. Dannenberg
Bhiksha Raj
Abstract
Musical ability determines how well a musician, or a band, can perform. From a listener’s perspective, a musician’s ability is often judged by the subjective overall impression of the music. However, others, such as teachers or talent seekers, use more specific criteria to determine a musician’s ability. Regardless of who is evaluating, judgements of musical ability are still subjective and require the judges to listen to recordings or live performances of the musicians in question. Automatic estimation techniques would greatly decrease the time required to determine a musician’s ability. Automatically estimating a musician’s ability would also be very useful for online communities of musicians to help users find other users with a similar musical talent.

In this thesis, the automatic estimation of musical ability is explored. More specifically, two features, both based on the concept of entropy, are proposed. The first feature looks at the rhythmic consistency of a recording, while the second looks at the tonal consistency. The performance and relative importance of each feature is studied by correlating the results of the feature with data that was manually labelled by 12 musicians. Using the rhythmic feature, a Pearson’s correlation coefficient of -0.55 with a p-value of 0.00052 was found, whereas the pitch feature had a coefficient of -0.32 and a p-value of 0.056.
Table of Contents
Abstract
List of Figures
List of Tables
List of Equations
List of Code Charts
1. Introduction
2. Related Work
3. Entropy
4. Rhythm
4.1. Onset Detection
4.2. Onset Diff Histogram
4.3. Parzen Smoothing
4.4. Entropy
4.5. Discussion
5. Pitch
6. Evaluation
6.1. Data Collection
6.2. Data Processing
6.3. Results
6.4. Parzen Smoothing
7. Conclusions and Future Work
7.1. Conclusions
7.2. Future Rhythm Entropy Work
7.3. Future Pitch Entropy Work
7.4. Future New Features
7.5. Source Code and Data
Acknowledgements
References
A. Appendix 1: Raw Data
List of Figures
Figure 4.1: Position of onsets in an amateur recording (song 2)
Figure 4.2: Position of onsets in a professional recording (song 38)
Figure 4.3: Onset differences over time for an amateur recording (song 2)
Figure 4.4: Onset histogram for an amateur recording (song 2)
Figure 4.5: Onset differences over time for a professional recording (song 38)
Figure 4.6: Onset histogram for a professional recording (song 38)
Figure 4.7: Smoothed histogram of an amateur recording (song 2)
Figure 4.8: Smoothed histogram of a professional recording (song 38)
Figure 5.1: Frequency spectrum for an amateur recording (song 15)
Figure 5.2: Log-log frequency spectrum for an amateur recording (song 15)
Figure 5.3: Frequency spectrum for a professional recording (song 22)
Figure 5.4: Log-log frequency spectrum for a professional recording (song 22)
Figure 6.1: Amateur and professional rhythm entropies vs. ratings
Figure 6.2: Amateur and professional pitch entropies vs. ratings
List of Tables
Table 6.1: The rating scheme used by the raters to evaluate each recording
Table 6.2: The correlation results of the rhythm and pitch features
Table 6.3: The average and standard deviation of the correlation of each human rater
Table A.1: List of songs used with their average ratings and URLs
List of Equations
(3.1)
(4.1)
(4.2)
(4.3)
(6.1)
List of Code Charts
Code 4.1: The algorithm that creates a histogram of onset timing differences
1. Introduction
Musical ability is defined as how well a musician can play their instrument. Is a musician able to play rhythmically on tempo with a band? Can that same musician sing a note in key? Musical ability is also commonly used to describe how well a musician can emotionally express themselves during a performance. However, we do not have a clear definition of ability. For example, if a listener greatly enjoys a musical performance for whatever reason, they might rate musical ability as high.

How musical ability is defined can also change drastically between genres. A musician can perform with a very high ability for one genre, but have a beginner’s level performance for a second genre. Therefore, evaluating the musical ability of a musician across genres is a difficult task. This thesis focuses on just western rock and pop music to simplify the process. We also assume that we are analyzing polyphonic recordings.

Automatically recognizing musical ability is gaining importance for online communities of musicians as a tool for discovery. Previous works have shown that the best collaborations happen when the collaborators believe that they all have a similar skill level [1]. A study by Katira et al. [2] looked at pairs of software developer students and attempted to predict their compatibility based on different attributes. They found that perceived skill level predicted a good match for all levels of developers while actual skill level predicted a good match for graduate level students.

There have recently been many communities created, such as Kompoz or Splice [3,4], for musicians to find collaborators with whom to write or record songs. These communities rely on users manually sorting through listings of thousands of collaboration partners. Assuming that ability is a good way to find good collaborators, these online communities could help users by rating their abilities and suggesting collaborators with similar ability, interests, tastes, etc.

Additionally, automatic recognition of musical ability can be used by the music industry to discover new talent. Record labels could sort music on sites such as SoundCloud by musical ability to help them find musicians that they would like to sign. Similarly, applications such as Spotify or iTunes could use this to recommend the better performed songs to their users.

In this thesis, the problem of ranking musical performances based on their rhythmic and tonal entropy is addressed. More specifically, this thesis discusses entropy, and defines how it is used. It will then suggest two new features, one based on the rhythmic entropy and the other based on the tonal entropy, which can be used to find the musical ability of the musicians in an audio recording.

The organization of the rest of this thesis is as follows. Section 2 presents the related work. In Section 3, entropy will be defined and discussed. Sections 4 and 5 define the rhythmic and tonal features used to find the musical ability. Section 6 will detail the evaluation of the new features. Finally, Section 7 will conclude the thesis and present possible future work.
2. Related Work
Measuring the musical ability of a musician is not a new concept. One branch of research has attempted to draw upon the psychology of music to test musical ability. Other research has tried to measure musical ability by having the musician play along with a specific score. However, to our knowledge, no work has attempted to automatically determine the musical ability of the musicians in an arbitrary song.

Studies of musical ability reach back to at least the early 1900s, when Mursell showed that there was not yet a satisfactory test for musical ability [5]. Over the next few decades, profiles published by Seashore [6], Gordon [7], and others [8,9] found some indication of musical ability based on various measures, such as interval and rhythm recognition. However, all of these studies are based on having the subject self-report or take a proctored test. Although the means of extracting information is different, this thesis attempts to measure a similar set of abilities.

Another area that has studied musical ability is score following. Score following attempts to map a musical performance of a song to a symbolic representation of the song. The Piano Tutor used score following to help train beginner piano students [10]. Students performed on a MIDI keyboard and the Piano Tutor would output errors such as bad timing or wrong notes. SmartMusic [11] is a similar system; however, it evaluates performances on wind and string instruments using pitch estimation techniques. Music Prodigy [12] and Yousician [13] are two other music learning systems which include the ability to evaluate multiple pitch instruments, such as guitar.

Over the last couple of decades, some video games, often called rhythm games, have let users play along with popular songs. Some of these games, such as Rock Band and Guitar Hero, use guitar-like instruments and electric drums as input. Rocksmith bridges the gap between video games and music education systems by letting musicians play along on any electric guitar, and provides both a video game experience and exercises to help train the guitarists [14]. These rhythm games award scores based on timing errors and wrong notes.

All of these systems that measure musical ability use score following or playing along with fixed media. They do not work when a score is unavailable. With western rock music, songs are often learned without scores and sometimes sections are entirely improvised. In this situation, we could attempt to transcribe the recording and determine the musical ability based on the resulting score. However, this is a very difficult problem [15,16] as it involves figuring out what instruments are contained in the piece, then source separation of the instruments involved, both of which are difficult problems on their own. Furthermore, there may be no way to detect mistakes since we have no “ground truth” that tells what the musician was intending to play.
3. Entropy
To determine the musical ability of a recording without a score, we can instead look at methods that estimate pitch and rhythm accuracy. If we assume a recording has only notes played perfectly in tune, the resulting spectrogram would contain peaks at the frequencies in that song’s scale and valleys elsewhere. This means that a well-played song, in terms of pitch, is missing the “out of tune” frequencies. Similarly, a song that has very consistent rhythms would have a consistent periodicity between the onsets. This could be determined using a beat histogram, which is an audio feature often used for determining rhythmic similarity. This would mean that a perfect song in a rhythmic sense would only have timing intervals that match the tempo and the meter of the song.

To use the frequency spectrogram and beat histogram as measures of musical ability, we require a method to quantify their consistency. To explore this idea further, we will describe the concept of entropy, which plays an important part in our work.

In general, the concept of entropy has been used by physicists to define the amount of energy lost in reactions. More specifically, the second law of thermodynamics states that the amount of entropy in any isolated system never decreases over time [17]. The use of entropy has also been adopted by other fields to describe similar concepts. For example, information theory defines entropy as the loss of data in information transmission systems [18].

Entropy has also been adapted for signal processing. Ekstein and Pavelka [19] proposed a definition where a system of maximum entropy is a system of only noise. Other parts of the system, such as speech or music, would have lower entropy. This is also defined so that the more periodic a source is, the lower the entropy will be. The entropy for a source signal is defined in Equation (3.1).
$$\mathrm{Entropy} = -\sum_{i} x_i \log x_i \tag{3.1}$$
This version of entropy has been used for signal processing in algorithms such as automatic speech recognition (ASR) [20] and in detecting vision impairments in EEGs [21]. In this thesis, we will propose a new usage of this entropy for detecting the musical ability in a recording.

If consistency is the hallmark of musicianship, one might expect that a simple measure such as standard deviation would be a good estimator. This would be true if we were looking for consistent values around a fixed mean value. However, even at a steady tempo, we expect different rhythmic values clustered around common quantities such as eighth notes and quarter notes. With pitch, we expect frequencies determined by discrete scale steps, with few frequency components in between. In both cases, we are looking for concentrated clustering around a relatively small set of means. This is a more general problem than standard deviation can address. We could consider forming clusters and estimating the standard deviation of each cluster, but entropy seems to be a simpler model that accomplishes much the same thing. Entropy is lower when values are concentrated in a few tight clusters, and higher when values are spread out randomly and independently.
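
To illustrate how the entropy in (3.1) responds to clustering, the following minimal sketch evaluates it on two made-up histograms, one concentrated in a few bins and one spread evenly. The function name and the example counts are illustrative assumptions, not part of our implementation. The counts are normalized to sum to one before the entropy is computed, and empty bins are skipped, following the usual convention that 0 log 0 = 0.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a histogram given as raw counts."""
    total = sum(counts)
    # Normalize to probabilities; skip empty bins since 0 * log(0) is taken as 0.
    probabilities = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probabilities)


# Hypothetical histograms with the same number of observations (100 each).
clustered = [0, 48, 0, 0, 50, 0, 0, 2]        # values pile up in a few bins
spread = [12, 13, 12, 13, 12, 13, 12, 13]     # values scattered over all bins

print("clustered:", shannon_entropy(clustered))
print("spread:   ", shannon_entropy(spread))
```

Under these assumptions the clustered histogram gives the smaller value, and a perfectly uniform histogram over N bins gives the maximum possible value, log N.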
4. Rhythm
One of the methods of estimating the musical ability in a song is to use rhythmic features. At a high level, the idea is to detect when the musicians are playing notes at the same time as each other and to use that information to compute their musical ability. We then correlate the extracted features with the user labelled dataset to determine the usefulness of the features.

To extract the features, we first build a beat histogram, and then compute the entropy using the following steps:
1) Detect Onsets: We find the onsets of the entire audio recording.
2) Calculate Onset Diff Histogram: We subtract the timestamp of each onset from the timestamp of its neighbor and store the results in a histogram.
3) Parzen Smoothing: We smooth the histogram using Parzen smoothing.
4) Calculate Entropy: We calculate the entropy of the smoothed histogram.
4.1. Onset Detection
To detect the onsets, we use an off-the-shelf onset detector, aubio [22], which “is a tool designed for the extraction of annotations from audio signals”. We chose aubio since both the algorithms it uses and its APIs are well documented. aubio comes packaged with a command line program called “aubioonset” which we ran on each audio file with the default settings. It outputs a list of timestamps, in seconds, for which onsets were detected.

aubioonset has eight different onset detection algorithms: energy based difference, high-frequency content, complex domain, phase based, spectral difference, Kullback-Leibler, modified Kullback-Leibler, and spectral flux [23]. We used the default onset detection function, which is high-frequency content. High-frequency content is calculated for each frame by weighting the magnitude of each frequency bin linearly by its bin index, and then summing all of the weighted bins together as shown in (4.1) [24]. aubioonset also allows other options for the buffer size, hop size, onset threshold value, and silence threshold value. We decided to use the default values of 512, 256, 0.1, and -90 dB respectively for each option.
$$\mathrm{HFC} = \sum_{i=1}^{\mathrm{len}(X)} i \cdot \mathrm{abs}(X[i]) \tag{4.1}$$
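
As a rough illustration of (4.1), the sketch below computes the high-frequency content of a single frame from a list of spectral bin values. It is not aubio's code; the function name and the example frames are assumptions made for illustration, and the bins are indexed from 1 so that the lowest bin still contributes.

```python
def high_frequency_content(spectrum_bins):
    """Sum of bin magnitudes, each weighted linearly by its 1-based bin index."""
    return sum(i * abs(x) for i, x in enumerate(spectrum_bins, start=1))


# Two hypothetical frames with the same total magnitude:
low_heavy = [2.0, 0.0, 0.0, 0.0]    # energy in the lowest bin
high_heavy = [0.0, 0.0, 0.0, 2.0]   # energy in the highest bin

print(high_frequency_content(low_heavy))
print(high_frequency_content(high_heavy))
```

Because the weights grow with the bin index, the frame with its energy in the upper bins receives the larger value, which is why sharp, broadband note attacks tend to stand out in this measure.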
To find the actual onset time from the high-frequency content, aubio looks for the peaks using a moving mean with an adaptive threshold. The peak picker tracks the mean and median of the current high-frequency content value along with the previous six values. It then computes a new value with an adaptive threshold using (4.2). The peak picker then compares the result to the values from the previous two frames and detects a peak if the middle value is the largest of the three and is greater than 0. If the volume of the audio is then louder than the silence threshold value, the peak is declared an onset.
$$\mathrm{peak} = \mathrm{HFC} - \mathrm{median} - (\mathrm{mean} \cdot \mathrm{onset\ threshold}) \tag{4.2}$$
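
The peak picking described above can be sketched as follows. This is a simplified reading of the description and of (4.2), not aubio's actual implementation: the function name is hypothetical, the window is assumed to hold the current value plus the previous six, and the silence threshold check is left out.

```python
from statistics import mean, median


def pick_peaks(hfc_values, onset_threshold=0.1, window=7):
    """Return frame indices whose thresholded HFC value is a positive local maximum."""
    thresholded = []
    peaks = []
    for t, value in enumerate(hfc_values):
        # Current value together with up to six previous values.
        recent = hfc_values[max(0, t - window + 1): t + 1]
        # Equation (4.2): subtract the moving median and the scaled moving mean.
        thresholded.append(value - median(recent) - mean(recent) * onset_threshold)
        if len(thresholded) >= 3:
            before, middle, after = thresholded[-3:]
            # The middle frame is a peak if it is the largest of the three and positive.
            if middle > before and middle > after and middle > 0:
                peaks.append(t - 1)
    return peaks


# A hypothetical HFC curve with a single burst at frame 3.
print(pick_peaks([0, 0, 0, 10, 0, 0, 0]))
```

Note that a peak can only be confirmed one frame late, since the following frame is needed for the comparison.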
Figure 4.1: Position of onsets in a performance of amateur musicians (song 2)
Figure 4.2: Position of onsets in a performance of professional musicians (song 38)
4.2. Onset Diff Histogram
Given the list of onset times, we sort the list in ascending order, and compute the difference in time between each adjacent element in the list. This gives us a list of timings between onsets. We then organize the list into a histogram to get a count for each onset timing difference. Code 4.1 presents Python-like pseudocode that could be used to compute the histogram.
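    def compute_onset_diffs_histogram(onsets):
        # Onset times must be in ascending order before differencing.
        onsets = sorted(onsets)
        # Time between each onset and the onset immediately before it.
        diffs = []
        for i in range(1, len(onsets)):
            diffs.append(onsets[i] - onsets[i - 1])
        # Count how many times each timing difference occurs.
        histogram = {}
        for diff in diffs:
            histogram[diff] = histogram.get(diff, 0) + 1
        return histogram

Code 4.1: The algorithm that creates a histogram of onset timing differences.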
We only compute the difference against the neighboring onsets and not between any other onsets since the difference between other onsets would just calculate information that is already present in the neighboring onset differences. Consider a perfectly played song that contains just quarter notes and the occasional half note. The onset diff histogram with just neighbors would have a peak at the quarter note value, and a smaller peak at the half note value. If we also included the difference between onsets two locations away, two additional peaks would appear at the values for half notes and whole notes that have the same size as the current peaks for quarter notes and half notes.
Figure 4.3: Onset differences over time for a recording of amateur musicians (song 2)
Figure 4.4: Onset histogram for a recording of amateur musicians (song 2)
Figure 4.5: Onset differences over time for a performance of professional musicians (song 38)
Figure 4.6: Onset histogram for a performance of professional musicians (song 38)
4.3. Parzen Smoothing
We then use Parzen smoothing, which is also known as kernel density estimation, to estimate the probability density function. This is done so that we correct for the statistical errors in the observation of the onsets. Parzen smoothing works by centering a copy of a chosen shape, the kernel, on each observation and summing the copies to estimate the shape of the underlying distribution [25]. We use Gaussian kernels as the starting shape and use Scott’s Rule [26] to select the bandwidth of the Gaussian kernel. A Gaussian shape was chosen for the kernel as we assume that the musicians’ mistakes are normally distributed. The bandwidth of the kernel, or in this case the width of the Gaussian curve, is a free parameter that has a strong influence on the result of the smoothing. Scott’s Rule is a rule-of-thumb bandwidth selector which attempts to optimize the bandwidth based on the length of the histogram [27]. The equation for Scott’s Rule is seen in (4.3).
$$\text{Scott's Rule: } n^{-\frac{1}{d+4}}, \quad n = \text{array length}, \; d = \text{dimensions} \tag{4.3}$$
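
The smoothing step can be sketched in plain Python as shown below. This is an illustration under stated assumptions rather than the code used for our results: the function names and the sample timing differences are hypothetical, and we assume, as is common in kernel density estimators, that the factor from (4.3) is multiplied by the sample standard deviation of the data to obtain the kernel width.

```python
import math
from statistics import stdev


def scott_factor(n, d=1):
    """Scott's Rule factor from (4.3): n ** (-1 / (d + 4))."""
    return n ** (-1.0 / (d + 4))


def gaussian_kde(samples, grid):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    n = len(samples)
    # Assumed scaling: kernel width = Scott factor * sample standard deviation.
    bandwidth = scott_factor(n) * stdev(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical onset timing differences (seconds) clustered near 0.5 s and 1.0 s.
diffs = [0.50, 0.50, 0.52, 1.00, 1.02, 0.98]
grid = [i * 0.01 - 1.0 for i in range(401)]   # -1.0 s to 3.0 s in 10 ms steps
density = gaussian_kde(diffs, grid)
```

With only a handful of observations the resulting kernel is wide relative to the spacing of the clusters, which illustrates why the bandwidth has such a strong influence on the smoothed curve.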
4.4. Entropy
We then compute the entropy of the recording. This is done by calculating the Shannon entropy of the smoothed onset timing difference histogram. We use the implementation from SciPy which calculates (3.1) [28]. SciPy uses the natural logarithm in the calculation.

Figure 4.7: Smoothed histogram of a performance of amateur musicians (song 2). The black bars of this graph represent the onset difference histogram while the green line represents the Parzen smoothed probability density function. The density function has been vertically scaled on this graph so that its overall shape is easier to see.
4.5. Discussion
As implemented, the rhythmic feature makes some assumptions. First, it depends on the detection of note onsets. This is currently an area of active research and no algorithms are perfect. This means that there will be both false positives and false negatives in our onsets, which will propagate error throughout the entire feature. Additionally, we assume that the audio recording has a consistent rhythm and a steady, unchanging tempo. A study of tempo in rock and jazz recordings showed that recordings by professional performers (even without using click tracks) have more constant tempi than recordings by amateur performers [29]. However, a rhythm or a tempo that changes will introduce more entropy, even if the musicians changed the rhythm or tempo on purpose and did so skillfully. Finally, this algorithm assumes that the musicians are all trying to play their notes exactly on the beat. However, this is not always the case, as musicians will sometimes play either ahead of or behind the beat to create a different feel in the music. Again, this could be a positive indication of musical ability rather than a negative one.

Figure 4.8: Smoothed histogram of a performance by professional musicians (song 38). The black bars of this graph represent the onset difference histogram while the green line represents the Parzen smoothed probability density function. The density function has been vertically scaled on this graph so that its overall shape is easier to see.
5. Pitch
The pitch feature to measure musical ability is based on analyzing the spectral content of the audio. At a high level, the algorithm looks for spectral smearing, which indicates that the musicians are playing out of tune with each other. This is done by first calculating the log-log frequency spectrum of the recording and then computing the entropy of the spectrum.

To extract the pitch feature, we first split the audio file into chunks of 64,000 samples, using an overlap of 50%. Each chunk has a Hamming window applied to it and then has its Fourier transform computed. Only the magnitude of each bin is stored. Each bin is summed across every chunk, and then averaged so that we end up with a single 64,000 bin frequency spectrum of the entire audio file.
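
The averaging procedure can be sketched as follows. This is a pure-Python illustration with a naive discrete Fourier transform, so it is only practical for small chunk sizes; a real implementation would use an FFT library. The function names and the test tone are assumptions made for the example, and chunks that would run past the end of the signal are simply dropped.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def dft_magnitudes(frame):
    """Magnitude of every DFT bin of a frame (naive O(n^2) transform)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def average_spectrum(samples, chunk_size=64000, overlap=0.5):
    """Average the windowed magnitude spectra of overlapping chunks."""
    hop = int(chunk_size * (1.0 - overlap))
    window = hamming(chunk_size)
    totals = [0.0] * chunk_size
    count = 0
    for start in range(0, len(samples) - chunk_size + 1, hop):
        chunk = samples[start:start + chunk_size]
        frame = [s * w for s, w in zip(chunk, window)]
        for k, magnitude in enumerate(dft_magnitudes(frame)):
            totals[k] += magnitude
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one chunk")
    return [total / count for total in totals]


# Hypothetical test tone: four cycles per 32-sample chunk, 128 samples long.
tone = [math.sin(2.0 * math.pi * 4 * t / 32) for t in range(128)]
spectrum = average_spectrum(tone, chunk_size=32)
```

For a real-valued signal the upper half of the bins mirrors the lower half, so in this example the averaged spectrum peaks at bin 4 and again at its mirror image, bin 28.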
We then transform the frequency spectrum into a log-log scale. We use decibels for the magnitude, and a log base 10 scale for the frequencies. A log scale was chosen for the frequencies so that the lower frequencies have more emphasis. This is because most melodic content is present in the lower frequencies of the audio spectrum, whereas the higher frequencies contain more harmonic content.

The entropy of the frequency spectrum is then computed. Again, SciPy’s implementation of Shannon entropy is used with a natural logarithm.

The pitch feature described above makes some assumptions about how the recording was performed. First, it assumes that there is no vibrato. Since vibrato rapidly varies the pitch of the note by small amounts, any vibrato will add more entropy. Similarly, if the overall pitch of the song drifts over time, more entropy will be detected. Also, fixed pitch instruments, such as a MIDI keyboard, will have less entropy than instruments without fixed pitch, such as violins. Finally, if the recording changes key partway through, more entropy will be present as more unique pitches will have been played.
Figure 5.1: Frequency spectrum for an amateur recording (song 15)
Figure 5.2: Log-log frequency spectrum for a performance of amateur musicians (song 15)
Figure 5.3: Frequency spectrum for a performance of professional musicians (song 22)
6. Evaluation
To evaluate our entropy-based measures, we collected music and used human subjects to rate it for musical ability.
6.1. Data Collection
The data for this project consists of two versions of twenty-one different songs. One version is performed by an amateur band and the other is performed by a professional band. The songs were taken from YouTube, and a list of the songs can be found in Appendix 1.

Each song was downloaded from YouTube and was clipped to a thirty second long file. The thirty seconds chosen for each song were the same for both the amateur and professional categories. For example, if the first chorus was chosen for the amateur version of “Song A”, the first chorus would also be chosen for the professional version of “Song A”. The songs were also all normalized in volume and converted to the mp3 format. All processing was completed using Audacity.

The songs were then rated by other members of my research group. This consisted of twelve users, and each user rated a random selection of twenty songs, ten from each of the amateur and professional categories. Each rater first listened to three sample songs which were accompanied by suggested ratings for each sample. A list of twenty songs, from the forty-two song set, was then randomly generated. Each rater was given ten random amateur songs and ten random professional songs in a random order. The rating scheme used is shown in Table 6.1.
Table 6.1: The rating scheme used by the raters to evaluate each recording.
Rating Level    Rating Description
5               Flawless performance
4               Minor mistakes that do not detract from the performance
3               Many mistakes that do not detract from the performance
2               Many mistakes that detract from the performance
1               More mistakes than playing; hard to listen to
The raters were also asked to keep in mind that the songs were to be rated on performance ability, and not on production qualities such as editing, mastering, microphone placements, etc., which can greatly affect a listener’s judgement of a song. The musical ability was defined as the rhythm and intonation of the performers, both in their solo performance, and in their performance with other band members. Each song was encoded as a low bitrate mp3 to attempt to adjust for the bias introduced by differences in overall production quality.
6.2. Data Processing
Once the data was collected, it had to be preprocessed to account for each rater’s bias. For example, if one rater was harsher (rated lower) than the rest of the raters, that rater will have their ratings adjusted higher to match the other raters’ ratings. This was done by calculating the z-score of each rating based on the standard deviation and average of its rater’s ratings, as shown in (6.1). The overall average z-score of each song was then computed.
$$z\text{-score} = \frac{x - \mu}{\sigma} \tag{6.1}$$
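
A minimal sketch of this normalization is given below. The data layout (a dictionary of raters, each mapping song names to ratings), the function name, and the example ratings are all assumptions for illustration. We also assume the population standard deviation is used; the sample standard deviation would work in the same way, and a rater who gives every song the same rating would have to be handled separately, since their standard deviation is zero.

```python
from statistics import mean, pstdev


def average_z_scores(ratings):
    """Map each song to the mean of its per-rater z-scores, as in (6.1).

    `ratings` has the assumed form {rater: {song: rating}}.
    """
    z_by_song = {}
    for rater, scores in ratings.items():
        values = list(scores.values())
        mu = mean(values)
        sigma = pstdev(values)   # assumption: population standard deviation
        for song, x in scores.items():
            z_by_song.setdefault(song, []).append((x - mu) / sigma)
    return {song: mean(zs) for song, zs in z_by_song.items()}


# Hypothetical raters who agree on the ordering but differ in harshness.
example = {
    "harsh_rater": {"song_a": 1, "song_b": 2, "song_c": 3},
    "lenient_rater": {"song_a": 3, "song_b": 4, "song_c": 5},
}
normalized = average_z_scores(example)
```

In this example both raters produce identical z-scores for each song, so the two-point offset between them disappears after normalization.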
6.3. Results
To determine how useful our rhythm and pitch features are, we found both the rhythm and pitch entropy values for every song in our dataset. A correlation value for each feature was calculated against the user labelled data. The results of the correlations can be seen in Table 6.2. To calculate the correlations, we used Pearson’s correlation coefficient, a measure of linear correlation between two normally distributed datasets, with two-tailed p-values [30]. The use of Pearson’s correlation coefficient assumes that the musical ability of all songs follows a normal distribution. More specifically, the majority of songs are assumed to have an average musical ability rating, while really well performed songs and really poorly performed songs are assumed to be much rarer.
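
For reference, Pearson’s correlation coefficient can be computed as in the sketch below. The function name and the example values are illustrative assumptions and do not come from our dataset. The sketch computes only the coefficient; the two-tailed p-value additionally requires the cumulative distribution of Student’s t and is left to a statistics library.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical entropy values and ratings: higher entropy, lower rating.
entropies = [4.1, 4.4, 4.9, 5.3]
ratings = [0.9, 0.6, -0.4, -1.1]
print(pearson_r(entropies, ratings))
```

A negative coefficient, as in this made-up example, corresponds to the relationship we expect between entropy and perceived ability.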
Table 6.2: The correlation results of the rhythm and pitch features.
                  r-value    p-value
Rhythm Entropy    -0.55      5.2e-4
Pitch Entropy     -0.32      0.056
Figure 6.1 and Figure 6.2 plot each song’s rating versus its entropy value. As expected, the professional performances tend to have higher user ratings and lower entropy values while the amateur performances tend to have lower user ratings and higher entropy values. However, it is interesting to note that not all professional performances scored well, both by the raters and by the entropy features. Additionally, some amateur performances scored well by both the raters and the entropy features.
To put our entropy correlations in perspective, we can compare the correlations of entropy to those of the human raters. Since we consider the average z-score of all raters to be the gold standard, we can compute the average z-score of all but one rater to get an almost-gold standard and then evaluate the held-out rater in the same way we evaluate entropy, using correlation. We evaluate all raters by holding out one at a time and recomputing the “almost-gold standard” each time. We then compute an overall average correlation as well as the standard deviation. This then estimates the expected correlation of any human rater, which we can then compare to the entropy correlations.
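
The hold-one-out procedure can be sketched as follows. The data layout, function names, and example z-scores are assumptions for illustration. Since each rater only hears a subset of the songs, the sketch compares the held-out rater with the remaining raters only on songs that at least one other rater also rated, and it uses the sample standard deviation for the spread of the correlations.

```python
import math
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def rater_agreement(z_scores):
    """Correlate each rater with the average of all other raters.

    `z_scores` has the assumed form {rater: {song: z_score}}.
    Returns (per-rater correlations, their mean, their standard deviation).
    """
    correlations = {}
    for held_out, own_scores in z_scores.items():
        own, almost_gold = [], []
        for song, z in own_scores.items():
            others = [
                scores[song]
                for rater, scores in z_scores.items()
                if rater != held_out and song in scores
            ]
            if others:   # skip songs no other rater heard
                own.append(z)
                almost_gold.append(mean(others))
        correlations[held_out] = pearson_r(own, almost_gold)
    values = list(correlations.values())
    return correlations, mean(values), stdev(values)


# Hypothetical z-scores from three raters who agree on the ordering of songs.
example = {
    "rater_1": {"song_a": -1.0, "song_b": 0.0, "song_c": 1.0},
    "rater_2": {"song_a": -1.0, "song_b": 0.0, "song_c": 1.0},
    "rater_3": {"song_a": -0.5, "song_b": 0.0, "song_c": 0.5},
}
per_rater, average, spread = rater_agreement(example)
```

In this toy example every rater agrees perfectly with the others, so each correlation is 1; real ratings would give a spread of values around a lower average.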
The results of this inter-user evaluation can be seen in Table 6.3. The magnitude of the rhythm entropy correlation is well within one standard deviation of the average human rater (0.05 difference in correlation), while the magnitude of the pitch entropy correlation is almost one standard deviation below the average human rater (-0.18 difference). This suggests that our rhythm entropy feature is as effective as an average human rater, while the pitch entropy feature performs worse than an average human rater.
Table 6.3: The average and standard deviation of the correlation of each human rater
Average Correlation    Standard Deviation
0.50                   0.21

Figure 6.1: Amateur and professional rhythm entropies vs. ratings
Figure 6.2: Amateur and professional pitch entropies vs. ratings
6.4. Parzen Smoothing
After conducting this study with Parzen smoothing, we also tried computing the rhythm entropy without Parzen smoothing. Intuitively, the current width of the Gaussian kernels is too wide. An example of this can be seen in Figure 4.7, where there are four distinct peaks in the underlying histogram data at 0.2 s, 0.4 s, 0.6 s, and 0.8 s. However, the Parzen smoothed curve barely distinguishes between these peaks. One might think that either reducing the width of the kernels, or removing the Parzen smoothing altogether, will improve the rhythm entropy results.

However, the results without Parzen smoothing seem to be substantially worse, with the rhythm entropy having an r-value of -0.14. This is counterintuitive and deserves further exploration. Unfortunately, there are methodological problems to this pursuit. If we were to compute the p-value as before, it would be 0.41, which would not indicate a significant correlation. However, we cannot make this conclusion because this would be an incorrect way to compute the p-value. That is because if we reuse the relatively small set of subjective ratings with different parameters (such as Parzen smoothing kernels), the probability of finding a high correlation by chance is increased. (This is a form of overfitting the data.) The question of significance changes from “is the probability of the null hypothesis low” to “is the probability of many related null hypotheses low.” Ignoring this distinction leads to greater significance than warranted by the data. The best approach would be to gather new data to evaluate new techniques. Lacking more data, we cannot evaluate different Parzen smoothing parameters, so we leave open the question of whether our approach is best.
7. Conclusions and Future Work
7.1. Conclusions
This thesis has introduced two new features, one based on rhythm and the other based on pitch, to determine the musical ability of the musicians in a recording. The features both take in an audio recording as an input and both output a scalar entropy value. The features were correlated against user-labelled data to determine how useful each feature was. The results of the evaluation show that the two features have potential to be used as indicators of
musical ability in musicians. In particular, rhythmic entropy had a significant correlation with subjective ratings (r = -0.55, p = 0.00052). Pitch entropy appears to be correlated (r = -0.32) but the correlation was not significant in this study (p = 0.056).
7.2. FutureRhythmEntropyWork
Thereismuchfutureworkthatcouldbedonetoenhancebothofthefeatures.One
suchimprovementwouldbetonotpenalizerecordingsthatcontainintentionalrhythmicor
tempochanges.Itwouldbeinterestingtolookintodetectingwhenthesechangeshappenand
thentodeterminethebestwaytocorrectforthesechanges.Thiscouldpossiblybedoneby
computingtheentropyforeachdistinctsectiononitsownandthencombingtheresulting
entropyvalues.
Another area that could use more research is detecting when a musician is playing
ahead of or behind the beat. We can speculate that a musician playing perfectly off beat will
produce a smoothed beat histogram that has consistently wider peaks, due to one onset always
being slightly delayed from its neighbor. However, more work is required to determine the best
way to correct for this behavior. Moreover, if tempo changes happened for non-musical
reasons, correcting for them might lead to false positives. Automatic evaluation of high-level musical
decisions (or accidents) seems to be very challenging.
In the rhythm feature, we treated all timing errors as absolute values, meaning that
missing a quarter note by 100 ms would penalize the musical ability by the same amount as
missing an eighth note by 100 ms. However, this is not necessarily appropriate, as both the
production and the perception of timing errors may be relative. Future work should
consider how the logarithms of the inter-onset times would perform compared with the absolute inter-onset
times.
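The difference between absolute and relative timing errors can be made concrete with a small sketch. It assumes a tempo of 120 beats per minute, so that a quarter note nominally lasts 0.5 s and an eighth note 0.25 s; the function name and the use of a known nominal duration are illustrative assumptions, since the rhythm feature itself works directly from detected onsets.

```python
import math


def timing_error(played_ioi, nominal_ioi, relative=False):
    """Timing error of one inter-onset interval (seconds).

    With relative=False the error is the absolute difference; with
    relative=True it is the absolute log-ratio, so the same 100 ms slip
    counts for more on a shorter note."""
    if relative:
        return abs(math.log(played_ioi / nominal_ioi))
    return abs(played_ioi - nominal_ioi)


if __name__ == "__main__":
    quarter, eighth = 0.5, 0.25  # nominal durations at an assumed 120 bpm
    for nominal in (quarter, eighth):
        played = nominal + 0.1   # each note is played 100 ms too long
        print(nominal,
              timing_error(played, nominal),
              timing_error(played, nominal, relative=True))
```

On the absolute scale both notes receive the same error, while on the logarithmic scale the eighth note is penalized more heavily than the quarter note.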
7.3. Future Pitch Entropy Work
There is also room for future work on the pitch feature. One such area is not penalizing
recordings for changing the key part way through a recording. Similar to tempo changes, this
could possibly be solved by detecting key changes and then computing an entropy value for
each section individually. However, more research would have to be done to determine the
best method for doing so.
Finally, more work needs to be done to allow for vibrato in the recordings.
Whereas vibrato adds smearing to the frequency spectrum, a good vibrato would have a
consistent smearing pattern in the spectrum. More work could be done to determine how to
best handle the vibrato.
7.4. Future New Features
We have also thought about other features that could be used to detect musical
ability. Tone production is a large area of musicianship that we have ignored. A note can be
played perfectly in time and on pitch but with a poor sound. However, we believe that most
instruments will require their own measure, as the nature of “good sound” varies greatly
between instruments and playing styles.
Currently, the rhythm feature conflates two areas of musicianship: playing in tempo
and playing on beat. We believe that splitting the rhythm feature into two features, one to
track how steady the tempo is and the other to track how far off beat each note is played,
will improve the results of measuring musical ability. Splitting the rhythm entropy into
these two features could also make the issues of tempo changes and playing off beat easier
to solve.
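One way to picture the proposed split is sketched below. It assumes that beat times are available from some beat tracker, which is not part of this thesis, and it uses plain summary statistics rather than entropy, so it only illustrates the separation of the two quantities and is not a proposed feature definition. The function names are hypothetical.

```python
import statistics


def tempo_steadiness(beat_times):
    """Coefficient of variation of the inter-beat intervals.
    A value of 0 means a perfectly steady tempo."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return statistics.pstdev(intervals) / statistics.fmean(intervals)


def mean_beat_offset(onsets, beat_times):
    """Mean signed offset (seconds) of each onset from its nearest beat.
    Positive values mean the notes are played late (behind the beat)."""
    offsets = [min((t - b for b in beat_times), key=abs) for t in onsets]
    return statistics.fmean(offsets)


if __name__ == "__main__":
    beats = [0.0, 0.5, 1.0, 1.5, 2.0]   # assumed beat tracker output
    onsets = [0.03, 0.53, 1.03, 1.53]   # every note is 30 ms late
    print(tempo_steadiness(beats))
    print(mean_beat_offset(onsets, beats))
```

In this example the tempo measure reports a steady tempo while the offset measure reports consistently late playing, two properties that a single combined rhythm feature cannot distinguish.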
With the possibility of more features being available to measure musical ability, the
question of how to combine these measures to get an overall measure arises. We believe
that using standard machine learning techniques, such as neural networks or support vector
machines, to combine the features would help solve this problem. However, more work
would be needed to verify this.
7.5. Source Code and Data
The software and data used for this thesis have been published online and can be found
at the following location: https://www.github.com/deadheadrussell/thesis. Documentation for
how to compile and run the code to replicate the results of this thesis is provided. All provided
code is licensed under the MIT license, the text of which can be found at the above location.
Acknowledgements
Some students are able to start their master's, ace all of their courses, and finally, defend a
brilliant thesis in just 18 months (and then go on to complete a Ph.D. thesis in just 32 months).
Others struggle to find a topic, take forever to get some results, and then let a full year pass
before finishing even the first full draft. That first student is the father of my fiancée, David
Machina. You can guess who that second one is.
When I started this degree, my parents were of course proud of me. A graph of their
emotions might have looked something like the following:
However, all good things must come to an end, and as I dragged this project along, the
graph of their emotions became the following:
However, they've stuck with me, and I am very thankful for all of the support that they
have given me. I am also grateful for the support from my brother, sister, and the rest of my
family.
I would also like to thank my fiancée, Kristen Machina, for not constantly nagging me to
finish my thesis when she had every right to. This also extends to her family, which will
soon be my family too, especially since they have a prodigy of completing theses among them.
This master's thesis would not have been the same without the countless people at
Carnegie Mellon who helped me along. First and foremost among them is my advisor, Prof.
Roger Dannenberg. His feedback and advice throughout this process were monumental for this
research. Without him, none of this would have come to fruition.
Also among the faculty, I would like to thank Prof. Bhiksha Raj for sticking around on the
thesis committee. He always offered key insights at the right times. I would also like to thank
Prof. Riccardo Schulz, Prof. Rich Stern, and Prof. Tom Sullivan for their feedback along the way.
There were also many other friends, new and old, who helped me along through this
process. These include the friends I made out in Pittsburgh, Haris Usmani, Tina Liu, Sean Brennen, Robert
Kotcher, Nick Coronado, Alan V., Greg Hannenman, and too many others to list here, as well as
the old friends with whom I stayed in touch. Their friendship and support were very important
as I slogged my way to the finish.
A. Appendix 1: Raw Data
Table A.1: List of songs used with their average ratings and URLs
Song  Mean Rating       Song URL
01    -1.20906004115    https://www.youtube.com/watch?v=is6m1P11GOo
02    0.540713146544    https://www.youtube.com/watch?v=KNV7SLksMRs
03    0.282561917948    https://www.youtube.com/watch?v=nQ1Z-kk9Sag
04    -1.37609415656    https://www.youtube.com/watch?v=8jlo-DVdEP0
05    0.118076965045    https://www.youtube.com/watch?v=U5n4gCeiz0k
06    -0.397310060726   https://www.youtube.com/watch?v=s5E3iGtyX18
07    0.552061535741    https://www.youtube.com/watch?v=MV-v9xqKhUE
08    -0.817280725133   https://www.youtube.com/watch?v=DEzWtijKXbc
09    0.10550775384     https://www.youtube.com/watch?v=NXmegM4528E
10    0.988164022952    https://www.youtube.com/watch?v=je0uwIhFsok
11    -0.865006934615   https://www.youtube.com/watch?v=gOpnEOnmg5Y
12    -0.727145514822   https://www.youtube.com/watch?v=8g7-Fk-ds2o
13    -0.549727146416   https://www.youtube.com/watch?v=78MlhPSxcvM
14    -1.13296091605    https://www.youtube.com/watch?v=sXLbBZmmT68
15    -1.33630620956    https://www.youtube.com/watch?v=FjeMDvCdrtc
16    (number not used)
17    0.785223638852    https://www.youtube.com/watch?v=QZuWSrH9T9s
18    -0.13549787198    https://www.youtube.com/watch?v=H2x7elKbLl0
19    1.26949089908     https://www.youtube.com/watch?v=ALF8C1q5VrY
20    -0.4553024383     https://www.youtube.com/watch?v=Is0Dts2tT4Y
21    -0.162334161674   https://www.youtube.com/watch?v=-SJuHRMfF2w
22    0.744275186681    https://www.youtube.com/watch?v=F3wZUXpYQls
23    0.181570576065    https://www.youtube.com/watch?v=fE3mFOwUxdk
24    -0.300802391868   https://www.youtube.com/watch?v=6QUWkFeGQ0A
25    -0.464649038018   https://www.youtube.com/watch?v=8RFTB5vgV_4
26    0.753963122766    https://www.youtube.com/watch?v=XCMrXC8D05Q
27    0.46106774938     https://www.youtube.com/watch?v=-RuEDNYQQ40
28    0.522143251701    https://www.youtube.com/watch?v=r35Ius6JPS8
29    0.0898903225722   https://www.youtube.com/watch?v=wt0qhMEg-Xk
30    1.03199524634     https://www.youtube.com/watch?v=pm3WFJOzjVc
31    0.273614895995    https://www.youtube.com/watch?v=Yvz_LpQDr0k
32    -0.257767486821   https://www.youtube.com/watch?v=NJPAjiSX7Rk
33    0.54512634795     https://www.youtube.com/watch?v=FXDlwkuInBY
34    -0.0514455517722  https://www.youtube.com/watch?v=4atn3ue-nEM
35    1.26949089908     https://www.youtube.com/watch?v=4PekdeINQco
36    (number not used)
37    (number not used)
38    0.211627425481    https://www.youtube.com/watch?v=HYPh7UCzpHc
39    1.26949089908     https://www.youtube.com/watch?v=PvE88H8vb-4
40    -0.58227507332    https://www.youtube.com/watch?v=R9WTlP08LEg
41    0.16320814922     https://www.youtube.com/watch?v=c1huRuo6Wj0
42    -0.207865123      https://www.youtube.com/watch?v=jtN8oBjMr_E