Extract from Venturini, Jacomy & Jensen, forth., “What do We See When We Look at Networks” (draft, do not redistribute)
How to read networks and make them legible
The “jazz network” test bed graph
To exemplify our method, we wanted to use a ‘standard graph’, but most test bed networks were too small for our purposes – for instance, the famous “Karate Club” of Zachary (1977) contains only 34 nodes. It is easy to observe relational structures in networks of a few dozen or a few hundred nodes, but we wanted to show that VNA can also be applied to networks with several thousand nodes. Inspiration came from another graph often discussed in the literature: the network of collaborations between jazz musicians produced by Gleiser & Danon (2003). As observed by McAndrew et al. (2014), “as a music form, jazz is inherently social” and thus particularly well suited to network analysis. Yet, the Gleiser & Danon network contains only 1,473 nodes and is limited to the jazz bands that performed between 1912 and 1940 (making it difficult to interpret for the contemporary reader). We thus decided to produce an updated and expanded “jazz network” by drawing on Wikipedia’s ontology. Here is the protocol that allowed us to obtain a graph of 6,049 nodes and 85,842 edges:
• We used Wikidata.org to extract:
  1. All the 6,796 ‘instances’ of ‘human’ and the 976 ‘instances’ of ‘band’ with ‘genre = jazz’. We thus obtained a list of individuals and bands that have a page in the English Wikipedia and that are related to jazz (mostly jazz musicians, but also jazz historians and producers). For each of them, we also collected (when available):
    o the ‘birth year’ (for individuals) and ‘inception’ date (for bands);
    o the ‘citizenship’ (for individuals) and ‘country of origin’ (for bands) – when multiple nations were available, we kept only the first one;
    o the ‘ethnic group’ and ‘gender’ for individuals.
  2. All the 53 ‘subgenres’ of the genre ‘jazz’ and all the 396 ‘record labels’ associated with the individuals and bands of the list above.
• We used the Hyphe web crawler (hyphe.medialab.sciences-po.fr; Jacomy et al., 2016; Ooghe-Tabanou et al., 2018) to visit all the pages of the elements above in the English Wikipedia and extract the hyperlinks connecting them.
• From the resulting network:
  o We removed all the edges that did not have an individual or a band as one of their vertices (for reasons that we will discuss later).
  o We kept only the largest connected component (the largest group of connected nodes and edges), obtaining a network of 6,381 nodes (5,396 individuals, 589 jazz bands, 346 record labels and 50 subgenres) and 85,826 edges.
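The last two steps of the protocol can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the script we used: the type names (`individual`, `band`, `label`, `subgenre`), the function names and the toy data are all hypothetical, and hyperlinks are read as undirected when looking for the largest connected component.

```python
from collections import deque

PRIMARY_TYPES = {"individual", "band"}  # assumed names for the two primary node types


def keep_primary_edges(node_types, edges):
    """Drop every edge that has neither an individual nor a band as an endpoint."""
    return [
        (a, b) for a, b in edges
        if node_types[a] in PRIMARY_TYPES or node_types[b] in PRIMARY_TYPES
    ]


def largest_connected_component(edges):
    """Return the node set of the largest component (edges read as undirected)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, best = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first exploration of one component
            node = queue.popleft()
            for other in neighbours[node] - component:
                component.add(other)
                queue.append(other)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy data, invented for illustration only.
node_types = {
    "Miles Davis": "individual", "John Coltrane": "individual",
    "Weather Report": "band", "Blue Note": "label", "Verve": "label",
    "bebop": "subgenre", "Isolated Band": "band", "Tiny Label": "label",
}
edges = [
    ("Miles Davis", "John Coltrane"), ("Miles Davis", "Blue Note"),
    ("Weather Report", "Miles Davis"), ("John Coltrane", "bebop"),
    ("Blue Note", "Verve"),            # label-to-label edge: removed
    ("Isolated Band", "Tiny Label"),   # kept as an edge, but in a small component
]

kept_edges = keep_primary_edges(node_types, edges)
core = largest_connected_component(kept_edges)
final_edges = [(a, b) for a, b in kept_edges if a in core and b in core]
```

In this toy case the label-to-label edge disappears in the first step and the two-node component is discarded in the second, leaving five nodes and four edges.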
Positioning nodes
In the introduction we argued that the most important visual variable of VNA is the position of the nodes. Nodes that are more directly or indirectly associated, we wrote, tend to find themselves closer in the spatialised network. The caution introduced by “tend to” is crucial, because (as we will show in section 4) there is no strict correlation between the geometric distance in the spatialised graph and the mathematical distance (however defined) in the graph matrix. In VNA, it is not the exact position of any specific node that should be considered, nor the distance between pairs of nodes, but the general grouping of nodes and the disposition of such groups. It is not the nodes’ position that counts, but the nodes’ density. In particular, what should catch the eye of the observer are empty spaces.
In a continuum that goes from a set of disconnected nodes to a fully connected clique, the structure of a network is defined by the full and empty areas created by the uneven distribution of its relations. Since force-directed layouts would represent both extremes as circles filled with nodes placed at the same distance, everything that departs from this disposition is an indicator of structure. When analysing a spatialised network, therefore, look for shapes that are not circular – which indicate polarisation – and for differences in the density of nodes – which indicate clustering.
Don’t be too quickly discouraged, however, if your network looks like an amorphous tangle (a ‘hairball’, in network jargon). The legibility of network visualisations depends crucially on the choice of the spatialisation algorithm. Though all force-directed algorithms are based on a similar system of attraction and repulsion forces, their results may differ because of the specific way in which they handle computational challenges (in particular the optimisations necessary to reduce calculations) and visual problems (in particular the balance between compactness and legibility). What can, at first, be mistaken for a homogeneous distribution of connections can, in some cases, derive from an unfortunate choice of the spatialisation algorithm or its settings.
This is why, among the many tools available for network analysis, we recommend Gephi (gephi.org; Bastian et al., 2009) and Sigma.js (sigmajs.org). Having been developed expressly for network drawing, these pieces of software do not treat spatialisation as an automated operation but offer subtle control of visual variables. Among the force-directed algorithms, our favourite is ForceAtlas2, because it offers good performance on relatively large networks while implementing attraction and repulsion in a relatively pure way (cf. Jacomy et al., 2014).
Figure 1. The ‘jazz network’ spatialised (a) with the algorithm proposed by Fruchterman & Reingold, 1991, (b) with ForceAtlas2 (with default parameters) and (c) with ForceAtlas2 with tweaked parameters for ‘LinLog mode’ and ‘gravity’
As an example, the image above shows how our network of jazz individuals and bands (for the moment, we are filtering out subgenres and record labels) looks like a hairball when spatialised with the Fruchterman and Reingold algorithm (considered the first computer implementation of a force-directed layout, see Fruchterman & Reingold, 1991), but acquires a clearer structure when visualised with ForceAtlas2, particularly when two crucial parameters are adjusted.
The ‘LinLog mode’ parameter tweaks the way in which distance is taken into consideration in the computation of attraction and repulsion forces. In default ForceAtlas2 both forces depend linearly on the distance (inversely, in the case of repulsion), but, as demonstrated by Noack (2009), a logarithmic dependence on distance makes clusters more visible. ‘Gravity’, on the other hand, is a generic force that pulls all nodes toward the centre. While it prevents disconnected nodes from drifting infinitely far from the rest of the network, such a gravitational force interferes with the purity of force-directed layouts (if too high, gravity packs all the nodes into the centre of the space). Activating the LinLog mode and setting the gravity to zero tends to make the clusters more visible, but also produces a more scattered network. As a consequence, it is impossible to suggest a ‘catch-all’ setting for these parameters. Iteratively adjusting the spatialisation parameters to the network under analysis is crucial to make the relational structures visible (just as choosing the right chart and tweaking its visual properties is essential to make sense of a large data table).
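To make the role of the two parameters more concrete, the sketch below writes out the three force magnitudes as we read them in Jacomy et al. (2014): a linear attraction that becomes log(1 + d) in LinLog mode, a degree-weighted repulsion inversely proportional to distance, and a gravity proportional to degree. It is an assumption-laden illustration, not the Gephi implementation; the function names, the `scaling` and `strength` parameters and the bisection helper are hypothetical.

```python
import math


def attraction(distance, linlog=False):
    """Attraction between two linked nodes: linear by default, log(1 + d) in LinLog mode."""
    return math.log(1.0 + distance) if linlog else distance


def repulsion(distance, degree_a, degree_b, scaling=1.0):
    """Repulsion between any two nodes: degree-weighted, inversely proportional to distance."""
    return scaling * (degree_a + 1) * (degree_b + 1) / distance


def gravity(degree, strength=1.0):
    """Pull toward the centre; strength=0 switches it off entirely."""
    return strength * (degree + 1)


def equilibrium_distance(degree_a, degree_b, linlog=False, scaling=1.0):
    """Distance at which attraction and repulsion cancel for one linked pair (bisection)."""
    low, high = 1e-9, 1e9
    for _ in range(200):
        mid = (low + high) / 2.0
        if attraction(mid, linlog) < repulsion(mid, degree_a, degree_b, scaling):
            low = mid   # net repulsion at this distance: equilibrium lies further out
        else:
            high = mid  # net attraction at this distance: equilibrium lies closer in
    return (low + high) / 2.0


# Two linked nodes of degree 1, with and without LinLog mode.
linear_gap = equilibrium_distance(1, 1, linlog=False)
linlog_gap = equilibrium_distance(1, 1, linlog=True)
```

Under these assumptions the same pair of linked nodes settles further apart in LinLog mode, because the logarithmic attraction grows more slowly with distance than the linear one. This is consistent with the more scattered layouts described above, though a full layout involves many interacting nodes and should not be read off a single pair.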
Sizing nodes and labels
Now that we have positioned the nodes of our network, in order to reveal effects of polarisation and clustering, we still have to make sense of what we see. To do so, VNA draws on two ancillary visual variables (Bertin, 1967): size and colour. Let’s consider size first.
Tools like Gephi allow the user to change the diameter of the points representing the nodes according to a variable of their choice. ‘Degree’ (the number of edges connected to a node) or, in directed networks, the ‘in-degree’ (the number of incoming edges) are classic choices, as they are a standard proxy for visibility in networks. Being entirely relational, degree can be computed for any network (and in-degree for any directed network). Yet, when available, other non-relational variables can be equally interesting. For instance, we can change the size of the elements of our network according to the number of visits that each of the related Wikipedia pages received in 2017.
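As a minimal sketch of this sizing step, the snippet below counts in-degree from a directed edge list and maps the counts linearly onto a range of diameters. The function names, the `min_size` and `max_size` bounds and the toy network are hypothetical; tools such as Gephi offer their own scaling options, which may differ from this linear mapping.

```python
def in_degree(nodes, edges):
    """Count incoming edges for every node of a directed edge list (source, target)."""
    counts = {node: 0 for node in nodes}
    for _source, target in edges:
        counts[target] += 1
    return counts


def scale_sizes(values, min_size=4.0, max_size=40.0):
    """Map raw values linearly onto a range of node diameters."""
    low, high = min(values.values()), max(values.values())
    span = (high - low) or 1  # avoid dividing by zero when all values are equal
    return {
        node: min_size + (value - low) / span * (max_size - min_size)
        for node, value in values.items()
    }


# Toy directed hyperlink network, invented for illustration.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("B", "C"), ("A", "C")]

degrees = in_degree(nodes, edges)
sizes = scale_sizes(degrees)
```

The same `scale_sizes` function can be fed any non-relational variable, such as a dictionary of page views, without touching the network itself.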
Figure 2. The ‘jazz network’ with nodes and labels sized according to (a) the in-degree of the nodes of the graph; (b) the number of page views of the related pages in the English Wikipedia.
Note that in figure 3, we have varied not only the size of the nodes, but also that of their labels (and even deleted all the labels smaller than a given threshold). This foregrounding operated through size is crucial in VNA because when working with networks with hundreds or thousands of nodes, inspecting all of them is clearly not an option. Changing label size (and dropping some labels), however, entails losing some information, and this is why using more than one scaling variable is always advisable.
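The trade-off can be illustrated with a small sketch that hides every label below a size threshold and then compares what survives under two different scaling variables. The node names, sizes and threshold are invented for the example and do not come from our data.

```python
def visible_labels(sizes, threshold):
    """Return the labels whose size reaches the threshold, largest first."""
    kept = [name for name, size in sizes.items() if size >= threshold]
    return sorted(kept, key=lambda name: sizes[name], reverse=True)


# Invented label sizes for two scaling variables; these are not measured values.
size_by_in_degree = {"node_a": 30.0, "node_b": 22.0, "node_c": 6.0, "node_d": 5.0}
size_by_page_views = {"node_a": 8.0, "node_b": 7.0, "node_c": 35.0, "node_d": 4.0}

shown_a = visible_labels(size_by_in_degree, threshold=10.0)
shown_b = visible_labels(size_by_page_views, threshold=10.0)

# Labels that no scaling variable ever brings to the foreground.
never_shown = set(size_by_in_degree) - set(shown_a) - set(shown_b)
```

Here `node_c` is invisible under the first variable and prominent under the second, which is the kind of information a single scaling variable would lose.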
Observing the labels of the most visible nodes, we can start to make sense of the factors that shape our network. Comparing the two images in figure 3, for example, it is possible to remark that the pages with high in-degree tend to be positioned on the left, while pages with high page views are rather found on the right. Also, nodes with high in-degree are all famous jazzmen (the top five being Dizzy Gillespie, Duke Ellington, Miles Davis, Benny Goodman and John Coltrane), while nodes with high page views seem to be pop-culture celebrities (the top five being George Michael, Alicia Keys, Barbra Streisand, Liza Minnelli and Bing Crosby). This suggests that a left-right polarisation may exist, corresponding to a difference between a purer jazz lineage and the contamination with other genres.
This polarisation between the left and right of the image, however, is a weak one, most importantly because the network appears to be stretched vertically much more than horizontally. To what may this vertical polarisation correspond?
Colouring nodes
To investigate the vertical polarisation of our jazz network, we will add to position and size a third visual variable – colour. According to Jacques Bertin (1967), colour can be decomposed into two different variables: brightness (or value), which is better suited to represent continuous numerical variables, and hue, which is better suited to represent categorical variables. VNA makes use of both.
Noticing at the bottom names such as Louis Armstrong, Duke Ellington and Bing Crosby and at the top Chick Corea, Weather Report and Frank Zappa, we can hypothesise that the vertical polarisation of our network is connected to time and in particular to the period in which the different actors were most active in the jazz scene. While such information is not available in our network, we do have the year of birth and of inception of individuals and bands and we can project them on the network using a scale of brightness going from black (for the oldest actors) to white (for the newest).
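A brightness scale of this kind can be sketched as a linear mapping from year to grey level. The function name, the handling of missing years and the example years are assumptions made for illustration; they are not taken from our scripts or data.

```python
def year_to_grey(year, oldest, newest):
    """Map a year onto a hex grey level: black for the oldest, white for the newest."""
    if year is None:
        return None  # missing values are left to the caller (e.g. a neutral colour)
    span = (newest - oldest) or 1
    level = round(255 * (year - oldest) / span)
    level = max(0, min(255, level))  # clamp years outside the chosen range
    return "#{0:02x}{0:02x}{0:02x}".format(level)


# Invented years, for illustration only.
years = {"early_actor": 1900, "middle_actor": 1940, "recent_actor": 2000, "unknown": None}
known = [y for y in years.values() if y is not None]
colours = {name: year_to_grey(y, min(known), max(known)) for name, y in years.items()}
```

Because the mapping is linear, a few very old or very recent outliers can compress the rest of the scale; clipping the range before mapping is one possible remedy.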
Figure 3. The ‘jazz network’ with nodes coloured according to
(a) the year of their birth or inception (from dark for older individuals and bands to white for newer);
(b) their nationality (black for US, grey for all other countries, white for not available);
(c) their ethnic groups (black for African American, grey for other ethnic groups, white for not available);
(d) their gender (black for women, grey for men, white for not available or others)
The first image of figure 4 seems to confirm our hypothesis that the vertical polarisation corresponds to time. While the separation is not complete, darker nodes are more present at the bottom of the image and brighter ones at the top.
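A visual impression of this kind can be cross-checked numerically, for instance by correlating the vertical coordinate of each node with its year. The sketch below is one possible check under stated assumptions, not part of our protocol: the `pearson` helper is written out by hand and the coordinates and years are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented (y coordinate, year) pairs; y grows toward the top of the image.
layout_y = [-80.0, -55.0, -10.0, 20.0, 60.0, 90.0]
birth_year = [1901, 1925, 1915, 1940, 1938, 1961]

r = pearson(layout_y, birth_year)
```

A coefficient close to 1 would support the reading that newer actors sit higher in the image, while a value near 0 would suggest that the apparent gradient is carried by a few conspicuous nodes. Such a check complements, but does not replace, the visual inspection.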
In the other three images in figure 4, we relied on hue (using only black, grey and white and no intermediary shades) to observe how different categories are distributed in the network. Figures 4b and 4c are dedicated respectively to nationality and ethnic group. While they are difficult to interpret alone, together they suggest an interpretation. Figure 4b reveals, unsurprisingly, that jazz is primarily an American genre of music (but remember that we relied on the English Wikipedia to build the network), but it also shows that most non-American actors (in grey) tend to be on the right of the image. Similarly, figure 4c shows that while most nodes are not qualified, the only ethnic group that stands out is African American (again not surprisingly, knowing the history of the genre). The nodes representing African American actors (in black) are everywhere in the network, but slightly more to its left than to its right. Both observations seem to confirm the interpretation we got from figure 3, that the horizontal polarisation is loosely connected to the ‘purity of the attachment to the jazz genre’.
To be sure, not all variables will turn out to be connected to the visual structures of the network. In figure 4d, for example, we show how genders are completely mixed in our network, in a way that suggests that, at least in this field, gender does not produce a relational fracture (but notice how men are significantly more numerous than women).
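Whether a categorical variable lines up with an axis of the layout can be probed with a sketch like the following, which applies the three-tone scheme (black, grey, white) and compares the average horizontal position of each tone. The function names, coordinates and categories are hypothetical and serve only to illustrate the idea.

```python
def three_tone(value, highlighted):
    """Black for the highlighted category, grey for any other, white when missing."""
    if value is None:
        return "white"
    return "black" if value == highlighted else "grey"


def mean_x_by_tone(positions, attributes, highlighted):
    """Average horizontal coordinate of the nodes of each tone."""
    groups = {}
    for node, (x, _y) in positions.items():
        tone = three_tone(attributes.get(node), highlighted)
        groups.setdefault(tone, []).append(x)
    return {tone: sum(xs) / len(xs) for tone, xs in groups.items()}


# Invented coordinates and categories, for illustration only.
positions = {"n1": (-40.0, 10.0), "n2": (-20.0, -5.0), "n3": (30.0, 0.0),
             "n4": (50.0, 25.0), "n5": (0.0, -30.0)}
country = {"n1": "US", "n2": "US", "n3": "NO", "n4": "FR", "n5": None}

means = mean_x_by_tone(positions, country, highlighted="US")
```

Clearly separated means point to a possible polarisation along that axis, whereas similar means (as one would expect for a variable that is mixed throughout the network) suggest that the variable is unrelated to it.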
Using force-directed spatialisation to determine the position of nodes, and size and colour to project various variables on our visualisation, we have identified two perpendicular axes of polarisation of our jazz network (with a main vertical axis defined by time and a secondary horizontal axis defined by ‘genre purity’). This configuration is distinctive of this network and is not to be expected in every network. Other networks can have a single axis of polarisation, more than two and sometimes none (being instead ‘stretched’ between multiple poles).
Naming clusters
So far, we have looked only at the poles of our graph, not at its clusters. We have considered the shape of the network, but not the different zones of density produced by the disposition of nodes. In VNA, clusters are defined as regions that gather many nodes closely packed together and are surrounded by areas with a much sparser density (the “structural holes” of Burt, 1995).
In the jazz network, the only easily identifiable cluster is the one located at the very top right of the image and whose most visible node is the Trondheim Jazz Orchestra (see figure 3), which contains a group of mostly Norwegian musicians, most of whom are members of the Orchestra. The other clusters of our network are more difficult to identify and make sense of. To do so, we present in this paper two advanced techniques for visual network analysis. These techniques facilitate, but do not replace, the basic operation of thoroughly examining the density and reading node labels and qualifications (when available) to make sense of why some groups of nodes are more closely connected than others.
The first technique is not available in Gephi but can be performed through another tool called Graph Recipes (tools.medialab.sciences-po.fr/graph-recipes), which is based on Sigma.js. Using a special script available (like all the scripts that we used to create the network, and the network itself) at www.tommasoventurini.it, we transformed our network into a heatmap in order to make the differences of density more salient (see figure 5).
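The principle behind such a heatmap can be sketched by counting how many nodes fall in each cell of a regular grid laid over the spatialised network. This is a deliberately crude stand-in, not the Graph Recipes script: real heatmaps typically smooth the counts, and the function name, cell size and coordinates below are assumptions.

```python
def density_grid(positions, cell_size):
    """Count nodes per square cell; returns {(column, row): count}."""
    grid = {}
    for x, y in positions:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] = grid.get(cell, 0) + 1
    return grid


# Invented coordinates: a tight group of four nodes and two stragglers.
positions = [(1.0, 1.0), (2.0, 3.0), (3.5, 2.0), (4.0, 4.5), (27.0, 12.0), (-8.0, 41.0)]

grid = density_grid(positions, cell_size=10.0)
densest_cell = max(grid, key=grid.get)
```

The choice of `cell_size` matters: cells that are too small turn every node into its own peak, while cells that are too large merge distinct clusters.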
The second technique entails qualifying the different areas of the network using ‘qualifying nodes’. This technique consists in adding to the network a new set of nodes that do not influence the spatialisation but can be used to make sense of it. In our example, we used the subgenres of the genre jazz (according to Wikidata) and the record labels associated with the artists and ensembles of our network. To make sure that these qualifying nodes do not influence the layout, we used a ‘double spatialisation’. We first spatialised the network with only the nodes of the individuals and the bands. We then froze the position of these ‘primary nodes’, added the subgenres and record labels and ran the spatialisation algorithm a second time on the qualifying nodes only. A last detail: though the Wikipedia pages related to the subgenres and record labels have hyperlinks connecting them, we have removed these edges from our network, so that the qualifying nodes are only positioned according to their connections to the primary nodes (and not according to the connections between themselves).
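The logic of the second pass can be approximated with a much simpler rule than a full force-directed run: with primary nodes frozen and no edges between qualifying nodes, a purely linear attraction (ignoring repulsion and gravity) balances at the barycentre of each qualifying node's primary neighbours. The sketch below uses that simplification; it is not the algorithm we ran, and its function name and data are hypothetical.

```python
def place_qualifying_nodes(primary_positions, qualifying_edges):
    """Place each qualifying node at the mean position of its primary neighbours.

    primary_positions: {primary_node: (x, y)}, frozen and never modified here.
    qualifying_edges: (qualifying_node, primary_node) pairs; edges between
    qualifying nodes are assumed to have been removed beforehand.
    """
    neighbours = {}
    for qualifying, primary in qualifying_edges:
        neighbours.setdefault(qualifying, []).append(primary_positions[primary])
    return {
        node: (sum(x for x, _ in points) / len(points),
               sum(y for _, y in points) / len(points))
        for node, points in neighbours.items()
    }


# Invented frozen layout of primary nodes and their links to qualifying nodes.
primary_positions = {"p1": (0.0, 0.0), "p2": (4.0, 0.0),
                     "p3": (4.0, 6.0), "p4": (20.0, 20.0)}
qualifying_edges = [("subgenre_x", "p1"), ("subgenre_x", "p2"), ("subgenre_x", "p3"),
                    ("label_y", "p3"), ("label_y", "p4")]

frozen_copy = dict(primary_positions)  # kept to show that primaries do not move
qualifying_positions = place_qualifying_nodes(primary_positions, qualifying_edges)
```

A qualifying node connected to a single tight cluster lands inside it and can serve as a label for it, while one connected to several distant regions (like `label_y` here) lands in between and is a weaker qualifier.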
After the double spatialisation, the qualifying nodes can be used to suggest labels for the clusters of the network in which they end up being located. To complete our visualisation, we worked with a jazz expert (Emiliano Neri, whom we heartily thank for his help) to drop most primary and qualifying labels and keep only the most significant.
Figure 4. The ‘jazz network’ with (a) the labels of the most salient nodes of each type (grey for individuals, green for bands, blue for subgenres and red for record labels) and (b) the identification of the structure of the network in terms of the evolution of the jazz musical language.
Interpreting the position of nodes and clusters
Now that we have decided on how to spatialise the network, how to size and colour its nodes, and how to name its clusters, we can try to make sense of both its overall structure and the position of its most important nodes. As we will argue in the next section, it is a distinctive advantage of VNA that it allows observing global patterns and local configurations in the same visual space.
In figures 6 and 7, one can observe (moving from the bottom to the top of the image) the development of jazz musical language. This evolution occupies the left of the image and starts from dixieland and swing music and progresses to bebop, hard bop, post-bop and finally to free jazz and free improvisation. From this backbone of Afro-American jazz, deviations (such as cool jazz and west coast jazz) and contaminations with other genres (such as bossa nova, latin jazz and later jazz fusion) branch off on the right of the chart.
Figure 5. Mosaic providing a zoom on the different regions of the ‘jazz network’
[7.a] The bottom of the image corresponds thus to the early years of the genre and is marked by Decca Records, a label which dominated the jazz scene in the 1930s and 1940s, and Capitol Records, also particularly active in the 1940s. The region of dixieland and swing music is split into two parallel clusters (already identified by Gleiser & Danon, 2003): to the right, the ‘white big bands’ gathered around Tommy Dorsey, Glenn Miller and Benny Goodman; and to the left, the ‘black big bands’ gathered around Louis Armstrong, Coleman Hawkins, Count Basie and Duke Ellington. This last bandleader is also at the origin of the smaller cluster to the bottom left, constituted by the members of his orchestra. Famous vocalists such as Ella Fitzgerald and Billie Holiday are positioned toward the centre because of the large number of their collaborations. More to the right is Django Reinhardt, the Romani guitarist, whose isolated position is explained by his living in Europe.
[7.b] Shifting up toward bebop, many new record labels emerge, such as Prestige, Riverside, Savoy, Atlantic and, more importantly, Verve and Columbia, which were destined to impose themselves on the jazz scene for years to come. Very close to the node representing bebop, one can find (not surprisingly) the trumpeter Dizzy Gillespie and the saxophonist Charlie Parker, who were among the most influential artists of this new style, and the vocalist Sarah Vaughan, who collaborated with both. In a more bridging position are Woody Herman and Clark Terry, whose long careers spanned swing and bebop.
[7.c] Moving upward, the increase in the number and dispersion of nodes illustrates the growing diversification in jazz language in the 1950s. On the one hand (on the left of the chart), bebop evolves into hard bop, thanks to the Blue Note record label and to musicians such as Charles Mingus, Sonny Rollins, Thelonious Monk and Art Blakey. This last bandleader is at the origin of the important ensemble of the Jazz Messengers, which creates a little cape on the left of the map and which acted as an incubator for talent, including Freddie Hubbard, McCoy Tyner and Wynton Marsalis. On the other hand (on the right of the chart), the experiences of west coast jazz and cool jazz evolve through the contamination with styles from Latin America, giving birth to bossa nova and latin jazz, popularised in the US by influential figures such as Stan Getz and Quincy Jones. John Coltrane and Miles Davis occupy the centre of this region (and of the whole graph) for the crucial role they played in bridging all these experiences.
[7.d] In the 1960s, the contaminations observed in the centre-right of the chart turn toward rock and funk music, as well as their use of electric instruments and amplifiers, giving rise to so-called jazz fusion. Musicians such as Chick Corea, Herbie Hancock, John Scofield and Pat Metheny, as well as the group Weather Report, play a crucial role in this experience. At about the same time, and with connections assured by artists such as Joe Henderson and Michael Brecker, hard bop develops into post-bop thanks to musicians such as Wayne Shorter and Elvin Jones.
[7.e] In the 1970s, experiences of radical improvisation developed in the previous decades conquered the musical avant-garde, giving birth to free jazz and free improvisation. Initiated by musicians such as Sun Ra, Cecil Taylor, Archie Shepp and Ornette Coleman, this style has been developed by Anthony Braxton, John Zorn, Evan Parker and many others. Interestingly, this genre seems to be released particularly by European record labels such as JMT and ECM. This last record label is also the bridge that connects the relatively marginal cluster of Scandinavian jazz (at the top-right of the figure) to the rest of the map.