Faculty of Science
Master of Science in Mathematics - profile: Education

Similarity on Graphs and Hypergraphs

Graduation thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Mathematics - profile: Education
Filip Moons
Promotor: Prof. Dr. P. Cara
August 2015
Acknowledgements
First of all, I would like to thank my promotor, Philippe Cara, for his incredible attention to detail, for the patience he had with me and for the many reminders to keep working hard. He also showed me the way to this beautiful topic for a Master thesis. Besides his role as promotor, I will never forget him as the professor of the Linear Algebra course in the very first term of the BSc in Mathematics, which was without any doubt the most memorable course of my college years: the hard study and the stress I felt before entering the exam room are memories I will still be recounting in my old days.
Secondly, I would like to thank Roger Van Nieuwenhuyze, my math lecturer during the teacher training I followed for one year at the Hogeschool-Universiteit Brussel. Without his encouragement, I would never have dared to go and study Mathematics at university level.
I also thank all professors and teaching assistants who taught me a lot during the past five years. A special word of thanks goes to Reen Tallon, who always managed to overcome any administrative burden caused by my combination of study programs.
Last but not least, I want to thank my parents for their angry looks whenever they felt I did not study enough; it certainly kept me on track. My partner Sacha, my classmates, my friends from university and the Classical Liberal Student Union also made my college years unforgettable for the rest of my life.
Abstract
This thesis is about similarity on graphs and hypergraphs. In the first chapter, we give an overview of all the preliminary knowledge needed to understand the following chapters. All the similarity methods we discuss eventually amount to solving an eigenvalue problem, so it is no surprise that we prove the Perron-Frobenius theorem and introduce the power method in the first chapter; the power method is a numerical algorithm to calculate the largest eigenvalue of a matrix and a corresponding eigenvector.
In Chapter 2, we give a complete overview of the research done on similarity of graphs. The notion of similarity originated from the HITS algorithm, which was developed in the late nineties to sort the results of a search engine in a meaningful way. The algorithm uses the hyperlink structure of a set of webpages to assign to each webpage in the set a score for how good it is at referring to other pages in the set (the hub score), and a score for how authoritative it is compared to the other pages in the set (the authority score). So you get two groups in the set of webpages: some pages are good hubs, others are good authorities. Obviously, these two roles are strongly related: pages that are good hubs will contain a lot of links to authoritative pages and, conversely, pages are good authorities when they have a lot of incoming links from good hubs. The algorithm was very innovative at the time, because back then search engine results were sorted based only on the number of incoming links or on the number of occurrences of the search query.
The HITS algorithm was generalized in 2004 to a notion of similarity between graphs. A set of hyperlinked webpages can indeed be seen as a graph. This generalization leads to a method for comparing two graphs by assigning a similarity score to every pair consisting of a node of the first graph and a node of the second graph. This similarity score indicates how similar the two nodes are: two nodes have a high similarity score when their adjacent nodes also have high similarity scores.
Once this node similarity is introduced on graphs, we extend the notion to a similarity method that also returns similarity scores between the edges of two graphs. We also discuss a similarity method that deals with colored graphs. With all these methods, we have a complete overview of the most recent studies on graph similarity. It is remarkable that the convergence of the algorithms resulting from all these methods can be proved with the same convergence theorem (Theorem 2.2.10). Every solution is in fact again an eigenvalue problem: every extension is a variant of the power method for calculating eigenvectors.
An important remark is that all the methods return slightly different results. At the moment, 'the' similarity score between two vertices of a graph does not exist, simply because it depends on the method you are using: in some cases the results are equal (up to a constant), in other cases the results differ because slightly different criteria are used in the calculation. In the latter case, we can explicitly derive the difference. Still, this is not a big problem: it is important to remember that similarity scores are mainly interesting when compared to other similarity scores within one graph and one method. From that point of view, every method gives the same ranking. This is also justified by the applications: all of them use similarity scores in comparison to other similarity scores.
In the last chapter, we explore a new field: we try to transfer the concept of similarity to the more general structure of hypergraphs. The chapter starts with a reflection on the results of the previous chapter: when we try to develop a method for similarity on hypergraphs, which conditions must be fulfilled? Are these conditions also met by the graph similarity methods of the previous chapter? First, we take a look at the graph representations of hypergraphs and try to obtain a good similarity method by using them. It will become clear that the incidence graph of a hypergraph returns the best results when used for similarity. After that, we also develop a method based on the incidence matrix; this method fulfills all conditions as well. We conclude the thesis by proving that both methods are equal up to a constant.
Samenvatting
This thesis discusses similarity on graphs and hypergraphs. In the first chapter we give an overview of all the prerequisite knowledge needed to follow the rest of the thesis. Almost all methods discussed in this thesis eventually come down to solving an eigenvalue problem: it is therefore no surprise that we prove the Perron-Frobenius theorem and introduce the power method to compute eigenvectors numerically.
The notion of similarity on graphs, introduced in Chapter 2, has its origin in the HITS algorithm: this algorithm was developed in the late nineties to sort the results of an internet search engine in a meaningful way. The algorithm examines the relations between the hyperlinks of a set of webpages: some webpages are labeled as good hubs, while other webpages are considered an authoritative source for the search term. Naturally, this dichotomy is strongly related: pages that are good hubs will contain many links to pages that are regarded as authoritative sources and, conversely, pages will be regarded as authoritative sources when they receive many incoming links from good hubs. This algorithm was very innovative at the time, since until then search results were sorted only by the number of incoming links (without examining the source page) or by the number of times the search term occurred on the page.
Around 2004 this method was generalized to a notion of similarity between graphs. A set of mutually linked webpages as used in the HITS algorithm can indeed be regarded as a graph. This generalization leads to a method that attaches a score to how much two vertices of two different graphs resemble each other. This 'similarity score' will be higher when the adjacent vertices of the two vertices are also similar.
Once we have introduced similarity on the vertices of graphs, we extend this method further to similarity on vertices and edges of graphs and to similarity between colored graphs. This gives us a complete overview of the most recent studies on the subject. It is striking that the convergence of the resulting algorithms of all these extensions can be proved with the same convergence theorem (Theorem 2.2.10). Every extension also comes down to solving an eigenvalue problem: in the end, every extension is a variant of the power method for computing eigenvectors.
An important remark here is that these different extensions yield slightly different results. There is no such thing as 'the' similarity score between two vertices of two graphs, simply because every extension produces a slightly different result: in some cases the results are equal up to a constant, in other cases the variants use slightly different criteria to determine similarity (e.g., the method of similarity on the vertices of graphs versus similarity on the vertices and the edges), so that the results necessarily differ. In that case we can, however, explicitly write out the difference between the methods. It is important to remember that similarity scores are mainly interesting in relation to the other scores: for example, we can easily determine which vertex of one graph most resembles a given vertex of another graph simply by looking at the largest similarity score. This will also be apparent from the short overview of applications added at the end of Chapter 2; there too, a similarity score is only considered in relation to the other similarity scores.
In the last chapter, finally, we break new ground: we try to transfer the concept of similarity to the more general structure of hypergraphs. This starts with an extensive reflection on the results of the previous chapter: which criteria should such a similarity method for hypergraphs satisfy? Do the graph methods of the previous chapter satisfy them as well? Hypergraphs can be represented by several graph representations, and we first try to obtain a good method along this route. It will turn out that the incidence graph representation of a hypergraph yields the best results. Afterwards we also develop a method based on the incidence matrix; this method, too, satisfies all the conditions. Finally, we can prove that the method using the incidence graph and the method using the incidence matrix yield the same results up to a constant.
Contents
1 Preliminaries and notations
  1.1 Some families of matrices
    1.1.1 Permutation matrices
    1.1.2 Nonnegative and primitive matrices
    1.1.3 Irreducible nonnegative matrices
  1.2 Perron-Frobenius Theorem
    1.2.1 Spectral radii of nonnegative matrices
    1.2.2 Proof
    1.2.3 Example
  1.3 Norms
    1.3.1 Vector norms
      p-norms
      Maximum norm
      Norm equivalence
    1.3.2 Matrix norms
      Frobenius norm
      p-norms
      Maximum norm
    1.3.3 Spectral radius formula
  1.4 Numerical analysis
    1.4.1 Bachmann-Landau notations
      Big O
      Small O
      Asymptotical Equality
    1.4.2 The Power Method
      Algorithm 1
      Computational cost and usage
      Example
  1.5 Graphs
    1.5.1 General definitions
      Product graphs
      Colored graphs
      Adjacency matrix
      Incidence matrix
    1.5.2 Strong connectivity
  1.6 Hypergraphs
    1.6.1 General definitions
      k-hypergraphs
      Directed hypergraphs
    1.6.2 Incidence matrix

2 Similarity on graphs
  2.1 The HITS algorithm of Kleinberg
    2.1.1 History
    2.1.2 Motivation
    2.1.3 Constructing relevant graphs of webpages
    2.1.4 Hubs and Authorities
    2.1.5 Convergence of the algorithm
    2.1.6 Examples
      Searching for math professors at the VUB
      Predictors in the Eurovision Song Contest 2009-2015
    2.1.7 Final reflection
  2.2 Node similarity
      Introduction
    2.2.1 Convergence theorem
    2.2.2 Similarity matrices
    2.2.3 Algorithm 5
      Equivalence with the power method
      Computational cost
    2.2.4 Special cases
      HITS algorithm
      Central scores
      Self-similarity of a graph
      Undirected graphs
  2.3 Node-edge similarity
    2.3.1 Coupled node-edge similarity scores
    2.3.2 Algorithm 6 and Algorithm 7
    2.3.3 Difference with node similarity
    2.3.4 Example
  2.4 Colored graphs
    2.4.1 Colored nodes
      Method
      Algorithm 8 and Algorithm 9
      Example
    2.4.2 Colored edges
      Method
      Algorithm
      Example
    2.4.3 Fully colored graphs
  2.5 Applications
    2.5.1 Synonym Extraction
    2.5.2 Graph matching

3 Similarity on hypergraphs
  3.1 Introduction
    3.1.1 Motivation
    3.1.2 What would be a good hypergraph similarity method?
  3.2 Similarity through corresponding graph representations
    3.2.1 Line-graphs
      General definitions and properties
      Algorithm for similarity
      Examples
      Interpretation
      Conclusion
    3.2.2 2-section of a hypergraph
      General definitions and properties
      Algorithm for similarity
      Examples
      Interpretation
      Conclusion
    3.2.3 Extended 2-section of a hypergraph
      General definitions and properties
      Algorithm for similarity
      Example
      Interpretation
      Conclusion
    3.2.4 The incidence graph of a hypergraph
      General definitions and properties
      Algorithm for similarity
      Examples
      Interpretation
      Conclusion
  3.3 Similarity by using the incidence matrix
    3.3.1 Compact form
    3.3.2 The algorithm
    3.3.3 Examples
    3.3.4 Concluding theorem
    3.3.5 Interpretation

Appendix A Listings

Appendix B Results of the Eurovision Song Contest 2009-2015
Chapter 1
Preliminaries and notations
In this chapter, we give an overview of all the important preliminary knowledge needed to understand the following chapters. We start by introducing different kinds of matrices, and next we prove the Perron-Frobenius theorem. The Perron-Frobenius theorem states that an irreducible real square matrix with nonnegative entries has a largest real eigenvalue which is simple (i.e. has algebraic multiplicity 1), with an eigenvector that has only positive entries. The theorem was proved by Oskar Perron (1880-1975) in 1907 for matrices with strictly positive entries and extended by Ferdinand Georg Frobenius (1849-1917) to irreducible matrices with nonnegative entries. Next, we introduce vector and matrix norms and we prove the equivalence of norms and the spectral radius formula. After that, we introduce the power method. We conclude with an introduction to graphs and hypergraphs.
1.1 Some families of matrices
In this section, we first introduce several kinds of matrices. Note that all matrices in this master thesis have real entries, unless otherwise stated. We start with permutation matrices and their uses; with permutation matrices, we can introduce irreducible matrices. Nonnegative and primitive square matrices are also presented. After defining those, we look at the Perron-Frobenius theorem.
1.1.1 Permutation matrices
Definition 1.1.1. Given a permutation π of n elements:
π : {1, . . . , n} → {1, . . . , n}, with:
\[
\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi(1) & \pi(2) & \cdots & \pi(n) \end{pmatrix},
\]
the associated permutation matrix Pπ is the n × n-matrix obtained by permuting the rows of the identity matrix In according to π. So:
\[
P_\pi = \begin{pmatrix} e_{\pi(1)}^T \\ e_{\pi(2)}^T \\ \vdots \\ e_{\pi(n)}^T \end{pmatrix},
\]
where ej is the j-th column of In, the identity matrix.
Example 1.1.2. The permutation matrix Pπ corresponding to the permutation
\[
\pi = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}
\]
is:
\[
P_\pi = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.
\]
Note that pij = 1 if and only if π(i) = j.
Property 1.1.3. A permutation matrix P satisfies:
\[
P P^T = I_n.
\]
Proof. By direct computation, we get with P = (pij):
\[
(P P^T)_{ij} = \sum_{k=1}^n p_{ik} p^T_{kj} = \sum_{k=1}^n p_{ik} p_{jk}.
\]
Assume i ≠ j. Then for each k, pik pjk = 0: since there is only one nonzero entry in the k-th column and i ≠ j, the entries pik and pjk cannot both be nonzero. So (P P^T)ij = 0 when i ≠ j.
When i = j, there exists a k′ ∈ {1, . . . , n} with pik′ pjk′ = 1; since there is only one nonzero entry in the i-th row, this k′ is unique, which results in Σ_{k=1}^n pik pjk = (P P^T)ij = 1. In other words,
\[
(P P^T)_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise}, \end{cases}
\]
which is exactly the formula for the entries of the identity matrix.
Corollary 1.1.4. The transpose of a permutation matrix P is its inverse:
\[
P^T = P^{-1}.
\]
This can also be concluded more easily from the fact that a permutation matrix is clearly an orthogonal matrix (a real n × n-matrix with orthonormal columns).
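To make this concrete, here is a small computational sketch. We assume Python with NumPy purely as an illustration language (it is not part of the listings in Appendix A); the snippet builds the permutation matrix of Example 1.1.2 and verifies Property 1.1.3 and Corollary 1.1.4:

```python
import numpy as np

# The permutation of Example 1.1.2, stored 0-based: position i holds pi(i+1) - 1.
pi = [0, 3, 1, 2]            # 1 -> 1, 2 -> 4, 3 -> 2, 4 -> 3

P = np.eye(4)[pi]            # row i of P is e_{pi(i)}^T, as in Definition 1.1.1
print(P)

# Property 1.1.3 and Corollary 1.1.4: P P^T = I, so P^T = P^{-1}.
assert np.array_equal(P @ P.T, np.eye(4))
```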
1.1.2 Nonnegative and primitive matrices
Definition 1.1.5. Let A and B be two real n×r-matrices. Then, A ≥ B (respectively A > B)if aij ≥ bij (respectively aij > bij) for all 1 ≤ i ≤ n, 1 ≤ j ≤ r.
Definition 1.1.6. A real n × r-matrix A is nonnegative if A ≥ 0, with 0 the n × r-zeromatrix.
Definition 1.1.7. A real n× r-matrix A is positive if A > 0, with 0 the n× r-zero matrix.
Since column vectors are n × 1-matrices, we shall use the terms nonnegative and positive vector throughout.
Notation 1.1.8. Let B be an arbitrary complex n × r-matrix, then |B| denotes the matrixwith entries |bij |. This is not to be confused with the determinant of a square matrix B, whichwe denote by det(B).
Definition 1.1.9. A nonnegative square matrix A is called primitive if there is a k ∈ N0such that all entries of Ak are positive.
1.1.3 Irreducible nonnegative matrices
In developing the Perron-Frobenius theory, we shall first establish a series of theorems and lemmas on nonnegative irreducible square matrices.
Definition 1.1.10. A square matrix A is called reducible if there is a permutation matrix P such that
\[
P A P^T = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix},
\]
where B and D are square matrices, each of size at least one, and 0 is a zero matrix. A square matrix A is called irreducible if it is not reducible.
It follows immediately that a 1 × 1-matrix is always irreducible by definition. We now show a useful property to identify a reducible matrix.

Property 1.1.11. Let A be an n × n-matrix with n ≥ 2. Consider a nonempty, proper subset S of {1, . . . , n} with aij = 0 for i ∈ S, j ∉ S. Then A is reducible.

Proof. Let S = {i1, i2, . . . , ik}, where we assume, without loss of generality, that i1 < i2 < · · · < ik−1 < ik. Let Sc be the complement of S, consisting of the ordered set of elements j1 < j2 < · · · < jn−k. Consider the permutation σ of {1, 2, . . . , n} given by
\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & k & k+1 & k+2 & \cdots & n \\ i_1 & i_2 & \cdots & i_k & j_1 & j_2 & \cdots & j_{n-k} \end{pmatrix}.
\]
σ can be represented by the permutation matrix Pσ = (pij), where prs = 1 if σ(r) = s. We prove that
\[
P A P^T = \begin{pmatrix} B & 0 \\ C & D \end{pmatrix},
\]
where B and D are square matrices and 0 is a k × (n − k) zero matrix. Consider row c and column d, where 1 ≤ c ≤ k and k + 1 ≤ d ≤ n:
\[
(P A P^T)_{cd} = \sum_i \sum_j p_{ci}\, a_{ij}\, p_{dj}. \tag{1.1}
\]
It is enough to show that each term in the summation is zero. Suppose pci = pdj = 1, thus σ(c) = i and σ(d) = j. Since 1 ≤ c ≤ k, we have i ∈ {i1, i2, . . . , ik}; similarly, since k + 1 ≤ d ≤ n, we have j ∈ {j1, j2, . . . , jn−k}. By assumption, for such a pair i, j, we have aij = 0. That completes the proof.
We now prove some equivalent characterizations of a nonnegative, irreducible square matrix.

Theorem 1.1.12. Let A ≥ 0 be a nonnegative n × n-matrix. Then the following conditions are equivalent:

(1) A is irreducible.

(2) (I + A)^{n−1} > 0.

(3) For any pair (i, j), with 1 ≤ i, j ≤ n, there is a positive integer k = k(i, j) ≤ n such that (A^k)_{ij} > 0.

Proof. (1) ⇒ (2): Let x ≥ 0, x ≠ 0 be an arbitrary vector in Rn. If a coordinate of x is positive, the same coordinate is positive in x + Ax = (I + A)x as well. We claim that (I + A)x has fewer zero coordinates than x, as long as x has a zero coordinate. If this claim were not true, then the number of zero coordinates would stay the same, which means that for each coordinate j with xj = 0 we would have xj + (Ax)j = 0. Let J = {j : xj > 0}. For any j ∉ J and r ∈ J, we have (Ax)j = Σ_k ajk xk = 0 and xr > 0, so it must be that ajr = 0. It then follows from Property 1.1.11 that A is reducible, which is a contradiction, and the claim is proved. Since x has at most n − 1 zero coordinates, (I + A)x has at most n − 2 zero coordinates; continuing in this manner, we conclude that (I + A)^{n−1} x > 0. Letting x = ei, the corresponding column of (I + A)^{n−1} must be positive. Thus (2) holds.
(2) ⇒ (3): We have (I + A)^{n−1} > 0 and A ≥ 0, so A ≠ 0 and
\[
A (I + A)^{n-1} = \sum_{k=1}^{n} \binom{n-1}{k-1} A^k > 0.
\]
Thus for any i, j at least one of the matrices A, A², . . . , Aⁿ has its (i, j)-th entry positive.
(3) ⇒ (1): Suppose A is reducible. Then for some permutation matrix P,
\[
P A P^T = \begin{pmatrix} B_1 & 0 \\ C_1 & D_1 \end{pmatrix},
\]
where B1 and D1 are square matrices. Furthermore, we know from Property 1.1.3 that P A P^T P A P^T = P A² P^T, hence for some square matrices B2, D2 and some matrix C2 we have:
\[
P A^2 P^T = \begin{pmatrix} B_2 & 0 \\ C_2 & D_2 \end{pmatrix}.
\]
More generally, for some matrix Ct and square matrices Bt and Dt,
\[
P A^t P^T = \begin{pmatrix} B_t & 0 \\ C_t & D_t \end{pmatrix}.
\]
Thus (P A^t P^T)_{rs} = 0 for t = 1, 2, . . . and for any r, s corresponding to an entry of the zero submatrix in P A P^T. Now, for t = 1, . . . , n:
\[
0 = (P A^t P^T)_{rs} = \sum_k \sum_l p_{rk}\, a^{(t)}_{kl}\, p_{ls},
\]
where a^{(t)}_{kl} denotes the (k, l)-th entry of A^t. By the same reasoning as in (1.1), choose k, l so that prk = pls = 1. Then a^{(t)}_{kl} = 0 for all t, contradicting the hypothesis. This completes the proof.
Corollary 1.1.13. If A is irreducible then I +A is primitive.
Corollary 1.1.14. AT is irreducible whenever A is irreducible.
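Condition (2) of Theorem 1.1.12 doubles as a direct computational test for irreducibility. A minimal sketch, again assuming Python with NumPy purely for illustration:

```python
import numpy as np

def is_irreducible(A: np.ndarray) -> bool:
    """Irreducibility test via Theorem 1.1.12(2):
    a nonnegative n x n matrix A is irreducible iff (I + A)^(n-1) > 0."""
    n = A.shape[0]
    if n == 1:
        return True                  # a 1 x 1 matrix is irreducible by definition
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool(np.all(M > 0))

print(is_irreducible(np.array([[0, 1], [1, 0]])))  # True: the 2-cycle
print(is_irreducible(np.array([[1, 0], [1, 1]])))  # False: reducible
```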
Property 1.1.15. No row or column of an irreducible matrix A can vanish. This means that A cannot have a row or a column of zeros.

Proof. Suppose that A has a zero row. By some permutation matrix P, this row can be permuted to the first row, so that
\[
P A P^T = \begin{pmatrix} 0 & \mathbf{0}^T \\ \mathbf{c} & D \end{pmatrix},
\]
with 0^T a zero row of length n − 1, c a column vector and D a square matrix. It follows from Definition 1.1.10 that A is reducible. Similarly, if A has a zero column, it can be permuted to the last column, so that
\[
P A P^T = \begin{pmatrix} B & \mathbf{0} \\ \mathbf{c}^T & 0 \end{pmatrix},
\]
and again from Definition 1.1.10 we conclude that A is reducible.
1.2 Perron-Frobenius Theorem
We now prove the Perron-Frobenius theorem. The results of this section are mainly based on the book 'Elementary Matrix Theory' by H.W. Eves [? ]. Although the proof of the Perron-Frobenius theorem is very straightforward in that work, it was sometimes necessary to extend the reasoning in some proofs; the books [? ] and [? ] were very helpful in those cases.
1.2.1 Spectral radii of nonnegative matrices
Definition 1.2.1. Let A be an n × n-matrix with real entries and eigenvalues λi, 1 ≤ i ≤ n. Then:
\[
\rho(A) = \max_{1 \le i \le n} |\lambda_i|
\]
is called the spectral radius of the matrix A.

Geometrically, if all the eigenvalues λi of A are plotted in the complex plane, then ρ(A) is the radius of the smallest disk |z| ≤ R, centered at the origin, that includes all the eigenvalues of the matrix A.
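Numerically, the spectral radius can simply be read off from the computed eigenvalues. A minimal sketch, assuming NumPy:

```python
import numpy as np

def spectral_radius(A: np.ndarray) -> float:
    """rho(A) = max |lambda_i| over the eigenvalues of A (Definition 1.2.1)."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

print(spectral_radius(np.array([[0.0, 1.0], [1.0, 0.0]])))  # 1.0 (eigenvalues +1, -1)
```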
We now establish a series of lemmas on nonnegative irreducible square matrices. These lemmas will allow us to prove the Perron-Frobenius theorem at the end of this section.
If A ≥ 0 is an irreducible n × n-matrix and x is a vector of size n with 0 ≠ x ≥ 0, let
\[
r_{\mathbf{x}} = \min_{i:\, x_i > 0} \left\{ \frac{\sum_{j=1}^n a_{ij} x_j}{x_i} \right\}, \tag{1.2}
\]
where the minimum is taken over all i for which xi > 0. Clearly, rx is a nonnegative real number and is the supremum of all p ≥ 0 for which
\[
A\mathbf{x} \ge p\,\mathbf{x}. \tag{1.3}
\]
We now consider the nonnegative quantity r defined by
\[
r = \sup_{\mathbf{x} \ge 0,\ \mathbf{x} \ne 0} \{ r_{\mathbf{x}} \}. \tag{1.4}
\]
As rx and rαx have the same value for any scalar α > 0, we need consider only the set B of vectors x ≥ 0 with ‖x‖ = 1, and we correspondingly let Q be the set of all vectors y = (I + A)^{n−1} x where x ∈ B. From Theorem 1.1.12, Q consists only of positive vectors. Multiplying both sides of the inequality Ax ≥ rx x by (I + A)^{n−1}, we obtain for the corresponding y ∈ Q:
\[
A\mathbf{y} \ge r_{\mathbf{x}}\, \mathbf{y},
\]
and we conclude from (1.3) that ry ≥ rx. Therefore, the quantity r of (1.4) can be defined equivalently as:
\[
r = \sup_{\mathbf{y} \in Q} \{ r_{\mathbf{y}} \}. \tag{1.5}
\]
As B is a compact set (in the usual topology) of vectors, so is Q, and as ry is a continuous function on Q, we know from the extreme value theorem that there necessarily exists a positive vector z for which:
\[
A\mathbf{z} \ge r\,\mathbf{z}, \tag{1.6}
\]
and no vector w ≥ 0 exists for which Aw > rw.
Definition 1.2.2. We call all nonnegative, nonzero vectors z satisfying (1.6) extremal vectors of the matrix A.

Lemma 1.2.3. If A ≥ 0 is an irreducible n × n-matrix, the quantity r of (1.4) is positive.

Proof. If x is the positive vector whose coordinates are all unity, then since the matrix A is irreducible, we know from Property 1.1.15 that no row of A can vanish, and consequently no component of Ax can vanish. Thus rx > 0, proving that r > 0.

Lemma 1.2.4. If A ≥ 0 is an irreducible n × n-matrix, each extremal vector z is a positive eigenvector of A with corresponding eigenvalue r of (1.4), i.e., Az = rz and z > 0.

Proof. Let z be an extremal vector with Az − rz = t. If t ≠ 0, then some coordinate of t is positive; multiplying through by the matrix (I + A)^{n−1}, we obtain
\[
A\mathbf{w} - r\,\mathbf{w} > 0, \quad \text{with } \mathbf{w} = (I + A)^{n-1}\mathbf{z},
\]
where we know from Theorem 1.1.12 that w > 0. It would then follow that rw > r, contradicting the definition of r in (1.5). Thus Az = rz, and since w > 0 and w = (1 + r)^{n−1} z, we have z > 0, completing the proof.
Lemma 1.2.5. Let A ≥ 0 be an irreducible n × n-matrix, and let B be a complex n × n-matrix with |B| ≤ A. If β is any eigenvalue of B, then
\[
|\beta| \le r, \tag{1.7}
\]
where r is the positive quantity of (1.4). Moreover, equality holds in (1.7), i.e., β = re^{iφ}, if and only if |B| = A and B has the form
\[
B = e^{i\varphi} D A D^{-1}, \tag{1.8}
\]
where D is a diagonal matrix whose diagonal entries have modulus unity.
Proof. If βy = By where y ≠ 0, then
\[
\beta y_i = \sum_{j=1}^n b_{ij} y_j, \quad \text{with } 1 \le i \le n.
\]
Using the hypotheses of the lemma and Notation 1.1.8, it follows that:
\[
|\beta|\,|\mathbf{y}| \le |B|\,|\mathbf{y}| \le A\,|\mathbf{y}|, \tag{1.9}
\]
which implies that |β| ≤ r_{|y|} ≤ r, proving (1.7). If |β| = r, then |y| is an extremal vector of A. Therefore, from Lemma 1.2.4, |y| is a positive eigenvector of A corresponding to the positive eigenvalue r. Thus,
\[
r\,|\mathbf{y}| = |B|\,|\mathbf{y}| = A\,|\mathbf{y}|, \tag{1.10}
\]
and since |y| > 0, we conclude from (1.10) and the hypothesis |B| ≤ A that
\[
|B| = A. \tag{1.11}
\]
For the vector y, where |y| > 0, let
\[
D = \operatorname{diag}\left\{ \frac{y_1}{|y_1|}, \ldots, \frac{y_n}{|y_n|} \right\}.
\]
It is clear that the diagonal entries of D have modulus unity, and
\[
\mathbf{y} = D\,|\mathbf{y}|. \tag{1.12}
\]
Setting β = re^{iφ}, the equality By = βy can be written as:
\[
C\,|\mathbf{y}| = r\,|\mathbf{y}|, \tag{1.13}
\]
where
\[
C = e^{-i\varphi} D^{-1} B D. \tag{1.14}
\]
From (1.10) and (1.13), equating the terms equal to r|y|, we have
\[
C\,|\mathbf{y}| = |B|\,|\mathbf{y}| = A\,|\mathbf{y}|. \tag{1.15}
\]
From the definition of the matrix C in (1.14), |C| = |B|. Combining this with (1.11), we have:
\[
|C| = |B| = A. \tag{1.16}
\]
Thus, from (1.15) we conclude that C|y| = |C||y|, and as |y| > 0, it follows that C = |C| and thus C = A from (1.16). Combining this result with (1.14) gives the desired result that B = e^{iφ} D A D^{-1}. Conversely, it is obvious that if B has the form in (1.8), then |B| = A, and B has an eigenvalue β with |β| = r, which completes the proof.
Corollary 1.2.6. If A ≥ 0 is an irreducible n × n-matrix, then the positive eigenvalue r of Lemma 1.2.4 equals the spectral radius ρ(A) of A.

Proof. Setting B = A in Lemma 1.2.5 immediately gives this result.

In other words, if A ≥ 0 is an irreducible n × n-matrix, its spectral radius ρ(A) is positive, and the intersection in the complex plane of the circle |z| = ρ(A) with the positive real axis is an eigenvalue of A.

Definition 1.2.7. A principal square submatrix of an n × n-matrix A is any matrix obtained by crossing out any j rows and the corresponding j columns of A, with 1 ≤ j ≤ n − 1.

Lemma 1.2.8. If A ≥ 0 is an irreducible n × n-matrix, and B is any principal square submatrix of A, then ρ(B) < ρ(A).
Proof. If B is any principal submatrix of A, then there is an n × n-permutation matrix P such that B = A11, where
\[
C = \begin{pmatrix} A_{11} & 0 \\ 0 & 0 \end{pmatrix}, \qquad P A P^T = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}. \tag{1.17}
\]
Here, A11 and A22 are, respectively, m × m and (n − m) × (n − m) principal square submatrices of P A P^T, with 1 ≤ m < n. Clearly, 0 ≤ C ≤ P A P^T, and ρ(C) = ρ(B) = ρ(A11), but as C = |C| ≠ P A P^T, the conclusion follows immediately from Lemma 1.2.5 and Corollary 1.2.6.
The following lemma is used to prove that ρ(A) is a simple eigenvalue of A in the Perron-Frobenius theorem. The proof uses the extension of the product rule of differentiation to multilinear functions M(a1, . . . , ak): if x1, . . . , xk are differentiable vector functions, then M(x1, . . . , xk) is differentiable and
\[
\frac{d}{dt} M(x_1, \ldots, x_k) = M\Big(\frac{dx_1}{dt}, x_2, \ldots, x_k\Big) + M\Big(x_1, \frac{dx_2}{dt}, \ldots, x_k\Big) + \cdots + M\Big(x_1, x_2, \ldots, \frac{dx_k}{dt}\Big).
\]
The most important application of this rule is the derivative of the determinant:
\[
\frac{d}{dt} \det(x_1, \ldots, x_k) = \det\Big(\frac{dx_1}{dt}, x_2, \ldots, x_k\Big) + \det\Big(x_1, \frac{dx_2}{dt}, \ldots, x_k\Big) + \cdots + \det\Big(x_1, x_2, \ldots, \frac{dx_k}{dt}\Big).
\]
Lemma 1.2.9. Let A be an n × n-matrix over the complex numbers and let φ(A, λ) = det(λIn − A) be the characteristic polynomial of A. Let Bi be the principal submatrix of A formed by deleting the i-th row and column of A and let φ(Bi, λ) be the characteristic polynomial of Bi. Then:
\[
\varphi'(A, \lambda) = \frac{d\varphi(A, \lambda)}{d\lambda} = \sum_i \varphi(B_i, \lambda).
\]
Proof. The proof is done by direct computation:
\[
\varphi(A, \lambda) = \det \begin{pmatrix}
\lambda - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
-a_{2,1} & \lambda - a_{2,2} & \cdots & -a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n,1} & -a_{n,2} & \cdots & \lambda - a_{n,n}
\end{pmatrix}.
\]
Using the extension of the product rule for multilinear functions (applied to the columns of λIn − A, each of which has derivative equal to a standard basis vector),
\[
\varphi'(A, \lambda) = \det \begin{pmatrix}
1 & 0 & \cdots & 0 \\
-a_{2,1} & \lambda - a_{2,2} & \cdots & -a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n,1} & -a_{n,2} & \cdots & \lambda - a_{n,n}
\end{pmatrix}
+ \det \begin{pmatrix}
\lambda - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n,1} & -a_{n,2} & \cdots & \lambda - a_{n,n}
\end{pmatrix}
+ \cdots
+ \det \begin{pmatrix}
\lambda - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
-a_{2,1} & \lambda - a_{2,2} & \cdots & -a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
= \sum_i \varphi(B_i, \lambda).
\]
1.2.2 Proof
We now collect the above results into the main theorem of this section: we have finally arrived at the Perron-Frobenius theorem.
Theorem 1.2.10. (Perron-Frobenius theorem) Let A ≥ 0 be an irreducible n × n-matrix. Then:
1. A has a positive real eigenvalue equal to its spectral radius.
2. To ρ(A) there corresponds an eigenvector x > 0.
3. ρ(A) increases when any entry of A increases.
4. ρ(A) is a simple eigenvalue of A.
5. If Ax = ρ(A)x where x > 0 and x is a normalized vector, then x is unique.
Proof. (1) and (2) follow immediately from Lemma 1.2.4 and Corollary 1.2.6.
(3) Suppose we increase some entry of the matrix A, giving us a new irreducible matrix Ã with Ã ≥ A and Ã ≠ A. Applying Lemma 1.2.5 (with A and Ã in the roles of B and A), we conclude that ρ(Ã) > ρ(A).
(4) To show that ρ(A) is a simple eigenvalue of A, i.e., that ρ(A) is a zero of multiplicity one of the characteristic polynomial φ(λ) = det(λIn − A), we make use of Lemma 1.2.9: φ′(λ) is the sum of the determinants of the principal (n − 1) × (n − 1) submatrices of λI − A. If Ai is any principal submatrix of A, then from Lemma 1.2.8, det(λI − Ai) (with I the identity matrix of the same size as the principal submatrix Ai) cannot vanish for any λ ≥ ρ(A). From this it follows that
\[
\det(\rho(A) I - A_i) > 0,
\]
and thus
\[
\varphi'(\rho(A)) > 0.
\]
Consequently, ρ(A) cannot be a zero of φ(λ) of multiplicity greater than one, and thus ρ(A) is a simple eigenvalue of A.
(5) Suppose Ax = ρ(A)x where x > 0 and ‖x‖ = 1 (‖ · ‖ denotes the standard Euclidean norm). Since ρ(A) is a simple eigenvalue by (4), every eigenvector y with Ay = ρ(A)y is a scalar multiple y = sx; requiring y > 0 and ‖y‖ = 1 forces s = 1. Hence the normalized positive eigenvector x is uniquely determined.
With the previous proof in mind, the following definition should come as no surprise:

Definition 1.2.11. If a matrix A has an eigenvalue equal to the spectral radius ρ(A), this eigenvalue is called the Perron root; the corresponding eigenvector x such that
\[
A\mathbf{x} = \rho(A)\,\mathbf{x} \quad \text{and} \quad \|\mathbf{x}\|_1 = 1
\]
is called the Perron vector.
Corollary 1.2.12. Being a strictly positive square matrix (A > 0) is a sufficient condition for applying the Perron-Frobenius theorem.
Proof. This follows immediately from Theorem 1.1.12: a positive square matrix is alwaysirreducible.
1.2.3 Example
To check whether a matrix with nonnegative entries is primitive, irreducible or neither, we may replace all nonzero entries by 1, since this does not affect the classification. The matrix
\[
\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
\]
is strictly positive and thus primitive. The matrices
\[
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\]
both have 1 as a double eigenvalue and hence cannot be irreducible. The matrix
\[
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
\]
satisfies
\[
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^2 = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}
\]
and hence is primitive. The same goes for
\[
\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.
\]
The matrix
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]
however, is irreducible but not primitive: its powers alternate between itself and the identity matrix, so no power is strictly positive. Its eigenvalues are 1 and −1.
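These classifications are easy to verify mechanically. The following sketch (assuming NumPy; the helper positive_power_exists is ours, not a library function) tests primitivity by searching for a positive power:

```python
import numpy as np

def positive_power_exists(A: np.ndarray, max_k: int = 10) -> bool:
    """Primitivity check for small examples: search for a k with A^k > 0.
    (max_k = 10 is ample for these 2 x 2 matrices.)"""
    M = np.eye(A.shape[0])
    for _ in range(max_k):
        M = M @ A
        if np.all(M > 0):
            return True
    return False

for rows in ([[1, 1], [1, 1]],   # positive, hence primitive
             [[1, 0], [1, 1]],   # reducible, not primitive
             [[1, 1], [1, 0]],   # primitive: its square is positive
             [[0, 1], [1, 0]]):  # irreducible but not primitive
    print(rows, positive_power_exists(np.array(rows)))
```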
1.3 Norms
If one has several vectors in Rn or several matrices in Rn×m, how do we measure that some of them are 'large' and some of them are 'small'? One way to answer this question is to study norms, which are functions that assign a nonnegative 'size' to every vector in a vector space. The norms that we define in this master thesis are limited to Rn and Rn×m, and we call them respectively vector norms (norms on Rn) and matrix norms (norms on Rn×m). This section is mainly based on chapter 2 from [? ] and chapter 5 from [? ].
1.3.1 Vector norms
Definition 1.3.1. A vector norm on Rn is a function ‖ · ‖ : Rn → R with the following properties:
1. ‖x‖ ≥ 0, for all x ∈ Rn with equality if and only if x = 0.
2. ‖x + y‖ ≤ ‖x‖+ ‖y‖ for all x,y ∈ Rn.
3. ‖αx‖ = |α|‖x‖ for all α ∈ R, x ∈ Rn.
We now give some well known vector norms:
p-norms
Definition 1.3.2. The Hölder or p-norms are defined by:
\[
\|\mathbf{x}\|_p = (|x_1|^p + \cdots + |x_n|^p)^{\frac{1}{p}} = \left( \sum_{i=1}^n |x_i|^p \right)^{\frac{1}{p}},
\]
for x ∈ Rn.
The 1-norm is also known as the Manhattan norm:
\[
\|\mathbf{x}\|_1 = |x_1| + \cdots + |x_n|.
\]
The 2-norm is also known as the standard Euclidean norm:
\[
\|\mathbf{x}\|_2 = (|x_1|^2 + \cdots + |x_n|^2)^{\frac{1}{2}} = (\mathbf{x}^T \mathbf{x})^{\frac{1}{2}}.
\]
Notice that the 2-norm is invariant under orthogonal transformations, for if Q^T Q = I with Q ∈ Rn×n and x ∈ Rn:
\[
\|Q\mathbf{x}\|_2^2 = \mathbf{x}^T Q^T Q\, \mathbf{x} = \mathbf{x}^T \mathbf{x} = \|\mathbf{x}\|_2^2.
\]
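The following lines (a NumPy sketch, for illustration only) evaluate these norms on a small vector and check the orthogonal invariance of the 2-norm:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])
print(np.linalg.norm(x, 1))        # 1-norm (Manhattan): 3 + 4 + 1 = 8
print(np.linalg.norm(x, 2))        # 2-norm (Euclidean): sqrt(26)
print(np.linalg.norm(x, np.inf))   # maximum norm: 4

Q, _ = np.linalg.qr(np.random.rand(3, 3))   # a random orthogonal matrix
assert np.isclose(np.linalg.norm(Q @ x, 2), np.linalg.norm(x, 2))
```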
Maximum norm
Finally, when p → ∞ we get the maximum norm:
\[
\|\mathbf{x}\|_\infty = \max(|x_1|, \ldots, |x_n|).
\]
We will prove this:

Theorem 1.3.3. Let x ∈ Rn. Then:
\[
\lim_{p \to \infty} \|\mathbf{x}\|_p = \|\mathbf{x}\|_\infty = \max(|x_1|, \ldots, |x_n|).
\]
Proof. The case x = 0 is trivial, so assume x ≠ 0 and rewrite ‖x‖p as:
\[
\|\mathbf{x}\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{\frac{1}{p}} = \|\mathbf{x}\|_\infty \left( \sum_{i=1}^n \left( \frac{|x_i|}{\|\mathbf{x}\|_\infty} \right)^p \right)^{\frac{1}{p}}.
\]
Note that |xi|/‖x‖∞ ≤ 1 for every i, with equality at least once and at most n times. Then:
\[
\|\mathbf{x}\|_\infty \le \|\mathbf{x}\|_p \le \|\mathbf{x}\|_\infty\, n^{\frac{1}{p}}.
\]
Because n > 0, we get lim_{p→∞} n^{1/p} = 1, and thus:
\[
\lim_{p \to \infty} \|\mathbf{x}\|_p = \|\mathbf{x}\|_\infty.
\]
Norm equivalence
One very important property of all the norms on Rn is that they are all equivalent, meaning that when two vectors have about the same size according to one vector norm, they will also have more or less the same size (up to a constant) according to another vector norm.
Theorem 1.3.4. All norms on Rn are equivalent, i.e., if ‖ · ‖α and ‖ · ‖β are norms on Rn, then there exist positive constants c1, c2 ∈ R+ such that for all x ∈ Rn:
\[
c_1 \|\mathbf{x}\|_\alpha \le \|\mathbf{x}\|_\beta \le c_2 \|\mathbf{x}\|_\alpha.
\]
Proof. We will prove for the norm ‖ · ‖2 that there are c1 > 0, c2 > 0 such that for all x ≠ 0 it holds that:
\[
c_1 \|\mathbf{x}\|_\alpha \le \|\mathbf{x}\|_2 \le c_2 \|\mathbf{x}\|_\alpha.
\]
From this, the theorem easily follows (notice that for x = 0 this inequality evidently holds by the definition of a norm). First let
\[
\mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_n \mathbf{e}_n,
\]
where e1, e2, . . . , en is a basis for Rn. By the triangle inequality:
\[
\|\mathbf{x}\|_\alpha \le \sum_{j=1}^n |x_j| \|\mathbf{e}_j\|_\alpha.
\]
By Cauchy's inequality for the standard inner product,
\[
\sum_{j=1}^n |x_j| \|\mathbf{e}_j\|_\alpha \le \left( \sum_{j=1}^n x_j^2 \right)^{1/2} \left( \sum_{j=1}^n \|\mathbf{e}_j\|_\alpha^2 \right)^{1/2},
\]
so:
\[
c_1 \|\mathbf{x}\|_\alpha \le \|\mathbf{x}\|_2, \quad \text{where } c_1 = \frac{1}{\left( \sum_{j=1}^n \|\mathbf{e}_j\|_\alpha^2 \right)^{1/2}}.
\]
Now consider the function x → ‖x‖α on the set K = {x | ‖x‖2 = 1}. The set K is closed and bounded, so the theorem of Heine-Borel (see Theorem 3.5.2 in [? ]) shows that it is compact. Moreover, the inequality just proved shows that the function x → ‖x‖α is continuous with respect to ‖ · ‖2 (if ‖xj − x‖2 → 0 for a sequence (xj), then ‖xj − x‖α → 0 as well), so this function attains its minimum m on K; m is strictly greater than 0 because a norm is always nonnegative and 0 ∉ K. Now let x ≠ 0 be any vector in Rn and let u = x/‖x‖2 ∈ Rn. Then ‖u‖2 = 1, so ‖u‖α ≥ m. Hence:
\[
\|\mathbf{x}\|_\alpha \ge m \|\mathbf{x}\|_2, \quad \text{or} \quad \|\mathbf{x}\|_2 \le c_2 \|\mathbf{x}\|_\alpha, \quad \text{where } c_2 = \frac{1}{m}.
\]
For example, for any x ∈ Rn we have:
\[
\|\mathbf{x}\|_2 \le \|\mathbf{x}\|_1 \le \sqrt{n}\, \|\mathbf{x}\|_2,
\]
\[
\|\mathbf{x}\|_\infty \le \|\mathbf{x}\|_2 \le \sqrt{n}\, \|\mathbf{x}\|_\infty,
\]
\[
\|\mathbf{x}\|_\infty \le \|\mathbf{x}\|_1 \le n\, \|\mathbf{x}\|_\infty.
\]
1.3.2 Matrix norms
The analysis of algorithms involving matrices requires that we are able to assess the size of matrices, so we also introduce matrix norms. The definition is completely analogous, but with an extra condition: for square matrices, the matrix norm must be submultiplicative.
Definition 1.3.5. A matrix norm on Rn×m is a function ‖ · ‖ : Rn×m → R with the following properties:

1. ‖A‖ ≥ 0, for all A ∈ Rn×m, with equality if and only if A is a matrix with only zero entries.
2. ‖A + B‖ ≤ ‖A‖ + ‖B‖ for all A, B ∈ Rn×m.
3. ‖αA‖ = |α|‖A‖ for all α ∈ R, A ∈ Rn×m.
4. For all A,B ∈ Rn×n : ‖AB‖ ≤ ‖A‖‖B‖.
Frobenius norm
One of the most frequently used matrix norms for a matrix A ∈ Rn×m is the so-called Frobenius norm:
\[
\|A\|_F = \left( \sum_{i=1}^n \sum_{j=1}^m a_{ij}^2 \right)^{\frac{1}{2}} = \operatorname{trace}(A^T A)^{\frac{1}{2}}.
\]
If A ∈ R1×m, then the Frobenius norm equals the 2-norm.
p-norms
We can also define p-norms on matrices:
\[
\|A\|_p = \sup_{\mathbf{x} \ne 0} \frac{\|A\mathbf{x}\|_p}{\|\mathbf{x}\|_p}.
\]
Maximum norm
First notice that the function defined for A ∈ Rn×n as
\[
\|A\|_{\max} = \max_{1 \le i,j \le n} |a_{ij}|
\]
is a norm on the vector space Rn×n, but not a matrix norm. Consider the matrix
\[
J = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
\]
and compute J² = 2J. So ‖J‖max = 1 and ‖J²‖max = ‖2J‖max = 2‖J‖max = 2, so it is not the case that ‖J²‖max ≤ ‖J‖²max, and hence ‖ · ‖max is not a submultiplicative norm. Therefore we define the maximum norm as follows:
Definition 1.3.6. Let ‖ · ‖ be a vector norm on Rn. Define the maximum norm ||| · ||| on Rn×n by:
\[
|||A||| = \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\|.
\]
It is easy to see these equivalences:
\[
|||A||| = \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\| = \max_{\|\mathbf{x}\| \le 1} \|A\mathbf{x}\| = \max_{\mathbf{x} \ne 0} \frac{\|A\mathbf{x}\|}{\|\mathbf{x}\|}.
\]
Because we will need in the next paragraph the fact that the maximum norm is indeed a matrix norm, we prove this:
Theorem 1.3.7. The function ||| · |||, also called the maximum norm, is a matrix norm on Rn×n.

Proof. Let A be an n × n-matrix. Axiom (1) of matrix norms (Definition 1.3.5) follows from the fact that |||A||| is the maximum of a nonnegative-valued function. That this maximum exists follows from the extreme value theorem of Bolzano (see Theorem 5.5.1 in [? ]), because ‖Ax‖ is a continuous function of x on the unit sphere ‖x‖ = 1, which is a compact set. That |||A||| = 0 occurs only when A is a matrix with only zero entries follows from the fact that Ax = 0n×1 for all x is only possible when A = 0n×n.
To see axiom (2), we notice that the triangle inequality is inherited from the vector norm ‖ · ‖, since:
\[
|||A + B||| = \max_{\|\mathbf{x}\|=1} \|(A + B)\mathbf{x}\| = \max_{\|\mathbf{x}\|=1} \|A\mathbf{x} + B\mathbf{x}\| \le \max_{\|\mathbf{x}\|=1} \big( \|A\mathbf{x}\| + \|B\mathbf{x}\| \big) \le \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\| + \max_{\|\mathbf{x}\|=1} \|B\mathbf{x}\| = |||A||| + |||B|||.
\]
Axiom (3) of matrix norms follows from the calculation:
\[
|||\alpha A||| = \max_{\|\mathbf{x}\|=1} \|\alpha A\mathbf{x}\| = \max_{\|\mathbf{x}\|=1} |\alpha| \|A\mathbf{x}\| = |\alpha| \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\| = |\alpha|\, |||A|||.
\]
The submultiplicative axiom (4) follows from the fact that:
\[
|||AB||| = \max_{\mathbf{x} \ne 0} \frac{\|AB\mathbf{x}\|}{\|\mathbf{x}\|} = \max_{\substack{\mathbf{x} \ne 0 \\ B\mathbf{x} \ne 0}} \frac{\|AB\mathbf{x}\|}{\|B\mathbf{x}\|} \cdot \frac{\|B\mathbf{x}\|}{\|\mathbf{x}\|} \le \max_{\mathbf{y} \ne 0} \frac{\|A\mathbf{y}\|}{\|\mathbf{y}\|} \cdot \max_{\mathbf{x} \ne 0} \frac{\|B\mathbf{x}\|}{\|\mathbf{x}\|} = |||A||| \cdot |||B|||,
\]
where vectors x with Bx = 0 may be discarded from the maximum, since they contribute 0 to ‖ABx‖/‖x‖ (unless AB = 0, in which case the inequality is trivial).
Again, we can show that all matrix norms are equivalent with the same reasoning as in Theorem 1.3.4.
1.3.3 Spectral radius formula
In this section, we prove the spectral radius formula, also known as Gelfand's formula. This formula expresses the spectral radius of a square matrix as a limit of matrix norms. The formula will play an important role in Chapter 2, where it will be used to obtain some more advanced consequences of the Perron-Frobenius theorem when considering nonnegative, symmetric matrices.
Lemma 1.3.8. Let A be a square matrix. Then
\[
\rho(A)^k = \rho(A^k).
\]
Proof. This follows immediately from the fact that if A has eigenvalues λ1, λ2, . . . , λn, then λ1^k, λ2^k, . . . , λn^k are all the eigenvalues of A^k (see 1.2 in [? ]), so:
\[
\rho(A)^k = \left( \max_{1 \le i \le n} |\lambda_i| \right)^k = \max_{1 \le i \le n} |\lambda_i^k| = \rho(A^k).
\]
Lemma 1.3.9. Let A be a square matrix and let ‖ · ‖ be a matrix norm. Then:
\[
\rho(A) \le \|A\|.
\]
Proof. Let λ be an eigenvalue of A and let x ≠ 0 be a corresponding eigenvector. From Ax = λx, we get:
\[
AX = \lambda X, \quad \text{where } X = (\mathbf{x}\ \cdots\ \mathbf{x}) \in \mathbb{R}^{n \times n} \setminus \{0_{n \times n}\}.
\]
It follows from properties (3) and (4) of matrix norms that:
\[
|\lambda| \|X\| = \|\lambda X\| = \|AX\| \le \|A\| \|X\|,
\]
and dividing by ‖X‖ (‖X‖ > 0 by matrix norm property (1)) gives:
\[
|\lambda| \le \|A\|.
\]
This holds for all eigenvalues of A, so also for the maximum of their moduli. Hence the result follows.
Before we prove the following lemma, recall the well-known theorem about the Jordan canonical form of a square matrix A (see Theorem 2.3.1 in [? ]).

Theorem 1.3.10. For each square matrix A there exists an invertible matrix P such that
\[
P A P^{-1} = J',
\]
where J' is called the Jordan normal form of A and:
\[
J' = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_p \end{pmatrix},
\]
where each block Ji is a square matrix of the form:
\[
J_i = \begin{pmatrix}
\lambda_i & 1 & 0 & \cdots & 0 \\
0 & \lambda_i & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \lambda_i & 1 \\
0 & 0 & 0 & 0 & \lambda_i
\end{pmatrix},
\]
with the λi's eigenvalues of A.
Lemma 1.3.11. Let A be an n × n-matrix and ε > 0. Then there exists a matrix norm ‖ · ‖ρ such that:
\[
\|A\|_\rho \le \rho(A) + \varepsilon.
\]
Proof. The Jordan canonical form of A is:
\[
A = S \begin{pmatrix}
J_{n_1}(\lambda_1) & 0 & \cdots & 0 \\
0 & J_{n_2}(\lambda_2) & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & J_{n_k}(\lambda_k)
\end{pmatrix} S^{-1},
\]
where S ∈ Rn×n is an invertible matrix, λ1, . . . , λk are the eigenvalues of A and n1 + · · · + nk = n. Let:
\[
D(\eta) = \begin{pmatrix}
D_{n_1}(\eta) & 0 & \cdots & 0 \\
0 & D_{n_2}(\eta) & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & D_{n_k}(\eta)
\end{pmatrix}
\quad \text{with} \quad
D_m(\eta) = \begin{pmatrix}
\eta & 0 & \cdots & 0 \\
0 & \eta^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \eta^m
\end{pmatrix}.
\]
Since left multiplication by Dm(1/ε) multiplies the i-th row by 1/ε^i and right multiplication by Dm(ε) multiplies the j-th column by ε^j, we calculate:
\[
D(1/\varepsilon)\, S^{-1} A S\, D(\varepsilon) = \begin{pmatrix}
B_{n_1}(\lambda_1, \varepsilon) & 0 & \cdots & 0 \\
0 & B_{n_2}(\lambda_2, \varepsilon) & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & B_{n_k}(\lambda_k, \varepsilon)
\end{pmatrix}
\]
with
\[
B_m(\lambda, \varepsilon) = D_m(1/\varepsilon)\, J_m(\lambda)\, D_m(\varepsilon) = \begin{pmatrix}
\lambda & \varepsilon & 0 & \cdots & 0 \\
0 & \lambda & \varepsilon & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \lambda & \varepsilon \\
0 & \cdots & \cdots & 0 & \lambda
\end{pmatrix}.
\]
We now define the matrix norm for M ∈ Rn×n by:
\[
\|M\|_\rho = \max_{\|\mathbf{x}\|_1 = 1} \|D(1/\varepsilon)\, S^{-1} M S\, D(\varepsilon)\, \mathbf{x}\|_1 \tag{1.18}
\]
\[
= \max_{l \in [1:n]} \sum_{k=1}^n |(D(1/\varepsilon)\, S^{-1} M S\, D(\varepsilon))_{k,l}|. \tag{1.19}
\]
The conditions for being a matrix norm are met because max_{‖x‖1=1} ‖Ax‖1 is a matrix norm by Theorem 1.3.7 and M → D(1/ε)S^{-1}MSD(ε) is an invertible linear map that respects products. Finally, each column of the block matrix above contains at most one entry λi and one entry ε, so by (1.19), ‖A‖ρ ≤ max_i (|λi| + ε) = ρ(A) + ε, as desired.
Theorem 1.3.12. (Spectral radius formula) Let A be an n × n-matrix and let ‖ · ‖ be a matrix norm. Then:
\[
\rho(A) = \lim_{k \to \infty} \|A^k\|^{1/k}.
\]
Proof. Given k ≥ 0, we use Lemmas 1.3.8 and 1.3.9 to write:
\[
\rho(A)^k = \rho(A^k) \le \|A^k\|,
\]
so:
\[
\rho(A) \le \|A^k\|^{1/k}.
\]
Taking the limit as k → ∞ gives ρ(A) ≤ lim_{k→∞} ‖A^k‖^{1/k}. To establish the reverse inequality, we need to prove that, for any ε > 0, there exists a K ≥ 0 such that ‖A^k‖^{1/k} ≤ ρ(A) + ε for all k ≥ K. From Lemma 1.3.11, we know that there exists a matrix norm ‖ · ‖ρ with ‖A‖ρ ≤ ρ(A) + ε. Moreover, by the equivalence of the norms on Rn×n (see Theorem 1.3.4), we know that there exists some constant C > 0 such that ‖M‖ ≤ C‖M‖ρ for all M ∈ Rn×n. Then, for any k ≥ 0,
\[
\|A^k\| \le C \|A^k\|_\rho \le C \|A\|_\rho^k \le C (\rho(A) + \varepsilon)^k.
\]
So:
\[
\|A^k\|^{1/k} \le C^{1/k} (\rho(A) + \varepsilon),
\]
and thus:
\[
\lim_{k \to \infty} \|A^k\|^{1/k} \le \rho(A) + \varepsilon.
\]
This implies the existence of a K ≥ 0 such that ‖A^k‖^{1/k} ≤ ρ(A) + ε for k ≥ K, as desired.
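The convergence in the spectral radius formula can be observed numerically. A sketch assuming NumPy and using the Frobenius norm (by the theorem, any matrix norm yields the same limit):

```python
import numpy as np

A = np.array([[0.0, 2.0], [1.0, 1.0]])      # eigenvalues 2 and -1, so rho(A) = 2
for k in (1, 5, 10, 50, 100):
    norm_k = np.linalg.norm(np.linalg.matrix_power(A, k), 'fro')
    print(k, norm_k ** (1.0 / k))           # approaches rho(A) = 2 as k grows
```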
1.4 Numerical analysis
1.4.1 Bachmann-Landau notations
For comparing the computational cost of algorithms, it is important to know the Bachmann-Landau notations. These notations are used to describe the limiting behavior of a function in terms of simpler functions; they are used a lot in computer science to classify algorithms by how their number of steps depends on the input size. We are only interested in the behavior of the number of steps for really large input sizes, so constant factors do not play any role in the classification.
Big O
Definition 1.4.1. (Big O) The big O of a function g is the set of all functions f that are bounded above by g asymptotically (up to a constant factor):
\[
O(g(n)) = \{ f \mid \exists c, n_0 \ge 0 : \forall n \ge n_0 : 0 \le f(n) \le c\, g(n) \}.
\]
We now prove a very simple lemma to show that constant factors indeed do not matter for the big O:

Lemma 1.4.2. ∀k > 0 : O(k · g(n)) = O(g(n)).

Proof.
\[
O(k \cdot g(n)) = \{ f \mid \exists c, n_0 \ge 0 : \forall n \ge n_0 : 0 \le f(n) \le (k \cdot c)\, g(n) \}
= \{ f \mid \exists c', n_0 \ge 0 : \forall n \ge n_0 : 0 \le f(n) \le c'\, g(n) \}
= O(g(n)),
\]
where we let c′ = k · c.
Small O
Definition 1.4.3. (Small o) The small o of a function g is the set of all functions f that are dominated by g asymptotically:
\[
o(g(n)) = \{ f \mid \forall \varepsilon > 0, \exists n_0 : \forall n \ge n_0 : f(n) \le \varepsilon\, g(n) \}.
\]
Note that the small o notation is a much stronger statement than the corresponding big O notation: every function that is in the small o of g is also in its big O, but the converse is not necessarily true. Intuitively, f(x) ∈ o(g(x)) means that g(x) grows much faster than f(x).
Asymptotical Equality
Definition 1.4.4. (Asymptotically Equal) Let f and g be real functions. Then f is asymptotically equal to g iff
\[
\lim_{x \to +\infty} \frac{f(x)}{g(x)} = 1.
\]
Notation: f ≈ g.

Asymptotic equality can also be defined equivalently as f ≈ g ⇔ (f − g) ∈ o(g), which shows that it is an equivalence relation. It is clear that f ≈ g ⇒ f ∈ O(g).
1.4.2 The Power Method
We now introduce the classical power method, also called the Von Mises iteration ([? ]). This numerical algorithm is at the core of the whole thesis, because in the next chapters we will see many different algorithms that are in fact just adaptations of this iterative method.

The power method is an eigenvalue algorithm that, given a diagonalizable matrix A, finds the eigenvalue λ with the greatest magnitude and a corresponding eigenvector v such that:
\[
A\mathbf{v} = \lambda \mathbf{v}.
\]
The power method is special because it does not use any matrix decomposition technique for obtaining its results, making it suitable for very large matrices. On the other hand, it only finds one eigenvalue with a corresponding eigenvector, and the iterative process might converge very slowly.

There are plenty of variations of the power method available that overcome these limitations, but we limit our discussion here to the very basic method, which is why we call it the classical power method.

We first introduce some needed definitions, theorems and notations.
Definition 1.4.5. Consider a real n × n-matrix A with (not necessarily distinct) eigenvalues λ1, λ2, . . . , λn. When
\[
|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n|,
\]
λ1 is called the dominant eigenvalue.
Corollary 1.4.6. If a real n × n-matrix A has a dominant eigenvalue λ1, then λ1 is real.

Proof. Recall that the eigenvalues of a real matrix A are in general complex and occur in conjugate pairs. So if λ1 were not real, its complex conjugate would also be an eigenvalue with the same modulus. Because λ1 must be the only eigenvalue of maximal modulus, this is impossible.
Notation 1.4.7. Let x(i) denote vector x at iteration step i.
Algorithm 1
Let A ∈ Rn×n be a diagonalizable matrix with dominant eigenvalue λ1. We know that A has an eigenbasis V = {v1, v2, . . . , vn}. Every vector x(0) ∈ Rn can be written as a linear combination of elements of V, because V spans the space Rn. So:
\[
\mathbf{x}^{(0)} = \sum_{i=1}^n \xi_i \mathbf{v}_i.
\]
Now construct the sequence of vectors x(k):
\[
\mathbf{x}^{(k)} = A \mathbf{x}^{(k-1)} = A^k \mathbf{x}^{(0)}.
\]
Now:
\[
A^k \mathbf{x}^{(0)} = \sum_{i=1}^n \xi_i A^k \mathbf{v}_i = \sum_{i=1}^n \xi_i \lambda_i^k \mathbf{v}_i = \lambda_1^k \left\{ \xi_1 \mathbf{v}_1 + \xi_2 \left( \frac{\lambda_2}{\lambda_1} \right)^k \mathbf{v}_2 + \cdots + \xi_n \left( \frac{\lambda_n}{\lambda_1} \right)^k \mathbf{v}_n \right\}.
\]
Because |λi| < |λ1| for i > 1, we have that
\[
\left( \frac{\lambda_i}{\lambda_1} \right)^k \to 0 \quad \text{for } k \to \infty,
\]
so:
\[
\mathbf{x}^{(k)} = \lambda_1^k \big( \xi_1 \mathbf{v}_1 + o(1) \big) \quad \text{for } k \to \infty. \tag{1.20}
\]
This means that for k large enough, x(k+1) is almost equal to λ1 times x(k). So when the ratio between the corresponding vector entries of x(k+1) and x(k) becomes constant after k iteration steps, this ratio equals the dominant eigenvalue λ1. The vector x(k) is then (approximately) a corresponding eigenvector, because it is proportional to v1.
The start vector x(0) must only satisfy the condition that ξ1 ≠ 0; in other words, x(0) must have a nonzero component in the direction of the dominant eigenvector. Almost every randomly chosen x(0) fulfills this requirement. Even if we are so unlucky as to pick a starting vector which does not, subsequent x(k) will fulfill the requirement anyway, because rounding errors sustained during the iteration will introduce a component in this direction.
A practical problem arises when one of the components of x(k) is equal to zero: if we want to take the ratio between the corresponding components of x(k+1) and x(k), we get a division by zero. We can solve this by using axiom (3) of vector norms (see Definition 1.3.1):
\[
\|\lambda_1 \mathbf{x}^{(k)}\| = |\lambda_1|\, \|\mathbf{x}^{(k)}\|.
\]
Because x(k) ≠ 0 we have that ‖x(k+1)‖ ≈ ‖λ1 x(k)‖, so we calculate λ1 in the power method by:
\[
|\lambda_1| = \lim_{k \to \infty} \frac{\|\mathbf{x}^{(k+1)}\|}{\|\mathbf{x}^{(k)}\|}.
\]
To decide on the sign of λ1 (λ1 is always real, see Corollary 1.4.6), just divide two corresponding nonzero components of x(k+1) and x(k).
Another issue to address is that the components of x(k) = A^k x(0) can become too large or too small, which can cause an overflow or underflow in the floating-point representation of computers. To avoid this, we use normed versions of the x(k)-vectors: we start with a vector y(0) with ‖y(0)‖ = 1. Subsequently, we calculate for k = 0, 1, . . .:
\[
\mathbf{z}^{(k+1)} = A \mathbf{y}^{(k)}, \qquad \mu_{k+1} = \|\mathbf{z}^{(k+1)}\|, \qquad \mathbf{y}^{(k+1)} = \frac{\mathbf{z}^{(k+1)}}{\mu_{k+1}}.
\]
The vectors y(k) all have norm 1 and the components of z(k) remain bounded because:
\[
\|\mathbf{z}^{(k)}\| = \|A \mathbf{y}^{(k-1)}\| \le \|A\|\, \|\mathbf{y}^{(k-1)}\| = \|A\|,
\]
and when A is invertible we have ‖y(k−1)‖ = 1 ≤ ‖A^{−1}‖ ‖z(k)‖, thus ‖z(k)‖ ≥ 1/‖A^{−1}‖. If we want to calculate the eigenvalue, we can use:
\[
A^k \mathbf{y}^{(0)} = A^{k-1} A \mathbf{y}^{(0)} = A^{k-1} \mathbf{z}^{(1)} = A^{k-1} \mu_1 \mathbf{y}^{(1)} = A^{k-2} \mu_1 A \mathbf{y}^{(1)} = A^{k-2} \mu_1 \mathbf{z}^{(2)} = A^{k-2} \mu_1 \mu_2 \mathbf{y}^{(2)} = \cdots = \mu_1 \mu_2 \cdots \mu_k\, \mathbf{y}^{(k)}.
\]
So:
\[
|\lambda_1| = \lim_{k \to \infty} \frac{\|A^{k+1} \mathbf{y}^{(0)}\|}{\|A^k \mathbf{y}^{(0)}\|} = \lim_{k \to \infty} \frac{\mu_1 \mu_2 \cdots \mu_{k+1}\, \|\mathbf{y}^{(k+1)}\|}{\mu_1 \mu_2 \cdots \mu_k\, \|\mathbf{y}^{(k)}\|} = \lim_{k \to \infty} \mu_{k+1}. \tag{1.21}
\]
Because µk+1 converges, a good choice for a stopping condition of our numerical algorithm is
\[
|\mu_k - \mu_{k-1}| < \mathrm{TOL},
\]
which guarantees an estimated error of at most TOL (usually TOL = 10^{−5}) for the approximation of the dominant eigenvalue. With all this information, we construct Algorithm 1. Just to keep the algorithm understandable, we use the Euclidean norm in it.
Data:
    A: a matrix,
    y(0): a start vector with ‖y(0)‖2 = 1,
    TOL: tolerance for the estimation error.
Result:
    y(k): an estimation of a dominant eigenvector,
    µk: an estimation of the dominant eigenvalue.
begin power_method(A, y(0), TOL)
    k = 1;
    repeat
        z(k) = A y(k−1);
        µk = ‖z(k)‖2;
        y(k) = z(k)/µk;
        k = k + 1
    until k > 2 and |µk − µk−1| < TOL;
    if the components of y(k) and y(k−1) have a different sign then
        µk = −µk;
    end
    return y(k), µk;
end
Algorithm 1: The power method
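A direct transcription of Algorithm 1 into code might look as follows. This is a sketch under the same assumptions as the pseudocode (Euclidean norm, a real dominant eigenvalue), with an added max_iter guard that the pseudocode leaves implicit:

```python
import numpy as np

def power_method(A, y0, tol=1e-5, max_iter=10_000):
    """Classical power method (Algorithm 1): estimates the dominant
    eigenvalue of A and a corresponding eigenvector."""
    y = np.asarray(y0, dtype=float)
    y = y / np.linalg.norm(y)          # ensure ||y(0)||_2 = 1
    mu_prev = None
    for _ in range(max_iter):
        z = A @ y                       # z(k) = A y(k-1)
        mu = np.linalg.norm(z)          # mu_k = ||z(k)||_2
        y_next = z / mu                 # y(k) = z(k) / mu_k
        if mu_prev is not None and abs(mu - mu_prev) < tol:
            if np.dot(y_next, y) < 0:   # components flipped sign: lambda_1 < 0
                mu = -mu
            return y_next, mu
        y, mu_prev = y_next, mu
    raise RuntimeError("no convergence within max_iter iterations")

# Quick check on a symmetric matrix with eigenvalues 3 and 1:
v, lam = power_method(np.array([[2.0, 1.0], [1.0, 2.0]]), np.array([1.0, 0.0]))
print(lam)  # approximately 3
```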
Computational cost and usage
The computational cost of the algorithm is determined by the speed at which the o(1) termsin (1.20) go to zero. This is indicated by the slowest converging term (λ2/λ1)k. This meansthat the algorithm converges slowly when there is an eigenvalue close in magnitude to thedominant eigenvalue. We get the following expression for approximation µk of λ1:
|µk − λ1| = O
(∣∣∣∣λ2λ1
∣∣∣∣k)
In our algorithm we accepted an estimation error of 10−5, so the number of steps n can becomputed as: ∣∣∣∣λ2
λ1
∣∣∣∣n ≈ TOL
So, for example, for TOL = 10−5 we get n = −5log λ1
λ2
, so:
power method ∈ O( −1
log(λ1)− log(λ2)
)Note that this O holds for any estimation error TOL of the form 10−e with e ∈ N.
When λ2 ≈ λ1 we see that the power method needs (almost) infinitely many steps. Since we do not know the eigenvalues of A, this means we cannot know in advance whether the power method will converge or not. Recall that the eigenvalues of a real matrix A are in general complex, and occur in conjugate pairs. This means that when λ1 is not real, the power method will certainly fail. Therefore, it is a good idea to apply the power method only to matrices whose eigenvalues are known to be real. The only thing that can go wrong with those matrices is that the dominant eigenvalue has an algebraic multiplicity larger than 1.
From the Perron-Frobenius theorem in 1.2.10, we also get another good choice: the irreducible matrices, or any matrix with strictly positive entries. Indeed, the Perron-Frobenius theorem tells us that these have a unique dominant eigenvalue.
Example
Example 1.4.8. Consider the matrix:
A = (  1  −3   5 )
    ( −1  −7  11 )
    ( −1  −9  13 )
A has a dominant eigenvalue 3 and a double eigenvalue 2. A corresponding dominant eigenvector is (1, 1, 1). Now we use the classical power method with start vector y(0) = (1, 0, 0) and TOL = 10^−5, and we obtain the values in Table 1.1. Because we know the eigenvalues of A, we can predict the number of steps: n = −5/log(λ2/λ1) = −5/log(2/3) ≈ 28. Because λ2/λ1 = 2/3 is not that small, the convergence here is not that fast either. For the approximation of a corresponding eigenvector we obtain
y(29) = (−0.577348, −0.577352, −0.577351),
which is (more or less) proportional to (1, 1, 1).
k    µk         k    µk
0    1.00000    15   3.00459
1    1.73205    16   3.00305
2    4.12311    17   3.00204
3    4.06564    18   3.00136
4    3.58774    19   3.00090
5    3.34047    20   3.00060
6    3.20743    21   3.00040
7    3.13055    22   3.00026
8    3.08385    23   3.00017
9    3.05457    24   3.00011
10   3.03582    25   3.00008
11   3.02363    26   3.00005
12   3.01564    27   3.00003
13   3.01037    28   3.00002
14   3.00689    29   3.00001
Table 1.1: The iteration values µk of Example 1.4.8.
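As a quick numerical check of Example 1.4.8 (a sketch assuming NumPy and the power_method function given after Algorithm 1):

import numpy as np

A = np.array([[ 1., -3.,  5.],
              [-1., -7., 11.],
              [-1., -9., 13.]])
y, mu = power_method(A, np.array([1., 0., 0.]))
print(mu)  # approximately 3, the dominant eigenvalue
print(y)   # approximately (-0.5773, -0.5773, -0.5773), proportional to (1, 1, 1)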
1.5 Graphs
After introducing different kinds of matrices and proving the Perron-Frobenius theorem, we now take a closer look at graphs. Here too, we'll look at different families of graphs and prove some relevant properties about them. We also link the concept of graphs with different kinds of matrices, deepening our insight into some theorems of the previous section.
The definitions and results in this section are mainly based on the course 'Discrete Mathematics' by P. Cara [? ].
1.5.1 General definitions
Definition 1.5.1. A graph is an ordered pair (V,→) where V is a set and → is a relation on V. The elements of V are called vertices or nodes and → is called the adjacency relation. Let u, v ∈ V; then the ordered pair (u, v) belonging to → is called an arc or edge and we write u → v. We also say that u is adjacent to v. When v → v (with v ∈ V) we say that the graph has a loop at v. A graph (V,→) is most of the time denoted by calligraphic letters G, H, . . .
With this definition we have defined directed graphs. When the relation → is symmetric, we call the graph undirected; in this case we often write ↔ instead of →.
Example 1.5.2. The graph
[Figure: an undirected graph drawn as a triangle on the vertices v1, v2, v3]
CHAPTER 1. PRELIMINARIES AND NOTATIONS 34
is an undirected graph with vertices v1, v2, v3. The adjacency relation ↔ equals {(v1, v2), (v2, v1), (v2, v3), (v3, v2), (v3, v1), (v1, v3)}.
There is a small problem with our definition, because not all graphs are taken into account. For example, the graph below is not a graph following our definition, because you cannot define multiple edges between vertices in a relation.
[Figure: a graph on the vertices v1, v2, v3 with multiple edges between two of its vertices]
Therefore we define multisets and make a remark that introduces the concept of multiplicity of an edge using these multisets. Multisets are a generalization of the notion of a set in which elements are allowed to appear more than once.
Definition 1.5.3. A multiset A = (S, µ) is an ordered pair with S a set and µ : S → N0 a function that gives the multiplicity of an element of S. A multiset can be written as a set in the following way:
A = (S, µ) = {(s, µ(s)) : s ∈ S}
The cardinality of a multiset is defined as
|A| = Σ_{s∈S} µ(s).
Consider as an example the multiset {a, a, b, b, b, c} (with a, b, c different elements), which can be denoted as a set as {(a, 2), (b, 3), (c, 1)}.
We now introduce the following important remark using multisets:
Remark 1.5.4. Although we will keep writing a graph as G = (V,→), this definition doesn't allow repeated edges. A graph that can have multiple edges between two vertices is often called a multigraph, but in this master thesis we call a multigraph just a graph. To define this in a mathematically correct way, one defines G as an ordered pair (V,→) where V is a finite set and → is a multiset consisting of elements of the cartesian product V × V.
Definition 1.5.5. The neighbourhood of a vertex v of a graph G = (V,→) is the induced subgraph Gv with vertex set V′ consisting of all vertices adjacent to v, without v itself, and with the multiplicity function µ′, which is the restriction of µ to the vertices in V′. A vertex with a neighbourhood equal to the empty graph (a graph with an empty set of vertices) is called isolated.
Definition 1.5.6. Let G = (V,→) be a graph with vertices v1, v2, . . . , vn ∈ V and edges (defined as ordered pairs) e1, . . . , em ∈ →. Let u → v be an edge between u, v ∈ V. We call u the source node and v the terminal node of the edge. sG(i) denotes the source node of edge ei, tG(i) denotes the terminal node of edge ei.
Definition 1.5.7. The order of a finite graph G is the number of vertices of G and is denoted by |G|.
Definition 1.5.8. The indegree of a vertex v in a graph G is the number of times v is a terminal node of an edge.
Definition 1.5.9. The outdegree of a vertex v in a graph G is the number of times v is a source node of an edge.
Definition 1.5.10. The degree of a vertex v in a graph G is the sum of the indegree and the outdegree of v.
Definition 1.5.11. A walk in a graph G is a sequence of vertices
a0, a1, . . . , ak
such that ai−1 → ai for each i ∈ {1, . . . , k}. The length of the walk is k, one less than the number of vertices.
Definition 1.5.12. If all edges are distinct in a walk in a graph G , we call the walk a path.
Definition 1.5.13. A cycle is a walk from v0 to v0 in which all vertices except v0 are distinct.
Definition 1.5.14. A simple graph is an undirected graph G = (V,→) containing no loops and with at most one edge between any two vertices vi, vj ∈ V.
Definition 1.5.15. A clique in a graph G = (V,→) is a subset C of V , such that every twodistinct vertices in C are adjacent.
Definition 1.5.16. A bipartite graph is a graph G = (V,→) whose vertices can be divided into two disjoint sets U and T such that every edge connects a vertex in U with a vertex in T. There are no edges between vertices in U or between vertices in T.
Product graphs
Definition 1.5.17. Take two graphs G = (U,→), H = (V,→′). The product graph G × H is the graph with |G|·|H| vertices that has an edge between vertices (ui, vj) and (uk, vl) if there is an edge between ui and uk in G and there is an edge between vj and vl in H.
Colored graphs
Definition 1.5.18. A node colored graph G is a quadruple (V,→, C, a) with V a set of vertices, → an adjacency relation, C a set of colors and a a surjective function a : V → C that assigns to each vertex one color.
Definition 1.5.19. In a node colored graph G = (V,→, C, a), cG(V, i) denotes the number of vertices of color i. So:
cG(V, i) = |{(i, vj) ∈ C × V : a(vj) = i}|
Definition 1.5.20. An edge colored graph G is a quadruple (V,→, C, b) with V a set of vertices, → an adjacency relation, C a set of colors and b a surjective function b : (→) → C that assigns to each edge one color.
Definition 1.5.21. In an edge colored graph G = (V,→, C, b), cG(→, i) denotes the number of edges of color i. So:
cG(→, i) = |{(i, ej) ∈ C × (→) : b(ej) = i}|
Definition 1.5.22. A node-edge colored graph or fully colored graph G is a 5-tuple (V,→, C, a, b) with V a set of vertices, → an adjacency relation, C a set of colors, a a function a : V → C that assigns to each vertex one color and b a function b : (→) → C that assigns to each edge one color, with as condition that a(V) ∪ b(→) = C.
Adjacency matrix
We now represent a finite graph in the form of an adjacency matrix. This matrix gives a lot of useful information about the graph and vice versa.
Definition 1.5.23. Let G = (V,→) be a graph of order n and define a numbering on the vertices v1, . . . , vn. Then the adjacency matrix AG of G is the real n × n-matrix with aij equal to µ(vi, vj), the multiplicity of the edge (vi, vj).
Corollary 1.5.24. The adjacency matrix of an undirected graph G = (V,↔) is a symmetric matrix.
Proof. This is trivial by the definition of an undirected graph.
Theorem 1.5.25. Let k > 0. The element in position (i, j) of A^k_G equals the number of walks of length k from vi to vj in the graph G = (V,→).
Proof. By induction on k.
For k = 1 we count the walks of length 1. These are edges and the result follows immediately from the definition of AG.
Let vl be a vertex of G. If there are bil walks of length k from vi to vl and alj walks of length 1 (edges) from vl to vj, then there are bil·alj walks of length k + 1 from vi to vj passing through vertex vl. Therefore, the number of walks of length k + 1 between vi and vj is equal to
Σ_{l∈V} bil alj =: cij.
By the induction hypothesis we know that bil equals the element in position (i, l) of A^k_G, so cij is exactly the element in position (i, j) of the matrix product
A^k_G AG = A^{k+1}_G.
Example 1.5.26. The adjacency matrix of the graph in Example 1.5.2 is:
A = ( 0 1 1 )
    ( 1 0 1 )
    ( 1 1 0 )
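As a small illustration of Theorem 1.5.25 (a sketch assuming NumPy), the entries of the powers of this adjacency matrix count walks in the graph of Example 1.5.2:

import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])
A2 = np.linalg.matrix_power(A, 2)
print(A2[0, 0])  # 2: the walks v1 -> v2 -> v1 and v1 -> v3 -> v1
print(A2[0, 1])  # 1: the single walk v1 -> v3 -> v2 of length 2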
Incidence matrix
A graph can also be represented as an incidence matrix by numbering the vertices and the edges. The resulting incidence matrix will have −1, 1 or 0 as entries.
Definition 1.5.27. The incidence matrix of a directed graph G = (V,→) with vertices v1, . . . , vn ∈ V and edges e1, . . . , em is an n × m-matrix A where the rows represent the vertices and the columns represent the edges, such that:
(A)ij =  1   if vi is the source node of ej,
        −1   if vi is the terminal node of ej,
         0   otherwise.
In the case of an undirected graph G = (V,↔) with vertices v1, . . . , vn ∈ V and edges e1, . . . , em we define:
(A)ij = 1   if vi is a node of ej,
        0   otherwise.
1.5.2 Strong connectivity
In this section, we take a closer look at directed graphs and introduce the concept of connectivity.
Definition 1.5.28. An undirected graph G = (V,↔) is connected if it is possible to establish a path from any vertex to any other vertex.
Definition 1.5.29. A directed graph G = (V,→) is connected if the underlying undirected graph (remove all arrows on the edges) is connected. The directed graph G is strongly connected if there is a path in each direction between each pair of vertices of the graph.
In the next theorem, we establish the equivalence of the matrix property of irreducibility from Definition 1.1.10 with strong connectivity of the directed graph associated with a matrix:
Theorem 1.5.30. Let G be a (directed) graph with adjacency matrix A. Then G is strongly connected if and only if A is irreducible.
Proof. From Theorem 1.5.25 we know that a graph is strongly connected if and only if for every pair of indices i and j there is an integer k such that (A^k)ij > 0; from Theorem 1.1.12 we know this means that A is irreducible, and vice versa.
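In practice, irreducibility can be tested without computing all powers: for a nonnegative n × n matrix A it is a standard fact (not proved in this text) that A is irreducible if and only if (I + A)^{n−1} has no zero entries. A minimal sketch assuming NumPy:

import numpy as np

def is_irreducible(A):
    """Test irreducibility of a nonnegative square matrix, i.e. strong
    connectivity of the associated directed graph."""
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool(np.all(M > 0))

# The triangle graph of Example 1.5.2 is strongly connected:
A = np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
print(is_irreducible(A))  # True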
1.6 Hypergraphs
After introducing graphs, we now look at hypergraphs. Intuitively, a hypergraph is a generalization of a graph in which an edge can connect any number of vertices. The definitions presented are mainly from [? ] and [? ].
1.6.1 General definitions
Definition 1.6.1. A hypergraph is an ordered pair (V, E) with V a finite set and E ⊆ 2^V \ {∅} with E ≠ ∅, where 2^V denotes the power set of V. The elements of V are called the vertices and the elements of E are called the edges. A hypergraph (V, E) will be denoted by the calligraphic letters G, H, . . .
Remark 1.6.2. As with graphs, the classic definition of a hypergraph doesn't allow multiple edges that cover the same vertices. Repeated vertices within an edge, often called hyperloops, are also not allowed by the above definition. A hypergraph that can have multiple edges connecting the same vertices, and hyperloops, is called a multi-hypergraph, but in this master thesis we call a multi-hypergraph just a hypergraph. So a hypergraph G is an ordered pair (V, E) where V is a finite set and E is a multiset consisting of multi-subsets of V (a multi-subset is just a subset of a set where the elements are allowed to appear more than once).
Example 1.6.3. The hypergraph G = (V, E) below consists of 7 vertices and 5 edges. The edges are equal to the multiset {{v1, v2, v3}, {v2, v3}, {v3, v5, v6}, {v3, v5, v6}, {v4}}. Note that the colors don't have any meaning to the hypergraph (the hypergraph is not an edge colored hypergraph), but only serve to clarify the drawing.
[Figure: a hypergraph on the vertices v1, . . . , v7 with edges E1 = {v1, v2, v3}, E2 = {v2, v3}, E3 = E4 = {v3, v5, v6} and E5 = {v4}]
Definition 1.6.4. In a hypergraph G = (V, E), two vertices vi, vj ∈ V are called adjacent if there is an edge Ei ∈ E that contains both vertices. Two edges Ek, El are called adjacent if their intersection is not empty.
Definition 1.6.5. The order of a finite hypergraph G is the number of vertices of G and is denoted by |G|.
We now define the degree of a vertex in a hypergraph. Note that we don't define the indegree and outdegree of a vertex, as the hypergraphs we consider are undirected.
Definition 1.6.6. The degree of a vertex v in a hypergraph G is the number of times v is contained in an edge.
Definition 1.6.7. A path in a hypergraph G = (V, E) is a sequence p = (a0, A1, a1, . . . , Ak, ak), k ≥ 1, where the ai's are pairwise distinct vertices, the Ai's are pairwise distinct edges and ai−1, ai ∈ Ai for 1 ≤ i ≤ k. The path p is said to join a0 and ak. The length of the path is k.
Definition 1.6.8. A hypergraph is connected if for each vertex there is a path to any other vertex.
k-hypergraphs
In most applications, the edges of a hypergraph connect a fixed number of vertices.
Definition 1.6.9. A hypergraph is called a k-uniform hypergraph, for a natural number k ≥ 2, if for all Ei ∈ E the cardinality |Ei| is equal to k. The cardinality of multisets is defined in Definition 1.5.3. The term k-graph is often used instead of k-uniform hypergraph. The edges in a k-graph are sometimes called k-edges.
Notice that the 2-uniform hypergraphs are just the undirected graphs we defined in the previous section.
Directed hypergraphs
An important difference with the previous section is that all hypergraphs we have defined are undirected: there is no specific order in which an edge connects different vertices. In a graph, directed edges arise naturally, as one vertex is the source node and one vertex the terminal node, and no other nodes are connected by an edge of a graph. For edges of hypergraphs this concept is not straightforward to generalize: one option is to see edges as paths connecting vertices in a specific order; another option is to partition the vertices connected by an edge into a set of source nodes and a set of terminal nodes. This last notion is studied in [? ]. We will not discuss this topic in detail; all the hypergraphs in this master thesis are undirected.
1.6.2 Incidence matrix
A hypergraph can be represented as an incidence matrix by numbering the vertices and the edges. The resulting incidence matrix will be a boolean matrix with only 1 and 0 entries. An alternative way to represent a hypergraph is by using adjacency tensors.
Definition 1.6.10. The incidence matrix of a hypergraph G = (V, E) with vertices v1, . . . , vn ∈ V and edges e1, . . . , em ∈ E is an n × m-matrix A where the rows represent the vertices and the columns represent the edges, such that:
(A)ij = 1   if vi ∈ ej,
        0   if vi ∉ ej.
Example 1.6.11. The corresponding incidence matrix of the hypergraph of Example 1.6.3 is:
( 1 0 0 0 0 )
( 1 1 0 0 0 )
( 1 1 1 1 0 )
( 0 0 0 0 1 )
( 0 0 1 1 0 )
( 0 0 1 1 0 )
( 0 0 0 0 0 )
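A minimal sketch (assuming NumPy) that builds this incidence matrix directly from the edge multiset of Example 1.6.3:

import numpy as np

vertices = ['v1', 'v2', 'v3', 'v4', 'v5', 'v6', 'v7']
edges = [{'v1', 'v2', 'v3'}, {'v2', 'v3'}, {'v3', 'v5', 'v6'},
         {'v3', 'v5', 'v6'}, {'v4'}]          # E1, ..., E5

A = np.zeros((len(vertices), len(edges)), dtype=int)
for j, edge in enumerate(edges):
    for i, v in enumerate(vertices):
        if v in edge:
            A[i, j] = 1                       # (A)_ij = 1 iff v_i is in e_j
print(A)                                      # reproduces the matrix above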
Chapter 2
Similarity on graphs
In the previous chapter all the basic terminology and results were introduced. Now we take an extensive look at the concept of similarity on graphs. Similarity on graphs is a fairly new concept to compare the nodes of two graphs. The concept arose from the research on algorithms for web search engines (like Google, Yahoo, . . . ) in the late nineties. More specifically, Jon M. Kleinberg introduced in his paper 'Authoritative Sources in a Hyperlinked Environment' [? ] the famous 'HITS algorithm' for extracting information from the link structure of websites. The method leads to an iterative algorithm where graphs represent the link structure of a collection of websites on a specific topic. Because this paper formed the basis of later research on similarity on graphs, the whole idea and algorithm of Kleinberg is introduced in the first section of this chapter. In 2004, V.D. Blondel et al. [? ] generalized the algorithm of Kleinberg, introducing the notion of similarity on directed graphs. This similarity is covered in the second section. With this similarity on directed graphs, there is a much wider scope of applications than just search algorithms. The method of Blondel only returns node similarity scores, which are in fact a measurement of how similar two nodes of two graphs are to each other; in the next section, we therefore extend the method of Blondel to both node and edge similarity scores. We conclude this chapter with a notion of similarity on colored graphs.
2.1 The HITS algorithm of Kleinberg
2.1.1 History
Back in the nineties, the internet became more and more popular. The popular search engines back then were AltaVista and Yahoo, but they weren't as advanced as search engines today. The main pitfall of the first search engines was that the search results were purely based on the number of occurrences of a word in a webpage. This was a pitfall for many reasons. The first reason was the growing popularity of the internet: as more and more webpages were put online, simply getting the relevant pages for a search query in this text-based manner was a process that could possibly return millions of relevant pages. Content similarity was an issue as well: a website owner can easily cheat in a text-based search system by just adding and repeating some very popular search words, making his website appear in the results of a large number of search queries. Two possible solutions were simultaneously invented in 1997 and 1998. The first one was the PageRank system developed by Larry Page and Sergey Brin ([? ]). The PageRank system led to the foundation of the immensely popular Google search
engine. Meanwhile, Jon Kleinberg came up with his own solution: the HITS algorithm (hyperlink-induced topic search). At that time, he was both a professor in the Computer Science Department at Cornell University and a researcher for IBM. The algorithm is used today, inter alia, by the Ask search engine (www.ask.com). Both these algorithms use the hyperlinks between webpages to rank search results. Because this master thesis is about similarity and this concept is introduced on graphs as a generalization of the HITS algorithm, we don't go into further detail about the PageRank algorithm. In the following paragraphs, the HITS algorithm is explained extensively.
2.1.2 Motivation
Kleinberg's work originates in the problems that arise with text-based searching of the WWW. Text-based searching just counts all the occurrences of a given search query on webpages and returns a set of webpages ordered by decreasing occurrence. When a user supplies a search query, we probably face an abundance problem with this method: the number of pages that could reasonably be returned as relevant is far too large for a human user to digest. To provide effective search results under these conditions, we need to filter out the 'authoritative' ones. We face some complications when we want to filter the 'authoritative' webpages in a text-based system. For example, if we search for 'job offers in Flanders', the most authoritative page and expected first result in a search engine would be www.vdab.be. Unfortunately, the query 'job offers' is used in over a million pages on the internet and www.vdab.be is not the one using the term most often. Therefore, there is no way to favor www.vdab.be in a text-based ranking function. This is a recurring phenomenon, e.g., if you search for the query 'computer brands', there is no reason at all to be sure that the website of Apple or Toshiba even contains this search term.
The HITS algorithm solves these difficulties by analyzing the hyperlink structure among webpages. The idea is that hyperlinks encode a sort of human judgment and that this judgment is crucial to formulate a notion of authority. Specifically, when a page p includes a link to page q, this means that p confers authority on q. Again we face difficulties, because this conferred authority doesn't hold for every link. Links are created for a wide variety of reasons; for example, a large number of links are created for navigation within a website (e.g. "Return to homepage") and these have of course nothing to do with a notion of authority.
The HITS method is based on the relationship between the authorities for a topic and the pages that link to many related authorities, called hubs. Page p is called an authority for the query 'smartphone brand' if it contains valuable information on the subject. In our example, websites of smartphone manufacturers such as www.apple.com, www.samsung.com, . . . would be good authorities for this search query. These are also the results a user would expect from a search engine.
A hub is a second category of pages needed to find good authorities. Their role is to advertise authoritative pages. Hubs contain useful links toward these authorities. In our example, consumer websites with reviews on smartphones, websites of smartphone shops, . . . would be good hubs. In fact, hubs point the search process in the 'right direction'.
To really grasp the idea, we make an analogy with everyday life. If you tell a friend that you are thinking of buying a new smartphone, he might tell you about his experiences with smartphones and he will probably share some opinions he got from other friends. He might suggest some good models and good brands. Now, you are more inclined to buy a smartphone that your friend suggested. Well, this idea is used in the HITS method: your friend served as a hub; the brands and models he suggested are good authorities.
Data:
    σ: a query string,
    E: a text-based search engine,
    t: natural number (usually set to 200),
    d: natural number (usually set to 50).
Result: A page set Sσ satisfying all the properties of our wish list.
begin create_graph(σ, E, t, d)
    Let Rσ denote the top t results of E on σ;
    Set Sσ := Rσ;
    for each page p ∈ Rσ do
        Let Γ+(p) denote the set of all pages p points to;
        Let Γ−(p) denote the set of all pages pointing to p;
        Add all pages in Γ+(p) to Sσ;
        if |Γ−(p)| ≤ d then
            Add all pages in Γ−(p) to Sσ;
        else
            Add an arbitrary set of d pages from Γ−(p) to Sσ;
        end
    end
    return Sσ;
end
Algorithm 2: Algorithm to construct Sσ.
2.1.3 Constructing relevant graphs of webpages
Any collection of hyperlinked pages can be transformed into a directed graph G = (V,→): the nodes correspond to the pages, and if there is a link from page p to page q, there is an arc p → q. Suppose a search query is performed, specified by a query string σ. We wish to determine the authoritative pages by an analysis of the link structure. But first we have to construct a subgraph of the internet on which our algorithm will operate. We want to make the computational effort as efficient as possible, so we restrict the subgraph to the set Qσ of all pages where the query σ occurs. For this, we could use any already existing text-based search engine. But Qσ is possibly much too big for our algorithm: it may contain millions of pages, making it impossible for any computer to perform the algorithm. Moreover, as explained in the motivation in 2.1.2, it is possible that Qσ does not contain some of the most important authorities, because they never use the query string σ on their website.
Therefore, we wish to transform the set Qσ to a set Sσ of pages following this 'wish list' of properties:
1. Sσ is relatively small,
2. Sσ is rich in relevant pages,
3. Sσ contains most of the strongest authorities.
By keeping Sσ small, the computational cost of performing non-trivial algorithms can be kept under control. By the property of being rich in relevant pages, it will be easier to find good authorities.
To construct Sσ, we first construct a root set Rσ with the t highest-ranked pages for σ using a text-based search engine (such engines sort results based on the occurrence of σ). Typically, t is set to about 200. Rσ complies with properties 1 and 2 of our wish list, but because Rσ ⊂ Qσ, it may fail to satisfy property 3. Now we use the root set Rσ to create the set Sσ satisfying our complete wish list. When a strong authority is not in Rσ, it is very likely that at least one of the pages in Rσ points to this authority. Hence, starting from the pages in Rσ, we can expand it to Sσ by looking at the links that enter and leave Rσ. We get Algorithm 2.
Thus, we obtain Sσ by expanding Rσ to include any page pointed to by a page in Rσ. We also add up to d pages that point to a page in Rσ; d is usually set to 50. The parameter d is crucial to stay in accordance with property 1 of our wish list. Indeed, a webpage can be pointed to by many thousands of other webpages, and we don't want to include them all if we want to keep Sσ relatively small. Some experiments in [? ] showed that this algorithm resulted in an Sσ with a size in the range of 1000 to 5000 web pages. Property 3 of our wish list is usually met because a strong authority need only be referenced once in the t pages of the root set Rσ to be added to Sσ.
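The construction of Sσ could be sketched in Python as follows; the callables search_engine, links_from (Γ+) and links_to (Γ−) are hypothetical placeholders, since in reality they require a text-based engine and a link database or crawler.

import random

def create_graph(sigma, search_engine, links_from, links_to, t=200, d=50):
    """Sketch of Algorithm 2; all three callables are hypothetical."""
    root = list(search_engine(sigma))[:t]         # root set R_sigma
    S = set(root)
    for p in root:
        S.update(links_from(p))                   # pages p points to
        incoming = list(links_to(p))              # pages pointing to p
        if len(incoming) <= d:
            S.update(incoming)
        else:
            S.update(random.sample(incoming, d))  # an arbitrary d of them
    return S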
Denote the resulting graph of the page set Sσ by G[Sσ]. Note that G[Sσ] will contain a lot of links serving only navigational purposes within a website. As mentioned before, these links have nothing to do with the notion of authority, and they must be removed from our final graph if we want a good determination of the authoritative pages by an analysis of the link structure. A very simple heuristic can be used to derive a subgraph of G[Sσ] leaving out all the navigational links: we make a distinction between transverse links and intrinsic links. Transverse links are links between different domain names (e.g. a link between www.vub.ac.be and www.ua.ac.be) and intrinsic links are links within the same domain name (e.g. a link between www.vub.ac.be and dwis.vub.ac.be). Intrinsic links exist to allow navigation within a website and they tell us very little about the authority of the pages they point to. Therefore, we delete all intrinsic links from G[Sσ], keeping only the arcs corresponding to transverse links.
Our graph still contains some meaningless links in the context of page authority. Suppose a large number of pages from the same domain name have a transverse link to the same page p. Most of the time, this indicates a form of advertisement (for example, 'Website created by . . .' at the bottom of each page). It is useful to allow only m pages (m is usually set to 6) from the same domain name to have a transverse link to the same page. If m is exceeded, all these transverse links are deleted from the graph. Note, however, that not all links to advertisements will be erased, because on most web pages advertisements change on every page, which avoids exceeding m.
Applying the two heuristics described above to G[Sσ], we get a new graph G′σ which is exactly what we need to perform our link analysis.
2.1.4 Hubs and Authorities
A very simple approach would now be to order the pages in G′σ by their indegree. Although this approach can sometimes return good search results, this heuristic is often too simple, because Sσ will probably contain some web pages with a lot of incoming links without being very relevant to the search query σ (e.g. advertisements). With these incoming links, those
web pages are ranked high in the final search result, which we want to avoid.
Do we have to return to a text-based approach to avoid irrelevant web pages ending up on top of the search results? No: the link structure of G′σ can tell us a lot more than it may seem at first glance. Authoritative pages relevant to query σ should indeed have a large indegree, but there should also be a considerable overlap in the sets of pages that point to authoritative pages. The pages that point to many authoritative pages are called hubs. Hubs have links to several authoritative pages and they sort of "concentrate" all the authorities on query σ. Figure 2.1.1 shows what this means conceptually.
[Figure: hubs on the left pointing to authorities on the right, together with an irrelevant page of large indegree]
Figure 2.1.1: The concept of hubs and authorities
So, for each page j we assign two scores: an authority score, which estimates the value of the content of the page, and a hub score, which estimates the value of the outgoing links to other pages. We now get a dichotomy: a good hub is a page pointing to many good authorities, a good authority is a page that is pointed to by many good hubs. This leads us to a mutually reinforcing relation, resulting in an iterative method to break this circularity.
So let G′σ = (V,→) and let hj and aj be the hub and authority scores of vertex vj (corresponding with page j). These scores must be initialized with some positive start values and then updated simultaneously for all vertices. This leads to a mutually reinforcing relation in which the hub score of vj is set equal to the sum of the authority scores of all vertices pointed to by vj, and in an equal manner the authority score of vj is set equal to the sum of the hub scores of all vertices pointing to vj:
hj := Σ_{i:(vj,vi)∈→} ai,
aj := Σ_{i:(vi,vj)∈→} hi.
The basic operations in which hubs and authorities reinforce one another are depicted in Figure 2.1.2.
[Figure: on the left, page j with aj = sum of hl over all pages l pointing to j; on the right, page j with hj = sum of ak over all pages k pointed to by j]
Figure 2.1.2: The basic operations in the reinforcing relation between hubs and authorities
Let B be the adjacency matrix of G′σ, let a be the authority vector with coordinates (a1, a2, . . . , an) (with n = |G′σ|, the number of pages) and let h be the hub vector. The mutually reinforcing relation can now be rewritten as:
( h )(k+1)   ( 0    B ) ( h )(k)
( a )      = ( B^T  0 ) ( a )    ,   k = 0, 1, . . .
In compact form, we denote
x(k+1) = Mx(k),   k = 0, 1, . . . ,   (2.1)
where
x(k) = ( h )(k)         ( 0    B )
       ( a )    ,   M = ( B^T  0 ).
After each iteration, we have to normalize hj and aj. Indeed, we want to get the authority and hub weights for each page, and in order to compare these after each iteration step they must be normalized, because only the relative differences matter; otherwise the whole procedure would be meaningless. Pages with larger aj-scores are viewed as being better authorities, pages with larger hj-scores are better hubs.
We get the following sequence (with z(0) some positive start value) of normalized vectors:
z(0) = x(0) > 0,   z(k+1) = Mz(k) / ‖Mz(k)‖2,   k = 0, 1, . . . ,   (2.2)
Data:
    G: a graph of n linked pages,
    k: natural number.
Result: A vector (h, a) containing the hub and authority scores after k steps.
begin hits(G, k)
    Set a(0) = (1, 1, . . . , 1) ∈ Rn;
    Set h(0) = (1, 1, . . . , 1) ∈ Rn;
    for i = 1, 2, . . . , k do
        Calculate h′(i) = ( Σ_{m:(v1,vm)∈→} a(i−1)_m , Σ_{m:(v2,vm)∈→} a(i−1)_m , . . . , Σ_{m:(vn,vm)∈→} a(i−1)_m );
        Normalize h′(i), obtaining h(i);
        Calculate a′(i) = ( Σ_{m:(vm,v1)∈→} h(i)_m , Σ_{m:(vm,v2)∈→} h(i)_m , . . . , Σ_{m:(vm,vn)∈→} h(i)_m );
        Normalize a′(i), obtaining a(i);
    end
    return (h(k), a(k));
end
Algorithm 3: The iterative HITS algorithm.
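A compact sketch of Algorithm 3 in Python (assuming NumPy), working on the adjacency matrix B of G′σ instead of on the adjacency relation itself:

import numpy as np

def hits(B, k=20):
    """Sketch of Algorithm 3: k steps of the HITS iteration on the
    adjacency matrix B; returns the hub and authority vectors."""
    n = B.shape[0]
    h = np.ones(n)
    a = np.ones(n)
    for _ in range(k):
        h = B @ a                     # hub update: sum the authority
        h /= np.linalg.norm(h)        #   scores of the pages linked to
        a = B.T @ h                   # authority update: sum the hub
        a /= np.linalg.norm(a)        #   scores of the pages linking in
    return h, a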
How do we decide on x(0)? We will see that any positive vector in R^2n is a good choice, but for the sake of simplicity we make the natural choice¹ 1 ∈ R^2n. The limit to which the sequence converges results in 'definitive' hub and authority scores for each page in the graph G′σ.
To compute the iterative algorithm, we update the hub and authority scores in an alternating fashion (at each step we have to normalize the scores). Because we will prove that the sequence converges, theoretically we can keep on iterating until a fixed point is approximated. But in most practical settings, we choose a fixed number of steps k to reduce the computational cost, because we cannot know beforehand how large k has to be to reach the limit. But of course, it is extremely important to know that the method converges anyway. Let x(i) denote vector x at iteration step i as in Notation 1.4.7; we get Algorithm 3.
To filter the top c hubs and the top c authorities, you can use the trivial Algorithm 4.
How do we decide on the values of k and c? It is immediately clear that c and k must be proportional: for low c values, a lower value for the number of iteration steps k is appropriate, and vice versa. Experiments in [? ] showed that setting k to 20 is sufficient to obtain stable results when finding the 5 best hubs and authorities, thus for c = 5.
2.1.5 Convergence of the algorithm
We now want to prove that for arbitrarily large values of k, the sequence z(k) converges to a limit (h′, a′). Before proving the convergence, note that adjacency matrices are nonnegative by definition, and thus the matrix M is nonnegative too. M is also clearly a symmetric n′ × n′-matrix (with n′ = 2n) with nonnegative, real entries. We prove that such matrices have n′ (not necessarily different) real eigenvalues and that we can diagonalize M. This is the first condition of the power method we introduced in section 1.4.2. If we can also prove the second condition (having a unique dominant eigenvalue), convergence follows immediately from the power method.
¹1 is a matrix, or vector, whose entries are all equal to 1.
Data:
    G: a graph of n linked pages,
    k: natural number,
    c: natural number.
Result: A vector ((H1,H2, . . . ,Hc), (A1,A2, . . . ,Ac)) containing exactly the nodes of the c top hubs and c top authorities.
begin filter(G, k, c)
    (h, a) = hits(G, k);
    Sort the pages with the c largest values in h, resulting in a vector of nodes (H1,H2, . . . ,Hc);
    Sort the pages with the c largest values in a, resulting in a vector of nodes (A1,A2, . . . ,Ac);
    return ((H1,H2, . . . ,Hc), (A1,A2, . . . ,Ac));
end
Algorithm 4: Returning the top c hubs and authorities
However, there is a problem here: we cannot prove that nonnegative symmetric matrices have a unique dominant eigenvalue (a unique dominant eigenvalue means the largest eigenvalue has multiplicity 1), simply because this is not true in general.² In the original paper of Kleinberg [? ] he solves this issue by simply imposing that the matrix M has a unique dominant eigenvalue, and he doesn't pay any further attention to this problem. He presents it as 'a small, technical assumption for the sake of simplicity'.
Is this justified in practice? Actually it is, because one can prove with probability theory that a random matrix Cn has, with probability tending to 1, no repeated eigenvalues as the size of the matrix goes to infinity (see for example Theorem 2.2.3 in [? ]). One can also defend this differently: the only reason why we can't use the Perron-Frobenius theorem (see 1.2.10) here is that M will have zero entries (not all pages in Sσ will be linked to each other; the graph G′σ is not strongly connected in general). But it is intuitively clear that by adding 1 to each entry of M, the final results of the algorithm (a sorted vector with the best hubs and authorities) will not be changed at all, because pages with larger indegrees and outdegrees will continue to get better hub and authority scores (note, however, that the relative hub and authority scores can fluctuate a bit, and the algorithm will converge more slowly because of the lack of zero entries). So, by adding 1 to each entry of M, the matrix becomes a positive, real matrix, and we know from the Perron-Frobenius theorem that these matrices have a unique dominant eigenvalue. So yes, the 'small, technical assumption' in the paper of Kleinberg is justified.
Now that this problem is solved, we present the relevant theorems below. We also impose on the matrix M that it has a unique dominant eigenvalue, with the preceding explanations in mind. Remember that we will generalize the idea of the HITS algorithm to introduce similarity on graphs. Therefore, we will reconsider the convergence of the generalized algorithm in the following section. We will prove that there also exists a limit even when the matrix M has no unique dominant eigenvalue. We don't present this result immediately, because we want to present the results as authentically as possible and we want to show the evolution of the ideas in the successive papers.
²The identity matrix ( 1 0 ; 0 1 ) is a simple counterexample of a symmetric, nonnegative, real matrix that has no unique dominant eigenvalue.
Theorem 2.1.1. If A is a symmetric, real n × n-matrix, then it has n (not necessarily different) real eigenvalues corresponding to real eigenvectors.
Proof. First, treat A as a complex matrix. The characteristic polynomial det(A − λI) has n roots in C and each root is an eigenvalue of A. Let λ ∈ C be any eigenvalue and v ∈ Cn a corresponding eigenvector of A. We have:
Av = λv.
As A = A^t, we also get:
v^t A = λv^t.
Taking the complex conjugate of both sides we get (A is a real matrix):
v̄^t A = λ̄v̄^t.
We get:
v̄^t Av = (v̄^t A)v = (λ̄v̄^t)v = λ̄v̄^t v.
We also have:
v̄^t Av = v̄^t(Av) = λv̄^t v.
Hence:
λ̄v̄^t v = λv̄^t v.
We conclude that λ̄ = λ, since v ≠ 0 implies v̄^t v > 0. We have proved that every eigenvalue of A is real. If λ is an eigenvalue of A, then the matrix (A − λI) is not invertible over R, so a non-zero vector s ∈ Rn exists with
(A − λI)s = 0,
proving that there is also a corresponding real eigenvector.
Theorem 2.1.2. (Symmetric Schur Decomposition) Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that:
(i) P−1AP = D, a diagonal matrix,
(ii) The diagonal entries of D are the eigenvalues of A,
(iii) The column vectors of P are the eigenvectors of the eigenvalues of A.
Proof. By induction on the order of the matrix. For n = 1 the theorem is trivial. Let A be a symmetric n × n-matrix. A has at least one eigenvalue λ1 by the previous theorem. Let x1 be a corresponding eigenvector with ‖x1‖ = 1 and Ax1 = λ1x1. By the Gram-Schmidt procedure, we construct an orthonormal basis V1 = {x1, v2, . . . , vn} of Rn. Let
S1 = [x1, v2, . . . , vn];
since S1 is orthonormal, we get S1^t = S1^{-1}. Consider the matrix S1^{-1}AS1. We have:
(S1^{-1}AS1)^t = (S1^t AS1)^t = S1^t A^t S1 = S1^{-1}AS1.
Thus S1^{-1}AS1 is a symmetric matrix. Since S1e1 = x1, we get:
S1^{-1}AS1e1 = (S1^{-1}A)(x1) = S1^{-1}(λ1x1) = λ1(S1^{-1}x1) = λ1e1.
So we get:
S1^{-1}AS1 = ( λ1   0  )
             ( 0^t  A1 ),
with 0 a row vector of zero entries of size n − 1 and A1 an (n − 1) × (n − 1) symmetric matrix. We know by induction that there exists an (n − 1) × (n − 1) orthogonal matrix S2 such that S2^{-1}A1S2 = D′ with D′ an (n − 1) × (n − 1) diagonal matrix. Let
S2′ = ( 1    0  )
      ( 0^t  S2 );
since S2′ is also an orthogonal matrix, we get:
(S2′)^{-1} S1^{-1} A S1 S2′ = ( 1    0    ) (S1^{-1}AS1) ( 1    0  )
                              ( 0^t  S2^t )             ( 0^t  S2 )
                            = ( 1    0    ) ( λ1   0  ) ( 1    0  )
                              ( 0^t  S2^t ) ( 0^t  A1 ) ( 0^t  S2 )
                            = ( λ1   0           )
                              ( 0^t  S2^t A1 S2  )
                            = ( λ1   0  )
                              ( 0^t  D′ ).
Thus, if we put
P = S1S2′,   D = ( λ1   0  )
                 ( 0^t  D′ ),
we have proved (i). From the definition of diagonalizable matrices and the fact that a square matrix is diagonalizable if and only if it has an eigenbasis (a basis consisting of linearly independent eigenvectors), (ii) and (iii) immediately follow.
Theorem 2.1.3. Given a graph G with n linked pages, the sequence as defined in the previous paragraph,
z(0) = 1 ∈ R^2n,   z(k+1) = Mz(k) / ‖Mz(k)‖2,   k = 0, 1, . . . ,
converges when M has a unique dominant eigenvalue.
Proof. Since 1) M is diagonalizable as a symmetric matrix by Theorem 2.1.2 and 2) M has a unique dominant eigenvalue, it follows from the power method that the sequence will converge to a corresponding dominant eigenvector (h′, a′). This eigenvector contains the hub and authority scores.
We conclude with a nice corollary.
Corollary 2.1.4. The second power of the matrix M has the form
M² = ( BB^T   0    )
     ( 0      B^TB ),
and the normalized hub and authority scores are given by the dominant eigenvectors of BB^T and B^TB.
Proof. By the compact form given in equation 2.1, we see that h(k) ∝ (BB^T)^{k−1}B a(0) and a(k) ∝ (B^TB)^k a(0). Let a(0) be 1 ∈ Rn. From the previous theorem we also know that
lim_{k→∞} h(k) = h   and   lim_{k→∞} a(k) = a,
and from the previous proof we know that (h, a) is the dominant eigenvector of M. It follows immediately that h is the dominant eigenvector of BB^T and a is the dominant eigenvector of B^TB.
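Following Corollary 2.1.4, the scores can thus also be read off from the dominant eigenvectors of BB^T and B^TB directly; a sketch assuming NumPy (taking absolute values to fix the arbitrary sign of the returned eigenvectors):

import numpy as np

def hits_eig(B):
    """Hub and authority scores as dominant eigenvectors of BB^T and B^T B."""
    _, hvecs = np.linalg.eigh(B @ B.T)   # eigh: symmetric eigenproblem,
    _, avecs = np.linalg.eigh(B.T @ B)   #   eigenvalues in ascending order
    h = np.abs(hvecs[:, -1])             # last column = dominant eigenvector;
    a = np.abs(avecs[:, -1])             #   abs() fixes the arbitrary sign
    return h, a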
2.1.6 Examples
Searching for math professors at the VUB
Example 2.1.5. We conclude this section with a fictitious example of the HITS algorithm. Suppose you are looking for math professors vub with a text-based search engine and you get the following results:
• The website of the mathematics department of the VUB,
• The website of the faculty of science of the VUB,
• The websites of 4 math professors,
• The websites of 10 PhD students at the mathematics department of the VUB.
Let's take a look at the link structure of these web pages (remember that it is a fictitious example):
• The website of the mathematics department of the VUB links to the websites of all the 4 professors, the 10 PhD students and the faculty of science,
• The website of the faculty of science of the VUB links to the websites of all the 4 math professors and the mathematics department,
• The websites of the 4 math professors link to the website of the mathematics department and the faculty of science,
• The websites of the 10 PhD students link to the website of their promotor. One professor has 4 PhD students; the other 3 professors each have 2 PhD students.
We can now construct the graph Gσ (of course this graph is not completely made according to Algorithm 2) and we get the following adjacency matrix of Gσ, with the rows numbered as follows:
• Row 1: website of the mathematics department,
• Row 2: website of the faculty of science,
• Row 3: website of the professor with the 4 PhD students,
• Row 4, 5, 6: websites of the professors with the 2 PhD students,
• Row 7, 8, 9, 10: websites of the 4 PhD students of the professor on row 3,
• Row 11, 12: websites of the 2 PhD students of the professor on row 4,
• Row 13, 14: websites of the 2 PhD students of the professor on row 5,
• Row 15, 16: websites of the 2 PhD students of the professor on row 6.
This leads to:
B = ( 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 )
    ( 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 )
    ( 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 )
    ( 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 )
Intuitively, we expect that the professor with his 4 PhD students will have the largest authority score, immediately followed by the other 3 professors. The website of the mathematics department is clearly the best hub in this example and should get the largest hub score. Also the website of the faculty of science should get a high hub score.
We now apply the HITS method by calculating the dominant eigenvector of BB^T (this returns the hub scores) and the dominant eigenvector of B^TB (this returns the authority scores) with the power method (see 1.4.2). We get:
a = (0.1979, 0.3162, 0.3688, 0.3231, 0.3231, 0.3231, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029, 0.2029)^t
h = (0.8645, 0.3605, 0.1207, 0.1207, 0.1207, 0.1207, 0.0866, 0.0866, 0.0866, 0.0866, 0.0758, 0.0758, 0.0758, 0.0758, 0.0758, 0.0758)^t
We see that the websites of the 4 math professors are indeed the best authorities for the search query math professors vub and that the website of the mathematics department is an extremely good hub (this is very logical because it links to all the other relevant websites). The professor with his 4 PhD students would be ranked first in the search results (he has the highest authority score); the other professors would appear just underneath him. Obviously, a hub score of 0.8645 is so high that it would be quite exceptional in a graph containing a lot more websites (it is very unlikely that you find a website containing links to all the other pages/nodes in the graph). Nevertheless, we conclude that the HITS algorithm returns the results we intuitively wanted.
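This example can be reproduced with the hits sketch given earlier (assuming NumPy); the node numbering below follows the rows of B:

import numpy as np

n = 16
B = np.zeros((n, n))
dept, fac, profs = 0, 1, [2, 3, 4, 5]
phd_of = {2: [6, 7, 8, 9], 3: [10, 11], 4: [12, 13], 5: [14, 15]}

B[dept, 1:] = 1                     # department links to all other pages
B[fac, [dept] + profs] = 1          # faculty links to dept and professors
for p in profs:
    B[p, [dept, fac]] = 1           # professors link to dept and faculty
    for s in phd_of[p]:
        B[s, p] = 1                 # each PhD student links to his promotor

h, a = hits(B, k=100)
print(np.round(a, 4))  # largest entry: the professor with 4 PhD students
print(np.round(h, 4))  # largest entry: the mathematics department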
Predictors in the Eurovision Song Contest 2009-2015
The Eurovision Song Contest is an annual competition between countries whose public broadcaster is part of the EBU network. The contest is the biggest music competition in the world, reaching about 200 million viewers annually.
The contest consists of three shows: 2 semi-finals and 1 grand final. From each semi-final, 10 countries proceed to the grand final. Italy, Germany, Spain, the United Kingdom and France are always qualified for the grand final because they are the main funders of the event. The winner of the previous year also participates automatically in the final. Each country, including those who dropped out during the semi-finals, gives points during the voting of the grand final. The voting during the grand final takes place after all the countries have performed their song. Each country is called and awards 12 points to their favorite song, 10 points to their second favorite, and then points from 8 down to 1 to eight other songs. Countries cannot vote for themselves.
The voting system is in fact a positional voting system that is very similar to the Borda count method (see [? ] for a scientific explanation of Borda count): the list of points of a country represents the ranking of the 10 best countries in the voting of that country. So the points are values on an ordinal scale.
The complete voting procedure during the Eurovision Song Contest can be seen as a directed graph: all the participating countries are the nodes and the edges represent the
points between the countries (when country a assigns 3 points to country b, then there are 3 edges from a to b).
Let A be the adjacency matrix of the voting during a song contest (A will in fact just be a points table). If we take A as input for the HITS algorithm, we expect that the country with the highest authority score will be the winner of the competition. Actually we expect a lot more: when we order the countries based on their authority score, we expect that this ordering will be practically equal to the final ranking of the contest. This is based on the simple fact that Borda count just sums up points, and we only expect very small differences when the difference in points between two countries is low. These small differences are then caused by the algorithm. Remember that the HITS algorithm does not simply give a high authority score to nodes with a large indegree, but also takes the hub scores into account. We will see that the hub scores are low (compared to some authority scores) in this example, so their influence is limited.
What is very interesting now is the role that the hub scores play. These scores can be seen as a kind of 'predictive value' of a country: a country with a high hub score will have assigned its points in such a way that it is seen as a reliable source, meaning that the points of that country match well with the final result of the contest.
Let's take the Eurovision Song Contest 2014 as an example. The final results of the contest were (the complete result table can be found in Appendix B):
1. Austria (290 points)
2. The Netherlands (238 points)
3. Sweden (218 points)
4. Armenia (174 points)
5. Hungary (143 points)
6. Ukraine (113 points)
7. Russia (89 points)
8. Norway (88 points)
9. Denmark (74 points)
10. Spain (74 points)
11. Finland (72 points)
12. Romania (72 points)
13. Switzerland (64 points)
14. Poland (62 points)
15. Iceland (58 points)
16. Belarus (43 points)
17. United Kingdom (40 points)
18. Germany (39 points)
19. Montenegro (37 points)
20. Greece (35 points)
21. Italy (33 points)
22. Azerbaijan (33 points)
23. Malta (32 points)
24. San Marino (14 points)
25. Slovenia (9 points)
26. France (2 points)
Now we calculate the hub and authority scores based on the full scoreboard; the results are presented in Table 2.1.
Notice that ordering the countries by their authority scores indeed returns the final ranking of the contest. The countries with an authority score equal to 0 are the countries that didn't make it to the final and therefore couldn't receive any points. Also complete losers receiving the famous 'nul points' would have an authority score equal to 0, but that didn't occur during the final of 2014. When looking at the hub scores, it appears that Portugal 'predicted' the final ranking the best. When we look at the points awarded by Portugal during the final, this is indeed true:
• 12 points: Austria
• 10 points: The Netherlands
• 8 points: Sweden
• 7 points: Switzerland
• 6 points: Hungary
• 5 points: Denmark
• 4 points: Armenia
• 3 points: Norway
• 2 points: Russia
• 1 point: Romania
No less than 7 of the countries awarded points by Portugal achieved the final top 10, and the top 3 of Portugal is even equal to the final top 3! Russia, Romania and Switzerland are the 3 countries that Portugal awarded points to but that didn't achieve the top 10; still, they are placed 11th, 12th and 13th in the ranking by authority score. So it is absolutely not surprising that Portugal is the country with the highest hub score.
Note that countries who scored very well during the final usually have a moderate hub score: this is due to the fact that countries cannot vote for themselves. The average hub score is, in general, also lower than the average authority score; this is because countries can award points to only 10 countries, but (qualified) countries can receive points from every country except themselves. The 'predictive value' of a country is therefore limited to only 10 countries, while the voting produces a complete final ranking of 26 countries.
Of course, it is tempting to study the predictive value of countries over a period of several years. We opted for the period 2009-2015 because during these last seven editions the voting procedure remained unchanged: the points awarded by each country are based on 50% televoting and 50% jury vote. So we take the average of the hub scores of the last seven years
Participant          Authority Score
Austria              0.285110029
The Netherlands      0.235720093
Sweden               0.212949392
Armenia              0.156091783
Hungary              0.124453384
Ukraine              0.095410297
Norway               0.086504385
Denmark              0.074474911
Finland              0.070675128
Spain                0.068432379
Russia               0.065352465
Romania              0.062560768
Switzerland          0.055868228
Iceland              0.053973263
Poland               0.052246035
United Kingdom       0.037978889
Germany              0.029958317
Belarus              0.027945852
Malta                0.026911408
Italy                0.023817428
Montenegro           0.023425296
Azerbaijan           0.022613716
Greece               0.021816155
San Marino           0.009315756
Slovenia             0.005843162
France               0.002167387
Albania              0
Belgium              0
Estonia              0
FYR Macedonia        0
Georgia              0
Ireland              0
Israel               0
Latvia               0
Lithuania            0
Moldova              0
Portugal             0

Participant          Hub Score
Portugal             0.173610946
Finland              0.172574335
Belgium              0.169584694
Latvia               0.166295198
Spain                0.165764986
Hungary              0.165699760
Iceland              0.165062483
Estonia              0.162056401
Denmark              0.160549539
Lithuania            0.159920120
Greece               0.159278464
Norway               0.156676045
Slovenia             0.156533823
Sweden               0.155725338
Romania              0.155212094
France               0.153993935
Switzerland          0.153864143
Israel               0.153059334
Ireland              0.147584917
The Netherlands      0.145211066
United Kingdom       0.144342619
Austria              0.137956312
Germany              0.136869443
Ukraine              0.135769452
Italy                0.121638547
Malta                0.119063784
Georgia              0.117423087
Moldova              0.115874904
Poland               0.112296570
FYR Macedonia        0.107531180
Russia               0.102149330
San Marino           0.098204303
Montenegro           0.097193444
Albania              0.091897617
Belarus              0.089227164
Azerbaijan           0.068005452
Armenia              0.050899422

Table 2.1: The authority and hub scores of the countries during the Final of the Eurovision Song Contest 2014
of each country. The result is presented in Table 2.2. The complete results of each year can be found in Appendix B.
We have to put a condition on the results in Table 2.2: countries should have participated 6 out of 7 times; more than 1 absence would possibly make the average misleading. In that
case, the best predictors are:
1. Hungary
2. Cyprus
3. The Netherlands
4. Belgium
5. Spain
And the worst predictors are:
1. Armenia
2. Azerbaijan
3. FYR Macedonia
4. Georgia
5. Albania
We see that the best predictors are indeed not the most successful countries at the contest itself: Hungary and Cyprus never reached the top 10 in the last seven years, The Netherlands only reached the final twice, Belgium qualified three times. Spain is, as a main funder, directly qualified for the grand final, but was placed lower than 20th in 5 out of 7 editions. The bottom is also not very surprising: in 2010, 2012 and 2013 Albania did not receive enough televotes, so the jury decided their points (see [? ] for more information), making their judging process vary from one year to another. Azerbaijan and Armenia scored very high in the last five years (e.g. Azerbaijan reached the top 5 every year except 2014 and 2015, won in 2011 and became second in 2013), and due to their dispute about the Nagorno-Karabakh region, they never exchanged a single point during the last five years. Cultural differences are probably an explanation as well: Georgia, Azerbaijan and Armenia are located in the Caucasus, a remote corner of Europe with many Asian influences (the region is sometimes referred to as Eurasia).
Table 2.2: The average of the hub scores between 2009-2015

#    Participant             2015      2014      2013      2012      2011      2010      2009      Average
1.   Australia               0.157516  -         -         -         -         -         -         0.157516
2.   Hungary                 0.146121  0.165700  0.150320  0.151291  0.130650  -         0.158260  0.150390
3.   Cyprus                  0.145637  -         0.161413  0.140385  0.145691  0.144175  0.146567  0.147311
4.   The Netherlands         0.155502  0.145211  0.133926  0.147638  0.137637  0.139085  0.163395  0.146056
5.   Belgium                 0.145478  0.169585  0.159189  0.153462  0.118487  0.147362  0.122234  0.145114
6.   Spain                   0.158995  0.165765  0.153609  0.133174  0.114255  0.162626  0.122739  0.144452
7.   Israel                  0.146691  0.153059  0.157744  0.139948  0.131677  0.118970  0.157031  0.143589
8.   Latvia                  0.149527  0.166295  0.134627  0.138436  0.121329  0.142837  0.143788  0.142406
9.   Slovakia                -         -         -         0.138399  0.141130  0.145328  0.142350  0.141802
10.  Estonia                 0.145329  0.162056  0.141096  0.131495  0.143254  0.136098  0.131401  0.141533
11.  Lithuania               0.118747  0.159920  0.141552  0.144121  0.127425  0.144634  0.152601  0.141286
12.  Denmark                 0.160507  0.160550  0.112496  0.139704  0.104782  0.144174  0.156365  0.139797
13.  Malta                   0.145890  0.119064  0.146482  0.122804  0.162192  0.138073  0.139682  0.139170
14.  Poland                  0.159925  0.112297  -         -         0.127825  0.159433  0.135973  0.139090
15.  Slovenia                0.135649  0.156534  0.141375  0.148665  0.118682  0.137277  0.133178  0.138766
16.  Iceland                 0.145817  0.165062  0.149906  0.129060  0.130358  0.137492  0.113493  0.138741
17.  Austria                 0.150454  0.137956  0.127058  0.153796  0.129429  -         0.132217  0.138485
18.  Germany                 0.157865  0.136869  0.126650  0.157167  0.103942  0.115909  0.155134  0.136220
19.  Croatia                 -         -         0.169310  0.132023  0.129980  0.114014  0.134540  0.135974
20.  Romania                 0.153499  0.155212  0.141742  0.119829  0.131675  0.128084  0.119466  0.135644
21.  France                  0.142272  0.153994  0.135187  0.151435  0.137268  0.118194  0.109034  0.135341
22.  Greece                  0.125544  0.159278  0.144943  0.124054  0.145912  0.095629  0.148054  0.134774
23.  Finland                 0.149564  0.172574  0.108008  0.141472  0.101891  0.133330  0.132773  0.134231
24.  Ireland                 0.141492  0.147585  0.137790  0.132864  0.103704  0.147411  0.128126  0.134139
25.  United Kingdom          0.153125  0.144343  0.143615  0.121922  0.096917  0.136172  0.133630  0.132818
26.  Russia                  0.128412  0.102149  0.130277  0.129907  0.143360  0.141861  0.152992  0.132708
27.  Sweden                  0.130248  0.155725  0.122816  0.099063  0.106864  0.153858  0.160329  0.132701
28.  Bulgaria                -         -         0.136342  0.150772  0.110776  0.136208  0.122794  0.131378
29.  Norway                  0.144331  0.156676  0.098933  0.154850  0.099012  0.155455  0.107546  0.130972
30.  Bosnia & Herzegovina    -         -         -         0.135192  0.127186  0.139054  0.119158  0.130147
31.  Portugal                0.144985  0.173611  -         0.116836  0.130723  0.107674  0.105899  0.129955
32.  Ukraine                 -         0.135769  0.106438  0.123998  0.120992  0.139278  0.142826  0.128217
33.  Belarus                 0.147342  0.089227  0.143351  0.120914  0.138934  0.104108  0.149588  0.127638
34.  Serbia                  0.128584  -         0.165969  0.113907  0.095025  0.132455  0.123080  0.126503
35.  Switzerland             0.146739  0.153864  0.117585  0.125128  0.092621  0.127077  0.117733  0.125821
36.  Moldova                 0.130904  0.115875  0.139519  0.119439  0.126125  0.119696  0.119161  0.124388
37.  Turkey                  -         -         -         0.113535  0.145608  0.137704  0.094715  0.122890
38.  San Marino              0.131533  0.098204  0.092521  0.130841  0.158559  -         -         0.122332
39.  Italy                   0.146217  0.121639  0.134389  0.109725  0.098845  -         -         0.122163
40.  Montenegro              0.085579  0.097193  0.156165  0.137301  -         -         0.126144  0.120476
41.  Albania                 0.124384  0.091898  0.102282  0.098185  0.142108  0.140538  0.141493  0.120127
42.  Georgia                 0.111867  0.117423  0.147386  0.123067  0.132744  0.086281  -         0.119795
43.  FYR Macedonia           0.085414  0.107531  0.138521  0.132612  0.116197  0.127626  0.124228  0.118876
44.  Czech Republic          0.130862  -         -         -         -         -         0.103917  0.117390
45.  Azerbaijan              0.125808  0.068005  0.109553  0.113396  0.119173  0.120818  0.109357  0.109444
46.  Armenia                 0.119244  0.050899  0.125792  -         0.128531  0.101509  0.109782  0.105960
2.1.7 Final reflection
The HITS algorithm is one of the few algorithms that has the ability to rank pages according to a specific search query. Also, the computational cost of the HITS algorithm, which equals the cost of the power method (see 1.4.2), is not excessive and is feasible for most servers. The results of the HITS algorithm for popular queries will also be cached by most search engines, which reduces the computational cost even more, because the saved results can be served directly to the user without any new calculations.
The biggest disadvantage of the HITS algorithm is that it suffers from topic drift: the graph G′σ could contain nodes which have high authority scores for the query but are completely irrelevant. E.g. Facebook is nowadays a universally popular website: almost every website contains a ‘like’ or ‘share’ button linking to Facebook, and Facebook itself contains tons of posts linking to other webpages. This means that Facebook has a high chance to appear in almost any G′σ and receive a high authority score, because the original HITS algorithm as presented here cannot detect such ‘universally popular’ websites. The same goes for other social media websites and some advertisements.
Nowadays, we know that Ask.com uses this algorithm. Most search engines are very secretive about their search algorithm (e.g. Google), both to protect their profits and to avoid cheating by webmasters. Still, chances are that other search engines use some variant of the algorithm as well, in combination with a lot of other procedures.
2.2 Node similarity
This section provides a detailed overview of the paper ‘A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching’ [? ] of V. D. Blondel and others. The paper generalizes the HITS algorithm, leading to the concept of (node) similarity on graphs. In the following chapters, we will often call this method the ‘method of Blondel’. This method is explained in detail and will serve as a framework for the following sections and chapters. The method allows a mathematically rigorous approach, leading to a very important convergence theorem. Thanks to its generality, this convergence theorem is extremely powerful and will be used to prove convergence for all the presented algorithms in the next sections and chapters. This theorem also solves the issue from the previous section where the matrix M has to have a unique dominant eigenvalue. Once we have constructed the algorithm, proved its convergence and introduced the compact form, we also look at some special cases of node similarity between graphs.
Introduction
In this introduction, we explain the main idea behind the method of Blondel. Because we are focussing on the intuitive ideas here, we don't prove any theorems in this paragraph yet. The theoretical results are presented in the following paragraphs. Notice that all the graphs in this section are directed graphs, although this is not always explicitly stated. However, the method works for undirected graphs too, often leading to much easier formulas, since the graphs have symmetric adjacency matrices in that case. We will give explicit attention to undirected graphs in the section about ‘Special cases’ (see 2.2.4).
The method of Blondel is a pure generalization of the HITS algorithm. Remember from the previous section that we constructed a graph G′σ and calculated hub and authority scores for each vertex. Well, the authors of [? ] found a way to replace these hub and authority scores by just any other graph, allowing us to compare G′σ to any other graph H. We will call the graph that replaces the hub and authority scores the structure graph.
How does this work? For G′σ, the authority score of vertex vj of G′σ can be thought of as a score between vj and the vertex denoted as ‘authority’ of the graph:
hub → authority
and, conversely, the hub score of vertex vj of G′σ can be thought of as a score between vj and the vertex denoted as ‘hub’. We can replace G′σ with any other graph G. We call the hub-authority graph a structure graph, and we already know the resulting iterative method from the previous section. We will call these scores similarity scores.
The central question is now: which mutually reinforcing relation do we get when using another structure graph, different from the hub-authority structure graph? If we take, for example, as structure graph a path graph with three vertices v1, v2, v3:
v1 → v2 → v3
Let G(W,→) be a graph. With each vertex wi of G we now associate three scores zi1, zi2 and zi3, one for each vertex of the structure graph. We initialize these scores with a positive value and then update them according to the mutually reinforcing relation:
z_{i1} := ∑_{j:(wi,wj)∈→} z_{j2},
z_{i2} := ∑_{j:(wj,wi)∈→} z_{j1} + ∑_{j:(wi,wj)∈→} z_{j3},
z_{i3} := ∑_{j:(wj,wi)∈→} z_{j2},
or, in matrix form (z_j denotes the column vector with entries z_{ij}, B is the adjacency matrix of graph G),

( z1 )^(k+1)   ( 0    B   0 ) ( z1 )^(k)
( z2 )       = ( B^T  0   B ) ( z2 )
( z3 )         ( 0    B^T 0 ) ( z3 )
which we, again, can denote by
z(k+1) = Mz(k). (2.3)
The principle is now exactly the same as in the previous example with hubs and authorities. The matrix M is symmetric and nonnegative, and again the result is the limit of the normalized vector sequence:
s^(0) = z^(0) > 0,    s^(k+1) = Ms^(k) / ‖Ms^(k)‖2,    k = 0, 1, . . .    (2.4)
Remember that the HITS algorithm assumed that M has a unique dominant eigenvalue, but we don't want to make this assumption in this section, because we want a concept that can be applied to all kinds of directed graphs. We will see that without this assumption, the sequence (2.4) does not always converge but oscillates between the limits:

s_even = lim_{k→∞} s^(2k)    and    s_odd = lim_{k→∞} s^(2k+1)
The limit vectors s_even and s_odd do in general depend on the initial vector z^(0). We will prove later on that the vector z^(0) = J, with J a matrix of ones, is a good choice.
The extremal limit s_even(J) will be defined as the similarity matrix. The element s_{ij} is called the similarity score between vertex wi of G and vertex vj of the structure graph.
We now give a numerical example.
Example 2.2.1. Take as structure graph again the path graph with three vertices v1, v2, v3:
v1 → v2 → v3
Let G (W,→) be the following graph:
[Figure: a directed graph on the vertices w1, w2, w3, w4, w5; its edges are given by the adjacency matrix B below.]
Then the adjacency matrix B is:
B =
( 0 1 1 0 0 )
( 0 0 1 1 1 )
( 0 0 1 1 0 )
( 0 0 0 0 0 )
( 0 0 0 0 0 )
By using the described mutually reinforcing updating iteration we get the following similarity matrix (a numerical algorithm to calculate this is presented later on in this section):
S =
( 0.3557 0.1265 0      )
( 0.3102 0.3451 0.0557 )
( 0.2732 0.4619 0.4115 )
( 0      0.1579 0.3557 )
( 0      0.0840 0.1521 )
The similarity score of w4 with v2 of the structure graph is equal to 0.1579.
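To make the iteration concrete, here is a minimal Matlab sketch of the updating iteration for this example; it is not the full Algorithm 5 presented later, just the bare normalized iteration (the number of iterations, 100, is an arbitrary even cutoff):

A = [0 1 0; 0 0 1; 0 0 0];                           % structure graph v1 -> v2 -> v3
B = [0 1 1 0 0; 0 0 1 1 1; 0 0 1 1 0; zeros(2,5)];   % adjacency matrix of G(W,->) above
Z = ones(5, 3);                                      % start matrix J
for k = 1:100                                        % an even number of iterations
    Z = B*Z*A' + B'*Z*A;                             % mutually reinforcing update
    Z = Z / norm(Z, 'fro');                          % normalize (Frobenius norm)
end
Z                                                    % approximates the similarity matrix S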
We now construct the general case. Take two (directed) graphs G = (U,→) and H = (V,→′) with nG and nH the orders of the graphs. We think of G as the structure graph (such as the graph hub → authority and the path graph 1 → 2 → 3 in the previous paragraphs). We get the following mutually reinforcing updating iteration, with updating equation:
z^(k+1)_{ij} := ∑_{p:(vp,vi)∈→′, q:(uq,uj)∈→} z^(k)_{pq} + ∑_{p:(vi,vp)∈→′, q:(uj,uq)∈→} z^(k)_{pq}    (2.5)
Consider the product graph G × H (see Definition 1.5.17). The above updating equation is equivalent to replacing the scores of all vertices of the product graph by the sum of the scores of the vertices linked by an incoming or outgoing edge.
Equation (2.5) can be rewritten in a more compact matrix form. Let Z^(k) be the nH × nG matrix of entries z_{ij} at iteration k, and let A and B be the adjacency matrices of G and H. Then the updating equations can be written as:

Z^(k+1) = BZ^(k)A^T + B^T Z^(k)A,    k = 0, 1, . . .    (2.6)
We will prove that the normalized even and odd iterates of this updating equation converge when using as start matrix J, the matrix of ones. This normalized result will be denoted by S, the similarity matrix. This compact form also shows the true meaning of similarity between nodes: the similarity score between two vertices is large when the similarity scores of the adjacent vertices are large. We will dive deeper into the meaning of the algorithm in Chapter 3. The following example shows a calculated similarity matrix of two directed graphs.
Example 2.2.2. Let H(V,→) be the following graph:

[Figure: a directed graph on the vertices v1, v2, v3, v4.]
Let G(V′,→′) be the following graph:

[Figure: a directed graph on the vertices v′1, v′2, v′3, v′4, v′5, v′6.]
This results in the following similarity matrix (a numerical algorithm to calculate this matrix is introduced later in this section):
S =
( 0.2636 0.2786 0.2723 0.1289 )
( 0.1286 0.1286 0.0624 0.1268 )
( 0.2904 0.3115 0.2825 0.1667 )
( 0.1540 0.1701 0.2462 0      )
( 0.0634 0.0759 0.1018 0      )
( 0.3038 0.3011 0.2532 0.1999 )
We see, for example, that vertex v2 of H is most similar to vertex v′3 of G, because the similarity score s_{32} is the highest among all the similarity scores of v2.
2.2.1 Convergence theorem
In the introduction, we already mentioned that the sequence in Equation (2.4) converges for even and odd iterates. We will prove this at the end of this subsection. But before we arrive there, we first need some results on the eigenvectors and eigenvalues of nonnegative matrices. The Perron-Frobenius theorem applies only to nonnegative, irreducible matrices, but since one is confronted in practice with nonnegative matrices that are not necessarily irreducible, we extend the Perron-Frobenius theorem and see what remains without this assumption. We will prove that the spectral radius ρ(M) of a nonnegative matrix M is an eigenvalue of M, also called in this case the Perron root (see Definition 1.2.11). Moreover, there exists an associated nonnegative eigenvector x ≥ 0 (x ≠ 0), the Perron vector, such that Mx = ρx. We will also investigate more specific results when M is not only nonnegative, but also symmetric.
Lemma 2.2.3. Let A, B be n×n-matrices. If |A| ≤ B, then ρ(A) ≤ ρ(|A|) ≤ ρ(B). (See Definition 1.1.8 for the definition of |·|.)
Proof. For every m = 1, 2, . . . we have

|A^m| ≤ |A|^m ≤ B^m

by using some trivial properties of the absolute value function. Let ‖·‖2 be the matrix 2-norm induced by the Euclidean vector norm: for any matrix M, we have ‖M‖2 = max_{‖x‖2=1} ‖Mx‖2. For this matrix norm it is easy to see that if |M| ≤ |M′| (see Definition 1.1.5) it follows that ‖M‖2 ≤ ‖M′‖2, and also that ‖M‖2 = ‖|M|‖2. We get:

‖A^m‖2 ≤ ‖|A|^m‖2 ≤ ‖B^m‖2

and

‖A^m‖2^(1/m) ≤ ‖|A|^m‖2^(1/m) ≤ ‖B^m‖2^(1/m)

for all m = 1, 2, . . . If we now let m → ∞ and apply the spectral radius formula from Theorem 1.3.12, we get:

ρ(A) ≤ ρ(|A|) ≤ ρ(B).
Theorem 2.2.4. If A ≥ 0 is an n×n-matrix, then ρ(A) is an eigenvalue of A and there is a nonnegative vector x ≥ 0, x ≠ 0, such that Ax = ρ(A)x.
Proof. For any ε > 0 define A(ε) = (a_{ij} + ε) > 0, so in A(ε) we add ε to each entry of A. Denote by x(ε) the Perron vector of A(ε); this exists by the Perron-Frobenius theorem, as A(ε) is a strictly positive matrix (see Theorem 1.2.10), so x(ε) > 0 and ∑_{i=1}^n x(ε)_i = 1. Since the set of vectors {x(ε) : ε > 0} is contained in the compact set {x ∈ R^n : ‖x‖1 ≤ 1}, we know from the Bolzano-Weierstrass theorem (see Theorem 2.1 in [? ]) that there is a monotone decreasing sequence ε1 > ε2 > . . . with lim_{k→∞} εk = 0 such that lim_{k→∞} x(εk) = x exists. Since x(εk) > 0 for all k = 1, 2, . . . , it must be that x = lim_{k→∞} x(εk) ≥ 0; x = 0 is impossible because:
∑_{i=1}^n x_i = lim_{k→∞} ∑_{i=1}^n x(εk)_i = 1
By Lemma 2.2.3, ρ(A(εk)) ≥ ρ(A(ε_{k+1})) ≥ · · · ≥ ρ(A) for all k = 1, 2, . . . , so the sequence {ρ(A(εk))}_{k=1,2,...} is a monotone decreasing sequence. Thus, ρ = lim_{k→∞} ρ(A(εk)) exists and ρ ≥ ρ(A). From the fact that
Ax = lim_{k→∞} A(εk)x(εk)
   = lim_{k→∞} ρ(A(εk))x(εk)
   = lim_{k→∞} ρ(A(εk)) · lim_{k→∞} x(εk)
   = ρx,
and the fact that x ≠ 0, we conclude that ρ is an eigenvalue of A. But then ρ ≤ ρ(A), so it must be that ρ = ρ(A).
Now that we know that any nonnegative matrix M has its spectral radius as an eigenvalue and that there exists an associated nonnegative eigenvector, we will see if we can get more specific results when handling nonnegative, symmetric matrices. First we introduce the orthogonal projection.
Definition 2.2.5. The projection of a vector x on a vector y is defined as

proj_y x = (⟨y,x⟩ / ‖y‖²) y,

where ⟨x,y⟩ = ∑_{i=1}^n x_i y_i with x, y ∈ R^n is the standard inner product of two vectors in R^n. The projection of x on a subspace Y with orthonormal basis {y1, y2, . . . , ym} is defined as the sum of the projections on the basis vectors:

proj_Y x = ∑_{i=1}^m ⟨y_i, x⟩ y_i

(the factor 1/‖y_i‖² disappears because the basis is orthonormal).
Before we can combine the above results into a version of the Perron-Frobenius theorem for nonnegative, symmetric matrices, we need an explicit formula for the orthogonal projector on a subspace.
Theorem 2.2.6. Let V be a linear subspace of R^n with orthonormal basis {v1, v2, . . . , vm}. Arrange the column vectors v_i in a matrix V and let x ∈ R^n. The orthogonal projection of x on V is then given by:

Πx = V V^T x;

the matrix Π = V V^T is the orthogonal projector. Projectors have the property that Π² = Π.
Proof. We use the connection between transposes and the standard inner product to find the matrix of the orthogonal projection on the subspace V:

proj_V x = ∑_{i=1}^m ⟨v_i, x⟩ v_i = ∑_{i=1}^m v_i ⟨v_i, x⟩
Remember that ⟨x,y⟩ = x^T y, so:

= ∑_{i=1}^m v_i (v_i^T x) = ∑_{i=1}^m (v_i v_i^T) x = V V^T x
Proving that Π² = Π is also easy; remember that the columns of V are orthonormal, so V^T V = I_m:

Π² = (V V^T)²
   = V V^T V V^T
   = V (V^T V) V^T
   = V I_m V^T
   = V V^T
   = Π
Lemma 2.2.7. Let Π be an orthogonal projector, then:

Π^T = Π

Proof. Π^T = (V V^T)^T = (V^T)^T V^T = V V^T = Π
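Both properties are easy to confirm numerically; the following Matlab sketch builds a projector on a hypothetical random two-dimensional subspace of R^5 and checks Π² = Π and Π^T = Π:

[V, ~] = qr(rand(5, 2), 0);    % a 5x2 matrix with orthonormal columns
P = V*V';                      % the orthogonal projector on the column space of V
norm(P*P - P, 'fro')           % ~ 1e-16: P^2 = P (Theorem 2.2.6)
norm(P' - P, 'fro')            % 0: P^T = P (Lemma 2.2.7)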
We already introduced the Jordan normal form in Theorem 1.3.10, but we also need the following theorem. The proof of this theorem is highly technical and falls beyond the scope of this master thesis. The separate results can be found in Theorems 1.3.15, 1.3.16 and 1.3.17 in [? ].
Theorem 2.2.8. Let A be an n×n-matrix with s distinct eigenvalues λ1, . . . , λs. Let each λi have algebraic multiplicity mi and geometric multiplicity µi. The matrix A is similar to a Jordan matrix

J =
( J1        )
(   ⋱       )
(        Jµ )

where
where
1. µ = µ1 + µ2 + · · ·+ µs,
2. For each λi, the number of Jordan blocks in J with value λi is equal to µi,
3. λi appears on the diagonal of J exactly mi times.
Further, the matrix J is unique, up to re-ordering the Jordan blocks on the diagonal.
We are now ready to generalize the Perron-Frobenius theorem to nonnegative, symmetric matrices.
Theorem 2.2.9. Let M be a symmetric nonnegative matrix with spectral radius ρ. Then the algebraic and geometric multiplicity of the Perron root ρ are equal; there is a nonnegative matrix X whose columns span the eigenspace associated with the Perron root; and the elements of the orthogonal projector Π on the vector space associated with the Perron root of M are all nonnegative.
Proof. We know that any symmetric nonnegative matrix M is similar to a Jordan canonical form J. So from the previous Theorem 2.2.8, we know that ρ will appear exactly m_ρ (= the algebraic multiplicity of ρ) times on the diagonal, and we also know that there will be exactly µ_ρ (= the geometric multiplicity of ρ) Jordan blocks with value ρ on their diagonal. But from Theorem 2.1.2 we already know that J is in fact a diagonal matrix (all Jordan blocks have size 1×1) with all eigenvalues on the diagonal, so m_ρ = µ_ρ. To construct the nonnegative matrix X, we again use Theorem 2.1.2: P^{-1}MP = D, and we know from that theorem that the column vectors of P are the eigenvectors corresponding to the eigenvalues of M. Looking at the proof of Theorem 2.1.2, we know that the eigenvector x_i corresponding to the eigenvalue d_{ii} (the i-th diagonal element of D) can be found in the i-th column of P. We know that these eigenvectors are linearly independent because of the definition of diagonalizable matrices (an n×n matrix A is diagonalizable if and only if it has an eigenbasis). So if we select all the columns of P corresponding to ρ, we obtain a basis of the eigenspace of the Perron root.
To see that the resulting matrix X is nonnegative, notice that the eigenspace of ρ can also be found by taking the Perron vectors of the (1×1) Jordan blocks J_i that have ρ on the diagonal, appropriately padded with zeros. All these blocks J_i are irreducible and nonnegative, and it follows from the Perron-Frobenius theorem that the corresponding eigenvectors are nonnegative.
To see that the orthogonal projector Π associated with the eigenspace of ρ is nonnegative, notice that normalizing the vectors in X gives an orthonormal basis: by Theorem 2.1.2 we know that P is orthogonal, and we picked some columns of P to form our eigenbasis of ρ. Normalizing doesn't affect the sign of the vectors, so the normalized X′ is also nonnegative, and then Π = X′X′^T will also be nonnegative.
Theorem 2.2.10. Let M be an n×n symmetric, nonnegative matrix of spectral radius ρ. Let s^(0) > 0 and consider the sequence

s^(k+1) = Ms^(k) / ‖Ms^(k)‖2,    k = 0, 1, . . .

Two convergence cases can occur, depending on whether or not −ρ is an eigenvalue of M. When −ρ is not an eigenvalue of M, then the sequence s^(k) simply converges to Πs^(0)/‖Πs^(0)‖2, where Π is the orthogonal projector on the eigenspace associated with the Perron root ρ. When −ρ is an eigenvalue of M, then the subsequences s^(2k) and s^(2k+1) converge to the limits

s_even(s^(0)) = lim_{k→∞} s^(2k) = Πs^(0)/‖Πs^(0)‖2    and    s_odd(s^(0)) = lim_{k→∞} s^(2k+1) = ΠMs^(0)/‖ΠMs^(0)‖2,

where Π is the orthogonal projector on the sum of the invariant subspaces associated with ρ and −ρ. In both cases the set of all possible limits is given by:

{s_even(s^(0)), s_odd(s^(0)) : s^(0) > 0} = {Πs/‖Πs‖2 : s > 0}

and the vector s_even(1) is the unique vector of largest possible Manhattan norm in this set.
Proof. We prove only the case where −ρ is an eigenvalue; the other case is an easy adaptation. Denote the invariant subspaces of M corresponding to ρ, −ρ and to the rest of the spectrum by V_ρ, V_{−ρ} and V_µ, respectively. From the previous theorem, we know that V_ρ and V_{−ρ} are certainly nontrivial, because ρ and −ρ have at least multiplicity 1, so each eigenspace contains at least one nonzero vector. We also suppose that V_µ is nontrivial (if V_µ were trivial, the proof would only get easier). Let V_ρ be the n×u-matrix whose columns span V_ρ, V_{−ρ} the n×v-matrix spanning V_{−ρ}, and V_µ the n×w-matrix spanning V_µ. We now have:

MV_ρ = ρV_ρ,    MV_{−ρ} = −ρV_{−ρ},

and for M we can write:

MV_µ = V_µ M_µ,

where M_µ is a w×w-matrix with spectral radius µ strictly less than ρ, because V_µ is nontrivial by assumption.
Remember from Theorem 2.1.2 that P^{-1}MP = D, with D a diagonal matrix; this can be rewritten as M = PDP^{-1}, or M = PDP^T since P is an orthogonal matrix. In this case we can rewrite this as (this is the so-called eigendecomposition for symmetric matrices):

M = ( V_ρ V_{−ρ} V_µ ) diag( ρI, −ρI, M_µ ) ( V_ρ V_{−ρ} V_µ )^T = ρV_ρV_ρ^T − ρV_{−ρ}V_{−ρ}^T + V_µM_µV_µ^T
Let A = ( V_ρ V_{−ρ} V_µ ); remember that A is orthogonal by Theorem 2.1.2. It follows that:

M² = A diag( ρI, −ρI, M_µ ) A^T A diag( ρI, −ρI, M_µ ) A^T
   = A diag( ρI, −ρI, M_µ ) A^{-1} A diag( ρI, −ρI, M_µ ) A^T
   = A diag( ρ²I, (−ρ)²I, M_µ² ) A^T
   = ρ²V_ρV_ρ^T + ρ²V_{−ρ}V_{−ρ}^T + V_µM_µ²V_µ^T
   = ρ²Π + V_µM_µ²V_µ^T,

where

Π = V_ρV_ρ^T + V_{−ρ}V_{−ρ}^T

is the orthogonal projector on the invariant subspace V_ρ ⊕ V_{−ρ} of M² corresponding to ρ². Analogously, we also have:

M^(2k) = ρ^(2k)Π + V_µM_µ^(2k)V_µ^T,    (2.7)
and since M_µ has spectral radius µ, strictly below ρ, we multiply (2.7) by s^(0):

M^(2k)s^(0) = ρ^(2k)Πs^(0) + V_µM_µ^(2k)V_µ^T s^(0)
⇔ M^(2k)s^(0) / ‖M^(2k)s^(0)‖2 = ( ρ^(2k)Πs^(0) + V_µM_µ^(2k)V_µ^T s^(0) ) / ‖ρ^(2k)Πs^(0) + V_µM_µ^(2k)V_µ^T s^(0)‖2
⇔ s^(2k) = ( Πs^(0) + ρ^(−2k)V_µM_µ^(2k)V_µ^T s^(0) ) / ‖Πs^(0) + ρ^(−2k)V_µM_µ^(2k)V_µ^T s^(0)‖2
⇔ s^(2k) = Πs^(0)/‖Πs^(0)‖2 + O((µ/ρ)^(2k))    (2.8)
⇔ lim_{k→∞} s^(2k) = Πs^(0)/‖Πs^(0)‖2    (2.9)

Analogously, when multiplying (2.7) by Ms^(0) we have:

lim_{k→∞} s^(2k+1) = ΠMs^(0)/‖ΠMs^(0)‖2    (2.10)
The expressions (2.9) and (2.10) bring up the question whether ‖Πs^(0)‖2 and ‖ΠMs^(0)‖2 are always nonzero. From Definition 1.3.2 we know that the squares of these norms can be calculated as:

(Πs^(0))^T Πs^(0)    and    (ΠMs^(0))^T (ΠMs^(0));

since Π² = Π and Π^T = Π, this equals:

s^(0)T Πs^(0)    and    s^(0)T MΠMs^(0).

These are both nonzero, since s^(0) > 0 and we know from the previous Theorem 2.2.9 that both Π and MΠM are nonnegative and nonzero.
From the fact that M is nonnegative and the formulas for s_even(s^(0)) and s_odd(s^(0)), we conclude that both limits lie in {Πs/‖Πs‖2 : s > 0}. We now show that every element s̄ ∈ {Πs/‖Πs‖2 : s > 0} can be obtained as s_even(s^(0)) for some s^(0) > 0. Since the entries of Π are nonnegative, so are those of s̄. This vector may have some zero entries. From s̄ we construct s^(0) by adding ε > 0 to all the zero entries of s̄. For ε → 0, the vector s^(0) − s̄ is clearly orthogonal to V_ρ ⊕ V_{−ρ} and will vanish in the iteration of M². Thus we have s_even(s^(0)) = s̄ for s^(0) > 0, proving our statement.
We now prove the last statement. We actually want to prove that:

‖ Π1/‖Π1‖2 ‖1 ≥ ‖ Πs^(0)/‖Πs^(0)‖2 ‖1    for all s^(0) > 0.

To prove this we rewrite both sides. Remember that Π is nonnegative and Π² = Π; with the norm properties we get:

‖ Π1/‖Π1‖2 ‖1 = ‖Π1‖1 / ‖Π1‖2
             = ‖Π1‖1 / √(1^TΠ²1)
             = √(1^TΠ²1) · ‖Π1‖1 / (1^TΠ²1)
             = √(1^TΠ²1) · ( (Π1)_1 + (Π1)_2 + · · · + (Π1)_n ) / (1^TΠ1)
             = √(1^TΠ²1),

where we used that Π1 is nonnegative, so ‖Π1‖1 = |(Π1)_1| + · · · + |(Π1)_n| = (Π1)_1 + · · · + (Π1)_n = 1^TΠ1, and that 1^TΠ²1 = 1^TΠ1 since Π² = Π. Also:

‖ Πs^(0)/‖Πs^(0)‖2 ‖1 = ‖Πs^(0)‖1 / ‖Πs^(0)‖2
                      = ( (Πs^(0))_1 + (Πs^(0))_2 + · · · + (Πs^(0))_n ) / √(s^(0)TΠ²s^(0))
                      = 1^TΠs^(0) / √(s^(0)TΠ²s^(0))
                      = 1^TΠ²s^(0) / √(s^(0)TΠ²s^(0)).

We apply the Cauchy-Schwarz inequality to Π1 and Πs^(0); remember that Π^T = Π and Π² = Π, and that the matrix Π and all vectors are nonnegative:

|⟨Π1, Πs^(0)⟩| ≤ ‖Π1‖2 · ‖Πs^(0)‖2
⇔ |1^TΠ^TΠs^(0)| ≤ √(1^TΠ²1) · √(s^(0)TΠ²s^(0))
⇔ |1^TΠ²s^(0)| ≤ √(1^TΠ²1) · √(s^(0)TΠ²s^(0))
⇔ 1^TΠ²s^(0) / √(s^(0)TΠ²s^(0)) ≤ √(1^TΠ²1),

where the absolute value can be dropped in the last step, since Πs^(0) and Π1 are both real and nonnegative. This is exactly the desired inequality.
2.2.2 Similarity matrices
We now come to the formal definition of the similarity matrix of two graphs G = (U,→) and H = (V,→′), by proving that the mutually reinforcing relation is given by (A and B are the adjacency matrices of G and H):

Z^(k+1) = BZ^(k)A^T + B^T Z^(k)A,    k = 0, 1, . . . ,    (2.11)
which we already introduced in (2.6). To prove this relation, we take a detour via a nice property of the Kronecker product and the vec-operator. The vec-operator is a convenient way of stacking the columns of a matrix left to right. With this operator we can rewrite (2.11) as z^(k+1) = Mz^(k), with M equal to a sum of Kronecker products. This will allow us to apply Theorem 2.2.10.
Definition 2.2.11. With each m×n-matrix A = [a_{ij}] we can associate the vector vec(A) ∈ R^{mn} defined by:

vec(A) = ( a_{11} . . . a_{m1} a_{12} . . . a_{m2} . . . a_{1n} . . . a_{mn} )^T

Definition 2.2.12. The Kronecker product (or tensor product) of an m×n-matrix A = (a_{ij}) and a p×q-matrix B = (b_{ij}) is denoted by A⊗B and is defined by the matrix:

A⊗B =
( a_{11}B . . . a_{1n}B )
(   ⋮     ⋱      ⋮     )
( a_{m1}B . . . a_{mn}B )
∈ R^{mp×nq}
Lemma 2.2.13. Consider an m×n-matrix A = (a_{ij}) and a p×q-matrix B = (b_{ij}). We have:

A^T ⊗ B^T = (A⊗B)^T
Proof. By direct computation, we get:

            ( a_{11}B . . . a_{1n}B )^T   ( a_{11}B^T . . . a_{m1}B^T )
(A⊗B)^T =   (   ⋮     ⋱      ⋮     )   = (    ⋮      ⋱       ⋮      ) = A^T ⊗ B^T
            ( a_{m1}B . . . a_{mn}B )     ( a_{1n}B^T . . . a_{mn}B^T )
Theorem 2.2.14. Let A be an m×n-matrix, B a p×q-matrix and C an m×q-matrix. The matrix equation

AXB = C,

with X an unknown n×p-matrix, is equivalent to the system of mq equations in np unknowns given by:

(B^T ⊗ A) vec(X) = vec(C),

so:

vec(AXB) = (B^T ⊗ A) vec(X)
Proof. Let Q_k be the k-th column of a given matrix Q. We get:

(AXB)_k = A(XB)_k
        = AXB_k
        = A ∑_{i=1}^p b_{ik}X_i
        = b_{1k}AX_1 + b_{2k}AX_2 + · · · + b_{pk}AX_p
        = ( b_{1k}A b_{2k}A . . . b_{pk}A ) vec(X)
        = (B_k^T ⊗ A) vec(X)

We get:

           ( B_1^T ⊗ A )
vec(AXB) = (     ⋮     ) vec(X)
           ( B_q^T ⊗ A )

But it follows immediately that:

vec(AXB) = (B^T ⊗ A) vec(X),

because the transpose of a column of B is a row of B^T.
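In Matlab the identity of Theorem 2.2.14 can be checked directly, since the Kronecker product is built in as kron and vec(X) is obtained with the colon operator X(:); a quick sanity check on hypothetical random matrices:

A = rand(3, 4); X = rand(4, 5); B = rand(5, 2);
C = A*X*B;
norm(C(:) - kron(B.', A)*X(:))   % ~ 1e-15: vec(AXB) = (B^T (x) A) vec(X)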
Theorem 2.2.15. Let G and H be two graphs with adjacency matrices A = [a_{ij}] ∈ R^{n×n} and B = [b_{ij}] ∈ R^{m×m}, let S^(0) > 0 be an initial positive matrix, and define:

S^(k+1) = ( BS^(k)A^T + B^T S^(k)A ) / ‖BS^(k)A^T + B^T S^(k)A‖_F,    k = 0, 1, . . .

Then the matrix subsequences S^(2k) and S^(2k+1) converge to S_even and S_odd, and among all the matrices in the set of all possible limits

{S_even(S^(0)), S_odd(S^(0)) : S^(0) > 0}

the matrix S_even(J), with J the all-one matrix, is the unique matrix of largest 1-norm.
Proof. We first rewrite the compact form (2.11):

Z^(k+1) = BZ^(k)A^T + B^T Z^(k)A
⇔ vec(Z^(k+1)) = vec(BZ^(k)A^T + B^T Z^(k)A)
⇔ vec(Z^(k+1)) = vec(BZ^(k)A^T) + vec(B^T Z^(k)A)
⇔ vec(Z^(k+1)) = ((A^T)^T ⊗ B) vec(Z^(k)) + (A^T ⊗ B^T) vec(Z^(k))
⇔ vec(Z^(k+1)) = (A ⊗ B + A^T ⊗ B^T) vec(Z^(k))
If we define z^(k) = vec(Z^(k)) and M = A⊗B + A^T⊗B^T, we get:

z^(k+1) = Mz^(k),

exactly the same compact form as introduced in the beginning of this section in (2.3). If we can prove that M is symmetric and nonnegative, then we can apply Theorem 2.2.10 and the result follows. M is, of course, nonnegative, because the adjacency matrices A and B are always nonnegative. To see that M is symmetric, first notice that M can be rewritten using Lemma 2.2.13 as:

M = A⊗B + (A⊗B)^T;    (2.12)

because A⊗B is an nm×nm square matrix, we know that M is symmetric: it is the sum of a square matrix and its transpose. In order to stay consistent with the Euclidean norm appearing in Theorem 2.2.10, we use the Frobenius norm as matrix norm: we know from section 1.3.2 that for vectors the Frobenius norm equals the 2-norm. We can now apply Theorem 2.2.10 and the result follows.
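The equality between the compact form and the vectorized form can also be verified numerically; a minimal Matlab sketch with hypothetical adjacency matrices:

A = [0 1 0; 0 0 1; 1 0 0];      % adjacency matrix of a hypothetical graph G
B = [0 1; 0 0];                 % adjacency matrix of a hypothetical graph H
Z = rand(2, 3);                 % an nH x nG score matrix
W = B*Z*A.' + B.'*Z*A;          % one unnormalized step of (2.6)
M = kron(A, B) + kron(A, B).';  % the matrix M from (2.12)
norm(W(:) - M*Z(:))             % ~ 1e-16: both forms agree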
Definition 2.2.16. Let G, H be two graphs, then the unique matrix

S = S_even(J) = lim_{k→+∞} S^(2k)

(with the notations of the previous theorem) is the similarity matrix between G and H.
Notice that it follows from (2.11) that the similarity matrix between H and G is the transpose of the similarity matrix between G and H.
2.2.3 Algorithm 5
Data:
A: the n×n adjacency matrix of a directed graph G
B: the m×m adjacency matrix of a directed graph H
TOL: tolerance for the estimation error.
Result:
S: the similarity matrix between G and H.
begin similarity matrix(A, B, TOL)
    Z(0) = J (m×n-matrix with all entries equal to 1);
    µ = m×n-matrix with all entries equal to TOL;
    k = 0;
    repeat
        k = k + 1;
        Z(k) = ( B Z(k−1) A^T + B^T Z(k−1) A ) / ‖B Z(k−1) A^T + B^T Z(k−1) A‖_F;
    until k is even and |Z(k) − Z(k−2)| < µ;
    return Z(k);
end
Algorithm 5: Algorithm for calculating the similarity matrix between G and H
The compact form introduced in the previous section leads directly to the approximation Algorithm 5. Note that the algorithm must iterate an even number of times, and to check whether the tolerance limit is reached, only the even iteration steps are considered, because the results will oscillate between the even and odd iterates. This is of course an obvious consequence of the definition of the similarity matrix and Theorem 2.2.10. Remember that the absolute value that is mentioned is the same as in Notation 1.1.8. A Matlab implementation of the algorithm can be found in Listing A.2 in Appendix A.
Equivalence with the power method
Algorithm 5 is in fact completely equivalent to the power method of section 1.4.2. To see this, remember that we are in fact calculating the expression of Theorem 2.2.10:

s_even(s^(0)) = lim_{k→∞} s^(2k) = Πs^(0)/‖Πs^(0)‖2;

when we compare this with equation (1.21) from the power method section, we see that in both cases we are in fact calculating the dominant eigenvector of a matrix, but with the difference that here Π is constructed in such a way that the sequence oscillates and the even iterates always converge. The attentive reader will note that Π doesn't appear directly in Algorithm 5, but this is, of course, a consequence of the vectorization defined in Definition 2.2.11. With this operation, we are able to work directly with matrices while proving convergence through Theorem 2.2.10. With this in mind, it is clear that Algorithm 5 can be seen as a matrix analogue of the power method.
This brings us to one of the main ideas of this master thesis: when we use the important Theorem 2.2.10 to prove convergence of some algorithm (all the algorithms that follow use this theorem for this purpose), we know for sure that the resulting algorithm will somehow be equivalent to the power method. Indeed, this theorem shows that the algorithm essentially comes down to finding the dominant eigenvector of some orthogonal projector Π, which might, however, not appear directly in the algorithm thanks to vectorization. Depending on the possible characteristics of Π, this makes it possible to derive the computational cost of the constructed algorithm.
Also note that in this case the equivalence is very clear when looking at the (pseudo)code for both algorithms, but as this master thesis proceeds, the algorithms will get more complicated and the equivalence with the power method will not be as clear.
Computational cost
We already showed in Theorem 2.2.10 that the convergence of the even iterates in the above recurrence relation is linear with ratio (µ/ρ)², where ρ is the spectral radius of M and µ is the spectral radius of the matrix M_µ, the matrix associated with the invariant subspace spanned by all eigenvectors besides those of ρ, and of −ρ when −ρ is an eigenvalue of M.
With the accuracy level TOL, we write after n even iteration steps:

|µ/ρ|^(2n) ≈ TOL.

So, for example, for TOL = 10^(−5) we get n = −5/(2 log(µ/ρ)); if, say, µ/ρ = 1/2, this gives n ≈ 8.3, so nine even iterations suffice. We obtain:

similarity matrix ∈ O( 1/(log µ − log ρ) ).

Note that this O holds for any estimation error TOL of the form 10^(−e) with e ∈ N. Also note that this is completely analogous to the computational cost of the power method, which comes as no surprise since the algorithms are similar. Remember from Theorem 2.2.10 that we cannot encounter a situation in which µ = ρ, as µ is the spectral radius of the spectrum without ρ and −ρ (if −ρ is an eigenvalue). When M_µ is trivial (meaning that M has only eigenvalues ρ and possibly −ρ), we conclude from equation (2.8) in Theorem 2.2.10 that in this case |1/ρ|^(2n) ≈ TOL, so:

similarity matrix ∈ O( 1/(log 1 − log ρ) ).

We conclude that the algorithm always terminates in a finite number of steps.
2.2.4 Special cases
We now consider some special cases of similarity scores between two vertices of graphs.
HITS algorithm
Remember that the HITS algorithm assumed that

M = ( 0    B )
    ( B^T  0 )

has a unique dominant eigenvalue to calculate the hub and authority scores of the nodes in a graph Gσ with adjacency matrix B. This assumption is a direct result of the power method,
because it is one of the conditions to apply that algorithm. One of our goals was to drop this assumption. We finally arrive at this result, which also returns a solution when the Perron root has a multiplicity larger than 1. To prove this result, we need the singular value decomposition of real matrices. A comprehensive overview of this decomposition can be found in [? ]; we restrict ourselves to the main result, without the intuitive explanations, otherwise we would deviate too much from the main subject.
Theorem 2.2.17. (Singular value decomposition) If A is a real m×n-matrix, then there exist orthogonal matrices

U = [u1, . . . , um] ∈ R^{m×m}    and    V = [v1, . . . , vn] ∈ R^{n×n}

such that

U^T A V = Σ = diag(σ1, . . . , σp) ∈ R^{m×n},

where Σ is a so-called pseudo-diagonal matrix: its first p diagonal elements are the singular values σ1 ≥ · · · ≥ σp, with p = min(m,n), and all other elements of Σ are zero. Equivalently:

A = UΣV^T.
The columns of V are right singular vectors of A, and those of U are left singularvectors.
Proof. Let x and y be unit vectors in R^n and R^m, respectively, and consider the bilinear form

z = y^T A x.

The set

S = {(x,y) | x ∈ R^n, y ∈ R^m, ‖x‖2 = ‖y‖2 = 1}

is compact, so the scalar function z(x,y) must achieve a maximum value on S (possibly at more than one point). Let u1, v1 be two unit vectors in R^m and R^n, respectively, where this maximum is achieved, and let σ1 be the corresponding value of z:

max_{‖x‖2=‖y‖2=1} y^T A x = u1^T A v1 = σ1.
We want to show that u1 is parallel to the vector Av1: if this were not the case, the inner product u1^T Av1 could be increased by rotating u1 towards the direction of Av1, thereby contradicting the fact that u1^T Av1 is a maximum. Similarly, notice that

u1^T Av1 = v1^T A^T u1,

and repeating the argument above, we see that v1 is parallel to A^T u1. The vectors u1 and v1 can be extended into orthonormal bases for R^m and R^n, respectively. Collect these orthonormal basis vectors into orthogonal matrices U1 and V1. Then:
U1^T A V1 = S1 = ( σ1  0^T )
                 ( 0   A1  )

In fact, the first column of AV1 is Av1 = σ1u1, so the first entry of U1^T A V1 is u1^T σ1 u1 = σ1, and its other entries are uj^T Av1 = 0, because Av1 is parallel to u1 and therefore orthogonal,
by construction, to u2, . . . , um. A similar argument shows that the entries after the first entry of the first row of S1 are zero: the row vector u1^T A is parallel to v1^T and therefore orthogonal to v2, . . . , vn, so that u1^T Av2 = · · · = u1^T Avn = 0. The matrix A1 has one fewer row and column than A. We can repeat the same construction on A1 and write

U2^T A1 V2 = S2 = ( σ2  0^T )
                  ( 0   A2  )

so that

( 1  0^T  ) U1^T A V1 ( 1  0^T )   ( σ1  0   0^T )
( 0  U2^T )           ( 0  V2  ) = ( 0   σ2  0^T )
                                   ( 0   0   A2  )

This procedure can be repeated until Ak vanishes, to obtain:
U^T A V = Σ,

where U^T and V are orthogonal matrices obtained by multiplying together all the orthogonal matrices used in the procedure, and:

Σ = diag(σ1, . . . , σp).

Since the matrices U and V are orthogonal, we can multiply with U and V^T to obtain

A = UΣV^T,

which is the desired result. It remains to show that the elements on the diagonal of Σ are arranged in decreasing order. To see that σ1 ≥ · · · ≥ σp, where p = min(m,n), we can observe that the successive maximization problems that yield σ1, . . . , σp are performed on a sequence of sets each of which contains the next.
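In practice the decomposition is of course never computed via this construction; Matlab, for example, offers it directly through its built-in svd function, as this small check on a hypothetical matrix illustrates:

A = [1 2; 3 4; 5 6];
[U, S, V] = svd(A);             % U (3x3) and V (2x2) orthogonal, S pseudo-diagonal
norm(A - U*S*V.', 'fro')        % ~ 1e-15: A = U*Sigma*V^T
diag(S).'                       % the singular values, in decreasing order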
Theorem 2.2.18. Let G be a graph with adjacency matrix B. The normalized hub and authority scores of the vertices are given by the normalized dominant eigenvectors of the matrices BB^T and B^TB, provided the corresponding Perron root is of multiplicity 1. Otherwise, it is the normalized projection of the vector 1 on the eigenspace of the Perron root.
Proof. The corresponding matrix M is:

M = ( 0    B )    so:    M² = ( BB^T  0    )
    ( B^T  0 )                ( 0     B^TB ),

and the result follows from Theorem 2.2.10 under the condition that the matrix M² has a dominant root ρ². This can be seen as follows: let V and U be the dominant right and left singular vectors of B, so:

BV = ρU,    B^T U = ρV;
then clearly V and U are also the dominant eigenvectors of B^TB and BB^T, since:

B^TBV = ρ²V,    BB^TU = ρ²U.

The projectors associated with the dominant eigenvalues of B^TB and BB^T are, respectively,

Πv = VV^T and Πu = UU^T;

the projector Π of M² is then

Π = diag(Πu, Πv),

and hence the subvectors of Π1 are the vectors Πu1 and Πv1, which can be computed with BB^T and B^TB.
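Since BB^T and B^TB are symmetric and positive semidefinite, these scores can be computed by simply power-iterating on the vector 1, as Theorem 2.2.10 prescribes; a minimal Matlab sketch on a hypothetical three-vertex graph:

B = [0 1 1; 0 0 1; 0 0 0];      % hypothetical adjacency matrix
x = ones(3, 1); y = ones(3, 1);
for k = 1:100
    x = (B*B.')*x;  x = x / norm(x);   % hub scores
    y = (B.'*B)*y;  y = y / norm(y);   % authority scores
end
[x y]   % vertex 1 is the best hub, vertex 3 the best authority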
Central scores
As for the hub and authority scores, we can give an explicit expression for the similarity score with vertex 2, which we will call the central score.
Theorem 2.2.19. Let G be a graph with adjacency matrix B. The normalized similarity scores of vertex 2 of the path graph 1 → 2 → 3 with the vertices of graph G are called the central scores and are given by the normalized dominant eigenvector of the matrix B^TB + BB^T, provided the corresponding Perron root is of multiplicity 1. Otherwise, it is the normalized projection of the vector 1 on the dominant invariant subspace.
Proof. Let

M = ( 0    B   0 )           ( BB^T    0            BB   )
    ( B^T  0   B )  so: M² = ( 0       B^TB + BB^T  0    )
    ( 0    B^T 0 )           ( B^TB^T  0            B^TB );
because M is nonnegative and symmetric, the result follows from Theorem 2.2.10 under the condition that the central matrix B^TB + BB^T has a dominant root ρ² of M². We can state this condition otherwise, because M can be permuted to:

M = P^T ( 0    E ) P,    where E = ( B   )
        ( E^T  0 )                 ( B^T )
Now let V and U be the dominant right and left singular vectors of E:

EV = ρU,    E^T U = ρV;

then clearly V and U are also dominant eigenvectors of E^TE and EE^T, since:

E^TEV = ρ²V,    EE^TU = ρ²U.
Moreover,

PM²P^T = ( EE^T  0    )
         ( 0     E^TE ),

and let

Πu = UU^T and Πv = VV^T

be the projectors associated with the dominant eigenvalues of EE^T and E^TE, respectively. The projector Π of M² is then equal to:

Π = P^T diag(Πu, Πv) P,

and it follows that the subvectors of Π1 are the vectors Πu1 and Πv1, which can be computed from EE^T or E^TE. Since E^TE = B^TB + BB^T, the central vector is the subvector Πv1 of Π1.
Example 2.2.20. In order to illustrate the intuitive meaning of calculating a similarity matrix where the path graph 1 → 2 → 3 is the structure graph, consider the following directed bow tie graph:
[Figure: a directed bow tie graph with one center vertex; each of the m left vertices has an edge to the center, and the center has an edge to each of the n right vertices.]
Label the center vertex first, then the m left vertices and finally the n right vertices; the adjacency matrix of the bow tie graph then becomes:
B = ( 0        0_{1×m}  1_{1×n} )
    ( 1_{m×1}  0_{m×m}  0_{m×n} )
    ( 0_{n×1}  0_{n×m}  0_{n×n} )
By direct computation, we get that the matrix B^TB + BB^T is equal to:

B^TB + BB^T = ( m+n  0        0       )
              ( 0    1_{m×m}  0       )
              ( 0    0        1_{n×n} ),

where 1_{m×m} and 1_{n×n} denote all-one blocks.
By Theorem 2.2.19, the Perron root of M is equal to ρ = √(m+n): the central matrix B^TB + BB^T has the dominant root ρ² of M², and from the characteristic polynomial of B^TB + BB^T we clearly get m+n as its Perron root (the all-one blocks only contribute the smaller eigenvalues m, n and 0). The dominant eigenvector of B^TB + BB^T is equal to the similarity score vector s2 by Theorem 2.2.19. It is now easy to see that:

s2 = ( 1   )
     ( 0_m )
     ( 0_n )
If we see vertex 2 of the path graph 1 → 2 → 3 as the center, a vertex through which much information is passed, then it is not surprising that s2 indicates that vertex 1 of the directed bow tie graph is the only one that looks like a center. The left vertices of the bow tie graph look like vertex 1 of the path graph, and the right vertices look like vertex 3. This is a beautiful example because it confirms our intuition.
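The computation is easily repeated numerically; a minimal Matlab sketch with the hypothetical choice m = n = 2:

m = 2; n = 2;
B = zeros(1+m+n);
B(2:1+m, 1) = 1;                % each left vertex has an edge to the center
B(1, 2+m:1+m+n) = 1;            % the center has an edge to each right vertex
C = B.'*B + B*B.';              % diag(m+n, all-one blocks), dominant root m+n
[V, D] = eig(C);
[~, i] = max(diag(D));
abs(V(:, i)).'                  % s2 = (1, 0, 0, 0, 0): only the center is 'central'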
Self-similarity of a graph
When we compute the similarity matrix of two equal graphs G = H, the similarity matrix S is a square matrix whose entries are the similarity scores between the vertices of G; we call S the self-similarity matrix of G in this case.
Intuitively, we expect that vertices have the highest similarity scores with themselves, which would mean that the largest entries are located on the diagonal of S. We prove in the next theorem that the largest entry of a self-similarity matrix always appears on the diagonal, and that, except for some special cases, the diagonal elements of a self-similarity matrix are nonzero. This does not mean that the diagonal elements are always larger than the other elements in the same row or column, and we conclude this paragraph with some easy examples that show this.
That the similarity matrix of a graph with itself is not always diagonally dominant is a serious limitation on the intuitive concept of similarity between graphs! It shows that the algorithm is sometimes not capable of detecting the essential link between a vertex and itself, which is one of the reasons why the presented algorithm of similarity is still not totally satisfactory.
To prove these statements, we first need to dive a little into the world of symmetric, positive semidefinite matrices with some definitions and proofs.
Definition 2.2.21. Let A be an n×n-matrix; the determinant of a k×k principal submatrix (see Definition 1.2.7) is called a principal minor of order k.

Definition 2.2.22. An n×n symmetric matrix A is called positive semidefinite if all eigenvalues of A are nonnegative.

Theorem 2.2.23. For an n×n symmetric matrix A, the following properties are equivalent:
1. A is positive semidefinite,
2. A = UTU for some matrix U ,
3. xTAx ≥ 0 for every x ∈ Rn,
4. All principal minors of A are nonnegative.
Proof. (1) ⇒ (2): Write A = UDU^T by Theorem 2.1.2, where U is orthogonal and D diagonal. We know from that theorem that the entries on the diagonal of D are the eigenvalues of A (which are nonnegative), so we may write D = C², where C is also a diagonal matrix. Then we have

A = UCCU^T = (CU^T)^T(CU^T),

which is of the desired form.

(2) ⇒ (3): For every x ∈ R^n we have

x^TAx = x^TU^TUx = (Ux)^T(Ux) ≥ 0.

(3) ⇒ (1): If v is an eigenvector of A with eigenvalue λ, then 0 ≤ v^TAv = λv^Tv, which implies that λ ≥ 0.

(2) ⇒ (4): Let B be a principal submatrix of A, formed by deleting the rows and columns with indices in the set S. Then modify U to form V by deleting the columns with indices in S, so that B = V^TV. By the implications (2) ⇒ (3) ⇒ (1) already shown, B is positive semidefinite, so det(B), being the product of the nonnegative eigenvalues of B, is nonnegative.

(4) ⇒ (1): By contraposition, we may assume that A has a unit eigenvector v with eigenvalue λ < 0. If all other eigenvalues of A are positive, then det(A) < 0 and we are done. Otherwise, choose a unit eigenvector u orthogonal to v with eigenvalue µ ≤ 0. Now choose α ∈ R so that the vector w = v + αu has at least one zero coordinate, say w_i. If A′ is the matrix obtained from A by removing column i and row i, and w′ is obtained by removing coordinate i of w, then we have

(w′)^TA′w′ = w^TAw = λ + α²µ < 0,

so A′ is not positive semidefinite by the equivalence of (3) and (1) already demonstrated, and the result follows by induction.
Lemma 2.2.24. A nonnegative linear combination of two positive semidefinite matrices is again a positive semidefinite matrix.

Proof. Assume that A and B are n×n positive semidefinite matrices and α, β ∈ R^+. For all x ∈ R^n we have x^TAx ≥ 0 and x^TBx ≥ 0, so:

x^T(αA + βB)x = α(x^TAx) + β(x^TBx) ≥ 0.
Theorem 2.2.25. The self-similarity matrix of a graph G is positive semidefinite. The largest entry of the matrix appears on the diagonal, and if a diagonal entry is equal to zero, all the entries of the corresponding row and column are also equal to zero.
Proof. Since A = B, the compact form of Theorem 2.2.15 becomes:

S^(k+1) = ( AS^(k)A^T + A^T S^(k)A ) / ‖AS^(k)A^T + A^T S^(k)A‖_F,    S^(0) = J.

First notice that S^(k+1) is in this case always symmetric, because s_{ij} = s_{ji} (the similarity score of vertex v_i with v_j is the same as the similarity score of v_j with v_i). The all-one matrix J is clearly positive semidefinite, as it has only nonnegative principal minors. So S^(0) is positive semidefinite, and it can be decomposed as S^(0) = W^TW. We write, for some vector x ∈ R^n:

x^TA^TS^(0)Ax = x^TA^TW^TWAx = ‖WAx‖2² ≥ 0;

similarly, x^TAS^(0)A^Tx ≥ 0. Thus all matrices S^(k) are positive semidefinite, because the (scaled) sum of two positive semidefinite matrices is also positive semidefinite. Now, the rest of the proof follows from the following observation: let α be a real number. If x = αe_i − e_j, then:

x^TS^(k)x = α²s^(k)_{ii} − 2αs^(k)_{ij} + s^(k)_{jj}.    (2.13)
So to prove that the largest entry of the matrix S^(k) appears on the diagonal, assume, by contraposition, that the strictly largest entry is s^(k)_{ij} = s^(k)_{ji} with i ≠ j. Taking α = 1, equation (2.13) becomes:

x^TS^(k)x = s^(k)_{ii} − 2s^(k)_{ij} + s^(k)_{jj} = (s^(k)_{ii} − s^(k)_{ij}) + (s^(k)_{jj} − s^(k)_{ij}) < 0,

contradicting the positive semidefiniteness of S^(k).
To prove the last statement: if s^(k)_{ii} = 0 but s^(k)_{ij} ≠ 0 for some j, then (2.13) gives:

x^TS^(k)x = −2αs^(k)_{ij} + s^(k)_{jj},

which is negative for α large enough, again a contradiction.
Example 2.2.26. The self-similarity matrix of the graph

v1 → v2 → v3

is equal to:

( 0.5774 0      0      )
( 0      0.5774 0      )
( 0      0      0.5774 )
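This diagonal matrix can be reproduced with a small Matlab sketch of the self-similarity iteration of Theorem 2.2.25 (100 is again an arbitrary even cutoff):

A = [0 1 0; 0 0 1; 0 0 0];      % the path graph v1 -> v2 -> v3
S = ones(3);                    % start matrix J
for k = 1:100                   % an even number of iterations
    S = A*S*A.' + A.'*S*A;
    S = S / norm(S, 'fro');
end
S                               % approx. (1/sqrt(3))*eye(3) = 0.5774*I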
Example 2.2.27. The self-similarity matrix of the graph

[Figure: a directed graph on the vertices v1, v2, v3, v4.]

is equal to:

( 0.250 0.250 0.250 0.250 )
( 0.250 0.250 0.250 0.250 )
( 0.250 0.250 0.250 0.250 )
( 0.250 0.250 0.250 0.250 )
Example 2.2.28. The self-similarity matrix of the graph

[Figure: a directed graph on the vertices v1, v2, v3, v4.]

is equal to:

( 0.4082 0      0      0      )
( 0      0.4082 0      0      )
( 0      0      0.4082 0.4082 )
( 0      0      0.4082 0.4082 )
Undirected graphs
In the case of undirected graphs, the compact form of Theorem 2.2.15 becomes easier:
Theorem 2.2.29. Let G and H be two undirected graphs with adjacency matrices A = [a_{ij}] ∈ R^{n×n} and B = [b_{ij}] ∈ R^{m×m}, let S^(0) > 0 be an initial positive matrix, and define:

S^(k+1) = BS^(k)A / ‖BS^(k)A‖_F,    k = 0, 1, . . .

Then the matrix subsequences S^(2k) and S^(2k+1) converge to S_even and S_odd, and among all the matrices in the set of all possible limits

{S_even(S^(0)), S_odd(S^(0)) : S^(0) > 0}

the matrix S_even(J) is the unique matrix of largest 1-norm.
Proof. This follows easily from the fact that in the case of undirected graphs, A and B are symmetric:

S^(k+1) = ( BS^(k)A^T + B^T S^(k)A ) / ‖BS^(k)A^T + B^T S^(k)A‖_F
        = ( BS^(k)A + BS^(k)A ) / ‖BS^(k)A + BS^(k)A‖_F
        = 2BS^(k)A / ‖2BS^(k)A‖_F
        = BS^(k)A / ‖BS^(k)A‖_F
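In Matlab one iteration step therefore reduces to a single product; a minimal sketch for two small hypothetical undirected graphs:

A = [0 1 0; 1 0 1; 0 1 0];      % undirected path on three vertices
B = [0 1; 1 0];                 % a single undirected edge
S = ones(2, 3);                 % start matrix J
for k = 1:100                   % an even number of iterations
    S = B*S*A;
    S = S / norm(S, 'fro');
end
S                               % approximates the similarity matrix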
2.3 Node-edge similarity
The paper of Blondel et al. [? ] caused a stream of successive papers which build upon the concept of similarity between two graphs. For instance, the paper ‘Graph similarity scoring and matching’ of Laura Zager and George Verghese [? ], presented in 2006, expands the notion of node similarity from the previous section to similarity between the edges of two graphs. Intuitively, an edge of a graph G is similar to an edge of graph H if their source and terminal nodes (Definition 1.5.6) are similar. As a consequence, the notion of similarity between edges introduces a coupling between edge and node similarity scores. The algorithm presented in this paper is therefore an extension of the algorithm presented in the previous section.
2.3.1 Coupled node-edge similarity scores
We now present the extended algorithm, allowing us to calculate not only node similarity scores, but also edge similarity scores. The algorithm uses a new sort of matrices to represent a graph.
Definition 2.3.1. Let G = (V,→) be a graph with adjacency matrix A, numbered vertices v1, v2, . . . , vn ∈ V and numbered edges e1, e2, . . . , em ∈ →. We define the source-edge matrix A_S as the n×m matrix with entries:

(A_S)_{ij} = 1 if s_G(e_j) = v_i, and 0 otherwise;

the notation A_S is derived from the adjacency matrix A.
Definition 2.3.2. Let G = (V,→) be a graph with adjacency matrix A, numbered vertices v1, v2, . . . , vn ∈ V and numbered edges e1, e2, . . . , em ∈ →. We define the terminus-edge matrix A_T as the n×m matrix with entries:

(A_T)_{ij} = 1 if t_G(e_j) = v_i, and 0 otherwise;

the notation A_T is derived from the adjacency matrix A.
Property 2.3.3. Let G = (V,→) be a graph; then A_S A_S^T is a diagonal matrix where the i-th diagonal entry is equal to the outdegree of vertex v_i.

Proof. By direct computation, we get:

(A_S A_S^T)_{ij} = ∑_{k=1}^m (A_S)_{ik}(A_S^T)_{kj} = ∑_{k=1}^m (A_S)_{ik}(A_S)_{jk}

Assume i ≠ j. Then for each k, (A_S)_{ik}(A_S)_{jk} = 0, since vertex v_i and vertex v_j cannot both be the source node of edge e_k.

When i = j, then for each k, (A_S)_{ik}(A_S)_{jk} = ((A_S)_{ik})² equals 0 or 1, depending on whether v_i is the source of e_k or not, so 1 is added to (A_S A_S^T)_{ii} each time an edge ‘departs’ from v_i; this is exactly the outdegree of vertex v_i.
In an analogous way, we prove:

Property 2.3.4. Let G = (V,→) be a graph; then A_T A_T^T is a diagonal matrix where the i-th diagonal entry is equal to the indegree of vertex v_i.
Property 2.3.5. Let G = (V,→) be a graph; then the adjacency matrix A is equal to A_S A_T^T.

Proof. By direct computation, we get:

(A_S A_T^T)_{ij} = ∑_{k=1}^m (A_S)_{ik}(A_T^T)_{kj} = ∑_{k=1}^m (A_S)_{ik}(A_T)_{jk}

The term (A_S)_{ik}(A_T)_{jk} equals 1 if edge e_k goes from v_i to v_j, and since the sum is taken over all edges, we conclude:

(A_S A_T^T)_{ij} = A_{ij}.
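All three properties can be verified at once on a small hypothetical graph in Matlab:

% Graph with edges e1: v1 -> v2, e2: v1 -> v3, e3: v2 -> v3.
A  = [0 1 1; 0 0 1; 0 0 0];     % adjacency matrix
As = [1 1 0; 0 0 1; 0 0 0];     % source-edge matrix A_S
At = [0 0 0; 1 0 0; 0 1 1];     % terminus-edge matrix A_T
isequal(A, As*At.')             % true: A = A_S*A_T^T (Property 2.3.5)
diag(As*As.').'                 % [2 1 0]: the outdegrees (Property 2.3.3)
diag(At*At.').'                 % [0 1 2]: the indegrees  (Property 2.3.4)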
Let G = (V,→) and H = (U,→′) be two (directed) graphs; G has nG vertices and mG edges, and H has nH vertices and mH edges. Remember the following updating equation from (2.5), which returns the (node) similarity score between vertex u_i of H and vertex v_j of G:

x^(k+1)_{ij} = ∑_{r:(ur,ui)∈→′, w:(vw,vj)∈→} x^(k)_{rw} + ∑_{r:(ui,ur)∈→′, w:(vj,vw)∈→} x^(k)_{rw};

if we number the edges of G and H as e1, e2, . . . , e_{mG} ∈ → and e′1, e′2, . . . , e′_{mH} ∈ →′, this can be rewritten as:

x^(k+1)_{ij} = ∑_{tH(e′p)=ui, tG(eq)=vj} x^(k)_{sH(e′p)sG(eq)} + ∑_{sH(e′p)=ui, sG(eq)=vj} x^(k)_{tH(e′p)tG(eq)}.
We now extend this mutually reinforcing relation with the notion of edge similarity. x_{ij} denotes again the node similarity score between vertex u_i of H and vertex v_j of G, and y_{pq} denotes the edge similarity score between edge e′_p of H and edge e_q of G. The update equations for edge and node similarity scores now take the following form:

y^(k+1)_{pq} = x^(k)_{sH(e′p)sG(eq)} + x^(k)_{tH(e′p)tG(eq)}    (2.14)

x^(k+1)_{ij} = ∑_{tH(e′r)=ui, tG(ew)=vj} y^(k)_{rw} + ∑_{sH(e′r)=ui, sG(ew)=vj} y^(k)_{rw}    (2.15)

With the same reasoning as presented in the previous section, these scores can be assembled into matrices Y^(k) and X^(k) by using the source-edge matrices and the terminus-edge matrices. Let A be the adjacency matrix of G and B the adjacency matrix of H, let X^(k) be the nH×nG-matrix with entries x_{ij}, the node similarity scores at iteration step k, and let Y^(k) be the mH×mG-matrix with entries y_{pq}, the edge similarity scores at step k. The equations (2.14) and (2.15) can be rewritten as:

Y′^(k+1) = B_S^T X^(k) A_S + B_T^T X^(k) A_T    (2.16)
X′^(k+1) = B_S Y^(k) A_S^T + B_T Y^(k) A_T^T    (2.17)
for k = 0, 1, . . . Of course, we want to shape these equations in such a way that we can apply Theorem 2.2.10 to prove convergence. This will be completely analogous to Theorem 2.2.15, but we can achieve a slightly better result: not only the even and odd iterates converge, the iteration converges as a whole. Lastly, remember that one of the conditions of Theorem 2.2.10 is to normalize the results at each iteration step. Therefore, the following theorem comes as no surprise.

Theorem 2.3.6. Let G and H be two graphs with adjacency matrices A and B, where G has nG vertices and mG edges and H has nH vertices and mH edges; define:

Y^(k+1) = ( B_S^T X^(k) A_S + B_T^T X^(k) A_T ) / ‖B_S^T X^(k) A_S + B_T^T X^(k) A_T‖_F    (2.18)

X^(k+1) = ( B_S Y^(k) A_S^T + B_T Y^(k) A_T^T ) / ‖B_S Y^(k) A_S^T + B_T Y^(k) A_T^T‖_F    (2.19)

for k = 0, 1, . . . Then the matrix subsequences X^(2k), Y^(2k) and X^(2k+1), Y^(2k+1) converge to X_even, Y_even and X_odd, Y_odd. If we take

X^(0) = J ∈ R^{nH×nG},    Y^(0) = J ∈ R^{mH×mG}

as initial matrices, then X_even(J) = X_odd(J) and Y_even(J) = Y_odd(J) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices, and the matrix sequence X^(k), Y^(k) converges as a whole.

Proof. By Theorem 2.2.14 we can rewrite (2.16) as follows:
Y′^(k+1) = B_S^T X′^(k) A_S + B_T^T X′^(k) A_T
⇔ vec(Y′^(k+1)) = vec(B_S^T X′^(k) A_S + B_T^T X′^(k) A_T)
⇔ vec(Y′^(k+1)) = vec(B_S^T X′^(k) A_S) + vec(B_T^T X′^(k) A_T)
⇔ vec(Y′^(k+1)) = (A_S^T ⊗ B_S^T) vec(X′^(k)) + (A_T^T ⊗ B_T^T) vec(X′^(k))
⇔ vec(Y′^(k+1)) = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T) vec(X′^(k))

Completely analogously, we can also rewrite (2.17):

vec(X′^(k+1)) = (A_S ⊗ B_S + A_T ⊗ B_T) vec(Y′^(k));

define y^(k) = vec(Y′^(k)) and x^(k) = vec(X′^(k)), then:

y^(k+1) = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T) x^(k)
x^(k+1) = (A_S ⊗ B_S + A_T ⊗ B_T) y^(k)

If we define G = A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T, then with Lemma 2.2.13 and well-known properties of transpose matrices:

G^T = (A_S^T ⊗ B_S^T + A_T^T ⊗ B_T^T)^T
    = ((A_S ⊗ B_S)^T + (A_T ⊗ B_T)^T)^T
    = ((A_S ⊗ B_S)^T)^T + ((A_T ⊗ B_T)^T)^T
    = A_S ⊗ B_S + A_T ⊗ B_T
So we get:

y^(k+1) = Gx^(k)    (2.20)
x^(k+1) = G^T y^(k),    (2.21)

where G is an mGmH × nGnH-matrix. The previous expressions can be concatenated into a single matrix update equation (defining the matrix M and the vector z^(k+1)):

z^(k+1) = ( x )^(k+1) = ( 0_{nGnH×nGnH}  G^T            ) ( x )^(k) = Mz^(k).
          ( y )         ( G              0_{mGmH×mGmH} ) ( y )
M is clearly nonnegative, because G and G^T consist of sums of Kronecker products of A_S, B_S, A_T and/or B_T, all matrices with entries equal to zero or one; and M is, as an (nGnH + mGmH) × (nGnH + mGmH)-matrix, clearly symmetric, so the result follows immediately from Theorem 2.2.10. The appearance of the Frobenius norm can be explained in the same way as in Theorem 2.2.15; note that we write Y^(k) and X^(k) for the normalized results at iteration step k. We still have to prove that the odd and even iterates are the same, i.e. that the sequence converges as a whole to Πz^(0)/‖Πz^(0)‖2. This can be done easily by expanding the matrix equation:

x^(k) = (G^TG)^(k/2) x^(0) (k even),    x^(k) = (G^TG)^((k−1)/2) G^T y^(0) (k odd),
y^(k) = (GG^T)^(k/2) y^(0) (k even),    y^(k) = (GG^T)^((k−1)/2) G x^(0) (k odd).

The matrix G is the sum of the two matrices A_S^T ⊗ B_S^T and A_T^T ⊗ B_T^T. Now, A_S^T is an mG×nG-matrix with a single ‘1’-entry in each row, simply because every edge has exactly one source node. The same holds for B_S^T. Therefore, taking the Kronecker product of those two matrices results in the matrix A_S^T ⊗ B_S^T, which also has just a single ‘1’-entry in each row. With the same reasoning, we conclude that A_T^T ⊗ B_T^T also has just a single ‘1’-entry in each row. Taking the sum of A_S^T ⊗ B_S^T and A_T^T ⊗ B_T^T thus results in the matrix G with exactly two ‘1’-entries in each row. If we now choose the initial conditions x^(0) = 1 ∈ R^{nHnG} and y^(0) = 1 ∈ R^{mHmG}, we conclude:

Gx^(0) = 2y^(0),

and we get:

x^(k) = (G^TG)^((k−2)/2) G^TGx^(0) = 2(G^TG)^((k−2)/2) G^T y^(0) (k even),    x^(k) = (G^TG)^((k−1)/2) G^T y^(0) (k odd),
y^(k) = (GG^T)^(k/2) y^(0) (k even),    y^(k) = (GG^T)^((k−1)/2) Gx^(0) = 2(GG^T)^((k−1)/2) y^(0) (k odd).

First, observe that the matrices GG^T and G^TG, besides being symmetric (for example, (GG^T)^T = GG^T) and nonnegative, are also positive semidefinite and thus have only nonnegative eigenvalues. Note that the factor 2 that appears in the iterates will be eliminated by the normalization in each step. So in the limit as k → ∞, the even and odd iterates are the same.
We now define Xeven(1) as the node similarity matrix and Yeven(1) as the edge similaritymatrix.
2.3.2 Algorithm 6 and Algorithm 7
Data:
A_S: the nG×mG source-edge matrix of a graph G
A_T: the nG×mG terminus-edge matrix of a graph G
B_S: the nH×mH source-edge matrix of a graph H
B_T: the nH×mH terminus-edge matrix of a graph H
TOL: tolerance for the estimation error.
Result:
X: the node similarity matrix between G and H
Y: the edge similarity matrix between G and H
begin node edge similarity matrix(A_S, A_T, B_S, B_T, TOL)
    X(0) = 1 (nH×nG-matrix with all entries equal to 1);
    Y(0) = 1 (mH×mG-matrix with all entries equal to 1);
    µ_X = nH×nG-matrix with all entries equal to TOL;
    µ_Y = mH×mG-matrix with all entries equal to TOL;
    k = 0;
    repeat
        k = k + 1;
        Y(k) = ( B_S^T X(k−1) A_S + B_T^T X(k−1) A_T ) / ‖B_S^T X(k−1) A_S + B_T^T X(k−1) A_T‖_F;
        X(k) = ( B_S Y(k) A_S^T + B_T Y(k) A_T^T ) / ‖B_S Y(k) A_S^T + B_T Y(k) A_T^T‖_F;
    until |X(k) − X(k−1)| < µ_X and |Y(k) − Y(k−1)| < µ_Y;
    return X(k), Y(k);
end
Algorithm 6: Algorithm for calculating the node and edge similarity matrices X and Y between G and H.
The algorithm to calculate the node and edge similarity scores is presented in pseudo-code in Algorithm 6. Remember that the absolute value that is mentioned is the same as in Notation 1.1.8. Besides the different calculations, the main difference with the algorithm of the previous section (Algorithm 5) is that the whole sequence converges, not only the even (or odd) iterates, making this algorithm twice as fast: in Algorithm 5 we take the limit of the even iterates, but we need the calculations of the odd iterates too to reach a result, whereas here the iteration does not oscillate and both the even and odd iterates converge to the same limit, so the tolerance level is reached with half the number of steps we would need for Algorithm 5. Also notice that we implement a sequential update: when Y(k) is calculated, the result is immediately used in X(k). This does not follow Theorem 2.3.6 literally, which suggests simultaneous updating equations: in that case X(k) uses Y(k−1) in its calculations. It is not difficult to see, by an argument analogous to Theorem 2.3.6, that both approaches yield the same set of similarity scores. Numerical analysts, however, will always prefer the sequential update procedure, because it performs slightly better (see section 3.2 in [? ]), as you directly use the more accurate result Y(k) in the calculation of X(k).
A Matlab implementation can be found in Listing A.3 in Appendix A. Because givingsource-edge matrices and terminal-edge matrices as input is quite unnatural, in Listing A.4
and Listing A.5 some Matlab code is also presented to transform an adjacency matrix intoa source-edge matrix and a terminal-edge matrix. The resulting matrices represent an edgenumbering left-to-right, based on the entries of the adjacency matrix. The algorithm tocalculate the source-edge matrix based on the adjacency matrix is also written in pseudo-code in Algorithm 7, the calculation of the terminal-edge matrix is completely analogous. AMatlab implementation of Algorithm 6 that takes two adjacency matrices as input can befound in Listing A.6.
Data:
    A: the n x n adjacency matrix of a graph G
Result:
    A_S: the source-edge matrix of graph G
begin source_edge_matrix(A)
    m = sum of all entries of A;
    A_S = an n x m matrix with all entries equal to 0;
    current_edge = 1;
    for i : 1 to n do
        for j : 1 to n do
            if (A)_ij > 0 then
                for e : 1 to (A)_ij do
                    (A_S)_(i, current_edge) = 1;
                    current_edge = current_edge + 1;
                end
            end
        end
    end
    return A_S;
end
Algorithm 7: Algorithm for calculating the source-edge matrix A_S based on the adjacency matrix A of a graph G.
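As an illustration, here is a minimal Matlab version of Algorithm 7; the function name is ours and the code is a direct transcription of the pseudocode above, not the thesis' Listing A.4. The terminal-edge matrix is obtained by replacing the assignment inside the innermost loop by AT(j, current_edge) = 1.

% Sketch of Algorithm 7: source-edge matrix from an adjacency matrix,
% with the edges numbered left-to-right along the rows of A.
function AS = source_edge_matrix(A)
    n = size(A, 1);
    m = sum(A(:));                       % total number of edges
    AS = zeros(n, m);
    current_edge = 1;
    for i = 1:n
        for j = 1:n
            for e = 1:A(i, j)            % one column per parallel edge i -> j
                AS(i, current_edge) = 1; % edge current_edge has source node i
                current_edge = current_edge + 1;
            end
        end
    end
end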
2.3.3 Difference with node similarity
It is clear that the node-edge similarity algorithm is different from Algorithm 5 from theprevious section. We will make this a little more explicit and show the difference in theresulting node similarity matrix. We first need another property of the Kronecker product.
Property 2.3.7 (mixed-product property of Kronecker products). Let $A \in \mathbb{R}^{m\times n}$, $B \in \mathbb{R}^{r\times s}$, $C \in \mathbb{R}^{n\times p}$ and $D \in \mathbb{R}^{s\times t}$, then:
$$(A\otimes B)(C\otimes D) = AC \otimes BD$$
Proof.
$$(A\otimes B)(C\otimes D) = \begin{pmatrix} a_{11}B & \dots & a_{1n}B\\ \vdots & \ddots & \vdots\\ a_{m1}B & \dots & a_{mn}B\end{pmatrix}\begin{pmatrix} c_{11}D & \dots & c_{1p}D\\ \vdots & \ddots & \vdots\\ c_{n1}D & \dots & c_{np}D\end{pmatrix} = \begin{pmatrix} \sum_{k=1}^n a_{1k}c_{k1}\,BD & \dots & \sum_{k=1}^n a_{1k}c_{kp}\,BD\\ \vdots & \ddots & \vdots\\ \sum_{k=1}^n a_{mk}c_{k1}\,BD & \dots & \sum_{k=1}^n a_{mk}c_{kp}\,BD\end{pmatrix} = AC\otimes BD.$$
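The property is also easy to check numerically; the following snippet (an illustration with arbitrary compatible sizes) verifies it up to rounding errors:

% Numerical check of the mixed-product property (A (x) B)(C (x) D) = AC (x) BD.
A = rand(2,3); B = rand(4,2);   % A in R^{2x3}, B in R^{4x2}
C = rand(3,5); D = rand(2,2);   % C in R^{3x5}, D in R^{2x2}
lhs = kron(A, B) * kron(C, D);
rhs = kron(A*C, B*D);
max(abs(lhs(:) - rhs(:)))       % of the order of machine precision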
Now take equations (3.10) and (3.11) and consider only the even iterates. We consider only the even iterates because Algorithm 5 does; remember that in Algorithm 6 the even and odd iterates yield the same set of similarity scores. We get:
$$\begin{aligned}
x^{(k)} &= (G^TG)\,x^{(k-2)}\\
&= \bigl((A_S\otimes B_S + A_T\otimes B_T)(A_S^T\otimes B_S^T + A_T^T\otimes B_T^T)\bigr)\,x^{(k-2)}\\
&= \bigl((A_S\otimes B_S)(A_S^T\otimes B_S^T) + (A_S\otimes B_S)(A_T^T\otimes B_T^T) + (A_T\otimes B_T)(A_S^T\otimes B_S^T) + (A_T\otimes B_T)(A_T^T\otimes B_T^T)\bigr)\,x^{(k-2)}\\
&= \bigl(A_SA_S^T\otimes B_SB_S^T + A_SA_T^T\otimes B_SB_T^T + A_TA_S^T\otimes B_TB_S^T + A_TA_T^T\otimes B_TB_T^T\bigr)\,x^{(k-2)}\\
&= \bigl(A_SA_S^T\otimes B_SB_S^T + A\otimes B + A^T\otimes B^T + A_TA_T^T\otimes B_TB_T^T\bigr)\,x^{(k-2)}\\
&= \bigl(A\otimes B + A^T\otimes B^T + A_SA_S^T\otimes B_SB_S^T + A_TA_T^T\otimes B_TB_T^T\bigr)\,x^{(k-2)}
\end{aligned}$$
where $A, B$ are the adjacency matrices of $G$, $H$ (Property 2.3.5), $A_SA_S^T$, $B_SB_S^T$ are the diagonal matrices with the outdegrees of the vertices on the diagonal (Property 2.3.3), and $A_TA_T^T$, $B_TB_T^T$ are the diagonal matrices with the indegrees of the vertices on the diagonal (Property 2.3.4). The iteration suggested in Theorem 2.2.15 in the previous section has the following form (see equation (2.12)):
$$x^{(k)} = (A\otimes B + A^T\otimes B^T)\,x^{(k-2)};$$
thus Algorithm 6 differs from Algorithm 5 in three important ways:
• In Algorithm 6 the inclusion of the additional diagonal terms serves to amplify the scores of nodes that are highly connected in the node similarity matrix,
• Algorithm 6 returns a node similarity matrix and an edge similarity matrix, while Algorithm 5 returns only a node similarity matrix,
• The whole sequence in Algorithm 6 converges, not only the even and odd iterates.
2.3.4 Example
Example 2.3.8. We retake Example 2.2.2 and number the edges as follows. Let $G = (V, \to)$ be:
[Figure: the graph G with vertices v1, v2, v3 and its two edges numbered 1 and 2.]
Let $H = (W, \to')$ be the following graph:
[Figure: the graph H with vertices w1, ..., w5 and its seven edges numbered 1 to 7.]
Then the node similarity matrix is:
$$X = \begin{pmatrix} 0.2338 & 0.0718 & 0\\ 0.2472 & 0.3230 & 0.0128\\ 0.1841 & 0.7553 & 0.3185\\ 0 & 0.0935 & 0.2338\\ 0 & 0.0441 & 0.0576\end{pmatrix},$$
and the edge similarity matrix equals:
$$Y = \begin{pmatrix} 0.2166 & 0.0329\\ 0.3847 & 0.1518\\ 0.3899 & 0.2495\\ 0.1325 & 0.2166\\ 0.1133 & 0.1480\\ 0.3653 & 0.4176\\ 0.1080 & 0.3847\end{pmatrix}$$
If you look at the structure of H and compare it with the structure of G, then intuitively it is not surprising that edge 3 of H is most similar to edge 1 of G, and edge 6 of H most similar to edge 2 of G: they are both very central edges, with source nodes and terminal nodes that are very similar (they have high similarity scores). Also note that w3 is considered the most 'central' node (see subsection 2.2.4), as it has the largest similarity score with v2. This too can be explained: w3 has 2 incoming edges and 2 outgoing edges, and no other node of H has this.
2.4 Colored graphs
In this last section, we extend the notion of similarity to colored graphs. The paper [? ] on which it is based was written in 2009 by two Belgian professors, Paul Van Dooren and Catherine Fraikin, from the Catholic University of Louvain.
Graph coloring was already introduced in paragraph 1.5.1, and we will construct a method that returns similarity matrices for two graphs where the coloring is on the nodes or on the edges of both graphs. If you compare the original paper to the explanation in this section, you will see that there are a lot of differences. This has two reasons: first, with the previous sections we already have a broad overview of similarity on graphs, so we can present the results in a more detailed way; second, to make the notation uniform throughout this master thesis, the paper had to be rewritten.
2.4.1 Colored nodes
Method
We first extend the node similarity introduced in Section 2.2 to node colored graphs. Take two node colored graphs $G = (V, \to, C, a)$ and $H = (U, \to', C, a')$ with $|C|$ different colors (remember that $a$, $a'$ are surjective) and assume that the nodes in both graphs are renumbered such that those of color 1 come first, then those of color 2, and so on. The adjacency matrices $A$ (of graph $G$) and $B$ (of graph $H$) can then be partitioned as follows:
$$A = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{1|C|}\\ A_{21} & A_{22} & \dots & A_{2|C|}\\ \vdots & \vdots & \ddots & \vdots\\ A_{|C|1} & A_{|C|2} & \dots & A_{|C||C|}\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} B_{11} & B_{12} & \dots & B_{1|C|}\\ B_{21} & B_{22} & \dots & B_{2|C|}\\ \vdots & \vdots & \ddots & \vdots\\ B_{|C|1} & B_{|C|2} & \dots & B_{|C||C|}\end{pmatrix}$$
Remember that $c_G(V, i)$ denotes the number of vertices of color $i$ in graph $G$; then each block $A_{ij} \in \mathbb{R}^{c_G(V,i)\times c_G(V,j)}$ and $B_{ij} \in \mathbb{R}^{c_H(U,i)\times c_H(U,j)}$ describes the adjacency relations from the nodes of color $i$ to the vertices of color $j$ in $A$ and $B$ respectively. In fact, by just renumbering the vertices such that the nodes with the same color succeed each other, you immediately get such a partitioning. If you see this renumbering as a permutation on the nodes, then performing on the original adjacency matrix a left and right multiplication with the corresponding permutation matrix (see Definition 1.1.1) leads to this adjusted adjacency matrix. To make the idea more comprehensible, we give an example.
Example 2.4.1. Let G be the following graph:
[Figure: a node colored graph G with vertices v1, v2, v3, v4.]
The adjacency matrix equals:
$$A = \begin{pmatrix} 0 & 1 & 1 & 0\\ 0 & 0 & 2 & 0\\ 0 & 1 & 0 & 2\\ 0 & 0 & 0 & 1\end{pmatrix}.$$
If we order the colors as {green, blue, yellow} (so color 1 = green, color 2 = blue, color 3 = yellow), we can renumber the vertices and get the following graph:
[Figure: the same graph with the vertices renumbered by color.]
The new adjacency matrix equals:
$$A' = PAP = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}\begin{pmatrix} 0 & 1 & 1 & 0\\ 0 & 0 & 2 & 0\\ 0 & 1 & 0 & 2\\ 0 & 0 & 0 & 1\end{pmatrix}\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 & 0\\ 0 & 0 & 1 & 2\\ 0 & 2 & 0 & 0\\ 0 & 0 & 0 & 1\end{pmatrix},$$
which can be partitioned in blocks as follows (we have 3 colors: 1 = green, 2 = blue, 3 = yellow):
$$A' = \begin{pmatrix} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33}\end{pmatrix} = \begin{pmatrix} \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix} & \begin{pmatrix}1\\ 1\end{pmatrix} & \begin{pmatrix}0\\ 2\end{pmatrix}\\[4pt] \begin{pmatrix}0 & 2\end{pmatrix} & (0) & (0)\\[4pt] \begin{pmatrix}0 & 0\end{pmatrix} & (0) & (1)\end{pmatrix}.$$
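The renumbering of Example 2.4.1 is easy to reproduce in Matlab; the snippet below is only a sanity check of the computation above:

% Check of Example 2.4.1: renumbering the vertices amounts to a left and
% right multiplication with the permutation matrix P (which swaps v2 and v3).
A = [0 1 1 0; 0 0 2 0; 0 1 0 2; 0 0 0 1];
P = [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1];
Aprime = P * A * P   % gives [0 1 1 0; 0 0 1 2; 0 2 0 0; 0 0 0 1]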
Now, we will only compare the nodes of the same color in both graphs, so we define similarity matrices $S_{ii} \in \mathbb{R}^{c_H(U,i)\times c_G(V,i)}$, with $i = 1, \dots, |C|$, which we can put in a block-diagonal similarity matrix:
$$S = \begin{pmatrix} S_{11} & 0 & \dots & 0\\ 0 & S_{22} & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \dots & S_{|C||C|}\end{pmatrix}$$
Theorem 2.4.2. Let $G = (V, \to, C, a)$ and $H = (U, \to', C, a')$ be two node colored graphs and define:
$$Z_1^{(k+1)} = \sum_{i \in \{1,\dots,|C|\}} B_{1i}\,S_{ii}^{(k)}\,A_{1i}^T + B_{i1}^T\,S_{ii}^{(k)}\,A_{i1}$$
$$Z_2^{(k+1)} = \sum_{i \in \{1,\dots,|C|\}} B_{2i}\,S_{ii}^{(k)}\,A_{2i}^T + B_{i2}^T\,S_{ii}^{(k)}\,A_{i2}$$
$$\vdots$$
$$Z_{|C|}^{(k+1)} = \sum_{i \in \{1,\dots,|C|\}} B_{|C|i}\,S_{ii}^{(k)}\,A_{|C|i}^T + B_{i|C|}^T\,S_{ii}^{(k)}\,A_{i|C|}$$
$$(S_{11}, S_{22}, \dots, S_{|C||C|})^{(k+1)} = \frac{\bigl(Z_1^{(k+1)}, Z_2^{(k+1)}, \dots, Z_{|C|}^{(k+1)}\bigr)}{\bigl\|\bigl(Z_1^{(k+1)}, Z_2^{(k+1)}, \dots, Z_{|C|}^{(k+1)}\bigr)\bigr\|_F}$$
for $k = 0, 1, \dots$ Then the matrix subsequences $Z_j^{(2k)}$, for every $j \in \{1, \dots, |C|\}$, and $(S_{11}, S_{22}, \dots, S_{|C||C|})^{(2k)}$ converge to $Z_j^{\mathrm{even}}$ and $(S_{11}, S_{22}, \dots, S_{|C||C|})^{\mathrm{even}}$. Also the odd subsequences converge. If we take
$$S_{jj}^{(0)} = J \in \mathbb{R}^{c_H(U,j)\times c_G(V,j)}$$
as initial matrices, then the resulting $S_{jj}^{\mathrm{even}}(\mathbf{1})$ are the unique matrices of largest 1-norm among all possible limits with positive start vector.
Proof. By induction on |C|. Remember from Definition 1.5.18 that the function a is surjective,meaning that C only consists of colors that are actually used.
For |C| = 1, we have a graph with all vertices having the same color, which can be seenas an uncolored graph. This is just the normal case as proved in Theorem 2.2.15.
Although redundant, it is instructive to prove the case $|C| = 2$ separately, because the generalization in the induction step is then immediately clear. So consider the partitioned adjacency matrices:
$$\begin{pmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end{pmatrix} \quad\text{and}\quad \begin{pmatrix} B_{11} & B_{12}\\ B_{21} & B_{22}\end{pmatrix};$$
the equations of the theorem are in this case:
$$Z_1'^{(k+1)} = B_{11}S_{11}^{(k)}A_{11}^T + B_{11}^TS_{11}^{(k)}A_{11} + B_{12}S_{22}^{(k)}A_{12}^T + B_{21}^TS_{22}^{(k)}A_{21}$$
$$Z_2'^{(k+1)} = B_{21}S_{11}^{(k)}A_{21}^T + B_{12}^TS_{11}^{(k)}A_{12} + B_{22}S_{22}^{(k)}A_{22}^T + B_{22}^TS_{22}^{(k)}A_{22}$$
$$(S_{11}, S_{22})^{(k+1)} = \frac{(Z_1'^{(k+1)}, Z_2'^{(k+1)})}{\|(Z_1'^{(k+1)}, Z_2'^{(k+1)})\|_F}$$
We can write with Theorem 2.2.14:
$$Z_1'^{(k+1)} = B_{11}Z_1'^{(k)}A_{11}^T + B_{11}^TZ_1'^{(k)}A_{11} + B_{12}Z_2'^{(k)}A_{12}^T + B_{21}^TZ_2'^{(k)}A_{21}$$
$$\Leftrightarrow\; \operatorname{vec}(Z_1'^{(k+1)}) = \operatorname{vec}\bigl(B_{11}Z_1'^{(k)}A_{11}^T + B_{11}^TZ_1'^{(k)}A_{11} + B_{12}Z_2'^{(k)}A_{12}^T + B_{21}^TZ_2'^{(k)}A_{21}\bigr)$$
$$\Leftrightarrow\; \operatorname{vec}(Z_1'^{(k+1)}) = (A_{11}\otimes B_{11})\operatorname{vec}(Z_1'^{(k)}) + (A_{11}^T\otimes B_{11}^T)\operatorname{vec}(Z_1'^{(k)}) + (A_{12}\otimes B_{12})\operatorname{vec}(Z_2'^{(k)}) + (A_{21}^T\otimes B_{21}^T)\operatorname{vec}(Z_2'^{(k)})$$
$$\Leftrightarrow\; \operatorname{vec}(Z_1'^{(k+1)}) = (A_{11}\otimes B_{11} + A_{11}^T\otimes B_{11}^T)\operatorname{vec}(Z_1'^{(k)}) + (A_{12}\otimes B_{12} + A_{21}^T\otimes B_{21}^T)\operatorname{vec}(Z_2'^{(k)})$$
In a similar way we can write $Z_2'^{(k+1)}$, the non-normalized version of $Z_2^{(k+1)}$, as:
$$\operatorname{vec}(Z_2'^{(k+1)}) = (A_{21}\otimes B_{21} + A_{12}^T\otimes B_{12}^T)\operatorname{vec}(Z_1'^{(k)}) + (A_{22}\otimes B_{22} + A_{22}^T\otimes B_{22}^T)\operatorname{vec}(Z_2'^{(k)})$$
If we define $M$ and $z^{(k+1)}$ as follows, the previous expressions concatenate to a single matrix update equation:
$$z^{(k+1)} = \begin{pmatrix}\operatorname{vec}(Z_1')\\ \operatorname{vec}(Z_2')\end{pmatrix}^{(k+1)} = \begin{pmatrix} A_{11}\otimes B_{11} + A_{11}^T\otimes B_{11}^T & A_{12}\otimes B_{12} + A_{21}^T\otimes B_{21}^T\\ A_{21}\otimes B_{21} + A_{12}^T\otimes B_{12}^T & A_{22}\otimes B_{22} + A_{22}^T\otimes B_{22}^T\end{pmatrix}\begin{pmatrix}\operatorname{vec}(Z_1')\\ \operatorname{vec}(Z_2')\end{pmatrix}^{(k)} = Mz^{(k)}$$
Notice that the diagonal blocks in $M$ are related to links between nodes with the same color, while the off-diagonal blocks refer to links between nodes of different colors. As always, we want to use Theorem 2.2.10 to get the result. $M$ is clearly nonnegative, because every block $A_{ij}$ and $B_{ij}$ is nonnegative (it is derived from the nonnegative adjacency matrices $A$ and $B$). Proving that $M$ is symmetric is a bit trickier, but notice that $A_{11}\otimes B_{11} + (A_{11}\otimes B_{11})^T$ is a symmetric $c_G(1)c_H(1)\times c_G(1)c_H(1)$-matrix (because it is the sum of a matrix with its transpose), and $A_{22}\otimes B_{22} + (A_{22}\otimes B_{22})^T$ is a symmetric $c_G(2)c_H(2)\times c_G(2)c_H(2)$-matrix. If we define $G = A_{21}\otimes B_{21} + A_{12}^T\otimes B_{12}^T$, then $G$ is a $c_G(2)c_H(2)\times c_G(1)c_H(1)$-matrix. Now, notice the following relation between the off-diagonal blocks:
$$G^T = (A_{21}\otimes B_{21} + A_{12}^T\otimes B_{12}^T)^T = (A_{21}\otimes B_{21})^T + (A_{12}^T\otimes B_{12}^T)^T = A_{12}\otimes B_{12} + A_{21}^T\otimes B_{21}^T$$
$G^T$ is a $c_G(1)c_H(1)\times c_G(2)c_H(2)$-matrix. So we can rewrite $M$ as:
$$M = \begin{pmatrix} A_{11}\otimes B_{11} + A_{11}^T\otimes B_{11}^T & G^T\\ G & A_{22}\otimes B_{22} + A_{22}^T\otimes B_{22}^T\end{pmatrix}$$
$M$ is a $(c_G(1)c_H(1) + c_G(2)c_H(2)) \times (c_G(1)c_H(1) + c_G(2)c_H(2))$-matrix. We want to prove that $(M)_{ij} = (M)_{ji}$, and to do so we distinguish all possible cases:
• If $i \le c_G(1)c_H(1)$ and $j \le c_G(1)c_H(1)$, then $(M)_{ij}$ and $(M)_{ji}$ are both entries of $A_{11}\otimes B_{11} + A_{11}^T\otimes B_{11}^T$, and this submatrix is symmetric.
• If $i \le c_G(1)c_H(1)$ and $j > c_G(1)c_H(1)$, then $(M)_{ij}$ is an entry of $G^T$ and $(M)_{ji}$ is an entry of $G$, so they are equal.
• If $i > c_G(1)c_H(1)$ and $j > c_G(1)c_H(1)$, then $(M)_{ij}$ and $(M)_{ji}$ are both entries of $A_{22}\otimes B_{22} + A_{22}^T\otimes B_{22}^T$, and this submatrix is symmetric.
• If $i > c_G(1)c_H(1)$ and $j \le c_G(1)c_H(1)$, then $(M)_{ij}$ is an entry of $G$ and $(M)_{ji}$ is an entry of $G^T$, so they are equal.
The result immediately follows from Theorem 2.2.10. We already motivated the usage of the Frobenius norm in the proof of Theorem 2.2.15. Notice that the normalization after each iteration step happens 'together', by dividing by $\|(Z_1^{(k+1)}, Z_2^{(k+1)})\|_F$, because this is in accordance with the conditions of Theorem 2.2.10. Normalizing $Z_1^{(k+1)}$ and $Z_2^{(k+1)}$ separately after the expressions are calculated is a bad idea: it gives an iterative process that is different from the one described in Theorem 2.2.10, and we cannot prove convergence in that case.
We now prove the induction step $|C| = n-1 \Rightarrow |C| = n$. The only crucial thing to prove is that $M$ is again symmetric; the rest of the steps consists of an easy expansion of the case $|C| = 2$. $M$ is in this case equal to:
$$M = \begin{pmatrix}
A_{11}\otimes B_{11} + A_{11}^T\otimes B_{11}^T & \dots & A_{1n}\otimes B_{1n} + A_{n1}^T\otimes B_{n1}^T\\
\vdots & \ddots & \vdots\\
A_{n1}\otimes B_{n1} + A_{1n}^T\otimes B_{1n}^T & \dots & A_{nn}\otimes B_{nn} + A_{nn}^T\otimes B_{nn}^T
\end{pmatrix},$$
i.e. the block on position $(i,j)$ equals $A_{ij}\otimes B_{ij} + A_{ji}^T\otimes B_{ji}^T$, which can be seen as:
$$M = \begin{pmatrix}
M' & \begin{matrix} A_{1n}\otimes B_{1n} + A_{n1}^T\otimes B_{n1}^T\\ \vdots\\ A_{(n-1)n}\otimes B_{(n-1)n} + A_{n(n-1)}^T\otimes B_{n(n-1)}^T\end{matrix}\\[6pt]
\begin{matrix} A_{n1}\otimes B_{n1} + A_{1n}^T\otimes B_{1n}^T & \dots & A_{n(n-1)}\otimes B_{n(n-1)} + A_{(n-1)n}^T\otimes B_{(n-1)n}^T\end{matrix} & A_{nn}\otimes B_{nn} + A_{nn}^T\otimes B_{nn}^T
\end{pmatrix}$$
From the induction hypothesis we know that $M'$ is symmetric. It is clear that the entries in the last column of $M$ are the transposes of the entries in the last row. It follows that $M$ is again symmetric.
Algorithm 8 and Algorithm 9
We present an algorithmic implementation of the described method. First, we have to find a way to easily calculate a partitioned adjacency matrix. This is presented in Algorithm 8. This algorithm expects as input an adjacency matrix where the vertices are arranged by color, and an ordered list that tells how many vertices belong to each color. In Algorithm 9 we calculate the similarity matrix of two node colored graphs, based on the adjacency matrix partitioning algorithm. A Matlab implementation of both algorithms can be found in Listing A.7 and Listing A.8 in Appendix A.
Data:
    C: a list with the number of vertices sharing the same color in G (both the vertices and the colors must be ordered appropriately)
    A: the 'normal' adjacency matrix of the graph G (the vertices must be ordered appropriately: the first vertices must belong to the first color)
Result:
    Z: a 2-dimensional list where the element on place (i, j) is the partition A_ij of the adjacency matrix A, between the vertices of color i and the vertices of color j
begin colored_node_adjacency_matrix_partitioning(C, A)
    Z = an empty |C| x |C| list;
    for i : 1 to |C| do
        for j : 1 to |C| do
            C_i = the number of vertices of color i (= C(i));
            C_j = the number of vertices of color j (= C(j));
            B = a C_i x C_j matrix with all entries equal to 0;
            start_vertex_i = 0;
            start_vertex_j = 0;
            for k : 1 to i - 1 do
                start_vertex_i = start_vertex_i + C(k);
            end
            for k : 1 to j - 1 do
                start_vertex_j = start_vertex_j + C(k);
            end
            for r : 1 to C_i do
                for s : 1 to C_j do
                    B(r, s) = A(start_vertex_i + r, start_vertex_j + s);
                end
            end
            Z(i, j) = B;
        end
    end
    return Z;
end
Algorithm 8: Algorithm that takes an ordered list with the number of vertices per color and a normal adjacency matrix as input and returns a partitioned adjacency matrix.
Data:
    CA: a list with the number of vertices sharing the same color in G,
    A: the 'normal' adjacency matrix of the graph G,
    CB: a list with the number of vertices sharing the same color in H,
    B: the 'normal' adjacency matrix of the graph H,
    TOL: tolerance for the estimation error.
Result:
    S: a 1-dimensional list with on place (i) the similarity matrix S_ii of the vertices of color i.
begin colored_node_similarity_matrix(CA, A, CB, B, TOL)
    AP = colored_node_adjacency_matrix_partitioning(CA, A);
    BP = colored_node_adjacency_matrix_partitioning(CB, B);
    k = 1;
    for i : 1 to |CA| do
        S(i) = a CB(i) x CA(i) matrix with all entries equal to 1;
        mu(i) = a CB(i) x CA(i) matrix with all entries equal to TOL;
    end
    S_previous = S; S_previous_even = S;
    while True do
        norm = 0;
        for i : 1 to |CA| do
            Z(i) = sum over j in {1,...,|CA|} of BP(i, j) S_previous(j) AP(i, j)^T + BP(j, i)^T S_previous(j) AP(j, i);
            norm = norm + trace(Z(i)^T Z(i));
        end
        S = Z / sqrt(norm);
        if k even then
            if |S - S_previous_even| < mu then
                break;
            else
                S_previous_even = S;
            end
        end
        S_previous = S;
        k = k + 1;
    end
    return S;
end
Algorithm 9: Algorithm that calculates the block-diagonal similarity matrix (the blocks S_ii) between two node colored graphs G and H.
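To illustrate the iteration of Theorem 2.4.2 more concretely, here is a minimal Matlab sketch that stores the partitioned adjacency matrices in cell arrays AP{i,j} and BP{i,j}; it is a simplified illustration (with a fixed iteration cap and a scalar tolerance), not the thesis' Listing A.8:

% Sketch of the colored-node similarity iteration of Theorem 2.4.2.
% AP{i,j}, BP{i,j} hold the blocks A_ij, B_ij of the partitioned
% adjacency matrices of G and H; S{i} approximates S_ii.
function S = colored_node_similarity(AP, BP, TOL)
    c = size(AP, 1);                          % number of colors |C|
    S = cell(1, c); Z = cell(1, c);
    for i = 1:c
        S{i} = ones(size(BP{i,i},1), size(AP{i,i},1));  % S_ii^(0) = J
    end
    Sprev = S;
    for k = 1:1000                            % safety cap on the iterations
        nrm2 = 0;
        for i = 1:c
            Z{i} = zeros(size(S{i}));
            for j = 1:c
                Z{i} = Z{i} + BP{i,j}*S{j}*AP{i,j}' + BP{j,i}'*S{j}*AP{j,i};
            end
            nrm2 = nrm2 + trace(Z{i}'*Z{i});  % squared Frobenius norm of the tuple
        end
        for i = 1:c, S{i} = Z{i} / sqrt(nrm2); end
        if mod(k, 2) == 0                     % compare only the even iterates
            d = 0;
            for i = 1:c, d = max(d, max(abs(S{i}(:) - Sprev{i}(:)))); end
            if d < TOL, break; end
            Sprev = S;
        end
    end
end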
Example
Example 2.4.3. Let G be the graph:
[Figure: a node colored graph G with vertices v1, ..., v5.]
and H be the graph:
[Figure: a node colored graph H with vertices w1, ..., w5.]
The similarity matrix is given by:
$$S = \begin{pmatrix} 0.5969 & 0.3791 & 0 & 0 & 0\\ 0 & 0.3791 & 0.5969 & 0 & 0\\ 0 & 0 & 0 & 0.5986 & 0\\ 0 & 0 & 0 & 0.3764 & 0.3764\\ 0 & 0 & 0 & 0 & 0.5986\end{pmatrix}$$
As one would expect, the highest similarity scores (around 0.59) are obtained for the nodes that are the transitions between two types of nodes: all the pairs (v1, w1), (v3, w2), (v4, w3), (v5, w5) share this similarity score.
2.4.2 Colored edges
Method
We now extend the node-edge similarity method from Section 2.3 to edge colored graphs. Take two edge colored graphs $G = (V, \to, C, b)$ and $H = (U, \to', C, b')$ with $|C|$ different colors (remember that $b, b'$ are surjective). The edges can be renumbered such that those of the same color are next to each other in the source-edge and terminal-edge matrices, so $A_S, A_T$ from $G$ and $B_S, B_T$ from $H$ can be partitioned as follows:
$$A_S = \begin{pmatrix} A_{S_1} & \dots & A_{S_{|C|}}\end{pmatrix} \quad\text{and}\quad A_T = \begin{pmatrix} A_{T_1} & \dots & A_{T_{|C|}}\end{pmatrix}$$
$$B_S = \begin{pmatrix} B_{S_1} & \dots & B_{S_{|C|}}\end{pmatrix} \quad\text{and}\quad B_T = \begin{pmatrix} B_{T_1} & \dots & B_{T_{|C|}}\end{pmatrix}$$
Again, this can easily be achieved by multiplying the original $A_S$ with the permutation matrix that represents the renumbering of the edges. The blocks $A_{S_i}, A_{T_i} \in \mathbb{R}^{n_G \times c_G(\to,i)}$, with $n_G$ the number of vertices of $G$ and $c_G(\to,i)$ the number of edges of color $i$. So for $H$ we have $B_{S_i}, B_{T_i} \in \mathbb{R}^{n_H \times c_H(\to',i)}$. We give a small example.
Example 2.4.4. Let G be the following graph:
[Figure: an edge colored graph G with vertices v1, ..., v4 and edges e1, ..., e8 (e4 and e5 are parallel edges).]
When we calculate the source-edge matrix with Algorithm 7 (the resulting matrix represents an edge numbering left-to-right, the same as indicated in the graph) and the terminal-edge matrix in the same way, we get:
$$A_S = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\end{pmatrix} \quad\text{and}\quad A_T = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1\end{pmatrix}$$
If we order the colors as {green, blue, orange} (so color 1 = green, color 2 = blue, color 3 = orange), we can renumber the edges:
[Figure: the same graph with the edges renumbered by color: e1, ..., e8.]
$A_S'$ and $A_T'$ are now, with $P$ the permutation matrix of the renumbering:
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\end{pmatrix}$$
$$A_S' = A_SP = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\end{pmatrix}, \qquad A_T' = A_TP = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0\end{pmatrix},$$
which can be partitioned in blocks as follows (we have 3 colors: 1 = green, 2 = blue, 3 = orange):
$$A_S' = \begin{pmatrix} A_{S_1}' & A_{S_2}' & A_{S_3}'\end{pmatrix} \quad\text{with}\quad A_{S_1}' = \begin{pmatrix} 1 & 0\\ 0 & 0\\ 0 & 1\\ 0 & 0\end{pmatrix},\; A_{S_2}' = \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\end{pmatrix},\; A_{S_3}' = \begin{pmatrix} 0\\ 1\\ 0\\ 0\end{pmatrix}$$
$$A_T' = \begin{pmatrix} A_{T_1}' & A_{T_2}' & A_{T_3}'\end{pmatrix} \quad\text{with}\quad A_{T_1}' = \begin{pmatrix} 0 & 0\\ 1 & 0\\ 0 & 0\\ 0 & 1\end{pmatrix},\; A_{T_2}' = \begin{pmatrix} 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1\end{pmatrix},\; A_{T_3}' = \begin{pmatrix} 0\\ 0\\ 1\\ 0\end{pmatrix}$$
Just like in the colored node method, the edge similarity matrix will be block-diagonal, because we compare only the edges of the same color. The edge similarity matrix thus has a block diagonal structure with blocks $Y_{ii} \in \mathbb{R}^{c_G(\to,i)\times c_H(\to',i)}$:
$$Y = \begin{pmatrix} Y_{11} & 0 & \dots & 0\\ 0 & Y_{22} & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \dots & Y_{|C||C|}\end{pmatrix}$$
The node similarity matrix $X$, on the other hand, is no different from the one of Theorem 2.3.6. To adapt the method of Theorem 2.3.6 to colored edges, we have to rewrite the equations (2.18) and (2.19) in a decoupled form, such that $X^{(k)}$ and $Y^{(k)}$ can be calculated independently of each other. To make this paragraph more readable, we write our original equations again:
$$Y^{(k+1)} = \frac{B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F}, \qquad X^{(k+1)} = \frac{B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T}{\|B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T\|_F}$$
Remember that $G = A_S^T\otimes B_S^T + A_T^T\otimes B_T^T$ and $G^T = A_S\otimes B_S + A_T\otimes B_T$. From equations (3.10) and (3.11) we can write (see the proof of Theorem 2.3.6):
$$x^{(k+1)} = \frac{G^TG\,x^{(k)}}{\|G^TG\,x^{(k)}\|_F} \quad\text{and}\quad y^{(k+1)} = \frac{GG^T\,y^{(k)}}{\|GG^T\,y^{(k)}\|_F} \qquad (2.22)$$
Remember that $x^{(k)} = \operatorname{vec}(X^{(k)})$; with Lemma 2.2.13 we can rewrite this in decoupled notation (see the first part of the proof of Theorem 2.3.6):
$$X^{(k+1)} = \frac{B_S(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_S^T + B_T(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_T^T}{\|B_S(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_S^T + B_T(B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T)A_T^T\|_F}$$
$$Y^{(k+1)} = \frac{B_S^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_S + B_T^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_T}{\|B_S^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_S + B_T^T(B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T)A_T\|_F}$$
To keep the notation understandable, we will keep using the decoupled equations (2.22). We are now ready for the theorem that describes the method of edge similarity on colored edges:
Theorem 2.4.5. Let $G = (V, \to, C, b)$ and $H = (U, \to', C, b')$ be two edge colored graphs and define:
$$X'^{(k+1)} = \sum_{i \in \{1,\dots,|C|\}} B_{S_i}Y_{ii}^{(k)}A_{S_i}^T + B_{T_i}Y_{ii}^{(k)}A_{T_i}^T$$
$$Y_{11}'^{(k+1)} = B_{S_1}^TX^{(k)}A_{S_1} + B_{T_1}^TX^{(k)}A_{T_1}$$
$$\vdots$$
$$Y_{ii}'^{(k+1)} = B_{S_i}^TX^{(k)}A_{S_i} + B_{T_i}^TX^{(k)}A_{T_i}$$
$$\vdots$$
$$Y_{|C||C|}'^{(k+1)} = B_{S_{|C|}}^TX^{(k)}A_{S_{|C|}} + B_{T_{|C|}}^TX^{(k)}A_{T_{|C|}}$$
$$(X, Y_{11}, \dots, Y_{|C||C|})^{(k+1)} = \frac{\bigl(X'^{(k+1)}, Y_{11}'^{(k+1)}, \dots, Y_{|C||C|}'^{(k+1)}\bigr)}{\bigl\|\bigl(X'^{(k+1)}, Y_{11}'^{(k+1)}, \dots, Y_{|C||C|}'^{(k+1)}\bigr)\bigr\|_F}$$
for $k = 0, 1, \dots$ Then the matrix subsequences $(X, Y_{11}, \dots, Y_{|C||C|})^{(2k)}$ converge to $(X, Y_{11}, \dots, Y_{|C||C|})^{\mathrm{even}}$. Also the odd subsequences converge. If we take
$$X^{(0)} = J \in \mathbb{R}^{n_H \times n_G}, \qquad Y_{jj}^{(0)} = J \in \mathbb{R}^{c_G(\to,j)\times c_H(\to',j)}$$
as initial matrices, then the resulting $X^{\mathrm{even}}(\mathbf{1})$ and $Y_{jj}^{\mathrm{even}}(\mathbf{1})$ are the unique matrices of largest 1-norm among all possible limits with positive start vector.
Proof. By induction on $|C|$. Remember from Definition 1.5.20 that the function $b$ is surjective, meaning that $C$ only consists of colors that are actually used. For $|C| = 1$, we have a graph with all edges having the same color, which can be seen as an uncolored graph. This is just the normal case, as proved in Theorem 2.3.6. Although again redundant, it is instructive to prove the case $|C| = 2$ separately, because the generalization in the induction step is then immediately clear. So consider the partitioned source-edge and terminal-edge matrices:
$$A_S = \begin{pmatrix} A_{S_1} & A_{S_2}\end{pmatrix} \quad\text{and}\quad A_T = \begin{pmatrix} A_{T_1} & A_{T_2}\end{pmatrix}$$
$$B_S = \begin{pmatrix} B_{S_1} & B_{S_2}\end{pmatrix} \quad\text{and}\quad B_T = \begin{pmatrix} B_{T_1} & B_{T_2}\end{pmatrix}$$
The equations of the theorem are in this case:
$$X'^{(k+1)} = B_{S_1}Y_{11}'^{(k)}A_{S_1}^T + B_{T_1}Y_{11}'^{(k)}A_{T_1}^T + B_{S_2}Y_{22}'^{(k)}A_{S_2}^T + B_{T_2}Y_{22}'^{(k)}A_{T_2}^T$$
$$Y_{11}'^{(k+1)} = B_{S_1}^TX'^{(k)}A_{S_1} + B_{T_1}^TX'^{(k)}A_{T_1}$$
$$Y_{22}'^{(k+1)} = B_{S_2}^TX'^{(k)}A_{S_2} + B_{T_2}^TX'^{(k)}A_{T_2}$$
which can be rewritten using Lemma 2.2.13 as:
$$X'^{(k+1)} = B_{S_1}Y_{11}'^{(k)}A_{S_1}^T + B_{T_1}Y_{11}'^{(k)}A_{T_1}^T + B_{S_2}Y_{22}'^{(k)}A_{S_2}^T + B_{T_2}Y_{22}'^{(k)}A_{T_2}^T$$
$$\Leftrightarrow\; \operatorname{vec}(X'^{(k+1)}) = (A_{S_1}\otimes B_{S_1} + A_{T_1}\otimes B_{T_1})\operatorname{vec}(Y_{11}'^{(k)}) + (A_{S_2}\otimes B_{S_2} + A_{T_2}\otimes B_{T_2})\operatorname{vec}(Y_{22}'^{(k)})$$
$Y_{11}'^{(k)}$ and $Y_{22}'^{(k)}$ can also be rewritten:
$$\operatorname{vec}(Y_{11}'^{(k+1)}) = (A_{S_1}^T\otimes B_{S_1}^T + A_{T_1}^T\otimes B_{T_1}^T)\operatorname{vec}(X'^{(k)})$$
$$\operatorname{vec}(Y_{22}'^{(k+1)}) = (A_{S_2}^T\otimes B_{S_2}^T + A_{T_2}^T\otimes B_{T_2}^T)\operatorname{vec}(X'^{(k)})$$
We define $z^{(k+1)}$ and $M$ as follows, and again the previous expressions concatenate to a single matrix equation:
$$z^{(k+1)} = \begin{pmatrix}\operatorname{vec}(X)\\ \operatorname{vec}(Y_{11})\\ \operatorname{vec}(Y_{22})\end{pmatrix}^{(k+1)} = \begin{pmatrix} 0 & A_{S_1}\otimes B_{S_1} + A_{T_1}\otimes B_{T_1} & A_{S_2}\otimes B_{S_2} + A_{T_2}\otimes B_{T_2}\\ A_{S_1}^T\otimes B_{S_1}^T + A_{T_1}^T\otimes B_{T_1}^T & 0 & 0\\ A_{S_2}^T\otimes B_{S_2}^T + A_{T_2}^T\otimes B_{T_2}^T & 0 & 0\end{pmatrix}\begin{pmatrix}\operatorname{vec}(X)\\ \operatorname{vec}(Y_{11})\\ \operatorname{vec}(Y_{22})\end{pmatrix}^{(k)} = Mz^{(k)}$$
$M$ is clearly nonnegative, as it consists of zero blocks and sums of Kronecker products of nonnegative matrices. To see that $M$ is symmetric, rewrite $M$ with $G$ and $G^T$:
$$G = \begin{pmatrix} A_{S_1}^T\otimes B_{S_1}^T + A_{T_1}^T\otimes B_{T_1}^T\\ A_{S_2}^T\otimes B_{S_2}^T + A_{T_2}^T\otimes B_{T_2}^T\end{pmatrix}$$
With Lemma 2.2.13 we calculate $G^T$:
$$G^T = \begin{pmatrix} A_{S_1}\otimes B_{S_1} + A_{T_1}\otimes B_{T_1} & A_{S_2}\otimes B_{S_2} + A_{T_2}\otimes B_{T_2}\end{pmatrix}$$
$G$ is a $(c_G(\to,1)c_H(\to',1) + c_G(\to,2)c_H(\to',2)) \times n_Gn_H$-matrix, so:
$$M = \begin{pmatrix} 0_{n_Gn_H} & G^T\\ G & 0_{c_G(\to,1)c_H(\to',1) + c_G(\to,2)c_H(\to',2)}\end{pmatrix}$$
which is clearly a symmetric matrix, and the result follows. We now prove the induction step $|C| = n-1 \Rightarrow |C| = n$. The only thing to show is that $M$ stays symmetric (nonnegativity is clear), but this is obvious, as $G$ can be rewritten as:
$$G = \begin{pmatrix} A_{S_1}^T\otimes B_{S_1}^T + A_{T_1}^T\otimes B_{T_1}^T\\ A_{S_2}^T\otimes B_{S_2}^T + A_{T_2}^T\otimes B_{T_2}^T\\ \vdots\\ A_{S_{|C|}}^T\otimes B_{S_{|C|}}^T + A_{T_{|C|}}^T\otimes B_{T_{|C|}}^T\end{pmatrix}$$
and you can rewrite $M$ with $G$ and $G^T$ as before.
Algorithm
With the same reasoning as for colored nodes, we programmed an algorithm that takes a source-edge matrix or a terminal-edge matrix as input, together with an ordered list with the number of edges belonging to each color. Under the condition that the edges are arranged by color in the source-edge or terminal-edge matrix, this algorithm returns a partitioned source- or terminal-edge matrix. Based on this algorithm, we can construct an algorithm that returns the node similarity matrix X and the (colored) edge similarity matrix Y. The Matlab listings of both algorithms can be found in Listing A.9 and Listing A.10 in Appendix A. Because the algorithms closely resemble the previous ones, we did not write them out in pseudocode; a small sketch of the partitioning step is given below.
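As an indication of how little is needed, here is a minimal Matlab sketch of the partitioning step for a source-edge matrix (the same code works for a terminal-edge matrix). The function name and the list CE of edge counts per color are our own conventions, not those of Listing A.9:

% Sketch: partition a source-edge matrix AS by edge color, assuming the
% edges (columns) are already arranged by color and CE(i) is the number
% of edges of color i. Returns a cell array with the blocks A_{S_i}.
function ASP = partition_edge_matrix(AS, CE)
    ASP = cell(1, numel(CE));
    last = 0;
    for i = 1:numel(CE)
        ASP{i} = AS(:, last+1 : last+CE(i));  % the columns of color i
        last = last + CE(i);
    end
end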
Example
Example 2.4.6. Let G be the graph:
[Figure: an edge colored graph G with vertices w1, ..., w5 and edges e′1, e′2, e′3, e′4.]
and H be the graph:
[Figure: an edge colored graph H with vertices v1, v2, v3 and edges e1, e2.]
The node and edge similarity matrices are given by:
$$X = \begin{pmatrix} 0.2295 & 0 & 0 & 0 & 0\\ 0 & 0.8903 & 0 & 0 & 0\\ 0 & 0 & 0.2284 & 0.2284 & 0.2284\end{pmatrix}$$
$$Y = \begin{pmatrix} 0.5893 & 0 & 0 & 0\\ 0 & 0.5812 & 0.5812 & 0\end{pmatrix}$$
In fact these two graphs are very similar, in the sense that node v3 is replaced by the three nodes w3, w4, w5. This is also detected by the node and edge similarity matrices, which give equal similarity scores to (e′2, e2) and (e′3, e2); the score of (e′4, e2) is equal to zero because e′4 has a different color.
2.4.3 Fully colored graphs
We can also consider the combination of colored nodes and colored edges. This is in fact the same method as with the colored edges; the matrices X and Y will in this case both be block diagonal (possibly with a different number of blocks). The iteration matrix for Y will be completely equal to the iteration matrix for colored edges, and it is easy to adapt the iteration matrix for X so it can handle colored nodes. We will not discuss this case in detail, as it would be an extensive repetition of the previous subsections.
2.5 Applications
Similarity on graphs is a fairly new concept, so not many applications have been developed yet. In this short section we give a concise overview of the existing applications. We just give an intuitive idea, to give the reader a notion of which kinds of applications are already possible. For the more detailed, mathematical approach we refer to the relevant articles.
2.5.1 Synonym Extraction
In [? ], the structure graph $1 \to 2 \to 3$ is used to extract synonyms from graphs constructed from a dictionary. The method is based on the assumption that synonyms have many words in common in their definitions and appear together in the definitions of many words. So to construct a dictionary graph G, each word of the dictionary is a vertex and there is an edge from $v_i$ to $v_j$ if $v_i$ appears in the definition of $v_j$. Now, given a word $w$, we construct a neighbourhood graph G′, which is the subgraph of G whose vertices are pointed to by $w$ or are pointing to $w$. Finally, we rank the words by decreasing central score, and the words with the highest central scores are possibly good synonyms.
2.5.2 Graph matching
The main use of the current graph similarity algorithms as presented in this chapter is to find some kind of optimal matching between the elements (vertices and edges) of two graphs. Graph matching has become popular because it is useful for data mining. For example, in [? ] a brute-force algorithm is used that calculates the similarity scores between a graph G and every possible subgraph of G. By putting some conditions on the desired result, it is possible to detect the subgraph that is most similar to the original graph. Practical uses of graph matching are:
• Comparing two databases,
• Finding clusters in a set of data,
• Aligning sequences of DNA,
• Comparing networks,
• Comparing connections in social networks,
• ...
Chapter 3
Similarity on hypergraphs
3.1 Introduction
3.1.1 Motivation
In this section, we explore similarity on hypergraphs. This exploration is new: no papers can be found on the subject. A motivation to generalize the concept of similarity to hypergraphs can be found in the applications of hypergraphs in other sciences. A very clear application can be found in chemistry: some of the graph models used for the representation of molecules are not sufficient to grasp the whole complexity of the molecule structure (for example, molecules with delocalized polycentric bonds cannot be represented as graphs without losing a lot of information about the structure). Representing these molecules as hypergraphs is often a good and more illustrative solution [? ]. An established similarity method for hypergraphs may potentially detect equivalences in the structure of such molecules. Also in computer science, some research has been done on representing data structures as hypergraphs. For example, to speed up algorithms in parallel databases, a representation with hypergraphs can be used [? ]. Also in this case, a hypergraph similarity method may lead to an algorithm that efficiently compares two parallel databases.
3.1.2 What would be a good hypergraph similarity method?
In order to generalize this concept of similarity successfully and give it a correct meaning, we need to formulate some conditions that a possible method for similarity on hypergraphs has to fulfill. Intuitively, all methods from the previous chapter work in the same way: in the first iteration step only the adjacency relations of vertices (or edges) in the two graphs are used, and in each following iteration step the relationships between these adjacency relations are calculated, which results in high similarity scores (compared to the others in the similarity matrix) for a vertex when the adjacent vertices have a high similarity score. When calculating an edge similarity matrix, an edge will have a high similarity score (compared to others) if the vertices that the edge connects have a high similarity score.
Based on this, we now present our conditions. These conditions must ensure that asimilarity method for hypergraphs has the same behaviour as the similarity methods forgraphs:
(C1) When the method is applied to two undirected graphs, it must return the same similarity scores (up to a constant) as the methods from Chapter 2.
(C2) Adding a non-isolated node to one of the hypergraphs must influence the similarityscores.
(C3) Adding an edge that is not a hyperloop must influence the similarity scores.
(C4) The similarity score of a vertex in a hypergraph is large (compared to others) when the similarity scores of the adjacent vertices in the hypergraph are large.
(C5) The similarity score of an edge is large (compared to others) when it connects verticeswith large similarity scores.
(C6) When two vertices (edges) have the same relationships to all other vertices (edges), wesay that these vertices (edges) are structural equivalent. Structural equivalent verticesand edges must have the same similarity scores because, intuitively, they play exactly thesame role in the hypergraph structure. Actually, we can give a more general definitionof structural equivalence in hypergraphs:
Definition 3.1.1. Two vertices $v_i, v_j$ of a hypergraph G are structural equivalent if for each edge of size $k+1$ that connects $v_i$ to $v_{p_1}, v_{p_2}, \dots, v_{p_k}$, there is an edge of size $k+1$ (possibly the same) that connects $v_j$ to $v_{p_1}, v_{p_2}, \dots, v_{p_k}$.
When calculating an edge similarity matrix, structural equivalent edges (edges that havethe same size and connect exactly the same vertices) must also have the same similarityscores.
(C7) The cardinality $|E|$ of an edge $E$ in a hypergraph must influence the edge similarity scores: two edges with a high cardinality must get a higher similarity score than two edges with a lower cardinality.
(C8) A vertex can only have a similarity score equal to zero when the hypergraph is not connected. In that case, it can occur that a group of connected vertices dominates all vertices that are not adjacent to this group. Likewise, an edge can only have a similarity score equal to zero when the hypergraph is not connected. An isolated vertex (one that is not even contained in a hyperloop) will always have a similarity score equal to zero.
The attentive reader may wonder how we collected this set of conditions. These conditions were established in a heuristic way, based on observations from various experiments with the methods of similarity on graphs. To give these conditions a better foundation, we now explain how the similarity methods on graphs of the previous chapter meet each condition for undirected graphs. Although most conditions are also met for directed graphs, we only have to consider undirected graphs because hypergraphs themselves are undirected. The explanation can either be a proof of a simple theorem or a more heuristic explanation. The explanations are numbered (E1, ..., E8) in the same way as the conditions, because we will often refer to these explanations in the rest of this chapter.
(E1) Undirected graphs are equal to 2-hypergraphs. It is evident that in the case of a 2-hypergraph a similarity method for hypergraphs must return the same results (up to a constant) as the graph methods.
(E2) It is easy to see that (C2) holds for every (node) similarity method on graphs, as adding an extra node to a graph generates an additional row and column in the adjacency matrix. This row and column are thus included in the calculation of the similarity matrix S, and after the calculation, similarity scores for the added node compared to all the nodes of the other graph are obtained.
(E3) This holds with the same reasoning as (E2) but now applied to the (edge) similaritymethod.
(E4) We can explain this in a heuristic way: for example, in the method of Blondel, the compact form equals ($A$ is the adjacency matrix of $G_A$ and $B$ is the adjacency matrix of $G_B$):
$$S^{(k+1)} = \frac{BS^{(k)}A^T + B^TS^{(k)}A}{\|BS^{(k)}A^T + B^TS^{(k)}A\|_F}, \qquad k = 0, 1, \dots$$
First note that $BS^{(k)}A^T + B^TS^{(k)}A$ is in fact the sum of the similarities of the children and parents of node $v_i$ of $G_B$ and vertex $v_j$ of $G_A$. We see that $s_{i,j}$, the similarity score between $v_i$ of $B$ and $v_j$ of $A$, is in fact equal to a scalar times the element $(i,j)$ of $BS^{(k)}A^T + B^TS^{(k)}A$. The same can be said about the even iterates and hence about the complete similarity matrix $S$. So this implicit relation means that the similarity score of a vertex in a graph is large (compared to others) when the similarity scores of the adjacent vertices in the graph are large. (A small Matlab sketch of this compact iteration is given after this list of explanations.)
(E5) This holds with the same reasoning as (E4) but now applied to the node-edge similaritymethod (section 2.3).
(E6) To see that (C6) also holds in graphs, we define structural equivalent vertices in graphsas follows:
Definition 3.1.2. In a graph G , two vertices vi, vj are structural equivalent if theyhave exactly the same adjacency structure. Meaning that for each edge that connects vito a vertex vq, there must also be an edge that connects vj to vq. In a directed graph,the same holds in the sense that for each edge vi → vq there must exist an edge vj → vqand for each edge vp → vi there must exist an edge vp → vj.
Notice that all isolated vertices in a graph are always structural equivalent. We willshow that (C6) holds for the method of Blondel on graphs by proving the followingtheorem.
Theorem 3.1.3. Let GA be a graph with structural equivalent vertices and GB anothergraph, then by calculating the similarity matrix between GA and GB the structural equiv-alent vertices of GA will have the same similarity scores for every vertex of GB.
Proof. Let $G_A = (V, \to)$, $G_B = (W, \to')$ with $|V| = n$, $|W| = m$. First, it is easy to see that structural equivalence defines an equivalence relation $\sim$ on $V$, with $v_p \sim v_q$ if and only if $v_p$ and $v_q$ are structural equivalent. Take $\overline{v_p}$, an equivalence class with more than one vertex (this exists, as $G_A$ has structural equivalent vertices). Now, from the definition of structural equivalent vertices in graphs, we conclude that two vertices $v_p$ and $v_q$ are structural equivalent if and only if for all $i \in \{1, \dots, n\}$ it holds
that $a_{ip} = a_{iq}$ and that $a_{pi} = a_{qi}$. This also holds for directed graphs; in the undirected case we even have that $a_{ip} = a_{pi} = a_{qi} = a_{iq}$. So all vertices in $\overline{v_p}$ have the same entries in the adjacency matrix $A$ of $G_A$. The similarity score of $v_p$ and any other vertex $w_j$ of $G_B$ at iteration step $k+1$ can be calculated as ($A = (a_{ij})$, $B = (b_{ij})$, $S = (s_{ij})$):
$$s_{jp}^{(k+1)} = \frac{\sum_{f=1}^m\sum_{g=1}^n b_{jf}\,s_{fg}^{(k)}\,(a^T)_{gp} + (b^T)_{jf}\,s_{fg}^{(k)}\,a_{gp}}{\|BS^{(k)}A^T + B^TS^{(k)}A\|_F} \qquad (3.1)$$
$$= \frac{\sum_{f=1}^m\sum_{g=1}^n b_{jf}\,s_{fg}^{(k)}\,a_{pg} + b_{fj}\,s_{fg}^{(k)}\,a_{gp}}{\|BS^{(k)}A^T + B^TS^{(k)}A\|_F} \qquad (3.2)$$
So, if we consider this iteratively, we get that the similarity scores (at iteration step $k$) between a vertex in $\overline{v_p}$ and a vertex $w_j$ of $G_B$ are all equal to $s_{jp}^{(k)}$, because for all $g \in \{1, \dots, n\}$, $a_{pg}$ is the same for all vertices in $\overline{v_p}$, and the same holds for the $a_{gp}$'s. So all the vertices in $\overline{v_p}$ undergo the same calculations with the entries $b_{jf}, b_{fj}$ ($f \in \{1, \dots, m\}$). Note that we start with $s_{jp}^{(0)}$ equal to 1. Because we proved that for any equivalence class of $\sim$ (with more than one element) the similarity scores are equal at each iteration step $k$, the theorem follows.
Remark 3.1.4. The notion of structural equivalence on graphs was introduced into the network literature by Lorrain and White in 1971 [? ], using category theory. Later on, in 1988, the notion of automorphic equivalence was proposed by Chris Winship in [? ]. An automorphism of a graph $G = (V, \to)$ is a permutation $\pi$ of the vertex set $V$ such that for any two vertices $v_i, v_j$ it holds that $\pi(v_i) \to \pi(v_j)$ if and only if $v_i \to v_j$. Now, two vertices $v_i, v_j$ of a graph $G$ are automorphic equivalent if and only if there exists an automorphism $\pi$ such that $\pi(v_i) = v_j$. Automorphic equivalence is a natural generalization of structural equivalence, but it is easy to see that it is a weaker condition. Just as structural equivalence, automorphic equivalence can easily be generalized to nodes and edges of hypergraphs (see [? ]). When we developed these conditions for a good hypergraph similarity method, we long thought of imposing the condition that two automorphic equivalent nodes (or edges) of a hypergraph must have the same similarity score. Finally, we did not impose this condition, because our experiments showed that even the similarity methods for graphs fail to detect automorphic equivalence (in the sense that they should return the same similarity scores for automorphic equivalent vertices) for large and complex graphs. The methods we propose for hypergraphs will in general also fail to detect automorphic equivalent nodes or edges. Only when the considered hypergraphs are very simple might the methods return equal scores for automorphic equivalent nodes or edges. One may wonder whether this is a drawback of the current graph similarity methods or not. This depends on the practical setting, but in general we argue that this is normal behaviour to expect from a similarity method. We give an example to make our view clear; consider the following graph:
[Figure: an organizational chart with A on top, managers B, C, D below A, and workers E, G (under B), F (under C), and H, I (under D).]
This graph describes the organizational structure of an organization, with A the CEO, B, C, D managers, E, G workers for manager B, H, I workers for manager D, and F the lone worker for manager C. It is easy to see that the structural equivalent pairs are {E, G} and {H, I}: they have exactly the same working relations with the rest of the organization. Manager B and manager D are not structural equivalent (they do have the same CEO, but not the same workers), but they are automorphic equivalent (of course, E, G, H and I are automorphic equivalent too). This is also very intuitive: if we swapped them, together with their workers, the organizational structure would be the same. Still, if we compared this organizational structure with another one using a graph similarity method, we would argue that it is justified to give B and D different similarity scores, as they are managers of different workers. Moreover, if one indeed needs a method to detect automorphic equivalence, plenty of algorithms are already available for this purpose (see [? ]).
(E7) This condition arose from intuition. A hypergraph that is not a $k$-hypergraph has at least one edge of a different size. Now, intuitively, when an edge $E_i$ in a hypergraph is compared to an edge $E_i'$ in another hypergraph, it is somehow clear that they must have a higher (edge) similarity score when they connect more vertices. Of course, this only holds in connected hypergraphs (see condition (C8)). One may wonder whether graphs too meet this condition, but a different edge size only arises in the pathological case where the graph contains a loop. We can show that comparing two loops will normally result in a lower edge similarity score than comparing two edges that are not loops. We say 'normally' because, of course, (C4) or (C8) can always interfere in some graphs. Now, to show this, look at the node-edge similarity method; the edge similarity matrix $Y$ and the node similarity matrix $X$ are calculated as:
$$Y^{(k+1)} = \frac{B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F} \qquad (3.3)$$
$$X^{(k+1)} = \frac{B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T}{\|B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T\|_F} \qquad (3.4)$$
for $k = 0, 1, \dots$ Element-wise, we get:
$$y_{ji}^{(k+1)} = \frac{\sum_{f=1}^{n_H}\sum_{g=1}^{n_G} (b_S^T)_{jf}\,x_{fg}^{(k)}\,(a_S)_{gi} + (b_T^T)_{jf}\,x_{fg}^{(k)}\,(a_T)_{gi}}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F}, \qquad x_{ji}^{(k+1)} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (b_S)_{jf}\,y_{fg}^{(k)}\,(a_S^T)_{gi} + (b_T)_{jf}\,y_{fg}^{(k)}\,(a_T^T)_{gi}}{\|B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T\|_F},$$
which is equal to:
$$y_{ji}^{(k+1)} = \frac{\sum_{f=1}^{n_H}\sum_{g=1}^{n_G} (b_S)_{fj}\,x_{fg}^{(k)}\,(a_S)_{gi} + (b_T)_{fj}\,x_{fg}^{(k)}\,(a_T)_{gi}}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F}, \qquad x_{ji}^{(k+1)} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (b_S)_{jf}\,y_{fg}^{(k)}\,(a_S)_{ig} + (b_T)_{jf}\,y_{fg}^{(k)}\,(a_T)_{ig}}{\|B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T\|_F}$$
Now let $e_j$ of $G$ be a loop on vertex $v_p$ and $e_i'$ of $H$ a loop on vertex $v_q'$; the edge similarity score between $e_j$ and $e_i'$ equals:
$$y_{ij}^{(k+1)} = \frac{(b_S)_{qi}\,x_{qp}^{(k)}\,(a_S)_{pj} + (b_T)_{qi}\,x_{qp}^{(k)}\,(a_T)_{pj}}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F} \qquad (3.5)$$
$$\Leftrightarrow\; y_{ij}^{(k+1)} = \frac{2\,x_{qp}^{(k)}}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F} \qquad (3.6)$$
$$x_{qp}^{(k+1)} = \frac{\sum_{f=1}^{m_H}\sum_{g=1}^{m_G} (b_S)_{qf}\,y_{fg}^{(k)}\,(a_S)_{pg} + (b_T)_{qf}\,y_{fg}^{(k)}\,(a_T)_{pg}}{\|B_SY^{(k)}A_S^T + B_TY^{(k)}A_T^T\|_F} \qquad (3.7)$$
where we indeed see that the result of $y_{ij}$ is only based on $x_{qp}$, which will be high if both $v_p$ and $v_q'$ are heavily connected to other vertices (see (C4)). In the case of $e_m$ of $G_A$ connecting the vertices $v_p$ and $v_o$, and $e_n'$ of $G_B$ connecting the vertices $v_q', v_r'$, we get the following edge similarity score $y_{nm}$ ($G_A$ and $G_B$ are undirected):
$$y_{nm}^{(k+1)} = \frac{2\,\bigl(x_{qp}^{(k)} + x_{qo}^{(k)} + x_{rp}^{(k)} + x_{ro}^{(k)}\bigr)}{\|B_S^TX^{(k)}A_S + B_T^TX^{(k)}A_T\|_F},$$
which will normally be higher than (3.6) (keep in mind that (C4) and (C8) can interfere).
(E8) To see (C8) on undirected graphs, we first notice that isolated vertices indeed always have similarity scores equal to 0 in the method of Blondel (also in the case of directed graphs). For example, let $w_j$ of $H$, with adjacency matrix $B$, be an isolated vertex and calculate the similarity scores with the vertices of $G$; element-wise we write:
$$s_{jp}^{(k+1)} = \frac{\sum_{f=1}^m\sum_{g=1}^n b_{jf}\,s_{fg}^{(k)}\,a_{pg} + b_{fj}\,s_{fg}^{(k)}\,a_{gp}}{\|BS^{(k)}A^T + B^TS^{(k)}A\|_F}$$
But since $w_j$ is an isolated vertex, we know that all the $b_{jf}$'s and $b_{fj}$'s are equal to zero, so:
$$s_{jp}^{(k+1)} = \frac{\sum_{f=1}^m\sum_{g=1}^n 0\cdot s_{fg}^{(k)}\,a_{pg} + 0\cdot s_{fg}^{(k)}\,a_{gp}}{\|BS^{(k)}A^T + B^TS^{(k)}A\|_F} = 0
The condition about connectivity is easy to see: when a graph is not connected, there are zeros in the adjacency matrix. Each zero decreases the similarity score; if this results in a very low similarity score in the first iteration steps, it can happen that in the limit, caused by the zeros in the adjacency matrix, the similarity scores keep decreasing, finally resulting in a similarity score equal to zero.
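To close this list of explanations, here is the Matlab sketch of the compact iteration quoted in (E4), as announced there; it is a bare-bones illustration of the method of Blondel (even iterates only), not one of the appendix listings:

% Sketch of the compact similarity iteration S <- (B*S*A' + B'*S*A)/||.||_F.
% A and B are the adjacency matrices of G_A and G_B; the even iterates
% converge to the similarity matrix.
function S = blondel_similarity(A, B, TOL)
    S = ones(size(B,1), size(A,1));          % S^(0): all-ones matrix
    Sprev = S;
    for k = 1:1000                           % safety cap on the iterations
        S = B*S*A' + B'*S*A;
        S = S / norm(S, 'fro');              % Frobenius normalization
        if mod(k, 2) == 0                    % compare only the even iterates
            if max(abs(S(:) - Sprev(:))) < TOL, break; end
            Sprev = S;
        end
    end
end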
3.2 Similarity through corresponding graph representations
In this section, we explore representations of hypergraphs as (classical) graphs and use them to calculate the similarity between two hypergraphs by simply using the methods of the previous chapter. Intuitively, we want a characteristic graph representation of a hypergraph that is faithful (meaning that a graph represents only one hypergraph), because only a faithful characteristic graph preserves all the information that the structure of a hypergraph contains. When the characteristic graph of a hypergraph is not faithful, the resulting similarity methods can do a fair job under certain circumstances, but we will see that we cannot fulfill all the conditions at the same time, because some information of the original hypergraph is lost. To prove that several graph representations of hypergraphs meet certain conditions, we will often refer to the explanations (Ei) in the following way: we explain or prove that the graph representation preserves a certain characteristic of the hypergraph, and then we just use the explanations for graphs.
3.2.1 Line-graphs
General definitions and properties
Definition 3.2.1. Let H = (V,E) be a hypergraph with E 6= ∅ and E = {E1, . . . Em}. Theline-graph of H is the undirected graph often denoted by `(H) = (V ′,↔) with:
1. V ′ = E,
2. Ei ↔ Ej if and only if Ei ∩ Ej 6= ∅ and i 6= j.
It is immediately clear that line-graphs are simple graphs (see Definition 1.5.14). An important property of a hypergraph can be seen on its line-graph:
Property 3.2.2. If H = (V,E) is connected, then the line-graph `(H) is also connected.
Proof. By contraposition: if $\ell(H)$ is not connected, take $E_p, E_q$, two vertices of $\ell(H)$ that do not have a path between them. This means that there is no sequence
$$E_p = E_{k_0}, E_{k_1}, \dots, E_{k_{l-1}}, E_{k_l} = E_q$$
such that $E_{k_{i-1}} \cap E_{k_i} \neq \emptyset$ for $i \in \{1, \dots, l\}$. If $v_p \in E_p$ and $v_q \in E_q$, with $v_p, v_q$ vertices of $H$, this means that no path exists between $v_p$ and $v_q$, and therefore $H$ is not connected.
The following example shows that the characteristic line-graph of a hypergraph is not faithful:
Example 3.2.3. Let H1 be the following hypergraph:
[Figure: the hypergraph H1 with vertices v1, ..., v8 and edges E1, E2, E3.]
Let H2 be the following hypergraph:
[Figure: the hypergraph H2 with vertices v1, ..., v4 and edges E1, E2, E3.]
And let H3 be the following hypergraph (H3 is in fact also an undirected graph):
[Figure: the hypergraph H3 with vertices v1, v2 and edges E1, E2, E3.]
All of H1, H2 and H3 lead to the same line-graph G:
[Figure: the line-graph G, a path on the vertices E1, E2, E3.]
Algorithm for similarity
We want to apply Algorithm 5 to produce a similarity matrix between two hypergraphs. This will produce an edge similarity matrix, because the vertices are not represented in the line-graph representation.
To use Algorithm 5 from section 2.2, we first take a hypergraph as input and calculate theadjacency matrix of the corresponding line-graph. The Matlab implementation of this stepcan be found in Appendix A in Listing A.11. By applying this algorithm to two hypergraphs,we can use Algorithm 5.
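For concreteness, the construction can be sketched in a few lines of Matlab. We assume here that the hypergraph is given by its n x m incidence matrix IM, with IM(v, e) = 1 if vertex v belongs to edge E_e; this convention is ours, and the sketch is not Listing A.11 itself:

% Sketch: adjacency matrix of the line-graph l(H) from the incidence
% matrix IM of the hypergraph H.
function L = line_graph_adjacency(IM)
    overlap = IM' * IM;            % entry (i,j) equals |E_i intersect E_j|
    L = double(overlap > 0);       % E_i <-> E_j iff the intersection is nonempty
    L(1 : size(L,1)+1 : end) = 0;  % zero the diagonal: i ~= j is required
end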
Examples
We now give some examples of the similarity of two hypergraphs by using line-graphs.
Example 3.2.4. Let H1 be the following hypergraph:
[Figure: the hypergraph H1 with vertices v1, ..., v8 and edges E1, E2, E3.]
Let H2 be the following hypergraph:
[Figure: the hypergraph H2 with vertices v1, ..., v12 and edges E′1, E′2, E′3, E′4.]
By applying the algorithm we get the adjacency matrices $A_1$ for H1 and $A_2$ for H2:
$$A_1 = \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 0\end{pmatrix} \quad\text{and}\quad A_2 = \begin{pmatrix} 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 1\\ 1 & 1 & 1 & 0\end{pmatrix}$$
Now we use Algorithm 5 with $A_1$ and $A_2$ and get the following similarity matrix:
$$S = \begin{pmatrix} 0.2887 & 0.2887 & 0.2887\\ 0.2887 & 0.2887 & 0.2887\\ 0.2887 & 0.2887 & 0.2887\\ 0.2887 & 0.2887 & 0.2887\end{pmatrix}$$
Example 3.2.5. H1 is the same as in the previous example, but now H2 is the following hypergraph:
[Figure: the hypergraph H2 with vertices v1, ..., v12 and edges E′1, E′2, E′3.]
The similarity matrix of these two hypergraphs, computed via their line-graph representations, becomes:
$$S = \begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}$$
Example 3.2.6. H1 is the same as in the previous example, but now H2 is the following hypergraph:
[Figure: the hypergraph H2 with vertices v1, ..., v12 and edges E′1, E′2, E′3.]
The similarity matrix of these two hypergraphs, computed via their line-graph representations, becomes:
$$S = \begin{pmatrix} 0.4082 & 0.4082 & 0.4082\\ 0.4082 & 0.4082 & 0.4082\\ 0 & 0 & 0\end{pmatrix}$$
Example 3.2.7. H1 is the same as in the previous example, but now H2 is the following hypergraph:
[Figure: the hypergraph H2 with vertices v1, ..., v8 and edges E′1, E′2, E′3, E′4.]
The similarity matrix of these two hypergraphs, computed via their line-graph representations, becomes:
$$S = \begin{pmatrix} 0.4082 & 0.4082 & 0 & 0.4082\\ 0.4082 & 0.4082 & 0 & 0.4082\\ 0 & 0 & 0 & 0\end{pmatrix}$$
Interpretation
We now discuss the conditions from the introduction:
(C1) Not fulfilled: first, the method will only return an edge similarity matrix, and second, the line-graph of an undirected graph is not necessarily the undirected graph itself:
[Figure: the path graph with vertices v1, v2, v3 and edges e1, e2.]
has as line-graph:
[Figure: the graph with the two adjacent vertices e1 and e2.]
which will clearly not result in the same edge similarity scores, as no information about the vertex adjacency is saved.
(C2) Not fulfilled: the vertices of the hypergraph don’t play a role in the line-graph. Noinformation about them is saved in the line-graph representation.
(C3) Fulfilled: adding an edge to one of the two hypergraphs will result in an extra vertex in the line-graph. Adding this vertex will take the edge into account when calculating the similarity scores, and therefore we conclude on a heuristic basis that this condition is fulfilled.
(C4) Not fulfilled: in Examples 3.2.4, 3.2.5, 3.2.6 and 3.2.7 all positive similarity scores arethe same, regardless of the adjacency structure of the hypergraph.
(C5) Not fulfilled: there is no information about the vertices saved in the line-graph repre-sentation of a hypergraph.
(C6) Fulfilled: from Example 3.2.7 we see that E′1 and E′4 are structural equivalent and indeed have the same similarity scores. We will prove this.
Theorem 3.2.8. The line-graph representation of a hypergraph preserves structuralequivalent edges.
Proof. Two edges are structural equivalent in a hypergraph if they contain exactly the same vertices. This defines an equivalence relation $\sim$ on $E$, where $E_i \sim E_j$ if they are structural equivalent. Let $\overline{E_i}$ be the class of structural equivalent edges containing $E_i$, and assume that this class contains at least 2 edges. Now, from Definition 3.2.1 we see that in the line-graph representation $E_i \leftrightarrow E_j$ if and only if $E_i \cap E_j \neq \emptyset$ and $i \neq j$; thus all the edges in $\overline{E_i}$ will have exactly the same adjacency structure, making them structural equivalent as vertices in the line-graph representation too.
Because the line-graph representation preserves structural equivalent edges and we usethe method of Blondel for which we already proved in (E6) that this condition holds,the result follows.
(C7) Not fulfilled: no information on the number of vertices an edge contains is preserved by the line-graph representation.
(C8) Fulfilled: but it is also the only thing this representation tells us when used for the calculation of similarity: two edges of the two line-graphs have a positive similarity score when these edges are connected to other edges, meaning that for any $E_p, E_q$ there is a sequence
$$E_p = E_{k_0}, E_{k_1}, \dots, E_{k_{l-1}}, E_{k_l} = E_q$$
such that $E_{k_{i-1}} \cap E_{k_i} \neq \emptyset$. This follows immediately from Property 3.2.2: the connectivity of edges is indeed represented in the line-graph representation by definition.
Conclusion
The line-graph fails to satisfy many of the conditions and is therefore not a good representation to calculate similarity between two hypergraphs. Similarity between hypergraphs through line-graphs only allows us to discover groups of connected edges in both hypergraphs. The fact that the line-graph representation is a bit disappointing was also predictable, as a lot of hypergraphs share the same line-graph representation.
3.2.2 2-section of a hypergraph
General definitions and properties
We now look at another graph representation of a hypergraph. In contrast to the line-graph representation, the 2-section saves information about the vertices, which allows a more sophisticated way of saying something about the similarity between two hypergraphs.
Definition 3.2.9. The 2-section of a hypergraph H = (V,E) is the (undirected) graphdenoted by H2 = (V,↔) with:
• The same vertex set as the hypergraph,
• vi ↔ vj if and only if vi, vj ∈ Ek for some Ek ∈ E and i 6= j.
Example 3.2.10. The 2-section of the following hypergraph H is drawn on top:
[Figure: the hypergraph H with vertices v1, ..., v8 and edges E1, E2, E3, with its 2-section drawn on top.]
Also the 2-section of a hypergraph is not a faithful graph characteristic of a hypergraph,as the following example shows:
Example 3.2.11. The 2-section of the following hypergraph H′ is drawn on top and is the same as in the previous example:
[Figure: the hypergraph H′ with vertices v1, ..., v8 and edges E1, ..., E5, with its 2-section drawn on top.]
Algorithm for similarity
In Listing A.12 in Appendix A, we introduce an algorithm that takes a hypergraph as input and returns the adjacency matrix of the corresponding 2-section of the hypergraph. The resulting adjacency matrix can then be used in the node similarity method of Algorithm 5.
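Under the same incidence-matrix convention as before, a minimal Matlab sketch of the construction reads (again an illustration, not Listing A.12 itself):

% Sketch: adjacency matrix of the 2-section of a hypergraph from its
% incidence matrix IM.
function A2 = two_section_adjacency(IM)
    shared = IM * IM';              % (i,j) = number of edges containing v_i and v_j
    A2 = double(shared > 0);        % v_i <-> v_j iff they share some edge
    A2(1 : size(A2,1)+1 : end) = 0; % the 2-section is simple: no loops
end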
Examples
We now use the same examples as in the previous subsection and look at what the similarityof two hypergraphs becomes by using their 2-section.
Example 3.2.12. Take the same H1 and H2 as in Example 3.2.4. By calculating the adjacency matrices of the 2-sections, we can apply Algorithm 5, which returns the similarity matrix:
$$S = \begin{pmatrix}
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479\\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479\\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288\\
0.0147 & 0.0161 & 0.0028 & 0.0151 & 0.0161 & 0.0091 & 0.0147 & 0.0161\\
0.0790 & 0.0867 & 0.0153 & 0.0814 & 0.0867 & 0.0489 & 0.0790 & 0.0867\\
0.1610 & 0.1768 & 0.0312 & 0.1659 & 0.1768 & 0.0996 & 0.1610 & 0.1768\\
0.1610 & 0.1768 & 0.0312 & 0.1659 & 0.1768 & 0.0996 & 0.1610 & 0.1768\\
0.0890 & 0.0977 & 0.0172 & 0.0917 & 0.0977 & 0.0551 & 0.0890 & 0.0977\\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288\\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479\\
0.1347 & 0.1479 & 0.0261 & 0.1388 & 0.1479 & 0.0833 & 0.1347 & 0.1479\\
0.0263 & 0.0288 & 0.0051 & 0.0271 & 0.0288 & 0.0162 & 0.0263 & 0.0288
\end{pmatrix}$$
Example 3.2.13. Let H1 and H2 be the same as in Example 3.2.5; the similarity matrix using the 2-sections of the hypergraphs becomes:
$$S = \begin{pmatrix}
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0.1532 & 0.1682 & 0.0297 & 0.1579 & 0.1682 & 0.0948 & 0.1532 & 0.1682\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$
Example 3.2.14. Let H1 and H2 be the same as in Example 3.2.6; the similarity matrix using the 2-sections of the hypergraphs becomes:
$$S = \begin{pmatrix}
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436\\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436\\
0.1623 & 0.1782 & 0.0314 & 0.1673 & 0.1782 & 0.1004 & 0.1623 & 0.1782\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$
Example 3.2.15. Let H2 be the same as in the previous example (Example 3.2.14), but take for H1 the following hypergraph:
[Figure: the hypergraph H1 with vertices v1, ..., v8 and edges E1, E2, E3, E4.]
The similarity matrix using the 2-sections of the hypergraphs becomes the same as in the previous example:
$$S = \begin{pmatrix}
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436\\
0.0397 & 0.0436 & 0.0077 & 0.0409 & 0.0436 & 0.0246 & 0.0397 & 0.0436\\
0.1623 & 0.1782 & 0.0314 & 0.1673 & 0.1782 & 0.1004 & 0.1623 & 0.1782\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0.1493 & 0.1639 & 0.0289 & 0.1538 & 0.1639 & 0.0923 & 0.1493 & 0.1639\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$
Interpretation
(C1) Not fulfilled: an undirected graph with multiple edges between two vertices $v_p, v_q$ will 'lose' these edges in its 2-section. Also loops are not taken into consideration. Take for example the following graph G:
[Figure: the graph G with vertices v1 and v2, multiple edges e1, e2 between them, and a loop e3.]
G will have as 2-section:
[Figure: the graph with vertices v1 and v2 connected by the single edge e1.]
(C2) Fulfilled: the number of vertices is preserved in the 2-section of a hypergraph, so eachvertex is taken into account when calculating the similarity matrix. We conclude withthe same reasoning as (E2) that the condition is fulfilled.
(C3) Not fulfilled: introducing an edge in a hypergraph that connects vertices which are already connected doesn't change anything in the 2-section of the hypergraph (by definition). For instance, we see in Example 3.2.15 that we get the same similarity matrix as in Example 3.2.14, as introducing edge E4 doesn't influence the similarity scores at all, because v1, v4, v7 were already adjacent to each other in the 2-section of H1.
(C4) Fulfilled: take Example 3.2.12; we see that the largest similarity scores occur between vertices v6, v7 in H2 and vertices v2, v5 and v8 from H1. This can be explained by the fact that v6, v7 are indeed adjacent to all the vertices in E2, and E2 is the most central set of vertices in the whole hypergraph. More generally, the 2-section preserves the adjacency relations between vertices of the hypergraph by definition (namely: a vertex that is adjacent to another vertex in the hypergraph will also be adjacent to it in the 2-section). Because these relations are preserved, we can use the information in (E4) to conclude that this condition is fulfilled.
(C5) This condition does not apply to this method because we do not calculate edge similarity scores.
(C6) Fulfilled: all structural equivalent vertices have the same similarity scores in Examples 3.2.12, 3.2.13 and 3.2.14. This can be explained by the fact that structural equivalent vertices of a hypergraph are also structural equivalent vertices in its 2-section graph. We will prove this:
Theorem 3.2.16. The 2-section of a hypergraph preserves structural equivalent vertices.
Proof. The structural equivalent vertices of a hypergraph G form equivalence classes on the set of vertices V. Take an equivalence class vi with |vi| > 1 (we assume that G has at least two structural equivalent vertices); this means that all vertices in vi have exactly the same adjacency structure in the hypergraph G. So, when a vertex in vi is adjacent to a vertex vp, all vertices in vi are adjacent to vp in the hypergraph G. By the definition of the 2-section, this means that all vertices in vi will also have an edge connecting them to vp in the 2-section. So in the 2-section, all vertices in vi will be adjacent to the same vertices, making them also structural equivalent by the definition of structural equivalent vertices in graphs.
Because the 2-section preserves structural equivalent vertices and we use the method of Blondel, for which we already proved in (E6) that this condition holds, the result follows.
(C7) Not fulfilled: the 2-section doesn't save any information about the number of vertices in each edge.
(C8) Fulfilled: isolated vertices in a hypergraph are also isolated in the 2-section by definition, so the 2-section preserves connectivity. Because we already know from (E8) that this condition holds for the method of Blondel, we conclude that this condition is fulfilled.
Conclusion
The 2-section of a hypergraph is a rich structure that saves a lot more information compared to the line-graph representation. The biggest drawback of this method is that adding an edge to a hypergraph can sometimes have no effect at all. This happens when the added edge connects vertices which were already connected. This is bad, because adding an edge should always have an impact on the similarity scores, as it expresses an additional connection between vertices. As a consequence, adjacent vertices that aren't structural equivalent can still have the same similarity scores. We saw in Example 3.2.11 that the 2-section of a hypergraph is not unique, meaning that we are losing certain information about the hypergraph. In this case, we can lose information about the edges, as some edges will not be represented in the 2-section, and we also lose all information about the number of vertices contained in each edge. This means that the number of vertices contained in an edge doesn't play any role when calculating the similarity scores.
We can resolve the problems with conditions (C1) and (C3) by allowing multiple edges between vertices and loops: every edge of the hypergraph is then also represented in this extended 2-section.
3.2.3 Extended 2-section of a hypergraph
General definitions and properties
Definition 3.2.17. The extended 2-section of a hypergraph H = (V,E) is the (undirected) graph denoted by H′2 = (V,↔) with:
• the same vertex set as the hypergraph,
• for every Ei ∈ E with |Ei| > 1: vk ↔ vl for every vk, vl ∈ Ei with k ≠ l,
• for every Ei ∈ E with |Ei| = 1: vk ↔ vk for vk ∈ Ei.
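As a small illustration (our own, not one of the numbered examples): take the hypergraph with vertices v1, v2, v3 and edges E1 = {v1, v2}, E2 = {v1, v2}, E3 = {v3}. The extended 2-section then has two edges between v1 and v2 and a loop on v3, so its adjacency matrix is

0 2 0
2 0 0
0 0 1

whereas the ordinary 2-section would only record a single edge between v1 and v2 and no loop.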
Algorithm for similarity
We introduce Algorithm 10, which takes a hypergraph as input and returns the adjacency matrix of the corresponding extended 2-section of the hypergraph. A Matlab implementation can be found in Listing A.13 in Appendix A.
Data:
n: the number of vertices of hypergraph H
E: a set of subsets Ei of {1, . . . , n} that represent the edges of hypergraph H
Result:
A: the adjacency matrix of the corresponding extended 2-section
begin hypergraph to extended2section(n, E)
    A = initialize an n × n-matrix with all entries equal to 0;
    m = number of edges;
    for i : 1 to m do
        if |Ei| = 1 then
            k = the vertex in Ei;
            Akk = Akk + 1;
        else
            for every pair of distinct vertices k, l in Ei do
                Akl = Akl + 1;
                Alk = Alk + 1;
            end
        end
    end
    return A;
end
Algorithm 10: Algorithm to calculate the adjacency matrix of the extended 2-section of a hypergraph.
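A short usage sketch of the Matlab implementation in Listing A.13, applied to the small illustrative hypergraph above (the encoding of the edges as a cell array of vertex lists is assumed):

n = 3;
E = {[1 2], [1 2], [3]};   % two parallel edges and a singleton edge
A = hypergraph_to_extended2section(n, E);
% A equals [0 2 0; 2 0 0; 0 0 1]: the parallel edge and the loop are kept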
Example
We take an example that really shows the power of extended 2-sections:
Example 3.2.18. Take H1 as in Example 3.2.4 and H2:
[Figure: hypergraph H2 with vertices v1–v8 and edges E1–E4]
The extended 2-sections of H1 and H2 have the following adjacency matrices:
A =
0 1 0 1 1 0 1 1
1 0 0 1 2 1 1 2
0 0 0 1 0 0 0 0
1 1 1 0 1 0 1 1
1 2 0 1 0 1 1 2
0 1 0 0 1 0 0 1
1 1 0 1 1 0 0 1
1 2 0 1 2 1 1 0

and B =
0 1 0 2 1 0 1 1
1 0 0 1 2 1 1 2
0 0 0 1 0 0 0 0
2 1 1 0 1 0 1 1
1 2 0 1 0 1 1 2
0 1 0 0 1 0 0 1
1 1 0 1 1 0 0 1
1 2 0 1 2 1 1 0
Remember that the 'normal' 2-section for H1 and H2 would return the following adjacency matrices:
A′ =
0 1 0 1 1 0 1 1
1 0 0 1 1 1 1 1
0 0 0 1 0 0 0 0
1 1 1 0 1 0 1 1
1 1 0 1 0 1 1 1
0 1 0 0 1 0 0 1
1 1 0 1 1 0 0 1
1 1 0 1 1 1 1 0

and B′ =
0 1 0 1 1 0 1 1
1 0 0 1 1 1 1 1
0 0 0 1 0 0 0 0
1 1 1 0 1 0 1 1
1 1 0 1 0 1 1 1
0 1 0 0 1 0 0 1
1 1 0 1 1 0 0 1
1 1 0 1 1 1 1 0
The node similarity matrix with the extended 2-section is:
S =
0.1108 0.1651 0.0174 0.1132 0.1651 0.0763 0.1108 0.1651
0.1409 0.2099 0.0222 0.1438 0.2099 0.0970 0.1409 0.2099
0.0168 0.0250 0.0026 0.0171 0.0250 0.0116 0.0168 0.0250
0.1128 0.1680 0.0177 0.1151 0.1680 0.0776 0.1128 0.1680
0.1409 0.2099 0.0222 0.1438 0.2099 0.0970 0.1409 0.2099
0.0629 0.0937 0.0099 0.0643 0.0937 0.0433 0.0629 0.0937
0.0962 0.1433 0.0151 0.0982 0.1433 0.0663 0.0962 0.1433
0.1409 0.2099 0.0222 0.1438 0.2099 0.0970 0.1409 0.2099
In contrast, the node similarity matrix with the 'normal' 2-section would be:
S′ =
0.1409 0.1547 0.0273 0.1452 0.1547 0.0872 0.1409 0.1547
0.1547 0.1698 0.0299 0.1594 0.1698 0.0957 0.1547 0.1698
0.0273 0.0299 0.0053 0.0281 0.0299 0.0169 0.0273 0.0299
0.1452 0.1594 0.0281 0.1496 0.1594 0.0898 0.1452 0.1594
0.1547 0.1698 0.0299 0.1594 0.1698 0.0957 0.1547 0.1698
0.0872 0.0957 0.0169 0.0898 0.0957 0.0539 0.0872 0.0957
0.1409 0.1547 0.0273 0.1452 0.1547 0.0872 0.1409 0.1547
0.1547 0.1698 0.0299 0.1594 0.1698 0.0957 0.1547 0.1698
The most important thing to notice here is the difference in similarity scores of vertices v1, v7 of H2: in the extended 2-section these vertices have different similarity scores, which is correct, as v1 is also contained in edge E4 and therefore v1 and v7 are not structural equivalent. In contrast, the 'normal' 2-section representation does not take E4 into account, leading to the same similarity scores for v1, v7.
Example 3.2.19. Take H1 as in Example 3.2.4 and H2 as:
[Figure: hypergraph H2 with vertices v1–v4 and edges E1–E4]
The similarity matrix becomes:
S =
0.1566 0.2332 0.0246 0.1599 0.2332 0.1078 0.1566 0.2332
0.1566 0.2332 0.0246 0.1599 0.2332 0.1078 0.1566 0.2332
0.1566 0.2332 0.0246 0.1599 0.2332 0.1078 0.1566 0.2332
0.1566 0.2332 0.0246 0.1599 0.2332 0.1078 0.1566 0.2332
This is a beautiful example showing that sometimes (for simple hypergraphs) automorphic equivalent vertices can share the same similarity scores (see Remark 3.1.4).
Interpretation
(C1) Fulfilled: it is trivial to see that, by definition of the extended 2-section, an undirected graph will have exactly the same vertices and exactly the same edges in its extended 2-section.
(C2) Fulfilled: the number of vertices is preserved in the extended 2-section of a hypergraph, so each vertex is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E2) that the condition is fulfilled (as for the 'normal' 2-section).
(C3) Fulfilled: the number of edges is preserved in the extended 2-section of a hypergraph, so each edge is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E3) that the condition is fulfilled.
(C4) Fulfilled: the extended 2-section preserves the adjacency relations between vertices of the hypergraph by definition (namely: a vertex that is adjacent to another vertex in the hypergraph will also be adjacent to it in the extended 2-section). By preserving these relations, we can use the information in (E4) to conclude that this condition is fulfilled (as for the 'normal' 2-section).
(C5) This condition does not apply to this method because we do not calculate edge similarity scores.
(C6) Fulfilled, because:
Theorem 3.2.20. The extended 2-section of a hypergraph preserves structural equivalent vertices.
Proof. The structural equivalent vertices of a hypergraph G form equivalence classes on the set of vertices V. Take an equivalence class vi with |vi| > 1 (we assume that G has at least two structural equivalent vertices); this means that all vertices in vi have exactly the same adjacency structure in the hypergraph G. So, when a vertex in vi is adjacent to a vertex vp with k edges that connect both vertices (the size of these edges doesn't matter), all vertices in vi are adjacent with k edges to vp in the hypergraph G. By the definition of the extended 2-section, this means that all vertices in vi will also have k edges connecting them to vp in the extended 2-section. So in the extended 2-section, all vertices in vi will be adjacent to the same vertices, each connected with the same number of edges to these vertices, making them also structural equivalent in the extended 2-section by the definition of structural equivalent vertices in graphs.
Because the extended 2-section preserves structural equivalent vertices and we use the method of Blondel, for which we already proved in (E6) that this condition holds, the result follows (as for the 'normal' 2-section).
(C7) Not fulfilled: the extended 2-section doesn't save any information about the number of vertices contained in an edge.
(C8) Fulfilled: isolated vertices in a hypergraph are also isolated in the extended 2-section by definition, so the extended 2-section preserves connectivity. Because we already know from (E8) that this condition holds for the method of Blondel, we conclude that this condition is fulfilled (as for the 'normal' 2-section).
Conclusion
The extended 2-section solves all the issues with (C1) and (C3), as it preserves all the information about the number of edges connecting vertices and the adjacency of the vertices. Still, this representation is not unique, as no information about the cardinality of the edges is saved; therefore the method fails (C7). The method thus performs much better than the normal 2-section, but is unreliable in detecting differences between edges. Still, if one is willing to accept this limitation, for example in an application that is not focused on detecting differences in the number of vertices in the edges, the extended 2-section can be an option. Also note that it is impossible to calculate edge similarity scores with the extended 2-section: an edge of a hypergraph is translated into multiple vertices in the 2-section, so it would be impossible to satisfy (C5) and (C7).
3.2.4 The incidence graph of a hypergraph
General definitions and properties
Definition 3.2.21. Let H = (V,E) be a hypergraph, then the incidence graph Gi of H is the undirected graph with:
1. V′ = V ∪ E,
2. ∀vi ∈ V, ∀Ej ∈ E: vi ↔ Ej if vi ∈ Ej.
Because all edges in Gi are between an element of V and an element of E, Gi is a bipartite graph and we write Gi = ((V,E),↔).
Example 3.2.22. Take the same hypergraph H as in Example 3.2.10, then the incidence graph Gi equals:
[Figure: the bipartite incidence graph Gi, with vertex part v1–v8 and edge part E1, E2, E3]
We can prove that the incidence graph Gi = (W = (V,E),↔) of a hypergraph is a faithful graph representation if we know its bipartite structure.
Theorem 3.2.23. The incidence graph Gi = (W = (V,E),↔) of a hypergraph is a faithful representation: the incidence graph represents only one hypergraph under the condition that we know the bipartite structure, which means that for the vertex set W, we know which disjoint subset represents the vertices V of the hypergraph, respectively the edges E of the hypergraph.
Proof. Suppose that G and H are two hypergraphs that share the same incidence graph Gi = ((V,E),↔). Then G and H have the same set of vertices V and the same set of edges E. Also the adjacency relations in G and H are the same by the edge set ↔ of Gi. We conclude that G = H.
Algorithm for similarity
In Listing A.14 in Appendix A, we introduce an algorithm that takes a hypergraph as input and returns the adjacency matrix of the corresponding incidence graph. The resulting adjacency matrix can then be used in the node similarity method from Algorithm 5.
An important note has to be made here: this algorithm will return both the node and edge similarity scores in the same matrix. Similarity scores comparing an edge to a node (and vice versa) will also be present. Since we cannot give a correct meaning to such similarity scores, we regard them as redundant, intermediate results. In the examples, we will always draw lines in order to clearly separate the node similarity submatrix and the edge similarity submatrix.
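Putting the pieces together, a hedged Matlab sketch of this method could look as follows; the edge sets below are an illustrative encoding of a hypergraph with 8 vertices and 3 edges, and the helper functions are those of Listings A.2 and A.14:

n = 8;
E = {[3 4], [1 2 4 5 7 8], [2 5 6 8]};    % illustrative hyperedges
A = hypergraph_to_incidencegraph(n, E);   % 11 x 11 adjacency matrix
S = similarity_matrix(A, A, 1e-8);        % comparing H with itself
% rows/columns 1..8 of S compare vertices, 9..11 compare edges;
% the mixed vertex-edge entries are the redundant intermediate results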
Examples
Example 3.2.24. Take H1 and H2 as in Example 3.2.4; the similarity matrix with the incidence graph representation becomes:
0.0631 0.1075 0.0101 0.0732 0.1075 0.0444 0.0631 0.1075 | 0.0170 0.1065 0.0748
0.0631 0.1075 0.0101 0.0732 0.1075 0.0444 0.0631 0.1075 | 0.0170 0.1065 0.0748
0.0129 0.0220 0.0021 0.0150 0.0220 0.0091 0.0129 0.0220 | 0.0035 0.0217 0.0153
0.0081 0.0138 0.0013 0.0094 0.0138 0.0057 0.0081 0.0138 | 0.0022 0.0137 0.0096
0.0517 0.0880 0.0082 0.0599 0.0880 0.0363 0.0517 0.0880 | 0.0139 0.0871 0.0612
0.1067 0.1817 0.0170 0.1237 0.1817 0.0750 0.1067 0.1817 | 0.0287 0.1799 0.1265
0.1067 0.1817 0.0170 0.1237 0.1817 0.0750 0.1067 0.1817 | 0.0287 0.1799 0.1265
0.0565 0.0961 0.0090 0.0655 0.0961 0.0397 0.0565 0.0961 | 0.0152 0.0952 0.0669
0.0129 0.0220 0.0021 0.0150 0.0220 0.0091 0.0129 0.0220 | 0.0035 0.0217 0.0153
0.0631 0.1075 0.0101 0.0732 0.1075 0.0444 0.0631 0.1075 | 0.0170 0.1065 0.0748
0.0631 0.1075 0.0101 0.0732 0.1075 0.0444 0.0631 0.1075 | 0.0170 0.1065 0.0748
0.0129 0.0220 0.0021 0.0150 0.0220 0.0091 0.0129 0.0220 | 0.0035 0.0217 0.0153
-------------------------------------------------------------------------------
0.0123 0.0209 0.0020 0.0143 0.0209 0.0086 0.0123 0.0209 | 0.0033 0.0207 0.0146
0.0958 0.1632 0.0153 0.1111 0.1632 0.0674 0.0958 0.1632 | 0.0258 0.1616 0.1136
0.0196 0.0333 0.0031 0.0227 0.0333 0.0138 0.0196 0.0333 | 0.0053 0.0330 0.0232
0.0661 0.1126 0.0106 0.0767 0.1126 0.0465 0.0661 0.1126 | 0.0178 0.1115 0.0784
Example 3.2.25. Take H1 and H2 as in Example 3.2.5; the similarity matrix with the incidence graph representation becomes:
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
-------------------------------------------------------------------------------
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0920 0.1566 0.0147 0.1067 0.1566 0.0647 0.0920 0.1566 | 0.0247 0.1551 0.1090
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
Example 3.2.26. Take H1 and H2 as in Example 3.2.6; the similarity matrix with the incidence graph representation becomes:
0.0839 0.1428 0.0134 0.0972 0.1428 0.0589 0.0839 0.1428 | 0.0226 0.1414 0.0994
0.0839 0.1428 0.0134 0.0972 0.1428 0.0589 0.0839 0.1428 | 0.0226 0.1414 0.0994
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0254 0.0432 0.0041 0.0294 0.0432 0.0178 0.0254 0.0432 | 0.0068 0.0428 0.0301
0.0254 0.0432 0.0041 0.0294 0.0432 0.0178 0.0254 0.0432 | 0.0068 0.0428 0.0301
0.1092 0.1860 0.0174 0.1267 0.1860 0.0768 0.1092 0.1860 | 0.0294 0.1842 0.1295
0.0839 0.1428 0.0134 0.0972 0.1428 0.0589 0.0839 0.1428 | 0.0226 0.1414 0.0994
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
0.0839 0.1428 0.0134 0.0972 0.1428 0.0589 0.0839 0.1428 | 0.0226 0.1414 0.0994
0.0839 0.1428 0.0134 0.0972 0.1428 0.0589 0.0839 0.1428 | 0.0226 0.1414 0.0994
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
-------------------------------------------------------------------------------
0.0302 0.0514 0.0048 0.0350 0.0514 0.0212 0.0302 0.0514 | 0.0081 0.0509 0.0358
0.0997 0.1697 0.0159 0.1156 0.1697 0.0701 0.0997 0.1697 | 0.0268 0.1681 0.1181
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 | 0.0000 0.0000 0.0000
Example 3.2.27. Take H1 and H2 as in Example 3.2.7; the similarity matrix with the incidence graph representation becomes:
0.0565 0.0961 0.0090 0.0655 0.0961 0.0397 0.0565 0.0961 | 0.0152 0.0952 0.0669
0.0944 0.1608 0.0151 0.1095 0.1608 0.0664 0.0944 0.1608 | 0.0254 0.1592 0.1119
0.0253 0.0431 0.0040 0.0293 0.0431 0.0178 0.0253 0.0431 | 0.0068 0.0427 0.0300
0.0818 0.1392 0.0130 0.0948 0.1392 0.0575 0.0818 0.1392 | 0.0220 0.1379 0.0969
0.0944 0.1608 0.0151 0.1095 0.1608 0.0664 0.0944 0.1608 | 0.0254 0.1592 0.1119
0.0379 0.0646 0.0061 0.0440 0.0646 0.0267 0.0379 0.0646 | 0.0102 0.0640 0.0450
0.0565 0.0961 0.0090 0.0655 0.0961 0.0397 0.0565 0.0961 | 0.0152 0.0952 0.0669
0.0944 0.1608 0.0151 0.1095 0.1608 0.0664 0.0944 0.1608 | 0.0254 0.1592 0.1119
-------------------------------------------------------------------------------
0.0237 0.0403 0.0038 0.0275 0.0403 0.0166 0.0237 0.0403 | 0.0064 0.0399 0.0281
0.1057 0.1800 0.0169 0.1226 0.1800 0.0743 0.1057 0.1800 | 0.0284 0.1783 0.1253
0.0710 0.1210 0.0113 0.0824 0.1210 0.0499 0.0710 0.1210 | 0.0191 0.1198 0.0842
0.0237 0.0403 0.0038 0.0275 0.0403 0.0166 0.0237 0.0403 | 0.0064 0.0399 0.0281
Example 3.2.28. Take H1 and H2 as in Example 3.2.19; the similarity matrix becomes:
0.1034 0.1761 0.0165 0.1199 0.1761 0.0727 0.1034 0.1761 | 0.0278 0.1744 0.1226
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
-------------------------------------------------------------------------------
0.1034 0.1761 0.0165 0.1199 0.1761 0.0727 0.1034 0.1761 | 0.0278 0.1744 0.1226
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
0.0794 0.1352 0.0127 0.0921 0.1352 0.0558 0.0794 0.1352 | 0.0214 0.1339 0.0941
Interpretation
(C1) Fulfilled: we will prove later in Corollary 3.3.7 that using this method with two undirected graphs returns the same results as the node-edge similarity matrix of Section 2.3.
(C2) Fulfilled: the number of vertices is preserved in the incidence graph of a hypergraph, so each vertex is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E2) that the condition is fulfilled.
(C3) Fulfilled: all edges are represented in the incidence graph of the hypergraph, so each edge is taken into account when calculating the similarity matrix. We conclude with the same reasoning as (E3) that the condition is fulfilled.
(C4) Fulfilled: the adjacency relations of the original hypergraph are represented in its incidence graph, and we use Algorithm 5, for which this statement already holds by (E4). An example can be found in Example 3.2.24, where v2, v5, v8 of H1 have the largest similarity scores with v6, v7 of H2. This is not surprising, as v6, v7 are contained in all possible edges.
(C5) Fulfilled: the adjacency relations of the original hypergraph are represented in its incidence graph, the edges are considered as normal vertices, and we use Algorithm 5, for which this statement already holds by (E4). An example can be found in Example 3.2.24, where E2 of H1 has the largest similarity score with E′2 of H2. This is not surprising, as these edges are almost equal.
(C6) Fulfilled. An example of structural equivalent edges can be found in Example 3.2.27, where E′1 and E′4 of H2 have the same similarity scores. In the same example, the vertices v1, v7 of H2 have the same similarity scores, as do v2, v5, v8. We now prove that the incidence graph preserves the structural equivalence of edges and vertices:
Theorem 3.2.29. The incidence graph of a hypergraph preserves structural equivalent vertices.
Proof. The structural equivalent vertices of a hypergraph G form equivalence classes on the set of vertices V. Take an equivalence class vi with |vi| > 1 (we assume that G has at least two structural equivalent vertices); this means that all vertices in vi have exactly the same adjacency structure in the hypergraph G. So, when a vertex in vi is adjacent to a vertex vp via an edge Ek, all vertices in vi are adjacent via the same edge Ek to vp in the hypergraph G. By the definition of the incidence graph, this means that all vertices in vi will be connected to the edge Ek connecting them to vp in the incidence graph. So in the incidence graph, all vertices in vi will be adjacent to the same vertices and the same edges, making them also structural equivalent by the definition of structural equivalent vertices in graphs. With the same reasoning, we conclude that structural equivalent edges are preserved as well.
(C7) Fulfilled: the adjacency relations in the incidence graph are determined by the number of vertices contained in each edge of the hypergraph.
(C8) Fulfilled. An example is Example 3.2.6, where the vertices v3, v8, v9 and v12 all have similarity scores equal to zero, as they form a clique that is not connected to the other vertices. The incidence graph preserves the adjacency relations and thus connectivity, so we know from (E8) that this condition holds.
Conclusion
We conclude that using the incidence graph returns similarity scores that satisfy all the conditions, so it can be seen as a very good way to calculate the similarity scores of a hypergraph. By Theorem 3.2.23 this is not so surprising: the incidence graph is a faithful representation. Intuitively, this means that this representation saves all information about the represented hypergraph, so that it is possible to reconstruct the hypergraph from its incidence graph. As a result, no information about the hypergraph is lost when using Algorithm 5, so it is possible to satisfy all the conditions.
3.3 Similarity by using the incidence matrix
We already showed that using the incidence graph representation of a hypergraph is a very good way to calculate similarity between hypergraphs. It would be somewhat logical to stop searching for similarity methods for hypergraphs now, but there is one very natural generalization that is worth mentioning: the node-edge similarity method described in Section 2.3 uses a source-edge matrix and a terminal-edge matrix. When we use this method (see Theorem 2.3.6) with undirected graphs G and H, the source-edge matrix and terminal-edge matrix are the same and are equal to the incidence matrices of G and H. The question is now: is there any problem if we enter not the incidence matrices of two graphs, but of two hypergraphs (see Definition 1.6.10)? The only difference is that each column can have more than 2 entries equal to 1. The answer is no.
Even better, we will be able to prove that this method is equal, up to a constant, to the method with the incidence graph! By this 'concluding theorem', we know that this method also meets all conditions imposed in the introduction.
We first prove the compact form of the method:
3.3.1 Compact form
Theorem 3.3.1. Let G = (V,E) and H = (V′,E′) be two hypergraphs, where G has nG vertices and mG edges and H has nH vertices and mH edges. Let A and B be the incidence matrices of G and H and define:

Y^(k+1) = B^T X^(k) A / ‖B^T X^(k) A‖_F    (3.8)
X^(k+1) = B Y^(k) A^T / ‖B Y^(k) A^T‖_F    (3.9)

for k = 0, 1, . . .. Then the matrix subsequences X^(2k), Y^(2k) and X^(2k+1), Y^(2k+1) converge to Xeven, Yeven and Xodd, Yodd. If we take:

X^(0) = J ∈ R^(nH × nG)
Y^(0) = J ∈ R^(mH × mG)

as initial matrices, then Xeven(J) = Xodd(J) and Yeven(J) = Yodd(J) are the unique matrices of largest 1-norm among all possible limits with positive initial matrices, and the matrix sequence X^(k), Y^(k) converges as a whole.
Proof. The only thing we have to prove is that we can construct a matrix M that is nonnegative and symmetric; the rest of the proof is completely analogous to the proof of Theorem 2.3.6.
So, by Theorem 2.2.14 we can rewrite (2.16) as follows:

Y′^(k+1) = B^T X′^(k) A
⇔ vec(Y′^(k+1)) = vec(B^T X′^(k) A)
⇔ vec(Y′^(k+1)) = (A^T ⊗ B^T) vec(X′^(k))
Completely analogously we can also rewrite (2.17):

vec(X′^(k+1)) = (A ⊗ B) vec(Y′^(k)).

Define y^(k) = vec(Y′^(k)) and x^(k) = vec(X′^(k)); we get:

y^(k+1) = (A^T ⊗ B^T) x^(k)
x^(k+1) = (A ⊗ B) y^(k)

If we define G = A^T ⊗ B^T, then with Lemma 2.2.13:

G^T = (A^T ⊗ B^T)^T = A ⊗ B

So we get:

y^(k+1) = G x^(k)      (3.10)
x^(k+1) = G^T y^(k)    (3.11)
G is an mGmH × nGnH-matrix. The previous expressions can be concatenated into a single matrix update equation (we define the matrix M and z^(k+1)):

z^(k+1) = (x ; y)^(k+1) = [ 0_(nGnH)  G^T ; G  0_(mGmH) ] (x ; y)^(k) = M z^(k),

where (x ; y) denotes the stacked vector and the subscripts give the sizes of the square zero blocks. M is clearly nonnegative because G and G^T are nonnegative, and M is also clearly symmetric, so the result follows immediately from Theorem 2.2.10. The rest of the proof is now completely analogous to the proof of Theorem 2.3.6.
We now define Xeven(J) as the node similarity matrix and Yeven(J) as the edge similarity matrix.
Note that the equations in compact form in the node-edge similarity method were defined as:

Y^(k+1) = (B_S^T X^(k) A_S + B_T^T X^(k) A_T) / ‖B_S^T X^(k) A_S + B_T^T X^(k) A_T‖_F
X^(k+1) = (B_S Y^(k) A_S^T + B_T Y^(k) A_T^T) / ‖B_S Y^(k) A_S^T + B_T Y^(k) A_T^T‖_F

But since A_S = A_T = A and B_S = B_T = B in this case, the second term of the sum is redundant and is eliminated by the normalization in each step. This shows that the compact form as presented in the previous theorem is not different from the compact form of Theorem 2.3.6.
3.3.2 The algorithm
The algorithm is completely analogous to Algorithm 6. A Matlab implementation can be found in Listing A.16 in Appendix A. We also present Algorithm 12, an algorithm that takes a hypergraph as input and returns the incidence matrix of the hypergraph. A Matlab implementation of this algorithm can be found in Listing A.15.
Data:
A: the nG × mG incidence matrix of a hypergraph G
B: the nH × mH incidence matrix of a hypergraph H
TOL: tolerance for the estimation error.
Result:
X: the node similarity matrix between G and H
Y: the edge similarity matrix between G and H
begin node edge similarity matrix hypergraphs(A, B, TOL)
    k = 1;
    X(0) = 1 (nH × nG-matrix with all entries equal to 1);
    Y(0) = 1 (mH × mG-matrix with all entries equal to 1);
    µX = nH × nG-matrix with all entries equal to TOL;
    µY = mH × mG-matrix with all entries equal to TOL;
    repeat
        Y(k) = B^T X(k−1) A / ‖B^T X(k−1) A‖_F;
        X(k) = B Y(k) A^T / ‖B Y(k) A^T‖_F;
        k = k + 1;
    until |X(k) − X(k−1)| < µX and |Y(k) − Y(k−1)| < µY;
    return X(k), Y(k);
end
Algorithm 11: Algorithm for calculating the node and edge similarity matrices X and Y between G and H.
Data:
n: the number of vertices of hypergraph H
E: a set of subsets Ei of {1, . . . , n} that represent the edges of hypergraph H
Result:
A: the incidence matrix of the hypergraph
begin hypergraph to incidencematrix(n, E)
    m = |E|;
    A = initialize an n × m-matrix with all entries equal to 0;
    for Ej ∈ E do
        for vi ∈ Ej do
            Aij = 1;
        end
    end
    return A;
end
Algorithm 12: Algorithm to calculate the incidence matrix of a hypergraph.
3.3.3 Examples
We now calculate the same examples as in the previous section:
Example 3.3.2. Let H1, H2 be the same hypergraphs as in Example 3.2.4; we get as incidence matrix A of H1 and B of H2:

A =
0 1 0
0 1 1
1 0 0
1 1 0
0 1 1
0 0 1
0 1 0
0 1 1

and B =
0 1 0 0
0 1 0 0
0 0 1 0
1 0 0 0
1 0 0 1
0 1 0 1
0 1 0 1
0 0 1 1
0 0 1 0
0 1 0 0
0 1 0 0
0 0 1 0
which can be used to apply Algorithm 11, resulting in the node similarity matrix X and the edge similarity matrix Y:
X =
0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291
0.0108 0.0183 0.0017 0.0125 0.0183 0.0076 0.0108 0.0183
0.0686 0.1168 0.0109 0.0796 0.1168 0.0482 0.0686 0.1168
0.1417 0.2413 0.0226 0.1643 0.2413 0.0996 0.1417 0.2413
0.1417 0.2413 0.0226 0.1643 0.2413 0.0996 0.1417 0.2413
0.0750 0.1277 0.0120 0.0869 0.1277 0.0527 0.0750 0.1277
0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291
0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
0.0838 0.1428 0.0134 0.0972 0.1428 0.0589 0.0838 0.1428
0.0171 0.0291 0.0027 0.0198 0.0291 0.0120 0.0171 0.0291

Y =
0.0134 0.0840 0.0590
0.1045 0.6549 0.4603
0.0213 0.1337 0.0940
0.0721 0.4519 0.3177
Example 3.3.3. Take H1 and H2 as in Example 3.2.5; the node similarity matrix X and the edge similarity matrix Y become:

X =
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.1152 0.1961 0.0184 0.1336 0.1961 0.0810 0.1152 0.1961
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

Y =
0.0000 0.0000 0.0000
0.1294 0.8112 0.5702
0.0000 0.0000 0.0000
Example 3.3.4. Take H1 and H2 as in Example 3.2.6; the node similarity matrix X and the edge similarity matrix Y become:

X =
0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0326 0.0555 0.0052 0.0378 0.0555 0.0229 0.0326 0.0555
0.0326 0.0555 0.0052 0.0378 0.0555 0.0229 0.0326 0.0555
0.1401 0.2386 0.0224 0.1625 0.2386 0.0985 0.1401 0.2386
0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
0.1076 0.1832 0.0172 0.1247 0.1832 0.0756 0.1076 0.1832
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

Y =
0.0375 0.2351 0.1652
0.1239 0.7764 0.5457
0.0000 0.0000 0.0000
Example 3.3.5. Take H1 and H2 as in Example 3.2.7; the node similarity matrix X and the edge similarity matrix Y become:

X =
0.0778 0.1326 0.0124 0.0903 0.1326 0.0547 0.0778 0.1326
0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216
0.0349 0.0594 0.0056 0.0404 0.0594 0.0245 0.0349 0.0594
0.1127 0.1919 0.0180 0.1307 0.1919 0.0792 0.1127 0.1919
0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216
0.0523 0.0891 0.0083 0.0607 0.0891 0.0368 0.0523 0.0891
0.0778 0.1326 0.0124 0.0903 0.1326 0.0547 0.0778 0.1326
0.1302 0.2216 0.0208 0.1509 0.2216 0.0915 0.1302 0.2216

Y =
0.0233 0.1459 0.1025
0.1039 0.6512 0.4577
0.0698 0.4376 0.3076
0.0233 0.1459 0.1025
3.3.4 Concluding theorem
Before heading to an interpretation of the examples and the method, we now present a 'concluding theorem' that shows that the similarity method using the incidence matrix is completely equal to the method with the incidence graph of the hypergraph. This gives us an immediate guarantee that all conditions of the introduction are met for the method with the incidence matrix.
Theorem 3.3.6. The similarity method using the incidence graphs of two hypergraphs G, H returns the same results as the method with the incidence matrices of G, H, up to a constant, so:

X^(k) = α · S^(k)[1, . . . , nH ; 1, . . . , nG]
Y^(k) = β · S^(k)[nH+1, . . . , nH+mH ; nG+1, . . . , nG+mG]
with X, Y the node and edge similarity matrices of the method with the incidence matrices of H, G; S the similarity matrix of the method of Blondel used with the incidence graphs of H, G; α, β ∈ R; and k even. S[R ; C] denotes the submatrix of S with rows R and columns C.
Proof. To calculate the similarity matrix using the incidence graph representation with the method of Blondel, an element s_ij at iteration step k is calculated as follows (remember that A, B are symmetric):

s_ij^(k+1) = ( Σ_{f=1}^{nH+mH} Σ_{g=1}^{nG+mG} b_if s_fg^(k) a_gj ) / ‖B S^(k) A^T‖_F    (3.12)
Now, remember that the incidence graph is a bipartite graph; thus the adjacency matrices A and B have the following property (remember our convention to order the rows and columns so that the vertices come first, followed by the edges):

b_if = 0 when 1 ≤ i, f ≤ nH or nH < i, f ≤ nH + mH    (3.13)
a_gj = 0 when 1 ≤ g, j ≤ nG or nG < g, j ≤ nG + mG    (3.14)
We are only interested in the case where s_ij is a comparison between nodes or between edges, as the method with the incidence matrix also only calculates these, and it is rather hard to give a correct meaning to a comparison between an edge and a node.
For the rest of the proof we only consider the calculation of node similarity; the calculation of edge similarity is completely analogous. In this case we have 1 ≤ i ≤ nH and 1 ≤ j ≤ nG, and (3.12) becomes, with property (3.13):

s_ij^(k+1) = ( Σ_{f=nH+1}^{nH+mH} Σ_{g=nG+1}^{nG+mG} b_if s_fg^(k) a_gj ) / ‖B S^(k) A^T‖_F    (3.15)
So in the case of calculating similarities between nodes, only the scores s_fg^(k) between edges play a role!
Now, remember the compact form of the method using the incidence matrices (B′ is the incidence matrix of H, A′ of G; the rows of an incidence matrix represent the vertices, the columns the edges):

X^(k+1) = B′ Y^(k) A′^T / ‖B′ Y^(k) A′^T‖_F    (3.16)
Y^(k+1) = B′^T X^(k) A′ / ‖B′^T X^(k) A′‖_F    (3.17)
Elementwise, we write:

x_ij^(k+1) = ( Σ_{p=1}^{mH} Σ_{q=1}^{mG} b′_ip y_pq^(k) a′_jq ) / ‖B′ Y^(k) A′^T‖_F    (3.18)
y_cd^(k+1) = ( Σ_{h=1}^{nH} Σ_{e=1}^{nG} b′_hc x_he^(k) a′_ed ) / ‖B′^T X^(k) A′‖_F    (3.19)
Now every non-zero element of the adjacency matrices A, B can also be found in the incidence matrices A′, B′ (we assume that the vertices and edges have the same numbering in both matrices); for A:

a_ij = a′_{i, j−nG}  if 1 ≤ i ≤ nG and nG < j ≤ nG + mG
a_ij = a′_{j, i−nG}  if nG < i ≤ nG + mG and 1 ≤ j ≤ nG

and analogously for B with nH in place of nG.
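In matrix form, this relation says that the adjacency matrix of the incidence graph is the symmetric block matrix with the incidence matrix in its off-diagonal blocks. A quick hedged check with the functions of Listings A.14 and A.15 (the hyperedge encoding is our own illustration):

n = 8; E = {[3 4], [1 2 4 5 7 8], [2 5 6 8]};   % illustrative hyperedges
m = numel(E);
A_inc = hypergraph_to_incidencematrix(n, E);    % n x m
A_graph = hypergraph_to_incidencegraph(n, E);   % (n+m) x (n+m)
isequal(A_graph, [zeros(n,n) A_inc; transpose(A_inc) zeros(m,m)])   % returns 1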
So we rewrite (3.15) and get (remember that 1 ≤ i ≤ nH and 1 ≤ j ≤ nG):

s_ij^(k+1) = ( Σ_{f=nH+1}^{nH+mH} Σ_{g=nG+1}^{nG+mG} b′_{i,f−nH} s_fg^(k) a′_{j,g−nG} ) / ‖B S^(k) A^T‖_F    (3.20)
           = ( Σ_{f=1}^{mH} Σ_{g=1}^{mG} b′_{i,f} s_{f+nH, g+nG}^(k) a′_{j,g} ) / ‖B S^(k) A^T‖_F    (3.21)
The equivalence of (3.21) and (3.18) is now clear: (3.21) depends on the edge similarity scores (to calculate the node similarity scores), and every edge similarity score is multiplied with exactly the same entries of the incidence matrix. The result follows by proving that the edge similarity scores y_cd are equal to the s_ij with nH < i ≤ nH + mH and nG < j ≤ nG + mG, which is possible with the exact same reasoning. The difference up to a constant is caused by the normalization in each iteration. We conclude that the theorem is true for each even k.
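The theorem can also be checked numerically; a hedged sketch with the functions of Appendix A (the hypergraphs are the illustrative encodings used before, and the entrywise ratios should be constant wherever the scores are non-zero):

nG = 8;  EG = {[3 4], [1 2 4 5 7 8], [2 5 6 8]};
nH = 12; EH = {[4 5], [1 2 6 7 10 11], [3 8 9 12], [5 6 7 8]};
S = similarity_matrix(hypergraph_to_incidencegraph(nG, EG), ...
                      hypergraph_to_incidencegraph(nH, EH), 1e-10);
[X, Y] = node_edge_similarity_matrix_hypergraphs( ...
             hypergraph_to_incidencematrix(nG, EG), ...
             hypergraph_to_incidencematrix(nH, EH), 1e-10);
alpha = S(1:nH, 1:nG)./X;            % constant where X is non-zero
beta  = S(nH+1:end, nG+1:end)./Y;    % constant where Y is non-zero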
Corollary 3.3.7. The node-edge similarity method (see Section 2.3) returns the same results for undirected graphs G, H as the node similarity method of Blondel (see Section 2.1) using the incidence graphs of G, H, up to a constant.
Proof. This follows immediately from the fact that in the node-edge similarity method for undirected graphs, the source-edge matrix and terminal-edge matrix are the same and are both equal to the incidence matrix for undirected graphs (see Definition 1.5.27).
Remark 3.3.8. One may wonder whether the node-edge similarity method of Section 2.3 also returns the same results for directed graphs G, H as the node similarity method of Blondel using the incidence graphs of G, H. This is the case, but it is a little harder to prove. First, we use the following definition for the incidence graph of a directed graph:
Definition 3.3.9. Let G = (V,→) be a directed graph, then the incidence graph G′i = (V′,→′) of G is the directed graph with:
1. V′ = V ∪ →,
2. ∀vi ∈ V, ∀ej ∈ →: vi → ej if vi is the source node of ej,
3. ∀vi ∈ V, ∀ej ∈ →: ej → vi if vi is the terminal node of ej.
Starting from the node-edge similarity method described in Section 2.3, we notice that this method uses source-edge and terminal-edge matrices. In the undirected case, the source-edge and terminal-edge matrices of a graph are both equal to the incidence matrix of the graph, but this is not true in the directed case. Indeed, both the source-edge and the terminal-edge matrix have nonnegative entries, and therefore the node-edge similarity method returns nonnegative (node or edge) similarity scores, while the incidence matrix for directed graphs also contains negative entries (see Definition 1.5.27). Still, we can prove that using the method of Blondel with the adjacency matrices A, B of the incidence graphs of the directed graphs G, H returns the same results (up to a constant) as the node-edge similarity method with the source-edge and terminal-edge matrices of G, H. The proof follows exactly the same reasoning as the proof of Theorem 3.3.6, but will be more extensive, as one has to consider the equalities with both the source-edge matrix and the terminal-edge matrix.
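As a minimal sketch of Definition 3.3.9 in matrix form (our own hypothetical helper, assuming AS and AT are the source-edge and terminal-edge matrices of a directed graph with n vertices and m edges):

function [A] = directed_incidencegraph(AS, AT)
    [n, m] = size(AS);
    % vertex -> edge arcs come from the source-edge matrix,
    % edge -> vertex arcs from the terminal-edge matrix
    A = [zeros(n,n) AS; transpose(AT) zeros(m,m)];
end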
3.3.5 Interpretation
By the concluding theorem, we know that the similarity method for hypergraphs using the incidence matrix is the same (up to a constant) as the method with the incidence graph representation of hypergraphs. Since the latter method meets all the conditions, this method satisfies all the requirements too.
Appendix A
Listings
Listing A.1: The MatLab code for the power method described in Algorithm 1.
function [y, mu] = power_method(A, y, TOL)
    k = 1;
    mu = 0;
    while true
        z = A*y;
        mu_previous = mu;
        mu = norm(z,2);
        y_previous = y;
        y = z/mu;
        k = k + 1;
        if and(k > 2, abs(mu - mu_previous) < TOL)
            break;
        end
    end
    % recover the sign of the eigenvalue from the sign flip of the iterates
    if (y(1) == -y_previous(1))
        mu = -mu;
    end
    return;
end
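A short usage sketch (illustrative matrix, not from the text):

A = [2 1; 1 3];
[y, mu] = power_method(A, [1; 1], 1e-10);
% mu approximates the dominant eigenvalue (about 3.6180 here),
% y an associated eigenvector of unit 2-norm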
Listing A.2: The MatLab code for Algorithm 5.
function [Z] = similarity_matrix(A, B, TOL)
    Z_0 = ones(size(B,1), size(A,1));
    mu(1:size(B,1), 1:size(A,1)) = TOL;
    Z = Z_0;
    Z_previouseven = Z_0;
    k = 1;
    while true
        % one step of the Blondel iteration, normalized in Frobenius norm
        X = B*Z*transpose(A) + transpose(B)*Z*A;
        Z = X/norm(X, 'fro');
        % convergence is only checked on the even iterates
        if mod(k,2) == 0
            difference = abs(Z - Z_previouseven);
            Z_previouseven = Z;
            if (difference < mu)
                break;
            end
        end
        k = k + 1;
    end
    return;
end
Listing A.3: The MatLab code for Algorithm 6.
function [X, Y] = node_edge_similarity_matrices(AS, AT, BS, BT, TOL)
    X = ones(size(BS,1), size(AS,1));
    Y = ones(size(BS,2), size(AS,2));
    mu_X(1:size(BS,1), 1:size(AS,1)) = TOL;
    mu_Y(1:size(BS,2), 1:size(AS,2)) = TOL;
    X_previous = X;
    Y_previous = Y;
    while true
        Y = (transpose(BS)*X*AS + transpose(BT)*X*AT) ...
            /norm(transpose(BS)*X*AS + transpose(BT)*X*AT, 'fro');
        X = (BS*Y*transpose(AS) + BT*Y*transpose(AT)) ...
            /norm(BS*Y*transpose(AS) + BT*Y*transpose(AT), 'fro');
        difference_X = abs(X - X_previous);
        difference_Y = abs(Y - Y_previous);
        X_previous = X;
        Y_previous = Y;
        if difference_X < mu_X
            if difference_Y < mu_Y
                break;
            end
        end
    end
    return;
end
Listing A.4: The MatLab code to convert an adjacency matrix to a source-edge matrix (edges are numbered left-to-right based on the adjacency matrix).
function [AS] = source_edge_matrix(A)
    number_of_edges = sum(A(:));
    AS = zeros(size(A,1), number_of_edges); % initialize the source-edge matrix A_S
    current_edge = 1;
    for i = 1:size(A,1)
        for j = 1:size(A,2)
            if A(i,j) > 0
                % one column per (possibly multiple) edge from i to j
                for e = 1:A(i,j)
                    AS(i, current_edge) = 1;
                    current_edge = current_edge + 1;
                end
            end
        end
    end
end
Listing A.5: The MatLab code to convert an adjacency matrix to a terminal-edge matrix (edges are numbered left-to-right based on the adjacency matrix).
function [AT] = terminal_edge_matrix(A)
    number_of_edges = sum(A(:));
    AT = zeros(size(A,1), number_of_edges); % initialize the terminal-edge matrix A_T
    current_edge = 1;
    for i = 1:size(A,1)
        for j = 1:size(A,2)
            if A(i,j) > 0
                for e = 1:A(i,j)
                    AT(j, current_edge) = 1; % the edge ends in vertex j
                    current_edge = current_edge + 1;
                end
            end
        end
    end
end
Listing A.6: The MatLab code for Algorithm 6 but with adjacency matrices as input (edges are numbered left-to-right based on the adjacency matrices).
function [X, Y] = node_edge_similarity_matrices_with_adjacency_matrix(A, B, TOL)
    AS = source_edge_matrix(A);
    AT = terminal_edge_matrix(A);
    BS = source_edge_matrix(B);
    BT = terminal_edge_matrix(B);
    [X, Y] = node_edge_similarity_matrices(AS, AT, BS, BT, TOL);
end
Listing A.7: The MatLab code for Algorithm 8 that takes an ordered list with the number of vertices of the same color and a normal adjacency matrix as input and returns a partitioned adjacency matrix.
function [Z] = colored_node_adjacency_matrix_partitioning(colored_vertices, A)
    number_of_colors = size(colored_vertices, 2);
    for i = 1:number_of_colors
        for j = 1:number_of_colors
            c_i = colored_vertices(i);
            c_j = colored_vertices(j);
            B = zeros(c_i, c_j);
            % offsets of the first vertex of colors i and j
            start_vertex_i = 0;
            start_vertex_j = 0;
            for k = 1:i-1
                start_vertex_i = start_vertex_i + colored_vertices(k);
            end
            for k = 1:j-1
                start_vertex_j = start_vertex_j + colored_vertices(k);
            end
            for r = 1:c_i
                for k = 1:c_j
                    B(r,k) = A(start_vertex_i + r, start_vertex_j + k);
                end
            end
            Z{i,j} = B;
        end
    end
    return;
end
Listing A.8: The MatLab code for Algorithm 9 that calculates the node similarity matrix for colored nodes.
function [Z] = colored_node_similarity_matrix(colored_verticesA, A, colored_verticesB, B, TOL)
    partitionedA = colored_node_adjacency_matrix_partitioning(colored_verticesA, A);
    partitionedB = colored_node_adjacency_matrix_partitioning(colored_verticesB, B);
    number_of_colors = size(colored_verticesA, 2);
    k = 1;
    for i = 1:number_of_colors
        Z{i} = ones(colored_verticesB(i), colored_verticesA(i));
    end
    Z_previouseven = Z;
    while true
        normalization = 0;
        Z_new = Z;   % update all blocks from the previous iterate
        for i = 1:number_of_colors
            Z_temp = 0;
            for l = 1:number_of_colors
                Z_temp = Z_temp + partitionedB{i,l}*Z{l}*transpose(partitionedA{i,l}) ...
                    + transpose(partitionedB{l,i})*Z{l}*partitionedA{l,i};
            end
            Z_new{i} = Z_temp;
            normalization = normalization + trace(transpose(Z_new{i})*Z_new{i});
        end
        Z = Z_new;
        % Frobenius norm of the whole block matrix
        normalization = sqrt(normalization);
        for i = 1:number_of_colors
            Z{i} = Z{i}/normalization;
        end
        if mod(k,2) == 0
            have_to_stop = 1;
            for i = 1:number_of_colors
                difference_i = abs(Z{i} - Z_previouseven{i});
                if not(all(difference_i(:) < TOL))
                    have_to_stop = 0;
                end
            end
            if (have_to_stop == 1)
                break;
            else
                Z_previouseven = Z;
            end
        end
        k = k + 1;
    end
end
Listing A.9: The MatLab code for the algorithm that takes an ordered list with the number of edges of the same color and a source-edge matrix or terminal-edge matrix as input and returns a partitioned source-edge or terminal-edge matrix.
function [Z] = colored_edge_matrix_partitioning(colored_edges, A)
    number_of_colors = size(colored_edges, 2);
    number_of_nodes = size(A, 1);
    for i = 1:number_of_colors
        c_i = colored_edges(i);
        B = zeros(number_of_nodes, c_i);
        % offset of the first edge of color i
        start_edge_i = 0;
        for k = 1:i-1
            start_edge_i = start_edge_i + colored_edges(k);
        end
        for r = 1:number_of_nodes
            for k = 1:c_i
                B(r,k) = A(r, start_edge_i + k);
            end
        end
        Z{i} = B;
    end
    return;
end
Listing A.10: The MatLab code for the algorithm that calculates the node and (colored) edge similarity matrix for colored edges.
function [S] = colored_edge_similarity_matrix(colored_edgesA, AS, AT, colored_edgesB, BS, BT, TOL)
    partitionedAS = colored_edge_matrix_partitioning(colored_edgesA, AS);
    partitionedAT = colored_edge_matrix_partitioning(colored_edgesA, AT);
    partitionedBS = colored_edge_matrix_partitioning(colored_edgesB, BS);
    partitionedBT = colored_edge_matrix_partitioning(colored_edgesB, BT);
    number_of_colors = size(colored_edgesA, 2);
    k = 1;
    X = ones(size(BS,1), size(AS,1));   % node similarity matrix
    for i = 1:number_of_colors
        Z{i,i} = ones(colored_edgesB(i), colored_edgesA(i));   % edge block per color
    end
    Z_previouseven = Z;
    while true
        % update the edge similarity blocks color by color
        norm2 = 0;
        for i = 1:number_of_colors
            Z{i,i} = transpose(partitionedBS{i})*X*partitionedAS{i} ...
                + transpose(partitionedBT{i})*X*partitionedAT{i};
            norm2 = norm2 + trace(transpose(Z{i,i})*Z{i,i});
        end
        % Frobenius norm of the whole block matrix
        norm2 = sqrt(norm2);
        for i = 1:number_of_colors
            Z{i,i} = Z{i,i}/norm2;
        end
        % update the node similarity matrix
        X_temp = 0;
        for l = 1:number_of_colors
            X_temp = X_temp + partitionedBS{l}*Z{l,l}*transpose(partitionedAS{l}) ...
                + partitionedBT{l}*Z{l,l}*transpose(partitionedAT{l});
        end
        X = X_temp/norm(X_temp, 'fro');
        if mod(k,2) == 0
            have_to_stop = 1;
            for i = 1:number_of_colors
                difference_i = abs(Z{i,i} - Z_previouseven{i,i});
                if not(all(difference_i(:) < TOL))
                    have_to_stop = 0;
                end
            end
            if (have_to_stop == 1)
                break;
            else
                Z_previouseven = Z;
            end
        end
        k = k + 1;
    end
    % assemble a block-diagonal matrix with the colored edge similarity blocks
    for i = 1:number_of_colors
        for j = 1:number_of_colors
            if not(i == j)
                Z{i,j} = zeros(size(Z{i,i},1), size(Z{j,j},2));
            end
        end
    end
    S = cell2mat(Z);
    return;
end
Listing A.11: The MatLab code to calculate the adjacency matrix of the representing line-graph of a hypergraph.
function [A] = hypergraph_to_linegraph(n, E)
    m = numel(E);
    A = zeros(m, m);
    for i = 1:m
        for j = 1:m
            % two edges are adjacent in the line-graph iff they intersect
            if or(i == j, isempty(intersect(E{i}, E{j})))
                A(i,j) = 0;
            else
                A(i,j) = 1;
            end
        end
    end
end
Listing A.12: The MatLab code to calculate the adjacency matrix of the 2-section of a hypergraph.
function [A] = hypergraph_to_2section(n, E)
    A = zeros(n, n);
    for i = 1:n
        for j = 1:n
            for idx = 1:numel(E)
                if i == j
                    A(i,j) = 0;
                    break;
                elseif not(or(isempty(intersect(E{idx}, [i])), isempty(intersect(E{idx}, [j]))))
                    % i and j are contained in a common edge
                    A(i,j) = 1;
                    break;
                else
                    A(i,j) = 0;
                end
            end
        end
    end
end
Listing A.13: The MatLab code for Algorithm 10 to calculate the adjacency matrix of the extended 2-section of a hypergraph.
function [A] = hypergraph_to_extended2section(n, E)
    A = zeros(n, n);
    for k = 1:numel(E)
        if size(E{k}, 2) == 1
            % a singleton edge becomes a loop
            v = cell2mat(E(k));
            A(v, v) = A(v, v) + 1;
        else
            % connect every vertex of the edge to all other vertices of the edge
            p = perms(E{k});
            already_done = [];
            for i = 1:size(p,1)
                if isempty(intersect(p(i,1), already_done))
                    for j = 2:size(p,2)
                        A(p(i,1), p(i,j)) = A(p(i,1), p(i,j)) + 1;
                    end
                    already_done = [already_done, p(i,1)];
                end
            end
        end
    end
    return;
end
Listing A.14: The MatLab code for the algorithm to calculate the adjacency matrix of the incidence graph of a hypergraph.
function [A] = hypergraph_to_incidencegraph(n, E)
    m = numel(E);
    A = zeros(n+m, n+m);
    for idx = 1:numel(E)
        edge = cell2mat(E(idx));
        for idx_2 = 1:numel(edge)
            % vertex edge(idx_2) is incident with edge number idx
            A(edge(idx_2), n + idx) = 1;
            A(n + idx, edge(idx_2)) = 1;
        end
    end
end
Listing A.15: The MatLab code for Algorithm 12 to calculate the incidence matrix of a hypergraph.
function [A] = hypergraph_to_incidencematrix(n, E)
    m = numel(E);
    A = zeros(n, m);
    for idx = 1:numel(E)
        edge = cell2mat(E(idx));
        for idx_2 = 1:numel(edge)
            A(edge(idx_2), idx) = 1;
        end
    end
end
Listing A.16: The MatLab code for Algorithm 11 to calculate the node-edge similarity scores of a hypergraph.
% Direct transcription of Algorithm 11: A and B are the incidence
% matrices of hypergraphs G and H.
function [X, Y] = node_edge_similarity_matrix_hypergraphs(A, B, TOL)
    X = ones(size(B,1), size(A,1));
    Y = ones(size(B,2), size(A,2));
    mu_X(1:size(B,1), 1:size(A,1)) = TOL;
    mu_Y(1:size(B,2), 1:size(A,2)) = TOL;
    while true
        Y_new = transpose(B)*X*A/norm(transpose(B)*X*A, 'fro');
        X_new = B*Y_new*transpose(A)/norm(B*Y_new*transpose(A), 'fro');
        difference_X = abs(X_new - X);
        difference_Y = abs(Y_new - Y);
        X = X_new;
        Y = Y_new;
        if and(all(difference_X(:) < mu_X(:)), all(difference_Y(:) < mu_Y(:)))
            break;
        end
    end
end
App
endi
xB
Res
ults
ofth
eE
urov
isio
nSo
ngC
onte
st20
09-2
015
Tabl
eB.
1:Eu
rovi
sion
Song
Con
test
2009
AlbaniaAndorraArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusCzechRepublicDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGermanyGreeceHungaryIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaMontenegroNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Nor
way
710
881
2101
02
810
3121
28
881
2101
212
8121
212
881
012
551
2101
0121
212
812
3121
0387
10.3
1138
2108
0.10
7545
693
Icel
and
68
52
52
521
08
210
74
712
108
812
351
21
810
36
510
57
282
1820
.175
6834
130.
1134
9331
3A
zerb
aija
n4
110
33
810
810
87
32
181
06
45
110
610
64
74
41
810
1210
3207
30.1
6594
6697
0.10
9357
170
Tur
key
1041
212
710
11
612
5121
03
53
53
73
62
612
812
177
40.1
3663
4156
0.09
4714
828
Uni
ted
Kin
gdom
84
73
47
47
63
64
812
110
42
310
12
410
68
731
03
617
350
.137
1075
990.
1336
3049
8
Est
onia
44
63
512
15
610
610
107
81
812
74
129
60.1
0679
3281
0.13
1401
482
Gre
ece
1210
55
512
712
64
72
84
64
12
21
5120
70.0
9646
0793
0.14
8053
884
Fran
ce2
36
71
62
43
66
35
56
11
103
73
76
311
0780
.087
4622
710.
1090
3393
7
153
APPENDIX B. RESULTS OF THE EUROVISION SONG CONTEST 2009-2015 154Ta
ble
B.1:
Euro
visio
nSo
ngC
onte
st20
09(c
ontin
ued)
AlbaniaAndorraArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusCzechRepublicDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGermanyGreeceHungaryIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaMontenegroNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Bos
nia
&H
erze
govi
na5
241
210
62
212
1281
05
44
810
690
.081
4273
950.
1191
5838
7
Arm
enia
17
641
21
16
58
24
75
34
35
62
9210
0.06
9134
3820
.132
2172
98R
ussi
a12
68
18
105
76
76
34
891
110.
0721
2247
10.1
5299
2047
Ukr
aine
310
65
82
42
451
06
21
62
7612
0.05
8251
7490
.142
8263
04D
enm
ark
56
53
44
32
65
87
21
85
7413
0.05
6476
6070
.156
3654
31M
oldo
va2
74
31
351
212
15
77
6914
0.04
8643
1910
.119
1605
33P
ortu
gal
66
17
37
72
810
5715
0.03
9280
5350
.105
8987
49Is
rael
78
84
510
45
11
5316
0.03
7533
9190
.157
0314
29A
lban
ia5
17
72
72
16
1048
170.
0354
6362
10.1
4149
3351
Cro
atia
512
42
12
85
645
180.
0328
3076
90.1
3453
9946
Rom
ania
32
52
122
27
540
190.
0279
4082
40.1
1946
6004
Ger
man
y3
22
73
16
12
17
3520
0.02
7028
5310
.155
1341
99Sw
eden
12
46
72
34
433
210.
0251
7333
90.1
6032
8666
Mal
ta1
34
51
13
76
3122
0.02
4200
0050
.139
6822
95L
ithu
ania
11
12
77
423
230.
0182
9026
10.1
5260
0566
Spai
n12
17
323
240.
0151
5637
60.1
2273
9159
Fin
land
34
83
422
250.
0169
4659
70.1
3277
3031
And
orra
00.1
0978
1594
Bel
arus
00.1
4958
7578
Bel
gium
00.1
2223
3834
Bul
gari
a00
.122
7935
35C
ypru
s00
.146
5668
20C
zech
Rep
ublic
00.1
0391
7420
FY
RM
aced
onia
00.1
2422
7588
APPENDIX B. RESULTS OF THE EUROVISION SONG CONTEST 2009-2015 155Ta
ble
B.1:
Euro
visio
nSo
ngC
onte
st20
09(c
ontin
ued)
AlbaniaAndorraArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusCzechRepublicDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGermanyGreeceHungaryIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaMontenegroNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Hun
gary
00.1
5826
0146
Irel
and
00.1
2812
6358
Lat
via
00.1
4378
7849
Mon
tene
gro
00.1
2614
4148
Pol
and
00.1
3597
2955
Serb
ia00
.123
0803
64Sl
ovak
ia00
.142
3497
52Sl
oven
ia00
.133
1784
41Sw
itze
rlan
d00
.117
7334
65T
heN
ethe
rlan
ds00
.163
3949
17
Tabl
eB.
2:Eu
rovi
sion
Song
Con
test
2010
AlbaniaArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGeorgiaGermanyGreeceIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Ger
man
y10
110
83
641
212
812
32
38
1210
412
71
36
8121
0121
212
410
542
4610
.263
7813
090.
1159
0929
0T
urke
y8
123
6101
012
661
031
251
01
42
83
23
53
881
0170
20.1
6931
8656
0.13
7704
079
Rom
ania
51
71
28
84
57
83
251
210
610
36
171
010
45
22
8162
30.1
6889
8361
0.12
8083
998
APPENDIX B. RESULTS OF THE EUROVISION SONG CONTEST 2009-2015 156Ta
ble
B.2:
Euro
visio
nSo
ngC
onte
st20
10(c
ontin
ued)
AlbaniaArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGeorgiaGermanyGreeceIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Den
mar
k2
54
22
52
212
1241
03
881
241
21
712
48
261
4940
.158
2508
110.
1441
7417
0A
zerb
aija
n6
571
210
23
81
43
72
127
78
87
212
1214
550
.150
0375
830.
1208
1845
1B
elgi
um10
36
712
6101
04
710
310
54
551
02
76
114
360
.147
5567
630.
1473
6242
0A
rmen
ia5
78
74
610
77
121
65
612
14
81
126
614
170
.139
2313
190.
1015
0866
0G
reec
e12
312
65
127
48
87
21
87
105
53
312
140
80.1
4395
4029
0.09
5629
350
Geo
rgia
128
74
46
75
85
52
55
125
14
101
16
15
713
690
.135
3928
620.
0862
8112
1U
krai
ne81
010
76
27
27
68
35
77
71
5108
100.
1037
5014
30.1
3927
7675
Rus
sia
1031
210
108
510
26
410
9011
0.08
5893
5820
.141
8611
13Fr
ance
63
33
71
18
38
66
32
47
43
22
8212
0.08
0095
3410
.118
1936
39Se
rbia
312
81
710
58
710
172
130.
0715
3058
30.1
3245
4774
Isra
el4
58
12
101
41
58
610
33
7114
0.07
1051
2440
.118
9699
83Sp
ain
77
12
42
15
84
122
45
468
150.
0659
0434
00.1
6262
6170
Alb
ania
35
512
110
72
18
71
6216
0.05
9724
4710
.140
5376
52B
osni
a&
Her
zego
vina
610
65
124
851
170.
0501
4570
70.1
3905
3777
Por
tuga
l4
48
61
61
26
543
180.
0429
2200
10.1
0767
4053
Icel
and
28
34
54
36
641
190.
0422
3612
60.1
3749
2185
Nor
way
25
76
23
33
435
200.
0341
2301
10.1
5545
4891
Cyp
rus
44
112
12
327
210.
0243
0572
40.1
4417
4824
Mol
dova
64
161
027
220.
0240
3919
60.1
1969
6370
Irel
and
12
12
66
725
230.
0242
3986
30.1
4741
1097
Bel
arus
112
32
1824
0.01
3826
7390
.104
1077
56U
nite
dK
ingd
om1
23
410
250.
0093
7836
90.1
3617
1561
Bul
gari
a00
.136
2080
81C
roat
ia00
.114
0144
18E
ston
ia00
.136
0977
82F
YR
Mac
edon
ia00
.127
6261
78F
inla
nd00
.133
3304
33
APPENDIX B. RESULTS OF THE EUROVISION SONG CONTEST 2009-2015 157Ta
ble
B.2:
Euro
visio
nSo
ngC
onte
st20
10(c
ontin
ued)
AlbaniaArmeniaAzerbaijanBelarusBelgiumBosnia&HerzegovinaBulgariaCroatiaCyprusDenmarkEstoniaF.Y.R.MacedoniaFinlandFranceGeorgiaGermanyGreeceIcelandIrelandIsraelLatviaLithuaniaMaltaMoldovaNorwayPolandPortugalRomaniaRussiaSerbiaSlovakiaSloveniaSpainSwedenSwitzerlandTheNetherlandsTurkeyUkraineUnitedKingdomPointsPlace
AuthorityScore
HubScore
Lat
via
00.1
4283
7341
Lit
huan
ia00
.144
6337
55M
alta
00.1
3807
2738
Pol
and
00.1
5943
3083
Slov
akia
00.1
4532
8357
Slov
enia
00.1
3727
7314
Swed
en00
.153
8583
52Sw
itze
rlan
d00
.127
0765
74T
heN
ethe
rlan
ds00
.139
0851
80
Table B.3: Eurovision Song Contest 2011

Voting countries: Albania, Armenia, Austria, Azerbaijan, Belarus, Belgium, Bosnia & Herzegovina, Bulgaria, Croatia, Cyprus, Denmark, Estonia, F.Y.R. Macedonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Malta, Moldova, Norway, Poland, Portugal, Romania, Russia, San Marino, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, The Netherlands, Turkey, Ukraine, United Kingdom

Country                   Points   Place   Authority score   Hub score
Azerbaijan                   221       1       0.239456546   0.119172772
Italy                        189       2       0.196060135   0.098845242
Sweden                       185       3       0.191434165   0.106864013
Ukraine                      159       4       0.164438109   0.120992086
Denmark                      134       5       0.133906705   0.104782124
Bosnia & Herzegovina         125       6       0.118392353   0.127186019
Greece                       120       7       0.122325490   0.145912383
Ireland                      119       8       0.112343966   0.103703681
Georgia                      110       9       0.117252558   0.132743840
Germany                      107      10       0.103211617   0.103941997
United Kingdom               100      11       0.100671386   0.096916917
Moldova                       97      12       0.095132589   0.126124831
Slovenia                      96      13       0.096457469   0.118682326
Serbia                        85      14       0.082541126   0.095024968
France                        82      15       0.085010939   0.137268491
Russia                        77      16       0.077844320   0.143360223
Romania                       77      17       0.073355844   0.131674852
Austria                       64      18       0.059654291   0.129428733
Lithuania                     63      19       0.058773021   0.127424625
Iceland                       61      20       0.052757629   0.130358202
Finland                       57      21       0.052100575   0.101891325
Hungary                       53      22       0.048236982   0.130649629
Spain                         50      23       0.052115936   0.114254584
Estonia                       44      24       0.040929137   0.143253587
Switzerland                   19      25       0.016064278   0.092620742
Albania                        0       -       0             0.142107551
Armenia                        0       -       0             0.128531290
Belarus                        0       -       0             0.138933798
Belgium                        0       -       0             0.118486508
Bulgaria                       0       -       0             0.110776143
Croatia                        0       -       0             0.129980068
Cyprus                         0       -       0             0.145690798
F.Y.R. Macedonia               0       -       0             0.116197406
Israel                         0       -       0             0.131676797
Latvia                         0       -       0             0.121328888
Malta                          0       -       0             0.162191624
Norway                         0       -       0             0.099011716
Poland                         0       -       0             0.127824729
Portugal                       0       -       0             0.130723491
San Marino                     0       -       0             0.158559128
Slovakia                       0       -       0             0.141130328
The Netherlands                0       -       0             0.137636506
Turkey                         0       -       0             0.145607545
Table B.4: Eurovision Song Contest 2012

Voting countries: Albania, Austria, Azerbaijan, Belarus, Belgium, Bosnia & Herzegovina, Bulgaria, Croatia, Cyprus, Denmark, Estonia, F.Y.R. Macedonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Malta, Moldova, Montenegro, Norway, Portugal, Romania, Russia, San Marino, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, The Netherlands, Turkey, Ukraine, United Kingdom

Country                   Points   Place   Authority score   Hub score
Sweden                       372       1       0.305639378   0.099063059
Russia                       259       2       0.206572384   0.129906575
Serbia                       214       3       0.176946532   0.113907146
Azerbaijan                   150       4       0.117051052   0.113395687
Albania                      146       5       0.118949905   0.098184652
Estonia                      120       6       0.097934280   0.131495479
Turkey                       112       7       0.088872329   0.113535313
Germany                      110       8       0.089024803   0.157167197
Italy                        101       9       0.080314920   0.109725154
Spain                         97      10       0.078561620   0.133173657
Moldova                       81      11       0.062811881   0.119439246
Romania                       71      12       0.054187099   0.119828631
F.Y.R. Macedonia              71      13       0.053773867   0.132611623
Lithuania                     70      14       0.055584045   0.14412079
Ukraine                       65      15       0.049463410   0.123998028
Cyprus                        65      16       0.045779111   0.140384735
Greece                        64      17       0.047316840   0.124054195
Bosnia & Herzegovina          55      18       0.044119285   0.135191508
Ireland                       46      19       0.036462417   0.132863608
Iceland                       46      20       0.039862230   0.129060084
Malta                         41      21       0.030603570   0.122803553
France                        21      22       0.016426246   0.151434886
Denmark                       21      23       0.017720863   0.139703998
Hungary                       19      24       0.014553209   0.151290741
United Kingdom                12      25       0.009791307   0.121921955
Norway                         7      26       0.005255627   0.154850199
Austria                        0       -       0             0.153796281
Belarus                        0       -       0             0.120913553
Belgium                        0       -       0             0.153461988
Bulgaria                       0       -       0             0.150772307
Croatia                        0       -       0             0.132022928
Finland                        0       -       0             0.141472175
Georgia                        0       -       0             0.123066528
Israel                         0       -       0             0.139948084
Latvia                         0       -       0             0.138435935
Montenegro                     0       -       0             0.137300587
Portugal                       0       -       0             0.116835694
San Marino                     0       -       0             0.130840589
Slovakia                       0       -       0             0.138398727
Slovenia                       0       -       0             0.148664554
Switzerland                    0       -       0             0.125128002
The Netherlands                0       -       0             0.147638419
Table B.5: Eurovision Song Contest 2013

Voting countries: Albania, Armenia, Austria, Azerbaijan, Belarus, Belgium, Bulgaria, Croatia, Cyprus, Denmark, Estonia, F.Y.R. Macedonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Malta, Moldova, Montenegro, Norway, Romania, Russia, San Marino, Serbia, Slovenia, Spain, Sweden, Switzerland, The Netherlands, Ukraine, United Kingdom

Country                   Points   Place   Authority score   Hub score
Denmark                      281       1       0.251738561   0.112496381
Azerbaijan                   234       2       0.211075596   0.109552608
Ukraine                      214       3       0.196324466   0.106438081
Norway                       191       4       0.164027896   0.098933496
Russia                       174       5       0.157129979   0.130276520
Greece                       152       6       0.126771445   0.144942656
Italy                        126       7       0.111480415   0.134388723
Malta                        120       8       0.098929720   0.146482452
The Netherlands              114       9       0.095968817   0.133925774
Hungary                       84      10       0.066641891   0.150320451
Moldova                       71      11       0.062911597   0.139519096
Belgium                       71      12       0.058233658   0.159189216
Romania                       65      13       0.054431203   0.141742263
Sweden                        62      14       0.050114034   0.122816263
Georgia                       50      15       0.040436276   0.147386391
Belarus                       48      16       0.039440804   0.143351285
Iceland                       47      17       0.038200761   0.149905594
Armenia                       41      18       0.035257962   0.125792175
United Kingdom                23      19       0.020645649   0.143614616
Estonia                       19      20       0.015433229   0.141096214
Germany                       18      21       0.015543936   0.126649981
Lithuania                     17      22       0.014312777   0.141552096
France                        14      23       0.010153847   0.135186986
Finland                       13      24       0.010627535   0.108008031
Spain                          8      25       0.005630226   0.153609206
Ireland                        5      26       0.004543069   0.137789556
Albania                        0       -       0             0.102282457
Austria                        0       -       0             0.127058480
Bulgaria                       0       -       0             0.136341600
Croatia                        0       -       0             0.169310202
Cyprus                         0       -       0             0.161412928
F.Y.R. Macedonia               0       -       0             0.138521061
Israel                         0       -       0             0.157743866
Latvia                         0       -       0             0.134627381
Montenegro                     0       -       0             0.156164697
San Marino                     0       -       0             0.092520925
Serbia                         0       -       0             0.165969147
Slovenia                       0       -       0             0.141375070
Switzerland                    0       -       0             0.117585074
Table B.6: Eurovision Song Contest 2014

Voting countries: Albania, Armenia, Austria, Azerbaijan, Belarus, Belgium, Denmark, Estonia, F.Y.R. Macedonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Malta, Moldova, Montenegro, Norway, Poland, Portugal, Romania, Russia, San Marino, Slovenia, Spain, Sweden, Switzerland, The Netherlands, Ukraine, United Kingdom

Country                   Points   Place   Authority score   Hub score
Austria                      290       1       0.285110029   0.137956312
The Netherlands              238       2       0.235720093   0.145211066
Sweden                       218       3       0.212949392   0.155725338
Armenia                      174       4       0.156091783   0.050899422
Hungary                      143       5       0.124453384   0.16569976
Ukraine                      113       6       0.095410297   0.135769452
Russia                        89       7       0.086504385   0.10214933
Norway                        88       8       0.074474911   0.156676045
Denmark                       74       9       0.070675128   0.160549539
Spain                         74      10       0.068432379   0.165764986
Finland                       72      11       0.065352465   0.10753118
Romania                       72      12       0.062560768   0.155212094
Switzerland                   64      13       0.055868228   0.153864143
Poland                        62      14       0.053973263   0.11229657
Iceland                       58      15       0.052246035   0.165062483
Belarus                       43      16       0.037978889   0.089227164
United Kingdom                40      17       0.029958317   0.144342619
Germany                       39      18       0.027945852   0.136869443
Montenegro                    37      19       0.026911408   0.097193444
Greece                        35      20       0.023817428   0.159278464
Italy                         33      21       0.023425296   0.121638547
Azerbaijan                    33      22       0.022613716   0.068005452
Malta                         32      23       0.021816155   0.119063784
San Marino                    14      24       0.009315756   0.098204303
Slovenia                       9      25       0.005843162   0.156533823
France                         2      26       0.002167387   0.172574335
Albania                        0       -       0             0.091897617
Belgium                        0       -       0             0.169584694
Estonia                        0       -       0             0.162056401
F.Y.R. Macedonia               0       -       0             0.153993935
Georgia                        0       -       0             0.117423087
Ireland                        0       -       0             0.147584917
Israel                         0       -       0             0.153059334
Latvia                         0       -       0             0.166295198
Lithuania                      0       -       0             0.15992012
Moldova                        0       -       0             0.115874904
Portugal                       0       -       0             0.173610946
Table B.7: Eurovision Song Contest 2015

Voting countries: Albania, Armenia, Australia, Austria, Azerbaijan, Belarus, Belgium, Cyprus, Czech Republic, Denmark, Estonia, F.Y.R. Macedonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Malta, Moldova, Montenegro, Norway, Poland, Portugal, Romania, Russia, San Marino, Serbia, Slovenia, Spain, Sweden, Switzerland, The Netherlands, United Kingdom

Country                   Points   Place   Authority score   Hub score
Sweden                       365       1       0.247602      0.130248
Russia                       303       2       0.203561      0.128412
Italy                        292       3       0.193495      0.146217
Belgium                      217       4       0.147482      0.145478
Australia                    196       5       0.135929      0.157516
Latvia                       186       6       0.126286      0.149527
Estonia                      106       7       0.071542      0.145329
Norway                       102       8       0.070783      0.144331
Israel                        97       9       0.065718      0.146691
Serbia                        53      10       0.030338      0.128584
Georgia                       51      11       0.031356      0.111867
Azerbaijan                    49      12       0.028747      0.125808
Montenegro                    44      13       0.026056      0.085579
Slovenia                      39      14       0.022577      0.135649
Romania                       35      15       0.023982      0.153499
Armenia                       34      16       0.020110      0.119244
Albania                       34      17       0.016755      0.124384
Lithuania                     30      18       0.020556      0.118747
Greece                        23      19       0.014358      0.125544
Hungary                       19      20       0.013061      0.146121
Spain                         15      21       0.009537      0.158995
Cyprus                        11      22       0.006646      0.145637
Poland                        10      23       0.006909      0.159925
United Kingdom                 5      24       0.003258      0.153125
France                         4      25       0.002338      0.142272
Austria                        0      26       0             0.150454
Germany                        0      27       0             0.157865
Belarus                        0       -       0             0.147342
Czech Republic                 0       -       0             0.130862
Denmark                        0       -       0             0.160507
F.Y.R. Macedonia               0       -       0             0.085414
Finland                        0       -       0             0.149564
Iceland                        0       -       0             0.145817
Ireland                        0       -       0             0.141492
Malta                          0       -       0             0.145890
Moldova                        0       -       0             0.130904
Portugal                       0       -       0             0.144985
San Marino                     0       -       0             0.131533
Switzerland                    0       -       0             0.146739
The Netherlands                0       -       0             0.155502