A Path Algebra for Mapping Multi-Relational Networks to Single-Relational Networks


Marko A. Rodriguez

T-7: Mathematical Modeling and Analysis Group

CNLS: Center for Nonlinear Studies

Los Alamos National Laboratory

June 26, 2008

Single- and Multi-Relational Networks

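[Figure: a single-relational network over the vertices Human-A through Human-F, shown beside a multi-relational network in which Human-A, Human-B, Article-A, Article-B, Journal-A, and Publisher-A are connected by authored, containedIn, editorOf, and publishedBy edges.]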

Center for Non-Linear Studies Public Lecture - June 26, 2008

Presentation Article

Rodriguez, M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms”, LA-UR-08-03931, May 2008, http://arxiv.org/abs/0806.2274.

Acknowledgements:

• Ideas inspired by the MESUR problem space [Bollen et al., 2007].

• Vadas Gintautas aided in reviewing drafts of the article and presentation.

• Michael Ham aided in reviewing drafts of the presentation.

• Razvan Teodorescu taught me FoilTeX and this is his template (I’m sorry, I just don’t know how to change the color scheme!)


Problem Statement

Think about all of the known network analysis algorithms:

• geodesics: diameter, eccentricity [Harary & Hage, 1995], closeness [Bavelas, 1950], betweenness [Freeman, 1977], ...

• spectral: PageRank [Brin & Page, 1998], eigenvector centrality [Bonacich, 1987], ...

• community detection: leading eigenvector [Newman, 2006], edge betweenness [Girvan & Newman, 2002], ...

• mixing patterns: scalar and discrete assortativity [Newman, 2003], ...

• on and on and on...

These algorithms have been developed for directed or undirected single-relational networks. What do you do when you have a multi-relational network?


Problem Statement

Now think about all of the known network analysis packages:

• Java Universal Network/Graph Framework (JUNG) [O’Madadhain et al., 2005]

• iGraph: Package for Complex Network Research [Csardi, 2006]

• Pajek

• NetworkX [Hagberg et al., n.d.]

• on and on and on...

These packages (for the most part) have been developed for directed or undirected single-relational networks. What do you do when you have a multi-relational network?


Problem Statement

• Do you reimplement all of the known algorithms to support a multi-relational network?

• Even if you do, what do these algorithms look like?


Solution Statement

• You map your multi-relational network to a “meaningful” single-relational network and re-use existing algorithms, packages, and theorems from the single-relational domain.


Outline

• Formalizing Single- and Multi-Relational Networks

• Background on Multi-Relational Network Analysis

• The Elements of the Path Algebra

• The Operations of the Path Algebra

• Multi-Relational Network Analysis


An Undirected Single-Relational Network

[Figure: an undirected network over the vertices Human-A through Human-F.]

All edges have a single homogeneous meaning (e.g. co-author).

G = (V,E ⊆ {V × V })


A Directed Single-Relational Network

[Figure: a directed network over the vertices Article-A through Article-F.]

All edges have a single homogeneous meaning (e.g. citation).

G = (V,E ⊆ (V × V ))


A Multi-Relational Network

[Figure: a multi-relational network in which Human-A, Human-B, Article-A, Article-B, Journal-A, and Publisher-A are connected by authored, containedIn, editorOf, and publishedBy edges.]

Edges are heterogeneous in meaning.

M = (V,E = {E0 ⊆ (V × V ), E1, . . . , Em})


Flatten the Multi-Relational Network

• Suppose you have a multi-relational network in which there exist only two edge sets, defined as coauthor and friend,

M = (V, E = {E0, E1})

and you want to determine the most central “scholar” in this network.

• It is not sufficient to simply ignore edge labels (flatten the multi-relational network to a single-relational network) and execute a centrality algorithm on the network. You will confuse central friendship with central scholarship.


Extract a Single-Relational Network Component

• You could simply pull out the coauthor single-relational network

G = (V, E0)

and run a centrality algorithm on that network to get your result.

• That works, but for more complex situations with “richer semantics”, this mechanism will not work.


Execute a Grammar-Based Walker

• A walker obeys a “grammar” that specifies the way in which the walker should move through the network [Rodriguez, 2008].

[Figure: a coauthor primary eigenvector grammar applied to the example multi-relational network (Human-A, Human-B, Article-B, Journal-A, Publisher-A; edges authored, containedIn, editorOf, publishedBy). The walker's grammar reads: while(true): increment the vertex counter; go authored; go authored⁻¹, but don't go back to the previous vertex. The traversal corresponds to an implicit coauthor edge.]

• Problem – this solution mixes the analysis algorithm and the traversed implicit network.

• Solution – an algebra that is agnostic to the final executing algorithm.


An Adjacency Matrix Representation of a Single-Relational Network

[Figure: the directed article network drawn beside its adjacency matrix, with rows and columns labeled Article-A through Article-E and both dimensions of size n = |V|.]

* NOTE: Sorry about missing the vertex Article-F in the adjacency matrix. Too lazy to redo diagrams.


An Adjacency Matrix Representation of a Single-Relational Network

A single-relational network defined as

G = (V, E ⊆ (V × V))

can be represented as the adjacency matrix $A \in \{0,1\}^{n \times n}$, where

$$A_{i,j} = \begin{cases} 1 & \text{if } (i,j) \in E \\ 0 & \text{otherwise.} \end{cases}$$
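The definition above translates directly into a few lines of code. The following is a minimal sketch in Python with NumPy (a tool choice made here, not one named in the slides); the function name `adjacency_matrix` and the three-article edge list are hypothetical.

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Return A in {0,1}^(n x n) with A[i, j] = 1 iff (i, j) is in edges."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1
    return A

# Hypothetical citation edges: article 0 cites article 1, article 1 cites article 2.
edges = [(0, 1), (1, 2)]
A = adjacency_matrix(3, edges)
print(A)
```

Because the network is directed, `A[0, 1]` is set while `A[1, 0]` stays zero.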


A Three-Way Tensor Representation of a Multi-Relational Network

[Figure: the multi-relational network drawn beside its three-way tensor. Each slice (authored, publishedBy, editorOf, containedIn) is an n × n matrix with rows and columns labeled Human-A, Article-A, Article-B, Human-B, and Journal-A; the tensor has dimensions n = |V| by n = |V| by m = |E|.]

* NOTE: Sorry about missing the vertex Publisher-A in the tensor. Too lazy to redo diagrams.


A Three-Way Tensor Representation of a Multi-Relational Network

A three-way tensor can be used to represent a multi-relational network [Kolda et al., 2005]. If

M = (V, E = {E0, E1, . . . , Em ⊆ (V × V)})

is a multi-relational network, then $A \in \{0,1\}^{n \times n \times m}$ and, for the slice that corresponds to edge set $E_k$,

$$A^{k}_{i,j} = \begin{cases} 1 & \text{if } (i,j) \in E_k \\ 0 & \text{otherwise.} \end{cases}$$
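As an illustration of the tensor definition, here is a small NumPy sketch. The helper `relation_tensor`, the vertex numbering, and the edges are all assumptions made for the example. For convenient slicing the relation index is placed first, so the array has shape (m, n, n) rather than n × n × m.

```python
import numpy as np

def relation_tensor(n, edge_sets):
    """Stack one n x n adjacency slice per edge set; result has shape (m, n, n)."""
    T = np.zeros((len(edge_sets), n, n), dtype=int)
    for k, edges in enumerate(edge_sets):
        for i, j in edges:
            T[k, i, j] = 1
    return T

# Hypothetical vertices: 0 Human-A, 1 Human-B, 2 Article-A, 3 Article-B
authored = [(0, 2), (1, 2), (1, 3)]   # human -> article
cites = [(2, 3)]                      # article -> article
T = relation_tensor(4, [authored, cites])

A_authored, A_cites = T[0], T[1]      # individual relation slices
print(T.shape)
```

Each slice is an ordinary adjacency matrix, so every operation defined for matrices can be applied to it directly.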


The General Purpose of the Path Algebra

• Map a multi-relational tensor $A \in \{0,1\}^{n \times n \times m}$ to a single-relational path matrix $Z \in \mathbb{R}_{+}^{n \times n}$.

• By performing operations on $A$, a single-relational path matrix is created whose “edges” are loaded with meaning.

• For example, you can create a coauthorship network, a social science journal citation network, a coauthorship network for scholars from the same university who have not been on the same project in the last 10 years, but are in the same department, etc.

• The theorems of the algebra can be used to manipulate your mapping operation to a smaller/more efficient form (i.e. how a composition is spoken in words can differ from its reduced form).

[Figure: the binary tensor $A \in \{0,1\}^{n \times n \times m}$ is mapped to a weighted path matrix $Z \in \mathbb{R}_{+}^{n \times n}$.]


The Elements of the Path Algebra

• $A \in \{0,1\}^{n \times n \times m}$: a three-way tensor representation of a multi-relational network.

• $Z \in \mathbb{R}_{+}^{n \times n}$: a path matrix derived by means of operations applied to $A$.

----------

• $C_j \in \{0,1\}^{n \times n}$: “to” path filters.

• $R_i \in \{0,1\}^{n \times n}$: “from” path filters.

• $I \in \{0,1\}^{n \times n}$: the identity matrix as a self-loop filter.

• $\mathbf{1} \in \{1\}^{n \times n}$: a matrix in which all entries are equal to 1.

• $\mathbf{0} \in \{0\}^{n \times n}$: a matrix in which all entries are equal to 0.


Example Scholarly Tensor Used in the Remainder of the Presentation

• $A^{1}$: authored : human → article

• $A^{2}$: cites : article → article

• $A^{3}$: contains : journal → article

• $A^{4}$: category : journal → subject category

• $A^{5}$: developed : human → program/software.


The Operations of the Path Algebra

• $A \cdot B$: ordinary matrix multiplication determines the number of (A,B)-paths between vertices.

• $A^{\top}$: matrix transpose inverts path directionality.

• $A \circ B$: Hadamard, entry-wise multiplication applies a filter to selectively exclude paths.

• $n(A)$: not generates the complement of a $\{0,1\}^{n \times n}$ matrix.

• $c(A)$: clip generates a $\{0,1\}^{n \times n}$ matrix from a $\mathbb{R}_{+}^{n \times n}$ matrix.

• $v^{\pm}(A)$: vertex generates a $\{0,1\}^{n \times n}$ matrix from a $\mathbb{R}_{+}^{n \times n}$ matrix, where only certain rows or columns contain non-zero values.

• $\lambda A$: scalar multiplication weights the entries of a matrix.

• $A + B$: matrix addition merges paths.


The Traverse Operation

• An interesting aspect of the single-relational adjacency matrix $A \in \{0,1\}^{n \times n}$ is that when it is raised to the $k$th power, the entry $A^{(k)}_{i,j}$ is equal to the number of paths of length $k$ that connect vertex $i$ to vertex $j$ [Chartrand, 1977].

• Given, by definition, that $A^{(1)}_{i,j}$ (i.e. $A_{i,j}$) represents the number of paths that go from $i$ to $j$ of length 1 (i.e. a single edge) and by the rules of ordinary matrix multiplication,

$$A^{(k)}_{i,j} = \sum_{l \in V} A^{(k-1)}_{i,l} \cdot A_{l,j} : k \geq 2.$$

[Figure: a 3 × 3 adjacency matrix over the vertices a, b, and c multiplied by itself; the product shows that there is a path of length 2 from a to c.]
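The path-counting property can be checked numerically. This sketch assumes a hypothetical three-vertex chain a → b → c (not necessarily the matrix drawn on the slide) and uses NumPy's `@` operator for ordinary matrix multiplication.

```python
import numpy as np

# Hypothetical chain a -> b -> c, with vertices indexed a=0, b=1, c=2.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])

A_squared = A @ A                        # ordinary matrix product
A_power = np.linalg.matrix_power(A, 2)   # same result via repeated multiplication

# The only non-zero entry is (a, c): one path of length 2 from a to c.
print(A_squared)
```

Raising the matrix to higher powers in the same way counts longer paths; for this chain, the third power is the zero matrix because no path of length 3 exists.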


The Traverse Operation

$$Z = A^{1} \cdot A^{2} \cdot (A^{1})^{\top}$$

$Z_{i,j}$ defines the number of paths from vertex $i$ to vertex $j$ such that a path goes from author $i$ to one of the articles he or she has authored, from that article to one of the articles it cites, and finally, from that cited article to its author $j$. Semantically, $Z$ is an author-citation single-relational path matrix.

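[Figure: Human-A authored Article-A ($A^{1}$), Article-A cites Article-B ($A^{2}$), and Article-B was authored by Human-B ($(A^{1})^{\top}$); the resulting entry of $Z$ is an author-citation edge from Human-A to Human-B.]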

* NOTE: All diagrams are with respect to a “source” vertex (the blue vertex) in order to preserve clarity. In reality, the operations

operate on all vertices in parallel.
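The author-citation composition can be tried on a toy network. In this sketch the four vertices and their edges are hypothetical, and the matrices `A1` and `A2` stand in for the authored and cites slices of the tensor.

```python
import numpy as np

# Hypothetical vertices: 0 Human-A, 1 Human-B, 2 Article-A, 3 Article-B
n = 4
A1 = np.zeros((n, n), dtype=int)   # authored: human -> article
A2 = np.zeros((n, n), dtype=int)   # cites: article -> article
A1[0, 2] = 1   # Human-A authored Article-A
A1[1, 3] = 1   # Human-B authored Article-B
A2[2, 3] = 1   # Article-A cites Article-B

# Traverse authored, then cites, then authored in the reverse direction.
Z = A1 @ A2 @ A1.T
print(Z)
```

Here `Z[0, 1]` is 1: there is exactly one authored-cites-authored path from Human-A to Human-B, and no other entry is non-zero.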


The Filter Operation

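Various path filters can be defined and applied using the entry-wise Hadamard matrix product denoted $\circ$, where

$$A \circ B = \begin{bmatrix} A_{1,1} \cdot B_{1,1} & \cdots & A_{1,m} \cdot B_{1,m} \\ \vdots & \ddots & \vdots \\ A_{n,1} \cdot B_{n,1} & \cdots & A_{n,m} \cdot B_{n,m} \end{bmatrix}.$$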

[Figure: Path Matrix ∘ Path Filter = Filtered Path Matrix. Entries of the weighted path matrix survive only where the binary path filter has a 1.]


The Filter Operation

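• $A \circ \mathbf{1} = A$

• $A \circ \mathbf{0} = \mathbf{0}$

• $A \circ B = B \circ A$

• $A \circ (B + C) = (A \circ B) + (A \circ C)$

• $A^{\top} \circ B^{\top} = (A \circ B)^{\top}$.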
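These Hadamard identities are easy to spot-check numerically. The sketch below uses randomly generated integer matrices (an arbitrary choice for illustration) and NumPy's `*`, which is the entry-wise product on arrays; a passing check on random data is evidence for the identities, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the check is repeatable
n = 4
A, B, C = (rng.integers(0, 5, size=(n, n)) for _ in range(3))
ones = np.ones((n, n), dtype=int)
zeros = np.zeros((n, n), dtype=int)

# Each value is True when the corresponding identity holds for this sample.
identities = {
    "A o 1 = A": np.array_equal(A * ones, A),
    "A o 0 = 0": np.array_equal(A * zeros, zeros),
    "A o B = B o A": np.array_equal(A * B, B * A),
    "A o (B + C) = (A o B) + (A o C)": np.array_equal(A * (B + C), A * B + A * C),
    "A^T o B^T = (A o B)^T": np.array_equal(A.T * B.T, (A * B).T),
}
for name, holds in identities.items():
    print(name, holds)
```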


The Not Filter

The not filter is useful for excluding a set of paths to or from a vertex.

$$n : \{0,1\}^{n \times n} \to \{0,1\}^{n \times n}$$

with a function rule of

$$n(A)_{i,j} = \begin{cases} 1 & \text{if } A_{i,j} = 0 \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: n applied to a binary matrix; every 0 becomes 1 and every 1 becomes 0.]


The Not Filter

If $A \in \{0,1\}^{n \times n}$, then

• $n(n(A)) = A$

• $A \circ n(A) = \mathbf{0}$

• $n(A) \circ n(A) = n(A)$.
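A possible implementation of the not filter, together with a numerical check of the three identities, is sketched below. The function name `not_filter` and the random binary test matrix are assumptions made for the example.

```python
import numpy as np

def not_filter(A):
    """Complement of a {0,1} matrix: 1 where A is 0, and 0 elsewhere."""
    return (A == 0).astype(int)

rng = np.random.default_rng(1)
A = rng.integers(0, 2, size=(5, 5))   # random binary matrix for the check

double_negation = np.array_equal(not_filter(not_filter(A)), A)
excluded = np.array_equal(A * not_filter(A), np.zeros_like(A))
idempotent = np.array_equal(not_filter(A) * not_filter(A), not_filter(A))
print(double_negation, excluded, idempotent)
```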


The Not Filter

A coauthorship path matrix is

$$Z = A^{1} \cdot (A^{1})^{\top} \circ n(I)$$

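[Figure: Human-A authored Article-A ($A^{1}$) and Article-A was authored by Human-B ($(A^{1})^{\top}$), which yields a coauthor edge in $Z$; the coauthor self-loop back to Human-A is removed by $n(I)$.]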
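The coauthorship composition can be reproduced on a toy network. This sketch assumes three hypothetical vertices and applies the product before the filter, which is how the expression is read here; the parentheses make that grouping explicit.

```python
import numpy as np

def not_filter(A):
    """Complement of a {0,1} matrix."""
    return (A == 0).astype(int)

# Hypothetical vertices: 0 Human-A, 1 Human-B, 2 Article-A
n = 3
A1 = np.zeros((n, n), dtype=int)   # authored: human -> article
A1[0, 2] = 1
A1[1, 2] = 1
I = np.eye(n, dtype=int)

unfiltered = A1 @ A1.T             # includes each author's path back to themselves
Z = unfiltered * not_filter(I)     # n(I) zeroes the diagonal (self-loops)
print(Z)
```

Without the filter, the diagonal of `unfiltered` would count every author as their own coauthor; with it, only the edges between Human-A and Human-B remain.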


The Clip Filter

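The general purpose of clip is to take a path matrix and “clip”, or normalize, it to a $\{0,1\}^{n \times n}$ matrix.

$$c : \mathbb{R}_{+}^{n \times n} \to \{0,1\}^{n \times n}$$

$$c(Z)_{i,j} = \begin{cases} 1 & \text{if } Z_{i,j} > 0 \\ 0 & \text{otherwise.} \end{cases}$$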

[Figure: c applied to a weighted path matrix; every positive entry becomes 1 and every zero entry stays 0.]


The Clip Filter

If $A, B \in \{0,1\}^{n \times n}$ and $Y, Z \in \mathbb{R}_{+}^{n \times n}$, then

• $c(A) = A$

• $c(n(A)) = n(c(A)) = n(A)$

• $c(Y \circ Z) = c(Y) \circ c(Z)$

• $n(A \circ B) = c(n(A) + n(B))$

• $n(A + B) = n(A) \circ n(B)$
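The clip identities can be spot-checked in the same way as the earlier ones. In this sketch the names `clip` and `not_filter` and the random test matrices are assumptions; the not filter is applied to A + B by treating any non-zero entry as "not zero".

```python
import numpy as np

def clip(Z):
    """1 where Z is positive, 0 elsewhere."""
    return (Z > 0).astype(int)

def not_filter(A):
    """1 where A is 0, 0 elsewhere."""
    return (A == 0).astype(int)

rng = np.random.default_rng(2)
n = 5
A, B = rng.integers(0, 2, size=(2, n, n))                               # binary matrices
Y, Z = rng.uniform(0, 3, size=(2, n, n)) * rng.integers(0, 2, size=(2, n, n))  # sparse weights

identities = {
    "c(A) = A": np.array_equal(clip(A), A),
    "c(n(A)) = n(c(A)) = n(A)": np.array_equal(clip(not_filter(A)), not_filter(A))
                                and np.array_equal(not_filter(clip(A)), not_filter(A)),
    "c(Y o Z) = c(Y) o c(Z)": np.array_equal(clip(Y * Z), clip(Y) * clip(Z)),
    "n(A o B) = c(n(A) + n(B))": np.array_equal(not_filter(A * B),
                                                clip(not_filter(A) + not_filter(B))),
    "n(A + B) = n(A) o n(B)": np.array_equal(not_filter(A + B),
                                             not_filter(A) * not_filter(B)),
}
for name, holds in identities.items():
    print(name, holds)
```

The last two identities are the matrix counterparts of De Morgan's laws, with ∘ playing the role of "and" and clipped addition the role of "or".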


The Clip Filter

Suppose we want to create an author citation path matrix that does not allow self citation or coauthor citations.

$$Z = \underbrace{\left(A^{1} \cdot A^{2} \cdot (A^{1})^{\top}\right)}_{\text{cites}} \circ \underbrace{n\left(c\left(A^{1} \cdot (A^{1})^{\top} \circ n(I)\right)\right)}_{\text{no coauthors}} \circ \underbrace{n(I)}_{\text{no self}}$$

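[Figure: the author-citation path from Human-A through Article-A (authored, $A^{1}$) and Article-B (cites, $A^{2}$) to Human-B (authored, $(A^{1})^{\top}$) is kept in $Z$. The path that ends at coauthor Human-C is removed by $n(c(A^{1} \cdot (A^{1})^{\top} \circ n(I)))$, and the path back to Human-A itself is removed by $n(I)$.]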


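The Clip Filter

However, using various theorems of the algebra,

$$Z = \underbrace{\left(A^{1} \cdot A^{2} \cdot (A^{1})^{\top}\right)}_{\text{cites}} \circ \underbrace{n\left(c\left(A^{1} \cdot (A^{1})^{\top} \circ n(I)\right)\right)}_{\text{no coauthors}} \circ \underbrace{n(I)}_{\text{no self}}$$

becomes

$$Z = \left(A^{1} \cdot A^{2} \cdot (A^{1})^{\top}\right) \circ n\left(c\left(A^{1} \cdot (A^{1})^{\top}\right)\right) \circ n(I).$$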
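The equivalence of the long and reduced forms can be checked numerically. The sketch below uses random binary matrices in place of real authored and cites slices (an assumption made only for testing) and compares the two expressions entry by entry.

```python
import numpy as np

def clip(Z):
    return (Z > 0).astype(int)

def not_filter(A):
    return (A == 0).astype(int)

rng = np.random.default_rng(3)
n = 8
A1 = rng.integers(0, 2, size=(n, n))   # stand-in for the authored slice
A2 = rng.integers(0, 2, size=(n, n))   # stand-in for the cites slice
nI = not_filter(np.eye(n, dtype=int))  # self-loop filter n(I)

cites = A1 @ A2 @ A1.T
coauthor = A1 @ A1.T

# Long form: the coauthor matrix is stripped of self-loops before it is negated.
Z_long = cites * not_filter(clip(coauthor * nI)) * nI
# Reduced form: the inner n(I) is dropped because the outer n(I) already
# zeroes the diagonal of the final result.
Z_short = cites * not_filter(clip(coauthor)) * nI

print(np.array_equal(Z_long, Z_short))
```

The two forms can differ only on the diagonal of the "no coauthors" filter, and the trailing n(I) sets every diagonal entry of Z to zero in both cases.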


The Vertex Filter

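In many cases, it is important to filter out particular paths to and from a vertex.

$$v^{-} : \mathbb{R}_{+}^{n \times n} \to \{0,1\}^{n \times n},$$

$$v^{-}(Z)_{i,j} = \begin{cases} 1 & \text{if } \sum_{k \in V} Z_{k,j} > 0 \\ 0 & \text{otherwise} \end{cases}$$

turns a non-zero column into an all-1 column, and

$$v^{+} : \mathbb{R}_{+}^{n \times n} \to \{0,1\}^{n \times n},$$

$$v^{+}(Z)_{i,j} = \begin{cases} 1 & \text{if } \sum_{k \in V} Z_{i,k} > 0 \\ 0 & \text{otherwise} \end{cases}$$

turns a non-zero row into an all-1 row.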
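A minimal sketch of the two vertex filters follows. It implements the behavior described in the prose (v− produces all-1 columns, v+ produces all-1 rows); the function names `v_minus` and `v_plus` and the example matrix are hypothetical.

```python
import numpy as np

def v_minus(Z):
    """Turn every non-zero column of Z into an all-1 column."""
    col_nonzero = (Z.sum(axis=0) > 0).astype(int)          # one flag per column
    return np.tile(col_nonzero, (Z.shape[0], 1))

def v_plus(Z):
    """Turn every non-zero row of Z into an all-1 row."""
    row_nonzero = (Z.sum(axis=1) > 0).astype(int)          # one flag per row
    return np.tile(row_nonzero[:, None], (1, Z.shape[1]))

# Hypothetical path matrix: only column 1 and rows 0 and 2 are non-zero.
Z = np.array([[0, 3, 0],
              [0, 0, 0],
              [0, 1, 0]])

print(v_minus(Z))
print(v_plus(Z))
```

Summing a column or row is a valid test for "non-zero" only because path matrices have non-negative entries.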


The Vertex Filter

[Figure: $v^{-}$ applied to a weighted path matrix; each column that contains a non-zero entry becomes an all-1 column.]

$v^{+}$ is not diagrammed, but it acts the same way except that it makes 1-rows. Two important filters are the column and row filters, $C \in \{0,1\}^{n \times n}$ and $R \in \{0,1\}^{n \times n}$, respectively.

[Figure: example filters $C_2$ (the column filter for vertex 2) and $R_3$ (the row filter for vertex 3).]


The Vertex Filter

• $v^{-}(C_i) = C_i$

• $v^{+}(R_j) = R_j$

• $v^{-}(Z) = v^{+}(Z^{\top})^{\top}$

• $v^{+}(Z) = v^{-}(Z^{\top})^{\top}$.


The Vertex Filter

Assume that vertex 1 is the social science subject category vertex and we want to create a journal citation network for social science journals only.

$$Z = \underbrace{\left[v^{+}\left(C_1 \circ A^{4}\right) \circ A^{3}\right]}_{\text{soc.sci. journal articles}} \cdot A^{2} \cdot \underbrace{\left[(A^{3})^{\top} \circ v^{-}\left(R_1 \circ (A^{4})^{\top}\right)\right]}_{\text{articles in soc.sci. journals}}.$$

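[Figure: Journal-A, Journal-B, and Journal-C contain Article-A, Article-B, and Article-C; category edges link journals to the Social Science vertex, and cites edges link the articles. The left factor $v^{+}(C_1 \circ A^{4}) \circ A^{3}$, the citation slice $A^{2}$, and the right factor $(A^{3})^{\top} \circ v^{-}(R_1 \circ (A^{4})^{\top})$ combine into a social-science journal citation edge in $Z$.]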


The Vertex Filter

[Figure: step-by-step evaluation of $v^{+}(C_1 \circ A^{4}) \circ A^{3}$ over the vertices S, J-A, J-B, J-C, A-A, A-B, and A-C, showing the matrices $C_1$, $A^{4}$, $C_1 \circ A^{4}$, $v^{+}(C_1 \circ A^{4})$, $A^{3}$, and the result $v^{+}(C_1 \circ A^{4}) \circ A^{3}$.]


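The Vertex Filter

$$Z = \left[v^{+}\left(C_1 \circ A^{4}\right) \circ A^{3}\right] \cdot A^{2} \cdot \underbrace{\left[(A^{3})^{\top} \circ v^{-}\left(R_1 \circ (A^{4})^{\top}\right)\right]}_{\text{articles in soc.sci. journals}}.$$

However,

$$v^{-}\left(R_1 \circ (A^{4})^{\top}\right) = v^{-}\left(\left(C_1 \circ A^{4}\right)^{\top}\right) \qquad \text{because } C_x = R_x^{\top}$$

$$= v^{+}\left(C_1 \circ A^{4}\right)^{\top} \qquad \text{because } v^{+}(Z) = v^{-}(Z^{\top})^{\top}.$$

Therefore, because $A^{\top} \circ B^{\top} = (A \circ B)^{\top}$,

$$Z = \underbrace{\left[v^{+}\left(C_1 \circ A^{4}\right) \circ A^{3}\right]}_{\text{reused}} \cdot A^{2} \cdot \underbrace{\left[v^{+}\left(C_1 \circ A^{4}\right) \circ A^{3}\right]}_{\text{reused}}{}^{\top}.$$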
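The reduction above can be checked on a toy scholarly network. Everything in this sketch is hypothetical: the seven vertices, their edges, and the helper names. The subject category vertex is given index 0 because Python arrays are zero-based.

```python
import numpy as np

def v_minus(Z):
    """Non-zero columns become all-1 columns."""
    return np.tile((Z.sum(axis=0) > 0).astype(int), (Z.shape[0], 1))

def v_plus(Z):
    """Non-zero rows become all-1 rows."""
    return np.tile((Z.sum(axis=1) > 0).astype(int)[:, None], (1, Z.shape[1]))

def column_filter(n, x):
    """C_x: a matrix whose column x is all 1s."""
    C = np.zeros((n, n), dtype=int)
    C[:, x] = 1
    return C

# Hypothetical vertices: 0 S (social science), 1 J-A, 2 J-B, 3 J-C, 4 A-A, 5 A-B, 6 A-C
n = 7
A2 = np.zeros((n, n), dtype=int)   # cites: article -> article
A3 = np.zeros((n, n), dtype=int)   # contains: journal -> article
A4 = np.zeros((n, n), dtype=int)   # category: journal -> subject category
for j, a in [(1, 4), (2, 5), (3, 6)]:
    A3[j, a] = 1
for j in (1, 2):                   # J-A and J-B are social science journals
    A4[j, 0] = 1
for a, b in [(4, 5), (4, 6), (6, 5)]:
    A2[a, b] = 1

C = column_filter(n, 0)
R = C.T                            # row filter for the same vertex

left = v_plus(C * A4) * A3
Z_long = left @ A2 @ (A3.T * v_minus(R * A4.T))
Z_short = left @ A2 @ left.T       # reduced form reuses the left factor
print(Z_short)
```

Only the citation from A-A (in J-A) to A-B (in J-B) connects two social science journals, so the single non-zero entry is `Z[1, 2]`; citations that involve A-C, which sits in J-C, are filtered out.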


The Weight and Merge Filter

• λZ: scalar multiplication weights paths.

• Y + Z: matrix addition merges paths.

[Figure: two weighted path matrices are added entry-wise to produce a merged path matrix.]


The Weight and Merge Filter

$$Z = 0.6 \underbrace{\left(A^{1} \cdot (A^{1})^{\top} \circ n(I)\right)}_{\text{coauthorship}} + 0.4 \underbrace{\left(A^{5} \cdot (A^{5})^{\top} \circ n(I)\right)}_{\text{co-development}}$$

merges the article and software program collaboration path matrices as specified by their respective weights of 0.6 and 0.4. The semantics of the resultant is a software program and article collaboration path matrix that favors article collaboration over software program collaboration. A simplification of the previous composition is

$$Z = \left[0.6 \left(A^{1} \cdot (A^{1})^{\top}\right) + 0.4 \left(A^{5} \cdot (A^{5})^{\top}\right)\right] \circ n(I).$$
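The simplification follows from the distributivity of ∘ over addition, and it can be confirmed numerically. In this sketch random binary matrices stand in for the authored and developed slices; only the weights 0.6 and 0.4 come from the slide.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A1 = rng.integers(0, 2, size=(n, n))   # stand-in for authored
A5 = rng.integers(0, 2, size=(n, n))   # stand-in for developed
nI = 1 - np.eye(n, dtype=int)          # n(I): 0 on the diagonal, 1 elsewhere

# Filter each collaboration matrix separately, then weight and merge.
Z_long = 0.6 * ((A1 @ A1.T) * nI) + 0.4 * ((A5 @ A5.T) * nI)
# Weight and merge first, then apply the self-loop filter once.
Z_short = (0.6 * (A1 @ A1.T) + 0.4 * (A5 @ A5.T)) * nI

print(np.allclose(Z_long, Z_short))
```

The reduced form applies the self-loop filter once instead of twice, which is the kind of saving the algebra's theorems are meant to expose.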


Application to the Real-World

• A can be represented in a standard matrix manipulation package.

• Z can be constructed with the same matrix manipulation package.

• The path matrix Z has a weighted network representation.

Z = (V,E ⊆ (V × V ), λ), where λ : E → R+

• Z can be used in standard network analysis packages.
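One way to hand a path matrix to a network analysis package is to convert it to a weighted edge list, which most packages can read. The sketch below shows that conversion only; the function name `weighted_edges` and the example matrix are hypothetical, and no particular package's import API is assumed.

```python
import numpy as np

def weighted_edges(Z):
    """Return (i, j, weight) for every non-zero entry of the path matrix Z."""
    rows, cols = np.nonzero(Z)
    return [(int(i), int(j), float(Z[i, j])) for i, j in zip(rows, cols)]

# Hypothetical two-vertex path matrix.
Z = np.array([[0.0, 2.0],
              [0.5, 0.0]])
edges = weighted_edges(Z)
print(edges)
```

Each triple corresponds to an edge in E together with its weight λ(i, j).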


The Page Rank Tensor

In the matrix form of PageRank, there exist two adjacency matrices in $[0,1]^{n \times n}$, denoted

$$A^{1}_{i,j} = \begin{cases} \frac{1}{|\Gamma^{+}(i)|} & \text{if } (i,j) \in E \\ 0 & \text{otherwise} \end{cases}$$

and

$$A^{2}_{i,j} = \frac{1}{|V|}.$$

$A^{1}$ is a row-stochastic adjacency matrix and $A^{2}$ is a fully connected adjacency matrix known as the teleportation matrix.


The Page Rank Tensor

The purpose of PageRank is to identify the primary eigenvector of a weighted merged path matrix of the form

$$Z = \left[\delta \cdot A^{1}\right] + \left[(1 - \delta) \cdot A^{2}\right].$$

$Z$ is guaranteed to be a strongly connected single-relational path matrix because there is some probability (defined by $1 - \delta$) that every vertex is reachable by every other vertex.
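The merged matrix and its primary eigenvector can be sketched with power iteration. The four-edge graph, the value δ = 0.85, the iteration count, and the function names are all assumptions made for illustration; the sketch also assumes every vertex has at least one outgoing edge so that the row normalization is defined.

```python
import numpy as np

def pagerank_matrix(adj, delta=0.85):
    """Z = delta * A1 + (1 - delta) * A2 for a graph with no dangling vertices."""
    n = adj.shape[0]
    out_degree = adj.sum(axis=1, keepdims=True)   # |Gamma+(i)| for each vertex i
    A1 = adj / out_degree                         # row-stochastic adjacency matrix
    A2 = np.full((n, n), 1.0 / n)                 # teleportation matrix
    return delta * A1 + (1 - delta) * A2

def primary_eigenvector(Z, iterations=200):
    """Left primary eigenvector of a row-stochastic Z by power iteration."""
    pi = np.full(Z.shape[0], 1.0 / Z.shape[0])
    for _ in range(iterations):
        pi = pi @ Z
    return pi

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [1, 0, 0]])
Z = pagerank_matrix(adj)
rank = primary_eigenvector(Z)
print(rank)
```

Because every row of Z sums to 1 and every entry is positive, the iteration converges to a unique rank vector whose entries sum to 1.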


Conclusion

• Most of graph and network theory is concerned with the design of theorems and algorithms for single-relational networks.

• Given a multi-relational network, you can manipulate a tensor representation of it to yield a “semantically-rich” single-relational network.

• Thus, a multi-relational network can be exposed to the concepts of the single-relational domain.

Rodriguez, M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms”, LA-UR-08-03931, May 2008, http://arxiv.org/abs/0806.2274.


References

[Bavelas, 1950] Bavelas, A. 1950. Communication Patterns in Task Oriented Groups. The Journal of the Acoustical Society of America, 22, 271–282.

[Bollen et al., 2007] Bollen, Johan, Rodriguez, Marko A., & Van de Sompel, Herbert. 2007. MESUR: usage-based metrics of scholarly impact. In: Joint Conference on Digital Libraries (JCDL07). Vancouver, Canada: IEEE/ACM.

[Bonacich, 1987] Bonacich, Phillip. 1987. Power and centrality: A family of measures. American Journal of Sociology, 92(5), 1170–1182.

[Brin & Page, 1998] Brin, Sergey, & Page, Lawrence. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7), 107–117.

[Chartrand, 1977] Chartrand, Gary. 1977. Introductory Graph Theory. Dover.

[Csardi, 2006] Csardi, Gabor. 2006. The igraph software package for complex network research. InterJournal Complex Systems.

[Freeman, 1977] Freeman, L. C. 1977. A set of measures of centrality based on betweenness. Sociometry, 40, 35–41.

[Girvan & Newman, 2002] Girvan, Michelle, & Newman, M. E. J. 2002. Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99, 7821.

[Hagberg et al., n.d.] Hagberg, Aric, Schult, Daniel A., & Swart, Pieter J. NetworkX. https://networkx.lanl.gov.

[Harary & Hage, 1995] Harary, Frank, & Hage, Per. 1995. Eccentricity and centrality in networks. Social Networks, 17, 57–63.

[Kolda et al., 2005] Kolda, Tamara G., Bader, Brett W., & Kenny, Joseph P. 2005. Higher-Order Web Link Analysis Using Multilinear Algebra. In: Proceedings of the Fifth IEEE International Conference on Data Mining ICDM’05. IEEE.

[Newman, 2003] Newman, M. E. J. 2003. Mixing patterns in networks. Physical Review E, 67(2), 026126.

[Newman, 2006] Newman, M. E. J. 2006. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(May).

[O’Madadhain et al., 2005] O’Madadhain, Joshua, Fisher, Danyel, Nelson, Tom, & Krefeldt, Jens. 2005. JUNG: Java Universal Network/Graph Framework.

[Rodriguez, 2008] Rodriguez, Marko A. 2008. Grammar-Based Random Walkers in Semantic Networks. Knowledge-Based Systems, [in press].

