Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining 1
CSE 6240: Web Search and Text Mining. Spring 2020
Graph and Knowledge Graph Representation Learning
Prof. Srijan Kumar
http://cc.gatech.edu/~srijan
Today’s Lecture
• Embedding entire graphs
• Introduction to Knowledge Graphs
• Embeddings in Knowledge Graphs
  – TransE
  – TransR
Embedding Entire Graphs
• Goal: How to embed an entire graph 𝐺?
• Tasks:
  – Classifying toxic vs. non-toxic molecules
  – Identifying anomalous graphs
[Figure: a graph 𝐺 and its embedding $z_G$]
Approach #1
Simple idea:
• Run a standard graph embedding technique on the (sub)graph 𝐺
• Then just sum (or average) the node embeddings in the (sub)graph 𝐺:
  $z_G = \sum_{v \in G} z_v$
• Used by Duvenaud et al., 2015 to classify molecules based on their graph structure
  – Convolutional Networks on Graphs for Learning Molecular Fingerprints. NeurIPS 2015
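The pooling step of Approach #1 can be sketched as follows. This is a minimal illustration, not the method of the cited paper: the helper name `embed_graph` and the toy node embeddings are made up, and the node embeddings are assumed to come from any standard node embedding technique.

```python
import numpy as np

def embed_graph(node_embeddings, mode="sum"):
    """Pool node embeddings (one row per node) into a single graph vector z_G."""
    z = np.asarray(node_embeddings, dtype=float)
    if mode == "sum":
        return z.sum(axis=0)
    if mode == "mean":
        return z.mean(axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Toy 2-dimensional embeddings for a 3-node graph (hypothetical values).
z_nodes = [[1.0, 0.0], [0.0, 2.0], [2.0, 1.0]]
z_sum = embed_graph(z_nodes, "sum")    # sum over nodes
z_mean = embed_graph(z_nodes, "mean")  # average over nodes
```

Summing keeps information about graph size, while averaging makes graphs of different sizes comparable; which one is preferable depends on the task.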
Approach #2
• Idea: Introduce a “virtual node” to represent the (sub)graph and run a standard graph embedding technique
• Proposed by Li et al., 2016 as a general technique for subgraph embedding
  – Gated Graph Sequence Neural Networks. ICLR 2016
Approach #3
• Represent a graph as a distribution/set of walks on that graph
• Anonymous Walk Embeddings:
  – States in an anonymous walk correspond to the index of the first time we visited the node in a random walk
  – Anonymous Walk Embeddings, ICML 2018
Number of Walks Grows
The number of anonymous walks grows exponentially with the walk length:
  – There are 5 anonymous walks $a_i$ of length 3: $a_1 = 111$, $a_2 = 112$, $a_3 = 121$, $a_4 = 122$, $a_5 = 123$
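The growth can be checked by enumeration. The sketch below assumes, following the slide's list, that "length" counts the states in the walk and that a state may repeat immediately (as in 111); the function name `anonymous_walks` is made up for illustration.

```python
def anonymous_walks(length):
    """Enumerate all anonymous walks with `length` states.

    A valid walk starts at state 1; every later state is either a state
    already seen or exactly one more than the largest state so far.
    """
    walks = [(1,)]
    for _ in range(length - 1):
        walks = [w + (s,) for w in walks for s in range(1, max(w) + 2)]
    return walks

walks3 = anonymous_walks(3)  # the five walks 111, 112, 121, 122, 123
counts = [len(anonymous_walks(n)) for n in range(1, 7)]  # counts for lengths 1..6
```

Under these assumptions the counts for lengths 1 to 6 are 1, 2, 5, 15, 52, 203, which is why enumerating all walks is only practical for short lengths.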
Idea #1: Anonymous Walks
• Enumerate all possible anonymous walks $a_i$ of $l$ steps and record their counts
• Represent the graph as a probability distribution over these walks
• For example:
  – Set $l = 3$
  – Then we can represent the graph as a 5-dim vector
    • Since there are 5 anonymous walks $a_i$ of length 3: 111, 112, 121, 122, 123
  – $Z_G[i]$ = probability of anonymous walk $a_i$ in 𝐺
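A minimal sketch of Idea #1, assuming the random walks have already been sampled. The helper names `anonymize` and `walk_distribution` and the four sample walks are hypothetical; in practice the probabilities would be estimated from many sampled walks.

```python
from collections import Counter

def anonymize(walk):
    """Replace each node by the index of its first appearance in the walk."""
    first_seen = {}
    states = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        states.append(first_seen[node])
    return tuple(states)

def walk_distribution(walks, vocabulary):
    """Empirical probability of each anonymous walk in `vocabulary`."""
    counts = Counter(anonymize(w) for w in walks)
    total = sum(counts.values())
    return [counts[a] / total for a in vocabulary]

# The 5 anonymous walks of length 3, in the order listed on the slide.
vocab = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (1, 2, 3)]
# Hypothetical random walks sampled from some graph G.
sampled = [("A", "B", "A"), ("C", "D", "C"), ("A", "B", "C"), ("B", "C", "D")]
z_G = walk_distribution(sampled, vocab)  # 5-dim graph representation
```

Note that walks over different nodes (A-B-A and C-D-C) collapse to the same anonymous walk 121, which is what makes the representation independent of node identities.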
Idea #2: Learn Walk Embeddings
Learn embedding $z_i$ of every anonymous walk $a_i$
• The embedding of a graph 𝐺 is then the sum/avg/concatenation of the walk embeddings $z_i$
Idea #2: Learn Walk Embeddings
How to embed walks?
• Idea: Embed walks such that the next walk starting from the same node can be predicted
  – Set walk embedding $z_i$ such that we maximize $P(w_t^u \mid w_{t-\Delta}^u, \ldots, w_{t-1}^u) = f(z)$
    • Where $w_t^u$ is the $t$-th random walk starting at node $u$
  – Similar to the word2vec idea
Idea #2: Learn Walk Embeddings
• Run $T$ different random walks from $u$, each of length $l$: $N_R(u) = \{a_1^u, a_2^u, \ldots, a_T^u\}$
  – Let $a_i$ be the anonymous version of walk $w_i$
• Learn to predict walks that co-occur in a $\Delta$-size window
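Idea #2: Learn Walk Embeddings
• Estimate the embedding $z_i$ of the anonymous walk $a_i$ of $w_i$:
  $\max \frac{1}{T} \sum_{t=\Delta}^{T} \log P(a_t \mid a_{t-\Delta}, \ldots, a_{t-1})$
  where $\Delta$ is the context window size
• $P(a_t \mid a_{t-\Delta}, \ldots, a_{t-1}) = \frac{\exp(f(a_t))}{\sum_{i=1}^{\eta} \exp(f(a_i))}$, i.e., a softmax over all $\eta$ walks
• $f(a_t) = b + U \cdot \left( \frac{1}{\Delta} \sum_{i=1}^{\Delta} z_i \right)$
  – where $b \in \mathbb{R}$, $U \in \mathbb{R}^D$, and $z_i$ is the embedding of $a_i$ (the anonymized version of walk $w_i$)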
Summary of Graph Embeddings
We discussed three approaches to graph embedding:
• Approach 1: Embed nodes and sum/average them
• Approach 2: Create a super-node that spans the (sub)graph and then embed that node
• Approach 3: Anonymous Walk Embeddings
  – Idea 1: Represent the graph via the distribution over all the anonymous walks
  – Idea 2: Embed anonymous walks
Knowledge Graphs
• Knowledge in graph form
  – Capture entities, types, and relationships
• Nodes are entities
• Nodes are labeled with their types
• Edges between two nodes capture relationships between entities
Example: Bibliographic networks
• Node types: paper, title, author, conference, year
• Relation types: pubWhere, pubYear, hasTitle, hasAuthor, cite
Example: Social networks
• Node types: account, song, post, food, channel
• Relation types: friend, like, cook, watch, listen
Example: Google Knowledge Graph
[Figure: excerpt of the Google Knowledge Graph, including a paintedBy relation]
Knowledge Graphs in Practice
• Google Knowledge Graph
• Amazon Product Graph
• Facebook Graph API
• IBM Watson
• Microsoft Satori
• Project Hanover/Literome
• LinkedIn Knowledge Graph
• Yandex Object Answer
Applications of Knowledge Graphs
• Serving information
Applications of Knowledge Graphs
• Question answering and conversation agents
Knowledge Graph Datasets
• Publicly available KGs:
  – Freebase, Wikidata, DBpedia, YAGO, NELL
• Common characteristics:
  – Massive: millions of nodes and edges
  – Incomplete: many true edges are missing
Given a massive KG, enumerating all the possible facts is intractable!
Can we predict plausible BUT missing links?
Example: Freebase
• Freebase
  – ~50 million entities
  – ~38K relation types
  – ~3 billion facts/triples
• FB15k/FB15k-237
  – A complete subset of Freebase, used by researchers to learn KG models
93.8% of persons from Freebase have no place of birth and 78.5% have no nationality!
[1] Paulheim, Heiko. "Knowledge graph refinement: A survey of approaches and evaluation methods." Semantic Web 8.3 (2017): 489-508.
[2] Min, Bonan, et al. "Distant supervision for relation extraction with an incomplete knowledge base." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013.
Key Task: KG Completion
• Knowledge Graph completion is a link prediction problem
• KG incompleteness can substantially affect the efficiency of systems relying on it
• Main paper: Translating Embeddings for Modeling Multi-relational Data. Bordes, Usunier, Garcia-Duran. NeurIPS 2013.
Key Task: KG Completion
[Figure: a knowledge graph with a missing relation to be predicted]
• Intuition: a link prediction model that learns from local and global connectivity patterns in the KG, taking into account entities and relationships of different types at the same time
• Models: TransE and TransR
Translating Embeddings: TransE
• Relationships between entities = triplets
  – 𝒉 (head entity), 𝒍 (relation, written 𝒓 on later slides), 𝒕 (tail entity) => (ℎ, 𝑙, 𝑡)
• Entities and relations are all embedded in the same entity space $\mathbb{R}^d$
• Relations are represented as translations
  – ℎ + 𝑙 ≈ 𝑡 if the given fact is true; else, ℎ + 𝑙 ≠ 𝑡
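TransE
• Translation intuition:
  – For a triple $(h, r, t)$ with $\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{R}^d$: $\mathbf{h} + \mathbf{r} = \mathbf{t}$
• Score function: $f_r(h, t) = \lVert h + r - t \rVert$
[Figure: $\mathbf{h}$ = Obama, $\mathbf{r}$ = Nationality, $\mathbf{t}$ = American]
NOTATION: embedding vectors will appear in boldface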
Link Prediction in a KG using TransE
• Who has won the Turing award?
• Who is a Canadian citizen?
[Figure: two panels over the entities Hinton, Bengio, Pearl, TuringAward, Canada, Trudeau, and Bieber. One panel uses the relation Win and the other the relation Citizen; in each, a query vector 𝐪 is drawn and a group of entities is marked "Answers!".]
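Answering such a question with TransE amounts to forming a query vector and ranking entities by distance to it. The sketch below is illustrative only: the 2-dimensional embeddings are made-up values, the helper `rank_tails` is hypothetical, and the question is assumed to be posed as the triple (Canada, Citizen, ?).

```python
import numpy as np

def rank_tails(h, r, candidates):
    """Sort candidate tail names by TransE distance ||h + r - t||, closest first."""
    q = h + r  # query vector
    return sorted(candidates,
                  key=lambda name: float(np.linalg.norm(q - candidates[name])))

# Hypothetical embeddings; real ones would be learned from the KG.
canada = np.array([0.0, 0.0])
citizen = np.array([1.0, 0.0])
candidates = {
    "Pearl": np.array([-1.0, 2.0]),
    "Bieber": np.array([0.9, -0.1]),
    "Trudeau": np.array([1.0, 0.1]),
}
ranking = rank_tails(canada, citizen, candidates)  # closest entities first
```

With these toy values the two entities nearest to 𝐪 are Trudeau and Bieber; in practice a distance threshold or a top-k cutoff decides which candidates count as answers.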
TransE Optimization
• Learn embeddings such that ℎ + 𝑙 = 𝑡 for real triplets that exist in the knowledge graph, and ℎ + 𝑙 ≠ 𝑡 for triplets that do not exist
  – Create a positive training set of valid triples
  – Create a negative training set by replacing entities/relations in valid triples
    • Replacement is by random sampling
  – Update embeddings until the distance for positive training triples is minimized and the distance for negative training triples is maximized
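TransE Training
• Translation intuition: for a triple $(h, l, t)$, $\mathbf{h} + \mathbf{l} = \mathbf{t}$
• Max-margin loss:
  $\mathcal{L} = \sum_{(h,l,t) \in G,\ (h',l,t') \notin G} \left[ \gamma + d(h + l, t) - d(h' + l, t') \right]_+$
  where $[x]_+ = \max(0, x)$, the first distance is for the valid triple, the second is for the negative triple, and $\gamma$ is the margin, i.e., the smallest distance tolerated by the model between a valid triple and a corrupted one.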
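A small numeric sketch of the max-margin loss. It assumes Euclidean distance for $d$ and one negative triple paired with each valid triple; the function names and toy vectors are made up for illustration.

```python
import numpy as np

def distance(a, b):
    """Euclidean distance, used here as an assumed choice for d."""
    return float(np.linalg.norm(a - b))

def margin_loss(valid, negative, gamma=1.0):
    """Sum of hinge terms [gamma + d(h + l, t) - d(h' + l', t')]_+ over pairs."""
    total = 0.0
    for (h, l, t), (h_n, l_n, t_n) in zip(valid, negative):
        total += max(0.0, gamma + distance(h + l, t) - distance(h_n + l_n, t_n))
    return total

h = np.array([0.0, 0.0])
l = np.array([1.0, 0.0])
t = np.array([1.0, 0.0])        # valid tail: h + l == t, so distance 0
t_far = np.array([4.0, 0.0])    # corrupted tail at distance 3 from h + l
t_near = np.array([1.5, 0.0])   # corrupted tail at distance 0.5 from h + l

loss_far = margin_loss([(h, l, t)], [(h, l, t_far)])    # margin satisfied
loss_near = margin_loss([(h, l, t)], [(h, l, t_near)])  # margin violated
```

The far corruption contributes no loss because it is already more than $\gamma$ farther away than the valid triple, while the near corruption is penalized by the amount it falls short of the margin.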
TransE Learning Algorithm
[Figure: pseudocode of the TransE learning algorithm, annotated as follows]
• Entities and relations are initialized uniformly and normalized
• Negative sampling uses a triplet that does not appear in the KG
• Comparative loss: favors lower distance values for valid triplets and higher distance values for corrupted ones
  – The loss compares a valid sample against a negative sample
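The annotated steps can be sketched as a small stochastic-gradient loop. This is a simplified illustration under stated assumptions, not the algorithm exactly as published: it uses squared Euclidean distance (so the gradients are simple), corrupts only the head or the tail, updates one triple at a time, and the function name `train_transe`, the hyperparameters, and the toy triples are all hypothetical.

```python
import numpy as np

def train_transe(triples, n_entities, n_relations, dim=8, gamma=1.0,
                 lr=0.01, epochs=200, seed=0):
    """Toy TransE trainer on (head, relation, tail) index triples."""
    rng = np.random.default_rng(seed)
    bound = 6.0 / np.sqrt(dim)
    # Uniform initialization; relations are normalized once.
    E = rng.uniform(-bound, bound, size=(n_entities, dim))
    R = rng.uniform(-bound, bound, size=(n_relations, dim))
    R /= np.linalg.norm(R, axis=1, keepdims=True)
    known = set(triples)
    for _ in range(epochs):
        # Entities are re-normalized at the start of every epoch.
        E /= np.linalg.norm(E, axis=1, keepdims=True)
        for h, l, t in triples:
            # Negative sampling: corrupt head or tail until the triple is not in the KG.
            while True:
                h_n, t_n = h, t
                if rng.random() < 0.5:
                    h_n = int(rng.integers(n_entities))
                else:
                    t_n = int(rng.integers(n_entities))
                if (h_n, l, t_n) not in known:
                    break
            pos = E[h] + R[l] - E[t]        # residual of the valid triple
            neg = E[h_n] + R[l] - E[t_n]    # residual of the corrupted triple
            # Update only when the margin is violated (hinge is active).
            if gamma + pos @ pos - neg @ neg > 0:
                g_pos, g_neg = 2 * pos, 2 * neg
                E[h] -= lr * g_pos
                E[t] += lr * g_pos
                R[l] -= lr * (g_pos - g_neg)
                E[h_n] += lr * g_neg
                E[t_n] -= lr * g_neg
    return E, R

# Toy KG: 4 entities, 1 relation, 2 valid triples.
E, R = train_transe([(0, 0, 1), (2, 0, 3)], n_entities=4, n_relations=1)
```

The normalization of entity embeddings keeps the model from trivially reducing the loss by inflating embedding norms.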
Complex Relational Patterns
• Symmetric relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$
  – Example: Family, Roommate
• Composition relations: $r_1(x, y) \land r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$
  – Example: My mother’s husband is my father.
• 1-to-N, N-to-1 relations: $r(h, t_1), r(h, t_2), \ldots, r(h, t_n)$ are all True.
  – Example: $r$ is “StudentsOf”
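Composition in TransE
• Composition relations: $r_1(x, y) \land r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$
  – Example: My mother’s husband is my father.
• In TransE, compositional relations are possible if $\mathbf{r}_3 = \mathbf{r}_1 + \mathbf{r}_2$
[Figure: $\mathbf{r}_1$ translates $\mathbf{x}$ to $\mathbf{y}$, $\mathbf{r}_2$ translates $\mathbf{y}$ to $\mathbf{z}$, and $\mathbf{r}_3$ translates $\mathbf{x}$ directly to $\mathbf{z}$]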
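The condition follows directly from the translation assumption, treating the TransE relations as holding exactly:

```latex
\begin{aligned}
r_1(x, y) &: \quad \mathbf{y} = \mathbf{x} + \mathbf{r}_1 \\
r_2(y, z) &: \quad \mathbf{z} = \mathbf{y} + \mathbf{r}_2 = \mathbf{x} + (\mathbf{r}_1 + \mathbf{r}_2) \\
r_3(x, z) &: \quad \mathbf{z} = \mathbf{x} + \mathbf{r}_3
  \quad\Longleftrightarrow\quad \mathbf{r}_3 = \mathbf{r}_1 + \mathbf{r}_2
\end{aligned}
```

In a trained model the equalities hold only approximately, so the composition holds up to the residual distances of the individual triples.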
Symmetric Relations in TransE
• Symmetric relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$
  – Example: Family, Roommate
• In TransE, symmetric relations are not possible:
  – For a symmetric relation $r$, whenever $r(h, t)$ holds, $r(t, h)$ must also hold.
  – So $\mathbf{h} + \mathbf{r} - \mathbf{t} = 0$ and $\mathbf{t} + \mathbf{r} - \mathbf{h} = 0$.
  – Adding the two equations gives $2\mathbf{r} = 0$, so $\mathbf{r} = 0$ and $\mathbf{h} = \mathbf{t}$.
  – However, ℎ and 𝑡 are two different entities and should be mapped to different locations.
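Limitation: N-ary Relations
• 1-to-N, N-to-1, N-to-N relations
  – Example: $(h, r, t_1)$ and $(h, r, t_2)$ both exist in the knowledge graph, e.g., $r$ is “StudentsOf”
• In TransE, $t_1$ and $t_2$ will map to the same vector, although they are different entities:
  – $\mathbf{t}_1 = \mathbf{h} + \mathbf{r} = \mathbf{t}_2$
  – $\mathbf{t}_1 \neq \mathbf{t}_2$, which is contradictory!
• In TransE, N-ary relations are not possible
[Figure: the same translation $\mathbf{r}$ applied to $\mathbf{h}$ would have to reach both $\mathbf{t}_1$ and $\mathbf{t}_2$]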
Solution: TransR
• Learn embeddings for entities and relations in separate spaces
  – Model entities as vectors in the entity space $\mathbb{R}^d$
  – Model a relation as a vector $\mathbf{r}$ in the relation space $\mathbb{R}^k$
• Learn a relation-specific transformation from the entity space to the relation space, one per relation
  – Train $\mathbf{M}_r \in \mathbb{R}^{k \times d}$ as the projection matrix for relation $r$
• Reference: “Learning entity and relation embeddings for knowledge graph completion.” Lin et al. AAAI 2015.
TransR Formulation
• $h_r = M_r h$, $t_r = M_r t$
• $f_r(h, t) = \lVert h_r + r - t_r \rVert$
  – instead of $f_r(h, t) = \lVert h + r - t \rVert$
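The TransR score can be sketched in a few lines. The dimensions ($d = 3$, $k = 2$), the projection matrix, and the vectors below are made-up values chosen so the arithmetic is easy to follow; `transr_score` is a hypothetical helper name.

```python
import numpy as np

def transr_score(h, r, t, M_r):
    """TransR distance ||M_r h + r - M_r t||; lower means more plausible."""
    h_r = M_r @ h  # project head into the relation space
    t_r = M_r @ t  # project tail into the relation space
    return float(np.linalg.norm(h_r + r - t_r))

# Hypothetical projection from entity space R^3 to relation space R^2.
M_r = np.array([[1.0, 0.0, 1.0],
                [0.0, 2.0, 0.0]])
h = np.array([1.0, 1.0, 0.0])      # M_r h = [1, 2]
t = np.array([2.0, 1.0, 1.0])      # M_r t = [3, 2]
r = np.array([2.0, 0.0])           # h_r + r = [3, 2] = t_r
t_bad = np.array([0.0, 0.0, 0.0])  # M_r t_bad = [0, 0]

score_good = transr_score(h, r, t, M_r)     # translation matches exactly
score_bad = transr_score(h, r, t_bad, M_r)  # norm of [3, 2]
```

Setting `M_r` to the identity (with $k = d$) recovers the TransE score, which is one way to see TransR as a generalization.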
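Symmetric Relations in TransR
• Symmetric relations: $r(h, t) \Rightarrow r(t, h) \ \forall h, t$
  – Example: Family, Roommate
• For TransR, we can learn $\mathbf{M}_r$ to map ℎ and 𝑡 to the same location in the space of relation 𝑟:
  $\mathbf{r} = 0$, $h_r = M_r h = M_r t = t_r$ ✓
[Figure: $\mathbf{M}_r$ projects $\mathbf{h}$ and $\mathbf{t}$ onto the same point $h_r = t_r$ in the relation space]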
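N-ary Relations in TransR
• 1-to-N, N-to-1, N-to-N relations
  – Example: $(h, r, t_1)$ and $(h, r, t_2)$ both exist in the knowledge graph.
• We can learn $M_r$ so that $t_r = M_r t_1 = M_r t_2$, even though $t_1$ does not need to be equal to $t_2$!
[Figure: $\mathbf{t}_1$ and $\mathbf{t}_2$ are projected onto the same point $\mathbf{t}_r$, which is reached from $\mathbf{h}_r$ by the translation $\mathbf{r}$]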
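A toy numeric check of this argument, with made-up vectors: a projection that discards the third coordinate sends two distinct tails to the same point, so both triples can score zero. The matrix and embeddings are illustrative assumptions, not learned values.

```python
import numpy as np

def transr_score(h, r, t, M_r):
    """TransR distance ||M_r h + r - M_r t||."""
    return float(np.linalg.norm(M_r @ h + r - M_r @ t))

# Hypothetical projection R^3 -> R^2 that drops the third coordinate.
M_r = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
h = np.array([0.0, 0.0, 0.0])
r = np.array([1.0, 1.0])
t1 = np.array([1.0, 1.0, 5.0])   # two different tails that differ
t2 = np.array([1.0, 1.0, -2.0])  # only in the discarded coordinate

score1 = transr_score(h, r, t1, M_r)
score2 = transr_score(h, r, t2, M_r)
same_projection = bool(np.allclose(M_r @ t1, M_r @ t2))
distinct_entities = not np.allclose(t1, t2)
```

Both triples fit perfectly while the two tails remain distinct in the entity space, which is exactly what TransE cannot represent.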
Limitation: Composition in TransR
• Composition relations: $r_1(x, y) \land r_2(y, z) \Rightarrow r_3(x, z) \ \forall x, y, z$
  – Example: My mother’s husband is my father.
• Each relation has a different space.
• TransR is not naturally compositional for multiple relations! ✗
Translation-Based KG Embedding

Embedding   Entity         Relation                    f_r(h, t)
TransE      h, t ∈ ℝ^d     r ∈ ℝ^d                     ||h + r − t||
TransR      h, t ∈ ℝ^d     r ∈ ℝ^k, M_r ∈ ℝ^{k×d}      ||M_r h + r − M_r t||

Embedding   Symmetry   Composition   One-to-many
TransE      ✗          ✓             ✗
TransR      ✓          ✗             ✓