Aprendizado de Máquina e Inferência em Grafos de...

Post on 23-May-2020


Machine Learning and Inference in Knowledge Graphs
(Aprendizado de Máquina e Inferência em Grafos de Conhecimento)

Daniel N. R. da Silva, Artur Ziviani, Fabio Porto

{dramos,ziviani,fporto}@lncc.br
Laboratório Nacional de Computação Científica (LNCC)

Simpósio Brasileiro de Bancos de Dados (SBBD), October 2019

• Big Data Management
• Computational Reproducibility
• Knowledge Bases
• Machine Learning
• Network Science
• Scientific Workflows

http://dexl.lncc.br/

Outline

• Introduction and Motivation
• Applications
• Data Models and Systems
• Relational Machine Learning
• Knowledge Graph Embeddings
• Events, Softwares, and Datasets
• Open Research

Introduction and Motivation

The third (current) rise of AI

• Medical Image Analysis
• Online Advertisement
• Web Search
• Drug Discovery and Toxicology
• Customer Relationship Management
• Medical Informatics
• Board Games
• Image Captioning
• Natural Language Processing
• Self-Driving Vehicles
• Recommendation Systems
• Robotics

Deng, Li. Artificial Intelligence in the Rising Wave of Deep Learning: The Historical Path and Future Outlook [Perspectives]. IEEE Signal Processing Magazine. 2018.

Knowledge Representation and Reasoning (KRR)

How to symbolically encode human knowledge and reasoning in such a way that this encoded knowledge can be processed by a computer via encoded reasoning to obtain intelligent behavior.

Chein, M. and Mugnier, M-L. Graph-based Knowledge Representation: Computational Foundations of Conceptual Graphs. Springer. 2008.

• Surrogate
• Set of Ontological Commitments
• Fragmentary Theory of Intelligent Reasoning
• Medium for Efficient Computation
• Medium of Human Expression

• 300s AD: Porphyry's Commentaries
• 1960s: Semantic Networks
• 1980s: Ontology
• 1998: The Semantic Web
• 2006: Linked Open Data
• 2012: Knowledge Graph

https://www.gartner.com/smarterwithgartner/5-trends-appear-on-the-gartner-hype-cycle-for-emerging-technologies-2019/

Graph DBMS

https://db-engines.com/en/ranking_categories

Linked Open Data

https://lod-cloud.net/

Knowledge Graph

• Network representation for knowledge;
• Heterogeneous data integration;
• Constrained by an ontology or data schema;
• AI applications.

Knowledge Graph

• Terminological Component:
  • C: Concepts
  • RC: Concept Relations
  • TC: Relationships between concepts
  • TC->A: Attributive relationships
  • A: Attributes
  • V: Attribute Values
• Assertional Component:
  • E: Entities
  • RE: Entity Relations
  • TE: Relationships between entities
  • TE->C: Instantiation relationships
  • TE->V: Attributive relationships

Related Terms

• Knowledge base: Set of sentences (assertions) expressed in a language called a knowledge representation language (semantics and syntax);

• (Graph) Database: Storage and data organization;

• Ontology: Formal explicit description of concepts in a domain of discourse. It provides sharable and reusable knowledge.

• Knowledge-Based System: Keeps a knowledge base and performs reasoning.

Noy, N. and McGuinness, D. Ontology development 101: A guide to creating your first ontology. 2001.
Ehrlinger, L. and Wöß, W. Towards a definition of knowledge graphs. SEMANTiCS. 2016.

Related Fields

• Artificial Intelligence
• Data Management
• Machine Learning
• Natural Language Processing
• Data Integration
• Knowledge Representation and Reasoning
• Data Modeling
• Information Retrieval

NELL

• General-purpose KGs
• Bio & Medical KGs
• Common-sense KGs & NLP
• Product Graph & E-commerce KGs

Big Techs

• Apple Siri Knowledge Graph
• Amazon Product Graph
• Facebook Graph
• Google Knowledge Graph
• IBM Watson
• Microsoft Satori

Knowledge Graphs

            # Entities    # Triples        # Concepts   # Relations
DBpedia     4 298 433     411 885 960      736          2 819
Freebase    49 947 799    3 124 791 156    53 092       70 902
OpenCyc     41 029        2 412 520        116 822      18 028
Wikidata    18 697 897    748 530 833      302 280      1 874
YAGO        5 130 031     1 001 461 792    569 751      106

Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019.

Knowledge Graphs

[Figure: radar chart comparing DBpedia, Freebase, OpenCyc, Wikidata, and YAGO on a 0 to 1 scale across the dimensions Accuracy, Trustworthiness, Consistency, Relevancy, Completeness, Timeliness, Ease of understanding, Interoperability, Accessibility, Licensing, and Interlinking.]

Färber, M. and Rettinger, A. Which Knowledge Graph Is Best for Me? ArXiv. 2019.
Färber, M. et al. Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semantic Web. 2018.

DBpedia

• Crowd-sourced community effort to extract structured content from Wikimedia projects;
• Served as Linked Data;
• Ontology, including persons, places, creative works, organizations, species, and diseases;
• 125 languages;
• Links to YAGO and Wikipedia.

Lehmann, J. et al. DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. SWJ. 2012.
https://dbpedia.org/

Structure in Wikipedia

• Title
• Abstract
• Infoboxes
• Geo-coordinates
• Categories
• Images
• Links:
  • to other language versions;
  • to other Wikipedia pages;
  • to the web;
  • redirections;
  • disambiguations.

YAGO

• YAGO (Yet Another Great Ontology).
• Derived from Wikipedia, WordNet, and Geonames.
• 95% accuracy on sample facts.
• Temporal Dimension:
  • before, after, during, and overlaps.
• Location Dimension:
  • northOf, eastOf, southOf, westOf, nearby, and locatedIn.
• Multilingual facts.

Rebele, T. et al. YAGO - A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames. ISWC. 2016.
https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/
https://gate.d5.mpi-inf.mpg.de/webyago3spotlxComp/SvgBrowser/

[Figure: knowledge graph pipeline, from data sources to applications.]

• Structured Data. Ex.: Relational databases.
• Semi-structured Data. Ex.: XML and JSON documents.
• Non-structured Data. Ex.: Text, audio, image, and video files.
• Knowledge Base Construction. Ex.: Entity and relation extraction.
• Knowledge Graph
• Refinement: Completion and Correction
• Applications
• Quality concerns: Completeness, Freshness, Correctness

Example

Text fragment:
Marie Curie was born on November 7, 1867, in Warsaw, Poland. She was a naturalized French physicist and chemist who conducted pioneering research in the field of radioactivity.

Extracted triples:
(Marie Curie, date-of-birth, 07/11/1867)
(Marie Curie, born-in, Poland)
(Marie Curie, nationality, French)
(Marie Curie, profession, Physicist)
(Marie Curie, profession, Chemist)
(Marie Curie, research-field, Radioactivity)

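[Figure: the extracted facts drawn as a graph around the node Marie Curie. Nodes: Marie Curie, Polish, French, Physicist, Chemist, Radioactivity, and “07/11/1867”. Edge labels: born-in, nationality, profession (twice), research-field, and date-of-birth.]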

Challenges

• Coverage/Completeness: Have we got the information we need?
• Correctness: Is the information accurate?
• Freshness: Is the information up to date?

Gao, Yuqing. Building a Large-scale, Accurate and Fresh Knowledge Graph. KDD Tutorial 2018.

Applications

Applications

• Conversational Agents;
• Data integration;
• Fact checking and fake news detection;
• Question Answering;
• Recommendation Systems;
• Search Engines.

Search Engines


Recommendation Systems

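[Figure: knowledge graph for movie recommendation. Nodes: the user; the movies Cast Away, Back to the Future, Saving Private Ryan, Blade Runner, Forrest Gump, and Jurassic Park; the people Tom Hanks, Robert Zemeckis, Harrison Ford, and Steven Spielberg; and the genres Drama, SciFi, and Action. Edge types: starred-in, director-of, and genre. The legend distinguishes movies the user has watched from recommended movies.]

Wang, H. et al. Exploring High-Order User Preference on the Knowledge Graph for Recommender Systems. ACM TOIS. 2019.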

Conversational Agents

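[Figure: chatbot architecture. The user talks to a chatbot built from Natural Language Processing, Query Generation, Context Management, Domain Knowledge, and Natural Language Generation components.]

John Doe: @bot How much money do I have in my account?
Bot: @johndoe You have $0.99 in your bank account.

[Figure: knowledge graph fragment with the nodes John Doe, JDAccount, Account Balance, and $0.99, connected by the edges isA, has (twice), and owns.]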

Question Answering

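Which films directed by Steven Spielberg won the Oscar Award?

[Figure: dependency parse of the question over the tokens Which, film, directed, Steven Spielberg, won, and Oscar Award, with the dependency labels det, nsubj (twice), dep, and nmod:by.]

[Figure: query graph for the question. The unknown nodes (?) are connected to Film by isA, to Steven Spielberg by directed, and to Oscar Award by won.]

Zheng, W. et al. Question Answering Over Knowledge Graphs: Question Understanding Via Template Decomposition. PVLDB. 2019.
https://nlp.stanford.edu/software/lex-parser.shtml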

Fact Checking and Fake News Detection

Claim: Barack Obama secretly practices Islam.

[Figure: path in a knowledge graph connecting Barack Obama to Islam through the nodes Columbia University, Association of American Universities, Canada, Stephen Harper, Calgary, and Naheed Nenshi.]

Ciampaglia, G. L. Computational Fact Checking from Knowledge Networks. PLOS ONE. 2015.

Personalized Medicine

iASiS: Big Data to Support Precision Medicine and Public Health Policy. 2017.

Vidal, M-E et al. Semantic Data Integration of Big Biomedical Data for Supporting Personalised Medicine. Springer. 2019.

Literome

• Extract genomic knowledge from PubMed articles;
• Knowledge pertinent to genomic medicine:
  • Directed genic interactions such as pathways;
  • Genotype–phenotype associations.

Poon, H. Literome: PubMed-scale genomic knowledge base in the cloud. Bioinformatics. 2014.

Project HANOVER

• Machine reading for accelerating “Curation as a Service” (CaaS) in precision medicine:
  • Molecular Tumor Board:
    • Cancer is a thousand diseases driven by disparate genetic mutations.
  • Real-world evidence:
    • An FDA-approved drug takes over a decade and costs more than $2 billion.
  • Clinical trial matching:
    • 20% of clinical trials fail due to insufficient patients.

https://hanover.azurewebsites.net/

ResearchSpace

• Semantic Web / Linked Data application;
• Museums, libraries, archives: knowledge or memory institutions;
• CIDOC-CRM:
  • Conceptual Reference Model.

Oldman, D. and Tanase, D. Reshaping the Knowledge Graph by Connecting Researchers, Data and Practices in ResearchSpace. ISWC. 2018.
https://www.researchspace.org

Drug Discovery

Kanza, S. and Frey, J. G. A new wave of innovation in Semantic web tools for drug discovery. Expert Opinion on Drug Discovery. 2019.

[Figure: stages from Drug Discovery Data to New Drug Discovery: Drug Discovery RDF, Drug Discovery Ontologies, Drug Discovery Knowledge Bases, Semantic Search for Drug Discovery, Drug Discovery Knowledge Graphs, and Intelligent Applications for DD, with Human Curation alongside.]

Data Models and Systems

Data Models

• Labeled Property Graph:
  • Node-oriented.
• Resource Description Framework (RDF):
  • Triple-oriented.

Labeled Property Graph

• Graph Databases:
  • Ex.: Amazon Neptune, Neo4j, JanusGraph, and TigerGraph.
• Query languages:
  • Cypher, G-CORE, GS, Gremlin, and PGQL.

https://aws.amazon.com/neptune/
https://neo4j.com/
https://janusgraph.org/
https://www.tigergraph.com/

Labeled Property Graph

Nodes:
- One or more labels
- Set of properties

Edges:
- Edge type
- Set of properties

n1: Person, firstName = “João”, lastName = “Silva”
n2: Person, firstName = “Maria”, lastName = “Santos”
n3: Tag, label = “italian”
n4: Tag, label = “food”
n5: Post, text = “I love italian food.”, lang = “en”

e1: likes, date = 2019-09-05
e2: creates, date = 2019-09-04
e3: has
e4: has
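The labeled property graph above can be sketched with two small record types. This is only an illustration, not the storage model of any particular graph database: the class names, the `neighbors` helper, and the edge endpoints (who likes, who creates, which post has which tag) are assumptions, since the slide's drawing is not recoverable from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set                                      # one or more labels
    properties: dict = field(default_factory=dict)   # set of properties


@dataclass
class Edge:
    edge_id: str
    edge_type: str                                   # exactly one type per edge
    source: str
    target: str
    properties: dict = field(default_factory=dict)


nodes = {
    "n1": Node("n1", {"Person"}, {"firstName": "João", "lastName": "Silva"}),
    "n2": Node("n2", {"Person"}, {"firstName": "Maria", "lastName": "Santos"}),
    "n3": Node("n3", {"Tag"}, {"label": "italian"}),
    "n4": Node("n4", {"Tag"}, {"label": "food"}),
    "n5": Node("n5", {"Post"}, {"text": "I love italian food.", "lang": "en"}),
}

# Assumed endpoints: n1 creates the post, n2 likes it, the post has both tags.
edges = [
    Edge("e1", "likes", "n2", "n5", {"date": "2019-09-05"}),
    Edge("e2", "creates", "n1", "n5", {"date": "2019-09-04"}),
    Edge("e3", "has", "n5", "n3"),
    Edge("e4", "has", "n5", "n4"),
]


def neighbors(node_id, edge_type):
    """Return the nodes reached from node_id over outgoing edges of one type."""
    return [nodes[e.target] for e in edges
            if e.source == node_id and e.edge_type == edge_type]


post_tags = sorted(n.properties["label"] for n in neighbors("n5", "has"))
```

Unlike an RDF triple, an edge here carries its own properties (the `date` on `likes` and `creates`), which is the main modeling difference between the two data models.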

RDF

• Nodes: resources, literals, or blank nodes:
  • Subjects (resource or blank node);
  • Objects (resource, literal, or blank node).
• Edges: predicates (resource).
• Literals:
  • Represent data values;
  • Encoded as strings;
  • Interpreted according to a datatype.
• Triple: (Subject, Predicate, Object).

RDF

• IRI/URI (Internationalized/Uniform Resource Identifier)

• Serialization: JSON-LD, N-Triples, RDF/XML, and Turtle

• Query Languages: SPARQL.

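@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Albert Einstein is a person born on 1879-03-14.
dbr:Albert_Einstein rdf:type foaf:Person ;
    dbo:birthDate "1879-03-14"^^xsd:date .

[Figure: the same two triples as a graph. dbr:Albert_Einstein is linked by rdf:type to foaf:Person and by dbo:birthDate to the literal 1879-03-14 of type xsd:date.]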
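To make the triple-oriented view concrete, the sketch below stores the two Einstein triples as Python tuples and answers a single triple pattern, which is the basic building block of a SPARQL query. It is a toy illustration, not an RDF library: the `match` helper is hypothetical, and prefixed names are kept as plain strings instead of being expanded to IRIs.

```python
# Each triple is (subject, predicate, object); prefixed names stay unexpanded.
triples = [
    ("dbr:Albert_Einstein", "rdf:type", "foaf:Person"),
    ("dbr:Albert_Einstein", "dbo:birthDate", "1879-03-14"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None plays the role of a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly: SELECT ?date WHERE { dbr:Albert_Einstein dbo:birthDate ?date }
birth_dates = [o for (_, _, o) in match(triples, s="dbr:Albert_Einstein", p="dbo:birthDate")]
```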

Systems for Knowledge Bases

• Grakn
• Ontotext GraphDB
• Amazon Neptune

Grakn

• Hyper-relational deductive database:
  • Entity and relationship types reasoning;
  • Rule-based reasoning.
• Core:
  • Built on top of JanusGraph, Apache TinkerPop, Apache Hadoop, Apache Cassandra, and Apache Spark.
• Data model based on the Entity-Relationship Model and hypergraphs;
• Graql:
  • OLTP, OLAP.

https://grakn.ai/

[Figure: Grakn concept hierarchy. Concept has the subtypes Type, Rule, and Thing; the types are EntityType, AttributeType, and RelationType; the things are Entity, Attribute, and Relation.]

https://dev.grakn.ai/docs/concept-api/overview

define
  name sub attribute, datatype string;
  person sub entity, has name, plays person_;
  friend sub relation, relates person_;
  # Friendship is transitive: a friend of my friend is my friend.
  is_friend sub rule,
  when {
    (person_: $p1, person_: $p2) isa friend;
    (person_: $p2, person_: $p3) isa friend;
  }, then {
    (person_: $p1, person_: $p3) isa friend;
  };

insert
  $p1 isa person, has name "A";
  $p2 isa person, has name "B";
  $p3 isa person, has name "C";

# Assert the friendships A-B and B-C explicitly.
match
  $p1 isa person, has name "A";
  $p2 isa person, has name "B";
  $p3 isa person, has name "C";
insert
  $r1 (person_: $p1, person_: $p2) isa friend;
  $r2 (person_: $p2, person_: $p3) isa friend;
commit

# Retrieve all friendships, including the inferred ones.
match
  $p1 isa person; $p2 isa person;
  $r (person_: $p1, person_: $p2) isa friend;
get
  $r;

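[Figure: resulting graph. The person instances own the names “A”, “B”, and “C” through @has relations (roles owner and value); friend relations connect persons through the person_ role, and friend relations derived by the rule are marked as Inferred.]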

Ontotext GraphDB

• Triplestore:
  • RDFS, SPARQL, and RDF4J.
• Query Optimizer;
• Reasoner (TRREE engine);
• Storage: Entity Pool.

http://graphdb.ontotext.com
http://graphdb.ontotext.com/documentation/free/architecture-components.html

Amazon Neptune

• Graph service and database;
• Amazon Web Services;
• Labeled property graph:
  • Apache TinkerPop Gremlin.
• RDF:
  • SPARQL;
  • Quads (subject, predicate, object, graph).

https://aws.amazon.com/neptune/
https://www.slideshare.net/AmazonWebServices/new-launch-amazon-neptune-overview-and-customer-use-cases-dat319-reinvent-2017

Knowledge Graph Tasks

Knowledge Graph Tasks

• Automated Construction
• Refinement:
  • Completion
  • Correction
• Analytics

Automated Construction

Knowledge Base/Graph Construction

• Entity and Relation Extraction:
  • Named Entity Recognition
  • Entity Alignment
• Systems:
  • DeepDive
  • Fonduer
  • Snorkel

Named Entity Recognition

Entity types highlighted: Person, State, City.

Pedro II of Brazil was the second and last monarch of the Empire of Brazil. He was born in Rio de Janeiro to Emperor Pedro I of Brazil and Empress Maria Leopoldina, who were married at that time.

Entity Linking

Wikidata identifiers linked to the mentions: Q156774, Q939, Q84239, Q8678, Q217230.

Pedro II of Brazil was the second and last monarch of the Empire of Brazil. He was born in Rio de Janeiro to Emperor Pedro I of Brazil and Empress Maria Leopoldina, who were married at that time.

Entity Linking

Abrams, R. Google Thinks I’m Dead (I know otherwise.) 2017.
https://www.nytimes.com/2017/12/16/business/google-thinks-im-dead.html?_r=0

Relation Extraction

Pedro II of Brazil was the second and last monarch of the Empire of Brazil. He was born in Rio de Janeiro to Emperor Pedro I of Brazil and Empress Maria Leopoldina, who were married at that time.

[Figure: graph over the linked entities Q156774, Q8678, Q939, Q84239, and Q217230, with the extracted relations spouse, son (twice), born-in, and emperor-of.]

DeepDive

[Figure: DeepDive pipeline. Data Input, Candidate Mapping and Feature Extraction, Supervision, Learning and Inference, and Output (e.g., a Spouse table with the columns Spouse1 and Spouse2 and the rows (Tom, Rita) and (Sarah, Matthew)), with an Error Analysis feedback loop.]

Zhang, C. et al. DeepDive: Declarative Knowledge Base Construction. Communications of the ACM. 2017.
http://deepdive.stanford.edu/kbc

DeepDive

• Feature Extractors: Populate the database using a set of SQL queries and UDFs. By default, one sentence per row using NLP.
• Candidate Mapping: SQL queries to produce possible mentions, entities, and relations.
• Supervision: Evidence relation built by hand labeling and distant supervision.
• Learning and Inference: Factor graph similar to Markov Logic; Gibbs sampling.

Fonduer

[Figure: Fonduer pipeline in three phases. Data Input, KBC Initialization, Candidate Generation, Multimodal Featurization & Multimodal LSTM, Supervision & Classification, and Output (e.g., a DateOfBirth table with the columns Person and Date and the rows (Madonna, 16/08/1958) and (Gisele B., 20/07/1980)). User input: schema, matchers and throttlers, and labeling functions. An Error Analysis loop feeds back into the user input.]

Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.
https://www.snorkel.org/

Fonduer

• Candidate Generation (apply UDFs: matchers and throttlers):
  • Matchers: Return mentions of entities.
  • Throttlers: Decrease the number of relationship candidates.
• Multimodal Featurization: Associate textual, structural, visual, and tabular features with relationship candidates.
• Supervision and Classification:
  • Labeling Function: Yields a label (-1, 0, or 1) for each candidate.
  • Data Programming: Estimates the true label for each candidate.
  • Multimodal BiLSTM: Estimates the true label for each candidate considering features.

Fonduer

Candidate Generation (apply UDFs: matchers and throttlers)
• Matchers: Return mentions of entities.
• Throttlers: Decrease the number of relationship candidates.

Matchers:
- DateMatcher()
- DictionaryMatch()
- LambdaFunctionMatcher()
- LambdaFunctionFigureMatcher()
- LocationMatcher()
- MiscMatcher()
- NumberMatcher()
- OrganizationMatcher()
- PersonMatcher()
- RegexMatchEach()
- RegexMatchSpan()

https://spacy.io/

Procedure:
- For each s_i in schema R(s_1, ..., s_n):
  - Define a set of matchers;
  - Define the mention space.
- Cartesian product to yield candidates.
- Define throttlers:
  - Operate on candidates.

Wu, S. et al. Fonduer: Knowledge Base Construction from Richly Formatted Data. SIGMOD. 2018.
https://fonduer.readthedocs.io/en/latest/

Data Programming

Estimate the ground truth based on the agreements and disagreements of the labeling functions.

      Candidates                              Labeling Functions   Ground Truth   Prob
      Spouse1             Spouse2             L1    L2    L3       Y
c1    Tom Hanks           Rita Wilson          1     1    -1       ?              .75
c2    Matthew Broderick   Sarah J. Parker      0     1     1       ?              .80
c3    Brad Pitt           Jennifer Aniston     0     1     0       ?              .42

Ratner, A. J. et al. Data programming: Creating large training sets, quickly. NIPS. 2016.
Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.
https://www.snorkel.org/
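As a rough illustration of how labeling-function votes are combined, the sketch below applies a plain majority vote to the three candidates in the table. This is a deliberately simpler baseline than data programming, which fits a generative model over the labeling functions and outputs probabilities such as the Prob column; the function names are hypothetical.

```python
# Votes of the labeling functions L1, L2, L3 per candidate, copied from the table.
# 1 = positive, -1 = negative, 0 = abstain.
votes = {
    "c1": [1, 1, -1],
    "c2": [0, 1, 1],
    "c3": [0, 1, 0],
}


def majority_vote(labels):
    """Return 1, -1, or 0 (tie or all abstain); abstentions do not count."""
    total = sum(labels)
    return (total > 0) - (total < 0)


def labeling_propensity(votes, lf_index):
    """Fraction of candidates on which one labeling function does not abstain."""
    column = [labels[lf_index] for labels in votes.values()]
    return sum(1 for v in column if v != 0) / len(column)


predictions = {candidate: majority_vote(labels) for candidate, labels in votes.items()}
l1_propensity = labeling_propensity(votes, 0)
```

Majority vote treats every labeling function as equally reliable; the generative model instead learns per-function accuracies from the agreement and disagreement patterns, which is why c1 and c3 can receive different probabilities even though both get a positive majority here.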

Data Programming

[Figure: factor graph connecting the labeling functions L1, L2, and L3 to the true label Y, with the weights w1 to w9.]

Factors (1{·} is the indicator function):
• Accuracy: 1{Li == Y}
• Labeling propensity: 1{Li != 0}
• Pairwise correlation: 1{Li == Lj}

Ratner, A. J. et al. Data programming: Creating large training sets, quickly. NIPS. 2016.
Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.
https://www.snorkel.org/

Snorkel

• Data labeling:
  • Labeling functions to heuristically or noisily label some subset of the training examples;
• Data augmentation:
  • Transformation functions to heuristically generate new, modified training examples by transforming existing ones;
• Slicing:
  • Slicing functions to heuristically identify subsets of the data the model should particularly care about.

Ratner, A. J. et al. Snorkel: rapid training data creation with weak supervision. VLDB Journal. 2019.
https://www.snorkel.org/

Multimodal BiLSTM

Wu, S. et al. Fonduer: Knowledge Base Construction from Richly Formatted Data. SIGMOD. 2018.
https://fonduer.readthedocs.io/en/latest/

Relational Machine Learning

Knowledge Graph Completion

Results of ranking tasks are given as (ranking, answer, score).

Task                      Query example                     Result example
Triple Classification     (Einstein, died-in, USA)?         (Yes, 90%)
Link Prediction:
  Tail Prediction         (Elvis Presley, starred-in, ?)    (1, Blue Hawaii, 3.23), (2, Change of Habit, 3.12), ...
  Head Prediction         (?, starred-in, Casablanca)       (1, Humphrey Bogart, 2.21), (2, Ingrid Bergman, 2.01), ...
  Relation Prediction     (Einstein, ?, Germany)            (1, born-in, 5.01), (2, died-in, 1.23), ...
Attribute Prediction      (Obama, nationality, ?)           (1, American, 2.21), (2, Kenyan, 1.02), ...
Entity Classification     (Michael Jackson, isA, ?)         (1, singer, 6.20), (2, composer, 5.22), ...

Relational Machine Learning

• Probabilistic Graphical Models (PGMs)
• Graphical Feature Models (GFMs)
• Latent Feature Models (LFMs):
  • Knowledge Graph Embedding.

Nickel, M. et al. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE. 2016.

Model: $\phi_\theta : E \times R_E \times E \to Y$, where $\theta$ denotes the parameters, $E \times R_E \times E$ the possible triples, and $Y$ the score.

Supervised Learning

[Figure: supervised learning. Labeled examples of squares, triangles, and circles are used to train a predictive model, which then predicts the label (Square, Triangle, or Circle) of unseen shapes.]

[Figure: knowledge graph with the entities João, Maria, and Lucas and the relations Parent, Spouse, and Child.]

Possible triples: not observed + observed triples

(João, Spouse, João)    (João, Spouse, Maria)    (João, Spouse, Lucas)
(João, Parent, João)    (João, Parent, Maria)    (João, Parent, Lucas)
(João, Child, João)     (João, Child, Maria)     (João, Child, Lucas)
(Maria, Spouse, João)   (Maria, Spouse, Maria)   (Maria, Spouse, Lucas)
(Maria, Parent, João)   (Maria, Parent, Maria)   (Maria, Parent, Lucas)
(Maria, Child, João)    (Maria, Child, Maria)    (Maria, Child, Lucas)
(Lucas, Spouse, João)   (Lucas, Spouse, Maria)   (Lucas, Spouse, Lucas)
(Lucas, Parent, João)   (Lucas, Parent, Maria)   (Lucas, Parent, Lucas)
(Lucas, Child, João)    (Lucas, Child, Maria)    (Lucas, Child, Lucas)

Query: (Lucas, Child, João)?
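The space of possible triples is simply the Cartesian product of entities, relations, and entities. The short sketch below enumerates it for the three-entity example; the set of observed triples is a hypothetical choice made for illustration, because the edges of the slide's drawing are not recoverable from the transcript.

```python
from itertools import product

entities = ["João", "Maria", "Lucas"]
relations = ["Spouse", "Parent", "Child"]

# Every (subject, relation, object) combination: |E| * |R| * |E| candidates.
possible_triples = list(product(entities, relations, entities))

# Hypothetical observed triples (an assumption, not read off the slide).
observed = {
    ("João", "Spouse", "Maria"),
    ("Maria", "Spouse", "João"),
    ("João", "Parent", "Lucas"),
    ("Maria", "Parent", "Lucas"),
    ("Lucas", "Child", "Maria"),
}

# Candidates a completion model has to score, such as (Lucas, Child, João).
unobserved = [t for t in possible_triples if t not in observed]
```

The candidate space grows as |E|² · |R|, which is why real systems score only a sampled or filtered subset of it.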

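[Figure: supervised learning over triples. Candidate triples for the relations Spouse, Child, and Parent are labeled 1 (observed) or 0 (not observed) and used to train a predictive model, which then labels an unseen Parent triple with 1.]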

Tensor Factorization Formulation

Based on observed relationships, does an unseen relationship exist?

[Figure: the knowledge graph as a three-way tensor with the axes Entities × Entities × Relations.]

Relational Machine Learning

• Probabilistic Graphical Models (PGMs): The existence of a triple depends on the existence of other triples.
• Graphical Feature Models (GFMs): The existence of a triple is conditionally independent of the existence of other triples given model parameters and observed features in the graph.
• Latent Feature Models (LFMs): The existence of a triple is conditionally independent of the existence of other triples given model parameters and latent features.

Nickel, M. et al. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE. 2016.

PGMs

• Knowledge Graph: 2 entities (E1 and E2), 2 possible relations, 8 possible triples.
• Dependency Graph: 8 nodes (one for each possible triple), 28 possible pairwise interdependencies.
• Yi is a random variable denoting the existence of the i-th triple.
• A PGM models the distribution over possible worlds: Prob(Y1, Y2, Y3, Y4, Y5, Y6, Y7, Y8).

[Figure: dependency graph over the nodes Y1 to Y8.]

Markov Logic Networks

• Syntax: Weighted first-order formulas;
• Semantics: Templates for Markov networks;
• Inference: Logical and probabilistic;
• Learning: Statistical and Inductive Logic Programming.

Markov Logic Networks

• Set of pairs (Fi, wi):
  • Fi: first-order logic formula;
  • wi: a real number (weight);
  • ni: number of satisfied groundings of Fi in y (a possible world).
• PGM:
  • One node for each ground atom;
  • Edges between nodes appearing in the same ground formula.

Richardson, M. and Domingos, P. Markov Logic Networks. Machine Learning. 2006.

Markov Logic Networks

Knowledge Base

Rules:
∀x Smokes(x) → Cancer(x)
∀x∀y Friends(x,y) → (Smokes(x) ↔ Smokes(y))

Facts:
Ana(A)
João(J)

Ground atoms: Friends(A, A), Friends(A, J), Friends(J, A), Friends(J, J), Smokes(A), Smokes(J), Cancer(A), Cancer(J).

$P(Y = y) = \frac{1}{Z} \exp\left(\sum_i w_i \, n_i(y)\right)$, where $w_i$ is the weight of rule (feature) $F_i$, $n_i(y)$ is the number of times the rule is satisfied in the world $y$, and $Z$ is the normalization constant.

Richardson, M. and Domingos, P. Markov Logic Networks. Machine Learning. 2006.
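The smoking example is small enough to enumerate every possible world. The sketch below counts the satisfied groundings of each rule, weights them, and normalizes over all 2^8 worlds, following the log-linear form P(Y = y) = exp(Σ w_i n_i(y)) / Z from Richardson and Domingos. The weights 1.5 and 1.1 and all function names are illustrative assumptions; real systems avoid this brute-force enumeration and rely on approximate inference.

```python
import math
from itertools import product

CONSTANTS = ["A", "J"]  # Ana and João

# The 8 ground atoms of the example.
ATOMS = (
    [("Smokes", c) for c in CONSTANTS]
    + [("Cancer", c) for c in CONSTANTS]
    + [("Friends", x, y) for x in CONSTANTS for y in CONSTANTS]
)

W_CANCER, W_FRIENDS = 1.5, 1.1  # hypothetical rule weights


def n_smoking_causes_cancer(world):
    """Satisfied groundings of: Smokes(x) -> Cancer(x)."""
    return sum((not world[("Smokes", x)]) or world[("Cancer", x)] for x in CONSTANTS)


def n_friends_smoke_alike(world):
    """Satisfied groundings of: Friends(x,y) -> (Smokes(x) <-> Smokes(y))."""
    return sum(
        (not world[("Friends", x, y)]) or (world[("Smokes", x)] == world[("Smokes", y)])
        for x in CONSTANTS
        for y in CONSTANTS
    )


def unnormalized(world):
    """exp(sum_i w_i * n_i(y)) for one possible world y."""
    return math.exp(
        W_CANCER * n_smoking_causes_cancer(world)
        + W_FRIENDS * n_friends_smoke_alike(world)
    )


# A possible world assigns a truth value to every ground atom.
worlds = [dict(zip(ATOMS, values)) for values in product([False, True], repeat=len(ATOMS))]
Z = sum(unnormalized(w) for w in worlds)


def probability(world):
    return unnormalized(world) / Z


all_false = dict.fromkeys(ATOMS, False)
```

A world that violates a grounding is not impossible, only less probable: each violated grounding of a rule with weight w lowers the unnormalized score by a factor of exp(w).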

Graphical Feature Models

• Graph observed features
• Similarity indices
• Rule mining
• Inductive Logic Programming

Path Ranking Algorithm

• Random walks of bounded length;
• Feature extraction:
  • Build feature vectors for triples;
  • Vectors are based on relation paths $P_j(R_1, \ldots, R_n)$.
• Training:
  • Train an off-the-shelf machine learning model.

Lao, N., Mitchell, T. and Cohen, W. W. Random walk inference and learning in a large scale knowledge base. EMNLP. 2011.

Relation Paths

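[Figure: graph with the nodes A, B, and C, the edges (A, R1, B), (B, R2, C), and (A, R3, C), and the inverse edges $R_1^{-1}$, $R_2^{-1}$, and $R_3^{-1}$.]

Relation paths of length 2: $P_1(R_1, R_2)$, $P_2(R_2^{-1}, R_1^{-1})$, $P_3(R_3, R_2^{-1})$, $P_4(R_3^{-1}, R_1)$, $P_5(R_2, R_3^{-1})$, $P_6(R_1^{-1}, R_3)$.

Probabilities of reaching C from A by following the given relation paths: [1/4, 0, 0, 0, 0, 0]

Nickel, M. et al. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE. 2016.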

Enumeration of 2-length paths and the feature vector associated with (Luísa, Parent, Bruna):

P1(Sp, Pa)   P2(Pa^-1, Sp^-1)   P3(Pa, Ch)   P4(Ch^-1, Pa^-1)   P5(Ch, Sp)   P6(Sp^-1, Ch^-1)
0            0                  0            0                  0            1/4

Ch: Child, Pa: Parent, Sp: Spouse, Si: Sibling

[Figure: family graph with the nodes João, Ana, Maria, Lucas, Luísa, Bruna, and Rafael, connected by the relations Ch, Pa, Sp, and Si and their inverses.]

Nickel, M. et al. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE. 2016.


Path Ranking Algorithm

• Logistic regression: $\phi_\theta(s,r,o) := \sigma(\langle \theta, f(s,r,o) \rangle)$
• $\theta$: model parameter vector;
• $\sigma(x) = 1 / (1 + \exp(-x))$;
• $f(s,r,o)$: feature vector of $(s,r,o)$;
• $y(s,r,o)$: 1 iff $(s,r,o)$ is in the knowledge graph.
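The scoring function is an ordinary logistic regression over path features, so it fits in a few lines. In the sketch below the feature vector is the one shown for (Luísa, Parent, Bruna); the parameter vector `theta` is a made-up example, since in the Path Ranking Algorithm it would be learned from observed triples.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pra_score(theta, features):
    """phi_theta(s, r, o) = sigmoid(<theta, f(s, r, o)>)."""
    return sigmoid(sum(t * f for t, f in zip(theta, features)))


# Path features P1..P6 for (Luísa, Parent, Bruna), as on the slide.
features = [0, 0, 0, 0, 0, 0.25]

# Hypothetical learned weights: only the sixth path is informative here.
theta = [0.0, 0.0, 0.0, 0.0, 0.0, 4.0]

score = pra_score(theta, features)          # sigmoid(4.0 * 0.25) = sigmoid(1.0)
uninformed = pra_score([0.0] * 6, features)  # all-zero weights give 0.5
```

Because each weight is tied to one relation path, the learned model can be read as a set of weighted path rules.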

[Figure: graph with the nodes n1 to n8 whose engineered features feed a machine learning model for node classification.]

Feature Engineering:
- Engineered features to measure local neighborhoods;
- Graph kernels;
- Summary statistics.

Knowledge Graph Embeddings

Word Embeddings in NLP

[Figure: word embeddings in vector space, showing the word pairs man/woman and king/queen, walking/walked and swimming/swam, and the country/capital pairs Spain/Madrid, Italy/Rome, and Brasil/Brasilia.]

Bengio, Y. et al. Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013.

Embeddings

Word     Word Embedding
I        [0.19,...,0.87]
like     [0.09,...,0.46]
apple    [0.97,...,0.22]

Semantic Triples:
(Steve Jobs, founder, Apple)
(Bill Gates, founder, Microsoft)
(Paris, capital-of, France)

Entity/Relation   Embedding
Apple             [0.92,...,0.12]
Bill Gates        [0.08,...,0.46]
capital-of        [0.67,...,0.37]
founder           [0.12,...,0.41]
France            [0.65,...,0.87]
Microsoft         [0.78,...,0.21]
Paris             [0.98,...,0.46]
Steve Jobs        [0.12,...,0.69]

Graph Representation Learning

Epasto, A. Innovations in Graph Representation Learning. 2019.
https://ai.googleblog.com/2019/06/innovations-in-graph-representation.html

Knowledge Graph Embedding (KGE)

Embed components of a Knowledge Graph, including entities and relations, into continuous vector spaces, so as to simplify the manipulation while preserving the inherent structure of the KG.

Wang, Q. et al. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Transactions on Knowledge and Data Engineering. 2017.

KGE Models

[Figure: a knowledge graph with the relations plays-in, owns, acted-in, born-in, and died-in, mapped to entity embeddings and relation embeddings.]

KGE Models

• Translational Distance Models:
  • Score functions based on distance.
• Semantic Matching Models:
  • Score functions based on similarity.
• Multiplicative, additive, and neural approaches.

Model requirements

• Expressiveness: Model KG properties and regularities as much as possible (symmetry, antisymmetry, inversion, composition, etc.).
• Space/time complexity: $O(N_E)$, $O(N_R)$, $O(N_T)$ in time and memory.

Anatomy of a KGE Model

• Knowledge Graph (KG);
• Negative triples generation strategy;
• Scoring function for a triple;
• Loss function;
• Optimization algorithm.
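One common negative triples generation strategy is to corrupt an observed triple by replacing its subject or object with a random entity and rejecting candidates that are already known to be true. The sketch below shows that idea on the triples used earlier in the deck; the function name, the 50/50 choice between subject and object, and the rejection loop are illustrative assumptions, not a prescription from the slides.

```python
import random

ENTITIES = ["Steve Jobs", "Apple", "Bill Gates", "Microsoft", "Paris", "France"]

KNOWN = {
    ("Steve Jobs", "founder", "Apple"),
    ("Bill Gates", "founder", "Microsoft"),
    ("Paris", "capital-of", "France"),
}


def corrupt(triple, entities, known, rng):
    """Replace the subject or the object with a random entity.

    Candidates that are known positives (including the original triple)
    are rejected, so the result is never an observed fact.
    """
    s, r, o = triple
    while True:
        e = rng.choice(entities)
        candidate = (e, r, o) if rng.random() < 0.5 else (s, r, e)
        if candidate not in known:
            return candidate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
negative = corrupt(("Paris", "capital-of", "France"), ENTITIES, KNOWN, rng)
```

A negative produced this way is only assumed false: under the open-world assumption an unobserved triple may still be true, which is one reason the choice of sampling strategy affects the learned embeddings.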

TransE

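Bordes, A. et al. Translating Embeddings for Modeling Multi-relational Data. NIPS. 2013.

[Figure: TransE models a relation as a translation in the embedding space, $v_s + v_r \approx v_o$, e.g., Brasilia + capital-of ≈ Brasil. Annotations: hierarchical structure; 1-to-1 relation as translation (NLP).]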

TransE

• Constraints on entity embeddings:
  • Prevent learning trivial representations.
• Limitations on dealing with 1-N, N-1, and N-M relations:
  • $v_{\text{Meryl Streep}} + v_{\text{starred-in}} = v_{\text{Death Becomes Her}}$
  • $v_{\text{Meryl Streep}} + v_{\text{starred-in}} = v_{\text{Doubt}}$
  • Both movies are pushed toward the same embedding, although they are very different: cast, genre, director, etc.

Bordes, A. et al. Translating Embeddings for Modeling Multi-relational Data. NIPS. 2013.
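The translation idea can be written as a distance-based score: a triple is plausible when the subject embedding plus the relation embedding lands close to the object embedding. The sketch below uses the negative Euclidean norm as the score (TransE is defined with either the L1 or the L2 norm); the two-dimensional vectors are made-up values chosen so that the true triple scores exactly zero.

```python
import math


def transe_score(v_s, v_r, v_o):
    """Negative L2 distance between v_s + v_r and v_o; higher is more plausible."""
    return -math.sqrt(sum((s + r - o) ** 2 for s, r, o in zip(v_s, v_r, v_o)))


# Hypothetical 2-dimensional embeddings.
brasilia = [1.0, 2.0]
capital_of = [0.5, -1.0]
brasil = [1.5, 1.0]   # equals brasilia + capital_of, so the distance is 0
italy = [3.0, 3.0]

true_score = transe_score(brasilia, capital_of, brasil)
false_score = transe_score(brasilia, capital_of, italy)
```

The 1-N limitation follows directly from this form: if the same subject and relation must reach two different objects with distance zero, both objects are forced onto the same point.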

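TransH

• Project to relation-specific hyperplanes.

[Figure: the entity embeddings $v_s$ and $v_o$ are projected onto the hyperplane with normal vector $w_r$, where $\|w_r\|_2 = 1$, giving $v_{s\perp}$ and $v_{o\perp}$; the translation $v_r$ acts between the projections.]

Wang, Z. et al. Knowledge Graph Embedding by Translating on Hyperplanes. AAAI. 2014.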
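Projecting onto a relation-specific hyperplane amounts to removing the component along the unit normal $w_r$: $v_\perp = v - (w_r^\top v)\, w_r$. The sketch below applies that projection before the translation and shows that two different object embeddings can both score perfectly for the same subject and relation, which plain translation cannot do. All vectors are made-up illustrative values.

```python
import math


def project(v, w):
    """Project v onto the hyperplane with unit normal w: v - (w . v) w."""
    dot = sum(a * b for a, b in zip(v, w))
    return [a - dot * b for a, b in zip(v, w)]


def transh_score(v_s, v_r, v_o, w_r):
    """Negative L2 distance between the projected subject plus v_r and the projected object."""
    p_s, p_o = project(v_s, w_r), project(v_o, w_r)
    return -math.sqrt(sum((s + r - o) ** 2 for s, r, o in zip(p_s, v_r, p_o)))


w_r = [0.0, 0.0, 1.0]        # unit normal of the relation's hyperplane
v_s = [1.0, 0.0, 0.0]
v_r = [0.0, 1.0, 0.0]        # translation vector lying in the hyperplane
movie_a = [1.0, 1.0, 5.0]    # the two objects differ only along w_r
movie_b = [1.0, 1.0, -3.0]

score_a = transh_score(v_s, v_r, movie_a, w_r)
score_b = transh_score(v_s, v_r, movie_b, w_r)
```

Both objects keep distinct embeddings in the full space while sharing the same projection, so a 1-N relation no longer forces them to coincide.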

Translational Distance Models

- KG2E;
- ManifoldE;
- SE;
- STransE;
- TransD;
- TransE;
- TransF;
- TransG;
- TransH;
- TransM;
- TransR;
- TranSparse;
- UM.

Wang, Q. et al. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Transactions on Knowledge and Data Engineering. 2017.
Nguyen, D. Q. An overview of embedding models of entities and relationships for knowledge base completion. arXiv. 2019.

RESCAL

[Figure: RESCAL score computed from the subject embedding (vs1, vs2), the relation matrix (w11, w12, w21, w22), and the object embedding (vo1, vo2).]

Too many parameters!

RESCAL

• DistMult:
  • Wr as a diagonal matrix;
  • Cannot deal with asymmetric relations.
• ComplEx:
  • Introduces complex-valued embeddings;
  • Score based on the real part of the product of the embeddings.

Nickel, M. et al. A three-way model for collective learning on multi-relational data. ICML. 2011.
Yang, B. et al. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. ICLR. 2015.
Trouillon, T. et al. Knowledge Graph Completion via Complex Tensor Factorization. 2017.
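The difference between the two bilinear models is easy to check numerically. RESCAL scores a triple as $v_s^\top W_r v_o$ with a full matrix per relation, while DistMult restricts $W_r$ to a diagonal, which makes the score invariant to swapping subject and object. The vectors and matrices in this sketch are made-up values.

```python
def rescal_score(v_s, W_r, v_o):
    """Bilinear score v_s^T W_r v_o with a full relation matrix."""
    return sum(
        v_s[i] * W_r[i][j] * v_o[j]
        for i in range(len(v_s))
        for j in range(len(v_o))
    )


def distmult_score(v_s, w_r, v_o):
    """Same bilinear form with a diagonal relation matrix, stored as a vector."""
    return sum(s * w * o for s, w, o in zip(v_s, w_r, v_o))


v_s, v_o = [1.0, 2.0], [3.0, 4.0]

W_full = [[0.0, 1.0],
          [0.0, 0.0]]       # d * d parameters per relation
w_diag = [0.5, 2.0]         # d parameters per relation

rescal_forward = rescal_score(v_s, W_full, v_o)    # 1 * 1 * 4 = 4
rescal_backward = rescal_score(v_o, W_full, v_s)   # 3 * 1 * 2 = 6
distmult_forward = distmult_score(v_s, w_diag, v_o)
distmult_backward = distmult_score(v_o, w_diag, v_s)
```

DistMult always gives (s, r, o) and (o, r, s) the same score, so it cannot separate the two directions of an asymmetric relation such as born-in; ComplEx recovers asymmetry by moving to complex-valued embeddings.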

Analogy

Man is to king what woman is to queen: king - man + woman = queen.

[Figure: parallelogram over Man, King, Woman, and Queen, with the edges labeled royal (Man to King, Woman to Queen) and female (Man to Woman, King to Queen); the known edges help to predict the remaining one.]

Ethayarajh, K. et al. Towards Understanding Linear Word Analogies. ACL. 2019.
Liu, H. et al. Analogical Inference for Multi-relational Embeddings. PMLR. 2017.

Analogy

• Commutative constraints.
• Normality constraints.

Liu, H. et al. Analogical Inference for Multi-relational Embeddings. PMLR. 2017.

SimplE

• Canonical Polyadic Decomposition;
• Two vectors for each entity:
  • One as subject and another as object.
• Score function annotations: element-wise product; trilinear product.

Kazemi, S. M. and Poole, D. SimplE Embedding for Link Prediction in Knowledge Graphs. NIPS. 2018.

Shallow and Deep Models

[Figure: lookup table of size # Entities × Embedding Dimension; the rows of ei and ej are looked up to compute the plausibility of the existence of (ei, rk, ej).]

Shallow models:
- Representation expressiveness depends on the embedding dimension;
- Transductiveness.

Deep models:
- Overfitting;
- Time and space.

ConvE

Dettmers, T. et al. Convolutional 2D Knowledge Graph Embeddings. AAAI. 2018.

1D Convolution (no padding, stride 1):
Input:  a1 a2 a3 a4 b1 b2 b3 b4
Filter: f1 f2 f3
Example output: f1*b2 + f2*b3 + f3*b4

2D Convolution (padding 1, stride 1):
Input:
a1 a2
a3 a4
b1 b2
b3 b4
Filter:
f1 f2
f3 f4
Example outputs: f1*0 + f2*0 + f3*0 + f4*a1 and f1*a4 + f2*0 + f3*b2 + f4*0
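The two convolutions can be reproduced with a few lines of plain Python. The sketch assumes the layout suggested by the example expressions: for the 1D case the two embeddings are concatenated into one sequence, and for the 2D case each embedding is reshaped to 2 × 2 and the two are stacked into a 4 × 2 "image" with zero padding of 1. As in deep learning libraries, "convolution" here means cross-correlation (the filter is not flipped). The numeric values are arbitrary.

```python
def conv1d(signal, kernel):
    """1D cross-correlation, no padding, stride 1."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv2d(image, kernel, padding=0):
    """2D cross-correlation with zero padding, stride 1."""
    rows, cols = len(image), len(image[0])
    padded = [[0] * (cols + 2 * padding) for _ in range(rows + 2 * padding)]
    for r in range(rows):
        for c in range(cols):
            padded[r + padding][c + padding] = image[r][c]
    kr, kc = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[i][j] * padded[r + i][c + j] for i in range(kr) for j in range(kc))
            for c in range(len(padded[0]) - kc + 1)
        ]
        for r in range(len(padded) - kr + 1)
    ]


a = [1, 2, 3, 4]   # a1..a4
b = [5, 6, 7, 8]   # b1..b4

# 1D: filter (f1, f2, f3) = (1, 10, 100) slides over the concatenation.
out_1d = conv1d(a + b, [1, 10, 100])

# 2D: a and b reshaped to 2 x 2 and stacked; filter [[f1, f2], [f3, f4]].
image = [[1, 2], [3, 4], [5, 6], [7, 8]]
out_2d = conv2d(image, [[1, 10], [100, 1000]], padding=1)
```

With these values the last 1D output is f1*b2 + f2*b3 + f3*b4, `out_2d[0][0]` is f4*a1, and `out_2d[2][2]` is f1*a4 + f3*b2. Only the 2D filter mixes a4 with b2, entries that are far apart in the concatenated vector; this extra interaction between the two embeddings is the motivation Dettmers et al. give for 2D convolution.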

ConvE

[Figure: ConvE architecture. The embeddings $v_s$ and $v_r$ (embedding dropout .25) are concatenated into an “image”, convolved into feature maps (feature map dropout .25), projected by a fully connected layer (hidden dropout .25), multiplied with the entity matrix to produce logits, and passed through a logistic sigmoid to yield predictions (e.g., .3, .9, .1, .5).]

Dettmers, T. et al. Convolutional 2D Knowledge Graph Embeddings. AAAI. 2018.

Convolution Neural Network


Graph Convolution

Generalize the operation of convolution from grid data to graph data.

Main idea: Learn a representation for a node taking into account its neighbors' representations.

Wu, Z. et al. A Comprehensive Survey on Graph Neural Networks. ArXiv. 2019.

Differentiable message-passing framework

$h_u^{(l+1)} = \sigma\left(\sum_{m \in M_u} g_m\left(h_u^{(l)}, h_v^{(l)}\right)\right)$

- $h_u^{(l+1)}$: hidden state of node $u$ at layer $l+1$, a real vector of dimension $d^{(l+1)}$;
- $h_u^{(l)}$, $h_v^{(l)}$: hidden states at layer $l$;
- $M_u$: incoming messages for node $u$, often chosen to be identical to the set of incoming edges;
- $\sigma$: element-wise activation function;
- $g_m$ may be $W h_v$.

Kipf, T. N. and Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. ICLR. 2017.
Schlichtkrull, M. et al. Modeling Relational Data with Graph Convolutional Networks. ESWC. 2018.

Relational Graph Convolutional Networks

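Schlichtkrull, M. et al. Modeling Relational Data with Graph Convolutional Networks. ESWC. 2018.

$h_e^{(l+1)} = \sigma\left(\sum_{r \in R} \sum_{j \in N_e^r} \frac{1}{c_{e,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_e^{(l)}\right)$

- $N_e^r$: r-neighborhood of entity $e$;
- $W_0^{(l)}$: self-relation specific matrix;
- $W_r^{(l)}$: relation specific matrix;
- $c_{e,r}$: relation specific constant.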
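A single R-GCN layer can be sketched in plain Python to show how the relation-specific terms combine. The sketch assumes the propagation rule of Schlichtkrull et al., with the normalization constant set to the size of the r-neighborhood (one common choice), ReLU as the activation, and messages flowing only along incoming edges; the original model also adds inverse relations, which are left out here. The graph, matrices, and names are made-up.

```python
def matvec(W, h):
    """Matrix-vector product for small dense matrices stored as lists of rows."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def relu(v):
    return [max(0.0, x) for x in v]


def rgcn_layer(h, edges, W_rel, W_self):
    """One R-GCN layer.

    h:      dict mapping node -> hidden vector at layer l
    edges:  list of (source, relation, target); messages flow source -> target
    W_rel:  dict mapping relation -> relation-specific matrix
    W_self: matrix of the self-connection
    """
    out = {}
    for node, h_node in h.items():
        total = matvec(W_self, h_node)                      # self-relation term
        for rel, W in W_rel.items():
            neighbors = [s for (s, r, t) in edges if r == rel and t == node]
            if not neighbors:
                continue
            c = len(neighbors)                              # c_{e,r} = |N_e^r|
            for s in neighbors:
                message = matvec(W, h[s])
                total = [a + m / c for a, m in zip(total, message)]
        out[node] = relu(total)
    return out


h0 = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "r1", "c"), ("b", "r1", "c"), ("c", "r2", "a")]
W_rel = {
    "r1": [[1.0, 0.0], [0.0, 1.0]],   # identity
    "r2": [[0.0, 1.0], [1.0, 0.0]],   # swaps the two coordinates
}
W_self = [[1.0, 0.0], [0.0, 1.0]]

h1 = rgcn_layer(h0, edges, W_rel, W_self)
```

Each relation owns a full matrix in this sketch, which is exactly the parameter growth that basis and block-diagonal decompositions are meant to limit.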

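[Figure: R-GCN update of a node h. The representations of its neighbors (h1 to h6) are multiplied by relation-specific matrices, one per relation and direction (In, Out), summed together with the self-connection term, and passed through a ReLU.]

[Figure: link prediction pipeline. Input, R-GCN encoder (1st layer, 2nd layer, ..., nth layer), decoder (edge score), and loss.]

Last layer representations taken as latent features.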

Relational Graph Convolutional Networks

• Rapid growth in the number of parameters:
  • Overfitting (rare relations);
  • Models of very large size.
• Weight regularization:
  • Basis decomposition: linear combination of basis transformations.
  • Block-diagonal decomposition: direct sum over a set of low-dimensional matrices.
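The two regularization schemes can be sketched as follows, assuming the forms used in Schlichtkrull et al.: W_r = Σ_b a_rb V_b for the basis decomposition and W_r = Q_1r ⊕ ... ⊕ Q_Br for the block-diagonal decomposition. The sizes, values, and function names are hypothetical.

```python
import numpy as np

def basis_weights(coeffs, bases):
    """coeffs: (n_rel, n_bases); bases: (n_bases, d_out, d_in) -> (n_rel, d_out, d_in)."""
    return np.einsum("rb,bij->rij", coeffs, bases)

def block_diagonal_weight(blocks):
    """Direct sum of small square blocks into one block-diagonal matrix."""
    size = sum(b.shape[0] for b in blocks)
    W = np.zeros((size, size))
    start = 0
    for b in blocks:
        k = b.shape[0]
        W[start:start + k, start:start + k] = b
        start += k
    return W

n_rel, n_bases, d = 100, 4, 16              # hypothetical sizes
rng = np.random.default_rng(0)
coeffs = rng.normal(size=(n_rel, n_bases))  # a_rb: the only relation-specific parameters
bases = rng.normal(size=(n_bases, d, d))    # V_b: shared across all relations
W_rel = basis_weights(coeffs, bases)

full_params = n_rel * d * d                         # one dense matrix per relation
basis_params = n_bases * d * d + n_rel * n_bases    # shared bases plus coefficients

W_block = block_diagonal_weight([np.ones((2, 2)), 2.0 * np.ones((2, 2))])
```

With these sizes the dense parameterization needs 25,600 weights and the basis decomposition 1,424, which illustrates why rare relations benefit: they only have to fit a handful of coefficients.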

Graph Neural Networks

• Convolutional Graph Neural Networks
• Graph Autoencoders
• Recurrent Graph Neural Networks
• Spatial-temporal Graph Neural Networks

Wu, Z. et al. A Comprehensive Survey on Graph Neural Networks. ArXiv. 2019.

Attributive Relations

• Literals;
• KGs often include:
  • Numerical attributes (e.g., ages, dates, financial and geographic information);
  • Textual attributes (e.g., names, descriptions, and titles);
  • Images (e.g., profile photos, flags, and posters).
• Useful for entities with few relationships;
• Regard attributes as entities.

Gesese, G. A., Biswas, R., and Sack, H. A Comprehensive Survey of Knowledge Graph Embeddings with Literals: Techniques and Applications. ArXiv. 2019.

MKBE

• Multimodal Knowledge Base Embeddings;
• Compositional encoding component:
  • Different neural encoders for the variety of observed data.
• Embedding multimodal data:
  • Structured knowledge, numerical, text, and images.

Pezeshkpour, P. et al. Embedding Multimodal Relational Data for Knowledge Base Completion. EMNLP. 2018.

[Figure: MKBE encoders, each producing an embedding vector. Number: feed-forward layer. Text: stacked bidirectional GRUs (character level) and a CNN over word embeddings (sentence level). Image: VGGNet pretrained on ImageNet.]

[Figure: MKBE scoring of a triple (s, r, o). Entities and relations pass through dense layers; long text, short text, and images pass through their encoders (GRU, CNN, VGGNet); the resulting embeddings feed a score function that outputs the score for the triple (s, r, o).]

Ontology

• Explicitly and implicitly employ ontological knowledge in KGE learning;
• Restrictions on the embedding space to guarantee expressiveness and consistency.

JOIE

• Instance view, ontology view, and cross view;
• Cross-view association model:
  • Cross-view grouping (CG);
  • Cross-view transformation (CT).
• Intra-view model:
  • Default;
  • Hierarchy-aware.
• Specific cost functions.

Hao, J. et al. Universal Representation Learning of Knowledge Bases by Jointly Embedding Instances and Ontological Concepts. KDD. 2019.

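[Figure: JOIE views. Instance view: entities X and Y linked by a relation r. Ontology view: concepts A and B. Cross-view links connect the two views. Edge labels: subclass, isA.]

[Figure: cross-view association. Cross-view grouping: the person entities (Pelé, Adele) and the city entities (Rio, Fortaleza) are embedded in the same space as the concepts Person and City. Cross-view transformation: a mapping fct takes the person and city entities to the concepts Person and City.]

[Figure: hierarchy-aware intra-view model with a "Place" hierarchy and a "Person" hierarchy. Concepts: Place, City, State, Institution, University, Person, Scientist, Musician, Artist. Relations: isA, locatedIn, works-in.]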

Model Training and Evaluation

• Triple classification protocol:
  • Test the model's ability to discriminate between true and false triples.
  • Triple (s,r,o) is classified as positive if its score exceeds a relation-specific decision threshold (learned on validation data).
• Entity ranking protocol:
  • Assess model performance in terms of ranking answers to certain questions.

Wang, Y. et al. On Evaluating Embedding Models for Knowledge Base Completion. RepL4NLP. 2019.
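A minimal sketch of the triple classification protocol, assuming each relation's threshold is chosen to maximize accuracy on validation data and that a triple is positive when its score is at or above the threshold. The selection rule, the scores, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def learn_thresholds(validation):
    """validation: list of (relation, score, label). Returns {relation: threshold}."""
    by_rel = defaultdict(list)
    for rel, score, label in validation:
        by_rel[rel].append((score, label))
    thresholds = {}
    for rel, items in by_rel.items():
        best_acc, best_thr = -1.0, None
        # Candidate thresholds: every score observed for this relation.
        for thr in sorted({s for s, _ in items}):
            acc = sum((s >= thr) == bool(y) for s, y in items) / len(items)
            if acc > best_acc:
                best_acc, best_thr = acc, thr
        thresholds[rel] = best_thr
    return thresholds

def classify(relation, score, thresholds):
    """True if the triple is predicted to hold."""
    return score >= thresholds[relation]

# Made-up validation scores for two relations.
validation = [
    ("born-in", 0.9, 1), ("born-in", 0.7, 1), ("born-in", 0.4, 0), ("born-in", 0.2, 0),
    ("works-in", 0.3, 1), ("works-in", 0.1, 0),
]
thresholds = learn_thresholds(validation)
```

Because scores of different relations are often on different scales, a single global threshold would typically misclassify relations whose scores are systematically low, such as "works-in" in this toy data.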

Evaluation Metrics

• Regression, classification and ranking metrics.


(Mean Reciprocal Rank)

Test triples: (Neymar, born-in, Mogi) and (Kaká, born-in, Gama)
Entities: Kaká, Neymar, Mogi, and Gama
Test triple => corrupted triples: (s, r, o) => (s, r, o') and (s', r, o).

Test triple (Neymar, born-in, Mogi), object replaced:

s       p        o       score  rank
Neymar  born-in  Mogi    0.80   1
Neymar  born-in  Gama    0.70   2
Neymar  born-in  Kaká    0.20   3
Neymar  born-in  Neymar  0.10   4

Test triple (Neymar, born-in, Mogi), subject replaced:

s       p        o       score  rank
Kaká    born-in  Mogi    0.85   1
Neymar  born-in  Mogi    0.80   2
Gama    born-in  Mogi    0.05   3
Mogi    born-in  Mogi    0.04   4

Test triple (Kaká, born-in, Gama), object replaced:

s       p        o       score  rank
Kaká    born-in  Mogi    0.85   1
Kaká    born-in  Gama    0.80   2
Kaká    born-in  Neymar  0.20   3
Kaká    born-in  Kaká    0.10   4

Test triple (Kaká, born-in, Gama), subject replaced:

s       p        o       score  rank
Kaká    born-in  Gama    0.80   1
Neymar  born-in  Gama    0.70   2
Mogi    born-in  Gama    0.05   3
Gama    born-in  Gama    0.04   4

hits@1 = (1 + 0 + 0 + 1)/4 = 0.5
hits@2 = (1 + 1 + 1 + 1)/4 = 1
MRR = (1 + 1/2 + 1/2 + 1)/4 = 0.75
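The worked example can be reproduced in a few lines. This sketch uses the scores listed on the slide; the helper names `rank_of`, `hits_at`, and `mrr` are hypothetical, and ties are assumed not to occur.

```python
def rank_of(true_item, scores):
    """Rank (1 = best) of true_item among all candidates, by descending score."""
    return 1 + sum(1 for s in scores.values() if s > scores[true_item])

# Candidate scores for (Neymar, born-in, o') and (s', born-in, Mogi).
tail_1 = {"Mogi": 0.80, "Gama": 0.70, "Kaká": 0.20, "Neymar": 0.10}
head_1 = {"Kaká": 0.85, "Neymar": 0.80, "Gama": 0.05, "Mogi": 0.04}
# Candidate scores for (Kaká, born-in, o') and (s', born-in, Gama).
tail_2 = {"Mogi": 0.85, "Gama": 0.80, "Neymar": 0.20, "Kaká": 0.10}
head_2 = {"Kaká": 0.80, "Neymar": 0.70, "Mogi": 0.05, "Gama": 0.04}

ranks = [
    rank_of("Mogi", tail_1), rank_of("Neymar", head_1),
    rank_of("Gama", tail_2), rank_of("Kaká", head_2),
]

def hits_at(k, ranks):
    """Fraction of test queries whose true answer is ranked in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)
```

Each test triple contributes two ranks, one for the object query and one for the subject query, which is why the averages are taken over four values.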

Negative Sampling

- Uniformly sample from the entity/relation set.
- Bernoulli sampling with parameters:
  - tph: average number of tail entities per head;
  - hpt: average number of head entities per tail;
  - tph / (tph + hpt) for replacing the head and hpt / (tph + hpt) for replacing the tail.
- Corrupt a position (i.e., head or tail) using only entities that have appeared in that position with the same relation.

Perturb a triple:unusual
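A minimal sketch of Bernoulli negative sampling as described above, combined with uniform sampling of the replacement entity. The toy triples, the rejection of corrupted triples that already occur in the graph, and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

def bernoulli_head_probability(triples, relation):
    """Probability of replacing the head: tph / (tph + hpt) for the relation."""
    tails_per_head, heads_per_tail = defaultdict(set), defaultdict(set)
    for s, r, o in triples:
        if r == relation:
            tails_per_head[s].add(o)
            heads_per_tail[o].add(s)
    tph = sum(len(v) for v in tails_per_head.values()) / len(tails_per_head)
    hpt = sum(len(v) for v in heads_per_tail.values()) / len(heads_per_tail)
    return tph / (tph + hpt)

def corrupt(triple, triples, entities, rng):
    """Replace the head or the tail with a uniformly sampled entity."""
    s, r, o = triple
    known = set(triples)
    p_head = bernoulli_head_probability(triples, r)
    while True:
        e = rng.choice(entities)
        candidate = (e, r, o) if rng.random() < p_head else (s, r, e)
        if candidate not in known:      # assumption: skip known (true) triples
            return candidate

# Toy many-to-one relation: three heads share the tail "x".
triples = [("a", "r", "x"), ("b", "r", "x"), ("c", "r", "x"), ("d", "r", "y")]
entities = ["a", "b", "c", "d", "x", "y"]
p_head = bernoulli_head_probability(triples, "r")   # tph = 1, hpt = 2
negative = corrupt(("a", "r", "x"), triples, entities, random.Random(0))
```

For this many-to-one relation the tail is replaced twice as often as the head, which lowers the chance of generating a false negative: swapping the head of (a, r, x) for b or c would produce a triple that is actually true.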

Pointwise: Square Error Loss, Hinge Loss, Logistic Loss

Pairwise: Hinge Loss, Logistic Loss

[Equation annotations: closed world assumption; score given for triple t; label (0 or 1) for triple t.]
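The loss families named on this slide can be sketched in their common textbook forms. These are assumptions rather than the slide's exact formulas, which may differ in scaling or sign conventions: higher scores are taken to mean more plausible triples, the margin defaults to 1, and 0/1 labels are mapped to -1/+1 for the hinge and logistic variants.

```python
import numpy as np

def pointwise_square_error(scores, labels):
    """0.5 * sum of (score - label)^2, with labels in {0, 1}."""
    return 0.5 * np.sum((scores - labels) ** 2)

def pointwise_hinge(scores, labels, margin=1.0):
    signs = 2 * labels - 1                      # map {0, 1} to {-1, +1}
    return np.sum(np.maximum(0.0, margin - signs * scores))

def pointwise_logistic(scores, labels):
    signs = 2 * labels - 1
    return np.sum(np.log1p(np.exp(-signs * scores)))

def pairwise_hinge(pos_scores, neg_scores, margin=1.0):
    """Penalize negatives that score within the margin of their paired positive."""
    return np.sum(np.maximum(0.0, margin + neg_scores - pos_scores))

def pairwise_logistic(pos_scores, neg_scores):
    return np.sum(np.log1p(np.exp(neg_scores - pos_scores)))

scores = np.array([2.0, -1.0])                  # made-up scores
labels = np.array([1, 0])                       # one true and one false triple
```

Pointwise losses compare each score with its label independently, while pairwise losses only constrain the gap between a true triple and its corrupted counterpart.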

1-N Scoring

[Figure: a pair (s, r) is scored against all entities at once. The vector p holds the probabilities of (s, r, o') being true, and the vector y(s,r) holds the corresponding (s, r, o') labels.]
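A minimal sketch of the loss typically paired with 1-N scoring, assuming binary cross-entropy between the probability vector p and the label vector y(s,r), as in the ConvE paper. The probabilities reuse the values shown in the ConvE architecture figure; the labels and the function name are hypothetical.

```python
import numpy as np

def one_to_n_loss(probabilities, labels):
    """Mean binary cross-entropy over all candidate entities."""
    p = np.clip(probabilities, 1e-12, 1.0 - 1e-12)   # avoid log(0)
    return -np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

p = np.array([0.3, 0.9, 0.1, 0.5])   # probability of (s, r, o') for each entity o'
y = np.array([0.0, 1.0, 0.0, 0.0])   # assumed labels: only the second entity is a true object
loss = one_to_n_loss(p, y)
```

Every entity that is not a known object of (s, r) acts as a negative, so no separate negative sampling step is needed for this loss.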

Events, Software, and Datasets

Conferences

[Figure: venues grouped by area (Data Management, Artificial Intelligence, Natural Language Processing). Venues: AAAI, ACL, CIKM, ECML PKDD, EMNLP, ESWC, ICLR, ICML, IJCAI, KDD, NIPS, PODS, SIGMOD, VLDB, WSDM, WWW.]

• Automated Knowledge Base Construction;
• Knowledge Graph Conference;
• International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs;
• Workshop on Knowledge Graph Technology and Applications;
• Workshop on Deep Learning for Knowledge Graphs.

KGE libraries and systems

• AmpliGraph (TensorFlow, benchmark datasets): https://ampligraph.org/
• DeepGraphLibrary (MXNet/Gluon and PyTorch): https://www.dgl.ai/
• OpenKE (TensorFlow, pretrained embeddings): http://openke.thunlp.org
• PyKEEN (PyTorch): http://pykeen.readthedocs.io
• PyTorch-BigGraph (PyTorch): https://torchbiggraph.readthedocs.io/en/latest

[Table: which models each system implements. Systems (columns): AmpliGraph, DeepGraphLibrary, OpenKE, PyKEEN, PyTorch-BigGraph. Models (rows): ConvE, ComplEx, DistMult, ERMLP, HolE, RESCAL, R-GCN, Structured Embedding, TransD, TransE, TransH, TransR, TranSparse, Unstructured Model.]

PyTorch-BigGraph

• Built on PyTorch;
• Models:
  • ComplEx, DistMult, RESCAL, and TransE.
• Features:
  • Graph partitioning;
  • Multi-threaded computation;
  • Distributed execution across multiple machines;
  • Batched negative sampling.

Lerer, A. et al. PyTorch-BigGraph: A Large-scale Graph Embedding System. 2019.
https://torchbiggraph.readthedocs.io/en/latest/


DeepGraphLibrary

• Built on MXNet/Gluon and PyTorch.
• Models:
  • Graph Neural Networks: GCN, GAT, R-GCN, LGNN, and SSE.
  • Generative models: DGMG and JTNN.
  • Capsule, Transformer, and Universal Transformer architectures.

https://www.dgl.ai/

Benchmark Datasets

• DB15K: https://github.com/nle-ml/mmkb
• FB15K: https://github.com/nle-ml/mmkb
• FB15K-237: https://github.com/TimDettmers/ConvE
• WN18: https://everest.hds.utc.fr/doku.php?id=en:transe
• WN18RR: https://github.com/TimDettmers/ConvE
• YAGO3-10: https://github.com/TimDettmers/ConvE
• YAGO15K: https://github.com/nle-ml/mmkb

Open Research

Knowledge Graph Embeddings

• KGE and the lack of symbolic structures (e.g., rules and restrictions);
• Compare/combine KGE to/with other approaches (e.g., Probabilistic Soft Logic + rule learning);
• KGE interpretability and explainability;
• KGE consistency (e.g., embed formal knowledge).

Knowledge Graph Embeddings

• KGE sparsity and uncertainty;
• Temporal and spatial dynamics;
• Other structures:
  • Paths, motifs, graphlets, etc.
• Representation:
  • Hypergraphs (n-ary relations);
  • Meta-properties.
• Inductive vs. transductive learning.

Knowledge base/graph construction

• Heterogeneous and multimodal information;
• Multi-language knowledge bases;
• Construction in specific domains;
• Entity disambiguation and managing identity.

Knowledge base/graph construction

• Managing operations at scale;
• Virtual knowledge graphs;
• Continuously learning and self-correcting systems;
• Knowledge graph alignment.

[Figure: knowledge graph lifecycle. Structured data (e.g., relational databases), semi-structured data (e.g., XML and JSON documents), and unstructured data (e.g., text, audio, image, and video files) feed knowledge base construction (e.g., entity and relation extraction), which yields a knowledge graph. The knowledge graph supports applications and undergoes refinement (completion and correction), with completeness, freshness, and correctness as quality dimensions.]

dramos,ziviani,fporto@lncc.br

The authors thank CNPq, FAPERJ, and CENPES/Petrobras for funding.

http://dexl.lncc.br/