
Transforming your Graph Analytics with GraphDB (Vladimir Alexiev)

Date post: 16-Apr-2017
Upload: ontotext
GraphDB Fundamentals Ontotext Webinar Aug 11, 2016

Page 2: Presentation Outline

• Welcome
• RDF and RDFS Overviews
• SPARQL Overview
• Ontology Overview
• Ontology Modeling
• GraphDB™ Installation
• Performance Tuning and Scalability
• GraphDB™ Workbench and Sesame
• Loading Data
• Rule Sets and Reasoning Strategies
• Extensions


Page 4: What is RDF?

Resource Description Framework (RDF) is a graph data model that
• Formally describes the semantics, or meaning, of information
• Represents metadata, i.e., data about data

The RDF data model consists of triples
• That represent links (or edges) in an RDF graph
• Where the structure of each triple is Subject, Predicate, Object

Example triples:

Subject    Predicate      Object
br:Fred    br:hasSpouse   br:Wilma .
br:Fred    br:hasAge      25 .

‘br:’ refers to the namespace ‘http://bedrock/’, so ‘br:Fred’ expands to <http://bedrock/Fred>, a Uniform Resource Identifier (URI).
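As a rough illustration of the triple structure and prefix expansion described on this slide, the Python sketch below stores the two example triples as plain tuples. The tuple representation and the `expand` helper are assumptions made for illustration; they are not part of GraphDB or of any RDF library.

```python
# Hypothetical illustration: triples as (subject, predicate, object) tuples.
PREFIXES = {"br": "http://bedrock/"}

def expand(term):
    """Expand a prefixed name such as 'br:Fred' into a full URI in angle brackets.

    Non-string terms and terms with an unknown prefix are returned unchanged.
    """
    if isinstance(term, str) and ":" in term:
        prefix, local = term.split(":", 1)
        if prefix in PREFIXES:
            return "<" + PREFIXES[prefix] + local + ">"
    return term

triples = [
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
    ("br:Fred", "br:hasAge", 25),
]

# Expand every term of every triple and print it in a Turtle-like form.
expanded = [tuple(expand(term) for term in triple) for triple in triples]
for s, p, o in expanded:
    print(s, p, o, ".")
```

The literal 25 passes through untouched because only prefixed names are expanded.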

Page 5: An Example of an RDF Model

[Figure: RDF graph of the Flintstone and Rubble families. Nodes: Fred Flintstone, Wilma Flintstone, Pebbles Flintstone, Pearl Slaghoople, Barney Rubble, Betty Rubble, Bamm-Bamm Rubble, Roxy Rubble, Chip, Rock Quarry, Bedrock, Cobblestone County, Prehistoric America. Edge labels: hasSpouse, hasChild, worksFor, livesIn, locatedIn, partOf.]

Page 6: What is RDFS?

RDF Schema (RDFS)
• Adds
  − Concepts such as Resource, Literal, Class, and Datatype
  − Relationships such as subClassOf, subPropertyOf, domain, and range
• Provides the means to define
  − Classes and properties
  − Hierarchies of classes and properties
• Includes “entailment rules”, i.e., axioms to infer new triples from existing ones

Page 7: Applying RDFS To Infer New Triples

Asserted triples:

br:hasSpouse a rdf:Property ;
    rdfs:domain br:Human ;
    rdfs:range br:Human .
br:Human a rdfs:Class ;
    rdfs:subClassOf br:Mammal .
br:Fred br:hasSpouse br:Wilma .

Inferred from the domain and range of br:hasSpouse:

br:Fred a br:Human .
br:Wilma a br:Human .

Inferred from br:Human rdfs:subClassOf br:Mammal:

br:Fred a br:Mammal .
br:Wilma a br:Mammal .
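The inference on this slide can be approximated in a few lines of Python. This is a minimal sketch that hand-codes only three RDFS entailment patterns (domain, range, subClassOf) over string tuples; the `infer` function is hypothetical and much simpler than a real rule engine.

```python
# Hypothetical sketch of three RDFS entailment patterns (domain, range, subClassOf).
asserted = {
    ("br:hasSpouse", "rdfs:domain", "br:Human"),
    ("br:hasSpouse", "rdfs:range", "br:Human"),
    ("br:Human", "rdfs:subClassOf", "br:Mammal"),
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
}

def infer(triples):
    """Apply the rules repeatedly until no new triple appears (a fixpoint)."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))   # subject gets the domain class
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))   # object gets the range class
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))   # instances move up the hierarchy
        if new <= closure:
            return closure
        closure |= new

inferred = infer(asserted) - asserted
for triple in sorted(inferred):
    print(triple)
```

The first pass derives the two br:Human types; the second pass uses them to derive the two br:Mammal types, which is why the rules are run to a fixpoint.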

Page 8: RDF and RDFS Overviews

Questions?


Page 10: What is SPARQL?

SPARQL is a SQL-like query language for RDF graph data with the following query types:
• SELECT returns tabular results
• CONSTRUCT creates a new RDF graph based on query results
• ASK returns ‘yes’ if the query has a solution, otherwise ‘no’
• DESCRIBE returns RDF graph data about a resource; useful when the query client does not know the structure of the RDF data in the data source

SPARQL Update adds operations that modify data:
• INSERT inserts triples into a graph
• DELETE deletes triples from a graph

Page 11: Using SPARQL to Insert Triples

To create an RDF graph, perform these steps:
• Define prefixes to URIs with the PREFIX keyword
• Use INSERT DATA to signify you want to insert statements. Write the subject-predicate-object statements (triples).
• Execute this query.

PREFIX br: <http://bedrock/>
INSERT DATA {
  br:fred br:hasSpouse br:wilma .
  br:fred br:hasChild br:pebbles .
  br:wilma br:hasChild br:pebbles .
  br:pebbles br:hasSpouse br:bamm-bamm ;
             br:hasChild br:roxy, br:chip .
}

[Figure: graph of the inserted triples, with nodes :fred, :wilma, :pebbles, :bamm-bamm, :roxy, and :chip linked by :hasSpouse and :hasChild edges.]

Page 12: Using SPARQL to Select Triples

To access the RDF graph you just created, perform these steps:
• Define prefixes to URIs with the PREFIX keyword.
• Use SELECT to signify you want to select certain information, and WHERE to signify your conditions, restrictions and filters.
• Execute this query.

PREFIX br: <http://bedrock/>
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }

Result (excerpt):

Subject       Predicate      Object
br:fred       br:hasChild    br:pebbles
br:pebbles    br:hasChild    br:roxy
br:pebbles    br:hasChild    br:chip
br:wilma      br:hasChild    br:pebbles

Page 13: Using SPARQL to Find Fred’s Grandchildren

To find Fred’s grandchildren, first find out if Fred has any grandchildren:
• Define prefixes to URIs with the PREFIX keyword
• Use ASK to discover whether Fred has a grandchild, and WHERE to signify your conditions.

PREFIX br: <http://bedrock/>
ASK
WHERE {
  br:fred br:hasChild ?child .
  ?child br:hasChild ?grandChild .
}

Result: YES

Page 14: Using SPARQL to Find Fred’s Grandchildren

Now that we know he has at least one grandchild, perform these steps to find the grandchild(ren):
• Define prefixes to URIs with the PREFIX keyword
• Use SELECT to signify you want to select a grandchild, and WHERE to signify your conditions.

PREFIX br: <http://bedrock/>
SELECT ?grandChild
WHERE {
  br:fred br:hasChild ?child .
  ?child br:hasChild ?grandChild .
}

Result:

grandChild
1. br:roxy
2. br:chip
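To make the join in the two grandchildren queries more concrete, here is a small in-memory Python sketch. It assumes the six triples from the INSERT DATA slide held in a list; the `objects` and `grandchildren` helpers are hypothetical and only mimic how the shared ?child variable links the two triple patterns.

```python
# Hypothetical in-memory version of the ASK and SELECT queries for grandchildren.
triples = [
    ("br:fred", "br:hasSpouse", "br:wilma"),
    ("br:fred", "br:hasChild", "br:pebbles"),
    ("br:wilma", "br:hasChild", "br:pebbles"),
    ("br:pebbles", "br:hasSpouse", "br:bamm-bamm"),
    ("br:pebbles", "br:hasChild", "br:roxy"),
    ("br:pebbles", "br:hasChild", "br:chip"),
]

def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def grandchildren(person):
    """Join two hasChild patterns on the shared ?child variable."""
    return [
        grandchild
        for child in objects(person, "br:hasChild")
        for grandchild in objects(child, "br:hasChild")
    ]

has_grandchild = bool(grandchildren("br:fred"))   # plays the role of ASK
print(has_grandchild)
print(grandchildren("br:fred"))                   # plays the role of SELECT
```

A real SPARQL engine uses indexes rather than list scans, but the shape of the join is the same.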

Page 15: SPARQL Overview

Questions?


Page 17: What is an Ontology?

An ontology is a formal specification that provides a sharable and reusable knowledge representation.

Examples of ontologies include:
• Taxonomies
• Vocabularies
• Thesauri
• Topic Maps
• Logical Models

Page 18: What is in an Ontology?

An ontology specification includes descriptions of
• Concepts and properties in a domain
• Relationships between concepts
• Constraints on how the relationships can be used
• Individuals as members of concepts

Page 19: The Benefits of an Ontology

Ontologies provide:
• A common understanding of information
• Explicit domain assumptions

These provisions are valuable because ontologies:
• Support data integration for analytics
• Apply domain knowledge to data
• Support interoperation of applications
• Enable model-driven applications
• Reduce the time and cost of application development
• Improve data quality, i.e., metadata and provenance

Page 20: OWL Overview

The Web Ontology Language (OWL) adds more powerful ontology modelling means to RDF/RDFS
• Providing
  − Consistency checks: Are there logical inconsistencies?
  − Satisfiability checks: Are there classes that cannot have instances?
  − Classification: What is the type of an instance?
• Adding identity equivalence and identity difference
  − Such as sameAs, differentFrom, equivalentClass, equivalentProperty
• Offering more expressive class definitions, such as
  − Class intersection, union, complement, disjointness
  − Cardinality restrictions
• Offering more expressive property definitions, such as
  − Object and datatype properties
  − Transitive, functional, symmetric, inverse properties
  − Value restrictions
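The effect of two of the property characteristics listed on this slide, symmetric and transitive, can be sketched in Python. The example assumes that br:hasSpouse is declared symmetric and br:partOf transitive, and that Bedrock is part of Cobblestone County, which is part of Prehistoric America; those declarations and triples are assumptions for illustration, and the `closure` function is not how an OWL reasoner is implemented.

```python
# Hypothetical sketch: the effect of owl:SymmetricProperty and owl:TransitiveProperty.
SYMMETRIC = {"br:hasSpouse"}
TRANSITIVE = {"br:partOf"}

asserted = {
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
    ("br:Bedrock", "br:partOf", "br:CobblestoneCounty"),
    ("br:CobblestoneCounty", "br:partOf", "br:PrehistoricAmerica"),
}

def closure(triples):
    """Add the triples implied by the symmetric and transitive declarations."""
    result = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(result):
            implied = set()
            if p in SYMMETRIC:
                implied.add((o, p, s))            # flip subject and object
            if p in TRANSITIVE:
                # chain (s p o) with every (o p o2) to get (s p o2)
                implied |= {(s, p, o2) for s2, p2, o2 in result if p2 == p and s2 == o}
            if not implied <= result:
                result |= implied
                changed = True
    return result

for triple in sorted(closure(asserted) - asserted):
    print(triple)
```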

Page 21: Ontology Overview

Questions?


Page 23: A Methodology for Ontologies

"Ontology Development 101" by Noy & McGuinness (2001) is a popular, practical seven-step methodology for developing an ontology.
• Step 1: Identify the domain and scope
• Step 2: Consider re-using existing ontologies
• Step 3: Enumerate important terms
• Step 4: Define the classes and class hierarchy
• Step 5: Define the properties of classes
• Step 6: Define property facets
• Step 7: Create instances

Page 24: Step 1: Identify the Domain and Scope

To help identify the domain and scope of the ontology, answer these questions:
• What is the domain of the ontology?
• What is the purpose of the ontology?
• Who are the users and maintainers?
• What questions will the ontology answer?

Some say the last question is the most important (the Competency Questions approach).

Page 25: Step 2: Consider Re-using Existing Ontologies

Ontologies are re-usable and extensible, and there are a number of existing ontologies that you might consider:
• Your existing ontology
• Widely used ontologies
  − such as: Dublin Core, FOAF, SKOS, Geo (WGS84)
• Upper Level Ontologies
  − such as: Cyc, UMBEL, DOLCE, SUMO, PROTON
• Linked Open Data
• Specialized domain ontologies

Page 26: Step 3: Enumerate Important Terms

Terminology is useful for domain modeling. Start collecting terminology based on interviews and domain documentation.

Page 27: Step 4: Define Class and Class Hierarchy

To help define the class and class hierarchy, determine which type of modeling to use.

Three types of modeling are:
• Top-down modeling
  − Use it when the general domain concepts are known
• Bottom-up modeling
  − Use it when there is a great variety of concepts and no clear overarching general concepts at the outset
• Hybrid modeling
  − Use it when you need both top-down and bottom-up modeling, which is often the case

Ontotext, AD and Keen Analytics, LLC. All Rights Reserved

Page 28: Step 5: Define Properties of Classes

Define the properties of classes, such as:
• Intrinsic properties
  − For example, color, mass, density
• Extrinsic properties
  − For example, name, location
• Parts
• Relationships to other individuals

Page 29: Step 6: Define Property Facets

Define property facets, such as:
• Property Type
  − Is it symmetric? Is it transitive? Is it a datatype or an object property?
• Cardinality
  − Is the property optional or essential? Is the property a one-to-many relationship?
• Domain
  − From which classes does this property point?
• Range
  − To which classes does this property point?
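One way to picture the facets listed on this slide is as a small record per property. The Python sketch below is illustrative only: the `PropertyFacets` class, its field names, and the Betty triple are assumptions, and the check treats cardinality as a closed-world validation rule, which simplifies how OWL reasoners actually interpret cardinality restrictions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PropertyFacets:
    """Facets recorded for one property; all field names are illustrative."""
    name: str
    domain: str
    range: str
    symmetric: bool = False
    transitive: bool = False
    max_cardinality: Optional[int] = None   # None means "no upper limit"

has_spouse = PropertyFacets("br:hasSpouse", "br:Human", "br:Human",
                            symmetric=True, max_cardinality=1)

def cardinality_violations(facets, triples):
    """Return the subjects that use the property more often than allowed."""
    counts = {}
    for s, p, _ in triples:
        if p == facets.name:
            counts[s] = counts.get(s, 0) + 1
    limit = facets.max_cardinality
    return sorted(s for s, n in counts.items() if limit is not None and n > limit)

# Made-up data: the second triple exists only to trigger the check.
data = [("br:Fred", "br:hasSpouse", "br:Wilma"), ("br:Fred", "br:hasSpouse", "br:Betty")]
print(cardinality_violations(has_spouse, data))
```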

Page 30: Step 7: Create Instances

Create instances of classes
• For example, :Fred a :Human

Creating instances
• Tests the domain ontology
• May expose modeling issues
  − which can be addressed by iterative refinement

Page 31: Ontology Modeling

Questions?


Page 33: GraphDB™ Editions

• GraphDB™ Free
• GraphDB™ Standard
• GraphDB™ Cloud
• GraphDB™ as-a-Service (S4)
• GraphDB™ Enterprise

Page 34: GraphDB™ Free Installation

http://info.ontotext.com/graphdb-free-graphdb

Page 35: GraphDB™ Free Edition Installation Overview

To install GraphDB™ Free Edition, perform these steps:
• With the new GraphDB 7 on Windows: run the installer and it starts automatically
• Otherwise: unzip the distribution, then start GraphDB and the Workbench in the embedded Tomcat server by executing the startup script located in the root directory:

startup.bat (Windows)
./startup.sh (Linux/Unix/Mac OS)

The message below appears in your terminal and the GraphDB Workbench opens at http://localhost:8080/.

INFO: Starting ProtocolHandler ["http-bio-8080"]
Opening web app in default browser

Page 36: GraphDB™ Free Edition Workbench New Repository

http://localhost:8080

Create a new repository by:
• Launching the GraphDB™ Workbench
• Selecting “Admin”
• Selecting “Locations and Repositories”
• Configuring the new repository

Page 37: GraphDB™ Workbench Execute Queries

http://localhost:8080

Test the repository by
• Selecting “SPARQL”
• Submitting queries

[Screenshot callouts: 1 Insert Data, 2 Query]

Page 38: GraphDB™ Installation

Questions?


Page 40: Performance Tuning: Memory

With regard to performance tuning
• Memory is the most important factor
  − More memory results in better performance
• Configure the heap space as follows:
  − Set Max Heap Space to ~90% of Free Memory (-Xmx JVM parameter)
  − Use entity-index-size to set the entity index size
  − Cache memory indices (statements, predicates, and FTS)
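The heap rule of thumb on this slide is simple arithmetic. A minimal Python sketch, assuming the free memory is known in megabytes; the `max_heap_option` helper is hypothetical and only formats a standard JVM -Xmx option.

```python
# Hypothetical helper: derive a -Xmx value from the ~90% rule of thumb.
def max_heap_option(free_memory_mb, percent=90):
    """Return a JVM -Xmx option sized to a percentage of the free memory (in MB)."""
    heap_mb = free_memory_mb * percent // 100   # integer arithmetic, rounds down
    return "-Xmx" + str(heap_mb) + "m"

# Example: a machine with 16 GB (16384 MB) of free memory.
print(max_heap_option(16384))
```

The percentage is a parameter because the slide gives it as an approximate value, not a fixed constant.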

Page 41: Performance Tuning: Memory

[Figure: layout of the total available Java heap, set by the JVM option -Xmx<size> (total Java heap memory). Regions shown: Java runtime overhead; Entities; cache memory (cache-memory), which holds the POS/PSO, PCSO, and PCOS indices (tuple-index-memory), the Predicate Lists SP/OP (predicate-memory), and Full-text search (fts-memory); and the GraphDB application heap, which also holds RDF Rank, Geo-spatial, and Lucene. Annotations: “Depends on entity-index-size”; “Typically 12-15% of total heap”; “Remaining memory used by GraphDB and the application’s heap”; “Some of the space will be used for caching the RDF Rank, geo-spatial, and Lucene indices (if enabled)”.]

Page 42: Performance Tuning: Load

Each dataset has its own “geometry.” Technicians must gain experience with each dataset in order to refine the loading process. Here are some tips:
• Load Performance
  − Set ‘cache-memory’ to 50% of max heap
  − Disable optional indices
  − Load data in chunks of 1 million statements
  − Use fast transaction mode
• Use the new LoadRDF Parallel Bulk Loader
• Normal Operations After Load
  − Set ‘cache-memory’ to 38% of max heap
  − Re-enable optional indices
  − Enable safe transaction mode
  − Experiment with cache-memory, tuple-index-memory, predicate-memory, and fts-memory
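Two of the tips on this slide, loading in chunks and resizing cache-memory after the load, can be sketched in Python. Both helpers are hypothetical: `chunks` only splits an iterable of statements into batches (actually sending a batch to a repository is left out), and `cache_memory_mb` applies the 50% and 38% rules of thumb as plain arithmetic.

```python
from itertools import islice

CHUNK_SIZE = 1_000_000  # statements per chunk, as suggested on the slide

def chunks(statements, size=CHUNK_SIZE):
    """Yield successive lists of at most `size` statements from any iterable."""
    iterator = iter(statements)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def cache_memory_mb(max_heap_mb, loading):
    """Apply the 50% (bulk load) or 38% (normal operation) rule of thumb."""
    percent = 50 if loading else 38
    return max_heap_mb * percent // 100

# Small demonstration with a chunk size of 4 instead of one million.
print([len(c) for c in chunks(range(10), size=4)])
print(cache_memory_mb(8000, loading=True), cache_memory_mb(8000, loading=False))
```

Using `islice` keeps only one chunk in memory at a time, which matters when the input is a large file read lazily.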

Page 43: Performance Tuning: Spreadsheet

To help achieve the optimal configuration, GraphDB™ has a spreadsheet that estimates memory and index configuration values.

The spreadsheet
• Generates command line parameters and ttl configuration based on your input
• Is located in the ./doc directory of your distribution as graphdb-se-configurator.xls

Page 44: Scalability: GraphDB™ Enterprise

GraphDB™ Enterprise edition provides scalability
• Replication / High Availability cluster
• Improved concurrent querying and scalability
• Resilience for failover

Page 45: Performance Tuning and Scalability

Questions?


Page 47: GraphDB™ Workbench and Sesame

GraphDB™ Workbench is a web-based administration tool. It is similar to Sesame Workbench, but
• Has more features
• Is more intuitive and easier to use

GraphDB™ Workbench functions include
• Managing GraphDB™ repositories
• Loading and exporting data
• Monitoring query execution
• Developing and executing queries, including auto-complete and charting of results
• Managing connectors and users

Page 48: GraphDB™ Workbench

On the following slide is an example of the GraphDB™ Workbench screen.
• Access the GraphDB™ Workbench from a browser.
• The splash page provides a summary of the installed GraphDB™ Workbench.
• The Workbench has a menu bar and a number of convenient pull-down menus organized under “Data”, “SPARQL”, “Admin”, and the currently selected repository.

Page 49: Access GraphDB™ Workbench

http://localhost:8080/graphdb-workbench-se/

Page 50: Create New Repository

Create a new repository by selecting
• The Admin menu
• Locations and Repositories
• Create Repository

Page 51: Execute Queries With GraphDB™ Workbench

Selecting the SPARQL menu displays the SPARQL query editor, which
• Allows you to render your query results as Table, Pivot Table, or Google Charts

Page 52: GraphDB™ Workbench Query Editor

Page 53: Query Monitoring: Abort Query

Page 54: GraphDB™ Workbench and Sesame

Questions?


Page 56: Loading Data

Loading data may be accomplished by using
• GraphDB™ Workbench
  − To upload individual files
  − To upload bulk data from a directory
• LoadRDF Parallel Loader

Page 57: Loading Data: Supported File Formats

Page 58: Loading data through the GraphDB Workbench

To load a local file:
• Select Data -> Import.
• Open the Local files tab and click the Select files icon to choose the file you want to upload.
• Click the Import button.
• Enter the import settings in the pop-up window.

Page 59: Loading Local Files

Page 60: Loading a database server file

• Create a folder named graphdb-import in your user home directory.
• Copy all data files you want to load into the GraphDB database to this folder.
• Go to the GraphDB Workbench.
• Select Data -> Import.
• Open the Server files tab.
• Select the files you want to import.
• Click the Import button.

Page 61: LoadRDF Parallel Bulk Loader

The LoadRDF Parallel Bulk Loader
• Features fast loading of large datasets into new repositories
• Is not intended for updating existing repositories
• Is easy to use:
  − Enter loadrdf <config.ttl> <serial|parallel> <files...>
    ▪ For example “./loadrdf.sh config.ttl parallel example.ttl”
  − The “Serial Load” option pipelines the parse, entity resolution, and load tasks.
  − The “Parallel Load” option batch processes the parse, entity resolution, and load tasks.

Page 62: Other ways to load data

• By pasting data in the Text area tab of the Import page.
• By pasting a data URL in the Remote content tab of the Import page.
• By executing an INSERT query in the SPARQL -> SPARQL Query page.

Page 63: Loading Data

Questions?


Page 65: Reasoning Strategies

• Forward Chaining
  − Inferences pre-computed
  − Faster query performance
  − Slower load times
  − More memory/disk space required
  − Updates are expensive (truth maintenance is non-trivial)
• Backward Chaining
  − Inferences performed as needed at query time
  − Slower query performance
  − Faster load times
• Hybrid Reasoning
  − Partial forward chaining at data loading time + partial backward chaining at query time
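The trade-off between the two strategies on this slide can be seen in a toy Python example. It is a sketch only: the class hierarchy (including br:Animal), the `materialize` function, and the `has_type_backward` function are hypothetical, and a real engine handles arbitrary rules rather than one hard-coded subclass chain.

```python
# Hypothetical sketch: the same subclass reasoning done at load time and at query time.
SUBCLASS_OF = {"br:Human": "br:Mammal", "br:Mammal": "br:Animal"}
ASSERTED_TYPES = {"br:Fred": "br:Human"}

def superclasses(cls):
    """Walk up the class hierarchy from cls, yielding each ancestor."""
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        yield cls

def materialize():
    """Forward chaining: precompute and store every inferred type at load time."""
    store = set()
    for instance, cls in ASSERTED_TYPES.items():
        store.add((instance, cls))
        store.update((instance, sup) for sup in superclasses(cls))
    return store

def has_type_backward(instance, cls):
    """Backward chaining: keep only asserted types and reason when queried."""
    asserted = ASSERTED_TYPES.get(instance)
    return asserted == cls or cls in superclasses(asserted)

store = materialize()
print(len(store), ("br:Fred", "br:Animal") in store)   # larger store, plain lookup
print(has_type_backward("br:Fred", "br:Animal"))       # small store, work per query
```

Forward chaining pays once at load time and stores three statements for one asserted type; backward chaining stores one statement and repeats the hierarchy walk on every query.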

Page 66: GraphDB™ Reasoning Optimizations

• GraphDB™ forward chaining/delete optimization
  − Fast (incremental) inserts (assertions) and deletes (retractions)
  − Most triplestores perform an expensive full re-compute on updates
  − Truth maintenance minimizes the re-compute, but the required dependency tracking is expensive
  − GraphDB optimizes the update by using backward chaining to derive update dependencies dynamically
  − It stops at axioms or ontology triples (see onto:schemaTransaction)
• owl:sameAs forward chaining optimization
  − Forward chaining owl:sameAs generates a large number of triples
  − This is caused by statement duplication on equivalent resources
  − The equivalent resource optimization minimizes the triples generated
  − Backward chaining can expand results at query time
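The statement duplication behind the owl:sameAs point on this slide can be shown with a small count. This Python sketch is not GraphDB's implementation, which the slides do not describe; the extra identifiers br:FredFlintstone and br:F_Flintstone and all three functions are assumptions for illustration.

```python
# Hypothetical sketch: why owl:sameAs is costly to materialize naively.
SAME_AS = [("br:Fred", "br:FredFlintstone"), ("br:FredFlintstone", "br:F_Flintstone")]
STATEMENTS = [("br:Fred", "br:hasSpouse", "br:Wilma"), ("br:Fred", "br:hasAge", 25)]

def equivalence_classes(pairs):
    """Group resources connected by owl:sameAs and map each to its group."""
    group_of = {}
    for a, b in pairs:
        merged = group_of.get(a, {a}) | group_of.get(b, {b})
        for member in merged:
            group_of[member] = merged
    return group_of

groups = equivalence_classes(SAME_AS)

def naive_expansion(statements):
    """Duplicate each statement for every resource equivalent to its subject or object."""
    return {
        (s2, p, o2)
        for s, p, o in statements
        for s2 in groups.get(s, {s})
        for o2 in groups.get(o, {o})
    }

def with_representative(statements):
    """Store each statement once, on one chosen member of each equivalence class."""
    def rep(term):
        return min(groups[term]) if term in groups else term
    return {(rep(s), p, rep(o)) for s, p, o in statements}

print(len(naive_expansion(STATEMENTS)), len(with_representative(STATEMENTS)))
```

With three equivalent identifiers for Fred, the naive approach stores every statement about him three times, while the representative approach stores it once and leaves the expansion to query time.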

Page 67: Rule Sets

A Rule Set Consists of
• Prefixes (namespace prefixes)
• Axiomatic triples
• Custom rules

Pre-Defined Rule Sets are
• empty: no reasoning, GraphDB™ operates as a plain RDF store
• rdfs: standard RDFS semantics
• owl-horst: RDFS + D-Entailment + some OWL (tractable)
• owl-max: RDFS with most of OWL Lite
• owl2-rl: Conformant OWL2 RL profile except for D-Entailment (types)
• owl2-ql: Reasoning over large volumes of data

Page 68: Rule Sets and Reasoning Strategies

Questions?


Page 70: Ontotext GraphDB Connectors

• Provide extremely fast full-text search, range search, faceted search, and aggregations
• Utilize an external engine like Lucene, Solr or Elasticsearch
• Flexible schema mapping: index only what you need
• Real-time synchronization of data in GraphDB and the external engine
• Connector management via SPARQL
• Data querying & update via SPARQL
• Based on the GraphDB plug-in architecture

Page 71: Workflow

[Figure: connector workflow between the GraphDB engine (Query Processor, internal indexes, graph indexes) and Lucene/Solr/Elasticsearch. Labels: SPARQL INSERT/DELETE; SPARQL SELECT with or without an embedded Lucene/Solr/Elasticsearch query; Selective Replication; Solr/Elasticsearch direct queries.]

Page 72: Interface

• All interaction via SPARQL queries
  − INSERT for creating connectors
  − SELECT for getting connector configuration parameters
  − INSERT/SELECT/DELETE for managing & querying RDF data

Page 73: Connectors: Primary Features

• Maintaining an index that is always in sync with the data stored in GraphDB
• Multiple independent instances per repository
• The entities for synchronization are defined by:
  − a list of fields (on the Lucene side) and property chains (on the GraphDB side) whose values will be synchronised
  − a list of rdf:type's of the entities for synchronisation
  − a list of languages for synchronisation (the default is all languages)
  − additional filtering by property and value
• Full-text search using native Lucene queries

Page 74: Connectors: Primary Features

• Snippet extraction: highlighting of search terms in the search result
• Faceted search, e.g. Europeana Food and Drink
• Sorting by any preconfigured field
• Paging of results using offset and limit
• Custom mapping of RDF types to Lucene types
• Specifying which Lucene analyzer to use (the default is Lucene's StandardAnalyzer)
• Boosting an entity by the [numeric] value of one or more predicates
• Custom scoring expressions at query time to evaluate score based on Lucene

Page 75: TinkerPop Blueprints Support

• Blueprints (Apache TinkerPop, aka Gremlin) is a popular API for accessing graph databases
• It is supported by Hadoop, Neo4j, Titan, etc.
• GraphDB has supported Blueprints for accessing RDF databases since 7.0
• It represents RDF as a simplified version of the Property Graph model
• In this way you can use graph programming frameworks, or use ready-made graph exploration software like Linkurious

Page 76: RDF Rank

RDF Rank is a GraphDB™ extension that
• Is similar to PageRank and identifies “important” nodes in an RDF graph based on their interconnectedness
• Is accessed using the rank:hasRDFRank system predicate
• Has an incremental mode (Incremental RDF Rank) that is useful for frequently changing data

For example, to select the top 100 important nodes in the RDF graph:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT ?n
WHERE { ?n rank:hasRDFRank ?r }
ORDER BY DESC(?r)
LIMIT 100
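The slide describes RDF Rank only as similar to PageRank, so the Python sketch below implements plain PageRank by power iteration, not GraphDB's own algorithm. The edge list reuses the hasChild links from the earlier SPARQL examples; the damping factor, iteration count, and function name are assumptions for illustration.

```python
# Hypothetical sketch: plain PageRank by power iteration over a tiny graph.
# This is not GraphDB's RDF Rank, which the slide only calls "similar to PageRank".
EDGES = [
    ("br:fred", "br:pebbles"),
    ("br:wilma", "br:pebbles"),
    ("br:pebbles", "br:roxy"),
    ("br:pebbles", "br:chip"),
]

def page_rank(edges, damping=0.85, iterations=50):
    """Return a dict of node -> score; the scores sum to 1."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank

ranks = page_rank(EDGES)
top = sorted(ranks, key=ranks.get, reverse=True)
print(top[0])
```

Sorting by descending score and taking the first entries corresponds to the ORDER BY DESC(?r) and LIMIT clauses of the query on the slide.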

Page 77: GeoSPARQL Support

GeoSPARQL is a standard from the Open Geospatial Consortium for representing and querying geospatial linked data, using the Geography Markup Language. It provides:
• A small topological ontology in RDFS/OWL for representation
• Simple Features, RCC8, and DE-9IM (a.k.a. Egenhofer) topological relationship vocabularies and ontologies for qualitative reasoning
• A SPARQL query interface using a set of topological SPARQL extension functions for quantitative reasoning

Page 78: Extensions

Questions?

Page 79: Support and FAQ’s

[email protected]

Additional resources:

Ontotext:
• Community Forum and Evaluation Support: http://stackoverflow.com/questions/tagged/graphdb
• GraphDB Website and Documentation: http://graphdb.ontotext.com
• Whitepapers, Fundamentals: http://ontotext.com/knowledge-hub/fundamentals/

SPARQL, OWL, and RDF:
• RDF: http://www.w3.org/TR/rdf11-concepts/
• RDFS: http://www.w3.org/TR/rdf-schema/
• SPARQL Overview: http://www.w3.org/TR/sparql11-overview/
• SPARQL Query: http://www.w3.org/TR/sparql11-query/
• SPARQL Update: http://www.w3.org/TR/sparql11-update

Page 80: For Further Information

• Peio Popov, North America Sales and Business Development
  − [email protected]
  − 1.929.239.0659
• Ilian Uzunov, Europe Sales and Business Development
  − [email protected]
  − 359.888.772.248

Page 81: The End

GraphDB™ Fundamentals

