Power of Polyglot Search

Post on 22-Jan-2018

GraphAware®

The power of polyglot searching

Janos Szendi-Varga

graphaware.com

@graph_aware

Most frequently used UI element

Search Go

Evolution of Internet Search

https://moz.com/blog/the-evolution-of-search

Slide from BDU 2016

We started to be Polyglot

Big data architecture is not a vision

We hired Data Scientists

We started to index things (Lucene)

We started to use Solr, ElasticSearch, etc.

It became part of our Big Data architecture

We introduced Search Infrastructure

Evolution in corporate search

The fundamentals of search infrastructure

?

They are aggregate-oriented databases, so they have limitations when it comes to connected data

Typical setup: Two users searching for the same thing will get the same results

They are in the search 3.0-4.0 phase

They are superstars of full-text search

We need to extend this with Graph-aided search

We have to boost some search hits (c'mon, it is a recommender system)

We have to filter out some hits or degrade their score

We need Things, not Strings!

Challenges
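The boosting and filtering challenges above can be sketched as a re-ranking step applied to the hits a full-text engine returns. This is only a minimal illustration, not the GraphAware implementation: the `Hit` type, the `graph_aided_rescore` function and the boost values are hypothetical, and in practice the multipliers and exclusions would be computed from the user's neighbourhood in the graph.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    score: float  # relevance score as returned by the full-text engine


def graph_aided_rescore(hits, boosts, excluded, default_boost=1.0):
    """Re-rank full-text hits with per-user signals derived from a graph.

    boosts:   doc_id -> multiplier (>1 promotes, <1 degrades). Hypothetical
              values; a real system would derive them from graph queries.
    excluded: doc_ids that must not be shown to this user at all.
    """
    rescored = [
        Hit(h.doc_id, h.score * boosts.get(h.doc_id, default_boost))
        for h in hits
        if h.doc_id not in excluded
    ]
    return sorted(rescored, key=lambda h: h.score, reverse=True)


# The same full-text result list for everyone...
hits = [Hit("laptop", 3.0), Hit("tablet", 2.5), Hit("phone", 2.0)]

# ...personalised for two users with different graph-derived signals.
alice = graph_aided_rescore(hits, boosts={"phone": 2.0}, excluded={"tablet"})
bob = graph_aided_rescore(hits, boosts={"laptop": 0.5}, excluded=set())

print([h.doc_id for h in alice])
print([h.doc_id for h in bob])
```

Keeping the graph signal as a multiplier on top of the engine's own score means the full-text engine still decides what is relevant, while the graph decides what is relevant for this particular user.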

Example of graph-based search

“A knowledge graph is a multi-relational graph composed of entities as nodes and relationships as edges with different types that describe facts in the world.”

Knowledge graph
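To make the definition concrete, here is a small sketch of a multi-relational graph in which every fact is a typed, directed edge between two entities. The `KnowledgeGraph` class and the relationship names are made up for illustration; a graph database would provide this storage and querying for you.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Entities are nodes; each fact is a typed, directed relationship."""

    def __init__(self):
        # node -> relationship type -> set of target nodes
        self._out = defaultdict(lambda: defaultdict(set))

    def add_fact(self, subject, rel_type, obj):
        self._out[subject][rel_type].add(obj)

    def objects(self, subject, rel_type):
        """Entities reachable from `subject` via one `rel_type` edge."""
        return set(self._out.get(subject, {}).get(rel_type, set()))

    def subjects(self, rel_type, obj):
        """Entities that point at `obj` via a `rel_type` edge."""
        return {
            s for s, rels in self._out.items()
            if obj in rels.get(rel_type, ())
        }


kg = KnowledgeGraph()
kg.add_fact("Solr", "BUILT_ON", "Lucene")
kg.add_fact("ElasticSearch", "BUILT_ON", "Lucene")
kg.add_fact("Neo4j", "IS_A", "GraphDatabase")

print(kg.objects("Neo4j", "IS_A"))
print(kg.subjects("BUILT_ON", "Lucene"))
```

Because the relationship type is part of every edge, the same pair of entities can be connected by several different kinds of fact, which is what "multi-relational" means here.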

It is about “understanding the world as you and I do”.

Search infrastructure should be easily integrated into existing architecture

New data sources should be easily added

Should support the strategic goals, e.g. search-driven e-commerce

Scalable

Should provide personalised results

Simple interface

Requirements of searching and KG

Take a graph database (Neo4j, Cayley, OntoText GraphDB, etc.)

Graph construction:

Knowledge extraction

from the internet

open data

grabbing

from text (NLP)

from current databases (Master Data)

from logs

Knowledge Graph Construction

Have a good graph model

Connect the things together

Steps to build KG
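The "connect the things together" step can be sketched as an upsert: records from different sources that refer to the same entity are merged into a single node, and relationships are added between nodes. Everything below (the `Graph` class, the labels, the sample records) is hypothetical and stands in for what a graph database and its import tooling would do.

```python
class Graph:
    def __init__(self):
        self.nodes = {}     # (label, key) -> properties
        self.edges = set()  # (from_id, rel_type, to_id)

    def merge_node(self, label, key, **props):
        """Create the node if it is missing, otherwise enrich it (upsert)."""
        node_id = (label, key)
        self.nodes.setdefault(node_id, {}).update(props)
        return node_id

    def connect(self, src, rel_type, dst):
        self.edges.add((src, rel_type, dst))


# Hypothetical records from two of the source kinds listed above.
master_data = [{"sku": "P1", "name": "Trail shoe", "brand": "Acme"}]
search_log = [{"user": "u42", "query": "trail shoe", "clicked_sku": "P1"}]

g = Graph()

# From current databases (master data): products and their brands.
for row in master_data:
    product = g.merge_node("Product", row["sku"], name=row["name"])
    brand = g.merge_node("Brand", row["brand"])
    g.connect(product, "MADE_BY", brand)

# From logs: which user clicked which product.
for event in search_log:
    user = g.merge_node("User", event["user"])
    product = g.merge_node("Product", event["clicked_sku"])  # same node as above
    g.connect(user, "CLICKED", product)

print(len(g.nodes), len(g.edges))
```

Merging on a stable key such as the SKU is what lets a log event and a master-data row end up describing the same "thing" rather than two unrelated strings.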

Apache Kafka for streaming pipelines

Product topic

Search topic

Feedback topic

Spark on the processing side

Neo4j on the consuming side

CQRS (Command Query Responsibility Segregation) pattern

Push to ElasticSearch with GraphAware plugin

Neo4j Transaction Handler (afterCommit)

You can define mappings to ES

Parts of the architecture
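The CQRS split and the after-commit push can be illustrated with an in-memory sketch: writes go to a graph store, and only once a transaction has committed are the mapped properties replicated to a separate search index that serves queries. This is not the API of Neo4j, ElasticSearch or the GraphAware plugin; `GraphStore`, `SearchIndex`, `replicate_to` and `MAPPING` are invented names that only mimic the pattern.

```python
class SearchIndex:
    """Read side: a toy substring index standing in for the search engine."""

    def __init__(self):
        self.docs = {}

    def upsert(self, doc_id, doc):
        self.docs[doc_id] = doc

    def search(self, term):
        term = term.lower()
        return sorted(
            doc_id for doc_id, doc in self.docs.items()
            if any(term in str(v).lower() for v in doc.values())
        )


class GraphStore:
    """Write side: buffers changes and fires hooks only after commit."""

    def __init__(self):
        self.nodes = {}
        self._pending = []
        self._after_commit = []

    def on_after_commit(self, hook):
        self._after_commit.append(hook)

    def save(self, node_id, label, props):
        self._pending.append((node_id, label, props))

    def commit(self):
        committed, self._pending = self._pending, []
        for node_id, label, props in committed:
            self.nodes[node_id] = (label, props)
        for hook in self._after_commit:
            hook(committed)


# Hypothetical mapping: which labels are replicated, and which properties.
MAPPING = {"Product": ["name", "category"]}


def replicate_to(index, mapping):
    def hook(changes):
        for node_id, label, props in changes:
            if label in mapping:
                doc = {k: props[k] for k in mapping[label] if k in props}
                index.upsert(node_id, doc)
    return hook


index = SearchIndex()
store = GraphStore()
store.on_after_commit(replicate_to(index, MAPPING))

store.save("p1", "Product", {"name": "Trail shoe", "category": "Shoes", "cost": 12})
store.save("u1", "User", {"name": "Shoe fan"})
before = index.search("shoe")  # nothing is replicated until commit
store.commit()
after = index.search("shoe")

print(before, after)
```

Replicating only after commit keeps the search index from ever seeing data that was rolled back, and the mapping keeps unindexed labels and properties (here `User` nodes and `cost`) out of the read side.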

Success story 1.

• Sharing Tribal Knowledge inside the company

• >20 offices

• >3000 employees

• Data sources:

• Tableau dashboards (4000)

• Knowledge posts (>1000)

• Superset charts and dashboards (>6000)

• Experiments and metrics (>5000)

https://www.slideshare.net/ChristopherWilliams24/20170108scaling-tribalknowledge

Success story 2.

• Half a century of collective NASA engineering knowledge

• It is called the Lessons Learned database

• They use it in the Mars mission project

Impact: “Neo4j saved well over two years of work and one million dollars of taxpayer funds.”

“When we had the [Apollo 1] fire, we took a step back and said okay, what lessons have we learned from this horrible tragedy? Now let’s be doubly sure that we are going to do it right the next time. And I think that fact right there is what allowed us to get Apollo done in the ‘60s.” —Dr. Christopher C. Kraft, Jr., Director of Flight Operations

Neo4j

ElasticSearch

GraphAware modules:

Neo4j to ElasticSearch

ElasticSearch Plugin

NLP plugin

Github: github.com/graphaware

Open data

Resources

It is not rocket science!

Anonymous NASA scientist

www.graphaware.com

@graph_aware

GraphAware

world’s #1 Neo4j consultancy