Post on 14-Jan-2015
Thesaurus based Enterprise Search
Two Show Cases
Andreas Blumauer
Graz, September 2011
Agenda
• Semantic search scenarios
• The role of thesauri in semantic search
• PoolParty Semantic Search
– Live Demo – http://bit.ly/semantic_search
• Show Cases & Demos
Semantic search scenarios
Semantic search has many faces
Situations in which semantic search can help
I can't remember how to spell the search term.
I can't remember exactly what I was looking for.
I want to gain background knowledge about a certain document.
I want to know more about this entity in a certain context.
I want to see facts from different sources describing this entity.
I want to search in different languages simultaneously.
I want the software to understand what I mean by "Jaguar".
Knowledge worker's questions
Has anybody solved the problem xy?
Who can I ask about xy?
What are the others currently working on?
What is the state of the art in xy?
Four demands for a smarter search
1. Find information faster: provide search assistants
2. Reveal hidden information: enrich the search index with background knowledge
3. Find more specific information: query the semantic web
4. Find linked information: integrate data sources
Find information faster – Auto-Complete
I can't remember how to spell the search term.
To provide powerful auto-complete for enterprise search scenarios as well, you need to establish an enterprise vocabulary.
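A minimal sketch of vocabulary-driven auto-complete, assuming the enterprise vocabulary is available as a plain list of labels. The labels are borrowed from examples elsewhere in these slides; the function name `suggest` and the list `VOCABULARY` are hypothetical and do not describe the PoolParty Suggest Service.

```python
from bisect import bisect_left

# Hypothetical enterprise vocabulary, lower-cased and sorted for prefix lookup.
VOCABULARY = sorted(label.lower() for label in [
    "hydropower plants", "small hydro", "micro hydro",
    "clean energy", "selective non catalytic reduction", "SNCR",
])

def suggest(prefix, limit=5):
    """Return up to `limit` vocabulary labels that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(VOCABULARY, prefix)  # first label >= prefix
    matches = []
    for label in VOCABULARY[start:]:
        if not label.startswith(prefix):
            break  # sorted list: no later label can match
        matches.append(label)
        if len(matches) == limit:
            break
    return matches

print(suggest("hy"))
```

Because suggestions come only from the controlled vocabulary, the user is steered towards terms that are known to exist in the index.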
Find information faster – Status quo
hydropower plants [Search]
I can't remember exactly what I was looking for.
Small hydro [Search]
Find information faster with related search terms
hydropower plants [Search]
http://www.reegle.info/clean-energy-search
Reveal hidden information – Status quo
SNCR [Search]
SNCR OR "Selective non- [Search]
I forgot some of the names for the entity I'm looking for.
Reveal hidden information with query expansion
SNCR OR "selective non catalytic reduction" [Search]
SNCR: alternative label
selective non catalytic reduction: preferred label
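The expansion on this slide can be sketched as a lookup over preferred and alternative labels. This is an illustration under assumptions: the dictionary `THESAURUS` and the function `expand_query` are hypothetical names, and a real thesaurus would be read from SKOS data rather than hard-coded.

```python
# Hypothetical thesaurus: preferred label -> list of alternative labels.
THESAURUS = {
    "selective non catalytic reduction": ["SNCR"],
}

def quote(label):
    """Quote multi-word labels so they are searched as a phrase."""
    return f'"{label}"' if " " in label else label

def expand_query(term):
    """OR the user's term with all other labels of the same concept."""
    term = term.strip()
    needle = term.lower()
    for preferred, alternatives in THESAURUS.items():
        labels = [preferred] + alternatives
        if needle in (label.lower() for label in labels):
            others = [label for label in labels if label.lower() != needle]
            return " OR ".join([quote(term)] + [quote(label) for label in others])
    return quote(term)  # unknown term: leave the query unchanged

print(expand_query("SNCR"))
```

The user types only the name they remember; documents that use any other label of the concept are still found.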
Multi-lingual search based on a thesaurus
clean energy OR energía limpia [Search]
clean energy: preferred label @en
energía limpia: preferred label @es
I want to search in different languages simultaneously.
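A small sketch of the multilingual case, assuming each concept stores one preferred label per language tag. The structure `CONCEPTS` and the function `expand_multilingual` are hypothetical and only illustrate the idea shown on the slide.

```python
# Hypothetical concept store: one preferred label per language tag.
CONCEPTS = [
    {"prefLabel": {"en": "clean energy", "es": "energía limpia"}},
]

def expand_multilingual(term):
    """OR the user's term with the same concept's labels in other languages."""
    needle = term.strip().casefold()
    for concept in CONCEPTS:
        labels = list(concept["prefLabel"].values())
        if any(label.casefold() == needle for label in labels):
            # Stable sort: the label the user typed comes first.
            ordered = sorted(labels, key=lambda label: label.casefold() != needle)
            return " OR ".join(f'"{label}"' for label in ordered)
    return f'"{term.strip()}"'  # unknown term: search it as typed

print(expand_multilingual("clean energy"))
```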
Reveal hidden information and relations
Find documents or images related to any other text.
http://poolparty.punkt.at/demozone
I want to gain background knowledge about a certain document.
Find more specific information – Status quo
Goldman Sachs [Search]
3 different contexts for "Goldman Sachs":
• Bond issuer
• Analyst
• Stock
I want to know more about this entity in a certain context.
Find more specific information with faceted search
• Facets support structured queries
• Facets help to drill down search results and adapt dynamically
• Zero-result queries won't happen anymore
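Why faceted drill-down avoids empty result pages can be shown with a short sketch. The result set, the metadata field `role`, and the function names are assumptions made for illustration; the facet values reuse the three "Goldman Sachs" contexts from the previous slide.

```python
from collections import Counter

# Hypothetical result set for one query; "role" is an assumed metadata field.
RESULTS = [
    {"title": "Doc A", "role": "Bond issuer"},
    {"title": "Doc B", "role": "Analyst"},
    {"title": "Doc C", "role": "Analyst"},
    {"title": "Doc D", "role": "Stock"},
]

def facet_counts(results, field):
    """Count how many of the current results carry each value of `field`."""
    return Counter(doc[field] for doc in results if field in doc)

def drill_down(results, field, value):
    """Keep only the results whose `field` equals the chosen facet value."""
    return [doc for doc in results if doc.get(field) == value]

counts = facet_counts(RESULTS, "role")
print(counts)
print(drill_down(RESULTS, "role", "Analyst"))
```

Facet values are computed from the current result set, so every value offered to the user matches at least one document. Recomputing the counts after each drill-down is what makes the facets adapt dynamically.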
Complex queries with faceted search over linked data
"Show me all airlines whose parent company is Lufthansa"
http://dbpedia.neofonie.de/
My Energy-Dossier about
Find linked information – Status quo
I want to see facts from different sources describing this entity.
The user has to manually put together energy-related information about a country.
360° views: Find linked information
Energy-related information about countries is "mashed" automatically by using "linked data".
http://www.reegle.info/countries
Add personal context to the search
I want the software to understand what I mean by "Jaguar".
Jaguar [Search]
The role of thesauri in semantic search
How vertical search can benefit from knowledge models
The role of thesauri in semantic search
The role of thesauri in semantic search (contd.)
Thesaurus as the central point to control:
• labels & query expansion
• facets
• refine search mechanisms
• metadata integration
Data integration and schema mapping based on thesauri
<person> Thomas Miller</person>
Source 1
<employee> Tom Miller</employee>
Source 2
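A hedged sketch of the schema-mapping step: two sources use different element names for the same kind of thing, and a mapping onto a shared thesaurus concept makes them comparable. The mapping table `FIELD_TO_CONCEPT`, the concept name `Person`, and the function `normalize` are hypothetical. Deciding whether "Thomas Miller" and "Tom Miller" are the same individual is a separate reconciliation step that is not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source-specific element names to one concept.
FIELD_TO_CONCEPT = {"person": "Person", "employee": "Person"}

def normalize(xml_snippet):
    """Parse one source element and label it with the shared concept."""
    element = ET.fromstring(xml_snippet)
    return {
        "concept": FIELD_TO_CONCEPT.get(element.tag),  # None if unmapped
        "value": (element.text or "").strip(),
    }

source_1 = normalize("<person> Thomas Miller</person>")
source_2 = normalize("<employee> Tom Miller</employee>")
print(source_1, source_2)
```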
Usage of linked data for semantic search
1. Align thesaurus concepts with DBpedia resources
– disambiguation!
– performance!
2. Enrich concept with category information
– schema.org / DBpedia ontology
– YAGO/Umbel
3. Use category information for concepts
1. to categorize document (usage of transitivities)
2. to provide search facets
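Step 3 can be sketched as a transitive walk over "broader" links: a document tagged with a concept also belongs to every category above that concept. The hierarchy in `BROADER` is invented for illustration and is not taken from DBpedia, YAGO or Umbel; the function names are hypothetical.

```python
# Hypothetical category hierarchy: concept -> directly broader categories.
BROADER = {
    "Lufthansa": ["Airline"],
    "Airline": ["Company"],
    "Company": ["Organisation"],
}

def ancestors(concept):
    """Collect all categories reachable through broader links (transitive)."""
    seen = set()
    stack = list(BROADER.get(concept, []))
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            stack.extend(BROADER.get(category, []))
    return seen

def categorize(document_concepts):
    """Union of all transitive categories of a document's concepts."""
    categories = set()
    for concept in document_concepts:
        categories |= ancestors(concept)
    return categories

print(sorted(categorize(["Lufthansa"])))
```

The resulting category set can be stored with the document in the index and exposed as a search facet.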
PoolParty Semantic Search (PPS)
Make semantic search come true!
PoolParty System Architecture
[Architecture diagram labels: Search Application, Search Services, Collector (<xml>), Semantic Indexer, Document Index, Cartridge]
Indexing and Mapping with PoolParty
• Metadata Standards
– Rich metadata in a standardized, extensible format (SKOS / RDF)
– Document metadata is mapped to concepts in the thesaurus
• Cost-efficient metadata management
– Thesaurus is managed with PoolParty's easy-to-use Thesaurus Manager
– One central metadata repository
• Improved end-user experience
– Semantic information improves search experience
PoolParty Search API & Standard GUI
• Available web services:
– Search Service
– Suggest Service
– Similarity Service
• Supported formats:
– JSON
– XML
– RSS
http://bit.ly/semantic_search
PoolParty Semantic Search Demo – Result
http://bit.ly/semantic_search
• Select proper facets
• Store queries with search basket
• Facets support structured queries
• Find similar documents for relevant results
• Specify your query with categorised auto-complete
Show cases & Demos
Thesaurus based search on the web & intranet
Show Case No. 1: Semantic Search based on reegle thesaurus
[Architecture diagram labels: Search Application, Search Services, Collector, Semantic Indexer, Document Index, Cartridge, Thesaurus, Projects DB, Web catalogue of actors, Actors DB]
Data integration based on Reegle thesaurus
<sector> Hydro Power small scale</sector>
Actors DB
<category> Micro Hydro</category>
Web catalogue
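One way to read this slide is that differing field values from the two sources are resolved to a thesaurus concept through its labels. The sketch below assumes, without confirmation from the slides, that both values are labels of a single concept; the concept name `small hydropower`, the table `CONCEPT_LABELS`, and the function `concept_for` are hypothetical.

```python
# Assumed thesaurus entry: the actual reegle concept and labels are not
# given in the slides.
CONCEPT_LABELS = {
    "small hydropower": ["Hydro Power small scale", "Micro Hydro"],
}

# Reverse index: any known label (case-insensitive) -> concept.
LABEL_TO_CONCEPT = {
    label.casefold(): concept
    for concept, labels in CONCEPT_LABELS.items()
    for label in labels
}

def concept_for(value):
    """Resolve a source field value to a thesaurus concept, if known."""
    return LABEL_TO_CONCEPT.get(value.strip().casefold())

print(concept_for(" Hydro Power small scale"), concept_for(" Micro Hydro"))
```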
Show Case No. 2: www.reegle.info
Show Case No. 3: Very large financial institute
[Architecture diagram labels: Search Application, Search Services, Collector, Semantic Indexer, Document Index, Cartridge, VLFI Thesaurus, DMS 1, DMS 2]
Contact
Andreas Blumauer
Managing Partner, CEO
a.blumauer@semantic-web.at
Alexander Kreiser
System Architect
a.kreiser@semantic-web.at
Semantic Web Company GmbH
Mariahilfer Straße 70, A-1070 Wien / Austria
+43-1-4021235
http://www.semantic-web.at/
http://www.poolparty.biz/
http://bit.ly/semantic_search
http://lod2.eu/
http://twitter.com/semwebcompany
http://linkd.in/oFFnO4