April 2010
Introduction to KIM platform
Anton Andreev
KIM overview; KIM architecture; KIM UI
Outline
• Overview
• KIM WEB UI Demo
• KIM Architecture
• Deployment Demo
KIM platform #2April 2010
KIM is awesome!
I told you!
KIM Kardashian
KIM Platform
• Semantic annotation of text
– automatic ontology population
– open-domain dynamic semantic annotation of unstructured and semi-structured content for Semantic Web and KM applications
• Indexing and retrieval (semantically-enabled and IE-enhanced search technology)
• Query and exploration of formal knowledge
• Co-occurrence tracking and ranking of entities
• Entity popularity timelines analysis
KIM Fact Sheet
• Runs on many platforms
– Officially on Sun/Oracle JVM on Linux, Windows
– Reported to run on IBM Java 1.6 on PS3, also on x86 OpenSolaris
• Can be used programmatically
• KIM runs as a service and is remotely accessible
– through Java RMI
– through Web services from .NET or other platforms
– through JMS, starting from KIM 3.0
• Can integrate processing resources from GATE
• The PROTON ontology is effectively a dependency
Semantic Annotation
GATE
OWLIM
WEB UI DEMO
• But does it really work?
The main picture
[Architecture diagram. Labels: WWW; Local Network; Aggregator or Crawler; Population Service; Semantic Annotation; Semantic Indexing & Storing; Document & Metadata; Storage; Semantic Index; Multi-paradigm Search/Retrieval; Visual Interface; 3rd party App]
The semantic data path
GATE
Ontology aware annotations
SAR
OWLIM
NLP (Natural Language Processing) phase
Not just annotations, but annotations that carry URIs from the provided ontology
If we have URIs for everything, then nothing stops us from generating RDF
If we have RDF, then we need to store it and merge it with the RDF that is already available
Instance Generator
Generate URIs for the new entities and relations
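The step from "annotations with URIs" to RDF can be sketched in a few lines of plain Java. This is only an illustration of the idea on the slide, not KIM's or GATE's actual API: the `Annotation` class, the `example.org` URIs and the choice of `rdf:type` plus `rdfs:label` as the emitted statements are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: serialise ontology-aware annotations as N-Triples lines. */
class AnnotationToRdf {

    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

    /** An annotation that already carries an instance URI and an ontology class URI (assumed shape). */
    static final class Annotation {
        final String text;
        final String instanceUri;
        final String classUri;

        Annotation(String text, String instanceUri, String classUri) {
            this.text = text;
            this.instanceUri = instanceUri;
            this.classUri = classUri;
        }
    }

    /** Escapes backslashes, quotes and line breaks so the text is a valid N-Triples literal. */
    static String escape(String literal) {
        return literal.replace("\\", "\\\\")
                      .replace("\"", "\\\"")
                      .replace("\n", "\\n")
                      .replace("\r", "\\r");
    }

    /** Emits two statements per annotation: its class and its surface text as a label. */
    static List<String> toNTriples(Annotation a) {
        List<String> triples = new ArrayList<>();
        triples.add("<" + a.instanceUri + "> <" + RDF_TYPE + "> <" + a.classUri + "> .");
        triples.add("<" + a.instanceUri + "> <" + RDFS_LABEL + "> \"" + escape(a.text) + "\" .");
        return triples;
    }

    public static void main(String[] args) {
        Annotation apple = new Annotation(
                "Apple Inc.",
                "http://example.org/kb#Company.1",   // illustrative instance URI
                "http://example.org/onto#Company");  // illustrative class URI
        for (String triple : toNTriples(apple)) {
            System.out.println(triple);
        }
    }
}
```

In a real deployment the resulting statements would be added to the semantic repository and merged with what is already stored there, rather than printed.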
Semantic Repository - Instance URI
[Flow diagram. Labels: Gazetteer; JAPE rules; OrthoMatcher; Instance generator; Instance URI found? (Yes / No); RDF generation; Add/Merge RDF; OWLIM]
• Some entities are identified directly, and we know their instance URI and class in advance
• Building on the work of the gazetteer and using rules, more entities are detected
• Instances of the same entity are merged, e.g. “Apple” and “Apple Inc.”
• An algorithm is used to generate URIs
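The "Instance URI found? Yes / No" decision can be illustrated with a small lookup-or-generate sketch. Everything here is an assumption made for illustration: the in-memory map stands in for lookups against the semantic repository, the alias normalisation is a crude stand-in for what the OrthoMatcher does, and the `namespace + class + "." + counter` naming scheme is invented, since the slides only say that "an algorithm is used".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the "instance URI found?" branch: reuse a known URI or mint a new one. */
class InstanceUriResolver {

    // Stand-in for the semantic repository: normalised alias -> instance URI.
    private final Map<String, String> knownUris = new HashMap<>();
    private final String namespace;
    private int counter = 0;

    InstanceUriResolver(String namespace) {
        this.namespace = namespace;
    }

    /** Normalises an alias so that trivial spelling variants map to the same key. */
    static String key(String alias) {
        return alias.trim().toLowerCase(Locale.ROOT);
    }

    /** Registers an alias whose instance URI is known in advance (the gazetteer case). */
    void addKnown(String alias, String uri) {
        knownUris.put(key(alias), uri);
    }

    /** Returns the existing URI if the alias is known; otherwise generates and remembers a new one. */
    String resolve(String className, String alias) {
        String k = key(alias);
        String uri = knownUris.get(k);
        if (uri == null) {                       // "No" branch: generate a URI
            counter++;
            uri = namespace + className + "." + counter;
            knownUris.put(k, uri);
        }
        return uri;                              // "Yes" branch falls through unchanged
    }

    public static void main(String[] args) {
        InstanceUriResolver resolver = new InstanceUriResolver("http://example.org/kb#");
        // Two aliases of the same entity share one URI, mirroring the "Apple" / "Apple Inc." merge.
        resolver.addKnown("Apple Inc.", "http://example.org/kb#Company.Apple");
        resolver.addKnown("Apple", "http://example.org/kb#Company.Apple");

        System.out.println(resolver.resolve("Company", "apple"));
        System.out.println(resolver.resolve("Person", "Jane Roe"));
    }
}
```

Once every entity has a URI, either reused or newly generated, RDF can be produced for it and added to or merged into the repository.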
Ontotext predefined KB
Document Repository
Document
Full Text Index
Lucene
Storage
File Store
Other index service
Deployment
• How to start KIM
• How to configure KIM
• How to import/populate documents in KIM
– Populator tool
– KimGate
• Documentation location:
– http://ontotext.com/kim/doc/sys-doc/HomePage.html
• KIM 3.0
– No Oracle dependency for some of the functionality
– Pluggable component architecture: this will allow KIM to start without loading the semantic annotation service or the document repository
– Integration with the latest GATE 5.1/5.2
Cool stuff
• How do we do co-occurrence within a single document?
– Using a slightly modified GATE OrthoMatcher processing resource
• How do we do co-occurrence across many documents?
– Using instance URIs and OWLIM
• Optimizations: parallel annotation
– Using multiple GATE pipelines
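The two ideas above, annotating documents in parallel and tracking co-occurrence through instance URIs, can be sketched together in plain Java. This is a toy model under stated assumptions, not KIM code: `annotate` is a placeholder for a real GATE pipeline (it only does substring matching against a made-up lexicon), the `kb:` identifiers are illustrative, and co-occurrence is counted in memory rather than queried from OWLIM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: parallel annotation followed by URI-based co-occurrence counting. */
class CoOccurrenceSketch {

    /** Placeholder for one annotation pipeline: returns the URIs whose (lowercase) surface form occurs in the text. */
    static Set<String> annotate(String text, Map<String, String> lexicon) {
        Set<String> uris = new TreeSet<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : lexicon.entrySet()) {
            if (lower.contains(entry.getKey())) {
                uris.add(entry.getValue());
            }
        }
        return uris;
    }

    /** Annotates each document as a separate task on a fixed-size thread pool; results keep document order. */
    static List<Set<String>> annotateAll(List<String> docs, Map<String, String> lexicon, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Set<String>>> futures = new ArrayList<>();
            for (String doc : docs) {
                Callable<Set<String>> task = () -> annotate(doc, lexicon);
                futures.add(pool.submit(task));
            }
            List<Set<String>> results = new ArrayList<>();
            for (Future<Set<String>> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("annotation interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("annotation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Counts, for every unordered pair of URIs, the number of documents in which both appear. */
    static Map<String, Integer> coOccurrences(List<Set<String>> urisPerDocument) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> uris : urisPerDocument) {
            List<String> sorted = new ArrayList<>(new TreeSet<>(uris));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    counts.merge(sorted.get(i) + " | " + sorted.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> lexicon = new TreeMap<>();
        lexicon.put("apple", "kb:Apple");
        lexicon.put("jane roe", "kb:JaneRoe");
        lexicon.put("acme", "kb:Acme");

        List<String> docs = new ArrayList<>();
        docs.add("Apple hired Jane Roe.");
        docs.add("Jane Roe left Apple Inc.");
        docs.add("Acme sued Apple.");

        System.out.println(coOccurrences(annotateAll(docs, lexicon, 2)));
    }
}
```

Because both "Apple" and "Apple Inc." resolve to the same identifier, the pair is counted once per document regardless of the surface form, which is the reason for keying co-occurrence on instance URIs instead of raw strings.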
Links
• http://ontotext.com/kim
• http://ontotext.com/kim/doc/sys-doc/HomePage.html
• http://debian.fmi.uni-sofia.bg/~toncho/myblog/plugin/tag/kim
• http://debian.fmi.uni-sofia.bg/~toncho/myblog/plugin/tag/gate
• http://code.google.com/p/kimnetdemos
Thank you!
Questions?