Nuxeo Iks 2009 11 13

Post on 09-May-2015

description

Short introductory slides presenting some of the work done in the Scribo project to extract named entities from text documents with a UIMA engine.

transcript

Olivier Grisel - 2009-11-13 - IKS

Semantic Lifting: Named Entities Extraction with UIMA

Thursday, November 12, 2009

Nuxeo

• Open Source ECM

• Nuxeo DM 5.3 available

• office document management with workspaces

• download it at http://nuxeo.com

• Soon: Nuxeo DAM

• Multimedia content

• Full Ajax search-based browsing

http://SCRIBO.ws

• Goal: content to knowledge using ontologies

• 3 academic research teams

• 2 NLP startups

• 2 Open Source ECM / Wiki software editors

• 2 use case providers:

• News agency

• Linux distribution

UIMA

• Chain components to extract annotations on text and images

• Initially developed by IBM

• Now an Apache Software Foundation project

• Several existing components (OpenNLP, ClearTK, ...)

• Easy to wrap new libraries as UIMA annotators
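
The chaining idea above can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the UIMA API and not the Scribo code: `Document` stands in for the shared analysis structure that UIMA passes between components, and `token_annotator`, `make_gazetteer_annotator` and `run_chain` are hypothetical names invented for this sketch. The gazetteer helper shows how an existing resource (here a simple list of names) might be wrapped as one more annotator in the chain.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed span over the document text."""
    begin: int
    end: int
    type: str


@dataclass
class Document:
    """Shared container handed from one annotator to the next (assumption:
    a simplified stand-in for the structure UIMA passes along a chain)."""
    text: str
    annotations: list = field(default_factory=list)

    def covered_text(self, ann):
        return self.text[ann.begin:ann.end]


def token_annotator(doc):
    """Add one Token annotation per run of word characters."""
    for m in re.finditer(r"\w+", doc.text):
        doc.annotations.append(Annotation(m.start(), m.end(), "Token"))


def make_gazetteer_annotator(names, type_name):
    """Wrap a plain list of names as an annotator (hypothetical helper)."""
    def annotate(doc):
        for name in names:
            for m in re.finditer(re.escape(name), doc.text):
                doc.annotations.append(Annotation(m.start(), m.end(), type_name))
    return annotate


def run_chain(text, annotators):
    """Run every annotator in order over the same document."""
    doc = Document(text)
    for annotate in annotators:
        annotate(doc)
    return doc


chain = [
    token_annotator,
    make_gazetteer_annotator(["IBM", "Apache Software Foundation"], "Organization"),
]
doc = run_chain("UIMA was initially developed by IBM.", chain)
orgs = [doc.covered_text(a) for a in doc.annotations if a.type == "Organization"]
print(orgs)
```

Each annotator only reads the text and appends annotations, so components can be reordered or swapped without touching the others.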

Scribo UIMA chain

Scribo UIMA chain editor

Embedded UIMA chain

It’s Open Source

• Clone it!

• http://hg.nuxeo.org/sandbox/scribo

• http://hg.nuxeo.org/sandbox/nuxeo-uima

• Give me feedback!

• http://twitter.com/ogrisel
