Date post: 04-Aug-2020
FREME Webinar for GALA – April 2016 WWW.FREME-PROJECT.EU

Co-funded by the Horizon 2020 Framework Programme of the European Union, Grant Agreement Number 644771

FREME WEBINAR HELD FOR GALA, 28 APRIL 2016

A FRAMEWORK FOR MULTILINGUAL AND SEMANTIC ENRICHMENT OF DIGITAL CONTENT (NEW L10N BUSINESS OPPORTUNITIES)

www.freme-project.eu Presented by Tatjana Gornostaja (Tilde) and Felix Sasaki (DFKI / W3C Fellow)


OVERVIEW

• Introduction

• Technological aspects of the framework

• Localization and other FREME business cases

• Q&A

Coupling Knowledge and Language via e-Service Ecosystem

[Figures: “Knowledge” and “Language”; “FREME”. Picture: coloringpageswallpaper.com]

THE FREME PROJECT

• Two-year H2020 Innovation Action; start: February 2015

• Industry partners leading four business cases around digital content and (linked) data

• Technology development bridging language and data

• Outreach and business modelling demonstrating monetization of the multilingual data value chain


CURRENT STATE OF SOLUTIONS

Machine translation, terminology annotation, ...

Linked data creation & processing

GAPS THAT HINDER BUSINESS:

• Plethora of formats

• Adaptability and platform dependency

• Language coverage

• Usability “The right tool for the right person in given and new enterprises”: technology influences job profiles


FREME TO THE RESCUE: ENRICHING DIGITAL CONTENT

Machine translation, terminology annotation, ...

Linked data creation & processing

LT and LD as first class citizens on the Web

A SET OF INTERFACES* - DESIGN DRIVEN BY BUSINESS CASES

LT and LD for various user types: (application) developer, content architect, content author, …

* Graphical interfaces

* Software interfaces


OVERVIEW

• Introduction

• Technological aspects of the framework

• Localization and other FREME business cases

• Q&A


FREME FROM A TECHNICAL PERSPECTIVE

A framework for multilingual and semantic enrichment of digital content that provides access via a set of APIs and GUIs to six e-Services:

• e-Entity for enriching content with information on named entities;

• e-Link for enrichment with linked data sources;

• e-Terminology for detecting terms and enriching them with term-related information;

• e-Translation for providing custom machine translation systems;

• e-Internationalisation for processing a variety of digital content formats; and

• e-Publishing for exporting the outcome of enrichment processes in the ePub format.


FREME FROM A TECHNICAL PERSPECTIVE

How to access FREME – several options:

• A live version 0.5 (0.6 soon to be released!), including documentation, at http://api.freme-project.eu/doc/current/

• A development version at http://api-dev.freme-project.eu/doc/

• A Java / Maven software package; see the documentation for installation instructions

• Source code in a GitHub project https://github.com/freme-project/

• The framework is available under the Apache 2.0 license to ease commercial use

• Underlying services have various licensing conditions


LINGUISTIC LINKED DATA AND OTHER STANDARDS PUT IN ACTION VIA FREME

• NIF (Natural Language Processing Interchange Format) for representing digital content and enrichment information in a format-agnostic manner, based on the linked data stack;

• OntoLex lemon for representing lexical information, to be used e.g. for improving machine translation output;

• Internationalization Tag Set (ITS) 2.0 for representing various types of enrichment information in a standardized manner, related e.g. to terminology and named entities; and

• The general linked data technology stack (RDF, SPARQL etc.)

FREME is built on outcomes of standards-driving projects in FP7 in the area of linguistic linked data: LIDER and FALCON

Cf. http://lider-project.eu/ and http://falcon-project.eu/


EXAMPLE API CALL

• The request is made to the API for the e-Entity service, a service that enriches content with named entities.

• The input format of the content is plain text; the output format is Turtle.

• The content to enrich is “Welcome to the city of Prague”.

• The language of the content is English.

• The dataset used for the enrichment is DBpedia.
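
The call described above can be sketched in Python as follows. This is a minimal illustration, not the documented API: the endpoint path in BASE_URL and the query parameter names (informat, outformat, language, dataset) are assumptions that should be checked against the FREME API documentation. The sketch only builds the request object and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: illustrative endpoint path; consult the FREME API docs for the real one.
BASE_URL = "http://api.freme-project.eu/current/e-entity/freme-ner/documents"


def build_entity_request(text, language="en", dataset="dbpedia"):
    """Build (but do not send) a POST request for named-entity enrichment."""
    # ASSUMPTION: parameter names are hypothetical stand-ins for the four
    # settings listed on the slide (input format, output format, language, dataset).
    params = {
        "informat": "text",
        "outformat": "turtle",
        "language": language,
        "dataset": dataset,
    }
    url = BASE_URL + "?" + urlencode(params)
    return Request(
        url,
        data=text.encode("utf-8"),  # the content to enrich travels in the body
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


request = build_entity_request("Welcome to the city of Prague")
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen; that step is left out here so the sketch runs without network access.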


EXAMPLE OUTPUT: USING NIF TO STORE CONTENT …

(1) <http://freme-project.eu/#char=0,29>

(2) a nif:String , nif:Context , nif:RFC5147String ;

(3) nif:beginIndex "0"^^xsd:int ;

(4) nif:endIndex "29"^^xsd:int ;

(5) nif:isString "Welcome to the city of Prague"^^xsd:string .

1) Identifying the content via a URI

2) Adding certain types from NIF*

3) Identifying the start offset of the content

4) Identifying the end offset of the content

5) Providing the string content itself

* For more on NIF, see a dedicated tutorial: http://de.slideshare.net/m1ci/nif-tutorial


… AND ENRICHMENT INFORMATION

(1) <http://freme-project.eu/#char=23,29> …

(2) nif:anchorOf "Prague"^^xsd:string ;

(3) nif:beginIndex "23"^^xsd:int ;

(4) nif:endIndex "29"^^xsd:int ;

(5) nif:referenceContext <http://freme-project.eu/#char=0,29> ;

(6) itsrdf:taClassRef <http://dbpedia.org/ontology/City>.

1) Identifying the annotation via a URI

2) Providing the string content of the annotation

3) Identifying the start offset of the annotation

4) Identifying the end offset of the annotation

5) Relating the annotation to the content it refers to

6) Enrichment with ITS 2.0 class information (“Prague” = a city)
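
The character offsets in the two NIF examples can be reproduced with a few lines of Python. This is a sketch for illustration only: the helper names nif_uri and annotate are hypothetical, and the offsets are plain Python string indices, which is sufficient for ASCII text like the example sentence.

```python
def nif_uri(base, begin, end):
    """Build a character-range fragment URI of the form used on the slides."""
    return f"{base}#char={begin},{end}"


def annotate(text, mention, base="http://freme-project.eu/"):
    """Return the context URI, annotation URI and offsets for the first match."""
    begin = text.index(mention)  # raises ValueError if the mention is absent
    end = begin + len(mention)   # end offset is exclusive, as in the example
    return {
        "context": nif_uri(base, 0, len(text)),
        "annotation": nif_uri(base, begin, end),
        "beginIndex": begin,
        "endIndex": end,
        "anchorOf": text[begin:end],
    }


result = annotate("Welcome to the city of Prague", "Prague")
print(result["context"])     # http://freme-project.eu/#char=0,29
print(result["annotation"])  # http://freme-project.eu/#char=23,29
```

The annotation's range always lies inside the range of its reference context, which is what allows a consumer to map an enrichment back onto the original string.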


SIMPLIFIED OUTPUT HELPS API DEVELOPERS TO CONSUME LINKED DATA

• FREME provides a user-specified filter mechanism to simplify the output

• Supports CSV, XML or JSON

• Example output as CSV

http://dbpedia.org/resource/Prague,50.0878367932108,14.4241322001241

For more information on filtering, see

http://api.freme-project.eu/doc/current/knowledge-base/filtering.html
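
A consumer of the filtered CSV output might read it as sketched below. The column meanings (entity URI, latitude, longitude) are an assumption inferred from the example row; the slide does not name the columns, and the actual columns depend on the filter that was defined.

```python
import csv
import io

# The example row from the slide, as it might arrive in a response body.
csv_output = "http://dbpedia.org/resource/Prague,50.0878367932108,14.4241322001241\n"


def parse_rows(text):
    """Parse filtered CSV output into dictionaries.

    ASSUMPTION: three columns in the order entity URI, latitude, longitude.
    """
    rows = []
    for uri, lat, lon in csv.reader(io.StringIO(text)):
        rows.append({"entity": uri, "lat": float(lat), "long": float(lon)})
    return rows


places = parse_rows(csv_output)
print(places[0]["entity"], places[0]["lat"], places[0]["long"])
```

Compared with parsing the full Turtle output, this needs no RDF library, which is the point of the simplified output.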


FORMAT COVERAGE

• Processing of various content formats

◦ NIF, RDF, Text, HTML, OpenOffice, XLIFF 1.2, various XML formats, …

• Many formats are processed via e-Internationalisation services

• The format is specified in the API call for input and (with partial support) for output


USING E-TERMINOLOGY WITH HTML OUTPUT

<!DOCTYPE html> …

<body>

<p>Welcome to the city of Prague.</p>

</body> … </html>

<!DOCTYPE html> …

<p>Welcome to the <span its-term="yes">city</span> of Prague.</p> … </html>

Call of e-Terminology
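
Once the HTML comes back with ITS 2.0 markup, the detected terms can be pulled out with the standard library alone. The following is a minimal sketch, assuming terms are marked with its-term="yes" as in the example; the class name TermCollector is hypothetical, and void elements nested inside a term (such as a bare br tag) are not handled.

```python
from html.parser import HTMLParser


class TermCollector(HTMLParser):
    """Collect the text of every element marked with its-term="yes"."""

    def __init__(self):
        super().__init__()
        self.terms = []
        self._depth = 0    # > 0 while inside a term element
        self._buffer = []  # text pieces of the term being read

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1  # nested element inside a term
        elif dict(attrs).get("its-term") == "yes":
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.terms.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


collector = TermCollector()
collector.feed('<p>Welcome to the <span its-term="yes">city</span> of Prague.</p>')
print(collector.terms)  # ['city']
```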


TRANSLATING XLIFF CONTENT WITH E-TRANSLATION

...<trans-unit>

<source>This is a car</source>

</trans-unit> ...

<http://freme-project.eu/#char=0,13>

nif:isString "This is a car"@en ;

itsrdf:target "Dies ist ein Auto"@de .

Call of e-Translation
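
To see how an XLIFF source segment lines up with the NIF output above, the segment can be extracted and its character range computed as sketched here. The wrapper elements, the file attributes and the trans-unit id in the sample are made up for illustration; only the source text comes from the slide. This is not how FREME itself processes XLIFF, just a way to inspect the input.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# ASSUMPTION: a minimal, hypothetical XLIFF 1.2 document around the slide's segment.
sample = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>This is a car</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def source_segments(xliff_text):
    """Return (trans-unit id, source text) pairs from an XLIFF 1.2 document."""
    root = ET.fromstring(xliff_text)
    segments = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.find("x:source", XLIFF_NS)
        segments.append((unit.get("id"), "".join(source.itertext())))
    return segments


segments = source_segments(sample)
for unit_id, text in segments:
    # The fragment mirrors the #char=0,13 range shown in the NIF output.
    print(unit_id, text, f"#char=0,{len(text)}")
```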


IMPROVING E-TRANSLATION OUTPUT VIA E-TERMINOLOGY

“The EU in brief. The EU is a unique economic and political partnership between 28 European countries that together cover much of the continent.”

continent, partnership, briefing, economics, covering

Call of e-Terminology: detection of translation suggestions

De voorschriften in DE EU. De EU is een uniek partnerschap tussen politiek en economie in de Europese landen, die gezamenlijk 28 verpakking van het continent.

Call of e-Translation: improved output!


OVERVIEW

• Introduction

• Technological aspects of the framework

• Localization and other FREME business cases

• Q&A


MOTIVATION

• Aid translators

◦ Supplement typical linguistic support tools like glossary look-up with entity recognition and term disambiguation

◦ Possibility to introduce proprietary and domain-specific semantic datasets

• Provide “Value-Add” to customers

◦ Make their content more interactive, compelling and discoverable

◦ Open up service offerings to new customers from existing and new channels


TRANSLATOR SUPPORT

• Automatic machine translation suggestions

• Automatic terminology look-up

◦ Includes definitions

• Automatic Entity Recognition

◦ Includes many textual and visual contextual properties: descriptions, images, links to other resources…


CUSTOMER VALUE-ADD

• Relationships can be formed between new content and existing knowledge resources

• Utilize open and private Multilingual Linked Data Cloud

DBpedia

Proprietary dataset

Translated content


BUSINESS BENEFITS

• Technological Support to Content Authors and Localizers

◦ Aid with the cognitive and physical tasks of finding and employing the most appropriate terminology

• Opens up Conversations with New Customers

• Deliver semantically richer, more interactive, highly sociable and discoverable content

◦ Through integration, automatically added enrichment can be validated by a human and saved with the content

• Demonstrates Vistatec thought leadership to customers looking for service differentiators and value add


CHALLENGE AND OPPORTUNITY: BIG DATA IS GROWING ACROSS LANGUAGES, SECTORS AND DOMAINS

• BC: Digital publishing

• BC: Translation and localisation

• BC: Agriculture and food domain data

• BC: Web site personalisation

Agriculture metadata, user content, news content, …

WHAT LIES AHEAD FOR SEVERAL INDUSTRIES? SEE THE FREME BUSINESS CASES

[Figure labels: EN, ES, JA, ZH, ..., AR]


DIGITAL PUBLISHING

With a simple click you can fetch extra information from a dataset and use it to annotate content.


AGRICULTURE AND FOOD DATA

Domain experts can automatically extract terms from title, description, abstracts and full text.


PERSONALISATION OF WEB CONTENT

Businesses can identify the topics their customers are engaging with, focusing their global content strategy.


OVERVIEW

• Introduction

• Technological aspects of the framework

• Localization and other FREME business cases

• Q&A

