Page 1: CLARIN - a European Research Infrastructure

Peter Wittenburg
Max-Planck Institut für Psycholinguistik, Nijmegen

Page 2: eResearch - Infrastructures

Bozen, 16.9.2010

www.clarin.eu

J. Taylor: "eScience is about global collaboration in key areas of science and the next generation of infrastructures that will enable it"

Requires new persistent platforms

- to enable researchers to combine resources and tools to solve the big challenges of today (global migration, crisis of cultures and minds)
- to increase the efficiency of researchers in the many small tasks
- 40 % of the time of "knowledge workers" is spent finding useful material (Forrester Research)

Page 3: CLARIN Goal

What: Offer a distributed Research Infrastructure of integrated and interoperable Language Resources and Tools that serves researchers and students in the social sciences and humanities (SSH)

How:
- allow the combination of existing and web-accessible digital centers hosting resources in a common federation
- offer language tools and services as distributed services with a common web interface

Page 4: Key Application/Mission

A researcher authenticates at their own organization, creates a virtual collection of resources from different repositories, and executes a virtual pipeline of processes on them.

King Arthur failed by the way

will CLARIN fail as well?
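
The key application described on this slide can be pictured as a small data model. The Python sketch below is illustrative only: the names `Resource`, `VirtualCollection` and `run_pipeline`, the identifiers, and the centre names are all hypothetical and do not come from any CLARIN API. Authentication at the home organization is left out; the sketch only shows resources from different repositories being grouped and then passed through a chain of processing steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Resource:
    pid: str         # persistent identifier (made-up value)
    repository: str  # centre hosting the resource (made-up name)
    text: str        # stand-in for the actual content


@dataclass
class VirtualCollection:
    name: str
    resources: List[Resource] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)

    def repositories(self) -> List[str]:
        # The collection is "virtual": it only references resources
        # that stay in their home repositories.
        return sorted({r.repository for r in self.resources})


Step = Callable[[List[str]], List[str]]


def run_pipeline(collection: VirtualCollection, steps: List[Step]) -> List[str]:
    # Feed the content of every resource through each step in order.
    data = [r.text for r in collection.resources]
    for step in steps:
        data = step(data)
    return data


def tokenize(texts: List[str]) -> List[str]:
    return [token for text in texts for token in text.split()]


def lowercase(tokens: List[str]) -> List[str]:
    return [token.lower() for token in tokens]


collection = VirtualCollection("demo")
collection.add(Resource("hdl:0000/example-1", "centre-a", "Guten Tag"))
collection.add(Resource("hdl:0000/example-2", "centre-b", "Dobar dan"))

result = run_pipeline(collection, [tokenize, lowercase])
print(collection.repositories())
print(result)
```

In a real federation each step would presumably be a remote web service rather than a local function; local functions are used here only to keep the sketch self-contained.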

Page 5: CLARIN is pan-European

CLARIN:
• 3-year preparatory phase
• ~ 200 members
• ~ 25 centre candidates

Page 6: CLARIN Work Dimensions

- how to come to a persistent and stable infrastructure? -> community centres
- how to come to a federation and how to get access? -> service provider federation
- how to make all of their LRT visible? -> CMDI future & short term solution
- how to come to interoperable services? -> service oriented architecture
- how to get it all together for user services? -> pan-European demo cases

CLARIN has other very important aspects:
• Relation with SSH disciplines - mainly driven by national funds
• Education/Training, Help/Support/Advice, Dissemination
• Harmonization of licensing and Codes of Conduct
• Specification of the ERIC legal framework to ensure persistency

... at least IT oriented aspects

Page 7: Community Centres

CLARIN Centres

Centres Criteria

Long-term Preservation

REPLIX Replication

25 Centre Candidates

all are busy with restructuring plans

2 already give long-term preservation service

Page 8: Service Provider Federation

Trust Domain

Initial Federation

PID Service

set up federation technology

build initial federation

set up EPIC service

central user attribute server

• Service Provider Federation
  • Agreement 1
  • n centers as members
• Link up with national identity federations (IdFs)
  • Agreement 2
  • DFN (DE)
  • HAKA (FI)
  • SURFnet (NL)
  • 1 million potential user IDs
• more countries and centers are currently joining

http://www.pidconsortium.eu

Page 9: Metadata Domain

Component Metadata

Metadata now

Virtual Collection

CMDI Infra

ISOcat development

set up OAI-PMH machinery

ISOcat Registry

VLO Observatory

Category Definition

LRT Inventory

Virtual Language World

ARBIL MD Editor

ISOcat concept registry

component editor

myprofile

metadata editor

metadata descriptions

CLARIN component registry

user area

component registration

concept registration

this is where the ILSP team played a central role

Page 10: Service Oriented Architecture

Service Oriented Infrastructure

Web Services Interoperability

Standards & Best Practices

Service Framework Specification

Web Service and Processing Chains

Standards and Best Practices

Web 2.0 Application for Tool Chaining and Execution

Repository

Stuttgart, Tübingen, Berlin, Leipzig, Finland

Standard-conformant Text Corpus Encoding

Stuttgart, Tübingen, Leipzig, Romania

Page 11: Demo Cases (just started)

EU Identity Index Case

Multimedia/multimodal Case

Folkstory Case

C4/WebLicht Corpus Case

Page 12: not alone ...

EUDAT

Meta-Net

Page 13: need to take care of data ...

Architecture created by the EC High Level Expert Group will be a guideline for the coming decades

Data generators, Users

Common Data Services

Community Support Services

User functionalities, data capture & transfer, virtual research environments

Data discovery & navigation, workflow generation, annotation, interpretability

Safe & persistent storage, identifiers, authenticity, workflow execution, mining

CLARIN, DARIAH etc.

Data e-Infrastructure

Page 14: why European?

we live in a multilingual Europe with a joint historical tradition and need to exploit this strength

many research questions are cross-national

required standards cannot be national

sharing costs in all respects is more efficient

finally, it is also about global competition in the SSH

Page 15: Why now?

there is the ESFRI process and all countries are synchronized, which is a unique chance to build infrastructures

in total there are 44 initiatives on the ESFRI roadmap, and there is potential gain from an ecosystem of RIs

we need to organize our resource domain due to the huge increase of data (MPI: 200 TB)

we need to take care not to lose our cultural and scientific memory

there is a huge uptake of RIs and there will be many funding streams!

Page 16: who and when?

current EU CLARIN consortium in prep phase (2008-10): 32 partners from 24 countries

CLARIN construction phase from 2011; main funds from national programs, but additional EC funding streams connected to RIs

legal issue: foundation of a European Research Infrastructure Consortium (ERIC) as the basis for the future, with automatic qualification to participate in programs

Page 17: Organisation of the CLARIN ERIC

CLARIN, Utrecht, March 2010

www.clarin.eu

Page 18: who seems to be on board?

Belgium, Bulgaria, Germany, Denmark, Estonia, Latvia, Finland, Croatia, Netherlands, Norway, Austria, Portugal, Spain, Czech Republic, Hungary, South Tyrol, ?

Some are discussing: FR, SW, GR?, etc.

Page 19: Advantage of membership

privileged access to CLARIN federation

networked with CLARIN centres (direct technology transfer)

a say when discussing priorities, agreements, best practices

access to EC funding streams

access to education and training programs to make our young generation competitive

Page 20: Further Information

CLARIN web site: http://www.clarin.eu
CLARIN office: [email protected]
CLARIN Newsletter: http://www.clarin.eu/newsletter
CLARIN members: http://www.clarin.eu/members

Page 21: Thanks for your attention.

Page 22: CLARIN Usage Scenario

Scenario: A Serbian and a German PhD student want to study language variation in the Balkan area

Resource: via VLO they find all relevant language variation data for that area

Tools/Services: Modern clustering methods available via the web make it possible to quickly build dialect continua on top of a geographic map; visualization services can be chained after them to produce a clear output
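
To make the clustering step in this scenario concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the method used by any CLARIN service: the site names and word forms are invented toy data, the distance between sites is taken to be the mean Levenshtein distance over a shared word list, and the grouping uses plain single-linkage agglomerative clustering. The function names `edit_distance`, `site_distance` and `single_linkage` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def site_distance(forms_a: List[str], forms_b: List[str]) -> float:
    # Mean edit distance over the aligned word list of two sites.
    total = sum(edit_distance(x, y) for x, y in zip(forms_a, forms_b))
    return total / len(forms_a)


def single_linkage(sites: Dict[str, List[str]], k: int) -> List[Set[str]]:
    # Start with one cluster per site and merge the closest pair
    # of clusters until only k clusters remain.
    clusters: List[Set[str]] = [{name} for name in sites]
    while len(clusters) > k:
        best = None
        for c1, c2 in combinations(clusters, 2):
            d = min(site_distance(sites[a], sites[b]) for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        _, c1, c2 = best
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 | c2)
    return clusters


# Invented word forms for two concepts at four sites (not survey data).
sites = {
    "site_1": ["kata", "melo"],
    "site_2": ["kata", "mela"],
    "site_3": ["gudo", "miru"],
    "site_4": ["gudu", "miru"],
}

clusters = single_linkage(sites, k=2)
print(sorted(sorted(c) for c in clusters))
```

The resulting cluster labels could then be handed to a separate visualization step that colours the sites on a map, which is the kind of chaining the scenario describes.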

Page 23: Visualization of Dialect Data: Clustering

Page 24: CLARIN Usage Scenario

Scenario: Linguists, sociologists and ethnologists want to study the cultural and linguistic differences of parliamentary debates in SE, DE and GR about the swine flu and compare how such global problems are dealt with

Resource: building a virtual collection of all debates (audio, video, transcription)

Tools/Services: allowing researchers to analyse and annotate gestures, intonation, word choices, timing etc., where powerful computers partly need to be used

Vision: in 2011/12 such computational services will be made available in CLARIN 2011

