Helping online communities to semantically enrich folksonomies

Post on 15-Mar-2016


Helping online communities to semantically enrich folksonomies. Freddy Limpens, Fabien Gandon (Edelweiss, INRIA Sophia Antipolis), {freddy.limpens, fabien.gandon}@inria.fr; Michel Buffa (Kewi, I3S, Université Nice-Sophia Antipolis). ISICIL, May 2010.


Helping online communities to semantically enrich folksonomies

ISICIL, May 2010

Freddy Limpens, Fabien Gandon (Edelweiss, INRIA Sophia Antipolis)

{freddy.limpens, fabien.gandon}@inria.fr

Michel Buffa (Kewi, I3S, Université Nice-Sophia Antipolis)

How to turn folksonomies ...

... into comprehensible topic structures?

[Figure: example topic structure around the tag "pollution": "pollution" has narrower "Soil pollutions"; "pollution" is related to "pollutant" and to "Energy".]

... without overloading users

... and by collecting all users' expertise into the process

Our approach

Integrate usage analysis for a tailored solution

Support diverging points of view

Automatic processing + human expertise through user-friendly interfaces

Concrete scenario

Experts: produce docs + tag

Archivists: centralize + tag

Public audience: read + tag

1. Modeling statements about tags

Supporting diverging points of view

[Figure: the statement "car skos:related pollution".]

[Figure: John agrees and Paul disagrees with the statement "car skos:related pollution".]

[Figure: the statement "car skos:related pollution" is modeled as a tagSemanticStatement (named graph); John hasApproved it, Paul hasRejected it.]
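The modeling idea on these slides, a tag relation wrapped as a statement that individual users approve or reject, could be sketched roughly as follows. This is a plain-Python illustration, not the authors' RDF/named-graph implementation: the class name `TagStatement`, its fields, and the rule that a user's latest opinion replaces the earlier one are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TagStatement:
    """One semantic statement about two tags, e.g. car skos:related pollution.

    Hypothetical stand-in for the slides' tagSemanticStatement (named graph).
    """
    subject: str
    relation: str
    obj: str
    approved_by: set = field(default_factory=set)  # users linked by hasApproved
    rejected_by: set = field(default_factory=set)  # users linked by hasRejected

    def approve(self, user: str) -> None:
        # Assumption: a user holds one opinion at a time per statement.
        self.rejected_by.discard(user)
        self.approved_by.add(user)

    def reject(self, user: str) -> None:
        self.approved_by.discard(user)
        self.rejected_by.add(user)


statement = TagStatement("car", "skos:related", "pollution")
statement.approve("John")
statement.reject("Paul")
print(sorted(statement.approved_by), sorted(statement.rejected_by))
```

Keeping the opinions on the statement rather than on the tags is what lets John and Paul disagree without either view overwriting the other.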

2. Folksonomy enrichment life cycle

Dataset

                         Delicious                            TheseNet                             CADIC
What                     Bookmarks of users of tag "ademe"    Keywords for Ademe's PhD projects    Archivists' indexing lexicon
# tags                   1015                                 6583                                 1439
# posts                  1013                                 1425                                 4675
# (restricted tagging)   3015                                 10160                                25515
# users                  812                                  1425                                 1

[Figure: folksonomy enrichment life cycle. A flat folksonomy becomes a structured folksonomy through: adding tags, automatic processing, user-centric structuring, conflict detection, global structuring.]

1. String-based metrics

[Figure: string comparison of the tags "pollution", "pollutant" and "Soil pollutions".]

Evaluation of 30 edit distances

Combining the best metrics

Needs to be complemented!

Evaluated against Ademe expert's point of view:

Related: 55.10%

Broader/narrower: 32%

CloseMatch: 12.82%
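To make the string-based step concrete, here is a minimal sketch of one edit distance applied to tags. The slides evaluate 30 metrics and combine the best ones without naming them in this transcript, so plain Levenshtein distance is used here purely as a stand-in; the `similarity` normalization and the lower-casing are assumptions of the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Distance rescaled to [0, 1]; 1.0 means identical up to case."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


for other in ("pollutant", "pollutions", "Soil pollutions"):
    print("pollution", other, round(similarity("pollution", other), 2))
```

A score like this only says that two tags look alike; deciding whether the pair is related, broader/narrower or a close match needs further evidence, which is consistent with the slides' remark that string metrics need to be complemented.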

Result on full dataset

[Figure: graph of tag relations on the full dataset. Node size ↔ in-degree. Legend: tags (Delicious + TheseNet); svic keywords.]

2. Tri-partite structure of folksonomies

[Fig. from Markines et al. (2009)]

Association via: users, tags

Tag-based association:

        tag1                tag2                tag3
tag1    freq(tag1)          cooc(tag1, tag2)    cooc(tag1, tag3)
tag2    cooc(tag2, tag1)    freq(tag2)          cooc(tag2, tag3)
tag3    cooc(tag3, tag1)    cooc(tag3, tag2)    freq(tag3)

=> gives "related" relations
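The tag-by-tag matrix above can be built with a few lines of counting. The sketch below assumes each post is simply a list of tags; the function name `cooccurrence` and the sample posts are made up for illustration and do not come from the CADIC, Delicious or TheseNet data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(posts):
    """Return (freq, cooc) counters for a list of posts, each a collection of tags."""
    freq = Counter()  # diagonal of the matrix: freq(tag)
    cooc = Counter()  # off-diagonal cells: cooc(tag_a, tag_b)
    for tags in posts:
        unique = sorted(set(tags))
        freq.update(unique)
        for tag_a, tag_b in combinations(unique, 2):
            cooc[(tag_a, tag_b)] += 1
            cooc[(tag_b, tag_a)] += 1  # the matrix is symmetric
    return freq, cooc


# Hypothetical posts, for illustration only.
posts = [
    ["pollution", "energy"],
    ["pollution", "soil"],
    ["pollution", "energy", "soil"],
]
freq, cooc = cooccurrence(posts)
print(freq["pollution"], cooc[("pollution", "energy")])
```

Because co-occurrence is symmetric, it carries no direction, which is why this association yields "related" links rather than broader/narrower ones.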

[Figure: example result of tag-based association on the CADIC dataset.]

User-based association (Mika):

[Figure: the tags "environnement" and "agriculture" and the users associated with them (U1, U2, U3, U4, U6).]

=> gives "subsumption" relations
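One way to read the user-based association is as a set-inclusion test: if nearly everyone who uses one tag also uses another, more widely used tag, the second is a candidate broader tag. The slide only names Mika's method, so the function below, its `min_inclusion` threshold, and the assignment of users to tags are assumptions for illustration, not necessarily the measure the authors used.

```python
def broader_candidates(tag_users, min_inclusion=0.8):
    """Return (narrower, broader, score) triples based on user-set inclusion."""
    pairs = []
    for narrow, narrow_users in tag_users.items():
        for broad, broad_users in tag_users.items():
            if narrow == broad or not narrow_users:
                continue
            # Only a strictly larger community can be the broader one.
            if len(broad_users) <= len(narrow_users):
                continue
            # Share of the narrower tag's users who also use the broader tag.
            score = len(narrow_users & broad_users) / len(narrow_users)
            if score >= min_inclusion:
                pairs.append((narrow, broad, score))
    return pairs


# Hypothetical user sets; the slide does not say which user uses which tag.
tag_users = {
    "environnement": {"U1", "U2", "U3", "U4", "U6"},
    "agriculture": {"U2", "U3"},
}
print(broader_candidates(tag_users))
```

Unlike co-occurrence counts, inclusion is asymmetric, which is what makes a directed "has broader" link possible.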

[Figure: example result on the Delicious dataset. Arrows mean "has broader"; thickness ≈ weight.]

[Figure: example result on the TheseNet dataset. Arrows mean "has broader"; thickness ≈ weight.]


Embedding structuring tasks within everyday activities (e.g. searching)

Capturing the user's point of view

ADEME experiment


Conflict detection

[Figure: conflicting statements about "environment" and "pollution": some users say narrower, others say broader.]

Using rules, e.g.:

IF num(narrower)/num(broader) ≥ c THEN narrower wins ELSE 'more generic' wins
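The rule above translates almost directly into code. In this sketch the function name, the handling of the zero-vote cases, and the requirement that `c` be positive are assumptions; the slides give neither a value for `c` nor a precise definition of the "more generic" outcome, so it is returned here as a plain label.

```python
def resolve_conflict(num_narrower: int, num_broader: int, c: float) -> str:
    """Apply the rule: narrower wins if num(narrower)/num(broader) >= c.

    Assumes c > 0. Returns the label 'narrower' or 'more generic'.
    """
    if num_narrower == 0 and num_broader == 0:
        raise ValueError("no statement to arbitrate")
    # Cross-multiplied form of num_narrower / num_broader >= c,
    # which avoids dividing by zero when nobody proposed 'broader'.
    if num_narrower >= c * num_broader:
        return "narrower"
    return "more generic"


# Hypothetical vote counts and threshold, for illustration only.
print(resolve_conflict(6, 2, c=2.0))
print(resolve_conflict(3, 2, c=2.0))
```

The threshold `c` controls how strong a majority must be before one direction is imposed; below it, the conflict is settled by falling back to the less specific outcome.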

[Figure: competing relations between "environment" and "pollution" (narrower, broader, related), with "more generic" as the fallback outcome.]

Ademe experimentation

Total number of relations                          125
Contradictory                                       43
Consensual                                          14
Only rejected (to be deleted?)                       2
Only proposed by computer (no user review)          52
Debated (approved AND rejected at least once)       14
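Three of the rows above are defined by approval and rejection counts alone and can be sketched as a small classifier. The function name and the "approved only" label are assumptions: the slides do not state how "consensual" and "contradictory" are derived, and those categories likely need information about competing relations on the same tag pair, which this sketch does not model.

```python
def review_status(approvals: int, rejections: int) -> str:
    """Classify one proposed relation from its user feedback counts."""
    if approvals == 0 and rejections == 0:
        return "only proposed by computer"  # no user review
    if approvals == 0:
        return "only rejected"              # candidate for deletion
    if rejections == 0:
        return "approved only"              # hypothetical label, see lead-in
    return "debated"                        # approved AND rejected at least once


# Hypothetical feedback counts, for illustration only.
for counts in [(0, 0), (0, 2), (3, 0), (1, 1)]:
    print(counts, review_status(*counts))
```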

[Figure: example result on Ademe's experimentation subset.]


Global structuring by Referent

[Figure: the ReferentUser hasApproved the statement "environment related pollution".]


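[Figure: points of view on the tags "environment", "pollution" and "pollutants". One structure chains the three tags with two "narrower" links. Paul links "environment" and "pollution" with "narrower"; John links them with "related"; the Referent contributes a "narrower" link to "pollutants".]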

Take-away message (conclusion)

What we do: help communities structure their tags

Our contributions:

Usage analysis

Automatic processing of tags

Tag structuring embedded in everyday tools

Support for multiple points of view

Future work

• Interfaces:
  • to capture user-centric contributions
  • Global administration (Referent User)
  • Tag searching

• Real-scale experimentation

• Mapping between knowledge representations (e.g. Gemet thesaurus – Tags – CADIC)

• Moving to other socio-structural models

Thank you!

freddy.limpens@inria.fr

http://www-sop.inria.fr/members/Freddy.Limpens/

http://isicil.inria.fr

What is a tag?

Tagging model

Hypotheses

Tag-link is thematic

Tags are concept candidates

Supporting diverging points of view

An ecosystem of agents

Tags are nice to organize your own resources ...

... but also to get involved in the organization of shared resources