Exploring Audiovisual Archives through Aligned Thesauri

Post on 16-Apr-2017

Victor de Boer, Matthias Priem, Michiel Hildebrand, Nico Verplancke, Arjen de Vries and Johan Oomen

CC-by-nc-nd https://www.flickr.com/photos/joinash/

The dangers of silos

The modern archive

...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS...

03-05-2023

Connecting collections: topics, people, genres, etc.

Catalogue

Photos

Wiki

Programme guides

Internal: Video hyperlinking

External: Networked heritage

Links through vocabularies

Case: Flemish Institute for Archiving (VIAA) and the Netherlands Institute for Sound and Vision (NISV)

• pipeline of a real-world, international use case that illustrates the end-user benefit of aligned SKOS thesauri;

• method and tools for converting XML thesauri to SKOS;

• CultuurLINK, an interactive tool for thesaurus alignment;

• application that enables cross-collection search and browsing using the aligned thesauri.

Sound and Vision
• Dutch AV heritage
• > 1,000,000 hrs of TV (public broadcasters)
• radio, music, docu, film, commercials, etc.

VIAA
• Flemish archive, including Flemish broadcaster (VRT)

Gemeenschappelijke Thesaurus Audiovisuele Archieven (GTAA)

• 184,484 terms (concepts, persons, geo, …)
• 19,695 terms in hierarchy
• 9 conceptSchemes
• 90,708 scopeNotes
• 33,542 relations

Published as SKOS Linked Open Data: http://gtaa.beeldengeluid.nl/

VRT Thesaurus
• 100,000+ terms
• Structured, but not SKOS yet
• No concept schemes

Mapped to SKOS; hierarchies mapped to skos:broader/narrower

VRT Thesaurus
• 102,172 terms
• 97,744 in hierarchy
• 4,429 top concepts
• 212 scopeNotes
• 6,828 relations

Conversion code available at https://github.com/viaacode/skoscreator

Triples available at http://semanticweb.cs.vu.nl/test
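The hierarchy mapping can be illustrated with a minimal sketch. This is not the skoscreator code: the XML layout (`term` elements with `id` and `parent` attributes and a `label` child), the `BASE` namespace and the function name `xml_to_skos` are all assumptions made for illustration, since the slides do not show the actual VRT export format. Only the SKOS property names are taken from the SKOS vocabulary itself.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical namespace

# Hypothetical input: the real VRT export format is not shown in the slides.
SAMPLE_XML = """
<thesaurus>
  <term id="1"><label>sport</label></term>
  <term id="2" parent="1"><label>wielrennen</label></term>
</thesaurus>
"""


def xml_to_skos(xml_text, lang="nl"):
    """Return a list of N-Triples lines for the terms in xml_text."""
    root = ET.fromstring(xml_text)
    triples = []
    for term in root.iter("term"):
        uri = f"<{BASE}{term.get('id')}>"
        label = term.findtext("label", default="").strip()
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f"{uri} <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f'{uri} <{SKOS}prefLabel> "{escaped}"@{lang} .')
        parent = term.get("parent")
        if parent is not None:
            # A parent/child link becomes a broader/narrower pair.
            parent_uri = f"<{BASE}{parent}>"
            triples.append(f"{uri} <{SKOS}broader> {parent_uri} .")
            triples.append(f"{parent_uri} <{SKOS}narrower> {uri} .")
    return triples


if __name__ == "__main__":
    print("\n".join(xml_to_skos(SAMPLE_XML)))
```

Emitting both directions of the hierarchy keeps the output usable by tools that do not infer skos:narrower from skos:broader.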

Collections

VIAA
• Part of the VRT AV collection
• +/- 35,000 items (out of ~1 million)
• Annotated with VRT thesaurus
• Not publicly available

NISV
• Openimages.eu
• +/- 3,000 items out of 800K hrs
• Mostly news broadcasts
• Annotated with GTAA
• Publicly available (CC-by-SA)

ALIGNMENT: VRT Thesaurus ↔ GTAA

‘Happy alignments are all alike; every unhappy alignment is unhappy in its own way’

Jacco van Ossenbruggen, (with apologies to Tolstoy)

http://cultuurlink.beeldengeluid.nl/

Semi-automatic SKOS vocabulary alignment service

Successor of EuropeanaConnect’s Amalgame

Users can upload vocabularies and match them with existing vocabularies.

Users can design, experiment with, and improve their alignment strategy

Matching, selecting, excluding, sampling, evaluating
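The first three of these steps can be sketched as small composable functions. This is an illustration only, not CultuurLINK's implementation: the function names, the exact-label matcher and the sample identifiers (`vrt:…`, `gtaa:…`) are hypothetical, and a real strategy would add further matchers plus sampling and manual evaluation.

```python
from collections import defaultdict


def normalise(label):
    """Lower-case a label and collapse whitespace."""
    return " ".join(label.lower().split())


def exact_label_match(source, target):
    """Match step: pair concepts whose normalised labels are equal."""
    index = defaultdict(list)
    for uri, label in target.items():
        index[normalise(label)].append(uri)
    return [
        (s_uri, t_uri)
        for s_uri, label in source.items()
        for t_uri in index.get(normalise(label), [])
    ]


def select_unambiguous(candidates):
    """Select step: accept sources with exactly one candidate target."""
    by_source = defaultdict(list)
    for s_uri, t_uri in candidates:
        by_source[s_uri].append(t_uri)
    accepted = {s: ts[0] for s, ts in by_source.items() if len(ts) == 1}
    ambiguous = {s: ts for s, ts in by_source.items() if len(ts) > 1}
    return accepted, ambiguous


def exclude_mapped(source, accepted):
    """Exclude step: keep only source concepts that still lack a mapping."""
    return {uri: label for uri, label in source.items() if uri not in accepted}


# Hypothetical sample vocabularies (URI -> preferred label).
VRT = {"vrt:1": "Brussel", "vrt:2": "Voetbal ", "vrt:3": "Parijs", "vrt:4": "Amsterdam"}
GTAA = {"gtaa:a": "brussel", "gtaa:b": "voetbal", "gtaa:c": "Amsterdam", "gtaa:d": "amsterdam"}

if __name__ == "__main__":
    accepted, ambiguous = select_unambiguous(exact_label_match(VRT, GTAA))
    remaining = exclude_mapped(VRT, accepted)
    print(accepted, ambiguous, remaining)
```

The ambiguous and remaining sets are what a user would feed into a next, looser matcher or inspect by hand.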

Example alignment strategy: Concepts

Example alignment strategy: Persons

Four strategies

Type         Nr of correspondences
Subjects      4,176
Names         2,197
Locations     4,011
Persons      11,265
Total        21,640

Demonstrator: Information Retrieval tool using Spinque search-by-strategy paradigm

No programming needed, just modelling the IR strategy

Keyword, vocabulary term or Related-Object search

Search on titles, description, vocabulary labels

Weight on collection (user-input)
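How a collection weight might interact with keyword matching can be shown with a toy ranking function. This is not Spinque's search-by-strategy engine: the scoring formula, the `slider` parameter, the collection names and the sample items are all assumptions made for illustration.

```python
def text_score(query, item):
    """Count query-term occurrences in title, description and labels."""
    words = " ".join(
        [item["title"], item["description"], *item["labels"]]
    ).lower().split()
    return sum(words.count(term) for term in query.lower().split())


def search(query, items, slider=0.5):
    """Rank items; slider=0.0 favours VIAA, slider=1.0 favours OpenImages."""
    weights = {"VIAA": 1.0 - slider, "OpenImages": slider}
    scored = []
    for item in items:
        score = text_score(query, item) * weights[item["collection"]]
        if score > 0:
            scored.append((score, item["id"]))
    # Highest score first; ties are broken by identifier for stable output.
    return [item_id for _, item_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical sample items.
ITEMS = [
    {"id": "viaa-1", "collection": "VIAA", "title": "Ronde van Vlaanderen",
     "description": "wielrennen in Vlaanderen", "labels": ["wielrennen", "sport"]},
    {"id": "oi-1", "collection": "OpenImages", "title": "Wielrennen in Nederland",
     "description": "Polygoon journaal", "labels": ["wielrennen"]},
    {"id": "oi-2", "collection": "OpenImages", "title": "Bloemencorso",
     "description": "optocht", "labels": ["bloemen"]},
]

if __name__ == "__main__":
    print(search("wielrennen", ITEMS, slider=0.25))
```

With the slider at either extreme, items from the other collection receive a zero weight and drop out of the result list.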

Demonstrator

http://link.spinque.com/VIAA-1.0/

Input for keyword search or thesaurus concepts

Search results

Collection indicator

Thesaurus terms associated with video. Terms may appear in one thesaurus or in both thesauri

Thesaurus terms associated with retrieval results (grouped by type)

Slider used to indicate collection preference/weight

Per result, the thumbnail, title, description, identifier and thesaurus terms are shown

The selected video appears in the search field.

Thesaurus terms associated with search results and selection.

Play screen

In this case, the user positioned the slider all the way to the right, indicating that he/she is interested in Open Images videos related to this VRT item.

List of OpenImages videos related to this VRT video. Matching terms are highlighted.
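A related-object lookup of this kind can be sketched as follows: the terms of the selected item are translated through the alignment, and items in the other collection are ranked by how many translated terms they share. The function name, the one-to-one alignment dictionary and all identifiers are hypothetical; the demonstrator's actual ranking is not described in the slides.

```python
def related_items(source_terms, alignment, target_items):
    """Rank target items by the number of aligned terms they share.

    Returns (item_id, matching_terms) pairs; matching_terms are the
    terms a user interface could highlight.
    """
    mapped = {alignment[t] for t in source_terms if t in alignment}
    results = []
    for item_id, terms in target_items.items():
        shared = sorted(mapped & set(terms))
        if shared:
            results.append((item_id, shared))
    # Most shared terms first; ties are broken by identifier.
    results.sort(key=lambda r: (-len(r[1]), r[0]))
    return results


# Hypothetical alignment (VRT term -> GTAA term) and annotated items.
ALIGNMENT = {"vrt:wielrennen": "gtaa:wielrennen", "vrt:brussel": "gtaa:brussel"}
OPEN_IMAGES = {
    "oi-1": ["gtaa:wielrennen", "gtaa:brussel"],
    "oi-2": ["gtaa:wielrennen"],
    "oi-3": ["gtaa:bloemen"],
}

if __name__ == "__main__":
    vrt_item_terms = ["vrt:wielrennen", "vrt:brussel", "vrt:unmapped"]
    for item_id, shared in related_items(vrt_item_terms, ALIGNMENT, OPEN_IMAGES):
        print(item_id, shared)
```

Terms without a correspondence in the alignment are simply ignored, so coverage of the alignment directly limits which related items can be found.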

Conclusions

Conversion of structured vocabularies to SKOS opens possibilities for connecting collections

Interactive alignment produces many useful links

Demonstrator shows possibilities of aligned collections

Demonstrator will be extended when more collections are available

> Complete NISV collection metadata (?)

> Complete VIAA collection metadata (?)

Thank you

vdboer@beeldengeluid.nl

http://cultuurlink.beeldengeluid.nl

http://link.spinque.com/VIAA-1.0/

http://semanticweb.cs.vu.nl/test

https://github.com/viaacode/skoscreator