Ariadne: Report on Thesauri and Taxonomies

D15.1: Report on Thesauri and Taxonomies

Authors: Douglas Tudhope (USW), Ceri Binding (USW)

Ariadne is funded by the European Commission’s 7th Framework Programme.

The views and opinions expressed in this report are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.

ARIADNE D15.1 Report on Thesauri and Taxonomies (Public)

Version: 1.5 (final), July 2016

Authors: Douglas Tudhope and Ceri Binding (USW)

Contributing partners:
Holly Wright (ADS),
Florence Laino (AIAC, L-P Archaeology),
Philipp Gerth, Francesco Mambrini (DAI),
Federico Nurra, Emmanuelle Bryas, Nouvel Blandine, Evelyne Sinigaglia (INRAP, FRANTIQ),
Paul Boon, Hella Hollander, Peter Brewer (KNAW-DANS, University of Arizona),
Evie Monaghan, Louise Kennedy, Anthony Corns (Discovery),
Sara Di Giorgio, Tiziana Scarselli (MIBACT-ICCU),
Esther Jansma (RCE),
with additional contributions from all partners

Quality review: Holly Wright (ADS - Archaeology Data Service)

ARIADNE – Deliverable 15.1: Report on Thesauri and Taxonomies, July 2016

Table of contents

Executive Summary
1 Introduction
  1.1 Controlled vocabularies
  1.2 ARIADNE partner vocabularies
2 Mapping between thesauri
  2.1 Brief description of thesaurus mapping
  2.2 Mappings in ARIADNE to support cross search
  2.3 Getty Art and Architecture Thesaurus
  2.4 Prototype experiment with AAT as hub vocabulary
  2.5 Prototype experiment with AAT hierarchical expansion in Elasticsearch
3 Creating mappings for ARIADNE
  3.1 Overview of mappings
  3.2 Description and reflections on mapping exercise
4 Mappings in the ARIADNE infrastructure
  4.1 Mapping enrichment process
  4.2 Mappings within the ARIADNE portal
5 Conclusion
6 References
7 Appendix A
8 Appendix B
9 Appendix C

Executive Summary

This deliverable reports on the work of ARIADNE WP15, Task 1: SKOS thesauri and taxonomies. This includes vocabularies, such as thesauri and term lists in different languages used by partners for subject indexing. When searching free text with uncontrolled terms, significant differences can arise from trivial variations in search statements and from differing conceptualisations of a search by users. Different people use different words for the same concept, or employ slightly different concepts. As such, this was a key issue to be addressed within the ARIADNE project, and is a key focus of this report. The issues posed for interoperability and cross search by ARIADNE's multilingual collection of datasets and reports are discussed, along with the use of a controlled vocabulary to reduce ambiguity between terms by various features. The vocabularies most relevant for ARIADNE are also listed and described.

Mapping between vocabularies is a key aspect of semantic interoperability in heterogeneous environments. Mapping between native partner vocabularies can provide a useful mediation platform for ARIADNE cross search, particularly as subject metadata are in different languages. However, the creation of links directly between the items from different vocabularies can quickly become unmanageable as the number of vocabularies increases. Therefore, a hub architecture was adopted, using an intermediate structure onto which the concepts from local vocabularies were mapped. The work on producing mappings is described, together with the incorporation of mappings in the ARIADNE infrastructure, and their use to date in the emerging ARIADNE Portal.

The Getty Art and Architecture Thesaurus (AAT) was chosen as an appropriate hub vocabulary, following a prototype mapping and retrieval exercise involving five ARIADNE vocabularies in three different languages. In another prototype experiment, the implementation of hierarchical expansion techniques was investigated using the Elasticsearch infrastructure adopted for the ARIADNE Portal.

A large scale pilot exercise with one ARIADNE partner was conducted, in order to allow for refinement of the methodology and mapping guidelines after reviewing the results. The first complete mapping exercise was successfully performed by ADS, using a custom linked data vocabulary matching tool developed for the ARIADNE project. Analysis of results from this pilot mapping informed an iteration of the mapping guidelines and the matching tool user interface. Following the review of the pilot mapping exercise, an additional, basic spreadsheet based utility was developed for recording mappings made manually in situations where the source vocabularies were not available as Linked Data. Mappings were conducted by the various content partners from their native vocabularies to the AAT. A summary of mappings with statistics on the SKOS match types employed by the various content partners is discussed. This shows that in almost all cases mappings were successfully established to the AAT. About half were exactMatch, with the other half mostly closeMatch and broadMatch. As expected, only a small number were narrower matches: most partner vocabularies were considered to be reasonably congruent or were more specialized than the AAT. Reflections by partners on the mapping exercise are discussed.

The output from the partner mappings from their source vocabularies to the AAT is transformed to the required format for further processing by the relevant MoRe enrichment services used by the ARIADNE Registry. The enrichment process augments the data imported to the Registry with mapped AAT concepts. These derived subjects in turn make possible concept based search and browsing in the ARIADNE Portal. While the Portal is still evolving at the time of writing, a query on the Portal illustrates how the mappings make possible concept based search across subject metadata in different languages.

1 Introduction

This document is a deliverable (D15.1) of the project ARIADNE - Advanced Research Infrastructure for Archaeological Dataset Networking in Europe that has been funded under the European Community’s Seventh Framework Programme. This deliverable reports on the work of ARIADNE WP15, Task 1: SKOS thesauri and taxonomies. This includes thesauri and term lists in different languages used by partners for subject indexing. Following on from the survey of vocabularies described in D3.3, those most relevant for ARIADNE are identified and augmented by a small number of additional vocabularies. The issues posed for interoperability and cross search by ARIADNE's multilingual collection of datasets and reports are discussed. Linking between vocabularies, following standard mapping relationships, is considered the best practice approach towards multilingual functionality. The work on producing mappings is described, together with the incorporation of mappings in the ARIADNE infrastructure and their use to date in the emerging ARIADNE Portal.

1.1 Controlled vocabularies

Vocabularies are used for control of subject metadata. Other types of metadata can also benefit from vocabulary control, including place names, time periods and personal names. Vocabulary control aims to reduce the ambiguity of natural language (free text) when indexing and retrieving items while searching for information (Svenonius 2000; Tudhope et al. 2006).

Controlled vocabularies consist of terms, that is, words from natural language selected for retrieval purposes. A term can consist of one or more words. In a controlled vocabulary, such as a thesaurus, a term is used to represent a concept (which can have several terms associated with it).

Two features (synonyms and ambiguity) in natural language pose potential problems for retrieval:

a) Different terms (synonyms) can represent the same concept.

b) The same term (homographs) can represent different concepts. This can be a major problem in a mono-lingual system and becomes a significant problem in a multi-lingual collection, such as ARIADNE.

A controlled vocabulary can attempt to reduce ambiguity between terms by various features:
• Defining the scope of terms - how they are to be used within a particular vocabulary.
• Providing a set of synonyms (or effective synonyms for retrieval purposes) for each concept.
• Restricting scope so that terms only have one meaning (and relate to only one concept).
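The three features listed above can be sketched in a few lines of Python. This is a minimal illustration only: the concept identifiers, labels and scope notes are invented for the example and are not taken from any ARIADNE partner vocabulary.

```python
# Toy controlled vocabulary: every entry term points to exactly one concept.
# All identifiers, labels and scope notes below are hypothetical.
CONCEPTS = {
    "c1": {"preferred": "cemetery",
           "scope": "Area of ground used for burials.",
           "synonyms": ["graveyard", "burial ground"]},
    "c2": {"preferred": "vessel (container)",
           "scope": "Hollow container, e.g. a pot.",
           "synonyms": ["pot"]},
    "c3": {"preferred": "vessel (watercraft)",
           "scope": "Craft for travel on water.",
           "synonyms": ["boat"]},
}


def build_term_index(concepts):
    """Map each lower-cased term to its concept id, rejecting ambiguous terms."""
    index = {}
    for cid, concept in concepts.items():
        for term in [concept["preferred"], *concept["synonyms"]]:
            key = term.lower()
            if key in index and index[key] != cid:
                raise ValueError(f"term {term!r} is ambiguous")
            index[key] = cid
    return index


TERM_INDEX = build_term_index(CONCEPTS)


def lookup(term):
    """Return the concept id for a term, or None if the term is not controlled."""
    return TERM_INDEX.get(term.strip().lower())
```

In this sketch the synonyms "graveyard" and "burial ground" resolve to the same concept as "cemetery", while the bare homograph "vessel" is deliberately not an entry term, so a searcher has to choose one of the two qualified concepts.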

Not all vocabularies provide all three features above. Some are just simple lists of authorised terms (term lists). Controlled vocabularies also provide vocabulary for Knowledge Organization Systems (KOS), which additionally structure their concepts via different types of semantic relationship (such as broader and narrower concepts).

Controlled vocabularies are sometimes contrasted with free text searching, assisted by statistical techniques in automatic indexing and ranking. These are not however exclusive options and different combinations of the two approaches are possible. Controlled vocabularies can be used to augment free text search.

When searching free text with uncontrolled terms, significant differences can arise from trivial variations in search statements and from differing conceptualisations of a search by searchers. Different people use different words for the same concept or employ slightly different concepts. This may not be a problem in casual search. However, in systematic research on a specialized topic, it is undesirable to miss relevant resources.

At the simplest level, a controlled list of terms ensures consistency in searching and indexing, helping to reduce problems arising from synonym and homograph mismatches. At a more complex level, the presentation of concepts in hierarchies and other semantic structures helps the indexer and searcher choose the most appropriate concept for their purposes. Browse-based user interfaces become possible.

A KOS can assist both precision (by allowing specific searching) and recall (by retrieving items described by related concepts or equivalent terms). It also provides potential pathways (for human and machine) that connect a searcher and indexer’s choice of terminology. The more formal specification of logical semantic relationships within an ontology can assist applications where rules are specified about the relationships and logic-based inferencing is appropriate.

The information retrieval thesaurus is designed for retrieval purposes and has a restricted set of relationships (Tudhope and Binding 2016). These relationships are Equivalence (connects a concept to terms that act as effective synonyms), Hierarchical (broader / narrower concepts) and Associative (more loosely related, ‘see also’ concepts). These are defined by an international standard (the recently approved ISO 25964). The equivalence relationship connects a concept with a set of equivalent terms, treated as synonyms for the retrieval situations envisaged by the designers. Either mono or poly hierarchical structures may be employed. Thesauri are usually employed for descriptive indexing purposes and the corresponding search systems. Thesauri can also be used as a query expansion resource or as the basis for auto-complete suggestions in a search user interface, as in the ARIADNE Portal.

1.2 ARIADNE partner vocabularies

The vocabularies themselves vary from a small number of keywords in a pick list for a particular dataset to standard national vocabularies with a large number of concepts. ARIADNE Deliverable 3.1 (Initial report on standards and on the project registry) listed some archaeology-related subject vocabularies (terminology resources) and more details can be found there.

These included elements of the following vocabularies, considered particularly relevant for WP15 purposes:
• Art and Architecture Thesaurus (Getty Research Institute) – a thesaurus used for describing items of art, architecture and material culture
• Pactols Thesaurus (Frantiq) – six multilingual thesauri for describing items on antiquity and archaeology
• Thesaurus of Monument Types (FISH) – thesaurus of monument types by function
• Archaeological Objects Thesaurus (FISH) – thesaurus for recording of archaeological objects in Britain and Ireland over all archaeological periods
• Building Materials Thesaurus (FISH) – thesaurus of materials used in archaeological monuments
• PICO (MiBACT) – cultural heritage thesaurus covering Who/What/Where/When for use in Culturaitalia portal
• ICCD (MiBACT) – pictorial thesaurus for describing archaeological finds
• Referentienetwerk Erfgoed / ABR (RCE - Cultural Heritage Agency of the Netherlands) – contains the structured set of concepts of cultural heritage in the Netherlands
• ARKAS term list (ZRC-SASU) – a list of terms for the definition of archaeological sites in Slovenia
• FEDOLG-R term list (MNM-NÖK) – a list of terms for describing archaeological finds in Hungary
• Museums vocabularies (DAI) – a group of vocabularies for describing museum objects and concepts
• Archaeological Dictionary (DAI) – a multilingual dictionary for archaeological concepts under development

And additionally considered for WP15:
• FASTI term list (AIAC) – set of terms for describing monument types in FASTI Online
• Irish Monuments Vocabulary (NMS) – for describing monument types in Ireland
• Archaeological term list (SND) – a set of terms for describing archaeological objects and monument types in Sweden drawing on national standards

Some of these vocabularies are available online or published as Linked Data in SKOS representation. This allows programmatic access to the vocabulary elements and the use of vocabularies as linking hubs in the web of data. This is further described in the forthcoming D15.2.

2 Mapping between thesauri

2.1 Brief description of thesaurus mapping

Mapping between vocabularies is a key aspect of semantic interoperability in heterogeneous environments, and is particularly important to multi-lingual collections (Tudhope et al. 2006). It can improve both recall (in different languages) and precision (false results may arise from literal string search).

Significant effort is required, however, for useful results; detailed mapping work at the concept level is necessary, requiring a combination of intellectual work and automated assistance. Zeng and Chan (2004) review different methodological approaches to mapping:

a) Derivation/Modeling of a specialised or simpler vocabulary from an existing vocabulary.

b) Translation/Adaptation from an existing vocabulary in a different language.

c) Satellite and Leaf Node Linking of a specialised thesaurus to a large, general thesaurus.

d) Direct Mapping between concepts in different controlled vocabularies, usually with an intellectual review.

e) Co-occurrence mapping between two vocabularies based on their mutual occurrences within the indexing of items within a collection. Co-occurrence mappings are considered looser than direct mapping made by experts.

f) Switching language used as an intermediary. It can be a new system created for the purpose or an existing system.

A switching language is one of the most frequently used approaches. This is the approach adopted by ARIADNE, as described below, where the switching language is described as a “hub” for the ARIADNE metadata connections. See also the discussion in the recent thesaurus standard, ISO 25964-2:2013 section 6 “Structural models for mapping across vocabularies”.

There are also variants and combinations of these mapping approaches in practice. Effective mapping requires some degree of overlap and congruence of purpose in the vocabularies being mapped. Some prominent examples of mapping work are mentioned briefly. OCLC, providers of the Dewey Decimal Classification (DDC), developed various mappings between major vocabularies (both intellectual and statistical co-occurrence mappings) making them available as terminology web services (Vizine-Goetz et al. 2003). The OAI protocol was used to provide access to a vocabulary with mappings, via a browser to human users and through the OAI-PMH web service mechanisms to machines. Both direct mappings and co-occurrence mappings were provided, depending on the situation. The DDC was employed as a switching language in the Renardus FP5 project to support a cross-browsing service for a European academic subject gateway service (Koch et al. 2003).

More recently, the United Nations’ Food and Agriculture Organization (FAO) has devoted considerable resources to its AGROVOC thesaurus, which is a significant element of the VocBench collaborative vocabulary editing and publishing platform and the associated AIMS (Agricultural Information Management Standards) portal. This has been expressed as Linked Data and there is an extensive mapping programme with (SKOS) mappings established for 13 vocabularies including LCSH (Library of Congress Subject Headings), GEMET (General Multilingual Environmental Thesaurus) and STW (Standard Thesaurus for Economics / Standard Thesaurus für Wirtschaft) (Caracciolo et al., 2013). Mapping services have been a longstanding focus of the German bilingual STW Thesaurus, a structured vocabulary for subject indexing and retrieval of economics literature. This is now based upon a Linked Data architecture (Kempf and Neubert, 2016).

Tim Berners-Lee, creator of the World Wide Web and the concept of Linked Data, has proposed a five star deployment scheme for grading Linked Open Data, which stresses linking to external Linked Open Data resources to achieve full potential. In the context described here, these links take the form of machine readable mappings to a common reference vocabulary.

★ Data made openly available on the web in any format
★★ As above, but in a machine readable structured data format (e.g. Excel)
★★★ As above, but in a non-proprietary structured data format (e.g. XML)
★★★★ As above, but using W3C open standards (e.g. URIs, RDF & SPARQL)
★★★★★ As above, and also linking out to other external LOD

Figure 1: The 5 star deployment scheme for Linked Open Data

Part 2 of the International Thesaurus Standard (ISO 25964-II) aims to facilitate high quality information retrieval across networked resources indexed with different types of vocabularies. It explains how to set up mappings between the concepts in such vocabularies and includes a discussion of the impact of mapping on retrieval. This is an important consideration, particularly when no exact equivalent concept exists, and it is necessary to map to a broader or narrower concept, a partially overlapping concept, or to a (Boolean) combination of concepts. Section 14 of ISO 25964-II discusses techniques for identifying candidate mappings.

Mapping between native partner vocabularies could provide a useful mediation platform for ARIADNE cross search, particularly as subject metadata are in different languages. However, the creation of links directly between the items from different vocabularies can quickly become unmanageable as the number of vocabularies increases. Mapping between more than three vocabularies would be more efficient and scalable using the hub architecture (i.e. switching language), using an intermediate structure onto which the concepts from each local vocabulary may be mapped. A search on a concept originating from one vocabulary can then utilise this mediating structure to route through to concepts originating from other vocabularies, possibly expressed in other languages.
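The scaling argument and the routing idea can be illustrated with a short Python sketch. It assumes that one mapping set is needed per unordered pair of vocabularies in the direct approach; the vocabulary prefixes, concept names and hub identifiers are hypothetical.

```python
def direct_mapping_sets(n):
    """Mapping sets needed when every pair of n vocabularies is linked directly."""
    return n * (n - 1) // 2


def hub_mapping_sets(n):
    """With a hub, each of the n local vocabularies is mapped once, to the hub."""
    return n


# Hypothetical mappings from local concepts to hub concepts.
TO_HUB = {
    "vocabA:cemetery": "hub:1",
    "vocabB:cimitero": "hub:1",
    "vocabC:Gräberfeld": "hub:1",
    "vocabA:villa": "hub:2",
}


def route_via_hub(concept, to_hub=TO_HUB):
    """Find concepts in the mapped vocabularies that share the same hub concept."""
    hub = to_hub.get(concept)
    if hub is None:
        return []
    return sorted(c for c, h in to_hub.items() if h == hub and c != concept)
```

Under this assumption three vocabularies need three mapping sets either way, while five vocabularies need ten direct sets but only five hub mappings, which is consistent with the remark that the hub becomes more efficient beyond three vocabularies.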

2.2 Mappings in ARIADNE to support cross search

For subject access, the ACDM ArchaeologicalResource class has two kinds of subject property. The property, native-subject, associates the resource with one or more items from a controlled vocabulary used by the data provider to index the data. However, there are a large number of partner vocabularies in several different languages. Cross search and semantic interoperability are rendered difficult, as there are no semantic links or mappings between the various local vocabularies. Standard ontologies for metadata schemas, such as the CIDOC-CRM, do not cover particular subject vocabularies but expect the ontology to be complemented with the terminology contained in the relevant subject vocabularies for an application domain. Spelling variations or different synonyms for the same concept can result in failure to find relevant results. This problem is exacerbated when subject metadata may be in different languages, which is clearly the case when providing an infrastructure for European archaeology. Not only may useful resources be missed when searching in a different language from the subject metadata but there is also the problem of false results arising from homographs where the same term has different meanings in different languages. For example, “vessel” has different archaeological meanings in the English language, while “coin” is French for corner, “boot” is German for boat and “monster” is Dutch for sample (very different from the English language meanings of these words).

2.3 Getty Art and Architecture Thesaurus

The Getty Art and Architecture Thesaurus (AAT) is an influential and longstanding, multi-lingual thesaurus used world-wide, with over 40,000 concepts and over 350,000 terms (Harpring, 2016). The AAT has 7 facets (and 33 hierarchies as subdivisions): Associated concepts, Physical attributes, Styles and periods, Agents, Activities, Materials, Objects and optional facets for time and place. The AAT’s scope is broader than archaeology, encompassing fine art, built works, decorative arts, other material culture, visual surrogates, archival materials, archaeology, and conservation. However it contains much useful, high level archaeological content, particularly in the Built Environment, Materials and Objects hierarchies.

The AAT has a faceted poly-hierarchical structure, containing generic concepts, with labels in multiple languages. It appears to have a good breadth of archaeological coverage to map local vocabularies to, together with clear scope notes defining the scope of usage for each concept. The AAT has recently been made available as Linked Open Data by the Getty Research Institute (Getty Research Institute, 2016b), which fits well with ARIADNE’s strategy for semantic interoperability.

2.4 Prototype experiment with AAT as hub vocabulary

The AAT was chosen as an appropriate hub vocabulary, following a prototype mapping and retrieval exercise involving five ARIADNE vocabularies in three different languages. This is discussed in more detail in Binding and Tudhope (2015). Briefly, a small extract from the published AAT linked data was used as a hub, together with a set of intellectual mappings made by consulting the Getty Vocabularies search facility (http://vocab.getty.edu/). For this exercise, the skos:closeMatch relationship was used rather than skos:exactMatch. Mappings were created manually (by USW) for the set of concepts employed in the pilot study. In some cases, partner vocabularies contained more specialised concepts than contained in the AAT. However, it was considered that the skos:broadMatch relationship should be appropriate in these situations, since the use case was cross-search in the ARIADNE Portal, rather than fine grained semantic processing.

In addition, the possibility of query expansion based upon the AAT's hierarchical structure (semantic expansion over the thesaurus hierarchical relationships) was noted. This would open up the possibility in retrieval of matching on terms associated with narrower concepts when querying at a more general level. This would have the potential of improving recall without loss of precision. As part of the pilot, a freely available desktop RDF search facility (SparqlGui, 2016) was employed to query the extract of AAT concepts, combined with the mappings produced for the pilot exercise. Using the query tool, a SPARQL 1.1 query on the AIAC concept fasti:cemetery (see Figure 2) returns results from five different vocabularies with terms in different languages via the AAT semantic structure (see Table 1). This search makes use of the mappings and also the hierarchical query expansion. The results from the pilot exercise were presented and discussed at the ARIADNE session in the Research Infrastructures on Cultural Heritage conference, co-organized in Rome by the ARIADNE project and the Italian Ministry of Culture (MIBAC) in November 2014 (and published in an accompanying ARIADNE booklet). It was decided that they held sufficient promise to proceed with a full mapping exercise, in order to deliver some degree of multilingual capability for the ARIADNE search system in the forthcoming Portal.

# SPARQL 1.1 to locate concepts related via AAT to FASTI “cemetery” concept
PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX aat: <http://vocab.getty.edu/aat/>
PREFIX fasti: <http://fastionline.org/monumenttype/>
PREFIX iccd: <http://www.iccd.beniculturali.it/monuments/>
PREFIX tmt: <http://purl.org/heritagedata/schemes/eh_tmt2/concepts/>
PREFIX dans: <http://www.rnaproject.org/data/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dai: <http://archwort.dainst.org/thesaurus/de/vocab/?tema=>
SELECT DISTINCT ?concept ?label WHERE {
  fasti:cemetery (skos:exactMatch | skos:broadMatch | skos:closeMatch) ?aatconcept .
  ?aatdescendant gvp:broader+ ?aatconcept .
  {
    {?concept (skos:exactMatch | skos:broadMatch | skos:closeMatch) ?aatdescendant}
    UNION
    {?concept (skos:exactMatch | skos:broadMatch | skos:closeMatch) ?aatconcept}
  }
  OPTIONAL {?concept skos:prefLabel ?label}
}

Figure 2: SPARQL 1.1 query on the semantic framework of AAT plus local vocabulary mappings.

Concept identifier                           Concept label
iccd:catacomba                               catacomba
tmt:91386                                    catacomb (funerary)
fasti:catacomb                               Catacomb
iccd:colombario                              colombario
fasti:columbarium                            Columbarium
dai:3736                                     Kolumbarium
dans:6a7482e5-2fd5-48fb-baf4-66ad3d4ed95e    kerkhof
dai:1947                                     Gräberfeld
iccd:necropoli                               necropoli
dai:2485                                     Nekropole
tmt:70053                                    cemetery
tmt:70053                                    necropolis
dans:be95a643-da30-40b9-b509-eadfb00610c4    christelijk/joodse begraafplaats
dans:b935f9a9-7456-4669-91d0-2e9c0ff7d664    vlakgrafveld
iccd:cimitero                                cimitero
dans:abb41cf1-30dc-4d55-8c18-d599ebba1bc2    rijengrafveld

Table 1: Sample extract of the results from the query in Figure 2

2.5 Prototype experiment with AAT hierarchical expansion in Elasticsearch

Following an ARIADNE Joint Technical Meeting, it was decided to investigate further how to implement the hierarchical expansion techniques at scale in the context of the Elasticsearch infrastructure adopted for the ARIADNE Portal. Therefore a second prototype experiment with the AAT was conducted using the Elasticsearch platform.

Hierarchical semantic expansion makes use of broader generic (“IS-A”) relationships between concepts in a hierarchically structured knowledge organization system, allowing a search on a particular subject indexing concept to also retrieve any items indexed using concepts that are positioned below that concept within the hierarchical structure.

• aat:300264092 Objects Facet
  • aat:300264551 Furnishings and Equipment (hierarchy name)
    • aat:300036743 Weapons and Ammunition (hierarchy name)
      • aat:300036926 weapons
        • aat:300036973 edged weapons
          • aat:300036982 axes (weapons)
            • aat:300036983 battle axes

Figure 3: full hierarchical ancestry of AAT concept ID 300036983 (battle axes)

Figure 3 illustrates the full hierarchical ancestry for an example AAT concept aat:300036983 (battle axes). Using hierarchical semantic expansion, a query on concept aat:300036926 (weapons) should therefore also retrieve items indexed as edged weapons, axes (weapons), battle axes etc.
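The expansion described above can be sketched in plain Python by walking the broader links downwards from the query concept. This is an illustration of the idea rather than the project's implementation: the broader links are taken from the Figure 3 example, while the item identifiers are hypothetical.

```python
from collections import defaultdict

# Broader (IS-A) links from the Figure 3 example, child -> list of parents.
# A list is used because the AAT is poly-hierarchical.
BROADER = {
    "aat:300036983": ["aat:300036982"],  # battle axes -> axes (weapons)
    "aat:300036982": ["aat:300036973"],  # axes (weapons) -> edged weapons
    "aat:300036973": ["aat:300036926"],  # edged weapons -> weapons
    "aat:300036926": ["aat:300036743"],  # weapons -> Weapons and Ammunition
}

# Hypothetical indexed items (item id -> subject concept).
ITEMS = {
    "item1": "aat:300036926",
    "item2": "aat:300036973",
    "item3": "aat:300036983",
}


def descendants_and_self(concept, broader=BROADER):
    """Return the concept together with every concept positioned below it."""
    narrower = defaultdict(set)
    for child, parents in broader.items():
        for parent in parents:
            narrower[parent].add(child)
    found, stack = {concept}, [concept]
    while stack:
        for child in narrower.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(concept, items=ITEMS):
    """Retrieve items indexed with the concept or any of its descendants."""
    wanted = descendants_and_self(concept)
    return sorted(item for item, subject in items.items() if subject in wanted)
```

A search on weapons returns all three items, whereas a search on battle axes returns only the item indexed with that concept, since expansion runs only towards narrower concepts.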

The prototype experiment demonstrated hierarchical semantic expansion using SPARQL against RDF resources. The Elasticsearch infrastructure used in ARIADNE has functionality referred to as genre expansion (Gormley and Tong, 2015) which should be able to achieve similar results to the SPARQL prototype described in section 2.4. The object of this exercise was therefore to again use the existing poly-hierarchical structure of the AAT, this time to produce configuration data in the format required to implement Elasticsearch genre expansion. We first extracted the AAT broader generic relationships by running the SPARQL query in Figure 4 against the Getty Vocabulary Program SPARQL endpoint (Getty Research Institute, 2016c).

# Extract the poly-hierarchical structure of the AAT
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX aat: <http://vocab.getty.edu/aat/>
CONSTRUCT { ?s gvp:broaderGeneric ?o }
WHERE { ?s skos:inScheme aat: ; gvp:broaderGeneric ?o }

Figure 4: SPARQL query to extract the poly-hierarchical structure of the AAT

The results of this query were downloaded in N-Triples RDF format to produce a local file containing 45,443 RDF triples. The configuration of Elasticsearch genre expansion requires the full ancestry chain of identifiers for each concept to be expressed as textual “rules” containing a comma separated list of identifiers, formatted as shown in Figure 5 (note the full AAT concept URIs have been shortened for illustration purposes):

aat:300264551=>aat:300264551,aat:300264092
aat:300036743=>aat:300036743,aat:300264551,aat:300264092
aat:300036926=>aat:300036926,aat:300036743,aat:300264551,aat:300264092
aat:300036973=>aat:300036973,aat:300036926,aat:300036743,aat:300264551,aat:300264092
(etc.)

Figure 5: Elasticsearch genre expansion rules expressed

The extracted RDF data file resulting from the query in Figure 4 was imported to SparqlGui (SparqlGui, 2016), a desktop tool for performing experimental SPARQL queries on RDF data. The SPARQL query shown in Figure 6 then retrieved the expansion rules data in the format shown in Figure 5, producing a 17MB file, consisting of 41,866 lines of text.

# Produce the ancestry chains required for Elasticsearch genre expansion
PREFIX gvp: <http://vocab.getty.edu/ontology#>
SELECT (concat(str(?uri), "=>", str(?uri), ",", group_concat(?broader; separator=",")) AS ?ancestry)
WHERE {
  ?uri gvp:broaderGeneric+ ?broader .
}
GROUP BY ?uri

Figure 6: SPARQL query to produce the ancestry chains required for Elasticsearch genre expansion rules
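For readers who prefer to post-process the extracted triples outside a SPARQL tool, the same ancestry chains could be derived with a short script. The following Python sketch is an alternative illustration, not the method used in the experiment: it assumes the broaderGeneric relationships have already been read into (concept, broader) pairs, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def ancestry_rules(pairs):
    """Build 'id=>id,ancestor,...' rule lines from (concept, broader) pairs."""
    parents = defaultdict(list)
    for concept, broader in pairs:
        parents[concept].append(broader)

    rules = []
    for concept in parents:
        chain, queue = [concept], deque(parents[concept])
        # Breadth-first walk upwards so that nearer ancestors come first.
        while queue:
            node = queue.popleft()
            if node not in chain:
                chain.append(node)
                queue.extend(parents.get(node, ()))
        rules.append(f"{concept}=>{','.join(chain)}")
    return rules


# Pairs corresponding to part of the Figure 3 hierarchy.
PAIRS = [
    ("aat:300264551", "aat:300264092"),
    ("aat:300036743", "aat:300264551"),
    ("aat:300036926", "aat:300036743"),
    ("aat:300036973", "aat:300036926"),
]
RULES = ancestry_rules(PAIRS)
```

As with the query in Figure 6, a concept with no broader concept (here the Objects Facet) produces no rule of its own, and a concept with several parents has all of its ancestors merged into a single chain.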

Note: This process was split (i.e. extracting a subset of AAT data then querying the extract) only to alleviate potential performance issues, as this is a fairly demanding query. In practice it was found that the Getty SPARQL endpoint does support running the Figure 6 query directly, so in hindsight doing so would have simplified the overall process.

The next stage was to incorporate the extracted and formatted data into Elasticsearch and test the genre expansion functionality. A local desktop copy of Elasticsearch was used in conjunction with the “Marvel – Sense” dashboard, which was used for configuring and populating indexes and running experimental queries. The file of AAT genre expansion rules was copied to the /config folder of the Elasticsearch installation, and was then referenced in a synonym filter for a custom analyzer when specifying the settings for initially creating an index, as illustrated in Figure 7.

Figure 7: specifying settings for the AAT genre expansion analyzer and synonym filter
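Since Figure 7 is not reproduced in this text, the following Python sketch only suggests what such index settings might look like. The filter name, analyzer name, rules file name and the choice of the keyword tokenizer are all assumptions made for illustration; only the use of a synonym filter reading a rules file from the config folder is taken from the description above.

```python
import json

# Assumed name of the rules file copied to the Elasticsearch /config folder.
RULES_FILE = "aat_expansion_rules.txt"


def build_index_settings(rules_file):
    """Build index settings with a synonym filter that reads the rules file.

    The names "aat_expansion_filter" and "aat_expansion_analyzer" are
    hypothetical placeholders, not the names used in the experiment.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "aat_expansion_filter": {
                        "type": "synonym",
                        "synonyms_path": rules_file,
                    }
                },
                "analyzer": {
                    "aat_expansion_analyzer": {
                        "type": "custom",
                        # Assumption: keep each identifier as a single token.
                        "tokenizer": "keyword",
                        "filter": ["aat_expansion_filter"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_index_settings(RULES_FILE), indent=2))
```

The printed JSON would form the request body when creating the index; how it was actually submitted in the experiment (via the dashboard) is not shown here.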

A mapping was then created specifying how to handle values in the dct:subject subject indexing field (note: this was for demonstration and testing purposes; the actual naming of this field would have to be in accordance with the ARIADNE Elasticsearch index structure, as implemented). Note that genre expansion was configured during initial creation of the index and not at query time (see the index_analyzer / search_analyzer configuration settings in Figure 8); otherwise the expansion would run in both broader and narrower directions, leading to incorrect and potentially misleading results. This means that (by default) genre expansion of AAT concept identifiers would always be enabled in search, though possibly some method could be devised to override it within the search parameters and the associated user interface, if that was deemed necessary.


Figure 8: Adding a mapping specifying how to handle the subject indexing field
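As Figure 8 is not reproduced here, the sketch below is only an approximation of such a mapping. The setting names index_analyzer and search_analyzer follow the text above; the type name "item", the "string" field type and the analyzer names are assumptions, and newer Elasticsearch releases may use different setting names.

```python
import json


def build_item_mapping(index_analyzer, search_analyzer="keyword"):
    """Mapping for a hypothetical "item" type with one subject field.

    Expansion is applied by index_analyzer only; search_analyzer leaves
    the queried identifier unexpanded, so expansion runs in one direction.
    """
    return {
        "item": {
            "properties": {
                "dct:subject": {
                    "type": "string",  # assumed field type
                    "index_analyzer": index_analyzer,
                    "search_analyzer": search_analyzer,
                }
            }
        }
    }


if __name__ == "__main__":
    # "aat_expansion_analyzer" is a placeholder analyzer name.
    print(json.dumps(build_item_mapping("aat_expansion_analyzer"), indent=2))
```

Keeping the two analyzers different is the point of the design: the stored value is expanded to its ancestors once, while the query value is matched as given.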

Some sample data items indexed using the dct:subject field (with various AAT URI identifiers from the example in Figure 2) were created for testing purposes and added to the experimental index, using the commands shown in Figure 9.

Figure 9: Adding some sample items to the index for testing

Testing the item index

Testing was achieved by querying for the items indexed using specific AAT concept URIs. The example query shown in Figure 10 is searching for items indexed using a dct:subject field value of aat:300036926 (weapons). A number of queries were run using different dct:subject values.


Figure 10: Testing the genre expansion by querying for items indexed using specific AAT concepts

The results shown in Table 2 illustrate the effects of genre expansion. A search on aat:300036983 (battleaxes) retrieved only the single item indexed using that concept identifier, but a search on aat:300036973 (edged weapons) retrieved items indexed using that concept AND items indexed using any of the descendant concepts, in accordance with the AAT hierarchical structure example in Figure 3.

dct:subject search on AAT concept identifier | ID(s) of the items retrieved
aat:300036983 battleaxes | 10
aat:300036982 axes (weapons) | 10 & 11
aat:300036973 edged weapons | 10, 11 & 12
aat:300036926 weapons | 10, 11, 12, 13 & 14

Table 2: Results of searching for specific dct:subject values
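The behaviour summarised in Table 2 can be imitated without Elasticsearch by a small in-memory sketch. The hierarchy links and the concepts assigned to items 13 and 14 are assumptions chosen to be consistent with the table; the actual sample items appear only in Figure 9.

```python
# Hypothetical hierarchy (child -> parent) consistent with Table 2.
BROADER = {
    "aat:300036983": "aat:300036982",  # battleaxes -> axes (weapons)
    "aat:300036982": "aat:300036973",  # axes (weapons) -> edged weapons
    "aat:300036973": "aat:300036926",  # edged weapons -> weapons
}

# Item ID -> concept used for indexing; items 13 and 14 are assumed.
ITEMS = {
    10: "aat:300036983",
    11: "aat:300036982",
    12: "aat:300036973",
    13: "aat:300036926",
    14: "aat:300036926",
}


def expand(concept):
    """Index-time expansion: the concept plus all of its ancestors."""
    terms = {concept}
    while concept in BROADER:
        concept = BROADER[concept]
        terms.add(concept)
    return terms


def build_index(items):
    """Inverted index from each expanded term to the item IDs carrying it."""
    index = {}
    for item_id, concept in items.items():
        for term in expand(concept):
            index.setdefault(term, set()).add(item_id)
    return index


def search(index, concept):
    """Query-time lookup with no expansion of the search term."""
    return sorted(index.get(concept, set()))


if __name__ == "__main__":
    idx = build_index(ITEMS)
    for concept in ("aat:300036983", "aat:300036973", "aat:300036926"):
        print(concept, search(idx, concept))
```

Because only the indexed values are expanded, a search on a specific concept never pulls in items indexed at a broader level.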

Use of vocabulary resources

The previous documentation discusses genre expansion directly applied to registry items. A similar approach can therefore be taken to indexing and expanding the ARIADNE vocabulary concept resources themselves. Using the same test index as previously (ariadnedata) and the same analysers, some sample vocabulary concept resources were indexed. First, a new mapping specifying how to handle the concept metadata fields was added (Figure 11).


Figure 11: Adding a mapping specifying how to handle the concept metadata fields

Some sample concept metadata was then created for testing purposes and manually added to the experimental index, using the commands shown in Figure 12. A bulk import process would have to be adopted for importing the actual Getty AAT concept metadata, as it is a large dataset.


Figure 12: Inserting the metadata for some example concepts

Testing the concept index

As the genre expansion analyzer had already been created and configured, we could now perform semantic genre expansion queries directly on the vocabulary concept resources themselves. Note how the query shown in Figure 13 is quite similar to that shown in Figure 10, but this time we are searching the resources under /ariadnedata/concept for a specified dct:identifier value, which in this case is the AAT concept representing “weapons” (see Figure 3).

Figure 13: query to perform genre expansion on AAT concept 300036926 (“weapons”)

The results of this query are shown in Figure 14. The results include the specified concept AND all hierarchically descendant concepts in accordance with the AAT hierarchical structure (from Figure 3).


Figure 14: results of Elasticsearch genre expansion query on AAT concept 300036926 ("weapons")

This demonstrates one possible method of implementing the hierarchical semantic expansion of AAT concepts in Elasticsearch. The technique can improve the recall measure of query results without sacrificing precision. The full AAT “expansion rules” data file as produced could be reused in other projects, and the same approach can be easily adapted to other hierarchically structured knowledge organization resources, such as the Getty Thesaurus of Geographic Names.

The two prototype experiments also show the potential of working with the URI identifiers of AAT concepts rather than the ambiguous strings of term labels. Using the URI identifier for the concept avoids the problem of ambiguity, common in multilingual datasets, of terms that are homographs in different languages. Working at the concept level also makes possible hierarchical semantic expansion, making use of the broader generic (“IS-A”) relationships between concepts in a hierarchically structured knowledge organization system, such as the AAT. Thus a search expressed at a general level can (if desired) return results indexed at a more specific level. For example, a search on settlements might also return monastic centres.


3 Creating mappings for ARIADNE

Following the prototype experiment, the next step was to produce the mappings from the subject vocabularies employed to index the various datasets selected for the ARIADNE Catalogue. It was decided to proceed with a large scale pilot exercise with one ARIADNE partner, in order to allow for refinement of the methodology and mapping guidelines after reviewing the results.

The first complete mapping exercise was performed by ADS on SKOSified national heritage vocabularies for England, Scotland and Wales, using a custom linked data vocabulary matching tool developed by USW for the ARIADNE project. For details of the mapping exercise and the tool, see Binding and Tudhope (2015); the forthcoming D15.3 will discuss tools in more detail. Analysis of results from this pilot mapping informed an iteration of the mapping guidelines and the matching tool user interface. For example, it was decided that mapping to AAT Guide Terms (not normally used for indexing) was undesirable for ARIADNE purposes. Also, multiple mappings from the same source concept were only considered useful in certain circumstances. A complete set of mappings was then produced for the subject metadata used in the ADS data imported by the ARIADNE Registry. Examples of mappings from the ADS mapping exercise are shown in Table 3. These were reviewed by a senior archaeologist and the final mappings (after minor fine tuning) were communicated to the ATHENA DCU Registry team as RDF/JSON statements (see section 4). This exercise, together with the guidelines, was reviewed by the USW team. Revisions to the mapping guidelines included recommendations on the appropriate SKOS mapping relationship to employ in different contexts, and on when it is appropriate to specify more than one mapping for a given concept.

Source concept | match URI | Target concept
DITCHED ENCLOSURE http://purl.org/heritagedata/schemes/eh_tmt2/concepts/70361 | skos:broadMatch | agricultural settlements http://vocab.getty.edu/aat/300008420
CROFT http://purl.org/heritagedata/schemes/eh_tmt2/concepts/68617 | skos:closeMatch | smallholdings http://vocab.getty.edu/aat/300000211

Table 3: Examples from the ADS mapping exercise
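To make the structure of such a mapping concrete, the two rows of Table 3 can be written as RDF statements. The sketch below emits N-Triples lines, which is an assumption made for illustration (the text states only that the final mappings were communicated as RDF/JSON statements); the function name to_ntriple is hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# SKOS mapping properties accepted by this sketch.
ALLOWED = {"exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch"}

# (source concept URI, SKOS mapping property, target AAT URI) from Table 3.
MAPPINGS = [
    ("http://purl.org/heritagedata/schemes/eh_tmt2/concepts/70361",
     "broadMatch",
     "http://vocab.getty.edu/aat/300008420"),
    ("http://purl.org/heritagedata/schemes/eh_tmt2/concepts/68617",
     "closeMatch",
     "http://vocab.getty.edu/aat/300000211"),
]


def to_ntriple(source, match, target):
    """Serialise one mapping as a single N-Triples statement."""
    if match not in ALLOWED:
        raise ValueError("unknown SKOS mapping property: " + match)
    return "<%s> <%s%s> <%s> ." % (source, SKOS, match, target)


if __name__ == "__main__":
    for mapping in MAPPINGS:
        print(to_ntriple(*mapping))
```

Each mapping is a single subject, predicate, object statement, which is why it can be exchanged in any RDF serialisation without loss.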

The revised guidelines were employed in the mappings of vocabularies from the other partners (and see Appendix C). Following the review of the pilot mapping exercise, an additional, basic spreadsheet based utility was developed for recording mappings made manually in situations where the source vocabularies were not available as Linked Data (see D15.3, forthcoming).


3.1 Overview of mappings

Table 4: Summary of mappings with statistics on match type (as of June 2016). Note: for ADS, ICCU and INRAP the mappings are based on a subset of the source thesaurus terms


Table 4 gives a summary of mappings completed at time of writing. The vocabularies are described in Section 1 and reflections by partners on the mapping exercise are given in Section 3.2.

From the overall statistics we can see that in almost all cases mappings were established to the AAT. Some 50% were skos:exactMatch, with 18% skos:closeMatch and 27% skos:broadMatch. As expected only a small number (5%) were narrower matches; most partner vocabularies were considered to be reasonably congruent or were more specialized than the AAT. However, there were a few exceptions where the AAT was more specialized. The mapping guidelines steered partners away from using skos:relatedMatch but that was found useful by DANS in a very small number of cases when it was considered appropriate to make more than one mapping, perhaps additionally to a related activity (see discussion in Section 3.2). Considering the mapping choices made by individual partners, we can see some difference in the mapping relationships chosen, e.g. a higher proportion of skos:closeMatch in mappings for ADS, DANS (EASY), MNM-NOK, SND, ZRC-SASU. This could variously reflect the nature of the vocabularies involved or the style of the person doing the mapping (e.g. when thought appropriate to assert an exact match). Another factor could be the amount of contextual information available in the form of scope notes etc.: if no information other than the preferred term label is available, then that might be considered a reason to assert a skos:closeMatch relationship rather than skos:exactMatch.

3.2 Description and reflections on mapping exercise

A selection of reflections by ARIADNE content provider partners on their respective mapping exercises is given below.

ADS

ADS carried out initial evaluations regarding the suitability of the AAT to describe archaeological subjects, to determine whether it was an appropriate thesaurus for the multi-lingual mappings necessary for ARIADNE. Results were very positive during tests using the UK national vocabularies, and it was felt that the AAT was sufficient, although there were some odd areas of extreme detail (e.g. knives) and other areas where there was nothing directly comparable (e.g. human or animal remains). However, there was felt to be a sufficient range of SKOS mapping types available to handle these situations. There was also understood to be a certain amount of subjectivity in mapping choices, even for domain experts, and it was deemed good practice for the future to have mappings done by multiple people (essentially creating an authoritative mapping by attribution, or “expert crowdsourcing”).

ADS also carried out the initial mapping exercise to test the matching tool developed by USW, create the mapping to the UK thesauri, and provide an exemplar for other partners. It was determined to be impractical to do complete mappings of every term in the UK thesauri, so all the distinct terms in use by the ADS were mapped instead. This still represented around 1000 terms to be mapped, the majority of which were derived from the English Monument and Type thesaurus. ADS was able to achieve comprehensive coverage of their distinct terms mapped to the AAT. Inevitably there were some broad matches in cases where the granularity of the AAT does not match the more fine-grained detail of the archaeology domain, but it was confirmed that the AAT does give sufficient breadth and depth of domain coverage for some very good matches on all the terms used, despite these being quite diverse, including maritime craft, organic and inorganic materials, objects and monument types. The mapping exercise also clearly showed that purely automated string matching would have been insufficient, and that expert input was necessary (e.g. Alan Williams Turret => field fortifications, lynchet => agricultural land, etc.).


AIAC

Some 130 mappings from the Fasti monument thesaurus to the AAT were provided by USW to Fasti. These were imported via a script, and an interface was developed to edit them on the Fasti Admin page. Several more mappings were added with this interface and some minor corrections were made to the mappings from USW. These are now available on the Fasti website as part of the published Fasti concepts via http://www.fastionline.org/concept/attributetype/monument.

By providing a URI it is possible to refer to these thesaurus items in a controlled way, with an explicit reference to the AAT and to the translations to many languages that are available in Fasti. The concepts use both an English ‘human readable’ URI and a numeric URI using identifiers from the Fasti database, to create language independent identifiers in a manner reflecting the AAT URIs. These mappings were used to make sure that the terms in the OAI-PMH XML match the terms used in the ARIADNE Portal for ingestion. It is planned that before the end of the ARIADNE project, these mappings will be made available to the public throughout the Fasti interfaces so that the concepts are usefully defined. The process of issuing URIs for the concepts used in Fasti will add meaning to the presented data, by linking to existing enriched thesauri.

DAI

The IT infrastructure of the German Archaeological Institute (DAI) contains many different subject specific information systems, e.g. for excavations and surveys (iDAI.field), for objects and publication of data (Arachne), for bibliographical information (Zenon) and for digitized books (iDAI.bookbrowser). While the places are already centrally structured within the iDAI.gazetteer (http://gazetteer.dainst.org/) and all information systems refer to the gazetteer, each of the systems has its own vocabulary for describing the stored objects. At the moment work is ongoing to harmonize the different DAI thesauri to one common standard in iDAI.vocab (http://archwort.dainst.org/).

For the mapping activities in ARIADNE, the relevant vocabulary categories of the object database Arachne were chosen, as Arachne, in contrast to iDAI.field, contains more than 3.6 million datasets, a large proportion of which are openly available. The vocabulary of the following categories was mapped to Getty AAT:

• Topographie (eng. Topography, http://arachne.dainst.org/category/?c=topographie): Arachne’s most granular object unit, which is the superior context for all related classes and includes landscapes, sites, and parts of sites. It is mapped to the ACDM class “sites and monuments” and contains 55 values mapped to Getty AAT from two different value lists.

• Bauwerke (eng. Buildings): This class comprises buildings and monuments, which form a context for single object records and could be part of a larger site. It is mapped to the ACDM class “sites and monuments” and contains 176 values mapped to the Getty AAT from four different value lists.

• Mehrteilige Denkmäler (eng. Multipart monuments): All kinds of groups which are not buildings or topographic units are subsumed into multipart monuments, e.g. groups of statues, graveyards, hoards. This class is mapped to the ACDM classes “sites and monuments” or “burials”, depending on the object type, and contains 108 values mapped to the Getty AAT from six different value lists.

• Sammlungen (eng. Collections): Private and museum collections belong to this class. It is mapped to the ACDM class “diverse” and contains 11 values mapped to the Getty AAT from two different value lists.

• Bücher (eng. Books): Digital reproduction, characterization and context of classical study prints from the 16th to 19th century. It is mapped to the ACDM class “textual documents” and contains 17 values mapped to the Getty AAT from three different value lists.

• Inschriften (eng. Inscriptions): This class contains inscriptions and epigraphs depicted on objects. It is mapped to the ACDM class “textual documents” and contains 19 values mapped to Getty AAT from one value list.


DANS

DANS translated the ABR terms into English as a first step towards mapping the DANS EASY Complextypen to the AAT. As DANS discovered, translating a term and understanding the concept it stands for go hand in hand. Associated with this work, DANS translated the terms not only to English, but also to German, French, Italian, Spanish and Czech, with the help of colleagues and volunteers, many of whom had no archaeological background. The process of trying to find translations in different languages helped in better understanding and “pinning down” the concept and thus finding an optimal AAT mapping for it. Besides the websites of the AAT (Getty and the Dutch RKD) and the site of the ABRplus (RCE), DANS also used Wikipedia. Even if a term was not a Wikipedia lemma, DANS could sometimes find it mentioned in a description of an evidently related lemma. Most of the matches found were either skos:closeMatch or skos:broadMatch. Finding the mappings was far from easy, however. Firstly, it was difficult to understand the archaeological concept behind the ABR term when only the term and the hierarchical context were available (without scope notes). Secondly, it was sometimes difficult to understand AAT concepts when they reflected a perspective not specifically archaeological. For example, in some cases, the DANS (EASY) ABR essentially captured the notion of a place where an activity occurred and this had no exact match in the AAT. In these situations, a skos:broadMatch was sometimes generated plus an additional skos:relatedMatch to a corresponding activity, material or object. Future work will consider steps for making use of the Dutch translations in the ARIADNE Portal and in archaeological terminology resources more generally.

The Tree Ring Data Standard (TRiDaS)

TRiDaS (Jansma et al., 2010) was designed collaboratively by dendrochronologists and computer scientists to accurately describe the wealth of data and metadata used in dendrochronological research. The standard supports information produced by all sub-disciplines of dendrochronology, not just archaeological and historical research, facilitating the exchange of data within and between sub-disciplines. Controlled vocabularies within TRiDaS are a key aspect enabling this exchange of data.

TRiDaS provides two mechanisms for describing vocabulary entries. For concepts with limited (<20 terms), relatively static vocabularies there are 'normalTridas' term lists defined within the TRiDaS schema. Examples of this include: dating type; timber shape; measurement method; and measurement variable (see Table 5). These simple lists of terms were devised during the design of the standard itself with the potential to extend them if necessary when the standard is revised.


normalTridas vocabulary | Description
Dating type | Typically dating in dendrochronology is absolute; however, there are circumstances where this isn't the case. The dating type allows the user to define if the dating is relative or dated with uncertainty, typically using radiocarbon.
Location type | The type of location recorded for dendrochronological samples can be extremely important when interpreting results. For example dendrochronological data can be used for palaeoenvironmental reconstructions, but for these analyses to be valid the growth location of the tree is required. Samples can be taken from trees in their growth location, from items (such as ships) that are inherently mobile, or from items (such as buildings) that are static.
Measuring method | There are a number of methods used for recording dendrochronological measurements depending on the circumstances, each with their pros and cons.
Remark | Observations about individual tree rings can be an extremely useful indicator of environmental change. The TRiDaS remark vocabulary standardises the most common features such as: false rings; missing rings; and frost damage.
Shape | This vocabulary standardises the description of the shape of timbers.
Unit | The unit vocabulary standardises the units for both ring-width measurements and measurements of timbers.
Variable | The typical measurement variable in dendrochronology is the ring-width; however researchers may also record sub-annual measurements (early/late wood), various density metrics, and vessel size. This vocabulary is likely to be revised as novel approaches are developed.

Table 5: Summary of the 'normalTridas' vocabularies used in TRiDaS. These short, simple term lists are defined within the TRiDaS schema and are relatively static

The second and more typical style of vocabulary in TRiDaS is the 'controlledVoc' datatype. This enables users to define links to external vocabularies with a standardised term and identifier. This mechanism was designed into TRiDaS, recognising the rapid development of standard vocabularies that are suitable for use in dendrochronological research.

While the TRiDaS development team intended for the standard to largely use external vocabularies as they become available, they also acknowledged the short-term needs of the dendrochronological community. As such, a vocabulary was developed for use primarily by members of the Digital Collaboratory for Cultural Dendrochronology (DCCD – Jansma et al., 2012) project describing the object/element types used in dendrochronological research. These range from the obvious “tree”, to many items found in the archaeological and cultural record e.g. buildings, barrels, ships, doors, musical instruments, paintings etc.

The object/element vocabulary was written as a multilingual (English, Dutch, French and German) flat-table containing no hierarchical relationships. Terms in one or more languages with no direct translations caused confusion and overlapping concepts. Many of the terms have exact matches with the AAT. However, substantial proportions are specialist terms (especially nautical terms) that have only very generic matches.


During the course of the ARIADNE project the DCCD object / element vocabulary has been substantially reworked. Using bespoke scripts, the redundancy within the flat table has been removed, and basic hierarchical relationships defined. The simple terms list has been converted to a true concepts-based vocabulary with redundant terms assigned as alternate labels. Links to the AAT have been established for all concepts (either exact or broader relationships) and scope notes added.

The majority of the effort required to rework the vocabulary came from content specialists. Combining the specialist knowledge for all subject areas across four languages was painstaking work. Attempts to locate existing software aimed at content rather than informatics specialists were unsuccessful. Such a tool is sorely needed to fully leverage the knowledge of content specialists.

The enhanced vocabulary is currently in the process of being incorporated back into the DCCD repository. The ambiguous nature of the original term list means work is required to cross-map existing records to the new vocabulary, and in some cases this unfortunately requires consultation with the original data providers.

The second substantial vocabulary used in TRiDaS/DCCD is the species taxonomic dictionary. The basis of this vocabulary is the Species 2000 and ITIS Catalogue of Life (http://catalogueoflife.org/). The Catalogue of Life (CoL) forms the taxonomic backbone for many major projects including the Global Biodiversity Information Facility (GBIF), the Encyclopedia of Life (EoL) and the IUCN red list of endangered plants and animals. Annual editions of the CoL have been produced since 2000 with the most recent edition including over 1.6 million species from 158 contributing databases. While the CoL is an incredible resource, it suffers from the drawback that there is no linkage between concepts in each edition. While efforts are underway to produce a true SKOS mapping, this is not yet available. In the interim, TRiDaS/DCCD is using a static subset with the intention of migrating to the dynamic CoL SKOS once released.

ICCD/RA Thesaurus

The issue of multilingualism is a matter that needs to be taken into account, not only because of the variety of national thesauri that are going to be integrated by the ARIADNE initiative, but also for the future creation of common and transnational terminological tools. Linguistic issues often make the direct mapping of a concept via the skos:exactMatch property to the AAT concept difficult. However, other mapping relationships are available. The conceptual mapping between the ICCD RA Thesaurus and AAT has been completed and revised; for this purpose it was decided to manually construct a mapping from the various terms and functions (if any), following in sequence the three main categories of the RA Thesaurus. The work pattern was based on an Excel representation of the thesaurus to which additional columns were added in order to specify:

• The targetLabel and the identifier (targetURI) of the corresponding concept selected in AAT

• The matchURI, one of the SKOS mapping properties (skos:closeMatch; skos:exactMatch; skos:broadMatch)

• The name of the institution in charge of the definition of each specific mapping (creator)

Only a subset of the RA Thesaurus was taken into account to demonstrate the feasibility of these operations. The subset includes 1191 terms related to 10 major categories (highlighted in the original source as "livello_1_categoria") relating to:

● CLOTHING AND ACCESSORIES
● FURNISHING
● TRANSPORTATION
● CONSTRUCTION INDUSTRY
● PAINTING
● ARCHAEOBOTANICAL FINDINGS
● ARCHAEOZOOLOGICAL FINDINGS
● SCULPTURE
● INSTRUMENTS - TOOLS AND OBJECTS OF USE
● GENERAL TERMS

The analysis for finding the corresponding entries in the AAT thesaurus took into account the information provided by scope notes and images accompanying each concept; extensive web searches were performed to find the most appropriate matching term between Italian and English; and terminological research was carried out using different resources to identify synonyms to make the associated targetLabel as unique and as precise as possible. The mapping work also includes a further 113 terms and the COINS category (derived from the "dc:title" element of XML files uploaded to CulturaItalia and delivered to the ARIADNE Portal). In total, the thesaurus includes 11 categories and 1304 terms. The mapping work has identified the following SKOS match types:

• 642 skos:exactMatch;
• 94 skos:closeMatch;
• 310 skos:broadMatch;
• 258 skos:narrowMatch.

The mapping methodology adopted is based on the following three examples of association provided in the table:

Categoria livello 1 | livello 2 | livello 3 | livello 4 termine | targetLabel | AAT ID | matchLabel
Mezzi di trasporto | Terrestri | A trazione animale | cisium | two-wheeled carriages | 300215685 | broad match
Strumenti - Utensili e Oggetti d’uso | Armi e Armature | Armi da difesa | farsetto da armare | arming doublets | 300226824 | close match
Scultura | | | imago clipeata | clipei (portraits) | 300178246 | exact match

Table 6: Examples of mappings between ICCD/RA terms and Getty AAT concepts

In reflection, the most significant activity, from the scientific-methodological point of view, has been the review of the whole process. Started as a point-by-point check of “1:1” correspondence between the terms of the two terminology tools (thesaurus ICCD/RA and AAT), this review has expanded to map the terminological categories relating to individual entries to the codes referring to the AAT facet and hierarchy. This has made possible:

1. Disambiguating and correcting matches previously selected, often lexically correct but decontextualised from their original domain;
2. Providing the basis for future matching between different categories of multilingual thesauri.

It is worth emphasising that the focus of the mapping work is the concept of individual terms, meant as records entered in a complete hierarchical structure of related terms and notes. Among the results achieved, and highlighted through the mapping between classes, is the high level of correspondence between the ICCD/RA thesaurus entries and the AAT thesaurus record types. Out of 1,191 basic records, 1,164 are linked to “concept” and only 27 to “guide term”. According to the AAT Thesaurus guidelines:

• Concept: Refers to records in the AAT that represent concepts; records for concepts include terms, a note, and bibliography.

• Guide term: Refers to records that serve as place savers to create a level in the hierarchy under which the AAT can collocate related concepts. Guide terms are not used for indexing or cataloguing.

INRAP (FRANTIQ)

DOLIA is the catalogue of the archaeological reports at the French National Institute for Preventive Archaeological Research (Inrap). The DOLIA catalogue was developed with Flora 3.1.0 software, created by Everteam (© Everteam 2015) http://dolia.inrap.fr:8080/flora/jsp/index.jsp. The reports, stored in pdf format, are indexed with native subjects inherited from the Pactols “Sujets/Subjects” thesaurus.

The DOLIA catalogue currently has 1,573 (5,149 occurrences) subject metadata terms in the Pactols thesaurus. The current mapping concerns only the indexed terms from the DOLIA catalogue used in ARIADNE.

The alignment has been done between those terms and the AAT thesaurus by using a source term from Pactols, a source URI, a target term from AAT and a target URI, specifying the SKOS match.

E.g. Pactols: Archéologie http://ark.frantiq.fr/ark:/26678/pcrty05M9SVnLu skos:exactMatch AAT: archaeology http://vocab.getty.edu/aat/300054328
or Pactols: amphore gauloise http://ark.frantiq.fr/ark:/26678/pcrtiUhJYvi7PG skos:broadMatch AAT: amphorae (storage vessels) http://vocab.getty.edu/aat/300148696


Match type | Mappings | Proportion
skos:exactMatch | 1161 | 71%
skos:closeMatch | 121 | 7%
skos:broadMatch | 346 | 21%
skos:narrowMatch | 6 |

Table 7: result of the alignment
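The proportions in Table 7 follow directly from the mapping counts, as the short Python check below illustrates. The counts are those given in the table; the function name proportions is hypothetical.

```python
# Mapping counts per SKOS match type, as listed in Table 7.
COUNTS = {
    "skos:exactMatch": 1161,
    "skos:closeMatch": 121,
    "skos:broadMatch": 346,
    "skos:narrowMatch": 6,
}


def proportions(counts):
    """Return each count as a percentage of the total number of mappings."""
    total = sum(counts.values())
    return {match: 100.0 * n / total for match, n in counts.items()}


if __name__ == "__main__":
    for match, pct in proportions(COUNTS).items():
        print("%-18s %5.1f%%" % (match, pct))
```

The narrower matches account for well under 1% of the total, which would explain why no rounded proportion is given for that row.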

A complete mapping of the Pactols Subjects is planned in the next few months.

Irish Monuments Vocabulary

The most detailed classification system available for Irish Monument types is the class list developed by the National Monuments Service (NMS). This is a flat / simple hierarchical list which was used in the classification of sites and monuments that formed part of the Archaeological Survey of Ireland, which was established to compile an inventory of the known archaeological monuments in the State. The information is stored on a database and in a series of paper files that collectively form the ASI Sites and Monuments Record (SMR). Each site/monument has a unique SMR number which greatly facilitates the creation of Linked Data, and each site/monument is given a classification based on the NMS class list. The development of the list was an organic and evolving process and the list is subject to review with amendments being made on an on-going basis.

Irish Monuments Mapping was undertaken by the Discovery Programme in order to map the subject classifications in the NMS list to the Getty AAT. This was done for each term by comparing the scope notes of the NMS class list to the notes field of the AAT Online. This automatically introduces a level of subjectivity which was countered by using an appropriate SKOS mapping property when linking to the target vocabulary (AAT). Where there was any ambiguity about the term, broader mapping properties were always used.

In certain cases where mappings were difficult and could be more closely related to the UK FISH Thesaurus of Monument Types, the Vocabulary Matching Tool developed by USW was first used to identify matching terms, which were in turn mapped to the AAT (i.e. a two stage mapping process).

The nature of the classification list of the NMS presented occasional difficulties:

• Some classifications contained highly detailed elements e.g. object terms were refined at term level by their present location [Cist (present location)] or were developed in order to classify idiosyncratic sites [turf stand; watchman’s hut - burial ground].

• There was greater congruence between the FISH Monument Type vocabulary and the Irish subject terms enabling greater possibilities to find an exact or close match. In some cases terms had clearly been based on the FISH vocabulary. This was to be expected due to geographical / historical contiguity. For example bullaun stone, for which there are over 1000 currently documented in the ASI, relates more closely to a ‘cup-marked stone’ in FISH but can only be satisfactorily mapped using two (or more) terms in the Getty AAT [ceremonial objects; mortaria].

• Some terms are not clearly defined in the NMS class list [e.g. settlement platform: ‘A raised area, often surrounded by waterlogged or boggy land, which has evidence of former human habitation’] which made mapping, even at a high level, difficult.

• Subject definitions often included broad period classifications within the scope note; it was decided not to take this into consideration as period terms could be covered by the Irish Periods Vocabulary. Occasionally terms contained period terms in their term name (e.g. House - 16th century; House - 16th/17th century) as well as a refining subject element (e.g. House - fortified house). This necessitates both the use of the Irish Periods Vocabulary and/or additional terms from the AAT.

• Some classifications were subdivided (but not hierarchically) into more specific elements (e.g. Ringfort - cashel; Ringfort - rath; Ringfort - unclassified). The granularity of the terms was conserved by using the appropriate mapping property, in some cases by mapping terms to multiple terms in the target vocabulary, e.g.

§ Ringfort - cashel -> [skos:broadMatch] -> raths
§ Ringfort - cashel -> [skos:broadMatch] -> dry walls (masonry)

The mapping process attempted to balance the pressing need to implement Linked Data with the reality that the available vocabulary was rich in detail, but lacked a structure that was easily reconciled with standard concepts of controlled vocabularies and indexing. This was largely achieved by multiple mappings to the target vocabulary, as well as by utilising an intermediate vocabulary which more closely reflected the particular nuances of Irish monument types.

4 Mappings in the ARIADNE infrastructure

The ARIADNE Catalogue Data Model (ACDM) specifies the metadata schema that underpins the ARIADNE infrastructure (see D12.2: Infrastructure Design). The ACDM is based on the DCAT vocabulary, adding classes and properties needed for describing ARIADNE assets. The ARIADNE Catalogue aggregates metadata, such as descriptions for datasets, metadata schemas, vocabularies, etc. provided by the project partners through metadata file uploads, or the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).1 The metadata and object repository aggregator (MORe)2 (Isaac et al., 2013) has been customized for ARIADNE purposes and is driven by the ACDM. MORe includes a set of micro-services, including various metadata enrichment services. For ARIADNE purposes, a bespoke derived AAT subject enrichment service has been developed by ATHENA DCU that applies the partner vocabulary mappings (in JSON format) to the partner subject metadata and derives an AAT concept (both preferred label and URI) to augment the subject metadata, both in the Registry and also supplied to the ARIADNE Portal.

For subject access, the ACDM Archaeological Resource class has two kinds of subject property. The property, native-subject, associates the resource with one or more items from a controlled vocabulary used by the data provider to index the data. However, as discussed in Section 2.2, there are a large number of partner vocabularies in several different languages, and cross search is rendered difficult, as there are no semantic links or mappings between the various local vocabularies. The established solution to this problem is to employ mapping between the concepts in the different vocabularies. However, as discussed above, the creation of links directly between the items from different vocabularies can quickly become unmanageable as the number of vocabularies increases. A scalable solution to this mapping problem is to employ the hub architecture, an intermediate structure to which concepts from the ARIADNE data provider source vocabularies can be mapped (ISO 2013). In the portal, retrieval based on a concept from one vocabulary (in a search or browsing operation) can use the hub to connect to subject metadata from other vocabularies, possibly expressed in other languages. In the ACDM, ariadne-subject is used for shared concepts from the hub vocabulary (the AAT), which have been derived via the various mappings from source vocabularies. This underpins the MORe enrichment services augmenting the data imported to the Registry with mapped hub concepts. These derived subjects in turn make possible concept based search and browsing in the ARIADNE Portal. It is thus anticipated that the mappings can form one of the stepping stones towards a multilingual capability in the Portal.
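The hub idea can be illustrated with a minimal Python sketch. It assumes the mappings are held in memory as (source concept, match type, AAT concept) tuples; the tuples are taken from the cemetery and catacomb mappings in Appendix A, while the function names and the data layout are illustrative assumptions and not part of the ARIADNE implementation. The last function simply counts how many vocabulary pairs would need their own mapping set if no hub were used.

```python
from collections import defaultdict

# (source concept, SKOS match type, AAT hub concept), taken from Appendix A.
MAPPINGS = [
    ("iccd:cimitero", "skos:closeMatch", "aat:300266755"),
    ("fasti:cemetery", "skos:closeMatch", "aat:300266755"),
    ("dai:1819", "skos:closeMatch", "aat:300266755"),
    ("tmt:70060", "skos:broadMatch", "aat:300266755"),
    ("iccd:catacomba", "skos:closeMatch", "aat:300000367"),
    ("fasti:catacomb", "skos:closeMatch", "aat:300000367"),
]


def build_hub_index(mappings):
    """Index source concepts by the hub concept they are mapped to."""
    to_hub = defaultdict(set)    # source concept -> hub concepts
    from_hub = defaultdict(set)  # hub concept -> source concepts
    for source, _match, hub in mappings:
        to_hub[source].add(hub)
        from_hub[hub].add(source)
    return to_hub, from_hub


def cross_vocabulary_concepts(concept, to_hub, from_hub):
    """Return concepts from other vocabularies reachable through the hub."""
    related = set()
    for hub in to_hub.get(concept, ()):
        related |= from_hub[hub]
    related.discard(concept)
    return sorted(related)


def direct_mapping_sets(n):
    """Number of vocabulary pairs needing mappings when no hub is used."""
    return n * (n - 1) // 2


to_hub, from_hub = build_hub_index(MAPPINGS)
print(cross_vocabulary_concepts("iccd:cimitero", to_hub, from_hub))
print(direct_mapping_sets(5), "pairwise mapping sets versus 5 via the hub")
```

With five source vocabularies, direct linking needs ten pairwise mapping sets, whereas the hub needs one mapping set per vocabulary; the gap widens as vocabularies are added.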

4.1 Mapping enrichment process

The AAT Linked Open Data that forms the basis of the ARIADNE mapping hub vocabulary is expressed in a combination of ontological models including SKOS. The appropriate representation for the mappings is via SKOS mapping properties (see SKOS Mapping Properties). The output from the mapping tools of the partner mappings from their source vocabularies to the AAT is transformed to the required JSON format by USW for communication to the Registry team at DCU, where it is processed by the relevant MORe enrichment services. A brief example of this JSON format is given in Appendix B.

The information from the mapping tool is passed to MORe, which associates it with the provider of the vocabulary. It updates the property derived-subject using the AAT mappings and enriches an ACDM record (see Figure 15), adding a broader term, or a skos:altLabel to correlate a term using the ‘use for’ relationship, or adding multilingual labels (skos:prefLabel and skos:altLabel) in order to facilitate multilingual search.
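The enrichment step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MORe implementation, whose internals are not described in this report: the mapping entries follow the JSON exchange format of Appendix B, while the record layout (the keys native_subject and derived_subject) and the function names are hypothetical.

```python
import json

# Two mapping entries in the JSON exchange format of Appendix B
# (the "created" field is omitted here for brevity).
MAPPINGS_JSON = """
[
  {"sourceURI": "http://www.fastionline.org/concept/attribute/abbey",
   "sourceLabel": "Abbey",
   "matchURI": "http://www.w3.org/2004/02/skos/core#closeMatch",
   "targetURI": "http://vocab.getty.edu/aat/300000642",
   "targetLabel": "abbeys (monasteries)"},
  {"sourceURI": "http://www.fastionline.org/concept/attribute/ancient_beach",
   "sourceLabel": "Ancient beach",
   "matchURI": "http://www.w3.org/2004/02/skos/core#broadMatch",
   "targetURI": "http://vocab.getty.edu/aat/300008816",
   "targetLabel": "beaches"}
]
"""


def index_mappings(mappings):
    """Group mapping entries by the URI of the source concept."""
    index = {}
    for entry in mappings:
        index.setdefault(entry["sourceURI"], []).append(entry)
    return index


def enrich(record, index):
    """Return a copy of the record with derived AAT subjects added.

    The keys 'native_subject' and 'derived_subject' are hypothetical
    stand-ins for the ACDM properties named in the text.
    """
    derived = []
    for subject_uri in record.get("native_subject", []):
        for entry in index.get(subject_uri, []):
            derived.append({"uri": entry["targetURI"],
                            "prefLabel": entry["targetLabel"],
                            "match": entry["matchURI"]})
    enriched = dict(record)
    enriched["derived_subject"] = derived
    return enriched


index = index_mappings(json.loads(MAPPINGS_JSON))
record = {"title": "Example record",
          "native_subject": ["http://www.fastionline.org/concept/attribute/abbey"]}
print(enrich(record, index)["derived_subject"])
```

Keeping the match type alongside the derived concept lets a later search step distinguish close matches from broader ones if that is desired.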

1 http://www.openarchives.org/pmh/
2 http://more.dcu.gr/

Figure 15: MORe enrichment

4.2 Mappings within the ARIADNE portal

At the time of writing, development of the ARIADNE Portal and the search functionality is still ongoing, with mappings still being imported from some partners. However, it is possible to have a preview at this stage. Figure 16 shows a query on the Portal making use of the mappings. On the main Results screen, a set of filters is available for refining a search following the faceted search paradigm. One filter, currently named Derived Subject, is populated by the MORe enrichment process described in section 4.1; effectively the Derived Subjects are AAT concepts, which have been mapped to the native vocabulary concepts that form the subject metadata of the data resources in the Portal. Figure 16 shows that a simple query on the single AAT (mapped) concept, churches (buildings), is able to retrieve results in multiple languages from AIAC (Fasti), DAI and DANS ARIADNE content providers. Results from ADS are also returned though not shown in this screen dump, which only shows a small number of the overall total results.

Figure 16: Portal Query on AAT mapped subject: churches (buildings) showing results from AIAC (Fasti), DAI and DANS, with multiple languages (June 2016)

In future work, making the mappings (and mapping services) fully available as outcomes in their own right, with appropriate metadata for the mappings, would be desirable, as more than one mapping may be produced for large vocabularies. The mappings may also serve to underpin a multilingual capability in an initial string search, by augmenting the language coverage of the AAT.

Full technical documentation about the mappings presented in this report is available for download from the ARIADNE website.

5 Conclusion

This report has reviewed the key vocabularies considered relevant to the ARIADNE project. Mapping between vocabularies has been shown to be a key aspect for concept based search, avoiding the ambiguities posed by literal string search and making possible a multi-lingual search capability. The Getty AAT was selected as a mapping hub vocabulary and partner native vocabularies have been mapped to it using SKOS mapping relationships and bespoke mapping utilities. The mappings have been incorporated into the Registry enrichment process so that partner subject metadata has been augmented by AAT concepts.

The two prototype experiments also show the potential of working with the URI identifiers of AAT concepts rather than the ambiguous strings of term labels. Using the URI identifier for the concept avoids the problem of ambiguity, common in multilingual datasets, for terms that are homographs in different languages. Working at the concept level also makes possible hierarchical semantic expansion, making use of the broader generic (“IS-A”) relationships between concepts in a hierarchically structured knowledge organization system, such as the AAT. Thus a search expressed at a general level can (if desired) return results indexed at a more specific level. For example, a search on settlements might also return monastic centres.
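Hierarchical semantic expansion can be sketched with a toy example in Python. The hierarchy, the record identifiers and the function names below are hypothetical and serve only to illustrate the principle; real AAT concepts would be identified by URI and their hierarchy retrieved from the AAT Linked Open Data rather than hard-coded.

```python
# Toy hierarchy: each concept points to its broader ("IS-A") concept.
# Labels are used instead of URIs only to keep the sketch readable;
# "deserted settlements" and the record identifiers are invented.
BROADER = {
    "monastic centres": "settlements",
    "deserted settlements": "settlements",
    "settlements": None,
    "churches (buildings)": None,
}

# Hypothetical index: records keyed by the concept they are indexed with.
RECORDS = {
    "settlements": ["record-1"],
    "monastic centres": ["record-2"],
    "churches (buildings)": ["record-3"],
}


def narrower_closure(concept, broader):
    """Return the concept plus all concepts below it in the hierarchy."""
    expanded = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in broader.items():
            if parent in expanded and child not in expanded:
                expanded.add(child)
                changed = True
    return expanded


def search(concept, expand=True):
    """Retrieve records indexed by the concept, optionally expanded."""
    concepts = narrower_closure(concept, BROADER) if expand else {concept}
    hits = []
    for c in sorted(concepts):
        hits.extend(RECORDS.get(c, []))
    return hits


print(search("settlements", expand=False))
print(search("settlements"))
```

Without expansion only the record indexed directly with settlements is found; with expansion the record indexed with monastic centres is returned as well.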

An example from the ARIADNE Portal has illustrated the potential for the mappings to assist a query in retrieving results in multiple languages. The mappings have potential to underpin various options in the search functionality and user interface, offering a cost effective route towards different forms of multilingual functionality.

6 References

Aitchison, J., Gilchrist, A., Bawden, D. (2000). Thesaurus construction and use: a practical manual (4th edition). ASLIB, London.

ARIADNE project (2016). Available at: http://www.ariadne-infrastructure.eu/ [Accessed 15 Jun. 2016].

ARIADNE Catalog Data Model (ACDM)

Art and Architecture Thesaurus. J. Paul Getty Trust. http://www.getty.edu/research/conducting_research/vocabularies/aat/index.html [Accessed 15 Jun. 2016]

Berners-Lee, T. Linked Data. Available at: http://www.w3.org/DesignIssues/LinkedData.html

Binding, C., Tudhope, D. (2016). Improving Interoperability using Vocabulary Linked Data. International Journal on Digital Libraries, 17(1), 5-21. Springer.

Bizer, C., Heath, T., Berners-Lee, T. (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, 5(3): 1–22.

Caracciolo, C., Stellato, A., Rajbahndari, S., Morshed, A., Johannsen, G., Jaques, Y. and Keizer, J. (2012). Thesaurus maintenance, alignment and publication as linked data: the AGROVOC use case. International Journal of Metadata, Semantics and Ontologies, 7(1): 65-75. Inderscience.

Caracciolo, C., Stellato, A., Morshed, A., Johannsen, G., Rajbahndari, S., Jaques, Y. and Keizer, J. (2013). The AGROVOC Linked Dataset. Semantic Web, 4(3): 341-348. IOS Press.

Charles, V., Devarenne, C. (2014). Europeana enriches its data with the AAT. EDM case study. Available at: http://pro.europeana.eu/page/europeana-aat [accessed 30/11/2015]

Data Catalog Vocabulary (DCAT). Available at: http://www.w3.org/TR/vocab/dcat/

Getty Research Institute (2016a). Getty Vocabularies [online] Available at: http://www.getty.edu/research/tools/vocabularies/ [Accessed 15 Jun. 2016].

Getty Research Institute (2016b). Getty Vocabularies as Linked Open Data. [online] Available at: http://www.getty.edu/research/tools/vocabularies/lod/ [Accessed 15 Jun. 2016].

Getty Research Institute (2016c). Getty Vocabularies SPARQL endpoint. [online] Available at: http://vocab.getty.edu/sparql/ [Accessed 15 Jun. 2016].

Gormley, C., Tong, Z. (2015). Elasticsearch – The Definitive Guide. Genre Expansion [online] Available at: https://www.elastic.co/guide/en/elasticsearch/guide/current/synonyms-expand-or-contract.html#synonyms-genres

Harpring, P. (2016). Art and Architecture Thesaurus: Introduction and Overview. http://www.getty.edu/research/tools/vocabularies/aat_in_depth.pdf [Accessed 15 Jun. 2016].

Heritagedata.org. (2016). Linked Data Vocabularies for Cultural Heritage [online] Available at: http://www.heritagedata.org/ [Accessed 15 Jun. 2016].

Isaac, A., Charles, V., Fernie, K., Dallas, C., Gavrilis, D. and Angelis, S. (2013). Achieving Interoperability between the CARARE schema for Monuments and Sites and the Europeana Data Model, in Proceedings of the International Conference on Dublin Core and Metadata Applications, DC-2013. Lisbon, Portugal, 115–125.

ISO 25964-1:2011. Information and documentation - Thesauri and interoperability with other vocabularies - Part 1: Thesauri for information retrieval. Available at: http://www.niso.org/schemas/iso25964/#part1 [accessed 30/11/2015]

ISO 25964-2:2013. Information and documentation - Thesauri and interoperability with other vocabularies - Part 2: Interoperability with other vocabularies. Available at: http://www.niso.org/schemas/iso25964/#part2 [accessed 30/11/2015]

Jansma, E., Brewer, P. and Zandhuis, I. (2010). TRiDaS 1.1: The tree-ring data standard. Dendrochronologia, 28(2), pp. 99-130. Available at: http://dx.doi.org/10.1016/j.dendro.2009.06.009 [accessed 15/06/2016]

Jansma, E., van Lanen, R., Brewer, P. and Kramer, R. (2012). The DCCD: A digital data infrastructure for tree-ring research. Dendrochronologia, 30(4), pp. 249-251. Available at: http://dx.doi.org/10.1016/j.dendro.2011.12.002 [accessed 15/06/2016]

Kempf, A., Neubert, J. (2016). The Role of Thesauri in an Open Web: A Case Study of the STW Thesaurus for Economics. Knowledge Organization, 43(3), 160-173. Ergon Verlag.

Koch, T., Neuroth, H. and Day, M. (2003). Renardus: Cross-browsing European subject gateways via a common classification system (DDC). In: McIlwaine, I.C. (ed.) Subject retrieval in a networked world: proceedings of the IFLA Satellite Meeting held in Dublin, OH, 14-16 August 2001. (UBCIM Publications, New Series, Vol. 25). München: K.G. Saur, 25-33.

SKOS Mapping Properties. Available at: http://www.w3.org/TR/skos-reference/#L4138

SparqlGui (2016) - desktop RDF querying tool. Available at: https://bitbucket.org/dotnetrdf/dotnetrdf/wiki/UserGuide/Tools/SparqlGui

STW Thesaurus for Economics and associated web services. Leibniz Information Centre for Economics. Available at: http://zbw.eu/stw/ [accessed 30/11/2015]

Tudhope, D., Koch, T., Heery, R. (2006). Terminology Services and Technology: JISC state of the art review. Available at: http://www.jisc.ac.uk/media/documents/programmes/capital/terminology_services_and_technology_review_sep_06.pdf [accessed 15/06/2016]

Tudhope, D., Binding, C. (2016). Still Quite Popular After all Those Years - The Continued Relevance of the Information Retrieval Thesaurus. Knowledge Organization, 43(3), 174-179. Ergon Verlag.

Vizine-Goetz, D., Hickey, C., Houghton, A., Thompson, R. (2003). Vocabulary Mapping for Terminology Services. Journal of Digital Information, 4(4), Article No. 272, 2004-03-11. Available at: https://journals.tdl.org/jodi/index.php/jodi/article/view/114/113 [accessed 15/06/2016]

Zeng, M., Chan, L. (2004). Trends and issues in establishing interoperability among knowledge organization systems. Journal of American Society for Information Science and Technology, 55(5): 377-395. Wiley.

7 Appendix A

Concept mappings used for the prototype mapping exercise (Turtle RDF format):

# namespace prefixes
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix aat: <http://vocab.getty.edu/aat/> .
@prefix fasti: <http://fastionline.org/monumenttype/> .
@prefix iccd: <http://www.iccd.beniculturali.it/monuments/> .
@prefix dans: <http://www.rnaproject.org/data/> .
@prefix tmt: <http://purl.org/heritagedata/schemes/eh_tmt2/concepts/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix gvp: <http://vocab.getty.edu/ontology#> .
@prefix dai: <http://archwort.dainst.org/thesaurus/de/vocab/?tema=> .

# ICCD concepts
iccd:catacomba skos:prefLabel "catacomba"@it .
iccd:cenotafio skos:prefLabel "cenotafio"@it .
iccd:cimitero skos:prefLabel "cimitero"@it .
iccd:colombario skos:prefLabel "colombario"@it .
iccd:dolmen skos:prefLabel "dolmen"@it .
iccd:mausoleo skos:prefLabel "mausoleo"@it .
iccd:menhir skos:prefLabel "menhir"@it .
iccd:monumento-funerario skos:prefLabel "monumento funerario"@it .
iccd:necropoli skos:prefLabel "necropoli"@it .
iccd:sepolcreto-rupestre skos:prefLabel "sepolcreto rupestre"@it .
iccd:tomba skos:prefLabel "tomba"@it .

# ICCD -> AAT mappings
iccd:catacomba skos:closeMatch aat:300000367 .
iccd:cenotafio skos:closeMatch aat:300007027 .
iccd:cimitero skos:closeMatch aat:300266755 .
iccd:colombario skos:closeMatch aat:300000370 .
iccd:dolmen skos:closeMatch aat:300005934 .
iccd:mausoleo skos:closeMatch aat:300005891 .

iccd:menhir skos:closeMatch aat:300006985 .
iccd:necropoli skos:closeMatch aat:300000372 .
iccd:sepolcreto-rupestre skos:closeMatch aat:300387008 .
iccd:tomba skos:closeMatch aat:300005926 .

# DANS concepts
dans:8f14ae7e-3d66-4e85-b77c-454a261150e9 skos:prefLabel "begraving"@nl .
dans:e98c8cf0-aa0d-4fcd-99a2-db76cd1d827d skos:prefLabel "begraving, onbepaald"@nl .
dans:87a2f9e9-8e40-4c97-b17b-82275d54c78d skos:prefLabel "brandheuvelveld"@nl .
dans:be95a643-da30-40b9-b509-eadfb00610c4 skos:prefLabel "christelijk/joodse begraafplaats"@nl .
dans:77130cff-58e0-4c6d-b608-33fadc946283 skos:prefLabel "dierengraf"@nl .
dans:df17ef8a-1a58-4c58-ab6f-2e127c90c571 skos:prefLabel "grafheuvel"@nl .
dans:9a729782-ca06-47e1-aa50-87561f36a8ee skos:prefLabel "grafheuvelveld"@nl .
dans:6a7482e5-2fd5-48fb-baf4-66ad3d4ed95e skos:prefLabel "kerkhof"@nl .
dans:e1f67762-c405-42a5-b073-88c13043aab0 skos:prefLabel "megalietgraf"@nl .
dans:abb41cf1-30dc-4d55-8c18-d599ebba1bc2 skos:prefLabel "rijengrafveld"@nl .
dans:74899123-2b00-4e12-83f2-f37bc4f129ff skos:prefLabel "terechtstellingsplaats/galgenberg"@nl .
dans:b98f1315-91c5-411e-b91b-9693e5dfc5c2 skos:prefLabel "urnenveld"@nl .
dans:a156e09c-b40c-45a9-8487-d7b68f8dbae7 skos:prefLabel "vlakgraf"@nl .
dans:b935f9a9-7456-4669-91d0-2e9c0ff7d664 skos:prefLabel "vlakgrafveld"@nl .

# DANS -> AAT mappings
dans:8f14ae7e-3d66-4e85-b77c-454a261150e9 skos:closeMatch aat:300387004 .
dans:e98c8cf0-aa0d-4fcd-99a2-db76cd1d827d skos:closeMatch aat:300387004 .
dans:be95a643-da30-40b9-b509-eadfb00610c4 skos:broadMatch aat:300266755 .
dans:6a7482e5-2fd5-48fb-baf4-66ad3d4ed95e skos:closeMatch aat:300000360 .
dans:abb41cf1-30dc-4d55-8c18-d599ebba1bc2 skos:closeMatch aat:300266755 .
dans:b935f9a9-7456-4669-91d0-2e9c0ff7d664 skos:broadMatch aat:300266755 .

# EH-TMT concepts
tmt:70053 skos:prefLabel "cemetery"@en .
tmt:100531 skos:prefLabel "walled cemetery"@en .
tmt:92672 skos:prefLabel "mixed cemetery"@en .
tmt:70060 skos:prefLabel "inhumation cemetery"@en .

tmt:70056 skos:prefLabel "cremation cemetery"@en .
tmt:70055 skos:prefLabel "cairn cemetery"@en .
tmt:70054 skos:prefLabel "barrow cemetery"@en .
tmt:91386 skos:prefLabel "catacomb (funerary)"@en .
tmt:70053 skos:prefLabel "necropolis"@en .

# EH-TMT -> AAT mappings
tmt:70053 skos:closeMatch aat:300266755 .
tmt:100531 skos:broadMatch aat:300266755 .
tmt:92672 skos:broadMatch aat:300266755 .
tmt:70060 skos:broadMatch aat:300266755 .
tmt:70056 skos:broadMatch aat:300266755 .
tmt:70055 skos:broadMatch aat:300266755 .
tmt:70054 skos:broadMatch aat:300266755 .
tmt:91386 skos:closeMatch aat:300000367 .
tmt:70053 skos:closeMatch aat:300000372 .

# FASTI concepts
fasti:burial skos:prefLabel "Burial"@en .
fasti:catacomb skos:prefLabel "Catacomb"@en .
fasti:cemetery skos:prefLabel "Cemetery"@en .
fasti:columbarium skos:prefLabel "Columbarium"@en .
fasti:mausoleum skos:prefLabel "Mausoleum"@en .

# FASTI -> AAT mappings
fasti:burial skos:closeMatch aat:300387004 .
fasti:catacomb skos:closeMatch aat:300000367 .
fasti:cemetery skos:closeMatch aat:300266755 .
fasti:columbarium skos:closeMatch aat:300000370 .
fasti:mausoleum skos:closeMatch aat:300005891, aat:300263068 .

# DAI concepts
dai:1819 skos:prefLabel "Friedhof"@de . # cemetery
dai:1947 skos:prefLabel "Gräberfeld"@de . # graveyard
dai:3736 skos:prefLabel "Kolumbarium"@de . # columbarium

dai:2485 skos:prefLabel "Nekropole"@de . # necropolis

# DAI -> AAT mappings
dai:1819 skos:closeMatch aat:300266755 .
dai:1947 skos:closeMatch aat:300000360 .
dai:3736 skos:closeMatch aat:300000370 .
dai:2485 skos:closeMatch aat:300000372 .

8 Appendix B

Example of the JSON exchange format for communicating the mappings to the ARIADNE Registry team, using the mappings of three FASTI (AIAC) concepts to the AAT:

[
  {
    "created": "2015-11-20T15:27:13.342Z",
    "sourceURI": "http://www.fastionline.org/concept/attribute/abbey",
    "sourceLabel": "Abbey",
    "matchURI": "http://www.w3.org/2004/02/skos/core#closeMatch",
    "targetURI": "http://vocab.getty.edu/aat/300000642",
    "targetLabel": "abbeys (monasteries)"
  },
  {
    "created": "2015-11-20T15:27:13.342Z",
    "sourceURI": "http://www.fastionline.org/concept/attribute/amphitheatre",
    "sourceLabel": "Amphitheatre",
    "matchURI": "http://www.w3.org/2004/02/skos/core#exactMatch",
    "targetURI": "http://vocab.getty.edu/aat/300007128",
    "targetLabel": "amphitheaters (built works)"
  },
  {
    "created": "2015-11-20T15:27:13.342Z",
    "sourceURI": "http://www.fastionline.org/concept/attribute/ancient_beach",
    "sourceLabel": "Ancient beach",
    "matchURI": "http://www.w3.org/2004/02/skos/core#broadMatch",
    "targetURI": "http://vocab.getty.edu/aat/300008816",
    "targetLabel": "beaches"
  }
]
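Each entry in this exchange format corresponds to a single RDF statement: source concept, SKOS mapping property, target concept. The following Python sketch shows one possible way to render entries as N-Triples. It is an illustration only; the function name is hypothetical, the transformation actually used in the project is not described here, and the sketch assumes the URIs need no escaping.

```python
import json


def to_ntriples(mappings):
    """Render each mapping entry as one N-Triples statement."""
    lines = []
    for entry in mappings:
        # Only the three URIs are needed; labels and dates are ignored.
        lines.append("<{sourceURI}> <{matchURI}> <{targetURI}> .".format(**entry))
    return "\n".join(lines)


# One entry copied from the example above.
EXAMPLE = json.loads("""
[{"created": "2015-11-20T15:27:13.342Z",
  "sourceURI": "http://www.fastionline.org/concept/attribute/amphitheatre",
  "sourceLabel": "Amphitheatre",
  "matchURI": "http://www.w3.org/2004/02/skos/core#exactMatch",
  "targetURI": "http://vocab.getty.edu/aat/300007128",
  "targetLabel": "amphitheaters (built works)"}]
""")

print(to_ntriples(EXAMPLE))
```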

9 Appendix C

Extract from mapping guidelines

This document should be read in conjunction with Mapping-Template.xlsx. This document describes the columns in the spreadsheet template used for mapping partner source vocabularies (thesauri) to the Getty AAT (Art and Architecture Thesaurus), as part of the Subject access strategy for ARIADNE. The mappings will inform cross search for resource discovery in the ARIADNE Portal.

The mapping exercise matches concepts in a Partner vocabulary with concepts in the AAT using SKOS mapping relations (e.g. skos:broadMatch, http://www.w3.org/TR/skos-reference/#mapping). The document also contains guidelines for making the mappings.

The Mapping Template is an alternative to the (USW) Vocabulary Matching Tool, which requires that the source vocabulary be available as Linked Data. The Mapping Template allows mappings to be made by partners' own methods (e.g. using AAT and source vocabulary webpages, or some other tool) and represented in a spreadsheet. A separate spreadsheet should be produced for each partner vocabulary mapped to the AAT. The standard column names in the Mapping Template should be followed. This will allow a subsequent automatic transformation by USW to the RDF statements employed by the Registry and Portal.

The first tab in a partner mapping spreadsheet (Mapping-Template-partner-source.xlsx) should contain metadata and any necessary description of the mapping exercise. This can inform a subsequent VoID metadata description of the mapping. The metadata should include the following items using the first and second columns (please substitute the Name of Source Vocabulary for XXX):

dcterms:creator        Name of organisation doing the mapping

dcterms:created        Date of creation (one date representing a complete mapping exercise)

dcterms:modified       Date of last modification

dcterms:title          SKOS Mapping between concepts in source (XXX) and target (AAT) vocabularies using SKOS mapping properties.

void:subjectsTarget    URI of source vocabulary if known (e.g. http://purl.org/heritagedata/schemes/eh_tmt2)

void:objectsTarget     URI of target vocabulary (for ARIADNE this will be http://vocab.getty.edu/aat/)

dcterms:description    An intellectual matching made for ARIADNE from the source vocabulary XXX to the Getty AAT for resource discovery cross search purposes. Include here any details of method (hopefully with expert review)

dcterms:license        The Rights appropriate for Partner and ARIADNE, e.g. perhaps CC0 or CC_BY/3.0

The second tab should hold the mapping using the column names below (one spreadsheet for each different source vocabulary). A different mapping is specified in each row. The following column names (shown in bold in the template) are mandatory, as they are necessary for expressing the resulting RDF statements.

sourceLabel    (the preferred term or label for the concept)

sourceURI      (use URI if it exists, otherwise unique concept ID, otherwise prefLabel again)

matchURI       (skos:closeMatch | skos:exactMatch | skos:broadMatch)

targetLabel    AAT label for concept (e.g. smallholdings)

targetURI      AAT URI for concept (e.g. http://vocab.getty.edu/aat/300000211)
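A partner may wish to check a mapping sheet before sending it. The Python sketch below is one possible check, assuming the second tab has been exported to CSV (reading .xlsx directly would need a third-party library). The function name, the sample rows and the source labels are hypothetical; the column names and permitted match types are those listed above.

```python
import csv
import io

MANDATORY = ["sourceLabel", "sourceURI", "matchURI", "targetLabel", "targetURI"]
ALLOWED_MATCHES = {"skos:closeMatch", "skos:exactMatch", "skos:broadMatch"}


def validate_rows(csv_text):
    """Return a list of (row number, problem) pairs for a mapping sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in MANDATORY if c not in (reader.fieldnames or [])]
    if missing:
        return [(1, "missing columns: " + ", ".join(missing))]
    problems = []
    for number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in MANDATORY:
            if not (row[column] or "").strip():
                problems.append((number, "empty " + column))
        match = (row["matchURI"] or "").strip()
        if match and match not in ALLOWED_MATCHES:
            problems.append((number, "unexpected match type: " + match))
    return problems


# Hypothetical sample: the second data row is deliberately faulty.
SAMPLE = """sourceLabel,sourceURI,matchURI,targetLabel,targetURI
smallholding,smallholding,skos:closeMatch,smallholdings,http://vocab.getty.edu/aat/300000211
croft,croft,skos:relatedMatch,smallholdings,
"""

print(validate_rows(SAMPLE))
```

The check only covers the mandatory columns; optional columns are ignored, so partners remain free to add whatever working columns they find useful.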

Additional optional columns may be useful while creating the mappings or for human inspection of the mapping spreadsheet by partners but are not required. Examples of optional columns from partner mapping work to date include:

Source-Hierarchy          (hierarchy or category the source concept belongs to)

Source-ScopeNote          (scope note or definition of concept - this may be particularly useful)

Source-En                 (an English language translation, or other languages if desired)

Comment                   (if desired, any comment on this mapping, e.g. a rationale)

Other-Target-prefLabel    (if useful to partner to also include mappings to other thesauri)

Other-Target-URI          (if useful to partner to also include mappings to other thesauri)

Mapping guidelines

The aim of the mapping exercise is to identify subject mappings to AAT for concepts that are likely to be useful to assist browsing and search of the portal (time and space are being handled separately).

If any existing mappings to AAT are known they may be useful to build on. The AAT can also be searched and browsed manually via the Getty website: http://www.getty.edu/research/tools/vocabularies/aat/

Probably the AAT Objects hierarchy is the most relevant hierarchy.

If resources are limited, a sensible strategy is to start with the most useful concepts in the first instance for the datasets / reports partners have provided to the Registry. These would probably include the top (say 2) levels of relevant partner thesauri (e.g. Objects and Monument types) and also concepts used to index the data provided to the Registry. It will also include controlled keyword lists used by partners to index the data.

Match types for the mapping

If the mapping is approximate then skos:closeMatch is probably the best match type. If it is a very good match then skos:exactMatch is appropriate. In general, do not make use of skos:relatedMatch for ARIADNE purposes (unless perhaps as an additional mapping for a given concept). The idea is to make the most appropriate match for each concept in the Partner vocabulary.

Usually you will just make one match (the best one) to AAT for any given concept - there is usually no need to express multiple relationships to AAT concepts as this is provided gratis via the AAT’s semantic structure. Thus if you make a match from a given partner concept to an AAT concept then there is no need to also make mappings to narrower AAT concepts for that given partner concept. The only exception is if the partner concept has two genuinely quite different expressions in the AAT (that are not immediate parent or child concepts). In this case one or two additional mappings are possible but that should be very much the exception. Normally you would work through a hierarchy making a mapping for each concept, giving complete coverage of that hierarchy.

If a partner concept is much more specific than any AAT concept then you can make a skos:broadMatch to the AAT concept. This is useful for cases when a partner vocabulary has detailed archaeological concepts. It is not expected that you would need to make much use of skos:narrowMatch for ARIADNE vocabularies.

Matches should be made to AAT concepts rather than guide-terms (inside <>). If an AAT guide term appears as a match in the tool, consider a narrower or broader concept in the AAT. For example, instead of mapping to <containers by form>, it is better to map to containers (receptacles) even if the mapping relationship needs to be skos:broadMatch.

Where top level partner concepts are too high level or general (e.g. perhaps ‘society’, ‘religion’) to map easily then it is probably best to consider the next level down. If any partner concepts prove particularly problematic then just set them aside and discuss with USW later.

Optional matching tool

When vocabularies are already available as Linked Data via the Registry or via HeritageData then the USW Vocabulary Matching Tool may be helpful.

http://heritagedata.org/vocabularyMatchingTool/

When using the Vocabulary Matching Tool, remember to Save the data before ending a session (data is saved in JSON format). This allows you to subsequently Load the JSON file into the tool and make revisions or further mappings. When sending the final results of the matching exercise, please send us the JSON format file.

