
Publishing the British National Bibliography

as Linked Open Data

Corine Deliot

Metadata Standards Analyst

British Library

CIG Event

Birmingham, 25 November 2013

© The British Library Board 2013

www.bl.uk 2

Overview

• Motivations and approach

• The modelling process and the data model

• Technical process: from MARC 21 to RDF

• Linking to external datasets

• Outcomes – datasets/platform/access

• Plans for future developments

• Use of the BNB data

• Benefits

• Challenges


Motivations

• Publishing our data for others to re-use

• Looking beyond library audiences

• Taking part in the Linked Data conversation


How?

• Pragmatic, bottom-up approach

• Using existing staff

• Building on existing skills

• Using existing tools as much as possible


Why BNB?

• General bibliography - not a unique institutional catalogue

• Consistent format - over 60 years

• Size & range of content - 3 million records on all subjects in many languages

• Control of metadata – publishable as CC0.

Image © Waldir / Wikimedia Commons / CC BY-SA 3.0. Usage terms: http://creativecommons.org/licenses/by-sa/3.0/


The modelling process (I)

• Identify our objects of interest, i.e. what the MARC record says about “things in the world”

e.g. bibliographic resources, people, organizations, places, subjects, etc.

• Assign URIs to identify these objects of interest


URIs: Things to think about

• Create our own URIs or use existing ones?

e.g. http://viaf.org/viaf/96994048

http://id.loc.gov/authorities/names/n78095332

• Create opaque or transparent URIs?

e.g. http://viaf.org/viaf/96994048 (opaque) or

http://dbpedia.org/resource/William_Shakespeare (transparent)

• What pattern? URI pattern guidance from the UK Cabinet Office

“Designing URI Sets for the UK Public Sector”

• Create valid, i.e. syntax-conformant, URIs


URI patterns

• http://bnb.data.bl.uk/id/resource/{control-number}

• http://bnb.data.bl.uk/id/resource/{BNB-number}

• http://bnb.data.bl.uk/id/person/{person-name}

• http://bnb.data.bl.uk/id/organization/{organization-name}

• http://bnb.data.bl.uk/id/concept/lcsh/{topic}

• http://bnb.data.bl.uk/id/concept/ddc/{edition-number}/{dewey-number}
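The patterns above can be read as simple string templates. The following Python sketch fills them in. The helper names `mint_uri` and `mint_ddc_uri` are hypothetical, and the percent-encoding of values is an assumption made for illustration: the slides do not say how name strings are normalised before they become URI path segments.

```python
from urllib.parse import quote

BASE = "http://bnb.data.bl.uk/id"

# Templates follow the slide; "{value}" stands in for the slide's placeholders.
PATTERNS = {
    "resource": BASE + "/resource/{value}",
    "person": BASE + "/person/{value}",
    "organization": BASE + "/organization/{value}",
    "lcsh": BASE + "/concept/lcsh/{value}",
}


def mint_uri(kind: str, value: str) -> str:
    """Fill a pattern, percent-encoding the value (an assumed normalisation)."""
    # safe="" also encodes "/" so a value can never add extra path segments.
    return PATTERNS[kind].format(value=quote(value, safe=""))


def mint_ddc_uri(edition: str, number: str) -> str:
    """DDC concept URIs carry both the edition number and the Dewey number."""
    return f"{BASE}/concept/ddc/{quote(edition, safe='')}/{quote(number, safe='')}"
```

Whatever normalisation is really used, the point of the sketch is that it must be deterministic: the same input string always has to yield the same URI.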


URI patterns

• http://bnb.data.bl.uk/id/resource/008043929

• http://bnb.data.bl.uk/doc/resource/008043929

• http://bnb.data.bl.uk/doc/resource/008043929.rdf

• http://bnb.data.bl.uk/doc/resource/008043929.ttl

• http://bnb.data.bl.uk/doc/resource/008043929.json

• http://bnb.data.bl.uk/doc/resource/008043929.html
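The examples above pair an identifier URI under /id/ with a document URL under /doc/, plus one URL per serialization. A minimal Python sketch of that mapping follows; the function name `document_urls` is hypothetical, and the sketch only manipulates strings. It does not show how the platform redirects or negotiates content, which the slides do not describe.

```python
FORMATS = ("rdf", "ttl", "json", "html")  # suffixes shown on the slide


def document_urls(id_uri: str) -> dict:
    """Map an /id/ URI to its /doc/ URL and the format-specific variants."""
    if "/id/" not in id_uri:
        raise ValueError("expected an identifier URI containing '/id/'")
    # Replace only the first occurrence so the rest of the path is untouched.
    doc = id_uri.replace("/id/", "/doc/", 1)
    urls = {"doc": doc}
    for suffix in FORMATS:
        urls[suffix] = f"{doc}.{suffix}"
    return urls
```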


The modelling process (II)

• Describe these objects of interest, i.e. use classes

• and how they relate to each other, i.e. use properties

Use classes and properties from existing RDF vocabularies

Define our own classes and properties when required; documented in the British Library Terms RDF schema


RDF Vocabularies

• Bibliographic Ontology

• Bio: a Vocabulary for Biographical Information

• British Library Terms

• Dublin Core

• Event Ontology

• FOAF: Friend of a Friend

• ISBD

• Org: an Organisation Ontology

• OWL

• RDA

• RDF

• RDF Schema

• SKOS

• WGS84 Geo Positioning


RDF Vocabularies

• Bibliographic Resource: Dublin Core; Bibliographic Ontology; ISBD; British Library Terms

• Event: Event Ontology; British Library Terms

• Person/Organization: FOAF: Friend of a Friend; Bio: a Vocabulary for Biographical Information; Org: an Organisation Ontology; RDA

• Place: WGS84 Geo Positioning

• Concept: SKOS; British Library Terms

• General: RDF; RDF Schema; OWL


The British Library Terms RDF Schema

@prefix blt:<http://www.bl.uk/schemas/bibliographic/blterms#> .

• Existing property not quite right (e.g. not granular enough)

e.g. dcterms:identifier vs blt:bnb


The British Library Terms RDF Schema

@prefix blt:<http://www.bl.uk/schemas/bibliographic/blterms#> .

Property or class required by specific feature of the model

e.g. blt:publication and blt:PublicationEvent (rdfs:subClassOf event:Event)


The British Library Terms RDF Schema

@prefix blt:<http://www.bl.uk/schemas/bibliographic/blterms#> .

For pragmatic reasons, e.g. facilitate searching and navigating through the graph

e.g. blt:TopicLCSH and blt:TopicDDC

e.g. blt:hasCreated owl:inverseOf dcterms:creator
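To illustrate why an inverse property helps navigation, the Python sketch below materialises a blt:hasCreated triple for every dcterms:creator triple, so a person's works can be read directly from the person node. This is only one way to use the owl:inverseOf declaration; whether the BNB data is produced this way is not stated in the slides, and the function name is hypothetical.

```python
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"
BLT_HAS_CREATED = "http://www.bl.uk/schemas/bibliographic/blterms#hasCreated"


def add_inverse_triples(triples):
    """Return the input triples plus one blt:hasCreated triple per dcterms:creator."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate == DCTERMS_CREATOR:
            # Swap subject and object to express the inverse relationship.
            result.add((obj, BLT_HAS_CREATED, subject))
    return result
```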


The BNB data model - Books

http://www.bl.uk/bibliographic/pdfs/bldatamodelbook.pdf


Data Model Features (I): the Bibliographic Resource


Data Model Features (II): Publication as an event

Usual approach

@prefix dc: <http://purl.org/dc/elements/1.1/> .

@prefix dcterms: <http://purl.org/dc/terms/> .

<BibResource> dc:publisher "Publisher" ;

    dcterms:issued "Date" ;

    ?:placeOfPublication "Place" .

Event-based approach

@prefix blt: <http://www.bl.uk/schemas/bibliographic/blterms#> .

@prefix event: <http://purl.org/NET/c4dm/event.owl#> .

<BibResource> blt:publication <PublicationEvent> .

<PublicationEvent> event:place <Place> ;

    event:agent <Publisher> ;

    event:time <Year> .
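The event-based pattern on this slide can be sketched in Python as four triples hanging off one publication event node. The property URIs come from the blt: and event: prefixes on the slide; the function names and any URIs passed in are hypothetical, and the sketch assumes every value is a URI (no literals).

```python
BLT = "http://www.bl.uk/schemas/bibliographic/blterms#"
EVENT = "http://purl.org/NET/c4dm/event.owl#"


def publication_event_triples(resource, event, place, publisher, year):
    """Return the four triples of the event-based pattern as (s, p, o) tuples."""
    return [
        (resource, BLT + "publication", event),
        (event, EVENT + "place", place),
        (event, EVENT + "agent", publisher),
        (event, EVENT + "time", year),
    ]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

Compared with the usual approach, place, publisher and date are resources attached to the event rather than strings attached to the book, which is what allows them to be linked to external datasets.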


Data model features (III)

• Birth and death are modelled as biographical events

• Extensive use of foaf:focus to relate “things in the world” (e.g. people, organizations, places) to their SKOS concepts.

e.g. “London”, the capital of England and the UK as a single “thing in the world” may be the “focus” of multiple concepts belonging to different concept schemes, e.g. thesauri (LCSH, Rameau, etc.)

<Thing-as-Concept> foaf:focus <Thing in the World> .

http://efoundations.typepad.com/efoundations/2011/09/things-their-conceptualisations-skos-foaffocus-modelling-choices.html by Pete Johnston
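As a rough illustration of the foaf:focus pattern, the Python sketch below indexes concepts by the thing they focus on, so that one place can be found from several concept schemes. The function name and any URIs used with it are hypothetical; only the foaf:focus property URI is real.

```python
from collections import defaultdict

FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"


def concepts_by_focus(triples):
    """Group concept URIs by the 'thing in the world' they have as focus."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == FOAF_FOCUS:
            # subject is the concept, obj is the thing it conceptualises.
            index[obj].add(subject)
    return dict(index)
```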


MARC to RDF Conversion Workflow

Workflow diagram

• Input: full BNB MARC 21 file

• MARC pre-processing: select records; convert to pre-composed UTF-8; normalise for improved matching & transforms; create BL URIs and add external URIs by matching

• Transform to RDF/XML using XSLT (output: BNB RDF/XML file)

• Generate RDF triple dump

• Load to Linked Data Platform

• Load to BL Downloads page

Process

• Selection

• Character set conversion

• Pre-processing

• URI generation

• Data transformation

• Create & load triples

• Produce VoID descriptions

Tools

• Catalogue Bridge Utilities

• MARC Global/MARC Report http://www.marcofquality.com/

• Jena Eyeball http://jena.sourceforge.net/Eyeball/
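The character set conversion and normalisation steps can be illustrated with Python's standard library. This is a sketch under assumptions: it reads "pre-composed UTF-8" as Unicode NFC, and the `match_key` function is a hypothetical example of normalising for matching, not the rules applied by the tools listed above.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Normalise to NFC so a base letter plus combining mark becomes one code point."""
    return unicodedata.normalize("NFC", text)


def match_key(text: str) -> str:
    """A hypothetical matching key: NFC, collapsed whitespace, case-folded."""
    return " ".join(to_precomposed(text).split()).casefold()
```

Without this step, "Brontë" typed with a combining diaeresis and "Brontë" typed as a single character would compare as different strings and could produce different URIs.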


Linking to external sources (I)

To give our data broader context we linked to:

• General resources:

  • GeoNames

  • Lexvo

  • RDF Book Mashup

• Library resources:

  • LCSH

  • VIAF

  • Dewey.info

  • MARC language and country codes


Linking to external sources (II)

Techniques included:

• Automatic generation from record data

• Auto text match with linked data dumps

• Crosswalk matching for coded data
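Crosswalk matching for coded data amounts to a lookup table from a code in the record to an external URI. The Python sketch below shows the idea for language codes. The three-entry table is a hypothetical excerpt written for illustration, and the Lexvo URI form used in it is an assumption rather than something given in the slides.

```python
# Hypothetical excerpt of a crosswalk from MARC language codes to Lexvo URIs.
MARC_LANG_TO_LEXVO = {
    "eng": "http://lexvo.org/id/iso639-3/eng",
    "fre": "http://lexvo.org/id/iso639-3/fra",
    "ger": "http://lexvo.org/id/iso639-3/deu",
}


def link_language(marc_code: str):
    """Return the external URI for a MARC language code, or None if unmapped."""
    return MARC_LANG_TO_LEXVO.get(marc_code.strip().lower())
```

Returning None for unmapped codes, rather than guessing, keeps bad links out of the published data and leaves a clear list of codes still to be mapped.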

Image © Silverspoon / Wikimedia Commons / CC BY-SA 3.0. Usage terms: http://creativecommons.org/licenses/by-sa/3.0/


Outcomes

• Two datasets – Books and Serials - and their VoID descriptions, accessible at:

• BNB Linked data platform: http://bnb.data.bl.uk

• SPARQL endpoint: http://bnb.data.bl.uk/sparql

• SPARQL editor: http://bnb.data.bl.uk/flint

• Bulk downloads: http://www.bl.uk/bibliographic/download.html

  Updated monthly

  Serializations available: RDF/XML, N-Triples
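A query against the SPARQL endpoint listed above is sent as an HTTP request with the query text in the `query` parameter, as defined by the SPARQL Protocol. The Python sketch below only builds such a URL and makes no network call. The sample query and its use of dcterms:title are assumptions for illustration, and the endpoint may no longer be available at the address given in 2013.

```python
from urllib.parse import urlencode

ENDPOINT = "http://bnb.data.bl.uk/sparql"  # endpoint URL as given on the slide

# Illustrative query; assumes titles are published with dcterms:title.
QUERY = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?book ?title WHERE {
  ?book dcterms:title ?title .
} LIMIT 5
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL with the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
```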

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/” Usage terms: http://creativecommons.org/licenses/by-sa/3.0/

[Screenshot: BNB Linked Data platform, http://bnb.data.bl.uk]

[Screenshot: SPARQL editor, http://bnb.data.bl.uk/flint]

[Screenshot: Bulk downloads page, http://www.bl.uk/bibliographic/download.html]

Platform change

• 2011 – initial Talis platform

• 2013 – data migration to TSO platform http://www.tso.co.uk/our-expertise/technology/openup-platform

  • Tendering process

  • Migration of data and services over a couple of months


Plans for Future Developments

• Refine and extend the model

• Investigate FRBR-ization

• Link to other external sources

  • GeoNames at city level

  • ISNI, LC/NACO, DBpedia

  • DNB bibliographic resources

• Expand scope beyond current BNB

• Improve developer support


Use of the BNB data

• Statistics

  e.g. number of hits on the SPARQL endpoint

  e.g. number of downloads on the BL webpage

• BNB data used in pilot projects

  e.g. Linked Open BNB data used as test data for a semantic search demonstrator

• Anecdotal evidence

• Use is difficult to assess; part and parcel of the data being open and available for all to use.


Benefits of Linked Open Data

• We have learnt a lot about the practical aspects of working with linked data.

• The data model got some attention.

  • Re-used by the Danish Bibliographic Centre (DBC)

  • Stanford Linked Data Workshop Technology Plan:

    “…ensure resulting model retains the BL’s high-level focus and its web derived, transparent structure for representing facts about people, organizations, places, events, and topics”

• LOD raised the Library’s profile internally and externally

• LOD helped us focus our legacy data enhancement activities


Challenges

Converting MARC data into RDF!

• Publication event approach: transforming transcribed text into data

• URI creation from strings may result in duplication; changes to those strings over time may also produce duplication.

• Legacy data issues

  e.g. inconsistency of the data

  e.g. cataloguers using inadequate input tools for diacritics

• This is (relatively) new, nobody has all the answers
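The duplication risk named above can be made concrete with a small Python sketch. It assumes, hypothetically, that a person URI is minted by percent-encoding the name heading as it appears in the record; the person and both headings are invented for the example.

```python
from urllib.parse import quote


def uri_from_name(name: str) -> str:
    """Hypothetical string-based minting: percent-encode the heading as given."""
    return "http://bnb.data.bl.uk/id/person/" + quote(name, safe="")


# The same (invented) person before and after a death date is added to the heading.
variants = ["Smith, John, 1950-", "Smith, John, 1950-2010"]
uris = {uri_from_name(v) for v in variants}
```

Here `uris` holds two distinct URIs for one person, which is the kind of duplicate that has to be detected and reconciled when headings change between monthly updates.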


For further information

http://bnb.data.bl.uk

http://www.bl.uk/bibliographic/datafree.html

Thank you.

Questions?

[email protected]

http://twitter.com/#!/BLMetadata

