Publishing the British National Bibliography as Linked Open Data
Corine Deliot, Metadata Standards Analyst
British Library
CIGS Linked Open Data Seminar, Edinburgh, 18 November 2013
www.bl.uk 2
Presentation overview
• Motivations and approach
• The modelling process and the data model
• Technical process: from MARC 21 to RDF
• Linking to external datasets
• Outcomes – datasets/platform/access
• Plans for future developments
Motivations
• Publishing our data for others to re-use
• Looking beyond library audiences
• Taking part in the Linked Data conversation
How?
• Pragmatic, bottom-up approach
• Using existing staff
• Building on existing skills
• Using existing tools as much as possible
Why BNB?
• General bibliography - not a unique institutional catalogue
• Consistent format - over 60 years
• Size & range of content - 3 million records on all subjects in many languages
• Control of metadata – publishable as CC0.
The modelling process (I)
• Identify our objects of interest, i.e. what does the MARC record say about “things in the world”?
e.g. bibliographic resources, people, organizations, places, subjects, etc.
• Assign URIs to identify these objects of interest, following URI pattern guidance from the UK Cabinet Office:
“Designing URI Sets for the UK Public Sector”
URI patterns
• http://bnb.data.bl.uk/id/resource/{control-number}
• http://bnb.data.bl.uk/id/resource/{BNB-number}
• http://bnb.data.bl.uk/id/person/{person-name}
• http://bnb.data.bl.uk/id/organization/{organization-name}
• http://bnb.data.bl.uk/id/concept/lcsh/{topic}
• http://bnb.data.bl.uk/id/concept/ddc/{edition-number}/{dewey-number}
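A minimal sketch of how URIs could be minted from these patterns. The path templates come from the slide; the function name, the placeholder names, the sample values and the use of percent-encoding are assumptions for illustration, since the slides do not say how names and topics are normalised before they go into a URI.

```python
from urllib.parse import quote

BASE = "http://bnb.data.bl.uk/id"

# Path templates taken from the slide; the placeholder names are illustrative.
PATTERNS = {
    "resource": "/resource/{identifier}",
    "person": "/person/{name}",
    "organization": "/organization/{name}",
    "lcsh": "/concept/lcsh/{topic}",
    "ddc": "/concept/ddc/{edition}/{number}",
}


def mint_uri(kind, **parts):
    """Fill one of the URI patterns, percent-encoding each value.

    Percent-encoding is an assumption; the real BNB normalisation
    rules are not given in the presentation.
    """
    safe = {key: quote(str(value), safe="") for key, value in parts.items()}
    return BASE + PATTERNS[kind].format(**safe)


# Hypothetical sample values, not real BNB identifiers.
print(mint_uri("resource", identifier="012345678"))
print(mint_uri("ddc", edition="e22", number="823.914"))
```

Keeping the templates in one table makes the URI scheme easy to review against the Cabinet Office guidance.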
The modelling process (II)
Describe these objects of interest and how they relate to each other.
Use classes and properties from existing RDF vocabularies
Define our own classes and properties when required; documented in the British Library Terms RDF schema
RDF Vocabularies
• Bibliographic Ontology
• Bio: a Vocabulary for Biographical Information
• British Library Terms
• Dublin Core
• Event Ontology
• FOAF: Friend of a Friend
• ISBD
• Org: an Organisation Ontology
• OWL
• RDA
• RDF
• RDF Schema
• SKOS
• WGS84 Geo Positioning
The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
• Existing property not quite right (e.g. not granular enough)
e.g. dcterms:identifier vs blt:bnb
The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
Property or class required by specific feature of the model
e.g. blt:publication and blt:PublicationEvent (rdfs:subClassOf event:Event)
The British Library Terms RDF Schema
blt=“http://www.bl.uk/schemas/bibliographic/blterms#”
For pragmatic reasons, e.g. facilitate searching, inferencing and navigating through the graph
e.g. blt:TopicLCSH and blt:TopicDDC
e.g. blt:hasCreated owl:inverseOf dcterms:creator
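To illustrate why an inverse property helps navigation, here is a small sketch that materialises blt:hasCreated triples from dcterms:creator triples. It keeps triples as plain tuples of strings and uses no RDF library; the function name and the example URIs are hypothetical, and this is not a description of how the BNB platform performs inferencing.

```python
# Inverse pair stated on the slide: blt:hasCreated owl:inverseOf dcterms:creator.
INVERSES = {"dcterms:creator": "blt:hasCreated"}


def materialise_inverses(triples):
    """Return the input triples plus one inverse triple per matching predicate."""
    result = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSES.get(predicate)
        if inverse is not None:
            # Swap subject and object for the inverse property.
            result.add((obj, inverse, subject))
    return result


# Hypothetical URIs, for illustration only.
book = "<http://bnb.data.bl.uk/id/resource/EXAMPLE>"
author = "<http://bnb.data.bl.uk/id/person/EXAMPLE>"

graph = materialise_inverses({(book, "dcterms:creator", author)})
for triple in sorted(graph):
    print(" ".join(triple), ".")
```

With the inverse triple present, a query can start from the person and reach their works directly.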
The BNB data model - Books
http://www.bl.uk/bibliographic/pdfs/bldatamodelbook.pdf
Data Model Features (I): the Bibliographic Resource
Data Model Features (II): Publication as an event
Usual approach
• <BibResource> dcterms:publisher <Publisher> .
<BibResource> dcterms:issued “Date” .
<BibResource> ? “Place” .
Or
<BibResource> ? <Place> .
Event-based approach
• <BibResource> blt:publication <PublicationEvent> .
<PublicationEvent> event:place <Place> .
<PublicationEvent> event:agent <Publisher> .
<PublicationEvent> event:time <Year> .
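The event-based pattern on this slide can be sketched as a function that emits the triples for one publication event. Triples are plain tuples of strings; the function names and all sample identifiers are hypothetical, and the rdf:type statement is an assumption based on the blt:PublicationEvent class mentioned earlier.

```python
def publication_triples(resource, event, place, publisher, year):
    """Build the event-based triples for one publication statement."""
    return [
        (resource, "blt:publication", event),
        (event, "rdf:type", "blt:PublicationEvent"),  # assumed typing triple
        (event, "event:place", place),
        (event, "event:agent", publisher),
        (event, "event:time", year),
    ]


def to_lines(triples):
    """Render triples in a Turtle-like form, one statement per line."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


# Placeholder names in the style of the slide, not real BNB URIs.
lines = to_lines(
    publication_triples(
        "<BibResource>", "<PublicationEvent>", "<Place>", "<Publisher>", "<Year>"
    )
)
print("\n".join(lines))
```

The place, publisher and year all hang off the event node, so there is no need for the undefined place property that the usual approach leaves open.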
Data model features (III)
• Birth and death are modelled as biographical events
• Extensive use of foaf:focus to relate “things in the world” (e.g. people, organizations, places) to their SKOS concepts.
e.g. “Paris”, the capital of France as a single “thing in the world” may be the “focus” of multiple concepts belonging to different concept schemes, e.g. thesauri (LCSH, Rameau, etc.)
<Concept> foaf:focus <Thing in the World>
http://efoundations.typepad.com/efoundations/2011/09/things-their-conceptualisations-skos-foaffocus-modelling-choices.html by Pete Johnston
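As a rough illustration of the foaf:focus pattern, the sketch below groups concepts by the thing they are about, so that one place can be seen as the focus of concepts from several schemes. The identifiers are placeholders rather than real LCSH, Rameau or BNB URIs, and the function name is hypothetical.

```python
from collections import defaultdict


def concepts_by_focus(triples):
    """Index SKOS concepts by the real-world thing they have as foaf:focus."""
    index = defaultdict(set)
    for concept, predicate, thing in triples:
        if predicate == "foaf:focus":
            index[thing].add(concept)
    return dict(index)


# Placeholder identifiers, for illustration only.
triples = [
    ("<lcsh/Paris>", "foaf:focus", "<place/Paris>"),
    ("<rameau/Paris>", "foaf:focus", "<place/Paris>"),
    ("<lcsh/Paris>", "rdf:type", "skos:Concept"),
]

index = concepts_by_focus(triples)
print(sorted(index["<place/Paris>"]))
```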
MARC to RDF Conversion Workflow
Workflow
• Full BNB MARC21 file
• MARC pre-processing
  • Select records
  • Convert to pre-composed UTF-8
  • Normalise for improved matching & transforms
  • Create BL URIs and add external URIs by matching
• Transform to RDF/XML using XSLT
• BNB RDF/XML file
• Load to Linked Data Platform
• Generate RDF triple dump
• Load to BL Downloads page
Process
• Selection
• Character set conversion
• Pre-processing
• URI generation
• Data transformation
• Create & load triples
Tools
• Catalogue Bridge Utilities
• MARC Global/MARC Report http://www.marcofquality.com/
• Jena Eyeball http://jena.sourceforge.net/Eyeball/
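The character set and normalisation steps can be sketched with the Python standard library. Pre-composed Unicode corresponds to normalisation form NFC; the matching key (whitespace collapsing and case folding) is an assumption, since the presentation does not list the actual normalisation rules, and the BL workflow used the MARC tools named above rather than this code.

```python
import unicodedata


def to_precomposed(text):
    """Convert decomposed character sequences to pre-composed form (NFC)."""
    return unicodedata.normalize("NFC", text)


def match_key(text):
    """Build a comparison key for text matching.

    Assumed rules: NFC, collapse runs of whitespace, case-fold.
    """
    return " ".join(to_precomposed(text).split()).casefold()


# "e" followed by a combining diaeresis becomes the single character U+00EB.
decomposed = "Bronte\u0308,   Charlotte "
print(to_precomposed(decomposed))
print(match_key(decomposed))
```

Normalising both sides to the same form before comparison is what makes the later text matching against external datasets reliable.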
Linking to external sources (I)
To give our data broader context we linked to:
• General resources:
  • GeoNames
  • Lexvo
  • RDF Book Mashup
• Library resources:
  • LCSH
  • VIAF
  • Dewey.info
  • MARC language and country codes
Linking to external sources (II)
Techniques included:
• Automatic generation from record data
• Auto text match with linked data dumps
• Crosswalk matching for coded data
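Crosswalk matching for coded data might look like the sketch below, which maps MARC language codes to Lexvo URIs through a lookup table. The three-entry table is a tiny illustrative sample, and the choice of ISO 639-3 based Lexvo URIs as the target is an assumption; the presentation does not describe the crosswalk the BL actually used.

```python
# Sample crosswalk from MARC language codes to ISO 639-3 codes.
# MARC uses bibliographic codes such as "fre" and "ger".
MARC_TO_ISO639_3 = {
    "eng": "eng",
    "fre": "fra",
    "ger": "deu",
}

LEXVO_BASE = "http://lexvo.org/id/iso639-3/"


def language_uri(marc_code):
    """Return the Lexvo URI for a MARC language code, or None if unmapped."""
    iso_code = MARC_TO_ISO639_3.get(marc_code.strip().lower())
    return LEXVO_BASE + iso_code if iso_code else None


print(language_uri("fre"))
print(language_uri("zzz"))  # unmapped codes yield None
```

Returning None for unmapped codes lets the conversion skip the link instead of emitting a wrong one.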
Outcomes
• Two datasets – Books and Serials, accessible at:
• BNB Linked data platform: http://bnb.data.bl.uk
• SPARQL endpoint: http://bnb.data.bl.uk/sparql
• SPARQL editor: http://bnb.data.bl.uk/flint
• Bulk downloads: http://www.bl.uk/bibliographic/download.html
• Updated monthly
• Serializations available: RDF/XML, N-Triples
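As a sketch of how a client could address the SPARQL endpoint, the code below builds a GET request URL using the standard `query` parameter of the SPARQL protocol. It only constructs the URL and sends nothing. The query itself is an assumption: it supposes that titles are recorded with dcterms:title, which should be checked against the published data model.

```python
from urllib.parse import urlencode

ENDPOINT = "http://bnb.data.bl.uk/sparql"

# Assumed query shape: five resources with their titles.
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?book ?title
WHERE { ?book dcterms:title ?title . }
LIMIT 5
"""


def query_url(query, endpoint=ENDPOINT):
    """Build a SPARQL protocol GET URL for the given query string."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = query_url(QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, or the same query can be pasted into the SPARQL editor.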
Platform change
• 2011 - initial Talis platform
• 2013 – data migration to TSO platform http://www.tso.co.uk/our-expertise/technology/openup-platform
  • Tendering process
  • Migration of data and services over a couple of months
Plans for Future Developments
• Refine and extend the model
• Investigate FRBR-ization
• Link to other external sources
  • GeoNames at city level
  • ISNI, LC/NACO, DBpedia
  • DNB bibliographic resources
• Expand scope beyond current BNB
• Improve developer support
For further information: http://www.bl.uk/bibliographic/datafree.html
Thank you.
Questions?
http://twitter.com/#!/BLMetadata