Part of the Williams Lea Group
Moving linked data into production - and reaping the benefits
Richard Goodwin
SemTechBiz September 2012
What does TSO do?
Semantic discoverability solutions
Linked Open Data
Breaking data out of silos
Dedicated semantic team with a variety of experience & backgrounds
Why Semantic discoverability?
Aggregate relevant data
Extract important information from it
Enrich the content
Add value to the data
Allow linking, re-use and repurposing
Organograms
Good use case for Linked Open Data
Making government information accessible and open
Automated dissemination process
Government Transparency initiative
Announced by No.10 in May 2010
RDF and visualisation
Live in June 2011
What was involved?
Using Excel source data
– convert from CSV to RDF using PHP and XL Wrap
Preview and publish through linked data API
Creating custom organogram Visualisation
Supporting distributed publication by owner organisations
Achievements
Publishing in RDF right across government
– now = 200
– soon = 400+
New data published every 6 months
Humans use Visualisation for information about government
Machines can pull regularly updated info to create other resources
Challenges
WCM unable to handle RDF documents
Officials struggling to get sign-off from ministers
Upload / validation usability issues
Minimise errors that departments are expected to remedy
Reduce bootstrapping
Exploit value in data set - see changes over time
Challenges – improve the user experience
CSV and validation usability issues
– Apparently inconsistent validation
– Silent errors
– Department uploading ≠ department featured
– Departments see all files clearly marked as senior CSV, junior CSV and RDF
Sign-off from ministers
– Senior management, short of time
Challenges – improve the user experience
Organisational quirks
– e.g. some Ministry of Defence (MoD) civil servants report to minister, others of same grade report elsewhere
Grades need to be more flexible
– e.g. ‘equivalent to grade X’ or accept those parts which are correct and flag the others
Duplicate uploads need flagging
Improve the speed of preview function
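One way duplicate uploads could be flagged is by comparing a content hash of each uploaded file against earlier uploads. This is a minimal sketch under that assumption, not the mechanism used in the project; the `UploadLog` class and the file names are hypothetical.

```python
import hashlib
from typing import Dict, Optional


class UploadLog:
    """Hypothetical in-memory log of uploads, keyed by content hash."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}  # SHA-256 digest -> first filename seen

    def check(self, filename: str, content: bytes) -> Optional[str]:
        """Return the name of an earlier identical upload, or None if new."""
        digest = hashlib.sha256(content).hexdigest()
        earlier = self._seen.get(digest)
        if earlier is None:
            # First time this exact content has been uploaded: remember it.
            self._seen[digest] = filename
        return earlier
```

A caller would warn the uploader whenever `check` returns a filename rather than `None`. Hashing the raw bytes only catches exact duplicates; a re-saved spreadsheet with identical data would need comparison after parsing.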
Solutions – WCM and reliability
Replaced XL Wrap with CSV2TTL
– a Python-based implementation of CSV to RDF conversion
This supports efficient and reliable publishing of RDF triples from CSV
Early validation takes place in the spreadsheet template
Data owners upload the spreadsheet to the preview server for sign-off
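To illustrate the general idea of turning CSV rows into RDF triples, here is a minimal sketch that emits Turtle using only the Python standard library. It is not the CSV2TTL code: the column names (`post_id`, `name`, `reports_to`), the `ex:` namespace and the `ex:reportsTo` property are all assumptions made for the example, and `post_id` values are assumed to be plain digits.

```python
import csv
import io

# Hypothetical namespace; foaf:name is the standard FOAF name property.
PREFIXES = (
    "@prefix ex: <http://example.org/organogram/> .\n"
    "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"
)


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle string."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def csv_to_turtle(csv_text: str) -> str:
    """Convert rows with post_id, name and reports_to columns into Turtle."""
    lines = [PREFIXES]
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "ex:post-" + row["post_id"].strip()
        name = escape_literal(row["name"].strip())
        lines.append(f'{subject} foaf:name "{name}" .')
        manager = row["reports_to"].strip()
        if manager:  # top-level posts have no manager, so emit no triple
            lines.append(f"{subject} ex:reportsTo ex:post-{manager} .")
    return "\n".join(lines) + "\n"
```

Calling `csv_to_turtle` on a two-row file yields one `foaf:name` triple per post and one `ex:reportsTo` triple for each post that names a manager.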
Solutions – Usability
Main constraint: the templates are used from within the Government Secure Intranet, so VBA code is inappropriate
Strip out lengthy formulae (hard to maintain)
– Net result: no change to file size despite extra features
Provide per-cell rather than per-row feedback to users
Hide extraneous cells and improve validation rules
Use single-cell lookup point for web application to ascertain validity
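The per-cell feedback and single validity flag described above can be sketched as follows. The real checks live in spreadsheet formulae; this Python version only illustrates the shape of the idea, and the column names, the allowed grade values and the rules themselves are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical set of acceptable grades, for illustration only.
ALLOWED_GRADES = {"A", "B", "C"}

# Maps (row number, column name) to an error message for that one cell.
CellErrors = Dict[Tuple[int, str], str]


def validate_rows(rows: List[Dict[str, str]]) -> CellErrors:
    """Return one message per failing cell rather than one per row."""
    errors: CellErrors = {}
    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        post_id = row.get("post_id", "").strip()
        if not post_id:
            errors[(number, "post_id")] = "missing post id"
        elif post_id in seen_ids:
            errors[(number, "post_id")] = "duplicate post id"
        seen_ids.add(post_id)
        if row.get("grade", "").strip() not in ALLOWED_GRADES:
            errors[(number, "grade")] = "unrecognised grade"
    return errors


def is_valid(errors: CellErrors) -> bool:
    """Single summary flag, analogous to the single-cell lookup point."""
    return not errors
```

Keying errors by cell lets the uploader see exactly which value to fix, while the web application only needs to read the one summary flag.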
Linked data - increasing value over time
Enables users to view the change in the shape of government over time
Use a slider on the visualisation to show changes
Solution: all datasets from the same iteration go into a single Knowledge Base (KB); each iteration has its own KB
Data registry maintains the mapping between <iteration>, <department> and <knowledge base>, <graph>
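A data registry of this kind can be pictured as a lookup table from an (iteration, department) pair to a (knowledge base, graph) pair. The sketch below assumes one KB per iteration and one named graph per department; the class, the naming scheme, the example.org URIs and the use of ISO dates as iteration labels are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Location(NamedTuple):
    knowledge_base: str
    graph: str


class DataRegistry:
    """Hypothetical registry: (iteration, department) -> (KB, graph)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Location] = {}

    def register(self, iteration: str, department: str) -> Location:
        # Assumed naming scheme: the KB depends only on the iteration,
        # the graph on both the iteration and the department.
        location = Location(
            knowledge_base=f"kb-{iteration}",
            graph=f"http://example.org/graph/{iteration}/{department}",
        )
        self._entries[(iteration, department)] = location
        return location

    def lookup(self, iteration: str, department: str) -> Location:
        """Raises KeyError if the pair was never registered."""
        return self._entries[(iteration, department)]

    def iterations_for(self, department: str) -> List[str]:
        """Sorted iteration labels for a department, e.g. slider positions."""
        return sorted(i for (i, d) in self._entries if d == department)
```

With ISO-date iteration labels, `iterations_for` returns them in chronological order, which is the sequence a time slider would step through.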
OpenUp® Platform
Harvest: aggregation of data from web, APIs, databases and files
Enrich: extracting useful data and converting to re-usable formats
Store: highly scalable database storage and query engine
Publish: websites and APIs to reach data users
Automated processes that deliver reliable data
See http://openup.tso.co.uk
Follow @TSOTechnology
Meet TSO Semantic team on our stand
Test our new release of Flint online SPARQL editor launching today…
Questions?
© 2012 Williams Lea Group