Linked Data Cookbook for Government Agencies, SemTech East, Washington DC 1-Dec-2011

Post on 13-May-2015

description

Linked Data Cookbook for US Government Agencies by Bernadette Hyland, 3 Round Stones, Inc. and W3C Government Linked Data co-chair. Presented at Semantic Technology Conference Dec 2011, Washington DC

transcript

A Linked Data Cookbook for Government Agencies

Semantic Technology Conference, Washington DC 01-Dec-2011 8:30AM

Bernadette Hyland

CEO, 3 Round Stones & co-chair, W3C Government Linked Data Working Group

bhyland@3roundstones.com

Twitter @BernHyland

Monday, November 28, 11

• Linked Data is about publishing and consuming data using international data standards

• Based on a 20-year-old idea

• Goal is to solve organizational issues related to data silos, requirements for faster data integration and an environment of reduced IT budgets

Linking Government Data

• 42 contributors

• ... from 8 countries

• 10 chapters

• Publication date: November 2011

Agenda

• Why publishing Linked Open Data matters

• What governments are doing today

• How government use of Open Standards & Open Source Software saves lives and money

• Social contract as a government publisher

• Next steps

Two sides of the Open Government Coin

#1 Short and long term public interests

•Increasing transparency

•Helping with informed civic engagement

#2 Data sharing for informed research, policy & regulation

•My talk today focuses on #2

Why should we care?

• Reducing data silos has long been discussed ...

• Linked Data, based on international data exchange standards, avoids vendor lock-in

• Reduces the need to create & maintain data silos

• Encourages private and public partnerships

• Sows the seeds for economic growth from the top down and bottom up

[Chart: Acceptable ROI for IT. Categories: 6 months, 12 months, 18 months, 24 months, more than 24 months. Values shown: 17%, 49%, 16%, 13%, 4%.]

Governments

Goals: Governmental transparency and/or improved internal efficiencies (data warehouses)

Publishers

Goals: Improve internal manuscript pipelines, expose additional ways of finding and using content

Countries with Open Government Sites

Open Government Data Camp 2011

Open Government Data in 2011

Government LOD on CKAN

Largest Publisher of Government LOD

Where is Open Source deployed?

International Standards and Open Source are the reason

• The Web has become the most extensible, robust information network ever created

• US Dept of Defense is a big customer of commercially supported Open Source software

• US Army cites Open Source as saving lives and hundreds of millions of dollars

• 100k instances deployed in missile defense systems & armored personnel carriers

In 3 brief years ...

• Starting in 2008, a few heads of state directed open government data to be published on the Web ...

• Three months ago (September 2011), Presidents Obama (USA) and Rousseff (Brazil) endorsed the Open Government Partnership, along with 7 other nations

• Each launched their government’s National Plans during the meeting of the UN General Assembly

World-changing phenomenon

• Using a Linked Data approach, we can begin to address data silos and interoperability using data exchange standards

• We can combine information sources

• The W3C has defined standards that enable interoperability and allow us to freely move data

What is next?

• We’re already seeing signs of things to come.

• Structured data on the Web is becoming mainstream.

Government Linked Data Working Group

• Started June 2011; runs to May 2013

• Chartered to provide standards & develop standards track documents to help all governments share their data as high quality (“5 star”) Linked Data

• 39 participants from 25 organizations

• 50% in non-US locations

http://www.w3.org/2011/gld/charter

Deliverables

Community Directory

Best Practices for Publishing Linked Data

• Procurement, vocabulary selection, URI construction, versioning, stability, legacy data issues

• Cookbook for Linked Open Data

Standard Vocabularies

• Metadata, Statistical “Cube” Data, People, Organizational structures

Beta: http://dir.w3.org

Email support@3roundstones.com for a login to add your organization’s details

A pragmatic approach to publishing & consuming Linked Data

There is a Process

Identify, Model, Name, Describe, Convert, Publish, Maintain

Reality ... We started with the usual CSV dump ... ugly, cumbersome data

Preparation

1. Leverage what exists

• Request a copy of the logical and physical model of the database(s)

• Obtain data extracts (i.e., databases and/or spreadsheets) or create data in a way that can be replicated.

Model the data

2. Model data without context to allow for reuse and easier merging of data sets

•Traditional DBAs organize data for specified Web services or applications.

•With LD, application logic does not drive the data schema, concepts, etc.

Model the data

3. Look for real world objects of interest (e.g., people, places, things, locations) and model them.

• Investigate how others are already modeling similar or related data.

• Look for duplication and normalize the data

•Use common sense to decide whether or not to make a link
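
A minimal sketch of the "look for duplication and normalize" step, using only the Python standard library. The sample names and the helper names (normalize_name, group_duplicates) are illustrative assumptions, not part of the talk.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Return a comparison key: Unicode-normalized, case-folded, punctuation dropped."""
    text = unicodedata.normalize("NFKC", name).casefold()
    text = re.sub(r"[^\w\s]", " ", text)   # replace punctuation with spaces
    return " ".join(text.split())           # collapse runs of whitespace


def group_duplicates(names):
    """Group raw spellings under one normalized key, keeping first-seen order."""
    groups = {}
    for raw in names:
        groups.setdefault(normalize_name(raw), []).append(raw)
    return groups


# Hypothetical values, as they might appear in a raw CSV column.
raw_names = ["U.S. EPA", "u.s.  epa", "US Army", "U.S. EPA "]
groups = group_duplicates(raw_names)
for key, spellings in groups.items():
    print(key, "<-", spellings)
```

A key-based pass like this only catches spelling variants. Whether "US EPA" and "U.S. EPA" name the same thing is still a judgment call, which is where the common sense on the slide comes in.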

Model the data ...

4. Connect data from different sources and authoritative vocabularies (see list of popular vocabularies below).

•Use URIs as names for your objects
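
One way to picture "URIs as names" together with reuse of an existing vocabulary is sketched here in plain Python. The data.example.gov namespace and the helper names are assumptions made for illustration. The FOAF terms (foaf:Organization, foaf:name, foaf:homepage) are real, but choosing FOAF here is only an example of an authoritative vocabulary, not a recommendation from the talk.

```python
from urllib.parse import quote

BASE = "http://data.example.gov/id/"   # hypothetical namespace for minted names
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def mint_uri(kind: str, local_id: str) -> str:
    """Build a URI that names one real-world object, e.g. an agency."""
    return f"{BASE}{quote(kind, safe='')}/{quote(local_id, safe='')}"


def triple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Serialize one statement as an N-Triples line (plain string literals only)."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return f'<{subject}> <{predicate}> "{escaped}" .'
    return f"<{subject}> <{predicate}> <{obj}> ."


agency = mint_uri("agency", "epa")
lines = [
    triple(agency, RDF_TYPE, FOAF + "Organization"),
    triple(agency, FOAF + "name", "Environmental Protection Agency", literal=True),
    triple(agency, FOAF + "homepage", "http://www.epa.gov/"),
]
print("\n".join(lines))
```

Because the subject is a URI rather than a database key, another data set can refer to the same agency simply by using the same name.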

Model the data ...

•Put aside immediate needs of any application

•Don’t think about how an application will use your data

•Do think about time and how the data will change over time.

Convert, Publish & Maintain

5. Write a script or process to convert the data set repeatedly

6. Publish to the Web and announce it! (more details shortly)

7. Maintenance strategy (more details in the social contract at the end)
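
Step 5 could look something like the sketch below: a small, re-runnable script that turns a CSV extract into N-Triples. The column names (id, name, state), the data.example.gov namespaces and the sample rows are all assumptions for illustration. Output is de-duplicated and sorted so that running the conversion again on the same extract produces identical output.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://data.example.gov/id/facility/"   # hypothetical namespace
VOCAB = "http://data.example.gov/def/"          # hypothetical vocabulary
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def literal(value: str) -> str:
    """Quote a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def convert(csv_text: str) -> str:
    """Convert CSV rows to sorted N-Triples so repeated runs give the same result."""
    triples = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row['id'].strip(), safe='')}>"
        triples.add(f"{subject} <{RDFS_LABEL}> {literal(row['name'].strip())} .")
        state = row["state"].strip()
        if state:  # skip empty cells rather than emit empty literals
            triples.add(f"{subject} <{VOCAB}state> {literal(state)} .")
    return "\n".join(sorted(triples)) + "\n"


# Hypothetical extract, including a repeated row and an empty cell.
SAMPLE = """id,name,state
101,Riverside Plant,VA
102,"Harbor Depot, North",
101,Riverside Plant,VA
"""

output = convert(SAMPLE)
print(output, end="")
```

Keeping the conversion in a script rather than doing it by hand means a fresh extract can be republished whenever the source data changes.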

Take the plunge ... Be forgiving

•Simplistic data models can still be useful

•Better to make progress with something rather than do nothing because we cannot be comprehensive and complete

Take an iterative approach

1. Review modeling decisions

2. Review vocabularies chosen and developed

3. Modify/update data conversion scripts

4. Do a maintenance walk-through with real use cases

5. Show how to explore data with SPARQL and visualizations

6. Discuss a persistent identifier strategy (think PURLs)
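
The idea behind a persistent identifier strategy in item 6 can be sketched as a lookup table that redirects a stable, published path to wherever the resource currently lives. This is a toy model in plain Python, not the API of any actual PURL server. The paths, target URLs and function names are assumptions for illustration.

```python
# persistent path -> current location (both hypothetical)
PURL_TABLE = {
    "/vocab/facility": "http://data.example.gov/2011/def/facility.rdf",
    "/id/agency/epa": "http://data.example.gov/v2/agency/epa",
}


def resolve(path: str):
    """Return (HTTP status, Location) for a persistent path, redirect-style."""
    target = PURL_TABLE.get(path)
    if target is None:
        return 404, None
    return 302, target


def move(path: str, new_target: str) -> None:
    """Repoint a persistent identifier; the published URI itself never changes."""
    if path not in PURL_TABLE:
        raise KeyError(path)
    PURL_TABLE[path] = new_target


print(resolve("/id/agency/epa"))
move("/id/agency/epa", "http://data.example.gov/v3/agency/epa")
print(resolve("/id/agency/epa"))
```

Consumers keep linking to the persistent path, so a server reorganization only requires updating the table.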

Content Management Systems

•WordPress

•Drupal

•Joomla!

•Others ...

Linked Data Management System

•Callimachus (kəlĭm'əkəs) is a framework for data-driven applications based on Linked Data principles.

•Callimachus allows Web authors to quickly and easily create semantically-enabled Web applications.

•Web 2.0 developers can create data-driven applications with templates in hours

•Triples up & down (no MySQL under the covers)

•Wiki editing of content

•Access control

•Collaboration via Web

•Change tracking (history)

•Page/form Templates

Join the Community

•Callimachus has benefited from 2+ years of corporate support

•We’re using it for real world Web applications in environmental protection, finance and publishing

•Open Source project

•Visit callimachusproject.org

What we covered today

• Why government authorities are publishing information as Linked Open Data

• The process for converting data into RDF

• Using Open Standards and Open Source to publish Open Data

• Note: Commercial support & products are critical for government publishing & consumption of Open Data

• Announcing agency Open Data & your social contract

Further Reading

http://linkeddatabook.com/editions/1.0/

http://3roundstones.com/linking-enterprise-data/

http://www.linkeddatadeveloper.com/

http://3roundstones.com/linking-government-data/

LINKED GOVERNMENT DATA:

ENVIRONMENTAL PROTECTION PERSPECTIVES

Recommended talk Thursday, 1-Dec 2011 @ 9:30

by Michael Pendleton & David G. Smith, US EPA

This talk: http://slideshare.net/3roundstones

@BernHyland

bhyland@3roundstones.com

This work is Copyright © 2011 3 Round Stones Inc. It is licensed under the Creative Commons Attribution 3.0 Unported License. Full details at: http://creativecommons.org/licenses/by/3.0/

You are free:

to Share — to copy, distribute and transmit the work

to Remix — to adapt the work

Under the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
