EPA OEI Linked Data Process

Post on 07-May-2015


description

EPA OEI Linked Data Process presentation - 2012.

transcript

Publishing EPA Data as Linked Data

A brief by Michael Pendleton

EPA Office of Environmental Information

pendleton.michael@epa.gov

“We’re moving from managing documents to managing discrete pieces of open data and content which can be tagged, shared, secured, mashed up and presented in the way that is most useful for the consumer of that information.”

-- Report on Digital Government: Building a 21st Century Platform to Better Serve the American People

What is driving us?

Goal: Make Open Data, Content, and Web APIs the New Default

Slide Credit: David G. Smith, U.S. Environmental Protection Agency, Aug 16, 2011 presentation

Linked Data: What’s It All About?

• Speak the Language of the Web

• Just as you surf web pages, linked data lets you surf data.

• SOAP was about making the web try to work like applications; REST was about making applications work like the web.

• Linked Data is about making your DATA work like the web.


RDF is a lingua franca for data exchange

Slide Credit: David G. Smith, U.S. Environmental Protection Agency
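To make the "lingua franca" idea concrete, here is a minimal sketch of how RDF states facts as subject-predicate-object triples and writes them out in the N-Triples syntax. It is an illustration only, not EPA's implementation: the facility and place URIs under example.org and example.com are hypothetical, and the `Triple` and `to_ntriples` names are invented for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str              # URI of the thing being described
    predicate: str            # URI of the property
    obj: str                  # a URI or a plain literal value
    obj_is_uri: bool = False  # True when obj is a link to another resource


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_uri:
        obj = f"<{t.obj}>"
    else:
        # Escape the characters that N-Triples string literals require.
        escaped = (
            t.obj.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Hypothetical URIs, used only for illustration.
facility = "http://example.org/facility/1001"
triples = [
    Triple(facility, "http://www.w3.org/2000/01/rdf-schema#label", "Example Plant"),
    # A link to another dataset: this is what lets a client "surf" the data.
    Triple(facility, "http://www.w3.org/2002/07/owl#sameAs",
           "http://example.com/places/example-plant", True),
]

for t in triples:
    print(to_ntriples(t))
```

Because every name is a URI, a consumer can follow the object of the second triple to another dataset, much as a browser follows a hyperlink.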

Linked Data Basics

•Tim Berners-Lee: 5-Star model for publishing data

• http://www.w3.org/DesignIssues/LinkedData.html


•Linked Data is about publishing and consuming data using international data standards

•Based on a 20-year-old idea (the Web)

•A system of linked information systems
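As a rough illustration of the 5-Star model, the sketch below scores a dataset against the five cumulative criteria described at the W3C link above. The `Dataset` fields and the `star_rating` function are hypothetical names chosen for this example; they are not part of any standard tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    on_web_open_license: bool     # 1 star: on the Web under an open license
    machine_readable: bool        # 2 stars: structured, machine-readable data
    non_proprietary_format: bool  # 3 stars: open format such as CSV
    uses_uris_and_rdf: bool       # 4 stars: URIs and W3C standards (RDF, SPARQL)
    links_to_other_data: bool     # 5 stars: links out to other people's data


def star_rating(d: Dataset) -> int:
    """Count stars cumulatively: a star is earned only if all earlier ones are."""
    checks = [
        d.on_web_open_license,
        d.machine_readable,
        d.non_proprietary_format,
        d.uses_uris_and_rdf,
        d.links_to_other_data,
    ]
    stars = 0
    for passed in checks:
        if not passed:
            break
        stars += 1
    return stars


# A CSV file published on the Web under an open license.
csv_download = Dataset(True, True, True, False, False)
print(star_rating(csv_download))
```

The scoring is cumulative on purpose: a dataset that links to other data but is not openly licensed would still earn no stars in this sketch.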

Global requirements

•Comprehensively link legislation & regulations for more effective government

•Explain context, source, version & publication date with the data itself

•We need global standards for metadata
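One way to carry source, version, and publication date "with the data itself" is to publish them as triples next to the data. The sketch below assumes the Dublin Core Terms properties `source` and `issued` and the OWL property `versionInfo`; the dataset and source URIs are hypothetical, and `describe_dataset` is a name invented for this illustration.

```python
from datetime import date
from typing import List

DCT = "http://purl.org/dc/terms/"           # Dublin Core Terms namespace
OWL = "http://www.w3.org/2002/07/owl#"      # OWL namespace
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def describe_dataset(dataset_uri: str, source_uri: str,
                     version: str, issued: date) -> List[str]:
    """Return N-Triples lines stating source, version, and publication date.

    Assumes `version` contains no quotes or backslashes (no escaping is done).
    """
    return [
        f"<{dataset_uri}> <{DCT}source> <{source_uri}> .",
        f'<{dataset_uri}> <{OWL}versionInfo> "{version}" .',
        f'<{dataset_uri}> <{DCT}issued> "{issued.isoformat()}"^^<{XSD_DATE}> .',
    ]


# Hypothetical URIs, used only for illustration.
for line in describe_dataset(
    "http://example.org/dataset/facilities",
    "http://example.org/system/registry",
    "2012-06",
    date(2012, 6, 1),
):
    print(line)
```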

The mission of the Government Linked Data (GLD) Working Group is to provide standards and other information which help governments around the world publish their data as effective and usable Linked Data using Semantic Web technologies.

Best Practices

Vocabulary Guidance

Community Building

US EPA publishes lots of CSV files ...

And now, Linked Open Data ...

• A proof-of-concept launched 2011 with 5 Star Linked Data

• Publication of 1.3M facilities (FRS) and the substances (SRS) regulated by the EPA

• TRI program links to 25 years of data on major polluters

• Additional pilots in 2012 incorporating EPA and anonymized electronic medical records (EMR) data from Sentara Healthcare

• 5 Star Linked Open Data to be hosted & accessible on an EPA production Web site in summer 2012

• Empower users to create their own views of data to satisfy different applications

• Build a community around the data in which users help each other to curate and connect as needed

• Skip the supermodel - Leave data in the multiple “best of breed” systems; wrap and expose on the Web of Data

Increase re-use by publishing Linked Data

There is a Process

Identify → Model → Name → Describe → Convert → Publish → Maintain

• Identify a dataset others are likely to want to re-use

•Modeling

•Onsite modeling session (half day)

• Linked Data modeling supported by experts

• Validate the model with data owners/stewards

• Publish data on the Web (opendata.epa.gov) per Best Practices

• Produce automated scripts to maintain current data

• Announce Linked Open Data sets *

• Review usage reports to support relevance & user feedback

7 steps to publishing Linked Data

* Pending EPA Systems Security Plan approval
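The Convert step can be sketched in a few lines: read rows from a CSV export and emit one set of triples per row. This is a simplified illustration under stated assumptions, not the scripts used for the EPA datasets: the column names (`registry_id`, `name`, `state`), the URI base, and the `state` property are all hypothetical.

```python
import csv
import io
from typing import List

BASE = "http://example.org/facility/"             # hypothetical URI base
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
STATE = "http://example.org/vocab/state"          # hypothetical property


def literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def csv_to_ntriples(csv_text: str) -> List[str]:
    """Convert each CSV row into N-Triples lines about one facility."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Name: mint the subject URI from the row's identifier.
        subject = f"<{BASE}{row['registry_id'].strip()}>"
        # Convert: one triple per column of interest.
        lines.append(f"{subject} <{RDFS_LABEL}> {literal(row['name'].strip())} .")
        lines.append(f"{subject} <{STATE}> {literal(row['state'].strip())} .")
    return lines


SAMPLE = "registry_id,name,state\n1001,Example Plant,VA\n1002,Sample Works,MD\n"
for line in csv_to_ntriples(SAMPLE):
    print(line)
```

A script of this shape can be rerun on a schedule against fresh exports, which is one simple way to approach the Maintain step.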

Open Data Platforms

• We’re using Callimachus, a Web platform for data-driven applications based on Linked Data principles.

• It is hosted on Amazon EC2 and we have 24x7x365 data & application support.

• There are other data platforms; we selected this one because it is fully W3C standards compliant, with no vendor “lock in”

• It’s Open Source (Apache 2.0)

•Linked Data promotes goals of transparency & economic development during times of fiscal austerity

•Publish in reusable format (RDF family of standards)

•Use OPEN rather than proprietary data formats

•Define a URI Policy and Strategy

•Use the best practices and vocabularies that already exist -- don’t reinvent the wheel

Recommendations

Publishing Linked Data will require continual nurturing but the rewards are worth it
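A URI policy is easier to follow when it is enforced by a small helper rather than by convention. The sketch below shows one possible policy, assumed for illustration only: a single base, a lowercase resource type, and a lowercase hyphenated identifier with no file extension. The base URI and the `mint_uri` function are hypothetical.

```python
import re

BASE = "http://example.org/id/"  # hypothetical base; a real policy would fix this


def mint_uri(kind: str, local_id: str) -> str:
    """Build a predictable URI of the form BASE/kind/slug."""
    if not re.fullmatch(r"[a-z]+", kind):
        raise ValueError("kind must be a lowercase word, e.g. 'facility'")
    # Lowercase the identifier and collapse anything else into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", local_id.strip().lower()).strip("-")
    if not slug:
        raise ValueError("local_id must contain at least one letter or digit")
    return f"{BASE}{kind}/{slug}"


print(mint_uri("facility", " ABC 123 "))
```

Keeping the rule in one function means every dataset that uses it names the same thing the same way, which is what makes later linking possible.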

Resources

• VisibleGovernment.ca Website http://visiblegovernment.ca

• Hack, Mash and Peer: Crowdsourcing Government Transparency, Jerry Brito, George Mason University, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1023485

• Blog on UK Environment Agency Water Quality, see http://data.southampton.ac.uk/datasets.html

• Southampton Open Data Service, see http://data.southampton.ac.uk/datasets.html

• Blog post on Clean Energy data from Reegle, see http://blog.semantic-web.at/2012/04/13/reegle-info-linked-open-energy-data-cloud/

• Blog post on Publishing Linked Open Data in Tight Economic Times, 30-Jan-2012, http://3roundstones.com/2012/01/30/publishing-linked-open-data-makes-good-sense-in-tight-economic-times/

• Blog post on HealthData.gov from US Health & Human Services, 4-June-2012, http://www.healthdata.gov/blog/welcome-new-healthdatagov

• Blog post on US HHS Domain Challenge 1: Metadata, 2-June-2012, http://www.healthdata.gov/blog/domain-challenge-1-metadata

Coming soon ...

• Best Practices for Publishing Linked Data (editor’s Draft 20-Apr-2012), see https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html

• Linked Data Cookbook, see http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook

• Linked Data Directory, see http://dir.w3.org

• Attend the 2012 International Open Government Data Conference co-sponsored by data.gov & The World Bank 10-12 July 2012, Washington DC, see http://www.data.gov/communities/conference

This work is Copyright © 2011-2012 3 Round Stones Inc.

It is licensed under the Creative Commons Attribution 3.0 Unported License.

Full details at: http://creativecommons.org/licenses/by/3.0/

You are free:

to Share — to copy, distribute and transmit the work

to Remix — to adapt the work

Under the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

Credits

Jennifer Bell, VisibleGovernment.ca (CC-BY-SA)

http://www.slideshare.net/jenniferbell

1-5 Star Linked Data image

http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/

LOD Cloud Diagrams: Richard Cyganiak, Anja Jentzsch (CC-BY-SA)

http://lod-cloud.net/

Book covers © their respective owners and used under Fair Use for educational purposes

© 2012 Bernadette Hyland, released under a CC-BY-SA license