Open Sesame: Open Data, Data Liberation and Opportunities for Librarians

Post on 14-Jan-2015


description

A call to librarians to use their library powers in the community beyond the walls of their institutions as the open data folks need their knowledge!

Title: Open Sesame: Open Data, Data Liberation and New Opportunities for Libraries

Abstract: Cities and data producers are quickly embracing Open Data, albeit unevenly. The Data Liberation Initiative (DLI) has been a pioneer in broadening access to data for nearly two decades. This session will examine the relevance of Data Liberation in terms of Open Data and explore how librarians can step up to the plate to make Open Data/Open Government as successful as DLI.

Speakers:

- Wendy Watkins, Data Librarian, Carleton University

- Ernie Boyko, Adjunct Data Librarian, Carleton University

- Tracey P. Lauriault, Post Doctoral Fellow, Carleton University (tlauriau@gmail.com)

- Margaret Haines, University Librarian, Carleton University

transcript

Open Sesame: Open Data, Data Liberation and New Opportunities for Libraries

CLA 2012

Margaret Haines, Wendy Watkins, Ernie Boyko, Tracey P. Lauriault

June 1st, 1:30 Session E37

RM 205, Ottawa Convention Centre

Introductions


Margaret Haines

June 1, 2012

Part 1

The Data Liberation Initiative: Kind-of-Sort-of Open Data

A look at the way Data Liberation (DLI) started a move toward open data


Wendy Watkins


Why Should We Care About Data?

• We use data to understand the world in which we live

• Gives us evidence for decision and policy making

– Individuals

– Corporations

– Governments

• Good data are essential for good governance

What is the Data Liberation Initiative?

• Program to provide affordable access to Statistics Canada’s public datafiles and databases to academics

• Not really open – subscription based

• Partnership between Statistics Canada and Canadian post-secondary institutions

• Housed in academic libraries

– Logical place on campus because of service orientation and campus-wide coverage

– Used to administering licences

Why and How Did It Start?

• Canadian universities unable to afford StatCan data

• Used US data or simply did without

• 1992 paper suggested a solution

• 1996 government adopted the plan

• Expected 30 universities to join

– 50 became members within the 1st year

– Far exceeded expectations

What Does It Include?

• All public Statistics Canada databases

– Tables, graphs, time-series aggregate data

• Geographic files at every level

– National

– Provincial

– Sub-provincial

• 350 Public Use Microdata files

– Anonymized records of individual responses

– “Designer data”

Statistics and Data

• Statistics are data that have been organized

• Data are raw numbers that must be processed to make sense

Statistics and Data – an analogy

• Using statistics is like buying a postcard

– Someone else defines the view

• Data have the power of a camera

– Researcher makes decisions on content
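The postcard and camera analogy can be sketched in a few lines of code. This is a minimal illustration in Python, assuming a handful of made-up anonymized records; the field names and values are hypothetical and do not come from any Statistics Canada file. A published table fixes one view of the data, while the microdata let the researcher choose another.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical anonymized microdata records (illustrative values only).
records = [
    {"province": "ON", "age_group": "25-44", "income": 52000},
    {"province": "ON", "age_group": "45-64", "income": 61000},
    {"province": "QC", "age_group": "25-44", "income": 48000},
    {"province": "QC", "age_group": "45-64", "income": 55000},
]


def average_by(rows, key, value="income"):
    """Group rows by `key` and return the mean of `value` for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: mean(values) for group, values in groups.items()}


# The "postcard": a view someone else already chose to publish.
by_province = average_by(records, "province")

# The "camera": the researcher frames a different view of the same records.
by_age_group = average_by(records, "age_group")
```

With only the province table in hand, the age-group breakdown could not be recovered; with the records, both views (and others) are available.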

What Are DLI’s Benefits?

• Dedicated and knowledgeable team at Statistics Canada

• One-stop-shop for all data and statistical products

• Ready help via the listserv

• Annual regional training programs

• National training (every 4 years)

• Boot camps for new members (as needed)

• Community of data professionals

Transition to Free Statistics Canada Data

• No change for DLI

– Stopped paying for data in 2000

• All about data management

– Robust metadata

– Quality control

• One licence per institution for the collection

– Non-DLI: one licence per PUMF per person

– Not a workable solution for academic libraries

• Access to valuable data NOT available outside DLI

– Canadian Institute for Health Information (CIHI) microdata: Discharge Abstract Database (DAD)

– Other important microdata under negotiation

Data Liberation’s Relations

Programs/Projects as a Result of DLI

• Research Data Centre Network http://www.rdc-cdr.ca/

– network of 27 research centres with secure access to Statistics Canada’s confidential data

• <odesi> http://odesi.ca

– a digital repository for social science data

– data exploration, extraction and analysis tool

– built by academic data librarians

• Data Liberation International

Remember, librarians built the infrastructure for DLI, <odesi>, Equinox and much much more!

Over to Ernie

Part 2

A view from the Developing World


Ernie Boyko


A view from the Developing World

• AKA: Data Liberation International

- Different set of challenges for developing countries than Canadian DLI

- But the principles are transferable

- This presentation will outline the path followed by developing countries to reach the goal of data to support research and learning in the context of their economic and social development

Data and Development

• The value of data to guide economic and social development has long been recognized

- World census of population and agriculture program

- UN Statistics Division coordination

- Periodic household surveys sponsored by international donors

- A greater focus on macro financial time series after the ‘Mexican Peso Crisis’

Barriers to Use of Sound Data

• Relevance

- Aligned to national or sponsors’ priorities?

- Optimal timing and sequencing?

• Data Quality

• Reliability

• Comparability

- Over time, and across countries

Barriers Cont’d

• Accessibility

- Legal, technical, political, psychological issues

• Usability

- Poor documentation raises the risk of misuse


Lesson: Even if data are open, one must pay attention to quality and accessibility issues

The Marrakech Action Plan


• Established the

- PARIS21 Secretariat under OECD umbrella as a consortium of development agencies

- International Household Survey Network (IHSN) to develop tools and policies

- Accelerated Data Program (ADP) to work with countries

Marrakech Action Plan for Statistics

But measuring development progress is still difficult…

International Household Survey Network

• For better data collection

- Coordination for better survey planning

- Harmonization of recommendations

• For better use of existing survey data

- Tools and guidelines for better data documentation, dissemination, preservation (Microdata Management Toolkit)

IHSN is a partnership of international organizations

Microdata Management Toolkit

• Document data according to international XML standards and good practices

• Available in several languages and open source

• Benefits:

- Preserve institutional memory

- Data quality control

- Better documentation lowers the risk of misuse

- Easy dissemination (html, PDF output)
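As a rough picture of what documenting a dataset in XML can look like, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The element names, the survey, and the variables are all made up for illustration; they do not follow the schema the Toolkit actually uses, and this is not the Toolkit's own code.

```python
import xml.etree.ElementTree as ET


def describe_survey(title, country, year, variables):
    """Build a small XML metadata record for a survey dataset.

    Element names are hypothetical and do not follow any real
    metadata standard.
    """
    root = ET.Element("survey")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "country").text = country
    ET.SubElement(root, "year").text = str(year)
    var_list = ET.SubElement(root, "variables")
    for name, label in variables:
        # Each variable keeps its short name and a human-readable label.
        var = ET.SubElement(var_list, "variable", name=name)
        var.text = label
    return root


# Hypothetical survey and variables, for illustration only.
record = describe_survey(
    "Example Household Survey",
    "Exampleland",
    2010,
    [
        ("hhsize", "Number of household members"),
        ("region", "Region of residence"),
    ],
)

# Serialize to a string that could be saved alongside the data file.
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping labels next to variable names in a structured record is what makes the listed benefits possible: the description survives staff turnover, and the same record can be rendered to HTML or PDF for dissemination.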

Microdata Management Toolkit

A specialized metadata editor for data documentation and quality control

Automatic generation of user friendly outputs

Accelerated Data Program (ADP)

Providing support to countries to:

• Establish national microdata archives

• Document, disseminate existing data

• Analyze existing data for selected key issues

• Assess reliability, relevance, comparability

• Support new survey programs

• Sponsor lots of training

A web-based database of surveys, searchable by region/country, type of survey, year, etc. Advanced search (by topic) being developed.

Survey Description

For each survey, information is provided in four pages: description, content, documentation, and dataset

Conclusions

• Data need to be transformed from their raw state to make them more useable

• The key to success for IHSN and ADP is standards-based tools and outreach

• Tools and infrastructure for managing data can be shared (the toolkit etc. are open source)

• Data and information specialists need to work with data producers in this process

• There is a role for professional librarians in making data more accessible

Thanks to Olivier Dupriez and Neil Fantom from World Bank/IHSN for program slides

Over to Tracey

Part 3

Open Data in Canada & Why we need Librarians


Tracey P. Lauriault

tlauriau@gmail.com, datalibre.ca


Citizens & Open Data

Open North Public Participation Budget

http://opennorth.ca/ Budget Plateau

http://budgetplateau.com/

Open North Democratic Engagement

http://mamairie.ca/ http://represent.opennorth.ca/

Zone Cone Avoiding Construction

Source data: At the municipal level, the data are accessible indirectly on the City of Montréal's website. In other words, these data were not intended to be used directly but are displayed on a map in the Info-Travaux section. At the provincial level, the data come from the Ministère des transports du Québec and its Québec 511 service. Here too, the MTQ stands out from its Canadian counterparts as apparently the first to offer GPS data for locating construction sites.

http://zonecone.ca/

Recreation Patiner Montréal

http://patinermontreal.ca/rinks/74-saint-simon-apotre http://montrealouvert.net/a-propos/

Open North Transparency – Gov. Contracting

http://documents.montrealgazette.com

RAPLIQ Accessibility – Auditing Physical Space

http://www.rapliq.org/2011/06/09/journee-de-laccessibilite-dans-le-vieux-montreal/

Whether a location is accessible depends on more than the presence of a ramp. RAPLIQ audits a building on several dozen criteria important to people with different disabilities.

Accessibility Audit Prototype Map

Catherine Roy: ecrire@catherine-roy.net http://montrealaccessible.ca/

What is it? This is a prototype of a map of accessible businesses in Montreal, based on data compiled over the last several years by RAPLIQ. We're interested in finding potential partners or sponsors.

Who are we? This prototype was built by Michael Lenczner, Josh Vanwyck, Keharn Yawnghwe, and Michael Mulley.

Hacking Health

http://www.hackinghealth.ca/

Winners from the judging competition

We’re proud to announce our top winners from Hacking Health. Each team will receive $400 and will be invited by BDC for a consultation on how to take their projects forward into viable startups.

Health Innovation most likely to succeed: Montréal Accessible

Hackathons

1. Windsor 2. London 3. Ottawa 4. Montréal 5. Toronto 6. Calgary 7. Edmonton 8. Vancouver 9. Victoria 10. Guelph 11. Halton

http://www.opendataday.org/francais.html

http://opendataapps.org/

Random Hacks of Kindness

http://www.rhok.org/

Hackathon

http://blog.opendataottawa.ca/

http://www.livinglabmontreal.org/TranspoCampMTL

http://montrealouvert.net/2011/11/23/compte-rendu-du-3e-hackathon-montreal-ouvert/?lang=en

Open Data Cities http://datalibre.ca/

• OpenData Framework; Municipal Open Government Framework

• City of Burlington (ON), Pilot

• City of Calgary (AB)

• City of Edmonton (AB)

• City of Fredericton (NB)

• Gatineau Ouverte – Citizen Led

• City of Guelph (ON), Guelph Coffee and Code – Citizen Led

• City of Hamilton (Transit Feed) (ON), Open Data Hamilton – Citizen Led

• OpenHalton (ON) – Citizen Led

• City of London (ON), OpenData London – Citizen Led

• Township of Langley (BC)

• City of Mississauga – Mississauga Data (ON)

• Ville de Montréal Portails données ouvertes (QC), Montréal Ouvert – Citizen Led

• City of Nanaimo (BC)

• City of Niagara Falls (ON)

• District of North Vancouver (BC) GeoWeb

• City of Ottawa (ON), Citizens’ APP Group – OpenData Ottawa; Apps

• Region of Peel (ON)

• Ville de Québec Catalogue de données, / Capitale Ouverte (QC)- Citizen Led in Ville de Québec

• City of Prince George (BC) catalog

• City of Regina (SK) Open Gov & Open Data site

• City of Surrey (BC) GIS Catalog

• City of Toronto (ON); DataTO – Citizen Group

• City of Vancouver (BC); Open Data Wiki

• Region of Waterloo (ON) – Citizen Led

• City of Windsor (ON) Open Data Catalog

Open Data BC

http://www.data.gov.bc.ca/

Open Data Canada

http://www.data.gc.ca/default.asp?lang=En&n=F9B7A1E3-1

Where we need librarians:

• Point to these data & apps

• Point citizens to related resources

• Examine & evaluate portals

• Cataloguing expertise

• Data & app curation

• Be a citizen librarian at hackfest & hackathons

• Contribute expertise in public consultations

• Advise your city, prov. & fed gov’ts

Community & Locally based Data

FCM Quality of Life Reporting System

Participating Member Communities:

• City of Calgary • Region of Durham • City of Edmonton • Ville de Gatineau • Halton Region • City of Hamilton • City of Kingston • Ville de Laval • City of London • City of Toronto • City of Vancouver • Metro Vancouver • York Region • Regional Municipality of Waterloo • Halifax Regional Municipality • Regional Municipality of Niagara • Communauté métropolitaine de Montréal • City of Ottawa • Region of Peel • City of Regina • City of Saskatoon • City of Greater Sudbury • City of Surrey • City of Winnipeg

http://fcm.ca/home/programs/quality-of-life-reporting-system/program-resources.htm/home

FCM QoLRS Domains

http://www.municipaldata-donneesmunicipales.ca/Site/Reporting/en/reporting_tool.php

FCM Quality of Life Reporting System

http://www.municipaldata-donneesmunicipales.ca/Site/Reporting/en/reporting_tool.php

Public Health - Saskatoon

http://www.communityview.ca/index.html

Santé Publique - Montréal

http://emis.santemontreal.qc.ca/

Cities

www.toronto.ca/wellbeing

Community Based Research Social Planning Council of Winnipeg

http://www.spcw.mb.ca

Community Based Research Social Planning and Research Council of Hamilton

http://www.sprc.hamilton.on.ca/CommunityMappingService.php

Community Based Research Community Development Halton


http://www.cdhalton.ca/lens/index.htm

http://communitydata-donneescommunautaires.ca/home

Community Data Program Canadian Council on Social Development (CCSD)

1. Calgary 2. Edmonton 3. Halton Region 4. Hamilton 5. Kingston 6. London 7. Montréal 8. Saint John, New Brunswick 9. Newfoundland (In Discussions) 10. Niagara (In Discussions) 11. Ottawa 12. Peel Region 13. Peterborough 14. Regina (In Discussions) 15. Saskatoon (In Discussions) 16. Sault Ste. Marie 17. Simcoe County 18. Sudbury 19. Thunder Bay 20. Toronto 21. Vancouver 22. Victoria 23. Waterloo 24. Winnipeg 25. York Region


Community Data Canada NGOs, Cities, Federal Govt.

http://cdc-dcc.info/mandate.php

Social Data Portals

http://hifis.hrsdc.gc.ca/index-eng.shtml

Data are inaccessible to researchers

FCM Municipal Data Collection Tool

http://www.municipaldata-donneesmunicipales.ca/Site/Collection/en/index.php

Where we need librarians:

• Add these resources to your collection

• Point to these data & apps

• Create a local blog

• Volunteer in a local org. & help w/their data resources (e.g., librarians w/out borders)

• Apply cataloguing expertise

• Data & app curation

• Develop a local advisory/reference group for non profits

Research Data

https://gcrc.carleton.ca/confluence/display/GCRCWEB/Atlases

FCM & Geomatics and Cartographic Research Centre

Data & Software

- Nunaliit Cybercartographic Atlas Framework (BSD)

- Data Liberation Initiative (DLI) Statistics Canada (Restricted use)

- FCM QoLRS (Viewing only)

- City Neighbourhood framework data files (Viewing only)

- Toronto Community Housing (Viewing only)

Atlas of the Risk of Homelessness Geomatics and Cartographic Research Centre & FCM

City of Toronto, GCRC, & FCM Aging Social Housing Stock

https://gcrc.carleton.ca/confluence/display/GCRCWEB/Atlases

Atlas of Antarctica Geomatics and Cartographic Research Centre

http://atlases.gcrc.carleton.ca/antarctic/intro/intro.xml.html#introduction

http://atlases.gcrc.carleton.ca/antarctic/territorial/territories.xml.html

ISIUOP – Participatory Data Collection Geomatics and Cartographic Research Centre

Data & Software

- Nunaliit Cybercartographic Atlas Framework (BSD)

- Geogratis Framework & Topographic Data (Unrestricted terms of use)

- Flow lines collected by different hunters (Shared rights)

- More sensitive data, e.g. bear dens, sacred sites, environmentally sensitive data, are for viewing & use by the community only

- Data part of IPY Canada

https://gcrc.carleton.ca/confluence/display/ISIUOP/Inuit+Sea+Ice+Use+and+Occupancy+Project+(ISIUOP)

Nunaliit iPad Data Capture app Geomatics and Cartographic Research Centre

• Community:

- Wished access was faster and atlas and data were housed in community

- Wished adding content was easier

- Needed flexibility for types of data and metadata to be saved and how to present it

• Nunaliit:

- Distributed network of replicating nodes, including nodes in communities, and on mobile

- Simplified data collection app replaces half a dozen devices for offline data collection

- Document oriented database with data and applications loosely connected via flexible schema system

Inuit Siku (sea ice) Atlas Geomatics and Cartographic Research Centre

http://sikuatlas.ca/sea_ice_map.html?module=1

Lake Huron Treaty Atlas Geomatics and Cartographic Research Centre

http://atlas.gcrc.carleton.ca/lakehurontreaties/

International Polar Year (IPY) (IPY Research funding and data management)

http://www.ipy-api.gc.ca/pg_IPYAPI_052-fra.html

Natural Resources Canada (NRCan)

http://www.geoconnections.org/fr/resourcelibrary/keyStudiesReports

http://geodiscover.cgdi.ca

http://www.geobase.ca

http://geogratis.cgdi.gc.ca/

Antarctic Digital Database (ADD)

http://www.add.scar.org:8080/add/WMSmap.jsp

Where we need librarians:

• Archival & gov’t documents

• Point to portals

• Discovery of scientific & historical data

• Work with indigenous groups & help to manage knowledge resources

• Apply cataloguing expertise

• Data & app curation

• Help researchers find specialist librarians

Q & A

Thank you!