
APLIC 2012: Discovering & Dealing with Data

Date post: 22-Oct-2014
A presentation to the Association of Parliamentary Libraries in Canada, Toronto, September 2012.
Page 1: APLIC 2012: Discovering & Dealing with Data

17 September 2012

Discovering & Dealing with Data

Presented by

Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of Toronto

Page 2: APLIC 2012: Discovering & Dealing with Data

Agenda

• The MPI information environment

• Common data sources & authority

• Data management, discovery and access

• What is Open Data? Big Data?

• Fun with data visualization

• Q & A

Page 3: APLIC 2012: Discovering & Dealing with Data

About the MPI

• The Martin Prosperity Institute is an economic think tank; we are part of the Rotman School within the University of Toronto

• My client group consists of grad students, post-docs, visiting faculty and researchers who use social-science data to support their research

• To support their research process, I procure, curate, preserve and make discoverable data sets.

• The MPI has its own data repository, which has grown to 4 TB in size.

Page 4: APLIC 2012: Discovering & Dealing with Data

Data Sources

• Common & very authoritative sources

– StatsCan via the Data Liberation Initiative

– Bureau of Labor Statistics, Bureau of Economic Analysis, American FactFinder (Census)

– OECD eLibrary

– World Bank

– Int’l sources such as UK Data Archive, Swedish National Data Service, etc.

– Pew Research Center

– Gallup

Page 5: APLIC 2012: Discovering & Dealing with Data

More data sources

• Less authoritative??

– Chinese Data Center

– Rolling Stone

– MySpace

– CrunchBase

Page 6: APLIC 2012: Discovering & Dealing with Data

Data Challenge: Discovery

• Lots of research data being collected and added, but no method to manage it, catalogue it, or make it findable

• Demands from various clients: faculty, students, researchers, staff, administration

• The shared network drive was no longer effective

Page 7: APLIC 2012: Discovering & Dealing with Data

Show & Share…

• We want the world to see our data catalogue

• But, we don’t want the world to be able to copy or change what’s in the catalogue, or the catalogue itself

• We need to manage access to our data; who are you? Where are you from? Why do you want the data? What are you going to do with it? Will you share your results?
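
The access questions above could be captured as a simple request record. This is only a minimal sketch: the class name `AccessRequest`, its field names, and the `missing_answers` helper are hypothetical and are not part of any system described in this presentation.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AccessRequest:
    """One request to use a data set (illustrative field names)."""
    name: str                 # Who are you?
    affiliation: str          # Where are you from?
    purpose: str              # Why do you want the data?
    intended_use: str         # What are you going to do with it?
    will_share_results: bool  # Will you share your results?


def missing_answers(request: AccessRequest) -> List[str]:
    """Return the names of text fields that were left blank."""
    return [
        f.name
        for f in fields(request)
        if isinstance(getattr(request, f.name), str)
        and not getattr(request, f.name).strip()
    ]


request = AccessRequest(
    name="A. Researcher",
    affiliation="Example University",
    purpose="",
    intended_use="Regional labour-market analysis",
    will_share_results=True,
)
print(missing_answers(request))  # ['purpose']
```

A request with blank answers would be sent back to the requester before any data is released.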

Page 8: APLIC 2012: Discovering & Dealing with Data

Data Discovery Platforms

• I reviewed several platforms that would work in an academic environment:

– Nesstar – developed in Norway by Norwegian Social Science Data Services, used by StatsCan, UK Data Archive, NORC at UChicago

– Islandora – Open source system based on Fedora, developed at UPEI

– ODESI – proprietary system developed and used by Scholars Portal

– Dataverse – Open source system developed by the Institute for Quantitative Social Science at Harvard, used by NBER, and many academic think tanks.

Page 9: APLIC 2012: Discovering & Dealing with Data

Dataverse

• Dataverse was a good choice since we could install an instance at UToronto, in the UToronto cloud, and I could manage it myself

• It was free, and my colleagues at Scholars Portal were interested in installing it – I was the perfect guinea pig

• Slowly, I am cataloguing my data collection; I have set up a lending agreement, and it’s working very well.

• Demo: http://dataverse.scholarsportal.info/dvn/dv/mpi

Page 10: APLIC 2012: Discovering & Dealing with Data

Open Data

• Open data is the idea that certain data should be freely available for everyone to use, reuse, and redistribute without restriction.

• Governments around the world have begun to “open up” some of their data: US, UK, New Zealand, Norway, Russia, Australia, Morocco, Netherlands, Chile, Spain, Uruguay, France, Brazil, Estonia, Portugal, etc.

• State and municipal levels of government have also created open data sites.

Page 11: APLIC 2012: Discovering & Dealing with Data

Open Data Opportunities…

• Governments open up their data to foster better citizenship and improve transparency

• Open Data can spur grass-roots innovation: citizens use open data in software programs that solve problems, such as finding a local daycare, knowing when the next bus will come, reporting crime on the fly, or watching congressional proceedings in real time.

Page 12: APLIC 2012: Discovering & Dealing with Data

… and Challenges

• Open Data takes commitment. Successful implementations have a dedicated team of people who decide what data to release according to usefulness and demand

• The data must be anonymized, cleansed and in a non-proprietary format

• Organizations must be prepared to listen to citizens, be responsive, and troubleshoot.

• Open data is a public service.
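
As a rough illustration of the "anonymized, cleansed and in a non-proprietary format" step above, the sketch below drops directly identifying columns, trims stray whitespace, skips empty rows, and writes plain CSV. The column names, the sample rows, and both helper functions are invented for this example. Removing direct identifiers is only a first step and does not by itself guarantee anonymity.

```python
import csv
import io

# Hypothetical columns that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def prepare_for_release(rows):
    """Drop identifying columns, trim whitespace, skip fully empty rows."""
    cleaned = []
    for row in rows:
        kept = {
            key: (value or "").strip()
            for key, value in row.items()
            if key not in DIRECT_IDENTIFIERS
        }
        if any(kept.values()):
            cleaned.append(kept)
    return cleaned


def to_csv(rows):
    """Serialize rows to CSV text, a plain non-proprietary format."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data: one usable record and one empty record.
raw = [
    {"name": "Jane Doe", "email": "jane@example.com",
     "ward": " 12 ", "daycare_spaces": "40"},
    {"name": "", "email": "", "ward": "", "daycare_spaces": ""},
]
released = prepare_for_release(raw)
print(to_csv(released))
```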

Page 13: APLIC 2012: Discovering & Dealing with Data

Big Data

• Big Data is a collection of data sets too large for the average data management tool to handle (Access and Excel, for instance).

• Examples come from meteorology, genomics and physics. At MPI we wrestle with large GIS data sets (maps and satellite data), and deal with data at the terabyte (1 trillion bytes) level.

• Larger data sets deal with petabytes (1 quadrillion bytes) and exabytes (1 quintillion bytes).
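
To make these scales concrete, here is a small sketch that converts byte counts into the decimal units named above (1 TB = 10^12 bytes, 1 PB = 10^15, 1 EB = 10^18). The `human_size` helper is hypothetical; the 4 TB figure is the MPI repository size mentioned earlier in the talk.

```python
# Decimal (SI) units, matching "1 trillion bytes" for a terabyte.
UNITS = [
    ("EB", 10**18),
    ("PB", 10**15),
    ("TB", 10**12),
    ("GB", 10**9),
    ("MB", 10**6),
    ("kB", 10**3),
]


def human_size(num_bytes: int) -> str:
    """Format a byte count using the largest unit that fits."""
    for label, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {label}"
    return f"{num_bytes} B"


repository_bytes = 4 * 10**12          # the 4 TB MPI repository
print(human_size(repository_bytes))    # 4.0 TB
print(10**15 // repository_bytes)      # 250 such repositories fit in 1 PB
```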

Page 14: APLIC 2012: Discovering & Dealing with Data

Data Visualizations

• The visual representation of data: literally, a picture can say a thousand [numbers]

• Edward Tufte is a key pioneer: http://www.edwardtufte.com/tufte/

• Fantastic examples at Flowing Data: http://flowingdata.com/

• RSA Animate: http://www.thersa.org/

Page 15: APLIC 2012: Discovering & Dealing with Data

17 September 2012

Q & A

(and, Thank You!)

Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of Toronto

[email protected]

