Boundless Opportunity: The Impact of Cloud-Based* Services for Libraries
Rachel L. Frick, Director, Digital Library Federation
Council on Library and Information Resources
Ticer Summer School, August 21, 2012
Cloud Based* services
Not just technical infrastructure
Distributed Services
Collections
Expertise
Network Opportunities
Capacity to do more
Leverage local expertise
Amplify local excellence
Macrosolutions: towards convergence
“Common to these efforts will be developing strong coalitions that bring together diverse institutions within a national framework; federating shared resources and interests, including collections, technology, and expertise; and creating a genuine, volitional dependency on other participating institutions for the provision of what was once a locally owned and managed asset. We are calling these collaborative projects macro solutions.”
CLIR Annual Report, 2009-2010, p. 3
Collaboration Continuum
• Common Interest
• Common Values
• Convergence
http://www.oclc.org/research/publications/library/2010/2010-09.pdf
High Risk / High Reward
Requires high trust threshold / risk tolerance
Dependence on others
Less control
Research Library at Web-scale
10,449,391 total volumes
5,516,747 book titles
272,663 serial titles
3,657,286,850 pages
468 terabytes
124 miles = 199.5 kilometers
8,490 tons (US) = 7,702 metric tons
3,140,629 volumes (~30% of total) in the public domain
Cloud Sourcing Library Collections
Managing Print in the Mass Digitized Library Environment, Constance Malpas, 2011
1/3 of U.S. ARL content duplicated in HathiTrust
Shared Print Archiving / Collective Collections
Regional Print / Digital Archives Service Centers
http://www.oclc.org/research/publications/library/2011/2011-01.pdf
Print Archiving: network scale
ReCAP - http://recap.princeton.edu/
WEST - http://www.cdlib.org/services/west/about/
ASERL / University of Florida: US Gov Docs http://www.aserl.org/programs/gov-doc/
Maine Shared Print - http://www.maineinfonet.net/mscs/
Organizational Node: Center for Research Libraries Print Archive Community Forum
http://www.crl.edu/archiving-preservation/print-archives/forum
New Metrics
How do we –
Count Collections?
Measure “quality”?
Reward high ratios of services, collections per budget $
Rate Trustworthiness
Identify good collaborators / team players?
http://www.flickr.com/photos/blackcountrymuseums/4887803840
Pause for a moment
http://www.flickr.com/photos/hckyso/3870006964/
Networked Collections: not just books
Digitized Primary Resource Collections
Europeana - http://www.europeana.eu/portal/
Biodiversity Heritage Library - http://www.biodiversitylibrary.org/
Scholarly Communications
OA publications / IRs, disciplinary depositories
Research Data
DataOne - http://www.dataone.org/
OpenAire - http://www.openaire.eu/
Challenge of Data Collections
BIG DATA vs. small data
Data sharing, small science and institutional repositories. Melissa H. Cragin, Carole L. Palmer, Jacob R. Carlson, and Michael Witt. Philosophical Transactions of the Royal Society A 2010; 368(1926): 4023-4038. doi:10.1098/rsta.2010.0165
Preservation services
Brief online interview with Sayeed Choudhury, JHU. http://youtu.be/oWw7Ifn1Xx8
Data post production services: Access, reuse, remix
Challenge of Data Collections
Researchers aligned with discipline, not institution
Restrictive campus IT policies
Not adequate network storage
Focused on publication, not curation
Data breach (privacy) top concern
Library viewed as dispensary of goods, not a data service partner.
http://www.clir.org/pubs/reports/pub154
Data Preservation Communities
Professional Organizations providing guidance
International Digital Curation Centre - http://www.dcc.ac.uk/
Digital Preservation Coalition - http://www.dpconline.org/
National Digital Stewardship Alliance - http://www.digitalpreservation.gov/ndsa/index.html
Open Planets Foundation - http://www.openplanetsfoundation.org/
Centers that “bridge the gap”
Data to Insight Center – http://d2i.indiana.edu/
D2C2 – http://d2c2.lib.purdue.edu/
UC3 – California Digital Library - http://www.cdlib.org/services/uc3/
Networks that balance the load
Text Grid - http://www.textgrid.de/
DataOne - http://www.dataone.org/
Data Conservancy - http://dataconservancy.org/
Why prioritize data curation services?
Data are emerging as the research output of importance
Data papers, for example Ecological Society of America: http://esapubs.org/archive/archive_D.htm
Data citation - http://www.datacite.org/
Databib - http://databib.org/
Published journal articles will be less important
Metadata of the research data
Gravemarker of research activity and version of dataset
What are conversations on your campus?
How is the library positioning itself in your campus’ data ecology?
Active Participant?
Research Partner?
Passive – end of process?
How is your library connected to larger data communities?
http://www.flickr.com/photos/marcwathieu/2979581445/
Collections = DATA
Data sets are not just scientific and business tables or spreadsheets
Not just generated by satellites and sensors
Libraries (archives, museums): potential distributed data stores
Digital Collections: Libraries’ Big Data
Computational Research
Digital Humanities
Digging into Data Challenge
http://www.diggingintodata.org/
CLIR publication: One Culture http://www.clir.org/pubs/reports/pub151/pub151.pdf
Case Study: Historic Newspapers
• Chronicling America
• http://chroniclingamerica.loc.gov/
• 5 million page images from historic newspapers with OCR from organizations in 25 states
• ~ 4 million hits per day
• Traditional research:
• SEARCHING for stories
• Data research:
• MINING newspaper OCR for trends across time periods and geographic areas
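To make the contrast between searching and mining concrete, here is a minimal sketch in Python of counting a term across years and states in OCR text. The sample records, the `PAGES` list, and the `term_trend` function are all hypothetical illustrations; they are not part of Chronicling America or of any tool named in this talk, and real OCR text would need far more cleanup.

```python
import re
from collections import Counter

# Hypothetical sample records: (year, state, OCR text).
# Real input would come from a newspaper collection's bulk OCR files.
PAGES = [
    (1898, "Ohio", "War declared. The war news fills every column."),
    (1898, "Texas", "Cotton prices fall; war talk continues."),
    (1905, "Ohio", "Railroad expansion brings new trade to the county."),
]

def term_trend(pages, term):
    """Count whole-word, case-insensitive occurrences of `term` per (year, state)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, state, text in pages:
        counts[(year, state)] += len(pattern.findall(text))
    return counts

trend = term_trend(PAGES, "war")
for (year, state), n in sorted(trend.items()):
    print(year, state, n)
```

A search returns the individual stories that match; the tally above treats the whole collection as data, which is the kind of use the following slides ask libraries to support.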
Case Study: Historic Newspapers
http://www.stanford.edu/group/ruralwest/cgi-bin/drupal/visualizations/us_newspapers
Data Research Service Needs
To use collections as a whole, mining and organizing the information in novel and innovative ways
Algorithmic and visualization tools
Working with both the artifact and its data representation
Data Collection Services
The ingest and inventory of such collections, other than scale, is basically understood.
How much ingest processing should be done with data collections, or collections that can be treated as data?
Do we process collections to create a variety of derivatives that might be used in various forms of analysis before ingesting them?
Do we have sufficient infrastructure to support full discovery?
Do we load collections into analytical tools?
Library Service Implications
Collections as “self-serve”
If we only provide access to data, do we limit it to native format or provide pre-processed or on-the-fly format transformation services for downloads?
Can we handle the download traffic?
Can our staff develop the expertise to provide guidance to researchers in using analytical tools?
Do we leave researchers to fend for themselves?
The De-centered Library
De-centered Networked Library
http://www.slideshare.net/yiibu/beyond-themobilewebbyyiibu/128
United by Brand
DPLA: Library as Platform
Constellation Model:
http://s.socialinnovation.ca/files/constellation%20and%20open%20source%20article%20september08_osbr.pdf
New Librarianship
Honesty about the limits of re-tooling
Re-think the librarian’s role in research
Crucial leadership challenge
Priorities of traditional services
“Stop moving the books, okay?”
Back to Basics
Collections that are unique
REAL Research support
Archiving, preservation, and access: distributed, but at scale
Get out of the comfort zone
Take the time to ask the hard questions
Consider the possibility for radical change
Are we deciding for today? Or making the hard choice for tomorrow?
Are we network ready?
http://www.flickr.com/photos/iamthebestartist/203179552/
Being ready
Research environments (including library systems) with permeable borders
Advocacy: Value of “Open Data”
Facilitating information flow
Courage
http://www.clir.org/pubs/reports/pub154/pub154.pdf
Connected-ness
Bollen J, Van de Sompel H, Hagberg A, Bettencourt L, Chute R, et al. 2009 Clickstream Data Yields High-Resolution Maps of Science. PLoS ONE 4(3): e4803. doi:10.1371/journal.pone.0004803
Action, Trust and Risk
Credits and Attribution
Ideas and contributions
Patricia Cruse, UC3 – California Digital Library
Lorcan Dempsey, OCLC
Josh Greenberg, Sloan Foundation
Leslie Johnston, Library of Congress
Günter Waibel, Smithsonian Institution
Jon Voss, Historypin – We Are What We Do
Martin Kalfatovic, Smithsonian Institution Libraries / BHL
Charles Henry and my colleagues at CLIR