Cloud Computing in Libraries and Web-scale Library Management and Discovery
Marshall Breeding
Independent Consultant, Author, Speaker
Founder and Publisher, Library Technology Guides
http://www.librarytechnology.org/
http://twitter.com/mbreeding
15 March 2013 SENYLRC
Abstract
This is an introduction to the concepts of cloud computing and how this suite of technologies is positioned to reshape the ways that libraries make use of strategic applications such as discovery and management applications. The instructor will describe the evolution of discovery systems from next-generation library catalogs, which provided some improvements in the interfaces and performance of the established online catalogs, toward the current wave of index-based or Web-scale discovery services. Major changes are also underway in the applications that libraries use to manage their operations and collections, with a new slate of library services platforms coming on the scene, providing an alternative to the integrated library systems that have been available for many decades.
Cloud Computing for Libraries
Volume 11 in The Tech Set
Published by Neal-Schuman / ALA TechSource
ISBN: 9781555707859
http://www.neal-schuman.com/ccl
Book Image Publication Info:
Appropriate Automation Infrastructure
Current automation products out of step with current realities
Majority of library collection funds spent on electronic content
Majority of automation efforts support print activities
New discovery solutions help with access to e-content
Management of e-content continues with inadequate supporting infrastructure
Key Context: Libraries in Transition
Academic: shift from print > electronic
  E-journal transition largely complete
  Circulation of print collections slowing
  E-books now in play (consultation > reading)
All libraries:
  Need better tools for access to complex multi-format collections
  Strong emphasis on digitizing local collections
  Demands for enterprise integration and interoperability
Key Context: Changed expectations in metadata management
Moving away from individual record-by-record creation
Life cycle of metadata
  Metadata follows the supply chain, improved and enhanced along the way as needed
Manage metadata in bulk when possible
  E-book collections
Highly shared metadata
  E.g., e-journal knowledge bases
Great interest in moving toward semantic web and open linked data
  Very little progress in linked data for operational systems
AACR2 > RDA
MARC > Bibframe (http://bibframe.org/)
Fundamental technology shift
Mainframe computing
Client/Server
Web-based and Cloud Computing
http://www.flickr.com/photos/carrick/61952845/
http://soacloudcomputing.blogspot.com/2008/10/cloud-computing.html
http://www.javaworld.com/javaworld/jw-10-2001/jw-1019-jxta.html
Local Computing
Traditional model
Locally owned and managed
Shifting from departmental to enterprise
Departmental servers co-located in central IT data centers
Increasingly virtualized
Virtualization
The ability for multiple computing images to simultaneously exist on one physical server
Physical hardware partitioned into multiple instances using virtual machine management tools such as VMware
Applicable to local, remote, and cloud models
Cloud Computing
Major trend in Information Technology
Term “in the cloud” has devolved into marketing hype, but cloud computing in the form of multi-tenant software as a service offers libraries opportunities to break out of individual silos of automation and engage in widely shared cooperative systems
Opportunities for libraries to leverage their combined efforts into large-scale systems with more end-user impact and organizational efficiencies
Beyond “Cloudwashing”
Cloud as marketing hype
Cloud computing used very freely, tagged to almost any virtualized environment
Any arrangement where the library relies on some kind of remote hosting environment for major automation components
Includes almost any vendor-hosted product offering
Example: ASP now Software-as-a-Service
Cloud computing – characteristics
Web-based interfaces
Externally hosted
Pricing: subscription or utility
Highly abstracted computing model
Provisioned on demand
Scaled according to variable needs
Elastic: consumption of resources can contract and expand according to demand
Gartner Hype Cycle 2009
Gartner Hype Cycle 2010
Gartner Hype Cycle 2011
Gartner Hype Cycle 2012
Budget Allocations
Local Computing
  Server purchase
  Server maintenance
  Application software license
  Data center overhead
  Energy costs
  Facility costs
Cloud Computing
  Annual subscription
    Measured service?
    Fixed fees
  Factors
    Hosting
    Software licenses
    Optional modules
Infrastructure-as-a-service
Provisioning of equipment
  Servers, storage
Virtual server provisioning
Examples:
  Amazon Elastic Compute Cloud (EC2)
  Amazon Simple Storage Service (S3)
  Rackspace Cloud (http://www.rackspacecloud.com/)
  EMC Atmos (http://www.atmosonline.com/)
Amazon EC2
Amazon Machine Images (AMI)
  Red Hat Enterprise Linux
  Debian
  Fedora
  Ubuntu Linux
  OpenSolaris
  Windows Server 2003/2008
Storage-as-a-Service
Provisioned, on-demand storage
Bundled to, or separate from, other cloud services
Software as a Service
Multi-tenant SaaS is the modern approach
  One copy of the code base serves multiple sites
Software functionality delivered entirely through Web interfaces
  No workstation clients
Upgrades and fixes deployed universally
  Usually in small increments
Data as a service
SaaS provides opportunity for highly shared data models
Bibliographic knowledgebase: one globally shared copy that serves all libraries
Discovery indexes: article and object-level index for resource discovery
E-resource knowledge bases: shared authoritative repository of e-journal holdings
General opportunity to move away from library-by-library metadata management to globally shared workflows
Software-as-a-Service
Complete software application, customized for customer use
Software delivered through cloud infrastructure, data stored on cloud
E.g., Salesforce.com: widely used business infrastructure
Multi-tenant: all organizations that use the service share the same instance (codebase, hardware resources, etc.)
  Often partitioned to separate some groups of subscribers
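As a rough illustration of the multi-tenant idea described above, the sketch below shows one running service instance holding data for several libraries, partitioned by a tenant identifier. The class name, method names, and tenant IDs are hypothetical and chosen only for this example; production SaaS products partition data in far more elaborate ways.

```python
class MultiTenantCatalog:
    """One running instance (one code base) that serves many libraries.

    Hypothetical sketch: each library is a "tenant" whose records live in
    its own partition inside the single shared service.
    """

    def __init__(self):
        # Maps tenant_id -> list of titles held by that tenant.
        self._items_by_tenant = {}

    def add_item(self, tenant_id, title):
        """Store a title in the partition that belongs to tenant_id."""
        self._items_by_tenant.setdefault(tenant_id, []).append(title)

    def search(self, tenant_id, term):
        """Return matching titles, looking only at the caller's partition."""
        items = self._items_by_tenant.get(tenant_id, [])
        return [title for title in items if term.lower() in title.lower()]


# Usage: two libraries share the same instance but never see each other's data.
service = MultiTenantCatalog()
service.add_item("library-a", "Cloud Computing for Libraries")
service.add_item("library-b", "Cloud Atlas")
print(service.search("library-a", "cloud"))
```

An upgrade to `MultiTenantCatalog` reaches every tenant at once, which is the "upgrades and fixes deployed universally" property; the ASP model would instead run one separate instance per library.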
Application service provider
Legacy business applications hosted by software vendor
Standalone application on discrete or virtualized hardware
Staff and public clients accessed via the Internet
  Same user interfaces and functionality as if installed locally
Established as a deployment model in the 1990s
Can be implemented through Infrastructure-as-a-Service
  Individual instances of legacy system hosted in EC2
ASP vs SaaS
From: THINKstrategies: CIO’s Guide to Software-as-a-Service
Platform-as-a-Service
Virtualized computing environment for deployment of software
Application engine, no specific server provisioning
Examples:
  Google App Engine: SDKs for Java, Python
  Heroku: Ruby platform
  Amazon Web Services
Library-specific platforms
  WorldShare Platform
Library Context
Cloud Computing
Library automation through SaaS
Almost all library automation products offered through hosted options
SaaS or ASP?
ILS Products offered as SaaS (mostly ASP)
SirsiDynix Symphony
SirsiDynix Horizon
Innovative Interfaces Millennium
Ex Libris Aleph
EOS International EOS.Web
Evergreen: Equinox Software
Koha: LibLime, ByWater, many others internationally
…many other examples …
Multi-tenant SaaS
Serials Solutions
  Summon
  Intota (Announced for 2012-13)
  360 Search, 360 Link, KnowledgeWorks
Ex Libris
  Alma
  Primo Central
BiblioCommons
OCLC WorldShare Management Services
Platform as a Service
OCLC WorldShare Platform
  WorldShare Management Services
  WorldShare License Manager
  Library-created applications
Library Management in the Cloud
Almost all library automation vendors offer some form of “cloud-based” services
Server management moves from library to vendor
Subscription-based business model
  Comprehensive annual subscription payment
  Offsets local server purchase and maintenance
  Offsets some local technology support
Leveraging the Cloud
Moving legacy systems to hosted services provides some savings to individual institutions but does not result in dramatic transformation
Globally shared data and metadata models have the potential to achieve new levels of operational efficiencies and more powerful discovery and automation scenarios that improve the position of libraries overall.
Transition to Web-scale Technologies
Web-scale: a characterization or marketing tag that denotes a comprehensive, highly-scalable, globally shared model
Web-scale: One of the key characteristics of emerging library management and discovery services
Displaces applications or data models targeting individual libraries in isolation
Discovery: index-based search
Management: Library Services Platforms
Repositories in the cloud
DSpace: institutional repository application
Fedora: generalized repository platform
DuraSpace: organization now over both DSpace and Fedora
DuraCloud: shared, hosted repository platform
  Pilot since 2009, production in early 2011
  http://www.duraspace.org/duracloud.php
Caveats and concerns with SaaS
Libraries must have adequate bandwidth to support access to remote applications without latency
Quality of service agreements that guarantee performance and reliability factors
Configurability and customizability limitations
Access to APIs
  Ability to interoperate with third-party applications
  E.g., connect SaaS ILS with discovery product from another vendor
Benefits of Cloud Computing
Libraries
  Elimination of capital expenses for equipment
  Lower annual costs
  Redeployment of technical staff to more meaningful activities
Providers / Vendors
  Higher revenues relative to software-only arrangements
  Provision of infrastructure at scale with lower unit costs
  Longer-term relationships with customers
Cost implications
Total cost of ownership
  Do all cost components result in increased or decreased expense?
  Personnel costs: need less technical administration
  Hardware: server hardware eliminated
  Software costs: subscription, license, maintenance/support
  Indirect costs: energy costs associated with power and cooling of servers in data center
IaaS: balance elimination of hardware investments against ongoing usage fees
  Especially attractive for development and prototyping
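The total-cost-of-ownership comparison above can be worked through with a small calculation. What follows is a minimal sketch only: the class and field names are hypothetical labels for the cost components named on the slide, and any figures passed in are placeholders that each library would replace with its own numbers.

```python
from dataclasses import dataclass


@dataclass
class LocalCosts:
    """Annual view of running a system on locally owned servers (hypothetical fields)."""
    server_purchase: float        # one-time hardware cost
    server_lifetime_years: int    # years over which the hardware is amortized
    maintenance: float            # annual server and software maintenance
    software_license: float       # annual application license
    energy_and_facility: float    # annual power, cooling, data center overhead
    personnel: float              # annual technical administration

    def annual_total(self) -> float:
        # Spread the hardware purchase evenly over its expected lifetime.
        amortized_hardware = self.server_purchase / self.server_lifetime_years
        return (amortized_hardware + self.maintenance + self.software_license
                + self.energy_and_facility + self.personnel)


@dataclass
class HostedCosts:
    """Annual view of a subscription-based hosted service (hypothetical fields)."""
    subscription: float       # annual subscription, covering hosting and license
    optional_modules: float   # annual fees for add-on modules
    personnel: float          # remaining local technical support

    def annual_total(self) -> float:
        return self.subscription + self.optional_modules + self.personnel


def annual_difference(local: LocalCosts, hosted: HostedCosts) -> float:
    """Positive result: the hosted option is cheaper per year; negative: it costs more."""
    return local.annual_total() - hosted.annual_total()
```

The sign of `annual_difference` answers the slide's question of whether the components add up to increased or decreased expense. The sketch deliberately leaves out harder-to-quantify items such as migration effort and bandwidth upgrades.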
Risks and concerns
Privacy of data
  Policies, regulations, jurisdictions
Ownership of data
  Avoid vendor lock-in
Integrity of data
  Backups and disaster recovery
Security issues
Most providers implement stronger safeguards beyond the capacity of local institutions
Virtual instances equally susceptible to poor security practices as local computing
Cloud computing trends for libraries
Increased migration away from local computing toward some form of remote / hosted / virtualized alternative
Cloud computing especially attractive to libraries with few technology support personnel
Adequate bandwidth will continue to be a limiting factor
Increased pressure
Library automation vendors promoting SaaS offerings
  Some companies already exclusively SaaS
Software pricing increasingly favorable to SaaS
Caveat
Technologies are promoted by companies and organizations that have a vested interest in their adoption
Critically assess viability of the technology and its appropriateness for your organization
A New Generation of Resource Discovery
Next-Gen Library Catalogs
Marshall Breeding
Neal-Schuman Publishers
March 2010
Volume 1 of The Tech Set
Online Catalog
Scope of Search
  Books, journals, and media at the title level
  Not in scope: articles, book chapters, digital objects
[Diagram labels: Search; ILS Data; Search Results]
Next-gen Catalogs or Discovery Interface
Single search box
Query tools
  Did you mean
  Type-ahead
Relevance ranked results
Faceted navigation
Enhanced visual displays
  Cover art
  Summaries, reviews
Recommendation services
Scope of Search
  Books, journals, and media at the title level
  Other local and open access content
  Not in scope: articles, book chapters, digital objects
Discovery from Local to Web-scale
Initial products focused on interface improvements
  AquaBrowser, Endeca, Primo, Encore, VuFind, LIBERO Uno, Civica Sorcer, Axiell Arena
  Mostly locally-installed software
Current phase is focused on pre-populated indexes that aim to deliver Web-scale discovery
  Primo Central (Ex Libris)
  Summon (Serials Solutions)
  WorldCat Local (OCLC)
  EBSCO Discovery Service (EBSCO)
  Encore Synergy (no index, though)
Discovery Interface search model
[Diagram labels: Search; Search Results; Local Index; ILS Data; MetaSearch Engine; Real-time query and responses; remote sources: Digital Collections, ProQuest, EBSCOhost, … MLA Bibliography, ABC-CLIO]
Web-scale Index-based Discovery (2009-present)
[Diagram labels: Search; Search Results; Consolidated Index; Pre-built harvesting and indexing; sources: ILS Data, Digital Collections, Web Site Content, Institutional Repositories, … E-Journals, Reference Sources, Aggregated Content packages]
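The difference between the metasearch model and the index-based model comes down to when the work happens: at query time against each source, or ahead of time into one consolidated index. The sketch below contrasts the two under heavy simplification. The source names and titles are invented sample data, matching is whole-word only, and the function names are hypothetical; no actual discovery product works this simply.

```python
from collections import defaultdict

# Invented sample data standing in for separate content sources.
SOURCES = {
    "ils": ["Cloud computing for libraries", "Next-gen library catalogs"],
    "ejournals": ["Library discovery services and cloud platforms"],
    "repository": ["Digitizing local collections"],
}


def federated_search(term, sources=SOURCES):
    """Metasearch model: visit every source at query time and merge the hits."""
    results = []
    for name, records in sources.items():
        # In a live system this inner loop would be a slow remote query.
        for title in records:
            if term.lower() in title.lower().split():
                results.append((name, title))
    return results


def build_consolidated_index(sources=SOURCES):
    """Index-based model: harvest everything once, before any search happens."""
    index = defaultdict(set)
    for name, records in sources.items():
        for title in records:
            for word in title.lower().split():
                index[word].add((name, title))
    return index


def index_search(term, index):
    """A query is now a single lookup in the pre-built index."""
    return sorted(index.get(term.lower(), set()))


index = build_consolidated_index()
print(index_search("cloud", index))
```

Both approaches return the same records here. The practical difference is that the federated version is only as fast as its slowest source, while the index lookup is fast but can only find content that was harvested in advance.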
Public Library Information Portal
[Diagram labels: Search; Search Results; Consolidated Index; Pre-built harvesting and indexing; sources: LMS Data, Digital Collections, Web Site Content, Community Information, … Customer-provided content, Reference Sources, Aggregated Content packages, Archives; additional inputs: Usage-generated Data, Customer Profile]
Web-scale Search Problem
Problem in how to deal with resources not provided to ingest into consolidated index
[Diagram labels: Search; Search Results; Consolidated Index; Pre-built harvesting and indexing; indexed sources: ILS Data, Digital Collections, Web Site Content, Institutional Repositories, … E-Journals, Aggregated Content packages; outside the index (???): Non-Participating Content Sources]
Discovery Products
http://www.librarytechnology.org/discovery.pl
Challenge for Relevancy
Technically feasible to index hundreds of millions or billions of records through Lucene or Solr
Difficult to order records in ways that make sense
Many fairly equivalent candidates returned for any given query
Must rely on use-based and social factors to improve relevancy rankings
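One way to picture the use-based factor is as a second signal blended into the text-match score. The sketch below is an assumption-laden illustration, not how Lucene, Solr, or any discovery service actually computes relevance: the field names, the logarithmic dampening, and the weight of 0.5 are all arbitrary choices made for this example.

```python
import math


def combined_score(text_score, uses, usage_weight=0.5):
    """Blend a text-match score with a dampened usage count.

    log1p keeps a heavily used record from overwhelming the text match:
    going from 0 to 100 uses adds about 4.6 * usage_weight, not 100.
    """
    return text_score + usage_weight * math.log1p(uses)


def rank(records, usage_weight=0.5):
    """Order records from highest to lowest combined score."""
    return sorted(
        records,
        key=lambda r: combined_score(r["text_score"], r["uses"], usage_weight),
        reverse=True,
    )


# Three near-equivalent candidates for one query (invented numbers).
candidates = [
    {"id": "a", "text_score": 1.0, "uses": 0},
    {"id": "b", "text_score": 1.0, "uses": 100},
    {"id": "c", "text_score": 0.9, "uses": 5},
]
print([r["id"] for r in rank(candidates)])
```

With text scores this close together, the usage signal decides the order, which is the situation the slide describes: many fairly equivalent candidates for any given query.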
Challenges for Collection Coverage
To work effectively, discovery services need to cover comprehensively the body of content represented in library collections
What about publishers that do not participate?
Is content indexed at the citation or full-text level?
What are the restrictions for non-authenticated users?
How can libraries understand the differences in coverage among competing services?
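If a library had a list of identifiers for its own holdings and a list of what a discovery service indexes, the coverage question could be reduced to set arithmetic. The sketch below assumes both lists are available and use the same identifiers, which in practice is the hard part; the function name and the sample identifiers are hypothetical.

```python
def coverage_report(library_holdings, service_index):
    """Compare a library's holdings against the content a service has indexed.

    Both arguments are iterables of identifiers (for example ISSNs).
    Returns counts plus the list of held titles the service does not cover.
    """
    holdings = set(library_holdings)
    covered = holdings & set(service_index)
    missing = holdings - covered
    ratio = len(covered) / len(holdings) if holdings else 0.0
    return {
        "held": len(holdings),
        "covered": len(covered),
        "missing": sorted(missing),
        "ratio": ratio,
    }


# Invented identifiers, for illustration only.
holdings = ["1111-1111", "2222-2222", "3333-3333", "4444-4444"]
service_a = ["1111-1111", "2222-2222", "3333-3333", "9999-9999"]
print(coverage_report(holdings, service_a))
```

Running the same report against each competing service gives a like-for-like comparison, though it says nothing about whether content is indexed at the citation or full-text level.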
Open Discovery Initiative
NISO Work Group to Develop Standards and Recommended Practices for Library Discovery Services Based on Indexed Search
Informal meeting called at ALA Annual 2011
Co-Chaired by Marshall Breeding and Jenny Walker
Term: Dec 2011 – May 2013
http://www.niso.org/workrooms/odi/
Balance of Constituents
Libraries
  Marshall Breeding, Vanderbilt University
  Jamene Brooks-Kieffer, Kansas State University
  Laura Morse, Harvard University
  Ken Varnum, University of Michigan
  Anya Arnold, Orbis Cascade Alliance
  Sara Brownmiller, University of Oregon
  Lucy Harrison, College Center for Library Automation (D2D liaison/observer)
  Michele Newberry, Florida Virtual Campus
Publishers
  Lettie Conrad, SAGE Publications
  Beth LaPensee, ITHAKA/JSTOR/Portico
  Jeff Lang, Thomson Reuters
  Linda Beebe, American Psychological Assoc
  Aaron Wood, Alexander Street Press
  Roger Schonfeld, JSTOR, Ithaka
Service Providers
  Jenny Walker, Ex Libris Group
  John Law, Serials Solutions
  Michael Gorrell, EBSCO Information Services
  David Lindahl, University of Rochester (XC)
  Jeff Penka, OCLC (D2D liaison/observer)
ODI Project Goals:
Identify … needs and requirements of the three stakeholder groups in this area of work.
Create recommendations and tools to streamline the process by which information providers, discovery service providers, and librarians work together to better serve libraries and their users.
Provide effective means for librarians to assess the level of participation by information providers in discovery services, to evaluate the breadth and depth of content indexed and the degree to which this content is made available to the user.
New-generation Library Management
Fragmented Library Management
LMS for management of (mostly) print
Duplicative financial systems between library and local government or other parent organization
E-book lending platform (multiple?)
Interlibrary loan (borrowing and lending)
Self-service and AMH infrastructure
Electronic Resource Management
PC scheduling and print management
Event scheduling
Digital Collections Management platforms (CONTENTdm, DigiTool, etc.)
Discovery-layer services for broader access to library collections
No effective integration services / interoperability among disconnected systems, non-aligned metadata schemes
Integrated (for print) Library System
[Diagram labels: Interfaces layer: Staff Interfaces (Circulation, Cataloging, Acquisitions, Serials), Public Interfaces (Online Catalog); Business Logic layer; Data Stores layer: BIB, Holding / Items, Circ Transact, User, Vendor, Policies, $$$ Funds]
LMS / ERM: Fragmented Model
[Diagram labels: LMS: Staff Interfaces (Circulation, Cataloging, Acquisitions, Serials), Public Interfaces (Online Catalog), data stores (BIB, Holding / Items, Circ Transact, User, Vendor, Policies, $$$ Funds); Application Programming Interfaces; Protocols: CORE; ERM: License Management, License Terms, E-resource Procurement, Vendors, E-Journal Titles]
Common approach for ERM
[Diagram labels: LMS: Staff Interfaces (Circulation, Cataloging, Acquisitions, Serials), Public Interfaces (Online Catalog), data stores (BIB, Holding / Items, Circ Transact, User, Vendor, Policies, $$$ Funds); Application Programming Interfaces; ERM data: Budget, License Terms, Titles / Holdings, Vendors, Access Details]
Comprehensive Resource Management
No longer sensible to use different software platforms for managing different types of library materials
ILS + ERM + OpenURL Resolver + Digital Asset management, etc. very inefficient model
Flexible platform capable of managing multiple types of library materials, multiple metadata formats, with appropriate workflows
Academic Libraries need a new model of library management
Not an Integrated Library System or Library Management System
The ILS/LMS was designed to help libraries manage print collections
Generally did not evolve to manage electronic collections
Other library automation products evolved:
  Electronic Resource Management Systems
  OpenURL Link Resolvers
  Digital Library Management Systems
  Institutional Repositories
Library Services Platform
Library-specific software. Designed to help libraries automate their internal operations, manage collections, fulfill requests, and deliver services
Services
  Service-oriented architecture
  Exposes Web services and other APIs
  Facilitates the services libraries offer to their users
Platform
  General infrastructure for library automation
  Consistent with the concept of Platform as a Service
  Library programmers address the APIs of the platform to extend functionality, create connections with other systems, dynamically interact with data
Library Services Platform Characteristics
Highly shared data models
  Knowledgebase architecture
  Some may take hybrid approach to accommodate local data stores
Delivered through software as a service
  Multi-tenant
Unified workflows across formats and media
Flexible metadata management
  MARC, Dublin Core, VRA, MODS, ONIX
  Bibframe
  New structures not yet invented
Open APIs for extensibility and interoperability
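Flexible metadata management implies being able to move a record between formats. As a small illustration, the sketch below maps a handful of MARC fields to Dublin Core elements. The mapping covers only five field/subfield pairs and is a simplification assumed for this example; the tuple-based record representation and the function name are hypothetical, and a production crosswalk handles many more fields and edge cases.

```python
# Simplified, assumed mapping from (MARC tag, subfield) to Dublin Core element.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def marc_to_dublin_core(marc_fields):
    """Convert a list of (tag, subfield, value) tuples into a Dublin Core dict.

    Every element maps to a list, since Dublin Core elements are repeatable.
    Fields without an entry in MARC_TO_DC are skipped in this sketch.
    """
    record = {}
    for tag, subfield, value in marc_fields:
        element = MARC_TO_DC.get((tag, subfield))
        if element is not None:
            record.setdefault(element, []).append(value)
    return record


sample = [
    ("245", "a", "Cloud computing for libraries"),
    ("100", "a", "Breeding, Marshall"),
    ("650", "a", "Cloud computing"),
    ("650", "a", "Libraries"),
    ("999", "x", "local note"),  # unmapped local field, dropped
]
print(marc_to_dublin_core(sample))
```

Keeping the mapping in a plain dictionary, separate from the conversion logic, is one way to leave room for "new structures not yet invented": supporting another target format means adding a mapping rather than rewriting the function.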
Open Systems
Achieving openness has risen as the key driver behind library technology strategies
Libraries need to do more with their data
Ability to improve customer experience and operational efficiencies
Demand for interoperability
Open source: full access to the internal programming of the application
Open APIs: expose programmatic interfaces to data and functionality
New Library Management Model
[Diagram labels: Unified Presentation Layer (Search); Discovery Service; Consolidated index; content sources: Digital Coll, ProQuest, EBSCO, …, JSTOR, Other Resources; Library Services Platform; API Layer; connected systems: Learning Management, Enterprise Resource Planning, Stock Management, Self-Check / Automated Return, Authentication Service, Smart Card / Payment systems]
Library Services Platforms
WorldShare Management Services
  Responsible organization: OCLC
  Key precepts: Global network-level approach to management and discovery
  Software model: Proprietary
Alma
  Responsible organization: Ex Libris
  Key precepts: Consolidate workflows, unified management: print, electronic, digital; hybrid data model
  Software model: Proprietary
Intota
  Responsible organization: Serials Solutions
  Key precepts: Knowledgebase driven. Pure multi-tenant SaaS
  Software model: Proprietary
Sierra Services Platform
  Responsible organization: Innovative Interfaces, Inc
  Key precepts: Service-oriented architecture. Technology uplift for Millennium ILS. More open source components, consolidated modules and workflows
  Software model: Proprietary
Kuali OLE
  Responsible organization: Kuali Foundation
  Key precepts: Manage library resources in a format agnostic approach. Integration into the broader academic enterprise infrastructure
  Software model: Open Source
Development / Deployment perspective
Beginning of a new cycle of transition
Over the course of the next decade, academic libraries will replace their current legacy products with new platforms
Not just a change of technology but a substantial change in the ways that libraries manage their resources and deliver their services
Competing Models of Library Automation
Traditional Proprietary Commercial ILS
  Aleph, Voyager, Millennium, Symphony, Polaris, BOOK-IT, DDELibra, Libra.se, LIBERO, Amlib, Spydus, TOTALS II, Talis Alto, OpenGalaxy
Traditional Open Source ILS
  Evergreen, Koha
New generation Library Services Platforms
  Ex Libris Alma
  Kuali OLE (Enterprise, not cloud)
  OCLC WorldShare Management Services
  Serials Solutions Intota
  Innovative Interfaces Sierra (evolving)
Convergence
Discovery and Management solutions will increasingly be implemented as matched sets
  Ex Libris: Primo / Alma
  Serials Solutions: Summon / Intota
  OCLC: WorldCat Local / WorldShare Platform
  Except: Kuali OLE, EBSCO Discovery Service
Both depend on an ecosystem of interrelated knowledge bases
APIs exposed to mix and match, but efficiencies and synergies are lost
How do libraries make the transition?
Migrating to the Cloud
Infrastructure
Move existing applications to cloud hosting?
  Infrastructure as a service
  Marginal gains
Create platforms designed for cloud deployment
  Multi-tenant software as a service
Transition of services
Identify specific library services as candidates
What activities are performed by individual libraries that could be done more effectively collaboratively
Candidate services
Bibliographic support
Reference / Research support
Resource sharing
E-resource management
Resource discovery
Library management
Organizational strategy
Individual institutions make gains by moving legacy applications to hosted services
Amplify impact as new collaborative services are built that span organizations
Partnership opportunities
When to partner with existing service providers?
When to create services for a specific country or sector?
More than a technical transition
Transforming infrastructure
Transform resources
  Working toward shared infrastructure
  Identify areas where libraries can collaborate to share resources
Infrastructure transformation
  Bandwidth
  Shared services
Refocus development from stand-alone applications to platforms
Platform development
  APIs that allow individual libraries or campuses to consume content or services according to local needs
New conceptual models
Think beyond moving existing functionality
Re-evaluate the way that technical and information infrastructure supports the library in its strategic services to its parent institution
Candidates for Cloud-based Services
Identify services that can be provided at the national or international level
Resource sharing
  Document delivery
  Interlibrary loan
E-resource knowledgebase
Index-based discovery
Infrastructure
  Robust interconnectivity
  Development and support capacity
  Distributed data centers
Organization and personnel issues
Refocus efforts of technologists and technicians
  Away from redundant local implementations
  Toward collaborative broad-based cross-institutional services
Deployment and maintenance of conventional systems consumes all available resources
Library-by-library model least efficient
From software development to Platform development
Multi-tenant software as a service platforms that scale to meet the needs of the largest organizations or clusters of organizations
Consume platform services when available and appropriate
Create strategic platforms
Progressive consolidation of library services
Centralization of technical infrastructure of multiple libraries within a campus
Resource sharing support
  Direct borrowing among partner institutions
Shared infrastructure between institutions
  Examples:
    2CUL (Columbia University / Cornell University)
    Orbis Cascade Alliance (37 independent colleges and universities to merge into shared LSP)
Consolidation of library automation services
Centralized library services within institutions
Strategically cooperate between institutions
From software development to platform development
Refocus efforts of technology personnel
  Less attention to deployment of conventional systems
  More attention on broad-based services
Library-by-library automation model least efficient
Open source and Open Access
Open source development of platform services
  Open source infrastructure components
  Open APIs to expose platform services
Knowledge base components
  Open access
  Community maintained
  Adequately resourced
Reassess expectations of Technology
Many previous assumptions no longer apply
Technology platforms scale infinitely
No technical limits on how libraries share technical infrastructure
Cloud technologies enable new ways of sharing metadata
Build flexible systems not hardwired to any given set of workflows
Reassess workflow and organizational options
ILS model shaped library organizations
New Library Services Platforms may enable new ways to organize how resource management and service delivery are performed
New technologies more able to support strategic priorities and initiatives
Time to engage
Transition to new technology models just underway
More transformative development than in previous phases of library automation
Opportunities to partner and collaborate
Vendors want to create systems with long-term value
Question previously held assumptions regarding the shape of technology infrastructure and services
Provide leadership in defining expectations
Questions and discussion