Global Cooperation for Global Access:
The Million Book Project
Denise Troll Covey
Principal Librarian for Special Projects
Carnegie Mellon
CRIS 2004 – Antwerp, Belgium
The Million Book Project
• Digitize & provide open access to a million books
• Vision, leadership, & research – Carnegie Mellon
• $$ Equipment & travel – NSF
• $$ Labor & research – India & China
“Attempt to understand & solve
the technical, economic, & social policy
issues of providing online access
to all creative works of the human race.”
Raj Reddy
National Surveys of Students & Faculty
• 90% want convenient, speedy, easy access
– The only thing they want more is quality information
• 61% want remote access to full-text e-resources
• Fewer than half think the library meets these needs
• 48% start with Google or other Internet search engine
Gloriana St. Clair
National Surveys of Undergraduates
• 96% believe surface web information is adequate
• 72% use an Internet search engine
• 48% believe library web site information is inferior
• 46% use online resources all or most of the time
• Efficiency is more important than relevance
Michael Shamos
Carnegie Mellon Graduate Students
• 82% start with an Internet search engine
• Getting information from the web is at least twice as easy as getting it from library e-resources
• Using library e-resources is about as convenient as getting information from professors or classmates
• 24% often can’t get information when they need it
– Out of print books & old journals
Social Significance
• Help meet the need for convenient, speedy, easy,
remote access to quality academic resources
• Address disparity in library size & accessibility
• Democratize & facilitate new knowledge
• Support digital library research
• Preserve heritage
Collection of Collections
• What librarians select & partners want
– Books for College Libraries (BCL)
– Technical reports
– Cultural artifacts
– Government documents
• What we can acquire
– Bulk, cheap, fast
Nov 2001 – NSF Planning meeting
Michael Lesk
Seeking Copyright Permission for Open Access
• Increased success: improved request letter, prompt follow up, nature of collection, & ability to preview
• University presses, scholarly associations, & estates are more likely than commercial presses to grant permission
• Transaction cost of $78 per volume is too expensive
                      Response rate   Success rate    Success rate
                      per contacts    per responses   per contacts
Random books          58%             43%             25%
Posner fine books *   76%             70%             53%
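The three rates in the table appear to be linked: success per contact is roughly the response rate multiplied by the success rate per response. The sketch below checks that relationship against the reported figures. The function name `success_per_contact` is illustrative and not part of the original study.

```python
def success_per_contact(response_rate: float, success_per_response: float) -> float:
    """Share of all contacted owners who granted permission.

    Assumes the rates multiply, i.e. permission can only come
    from owners who responded.
    """
    return response_rate * success_per_response


# (response rate, success rate per response) as reported on the slide
studies = {
    "Random books": (0.58, 0.43),
    "Posner fine books": (0.76, 0.70),
}

for name, (responded, granted) in studies.items():
    print(f"{name}: {success_per_contact(responded, granted):.0%}")
```

Rounded to whole percentages, the products match the 25% and 53% reported in the last column.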
Shift from Per Title to Per Publisher
[Chart comparing Initial and Current: Indigenous Materials, Public Domain, In Copyright]
Requires 18% success rate with BCL publishers & 500 books each
Copyright Permission Request Letter
• Educate
– Users want to find information online, but use print
– Online access increases use, even use of older works
– Open access does not decrease & can increase sales
– Currently no revenue from out-of-print books
Request & Incentive
• Ask for non-exclusive permission to digitize
– All out of print, in copyright titles
– All titles published prior to a date of their choosing
– All titles published # or more years ago
– List of titles they provide
• Assure
– Following preservation standards & copyright law
– Print & save only one page at a time
• Give – images, metadata, & OCR $$$$
Early – Preliminary – Statistics
                          Million Book       Posner
                          Copyright Owners   Copyright Owners
Total                     206                107
1. Owners contacted       100%               65%
2. Owners responded       24%                76%
3. Success - Responses    57%                70%
   Success - Contacts     14%                53%
Nov 2003 – Mar 2004
Many more follow up negotiations to be done
Don’t yet know number of titles
Success Rate Comparison
[Chart: success rate (0% to 100%), based on responses, for Random books, Posner books, & Million books, by owner type: Scholarly associations, University presses, Commercial publishers, Authors/Estates, Other]
Digital Registry
• Registry of reproductions of books & journals digitized or queued for digitization
– Reduce duplication
– More access for less cost
• Registry signals
– Intent to preserve & make accessible in entirety
– Compliance with standards & best practices
– Professionally managed storage & maintenance
– Use copy available for public access
Release May 31, 2004
Acquisitions & Shipping
• Acquisitions
– Copyrighted books – OCLC locating in partner libraries
– Out of copyright – weeding; depositories; duplicates
• Lessons learned from pilot shipment to India
– Reduce cost to $1 per book round trip by changing packing
– Reduce time by distributed shipping & knowing customs
• Lessons learned working with China
– Customs & content issues initially prohibited shipping
– Scanning centers declared free enterprise zones 2004
Metadata & Digitization
• Following standards
• Operators scan & post-process
– Above average wages
– 4000 books per year per scanner (two shifts per day)
– 400,000 books per year with 100 scanners
• Librarians capture metadata
– Bibliographic: MARC or DC
– Administrative: copyright permission & source library
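The scanning figures above imply a simple throughput calculation. The sketch below reproduces it and estimates how long a million books would take at that rate. It assumes the per-scanner rate of 4000 books per year holds constant across all scanners. The function names are illustrative.

```python
def books_per_year(scanners: int, books_per_scanner_year: int = 4000) -> int:
    """Annual output, assuming every scanner sustains the same rate."""
    return scanners * books_per_scanner_year


def years_to_digitize(total_books: int, scanners: int,
                      books_per_scanner_year: int = 4000) -> float:
    """Years needed to scan total_books at a constant rate."""
    return total_books / books_per_year(scanners, books_per_scanner_year)


print(books_per_year(100))                # annual output with 100 scanners
print(years_to_digitize(1_000_000, 100))  # years for a million books
```

With 100 scanners this gives the 400,000 books per year stated above, so under these assumptions a million books would take about two and a half years of scanning.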
Sustainability
• Following standards will enable migration
• Organizations committed to host the Collection
– Carnegie Mellon
– Internet Archive
– Perhaps OCLC
• Goal is to have ten mirror sites
– Estimated cost is one million dollars
– Estimated size is 20 terabytes
– University of California at Merced
– DL of India
– China
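As a rough check on the size estimate, the sketch below works out the average storage per book. It assumes the 20 terabytes covers the full collection of one million books and that a terabyte is 10^12 bytes. Neither assumption is stated on the slide.

```python
TB = 10**12  # assumption: decimal terabytes


def bytes_per_book(collection_tb: int, books: int) -> int:
    """Average storage per book, in bytes (integer division)."""
    return collection_tb * TB // books


size = bytes_per_book(20, 1_000_000)
print(f"{size / 10**6:.0f} MB per book")
```

Under those assumptions each book averages about 20 MB, covering page images, metadata, & OCR text.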
Brewster Kahle
Issues & Next Steps
• Adding value
– Negotiating with Amazon.com for print on demand
• Updating workflow & processing the backlog
• Coordinating acquisition & shipping
• Integrating the collection
• Improving the interface
• Copyright permission work