Date posted: 07-Aug-2015
Category: Education
Uploaded by: bobby-dhea-santos
Reinhard Feldmann
Data Curation / Digitisation / iCloud / Long-term storage of digital data
Legaspi, Aquinas University, 10th to 12th April 2014
Data curation
Reinhard Feldmann
General Introduction
Tape, CD, DVD
Preserving the past: Digitisation strategies in Germany
Final remarks
Microfilming or Digitisation?
Intelligence program during World War II
Civil Microfilming since World War II
Microfilming has been a success
Digitising the microfilms
DAMP: Digitising of ageing microfilm project
New Complexity
"Born-digital" documents
Migration vs. Emulation
Digital "dark ages" (Example: U.S. Elections)
New formats (books – music)
Machine-readable texts
Subscription – preservation – commercial?
Users – Libraries
Complex matrix of issues
Audio- and Videotapes
History:
1888 Iron wire
1928 Steel and paper tape technology
1935 First non-metallic tape (Berlin: AEG): cellulose acetate
Compact cassette: 1964 (Philips)
Video: 1975 (Sony)
Audio- and Videotapes
Damages
Vinegar syndrome
Hydrolysis of binding agent
Tape reel
External magnetic fields?
Audio- and Videotapes
Best Storage Conditions
Climate conditions
8 °C / 25% RH
Separate storage
Audio- and Videotapes
Summary
Permanent data carrier vs. permanent data
Migration (expensive)
Optimal conditions for storage
Further information
http://www.restaumedia.de (!)
http://www.memoriav.ch
http://www.forum-bestandserhaltung.de
http://www.tape-online.net
Books to be scanned: calculation by the "German Distributed Library"

Period        Editions    Avg. pages    Total pages
up to 1500      27,000       235          6,345,000
1501–1600      140,000       220         30,800,000
1601–1700      265,000       213         56,445,000
1701–1800      600,000       300        180,000,000
1801–1870      511,978       245        125,434,610
1871–1900      525,000       245        128,625,000
Total        2,068,978       255        527,649,610
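The arithmetic behind the table can be re-derived in a few lines. The following is a minimal sketch, assuming each total is simply editions multiplied by average pages and that the overall average is total pages divided by total editions; the names ROWS, page_totals and summary are illustrative and not part of the original calculation.

```python
# Figures copied from the table: (period, editions, average pages per edition).
ROWS = [
    ("up to 1500", 27_000, 235),
    ("1501-1600", 140_000, 220),
    ("1601-1700", 265_000, 213),
    ("1701-1800", 600_000, 300),
    ("1801-1870", 511_978, 245),
    ("1871-1900", 525_000, 245),
]


def page_totals(rows):
    """Return total pages per period (editions x average pages)."""
    return {period: editions * avg for period, editions, avg in rows}


def summary(rows):
    """Return (total editions, total pages, rounded overall average pages)."""
    total_editions = sum(editions for _, editions, _ in rows)
    total_pages = sum(editions * avg for _, editions, avg in rows)
    return total_editions, total_pages, round(total_pages / total_editions)


if __name__ == "__main__":
    for period, pages in page_totals(ROWS).items():
        print(f"{period:>10}: {pages:>12,} pages")
    print(summary(ROWS))
```

Under these assumptions the per-period products and the grand totals agree with the figures given on the slide.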
Criteria for digitisation
Relevant for research
Rare and precious
Preservation needs
Persistent addressing
Example: Erstlich wolgedeutes Böhmisches Glücks vnd VnglücksRath : hernach in Radt/ doch mit der That schad vnd vnrath Wesen. [S.l.] 1621
Shelf mark of the original: Einbl. Xb FM 28
Shelf mark of the digital copy: drucke/einbl-xb-fm-28 (basic element of all identifiers; ASCII 7-Bit, one-word conversion from the original shelf mark)
URL: http://diglib.hab.de/wdb.php?dir=drucke/einbl-xb-fm-28 (since 2004)
PURL: http://diglib.hab.de/drucke/einbl-xb-fm-28/start.htm (since 1998)
URN: urn:nbn:de:gbv:23-drucke/kb-53-2f-25 (since 2005)
Resolver at the library (used for URNs and other identifiers):
http://diglib.hab.de/?urn=urn:nbn:de:gbv:23-drucke/einbl-xb-fm-285
Resolver at the German National Library:
http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:gbv:23-drucke/einbl-xb-fm-285
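The addressing scheme above can be sketched in code. This is only an illustration, assuming the "one-word conversion" means lower-casing the shelf mark and replacing every run of non-alphanumeric characters with a hyphen; the function names and the default collection prefix "drucke" are hypothetical, and the URN is passed in as a given string rather than computed.

```python
import re


def shelf_mark_to_id(shelf_mark: str, collection: str = "drucke") -> str:
    """Turn an original shelf mark into a 7-bit ASCII, one-word identifier.

    Assumed rule (not confirmed by the source): drop non-ASCII characters,
    lower-case, and join the alphanumeric parts with hyphens.
    """
    ascii_only = shelf_mark.encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
    return f"{collection}/{slug}"


def access_points(digital_id: str, urn: str) -> dict:
    """Build the address forms shown on the slide from one identifier."""
    return {
        "url": f"http://diglib.hab.de/wdb.php?dir={digital_id}",
        "purl": f"http://diglib.hab.de/{digital_id}/start.htm",
        "library_resolver": f"http://diglib.hab.de/?urn={urn}",
        "national_resolver": f"http://nbn-resolving.de/urn/resolver.pl?urn={urn}",
    }


if __name__ == "__main__":
    digital_id = shelf_mark_to_id("Einbl. Xb FM 28")
    urn = "urn:nbn:de:gbv:23-drucke/einbl-xb-fm-285"  # taken from the slide as-is
    for name, address in access_points(digital_id, urn).items():
        print(f"{name}: {address}")
```

The point of the design is that a single stable string derived from the shelf mark feeds every address form, so the web application can change without breaking citations.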
Long-term storage of digital data
How long is "long-term"?
What is "secure storage"?
What data are we saving?
“state of the art”
Financial remarks
Semantic problem I: How long is long-term?
"five years or more" (IFLA 2006)
"Data should normally be preserved and accessible for not less than 10 years for any projects, and for projects of clinical or major social, environmental or heritage importance, the data should be retained for up to 20 years, and preferably permanently within a national collection, or as required by the funder's data policy." (Research Councils UK 2008)
"a period of time long enough for there to be concern about the impacts of changing technologies (...) on the information being held in a repository. This period extends into the indefinite future." (CCSDS 2002: 1-11)
"'Long-term' is an unspecified period during which unforeseeable technological or sociocultural changes may take place." (Nestor 2008)
Long-term or forever?
Semantic problem II: What is secure?
Conservation of a bitstream?
Usability of data? What does usability mean?
Storage of the content?
Data and machines (Emulation)
Data and applications?
Semantic Context?
Layout?
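The narrowest reading in the list above, conservation of a bitstream, is the one that can be checked mechanically. The following is a minimal sketch of a fixity check, assuming a SHA-256 digest is recorded when the data is stored and compared again later; the function names are illustrative, and the check says nothing about usability, context or layout.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """True if the data still matches the digest recorded at ingest."""
    return sha256_of(data) == recorded_digest


if __name__ == "__main__":
    stored = b"example bitstream"          # stands in for a stored file
    recorded = sha256_of(stored)            # digest recorded at ingest
    print(is_intact(stored, recorded))      # unchanged copy
    print(is_intact(stored + b"\x00", recorded))  # copy with one extra byte
```

A matching digest only shows that the bits are unchanged; whether the content can still be rendered and understood is the subject of the remaining questions.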
Semantic problem III:
What are we saving?
Bitstream?
Data?
Digital documents?
Digital representations of analogue documents?
Contents?
Information?
Information networks?
Knowledge? And how does information create knowledge?
Everything?
Financial remarks
"Like almost all engineering problems, bit preservation is fundamentally a question of budgets." (David S. H. Rosenthal)
"A quick review of the literature reveals no consensus on metrics or factors for calculating all the costs involved in digitizing a book." (http://hurstassociates.blogspot.com/2008/04/costs-of-large-scale-digitization.html)
Theses
Long-term storage must be considered under the conditions of the WWW: international, network-based, distributed.
"Everything" and "always" are impossible: set priorities and make decisions!
Keeping single objects is not what matters; information and context create permanent knowledge.
Analogue or Digital?
Analogue and Digital!