
Legaspi 02 data_curation_powerpoint

Date post: 07-Aug-2015
Upload: bobby-dhea-santos
Reinhard Feldmann Data Curation / Digitisation / iCloud / Long-term storage of digital data Legaspi, Aquinas University 10th to 12th April 2014
Transcript

Reinhard Feldmann

Data Curation / Digitisation / iCloud / Long-term storage of digital data

Legaspi, Aquinas University, 10th to 12th April 2014

Data curation

Reinhard Feldmann

General Introduction

Tape, CD, DVD

Preserving the past: Digitisation strategies in Germany

Final remarks

Microfilming or Digitisation?

Intelligence program during World War II

Civil Microfilming since World War II

Microfilming has been a success

Digitising the microfilms

DAMP: Digitising of ageing microfilm project

Microfilm scanner

New Complexity

“Born-digital” documents

Migration vs. emulation

Digital “dark ages” (example: U.S. elections)

New formats (books, music)

Machine-readable texts

Subscription – preservation – commercial?

Users – Libraries

Complex matrix of issues

Audio and video tapes

History:

1888 Iron wire

1928 Steel and paper tape technology

1935 First non-metallic tape (Berlin: AEG): cellulose acetate

1964 Compact cassette (Philips)

1975 Video (Sony)

Audio and video tapes

Damage

Vinegar syndrome

Hydrolysis of the binding agent

Tape reel

External magnetic fields?

Audio and video tapes

Best storage conditions

Climate conditions

8 °C / 25% RH

Separate storage

Audio and video tapes

Summary

Permanent data carrier vs. permanent data

Migration (expensive)

Optimal conditions for storage

Duke August Library

Main Hall

Books to be scanned: calculation by the “German Distributed Library”

Century      Editions     Pages (average)    Total pages
–1500           27,000    235                  6,345,000
1501–1600      140,000    220                 30,800,000
1601–1700      265,000    213                 56,445,000
1701–1800      600,000    300                180,000,000
1801–1870      511,978    245                125,434,610
1871–1900      525,000    245                128,625,000
Total        2,068,978    255                527,649,610
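The page estimate in the table is simply editions multiplied by average pages, summed over the periods. The following is a minimal sketch that recomputes it; the figures are copied from the slide, while the names ROWS and estimate_pages are illustrative and not part of the original calculation.

```python
# Figures copied from the slide: (period, editions, average pages per edition).
ROWS = [
    ("-1500", 27_000, 235),
    ("1501-1600", 140_000, 220),
    ("1601-1700", 265_000, 213),
    ("1701-1800", 600_000, 300),
    ("1801-1870", 511_978, 245),
    ("1871-1900", 525_000, 245),
]


def estimate_pages(rows):
    """Return per-period page counts, total editions, total pages, and the weighted average."""
    per_period = {period: editions * avg for period, editions, avg in rows}
    total_editions = sum(editions for _, editions, _ in rows)
    total_pages = sum(per_period.values())
    # The overall average is weighted by the number of editions in each period.
    weighted_avg = total_pages / total_editions
    return per_period, total_editions, total_pages, weighted_avg


per_period, total_editions, total_pages, weighted_avg = estimate_pages(ROWS)
print(total_editions, total_pages, round(weighted_avg))  # 2068978 527649610 255
```

The overall average of 255 pages is weighted by editions per period, which is why it differs from the plain mean of the six per-period averages.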

www.deutsche-digitale-bibliothek.de

Criteria for digitisation

Relevant for research

Rare and precious

Preservation needs

Preserving the Past

The Wolfenbüttel Book Reflector

45° opening angle

The Wolfenbüttel Book Reflector

90° opening angle

Graz Book Cradle, used in research libraries

ScanRobot 2.0

File server at the Duke August Library

RAID arrays with hard disks

Server for master …

… and derivative files

Persistent addressing

Example: Erstlich wolgedeutes Böhmisches Glücks vnd VnglücksRath : hernach in Radt/ doch mit der That schad vnd vnrath Wesen. [S.l.] 1621

Shelf mark of the original: Einbl. Xb FM 28

Shelf mark of the digital copy: drucke/einbl-xb-fm-28 (basic element of all identifiers; 7-bit ASCII, one-word conversion of the original shelf mark)

URL: http://diglib.hab.de/wdb.php?dir=drucke/einbl-xb-fm-28 (since 2004)

PURL: http://diglib.hab.de/drucke/einbl-xb-fm-28/start.htm (since 1998)

URN: urn:nbn:de:gbv:23-drucke/kb-53-2f-25 (since 2005)

Resolver at the library (used for URNs and other identifiers):

http://diglib.hab.de/?urn=urn:nbn:de:gbv:23-drucke/einbl-xb-fm-285

Resolver at the German National Library:

http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:gbv:23-drucke/einbl-xb-fm-285
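The slide describes the digital shelf mark as a 7-bit ASCII, one-word conversion of the original shelf mark that then serves as the basis of every identifier. The sketch below shows one way such a conversion could work. The exact rule used by the library is not given in the slides, so the normalisation (strip accents, lowercase, join alphanumeric runs with hyphens) is an assumption, and the function names are hypothetical. Only the host and URL patterns are taken from the slide.

```python
import re
import unicodedata

BASE = "http://diglib.hab.de"  # host as shown on the slide


def digital_shelf_mark(original: str, collection: str = "drucke") -> str:
    """Convert an original shelf mark into a one-word, 7-bit ASCII form.

    Assumed rule (not confirmed by the source): drop accents, lowercase,
    and replace every run of non-alphanumeric characters with a hyphen.
    """
    ascii_text = (
        unicodedata.normalize("NFKD", original)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    return f"{collection}/{slug}"


def viewer_url(mark: str) -> str:
    """URL pattern shown on the slide (in use since 2004)."""
    return f"{BASE}/wdb.php?dir={mark}"


def purl(mark: str) -> str:
    """PURL pattern shown on the slide (in use since 1998)."""
    return f"{BASE}/{mark}/start.htm"


mark = digital_shelf_mark("Einbl. Xb FM 28")
print(mark)              # drucke/einbl-xb-fm-28
print(viewer_url(mark))  # http://diglib.hab.de/wdb.php?dir=drucke/einbl-xb-fm-28
print(purl(mark))        # http://diglib.hab.de/drucke/einbl-xb-fm-28/start.htm
```

Deriving every address from one stable string means that the URL scheme can change over time while the shelf mark, and therefore the object's identity, stays the same. The sketch does not generate URNs, because the slide does not explain how the trailing character of the URN is formed.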

Long-term storage of digital data

How long is “long-term”?

What is “secure storage”?

What data are we saving?

“state of the art”

Financial remarks

Semantic problem I: How long is long-term?

"five years or more" (IFLA 2006)

“Data should normally be preserved and accessible for not less than 10 years for any projects, and for projects of clinical or major social, environmental or heritage importance, the data should be retained for up to 20 years, and preferably permanently within a national collection, or as required by the funder's data policy." (Research Councils UK 2008)

"a period of time long enough for there to be concern about the impacts of changing technologies (...) on the information being held in a repository. This period extends into the indefinite future." (CCSDS 2002: 1-11)

“‘Long-term’ is an unspecified period during which unknown technological or sociocultural changes may take place.” (Nestor 2008)

Long-term or forever?

Semantic problem II: What is secure?

Conservation of a bitstream?

Usability of data? What does usability mean?

Storage of the content?

Data and machines (Emulation)

Data and applications?

Semantic Context?

Layout?
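Of the questions above, only the first one, conservation of a bitstream, can be checked mechanically. A common way to do so is a fixity check: record a checksum when a file is stored and compare it again later. The following is a minimal sketch using SHA-256; the slides do not name an algorithm or tool, so the choice of hash and the function names are assumptions for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def bitstream_intact(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest time."""
    return sha256_of(data) == recorded_digest


# Hypothetical master file contents, used only for the demonstration.
master = b"example master file contents"
recorded = sha256_of(master)  # stored alongside the file at ingest

print(bitstream_intact(master, recorded))   # True

damaged = master[:-1] + b"X"  # simulate a single corrupted byte
print(bitstream_intact(damaged, recorded))  # False
```

A passing check only shows that the bits are unchanged. It says nothing about whether the data are still usable, whether an application or emulator exists to open them, or whether semantic context and layout have survived, which are the remaining questions on the slide.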

Semantic problem III: What are we saving?

Bitstream?

Data?

Digital documents?

Digital representations of analogue documents?

Contents?

Information?

Information networks?

Knowledge? And how does information create knowledge?

Everything?

Financial remarks

“Like almost all engineering problems, bit preservation is fundamentally a question of budgets.” (David S. H. Rosenthal)

“A quick review of the literature reveals no consensus on metrics or factors for calculating all the costs involved in digitizing a book.” http://hurstassociates.blogspot.com/2008/04/costs-of-large-scale-digitization.html

Theses

Long-term storage must be considered under the conditions of the World Wide Web: international, network-based, distributed.

“Everything” and “always” are impossible: set priorities and make decisions!

What matters is not keeping single objects: information and context create permanent knowledge.

Analogue or Digital?

Analogue and Digital!

Thanks

Thank you for your attention!

Special thanks to:

Dr. Thomas Stäcker (Duke August Library Wolfenbüttel)

Prof. Dr. Stefan Gradmann (Humboldt University Berlin)

