Michael B. Toth [email protected] R.B. Toth Associates
www.rbtoth.com
DHSS 2014 Challenges in Digitization Studies
Integration of Spectral Imaging as an Effective Preservation Support Tool
@michabt
Eureka! Medieval Manuscripts on the Web
Applying Digitization
…to Preservation Support, Manuscript Studies, and Global Access
Providing Access
1. Access to data
• By People
• By Machines
2. Intellectual Property
• Global Storage & Access
…and “Digital Preservation”
…for Global Sharing
Applied Science & Technology…
…in Support of Users
Archimedes Palimpsest Program Timeline, 1998-2008
Timeline tracks: Manuscript Study; Information and Imaging Technology; Imaging
Sep 1998: Google founded, 10,000 queries/day
Oct 1998: Palimpsest purchased
1998: Kodak DC210+ 1.0 Megapixel Camera
2000: Study Phase
Jan 2001: Proof of Concept
Apr 2001 – Nov 2006: Phased Optical Imaging and Stitching
Oct 2001: XML 2nd Ed. Standard
2001: Kodak DCS 760 6.1 Megapixel Camera
2001: Archimedes Draft Metadata Standard
2001: Method Sciamus Paper
Sep 2003: Firefox Released
2003: Kodak DX 6440 4 Mp Camera
2003: Archimedes Forum
2003: Stomachion NYTimes Article
April 2004: ADITUP Conference
2004: Hyperides
2005: Aristotle Commentary
2005-2006: XRF Imaging
2006: POC Data Release
2007: Google indexes 4.28+ billion web pages, 250+ million search results/day
2007: Kodak V1003 10 Mp Camera
2007: Stokes 256 Mp Imaging System
Aug 2007: Optical Imaging
2007-2008: Transcriptions
2008: Data Release
Technology Development
Prototype LED Illumination System, November 2006, Walters Art Museum
Labeled components: Camera, Manuscript, Optical Fibers, LEDs
“National Treasure: Book of Secrets” December 2007
“Imaging” of page fragment from “John Wilkes Booth’s diary” using prop system based on 2006 LED system
Justin Bartha Nicolas Cage Diane Kruger
2007 LED Illumination
Bill Christens-Barry
Equipoise Imaging
Prototype LED Panel
UV LED
• λ0 = 365 nm
7 visible LEDs
• 445 nm
• 470 nm
• 505 nm
• 530 nm
• 570 nm
• 617 nm
• 625 nm
4 Infrared LEDs
• 700 nm
• 735 nm
• 780 nm
• 870 nm
Two Panels, each with six banks of 12 visible and UV LEDs + two banks of 4 IR LEDs
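The wavelength list above can be captured as a small data structure that drives an imaging sequence. The following is only a sketch: the wavelengths come from the slide, while the `Band` class, the `capture_sequence` function, and the file-naming pattern are hypothetical illustrations, not the program's actual software or naming convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """One illumination band of the LED panel."""
    wavelength_nm: int
    region: str  # "UV", "visible", or "IR"


# Wavelengths as listed on the slide.
UV = [365]
VISIBLE = [445, 470, 505, 530, 570, 617, 625]
INFRARED = [700, 735, 780, 870]


def capture_sequence():
    """Return all bands ordered from shortest to longest wavelength."""
    bands = [Band(w, "UV") for w in UV]
    bands += [Band(w, "visible") for w in VISIBLE]
    bands += [Band(w, "IR") for w in INFRARED]
    return sorted(bands, key=lambda b: b.wavelength_nm)


def image_filename(shelfmark, folio, band):
    """Build a file name for one capture (hypothetical naming pattern)."""
    return f"{shelfmark}_{folio}_{band.wavelength_nm:04d}nm_{band.region}.tif"


if __name__ == "__main__":
    for band in capture_sequence():
        print(image_filename("MS1", "f001r", band))
```

Keeping the wavelength in each file name is one way to keep images self-describing even if they become separated from their metadata records.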
LED Illumination
Imaging Systems
System Development
Development Process
• Integrate with Work Processes and Data Standards
• Assess Cultural Heritage Needs
• Develop System to Meet User/Stakeholder Needs
• Image, Process, Distribute Data, Review
• Develop Phased Plan, Schedule, Budget, Goals
• Integrate Feedback into Next Deliverable
• Review Plans with Users, Stakeholders & Team
Inputs: Funding, Goals, Standards, Deliverables
Design Elements
• Technology: Proof of Concept, Testbeds, Prototypes, Software
• Documentation: Review Reports, User Feedback, User and System Metrics, Operational Manuals
• Risk Reduction Evaluation:
– Test technical capability
– Identify development alternatives
• Milestone Reviews:
– Incorporate stakeholder/user feedback
– Goals met for each phase
– Inputs for next development phases
St. Catherine’s Monastery, Sinai 2009- Present
Syriac Galen Palimpsest 2009- Present
David Livingstone Diaries
XML Transcriptions
Metadata in Transcriptions
Integration of Data
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
System Integration
Technology
Processes People
• Requirements definition: Program phasing to meet funding & schedule
• Systems integration: New technologies & appropriate skills
• Process engineering: Development of efficient work processes
• Standardization: Application of broadly accepted standards
Program Management
1. Governance: Models for repository and data management with transparency and institution-wide acceptance
2. Staffing and Work Processes: Common current and future work processes supported by needed skills base
3. Cataloging and Metadata: Cataloging information and metadata integrated with data set
4. Standards: International mature standards used in storing and maintaining digital data set, including metadata
5. Quality Control: Security, monitoring and auditing of any manipulation of data and/or supporting infrastructure
6. Content Management and Access: Continued Content Management and Access throughout full lifecycle
7. Outreach and External Collaboration: External collaboration and access to ensure broad storage and common standards
Full Lifecycle Program
Imaging techniques:
• Color Digital Imaging
• X-Ray Synchrotron
• Experimental Spectral Imaging, 3D Imaging
• Production Spectral Imaging & Processing
Relative cost scale: $$$$ €€€€ / $$$ €€€ / $$ €€ / $ €
Potential Applications
• Image Collection
– Integrated illumination system and camera
– Rapid set-up, ease of operator use
– Integration of data and metadata
• Processing
– Broadly accessible image processing algorithms
– Sufficient power to digitally process large images
– Ease and speed of use
• Data Storage
– Sufficient storage for multiple large image sets
– Standardized data and metadata
– Long-term data viability
• Dissemination and Access
– Integration of data from processing, studies
– Access and dissemination by multiple users
– Rapid access & transfer of large image files via Internet
Technical System & Infrastructure
Metadata
Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.
National Information Standards Organization, 2004
Cataloging & Metadata
• Metadata Integrated with Digital Object
– Adherence to broadly accepted standards
– Simple, flat metadata records
• Persistent Identifiers
• Accepted Standards
– Standardized Vocabularies
– Metadata Schema
– XML to support conversion to other formats (e.g. MARC, MODS, EAD)
• Document & Preserve Standards
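A simple, flat metadata record with a persistent identifier can be serialized to XML in a few lines. This is a minimal sketch under stated assumptions: the field names and the identifier value are hypothetical placeholders, not elements of any project schema, and the keys are assumed to be valid XML element names.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record, root_tag="record"):
    """Serialize a flat dict (no nesting) to a one-level XML document."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)  # key must be a valid XML name
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Read the one-level XML document back into a flat dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Hypothetical example record; the values are placeholders.
example = {
    "identifier": "example-0001",
    "title": "Folio 1 recto, 365 nm",
    "format": "image/tiff",
}

if __name__ == "__main__":
    print(record_to_xml(example))
```

Because the record is flat, the round trip through XML loses nothing, which is what makes later conversion to other schemas straightforward.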
Six Types of Metadata Elements:
1. Identification Information
2. Spatial Data Reference Information
3. Imaging & Spectral Data Reference Information
4. Data Type Information
5. Data Content Information
6. Metadata Reference Information
(including XRF Extensions)
http://www.archimedespalimpsest.net/Documents/Internal/
Spectral Imaging Metadata
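One way to use the six element types is as a completeness check before an image set is released. The sketch below assumes a record is a dictionary with one entry per element type; the section keys are hypothetical shorthand for the six types named on the slide, not the element names defined in the standard itself.

```python
# Hypothetical shorthand keys for the six metadata element types.
REQUIRED_SECTIONS = (
    "identification",
    "spatial_data_reference",
    "imaging_spectral_data_reference",
    "data_type",
    "data_content",
    "metadata_reference",
)


def missing_sections(record):
    """Return the element types that are absent or empty in a record."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


if __name__ == "__main__":
    partial = {"identification": {"shelfmark": "example"}}
    print(missing_sections(partial))
```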
Dublin Core Metadata Initiative Element Set
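A crosswalk from local metadata fields to the fifteen Dublin Core elements might look like the sketch below. The fifteen element names are those of the Dublin Core Metadata Element Set; the local field names and the mapping itself are hypothetical and would need to be agreed with stakeholders.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical local field names mapped to Dublin Core elements.
CROSSWALK = {
    "shelfmark": "identifier",
    "folio_label": "title",
    "imaging_date": "date",
    "file_format": "format",
    "license": "rights",
}

# Guard against typos in the mapping targets.
assert set(CROSSWALK.values()) <= DC_ELEMENTS


def to_dublin_core(local_record):
    """Map a flat local record to Dublin Core; report fields with no mapping."""
    dc = {}
    unmapped = []
    for field, value in local_record.items():
        target = CROSSWALK.get(field)
        if target is None:
            unmapped.append(field)
        else:
            dc.setdefault(target, []).append(value)  # DC elements are repeatable
    return dc, sorted(unmapped)
```

Reporting unmapped fields rather than silently dropping them makes gaps in the crosswalk visible during review.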
Including:
• Catalog Information
• Metadata
• File Format
• Conservation Data
• Intellectual Property
• Storage
• Others
Standardize Digital Products
Metadata & Standards Development:
• Define consensus standards and metadata elements to be used in pilot digitization
– Include required cataloging & metadata standards, standardized vocabulary, schemas and crosswalks
Metadata and Standards Review:
• Define, document, review key metadata elements and pilot project standards with major stakeholders and partners.
Data Planning
Content Management System
• Data Repository (CMS)
– Master Server: Master Files, Cataloging Data, Restricted Access Files
– Application Server: User Data
– PREMIS Archive: Dark Archive
• Access Infrastructure
• Security
Simple Data Layout
ReadMe
Technical ReadMe
Data
Supplemental
Access Core Images
Access Other Images
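A layout this simple can be created and checked with a short script. The sketch below treats the six labels above as top-level entries; whether the ReadMe items are plain-text files, how the image folders are nested, and the exact names on disk are all assumptions made for illustration.

```python
from pathlib import Path

# Entry names taken from the slide labels; kinds and extensions are assumptions.
LAYOUT = {
    "ReadMe.txt": "file",
    "Technical ReadMe.txt": "file",
    "Data": "dir",
    "Supplemental": "dir",
    "Access Core Images": "dir",
    "Access Other Images": "dir",
}


def create_layout(root):
    """Create the empty top-level layout under root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for name, kind in LAYOUT.items():
        path = root / name
        if kind == "dir":
            path.mkdir(exist_ok=True)
        else:
            path.touch()


def missing_entries(root):
    """List expected entries that are absent or of the wrong kind."""
    root = Path(root)
    missing = []
    for name, kind in LAYOUT.items():
        path = root / name
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(name)
    return missing
```

Running `missing_entries` against each distributed copy gives a quick check that a data set arrived with its documentation intact.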
Impermanence of Digital Data
Dynamic technology, media & formats
• Rapid obsolescence requires regular reformatting
• Ensure utility of simple data
• Broad distribution to service providers
• Standardized formats & encoding
Digital “Preservation”
• Sufficient storage for multiple large image sets
• Globally replicate data online
• Rapid access via Internet
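When data is replicated across sites, a checksum manifest offers a simple way to confirm that each copy still matches the master. The slides do not describe a specific verification method, so the following is only an illustrative sketch using SHA-256 from the Python standard library; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(master, replica):
    """Return (files missing from the replica, files whose content differs)."""
    missing = sorted(set(master) - set(replica))
    changed = sorted(
        name for name in master
        if name in replica and master[name] != replica[name]
    )
    return missing, changed
```

Storing the manifest as a plain-text file alongside the data keeps the check usable even after the tools that created it are obsolete.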
Data & Metadata
• Long-term data set viability beyond the lifetime of current technologies
– Adherence to existing, broadly accepted standards
– Simple, flat digital records
• Critical for access, sharing and interpretation of digital data
• Integration of metadata with images, supporting data and scholarly products
Data Sharing
Changing Preservation
Photograph by Mark Schrope © 2012 Used with permission
www.rbtoth.com [email protected]
@michabt
Eureka! Medieval Manuscripts on the Web
digitalgalen.net thedigitalwalters.org
Thank You