Date posted: 18-Nov-2014
Category: Technology
Uploaded by: dave-mcallister
© 2012 Adobe Systems Incorporated. All Rights Reserved.
Lessons from document archiving – PDF/A
Dave McAllister, Director, Open Source and Standards
Archiving Requirements: Live
§ Repeat the current experience at some future time.
§ Including active nature and that depends upon being able to provide a suitable “execution environment” in the future.
§ Interactive and dynamic types
§ More powerful computers and displays available
§ New delivery mechanisms and devices
§ invent new metaphors that move away from our more static paper-based ideas
§ Digital Rights Management
§ two-fold challenge
§ obsolescence of new (emerging) types
§ base technologies may become obsolete
Digital Documents have existed for some time
§ PDF (1993)
A comprehensive format for representing documents and forms
§ High fidelity, high precision text layout with embeddable fonts
§ High-end device independent, color managed graphics features
§ Platform independent definition
§ Interactive elements & content
§ Multimedia & 3D
§ Security & digital signatures
That’s great for the web, screen viewing, eBooks, etc. But what about people who just want reliable printing and/or archiving?
Digital Document Archive needs
§ A document format that
§ Conveys critical information
§ Can be rendered accurately (predictable and consistent)
§ Offers metadata support
§ Standard schemas
§ Custom Schemas
§ Provenance, version, history, audit
§ Can incorporate marginalia
§ Notes, comments, mark ups
§ Can be “signed” (tamper proof)
§ Provides a definition of retrieval
Enter PDF/A
§ PDF/A-1 (ISO 19005)
§ Long term preservation of black and white and color compound documents as electronic data
§ Combinations of character, raster, vector and other data
§ Provisions for capturing semantic information
§ Preservation and retrieval of appropriate metadata
§ “static paper” ++
§ Annotations & Marginalia
§ Metadata
§ Signatures
PDF/A-1 Details
§ More restrictive “coding” of PDF details
§ Ensures less ambiguity when implementing
§ Based on PDF 1.4 & PDF/X-3
§ All PDF/X-3 documents can potentially be minimally conforming PDF/A documents without any changes
§ Reduces ambiguity between different vendors' implementations
§ Removal of any complex or potentially confusing graphic concepts
§ No transparency
§ Limited colorspaces
§ No Security/Encryption
§ All data must be self-contained
§ No external resources
§ Fonts MUST be embedded(!!)
§ Limited annotation support
§ No movies and sounds
§ No JavaScript
§ Links are stored but not executed
§ Metadata based on Adobe XMP
§ Low level font requirements
§ Matching font widths
§ CharSet/CIDSet
§ CMaps
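As a rough illustration of how the restrictions above could be applied, the sketch below checks a summary of a document's features against a few of them. `DocFeatures` and `pdfa1_violations` are hypothetical names invented for this example, not part of any PDF library, and the check covers only the bullets listed on this slide, not the full standard.

```python
from dataclasses import dataclass, field


@dataclass
class DocFeatures:
    """Hypothetical summary of features found in a PDF (not a real library type)."""
    uses_transparency: bool = False
    encrypted: bool = False
    has_javascript: bool = False
    has_movies_or_sounds: bool = False
    external_resources: list = field(default_factory=list)
    fonts: dict = field(default_factory=dict)  # font name -> True if embedded


def pdfa1_violations(doc):
    """Return a list of human-readable problems, empty if none were found."""
    problems = []
    if doc.uses_transparency:
        problems.append("transparency is not allowed")
    if doc.encrypted:
        problems.append("security/encryption is not allowed")
    if doc.has_javascript:
        problems.append("JavaScript is not allowed")
    if doc.has_movies_or_sounds:
        problems.append("movies and sounds are not allowed")
    for res in doc.external_resources:
        problems.append(f"external resource not allowed: {res}")
    # Sort font names so the report order is stable.
    for name, embedded in sorted(doc.fonts.items()):
        if not embedded:
            problems.append(f"font not embedded: {name}")
    return problems


sample = DocFeatures(uses_transparency=True,
                     fonts={"Helvetica": False, "MyriadPro": True})
print(pdfa1_violations(sample))
```

A real validator would have to parse the file itself and also check the low level font requirements (widths, CharSet/CIDSet, CMaps), which this sketch leaves out.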
Levels of Conformance
§ Minimal Conformance (PDF/A-1B)
§ Meet the standard/basic requirements
§ Full Conformance (PDF/A-1A)
§ Tagged PDF
§ Improved searchability via Unicode mappings
§ Comprehensive metadata recommendations
§ Font data
§ Document “pedigree”
§ Audit trail
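A PDF/A file states which conformance level it claims in its XMP metadata. The sketch below reads that claim from an XMP packet using only the Python standard library. The `pdfaid:part` and `pdfaid:conformance` properties and their namespace come from the PDF/A standard, not from these slides; `claimed_conformance` and `SAMPLE_XMP` are names made up for this example, and extracting the packet from the PDF is assumed to have happened already.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

# Hand-written sample packet, for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def claimed_conformance(xmp_text):
    """Return a label such as 'PDF/A-1b', or None if no PDF/A part is declared."""
    root = ET.fromstring(xmp_text)
    found = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for key in ("part", "conformance"):
            qname = f"{{{PDFAID_NS}}}{key}"
            child = desc.find(qname)
            if child is not None and child.text:
                # Property written as a child element.
                found[key] = child.text.strip()
            elif qname in desc.attrib:
                # Property written as an attribute on rdf:Description.
                found[key] = desc.attrib[qname].strip()
    if "part" not in found:
        return None
    return f"PDF/A-{found['part']}{found.get('conformance', '').lower()}"


print(claimed_conformance(SAMPLE_XMP))  # PDF/A-1b
```

This only reports what the file claims. Whether the file actually meets that level still has to be checked by a validator.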
Not just File Format – Viewer Requirements
§ Color management
§ Use of output intent
§ No use of alternates (except for Spot & DeviceN)
§ Specific handling of DeviceGray
§ Font handling
§ ALWAYS use embedded data
§ Interactivity
§ Annotations & form fields are non-interactive
§ must use the stored appearance
§ provide access to data/contents
§ Hyperlinks are “questionable”
PDF/A-2
§ Remain focused on “static paper” metaphor
§ No interactivity, 3D, multimedia, etc.
§ Updated to reference ISO 32000-1
§ Ensure as close to 100% forward compatibility as possible
§ A PDF/A-1 document SHOULD be also a valid PDF/A-2 document
§ However, valid technical changes to ensure long term reliability were preferred over compatibility.
§ Predominantly in the areas of fonts & metadata
§ Continue to maintain compatibility with other ISO standards
§ PDF/X-4
§ PDF/E-1
Some new and important features in A-2 (and A-3)
§ Improved compression technology/Smaller files
§ JPEG2000
§ Compressed XRefs & Streams (aka “Full Compression”)
§ Transparency
§ PDF Layers (aka Optional Content)
§ Whatever you view on screen, must print!
§ PDF Packages/Collections
§ May only contain other PDF/A documents
§ Digital signature enhancements
§ Certified Documents
§ Improved revocation checking
§ PAdES (ETSI TS 102778) Compliance required
§ Improved tagging/accessibility
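The rule for PDF Packages/Collections, and the way PDF/A-3 relaxes it for embedded files, can be sketched as a single predicate. `embedded_file_allowed` is a hypothetical helper written for this example, and it deliberately covers only PDF/A-2 and PDF/A-3 as described on this slide.

```python
def embedded_file_allowed(container_part, embedded_is_pdfa):
    """Decide whether a file may be embedded in a PDF/A-2 or PDF/A-3 container.

    container_part   -- 2 for PDF/A-2, 3 for PDF/A-3
    embedded_is_pdfa -- True if the embedded file is itself a PDF/A document
    """
    if container_part == 2:
        # PDF/A-2: may only contain other PDF/A documents.
        return embedded_is_pdfa
    if container_part == 3:
        # PDF/A-3: embedded files can be of any format.
        return True
    raise ValueError("this sketch only covers PDF/A-2 and PDF/A-3")


print(embedded_file_allowed(2, embedded_is_pdfa=False))
print(embedded_file_allowed(3, embedded_is_pdfa=False))
```

An archive that accepts PDF/A-3 therefore needs its own policy for the embedded formats, because the container's conformance says nothing about them.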
This is the only major change in PDF/A-3: embedded files can be of any format
New work under consideration
§ Archiving PDFs with embedded 3D
§ Archiving of digitally signed PDFs
§ Archiving of source material inside the PDF
§ Secondary issues
§ Archiving of “documents of record”
§ Rich forms, such as those based on XFA or with embedded JavaScript
§ Desire to archive the business logic with the values
§ May or may not be digitally signed
§ Archiving of video and audio embedded into PDF
§ New features of ISO 32000-2
§ Portfolios, RichMedia, GIS, etc.
Summation
§ The basic requirements for multimedia archiving mirror those for digital documents
§ The scope of formats is wider
§ The envelope of contents is substantially larger
§ Documents themselves may be considered a form of media
§ Arising complexity will exist as the envelope for documents encompasses media types
§ No media type is entirely separate from any other