Date posted: 18-Nov-2014
callas pdfaPilot 5
Archiving Emails with pdfaPilot
David van Driessche, CTO, Four Pees
What’s new in pdfaPilot 5?
Email to PDF/A
- Why and for whom?
Adjustable reports based on HTML-templates
Support for ePub 3.0
Support for ZUGFeRD (release candidate) - specific to Germany
Under the hood:
- New font engine
- New transparency flattening and rendering engine
- Better color transformations
Performance improvements for process-plans
Emails replace “documents”
… and have to be integrated in document archiving processes
Which leads to questions:
- What’s the original format for an email?
- Is it standardised?
- Can it be archived securely?
What’s in an email?
Header
- Equivalent to the letterhead in paper letters
- But the actual routing happens through SMTP
Body
- Can combine different variants
- Plain (7-bit ASCII) text
- Simple formatted text (bold, italic), with specific encodings
- Fully formatted content with HTML, possibly images…
- No guarantee that the different pieces of content are equivalent!
Attachments
- Embedded as ASCII text (for example Base64-encoded)
- Can be document formats, archives (ZIP) or applications (EXE)
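The header / body / attachments structure above can be inspected with Python's standard `email` package. This is only a minimal sketch: the sample message, the addresses and the helper names `build_sample` and `describe` are made up for illustration and have nothing to do with how pdfaPilot reads emails.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample() -> bytes:
    """Build a small test email: two body variants plus one attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"        # illustrative address
    msg["To"] = "bob@example.com"            # illustrative address
    msg["Subject"] = "Quarterly report"
    msg.set_content("Please find the report attached.")          # text/plain
    msg.add_alternative(
        "<p>Please find the <b>report</b> attached.</p>", subtype="html"
    )                                                             # text/html
    msg.add_attachment(
        b"PK\x05\x06" + bytes(18),           # 22 bytes: an empty ZIP archive
        maintype="application",
        subtype="zip",
        filename="report.zip",
    )
    return msg.as_bytes()


def describe(raw: bytes) -> dict:
    """List the headers, body variants and attachments of a raw email."""
    msg = message_from_bytes(raw, policy=policy.default)
    bodies, attachments = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue                         # containers carry no content
        if part.is_attachment():
            payload = part.get_content()     # decoded bytes for binary parts
            attachments.append(
                (part.get_filename(), part.get_content_type(), len(payload))
            )
        else:
            bodies.append(part.get_content_type())
    headers = {name: str(msg[name]) for name in ("From", "To", "Subject")}
    return {
        "headers": headers,
        "body_variants": bodies,
        "attachments": attachments,
    }


if __name__ == "__main__":
    print(describe(build_sample()))
```

Note that nothing in the message forces the plain-text and HTML variants to say the same thing; an archiving process has to decide which variant it renders.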
What’s the standard email format?
Server: mostly stores emails in proprietary formats
Clients with POP3: convert the email data stream to
- Single files
- eml, emlx (Outlook Express, Thunderbird, Apple Mail)
- pst, msg (Outlook)
- Databases
- mbx (Opera Mail)
The important Outlook format (msg) is proprietary; it can be understood but is sensitive to version changes
So there is no real standard storage format; the RFC 822 standard (and its successors RFC 2822 and RFC 5322) controls how messages are communicated but not how they are saved. There are simply no guarantees!
But PDF/A offers a full solution!
Complete
- All fonts embedded
- All metadata embedded and defined using an embedded “schema”
- No password protection or encryption
Well-defined
- Device independent color definitions
- Well-defined encodings for all text
- Well-defined and embedded appearance for comments and form fields
No dynamic content allowed
But allowed are:
- External links
- Digital signatures
So what are the possibilities?
PDF/A-1
- Doesn’t allow file attachments in the PDF/A file
- Email attachments converted to PDF and appended to the email representation
PDF/A-2
- Allows embedded PDF/A files
- All attachments converted to PDF/A files and embedded in the email representation
PDF/A-3
- Allows embedded arbitrary files
- Attachments can be embedded as original file and/or converted PDF/A files
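The three options above can be summarised as a small decision function. This is a sketch of the rules as stated on this slide, not of pdfaPilot's implementation: the names `PdfaLevel` and `plan_attachment` and the action strings are hypothetical, and returning an empty plan for an attachment that cannot be converted is an assumption.

```python
from enum import Enum


class PdfaLevel(Enum):
    PDFA_1 = 1
    PDFA_2 = 2
    PDFA_3 = 3


def plan_attachment(level: PdfaLevel, convertible: bool,
                    keep_original: bool = True) -> list:
    """Return the actions to take for one email attachment.

    convertible   -- True if the attachment can be converted to PDF or PDF/A
    keep_original -- only meaningful for PDF/A-3, which allows arbitrary files
    """
    if level is PdfaLevel.PDFA_1:
        # No embedded files allowed: converted pages are appended instead.
        return ["append converted PDF pages"] if convertible else []
    if level is PdfaLevel.PDFA_2:
        # Only PDF/A files may be embedded.
        return ["embed converted PDF/A"] if convertible else []
    # PDF/A-3: original and/or converted file may be embedded.
    actions = []
    if keep_original:
        actions.append("embed original file")
    if convertible:
        actions.append("embed converted PDF/A")
    return actions


if __name__ == "__main__":
    for level in PdfaLevel:
        print(level.name, plan_attachment(level, convertible=True))
```

An empty list means the attachment cannot be carried at that conformance level under these assumptions, which is the practical argument for PDF/A-3 when originals such as ZIP or EXE files must be preserved.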
In practice!
Archiving result example
- Email converted to PDF or PDF/A
- Header information used in the PDF and fully stored in the XMP metadata
- Attachments stored as original and as PDF/A together with the original email
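As an illustration of the second bullet, the sketch below collects header information into a plain dictionary that could later be written to XMP. The keys (`title`, `from`, `to`, `sent`), the sample message and the function name `header_metadata` are assumptions made for this example; the actual XMP schema used by pdfaPilot is not described here.

```python
from email import message_from_string, policy
from email.utils import getaddresses, parsedate_to_datetime

SAMPLE = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com, Carol <carol@example.com>\n"
    "Subject: Quarterly report\n"
    "Date: Tue, 18 Nov 2014 09:30:00 +0100\n"
    "\n"
    "Body text.\n"
)


def header_metadata(raw: str) -> dict:
    """Extract archive-relevant header fields from a raw email."""
    msg = message_from_string(raw, policy=policy.default)

    def addresses(name: str) -> list:
        # A header may occur several times; collect every address in it.
        values = [str(value) for value in msg.get_all(name, [])]
        return [addr for _display, addr in getaddresses(values)]

    sent = parsedate_to_datetime(str(msg["Date"]))
    return {
        "title": str(msg["Subject"]),
        "from": addresses("From"),
        "to": addresses("To"),
        "sent": sent.isoformat(),   # ISO 8601, the date style XMP uses
    }


if __name__ == "__main__":
    print(header_metadata(SAMPLE))
```

Storing the date in ISO 8601 with its time zone offset keeps the original sending time unambiguous inside the archive.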
Thank you! Questions?
David van Driessche, CTO, Four Pees