1T3XT BVBA, the iText Company http://itextpdf.com/
iText in Action — 2nd Edition
Bruno Lowagie @ Zenika
March 10, 2011
About this talk
• 2010:
– History of iText: development & IP
– How to write a book
– Book preview
• 2011:
– Book overview
– Samples: code snippets, PDFs, techniques
– The future of iText
First edition: 2006
Second Edition: 2010
10Q2: ebooks: 1526 pbooks: 1953
Examples on SourceForge
Chapter info on itextpdf.com
Part 1
Creating PDF from scratch
• Ch 1: Introducing PDF and iText
• Ch 2: Using iText’s basic building blocks
• Ch 3: Adding content at absolute positions
• Ch 4: Organizing content in tables
• Ch 5: Table, cell, and page events
Creating PDF from scratch
Creating PDF with iText
1. Create a Document
2. Create a Writer
3. Open the Document
4. Add content
5. Close the Document
Hello World
// step 1
Document document = new Document();
// step 2
PdfWriter.getInstance(
document, new FileOutputStream(filename));
// step 3
document.open();
// step 4
document.add(new Paragraph("Hello World!"));
// step 5
document.close();
Basic Building Blocks
Database
Report using tables
Report using direct content
Combining approaches
Part 2
Manipulating existing PDF documents
• Ch 6: Working with existing PDFs
• Ch 7: Making documents interactive
• Ch 8: Filling out interactive forms
Invoice application
An all PDF web app?!?
AcroForm
• AcroForm
PdfReader reader =
    new PdfReader("resources/pdf/subscribe.pdf");
PdfStamper stamper = new PdfStamper(reader,
    new FileOutputStream("results/subscribed.pdf"));
AcroFields form = stamper.getAcroFields();
form.setField("personal.name", "Bruno Lowagie");
form.setField("personal.loginname", "blowagie");
form.setField("personal.password", "12345678");
form.setField("personal.reason",
    "Because!\nI want to be subscribed");
stamper.setFormFlattening(true);
stamper.close();
AcroForm
Interactive Flattened
XML Data
XML Schema Definition
Creating a dynamic XFA form
Importing an XSD
Rearranged fields
Dynamic XFA form
Fill out the form
• XFA
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader,
    new FileOutputStream(dest));
AcroFields form = stamper.getAcroFields();
XfaForm xfa = form.getXfa();
xfa.fillXfaForm(new FileInputStream(xml));
stamper.close();
Form with data
A look inside the form
Part 3
Essential iText skills
• Ch 9: Integrating iText in your web application
• Ch 10: Brightening your document with color and images
• Ch 11: Choosing the right font
• Ch 12: Protecting your PDF
Structure of a PDF file
A PDF file consists of a collection of objects.
A PDF file starts with %PDF-1.x and ends with %%EOF.
%PDF-1.x
%âãÏ•Ó
1 0 obj
...
2 0 obj
... (Hello World) Tj ...
xref
0 81
0000000000 65535 f
0000000015 00000 n
...
trailer
<< ... >>
startxref
15787
%%EOF
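The two markers on this slide can be checked with a few lines of plain Java. This is only a sketch: the class and method names are made up for illustration and are not part of iText, and the check looks at the header and trailer markers only, not at the objects or the cross-reference table in between.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative helper (hypothetical, not part of iText): checks the two markers only. */
final class PdfMarkers {

    /** True if the file begins with a "%PDF-1.x" style header. */
    static boolean hasPdfHeader(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data survives.
        String head = new String(data, 0, Math.min(data.length, 8),
                StandardCharsets.ISO_8859_1);
        return head.matches("%PDF-1\\.\\d");
    }

    /** True if the last non-whitespace bytes of the file are "%%EOF". */
    static boolean endsWithEof(byte[] data) {
        String text = new String(data, StandardCharsets.ISO_8859_1);
        return text.trim().endsWith("%%EOF");
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n1 0 obj\n...\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(hasPdfHeader(sample) && endsWithEof(sample));
    }
}
```

A file that passes both checks is not necessarily a valid PDF; the sketch only mirrors the statement on the slide.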
Changing the content of a PDF file
You can use software to change the content of a PDF document: change a stream, add objects (e.g. annotations), and so on.
%PDF-1.x
%âãÏ•Ó
1 0 obj
...
2 0 obj
... (Hello People) Tj ...
121 0 obj
...
xref
0 85
0000000000 65535 f
0000000015 00000 n
...
trailer
<< ... >>
startxref
16157
%%EOF
What are our concerns?
• Integrity—we want assurance that the document hasn’t been changed somewhere in the workflow
• Authenticity—we want assurance that the author of the document is who we think it is (and not somebody else)
• Non-repudiation—we want assurance that the author can’t deny his authorship.
Integrity
• A digest is computed over a range of bytes from the file (the ByteRange).
• This digest is signed using the private key of the sender.
• The signed digest and the sender’s certificate are embedded in the PDF.
• The receiver decrypts the embedded digest with the sender’s public key and compares it with a digest computed over the same bytes of the received file.
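The principle behind these bullets can be tried out with the standard java.security API. The sketch below is an assumption-laden illustration, not how iText signs a PDF: it uses a freshly generated RSA key pair instead of a certificate, the class and method names are invented, and a real PDF signature wraps the signed digest in a CMS/PKCS#7 container. "SHA256withRSA" computes the digest and signs it in one step.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/** Illustrative sketch (hypothetical names): sign a digest, then verify it. */
final class IntegrityDemo {

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sender side: digest the content and sign the digest with the private key. */
    static byte[] sign(byte[] content, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(content);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Receiver side: recompute the digest and compare it with the signed one. */
    static boolean verify(byte[] content, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(content);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        byte[] original = "(Hello World) Tj".getBytes(StandardCharsets.ISO_8859_1);
        byte[] changed = "(Hello People) Tj".getBytes(StandardCharsets.ISO_8859_1);
        byte[] signature = sign(original, pair.getPrivate());
        System.out.println("original verifies: " + verify(original, signature, pair.getPublic()));
        System.out.println("changed verifies:  " + verify(changed, signature, pair.getPublic()));
    }
}
```

Changing a single byte of the content after signing makes verification fail, which is the integrity guarantee described above.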
Digital Signature field
A signed PDF file contains a signature dictionary.
The binary value of the PDF signature is placed into the Contents entry of a signature dictionary.
%PDF-1.x
%âãÏ•Ó
1 0 obj
...
2 0 obj
<<
/Type/Sig /Contents/...
>>
...
xref
0 81
0000000000 65535 f
...
trailer
<< ... >>
startxref
15787
%%EOF
Embedded Digital Signature
The digital signature isn’t part of the ByteRange.
There are no bytes in the PDF that aren’t covered, other than the PDF signature itself.
%PDF-1.x
%âãÏ•Ó
...
2 0 obj
<<... /Type/Sig /Contents<
> ... >>
xref
0 81
0000000000 65535 f
...
trailer
<< ... >>
startxref
15787
%%EOF
DIGITAL
SIGNATURE
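A minimal sketch of how a digest can cover everything except the signature itself, assuming the ByteRange is given as pairs of (offset, length), which is how the /ByteRange array is laid out in a signature dictionary. The class name, the toy "file", and the choice of SHA-256 are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Illustrative sketch (hypothetical name): digest a file while skipping a hole. */
final class ByteRangeDigest {

    /** byteRange holds pairs: offset1, length1, offset2, length2, ... */
    static byte[] digest(byte[] file, int[] byteRange) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (int i = 0; i + 1 < byteRange.length; i += 2) {
                // Feed only the covered bytes; the gap between ranges is skipped.
                md.update(file, byteRange[i], byteRange[i + 1]);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Toy stand-in for a PDF: bytes 4..8 play the role of the /Contents value.
        byte[] before = "AAAA<000>BBBB".getBytes(StandardCharsets.ISO_8859_1);
        byte[] after = "AAAA<SIG>BBBB".getBytes(StandardCharsets.ISO_8859_1);
        int[] byteRange = {0, 4, 9, 4};
        System.out.println(Arrays.equals(digest(before, byteRange), digest(after, byteRange)));
    }
}
```

Because the hole is excluded, the signature can be written into the file after the digest was computed without changing that digest; any change outside the hole changes it.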
Cryptography
• Symmetric key algorithms: the same key is used to encrypt and decrypt content.
• Asymmetric key algorithms: a public key is used to encrypt, a private key is used to decrypt (for encryption purposes).
• Or, a private key is used to encrypt, a public key is used to decrypt (for digital signatures).
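The difference between the two families can be sketched with the standard javax.crypto API. The class and method names are hypothetical, and the algorithm choices (AES-GCM, RSA with OAEP padding, freshly generated keys) are assumptions made for the example, not something prescribed by the slides.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Illustrative sketch (hypothetical names): one key versus a key pair. */
final class KeyKindsDemo {

    /** Symmetric: the same AES key encrypts and decrypts. */
    static String symmetricRoundTrip(String message) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] encrypted = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));

            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return new String(cipher.doFinal(encrypted), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Asymmetric: the public key encrypts, only the private key decrypts. */
    static String asymmetricRoundTrip(String message) {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            KeyPair pair = gen.generateKeyPair();

            Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            cipher.init(Cipher.ENCRYPT_MODE, pair.getPublic());
            byte[] encrypted = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));

            cipher.init(Cipher.DECRYPT_MODE, pair.getPrivate());
            return new String(cipher.doFinal(encrypted), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(symmetricRoundTrip("Hello World!"));
        System.out.println(asymmetricRoundTrip("Hello World!"));
    }
}
```

For digital signatures the roles are reversed: the private key produces the signature and the public key checks it, which in Java is done with java.security.Signature rather than Cipher.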
Obtain a public/private key
• Create your own keystore (with the private key) and self-signed certificate (with the public key); e.g. using keytool
• Ask a Certificate Authority (CA) to sign your certificate to prove your identity
• A Certificate signed with a CA’s private key can be verified with the CA’s root certificate (stored in Adobe Reader)
Digital Signatures
Stored on the producer’s side
• Certificate
– Public key
– Identity info
• Private key
• Original document
ByteRange
Received by the consumer
%PDF-1.x
...
/ByteRange ...
/Contents<
>...
%%EOF
DIGITAL SIGNATURE
• Certificate
• Signed Message Digest
• Timestamp
Possible architecture
Existing PDF document: created by PDF producer
Fill out signature field: using iText
Externally sign digest created with iText
Application Device
%PDF-1.x
...
...
%%EOF
DIGITAL SIGNATURE
• Certificate
• Signed Message Digest
• Timestamp
Displaying digital signatures
• Digital signatures are part of the file structure: it isn’t mandatory for a digital signature to be displayed on a page.
• Digital signatures are listed in the signature panel.
• A digital signature can be visualized as a field widget (this widget can consist of graphics, text,...).
Invisible signature
Visible signature
Invalid signature
Custom signature
Important note
• A signature signs the complete document.
• The concept of signing separate pages in a document (“to initial a document”) doesn’t exist in PDF.
• Legal issue: how to prove that a person who signed for approval has read the complete document?
Serial signatures
A PDF document can be signed more than once, but parallel signatures aren’t supported, only serial signatures: additional signatures sign all previous signatures.
%PDF-1.x
% Original document
% Additional content 1
...
...
%%EOF
DIGITAL SIGNATURE 1
...
%%EOF
DIGITAL SIGNATURE 2
% Additional content 2
...
...
%%EOF
DIGITAL SIGNATURE 3
Rev1
Rev2
Rev3
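Since every revision in the diagram ends with its own %%EOF, counting those markers gives a rough idea of how many revisions a file contains. The sketch below is a heuristic under that assumption only, with a hypothetical class name: it does not parse the PDF, so a marker inside a stream or a file written in a different layout would throw the count off.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative heuristic (hypothetical name): one %%EOF marker per revision. */
final class RevisionCounter {

    static int countEofMarkers(byte[] file) {
        // ISO-8859-1 keeps a one-to-one mapping between bytes and chars.
        String text = new String(file, StandardCharsets.ISO_8859_1);
        int count = 0;
        int from = 0;
        while ((from = text.indexOf("%%EOF", from)) >= 0) {
            count++;
            from += "%%EOF".length();
        }
        return count;
    }

    public static void main(String[] args) {
        String threeRevisions = "%PDF-1.x\n% Original document\n%%EOF\n"
                + "% Additional content 1\n%%EOF\n"
                + "% Additional content 2\n%%EOF\n";
        System.out.println(countEofMarkers(
                threeRevisions.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```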
Demo: two signatures
Types of signatures
• Certification (aka author) signature: only possible for the first revision; involves modification detection permissions.
• Approval (aka recipient) signature: workflow with subsequent signers.
• Usage Rights signature: involves Adobe’s private key to Reader-enable a PDF (off-topic here).
Problems solved?
• Integrity—signature is invalidated if bytes are changed
• Authenticity—Certificate Authority verifies the identity of the owner of the private key
• Non-repudiation—the author is the only one who has access to the private key
What if?
• What if the author’s private key is compromised?
• What if the author falsifies the creation date of the document?
• What if the certificate expires too soon?
Revocation checking
• Certificate Revocation List (CRL)
The certificate is checked against a list of revoked certificates.
• Online Certificate Status Protocol (OCSP)
The revocation status is obtained from a server.
If the certificate was revoked, the signature is invalid.
OCSP
Timestamping
• The timestamp of a signature can be based on the signer’s local machine time.
• Alternatively, the signer can involve a Time Stamp Authority (TSA): the message digest is sent to a trusted timestamp server, which adds a timestamp and signs the resulting hash using the TSA’s private key.
• With a TSA, the signer can no longer forge the time.
Timestamp
PAdES - LTV
• PAdES: PDF Advanced Electronic Signatures
• LTV: Long Term Validation
• Requires extensions to ISO 32000-1
• Described by ETSI in TS 102 778 part 4
• Requires Document Security Store (DSS) and Document Timestamp
• A new DSS and document timestamp are added before the last document timestamp expires
Part 4
Under the hood
• Ch 13: PDFs inside-out
• Ch 14: The imaging model
• Ch 15: Page content and structure
• Ch 16: PDF streams
Parsing PDF
Render listener interface
public void renderText(
        TextRenderInfo renderInfo) {
    System.out.print("<");
    System.out.print(renderInfo.getText());
    System.out.print(" @ (");
    System.out.print(
        renderInfo.getBaseline()
            .getStartPoint().get(0));
    System.out.print(", ");
    System.out.print(
        renderInfo.getBaseline()
            .getStartPoint().get(1));
    System.out.print(") l: ");
    System.out.print(
        renderInfo.getBaseline()
            .getLength());
    System.out.println(">");
}
RenderListener
beginTextBlock()
renderText(TextRenderInfo info)
endTextBlock()
renderImage(ImageRenderInfo info)
Output
Tagged PDF
Optional Content
• A different type of marked content:
PdfLayer a1 = new PdfLayer("answer 1", writer);
a1.setOn(false);
BaseFont bf = BaseFont.createFont();
PdfContentByte cb = writer.getDirectContent();
cb.setRGBColorFill(0xFF, 0x00, 0x00);
cb.beginText();
cb.setFontAndSize(bf, 18);
cb.beginLayer(a1);
cb.showTextAligned(Element.ALIGN_LEFT,
    "A1: Stanley Kubrick", 50, 742, 0);
cb.endLayer();
cb.endText();
Portable collections
Creating a Flash Component
Flash component in HTML
Online XML data
Crossdomain.xml
Flash component in PDF
The future of iText
Five ideas for 2011
• The frustration of working with HTMLWorker
• Finally start working on XFA to PDF conversion
• Digital Signatures: PAdES, timestamps,...
• Eclipse plug-in for iText
• iText for Android
Additional ideas:
• Accessibility (Tagged PDF, PDF/UA?)
• GIS Options
HTMLWorker
• Support for straightforward HTML
– No URL to PDF conversion yet
– Support for more HTML tags and CSS styles
– Target for iText 5.1 (April 2011)
• HTML generated with FCKEditor and TinyMCE
• “Rich Text” as defined in XFA and PDF specs
• Support for all HTML would be nice too
– Full blown HTML to PDF conversion
– Do what a browser does
XFA to PDF
• The new HTMLWorker will be based on a new class XMLWorker
• XFA is the XML Forms Architecture
• With Adobe’s “Rich Text”, we’re already implementing a small part of XFA.
• Once iText 5.1 is released we’re ready to start an XFA to PDF project, but...
• Is there a sponsor for such a project?
Digital Signatures
• PAdES: needs to be in a future iText version
• Signing server: product?
• Timestamp server: service?
iText for Android
• iText light for phones
– Demo: Hello world
• iText full for tablet PCs