PDF made easy with iText 7


PDF made easy with iText 7: What’s new in iText and iTextSharp?

Benoit Lagae, Developer, iText Software
Bruno Lowagie, Chief Strategy Officer, iText Group

Why did we write iText?

• Specific problems that needed to be solved
  – Emancipate PDF from the desktop to the server
    • Solved in 1998 with a first PDF library
    • Deep knowledge of PDF required
  – Make PDF creation easier for developers
    • Solved in 2000 with the release of iText
    • Concept: PdfWriter and Document
    • Add high-level objects (e.g. paragraph, list, table)

History

• First release: 2000
• iText 1: 2003
• iText 2: 2007
• iText 5: 2009; upgrade to Java 5
• iText 7: 2016; upgrade to Java 7

iText is available for Java and .NET

Why iText 7?

iText 5 was approaching the limits of its architecture. iText 7 overcomes these limits and enables further user-driven feature development and more efficient support.
• Complete revision of all classes and interfaces based on experience with iText 5.
• Completely new layout module, which resolves some inconsistencies in iText 5 and enables generation of complex layouts.
• Complete rewrite of font support, enabling advanced typography.

iText 7: modular approach

Basic design principle

OutputStream fos = new FileOutputStream(dest);
PdfWriter writer = new PdfWriter(fos);
PdfDocument pdf = new PdfDocument(writer);
// PDF knowledge needed to add content
pdf.close();

OutputStream fos = new FileOutputStream(dest);
PdfWriter writer = new PdfWriter(fos);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf);
// No PDF knowledge needed to add content
document.close();

iText’s basic building blocks: examples

Hello world: code

OutputStream fos = new FileOutputStream(dest);
PdfWriter writer = new PdfWriter(fos);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf);
document.add(new Paragraph("Hello World!"));
document.close();

Hello world: result

Hello world: the hard way

FileOutputStream fos = new FileOutputStream(dest);
PdfWriter writer = new PdfWriter(fos);
PdfDocument pdf = new PdfDocument(writer);
PageSize ps = PageSize.A4;
PdfPage page = pdf.addNewPage(ps);
PdfCanvas canvas = new PdfCanvas(page);
canvas.beginText()
    .setFontAndSize(
        PdfFontFactory.createFont(FontConstants.HELVETICA), 12)
    .moveText(36, 790)
    .showText("Hello World!")
    .endText();
pdf.close();

List example: code

// Create a PdfFont
PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
// Add a Paragraph
document.add(new Paragraph("iText is:").setFont(font));
// Create a List
List list = new List()
    .setSymbolIndent(12)
    .setListSymbol("\u2022")
    .setFont(font);
// Add ListItem objects
list.add(new ListItem("Never gonna give you up"))
    .add(new ListItem("Never gonna let you down"))
    .add(new ListItem("Never gonna run around and desert you"))
    .add(new ListItem("Never gonna make you cry"))
    .add(new ListItem("Never gonna say goodbye"))
    .add(new ListItem("Never gonna tell a lie and hurt you"));
// Add the list
document.add(list);

List example: result

Image example

Image fox = new Image(ImageFactory.getImage(FOX));
Image dog = new Image(ImageFactory.getImage(DOG));
Paragraph p = new Paragraph("Quick brown ").add(fox)
    .add(" jumps over the lazy ").add(dog);
document.add(p);

New in iText 7: improved typography and support for Indic scripts

iText 5: missing links

Indic scripts:
• Only unsupported major script family
• Feature request #1
• Huge opportunity
  • Limited support in most other PDF libraries

Other features:
• Optional ligatures in Latin script
• Vowel diacritics in Arabic

Indic scripts: problems

• Lack of expertise
  • Unicode encodes 49 Indic scripts
  • Complex scripts with unique features
    • Glyph repositioning: ह + ि = हि
    • Glyph substitution: ம + ு = மு
    • Half-characters: त + ् + य = त्य
• Unsolvable issues for iText 5 font engine
  • No dedicated Unicode points for half-characters
  • No font lookups past ‘\uFFFF’
  • Ligaturization is context-dependent (virama)
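The ‘\uFFFF’ limit is tied to Java's 16-bit char type: a code point beyond U+FFFF is stored as a surrogate pair, so any lookup keyed on a single char cannot address it. The sketch below uses only the Java standard library to show the difference between counting char units and counting code points. It is not iText code; the class name CodePointDemo is made up for illustration, and 0x11305 is simply an example code point from the Grantha block.

```java
class CodePointDemo {
    /** Number of 16-bit char units needed to store the string. */
    static int charUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points in the string. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Example code point above the 16-bit range (Grantha block).
        String s = new String(Character.toChars(0x11305));

        System.out.println(charUnits(s));   // 2: a high and a low surrogate
        System.out.println(codePoints(s));  // 1: a single character for the reader

        // A char-based lookup only ever sees one half of the pair.
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true

        // A code-point-based lookup sees the real value.
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // 11305
    }
}
```

A font engine that iterates over code points rather than chars can therefore look up glyphs for scripts encoded outside the Basic Multilingual Plane.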

Indic scripts: solutions

Writing a new font engine
• Automatic script recognition
  • Based on Unicode ranges
• Flexibility = extensibility
  • Generic Shaper class
  • Separate module, only called when necessary
• Glyph replacement rules
  • Different per writing system
  • Alternate glyphs are font-dependent
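Script recognition based on Unicode ranges can be sketched with the Java standard library, which exposes the Unicode script property through Character.UnicodeScript. This is a minimal illustration of the idea, not iText's implementation; the class ScriptDetector and the method dominantScript are hypothetical names.

```java
import java.util.EnumMap;
import java.util.Map;

class ScriptDetector {
    /**
     * Returns the script that most code points in the text belong to.
     * Punctuation, digits and combining marks shared between scripts
     * (COMMON, INHERITED) and unassigned code points (UNKNOWN) are ignored.
     */
    static Character.UnicodeScript dominantScript(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                counts.merge(script, 1, Integer::sum);
            }
        });
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(Character.UnicodeScript.COMMON);
    }

    public static void main(String[] args) {
        String hindi = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930";
        System.out.println(dominantScript(hindi));    // DEVANAGARI
        System.out.println(dominantScript("writer")); // LATIN
    }
}
```

Once the script is known, a shaper for that writing system only needs to be invoked when the text actually requires it.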

Indic scripts: examples

PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
String txt = "\u0938\u093E\u0939\u093F\u0924\u094D\u092F\u0915\u093E\u0930"; // saahityakaar
document.add(new Paragraph(txt).setFont(font));

String txt = "\u0B8E\u0BB4\u0BC1\u0BA4\u0BCD\u0BA4\u0BBE\u0BB3\u0BB0\u0BCD"; // eluttaalar
document.add(new Paragraph(txt).setFont(font));

Other scripts: examples

PdfFont font = PdfFontFactory.createFont(arial, PdfEncodings.IDENTITY_H, true);
String txt = "\u0627\u0644\u0643\u0627\u062A\u0628"; // al-katibu
document.add(new Paragraph(txt).setFont(font));

String txt = "writer";
GlyphLine glyphLine = font.createGlyphLine(txt);
Shaper.applyLigaFeature(foglihtenNo07, glyphLine, null);
canvas.showText(glyphLine);
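What a ligature feature does to a line of text can be illustrated with a toy substitution table. The sketch below is a deliberately simplified stand-in, not how Shaper or a real font works: real ligatures are defined per font and operate on glyph IDs, whereas this example (class ToyLigatures and its rules are hypothetical) replaces character sequences with the Unicode presentation forms for ffi, fi and fl.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ToyLigatures {
    // Rules are tried in insertion order, so longer sequences come first:
    // "ffi" has to win over "fi".
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ffi", "\uFB03");
        RULES.put("fi", "\uFB01");
        RULES.put("fl", "\uFB02");
    }

    /** Replaces every occurrence of a rule's sequence with its ligature. */
    static String apply(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            boolean replaced = false;
            for (Map.Entry<String, String> rule : RULES.entrySet()) {
                if (text.startsWith(rule.getKey(), i)) {
                    out.append(rule.getValue());
                    i += rule.getKey().length();
                    replaced = true;
                    break;
                }
            }
            if (!replaced) {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(apply("office").length()); // 4: "ffi" became one character
        System.out.println(apply("writer").length()); // 6: no rule matched
    }
}
```

Because the available ligatures differ from font to font, a real shaper reads its rules from the font rather than from a fixed table like this one.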

Status of advanced typography in iText 7

• Indic scripts
  • We already support:
    • Devanagari
    • Tamil
  • Coming soon:
    • Telugu
    • Others: based on customer demand
• Arabic
  • Support for vocalized Arabic (diacritics) is in development
• Latin
  • Optional ligatures are fully supported

Real-world use: Publishing a database

CSV example

Imagine a series of records

Parse CSV line by line

OutputStream fos = new FileOutputStream(dest);
PdfWriter writer = new PdfWriter(fos);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf, PageSize.A4.rotate());
document.setMargins(20, 20, 20, 20);
PdfFont font = PdfFontFactory.createFont(FontConstants.HELVETICA);
PdfFont bold = PdfFontFactory.createFont(FontConstants.HELVETICA_BOLD);
// Nine columns with relative widths
Table table = new Table(new float[]{4, 1, 3, 4, 3, 3, 3, 3, 1});
table.setWidthPercent(100);
BufferedReader br = new BufferedReader(new FileReader(DATA));
// The first line holds the column headers
String line = br.readLine();
process(table, line, bold, true);
while ((line = br.readLine()) != null) {
    process(table, line, font, false);
}
br.close();
document.add(table);
document.close();

Process each line

public void process(Table table, String line, PdfFont font, boolean isHeader) {
    StringTokenizer tokenizer = new StringTokenizer(line, ";");
    while (tokenizer.hasMoreTokens()) {
        if (isHeader) {
            table.addHeaderCell(
                new Cell().add(
                    new Paragraph(tokenizer.nextToken()).setFont(font)));
        } else {
            table.addCell(
                new Cell().add(
                    new Paragraph(tokenizer.nextToken()).setFont(font)));
        }
    }
}
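One caveat when adapting the process method to other data: StringTokenizer skips empty tokens, so a record with an empty field yields fewer cells and the remaining values shift one column to the left. The sketch below, using only the standard library, contrasts this with String.split and a negative limit, which keeps empty fields. The class CsvFields, its methods and the sample record are made up for illustration; quoted fields are not handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class CsvFields {
    /** Splits a semicolon-separated record, keeping empty fields. */
    static List<String> splitRecord(String line) {
        // A negative limit keeps empty strings, including trailing ones.
        return Arrays.asList(line.split(";", -1));
    }

    /** Number of tokens StringTokenizer reports for the same record. */
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, ";").countTokens();
    }

    public static void main(String[] args) {
        String line = "Alabama;AL;;Montgomery"; // third field is empty

        System.out.println(tokenizerCount(line)); // 3
        System.out.println(splitRecord(line));    // [Alabama, AL, , Montgomery]
    }
}
```

With splitRecord, every record produces the same number of cells as the header row, so the table columns stay aligned.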

CSV: resulting report

Form filling
Form flattening

Example form

Look inside your PDF

Fill the form

PdfReader reader = new PdfReader(src);
PdfWriter writer = new PdfWriter(dest);
PdfDocument pdf = new PdfDocument(reader, writer);
PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
Map<String, PdfFormField> fields = form.getFormFields();
fields.get("name").setValue("James Bond");
fields.get("language").setValue("English");
fields.get("experience1").setValue("Off");
fields.get("experience2").setValue("Yes");
fields.get("experience3").setValue("Yes");
fields.get("shift").setValue("Any");
fields.get("info").setValue("I was 38 years old when I became an MI6 agent.");
pdf.close();

Result after filling

Flatten the form

PdfReader reader = new PdfReader(src);
PdfWriter writer = new PdfWriter(dest);
PdfDocument pdf = new PdfDocument(reader, writer);
PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
Map<String, PdfFormField> fields = form.getFormFields();
fields.get("name").setValue("James Bond");
fields.get("language").setValue("English");
fields.get("experience1").setValue("Off");
fields.get("experience2").setValue("Yes");
fields.get("experience3").setValue("Yes");
fields.get("shift").setValue("Any");
fields.get("info").setValue("I was 38 years old when I became an MI6 agent.");
// Remove interactivity: the field values become part of the page content
form.flattenFields();
pdf.close();

Result after flattening

Form flattening
Merging

United States: Example form

Flatten and merge

PdfDocument destPdfDocument = new PdfDocument(new PdfWriter(dest));
BufferedReader bufferedReader = new BufferedReader(new FileReader(DATA));
String line;
while ((line = bufferedReader.readLine()) != null) {
    // Fill and flatten one copy of the form in memory
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfDocument sourcePdfDocument = new PdfDocument(new PdfReader(SRC), new PdfWriter(baos));
    PdfAcroForm form = PdfAcroForm.getAcroForm(sourcePdfDocument, true);
    StringTokenizer tokenizer = new StringTokenizer(line, ";");
    Map<String, PdfFormField> fields = form.getFormFields();
    fields.get("name").setValue(tokenizer.nextToken());
    form.flattenFields();
    sourcePdfDocument.close();
    // Reopen the flattened copy and append its pages to the destination
    sourcePdfDocument = new PdfDocument(
        new PdfReader(new ByteArrayInputStream(baos.toByteArray())));
    sourcePdfDocument.copyPagesTo(1, sourcePdfDocument.getNumberOfPages(), destPdfDocument, null);
    sourcePdfDocument.close();
}
bufferedReader.close();
destPdfDocument.close();

The result (and why we don’t like it)

Flatten and merge

// Only change compared to the previous version: smart mode is switched on
PdfWriter writer = new PdfWriter(dest).setSmartMode(true);
PdfDocument destPdfDocument = new PdfDocument(writer);
BufferedReader bufferedReader = new BufferedReader(new FileReader(DATA));
String line;
while ((line = bufferedReader.readLine()) != null) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfDocument sourcePdfDocument = new PdfDocument(new PdfReader(SRC), new PdfWriter(baos));
    PdfAcroForm form = PdfAcroForm.getAcroForm(sourcePdfDocument, true);
    StringTokenizer tokenizer = new StringTokenizer(line, ";");
    Map<String, PdfFormField> fields = form.getFormFields();
    fields.get("name").setValue(tokenizer.nextToken());
    form.flattenFields();
    sourcePdfDocument.close();
    sourcePdfDocument = new PdfDocument(
        new PdfReader(new ByteArrayInputStream(baos.toByteArray())));
    sourcePdfDocument.copyPagesTo(1, sourcePdfDocument.getNumberOfPages(), destPdfDocument, null);
    sourcePdfDocument.close();
}
bufferedReader.close();
destPdfDocument.close();
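The only difference from the first version is setSmartMode(true). In smart mode the writer tries to reuse objects that are identical (for instance, resources shared by every copy of the form) instead of writing them again for each copied page. The sketch below illustrates that idea with plain Java collections. It is a conceptual model under that assumption, not iText's implementation; the class DedupingStore and the sample data are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DedupingStore {
    private final List<byte[]> objects = new ArrayList<>();
    // ByteBuffer compares by content, so it can serve as a map key for bytes.
    private final Map<ByteBuffer, Integer> indexByContent = new HashMap<>();
    private final boolean smart;

    DedupingStore(boolean smart) {
        this.smart = smart;
    }

    /** Stores an object and returns its index; in smart mode identical content is stored once. */
    int add(byte[] content) {
        byte[] copy = content.clone();
        if (smart) {
            ByteBuffer key = ByteBuffer.wrap(copy);
            Integer existing = indexByContent.get(key);
            if (existing != null) {
                return existing; // reuse the object written earlier
            }
            indexByContent.put(key, objects.size());
        }
        objects.add(copy);
        return objects.size() - 1;
    }

    int size() {
        return objects.size();
    }

    public static void main(String[] args) {
        byte[] font = "shared font program".getBytes(StandardCharsets.UTF_8);
        DedupingStore plain = new DedupingStore(false);
        DedupingStore smart = new DedupingStore(true);
        for (String name : new String[] {"Alabama", "Alaska", "Arizona"}) {
            byte[] page = ("page for " + name).getBytes(StandardCharsets.UTF_8);
            plain.add(font);
            plain.add(page);
            smart.add(font);
            smart.add(page);
        }
        System.out.println(plain.size()); // 6: the font is stored three times
        System.out.println(smart.size()); // 4: the font is stored once
    }
}
```

The saving grows with the number of merged records, since each additional record contributes only its unique content.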

The result (much better than before)