©2010 Really Strategies, Inc. | www.rsuitecms.com
DITA for Publishers
Intelligent Content 2012
Eliot Kimber
Senior Solutions Architect
What Are Publishers?
Enterprises whose primary business is producing print or online content for human consumption
Types of Publishers:
Fiction and Trade Books (novels, popular non-fiction)
Educational (textbooks, test prep, etc.)
Scientific, Technical, Medical (STM)
Journals
Reference (dictionaries and encyclopedias)
Magazines
Other
All share common business process challenges
Most use similar technology for authoring, production, and delivery
Business Drivers for XML Use
Rise of digital publishing
Maximize value of content through reuse and repurposing
Better enable licensing of content
Automation of production:
Reduce cycle times
Reduce production costs
React quickly to new technology
Integrate interactive media (video, apps, etc.)
Management of large volume of content: what do I have and where is it?
Why DITA Makes Sense for Publishers
DITA can be applied to any kind of document
DITA provides robust base vocabulary that can be safely and easily extended
Publishing content is simple, except where it isn’t
Publishing content is not modular, except where it is
Publishers have critical need for:
Re-use of content at various levels of granularity
Reliable interchange of content in XML form
Within organizations
With licensing partners
Flexible vocabulary
Ability to create new products quickly from existing content
What Is DITA for Publishers?
Open-Source project
(dita4publishers.sourceforge.net)
Project goal:
Provide general-purpose DITA vocabulary and supporting
processing specific to Publishing documents and
business processes in order to lower the cost of using
XML as far as possible.
Current version as of February 2012:
0.9.18
0.9.19 under development
Long-term intent is to make it a formal standard
Why DITA for Publishers?
DITA offers shortest, fastest route to high-value, extensible, sustainable XML solutions
DITA out-of-the-box optimized for tech docs, not Publishing documents
Publishers need basic and fairly obvious starting point
Need to support Publishing workflows:
Non-XML authoring (Word manuscripts)
InDesign for print production
EPUB, Kindle, and other digital formats
As a market, Publishing is orders of magnitude larger than tech doc
Publishing is a profit center, tech doc is a cost center
What D4P Provides
DITA vocabulary modules for new topic types, map
types, and domains. These are packaged as Toolkit
plugins.
Extensions to HTML and PDF transforms to
support D4P vocabulary
Transformation types for EPUB, Kindle, “HTML 2”,
and, soon, HTML 5.
Word-to-DITA transformation framework
DITA-to-InDesign transformation framework
D4P Topic Types
Reflect typical Publishing components:
Article, chapter, part, subsection, sidebar, division, cover
Are otherwise identical to generic <topic>
Reflect Publishers’ expectation that there will be
element types for chapters, articles, etc.
By default, D4P topic types may be nested without
restriction
For example, typical practice is to create each
chapter as a single XML document with nested
subsection topics within the root chapter topic.
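The nesting practice described above can be sketched in Python. This is a minimal illustration, not output from the D4P tools: the element names `chapter` and `subsection` come from the list of topic types, `title`, `body`, and `p` are the usual base DITA topic elements, and the IDs and text are made up for the example.

```python
import xml.etree.ElementTree as ET


def add_topic(parent, tag, topic_id, title, paragraphs):
    """Create a topic element (title + body) and return it.

    If parent is None the topic is the document root; otherwise it is
    appended to parent, after parent's own title and body.
    """
    if parent is None:
        topic = ET.Element(tag, {"id": topic_id})
    else:
        topic = ET.SubElement(parent, tag, {"id": topic_id})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for text in paragraphs:
        ET.SubElement(body, "p").text = text
    return topic


# One chapter document with subsections nested inside the root chapter topic.
# All IDs, titles, and paragraph text are illustrative.
chapter = add_topic(None, "chapter", "ch01", "Chapter One", ["Opening paragraph."])
first = add_topic(chapter, "subsection", "ch01-s1", "First Subsection", ["Some text."])
add_topic(first, "subsection", "ch01-s1-1", "A Nested Subsection", ["More text."])

chapter_xml = ET.tostring(chapter, encoding="unicode")
print(chapter_xml)
```

Because the D4P topic types are otherwise identical to generic topics, the same helper serves for every topic type; only the tag name changes.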
D4P Map Types and Domains
Two map domains:
Publication metadata:
Defines markup for publication metadata as required by
publishers. More complete and sophisticated than base DITA
publication metadata.
Publication map domain:
Provides large variety of topicref types reflecting different
parts of publications
Sample publication map: “pubmap”
Combines the publication metadata and publication map
domains.
Suitable for generic publications
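One reason specialized topicref types are safe to add is that DITA processors identify elements by their `class` attribute rather than by tag name. The Python sketch below illustrates that idea only. The topicref names `chapter` and `appendix`, the module name `pubmap-d`, and the file names are assumptions made for the example, and the `class` values are written out explicitly here, whereas they are normally supplied as defaults by the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical publication map. Element names other than "pubmap", the
# "pubmap-d" module name, and the hrefs are illustrative assumptions.
MAP_XML = """
<pubmap class="- map/map pubmap/pubmap ">
  <chapter class="+ map/topicref pubmap-d/chapter " href="ch01.dita"/>
  <chapter class="+ map/topicref pubmap-d/chapter " href="ch02.dita"/>
  <appendix class="+ map/topicref pubmap-d/appendix " href="app-a.dita"/>
</pubmap>
"""


def topicref_hrefs(map_root):
    """Return hrefs of every element that is a kind of topicref.

    The check looks at the class attribute, so specialized topicrefs are
    found without the code knowing their tag names.
    """
    return [
        el.get("href")
        for el in map_root.iter()
        if " map/topicref " in el.get("class", "")
    ]


pubmap = ET.fromstring(MAP_XML.strip())
hrefs = topicref_hrefs(pubmap)
print(hrefs)
```

A generic map processor written this way keeps working when new topicref types are added to the domain.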
Topic Domain modules
Adds stuff Publishers need that’s not in base DITA:
Formatting domain: <br/>, <tab/>, <dropcap>, <smallcap>, etc.
Enumeration domains: capture literal numbers, generate numbers using author-defined rules
Math domain: represent equations as graphics or using MathML (TeX also a possibility but not yet implemented)
Publication content domain: epigraph, epigram, pull quotes, etc.
Verse domain: poetry and similar content
Textbook domain: structures unique to textbooks (display-map, etc.)
Rendition target attribute domain: conditional attribute for specifying the output target (PDF, EPUB, etc.)
Except for MathML, this is all stuff Tech Docs would never use or even sanction as sensible
All this stuff is essential for Publishing use cases
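To show what a conditional attribute for the output target makes possible, here is a rough Python sketch of target-based filtering. The attribute name `renditionTarget` is a placeholder, not necessarily the name the D4P domain defines, and the sample content is invented. In an actual Toolkit build, filtering is normally driven by a DITAVAL file rather than by custom code like this.

```python
import xml.etree.ElementTree as ET

# Placeholder attribute name (an assumption for this sketch).
ATTR = "renditionTarget"

SAMPLE = """<body>
  <p>Shown everywhere.</p>
  <p renditionTarget="pdf">Print-only note.</p>
  <p renditionTarget="epub kindle">E-book-only note.</p>
</body>"""


def filter_for_target(root, target):
    """Remove, in place, elements whose target list excludes `target`.

    Elements without the attribute apply to every target. The attribute
    value is treated as a space-separated list, as with other DITA
    conditional attributes.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get(ATTR)
            if value is not None and target not in value.split():
                parent.remove(child)
    return root


body = filter_for_target(ET.fromstring(SAMPLE), "epub")
remaining = [p.text for p in body.findall("p")]
print(remaining)
```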
Typical D4P Document
Word-to-DITA transformation framework
Enables conversion of styled Word to DITA XML
XML configuration file maps Word styles to XML
elements
Can extend the base transform with XSLT as
needed
Can generate maps and topics from a single input
Word document if needed
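The style-mapping idea can be illustrated with a small Python sketch. It is not the framework's implementation: the real mapping lives in an XML configuration file processed by XSLT, and that file's format is not shown here. The Word style names, the `STYLE_MAP` structure, and the sample paragraphs are all hypothetical, and the input is assumed to be already extracted from Word as (style, text) pairs.

```python
import xml.etree.ElementTree as ET

# Hypothetical style-to-element rules: a style either starts a new topic
# or produces a block element inside the current topic's body.
STYLE_MAP = {
    "Chapter Title": {"topic": "chapter"},
    "Heading 1": {"topic": "subsection"},
    "Body Text": {"block": "p"},
    "Quote": {"block": "lq"},
}


def convert(paragraphs):
    """Turn (style, text) pairs into a topic tree using STYLE_MAP."""
    root = None
    current = None  # topic whose body receives block elements
    for style, text in paragraphs:
        rule = STYLE_MAP.get(style)
        if rule is None:
            raise ValueError(f"No mapping for Word style {style!r}")
        if "topic" in rule:
            if root is None:
                current = root = ET.Element(rule["topic"])
            else:
                current = ET.SubElement(root, rule["topic"])
            ET.SubElement(current, "title").text = text
            ET.SubElement(current, "body")
        else:
            if current is None:
                raise ValueError("Block paragraph found before any topic title")
            ET.SubElement(current.find("body"), rule["block"]).text = text
    return root


# Illustrative input standing in for a styled Word manuscript.
manuscript = [
    ("Chapter Title", "Chapter One"),
    ("Body Text", "Intro."),
    ("Heading 1", "A Section"),
    ("Body Text", "Detail."),
    ("Quote", "Quoted."),
]
doc = convert(manuscript)
print(ET.tostring(doc, encoding="unicode"))
```

Raising an error on an unmapped style is one possible design choice; it surfaces manuscripts that use styles outside the agreed template instead of silently dropping their content.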
DITA-to-InDesign Transformation Framework
Enables generating InDesign InCopy articles from DITA XML using XSLT
InCopy articles may be manually placed in InDesign documents by Designers
XML-to-InDesign style mapping defined by easy-to-modify XSLT module
Separate Java library for generating complete InDesign documents
Limited in how complete the InDesign document can be because there’s no feedback from InDesign
Typefi product provides much more complete solution where book designs are reasonably consistent
DITA FOR PUBLISHERS IN ACTION
Japanese Literature
Typical Trade Book
Enhanced EPUB
Linear Algebra Textbook
InDesign From DITA
DITA for Publishers Future
HTML 5 output with basic designs for Web and
tablet delivery
EPUB3 (once readers are available)
Improved DITA-to-PDF process (XSL-FO based)
Input into DITA 1.3 process (more highlight
elements, pubmap, etc.)
Pursue formal standardization
Continue to add vocabulary as requirements are
discovered
Summary
Publishers need XML
XML means DITA
Publishers need DITA adapted to the needs of
Publishers
Thus DITA for Publishers
DITA for Publishers is an open-source project
Many parts are of course useful to all DITA users
(EPUB generation, enhanced HTML, etc.)
Questions?
Contact: [email protected]
D4P Project: http://dita4publishers.sourceforge.net
General D4P discussion: DITA Users Yahoo group