(Almost) Four Years On: Metrics, ROI, and Other Stories from a Mature DITA CMS Installation
Keith Schengili-Roberts | November 15, 2010
Agenda
• Intro + ROI
• Things We Didn’t Expect
• Measuring Productivity: Uses of Metadata
Who is This Guy?
Keith Schengili-Roberts
• Manager of documentation and localization for AMD’s Professional Graphics division (formerly ATI)
• Prior to becoming manager of the group, was its information architect
• Lecturer at University of Toronto’s Professional Learning Center since 1999, teaching courses on information architecture and content management (sample slide decks available from: http://www.infoarchcourse.com/)
• Author of four titles on Internet technologies; last title was “Core CSS, 2nd Edition” (2001)
ROI Executive Summary
Proven return on investment (ROI) benefits from CMS-based DITA over the previous toolchain:
• Productivity/output increases
  – Somewhere between 2.3 and 3 times more efficient
• Can “do more with what we’ve already got”
  – Minimalism and content re-use go a long way
  – We have fewer writers than when we started, while our output rate continues to increase
• Localization cost savings
  – The localization budget is now less than half of what we needed the year before we started using the DITA CMS
  – We are much more productive
What We Do
Documentation & Localization Group at AMD's Graphics Product Group (GPG), formerly ATI
• Based in Markham, Ontario
• 4 writers, 2 process engineers, 2 localizers, 1 manager
• CMS: DITA CMS from Ixiasoft (www.ixiasoft.com)
• Responsible for:
  – End-user documentation, including online help (20%)
  – Engineering documentation for ODM/OEM partners (60%)
  – Technical training documentation for partners (20%)
• Localize in up to 25 languages (mostly end-user and UI)
• Primary outputs are PDF and XHTML
Where We Started (i.e., “The Bad Old Days”)
Circa 2003-2006:
• Used unstructured FrameMaker
• Localization costs were very high
• Code page issues made localization QA work hard
• Could not reliably keep in sync with major software releases (monthly cadence required for online help; could only do it twice a year)
• Writers were deeply siloed
  – Very little content was shared
  – Content re-use (especially between different docs) was very low
• Output was efficient, but quality was highly variable
Where We Are Now
• Have been using Ixiasoft’s DITA CMS in production since February 2007
• Have published more than 2,200 documents in that time
  – 46% in English
  – 54% in the languages to which we localize (21 maximum)
• Writers and the documentation process are more nimble
  – Any writer can take on another’s projects
  – Content re-use rate is good (slightly more than 50% monthly)
  – Quality is uniformly better; re-used topics are edited topics
• Localization process is streamlined, with more time now available to focus on QA than on administration or fixing formatting issues
Getting ROI by Doing More with What We’ve Already Got
• Using the old toolchain, we spent about 50% of our time formatting content; eliminating that work equates to an almost equal boost in productivity with the DITA CMS.
• We automate things that can (and should) be automated; no more TOCs or indexes built by hand.
• Through attrition, we have fewer personnel writing/localizing content; despite this, our output rate has increased.
• An information architecture content audit of existing materials emphasized minimalism and re-use within and between document types.
• Content re-use is considerable; de-siloed writers are more flexible in what they can work on.
• We continue our effort to find out what customers find useful, and to give them only the information they require.
ROI: Doing More with Less
Comparative numbers from 2007:
• Numbers show equivalent work on engineering docs (types/sizes of docs, product release cycle)
• The DITA CMS made us faster
• More than doubled output using the same headcount while taking on an expanded range of document types
ROI: Doing More with Less (cont.)
What’s happened since 2007?
ROI: Doing More with Less (cont.)
• In 2009, 4 writers were responsible for 366 docs.
  – On average, each writer produced 91.5 docs in a year = ~23 per writer per quarter.
  – This figure does include revisions; however, on average, we do the same number of revisions as we did under the old toolchain (we just do them faster).
• Compare this to roughly equivalent numbers from another tech writing team covering a similar subject area using our old toolchain:
  – They produced 360 docs with 9 writers over the course of a year; their docs were roughly the same size and type, with a similar release cadence.
  – That is 40 docs per writer per year, or 10 per writer per quarter.
  – By these numbers, use of the DITA CMS improves efficiency by 2.3 times (your own results may vary).
• The two localization coordinators were responsible for producing 432 docs in the system during 2009.
ROI: Localization Cost Savings
• Content re-use in English corresponds directly to translated content re-use
• Eliminated desktop publishing (DTP) charges
• As a result, we can produce publications more quickly, more reliably, and less expensively than with our old toolchain:
  – One example is our Catalyst Control Center online help: prior to the DITA CMS, we could only hope to update it at most every 6 months; now we can keep up with the monthly software release cycle.
CMS-based DITA and Localization Costs
[Chart: quarterly localization budget (blue line) vs. actual localization spend (red line), annotated with three phases: the “bad old days”, the content audit + single-sourcing period, and the CMS ROI period]
• Our annual localization budget is now 40% of what it was the year before we started using the CMS (2006)
• The DITA CMS has more than paid for itself based only on reduced localization costs
• The volume of localized content has increased over this time period
DITA Advantages from a Writer’s Perspective
Moving to and implementing DITA is typically a management decision, but there are advantages for the writers:
Learning a new and valued skill (I've had two writers hired out from under me by another firm looking to "do DITA").
As content re-use increases over time, the writers act more as editors, so have a higher "value-add" to the content process.
Significant topic re-use means that writers learn more about other subjects using other writers’ topics, effectively de-siloing the writing team.
Programmatic skills increasingly called into play because there is a need for people who understand XSL and text-parsing languages (such as Python) and also understand publishing.
Things We Didn’t Expect
• Need for a “house” DITA Style Guide (we also found ways to help enforce it)
• Conrefs vs. cloning
• More nimble options available for doing localization
• Use of tracking-based metadata allows us to do thorough productivity measures, and to measure useful things we had not initially anticipated
How Much DITA Do You Need?
In terms of the number of tags you need to use, it may be less than you think:
Our initial approach was evolutionary; writers could use any tag they felt necessary, and over time DITA tagging styles were established and made uniform (DITA Style Guide).
Using fewer tags decreases formatting issues/clashes when creating XSL output types.
In all, we actively use fewer than half of all DITA 1.1 tags.
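For a sense of how little markup day-to-day writing needs, here is a minimal sketch of a task topic (hypothetical content, not from our actual docs) that uses only a handful of core DITA 1.1 elements:

```xml
<task id="install_driver">
  <title>Installing the display driver</title>
  <shortdesc>Install the driver before configuring multiple displays.</shortdesc>
  <taskbody>
    <prereq><p>Download the driver package for your operating system.</p></prereq>
    <steps>
      <!-- Each step pairs a command with optional supporting info -->
      <step><cmd>Run the downloaded installer.</cmd></step>
      <step><cmd>Restart the computer when prompted.</cmd></step>
    </steps>
  </taskbody>
</task>
```

A topic like this touches fewer than a dozen element types, which is why a restricted house tag set is workable in practice.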
Cloud of Relative Tag Usage
• 67 tags displayed, with a minimum-usage threshold of 20
• Tags not included because they are auto-populated/included in our topic templates: othermeta, metadata, prolog, searchtitle, shortdesc, titlealts, navtitle
• Created using “Wordle” from www.wordle.net
Creating a DITA Style Guide
A recommendation for any tech docs group that uses DITA extensively:
• Helps new writers/contributors come up to speed
• Usefully narrows the scope of the XSL work that needs to be done
• Many things are “legal” in DITA but may be poor from a “house style” standpoint, for example:
  – Unformatted block content can sit between a header and a table in a section
  – Tables and figures do not have to have a title
  – Lists can be nested without limit
  – Alpha lists can contain more than 26 items
  – Lists can contain only a single item
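As an illustration (hypothetical markup, not from our actual docs), here is a fragment that is perfectly valid DITA yet breaks several of the house rules above: loose text between the section title and the table, an untitled table, and a single-item list:

```xml
<section>
  <title>Supported display modes</title>
  This loose text sits between the title and the table, outside any block element.
  <table>
    <tgroup cols="1">
      <tbody>
        <row><entry>2560 x 1600</entry></row>
      </tbody>
    </tgroup>
  </table>
  <ul>
    <li>A list with only one item</li>
  </ul>
</section>
```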
Schematron Can Help Enforce DITA Style
What is Schematron? “Schematron is a rule-based validation language for making assertions about the presence or absence of patterns in XML trees.” (Wikipedia)
We use Schematron to point out potential errors/lapses in our DITA house style to the writers, for example:
• Text between a section title and a table that is not wrapped in block tags
• A list that ought to have more than one item (otherwise, why make it a list?)
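A minimal sketch of what such checks can look like in Schematron (illustrative patterns written for this summary, not our production rule file):

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Flag loose text inside a section that is not wrapped in a block element -->
    <rule context="section">
      <report test="text()[normalize-space()]">
        Wrap loose text in a block element such as &lt;p&gt;.
      </report>
    </rule>
    <!-- Flag lists with only a single item -->
    <rule context="ul | ol">
      <report test="count(li) = 1">
        A list ought to have more than one item; consider a paragraph instead.
      </report>
    </rule>
  </pattern>
</schema>
```

Because Schematron rules are ordinary XPath assertions, the house style guide and its enforcement can evolve together.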
XSL Can Help Enforce DITA House Style
We have a DITA house style rule that nested lists should be no more than two levels deep. Schematron does its job by flagging the over-nested list during authoring, and the violation surfaces again if you try to output the document.
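The two-level rule above can be expressed as a Schematron report on any list item sitting at the third level or deeper (again an illustrative sketch, not our production rule):

```xml
<pattern xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- A list item with two or more li ancestors is at nesting level three or deeper -->
  <rule context="li[count(ancestor::li) >= 2]">
    <report test="true()">
      House style: lists should be nested no more than two levels deep.
    </report>
  </rule>
</pattern>
```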
Conrefs vs. Cloning
At a very early stage we decided not to use conrefs in our DITA content:
• They made localization programmatically complicated/inefficient
• Creating a localization kit would mean finding all conrefs in a doc (however deeply they are nested) and then “flattening” them, which leads to inefficient segment matching
• They did not seem cost-effective from an author’s perspective
• They would limit reuse as conref targets become “fixed”; we dare not change them without affecting many docs
• Searching for and then defining a single phrase or paragraph to reuse is not always an efficient use of time
Conrefs vs. Cloning
We instead chose a “clone” approach to topic re-use:
• Essentially, make a copy of an existing topic and use only the parts that you need in your current document
• The original topic and the clone are completely separate (though trackable; the parent/child relationship is retained in the CMS)
• Cloning is only done when the amount of change is sufficient that the original topic cannot accommodate it
• Writers can more freely re-use existing topics for their own needs
• When a localization kit is made, the segment-matching process is efficient
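To make the contrast concrete, here is a sketch of the two reuse styles (file names, IDs, and content are hypothetical):

```xml
<!-- Conref style (what we avoided): the note is pulled in by reference at
     build time, so a localization kit must first "flatten" every such link -->
<note conref="shared/warnings.dita#warnings/esd_warning"/>

<!-- Clone style (what we use): a full, independently editable copy of the
     source topic; the CMS retains the parent/child link for tracking -->
<topic id="install_driver_oem">
  <title>Installing the display driver (OEM variant)</title>
  <body>
    <p>Edited copy of the original installation topic...</p>
  </body>
</topic>
```

The clone is self-contained XML, which is why segment matching against translation memory stays simple.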
Nimble Localization Processes with DITA XML
Under the old toolchain, localizing a 200+ page document into a single language within a week (without huge expense) was impossible. DITA XML allows us to be more nimble: for critical large documents, we can send the localization firm finished “parts” as we get them (“70/20/10”):
• When roughly 70% of a large document is done, we send it off for translation, followed a week or two later by another 20% of new and updated material, then the last 10% when we complete it.
• While this process does cost more than sending in a whole document at once, it reduces the turnaround time from weeks to days, and quality is much improved because the work is not done in a rush.
• This approach was simply not feasible using our old toolchain; ultimately, the new toolchain is still cheaper and much faster.
Measuring Productivity: Uses of Metadata
There are three main purposes for metadata:
• Retrieval
• Re-use
• Tracking
Everyone who has used a search engine is familiar with the “retrieval” part. Authors can add their own metadata to topics to aid in later retrieval for re-use. Topic and map dependencies can be checked, and associated topics re-used in other publications.
Tracking Metadata
Tracking metadata (in our case, mainly dates, author, and topic/map status) is used for understanding trends and managing workflow. The types of questions we can readily answer include:
• Who created the content (author)?
• When was it created (date)?
• Who modified it (editor)?
• Who reviewed it (reviewer/approver)?
• Where has it been re-used (map relation)?
• Has it been published or translated (status/language)?
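In DITA, most of this tracking metadata has a natural home in the topic prolog. A sketch (hypothetical names and status values) of what a CMS can record per topic:

```xml
<prolog>
  <author type="creator">asmith</author>
  <author type="contributor">bjones</author>
  <critdates>
    <created date="2009-04-02"/>
    <revised modified="2009-06-18"/>
  </critdates>
  <metadata>
    <!-- othermeta carries CMS-specific fields such as workflow status -->
    <othermeta name="status" content="published"/>
    <othermeta name="reviewer" content="cclark"/>
  </metadata>
</prolog>
```

Because these fields live in the XML itself, trend reports can be generated by querying the repository rather than by manual bookkeeping.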
How We Measure Productivity
The metric we use is a combination of topics created + topics modified in a monthly/quarterly timeframe:
• Each new topic created counts as 1.
• Modified topics are also counted, though again only as 1; subsequent revisions to the same topic in a given timeframe are not counted.
• This provides a very good view of ongoing work, and the numbers align with known product release cycles.
• It works both as an aggregate measure (total output per month) and as a measure of a writer’s individual productivity.
• Maps are also tracked, but are not as good for measuring productivity since they come in many sizes and have widely varying development timelines.
Topics Created/Modified (Monthly)
Topic Production Matches Product Cadence
[Chart: topics created/modified per month across three product release cycles; each cycle shows a main production peak followed by a secondary peak]
• Regular peak of production in Q3, typically followed by a secondary peak in Q1
Localization Segments Auto-translated within CMS Monthly
• The portion in orange is the percentage of segments that were 100% matches and were never sent to a localization vendor = pure ROI!
• From July 2008 to July 2009, an avg. of 54% of segments were auto-translated within the system.
Sample Topic Reuse Rate (Monthly)
From Jan 2008 to June 2009, average monthly topic reuse rate = 53.53%
An Interesting Trend: Topic Ratios
After year one, reference topics have steadily made up ~74% of all topics used
What is the Average Size of a Topic?
• Maps average 3.47 kb
• Concepts average 2.46 kb
• References average 7.88 kb
• Tasks average 3.20 kb
(1 byte = 1 character; 1000 bytes = 1 kb = 1000 characters)
In pages of Lorem ipsum text in Word:
• Concepts average 0.65 of a page
• References average 2.6 pages (smallest: half a page; largest: ~200 pages)
• Tasks average 1 page
Questions & Answers