Content Modeling 101
A Cross-Agency Study
Don Bruns
June 14, 2006
© Aquilent, Inc. 2006. All Rights Reserved
Define Content Modeling
Identifying the data elements, metadata elements, relationships, and reuse patterns that are inherent to an information product.
Often applied within the context of a CMS implementation
Informs requirements for CMS design, selection, and implementation
Often includes development of a taxonomy
Drives content reuse
Crucial step in running a successful CMS implementation
Think of content as collections of discrete chunks of information.
  Captured separately
  Stored centrally
  Reused, rearranged, and redeployed according to business logic
Content Strategy Framework
Process for Content Modeling
Perform content inventory
Identify content types
Find representative samples
Identify chunks
Document the content model
Confirm with stakeholders at every step
Content Inventory
Different from traditional UCD content inventory:
  Less emphasis on identifying navigation, site structures, page names, and ownership.
  More emphasis on identifying content types, metadata, and opportunities for reuse.
Identify Content Types
Information products with a common set of metadata and common purpose.
Aim for high-value content types first. A high-value content type:
  Supports large amounts of content
  Has high audience exposure
  Has high potential for reuse
  Crosses organizational lines
Recognize that 80% of content is unstructured (a.k.a. generic web pages).
Confirm your analysis with stakeholders.
Find Representative Samples of Each Content Type
Choose several examples per content type.
Cross organizational lines if possible.
Find instances of reuse.
Look for difficult cases.
Confirm examples with stakeholders.
Identify Chunks
Separate content from presentation – Draw boxes around possible data and metadata elements (a.k.a. “chunks”).
Dig deeper – Many chunks won’t appear on the page (keywords in source code, content lifecycle dates).
Take a step back – Look for additional chunks wherever content is reused.
Identify Chunks (continued)
Approach metadata from all angles.
  Elemental (Title, Body)
  Descriptive (Subject, Intended Audience, Content Type)
  Lifecycle/Administrative (Publish Date, Expiration Date, Refresh-by Date)
Be realistic about chunking.
  Over-enthusiastic chunking can create a burden for content contributors.
  Do you really need 47 fields for a press release?
  Are you really going to reuse that?
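The three metadata angles can be pictured as groups of fields on one content type. The following is a minimal Python sketch, not something from the original deck: the class name `PressRelease`, its field names, and the `is_live` helper are assumptions built from the examples listed on this slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PressRelease:
    """Hypothetical content type grouping fields by metadata angle."""

    # Elemental metadata: the content itself.
    title: str
    body: str
    # Descriptive metadata: what the content is about and who it is for.
    subject: List[str] = field(default_factory=list)
    intended_audience: List[str] = field(default_factory=list)
    content_type: str = "Press Release"
    # Lifecycle/administrative metadata: usually never shown on the page.
    publish_date: Optional[date] = None
    expiration_date: Optional[date] = None
    refresh_by_date: Optional[date] = None

    def is_live(self, today: date) -> bool:
        """True if published on or before `today` and not yet expired."""
        if self.publish_date is None or self.publish_date > today:
            return False
        return self.expiration_date is None or today < self.expiration_date
```

Eight fields is a long way from 47. Each extra field should earn its place through a real reuse or lifecycle need.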
Identify Chunks (continued)
Chunk appropriately – Granularity is mainly dictated by reuse requirements.
Avoid under-chunking
  Excessively coarse level of granularity
  Inhibits content reuse
Avoid over-chunking
  Excessively fine level of granularity
  Imposes a burden on users
  Can complicate reuse

Level of Granularity    Chunks
Excessively Coarse      Entire book
Coarse                  Chapters
Medium                  Pages
Fine                    Paragraphs or sentences
Excessively Fine        Words or letters
Document the Content Model
Things to capture:
  Shared fields – common to all content types
  Additional fields unique to this content type
  Points of relationship between this and other content types
Keep it conceptual at first.
Don’t infer database structures from this… yet.
Try to break the content model.
Confirm with your stakeholders.
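The three things to capture can be written down as plain data long before any database design. The sketch below is a hypothetical illustration: the content types, field names, and the helpers `fields_for` and `dangling_relationships` are all assumptions, not part of the original deck.

```python
# Fields every content type carries (hypothetical names).
SHARED_FIELDS = {"title", "body", "subject", "publish_date"}

# Conceptual model only: plain data, not a database schema.
CONTENT_MODEL = {
    "Press Release": {
        "unique_fields": {"release_number", "media_contact"},
        "relates_to": {"Contact"},
    },
    "Contact": {
        "unique_fields": {"person_name", "phone"},
        "relates_to": set(),
    },
}


def fields_for(content_type, model=CONTENT_MODEL):
    """Shared fields plus the fields unique to one content type."""
    return SHARED_FIELDS | model[content_type]["unique_fields"]


def dangling_relationships(model=CONTENT_MODEL):
    """List (source, target) pairs whose target type is not in the model."""
    return sorted(
        (source, target)
        for source, spec in model.items()
        for target in spec["relates_to"]
        if target not in model
    )
```

A check such as `dangling_relationships` is one small way to try to break the model: it flags relationships that point at content types nobody has defined yet.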
[Figure: Press Release content model]
Case Studies
Office of Justice Programs (OJP)
Grant-making branch of DOJ
Federated web presence
  Main OJP website
  Five bureau-level offices with websites
  Two program offices with websites
Each website has its own design, navigation, content, web managers, content contributors, etc.
Little content reuse across websites
Some content out of sync across websites
OJP Example 1 – State Administering Agency Contacts
State Administering Agency (SAA) Contacts - Government officials in a particular state who administer federal grants on behalf of one or more OJP bureaus.
The Problem:
  Each of the five bureaus and HQ maintained separate lists.
  Lists were often out of sync with each other.
The Solution
Worked with web council to develop common content model for SAA Contact
Parsed content chunks for contact person
Included required dropdown lists for OJP Office and for State.
Captured SAA Contacts within the content management system
CMS deploys query-driven pages that display contacts by state, by agency, or both
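The steps above can be sketched as a single shared list with controlled values and one query function. This is a minimal Python illustration under stated assumptions: the deck does not name the CMS or its API, so `SAAContact`, `find_contacts`, and the placeholder office and state values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder controlled vocabularies standing in for the two required dropdowns.
OFFICES = {"Bureau A", "Bureau B"}
STATES = {"DC", "MD", "VA"}


@dataclass(frozen=True)
class SAAContact:
    name: str
    office: str
    state: str

    def __post_init__(self):
        # Required dropdowns: reject any value outside the controlled lists.
        if self.office not in OFFICES:
            raise ValueError(f"unknown office: {self.office!r}")
        if self.state not in STATES:
            raise ValueError(f"unknown state: {self.state!r}")


def find_contacts(
    contacts: List[SAAContact],
    state: Optional[str] = None,
    office: Optional[str] = None,
) -> List[SAAContact]:
    """Filter the shared list by state, by office, or by both."""
    return [
        c
        for c in contacts
        if (state is None or c.state == state)
        and (office is None or c.office == office)
    ]
```

Because every page queries the same list, a correction made once shows up on every page that displays the contact.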
OJP Example 2 – Topic pages
The Problem: Very little content from the five bureau-level offices was appearing on the main OJP website.
The Goal:
  Unify OJP’s web presence.
  Dissolve content silos.
  Promote content reuse across organizational lines.
First Attempt
Web manager developed topic pages
  Linked off main OJP site
  Topic driven
  Draws content from all bureau level sites
  Subject Matter Experts (SMEs) recruited to act as Topic Page editors.
Why it failed:
  Topic page editors required to keep track of new content on multiple sites.
  Required manual updates.
  Editors couldn’t keep up.
  Content became stale.
Second Attempt
CMS implementation between July 2004 and October 2005.
Led web council in developing cross-agency content model
Led web council in developing cross-agency taxonomy
Six facets to taxonomy
  Topic
  Crime Type
  Language
  Information Type
  Geography
  Demographic
Crime Type Facet
Drug Crime
  Drug Related Crime
  Manufacturing
  Possession
  Trafficking
Gangs
Hate Crimes
Organized Crime
Property Crime
  Arson
  Burglary
  Electronic Crime - Cybercrime
  Fraud
  Identity Theft
  Larceny/Theft
  Motor Vehicle theft
  Stolen Property
  White Collar Crime
Public Order Offenses
  Alcohol-related Offenses
  Antitrust
  Conspiracy
  Driving Under the Influence
  Environmental Offenses
  Immigration Offenses
  Money Laundering
  Prostitution and Commercialized Vice
  Racketeering and Extortion
  Regulatory Offenses
  Weapons Violations
Terrorism/Mass Violence
Trafficking in Persons
Violent Crime
  Assault
  Carjacking
  Domestic/Intimate Partner/Family Violence
  Gun Violence
  Homicide
  Kidnapping
  Rape and Sexual Assault
  Robbery
  Stalking
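A two-level facet like this one can be held as a simple mapping from broader terms to narrower terms. The sketch below is illustrative only: it uses an abbreviated subset of the terms above, and the helper names `expand` and `broader` are hypothetical, not functions of any particular CMS.

```python
# Abbreviated subset of the Crime Type facet: broader term -> narrower terms.
CRIME_TYPE = {
    "Gangs": [],
    "Property Crime": ["Arson", "Burglary", "Fraud", "Identity Theft"],
    "Violent Crime": ["Assault", "Homicide", "Robbery"],
}


def expand(term, facet=CRIME_TYPE):
    """Return the term together with its narrower terms, if it has any."""
    return {term, *facet.get(term, [])}


def broader(term, facet=CRIME_TYPE):
    """Return the broader term for a narrower term, or None for a top-level term."""
    for parent, children in facet.items():
        if term in children:
            return parent
    return None
```

Expanding a broader term before querying lets a page about Property Crime pick up content tagged only with Fraud or Burglary.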
Applying Taxonomy within CMS
Authoring interfaces for all major content types include Taxonomy fields.
Most workflows include editorial / tagging step.
Taxonomy terms and relationships managed within the CMS.
Tagging OJP Content Using the Taxonomy
Topic Facet
  Drugs
    Legal Substances
      Alcohol
  Juvenile Justice
    Child Health and Welfare
      Underage Drinking
  Law Enforcement
Crime Facet
  Public Order Offenses
    Alcohol-related Violations
OJP Topic Pages Redux
Query-driven topic pages – CMS updates pages whenever relevant content is published.
Dynamic content reuse – Made possible by having a unified content model and taxonomy applied cross-agency.
Empowers bureau-level content managers to act as stewards for the larger OJP site.
Required special training for taggers – This taxonomy actually does something.
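A query-driven topic page is, in effect, a saved query over tagged content from every bureau. The following is a rough Python sketch of that idea, not the actual OJP implementation: `ContentItem`, `topic_page`, and the placeholder bureau names are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Set


@dataclass
class ContentItem:
    title: str
    bureau: str
    published: date
    # Taxonomy tags, keyed by facet name, e.g. {"Topic": {"Underage Drinking"}}.
    tags: Dict[str, Set[str]] = field(default_factory=dict)


def topic_page(items: List[ContentItem], facet: str, term: str, limit: int = 10) -> List[ContentItem]:
    """Return items tagged with `term` in `facet`, newest first."""
    matches = [item for item in items if term in item.tags.get(facet, set())]
    return sorted(matches, key=lambda item: item.published, reverse=True)[:limit]
```

The page is only as good as the tags: an item a tagger misses never appears, which is why tagging needed its own training.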
Commodity Futures Trading Commission (CFTC)
Independent Federal agency
Regulates Futures and Options markets in the US
Strong emphasis on preventing and prosecuting fraud
Diverse content reuse needs
  No technical infrastructure to support content reuse
  Web team working manually to meet reuse requirements
Legal Pleadings
Court documents
  Complaints
  Opinions
  Orders
  Decisions
Pertain to specific cases initiated by CFTC against accused violators
Enforcement Press Releases
Specialized media releases pertaining to ongoing cases
Dynamic reuse:
  Legal Pleadings
Case Status Reports
Updates on court cases initiated by CFTC against violators
Intended for general public (particularly victims of fraud)
Dynamic and manual reuse:
  Legal Pleadings
  Enforcement Press Releases
The Solution
Included “Defendant” attribute in the content models for:
  Legal Pleadings
  Case Status Reports
  Enforcement Press Releases
Value added – Good example of how one metadata attribute can add lots of value to content.
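To see why one shared attribute adds so much value, consider how it links three otherwise separate content types. The sketch below is hypothetical: the dictionary layout, the function names, and the defendant names in the usage note are invented for illustration.

```python
from collections import defaultdict


def index_by_defendant(items):
    """Map each defendant name to every content item that names it."""
    index = defaultdict(list)
    for item in items:
        for defendant in item["defendants"]:
            index[defendant].append(item)
    return index


def related_items(item, index):
    """Other items that share at least one defendant with `item`."""
    related = []
    for defendant in item["defendants"]:
        for other in index[defendant]:
            if other is not item and other not in related:
                related.append(other)
    return related
```

With this index, a Case Status Report for a made-up defendant such as "Example Defendant LLC" could list its Legal Pleadings and Enforcement Press Releases without anyone linking them by hand.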
[Figure: Press Release content model]
The Solution (continued)
Probably will involve a combo box or custom GUI control
Content contributors can add new defendants.
Content contributors can also select from existing defendants.
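Behind a combo box of this kind sits a shared list that accepts new names but returns the existing entry when one already matches. The following is a small Python sketch of that behavior. The class `DefendantVocabulary` is hypothetical, and matching by case and whitespace is an assumption about how duplicates might be avoided, not something the deck specifies.

```python
class DefendantVocabulary:
    """Shared defendant list: select an existing name or add a new one."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _clean(name):
        # Collapse runs of whitespace and trim the ends.
        return " ".join(name.split())

    def select_or_add(self, name):
        """Return the stored spelling if the name exists, else store and return it."""
        cleaned = self._clean(name)
        if not cleaned:
            raise ValueError("defendant name must not be empty")
        # Case-insensitive key so "ACME" and "Acme" resolve to one entry.
        return self._by_key.setdefault(cleaned.casefold(), cleaned)

    def options(self):
        """Names to show in the dropdown, in alphabetical order."""
        return sorted(self._by_key.values())
```

Keeping one spelling per defendant matters because reuse across content types depends on exact matches on this attribute.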
[Figure: combo box example with the entry "Worldwide Commodity"]
Top 10 Best Practices
1. Process leads technology decisions.
2. Don’t skimp on your content audit.
3. Separate presentation from content.
4. Think reuse.
5. Chunk appropriately (i.e., at the right level of granularity).
6. Think of your users’ needs and pain points. (47 fields for a press release?)
7. Add value to content (especially unstructured content) by applying a global taxonomy.
8. Base the content model on Dublin Core Metadata Standards.
9. Unify the content model across organizational lines as much as possible.
10. Involve key stakeholders at all levels at every step.
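As an illustration of practice 8, fields in a content model can be mapped onto the fifteen elements of the Dublin Core Metadata Element Set. The sketch below is one possible mapping, not the one used in the case studies: the local field names and the choice of target elements are assumptions.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# Hypothetical mapping: local content-model field -> Dublin Core element.
FIELD_TO_DC = {
    "title": "title",
    "subject": "subject",
    "content_type": "type",
    "publish_date": "date",
    "language": "language",
    "geography": "coverage",
}


def to_dublin_core(record):
    """Return only the mapped fields of a record, keyed by Dublin Core element."""
    return {
        FIELD_TO_DC[name]: value
        for name, value in record.items()
        if name in FIELD_TO_DC
    }
```

Fields with no obvious Dublin Core counterpart, such as a body or a refresh-by date, simply stay local to the content model.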
Contact Information
Don Bruns
Lead Information Architect
202-415-1284
Peter Fogelsanger
Director of Marketing
301-939-1706